Anecdote: The benefits of applying structured expertise

By mrkiouak@gmail.com on 2025-06-16
GenAI

Structured generation

Most of the generative AI excitement I see from businesses is about automating work that was previously manual but is rote or well understood. I suspect we'll also see a "camcorder revolution"-like effect with GenAI, and I don't think this is getting as much attention.

By camcorder revolution I mean the democratization of video recording. (See e.g. The Down Side of the Camcorder Revolution. This was the first news link when I searched the term, and the quaintness of a 1990 article bemoaning celebrities' loss of privacy to home camcorders gave me an appreciation for how much has changed, given what we now know was coming ~20 years later.) El Mariachi, Slacker, and Clerks are three movies that immediately come to mind as output of this change.

El Mariachi & Clerks were both shot for $8-30k, roughly 100x less than it cost to make a movie before video recorders became available.

Previously I wrote about an experiment in generating children's-story-style images and text, and the problems I had keeping the stories consistent across LLM interactions: Problems in GenerativeAI: Continuity.

This past weekend I got around to trying to generate structured character information. After coming up with this initial character JSON format:

JSON

{
    "character_id": "string",
    "species": "Human",
    "family_name": "string",
    "given_name": "string",
    "aliases": [],
    "role": "ROLE_UNSPECIFIED",
    "description": "string",
    "clothing": {
        "overall_style": "string",
        "current_outfit": []
    },
    "possessions": {
        "inventory": [],
        "currency": {
            "unit": "Gold Pieces",
            "amount": 0
        }
    },
    "physical_attributes": {
        "actual_age": 0,
        "apparent_age": "string",
        "height_cm": 0,
        "weight_kg": 0,
        "build": "string",
        "skin_tone": "string",
        "eye_color": "string",
        "hair_color": "string",
        "hair_style": "string",
        "distinguishing_features": [],
        "health_summary": "string"
    },
    "mental_attributes": {
        "personality_summary": "string",
        "temperament": "string",
        "virtues": [],
        "vices": [],
        "phobias": [],
        "current_emotional_state": "string",
        "iq": 0,
        "eq": 0
    },
    "skills": {
        "known_skills": []
    },
    "relationships": {
        "connection": []
    },
    "background": {
        "backstory": "string",
        "place_of_birth": "string",
        "hometown": "string",
        "significant_life_events": [],
        "family_history": "string",
        "education": "string",
        "occupation_history": []
    },
    "motivations": {
        "primary_goal": "string",
        "secondary_goals": [],
        "core_values": [],
        "desires": [],
        "alignment": "ALIGNMENT_UNSPECIFIED"
    }
}

(which I filled out using AI, and will prune and curate in a future step), I created a prompt to generate an image and video of the character. I settled on the fields above after reviewing some creative writing courses and book sections on characters.
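The prompt itself isn't shown here, so the following is only a minimal Python sketch of how a record in this format could be flattened into text for an image model. The function name `build_image_prompt`, the `PHYSICAL_KEYS` selection, and the output layout are all hypothetical choices for illustration, not the prompt actually used.

```python
# Hypothetical selection of physical fields to surface in a visual prompt;
# which fields were actually used is not stated in this post.
PHYSICAL_KEYS = ["apparent_age", "build", "eye_color", "hair_color", "hair_style"]


def build_image_prompt(character: dict) -> str:
    """Flatten the visually relevant parts of a character record into one string."""
    name = " ".join(
        part
        for part in (character.get("given_name"), character.get("family_name"))
        if part
    )
    species = character.get("species", "unknown species")
    lines = [f"Character: {name or 'unnamed'} ({species})"]

    if character.get("description"):
        lines.append(f"Description: {character['description']}")

    # Only include physical attributes that are present and non-empty.
    physical = character.get("physical_attributes", {})
    for key in PHYSICAL_KEYS:
        if physical.get(key):
            label = key.replace("_", " ").capitalize()
            lines.append(f"{label}: {physical[key]}")
    for feature in physical.get("distinguishing_features", []):
        lines.append(f"Distinguishing feature: {feature}")

    # Each outfit item is expected to be an object with "name" and "color".
    for item in character.get("clothing", {}).get("current_outfit", []):
        item_name = item.get("name", "unnamed item")
        color = item.get("color", "unspecified color")
        lines.append(f"Wearing: {item_name} ({color})")

    return "\n".join(lines)
```

Because every field is read with `.get`, a partially filled record still produces a usable prompt, which matters when the data is itself model-generated and may omit fields.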

I asked Gemini 2.5 Pro to provide some sample data for me to test my integration, and I was impressed at how evocative some of the details it generated were:

JSON

{
    "given_name": "Barnaby",
    "family_name": "Sticks",
    "species": "Anthropomorphic Canis lupus familiaris (Landseer Newfoundland)",
    "aliases": [
        "Barney",
        "The Stick Collector of the Green Mountains"
    ],
    "role": "PROTAGONIST",
    "description": "A gentle giant with a heart full of adventure and paws ready for the trail. Barnaby's life revolves around two things: the majestic mountains of Vermont and his ever-growing collection of extraordinary sticks.",
    "clothing": {
        "overall_style": "Rugged and practical outdoor gear, built for comfort and durability.",
        "current_outfit": [
            {
                "name": "Worn Red Flannel Shirt",
                "description": "A comfortable flannel that keeps the chill off during high-altitude hikes. It has a special pocket for very small, 'twiglet' sticks.",
                "color": "Red and Black Check",
                "condition": "WORN"
            },
            {
                "name": "Sturdy Hiking Trousers",
                "description": "Reinforced canvas trousers with plenty of pockets for trail snacks and supplies.",
                "color": "Forest Green",
                "condition": "WORN"
            }
        ]
    },
    "possessions": {
        "inventory": [
            {
                "name": "The 'Crown Jewel' Stick",
                "description": "A perfectly Y-shaped maple branch, worn smooth from being carried on countless adventures. It is not for throwing.",
                "type": "KEY_ITEM",
                "quantity": 1
            },
            {
                "name": "Trail Map of the Green Mountains",
                "description": "A well-creased, waterproof map with hand-drawn notes on the best stick-finding locations and where the tastiest wild blueberries grow.",
                "type": "TOOL",
                "quantity": 1
            }
        ],
        "currency": {
            "unit": "Maple Creemee Tokens",
            "amount": 5
        }
    },
    "physical_attributes": {
        "apparent_age": "Young Adult",
        "build": "Large and powerful, yet surprisingly nimble on the trail.",
        "eye_color": "Warm, soulful brown",
        "hair_color": "Lustrous black and white patches, characteristic of the Landseer Newfoundland.",
        "hair_style": "Thick, water-resistant double coat that is usually full of burrs.",
        "distinguishing_features": [
            "A distinct black 'saddle' marking across his back.",
            "Paws that are almost always a little muddy from his latest excursion."
        ]
    },
    "mental_attributes": {
        "personality_summary": "Incredibly good-natured, patient, and brave. Possesses a simple wisdom and finds immense joy in the little things, like the perfect heft of a fallen branch or the scent of pine after a rain shower.",
        "temperament": "Calm and steady, but with a spark of playful curiosity.",
        "virtues": [
            "Loyalty",
            "Courage",
            "Kindness to all creatures"
        ],
        "vices": [
            "A bit of a packrat (specifically with sticks)",
            "Easily distracted by squirrels and interesting smells"
        ]
    },
    "skills": {
        "known_skills": [
            {
                "name": "Stick Appraisal",
                "proficiency": "MASTER",
                "origin": "A lifetime of dedicated, hands-on (or paws-on) research."
            },
            {
                "name": "Mountain Navigation",
                "proficiency": "EXPERT",
                "origin": "Learned from years of exploring the Green Mountains."
            },
            {
                "name": "Swimming",
                "proficiency": "MASTER",
                "origin": "Natural Newfoundland talent."
            }
        ]
    },
    "background": {
        "backstory": "Barnaby was born with a wanderer's soul in a small cabin nestled deep in the Green Mountains. He discovered his passion for stick collecting as a pup and has since dedicated his life to exploring every peak and valley in search of the most unique specimens to add to his grand collection.",
        "hometown": "A secluded valley near Stowe, Vermont",
        "occupation_history": [
            "Professional Stick Collector",
            "Unofficial Trail Guide"
        ]
    },
    "motivations": {
        "primary_goal": "To find the legendary, moss-covered Elderwood Branch, said to grow only on the misty summit of Mount Mansfield.",
        "secondary_goals": [
            "To map every trail in the Green Mountains.",
            "To build the most magnificent stick pile in all of Vermont."
        ],
        "core_values": [
            "The beauty and sanctity of nature.",
            "The simple joy of a good walk.",
            "Friendship and helping those in need."
        ],
        "alignment": "NEUTRAL_GOOD"
    }
}

All I prompted with was:

Please provide a sample POST character body I can use to test the endpoint. Populate the character data with a description of an anthropomorphic landseer (black and white) newfoundland dog that collects sticks and goes on adventures in the mountains of Vermont.

Details like "Incredibly good-natured, patient, and brave. Possesses a simple wisdom and finds immense joy in the little things, like the perfect heft of a fallen branch or the scent of pine after a rain shower." and the references to Mt. Mansfield & Stowe really captured Vermont, as well as the character of a dog that collects sticks.
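One thing worth checking with generated data like this is how much of the template the model actually filled in. The Python sketch below compares a record against the template and lists any keys left out. The helper name `missing_paths` is hypothetical, and it only checks for key presence, not whether values are sensible.

```python
def missing_paths(template: dict, record: dict, prefix: str = "") -> list[str]:
    """Return dotted paths of template keys that the record does not contain."""
    missing = []
    for key, default in template.items():
        path = f"{prefix}{key}"
        if key not in record:
            missing.append(path)
        elif isinstance(default, dict) and isinstance(record[key], dict):
            # Recurse into nested sections such as "physical_attributes".
            missing.extend(missing_paths(default, record[key], prefix=f"{path}."))
    return missing
```

Run against the template and the Barnaby sample above, this would flag fields such as `character_id` and `relationships`, which the sample leaves out. That list is a useful starting point for the pruning and curation step.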

Taking a small subset of this generated data, I was able to generate:

Computer Animated Newfoundland/St. Bernard-esque Dog

Even more impressively, it generated a cool video as well:

I didn't get a chance this weekend to test stitching images or multiple videos together with continuity of character, but even just the continuity between the image and video was impressive (though in this case, the video was generated with a prompt that passed the image in). I think the progress we've seen, both in model quality and from structuring data the way novelists and creative writers teach it, shows that it's only a matter of time before a very small team, maybe even one person, creates a movie theater (or streaming) hit.

The barriers to entry for storytelling in animation and high-quality video seem to have been obliterated, and I think we're underappreciating how much of a change in the entertainment industry this will be.

This is before even thinking about extending the image -> video chain to image -> video -> video game. The movie industry is ~$100 billion, and the game industry is ~$400 billion. GenAI could put a serious dent in half a trillion dollars' worth of industry, and this isn't even the most talked-about use case.
