Experiments with AI, Animation, and New Approaches to Digital Art. Development of the short film 'Maintenance'
August 3, 2023
Preproduction AI Experiments
Digital image production paradigms are undergoing radical change due to the advent and popularisation of generative AI. These (and related) technologies will find their way into every aspect of the (digital) image production pipeline.
Preproduction Storyboards that look like storyboards…
Generative imagery is clearly interesting for exploring concepts, possible shots and angles, or interpretations of a scene, but is there value in generating images that explicitly ‘look’ like pre-production materials (concept style, storyboard style, etc.)?
Massaging specific scenes for concept/storyboard by refined prompts and model selection
The generative platform resists certain descriptions, such as ‘profile’, defaulting instead to a 3/4 view. Prompts that try to specify everything – position, direction, action, environment, expression, colors and so on – quickly become very unwieldy! Using image constraints, or iterating between sketches and generation cycles, becomes a necessary approach (human-in-the-loop).
Prompt Engineering is more complex than it seems…
We must understand the nature of prompts as tokens, and of the interpretive CLIP models that turn them into conditionings. As well as combining the different aspects of a description as text (i.e., manipulating the order and nature of tokens), there are various ways of combining different conditionings – notably: combining, concatenating, or averaging the weights of different prompt parts.
Left (a) = Prompt 1. Left (b) = Prompt 2. Right (a) = image gen with COMBINED conditioning. Right (b) = with CONCATENATED conditioning. Right (c) = WEIGHTED AVERAGE of conditioning.
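As a rough illustration of what concatenating or averaging two prompt conditionings looks like in code – a minimal sketch using the Hugging Face diffusers library rather than the tooling used for these tests; the model name, prompts and 50/50 weighting are assumptions for illustration only:

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

def encode(prompt: str) -> torch.Tensor:
    # Tokenize the prompt and run the CLIP text encoder to get a conditioning tensor
    tokens = pipe.tokenizer(
        prompt, padding="max_length",
        max_length=pipe.tokenizer.model_max_length,
        truncation=True, return_tensors="pt",
    ).input_ids.to("cuda")
    return pipe.text_encoder(tokens)[0]

cond_a = encode("a lone figure repairing wreckage, profile view, side on")
cond_b = encode("storyboard style, rough pencil lines, high contrast")
uncond = encode("")  # unconditional embedding for classifier-free guidance

# Concatenation: both prompts survive as separate runs of tokens.
concat_cond = torch.cat([cond_a, cond_b], dim=1)
concat_uncond = torch.cat([uncond, uncond], dim=1)  # lengths must match

# Weighted average: a 50/50 blend of the two conditionings.
avg_cond = 0.5 * cond_a + 0.5 * cond_b

# 'Combined' conditioning (sampling against both prompts at once) would need a
# custom denoising loop, so only the other two variants are generated here.
pipe(prompt_embeds=avg_cond, negative_prompt_embeds=uncond,
     num_inference_steps=30).images[0].save("averaged_conditioning.png")
pipe(prompt_embeds=concat_cond, negative_prompt_embeds=concat_uncond,
     num_inference_steps=30).images[0].save("concatenated_conditioning.png")
```

The same operations are typically exposed as nodes in graph-based front ends, which is usually the more practical way to experiment with them.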
Altering prompts can often improve results, but most often other interventions are also needed, such as constraint images via ControlNets, masking, pose estimators and other tools.
From sketch to concept art
Final concept image: an improvised shelter, constructed from spaceship wreckage.
In this example, a quick sketch is used as the starting point in a multi-stage iterative process. The sketch can be transformed into a high-quality concept image through the use of different guidance parameters and different trained models. Elements of the image can be refined separately – the layout, color scheme and content can be controlled independently by using different image guidance constraints.
Sketch → Constrained iteration → Concept image
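A minimal sketch of one pass of this sketch-to-concept loop, using an image-to-image pipeline from the diffusers library; the model, file names, prompt, and strength/guidance values below are placeholders rather than the settings used for these images:

```python
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Start from the hand-drawn sketch (placeholder file name)
init = Image.open("shelter_sketch.png").convert("RGB").resize((768, 512))

prompt = ("improvised shelter built from spaceship wreckage, "
          "desert at dusk, concept art, muted colour palette")

# 'strength' controls how far the result may drift from the input image:
# looser interpretation on the first pass, tighter on later passes so the
# established layout is preserved while detail is refined.
for step, strength in enumerate([0.75, 0.55, 0.35]):
    result = pipe(prompt=prompt, image=init, strength=strength,
                  guidance_scale=7.5).images[0]
    result.save(f"concept_iteration_{step}.png")
    init = result  # feed the output back in as the next constraint image
```

Between passes the human in the loop can also redraw, mask, or swap models, which is where most of the actual design control comes from.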
Pose constraints
A very useful feature for pre-production tasks is the ability to constrain generated images to defined poses.
Pose constraint → Generated 3D style (note leg dislocation) → Generated realistic style
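In code terms, this kind of pose constraint corresponds to running the base model together with a pose-conditioned ControlNet. A hedged sketch with diffusers follows – the model identifiers are publicly available ones, while the pose image file name and prompt are placeholders:

```python
import torch
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

# OpenPose-conditioned ControlNet attached to a base Stable Diffusion model
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-openpose", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet,
    torch_dtype=torch.float16
).to("cuda")

# The pose constraint: a pre-rendered OpenPose skeleton image (placeholder
# file name; it could also be extracted from a photo with a pose estimator).
pose = Image.open("pose_constraint.png").convert("RGB")

image = pipe(
    prompt="astronaut character, full body, 3D render style, neutral background",
    image=pose,
    num_inference_steps=30,
    controlnet_conditioning_scale=1.0,  # how strongly the pose is enforced
).images[0]
image.save("posed_character.png")
```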
Limitations on discipline understanding
The fine-tuned models available are mostly identified by style, not purpose. This means that it is relatively easy to select a model that aligns with a style (or to train one, given enough source images), but it is more difficult at this stage to create planning images with proper utility for their role in the design process. For example, while the models seem to understand that an expression sheet involves multiple faces, it is very difficult to create prompts that generate an appropriate array of expressions.
Creating facial expression sheets (with multiple fine-tuned models) is challenging – the models may not contain all expressions! Even when specifying the seven cardinal emotions in the prompt, the results lack the variation needed to convey emotion. The fine-tuned model includes too many examples of smiling!

While there is some consistency between the model poses (e.g., left), is this just by chance? The model lacks the understanding that all views are needed (see the repeats on the left) and that consistency in the character depiction is critical for these images to function as useful design reference (on the right, the body shape in the background is decorative, but inconsistent).

Inconsistent structure between versions or views. Multiple thumbs…

Component parts can be generated, but display typical errors (hands remain challenging, but can be refined to remove these errors – see the inpainting sketch below).

Known formats, such as character turnarounds, can be produced somewhat reliably, but these examples lack the strict consistency that may be needed for modelling reference.

The model can be prompted to produce sprite arrays. However, these appear largely ineffective for actual animation: they have captured the form of the artefact, but not its purpose and subsequent function.

Character design iterations. For now, the ability to ‘design’ is completely dominated by the model and the fine-tuning. Interesting artefacts emerge.
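As noted above, hand errors such as extra thumbs can often be repaired with masked regeneration (inpainting) over just the faulty region. A minimal sketch with a diffusers inpainting pipeline; the file names, mask and prompt are placeholders:

```python
import torch
from PIL import Image
from diffusers import StableDiffusionInpaintPipeline

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16
).to("cuda")

# Original generation plus a hand-painted mask (white = region to regenerate)
character = Image.open("character_sheet.png").convert("RGB").resize((512, 512))
hand_mask = Image.open("hand_mask.png").convert("RGB").resize((512, 512))

fixed = pipe(
    prompt="a human hand with five fingers, clean character design",
    image=character,
    mask_image=hand_mask,
    num_inference_steps=30,
).images[0]
fixed.save("character_sheet_fixed_hands.png")
```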