SpriteDX - Stage 2 - Prompt Engineering


We did some trial and error on the Stage 2 prompts in the previous post. Now I want to formalize a unified format for defining multi-shot prompts that works with Seedance 1 Pro.

XML Format

Here is the multi-shot prompt format I’m using:

<shot 
  num="1"
  id="UNIQUE_ID"
  src="filename_that_best_describes_the_shot.gif"
  tags="keywords for the shot and overall animation"
  other attributes…
>
  Description of the shot
</shot>

The benefits are that (1) this works for Seedance 1 Pro multi-shot prompting and (2) because it is XML, we can easily add other key-value pairs.
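To illustrate how easily extra key-value pairs slot in, here is a minimal Python sketch that assembles one `<shot>` element with the standard library. The helper name `build_shot` and its parameters are my own assumptions for illustration, not part of any Seedance tooling.

```python
import xml.etree.ElementTree as ET


def build_shot(num, shot_id, src, tags, description, **extra):
    """Return one <shot> element as an XML string.

    `extra` holds any additional key-value pairs (e.g. camera="fixed").
    build_shot is a hypothetical helper, not a Seedance API.
    """
    shot = ET.Element("shot", {"num": str(num), "id": shot_id, "src": src, "tags": tags})
    for key, value in extra.items():
        shot.set(key, str(value))
    shot.text = description
    return ET.tostring(shot, encoding="unicode")


prompt = build_shot(
    1,
    "IDLE",
    "eliana-sprite-idle-loop.gif",
    "white-bg sprite-anim idle-loop",
    "Eliana stands still facing right (+x).",
    camera="fixed",
    character="eliana",
)
print(prompt)
```

Adding a new attribute is just another keyword argument, which is the main appeal of the format.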

Here is an example prompt:

<shot 
  num="1"
  id="HI" 
  src="eliana-sprite-greet-loop.gif"
  character="eliana"
  costume="1"
  camera="fixed"
  tags="white-bg sprite-anim 角色帧动画"
>
  Pixel art 2D game sprite character for the game “Machi.” 
  Her name is Eliana (brown hair, ponytail, blue dress, red shoes, red hair tie)
  Eliana (from the reference image) waves or says “hi.” 
  full-body view. Smooth looping motion.
</shot>
<shot
  num="2" 
  id="IDLE" 
  character="eliana"
  costume="1"
  camera="fixed"
  src="eliana-sprite-idle-loop.gif"
  tags="white-bg sprite-anim idle-loop 角色帧动画"
>
  Eliana stands still facing right (+x). 
  Shows an idle animation with gentle breathing — at least two full cycles. 
  Character stays fixed in place, no translation. Loop seamlessly. 
</shot>
<shot
  num="3"
  id="RUN"
  character="eliana"
  costume="1"
  camera="fixed"
  src="eliana-sprite-run-loop.gif"
  dir="right"
  tags="white-bg sprite-anim run-cycle 角色帧动画"
>
  Eliana (runs:1.6) in-place facing (right:1.6) (+x direction). 
  Feet move while her body stays fixed in the frame. 
  No scrolling, zooming, or panning. Loop smoothly. Facing right.
</shot>

Why XML?

It is just a clean way to define an animation sequence. We could have chosen JSON or YAML, but XML seems to work fine and is quite readable.

There is also a clear separation between “metadata” such as camera="fixed" and the description of the content.
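That separation also makes the prompt easy to process on the tooling side. The sketch below is an assumption about how one might read a multi-shot prompt back in Python: because the prompt has several top-level `<shot>` elements, it is wrapped in a synthetic `<shots>` root (my own addition) before parsing. The shortened prompt text and the `split_shots` name are illustrative only.

```python
import xml.etree.ElementTree as ET

# A shortened two-shot prompt in the format described above.
PROMPT = """
<shot num="1" id="IDLE" camera="fixed" src="eliana-sprite-idle-loop.gif">
  Eliana stands still facing right (+x).
</shot>
<shot num="2" id="RUN" camera="fixed" dir="right" src="eliana-sprite-run-loop.gif">
  Eliana runs in-place facing right.
</shot>
"""


def split_shots(prompt):
    """Return a list of (metadata dict, description string), one per <shot>."""
    # Wrap the shots in a synthetic root so the text is a well-formed document.
    root = ET.fromstring(f"<shots>{prompt}</shots>")
    return [(dict(shot.attrib), (shot.text or "").strip()) for shot in root.iter("shot")]


shots = split_shots(PROMPT)
for metadata, description in shots:
    print(metadata["id"], metadata["camera"], "->", description)
```

The attributes come back as a plain dict and the description as a string, so the two never get mixed up.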

Example Renders


Alternate Idea - XML Format + 1

Instead of defining the content in natural language, can we describe it in a more structured way?

Right now, the innerText of each shot contains a description of what is happening in the scene.

But if we really want to mimic how HTML treats content, the description of what is happening should go into an alt attribute, and the element structure and hierarchy should define what goes inside the shots.

Format

<shot 
  ...
  alt="description of the shot"
>
  <character
    name="Name of the character"
    state="character animation state"
    src="character_animation_state_gif_file_name.gif"
    style="style: attributes;"
    alt="Description of what this object/character is doing"
  />
</shot>
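As a rough sketch of how this nested format could be generated rather than hand-written, the Python below builds a `<shot>` with one `<character>` child and serializes a style dict into the CSS-like `key: value;` string. The function names and the dict-based inputs are assumptions made for this example.

```python
import xml.etree.ElementTree as ET


def style_to_string(style):
    """Serialize {"shadow": "none"} as 'shadow: none;' (CSS-like)."""
    return " ".join(f"{key}: {value};" for key, value in style.items())


def build_scene_shot(shot_attrs, character_attrs, character_style):
    """Build a <shot> element that contains one <character> child."""
    shot = ET.Element("shot", shot_attrs)
    character = ET.SubElement(shot, "character", character_attrs)
    character.set("style", style_to_string(character_style))
    ET.indent(shot)  # pretty-print; requires Python 3.9+
    return ET.tostring(shot, encoding="unicode")


xml_text = build_scene_shot(
    {"num": "1", "id": "idle", "camera": "fixed", "alt": "Eliana stands still facing right."},
    {"name": "Eliana", "state": "idle", "src": "eliana-sprite-idle-loop.gif"},
    {"rendering": "pixelart", "shadow": "none", "effects": "none"},
)
print(xml_text)
```

Keeping the style as a dict until the last moment makes it simple to add or remove individual entries between experiments.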

Example

<shot 
  num="1"
  id="greet" 
  camera="fixed"
  zoom="1"
  loop="true"
  style="background: white;"
  alt="Pixel art 2D game sprite character for the game 'Machi'. Character waves or says 'hi.'"
  tags="pixelart spriteanim fullbody loopanim 角色帧动画"
>
  <character 
    class="ELI-93Q"
    name="Eliana"
    state="greet"
    dir="front"
    src="eliana-sprite-greet-loop.gif"
    style="rendering: pixelart; shadow: none; effects: none;"
    alt="pixel art character says HI."
  />
</shot>
<shot
  num="2"
  id="idle"
  camera="fixed"
  zoom="1"
  loop="true"
  style="background: white;"
  alt="Eliana stands still facing right (+x). Shows an idle animation with gentle breathing — at least two full cycles. "
  tags="pixelart spriteanim fullbody loopanim 角色帧动画"
>
  <character 
    class="ELI-93Q"
    name="Eliana"
    state="idle"
    dir="right"
    src="eliana-sprite-idle-loop.gif"
    style="rendering: pixelart; shadow: none; effects: none; direction: right;"
    alt="Pixel art character facing right breathes in and out."
  />
</shot>
<shot
  num="3"
  id="run"
  camera="fixed"
  zoom="1"
  loop="true"
  style="background: white;"
  alt="Eliana run in-place facing right. Shows an run loop animation — at least two full cycles."
  tags="pixelart spriteanim fullbody loopanim 角色帧动画"
>
  <character
    class="ELI-93Q"
    name="Eliana"
    state="run"
    dir="right"
    src="eliana-sprite-run-loop.gif"
    style="rendering: pixelart; shadow: none; effects: none;"
    alt="Pixel art character runs in-place for at least 3 full cycles."
  />
</shot>

Example Generations

This wing thingy keeps showing up. I guess it is just an issue with Seedance’s data distribution.


Prompt Issues

Let’s iterate on the method above. So far we’ve seen the following issues (sample size = 38):

| Issue Type | Count | Percentage | Notes |
| --- | --- | --- | --- |
| No issue | 20 | 52.6% | |
| Wing issue | 5 | 13.2% | Perhaps we can add a style entry decoration: none; |
| Costume Change Issue | 2 | 5.3% | |
| Missing Shot Issue | 3 | 7.9% | Perhaps we can add a duration="1s" attribute to each shot. |
| Run Direction Issue | 3 | 7.9% | Perhaps dir="right" is not explicit enough. Perhaps we can make it explicit with direction="right". |
| Character Dances | 2 | 5.3% | Umm… |
| Character Offset Issue | 1 | 2.6% | |
| Zoom Issue | 1 | 2.6% | |
| Background Issue | 1 | 2.6% | |
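The percentages are simply each count divided by the sample size. A minimal sketch of that tally in Python, using the counts from the table above and assuming each render was given exactly one label (the counts do sum to 38):

```python
# Issue counts from the first batch of 38 renders.
counts = {
    "No issue": 20,
    "Wing issue": 5,
    "Costume change issue": 2,
    "Missing shot issue": 3,
    "Run direction issue": 3,
    "Character dances": 2,
    "Character offset issue": 1,
    "Zoom issue": 1,
    "Background issue": 1,
}


def issue_rates(counts):
    """Map each issue type to its share of all samples, in percent (1 decimal)."""
    total = sum(counts.values())
    return {issue: round(100 * n / total, 1) for issue, n in counts.items()}


rates = issue_rates(counts)
for issue, pct in rates.items():
    print(f"{issue:<24}{counts[issue]:>3}{pct:>7.1f}%")
```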

Updated Prompt

Here is the updated prompt:

<shot 
  num="1"
  id="greet" 
  camera="fixed"
  zoom="1"
  duration="1s"
  loop="true"
  style="background: white;"
  alt="Pixel art 2D game sprite character for the game 'Machi'. Character waves or says 'hi.'"
  tags="pixelart spriteanim fullbody loopanim 角色帧动画"
>
  <character 
    class="ELI-93Q"
    name="Eliana"
    state="greet"
    dir="front"
    src="eliana-sprite-greet-loop.gif"
    style="rendering: pixelart; shadow: none; effects: none; decoration: none;"
    alt="pixel art character says HI."
  />
</shot>
<shot
  num="2"
  id="idle"
  camera="fixed"
  zoom="1"
  duration="1s"
  loop="true"
  style="background: white;"
  alt="Eliana stands still facing right (+x). Shows an idle animation with gentle breathing — at least two full cycles. "
  tags="pixelart spriteanim fullbody loopanim 角色帧动画"
>
  <character 
    class="ELI-93Q"
    name="Eliana"
    state="idle"
    direction="right"
    src="eliana-sprite-idle-loop.gif"
    style="rendering: pixelart; shadow: none; effects: none; decoration: none; direction: right;"
    alt="Pixel art character facing right breathes in and out."
  />
</shot>
<shot
  num="3"
  id="run"
  camera="fixed"
  zoom="1"
  duration="1s"
  loop="true"
  style="background: white;"
  alt="Eliana run in-place facing right. Shows an run loop animation — at least two full cycles."
  tags="pixelart spriteanim fullbody loopanim 角色帧动画"
>
  <character
    class="ELI-93Q"
    name="Eliana"
    state="run"
    dir="right"
    src="eliana-sprite-run-loop.gif"
    style="rendering: pixelart; shadow: none; effects: none; decoration: none;"
    alt="Pixel art character runs in-place for at least two full cycles."
  />
</shot>

Results

Sample size: 28

| Issue Type | Count | Percentage | Notes |
| --- | --- | --- | --- |
| No issue | 18 | 64.3% | 22% relative increase over the first batch (52.6% to 64.3%) |
| Wing issue | 1 | 3.6% | I guess decoration: none is not really enough. Let’s add wing: none; to be more explicit. |
| Costume Change Issue | 0 | 0.0% | |
| Missing Shot Issue | 1 | 3.6% | This was reduced significantly. 👍 |
| Run Direction Issue | 5 | 17.9% | This is still rather high. Above, direction="right" was only set on the idle character; let’s add it to run as well. |
| Character Dances | 0 | 0.0% | |
| Character Offset Issue | 1 | 3.6% | |
| Zoom Issue | 2 | 7.1% | |
| Background Issue | 0 | 0.0% | |
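The fixes suggested in the notes amount to appending entries to the CSS-like style string. Here is a small Python sketch of how that could be automated; `parse_style` and `add_style` are hypothetical helpers written for this example.

```python
def parse_style(style):
    """Parse 'shadow: none; effects: none;' into an ordered dict."""
    entries = {}
    for part in style.split(";"):
        if ":" in part:
            key, value = part.split(":", 1)
            entries[key.strip()] = value.strip()
    return entries


def add_style(style, **updates):
    """Return `style` with `updates` added (existing keys are overwritten)."""
    entries = parse_style(style)
    entries.update(updates)
    return " ".join(f"{key}: {value};" for key, value in entries.items())


old = "rendering: pixelart; shadow: none; effects: none; decoration: none;"
new = add_style(old, direction="right", wing="none")
print(new)
# rendering: pixelart; shadow: none; effects: none; decoration: none; direction: right; wing: none;
```

Because existing keys are overwritten rather than duplicated, the same call can be applied to every character in the prompt without checking what is already there.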

Updated Prompt 2

<shot 
  num="1"
  id="greet" 
  camera="fixed"
  zoom="1"
  duration="1s"
  loop="true"
  style="background: white;"
  alt="Pixel art 2D game sprite character for the game 'Machi'. Character waves or says 'hi.'"
  tags="pixelart spriteanim fullbody loopanim 角色帧动画"
>
  <character 
    class="ELI-93Q"
    name="Eliana"
    state="greet"
    direction="front"
    src="eliana-sprite-greet-loop.gif"
    style="rendering: pixelart; shadow: none; effects: none; decoration: none; direction: front; wing: none;"
    alt="pixel art character says HI."
  />
</shot>
<shot
  num="2"
  id="idle"
  camera="fixed"
  zoom="1"
  duration="1s"
  loop="true"
  style="background: white;"
  alt="Eliana stands still facing right (+x). Shows an idle animation with gentle breathing — at least two full cycles. "
  tags="pixelart spriteanim fullbody loopanim 角色帧动画"
>
  <character 
    class="ELI-93Q"
    name="Eliana"
    state="idle"
    direction="right"
    src="eliana-sprite-idle-loop.gif"
    style="rendering: pixelart; shadow: none; effects: none; decoration: none; direction: right; wing: none;"
    alt="Pixel art character facing right breathes in and out."
  />
</shot>
<shot
  num="3"
  id="run"
  camera="fixed"
  zoom="1"
  duration="1s"
  loop="true"
  style="background: white;"
  alt="Eliana run in-place facing right. Shows an run loop animation — at least two full cycles."
  tags="pixelart spriteanim fullbody loopanim 角色帧动画"
>
  <character
    class="ELI-93Q"
    name="Eliana"
    state="run"
    direction="right"
    src="eliana-sprite-run-loop.gif"
    style="rendering: pixelart; shadow: none; effects: none; decoration: none; direction: right; wing: none; "
    alt="Pixel art character runs in-place for at least two full cycles."
  />
</shot>

Results

Sample size: 21

| Issue Type | Count | Percentage | Notes |
| --- | --- | --- | --- |
| No issue | 17 | 81.0% | Additional 26% relative increase; 54% relative increase over the original baseline. |
| Wing issue | 0 | 0.0% | Wing issue was reduced significantly. |
| Costume Change Issue | 0 | 0.0% | |
| Missing Shot Issue | 0 | 0.0% | |
| Run Direction Issue | 3 | 14.3% | This is still an issue, but slightly reduced. |
| Character Dances | 0 | 0.0% | |
| Character Offset Issue | 1 | 4.8% | |
| Zoom Issue | 0 | 0.0% | |
| Background Issue | 0 | 0.0% | |
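The "increase" figures in the notes are relative changes in the no-issue rate between batches, not percentage-point differences. A quick sketch of that arithmetic in Python, using only the no-issue counts and sample sizes reported above (the names are my own):

```python
# (no-issue count, sample size) for each prompt revision.
batches = {
    "original": (20, 38),
    "updated prompt": (18, 28),
    "updated prompt 2": (17, 21),
}


def no_issue_rate(name):
    """Fraction of renders in the batch that had no issue."""
    ok, total = batches[name]
    return ok / total


def relative_increase(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100


first = relative_increase(no_issue_rate("original"), no_issue_rate("updated prompt"))
second = relative_increase(no_issue_rate("updated prompt"), no_issue_rate("updated prompt 2"))
overall = relative_increase(no_issue_rate("original"), no_issue_rate("updated prompt 2"))
print(f"{first:.0f}% then {second:.0f}%, {overall:.0f}% overall")
# 22% then 26%, 54% overall
```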

Conclusion

We defined a generic XML-based prompt format that allows key-value metadata and negative prompting through attributes such as style="shadow: none". We also defined a way to deliver a scene graph via the prompt.

The benefits are that it is (1) flexible and (2) extensible: it can work in various scenarios, and it gives the prompter the ability to describe composition as a scene graph rather than in natural language.

We also increased the prompt accuracy significantly by tuning those attributes, raising the no-issue rate from 52.6% (20 of 38 samples) to 81.0% (17 of 21 samples).


This was a rather successful round of research. I will update the code with this next.

—Sprited Dev