
Digital Being - State Machine Design

Published • 5 min read

Right now, we have a vessel that can express different states. What we need next is a controller that decides which state to express, and when.

Model vs State Machine

The behavior of a digital being is governed by two cooperating systems:

  • the Model — decides what to do

  • the State Machine — enforces how it can be done

They operate at different layers and must remain strictly separated.

💪 The State Machine defines the structure and rules of the vessel.

It is responsible for:

  • Defining all possible states

  • Mapping states to animations and actuations

  • Enforcing transition constraints

  • Handling timing (looping, completion)

  • Handling interruption rules (what can and cannot be interrupted)

  • Resolving transitions deterministically

It answers:

Given a requested state change, is it valid, and how is it executed?

The state machine is authoritative. Even if a higher-level system requests something invalid, the state machine can reject or defer it.

Example: If the model requests sleep while the entity is mid-jump, the state machine may ignore it, queue it, or redirect to a valid intermediate state.
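The rejection/queueing behavior above can be sketched as follows. This is a minimal illustration, not the actual implementation; the class name, the transition table, and the `resolve` method are all hypothetical.

```python
# Hypothetical sketch: a state machine that accepts, queues, or rejects
# a requested state change. The states and transition table are toy
# examples, not the post's real state set.

class StateMachine:
    # States that may not be interrupted mid-way.
    UNINTERRUPTIBLE = {"jump"}
    # Allowed transitions: current state -> valid next states.
    TRANSITIONS = {
        "idle": {"walk", "run", "jump", "sleep"},
        "jump": {"idle"},
        "walk": {"idle", "run", "sleep"},
        "run": {"idle", "walk"},
        "sleep": {"idle"},
    }

    def __init__(self, state="idle"):
        self.state = state
        self.queued = None

    def resolve(self, requested):
        """Accept, queue, or reject a requested state change."""
        if self.state in self.UNINTERRUPTIBLE:
            self.queued = requested  # defer until the current state completes
            return "queued"
        if requested in self.TRANSITIONS[self.state]:
            self.state = requested
            return "accepted"
        return "rejected"

sm = StateMachine(state="jump")
print(sm.resolve("sleep"))  # queued: mid-jump is not interruptible
```

Note that the model never mutates `state` directly; it only gets back one of three outcomes, which keeps the state machine authoritative.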

🤖 The Model is the policy that selects behavior over time. It is responsible for:

  • Observing context (world, needs, memory, events)

  • Choosing the next desired state (or intent)

  • Deciding when to interrupt current behavior

  • Providing high-level direction (goal-driven or reactive)

It answers:

What should this being try to do next?

The model is suggestive, not authoritative.

It proposes actions, but does not guarantee execution.

Example: An LLM-based model may request engage after detecting another agent.

Aspect       Model                    State Machine
Role         Decision                 Execution
Authority    Suggestive               Final
Focus        Intent / Choice          Validity / Constraints
Time Scale   Discrete decisions       Continuous execution
Input        Context, memory, needs   Current state, animation status
Output       Desired state            Actual state transitions

Interaction Flow

At each tick:

  1. Model evaluates context
    → proposes a desired state (e.g. run)

  2. State Machine evaluates proposal
    → checks: Is the transition allowed? Is the current state interruptible? Are preconditions met?

  3. State Machine decides outcome: accept, reject, redirect.

  4. State machine executes:

    • Plays animation

    • Applies actuation

    • Tracks completion
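The four steps above can be condensed into a per-tick loop. This is a sketch under assumed interfaces: the `Model`, `StateMachine`, and `tick` names and the toy fatigue policy are illustrative, not from the post.

```python
# Sketch of one tick of the model / state-machine interaction.
# All class and method names here are hypothetical stand-ins.

class Model:
    def propose(self, context):
        # Toy policy: rest when fatigue is high, otherwise wander.
        return "rest" if context["fatigue"] > 80 else "wander"

class StateMachine:
    ALLOWED = {"rest", "wander", "idle"}

    def __init__(self):
        self.state = "idle"

    def can_transition(self, desired):
        return desired in self.ALLOWED

    def execute(self, desired):
        self.state = desired  # would also play animation, apply actuation

def tick(model, sm, context):
    desired = model.propose(context)   # 1. model evaluates context
    if sm.can_transition(desired):     # 2. state machine evaluates proposal
        sm.execute(desired)            # 3-4. accept and execute
        return "accepted"
    return "rejected"                  # could also queue or redirect

sm = StateMachine()
tick(Model(), sm, {"fatigue": 92})
print(sm.state)  # rest
```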


Separation of Concerns

The state machine owns the low-level states. This keeps the model's vocabulary small and high-level: the model speaks in intents, not in animation details.


Language of the Model

We need to define the language that the model uses to communicate with the state machine.

For example, suppose our being is tired. On a given tick or event, we can prompt the model with the following:

{
  "messages": [
    {
      "role": "system",
      "content": "You are controlling a digital being. Use tools when appropriate."
    },
    {
      "role": "user",
      "content": "Current state: fatigue=92, safe=true, grounded=true"
    }
  ],
  "tools": [{
    "name": "rest",
    "description": "Request bodily rest to reduce fatigue.",
    "parameters": {
      "type": "object",
      "properties": {
        "durationSec": {
          "type": "integer",
          "minimum": 1
        },
        "preferredMode": {
          "type": "string",
          "enum": ["sit", "nap", "sleep"]
        }
      },
      "required": ["durationSec"]
    }
  }]
}

The model output will then look like:

{
  "tool_calls": [
    {
      "id": "call_1",
      "name": "rest",
      "arguments": {
        "durationSec": 600,
        "preferredMode": "nap"
      }
    }
  ]
}

The state machine (or the Body) then executes it and returns the following result:

{
  "role": "tool",
  "tool_call_id": "call_1",
  "content": {
    "accepted": true,
    "finalState": "sleep",
    "durationSec": 600
  }
}
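Tying the two messages together, a body-side executor might look like the sketch below. The `execute_tool_call` function and the rule that heavy fatigue escalates a requested nap into full sleep (which would explain `"preferredMode": "nap"` producing `"finalState": "sleep"`) are assumptions for illustration.

```python
# Hypothetical sketch: turn the model's tool call into the tool result.
# The escalation rule (fatigue > 90 forces sleep) is an assumption.

def execute_tool_call(call, fatigue):
    """Map a `rest` tool call to a concrete state and build the reply."""
    args = call["arguments"]
    mode = args.get("preferredMode", "sit")
    # The body may override the preferred mode based on its own state.
    final = "sleep" if fatigue > 90 else mode
    return {
        "role": "tool",
        "tool_call_id": call["id"],
        "content": {
            "accepted": True,
            "finalState": final,
            "durationSec": args["durationSec"],
        },
    }

result = execute_tool_call(
    {"id": "call_1", "name": "rest",
     "arguments": {"durationSec": 600, "preferredMode": "nap"}},
    fatigue=92,
)
print(result["content"]["finalState"])  # sleep
```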

While sleeping, the state machine won't bother the model much. However, if a big change in the environment is detected, it will wake the being automatically and query the model again.


Q: Should State Machine (Body) also be an LLM?

Just a thought, but we could use a smaller, fine-tuned LLM here. An LLM-driven body would give us stochastic behavior more easily: a deterministic state machine has no probabilistic transitions, so its behavior tends to be threshold-y and inorganic. If the state machine were an LLM, the being could behave more naturally depending on the situation and its parameters.

One other benefit is that we could then interface with the main model in human words rather than in tool calls.

On the other hand, we will still need some sort of tool calls anyway. So I think we should start with a basic state machine that is not powered by an LLM.


Bag of Skills

Let's now define our bag of skills.

{
  "rest": { "durationSec": "number", "preferredMode": ["sit", "nap", "sleep"] },
  "eat": { "targetId": "entity|item", "amount": "optional" },
  "drink": { "targetId": "entity|item" },
  "moveTo": { "target": "position|entityId", "speed": ["walk", "run"] },
  "wander": { "radius": "number", "durationSec": "optional" },
  "follow": { "target": "entity", "distance": "number" },
  "flee": { "from": "entity|position", "durationSec": "optional" },
  "approach": { "targetId": "entity", "distance": "number" },
  "pickUp": { "targetId": "item" },
  "drop": { "targetId": "item", "position": "optional" },
  "use": { "targetId": "item|object" },
  "inspect": { "targetId": "entity|object" },
  "harvest": { "targetId": "resourceNode" },
  "engage": { "targetId": "entity" },
  "greet": { "targetId": "entity", "style": ["neutral", "friendly", "excited"] },
  "communicate": { "targetId": "entity", "content": "string" },
  "mimic": { "targetId": "entity" },
  "runAway": { "targetId": "entity|item" },
  "express": { "type": ["happy", "sad", "angry", "laugh", "cry"], "durationSec": "optional" },
  "gesture": { "type": ["wave", "nod", "bow"] },
  "pose": { "type": ["sit", "stand", "idle"] }
  "attack": { "targetId": "entity", "style": "optional" },
  "defend": { "durationSec": "optional" },
  "evade": { "direction": "optional" },
  "remember": { "key": "string", "value": "any" },
  "recall": { "key": "string" },
  "plan": { "goal": "string" },
  "reflect": { "topic": "optional" },
}
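Before the state machine executes any of these, it should check that a tool call names a known skill and supplies only declared parameters. A minimal validator sketch, using an assumed subset of the bag (the `SKILLS` dict and `validate` function are illustrative, not a spec):

```python
# Hypothetical validator for skill calls: reject unknown skills and
# undeclared parameters. SKILLS is a small subset of the bag above.

SKILLS = {
    "rest": {"durationSec", "preferredMode"},
    "moveTo": {"target", "speed"},
    "greet": {"targetId", "style"},
}

def validate(skill, args):
    if skill not in SKILLS:
        return False, f"unknown skill: {skill}"
    unknown = set(args) - SKILLS[skill]
    if unknown:
        return False, f"unknown parameters: {sorted(unknown)}"
    return True, "ok"

print(validate("rest", {"durationSec": 600}))  # (True, 'ok')
print(validate("fly", {}))                     # (False, 'unknown skill: fly')
```

A real version would also check required fields and types against the schema, but even this thin layer keeps malformed model output from reaching the body.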

This comes to 507 tokens. I think this is a good place to stop. Let's see if we can implement these skills next.

-- Sprited Dev 🐛