
Digital Being - State Machine Design

Published • 5 min read

Right now, we have a vessel that can express different states. What we need next is a controller that decides which state to express, and when.

Model vs State Machine

The behavior of a digital being is governed by two cooperating systems:

  • the Model — decides what to do

  • the State Machine — enforces how it can be done

They operate at different layers and must remain strictly separated.

💪 The State Machine defines the structure and rules of the vessel.

It is responsible for:

  • Defining all possible states

  • Mapping states to animations and actuations

  • Enforcing transition constraints

  • Handling timing (looping, completion)

  • Handling interruption rules (what can and cannot be interrupted)

  • Resolving transitions deterministically

It answers:

Given a requested state change, is it valid, and how is it executed?

The state machine is authoritative. Even if a higher-level system requests something invalid, the state machine can reject or defer it.

Example: If the model requests sleep while the entity is mid-jump, the state machine may ignore it, queue it, or redirect to a valid intermediate state.
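The rejection/queueing behavior above can be sketched as follows. This is a minimal illustration, not the actual implementation; the class name, the transition table, and the `resolve` method are all hypothetical.

```python
# Hypothetical sketch: a state machine that accepts, queues, or rejects
# a requested state change. The states and transition table are toy
# examples, not the post's real state set.

class StateMachine:
    # States that may not be interrupted mid-way.
    UNINTERRUPTIBLE = {"jump"}
    # Allowed transitions: current state -> valid next states.
    TRANSITIONS = {
        "idle": {"walk", "run", "jump", "sleep"},
        "jump": {"idle"},
        "walk": {"idle", "run", "sleep"},
        "run": {"idle", "walk"},
        "sleep": {"idle"},
    }

    def __init__(self, state="idle"):
        self.state = state
        self.queued = None

    def resolve(self, requested):
        """Accept, queue, or reject a requested state change."""
        if self.state in self.UNINTERRUPTIBLE:
            self.queued = requested  # defer until the current state completes
            return "queued"
        if requested in self.TRANSITIONS[self.state]:
            self.state = requested
            return "accepted"
        return "rejected"

sm = StateMachine(state="jump")
print(sm.resolve("sleep"))  # queued: mid-jump is not interruptible
```

Note that the model never mutates `state` directly; it only gets back one of three outcomes, which keeps the state machine authoritative.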

🤖 The Model is the policy that selects behavior over time. It is responsible for:

  • Observing context (world, needs, memory, events)

  • Choosing the next desired state (or intent)

  • Deciding when to interrupt current behavior

  • Providing high-level direction (goal-driven or reactive)

It answers:

What should this being try to do next?

The model is suggestive, not authoritative.

It proposes actions, but does not guarantee execution.

Example: An LLM-based model may request engage after detecting another agent.

Aspect       Model                    State Machine
Role         Decision                 Execution
Authority    Suggestive               Final
Focus        Intent / Choice          Validity / Constraints
Time Scale   Discrete decisions       Continuous execution
Input        Context, memory, needs   Current state, animation status
Output       Desired state            Actual state transitions

Interaction Flow

At each tick:

  1. Model evaluates context
    → proposes a desired state (e.g. run)

  2. State Machine evaluates proposal
    → checks: Is the transition allowed? Is the current state interruptible? Are preconditions met?

  3. State Machine decides outcome: accept, reject, redirect.

  4. State machine executes:

    • Plays animation

    • Applies actuation

    • Tracks completion
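The four steps above can be condensed into a per-tick loop. This is a sketch under assumed interfaces: the `Model`, `StateMachine`, and `tick` names and the toy fatigue policy are illustrative, not from the post.

```python
# Sketch of one tick of the model / state-machine interaction.
# All class and method names here are hypothetical stand-ins.

class Model:
    def propose(self, context):
        # Toy policy: rest when fatigue is high, otherwise wander.
        return "rest" if context["fatigue"] > 80 else "wander"

class StateMachine:
    ALLOWED = {"rest", "wander", "idle"}

    def __init__(self):
        self.state = "idle"

    def can_transition(self, desired):
        return desired in self.ALLOWED

    def execute(self, desired):
        self.state = desired  # would also play animation, apply actuation

def tick(model, sm, context):
    desired = model.propose(context)   # 1. model evaluates context
    if sm.can_transition(desired):     # 2. state machine evaluates proposal
        sm.execute(desired)            # 3-4. accept and execute
        return "accepted"
    return "rejected"                  # could also queue or redirect

sm = StateMachine()
tick(Model(), sm, {"fatigue": 92})
print(sm.state)  # rest
```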


Separation of Concerns

The state machine owns the low-level states. This keeps the model's vocabulary small and high-level: the model speaks in intents, not in animation details.


Language of the Model

We need to define the language that the model uses to communicate with the state machine.

For example, suppose our being is tired. On a given tick or event, we can prompt the model with the following:

{
  "messages": [
    {
      "role": "system",
      "content": "You are controlling a digital being. Use tools when appropriate."
    },
    {
      "role": "user",
      "content": "Current state: fatigue=92, safe=true, grounded=true"
    }
  ],
  "tools": [{
    "name": "rest",
    "description": "Request bodily rest to reduce fatigue.",
    "parameters": {
      "type": "object",
      "properties": {
        "durationSec": {
          "type": "integer",
          "minimum": 1
        },
        "preferredMode": {
          "type": "string",
          "enum": ["sit", "nap", "sleep"]
        }
      },
      "required": ["durationSec"]
    }
  }]
}

The model output will then look like:

{
  "tool_calls": [
    {
      "id": "call_1",
      "name": "rest",
      "arguments": {
        "durationSec": 600,
        "preferredMode": "nap"
      }
    }
  ]
}

The state machine (or the Body) then executes it and returns the following result:

{
  "role": "tool",
  "tool_call_id": "call_1",
  "content": {
    "accepted": true,
    "finalState": "sleep",
    "durationSec": 600
  }
}
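Tying the two messages together, a body-side executor might look like the sketch below. The `execute_tool_call` function and the rule that heavy fatigue escalates a requested nap into full sleep (which would explain `"preferredMode": "nap"` producing `"finalState": "sleep"`) are assumptions for illustration.

```python
# Hypothetical sketch: turn the model's tool call into the tool result.
# The escalation rule (fatigue > 90 forces sleep) is an assumption.

def execute_tool_call(call, fatigue):
    """Map a `rest` tool call to a concrete state and build the reply."""
    args = call["arguments"]
    mode = args.get("preferredMode", "sit")
    # The body may override the preferred mode based on its own state.
    final = "sleep" if fatigue > 90 else mode
    return {
        "role": "tool",
        "tool_call_id": call["id"],
        "content": {
            "accepted": True,
            "finalState": final,
            "durationSec": args["durationSec"],
        },
    }

result = execute_tool_call(
    {"id": "call_1", "name": "rest",
     "arguments": {"durationSec": 600, "preferredMode": "nap"}},
    fatigue=92,
)
print(result["content"]["finalState"])  # sleep
```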

While sleeping, the state machine won't bother the model much. However, if a big change in the environment is detected, it will wake the being automatically and query the model again.


Q: Should State Machine (Body) also be an LLM?

Just a thought, but we could use a smaller, fine-tuned LLM here. An LLM-driven body would give us stochastic behavior more easily: a deterministic state machine has no probabilistic transitions, so its behavior tends to be threshold-y and inorganic. If the state machine were an LLM, the being could behave more naturally depending on the situation and its parameters.

One other benefit is that we could then interface with the main model in human words rather than in tool calls.

On the other hand, we will still need some sort of tool calls anyway. So I think we should start with a basic state machine that is not powered by an LLM.


Bag of Skills

Let's now define our bag of skills.

{
  "rest": { "durationSec": "number", "preferredMode": ["sit", "nap", "sleep"] },
  "eat": { "targetId": "entity|item", "amount": "optional" },
  "drink": { "targetId": "entity|item" },
  "moveTo": { "target": "position|entityId", "speed": ["walk", "run"] },
  "wander": { "radius": "number", "durationSec": "optional" },
  "follow": { "target": "entity", "distance": "number" },
  "flee": { "from": "entity|position", "durationSec": "optional" },
  "approach": { "targetId": "entity", "distance": "number" },
  "pickUp": { "targetId": "item" },
  "drop": { "targetId": "item", "position": "optional" },
  "use": { "targetId": "item|object" },
  "inspect": { "targetId": "entity|object" },
  "harvest": { "targetId": "resourceNode" },
  "engage": { "targetId": "entity" },
  "greet": { "targetId": "entity", "style": ["neutral", "friendly", "excited"] },
  "communicate": { "targetId": "entity", "content": "string" },
  "mimic": { "targetId": "entity" },
  "runAway": { "targetId": "entity|item" },
  "express": { "type": ["happy", "sad", "angry", "laugh", "cry"], "durationSec": "optional" },
  "gesture": { "type": ["wave", "nod", "bow"] },
  "pose": { "type": ["sit", "stand", "idle"] }
  "attack": { "targetId": "entity", "style": "optional" },
  "defend": { "durationSec": "optional" },
  "evade": { "direction": "optional" },
  "remember": { "key": "string", "value": "any" },
  "recall": { "key": "string" },
  "plan": { "goal": "string" },
  "reflect": { "topic": "optional" },
}
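Before the state machine executes any of these, it should check that a tool call names a known skill and supplies only declared parameters. A minimal validator sketch, using an assumed subset of the bag (the `SKILLS` dict and `validate` function are illustrative, not a spec):

```python
# Hypothetical validator for skill calls: reject unknown skills and
# undeclared parameters. SKILLS is a small subset of the bag above.

SKILLS = {
    "rest": {"durationSec", "preferredMode"},
    "moveTo": {"target", "speed"},
    "greet": {"targetId", "style"},
}

def validate(skill, args):
    if skill not in SKILLS:
        return False, f"unknown skill: {skill}"
    unknown = set(args) - SKILLS[skill]
    if unknown:
        return False, f"unknown parameters: {sorted(unknown)}"
    return True, "ok"

print(validate("rest", {"durationSec": 600}))  # (True, 'ok')
print(validate("fly", {}))                     # (False, 'unknown skill: fly')
```

A real version would also check required fields and types against the schema, but even this thin layer keeps malformed model output from reaching the body.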

This comes to 507 tokens. I think this is a good place to stop. Let's see if we can implement these skills next.

-- Sprited Dev 🐛