# Journey into Running Comfy Headless on RunPod Serverless

Yesterday, I tried deploying Comfy on Fly.io and ran into machine timeouts. Perhaps there’s a shortage of machines there, or maybe my Docker image is just too large for it to handle.

Instead of continuing down that path, I decided to switch focus and try out RunPod Serverless. It already has a base Docker image for headless ComfyUI, which would fit our needs perfectly—if it delivers on performance.

[https://github.com/runpod-workers/worker-comfyui](https://github.com/runpod-workers/worker-comfyui)

---

## Prerequisites

* Computer with Internet access.
    
* Lunch money.
    

---

## Easy Mode — Watch out for Rabbit Hole…

Here are the steps I took to try to get it up and running with the FLUX.1-dev models. I just wanted to see it working. Instructions are based on the [official documentation](https://github.com/runpod-workers/worker-comfyui/blob/main/docs/deployment.md#deploying-pre-built-official-images).

First, get a RunPod account, and put your lunch money in there.

Go to [https://hub.docker.com/r/runpod/worker-comfyui](https://hub.docker.com/r/runpod/worker-comfyui)

* Select from recent tags. I chose: `5.3.0-flux1-dev`.
    
* Copy the string after “docker pull” → `runpod/worker-comfyui:5.3.0-flux1-dev`
    

Go to RunPod &gt; Templates &gt; Go to My Templates &gt; New Template

* Name: comfy-worker-poc
    
* Serverless
    
* Container Image: `runpod/worker-comfyui:5.3.0-flux1-dev`
    
* Container Disk: 30 GB
    
* Then “Save Template.”
    

Now, go to RunPod &gt; Serverless &gt; New Endpoint

* Click on “Import from Docker Registry”
    
* Choose a template &gt; comfy-worker-poc
    
* Endpoint Name: comfy-worker-poc
    
* Endpoint Type: Queue
    
* Worker Type: GPU
    
* GPU Configuration: 24GB PRO
    
* Then “Deploy.”
    

Once you see Workers in `idle`, you can go to Serverless &gt; comfy-worker-poc &gt; Requests.

I tried copying and pasting [test\_input.json](https://github.com/runpod-workers/worker-comfyui/blob/main/test_input.json) from the repo directly, but ran into an error: `'type': 'value_not_in_list', 'message': 'Value not in list', 'details': "ckpt_name: 'flux1-dev-fp8.safetensors' not in []"`

---

## Rabbit Hole

> Feel free to skip this section entirely; it’s just a log of me embarrassing myself.

Looks like the checkpoint is not found. Let’s decipher the [Dockerfile](https://github.com/runpod-workers/worker-comfyui/blob/main/Dockerfile) to see which Flux model variant is embedded in the images.

It has the following lines, which suggest that because we selected the `flux1-dev` model type, the image won’t include the quantized `flux1-dev-fp8` checkpoint.

```dockerfile
RUN if [ "$MODEL_TYPE" = "flux1-dev" ]; then \
      wget -q --header="Authorization: Bearer ${HUGGINGFACE_ACCESS_TOKEN}" -O models/unet/flux1-dev.safetensors https://huggingface.co/black-forest-labs/FLUX.1-dev/resolve/main/flux1-dev.safetensors && \
      wget -q -O models/clip/clip_l.safetensors https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/clip_l.safetensors && \
      wget -q -O models/clip/t5xxl_fp8_e4m3fn.safetensors https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/t5xxl_fp8_e4m3fn.safetensors && \
      wget -q --header="Authorization: Bearer ${HUGGINGFACE_ACCESS_TOKEN}" -O models/vae/ae.safetensors https://huggingface.co/black-forest-labs/FLUX.1-dev/resolve/main/ae.safetensors; \
    fi

RUN if [ "$MODEL_TYPE" = "flux1-dev-fp8" ]; then \
      wget -q -O models/checkpoints/flux1-dev-fp8.safetensors https://huggingface.co/Comfy-Org/flux1-dev/resolve/main/flux1-dev-fp8.safetensors; \
    fi
```

Let’s go back to the Requests tab and replace `flux1-dev-fp8.safetensors` with `flux1-dev.safetensors`.

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1755623492880/e58c9fca-4360-4bba-94af-83f7fcb08d0d.png align="center")

```diff
- "ckpt_name": "flux1-dev-fp8.safetensors"
+ "ckpt_name": "flux1-dev.safetensors"
```

Then click “Run.” Unfortunately, that didn’t work either.

```plaintext
ckpt_name: 'flux1-dev.safetensors' not in []
```

Let’s check back on the [Dockerfile](https://github.com/runpod-workers/worker-comfyui/blob/main/Dockerfile). In the fine print, it expects `HUGGINGFACE_ACCESS_TOKEN`, and since we haven’t provided a Hugging Face access token as an environment variable, the model download would have failed. (Interestingly, the `flux1-dev-fp8` model is distributed without Hugging Face authentication. It’s a little weird that the non-quantized version requires authentication but the quantized one doesn’t.)

```dockerfile
RUN if [ "$MODEL_TYPE" = "flux1-dev" ]; then \
      wget -q --header="Authorization: Bearer ${HUGGINGFACE_ACCESS_TOKEN}" -O models/unet/flux1-dev.safetensors https://huggingface.co/black-forest-labs/FLUX.1-dev/resolve/main/flux1-dev.safetensors && \
      wget -q -O models/clip/clip_l.safetensors https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/clip_l.safetensors && \
      wget -q -O models/clip/t5xxl_fp8_e4m3fn.safetensors https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/t5xxl_fp8_e4m3fn.safetensors && \
      wget -q --header="Authorization: Bearer ${HUGGINGFACE_ACCESS_TOKEN}" -O models/vae/ae.safetensors https://huggingface.co/black-forest-labs/FLUX.1-dev/resolve/main/ae.safetensors; \
    fi

RUN if [ "$MODEL_TYPE" = "flux1-dev-fp8" ]; then \
      wget -q -O models/checkpoints/flux1-dev-fp8.safetensors https://huggingface.co/Comfy-Org/flux1-dev/resolve/main/flux1-dev-fp8.safetensors; \
    fi
```

I was hoping to see a wget failure in the logs, but it looks like the logs got trimmed; I didn’t see anything that was emitted during startup.

Anyhow, we now need to provide a Hugging Face token. Let’s create one. (It would be easier to just switch to the quantized one, but let’s try it anyway.)

Go to Hugging Face and create an account (free). Then navigate to the FLUX.1-dev page: [https://huggingface.co/black-forest-labs/FLUX.1-dev](https://huggingface.co/black-forest-labs/FLUX.1-dev).

Access is only granted to people who agree to the terms and conditions. Accept them, and you should have access to the weights.

Now go to Hugging Face &gt; Access Tokens &gt; Create new token.

* Token type: Read
    
* Token name: runpod-comfy-worker
    

Copy the generated token and hop on over to RunPod &gt; Serverless &gt; comfy-worker-poc &gt; Manage &gt; Edit Endpoint &gt; Environment Variables.

* key: `HUGGINGFACE_ACCESS_TOKEN`
    
* value: Paste in your key
    

Once you save it, RunPod will start to roll the change out.

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1755624733464/b50420ec-dbd4-43a8-a721-61b80cb037d4.png align="center")

Once the rollout completes, let’s run the request again. I still got the same issue though.

`ckpt_name: 'flux1-dev.safetensors' not in []`

Perhaps the “rollout” isn’t going to re-run the commands in the Dockerfile. … Actually, yeah, the Docker image is already pre-built, so the `RUN` steps in the Dockerfile only execute at build time, not when we bring up the instance 🙈.

Let’s actually run the docker image locally to see what is going on.

```bash
docker pull --platform linux/amd64 runpod/worker-comfyui:5.3.0-flux1-dev
docker run --platform linux/amd64 -it runpod/worker-comfyui:5.3.0-flux1-dev
```

This will take some time depending on your network speed. It’s gonna take several hours on my **Panera Bread Free Wifi** 🙈.

Okay, either I move my ass to my office which has a slightly better internet or let’s switch to `runpod/worker-comfyui:5.3.0-base` instead.

Go to RunPod &gt; Serverless &gt; comfy-worker-poc &gt; Manage &gt; Edit Endpoint &gt; Docker Configuration

* Switch from `runpod/worker-comfyui:5.3.0-flux1-dev` to `runpod/worker-comfyui:5.3.0-base`.
    

Once the rollout completes, let’s copy paste the exact text from [test\_input.json](https://github.com/runpod-workers/worker-comfyui/blob/main/test_input.json) into Requests. 🥁 Nah. That didn’t work either. 🙈

Okay, okay, let’s breathe in and out a few times and try again. A guy at the table next to mine in Panera Bread has been fighting with his sister over the phone for an hour, but it’s okay, let’s just breathe in and out. It’s going to be okay.

The main issue is that I don’t have visibility into the running process on RunPod Serverless, since there is no server to connect to. So I can’t inspect what’s inside the folders unless I run it as a Pod or run it locally.

I actually have an Ubuntu server with a better network connection. Let’s SSH in.

```bash
docker pull --platform linux/amd64 runpod/worker-comfyui:5.3.0-flux1-dev
```

Hold on ✋ The image is 29.99 GB. That’s really big. Hmm, this one bundles the Flux1-dev model, which alone is 23.8 GB. Let’s switch to the `base` variant, which I’m assuming comes with the fp8-quantized model. That one’s only 8.6 GB.

```bash
docker pull --platform linux/amd64 runpod/worker-comfyui:5.3.0-base
```

Okay good. Now let’s run it just to check what’s inside the models folder.

```bash
docker run --rm --platform linux/amd64 -it runpod/worker-comfyui:5.3.0-base bash
```

* `--rm` → ensures the container is deleted when you exit.
    
* `-it` → attaches an interactive TTY.
    
* `bash` (or `sh`) → gives you a shell inside the container.
    

Now, let’s check.

```bash
ls /comfyui/models/checkpoints
```

It returns nothing other than the `put_checkpoint_here` file.

```bash
ls /comfyui/models/diffusion_models
```

No luck here either.

```bash
find /comfyui/models -type f -name "*flux*"
```

Looking at the [README](https://github.com/runpod-workers/worker-comfyui/blob/main/README.md) again, it actually says the `base` variant does not come with any models. So my assumption that it came with flux1-dev-fp8 was wrong 😅. I should have seen it coming… My bad.

> * `runpod/worker-comfyui:<version>-base`: Clean ComfyUI install with no models.
>     

Okay, back to downloading Flux1-dev variant again. 🥹

I’m gonna try 5.2.0 instead of 5.3.0 because 5.3.0 may have had a regression.

```bash
docker pull --platform linux/amd64 runpod/worker-comfyui:5.2.0-flux1-dev
```

The Flux model accounts for 23.8 GB of the 29.99 GB, so I’m expecting it to be in there somewhere.

While the Docker image downloads, let’s talk a little about the SpriteDX architecture. My inclination is to use headless Comfy as a workflow orchestration layer for SpriteDX image/video generation. We will create Comfy workflows and store their JSON exports inside a separate Python-based API service, which will gatekeep headless Comfy. We could simply expose Comfy’s `/prompt` endpoint, but each call is expensive and not vetted for security, so it’s best to keep it behind a gating API layer. The gating API service will live in the same Docker image and talk to the Comfy service. The **worker-comfyui** image probably already has some sort of API frontend that gates Comfy. I’ll look into that next.
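To make the gating idea a bit more concrete, here is a minimal sketch of what such a layer might do: look up a stored workflow by name and fill in only the parameters that are explicitly allowed. Everything here is hypothetical (the `WORKFLOWS` registry, the `ALLOWED_PARAMS` table, the `build_workflow` name); none of it comes from worker-comfyui.

```python
import copy

# Hypothetical registry: workflow name -> exported ComfyUI API JSON (as a dict).
# In practice these would be loaded from the stored JSON exports.
WORKFLOWS = {
    "txt2img": {
        "6": {"class_type": "CLIPTextEncode", "inputs": {"text": "", "clip": ["41", 0]}},
        "31": {"class_type": "KSampler", "inputs": {"seed": 0, "steps": 20}},
    },
}

# Hypothetical allowlist: public parameter -> (node id, input name, expected type).
ALLOWED_PARAMS = {
    "txt2img": {
        "prompt": ("6", "text", str),
        "seed": ("31", "seed", int),
    },
}


def build_workflow(name: str, params: dict) -> dict:
    """Return a filled-in copy of a stored workflow, rejecting anything unexpected."""
    if name not in WORKFLOWS:
        raise ValueError(f"unknown workflow: {name!r}")
    allowed = ALLOWED_PARAMS[name]
    workflow = copy.deepcopy(WORKFLOWS[name])  # never mutate the stored template
    for key, value in params.items():
        if key not in allowed:
            raise ValueError(f"parameter not allowed: {key!r}")
        node_id, input_name, expected_type = allowed[key]
        if not isinstance(value, expected_type):
            raise TypeError(f"{key!r} must be {expected_type.__name__}")
        workflow[node_id]["inputs"][input_name] = value
    return workflow
```

The point of the allowlist is that callers never send raw workflow JSON; they can only touch the handful of inputs the service chooses to expose.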

Okay, download is done. Let’s check.

```bash
docker run --platform linux/amd64 -it runpod/worker-comfyui:5.2.0-flux1-dev bash
```

Where is it?

```bash
> find /comfyui/models -type f -name "*flux*" 
/comfyui/models/unet/flux1-dev.safetensors
```

Little odd that it’s in the `unet` folder. The [`CheckpointLoaderSimple`](https://github.com/comfyanonymous/ComfyUI/blob/4977f203fa8e9e3ab22884c8ace8f9b540d48952/nodes.py#L564) mentioned in [`test_input.json`](https://github.com/runpod-workers/worker-comfyui/blob/23239a92d3ad9be719aa6e882b7898d859440b87/test_input.json#L55C24-L55C46) only looks in the `checkpoints` folder, so it will not find it.

The “unet” folder appears to be a legacy folder. Right now it is [mapped](https://github.com/comfyanonymous/ComfyUI/blob/4977f203fa8e9e3ab22884c8ace8f9b540d48952/folder_paths.py#L91) to “diffusion\_models”, and diffusion models are loaded by the “Load Diffusion Model” node (`UNETLoader`).

Unfortunately, `UNETLoader` only outputs the diffusion model, not the VAE or text encoders, so I can’t simply swap it in. We will need to find the VAE and text encoders and load them separately. Did they even test this out before putting up the Docker images? 🫠

Let’s look for VAEs and text encoders now.

```bash
> find /comfyui/models -type f -name "*.safetensors"
/comfyui/models/unet/flux1-dev.safetensors
/comfyui/models/clip/clip_l.safetensors
/comfyui/models/clip/t5xxl_fp8_e4m3fn.safetensors
/comfyui/models/vae/ae.safetensors
```
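The `not in []` errors above boil down to a loader node looking in a folder that doesn’t contain the file. A rough sketch of a pre-flight check that compares a workflow’s loader nodes against what is on disk could look like this. The node-to-folder table is my assumption based on the folder layout discussed above (it may differ between ComfyUI versions), and `missing_models` is a made-up helper name.

```python
from pathlib import Path

# Assumed mapping of loader node -> (input name, candidate model subfolders).
# Based on the ComfyUI folder layout discussed above; adjust if your version differs.
LOADER_FOLDERS = {
    "CheckpointLoaderSimple": ("ckpt_name", ["checkpoints"]),
    "UNETLoader": ("unet_name", ["diffusion_models", "unet"]),
    "VAELoader": ("vae_name", ["vae"]),
    "CLIPLoader": ("clip_name", ["text_encoders", "clip"]),
}


def missing_models(workflow: dict, models_dir: str) -> list[str]:
    """List 'NodeClass: filename' entries whose file is not in any candidate folder."""
    root = Path(models_dir)
    missing = []
    for node in workflow.values():
        spec = LOADER_FOLDERS.get(node.get("class_type"))
        if spec is None:
            continue  # not a loader node this sketch knows about
        input_name, folders = spec
        filename = node["inputs"][input_name]
        if not any((root / folder / filename).is_file() for folder in folders):
            missing.append(f"{node['class_type']}: {filename}")
    return missing
```

Inside the container, calling `missing_models(workflow, "/comfyui/models")` on the `test_input.json` workflow should flag the `CheckpointLoaderSimple` entry, since the file only exists under `unet`.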

Okay, at least they exist. Now we have to craft a workflow that can load these models. Luckily, I have a ComfyUI instance running on my server. Let me create one quickly.

We open ComfyUI and pick the Flux1-dev template. It uses the Load Checkpoint node (`CheckpointLoaderSimple`).

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1755631944227/83380588-b313-4695-ae49-a689ec3c303c.png align="center")

Let’s swap it out for the “Load Diffusion Model”, “Load VAE”, and “Load CLIP” nodes.

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1755632257329/6ab8ad86-d3e7-46d5-b35a-2ef9fced33bb.png align="center")

Then click on ComfyUI &gt; Workflow &gt; Export (API).

```json
{
  "6": {
    "inputs": {
      "text": "cute anime girl with massive fluffy fennec ears and a big fluffy tail blonde messy long hair blue eyes wearing a maid outfit with a long black gold leaf pattern dress and a white apron mouth open placing a fancy black forest cake with candles on top of a dinner table of an old dark Victorian mansion lit by candlelight with a bright window to the foggy forest and very expensive stuff everywhere there are paintings on the walls",
      "clip": [
        "41",
        0
      ]
    },
    "class_type": "CLIPTextEncode",
    "_meta": {
      "title": "CLIP Text Encode (Positive Prompt)"
    }
  },
  "8": {
    "inputs": {
      "samples": [
        "31",
        0
      ],
      "vae": [
        "40",
        0
      ]
    },
    "class_type": "VAEDecode",
    "_meta": {
      "title": "VAE Decode"
    }
  },
  "9": {
    "inputs": {
      "filename_prefix": "ComfyUI",
      "images": [
        "8",
        0
      ]
    },
    "class_type": "SaveImage",
    "_meta": {
      "title": "Save Image"
    }
  },
  "27": {
    "inputs": {
      "width": 1024,
      "height": 1024,
      "batch_size": 1
    },
    "class_type": "EmptySD3LatentImage",
    "_meta": {
      "title": "EmptySD3LatentImage"
    }
  },
  "31": {
    "inputs": {
      "seed": 880604085770567,
      "steps": 20,
      "cfg": 1,
      "sampler_name": "euler",
      "scheduler": "simple",
      "denoise": 1,
      "model": [
        "39",
        0
      ],
      "positive": [
        "35",
        0
      ],
      "negative": [
        "33",
        0
      ],
      "latent_image": [
        "27",
        0
      ]
    },
    "class_type": "KSampler",
    "_meta": {
      "title": "KSampler"
    }
  },
  "33": {
    "inputs": {
      "text": "",
      "clip": [
        "41",
        0
      ]
    },
    "class_type": "CLIPTextEncode",
    "_meta": {
      "title": "CLIP Text Encode (Negative Prompt)"
    }
  },
  "35": {
    "inputs": {
      "guidance": 3.5,
      "conditioning": [
        "6",
        0
      ]
    },
    "class_type": "FluxGuidance",
    "_meta": {
      "title": "FluxGuidance"
    }
  },
  "39": {
    "inputs": {
      "unet_name": "flux1-dev.safetensors",
      "weight_dtype": "default"
    },
    "class_type": "UNETLoader",
    "_meta": {
      "title": "Load Diffusion Model"
    }
  },
  "40": {
    "inputs": {
      "vae_name": "ae.safetensors"
    },
    "class_type": "VAELoader",
    "_meta": {
      "title": "Load VAE"
    }
  },
  "41": {
    "inputs": {
      "clip_name": "clip_l.safetensors",
      "type": "stable_diffusion",
      "device": "default"
    },
    "class_type": "CLIPLoader",
    "_meta": {
      "title": "Load CLIP"
    }
  }
}
```

We need to wrap this in the request format that RunPod Serverless expects.

```json
{
  "input": {
    "workflow": …above stuff… 
  }
}
```
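If you’d rather not hand-edit the JSON, the wrapping can be scripted. This is a small sketch of the envelope shown above; the `make_request_body` name and the `workflow_api.json` filename in the comment are placeholders of mine.

```python
import json


def make_request_body(workflow: dict) -> str:
    """Wrap an exported API-format workflow in the {"input": {"workflow": ...}} envelope."""
    return json.dumps({"input": {"workflow": workflow}}, indent=2)


# Tiny stand-in workflow; in practice, load the exported file instead, e.g.
# workflow = json.load(open("workflow_api.json"))  # hypothetical filename
workflow = {"40": {"inputs": {"vae_name": "ae.safetensors"}, "class_type": "VAELoader"}}
print(make_request_body(workflow))
```

The printed text can be pasted straight into the Requests tab.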

Now go back to RunPod &gt; Serverless &gt; comfy-worker-poc &gt; Requests.

Then replace the `workflow` section with the JSON above. You can also remove the `images` section.

Then click “Run.” Fingers crossed 🫰. If things are working correctly, you should see “Running” and the $/s appearing.

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1755632709543/9fa4a809-66b3-4634-9c46-d58b2594ddbf.png align="center")

Not quite sure why, but my request is just sitting in the queue and not executing. I can see one node has been “running” for 5 minutes. Perhaps it takes a while because of the cold start.

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1755632838201/7875b3d4-a28a-4c80-946f-174f3747afc1.png align="center")

Oh, shoot, actually, I’d left the endpoint configured with the `base` variant, which contains no models. 😇 🐇🐇🐇🐇🐇🐇🐇🐇🐇🐇🐇

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1755632939721/318e158a-e78f-4b4c-9cc5-85a21a52e702.png align="center")

Updated to `runpod/worker-comfyui:5.2.0-flux1-dev` and redeployed.

Even with this, the workflow didn’t work.

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1755633992208/c9062376-eaed-49e3-a7be-207c9fd9094b.png align="center")

As much as I hate to admit it, I couldn’t find a good way to run proper Flux text-to-image on this prebuilt image.
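One possible culprit, untested against this image: the exported workflow above loads only `clip_l.safetensors` through a single Load CLIP node with type `stable_diffusion`, whereas ComfyUI’s Flux workflows typically load both text encoders through a `DualCLIPLoader` node with type `flux`. A sketch of that swap, assuming node ID `41` from the export above and the filenames from the `find` output (`use_dual_clip` is a made-up helper name):

```python
import copy


def use_dual_clip(workflow: dict, clip_node_id: str = "41") -> dict:
    """Return a copy of the workflow whose CLIP loader node is a Flux-style DualCLIPLoader."""
    patched = copy.deepcopy(workflow)
    patched[clip_node_id] = {
        "class_type": "DualCLIPLoader",
        "inputs": {
            # Filenames taken from the `find` output above.
            "clip_name1": "t5xxl_fp8_e4m3fn.safetensors",
            "clip_name2": "clip_l.safetensors",
            "type": "flux",
        },
        "_meta": {"title": "DualCLIPLoader"},
    }
    return patched
```

Because the replacement keeps the same node ID and still exposes its CLIP output at index 0, the two `CLIPTextEncode` nodes that reference `["41", 0]` don’t need to change. Whether this alone makes the prebuilt image work is an open question.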

**TL;DR**: Couldn’t run it in Easy Mode.

---

## Hard Mode

Let’s fix up the **Dockerfile** ourselves so that the Flux file ends up in the correct location.

```bash
git clone git@github.com:runpod-workers/worker-comfyui.git
cd worker-comfyui
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```

Let’s then replace `unet` with `checkpoints` in the Dockerfile.

```diff

 RUN if [ "$MODEL_TYPE" = "flux1-schnell" ]; then \
-      wget -q --header="Authorization: Bearer ${HUGGINGFACE_ACCESS_TOKEN}" -O models/unet/flux1-schnell.safetensors https://huggingface.co/black-forest-labs/FLUX.1-schnell/resolve/main/flux1-schnell.safetensors && \
+      wget -q --header="Authorization: Bearer ${HUGGINGFACE_ACCESS_TOKEN}" -O models/checkpoints/flux1-schnell.safetensors https://huggingface.co/black-forest-labs/FLUX.1-schnell/resolve/main/flux1-schnell.safetensors && \
       wget -q -O models/clip/clip_l.safetensors https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/clip_l.safetensors && \
       wget -q -O models/clip/t5xxl_fp8_e4m3fn.safetensors https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/t5xxl_fp8_e4m3fn.safetensors && \
       wget -q --header="Authorization: Bearer ${HUGGINGFACE_ACCESS_TOKEN}" -O models/vae/ae.safetensors https://huggingface.co/black-forest-labs/FLUX.1-schnell/resolve/main/ae.safetensors; \
     fi
 
 RUN if [ "$MODEL_TYPE" = "flux1-dev" ]; then \
-      wget -q --header="Authorization: Bearer ${HUGGINGFACE_ACCESS_TOKEN}" -O models/unet/flux1-dev.safetensors https://huggingface.co/black-forest-labs/FLUX.1-dev/resolve/main/flux1-dev.safetensors && \
+      wget -q --header="Authorization: Bearer ${HUGGINGFACE_ACCESS_TOKEN}" -O models/checkpoints/flux1-dev.safetensors https://huggingface.co/black-forest-labs/FLUX.1-dev/resolve/main/flux1-dev.safetensors && \
       wget -q -O models/clip/clip_l.safetensors https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/clip_l.safetensors && \
       wget -q -O models/clip/t5xxl_fp8_e4m3fn.safetensors https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/t5xxl_fp8_e4m3fn.safetensors && \
       wget -q --header="Authorization: Bearer ${HUGGINGFACE_ACCESS_TOKEN}" -O models/vae/ae.safetensors https://huggingface.co/black-forest-labs/FLUX.1-dev/resolve/main/ae.safetensors; \
     fi
```

Then push it to a fork (in my case: [https://github.com/kndlt/worker-comfyui](https://github.com/kndlt/worker-comfyui)).

Then we must provide some environment variables. Create a `.env` file (already gitignored) with the following:

```plaintext
HUGGINGFACE_ACCESS_TOKEN=<hugging face access token from earlier>
```

Let’s then trigger the build. Let’s build the `flux1-dev-fp8` variant.

```bash
docker buildx bake flux1-dev-fp8 --env-file .env
```

It should take 10-20 minutes to build the image. Let’s test it once it is built.

```bash
docker run --rm -it -p 8000:8000 runpod/worker-comfyui:latest-flux1-dev-fp8
```

This actually failed with this message:

```plaintext
Traceback (most recent call last):
  File "/comfyui/main.py", line 132, in <module>
    import execution
  File "/comfyui/execution.py", line 14, in <module>
    import comfy.model_management
  File "/comfyui/comfy/model_management.py", line 221, in <module>
    total_vram = get_total_memory(get_torch_device()) / (1024 * 1024)
                                  ^^^^^^^^^^^^^^^^^^
  File "/comfyui/comfy/model_management.py", line 172, in get_torch_device
    return torch.device(torch.cuda.current_device())
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/torch/cuda/__init__.py", line 1071, in current_device
    _lazy_init()
  File "/opt/venv/lib/python3.12/site-packages/torch/cuda/__init__.py", line 412, in _lazy_init
    torch._C._cuda_init()
RuntimeError: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx
```

It looks like I need to install the [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html) on my server to use CUDA inside containers. I installed it according to their instructions. Let’s confirm that a container is able to detect the GPU.

```bash
> docker run --rm --gpus all nvidia/cuda:12.6.3-runtime-ubuntu24.04 nvidia-smi

==========
== CUDA ==
==========

CUDA Version 12.6.3

Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.

Tue Aug 19 21:41:46 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.153.02             Driver Version: 570.153.02     CUDA Version: 12.8     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA RTX PRO 6000 Blac...    On  |   00000000:01:00.0 Off |                  Off |
| 30%   30C    P8             14W /  300W |   24716MiB /  97887MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+
```

Looks good so far. Let’s try running the built container again.

```bash
docker run --rm -it -p 8000:8000 --gpus all runpod/worker-comfyui:latest-flux1-dev-fp8
```

Finally, it worked, at least on my node! 🧚

```plaintext
DEBUG  | local_test | Handler output: {'images': [{'filename': 'ComfyUI_00001_.png', 'type': 'base64', 'data': 'iVBORw0KGgoAAAANSUhEUgAAAgAAAAIACAIAAAB7GkOtAAEAAElEQVR4Xlz92bNsSZbeh/2+5b4j4kx3yLyZNXR1dTW6GxQI0ACIEsUHmkwPetEDX/Uv0wyCiZTBREoY2EQ3Gqiurs7KvMM5EbHd16eH5XEyoX3vOSdib/fla/jW4L59R+i/+OZJ6ZRR
…
```

Let’s push this to Docker Hub now. First, we need to tag it under our own repo name.

```bash
docker tag runpod/worker-comfyui:latest-flux1-dev-fp8 sprited/worker-comfyui:latest-flux1-dev-fp8
```

Then, push.

```bash
docker login # login
docker push sprited/worker-comfyui:latest-flux1-dev-fp8
```

The image is pretty large, so pushing will take some time.

Pushed [sprited/worker-comfyui:latest-flux1-dev-fp8](https://hub.docker.com/repository/docker/sprited/worker-comfyui/tags/latest-flux1-dev-fp8/sha256-673c6e3bbc1d3beea8ad6ffc083928dc54779117b1c5de42d65496b0a5add4c5). Yay!

Now, go back to RunPod &gt; Serverless &gt; comfy-worker-poc &gt; Manage &gt; Edit Endpoint &gt; Docker Configuration, then switch to this image.

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1755640485545/c404af5d-3040-41a4-8d29-3e3ef9f1ca38.png align="center")

Then deploy 🚀.

---

While it’s deploying, let me see if I can pull up some comparison between RunPod and Fly.io.

RunPod is mostly a GPU hosting service that caters to AI workloads, while Fly.io is mostly for app hosting. So when I tried to run a 30 GB Docker image there, Fly.io was probably always going to choke.

My thinking was that I would use other inference providers like fal.ai and Replicate to do the heavy lifting, so I wouldn’t need GPUs for the orchestration layer. However, if you want to use headless Comfy, it basically requires you to run on NVIDIA CUDA, so no luck there.

We could in theory fork ComfyUI to work with cpu version of pytorch. That might reduce the image size substantially. Perhaps that is a good cost saving measure. At this time though, I need an infra and testbed that just works, so I will stick with the full CUDA comfy.

Since Comfy is open source, there is also the possibility of forking it and creating a more lightweight version without the CUDA requirement, catering to folks using Apple devices. That would be an interesting approach.

---

**Why RunPod Serverless?** RunPod Serverless works out pretty well because if there are no calls, I don’t pay a penny. Deployments don’t cost anything either. So far today, I’ve spent something like 5 cents on my node.

Another benefit is that it runs in a stateless, ephemeral fashion, so there is less to worry about for privacy and security. By design, it can’t retain any data beyond its ephemeral lifetime.

---

The rollout is complete. Let’s test it out. Go to Requests, and copy and paste the test\_input.json mentioned above. The request succeeds now.

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1755642941470/f38a91dd-602d-4eac-ac56-20819aee0305.png align="center")

The generated image will be in the “images” section, returned in base64 encoding.

We extract `output.images[0].data`, prefix it with `data:image/png;base64,`, then use a tool like [https://www.site24x7.com/tools/datauri-to-image.html](https://www.site24x7.com/tools/datauri-to-image.html) to render the image.

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1755643182439/a987555c-4b3e-4cff-a5cd-9ccc91debb4d.png align="center")
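The online converter is fine for a one-off, but the decoding is only a few lines if you’d rather keep the image local. A minimal sketch, assuming the response has been parsed into a dict shaped like the `output.images[0]` structure above (`save_first_image` is a name I made up):

```python
import base64
from pathlib import Path


def save_first_image(response: dict, out_dir: str = ".") -> Path:
    """Decode output.images[0] from a response dict and write it to disk."""
    image = response["output"]["images"][0]
    if image.get("type") != "base64":
        raise ValueError(f"unsupported image type: {image.get('type')!r}")
    # Keep only the final path component so the filename can't escape out_dir.
    out_path = Path(out_dir) / Path(image["filename"]).name
    out_path.write_bytes(base64.b64decode(image["data"]))
    return out_path
```

No `data:image/png;base64,` prefix is needed here; that prefix only matters when handing the string to something that expects a data URI.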

So, now I have a serverless RunPod endpoint that can run image generations. 🥳
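The Requests tab is handy for poking around, but the same request can be sent over HTTP. Here is a rough sketch that only builds the request, assuming RunPod’s `https://api.runpod.ai/v2/<endpoint_id>/runsync` route with a RunPod API key passed as a Bearer token. The endpoint ID and key are placeholders, and `build_runsync_request` is a made-up helper name.

```python
import json
import urllib.request


def build_runsync_request(endpoint_id: str, api_key: str, workflow: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request for the endpoint's /runsync route."""
    url = f"https://api.runpod.ai/v2/{endpoint_id}/runsync"
    body = json.dumps({"input": {"workflow": workflow}}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )
```

Sending it would be something like `urllib.request.urlopen(req, timeout=600)`, with a generous timeout to allow for cold starts. Keep the API key out of source control, same as the Hugging Face token.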

---

Next steps:

1. Post a pull request to [https://github.com/runpod-workers/worker-comfyui](https://github.com/runpod-workers/worker-comfyui) for the `s/unet/checkpoints/` fix.
    
2. Try calling it directly from SpriteDX Web UI to do a sample generation.
    

Gotta go.

— Sprited Dev 🌱
