Testing black-forest-labs/FLUX.1-dev

In this post, we try running black-forest-labs/FLUX.1-dev on a machine with an NVIDIA RTX PRO 6000. The setup:
AMD Ryzen 7 9800X3D 8-Core Processor
NVIDIA RTX PRO 6000 Blackwell Workstation Edition
Ubuntu 22.04
We are using nvidia-driver-570-open:
NVIDIA-SMI 570.153.02
Driver Version: 570.153.02
CUDA Version: 12.8
We also use Python 3.12.
First, log in to Hugging Face:
huggingface-cli login
The model is gated, so you also have to accept its terms at: https://huggingface.co/black-forest-labs/FLUX.1-dev.
Let’s set up the repo:
mkdir sprite-flux
cd sprite-flux
poetry init --name sprite-flux --python ^3.12 --no-interaction
poetry env use python3.12
eval $(poetry env activate)
poetry source add --priority=explicit torch-cu128 https://download.pytorch.org/whl/cu128
poetry add --source torch-cu128 torch torchvision torchaudio
poetry add protobuf sentencepiece
poetry run python -c "import torch; print(torch.__version__, torch.version.cuda, torch.cuda.is_available())" # -> 2.7.1+cu128 12.8 True
poetry add diffusers transformers accelerate xformers
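The one-line version check in the setup commands above can be expanded into a small script that also reports which GPU PyTorch sees. This is only a sketch: `describe_cuda` is a hypothetical helper name, and what it prints depends on your machine.

```python
# check_cuda.py (hypothetical helper, not part of the original setup)
def describe_cuda() -> dict:
    """Collect basic facts about the PyTorch/CUDA setup in a dict."""
    try:
        import torch
    except ImportError:
        # torch is not installed in this environment
        return {"torch": None, "cuda_available": False}

    info = {
        "torch": torch.__version__,
        "cuda_build": torch.version.cuda,  # CUDA version torch was built against
        "cuda_available": torch.cuda.is_available(),
    }
    if info["cuda_available"]:
        props = torch.cuda.get_device_properties(0)
        info["device"] = props.name
        info["total_memory_gib"] = round(props.total_memory / 1024**3, 1)
    return info


if __name__ == "__main__":
    for key, value in describe_cuda().items():
        print(f"{key}: {value}")
```

Run it with `poetry run python check_cuda.py`. If `cuda_available` is False, the pipeline will fall back to the CPU no matter what the script asks for.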
Python Script
# scripts/inference/run_flux_retro_lora.py
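from diffusers import DiffusionPipeline
# Load the pipeline; without further calls it stays on the CPU.
pipe = DiffusionPipeline.from_pretrained("black-forest-labs/FLUX.1-dev")
prompt = "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k"
image = pipe(prompt).images[0]
image.save("output.png")  # write the result to disk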
Run it inside the Poetry environment:
poetry run python scripts/inference/run_flux_retro_lora.py
This takes forever because it is not using the GPU. Let’s move the pipeline to the GPU:
from diffusers import DiffusionPipeline
import torch
pipe = DiffusionPipeline.from_pretrained("black-forest-labs/FLUX.1-dev")
pipe.to("cuda") # 👈 THIS is what makes it actually use your GPU
prompt = "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k"
image = pipe(prompt).images[0]
image.save("output.png")
Now it should produce an image within a reasonable amount of time.
If you run nvidia-smi while the script is generating, you should see something like:
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.153.02 Driver Version: 570.153.02 CUDA Version: 12.8 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA RTX PRO 6000 Blac... On | 00000000:01:00.0 Off | Off |
| 48% 84C P1 599W / 600W | 67227MiB / 97887MiB | 100% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| 0 N/A N/A 1202 G /usr/lib/xorg/Xorg 165MiB |
| 0 N/A N/A 1362 G /usr/bin/gnome-shell 23MiB |
| 0 N/A N/A 14768 C ...ux-b5L0JzLd-py3.12/bin/python 66986MiB |
+-----------------------------------------------------------------------------------------+
Completed in: 01 minute and 13 seconds.
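To measure the generation time yourself, you can wrap the pipeline call in a timer. The sketch below uses only the standard library; `timed` and `format_duration` are hypothetical helper names, and the output format is an assumption, not necessarily what produced the timing above.

```python
import time
from contextlib import contextmanager


def format_duration(seconds: float) -> str:
    """Format a duration as whole minutes and seconds."""
    minutes, secs = divmod(int(round(seconds)), 60)
    return f"{minutes} min {secs} s"


@contextmanager
def timed(label: str):
    """Print how long the wrapped block took, even if it raises."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        print(f"{label} completed in: {format_duration(elapsed)}")


if __name__ == "__main__":
    # Stand-in for the real work; in the script you would wrap
    # `image = pipe(prompt).images[0]` instead of the sleep.
    with timed("demo"):
        time.sleep(0.1)
```

Keeping the model load outside the `with timed(...)` block separates one-time loading cost from per-image generation time.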

The result is kinda amazing, but generation is quite slow.
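One likely reason for both the roughly 67 GB of GPU memory and the slow generation is that the pipeline is loaded without a `torch_dtype`, which in diffusers means float32 weights. A common remedy is to load in bfloat16 instead. The sketch below is a suggestion, not something measured in this post: `estimate_weights_gib` and `load_flux_bf16` are hypothetical helpers, and the 12-billion figure is the transformer size stated on the model card (the text encoders and VAE come on top of it).

```python
def estimate_weights_gib(n_params: float, bytes_per_param: int) -> float:
    """Rough size of the weights alone, ignoring activations and CUDA overhead."""
    return n_params * bytes_per_param / 1024**3


def load_flux_bf16(model_id: str = "black-forest-labs/FLUX.1-dev"):
    """Load the pipeline with bfloat16 weights and move it to the GPU."""
    # Imported lazily so the estimate above works without torch installed.
    import torch
    from diffusers import DiffusionPipeline

    pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)
    pipe.to("cuda")
    return pipe


if __name__ == "__main__":
    n_params = 12e9  # transformer only, per the model card
    print(f"float32:  {estimate_weights_gib(n_params, 4):.1f} GiB")
    print(f"bfloat16: {estimate_weights_gib(n_params, 2):.1f} GiB")
    # pipe = load_flux_bf16()  # uncomment on a machine with GPU and model access
```

Halving the bytes per parameter halves the weight footprint, and bfloat16 math is usually much faster on recent NVIDIA GPUs, so this is worth trying before reaching for anything more elaborate.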




