Imager & ComfyUI add-on — local image generation from a chat

In one sentence: Imager is a thin FastAPI bridge that exposes a curated set of ComfyUI workflows as a stable HTTP API, and the ComfyUI add-on in Companion wires that bridge to the chat — type /comfyui <prompt>, get an image inline, with the model and dimensions you actually wanted.

How it works

The stack is three small services, each doing one thing. Neither Imager nor Companion runs inference; ComfyUI alone drives the GPU.

                  Companion (chat client)
                  │  /comfyui <prompt>
                  │  /v1/templates        ← list curated workflows
                  ▼
            OdyssAI-Imager  (FastAPI bridge,  :8008)
                  │  1. Resolve template + patch the user's inputs in
                  │  2. Optional SSH-fallback download (FP8 → FP16)
                  │  3. Submit to ComfyUI's /prompt
                  │  4. Poll /history/{id} until done
                  │  5. Stream image bytes back to the browser via /v1/image/{filename}
                  ▼
              ComfyUI server  ( :8188 )
                  │  runs the workflow, writes PNGs under output/
                  ▼
                 the GPU (MPS, CUDA, …)
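Step 4 of the diagram ends once ComfyUI's history contains an entry for the prompt. The sketch below shows how image references could be pulled out of that entry. It assumes the `/history/{id}` payload is keyed by prompt id and lists images per output node as `filename` / `subfolder` / `type`; `extract_images` is a hypothetical helper name, not Imager's actual code.

```python
from __future__ import annotations

from typing import Any


def extract_images(history: dict[str, Any], prompt_id: str) -> list[dict[str, str]] | None:
    """Return image references for a finished prompt, or None if it is still running.

    Assumed payload shape (not confirmed by this document):
    {prompt_id: {"outputs": {node_id: {"images": [{filename, subfolder, type}]}}}}
    """
    entry = history.get(prompt_id)
    if entry is None:
        # No history entry yet: the caller should keep polling.
        return None
    images = []
    for node_output in entry.get("outputs", {}).values():
        for image in node_output.get("images", []):
            images.append(
                {
                    "filename": image["filename"],
                    "subfolder": image.get("subfolder", ""),
                    "type": image.get("type", "output"),
                }
            )
    return images


# Illustrative payload; the prompt id and filename are made up.
sample = {
    "abc123": {
        "outputs": {
            "9": {"images": [{"filename": "img_00001_.png", "subfolder": "", "type": "output"}]}
        }
    }
}
print(extract_images(sample, "abc123"))
print(extract_images({}, "abc123"))
```

Returning `None` for "not finished" and an empty list for "finished without images" keeps the two cases distinct for the polling loop.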

Three concepts:

Term	Definition
Bridge (Imager)	FastAPI process that exposes ComfyUI as a stable, opinionated HTTP API. Lives next to ComfyUI but doesn’t share its filesystem.
Template	A ComfyUI workflow JSON, declared in `registry/inputs_map.yaml`, with a small manifest of which inputs the user can change (`prompt`, `width`, `height`, `steps`).
Add-on (Companion)	The wiring inside the chat client: stores `bridgeUrl` + `bridgeToken` per user, exposes `/comfyui`, renders results inline.

The two pieces ship in two repos and talk over plain HTTP. Imager is backend-only. The Companion add-on is frontend + a small admin route. Neither one duplicates the other.

What you actually do, end-to-end

Configure the bridge once (URL + optional token) in Companion → Settings → Add-ons → ComfyUI Imager.
Type /comfyui <prompt> in any chat. A modal opens with the template selector, dimensions, steps, and a Generate button.
The modal POSTs to Imager, Imager submits the workflow to ComfyUI, and the resulting PNG comes back inline in the conversation with a Save picture button on hover.
The image is persisted in the conversation (the reference, not the bytes). A refresh keeps it visible. Save keeps the original filename.

That’s the whole user journey. The complexity lives in the bridge, and only the bridge operator sees it.

Advantages

One chat surface for text and image. No context switch, no separate app, no copy-paste between an image tool and the chat. You stay in the conversation; the LLM can call the bridge on its own when an image would help the answer (comfyui_generate tool).
Stable HTTP contract, ComfyUI underneath. Imager is the only thing that knows the workflow JSONs. The chat sees slug, description, model, inputs, defaults. ComfyUI can be upgraded, swapped, or restarted without the chat noticing.
Multi-user, isolated config. Each Companion user has their own bridgeUrl + bridgeToken in the addons table. The operator can ship a default via the IMAGER_BRIDGE_URL env on Companion; a per-user value wins over it.
Curated, not exposed. Imager ships a registry (inputs_map.yaml) that says which workflows are user-facing, with their default dimensions baked in. You don’t ship ComfyUI’s full surface — you ship the three workflows you actually trust.
Sovereign by default. Imager and ComfyUI run on the operator’s own machines. Nothing leaves the LAN unless you point the bridge at a cloud ComfyUI explicitly. The image proxy (/v1/image/{filename}) keeps the compute host’s address private — the browser talks to Imager, never directly to the GPU box.

Use cases

Hero image for a blog post or article. /comfyui from the Companion chat, FLUX.1-dev 20 steps, the result drops into the conversation and gets a Save button on hover.
Quick visual in a brainstorm. /comfyui with the image-rapide template (FLUX.2-klein 4B, 12 steps, ~90s on M3 Ultra) — fast enough to iterate, quality enough to keep.
Cinematic widescreen frame for a write-up. photo-article-tmb template, 1664×928 (roughly 16:9) baked-in default, no manual sizing.
Inline illustration on demand from the LLM itself. The comfyui_generate tool is exposed to the chat; if the model decides the answer needs an image, it calls the bridge on its own.
Local-only generation on a single Mac. Imager + ComfyUI on the same machine, MPS, no LAN, no auth, no Docker. This is the simplest possible install: in the custom-node variant, Imager runs as a ComfyUI custom node.

Configuring Imager

Server-side: environment

Imager reads its config from env (prefix IMAGER_). The defaults match the operator’s own LAN deployment; you only override what differs.

Variable	Default	What it does
`IMAGER_HOST`	`0.0.0.0`	Bind address for the bridge FastAPI server.
`IMAGER_PORT`	`8008`	Bind port. Companion add-on points here.
`IMAGER_COMFYUI_HOST`	`192.168.86.42`	Where ComfyUI lives. Override to `127.0.0.1` if it runs on the same box.
`IMAGER_COMFYUI_PORT`	`8188`	ComfyUI’s default.
`IMAGER_COMFYUI_TIMEOUT_S`	`1800`	30 min — FLUX.2-dev at 5120×2880 can take ~20 min.
`IMAGER_COMFYUI_MODELS_ROOT`	`/Volumes/models/comfyui`	Path on the compute host where ComfyUI looks for models.
`IMAGER_SSH_HOST`	`admin@192.168.86.42`	Used by the FP8→FP16 fallback downloader. Empty if Imager and ComfyUI share a filesystem.
`IMAGER_SSH_PYTHON_BIN`	`…/ComfyUI/.venv/bin/python`	Python on the compute host, for `hf_hub_download`.
`IMAGER_HF_TOKEN`	empty	Optional. If set, the SSH download sets `HF_TOKEN` for private models.
`IMAGER_BRIDGE_TOKEN`	empty	Optional bearer token gating the bridge. Empty = LAN-only, no auth.

If IMAGER_BRIDGE_TOKEN is set, Companion must send it as Authorization: Bearer <token>. Leave it empty on a trusted LAN.
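The token rule can be sketched as a small check on the incoming header. This is an illustration under stated assumptions, not Imager's actual code: `is_authorized` is a hypothetical helper, and the constant-time comparison is a common precaution rather than something this document specifies.

```python
from __future__ import annotations

import hmac
import os

# Empty string means LAN-only mode, per the IMAGER_BRIDGE_TOKEN row in the table.
BRIDGE_TOKEN = os.environ.get("IMAGER_BRIDGE_TOKEN", "")


def is_authorized(authorization_header: str | None, bridge_token: str = BRIDGE_TOKEN) -> bool:
    """Check an Authorization header against the configured bridge token."""
    if not bridge_token:
        # No token configured: the bridge does not enforce auth.
        return True
    if not authorization_header:
        return False
    scheme, _, presented = authorization_header.partition(" ")
    if scheme.lower() != "bearer":
        return False
    # Constant-time comparison avoids leaking token contents through timing.
    return hmac.compare_digest(presented.strip().encode(), bridge_token.encode())


print(is_authorized("Bearer s3cret", "s3cret"))
print(is_authorized(None, ""))
```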

Server-side: templates

Templates are declared in registry/inputs_map.yaml. Each entry is a slug, a path to a workflow JSON, the model it binds to, the inputs the user is allowed to override, and optional defaults. The current catalogue:

Slug	Model	Default size	Steps	Use it for
`image-rapide`	`flux2-klein-base-4b`	1664×928	12	Quick visual in a brainstorm, ~90s on M3 Ultra.
`photo-article-tmb`	`flux1-dev-t2i-v1`	1664×928	20	Hero image for a blog post, full quality.
`image-z-image-turbo`	`z-image-turbo`	1664×928	5	Fastest possible, draft.

Add a template: drop a workflow JSON in workflows/, add a row in inputs_map.yaml, restart the bridge. The catalogue (GET /v1/templates) picks it up automatically. No code change.
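To make the manifest idea concrete, here is a minimal sketch of applying a registry entry to a workflow. The layout of `inputs_map.yaml` is not shown in this document, so the entry below (a mapping from input name to node id and field) is a hypothetical shape, as are the node ids and the `patch_workflow` helper. The workflow dict follows ComfyUI's API format of node id to `class_type` and `inputs`.

```python
import copy

# Hypothetical registry entry; the real inputs_map.yaml layout may differ.
TEMPLATE = {
    "slug": "image-rapide",
    "defaults": {"width": 1664, "height": 928, "steps": 12},
    # user-facing input -> (node id, field name) inside the workflow JSON
    "inputs": {
        "prompt": ("6", "text"),
        "width": ("5", "width"),
        "height": ("5", "height"),
        "steps": ("3", "steps"),
    },
}

# Toy workflow in ComfyUI API format; node ids are illustrative.
WORKFLOW = {
    "3": {"class_type": "KSampler", "inputs": {"steps": 20, "seed": 0}},
    "5": {"class_type": "EmptyLatentImage", "inputs": {"width": 1024, "height": 1024}},
    "6": {"class_type": "CLIPTextEncode", "inputs": {"text": ""}},
}


def patch_workflow(workflow: dict, template: dict, user_inputs: dict) -> dict:
    """Return a copy of the workflow with template defaults and user inputs applied."""
    unknown = set(user_inputs) - set(template["inputs"])
    if unknown:
        # Only inputs declared in the manifest may be overridden.
        raise ValueError(f"inputs not allowed by the template: {sorted(unknown)}")
    values = {**template["defaults"], **user_inputs}  # user values win over defaults
    patched = copy.deepcopy(workflow)  # never mutate the template on disk or in memory
    for name, value in values.items():
        node_id, field = template["inputs"][name]
        patched[node_id]["inputs"][field] = value
    return patched


patched = patch_workflow(WORKFLOW, TEMPLATE, {"prompt": "a stag in a misty forest"})
print(patched["5"]["inputs"], patched["3"]["inputs"]["steps"])
```

Rejecting undeclared inputs is what keeps the surface curated: a caller cannot reach a node the manifest does not list.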

Client-side: the Companion add-on

In Companion → Settings → Add-ons → ComfyUI Imager:

Bridge URL — default http://192.168.86.141:8008. The operator can pre-fill this for every user with the IMAGER_BRIDGE_URL env on Companion. Per-user override wins.
Bridge token — leave empty on a trusted LAN. Set it if you set IMAGER_BRIDGE_TOKEN on the bridge.
Enabled — on/off switch per user. Disabled users get a clean “Imager not configured, go to Settings” message in the chat, no 404.
Probe — calls the bridge /health and reports back. Use this to confirm the LAN path is open.
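The precedence described above (per-user value first, then the operator default, then nothing) can be sketched as follows. Python is used purely for illustration; Companion's own language is not stated in this document. The row keys mirror the `bridgeUrl` / `bridgeToken` names used here, while `resolve_bridge`, the `enabled` key, and its default of off are assumptions.

```python
from __future__ import annotations

import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class BridgeSettings:
    url: str
    token: str


def resolve_bridge(user_row: Mapping | None, env: Mapping | None = None) -> BridgeSettings | None:
    """Return the effective bridge settings for a user, or None if not usable."""
    env = os.environ if env is None else env
    row = user_row or {}
    if not row.get("enabled", False):
        # Disabled users get the "not configured" message instead of a request.
        return None
    # Per-user override wins; the operator-wide env default is the fallback.
    url = (row.get("bridgeUrl") or env.get("IMAGER_BRIDGE_URL", "")).strip()
    if not url:
        return None
    return BridgeSettings(url=url.rstrip("/"), token=row.get("bridgeToken") or "")


operator_env = {"IMAGER_BRIDGE_URL": "http://192.168.86.141:8008"}
print(resolve_bridge({"enabled": True}, env=operator_env))
print(resolve_bridge({"enabled": True, "bridgeUrl": "http://10.0.0.5:8008"}, env=operator_env))
```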

Probe returns 502? Open the URL in a browser. If you can’t reach the bridge from your laptop, the chat can’t reach it from the server either. Common causes: the bridge is on a different VLAN, the container restarted on a new IP, or IMAGER_COMFYUI_HOST points at a compute host that’s down.

Using the stack

From Companion (the chat)

The user-facing surface. Two paths.

/comfyui <prompt> slash command. Type it in any chat — the modal opens with the template selector, dimensions, steps, CFG, seed, and a Generate button. The modal seeds its values from the selected template’s defaults (so image-rapide opens at 1664×928 / 12 steps, not 1024² / 20). The last prompt + last params are remembered in localStorage (key companion:comfyui:last), so re-opening the modal is one click away from the previous run.

comfyui_generate tool call from the LLM. The chat model has the bridge exposed as a tool. When the model decides the answer needs an image, it calls the bridge on its own; the result drops into the conversation the same way. No user action required.

In both cases, the result is a chat message with an attachment reference, not inline base64. The browser hits GET /v1/image/{filename} on the bridge to render. Save picture on hover fetches the same URL and downloads the PNG with its original filename.
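Turning a stored attachment reference into the URL the browser fetches is a small string-building step. The sketch below assumes the reference carries `filename`, `subfolder`, and `type`, matching the query parameters of the image proxy route; `image_url` is a hypothetical helper.

```python
from urllib.parse import quote, urlencode


def image_url(bridge_url: str, filename: str, subfolder: str = "", image_type: str = "output") -> str:
    """Build the /v1/image/{filename} proxy URL for an attachment reference."""
    query = urlencode({"subfolder": subfolder, "type": image_type})
    # Percent-encode the filename so spaces and slashes cannot alter the path.
    return f"{bridge_url.rstrip('/')}/v1/image/{quote(filename, safe='')}?{query}"


print(image_url("http://192.168.86.141:8008", "img_00001_.png"))
```

Because only the reference is persisted, the same function serves both the inline render and the Save picture download.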

From the HTTP API directly

Imager is a plain HTTP service. Six routes cover everything.

Method	Path	Purpose
`GET`	`/health`	Liveness. `{status, service, version}`.
`GET`	`/ready`	Readiness. Confirms the process is up and configured.
`GET`	`/v1/templates`	Catalogue: `[{slug, description, model, inputs, defaults}, …]`.
`POST`	`/v1/templates/{slug}/run`	Fire-and-forget. Body `{prompt, width?, height?, steps?}`. Returns `{prompt_id, status_url, template, model}`.
`GET`	`/v1/status/{prompt_id}`	Polling. `{prompt_id, completed, status, images}`.
`GET`	`/v1/image/{filename}?subfolder=&type=output`	Image proxy over ComfyUI’s `/view`. Browser-friendly.

Typical sequence:

BRIDGE=http://192.168.86.141:8008

# 1. List what you can run
curl -s $BRIDGE/v1/templates | jq

# 2. Submit a generation
curl -s -X POST $BRIDGE/v1/templates/image-rapide/run \
  -H 'Content-Type: application/json' \
  -d '{"prompt":"a stag in a misty forest, 21:9, cinematic"}'
# → {"prompt_id":"…","status":"submitted","template":"image-rapide", …}

# 3. Poll
curl -s "$BRIDGE/v1/status/<prompt_id>"

# 4. Render the result
curl -s -o out.png "$BRIDGE/v1/image/<filename>?type=output"
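The same submit-then-poll sequence can be scripted. The sketch below is a minimal Python client using only the standard library. It relies on the `prompt_id`, `completed`, and `images` fields from the route table; the polling interval, the timeout handling, and the helper names `http_json` and `generate` are assumptions, and failure states are not handled because this document does not describe them.

```python
import json
import time
import urllib.request


def http_json(method, url, body=None, token=""):
    """Send a JSON request and decode the JSON response."""
    data = json.dumps(body).encode() if body is not None else None
    headers = {"Content-Type": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    request = urllib.request.Request(url, data=data, method=method, headers=headers)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read())


def generate(bridge, slug, prompt, call=http_json, interval_s=2.0,
             timeout_s=1800.0, sleep=time.sleep):
    """Submit a template run, then poll /v1/status until it reports completion."""
    run = call("POST", f"{bridge}/v1/templates/{slug}/run", {"prompt": prompt})
    status_url = f"{bridge}/v1/status/{run['prompt_id']}"
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        status = call("GET", status_url)
        if status.get("completed"):
            return status.get("images", [])
        sleep(interval_s)
    raise TimeoutError(f"prompt {run['prompt_id']} did not complete in {timeout_s}s")


# Example call against a live bridge (not executed here):
# images = generate("http://192.168.86.141:8008", "image-rapide", "a stag in a misty forest")
```

The `call` and `sleep` parameters exist so the loop can be exercised without a running bridge.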

/v1/generate (legacy) still exists as a placeholder-template SSE path. The Companion modal does not use it. It’s kept for the 1-month grace period while we confirm no external tool depends on it. If you don’t have a reason to use it, don’t.

The FP8→FP16 fallback

A small but real piece of plumbing. ComfyUI workflows reference files by exact filename; if you change a quant, the filename changes, and the workflow breaks. The bridge keeps a registry of canonical filename → HF file and, when it sees an unknown filename, it SSHes into the compute host and runs hf_hub_download with the FP16 file as a substitute. This is why IMAGER_SSH_HOST and IMAGER_SSH_PYTHON_BIN exist.
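As a rough illustration of that mechanism, the sketch below builds (but does not run) the SSH command for a missing file. The registry entry, repository name, and filenames are placeholders, and `build_fallback_command` is a hypothetical helper; only `hf_hub_download` with its `repo_id`, `filename`, and `local_dir` parameters is a real `huggingface_hub` call.

```python
from __future__ import annotations

import shlex

# Placeholder registry: filename the workflow asks for -> FP16 substitute on the Hub.
FALLBACKS = {
    "example-model-fp8.safetensors": {
        "repo_id": "example-org/example-model",
        "filename": "example-model-fp16.safetensors",
        "subdir": "diffusion_models",
    },
}


def build_fallback_command(missing: str, ssh_host: str, python_bin: str,
                           models_root: str, registry: dict = FALLBACKS) -> list[str] | None:
    """Return the ssh argv that downloads the substitute file, or None if unknown."""
    entry = registry.get(missing)
    if entry is None:
        return None
    local_dir = f"{models_root}/{entry['subdir']}"
    script = (
        "from huggingface_hub import hf_hub_download; "
        f"hf_hub_download(repo_id={entry['repo_id']!r}, "
        f"filename={entry['filename']!r}, local_dir={local_dir!r})"
    )
    # Quote for the remote shell: ssh joins its arguments into a single command line.
    remote = f"{shlex.quote(python_bin)} -c {shlex.quote(script)}"
    return ["ssh", ssh_host, remote]


cmd = build_fallback_command(
    "example-model-fp8.safetensors",
    ssh_host="admin@192.168.86.42",      # IMAGER_SSH_HOST
    python_bin="python3",                # placeholder for IMAGER_SSH_PYTHON_BIN
    models_root="/Volumes/models/comfyui",  # IMAGER_COMFYUI_MODELS_ROOT
)
print(cmd)
```

The resulting list could be passed to `subprocess.run`; how the bridge then points the workflow at the substitute file is not covered here.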

Set the SSH target to your compute host (admin@192.168.86.33 in prod, admin@192.168.86.42 in sandbox). On a single-machine install, the SSH target is 127.0.0.1 and the variables are essentially a no-op.

In the custom-node variant (Imager = ComfyUI plugin), this whole section is irrelevant. The plugin and ComfyUI share a filesystem; the user drops models in ComfyUI/models/diffusion_models/ like any other ComfyUI install. SSH download is a cluster-operator thing.

Two install shapes

The bridge ships in two shapes, depending on who runs it.

	Docker standalone (current default)	ComfyUI custom node (in development, ticket #73)
Who it’s for	Cluster operators — Imager on `.141`, ComfyUI on a separate compute host (`.33`, `.42`).	External users — one Mac, MPS, no Docker, no SSH.
Install	`docker compose up -d` on the bridge host, point `IMAGER_COMFYUI_HOST` at the compute box.	`git clone … ComfyUI/custom_nodes/odyssai_imager && restart ComfyUI`.
Model management	Bridge SSHes the compute host, `hf_hub_download` into `/Volumes/models/comfyui/…`.	User drops files in `ComfyUI/models/…` like a normal ComfyUI install.
Auth	`IMAGER_BRIDGE_TOKEN` optional. LAN-only by default.	Same host, same process — auth is not a concern.
WebUI	None needed. Defaults + per-template manifest in `inputs_map.yaml`.	None needed. Same.

The two shapes share the same HTTP API. Companion doesn’t care which shape the bridge is — it just talks to a URL.

In short


What	FastAPI bridge in front of ComfyUI, plus a Companion add-on.
For whom	Anyone who wants image generation as a first-class chat action.
How	User types `/comfyui <prompt>` → Companion add-on → Imager HTTP API → ComfyUI → GPU → result back to the conversation.
Config	Bridge env (`IMAGER_*`); Companion add-on (URL, token, enabled per user).
Usage	Chat slash command, LLM tool call, or direct HTTP.
Templates	3 today (`image-rapide`, `photo-article-tmb`, `image-z-image-turbo`), add by dropping a workflow JSON + a manifest row.
The edge	One chat surface for text and image, with curated ComfyUI workflows and a stable HTTP contract underneath.