This example is only compatible with CLI v1.20 and later. Should you be making use of an older version of the CLI, please run pip install --upgrade cerebrium to upgrade it to the latest version.

Introduction

ComfyUI is a popular no-code interface for building complex stable diffusion workflows. Its ease of use, modular setup, and intuitive flowchart interface have helped the community build an extensive collection of workflows. Several websites offer shared workflows to help you get started:

While experimenting with ComfyUI workflows is straightforward, guidance on production-scale deployment is limited. This tutorial shows how to use Cerebrium to deploy your pipelines as autoscaling API endpoints with pay-as-you-go compute costs. You can find the full example code here.

Creating your Comfy UI workflow locally

Create your workflow either locally or using a rented GPU from Lambda Labs. First, ensure ComfyUI is installed in your local environment.

This tutorial uses Stable Diffusion XL and ControlNet to create custom QR codes. If you already have a workflow, skip to “Export ComfyUI Workflow”.

  1. Let us first create our Cerebrium project with: cerebrium init 1-comfyui
  2. Inside your project, lets copy the ComfyUI GitHub project: git clone https://github.com/comfyanonymous/ComfyUI
  3. Download the following models and install them in the appropriate folders within the ComfyUI folder:
    • SDXL base in models/checkpoints.
    • ControlNet in models/ControlNet.
  4. To run ComfyUI locally, run the command: python main.py —force-fp16 on MacOS. Make sure you run this inside the ComfyUI folder you just cloned.
  5. A server should be loaded locally at http://127.0.0.1:8188/‍

In this view, you should be able to see the default user interface for a ComfyUI workflow. You can use this locally running instance to create your image generation pipeline.

Export ComfyUI Workflow

In our example GitHub repository, we have a workflow.json file. You can click the “Load” button on the right to load in our workflow. You then should see the workflow populated

Don’t worry about the pre-filled values and prompts, we will edit these values on inference when we run our workflow.

To export this workflow to work with Cerebrium, we need to export this to an API format. In the top right over that hovering box on the right. Click the gear icon (settings). You must then make sure that “Enable dev mode” is selected and then can close the popup.

You should then see that a button appear on the right hover box that reads “Save (API format)”. Use that to save your workflow with the name “workflow_api.json”

ComfyUI Application

The main code in main.py uses the exported ComfyUI API workflow to create an endpoint. The code:

  1. Initializes the ComfyUI server
  2. Loads the workflow API template
  3. Processes inputs (prompts, images) through the predict function
  4. Generates base64-encoded image outputs

Note: Cerebrium runs code outside the predict function only during initialization, while the predict function handles all subsequent requests.

We need to alter our workflow_api.json file to have placeholders so that we can replace user values on inference. You can alter the file as follows.

  • Replace line 4, the seed input, with: “{{seed}}”
  • Replace line 45, the input text of node 6 with: “{{positive_prompt}}”
  • Replace line 58, the input text of node 7 with: “{{negative_prompt}}”
  • Replace line 108, the image of node 11 with: “{{controlnet_image}}”

We created a file that contains utility functions to make it easier to work with ComfyUI. You can find the code here. Create a file named helpers.py and copy the code into there.

from typing import Optional
from pydantic import BaseModel

import copy
import json
import os
import time
import uuid
from multiprocessing import Process
from typing import Dict

import websocket
from helpers import (
    convert_outputs_to_base64,
    convert_request_file_url_to_path,
    fill_template,
    get_images,
    setup_comfyui,
)

server_address = "127.0.0.1:8188"
client_id = str(uuid.uuid4())
original_working_directory = os.getcwd()
global json_workflow
json_workflow = None

global side_process
side_process = None
if side_process is None:
    side_process = Process(
        target=setup_comfyui,
        kwargs=dict(
            original_working_directory=original_working_directory,
            data_dir="",
        ),
    )
    side_process.start()

# Load the workflow file as a python dictionary
with open(
    os.path.join("./", "workflow_api.json"), "r"
) as json_file:
    json_workflow = json.load(json_file)

# Connect to the ComfyUI server via websockets
socket_connected = False
while not socket_connected:
    try:
        ws = websocket.WebSocket()
        ws.connect(
            "ws://{}/ws?clientId={}".format(server_address, client_id)
        )
        socket_connected = True
    except Exception as e:
        print("Could not connect to comfyUI server. Trying again...")
        time.sleep(5)

print("Successfully connected to the ComfyUI server!")

class Item(BaseModel):
    workflow_values: Optional[Dict]

def predict(workflow_values=None, run_id, logger):
    item = Item(workflow_values=workflow_values)

    template_values = item.workflow_values

    template_values, tempfiles = convert_request_file_url_to_path(template_values)
    json_workflow_copy = copy.deepcopy(json_workflow)
    json_workflow_copy = fill_template(json_workflow_copy, template_values)
    outputs = {}  # Initialize outputs to an empty dictionary

    try:
        outputs = get_images(
            ws, json_workflow_copy, client_id, server_address
        )

    except Exception as e:
        print('did it get here')
        print("Error occurred while running Comfy workflow: ", e)

    for file in tempfiles:
        file.close()

    result = []
    for node_id in outputs:
        for unit in outputs[node_id]:
            file_name = unit.get("filename")
            file_data = unit.get("data")
            output = convert_outputs_to_base64(
                node_id=node_id, file_name=file_name, file_data=file_data
            )
            result.append(output)

    return {"result": result}

Deploy ComfyUI Application

Running cerebrium deploy uploads both the ComfyUI directory and ~10GB of model weights. For slower connections, use our helper file to download model weights directly to the appropriate folders. Create a file called model.json with the following contents:

[
    {
        "url": "https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/resolve/main/sd_xl_base_1.0.safetensors",
        "path": "models/checkpoints/sd_xl_base_1.0.safetensors"
    },
    {
        "url": "https://huggingface.co/diffusers/controlnet-canny-sdxl-1.0/resolve/main/diffusion_pytorch_model.fp16.safetensors",
        "path": "models/controlnet/diffusers_xl_canny_full.safetensors"
    }
]

This configuration tells the helper function where to download models from and where to save them. Models download only on first deployment, with subsequent deploys checking for existing files.

This is what your file folder structure should look like:

You can now deploy your application by running: cerebrium deploy

Once your ComfyUI application has been deployed successfully, you should be able to make a request to the endpoint using the following JSON payload:

curl --location 'https://api.cortex.cerebrium.ai/v4/p-xxxx/1-comfyui/predict' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer <YOUR TOKEN HERE>' \
--data '{"workflow_values": {
  "positive_prompt": "A top down view of a mountain with large trees and green plants",
  "negative_prompt": "blurry, text, low quality",
  "controlnet_image": "https://cerebrium-assets.s3.eu-west-1.amazonaws.com/qr-code.png",
  "seed": 1000
}
}'

You will get two responses from the output:

  • The base64 encoded image of the outline of your original ControlNet image. This is what it uses as a input in your flow.
  • A base64 encoded image of the final result.

Conclusion

Cerebrium enables companies to deploy production-ready ComfyUI workflows for unique user experiences. Workloads automatically scale with demand, and costs align with actual compute usage. Share your creations by tagging @cerebriumai!