We are saddened by the news that Mystic AI is sunsetting their services. They were an early pioneer in the space and really pushed the industry forward. This guide will walk you through migrating apps from Mystic to Cerebrium to ensure that users’ apps remain functional despite the circumstances.

We’ll show you how to convert your existing Mystic code (using a stable diffusion example) and configuration to work with the Cerebrium platform. You’ll learn about the tools and features available to make this transition as seamless as possible, including ways to optimize your deployments for better performance and cost efficiency.
Cerebrium helps teams deploy and run their models efficiently. Our infrastructure is designed for reliable performance:
The average model cold-starts in 2-5 seconds.
Updates to your code deploy in just 8-14 seconds.
Uptime is 99.9%.
Cerebrium gives you precise control over your computing resources. Instead of managing entire instances, which can become costly and unnecessary, you choose exactly how much CPU, memory, and GPU power you need. You pay only for the resources you use, calculated down to the second. To better understand costs for your specific needs, you can use our pricing calculator.
Your existing Mystic configuration transforms into Cerebrium’s easy-to-understand TOML config:
# cerebrium.toml
[cerebrium.deployment]
name = "stable-diffusion"
python_version = "3.11"
docker_base_image_url = "debian:bookworm-slim"
include = ["./*", "main.py", "cerebrium.toml"]
exclude = [".*"]

[cerebrium.hardware]
compute = "AMPERE_A10"  # Choose your GPU type
cpu = 4                 # Number of CPU cores
memory = 16.0           # Memory in GB
gpu_count = 1           # Number of GPUs

[cerebrium.scaling]
min_replicas = 0        # Save costs when inactive and scale down your app
max_replicas = 2        # Handle increased traffic and scale up where necessary
cooldown = 60           # Time to wait before scaling down an idle instance
replica_concurrency = 1 # The number of requests a single container can support

[cerebrium.dependencies.pip]
torch = ">=2.0.0"
pydantic = "latest"
transformers = "latest"
accelerate = "latest"
diffusers = "latest"
safetensors = "latest"
xformers = "latest"
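Because billing is calculated per second, you can estimate what the hardware block above costs under load. The function below is a minimal sketch of that arithmetic; the rate parameters are hypothetical placeholders, so take real values from the pricing calculator:

def estimate_request_cost(
    runtime_seconds: float,
    gpu_rate: float,     # hypothetical cost of one GPU per second
    cpu_rate: float,     # hypothetical cost of one CPU core per second
    memory_rate: float,  # hypothetical cost of one GB of memory per second
) -> float:
    """Estimate the per-request cost of the [cerebrium.hardware] block above."""
    gpu_count, cpu_cores, memory_gb = 1, 4, 16.0  # mirrors the config
    per_second = gpu_count * gpu_rate + cpu_cores * cpu_rate + memory_gb * memory_rate
    return runtime_seconds * per_second

Note that with min_replicas = 0, you pay nothing while the app is idle; the trade-off is a cold start on the first request after scale-down.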
Now we’ll convert your model implementation. Here’s what a typical Mystic pipeline looks like:
import typing as t
from pathlib import Path

from PIL.Image import Image
from pipeline.cloud.pipelines import run_pipeline
from pipeline.objects.graph import InputField, InputSchema
from pipeline import File, Pipeline, Variable, entity, pipe

HF_MODEL_ID = "runwayml/stable-diffusion-v1-5"


class ModelKwargs(InputSchema):
    num_images_per_prompt: int | None = InputField(
        title="num_images_per_prompt",
        description="The number of images to generate per prompt.",
        default=1,
        optional=True,
    )
    height: int | None = InputField(
        title="height",
        description="The height in pixels of the generated image.",
        default=512,
        optional=True,
        multiple_of=64,
        ge=64,
    )
    width: int | None = InputField(
        title="width",
        description="The width in pixels of the generated image.",
        default=512,
        optional=True,
        multiple_of=64,
        ge=64,
    )
    num_inference_steps: int | None = InputField(
        title="num_inference_steps",
        description=(
            "The number of denoising steps. More denoising steps "
            "usually lead to a higher quality image at the expense "
            "of slower inference."
        ),
        default=50,
        optional=True,
    )


@entity
class StableDiffusionModel:
    def __init__(self) -> None:
        self.model = None
        self.device = None

    @pipe(run_once=True, on_startup=True)
    def load(self) -> None:
        """Load the HF model into memory"""
        import torch
        from diffusers import StableDiffusionPipeline

        device = torch.device("cuda") if torch.cuda.is_available() else "cpu"
        self.model = StableDiffusionPipeline.from_pretrained(HF_MODEL_ID)
        self.model.to(device)

    @pipe
    def predict(self, prompt: str, model_kwargs: ModelKwargs) -> t.List[Image]:
        """Generates a list of PIL images."""
        return self.model(prompt=prompt, **model_kwargs.to_dict()).images

    @pipe
    def postprocess(self, images: t.List[Image]) -> t.List[File]:
        """Creates a list of Files from the `PIL` images."""
        output_images = []
        for i, image in enumerate(images):
            path = Path(f"/tmp/sd/image-{i}.jpg")
            path.parent.mkdir(parents=True, exist_ok=True)
            image.save(str(path))
            output_images.append(File(path=path, allow_out_of_context_creation=True))
        return output_images


with Pipeline() as builder:
    prompt = Variable(
        str,
        title="prompt",
        description="The prompt to guide image generation",
        max_length=512,
    )
    model_kwargs = Variable(ModelKwargs)

    model = StableDiffusionModel()
    model.load()

    images: t.List[Image] = model.predict(prompt, model_kwargs)
    output: t.List[File] = model.postprocess(images)

    builder.output(output)

pipeline_graph = builder.get_pipeline()
This becomes much simpler in Cerebrium. Add the following to your main.py file:
import base64
import io

import torch
from diffusers import StableDiffusionPipeline, DPMSolverMultistepScheduler
from pydantic import BaseModel


# Define the structure of input parameters
class Item(BaseModel):
    prompt: str
    height: int
    width: int
    num_inference_steps: int
    num_images_per_prompt: int


# Load the model and set it up for inference
model_id = "stabilityai/stable-diffusion-2-1"
pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
pipe.enable_xformers_memory_efficient_attention()
pipe = pipe.to("cuda")


# The endpoint we'll call to make inference
def predict(
    prompt: str,
    height: int = 512,
    width: int = 512,
    num_inference_steps: int = 25,
    num_images_per_prompt: int = 1,
):
    item = Item(
        prompt=prompt,
        height=height,
        width=width,
        num_inference_steps=num_inference_steps,
        num_images_per_prompt=num_images_per_prompt,
    )

    images = pipe(
        prompt=item.prompt,
        height=item.height,
        width=item.width,
        num_images_per_prompt=item.num_images_per_prompt,
        num_inference_steps=item.num_inference_steps,
    ).images

    finished_images = []
    for image in images:
        buffered = io.BytesIO()
        image.save(buffered, format="PNG")
        finished_images.append(base64.b64encode(buffered.getvalue()).decode("utf-8"))

    return finished_images
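Before you deploy, you can sanity-check the handler with a plain Python call. This is a minimal local smoke test, not a Cerebrium feature, and it assumes you’re on a machine with a CUDA GPU (the model is moved to "cuda" at import time):

# Local smoke test -- runs the model on your own GPU, outside Cerebrium
if __name__ == "__main__":
    results = predict(prompt="a photo of an astronaut riding a horse on mars")
    print(f"Generated {len(results)} base64-encoded image(s)")

When you’re happy with the output, deploy the app from your project directory with the Cerebrium CLI’s cerebrium deploy command.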
Once your app is deployed, you can make requests to your model using the example cURL request below:
curl --location 'https://api.cortex.cerebrium.ai/v4/p-<YOUR PROJECT ID>/stable-diffusion/predict' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer <YOUR TOKEN HERE>' \
--data '{
  "prompt": "a photo of an astronaut riding a horse on mars"
}'
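Since predict returns a list of base64-encoded PNGs, you’ll want to decode the response on the client side. Here’s a minimal Python sketch of the same request; it assumes the list comes back as JSON, either as the top-level value or under a "result" key, depending on the response envelope:

import base64
import requests

# Placeholders from the cURL example above -- substitute your own values
url = "https://api.cortex.cerebrium.ai/v4/p-<YOUR PROJECT ID>/stable-diffusion/predict"
headers = {
    "Content-Type": "application/json",
    "Authorization": "Bearer <YOUR TOKEN HERE>",
}

response = requests.post(
    url,
    headers=headers,
    json={"prompt": "a photo of an astronaut riding a horse on mars"},
)
response.raise_for_status()
body = response.json()

# Assumption: the base64 list returned by predict() arrives either directly
# or under a "result" key
images = body.get("result", body) if isinstance(body, dict) else body

# Decode each base64 string back into a PNG file
for i, encoded in enumerate(images):
    with open(f"image-{i}.png", "wb") as f:
        f.write(base64.b64decode(encoded))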
Transitioning between platforms requires careful planning and execution. We’re here to help make this process as smooth as possible for teams like yours. Our platform provides the tools and support you need to ensure your apps continue running with unparalleled reliability.
Connect with other developers and our team in our active communities for faster responses and issue resolution:
Join our Discord server.
Join our Slack workspace.
These communities are great places to share migration experiences, get quick answers to technical questions, learn best practices from other developers, and stay updated on new features and improvements.