Run a hyperparameter sweep on Llama 3.2 with WandB
When training machine learning models, finding the perfect combination of hyperparameters can feel overwhelming. Done well, though, it can turn a good model into a great one! Hyperparameter sweeps help you find the best-performing model for the least compute or training time - think of them as a systematic way to test every variation and uncover the best result.
In this tutorial, we’ll walk through training Llama 3.2, using Wandb (Weights & Biases) to run hyperparameter sweeps that optimize its performance. We’ll also leverage Cerebrium to scale our experiments across serverless GPUs, letting us find the best-performing model faster than ever.
If you would like to see the final version of this tutorial, you can view it on GitHub here.
Read this section if you’re unfamiliar with sweeps.
Forget about ML for a second. Imagine you’re making pizzas, and you want to discover the most delicious combination of toppings. You can change three things about your pizza:
• Type of Cheese (mozzarella, cheddar, parmesan)
• Type of Sauce (tomato, pesto)
• Extra Topping (pepperoni, mushrooms, olives)
There are 12 possible combinations of pizzas you can make. One of them will taste the best!
To find out which pizza is the tastiest, you need to try all the combinations and rate them. This process is called a hyperparameter sweep. Your three hyperparameters are the cheese, sauce, and extra topping.
If you do it one pizza at a time, it could take hours. But if you had 12 ovens, you could bake all the pizzas at once and find the best one in just a few minutes!
If an oven is a GPU, then you need 12 GPUs to run all the experiments at once and see which pizza is the best. The power of Cerebrium is the ability to run sweeps like this on 12 different GPUs (or 1,000 GPUs if you’d like) to get you the best version of a model fast.
If you don’t have a Cerebrium account, you can run the following in your CLI:
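Assuming the standard Cerebrium CLI (the exact commands may vary by version, and `llama-sweep` is just a placeholder project name):

```shell
pip install cerebrium          # install the Cerebrium CLI
cerebrium login                # authenticate in the browser
cerebrium init llama-sweep     # scaffold a new project folder
cd llama-sweep
```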
This creates a folder with two files: main.py and cerebrium.toml.
We will return to these files later, but for now we’ll continue the rest of this tutorial in this folder.
Weights & Biases (Wandb) is a powerful tool for tracking, visualizing, and managing machine learning experiments in real-time. It helps you log hyperparameters, metrics, and results, making it easy to compare models, optimize performance, and collaborate effectively with your team.
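To connect your environment to W&B, the usual flow is:

```shell
pip install wandb
wandb login    # prints a link to retrieve your API key
```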
You should see a link printed in your terminal - click it and copy the API key from the webpage back into your terminal.
Add your W&B API key to Cerebrium secrets so your code can use it. Go to your Cerebrium Dashboard and navigate to the “Secrets” tab in the left sidebar. Add the following:
Click the “Save All Changes” button to save the changes!
You should then be authenticated with Wandb and ready to go!
To train with Llama 3.2, you’ll need:
• Model access permission: request access to the gated Llama 3.2 repository on Hugging Face.
• Hugging Face token: create an access token and add it to your Cerebrium secrets as HF_TOKEN.
Our training script adapts this Kaggle notebook.
Create a requirements.txt file with these dependencies:
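The exact list and version pins come from the linked repository; as an assumption, a LoRA fine-tuning stack for Llama 3.2 with W&B tracking typically includes:

```text
torch
transformers
datasets
accelerate
peft
trl
bitsandbytes
wandb
```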
These packages are required both locally and on Cerebrium. Update your cerebrium.toml to include them, and set a response_grace_period so long training runs aren’t cut off mid-epoch. Add this configuration:
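An illustrative fragment - keep whatever your generated cerebrium.toml already contains, and treat the section name below as an assumption about the current schema:

```toml
[cerebrium.scaling]
# Seconds to let an in-flight training request finish before the worker is reclaimed
response_grace_period = 3600
```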
Install the dependencies locally:
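Assuming a standard Python environment:

```shell
pip install -r requirements.txt
```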
Add this code to main.py:
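The real script is in the linked repository; the sketch below only illustrates its shape. The model variant, dataset name, and defaults are assumptions, and Cerebrium secrets surface as environment variables (HF_TOKEN and WANDB_API_KEY are picked up automatically by Hugging Face and W&B):

```python
import wandb
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer


def train(learning_rate: float = 2e-4, lora_r: int = 16, num_epochs: int = 1) -> dict:
    """Called by Cerebrium once per sweep trial with one hyperparameter combination."""
    run = wandb.init(
        project="llama-sweep",
        config={"learning_rate": learning_rate, "lora_r": lora_r, "epochs": num_epochs},
    )

    # Illustrative customer-support dataset; the repo's choice may differ
    dataset = load_dataset(
        "bitext/Bitext-customer-support-llm-chatbot-training-dataset", split="train"
    )

    trainer = SFTTrainer(
        model="meta-llama/Llama-3.2-1B",  # gated repo - requires HF_TOKEN
        train_dataset=dataset,
        peft_config=LoraConfig(r=lora_r, lora_alpha=2 * lora_r, task_type="CAUSAL_LM"),
        args=SFTConfig(
            learning_rate=learning_rate,
            num_train_epochs=num_epochs,
            report_to="wandb",
        ),
    )
    result = trainer.train()
    run.finish()
    return {"train_loss": result.training_loss}
```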
You can read a deeper explanation of the training script here, but here’s a high-level explanation of the code in bullet points:
Deploy the training endpoint:
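With main.py and cerebrium.toml in place, deployment is a single call (assuming the standard Cerebrium CLI):

```shell
cerebrium deploy
```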
This command:
Cerebrium requires no special decorators or syntax - just wrap your training code in a function. The endpoint automatically scales based on request volume, making it perfect for hyperparameter sweeps.
Let’s create a run.py file that we will run locally. Put the following code in it:
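The complete script is in the repository; as a hedged sketch of the orchestration, W&B picks each trial’s hyperparameters and a POST to the Cerebrium endpoint does the training. The endpoint URL, metric name, and parameter ranges below are placeholders, not the repo’s exact values:

```python
import os

import requests
import wandb

# Placeholders: substitute your deployed endpoint URL and Cerebrium API key
ENDPOINT = "https://api.cortex.cerebrium.ai/v4/<project-id>/<app-name>/train"
HEADERS = {"Authorization": f"Bearer {os.environ['CEREBRIUM_API_KEY']}"}

sweep_config = {
    "method": "bayes",  # or "grid" / "random"
    "metric": {"name": "train_loss", "goal": "minimize"},
    "parameters": {
        "learning_rate": {"min": 1e-5, "max": 5e-4},
        "lora_r": {"values": [8, 16, 32]},
        "epochs": {"values": [1, 2]},
    },
}


def run_trial():
    """One sweep trial: W&B chooses the hyperparameters, Cerebrium runs the training."""
    run = wandb.init()
    resp = requests.post(ENDPOINT, json=dict(run.config), headers=HEADERS, timeout=7200)
    resp.raise_for_status()
    run.log({"train_loss": resp.json()["result"]["train_loss"]})
    run.finish()


if __name__ == "__main__":
    sweep_id = wandb.sweep(sweep_config, project="llama-sweep")
    wandb.agent(sweep_id, function=run_trial, count=12)  # 12 trials, like the 12 pizzas
```

One agent runs trials sequentially; to use many GPUs at once, launch several agents against the same sweep_id and Cerebrium will scale the endpoint to match.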
This code uses Weights & Biases sweeps to train a Llama 3.2 model for customer support across multiple hyperparameter combinations. Here’s what it does:
Run the script:
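Kick off the sweep from your local machine:

```shell
python run.py
```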
Monitor training progress in your W&B dashboard:
Export model:
Quality assurance:
Deployment:
Hyperparameter optimization becomes manageable with the right tools. This tutorial showed how combining W&B for tracking and Cerebrium for serverless compute enables efficient hyperparameter sweeps for Llama 3.2, optimizing model performance with minimal effort.
View the complete code in the GitHub repository