Langchain and Langsmith

In this tutorial, we’ll create Cal-vin, an executive assistant that manages calendar appointments (via Cal.com) with employees, customers, partners, and friends. Using the LangChain SDK for agent creation and LangSmith platform for monitoring, we’ll track scheduling activities and identify failure points. Finally, we’ll deploy on Cerebrium to demonstrate seamless deployment and scaling. You can find the final version of the code here

Concepts

This app requires calendar interaction based on user instructions - an ideal use case for an agent with function (tool) calling capabilities. LangChain, a framework with extensive agent support, created LangSmith, making integration straightforward. A tool refers to any framework, utility, or system with defined functionality for specific use cases, such as searching Google or retrieving credit card transactions. Key LangChain concepts: ChatModel.bind_tools(): Attaches tool definitions to model calls. While providers have different tool definition formats, LangChain provides a standard interface for versatility. Accepts tool definitions as dictionaries, Pydantic classes, LangChain tools, or functions, telling the LLM how to use each tool.

@tool
def exponentiate(x: float, y: float) -> float:
    """Raise 'x' to the 'y'."""
    return x**y

AIMessage.tool_calls: An attribute on AIMessage that provides easy access to model-initiated tool calls, specifying invocations in the bind_tools format:

# -> AIMessage(
# 	  content=...,
# 	  additional_kwargs={...},
# 	  tool_calls=[{'name': 'exponentiate', 'args': {'y': 2.743, 'x': 5.0}, 'id': '54c166b2-f81a-481a-9289-eea68fc84e4f'}]
# 	  response_metadata={...},
# 	  id='...'
#   )

create_tool_calling_agent(): Unifies the above concepts to work across different provider formats, enabling easy model switching.

agent = create_tool_calling_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

agent_executor.invoke({"input": "what's 3 plus 5 raised to the 2.743. also what's 17.24 - 918.1241", })

Setup Cal.com

Cal.com provides our calendar management foundation. Create an account here if needed. Cal serves as the source of truth - any updates to time zones or working hours automatically reflect in the assistant’s responses. After creating your account:

Navigate to “API keys” in the sidebar
Create an API key without expiration
Test the setup with a CURL request (replace these variables):
- Username
- API key
- dateFrom and dateTo

curl --location 'https://api.cal.com/v1/availability?apiKey=cal_live_xxxxxxxxxxxxxx&dateFrom=2024-04-15T00%3A00%3A00.000Z&dateTo=2024-04-22T00%3A00%3A00.000Z&username=michael-louis-xxxx'

You should get a response similar to the following:

{
    "busy": [
        {
            "start": "2024-04-15T13:00:00.000Z",
            "end": "2024-04-15T13:30:00.000Z"
        },
        {
            "start": "2024-04-22T13:00:00.000Z",
            "end": "2024-04-22T13:30:00.000Z"
        },
        {
            "start": "2024-04-29T13:00:00.000Z",
            "end": "2024-04-29T13:30:00.000Z"
        },
	   ....
    ],
    "timeZone": "America/New_York",
    "dateRanges": [
        {
            "start": "2024-04-15T13:45:00.000Z",
            "end": "2024-04-15T16:00:00.000Z"
        },
        {
            "start": "2024-04-15T16:45:00.000Z",
            "end": "2024-04-15T19:45:00.000Z"
        },
	    ....
        {
            "start": "2024-04-19T18:45:00.000Z",
            "end": "2024-04-19T21:00:00.000Z"
        }
    ],
    "oooExcludedDateRanges": [

    ],
    "workingHours": [
        {
            "days": [
                1,
                2,
                3,
                4,
                5
            ],
            "startTime": 780,
            "endTime": 1260,
            "userId": xxxx
        }
    ],
    "dateOverrides": [],
    "currentSeats": null,
    "datesOutOfOffice": {}
}

Great! Now we know that our API key is working and pulling information from our calendar. The API calls we will be using later in this tutorial are:

/availability: Get your availability
/bookings: Book a slot

Cerebrium setup

Set up Cerebrium:

Sign up here
Follow installation docs here
Create a starter project:
```
cerebrium init agent-tool-calling
```
This creates:
- main.py: Entrypoint file
- cerebrium.toml: Build and environment configuration

Add these pip packages to your cerebrium.toml:

[cerebrium.dependencies.pip]
pydantic = "latest"
langchain = "latest"
pytz = "latest" ##this is used for timezones
openai = "latest"
langchain_openai = "latest"

Set up API keys:

OpenAI GPT-3.5:
- Sign up at OpenAI
- Create API key here (format: sk_xxxxx)
Add secrets in Cerebrium dashboard:
- Navigate to “Secrets”
- Add keys:
  - CAL_API_KEY: Your Cal.com API key
  - OPENAI_API_KEY: Your OpenAI API key

Agent Setup

Create two tool functions in main.py for calendar management:

Get availability tool
Book slot tool

The Cal.com API provides:

Busy time slots
Working hours per day

Below is the code to achieve this:


from langchain_core.tools import tool
import os
import requests
from cal import find_available_slots

@tool
def get_availability(fromDate: str, toDate: str) -> float:
    """Get my calendar availability using the 'fromDate' and 'toDate' variables in the date format '%Y-%m-%dT%H:%M:%S.%fZ'"""

    url = "https://api.cal.com/v1/availability"
    params = {
        "apiKey": os.environ.get("CAL_API_KEY"),
        "username": "xxxxx",
        "dateFrom": fromDate,
        "dateTo": toDate
    }
    response = requests.get(url, params=params)
    if response.status_code == 200:
        availability_data = response.json()
        available_slots = find_available_slots(availability_data, fromDate, toDate)
        return available_slots
    else:
        return {}

The code above:

Uses @tool decorator to identify functions as LangChain tools
Includes docstrings explaining functionality and required inputs
Uses find_available_slots helper function to format Cal.com API responses into readable time slots ‍

We then follow a similar practice to write our book_slot tool. This will book a slot in my calendar based on the selected time/day. You can get the eventTypeId from your dashboard, select an event and grab the ID in the URL.

@tool
def book_slot(datetime: str, name: str, email: str, title: str, description: str) -> float:
    """Book a meeting on my calendar at the requested date and time using the 'datetime' variable. Get a description about what the meeting is about and make a title for it"""
    url = "https://api.cal.com/v1/bookings"
    params = {
        "apiKey": os.environ.get("CAL_API_KEY"),
        "username": "xxxx",
        "eventTypeId": "xxx",
        "start": datetime,
        "responses": {
            "name": name,
            "email": email,
            "guests": [],
            "metadata": {},
            "location": {
              "value": "inPerson",
              "optionValue": ""
            }
        },
        "timeZone": "America/New York",
        "language": "en",
        "status": "PENDING",
        "title": title,
        "description": description,
    }
    response = requests.post(url, params=params)
    if response.status_code == 200:
        booking_data = response.json()
        return booking_data
    else:
        print('error')
        print(response)
        return {}

Now that we have created our two tools let us create our agent in our main.py file too:

from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.tools import tool
from langchain.agents import create_tool_calling_agent, AgentExecutor
from langchain_openai import ChatOpenAI

prompt = ChatPromptTemplate.from_messages([
    ("system", "you're a helpful assistant managing the calendar of Michael Louis. You need to book appointments for a user based on available capacity and their preference. You need to find out if the user is: From Michaels team, a customer of Cerebrium or a friend or entrepreneur. If the person is from his team, book a morning slot. If its a potential customer for Cerebrium, book an afternoon slot. If its a friend or entrepreneur needing help or advice, book a night time slot. If none of these are available, book the earliest slot. Do not book a slot without asking the user what their preferred time is. Find out from the user, their name and email address."),
    MessagesPlaceholder(variable_name="chat_history"),
    ("human", "{input}"),
    MessagesPlaceholder(variable_name="agent_scratchpad"),
])

tools = [get_availability, book_slot]


llm = ChatOpenAI(model="gpt-3.5-turbo-0125", temperature=0, api_key=os.environ.get("OPENAI_API_KEY"))
agent = create_tool_calling_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

The above snippet is used to create our agent executor which consists of:

Our prompt template:
- This is where we can give instructions to our agent on what role it is taking on, its goal and how it should perform in certain situations etc. The more precise and concise this is, the better.
- Chat History is where we will inject all previous messages so that the agent has context on what was said previously.
- Input is new input from the end user.
We then instantiate our GPT3.5 model that will be the LLM we will be using. You can swap this our with Antrophic or any other provider just by replacing this one line - LangChain makes this seamless.
Lastly, we join this all together with our tools to create an agent executor.

Setup Chatbot

The above code is static in that it will only reply to our first question but we might need to have a conversation to find a time that suits both the user and my schedule. We therefore need to create a chatbot with tool calling capabilities and the ability to remember past messages. LangChain supports this with RunnableWithMessageHistory(). It essentially allows us to store the previous replies of our conversation in a chat_history variable (mentioned above in our prompt template) and tie this all to a session identifier so your API can remember information pertaining to a specific user/session. Below is our code to implement this:

from langchain.memory import ChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory

demo_ephemeral_chat_history_for_chain = ChatMessageHistory()
conversational_agent_executor = RunnableWithMessageHistory(
    agent_executor,
    lambda session_id: demo_ephemeral_chat_history_for_chain,
    input_messages_key="input",
    output_messages_key="output",
    history_messages_key="chat_history",
)

Let us run a simple local test to make sure everything is working as expected.

class Item(BaseModel):
    prompt: str
    session_id: str

def predict(item, run_id, logger):
    item = Item(**item)

    output = conversational_agent_executor.invoke(
        {
            "input": user_input,
        },
        {"configurable": {"session_id": item.session_id}},
    )

    return {"result": output} # return your results

if __name__ == "__main__":
    while True:
        user_input = input("Enter the input (or type 'exit' to stop): ")
        if user_input.lower() == 'exit':
            break
        result = predict({"prompt": user_input, "session_id": "12345"}, "test", logger=None)
        print(result)

The above code does the following:

We define a Pydantic object which specifies the parameters our API expects - the user prompt and a session id to tie the conversation to.
The predict function in Cerebrium is the entry point for our API so we just pass the prompt and session id to our agent and print the results. ‍

To run this, simply install the pip dependencies manually by typing the following into your terminal pip install pydantic langchain pytz openai langchain_openai langchain-community and then run python main.py to execute your main python file. You will need to replace your secrets with the actual values when running locally. You should then see output similar to the following: Langchain Agent

If you keep talking and answering, you will see it will eventually book a slot.

Integrate Langsmith

Production monitoring is crucial, especially for agent applications with indeterministic workflows based on user interactions. LangSmith, a LangChain tool for logging, debugging, and monitoring, helps track performance and handle edge cases. Learn more about LangSmith here. Set up LangSmith monitoring:

Add LangSmith to cerebrium.toml dependencies
Create a free LangSmith account here
Generate API key (click gear icon in bottom left)

Next we need to set the following environment variables. You can add the following code at the top of your main.py. You can add the API key to your secrets in Cerebrium

import os
os.environ['LANGCHAIN_TRACING_V2']="true"
os.environ['LANGCHAIN_API_KEY']=os.environ.get("LANGCHAIN_API_KEY")

Enable tracing by adding the @traceable decorator to your functions. LangSmith automatically tracks tool invocations and OpenAI responses through function traversal. Add the decorator to the predict function and any independently instantiated functions. Edit main.py to have the following:

from langsmith import traceable

@traceable
def predict(item, run_id, logger):

Easy! Now LangSmith is set up. Run python main.py to run your file and test booking an appointment with yourself. After you have completed a successful test run you should see data populating in LangSmith. You should see the following: LangSmith Runs Dashboard

In the Runs tab, you can see all your runs (ie: invocations/API requests). In 1 above, it takes the name of our function, input is set to the Cerebrium RunID which in this case we set to “test”. Lastly, you can see the input as well as the total latency of your run. LangSmith supports various data automations:

Data annotation for positive/negative case labeling
Dataset creation for model training
Online LLM-based evaluation (rudeness, topic analysis)
Webhook endpoint triggers
Additional features

You can set these automations by clicking the “Add rule” button above (2) and specifying under what conditions you would like the above to occur. The options to create a rule on are a filter, a sampling rate, and an action. Lastly, in 3 you can see overall metrics about your project such as number of runs, error rate, latency etc. LangSmith Threads provide clean conversation tracking between agents and users. Track conversation evolution and investigate anomalies through trace analysis. Each thread links to its associated session ID.

Lastly, you can monitor performance metrics regarding your agent in the Monitor tab. It shows metrics such as trace count, LLM call success rate, First time for token and much more. LangSmith Performance Monitoring

LangSmith offers simple integration for agent development with extensive functionality. While we’ve covered the basics, it provides comprehensive support for the application feedback loop: data collection/annotation → monitoring → iteration.

Deploy to Cerebrium

To deploy this application to Cerebrium you can simply run the command: cerebrium deploy in your terminal. Just make sure to delete the name == “main” code since that was just to run locally. If it deployed successfully, you should see something like this: Cerebrium Deployment

You can now call this via an API endpoint and our agent will remember the conversation as long as the session id is the same. Cerebrium will automatically scale up your application based on demand and only pay for the compute you use.

{
    "run_id": "UHCJ_GkTKh451R_nKUd3bDxp8UJrcNoPWfEZ3AYiqdY85UQkZ6S1vg==",
    "status_code": 200,
    "result": {
        "result": {
            "input": "Hi! I would like to book a time with Michael the 18th of April 2024.",
            "chat_history": [],
            "output": "Michael is available on the 18th of April 2024 at the following times:\n1. 13:00 - 13:30\n2. 14:45 - 17:00\n3. 17:45 - 19:00\n\nPlease let me know your preferred time slot. Are you from Michael's team, a potential customer of Cerebrium, or a friend/entrepreneur seeking advice?"
        }
    },
    "run_time_ms": 6728.828907012939,
    "process_time_ms": 6730.178117752075
}

You can find the final version of the code here.

Future Enhancements

Consider implementing:

Response streaming for seamless user experience
Email integration for context-aware scheduling when Claire is tagged
Voice capabilities for phone-based scheduling

Conclusion

The combination of LangChain, LangSmith, and Cerebrium enables scalable agent deployment. LangChain excels at LLM orchestration, tooling, and memory management, while LangSmith provides production monitoring and edge case identification. Cerebrium offers pay-as-you-go scaling across hundreds or thousands of CPU/GPUs. Tag us as @cerebriumai in extensions you make to the code repository so we can share it with our community.

Examples

​Concepts

​Setup Cal.com

​Cerebrium setup

​Agent Setup

​Setup Chatbot

​Integrate Langsmith

​Deploy to Cerebrium

​Future Enhancements

​Conclusion