Rico LeBot • Real-time Voice Interface Toolkit
In November 2024, I had the opportunity to participate in the OpenAI Builders Lab in Paris, where I explored the potential of the Realtime API. I was amazed by how quickly I could build a prototype of a real-time, web-based voice interface that uses function calls.
The initial prototype was a cooking guide: you could ask for a recipe or for instructions on how to prepare a certain dish. It was an amazing and inspiring experience for me because I've spent the last 10 years building similar technology. And it just works.
As I explored the possibilities of this prototype further, it became clear that there were some challenges to overcome before it could become a deployable product. That's why I created an open-source toolkit to address those problems.
Challenges and Solutions
- WebSockets are not well suited to long-lived connections: The official OpenAI Realtime API toolkit relies on WebSockets, which proved unstable for long-running sessions over HTTP. The OpenAI team suggested using WebRTC bridges for better stability. The toolkit implements WebRTC with a LiveKit integration.
- Dynamic UI: I wanted a dynamic UI that could respond to user input in real time. This meant connecting the model's function calls to front-end functions using remote procedure calls (RPC) over WebRTC. This brings the voice interface to life, allowing users to interact with different functions of the app seamlessly.
- Architecture: A clear separation was needed between the web app's backend, the AI agent's backend, and the front-end. The toolkit achieves a lightweight design that is modular, has few dependencies, and is easy to use.
- Roles: To iterate quickly on the user experience, you need to refine the agent's prompts and instructions very frequently. The toolkit keeps the 'roles' separate from the code, so you can add and modify them quickly.
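To make the "Dynamic UI" point more concrete, here is a minimal, transport-agnostic sketch of how a model function call could be forwarded to the front-end as an RPC message. It is not the toolkit's actual implementation and does not use the LiveKit API: the `FunctionCallBridge` class and the `send` callable are hypothetical names introduced for illustration, and `send` stands in for whatever RPC mechanism the WebRTC layer provides.

```python
import json
from typing import Callable, Set

# Hypothetical transport hook: (method_name, json_payload) -> None.
# In a real setup this would wrap the RPC call offered by the WebRTC layer.
Sender = Callable[[str, str], None]


class FunctionCallBridge:
    """Forwards function calls emitted by the model to the front-end."""

    def __init__(self, send: Sender, allowed: Set[str]) -> None:
        self._send = send
        self._allowed = allowed  # names the front-end is known to implement

    def handle(self, name: str, arguments_json: str) -> dict:
        """Validate one function call and forward it; return a status dict."""
        if name not in self._allowed:
            return {"ok": False, "error": f"unknown function: {name}"}
        try:
            args = json.loads(arguments_json or "{}")
        except json.JSONDecodeError as exc:
            return {"ok": False, "error": f"invalid arguments: {exc.msg}"}
        if not isinstance(args, dict):
            return {"ok": False, "error": "arguments must be a JSON object"}
        # Re-serialize so the front-end always receives well-formed JSON.
        self._send(name, json.dumps(args))
        return {"ok": True}
```

With a bridge built for, say, `{"show", "greet", "save", "terminate_session"}`, a call such as `bridge.handle("show", '{"markdown": "# Hi"}')` would be forwarded, while an unknown function name would be rejected before reaching the UI.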
Defining Roles
Roles are defined within the `roles/` and `roles/private` directories. Each role has its own set of configuration and instruction files. The `private` directory is a subfolder for your private roles, kept out of the git scope (via `.gitignore`).
To show the private roles on the main page, set `SHOW_PRIVATE_ROLES = True` in `core/config.py`.
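As an illustration of how such a flag could drive role discovery, the sketch below lists role folders and only includes `roles/private` when the flag is on. This is an assumption about the behavior, not the toolkit's code: `list_roles` is a hypothetical helper, and treating a folder as a role when it contains `agent.instruct` is a convention chosen for this example.

```python
from pathlib import Path
from typing import List

SHOW_PRIVATE_ROLES = True  # mirrors the flag described for core/config.py


def _is_role(path: Path) -> bool:
    # Assumed convention: a role folder contains an agent.instruct file.
    return path.is_dir() and (path / "agent.instruct").is_file()


def list_roles(roles_dir: Path, show_private: bool = SHOW_PRIVATE_ROLES) -> List[str]:
    """Return role names; private roles are prefixed with 'private/'."""
    roles = sorted(
        p.name for p in roles_dir.iterdir() if p.name != "private" and _is_role(p)
    )
    private_dir = roles_dir / "private"
    if show_private and private_dir.is_dir():
        roles += sorted(f"private/{p.name}" for p in private_dir.iterdir() if _is_role(p))
    return roles
```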
Role Components
- `agent.instruct`: Instructions guiding the AI's behavior and available functions.
- `recap.instruct`: Instructions for summarizing conversations.
- `config.py`: Role-specific configurations (e.g., voice settings).
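The three components above could be read from disk roughly as follows. This is a sketch under stated assumptions rather than the toolkit's loader: the `Role` dataclass and `load_role` function are hypothetical, and treating upper-case names in `config.py` as settings is a convention picked for the example.

```python
import runpy
from dataclasses import dataclass, field
from pathlib import Path
from typing import Any, Dict


@dataclass
class Role:
    name: str
    agent_instructions: str
    recap_instructions: str
    config: Dict[str, Any] = field(default_factory=dict)


def load_role(role_dir: Path) -> Role:
    """Read a role's instruction files and its optional config.py."""
    agent = (role_dir / "agent.instruct").read_text(encoding="utf-8")
    recap = (role_dir / "recap.instruct").read_text(encoding="utf-8")
    config: Dict[str, Any] = {}
    config_path = role_dir / "config.py"
    if config_path.is_file():
        # Execute config.py and keep only UPPER_CASE names as settings (assumed convention).
        namespace = runpy.run_path(str(config_path))
        config = {k: v for k, v in namespace.items() if k.isupper()}
    return Role(role_dir.name, agent, recap, config)
```

Because the instructions live in plain text files, editing a prompt only requires reloading the role, not changing any code.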
Adding a New Role
- Create a new folder under `roles/` or `roles/private` with the desired role name (e.g., `roles/customer_support`).
- Add the following files:
  - `agent.instruct`: Define the AI behavior and available functions.
  - `recap.instruct`: Provide instructions for summarizing transcripts.
  - (optional) `config.py`: Specify role-specific settings.
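The steps above can also be scripted. The helper below is a hypothetical convenience, not part of the toolkit: `create_role` and the placeholder file contents are assumptions made for illustration, and the generated files are meant to be edited by hand afterwards.

```python
from pathlib import Path


def create_role(
    roles_dir: Path, name: str, private: bool = False, with_config: bool = False
) -> Path:
    """Create a role folder with placeholder instruction files and return its path."""
    base = roles_dir / "private" if private else roles_dir
    role_dir = base / name
    # Fail loudly instead of overwriting an existing role.
    role_dir.mkdir(parents=True, exist_ok=False)
    (role_dir / "agent.instruct").write_text(
        "Describe the AI behavior and the available functions here.\n", encoding="utf-8"
    )
    (role_dir / "recap.instruct").write_text(
        "Describe how transcripts should be summarized here.\n", encoding="utf-8"
    )
    if with_config:
        (role_dir / "config.py").write_text(
            "# Role-specific settings go here.\n", encoding="utf-8"
        )
    return role_dir
```

For example, `create_role(Path("roles"), "customer_support")` would produce the `roles/customer_support` folder from the first step, ready for its instructions to be filled in.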
Example: roles/dev/agent.instruct
You are Rico Lebot. A direct, straight to the point AI Assistant. You are currently helping the user to debug your functionalities.
You can call different functions:
- `terminate_session`: Called when the user asks to terminate the conversation. This function will end the conversation.
- `show`: Called when you want to display written information. This function displays interpreted information in Markdown on the UI.
- `greet`: Called as soon as entering a conversation. This function starts the conversation.
- `save`: Called to save the current state of the conversation. This function will save the current state of the conversation.
Speak fast, and respond to the user according to their requests.
Example
Let's create an Interactive Storytelling AI, step by step.
📚 Documentation
| URL | Content |
|---|---|
| https://github.com/xbasset/rico-lebot/tree/main/docs/index.md | Main Documentation page |
| https://github.com/xbasset/rico-lebot/tree/main/docs/getting_started.md | Getting Started: Install and run the Demo |
| https://github.com/xbasset/rico-lebot/tree/main/docs/agent_architecture.md | Technical Architecture Documentation |
| https://github.com/xbasset/rico-lebot/tree/main/docs/roles.md | Create, Update and Customize Roles Documentation |
| https://github.com/xbasset/rico-lebot/tree/main/CONTRIBUTING | Documentation to contribute to the project |