
Blog

The world is changing fast. Post-Covid, and with climate change on our minds, we have to ask ourselves: Do we really need more machines? More physical stuff? Do complex hardware systems actually help us communicate better? I’m becoming more skeptical. Maybe language, shared through the platforms we already have, is all we need.

Memory, data, latent spaces, and digital assistants—these are the things I’m obsessed with.

The Evolution of My Coding Muscle

I've spent the last 20 years coding.

My brain thrived on programming languages: C, Java, Python, Dart/Flutter, JS/TS.

I learned tricks, optimized algorithms, and debated design patterns.

Crafting solutions was a meticulous skill—like building precise machinery.

Then AI-powered coding tools arrived. GitHub Copilot. Cursor.

Suddenly, my routine transformed.


Dynamic Confidence Estimation for LLM Task Execution

How do you measure the reliability of LLM outputs in production?

Run tasks multiple times, track dominant results, and calculate dynamic confidence with statistical methods (LLN, CLT). Practical guidelines and code in this post.

As developers, we often use Large Language Models (LLMs) in real-world systems where not only the answer to a task matters, but also how confident we are in that answer. In an experiment, I ran a simple task—"How many R's are in 'strawberry'?"—103 times sequentially and independently on the DeepSeek-R1-Distill-Llama-8B model. This experiment illustrates two key aspects of a production system:

  1. Task Result: What is the output of the task?
  2. Probabilistic Confidence: How much trust can we put in that output?


When we observe successive executions, both pieces of information evolve. The outcome of the task can be inferred using principles like the Law of Large Numbers (LLN) and the Central Limit Theorem (CLT). But if the results do not converge to a stable answer, it might mean that the task is inherently ambiguous for the model. On the other hand, the confidence is evaluated dynamically based on the number of executions and the distribution of outputs.

In other words, you want to ask, "Now that I've done X executions, and I see that Y of them point to one particular result while the rest differ, how confident am I that this result is indeed the correct one?"
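To make this concrete, here is a minimal sketch of the idea (the function name and the 89/14 split are illustrative, not the exact code or counts from my runs): tally the dominant answer across executions, then use the CLT's normal approximation to put an interval around its observed share.

```python
import math
from collections import Counter

def dominant_confidence(results, z=1.96):
    """Return the most frequent answer, its observed share, and a
    CLT-based (normal approximation) confidence interval for that share."""
    counts = Counter(results)
    answer, k = counts.most_common(1)[0]
    n = len(results)
    p = k / n
    se = math.sqrt(p * (1 - p) / n)  # standard error of a proportion
    low, high = max(0.0, p - z * se), min(1.0, p + z * se)
    return answer, p, (low, high)

# Hypothetical split for illustration: 89 of 103 runs answered "3"
runs = ["3"] * 89 + ["2"] * 14
answer, share, (low, high) = dominant_confidence(runs)
print(f"dominant answer: {answer}, share: {share:.2f}, "
      f"95% CI: [{low:.2f}, {high:.2f}]")
```

As the number of executions grows, the interval tightens (the standard error shrinks like 1/√n), which is exactly the LLN/CLT intuition: more executions buy you confidence, but with diminishing returns.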

I've saved 13% of my disk space on my Mac today

Today I cleaned up 65 GB safely, and I feel better now. That's ~13% of my disk space.


See this big gray "System Data" space? That's where I've been cleaning a lot of stuff.

What's annoying as a developer is that it keeps growing, and the built-in macOS storage optimization tools don't really get it.

And I never really took the time to understand it until doing so was my last viable option before buying a bigger Mac.

Yesterday, even the 'magical' Store in iCloud option didn't change a thing.


So I had to figure out what was going on.
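My first step was simply measuring where the bytes were. Here is a rough sketch of that survey in Python; the candidate paths are the usual developer-cache suspects that macOS lumps into "System Data", not an exhaustive list:

```python
import os

def dir_size_gb(path):
    """Sum file sizes under `path`, skipping unreadable entries."""
    total = 0
    for root, _, files in os.walk(path, onerror=lambda e: None):
        for name in files:
            try:
                total += os.path.getsize(os.path.join(root, name))
            except OSError:
                pass  # broken symlinks, permission errors
    return total / 1024**3

# Common developer-cache locations that count toward "System Data"
candidates = ["~/Library/Caches", "~/Library/Developer",
              "~/Library/Containers", "~/Library/Logs"]
for p in candidates:
    full = os.path.expanduser(p)
    if os.path.isdir(full):
        print(f"{dir_size_gb(full):7.1f} GB  {p}")
```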

How Realtime Voice AI is Transforming Digital Interactions

This post is part of a series of reflections on the upcoming transformation of digital interactions:

  • Use Cases
  • Technical (coming soon)
  • Responsible Design (coming soon)

In early November 2024, I spent a day with the OpenAI team in Paris. A few weeks earlier, the announcement of an API allowing developers to create multimodal assistants—voice + text—in real time had made waves. I was fortunate enough to have the time, and firsthand advice, to explore this new Realtime API with the team that designed it.

This technology is what powers the ChatGPT app in voice mode, and it's now available to developers for creating new applications.

I was blown away.

In 6 hours, I prototyped an interaction for a cooking coach. 100% voice-based, with beautiful sound quality and expressive intent and emotion in the voice. User-driven interactions, assistant-driven responses, contextual information references—all orchestrated with very simple code, for a highly encouraging result.
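To give a sense of what "very simple code" means here, this is roughly the shape of a Realtime session over a plain WebSocket. The model name and event types follow OpenAI's docs from late 2024, and the cooking-coach instructions are illustrative rather than my prototype's actual prompt:

```python
import json, os
from websocket import create_connection  # pip install websocket-client

url = "wss://api.openai.com/v1/realtime?model=gpt-4o-realtime-preview"
ws = create_connection(url, header=[
    f"Authorization: Bearer {os.environ['OPENAI_API_KEY']}",
    "OpenAI-Beta: realtime=v1",
])

# One session.update is enough to give the assistant its persona and voice
ws.send(json.dumps({
    "type": "session.update",
    "session": {
        "instructions": "You are a warm, step-by-step cooking coach.",
        "voice": "alloy",
    },
}))

# Ask for a first response and stream the server events back
ws.send(json.dumps({"type": "response.create"}))
while True:
    event = json.loads(ws.recv())
    print(event["type"])  # e.g. response.audio.delta, response.done
    if event["type"] == "response.done":
        break
ws.close()
```

In a real app, the audio delta events carry base64-encoded audio chunks you feed to the speaker; the point is how little orchestration the session itself requires.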

Rico LeBot • Real-time Voice Interface Toolkit

In November 2024, I had the opportunity to participate in the OpenAI Builders Lab in Paris, where I was able to explore the potential of the Realtime API. I was amazed by how quickly I could build a prototype for a real-time, web-based voice interface that uses function calls.

The initial prototype was focused on a cooking guide where you could ask for a recipe or for instructions on how to prepare a certain dish. It was an amazing and inspiring experience for me, because I've spent the last 10 years building similar technology. And it just works.

As I explored the possibilities of this prototype further, it became clear that some challenges had to be overcome before it could become a deployable product. That's why I created an open-source toolkit to address them.

Challenges and Solutions
  • WebSockets not well suited for long-lived connections: The official OpenAI Realtime API toolkit relies on WebSockets, which proved unstable for long sessions over HTTP. The OpenAI team suggested using WebRTC bridges for better stability. The toolkit implements WebRTC with a LiveKit integration.

  • Dynamic UI: I wanted a UI that could respond to user input in real time. This meant connecting the model's function calls to front-end functions using remote procedure calls (RPC) over WebRTC. This brings the voice interface to life, letting users interact with different parts of the app seamlessly.

  • Architecture: A clear separation was needed between the web app's backend, the AI agent's backend, and the front-end. The toolkit achieves a lightweight design that is modular, has few dependencies, and is easy to use.

  • Roles: To iterate quickly on the user experience, you need to refine the agent's prompts and instructions very frequently. The toolkit handles this through an architecture where the 'roles' are kept separate from the code, so you can add and modify them very quickly (see the sketch below).
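For instance, a role can be nothing more than a small file loaded at session start. The file layout and keys below are an illustration of the idea, not Rico LeBot's actual schema:

```python
import yaml  # pip install pyyaml

# Illustrative role file, e.g. roles/cooking_coach.yaml:
#
#   name: cooking_coach
#   voice: alloy
#   instructions: |
#     You are a friendly cooking coach. Guide the user one step at a time.
#   tools: [show_recipe, next_step]

def load_role(path):
    """Load an agent role (prompt + settings) from a YAML file so it can be
    tweaked between iterations without touching the application code."""
    with open(path) as f:
        role = yaml.safe_load(f)
    return {
        "name": role["name"],
        "instructions": role["instructions"],
        "voice": role.get("voice", "alloy"),
        "tools": role.get("tools", []),
    }
```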

Rico LeBot – Screenshot

Evolution of jobs

I remember, as a kid in the 90s, my mother, a hospital pharmacist, telling me about her struggles with the new computers she had to use. She felt these computers were stupid tools because they didn't understand her, and she often came home disturbed and uncomfortable with this new way of working. Using them was not her choice; it was forced on her, with limited bandwidth to adapt, and that felt bad.

At the same time, I remember my own passion for tech growing, wondering why my mother was so bothered by it while I found so much joy in it.

Jobs have always evolved over time. One notable example is the blacksmith, a profession that grew significantly from the Middle Ages onward, driven by population growth and the strong demand from the agricultural world.

Maréchal-ferrant (farrier)

What I build

My interests revolve around natural interactions with digital services, and AI as a tool to make them happen.


Motivations

My goal?

Create ways everyone can trust when interacting with technology. Be it for communicating, learning, or just playing around.

But here’s the thing.

How we connect with the digital world keeps evolving. We started with punch cards, then screens and keyboards. Then came the mouse and GUI. We got touchscreens, then pocket-sized devices. Now, it’s voice interactions.

More input. More ways to connect.

And with more input comes a deeper understanding of people.

But the big question: as things change, how important is data ownership in our future with digital assistants?

I want to build a world with the kind of tech I’d want my kids to use and live with for their entire lives.

What do our interactions with machines reveal?

Xavier Basset - 2020 - TEDx Mines Ales
Xavier on stage at TEDx

TL;DR:

As a designer and developer, I have often thought about the ways in which our interactions with machines reveal insights about ourselves as humans.

These interactions can help us understand ourselves better and push us towards more intuitive and natural uses of technology. I have personally experienced the emotional power of such interactions, as demonstrated by the close connection my child formed with a robot during the holiday season. Additionally, I have come to appreciate the concept of attention in machine-human interactions and the importance of building relationships through attention and response. Empathy is also crucial in designing technology that meets the needs and emotions of the user.

I believe that technology has the potential to facilitate communication and connection between people, even across language barriers.

This is the best opportunity to help people communicate, learn, play, and build love and happiness together, at scale.