Inside Pipecat: Building Voice Agents That Improve Themselves with Kwin Kramer, Co-Founder of Pipecat

"Every piece of software is going to have a voice interface. Everyone is going to have voice as a primary input. I just don't think that idea has gotten the traction it deserves yet."

On this week's episode of Skywatch, Bluejay's car podcast, Rohan sat down with Kwin Kramer, co-founder of Pipecat and Daily. Kwin has been betting on real-time infrastructure since before most people knew they needed it. Ten out of twelve investors passed on Daily in 2016, saying they were just happy making phone calls. That same stubbornness eventually built Pipecat, now the most widely used open-source framework for voice agents. NVIDIA and AWS have standardized on it.

Before all of that, he was building gestural interfaces that ended up in Minority Report. Steven Spielberg shot demo reels for his company. Then came room-sized systems for Fortune 500 companies. Then the bet nobody else wanted to take.

A few things from this conversation we have not stopped thinking about:

In summer 2023, Kwin was showing voice agent demos to anyone who would watch. Every single person had the same three reactions: "Whoa." "That cannot be real." And then, "I do not know if I want to talk to an AI." He thinks we are still not fully past that third one.

During early COVID, when video usage exploded overnight, Kwin did not sleep more than three hours at a time for three months. They kept adding servers and chasing the next bottleneck. A lot of that growth turned out not to be durable revenue. He kept serving everyone anyway. That is the kind of thing that builds real infrastructure companies.

Continual improvement is still unsolved. Closing the loop, so that your agents get better automatically from production data without human intervention at every step, is what Kwin calls the defining engineering challenge of 2026.

The people building the foundations of voice AI right now are thinking about problems most of us have not even named yet. This conversation is a good place to start.

Full episode on YouTube and Spotify 👇

Spotify: https://open.spotify.com/episode/18pRHSXu1t4qYnhwyc4etf?si=86zyi1NrTC-0WBUNAGeGPg

YouTube: