by kwindla on 5/13/24, 5:21 PM with 39 comments
On the theory that something like a LlamaIndex or LangChain for real-time/conversational AI would be useful, a few of us started working on a Python library for voice (and multimodal) AI assistants/agents.
So ... Pipecat: a framework for building things like personal coaches, meeting assistants, story-telling toys for kids, customer support bots, virtual friends, and snarky social bots.
Most of the core contributors to Pipecat so far work together at our day jobs. This has been a kind of "20% time" thing at our company. But we're serious about welcoming all contributions. We want Pipecat to support any and all models, services, transport layers, and infrastructure tooling. If you're interested in this stuff, please check it out and let us know what you think. Submit PRs. Become a maintainer. Join the Discord. Post cool stuff. Post funny stuff when your voice agent goes completely off the rails (as mine sometimes do).
by awenix on 5/13/24, 6:59 PM
by ilaksh on 5/13/24, 6:12 PM
Edit: someone found one: https://news.ycombinator.com/item?id=40346992
by johnmaguire on 5/13/24, 6:17 PM
From what I can tell, Siri is still a dumpster fire that nobody is willing to use. And I have no personal experience with Alexa, so I can't speak to it. But I do have a few Google Home speakers and an Android phone, and I have seen no major improvements in years. In fact, Google Assistant has gotten worse - for example, you can no longer add items directly to AnyList[0], only Google Keep.
Or, as an incredibly simple example of something I thought we'd get a long time ago, it's still unable to interpret two-part requests, e.g. "please repeat that but louder," or "please turn off the kitchen and dining room lights."
I find voice assistants very useful - especially when driving, lying in bed, cooking, or when I'm otherwise preoccupied. Yet they have stagnated almost since their debut. I can only imagine nobody has found a viable way to monetize them.
What will it take to get a better voice assistant for consumers? Willow[1] doesn't seem to have taken off.
[0] https://help.anylist.com/articles/google-assistant-overview/
edit: I realize I hijacked your thread to dump something that's been on my mind lately. Pipecat looks really cool, and I hope it takes off! I hope to get some time to experiment this weekend.
by userhacker on 5/14/24, 3:59 AM
by xan_ps007 on 5/13/24, 7:24 PM
by russ on 5/13/24, 7:49 PM
by orliesaurus on 5/14/24, 3:11 AM
by canadiantim on 5/13/24, 6:02 PM
by 35mm on 5/14/24, 2:30 PM
by bamazizi on 5/13/24, 6:10 PM
The demo of a real-time, multi-language translated conversation blew me away!