More

nextworddev · 2025-12-27T01:09:28 1766797768

(Inception scene) here a minute is seven hours

nextworddev · 2025-12-26T00:31:26 1766709086

ok this is a pretty stupid take from an otherwise smart guy.

tikok/YT shorts/IG reels is many orders of magnitude higher supply of slop than Simon Schuster paperbacks

nextworddev · 2025-12-25T22:21:56 1766701316

Meanwhile Harrison Chase is laughing his way to the bank

nextworddev · 2025-12-25T17:45:28 1766684728

Don’t worry, SWE profession won’t really exist in 5 years as it currently stands

petermcneeley · 2025-12-25T19:24:27 1766690667

I thought AGI was 2027. Given it is a week from 2026 we only have 1 year left right?

nextworddev · 2025-12-25T22:20:31 1766701231

We don't need AGI to replace the median dev

dvrj101 · 2025-12-26T04:02:02 1766721722

From. > SWE profession won’t really exist To. > we don't median dev (another vague term)

bunch of goalpost moving statements.

nextworddev · 2025-12-26T04:11:56 1766722316

How so? Median is a concrete metric.

nextworddev · 2025-12-25T16:28:30 1766680110

one shotted vibe coded blog

nextworddev · 2025-12-25T16:28:03 1766680083

This is a misread of Benioff's intent behind his comment lol.

Salesforce has a vested interest in maintaing its seat based licenses, so it's not in favor of mass layoffs.

Internally Salesforce is pushing AgentForce full stop

nextworddev · 2025-12-25T01:09:22 1766624962

Can I connect this to Twilio

kwindla · 2025-12-25T02:29:47 1766629787

One easy way to build voice agents and connect them to Twilio is the Pipecat open source framework. Pipecat supports a wide variety of network transports, including the Twilio MediaStream WebSocket protocol so you don't have to bounce through a SIP server. Here's a getting started doc.[1]

(If you do need SIP, this Asterisk project looks really great.)

Pipecat has 90 or so integrations with all the models/services people use for voice AI these days. NVIDIA, AWS, all the foundation labs, all the voice AI labs, most of the video AI labs, and lots of other people use/contribute to Pipecat. And there's lots of interesting stuff in the ecosystem, like the open source, open data, open training code Smart Turn audio turn detection model [2], and the Pipecat Flows state machine library [3].

[1] - https://docs.pipecat.ai/guides/telephony/twilio-websockets [2] - https://github.com/pipecat-ai/pipecat-flows/ [3] - https://github.com/pipecat-ai/smart-turn

Disclaimer: I spend a lot of my time working on Pipecat. Also writing about both voice AI in general and Pipecat in particular. For example: https://voiceaiandvoiceagents.com/

ldenoue · 2025-12-25T03:57:03 1766635023

The problem with PipeCat and LiveKit (the 2 major stacks for building voice ai) is the deployment at scale.

That’s why I created a stack entirely in Cloudflare workers and durable objects in JavaScript.

Providers like AssemblyAI and Deepgram now integrate VAD in their realtime API so our voice AI only need networking (no CPU anymore).

nextworddev · 2025-12-25T04:16:23 1766636183

let me get this straight, you are storing convo threads / context in DOs?

e.g. Deepgram (STT) via websocket -> DO -> LLM API -> TTS?

ldenoue · 2025-12-27T01:26:03 1766798763

Yes DO let you handle long lived websocket connections. I think this is unique to Cloudflare. AWS or Google Cloud don't seem to offer these things (statefulness basically).

Same with TTS: some like Deepgram and ElevenLabs let you stream the LLM text (or chunks per sentence) over their websocket API, making your Voice AI bot really really low latency.

nextworddev · 2025-12-25T03:17:43 1766632663

This is good stuff.

In your opinion, how close is Pipecat + OSS to replacing proprietary infra from Vapi, Retell, Sierra, etc?

kwindla · 2025-12-25T15:11:22 1766675482

It depends on what you mean by replacing.

The integrated developer experience is much better on Vapi, etc.

The goal of the Pipecat project is to provide state of the art building blocks if you want to control every part of the multimodal, realtime agent processing flow and tech stack. There are thousands of companies with Pipecat voice agents deployed at scale in production, including some of the world's largest e-commerce, financial services, and healthtech companies. The Smart Turn model benchmarks better than any of the proprietary turn detection models. Companies like Modal have great info about how to build agents with sub-second voice-to-voice latency.[1] Most of the next-generation video avatar companies are building on Pipecat.[2] NVIDIA built the ACE Controller robot operating system on Pipecat.[3]

[1] https://modal.com/blog/low-latency-voice-bot - [2] https://lemonslice.com/ = [3] https://github.com/NVIDIA/ace-controller/

nextworddev · 2025-12-25T16:14:31 1766679271

Is there a simple, serverless version of deploying Pipecat stack, without: - me having to self host on my infra

I just want to provide: - business logic - tools - configuration metadata (e.g. which voice to use)

I don't like Vapi due to 1) extensive GUI driven experience, 2) cost

ldenoue · 2025-12-27T01:27:39 1766798859

Check out something like LayerCode (Cloudflare based).

Or PipeCat Cloud / LiveKit cloud (I think they charge 1 cent per minute?)

ldenoue · 2025-12-25T03:54:59 1766634899

I developed a stack on Cloudflare workers where latency is super low and it is cheap to run at scale thanks to Cloudflare pricing.

Runs at around 50 cents per hour using AssemblyAI or Deepgram as the STT, Gemini Flash as LLM and InWorld.ai as the TTS (for me it’s on par with ElevenLabs and super fast)

picardo · 2025-12-25T15:56:07 1766678167

Is AssemblyAI or Deepgram compatible with OpenAI Realtime API, esp. around voice activity detection and turn taking? How do you implement those?

ldenoue · 2025-12-27T01:29:38 1766798978

I am not using speech to speech APIs like OpenAI, but it would be easy to swap the STT + LLM + TTS to using Realtime (or Gemini Live API for that matter).

OpenAI realtime voices are really bad though, so you can also configure your session to accept AUDIO and output TEXT, and then use any TTS provider (like ElevenLabs or InWord.ai, my favorite for cost) so generate the audio.

pugio · 2025-12-25T04:43:59 1766637839

Do you have anything written up about how you're doing this? Curious to learn more...

ldenoue · 2025-12-27T01:30:10 1766799010

I don't but I should open source this code. I was trying to sell to OEM though, that's why. Are you interested in licensing it?

VladVladikoff · 2025-12-25T01:45:41 1766627141

Technically yes, twilio has sip trunks.

nextworddev · 2025-12-24T21:43:45 1766612625

They should have bought nbis

nextworddev · 2025-12-24T03:46:03 1766547963

The reason why there are meetings is due to existing org layers.

Thus, the root cause of the meetings' existence is BS mostly. That's why you have BS meetings.

The fastest way to drive AI adoption is thus by thinning out org layers.

recursive · 2025-12-24T04:13:05 1766549585

Not all meetings are status reports. There is also the working meeting where people figure stuff out.

nextworddev · 2025-12-23T18:21:47 1766514107

Now rebrand as an AI agent company and raise at 1bn valuation /s