<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
    <channel>
        <title>Engineering voice agents: Latency, quality, and scale — Rishabh Bhargava, Together AI</title>
        <link>https://video.ut0pia.org/videos/watch/0d693833-d12f-449e-9f56-e7091ffa7b6c</link>
        <description>Users notice latency above 500 ms and hang up above one second. In an already optimized pipeline, 75 ms of network latency from models hosted in a different data center adds 30% overhead; colocating everything in the same building drops that to around 5 ms. Rishabh Bhargava from Together AI walks through the full speech-to-text, LLM, and text-to-speech pipeline at that level of specificity. The LLM dominates the budget: the target is 200 to 300 ms time to first token with a model in the 8B to 30B parameter range, since larger models blow the latency budget and smaller ones break tool calling. The speech-to-text target is P90 latency under 100 ms with around 6% word error rate. One pattern for handling complex workflows without adding latency: a small thinker LLM handles conversation flow and issues a single tool call to a larger model when the request is complex, keeping the fast path fast. Speaker info: https://www.linkedin.com/in/bhargavarishabh</description>
        <lastBuildDate>Mon, 01 Jun 2026 05:52:50 GMT</lastBuildDate>
        <docs>https://validator.w3.org/feed/docs/rss2.html</docs>
        <generator>PeerTube - https://video.ut0pia.org</generator>
        <image>
            <title>Engineering voice agents: Latency, quality, and scale — Rishabh Bhargava, Together AI</title>
            <url>https://video.ut0pia.org/lazy-static/avatars/0287a09a-aae7-4840-9843-b416426e7046.webp</url>
            <link>https://video.ut0pia.org/videos/watch/0d693833-d12f-449e-9f56-e7091ffa7b6c</link>
        </image>
        <copyright>All rights reserved, unless otherwise specified in the terms specified at https://video.ut0pia.org/about and potential licenses granted by each content's rightholder.</copyright>
        <atom:link href="https://video.ut0pia.org/feeds/video-comments.xml?videoId=0d693833-d12f-449e-9f56-e7091ffa7b6c" rel="self" type="application/rss+xml"/>
    </channel>
</rss>