<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
    <channel>
        <title>Beyond Transcription: Building Voice AI That Understands Conversations — Hervé Bredin, pyannoteAI</title>
        <link>https://video.ut0pia.org/videos/watch/28e991d4-136c-4187-bf6e-48a6f4867d46</link>
        <description>The Open ASR Leaderboard reports NVIDIA Parakeet at an 11.4% word error rate on AMI meeting data. Hervé Bredin runs the same model on the same dataset and gets 26%. Same model, same recordings, different microphone: the leaderboard uses headset audio, he uses the table mic. Most voice AI benchmarks are measuring single-speaker speech and calling it solved. The talk covers speaker diarization (who speaks when), why combining it with transcription is harder than it looks, and what breaks at the word level when two speakers overlap. Bredin demos live on a two-speaker phone call, walks through the word that falls between two speaker boundaries with no clean owner, and runs pyannoteAI's Precision 2 model down to 3% diarization error against the open-source baseline at 5%. State of the art today: 2% on clean telephone calls, 41% in a noisy restaurant. Speaker info: https://x.com/hbredin, https://www.linkedin.com/in/herve-bredin/, https://github.com/hbredin</description>
        <lastBuildDate>Sat, 06 Jun 2026 10:11:18 GMT</lastBuildDate>
        <docs>https://validator.w3.org/feed/docs/rss2.html</docs>
        <generator>PeerTube - https://video.ut0pia.org</generator>
        <image>
            <title>Beyond Transcription: Building Voice AI That Understands Conversations — Hervé Bredin, pyannoteAI</title>
            <url>https://video.ut0pia.org/lazy-static/avatars/0287a09a-aae7-4840-9843-b416426e7046.webp</url>
            <link>https://video.ut0pia.org/videos/watch/28e991d4-136c-4187-bf6e-48a6f4867d46</link>
        </image>
        <copyright>All rights reserved, unless otherwise specified in the terms specified at https://video.ut0pia.org/about and potential licenses granted by each content's rightholder.</copyright>
        <atom:link href="https://video.ut0pia.org/feeds/video-comments.xml?videoId=28e991d4-136c-4187-bf6e-48a6f4867d46" rel="self" type="application/rss+xml"/>
    </channel>
</rss>