<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
    <channel>
        <title>The maturity phases of running evals — Phil Hetzel, Braintrust</title>
        <link>https://video.ut0pia.org/videos/watch/54faa9c3-fcd9-4fc5-b8bd-d930be9f1d01</link>
        <description>Most teams approach evals like unit tests and try to cover every possible failure. Phil Hetzel from Braintrust argues that this is the wrong frame: enumerate your known failure modes, cover those specifically, and ship. The goal is a flywheel in which production traces surface what is going wrong, feed back into offline experimentation, and guide the next improvement. The session walks through four maturity stages: vibe checking with documented human justifications rather than a bare thumbs up or down; LLM-as-judge scoring built from those justifications and run at scale; and then the hard part, tool calls that touch external systems. Context-gathering tools are manageable. CRUD tools are not, because you have to represent the state of the external systems at the exact moment the original trace ran. Two approaches for getting there are timestamp queries against a vector database and injecting captured system state directly into the trace. Speaker info: https://www.linkedin.com/in/philliphetzel/</description>
        <lastBuildDate>Thu, 28 May 2026 15:19:47 GMT</lastBuildDate>
        <docs>https://validator.w3.org/feed/docs/rss2.html</docs>
        <generator>PeerTube - https://video.ut0pia.org</generator>
        <image>
            <title>The maturity phases of running evals — Phil Hetzel, Braintrust</title>
            <url>https://video.ut0pia.org/lazy-static/avatars/0287a09a-aae7-4840-9843-b416426e7046.webp</url>
            <link>https://video.ut0pia.org/videos/watch/54faa9c3-fcd9-4fc5-b8bd-d930be9f1d01</link>
        </image>
        <copyright>All rights reserved, unless otherwise specified in the terms specified at https://video.ut0pia.org/about and potential licenses granted by each content's rightholder.</copyright>
        <atom:link href="https://video.ut0pia.org/feeds/video-comments.xml?videoId=54faa9c3-fcd9-4fc5-b8bd-d930be9f1d01" rel="self" type="application/rss+xml"/>
    </channel>
</rss>