<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
    <channel>
        <title>From 46% to 90%: Fine-Tuning Tiny LLMs for On-Device Agents — Cormac Brick, Google</title>
        <link>https://video.ut0pia.org/videos/watch/82426cd8-9140-4c49-9b59-5dd95b81e107</link>
        <description>FunctionGemma ships at 270 million parameters and reaches a prefill speed of nearly 2,000 tokens per second on a Pixel 7. Out of the box, on a fixed set of app intents, it hits 46% accuracy. Fine-tuned on a synthetically generated dataset, it clears 90% on eight of ten functions. Cormac Brick covers the two options developers have for on-device AI: Gemini Nano via AICore for common tasks, and LiteRT-LM for custom models that ship inside your app. The session walks through a live skill harness built on Gemma 4 with a restaurant roulette demo running fully on-device, and Eloquent, a production transcription app built by chaining two models under a few hundred million parameters. Speaker info: https://www.linkedin.com/in/cbrick/</description>
        <lastBuildDate>Wed, 20 May 2026 23:07:16 GMT</lastBuildDate>
        <docs>https://validator.w3.org/feed/docs/rss2.html</docs>
        <generator>PeerTube - https://video.ut0pia.org</generator>
        <image>
            <title>From 46% to 90%: Fine-Tuning Tiny LLMs for On-Device Agents — Cormac Brick, Google</title>
            <url>https://video.ut0pia.org/lazy-static/avatars/0287a09a-aae7-4840-9843-b416426e7046.webp</url>
            <link>https://video.ut0pia.org/videos/watch/82426cd8-9140-4c49-9b59-5dd95b81e107</link>
        </image>
        <copyright>All rights reserved, unless otherwise specified in the terms specified at https://video.ut0pia.org/about and potential licenses granted by each content's rightholder.</copyright>
        <atom:link href="https://video.ut0pia.org/feeds/video-comments.xml?videoId=82426cd8-9140-4c49-9b59-5dd95b81e107" rel="self" type="application/rss+xml"/>
    </channel>
</rss>