<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
    <channel>
        <title>We Cut 94% of AI Coding Tokens With a Local Code Index - Rajkumar Sakthivel, Tesco</title>
        <link>https://video.ut0pia.org/videos/watch/334bf3d1-53a9-46cb-a42c-69ac9dfd0430</link>
        <description>Every AI coding tool we tried made the same assumption: send as much context as possible. In our production codebase, that meant sending 45,000 tokens per query, even when only ~5,000 were actually useful. We didn’t notice how inefficient this was until we saw the cost and latency impact. We tried improving prompts and tweaking model settings, but nothing addressed the core problem: we were optimising the model, not the context. So we built a local retrieval layer between the codebase and the agent. Instead of sending full files, we structured code using AST-aware chunks (tree-sitter), combined vector search with keyword matching for better retrieval, and used a lightweight relationship layer to follow execution across files. The result: 👉 94% reduction in tokens 👉 faster responses 👉 more accurate outputs. The hardest problem wasn’t retrieval; it was knowing when retrieval was wrong. We experimented with LLM-based scoring and threshold tuning, but a simple heuristic ended up working best. Everything runs locally, with no data leaving the machine, and one index supports multiple AI tools. In this talk, I’ll walk through what we got wrong initially, why context matters more than model tuning, the architecture behind the system, and real benchmarks and trade-offs. The key takeaway: 👉 The biggest optimisation in AI coding isn’t the model; it’s the context. Speakers: Rajkumar Sakthivel (Tesco): Rajkumar Sakthivel builds LLM infrastructure at scale and co-created Code Context Engine after his team's AI coding bill jumped from £15 to £200 in a single month. X/Twitter: https://x.com/rajkumarsakthi LinkedIn: https://www.linkedin.com/in/rajkumar-sakthivel/ GitHub: https://github.com/rajkumarsakthivel</description>
        <lastBuildDate>Mon, 29 Jun 2026 15:51:33 GMT</lastBuildDate>
        <docs>https://validator.w3.org/feed/docs/rss2.html</docs>
        <generator>PeerTube - https://video.ut0pia.org</generator>
        <image>
            <title>We Cut 94% of AI Coding Tokens With a Local Code Index - Rajkumar Sakthivel, Tesco</title>
            <url>https://video.ut0pia.org/lazy-static/avatars/0287a09a-aae7-4840-9843-b416426e7046.webp</url>
            <link>https://video.ut0pia.org/videos/watch/334bf3d1-53a9-46cb-a42c-69ac9dfd0430</link>
        </image>
        <copyright>All rights reserved, unless otherwise specified in the terms specified at https://video.ut0pia.org/about and potential licenses granted by each content's rightholder.</copyright>
        <atom:link href="https://video.ut0pia.org/feeds/video-comments.xml?videoId=334bf3d1-53a9-46cb-a42c-69ac9dfd0430" rel="self" type="application/rss+xml"/>
    </channel>
</rss>