<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
    <channel>
        <title>The Small Model Infrastructure Nobody Built (So We Did) — Filip Makraduli, Superlinked</title>
        <link>https://video.ut0pia.org/videos/watch/9ed9f20b-d951-4aa2-9a62-05baf9224a6b</link>
        <description>Most embedding infrastructure assumes you know exactly which model you want ahead of time. This talk starts where that assumption breaks. Filip Makraduli walks through the real profiling mistakes, infrastructure gaps, and production constraints that led to building an embedding inference engine designed for dynamic model loading, hot-swapping, and memory-aware eviction instead of brittle one-model-per-container deployments. If you're working on small-model inference, embeddings, or GPU infrastructure, this is a practical look at what breaks in the real world and how to design around it. Speaker info: https://www.linkedin.com/in/filipmakraduli/</description>
        <lastBuildDate>Wed, 06 May 2026 13:45:04 GMT</lastBuildDate>
        <docs>https://validator.w3.org/feed/docs/rss2.html</docs>
        <generator>PeerTube - https://video.ut0pia.org</generator>
        <image>
            <title>The Small Model Infrastructure Nobody Built (So We Did) — Filip Makraduli, Superlinked</title>
            <url>https://video.ut0pia.org/lazy-static/avatars/0287a09a-aae7-4840-9843-b416426e7046.webp</url>
            <link>https://video.ut0pia.org/videos/watch/9ed9f20b-d951-4aa2-9a62-05baf9224a6b</link>
        </image>
        <copyright>All rights reserved, unless otherwise specified in the terms at https://video.ut0pia.org/about or in licenses granted by each content's rightholder.</copyright>
        <atom:link href="https://video.ut0pia.org/feeds/video-comments.xml?videoId=9ed9f20b-d951-4aa2-9a62-05baf9224a6b" rel="self" type="application/rss+xml"/>
    </channel>
</rss>