<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
    <channel>
        <title>Road to 5 Million Tokens: Breaking Barriers in Long Context Training — Max Ryabinin, Together AI</title>
        <link>https://video.ut0pia.org/videos/watch/009bdbea-e3df-4a44-ad4c-7aae8f21322c</link>
        <description>Training a standard LLaMA 3B model with a 3 million token context on a single 8xH100 node fails before you even start: the model parameters alone exhaust GPU memory. Max Ryabinin from Together AI walks through the full stack of techniques needed to get there: fully sharded data parallelism, DeepSpeed Ulysses context parallelism for an 8x activation reduction, activation checkpointing for another 8x, CPU offloading of transformer block inputs, and chunked sequence training to avoid allocating buffers 3 million tokens wide. Even that stack falls short at 5 million tokens. The novel contribution, Untied Ulysses, goes deeper into the context parallelism step: instead of allocating one large buffer per attention head group, it chunks the heads further and reuses those buffers across iterations, cutting activation memory with negligible throughput impact. At both 8B and 32B scale, the results match the most memory-optimized transformer training baselines while pushing sequence length 25% further than prior Ulysses implementations. Speaker info: https://www.linkedin.com/in/max-ryabinin/, https://x.com/m_ryabinin</description>
        <lastBuildDate>Wed, 17 Jun 2026 10:24:45 GMT</lastBuildDate>
        <docs>https://validator.w3.org/feed/docs/rss2.html</docs>
        <generator>PeerTube - https://video.ut0pia.org</generator>
        <image>
            <title>Road to 5 Million Tokens: Breaking Barriers in Long Context Training — Max Ryabinin, Together AI</title>
            <url>https://video.ut0pia.org/lazy-static/avatars/0287a09a-aae7-4840-9843-b416426e7046.webp</url>
            <link>https://video.ut0pia.org/videos/watch/009bdbea-e3df-4a44-ad4c-7aae8f21322c</link>
        </image>
        <copyright>All rights reserved, unless otherwise specified in the terms specified at https://video.ut0pia.org/about and potential licenses granted by each content's rightholder.</copyright>
        <atom:link href="https://video.ut0pia.org/feeds/video-comments.xml?videoId=009bdbea-e3df-4a44-ad4c-7aae8f21322c" rel="self" type="application/rss+xml"/>
    </channel>
</rss>