    Blog

    Stay informed about models we're releasing, upgrades to our API services and our thoughts on the industry.

    The Workhorse Era of AI: Moats Are Built, Not Rented

    When everyone's using the same frontier models, nobody has an edge.

    Oct 17, 2025

    Michael Ryaboy

    Announcing our $11.8M Series Seed

    We raised $11.8 million in funding led by Multicoin Capital and a16z CSX to train and host custom language models that are faster, more affordable, and more accurate than what the Big Labs offer.

    Oct 14, 2025

    Sam Hogan

    RAG Is Over: RL Agents Are the New Retrieval Stack

    RL takes search agents to the next level. Without RL, agentic search is powerful but slow; you often need expensive frontier models to get the best results. With RL, it becomes much more viable.

    Sep 23, 2025

    Michael Ryaboy

    Introducing Schematron: Structured HTML Extraction 40-80x Cheaper than GPT-5

    Schematron-8B and Schematron-3B deliver frontier-level extraction quality at 1-2% of the cost and 10x+ faster inference than large, general-purpose LLMs.

    Sep 9, 2025

    Michael Ryaboy

    Arbitraging Down LLM Inference to the Cost of Electricity

    What if we let every GPU run serverless inference and could verify that its LLM output is correct?

    Aug 25, 2025

    Michael Ryaboy

    Introducing ClipTagger-12b: SoTA Video Understanding at 15x Lower Cost

    We're thrilled to announce the release of ClipTagger-12b, a groundbreaking open-source vision-language model that delivers GPT-4.1-level performance for video understanding at a fraction of the cost.

    Aug 14, 2025

    Sam Hogan

    GPU-Rich Labs Have Won: What's Left for the Rest of Us is Distillation

    Massive training runs and powerful but expensive models mean another technique is starting to dominate: distillation.

    Jul 31, 2025

    Michael Ryaboy

    On the Economics of Hosting Open Source Models

    The open source community is buzzing around the new Wan release, but what are the economics of the businesses hosting it right now? Or hosting open source models in general?

    Jul 29, 2025

    Amar Singh

    Batch vs Real-Time LLM APIs: When to Use Each

    Not every LLM request needs an immediate response. Chat interfaces need real-time. But data extraction, enrichment, and background jobs can wait hours.

    Jul 24, 2025

    Michael Ryaboy

    Do You Need Model Distillation? The Complete Guide

    Model distillation is particularly valuable in scenarios where large models are impractical due to resource constraints or performance requirements.

    Jul 22, 2025

    Michael Ryaboy