DeepSeek-V3-0324 is now live.Try it

    CHANGELOG Header Image

    CHANGELOG

    Mar 21, 2025

    March 21st, 2025

    New Model: Google Gemma3

    The new Gemma 3 model family is a multi-modal model capable of accepting image inputs and understands over 140 languages.

    It has a 128k context length and comes in 1b, 4b, 12b, and 27b variants.

    As of today, we're offering the 27b variant at $0.40 per million input/output tokens.

    Structured Output and JSON Mode

    Enable structured outputs support for several models:

    5x-8x Faster DeepSeek Model Performance

    We made some huge optimizations to our DeepSeek R1 and V3 models that have resulted in token-per-second speedups in the range of 5x to 8x.

    Feb 28, 2025

    February 28th, 2025

    Happy Friday from the inference.net team!

    We have some exciting new releases this week.

    ---

    New Models, Including DeepSeek R1 and DeepSeek V3

    We've released several new models that you can find on the model explorer page.

    Here's whats new:

    We offer some of the best pricing for these models and are so excited to see what you build with them!

    Image Input Support

    Vision-language-models (VLM's for short) are a special type of language model which has the ability to see visually and accepts images as input.

    We support image input with the Llama 3.2 11b model and support for image inputs is now generally available.

    Visit the docs here to learn how you can start using image inputs in your app today.

    New Website Launch

    We've released a brand new landing page at inference.net, packed with interactive model playgrounds and pricing tools to help you explore our capabilities and optimize your costs. Let us know what you think!

    Feb 21, 2025

    Week of Feb 21, 2025

    • Release of new model playground experience, allowing for easy testing of models on the dashboard.
    • Added GitHub OAuth. This is now the recommended way to login to Inference.net.
    • Improved TTFT: Requests are saved and dispatched to inference runtime 30% faster than previously, resulting in improved TTFT.
    • Release of VLM support for multi-modals like meta/llama-3.1-11b

    Feb 4, 2025

    Website & Dashboard Migration to TanStack Start

    Major frontend overhaul and new marketing website.

    • TanStack Start Integration: Implemented server-side rendering (SSR) capabilities for improved SEO and initial page load performance.
    • TanStack Router Migration: Moved from React Router to TanStack Router for enhanced type safety and better integration with our stack.
    • File-based Routing: Adopted a more intuitive file-based routing system.
    • Type-safe Routing: Implemented fully type-safe routing throughout the application.
    • New Marketing Website: Shipped a brand new marketing website, which is fully integrated with our dashboard as a single frontend.
    • Integrated CMS: Integrated a new CMS tool into our system, to support blog and changelog content.

    START BUILDING TODAY

    15 minutes could save you 50% or more on compute.