CHANGELOG

Mar 21, 2025

March 21st, 2025

New Model: Google Gemma3

The new Gemma 3 model family is a multi-modal model capable of accepting image inputs and understands over 140 languages.

It has a 128k context length and comes in 1b, 4b, 12b, and 27b variants.

As of today, we're offering the 27b variant at $0.40 per million input/output tokens.

Structured Output and JSON Mode

Enable structured outputs support for several models:

5x-8x Faster DeepSeek Model Performance

We made some huge optimizations to our DeepSeek R1 and V3 models that have resulted in token-per-second speedups in the range of 5x to 8x.

Feb 28, 2025

February 28th, 2025

Happy Friday from the inference.net team!

We have some exciting new releases this week.

---

New Models, Including DeepSeek R1 and DeepSeek V3

We've released several new models that you can find on the model explorer page.

Here's whats new:

We offer some of the best pricing for these models and are so excited to see what you build with them!

Image Input Support

Vision-language-models (VLM's for short) are a special type of language model which has the ability to see visually and accepts images as input.

We support image input with the Llama 3.2 11b model and support for image inputs is now generally available.

Visit the docs here to learn how you can start using image inputs in your app today.

New Website Launch

We've released a brand new landing page at inference.net, packed with interactive model playgrounds and pricing tools to help you explore our capabilities and optimize your costs. Let us know what you think!

Feb 21, 2025

Week of Feb 21, 2025

Release of new model playground experience, allowing for easy testing of models on the dashboard.
Added GitHub OAuth. This is now the recommended way to login to Inference.net.
Improved TTFT: Requests are saved and dispatched to inference runtime 30% faster than previously, resulting in improved TTFT.
Release of VLM support for multi-modals like meta/llama-3.1-11b

Feb 4, 2025

Website & Dashboard Migration to TanStack Start

Major frontend overhaul and new marketing website.

TanStack Start Integration: Implemented server-side rendering (SSR) capabilities for improved SEO and initial page load performance.
TanStack Router Migration: Moved from React Router to TanStack Router for enhanced type safety and better integration with our stack.
File-based Routing: Adopted a more intuitive file-based routing system.
Type-safe Routing: Implemented fully type-safe routing throughout the application.
New Marketing Website: Shipped a brand new marketing website, which is fully integrated with our dashboard as a single frontend.
Integrated CMS: Integrated a new CMS tool into our system, to support blog and changelog content.