

    EXPLORE MODELS

    Explore and experiment with today's leading models. Use our model documentation to set up your model of choice in minutes.
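
    To make "minutes" concrete, here is a minimal Python sketch of what that setup usually looks like against an OpenAI-compatible chat completions endpoint. The base URL, API key environment variable, and model identifier are placeholders, not values from this page; take the real ones from the model documentation.

        import os
        from openai import OpenAI

        # Minimal sketch, assuming an OpenAI-compatible API.
        # The base URL, API key variable, and model name below are placeholders;
        # substitute the values given in the model documentation.
        client = OpenAI(
            base_url="https://api.example.com/v1",      # placeholder endpoint
            api_key=os.environ["INFERENCE_API_KEY"],    # placeholder env var
        )

        response = client.chat.completions.create(
            model="deepseek-v3",                        # placeholder model identifier
            messages=[{"role": "user", "content": "Say hello in one sentence."}],
        )
        print(response.choices[0].message.content)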

    TEXT TO TEXT

    Prices shown are per 1 million tokens
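
    As a quick worked example of how those rates translate into a bill, the sketch below estimates the cost of one request. It assumes the two figures on each card are input and output prices respectively, which is the common convention but worth confirming in the pricing documentation.

        # Back-of-the-envelope cost estimate from per-1M-token rates.
        # Assumes a listed "$X / $Y" pair means input / output pricing.
        def request_cost(input_tokens, output_tokens, input_price, output_price):
            """All prices are USD per 1 million tokens."""
            return (input_tokens * input_price + output_tokens * output_price) / 1_000_000

        # Example: a model priced at $0.40 / $1.20, with 250,000 input tokens
        # and 50,000 output tokens in a request.
        print(request_cost(250_000, 50_000, 0.40, 1.20))  # -> 0.16 (USD)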

    DeepSeek
    FP8

    DeepSeek R1 Distill Llama 70B

    Feel the power of reasoning models. This distilled model beats GPT-4o on math & matches o1-mini on coding.

    $0.40 / $0.40
    16K Context
    DeepSeek
    FP8

    DeepSeek V3

    DeepSeek-V3 is a 671 billion parameter Mixture-of-Experts (MoE) language model optimized for efficiency and performance, demonstrating superior results across various benchmarks through innovative strategies and extensive pre-training on high-quality data.

    $0.40 / $1.20
    125K Context
    Google
    BF16

    Google Gemma 3

    Gemma 3 is a versatile, lightweight, multimodal open model family from Google DeepMind. It handles text and image inputs and generates text, supports over 140 languages with a 128K context window, and is designed for easy deployment in resource-constrained environments.

    $0.30 / $0.40
    125K Context
    JSON
    Meta
    FP16

    Llama 3.1 70B Instruct

    The Meta Llama 3.1 collection consists of high-performing, multilingual large language models optimized for dialogue and capable of handling text and code across 8 languages, available in 8B, 70B, and 405B parameter sizes, with a focus on safety, inclusivity, and societal benefit.

    $0.30 / $0.40
    16K Context
    JSON
    Meta
    FP8
    FP16

    Llama 3.1 8B Instruct

    Meta Llama 3.1 is a collection of advanced, multilingual large language models designed for dialogues, available in 8B, 70B, and 405B sizes, that outperform many chat models on industry benchmarks and emphasize safe, responsible use in various applications.

    $0.025 / $0.025
    16K Context
    JSON
    Meta
    FP16

    Llama 3.2 11B Vision Instruct

    Llama 3.2-Vision, developed by Meta, is a state-of-the-art multimodal language model optimized for image recognition, reasoning, and captioning, surpassing both open and closed models in industry benchmarks.

    $0.055 / $0.055
    16K Context
    JSON
    Meta
    FP16

    Llama 3.2 1B Instruct

    Llama 3.2 is a multilingual large language model collection from Meta, fine-tuned for dialogue and summarization tasks in multiple languages, designed for enhanced retrieval and conversational agents.

    $0.01 / $0.01
    16K Context
    JSON
    Meta
    FP16

    Llama 3.2 3B Instruct

    Llama 3.2 is a multilingual large language model collection optimized for dialogue, retrieval, and summarization tasks with enhanced performance on industry benchmarks, employing supervised fine-tuning and reinforcement learning for safety and human-aligned responses.

    $0.02 / $0.02
    16K Context
    JSON
    Tool Calling
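
    For models carrying the Tool Calling tag, a request can include function definitions that the model may choose to invoke. The sketch below shows the general shape of such a call through an OpenAI-compatible client; the endpoint, model identifier, and get_weather tool are illustrative placeholders, and exact support should be checked in the model documentation.

        import os
        from openai import OpenAI

        # Hedged sketch of a tool-calling request via an OpenAI-compatible API.
        # Base URL, model name, and the get_weather tool are placeholders.
        client = OpenAI(
            base_url="https://api.example.com/v1",
            api_key=os.environ["INFERENCE_API_KEY"],
        )

        tools = [{
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Look up the current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }]

        response = client.chat.completions.create(
            model="llama-3.2-3b-instruct",  # placeholder identifier
            messages=[{"role": "user", "content": "What's the weather in Lisbon?"}],
            tools=tools,
        )

        # If the model decided to call the tool, the arguments arrive as JSON.
        for call in response.choices[0].message.tool_calls or []:
            print(call.function.name, call.function.arguments)
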
    Mistral
    FP8

    Mistral Nemo 12B Instruct

    Mistral-NeMo-12B-Instruct is a 12-billion-parameter multilingual large language model designed for English-language chat applications, featuring impressive multilingual and code comprehension, with customization options via NVIDIA's NeMo Framework.

    $0.038 / $0.10
    16K Context
    JSON
    Tool Calling
    Qwen
    BF16

    Qwen 2.5 7B Vision Instruct

    Qwen2.5-7B-Instruct is a multilingual large language model from Alibaba Cloud, offering enhanced capabilities in knowledge, coding, mathematics, and instruction-following, along with support for processing long texts and generating structured outputs like JSON.

    $0.20 / $0.20
    125K Context
    JSON
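
    The JSON tag on this and several other cards refers to structured output. As a minimal sketch, an OpenAI-compatible client can request it with the response_format parameter; whether a given model here honors it, and the model identifier used below, are assumptions to verify against the model documentation.

        import json
        import os
        from openai import OpenAI

        # Hedged sketch of requesting JSON-formatted output.
        # Assumes the endpoint honors the OpenAI-style response_format parameter.
        client = OpenAI(
            base_url="https://api.example.com/v1",
            api_key=os.environ["INFERENCE_API_KEY"],
        )

        response = client.chat.completions.create(
            model="qwen-2.5-7b-instruct",  # placeholder identifier
            messages=[
                {"role": "system", "content": "Reply only with a JSON object."},
                {"role": "user", "content": 'List three model families as {"families": [...]}.'},
            ],
            response_format={"type": "json_object"},
        )

        print(json.loads(response.choices[0].message.content))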

    START BUILDING TODAY

    15 minutes could save you 50% or more on compute.