DeepSeek-V3-0324 is now live.

    PRICING

    The best prices on the market. 90% lower cost than other providers. All models are billed by usage.

    Start for free

    Begin with $25 in free credits to explore our models via the Playground.

    Simple API call

    Switch to Inference by changing a single line of code. Start saving in 5 minutes.

    Pay as you go

    Only pay for what you use. Set limits and monitor usage via our dashboards.

    TEXT TO TEXT

    Prices shown are per 1 million tokens

    Model                          Quantization  Input   Output
    DeepSeek R1                    FP8           $0.75   $3.00
    DeepSeek R1 Distill Llama 70B  FP8           $0.40   $0.40
    DeepSeek V3                    FP8           $0.40   $1.20
    DeepSeek V3 0324               FP8           $0.75   $1.50
    Google Gemma 3                 BF16          $0.30   $0.40
    Llama 3.1 70B Instruct         FP16          $0.30   $0.40
    Llama 3.1 8B Instruct          FP16          $0.03   $0.03
    Llama 3.1 8B Instruct          FP8           $0.025  $0.025
    Llama 3.2 11B Vision Instruct  FP16          $0.055  $0.055
    Llama 3.2 1B Instruct          FP16          $0.01   $0.01
    Llama 3.2 3B Instruct          FP16          $0.02   $0.02
    Llama 3.3 70B Instruct         FP16          $0.30   $0.40
    Mistral Nemo 12B Instruct      FP8           $0.038  $0.10
    Qwen 2.5 7B Vision Instruct    BF16          $0.20   $0.20
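
    Because input and output tokens are priced separately per 1 million tokens, the cost of a workload is the sum of two products. The following is a minimal sketch of that arithmetic, not an official billing calculator: the `PRICES` dictionary and the `estimate_cost` helper are hypothetical names introduced for illustration, and the three entries are copied from the table above. Actual invoices may be rounded or metered differently.

```python
from decimal import Decimal

# USD per 1 million tokens, copied from the pricing table:
# (model, quantization) -> (input price, output price).
# Only a few rows are included; this dictionary is illustrative, not an API.
PRICES = {
    ("DeepSeek R1", "FP8"): (Decimal("0.75"), Decimal("3.00")),
    ("DeepSeek V3 0324", "FP8"): (Decimal("0.75"), Decimal("1.50")),
    ("Llama 3.3 70B Instruct", "FP16"): (Decimal("0.30"), Decimal("0.40")),
}

TOKENS_PER_PRICE_UNIT = Decimal(1_000_000)


def estimate_cost(model, quantization, input_tokens, output_tokens):
    """Estimate the USD cost of a workload from its token counts."""
    input_price, output_price = PRICES[(model, quantization)]
    total = Decimal(input_tokens) * input_price + Decimal(output_tokens) * output_price
    return total / TOKENS_PER_PRICE_UNIT


if __name__ == "__main__":
    # 2M input tokens and 500k output tokens on DeepSeek R1:
    # 2 * $0.75 + 0.5 * $3.00
    print(estimate_cost("DeepSeek R1", "FP8", 2_000_000, 500_000))
```

    `Decimal` is used instead of `float` so that small per-token prices do not pick up binary rounding error when summed over many requests.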

    NEED A RESEARCH GRANT?

    Inference’s Grants program offers free compute resources to researchers and developers working on open-source AI projects. Fill out an application and our team will be in touch within 24 hours.

    NEED ENTERPRISE PRICING?

    Inference is the best solution for large-scale operations looking to source affordable inference compute. Leverage our network's capabilities and our team's expertise for your next initiative.

    START BUILDING TODAY

    15 minutes could save you 50% or more on compute.