Pricing
Custom models trained for your use case, or serverless API access to popular open-source models.
Custom Model Pricing
We train and deploy custom models fine-tuned to your use case. Built from scratch or distilled from frontier models, deployed on our infrastructure or yours.
Serverless API Pricing
Pay-as-you-go pricing for popular open-source models. 90% lower cost than other providers.
Start for free
Begin with $25 in free credits to explore our models via the Playground.
Integrate in minutes
Switch to Inference.net by changing a single line of code. Start saving today.
Pay-as-you-go
Only pay for what you use. Set limits and monitor usage via our dashboards.
Text-to-Text
Prices shown are per 1 million tokens
| Model | Quantization | Input | Output |
|---|---|---|---|
| Llama 3.1 8B Instruct | FP16 | $0.02 | $0.03 |
| Llama 3.1 8B Instruct | FP8 | $0.025 | $0.025 |
| Llama 3.2 1B Instruct | FP16 | $0.01 | $0.01 |
| Llama 3.2 3B Instruct | FP16 | $0.02 | $0.02 |
| Mistral Nemo 12B Instruct | FP8 | $0.038 | $0.10 |
| Osmosis Structure 0.6B | FP32 | $0.10 | $0.50 |
| Schematron 3B | BF16 | $0.02 | $0.05 |
| Schematron 8B | BF16 | $0.04 | $0.10 |
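To see how the per-million-token rates translate into a bill, here is a minimal sketch that estimates the cost of a workload from the table above. The function name `estimate_cost`, the `PRICES` dictionary, and the sample token counts are illustrative assumptions, not part of any Inference.net SDK. It also assumes prices are in USD and that usage is billed linearly per token.

```python
from decimal import Decimal

# Per-1M-token prices copied from the Text-to-Text table above.
# Keys are (model, quantization); values are (input price, output price).
# This dictionary is a hypothetical helper, not an official price feed.
PRICES = {
    ("Llama 3.1 8B Instruct", "FP16"): (Decimal("0.02"), Decimal("0.03")),
    ("Llama 3.1 8B Instruct", "FP8"): (Decimal("0.025"), Decimal("0.025")),
    ("Schematron 8B", "BF16"): (Decimal("0.04"), Decimal("0.10")),
}

TOKENS_PER_PRICING_UNIT = Decimal(1_000_000)


def estimate_cost(model, quantization, input_tokens, output_tokens):
    """Estimate the cost of a workload, assuming linear per-token billing."""
    input_price, output_price = PRICES[(model, quantization)]
    total = Decimal(input_tokens) * input_price + Decimal(output_tokens) * output_price
    return total / TOKENS_PER_PRICING_UNIT


# Example workload (made-up numbers): 50M input tokens and 10M output tokens.
cost = estimate_cost("Llama 3.1 8B Instruct", "FP16", 50_000_000, 10_000_000)
print(f"${cost:.2f}")  # prints "$1.30"
```

`Decimal` is used instead of floats so that fractional-cent prices such as $0.025 add up without rounding drift.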
Image-to-Text
Prices shown are per 1 million tokens
| Model | Quantization | Input | Output |
|---|---|---|---|
| ClipTagger 12B | FP8 | $0.30 | $0.50 |
| Google Gemma 3 | BF16 | $0.15 | $0.30 |
| Llama 3.2 11B Vision Instruct | FP16 | $0.055 | $0.055 |
| Qwen 2.5 7B Vision Instruct | BF16 | $0.20 | $0.20 |
Embeddings
Prices shown are per 1 million input tokens
| Model | Input |
|---|---|
| Qwen 3 Embedding 4B | $0.01 |
Own your model. Scale with confidence.
Schedule a call with our research team to learn more about custom training. We'll propose a plan that beats your current SLA and unit cost.





