UNMATCHED SCALE & COST
Handle large-scale asynchronous LLM tasks with unparalleled cost-efficiency. World-class performance at the most competitive prices.
No Rate Limits
Process millions of requests in parallel at maximum throughput without rate limits or slowdowns.
Lowest Prices
The best prices on the market. 90% lower cost than other providers.
POWERING ADVANCED WORKFLOWS
Queue massive batches of requests. Poll for results, or receive webhooks when processing is complete.
Synthetic Data Generation
Easily generate gigatoken scale post-training datasets at a fraction of the cost. First-class support for popular frameworks like Bespoke Labs Curator.
RAG Pre-Processing
Efficiently process document batches to create real-time datasets for RAG applications. Never worry about slowdowns or rate limits.
Data Extraction
Native support for JSON mode, tool calling, and more. Use top open source projects like Outlines to extract structured data from batches of documents.
BUILT FOR DEVELOPERS
We put developer experience at the forefront of our design process. Integrate in 2 minutes, find the code samples you need, and monitor your jobs in real-time.
Complete API Docs
Detailed API documentation, quick-start guides, and code samples for seamless integration.
Fully OpenAI-Compatible
Our batch API is fully compatible with the OpenAI SDK. Switch providers with only a two line code change.
Real-Time Monitoring
Monitor batch jobs with real-time dashboards and metrics to ensure optimal performance.
Meta Llama 3.1 8B Instruct FP8
Meta Llama 3.1 is a collection of advanced, multilingual large language models designed for dialogues, available in 8B, 70B, and 405B sizes, that outperform many chat models on industry benchmarks and emphasize safe, responsible use in various applications.
Meta Llama 3.1 70B Instruct FP8
The Meta Llama 3.1 collection consists of high-performing, multilingual large language models optimized for dialogue and capable of handling text and code across 8 languages, available in 8B, 70B, and 405B parameter sizes, with a focus on safety, inclusivity, and societal benefit.
Meta Llama 3.2 11B Instruct FP16
Llama 3.2-Vision, developed by Meta, is a state-of-the-art multimodal language model optimized for image recognition, reasoning, and captioning, surpassing both open and closed models in industry benchmarks.