Explore Models
Explore and experiment with today's leading models. Use our model documentation to set up your model of choice in minutes.
Workhorse Models

Schematron 3B
Schematron-3B is a large language model designed for reasoning and complex problem solving, with a focus on accuracy and efficiency across domains. It also offers advanced capabilities for structured output generation.

Schematron 8B
Schematron-8B is a large language model designed for reasoning and complex problem solving, with a focus on accuracy and efficiency across domains. It also offers advanced capabilities for structured output generation.

ClipTagger 12B
ClipTagger-12b is an efficient, open-source, 12-billion-parameter vision-language model designed for scalable video understanding. It produces schema-consistent JSON outputs for video frames, providing frontier-quality performance at a fraction of the cost of leading closed-source models.
Text-to-Text
Prices shown are per 1 million tokens

Nemotron 3 Super

Schematron 3B
Schematron-3B is a large language model designed for reasoning and complex problem solving, with a focus on accuracy and efficiency across domains. It also offers advanced capabilities for structured output generation.

Schematron 8B
Schematron-8B is a large language model designed for reasoning and complex problem solving, with a focus on accuracy and efficiency across domains. It also offers advanced capabilities for structured output generation.
Image-to-Text
Prices shown are per 1 million tokens

ClipTagger 12B
ClipTagger-12b is an efficient, open-source, 12-billion-parameter vision-language model designed for scalable video understanding. It produces schema-consistent JSON outputs for video frames, providing frontier-quality performance at a fraction of the cost of leading closed-source models.

Google Gemma 3
Gemma 3 is a versatile, lightweight, multimodal open-source model family by Google DeepMind, built for text and image processing and for text generation. It supports over 140 languages with a 128K context window and is designed for easy deployment in resource-constrained environments.


