    Comparison · June 2026

    Inference Catalyst vs Langfuse

    A side-by-side comparison of two OpenTelemetry-native platforms for AI agent observability. Both capture LLM traces; they differ in what you can do with those traces afterward, and in the broader workflow, pricing, and hosting around them.

    TL;DR

    • Both capture traces. Catalyst adds analysis that tells you what to fix. Halo, an open-source recursive language model (RLM), reads entire agent runs and returns ranked issues with cited trace IDs. Langfuse focuses on capturing and displaying traces.

    • Catalyst is agent-tracing first. It's designed around full agent runs, where many LLM and tool calls make up a single trace, not just individual model calls. Langfuse started as LLM-call tracing and added agent support later.

    • Catalyst covers the full agent-improvement loop. Tracing, evals, datasets, fine-tuning, and deployment of custom models on one platform. Langfuse centers on observability and prompt management.

    • Langfuse leads on open-source self-hosting and prompt management. If a fully self-hostable platform or a first-class prompt CMS is your priority, it's a strong fit.

    What is Langfuse?

    Langfuse is an open-source LLM engineering platform for tracing and evaluating AI applications and managing their prompts. It began as a way to trace individual LLM calls, generations, and prompts, and has since grown to cover agent traces too. Its core is MIT-licensed and can be self-hosted, alongside a managed cloud offering and a commercially licensed enterprise tier (SSO admin, RBAC, audit logs, SCIM). Teams use it to capture LLM and agent traces, manage prompts through a CMS and playground, run evaluations, and store datasets. In early 2026, Langfuse was acquired by ClickHouse.

    What is Inference Catalyst?

    Inference Catalyst is the agent observability and improvement platform from Inference.net. It's agent-tracing first, designed around full multi-step agent runs rather than individual model calls. It captures OpenTelemetry traces from your agents and adds Halo, an open-source recursive language model, to analyze those traces and surface what to fix. Beyond tracing, Catalyst covers the rest of the loop: inference observability, evals, datasets, fine-tuning custom models, and deploying them behind one API key. The platform is cloud-hosted; Halo itself is open source, including a local UI you can self-host to capture traces and run analysis on your own machine, though the hosted version is where it's most capable. You can read the tracing docs for supported languages and frameworks.

    Where Catalyst goes further: trace analysis

    Both tools capture LLM calls, tool calls, and the structure of an agent run, and both let you search and inspect them. Where they diverge is what happens next.

    Catalyst includes Halo, a recursive language model (RLM). Across all of an agent's runs there's far more data than fits in a normal model's context window, so most tools can only inspect a slice at a time. Halo's context is effectively unlimited, so it reads the full body of runs at once and returns a ranked list of failure modes, each backed by the trace IDs that show it happening. You start with what matters instead of scrolling a dashboard.

    Halo

    Open source · MIT
    • Reads across all of an agent's runs at once with effectively unlimited context, rather than the truncated slice a normal model is limited to.
    • Returns ranked failure modes, each backed by the trace IDs that show it happening.
    • Run it locally on exported traces, or use the hosted version for chat-with-your-traces, scheduled runs, and high-level metrics.
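
    To make "ranked failure modes, each backed by trace IDs" concrete, here is a minimal Python sketch of that output shape. It is not Halo's algorithm or API: the Finding type, the rank_findings helper, and the pre-labeled input are all hypothetical, and the ranking is a plain count of supporting traces.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """Hypothetical shape for one ranked finding (not Halo's real schema)."""
    failure_mode: str
    trace_ids: tuple[str, ...]  # evidence: the traces where the failure shows up


def rank_findings(labeled_traces: list[tuple[str, str]]) -> list[Finding]:
    """Group (trace_id, failure_mode) pairs and rank modes by supporting trace count."""
    by_mode: dict[str, list[str]] = defaultdict(list)
    for trace_id, failure_mode in labeled_traces:
        by_mode[failure_mode].append(trace_id)
    findings = [Finding(mode, tuple(ids)) for mode, ids in by_mode.items()]
    # Most-cited failure mode first; ties broken alphabetically for stable output.
    return sorted(findings, key=lambda f: (-len(f.trace_ids), f.failure_mode))


if __name__ == "__main__":
    # Illustrative, made-up labels; a real analysis would derive these from the runs.
    sample = [
        ("trace-001", "tool call retried until timeout"),
        ("trace-002", "wrong tool selected"),
        ("trace-003", "tool call retried until timeout"),
    ]
    for rank, finding in enumerate(rank_findings(sample), start=1):
        print(rank, finding.failure_mode, list(finding.trace_ids))
```

    The point of the shape is that every entry carries its own evidence, so a reader can jump from a finding straight to the traces behind it.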

    Alongside Halo's analysis, Catalyst gives you the everyday tools for living in your traces: high-level metrics across your data, chat with your traces, deep search that returns every run where a keyword appeared in the conversation, and a view of all the sessions and runs for a given user. Signals go a step further. You can tag a whole trace with the things that matter to your product, like a frustrated user, NSFW content, or an unusually positive session, so you understand how people and your agents actually interact, not just whether a call returned 200.

    Signals feed alerts, so you're notified when one shifts meaningfully or when a Halo run lands new recommendations. And because the traces are yours, you can pull them over MCP and let your own coding agent reason about production runs locally. Langfuse captures and displays all of this well; turning it into a ranked list of fixes is left to you.
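
    As a rough illustration of alerting when a signal "shifts meaningfully", the sketch below compares how often a trace-level tag appears in two windows of traces. Everything in it is an assumption for illustration: the Trace record, the signal names, and the fixed 10-point threshold are hypothetical and do not describe how Catalyst computes its alerts.

```python
from dataclasses import dataclass, field


@dataclass
class Trace:
    """Hypothetical trace record: an ID plus whole-trace signal tags."""
    trace_id: str
    signals: set[str] = field(default_factory=set)


def signal_rate(traces: list[Trace], signal: str) -> float:
    """Fraction of traces tagged with `signal` (0.0 for an empty window)."""
    if not traces:
        return 0.0
    return sum(signal in t.signals for t in traces) / len(traces)


def shifted(baseline: list[Trace], current: list[Trace], signal: str,
            min_delta: float = 0.10) -> bool:
    """True when the signal's rate moved by at least `min_delta` between windows."""
    delta = signal_rate(current, signal) - signal_rate(baseline, signal)
    return abs(delta) >= min_delta


if __name__ == "__main__":
    # Last week: 1 of 10 sessions tagged frustrated. This week: 4 of 10.
    last_week = [Trace(f"a{i}", {"frustrated_user"} if i < 1 else set()) for i in range(10)]
    this_week = [Trace(f"b{i}", {"frustrated_user"} if i < 4 else set()) for i in range(10)]
    if shifted(last_week, this_week, "frustrated_user"):
        print("alert: frustrated_user rate shifted")
```

    A production alert would likely want a statistical test or a minimum sample size rather than a fixed threshold; the fixed delta just keeps the idea readable.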

    Beyond tracing: the full improvement stack

    Tracing is one node in a loop, and traces only pay off if you act on them. This is where Catalyst's scope is widest. It takes your traces all the way to a better model, on one platform:

    • Tracing. Capture spans from any OTel-instrumented agent.
    • Datasets. Turn traces, or filtered subsets, into datasets in one click.
    • Evals. Run those datasets against new model or prompt versions, with LLM-as-judge or code-based scoring, scheduled or ad-hoc.
    • Fine-tune. Train custom models on your own data, including first-party Cliptagger and Schematron variants.
    • Deploy. Serve fine-tuned models behind the same API key.

    The result is improvement on both axes most platforms treat separately: the agent harness (prompts, tools, and retries, informed by Halo) and the model itself (fine-tuned on your own data). Langfuse focuses on the observability and prompt-management layer, and it doesn't train or host custom models.
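
    The datasets and evals steps of the loop can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Catalyst's API: TraceRecord, build_dataset, evaluate, and exact_match are hypothetical names, the "candidate" stands in for a new model or prompt version, and exact match stands in for whatever code-based scorer you would actually use.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class TraceRecord:
    """Hypothetical exported trace: user input, the agent's final output, and tags."""
    trace_id: str
    user_input: str
    final_output: str
    tags: frozenset[str] = frozenset()


def build_dataset(traces: list[TraceRecord], required_tag: str) -> list[tuple[str, str]]:
    """Turn a filtered subset of traces into (input, reference output) pairs."""
    return [(t.user_input, t.final_output) for t in traces if required_tag in t.tags]


def exact_match(output: str, reference: str) -> float:
    """Code-based scorer: 1.0 when the output equals the reference, else 0.0."""
    return 1.0 if output.strip() == reference.strip() else 0.0


def evaluate(dataset: list[tuple[str, str]],
             candidate: Callable[[str], str],
             scorer: Callable[[str, str], float] = exact_match) -> float:
    """Mean score of a candidate (new model or prompt version) over the dataset."""
    if not dataset:
        return 0.0
    scores = [scorer(candidate(user_input), reference) for user_input, reference in dataset]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    traces = [
        TraceRecord("t1", "2+2", "4", frozenset({"approved"})),
        TraceRecord("t2", "capital of France", "Paris", frozenset({"approved"})),
        TraceRecord("t3", "bad run", "???", frozenset()),
    ]
    dataset = build_dataset(traces, "approved")
    # Stand-in candidate that only knows one answer.
    print(evaluate(dataset, lambda q: "4" if q == "2+2" else "unknown"))
```

    An LLM-as-judge scorer would slot into the same scorer parameter, which is why the scorer is passed in rather than hard-coded.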

    Side by side

    Both tools support OpenTelemetry, render trace trees, and capture inputs and outputs. The comparison below focuses on where the two actually differ.

    Tracing

    • Built for agent traces. Catalyst was built around agent runs from the start; Langfuse began as LLM-call tracing and added agents later.

    Pricing & limits

    • Free tier. Catalyst: 1M spans / mo. Langfuse: 50k units / mo.
    • Counting model. What you count drives what you pay. Catalyst: spans only. Langfuse: traces + spans + events + scores + generations.
    • Per-seat pricing. Invite your whole team for free on Catalyst.

    Analysis

    • Built-in trace analysis. Catalyst: Halo returns ranked findings with cited trace IDs.
    • Chat with your traces. Catalyst: included.
    • Trace-level signals. Catalyst: tag whole traces (sentiment, NSFW, frustration) to study interactions. Langfuse: scores and annotation queues.
    • Alerts on signal shifts & analysis runs. Catalyst: included. Langfuse: spend and usage alerts today; alerting on metrics and evals is on its roadmap.
    • Scheduled analysis runs. Catalyst: included in the hosted version.
    • Open-source analysis engine. Catalyst: MIT (Halo).

    Beyond observability

    • Fine-tune your own models. Catalyst: included. Langfuse: doesn't train custom models.
    • Deploy fine-tuned models. Catalyst: serve custom models behind the same API key. Langfuse: doesn't host custom models.
    • Prompt CMS / playground. Langfuse: first-class prompt management. Catalyst: not yet.

    Hosting

    • Full self-host (platform). Langfuse: self-host the full platform free under MIT; only enterprise governance (SSO, RBAC, audit logs, SCIM) needs a commercial license. Catalyst: the platform is cloud-hosted; Halo is open source and can run locally.

    Pricing

    Catalyst is cheaper, and the gap grows as you scale

    • Catalyst Free: 1,000,000 spans / month, free
    • Langfuse Hobby: 50,000 units / month, free

    That's 20× the free headroom before you pay anything, and the unit you bill on widens the gap further. A Langfuse "unit" is any trace, observation (span, event, or generation), or score, so a single agent run with 20 steps can bill as dozens of units; paid usage starts at $8 per 100k. Catalyst counts spans only, and your whole team is free with no per-seat charge.
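
    The counting difference is easy to work through with a little arithmetic. The sketch below uses the unit definition quoted above and the two free-tier allowances; the per-run shape (20 observations and 2 scores per run, and one Catalyst span per step) is an assumption chosen for illustration, so substitute your own numbers.

```python
FREE_CATALYST_SPANS = 1_000_000  # Catalyst Free, spans per month
FREE_LANGFUSE_UNITS = 50_000     # Langfuse Hobby, units per month


def langfuse_units_per_run(observations: int, scores: int) -> int:
    """Units billed for one run: the trace itself, plus each observation and score."""
    return 1 + observations + scores


def catalyst_spans_per_run(spans: int) -> int:
    """Catalyst counts spans only."""
    return spans


def free_runs(allowance: int, billed_per_run: int) -> int:
    """Whole runs that fit inside a monthly free allowance."""
    return allowance // billed_per_run


if __name__ == "__main__":
    # Assumed run shape: 20 steps, each one span/observation, plus 2 scores per run.
    units = langfuse_units_per_run(observations=20, scores=2)
    spans = catalyst_spans_per_run(spans=20)
    print("Langfuse units per run:", units)
    print("Catalyst spans per run:", spans)
    print("Free runs on Langfuse Hobby:", free_runs(FREE_LANGFUSE_UNITS, units))
    print("Free runs on Catalyst Free:", free_runs(FREE_CATALYST_SPANS, spans))
```

    Under these assumptions the free tiers cover roughly 2,173 runs a month on Langfuse Hobby and 50,000 on Catalyst Free. The ratio changes with how many scores and events you record per run.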

    Based on publicly available Langfuse documentation and pricing as of June 2026. Spot something out of date? Tell us.

    When Catalyst is the right fit

    Catalyst is built for teams that want more than a place to store traces: analysis that tells you what to fix, and a single platform that carries traces all the way to a fine-tuned, deployed model.

    Choose Catalyst when

    • You're tracing agents, not just single LLM calls
    • You want ranked findings, not just a dashboard to dig through
    • You want signals on interaction quality, like sentiment, frustration, or NSFW content
    • You'll eval, build datasets, and fine-tune off your own traces
    • You want one platform from trace to deployed model
    • A generous free tier (1M spans/mo) matters
    • You want a hands-on team that ships daily and helps you integrate

    Already using Langfuse?

    If your app already uses the Langfuse SDK, moving to Catalyst is a drop-in env-var switch. Point LANGFUSE_HOST at Catalyst and set your Catalyst API key, and your existing instrumentation sends traces to Catalyst with no code changes. The Langfuse integration guide has the exact variables. Once you've confirmed Catalyst fits, you can move from the Langfuse SDK to our native instrumentation for fuller control over how spans, metadata, and agent identity are captured.
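
    Before flipping the switch, a small preflight check can catch a half-finished environment. The Python sketch below is illustrative only: LANGFUSE_HOST is the variable named above, but which variable carries the Catalyst API key is an assumption here (LANGFUSE_SECRET_KEY is used as a placeholder default), and the example host is a dummy value. The Langfuse integration guide remains the source for the exact variables and endpoint.

```python
import os
from typing import Mapping
from urllib.parse import urlparse

HOST_VAR = "LANGFUSE_HOST"
# Assumption: the Catalyst API key is supplied through the SDK's secret-key variable.
ASSUMED_KEY_VAR = "LANGFUSE_SECRET_KEY"


def preflight(env: Mapping[str, str], key_var: str = ASSUMED_KEY_VAR) -> list[str]:
    """Return the problems that would stop traces from reaching the new host."""
    problems: list[str] = []
    host = env.get(HOST_VAR, "")
    parsed = urlparse(host)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        problems.append(f"{HOST_VAR} must be a full URL, got {host!r}")
    elif parsed.hostname.endswith("langfuse.com"):
        problems.append(f"{HOST_VAR} still points at Langfuse Cloud: {host}")
    if not env.get(key_var):
        problems.append(f"{key_var} is empty; set your Catalyst API key")
    return problems


if __name__ == "__main__":
    # Checks the current process environment and prints anything that looks off.
    for problem in preflight(os.environ):
        print("preflight:", problem)
```

    Running a check like this in CI or at process start keeps the switch reversible: if it reports nothing, the existing instrumentation is the only thing left in the path.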

    Starting from scratch instead? The inf instrument command sets up tracing for you: run it in your project and it hands a live instrumentation skill to your coding agent (Claude Code, OpenCode, or Codex), which reads your code and proposes the diff for your approval. Prefer to wire it up by hand? The capture your first trace guide walks through it end to end.

    ~/your-agent-repo
    $ npm install -g @inference/cli
    $ inf auth login
    → Opening browser for authentication…
    → Authenticated. Active project: my-agent.
    $ inf instrument
    → Detected: TypeScript project (apps/api), OpenAI SDK 5.2.1
    → Mode: tracing
    → Hand off to: [1] Claude Code [2] OpenCode [3] Codex

    Send your first trace to Catalyst in 5 minutes.

    Free account, 1M spans/month, run Halo on real production traces. No credit card. Already on Langfuse? Switch with a quick env-var change.
