On the Economics of Hosting Open Source Models


    Amar Singh

    Published on Jul 29, 2025

    Alibaba’s research group recently released Wan 2.2, the successor to its famous Wan series. As of July 2025, it’s one of the best video generation models available, entering a crowded field alongside giants like Hailuo Minimax 2.0, Seedance 1.0 Pro, and Kling 2.1 Master. While it’s a step below the top-tier Veo 3, that model has its own issues with exorbitant pricing and a lack of open image-to-video generation.

    Prominent companies like Fal and WaveSpeedAI have already started hosting Wan 2.2, with pricing around $0.08 per second of video. This translates to $0.40 for a 5-second clip or $4.80 per minute. Here’s how that stacks up against its closed-source competitors:

    Veo 3: $45/min

    Kling 2.1 Master: $16.80/min

    Seedance 1.0 Pro: $7.44/min

    Wan 2.2: $4.80/min

    Hailuo Minimax 2.0: $2.70/min
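
    The conversion from per-second to per-clip and per-minute pricing, and the gap between Wan 2.2 and the other models, can be checked with a few lines of Python. This is only a sketch using the list prices quoted above; the function and variable names are made up for illustration.

```python
# Per-minute list prices quoted in this post (USD, as of July 2025).
PRICE_PER_MINUTE = {
    "Veo 3": 45.00,
    "Kling 2.1 Master": 16.80,
    "Seedance 1.0 Pro": 7.44,
    "Wan 2.2": 4.80,
    "Hailuo Minimax 2.0": 2.70,
}


def clip_price(price_per_second: float, seconds: float = 5.0) -> float:
    """Price of a single clip of the given length."""
    return price_per_second * seconds


def per_minute(price_per_second: float) -> float:
    """Convert a per-second price into a per-minute price."""
    return price_per_second * 60.0


def relative_to(baseline: str) -> dict:
    """Each model's per-minute price as a multiple of the baseline model's."""
    base = PRICE_PER_MINUTE[baseline]
    return {name: price / base for name, price in PRICE_PER_MINUTE.items()}


if __name__ == "__main__":
    print(f"5-second clip: ${clip_price(0.08):.2f}")
    print(f"Per minute:    ${per_minute(0.08):.2f}")
    for name, ratio in relative_to("Wan 2.2").items():
        print(f"{name:<20} {ratio:.2f}x the price of Wan 2.2")
```

    On these numbers, Veo 3 costs a bit over nine times as much as Wan 2.2 per minute, while Minimax 2.0 costs a little more than half.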

    My testing shows Wan 2.2 is a great model, but not the best. And on cost-effectiveness, Minimax 2.0 blows it out of the water with better quality at a lower price. This raises a crucial question: if a new model isn’t the best or the cheapest, why bother hosting it?

    The answer lies in one huge advantage: Wan 2.2 is open source. This single factor can push it into the spotlight and create interesting business opportunities for the companies that host it.

    Releasing a high-caliber open-source video model fosters an ecosystem that drives incremental improvements. We’ve seen this before: from Stable Video Diffusion to Hunyuan and now the Wan series, each open model has unlocked new research and capabilities for the community. There’s also a sizable group of people interested in running these models locally (shoutout to r/StableDiffusion).

    The release of Wan 2.1, for example, triggered an explosion of LoRAs, adapters, and modifications like VACE that upgraded how the community used the model. Techniques were even developed for running it with low RAM or training LoRAs in resource-constrained environments, allowing users to add their own unique twists.

    This brings us to the economics of hosting these models. With an Apache 2.0 license, anyone can serve Wan 2.2 commercially. But looking at the prices, how does that work? Is it just a race to the bottom?

    Let’s look at Fal, which lists the price at $0.40 for a 5-second clip. On 8 H100 GPUs, a 30-step generation (Fal’s default) takes about 3 minutes. An H100 GPU runs about $1.00/hr on a long-term lease, so 8 GPUs for 3 minutes (0.05 hours) cost 8 × $1.00 × 0.05 = $0.40. In other words, a single 5-second generation costs Fal roughly what it charges for it. At best, they break even. If a user increases the step count, generation takes longer and Fal loses money, because the price per clip stays fixed.
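
    Here is that back-of-the-envelope cost as a small Python sketch. The GPU count, hourly rate, and 3-minute figure are the estimates from this post, not numbers published by Fal, and the linear scaling of generation time with step count is my own simplifying assumption.

```python
PRICE_PER_CLIP = 0.40  # Fal's listed price for a 5-second clip (USD)


def generation_cost(num_gpus: int, gpu_hourly_rate: float, minutes: float) -> float:
    """Compute cost of one generation: GPUs x hourly rate x hours used."""
    return num_gpus * gpu_hourly_rate * (minutes / 60.0)


def cost_for_steps(
    steps: int,
    base_steps: int = 30,
    base_minutes: float = 3.0,
    num_gpus: int = 8,
    gpu_hourly_rate: float = 1.00,
) -> float:
    """Cost at a given step count.

    ASSUMPTION: generation time grows linearly with the number of steps,
    starting from the estimated 3 minutes at the 30-step default.
    """
    minutes = base_minutes * steps / base_steps
    return generation_cost(num_gpus, gpu_hourly_rate, minutes)


if __name__ == "__main__":
    for steps in (30, 45, 60):
        cost = cost_for_steps(steps)
        margin = PRICE_PER_CLIP - cost
        print(f"{steps} steps: cost ${cost:.2f}, margin ${margin:+.2f}")
```

    Under these assumptions the margin is about zero at 30 steps and turns negative as soon as the step count goes up.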

    Of course, providers have internal optimizations (custom kernels, specialized schedulers, or tricks like TeaCache) that can speed up inference. Assuming a conservative 25% reduction in generation time, the cost drops to $0.30 per 5-second clip. That means they earn a dime on a default generation. Pretty neat, right?

    Not really. Earning $100 per 1,000 inferences isn’t a path to riches. Video generation is niche, and while traffic is high at launch, it tapers off as the next model steals the spotlight. Even if Fal generates 10,000 videos a day for three months, their profit would be around $90,000—hardly enough to justify their $49M Series B.
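
    The projection is easy to reproduce. The sketch below reads the 25% speedup as a 25% cut in generation time and treats three months as 90 days; both are assumptions on my part, and the volume of 10,000 videos a day is a hypothetical, not a reported figure.

```python
PRICE_PER_CLIP = 0.40      # listed price for a 5-second clip (USD)
BASE_COST_PER_CLIP = 0.40  # estimated unoptimized compute cost (USD)


def optimized_cost(base_cost: float, time_reduction: float) -> float:
    """Cost after cutting generation time by the given fraction (0.25 = 25%)."""
    return base_cost * (1.0 - time_reduction)


def projected_profit(price: float, cost: float, units_per_day: int, days: int) -> float:
    """Total profit at a constant daily volume over a number of days."""
    return (price - cost) * units_per_day * days


if __name__ == "__main__":
    cost = optimized_cost(BASE_COST_PER_CLIP, 0.25)
    per_thousand = projected_profit(PRICE_PER_CLIP, cost, 1_000, 1)
    # ASSUMPTION: "three months" is approximated as 90 days.
    quarter = projected_profit(PRICE_PER_CLIP, cost, 10_000, 90)
    print(f"Cost per clip after optimization: ${cost:.2f}")
    print(f"Profit per 1,000 inferences:      ${per_thousand:,.0f}")
    print(f"Profit over 90 days:              ${quarter:,.0f}")
```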

    What about offering LoRA fine-tuning? This is another benefit of open source. Fal offers Hunyuan LoRA training for $5. If we estimate their infrastructure cost at around $3.50 per LoRA, the margin is $1.50. Even at 1,000 LoRAs a day, the profit over three months is about $135,000, which is in a similar ballpark. It’s not the main business.
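
    To put both consumer-facing lines of business in perspective, this sketch adds the LoRA estimate to the video estimate and compares the total with the $49M Series B. All inputs are this post's estimates or hypotheticals (the $3.50 cost per LoRA, the daily volumes, the 90-day quarter), so treat the output as an order-of-magnitude check rather than real financials.

```python
SERIES_B_USD = 49_000_000  # Fal's Series B, as cited in this post


def three_month_profit(price: float, cost: float, units_per_day: int, days: int = 90) -> float:
    """Profit over roughly three months (ASSUMPTION: 90 days, constant volume)."""
    return (price - cost) * units_per_day * days


# Hypothetical volumes and estimated costs from the text above.
video_profit = three_month_profit(price=0.40, cost=0.30, units_per_day=10_000)
lora_profit = three_month_profit(price=5.00, cost=3.50, units_per_day=1_000)
combined = video_profit + lora_profit
share_of_round = combined / SERIES_B_USD

if __name__ == "__main__":
    print(f"Video generation: ${video_profit:,.0f}")
    print(f"LoRA training:    ${lora_profit:,.0f}")
    print(f"Combined:         ${combined:,.0f} ({share_of_round:.2%} of the Series B)")
```

    Even combined, the two come out to well under 1% of the round.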

    So, how do these companies actually make money? The truth is, most of the revenue doesn't come from hosting these models for consumers. The real gold mine is in enterprise deals and dedicated instances.

    Rumor has it that Together AI is on its way to passing $500M in revenue, with the majority coming from providing dedicated instances or bare metal servers to enterprise customers. These customers need secure, private infrastructure for compliance and liability reasons, and they’re willing to pay a premium for it. Fal and Replicate offer similar enterprise packages with SLAs and support. It’s not a stretch to imagine this is where they make their real money.

    This reframes the entire exercise. Hosting the latest open-source models for consumers isn't about direct profit. It’s a top-of-funnel strategy. It serves as a public demonstration of technical prowess and reliability, building a reputation that opens the door to high-margin enterprise contracts. Given that lucrative opportunity, it makes perfect sense to go out of your way to be the leading provider for these open-source models, even if the consumer-facing margins are close to zero.

    So the next time you think about how you could beat one of these companies at their own game, remember that the real playing field might be somewhere else altogether.
