
    How Cal AI made inference 3x faster while improving reliability


    Snap a photo, scan a barcode, or describe your meal and get instant calorie and nutrient info.

    Outcomes

    3x faster inference
    Millions of daily users supported
    99.99% uptime

    Company overview

    Cal AI lets users track calories by simply taking a picture of their food. The app processes millions of food images daily through an AI pipeline that identifies dishes and extracts calories, protein, fats, and carbs. As the fastest-growing nutrition app in the market, Cal AI has become the go-to solution for users who find traditional calorie tracking tedious and difficult.

    Challenges

    Reliability issues disrupting millions of users

    As Cal AI scaled to millions of users, their reliance on frontier models from OpenAI and Anthropic created critical operational challenges that threatened their growth.

    "As we started to scale up to millions of users, we started running into issues where the models would go down for extended periods of time. We had super high costs and latency going up month after month, so we knew that there had to be a better solution."
    — Henry, CTO of Cal AI

    Extended model outages meant millions of users couldn't log their meals, directly impacting retention and growth metrics. The unpredictability of third-party providers made it impossible to guarantee the consistent experience their users expected.

    Escalating costs threatening unit economics

    Cal AI's explosive user growth meant their API costs ballooned proportionally. The frontier models they relied on were built for general-purpose tasks, not the specific visual recognition and structured data extraction Cal AI needed. They were too slow for real-time nutritional analysis, and their general-purpose pricing meant Cal AI paid premium rates for capabilities they never used.

    Beyond cost, Cal AI faced a critical reliability challenge. Using frontier serverless APIs meant sharing infrastructure with every other customer. When providers experienced outages or degraded performance, millions of Cal AI users couldn't log their meals. With their user base depending on consistent daily tracking, they couldn't afford to gamble on shared infrastructure reliability.

    Solution

    Rapid custom model deployment

    Inference.net partnered with Cal AI to train a custom model specifically optimized for nutritional extraction from food images. Using Cal AI's proprietary food dataset, the team trained and deployed a specialized model in just one week—a timeline that surprised even Cal AI's engineering team.

    "It kind of felt like they were part of our engineering team in a way. They would just write really quick scripts to process the data and get it up and running, and they even made some online dashboards for us to see how their model compared to ours. We could see statistics about cost and latency."
    — Henry, CTO of Cal AI

    Forward-deployed engineering approach

    Rather than maintaining a traditional vendor relationship, Inference.net embedded directly with Cal AI's team. We wrote custom data processing scripts, built evaluation dashboards, and enabled Cal AI to monitor model performance against their existing solution throughout the entire process.
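The case study doesn't publish the dashboard internals, but the kind of side-by-side comparison it describes can be sketched in a few lines. The function and the sample figures below are hypothetical, purely to illustrate comparing two models on latency percentiles and unit cost:

```python
import statistics

def summarize(name, latencies_ms, cost_per_request):
    """Summarize one model's latency distribution and unit cost.

    `latencies_ms` and `cost_per_request` are hypothetical inputs;
    a real dashboard would pull these from request logs.
    """
    latencies_ms = sorted(latencies_ms)
    p50 = statistics.median(latencies_ms)
    # Index of the 99th-percentile sample in the sorted list.
    p99 = latencies_ms[min(len(latencies_ms) - 1, int(0.99 * len(latencies_ms)))]
    return {"model": name, "p50_ms": p50, "p99_ms": p99, "cost": cost_per_request}

# Hypothetical samples: frontier API vs. custom model.
frontier = summarize("frontier-api", [900, 1100, 1400, 2500, 950, 1200], 0.0100)
custom = summarize("custom-model", [300, 350, 420, 600, 310, 380], 0.0020)

for row in (frontier, custom):
    print(f"{row['model']}: p50={row['p50_ms']}ms  p99={row['p99_ms']}ms  "
          f"cost=${row['cost']:.4f}/req")
```

Tracking tail latency (p99) alongside the median matters here: a custom model can win on average yet still regress on the slowest requests users actually notice.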

    This collaborative approach ensured the custom model was perfectly tailored to Cal AI's specific workflow, eliminating unnecessary capabilities while maximizing performance for food recognition tasks.

    Impact

    3x faster response times transform user experience

    The custom model delivered a dramatic reduction in latency, providing near-instant nutritional data extraction that fundamentally improved the user experience. This speed improvement directly translated to better engagement metrics.

    "After we launched it, we realized that it was performing exactly the same. Our metrics were doing better than ever, and most importantly, our cost and latency went down."
    — Henry, CTO of Cal AI

    Achieved 99.99% uptime with full model ownership

    Cal AI now owns their core AI infrastructure, eliminating dependency on third-party providers. The custom model has maintained 99.99% uptime, exceeding the reliability of their previous providers while giving them complete control over their AI stack.

    A/B tests confirmed the custom model matched frontier model quality while actually improving key metrics like user retention.
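Cal AI's exact A/B methodology isn't described, but a standard way to check that a metric like retention holds up under a new model is a two-proportion z-test. This is a minimal sketch with hypothetical arm sizes and retained-user counts:

```python
import math

def retention_z_test(retained_a, n_a, retained_b, n_b):
    """Two-proportion z-test: does retention under model B differ from model A?

    retained_* = users retained in each arm, n_* = users assigned to each arm.
    Hypothetical helper, not Cal AI's actual analysis.
    """
    p_a, p_b = retained_a / n_a, retained_b / n_b
    p_pool = (retained_a + retained_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    return p_a, p_b, z

# Hypothetical experiment: 10,000 users per arm.
p_a, p_b, z = retention_z_test(4100, 10000, 4250, 10000)
print(f"frontier retention={p_a:.1%}, custom retention={p_b:.1%}, z={z:.2f}")
```

A |z| above roughly 1.96 would indicate a statistically significant difference at the 5% level; values near zero would support the "matched quality" conclusion.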

    "So now I'm able to sleep better at night. I would suggest to any startup who's growing as fast as we were to consider working with Inference if you have extremely specific workloads that you need."
    — Henry, CTO of Cal AI

    Sustainable economics at scale

    With dramatically reduced costs and improved performance, Cal AI transformed their unit economics while scaling to millions of users. They now have sustainable, owned AI infrastructure that scales with their business rather than against it.

    What's Next

    Beyond the immediate benefits of lower cost, lower latency, and higher reliability, Cal AI now has control over their inference stack. They have a partner who can continuously train and improve their models using their own data, optimizing for the specific metrics that matter to their business. This ownership gives them the flexibility to evolve their AI capabilities as they learn what drives user value, rather than being constrained by the one-size-fits-all limitations of frontier models.

    Looking forward, we're excited to work with Cal AI to make their models even faster, cheaper, and more accurate.

    Own your model. Scale with confidence.

    Schedule a call with our research team to learn more about custom training. We'll propose a plan that beats your current SLA and unit cost.