

Runloop.ai and Fermatix.ai Partner to Introduce Custom Benchmarks for AI Agents
Thursday, October 9, 2025

SAN FRANCISCO, Oct. 1, 2025 /PRNewswire/ -- Runloop.ai, the leading enterprise infrastructure platform for AI agents, today announced the launch of its Custom Benchmarks product. The new offering enables organizations to create highly specialized, private benchmarks that accurately measure and refine AI agents on their unique, proprietary codebases and business logic. To highlight the product's broad applications and strategic value, Runloop.ai is collaborating with Fermatix.ai, a specialist in full-cycle data generation, on a landmark pilot program.

The explosion of AI agents has created a critical need for evaluation and functional training that is both rigorous and relevant. While public benchmarks are crucial for general model evaluation, they often fail to capture the specific requirements of AI agents or the validation needs of enterprises. Runloop.ai's Custom Benchmarks address this gap by providing a secure, scalable platform for companies to build benchmarks that test against their own internal business logic, tech stacks, and performance metrics.

Key features of Runloop.ai's Custom Benchmarks product include:

    --  Private benchmarking: Securely test AI agents on proprietary code
        without exposing intellectual property.
    --  Accurate performance evaluation: Measure agent effectiveness in
        real-world, business-specific conditions.
    --  Scalable infrastructure: A reliable and isolated environment for running
        thousands of tests simultaneously.
    --  Strategic model refinement: Obtain data for targeted improvement and
        retraining of AI agents for specific tasks.

"As AI agents move from prototypes to production, the benchmarks we use to evaluate them must evolve from generic tests to strategic assets," said Jonathan Wall, CEO of Runloop.ai. "Our new Custom Benchmarks product empowers enterprises to define what 'good' looks like for their unique business, enabling them to fine-tune and trust their AI agents in real-world scenarios. The pilot with Fermatix.ai is the perfect example of this in action, demonstrating the value of this approach in the most demanding environments."

Fermatix.ai creates expert-level training data for industry-critical tasks and highly specialized domains, using annotators who are practicing industry experts, which makes it well suited to this pilot. By using Runloop.ai's infrastructure, Fermatix.ai is expanding its capabilities to offer custom, in-house verification for its clients. The collaboration allows Fermatix.ai to move beyond its current offerings and give clients added assurance through benchmarks tailored to specific enterprise needs. The pilot program will demonstrate how Fermatix.ai's expertise in data engineering and expert-level annotation can be applied to create high-fidelity, multilingual benchmarks on Runloop.ai's platform.

"At Fermatix.ai, we've built our reputation on creating expert-level training data with practicing industry professionals as annotators," said Sergey Anchutin, CEO and Founder of Fermatix.ai. "This partnership with Runloop.ai represents a strategic evolution--moving beyond one-time data labeling to creating reusable benchmarks that deliver ongoing value to our clients. By leveraging our domain expertise and Runloop's infrastructure, we're not just providing data anymore; we're building the testing standards that will define how enterprises evaluate their AI agents across industry-critical tasks."

The Custom Benchmarks product is now available to all Runloop.ai Pro clients, with early results from the Fermatix.ai pilot program expected to be shared in the coming months.

About Runloop.ai

Runloop provides infrastructure and tooling for building, testing, refining, and deploying AI agents at scale. Founded by engineers with deep experience in building large-scale systems, Runloop offers secure, isolated environments, rich developer tooling, and a suite of benchmarking capabilities that help companies deploy and manage AI agents with confidence.

Media contact:
Michelle Faulkner
Big Swing
617-510-6998
michelle@big-swing.com

https://www.linkedin.com/company/runloopai https://x.com/runloopdev https://github.com/runloopai

View original content to download multimedia: https://www.prnewswire.com/news-releases/runloopai-and-fermatixai-partner-to-introduce-custom-benchmarks-for-ai-agents-302572197.html

SOURCE Runloop.ai


