

New SEI Tool Enhances Machine Learning Model Test and Evaluation
Friday, October 25, 2024

PITTSBURGH, Oct. 17, 2024 /PRNewswire/ -- Software systems with a machine learning (ML) component often fail in production. One reason is that ML models are frequently developed in isolation, making it impossible to test and evaluate against system and operational requirements and constraints. The Software Engineering Institute (SEI) at Carnegie Mellon University (CMU) today announced its release of a new tool to help teams developing ML-enabled software systems mitigate this problem. Machine Learning Test and Evaluation (MLTE), available for download from GitHub, is a semi-automated process and infrastructure for testing ML models based on stakeholder-generated quality attribute requirements.

ML model developers often work in silos. They lack knowledge of the overarching system or its operational environment. Without this context, developers can only evaluate a model on its accuracy, or the predictability of its output. Once the model is delivered, software engineers and quality assurance teams often have no specifications or knowledge to guide its testing. None of the groups can evaluate how well the model will work in production.

"The bottom line is that many models fail in production because they are not tested properly," said Grace Lewis, a principal researcher at the SEI and lead of its Tactical and AI-Enabled Systems Initiative. "When ML-enabled systems fail operational tests because of problems with the model, it creates huge delays in system delivery, especially if new data needs to be collected to retrain the model."

To fill this gap in the development of ML-enabled software, Lewis and her team at the SEI collaborated with the U.S. Army Artificial Intelligence Integration Center (AI2C) and Christian Kästner, an associate professor in the CMU School of Computer Science.

They created MLTE, which applies best practices from traditional software development to ML model test and evaluation (T&E). The process brings together all the stakeholders of an ML-enabled software project, not just the ML developers, to negotiate the model's quality attribute requirements based on system needs. Those attributes become specifications for automated internal and system-dependent testing. Test results populate reports that developers and other stakeholders can use to decide if the model is ready for production. If it is not, the reports can inform further iteration and testing. Special libraries within the MLTE infrastructure automate parts of the process.
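The flow described above, from negotiated requirements to automated checks to a report that informs a go/no-go decision, can be sketched in a few lines of Python. This is only an illustrative sketch and does not use the MLTE library or its API: the names `Requirement`, `Result`, `evaluate`, `ready_for_production`, and `format_report`, along with every threshold and measurement below, are hypothetical and chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Requirement:
    """A negotiated quality attribute requirement (hypothetical structure)."""
    name: str
    description: str
    check: Callable[[float], bool]  # True when the measured value is acceptable


@dataclass
class Result:
    """Outcome of testing one requirement against a measured value."""
    requirement: str
    measured: float
    passed: bool


def evaluate(requirements: List[Requirement],
             measurements: Dict[str, float]) -> List[Result]:
    """Apply each requirement's check to the corresponding measurement."""
    results = []
    for req in requirements:
        value = measurements[req.name]
        results.append(Result(req.name, value, req.check(value)))
    return results


def ready_for_production(results: List[Result]) -> bool:
    """The model is considered ready only if every requirement passed."""
    return all(r.passed for r in results)


def format_report(results: List[Result]) -> str:
    """Render a plain-text report that stakeholders can review."""
    lines = [
        f"{'PASS' if r.passed else 'FAIL'}  {r.requirement}: {r.measured}"
        for r in results
    ]
    verdict = "yes" if ready_for_production(results) else "no"
    lines.append(f"Ready for production: {verdict}")
    return "\n".join(lines)


# Illustrative requirements; the thresholds are assumptions, not from MLTE.
REQUIREMENTS = [
    Requirement("accuracy", "at least 0.90 on held-out data", lambda v: v >= 0.90),
    Requirement("latency_ms", "at most 100 ms per inference", lambda v: v <= 100),
    Requirement("model_size_mb", "at most 50 MB on the target device", lambda v: v <= 50),
]

# Made-up measurements standing in for real test output.
MEASUREMENTS = {"accuracy": 0.93, "latency_ms": 140.0, "model_size_mb": 42.0}

if __name__ == "__main__":
    print(format_report(evaluate(REQUIREMENTS, MEASUREMENTS)))
```

In this sketch the model meets the accuracy and size requirements but misses the latency requirement, so the report marks it as not ready, which is the kind of signal that would prompt another round of iteration and testing.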

"MLTE provides system and operational context for ML model developers to make informed decisions about design and development," said Lewis. "Other stakeholders can better understand whether the requirements for models are realistic so that problems can be detected and fixed early in the process, not discovered in operational tests or production."

MLTE is a system-centric, quality-attribute-driven, semi-automated process and infrastructure to enable negotiation, specification, and testing of ML model and system qualities. It incorporates TEC, an earlier SEI tool that detects mismatched expectations among the teams building an ML component. Both TEC and MLTE are part of an SEI effort to establish integrated T&E of ML capabilities throughout the Department of Defense.

To download MLTE, visit the project's GitHub site. Read more about the tool's background in the papers "Using Quality Attribute Scenarios for ML Model Test Case Generation" and "MLTEing Models: Negotiating, Evaluating, and Documenting Model and System Qualities."

About the Carnegie Mellon University Software Engineering Institute
Always focused on the future, the Software Engineering Institute (SEI) advances software as a strategic advantage for national security. We lead research and direct transition of software engineering, cybersecurity, and artificial intelligence technologies at the intersection of academia, industry, and government. We serve the nation as a federally funded research and development center (FFRDC) sponsored by the U.S. Department of Defense (DoD) and are based at Carnegie Mellon University, a global research university annually rated among the best for its programs in computer science and engineering. For more information, visit the SEI website at https://www.sei.cmu.edu.

View original content to download multimedia: https://www.prnewswire.com/news-releases/new-sei-tool-enhances-machine-learning-model-test-and-evaluation-302279126.html

SOURCE Carnegie Mellon Software Engineering Institute


