Machine Learning Test and Evaluation (MLTE)


MLTE (pronounced "melt") is a process and toolset for machine learning (ML) test and evaluation. MLTE enables teams to more effectively negotiate, document, and evaluate model qualities.

MLTE Process

[Figure: diagram of the MLTE process]

Continuous Negotiation

To begin, model developers and project stakeholders meet to determine mission, business, and system-derived requirements that will influence model development, such as the deployment environment, available data, and model requirements. Throughout the process, teams continue to have meetings to update their assumptions and requirements.

MLTE Negotiation Card

As part of the negotiation, teams fill out a MLTE Negotiation Card, which records their agreements and drives model development and testing.
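To make the idea concrete, the kinds of agreements a Negotiation Card captures can be sketched as a simple data structure. This is purely illustrative: the field names below are assumptions for the sketch, not MLTE's actual Negotiation Card schema.

```python
from dataclasses import dataclass, field

# Illustrative sketch only -- field names are hypothetical,
# not MLTE's actual Negotiation Card schema.
@dataclass
class NegotiationCard:
    goals: list = field(default_factory=list)            # mission/business goals
    deployment_env: str = ""                             # target deployment environment
    data_sources: list = field(default_factory=list)     # available data
    model_requirements: dict = field(default_factory=dict)  # metric -> threshold

card = NegotiationCard(
    goals=["Flag fraudulent transactions before settlement"],
    deployment_env="on-prem CPU server, 200 ms latency budget",
    data_sources=["historical transactions 2020-2023"],
    model_requirements={"recall": 0.90, "latency_ms": 200.0},
)
print(card.model_requirements["recall"])  # 0.9
```

Recording thresholds alongside goals and deployment constraints is what lets later testing steps check the model against what was actually negotiated.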

Initial Model Testing (IMT)

Teams use information from the Negotiation Card during initial model development to inform model requirements and thresholds. Once initial development is complete, model teams do initial testing during this step to determine whether the model exceeds established baselines.
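An IMT-style baseline comparison can be sketched in a few lines. This is not MLTE's API; it is a generic illustration of the check being described, assuming metrics where higher is better.

```python
# Illustrative IMT-style check (not MLTE's API): compare a candidate
# model's metrics against baseline values and report which ones pass.
def exceeds_baseline(candidate: dict, baseline: dict) -> dict:
    """Return, per metric, whether the candidate beats the baseline."""
    return {metric: candidate[metric] > baseline[metric] for metric in baseline}

baseline = {"accuracy": 0.72, "f1": 0.65}   # e.g., a simple rule-based baseline
candidate = {"accuracy": 0.81, "f1": 0.70}

results = exceeds_baseline(candidate, baseline)
print(results)  # {'accuracy': True, 'f1': True}
```

If any metric fails to beat its baseline, the model returns to development rather than advancing to system-dependent testing.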

System Dependent Model Testing (SDMT)

Once a model passes its baseline requirements in IMT, teams can then focus on ensuring that it passes the larger set of model requirements. To do so, teams use system-derived requirements and quality attribute information from the Negotiation Card to develop a test specification, which contains code that will evaluate each requirement.
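A test specification of this kind can be sketched as a mapping from named requirements to executable checks over collected evidence. The names and structure below are assumptions made for illustration; MLTE's actual test specification objects differ.

```python
# Hedged sketch of a "test specification" (illustrative, not MLTE's API):
# each system-derived requirement maps to a check over measured evidence.
spec = {
    "accuracy meets negotiated threshold": lambda ev: ev["accuracy"] >= 0.80,
    "p95 latency within budget":           lambda ev: ev["p95_latency_ms"] <= 200,
    "model fits deployment device":        lambda ev: ev["model_size_mb"] <= 50,
}

# Evidence gathered by running measurements against the model.
evidence = {"accuracy": 0.83, "p95_latency_ms": 140, "model_size_mb": 42}

report = {name: check(evidence) for name, check in spec.items()}
print(all(report.values()))  # True
```

Keeping each requirement paired with the code that evaluates it is what makes the resulting evidence auditable: the report shows exactly which requirements passed and which triggered renegotiation.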

If results are satisfactory, the output is a production-ready model (meaning that it meets defined model requirements, including system-derived requirements), along with all testing evidence (code, data, and results).

If results are not satisfactory, more negotiation is required to determine if requirements are realistic, whether more experimentation is required, or whether results triggered additional requirements or tests.

Further Information

MLTE Metadata

  • Version: 1.0.2
  • Contact Email: mlte dot team dot info at gmail dot com
  • Citations: While not required, it is highly encouraged and greatly appreciated if you cite our paper when you use MLTE for academic research.
    @inproceedings{maffey2023,
        title={MLTEing Models: Negotiating, Evaluating, and Documenting Model and System Qualities},
        author={Maffey, Katherine R. and Dotterrer, Kyle and Niemann, Jennifer and Cruickshank, Iain and Lewis, Grace A. and K{\"a}stner, Christian},
        booktitle={2023 IEEE/ACM 45th International Conference on Software Engineering: New Ideas and Emerging Results (ICSE-NIER)},
        pages={31--36},
        year={2023},
        organization={IEEE}
    }

... or cite the following if you use, or are inspired by, quality attribute scenarios for ML model test case generation.

    @inproceedings{brower2024,
        title={Using Quality Attribute Scenarios for ML Model Test Case Generation},
        author={Brower-Sinning, Rachel and Lewis, Grace A. and Echeverría, Sebastián and Ozkaya, Ipek},
        booktitle={2024 IEEE 21st International Conference on Software Architecture Companion (ICSA-C)},
        pages={307--310},
        year={2024},
        organization={IEEE}
    }

Check out MLTE on GitHub!