This repository contains templates and examples for creating evaluation cards: structured documents for describing machine learning evaluation frameworks in materials science.
You can also upload evaluation cards to a HuggingFace Space.
Evaluation cards are structured documents that capture key aspects of ML evaluation frameworks through the lens of measurement theory. They promote transparency about design choices, limitations, and tradeoffs in evaluation frameworks. Inspired by model cards and data cards, evaluation cards help developers document and users understand:
- What is being measured (estimand)
- How it is measured (estimator)
- How results should be reported (estimate)
The template is organized around three core concepts:
- Estimand: What you want to measure
- Estimator: How you measure it
- Estimate: How results should be reported
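As an illustration, the three core concepts map naturally onto a simple data structure. The `EvaluationCard` class below is a hypothetical sketch for intuition only, not an API provided by this repository; the field values are invented examples.

```python
from dataclasses import dataclass, field

@dataclass
class EvaluationCard:
    """Hypothetical sketch of an evaluation card's core fields."""
    estimand: str    # what the evaluation aims to measure
    estimator: str   # how the measurement is performed
    estimate: str    # how results should be reported
    known_limitations: list[str] = field(default_factory=list)

# Invented example: a card for a materials-property prediction benchmark
card = EvaluationCard(
    estimand="generalization to unseen crystal structures",
    estimator="held-out test split grouped by structure family",
    estimate="mean absolute error with bootstrap confidence intervals",
    known_limitations=["test set drawn from a single database"],
)
print(card.estimand)
```

In practice a card would carry the full sections listed below (design, constructs, metrics, limitations, usage), but the same separation of concerns applies.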
Key sections include:
- Evaluation Design
- Target Constructs
- Assessment Components
- Metrics and Protocols
- Known Limitations
- Usage Guidelines
We welcome contributions! Please see our contribution guidelines for details on:
- Adding new examples
- Improving the template
- Reporting issues
To submit a change:
1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Submit a pull request with a clear description
If you use this template in your work, please cite:
[Citation to be added after publication]
Related work:
- Model Cards for Model Reporting (Mitchell et al.)
- Data Cards (Pushkarna et al.)
- Datasheets for Datasets (Gebru et al.)
This project is licensed under the MIT License - see the LICENSE file for details.