Evaluation Module#
This document explains how the evaluation module (`eval_module`) of the CTF for Science framework is structured and used. It covers metric definitions, evaluation routines, and batch results handling.
Overview#
The evaluation module provides tools to:
Compute metrics such as:
Short-time forecast accuracy
Reconstruction accuracy
Long-time behavior accuracy using:
Histogram comparisons (for ODEs)
Spectral analysis (for PDEs)
Extract metrics in a consistent order across sub-datasets
Save results to disk in a reproducible format
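To make the percentage-based scoring concrete, here is a minimal, illustrative sketch of two such metrics: a short-time forecast score based on relative L2 error, and a histogram-overlap score of the kind used for long-time ODE behavior. The exact formulas used by the framework may differ; these are stand-in definitions, not the framework's implementation.

```python
import numpy as np

def short_time_score(pred: np.ndarray, truth: np.ndarray) -> float:
    """Illustrative short-time forecast accuracy: 100% at a perfect match,
    decreasing with the relative L2 error (floored at 0)."""
    rel_err = np.linalg.norm(pred - truth) / np.linalg.norm(truth)
    return max(0.0, 1.0 - rel_err) * 100.0

def histogram_score(pred: np.ndarray, truth: np.ndarray, bins: int = 20) -> float:
    """Illustrative long-time behavior score for ODEs: overlap between the
    normalized histograms of the predicted and true trajectories."""
    lo = min(pred.min(), truth.min())
    hi = max(pred.max(), truth.max())
    h_pred, _ = np.histogram(pred, bins=bins, range=(lo, hi))
    h_true, _ = np.histogram(truth, bins=bins, range=(lo, hi))
    h_pred = h_pred / h_pred.sum()
    h_true = h_true / h_true.sum()
    # Overlap of two normalized histograms lies in [0, 1]; scale to a percentage.
    return float(np.minimum(h_pred, h_true).sum() * 100.0)

truth = np.sin(np.linspace(0, 10, 500))
print(short_time_score(truth, truth))  # perfect match -> 100.0
```

Both scores follow the convention stated below: they are percentages, and 100% means a perfect match.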
Refer to the API reference for the full API of the evaluation module.
Notes#
Metric scores are always percentages, where 100% means a perfect match.
Config files specify which metrics to evaluate per pair:

```yaml
pairs:
  - id: 1
    metrics: [short_time, reconstruction, long_time]
```
`evaluation_params` specifies parameters such as `k`, `modes`, and `bins`.
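As an illustration, `evaluation_params` might be placed alongside the metric list for a pair. The parameter names `k`, `modes`, and `bins` come from the text above, but their exact placement in the config and their meanings here are assumptions:

```yaml
pairs:
  - id: 1
    metrics: [short_time, reconstruction, long_time]
    evaluation_params:
      k: 10      # e.g. number of forecast steps scored (assumed meaning)
      modes: 16  # e.g. number of spectral modes compared for PDEs (assumed meaning)
      bins: 30   # e.g. histogram bins for long-time ODE comparison (assumed meaning)
```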
Typical Workflow#
1. Run a model to get predictions.
2. Call `evaluate(...)` to compute metrics.
3. Call `save_results(...)` to save predictions, config, and scores.
4. Optionally use `extract_metrics_in_order(...)` for plotting.
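The workflow above can be sketched end to end. The `evaluate` and `save_results` functions below are toy stand-ins written for this example; the framework's real functions have their own signatures and behavior:

```python
import json
from pathlib import Path

import numpy as np

# Toy stand-ins illustrating the shape of the workflow; they are NOT the
# framework's evaluate(...) / save_results(...).
def evaluate(truth, pred, metrics):
    """Return a percentage score per requested metric (toy: relative L2 error)."""
    rel_err = np.linalg.norm(pred - truth) / np.linalg.norm(truth)
    score = max(0.0, 1.0 - rel_err) * 100.0
    return {name: score for name in metrics}

def save_results(out_dir, config, predictions, scores):
    """Persist predictions, config, and scores in a reproducible layout."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    np.save(out / "predictions.npy", predictions)
    (out / "results.json").write_text(
        json.dumps({"config": config, "scores": scores}, indent=2)
    )

# 1. Run a model to get predictions (here: the truth plus a small perturbation).
truth = np.sin(np.linspace(0, 10, 200))
pred = truth + 0.01 * np.cos(np.linspace(0, 10, 200))
# 2. Compute metrics.
scores = evaluate(truth, pred, ["short_time"])
# 3. Save predictions, config, and scores.
save_results("results/pair_1", {"pair_id": 1}, pred, scores)
```

This mirrors the numbered steps: model run, `evaluate(...)`, then `save_results(...)`; extracting metrics in a fixed order for plotting would follow the same pattern.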
Future Enhancements#
Additional metrics (e.g., mean absolute error, KL-divergence)
Per-variable breakdown of metrics
Support for multi-output models with task-specific metrics