ctf4science.performance_module.PerformanceMonitor

class ctf4science.performance_module.PerformanceMonitor(output_dir: str | None = None)

Bases: object

Performance monitoring for CTF models (wall-clock time only).

Tracks total wall-clock time and calculates averages across multiple runs. Energy consumption is measured at the SLURM job level using EAR through bash scripts.

Parameters:
output_dir : str, optional

Directory to save performance results. Defaults to results/performance_results.

Methods

record_run(run_id, duration)

Record a completed run and update cumulative time and run count.

start_monitoring()

Start monitoring a process or session.

stop_monitoring()

Stop monitoring and return summary metrics.

Notes

Method details:

start_monitoring():

  • Start monitoring a process or session. Resets total time and run count, and records the current time as session start.

  • Returns:
    • None.

record_run(run_id, duration):

  • Record a completed run and update cumulative time and run count.

  • Parameters:
    • run_id : str. Unique identifier for the run (e.g. "run_1").

    • duration : float. Duration of the run in seconds.

  • Returns:
    • None.

  • Raises ValueError if duration is negative.
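The reset-and-accumulate behavior described for start_monitoring and record_run can be sketched as follows. This is a minimal stand-in under stated assumptions, not the library's code: the class name `RunAccumulator` and its attribute names are hypothetical, and only the behavior documented above (reset on start, accumulate per run, ValueError on a negative duration) is modeled.

```python
import time


class RunAccumulator:
    """Hypothetical stand-in mirroring the documented start/record behavior."""

    def __init__(self):
        self.total_time = 0.0
        self.run_count = 0
        self.session_start = None  # set by start_monitoring()

    def start_monitoring(self):
        # Reset totals and mark the session start, as the notes describe.
        self.total_time = 0.0
        self.run_count = 0
        self.session_start = time.time()

    def record_run(self, run_id, duration):
        # Negative durations are rejected, matching the documented ValueError.
        if duration < 0:
            raise ValueError(
                f"duration for {run_id!r} must be non-negative, got {duration}"
            )
        self.total_time += duration
        self.run_count += 1


acc = RunAccumulator()
acc.start_monitoring()
acc.record_run("run_1", 1.5)
acc.record_run("run_2", 2.5)
print(acc.run_count, acc.total_time)  # prints: 2 4.0
```

The documented PerformanceMonitor is expected to be driven in the same order: start_monitoring once, then record_run once per completed run.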

stop_monitoring():

  • Stop monitoring and return summary metrics. Computes total run time, average time per run, and session duration; writes a performance summary YAML file to output_dir. If no session was started, returns an empty dict.

  • Returns:
    • dict. Summary whose keys include total_num_runs, total_run_time_seconds, average_time_per_run_seconds, total_session_time_seconds, and timestamp.
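As a rough illustration of how the summary values relate to each other, the sketch below builds a dict with the keys named above. The helper `summarize` is hypothetical, and several details are assumptions rather than documented behavior: the average is reported as 0.0 when no runs were recorded, and the timestamp is an ISO 8601 string.

```python
import time
from datetime import datetime


def summarize(total_run_time, num_runs, session_start):
    """Hypothetical helper: build a summary dict with the documented keys."""
    if session_start is None:
        # No session was started: the docs say an empty dict is returned.
        return {}
    # Assumption: with zero recorded runs the average is reported as 0.0.
    average = total_run_time / num_runs if num_runs else 0.0
    return {
        "total_num_runs": num_runs,
        "total_run_time_seconds": total_run_time,
        "average_time_per_run_seconds": average,
        "total_session_time_seconds": time.time() - session_start,
        # Assumption: ISO 8601 string; the real timestamp format is not documented.
        "timestamp": datetime.now().isoformat(),
    }


summary = summarize(total_run_time=6.0, num_runs=3, session_start=time.time())
print(summary["average_time_per_run_seconds"])  # prints: 2.0
```

Note that total_session_time_seconds measures wall-clock time since the session started, so it can exceed total_run_time_seconds when there is idle time between runs.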

_save_summary_results(metrics):

  • Save summary results to a YAML file in the output directory.

  • Parameters:
    • metrics : dict. Summary metrics to write (e.g. from stop_monitoring).

  • Returns:
    • None. Exceptions during write are logged but not raised.
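The "logged but not raised" behavior can be sketched as below. The helper `save_summary`, the logger name, and the file name `performance_summary.yaml` are all hypothetical. To stay dependency-free, the sketch writes flat `key: value` lines by hand, which is valid YAML only for simple scalar values; a real implementation would more likely use a YAML library.

```python
import logging
import tempfile
from pathlib import Path

logger = logging.getLogger("performance_sketch")


def save_summary(metrics, output_dir, filename="performance_summary.yaml"):
    """Hypothetical helper: write flat metrics as YAML-style lines, logging failures."""
    try:
        out_dir = Path(output_dir)
        out_dir.mkdir(parents=True, exist_ok=True)
        lines = [f"{key}: {value}" for key, value in metrics.items()]
        (out_dir / filename).write_text("\n".join(lines) + "\n")
    except OSError as exc:
        # Log the failure and return normally instead of re-raising.
        logger.error("Could not write performance summary: %s", exc)


with tempfile.TemporaryDirectory() as tmp:
    save_summary({"total_num_runs": 2, "total_run_time_seconds": 4.0}, tmp)
    written = (Path(tmp) / "performance_summary.yaml").read_text()

print(written)
```

Swallowing write errors means a failed save never interrupts a benchmark run, at the cost that callers must check the log (or the file's existence) to know whether the summary was persisted.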