ctf4science.tune_module.ModelTuner

class ctf4science.tune_module.ModelTuner(config_path: str, model_name: str | None = None, save_final_config: bool = True, metric: str = 'score', mode: str = 'max', ignore_reinit_error: bool = False, time_budget_hours: float = 24.0, use_asha: bool = False, asha_config: dict[str, Any] | None = None, gpus_per_trial: int = 0, enable_performance_monitoring: bool = False, performance_output_dir: str | None = None, ray_results_dir: str | None = None, run_opt_main: Callable[[str], None] | None = None)

Bases: object

Orchestrates hyperparameter tuning for CTF models using Ray Tune.

Supports tuning with a single, explicitly specified config file, or automatically detecting every config file that matches tuning_config/config_*.yaml in a model’s directory.

Parameters:

- config_path : str. Path to the configuration file (dataset, model, hyperparameters).
- model_name : str, optional. Model name; if not provided, inferred from config or directory.
- save_final_config : bool, optional. Whether to save the final optimal config. Default is True.
- metric : str, optional. Metric to optimize. Default is "score".
- mode : str, optional. Optimization mode, "min" or "max". Default is "max".
- ignore_reinit_error : bool, optional. Whether to ignore Ray reinit errors (dev only). Default is False.
- time_budget_hours : float, optional. Max time budget in hours. Default is 24.0.
- use_asha : bool, optional. Whether to use the ASHA scheduler. Default is False.
- asha_config : dict, optional. ASHA config (max_t, grace_period, reduction_factor, brackets).
- gpus_per_trial : int, optional. GPUs per trial (0 = all available). Default is 0.
- enable_performance_monitoring : bool, optional. Enable performance (time) monitoring. Default is False.
- performance_output_dir : str, optional. Directory for performance results.
- ray_results_dir : str, optional. Directory for Ray temporary results.
- run_opt_main : callable. Callable that runs a single optimization trial given a config path (typically run_opt.main from the model directory). The signature defaults to None, but a callable must be supplied; otherwise ValueError is raised.
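
As a rough illustration of how these parameters fit together, the sketch below assembles keyword arguments for the constructor. The config path, the placeholder run_opt_main, the helper launch_tuning, and all ASHA values are assumptions made for the example, not defaults taken from the library.

```python
from typing import Any


def run_opt_main(config_path: str) -> None:
    """Placeholder for a model's run_opt.main; a real one trains and evaluates the model."""
    print(f"would run one trial with {config_path}")


# Illustrative ASHA settings: the keys are the ones listed above, the values are assumptions.
asha_config: dict[str, Any] = {
    "max_t": 100,
    "grace_period": 10,
    "reduction_factor": 3,
    "brackets": 1,
}

tuner_kwargs: dict[str, Any] = {
    "config_path": "models/my_model/tuning_config/config_example.yaml",  # hypothetical path
    "metric": "score",
    "mode": "max",
    "time_budget_hours": 2.0,
    "use_asha": True,
    "asha_config": asha_config,
    "run_opt_main": run_opt_main,  # must be supplied, see Raises
}


def launch_tuning() -> None:
    """Build the tuner and run the workflow (needs ctf4science installed and a real config)."""
    from ctf4science.tune_module import ModelTuner

    tuner = ModelTuner(**tuner_kwargs)
    tuner.run_optimization()
```

Calling launch_tuning() would start the actual tuning run; it is left uncalled here because it depends on an installed ctf4science package and an existing config file.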

Methods

`run_from_cli`([description])	Run tuning from the command line.
`run_optimization`()	Run the complete hyperparameter optimization workflow.

Raises:

ValueError: If config is missing required sections (dataset, model, hyperparameters) or if run_opt_main is not provided.

Notes

Method details:

run_optimization():

Run the full tuning workflow: build the search space, create the Tuner, call fit(), and save the best config and the trial history when applicable.
Returns:
- None.

run_from_cli(description): (static)

Run tuning from the command line: parse the CLI arguments, import the model-local run_opt.main, and run a ModelTuner for each config.
Parameters:
- description : str, optional. Description for the argument parser. Default "CTF Model Hyperparameter Tuner".
Returns:
- None.

_construct_output_dir():

Construct the output directory path from model, dataset, and pair_id. Structure: results/tune_results/{model_name}/{dataset_name}/pair_id_{pair_ids}/{timestamp}/.
Returns:
- Path. Constructed output directory path.
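
The directory structure above could be assembled roughly as follows. This is a standalone sketch, not the library's code: the function name, the underscore-joined pair IDs, the "all" placeholder for None, and the timestamp format are all assumptions.

```python
from datetime import datetime
from pathlib import Path


def construct_output_dir(model_name, dataset_name, pair_ids, now=None,
                         root="results/tune_results"):
    """Build results/tune_results/{model}/{dataset}/pair_id_{ids}/{timestamp}."""
    # Assumption: pair IDs are joined with underscores; None (all pairs) becomes "all".
    pair_part = "all" if pair_ids is None else "_".join(str(p) for p in pair_ids)
    # Assumption: the timestamp uses a YYYYMMDD_HHMMSS layout.
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    return Path(root) / model_name / dataset_name / f"pair_id_{pair_part}" / stamp


example = construct_output_dir("my_model", "my_dataset", [1, 2],
                               now=datetime(2024, 1, 2, 3, 4, 5))
print(example.as_posix())
```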

_infer_model_name(config_path): (static)

Infer model name from directory structure (e.g. parent of tuning_config).
Parameters:
- config_path : str. Path to the config file.
Returns:
- str. Inferred model name.
Raises ValueError if model name cannot be inferred.
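
One way to infer the name from the directory layout is to look for a tuning_config directory among the path's ancestors and return the name of the directory that contains it. The sketch below assumes that layout; the function name is hypothetical and the real method may apply further rules.

```python
from pathlib import Path


def infer_model_name(config_path: str) -> str:
    """Return the name of the directory that contains tuning_config/."""
    for parent in Path(config_path).parents:
        # parent.parent.name is empty when tuning_config sits at the path root.
        if parent.name == "tuning_config" and parent.parent.name:
            return parent.parent.name
    raise ValueError(f"Cannot infer model name from {config_path!r}")


print(infer_model_name("models/my_model/tuning_config/config_a.yaml"))
```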

_validate_config(self, config):

Validate that the configuration contains required sections: dataset, model, hyperparameters.
Parameters:
- config : dict. Configuration dictionary to validate.
Returns:
- None.
Raises ValueError if required sections are missing or invalid.
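
A minimal sketch of this kind of check is shown below. The three section names come from the description above; treating a non-dict section as invalid is an assumption, and the function name is hypothetical.

```python
REQUIRED_SECTIONS = ("dataset", "model", "hyperparameters")


def validate_config(config: dict) -> None:
    """Raise ValueError unless every required section is present and is a mapping."""
    missing = [name for name in REQUIRED_SECTIONS if name not in config]
    if missing:
        raise ValueError(f"Config is missing required sections: {', '.join(missing)}")
    # Assumption: a section counts as invalid when it is not a dictionary.
    invalid = [name for name in REQUIRED_SECTIONS if not isinstance(config[name], dict)]
    if invalid:
        raise ValueError(f"Config sections must be mappings: {', '.join(invalid)}")
```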

_validate_param_space(self, param_space):

Validate the parameter space (types, bounds, choices, etc.).
Parameters:
- param_space : dict. Parameter space dictionary to validate.
Returns:
- None.
Raises ValueError if parameter space is empty or invalid.

_objective(self, config):

Objective function for Ray Tune: generate config, run model via run_opt_main, sum results.
Parameters:
- config : dict. Trial hyperparameter configuration.
Returns:
- Dict[str, float]. Dictionary with the optimization metric (e.g. score).

_parse_pair_ids(dataset_config): (static)

Parse pair_id from dataset config (int, list of int, or 'all').
Parameters:
- dataset_config : dict. The dataset section from the config file.
Returns:
- list of int or None. Pair IDs to optimize for, or None for all pairs.
Raises ValueError if pair_id is not int, list of ints, or 'all'.
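
The three accepted forms can be normalized as in the following sketch. Treating a missing pair_id key as 'all' and rejecting booleans are assumptions; the function name is hypothetical.

```python
def parse_pair_ids(dataset_config: dict):
    """Normalize pair_id to a list of ints, or None meaning all pairs."""

    def is_int(value) -> bool:
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, int) and not isinstance(value, bool)

    # Assumption: a missing pair_id key is treated like 'all'.
    pair_id = dataset_config.get("pair_id", "all")
    if isinstance(pair_id, str) and pair_id == "all":
        return None
    if is_int(pair_id):
        return [pair_id]
    if isinstance(pair_id, list) and all(is_int(p) for p in pair_id):
        return list(pair_id)
    raise ValueError(f"pair_id must be an int, a list of ints, or 'all', got {pair_id!r}")
```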

_sum_results(self, results):

Sum metric values from the results dictionary (all pairs in results['pairs']).
Parameters:
- results : dict. Evaluation results from run_opt (pairs with metrics).
Returns:
- float. Sum of all metric values.
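
To make the aggregation concrete, the sketch below sums every metric value across all pairs. The exact layout of the results dictionary is not specified here, so the shape used (a list under 'pairs', each entry holding a 'metrics' dict) is an assumption, as is the function name.

```python
def sum_results(results: dict) -> float:
    """Add up every metric value of every pair in results['pairs']."""
    total = 0.0
    for pair in results["pairs"]:
        # Assumption: each pair entry stores its metrics under a 'metrics' key.
        for value in pair["metrics"].values():
            total += float(value)
    return total


example_results = {
    "pairs": [
        {"pair_id": 1, "metrics": {"short_time": 0.5, "long_time": 0.25}},
        {"pair_id": 2, "metrics": {"short_time": 1.0}},
    ]
}
print(sum_results(example_results))
```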

_create_search_space(self, tuning_config):

Build a Ray Tune search space dictionary from the tuning config.
Parameters:
- tuning_config : dict. Hyperparameter spec (type, lower_bound, upper_bound, choices, etc.).
Returns:
- dict. Ray Tune search space.
Raises Exception if required keys or types are missing or unsupported.
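
The mapping from a hyperparameter spec to a search space can be sketched without Ray installed by emitting plain descriptors. In the sketch below, the type names "float", "int", and "choice" are assumptions about the spec format, the function name is hypothetical, and the tuples stand in for Ray Tune sampling objects. It raises ValueError, which is a subclass of Exception.

```python
def create_search_space(tuning_config: dict) -> dict:
    """Translate a hyperparameter spec into per-parameter sampling descriptors."""
    space = {}
    for name, spec in tuning_config.items():
        kind = spec.get("type")
        if kind in ("float", "int"):
            if "lower_bound" not in spec or "upper_bound" not in spec:
                raise ValueError(f"{name}: lower_bound and upper_bound are required")
            low, high = spec["lower_bound"], spec["upper_bound"]
            if low >= high:
                raise ValueError(f"{name}: lower_bound must be below upper_bound")
            # With Ray Tune this would be tune.uniform(low, high) for floats and
            # tune.randint(low, high) for ints (upper bound exclusive).
            space[name] = ("uniform" if kind == "float" else "randint", low, high)
        elif kind == "choice":
            choices = spec.get("choices")
            if not choices:
                raise ValueError(f"{name}: a non-empty choices list is required")
            # With Ray Tune this would be tune.choice(choices).
            space[name] = ("choice", list(choices))
        else:
            raise ValueError(f"{name}: unsupported type {kind!r}")
    return space


example_space = create_search_space({
    "lr": {"type": "float", "lower_bound": 0.0001, "upper_bound": 0.1},
    "layers": {"type": "int", "lower_bound": 1, "upper_bound": 5},
    "activation": {"type": "choice", "choices": ["relu", "tanh"]},
})
```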

_generate_config(self, config, template, name):

Generate a YAML config file with the given hyperparameters and save to output_dir.
Parameters:
- config : dict. Selected hyperparameters.
- template : dict. Config template to fill.
- name : str. Output filename base (without extension).
Returns:
- str. Path to the generated configuration file.

_get_resources():

Get resource configuration (cpu, gpu) for Ray Tune trials.
Returns:
- dict. Keys cpu and gpu for per-trial resources.