simet.pipeline.pipeline

Pipeline

Pipeline(loader, restraints)

Orchestrates dataset loading, feature extraction, and restraint checks.

The Pipeline wires together a :class:`DatasetLoader` (providers, transform, feature extractor) and an ordered list of :class:`Restraint` instances. Calling :meth:`run` applies each restraint in sequence and short-circuits on the first failure.

Attributes:

Name Type Description
loader DatasetLoader Constructed data/feature loader.
restraints list[Restraint] Ordered list of checks/metrics to apply.

Example

Minimal YAML structure expected by :meth:`from_yaml`:

pipeline:
  loader:
    real_provider:
      type: LocalProviderWithClass
      path: data/real
    synth_provider:
      type: LocalProviderWithClass
      path: data/synth
    provider_transform:
      type: InceptionTransform
    feature_extractor:
      type: InceptionFeatureExtractor
  restraints:
    - type: FIDRestraint
      upper_bound: 40.0
    - type: RocAucRestraint
      lower_bound: 0.85

p = Pipeline.from_yaml(Path("pipeline.yaml"))
ok = p.run()  # returns True iff all restraints pass

Create a pipeline from a loader and a list of restraints.

Parameters:

Name Type Description Default
loader DatasetLoader Prepared :class:`DatasetLoader` (providers, transforms, feature extractor). required
restraints list[Restraint] Ordered list of :class:`Restraint` instances to evaluate. required

Source code in simet/pipeline/pipeline.py
def __init__(self, loader: DatasetLoader, restraints: list[Restraint]) -> None:
    """Create a pipeline from a loader and a list of restraints.

    Args:
        loader: Prepared :class:`DatasetLoader` (providers, transforms, FE).
        restraints: Ordered list of :class:`Restraint` instances to evaluate.
    """
    self.loader = loader
    self.restraints = restraints

from_yaml classmethod

from_yaml(config_path)

Construct a pipeline from a YAML file on disk.

Parses the YAML located at config_path, validates required sections, and delegates to :meth:`_from_config_dict`.

Parameters:

Name Type Description Default
config_path Path Path to a YAML file matching the expected schema. required

Returns:

Name Type Description
Pipeline Pipeline A ready-to-run pipeline instance.

Raises:

Type Description
OSError If the file cannot be opened.
YAMLError If the YAML is invalid.
ValueError If required keys are missing (re-raised from _from_config_dict).
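
The documented exceptions and the boolean result of run() can be mapped onto process exit codes, for example to gate a CI job. The following is a minimal sketch under stated assumptions: exit_code_for and its exit-code convention are hypothetical and not part of simet, and build stands for any zero-argument callable such as lambda: Pipeline.from_yaml(Path("pipeline.yaml")). The sketch avoids importing PyYAML, so yaml.YAMLError would need to be added to the except clause in real use.

```python
from typing import Callable, Protocol


class Runnable(Protocol):
    """Anything exposing run() -> bool, such as a Pipeline instance."""

    def run(self) -> bool: ...


def exit_code_for(build: Callable[[], Runnable]) -> int:
    """Map pipeline construction and execution onto an exit code.

    Hypothetical convention: 0 = all restraints passed, 1 = a restraint
    failed, 2 = the configuration could not be loaded.
    """
    try:
        pipeline = build()
    except (OSError, ValueError) as exc:
        # from_yaml is documented to raise these for unreadable files and
        # missing keys; add yaml.YAMLError here when PyYAML is imported.
        print(f"configuration error: {exc}")
        return 2
    return 0 if pipeline.run() else 1
```

A caller could then pass the returned value to sys.exit.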

Source code in simet/pipeline/pipeline.py
@classmethod
def from_yaml(cls, config_path: Path) -> "Pipeline":
    """Construct a pipeline from a YAML file on disk.

    Parses the YAML located at `config_path`, validates required sections,
    and delegates to :meth:`_from_config_dict`.

    Args:
        config_path (Path): Path to a YAML file matching the expected schema.

    Returns:
        Pipeline: A ready-to-run pipeline instance.

    Raises:
        OSError: If the file cannot be opened.
        yaml.YAMLError: If the YAML is invalid.
        ValueError: If required keys are missing (re-raised from `_from_config_dict`).
    """
    try:
        with open(config_path, "r") as file:
            pipeline_data = yaml.safe_load(file)
            return cls._from_config_dict(pipeline_data)
    except Exception as e:
        logger.error(f"Failed to parse pipeline file: {e}")
        raise
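
The page does not show _from_config_dict, so which keys it actually requires is unknown. As an illustration only, the sketch below checks a parsed config dict against the sections that appear in the YAML example above; check_pipeline_config and the key lists are assumptions, not simet's real validation. It also rejects a non-dict input, since yaml.safe_load returns None for an empty file.

```python
# Assumed from the YAML example on this page, not from simet's source.
LOADER_KEYS = (
    "real_provider",
    "synth_provider",
    "provider_transform",
    "feature_extractor",
)


def check_pipeline_config(config: object) -> None:
    """Raise ValueError if a section from the example schema is missing."""
    if not isinstance(config, dict) or not isinstance(config.get("pipeline"), dict):
        raise ValueError("missing top-level 'pipeline' section")
    pipeline = config["pipeline"]
    for section in ("loader", "restraints"):
        if section not in pipeline:
            raise ValueError(f"missing 'pipeline.{section}' section")
    missing = [key for key in LOADER_KEYS if key not in pipeline["loader"]]
    if missing:
        raise ValueError(f"missing loader keys: {', '.join(missing)}")


# Dict equivalent of the YAML example, as yaml.safe_load would return it.
example_config = {
    "pipeline": {
        "loader": {
            "real_provider": {"type": "LocalProviderWithClass", "path": "data/real"},
            "synth_provider": {"type": "LocalProviderWithClass", "path": "data/synth"},
            "provider_transform": {"type": "InceptionTransform"},
            "feature_extractor": {"type": "InceptionFeatureExtractor"},
        },
        "restraints": [
            {"type": "FIDRestraint", "upper_bound": 40.0},
            {"type": "RocAucRestraint", "lower_bound": 0.85},
        ],
    }
}

check_pipeline_config(example_config)  # passes silently
```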

run

run()

Execute restraints in order; stop at first failure.

Iterates over self.restraints, logs the metric name, and calls restraint.apply(self.loader). If any check fails (passes is False), logs a warning and returns False. Returns True only if all pass.

Returns:

Name Type Description
bool bool True if all restraints pass; False otherwise.

Source code in simet/pipeline/pipeline.py
def run(self) -> bool:
    """Execute restraints in order; stop at first failure.

    Iterates over `self.restraints`, logs the metric name, and calls
    `restraint.apply(self.loader)`. If any check fails (`passes is False`),
    logs a warning and returns `False`. Returns `True` only if all pass.

    Returns:
        bool: `True` if all restraints pass; `False` otherwise.
    """
    for restraint in self.restraints:
        logger.info(f"Applying restraint: {restraint.metric.name}")
        passes, _ = restraint.apply(self.loader)
        if not passes:
            logger.warning(
                f"Restraint {restraint.metric.name} failed. Stopping pipeline."
            )
            return False

    logger.info("All restraints passed successfully")
    return True
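
The short-circuit behaviour of run() can be illustrated without loading any data. The sketch below uses stand-in classes: StubMetric, StubRestraint, and run_restraints are hypothetical and only mirror the interface the source above relies on (a metric.name attribute and an apply(loader) method returning a tuple whose first element is the pass flag). Logging is omitted.

```python
from dataclasses import dataclass, field


@dataclass
class StubMetric:
    name: str


@dataclass
class StubRestraint:
    """Stand-in for a Restraint; records each loader it is applied to."""

    metric: StubMetric
    passes: bool
    calls: list = field(default_factory=list)

    def apply(self, loader):
        self.calls.append(loader)
        return self.passes, None  # (pass flag, unused second element)


def run_restraints(loader, restraints) -> bool:
    """Same control flow as Pipeline.run: stop at the first failure."""
    for restraint in restraints:
        passes, _ = restraint.apply(loader)
        if not passes:
            return False
    return True


fid = StubRestraint(StubMetric("FID"), passes=False)
roc = StubRestraint(StubMetric("RocAuc"), passes=True)

result = run_restraints(loader=None, restraints=[fid, roc])
print(result, len(fid.calls), len(roc.calls))  # the second restraint is never applied
```

Because evaluation stops at the first failing restraint, the order of the list determines which checks run at all; placing cheaper checks first may save work, depending on how the loader computes and caches features.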