eradiate.pipelines.Pipeline

class eradiate.pipelines.Pipeline(steps=NOTHING)

Bases: object

A simple data processing pipeline, loosely inspired by scikit-learn’s Pipeline class.

add(name, step, position=None, before=None, after=None)

Add a step to an existing pipeline.

Parameters
  • name (str) – Name of the step to be added. If the passed name already exists, an exception will be raised.

  • step (PipelineStep) – Step to be added to the pipeline.

  • position (int, optional) – Index where step will be inserted.

  • before (str, optional) – Name of an existing step; the new step is inserted immediately before it. Mutually exclusive with after.

  • after (str, optional) – Name of an existing step; the new step is inserted immediately after it. Mutually exclusive with before.

Raises
  • ValueError – If name maps to an already registered step.

  • ValueError – If both before and after are set.

Notes

  • If none of position, before or after are set, the step will be appended to the pipeline.

  • If position is set together with before or after, before or after takes precedence over position.
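
The placement rules above can be sketched with a small standalone function. This is an illustration under stated assumptions, not Eradiate's implementation: insert_step is a hypothetical helper, steps are stored as a plain list of (name, step) pairs, and ordinary callables stand in for PipelineStep objects. How the real method reacts to an unknown before or after name is not documented here; the sketch simply lets list.index raise.

```python
def insert_step(steps, name, step, position=None, before=None, after=None):
    """Return a new list of (name, step) pairs with the step inserted.

    Hypothetical helper mirroring the documented rules of Pipeline.add().
    """
    names = [n for n, _ in steps]

    if name in names:
        raise ValueError(f"step '{name}' is already registered")
    if before is not None and after is not None:
        raise ValueError("'before' and 'after' are mutually exclusive")

    if before is not None:
        # 'before' takes precedence over 'position'
        index = names.index(before)
    elif after is not None:
        # 'after' takes precedence over 'position'
        index = names.index(after) + 1
    elif position is not None:
        index = position
    else:
        # No selector set: append at the end
        index = len(steps)

    return steps[:index] + [(name, step)] + steps[index:]


steps = [("load", str.strip), ("parse", str.split)]
steps = insert_step(steps, "lower", str.lower, after="load")
steps = insert_step(steps, "final", len)
steps = insert_step(steps, "first", str, position=3, before="load")

order = [n for n, _ in steps]
print(order)  # ['first', 'load', 'lower', 'parse', 'final']
```

In the last call, position=3 is ignored because before is also set.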

transform(x, start=None, stop=None, stop_after=None, step=None)

Apply the pipeline to the given data. Keyword arguments can be used to restrict pipeline execution to selected steps.

Parameters
  • x – Data to apply the pipeline to.

  • start (int or str, optional) – If set, start execution at the indexed step.

  • stop (int or str, optional) – If set, stop execution just before the indexed step (the indexed step itself is not executed).

  • stop_after (int or str, optional) – If set, stop execution after the indexed step (the indexed step is executed). Takes precedence over stop.

  • step (int or str, optional) – If set, execute only the indexed step. Takes precedence over all other step selectors.

Returns

xt – Processed data.
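
The way the selectors combine can be sketched as follows. This is a minimal illustration, not the library's code: run_steps and resolve are hypothetical helpers, steps are a list of (name, callable) pairs, and each step is applied by calling it directly, whereas real PipelineStep objects may expose a different interface.

```python
def resolve(names, key):
    """Map a step selector (name or integer index) to an integer index."""
    return names.index(key) if isinstance(key, str) else key


def run_steps(steps, x, start=None, stop=None, stop_after=None, step=None):
    """Apply a slice of the steps to x, following the documented precedence."""
    names = [n for n, _ in steps]

    if step is not None:
        # 'step' overrides every other selector: run that step only
        first = resolve(names, step)
        last = first + 1
    else:
        first = 0 if start is None else resolve(names, start)
        if stop_after is not None:
            # Inclusive bound; takes precedence over 'stop'
            last = resolve(names, stop_after) + 1
        elif stop is not None:
            # Exclusive bound
            last = resolve(names, stop)
        else:
            last = len(steps)

    for _, func in steps[first:last]:
        x = func(x)
    return x


steps = [("strip", str.strip), ("lower", str.lower), ("split", str.split)]
data = "  Hello World "

full = run_steps(steps, data)                    # ['hello', 'world']
partial = run_steps(steps, data, stop="split")   # 'hello world'
inclusive = run_steps(steps, data, stop="lower", stop_after="lower")  # 'hello world'
only = run_steps(steps, data, step=1)            # '  hello world '
```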

update()

Update internal state. Should be run whenever steps is modified or mutated.

property named_steps

A dictionary mapping names to their corresponding step.

Type

dict[str, PipelineStep]
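
One plausible reason update() must be called after steps is changed is that lookups such as named_steps rely on state derived from it. The sketch below illustrates that pattern only; ToyPipeline is a hypothetical stand-in, and the assumption that the name-to-step mapping is cached is not confirmed by this page.

```python
class ToyPipeline:
    """Hypothetical stand-in showing a cached name-to-step mapping."""

    def __init__(self, steps=None):
        self.steps = list(steps or [])  # list of (name, step) pairs
        self.update()

    def update(self):
        # Rebuild derived state from the current list of steps
        self._named_steps = dict(self.steps)

    @property
    def named_steps(self):
        return self._named_steps


pipeline = ToyPipeline([("strip", str.strip)])

# Mutating 'steps' in place does not refresh the cached mapping...
pipeline.steps.append(("lower", str.lower))
stale = "lower" in pipeline.named_steps   # False

# ...until update() is called
pipeline.update()
fresh = "lower" in pipeline.named_steps   # True
```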