composite

optimus_dl.modules.data.transforms.composite

CompositeTransform

Bases: BaseTransform

Transform that applies multiple transformations in sequence.

This allows building complex data processing pipelines by composing simpler transforms (e.g., Tokenize -> Chunk -> Batch).

Parameters:

Name | Type | Description | Default
cfg | CompositeTransformConfig | Composite transform configuration. | required
Source code in optimus_dl/modules/data/transforms/composite.py
@register_transform("compose", CompositeTransformConfig)
class CompositeTransform(BaseTransform):
    """Transform that applies multiple transformations in sequence.

    This allows building complex data processing pipelines by composing simpler
    transforms (e.g., Tokenize -> Chunk -> Batch).

    Args:
        cfg: Composite transform configuration.
    """

    def __init__(self, cfg: CompositeTransformConfig, *args, **kwargs):
        super().__init__(*args, **kwargs)
        transforms = []
        for transform in cfg.transforms:
            transforms.append(build_transform(transform, *args, **kwargs))
        self.transforms = transforms

    def build(self, source: BaseNode) -> BaseNode:
        """Chain all internal transformations together starting from the source."""
        for transform in self.transforms:
            source = transform.build(source)
        return source

build(source)

Chain all internal transformations together starting from the source.

Source code in optimus_dl/modules/data/transforms/composite.py
def build(self, source: BaseNode) -> BaseNode:
    """Chain all internal transformations together starting from the source."""
    for transform in self.transforms:
        source = transform.build(source)
    return source
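
The chaining loop above can be tried in isolation with a minimal sketch. It does not use the optimus_dl classes: `MapTransform`, `ChunkTransform`, and `Compose` are hypothetical stand-ins, and plain Python iterables take the place of `BaseNode`. Only the loop in `Compose.build` mirrors the source shown above.

```python
from typing import Callable, Iterable, Iterator


class MapTransform:
    """Hypothetical stand-in for a transform: lazily applies fn to each item."""

    def __init__(self, fn: Callable):
        self.fn = fn

    def build(self, source: Iterable) -> Iterator:
        return map(self.fn, source)


class ChunkTransform:
    """Hypothetical stand-in: splits each sequence into fixed-size pieces."""

    def __init__(self, size: int):
        self.size = size

    def build(self, source: Iterable) -> Iterator:
        for item in source:
            for start in range(0, len(item), self.size):
                yield item[start:start + self.size]


class Compose:
    """Mirrors the loop in CompositeTransform.build, using plain iterables."""

    def __init__(self, transforms):
        self.transforms = list(transforms)

    def build(self, source: Iterable) -> Iterable:
        # Each transform wraps the output of the previous one.
        for transform in self.transforms:
            source = transform.build(source)
        return source


# "Tokenize" (whitespace split) followed by "Chunk" (pieces of two tokens).
pipeline = Compose([MapTransform(str.split), ChunkTransform(size=2)])
result = list(pipeline.build(["a b c", "d e"]))
print(result)  # [['a', 'b'], ['c'], ['d', 'e']]
```

Because each transform consumes the output of the one before it, the order of the list determines the result; chunking before splitting would operate on raw characters rather than tokens.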

CompositeTransformConfig dataclass

Bases: RegistryConfigStrict

Configuration for a chain of transforms.

Parameters:

Name | Type | Description | Default
transforms | list[RegistryConfig] | List of transformation configurations to apply in order. | '???' (MISSING, so a value must be provided)
Source code in optimus_dl/modules/data/transforms/composite.py
@dataclass
class CompositeTransformConfig(RegistryConfigStrict):
    """Configuration for a chain of transforms.

    Attributes:
        transforms: List of transformation configurations to apply in order.
    """

    transforms: list[RegistryConfig] = MISSING
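
How a list of sub-configs becomes a list of built transforms can be sketched with a toy registry. Everything here is a hypothetical stand-in: `register_toy`, `build_toy`, the `name` field used to select a class, and the `Upper`/`Repeat` transforms are assumptions made for illustration, not the optimus_dl `register_transform`/`build_transform` API, whose lookup mechanism is not shown on this page.

```python
from dataclasses import dataclass, field

# Hypothetical toy registry mapping a name to a transform class.
_TOY_REGISTRY = {}


def register_toy(name):
    """Decorator that records a class under the given name (stand-in only)."""
    def decorator(cls):
        _TOY_REGISTRY[name] = cls
        return cls
    return decorator


def build_toy(cfg):
    # Assumption: each config carries a `name` that selects the class.
    return _TOY_REGISTRY[cfg.name](cfg)


@dataclass
class UpperConfig:
    name: str = "upper"


@dataclass
class RepeatConfig:
    name: str = "repeat"
    times: int = 2


@dataclass
class ComposeConfig:
    name: str = "compose"
    transforms: list = field(default_factory=list)


@register_toy("upper")
class Upper:
    def __init__(self, cfg):
        pass

    def build(self, source):
        return (s.upper() for s in source)


@register_toy("repeat")
class Repeat:
    def __init__(self, cfg):
        self.times = cfg.times

    def build(self, source):
        return (s * self.times for s in source)


@register_toy("compose")
class Compose:
    def __init__(self, cfg):
        # One built transform per sub-config, in the order listed.
        self.transforms = [build_toy(sub) for sub in cfg.transforms]

    def build(self, source):
        for transform in self.transforms:
            source = transform.build(source)
        return source


cfg = ComposeConfig(transforms=[UpperConfig(), RepeatConfig(times=3)])
out = list(build_toy(cfg).build(["ab", "c"]))
print(out)  # ['ABABAB', 'CCC']
```

Since the composite is itself registered like any other transform, a composite config can in principle appear inside another composite's `transforms` list; the toy version above supports that nesting as well.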