Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue with composing tasks with output_file_template in a workflow #641

Closed
ghisvail opened this issue Apr 21, 2023 · 3 comments
Closed

Issue with composing tasks with output_file_template in a workflow #641

ghisvail opened this issue Apr 21, 2023 · 3 comments
Labels
bug Something isn't working

Comments

@ghisvail
Copy link
Collaborator

I am attempting to build workflows which compose shell command tasks together, some of which providing default output names via output_file_template. However, these trigger an exception at runtime if they are not explicitly set when running the workflow, which defeats their purpose in my opinion.

The following minimal example defines a toy backup function wrapping the cp command and using a templated output for the target file. The code works fine on the task alone, but fails when wrapped with a worfklow with:

future: <Task finished name='Task-2' coro=<ConcurrentFuturesWorker.exec_as_coro() done, defined at .../pydra/engine/workers.py:169> exception=Exception("target_file has to be str or bool, but LF('backup', 'target_file') set")>

I would expect both runs to be successful and yield the same result modulo the temporary directory prefix.

import os
import pathlib
import tempfile

import attrs
import pydra


class BackupFile(pydra.engine.ShellCommandTask):
    """Example task performing a file backup with cp.

    If not target file is provided, it uses a default file suffixed by copy.
    """

    @attrs.define(kw_only=True)
    class InputSpec(pydra.specs.ShellSpec):
        source_file: os.PathLike = attrs.field(
            metadata={"help_string": "source file", "mandatory": True, "argstr": ""}
        )

        target_file: str = attrs.field(
            metadata={
                "help_string": "target file",
                "argstr": "",
                "output_file_template": "{source_file}_copy",
            }
        )

    input_spec = pydra.specs.SpecInfo(name="Input", bases=(InputSpec,))

    executable = "cp"


def backup(name="backup", **kwargs) -> pydra.Workflow:

    wf = pydra.Workflow(input_spec=["source_file", "target_file"], name=name, **kwargs)

    wf.add(
        BackupFile(
            name="backup_file",
            source_file=wf.lzin.source_file,
            target_file=wf.lzin.target_file,
        )
    )

    wf.set_output({"target_file": wf.backup_file.lzout.target_file})

    return wf


if __name__ == "__main__":

    # Execute with standalone backup task.
    with tempfile.TemporaryDirectory() as td:
        source_file = (pathlib.Path(td) / "source1.file")
        source_file.touch()

        task = BackupFile(source_file=source_file)
        result = task()

        print(result.output.target_file)

    # Execute backup task wrapped in a workflow.
    with tempfile.TemporaryDirectory() as td:
        source_file = (pathlib.Path(td) / "source2.file")
        source_file.touch()

        task = backup(source_file=source_file)
        result = task()

        print(result.output.target_file)
@ghisvail ghisvail added the bug Something isn't working label Apr 21, 2023
@ghisvail
Copy link
Collaborator Author

The issue is here.

Looking at the following code extraction:

def template_update_single(...):
    ...
    if spec_type == "input":
        if field.type not in [str, ty.Union[str, bool]]:
            raise Exception(
                "fields with output_file_template"
                "has to be a string or Union[str, bool]"
            )
        inp_val_set = inputs_dict_st[field.name]
        if inp_val_set is not attr.NOTHING and not isinstance(inp_val_set, (str, bool)):
            raise Exception(
                f"{field.name} has to be str or bool, but {inp_val_set} set"
            )
        if isinstance(inp_val_set, bool) and field.type is str:
            raise Exception(
                f"type of {field.name} is str, consider using Union[str, bool]"
            )

Since type(inp_val_set) is LazyField in the workflow case above, it cannot satisfy the checks for str or bool.

@ghisvail
Copy link
Collaborator Author

One possible solution would be for LazyField to become a covariant typed variable like LazyField[str] or LazyField[bool], which could then satisfy the isinstance checks.

But I believe it would fail further down at the formatting stage where the actual value needs to be accessed. It looks like LazyField.get_value requires a workflow argument, which is not available in these helper functions.

Unless I missed something?

@ghisvail
Copy link
Collaborator Author

Fixed by #662

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant