Abstract
Scientific workflow management systems offer features for composing complex
computational pipelines from modular building blocks, for executing the
resulting automated workflows, and for recording the provenance of data
products resulting from workflow runs. Despite the advantages such features
provide, many automated workflows continue to be implemented and executed
outside of scientific workflow systems due to the convenience and familiarity
of scripting languages (such as Perl, Python, R, and MATLAB), and to the high
productivity many scientists experience when using these languages. YesWorkflow
is a set of software tools that aim to provide such users of scripting
languages with many of the benefits of scientific workflow systems. YesWorkflow
requires neither the use of a workflow engine nor the overhead of adapting code
to run effectively in such a system. Instead, YesWorkflow enables scientists to
annotate existing scripts with special comments that reveal the computational
modules and dataflows otherwise implicit in these scripts. YesWorkflow tools
extract and analyze these comments, represent the scripts in terms of entities
based on the typical scientific workflow model, and provide graphical
renderings of this workflow-like view of the scripts. Future versions of
YesWorkflow will also allow the prospective provenance of the data products of
these scripts to be queried in ways similar to those available to users of
scientific workflow systems.
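
The annotate-then-extract idea described above can be illustrated with a minimal sketch. It assumes comment tags named @begin, @end, @in, and @out in the style of YesWorkflow annotations. The sample script, the functions extract_blocks and dataflow_edges, and the data names are hypothetical illustrations, not YesWorkflow's own implementation or API.

```python
import re

# Hypothetical annotated script, held as a string so the sketch is self-contained.
# Only the comment lines matter; the code lines are never executed.
SCRIPT = '''
# @begin clean_and_plot
# @in raw_data
# @out figure

# @begin clean
# @in raw_data
# @out cleaned_data
cleaned = [x for x in raw if x is not None]
# @end clean

# @begin plot
# @in cleaned_data
# @out figure
render(cleaned)
# @end plot

# @end clean_and_plot
'''

# Assumed tag syntax: a comment containing "@tag name".
TAG = re.compile(r"#\s*@(begin|end|in|out)\s+(\S+)")


def extract_blocks(source):
    """Return {block_name: {"in": [...], "out": [...]}} from annotation comments."""
    blocks = {}
    stack = []  # names of the blocks currently open, innermost last
    for line in source.splitlines():
        match = TAG.search(line)
        if not match:
            continue
        tag, name = match.groups()
        if tag == "begin":
            blocks[name] = {"in": [], "out": []}
            stack.append(name)
        elif tag == "end":
            if not stack or stack[-1] != name:
                raise ValueError(f"unmatched @end {name}")
            stack.pop()
        elif stack:
            # @in and @out attach to the innermost open block.
            blocks[stack[-1]][tag].append(name)
    if stack:
        raise ValueError(f"missing @end for {stack[-1]}")
    return blocks


def dataflow_edges(blocks):
    """Link each block that outputs a data item to every other block that inputs it."""
    edges = set()
    for producer, produced in blocks.items():
        for consumer, consumed in blocks.items():
            if producer == consumer:
                continue
            for item in produced["out"]:
                if item in consumed["in"]:
                    edges.add((producer, item, consumer))
    return sorted(edges)


if __name__ == "__main__":
    for producer, item, consumer in dataflow_edges(extract_blocks(SCRIPT)):
        print(f"{producer} --[{item}]--> {consumer}")
```

Because only comments are read, the same approach applies to any scripting language whose comment marker is matched by the pattern. The resulting edge list is the kind of structure a graphical rendering could be drawn from.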
Description
[1502.02403] YesWorkflow: A User-Oriented, Language-Independent Tool for Recovering Workflow Information from Scripts