VisTrails is an open-source provenance management and scientific workflow system designed to support the scientific discovery process. It provides unique support for data analysis and visualization, a comprehensive provenance infrastructure, and a user-centered design. The system combines and substantially extends useful features of visualization systems and scientific workflow systems. Like visualization systems, VisTrails makes advanced visualization techniques available to users, allowing them to explore and compare different visual representations of their data; like scientific workflow systems, it enables the composition of workflows that combine specialized libraries, distributed computing infrastructure, and Web services. As a result, users can create complex workflows that encompass important steps of scientific discovery, from data gathering and manipulation to complex analyses and visualizations, all integrated in one system. VisTrails was designed to manage rapidly evolving workflows. Another distinguishing feature is its comprehensive provenance infrastructure, which maintains detailed history information about the steps followed and the data derived in the course of an exploratory task: VisTrails maintains provenance of data products (e.g., visualizations and plots), of the workflows that derive these products, and of their executions. The system also provides extensive annotation capabilities that allow users to enrich the automatically captured provenance. Besides enabling reproducible results, VisTrails leverages provenance information through a series of operations and intuitive user interfaces that help users analyze data collaboratively. Notably, the system supports reflective reasoning by storing temporary results and by letting users reason about these results and follow chains of reasoning backward and forward.
Users can navigate workflow versions in an intuitive way, undo changes without losing any results, visually compare multiple workflows and show their results side by side in a visual spreadsheet, and examine the actions that led to a result. In addition, the system has native support for parameter sweeps, whose results can also be displayed in the spreadsheet. VisTrails addresses important usability issues that have hampered wider adoption of workflow and visualization systems. It provides a series of operations and user interfaces that simplify workflow design and use, including the ability to create and refine workflows by analogy, the ability to query workflows by example, and a recommendation system that automatically suggests workflow completions as users interactively construct their workflows. The system also supports the creation of mashups: customized and simplified applications that can be more easily deployed to scientists.
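
The idea of undoing changes without losing results, and of examining the actions that led to a result, can be pictured as a tree of workflow versions in which every edit is recorded as an action on a parent version. The following is a minimal sketch of that idea only, not VisTrails' actual data model or API: the `VersionTree` class, the action tuples, and the module names `Reader` and `Isosurface` are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# Hypothetical action format (an assumption for this sketch):
#   ("add_module", name)
#   ("set_param", name, key, value)
#   ("delete_module", name)
Action = Tuple


@dataclass
class VersionTree:
    """Toy version tree: each version stores its parent and the action that created it."""
    parent: Dict[int, Optional[int]] = field(default_factory=lambda: {0: None})
    action: Dict[int, Action] = field(default_factory=dict)
    current: int = 0  # version 0 is the empty workflow

    def apply(self, action: Action) -> int:
        """Record an edit as a new child of the current version."""
        new_id = len(self.parent)
        self.parent[new_id] = self.current
        self.action[new_id] = action
        self.current = new_id
        return new_id

    def undo(self) -> int:
        """Move to the parent version; nothing is deleted from the tree."""
        p = self.parent[self.current]
        if p is not None:
            self.current = p
        return self.current

    def actions_to(self, version: int) -> List[Action]:
        """Return the chain of actions from the root to the given version."""
        chain = []
        while version != 0:
            chain.append(self.action[version])
            version = self.parent[version]
        chain.reverse()
        return chain

    def materialize(self, version: int) -> Dict[str, dict]:
        """Rebuild a workflow (module name -> parameters) by replaying its actions."""
        modules: Dict[str, dict] = {}
        for act in self.actions_to(version):
            kind = act[0]
            if kind == "add_module":
                modules[act[1]] = {}
            elif kind == "set_param":
                modules[act[1]][act[2]] = act[3]
            elif kind == "delete_module":
                modules.pop(act[1], None)
        return modules


# Usage: try one parameter value, undo, then branch with another value.
tree = VersionTree()
v1 = tree.apply(("add_module", "Reader"))
v2 = tree.apply(("add_module", "Isosurface"))
v3 = tree.apply(("set_param", "Isosurface", "value", 57))
tree.undo()  # back to v2; v3 stays in the tree
v4 = tree.apply(("set_param", "Isosurface", "value", 90))
```

In this sketch, `undo` only moves a pointer, so the earlier version `v3` remains reachable and can still be rebuilt with `materialize(v3)` and compared against `v4`. Calling `actions_to` on any version returns the sequence of edits that produced it.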



