
Section 1: Publication
Publication Type
Thesis
Authorship
Hossain, M. M.
Title
A Provenance-Aware Visual Framework for Explorative and Reproducible Computational Scientific Experiments
Year
2025
Publication Outlet
University of Saskatchewan Harvest, Graduate Theses and Dissertations
DOI
ISBN
ISSN
Citation
Abstract
Researchers encapsulate diverse tools and data into a cohesive pipeline, known as a scientific workflow, to conduct computational scientific experiments. The workflow is then submitted to a runtime infrastructure for execution. Scientific Workflow Management Systems (SWfMSs) integrate various tools, techniques, languages, and graphical interfaces to provide platforms for specifying, executing, monitoring, and managing workflows, abstracting the complexities of data and process management away from researchers. These systems support both reproducible research, which ensures valid results through established protocols, and exploratory research, which investigates new phenomena iteratively. Provenance information collected during workflow composition and execution is used to validate workflow structure and execution results through queries and visualization. SWfMSs offer workflow composition interfaces based on either a graphical or a textual language. Graphical languages are user-friendly but can become unwieldy with complex workflows, whereas textual languages offer concise expressions but have steeper learning curves. Despite advancements in execution environments, limited usability in composition interfaces hinders SWfMS adoption, which has led to the retirement of once-prominent systems. Additionally, managing tool integration poses challenges, especially in web-based SWfMSs, which must serve many users simultaneously and deal with platform and tool incompatibility. Empowering end-users to integrate external tools via extensibility mechanisms is essential for improving usability and flexibility in explorative research. Similarly, reproducing external experiments within SWfMSs is challenging and requires innovative solutions. To address these issues, we conducted five studies.
First, we investigated existing SWfMS architectures and derived a novel architecture for a graphical SWfMS framework designed for intuitive workflow composition, along with abstracted execution and data and process management. Second, we examined the challenges in designing scientific workflows and addressed them by proposing an interactive experiment development environment. This environment facilitates the rapid development of scientific experiments by combining textual Domain-Specific Language (DSL)-based workflow specifications with graphical tools that speed up composition and aid comprehension. Third, we designed a domain-specific environment for capturing, querying, and visualizing provenance information. Fourth, we addressed tool integration challenges through bioinformatics and software analytics case studies. Fifth, we proposed packaging experiments and complex tools in Docker containers and registering them via a graphical interface, which overcomes their installation barriers and improves their integration with SWfMS composition and runtime environments. Through prototypes, experiments, user studies, and case studies, our work advances the usability, flexibility, reproducibility, extensibility, and scalability of SWfMSs for computational scientific experiments.
Plain Language Summary