Fundamentals of Data Analysis Workflows: Using Workflows and Pipelines in Industry

Self-Learning Course

Take this course at your own pace through pre-recorded video and online resources.

A methodical, reproducible research workflow is the cornerstone of scientific research practice.

Workflows are often misunderstood, and the terms "workflow" and "pipeline" are frequently used interchangeably. This course explores both by examining how scientific questions are approached and the computational steps typically undertaken to help answer them.

The three stages of a scientific workflow will also be discussed. The first stage involves processing, interrogating, and screening data. The second stage focuses attention on promising avenues. In the final stage, once the results are satisfactory, the output is assembled for review and evaluation. These stages are often iterative: feedback is passed back to the first stage to refine and improve the workflow.
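
The three stages and the feedback loop described above can be sketched in Python. This is a minimal illustration under stated assumptions, not material taken from the course: the function names (`screen`, `focus`, `assemble`, `run_workflow`), the numeric threshold, and the sample data are all hypothetical.

```python
from statistics import mean


def screen(records, threshold):
    """Stage 1 (hypothetical): process, interrogate, and screen the raw data."""
    cleaned = [r for r in records if r is not None]  # drop missing values
    return [r for r in cleaned if r >= threshold]    # keep values that pass the screen


def focus(candidates, top_n=3):
    """Stage 2 (hypothetical): concentrate on the most promising candidates."""
    return sorted(candidates, reverse=True)[:top_n]


def assemble(selected):
    """Stage 3 (hypothetical): assemble the output for review and evaluation."""
    return {
        "n": len(selected),
        "mean": mean(selected) if selected else None,
        "values": selected,
    }


def run_workflow(records, threshold, min_results=3, step=1.0, max_rounds=5):
    """Run the three stages, feeding review feedback back to stage 1.

    Assumed feedback rule: if the review finds too few results,
    relax the screening threshold and run the stages again.
    """
    for round_number in range(1, max_rounds + 1):
        report = assemble(focus(screen(records, threshold)))
        report["threshold"] = threshold
        report["rounds"] = round_number
        if report["n"] >= min_results:
            break
        threshold -= step  # feedback: refine the first stage
    return report


# Made-up sample measurements; None marks a missing value.
raw = [4.2, None, 7.9, 5.5, 9.1, 3.0, None, 6.4]
report = run_workflow(raw, threshold=8.0)
print(report)
```

In this sketch the review step is reduced to a single check on the number of results. In a real workflow the feedback would usually come from a person or a more detailed evaluation.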

This course will include:

  • An introduction to data analysis workflows and pipelines, which includes an exploration of data analysis phases, how to distinguish workflows from pipelines, and a tour of pipeline types
  • Demonstrations of example Apache Airflow and Nextflow pipelines
  • Guidance on how to get started with your own pipeline

Pre-requisites: None

Create a free account on our Training Portal to register for a course and browse all available training courses.
