Uncovering how food and microbes interact and, specifically, how the resident gut microbial community – the microbiome – can determine our health is a key research priority for the Quadram Institute. The team are investigating how the gut is populated with microbes from birth. They wanted to overcome a major bottleneck in processing large metagenomic datasets, so that they could get a quick overview of their samples before deciding which characteristics to investigate in more detail.
To address the ‘need for speed’, our data scientists used their expertise in genomic analyses to design a suitable workflow for rapid resistome profiling. This enabled researchers at the Quadram Institute to quickly establish which bacteria were present in neonatal gut microbiome samples and whether they carried antimicrobial resistance genes. The team went on to create software capable of running the bioinformatics workflow locally on a laptop – processing a typical two-gigabyte metagenome in two minutes. The software helped to cluster samples and was coupled with artificial intelligence workflows to accurately classify samples based on their antibiotic exposure. Finally, the team produced a pipeline that neatly packages the complex workflows into portable, easy-to-use solutions that can be run on additional compute resources.
These tools – developed as part of the Innovation Return on Research (IROR) programme, a collaboration between STFC and IBM Research – have helped to speed up processing of large quantities of metagenomic data from neonatal gut microbiome samples. The rapid resistome profiling workflow will help researchers at the Quadram Institute to quickly identify samples that are relevant to their investigations, ultimately streamlining data processing tasks. The team also delivered a flexible solution that can be used on accessible, local hardware such as laptops, without relying solely on high performance computing infrastructure.
"Working with the Hartree Centre team has enabled us to take our microbiome data and move forward into the translational arena. This work has provided a platform for next stage development of our rapid computational tools and software for profiling at-risk patient populations."
At a glance
- Researchers are now able to quickly identify samples relevant to their investigations
- A new bioinformatics workflow designed to make sense of large genetic datasets
- Flexible, portable bioinformatics tools accessible on laptops, without the need for existing HPC infrastructure
- Capable of processing a typical 2 GB metagenome in 2 minutes
- Streamlined and optimised data processing