Case Study: Optimizing Genomics Workflows with AWS HealthOmics

Building Private Workflows Using AWS HealthOmics

Executive Summary

This case study describes how Clovertex deployed a publicly available genomics analysis pipeline on AWS HealthOmics as a private workflow. It outlines the steps taken to run the pipeline successfully on HealthOmics, the changes made to the public code, and the benefits of HealthOmics.

About AWS HealthOmics

AWS HealthOmics is an AWS service that helps users such as bioinformaticians, researchers, and scientists to store, query, analyze, and generate insights from genomics and other biological data. It simplifies and accelerates the process of storing and analyzing genomic information for research and clinical organizations, and makes scientific discovery and insight generation faster.

About the wf-human-variation Workflow from Oxford Nanopore

The wf-human-variation workflow is developed by Oxford Nanopore and is available as a public GitHub repository. The repository contains a Nextflow workflow for analyzing variation in human genomic data. Specifically, the workflow can perform basecalling of FAST5 or POD5 sequence data, diploid variant calling, structural variant calling, aggregation of modified base counts, copy number variant calling, and short tandem repeat expansion genotyping.

Results

Clovertex was able to migrate the workflow successfully to AWS HealthOmics as a private workflow.

Lessons Learned

  • A new workflow must be created in HealthOmics from the workflow zip file and the workflow parameters

  • The workflow code must be changed to reference Docker containers stored in Amazon ECR

  • The parameter definitions must account for all of the parameters the workflow needs

  • HealthOmics infrastructure is built securely and limits access to external sources

  • HealthOmics can take inputs from S3 or HealthOmics data stores and can store results back to S3
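
The container and parameter lessons above can be illustrated with a small sketch. It is not the code Clovertex used. It assumes that each public image is mirrored into a private ECR repository with the same name and tag, and that the parameter definitions follow the name-to-description-and-optional shape of a HealthOmics parameter template. The image name, account ID, region, and parameter names in the usage section are hypothetical placeholders, not values taken from the wf-human-variation repository.

```python
from typing import Dict, List, Tuple


def to_ecr_uri(image: str, account_id: str, region: str) -> str:
    """Rewrite a public image reference to the matching private ECR URI.

    Assumption: the image was mirrored to ECR under the same repository
    name and tag that it has in the public registry.
    """
    first, _, rest = image.partition("/")
    # A leading component containing a dot or colon is a registry host
    # (for example "docker.io" or "localhost:5000"), so drop it.
    if rest and ("." in first or ":" in first):
        image = rest
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{image}"


def build_parameter_template(
    params: Dict[str, Tuple[str, bool]]
) -> Dict[str, dict]:
    """Build a template mapping each name to its description and optional flag."""
    return {
        name: {"description": description, "optional": optional}
        for name, (description, optional) in params.items()
    }


def missing_required(
    template: Dict[str, dict], run_parameters: Dict[str, object]
) -> List[str]:
    """Return the required template parameters that a run does not supply."""
    return sorted(
        name
        for name, spec in template.items()
        if not spec["optional"] and name not in run_parameters
    )


if __name__ == "__main__":
    # Hypothetical placeholder values for illustration only.
    print(to_ecr_uri("docker.io/exampleorg/example-tool:1.0",
                     "123456789012", "us-east-1"))
    template = build_parameter_template({
        "input_reads": ("S3 URI of the input reads", False),
        "sample_name": ("Label used in output file names", True),
    })
    print(missing_required(template, {"sample_name": "demo"}))
```

Checking a run's parameters against the template before submission is one way to catch a missing required parameter early, since HealthOmics workflows cannot fall back on external sources at run time.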

Business Benefits

  • Ideal for handling production and frequently repeated workloads

  • Simplified migration process, leaving more time for challenging optimization tasks

  • Requires minimal adjustments when modifying workflows

  • Provides a system that is HIPAA eligible, secure, and scalable

  • Enables more effective collaboration with other scientists

Methodology

The goal of this experiment was to understand how to enable a private workflow on AWS HealthOmics. Most of the work was done using the AWS console. The methodology consisted of two steps: building a workflow from the artifacts in the wf-human-variation Git repository, with appropriate modifications to the code, and then executing a set of runs with the workflow that was created. For comparison, the workflow was also run directly on an EC2 instance where Nextflow was deployed.
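
The case study performed these two steps mostly in the AWS console. As a rough API equivalent, the sketch below shows how the same create-then-run sequence might look with the boto3 `omics` client. It is an illustration under stated assumptions, not the procedure Clovertex followed: the entry point `main.nf`, the helper names, the polling interval, and all argument values are hypothetical, and the IAM role and S3 locations are assumed to exist already.

```python
import time
from typing import Any, Dict


def create_workflow_request(name: str, definition_zip: bytes,
                            parameter_template: Dict[str, dict],
                            main: str = "main.nf") -> Dict[str, Any]:
    """Build keyword arguments for the omics create_workflow call."""
    return {
        "name": name,
        "engine": "NEXTFLOW",
        "definitionZip": definition_zip,
        "main": main,  # assumed entry point inside the zip
        "parameterTemplate": parameter_template,
    }


def start_run_request(workflow_id: str, role_arn: str, run_name: str,
                      parameters: Dict[str, Any],
                      output_uri: str) -> Dict[str, Any]:
    """Build keyword arguments for the omics start_run call."""
    return {
        "workflowId": workflow_id,
        "workflowType": "PRIVATE",
        "roleArn": role_arn,
        "name": run_name,
        "parameters": parameters,
        "outputUri": output_uri,
    }


def create_and_run(zip_path: str, workflow_name: str,
                   parameter_template: Dict[str, dict], role_arn: str,
                   run_parameters: Dict[str, Any], output_uri: str,
                   poll_seconds: int = 15) -> str:
    """Create a private workflow, wait until it is ACTIVE, start one run."""
    import boto3  # imported lazily so the request builders need no AWS SDK

    omics = boto3.client("omics")
    with open(zip_path, "rb") as handle:
        definition_zip = handle.read()

    # Step 1: create the workflow from the zipped definition.
    workflow = omics.create_workflow(
        **create_workflow_request(workflow_name, definition_zip,
                                  parameter_template))

    # A run can only start once the workflow has finished creating.
    while True:
        status = omics.get_workflow(id=workflow["id"])["status"]
        if status == "ACTIVE":
            break
        if status in ("FAILED", "DELETED", "INACTIVE"):
            raise RuntimeError(f"workflow creation ended in {status}")
        time.sleep(poll_seconds)

    # Step 2: start a run that writes its results to S3.
    run = omics.start_run(
        **start_run_request(workflow["id"], role_arn,
                            f"{workflow_name}-run", run_parameters,
                            output_uri))
    return run["id"]
```

Keeping the request construction in plain functions makes the inputs easy to review or log before anything is submitted, which can help when the same workflow is run repeatedly with different parameter sets.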

Workflow Creation

The workflow creation step consisted of:
[Figure: Diagram of Workflow Creation]
