Case Study: Leveraging HPC on AWS to Accelerate cryo-EM Processing

Situation

The use of cryo-electron microscopy (cryo-EM) in the pharmaceutical R&D sector is rapidly growing. Like many pharmaceutical companies, our client’s molecular profiling group relied on leased cryo-electron microscopes at a shared facility to engage in drug discovery.

Credit: University of Texas at Austin

This shared facility processed its data on AWS. Although the workload ran in the cloud, it used only a single machine, so data sets took up to 3-4 weeks to process, which limited the number of experiments the group could run. Data upload was a second bottleneck: the considerable volume of data generated by the microscopes did not start uploading until all the processing was completed. Compounding both problems, any change to the parameters meant re-processing everything, which delayed results even further.

Objective

Thanks in part to their rapid and continued growth, our client decided to invest in building their own facility, which greatly improved throughput. This additional throughput, however, created a new downstream challenge:

  • How could they process the greatly increased amount of data being generated by their on-prem cryo-EM facility?
  • How could they reduce processing time from weeks to days, so that they could process data multiple times per week?

Action

To address these needs, Clovertex built a High Performance Computing (HPC) cluster on AWS, initially using the open-source RELION software to process the cryo-EM data. We then built a fast, secure way to transfer data from their cryo-EM lab to AWS, where it is processed and stored.

By leveraging the elasticity of the AWS cloud, we could run processing in parallel across multiple instances instead of on a single machine, which significantly reduced overall processing time.
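As a rough illustration of how work can be spread across nodes, the sketch below splits a list of micrograph files into near-equal batches, one per worker. The file names, the node count, and the `split_into_batches` helper are hypothetical and are not taken from the client's pipeline.

```python
from typing import List, Sequence


def split_into_batches(items: Sequence[str], workers: int) -> List[List[str]]:
    """Split items into `workers` contiguous batches whose sizes differ by at most one."""
    if workers < 1:
        raise ValueError("workers must be at least 1")
    base, extra = divmod(len(items), workers)
    batches = []
    start = 0
    for i in range(workers):
        # The first `extra` batches each take one additional item.
        size = base + (1 if i < extra else 0)
        batches.append(list(items[start:start + size]))
        start += size
    return batches


# Hypothetical input: 10 micrograph movies spread across 4 nodes.
micrographs = [f"movie_{n:04d}.tiff" for n in range(10)]
batches = split_into_batches(micrographs, workers=4)
print([len(b) for b in batches])  # [3, 3, 2, 2]
```

With a single machine every file is handled in sequence; with four workers the largest batch here holds three files, so the ideal wall-clock time falls to roughly a third, before any scheduling or I/O overhead.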

To improve data transfer, Clovertex implemented incremental data transfer, so data could start moving from the microscope to AWS as it was being created, instead of having to wait until the entire process was completed.
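The idea behind incremental transfer can be sketched in a few lines of Python. This is only an illustration of the concept, not the mechanism used in the solution, which relied on the AWS services listed in this case study. The `files_ready_for_transfer` helper, the 60-second quiet period, and the file names are assumptions made for the example.

```python
import os
import tempfile
import time
from typing import List, Set


def files_ready_for_transfer(directory: str, already_sent: Set[str],
                             min_age_seconds: float = 60.0) -> List[str]:
    """Return names of unsent files whose last modification is old enough
    to suggest that writing has finished (an assumed heuristic)."""
    now = time.time()
    ready = []
    for entry in sorted(os.listdir(directory)):
        path = os.path.join(directory, entry)
        if not os.path.isfile(path) or entry in already_sent:
            continue
        if now - os.path.getmtime(path) >= min_age_seconds:
            ready.append(entry)
    return ready


# Demo with a temporary directory standing in for the acquisition folder.
with tempfile.TemporaryDirectory() as acquisition_dir:
    old_file = os.path.join(acquisition_dir, "movie_0001.tiff")
    new_file = os.path.join(acquisition_dir, "movie_0002.tiff")
    for path in (old_file, new_file):
        with open(path, "wb") as handle:
            handle.write(b"\0")
    # Backdate the first file so it looks finished; the second is still "fresh".
    past = time.time() - 300
    os.utime(old_file, (past, past))

    sent: Set[str] = set()
    first_pass = files_ready_for_transfer(acquisition_dir, sent)
    sent.update(first_pass)  # a real pipeline would upload here
    second_pass = files_ready_for_transfer(acquisition_dir, sent)
    print(first_pass, second_pass)
```

Each pass picks up only files that are finished and not yet sent, so transfer can begin while the microscope is still producing data.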

In addition, now that they had an on-prem facility, our client chose to invest in cryoSPARC, a commercial cryo-EM software package that unlocked even higher performance by running on a GPU-accelerated cluster. Clovertex helped migrate their processes from RELION to this more robust, well-supported application.

AWS services used in this solution include:

  • AWS Direct Connect
  • AWS DataSync
  • Amazon S3
  • AWS ParallelCluster with Amazon EC2 GPU-based instance families G4, G5, P3, and P4
  • Amazon Elastic Block Store (Amazon EBS)
  • Amazon FSx for Lustre
  • Amazon CloudWatch

Results

The combination of parallel processing on AWS, improved data transport, and optimized settings enabled our client to finish processing an experiment in less than a week. It also allowed multiple scientists to run multiple clusters at the same time, so they could process the same data with different parameters in parallel. In addition, whenever a new software version is released, they can now evaluate it separately without disrupting production processing on the current version.
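Running the same data with different parameters amounts to a parameter sweep. A minimal sketch, assuming a hypothetical `build_sweep` helper, made-up parameter names, and a placeholder S3 location, could generate one job description per combination like this:

```python
from itertools import product
from typing import Dict, List


def build_sweep(dataset: str, grid: Dict[str, list]) -> List[dict]:
    """Build one job description per combination of the parameter values in `grid`."""
    names = list(grid)
    jobs = []
    for values in product(*(grid[name] for name in names)):
        job = {"dataset": dataset}
        job.update(zip(names, values))
        jobs.append(job)
    return jobs


# Hypothetical parameters and bucket; real processing options will differ.
grid = {"box_size_px": [256, 320], "num_classes": [50, 100, 200]}
jobs = build_sweep("s3://example-bucket/session-01/", grid)
print(len(jobs))  # 6
print(jobs[0])
```

Each job description could then be submitted to its own cluster, so the combinations run side by side rather than one after another.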
