Single cell RNA sequencing data processing using cloud-based serverless computing

This paper is a preprint and has not been certified by peer review.

Authors

Hung, L.-H.; Nasam, N.; Lloyd, W.; Yeung, K. Y.

Abstract

Single cell RNA sequencing (scRNA-seq) has become a routine method for measuring cell activities. Processing large scRNA-seq datasets requires high-performance computing resources. Cloud computing provides these resources on demand, without major investment in infrastructure. Serverless computing adds cost efficiency: users pay only for the resources they actually use, with no pre-allocated server capacity and no servers to set up in advance. We present a novel and generalizable methodology that uses serverless cloud computing to accelerate computationally intensive workflows. We create an on-demand supercomputer using rapidly deployable cloud serverless functions as automatically provisioned computation units. We tested this methodology by optimizing a scRNA-seq workflow with cloud serverless functions on two publicly available peripheral blood mononuclear cell (PBMC) datasets. In addition, we demonstrate our approach on data generated by the NIH MorPhiC program, processing a 450 GB human scRNA-seq dataset across 86 cell lines designed to study the temporal impact of perturbations on pancreatic differentiation. We compared the total execution time of the serverless scRNA-seq workflow with that of the traditional workflow without serverless functions, and demonstrate a major speedup for large scRNA-seq datasets.
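
The idea of using serverless functions as automatically provisioned computation units can be illustrated with a generic scatter-gather pattern. The sketch below is not the authors' implementation: the function names, the toy read records, and the chunk size are all hypothetical, and a local thread pool stands in for real serverless invocations. It only shows the general shape of splitting input into chunks, processing each chunk independently, and merging the partial results.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(records, chunk_size):
    """Split the input into independent chunks, one per function invocation."""
    return [records[i:i + chunk_size] for i in range(0, len(records), chunk_size)]


def count_barcodes(chunk):
    """Hypothetical stand-in for one serverless invocation:
    count reads per cell barcode within a single chunk."""
    return Counter(barcode for barcode, _gene in chunk)


def scatter_gather(records, chunk_size=2, max_workers=4):
    """Fan chunks out to workers, then merge the partial counts.
    A thread pool is used here only to simulate parallel invocations."""
    chunks = split_into_chunks(records, chunk_size)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        partials = list(pool.map(count_barcodes, chunks))
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


# Toy (barcode, gene) records; purely illustrative, not real data.
reads = [
    ("AAAC", "CD3E"),
    ("AAAC", "CD19"),
    ("TTTG", "CD3E"),
    ("AAAC", "MS4A1"),
    ("TTTG", "CD19"),
]
counts = scatter_gather(reads)
```

In a real deployment each chunk would be handed to a cloud function rather than a local thread, so the number of concurrent workers could scale with the number of chunks instead of being fixed by a pre-allocated server.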
