A Cross-Species Generative Cell Atlas Across 1.5 Billion Years of Evolution: The TranscriptFormer Single-cell Model

This paper is a preprint and has not been certified by peer review.

Authors

Pearce, J. D.; Simmonds, S. E.; Mahmoudabadi, G.; Krishnan, L.; Palla, G.; Istrate, A.-M.; Tarashansky, A.; Nelson, B.; Valenzuela, O.; Li, D.; Quake, S. R.; Karaletsos, T.

Abstract

Single-cell transcriptomics has revolutionized our understanding of cellular diversity, but integrating this knowledge across evolutionary distances remains challenging. Here we present TranscriptFormer, a family of generative foundation models representing a cross-species generative cell atlas trained on up to 112 million cells spanning 1.53 billion years of evolution across 12 species. TranscriptFormer jointly models genes and transcripts using a novel generative architecture, enabling it to function as a virtual instrument for probing cellular biology. In zero-shot settings, our models demonstrate superior performance on both in-distribution and out-of-distribution cell type classification, with robust performance even for species separated by over 685 million years of evolutionary distance. TranscriptFormer can also perform zero-shot disease state identification in human cells and accurately transfers cell type annotations across species boundaries. Being a generative model, TranscriptFormer can be prompted to predict cell type-specific transcription factors and gene-gene interactions that align with independent experimental observations. This work establishes a powerful framework for integrating and interrogating cellular diversity across species as well as offering a foundation for in silico experimentation with a generative single-cell atlas model.
