Susagi: A Microbiome World Model
Peluso, M.; Tackmann, J.; von Mering, C.
Abstract

Motivation: Accurately modelling how microbial communities assemble and change across hosts and environments is essential for analysis and intervention. Typical pipelines capture limited generalisable structure and often depend on fixed definitions of ecological units.

Results: We present Susagi (Set Unsupervised Assessment of Genetic Imposters), a permutation-invariant denoising transformer that operates directly on sets of bacterial SSU rRNA gene embeddings to learn a member-level stability function. The model was trained on 2 × 10^6 bacterial community samples. We show that it reliably predicts community composition dynamics in a zero-shot setting (no task-specific training), demonstrated here across three challenging microbiomes for which traditional ML methods do not exceed random expectation. The model's stability scores capture biological structure: across datasets, higher scores are enriched for agricultural, cropland, and soil-associated habitats, consistent with microbial communities in these environments supporting positive diversity-stability relationships. The highest stability scores are attained only by large communities with high Pielou evenness, even though the model has never seen abundances, suggesting that it can recognise community dysbiosis from presence/absence data alone. The scores also track biological gradients such as subject age. Susagi is competitive with another large microbiome model, the Microbial General Model (MGM), on diverse classification tasks, without task-specific fine-tuning and with greater parameter efficiency. Ultimately, our model will facilitate hypothesis generation for complex microbial processes, including deterministic assembly and microbial interactions, which are crucial, for instance, in the in silico design of communities.

Availability and implementation: Evaluation code and model weights are available at https://github.com/the-puzzler/Microbiome-Modelling. Model weights can also be downloaded directly from https://huggingface.co/basilboy/microbiome-model. An interactive demo is available at https://huggingface.co/spaces/basilboy/microbiome-space.
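
For readers unfamiliar with the Pielou evenness referred to in the Results, the following is a minimal sketch of the standard definition, J = H / ln(S), where H is the Shannon entropy of the relative abundances and S is the number of taxa present. It is not taken from the Susagi code base; the function name `pielou_evenness` and the choice to return 0.0 for communities with fewer than two taxa are assumptions made here for illustration.

```python
import math


def pielou_evenness(abundances):
    """Pielou evenness J = H / ln(S) for non-negative abundance values.

    Hypothetical helper for illustration; not part of the Susagi repository.
    """
    # Keep only taxa that are actually present in the sample.
    present = [a for a in abundances if a > 0]
    richness = len(present)
    if richness < 2:
        # J is undefined for fewer than two taxa; 0.0 is a convention assumed here.
        return 0.0
    total = sum(present)
    # Shannon entropy (natural log) of the relative abundances.
    shannon = -sum((a / total) * math.log(a / total) for a in present)
    # Normalise by the maximum possible entropy for this richness.
    return shannon / math.log(richness)


if __name__ == "__main__":
    print(pielou_evenness([10, 10, 10, 10]))  # perfectly even community
    print(pielou_evenness([97, 1, 1, 1]))     # community dominated by one taxon
```

A perfectly even community gives J = 1, while a community dominated by a single taxon gives a value close to 0. Note that this quantity requires abundances, which, as stated above, the model itself never sees.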