The Latentverse: An Open-Source Benchmarking Toolkit for Evaluating Latent Representations
Turura, Y.; Friedman, S. F.; Cremer, A.; Maddah, M.; Tonekaboni, S.
Abstract
Self-supervised representation learning is a powerful approach for extracting meaningful features without relying on large amounts of labeled data, making it particularly valuable in fields like healthcare. This enables pretrained models to be shared and fine-tuned with minimal data for various downstream applications. However, evaluating the quality and behavior of these representations remains challenging. To address this, we introduce Latentverse, an open-source library and web-based platform for evaluating latent representations. Latentverse generates detailed reports with visualizations and metrics that provide a comprehensive perspective on different properties of representations, such as clustering, disentanglement, generalization, expressiveness, and robustness. It also allows for the comparison of different representations, enabling developers to refine model architectures and helping users assess how well an embedding model aligns with the requirements of their specific applications.