Rapid Identification of Metal Resistance Genes Using an Enhanced ResNet Deep Learning Model Trained on a largely Expanded BacMet-Based Database
Chen, J.; Gao, X.; Zhang, C.; Ge, Y.
Abstract
Heavy metal pollution poses significant risks to both the environment and public health. Effective management requires not only reducing contaminants but also understanding microbial adaptation, which can be achieved through the comprehensive identification and classification of metal resistance genes. This study expanded the existing BacMet database by incorporating 1,219,137 unique amino acid sequences identified through BLASTp analysis, increasing the number of metal resistance-related amino acid sequences by more than 1,600-fold compared to the 753 sequences in the previous version. We propose a multi-label framework that overcomes the well-recognized difficulty of strictly classifying metal-translocating proteins such as CopA, ZntA, SilA, CadA, and CzcA, which share high sequence similarity and overlapping metal specificities. Within this framework we evaluated several deep learning models, including AlexNet, VGG, GoogleNet, ResNet, BERT, ViT, and Mamba2; ResNet demonstrated superior performance in terms of accuracy, robustness, and computational efficiency. The ResNet model achieved a Jaccard score of 98.91%, significantly higher than BLASTp (98.19%) and DIAMOND (98.16%), and its inference was approximately 6,500 times faster than these traditional alignment-based methods. Furthermore, the predictive performance of the model and the reliability of the expanded gene library were experimentally validated through overexpression and metal resistance assays of selected genes. Additionally, we developed the Predicting Metal Resistance Amino Acid Sequences (PMRAAS, https://s3.v100.vip:16165) website, which facilitates online prediction of gene metal resistance. Our findings provide deeper insights into microbial adaptation mechanisms in metal-polluted environments and offer a robust tool for advancing heavy metal pollution management through enhanced bioremediation strategies.
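
For readers unfamiliar with the Jaccard score in a multi-label setting, the following is a minimal sketch of how such a score can be computed. It assumes a sample-averaged Jaccard index (the abstract does not state which averaging the authors used), and the metal labels, the function names `jaccard` and `mean_jaccard`, and the toy predictions are all hypothetical illustrations rather than the authors' data or code.

```python
from typing import Sequence, Set


def jaccard(true_labels: Set[str], pred_labels: Set[str]) -> float:
    """Jaccard index |A ∩ B| / |A ∪ B| for one sequence's label sets."""
    union = true_labels | pred_labels
    if not union:
        # Assumption: two empty label sets count as perfect agreement.
        return 1.0
    return len(true_labels & pred_labels) / len(union)


def mean_jaccard(y_true: Sequence[Set[str]], y_pred: Sequence[Set[str]]) -> float:
    """Average the per-sequence Jaccard index over all sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    return sum(jaccard(t, p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical example: three protein sequences, each annotated with the
# set of metals it confers resistance to (multi-label, so sets may overlap).
y_true = [{"Cu"}, {"Zn", "Cd"}, {"Ag", "Cu"}]
y_pred = [{"Cu"}, {"Zn"}, {"Ag", "Cu", "Zn"}]

score = mean_jaccard(y_true, y_pred)
print(f"mean Jaccard = {score:.4f}")
```

Unlike exact-match accuracy, this metric gives partial credit when a prediction recovers some but not all of a protein's metal specificities, which is why it suits proteins with overlapping specificities.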