A novel pipeline for the rapid expansion of ecological trait databases using LLMs
Ramos, R. J.; Afkhami, M. E.; Aguilar-Trigueros, C. A.; Barbour, K. M.; Chaverri, P.; Cuprewich, S. A.; Egan, C. P.; Lynn, K. M. T.; Peay, K. G.; Norros, V.; Romero-Olivares, A. L.; Ward, L.; Chaudhary, B.
Abstract
This paper presents a novel workflow leveraging Large Language Models (LLMs) to rapidly extract trait data from fungal species descriptions, addressing a significant bottleneck in ecological research. We developed and evaluated an LLM pipeline to extract morphological trait data from arbuscular mycorrhizal fungi, comparing performance against a manually curated dataset (TraitAM). Results demonstrate the potential of LLMs for automated trait data acquisition, though accuracy varies by trait and model, with systematic biases observed. This framework offers a blueprint for building trait databases across diverse taxa and domains, significantly accelerating ecological research and conservation efforts.