Practical Use of Advanced AI Frameworks on Real-Life Scientific Problems: Three Case Studies
Gulluoglu, H. S. A.; Baby, J.; Bagul, K. M.; Basangari, B. R.; Bathini, S. A.; Chalamalla, N. K. R.; Dcunha, J.; Gupta, O.; Huang, L.; Jiang, X.; Naidu, Y. R.; Sathishkumar, G.; Sehrawat, M.; Thota, S. L.; Thuvara, D.; Vanguri, M. B.; Yin, J.; Jugder, B.-E.; Lusky, I. E.; Li, J.; Sinitskiy, A.
Abstract
Agentic artificial intelligence (AI) systems increasingly claim to automate scientific research, yet independent evaluations report persistent gaps between those claims and demonstrated capability. We tested frontier agentic AI systems on three practical problems: prediction of treatment non-response in immune-mediated inflammatory diseases, optical chemical structure recognition for literature mining, and prediction of drug-design-related properties from small datasets. Each problem was first assigned to autonomous frameworks and then reattempted as human-led, AI-assisted work. Autonomous runs failed in most cases, while human-led work produced reusable resources and modest but defensible performance, including new evidence for possible mechanisms of treatment resistance and a more practical benchmark for mining chemical structures from scientific papers. Property prediction was the single task on which one autonomous AI framework matched the human expert. We conclude that current frameworks can carry out engineering and analysis once a human expert leads the project, but cannot yet engineer a novel solution without oversight. The use of AI on real-life scientific problems remains an art rather than a routine technology.