Evaluating DNA function understanding in genomic language models using evolutionarily implausible sequences
Jiang, Shiyu, Liu, Xuyin, Wang, Zitong Jerry
arXiv.org Artificial Intelligence
Genomic language models (gLMs) hold promise for generating novel, functional DNA sequences for synthetic biology. However, realizing this potential requires models to go beyond evolutionary plausibility and understand how DNA sequence encodes gene expression and regulation. We introduce a benchmark called Nullsettes, which assesses how well models can predict in silico loss-of-function (LOF) mutations in synthetic expression cassettes with little evolutionary precedent. Testing 12 state-of-the-art gLMs, we find that most fail to consistently detect these strong LOF mutations. All models show a sharp drop in predictive accuracy as the likelihood assigned to the original (nonmutant) sequence decreases, suggesting that gLMs rely heavily on pattern-matching to their evolutionary prior rather than on any mechanistic understanding of gene expression. Our findings highlight fundamental limitations in how gLMs generalize to engineered, non-natural sequences, and underscore the need for benchmarks and modeling strategies that prioritize functional understanding.
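To make the likelihood-based reading of the abstract concrete, here is a minimal sketch of how a mutation might be scored by comparing the log-likelihood of the original and mutated sequence. Everything in it is an assumption for illustration: the order-2 Markov model is a toy stand-in for a gLM, the sequences and the `TATAAA`-style motif are hypothetical, and the scoring rule (log-likelihood difference) is one common zero-shot convention, not necessarily the procedure used in the Nullsettes benchmark.

```python
import math
from collections import Counter

ALPHABET = "ACGT"
K = 2  # context length of the toy model (assumption, not from the paper)


def fit_markov(train_seqs, k=K, pseudo=1.0):
    """Fit an order-k Markov model with additive smoothing.

    Returns a function log_prob(context, base). This is only a toy
    stand-in for the "evolutionary prior" a gLM learns from natural DNA.
    """
    pair_counts = Counter()
    ctx_counts = Counter()
    for seq in train_seqs:
        for i in range(k, len(seq)):
            ctx = seq[i - k:i]
            pair_counts[(ctx, seq[i])] += 1
            ctx_counts[ctx] += 1

    def log_prob(ctx, base):
        num = pair_counts[(ctx, base)] + pseudo
        den = ctx_counts[ctx] + pseudo * len(ALPHABET)
        return math.log(num / den)

    return log_prob


def log_likelihood(seq, log_prob, k=K):
    """Sum of per-base log-probabilities given the preceding k bases."""
    return sum(log_prob(seq[i - k:i], seq[i]) for i in range(k, len(seq)))


def lof_score(original, mutant, log_prob):
    """Positive score: the model finds the mutant less likely than the original."""
    return log_likelihood(original, log_prob) - log_likelihood(mutant, log_prob)


# Hypothetical "natural" training data, rich in one motif.
model = fit_markov(["TATAAA" * 5])

# Case 1: the original resembles the training data; the mutation destroys the motif.
familiar_score = lof_score("GGTATAAAGG", "GGGCGCGCGG", model)

# Case 2: the original is itself unlike anything in training, so the
# toy model has no basis for telling original and mutant apart.
unfamiliar_score = lof_score("GGCCGGCCGG", "GGGCGCGCGG", model)

print(f"familiar original:   {familiar_score:.3f}")
print(f"unfamiliar original: {unfamiliar_score:.3f}")
```

In this toy setting the score is positive only when the original sequence looks like the training data. When the original is already unlikely under the model, the score collapses to zero, which mirrors in miniature the failure mode the abstract describes for low-likelihood sequences.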
Aug-27-2025
- Country:
- Asia > China
- Zhejiang Province > Hangzhou (0.04)
- North America > United States
- California > Los Angeles County > Los Angeles (0.28)
- Genre:
- Research Report
- Experimental Study (0.46)
- New Finding (0.66)