AITopics | capability and efficiency

capability and efficiency

EvaLearn: Quantifying the Learning Capability and Efficiency of LLMs via Sequential Problem Solving

Neural Information Processing Systems | Jun-22-2026, 05:43:11 GMT

large language model, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country:

Asia > Middle East (0.45)
North America > United States > Minnesota (0.27)

Genre:

Research Report > Experimental Study (1.00)
Overview (1.00)

Industry:

Education (1.00)
Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

EvaLearn: Quantifying the Learning Capability and Efficiency of LLMs via Sequential Problem Solving

Dou, Shihan, Zhang, Ming, Huang, Chenhao, Chen, Jiayi, Chen, Feng, Liu, Shichun, Liu, Yan, Liu, Chenxiao, Zhong, Cheng, Zhang, Zongzhang, Gui, Tao, Xin, Chao, Wei, Chengzhi, Yan, Lin, Wu, Yonghui, Zhang, Qi, Huang, Xuanjing

arXiv.org Artificial Intelligence | Oct-22-2025

We introduce EvaLearn, a pioneering benchmark designed to evaluate large language models (LLMs) on their learning capability and efficiency in challenging tasks, a critical, yet underexplored aspect of model potential. EvaLearn contains 648 challenging problems across six task types, grouped into 182 sequences, each sequence dedicated to one task type. Diverging from most existing benchmarks that evaluate models in parallel, EvaLearn requires models to solve problems sequentially, allowing them to leverage the experience gained from previous solutions. EvaLearn provides five comprehensive automated metrics to evaluate models and quantify their learning capability and efficiency. We extensively benchmark nine frontier models and observe varied performance profiles: some models, such as Claude-3.7-sonnet, start with moderate initial performance but exhibit strong learning ability, while some models struggle to benefit from experience and may even show negative transfer. Moreover, we investigate model performance under two learning settings and find that instance-level rubrics and teacher-model feedback further facilitate model learning. Importantly, we observe that current LLMs with stronger static abilities do not show a clear advantage in learning capability across all tasks, highlighting that EvaLearn evaluates a new dimension of model performance. We hope EvaLearn provides a novel evaluation perspective for assessing LLM potential and understanding the gap between models and human capabilities, promoting the development of deeper and more dynamic evaluation approaches. All datasets, the automatic evaluation framework, and the results studied in this paper are available at the GitHub repository.
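The contrast the abstract draws between parallel and sequential evaluation can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the EvaLearn framework: the `Solver` interface, the `Attempt` record, the toy solver, and the exact-match grading are all hypothetical, and the abstract does not say how experience is passed to the model or how the five metrics are computed.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class Attempt:
    """One earlier problem kept as experience (hypothetical record type)."""
    question: str
    answer: str
    correct: bool


# Hypothetical solver interface: (question, earlier attempts) -> answer.
Solver = Callable[[str, Sequence[Attempt]], str]
Problem = Tuple[str, str]  # (question, gold answer)


def evaluate_parallel(solver: Solver, problems: List[Problem]) -> List[bool]:
    """Solve every problem in isolation; no experience is carried over."""
    return [solver(question, []) == gold for question, gold in problems]


def evaluate_sequential(solver: Solver, problems: List[Problem]) -> List[bool]:
    """Solve problems in order, exposing all earlier attempts to the solver."""
    history: List[Attempt] = []
    results: List[bool] = []
    for question, gold in problems:
        answer = solver(question, history)
        correct = answer == gold  # assumption: exact-match grading
        # Assumption: the outcome of each attempt is fed back as experience.
        history.append(Attempt(question, answer, correct))
        results.append(correct)
    return results


def toy_solver(question: str, history: Sequence[Attempt]) -> str:
    """Toy stand-in for a model: it applies the right rule (doubling)
    only once it has at least one earlier attempt to learn from."""
    n = int(question)
    return str(n * 2) if history else str(n)


problems = [("1", "2"), ("2", "4"), ("3", "6")]
parallel_results = evaluate_parallel(toy_solver, problems)
sequential_results = evaluate_sequential(toy_solver, problems)
print(parallel_results)    # [False, False, False]
print(sequential_results)  # [False, True, True]
```

With the toy solver, the per-position results improve only in the sequential run, which is the kind of within-sequence gain that a learning-oriented benchmark is meant to expose.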

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2506.02672

Country:

Asia > China (0.46)
Asia > Middle East (0.45)
North America > United States > Minnesota (0.27)

Genre:

Research Report > Experimental Study (1.00)
Overview (1.00)
Research Report > New Finding (0.92)

Industry:

Education (1.00)
Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

Chips for Deep learning continue to leapfrog in capabilities and efficiency

#artificialintelligence | Dec-26-2016, 09:35:22 GMT

Deep learning continued to drive the computing industry's agenda in 2016. But come 2017, experts say the Artificial Intelligence community will intensify its demand for higher-performance, more power-efficient "inference" engines for deep neural networks. Current deep learning systems rely on large amounts of computing power to define the network, big data sets for training, and access to large computing systems to accomplish their goals. Unfortunately, efficient execution of this learning is not so easy on embedded systems (i.e. [...]). This problem leaves the door wide open for innovation in technologies that can put deep neural network power into end devices. "Deploying Artificial Intelligence at the edge [of the network] is becoming a massive trend," Movidius CEO Remi El-Ouazzane told us a few months ago.

artificial intelligence, capability and efficiency, machine learning, (10 more...)

#artificialintelligence

Industry: Information Technology (0.79)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
