AITopics | leaderboard

NeurIPS_Dynaboard

Tristan Thrush

Neural Information Processing Systems, Apr-25-2026, 23:44:31 GMT

We introduce Dynaboard, an evaluation-as-a-service framework for hosting benchmarks and conducting holistic model comparison, integrated with the Dynabench platform. Our platform evaluates NLP models directly instead of relying on self-reported metrics or predictions on a single dataset. Under this paradigm, models are submitted to be evaluated in the cloud, circumventing the issues of reproducibility, accessibility, and backwards compatibility that often hinder benchmarking in NLP. This allows users to interact with uploaded models in real time to assess their quality, and permits the collection of additional metrics such as memory use, throughput, and robustness, which, despite their importance to practitioners, have traditionally been absent from leaderboards. On each task, models are ranked according to the Dynascore, a novel utility-based aggregation of these statistics, which users can customize to better reflect their preferences, placing more or less weight on a particular axis of evaluation or dataset. As state-of-the-art NLP models push the limits of traditional benchmarks, Dynaboard offers a standardized solution for a more diverse and comprehensive evaluation of model quality.
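The idea of ranking models by a user-weighted aggregate of several metrics can be illustrated with a minimal sketch. This is not the Dynascore formula, which the abstract does not spell out; it is a plain weighted average over metrics assumed to be pre-scaled to [0, 1] with higher meaning better. The function names, model names, metric values, and weights are all hypothetical.

```python
from typing import Dict, List, Tuple


def weighted_score(metrics: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of metrics (assumed scaled to [0, 1], higher is better)."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(weights[name] * metrics[name] for name in weights) / total


def rank_models(
    models: Dict[str, Dict[str, float]], weights: Dict[str, float]
) -> List[Tuple[str, float]]:
    """Return (model name, score) pairs sorted from best to worst."""
    scored = [(name, weighted_score(m, weights)) for name, m in models.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical, already-normalized metric values for two models.
models = {
    "model_a": {"accuracy": 0.95, "throughput": 0.40, "memory": 0.50, "robustness": 0.70},
    "model_b": {"accuracy": 0.80, "throughput": 0.80, "memory": 0.80, "robustness": 0.65},
}

# Two hypothetical user preferences over the same metrics.
accuracy_heavy = {"accuracy": 8, "throughput": 1, "memory": 1, "robustness": 1}
efficiency_heavy = {"accuracy": 1, "throughput": 3, "memory": 3, "robustness": 1}

if __name__ == "__main__":
    print(rank_models(models, accuracy_heavy))
    print(rank_models(models, efficiency_heavy))
```

With these illustrative numbers, the accuracy-heavy weighting ranks model_a first and the efficiency-heavy weighting ranks model_b first, which shows how changing the weights can change the leaderboard order.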

artificial intelligence, machine learning, natural language, (16 more...)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

InfiBench: Evaluating the Question-Answering Capabilities of Code Large Language Models

Neural Information Processing Systems, Feb-18-2026, 13:04:58 GMT

With the rapid development of code LLMs, many popular evaluation benchmarks, such as HumanEval, DS-1000, and MBPP, have emerged to measure the performance of code LLMs with a particular focus on code generation tasks. However, they are insufficient to cover the full range of expected capabilities of code LLMs, which span beyond code generation to answering diverse coding-related questions.

benchmark, large language model, machine learning, (19 more...)

Country:

Asia > Singapore (0.04)
Asia > China > Hong Kong (0.04)

Industry: Information Technology > Security & Privacy (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Benchmark Data Repositories for Better Benchmarking

Neural Information Processing Systems, Feb-16-2026, 23:49:23 GMT

In machine learning research, it is common to evaluate algorithms via their performance on standard benchmark datasets.

artificial intelligence, machine learning, natural language, (15 more...)

Country:

North America > United States > California > Orange County > Irvine (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > Canada > Quebec > Montreal (0.04)
(13 more...)

Genre: Research Report > Experimental Study (0.93)

Industry:

Information Technology (1.00)
Health & Medicine (1.00)
Government > Regional Government > North America Government > United States Government (1.00)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(4 more...)

cd61a580392a70389e27b0bc2b439f49-AuthorFeedback.pdf

Neural Information Processing Systems, Feb-14-2026, 04:34:14 GMT

hamming distance, reviewer, softmax, (15 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.37)

563991b5c8b45fe75bea42db738223b2-Paper-Datasets_and_Benchmarks_Track.pdf

Neural Information Processing Systems, Feb-13-2026, 17:37:19 GMT

large language model, machine learning, natural language, (21 more...)

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
North America > Canada > Ontario (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)
(9 more...)

Genre: Research Report (0.67)

Industry:

Media > Film (1.00)
Leisure & Entertainment > Games > Computer Games (0.67)
Information Technology (0.67)
Leisure & Entertainment > Games > Chess (0.47)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(3 more...)

cd88d62a2063fdaf7ce6f9068fb15dcd-Supplemental-Conference.pdf

Neural Information Processing Systems, Feb-11-2026, 23:27:19 GMT

leaderboard, natural question, page-level r-precision, (8 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.33)

P: A Sorghum Genotype Phenotype Prediction Dataset and Benchmark

Neural Information Processing Systems, Feb-10-2026, 10:00:18 GMT

A plant's phenome is defined by its physical and biochemical characteristics, and is the result of the interaction of its genome and its environment.

artificial intelligence, deep learning, machine learning, (15 more...)

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Arizona > Pinal County > Maricopa (0.04)
Europe > Italy > Tuscany > Florence (0.04)

Genre: Research Report (0.93)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Food & Agriculture > Agriculture (1.00)
Energy > Renewable (0.93)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Sensing and Signal Processing (0.94)

FLEX: Unifying Evaluation for Few-Shot NLP

Jonathan Bragg, Arman Cohan, Kyle Lo, Iz Beltagy

Neural Information Processing Systems, Feb-9-2026, 15:45:17 GMT

benchmark, dataset, evaluation, (15 more...)

Country:

Europe > Czechia > Prague (0.04)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)

Genre: Research Report > Promising Solution (0.68)

Industry: Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.30)

OpenXAI: Towards a Transparent Evaluation of Post hoc Model Explanations

Neural Information Processing Systems, Feb-9-2026, 11:28:24 GMT

While several types of post hoc explanation methods have been proposed in recent literature, there is very little work on systematically benchmarking these methods. Here, we introduce OpenXAI, a comprehensive and extensible open-source framework for evaluating and benchmarking post hoc explanation methods.

explanation, machine learning, natural language, (17 more...)

Country: