AITopics | deprecation

The Leaderboard Illusion

Neural Information Processing Systems, Jun-18-2026, 09:26:40 GMT

Measuring progress is fundamental to the advancement of any scientific field. As benchmarks play an increasingly central role, they also become more susceptible to distortion. Chatbot Arena has emerged as the go-to leaderboard for ranking the most capable AI systems. Yet, in this work we identify systematic issues that have skewed the competitive landscape. Specifically, undisclosed private testing practices benefit a handful of providers who are able to test multiple variants before public release and selectively retract scores.

chatbot arena, large language model, machine learning, (19 more...)

Neural Information Processing Systems

Country:

North America > United States (0.46)
North America > Mexico (0.28)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.92)

Industry:

Information Technology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.92)

Deprecating Benchmarks: Criteria and Framework

Joaquin, Ayrton San, Gipiškis, Rokas, Staufer, Leon, Gil, Ariel

arXiv.org Artificial Intelligence, Jul-10-2025

As frontier artificial intelligence (AI) models rapidly advance, benchmarks are integral to comparing different models and measuring their progress in different task-specific domains. However, there is a lack of guidance on when and how benchmarks should be deprecated once they cease to effectively perform their purpose. This risks benchmark scores over-valuing model capabilities, or worse, obscuring capabilities and safety-washing. Based on a review of benchmarking practices, we propose criteria to decide when to fully or partially deprecate benchmarks, and a framework for deprecating benchmarks. Our work aims to advance the state of benchmarking towards rigorous, high-quality evaluations, especially for frontier models, and our recommendations are intended to benefit benchmark developers, benchmark users, AI governance actors (across governments, academia, and industry panels), and policy makers.

benchmark, large language model, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2507.06434

Country:

Europe (1.00)
North America > United States (0.46)

Genre:

Overview (0.68)
Research Report (0.64)

Industry: Government (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.68)
Information Technology > Artificial Intelligence > Issues (0.68)

The Leaderboard Illusion

Singh, Shivalika, Nan, Yiyang, Wang, Alex, D'Souza, Daniel, Kapoor, Sayash, Üstün, Ahmet, Koyejo, Sanmi, Deng, Yuntian, Longpre, Shayne, Smith, Noah A., Ermis, Beyza, Fadaee, Marzieh, Hooker, Sara

arXiv.org Artificial Intelligence, May-13-2025

Measuring progress is fundamental to the advancement of any scientific field. As benchmarks play an increasingly central role, they also grow more susceptible to distortion. Chatbot Arena has emerged as the go-to leaderboard for ranking the most capable AI systems. Yet, in this work we identify systematic issues that have resulted in a distorted playing field. We find that undisclosed private testing practices benefit a handful of providers who are able to test multiple variants before public release and retract scores if desired. We establish that the ability of these providers to choose the best score leads to biased Arena scores due to selective disclosure of performance results. At an extreme, we identify 27 private LLM variants tested by Meta in the lead-up to the Llama-4 release. We also establish that proprietary closed models are sampled at higher rates (number of battles) and have fewer models removed from the arena than open-weight and open-source alternatives. Both these policies lead to large data access asymmetries over time. Providers like Google and OpenAI have received an estimated 19.2% and 20.4% of all data on the arena, respectively. In contrast, a combined 83 open-weight models have only received an estimated 29.7% of the total data. We show that access to Chatbot Arena data yields substantial benefits; even limited additional data can result in relative performance gains of up to 112% on the arena distribution, based on our conservative estimates. Together, these dynamics result in overfitting to Arena-specific dynamics rather than general model quality. The Arena builds on the substantial efforts of both the organizers and an open community that maintains this valuable evaluation platform. We offer actionable recommendations to reform the Chatbot Arena's evaluation framework and promote fairer, more transparent benchmarking for the field.
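
The selective-disclosure effect described in this abstract can be illustrated with a small simulation. This is a minimal sketch and not the authors' method: it assumes every private variant has the same true score, that each measured score is the true score plus Gaussian noise, and that only the best measurement is published. The function name `expected_reported_score`, the true score of 1200, and the noise level of 15 points are all hypothetical choices made for illustration.

```python
import random
import statistics


def expected_reported_score(n_variants, true_score=1200.0, noise_sd=15.0,
                            trials=2000, seed=0):
    """Average published score when a provider privately tests n_variants
    equally strong variants and discloses only the best noisy estimate.

    All parameter values are illustrative assumptions, not figures from
    the paper.
    """
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    reported = []
    for _ in range(trials):
        # Each variant has the same true score; only the measurement differs.
        estimates = [rng.gauss(true_score, noise_sd) for _ in range(n_variants)]
        # Selective disclosure: publish the maximum, discard the rest.
        reported.append(max(estimates))
    return statistics.mean(reported)


if __name__ == "__main__":
    for n in (1, 3, 10, 27):
        inflation = expected_reported_score(n) - 1200.0
        print(f"{n:>2} private variants: mean inflation {inflation:+.1f} points")
```

Under these assumptions the published score rises with the number of private variants even though no variant is actually better, which is the kind of bias the abstract attributes to choosing the best score.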

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2504.20879

Country: North America > United States (1.00)

Genre: Research Report (1.00)

Industry:

Information Technology (1.00)
Health & Medicine (1.00)
Leisure & Entertainment > Games > Computer Games (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.49)

Neural Transition-based Parsing of Library Deprecations

Babkin, Petr, Navarro, Nacho, Alamir, Salwa, Shah, Sameena

arXiv.org Artificial Intelligence, Dec-23-2022

This paper tackles the challenging problem of automating code updates to fix deprecated API usages of open source libraries by analyzing their release notes. Our system employs a three-tier architecture: first, a web crawler service retrieves deprecation documentation from the web; then a specially built parser processes those text documents into tree-structured representations; finally, a client IDE plugin locates and fixes identified deprecated usages of libraries in a given codebase. The focus of this paper in particular is the parsing component. We introduce a novel transition-based parser in two variants: one based on a classical feature-engineered classifier and one based on a neural tree encoder. To confirm the effectiveness of our method, we gathered and labeled a set of 426 API deprecations from 7 well-known Python data science libraries, and demonstrated that our approach decisively outperforms a non-trivial neural machine translation baseline.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2212.12584

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > New York > New York County > New York City (0.04)
Europe > Greece > Attica > Athens (0.04)
(7 more...)

Genre: Research Report (0.64)

Industry: Banking & Finance (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Predictions Series 2022: How to Win in an Opt-In Era

#artificialintelligence, Nov-12-2022, 16:40:48 GMT

Opt-in doomsayers believe this legislation could cripple the entire advertising industry because consumers will have more meaningful control over their privacy, and authenticated audiences will shrink as a result. However, the trends that have shaped the market can be bucked, and publishers and advertisers have an exciting opportunity to create a new ecosystem in compliance with the opt-in marketplace that benefits everyone involved – including consumers. Further, the browser and device manufacturer changes already in progress are moving the industry towards a more logged-in environment, in which it becomes easier for consumers to opt in as they authenticate. As we look towards an opt-in era, it's important that publishers and marketers consider how the industry arrived at this point, and the lessons they can take away from this journey. Under the opt-out default, it's easy to see that the consumer experience has been lacking, and a lot of that falls on technology. The opt-out default enabled the propagation of third-party cookies and the collection of data – often in a way that was not as transparent as it could have been for consumers.

consumer, publisher and advertiser, value exchange, (13 more...)

#artificialintelligence

Country: North America > United States (0.06)

Industry:

Information Technology > Security & Privacy (0.50)
Law > Statutes (0.36)

Technology: Information Technology > Artificial Intelligence (1.00)

The Problem of Zombie Datasets: A Framework For Deprecating Datasets

Corry, Frances, Sridharan, Hamsini, Luccioni, Alexandra Sasha, Ananny, Mike, Schultz, Jason, Crawford, Kate

arXiv.org Artificial Intelligence, Oct-18-2021

What happens when a machine learning dataset is deprecated for legal, ethical, or technical reasons, but continues to be widely used? In this paper, we examine the public afterlives of several prominent deprecated or redacted datasets, including ImageNet, 80 Million Tiny Images, MS-Celeb-1M, Duke MTMC, Brainwash, and HRT Transgender, in order to inform a framework for more consistent, ethical, and accountable dataset deprecation. Building on prior research, we find that there is a lack of consistency, transparency, and centralized sourcing of information on the deprecation of datasets, and as such, these datasets and their derivatives continue to be cited in papers and circulate online. These datasets that never die -- which we term "zombie datasets" -- continue to inform the design of production-level systems, causing technical, legal, and ethical challenges; in so doing, they risk perpetuating the harms that prompted their supposed withdrawal, including concerns around bias, discrimination, and privacy. Based on this analysis, we propose a Dataset Deprecation Framework that includes considerations of risk, mitigation of impact, appeal mechanisms, timeline, post-deprecation protocol, and publication checks that can be adapted and implemented by the machine learning community. Drawing on work on datasheets and checklists, we further offer two sample dataset deprecation sheets and propose a centralized repository that tracks which datasets have been deprecated and could be incorporated into the publication protocols of venues like NeurIPS.

dataset, deprecation, repository, (13 more...)

arXiv.org Artificial Intelligence

2111.04424

Country:

Oceania > Australia (0.14)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
Europe > France (0.04)
(10 more...)

Genre: Research Report (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Law > Litigation (0.93)
Government (0.93)
Law > Civil Rights & Constitutional Law (0.69)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(3 more...)
