AITopics | reviewer

reviewer

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

British Police Built a Sprawling Crime-Prediction Machine. Some Results Couldn't Be Trusted

WIRED | Jun-25-2026, 10:00:00 GMT

As UK police embrace the AI revolution, a WIRED investigation reveals the messy inside story of one region's experiment with predictive analytics. The Think Family Database holds records on close to half a million people who live in the city of Bristol, England. For many years, few of them knew anything about it. Launched in 2016 by the Bristol City Council and the regional Avon and Somerset Police, the database has stored all manner of sensitive information: police intelligence reports, housing status, mental health records, teenage pregnancies, enrollment in parenting courses, free school meals. On top of this sensitive data, officials built machine-learning models to assign scores to thousands of adults and children. They hoped to build what they called a "picture of threat, harm, and risk" in the region. At an event in early 2022 to help officials tackle child exploitation crimes, one police data scientist described part of the approach this way: "I essentially dump all that data in a big bucket and stir it with a data-science spatula, and we come out with a lovely risk score for everybody." This risk scoring inside the Think Family Database was just one part of Avon and Somerset Police's sprawling predictive analytics program.

artificial intelligence, data mining, machine learning, (14 more...)

WIRED

Country: Europe > United Kingdom > England > Bristol (0.24)

Genre: Research Report (0.68)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Government > Regional Government > Europe Government > United Kingdom Government (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

MLR-Bench: Evaluating AI Agents on Open-Ended Machine Learning Research

Hui Chen, Miao Xiong, Yujie Lu, Wei Han, Ailin Deng, Yufei He, Jiaying Wu, Yibo Li

Neural Information Processing Systems | Jun-21-2026, 15:15:59 GMT

Recent advancements in AI agents have demonstrated their growing potential to drive and support scientific discovery. In this work, we introduce MLR-Bench, a comprehensive benchmark for evaluating AI agents on open-ended machine learning research. MLR-Bench includes three key components: (1) 201 research tasks sourced from NeurIPS, ICLR, and ICML workshops covering diverse ML topics; (2) MLR-Judge, an automated evaluation framework combining LLM-based reviewers with carefully designed review rubrics to assess research quality; and (3) MLR-Agent, a modular agent scaffold capable of completing research tasks through four stages: idea generation, proposal formulation, experimentation, and paper writing. Our framework supports both stepwise assessment across these distinct research stages and end-to-end evaluation of the final research paper. We then use MLR-Bench to evaluate six frontier LLMs and an advanced coding agent, finding that while LLMs are effective at generating coherent ideas and well-structured papers, current coding agents frequently (e.g., in 80% of the cases) produce fabricated or invalidated experimental results, posing a major barrier to scientific reliability.

large language model, machine learning, natural language, (23 more...)

Neural Information Processing Systems

Genre:

Research Report > Promising Solution (1.00)
Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
(2 more...)

Industry: Information Technology (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

693e00827fd44bdfca210801fe1e6439-Paper-Position_Paper_Track.pdf

Neural Information Processing Systems | Jun-18-2026, 02:04:10 GMT

The meteoric rise of Artificial Intelligence (AI), with its rapidly expanding market capitalization, presents both transformative opportunities and critical challenges. Chief among these is the urgent need for a new, unified paradigm for trustworthy evaluation, as current benchmarks increasingly reveal critical vulnerabilities. Issues like data contamination and selective reporting by model developers fuel hype, while inadequate data quality control can lead to biased evaluations that, even if unintentionally, may favor specific approaches. As a flood of participants enters the AI space, this "Wild West" of assessment makes distinguishing genuine progress from exaggerated claims exceptionally difficult. Such ambiguity blurs scientific signals and erodes public confidence, much as unchecked claims would destabilize financial markets reliant on credible oversight from agencies like Moody's. In high-stakes human examinations (e.g., SAT, GRE), substantial effort is devoted to ensuring fairness and credibility; why settle for less in evaluating AI, especially given its profound societal impact? This position paper argues that a laissez-faire approach is untenable. For true and sustainable AI advancement, we call for a paradigm shift to a unified, live, and quality-controlled benchmarking framework, robust by construction rather than reliant on courtesy or goodwill.

artificial intelligence, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.28)

Industry:

Social Sector (0.66)
Information Technology > Security & Privacy (0.46)
Banking & Finance > Trading (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

DQVis Dataset: Natural Language to Biomedical Visualization

Neural Information Processing Systems | Jun-17-2026, 08:47:47 GMT

Biomedical research data portals are essential resources for scientific inquiry, and interactive exploratory visualizations are an integral component for querying such data repositories. Increasingly, machine learning is being integrated into visualization systems to create natural language interfaces where questions about data can be answered with visualizations, and follow-up questions can build on the previous state. This paper introduces a framework that takes abstract low-level questions about data and a visualization grammar specification that can answer such a question, reifies them with data entities and fields that meet certain constraints, and paraphrases the question language to produce the final collection of realized data-question-visualization triplets. Furthermore, we can link these foundational elements together to construct chains of queries, visualizations, and follow-up queries. We developed an open-source review interface for evaluating the results of these datasets. We applied this framework to five biomedical research data repositories, resulting in DQVis, a dataset of 1.08 million data-question-visualization triplets and 11.4 thousand two-step question samples. Five visualization experts provided feedback on the generated dataset through our review interface. We present a summary of their input and publish the full reviews as an additional resource alongside the dataset.
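The reification step described in this abstract can be pictured with a small Python sketch. Everything here is an assumption made for illustration: the placeholder syntax (`<E>`, `<F:q>`), the toy schema, the visualization spec keys, and the function names are hypothetical and are not taken from the DQVis framework. The paraphrasing step is omitted.

```python
from typing import Any, Dict, List

# Hypothetical abstract template: <E> stands for a data entity (table),
# <F:q> for a field constrained to be quantitative.
TEMPLATE: Dict[str, Any] = {
    "question": "What is the distribution of <F:q> across <E>?",
    "spec": {"source": "<E>", "mark": "bar", "x": {"field": "<F:q>", "bin": True}},
}


def fill(value: Any, bindings: Dict[str, str]) -> Any:
    """Recursively substitute placeholders in strings and nested dicts."""
    if isinstance(value, str):
        for placeholder, name in bindings.items():
            value = value.replace(placeholder, name)
        return value
    if isinstance(value, dict):
        return {key: fill(item, bindings) for key, item in value.items()}
    return value  # booleans, numbers, etc. pass through unchanged


def reify(template: Dict[str, Any], schema: Dict[str, Dict[str, str]]) -> List[Dict[str, Any]]:
    """Emit one data-question-visualization triplet per field meeting the constraint."""
    triplets = []
    for entity, fields in schema.items():
        for field, kind in fields.items():
            if kind != "quantitative":  # the constraint attached to <F:q>
                continue
            bindings = {"<E>": entity, "<F:q>": field}
            triplets.append({
                "data": entity,
                "question": fill(template["question"], bindings),
                "visualization": fill(template["spec"], bindings),
            })
    return triplets


# Toy schema (hypothetical): entity -> {field name: field type}
SCHEMA = {
    "donors": {"age": "quantitative", "sex": "nominal"},
    "samples": {"weight": "quantitative"},
}

for triplet in reify(TEMPLATE, SCHEMA):
    print(triplet["question"])
```

Only fields that satisfy the template's type constraint produce a triplet, which is how a single abstract question could expand into many concrete ones across repositories.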

artificial intelligence, machine learning, natural language, (16 more...)

Neural Information Processing Systems

Country:

North America > United States (1.00)
Europe (0.92)

Genre: Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.93)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

A Principled Approach to Randomized Selection under Uncertainty: Applications to Peer Review and Grant Funding

Neural Information Processing Systems | Jun-17-2026, 05:41:23 GMT

Many decision-making processes involve evaluating and selecting items, including scientific peer review, job hiring, school admissions, and investment decisions. These domains feature error-prone evaluations and uncertainty about outcomes, which undermine deterministic selection rules. Consequently, randomized selection mechanisms are gaining traction. However, current randomized approaches are ad hoc and, as we prove, inappropriate for their purported objectives. We propose a principled framework for randomized decision-making based on interval estimates of item quality. We introduce MERIT (Maximin Efficient Randomized Interval Top-k), which maximizes the worst-case expected number of top candidates selected under uncertainty represented by overlapping intervals. MERIT provides optimal resource allocation under an interpretable robustness notion. We develop a polynomial-time, practically efficient algorithm and prove our approach satisfies desirable axiomatic properties not guaranteed by existing methods. Experiments on synthetic peer review data from grant funding and conferences demonstrate that MERIT matches existing algorithms' expected utility under fully probabilistic models while outperforming them under our worst-case formulation.
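To see why overlapping interval estimates leave a top-k selection undetermined, the Python sketch below partitions items into those certainly in the top k, those certainly out, and those whose status depends on where the true qualities fall inside their intervals. This is not the MERIT algorithm, whose details the abstract does not give; the function name and the toy intervals are assumptions chosen for illustration.

```python
from typing import Dict, List, Tuple

Interval = Tuple[float, float]  # (lower, upper) bound on an item's quality


def partition_by_intervals(
    intervals: Dict[str, Interval], k: int
) -> Tuple[List[str], List[str], List[str]]:
    """Split items into (certain_in, uncertain, certain_out) for a top-k selection."""
    certain_in: List[str] = []
    uncertain: List[str] = []
    certain_out: List[str] = []
    for name, (low, high) in intervals.items():
        others = [iv for other, iv in intervals.items() if other != name]
        # Items that might outrank this one: their upper bound exceeds its lower bound.
        could_beat = sum(1 for _, other_high in others if other_high > low)
        # Items that surely outrank this one: their lower bound exceeds its upper bound.
        surely_beat = sum(1 for other_low, _ in others if other_low > high)
        if could_beat < k:
            certain_in.append(name)
        elif surely_beat >= k:
            certain_out.append(name)
        else:
            uncertain.append(name)
    return certain_in, uncertain, certain_out


# Toy example (hypothetical scores): pick k = 2 of four proposals.
proposals = {"A": (8.0, 9.0), "B": (5.0, 7.0), "C": (4.0, 6.0), "D": (1.0, 2.0)}
print(partition_by_intervals(proposals, k=2))
# → (['A'], ['B', 'C'], ['D'])
```

Here B and C overlap, so no deterministic rule can justify preferring one over the other from the intervals alone; this is the kind of situation in which a randomized mechanism becomes relevant.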

artificial intelligence, decision support system, proposal, (17 more...)

Neural Information Processing Systems

Country:

Europe (0.67)
North America > United States (0.46)

Genre:

Research Report > Experimental Study (1.00)
Overview (0.93)
Research Report > New Finding (0.67)

Industry:

Social Sector (0.70)
Education (0.48)
Government (0.46)
Banking & Finance > Trading (0.34)

Technology:

Information Technology > Decision Support Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.94)

Fostering the Ecosystem of AI for Social Impact Requires Expanding and Strengthening Evaluation Standards

Neural Information Processing Systems | Jun-15-2026, 21:19:53 GMT

There has been increasing research interest in AI/ML for social impact, and correspondingly more publication venues have refined review criteria for practice-driven AI/ML research. However, these review guidelines tend to most concretely recognize projects that simultaneously achieve deployment and novel ML methodological innovation. We argue that this introduces incentives for researchers that undermine the sustainability of a broader research ecosystem of social impact, which benefits from projects that make contributions on a single front (applied or methodological) that may better meet project partner needs. Our position is that researchers and reviewers in machine learning for social impact must simultaneously adopt: 1) a more expansive conception of social impacts beyond deployment and 2) more rigorous evaluations of the impact of deployed systems.

artificial intelligence, contribution, machine learning, (15 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.28)

Genre:

Research Report > Experimental Study (1.00)
Research Report > Strength High (0.68)

Industry: Social Sector (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

From Replication to Redesign: Exploring Pairwise Comparisons for LLM-Based Peer Review

Neural Information Processing Systems | Jun-14-2026, 20:56:23 GMT

The advent of large language models (LLMs) offers unprecedented opportunities to reimagine peer review beyond the constraints of traditional workflows. Despite these opportunities, prior efforts have largely focused on replicating traditional review workflows with LLMs serving as direct substitutes for human reviewers, while limited attention has been given to exploring new paradigms that fundamentally rethink how LLMs can participate in the academic review process. In this paper, we introduce and explore a novel mechanism that employs LLM agents to perform pairwise comparisons among manuscripts instead of individual scoring. By aggregating outcomes from substantial pairwise evaluations, this approach enables a more accurate and robust measure of relative manuscript quality. Our experiments demonstrate that this comparative approach significantly outperforms traditional rating-based methods in identifying high-impact papers. However, our analysis also reveals emergent biases in the selection process, notably a reduced novelty in research topics and an increased institutional imbalance. These findings highlight both the transformative potential of rethinking peer review with LLMs and critical challenges that future systems must address to ensure equity and diversity.
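The abstract does not say how pairwise outcomes are aggregated into a quality measure. One standard choice, assumed here purely for illustration, is the Bradley-Terry model fitted with minorization-maximization updates. In this Python sketch the function name, the manuscript labels, and the comparison data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def bradley_terry(
    outcomes: Iterable[Tuple[str, str]], iterations: int = 200
) -> Dict[str, float]:
    """Fit Bradley-Terry strengths from (winner, loser) pairs via MM updates."""
    wins: Dict[str, float] = defaultdict(float)      # total wins per item
    games: Dict[frozenset, float] = defaultdict(float)  # comparisons per unordered pair
    items = set()
    for winner, loser in outcomes:
        wins[winner] += 1.0
        games[frozenset((winner, loser))] += 1.0
        items.update((winner, loser))

    strength = {item: 1.0 for item in items}
    for _ in range(iterations):
        updated = {}
        for i in items:
            denom = 0.0
            for j in items:
                if j == i:
                    continue
                n_ij = games.get(frozenset((i, j)), 0.0)
                if n_ij:
                    denom += n_ij / (strength[i] + strength[j])
            updated[i] = wins[i] / denom if denom else strength[i]
        total = sum(updated.values())
        strength = {i: s / total for i, s in updated.items()}  # normalize to sum 1
    return strength


# Hypothetical comparison log: each tuple is (preferred manuscript, other manuscript).
log = (
    [("paper_a", "paper_b")] * 3 + [("paper_b", "paper_a")]
    + [("paper_b", "paper_c")] * 3 + [("paper_c", "paper_b")]
    + [("paper_a", "paper_c")] * 3 + [("paper_c", "paper_a")]
)
scores = bradley_terry(log)
print(sorted(scores, key=scores.get, reverse=True))
```

Each manuscript receives a single strength on a common scale, so the ranking reflects all comparisons jointly rather than any one reviewer call. The fit is well behaved only when no manuscript wins or loses every comparison it takes part in.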

large language model, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

MLR-Bench: Evaluating AI Agents on Open-Ended Machine Learning Research

Neural Information Processing Systems | Jun-13-2026, 19:22:06 GMT

Recent advancements in AI agents have demonstrated their growing potential to drive and support scientific discovery. In this work, we introduce MLR-Bench, a comprehensive benchmark for evaluating AI agents on open-ended machine learning research. MLR-Bench includes three key components: (1) 201 research tasks sourced from NeurIPS, ICLR, and ICML workshops covering diverse ML topics; (2) MLR-Judge, an automated evaluation framework combining LLM-based reviewers with carefully designed review rubrics to assess research quality; and (3) MLR-Agent, a modular agent scaffold capable of completing research tasks through four stages: idea generation, proposal formulation, experimentation, and paper writing. Our framework supports both stepwise assessment across these distinct research stages and end-to-end evaluation of the final research paper. We then use MLR-Bench to evaluate six frontier LLMs and an advanced coding agent, finding that while LLMs are effective at generating coherent ideas and well-structured papers, current coding agents frequently (e.g., in 80% of the cases) produce fabricated or invalidated experimental results, posing a major barrier to scientific reliability.

artificial intelligence, large language model, natural language, (12 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.88)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.72)

Review Networks for Caption Generation

Zhilin Yang, Ye Yuan, Yuexin Wu, William W. Cohen, Russ R. Salakhutdinov

Neural Information Processing Systems | May-1-2026, 05:55:20 GMT

We propose a novel extension of the encoder-decoder framework, called a review network. The review network is generic and can enhance any existing encoder-decoder model: in this paper, we consider RNN decoders with both CNN and RNN encoders. The review network performs a number of review steps with an attention mechanism on the encoder hidden states, and outputs a thought vector after each review step; the thought vectors are used as the input of the attention mechanism in the decoder. We show that conventional encoder-decoders are a special case of our framework.
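The data flow described in this abstract (attend over encoder hidden states, emit a thought vector per review step, then let the decoder attend over the thought vectors) can be sketched in plain Python. This is a heavily simplified illustration and not the authors' model: dot-product attention, the zero initial state, and the `tanh` update standing in for a learned recurrent unit are all assumptions, and no trained weights are involved.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(memory: List[Vector], query: Vector) -> Vector:
    """Dot-product attention: a weighted average of the memory vectors."""
    scores = [sum(m_d * q_d for m_d, q_d in zip(m, query)) for m in memory]
    weights = softmax(scores)
    dim = len(query)
    return [sum(w * m[d] for w, m in zip(weights, memory)) for d in range(dim)]


def review(hidden_states: List[Vector], steps: int) -> List[Vector]:
    """Run `steps` review steps and return one thought vector per step."""
    dim = len(hidden_states[0])
    state = [0.0] * dim  # hypothetical initial reviewer state
    thoughts = []
    for _ in range(steps):
        context = attend(hidden_states, state)
        # Stand-in for the learned recurrent update (assumption, no trained weights).
        state = [math.tanh(c + s) for c, s in zip(context, state)]
        thoughts.append(state)
    return thoughts


# Toy encoder output (hypothetical): three hidden states of dimension 2.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
thought_vectors = review(encoder_states, steps=4)

# The decoder attends over the thought vectors rather than the encoder states.
decoder_state = [0.2, -0.1]
decoder_context = attend(thought_vectors, decoder_state)
```

The number of review steps is independent of the input length, so the decoder always sees a fixed number of thought vectors.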

012a91467f210472fab4e11359bbfef6-AuthorFeedback.pdf

Neural Information Processing Systems | May-1-2026, 05:27:45 GMT

First, as R4 suggested, "symbolic tree" was more approachable for people in the ML community. Second, the symbolic tree is declared by the user using decorators and serves to represent high-level program constructs, which is different from the AST that represents all the syntactic structures for the program. For example, the full Python AST contains information about objects' class methods, whereas our symbolic representation does not. R4: "Second, most of their tool/language design could be summarized as adding some kind of non deterministic/parametric choice ... It's extension to ML does not introduce anything particularly new ..." We agree with R4 that symbolic programming and non-deterministic programming are well-studied topics in the PL community. However, we would like to emphasize that this work is the first to introduce such concepts to AutoML to significantly reduce engineering effort, which is a novel and useful contribution. For example, PyGlove leverages symbolic manipulation to decouple the search algorithm, search space and child program, which enabled us to unify the interface among search methods with and without weight sharing. To enable symbolic programming in Python, PyGlove implements an object model for maintaining the consistency of program state during symbolic manipulation. R4: "Provide the grammar in the main text" We understand the "grammar" here as a reference to the formal definition of the search space specification. We will revise current Appendix Table 3 into a formal definition, and add it to the "search space" sub-section.

artificial intelligence, pyglove, search space, (17 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.86)
