AITopics | Negative Result

Tight Asymptotics of Extreme Order Statistics

Neural Information Processing Systems, Jun-23-2026, 04:47:05 GMT

A classic statistical problem is to study the asymptotic behavior of the order statistics of a large number of independent samples taken from a distribution with finite expectation. This behavior has implications for several core problems in machine learning and economics -- including robust learning under adversarial noise, best-arm identification in bandit algorithms, revenue estimation in second-price auctions, and the analysis of tail-sensitive statistics used in out-of-distribution detection. The research question we tackle in this paper is: How large can the expectation of the ℓ-th maximum of the n samples be? For ℓ = 1, i.e., the maximum, this expectation is known to grow as o(n), which can be shown to be tight. We show that there is a sharp contrast when considering any fixed ℓ > 1. Surprisingly, in

artificial intelligence, data mining, machine learning, (19 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study > Negative Result (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.66)

AI Debate Aids Assessment of Controversial Claims

Neural Information Processing Systems, Jun-23-2026, 03:37:39 GMT

As AI grows more powerful, it will increasingly shape how we understand the world. But with this influence comes the risk of amplifying misinformation and deepening social divides--especially on consequential topics where factual accuracy directly impacts well-being. Scalable Oversight aims to ensure AI systems remain truthful even when their capabilities exceed those of their evaluators. Yet when humans serve as evaluators, their own beliefs and biases can impair judgment. We study whether AI debate can guide biased judges toward the truth by having two AI systems debate opposing sides of controversial factuality claims on COVID-19 and climate change where people hold strong prior beliefs.

final decision false round 1, large language model, machine learning, (21 more...)

Country:

Europe (1.00)
North America > United States > California (0.45)

Genre:

Research Report > Strength High (1.00)
Research Report > New Finding (1.00)
Research Report > Experimental Study > Negative Result (1.00)
(3 more...)

Industry:

Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Epidemiology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Belief Revision (0.66)

VMDT: Decoding the Trustworthiness of Video Foundation Models

Neural Information Processing Systems, Jun-23-2026, 01:31:42 GMT

As foundation models become more sophisticated, ensuring their trustworthiness becomes increasingly critical; yet, unlike text and image, the video modality still lacks comprehensive trustworthiness benchmarks. We introduce VMDT (VideoModal DecodingTrust), the first unified platform for evaluating text-to-video (T2V) and video-to-text (V2T) models across five key trustworthiness dimensions: safety, hallucination, fairness, privacy, and adversarial robustness. Through our extensive evaluation of 7 T2V models and 19 V2T models using VMDT, we uncover several significant insights. For instance, all open-source T2V models evaluated fail to recognize harmful queries and often generate harmful videos, while exhibiting higher levels of unfairness compared to image modality models. In V2T models, unfairness and privacy risks rise with scale, whereas hallucination and adversarial robustness improve--though overall performance remains low. Uniquely, safety shows no correlation with model size, implying that factors other than scale govern current safety levels. Our findings highlight the urgent need for developing more robust and trustworthy video foundation models, and VMDT provides a systematic framework for measuring and tracking progress toward this goal.

large language model, machine learning, natural language, (19 more...)

Country: North America > United States > California (0.45)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study > Negative Result (0.65)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Government (1.00)
(5 more...)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(3 more...)

ASTROVISBENCH: A Code Benchmark for Scientific Computing and Visualization in Astronomy

Neural Information Processing Systems, Jun-22-2026, 16:28:36 GMT

Large Language Models (LLMs) are being explored for applications in scientific research, including their capabilities to synthesize literature, answer research questions, generate research ideas, and even conduct computational experiments. Ultimately, our goal is for these models to help scientists derive novel scientific insights. In many areas of science, such insights often arise from processing and visualizing data to understand its patterns. However, evaluating whether an LLM-mediated scientific workflow produces outputs conveying the correct scientific insights is challenging and has not been addressed in past work. We introduce ASTROVISBENCH, the first benchmark for both scientific computing and visualization in the astronomy domain. ASTROVISBENCH judges a language model's ability to both (1) create astronomy-specific workflows to process and analyze data and (2) visualize the results of these workflows through complex plots.

large language model, machine learning, natural language, (20 more...)

Country: North America > United States (1.00)

Genre:

Workflow (1.00)
Research Report > Experimental Study > Negative Result (0.34)

Industry:

Information Technology (0.93)
Education (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

On Group Sufficiency Under Label Bias

Neural Information Processing Systems, Jun-22-2026, 06:07:44 GMT

Real-world classification datasets often contain label bias, where observed labels differ systematically from the true labels at different rates for different demographic groups. Machine learning models trained on such datasets may then exhibit disparities in predictive performance across these groups. In this work, we characterize the problem of learning fair classification models with respect to the underlying ground truth labels when given only label-biased data. We focus on the particular fairness definition of group sufficiency, i.e. equal calibration of risk scores across protected groups. We theoretically show that enforcing fairness with respect to label-biased data necessarily results in group miscalibration with respect to the true labels. We then propose a regularizer which minimizes an upper bound on the sufficiency gap by penalizing a conditional mutual information term. Across experiments on eight tabular, image, and text datasets with both synthetic and real label noise, we find that our method reduces the sufficiency gap by up to 7.2% with no significant decrease in overall accuracy.
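To make the notion of group sufficiency in this abstract concrete, the following is a minimal sketch of one way a sufficiency gap could be measured: bin the risk scores, compute the observed positive rate per group in each bin, and report the largest between-group difference. The function names, the equal-width binning, and the max-over-bins gap are illustrative assumptions, not the paper's definition or its regularizer. The toy data also shows how under-recording positives for one group (label bias) opens a gap that is absent under the true labels.

```python
from collections import defaultdict


def calibration_by_group(scores, labels, groups, n_bins=5):
    """Return {group: {bin_index: observed positive rate}} using equal-width score bins.

    Assumption: scores are risk scores in [0, 1] and labels are 0/1.
    """
    # group -> bin -> [number of positives, number of samples]
    cells = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for s, y, g in zip(scores, labels, groups):
        b = min(int(s * n_bins), n_bins - 1)  # a score of 1.0 falls in the last bin
        cell = cells[g][b]
        cell[0] += y
        cell[1] += 1
    return {
        g: {b: pos / cnt for b, (pos, cnt) in bins.items()}
        for g, bins in cells.items()
    }


def sufficiency_gap(scores, labels, groups, n_bins=5):
    """Largest between-group difference in positive rate within any shared score bin."""
    rates = calibration_by_group(scores, labels, groups, n_bins)
    gap = 0.0
    for b in range(n_bins):
        in_bin = [r[b] for r in rates.values() if b in r]
        if len(in_bin) >= 2:  # only compare bins populated by at least two groups
            gap = max(gap, max(in_bin) - min(in_bin))
    return gap


# Toy data (hypothetical): every sample receives the same risk score.
scores = [0.9] * 8
groups = ["a"] * 4 + ["b"] * 4
true_labels = [1, 1, 1, 0, 1, 1, 1, 0]  # both groups: 75% positive
observed = [1, 1, 1, 0, 1, 0, 0, 0]     # positives in group "b" under-recorded

print(sufficiency_gap(scores, true_labels, groups))  # 0.0
print(sufficiency_gap(scores, observed, groups))     # 0.5
```

In this toy case the scores are calibrated equally for both groups against the true labels, yet appear miscalibrated against the observed labels, which is the kind of mismatch the abstract describes.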

artificial intelligence, dataset, machine learning, (15 more...)

Country: North America > United States (0.27)

Genre: Research Report > Experimental Study > Negative Result (0.34)

Industry:

Health & Medicine (1.00)
Education > Educational Setting (1.00)
Information Technology (0.92)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

SimSort: A Data-Driven Framework for Spike Sorting by Large-Scale Electrophysiology Simulation

Neural Information Processing Systems, Jun-21-2026, 09:35:51 GMT

Spike sorting is an essential process in neural recording, which identifies and separates electrical signals from individual neurons recorded by electrodes in the brain, enabling researchers to study how specific neurons communicate and process information. Although there exist a number of spike sorting methods which have contributed to significant neuroscientific breakthroughs, many are heuristically designed, making it challenging to verify their correctness due to the difficulty of obtaining ground truth labels from real-world neural recordings. In this work, we explore a data-driven, deep learning-based approach. We begin by creating a large-scale dataset through electrophysiology simulations using biologically realistic computational models.

artificial intelligence, machine learning, natural language, (20 more...)

Country: Asia (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study > Negative Result (0.46)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Exploring the Translation Mechanism of Large Language Models

Neural Information Processing Systems, Jun-20-2026, 08:19:33 GMT

While large language models (LLMs) demonstrate remarkable success in multilingual translation, their internal core translation mechanisms, even at the fundamental word level, remain insufficiently understood. To address this critical gap, this work introduces a systematic framework for interpreting the mechanism behind LLM translation from the perspective of computational components. This paper first proposes subspace-intervened path patching for precise, fine-grained causal analysis, enabling the detection of components crucial to translation tasks and subsequently characterizing their behavioral patterns in human-interpretable terms. Comprehensive experiments reveal that translation is predominantly driven by a sparse subset of components: specialized attention heads serve critical roles in extracting source language, translation indicators, and positional features, which are then integrated and processed by specific multi-layer perceptrons (MLPs) into intermediary English-centric latent representations before ultimately yielding the final translation. The significance of these findings is underscored by the empirical demonstration that targeted fine-tuning of a minimal parameter subset (< 5%) enhances translation performance while preserving general capabilities. This result further indicates that these crucial components generalize effectively to sentence-level translation and are instrumental in elucidating more intricate translation tasks. Code is available at this URL.

large language model, machine learning, translation, (20 more...)

Country:

North America > United States (0.46)
Asia > China (0.46)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study > Negative Result (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.54)

Time-Masked Transformers with Lightweight Test-Time Adaptation for Neural Speech Decoding

Neural Information Processing Systems, Jun-19-2026, 07:54:03 GMT

Speech neuroprostheses aim to restore communication for people with severe paralysis by decoding speech directly from neural activity. To accelerate algorithmic progress, a recent benchmark released intracranial recordings from a paralyzed participant attempting to speak, along with a baseline decoding algorithm. Prior work on the benchmark showed impressive accuracy gains. However, these gains increased computational costs and were not demonstrated in a real-time decoding setting. Here, we make three contributions that pave the way towards accurate, efficient, and real-time neural speech decoding.

large language model, machine learning, natural language, (20 more...)

Country: North America > United States > California (0.28)

Genre:

Research Report > New Finding (0.93)
Research Report > Experimental Study > Negative Result (0.46)

Industry: Health & Medicine > Therapeutic Area (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.96)
Information Technology > Artificial Intelligence > Speech (0.93)
Information Technology > Artificial Intelligence > Vision (0.93)

Scalable, Explainable and Provably Robust Anomaly Detection with One-Step Flow Matching

Neural Information Processing Systems, Jun-18-2026, 23:42:24 GMT

We introduce Time-Conditioned Contraction Matching (TCCM), a novel method for semi-supervised anomaly detection in tabular data. TCCM is inspired by flow matching, a recent generative modeling framework that learns velocity fields between probability distributions and has shown strong performance compared to diffusion models and generative adversarial networks. Instead of directly applying flow matching as originally formulated, TCCM builds on its core idea--learning velocity fields between distributions--but simplifies the framework by predicting a time-conditioned contraction vector toward a fixed target (the origin) at each sampled time step. This design offers three key advantages: (1) a lightweight and scalable training objective that removes the need for solving ordinary differential equations during training and inference; (2) an efficient scoring strategy called one time-step deviation, which quantifies deviation from expected contraction behavior in a single forward pass, addressing the inference bottleneck of existing continuous-time models such as DTE (a diffusion-based model with leading anomaly detection accuracy but heavy inference cost); and (3) explainability and provable robustness, as the learned velocity field operates directly in input space, making the anomaly score inherently feature-wise attributable; moreover, the score function is Lipschitz-continuous with respect to the input, providing theoretical guarantees under small perturbations. Extensive experiments on the ADBench benchmark show that TCCM strikes a favorable balance between detection accuracy and inference cost, outperforming state-of-the-art methods--especially on high-dimensional and large-scale datasets.

artificial intelligence, data mining, machine learning, (19 more...)

Country:

Europe (0.27)
North America > United States (0.27)
Asia (0.27)

Genre:

Research Report > New Finding (1.00)
Instructional Material (0.87)
Research Report > Promising Solution (0.67)
Research Report > Experimental Study > Negative Result (0.45)

Industry:

Information Technology > Security & Privacy (0.67)
Health & Medicine > Diagnostic Medicine (0.46)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.92)

Distributive Fairness in Large Language Models: Evaluating Alignment with Human Values

Neural Information Processing Systems, Jun-18-2026, 21:49:47 GMT

The growing interest in employing large language models (LLMs) for decision-making in social and economic contexts has raised questions about their potential to function as agents in these domains. A significant number of societal problems involve the distribution of resources, where fairness, along with economic efficiency, plays a critical role in the desirability of outcomes. In this paper, we examine whether LLM responses adhere to fundamental fairness concepts such as equitability, envy-freeness, and Rawlsian maximin, and investigate their alignment with human preferences. We evaluate the performance of several LLMs, providing a comparative benchmark of their ability to reflect these measures. Our results demonstrate a lack of alignment between current LLM responses and human distributional preferences. Moreover, LLMs are unable to utilize money as a transferable resource to mitigate inequality. Nonetheless, we demonstrate a stark contrast when (some) LLMs are tasked with selecting from a predefined menu of options rather than generating one. In addition, we analyze the robustness of LLM responses to variations in semantic factors (e.g., intentions or personas) or non-semantic prompting changes (e.g., templates or orderings). Finally, we highlight potential strategies aimed at enhancing the alignment of LLM behavior with well-established fairness concepts.
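The fairness concepts named in this abstract have standard textbook definitions that can be checked mechanically for a given allocation. The sketch below does so for indivisible items under additive valuations, which is a simplifying assumption made here for illustration; the agents, items, values, and function names are hypothetical, and the paper's actual tasks (including money as a transferable resource) are not modeled.

```python
def utility(valuation, bundle):
    """Additive utility: sum of one agent's values for the items in a bundle."""
    return sum(valuation[item] for item in bundle)


def is_envy_free(valuations, allocation):
    """True if no agent values another agent's bundle strictly more than their own."""
    return all(
        utility(valuations[i], allocation[i]) >= utility(valuations[i], allocation[j])
        for i in allocation
        for j in allocation
    )


def is_equitable(valuations, allocation, tol=1e-9):
    """True if all agents derive the same utility from their own bundles."""
    own = [utility(valuations[i], allocation[i]) for i in allocation]
    return max(own) - min(own) <= tol


def maximin_welfare(valuations, allocation):
    """Rawlsian objective value: the utility of the worst-off agent."""
    return min(utility(valuations[i], allocation[i]) for i in allocation)


# Hypothetical two-agent instance with three items.
valuations = {
    "ann": {"laptop": 6, "bike": 3, "book": 1},
    "bob": {"laptop": 4, "bike": 4, "book": 2},
}
balanced = {"ann": ["laptop"], "bob": ["bike", "book"]}
lopsided = {"ann": ["laptop", "bike"], "bob": ["book"]}

for name, alloc in [("balanced", balanced), ("lopsided", lopsided)]:
    print(
        name,
        is_envy_free(valuations, alloc),
        is_equitable(valuations, alloc),
        maximin_welfare(valuations, alloc),
    )
# balanced True True 6
# lopsided False False 2
```

A Rawlsian maximin allocation would be one that maximizes `maximin_welfare` over all feasible allocations; here the balanced allocation achieves a higher worst-off utility than the lopsided one.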

large language model, machine learning, natural language, (18 more...)

Country: