AITopics | ccuracy

Collaborating Authors

ccuracy

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Bag of Tricks for Inference-time Computation of LLM Reasoning

Neural Information Processing SystemsJun-23-2026, 12:22:12 GMT

With the advancement of large language models (LLMs), solving complex reasoning tasks has gained increasing attention. Inference-time computation methods (e.g., Best-of-N, beam search) are particularly valuable as they can enhance reasoning performance without modifying model parameters or requiring additional training. However, these techniques come with implementation challenges, and most existing methods remain at the proof-of-concept stage with limited practical adoption due to their computational complexity and varying effectiveness across different tasks. In this paper, we investigate and benchmark diverse inference-time computation strategies across reasoning tasks of varying complexity. Since most current methods rely on a proposer-verifier pipeline that first generates candidate solutions (e.g., reasoning solutions) and then selects the best one based on reward signals (e.g., RLHF rewards, process rewards), our research focuses on optimizing both candidate solution generation (e.g., instructing prompts, hyperparameters such as temperature and top-p) and reward mechanisms (e.g., self-evaluation, reward types). Through extensive experiments (more than 20,000 A100-80GGPU hours with over 1,000 experiments) across a variety of models (e.g., Llama, Qwen, and Mistral families) of various sizes, our ablation studies reveal that previously overlooked strategies can significantly enhance performance (e.g., tuning temperature can improve reasoning task performance by up to 5%). Furthermore, we establish a standardized benchmark for inference-time computation by systematically evaluating six representative methods across eight reasoning tasks. These findings provide a stronger foundation for future research.

large language model, machine learning, natural language, (16 more...)

Neural Information Processing Systems

Country: Asia (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.35)

Add feedback

LoRASuite: Efficient LoRA Adaptation Across Large Language Model Upgrades

Neural Information Processing SystemsJun-23-2026, 03:36:20 GMT

As Large Language Models (LLMs) are frequently updated, LoRA weights trained on earlier versions quickly become obsolete. The conventional practice of retraining LoRA weights from scratch on the latest model is costly, time-consuming, and environmentally detrimental, particularly as the diversity of LLMs and downstream tasks expands. This motivates a critical question: "How can we efficiently leverage existing LoRA weights to adapt to newer model versions?" To address this, we propose LoRASuite, a modular approach tailored specifically to various types of LLM updates. First, we compute a transfer matrix utilizing known parameters from both old and new LLMs.

large language model, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Reinforcement Learning for Reasoning in Large Language Models with One Training Example

Neural Information Processing SystemsJun-21-2026, 23:32:23 GMT

We show that reinforcement learning with verifiable reward using one training example (1-shot RLVR) is effective in incentivizing the mathematical reasoning capabilities of large language models (LLMs).

large language model, machine learning, natural language, (21 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Education (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)

Add feedback

Mint Test Time Adaptation of VisionLanguage Models against Common Corruptions

Neural Information Processing SystemsJun-17-2026, 09:03:36 GMT

Pretrained vision-language models such as CLIP achieve strong zero-shot generalization but remain vulnerable to distribution shifts caused by input corruptions. In this work, we investigate how corruptions affect CLIP's image embeddings and uncover a consistent phenomenon we term as embedding variance collapse, where both intra-class and inter-class variances shrink as corruption severity increases. We find that this collapse is closely tied to performance degradation, with inter-class variance strongly correlated with classification accuracy. To explain this phenomenon, we analyze how corruptions alter the structure of the embedding space. Our theoretical results suggest that the visual encoder tends to encode corruption-related signals, which dilute class-discriminative features and compress the representation geometry. We further show that maximizing inter-class variance, even when estimated from pseudo-labels, can provably enhance embedding quality. Based on this insight, we propose Mint, a simple test-time adaptation method that maximizes pseudo-label-based inter-class variance on the fly using cumulative prototypes and gradient estimates. Mintoperates effectively with small batch sizes and consistently improves performance across multiple corruption benchmarks and CLIP architectures. Our code is available at https://github.com/baowenxuan/Mint.

large language model, machine learning, variance, (20 more...)

Neural Information Processing Systems

Country:

North America > United States (1.00)
Europe (1.00)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.87)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.92)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.48)

Add feedback

Unveiling Memorization-Generalization Coexistence: A Case Study on Arithmetic Tasks with Label Noise

Liu, Linyu, Lu, Pinyan

arXiv.org Machine LearningMay-19-2026

Highly over-parameterized models can simultaneously memorize noisy labels and generalize well, yet how these behaviors coexist remains poorly understood. In this work, we investigate the underlying mechanisms of this coexistence using modular arithmetic tasks under heavy label noise. Through extensive experiments on two-layer neural networks, we find that larger models tend to generalize better under appropriate optimization and model configurations, while noisy labels are memorized faster than clean data. Over-parameterized models internally form a generalization structure, but its expression in the output is suppressed by the need to fit noisy labels. Remarkably, even with 80\% label noise, near-perfect test accuracy can be achieved by extracting this internal structure using frequency-based methods. We further propose a task-agnostic method to partition networks into generalization and memorization components. Although this subnetwork improves generalization, it is limited compared with frequency-based extraction, indicating that the generalization structure is distributed across neurons and motivating the development of new tools to retrieve generalizable knowledge from over-parameterized networks.

artificial intelligence, generalization, machine learning, (16 more...)

arXiv.org Machine Learning

2605.18022

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning > Rote Learning (0.64)

Add feedback

Revisiting Out of distribution Robustness in NLP Benchmark Analysis and LLMs Evaluations

Neural Information Processing SystemsApr-29-2026, 12:56:08 GMT

We find that the distribution shift settings in previous studies commonly lack adequate challenges, hindering the accurate evaluation of OOD robustness. To address these issues, we propose a benchmark construction protocol that ensures clear differentiation and challenging distribution shifts. Then we introduce BOSS, a Benchmark suite for Out-of-distribution robustneSS evaluation covering 5 tasks and 20 datasets. Based on BOSS, we conduct a series of experiments on pretrained language models for analysis and evaluation of OOD robustness. First, for vanilla fine-tuning, we examine the relationship between in-distribution (ID) and OOD performance. We identify three typical types that unveil the inner learning mechanism, which could potentially facilitate the forecasting of OOD robustness, correlating with the advancements on ID datasets. Then, we evaluate 5 classic methods on BOSS and find that, despite exhibiting some effectiveness in specific cases, they do not offer significant improvement compared to vanilla fine-tuning. Further, we evaluate 5 LLMs with various adaptation paradigms and find that when sufficient ID data is available, fine-tuning domain-specific models outperform LLMs on ID examples significantly.

large language model, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Genre: Research Report (0.45)

Industry: Information Technology (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Stimulative Training of Residual Networks: ASocial Psychology Perspective of Loafing

Neural Information Processing SystemsApr-24-2026, 19:55:32 GMT

We further verify that stimulative training can well handle the loafing problem on different datasets and residual networks. As shown in Fig. r1, we can see that stimulative training can always improve the performance of a given residual network and all of its sub-networks by a larger margin on various residual networks and benchmark datasets. In other words, different residual networks trained on different datasets invariably suffer from the problem of network loafing, which can be well solved by the proposed stimulative training strategy. Figure r1: Stimulative training can improve the performance of a given residual network and all of its sub-networks significantly. We further verify it on various residual networks and benchmark datasets.

artificial intelligence, ccuracy, machine learning, (13 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

Add feedback

Leveraging Teleconnections with Physics-Informed Graph Attention Networks for Long-Range Extreme Rainfall Forecasting in Thailand

Chobtham, Kiattikun, Sarinnapakorn, Kanoksri, Torsri, Kritanai, Deeprasertkul, Prattana, Kamma, Jirawan

arXiv.org Artificial IntelligenceOct-28-2025

Accurate rainfall forecasting, particularly for extreme events, remains a significant challenge in climatology and the Earth system. This paper presents novel physics-informed Graph Neural Networks (GNNs) combined with extreme-value analysis techniques to improve gauge-station rainfall predictions across Thailand. The model leverages a graph-structured representation of gauge stations to capture complex spatiotemporal patterns, and it offers explainability through teleconnections. We preprocess relevant climate indices that potentially influence regional rainfall. The proposed Graph Attention Network with Long Short-Term Memory (Attention-LSTM) applies the attention mechanism using initial edge features derived from simple orographic-precipitation physics formulation. The embeddings are subsequently processed by LSTM layers. To address extremes, we perform Peak-Over-Threshold (POT) mapping using the novel Spatial Season-aware Generalized Pareto Distribution (GPD) method, which overcomes limitations of traditional machine-learning models. Experiments demonstrate that our method outperforms well-established baselines across most regions, including areas prone to extremes, and remains strongly competitive with the state of the art. Compared with the operational forecasting system SEAS5, our real-world application improves extreme-event prediction and offers a practical enhancement to produce high-resolution maps that support decision-making in long-term water management.

artificial intelligence, deep learning, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2510.12328

Country: Asia > Thailand (1.00)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

Add feedback

Supplementary Materials A Constraint Explanation for Problem 2 Constraint 2b checks the loss of each sample and the derivation is shown as follows

Neural Information Processing SystemsAug-22-2025, 00:02:18 GMT

Constraint 2b checks the loss of each sample and the derivation is shown as follows. Constraint 2d to 2i are mainly adopted from Bertsimas and Dunn [2019] on Chapter 8.2. Here we briefly explain the meaning and derivations of these constraints. Constraint 2g and 2h are set to ensure that if there's no split on Constraint 2i enforces the hierarchical structure of the tree. Algorithm 1 depicts the details of the Branch-and-bound scheme for training the optimal decision tree.

dataset, node, sample reduction, (14 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.50)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.48)

Add feedback

Experimenting with Normalization Layers in Federated Learning on non-IID scenarios

Casella, Bruno, Esposito, Roberto, Sciarappa, Antonio, Cavazzoni, Carlo, Aldinucci, Marco

arXiv.org Artificial IntelligenceMar-19-2023

Training Deep Learning (DL) models require large, high-quality datasets, often assembled with data from different institutions. Federated Learning (FL) has been emerging as a method for privacy-preserving pooling of datasets employing collaborative training from different institutions by iteratively globally aggregating locally trained models. One critical performance challenge of FL is operating on datasets not independently and identically distributed (non-IID) among the federation participants. Even though this fragility cannot be eliminated, it can be debunked by a suitable optimization of two hyper-parameters: layer normalization methods and collaboration frequency selection. In this work, we benchmark five different normalization layers for training Neural Networks (NNs), two families of non-IID data skew, and two datasets. Results show that Batch Normalization, widely employed for centralized DL, is not the best choice for FL, whereas Group and Layer Normalization consistently outperform Batch Normalization. Similarly, frequent model aggregation decreases convergence speed and mode quality.

artificial intelligence, machine learning, normalization layer, (14 more...)

arXiv.org Artificial Intelligence

2303.1063

Country: