AITopics | artificial intelligence

artificial intelligence

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question "What is Artificial Intelligence?" and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Reference Trustable Decoding: A Training-Free Augmentation Paradigm for Large Language Models

Neural Information Processing Systems | Jun-2-2025, 13:21:47 GMT

Large language models (LLMs) have rapidly advanced and demonstrated impressive capabilities. In-Context Learning (ICL) and Parameter-Efficient Fine-Tuning (PEFT) are currently two mainstream methods for adapting LLMs to downstream tasks. ICL typically constructs a few-shot learning scenario, either manually or by setting up a Retrieval-Augmented Generation (RAG) system, helping models quickly grasp domain knowledge or question-answering patterns without changing model parameters. However, this approach involves trade-offs, such as slower inference speed and increased space occupancy. PEFT assists the model in adapting to tasks through minimal parameter modifications, but the training process still has high hardware requirements, even with a small number of parameters involved. To address these challenges, we propose Reference Trustable Decoding (RTD), a paradigm that allows models to quickly adapt to new tasks without fine-tuning while maintaining low inference costs. RTD constructs a reference datastore from the provided training examples and optimizes the LLM's final vocabulary distribution by flexibly selecting suitable references based on the input, resulting in more trustable responses and enabling the model to adapt to downstream tasks at a low cost.
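The idea of adjusting the final vocabulary distribution with references selected from a datastore can be illustrated with a generic nearest-neighbor sketch. This is not the authors' RTD procedure: the squared-distance similarity, the top-k selection, the softmax weighting, the interpolation weight, and all function names below are assumptions made for illustration only.

```python
import math

def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def reference_distribution(query, datastore, vocab_size, k=2, temperature=1.0):
    """Turn the k nearest references into a distribution over token ids.

    datastore is a list of (key_vector, token_id) pairs (hypothetical layout).
    """
    nearest = sorted(datastore, key=lambda item: squared_distance(query, item[0]))[:k]
    scores = [-squared_distance(query, key) / temperature for key, _ in nearest]
    top = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    dist = [0.0] * vocab_size
    for w, (_, token_id) in zip(weights, nearest):
        dist[token_id] += w / total
    return dist

def blend(lm_dist, ref_dist, weight=0.5):
    """Interpolate the model's distribution with the reference distribution."""
    return [(1 - weight) * p + weight * q for p, q in zip(lm_dist, ref_dist)]

# Toy example: two tokens, three stored references.
datastore = [([0.0, 0.0], 0), ([0.0, 1.0], 0), ([5.0, 5.0], 1)]
lm_dist = [0.4, 0.6]  # the model alone slightly prefers token 1
ref_dist = reference_distribution([0.0, 0.2], datastore, vocab_size=2)
final_dist = blend(lm_dist, ref_dist)
```

In this toy case both nearest references vote for token 0, so the blended distribution shifts toward token 0 without any parameter update.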

large language model, machine learning, natural language, (19 more...)

Country:

North America (0.67)
Asia > China > Hubei Province (0.14)
Europe > Austria > Vienna (0.14)

Genre: Research Report > Experimental Study (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Diffusion Improves Graph Learning

Neural Information Processing Systems | Jun-2-2025, 13:21:35 GMT

Graph convolution is the core of most Graph Neural Networks (GNNs) and is usually approximated by message passing between direct (one-hop) neighbors. In this work, we remove the restriction of using only the direct neighbors by introducing a powerful, yet spatially localized graph convolution: Graph diffusion convolution (GDC). GDC leverages generalized graph diffusion, examples of which are the heat kernel and personalized PageRank. It alleviates the problem of noisy and often arbitrarily defined edges in real graphs. We show that GDC is closely related to spectral-based models and thus combines the strengths of both spatial (message passing) and spectral methods. We demonstrate that replacing message passing with graph diffusion convolution consistently leads to significant performance improvements across a wide range of models on both supervised and unsupervised tasks and a variety of datasets. Furthermore, GDC is not limited to GNNs but can trivially be combined with any graph-based model or algorithm (e.g.
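To make the personalized PageRank example of graph diffusion concrete, the sketch below builds a diffusion matrix as a truncated power series over a random-walk transition matrix and then drops small entries. The abstract does not spell out these details, so the column-stochastic normalization, the truncation length, the threshold, and all function names are assumptions; only the coefficients alpha * (1 - alpha)^k are the standard personalized PageRank weights.

```python
def matmul(a, b):
    """Multiply two dense matrices given as lists of rows."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]

def transition_matrix(adj):
    """Column-stochastic random-walk matrix T = A D^-1 (assumed normalization)."""
    n = len(adj)
    deg = [sum(adj[i][j] for i in range(n)) for j in range(n)]  # column sums
    return [[adj[i][j] / deg[j] if deg[j] else 0.0 for j in range(n)]
            for i in range(n)]

def ppr_diffusion(adj, alpha=0.15, num_terms=50):
    """Truncated series S = sum_k alpha * (1 - alpha)^k * T^k."""
    n = len(adj)
    t = transition_matrix(adj)
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # T^0
    s = [[0.0] * n for _ in range(n)]
    for k in range(num_terms):
        coeff = alpha * (1 - alpha) ** k
        for i in range(n):
            for j in range(n):
                s[i][j] += coeff * power[i][j]
        power = matmul(t, power)
    return s

def sparsify(s, eps):
    """Keep only entries of at least eps so the result stays localized."""
    return [[v if v >= eps else 0.0 for v in row] for row in s]

# Path graph 0 - 1 - 2 - 3: nodes 0 and 2 share no edge.
adj = [[0, 1, 0, 0],
       [1, 0, 1, 0],
       [0, 1, 0, 1],
       [0, 0, 1, 0]]
S = ppr_diffusion(adj)
S_sparse = sparsify(S, eps=0.05)
```

Here `S[2][0]` is positive even though nodes 0 and 2 are not adjacent, which is the sense in which diffusion reaches beyond one-hop neighbors.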

artificial intelligence, graph, machine learning, (17 more...)

Country:

North America > Canada (0.14)
Europe > Germany (0.14)

Genre: Research Report > New Finding (0.46)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

which, to the best of our knowledge, is not covered by existing work. TD and SGD: Our unified analysis covers both TD and SGD, where TD is not covered by any existing supervised

Neural Information Processing Systems | Jun-2-2025, 13:21:11 GMT

We appreciate the valuable comments from the reviewers. In contrast, in supervised learning, such a matrix is the Hessian, which must be symmetric. A "straightforward adaptation" of existing supervised learning analysis does not yield the global convergence of TD. Nonconvex mirror descent: Most existing analysis of mirror descent's convergence to a global optimum builds Error propagation: RL is divided into policy-based and value-based approaches. In particular, the Q-function tracked in Q-learning is not the action-value function of any policy.

artificial intelligence, machine learning, td and sgd, (15 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.36)

Decision-Focused Learning with Directional Gradients

Neural Information Processing Systems | Jun-2-2025, 13:19:08 GMT

We propose a novel family of decision-aware surrogate losses, called Perturbation Gradient (PG) losses, for the predict-then-optimize framework. The key idea is to connect the expected downstream decision loss with the directional derivative of a particular plug-in objective, and then approximate this derivative using zeroth-order gradient techniques. Unlike the original decision loss, which is typically piecewise constant and discontinuous, our new PG losses are Lipschitz continuous, difference-of-concave functions that can be optimized using off-the-shelf gradient-based methods. Most importantly, unlike existing surrogate losses, the approximation error of our PG losses vanishes as the number of samples grows. Hence, optimizing our surrogate loss yields a best-in-class policy asymptotically, even in misspecified settings. This is the first such result in misspecified settings, and we provide numerical evidence confirming our PG losses substantively outperform existing proposals when the underlying model is misspecified.
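The link between a decision loss and a directional derivative can be checked numerically on a toy problem. The sketch below assumes the plug-in objective is the optimal value V(c) = min over feasible z of c·z for a small finite feasible set, and uses a backward finite difference with step h; the abstract does not state these choices, and the function names are hypothetical. By Danskin's theorem, when the minimizer is unique, the derivative of V at the prediction in the direction of the true cost equals the true cost of the chosen decision.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def plug_in_value(cost, feasible):
    """Optimal value V(cost) = min over feasible decisions z of cost . z."""
    return min(dot(cost, z) for z in feasible)

def best_decision(cost, feasible):
    """Decision that minimizes cost . z over the feasible set."""
    return min(feasible, key=lambda z: dot(cost, z))

def decision_loss(pred, true, feasible):
    """True cost incurred by acting on the predicted cost vector."""
    return dot(true, best_decision(pred, feasible))

def pg_loss_backward(pred, true, feasible, h):
    """Backward finite difference of V at pred in the direction of true."""
    shifted = [p - h * t for p, t in zip(pred, true)]
    return (plug_in_value(pred, feasible) - plug_in_value(shifted, feasible)) / h

# Toy problem: choose exactly one of two items at minimum cost.
feasible = [(1, 0), (0, 1)]
pred = [1.0, 2.0]   # predicted costs: item 0 looks cheaper
true = [1.0, 3.0]   # true costs

exact = decision_loss(pred, true, feasible)
small_step = pg_loss_backward(pred, true, feasible, h=0.1)
large_step = pg_loss_backward(pred, true, feasible, h=1.0)
```

With a small step the surrogate matches the decision loss in this example, while a larger step gives a different, smoothed value; the step size therefore trades approximation accuracy against smoothness.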

artificial intelligence, experiment, machine learning, (18 more...)

Country: North America > United States > California (0.28)

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.48)

a novel and scalable method for inferring a continuous target as well as representations for epistemic and aleatoric

Neural Information Processing Systems | Jun-2-2025, 13:18:25 GMT

We thank the reviewers for their very constructive and detailed feedback on our manuscript. "Confused evidence": As R1 correctly states, the regularizer captures scenarios where the evidence However, we do not believe that the approach "conflates Further details and analysis are added to the manuscript. AUC: The histograms (and CDFs) provided in Figs. 5, 6, and S5 (as in [21], [Nalisnick, E. et al. '18], and others) are richer performance statistics and directly reduce to the requested To address these concerns, we have added all AUC-ROC values to our performance charts. Adversarial: We updated the implementation details of the attack method (FGSM). R2: 1. Figure 1 aleatoric: Within the training region there are very few differences, which can be attributed to intrinsic OOD there is much more variability, aligning with MVE [18, 28].

artificial intelligence, manuscript, novel and scalable method, (11 more...)

Technology: Information Technology > Artificial Intelligence (0.50)

AmoebaLLM: Constructing Any-Shape Large Language Models for Efficient and Instant Deployment

Neural Information Processing Systems | Jun-2-2025, 13:18:07 GMT

Motivated by the transformative capabilities of large language models (LLMs) across various natural language tasks, there has been a growing demand to deploy these models effectively across diverse real-world applications and platforms. However, the challenge of efficiently deploying LLMs has become increasingly pronounced due to the varying application-specific performance requirements and the rapid evolution of computational platforms, which feature diverse resource constraints and deployment flows. These varying requirements necessitate LLMs that can adapt their structures (depth and width) for optimal efficiency across different platforms and application specifications. To address this critical gap, we propose AmoebaLLM, a novel framework designed to enable the instant derivation of LLM subnets of arbitrary shapes, which achieve the accuracy-efficiency frontier and can be extracted immediately after a one-time fine-tuning. In this way, AmoebaLLM significantly facilitates rapid deployment tailored to various platforms and applications. Specifically, AmoebaLLM integrates three innovative components: (1) a knowledge-preserving subnet selection strategy that features a dynamic-programming approach for depth shrinking and an importance-driven method for width shrinking; (2) a shape-aware mixture of LoRAs to mitigate gradient conflicts among subnets during fine-tuning; and (3) an in-place distillation scheme with loss-magnitude balancing as the fine-tuning objective. Extensive experiments validate that AmoebaLLM not only sets new standards in LLM adaptability but also successfully delivers subnets that achieve state-of-the-art trade-offs between accuracy and efficiency.

large language model, machine learning, natural language, (17 more...)

Country: North America > United States (0.28)

Genre: Research Report > Experimental Study (0.93)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Uncovering, Explaining, and Mitigating the Superficial Safety of Backdoor Defense (Rui Min)

Neural Information Processing Systems | Jun-2-2025, 13:17:50 GMT

Backdoor attacks pose a significant threat to Deep Neural Networks (DNNs) as they allow attackers to manipulate model predictions with backdoor triggers. To address these security vulnerabilities, various backdoor purification methods have been proposed to purify compromised models.

artificial intelligence, machine learning, purified model, (18 more...)

Country: North America > United States (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Referee

Neural Information Processing Systems | Jun-2-2025, 13:17:28 GMT

Despite this, q-means is still useful for machine learning purposes.

algorithm, artificial intelligence, machine learning, (15 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.61)

Faster Accelerated First-order Methods for Convex Optimization with Strongly Convex Function Constraints

Neural Information Processing Systems | Jun-2-2025, 13:17:14 GMT

We show the superior performance of our methods in sparsity-inducing constrained optimization, notably Google's personalized

algorithm, artificial intelligence, machine learning, (16 more...)

Country:

North America > United States (0.14)
Asia > China (0.14)

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

TrajCLIP: Pedestrian Trajectory Prediction Method Using Contrastive Learning and Idempotent Networks

Neural Information Processing Systems | Jun-2-2025, 13:17:02 GMT

The distribution of pedestrian trajectories is highly complex and influenced by the scene, nearby pedestrians, and subjective intentions. This complexity presents challenges for modeling and generalizing trajectory prediction. Previous methods modeled the feature space of future trajectories based on the high-dimensional feature space of historical trajectories, but this approach is suboptimal because it overlooks the similarity between historical and future trajectories. Our proposed method, TrajCLIP, utilizes contrastive learning and idempotent generative networks to address this issue. By pairing historical and future trajectories and applying contrastive learning on the encoded feature space, we enforce same-space consistency constraints. To manage complex distributions, we use idempotent loss and tightness loss to control over-expansion in the latent space. Additionally, we have developed a trajectory interpolation algorithm and synthetic trajectory data to enhance model capacity and improve generalization. Experimental results on public datasets demonstrate that TrajCLIP achieves state-of-the-art performance and excels in scene-to-scene transfer, few-shot transfer, and online learning tasks.
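Contrastive learning over paired history and future encodings is often implemented with an InfoNCE-style loss, sketched below. The abstract does not name the loss TrajCLIP uses, so the cosine similarity, the temperature, the one-directional form, and the function names are assumptions for illustration; the encoders are omitted and the vectors stand in for already-encoded features.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(history_feats, future_feats, temperature=0.1):
    """Average loss for matching each history encoding to its own future.

    Pair i is the positive; every other future in the batch is a negative.
    """
    n = len(history_feats)
    total = 0.0
    for i in range(n):
        logits = [cosine(history_feats[i], future_feats[j]) / temperature
                  for j in range(n)]
        top = max(logits)  # log-sum-exp with the max subtracted for stability
        log_denominator = top + math.log(sum(math.exp(l - top) for l in logits))
        total += log_denominator - logits[i]
    return total / n

# Toy encoded features for two trajectories.
history = [[1.0, 0.0], [0.0, 1.0]]
aligned_future = [[1.0, 0.0], [0.0, 1.0]]   # each future matches its history
swapped_future = [[0.0, 1.0], [1.0, 0.0]]   # pairs deliberately mismatched

aligned_loss = info_nce(history, aligned_future)
swapped_loss = info_nce(history, swapped_future)
```

The loss is small when each history encoding lies close to its own future encoding and large when the pairs are mismatched, which pushes both kinds of trajectory into a shared feature space.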

artificial intelligence, machine learning, trajectory, (17 more...)

Country: Asia > China (0.14)

Genre: Research Report > Experimental Study (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
