AITopics | exploiting

Collaborating Authors

exploiting

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Supplementary Materials for FeCAM: Exploiting the Heterogeneity of Class Distributions in Exemplar-Free Continual Learning

Neural Information Processing SystemsFeb-8-2026, 04:45:03 GMT

Since the Resnet-18 feature extractor uses a ReLU activation function, the feature representation values are all non-negative, so the inputs to tukey's ladder of powers transformation are all valid. As expected, the performance of both methods drops a bit when the pre-training is not done on the similar classes. Still FeCAM outperforms NCM by about 10% on the final accuracy. In Algorithm 1, we present the pseudo code for using FeCAM classifier.Algorithm 1 FeCAM Require: Training data (D

artificial intelligence, learning, machine learning, (15 more...)

Neural Information Processing Systems

Country: Asia > China > Liaoning Province > Shenyang (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Exploiting hidden structures in non-convex games for convergence to Nash equilibrium

Neural Information Processing SystemsDec-26-2025, 21:04:09 GMT

A wide array of modern machine learning applications - from adversarial models to multi-agent reinforcement learning - can be formulated as non-cooperative games whose Nash equilibria represent the system's desired operational states. Despite having a highly non-convex loss landscape, many cases of interest possess a latent convex structure that could potentially be leveraged to yield convergence to an equilibrium. Driven by this observation, our paper proposes a flexible first-order method that successfully exploits such "hidden structures" and achieves convergence under minimal assumptions for the transformation connecting the players' control variables to the game's latent, convex-structured layer. The proposed method - which we call preconditioned hidden gradient descent (PHGD) - hinges on a judiciously chosen gradient preconditioning scheme related to natural gradient methods. Importantly, we make no separability assumptions for the game's hidden structure, and we provide explicit convergence rate guarantees for both deterministic and stochastic environments.

convergence, equilibrium, non-convex game, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.61)

Add feedback

Scissorhands: Exploiting the Persistence of Importance Hypothesis for LLM KV Cache Compression at Test Time

Neural Information Processing SystemsDec-26-2025, 11:53:07 GMT

Large language models(LLMs) have sparked a new wave of exciting AI applications. Hosting these models at scale requires significant memory resources. One crucial memory bottleneck for the deployment stems from the context window. It is commonly recognized that model weights are memory hungry; however, the size of key-value embedding stored during the generation process (KV cache) can easily surpass the model size. The enormous size of the KV cache puts constraints on the inference batch size, which is crucial for high throughput inference workload.

importance hypothesis, llm kv cache compression, persistence, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.62)

Add feedback

Exploiting the Relationship Between Kendall's Rank Correlation and Cosine Similarity for Attribution Protection

Neural Information Processing SystemsDec-24-2025, 15:21:16 GMT

Model attributions are important in deep neural networks as they aid practitioners in understanding the models, but recent studies reveal that attributions can be easily perturbed by adding imperceptible noise to the input. The non-differentiable Kendall's rank correlation is a key performance index for attribution protection. In this paper, we first show that the expected Kendall's rank correlation is positively correlated to cosine similarity and then indicate that the direction of attribution is the key to attribution robustness. Based on these findings, we explore the vector space of attribution to explain the shortcomings of attribution defense methods using $\ell_p$ norm and propose integrated gradient regularizer (IGR), which maximizes the cosine similarity between natural and perturbed attributions. Our analysis further exposes that IGR encourages neurons with the same activation states for natural samples and the corresponding perturbed samples. Our experiments on different models and datasets confirm our analysis on attribution protection and demonstrate a decent improvement in adversarial robustness.

attribution, kendall, rank correlation and cosine similarity, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.60)

Add feedback

Exploiting a Zoo of Checkpoints for Unseen Tasks

Neural Information Processing SystemsDec-24-2025, 15:21:02 GMT

There are so many models in the literature that it is difficult for practitioners to decide which combinations are likely to be effective for a new task. This paper attempts to address this question by capturing relationships among checkpoints published on the web.

checkpoint, exploiting, name change, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.40)

Add feedback

Exploiting the Surrogate Gap in Online Multiclass Classification

Neural Information Processing SystemsDec-24-2025, 03:56:48 GMT

In the full information setting we provide expected mistake bounds for \textproc{Gaptron} with respect to the logistic loss, hinge loss, and the smooth hinge loss with $O(K)$ regret, where the expectation is with respect to the learner's randomness and $K$ is the number of classes. In the bandit classification setting we show that \textproc{Gaptron} is the first linear time algorithm with $O(K\sqrt{T})$ expected regret. Additionally, the expected mistake bound of \textproc{Gaptron} does not depend on the dimension of the feature vector, contrary to previous algorithms with $O(K\sqrt{T})$ regret in the bandit classification setting. We present a new proof technique that exploits the gap between the zero-one loss and surrogate losses rather than exploiting properties such as exp-concavity or mixability, which are traditionally used to prove logarithmic or constant regret bounds.

name change, online multiclass classification, surrogate gap, (8 more...)

Neural Information Processing Systems

Industry: Education > Educational Setting > Online (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

FeCAM: Exploiting the Heterogeneity of Class Distributions in Exemplar-Free Continual Learning

Neural Information Processing SystemsDec-24-2025, 00:13:08 GMT

Exemplar-free class-incremental learning (CIL) poses several challenges since it prohibits the rehearsal of data from previous tasks and thus suffers from catastrophic forgetting. Recent approaches to incrementally learning the classifier by freezing the feature extractor after the first task have gained much attention. In this paper, we explore prototypical networks for CIL, which generate new class prototypes using the frozen feature extractor and classify the features based on the Euclidean distance to the prototypes. In an analysis of the feature distributions of classes, we show that classification based on Euclidean metrics is successful for jointly trained features. However, when learning from non-stationary data, we observe that the Euclidean metric is suboptimal and that feature distributions are heterogeneous. To address this challenge, we revisit the anisotropic Mahalanobis distance for CIL. In addition, we empirically show that modeling the feature covariance relations is better than previous attempts at sampling features from normal distributions and training a linear classifier.

class distribution, exemplar-free continual learning, heterogeneity, (11 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Exploiting the Structure: Stochastic Gradient Methods Using Raw Clusters

Neural Information Processing SystemsNov-21-2025, 14:38:40 GMT

The amount of data available in the world is growing faster than our ability to deal with it. However, if we take advantage of the internal structure, data may become much smaller for machine learning purposes. In this paper we focus on one of the fundamental machine learning tasks, empirical risk minimization (ERM), and provide faster algorithms with the help from the clustering structure of the data. We introduce a simple notion of raw clustering that can be efficiently computed from the data, and propose two algorithms based on clustering information. Our accelerated algorithm ClusterACDM is built on a novel Haar transformation applied to the dual space of the ERM problem, and our variance-reduction based algorithm ClusterSVRG introduces a new gradient estimator using clustering. Our algorithms outperform their classical counterparts ACDM and SVRG respectively.

exploiting, name change, stochastic gradient method, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Exploiting the Replay Memory Before Exploring the Environment: Enhancing Reinforcement Learning Through Empirical MDP Iteration

Neural Information Processing SystemsMay-31-2025, 15:37:12 GMT

Reinforcement learning (RL) algorithms are typically based on optimizing a Markov Decision Process (MDP) using the optimal Bellman equation. Recent studies have revealed that focusing the optimization of Bellman equations solely on in-sample actions tends to result in more stable optimization, especially in the presence of function approximation. Upon on these findings, in this paper, we propose an Empirical MDP Iteration (EMIT) framework. For each of these empirical MDPs, it learns an estimated Q-function denoted as \widehat{Q} . The key strength is that by restricting the Bellman update to in-sample bootstrapping, each empirical MDP converges to a unique optimal \widehat{Q} function.

algorithm, empirical mdp iteration, enhancing reinforcement learning, (8 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Fast Rates in Stochastic Online Convex Optimization by Exploiting the Curvature of Feasible Sets

Neural Information Processing SystemsMay-27-2025, 14:02:28 GMT

In this work, we explore online convex optimization (OCO) and introduce a new condition and analysis that provides fast rates by exploiting the curvature of feasible sets. In online linear optimization, it is known that if the average gradient of loss functions exceeds a certain threshold, the curvature of feasible sets can be exploited by the follow-the-leader (FTL) algorithm to achieve a logarithmic regret. This study reveals that algorithms adaptive to the curvature of loss functions can also leverage the curvature of feasible sets. In particular, we first prove that if an optimal decision is on the boundary of a feasible set and the gradient of an underlying loss function is non-zero, then the algorithm achieves a regret bound of O(\rho \log T) in stochastic environments. Here, \rho 0 is the radius of the smallest sphere that includes the optimal decision and encloses the feasible set.

artificial intelligence, curvature, machine learning, (12 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.40)

Add feedback