AITopics | activation pattern

activation pattern

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question "What is Artificial Intelligence?" and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

One SPACE to Rule Them All: Jointly Mitigating Factuality and Faithfulness Hallucinations in LLMs

Neural Information Processing Systems, Jun-23-2026, 00:44:42 GMT

LLMs have demonstrated unprecedented capabilities in natural language processing, yet their practical deployment remains hindered by persistent factuality and faithfulness hallucinations. While existing methods address these hallucination types independently, they inadvertently induce performance trade-offs, as interventions targeting one type often exacerbate the other. Through empirical and theoretical analysis of activation space dynamics in LLMs, we reveal that these hallucination categories share overlapping subspaces within neural representations, presenting an opportunity for concurrent mitigation. To harness this insight, we propose SPACE, a unified framework that jointly enhances factuality and faithfulness by editing shared activation subspaces. SPACE establishes a geometric foundation for shared subspace existence through dual-task feature modeling, then identifies and edits these subspaces via a hybrid probe strategy combining spectral clustering and attention head saliency scoring. Experimental results across multiple benchmark datasets demonstrate the superiority of our approach.

large language model, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country: Asia (0.46)

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

The Computational Complexity of Counting Linear Regions in ReLU Neural Networks

Neural Information Processing Systems, Jun-22-2026, 05:47:30 GMT

An established measure of the expressive power of a given ReLU neural network is the number of linear regions into which it partitions the input space. There exist many different, non-equivalent definitions of what a linear region actually is. We systematically assess which papers use which definitions and discuss how they relate to each other. We then analyze the computational complexity of counting the number of such regions for the various definitions. Generally, this turns out to be an intractable problem. We prove NP- and #P-hardness results already for networks with one hidden layer and strong hardness of approximation results for two or more hidden layers. Finally, on the algorithmic side, we demonstrate that counting linear regions can at least be achieved in polynomial space for some common definitions.
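To make the notion concrete, here is a minimal sketch that counts the distinct activation patterns a tiny one-hidden-layer ReLU network realises on a grid of sample points. It assumes one particular definition (a region is identified with the on/off pattern of the hidden units), which is only one of the non-equivalent definitions the abstract mentions. The network weights, the sampling grid, and the function names are hypothetical and chosen for illustration. Sampling can only give a lower bound on the number of patterns, and this is not the counting procedure analysed in the paper.

```python
def activation_pattern(x, weights, biases):
    """Return the on/off pattern of the hidden ReLU units at input x."""
    return tuple(
        sum(w_i * x_i for w_i, x_i in zip(w, x)) + b > 0
        for w, b in zip(weights, biases)
    )


def count_sampled_patterns(points, weights, biases):
    """Count distinct activation patterns seen on the sample points.

    This is a lower bound on the number of patterns over the whole input space.
    """
    return len({activation_pattern(x, weights, biases) for x in points})


# Hypothetical 1-D network with three hidden units: relu(x), relu(x - 1), relu(-x + 2).
weights = [(1.0,), (1.0,), (-1.0,)]
biases = [0.0, -1.0, 2.0]

# Sample the interval [-3, 5] in steps of 0.01.
points = [(-3.0 + 0.01 * k,) for k in range(801)]

n_patterns = count_sampled_patterns(points, weights, biases)
print(n_patterns)  # prints 4
```

The three units switch at x = 0, x = 1, and x = 2, so the real line splits into four intervals, each with its own pattern. In higher dimensions the number of patterns can grow quickly with the number of units, which is where exhaustive sampling stops being practical.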

artificial intelligence, linear region, machine learning, (17 more...)

Neural Information Processing Systems

Country: Europe (0.28)

Genre:

Research Report > Experimental Study (1.00)
Overview (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Generating and Checking DNN Verification Proofs

Neural Information Processing Systems, Jun-17-2026, 16:20:57 GMT

Deep Neural Networks (DNNs) have emerged as an effective approach to implementing challenging subproblems. They are increasingly being used as components in critical transportation, medical, and military systems. However, like human-written software, DNNs may have flaws that can lead to unsafe system performance. To confidently deploy DNNs in such systems, strong evidence is needed that they do not contain such flaws. This has led researchers to explore the adaptation and customization of software verification approaches to the problem of neural network verification (NNV). Many dozens of NNV tools have been developed in recent years, and as a field these techniques have matured to the point where realistic networks can be analyzed to detect flaws and to prove conformance with specifications. NNV tools are highly engineered and complex, and may harbor flaws that cause them to produce unsound results. We identify commonalities in algorithmic approaches taken by NNV tools to define a verifier-independent proof format, activation pattern tree proofs (APTP), and design an algorithm for checking those proofs that is proven correct and optimized to enable scalable checking. We demonstrate that existing verifiers can efficiently generate APTP proofs, that an APTP checker significantly outperforms prior work on a benchmark of 16 neural networks and 400 NNV problems, and that it is robust to variation in APTP proof structure arising from different NNV tools.

artificial intelligence, machine learning, proof tree, (19 more...)

Neural Information Processing Systems

Country:

Europe (1.00)
North America > United States > Virginia (0.46)

Genre: Research Report > Experimental Study (1.00)

Industry:

Information Technology (0.93)
Health & Medicine (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Correcting misinterpretations of additive models

Neural Information Processing Systems, Jun-16-2026, 02:56:44 GMT

Correct model interpretation in high-stakes settings is critical, yet both post-hoc feature attribution methods and so-called intrinsically interpretable models can systematically attribute false-positive importance to non-informative features such as suppressor variables. Specifically, both linear models and their powerful nonlinear generalisations, such as Generalised Additive Models (GAMs), are susceptible to spurious attributions to suppressors. We present a principled generalisation of activation patterns (originally developed to make linear models interpretable) to additive models, correctly rejecting suppressor effects for non-linear features. This yields PatternGAM, an importance attribution method based on univariate generative surrogate models for the broad family of additive models, and PatternQLR for polynomial models. Empirical evaluations on the XAI-TRIS benchmark with a novel false-negative invariant formulation of the earth mover's distance accuracy metric demonstrate significant improvements over popular feature attribution methods and the traditional interpretation of additive models. Finally, real-world case studies on the COMPAS and MIMIC-IV datasets provide new insights into the role of specific features by disentangling genuine target-related information from suppression effects that would mislead conventional GAM interpretations.

artificial intelligence, data mining, machine learning, (20 more...)

Neural Information Processing Systems

Country: Europe > Germany (0.28)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.92)

Hierarchical Spatio-Channel Clustering for Efficient Model Compression in Medical Image Analysis

Hamlomo, Sisipho, Atemkeng, Marcellin, Likassa, Habte Tadesse, Ravelo, Blaise, Bouwmans, Thierry, Lalléchère, Sébastien, Vacavant, Antoine, Chen, Ding-Geng

arXiv.org Machine Learning, Apr-28-2026

Convolutional neural networks (CNNs) have become increasingly difficult to deploy in resource-constrained environments due to their large memory and computational requirements. Although low-rank compression methods can reduce this burden, most existing approaches compress spatial and channel redundancy independently and therefore do not fully exploit the localised structure within convolutional feature maps. This paper proposes a hierarchical spatio-channel low-rank compression framework for CNNs that exploits redundancy across spatial regions and channel activations. Unlike conventional methods, which apply a uniform decomposition across an entire layer, the proposed approach first partitions feature maps into spatial regions, then groups channels according to their co-activation patterns within each region, and finally applies rank-adaptive SVD to each resulting spatio-channel cluster. The method is evaluated on an AlexNet-based brain tumour MRI classification model and compared with Global SVD and Tucker decomposition under \(3\times\) and \(6\times\) compression budgets. Our method outperforms both baselines, reducing FLOPs from \(8.21\,\mathrm{G}\) to \(1.55\,\mathrm{G}\) (\(81.1\%\) reduction), achieving a \(1.38\times\) inference speed-up, and increasing classification accuracy from \(87.76\%\) to \(89.80\%\). The method also improves the macro \(F_1\)-score and performance on challenging classes such as meningioma. A hyper-parameter trade-off analysis demonstrates that the framework provides Pareto-optimal configurations, enabling control over the balance between compression and predictive performance. Moderate clustering with adaptive rank selection yields strong results. Bootstrap standard errors are reported for all classification metrics.
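As a quick arithmetic check of the reported FLOP figures (a worked calculation from the numbers in the abstract, not an additional result from the paper):

\[
\frac{8.21\,\mathrm{G} - 1.55\,\mathrm{G}}{8.21\,\mathrm{G}} = \frac{6.66}{8.21} \approx 0.811,
\qquad
\frac{8.21\,\mathrm{G}}{1.55\,\mathrm{G}} \approx 5.30 .
\]

The first ratio matches the stated \(81.1\%\) reduction. The second shows that FLOPs fall by a factor of roughly \(5.3\), which is considerably more than the reported \(1.38\times\) inference speed-up, so the wall-clock gain is smaller than the FLOP count alone would suggest.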

artificial intelligence, decomposition, machine learning, (19 more...)

arXiv.org Machine Learning

2604.23375

Country:

North America > United States (0.93)
Africa (0.68)

Genre: Research Report (0.64)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.64)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.66)

Details

Neural Information Processing Systems, Apr-24-2026, 09:50:56 GMT

Here we derive Equation 8 for 0 and out = > 0. Since ESN(µ, 2,0) = NR(µ,), we can obtain Equation 4 for ID activation by specializing the result to =0 . We begin with a useful lemma. Let X ESN(0, 2,) and let a b 0, 0 c d. Then P(a X b)= (1+) h The result for P(c X d) follows analogously. For the reader's convenience, we summarize in detail a few common techniques for defining OOD scores that measure the degree of ID-ness on the given sample. All the methods derive the score post hoc on neural networks trained with in-distribution data only.

activation, artificial intelligence, machine learning, (18 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.93)

Learning Overparameterized Neural Networks via Stochastic Gradient Descent on Structured Data

Yuanzhi Li, Yingyu Liang

Neural Information Processing Systems, Mar-15-2026, 09:13:34 GMT

Neural networks have many successful applications, while much less theoretical understanding has been gained. Towards bridging this gap, we study the problem of learning a two-layer overparameterized ReLU neural network for multi-class classification via stochastic gradient descent (SGD) from random initialization. In the overparameterized setting, when the data comes from mixtures of well-separated distributions, we prove that SGD learns a network with a small generalization error, albeit the network has enough capacity to fit arbitrary labels. Furthermore, the analysis provides interesting insights into several aspects of learning neural networks and can be verified based on empirical studies on synthetic data and on the MNIST dataset.

artificial intelligence, initialization, machine learning, (15 more...)

Neural Information Processing Systems

Country: North America > United States > Wisconsin (0.28)

Genre: Research Report (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Deep Contract Design via Discontinuous Networks

Neural Information Processing Systems, Feb-17-2026, 05:16:13 GMT

Contract theory studies the setting where a principal seeks to design a contract for rewarding an agent on the basis of the uncertain outcomes caused by the agent's private actions [

artificial intelligence, contract, machine learning, (19 more...)

Neural Information Processing Systems

Country:

Oceania > New Zealand > North Island > Auckland Region > Auckland (0.04)
Asia > Middle East > Israel (0.04)
North America > United States (0.04)
(6 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

incorporate feedback into our final revision. 4 [R1]: " I don't exactly see if small batch vs large batch captures this phenomenon; if yes should say explicitly. "

Neural Information Processing Systems, Feb-13-2026, 21:02:44 GMT

We thank the reviewers for the detailed and insightful reviews. As the reviews noted, our work 1) introduces "novel Smith et al. [2017] make an explicit connection between small vs. large batch "A small discussion on if the phenomenon has been observed for different datasets/tasks with different optimizers" The phenomenon may not be true for other optimizers such as Adam, though. "concept of "memorizable and generalizable", though intuitive, is sketchy and not formally explained ... authors We acknowledge that the terms "memorizable" and "generalizable" are potentially confusing. We will revise our terminology to clarify this distinction. By "inherently noisy", we refer to the fact that high noise in the datapoints will necessitate larger sample complexity.

artificial intelligence, machine learning, noise, (16 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)
