AITopics | dnn

Collaborating Authors

dnn

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

A Composite Activation Function for Learning Stable Binary Representations

Park, Seokhun, Kim, Choeun, Lee, Kwanho, Park, Sehyun, Kong, Insung, Kim, Yongdai

arXiv.org Machine Learning | May-13-2026

Activation functions play a central role in neural networks by shaping internal representations. Recently, learning binary activation representations has attracted significant attention due to their advantages in computational and memory efficiency, as well as interpretability. However, training neural networks with Heaviside activations remains challenging, as their non-differentiability obstructs standard gradient-based optimization. In this paper, we propose the Heavy Tailed Activation Function (HTAF), a smooth approximation to the Heaviside function that enables stable training with gradient-based optimization. We construct HTAF as a composite of the sigmoid and hyperbolic tangent functions and theoretically show that it maintains a large gradient mass around zero inputs while exhibiting slower gradient decay in the tail regions. We show that Spiking Neural Networks, Binary Neural Networks, and Deep Heaviside Neural Networks can be trained stably using HTAF with gradient-based optimization. Finally, we introduce Implicit Concept Bottleneck Models (ICBMs), interpretable image models that leverage HTAF to induce discrete feature representations. Extensive experiments across various architectures and image datasets demonstrate that ICBMs enable stable discretization while achieving prediction performance comparable to or better than standard models.
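To illustrate why a smooth surrogate for the Heaviside step matters, here is a minimal sketch that compares the step function with a plain steep sigmoid. This is not HTAF: the abstract does not give HTAF's formula, so `smooth_step` and its sharpness parameter `k` are hypothetical stand-ins. The sketch only shows the behaviour the abstract contrasts against, namely a surrogate whose gradient is large near zero but decays very quickly in the tails.

```python
import math

def heaviside(x: float) -> float:
    """Hard step: zero gradient everywhere except at the jump."""
    return 1.0 if x > 0 else 0.0

def smooth_step(x: float, k: float = 5.0) -> float:
    """Sigmoid surrogate sigma(k * x); k is a hypothetical sharpness knob."""
    z = k * x
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)  # numerically stable branch for negative inputs
    return e / (1.0 + e)

def smooth_step_grad(x: float, k: float = 5.0) -> float:
    """Derivative of the surrogate: k * s * (1 - s)."""
    s = smooth_step(x, k)
    return k * s * (1.0 - s)

# Gradient near zero versus gradient in the tail
grad_at_zero = smooth_step_grad(0.0)
grad_in_tail = smooth_step_grad(3.0)
```

With `k = 5`, the gradient at zero is 1.25 while the gradient at `x = 3` is already on the order of 1e-6, which is the kind of tail decay a heavier-tailed surrogate is meant to slow down.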

artificial intelligence, deep learning, machine learning, (16 more...)

arXiv.org Machine Learning

2605.11558

Genre: Research Report > New Finding (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Universality in Deep Neural Networks: An approach via the Lindeberg exchange principle

Giovagnini, Filippo, Kotitsas, Sotirios, Romito, Marco

arXiv.org Machine Learning | May-5-2026

We consider the infinite-width limit of a fully connected deep neural network with general weights, and we prove quantitative general bounds on the $2$-Wasserstein distance between the network and its infinite-width Gaussian limit, under appropriate regularity assumptions on the activation function. Our main tool is a Lindeberg principle for Deep Neural Networks, which we use to successively replace the weights on each layer by Gaussian random variables.

artificial intelligence, machine learning, neural network, (19 more...)

arXiv.org Machine Learning

2605.02771

Country: North America > United States > New York (0.28)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Rethinking Calibration of Deep Neural Networks: Do Not Be Afraid of Overconfidence

Neural Information Processing Systems | Apr-26-2026, 04:51:45 GMT

Accurately quantifying the uncertainty of predictions from deep neural networks is important in many real-world decision-making applications. A reliable predictor is expected to be accurate when it is confident about its predictions and to indicate high uncertainty when it is likely to be inaccurate. However, modern neural networks have been found to be poorly calibrated, primarily in the direction of overconfidence. In recent years, there has been a surge of research on model calibration that leverages implicit or explicit regularization techniques during training, which achieve good calibration performance by avoiding overconfident outputs. In our study, we empirically found that although the predictions obtained from these regularized models are better calibrated, they are less calibratable; namely, it is harder to further calibrate these predictions with post-hoc calibration methods such as temperature scaling and histogram binning. We conduct a series of empirical studies showing that overconfidence may not hurt final calibration performance if post-hoc calibration is allowed; rather, penalizing confident outputs shrinks the room for potential improvement in the post-hoc calibration phase. Our experimental findings point to a new direction for improving the calibration of DNNs by considering main training and post-hoc calibration as a unified framework.
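For readers unfamiliar with temperature scaling, the post-hoc method named above, here is a minimal sketch under stated assumptions. It divides logits by a single scalar temperature chosen to minimise negative log-likelihood on held-out data. The grid search, the function names, and the toy validation set are illustrative choices and not the paper's setup; in practice the temperature is usually fitted with a gradient-based optimiser.

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax of logits / temperature, shifted by the max for stability."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def nll(logit_rows, labels, temperature):
    """Mean negative log-likelihood of the true labels at a given temperature."""
    losses = [-math.log(softmax(row, temperature)[y])
              for row, y in zip(logit_rows, labels)]
    return sum(losses) / len(losses)

def fit_temperature(logit_rows, labels, grid=None):
    """Pick the temperature on a grid that minimises validation NLL."""
    if grid is None:
        grid = [0.05 * i for i in range(1, 201)]  # 0.05, 0.10, ..., 10.0
    return min(grid, key=lambda t: nll(logit_rows, labels, t))

# Hypothetical overconfident model: very large margins, yet 1 of 4 is wrong.
val_logits = [[8.0, 0.0], [8.0, 0.0], [0.0, 8.0], [8.0, 0.0]]
val_labels = [0, 0, 1, 1]

T = fit_temperature(val_logits, val_labels)
calibrated = softmax(val_logits[0], T)
```

Because a single positive temperature rescales all logits equally, the predicted class never changes; only the confidence does. On this toy set the fitted temperature is well above 1, pulling the confidence down towards the 75% accuracy actually observed.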

artificial intelligence, calibration, machine learning, (17 more...)

Neural Information Processing Systems

Country: Asia > China (0.47)

Genre: Research Report > New Finding (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.85)

Interpreting Representation Quality of DNNs for 3D Point Cloud Processing: Supplementary Materials

Wen Shen (Tongji University); Qihan Ren, Dongrui Liu, Quanshi Zhang (Shanghai Jiao Tong University)

Neural Information Processing Systems | Apr-25-2026, 18:23:00 GMT

This section provides more details about Shapley values in Section 3 of the paper. Linearity: If two independent games $v$ and $w$ can be merged into one game $u(S) = v(S) + w(S)$, then the Shapley values of player $i$ in game $v$ and game $w$ can also be merged, i.e. $\phi_u(i) = \phi_v(i) + \phi_w(i)$. Nullity: A dummy player $i$ satisfies $\forall S \subseteq N \setminus \{i\}$, $v(S \cup \{i\}) = v(S) + v(\{i\})$, which indicates that player $i$ has no interaction with other players, i.e. $\phi(i) = v(\{i\})$. Efficiency: The overall reward can be allocated to all players in the game. This section provides more details about multi-order interactions [8] in Section 3.3 of the paper.
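The excerpt breaks off before giving a formula for the efficiency property. For reference, the textbook statement is sketched below; it is the standard form from cooperative game theory and is an assumption about what the supplementary material states, not a quotation from it.

```latex
\[
  \sum_{i \in N} \phi(i) \;=\; v(N) - v(\emptyset)
\]
```

Here $N$ is the full set of players and $v(\emptyset)$ is the reward of the empty coalition, which is often taken to be zero.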

artificial intelligence, machine learning, sensitivity, (18 more...)

Neural Information Processing Systems

Country: Asia > China (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Interpreting Representation Quality of DNNs for 3D Point Cloud Processing

Neural Information Processing Systems | Apr-25-2026, 18:22:56 GMT

In this paper, we evaluate the quality of knowledge representations encoded in deep neural networks (DNNs) for 3D point cloud processing. We propose a method to disentangle the overall model vulnerability into the sensitivity to rotation, translation, scale, and local 3D structures. We also propose metrics to evaluate the spatial smoothness of encoding 3D structures and the representation complexity of the DNN. Based on this analysis, experiments expose representation problems with classic DNNs and explain the utility of adversarial training. The code will be released when this paper is accepted.

artificial intelligence, machine learning, sensitivity, (16 more...)

Neural Information Processing Systems

Country:

Asia (0.28)
North America > United States (0.28)

Genre: Research Report (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Polyhedron Attention Module: Learning Adaptive-order Interactions

Neural Information Processing Systems | Apr-25-2026, 15:31:10 GMT

Learning feature interactions can be the key to multivariate predictive modeling. ReLU-activated neural networks create piecewise linear prediction models. Other nonlinear activation functions lead to models with only high-order feature interactions, and thus lack interpretability. Recent methods either incorporate candidate polynomial terms of fixed orders into deep learning, which is subject to combinatorial explosion, or learn orders that are difficult to adapt to different regions of the feature space. We propose a Polyhedron Attention Module (PAM) to create piecewise polynomial models. The input space is split into polyhedrons that define the different pieces, and on each piece the hyperplanes that define the polyhedron boundary multiply to form the interaction terms, resulting in interactions whose order adapts to each piece. PAM is interpretable and can identify important interactions in predicting a target. Theoretical analysis shows that PAM has stronger expressive capability than ReLU-activated networks. Extensive experimental results demonstrate the superior classification performance of PAM on massive click-through rate prediction datasets, and show that PAM can learn meaningful interaction effects in a medical problem.

artificial intelligence, deep learning, machine learning, (19 more...)

Neural Information Processing Systems

Country: North America > United States (0.46)

Genre: Research Report > New Finding (0.48)

Industry: Health & Medicine > Therapeutic Area > Neurology > Alzheimer's Disease (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

2d2f85c0f93e69cf71f58eebaebb5e8d-Paper-Conference.pdf

Neural Information Processing Systems | Apr-25-2026, 06:35:18 GMT

artificial intelligence, machine learning, reformulation, (18 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

A Unified Game-Theoretic Interpretation of Adversarial Robustness: Supplementary Material

Neural Information Processing Systems | Apr-25-2026, 01:01:28 GMT

In this section, to help readers understand the metric in the paper, we first revisit the definition of the Shapley value [14], which is widely considered an unbiased estimate of the numerical importance of each input variable. In game theory, a complex system is usually represented as a game, where each input variable is taken as a player and the output of the system is regarded as the total reward of all players. Given a game with multiple players (input variables) $N = \{1, 2, \ldots, n\}$, some players cooperate to pursue a high reward. Thus, the task is to divide the total reward and fairly assign the divided elementary reward to each individual player. In this way, the elementary reward can be considered the numerical importance of the corresponding variable to the complex system. Let $2^N \overset{\text{def}}{=} \{S \mid S \subseteq N\}$ denote all potential subsets of $N$. The game $v: 2^N \to \mathbb{R}$ is a function that estimates the overall reward $v(S)$ earned by each specific subset of players $S \subseteq N$. The Shapley value, denoted by $\phi(i)$, then represents the numerical importance of player $i$ to the game $v$: $\phi(i) = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,(n - |S| - 1)!}{n!} \left[ v(S \cup \{i\}) - v(S) \right]$. Using Shapley values to explain DNNs.
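The definition can be made concrete with a small sketch that computes Shapley values by direct enumeration, weighting each marginal contribution v(S ∪ {i}) − v(S) by |S|!(n − |S| − 1)!/n!. The function name `shapley_values` and the three-player `toy_game` are hypothetical illustrations, not code from the paper. Exact enumeration costs on the order of 2^n evaluations of v, so it is only practical for a handful of players.

```python
from itertools import combinations
from math import factorial

def shapley_values(players, v):
    """Exact Shapley values for a game v that maps a frozenset of players to a reward."""
    n = len(players)
    phi = {}
    for i in players:
        others = [p for p in players if p != i]
        total = 0.0
        for size in range(n):
            # Weight for coalitions S of this size that exclude player i
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = frozenset(subset)
                total += weight * (v(s | {i}) - v(s))
        phi[i] = total
    return phi

def toy_game(s):
    """Hypothetical game: player 1 earns 1 alone; players 2 and 3 earn 2 only together."""
    reward = 0.0
    if 1 in s:
        reward += 1.0
    if 2 in s and 3 in s:
        reward += 2.0
    return reward

phi = shapley_values([1, 2, 3], toy_game)
```

In this toy game player 1 never interacts with the others, so its Shapley value equals its stand-alone reward, while players 2 and 3 split their joint reward equally. The values sum to the reward of the full coalition.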

artificial intelligence, deep learning, machine learning, (20 more...)

Neural Information Processing Systems

Country: Asia (0.28)

Industry: Leisure & Entertainment > Games (0.54)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
