Regularizing Optimal Transport with f-Divergences

Neural Information Processing Systems

The primal and dual problems are related by the Lagrangian L(π, φ, ψ). We proceed to proofs of the theorems stated in Section 4. The constant depends on the NTK assumption and the regularization parameter, and may also depend indirectly on the bound R. Theorem 4.2 follows immediately from Lemmas B.1 and B.2. The following result follows from Propositions E.4 and E.5 of Luise et al. Interestingly, the rate of estimation of the Sinkhorn plan breaks the curse of dimensionality. B.2 Log-concavity of the Sinkhorn factor: the optimal entropy-regularized Sinkhorn plan is expressed in terms of the optimal potentials, which satisfy fixed-point equations. Using this result, one can prove the following lemma.
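The fixed-point equations for the optimal potentials mentioned above are what the classical Sinkhorn iteration solves. A minimal NumPy sketch of the diag(u) K diag(v) factorization of the entropy-regularized plan, on a toy problem with uniform marginals (variable names and the toy cost are our own, not the paper's):

```python
import numpy as np

def sinkhorn(a, b, C, eps=0.1, n_iter=200):
    """Entropy-regularized OT: the optimal plan factors as
    P = diag(u) K diag(v) with K = exp(-C/eps), where u, v solve the
    fixed-point (Sinkhorn) equations u = a / (K v), v = b / (K^T u)."""
    K = np.exp(-C / eps)
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)   # enforce column-marginal constraint
        u = a / (K @ v)     # enforce row-marginal constraint
    return u[:, None] * K * v[None, :]

# toy example: uniform marginals on 3 points, absolute-difference cost
a = np.ones(3) / 3
b = np.ones(3) / 3
C = np.abs(np.arange(3)[:, None] - np.arange(3)[None, :]).astype(float)
P = sinkhorn(a, b, C)
```

At convergence the row and column sums of `P` match the prescribed marginals `a` and `b`.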


Deep Neural Nets with Interpolating Function as Output Activation

Bao Wang, Xiyang Luo, Zhen Li, Wei Zhu, Zuoqiang Shi, Stanley Osher

Neural Information Processing Systems

We replace the output layer of deep neural nets, typically the softmax function, with a novel interpolating function, and we propose end-to-end training and testing algorithms for this new architecture. Compared with classical neural nets that use the softmax function as the output activation, the surrogate with an interpolating function as the output activation combines the advantages of both deep and manifold learning. The new framework demonstrates two major advantages: first, it is better suited to settings with insufficient training data; second, it significantly improves generalization accuracy on a wide variety of networks. The algorithm is implemented in PyTorch, and the code is available at https://github.com/
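The abstract does not spell out the interpolating function itself. As a rough illustration of the idea of interpolating training labels in the learned feature space rather than applying a softmax, here is a Gaussian-weighted label-interpolation sketch; the function name, the Gaussian weighting, and all parameters are our own assumptions, not the paper's construction:

```python
import numpy as np

def interpolating_output(feat_test, feat_train, y_train, n_classes, sigma=1.0):
    """Hypothetical stand-in for a softmax output layer: assign class scores
    to each test feature by interpolating one-hot training labels with
    Gaussian weights computed in the learned feature space."""
    Y = np.eye(n_classes)[y_train]                        # one-hot training labels
    d2 = ((feat_test[:, None, :] - feat_train[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2 * sigma ** 2))                    # Gaussian affinities
    W /= W.sum(axis=1, keepdims=True)                     # rows sum to one
    return W @ Y                                          # interpolated class scores

# toy feature space: two training points, two nearby test points
feat_train = np.array([[0.0, 0.0], [1.0, 1.0]])
y_train = [0, 1]
feat_test = np.array([[0.1, 0.0], [0.9, 1.0]])
scores = interpolating_output(feat_test, feat_train, y_train, n_classes=2)
```

Each score row is a convex combination of one-hot labels, so it sums to one and can be read as a class distribution, just like a softmax output.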




Channel Gating Neural Networks

Weizhe Hua, Yuan Zhou, Christopher M. De Sa, Zhiru Zhang, G. Edward Suh

Neural Information Processing Systems

Unlike static network pruning, channel gating optimizes CNN inference at run time by exploiting input-specific characteristics, substantially reducing compute cost with almost no accuracy loss. We experimentally show that applying channel gating in state-of-the-art networks achieves 2.7-8.0
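The core mechanism can be sketched in a simplified, dense form: compute a partial output from a "base" subset of input channels, then decide per position whether the remaining channels are worth computing. The function name, the threshold rule, and the channel split below are illustrative assumptions, not the paper's exact design:

```python
import numpy as np

def channel_gated_output(x, w_base, w_rest, threshold=0.2):
    """Simplified sketch of input-dependent channel gating: partial sums from
    the base channels gate whether the rest-channel contribution is added.
    (Computed densely here; actual hardware would skip the gated-off work.)"""
    n_base = w_base.shape[0]
    partial = x[:, :n_base] @ w_base          # base-channel partial sums
    gate = np.abs(partial) < threshold        # positions still ambiguous -> refine
    rest = x[:, n_base:] @ w_rest             # rest-channel contribution
    out = partial.copy()
    out[gate] += rest[gate]                   # refine only the gated-on positions
    return out, gate.mean()                   # output and fraction refined

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 6))                   # 5 inputs, 6 channels
w_base, w_rest = rng.normal(size=(3, 4)), rng.normal(size=(3, 4))
out, frac_on = channel_gated_output(x, w_base, w_rest)
```

Where the gate fires, the output equals the full (all-channel) computation; elsewhere it is the cheap partial sum, which is where the compute savings come from.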





Deriving Equivalent Symbol-Based Decision Models from Feedforward Neural Networks

Seidel, Sebastian, Borghoff, Uwe M.

arXiv.org Artificial Intelligence

Artificial intelligence (AI) has emerged as a transformative force across industries, driven by advances in deep learning and natural language processing, and fueled by large-scale data and computing resources. Despite its rapid adoption, the opacity of AI systems poses significant challenges to trust and acceptance. This work explores the intersection of connectionist and symbolic approaches to artificial intelligence, focusing on the derivation of interpretable symbolic models, such as decision trees, from feedforward neural networks (FNNs). Decision trees provide a transparent framework for elucidating the operations of neural networks while preserving their functionality. The derivation is presented in a step-by-step approach and illustrated with several examples. A systematic methodology is proposed to bridge neural and symbolic paradigms by exploiting distributed representations in FNNs to identify symbolic components, including fillers, roles, and their interrelationships. The process traces neuron activation values and input configurations across network layers, mapping activations and their underlying inputs to decision tree edges. The resulting symbolic structures effectively capture FNN decision processes and enable scalability to deeper networks through iterative refinement of subpaths for each hidden layer. To validate the theoretical framework, a prototype was developed using Keras .h5-data and emulating TensorFlow within the Java JDK/JavaFX environment. This prototype demonstrates the feasibility of extracting symbolic representations from neural networks, enhancing trust in AI systems, and promoting accountability.
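The activation-tracing step described above can be illustrated on a toy threshold network: enumerate input configurations, record the hidden on/off pattern each induces, and pair it with the output decision, yielding the raw material for decision-tree edges. The weights, the XOR task, and the `trace_paths` helper are purely illustrative, not the paper's prototype:

```python
import numpy as np
from itertools import product

def trace_paths(W1, b1, W2, b2):
    """For a tiny feedforward net with binary inputs and threshold units,
    map each input configuration to its hidden activation pattern and
    output class -- the (input, activations, decision) triples from which
    symbolic decision-tree edges can be assembled."""
    paths = []
    for bits in product([0, 1], repeat=W1.shape[1]):
        x = np.array(bits)
        h = (W1 @ x + b1 > 0).astype(int)     # hidden-layer on/off pattern
        y = int((W2 @ h + b2) > 0)            # output decision
        paths.append((bits, tuple(h), y))
    return paths

# hand-set XOR network: h1 = OR(x1, x2), h2 = AND(x1, x2), y = h1 AND NOT h2
W1 = np.array([[1.0, 1.0], [1.0, 1.0]])
b1 = np.array([-0.5, -1.5])
W2 = np.array([1.0, -2.0])
b2 = -0.5
paths = trace_paths(W1, b1, W2, b2)
```

Grouping the traced triples by shared activation prefixes is what turns them into tree edges; iterating the same trace per hidden layer is how the approach scales to deeper networks.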


ACE: Exploring Activation Cosine Similarity and Variance for Accurate and Calibration-Efficient LLM Pruning

Mi, Zhendong, Kong, Zhenglun, Yuan, Geng, Huang, Shaoyi

arXiv.org Artificial Intelligence

With the rapid expansion of large language models (LLMs), the demand for memory and computational resources has grown significantly. Recent advances in LLM pruning aim to reduce the size and computational cost of these models. However, existing methods often suffer from either suboptimal pruning performance or low time efficiency during the pruning process. In this work, we propose an efficient and effective pruning method that simultaneously achieves high pruning performance and fast pruning speed with improved calibration efficiency. Our approach introduces two key innovations: (1) An activation cosine similarity loss-guided pruning metric, which considers the angular deviation of the output activation between the dense and pruned models. (2) An activation variance-guided pruning metric, which helps preserve semantic distinctions in output activations after pruning, enabling effective pruning with shorter input sequences. These two components can be readily combined to enhance LLM pruning in both accuracy and efficiency. Experimental results show that our method achieves up to an 18% reduction in perplexity and up to 63% decrease in pruning time on prevalent LLMs such as LLaMA, LLaMA-2, and OPT.
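The first innovation, angular deviation of output activations, can be sketched as a scoring function: compare dense and pruned outputs on a small calibration batch via per-row cosine similarity. The function name, the magnitude-based candidate mask, and the toy dimensions are our own assumptions, not the paper's API:

```python
import numpy as np

def cosine_deviation(dense_out, pruned_out):
    """Score a candidate pruning by 1 minus the mean per-row cosine
    similarity between dense and pruned output activations
    (0 = directions identical, larger = more angular deviation)."""
    num = (dense_out * pruned_out).sum(-1)
    den = (np.linalg.norm(dense_out, axis=-1)
           * np.linalg.norm(pruned_out, axis=-1) + 1e-12)
    return 1.0 - (num / den).mean()

# toy: zero out the smallest-magnitude weight row and score the deviation
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 4))                   # calibration activations
W = rng.normal(size=(4, 4))
mask = np.ones_like(W)
mask[np.abs(W).sum(axis=1).argmin(), :] = 0.0 # prune least-salient row
score = cosine_deviation(X @ W, X @ (W * mask))
```

A pruning metric built this way prefers masks that preserve the direction of the output activations, which is the angular-deviation idea the abstract describes.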


Ensemble Kalman filter for uncertainty in human language comprehension

Bhandari, Diksha, Lopopolo, Alessandro, Rabovsky, Milena, Reich, Sebastian

arXiv.org Machine Learning

Artificial neural networks (ANNs) are widely used in modeling sentence processing but often exhibit deterministic behavior, contrasting with human sentence comprehension, which manages uncertainty during ambiguous or unexpected inputs. This is exemplified by reversal anomalies--sentences with unexpected role reversals that challenge syntax and semantics--highlighting the limitations of traditional ANN models, such as the Sentence Gestalt (SG) Model. To address these limitations, we propose a Bayesian framework for sentence comprehension, applying an extension of the ensemble Kalman filter (EnKF) for Bayesian inference to quantify uncertainty. By framing language comprehension as a Bayesian inverse problem, this approach enhances the SG model's ability to reflect human sentence processing with respect to the representation of uncertainty. Numerical experiments and comparisons with maximum likelihood estimation (MLE) demonstrate that Bayesian methods improve uncertainty representation, enabling the model to better approximate human cognitive processing when dealing with linguistic ambiguities. Introduction Artificial neural networks (ANNs) have become indispensable tools in modeling sentence processing within the field of natural language processing and cognitive science. These models are capable of handling complex linguistic structures, making accurate predictions, and resolving ambiguities with a notable degree of certainty, even when they are wrong (Guo et al., 2017; Hein et al., 2019). However, this behavior stands in contrast to human sentence comprehension, which often involves managing uncertainty, especially when faced with ambiguous or unexpected language inputs. The research has been funded by the Deutsche Forschungsgemeinschaft (DFG), Project-ID 318763901, SFB 1294.
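For reference, the EnKF analysis step the framework builds on can be written in a few lines. This is the textbook stochastic EnKF with perturbed observations, not the paper's specific extension; the toy scalar-state example and all variable names are our own:

```python
import numpy as np

def enkf_update(ensemble, obs, H, obs_noise_std, rng):
    """Stochastic EnKF analysis step: nudge each ensemble member toward a
    perturbed observation using the Kalman gain estimated from ensemble
    statistics (linear observation operator H assumed)."""
    X = ensemble                                  # shape (n_members, n_state)
    Y = X @ H.T                                   # predicted observations
    Xm, Ym = X - X.mean(0), Y - Y.mean(0)
    n = X.shape[0]
    Pxy = Xm.T @ Ym / (n - 1)                     # state-obs cross-covariance
    Pyy = Ym.T @ Ym / (n - 1) + obs_noise_std**2 * np.eye(H.shape[0])
    K = Pxy @ np.linalg.inv(Pyy)                  # ensemble Kalman gain
    obs_pert = obs + obs_noise_std * rng.normal(size=Y.shape)
    return X + (obs_pert - Y) @ K.T

# toy scalar state: broad prior, one observation at 1.0
rng = np.random.default_rng(0)
prior = rng.normal(0.0, 2.0, size=(500, 1))
post = enkf_update(prior, np.array([1.0]), np.eye(1), 0.5, rng)
```

The posterior ensemble contracts toward the observation while retaining spread, which is exactly the explicit uncertainty representation the abstract argues the MLE-trained SG model lacks.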