deep neural net



Generalization in multitask deep neural classifiers: a statistical physics approach

Neural Information Processing Systems

A proper understanding of the striking generalization abilities of deep neural networks presents an enduring puzzle. Recently, there has been a growing body of numerically-grounded theoretical work that has contributed important insights to the theory of learning in deep neural nets. There has also been a recent interest in extending these analyses to understanding how multitask learning can further improve the generalization capacity of deep neural nets. These studies deal almost exclusively with regression tasks which are amenable to existing analytical techniques. We develop an analytic theory of the nonlinear dynamics of generalization of deep neural networks trained to solve classification tasks using softmax outputs and cross-entropy loss, addressing both single task and multitask settings. We do so by adapting techniques from the statistical physics of disordered systems, accounting for both finite size datasets and correlated outputs induced by the training dynamics. We discuss the validity of our theoretical results in comparison to a comprehensive suite of numerical experiments. Our analysis provides theoretical support for the intuition that the performance of multitask learning is determined by the noisiness of the tasks and how well their input features align with each other. Highly related, clean tasks benefit each other, whereas unrelated, clean tasks can be detrimental to individual task performance.
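For concreteness, the training setup studied here (a shared representation feeding task-specific softmax heads, each trained with cross-entropy) can be sketched as follows. This is a generic illustration only, not the authors' analytical setup; the architecture, sizes, and names (MultitaskClassifier, head_a, head_b) are assumptions made for the example.

```python
import torch
import torch.nn as nn

class MultitaskClassifier(nn.Module):
    """Shared trunk with one softmax (cross-entropy) head per task."""
    def __init__(self, in_dim=100, hidden=64, classes_a=10, classes_b=10):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.head_a = nn.Linear(hidden, classes_a)   # task A logits
        self.head_b = nn.Linear(hidden, classes_b)   # task B logits

    def forward(self, x):
        h = self.trunk(x)
        return self.head_a(h), self.head_b(h)

model = MultitaskClassifier()
loss_fn = nn.CrossEntropyLoss()                      # softmax + cross-entropy
opt = torch.optim.SGD(model.parameters(), lr=0.1)

x = torch.randn(32, 100)                             # synthetic inputs
ya = torch.randint(0, 10, (32,))                     # task A labels
yb = torch.randint(0, 10, (32,))                     # task B labels

logits_a, logits_b = model(x)
loss = loss_fn(logits_a, ya) + loss_fn(logits_b, yb) # joint multitask objective
loss.backward()
opt.step()
```

Whether the two heads help or hurt each other then depends, as the abstract argues, on how noisy the tasks are and how well their input features align.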


Deep Neural Nets with Interpolating Function as Output Activation

Neural Information Processing Systems

We replace the output layer of deep neural nets, typically the softmax function, with a novel interpolating function, and we propose end-to-end training and testing algorithms for this new architecture. Compared to classical neural nets with the softmax function as the output activation, the surrogate with an interpolating function as the output activation combines the advantages of both deep learning and manifold learning. The new framework offers the following major advantages: First, it is better suited to settings with insufficient training data. Second, it significantly improves the generalization accuracy on a wide variety of networks. The algorithm is implemented in PyTorch, and the code is available at https://github.com/


9872ed9fc22fc182d371c3e9ed316094-AuthorFeedback.pdf

Neural Information Processing Systems

We thank the reviewers for carefully reading the manuscript and providing us with valuable feedback. This was omitted from the submitted manuscript due to space constraints. We will clarify L220 to make this more precise. However, we will certainly include citations to both Danielyan and Tseng in the manuscript. We will also revise L17 to say that the true prior might be unknown for certain signals, such as natural images.



Reviews: Robustness of classifiers: from adversarial to random noise

Neural Information Processing Systems

This paper offers a thorough analysis of the effect of both worst-case (adversarial) and random noise on machine learning classifiers. It derives bounds that precisely describe the robustness of classifiers as a function of the curvature of the decision boundary. This leads to some surprisingly general (at least to me) conclusions: * For random noise, the robustness of classifiers behaves as sqrt(d) times the distance from the datapoint to the classification boundary (where d denotes the dimension of the data), provided the curvature of the decision boundary is sufficiently small. This corroborates the intuition that random noise is less of an issue for high-dimensional data. On the other hand, how do we know the curvature of decision boundaries for general classifiers?
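The sqrt(d) scaling for random noise is easy to check numerically in the simplest case of a flat (zero-curvature) decision boundary. The snippet below is an illustrative experiment of my own, not taken from the paper: for isotropic random directions, the perturbation norm needed to reach the boundary is roughly a constant multiple of sqrt(d) times the distance, so the printed ratio stays roughly constant as d grows.

```python
import numpy as np

rng = np.random.default_rng(0)
dist = 1.0   # distance from the datapoint to the (flat) decision boundary

for d in [10, 100, 1000, 10000]:
    # Boundary is the hyperplane {x[0] = 0}; the point sits at x[0] = dist.
    v = rng.standard_normal((2000, d))
    v /= np.linalg.norm(v, axis=1, keepdims=True)      # random unit directions
    r = dist / np.abs(v[:, 0])                         # noise norm needed along +/- v
    print(f"d={d:6d}   median r / (sqrt(d) * dist) = {np.median(r) / (np.sqrt(d) * dist):.2f}")
```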


Reviews: Loss Surfaces, Mode Connectivity, and Fast Ensembling of DNNs

Neural Information Processing Systems

Update after author response: Thank you for the response. Additional details on the observation that the curves between the local optima are not unique would also be interesting to see. Summary: This paper first presents a very interesting finding about the loss surfaces of deep neural nets, and then introduces a new ensembling method called Fast Geometric Ensembling (FGE). Given two already well-trained deep neural nets (with no limitations on their architectures, apparently), we have two sets of weight vectors w1 and w2 (in a very high-dimensional space). This paper states a (surprising) fact that for any two such weights w1 and w2, we can (always?) find a simple curve connecting them along which the training loss stays low. Figure 1 demonstrates this: Left is the training accuracy plot on the 2D subspace passing through independent weights w1, w2, w3 of ResNet-164 (from different random starts), whereas Middle and Right show the 2D subspace passing through independent weights w1, w2 and one bend point w3 on the curve (Middle: Bezier, Right: polygonal chain).
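For readers unfamiliar with the curve construction, evaluating a loss along a quadratic Bezier curve with endpoints w1, w2 and a single bend point w3 can be sketched as below. This is a hedged illustration: flat_loss is a hypothetical placeholder for evaluating the training loss at a flattened weight vector, and in the paper the bend point is itself trained so that the loss stays low along the whole curve.

```python
import numpy as np

def bezier(t, w1, w2, w3):
    """Quadratic Bezier curve with endpoints w1, w2 and bend (control) point w3."""
    return (1 - t) ** 2 * w1 + 2 * t * (1 - t) * w3 + t ** 2 * w2

def flat_loss(w):
    # Hypothetical placeholder: in practice this would load the flattened
    # weights w into the network and return the training loss on a batch.
    return float(np.sum(w ** 2))

dim = 1000
rng = np.random.default_rng(0)
w1, w2 = rng.standard_normal(dim), rng.standard_normal(dim)   # two trained solutions
w3 = 0.5 * (w1 + w2) + rng.standard_normal(dim)               # bend point (trained, in the paper)

for t in np.linspace(0.0, 1.0, 11):
    print(f"t = {t:.1f}   loss = {flat_loss(bezier(t, w1, w2, w3)):.3f}")
```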


Reviews: Deep Neural Nets with Interpolating Function as Output Activation

Neural Information Processing Systems

This paper develops a new data-dependent output activation function based on an interpolating function. It is a nonparametric model built from a subset of the training data. The activation function is defined implicitly, by solving a set of linear equations, and therefore cannot be trained directly with backpropagation. Instead, the paper proposes an auxiliary network with a linear output to approximate the gradient.
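As a rough stand-in for the implicitly defined interpolating output, the sketch below uses classical harmonic interpolation on a Gaussian affinity graph: soft labels for an unlabeled batch are obtained by solving a linear system involving the labels of a subset of training points. This is a simplification under my own assumptions (the affinity choice and function names are not from the paper), meant only to illustrate why the output is defined by linear equations rather than by a softmax.

```python
import numpy as np

def interpolate_labels(feat_labeled, y_onehot, feat_unlabeled, sigma=1.0):
    """Harmonic interpolation: solve L_uu F_u = -L_ul Y_l on a Gaussian affinity graph."""
    X = np.vstack([feat_labeled, feat_unlabeled])
    n_l = len(feat_labeled)
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)   # pairwise squared distances
    W = np.exp(-d2 / (2 * sigma ** 2))                    # Gaussian affinities
    L = np.diag(W.sum(axis=1)) - W                        # graph Laplacian
    F_u = np.linalg.solve(L[n_l:, n_l:], -L[n_l:, :n_l] @ y_onehot)
    return F_u                                            # soft class scores for unlabeled points

# Toy usage: 2-D "deep features", 3 classes, 30 labeled and 10 unlabeled points.
rng = np.random.default_rng(0)
feat_l = rng.standard_normal((30, 2))
y_l = np.eye(3)[rng.integers(0, 3, size=30)]
feat_u = rng.standard_normal((10, 2))
print(interpolate_labels(feat_l, y_l, feat_u).argmax(axis=1))
```

Because the scores F_u depend on the labeled subset through a linear solve rather than an explicit parametric map, gradients cannot flow through it in the usual way, which motivates the auxiliary linear-output network mentioned above.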


Interview with Yuan Yang: working at the intersection of AI and cognitive science

AIHub

In this interview series, we're meeting some of the AAAI/SIGAI Doctoral Consortium participants to find out more about their research. The Doctoral Consortium provides an opportunity for a group of PhD students to discuss and explore their research interests and career objectives in an interdisciplinary workshop together with a panel of established researchers. In this latest interview, we hear from Yuan Yang, who completed his PhD in May. This autumn, Yuan will be joining the College of Information, Mechanical and Electrical Engineering, Shanghai Normal University as an associate professor. From August 2018 to May 2024, I did my PhD in the computer science department at Vanderbilt University, which is located in the famous music city – Nashville, Tennessee.


TENG: Time-Evolving Natural Gradient for Solving PDEs With Deep Neural Nets Toward Machine Precision

Chen, Zhuo, McCarran, Jacob, Vizcaino, Esteban, Soljačić, Marin, Luo, Di

arXiv.org Artificial Intelligence

Partial differential equations (PDEs) are instrumental for modeling dynamical systems in science and engineering. The advent of neural networks has initiated a significant shift in tackling these problems, though challenges in accuracy persist, especially for initial value problems. In this paper, we introduce the $\textit{Time-Evolving Natural Gradient (TENG)}$, which generalizes time-dependent variational principles and optimization-based time integration, leveraging natural gradient optimization to obtain high accuracy in neural-network-based PDE solutions. Our comprehensive development includes algorithms such as TENG-Euler and its high-order variants, for example TENG-Heun, tailored for enhanced precision and efficiency. TENG's effectiveness is further validated through its performance, surpassing current leading methods and achieving $\textit{machine precision}$ in step-by-step optimizations across a spectrum of PDEs, including the heat equation, Allen-Cahn equation, and Burgers' equation.
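A minimal way to see the "optimization-based time integration" idea behind a TENG-Euler-style scheme: at each time step, the network representing the solution is refit so that it matches an explicit-Euler update of its previous state. The sketch below uses the 1D heat equation and plain Adam in place of the natural-gradient update the paper develops; the architecture, step sizes, and loop counts are illustrative assumptions, not the authors' settings.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
net = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))   # u_theta(x)
x = torch.linspace(-1.0, 1.0, 128).reshape(-1, 1)                    # spatial grid
dt = 1e-3                                                            # time step
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

def u_and_uxx(model, pts):
    """Evaluate u and its second spatial derivative via autograd."""
    pts = pts.clone().requires_grad_(True)
    u = model(pts)
    ux = torch.autograd.grad(u.sum(), pts, create_graph=True)[0]
    uxx = torch.autograd.grad(ux.sum(), pts, create_graph=True)[0]
    return u, uxx

# (In practice the network would first be fit to the PDE's initial condition.)
for step in range(5):
    u_old, uxx_old = u_and_uxx(net, x)
    target = (u_old + dt * uxx_old).detach()    # explicit-Euler target for u_t = u_xx
    for _ in range(100):                        # refit the network to the target state
        opt.zero_grad()
        loss = ((net(x) - target) ** 2).mean()
        loss.backward()
        opt.step()
    print(f"time step {step}: refit loss = {loss.item():.2e}")
```

The paper's contribution is, in effect, replacing this inner gradient-descent refit with a natural-gradient update and higher-order time integrators, which is what drives the reported accuracy toward machine precision.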