AITopics | fully-connected layer

Collaborating Authors

fully-connected layer

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Details

Neural Information Processing SystemsApr-27-2026, 10:23:28 GMT

A.1 Difference between the performance of two joint policies In Section 3.1, the difference between the performance of two joint policies is expressed as follows: The proof is a multi-agent version of the proof in (Kakade and Langford, 2002). Now we provide the mathematical detail formally. A.2 Approximation that matches the true value to first order In Section 3.1, we claim that Jπ( π) matches J( π) to first order. Intuitively, this means that a sufficiently small update of the joint policy which improves Jπ( π) will also improve J( π). Now we prove it formally.

agent, artificial intelligence, section 3, (16 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.34)

Add feedback

285b06e0dd856f20591b0a5beb954151-Paper-Conference.pdf

Neural Information Processing SystemsApr-25-2026, 04:37:35 GMT

artificial intelligence, data mining, machine learning, (15 more...)

Neural Information Processing Systems

Country: Asia > China (0.14)

Genre: Research Report (0.46)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Data Science > Data Mining (0.93)

Add feedback

Supplementary to " Approximation with CNNs in Sobolev Space: with Applications to Classification "

Neural Information Processing SystemsApr-24-2026, 17:08:08 GMT

In the Supplementary materials, we include detailed descriptions on convex surrogate losses,convolutional neural networks, non-asymptotic error bounds for commonly used loss functions, and prove Theorems 2.1,2.2, A toy example on the numerical performance of CNN approximation is presented in Appendix D. We next give a brief review of the convex surrogate loss functions and discuss in details on the connection between the excess risk with respect to the ϕ-loss and that of 0-1 loss [28, 4]. Let ϕbe a given convex univariate function ϕ: R [0,). Instead of minimizing the excess risk R over H, we consider minimizing the risk with respect to the loss ϕ(ϕ-risk) R(f):= E{ϕ(Yf(X))} over a certain class of functions F, where ϕ: R [0,) is some generic loss function. For the special case when H = {h: h(x) = sign(f(x)),f F} and ϕ() is a step function, i.e., ϕ(x) = 1 Guohao Shen and Yuling Jiao contributed equally to this work Corresponding authors 36th Conference on Neural Information Processing Systems (NeurIPS 2022). As shown in [28] and [4], for a properly chosen ϕ, ˆfn can indeed help reduce the 0-1 excess risk R (ˆhn) R (h0). More precisely, let R0:= inff measurable R(f), then for a proper ϕ, we have ψ(R (ˆhn) R (h0)) R(ˆfn) R(f0), where ψ: [ 1,1] [0,)is a nonnegative continuous function, invertible on [0,1], and achieves its minimum at 0 with ψ(0) = 0. A wide variety of popular classification methods are based on this tactic.

artificial intelligence, machine learning, smin, (18 more...)

Neural Information Processing Systems

Country: Asia > China (0.46)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

Add feedback

The Unreasonable Effectiveness of Fully-Connected Layers for Low-Data Regimes

Neural Information Processing SystemsApr-24-2026, 13:10:12 GMT

Convolutional neural networks were the standard for solving many computer vision tasks until recently, when Transformers of MLP-based architectures have started to show competitive performance. These architectures typically have a vast number of weights and need to be trained on massive datasets; hence, they are not suitable for their use in low-data regimes. In this work, we propose a simple yet effective framework to improve generalization from small amounts of data. We augment modern CNNs with fully-connected (FC) layers and show the massive impact this architectural change has in low-data regimes. We further present an online joint knowledge-distillation method to utilize the extra FC layers at train time but avoid them during test time. This allows us to improve the generalization of a CNN-based model without any increase in the number of weights at test time. We perform classification experiments for a large range of network backbones and several standard datasets on supervised learning and active learning. Our experiments significantly outperform the networks without fully-connected layers, reaching a relative improvement of up to 16% validation accuracy in the supervised setting without adding any extra parameters during inference.

artificial intelligence, deep learning, machine learning, (17 more...)

Neural Information Processing Systems

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

Add feedback

Learning the Number of Neurons in Deep Networks

Jose M. Alvarez, Mathieu Salzmann

Neural Information Processing SystemsMar-23-2026, 12:13:31 GMT

Neural Information Processing Systems http://nips.cc/

architecture, neuron, regularizer, (15 more...)

Neural Information Processing Systems

Country:

Oceania > Australia > Australian Capital Territory > Canberra (0.04)
Europe > Switzerland > Vaud > Lausanne (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)

Genre: Research Report > New Finding (0.47)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

MemoryFormer : Minimize Transformer Computation by Removing Fully-Connected Layers

Neural Information Processing SystemsMar-19-2026, 02:07:08 GMT

In order to reduce the computational complexity of large language models, great efforts have been made to to improve the efficiency of transformer models such as linear attention and flash-attention. However, the model size and corresponding computational complexity are constantly scaled up in pursuit of higher performance. In this work, we present MemoryFormer, a novel transformer architecture which significantly reduces the computational complexity (FLOPs) from a new perspective. We eliminate nearly all the computations of the transformer model except for the necessary computation required by the multi-head attention operation. This is made possible by utilizing an alternative method for feature transformation to replace the linear projection of fully-connected layers. Specifically, we first construct a group of in-memory lookup tables that store a large amount of discrete vectors to replace the weight matrix used in linear projection.

artificial intelligence, large language model, natural language, (10 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.59)

Add feedback