Neural Information Processing Systems

Figure (b) above shows that the performance is robust to different GCN embedding sizes. On "EA... degree to help": Figure (a) shows an ablation study on NAS-Bench-201, which varies each component (surrogate...). The other experimental settings are the same as in Section 4.2. As can be seen, more accurate architectures are close to each other. On "BO typically works better in low-dimensional...": We use... Here, in Figure (d) above, we use subnets that are sampled in the same search iteration. On "For example, it is common to see pooling...": Yes, we... Thus, the GCN propagation part is more important than how the global node is added.


Theoretical Analysis of the Inductive Biases in Deep Convolutional Networks

Neural Information Processing Systems

In this paper, we provide a theoretical analysis of the inductive biases in convolutional neural networks (CNNs). We start by examining the universality of CNNs, i.e., the ability to approximate any continuous function. We prove that a depth of $\mathcal{O}(\log d)$ suffices for deep CNNs to achieve this universality, where $d$ is the input dimension. Additionally, we establish that learning sparse functions with CNNs requires only $\widetilde{\mathcal{O}}(\log^2d)$ samples, indicating that deep CNNs can efficiently capture {\em long-range} sparse correlations. These results are made possible through a novel combination of multichanneling and downsampling when increasing the network depth.
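Stated compactly (a paraphrase of the abstract, not the paper's exact theorem statements), the two headline results are:

```latex
% Universality: for input dimension d, CNN depth
%   L = \mathcal{O}(\log d)
% suffices to approximate any continuous function.
%
% Sample complexity: learning a sparse target function needs only
%   n = \widetilde{\mathcal{O}}(\log^2 d)
% samples, so long-range sparse correlations are captured efficiently.
L = \mathcal{O}(\log d), \qquad n = \widetilde{\mathcal{O}}(\log^2 d)
```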





Privacy Preserving Charge Location Prediction for Electric Vehicles

Marlin, Robert, Jurdak, Raja, Abuadbba, Alsharif, Miller, Dimity

arXiv.org Artificial Intelligence

By 2050, electric vehicles (EVs) are projected to account for 70% of global vehicle sales. While EVs provide environmental benefits, they also pose challenges for energy generation, grid infrastructure, and data privacy. Current research on EV routing and charge management often overlooks privacy when predicting energy demands, leaving sensitive mobility data vulnerable. To address this, we developed a Federated Learning Transformer Network (FLTN) to predict EVs' next charge location with enhanced privacy measures. Each EV operates as a client, training an onboard FLTN model that shares only model weights, not raw data, with a community-based Distributed Energy Resource Management System (DERMS), which aggregates them into a community global model. To further enhance privacy, non-transitory EVs use peer-to-peer weight sharing and augmentation within their community, obfuscating individual contributions and improving model accuracy. Community DERMS global model weights are then redistributed to EVs for continuous training. Our FLTN approach achieved up to 92% accuracy while preserving data privacy, compared to our baseline centralised model, which achieved 98% accuracy with no data privacy. Simulations conducted across diverse charge levels confirm the FLTN's ability to forecast energy demands over extended periods. We present a privacy-focused solution for EV charge location prediction, effectively mitigating data leakage risks.
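The communication round described above (peer-to-peer mixing followed by DERMS aggregation) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names `peer_mix` and `derms_aggregate` are hypothetical, models are reduced to flat weight vectors, and the mixing rule is a simple pairwise average standing in for the paper's sharing-and-augmentation step.

```python
import numpy as np

def peer_mix(client_weights, rng):
    """Before uploading, each non-transitory EV averages its weights with a
    randomly chosen peer's, obfuscating its individual contribution.
    (Hypothetical simplification of the paper's peer-to-peer sharing step.)"""
    n = len(client_weights)
    mixed = []
    for i in range(n):
        j = rng.choice([k for k in range(n) if k != i])
        mixed.append(0.5 * (client_weights[i] + client_weights[j]))
    return mixed

def derms_aggregate(client_weights):
    """Community DERMS aggregation: federated averaging of the uploaded
    weight vectors into a community global model."""
    return np.mean(np.stack(client_weights), axis=0)

# One communication round with 5 simulated EVs and 4-parameter models.
rng = np.random.default_rng(0)
clients = [rng.normal(size=4) for _ in range(5)]
global_weights = derms_aggregate(peer_mix(clients, rng))
```

The global weights would then be redistributed to the EVs for the next round of onboard training.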



Head-wise Shareable Attention for Large Language Models

Cao, Zouying, Yang, Yifei, Zhao, Hai

arXiv.org Artificial Intelligence

Large Language Models (LLMs) suffer from a huge number of parameters, which restricts their deployment on edge devices. Weight sharing is one promising solution that encourages weight reuse, effectively reducing memory usage with little performance drop. However, current weight sharing techniques primarily focus on small-scale models like BERT and employ coarse-grained sharing rules, e.g., layer-wise. This is limiting given the prevalence of LLMs, and sharing an entire layer or block diminishes the flexibility of weight sharing. In this paper, we present a perspective on head-wise shareable attention for large language models. We further propose two memory-efficient methods that share parameters across attention heads, with a specific focus on LLMs. Both of them use the same dynamic strategy to select the shared weight matrices. The first method directly reuses the pre-trained weights without retraining, denoted as $\textbf{DirectShare}$. The second method first post-trains with a constraint on weight matrix similarity and then shares, denoted as $\textbf{PostShare}$. Experimental results reveal that our head-wise shared models still maintain satisfactory capabilities, demonstrating the feasibility of fine-grained weight sharing applied to LLMs.
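The DirectShare idea (select similar head weight matrices, then reuse one for the other without retraining) can be sketched as below. This is a hedged simplification, not the paper's method: the similarity-based greedy selection in `select_share_pairs` and the function names are assumptions, and real attention heads would involve separate Q/K/V/O matrices.

```python
import numpy as np

def select_share_pairs(head_weights, budget):
    """Pick the `budget` most similar head pairs, scoring similarity as the
    cosine between flattened weight matrices. (Hypothetical stand-in for the
    paper's dynamic selection strategy.)"""
    unit = [w.ravel() / np.linalg.norm(w) for w in head_weights]
    scored = []
    n = len(unit)
    for i in range(n):
        for j in range(i + 1, n):
            scored.append((float(unit[i] @ unit[j]), i, j))
    scored.sort(reverse=True)
    return [(i, j) for _, i, j in scored[:budget]]

def direct_share(head_weights, pairs):
    """DirectShare-style reuse: head j points at head i's pre-trained
    weights, with no retraining; only one copy needs to be stored."""
    shared = [w.copy() for w in head_weights]
    for i, j in pairs:
        shared[j] = shared[i]
    return shared
```

PostShare would differ only in first nudging the selected matrices toward each other with a similarity constraint during post-training, so the reuse loses less accuracy.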


Reviews: Assessing the Scalability of Biologically-Motivated Deep Learning Algorithms and Architectures

Neural Information Processing Systems

The authors provide a clear and succinct introduction to the problems and approaches of biologically plausible forms of backprop in the brain. They argue for behavioural realism as distinct from physiological realism, and undertake a detailed comparison of backprop versus difference target propagation and its variants (some of which they newly propose), as well as direct feedback alignment. In the end, though, they find that all proposed bio-plausible alternatives to backprop fall well short on complex image recognition tasks. Despite the negative results, I find such a comparison very timely for consolidating results and pushing the community to search for better and more diverse alternatives. Overall I find the work impressive. The authors claim that weight sharing is not plausible in the brain.