AITopics | network backbone

Recent progress has demonstrated that such meta-learning methods may exceed scalable human-invented architectures on image classification tasks.

artificial intelligence, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country: North America > Canada > Quebec > Montreal (0.04)

Genre: Research Report (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.68)

TSGym: Design Choices for Deep Multivariate Time-Series Forecasting

Liang, Shuang, Hou, Chaochuan, Yao, Xu, Wang, Shiping, Jiang, Minqi, Han, Songqiao, Huang, Hailiang

arXiv.org Artificial Intelligence, Sep-23-2025

Recently, deep learning has driven significant advancements in multivariate time series forecasting (MTSF) tasks. However, much of the current research in MTSF tends to evaluate models from a holistic perspective, which obscures the individual contributions and leaves critical issues unaddressed. Adhering to the current modeling paradigms, this work bridges these gaps by systematically decomposing deep MTSF methods into their core, fine-grained components like series-patching tokenization, channel-independent strategy, attention modules, or even Large Language Models and Time-series Foundation Models. Through extensive experiments and component-level analysis, our work offers more profound insights than previous benchmarks that typically discuss models as a whole. Furthermore, we propose a novel automated solution called TSGym for MTSF tasks. Unlike traditional hyperparameter tuning, neural architecture searching or fixed model selection, TSGym performs fine-grained component selection and automated model construction, which enables the creation of more effective solutions tailored to diverse time series data, therefore enhancing model transferability across different data sources and robustness against distribution shifts. Extensive experiments indicate that TSGym significantly outperforms existing state-of-the-art MTSF and AutoML methods. All code is publicly available on https://github.com/SUFE-AILAB/TSGym.
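Two of the components named in the abstract above, series-patching tokenization and the channel-independent strategy, can be illustrated with a minimal sketch. The function names `patchify` and `channel_independent_patches`, and the `patch_len` and `stride` parameters, are hypothetical and are not taken from the TSGym code base; the sketch only shows the general idea of cutting each channel into overlapping windows and treating channels separately.

```python
def patchify(series, patch_len, stride):
    """Split one univariate series into (possibly overlapping) patches.

    Each patch is a window of `patch_len` consecutive values; consecutive
    windows start `stride` steps apart. Trailing values that do not fill
    a whole patch are dropped.
    """
    last_start = len(series) - patch_len
    return [series[i:i + patch_len] for i in range(0, last_start + 1, stride)]


def channel_independent_patches(multivariate, patch_len, stride):
    """Patch every channel on its own (assumed channel-independent layout).

    `multivariate` is a list of channels, each a list of values. No
    information is mixed across channels at this stage.
    """
    return [patchify(channel, patch_len, stride) for channel in multivariate]


# Two channels of length 6, patches of length 4 taken every 2 steps.
tokens = channel_independent_patches(
    [[1, 2, 3, 4, 5, 6], [10, 20, 30, 40, 50, 60]], patch_len=4, stride=2
)
```

In a forecasting model each patch would then typically be embedded as one token; that step is omitted here.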

forecasting, large language model, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2509.17063

Country: North America (0.28)

Genre: Research Report > New Finding (1.00)

Industry:

Energy (0.93)
Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(2 more...)

Transferable Normalization: Towards Improving Transferability of Deep Neural Networks

Ximei Wang, Ying Jin, Mingsheng Long, Jianmin Wang, Michael I. Jordan

Neural Information Processing Systems, Aug-20-2025, 11:10:14 GMT

Pre-trained DNNs also show strong transferability when fine-tuned to other labeled datasets.

domain adaptation, transferability, transnorm, (14 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.05)
Asia > China (0.05)
North America > United States > California > Alameda County > Berkeley (0.04)
North America > Canada (0.04)

Genre: Research Report (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

bc573864331a9e42e4511de6f678aa83-Paper.pdf

Neural Information Processing Systems, Aug-16-2025, 03:21:50 GMT

fine-tuning, statistics, stochnorm, (15 more...)

Neural Information Processing Systems

Country:

Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > California (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.68)

Industry: Health & Medicine > Diagnostic Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Review for NeurIPS paper: Stochastic Normalization

Neural Information Processing Systems, Feb-5-2025, 06:19:51 GMT

Summary and Contributions: This paper introduces a novel method to prevent overfitting when fine-tuning a pre-trained network for a new task using a small training set. The paper proposes a hybrid batch normalization layer, called stochastic normalization, that randomly switches the normalization statistics between those calculated from the current mini-batch and the moving average statistics. The authors replace the standard batch normalization layer of different network architectures such as VGG-16, Inception-V3, and ResNet-50 with their proposed stochastic normalization and show empirically that fine-tuning using the adopted architecture outperforms multiple existing methods for the overfitting problem in fine-tuning. Overall, the paper studies a very important problem and the proposed method seems to work in practice. The major problem I have with this paper is the lack of consistency in the experimental setup.
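The switching behaviour described in the review can be sketched as follows. This is a minimal illustration on a single 1-D feature, not the authors' implementation: the function name, the selection probability `p`, and the choice to make one random decision per call are all assumptions, and the update of the moving averages during training is left out.

```python
import random


def stochastic_normalize(batch, running_mean, running_var, p=0.5, eps=1e-5, rng=random):
    """Normalize a mini-batch with randomly chosen statistics.

    With probability `p` (hypothetical parameter) the moving-average
    statistics are used; otherwise the statistics of the current
    mini-batch are used, as in ordinary batch normalization.
    """
    if rng.random() < p:
        # Branch 1: moving-average statistics accumulated elsewhere.
        mean, var = running_mean, running_var
    else:
        # Branch 2: statistics computed from the current mini-batch.
        mean = sum(batch) / len(batch)
        var = sum((x - mean) ** 2 for x in batch) / len(batch)
    return [(x - mean) / (var + eps) ** 0.5 for x in batch]


# Same input, both branches forced for comparison.
with_running = stochastic_normalize([1.0, 3.0], 0.0, 1.0, p=1.0, eps=0.0)
with_batch = stochastic_normalize([1.0, 3.0], 0.0, 1.0, p=0.0, eps=0.0)
```

Setting `p` to 0.0 reduces the sketch to plain batch normalization with batch statistics, while 1.0 always uses the moving averages.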

network backbone, rebuttal, stochastic normalization, (10 more...)

Neural Information Processing Systems

Genre: Research Report > Promising Solution (0.38)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.94)

zkFL: Zero-Knowledge Proof-based Gradient Aggregation for Federated Learning

Wang, Zhipeng, Dong, Nanqing, Sun, Jiahao, Knottenbelt, William

arXiv.org Artificial Intelligence, Oct-19-2023

Federated Learning (FL) is a machine learning paradigm, which enables multiple and decentralized clients to collaboratively train a model under the orchestration of a central aggregator. Traditional FL solutions rely on the trust assumption of the centralized aggregator, which forms cohorts of clients in a fair and honest manner. However, a malicious aggregator, in reality, could abandon and replace the client's training models, or launch Sybil attacks to insert fake clients. Such malicious behaviors give the aggregator more power to control clients in the FL setting and determine the final training results. In this work, we introduce zkFL, which leverages zero-knowledge proofs (ZKPs) to tackle the issue of a malicious aggregator during the training model aggregation process. To guarantee the correct aggregation results, the aggregator needs to provide a proof per round. The proof can demonstrate to the clients that the aggregator executes the intended behavior faithfully. To further reduce the verification cost of clients, we employ a blockchain to handle the proof in a zero-knowledge way, where miners (i.e., the nodes validating and maintaining the blockchain data) can verify the proof without knowing the clients' local and aggregated models. The theoretical analysis and empirical results show that zkFL can achieve better security and privacy than traditional FL, without modifying the underlying FL network structure or heavily compromising the training speed.

aggregator, model update, zkfl system, (15 more...)

arXiv.org Artificial Intelligence

2310.02554

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

A Hierarchical Network-Oriented Analysis of User Participation in Misinformation Spread on WhatsApp

Nobre, Gabriel Peres, Ferreira, Carlos H. G., Almeida, Jussara M.

arXiv.org Artificial Intelligence, Sep-21-2021

WhatsApp emerged as a major communication platform in many countries in recent years. Despite offering only one-to-one and small group conversations, WhatsApp has been shown to enable the formation of a rich underlying network, crossing the boundaries of existing groups, and with structural properties that favor information dissemination at large. Indeed, WhatsApp has reportedly been used as a forum of misinformation campaigns with significant social, political and economic consequences in several countries. In this article, we aim at complementing recent studies on misinformation spread on WhatsApp, mostly focused on content properties and propagation dynamics, by looking into the network that connects users sharing the same piece of content. Specifically, we present a hierarchical network-oriented characterization of the users engaged in misinformation spread by focusing on three perspectives: individuals, WhatsApp groups and user communities, i.e., groupings of users who, intentionally or not, share the same content disproportionately often. By analyzing sharing and network topological properties, our study offers valuable insights into how WhatsApp users leverage the underlying network connecting different groups to gain large reach in the spread of misinformation on the platform.

backbone, co-sharing network, misinformation, (15 more...)

arXiv.org Artificial Intelligence

2109.10462

Country:

North America > United States (0.14)
South America > Brazil > Minas Gerais (0.04)
Asia > India (0.04)
(3 more...)

Genre: Research Report > New Finding (0.67)

Industry:

Media > News (1.00)
Information Technology > Services (1.00)
Health & Medicine > Therapeutic Area > Immunology (0.67)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.46)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.85)

Searching for Efficient Multi-Scale Architectures for Dense Image Prediction

Chen, Liang-Chieh, Collins, Maxwell, Zhu, Yukun, Papandreou, George, Zoph, Barret, Schroff, Florian, Adam, Hartwig, Shlens, Jon

Neural Information Processing Systems, Dec-31-2018

The design of neural network architectures is an important component for achieving state-of-the-art performance with machine learning systems across a broad array of tasks. Much work has endeavored to design and build architectures automatically through clever construction of a search space paired with simple learning algorithms. Recent progress has demonstrated that such meta-learning methods may exceed scalable human-invented architectures on image classification tasks. An open question is the degree to which such methods may generalize to new domains. In this work we explore the construction of meta-learning techniques for dense image prediction focused on the tasks of scene parsing, person-part segmentation, and semantic image segmentation. Constructing viable search spaces in this domain is challenging because of the multi-scale representation of visual information and the necessity to operate on high resolution imagery. Based on a survey of techniques in dense image prediction, we construct a recursive search space and demonstrate that even with efficient random search, we can identify architectures that outperform human-invented architectures and achieve state-of-the-art performance on three dense prediction tasks including 82.7% on Cityscapes (street scene parsing), 71.3% on PASCAL-Person-Part (person-part segmentation), and 87.9% on PASCAL VOC 2012 (semantic image segmentation). Additionally, the resulting architecture is more computationally efficient, requiring half the parameters and half the computational cost as previous state of the art systems.
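The random search mentioned in the abstract above can be sketched in a few lines. The search space and scoring function below are toy placeholders invented for illustration: the paper's recursive search space and its proxy-task evaluation are not reproduced here, and the names `random_search`, `space`, and `toy_score` are hypothetical.

```python
import random


def random_search(search_space, evaluate, n_trials, seed=0):
    """Sample configurations uniformly at random and keep the best one.

    `search_space` maps each option name to a list of allowed values;
    `evaluate` returns a score where higher is better.
    """
    rng = random.Random(seed)
    best_cfg, best_score = None, float("-inf")
    for _ in range(n_trials):
        # Draw one value per option, independently of the other options.
        cfg = {name: rng.choice(options) for name, options in search_space.items()}
        score = evaluate(cfg)
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score


# Hypothetical toy space; real dense-prediction search spaces are far larger.
space = {
    "kernel_size": [1, 3, 5],
    "dilation_rate": [1, 6, 12, 18],
    "num_branches": [1, 2, 3, 4, 5],
}


def toy_score(cfg):
    # Placeholder objective standing in for validation accuracy.
    return (
        -abs(cfg["kernel_size"] - 3)
        - abs(cfg["dilation_rate"] - 12) / 6
        - abs(cfg["num_branches"] - 3)
    )


best_cfg, best_score = random_search(space, toy_score, n_trials=200)
```

In practice `evaluate` would train and validate a candidate architecture, which is where almost all of the cost lies; the sampling loop itself stays this simple.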

artificial intelligence, deep learning, machine learning, (17 more...)

Neural Information Processing Systems

Genre:

Overview (0.54)
Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Searching for Efficient Multi-Scale Architectures for Dense Image Prediction

Chen, Liang-Chieh, Collins, Maxwell D., Zhu, Yukun, Papandreou, George, Zoph, Barret, Schroff, Florian, Adam, Hartwig, Shlens, Jonathon

arXiv.org Machine Learning, Sep-11-2018

The design of neural network architectures is an important component for achieving state-of-the-art performance with machine learning systems across a broad array of tasks. Much work has endeavored to design and build architectures automatically through clever construction of a search space paired with simple learning algorithms. Recent progress has demonstrated that such meta-learning methods may exceed scalable human-invented architectures on image classification tasks. An open question is the degree to which such methods may generalize to new domains. In this work we explore the construction of meta-learning techniques for dense image prediction focused on the tasks of scene parsing, person-part segmentation, and semantic image segmentation. Constructing viable search spaces in this domain is challenging because of the multi-scale representation of visual information and the necessity to operate on high resolution imagery. Based on a survey of techniques in dense image prediction, we construct a recursive search space and demonstrate that even with efficient random search, we can identify architectures that outperform human-invented architectures and achieve state-of-the-art performance on three dense prediction tasks including 82.7% on Cityscapes (street scene parsing), 71.3% on PASCAL-Person-Part (person-part segmentation), and 87.9% on PASCAL VOC 2012 (semantic image segmentation). Additionally, the resulting architecture is more computationally efficient, requiring half the parameters and half the computational cost as previous state of the art systems.

artificial intelligence, deep learning, machine learning, (18 more...)

arXiv.org Machine Learning

1809.04184

Country: North America > Canada > Quebec > Montreal (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
