tiny deep learning


MCUNet: Tiny Deep Learning on IoT Devices

Neural Information Processing Systems

Machine learning on tiny IoT devices based on microcontroller units (MCUs) is appealing but challenging: the memory of microcontrollers is 2-3 orders of magnitude smaller even than that of mobile phones. We propose MCUNet, a framework that jointly designs the efficient neural architecture (TinyNAS) and the lightweight inference engine (TinyEngine), enabling ImageNet-scale inference on microcontrollers. TinyNAS adopts a two-stage neural architecture search approach that first optimizes the search space to fit the resource constraints, then specializes the network architecture in the optimized search space. TinyNAS can automatically handle diverse constraints (i.e., device, latency, energy, memory) under low search costs. TinyNAS is co-designed with TinyEngine, a memory-efficient inference library, to expand the search space and fit a larger model.
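The two-stage idea in the abstract can be sketched in a few lines of Python. Everything below is an illustrative toy: the memory and FLOPs proxies, the candidate width/resolution grids, and the 320KB budget are assumptions made for demonstration, not TinyNAS's actual profiler or search algorithm.

```python
import itertools

# Toy proxy for peak activation memory (KB): the first block of a CNN
# dominates, scaling with resolution^2, base channels, and width multiplier.
# Real TinyNAS profiles per-layer SRAM usage; this closed form is an assumption.
def peak_memory_kb(width_mult, resolution, channels=16):
    return resolution * resolution * channels * width_mult / 1024

# Toy proxy for model capacity: more FLOPs ~ higher attainable accuracy.
def flops_proxy(width_mult, resolution):
    return resolution * resolution * width_mult ** 2

def optimize_search_space(sram_budget_kb):
    """Stage 1: prune the design space to configurations that fit the MCU."""
    widths = [0.3, 0.5, 0.7, 1.0]
    resolutions = [96, 128, 160, 224]
    return [(w, r) for w, r in itertools.product(widths, resolutions)
            if peak_memory_kb(w, r) <= sram_budget_kb]

def specialize(space):
    """Stage 2 stand-in: pick the highest-capacity network in the pruned space
    (the real method trains a super network and searches it, rather than
    ranking by a closed-form proxy)."""
    return max(space, key=lambda wr: flops_proxy(*wr))

space = optimize_search_space(sram_budget_kb=320)  # e.g. an MCU with 320KB SRAM
best_width, best_resolution = specialize(space)
```

Under these toy proxies the first stage drops high-resolution, full-width configurations that would overflow SRAM, and the second stage picks the largest model that still fits, which is the essence of "optimize the search space first, then specialize within it."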


Memory-efficient Patch-based Inference for Tiny Deep Learning

Neural Information Processing Systems

Tiny deep learning on microcontroller units (MCUs) is challenging due to the limited memory size. We find that the memory bottleneck is due to the imbalanced memory distribution in convolutional neural network (CNN) designs: the first several blocks have an order of magnitude larger memory usage than the rest of the network. To alleviate this issue, we propose a generic patch-by-patch inference scheduling, which operates only on a small spatial region of the feature map and significantly cuts down the peak memory. However, naive implementation brings overlapping patches and computation overhead. We further propose receptive field redistribution to shift the receptive field and FLOPs to the later stage and reduce the computation overhead. Manually redistributing the receptive field is difficult.
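The peak-memory argument can be made concrete with a toy two-layer network. The sketch below (plain Python, all-ones 3x3 kernels, an 8x8 single-channel map) is an assumption-laden illustration, not MCUNetV2's actual scheduler: `per_layer` materializes the whole intermediate feature map, while `patch_based` keeps only one small tile plus a 2-pixel halo resident at a time; the halo is exactly the overlapping-patch overhead the abstract mentions.

```python
def conv3x3_valid(x):
    # toy 3x3 convolution with all-ones weights, "valid" padding
    h, w = len(x), len(x[0])
    return [[sum(x[i + dy][j + dx] for dy in range(3) for dx in range(3))
             for j in range(w - 2)] for i in range(h - 2)]

def pad(x, k):
    # zero-pad a 2D map by k pixels on every side
    h, w = len(x), len(x[0])
    row = [0.0] * (w + 2 * k)
    return ([row[:] for _ in range(k)]
            + [[0.0] * k + r + [0.0] * k for r in x]
            + [row[:] for _ in range(k)])

def per_layer(x):
    # conventional scheduling: the whole intermediate feature map is
    # materialized at once (peak memory ~ full map)
    return conv3x3_valid(conv3x3_valid(pad(x, 2)))

def patch_based(x, patch=4):
    # patch-by-patch scheduling: each output tile is computed from an input
    # tile enlarged by a halo of 2 (the receptive-field radius of two stacked
    # 3x3 convs) -- the overlapping-patch overhead the abstract notes
    halo = 2
    h, w = len(x), len(x[0])
    p = pad(x, halo)
    out = [[0.0] * w for _ in range(h)]
    for i in range(0, h, patch):
        for j in range(0, w, patch):
            tile = [r[j:j + patch + 2 * halo] for r in p[i:i + patch + 2 * halo]]
            res = conv3x3_valid(conv3x3_valid(tile))
            for a in range(patch):
                out[i + a][j:j + patch] = res[a]
    return out

x = [[float(i * 8 + j) for j in range(8)] for i in range(8)]
full = per_layer(x)      # whole-map schedule
tiled = patch_based(x)   # tile-by-tile schedule, same result
```

Both schedules produce identical outputs, but the patch schedule's largest live tensor is the (patch + 4)^2 tile rather than the full map, which is how patch-by-patch execution cuts peak memory on the memory-heavy early blocks.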


From Tiny Machine Learning to Tiny Deep Learning: A Survey

Somvanshi, Shriyank, Islam, Md Monzurul, Chhetri, Gaurab, Chakraborty, Rohit, Mimi, Mahmuda Sultana, Shuvo, Sawgat Ahmed, Islam, Kazi Sifatul, Javed, Syed Aaqib, Rafat, Sharif Ahmed, Dutta, Anandi, Das, Subasish

arXiv.org Artificial Intelligence

The rapid growth of edge devices has driven the demand for deploying artificial intelligence (AI) at the edge, giving rise to Tiny Machine Learning (TinyML) and its evolving counterpart, Tiny Deep Learning (TinyDL). While TinyML initially focused on enabling simple inference tasks on microcontrollers, the emergence of TinyDL marks a paradigm shift toward deploying deep learning models on severely resource-constrained hardware. This survey presents a comprehensive overview of the transition from TinyML to TinyDL, encompassing architectural innovations, hardware platforms, model optimization techniques, and software toolchains. We analyze state-of-the-art methods in quantization, pruning, and neural architecture search (NAS), and examine hardware trends from MCUs to dedicated neural accelerators. Furthermore, we categorize software deployment frameworks, compilers, and AutoML tools enabling practical on-device learning. Applications across domains such as computer vision, audio recognition, healthcare, and industrial monitoring are reviewed to illustrate the real-world impact of TinyDL. Finally, we identify emerging directions including neuromorphic computing, federated TinyDL, edge-native foundation models, and domain-specific co-design approaches. This survey aims to serve as a foundational resource for researchers and practitioners, offering a holistic view of the ecosystem and laying the groundwork for future advancements in edge AI.
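As one concrete instance of the model-optimization techniques surveyed, here is a minimal sketch of symmetric per-tensor post-training int8 quantization in plain Python. The function names and the scheme (scale = max|w| / 127, no zero point) are illustrative assumptions, not any particular toolchain's API.

```python
def quantize_int8(weights):
    """Map float weights to int8 codes with a single per-tensor scale."""
    scale = max(abs(w) for w in weights) / 127.0 or 1.0  # guard against scale 0
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the int8 codes."""
    return [v * scale for v in q]

weights = [0.51, -1.27, 0.03, 0.89]
q, scale = quantize_int8(weights)
recovered = dequantize(q, scale)  # each value within half a step of the input
```

This gives a 4x size reduction (float32 to int8) with per-weight error bounded by half a quantization step; pruning and NAS, also covered by the survey, attack parameter count and architecture rather than numeric precision.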


Review for NeurIPS paper: MCUNet: Tiny Deep Learning on IoT Devices

Neural Information Processing Systems

The co-design mechanism is the core contribution of the paper, but the detailed process is not described clearly. It is suggested to provide an elaborate diagram or pseudocode to introduce the whole framework. It is meaningful to explore its generalization ability: please demonstrate the potential of deploying MCUNet in more scenarios and tasks, e.g., on more devices or on tasks like object detection and semantic segmentation. However, the experiments fail to highlight the improvements of the co-design scheme over single-design schemes, and it is unclear whether the overall network topology is indeed the main reason for the large improvements.


Review for NeurIPS paper: MCUNet: Tiny Deep Learning on IoT Devices

Neural Information Processing Systems

All four knowledgeable referees support acceptance for the contributions, notably the co-design of TinyNAS and TinyEngine for deep learning on IoT devices and the promising experimental results on ImageNet, and I also recommend acceptance. Please make sure to appropriately reflect what was promised in the rebuttal, such as the elaboration on the co-design.



GitHub - mit-han-lab/tinyengine: [NeurIPS 2020] MCUNet: Tiny Deep Learning on IoT Devices; [NeurIPS 2021] MCUNetV2: Memory-Efficient Patch-based Inference for Tiny Deep Learning; MCUNetV3: On-Device Training Under 256KB Memory

#artificialintelligence

This is the official implementation of TinyEngine, a memory-efficient and high-performance neural network library for Microcontrollers. TinyEngine is a part of MCUNet, which also consists of TinyNAS. MCUNet is a system-algorithm co-design framework for tiny deep learning on microcontrollers. TinyEngine and TinyNAS are co-designed to fit the tight memory budgets. We will soon release Tiny Training Engine used in MCUNetV3: On-Device Training Under 256KB Memory. If you are interested in getting updates, please sign up here to get notified!