AITopics | pc-darts

Collaborating Authors

pc-darts

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Appendix

Neural Information Processing SystemsFeb-10-2026, 18:44:41 GMT

artificial intelligence, machine learning, retrieval, (17 more...)

Neural Information Processing Systems

Country: North America > United States > Oregon (0.05)

Industry: Health & Medicine > Therapeutic Area (0.32)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

Add feedback

Half Search Space is All You Need

Rumiantsev, Pavel, Coates, Mark

arXiv.org Machine LearningMay-21-2025

Neural Architecture Search (NAS) is a powerful tool for automating architecture design. One-Shot NAS techniques, such as DARTS, have gained substantial popularity due to their combination of search efficiency with simplicity of implementation. By design, One-Shot methods have high GPU memory requirements during the search. To mitigate this issue, we propose to prune the search space in an efficient automatic manner to reduce memory consumption and search time while preserving the search accuracy. Specifically, we utilise Zero-Shot NAS to efficiently remove low-performing architectures from the search space before applying One-Shot NAS to the pruned search space. Experimental results on the DARTS search space show that our approach reduces memory consumption by 81% compared to the baseline One-Shot setup while achieving the same level of accuracy.

architecture, artificial intelligence, search space, (17 more...)

arXiv.org Machine Learning

2505.13586

Country: North America > Canada > Quebec > Montreal (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)

Add feedback

Masked Autoencoders Are Robust Neural Architecture Search Learners

Hu, Yiming, Chu, Xiangxiang, Zhang, Bo

arXiv.org Artificial IntelligenceNov-20-2023

Neural Architecture Search (NAS) currently relies heavily on labeled data, which is both expensive and time-consuming to acquire. In this paper, we propose a novel NAS framework based on Masked Autoencoders (MAE) that eliminates the need for labeled data during the search process. By replacing the supervised learning objective with an image reconstruction task, our approach enables the robust discovery of network architectures without compromising performance and generalization ability. Additionally, we address the problem of performance collapse encountered in the widely-used Differentiable Architecture Search (DARTS) method in the unsupervised paradigm by introducing a multi-scale decoder. Through extensive experiments conducted on various search spaces and datasets, we demonstrate the effectiveness and robustness of the proposed method, providing empirical evidence of its superiority over baseline approaches.

architecture, architecture search, search space, (15 more...)

arXiv.org Artificial Intelligence

2311.12086

Genre: Research Report > New Finding (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Interleaving Learning, with Application to Neural Architecture Search

Ban, Hao, Xie, Pengtao

arXiv.org Artificial IntelligenceMar-11-2021

Interleaving learning is a human learning technique where a learner interleaves the studies of multiple topics, which increases long-term retention and improves ability to transfer learned knowledge. Inspired by the interleaving learning technique of humans, in this paper we explore whether this learning methodology is beneficial for improving the performance of machine learning models as well. We propose a novel machine learning framework referred to as interleaving learning (IL). In our framework, a set of models collaboratively learn a data encoder in an interleaving fashion: the encoder is trained by model 1 for a while, then passed to model 2 for further training, then model 3, and so on; after trained by all models, the encoder returns back to model 1 and is trained again, then moving to model 2, 3, etc. This process repeats for multiple rounds. Our framework is based on multi-level optimization consisting of multiple inter-connected learning stages. An efficient gradient-based algorithm is developed to solve the multi-level optimization problem. We apply interleaving learning to search neural architectures for image classification on CIFAR-10, CIFAR-100, and ImageNet. The effectiveness of our method is strongly demonstrated by the experimental results.

architecture, architecture search, cifar-10, (17 more...)

arXiv.org Artificial Intelligence

2103.07018

Country: North America > United States > California > San Diego County > San Diego (0.04)

Genre: Research Report (0.82)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Learning by Teaching, with Application to Neural Architecture Search

Sheth, Parth, Jiang, Yueyu, Xie, Pengtao

arXiv.org Artificial IntelligenceMar-11-2021

In human learning, an effective skill in improving learning outcomes is learning by teaching: a learner deepens his/her understanding of a topic by teaching this topic to others. In this paper, we aim to borrow this teaching-driven learning methodology from humans and leverage it to train more performant machine learning models, by proposing a novel ML framework referred to as learning by teaching (LBT). In the LBT framework, a teacher model improves itself by teaching a student model to learn well. Specifically, the teacher creates a pseudo-labeled dataset and uses it to train a student model. Based on how the student performs on a validation dataset, the teacher re-learns its model and re-teaches the student until the student achieves great validation performance. Our framework is based on three-level optimization which contains three stages: teacher learns; teacher teaches student; teacher re-learns based on how well the student performs. A simple but efficient algorithm is developed to solve the three-level optimization problem. We apply LBT to search neural architectures on CIFAR-10, CIFAR-100, and ImageNet. The efficacy of our method is demonstrated in various experiments.

architecture, architecture search, student, (15 more...)

arXiv.org Artificial Intelligence

2103.07009

Country: North America > United States > California > San Diego County > San Diego (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Education (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Learning by Self-Explanation, with Application to Neural Architecture Search

Hosseini, Ramtin, Xie, Pengtao

arXiv.org Artificial IntelligenceDec-23-2020

Learning by self-explanation is an effective learning technique in human learning, where students explain a learned topic to themselves for deepening their understanding of this topic. It is interesting to investigate whether this explanation-driven learning methodology broadly used by humans is helpful for improving machine learning as well. Based on this inspiration, we propose a novel machine learning method called learning by self-explanation (LeaSE). In our approach, an explainer model improves its learning ability by trying to clearly explain to an audience model regarding how a prediction outcome is made. LeaSE is formulated as a four-level optimization problem involving a sequence of four learning stages which are conducted end-to-end in a unified framework: 1) explainer learns; 2) explainer explains; 3) audience learns; 4) explainer re-learns based on the performance of the audience. We develop an efficient algorithm to solve the LeaSE problem. We apply LeaSE for neural architecture search on CIFAR-100, CIFAR-10, and ImageNet. Experimental results strongly demonstrate the effectiveness of our method.

architecture, architecture search, explainer, (15 more...)

arXiv.org Artificial Intelligence

2012.12899

Country: North America > United States > California > San Diego County > San Diego (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.89)

Add feedback

DrNAS: Dirichlet Neural Architecture Search

Chen, Xiangning, Wang, Ruochen, Cheng, Minhao, Tang, Xiaocheng, Hsieh, Cho-Jui

arXiv.org Machine LearningOct-16-2020

This paper proposes a novel differentiable architecture search method by formulating it into a distribution learning problem. We treat the continuously relaxed architecture mixing weight as random variables, modeled by Dirichlet distribution. With recently developed pathwise derivatives, the Dirichlet parameters can be easily optimized with gradient-based optimizer in an end-to-end manner. This formulation improves the generalization ability and induces stochasticity that naturally encourages exploration in the search space. Furthermore, to alleviate the large memory consumption of differentiable NAS, we propose a simple yet effective progressive learning scheme that enables searching directly on large-scale tasks, eliminating the gap between search and evaluation phases. Extensive experiments demonstrate the effectiveness of our method. Specifically, we obtain a test error of 2.46% for CIFAR-10, 23.7% for ImageNet under the mobile setting. On NAS-Bench-201, we also achieve state-of-the-art results on all three datasets and provide insights for the effective design of neural architecture search algorithms.

architecture search, artificial intelligence, machine learning, (14 more...)

arXiv.org Machine Learning

2006.10355

Country:

Europe > Sweden > Stockholm > Stockholm (0.04)
North America > United States > New York (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.41)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Faster Gradient-based NAS Pipeline Combining Broad Scalable Architecture with Confident Learning Rate

Ding, Zixiang, Chen, Yaran, Li, Nannan, Zhao, Dongbin

arXiv.org Machine LearningSep-21-2020

In order to further improve the search efficiency of Neural Architecture Search (NAS), we propose B-DARTS, a novel pipeline combining broad scalable architecture with Confident Learning Rate (CLR). In B-DARTS, Broad Convolutional Neural Network (BCNN) is employed as the scalable architecture for DARTS, a popular differentiable NAS approach. On one hand, BCNN is a broad scalable architecture whose topology achieves two advantages compared with the deep one, mainly including faster single-step training speed and higher memory efficiency (i.e. larger batch size for architecture search), which are all contributed to the search efficiency improvement of NAS. On the other hand, DARTS discovers the optimal architecture by gradient-based optimization algorithm, which benefits from two superiorities of BCNN simultaneously. Similar to vanilla DARTS, B-DARTS also suffers from the performance collapse issue, where those weight-free operations are prone to be selected by the search strategy. Therefore, we propose CLR, that considers the confidence of gradient for architecture weights update increasing with the training time of over-parameterized model, to mitigate the above issue. Experimental results on CIFAR-10 and ImageNet show that 1) B-DARTS delivers state-of-the-art efficiency of 0.09 GPU day using first order approximation on CIFAR-10; 2) the learned architecture by B-DARTS achieves competitive performance using state-of-the-art composite multiply-accumulate operations and parameters on ImageNet; and 3) the proposed CLR is effective for performance collapse issue alleviation of both B-DARTS and DARTS.

artificial intelligence, deep learning, machine learning, (19 more...)

arXiv.org Machine Learning

2009.08886

Country: Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Add feedback

Neural Architecture Search Using Stable Rank of Convolutional Layers

Machida, Kengo, Uto, Kuniaki, Shinoda, Koichi, Suzuki, Taiji

arXiv.org Artificial IntelligenceSep-19-2020

In Neural Architecture Search (NAS), Differentiable ARchiTecture Search (DARTS) has recently attracted much attention due to its high efficiency. It defines an over-parameterized network with mixed edges each of which represents all operator candidates, and jointly optimizes the weights of the network and its architecture in an alternating way. However, this process prefers a model whose weights converge faster than the others, and such a model with fastest convergence often leads to overfitting. Accordingly the resulting model cannot always be well-generalized. To overcome this problem, we propose Minimum Stable Rank DARTS (MSR-DARTS), which aims to find a model with the best generalization error by replacing the architecture optimization with the selection process using the minimum stable rank criterion. Specifically, a convolution operator is represented by a matrix and our method chooses the one whose stable rank is the smallest. We evaluate MSR-DARTS on CIFAR-10 and ImageNet dataset. It achieves an error rate of 2.92% with only 1.7M parameters within 0.5 GPU-days on CIFAR-10, and a top-1 error rate of 24.0% on ImageNet. Our MSR-DARTS directly optimizes an ImageNet model with only 2.6 GPU days while it is often impractical for existing NAS methods to directly optimize a large model such as ImageNet models and hence a proxy dataset such as CIFAR-10 is often utilized.

artificial intelligence, machine learning, operator, (17 more...)

arXiv.org Artificial Intelligence

2009.09209

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.05)
Europe > Sweden > Stockholm > Stockholm (0.04)
(9 more...)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

DARTS-: Robustly Stepping out of Performance Collapse Without Indicators

Chu, Xiangxiang, Wang, Xiaoxing, Zhang, Bo, Lu, Shun, Wei, Xiaolin, Yan, Junchi

arXiv.org Artificial IntelligenceSep-2-2020

Despite the fast development of differentiable architecture search (DARTS), it suffers from a standing instability issue regarding searching performance, which extremely limits its application. Existing robustifying methods draw clues from the outcome instead of finding out the causing factor. Various indicators such as Hessian eigenvalues are proposed as a signal of performance collapse, and the searching should be stopped once an indicator reaches a preset threshold. However, these methods tend to easily reject good architectures if thresholds are inappropriately set, let alone the searching is intrinsically noisy. In this paper, we undertake a more subtle and direct approach to resolve the collapse. We first demonstrate that skip connections with a learnable architectural coefficient can easily recover from a disadvantageous state and become dominant. We conjecture that skip connections profit too much from this privilege, hence causing the collapse for the derived model. Therefore, we propose to factor out this benefit with an auxiliary skip connection, ensuring a fairer competition for all operations. Extensive experiments on various datasets verify that our approach can substantially improve the robustness of DARTS.

artificial intelligence, darts, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2009.01027

Country: Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback