AITopics | architecture representation

Existing Neural Architecture Search (NAS) methods either encode neural architectures using discrete encodings that do not scale well, or adopt supervised learning-based methods to jointly learn architecture representations and optimize architecture search on such representations which incurs search bias. Despite the widespread use, architecture representations learned in NAS are still poorly understood. We observe that the structural properties of neural architectures are hard to preserve in the latent space if architecture representation learning and search are coupled, resulting in less effective search performance. In this work, we find empirically that pre-training architecture representations using only neural architectures without their accuracies as labels improves the downstream architecture search efficiency. To explain this finding, we visualize how unsupervised architecture representation learning better encourages neural architectures with similar connections and operators to cluster together. This helps map neural architectures with similar performance to the same regions in the latent space and makes the transition of architectures in the latent space relatively smooth, which considerably benefits diverse downstream search strategies.

architecture representation, learning help neural architecture search, name change, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

f22e4747da1aa27e363d86d40ff442fe-Paper.pdf

Neural Information Processing SystemsAug-18-2025, 19:49:07 GMT

architecture, artificial intelligence, machine learning, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > Texas > Brazos County > College Station (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.73)

Add feedback

937936029af671cf479fa893db91cbdd-Paper.pdf

Neural Information Processing SystemsAug-15-2025, 04:02:01 GMT

Figure 1: Supervised architecture representation learning (top): the supervision signal for representation learning comes from the accuracies of architectures selected by the search strategies.

architecture, architecture search, representation, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > Michigan (0.04)
North America > Canada (0.04)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.73)

Add feedback

NN-Former: Rethinking Graph Structure in Neural Architecture Representation

Xu, Ruihan, Zhang, Haokui, Wang, Yaowei, Zeng, Wei, Zhang, Shiliang

arXiv.org Artificial IntelligenceJul-2-2025

The growing use of deep learning necessitates efficient network design and deployment, making neural predictors vital for estimating attributes such as accuracy and latency. Recently, Graph Neural Networks (GNNs) and transformers have shown promising performance in representing neural architectures. However, each of both methods has its disadvantages. GNNs lack the capabilities to represent complicated features, while transformers face poor generalization when the depth of architecture grows. T o mitigate the above issues, we rethink neural architecture topology and show that sibling nodes are pivotal while overlooked in previous research. W e thus propose a novel predictor leveraging the strengths of GNNs and transformers to learn the enhanced topology. W e introduce a novel token mixer that considers siblings, and a new channel mixer named bidirectional graph isomorphism feed-forward network. Our approach consistently achieves promising performance in both accuracy and latency prediction, providing valuable insights for learning Directed Acyclic Graph (DAG) topology. The code is available at https://github.com/XuRuihan/NNF

artificial intelligence, information, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2507.0088

Country:

Asia > China > Guangdong Province > Shenzhen (0.04)
Asia > China > Shaanxi Province > Xi'an (0.04)
Asia > China > Heilongjiang Province > Harbin (0.04)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Language Embedding Meets Dynamic Graph: A New Exploration for Neural Architecture Representation Learning

Jing, Haizhao, Zhang, Haokui, Shang, Zhenhao, Xiao, Rong, Wang, Peng, Zhang, Yanning

arXiv.org Artificial IntelligenceJun-10-2025

Neural Architecture Representation Learning aims to transform network models into feature representations for predicting network attributes, playing a crucial role in deploying and designing networks for real-world applications. Recently, inspired by the success of transformers, transformer-based models integrated with Graph Neural Networks (GNNs) have achieved significant progress in representation learning. However, current methods still have some limitations. First, existing methods overlook hardware attribute information, which conflicts with the current trend of diversified deep learning hardware and limits the practical applicability of models. Second, current encoding approaches rely on static adjacency matrices to represent topological structures, failing to capture the structural differences between computational nodes, which ultimately compromises encoding effectiveness. In this paper, we introduce LeDG-Former, an innovative framework that addresses these limitations through the synergistic integration of language-based semantic embedding and dynamic graph representation learning. Specifically, inspired by large language models (LLMs), we propose a language embedding framework where both neural architectures and hardware platform specifications are projected into a unified semantic space through tokenization and LLM processing, enabling zero-shot prediction across different hardware platforms for the first time. Then, we propose a dynamic graph-based transformer for modeling neural architectures, resulting in improved neural architecture modeling performance. On the NNLQP benchmark, LeDG-Former surpasses previous methods, establishing a new SOTA while demonstrating the first successful cross-hardware latency prediction capability. Furthermore, our framework achieves superior performance on the cell-structured NAS-Bench-101 and NAS-Bench-201 datasets.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2506.07735

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)

Add feedback

CARL: Causality-guided Architecture Representation Learning for an Interpretable Performance Predictor

Ji, Han, Feng, Yuqi, Fan, Jiahao, Sun, Yanan

arXiv.org Artificial IntelligenceJun-5-2025

Performance predictors have emerged as a promising method to accelerate the evaluation stage of neural architecture search (NAS). These predictors estimate the performance of unseen architectures by learning from the correlation between a small set of trained architectures and their performance. However, most existing predictors ignore the inherent distribution shift between limited training samples and diverse test samples. Hence, they tend to learn spurious correlations as shortcuts to predictions, leading to poor generalization. To address this, we propose a Causality-guided Architecture Representation Learning (CARL) method aiming to separate critical (causal) and redundant (non-causal) features of architectures for generalizable architecture performance prediction. Specifically, we employ a substructure extractor to split the input architecture into critical and redundant substructures in the latent space. Then, we generate multiple interventional samples by pairing critical representations with diverse redundant representations to prioritize critical features. Extensive experiments on five NAS search spaces demonstrate the state-of-the-art accuracy and superior interpretability of CARL. For instance, CARL achieves 97.67% top-1 accuracy on CIFAR-10 using DARTS.

artificial intelligence, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2506.04001

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.93)
Information Technology > Artificial Intelligence > Cognitive Science (0.93)

Add feedback

Does Unsupervised Architecture Representation Learning Help Neural Architecture Search?

Neural Information Processing SystemsOct-10-2024, 19:08:23 GMT

Existing Neural Architecture Search (NAS) methods either encode neural architectures using discrete encodings that do not scale well, or adopt supervised learning-based methods to jointly learn architecture representations and optimize architecture search on such representations which incurs search bias. Despite the widespread use, architecture representations learned in NAS are still poorly understood. We observe that the structural properties of neural architectures are hard to preserve in the latent space if architecture representation learning and search are coupled, resulting in less effective search performance. In this work, we find empirically that pre-training architecture representations using only neural architectures without their accuracies as labels improves the downstream architecture search efficiency. To explain this finding, we visualize how unsupervised architecture representation learning better encourages neural architectures with similar connections and operators to cluster together.

architecture, architecture representation, neural architecture, (3 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Systems & Languages > Problem-Independent Architectures (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)

Add feedback

Hierarchical Representations for Efficient Architecture Search

Liu, Hanxiao, Simonyan, Karen, Vinyals, Oriol, Fernando, Chrisantha, Kavukcuoglu, Koray

arXiv.org Machine LearningFeb-22-2018

We explore efficient neural architecture search methods and show that a simple yet powerful evolutionary algorithm can discover new architectures with excellent performance. Our approach combines a novel hierarchical genetic representation scheme that imitates the modularized design pattern commonly adopted by human experts, and an expressive search space that supports complex topologies. Our algorithm efficiently discovers architectures that outperform a large number of manually designed models for image classification, obtaining top-1 error of 3.6% on CIFAR-10 and 20.3% when transferred to ImageNet, which is competitive with the best existing neural architecture search approaches.

artificial intelligence, evolutionary algorithm, machine learning, (17 more...)

arXiv.org Machine Learning

1711.00436

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Add feedback

Filters

Collaborating Authors

architecture representation

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

f22e4747da1aa27e363d86d40ff442fe-Paper.pdf

937936029af671cf479fa893db91cbdd-Paper.pdf

Does Unsupervised Architecture Representation Learning Help Neural Architecture Search?

f22e4747da1aa27e363d86d40ff442fe-Paper.pdf

937936029af671cf479fa893db91cbdd-Paper.pdf

NN-Former: Rethinking Graph Structure in Neural Architecture Representation

Language Embedding Meets Dynamic Graph: A New Exploration for Neural Architecture Representation Learning

CARL: Causality-guided Architecture Representation Learning for an Interpretable Performance Predictor

Does Unsupervised Architecture Representation Learning Help Neural Architecture Search?

Hierarchical Representations for Efficient Architecture Search