Problem-Independent Architectures
Review for NeurIPS paper: Adapting Neural Architectures Between Domains
This paper focuses on the intersection of neural architecture search and domain adaptation. The proposal is to minimize the cross-domain generalization gap that generally exists in current neural architecture search (NAS) methods that rely on proxy tasks. The philosophy behind it sounds quite interesting to me. Namely, instead of directly searching on the target dataset, which incurs a high computational cost, the authors propose to improve the generalizability of neural architectures by leveraging a small portion of target samples via a domain adaptation technique. This philosophy leads to a novel algorithm design, AdaptNAS, that I have not seen before.
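To make the idea concrete, one generic way to fold domain adaptation into a differentiable NAS objective is sketched below; this is an illustrative formulation, not necessarily the exact AdaptNAS loss, with $d(\cdot,\cdot)$ standing for any domain-discrepancy measure and $\lambda$ a trade-off weight.

```latex
% Generic sketch: search on the source task plus a domain-alignment term
% between source and target feature distributions; lambda and d(.,.) are
% illustrative placeholders, not the paper's exact formulation.
\[
  \min_{\alpha}\;
    \mathcal{L}_{\mathrm{src}}\!\bigl(w^{*}(\alpha), \alpha\bigr)
    + \lambda\, d\!\bigl(P_{\mathrm{src}}, P_{\mathrm{tgt}};\, w^{*}(\alpha), \alpha\bigr)
  \quad \text{s.t.} \quad
  w^{*}(\alpha) = \arg\min_{w}\, \mathcal{L}_{\mathrm{train}}(w, \alpha).
\]
```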
Reviews: XNAS: Neural Architecture Search with Expert Advice
I think the empirical evidence for exponentiated gradient descent (with wipeout) in NAS is valuable for the community. I would suggest that the authors clearly state the limitations associated with the analysis. Summary: The proposed approach for NAS treats architecture choices, i.e., intra-cell node connections as in DARTS (Liu et al., 2018), as selection among "experts". The expert weights are found via EG descent (Kivinen & Warmuth, 1997), which somehow utilizes the standard "back-propagated loss gradient" (line 113) to perform the multiplicative weight updates. It is at this point that I had trouble following the theoretical analysis, given that the EG algorithm requires a loss function on the expert prediction to be specified in order to develop a regret bound.
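For reference, the update in question is a multiplicative (exponentiated-gradient) step over the candidate operations on an edge, followed by a wipeout of experts whose weight has collapsed. The sketch below is a minimal reconstruction of that general scheme; the wipeout threshold rule here is an illustrative assumption, not the exact XNAS criterion.

```python
import numpy as np

def eg_with_wipeout(weights, grads, lr=0.1, wipeout_threshold=1e-3):
    """Minimal sketch of an exponentiated-gradient (multiplicative) update
    over per-operation 'expert' weights, followed by a simple wipeout step.
    The threshold rule is illustrative, not the criterion used in XNAS."""
    # Multiplicative update: a larger back-propagated gradient shrinks the weight.
    weights = weights * np.exp(-lr * grads)
    weights /= weights.sum()                      # project back onto the simplex
    # Wipeout: permanently remove experts whose weight has collapsed.
    alive = weights >= wipeout_threshold
    weights = np.where(alive, weights, 0.0)
    weights /= weights.sum()
    return weights, alive

# Example: four candidate operations competing on one edge of a cell.
w = np.full(4, 0.25)
g = np.array([0.5, -0.2, 1.5, 0.1])              # per-expert loss gradients
w, alive = eg_with_wipeout(w, g)
```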
Reviews: XNAS: Neural Architecture Search with Expert Advice
The reviewers appreciated the good empirical results and theoretical analysis. This paper proposes to treat NAS problems as a selection problem among experts. Over time, it eliminates underperforming experts with a wipe-out step. As two of the reviewers pointed out, the theoretical analysis is interesting (and rare, in this type of paper), but it would be good to more explicitly spell out when and why the assumptions hold. Empirical performance seems good, but the authors should include error bars for at least the CIFAR-10 experiments and ideally the ImageNet ones as well.
Decomposing Interventional Causality into Synergistic, Redundant, and Unique Components
We introduce a novel framework for decomposing interventional causal effects into synergistic, redundant, and unique components, building on the intuition of Partial Information Decomposition (PID) and the principle of Möbius inversion. While recent work has explored a similar decomposition of an observational measure, we argue that a proper causal decomposition must be interventional in nature. We develop a mathematical approach that systematically quantifies how causal power is distributed among variables in a system, using a recently derived closed-form expression for the Möbius function of the redundancy lattice. The formalism is then illustrated by decomposing the causal power in logic gates, cellular automata, and chemical reaction networks. Our results reveal how the distribution of causal power can be context- and parameter-dependent. This decomposition provides new insights into complex systems by revealing how causal influences are shared and combined among multiple variables, with potential applications ranging from attribution of responsibility in legal or AI systems, to the analysis of biological networks or climate models.
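For readers less familiar with PID, the lattice machinery referenced here follows the standard Williams-Beer-style relation between cumulative redundancy and the atoms recovered by Möbius inversion; in generic notation (not necessarily the paper's symbols):

```latex
% Generic relation on the redundancy lattice: the cumulative redundancy
% I_cap at a node alpha is the sum of the atoms Pi at or below it, and
% Mobius inversion with the lattice's Mobius function mu recovers the atoms.
\[
  I_{\cap}(\alpha) \;=\; \sum_{\beta \preceq \alpha} \Pi(\beta)
  \quad\Longleftrightarrow\quad
  \Pi(\alpha) \;=\; \sum_{\beta \preceq \alpha} \mu(\beta, \alpha)\, I_{\cap}(\beta).
\]
```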
Arbitrarily Scalable Environment Generators via Neural Cellular Automata
We study the problem of generating arbitrarily large environments to improve the throughput of multi-robot systems. Prior work proposes Quality Diversity (QD) algorithms as an effective method for optimizing the environments of automated warehouses. However, these approaches optimize only relatively small environments, falling short when it comes to replicating real-world warehouse sizes. The challenge arises from the exponential increase in the search space as the environment size increases. Additionally, the previous methods have only been tested with up to 350 robots in simulations, while practical warehouses could host thousands of robots.
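The neural-cellular-automaton generator named in the title sidesteps this scaling issue because its update rule is purely local, so a rule trained on small grids can be rolled out on arbitrarily large ones. The toy, non-learned update step below only illustrates that size-agnostic property; it is an assumption-laden sketch, not the paper's trained generator.

```python
import numpy as np

def nca_step(grid, kernel, rule):
    """Sketch of a single cellular-automaton-style update: 'kernel' aggregates
    each cell's 3x3 neighbourhood and 'rule' maps the aggregate to the next
    state. Because the rule is purely local, the same generator applies to
    grids of any size."""
    h, w = grid.shape
    padded = np.pad(grid, 1, mode="wrap")
    agg = np.zeros_like(grid, dtype=float)
    # Fixed 3x3 perception kernel stands in for a learned perception convolution.
    for dy in range(3):
        for dx in range(3):
            agg += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return rule(agg)

# Toy usage: threshold rule producing a binary obstacle map of any size.
kernel = np.ones((3, 3)) / 9.0
rule = lambda a: (a > 0.5).astype(float)
env = nca_step(np.random.rand(64, 64), kernel, rule)   # works for 1024x1024 too
```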
How Powerful are Performance Predictors in Neural Architecture Search?
Early methods in the rapidly developing field of neural architecture search (NAS) required fully training thousands of neural networks. To reduce this extreme computational cost, dozens of techniques have since been proposed to predict the final performance of neural architectures. Despite the success of such performance prediction methods, it is not well understood how different families of techniques compare to one another, due to the lack of an agreed-upon evaluation metric and because different methods are optimized for different constraints on initialization time and query time. In this work, we give the first large-scale study of performance predictors by analyzing 31 techniques ranging from learning curve extrapolation, to weight-sharing, to supervised learning, to zero-cost proxies. We test a number of correlation- and rank-based performance measures in a variety of settings, as well as the ability of each technique to speed up predictor-based NAS frameworks.
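As an illustration of the rank-based measures such a study relies on, a minimal evaluation loop might look like the sketch below; the accuracies are synthetic placeholders standing in for a real NAS benchmark and a real predictor.

```python
import numpy as np
from scipy.stats import kendalltau, spearmanr

# Synthetic stand-ins: 'true_acc' would come from fully trained architectures
# (e.g. a tabular NAS benchmark) and 'pred_acc' from a performance predictor.
rng = np.random.default_rng(0)
true_acc = rng.uniform(90.0, 95.0, size=200)
pred_acc = true_acc + rng.normal(0.0, 0.5, size=200)   # noisy predictions

tau, _ = kendalltau(true_acc, pred_acc)
rho, _ = spearmanr(true_acc, pred_acc)
print("Kendall tau :", tau)
print("Spearman rho:", rho)

# Top-k regret: accuracy lost by trusting the predictor's k highest-ranked models.
k = 10
top_pred = np.argsort(pred_acc)[-k:]
print("Top-%d regret: %.3f" % (k, true_acc.max() - true_acc[top_pred].max()))
```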
Unifying and Boosting Gradient-Based Training-Free Neural Architecture Search
Neural architecture search (NAS) has gained immense popularity owing to its ability to automate neural architecture design. A number of training-free metrics have recently been proposed to realize NAS without training, hence making NAS more scalable. Despite their competitive empirical performance, a unified theoretical understanding of these training-free metrics is lacking. As a consequence, (a) the relationships among these metrics are unclear, (b) there is no theoretical interpretation for their empirical performance, and (c) there may exist untapped potential in existing training-free NAS, which could be unveiled through a unified theoretical understanding. To this end, this paper presents a unified theoretical analysis of gradient-based training-free NAS, which allows us to (a) theoretically study their relationships, (b) theoretically guarantee their generalization performance, and (c) exploit our unified theoretical understanding to develop a novel framework named hybrid NAS (HNAS), which consistently boosts training-free NAS in a principled way.
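As one concrete member of the gradient-based, training-free family analyzed here, a simple gradient-norm proxy computed at initialization on a single minibatch can be sketched as below; this is a generic zero-cost metric in that spirit, not HNAS itself, and the candidate networks are toy placeholders.

```python
import torch
import torch.nn as nn

def grad_norm_score(model, inputs, targets, loss_fn=nn.CrossEntropyLoss()):
    """Sketch of a simple gradient-based training-free proxy: the summed
    parameter-gradient norm on one minibatch at initialization."""
    model.zero_grad()
    loss = loss_fn(model(inputs), targets)
    loss.backward()
    return sum(p.grad.norm().item() for p in model.parameters() if p.grad is not None)

# Toy usage: rank two candidate architectures on one random batch, no training.
x, y = torch.randn(8, 3, 32, 32), torch.randint(0, 10, (8,))
cand_a = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
cand_b = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 64), nn.ReLU(), nn.Linear(64, 10))
scores = {name: grad_norm_score(m, x, y) for name, m in [("a", cand_a), ("b", cand_b)]}
print(scores)
```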
BLOX: Macro Neural Architecture Search Benchmark and Algorithms
Neural architecture search (NAS) has been successfully used to design numerous high-performance neural networks. However, NAS is typically compute-intensive, so most existing approaches restrict the search to deciding the operations and topological structure of a single block only; the same block is then stacked repeatedly to form an end-to-end model. Although such an approach reduces the size of the search space, recent studies show that a macro search space, which allows blocks in a model to be different, can lead to better performance. To provide a systematic study of the performance of NAS algorithms on a macro search space, we release Blox – a benchmark that consists of 91k unique models trained on the CIFAR-100 dataset. The dataset also includes runtime measurements of all the models on a diverse set of hardware platforms. We perform extensive experiments to compare existing algorithms that are well studied on cell-based search spaces with the emerging blockwise approaches that aim to make NAS scalable to much larger macro search spaces.
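A quick back-of-the-envelope comparison shows why macro spaces grow so fast; the specific counts below are an assumption chosen only so that the macro total lands near the benchmark's 91k models, not a description of the actual Blox design.

```python
# Illustrative only: with K candidate block designs and B block positions,
# a cell-based space that repeats one block offers K choices, while a macro
# space that lets every block differ offers K**B.
K, B = 45, 3                   # assumed numbers for illustration
print("cell-based:", K)        # 45
print("macro     :", K ** B)   # 91125, on the order of 91k
```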
Learning Graph Cellular Automata
Cellular automata (CA) are a class of computational models that exhibit rich dynamics emerging from the local interaction of cells arranged in a regular lattice. In this work we focus on a generalised version of typical CA, called graph cellular automata (GCA), in which the lattice structure is replaced by an arbitrary graph. In particular, we extend previous work that used convolutional neural networks to learn the transition rule of conventional CA and we use graph neural networks to learn a variety of transition rules for GCA. First, we present a general-purpose architecture for learning GCA, and we show that it can represent any arbitrary GCA with finite and discrete state space. Then, we test our approach on three different tasks: 1) learning the transition rule of a GCA on a Voronoi tessellation; 2) imitating the behaviour of a group of flocking agents; 3) learning a rule that converges to a desired target state.
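A minimal message-passing transition of the kind such a model learns is sketched below; it is a generic graph-neural-network update (each node's next state depends on its own state and an aggregate of its neighbours'), not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class GCAStep(nn.Module):
    """Sketch of a graph-cellular-automaton transition: each node's next state
    is an MLP of its own state and the sum of its neighbours' states."""
    def __init__(self, state_dim, hidden=32):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(2 * state_dim, hidden),
                                 nn.ReLU(),
                                 nn.Linear(hidden, state_dim))

    def forward(self, states, adj):
        # states: (N, state_dim); adj: (N, N) adjacency of an arbitrary graph.
        neighbour_sum = adj @ states
        return self.mlp(torch.cat([states, neighbour_sum], dim=-1))

# Toy rollout on a random graph: apply the same local rule for five steps.
N, D = 16, 4
adj = (torch.rand(N, N) < 0.2).float()
states = torch.rand(N, D)
step = GCAStep(D)
for _ in range(5):
    states = step(states, adj)
```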
Few-shot Task-agnostic Neural Architecture Search for Distilling Large Language Models
Traditional knowledge distillation (KD) methods manually design student architectures to compress large models given a pre-specified computational cost. This requires several trials to find a viable student, and the process must be repeated whenever the computational budget changes. We use Neural Architecture Search (NAS) to automatically distill several compressed students with variable cost from a large model. Existing NAS methods train a single SuperLM consisting of millions of subnetworks with weight-sharing, resulting in interference between subnetworks of different sizes. Additionally, many of these works are task-specific, requiring task labels for SuperLM training.
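To illustrate the weight-sharing that underlies the interference mentioned above, the sketch below lets sub-networks of different widths slice one shared weight matrix; this reflects generic super-network practice, not the paper's specific SuperLM implementation.

```python
import torch
import torch.nn as nn

class SharedLinear(nn.Module):
    """Sketch of weight-sharing in a super-network: sub-networks of different
    widths reuse slices of one large weight matrix, which is also where
    interference between differently sized sub-networks can arise."""
    def __init__(self, max_in, max_out):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(max_out, max_in) * 0.02)
        self.bias = nn.Parameter(torch.zeros(max_out))

    def forward(self, x, out_dim):
        # Select a sub-network by slicing the shared parameters.
        w = self.weight[:out_dim, : x.shape[-1]]
        b = self.bias[:out_dim]
        return x @ w.t() + b

# Toy usage: two student candidates of different widths share the same weights.
layer = SharedLinear(max_in=512, max_out=512)
x = torch.randn(2, 512)
small = layer(x, out_dim=128)   # a narrow student candidate
large = layer(x, out_dim=512)   # a wide student candidate
```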