Goto

Collaborating Authors

 Search


FX-DARTS: Designing Topology-unconstrained Architectures with Differentiable Architecture Search and Entropy-based Super-network Shrinking

arXiv.org Artificial Intelligence

--Strong priors are imposed on the search space of Differentiable Architecture Search (DARTS), such that cells of the same type share the same topological structure and each intermediate node retains two operators from distinct nodes. While these priors reduce optimization difficulties and improve the applicability of searched architectures, they hinder the subsequent development of automated machine learning (Auto-ML) and prevent the optimization algorithm from exploring more powerful neural networks through improved architectural flexibility. This paper aims to reduce these prior constraints by eliminating restrictions on cell topology and modifying the dis-cretization mechanism for super-networks. Specifically, the Flexible DARTS (FX-DARTS) method, which leverages an Entropy-based Super-Network Shrinking (ESS) framework, is presented to address the challenges arising from the elimination of prior constraints. Notably, FX-DARTS enables the derivation of neural architectures without strict prior rules while maintaining the stability in the enlarged search space. Experimental results on image classification benchmarks demonstrate that FX-DARTS is capable of exploring a set of neural architectures with competitive trade-offs between performance and computational complexity within a single search procedure. Derong Liu is with the School of System Design and Intelligent Manufacturing, Southern University of Science and Technology, Shenzhen 518000, China (email: liudr@sustech.edu.cn), and also with the Department of Electrical and Computer Engineering, University of Illinois Chicago, Chicago, IL 60607, USA (e-mail: derong@uic.edu). This manuscript is submitted to IEEE Transaction on Neural Network and Learning Systems and is under reviewed. Personal use of this manuscript is permitted. Over the past decade, the powerful representation ability of deep neural networks (DNNs) has contributed to significant progress in various machine learning tasks, including computer vision [1], [2], natural language processing [3], [4], system identification and control [5], [6], time series prediction [7], and autonomous vehicles [8]-[10], among others. Undoubtedly, the architecture design of neural networks plays a pivotal role in these breakthroughs [11]-[15]. To address the labor-intensive and time-consuming trial-and-error process of DNN architecture design, neural architecture search (NAS) [16] has emerged as a promising approach. NAS automates the exploration of a vast space of potential architectures, traditionally through three key steps: defining a search space, selecting a search algorithm, and identifying an optimal architecture within the search space. The effectiveness of NAS heavily relies on the careful design of both the search space and the search strategy, as a well-constructed search space can significantly enhance the search algorithm's ability to discover optimal neural architectures [17].


MINT: Multi-Vector Search Index Tuning

arXiv.org Artificial Intelligence

Vector search plays a crucial role in many real-world applications. In addition to single-vector search, multi-vector search becomes important for multi-modal and multi-feature scenarios today. In a multi-vector database, each row is an item, each column represents a feature of items, and each cell is a high-dimensional vector. In multi-vector databases, the choice of indexes can have a significant impact on performance. Although index tuning for relational databases has been extensively studied, index tuning for multi-vector search remains unclear and challenging. In this paper, we define multi-vector search index tuning and propose a framework to solve it. Specifically, given a multi-vector search workload, we develop algorithms to find indexes that minimize latency and meet storage and recall constraints. Compared to the baseline, our latency achieves 2.1X to 8.3X speedup.


Learning Efficiency Meets Symmetry Breaking

arXiv.org Artificial Intelligence

Learning-based planners leveraging Graph Neural Networks can learn search guidance applicable to large search spaces, yet their potential to address symmetries remains largely unexplored. In this paper, we introduce a graph representation of planning problems allying learning efficiency with the ability to detect symmetries, along with two pruning methods, action pruning and state pruning, designed to manage symmetries during search. The integration of these techniques into Fast Downward achieves a first-time success over LAMA on the latest IPC learning track dataset. Code is released at: https://github.com/bybeye/Distincter.


Smooth Approximations of the Rounding Function

arXiv.org Artificial Intelligence

We propose novel smooth approximations to the classical rounding function, suitable for differentiable optimization and machine learning applications. Our constructions are based on two approaches: (1) localized sigmoid window functions centered at each integer, and (2) normalized weighted sums of sigmoid derivatives representing local densities. The first method approximates the step-like behavior of rounding through differences of shifted sigmoids, while the second method achieves smooth interpolation between integers via density-based weighting. Both methods converge pointwise to the classical rounding function as the sharpness parameter k tends to infinity, and allow controlled trade-offs between smoothness and approximation accuracy. We demonstrate that by restricting the summation to a small set of nearest integers, the computational cost remains low without sacrificing precision. These constructions provide fully differentiable alternatives to hard rounding, which are valuable in contexts where gradient-based methods are essential.


Towards minimax optimal algorithms for Active Simple Hypothesis Testing

arXiv.org Machine Learning

We study the problem of Active Simple Hypothesis Testing (ASHT) whe re an agent is faced with the problem of choosing between m different simple hypotheses after observing T samples. At the end of T samples, it has to output one of the m hypothesis. The distinguishing difference from the usual hypothes is testing scenario is the ability to choose one of K actions and observe the corresponding sample for that action. Th is ability to control the samples in this way makes the problem more interesting and difficult compared to the usual hypothesis testing with no control over the sample generation. The performance of the agent is meas ured in terms of the error probability its decision incurs. The above theoretical problem is a model for many practica l scenarios-A cosmetic drug trial often involve a testing period where the outcome of interest is to identify the best product after the trial period, choosing a channel from a set of channels before commencing communications, placeme nt of sensors in certain set of positions so as to minimize signal error. Any situation which require a period of testing b efore committing to a final decision with only certain fixed budget of samples (that is an inability to request additio nal samples) can be modeled effectively using ASHT and its more general version - Fixed Budget Best Arm Identific ation (FB-BAI). We intend to study the ASHT problem in the large deviation setting with the quantity of interest being the minimax error exponent over the hypotheses, that is, the worst case er ror exponent over the hypotheses.


Introduction to Quantum Machine Learning and Quantum Architecture Search

arXiv.org Artificial Intelligence

Introduction to Quantum Machine Learning and Quantum Architecture Search Samuel Y en-Chi Chen 1 Zhiding Liang 2 1 Wells Fargo 2 Rensselaer Polytechnic Institute Abstract --Recent advancements in quantum computing (QC) and machine learning (ML) have fueled significant research efforts aimed at integrating these two transformative technologies. Quantum machine learning (QML), an emerging interdisciplinary field, leverages quantum principles to enhance the performance of ML algorithms. Concurrently, the exploration of systematic and automated approaches for designing high-performance quantum circuit architectures for QML tasks has gained prominence, as these methods empower researchers outside the quantum computing domain to effectively utilize quantum-enhanced tools. This tutorial will provide an in-depth overview of recent breakthroughs in both areas, highlighting their potential to expand the application landscape of QML across diverse fields. I NTRODUCTION Quantum computing (QC) offers the potential for substantial speedups in solving certain computationally challenging problems compared to classical computers. Recent advancements in quantum hardware, coupled with remarkable progress in classical AI and machine learning (ML) techniques, have sparked growing interest in merging these two technologies to further accelerate advancements in artificial intelligence.


OPO: Making Decision-Focused Data Acquisition Decisions

arXiv.org Artificial Intelligence

We propose a model for making data acquisition decisions for variables in contextual stochastic optimisation problems. Data acquisition decisions are typically treated as separate and fixed. We explore problem settings in which the acquisition of contextual variables is costly and consequently constrained. The data acquisition problem is often solved heuristically for proxy objectives such as coverage. The more intuitive objective is the downstream decision quality as a result of data acquisition decisions. The whole pipeline can be characterised as an optimise-then-predict-then-optimise (OPO) problem. Analogously, much recent research has focused on how to integrate prediction and optimisation (PO) in the form of decision-focused learning. We propose leveraging differentiable optimisation to extend the integration to data acquisition. We solve the data acquisition problem with well-defined constraints by learning a surrogate linear objective function. We demonstrate an application of this model on a shortest path problem for which we first have to set a drone reconnaissance strategy to capture image segments serving as inputs to a model that predicts travel costs. We ablate the problem with a number of training modalities and demonstrate that the differentiable optimisation approach outperforms random search strategies.


Causal DAG Summarization (Full Version)

arXiv.org Artificial Intelligence

Causal inference aids researchers in discovering cause-and-effect relationships, leading to scientific insights. Accurate causal estimation requires identifying confounding variables to avoid false discoveries. Pearl's causal model uses causal DAGs to identify confounding variables, but incorrect DAGs can lead to unreliable causal conclusions. However, for high dimensional data, the causal DAGs are often complex beyond human verifiability. Graph summarization is a logical next step, but current methods for general-purpose graph summarization are inadequate for causal DAG summarization. This paper addresses these challenges by proposing a causal graph summarization objective that balances graph simplification for better understanding while retaining essential causal information for reliable inference. We develop an efficient greedy algorithm and show that summary causal DAGs can be directly used for inference and are more robust to misspecification of assumptions, enhancing robustness for causal inference. Experimenting with six real-life datasets, we compared our algorithm to three existing solutions, showing its effectiveness in handling high-dimensional data and its ability to generate summary DAGs that ensure both reliable causal inference and robustness against misspecifications.


On Revealing the Hidden Problem Structure in Real-World and Theoretical Problems Using Walsh Coefficient Influence

arXiv.org Artificial Intelligence

Gray-box optimization employs Walsh decomposition to obtain non-linear variable dependencies and utilize them to propose masks of variables that have a joint non-linear influence on fitness value. These masks significantly improve the effectiveness of variation operators. In some problems, all variables are non-linearly dependent, making the aforementioned masks useless. We analyze the features of the real-world instances of such problems and show that many of their dependencies may have noise-like origins. Such noise-caused dependencies are irrelevant to the optimization process and can be ignored. To identify them, we propose extending the use of Walsh decomposition by measuring variable dependency strength that allows the construction of the weighted dynamic Variable Interaction Graph (wdVIG). wdVIGs adjust the dependency strength to mixed individuals. They allow the filtering of irrelevant dependencies and re-enable using dependency-based masks by variation operators. We verify the wdVIG potential on a large benchmark suite. For problems with noise, the wdVIG masks can improve the optimizer's effectiveness. If all dependencies are relevant for the optimization, i.e., the problem is not noised, the influence of wdVIG masks is similar to that of state-of-the-art structures of this kind.


The Model Counting Competitions 2021-2023

arXiv.org Artificial Intelligence

Modern society is full of computational challenges that rely on probabilistic reasoning, statistics, and combinatorics. Interestingly, many of these questions can be formulated by encoding them into propositional formulas and then asking for its number of models. With a growing interest in practical problem-solving for tasks that involve model counting, the community established the Model Counting (MC) Competition in fall of 2019 with its first iteration in 2020. The competition aims at advancing applications, identifying challenging benchmarks, fostering new solver development, and enhancing existing solvers for model counting problems and their variants. The first iteration, brought together various researchers, identified challenges, and inspired numerous new applications. In this paper, we present a comprehensive overview of the 2021-2023 iterations of the Model Counting Competition. We detail its execution and outcomes. The competition comprised four tracks, each focusing on a different variant of the model counting problem. The first track centered on the model counting problem (MC), which seeks the count of models for a given propositional formula. The second track challenged developers to submit programs capable of solving the weighted model counting problem (WMC). The third track was dedicated to projected model counting (PMC). Finally, we initiated a track that combined projected and weighted model counting (PWMC). The competition continued with a high level of participation, with seven to nine solvers submitted in various different version and based on quite diverging techniques.