Goto

Collaborating Authors

 Optimization


Export Reviews, Discussions, Author Feedback and Meta-Reviews

Neural Information Processing Systems

The authors uses an estimation approach to estimate the scaling matrix and replaces an O(np p 3) per iteration cost with an O(np p 2) cost. The new optimization method applies to generalized linear models and is extremely well crafted. The paper is extremely clear, the quality is superb, the originality is good and the significance only limited by the experiments and the importance of generalized linear models. The experiments are a bit weak. There are only two examples -- logistic regression and ordinary least squares.


Export Reviews, Discussions, Author Feedback and Meta-Reviews

Neural Information Processing Systems

The authors propose a probabilistic version of the "line search" procedure that is commonly used as a subroutine in many deterministic optimization algorithms. The new technique can be applied when the evaluations of the objective function and its gradients are corrupted by noise. Therefore, the proposed method can be successfully used in stochastic optimization problems, eliminating the requirement of having to specify a learning rate parameter in this type of problems. The proposed method uses a Gaussian process surrogate model for the objective and its gradients. This allows us to obtain a probabilistic version of the conditions commonly used to terminate line searches in the deterministic scenario.


Adaptive Learning-based Model Predictive Control Strategy for Drift Vehicles

arXiv.org Artificial Intelligence

Drift vehicle control offers valuable insights to support safe autonomous driving in extreme conditions, which hinges on tracking a particular path while maintaining the vehicle states near the drift equilibrium points (DEP). However, conventional tracking methods are not adaptable for drift vehicles due to their opposite steering angle and yaw rate. In this paper, we propose an adaptive path tracking (APT) control method to dynamically adjust drift states to follow the reference path, improving the commonly utilized predictive path tracking methods with released computation burden. Furthermore, existing control strategies necessitate a precise system model to calculate the DEP, which can be more intractable due to the highly nonlinear drift dynamics and sensitive vehicle parameters. To tackle this problem, an adaptive learning-based model predictive control (ALMPC) strategy is proposed based on the APT method, where an upper-level Bayesian optimization is employed to learn the DEP and APT control law to instruct a lower-level MPC drift controller. This hierarchical system architecture can also resolve the inherent control conflict between path tracking and drifting by separating these objectives into different layers. The ALMPC strategy is verified on the Matlab-Carsim platform, and simulation results demonstrate its effectiveness in controlling the drift vehicle to follow a clothoid-based reference path even with the misidentified road friction parameter.


Computing and Learning on Combinatorial Data

arXiv.org Artificial Intelligence

The twenty-first century is a data-driven era where human activities and behavior, physical phenomena, scientific discoveries, technology advancements, and almost everything that happens in the world resulting in massive generation, collection, and utilization of data. Connectivity in data is a crucial property. A straightforward example is the World Wide Web, where every webpage is connected to other web pages through hyperlinks, providing a form of directed connectivity. Combinatorial data refers to combinations of data items based on certain connectivity rules. Other forms of combinatorial data include social networks, meshes, community clusters, set systems, and molecules. This Ph.D. dissertation focuses on learning and computing with combinatorial data. We study and examine topological and connectivity features within and across connected data to improve the performance of learning and achieve high algorithmic efficiency.


Probabilistic Artificial Intelligence

arXiv.org Artificial Intelligence

Artificial intelligence commonly refers to the science and engineering of artificial systems that can carry out tasks generally associated with requiring aspects of human intelligence, such as playing games, translating languages, and driving cars. In recent years, there have been exciting advances in learning-based, data-driven approaches towards AI, and machine learning and deep learning have enabled computer systems to perceive the world in unprecedented ways. Reinforcement learning has enabled breakthroughs in complex games such as Go and challenging robotics tasks such as quadrupedal locomotion. A key aspect of intelligence is to not only make predictions, but reason about the uncertainty in these predictions, and to consider this uncertainty when making decisions. This is what this manuscript on "Probabilistic Artificial Intelligence" is about. The first part covers probabilistic approaches to machine learning. We discuss the differentiation between "epistemic" uncertainty due to lack of data and "aleatoric" uncertainty, which is irreducible and stems, e.g., from noisy observations and outcomes. We discuss concrete approaches towards probabilistic inference and modern approaches to efficient approximate inference. The second part of the manuscript is about taking uncertainty into account in sequential decision tasks. We consider active learning and Bayesian optimization -- approaches that collect data by proposing experiments that are informative for reducing the epistemic uncertainty. We then consider reinforcement learning and modern deep RL approaches that use neural network function approximation. We close by discussing modern approaches in model-based RL, which harness epistemic and aleatoric uncertainty to guide exploration, while also reasoning about safety.


Generative-enhanced optimization for knapsack problems: an industry-relevant study

arXiv.org Artificial Intelligence

Optimization is a crucial task in various industries such as logistics, aviation, manufacturing, chemical, pharmaceutical, and insurance, where finding the best solution to a problem can result in significant cost savings and increased efficiency. Tensor networks (TNs) have gained prominence in recent years in modeling classical systems with quantum-inspired approaches. More recently, TN generative-enhanced optimization (TN-GEO) has been proposed as a strategy which uses generative modeling to efficiently sample valid solutions with respect to certain constraints of optimization problems. Moreover, it has been shown that symmetric TNs (STNs) can encode certain constraints of optimization problems, thus aiding in their solution process. In this work, we investigate the applicability of TN- and STN-GEO to an industry relevant problem class, a multi-knapsack problem, in which each object must be assigned to an available knapsack. We detail a prescription for practitioners to use the TN-and STN-GEO methodology and study its scaling behavior and dependence on its hyper-parameters. We benchmark 60 different problem instances and find that TN-GEO and STN-GEO produce results of similar quality to simulated annealing.


Contextual Scenario Generation for Two-Stage Stochastic Programming

arXiv.org Artificial Intelligence

Two-stage stochastic programs (2SPs) are important tools for making decisions under uncertainty. Decision-makers use contextual information to generate a set of scenarios to represent the true conditional distribution. However, the number of scenarios required is a barrier to implementing 2SPs, motivating the problem of generating a small set of surrogate scenarios that yield high-quality decisions when they represent uncertainty. Current scenario generation approaches do not leverage contextual information or do not address computational concerns. In response, we propose contextual scenario generation (CSG) to learn a mapping between the context and a set of surrogate scenarios of user-specified size. First, we propose a distributional approach that learns the mapping by minimizing a distributional distance between the predicted surrogate scenarios and the true contextual distribution. Second, we propose a task-based approach that aims to produce surrogate scenarios that yield high-quality decisions. The task-based approach uses neural architectures to approximate the downstream objective and leverages the approximation to search for the mapping. The proposed approaches apply to various problem structures and loosely only require efficient solving of the associated subproblems and 2SPs defined on the reduced scenario sets. Numerical experiments demonstrating the effectiveness of the proposed methods are presented.


Shapley Value Approximation Based on k-Additive Games

arXiv.org Artificial Intelligence

The Shapley value is the prevalent solution for fair division problems in which a payout is to be divided among multiple agents. By adopting a game-theoretic view, the idea of fair division and the Shapley value can also be used in machine learning to quantify the individual contribution of features or data points to the performance of a predictive model. Despite its popularity and axiomatic justification, the Shapley value suffers from a computational complexity that scales exponentially with the number of entities involved, and hence requires approximation methods for its reliable estimation. We propose SVA$k_{\text{ADD}}$, a novel approximation method that fits a $k$-additive surrogate game. By taking advantage of $k$-additivity, we are able to elicit the exact Shapley values of the surrogate game and then use these values as estimates for the original fair division problem. The efficacy of our method is evaluated empirically and compared to competing methods.


DobLIX: A Dual-Objective Learned Index for Log-Structured Merge Trees

arXiv.org Artificial Intelligence

In this paper, we introduce DobLIX, a dual-objective learned index specifically designed for Log-Structured Merge(LSM) tree-based key-value stores. Although traditional learned indexes focus exclusively on optimizing index lookups, they often overlook the impact of data access from storage, resulting in performance bottlenecks. DobLIX addresses this by incorporating a second objective, data access optimization, into the learned index training process. This dual-objective approach ensures that both index lookup efficiency and data access costs are minimized, leading to significant improvements in read performance while maintaining write efficiency in real-world LSM-tree systems. Additionally, DobLIX features a reinforcement learning agent that dynamically tunes the system parameters, allowing it to adapt to varying workloads in real-time. Experimental results using real-world datasets demonstrate that DobLIX reduces indexing overhead and improves throughput by 1.19 to 2.21 times compared to state-of-the-art methods within RocksDB, a widely used LSM-tree-based storage engine.


Unbiased Sliced Wasserstein Kernels for High-Quality Audio Captioning

arXiv.org Artificial Intelligence

Teacher-forcing training for audio captioning usually leads to exposure bias due to training and inference mismatch. Prior works propose the contrastive method to deal with caption degeneration. However, the contrastive method ignores the temporal information when measuring similarity across acoustic and linguistic modalities, leading to inferior performance. In this work, we develop the temporal-similarity score by introducing the unbiased sliced Wasserstein RBF (USW-RBF) kernel equipped with rotary positional embedding to account for temporal information across modalities. In contrast to the conventional sliced Wasserstein RBF kernel, we can form an unbiased estimation of USW-RBF kernel via Monte Carlo estimation. Therefore, it is well-suited to stochastic gradient optimization algorithms, and its approximation error decreases at a parametric rate of $\mathcal{O}(L^{-1/2})$ with $L$ Monte Carlo samples. Additionally, we introduce an audio captioning framework based on the unbiased sliced Wasserstein kernel, incorporating stochastic decoding methods to mitigate caption degeneration during the generation process. We conduct extensive quantitative and qualitative experiments on two datasets, AudioCaps and Clotho, to illustrate the capability of generating high-quality audio captions. Experimental results show that our framework is able to increase caption length, lexical diversity, and text-to-audio self-retrieval accuracy.