 Optimization


Predictive Optimization with Zero-Shot Domain Adaptation

arXiv.org Machine Learning

Prediction in a new domain without any training samples, called zero-shot domain adaptation (ZSDA) (Yang and Hospedales, 2015a,b), is an important task in domain adaptation. To this end, an approach that utilizes domain descriptions, called domain attributes (Yang and Hospedales, 2015a,b), has been developed. The goal of ZSDA is to obtain predictions in an unseen domain for which no training samples were observed. One application of ZSDA is sales prediction for new products: regarding domains as products, and given product attributes and sales data, we can use ZSDA to predict sales to a customer for a new product. Thanks to ZSDA, we can predict the response to an input in an unseen domain; however, one further potential use of ZSDA has been overlooked. By reversing the ZSDA prediction process, we can optimize domain attributes so that an evaluation metric of responses over customers is maximized, which we refer to as attribute optimization, as shown in Figure 1. That is, instead of predicting responses given new domain attributes as in ZSDA, our task is to find new domain attributes given a prediction.


Optimal Energy Shaping via Neural Approximators

arXiv.org Artificial Intelligence

We introduce optimal energy shaping as an enhancement of classical passivity-based control methods. A promising feature of passivity theory, alongside stability, has traditionally been claimed to be intuitive performance tuning along the execution of a given task. However, a systematic approach to adjust performance within a passive control framework has yet to be developed, as each method relies on few and problem-specific practical insights. Here, we cast the classic energy-shaping control design process in an optimal control framework; once a task-dependent performance metric is defined, an optimal solution is systematically obtained through an iterative procedure relying on neural networks and gradient-based optimization. The proposed method is validated on state-regulation tasks.


Tackling Instance-Dependent Label Noise via a Universal Probabilistic Model

arXiv.org Machine Learning

A drastic increase in data quantity often brings a severe decrease in data quality, such as incorrect label annotations, which poses a great challenge for robustly training Deep Neural Networks (DNNs). Existing methods for learning with label noise either employ ad-hoc heuristics or are restricted to specific noise assumptions. However, more general situations, such as instance-dependent label noise, have not been fully explored, as few studies focus on the label corruption process. By categorizing instances into confusing and unconfusing instances, this paper proposes a simple yet universal probabilistic model, which explicitly relates noisy labels to their instances. The resultant model can be realized by DNNs, where the training procedure is accomplished by employing an alternating optimization algorithm. Experiments on datasets with both synthetic and real-world label noise verify that the proposed method yields significant improvements in robustness over state-of-the-art counterparts.


Singularity-free Aerial Deformation by Two-dimensional Multilinked Aerial Robot with 1-DoF Vectorable Propeller

arXiv.org Artificial Intelligence

Two-dimensional multilinked structures can benefit aerial robots in both maneuvering and manipulation because of their deformation ability. However, certain types of singular forms must be avoided during deformation. Hence, an additional 1 Degree-of-Freedom (DoF) vectorable propeller is employed in this work to overcome singular forms by properly changing the thrust direction. In this paper, we first extend modeling and control methods from our previous works to an under-actuated model whose thrust forces are not unidirectional. We then propose a planning method for the vectoring angles that resolves the singularity by maximizing the controllability under arbitrary robot forms. Finally, we demonstrate the feasibility of the proposed methods by experiments in which a quad-type model performs trajectory tracking under challenging forms, such as a line-shape form, and deformation passing through these challenging forms.


Preferential Mixture-of-Experts: Interpretable Models that Rely on Human Expertise as much as Possible

arXiv.org Artificial Intelligence

We propose Preferential MoE, a novel human-ML mixture-of-experts model that augments human expertise in decision making with a data-based classifier only when necessary for predictive performance. Our model exhibits an interpretable gating function that provides information on when human rules should be followed or avoided. The gating function is maximized for using human-based rules, and classification errors are minimized. We propose solving a coupled multi-objective problem with convex subproblems. We develop approximate algorithms and study their performance and convergence. Finally, we demonstrate the utility of Preferential MoE on two clinical applications for the treatment of Human Immunodeficiency Virus (HIV) and management of Major Depressive Disorder (MDD).


CobBO: Coordinate Backoff Bayesian Optimization

arXiv.org Machine Learning

Bayesian optimization is a popular method for optimizing expensive black-box functions. The objective functions of hard real-world problems are often characterized by a fluctuating landscape with many local optima. Bayesian optimization risks over-exploiting such traps, leaving an insufficient query budget for exploring the global landscape. We introduce Coordinate Backoff Bayesian optimization (CobBO) to alleviate those challenges. CobBO captures a smooth approximation of the global landscape by interpolating the values of queried points projected to randomly selected promising coordinate subspaces. As a result, a smaller query budget is required for the Gaussian process regressions applied over the lower-dimensional subspaces. This approach can be viewed as a variant of coordinate ascent, tailored for Bayesian optimization, using a stopping rule for backing off from a certain subspace and switching to another coordinate subset. Additionally, adaptive trust regions are dynamically formed to expedite the convergence, and stagnant local optima are escaped by switching trust regions. Further smoothness and acceleration are achieved by filtering out clustered queried points. Through comprehensive evaluations over a wide spectrum of benchmarks, CobBO is shown to consistently find comparable or better solutions, with a reduced trial complexity compared to the state-of-the-art methods in both low and high dimensions.
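The "coordinate ascent with a backoff rule" idea in the abstract above can be illustrated with a toy sketch. This is not CobBO: the Gaussian process regression, trust regions, and point filtering are all omitted, and candidates are drawn by plain random sampling instead. The function name `subspace_search`, the `patience` stopping rule, and the test objective `sphere` are assumptions made only for illustration.

```python
import random

def subspace_search(f, x0, bounds, subset_size=2, patience=3, budget=200, seed=0):
    """Maximize f by perturbing only a small coordinate subset at a time.

    Hypothetical stand-in for the idea in the abstract: after `patience`
    consecutive non-improving queries, back off from the current subset
    and switch to a freshly sampled one.
    """
    rng = random.Random(seed)
    best_x, best_y = list(x0), f(x0)
    dims = list(range(len(x0)))
    subset = rng.sample(dims, subset_size)
    failures = 0
    for _ in range(budget):
        cand = list(best_x)
        for d in subset:  # all other coordinates stay fixed at the incumbent
            lo, hi = bounds[d]
            cand[d] = rng.uniform(lo, hi)
        y = f(cand)
        if y > best_y:
            best_x, best_y = cand, y
            failures = 0
        else:
            failures += 1
            if failures >= patience:  # backoff: abandon this subspace
                subset = rng.sample(dims, subset_size)
                failures = 0
    return best_x, best_y

def sphere(x):
    """Toy objective with its maximum (0) at the origin."""
    return -sum(v * v for v in x)

x0 = [2.0] * 6
bounds = [(-3.0, 3.0)] * 6
best_x, best_y = subspace_search(sphere, x0, bounds)
```

Because the incumbent is only replaced on strict improvement, `best_y` can never fall below `sphere(x0)`; how close it gets to the optimum depends on the budget and the seed.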


Convolutional Neural Nets: Foundations, Computations, and New Applications

arXiv.org Artificial Intelligence

We review mathematical foundations of convolutional neural nets (CNNs) with the goals of: i) highlighting connections with techniques from statistics, signal processing, linear algebra, differential equations, and optimization, ii) demystifying underlying computations, and iii) identifying new types of applications. CNNs are powerful machine learning models that highlight features from grid data to make predictions (regression and classification). The grid data object can be represented as vectors (in 1D), matrices (in 2D), or tensors (in 3D or higher dimensions) and can incorporate multiple channels (thus providing high flexibility in the input data representation). For example, an image can be represented as a 2D grid data object that contains red, green, and blue (RGB) channels (each channel is a 2D matrix). Similarly, a video can be represented as a 3D grid data object (two spatial dimensions plus time) with RGB channels (each channel is a 3D tensor). CNNs highlight features from the grid data by performing convolution operations with different types of operators. The operators highlight different types of features (e.g., patterns, gradients, geometrical features) and are learned by using optimization techniques. In other words, CNNs seek to identify optimal operators that best map the input data to the output data. A common misconception is that CNNs are only capable of processing image or video data, but their application scope is much wider; specifically, datasets encountered in diverse applications can be expressed as grid data. Here, we show how to apply CNNs to new types of applications such as optimal control, flow cytometry, multivariate process monitoring, and molecular simulations.
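As a minimal sketch of what "performing convolution operations with an operator" means on a single-channel 2D grid, the snippet below slides a small kernel over a matrix and sums the elementwise products. The helper name `conv2d_valid`, the example grid, and the hand-picked horizontal-difference kernel are illustrative assumptions; in a CNN the kernel entries would be learned rather than fixed, and, as is common in CNN practice, the kernel is applied without flipping (cross-correlation).

```python
def conv2d_valid(grid, kernel):
    """Slide `kernel` over `grid` without padding and return the response map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(grid) - kh + 1
    out_w = len(grid[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = 0
            for a in range(kh):
                for b in range(kw):
                    acc += grid[i + a][j + b] * kernel[a][b]
            row.append(acc)
        out.append(row)
    return out

# A 4x5 single-channel grid with a vertical edge between columns 1 and 2.
grid = [
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
]

# Hand-picked operator that responds to left-to-right intensity increases.
kernel = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

response = conv2d_valid(grid, kernel)
print(response)  # [[3, 3, 0], [3, 3, 0]]
```

The response is large where the window straddles the edge and zero over the flat region, which is the sense in which an operator "highlights" a gradient feature.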


Discrete Knowledge Graph Embedding based on Discrete Optimization

arXiv.org Artificial Intelligence

This paper proposes a discrete knowledge graph (KG) embedding (DKGE) method, which projects KG entities and relations into the Hamming space based on a computationally tractable discrete optimization algorithm, to solve the formidable storage and computation cost challenges in traditional continuous graph embedding methods. The convergence of DKGE can be guaranteed theoretically. Extensive experiments demonstrate that DKGE achieves higher accuracy than classical hashing functions that map the effective continuous embeddings into discrete codes. Moreover, DKGE reaches comparable accuracy with much lower computational complexity and storage than many continuous graph embedding methods.
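To make the storage and computation argument concrete, the sketch below shows the kind of baseline the abstract contrasts against: a continuous embedding is reduced to a binary code by taking signs, and codes are compared by Hamming distance using an XOR and a bit count. This is not the DKGE algorithm itself; the helper names `binarize` and `hamming` and the example vectors are assumptions for illustration only.

```python
def binarize(vec):
    """Pack the signs of a real vector into an int: bit k is 1 when vec[k] > 0."""
    code = 0
    for k, x in enumerate(vec):
        if x > 0:
            code |= 1 << k
    return code

def hamming(a, b):
    """Number of bit positions in which two codes differ."""
    return bin(a ^ b).count("1")

# Two made-up 4-dimensional continuous embeddings.
u = [0.5, -1.2, 0.3, 0.9]
v = [0.4, 0.7, -0.1, 1.1]

cu, cv = binarize(u), binarize(v)
print(cu, cv, hamming(cu, cv))  # 13 11 2
```

Each dimension costs one bit instead of one float, and the distance reduces to integer operations, which is where the savings in Hamming space come from.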


Joint aggregation of cardinal and ordinal evaluations with an application to a student paper competition

arXiv.org Artificial Intelligence

An important problem in decision theory concerns the aggregation of individual rankings/ratings into a collective evaluation. We illustrate a new aggregation method in the context of the 2007 MSOM student paper competition. The aggregation problem in this competition poses two challenges. Firstly, each paper was reviewed by only a very small fraction of the judges; thus the aggregate evaluation is highly sensitive to the subjective scales chosen by the judges. Secondly, the judges provided both cardinal and ordinal evaluations (ratings and rankings) of the papers they reviewed. The contribution here is a new robust methodology that jointly aggregates ordinal and cardinal evaluations into a collective evaluation. This methodology is particularly suitable in cases of incomplete evaluations -- i.e., when the individuals evaluate only a strict subset of the objects. This approach is potentially useful in managerial decision-making problems, such as a committee selecting projects from a large set or capital budgeting involving multiple priorities.


Model-Based Machine Learning for Communications

arXiv.org Machine Learning

Traditional communication systems design is dominated by methods that are based on statistical models. These statistical-model-based algorithms, which we refer to henceforth as model-based methods, rely on mathematical models that describe the transmission process, signal propagation, receiver noise, interference, and many other components of the system that affect the end-to-end signal transmission and reception. Such mathematical models use parameters that vary over time as the channel conditions, the environment, network traffic, or network topology change. Therefore, for optimal operation, many of the algorithms used in communication systems rely on the underlying mathematical models as well as the estimation of the model parameters. However, there are cases where this approach fails, in particular when the mathematical models for one or more of the system components are highly complex, hard to estimate, poorly understood, fail to capture the underlying physics of the system, or do not lend themselves to computationally efficient algorithms.