Goto

Collaborating Authors

 Optimization


Data Driven Resource Allocation for Distributed Learning

AAAI Conferences

In distributed machine learning, data is dispatched to multiple machines for processing. Motivated by the fact that similar data points often belong to the same or similar classes, and more generally, classification rules of high accuracy tend to be "locally simple but globally complex" (Vapnik and Bottou 1993), we propose data dependent dispatching that takes advantage of such structure. We present an in-depth analysis of this model, providing new algorithms with provable worst-case guarantees, analysis proving existing scalable heuristics perform well in natural non worst-case conditions, and techniques for extending a dispatching rule from a small sample to the entire distribution. We overcome novel technical challenges to satisfy important conditions for accurate distributed learning, including fault tolerance and balancedness. We empirically compare our approach with baselines based on random partitioning, balanced partition trees, and locality sensitive hashing, showing that we achieve significantly higher accuracy on both synthetic and real world image and advertising datasets. We also demonstrate that our technique strongly scales with the available computing power.


Fast Electrical Demand Optimization Under Real-Time Pricing

AAAI Conferences

Real-time pricing (RTP) is an effective scheme for reducing peak demand, but it can lead to load synchronization , where a large amount of consumption is shifted from a typical peak time to a non-peak time, without reducing the peak demand. To address this issue, this paper presents a demand management method under RTP for the smart grid, that solves a large-scale of energy scheduling problem for households in an area. This is a distributed optimization method that finds the optimal consumption levels to minimize the total electricity cost while meeting the demands and preferences of households. Moreover, we propose to compute the probability distributions of start times for tasks, with which smart meters can quickly schedule tasks in practice, while matching the aggregate demand to the optimal consumption levels. The complexity of the optimization method is independent of the number households, which allows it to be applied to problems with realistic scales.


Stochastic Search In Changing Situations

AAAI Conferences

Stochastic search algorithms are black-box optimizer of an objective function. They have recently gained a lot of attention in operations research, machine learning and policy search of robot motor skills due to their ease of use and their generality. However, when the task or objective function slightly changes, many stochastic search algorithms require complete re-learning in order to adapt thesolution to the new objective function or the new context. As such, we consider the contextual stochastic search paradigm. Here, we want to find good parameter vectors for multiple related tasks, where each task is described by a continuous context vector. Hence, the objective function might change slightly for each parameter vector evaluation. In this paper, we investigate a contextual stochastic search algorithm known as Contextual Relative Entropy Policy Search (CREPS), an information-theoretic algorithm that can learn from multiple tasks simultaneously. We show the application of CREPS for simulated robotic tasks.


Complementing the Execution of AI Systems with Human Computation

AAAI Conferences

For a multitude of tasks that come naturally to humans, performance of AI systems is inferior to human level performance. We show how human intellect made available via crowdsourcing can be used to complement an existing system during execution. We introduce a hybrid workflow that queries people to verify and correct the output of the system and present a simulation-based workflow optimization method to balance the cost of human input with the expected improvement in performance. Through empirical evaluations on an image captioning system, we show that the hybrid system, which combines the AI system with human input, significantly outperforms the automated system by properly trading off the cost of human input with expected benefit. Finally, we show that human input collected at execution time can be used to teach the system about its errors and limitations.


Preemptive Detection of Unsafe Motion Liable for Hazard

AAAI Conferences

Establishing a safety standard for autonomous vehicles operating in open and dynamic environment is a challenge. As collisions are inevitable in over-constrained situations, we focus on deciding the liability for a hazard. Our insight is that hazards caused by malfunctions of autonomous vehicles result from loss of functional integrity. Design defects may leave it unnoticed, or the real-world may make integritypreserving motion infeasible. Guarantee of functional integrity in an observable way at run-time is indispensable for revealing defects by using formal root-cause analysis, and for supporting safety claims by dismissing unreasonable doubts about design defects. From a practitical standpoint, we attempt to formalize a verification problem that consists of a novel criterion for determining liability for hazard, a safety claim comprised of confirmed observable states, and assumptions underlying the safety claim. We propose a run-time scheme of monitoring events that may lead to violations of the assumptions and a precursor to root-causes leading to loss of functional integrity and consequent hazards. We formulate a means of preemptively detecting unsafe motions liable to be hazardous as satisfiability problem within the framework of an adversarial motion planning subject to assumptions on maneuverability of movers. A numerical study shows that the run-time scheme using non-linear programming (NLP) encoding is viable in a real-world setting.


Genetic algorithms for feature selection in Data Analytics

#artificialintelligence

Many common applications of predictive analytics, from customer segmentation to medical diagnosis, arise from complex relationships between features (also called variables or characteristics). Feature selection is the process of finding the most relevant variables for a predictive model. These techniques can be used to identify and remove unneeded, irrelevant and redundant features that do not contribute or decrease the accuracy of the predictive model. Mathematically, feature selection is formulated as a combinatorial optimization problem. Here the function to optimize is the generalization performance of the predictive model, represented by the error on a selection data set.


Matrix Completion has No Spurious Local Minimum

arXiv.org Machine Learning

Matrix completion is a basic machine learning problem that has wide applications, especially in collaborative filtering and recommender systems. Simple non-convex optimization algorithms are popular and effective in practice. Despite recent progress in proving various non-convex algorithms converge from a good initial point, it remains unclear why random or arbitrary initialization suffices in practice. We prove that the commonly used non-convex objective function for \textit{positive semidefinite} matrix completion has no spurious local minima --- all local minima must also be global. Therefore, many popular optimization algorithms such as (stochastic) gradient descent can provably solve positive semidefinite matrix completion with \textit{arbitrary} initialization in polynomial time. The result can be generalized to the setting when the observed entries contain noise. We believe that our main proof strategy can be useful for understanding geometric properties of other statistical problems involving partial or noisy observations.


The Impact of Estimation: A New Method for Clustering and Trajectory Estimation in Patient Flow Modeling

arXiv.org Machine Learning

The ability to accurately forecast and control inpatient census, and thereby workloads, is a critical and longstanding problem in hospital management. Majority of current literature focuses on optimal scheduling of inpatients, but largely ignores the process of accurate estimation of the trajectory of patients throughout the treatment and recovery process. The result is that current scheduling models are optimizing based on inaccurate input data. We developed a Clustering and Scheduling Integrated (CSI) approach to capture patient flows through a network of hospital services. CSI functions by clustering patients into groups based on similarity of trajectory using a novel Semi-Markov model (SMM)-based clustering scheme proposed in this paper, as opposed to clustering by admit type or condition as in previous literature. The methodology is validated by simulation and then applied to real patient data from a partner hospital where we see it outperforms current methods. Further, we demonstrate that extant optimization methods achieve significantly better results on key hospital performance measures under CSI, compared with traditional estimation approaches, increasing elective admissions by 97% and utilization by 22% compared to 30% and 8% using traditional estimation techniques. From a theoretical standpoint, the SMM-clustering is a novel approach applicable to any temporal-spatial stochastic data that is prevalent in many industries and application areas.


Using ML-driven marketing optimization to solve the attribution conundrum

@machinelearnbot

Accurate multichannel campaign attribution has stumped the online marketing industry for years. But what if the solution is to stop worrying about attribution, and move to an optimization-driven approach? You know those photo mosaic images, which suddenly became terribly popular a few years back? They cleverly use lots of individual tiny images to make up one large image. If you look closely you can make out the individual images, but you have to stand back to take in the full picture.


Robust Semi-supervised Least Squares Classification by Implicit Constraints

arXiv.org Machine Learning

We introduce the implicitly constrained least squares (ICLS) classifier, a novel semi-supervised version of the least squares classifier. This classifier minimizes the squared loss on the labeled data among the set of parameters implied by all possible labelings of the unlabeled data. Unlike other discriminative semi-supervised methods, this approach does not introduce explicit additional assumptions into the objective function, but leverages implicit assumptions already present in the choice of the supervised least squares classifier. This method can be formulated as a quadratic programming problem and its solution can be found using a simple gradient descent procedure. We prove that, in a limited 1-dimensional setting, this approach never leads to performance worse than the supervised classifier. Experimental results show that also in the general multidimensional case performance improvements can be expected, both in terms of the squared loss that is intrinsic to the classifier, as well as in terms of the expected classification error.