Goto

Collaborating Authors

 Genre


A note on the lack of symmetry in the graphical lasso

arXiv.org Machine Learning

The graphical lasso (glasso) is a widely-used fast algorithm for estimating sparse inverse covariance matrices. The glasso solves an L1 penalized maximum likelihood problem and is available as an R library on CRAN. The output from the glasso, a regularized covariance matrix estimate a sparse inverse covariance matrix estimate, not only identify a graphical model but can also serve as intermediate inputs into multivariate procedures such as PCA, LDA, MANOVA, and others. The glasso indeed produces a covariance matrix estimate which solves the L1 penalized optimization problem in a dual sense; however, the method for producing the inverse covariance matrix estimator after this optimization is inexact and may produce asymmetric estimates. This problem is exacerbated when the amount of L1 regularization that is applied is small, which in turn is more likely to occur if the true underlying inverse covariance matrix is not sparse. The lack of symmetry can potentially have consequences. First, it implies that the covariance and inverse covariance estimates are not numerical inverses of one another, and second, asymmetry can possibly lead to negative or complex eigenvalues,rendering many multivariate procedures which may depend on the inverse covariance estimator unusable. We demonstrate this problem, explain its causes, and propose possible remedies.


Nonlinear spectral unmixing of hyperspectral images using Gaussian processes

arXiv.org Machine Learning

This paper presents an unsupervised algorithm for nonlinear unmixing of hyperspectral images. The proposed model assumes that the pixel reflectances result from a nonlinear function of the abundance vectors associated with the pure spectral components. We assume that the spectral signatures of the pure components and the nonlinear function are unknown. The first step of the proposed method consists of the Bayesian estimation of the abundance vectors for all the image pixels and the nonlinear function relating the abundance vectors to the observations. The endmembers are subsequently estimated using Gaussian process regression. The performance of the unmixing strategy is evaluated with simulations conducted on synthetic and real data.


Meta-Learning of Exploration/Exploitation Strategies: The Multi-Armed Bandit Case

arXiv.org Machine Learning

The exploration/exploitation (E/E) dilemma arises naturally in many subfields of Science. Multi-armed bandit problems formalize this dilemma in its canonical form. Most current research in this field focuses on generic solutions that can be applied to a wide range of problems. However, in practice, it is often the case that a form of prior information is available about the specific class of target problems. Prior knowledge is rarely used in current solutions due to the lack of a systematic approach to incorporate it into the E/E strategy. To address a specific class of E/E problems, we propose to proceed in three steps: (i) model prior knowledge in the form of a probability distribution over the target class of E/E problems; (ii) choose a large hypothesis space of candidate E/E strategies; and (iii), solve an optimization problem to find a candidate E/E strategy of maximal average performance over a sample of problems drawn from the prior distribution. We illustrate this meta-learning approach with two different hypothesis spaces: one where E/E strategies are numerically parameterized and another where E/E strategies are represented as small symbolic formulas. We propose appropriate optimization algorithms for both cases. Our experiments, with two-armed Bernoulli bandit problems and various playing budgets, show that the meta-learnt E/E strategies outperform generic strategies of the literature (UCB1, UCB1-Tuned, UCB-v, KL-UCB and epsilon greedy); they also evaluate the robustness of the learnt E/E strategies, by tests carried out on arms whose rewards follow a truncated Gaussian distribution.


Last-Mile Restoration for Multiple Interdependent Infrastructures

AAAI Conferences

This paper considers the restoration of multiple interdependent infrastructures after a man-made or natural disaster. Modern infrastructures feature complex cyclic interdependencies and require a holistic restoration process. This paper presents the first scalable approach for the last-mile restoration of the joint electrical power and gas infrastructures. It builds on an earlier three-stage decomposition for restoring the power network that decouples the restoration ordering and the routing aspects. The key contributions of the paper are (1) mixed-integer programming models for finding a minimal restoration set and a restoration ordering and (2) a randomized adaptive decomposition to obtain high-quality solutions within the required time constraints. The approach is validated on a large selection of benchmarks based on the United States infrastructures and state-of-the-art weather and fragility simulation tools. The results show significant improvements over current field practices.


Diamonds From the Rough: Improving Drawing, Painting, and Singing via Crowdsourcing

AAAI Conferences

It is well established that in certain domains, noisy inputs can be reliablycombined to obtain a better answer than any individual.It is now possible to consider the crowdsourcing of physical actions,commonly used for creative expressions such as drawing, shading, and singing.We provide algorithms for converting low-quality inputobtained from the physical actions of a crowd into high-quality output.The inputs take the form of line drawings, shaded images, and songs.We investigate single-individual crowds (multiple inputs from a single human)and multiple-individual crowds.


Personalized Online Education — A Crowdsourcing Challenge

AAAI Conferences

Interest in online education is surging, as dramatized bythe success of Khan Academy and recent Stanford online courses, but the technology for online education isin its infancy. Crowdsourcing mechanisms will likelybe essential in order to reach the full potential of thismedium. This paper sketches some of the challengesand directions we hope HCOMP researchers will address.


Statistical Relational Learning to Predict Primary Myocardial Infarction from Electronic Health Records

AAAI Conferences

Electronic health records (EHRs) are an emerging relational domain with large potential to improve clinical outcomes. We apply two statistical relational learning (SRL) algorithms to the task of predicting primary myocardial infarction. We show that one SRL algorithm, relational functional gradient boosting, outperforms propositional learners particularly in the medically-relevant high recall region. We observe that both SRL algorithms predict outcomes better than their propositional analogs and suggest how our methods can augment current epidemiological practices.


Solving Peg Solitaire with Bidirectional BFIDA*

AAAI Conferences

We present a novel approach to bidirectional breadth-first IDA* (BFIDA*) and demonstrate its effectiveness in the domain of peg solitaire, a simple puzzle. Our approach improves upon unidirectional BFIDA* by usually avoiding the last iteration of search entirely, greatly speeding up search. In addition, we provide a number of improvements specific to peg solitaire. We have improved duplicate-detection in the context of BFIDA*. We have strengthened the heuristic used in the previous state-of-the-art solver. Finally, we use bidirectional search frontiers to provide a stronger technique for pruning unsolvable states. The combination of these approaches allows us to improve over the previous state-of-the-art, often by a two-orders-of-magnitude reduction in search time.


TRUSTS: Scheduling Randomized Patrols for Fare Inspection in Transit Systems

AAAI Conferences

In proof-of-payment transit systems, passengers are legally required to purchase tickets before entering but are not physically forced to do so. Instead, patrol units move about the transit system, inspecting the tickets of passengers, who face fines if caught fare evading. The deterrence of such fines depends on the unpredictability and effectiveness of the patrols. In this paper, we present TRUSTS, an application for scheduling randomized patrols for fare inspection in transit systems. TRUSTS models the problem of computing patrol strategies as a leader-follower Stackelberg game where the objective is to deter fare evasion and hence maximize revenue. This problem differs from previously studied Stackelberg settings in that the leader strategies must satisfy massive temporal and spatial constraints; moreover, unlike in these counterterrorism-motivated Stackelberg applications, a large fraction of the ridership might realistically consider fare evasion, and so the number of followers is potentially huge. A third key novelty in our work is deliberate simplification of leader strategies to make patrols easier to be executed. We present an efficient algorithm for computing such patrol strategies and present experimental results using real-world ridership data from the Los Angeles Metro Rail system. The Los Angeles County Sheriff’s department has begun trials of TRUSTS.


Cost-Sensitive Risk Stratification in the Diagnosis of Heart Disease

AAAI Conferences

We investigate machine learning methods for diagnostic screening of heart disease. Coronary heart disease is the leading cause of death in the US, causing more deaths than all types of cancers combined. Early diagnosis of heart disease in women is harder than it is in men and typically requires the administration of several clinical tests on the patient. Most risk stratification methods aggregate the results of such tests, including the risky, invasive procedures that cannot be administered on all patients. In this paper, our goal is to identify patients who are under high-risk of having heart disease and related adverse events, using a minimal number of diagnostic tests, especially less invasive ones. The low frequency of patients with severe heart disease in the dataset is challenging for most conventional machine learning methods. To overcome this problem, we develop and apply a cost-sensitive k nearest neighbor algorithm. Our contributions are two fold: First, we compare the predictive value of several diagnostic procedures for heart disease, including electrocardiography, angiography, radionuclide perfusion and conclude that in womens heart disease, certain combinations of non-invasive techniques are more predictive than some of the widely used invasive procedures. Then, we evaluate held out data and achieve an AUROC over 0.70, signifying valuable clinical utility, using only the least costly and least invasive tests.