McAuley, Julian
Predicting Surgery Duration with Neural Heteroscedastic Regression
Ng, Nathan, Gabriel, Rodney A, McAuley, Julian, Elkan, Charles, Lipton, Zachary C
Scheduling surgeries is a challenging task due to the fundamental uncertainty of the clinical environment, as well as the risks and costs associated with under- and over-booking. We investigate neural regression algorithms to estimate the parameters of surgery case durations, focusing on the issue of heteroscedasticity. We seek to simultaneously estimate the duration of each surgery, as well as a surgery-specific notion of our uncertainty about its duration. Estimating this uncertainty can lead to more nuanced and effective scheduling strategies, as we are able to schedule surgeries more efficiently while allowing an informed and case-specific margin of error. Using surgery records %from the UC San Diego Health System, from a large United States health system we demonstrate potential improvements on the order of 20% (in terms of minutes overbooked) compared to current scheduling techniques. Moreover, we demonstrate that surgery durations are indeed heteroscedastic. We show that models that estimate case-specific uncertainty better fit the data (log likelihood). Additionally, we show that the heteroscedastic predictions can more optimally trade off between over and under-booking minutes, especially when idle minutes and scheduling collisions confer disparate costs.
Dance Dance Convolution
Donahue, Chris, Lipton, Zachary C., McAuley, Julian
Dance Dance Revolution (DDR) is a popular rhythm-based video game. Players perform steps on a dance platform in synchronization with music as directed by on-screen step charts. While many step charts are available in standardized packs, players may grow tired of existing charts, or wish to dance to a song for which no chart exists. We introduce the task of learning to choreograph. Given a raw audio track, the goal is to produce a new step chart. This task decomposes naturally into two subtasks: deciding when to place steps and deciding which steps to select. For the step placement task, we combine recurrent and convolutional neural networks to ingest spectrograms of low-level audio features to predict steps, conditioned on chart difficulty. For step selection, we present a conditional LSTM generative model that substantially outperforms n-gram and fixed-window approaches.
Solving Constrained Combinatorial Optimisation Problems via MAP Inference without High-Order Penalties
Zhang, Zhen (Northwestern Polytechnical University) | Shi, Qinfeng (The University of Adelaide) | McAuley, Julian (University of California, San Diego) | Wei, Wei (Northwestern Polytechnical University) | Zhang, Yanning (Northwestern Polytechnical University) | Yao, Rui (China University of Mining and Technology) | Hengel, Anton van den (The University of Adelaide)
Solving constrained combinatorial optimisation problems via MAP inference is often achieved by introducing extra potential functions for each constraint. This can result in very high order potentials, e.g. a 2nd-order objective with pairwise potentials and a quadratic constraint over all N variables would correspond to an unconstrained objective with an order-N potential. This limits the practicality of such an approach, since inference with high order potentials is tractable only for a few special classes of functions. We propose an approach which is able to solve constrained combinatorial problems using belief propagation without increasing the order. For example, in our scheme the 2nd-order problem above remains order 2 instead of order N. Experiments on applications ranging from foreground detection, image reconstruction, quadratic knapsack, and the M-best solutions problem demonstrate the effectiveness and efficiency of our method. Moreover, we show several situations in which our approach outperforms commercial solvers like CPLEX and others designed for specific constrained MAP inference problems.
VBPR: Visual Bayesian Personalized Ranking from Implicit Feedback
He, Ruining (University of California, San Diego) | McAuley, Julian ( University of California, San Diego )
Modern recommender systems model people and items by discovering or `teasing apart' the underlying dimensions that encode the properties of items and users' preferences toward them. Critically, such dimensions are uncovered based on user feedback, often in implicit form (such as purchase histories, browsing logs, etc.); in addition, some recommender systems make use of side information, such as product attributes, temporal information, or review text.However one important feature that is typically ignored by existing personalized recommendation and ranking methods is the visual appearance of the items being considered. In this paper we propose a scalable factorization model to incorporate visual signals into predictors of people's opinions, which we apply to a selection of large, real-world datasets. We make use of visual features extracted from product images using (pre-trained) deep networks, on top of which we learn an additional layer that uncovers the visual dimensions that best explain the variation in people's feedback. This not only leads to significantly more accurate personalized ranking methods, but also helps to alleviate cold start issues, and qualitatively to analyze the visual dimensions that influence people's opinions.
VBPR: Visual Bayesian Personalized Ranking from Implicit Feedback
He, Ruining, McAuley, Julian
Modern recommender systems model people and items by discovering or `teasing apart' the underlying dimensions that encode the properties of items and users' preferences toward them. Critically, such dimensions are uncovered based on user feedback, often in implicit form (such as purchase histories, browsing logs, etc.); in addition, some recommender systems make use of side information, such as product attributes, temporal information, or review text. However one important feature that is typically ignored by existing personalized recommendation and ranking methods is the visual appearance of the items being considered. In this paper we propose a scalable factorization model to incorporate visual signals into predictors of people's opinions, which we apply to a selection of large, real-world datasets. We make use of visual features extracted from product images using (pre-trained) deep networks, on top of which we learn an additional layer that uncovers the visual dimensions that best explain the variation in people's feedback. This not only leads to significantly more accurate personalized ranking methods, but also helps to alleviate cold start issues, and qualitatively to analyze the visual dimensions that influence people's opinions.
Exponential Family Graph Matching and Ranking
Petterson, James, Caetano, Tiberio, McAuley, Julian, Yu, Jin
We present a method for learning max-weight matching predictors in bipartite graphs. The method consists of performing maximum a posteriori estimation in exponential families with sufficient statistics that encode permutations and data features. Although inference is in general hard, we show that for one very relevant application - web page ranking - exact inference is efficient. For general model instances, an appropriate sampler is readily available. Contrary to existing max-margin matching models, our approach is statistically consistent and, in addition, experiments with increasing sample sizes indicate superior improvement over such models. We apply the method to graph matching in computer vision as well as to a standard benchmark dataset for learning web page ranking, in which we obtain state-of-the-art results, in particular improving on max-margin variants. The drawback of this method with respect to max-margin alternatives is its runtime for large graphs, which is comparatively high.