 Statistical Learning


Influence-Driven Model for Time Series Prediction from Partial Observations

AAAI Conferences

Applications in sustainability domains such as energy, transportation, and natural resource and environment monitoring increasingly use sensors to collect data and send it back to centrally located processing nodes. While the sensors can usually collect data at a very high rate, in many cases the data cannot be sent back to the central nodes at the frequency required for fast, real-time modeling and decision-making. This may be due to physical limitations of the transmission networks, or to consumers limiting frequent transmission of data from sensors located at their premises because of security and privacy concerns. We propose a novel solution to the problem of making short-term predictions in the absence of real-time data from sensors. A key implication of our work is that by using real-time data from only a small subset of influential sensors, we are able to make predictions for all sensors. We evaluated our approach on a large real-world electricity consumption dataset collected from smart meters in Los Angeles. The results show that for prediction horizons between 2 and 8 hours, despite the lack of real-time data, our influence model outperforms the baseline model that uses real-time data. Also, when using partial real-time data from only ≈ 7% influential smart meters, we witness prediction error increase by only ≈ 0.5% over the baseline, thus demonstrating the usefulness of our method for practical scenarios.


Are Features Equally Representative? A Feature-Centric Recommendation

AAAI Conferences

Typically a user prefers an item (e.g., a movie) because she likes certain features of the item (e.g., director, genre, producer). This observation motivates us to consider a feature-centric approach to item recommendation: instead of directly predicting the rating on items, we predict the rating on the features of items, and use such ratings to derive the rating on an item. This approach offers several advantages over the traditional item-centric approach: it incorporates more information about why a user chooses an item, it generalizes better due to the denser feature rating data, and it explains the prediction of item ratings through the predicted feature ratings. Another contribution is turning a principled item-centric solution into a feature-centric solution, instead of inventing a new feature-centric algorithm, which maximally leverages previous research. We demonstrate this approach by turning the traditional item-centric latent factor model into a feature-centric solution and show its superiority over item-centric approaches.
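
The abstract does not say how feature ratings are combined into an item rating. As a minimal sketch only, assuming a plain average over the features an item has, the idea of deriving an item rating from predicted feature ratings could look like the following. The function name, the feature keys, and the numbers are hypothetical and are not taken from the paper.

```python
from statistics import mean

def item_rating_from_features(feature_ratings, item_features, default=3.0):
    """Average a user's predicted feature ratings over one item's features.

    Assumption: a plain average. The paper may combine them differently.
    Features without a predicted rating are skipped; if none are known,
    the (hypothetical) default rating is returned.
    """
    known = [feature_ratings[f] for f in item_features if f in feature_ratings]
    return mean(known) if known else default

# Hypothetical predicted feature ratings for one user (illustrative numbers).
predicted = {"director:A": 4.5, "genre:sci-fi": 4.0, "genre:romance": 2.0}
movie = ["director:A", "genre:sci-fi"]

score = item_rating_from_features(predicted, movie)
unknown_score = item_rating_from_features(predicted, ["producer:B"])
```

Because the item score is built from per-feature scores, the contributing features can be shown to the user as an explanation of the prediction.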


Using Matched Samples to Estimate the Effects of Exercise on Mental Health via Twitter

AAAI Conferences

Recent work has demonstrated the value of social media monitoring for health surveillance (e.g., tracking influenza or depression rates). It is an open question whether such data can be used to make causal inferences (e.g., determining which activities lead to increased depression rates). Even in traditional, restricted domains, estimating causal effects from observational data is highly susceptible to confounding bias. In this work, we estimate the effect of exercise on mental health from Twitter, relying on statistical matching methods to reduce confounding bias. We train a text classifier to estimate the volume of a user's tweets expressing anxiety, depression, or anger, then compare two groups: those who exercise regularly (identified by their use of physical activity trackers like Nike+), and a matched control group. We find that those who exercise regularly have significantly fewer tweets expressing depression or anxiety; there is no significant difference in rates of tweets expressing anger. We additionally perform a sensitivity analysis to investigate how the many experimental design choices in such a study impact the final conclusions, including the quality of the classifier and the construction of the control group.
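
The abstract does not specify which statistical matching method is used. As a rough illustration, assuming greedy one-to-one nearest-neighbor matching on a single numeric covariate, the comparison between an exercise group and a matched control group might be sketched as follows. The field names `covariate` and `outcome` and all values are hypothetical.

```python
def match_controls(treated, controls):
    """Greedy 1-nearest-neighbor matching without replacement.

    Assumes there are at least as many controls as treated users and that
    each user is a dict with a numeric "covariate" (hypothetical field).
    """
    available = list(controls)
    pairs = []
    for t in treated:
        best = min(available, key=lambda c: abs(c["covariate"] - t["covariate"]))
        available.remove(best)  # each control is used at most once
        pairs.append((t, best))
    return pairs

def mean_outcome_difference(pairs):
    """Average of (treated outcome - matched control outcome)."""
    return sum(t["outcome"] - c["outcome"] for t, c in pairs) / len(pairs)

# Illustrative data: outcome = fraction of tweets flagged by a classifier.
treated = [{"covariate": 10, "outcome": 0.02}, {"covariate": 50, "outcome": 0.04}]
controls = [
    {"covariate": 12, "outcome": 0.05},
    {"covariate": 48, "outcome": 0.06},
    {"covariate": 200, "outcome": 0.01},
]

pairs = match_controls(treated, controls)
diff = mean_outcome_difference(pairs)
```

A negative difference here would mean the treated group has a lower rate than its matched controls. A real study would match on several covariates and test the difference for significance.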


Predicting the Demographics of Twitter Users from Website Traffic Data

AAAI Conferences

Understanding the demographics of users of online social networks has important applications for health, marketing, and public messaging. In this paper, we predict the demographics of Twitter users based on whom they follow. Whereas most prior approaches rely on a supervised learning approach, in which individual users are labeled with demographics, we instead create a distantly labeled dataset by collecting audience measurement data for 1,500 websites (e.g., 50% of visitors to gizmodo.com are estimated to have a bachelor's degree). We then fit a regression model to predict these demographics using information about the followers of each website on Twitter. The resulting average held-out correlation is .77 across six different variables (gender, age, ethnicity, education, income, and child status). We additionally validate the model on a smaller set of Twitter users labeled individually for ethnicity and gender, finding performance that is surprisingly competitive with a fully supervised approach.


Learning Sparse Representations from Datasets with Uncertain Group Structures: Model, Algorithm and Applications

AAAI Conferences

Group sparsity has drawn much attention in machine learning. However, existing work can handle only datasets with certain group structures, where each sample has a certain membership with one or more groups. This paper investigates the learning of sparse representations from datasets with uncertain group structures, where each sample has an uncertain membership with all groups in terms of a probability distribution. We call this problem uncertain group sparse representation (UGSR in short), which is a generalization of the standard group sparse representation (GSR). We formulate the UGSR model and propose an efficient algorithm to solve this problem. We apply UGSR to text emotion classification and aging face recognition. Experiments show that UGSR outperforms standard sparse representation (SR) and standard GSR as well as fuzzy kNN classification.


Algorithm Selection via Ranking

AAAI Conferences

The abundance of algorithms developed to solve different problems has given rise to an important research question: How do we choose the best algorithm for a given problem? Known as algorithm selection, this issue arises in many domains, as no single algorithm can perform best on all problem instances. Traditional algorithm selection and portfolio construction methods typically treat the problem as a classification or regression task. In this paper, we present a new approach that provides a more natural treatment of algorithm selection and portfolio construction as a ranking task. Accordingly, we develop a Ranking-Based Algorithm Selection (RAS) method, which employs a simple polynomial model to capture the ranking of different solvers for different problem instances. We devise an efficient iterative algorithm that optimizes the polynomial coefficients by minimizing a ranking loss function, which is derived from a sound probabilistic formulation of the ranking problem. Experiments on the SAT 2012 competition dataset show that our approach yields performance competitive with that of more sophisticated algorithm selection methods.
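
To make the ranking view concrete, here is a small sketch that is not the RAS method itself. It assumes a linear (degree-1 polynomial) scoring function over hypothetical joint instance/solver features and a pairwise logistic ranking loss minimized by plain gradient descent. The feature vectors, learning rate, and step count are illustrative choices.

```python
import math

def score(w, x):
    """Linear score of one (instance, solver) feature vector."""
    return sum(wi * xi for wi, xi in zip(w, x))

def pairwise_logistic_loss(w, pairs):
    """Mean of log(1 + exp(-(s_better - s_worse))) over preference pairs."""
    total = 0.0
    for better, worse in pairs:
        margin = score(w, better) - score(w, worse)
        total += math.log1p(math.exp(-margin))
    return total / len(pairs)

def train(pairs, dim, lr=0.5, steps=200):
    """Gradient descent on the pairwise logistic loss, starting from zero."""
    w = [0.0] * dim
    for _ in range(steps):
        grad = [0.0] * dim
        for better, worse in pairs:
            margin = score(w, better) - score(w, worse)
            g = -1.0 / (1.0 + math.exp(margin))  # d loss / d margin
            for i in range(dim):
                grad[i] += g * (better[i] - worse[i]) / len(pairs)
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w

# Each pair: (features of the faster solver, features of the slower solver).
pairs = [([1.0, 0.0], [0.0, 1.0]), ([1.0, 0.5], [0.2, 1.0])]
w = train(pairs, dim=2)
initial_loss = pairwise_logistic_loss([0.0, 0.0], pairs)
final_loss = pairwise_logistic_loss(w, pairs)
```

At selection time, one would score every candidate solver for a new instance and pick, or build a portfolio from, the top-ranked ones.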


On the Equivalence of Linear Discriminant Analysis and Least Squares

AAAI Conferences

Linear discriminant analysis (LDA) is a popular dimensionality reduction and classification method that simultaneously maximizes between-class scatter and minimizes within-class scatter. In this paper, we verify the equivalence of LDA and least squares (LS) with a set of dependent variable matrices. The equivalence is in the sense that the LDA solution matrix and the LS solution matrix have the same range. The resulting LS provides an intuitive interpretation in which its solution performs data clustering according to class labels. Further, the fact that LDA and LS have the same range allows us to design a two-stage algorithm that computes the LDA solution given by generalized eigenvalue decomposition (GEVD), much faster than computing the original GEVD. Experimental results demonstrate the equivalence of the LDA solution and the proposed LS solution.
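
The classical two-class special case gives a quick sanity check of the "same range" idea: with centered data and ±1 class targets, the least squares weight vector is parallel to the Fisher LDA direction. The sketch below uses made-up 2-D points and hand-written 2×2 linear algebra. It illustrates only this well-known special case, not the paper's construction of dependent variable matrices or its two-stage algorithm.

```python
import math

def mean_vec(rows):
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]

def scatter(rows, center):
    """Sum of outer products of (row - center) for 2-D rows."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - center[0], r[1] - center[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def solve2(a, b):
    """Solve the 2x2 linear system a @ x = b by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [(a[1][1] * b[0] - a[0][1] * b[1]) / det,
            (a[0][0] * b[1] - a[1][0] * b[0]) / det]

def cosine(u, v):
    dot = u[0] * v[0] + u[1] * v[1]
    return dot / (math.hypot(u[0], u[1]) * math.hypot(v[0], v[1]))

# Illustrative data for two classes.
class0 = [(1.0, 2.0), (2.0, 1.0), (2.0, 3.0), (3.0, 2.5)]
class1 = [(5.0, 4.0), (6.0, 6.0), (7.0, 5.0), (6.5, 4.5)]

# Fisher LDA direction: Sw^{-1} (m1 - m0).
m0, m1 = mean_vec(class0), mean_vec(class1)
s0, s1 = scatter(class0, m0), scatter(class1, m1)
sw = [[s0[i][j] + s1[i][j] for j in range(2)] for i in range(2)]
w_lda = solve2(sw, [m1[0] - m0[0], m1[1] - m0[1]])

# Least squares on centered data with centered -1/+1 targets.
rows = class0 + class1
targets = [-1.0] * len(class0) + [1.0] * len(class1)
m = mean_vec(rows)
t_mean = sum(targets) / len(targets)
xtx = scatter(rows, m)
xty = [sum((t - t_mean) * (r[j] - m[j]) for r, t in zip(rows, targets))
       for j in range(2)]
w_ls = solve2(xtx, xty)

similarity = cosine(w_lda, w_ls)
```

The two vectors differ only by a positive scale factor, so their cosine similarity is 1 up to floating-point error.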


Stochastic Blockmodeling for Online Advertising

AAAI Conferences

Online advertising is a large and important industry. Knowledge of website attributes can contribute greatly to business strategies for ad targeting, content display, inventory purchase, or revenue prediction. In this paper, we introduce a stochastic blockmodel for the website relations induced by online user visits. We propose two clustering algorithms to discover the intrinsic structures of websites, and compare their performance with a goodness-of-fit method and a deterministic graph partitioning method. We demonstrate the effectiveness of our algorithms on both simulated data and an AOL website dataset.


Integrating Image Clustering and Codebook Learning

AAAI Conferences

Image clustering and visual codebook learning are two fundamental and tightly related problems in computer vision. On one hand, a good codebook can generate effective feature representations, which largely affect clustering performance. On the other hand, class labels obtained from image clustering can serve as supervised information to guide codebook learning. Traditionally, these two processes are conducted separately and their correlation is generally ignored. In this paper, we propose a Double Layer Gaussian Mixture Model (DLGMM) to simultaneously perform image clustering and codebook learning. In DLGMM, the two tasks are seamlessly coupled and can mutually promote each other. Cluster labels and the codebook are jointly estimated to achieve the overall best performance. To incorporate the spatial coherence between neighboring visual patches, we propose a Spatially Coherent DLGMM, which uses a Markov Random Field to encourage neighboring patches to share the same visual word label. We use variational inference to approximate the posterior of the latent variables and learn the model parameters. Experiments on two datasets demonstrate the effectiveness of the two models.


Tensor-Based Learning for Predicting Stock Movements

AAAI Conferences

Stock movements are essentially driven by new information. Market data, financial news, and social sentiment are believed to have impacts on stock markets. To study the correlation between information and stock movements, previous works typically concatenate the features of different information sources into one super feature vector. However, such concatenated vector approaches treat each information source separately and ignore their interactions. In this article, we model the multi-faceted investors’ information and their intrinsic links with tensors. To identify the nonlinear patterns between stock movements and new information, we propose a supervised tensor regression learning approach to investigate the joint impact of different information sources on stock markets. Experiments on CSI 100 stocks in the year 2011 show that our approach outperforms the state-of-the-art trading strategies.