Exploiting Task-Feature Co-Clusters in Multi-Task Learning

AAAI Conferences

In multi-task learning, multiple related tasks are considered simultaneously, with the goal of improving generalization performance by exploiting the information shared across tasks. This paper presents a multi-task learning approach that models the task-feature relationships. Specifically, instead of assuming that similar tasks have similar weights on all the features, we start from the motivation that tasks should be related in terms of subsets of features, which implies a co-cluster structure. We design a novel regularization term to capture this task-feature co-cluster structure. A proximal algorithm is adopted to solve the optimization problem. Experimental results demonstrate the effectiveness of the proposed algorithm and justify the idea of exploiting the task-feature relationships.
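As a rough illustration of what a proximal algorithm does, the sketch below performs one proximal gradient step: a gradient move on the smooth loss followed by the proximal operator of the regularizer. The abstract does not give the form of the co-cluster regularizer, so this example substitutes the well-known L1 penalty, whose proximal operator is elementwise soft-thresholding. The function names and parameters are hypothetical and are not taken from the paper.

```python
def soft_threshold(v, t):
    """Proximal operator of t * ||v||_1, applied elementwise.

    Assumption: L1 stands in for the paper's co-cluster regularizer,
    whose exact form is not stated in the abstract.
    """
    return [max(abs(x) - t, 0.0) * (1.0 if x > 0 else -1.0) for x in v]


def proximal_gradient_step(w, grad, step, lam):
    """One proximal gradient update.

    w    : current weight vector
    grad : gradient of the smooth loss at w
    step : step size
    lam  : regularization strength
    """
    # Gradient step on the smooth part of the objective.
    moved = [wi - step * gi for wi, gi in zip(w, grad)]
    # Proximal step on the non-smooth regularizer.
    return soft_threshold(moved, step * lam)
```

Swapping in a different regularizer only changes the proximal operator; the gradient step stays the same.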


Large-Margin Multi-Label Causal Feature Learning

AAAI Conferences

In multi-label learning, an example is represented by a descriptive feature associated with several labels. Simply considering labels as independent or correlated is crude; it would be beneficial to define and exploit the causality between multiple labels. For example, an image label 'lake' implies the label 'water', but not vice versa. Since the original features are a disorderly mixture of the properties originating from different labels, it is intuitive to factorize these raw features to clearly represent each individual label and its causality relationship. Following the large-margin principle, we propose an effective approach to discover the causal features of multiple labels, thus revealing the causality between labels from the perspective of features. We show theoretically that the proposed approach is a tight approximation of the empirical multi-label classification error, and that the causality revealed strengthens the consistency of the algorithm. Extensive experiments using synthetic and real-world data demonstrate that the proposed algorithm effectively discovers label causality, generates causal features, and improves multi-label learning.


Forecasting Collector Road Speeds Under High Percentage of Missing Data

AAAI Conferences

Accurate road speed predictions can help drivers in smart route planning. Although the issue has been studied previously, most existing work focuses on arterial roads only, where sensors are placed densely enough to collect complete real-time data. For collector roads, where sensor coverage is sparse, speed prediction is often ignored. With GPS-equipped floating car signals now available, we aim to forecast collector road speeds by utilizing these signals. The main challenge compared with arterial roads comes from the missing data. In a time slot of the real case, over 90% of collector roads cannot be covered by enough floating cars. Thus most traditional approaches for arterial roads, which rely on complete historical data, cannot be employed directly. To solve this problem, we propose a multi-view road speed prediction framework. In the first view, temporal patterns are modeled by a layered hidden Markov model; in the second view, spatial patterns are modeled by a collective matrix factorization model. The two models are learned and inferred simultaneously in a co-regularized manner. Experiments conducted on the Beijing road network, based on 10K taxi signals over 2 years, demonstrate that the approach outperforms traditional approaches by 10% in MAE and RMSE.
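The two error metrics named above have standard definitions, and the sketch below shows one way to compute them when many observations are missing. How the paper handles unobserved road segments during evaluation is not stated in the abstract; scoring only the observed entries, and marking missing ones with `None`, are assumptions made for this example.

```python
import math


def mae_rmse(predicted, observed):
    """Return (MAE, RMSE) over the entries where an observation exists.

    Assumption: a missing observation is encoded as None and is skipped.
    """
    errors = [p - o for p, o in zip(predicted, observed) if o is not None]
    if not errors:
        raise ValueError("no observed values to score against")
    mae = sum(abs(e) for e in errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return mae, rmse


# Speeds in km/h for three road segments; the second has no observation.
mae, rmse = mae_rmse([30.0, 42.0, 55.0], [33.0, None, 51.0])
print(mae, rmse)
```

RMSE penalizes large errors more heavily than MAE, so reporting both gives a fuller picture of forecast quality.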


Stable Feature Selection from Brain sMRI

AAAI Conferences

Neuroimage analysis usually involves learning thousands or even millions of variables using only a limited number of samples. In this regard, sparse models, e.g. the lasso, are applied to select the optimal features and achieve high diagnosis accuracy. The lasso, however, usually results in independent, unstable features. Stability, a manifestation of the reproducibility of statistical results subject to reasonable perturbations to data and the model (Yu 2013), is an important focus in statistics, especially in the analysis of high-dimensional data. In this paper, we explore a nonnegative generalized fused lasso model for stable feature selection in the diagnosis of Alzheimer's disease. In addition to sparsity, our model incorporates two important pathological priors: the spatial cohesion of lesion voxels and the positive correlation between the features and the disease labels. To optimize the model, we propose an efficient algorithm by proving a novel link between total variation and fast network flow algorithms via conic duality. Experiments show that the proposed nonnegative model performs much better at exploring the intrinsic structure of data by selecting stable features than other state-of-the-art methods.
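To make the three ingredients of such a model concrete, the sketch below evaluates a penalty in the commonly used generalized fused lasso form: an L1 term for sparsity, a sum of absolute differences over neighboring voxels for spatial cohesion, and a nonnegativity constraint. The abstract does not state the exact objective, so this form, the edge-list representation of voxel adjacency, and all names are assumptions for illustration.

```python
def fused_lasso_penalty(w, edges, lam_sparse, lam_fuse):
    """Penalty value for weights w over a voxel adjacency graph.

    w          : list of voxel weights
    edges      : list of (i, j) index pairs for neighboring voxels
    lam_sparse : weight of the L1 (sparsity) term
    lam_fuse   : weight of the fusion (total variation) term
    """
    # Nonnegativity constraint: infeasible weights get an infinite penalty.
    if any(x < 0 for x in w):
        return float("inf")
    sparsity = sum(abs(x) for x in w)
    # Fusion term: neighboring voxels are encouraged to share a value.
    fusion = sum(abs(w[i] - w[j]) for i, j in edges)
    return lam_sparse * sparsity + lam_fuse * fusion


# Four voxels in a chain; the two middle voxels form a contiguous region.
chain = [(0, 1), (1, 2), (2, 3)]
print(fused_lasso_penalty([0.0, 2.0, 2.0, 0.0], chain, 1.0, 0.5))
```

With the same L1 mass, a contiguous block of equal weights incurs a smaller fusion term than scattered nonzero voxels, which is what favors spatially cohesive selections.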


Mining User Interests from Personal Photos

AAAI Conferences

Personal photos are enjoying explosive growth with the popularity of photo-taking devices and social media. The vast number of online photos largely reflects users' interests, emotions and opinions. Mining user interests from personal photos can boost a number of utilities, such as advertising, interest-based community detection and photo recommendation. In this paper, we study the problem of mining user interests from personal photos. We propose a User Image Latent Space Model to jointly model user interests and image contents. User interests are modeled as latent factors and each user is assumed to have a distribution over them. By inferring the latent factors and users' distributions, we can discover what the users are interested in. We model image contents with a four-level hierarchical structure where the layers correspond to themes, semantic regions, visual words and pixels respectively. Users' latent interests are embedded in the theme layer. Given image contents, users' interests can be discovered by posterior inference. We use variational inference to approximate the posteriors of latent variables and learn model parameters. Experiments on 180K Flickr photos demonstrate the effectiveness of our model.


Swiss-System Based Cascade Ranking for Gait-Based Person Re-Identification

AAAI Conferences

Human gait has been shown to be an efficient biometric measure for person identification at a distance. However, it often needs different gait features to handle various covariate conditions including viewing angles, walking speed, carrying an object and wearing different types of shoes. In order to improve the robustness of gait-based person re-identification on such multi-covariate conditions, a novel Swiss-system based cascade ranking model is proposed in this paper. Since the ranking model is able to learn a subspace where the potential true match is given the highest ranking, we formulate the gait-based person re-identification as a bipartite ranking problem and utilize it as an effective way for multi-feature ensemble learning. Then a Swiss multi-round competition system is developed for the cascade ranking model to optimize its effectiveness and efficiency. Extensive experiments on three indoor and outdoor public datasets demonstrate that our model outperforms several state-of-the-art methods remarkably.


Exploring Social Context for Topic Identification in Short and Noisy Texts

AAAI Conferences

With the pervasiveness of social media, topic identification in short texts has attracted increasing attention in recent years. However, social media texts are by nature short and noisy, and their structures are sparse and dynamic, making it difficult to identify topic categories accurately from online social media. Inspired by social science findings that preference consistency and social contagion are observed in social media, we investigate topic identification in short and noisy texts by exploring social context from the perspective of social sciences. In particular, we present a mathematical optimization formulation that incorporates the preference consistency and social contagion theories into a supervised learning method, and conduct feature selection to tackle short and noisy texts in social media, which results in a Sociological framework for Topic Identification (STI). Experimental results on real-world datasets from Twitter and Citation Network demonstrate the effectiveness of the proposed framework. Further experiments are conducted to understand the importance of social context in topic identification.


Propagating Ranking Functions on a Graph: Algorithms and Applications

AAAI Conferences

Learning to rank is an emerging learning task that opens up a diverse set of applications. However, most existing work focuses on learning a single ranking function whilst in many real world applications, there can be many ranking functions to fulfill various retrieval tasks on the same data set. How to train many ranking functions is challenging due to the limited availability of training data which is further compounded when plentiful training data is available for a small subset of the ranking functions. This is particularly true in settings, such as personalized ranking/retrieval, where each person requires a unique ranking function according to their preference, but only the functions of the persons who provide sufficient ratings (of objects, such as movies and music) can be well trained. To address this, we propose to construct a graph where each node corresponds to a retrieval task, and then propagate ranking functions on the graph. We illustrate the usefulness of the idea of propagating ranking functions and our method by exploring two real world applications.
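The idea of propagating ranking functions over a task graph can be pictured with a simple neighbor-averaging scheme, sketched below. This is not the paper's algorithm, which the abstract does not specify. It assumes each ranking function is a linear weight vector, that well-trained nodes keep their weights fixed, and that every other node repeatedly takes the mean of its neighbors' weights. All names are hypothetical.

```python
def propagate(initial, neighbors, trained, iterations=10):
    """Spread weight vectors from trained nodes to untrained ones.

    initial    : dict node -> weight vector (list of floats)
    neighbors  : dict node -> list of adjacent nodes
    trained    : set of nodes whose weights are held fixed
    iterations : number of synchronous update rounds
    """
    current = {node: list(vec) for node, vec in initial.items()}
    for _ in range(iterations):
        updated = {}
        for node, vec in current.items():
            nbrs = neighbors.get(node, [])
            if node in trained or not nbrs:
                # Trained or isolated nodes keep their current weights.
                updated[node] = vec
                continue
            # Untrained node: average each coordinate over its neighbors.
            updated[node] = [
                sum(current[m][k] for m in nbrs) / len(nbrs)
                for k in range(len(vec))
            ]
        current = updated
    return current


# Chain of three retrieval tasks; only the two end tasks are well trained.
weights = propagate(
    initial={0: [1.0, 0.0], 1: [0.0, 0.0], 2: [3.0, 2.0]},
    neighbors={0: [1], 1: [0, 2], 2: [1]},
    trained={0, 2},
)
print(weights[1])
```

In the personalized setting described above, the trained nodes would correspond to users who supplied enough ratings, and the untrained nodes to users who did not.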


Sub-Merge: Diving Down to the Attribute-Value Level in Statistical Schema Matching

AAAI Conferences

Matching and merging data from conflicting sources is the bread and butter of data integration, which drives search verticals, e-commerce comparison sites and cyber intelligence. Schema matching lifts data integration - traditionally focused on well-structured data - to highly heterogeneous sources. While schema matching has enjoyed significant success in matching data attributes, inconsistencies can exist at a deeper level, making full integration difficult or impossible. We propose a more fine-grained approach that focuses on correspondences between the values of attributes across data sources. Since the semantics of attribute values derive from their use and co-occurrence, we argue for the suitability of canonical correlation analysis (CCA) and its variants. We demonstrate the superior statistical and computational performance of multiple sparse CCA compared to a suite of baseline algorithms, on two datasets which we are releasing to stimulate further research. Our crowd-annotated data covers both cases that are relatively easy for humans to supply ground-truth, and that are inherently difficult for human computation.


Scalable and Interpretable Data Representation for High-Dimensional, Complex Data

AAAI Conferences

The majority of machine learning research has been focused on building models and inference techniques with sound mathematical properties and cutting edge performance. Little attention has been devoted to the development of data representation that can be used to improve a user's ability to interpret the data and machine learning models to solve real-world problems. In this paper, we quantitatively and qualitatively evaluate an efficient, accurate and scalable feature-compression method using latent Dirichlet allocation for discrete data. This representation can effectively communicate the characteristics of high-dimensional, complex data points. We show that the improvement of a user's interpretability through the use of a topic modeling-based compression technique is statistically significant, according to a number of metrics, when compared with other representations. Also, we find that this representation is scalable --- it maintains alignment with human classification accuracy as an increasing number of data points are shown. In addition, the learned topic layer can semantically deliver meaningful information to users that could potentially aid human reasoning about data characteristics in connection with compressed topic space.