 Statistical Learning


Instance-Wise Weighted Nonnegative Matrix Factorization for Aggregating Partitions with Locally Reliable Clusters

AAAI Conferences

We address an ensemble clustering problem in which reliable clusters are locally embedded in the given multiple partitions. We propose a new nonnegative matrix factorization (NMF)-based method in which locally reliable clusters are explicitly considered by using instance-wise weights over clusters. Our method factorizes the input cluster assignment matrix into two matrices, H and W, which are optimized by alternating between 1) updating H and W while keeping the weight matrix constant and 2) updating the weight matrix while keeping H and W constant. The weights in the second step are updated by solving a convex problem, which makes our algorithm significantly faster than existing NMF-based ensemble clustering methods. Experiments on a variety of datasets show that our method outperforms many cutting-edge ensemble clustering methods.
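
As an illustration of the input to such a factorization, the sketch below builds a binary cluster assignment matrix from several partitions, with one column per cluster of each partition. The function name `assignment_matrix` and the one-hot encoding are assumptions made for this example; the abstract does not specify how the authors construct their matrix, and the weighting and factorization steps are not shown.

```python
def assignment_matrix(partitions):
    """Stack one-hot cluster indicators of several partitions column-wise.

    partitions: list of label lists, one per base partition, all of length n.
    Returns (matrix, columns), where matrix[i][c] is 1 if instance i belongs
    to the cluster described by columns[c] = (partition index, label).
    """
    n = len(partitions[0])
    columns = []
    for p, labels in enumerate(partitions):
        if len(labels) != n:
            raise ValueError("all partitions must label the same instances")
        # One column for every distinct cluster label in this partition.
        for label in sorted(set(labels)):
            columns.append((p, label))
    matrix = [
        [1 if partitions[p][i] == label else 0 for (p, label) in columns]
        for i in range(n)
    ]
    return matrix, columns


# Hypothetical toy input: two partitions of four instances.
partitions = [[0, 0, 1, 1], [0, 1, 1, 1]]
matrix, columns = assignment_matrix(partitions)
for row in matrix:
    print(row)
```

Each row sums to the number of partitions, because every instance belongs to exactly one cluster in each partition.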


Personalized Ranking Metric Embedding for Next New POI Recommendation

AAAI Conferences

The rapid growth of location-based social networks (LBSNs) provides a vast amount of check-in data, which enables many services, e.g., point-of-interest (POI) recommendation. In this paper, we study the next new POI recommendation problem, in which new POIs with respect to users' current location are to be recommended. The challenge lies in precisely learning users' sequential information and personalizing the recommendation model. To this end, we resort to the Metric Embedding method for the recommendation, which avoids drawbacks of the Matrix Factorization technique. We propose a personalized ranking metric embedding method (PRME) to model personalized check-in sequences. We further develop a PRME-G model, which integrates sequential information, individual preference, and geographical influence, to improve the recommendation performance. Experiments on two real-world LBSN datasets demonstrate that our new algorithm outperforms the state-of-the-art next POI recommendation methods.
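
The idea of ranking by distance in an embedding space can be sketched as follows. This is a simplified illustration, not the PRME model itself: it assumes a single shared embedding space, already-learned vectors, and a hypothetical mixing weight `alpha` between the user-preference distance and the distance from the current POI. Geographical influence and the training procedure are omitted.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def rank_new_pois(user_vec, current_vec, poi_vecs, visited, alpha=0.5):
    """Rank unvisited POIs, closest first.

    alpha (hypothetical) weighs the user-to-POI distance against the
    current-POI-to-candidate distance; smaller scores rank higher.
    """
    scores = {}
    for poi, vec in poi_vecs.items():
        if poi in visited:
            continue  # only POIs that are new to the user are recommended
        scores[poi] = (alpha * sq_dist(user_vec, vec)
                       + (1 - alpha) * sq_dist(current_vec, vec))
    return sorted(scores, key=scores.get)


# Hypothetical embeddings for illustration only.
poi_vecs = {"cafe": (1.0, 0.0), "museum": (0.5, 0.0), "park": (3.0, 3.0)}
ranking = rank_new_pois(user_vec=(0.0, 0.0), current_vec=(1.0, 0.0),
                        poi_vecs=poi_vecs, visited={"cafe"})
print(ranking)
```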


An Intelligent and Unified Framework for Multiple Robot and Human Coalition Formation

AAAI Conferences

This dissertation develops the intelligent-Coalition Formation framework for Humans and Robots (i-CiFHaR), an intelligent decision-making framework for multi-agent coalition formation. i-CiFHaR is the first framework of its kind: it incorporates a library of coalition formation algorithms; employs unsupervised learning to mine crucial patterns among these algorithms; and leverages probabilistic reasoning to derive the most appropriate algorithm(s) to apply in accordance with multiple mission criteria. The dissertation also contributes to the state of the art in swarm intelligence by addressing the search stagnation limitation of existing ant colony optimization (ACO) algorithms through the integration of a simulated annealing mechanism. The experimental results demonstrate that the presented hybrid ACO algorithms significantly outperformed the best existing ACO approaches when applied to three NP-complete optimization problems (the traveling salesman problem, the maximal clique problem, and the multi-agent coalition formation problem).
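
For readers unfamiliar with simulated annealing, the sketch below shows its standard Metropolis acceptance rule, which occasionally accepts a worse solution and is the usual way such a mechanism counters search stagnation. How the dissertation integrates this rule into ACO is not described in the abstract; the function names are assumptions made for this example.

```python
import math
import random


def acceptance_probability(delta, temperature):
    """Probability of accepting a candidate whose cost differs by delta.

    delta = candidate cost - current cost (minimization). Improvements are
    always accepted; worse candidates are accepted with exp(-delta / T).
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    if delta <= 0:
        return 1.0
    return math.exp(-delta / temperature)


def accept(delta, temperature, rng=random):
    """Draw a random accept/reject decision under the rule above."""
    return rng.random() < acceptance_probability(delta, temperature)


# A worse move is far more likely to be accepted at a high temperature.
print(acceptance_probability(1.0, 10.0), acceptance_probability(1.0, 0.1))
```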


Statistical Relational Learning Towards Modelling Social Media Users

AAAI Conferences

Nowadays web users actively generate content on different social media platforms. The large number of users requiring personalized services creates a unique opportunity for researchers to explore user modelling. Substantial research has been done by utilizing user-generated content to model users by applying different classification or regression techniques. These techniques are powerful types of machine learning approaches; however, they only partially model social media users. In this work, we introduce a new statistical relational learning (SRL) framework suitable for this purpose, which we call PSL^Q. PSL^Q is the first SRL framework that supports reasoning with soft quantifiers, such as “most” and “a few”. Indeed, in models for social media it is common to assume that friends are influenced by each other’s behavior, beliefs, and preferences. Thus, having a trait only becomes probable once most or some of one’s friends have that trait. Expressing this dependency requires a soft quantifier, which can be modeled with PSL^Q. Our experimental results for link prediction in social trust networks demonstrate that the use of soft quantifiers not only allows for a natural and intuitive formulation of domain knowledge, but also improves the accuracy of inferred results.
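
One common way to give a soft quantifier such as “most” a numeric meaning is a piecewise-linear mapping from a proportion to a truth degree in [0, 1]. The sketch below illustrates that general idea only; the thresholds 0.25 and 0.75 are hypothetical, and the abstract does not state which mapping the framework uses.

```python
def soft_quantifier(proportion, lower, upper):
    """Map a proportion in [0, 1] to a truth degree in [0, 1].

    Below `lower` the quantified statement is fully false, above `upper`
    it is fully true, and in between the degree grows linearly.
    """
    if not 0.0 <= lower < upper <= 1.0:
        raise ValueError("need 0 <= lower < upper <= 1")
    if proportion <= lower:
        return 0.0
    if proportion >= upper:
        return 1.0
    return (proportion - lower) / (upper - lower)


def most(proportion):
    """Hypothetical reading of 'most' with illustrative thresholds."""
    return soft_quantifier(proportion, lower=0.25, upper=0.75)


# Degree to which "most friends have the trait" holds for 6 of 10 friends.
degree = most(6 / 10)
print(degree)
```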


Firefly Monte Carlo: Exact MCMC with Subsets of Data

AAAI Conferences

Markov chain Monte Carlo (MCMC) is a popular tool for Bayesian inference. However, MCMC cannot be practically applied to large data sets because of the prohibitive cost of evaluating every likelihood term at every iteration. Here we present Firefly Monte Carlo (FlyMC), an MCMC algorithm with auxiliary variables that queries the likelihoods of only a subset of the data at each iteration yet simulates from the exact posterior distribution. FlyMC is compatible with modern MCMC algorithms, and only requires a lower bound on the per-datum likelihood factors. In experiments, we find that FlyMC generates samples from the posterior more than an order of magnitude faster than regular MCMC, allowing MCMC methods to tackle larger datasets than were previously considered feasible.
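
The role of the lower bound can be illustrated with a small numeric check. The sketch assumes one binary auxiliary variable per datum (called `bright` here) and a strictly positive bound no larger than the likelihood; summing the two joint factors over the auxiliary variable recovers the original likelihood, which is why the marginal posterior stays exact. The function names are assumptions made for this example, and the sampler itself is not shown.

```python
def joint_factor(likelihood, bound, bright):
    """Per-datum factor of the augmented joint distribution.

    A 'bright' datum contributes likelihood - bound, a 'dark' datum
    contributes only the bound, so dark data never need their exact
    likelihood evaluated.
    """
    if not 0.0 < bound <= likelihood:
        raise ValueError("need 0 < bound <= likelihood")
    return likelihood - bound if bright else bound


def bright_probability(likelihood, bound):
    """Conditional probability that the datum is bright."""
    if not 0.0 < bound <= likelihood:
        raise ValueError("need 0 < bound <= likelihood")
    return (likelihood - bound) / likelihood


# Illustrative values: marginalizing the auxiliary variable gives back L.
L, B = 0.8, 0.6
marginal = joint_factor(L, B, True) + joint_factor(L, B, False)
print(marginal, bright_probability(L, B))
```

A tighter bound makes the bright probability smaller, so fewer exact likelihood evaluations are needed per iteration.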


Using Social Media to Enhance Emergency Situation Awareness: Extended Abstract

AAAI Conferences

Social media platforms, such as Twitter, offer a rich source of real-time information about real-world events, particularly during mass emergencies. Sifting valuable information from social media provides useful insight into time-critical situations for emergency officers to understand the impact of hazards and act on emergency responses in a timely manner. This work focuses on analyzing Twitter messages generated during natural disasters, and shows how natural language processing and data mining techniques can be utilized to extract situation awareness information from Twitter. We present key relevant approaches that we have investigated including burst detection, tweet filtering and classification, online clustering, and geotagging.
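
As a rough illustration of burst detection, the sketch below flags time windows whose message count exceeds the mean of the preceding windows by several standard deviations. This is a generic baseline written for this listing, not the method used in the work above; `window` and `k` are hypothetical parameters.

```python
from statistics import mean, pstdev


def detect_bursts(counts, window=4, k=3.0):
    """Return indices whose count exceeds mean + k * std of the
    `window` counts immediately before them."""
    bursts = []
    for t in range(window, len(counts)):
        history = counts[t - window:t]
        threshold = mean(history) + k * pstdev(history)
        if counts[t] > threshold:
            bursts.append(t)
    return bursts


# Hypothetical per-minute counts of messages mentioning a keyword.
counts = [10, 12, 11, 9, 10, 50, 11]
bursts = detect_bursts(counts)
print(bursts)  # [5]
```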


kLog: A Language for Logical and Relational Learning with Kernels (Extended Abstract)

AAAI Conferences

We introduce kLog, a novel language for kernel-based learning on expressive logical and relational representations. kLog allows users to specify logical and relational learning problems declaratively. It builds on simple but powerful concepts: learning from interpretations, entity/relationship data modeling, and logic programming. Access by the kernel to the rich representation is mediated by a technique we call graphicalization: the relational representation is first transformed into a graph — in particular, a grounded entity/relationship diagram. Subsequently, a choice of graph kernel defines the feature space. The kLog framework can be applied to tackle the same range of tasks that has made statistical relational learning so popular, including classification, regression, multitask learning, and collective classification. An empirical evaluation shows that kLog can be either more accurate, or much faster at the same level of accuracy, than Tilde and Alchemy.
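
The graphicalization step can be pictured with a small sketch that turns entity and relationship tuples into a graph with one vertex per tuple and an edge from each relationship vertex to the entities it mentions. This is an illustration in Python with made-up data structures, not kLog syntax or its actual implementation.

```python
def graphicalize(entities, relationships):
    """Build a graph from ground entity and relationship tuples.

    entities: dict mapping entity id -> entity type.
    relationships: list of (name, tuple of entity ids).
    Returns (vertices, edges): vertices maps a vertex key to its label,
    edges is a set of (relationship vertex, entity vertex) pairs.
    """
    vertices = {("entity", ent_id): kind for ent_id, kind in entities.items()}
    edges = set()
    for idx, (name, args) in enumerate(relationships):
        rel_vertex = ("relationship", idx)
        vertices[rel_vertex] = name
        for ent_id in args:
            if ent_id not in entities:
                raise KeyError(f"unknown entity: {ent_id}")
            edges.add((rel_vertex, ("entity", ent_id)))
    return vertices, edges


# Hypothetical toy interpretation: two atoms joined by one bond.
vertices, edges = graphicalize(
    entities={"a1": "atom", "a2": "atom"},
    relationships=[("bond", ("a1", "a2"))],
)
print(len(vertices), len(edges))
```

A graph kernel would then be applied to graphs of this kind to define the feature space.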


Learning a Robust Consensus Matrix for Clustering Ensemble via Kullback-Leibler Divergence Minimization

AAAI Conferences

Clustering ensemble has emerged as an important extension of the classical clustering problem. It provides a framework for combining multiple base clusterings of a data set to generate a final consensus result. Most existing methods simply combine clustering results without accounting for noise, which may degrade the clustering performance. In this paper, we propose a novel robust clustering ensemble method. To improve the robustness, we capture the sparse and symmetric errors and integrate them into our robust and consensus framework to learn a low-rank matrix. Since the objective function is difficult to optimize, we develop a block coordinate descent algorithm which is theoretically guaranteed to converge. Experimental results on real-world data sets demonstrate the effectiveness of our method.
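
For context, a simple way to combine base clusterings is the averaged co-association matrix, whose entry (i, j) is the fraction of base clusterings that place instances i and j in the same cluster. The sketch below computes that naive consensus only; it does not implement the robust low-rank learning or the error modelling described above, and the function name is an assumption.

```python
def co_association(partitions):
    """Average co-association matrix of several base clusterings.

    partitions: list of label lists, all of the same length n.
    Entry [i][j] is the share of partitions in which i and j share a label.
    """
    n = len(partitions[0])
    m = len(partitions)
    if any(len(labels) != n for labels in partitions):
        raise ValueError("all partitions must label the same instances")
    return [
        [sum(1 for labels in partitions if labels[i] == labels[j]) / m
         for j in range(n)]
        for i in range(n)
    ]


# Hypothetical toy input: two base clusterings of three instances.
consensus = co_association([[0, 0, 1], [0, 1, 1]])
for row in consensus:
    print(row)
```

The result is symmetric with ones on the diagonal.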