Goto

Collaborating Authors

 Xu, Congfu


Riemannian Tensor Completion with Side Information

arXiv.org Machine Learning

By restricting the iterate on a nonlinear manifold, the recently proposed Riemannian optimization methods prove to be both efficient and effective in low rank tensor completion problems. However, existing methods fail to exploit the easily accessible side information, due to their format mismatch. Consequently, there is still room for improvement in such methods. To fill the gap, in this paper, a novel Riemannian model is proposed to organically integrate the original model and the side information by overcoming their inconsistency. For this particular model, an efficient Riemannian conjugate gradient descent solver is devised based on a new metric that captures the curvature of the objective.Numerical experiments suggest that our solver is more accurate than the state-of-the-art without compromising the efficiency.


Incorporating Collaborative Ranking Algorithm with Weighted Recursive Autoencoder for Item Recommendation

AAAI Conferences

Collaborative filtering (CF) with implicit feedback is a successful method for recommending items to users, which does not require a knowledge of the items or users. CF methods can be mainly classified into two categories. One is point-wise regression based and the other is pair-wise ranking based, where the latter one only tries to find out the items that users prefer while ignores the items that users dislike, and usually gives out a better recommended item list. The performance of CF-based methods degrades significantly when the feedback information is sparse. To address the problem, many kinds of auxiliary information have been utilized such as users’ reviews on items, items’ content and description information, price, brands. In this paper we utilize a weighted recursive autoencoder (RAE) to extract useful features from several heterogeneous auxiliary information and tightly couple the weighted RAE with a pair-wise ranking based CF method. Analysis of the hyperparameters illustrates that auxiliary information from different sources is indeed able to benefit our model. Empirical experiments on six real world datasets show that our method outperforms other state-of-the-art methods.


Fast Hybrid Algorithm for Big Matrix Recovery

AAAI Conferences

Large-scale Nuclear Norm penalized Least Square problem (NNLS) is frequently encountered in estimation of low rank structures. In this paper we accelerate the solution procedure by combining non-smooth convex optimization with smooth Riemannian method. Our methods comprise of two phases. In the first phase, we use Alternating Direction Method of Multipliers (ADMM) both to identify the fix rank manifold where an optimum resides and to provide an initializer for the subsequent refinement. In the second phase, two superlinearly convergent Riemannian methods: Riemannian NewTon (NT) and Riemannian Conjugate Gradient descent (CG) are adopted to improve the approximation over a fix rank manifold. We prove that our Hybrid method of ADMM and NT (HADMNT) converges to an optimum of NNLS at least quadratically. The experiments on large-scale collaborative filtering datasets demonstrate very competitive performance of these fast hybrid methods compared to the state-of-the-arts.


Recommendation Algorithms for Optimizing Hit Rate, User Satisfaction and Website Revenue

AAAI Conferences

We generally use hit rate to measure the performance of item recommendation algorithms. In addition to hit rate, we consider another two important factors which are ignored by most previous works. First, whether users are satisfied with the recommended items. It is possible that a user has bought an item but dislikes it. Hence high hit rate does not reflect high customer satisfaction. Second, whether the website retailers are satisfied with the recommendation results. If a customer is interested in two products and wants to buy one of them, it may be better to suggest the item which can help bring more profit. Therefore, a good recommendation algorithm should not only consider improving hit rate but also consider optimizing user satisfaction and website revenue. In this paper, we propose two algorithms for the above purposes and design two modified hit rate based metrics to measure them. Experimental results on 10 real-world datasets show that our methods can not only achieve better hit rate, but improve user satisfaction and website revenue comparing with the state-of-the-art models.


A Fast Spectral Relaxation Approach to Matrix Completion via Kronecker Products

AAAI Conferences

In the existing methods for solving matrix completion, such as singular value thresholding (SVT), soft-impute and fixed point continuation (FPCA) algorithms, it is typically required to repeatedly implement singular value decompositions (SVD) of matrices.When the size of the matrix in question is large, the computational complexity of finding a solution is costly. To reduce this expensive computational complexity, we apply Kronecker products to handle the matrix completion problem. In particular, we propose using Kronecker factorization, which approximates a matrix by the Kronecker product of several matrices of smaller sizes. Weintroduce Kronecker factorization into the soft-impute framework and devise an effective matrix completion algorithm.Especially when the factorized matrices have about the samesizes, the computational complexity of our algorithm is improved substantially.


Not So Naive Online Bayesian Spam Filter

AAAI Conferences

Spam filtering, as a key problem in electronic communication, has drawn significant attention due to increasingly huge amounts of junk email on the Internet. Content-based filtering is one reliable method in combating with spammers' changing tactics. Naive Bayes (NB) is one of the earliest content-based machine learning methods both in theory and practice in combating with spammers, which is easy to implement while can achieve considerable accuracy. In this paper, the traditional online Bayesian classifier are enhanced  by two ways. First, from theory's point of view, we devise a self-adaptive mechanism to gradually weaken the assumption of independence required by original NB in the online training process, and as a result of that our NSNB is no longer ``naive''. Second, we propose other engineering ways to make the classifier more robust and accuracy. The experiment results show that our NSNB does give state-of-the-art classification performance on online spam filtering on large benchmark data sets while it is extremely fast and takes up little memory in comparison with other statistical methods.