
Collaborating Authors

 Wang, Nan


Rank4Class: A Ranking Formulation for Multiclass Classification

arXiv.org Artificial Intelligence

Multiclass classification (MCC) is a fundamental machine learning problem that aims to classify each instance into one of a predefined set of classes. Given an instance, a classification model computes a score for each class, and all scores are then used to sort the classes. The performance of a classification model is usually measured by Top-K Accuracy/Error (e.g., K=1 or 5). In this paper, we do not aim to propose new neural representation learning models as most recent works do, but to show that it is easy to boost MCC performance with a novel formulation through the lens of ranking. In particular, by viewing MCC as the task of ranking classes for an instance, we first argue that ranking metrics, such as Normalized Discounted Cumulative Gain (NDCG), can be more informative than existing Top-K metrics. We further demonstrate that the dominant neural MCC architecture can be formulated as a neural ranking framework with a specific set of design choices. Based on this generalization, we show that it is straightforward and intuitive to leverage techniques from the rich information retrieval literature to improve MCC performance out of the box. Extensive empirical results on both text and image classification tasks with diverse datasets and backbone models (e.g., BERT and ResNet for text and image classification, respectively) show the value of our proposed framework.
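To make the metric contrast concrete, here is a minimal NumPy sketch comparing Top-K accuracy with NDCG@K on a single instance's class scores. The binary-relevance gain (only the true class is relevant) is an illustrative choice, not necessarily the paper's exact protocol:

```python
import numpy as np

def top_k_accuracy(scores, true_class, k=5):
    """1.0 if the true class is among the k highest-scoring classes, else 0.0."""
    top_k = np.argsort(scores)[::-1][:k]
    return float(true_class in top_k)

def ndcg_at_k(scores, true_class, k=5):
    """NDCG@k with binary relevance: only the true class is relevant.

    Unlike Top-K accuracy, the gain is discounted by the rank at which the
    true class appears, so rank 1 scores higher than rank 5.
    """
    ranking = np.argsort(scores)[::-1][:k]
    for rank, cls in enumerate(ranking, start=1):
        if cls == true_class:
            return 1.0 / np.log2(rank + 1)   # ideal DCG is 1/log2(2) = 1
    return 0.0

scores = np.array([0.1, 0.3, 0.05, 0.4, 0.15])   # model scores for 5 classes
print(top_k_accuracy(scores, true_class=1, k=3))  # 1.0 -- true class is in the top 3
print(ndcg_at_k(scores, true_class=1, k=3))       # ~0.63 -- but only at rank 2
```

Even with binary relevance, NDCG@K rewards placing the true class near the top of the ranking, whereas Top-K accuracy treats every position within the top K identically.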


Comparative Explanations of Recommendations

arXiv.org Artificial Intelligence

As recommendation is essentially a comparative (or ranking) process, a good explanation should illustrate to users why an item is believed to be better than another, i.e., provide comparative explanations of the recommended items. Ideally, after reading the explanations, a user should reach the same ranking of items as the system's. Unfortunately, little research attention has been paid to such comparative explanations. In this work, we develop an extract-and-refine architecture to explain the relative comparisons among a set of ranked items from a recommender system. For each recommended item, we first extract one sentence from its associated reviews that best suits the desired comparison against a set of reference items. This extracted sentence is then further articulated with respect to the target user through a generative model to better explain why the item is recommended. We design a new explanation quality metric based on BLEU to guide the end-to-end training of the extraction and refinement components, which avoids generating generic content. Extensive offline evaluations on two large recommendation benchmark datasets and rigorous user studies against an array of state-of-the-art explainable recommendation algorithms demonstrate the necessity of comparative explanations and the effectiveness of our solution.
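As a rough illustration of the extraction step, the sketch below picks the review sentence whose embedding best aligns with the direction separating the item from its reference items. The `embed` callable and the direction-based scoring rule are assumptions for illustration; the paper instead learns the extraction component end-to-end under its BLEU-based quality metric:

```python
import numpy as np

def extract_comparative_sentence(item_sentences, item_vec, reference_vecs, embed):
    """Pick the review sentence that best supports 'this item over its references'.

    embed: any callable mapping a sentence to a vector (assumed, e.g. a sentence
    encoder). Scoring by alignment with the item-minus-references direction is
    an illustrative stand-in for the learned extraction model.
    """
    direction = item_vec - np.mean(reference_vecs, axis=0)   # what sets this item apart
    direction = direction / (np.linalg.norm(direction) + 1e-12)
    scores = [float(embed(s) @ direction) for s in item_sentences]
    return item_sentences[int(np.argmax(scores))]

# toy usage with a bag-of-words "encoder" over a tiny vocabulary
vocab = {"battery": 0, "screen": 1, "price": 2}
def embed(sentence):
    v = np.zeros(len(vocab))
    for w in sentence.lower().split():
        if w in vocab:
            v[vocab[w]] += 1.0
    return v

item_vec = np.array([2.0, 0.5, 0.0])                     # item strong on battery
refs = np.array([[0.5, 1.0, 1.0], [0.4, 0.8, 1.2]])
sents = ["great battery life", "screen is average", "price is fair"]
print(extract_comparative_sentence(sents, item_vec, refs, embed))  # battery sentence
```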


Unbiased Graph Embedding with Biased Graph Observations

arXiv.org Artificial Intelligence

Graph embedding techniques have been increasingly employed in real-world machine learning tasks on graph-structured data, such as social recommendation and protein structure modeling. Since the generation of a graph is inevitably affected by some sensitive node attributes (such as the gender and age of users in a social network), the learned graph representations can inherit such sensitive information and introduce undesirable biases in downstream tasks. Most existing works on debiasing graph representations add ad-hoc constraints on the learned embeddings to restrict their distributions, which, however, compromises the utility of the resulting representations in downstream tasks. In this paper, we propose a principled new way to obtain unbiased representations: learning from an underlying bias-free graph that is not influenced by sensitive attributes. Based on this new perspective, we propose two complementary methods for uncovering such an underlying graph, with the goal of introducing minimal impact on the utility of the learned representations in downstream tasks. Both our theoretical justification and extensive experimental comparisons against state-of-the-art solutions demonstrate the effectiveness of our proposed methods.
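A toy sketch of the underlying idea, under assumptions not taken from the paper: reweight observed edges so that same-group and cross-group pairs (with respect to a sensitive attribute) carry the same expected edge weight in the graph that embeddings are then trained on. The paper's two methods are more principled than this heuristic:

```python
import numpy as np

def debias_adjacency(A, s):
    """Reweight an observed adjacency matrix toward a bias-free counterpart.

    A: (n, n) symmetric 0/1 adjacency; s: (n,) binary sensitive attribute.
    Heuristic: scale each edge so that same-group and cross-group pairs have
    the same expected edge weight, removing the attribute's marginal effect
    on connectivity. (An illustrative stand-in, not the paper's estimator.)
    """
    same = s[:, None] == s[None, :]
    rate_same = A[same].mean()                  # empirical same-group edge rate
    rate_cross = A[~same].mean()                # empirical cross-group edge rate
    target = A.mean()                           # attribute-independent edge rate
    W = np.where(same, target / (rate_same + 1e-12), target / (rate_cross + 1e-12))
    return A * W                                # weighted graph for embedding training

rng = np.random.default_rng(0)
s = rng.integers(0, 2, size=50)
same = s[:, None] == s[None, :]
A = (rng.random((50, 50)) < np.where(same, 0.3, 0.1)).astype(float)  # homophilous graph
A = np.triu(A, 1); A = A + A.T                                       # symmetrize, no self-loops
print(debias_adjacency(A, s)[:3, :3])
```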


CORe: Capitalizing On Rewards in Bandit Exploration

arXiv.org Machine Learning

We propose a bandit algorithm that explores purely by randomizing its past observations. In particular, the sufficient optimism in the mean reward estimates is achieved by exploiting the variance in the past observed rewards. We name the algorithm Capitalizing On Rewards (CORe). The algorithm is general and can be easily applied to different bandit settings. The main benefit of CORe is that its exploration is fully data-dependent.

A multi-armed bandit can be considered as a special case of linear bandits, where the feature vector of each arm is a one-hot vector indicating the index of the arm, and the parameter vector is a vector of corresponding mean rewards. Arguably, the most popular and well-studied exploration strategies for solving bandit problems are Thompson sampling (TS) [Thompson, 1933, Agrawal and Goyal, 2013] and Optimism in the Face of Uncertainty (OFU) [Auer et al., 2002]. TS maintains a posterior distribution over each arm's mean reward.
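A minimal simulation in the spirit of this idea, assuming Gaussian rewards and a simple standard-error-scaled perturbation (the paper's exact randomization differs):

```python
import numpy as np

rng = np.random.default_rng(0)

def core_like_bandit(true_means, horizon=2000, seed_pulls=3):
    """History-randomized exploration in the spirit of CORe (a sketch, not the
    paper's exact perturbation): each arm's index is its sample mean plus noise
    on the scale of its empirical standard error, so exploration is driven
    entirely by the variance of that arm's own observed rewards.
    """
    k = len(true_means)
    history = [[] for _ in range(k)]
    for t in range(horizon):
        if t < k * seed_pulls:                 # a few forced pulls per arm to seed variance
            arm = t % k
        else:
            index = [np.mean(h) + rng.normal(0.0, np.std(h) / np.sqrt(len(h)) + 1e-6)
                     for h in history]
            arm = int(np.argmax(index))
        history[arm].append(rng.normal(true_means[arm], 1.0))   # Gaussian rewards
    return [len(h) for h in history]           # pull counts; the best arm should dominate

print(core_like_bandit([0.2, 0.5, 0.8]))
```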


Explanation as a Defense of Recommendation

arXiv.org Artificial Intelligence

Textual explanations have been shown to improve user satisfaction with machine-made recommendations. However, current mainstream solutions loosely connect the learning of explanation with the learning of recommendation: for example, they are often modeled separately as a rating prediction task and a content generation task. In this work, we propose to strengthen their connection by enforcing sentiment alignment between a recommendation and its corresponding explanation. At training time, the two learning tasks are joined by a latent sentiment vector, which is encoded by the recommendation module and used to make word choices during explanation generation. At both training and inference time, the explanation module is required to generate explanation text that matches the sentiment predicted by the recommendation module. Extensive experiments demonstrate that our solution outperforms a rich set of baselines in both recommendation and explanation tasks, especially in the quality of its generated explanations. More importantly, our user studies confirm that the generated explanations help users better recognize the differences between recommended items and understand why an item is recommended.
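A schematic of how such a joint objective might be assembled, with an explicit cosine alignment penalty standing in for the shared latent sentiment vector; the terms and weights here are illustrative assumptions, not the paper's loss:

```python
import numpy as np

def joint_loss(pred_rating, true_rating, explanation_nll, sent_rec, sent_exp,
               alpha=1.0, beta=0.5):
    """Schematic joint objective for sentiment-aligned explanation.

    sent_rec / sent_exp: sentiment vectors produced by the recommendation and
    explanation modules. The explicit cosine penalty and the weights alpha,
    beta are illustrative assumptions; the paper couples the modules through
    a shared latent sentiment vector instead.
    """
    rec = (pred_rating - true_rating) ** 2                     # rating regression
    cos = sent_rec @ sent_exp / (
        np.linalg.norm(sent_rec) * np.linalg.norm(sent_exp) + 1e-12)
    align = 1.0 - cos                                          # pull the two sentiment views together
    return rec + alpha * explanation_nll + beta * align

print(joint_loss(4.2, 4.0, explanation_nll=2.3,
                 sent_rec=np.array([0.9, -0.1]), sent_exp=np.array([0.8, 0.0])))
```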


Directional Multivariate Ranking

arXiv.org Machine Learning

User-provided multi-aspect evaluations manifest users' detailed feedback on recommended items and enable fine-grained understanding of their preferences. Extensive studies have shown that modeling such data greatly improves the effectiveness and explainability of recommendations. However, although ranking is essential in recommendation, there is as yet no principled solution for collectively generating multiple item rankings over different aspects. In this work, we propose a directional multi-aspect ranking criterion to enable a holistic ranking of items with respect to multiple aspects. Specifically, we view a multi-aspect evaluation as an integral effort from a user that forms a vector of his or her preferences over aspects. Our key insight is that the direction of the difference vector between two multi-aspect preference vectors reveals the pairwise order of comparison. Hence, a multi-aspect ranking criterion must preserve the observed directions from such pairwise comparisons. We further derive a complete solution for the multi-aspect ranking problem based on a probabilistic multivariate tensor factorization model. Comprehensive experimental analysis on a large TripAdvisor multi-aspect rating dataset and a Yelp review text dataset confirms the effectiveness of our solution.
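The criterion can be sketched as a pairwise loss that rewards a predicted difference vector for pointing in the observed direction, aspect by aspect. The logistic penalty below is an assumed surrogate; the paper derives its criterion from a probabilistic multivariate tensor factorization model:

```python
import numpy as np

def directional_pairwise_loss(pred_i, pred_j, obs_i, obs_j):
    """Directional multi-aspect ranking criterion (sketch): the predicted
    difference vector between two items' aspect scores should point in the
    same direction as the observed difference vector.
    """
    d_pred = pred_i - pred_j                     # predicted aspect-wise differences
    d_obs = obs_i - obs_j                        # observed aspect-wise differences
    # reward agreement in direction, aspect by aspect (sign alignment)
    agreement = np.sum(np.sign(d_obs) * d_pred)
    return float(np.log1p(np.exp(-agreement)))   # small when directions agree

obs_i = np.array([5.0, 4.0, 3.0])    # e.g., hotel aspects: value, service, location
obs_j = np.array([3.0, 4.5, 2.0])
print(directional_pairwise_loss(np.array([4.2, 3.9, 3.1]),
                                np.array([3.1, 4.2, 2.2]), obs_i, obs_j))  # ~0.095
```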


The FacT: Taming Latent Factor Models for Explainability with Factorization Trees

arXiv.org Machine Learning

Latent factor models have achieved great success in personalized recommendation, but they are also notoriously difficult to explain. In this work, we integrate regression trees to guide the learning of latent factor models for recommendation, and use the learnt tree structure to explain the resulting latent factors. Specifically, we build regression trees on users and items respectively, using user-generated reviews, and associate a latent profile with each node on the trees to represent users and items. As the regression trees grow, the latent factors are gradually refined under the regularization imposed by the tree structure. As a result, we can track the creation of latent profiles by following each factor's path through the regression trees, which thus serves as an explanation for the resulting recommendations. Extensive experiments on two large collections of Amazon and Yelp reviews demonstrate the advantage of our model over several competitive baseline algorithms. Moreover, our extensive user study confirms the practical value of the explainable recommendations generated by our model.
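A simplified sketch of the tree-guided idea: recursively split users on binary opinion features so as to reduce embedding variance, and give every node a latent profile (here just the mean member embedding) that can regularize its members' factors. The split criterion and profile definition are stand-ins for the paper's alternating optimization:

```python
import numpy as np

def grow_profiles(opinions, embeddings, depth=0, max_depth=2):
    """Tree-guided latent profiles (simplified sketch).

    opinions: (n_users, n_features) nonnegative opinion counts from reviews;
    embeddings: (n_users, d) latent factors. Split on the binary opinion
    feature that most reduces embedding variance; each node stores a profile
    (mean member embedding) as a regularization target for its members.
    """
    node = {"profile": embeddings.mean(axis=0)}
    if depth == max_depth or len(embeddings) < 4:
        return node
    best = None
    for f in range(opinions.shape[1]):
        mask = opinions[:, f] > 0
        if mask.all() or not mask.any():
            continue                                   # degenerate split, skip
        cost = (embeddings[mask].var(axis=0).sum() * mask.sum()
                + embeddings[~mask].var(axis=0).sum() * (~mask).sum())
        if best is None or cost < best[0]:
            best = (cost, f, mask)
    if best is None:
        return node
    _, f, mask = best
    node["feature"] = f
    node["yes"] = grow_profiles(opinions[mask], embeddings[mask], depth + 1, max_depth)
    node["no"] = grow_profiles(opinions[~mask], embeddings[~mask], depth + 1, max_depth)
    return node

rng = np.random.default_rng(0)
tree = grow_profiles(rng.integers(0, 3, size=(40, 5)), rng.normal(size=(40, 8)))
print(tree["feature"], tree["profile"].shape)   # root split feature and profile size
```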


Explainable Recommendation via Multi-Task Learning in Opinionated Text Data

arXiv.org Artificial Intelligence

Explaining automatically generated recommendations allows users to make more informed and accurate decisions about which results to utilize, and therefore improves their satisfaction. In this work, we develop a multi-task learning solution for explainable recommendation. Two companion learning tasks, user preference modeling for recommendation and opinionated content modeling for explanation, are integrated via a joint tensor factorization. As a result, the algorithm predicts not only a user's preference over a list of items, i.e., the recommendation, but also how the user would appreciate a particular item at the feature level, i.e., an opinionated textual explanation. Extensive experiments on two large collections of Amazon and Yelp reviews confirm the effectiveness of our solution in both recommendation and explanation tasks, compared with several existing recommendation algorithms. Our extensive user study further demonstrates the practical value of the explainable recommendations generated by our algorithm.
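In a plain CP-style factorization, which this sketch assumes for illustration (the paper's joint factorization is more structured), a single set of factors can serve both predictions:

```python
import numpy as np

def cp_scores(U, V, F, u, i, f):
    """Joint tensor factorization sketch (CP form): one set of factors serves
    both the recommendation score and the feature-level opinion score.
    """
    rating = U[u] @ V[i]                       # recommendation: user-item affinity
    opinion = np.sum(U[u] * V[i] * F[f])       # explanation: user-item-feature score
    return rating, opinion

rng = np.random.default_rng(0)
U = rng.normal(size=(100, 8))   # user factors
V = rng.normal(size=(50, 8))    # item factors
F = rng.normal(size=(20, 8))    # feature factors
print(cp_scores(U, V, F, u=3, i=7, f=2))   # predicted rating and feature-level opinion
```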


Hypergraph Learning With Cost Interval Optimization

AAAI Conferences

In many classification tasks, the misclassification costs of different categories vary significantly. Under such circumstances, it is essential to identify the importance of different categories and assign different misclassification losses in many applications, such as medical diagnosis, saliency detection, and software defect prediction. However, it is often infeasible to determine an accurate cost value without substantial domain knowledge. In most cases, we may only know which category is more important than the others, e.g., that identifying defect-prone software is more important than identifying defect-free software. To tackle these issues, we propose a hypergraph learning method with cost interval optimization, which can handle cost intervals when data is formulated using high-order relationships. In this way, data correlations are modeled by a hypergraph structure, which has the merit of exploiting the underlying relationships behind the data. With a cost-sensitive hypergraph structure, and in order to improve the performance of the classifier without a precise cost value, we further introduce cost interval optimization into hypergraph learning. Optimizing over the cost interval achieves better performance than choosing an uncertain fixed cost in the learning process. To evaluate the effectiveness of the proposed method, we have conducted experiments on two groups of datasets: the NASA Metrics Data Program (NASA) dataset and the UCI Machine Learning Repository (UCI) dataset. Experimental results and comparisons with state-of-the-art methods demonstrate the better performance of our proposed method.
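The cost-interval idea can be sketched as a search over the interval that avoids committing to one uncertain cost value; the worst-case selection rule and the `train_and_eval` interface below are assumptions for illustration, detached from the hypergraph machinery:

```python
import numpy as np

def choose_cost(train_and_eval, c_min, c_max, n_grid=10):
    """Cost-interval selection sketch: rather than committing to one uncertain
    cost, search the interval and keep the cost whose classifier has the best
    worst-case risk across the interval. train_and_eval(c_fit, c_eval) -> risk
    is an assumed interface (train with cost c_fit, evaluate under c_eval).
    """
    grid = np.linspace(c_min, c_max, n_grid)
    best_c, best_worst = None, np.inf
    for c_fit in grid:
        worst = max(train_and_eval(c_fit, c_eval) for c_eval in grid)
        if worst < best_worst:
            best_c, best_worst = c_fit, worst
    return best_c

# toy usage: risk is smallest when the fitted cost matches the evaluation cost,
# so the worst-case-optimal choice lands near the middle of the interval
print(choose_cost(lambda c_fit, c_eval: abs(c_fit - c_eval), 1.0, 10.0))  # 5.0
```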


Gaussian-binary Restricted Boltzmann Machines on Modeling Natural Image Statistics

arXiv.org Machine Learning

We present a theoretical analysis of Gaussian-binary restricted Boltzmann machines (GRBMs) from the perspective of density models. The key aspect of this analysis is to show that GRBMs can be formulated as a constrained mixture of Gaussians, which gives much better insight into the model's capabilities and limitations. We show that GRBMs are capable of learning meaningful features both in a two-dimensional blind source separation task and in modeling natural images. Further, we show that reported difficulties in training GRBMs are due to failures of the training algorithm rather than the model itself. Based on our analysis, we propose several training recipes that allowed successful and fast training in our experiments. Finally, we discuss the relationship of GRBMs to several modifications that have been proposed to improve the model.
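The mixture-of-Gaussians view is easy to see from one block-Gibbs step in the standard GRBM parameterization: each binary hidden configuration selects a Gaussian component mean for the visibles. A minimal sketch:

```python
import numpy as np

rng = np.random.default_rng(0)

def grbm_gibbs_step(x, W, b, c, sigma=1.0):
    """One block-Gibbs step in a Gaussian-binary RBM (standard parameterization).

    Sample binary hiddens given visibles, then resample Gaussian visibles whose
    mean b + W h is the component selected by h -- exactly the constrained
    mixture-of-Gaussians reading: p(x) = sum_h p(h) N(x; b + W h, sigma^2 I).
    """
    p_h = 1.0 / (1.0 + np.exp(-(c + (x / sigma**2) @ W)))    # p(h_j = 1 | x)
    h = (rng.random(p_h.shape) < p_h).astype(float)
    return rng.normal(b + W @ h, sigma), h                   # p(x | h) is Gaussian

D, H = 4, 3
W = rng.normal(scale=0.1, size=(D, H))
x, h = grbm_gibbs_step(rng.normal(size=D), W, b=np.zeros(D), c=np.zeros(H))
print(x, h)
```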