Goto

 University of Iowa


Contextual Collaborative Filtering for Student Response Prediction in Mixed-Format Tests

AAAI Conferences

The purpose of this study is to design a machine learning approach to predict student responses in mixed-format tests. Specifically, a novel contextual collaborative filtering model is proposed that extracts latent factors for students and test items by exploiting item information. Empirical results from a simulation study validate the effectiveness of the proposed method.
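A minimal sketch of the kind of model the abstract describes: a logistic matrix factorization whose item side is augmented with a linear term over item features (the "context"). The model form, hyperparameters, and toy data below are illustrative assumptions, not the paper's actual formulation.

```python
# Hypothetical sketch: response r_si ~ sigmoid(u_s . v_i + w . x_i),
# where x_i encodes item information (e.g. format), trained by SGD
# on the logistic loss. Illustrative only, not the paper's model.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_contextual_cf(R, X, k=4, lr=0.05, reg=0.01, epochs=300):
    """R: (students x items) 0/1 response matrix, np.nan = unobserved.
    X: (items x d) item-feature ('context') matrix."""
    n_s, n_i = R.shape
    U = 0.1 * rng.standard_normal((n_s, k))   # student latent factors
    V = 0.1 * rng.standard_normal((n_i, k))   # item latent factors
    w = np.zeros(X.shape[1])                  # weights on item features
    obs = [(s, i) for s in range(n_s) for i in range(n_i)
           if not np.isnan(R[s, i])]
    for _ in range(epochs):
        for s, i in obs:
            u, v = U[s].copy(), V[i].copy()
            g = sigmoid(u @ v + w @ X[i]) - R[s, i]  # log-loss gradient
            U[s] -= lr * (g * v + reg * u)
            V[i] -= lr * (g * u + reg * v)
            w    -= lr * g * X[i]
    return U, V, w

def predict(U, V, w, X, s, i):
    """Predicted probability that student s answers item i correctly."""
    return sigmoid(U[s] @ V[i] + w @ X[i])

# Toy demo: 3 students, 4 items, one-hot item features.
R = np.array([[1.0, 1, 0, 0],
              [1,   1, 0, 0],
              [0,   0, 1, 1]])
X = np.eye(4)
U, V, w = fit_contextual_cf(R, X)
```

Because the item-feature term is shared across students, it can capture format effects (e.g. multiple-choice versus constructed-response) that pure matrix factorization would have to relearn per item.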


Stochastic Optimization for Kernel PCA

AAAI Conferences

Kernel Principal Component Analysis (PCA) is a popular extension of PCA which is able to find nonlinear patterns from data. However, the application of kernel PCA to large-scale problems remains a major challenge, due to its quadratic space complexity and cubic time complexity in the number of examples. To address this limitation, we utilize techniques from stochastic optimization to solve kernel PCA with linear space and time complexities per iteration. Specifically, we formulate it as a stochastic composite optimization problem, where a nuclear norm regularizer is introduced to promote low-rankness, and then develop a simple algorithm based on stochastic proximal gradient descent. During the optimization process, the proposed algorithm always maintains a low-rank factorization of iterates that can be conveniently held in memory. Compared to previous iterative approaches, a remarkable property of our algorithm is that it is equipped with an explicit rate of convergence. Theoretical analysis shows that the solution of our algorithm converges to the optimum at an O(1/T) rate, where T is the number of iterations.
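The optimization recipe in the abstract, a stochastic gradient step followed by a nuclear-norm proximal step that soft-thresholds singular values, can be sketched generically. The loss below (recovering a low-rank matrix from randomly sampled columns) is a stand-in assumption; the paper's actual kernel-PCA objective and step sizes are not reproduced here.

```python
# Generic stochastic proximal gradient descent with a nuclear-norm
# regularizer; the singular-value soft-thresholding step keeps each
# iterate low-rank. Objective and constants are illustrative.
import numpy as np

rng = np.random.default_rng(1)

def prox_nuclear(W, tau):
    """Proximal operator of tau * ||.||_*: soft-threshold singular values."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    s = np.maximum(s - tau, 0.0)
    r = int((s > 0).sum())                 # rank after thresholding
    return (U[:, :r] * s[:r]) @ Vt[:r]     # explicit low-rank factorization

def stochastic_prox_grad(M, lam=0.01, T=400):
    """min_W sum_j ||(W - M) e_j||^2 + lam * ||W||_*  via column sampling."""
    n, m = M.shape
    eta = 1.0 / (2.0 * m)                  # step size matched to gradient scale
    W = np.zeros((n, m))
    for _ in range(T):
        j = rng.integers(m)                # sample one column index
        G = np.zeros((n, m))
        G[:, j] = 2.0 * m * (W[:, j] - M[:, j])  # unbiased stochastic gradient
        W = prox_nuclear(W - eta * G, eta * lam)
    return W

# Toy demo: recover a rank-1 matrix from column samples.
M = np.outer([1.0, 2, -1, 1, 0.5, -2], [1.0, -1, 2, 0.5, 1])
W = stochastic_prox_grad(M)
```

Because the prox step zeroes small singular values, each iterate admits an explicit low-rank factorization, which is what makes the linear per-iteration space cost claimed in the abstract plausible.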


Exploring Feature Definition and Selection for Sentiment Classifiers

AAAI Conferences

In this paper, we systematically explore feature definition and selection strategies for sentiment polarity classification. We begin with basic questions, such as whether to use stemming, term frequency versus binary weighting, negation-enriched features, and n-grams or phrases. We then move on to more complex aspects, including feature selection using frequency-based vocabulary trimming, part-of-speech and lexicon selection (three types of lexicons), as well as expected Mutual Information (MI). Using three product and movie review datasets of various sizes, we show, for example, that some techniques are more beneficial for larger datasets than for smaller ones. A classifier trained on only a few features ranked highly by MI outperformed one trained on all features on the large datasets, yet this did not hold on the small dataset. Finally, we perform a space and computation cost analysis to further understand the merits of various feature types.
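The MI-based ranking used for vocabulary trimming can be illustrated on toy data; the documents, labels, and helper names below are invented for the example and are not the paper's datasets.

```python
# Rank binary term-presence features by mutual information with the
# class label; toy sentiment data, illustrative only.
from collections import Counter
from math import log2

def mutual_information(term, docs, labels):
    """MI (in bits) between presence of `term` and the document label."""
    n = len(docs)
    joint = Counter((term in doc, lab) for doc, lab in zip(docs, labels))
    p_t = Counter(term in doc for doc in docs)
    p_c = Counter(labels)
    mi = 0.0
    for (t, c), cnt in joint.items():      # absent cells contribute 0
        p_tc = cnt / n
        mi += p_tc * log2(p_tc / ((p_t[t] / n) * (p_c[c] / n)))
    return mi

docs = [{"great", "plot"}, {"great", "acting"},
        {"boring", "plot"}, {"boring", "pacing"}]
labels = ["pos", "pos", "neg", "neg"]
vocab = sorted(set().union(*docs))
ranked = sorted(vocab, reverse=True,
                key=lambda t: mutual_information(t, docs, labels))
```

Keeping only the top-ranked terms before training mirrors the MI-based selection that the abstract compares against training on the full feature set.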