Plotting

 Dyer, Chris


Weakly-Supervised Grammar-Informed Bayesian CCG Parser Learning

AAAI Conferences

Combinatory Categorial Grammar (CCG) is a lexicalized grammar formalism in which words are associated with categories that, in combination with a small universal set of rules, specify the syntactic configurations in which they may occur. Categories are selected from a large, recursively-defined set; this leads to high word-to-category ambiguity, which is one of the primary factors that make learning CCG parsers difficult, especially in the face of little data. Previous work has shown that learning sequence models for CCG tagging can be improved by using linguistically-motivated prior probability distributions over potential categories. We extend this approach to the task of learning a CCG parser from weak supervision. We present a Bayesian formulation for CCG parser induction that assumes only supervision in the form of an incomplete tag dictionary mapping some word types to sets of potential categories. Our approach outperforms a baseline model trained with uniform priors by exploiting universal, intrinsic properties of the CCG formalism to bias the model toward simpler, more cross-linguistically common categories.


Conditional Random Field Autoencoders for Unsupervised Structured Prediction

Neural Information Processing Systems

We introduce a framework for unsupervised learning of structured predictors with overlapping, global features. Each input's latent representation is predicted conditional on the observed data using a feature-rich conditional random field (CRF). Then a reconstruction of the input is (re)generated, conditional on the latent structure, using a generative model which factorizes similarly to the CRF. The autoencoder formulation enables efficient exact inference without resorting to unrealistic independence assumptions or restricting the kinds of features that can be used. We illustrate insightful connections to traditional autoencoders, posterior regularization and multi-view learning. Finally, we show competitive results with instantiations of the framework for two canonical tasks in natural language processing: part-of-speech induction and bitext word alignment, and show that training our model can be substantially more efficient than comparable feature-rich baselines.


Learning Word Representations with Hierarchical Sparse Coding

arXiv.org Machine Learning

We propose a new method for learning word representations using hierarchical regularization in sparse coding inspired by the linguistic study of word meanings. We show an efficient learning algorithm based on stochastic proximal methods that is significantly faster than previous approaches, making it possible to perform hierarchical sparse coding on a corpus of billions of word tokens. Experiments on various benchmark tasks---word similarity ranking, analogies, sentence completion, and sentiment analysis---demonstrate that the method outperforms or is competitive with state-of-the-art methods. Our word representations are available at \url{http://www.ark.cs.cmu.edu/dyogatam/wordvecs/}.


Predicting the NFL using Twitter

arXiv.org Machine Learning

We study the relationship between social media output and National Football League (NFL) games, using a dataset containing messages from Twitter and NFL game statistics. Specifically, we consider tweets pertaining to specific teams and games in the NFL season and use them alongside statistical game data to build predictive models for future game outcomes (which team will win?) and sports betting outcomes (which team will win with the point spread? will the total points be over/under the line?). We experiment with several feature sets and find that simple features using large volumes of tweets can match or exceed the performance of more traditional features that use game statistics.