Learned in Translation: Contextualized Word Vectors

Neural Information Processing Systems

Computer vision has benefited from initializing multiple deep layers with weights pretrained on large supervised training sets like ImageNet. Natural language processing (NLP) typically sees initialization of only the lowest layer of deep models with pretrained word vectors. In this paper, we use a deep LSTM encoder from an attentional sequence-to-sequence model trained for machine translation (MT) to contextualize word vectors. We show that adding these context vectors (CoVe) improves performance over using only unsupervised word and character vectors on a wide variety of common NLP tasks: sentiment analysis (SST, IMDb), question classification (TREC), entailment (SNLI), and question answering (SQuAD). For fine-grained sentiment analysis and entailment, CoVe improves performance of our baseline models to the state of the art.
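As the abstract describes, downstream task models consume each word's pretrained vector concatenated with the context vector produced by the MT-trained LSTM encoder. A minimal sketch of that concatenation step, with hypothetical dimensions (300-d GloVe vectors, 600-d CoVe outputs) and random placeholders standing in for the real pretrained vectors:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sentence of 5 tokens: 300-d unsupervised word vectors (GloVe)
# and 600-d context vectors from the pretrained MT-LSTM encoder (CoVe).
glove = rng.normal(size=(5, 300))
cove = rng.normal(size=(5, 600))

# Task-specific models take the concatenation [GloVe(w); CoVe(w)] as input.
features = np.concatenate([glove, cove], axis=-1)
print(features.shape)  # (5, 900)
```

The actual encoder in the paper is a two-layer bidirectional LSTM trained for machine translation; only the feature-concatenation interface is shown here.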


Hyperbolic Neural Networks

Neural Information Processing Systems

Hyperbolic spaces have recently gained momentum in the context of machine learning due to their high capacity and tree-likeliness properties. However, the representational power of hyperbolic geometry is not yet on par with Euclidean geometry, mostly because of the absence of corresponding hyperbolic neural network layers.
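Hyperbolic network layers replace Euclidean vector addition with Möbius addition on the Poincaré ball. A minimal sketch of that operation (standard formula for the ball of curvature -c; not code from the paper):

```python
import numpy as np

def mobius_add(x, y, c=1.0):
    """Möbius addition on the Poincaré ball of curvature -c:

    x (+) y = ((1 + 2c<x,y> + c|y|^2) x + (1 - c|x|^2) y)
              / (1 + 2c<x,y> + c^2 |x|^2 |y|^2)
    """
    xy = np.dot(x, y)
    x2 = np.dot(x, x)
    y2 = np.dot(y, y)
    num = (1 + 2 * c * xy + c * y2) * x + (1 - c * x2) * y
    den = 1 + 2 * c * xy + c ** 2 * x2 * y2
    return num / den

# The origin acts as the identity: 0 (+) y = y,
# and each point has an inverse: x (+) (-x) = 0.
y = np.array([0.3, -0.1])
print(mobius_add(np.zeros(2), y))  # [ 0.3 -0.1]
```

Unlike Euclidean addition, this operation is neither commutative nor associative, which is why hyperbolic analogues of linear and recurrent layers need dedicated constructions.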




Supplementary for Symbol-LLM: Leverage Language Models for Symbolic System in Visual Human Activity Reasoning

Neural Information Processing Systems

Xiaoqian Wu, Shanghai Jiao Tong University, enlighten@sjtu.edu.cn

In Tab. 1, we summarize the notation used in this work for clarity: r denotes a rule; M the size of the premise symbol set; S the symbol set; R the rule set; A \ B the set difference of A and B; and D a very large-scale activity image database.



eccd2a86bae4728b38627162ba297828-AuthorFeedback.pdf

Neural Information Processing Systems

First, LCs and NBCs are extensively used in different settings, with NBCs being deemed by some as one of the top algorithms in data mining. Q3: We will cover the references mentioned by the reviewer; however, that is orthogonal to our work. If one fixes the linear model, we will compute a rigorous PI-explanation in log-linear time.