

Learned in Translation: Contextualized Word Vectors

Neural Information Processing Systems

Computer vision has benefited from initializing multiple deep layers with weights pretrained on large supervised training sets like ImageNet. Natural language processing (NLP) typically sees initialization of only the lowest layer of deep models with pretrained word vectors. In this paper, we use a deep LSTM encoder from an attentional sequence-to-sequence model trained for machine translation (MT) to contextualize word vectors. We show that adding these context vectors (CoVe) improves performance over using only unsupervised word and character vectors on a wide variety of common NLP tasks: sentiment analysis (SST, IMDb), question classification (TREC), entailment (SNLI), and question answering (SQuAD). For fine-grained sentiment analysis and entailment, CoVe improves performance of our baseline models to the state of the art.
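The core recipe the abstract describes is to run an MT-pretrained LSTM encoder over word vectors and feed the concatenation [GloVe(w); CoVe(w)] to the downstream task model. A minimal NumPy sketch of that data flow is below; the LSTM weights here are random stand-ins (the paper uses an encoder actually trained for translation over 300-d GloVe inputs), and all sizes and names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def lstm_step(x, h, c, W, U, b):
    """One LSTM step; gates stacked as [input, forget, output, cell]."""
    z = W @ x + U @ h + b
    d = h.shape[0]
    i = 1 / (1 + np.exp(-z[:d]))        # input gate
    f = 1 / (1 + np.exp(-z[d:2*d]))     # forget gate
    o = 1 / (1 + np.exp(-z[2*d:3*d]))   # output gate
    g = np.tanh(z[3*d:])                # candidate cell state
    c = f * c + i * g
    h = o * np.tanh(c)
    return h, c

def cove_features(glove_seq, W, U, b):
    """Run the (stand-in) MT-trained LSTM over a sequence of GloVe
    vectors and return the per-token concatenation [GloVe(w); CoVe(w)]."""
    d = b.shape[0] // 4
    h, c = np.zeros(d), np.zeros(d)
    out = []
    for x in glove_seq:
        h, c = lstm_step(x, h, c, W, U, b)
        out.append(np.concatenate([x, h]))
    return np.stack(out)

emb, hid, T = 8, 6, 5                  # toy sizes, not the paper's
W = rng.normal(size=(4 * hid, emb))    # random stand-in weights
U = rng.normal(size=(4 * hid, hid))
b = np.zeros(4 * hid)
glove_seq = rng.normal(size=(T, emb))  # stand-in for GloVe lookups
feats = cove_features(glove_seq, W, U, b)
print(feats.shape)                     # (5, 14): emb + hid per token
```

The downstream model never sees the raw hidden states alone; each token's feature vector keeps the original word vector alongside its context vector.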


Hyperbolic Neural Networks

Neural Information Processing Systems

Hyperbolic spaces have recently gained momentum in the context of machine learning due to their high capacity and tree-likeliness properties. However, the representational power of hyperbolic geometry is not yet on par with Euclidean geometry, mostly because of the absence of corresponding hyperbolic neural network layers.
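The basic building block for such layers is Möbius addition, the Poincaré-ball analogue of vector addition. A small sketch under the assumption of the standard ball of curvature -c (function name and test points are illustrative):

```python
import numpy as np

def mobius_add(x, y, c=1.0):
    """Mobius addition on the Poincare ball of curvature -c:
    ((1 + 2c<x,y> + c|y|^2) x + (1 - c|x|^2) y)
    / (1 + 2c<x,y> + c^2 |x|^2 |y|^2)."""
    xy = c * np.dot(x, y)
    x2 = c * np.dot(x, x)
    y2 = c * np.dot(y, y)
    num = (1 + 2 * xy + y2) * x + (1 - x2) * y
    den = 1 + 2 * xy + x2 * y2
    return num / den

x = np.array([0.1, 0.2])
y = np.array([0.3, -0.1])
zero = np.zeros(2)
print(mobius_add(x, zero))  # the origin is the identity element
```

Unlike Euclidean addition, this operation is neither commutative nor associative, which is why hyperbolic layers must be derived carefully rather than by symbol-for-symbol substitution.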




Supplementary for Symbol-LLM: Leverage Language Models for Symbolic System in Visual Human Activity Reasoning

Neural Information Processing Systems

Xiaoqian Wu, Shanghai Jiao Tong University (enlighten@sjtu.edu.cn). In Tab. 1, we summarize the notation used in this work for clarity: r denotes a rule; M the size of the premise symbol set; S the symbol set; R the rule set; A \ B the set difference of A and B; and D a very large-scale database of activity images.



eccd2a86bae4728b38627162ba297828-AuthorFeedback.pdf

Neural Information Processing Systems

First, LCs and NBCs are extensively used in different settings, with NBCs being deemed by some as one of the top algorithms in data mining. Q3: We will cover the references mentioned by the reviewer; however, that is orthogonal to our work. If one fixes the linear model, we will compute a rigorous PI-explanation in log-linear time.




Selective Generation for Controllable Language Models

Neural Information Processing Systems

Trustworthiness of generative language models (GLMs) is crucial in their deployment to critical decision-making systems. Hence, certified risk control methods such as selective prediction and conformal prediction have been applied to mitigate the hallucination problem in various supervised downstream tasks. However, the lack of an appropriate correctness metric hinders applying such principled methods to language generation tasks. In this paper, we circumvent this problem by leveraging the concept of textual entailment to evaluate the correctness of the generated sequence, and propose two selective generation algorithms which control the false discovery rate with respect to the textual entailment relation (FDR-E) with a theoretical guarantee: $\texttt{SGen}^{\texttt{Sup}}$ and $\texttt{SGen}^{\texttt{Semi}}$.
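The generic shape of such a selective generator is: score each generation with an entailment model, and calibrate an acceptance threshold so that the false-discovery rate among accepted outputs stays below a target level. The sketch below is an illustrative plug-in simplification with synthetic scores and labels; it does not reproduce the paper's $\texttt{SGen}$ algorithms or their finite-sample FDR-E guarantee, and all names are assumptions.

```python
import numpy as np

def calibrate_threshold(scores, entailed, alpha=0.2):
    """Pick the smallest score threshold tau such that, among
    calibration examples the selector would accept (score >= tau),
    the empirical fraction of non-entailed generations is <= alpha.
    Illustrative plug-in rule only, with no certified guarantee."""
    for tau in np.sort(scores):
        keep = scores >= tau
        if np.mean(~entailed[keep]) <= alpha:
            return tau
    return np.inf  # no feasible threshold: abstain on everything

rng = np.random.default_rng(0)
scores = rng.uniform(size=200)                 # synthetic entailment scores
entailed = rng.uniform(size=200) < scores      # higher score -> more likely correct
tau = calibrate_threshold(scores, entailed, alpha=0.2)
keep = scores >= tau
if keep.any():
    print(f"tau={tau:.3f}, empirical FDR={np.mean(~entailed[keep]):.3f}")
```

At deployment, generations scoring below tau are withheld (the model abstains), trading coverage for a bounded rate of non-entailed outputs.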