AITopics | Glass, James

Glass, James

Natural Language Embedded Programs for Hybrid Language Symbolic Reasoning

Zhang, Tianhua, Ge, Jiaxin, Luo, Hongyin, Chuang, Yung-Sung, Gao, Mingye, Gong, Yuan, Wu, Xixin, Kim, Yoon, Meng, Helen, Glass, James

arXiv.org Artificial Intelligence, Sep-19-2023

How can we perform computations over natural language representations to solve tasks that require symbolic and numeric reasoning? We propose natural language embedded programs (NLEP) as a unifying framework for addressing math/symbolic reasoning, natural language understanding, and instruction following tasks. Our approach prompts a language model to generate full Python programs that define functions over data structures which contain natural language representations of structured knowledge. A Python interpreter then executes the generated code and prints the output. Despite using a task-general prompt, we find that this approach can improve upon strong baselines across a range of different tasks including math and symbolic reasoning, text classification, question answering, and instruction following. We further find the generated programs are often interpretable and enable post-hoc verification of the intermediate reasoning steps.
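As a rough illustration of the program shape described in the abstract above, the following hand-written sketch stores knowledge as natural language strings in a data structure, defines functions over it, and prints the result. It was not produced by a language model, and the records, function names, and question are hypothetical.

```python
# Illustrative only: a hand-written program in the style the abstract describes.
# The knowledge entries and the question are made up for this sketch.

# Step 1: structured knowledge, with natural language values.
people = {
    "Alice": {"born": 1990, "occupation": "a physicist"},
    "Bob": {"born": 1985, "occupation": "a novelist"},
    "Carol": {"born": 1998, "occupation": "a chess player"},
}


# Step 2: a function that reasons symbolically over that structure.
def youngest(records):
    """Return the name with the latest birth year."""
    return max(records, key=lambda name: records[name]["born"])


# Step 3: compose a natural language answer from the computed result.
def answer(records):
    name = youngest(records)
    info = records[name]
    return f"{name}, {info['occupation']}, is the youngest (born {info['born']})."


print(answer(people))
```

Because the intermediate result (`youngest`) is an ordinary function call, it can be inspected on its own, which is the kind of post-hoc verification the abstract mentions.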

large language model, machine learning, natural language, (20 more...)

2309.10814

Country: North America > United States > Massachusetts (0.14)

Genre: Research Report > New Finding (0.46)

Industry:

Leisure & Entertainment (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Banking & Finance (0.93)
Media > Film (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.33)

DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models

Chuang, Yung-Sung, Xie, Yujia, Luo, Hongyin, Kim, Yoon, Glass, James, He, Pengcheng

arXiv.org Artificial Intelligence, Sep-7-2023

Despite their impressive capabilities, large language models (LLMs) are prone to hallucinations, i.e., generating content that deviates from facts seen during pretraining. We propose a simple decoding strategy for reducing hallucinations with pretrained LLMs that requires neither conditioning on retrieved external knowledge nor additional fine-tuning. Our approach obtains the next-token distribution by contrasting the differences in logits obtained from projecting the later layers versus earlier layers to the vocabulary space, exploiting the fact that factual knowledge in an LLM has generally been shown to be localized to particular transformer layers. We find that this Decoding by Contrasting Layers (DoLa) approach is able to better surface factual knowledge and reduce the generation of incorrect facts. DoLa consistently improves truthfulness across multiple-choice tasks and open-ended generation tasks, for example improving the performance of LLaMA family models on TruthfulQA by 12-17 absolute percentage points, demonstrating its potential in making LLMs reliably generate truthful facts.
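The core contrast can be sketched in a few lines, assuming we already have next-token logits from one later layer and one earlier layer. The abstract does not say how layers are chosen or whether candidates are filtered, so this sketch leaves both out; the vocabulary and logit values are invented for illustration.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def contrast_layers(late_logits, early_logits):
    """Score each token by log p_late(token) - log p_early(token)."""
    late = log_softmax(late_logits)
    early = log_softmax(early_logits)
    return [l - e for l, e in zip(late, early)]


# Hypothetical three-token vocabulary and made-up logits.
vocab = ["Seattle", "Olympia", "the"]
early = [2.0, 0.5, 2.0]  # earlier layer: favors frequent or plausible-sounding tokens
late = [2.2, 2.0, 0.1]   # later layer: "Olympia" has gained the most probability

scores = contrast_layers(late, early)
best = vocab[max(range(len(vocab)), key=scores.__getitem__)]
print(best)
```

In this toy case the later layer alone would still pick "Seattle", while the contrast picks the token whose log-probability rose most between the two layers.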

large language model, machine learning, natural language, (20 more...)

2309.03883

Country:

Asia (0.67)
North America > United States > Massachusetts (0.14)

Genre: Research Report > New Finding (0.46)

Industry: Energy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

SAIL: Search-Augmented Instruction Learning

Luo, Hongyin, Chuang, Yung-Sung, Gong, Yuan, Zhang, Tianhua, Kim, Yoon, Wu, Xixin, Fox, Danny, Meng, Helen, Glass, James

arXiv.org Artificial Intelligence, Jun-25-2023

Large language models (LLMs) have been significantly improved by instruction fine-tuning, but still lack transparency and the ability to utilize up-to-date knowledge and information. In this work, we propose search-augmented instruction learning (SAIL), which grounds the language generation and instruction following abilities on complex search results generated by in-house and external search engines. With an instruction tuning corpus, we collect search results for each training case from different search APIs and domains, and construct a new search-grounded training set containing (instruction, grounding information, response) triplets. We then fine-tune the LLaMA-7B model on the constructed training set. Since the collected results contain unrelated and disputing languages, the model needs to learn to ground on trustworthy search results, filter out distracting passages, and generate the target response. The search result-denoising process entails explicit trustworthy information selection and multi-hop reasoning, since the retrieved passages might be informative but not contain the instruction-following answer. Experiments show that the fine-tuned SAIL-7B model has a strong instruction-following ability, and it performs significantly better on transparency-sensitive tasks, including open-ended question answering and fact checking.
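One way to picture the (instruction, grounding information, response) triplets is as a small record that is flattened into a single training string. The field names and the text template below are assumptions made for this sketch, not the format used by the authors.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GroundedExample:
    """One search-grounded training case (field names are hypothetical)."""
    instruction: str
    grounding: List[str]  # retrieved passages, which may include distractors
    response: str


def to_training_text(example: GroundedExample) -> str:
    """Flatten a triplet into one string using an assumed template."""
    passages = "\n".join(
        f"[{i + 1}] {passage}" for i, passage in enumerate(example.grounding)
    )
    return (
        f"Search results:\n{passages}\n\n"
        f"Instruction: {example.instruction}\n\n"
        f"Response: {example.response}"
    )


example = GroundedExample(
    instruction="Who wrote Pride and Prejudice?",
    grounding=[
        "Pride and Prejudice is an 1813 novel by Jane Austen.",
        "Emma is a novel about youthful hubris.",  # distracting passage
    ],
    response="Jane Austen wrote Pride and Prejudice.",
)
print(to_training_text(example))
```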

information retrieval, machine learning, natural language, (19 more...)

2305.15225

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Revealing the Blind Spot of Sentence Encoder Evaluation by HEROS

Chiang, Cheng-Han, Chuang, Yung-Sung, Glass, James, Lee, Hung-yi

arXiv.org Artificial Intelligence, Jun-13-2023

Existing sentence textual similarity benchmark datasets only use a single number to summarize how similar the sentence encoder's decision is to humans'. However, it is unclear what kind of sentence pairs a sentence encoder (SE) would consider similar. Moreover, existing SE benchmarks mainly consider sentence pairs with low lexical overlap, so it is unclear how SEs behave when two sentences have high lexical overlap. We introduce a high-quality SE diagnostic dataset, HEROS. HEROS is constructed by transforming an original sentence into a new sentence based on certain rules to form a minimal pair with high lexical overlap. The rules include replacing a word with a synonym, an antonym, a typo, or a random word, and converting the original sentence into its negation. Different rules yield different subsets of HEROS. By systematically comparing the performance of over 60 supervised and unsupervised SEs on HEROS, we reveal that most unsupervised sentence encoders are insensitive to negation. We find that the datasets used to train an SE are the main determinants of what kind of sentence pairs it considers similar. We also show that even if two SEs have similar performance on STS benchmarks, they can have very different behavior on HEROS. Our results reveal the blind spot of traditional STS benchmarks when evaluating SEs.
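The rule-based construction of minimal pairs can be illustrated with a toy word-replacement helper. The sentence, the substitutions, and the helper name are invented for this sketch; the abstract does not describe how the dataset's substitutions were actually chosen.

```python
def replace_word(sentence, target, substitute):
    """Replace every exact occurrence of `target` (whitespace-tokenized)."""
    return " ".join(substitute if w == target else w for w in sentence.split())


original = "The movie was good"

# One hypothetical minimal pair per rule named in the abstract.
minimal_pairs = {
    "synonym": replace_word(original, "good", "great"),
    "antonym": replace_word(original, "good", "bad"),
    "typo": replace_word(original, "good", "godo"),
    "random word": replace_word(original, "good", "purple"),
    "negation": replace_word(original, "was", "was not"),
}

for rule, sentence in minimal_pairs.items():
    print(f"{rule}: {original!r} -> {sentence!r}")
```

Each pair differs from the original by a single edit, so any difference in an encoder's similarity score can be attributed to that rule.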

artificial intelligence, machine learning, natural language, (17 more...)

2306.05083

Country:

Europe (1.00)
Asia (1.00)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Comparison of Multilingual Self-Supervised and Weakly-Supervised Speech Pre-Training for Adaptation to Unseen Languages

Rouditchenko, Andrew, Khurana, Sameer, Thomas, Samuel, Feris, Rogerio, Karlinsky, Leonid, Kuehne, Hilde, Harwath, David, Kingsbury, Brian, Glass, James

arXiv.org Artificial Intelligence, May-30-2023

Recent models such as XLS-R and Whisper have made multilingual speech technologies more accessible by pre-training on audio from around 100 spoken languages each. However, there are thousands of spoken languages worldwide, and adapting to new languages is an important problem. In this work, we aim to understand which model adapts better to languages unseen during pre-training. We fine-tune both models on 13 unseen languages and 18 seen languages. Our results show that the number of hours seen per language and language family during pre-training is predictive of how the models compare, despite the significant differences in the pre-training methods.

artificial intelligence, machine learning, unseen language, (16 more...)

2305.12606

Country:

Europe (0.46)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report > New Finding (0.86)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.69)

Expand, Rerank, and Retrieve: Query Reranking for Open-Domain Question Answering

Chuang, Yung-Sung, Fang, Wei, Li, Shang-Wen, Yih, Wen-tau, Glass, James

arXiv.org Artificial Intelligence, May-26-2023

We propose EAR, a query Expansion And Reranking approach for improving passage retrieval, with application to open-domain question answering. EAR first applies a query expansion model to generate a diverse set of queries, and then uses a query reranker to select the ones that could lead to better retrieval results. Motivated by the observation that the best query expansion is often not picked by greedy decoding, EAR trains its reranker to predict the rank orders of the gold passages when issuing the expanded queries to a given retriever. By better connecting the query expansion model and the retriever, EAR significantly enhances a traditional sparse retrieval method, BM25. Empirically, EAR improves top-5/20 accuracy by 3-8 and 5-10 points in in-domain and out-of-domain settings, respectively, when compared to a vanilla query expansion model, GAR, and a dense retrieval model, DPR.
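The expand, rerank, then retrieve flow can be sketched with stand-ins for each component. Here the expanded queries are hard-coded instead of generated, the reranker is a lookup table of hypothetical predicted gold-passage ranks, and the retriever is a plain word-overlap scorer standing in for BM25.

```python
def retrieve(query, passages, k=1):
    """Toy sparse retriever: rank passages by word overlap with the query."""
    q = set(query.lower().split())
    ranked = sorted(passages, key=lambda p: -len(q & set(p.lower().split())))
    return ranked[:k]


def rerank(queries, predicted_gold_rank):
    """Pick the query whose predicted gold-passage rank is best (lowest)."""
    return min(queries, key=lambda query: predicted_gold_rank[query])


passages = [
    "Paris is the capital of France",
    "Berlin is the capital of Germany",
    "France borders Spain",
]

question = "capital of france"
# Stand-in for the query expansion model: two candidate expansions.
expanded = [question + " berlin germany", question + " paris"]
# Stand-in for the trained reranker: hypothetical predicted ranks.
predicted_gold_rank = {expanded[0]: 2, expanded[1]: 1}

best_query = rerank(expanded, predicted_gold_rank)
print(retrieve(best_query, passages))
```

With the first expansion the toy retriever would return the Berlin passage instead, which shows why selecting among expansions matters for the final retrieval.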

artificial intelligence, information retrieval query processing, natural language, (16 more...)

2305.1708

Country:

Europe > United Kingdom > England (0.28)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report (0.82)

Industry:

Media (0.93)
Leisure & Entertainment (0.93)

Technology: Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (1.00)

Entailment as Robust Self-Learner

Ge, Jiaxin, Luo, Hongyin, Kim, Yoon, Glass, James

arXiv.org Artificial Intelligence, May-26-2023

Entailment has been recognized as an important metric for evaluating natural language understanding (NLU) models, and recent studies have found that entailment pretraining benefits weakly supervised fine-tuning. In this work, we first design a prompting strategy that formulates a number of different NLU tasks as contextual entailment. This approach improves the zero-shot adaptation of pretrained entailment models. Second, we observe that self-training entailment-based models with unlabeled data can significantly improve the adaptation performance on downstream tasks. To achieve more stable improvement, we propose the Simple Pseudo-Label Editing (SimPLE) algorithm for better pseudo-labeling quality in self-training. We also find that both pretrained entailment-based models and the self-trained models are robust against adversarial evaluation data. Experiments on binary and multi-class classification tasks show that SimPLE leads to more robust self-training results, indicating that the self-trained entailment models are more efficient and trustworthy than large language models on language understanding tasks.

artificial intelligence, machine learning, natural language, (16 more...)

2305.17197

Country:

Asia > China (0.28)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

On the Blind Spots of Model-Based Evaluation Metrics for Text Generation

He, Tianxing, Zhang, Jingyu, Wang, Tianle, Kumar, Sachin, Cho, Kyunghyun, Glass, James, Tsvetkov, Yulia

arXiv.org Artificial Intelligence, May-18-2023

In this work, we explore a useful but often neglected methodology for robustness analysis of text generation evaluation metrics: stress tests with synthetic data. Basically, we design and synthesize a wide range of potential errors and check whether they result in a commensurate drop in the metric scores. We examine a range of recently proposed evaluation metrics based on pretrained language models, for the tasks of open-ended generation, translation, and summarization. Our experiments reveal interesting insensitivities, biases, or even loopholes in existing metrics. For example, we find that BERTScore is confused by truncation errors in summarization, and MAUVE (built on top of GPT-2) is insensitive to errors at the beginning or middle of generations. Further, we investigate the reasons behind these blind spots and suggest practical workarounds for a more reliable evaluation of text generation. We have released our code and data at https://github.com/cloudygoose/blindspot_nlg.
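A stress test of this kind can be sketched as: synthesize an error, score the clean and perturbed text, and check whether the score drops. The metric below is a toy unigram-precision score chosen for illustration, not BERTScore or MAUVE, and the example sentence is invented.

```python
def unigram_precision(candidate, reference):
    """Toy metric: fraction of candidate words that appear in the reference."""
    cand = candidate.lower().split()
    ref = set(reference.lower().split())
    return sum(w in ref for w in cand) / len(cand) if cand else 0.0


def truncate(text, keep=0.5):
    """Synthetic error: keep only the first `keep` fraction of the words."""
    words = text.split()
    return " ".join(words[: max(1, int(len(words) * keep))])


reference = "the council approved the budget and postponed the vote on zoning"
clean = reference
perturbed = truncate(reference)

drop = unigram_precision(clean, reference) - unigram_precision(perturbed, reference)
print(f"perturbed: {perturbed!r}")
print(f"score drop after truncation: {drop}")
```

The drop is zero even though half of the content is missing, so this toy metric has a truncation blind spot; the same check can be run against any metric by swapping out the scoring function.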

artificial intelligence, hypothesis, natural language, (18 more...)

2212.1002

Country:

Europe (0.93)
North America > United States > Minnesota (0.28)

Genre: Research Report > New Finding (0.46)

Industry:

Government (1.00)
Health & Medicine > Therapeutic Area (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.51)
Information Technology > Artificial Intelligence > Natural Language > Generation (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.37)

Self-supervised Fine-tuning for Improved Content Representations by Speaker-invariant Clustering

Chang, Heng-Jui, Liu, Alexander H., Glass, James

arXiv.org Artificial Intelligence, May-18-2023

Self-supervised speech representation models have succeeded in various tasks, but improving them for content-related problems using unlabeled data is challenging. We propose speaker-invariant clustering (Spin), a novel self-supervised learning method that clusters speech representations and performs swapped prediction between the original and speaker-perturbed utterances. Spin disentangles speaker information and preserves content representations with just 45 minutes of fine-tuning on a single GPU. Spin improves pre-trained networks and outperforms prior methods in speech recognition and acoustic unit discovery.

artificial intelligence, machine learning, representation, (15 more...)

2305.11072

Genre: Research Report > Experimental Study (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)
(2 more...)

C2KD: Cross-Lingual Cross-Modal Knowledge Distillation for Multilingual Text-Video Retrieval

Rouditchenko, Andrew, Chuang, Yung-Sung, Shvetsova, Nina, Thomas, Samuel, Feris, Rogerio, Kingsbury, Brian, Karlinsky, Leonid, Harwath, David, Kuehne, Hilde, Glass, James

arXiv.org Artificial Intelligence, May-9-2023

Multilingual text-video retrieval methods have improved significantly in recent years, but the performance for other languages lags behind English. We propose a Cross-Lingual Cross-Modal Knowledge Distillation method to improve multilingual text-video retrieval. Inspired by the fact that English text-video retrieval outperforms other languages, we train a student model using input text in different languages to match the cross-modal predictions from teacher models using input text in English. We propose a cross-entropy-based objective that forces the distribution over the student's text-video similarity scores to be similar to that of the teacher models. We introduce a new multilingual video dataset, Multi-YouCook2, by translating the English captions in the YouCook2 video dataset into 8 other languages. Our method improves multilingual text-video retrieval performance on Multi-YouCook2 and several other datasets such as Multi-MSRVTT and VATEX. We also conduct an analysis of the effectiveness of different multilingual text models as teachers. The code, models, and dataset are available at https://github.com/roudimit/c2kd.
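A cross-entropy objective of the kind described above can be sketched as follows, assuming both models produce one similarity score per candidate video for a given caption. How the scores are normalized and how several teachers are combined is not stated in the abstract, so the plain softmax and the single teacher here are assumptions.

```python
import math


def softmax(scores):
    """Turn similarity scores into a probability distribution over videos."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_scores, teacher_scores):
    """Cross entropy between teacher and student distributions over videos."""
    p_teacher = softmax(teacher_scores)
    p_student = softmax(student_scores)
    return -sum(t * math.log(s) for t, s in zip(p_teacher, p_student))


# Made-up similarity scores of one caption against three candidate videos.
teacher_english = [4.0, 1.0, 0.5]    # teacher sees the English caption
student_aligned = [3.8, 1.1, 0.4]    # student (other language), close to teacher
student_confused = [1.0, 3.5, 0.5]   # student prefers the wrong video

print(distillation_loss(student_aligned, teacher_english))
print(distillation_loss(student_confused, teacher_english))
```

The loss is smaller when the student's distribution over videos resembles the teacher's, so minimizing it pulls the non-English predictions toward the English ones.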

artificial intelligence, machine learning, natural language, (17 more...)

2210.03625

Country: North America > United States > Minnesota (0.28)

Genre: Research Report (0.82)

Industry: Education > Educational Technology (0.49)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Vision (0.68)
Information Technology > Communications (0.68)
