AITopics | Yin, Kayo

Which Attention Heads Matter for In-Context Learning?

Yin, Kayo, Steinhardt, Jacob

arXiv.org Artificial Intelligence, Feb-19-2025

Large language models (LLMs) exhibit impressive in-context learning (ICL) capability, enabling them to perform new tasks using only a few demonstrations in the prompt. Two different mechanisms have been proposed to explain ICL: induction heads that find and copy relevant tokens, and function vector (FV) heads whose activations compute a latent encoding of the ICL task. To better understand which of the two distinct mechanisms drives ICL, we study and compare induction heads and FV heads in 12 language models. Through detailed ablations, we discover that few-shot ICL performance depends primarily on FV heads, especially in larger models. In addition, we uncover that FV and induction heads are connected: many FV heads start as induction heads during training before transitioning to the FV mechanism. This leads us to speculate that induction facilitates learning the more complex FV mechanism that ultimately drives ICL.

large language model, machine learning, natural language

arXiv.org Artificial Intelligence

2502.14010

Country: North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Using Language Models to Disambiguate Lexical Choices in Translation

Barua, Josh, Subramanian, Sanjay, Yin, Kayo, Suhr, Alane

arXiv.org Artificial Intelligence, Nov-8-2024

In translation, a concept represented by a single word in a source language can have multiple variations in a target language. The task of lexical selection requires using context to identify which variation is most appropriate for a source text. We work with native speakers of nine languages to create DTAiLS, a dataset of 1,377 sentence pairs that exhibit cross-lingual concept variation when translating from English. We evaluate recent LLMs and neural machine translation systems on DTAiLS, with the best-performing model, GPT-4, achieving 67% to 85% accuracy across languages. Finally, we use language models to generate English rules describing target-language concept variations. Providing weaker models with high-quality lexical rules improves accuracy substantially, in some cases reaching or outperforming GPT-4.

large language model, machine learning, variation

arXiv.org Artificial Intelligence

2411.05781

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.91)

ASL STEM Wiki: Dataset and Benchmark for Interpreting STEM Articles

Yin, Kayo, Singh, Chinmay, Minakov, Fyodor O., Milan, Vanessa, Daumé, Hal III, Zhang, Cyril, Lu, Alex X., Bragg, Danielle

arXiv.org Artificial Intelligence, Nov-8-2024

Deaf and hard-of-hearing (DHH) students face significant barriers in accessing science, technology, engineering, and mathematics (STEM) education, notably due to the scarcity of STEM resources in signed languages. To help address this, we introduce ASL STEM Wiki: a parallel corpus of 254 Wikipedia articles on STEM topics in English, interpreted into over 300 hours of American Sign Language (ASL). ASL STEM Wiki is the first continuous signing dataset focused on STEM, facilitating the development of AI resources for STEM education in ASL. We identify several use cases of ASL STEM Wiki with human-centered applications. For example, because this dataset highlights the frequent use of fingerspelling for technical concepts, which inhibits DHH students' ability to learn, we develop models to identify fingerspelled words, which can later be used to query for appropriate ASL signs to suggest to interpreters.

Figure 1: One use case of ASL STEM Wiki is automatic sign suggestion. Given an English sentence and a video of its ASL interpretation, the model detects all clips of ASL that contain fingerspelling (FS). Then, given the detected FS clip and the English sentence, the model identifies which English phrase in the sentence is fingerspelled in the clip.

artificial intelligence, machine learning, natural language

arXiv.org Artificial Intelligence

2411.05783

Country: North America > United States (1.00)

Genre: Research Report (1.00)

Industry: Education > Curriculum > Subject-Specific Education (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Communications > Collaboration (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

American Sign Language Handshapes Reflect Pressures for Communicative Efficiency

Yin, Kayo, Regier, Terry, Klein, Dan

arXiv.org Artificial Intelligence, Jun-10-2024

Communicative efficiency is a key topic in linguistics and cognitive psychology, with many studies demonstrating how the pressure to communicate with minimal effort guides the form of natural language. However, this phenomenon is rarely explored in signed languages. This paper shows how handshapes in American Sign Language (ASL) reflect these efficiency pressures and provides new evidence of communicative efficiency in the visual-gestural modality. We focus on hand configurations in native ASL signs and signs borrowed from English to compare efficiency pressures from both ASL and English usage. First, we develop new methodologies to quantify the articulatory effort needed to produce handshapes and the perceptual effort required to recognize them. Then, we analyze correlations between communicative effort and usage statistics in ASL or English. Our findings reveal that frequent ASL handshapes are easier to produce and that pressures for communicative efficiency mostly come from ASL usage, rather than from English lexical borrowing.

handshape, natural language, simulation of human behavior

arXiv.org Artificial Intelligence

2406.04024

Country:

North America > United States (0.14)
Asia (0.14)

Genre: Research Report > New Finding (1.00)

Industry:

Education > Curriculum > Subject-Specific Education (0.64)
Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.88)
Information Technology > Artificial Intelligence > Cognitive Science > Simulation of Human Behavior (0.34)

When Does Translation Require Context? A Data-driven, Multilingual Exploration

Fernandes, Patrick, Yin, Kayo, Liu, Emmy, Martins, André F. T., Neubig, Graham

arXiv.org Artificial Intelligence, Jun-27-2023

Although proper handling of discourse significantly contributes to the quality of machine translation (MT), these improvements are not adequately measured in common translation quality metrics. Recent works in context-aware MT attempt to target a small set of discourse phenomena during evaluation, but not in a fully systematic way. In this paper, we develop the Multilingual Discourse-Aware (MuDA) benchmark, a series of taggers that identify and evaluate model performance on discourse phenomena in any given dataset. The choice of phenomena is inspired by a novel methodology to systematically identify translations requiring context. We confirm the difficulty of previously studied phenomena while uncovering others that were previously unaddressed. We find that common context-aware MT models make only marginal improvements over context-agnostic models, which suggests these models do not handle these ambiguities effectively. We release code and data for 14 language pairs to encourage the MT community to focus on accurately capturing discourse phenomena.

artificial intelligence, machine translation, natural language

arXiv.org Artificial Intelligence

2109.07446

Country:

North America > United States > California (0.28)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Portugal > Lisbon > Lisbon (0.14)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Do Context-Aware Translation Models Pay the Right Attention?

Yin, Kayo, Fernandes, Patrick, Pruthi, Danish, Chaudhary, Aditi, Martins, André F. T., Neubig, Graham

arXiv.org Artificial Intelligence, May-21-2021

Context-aware machine translation models are designed to leverage contextual information, but often fail to do so. As a result, they inaccurately disambiguate pronouns and polysemous words that require context for resolution. In this paper, we ask several questions: What contexts do human translators use to resolve ambiguous words? Are models paying large amounts of attention to the same context? What if we explicitly train them to do so? To answer these questions, we introduce SCAT (Supporting Context for Ambiguous Translations), a new English-French dataset comprising supporting context words for 14K translations that professional translators found useful for pronoun disambiguation. Using SCAT, we perform an in-depth analysis of the context used to disambiguate, examining positional and lexical characteristics of the supporting words. Furthermore, we measure the degree of alignment between the model's attention scores and the supporting context from SCAT, and apply a guided attention strategy to encourage agreement between the two.

artificial intelligence, computational linguistics, machine translation

arXiv.org Artificial Intelligence

2105.06977

Country:

Asia (0.93)
Europe > Portugal > Lisbon > Lisbon (0.14)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre:

Research Report > Experimental Study (0.68)
Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Including Signed Languages in Natural Language Processing

Yin, Kayo, Moryossef, Amit, Hochgesang, Julie, Goldberg, Yoav, Alikhani, Malihe

arXiv.org Artificial Intelligence, May-11-2021

Signed languages are the primary means of communication for many deaf and hard of hearing individuals. Since signed languages exhibit all the fundamental linguistic properties of natural language, we believe that tools and theories of Natural Language Processing (NLP) are crucial to modeling them. However, existing research in Sign Language Processing (SLP) seldom attempts to explore and leverage the linguistic organization of signed languages. This position paper calls on the NLP community to include signed languages as a research area with high social and scientific impact. We first discuss the linguistic properties of signed languages to consider during their modeling. Then, we review the limitations of current SLP models and identify the open challenges to extend NLP to signed languages. Finally, we urge (1) the adoption of an efficient tokenization method; (2) the development of linguistically-informed models; (3) the collection of real-world signed language data; and (4) the inclusion of local signed language communities as an active and leading voice in the direction of research.

language learning, otolaryngology, signed language

arXiv.org Artificial Intelligence

2105.05222

Country:

Europe > United Kingdom > England (0.14)
North America > United States > Louisiana (0.14)

Genre:

Research Report (0.50)
Overview (0.34)

Industry:

Education > Curriculum > Subject-Specific Education (0.46)
Health & Medicine > Therapeutic Area > Otolaryngology (0.34)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.94)
