AITopics | linguistic generalization

Collaborating Authors

linguistic generalization

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Can Language Models Induce Grammatical Knowledge from Indirect Evidence?

Oba, Miyu, Oseki, Yohei, Fukatsu, Akiyo, Haga, Akari, Ouchi, Hiroki, Watanabe, Taro, Sugawara, Saku

arXiv.org Artificial IntelligenceOct-23-2024

What kinds of and how much data is necessary for language models to induce grammatical knowledge to judge sentence acceptability? Recent language models still have much room for improvement in their data efficiency compared to humans. This paper investigates whether language models efficiently use indirect data (indirect evidence), from which they infer sentence acceptability. In contrast, humans use indirect evidence efficiently, which is considered one of the inductive biases contributing to efficient language acquisition. To explore this question, we introduce the Wug InDirect Evidence Test (WIDET), a dataset consisting of training instances inserted into the pre-training data and evaluation instances. We inject synthetic instances with newly coined wug words into pretraining data and explore the model's behavior on evaluation data that assesses grammatical acceptability regarding those words. We prepare the injected instances by varying their levels of indirectness and quantity. Our experiments surprisingly show that language models do not induce grammatical knowledge even after repeated exposure to instances with the same structure but differing only in lexical items from evaluation instances in certain language phenomena. Our findings suggest a potential direction for future research: developing models that use latent indirect evidence to induce grammatical knowledge.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2410.06022

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
(4 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.57)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

How Abstract Is Linguistic Generalization in Large Language Models? Experiments with Argument Structure

Wilson, Michael, Petty, Jackson, Frank, Robert

arXiv.org Artificial IntelligenceNov-8-2023

Language models are typically evaluated on their success at predicting the distribution of specific words in specific contexts. Yet linguistic knowledge also encodes relationships between contexts, allowing inferences between word distributions. We investigate the degree to which pre-trained Transformer-based large language models (LLMs) represent such relationships, focusing on the domain of argument structure. We find that LLMs perform well in generalizing the distribution of a novel noun argument between related contexts that were seen during pre-training (e.g., the active object and passive subject of the verb spray), succeeding by making use of the semantically-organized structure of the embedding space for word embeddings. However, LLMs fail at generalizations between related contexts that have not been observed during pre-training, but which instantiate more abstract, but well-attested structural generalizations (e.g., between the active object and passive subject of an arbitrary verb). Instead, in this case, LLMs show a bias to generalize based on linear order. This finding points to a limitation with current models and points to a reason for which their training is data-intensive.s reported here are available at https://github.com/clay-lab/structural-alternations.

argument structure, language model, linguistic generalization, (1 more...)

arXiv.org Artificial Intelligence

2311.049

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Second Language Acquisition of Neural Language Models

Oba, Miyu, Kuribayashi, Tatsuki, Ouchi, Hiroki, Watanabe, Taro

arXiv.org Artificial IntelligenceJun-5-2023

With the success of neural language models (LMs), their language acquisition has gained much attention. This work sheds light on the second language (L2) acquisition of LMs, while previous work has typically explored their first language (L1) acquisition. Specifically, we trained bilingual LMs with a scenario similar to human L2 acquisition and analyzed their cross-lingual transfer from linguistic perspectives. Our exploratory experiments demonstrated that the L1 pretraining accelerated their linguistic generalization in L2, and language transfer configurations (e.g., the L1 choice, and presence of parallel texts) substantially affected their generalizations. These clarify their (non-)human-like L2 acquisition in particular aspects.

acquisition, computational linguistic, lms, (14 more...)

arXiv.org Artificial Intelligence

2306.0292

Country:

Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
Asia > Japan > Honshū > Tōhoku (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)
(14 more...)

Genre: Research Report > New Finding (0.88)

Industry: Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.68)

Add feedback

Does Vision Accelerate Hierarchical Generalization of Neural Language Learners?

Kuribayashi, Tatsuki

arXiv.org Artificial IntelligenceFeb-1-2023

Neural language models (LMs) are arguably less data-efficient than humans -- why does this gap occur? In this study, we hypothesize that this gap stems from the learners' accessibility to modalities other than text, specifically, vision. We conducted two complementary experiments (using noisy, realistic data and a simplified, artificial one) toward the advantage of vision in the syntactic generalization of LMs. Our results showed that vision accelerated a proper linguistic generalization in the simplified, artificial setting, but LMs struggled with the noisy, realistic setting. These mixed results indicate several possibilities, e.g., vision can potentially boost language acquisition, but learners' additional visual/linguistic prior knowledge should be needed to robustly make use of raw images for efficient language acquisition.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2302.00667

Country:

Asia > Japan > Honshū > Tōhoku (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States > Massachusetts (0.04)
(2 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback