AITopics | example sentence

Collaborating Authors

example sentence

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Phase transition on a context-sensitive random language model with short range interactions

Toji, Yuma, Takahashi, Jun, Roychowdhury, Vwani, Miyahara, Hideyuki

arXiv.org Machine LearningApr-2-2026

Since the random language model was proposed by E. DeGiuli [Phys. Rev. Lett. 122, 128301], language models have been investigated intensively from the viewpoint of statistical mechanics. Recently, the existence of a Berezinskii--Kosterlitz--Thouless transition was numerically demonstrated in models with long-range interactions between symbols. In statistical mechanics, it has long been known that long-range interactions can induce phase transitions. Therefore, it has remained unclear whether phase transitions observed in language models originate from genuinely linguistic properties that are absent in conventional spin models. In this study, we construct a random language model with short-range interactions and numerically investigate its statistical properties. Our model belongs to the class of context-sensitive grammars in the Chomsky hierarchy and allows explicit reference to contexts. We find that a phase transition occurs even when the model refers only to contexts whose length remains constant with respect to the sentence length. This result indicates that finite-temperature phase transitions in language models are genuinely induced by the intrinsic nature of language, rather than by long-range interactions.

artificial intelligence, natural language, transition, (19 more...)

arXiv.org Machine Learning

2604.00947

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.28)
North America > United States > New York (0.04)
North America > United States > Massachusetts > Middlesex County > Reading (0.04)
(3 more...)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.34)

Add feedback

Do Large Language Models Grasp The Grammar? Evidence from Grammar-Book-Guided Probing in Luxembourgish

Li, Lujun, Song, Yewei, Sleem, Lama, Wang, Yiqun, Xu, Yangjie, Lothritz, Cedric, Gentile, Niccolo, State, Radu, Bissyande, Tegawende F., Klein, Jacques

arXiv.org Artificial IntelligenceOct-30-2025

Grammar refers to the system of rules that governs the structural organization and the semantic relations among linguistic units such as sentences, phrases, and words within a given language. In natural language processing, there remains a notable scarcity of grammar focused evaluation protocols, a gap that is even more pronounced for low-resource languages. Moreover, the extent to which large language models genuinely comprehend grammatical structure, especially the mapping between syntactic structures and meanings, remains under debate. To investigate this issue, we propose a Grammar Book Guided evaluation pipeline intended to provide a systematic and generalizable framework for grammar evaluation consisting of four key stages, and in this work we take Luxembourgish as a case study. The results show a weak positive correlation between translation performance and grammatical understanding, indicating that strong translations do not necessarily imply deep grammatical competence. Larger models perform well overall due to their semantic strength but remain weak in morphology and syntax, struggling particularly with Minimal Pair tasks, while strong reasoning ability offers a promising way to enhance their grammatical understanding.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2510.24856

Country:

North America > United States (0.46)
Europe > Austria (0.28)

Genre: Research Report > New Finding (0.66)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Add feedback

Are Large Language Models Chronically Online Surfers? A Dataset for Chinese Internet Meme Explanation

Xie, Yubo, Wang, Chenkai, Ma, Zongyang, Miao, Fahui

arXiv.org Artificial IntelligenceOct-2-2025

Large language models (LLMs) are trained on vast amounts of text from the Internet, but do they truly understand the viral content that rapidly spreads online -- commonly known as memes? In this paper, we introduce CHIME, a dataset for CHinese Internet Meme Explanation. The dataset comprises popular phrase-based memes from the Chinese Internet, annotated with detailed information on their meaning, origin, example sentences, types, etc. To evaluate whether LLMs understand these memes, we designed two tasks. In the first task, we assessed the models' ability to explain a given meme, identify its origin, and generate appropriate example sentences. The results show that while LLMs can explain the meanings of some memes, their performance declines significantly for culturally and linguistically nuanced meme types. Additionally, they consistently struggle to provide accurate origins for the memes. In the second task, we created a set of multiple-choice questions (MCQs) requiring LLMs to select the most appropriate meme to fill in a blank within a contextual sentence. While the evaluated models were able to provide correct answers, their performance remains noticeably below human levels. We have made CHIME public and hope it will facilitate future research on computational meme understanding.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2510.00567

Country:

Europe (1.00)
Asia > China (0.69)
North America > United States (0.46)

Genre: Research Report > New Finding (1.00)

Industry:

Media (0.46)
Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Add feedback

Memorization or Reasoning? Exploring the Idiom Understanding of LLMs

Kim, Jisu, Shin, Youngwoo, Hwang, Uiji, Choi, Jihun, Xuan, Richeng, Kim, Taeuk

arXiv.org Artificial IntelligenceSep-24-2025

Idioms have long posed a challenge due to their unique linguistic properties, which set them apart from other common expressions. While recent studies have leveraged large language models (LLMs) to handle idioms across various tasks, e.g., idiom-containing sentence generation and idiomatic machine translation, little is known about the underlying mechanisms of idiom processing in LLMs, particularly in multilingual settings. To this end, we introduce MIDAS, a new large-scale dataset of idioms in six languages, each paired with its corresponding meaning. Leveraging this resource, we conduct a comprehensive evaluation of LLMs' idiom processing ability, identifying key factors that influence their performance. Our findings suggest that LLMs rely not only on memorization, but also adopt a hybrid approach that integrates contextual cues and reasoning, especially when processing compositional idioms. This implies that idiom understanding in LLMs emerges from an interplay between internal knowledge retrieval and reasoning-based inference.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2505.16216

Country:

Europe (1.00)
Asia > Middle East > UAE (0.46)
North America > Mexico (0.28)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning > Rote Learning (0.64)

Add feedback

Towards Universal Semantics With Large Language Models

Baartmans, Raymond, Raffel, Matthew, Vikram, Rahul, Deringer, Aiden, Chen, Lizhong

arXiv.org Artificial IntelligenceJul-8-2025

The Natural Semantic Metalanguage (NSM) is a linguistic theory based on a universal set of semantic primes: simple, primitive word-meanings that have been shown to exist in most, if not all, languages of the world. According to this framework, any word, regardless of complexity, can be paraphrased using these primes, revealing a clear and universally translatable meaning. These paraphrases, known as explications, can offer valuable applications for many natural language processing (NLP) tasks, but producing them has traditionally been a slow, manual process. In this work, we present the first study of using large language models (LLMs) to generate NSM explications. We introduce automatic evaluation methods, a tailored dataset for training and evaluation, and fine-tuned models for this task. Our 1B and 8B models outperform GPT-4o in producing accurate, cross-translatable explications, marking a significant step toward universal semantic representation with LLMs and opening up new possibilities for applications in semantic analysis, translation, and beyond. Our code is available at https://github.com/OSU-STARLAB/DeepNSM.

large language model, machine learning, natural language, (22 more...)

arXiv.org Artificial Intelligence

2505.11764

Country: Europe (0.46)

Genre: Research Report (0.82)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)

Add feedback

Automatically Suggesting Diverse Example Sentences for L2 Japanese Learners Using Pre-Trained Language Models

Benedetti, Enrico, Aizawa, Akiko, Boudin, Florian

arXiv.org Artificial IntelligenceJun-5-2025

Providing example sentences that are diverse and aligned with learners' proficiency levels is essential for fostering effective language acquisition. This study examines the use of Pre-trained Language Models (PLMs) to produce example sentences targeting L2 Japanese learners. We utilize PLMs in two ways: as quality scoring components in a retrieval system that draws from a newly curated corpus of Japanese sentences, and as direct sentence generators using zero-shot learning. We evaluate the quality of sentences by considering multiple aspects such as difficulty, diversity, and naturalness, with a panel of raters consisting of learners of Japanese, native speakers -- and GPT-4. Our findings suggest that there is inherent disagreement among participants on the ratings of sentence qualities, except for difficulty. Despite that, the retrieval approach was preferred by all evaluators, especially for beginner and advanced target proficiency, while the generative approaches received lower scores on average. Even so, our experiments highlight the potential for using PLMs to enhance the adaptability of sentence suggestion systems and therefore improve the language learning journey.

computational linguistic, large language model, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2506.0358

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Canada > Ontario > Toronto (0.04)
Europe > France > Pays de la Loire > Loire-Atlantique > Nantes (0.04)
(16 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.88)

Industry: Education > Curriculum > Subject-Specific Education (0.35)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.52)

Add feedback

Do Inuit languages really have many words for snow?

This article was originally featured on The Conversation. Languages are windows into the worlds of the people who speak them – reflecting what they value and experience daily. So perhaps it's no surprise different languages highlight different areas of vocabulary. Scholars have noted that Mongolian has many horse-related words, that Maori has many words for ferns, and Japanese has many words related to taste. Some links are unsurprising, such as German having many words related to beer, or Fijian having many words for fish.

canadian inuktitut, language and concept, snow, (11 more...)

Popular Science

Country:

North America > United States > Alaska (0.05)
Africa > South Africa (0.05)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Natural Language (0.37)

Add feedback

Digitally Supported Analysis of Spontaneous Speech (DigiSpon): Benchmarking NLP-Supported Language Sample Analysis of Swiss Children's Speech

Ryser, Anja, Gao, Yingqiang, Ebling, Sarah

arXiv.org Artificial IntelligenceApr-1-2025

Language sample analysis (LSA) is a process that complements standardized psychometric tests for diagnosing, for example, developmental language disorder (DLD) in children. However, its labor-intensive nature has limited its use in speech-language pathology practice. We introduce an approach that leverages natural language processing (NLP) methods not based on commercial large language models (LLMs) applied to transcribed speech data from 119 children in the German speaking part of Switzerland with typical and atypical language development. The study aims to identify optimal practices that support speech-language pathologists in diagnosing DLD more efficiently within a human-in-the-loop framework, without relying on potentially unethical implementations that leverage commercial LLMs. Preliminary findings underscore the potential of integrating locally deployed NLP methods into the process of semi-automatic LSA.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2504.0078

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > United States > California > Orange County > Irvine (0.04)
(4 more...)

Genre: Research Report > New Finding (0.66)

Industry:

Information Technology > Security & Privacy (0.68)
Health & Medicine > Therapeutic Area (0.68)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Towards Robust Universal Information Extraction: Benchmark, Evaluation, and Solution

Zhu, Jizhao, Shi, Akang, Li, Zixuan, Bai, Long, Jin, Xiaolong, Guo, Jiafeng, Cheng, Xueqi

arXiv.org Artificial IntelligenceMar-5-2025

In this paper, we aim to enhance the robustness of Universal Information Extraction (UIE) by introducing a new benchmark dataset, a comprehensive evaluation, and a feasible solution. Existing robust benchmark datasets have two key limitations: 1) They generate only a limited range of perturbations for a single Information Extraction (IE) task, which fails to evaluate the robustness of UIE models effectively; 2) They rely on small models or handcrafted rules to generate perturbations, often resulting in unnatural adversarial examples. Considering the powerful generation capabilities of Large Language Models (LLMs), we introduce a new benchmark dataset for Robust UIE, called RUIE-Bench, which utilizes LLMs to generate more diverse and realistic perturbations across different IE tasks. Based on this dataset, we comprehensively evaluate existing UIE models and reveal that both LLM-based models and other models suffer from significant performance drops. To improve robustness and reduce training costs, we propose a data-augmentation solution that dynamically selects hard samples for iterative training based on the model's inference loss. Experimental results show that training with only \textbf{15\%} of the data leads to an average \textbf{7.5\%} relative performance improvement across three IE tasks.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2503.03201

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Dominican Republic (0.04)
Asia > China > Liaoning Province > Shenyang (0.04)
(7 more...)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Add feedback

Generating bilingual example sentences with large language models as lexicography assistants

Merx, Raphael, Vylomova, Ekaterina, Kurniawan, Kemal

arXiv.org Artificial IntelligenceNov-19-2024

We present a study of LLMs' performance in generating and rating example sentences for bilingual dictionaries across languages with varying resource levels: French (high-resource), Indonesian (mid-resource), and Tetun (low-resource), with English as the target language. We evaluate the quality of LLM-generated examples against the GDEX (Good Dictionary EXample) criteria: typicality, informativeness, and intelligibility. Our findings reveal that while LLMs can generate reasonably good dictionary examples, their performance degrades significantly for lower-resourced languages. We also observe high variability in human preferences for example quality, reflected in low inter-annotator agreement rates. To address this, we demonstrate that in-context learning can successfully align LLMs with individual annotator preferences. Additionally, we explore the use of pre-trained language models for automated rating of examples, finding that sentence perplexity serves as a good proxy for typicality and intelligibility in higher-resourced languages. Our study also contributes a novel dataset of 600 ratings for LLM-generated sentence pairs, and provides insights into the potential of LLMs in reducing the cost of lexicographic work, particularly for low-resource languages.

annotator, example sentence, language model, (16 more...)

arXiv.org Artificial Intelligence

2410.03182

Country:

Asia > Timor-Leste (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Asia > Middle East > Israel (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.94)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

Add feedback