AITopics | preposition

Information about AI from the News, Publications, and Conferences

9d411e87d0f37059f40fb27c5de00ba0-Supplemental-Datasets_and_Benchmarks_Track.pdf

Neural Information Processing Systems | Jun-20-2026, 15:52:12 GMT

The following section provides answers to the questions listed in datasheets for datasets.

A.1 Motivation

Question: For what purpose was the dataset created? Was there a specific task in mind? Was there a specific gap that needed to be filled?
Answer: To evaluate the linguistic robustness of language models across diverse English varieties by transforming Standard American English (SAE) datasets.

Question: Who created the dataset (e.g., which team, research group) and on behalf of which entity (e.g., company, institution, organization)?
Answer: The authors of this paper.

Question: Who funded the creation of the dataset? If there is an associated grant, please provide the name of the grantor and the grant name and number.

large language model, machine learning, natural language, (23 more...)

Neural Information Processing Systems

Genre: Research Report (0.34)

Industry: Law (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.46)

A Taxonomy of Errors in English as she is spoke: Toward an AI-Based Method of Error Analysis for EFL Writing Instruction

Heywood, Damian, Carrier, Joseph Andrew, Hwang, Kyu-Hong

arXiv.org Artificial Intelligence | Dec-2-2025

Background: Recent developments in artificial intelligence (AI), particularly Large Language Models (LLMs), have shown promise in automating previously unavailable aspects of student writing assessment and providing detailed, individuated feedback. Our previous research demonstrated that AI systems can reliably assess student writing using standardized rubrics, achieving consistency rates of over 99% over five iterations (Heywood & Carrier, 2024). However, while these systems excel at providing holistic assessment using broad categories, their potential to provide detailed, granular feedback about specific writing errors has not yet been fully explored. This study builds upon our earlier work by developing and testing a sophisticated error classification system that can identify, categorize, and describe writing errors at both the word and sentence levels. The system employs a detailed taxonomy of errors based on established linguistic theory in the area of error classification (Corder, 1967, 1975, 1981; Richards, 1971, 1974; James, 1998). The AI analysis is implemented through carefully designed API calls to Claude 3.5 Sonnet in Python. With this enhanced error classification system, the present study analyzes an error-ridden dialogue from an infamous text, English as she is spoke (Fonseca et al., 2004). We also provide the results of a review of the AI analysis by a human panel of experts.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2512.00392

Genre: Research Report > New Finding (1.00)

Industry: Education > Curriculum > Subject-Specific Education (0.70)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

A Linguistically Motivated Analysis of Intonational Phrasing in Text-to-Speech Systems: Revealing Gaps in Syntactic Sensitivity

Pouw, Charlotte, Alishahi, Afra, Zuidema, Willem

arXiv.org Artificial Intelligence | Oct-16-2025

We analyze the syntactic sensitivity of Text-to-Speech (TTS) systems using methods inspired by psycholinguistic research. Specifically, we focus on the generation of intonational phrase boundaries, which can often be predicted by identifying syntactic boundaries within a sentence. We find that TTS systems struggle to accurately generate intonational phrase boundaries in sentences where syntactic boundaries are ambiguous (e.g., garden path sentences or sentences with attachment ambiguity). In these cases, systems need superficial cues such as commas to place boundaries at the correct positions. In contrast, for sentences with simpler syntactic structures, we find that systems do incorporate syntactic cues beyond surface markers. Finally, we finetune models on sentences without commas at the syntactic boundary positions, encouraging them to focus on more subtle linguistic cues. Our findings indicate that this leads to more distinct intonation patterns that better reflect the underlying structure.

artificial intelligence, boundary, natural language, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.18653/v1/2025.conll-1.9

2505.22236

Country: North America > United States > Minnesota (0.28)

Genre: Research Report > New Finding (0.48)

Technology: Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)

Say It Another Way: Auditing LLMs with a User-Grounded Automated Paraphrasing Framework

Chataigner, Cléa, Ma, Rebecca, Ganesh, Prakhar, Chen, Yuhao, Taïk, Afaf, Creager, Elliot, Farnadi, Golnoosh

arXiv.org Artificial Intelligence | Oct-10-2025

Large language models (LLMs) are highly sensitive to subtle changes in prompt phrasing, posing challenges for reliable auditing. Prior methods often apply unconstrained prompt paraphrasing, which risks missing linguistic and demographic factors that shape authentic user interactions. We introduce AUGMENT (Automated User-Grounded Modeling and Evaluation of Natural Language Transformations), a framework for generating controlled paraphrases, grounded in user behaviors. AUGMENT leverages linguistically informed rules and enforces quality through checks on instruction adherence, semantic similarity, and realism, ensuring paraphrases are both reliable and meaningful for auditing. Through case studies on the BBQ and MMLU datasets, we show that controlled paraphrases uncover systematic weaknesses that remain obscured under unconstrained variation. These results highlight the value of the AUGMENT framework for reliable auditing.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2505.03563

Country:

Europe (0.68)
Asia > Middle East > UAE (0.28)
North America > United States > Minnesota (0.28)

Genre: Research Report (0.64)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.97)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.97)

Evaluating The Impact of Stimulus Quality in Investigations of LLM Language Performance

Pistotti, Timothy, Brown, Jason, Witbrock, Michael

arXiv.org Artificial Intelligence | Oct-8-2025

Recent studies employing Large Language Models (LLMs) to test the Argument from the Poverty of the Stimulus (APS) have yielded contrasting results across syntactic phenomena. This paper investigates the hypothesis that characteristics of the stimuli used in recent studies, including lexical ambiguities and structural complexities, may confound model performance. A methodology is proposed for re-evaluating LLM competence on syntactic prediction, focusing on GPT-2. This involves: 1) establishing a baseline on previously used (both filtered and unfiltered) stimuli, and 2) generating a new, refined dataset using a state-of-the-art (SOTA) generative LLM (Gemini 2.5 Pro Preview) guided by linguistically-informed templates designed to mitigate identified confounds. Our preliminary findings indicate that GPT-2 demonstrates notably improved performance on these refined PG stimuli compared to baselines, suggesting that stimulus quality significantly influences outcomes in surprisal-based evaluations of LLM syntactic competency.
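As a rough illustration of what "surprisal-based evaluation" refers to (a minimal sketch, not the authors' code): the surprisal of a word is the negative log of the probability a model assigns to it, so a model is taken to prefer a grammatical continuation when that continuation is less surprising than the ungrammatical one. The probabilities below are made-up placeholders; a real study would read them from a language model such as GPT-2.

```python
import math


def surprisal(prob: float) -> float:
    """Return the surprisal, in bits, of an event with probability `prob`."""
    if not 0.0 < prob <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -math.log2(prob)


# Hypothetical next-word probabilities (assumed values, not model output)
# for the critical word in a grammatical and an ungrammatical sentence.
p_grammatical = 0.25
p_ungrammatical = 0.03125

s_grammatical = surprisal(p_grammatical)
s_ungrammatical = surprisal(p_ungrammatical)

# The comparison counts as a "success" if the grammatical variant
# is the less surprising of the two.
prefers_grammatical = s_grammatical < s_ungrammatical
print(s_grammatical, s_ungrammatical, prefers_grammatical)
```

Lower surprisal means the model found the word more expected, which is why stimulus confounds such as lexical ambiguity can distort the comparison.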

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2510.06018

Country: North America > United States > Minnesota (0.28)

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

A UD Treebank for Bohairic Coptic

Zeldes, Amir, Speransky, Nina, Wagner, Nicholas, Schroeder, Caroline T.

arXiv.org Artificial Intelligence | Jun-10-2025

Despite recent advances in digital resources for other Coptic dialects, especially Sahidic, Bohairic Coptic, the main Coptic dialect for pre-Mamluk, late Byzantine Egypt, and the contemporary language of the Coptic Church, remains critically under-resourced. This paper presents and evaluates the first syntactically annotated corpus of Bohairic Coptic, sampling data from a range of works, including Biblical text, saints' lives and Christian ascetic writing. We also explore some of the main differences we observe compared to the existing UD treebank of Sahidic Coptic, the classical dialect of the language, and conduct joint and cross-dialect parsing experiments, revealing the unique nature of Bohairic as a related, but distinct variety from the more often studied Sahidic.

artificial intelligence, natural language, treebank, (17 more...)

arXiv.org Artificial Intelligence

2504.18386

Country:

North America > United States (0.68)
Europe > Germany (0.68)
Africa > Middle East > Egypt (0.49)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)

Computational Typology

Jäger, Gerhard

arXiv.org Artificial Intelligence | Apr-30-2025

Typology is a subfield of linguistics that focuses on the study and classification of languages based on their structural features. Unlike genealogical classification, which examines the historical relationships between languages, typology seeks to understand the diversity of human languages by identifying common properties and patterns, known as universals. In recent years, computational methods have played an increasingly important role in typological research, enabling the analysis of large-scale linguistic data and the testing of hypotheses about language structure and evolution. This article provides an illustration of the benefits of computational statistical modeling in typology.

correlation, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2504.15642

Country:

North America > United States (0.46)
Europe > Austria (0.28)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Natural Language (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Acquiring Grounded Representations of Words with Situated Interactive Instruction

Mohan, Shiwali, Mininger, Aaron H., Kirk, James R., Laird, John E.

arXiv.org Artificial Intelligence | Feb-28-2025

We present an approach for acquiring grounded representations of words from mixed-initiative, situated interactions with a human instructor. The work focuses on the acquisition of diverse types of knowledge including perceptual, semantic, and procedural knowledge along with learning grounded meanings. Interactive learning allows the agent to control its learning by requesting instructions about unknown concepts, making learning efficient. Our approach has been instantiated in Soar and has been evaluated on a table-top robotic arm capable of manipulating small objects.

agent, knowledge, preposition, (17 more...)

arXiv.org Artificial Intelligence

2502.20754

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)
North America > Canada > Ontario > Toronto (0.04)
(5 more...)

Genre: Research Report (0.82)

Industry: Education > Educational Setting > Online (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

PropNet: a White-Box and Human-Like Network for Sentence Representation

Yang, Fei

arXiv.org Artificial Intelligence | Feb-15-2025

Transformer-based embedding methods have dominated the field of sentence representation in recent years. Although they have achieved remarkable performance on NLP tasks such as semantic textual similarity (STS), their black-box nature and large-data-driven training style have raised concerns, including issues related to bias, trust, and safety. Many efforts have been made to improve the interpretability of embedding models, but these problems have not been fundamentally resolved. To achieve inherent interpretability, we propose a purely white-box and human-like sentence representation network, PropNet. Inspired by findings from cognitive science, PropNet constructs a hierarchical network based on the propositions contained in a sentence. While experiments indicate that PropNet trails state-of-the-art (SOTA) embedding models by a significant margin on STS tasks, case studies reveal substantial room for improvement. Additionally, PropNet enables us to analyze and understand the human cognitive processes underlying STS benchmarks.

artificial intelligence, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2502.10725

Country:

Africa (0.67)
North America > United States (0.67)
Asia > Middle East > Iraq (0.14)

Genre: Research Report (0.82)

Industry:

Government (1.00)
Health & Medicine (0.93)
Transportation > Air (0.87)
Leisure & Entertainment > Sports (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

TACOMORE: Leveraging the Potential of LLMs in Corpus-based Discourse Analysis with Prompt Engineering

Li, Bingru, Wang, Han

arXiv.org Artificial Intelligence | Dec-13-2024

The capacity of LLMs to carry out automated qualitative analysis has been questioned by corpus linguists, and it has been argued that corpus-based discourse analysis incorporating LLMs is hindered by issues of unsatisfactory performance, hallucination, and irreproducibility. Our proposed method, TACOMORE, aims to address these concerns by serving as an effective prompting framework in this domain. The framework consists of four principles, i.e., Task, Context, Model and Reproducibility, and specifies five fundamental elements of a good prompt, i.e., Role Description, Task Definition, Task Procedures, Contextual Information and Output Format. We conduct experiments on three LLMs, i.e., GPT-4o, Gemini-1.5-Pro and Gemini-1.5-Flash, and find that TACOMORE helps improve LLM performance in three representative discourse analysis tasks, i.e., the analysis of keywords, collocates and concordances, based on an open corpus of COVID-19 research articles. Our findings show the efficacy of the proposed prompting framework TACOMORE in corpus-based discourse analysis in terms of Accuracy, Ethicality, Reasoning, and Reproducibility, and provide novel insights into the application and evaluation of LLMs in automated qualitative studies.
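One way to picture the five prompt elements named in the abstract is as fields of a small template. The sketch below is an assumption-laden illustration, not the TACOMORE implementation: the class `PromptSpec`, the function `build_prompt`, the section layout, and the example wording are all hypothetical; only the five element names come from the abstract.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PromptSpec:
    """Hypothetical container for the five prompt elements."""
    role_description: str
    task_definition: str
    task_procedures: List[str]
    contextual_information: str
    output_format: str


def build_prompt(spec: PromptSpec) -> str:
    """Join the five elements into one prompt string, in a fixed order."""
    # Number the procedure steps so the model can follow them in sequence.
    steps = "\n".join(
        f"{i}. {step}" for i, step in enumerate(spec.task_procedures, start=1)
    )
    sections = [
        ("Role Description", spec.role_description),
        ("Task Definition", spec.task_definition),
        ("Task Procedures", steps),
        ("Contextual Information", spec.contextual_information),
        ("Output Format", spec.output_format),
    ]
    return "\n\n".join(f"{title}:\n{body}" for title, body in sections)


# Illustrative values only; the wording is invented for this example.
example = PromptSpec(
    role_description="You are a corpus linguist.",
    task_definition="Group the given keywords into thematic categories.",
    task_procedures=["Read the keyword list.", "Assign each keyword a theme."],
    contextual_information="The keywords come from COVID-19 research articles.",
    output_format="A table with columns: keyword, theme.",
)
print(build_prompt(example))
```

Keeping the order and headings fixed makes prompts easier to compare across runs and models, which fits the abstract's emphasis on reproducibility.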

large language model, machine learning, virus, (20 more...)

arXiv.org Artificial Intelligence

2412.10139

Country:

Asia > China > Hubei Province > Wuhan (0.05)
Europe > Italy (0.04)
Asia > South Korea (0.04)
(5 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Epidemiology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
