AITopics

As natural language corpora expand at an unprecedented rate, manual annotation remains a significant methodological bottleneck in corpus linguistic work. We address this challenge by presenting a scalable, unsupervised pipeline for automating grammatical annotation in voluminous corpora using large language models (LLMs). Unlike previous supervised and iterative approaches, our method employs a four-phase workflow: prompt engineering, pre-hoc evaluation, automated batch processing, and post-hoc validation. We demonstrate the pipeline's accessibility and effectiveness through a diachronic case study of variation in the English consider construction. Using GPT-5 through the OpenAI API, we annotate 143,933 sentences from the Corpus of Historical American English (COHA) in under 60 hours, achieving 98%+ accuracy on two sophisticated annotation procedures. Our results suggest that LLMs can perform a range of data preparation tasks at scale with minimal human intervention, opening new possibilities for corpus-based research, though implementation requires attention to costs, licensing, and other ethical considerations.

large language model, machine learning, natural language

2510.12306

Country: Europe (0.28)

Genre: Research Report > New Finding (1.00)

Industry:

Law (0.46)
Energy (0.46)
Health & Medicine > Therapeutic Area (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)

Abdullah, Abdulhady Abas, Zubiaga, Arkaitz, Mirjalili, Seyedali, Gandomi, Amir H., Daneshfar, Fatemeh, Amini, Mohammadsadra, Mohammed, Alan Salam, Veisi, Hadi

Evolution of Meta's LLaMA Models and Parameter-Efficient Fine-Tuning of Large Language Models: A Survey

This review surveys the rapid evolution of Meta AI's LLaMA (Large Language Model Meta AI) series, from LLaMA 1 through LLaMA 4, and the specialized parameter-efficient fine-tuning (PEFT) methods developed for these models. We first describe the LLaMA family of foundation models (7B-65B to 288B parameters), their architectures (including native multimodal and Mixture-of-Experts variants), and key performance characteristics. We then introduce the concept of PEFT, which adapts large pre-trained models by updating only a small subset of parameters, and review five PEFT methods that have been applied to LLaMA: LoRA (Low-Rank Adaptation), LLaMA-Adapter V1 and V2, LLaMA-Excitor, and QLoRA (Quantized LoRA). We discuss each method's mechanism, parameter savings, and example applications to LLaMA (e.g., instruction tuning, multimodal tasks). We provide a structured discussion and analysis of model and adapter architectures, parameter counts, and benchmark results (including examples where fine-tuned LLaMA models outperform larger baselines). Finally, we examine real-world use cases where LLaMA-based models and PEFT have been successfully applied (e.g., legal and medical domains), and we discuss ongoing challenges and future research directions (such as scaling to even larger contexts and improving robustness). This survey provides a one-stop resource for ML researchers and practitioners interested in LLaMA models and efficient fine-tuning strategies.

large language model, machine learning, natural language

2510.12178

Country:

North America > United States (0.28)
Asia > Middle East > Iran (0.28)

Genre:

Overview (1.00)
Research Report > Promising Solution (0.67)

Industry:

Law > Statutes (1.00)
Law > Business Law (1.00)
Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Parsaee, Ali, Jiang, Bei, Friggstad, Zachary, Greiner, Russell

Budget-constrained Active Learning to Effectively De-censor Survival Data

Standard supervised learners attempt to learn a model from a labeled dataset. Given a small set of labeled instances and a pool of unlabeled instances, a budgeted learner can use its given budget to pay to acquire the labels of some unlabeled instances, which it can then use to produce a model. Here, we explore budgeted learning in the context of survival datasets, which include (right) censored instances, where we know only a lower bound on an instance's time-to-event. In this setting, the learner can pay to (partially) label a censored instance -- e.g., to acquire the actual time for an instance [perhaps go from (3 yr, censored) to (7.2 yr, uncensored)], or other variants [e.g., learn about one more year, so go from (3 yr, censored) to either (4 yr, censored) or perhaps (3.2 yr, uncensored)]. This serves as a model of real-world data collection, where follow-up with censored patients does not always lead to uncensoring, and how much information is given to the learner during data collection is a function of the budget and the nature of the data itself. We provide both experimental and theoretical results on how to apply state-of-the-art budgeted learning algorithms to survival data, and on the limitations of doing so. Our approach provides bounds and time complexity asymptotically equivalent to the standard active learning method BatchBALD. Moreover, empirical analysis on several survival tasks shows that our model performs better than other potential approaches on several benchmarks.

artificial intelligence, machine learning, probe depth

2510.12144

Country: North America > Canada > Alberta (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (0.92)

Industry:

Law > Civil Rights & Constitutional Law (1.00)
Health & Medicine > Therapeutic Area > Oncology (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

SafeMT: Multi-turn Safety for Multimodal Language Models

Zhu, Han, Dai, Juntao, Ji, Jiaming, Li, Haoran, Cai, Chengkun, Wen, Pengcheng, Chan, Chi-Min, Chen, Boyuan, Yang, Yaodong, Han, Sirui, Guo, Yike

With the widespread use of multimodal large language models (MLLMs), safety issues have become a growing concern. Multi-turn dialogues, which are more common in everyday interactions, pose a greater risk than single prompts; however, existing benchmarks do not adequately consider this situation. To encourage the community to focus on the safety issues of these models in multi-turn dialogues, we introduce SafeMT, a benchmark that features dialogues of varying lengths generated from harmful queries accompanied by images. This benchmark consists of 10,000 samples in total, encompassing 17 different scenarios and four jailbreak methods. Additionally, we propose the Safety Index (SI) to evaluate the general safety of MLLMs during conversations. We assess the safety of 17 models using this benchmark and find that the risk of successful attacks on these models increases as the number of turns in harmful dialogues rises. This observation indicates that the safety mechanisms of these models are inadequate for recognizing hazards in dialogue interactions. We propose a dialogue safety moderator capable of detecting malicious intent concealed within conversations and providing MLLMs with relevant safety policies. Experimental results from several open-source models indicate that this moderator is more effective at reducing the multi-turn attack success rate (ASR) than existing guard models.

large language model, machine learning, natural language

2510.12133

Genre: Research Report > New Finding (0.46)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Law Enforcement & Public Safety (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.97)

Ceylan, Hakan, Sinibaldi, Edoardo, Misra, Sanjay, Pasricha, Pankaj J., Hutmacher, Dietmar W.

Translating Milli/Microrobots with A Value-Centered Readiness Framework

Untethered mobile milli/microrobots hold transformative potential for interventional medicine by enabling more precise and entirely non-invasive diagnosis and therapy. Realizing this promise requires bridging the gap between groundbreaking laboratory demonstrations and successful clinical integration. Despite remarkable technical progress over the past two decades, most millirobots and microrobots remain confined to laboratory proof-of-concept demonstrations, with limited real-world feasibility. In this Review, we identify key factors that slow translation from bench to bedside, focusing on the disconnect between technical innovation and real-world application. We argue that the long-term impact and sustainability of the field depend on aligning development with unmet medical needs, ensuring applied feasibility, and integrating seamlessly into existing clinical workflows, which are essential pillars for delivering meaningful patient outcomes. To support this shift, we introduce a strategic milli/microrobot Technology Readiness Level framework (mTRL), which maps system development from initial conceptualization to clinical adoption through clearly defined milestones and their associated stepwise activities. The mTRL model provides a structured gauge of technological maturity, a common language for cross-disciplinary collaboration and actionable guidance to accelerate translational development toward new, safer and more efficient interventions.

artificial intelligence, human computer interaction, microrobot

2510.1209

Country:

Oceania > Australia > Queensland (0.15)
North America > United States > Arizona (0.14)

Genre:

Research Report > Strength High (1.00)
Research Report > Experimental Study (1.00)
Research Report > New Finding (0.68)

Industry:

Law (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Surgery (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Human Computer Interaction > Interfaces > Virtual Reality (0.46)

One Life to Learn: Inferring Symbolic World Models for Stochastic Environments from Unguided Exploration

Khan, Zaid, Prasad, Archiki, Stengel-Eskin, Elias, Cho, Jaemin, Bansal, Mohit

Symbolic world modeling requires inferring and representing an environment's transitional dynamics as an executable program. Prior work has focused on largely deterministic environments with abundant interaction data, simple mechanics, and human guidance. We address a more realistic and challenging setting, learning in a complex, stochastic environment where the agent has only "one life" to explore a hostile environment without human guidance. We introduce OneLife, a framework that models world dynamics through conditionally-activated programmatic laws within a probabilistic programming framework. Each law operates through a precondition-effect structure, activating in relevant world states. This creates a dynamic computation graph that routes inference and optimization only through relevant laws, avoiding scaling challenges when all laws contribute to predictions about a complex, hierarchical state, and enabling the learning of stochastic dynamics even with sparse rule activation. To evaluate our approach under these demanding constraints, we introduce a new evaluation protocol that measures (a) state ranking, the ability to distinguish plausible future states from implausible ones, and (b) state fidelity, the ability to generate future states that closely resemble reality. We develop and evaluate our framework on Crafter-OO, our reimplementation of the Crafter environment that exposes a structured, object-oriented symbolic state and a pure transition function that operates on that state alone. OneLife can successfully learn key environment dynamics from minimal, unguided interaction, outperforming a strong baseline on 16 out of 23 scenarios tested. We also test OneLife's planning ability, with simulated rollouts successfully identifying superior strategies. Our work establishes a foundation for autonomously constructing programmatic world models of unknown, complex environments.

artificial intelligence, world model, worldstate

2510.12088

Genre:

Research Report (0.64)
Workflow (0.46)

Industry:

Leisure & Entertainment > Games > Computer Games (0.93)
Law (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.65)

Kurniawan, Kemal, Mistica, Meladel, Baldwin, Timothy, Lau, Jey Han

On the Interplay between Human Label Variation and Model Fairness

The impact of human label variation (HLV) on model fairness is an unexplored topic. This paper examines the interplay by comparing training on majority-vote labels with a range of HLV methods. Our experiments show that without explicit debiasing, HLV training methods have a positive impact on fairness.

artificial intelligence, machine learning, natural language

2510.12036

Country: Oceania > Australia (0.14)

Genre:

Research Report > New Finding (0.69)
Research Report > Experimental Study (0.46)

Industry: Law (0.96)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Cognition-of-Thought Elicits Social-Aligned Reasoning in Large Language Models

Zhang, Xuanming, Chen, Yuxuan, Yeh, Samuel, Li, Sharon

Large language models (LLMs) excel at complex reasoning but can still exhibit harmful behaviors. Current alignment strategies typically embed safety into model weights, making these controls implicit, static, and difficult to modify. This paper introduces Cognition-of-Thought (CooT), a novel decoding-time framework that equips LLMs with an explicit cognitive self-monitoring loop. CooT couples a standard text Generator with a cognitive Perceiver that continuously monitors the unfolding sequence. The Perceiver uses a structured, precedence-based hierarchy of principles (e.g., safety over obedience) to detect potential misalignments as they arise. When violations are flagged, CooT intervenes by rolling back the generation to the point of error and regenerating under injected guidance that combines universal social priors with context-specific warnings. CooT thus transforms alignment from a fixed property into an explicit, dynamic, and auditable process active during inference, allowing for flexible policy updates without retraining the model. Extensive experiments across multiple benchmarks and model families confirm that CooT consistently improves safety and social reasoning performance.

coot, large language model, machine learning

2509.23441

Country:

North America > United States (0.45)
Europe (0.27)

Genre:

Research Report > New Finding (0.45)
Research Report > Experimental Study (0.45)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Law (1.00)
Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Responsible AI Technical Report

KT: Park, Yunjin, Yoon, Jungwon, Moon, Junhyung, Oh, Myunggyo, Lee, Wonhyuk, Kim, Sujin, Kim, Youngchol, Kim, Eunmi, Park, Hyoungjun, Shin, Eunyoung, Lee, Wonyoung, Lee, Somin, Ju, Minwook, Noh, Minsung, Jeong, Dongyoung, Kim, Jeongyeop, Park, Wanjin, Bae, Soonmin

KT developed a Responsible AI (RAI) assessment methodology and risk mitigation technologies to ensure the safety and reliability of AI services. By analyzing the implementation of the Basic Act on AI and global AI governance trends, we established a unique approach for regulatory compliance and for systematically identifying and managing all potential risk factors from AI development to operation. We present a reliable assessment methodology that systematically verifies model safety and robustness based on KT's AI risk taxonomy tailored to the domestic environment. We also provide practical tools for managing and mitigating identified AI risks. Alongside this report, we release our proprietary guardrail, SafetyGuard, which blocks harmful responses from AI models in real time, supporting the enhancement of safety in the domestic AI development ecosystem. We also believe these research outcomes provide valuable insights for organizations seeking to develop Responsible AI.

artificial intelligence, assessment, machine learning

2509.20057

Country:

North America > United States (0.28)
Asia (0.28)

Genre: Research Report (0.82)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Government (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)

Guingrich, Rose E., Graziano, Michael S. A.

A Longitudinal Randomized Control Study of Companion Chatbot Use: Anthropomorphism and Its Mediating Role on Social Impacts

Many Large Language Model (LLM) chatbots are designed and used for companionship, and people have reported forming friendships, mentorships, and romantic partnerships with them. Concerns that companion chatbots may harm or replace real human relationships have been raised, but whether and how these social consequences occur remains unclear. In the present longitudinal study (N = 183), participants were randomly assigned to a chatbot condition (text chat with a companion chatbot) or to a control condition (text-based word games) for 10 minutes a day for 21 days. Participants also completed four surveys during the 21 days and took part in audio-recorded interviews on days 1 and 21. Overall, social health and relationships were not significantly impacted by companion chatbot interactions across 21 days of use. However, a detailed analysis told a different story. People who had a higher desire to socially connect also tended to anthropomorphize the chatbot more, attributing humanlike properties to it; and those who anthropomorphized the chatbot more also reported that talking to the chatbot had a greater impact on their social interactions and relationships with family and friends. Via a mediation analysis, our results suggest a key mechanism at work: the impact of human-AI interaction on human-human social outcomes is mediated by the extent to which people anthropomorphize the AI agent, which is in turn motivated by a desire to socially connect. In a world where the desire to socially connect is on the rise, this finding may be cause for concern.

artificial intelligence, chatbot, natural language

2509.19515

Country:

North America > United States (1.00)
Europe > United Kingdom > England (0.28)

Genre:

Research Report > Strength High (1.00)
Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area (0.93)
Law (0.66)

Technology: Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)