clms
FrameEOL: Semantic Frame Induction using Causal Language Models
Yano, Chihiro, Yamada, Kosuke, Tsukagoshi, Hayato, Sasano, Ryohei, Takeda, Koichi
Semantic frame induction is the task of clustering frame-evoking words according to the semantic frames they evoke. In recent years, leveraging embeddings of frame-evoking words obtained from masked language models (MLMs) such as BERT has led to high-performance semantic frame induction. Although causal language models (CLMs) such as the GPT and Llama series succeed in a wide range of language comprehension tasks and can engage in dialogue as if they understood frames, they have not yet been applied to semantic frame induction. We propose a new method for semantic frame induction based on CLMs. Specifically, we introduce FrameEOL, a prompt-based method for obtaining Frame Embeddings that outputs One frame name as a Label representing the given situation. To obtain embeddings more suitable for frame induction, we leverage in-context learning (ICL) and deep metric learning (DML). Frame induction is then performed by clustering the resulting embeddings. Experimental results on the English and Japanese FrameNet datasets demonstrate that the proposed methods outperform existing frame induction methods. In particular, for Japanese, which lacks extensive frame resources, the CLM-based method using only 5 ICL examples achieved performance comparable to that of the MLM-based method fine-tuned with DML.
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
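A minimal sketch of the FrameEOL idea, assuming a HuggingFace causal LM: prompt the model so that the next token would be a frame name, take the hidden state at that position as the frame embedding, and cluster. The prompt template, the use of GPT-2, and the clustering settings are illustrative assumptions, not the paper's exact setup.

```python
# Prompt a causal LM so the next token would be a frame name, then use
# the hidden state at that position as the frame embedding and cluster.
import torch
from sklearn.cluster import AgglomerativeClustering
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # stand-in; the paper uses larger CLMs
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, output_hidden_states=True)
model.eval()

def frame_embedding(sentence: str, target: str) -> torch.Tensor:
    # Hypothetical prompt template asking for one frame name as a label.
    prompt = (f'In the sentence "{sentence}", the word "{target}" '
              f'evokes the semantic frame of "')
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs)
    # Last layer, last position: where the frame name would be generated.
    return out.hidden_states[-1][0, -1]

examples = [
    ("She bought a new car.", "bought"),
    ("He purchased the tickets online.", "purchased"),
    ("They walked to the station.", "walked"),
]
embeddings = torch.stack([frame_embedding(s, t) for s, t in examples]).numpy()
print(AgglomerativeClustering(n_clusters=2).fit_predict(embeddings))
# "bought" and "purchased" should land in the same cluster
```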
Scrub It Out! Erasing Sensitive Memorization in Code Language Models via Machine Unlearning
Chu, Zhaoyang, Wan, Yao, Zhang, Zhikun, Wang, Di, Yang, Zhou, Zhang, Hongyu, Zhou, Pan, Shi, Xuanhua, Jin, Hai, Lo, David
While Code Language Models (CLMs) have demonstrated superior performance in software engineering tasks such as code generation and summarization, recent empirical studies reveal a critical privacy vulnerability: these models exhibit unintended memorization of sensitive training data, enabling verbatim reproduction of confidential information when specifically prompted. To address this issue, several approaches, including training data de-duplication and differential privacy augmentation, have been proposed. However, these methods require full-model retraining for deployed CLMs, which incurs substantial computational costs. In this paper, we aim to answer the following research question: Can sensitive information memorized by CLMs be erased effectively and efficiently? We conduct a pioneering investigation into erasing sensitive memorization in CLMs through machine unlearning, a post-hoc modification method that removes specific information from trained models without requiring full retraining. Specifically, we first quantify the memorization risks of sensitive data within CLM training datasets and curate a high-risk dataset of 50,000 sensitive memorized samples as unlearning targets. We study two widely used gradient ascent-based unlearning approaches, the vanilla and constraint-based methods, and introduce CodeEraser, an advanced variant that selectively unlearns sensitive memorized segments in code while preserving the structural integrity and functional correctness of the surrounding code. Extensive experiments on three families of CLMs, i.e., CodeParrot, CodeGen-Mono, and Qwen2.5-Coder, validate the effectiveness and efficiency of CodeEraser in erasing targeted sensitive memorization while maintaining model utility.
- South America > Brazil > Rio de Janeiro > Rio de Janeiro (0.05)
- North America > United States > New York > New York County > New York City (0.05)
- Asia > China > Hubei Province > Wuhan (0.04)
- (10 more...)
- Law (1.00)
- Information Technology > Security & Privacy (1.00)
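A minimal sketch of the vanilla gradient-ascent unlearning baseline the paper studies, assuming a HuggingFace causal code model: maximize the language-modeling loss on the forget set so the model can no longer reproduce it verbatim. The model choice, learning rate, and step count are illustrative, and CodeEraser's selective, segment-level unlearning is more involved than this.

```python
# Vanilla gradient-ascent unlearning: maximize the LM loss on the
# forget set so the model stops reproducing the memorized text.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "codeparrot/codeparrot-small"  # small code LM for illustration
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

forget_set = ['api_key = "sk-EXAMPLE-NOT-A-REAL-KEY"']  # placeholder secret

model.train()
for _ in range(3):  # a few ascent steps per forget sample
    for text in forget_set:
        batch = tokenizer(text, return_tensors="pt")
        out = model(**batch, labels=batch["input_ids"])
        loss = -out.loss  # negate to ascend instead of descend
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```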
Code-Optimise: Self-Generated Preference Data for Correctness and Efficiency
Gee, Leonidas, Gritta, Milan, Lampouras, Gerasimos, Iacobacci, Ignacio
Code Language Models have been trained to generate accurate solutions, typically with no regard for runtime. On the other hand, previous works that explored execution optimisation have observed corresponding drops in functional correctness. To address this trade-off, we introduce Code-Optimise, a framework that incorporates both correctness (passed, failed) and runtime (quick, slow) as learning signals via self-generated preference data. Our framework is both lightweight and robust as it dynamically selects solutions to reduce overfitting while avoiding a reliance on larger models for learning signals. Code-Optimise achieves significant improvements in pass@k while decreasing the competitive baseline runtimes by an additional 6% for in-domain data and up to 3% for out-of-domain data. As a byproduct, the average length of the generated solutions is reduced by up to 48% on MBPP and 23% on HumanEval, resulting in faster and cheaper inference. The generated data and codebase will be open-sourced at www.open-source.link.
- North America > Canada > Ontario > Toronto (0.04)
- Europe > United Kingdom > England > Greater London > London (0.04)
- Europe > United Kingdom > England > East Sussex > Brighton (0.04)
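A rough sketch of how correctness and runtime can be combined into self-generated preference data, in the spirit of the framework described above. The `run_tests` harness and the pairing heuristics below are simplified assumptions, not the paper's implementation.

```python
# Turn sampled solutions into (chosen, rejected) preference pairs using
# correctness (passed/failed) and runtime (quick/slow) as the signals.
import time

def run_tests(solution: str, tests: list[str]) -> bool:
    """Simplified stand-in harness: exec the candidate, then its tests."""
    scope: dict = {}
    try:
        exec(solution, scope)  # define the candidate function
        for t in tests:
            exec(t, scope)     # assert-based unit tests
        return True
    except Exception:
        return False

def preference_pairs(solutions: list[str], tests: list[str]):
    scored = []
    for sol in solutions:
        start = time.perf_counter()
        ok = run_tests(sol, tests)
        scored.append((sol, ok, time.perf_counter() - start))
    passing = sorted((s for s in scored if s[1]), key=lambda s: s[2])
    failing = [s for s in scored if not s[1]]
    # Correctness signal: any passing solution beats any failing one.
    pairs = [(p[0], f[0]) for p in passing for f in failing]
    # Runtime signal: the quickest passing solution beats the slowest.
    if len(passing) >= 2:
        pairs.append((passing[0][0], passing[-1][0]))
    return pairs  # feed to a preference-learning objective such as DPO

solutions = [
    "def double(x):\n    return x * 2",
    "def double(x):\n    return x + x",
    "def double(x):\n    return x ** 2",  # incorrect
]
tests = ["assert double(3) == 6", "assert double(0) == 0"]
print(len(preference_pairs(solutions, tests)))
```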
How BERT Speaks Shakespearean English? Evaluating Historical Bias in Contextual Language Models
Cuscito, Miriam, Ferrara, Alfio, Ruskov, Martin
In this paper, we explore the idea of analysing the historical bias of contextual language models based on BERT by measuring their adequacy with respect to Early Modern (EME) and Modern (ME) English. In our preliminary experiments, we perform fill-in-the-blank tests with 60 masked sentences (20 EME-specific, 20 ME-specific and 20 generic) and three different models (i.e., BERT Base, MacBERTh, English HLM). We then rate the model predictions according to a 5-point bipolar scale between the two language varieties and derive a weighted score to measure the adequacy of each model to EME and ME varieties of English.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- (5 more...)
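A minimal sketch of the paper's fill-in-the-blank test, assuming a HuggingFace fill-mask pipeline. The two sentences are invented stand-ins for the paper's 60 masked sentences, and the 5-point bipolar rating step is manual, so it is omitted here.

```python
# Fill-in-the-blank probe: compare a model's top predictions for
# Early Modern vs. Modern English blanks.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")

sentences = [
    "thou [MASK] not covet thy neighbour's house.",  # EME-flavoured
    "you [MASK] not park your car here.",            # ME-flavoured
]
for s in sentences:
    top = fill(s, top_k=3)
    print(s, "->", [(p["token_str"], round(p["score"], 3)) for p in top])
```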
Exploring Category Structure with Contextual Language Models and Lexical Semantic Networks
Renner, Joseph, Denis, Pascal, Gilleron, Rémi, Brunellière, Angèle
Recent work on predicting category structure with distributional models, using either static word embeddings (Heyman and Heyman, 2019) or contextualized language models (CLMs) (Misra et al., 2021), reports low correlations with human ratings, thus calling into question their plausibility as models of human semantic memory. In this work, we revisit this question, testing a wider array of methods for probing CLMs to predict typicality scores. Our experiments, using BERT (Devlin et al., 2018), show the importance of using the right type of CLM probes, as our best BERT-based typicality prediction methods substantially improve over previous work. Second, our results highlight the importance of polysemy in this task: our best results are obtained when using a disambiguation mechanism. Finally, additional experiments reveal that Information Content-based measures over WordNet (Miller, 1995), also endowed with disambiguation, match the performance of the best BERT-based method and in fact capture complementary information, which can be combined with BERT to achieve enhanced typicality predictions.
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- Europe > France > Hauts-de-France > Nord > Lille (0.04)
- Asia > China > Hong Kong (0.04)
- (4 more...)
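A minimal sketch of one way to probe an MLM for typicality, assuming BERT and a cloze-style template; the template and single-token scoring are illustrative assumptions rather than the paper's best-performing probe.

```python
# Probe BERT for typicality: score how probable an exemplar is
# in a category frame such as "A [MASK] is a bird."
import torch
from transformers import BertForMaskedLM, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

def typicality(exemplar: str, category: str) -> float:
    text = f"A {tokenizer.mask_token} is a {category}."
    inputs = tokenizer(text, return_tensors="pt")
    mask_pos = (inputs.input_ids[0] == tokenizer.mask_token_id).nonzero()[0, 0]
    with torch.no_grad():
        logits = model(**inputs).logits[0, mask_pos]
    probs = logits.softmax(dim=-1)
    token_id = tokenizer.convert_tokens_to_ids(exemplar)  # single-wordpiece only
    return probs[token_id].item()

for bird in ["robin", "penguin"]:
    print(bird, typicality(bird, "bird"))  # robin should score higher
```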
Leveraging molecular structure and bioactivity with chemical language models for drug design
Generative chemical language models (CLMs) can be used for de novo molecular structure generation. These CLMs learn from the structural information of known molecules to generate new ones. In this paper, we show that "hybrid" CLMs can additionally leverage the bioactivity information available for the training compounds. To computationally design ligands of phosphoinositide 3-kinase gamma (PI3Kγ), we created a large collection of virtual molecules with a generative CLM. This primary virtual compound library was further refined using a CLM-based classifier for bioactivity prediction.
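A schematic sketch of the generate-then-filter pipeline described above. Both the generator and the bioactivity classifier are stubbed placeholders here (`sample_smiles`, `bioactivity_score` are hypothetical names), since the actual components are trained chemical language models.

```python
# Generate-then-filter: a generative chemical LM proposes SMILES strings,
# and a bioactivity classifier refines the primary virtual library.
import random

def sample_smiles() -> str:
    # Placeholder for sampling from a trained generative chemical LM.
    return random.choice(["CCO", "c1ccccc1", "CC(=O)Oc1ccccc1C(=O)O"])

def bioactivity_score(smiles: str) -> float:
    # Placeholder for a CLM-based classifier predicting target activity.
    return random.random()

def design_ligands(n_candidates: int, threshold: float = 0.5) -> list[str]:
    library = [sample_smiles() for _ in range(n_candidates)]  # primary library
    # Keep only molecules the classifier predicts as active.
    return [smi for smi in library if bioactivity_score(smi) >= threshold]

print(design_ligands(10))
```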
Why Custom Language Models (CLMs) are Needed in Speech Recognition for Kids
Welcome back to "Lessons from Our Voice Engine," where members of our Engineering and Speech Tech teams offer high-level insights into how our voice engine works. Lesson 2 is from Lora Lynn Asvos, a Computational Linguist on our Speech Tech team. CLM stands for "custom language model." As mentioned in Lesson 1, language models are statistical models of language that can predict the next word based on the context. CLMs are language models, as the name implies, but they have a little something extra.
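To make "predict the next word based on the context" concrete, here is a toy bigram model, the simplest kind of statistical language model; it is purely illustrative and far simpler than any production custom language model.

```python
# A bigram model counted from a toy corpus: predict the next word
# as the most frequent follower of the current word.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ran".split()
bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def predict_next(word: str):
    counts = bigrams[word]
    return counts.most_common(1)[0][0] if counts else None

print(predict_next("the"))  # -> "cat"
```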
Machine Reading Comprehension: The Role of Contextualized Language Models and Beyond
Zhang, Zhuosheng, Zhao, Hai, Wang, Rui
Machine reading comprehension (MRC) aims to teach machines to read and comprehend human languages, a long-standing goal of natural language processing (NLP). With the rise of deep neural networks and the evolution of contextualized language models (CLMs), MRC research has experienced two significant breakthroughs. MRC and CLMs have had a great impact on the NLP community. In this survey, we provide a comprehensive and comparative review of MRC covering: 1) the origin and development of MRC and CLMs, with a particular focus on the role of CLMs; 2) the impact of MRC and CLMs on the NLP community; 3) the definition, datasets, and evaluation of MRC; 4) general MRC architecture and technical methods, viewed as a two-stage encoder-decoder solving architecture informed by insights from human cognitive processes; and 5) previous highlights, emerging topics, and our empirical analysis, especially focusing on what works in different periods of MRC research. We propose a full-view categorization and new taxonomies for these topics. Our primary conclusions are that 1) MRC boosts the progress from language processing to understanding; 2) the rapid improvement of MRC systems greatly benefits from the development of CLMs; and 3) the theme of MRC is gradually moving from shallow text matching to cognitive reasoning.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Europe > Germany (0.04)
- Asia > China > Shanghai > Shanghai (0.04)
- (11 more...)
- Overview (1.00)
- Research Report > New Finding (0.45)
- Education > Assessment & Standards > Student Performance (0.73)
- Leisure & Entertainment > Sports > Football (0.67)
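For readers new to the task, a minimal concrete example of MRC in action, assuming a HuggingFace question-answering pipeline with an off-the-shelf extractive QA model; this illustrates the task itself, not the survey's two-stage architecture.

```python
# Extractive MRC: read a passage and extract an answer span for a question.
from transformers import pipeline

qa = pipeline("question-answering",
              model="distilbert-base-cased-distilled-squad")

context = ("Machine reading comprehension teaches machines to read a passage "
           "and answer questions about it.")
print(qa(question="What does MRC teach machines to do?", context=context))
```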
Explaining AI-based Decision Support Systems using Concept Localization Maps
Lucieri, Adriano, Bajwa, Muhammad Naseer, Dengel, Andreas, Ahmed, Sheraz
Human-centric explainability of AI-based Decision Support Systems (DSS) that use visual input modalities is directly related to the reliability and practicality of such algorithms. An otherwise accurate and robust DSS might not enjoy the trust of experts in critical application areas if it cannot provide reasonable justification for its predictions. This paper introduces Concept Localization Maps (CLMs), a novel approach to explainable image classifiers employed as DSS. CLMs extend Concept Activation Vectors (CAVs) by locating significant regions corresponding to a learned concept in the latent space of a trained image classifier. They provide qualitative and quantitative assurance of a classifier's ability to learn and focus on concepts that are important to humans during image recognition. To better understand the effectiveness of the proposed method, we generated a new synthetic dataset, Simple Concept DataBase (SCDB), that includes annotations for 10 distinguishable concepts, and made it publicly available. We evaluated the proposed method on SCDB as well as on the real-world dataset CelebA. Using SE-ResNeXt-50 on SCDB, we achieved localization recall above 80% for the most relevant concepts and average recall above 60% across all concepts. Our results on both datasets show great promise for CLMs in easing the acceptance of DSS in practice.
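A rough sketch of the CAV-to-localization-map idea, assuming NumPy and scikit-learn, with random arrays standing in for real network activations: learn a concept direction from pooled activations, then project each spatial position of a feature map onto it. The linear probe, shapes, and projection are illustrative assumptions, not the paper's exact procedure.

```python
# Learn a concept direction (CAV) in a conv layer's feature space, then
# project a feature map onto it to get a concept localization map.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Stand-ins for pooled activations of concept vs. random images: (n, channels)
acts_concept = np.random.randn(50, 256) + 1.0
acts_random = np.random.randn(50, 256)

X = np.vstack([acts_concept, acts_random])
y = np.array([1] * 50 + [0] * 50)
cav = LogisticRegression(max_iter=1000).fit(X, y).coef_[0]  # concept direction

# Stand-in feature map of one test image: (channels, height, width)
fmap = np.random.randn(256, 14, 14)
# Per-position projection onto the CAV yields a (height, width) localization map.
clm_map = np.tensordot(cav, fmap, axes=([0], [0]))
print(clm_map.shape)  # (14, 14)
```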