AITopics | person2

Collaborating Authors

person2

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

A Details of Data Augmentation with External Knowledge Resources 486 4 Enhance Relation Recognition: We enriched the relationships between objects parsed from the

Neural Information Processing SystemsFeb-11-2026, 09:45:17 GMT

The hyperparameters for training are detailed in Table 7. We perform the human evaluation on two of the four in-depth knowledge quality assessment metrics. V alidity ( "): whether the generated visual knowledge is valid to humans . Conformity ( "): whether the generated knowledge faithfully depicts the scenarios in the images . Our calculated average pairwise Cohen's Suppose you are looking at an image that contains the following subject and object entities: Subject list: [Insert the subject names here] Object list: [Insert the object names here] Please extract 5-10 condensed descriptions that describe the interactions and/or relations among those entities in the image.

artificial intelligence, natural language, person2, (15 more...)

Neural Information Processing Systems

Country: Europe (0.04)

Industry: Leisure & Entertainment (0.47)

Technology: Information Technology > Artificial Intelligence > Natural Language (0.47)

Add feedback

Efficient Compositional Multi-tasking for On-device Large Language Models

Bohdal, Ondrej, Ozay, Mete, Moon, Jijoong, Lee, Kyeng-Hun, Ko, Hyeonmok, Michieli, Umberto

arXiv.org Artificial IntelligenceOct-14-2025

Adapter parameters provide a mechanism to modify the behavior of machine learning models and have gained significant popularity in the context of large language models (LLMs) and generative AI. These parameters can be merged to support multiple tasks via a process known as task merging. However, prior work on merging in LLMs, particularly in natural language processing, has been limited to scenarios where each test example addresses only a single task. In this paper, we focus on on-device settings and study the problem of text-based compositional multi-tasking, where each test example involves the simultaneous execution of multiple tasks. For instance, generating a translated summary of a long text requires solving both translation and summarization tasks concurrently. To facilitate research in this setting, we propose a benchmark comprising four practically relevant compositional tasks. We also present an efficient method (Learnable Calibration) tailored for on-device applications, where computational resources are limited, emphasizing the need for solutions that are both resource-efficient and high-performing. Our contributions lay the groundwork for advancing the capabilities of LLMs in real-world multi-tasking scenarios, expanding their applicability to complex, resource-constrained use cases.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2507.16083

Country: Europe (0.46)

Genre: Research Report > New Finding (0.67)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback

Paired by the Teacher: Turning Unpaired Data into High-Fidelity Pairs for Low-Resource Text Generation

Lu, Yen-Ju, Thebaud, Thomas, Moro-Velazquez, Laureano, Dehak, Najim, Villalba, Jesus

arXiv.org Artificial IntelligenceSep-30-2025

We present Paired by the Teacher (PbT), a two-stage teacher-student pipeline that synthesizes accurate input-output pairs without human labels or parallel data. In many low-resource natural language generation (NLG) scenarios, practitioners may have only raw outputs, like highlights, recaps, or questions, or only raw inputs, such as articles, dialogues, or paragraphs, but seldom both. This mismatch forces small models to learn from very few examples or rely on costly, broad-scope synthetic examples produced by large LLMs. PbT addresses this by asking a teacher LLM to compress each unpaired example into a concise intermediate representation (IR), and training a student to reconstruct inputs from IRs. This enables outputs to be paired with student-generated inputs, yielding high-quality synthetic data. We evaluate PbT on five benchmarks-document summarization (XSum, CNNDM), dialogue summarization (SAMSum, DialogSum), and question generation (SQuAD)-as well as an unpaired setting on SwitchBoard (paired with DialogSum summaries). An 8B student trained only on PbT data outperforms models trained on 70 B teacher-generated corpora and other unsupervised baselines, coming within 1.2 ROUGE-L of human-annotated pairs and closing 82% of the oracle gap at one-third the annotation cost of direct synthesis. Human evaluation on SwitchBoard further confirms that only PbT produces concise, faithful summaries aligned with the target style, highlighting its advantage of generating in-domain sources that avoid the mismatch, limiting direct synthesis.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2509.25144

Country:

North America > United States > Texas (0.46)
North America > United States > California (0.28)

Genre: Research Report (0.82)

Industry:

Media (1.00)
Leisure & Entertainment (1.00)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Explainable Detection of Implicit Influential Patterns in Conversations via Data Augmentation

Abdidizaji, Sina, Kowsher, Md, Yousefi, Niloofar, Garibay, Ivan

arXiv.org Artificial IntelligenceJun-18-2025

In the era of digitalization, as individuals increasingly rely on digital platforms for communication and news consumption, various actors employ linguistic strategies to influence public perception. While models have become proficient at detecting explicit patterns, which typically appear in texts as single remarks referred to as utterances, such as social media posts, malicious actors have shifted toward utilizing implicit influential verbal patterns embedded within conversations. These verbal patterns aim to mentally penetrate the victim's mind in order to influence them, enabling the actor to obtain the desired information through implicit means. This paper presents an improved approach for detecting such implicit influential patterns. Furthermore, the proposed model is capable of identifying the specific locations of these influential elements within a conversation. To achieve this, the existing dataset was augmented using the reasoning capabilities of state-of-the-art language models. Our designed framework resulted in a 6% improvement in the detection of implicit influential patterns in conversations. Moreover, this approach improved the multi-label classification tasks related to both the techniques used for influence and the vulnerability of victims by 33% and 43%, respectively.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2506.14211

Country: North America > United States (0.93)

Genre: Research Report (0.82)

Industry: Information Technology > Security & Privacy (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.33)

Add feedback

CS-Sum: A Benchmark for Code-Switching Dialogue Summarization and the Limits of Large Language Models

Suresh, Sathya Krishnan, Surana, Tanmay, Hao, Lim Zhi, Chng, Eng Siong

arXiv.org Artificial IntelligenceMay-21-2025

Code-switching (CS) poses a significant challenge for Large Language Models (LLMs), yet its comprehensibility remains underexplored in LLMs. We introduce CS-Sum, to evaluate the comprehensibility of CS by the LLMs through CS dialogue to English summarization. CS-Sum is the first benchmark for CS dialogue summarization across Mandarin-English (EN-ZH), Tamil-English (EN-TA), and Malay-English (EN-MS), with 900-1300 human-annotated dialogues per language pair. Evaluating ten LLMs, including open and closed-source models, we analyze performance across few-shot, translate-summarize, and fine-tuning (LoRA, QLoRA on synthetic data) approaches. Our findings show that though the scores on automated metrics are high, LLMs make subtle mistakes that alter the complete meaning of the dialogue. To this end, we introduce 3 most common type of errors that LLMs make when handling CS input. Error rates vary across CS pairs and LLMs, with some LLMs showing more frequent errors on certain language pairs, underscoring the need for specialized training on code-switched data.

artificial intelligence, large language model, natural language, (19 more...)

arXiv.org Artificial Intelligence

2505.13559

Country:

Asia > China (0.28)
North America > United States (0.28)

Genre: Research Report > New Finding (0.68)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Segment-Level Diffusion: A Framework for Controllable Long-Form Generation with Diffusion Language Models

Zhu, Xiaochen, Karadzhov, Georgi, Whitehouse, Chenxi, Vlachos, Andreas

arXiv.org Artificial IntelligenceDec-15-2024

Diffusion models have shown promise in text generation but often struggle with generating long, coherent, and contextually accurate text. Token-level diffusion overlooks word-order dependencies and enforces short output windows, while passage-level diffusion struggles with learning robust representation for long-form text. To address these challenges, we propose Segment-Level Diffusion (SLD), a framework that enhances diffusion-based text generation through text segmentation, robust representation training with adversarial and contrastive learning, and improved latent-space guidance. By segmenting long-form outputs into separate latent representations and decoding them with an autoregressive decoder, SLD simplifies diffusion predictions and improves scalability. Experiments on XSum, ROCStories, DialogSum, and DeliData demonstrate that SLD achieves competitive or superior performance in fluency, coherence, and contextual compatibility across automatic and human evaluation metrics comparing with other diffusion and autoregressive baselines. Ablation studies further validate the effectiveness of our segmentation and representation learning strategies.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2412.11333

Country:

North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Europe > United Kingdom > Wales > Blaenau Gwent (0.04)
Africa > Rwanda > Kigali > Kigali (0.04)
(17 more...)

Genre:

Research Report (0.82)
Overview (0.68)
Personal > Interview (0.46)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.68)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.67)

Add feedback

Detecting Conversational Mental Manipulation with Intent-Aware Prompting

Ma, Jiayuan, Na, Hongbin, Wang, Zimu, Hua, Yining, Liu, Yue, Wang, Wei, Chen, Ling

arXiv.org Artificial IntelligenceDec-11-2024

Mental manipulation severely undermines mental wellness by covertly and negatively distorting decision-making. While there is an increasing interest in mental health care within the natural language processing community, progress in tackling manipulation remains limited due to the complexity of detecting subtle, covert tactics in conversations. In this paper, we propose Intent-Aware Prompting (IAP), a novel approach for detecting mental manipulations using large language models (LLMs), providing a deeper understanding of manipulative tactics by capturing the underlying intents of participants. Experimental results on the MentalManip dataset demonstrate superior effectiveness of IAP against other advanced prompting strategies. Notably, our approach substantially reduces false negatives, helping detect more instances of mental manipulation with minimal misjudgment of positive cases. The code of this paper is available at https://github.com/Anton-Jiayuan-MA/Manip-IAP.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2412.08414

Country:

Asia > Thailand > Bangkok > Bangkok (0.05)
Oceania > Australia > New South Wales (0.04)
North America > United States > New York > New York County > New York City (0.04)
(4 more...)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.52)

Add feedback

Towards Cross-Tokenizer Distillation: the Universal Logit Distillation Loss for LLMs

Boizard, Nicolas, Haddad, Kevin El, Hudelot, Céline, Colombo, Pierre

arXiv.org Artificial IntelligenceFeb-20-2024

Deploying large language models (LLMs) of several billion parameters can be impractical in most industrial use cases due to constraints such as cost, latency limitations, and hardware accessibility. Knowledge distillation (KD) offers a solution by compressing knowledge from resource-intensive large models to smaller ones. Various strategies exist, some relying on the text generated by the teacher model and optionally utilizing his logits to enhance learning. However, these methods based on logits often require both teacher and student models to share the same tokenizer, limiting their applicability across different LLM families. In this paper, we introduce Universal Logit Distillation (ULD) loss, grounded in optimal transport, to address this limitation. Our experimental results demonstrate the effectiveness of ULD loss in enabling distillation across models with different architectures and tokenizers, paving the way to a more widespread use of distillation techniques.

distillation, student model, uld loss, (14 more...)

arXiv.org Artificial Intelligence

2402.1203

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)
North America > United States > Texas (0.04)
(4 more...)

Genre: Research Report > New Finding (0.66)

Industry:

Health & Medicine (1.00)
Education (1.00)
Government > Military (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

Zero-shot Conversational Summarization Evaluations with small Large Language Models

Manuvinakurike, Ramesh, Sahay, Saurav, Manepalli, Sangeeta, Nachman, Lama

arXiv.org Artificial IntelligenceNov-29-2023

However, their capabilities on conversational summarization remains under explored. In this work we evaluate LLMs ( 10 billion parameters) on conversational summarization and showcase their performance on various prompts. We show that the summaries generated by models depend on the instructions and the performance of LLMs vary with different instructions sometimes resulting steep drop in ROUGE scores if prompts are not selected carefully. We also evaluate the models with human evaluations and discuss the limitations of the models on conversational summarization.

dialogue, dialogue 0, summarize, (15 more...)

arXiv.org Artificial Intelligence

2311.18041

Country:

North America > United States (0.67)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
Europe > France > Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine (1.00)
Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Frugal Prompting for Dialog Models

Santra, Bishal, Basak, Sakya, De, Abhinandan, Gupta, Manish, Goyal, Pawan

arXiv.org Artificial IntelligenceNov-5-2023

The use of large language models (LLMs) in natural language processing (NLP) tasks is rapidly increasing, leading to changes in how researchers approach problems in the field. To fully utilize these models' abilities, a better understanding of their behavior for different input protocols is required. With LLMs, users can directly interact with the models through a text-based interface to define and solve various tasks. Hence, understanding the conversational abilities of these LLMs, which may not have been specifically trained for dialog modeling, is also important. This study examines different approaches for building dialog systems using LLMs by considering various aspects of the prompt. As part of prompt tuning, we experiment with various ways of providing instructions, exemplars, current query and additional context. The research also analyzes the representations of dialog history that have the optimal usable-information density. Based on the findings, the paper suggests more compact ways of providing dialog history information while ensuring good performance and reducing model's inference-API costs. The research contributes to a better understanding of how LLMs can be effectively used for building interactive systems.

person1, person1 and person2, person2, (13 more...)

arXiv.org Artificial Intelligence

2305.14919

Country:

North America > United States > Iowa (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > California > San Francisco County > San Francisco (0.04)
(14 more...)

Genre: Research Report > New Finding (0.92)

Industry:

Media > Film (1.00)
Leisure & Entertainment > Sports > Football (1.00)
Health & Medicine (1.00)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback