AITopics | ldd

Collaborating Authors

ldd

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Final policy RL fine-tuning Envir

Neural Information Processing SystemsFeb-8-2026, 22:48:45 GMT

Figure dynamics into predict learning from green.

artificial intelligence, machine learning, zhongetal, (14 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.69)

Add feedback

Semantics as a Shield: Label Disguise Defense (LDD) against Prompt Injection in LLM Sentiment Classification

Li, Yanxi, Shan, Ruocheng

arXiv.org Artificial IntelligenceDec-1-2025

Large language models are increasingly used for text classification tasks such as sentiment analysis, yet their reliance on natural language prompts exposes them to prompt injection attacks. In particular, class-directive injections exploit knowledge of the model's label set (e.g., positive vs. negative) to override its intended behavior through adversarial instructions. Existing defenses, such as detection-based filters, instruction hierarchies, and signed prompts, either require model retraining or remain vulnerable to obfuscation. This paper introduces Label Disguise Defense (LDD), a lightweight and model-agnostic strategy that conceals true labels by replacing them with semantically transformed or unrelated alias labels(e.g., blue vs. yellow). The model learns these new label mappings implicitly through few-shot demonstrations, preventing direct correspondence between injected directives and decision outputs. We evaluate LDD across nine state-of-the-art models, including GPT-5, GPT-4o, LLaMA3.2, Gemma3, and Mistral variants, under varying few-shot and an adversarial setting. Our results show that the ability of LDD to recover performance lost to the adversarial attack varies across models and alias choices. For every model evaluated, LDD is able to restore a portion of the accuracy degradation caused by the attack. Moreover, for the vast majority of models, we can identify more than one alias pair that achieves higher accuracy than the under-attack baseline, in which the model relies solely on few-shot learning without any defensive mechanism. A linguistic analysis further reveals that semantically aligned alias labels(e.g., good vs. bad) yield stronger robustness than unaligned symbols(e.g., blue vs. yellow). Overall, this study demonstrates that label semantics can serve as an effective defense layer, transforming meaning itself into a shield against prompt injection.

alias, large language model, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2511.21752

Country: North America > United States (0.46)

Genre: Research Report > New Finding (0.87)

Industry: Information Technology > Security & Privacy (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

Add feedback

For GPT-4 as with Humans: Information Structure Predicts Acceptability of Long-Distance Dependencies

Cuneo, Nicole, Graves, Eleanor, Rakshit, Supantho, Goldberg, Adele E.

arXiv.org Artificial IntelligenceMay-15-2025

It remains debated how well any LM understands natural language or generates reliable metalinguistic judgments. Moreover, relatively little work has demonstrated that LMs can represent and respect subtle relationships between form and function proposed by linguists. We here focus on a particular such relationship established in recent work: English speakers' judgments about the information structure of canonical sentences predicts independently collected acceptability ratings on corresponding 'long distance dependency' [LDD] constructions, across a wide array of base constructions and multiple types of LDDs. To determine whether any LM captures this relationship, we probe GPT-4 on the same tasks used with humans and new extensions.Results reveal reliable metalinguistic skill on the information structure and acceptability tasks, replicating a striking interaction between the two, despite the zero-shot, explicit nature of the tasks, and little to no chance of contamination [Studies 1a, 1b]. Study 2 manipulates the information structure of base sentences and confirms a causal relationship: increasing the prominence of a constituent in a context sentence increases the subsequent acceptability ratings on an LDD construction. The findings suggest a tight relationship between natural and GPT-4 generated English, and between information structure and syntax, which begs for further exploration.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2505.09005

Country: North America > United States (0.46)

Genre: Research Report > New Finding (0.88)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Language-Driven Dual Style Mixing for Single-Domain Generalized Object Detection

Qin, Hongda, Lu, Xiao, Wei, Zhiyong, Cao, Yihong, Yang, Kailun, Chen, Ningjiang

arXiv.org Artificial IntelligenceMay-13-2025

Generalizing an object detector trained on a single domain to multiple unseen domains is a challenging task. Existing methods typically introduce image or feature augmentation to diversify the source domain to raise the robustness of the detector. Vision-Language Model (VLM)-based augmentation techniques have been proven to be effective, but they require that the detector's backbone has the same structure as the image encoder of VLM, limiting the detector framework selection. To address this problem, we propose Language-Driven Dual Style Mixing (LDDS) for single-domain generalization, which diversifies the source domain by fully utilizing the semantic information of the VLM. Specifically, we first construct prompts to transfer style semantics embedded in the VLM to an image translation network. This facilitates the generation of style diversified images with explicit semantic information. Then, we propose image-level style mixing between the diversified images and source domain images. This effectively mines the semantic information for image augmentation without relying on specific augmentation selections. Finally, we propose feature-level style mixing in a double-pipeline manner, allowing feature augmentation to be model-agnostic and can work seamlessly with the mainstream detector frameworks, including the one-stage, two-stage, and transformer-based detectors. Extensive experiments demonstrate the effectiveness of our approach across various benchmark datasets, including real to cartoon and normal to adverse weather tasks. The source code and pre-trained models will be publicly available at https://github.com/qinhongda8/LDDS.

information, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2505.07219

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback

Refining Dimensions for Improving Clustering-based Cross-lingual Topic Models

Chang, Chia-Hsuan, Huang, Tien-Yuan, Tsai, Yi-Hang, Chang, Chia-Ming, Hwang, San-Yih

arXiv.org Artificial IntelligenceDec-16-2024

Recent works in clustering-based topic models perform well in monolingual topic identification by introducing a pipeline to cluster the contextualized representations. However, the pipeline is suboptimal in identifying topics across languages due to the presence of language-dependent dimensions (LDDs) generated by multilingual language models. To address this issue, we introduce a novel, SVD-based dimension refinement component into the pipeline of the clustering-based topic model. This component effectively neutralizes the negative impact of LDDs, enabling the model to accurately identify topics across languages. Our experiments on three datasets demonstrate that the updated pipeline with the dimension refinement component generally outperforms other state-of-the-art cross-lingual topic models.

dimension, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2412.12433

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Asia > Middle East > Jordan (0.04)
Europe > Italy (0.04)
(7 more...)

Genre: Research Report (1.00)

Industry: Banking & Finance (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

Improving Policy Learning via Language Dynamics Distillation

Zhong, Victor, Mu, Jesse, Zettlemoyer, Luke, Grefenstette, Edward, Rocktäschel, Tim

arXiv.org Artificial IntelligenceSep-30-2022

Recent work has shown that augmenting environments with language descriptions improves policy learning. However, for environments with complex language abstractions, learning how to ground language to observations is difficult due to sparse, delayed rewards. We propose Language Dynamics Distillation (LDD), which pretrains a model to predict environment dynamics given demonstrations with language descriptions, and then fine-tunes these language-aware pretrained representations via reinforcement learning (RL). In this way, the model is trained to both maximize expected reward and retain knowledge about how language relates to environment dynamics. On SILG, a benchmark of five tasks with language descriptions that evaluate distinct generalization challenges on unseen environments (NetHack, ALFWorld, RTFM, Messenger, and Touchdown), LDD outperforms tabula-rasa RL, VAE pretraining, and methods that learn from unlabeled demonstrations in inverse RL and reward shaping with pretrained experts. In our analyses, we show that language descriptions in demonstrations improve sample-efficiency and generalization across environments, and that dynamics modelling with expert demonstrations is more effective than with non-experts.

demonstration, machine learning, reinforcement learning, (15 more...)

arXiv.org Artificial Intelligence

2210.00066

Genre: Research Report (0.50)

Industry: Leisure & Entertainment (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Using Regular Languages to Explore the Representational Capacity of Recurrent Neural Architectures

Mahalunkar, Abhijit, Kelleher, John D.

arXiv.org Machine LearningAug-15-2018

The presence of Long Distance Dependencies (LDDs) in sequential data poses significant challenges for computational models. Various recurrent neural architectures have been designed to mitigate this issue. In order to test these state-of-the-art architectures, there is growing need for rich benchmarking datasets. However, one of the drawbacks of existing datasets is the lack of experimental control with regards to the presence and/or degree of LDDs. This lack of control limits the analysis of model performance in relation to the specific challenge posed by LDDs. One way to address this is to use synthetic data having the properties of subregular languages. The degree of LDDs within the generated data can be controlled through the k parameter, length of the generated strings, and by choosing appropriate forbidden strings. In this paper, we explore the capacity of different RNN extensions to model LDDs, by evaluating these models on a sequence of SPk synthesized datasets, where each subsequent dataset exhibits a longer degree of LDD. Even though SPk are simple languages, the presence of LDDs does have significant impact on the performance of recurrent neural architectures, thus making them prime candidate in benchmarking tasks.

artificial intelligence, deep learning, machine learning, (17 more...)

arXiv.org Machine Learning

1808.05128

Country:

Oceania > Australia > New South Wales > Sydney (0.04)
Europe > Ireland (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)

Add feedback

Orthogonality-Promoting Distance Metric Learning: Convex Relaxation and Theoretical Analysis

Xie, Pengtao, Wu, Wei, Zhu, Yichen, Xing, Eric P.

arXiv.org Machine LearningFeb-16-2018

Distance metric learning (DML), which learns a distance metric from labeled "similar" and "dissimilar" data pairs, is widely utilized. Recently, several works investigate orthogonality-promoting regularization (OPR), which encourages the projection vectors in DML to be close to being orthogonal, to achieve three effects: (1) high balancedness -- achieving comparable performance on both frequent and infrequent classes; (2) high compactness -- using a small number of projection vectors to achieve a "good" metric; (3) good generalizability -- alleviating overfitting to training data. While showing promising results, these approaches suffer three problems. First, they involve solving non-convex optimization problems where achieving the global optimal is NP-hard. Second, it lacks a theoretical understanding why OPR can lead to balancedness. Third, the current generalization error analysis of OPR is not directly on the regularizer. In this paper, we address these three issues by (1) seeking convex relaxations of the original nonconvex problems so that the global optimal is guaranteed to be achievable; (2) providing a formal analysis on OPR's capability of promoting balancedness; (3) providing a theoretical analysis that directly reveals the relationship between OPR and generalization performance. Experiments on various datasets demonstrate that our convex methods are more effective in promoting balancedness, compactness, and generalization, and are computationally more efficient, compared with the nonconvex methods.

artificial intelligence, machine learning, regularizer, (18 more...)

arXiv.org Machine Learning

1802.06014

Genre: Research Report (0.82)

Industry: Health & Medicine > Health Care Providers & Services (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.66)

Add feedback

Online Repositioning in Bike Sharing Systems

Lowalekar, Meghna (Singapore Management University) | Varakantham, Pradeep (Singapore Management University) | Ghosh, Supriyo (Singapore Management University) | Jena, Sanjay Dominik (Université du Québec à Montréal) | Jaillet, Patrick (Massachusetts Institute of Technology)

AAAI ConferencesJun-14-2017

Due to increased traffic congestion and carbon emissions, Bike Sharing Systems (BSSs) are adopted in various cities for short distance travels, specifically for last mile transportation. The success of a bike sharing system depends on its ability to have bikes available at the "right" base stations at the "right" times. Typically, carrier vehicles are used to perform repositioning of bikes between stations so as to satisfy customer requests. Owing to the uncertainty in customer demand and day-long repositioning, the problem of having bikes available at the right base stations at the right times is a challenging one. In this paper, we propose a multi-stage stochastic formulation, to consider expected future demand over a set of scenarios to find an efficient repositioning strategy for bike sharing systems. Furthermore, we provide a Lagrangian decomposition approach (that decouples the global problem into routing and repositioning slaves and employs a novel DP approach to efficiently solve routing slave) and a greedy online anticipatory heuristic to solve large scale problems effectively and efficiently. Finally, in our experimental results, we demonstrate significant reduction in lost demand provided by our techniques on real world datasets from two bike sharing companies in comparison to existing benchmark approaches.

bike, optimization problem, télécommunications, (18 more...)

AAAI Conferences

Twenty-Seventh International Conference on Automated Planning and Scheduling

Country: North America > United States > Massachusetts (0.14)

Industry:

Telecommunications (0.68)
Energy > Oil & Gas (0.34)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)

Add feedback

Filters

Collaborating Authors

ldd

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

Final policy RL fine-tuning Envir

Semantics as a Shield: Label Disguise Defense (LDD) against Prompt Injection in LLM Sentiment Classification

51053d7b8473df7d5a2165b2a8ee9629-Paper-Conference.pdf

For GPT-4 as with Humans: Information Structure Predicts Acceptability of Long-Distance Dependencies

Language-Driven Dual Style Mixing for Single-Domain Generalized Object Detection

Refining Dimensions for Improving Clustering-based Cross-lingual Topic Models

Improving Policy Learning via Language Dynamics Distillation

Using Regular Languages to Explore the Representational Capacity of Recurrent Neural Architectures

Orthogonality-Promoting Distance Metric Learning: Convex Relaxation and Theoretical Analysis

Online Repositioning in Bike Sharing Systems