

[…] solid [R1, R3, R4], our experimental results valuable [R2, R3, R4], and our paper well-written [R1, R3, R4]

Neural Information Processing Systems

We only included a single environment (Pusher-v2) in the main paper in order to save space. We will include the suggested references in the paper. See also "About multi-step rollouts". The reviewer suggests that the paper should first "show that minimizing the TD-error is not […]". Notice, however, that despite being commonly used and thought of as "intuitive", […] Furthermore, Figure 1 indeed shows that minimizing the TD-error can lead to a critic that is far from the ideal one. We did not write that "model-based RL has no advantage in terms of sample-efficiency over model-free RL".


Universal Adversarial Suffixes for Language Models Using Reinforcement Learning with Calibrated Reward

Soor, Sampriti, Ghosh, Suklav, Sur, Arijit

arXiv.org Artificial Intelligence

Language models are vulnerable to short adversarial suffixes that can reliably alter predictions. Previous works usually find such suffixes with gradient search or rule-based methods, but these are brittle and often tied to a single task or model. In this paper, we use a reinforcement learning framework in which the suffix is treated as a policy and trained with Proximal Policy Optimization against a frozen model serving as a reward oracle. Rewards are shaped using calibrated cross-entropy, removing label bias and aggregating across surface forms to improve transferability. The proposed method is evaluated on five diverse NLP benchmark datasets, covering sentiment, natural language inference, paraphrase, and commonsense reasoning, using three distinct language models: Qwen2-1.5B Instruct, TinyLlama-1.1B Chat, and Phi-1.5. Results show that RL-trained suffixes consistently degrade accuracy and transfer more effectively across tasks and models than previous adversarial triggers of a similar kind.
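The calibrated-reward idea described in this abstract can be sketched in a few lines. Everything below is illustrative, not the paper's implementation: the toy log-probabilities, the surface forms, and the exact aggregation (a log-sum-exp over the gold label's surface forms, with a clean-prompt baseline subtracted) are assumptions.

```python
import math

def logsumexp(xs):
    """Numerically stable log(sum(exp(x)))."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

def calibrated_reward(lp_with_suffix, lp_baseline):
    """Hypothetical reward shaping: aggregate the gold label's log-prob over
    its surface forms (e.g. "positive", "Positive", " pos"), subtract a
    suffix-free baseline to remove label bias, and reward the *increase* in
    cross-entropy caused by the adversarial suffix."""
    ce_attacked = -logsumexp(lp_with_suffix)
    ce_clean = -logsumexp(lp_baseline)
    return ce_attacked - ce_clean  # positive when the suffix hurts the gold label

# Toy log-probs for three surface forms of the gold label:
clean = [math.log(0.5), math.log(0.2), math.log(0.05)]      # gold mass 0.75
attacked = [math.log(0.1), math.log(0.05), math.log(0.01)]  # gold mass 0.16
print(calibrated_reward(attacked, clean) > 0)  # True: the suffix degraded the gold label
```

The baseline subtraction is what makes the reward "calibrated": a suffix is only rewarded for damage beyond the model's inherent bias toward or against a label.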


TurBLiMP: A Turkish Benchmark of Linguistic Minimal Pairs

Başar, Ezgi, Padovani, Francesca, Jumelet, Jaap, Bisazza, Arianna

arXiv.org Artificial Intelligence

We introduce TurBLiMP, the first Turkish benchmark of linguistic minimal pairs, designed to evaluate the linguistic abilities of monolingual and multilingual language models (LMs). Covering 16 linguistic phenomena with 1000 minimal pairs each, TurBLiMP fills an important gap in linguistic evaluation resources for Turkish. In designing the benchmark, we give extra attention to two properties of Turkish that remain understudied in current syntactic evaluations of LMs, namely word order flexibility and subordination through morphological processes. Our experiments on a wide range of LMs and a newly collected set of human acceptability judgments reveal that even cutting-edge large LMs still struggle with grammatical phenomena that are not challenging for humans, and may also exhibit different sensitivities to word order and morphological complexity compared to humans.
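Minimal-pair evaluation of the kind TurBLiMP performs reduces to checking whether a model scores the grammatical member of each pair higher. A minimal sketch, with a toy scorer standing in for a real LM's summed token log-probabilities, and invented placeholder pairs rather than actual benchmark items:

```python
def pair_accuracy(pairs, score):
    """Fraction of (grammatical, ungrammatical) pairs the scorer ranks correctly:
    the model 'passes' a pair when score(good) > score(bad)."""
    correct = sum(1 for good, bad in pairs if score(good) > score(bad))
    return correct / len(pairs)

# Toy scorer: prefers shorter sentences. A real LM scorer would return the
# sum of per-token log-probabilities of the whole sentence.
toy_score = lambda s: -len(s.split())

# Invented placeholder pairs (a real benchmark pairs a grammatical sentence
# with a minimally different ungrammatical one):
pairs = [
    ("the cat sleeps", "the cat sleeps sleeps"),
    ("dogs bark", "dogs bark bark ing"),
]
print(pair_accuracy(pairs, toy_score))  # 1.0 on this toy data
```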


Extracting memorized pieces of (copyrighted) books from open-weight language models

Cooper, A. Feder, Gokaslan, Aaron, Ahmed, Ahmed, Cyphert, Amy B., De Sa, Christopher, Lemley, Mark A., Ho, Daniel E., Liang, Percy

arXiv.org Artificial Intelligence

Plaintiffs and defendants in copyright lawsuits over generative AI often make sweeping, opposing claims about the extent to which large language models (LLMs) have memorized plaintiffs' protected expression in their training data. Drawing on both machine learning and copyright law, we show that these polarized positions dramatically oversimplify the relationship between memorization and copyright. To do so, we extend a recent probabilistic extraction technique to measure memorization of 50 books in 17 open-weight LLMs. Through thousands of experiments, we show that the extent of memorization varies both by model and by book. With respect to our specific extraction methodology, we find that most LLMs do not memorize most books -- either in whole or in part. However, we also find that Llama 3.1 70B entirely memorizes some books, like the first Harry Potter book and 1984. In fact, the first Harry Potter book is memorized to such an extent that, using a seed prompt consisting of just the first few tokens of the first chapter, we can deterministically generate the entire book near-verbatim. We discuss why our results have significant implications for copyright cases, though not ones that unambiguously favor either side.
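The probabilistic notion of extraction used here can be illustrated with the chain rule: the probability of producing a passage verbatim from a seed prompt is the product of per-token probabilities. A toy sketch, where the per-token numbers are made up; the paper's actual extraction procedure is considerably more involved:

```python
import math

def extraction_logprob(token_logprobs):
    """Log-probability of emitting an entire suffix verbatim given the seed
    prefix: by the chain rule, the sum of per-token log-probabilities."""
    return sum(token_logprobs)

# Toy numbers: a strongly memorized passage keeps per-token prob near 1.0,
# while an unmemorized one decays multiplicatively toward zero.
memorized = [math.log(0.999)] * 50    # 0.999^50 ~ 0.95
unmemorized = [math.log(0.5)] * 50    # 0.5^50 ~ 9e-16

print(math.exp(extraction_logprob(memorized)) > 0.9)      # True
print(math.exp(extraction_logprob(unmemorized)) < 1e-10)  # True
```

The multiplicative decay is why even modest per-token uncertainty makes long verbatim extraction astronomically unlikely, and why near-certain per-token probabilities are the signature of genuine memorization.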


CAHS-Attack: CLIP-Aware Heuristic Search Attack Method for Stable Diffusion

Xia, Shuhan, Dai, Jing, Ouyang, Hui, Shang, Yadong, Zhao, Dongxiao, Li, Peipei

arXiv.org Artificial Intelligence

Diffusion models exhibit notable fragility when faced with adversarial prompts, and strengthening attack capabilities is crucial for uncovering such vulnerabilities and building more robust generative systems. Existing works often rely on white-box access to model gradients or hand-crafted prompt engineering, which is infeasible in real-world deployments due to restricted access and often yields poor attack effectiveness. In this paper, we propose CAHS-Attack, a CLIP-Aware Heuristic Search attack method. CAHS-Attack integrates Monte Carlo Tree Search (MCTS) to perform fine-grained suffix optimization, leveraging a constrained genetic algorithm to preselect high-potential adversarial prompts as root nodes, and retaining the most semantically disruptive outcome of each simulation rollout for efficient local search. Extensive experiments demonstrate that our method achieves state-of-the-art attack performance across both short and long prompts of varying semantics. Furthermore, we find that the fragility of SD models can be attributed to the inherent vulnerability of their CLIP-based text encoders, suggesting a fundamental security risk in current text-to-image pipelines. In recent years, advances in text-to-image generation have led to the emergence of powerful models such as Stable Diffusion (SD) [1], [2], FLUX [3], and MMaDA [4], enabling users to create high-quality images from natural language prompts.
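A black-box suffix search in the spirit of the method described above can be sketched very simply. The real method uses MCTS with a genetic preselection stage and a frozen CLIP text encoder as the objective; here a toy "disruption" score and a greedy coordinate search stand in for both, so every name and number below is illustrative:

```python
def disruption(prompt, suffix):
    """Stand-in objective: a real attack would score the drop in CLIP
    text-image alignment caused by the suffix; this toy version just rewards
    suffix characters absent from the prompt."""
    return len(set(suffix) - set(prompt))

def search(prompt, alphabet="abcdefgh", length=4):
    """Greedy coordinate ascent over a fixed-length suffix: at each position,
    keep the character change that most increases the disruption score."""
    best = alphabet[0] * length
    for pos in range(length):
        for ch in alphabet:
            cand = best[:pos] + ch + best[pos + 1:]
            if disruption(prompt, cand) > disruption(prompt, best):
                best = cand
    return best

print(search("a cat"))  # a 4-char suffix maximizing the toy disruption score
```

MCTS replaces this myopic greedy loop with lookahead over suffix edits, and the genetic stage seeds the tree with already-promising candidates instead of a blank suffix.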


Hidden markov model to predict tourists visited place

Demessance, Theo, Bi, Chongke, Djebali, Sonia, Guerard, Guillaume

arXiv.org Artificial Intelligence

Nowadays, social networks have become a popular way of analyzing tourist behavior, thanks to the digital traces left by travelers during their stays. The massive amount of data generated, driven by the propensity of tourists to share comments and photos during their trips, makes it possible to model their journeys and analyze their behavior. Predicting the next movement of tourists plays a key role in tourism marketing, helping to understand demand and improve decision support. In this paper, we propose a method to understand and learn tourists' movements from social network data in order to predict their future movements. The method relies on a grammatical inference algorithm from machine learning. A major contribution of this paper is adapting the grammatical inference algorithm to the context of big data. Our method produces a hidden Markov model representing the movements of a group of tourists. The hidden Markov model is flexible and can be updated with new data. Paris, the capital of France, is selected to demonstrate the efficiency of the proposed methodology.
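Once such a model is learned, next-place prediction from its transition matrix is a simple argmax over the current state's row. A toy sketch with an invented three-place matrix; the places and probabilities below are illustrative, not the paper's results:

```python
# Illustrative transition matrix over visited places (rows sum to 1.0).
# In a real HMM these probabilities would be estimated from traveler traces.
transition = {
    "Louvre":       {"Louvre": 0.1, "Eiffel Tower": 0.6, "Notre-Dame": 0.3},
    "Eiffel Tower": {"Louvre": 0.2, "Eiffel Tower": 0.1, "Notre-Dame": 0.7},
    "Notre-Dame":   {"Louvre": 0.5, "Eiffel Tower": 0.4, "Notre-Dame": 0.1},
}

def predict_next(current):
    """Most likely next place: the argmax entry of the current place's row."""
    row = transition[current]
    return max(row, key=row.get)

print(predict_next("Louvre"))  # Eiffel Tower
```

Updating the model with new data then amounts to re-estimating these rows, which is what makes the HMM "flexible and editable" as the abstract puts it.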


Contextual morphologically-guided tokenization for Latin encoder models

Hudspeth, Marisa, Burns, Patrick J., O'Connor, Brendan

arXiv.org Artificial Intelligence

Tokenization is a critical component of language model pretraining, yet standard tokenization methods often prioritize information-theoretical goals like high compression and low fertility rather than linguistic goals like morphological alignment. In fact, they have been shown to be suboptimal for morphologically rich languages, where tokenization quality directly impacts downstream performance. In this work, we investigate morphologically-aware tokenization for Latin, a morphologically rich language that is medium-resource in terms of pretraining data, but high-resource in terms of curated lexical resources -- a distinction that is often overlooked but critical in discussions of low-resource language modeling. We find that morphologically-guided tokenization improves overall performance on four downstream tasks. Performance gains are most pronounced for out-of-domain texts, highlighting our models' improved generalization ability. Our findings demonstrate the utility of linguistic resources to improve language modeling for morphologically complex languages. For low-resource languages that lack large-scale pretraining data, the development and incorporation of linguistic resources can serve as a feasible alternative to improve LM performance.
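A minimal way to picture morphologically-guided tokenization is longest-match segmentation against a curated morph lexicon. The sketch below is a naive greedy version with an invented toy lexicon; a contextual approach like the one described above would disambiguate between competing analyses rather than always taking the longest match:

```python
# Invented toy lexicon of stems and endings (not a real Latin resource).
LEXICON = {"port", "portus", "us", "ibus", "am", "a"}

def segment(word, lexicon=LEXICON):
    """Greedy longest-match segmentation: at each position take the longest
    known morph, falling back to a single character for unknown material."""
    tokens, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):   # try longest candidate first
            if word[i:j] in lexicon:
                tokens.append(word[i:j])
                i = j
                break
        else:
            tokens.append(word[i])          # unknown-character fallback
            i += 1
    return tokens

print(segment("portibus"))  # ['port', 'ibus']
```

The appeal for a language like Latin is that the lexicon aligns token boundaries with stems and inflectional endings, instead of the frequency-driven splits a purely statistical subword tokenizer would produce.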