anagram
AMStraMGRAM: Adaptive Multi-cutoff Strategy Modification for ANaGRAM
Schwencke, Nilo, Rousselot, Cyriaque, Shilova, Alena, Furtlehner, Cyril
Recent works have shown that natural gradient methods can significantly outperform standard optimizers when training physics-informed neural networks (PINNs). In this paper, we analyze the training dynamics of PINNs optimized with ANaGRAM, a natural-gradient-inspired approach employing singular value decomposition with cutoff regularization. Building on this analysis, we propose a multi-cutoff adaptation strategy that further enhances ANaGRAM's performance. Experiments on benchmark PDEs validate the effectiveness of our method, which reaches machine precision on some experiments. To provide theoretical grounding, we develop a framework based on spectral theory that explains the necessity of regularization, and we extend previously established connections with Green's function theory.
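The cutoff regularization the abstract refers to can be illustrated with a minimal NumPy sketch: an update direction computed from an SVD pseudoinverse in which singular values below a relative threshold are discarded. The function name, interface, and threshold convention here are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def cutoff_pinv_step(J, residual, cutoff=1e-6):
    """Natural-gradient-style direction via a truncated-SVD pseudoinverse.

    J        : (S, P) Jacobian of the PINN residuals w.r.t. parameters.
    residual : (S,) vector of residuals at the collocation points.
    cutoff   : relative singular-value threshold; modes below
               cutoff * s_max are dropped (the regularization whose
               adaptation the paper studies).
    """
    U, s, Vt = np.linalg.svd(J, full_matrices=False)
    keep = s > cutoff * s[0]  # truncate the small spectral modes
    # delta = V diag(1/s) U^T r, restricted to the kept spectrum
    return Vt[keep].T @ ((U[:, keep].T @ residual) / s[keep])
```

A multi-cutoff strategy, in this picture, amounts to varying `cutoff` over training rather than fixing it once.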
PUZZLED: Jailbreaking LLMs through Word-Based Puzzles
As large language models (LLMs) are increasingly deployed across diverse domains, ensuring their safety has become a critical concern. In response, studies on jailbreak attacks have been actively growing. Existing approaches typically rely on iterative prompt engineering or semantic transformations of harmful instructions to evade detection. In this work, we introduce PUZZLED, a novel jailbreak method that leverages the LLM's reasoning capabilities. It masks keywords in a harmful instruction and presents them as word puzzles for the LLM to solve. We design three puzzle types (word search, anagram, and crossword) that are familiar to humans but cognitively demanding for LLMs. The model must solve the puzzle to uncover the masked words and then proceed to generate responses to the reconstructed harmful instruction. We evaluate PUZZLED on five state-of-the-art LLMs and observe a high average attack success rate (ASR) of 88.8%, specifically 96.5% on GPT-4.1 and 92.3% on Claude 3.7 Sonnet. PUZZLED is a simple yet powerful attack that transforms familiar puzzles into an effective jailbreak strategy by harnessing LLMs' reasoning capabilities.
What Makes Cryptic Crosswords Challenging for LLMs?
Sadallah, Abdelrahman, Kotova, Daria, Kochmar, Ekaterina
Cryptic crosswords are puzzles that rely on general knowledge and the solver's ability to manipulate language on different levels, dealing with various types of wordplay. Previous research suggests that solving such puzzles is challenging even for modern NLP models, including Large Language Models (LLMs). However, there is little to no research on the reasons for their poor performance on this task. In this paper, we establish benchmark results for three popular LLMs: Gemma2, LLaMA3, and ChatGPT, showing that their performance on this task is still significantly below that of humans. We also investigate why these models struggle to achieve superior performance. We release our code and the introduced datasets at https://github.com/bodasadallah/decrypting-crosswords.
ANaGRAM: A Natural Gradient Relative to Adapted Model for efficient PINNs learning
Schwencke, Nilo, Furtlehner, Cyril
In recent years, Physics Informed Neural Networks (PINNs) have received strong interest as a method to solve PDE-driven systems, in particular for data assimilation purposes. The method is still in its infancy, with many shortcomings and failures that remain not properly understood. In this paper we propose a natural gradient approach to PINNs which contributes to speeding up and improving the accuracy of the training. Based on an in-depth analysis of the differential geometric structures of the problem, we make two distinct contributions: (i) a new natural gradient algorithm that scales as $\min(P^2S, S^2P)$, where $P$ is the number of parameters and $S$ the batch size; (ii) a mathematically principled reformulation of the PINNs problem that allows the extension of natural gradient to it, with proven connections to Green's function theory.
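One plausible reading of the $\min(P^2S, S^2P)$ scaling is that the Gram matrix can be formed on whichever side is smaller: the $P \times P$ normal equations cost $O(P^2S)$, while the dual $S \times S$ system costs $O(S^2P)$. The sketch below illustrates that choice; it is not the authors' implementation, and the damping `eps` is an illustrative assumption.

```python
import numpy as np

def least_squares_small_side(J, r, eps=1e-10):
    """Solve min_d ||J d - r|| via the smaller Gram matrix.

    J : (S, P) Jacobian, r : (S,) residual vector.
    Choosing the smaller side gives the min(P^2 S, S^2 P) cost.
    """
    S, P = J.shape
    if P <= S:
        # primal normal equations: (J^T J) d = J^T r, cost O(P^2 S)
        return np.linalg.solve(J.T @ J + eps * np.eye(P), J.T @ r)
    # dual form: d = J^T (J J^T)^{-1} r, cost O(S^2 P)
    return J.T @ np.linalg.solve(J @ J.T + eps * np.eye(S), r)
```

In the overparameterized regime ($P \gg S$), the dual branch also yields the minimum-norm solution, which is the natural one for interpolation-style PINN training.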
AIs get worse at answering simple questions as they get bigger
Large language models (LLMs) seem to get less reliable at answering simple questions as they get bigger and learn from human feedback. AI developers try to improve the power of LLMs in two main ways: scaling up, i.e. giving them more training data and more computational power, and shaping up, i.e. fine-tuning them in response to human feedback. José Hernández-Orallo at the Polytechnic University of Valencia, Spain, and his colleagues examined the performance of LLMs as they scaled up and shaped up. They looked at OpenAI's GPT series of chatbots, Meta's LLaMA AI models, and BLOOM, developed by a group of researchers called BigScience. The researchers tested the AIs by posing five types of tasks: arithmetic problems, solving anagrams, geographical questions, scientific challenges and pulling out information from disorganised lists.
Investigating Antigram Behaviour using Distributional Semantics
The field of computational linguistics constantly presents new challenges and topics for research, whether analyzing changes in word usage over time or identifying relationships between pairs of seemingly unrelated words. To this point, we identify anagrams and antigrams as words possessing such unique properties. The presented work explores generating anagrams from a given word and determining whether antigram (semantically opposite anagram) relationships exist between the pairs of generated anagrams, using GloVe embeddings. We propose a rudimentary, yet interpretable, rule-based algorithm for detecting antigrams. On a small dataset of just 12 antigrams, our approach yielded an accuracy of 39%, which shows that much work remains to be done in this space.
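The pipeline the abstract describes (generate anagrams, then test pairs for semantic opposition) can be sketched as follows. The toy embeddings and the cosine-similarity threshold are illustrative assumptions; the paper uses GloVe vectors and its own rule set.

```python
import math
from itertools import permutations

def anagrams(word, vocab):
    """All vocabulary words that are anagrams of `word`, excluding itself."""
    return sorted({''.join(p) for p in permutations(word)} & vocab - {word})

def is_antigram(w1, w2, emb, threshold=-0.1):
    """Rule-of-thumb antigram test: an anagram pair whose embedding
    cosine similarity falls below a (hypothetical) threshold."""
    v1, v2 = emb[w1], emb[w2]
    dot = sum(a * b for a, b in zip(v1, v2))
    n1 = math.sqrt(sum(a * a for a in v1))
    n2 = math.sqrt(sum(a * a for a in v2))
    return dot / (n1 * n2) < threshold
```

Filtering candidate permutations against a vocabulary keeps the search tractable; brute-force permutation generation is only feasible for short words.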
Genetic Micro-Programs for Automated Software Testing with Large Path Coverage
Goschen, Jarrod, Bosman, Anna Sergeevna, Gruner, Stefan
Ongoing progress in computational intelligence (CI) has led to an increased desire to apply CI techniques for the purpose of improving software engineering processes, particularly software testing. Existing state-of-the-art automated software testing techniques focus on utilising search algorithms to discover input values that achieve high execution path coverage. These algorithms are trained on the same code that they intend to test, requiring instrumentation and lengthy search times to test each software component. This paper outlines a novel genetic programming framework, where the evolved solutions are not input values, but micro-programs that can repeatedly generate input values to efficiently explore a software component's input parameter domain. We also argue that our approach can be generalised such as to be applied to many different software systems, and is thus not specific to merely the particular software component on which it was trained.
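The key idea above is that the evolved artifact is a generator of inputs, not a single input. A minimal interpreter for such a "micro-program" might look like the sketch below; the op set and representation are hypothetical, chosen only to illustrate the concept.

```python
import random

def run_micro_program(program, n, seed=0):
    """Interpret a micro-program (a short list of (op, arg) pairs) as a
    repeatable input-value generator: each pass over the program emits
    one test input for the software component under test."""
    rng = random.Random(seed)
    x, out = 0, []
    for _ in range(n):
        for op, arg in program:
            if op == "add":
                x += arg
            elif op == "mul":
                x *= arg
            elif op == "rand":       # stochastic op for domain exploration
                x = rng.randint(-arg, arg)
        out.append(x)
    return out
```

Under this framing, the genetic search mutates and recombines `program` lists, and fitness is measured by the path coverage the generated inputs achieve.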
American English Is Now Reliant on Scrabble's Dictionary
In the mid-1970s, top players in an emerging tournament Scrabble scene persuaded the game's corporate owner to adopt a universal lexicon for competition. Players manually scraped five standard college dictionaries, recording every unique two- through eight-letter word (plus inflections) that met the game's rules. When the Official Scrabble Players Dictionary was published, in 1978, players rejoiced. "You can retire the boxing gloves and put up your swords," the Scrabble Players Newspaper wrote. "You now have an arbiter to settle all arguments."
Decrypting Cryptic Crosswords: Semantically Complex Wordplay Puzzles as a Target for NLP
Rozner, Josh, Potts, Christopher, Mahowald, Kyle
Cryptic crosswords, the dominant English-language crossword variety in the United Kingdom, can be solved by expert humans using flexible, creative intelligence and knowledge of language. Cryptic clues read like fluent natural language, but they are adversarially composed of two parts: a definition and a wordplay cipher requiring sub-word or character-level manipulations. As such, they are a promising target for evaluating and advancing NLP systems that seek to process language in more creative, human-like ways. We present a dataset of cryptic crossword clues from a major newspaper that can be used as a benchmark, and we train a sequence-to-sequence model to solve them. We also develop related benchmarks that can guide development of approaches to this challenging task. We show that performance can be substantially improved using a novel curriculum learning approach in which the model is pre-trained on related tasks involving, e.g., unscrambling words, before it is trained to solve cryptics. However, even this curricular approach does not generalize to novel clue types in the way that humans can, and so cryptic crosswords remain a challenge for NLP systems and a potential source of future innovation.
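The unscrambling pre-training stage mentioned above amounts to generating (scrambled, target) pairs for a sequence-to-sequence model. A minimal data-generation sketch follows; the exact pair format used in the paper's curriculum is an assumption here.

```python
import random

def unscramble_pairs(words, seed=0):
    """Build (scrambled_word, original_word) training pairs for an
    unscrambling curriculum stage (illustrative format)."""
    rng = random.Random(seed)
    pairs = []
    for w in words:
        letters = list(w)
        rng.shuffle(letters)
        pairs.append(("".join(letters), w))
    return pairs
```

Pre-training on such pairs teaches the model the character-level manipulations that cryptic wordplay demands, before exposure to full clues.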
Artificial Intelligence May Have Cracked Freaky 600-Year-Old Manuscript
Since its discovery over a hundred years ago, the 240-page Voynich manuscript, filled with seemingly coded language and inscrutable illustrations, has confounded linguists and cryptographers. Using artificial intelligence, Canadian researchers have taken a huge step forward in unraveling the document's hidden meaning. Named after Wilfrid Voynich, the Polish book dealer who procured the manuscript in 1912, the document is written in an unknown script that encodes an unknown language--a double-whammy of unknowns that has, until this point, been impossible to interpret. The Voynich manuscript contains hundreds of fragile pages, some missing, with hand-written text going from left to right. Most pages are adorned with illustrations of diagrams, including plants, nude figures, and astronomical symbols.