 cryptic crossword


Devious humour and painful puns: will the cryptic crossword remain the last thing AI can't conquer?

The Guardian

The Times hosts an annual crossword-solving competition and it remains, until such time as the Guardian has its own version, the gold standard. This year's competitors included a dog. Or rather, an AI represented as a jolly coffee-drinking dog named Ross (a name hidden in "crossword"), which is embedded in the Crossword Genius smartphone app. The human competitors at the event – which took place at the London headquarters of the Times' parent company, News UK, in the shadow of the Shard – were, as usual, bafflingly fast: pondering the next clue while scribbling the letters of the previous one. An AI can conceivably "think" about multiple puzzles at once: so did it outwit us mortals?

Proving that Cryptic Crossword Clue Answers are Correct

Andrews, Martin, Witteveen, Sam

arXiv.org Artificial Intelligence

Cryptic crossword clues are challenging cognitive tasks, for which new test sets are released on a daily basis by multiple international newspapers. Each cryptic clue contains both the definition of the answer to be placed in the crossword grid (in common with regular crosswords), and 'wordplay' that proves that the answer is correct (i.e. a human solver can be confident that an answer is correct without needing crossing words to confirm it). Using an existing cryptic wordplay proving framework (operating on Python proofs created by an LLM), we show that it is possible to distinguish between correct answers and almost-correct ones based upon whether the wordplay 'works'.
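The idea of a wordplay "proof" that either works or fails can be illustrated with a minimal Python sketch. This is not the framework used in the paper: the clue, the helper names `is_anagram` and `proof_holds`, and the restriction to anagram wordplay are all assumptions made for illustration.

```python
# Hypothetical clue for illustration: "Silent, strangely, to hear (6)".
# Definition: "to hear"; wordplay: anagram ("strangely") of SILENT.

def is_anagram(fodder: str, answer: str) -> bool:
    """True if both strings use exactly the same letters, ignoring case and spaces."""
    def letters(s: str) -> list:
        return sorted(c for c in s.upper() if c.isalpha())
    return letters(fodder) == letters(answer)

def proof_holds(answer: str, fodder: str, length: int) -> bool:
    """Run the wordplay checks as assertions; a failed assertion means the proof fails."""
    try:
        assert len(answer) == length, "answer does not match the enumeration"
        assert is_anagram(fodder, answer), "anagram fodder does not yield the answer"
    except AssertionError:
        return False
    return True

# A correct answer passes; an almost-correct one with the right length fails.
print(proof_holds("LISTEN", "silent", 6))
print(proof_holds("LISTED", "silent", 6))
```

Both candidates fit the enumeration, so only the wordplay check separates them, which is the property the abstract relies on.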


Language Models are Crossword Solvers

Saha, Soumadeep, Chakraborty, Sutanoya, Saha, Saptarshi, Garain, Utpal

arXiv.org Artificial Intelligence

Crosswords are a form of word puzzle that require a solver to demonstrate a high degree of proficiency in natural language understanding, wordplay, reasoning, and world knowledge, along with adherence to character and length constraints. In this paper we tackle the challenge of solving crosswords with Large Language Models (LLMs). We demonstrate that the current generation of state-of-the-art (SoTA) language models show significant competence at deciphering cryptic crossword clues, and outperform previously reported SoTA results by a factor of 2-3 in relevant benchmarks. We also develop a search algorithm that builds off this performance to tackle the problem of solving full crossword grids with LLMs for the very first time, achieving an accuracy of 93% on New York Times crossword puzzles. Contrary to previous work in this area, which concluded that LLMs lag human expert performance significantly, our research suggests this gap is a lot narrower.
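The "character and length constraints" mentioned above can be made concrete with a short Python sketch. It is not the paper's search algorithm; it only shows how candidate answers might be filtered against a partially filled grid slot. The function names `fits` and `filter_candidates` and the `?` placeholder convention are assumptions.

```python
def fits(pattern: str, candidate: str) -> bool:
    """Check a candidate against a slot pattern; '?' marks an unknown cell."""
    if len(candidate) != len(pattern):
        return False  # length constraint
    return all(p == "?" or p == c
               for p, c in zip(pattern.upper(), candidate.upper()))  # character constraint

def filter_candidates(pattern: str, candidates: list) -> list:
    """Keep only the candidates consistent with the letters already in the grid."""
    return [c for c in candidates if fits(pattern, c)]

# Slot of six cells with L in the first cell and E in the fifth.
print(filter_candidates("L???E?", ["LISTEN", "LISTED", "SILENT", "LISTENS"]))
```

A filter like this leaves several candidates standing whenever crossing letters are sparse, which is why a grid solver needs a search over slots rather than a single pass.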


Are LLMs Good Cryptic Crossword Solvers?

Sadallah, Abdelrahman "Boda", Kotova, Daria, Kochmar, Ekaterina

arXiv.org Artificial Intelligence

Cryptic crosswords are puzzles that rely not only on general knowledge but also on the solver's ability to manipulate language on different levels and deal with various types of wordplay. Previous research suggests that solving such puzzles is a challenge even for modern NLP models. However, the abilities of large language models (LLMs) have not yet been tested on this task. In this paper, we establish the benchmark results for three popular LLMs -- LLaMA2, Mistral, and ChatGPT -- showing that their performance on this task is still far from that of humans.


Decrypting Cryptic Crosswords: Semantically Complex Wordplay Puzzles as a Target for NLP

Rozner, Josh, Potts, Christopher, Mahowald, Kyle

arXiv.org Artificial Intelligence

Cryptic crosswords, the dominant English-language crossword variety in the United Kingdom, can be solved by expert humans using flexible, creative intelligence and knowledge of language. Cryptic clues read like fluent natural language, but they are adversarially composed of two parts: a definition and a wordplay cipher requiring sub-word or character-level manipulations. As such, they are a promising target for evaluating and advancing NLP systems that seek to process language in more creative, human-like ways. We present a dataset of cryptic crossword clues from a major newspaper that can be used as a benchmark and train a sequence-to-sequence model to solve them. We also develop related benchmarks that can guide development of approaches to this challenging task. We show that performance can be substantially improved using a novel curriculum learning approach in which the model is pre-trained on related tasks involving, e.g., unscrambling words, before it is trained to solve cryptics. However, even this curricular approach does not generalize to novel clue types in the way that humans can, and so cryptic crosswords remain a challenge for NLP systems and a potential source of future innovation.
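A curriculum pre-training task such as word unscrambling might be generated along the following lines. This is a sketch under stated assumptions, not the authors' data pipeline: the function name `make_unscramble_examples`, the `"unscramble: "` prompt prefix, and the dictionary layout are all hypothetical.

```python
import random

def make_unscramble_examples(words, seed=0):
    """Build (input, target) pairs for a sequence-to-sequence unscrambling task."""
    rng = random.Random(seed)  # seeded so the generated data is reproducible
    examples = []
    for word in words:
        letters = list(word)
        rng.shuffle(letters)  # the shuffle may occasionally reproduce the original order
        examples.append({
            "input": "unscramble: " + "".join(letters),
            "target": word,
        })
    return examples

for example in make_unscramble_examples(["listen", "crossword", "cryptic"]):
    print(example["input"], "->", example["target"])
```

In a curriculum setup, pairs like these would be used for an initial training phase before switching to real clue and answer pairs.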


Cryptonite: A Cryptic Crossword Benchmark for Extreme Ambiguity in Language

Efrat, Avia, Shaham, Uri, Kilman, Dan, Levy, Omer

arXiv.org Artificial Intelligence

Current NLP datasets targeting ambiguity can be solved by a native speaker with relative ease. We present Cryptonite, a large-scale dataset based on cryptic crosswords, which is both linguistically complex and naturally sourced. Each example in Cryptonite is a cryptic clue, a short phrase or sentence with a misleading surface reading, whose solving requires disambiguating semantic, syntactic, and phonetic wordplays, as well as world knowledge. Cryptic clues pose a challenge even for experienced solvers, though top-tier experts can solve them with almost 100% accuracy. Cryptonite is a challenging task for current models; fine-tuning T5-Large on 470k cryptic clues achieves only 7.6% accuracy, on par with the accuracy of a rule-based clue solver (8.6%).