AITopics | polyglot

polyglot

Lita: Light Agent Uncovers the Agentic Coding Capabilities of LLMs

Dai, Hankun, Wang, Maoquan, Qi, Mengnan, Zhang, Yikai, Jin, Zijian, Yao, Yongqiang, Huang, Yufan, Fu, Shengyu, Nallipogu, Elsie

arXiv.org Artificial Intelligence, Oct-1-2025

Large language models (LLMs) are increasingly being applied to programming tasks, ranging from single-turn code completion to autonomous agents. Current code agent designs frequently depend on complex, hand-crafted workflows and tool sets. However, this reliance on elaborate scaffolding presents several challenges: agent performance becomes overly dependent on prompt tuning and custom design choices, heavy human intervention obscures a model's true underlying capabilities, and intricate pipelines are costly to build and maintain. Furthermore, optimizing complex task prompts increases the risk of data leakage. Currently, when introducing new models, LLM providers like OpenAI and Anthropic often publish benchmark scores to demonstrate their models' coding proficiency, but keep their proprietary evaluation frameworks confidential. To address these limitations, we introduce Lita (Lite Agent), which operationalizes liteness, a principle of minimizing manual design while retaining the essential elements of a fully autonomous agent. Lita enables a more faithful and unified evaluation without elaborate scaffolding. Experiments on the Aider Polyglot and SWE-Bench with frontier models demonstrate that Lita achieves competitive or superior performance compared to workflow-based and agentic baselines. Crucially, Lita also consumes fewer tokens and requires significantly less design effort. Our results suggest that Lita is sufficient to reveal the underlying coding competence of modern LLMs. Finally, we propose the Agent Complexity Law: the performance gap between agents of varying complexity, from simple to sophisticated designs, will shrink as the core model improves, ultimately converging to a negligible difference.

large language model, machine learning, natural language, (19 more...)

arXiv:2509.25873

Country:

North America > United States (0.46)
Europe > Austria (0.28)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Darwin Godel Machine: Open-Ended Evolution of Self-Improving Agents

Zhang, Jenny, Hu, Shengran, Lu, Cong, Lange, Robert, Clune, Jeff

arXiv.org Artificial Intelligence, Sep-29-2025

Today's AI systems have human-designed, fixed architectures and cannot autonomously and continuously improve themselves. The advance of AI could itself be automated. If done safely, that would accelerate AI development and allow us to reap its benefits much sooner. Meta-learning can automate the discovery of novel algorithms, but is limited by first-order improvements and the human design of a suitable search space. The Gödel machine proposed a theoretical alternative: a self-improving AI that repeatedly modifies itself in a provably beneficial manner. Unfortunately, proving that most changes are net beneficial is impossible in practice. We introduce the Darwin Gödel Machine (DGM), a self-improving system that iteratively modifies its own code (thereby also improving its ability to modify its own codebase) and empirically validates each change using coding benchmarks. Inspired by Darwinian evolution and open-endedness research, the DGM maintains an archive of generated coding agents. It grows the archive by sampling an agent from it and using a foundation model to create a new, interesting, version of the sampled agent. This open-ended exploration forms a growing tree of diverse, high-quality agents and allows the parallel exploration of many different paths through the search space. Empirically, the DGM automatically improves its coding capabilities (e.g., better code editing tools, long-context window management, peer-review mechanisms), increasing performance on SWE-bench from 20.0% to 50.0%, and on Polyglot from 14.2% to 30.7%. Furthermore, the DGM significantly outperforms baselines without self-improvement or open-ended exploration. All experiments were done with safety precautions (e.g., sandboxing, human oversight). The DGM is a significant step toward self-improving AI, capable of gathering its own stepping stones along paths that unfold into endless innovation.
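The archive-based loop described in this abstract can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the authors' implementation: agents are plain numbers, `mutate` is a hypothetical stand-in for a foundation model proposing a modified agent, `evaluate` is a hypothetical stand-in for a coding benchmark, and parents are sampled uniformly (the abstract only says an agent is sampled from the archive).

```python
import random


def evaluate(agent: float) -> float:
    """Hypothetical benchmark stand-in: higher is better, best possible at 10."""
    return -abs(agent - 10.0)


def mutate(agent: float, rng: random.Random) -> float:
    """Hypothetical stand-in for a model proposing a modified agent."""
    return agent + rng.uniform(-1.0, 1.0)


def open_ended_search(initial: float, steps: int, seed: int = 0):
    """Grow an archive of (agent, score) pairs by repeated sample-and-modify."""
    rng = random.Random(seed)
    archive = [(initial, evaluate(initial))]
    for _ in range(steps):
        # Sample any archived agent, not only the current best, so that
        # weaker "stepping stones" can still spawn descendants.
        parent, _parent_score = rng.choice(archive)
        child = mutate(parent, rng)
        # Validate empirically and keep the child regardless of its score.
        archive.append((child, evaluate(child)))
    return archive


if __name__ == "__main__":
    archive = open_ended_search(initial=0.0, steps=200)
    best_agent, best_score = max(archive, key=lambda pair: pair[1])
    print(len(archive), round(best_score, 3))
```

Keeping every child, rather than only improvements, is what separates this archive loop from plain hill climbing: the search tree can branch from any earlier agent.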

evolutionary algorithm, large language model, machine learning, (20 more...)

arXiv:2505.22954

Country:

Europe (0.67)
North America > Canada (0.45)
North America > United States (0.28)

Genre: Research Report (1.00)

Industry:

Education (1.00)
Government (0.67)

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(6 more...)

Unify and Triumph: Polyglot, Diverse, and Self-Consistent Generation of Unit Tests with LLMs

Khelladi, Djamel Eddine, Reux, Charly, Acher, Mathieu

arXiv.org Artificial Intelligence, Mar-20-2025

Large language model (LLM)-based test generation has gained attention in software engineering, yet most studies evaluate LLMs' ability to generate unit tests in a single attempt for a given language, missing the opportunity to leverage LLM diversity for more robust testing. This paper introduces PolyTest, a novel approach that enhances test generation by exploiting polyglot and temperature-controlled diversity. PolyTest systematically leverages these properties in two complementary ways: (1) Cross-lingual test generation, where tests are generated in multiple languages at zero temperature and then unified; (2) Diverse test sampling, where multiple test sets are generated within the same language at a higher temperature before unification. A key insight is that LLMs can generate diverse yet contradicting tests -- same input, different expected outputs -- across languages and generations. PolyTest mitigates inconsistencies by unifying test sets, fostering self-consistency and improving overall test quality. Unlike single-language or single-attempt approaches, PolyTest enhances testing without requiring on-the-fly execution, making it particularly beneficial for weaker-performing languages. We evaluate PolyTest on Llama3-70B, GPT-4o, and GPT-3.5 using EvalPlus, generating tests in five languages (Java, C, Python, JavaScript, and a CSV-based format) at temperature 0 and sampling multiple sets at temperature 1. We observe that LLMs frequently generate contradicting tests across settings, and that PolyTest significantly improves test quality across all considered metrics -- number of tests, passing rate, statement/branch coverage (up to +9.01%), and mutation score (up to +11.23%). Finally, PolyTest outperforms Pynguin in test generation, passing rate, and mutation score.
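One way to picture the unification of contradicting tests is a majority vote over expected outputs. The abstract does not spell out the unification rule, so the voting and tie-dropping below are assumptions of this sketch, and `unify_tests` is a hypothetical helper rather than part of PolyTest.

```python
from collections import Counter, defaultdict


def unify_tests(test_sets):
    """Merge test sets, each a list of (input, expected_output) pairs.

    Assumed rule: for every input, keep the expected output that most
    test sets agree on; drop the input when the top candidates tie.
    """
    votes = defaultdict(Counter)
    for tests in test_sets:
        # set() gives each test set at most one vote per (input, output) pair.
        for test_input, expected in set(tests):
            votes[test_input][expected] += 1

    unified = {}
    for test_input, counter in votes.items():
        ranked = counter.most_common(2)
        if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
            continue  # contradicting outputs with no majority: discard
        unified[test_input] = ranked[0][0]
    return unified


# Tests for a hypothetical add(a, b), generated in three languages.
java_tests = [((2, 3), 5), ((0, 0), 0)]
python_tests = [((2, 3), 5), ((0, 0), 1)]
c_tests = [((2, 3), 6), ((1, 1), 2)]

print(unify_tests([java_tests, python_tests, c_tests]))
# {(2, 3): 5, (1, 1): 2}
```

No generated test is executed here, which mirrors the abstract's point that the approach does not require on-the-fly execution.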

large language model, machine learning, natural language, (7 more...)

arXiv:2503.16144

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.73)

On the Abuse and Detection of Polyglot Files

Koch, Luke, Oesch, Sean, Chaulagain, Amul, Dixon, Jared, Dixon, Matthew, Huettal, Mike, Sadovnik, Amir, Watson, Cory, Weber, Brian, Hartman, Jacob, Patulski, Richard

arXiv.org Artificial Intelligence, Jul-1-2024

A polyglot is a file that is valid in two or more formats. Polyglot files pose a problem for malware detection systems that route files to format-specific detectors/signatures, as well as file upload and sanitization tools. In this work we found that existing file-format and embedded-file detection tools, even those developed specifically for polyglot files, fail to reliably detect polyglot files used in the wild, leaving organizations vulnerable to attack. To address this issue, we studied the use of polyglot files by malicious actors in the wild, finding 30 polyglot samples and 15 attack chains that leveraged polyglot files. In this report, we highlight two well-known APTs whose cyber attack chains relied on polyglot files to bypass detection mechanisms. Using knowledge from our survey of polyglot usage in the wild -- the first of its kind -- we created a novel data set based on adversary techniques. We then trained a machine learning detection solution, PolyConv, using this data set. PolyConv achieves a precision-recall area-under-curve score of 0.999 with an F1 score of 99.20% for polyglot detection and 99.47% for file-format identification, significantly outperforming all other tools tested. We developed a content disarmament and reconstruction tool, ImSan, that successfully sanitized 100% of the tested image-based polyglots, which were the most common type found via the survey. Our work provides concrete tools and suggestions to enable defenders to better defend themselves against polyglot files, as well as directions for future work to create more robust file specifications and methods of disarmament.
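To make "valid in two or more formats" concrete, here is a naive signature scan that flags data containing the magic bytes of more than one format. It is only an illustration of the idea and is not PolyConv or ImSan. The signature table and function names are assumptions of this sketch, and a scan this simple is prone to false positives and misses, which is consistent with the abstract's finding that signature-style tools are unreliable.

```python
# Well-known leading signatures ("magic bytes") for a few formats.
SIGNATURES = {
    "png": b"\x89PNG\r\n\x1a\n",
    "jpeg": b"\xff\xd8\xff",
    "gif": b"GIF8",
    "pdf": b"%PDF-",
    "zip": b"PK\x03\x04",  # ZIP local file header
}


def formats_present(data: bytes) -> dict:
    """Map each signature found anywhere in data to its first byte offset."""
    found = {}
    for name, magic in SIGNATURES.items():
        offset = data.find(magic)
        if offset != -1:
            found[name] = offset
    return found


def looks_like_polyglot(data: bytes) -> bool:
    """Heuristic: signatures of two or more formats appear in the same data."""
    return len(formats_present(data)) >= 2


# A PNG header followed by appended ZIP content. ZIP readers locate the
# archive from the end of the file, so leading image bytes do not stop them.
sample = SIGNATURES["png"] + b"fake image bytes" + SIGNATURES["zip"] + b"payload"

print(formats_present(sample))      # {'png': 0, 'zip': 24}
print(looks_like_polyglot(sample))  # True
```

Searching the whole buffer rather than only offset 0 matters because the second format often starts partway through the file.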

file format, polyglot, polyglot file, (14 more...)

arXiv:2407.01529

Country:

North America > United States > Tennessee > Anderson County > Oak Ridge (0.05)
Asia > South Korea (0.04)
North America > United States > New York (0.04)
(4 more...)

Genre:

Research Report (0.82)
Overview (0.68)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Regional Government > North America Government > United States Government (0.68)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Software > Programming Languages (0.94)
Information Technology > Artificial Intelligence > Natural Language (0.88)
(3 more...)

Attending Form and Context to Generate Specialized Out-of-Vocabulary Words Representations

Garneau, Nicolas, Leboeuf, Jean-Samuel, Pinter, Yuval, Lamontagne, Luc

arXiv.org Machine Learning, Dec-14-2019

We propose a new contextual-compositional neural network layer that handles out-of-vocabulary (OOV) words in natural language processing (NLP) tagging tasks. This layer consists of a model that attends to both the character sequence and the context in which the OOV words appear. We show that our model learns to generate task-specific and sentence-dependent OOV word representations without the need for pre-training on an embedding table, unlike previous attempts. We insert our layer in the state-of-the-art tagging model of Plank et al. (2016) and thoroughly evaluate its contribution on 23 different languages on the task of jointly tagging part-of-speech and morphosyntactic attributes. Our OOV handling method successfully improves the performance of this model on every language but one, achieving a new state of the art on the Universal Dependencies Dataset 1.4.

oov word, representation, word representation, (11 more...)

arXiv:1912.06876

Country:

North America > Canada > Quebec (0.04)
Asia > Indonesia (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Polyglot!

Communications of the ACM, Aug-22-2019, 11:33:27 GMT

Google speaks 106 languages--or at least can understand queries in written form if not also oral form. When I watch someone interacting verbally with Google Assistant in languages other than English (my native tongue), I realize Google's language ability vastly exceeds my own. I have a modest ability to speak and understand German. I know a few phrases in Russian and French. But it suddenly strikes me that Google is usefully dealing with over 100 languages in written and oral form.

artificial intelligence, discovery, machine learning, (9 more...)

Industry: Education > Curriculum > Subject-Specific Education (0.31)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

AI Weekly: The AI research agenda for the next 20 years is being made now

#artificialintelligence, Feb-9-2019, 14:17:57 GMT

It's mind-blowing how much world-shaping work gets done in hotel ballrooms. Machine learning experts regularly gather at conferences around the world to discuss noteworthy work and how to move the industry forward. Few are fortunate enough to attend in person, but you can sometimes find video online. The most recent example: The Association for the Advancement of Artificial Intelligence (AAAI) met in Hawaii last week, and among topics discussed was the roadmap for AI research in the United States for the next 20 years. The process to create a plan for the next two decades started in November with private workshops attended by academics and people from industry.

ai research agenda, artificial intelligence, social media, (14 more...)

Country: North America > United States > Hawaii (0.25)

Industry:

Information Technology > Services (0.31)
Government > Regional Government > North America Government > United States Government (0.31)

Technology:

Information Technology > Communications > Social Media (0.34)
Information Technology > Artificial Intelligence > Robots (0.33)

Facebook's 'polyglot' AI speaks English, German, and Spanish

#artificialintelligence, Feb-8-2019, 19:08:31 GMT

But they hold particular promise in the text-to-speech (TTS) realm, as evidenced by systems like Google's WaveNet, Baidu's DeepVoice, and WaveLoop. Another case in point: an artificially intelligent (AI) 'polyglot' system created by researchers at Facebook that's able to, given voice data, produce new speech samples in multiple languages. The team describes their work in a paper ("Unsupervised Polyglot Text-to-Speech") published on the preprint server arXiv.org. "The … [AI] is able to transfer a voice, which was presented as a sample in a source language, into one of several target languages," they wrote. "[It can] take a sample of a speaker talking in one language and have [them] … speak as a native speaker in another language."

artificial intelligence, social media, source language, (8 more...)

Genre: Research Report (0.37)

Industry: Information Technology > Services (0.42)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence (1.00)
