AITopics | test seed

Collaborating Authors

test seed

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

LLMs are All You Need? Improving Fuzz Testing for MOJO with Large Language Models

Huang, Linghan, Zhao, Peizhou, Chen, Huaming

arXiv.org Artificial IntelligenceOct-14-2025

The rapid development of large language models (LLMs) has revolutionized software testing, particularly fuzz testing, by automating the generation of diverse and effective test inputs. This advancement holds great promise for improving software reliability. Meanwhile, the introduction of MOJO, a high-performance AI programming language blending Python's usability with the efficiency of C and C++, presents new opportunities to enhance AI model scalability and programmability. However, as a new language, MOJO lacks comprehensive testing frameworks and a sufficient corpus for LLM-based testing, which exacerbates model hallucination. In this case, LLMs will generate syntactically valid but semantically incorrect code, significantly reducing the effectiveness of fuzz testing. To address this challenge, we propose MOJOFuzzer, the first adaptive LLM-based fuzzing framework designed for zero-shot learning environments of emerging programming languages. MOJOFuzzer integrates a mutil-phase framework that systematically eliminates low-quality generated inputs before execution, significantly improving test case validity. Furthermore, MOJOFuzzer dynamically adapts LLM prompts based on runtime feedback for test case mutation, enabling an iterative learning process that continuously enhances fuzzing efficiency and bug detection performance. Our experimental results demonstrate that MOJOFuzzer significantly enhances test validity, API coverage, and bug detection performance, outperforming traditional fuzz testing and state-of-the-art LLM-based fuzzing approaches. Using MOJOFuzzer, we have conducted a first large-scale fuzz testing evaluation of MOJO, uncorvering 13 previous unknown bugs. This study not only advances the field of LLM-driven software testing but also establishes a foundational methodology for leveraging LLMs in the testing of emerging programming languages.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2510.10179

Country: North America > United States > California (0.28)

Genre: Research Report > New Finding (0.66)

Industry: Information Technology (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Hierarchical Distribution-Aware Testing of Deep Learning

Huang, Wei, Zhao, Xingyu, Banks, Alec, Cox, Victoria, Huang, Xiaowei

arXiv.org Artificial IntelligenceSep-1-2023

Deep Learning (DL) is increasingly used in safety-critical applications, raising concerns about its reliability. DL suffers from a well-known problem of lacking robustness, especially when faced with adversarial perturbations known as Adversarial Examples (AEs). Despite recent efforts to detect AEs using advanced attack and testing methods, these approaches often overlook the input distribution and perceptual quality of the perturbations. As a result, the detected AEs may not be relevant in practical applications or may appear unrealistic to human observers. This can waste testing resources on rare AEs that seldom occur during real-world use, limiting improvements in DL model dependability. In this paper, we propose a new robustness testing approach for detecting AEs that considers both the feature level distribution and the pixel level distribution, capturing the perceptual quality of adversarial perturbations. The two considerations are encoded by a novel hierarchical mechanism. First, we select test seeds based on the density of feature level distribution and the vulnerability of adversarial robustness. The vulnerability of test seeds are indicated by the auxiliary information, that are highly correlated with local robustness. Given a test seed, we then develop a novel genetic algorithm based local test case generation method, in which two fitness functions work alternatively to control the perceptual quality of detected AEs. Finally, extensive experiments confirm that our holistic approach considering hierarchical distributions is superior to the state-of-the-arts that either disregard any input distribution or only consider a single (non-hierarchical) distribution, in terms of not only detecting imperceptible AEs but also improving the overall robustness of the DL model under testing.

aes, robustness, test seed, (15 more...)

arXiv.org Artificial Intelligence

2205.08589

Country:

Europe > United Kingdom > England > Merseyside > Liverpool (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > Spain > Galicia > Madrid (0.04)
(9 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.67)

Industry: Leisure & Entertainment > Sports (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)

Add feedback

Hyperparameters in Reinforcement Learning and How To Tune Them

Eimer, Theresa, Lindauer, Marius, Raileanu, Roberta

arXiv.org Artificial IntelligenceJun-2-2023

In order to improve reproducibility, deep reinforcement learning (RL) has been adopting better scientific practices such as standardized evaluation metrics and reporting. However, the process of hyperparameter optimization still varies widely across papers, which makes it challenging to compare RL algorithms fairly. In this paper, we show that hyperparameter choices in RL can significantly affect the agent's final performance and sample efficiency, and that the hyperparameter landscape can strongly depend on the tuning seed which may lead to overfitting. We therefore propose adopting established best practices from AutoML, such as the separation of tuning and testing seeds, as well as principled hyperparameter optimization (HPO) across a broad search space. We support this by comparing multiple state-of-the-art HPO tools on a range of RL algorithms and environments to their hand-tuned counterparts, demonstrating that HPO approaches often have higher performance and lower compute overhead. As a result of our findings, we recommend a set of best practices for the RL community, which should result in stronger empirical results with fewer computational costs, better reproducibility, and thus faster progress. In order to encourage the adoption of these practices, we provide plug-and-play implementations of the tuning algorithms used in this paper at https://github.com/facebookresearch/how-to-autorl.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

arXiv.org Artificial Intelligence

2306.01324

Country:

North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
Europe > Portugal > Braga > Braga (0.04)

Genre: Research Report > New Finding (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback