AITopics

Artificial Intelligence (AI) and Machine Learning (ML) have significantly impacted various industries, including software development. Software testing, a crucial part of the software development lifecycle (SDLC), ensures the quality and reliability of software products. Traditionally, software testing has been a labor-intensive process requiring significant manual effort. However, the advent of AI and ML has transformed this landscape by introducing automation and intelligent decision-making capabilities. AI and ML technologies enhance the efficiency and effectiveness of software testing by automating complex tasks such as test case generation, test execution, and result analysis. These technologies reduce the time required for testing and improve the accuracy of defect detection, ultimately leading to higher quality software. AI can predict potential areas of failure by analyzing historical data and identifying patterns, which allows for more targeted and efficient testing. This paper explores the role of AI and ML in software testing by reviewing existing literature, analyzing current tools and techniques, and presenting case studies that demonstrate the practical benefits of these technologies. The literature review provides a comprehensive overview of the advancements in AI and ML applications in software testing, highlighting key methodologies and findings from various studies. The analysis of current tools showcases the capabilities of popular AI-driven testing tools such as Eggplant AI, Test.ai, Selenium, Appvance, Applitools Eyes, Katalon Studio, and Tricentis Tosca, each offering unique features and advantages. Case studies included in this paper illustrate real-world applications of AI and ML in software testing, showing significant improvements in testing efficiency, accuracy, and overall software quality.

ai and ml, software testing, test case, (15 more...)

2409.02693

Country: Asia > Middle East > Jordan (0.04)

Genre:

Overview (1.00)
Research Report (0.82)

Industry: Information Technology > Security & Privacy (0.47)

Technology:

Information Technology > Software Engineering (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.68)

Wasil, Akash R., Barnett, Peter, Gerovitch, Michael, Hauksson, Roman, Reed, Tom, Miller, Jack William

Governing dual-use technologies: Case studies of international security agreements and lessons for AI governance

International AI governance agreements and institutions may play an important role in reducing global security risks from advanced AI. To inform the design of such agreements and institutions, we conducted case studies of historical and contemporary international security agreements. We focused specifically on those arrangements around dual-use technologies, examining agreements in nuclear security, chemical weapons, biosecurity, and export controls. For each agreement, we examined four key areas: (a) purpose, (b) core powers, (c) governance structure, and (d) instances of non-compliance. From these case studies, we extracted lessons for the design of international AI agreements and governance institutions. We discuss the importance of robust verification methods, strategies for balancing power between nations, mechanisms for adapting to rapid technological change, approaches to managing trade-offs between transparency and security, incentives for participation, and effective enforcement mechanisms.

agreement, convention, inspection, (15 more...)

2409.02779

Country:

Asia > Russia (1.00)
Europe > Russia (0.32)
Asia > Middle East > Iran (0.31)
(15 more...)

Genre:

Overview (1.00)
Research Report (0.82)

Industry:

Law > International Law (1.00)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
(6 more...)

Technology: Information Technology > Artificial Intelligence (1.00)

Hutchinson, Maeve, Jianu, Radu, Slingsby, Aidan, Madhyastha, Pranava

LLM-Assisted Visual Analytics: Opportunities and Challenges

We explore the integration of large language models (LLMs) into visual analytics (VA) systems to transform their capabilities through intuitive natural language interactions. We survey current research directions in this emerging field, examining how LLMs are integrated into data management, language interaction, visualisation generation, and language generation processes. We highlight the new possibilities that LLMs bring to VA, especially how they can change VA processes beyond the usual use cases. In particular, we highlight building new visualisation-language models, enabling access to a breadth of domain knowledge, multimodal interaction, and opportunities with guidance. Finally, we carefully consider the prominent challenges of using current LLMs in VA tasks. Our discussions in this paper aim to guide future researchers working on LLM-assisted VA systems and help them navigate common obstacles when developing these systems.

llm, va system, visualisation, (16 more...)

2409.02691

Country:

North America > United States > New York (0.05)
North America > Canada > Ontario > Toronto (0.04)

Genre: Overview (0.88)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Creating a Gen-AI based Track and Trace Assistant MVP (SuperTracy) for PostNL

Reshadati, Mohammad

Developments in the field of generative AI have brought many opportunities for companies, for instance to improve efficiency in customer service and to automate tasks. PostNL, the biggest parcel and e-commerce corporation in the Netherlands, wants to use generative AI to enhance communication around the track and trace of parcels. During the internship, a Minimum Viable Product (MVP) was created to showcase the value of using generative AI technologies to enhance parcel tracking, analyze the parcel's journey, and communicate about it in an easy-to-understand manner. The primary goal was to develop an in-house LLM-based system, reducing dependency on external platforms and establishing the feasibility of a dedicated generative AI team within the company. This multi-agent LLM-based system aimed to construct parcel journey stories and identify logistical disruptions with heightened efficiency and accuracy. The research involved deploying a sophisticated AI-driven communication system, employing Retrieval-Augmented Generation (RAG) for enhanced response precision, and optimizing large language models (LLMs) tailored to domain-specific tasks. The MVP successfully implemented a multi-agent open-source LLM system, called SuperTracy. SuperTracy is capable of autonomously managing a broad spectrum of user inquiries and improving internal knowledge handling. Results and evaluation demonstrated technological innovation and feasibility, notably in communication about the track and trace of a parcel, which exceeded initial expectations.

agent, parcel, supertracy, (15 more...)

2409.02711

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)
Europe > United Kingdom (0.04)
(3 more...)

Genre:

Overview (1.00)
Workflow (0.67)
Research Report (0.65)

Industry:

Transportation > Freight & Logistics Services (0.93)
Information Technology > Services > e-Commerce Services (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)

The Future of Open Human Feedback

Don-Yehiya, Shachar, Burtenshaw, Ben, Astudillo, Ramon Fernandez, Osborne, Cailean, Jaiswal, Mimansa, Kuo, Tzu-Sheng, Zhao, Wenting, Shenfeld, Idan, Peng, Andi, Yurochkin, Mikhail, Kasirzadeh, Atoosa, Huang, Yangsibo, Hashimoto, Tatsunori, Jernite, Yacine, Vila-Suero, Daniel, Abend, Omri, Ding, Jennifer, Hooker, Sara, Kirk, Hannah Rose, Choshen, Leshem

Human feedback on conversations with large language models (LLMs) is central to how these systems learn about the world, improve their capabilities, and are steered toward desirable and safe behaviors. However, this feedback is mostly collected by frontier AI labs and kept behind closed doors. In this work, we bring together interdisciplinary experts to assess the opportunities and challenges to realizing an open ecosystem of human feedback for AI. We first look for successful practices in peer production, open source, and citizen science communities. We then characterize the main challenges for open human feedback. For each, we survey current approaches and offer recommendations. We end by envisioning the components needed to underpin a sustainable and open human feedback ecosystem. At the center of this ecosystem are mutually beneficial feedback loops between users and specialized models, incentivizing a diverse stakeholder community of model trainers and feedback providers to support a general open feedback pool.

arxiv preprint arxiv, contributor, proceedings, (14 more...)

2408.16961

Country:

North America > United States > New York > New York County > New York City (0.05)
North America > United States > Texas (0.04)
North America > United States > California (0.04)
(9 more...)

Genre:

Overview (0.93)
Research Report (0.83)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (0.93)
Law > Statutes (0.93)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Communications > Social Media > Crowdsourcing (0.88)

Large Language Model-Based Agents for Software Engineering: A Survey

Liu, Junwei, Wang, Kaixin, Chen, Yixuan, Peng, Xin, Chen, Zhenpeng, Zhang, Lingming, Lou, Yiling

The recent advance in Large Language Models (LLMs) has shaped a new paradigm of AI agents, i.e., LLM-based agents. Compared to standalone LLMs, LLM-based agents substantially extend the versatility and expertise of LLMs by enhancing LLMs with the capabilities of perceiving and utilizing external resources and tools. To date, LLM-based agents have been applied to Software Engineering (SE) and have shown remarkable effectiveness. The synergy between multiple agents and human interaction brings further promise in tackling complex real-world SE problems. In this work, we present a comprehensive and systematic survey on LLM-based agents for SE. We collect 106 papers and categorize them from two perspectives, i.e., the SE and agent perspectives. In addition, we discuss open challenges and future directions in this critical domain. The repository of this survey is at https://github.com/FudanSELab/Agent4SE-Paper-List.

agent, information, llm-based agent, (12 more...)

2409.02977

Country:

Europe > Austria > Vienna (0.14)
Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(16 more...)

Genre:

Workflow (1.00)
Research Report (1.00)
Overview (1.00)

Industry: Information Technology > Security & Privacy (0.92)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Nguyen, Tri, Villaescusa-Navarro, Francisco, Mishra-Sharma, Siddharth, Cuesta-Lazaro, Carolina, Torrey, Paul, Farahi, Arya, Garcia, Alex M., Rose, Jonah C., O'Neil, Stephanie, Vogelsberger, Mark, Shen, Xuejian, Roche, Cian, Anglés-Alcázar, Daniel, Kallivayalil, Nitya, Muñoz, Julian B., Cyr-Racine, Francis-Yan, Roy, Sandip, Necib, Lina, Kollmann, Kassidy E.

How DREAMS are made: Emulating Satellite Galaxy and Subhalo Populations with Diffusion Models and Point Clouds

The connection between galaxies and their host dark matter (DM) halos is critical to our understanding of cosmology, galaxy formation, and DM physics. To maximize the return of upcoming cosmological surveys, we need an accurate way to model this complex relationship. Many techniques have been developed to model this connection, from Halo Occupation Distribution (HOD) to empirical and semi-analytic models to hydrodynamic simulations. Hydrodynamic simulations can incorporate more detailed astrophysical processes but are computationally expensive; HODs, on the other hand, are computationally cheap but have limited accuracy. In this work, we present NeHOD, a generative framework based on a variational diffusion model and a Transformer, for painting galaxies/subhalos on top of DM with the accuracy of hydrodynamic simulations but at a computational cost similar to HOD. By modeling galaxies/subhalos as point clouds, instead of binning or voxelization, we can resolve small spatial scales down to the resolution of the simulations. For each halo, NeHOD predicts the positions, velocities, masses, and concentrations of its central and satellite galaxies. We train NeHOD on the TNG-Warm DM suite of the DREAMS project, which consists of 1024 high-resolution zoom-in hydrodynamic simulations of Milky Way-mass halos with varying warm DM mass and astrophysical parameters. We show that our model captures the complex relationships between subhalo properties as a function of the simulation parameters, including the mass functions, stellar-halo mass relations, concentration-mass relations, and spatial clustering. Our method can be used for a large variety of downstream applications, from galaxy clustering to strong lensing studies.

arxiv, galaxy, simulation, (17 more...)

2409.0298

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > Texas > Travis County > Austin (0.14)
North America > United States > Florida > Alachua County > Gainesville (0.14)
(9 more...)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Energy (0.92)
Government > Regional Government > North America Government > United States Government (0.67)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

arXiv.org Artificial Intelligence Sep-3-2024

LLM Defenses Are Not Robust to Multi-Turn Human Jailbreaks Yet

Li, Nathaniel, Han, Ziwen, Steneker, Ian, Primack, Willow, Goodside, Riley, Zhang, Hugh, Wang, Zifan, Menghini, Cristina, Yue, Summer

Recent large language model (LLM) defenses have greatly improved models' ability to refuse harmful queries, even when adversarially attacked. However, LLM defenses are primarily evaluated against automated adversarial attacks in a single turn of conversation, an insufficient threat model for real-world malicious use. We demonstrate that multi-turn human jailbreaks uncover significant vulnerabilities, exceeding 70% attack success rate (ASR) on HarmBench against defenses that report single-digit ASRs with automated single-turn attacks. Human jailbreaks also reveal vulnerabilities in machine unlearning defenses, successfully recovering dual-use biosecurity knowledge from unlearned models. We compile these results into Multi-Turn Human Jailbreaks (MHJ), a dataset of 2,912 prompts across 537 multi-turn jailbreaks. We publicly release MHJ alongside a compendium of jailbreak tactics developed across dozens of commercial red teaming engagements, supporting research towards stronger LLM defenses.

large language model, machine learning, natural language, (18 more...)

2408.15221

Country:

North America > United States (1.00)
Asia (0.93)

Genre:

Research Report (0.83)
Overview (0.67)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Stein, Jonas, Hildebrandt, Florentin D, Thomas, Barrett W, Ulmer, Marlin W

Learning State-Dependent Policy Parametrizations for Dynamic Technician Routing with Rework

arXiv.org Artificial Intelligence Sep-3-2024

Home repair and installation services require technicians to visit customers and resolve tasks of different complexity. Technicians often have heterogeneous skills and working experiences. The geographical spread of customers makes it impractical to achieve only perfect matches between technician skills and task requirements. Additionally, technicians are regularly absent due to sickness. With non-perfect assignments regarding task requirement and technician skill, some tasks may remain unresolved and require a revisit and rework. Companies seek to minimize customer inconvenience due to delay. We model the problem as a sequential decision process where, over a number of service days, customers request service while heterogeneously skilled technicians are routed to serve customers in the system. Each day, our policy iteratively builds tours by adding "important" customers. The importance is based on analytical considerations and is measured by accounting for routing efficiency, urgency of service, and risk of rework in an integrated fashion. We propose a state-dependent balance of these factors via reinforcement learning. A comprehensive study shows that accepting a few non-perfect assignments can be quite beneficial for the overall service quality. We further demonstrate the value provided by a state-dependent parametrization.

customer, machine learning, reinforcement learning, (19 more...)

2409.01815

Country:

Europe > Germany > Saxony-Anhalt > Magdeburg (0.04)
North America > United States > Iowa (0.04)
Europe > Germany > Hamburg (0.04)

Genre:

Research Report > New Finding (0.93)
Overview (0.93)

Industry: Transportation (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.34)

Amaya-Mejía, Lina María, Ghita, Mohamed, Dentler, Jan, Olivares-Mendez, Miguel, Martinez, Carol

Visual Servoing for Robotic On-Orbit Servicing: A Survey

arXiv.org Artificial Intelligence Sep-3-2024

On-orbit servicing (OOS) activities will power the next big step for sustainable exploration and commercialization of space. Developing robotic capabilities for autonomous OOS operations is a priority for the space industry. Visual Servoing (VS) enables robots to achieve the precise manoeuvres needed for critical OOS missions by utilizing visual information for motion control. This article presents an overview of existing VS approaches for autonomous OOS operations with space manipulator systems (SMS). We divide the approaches according to their contribution to the typical phases of a robotic OOS mission: a) Recognition, b) Approach, and c) Contact. We also present a discussion on the reviewed VS approaches, identifying current trends. Finally, we highlight the challenges and areas for future research on VS techniques for robotic OOS.

controller, manipulator, robot, (15 more...)

2409.02324

Country:

North America > United States (1.00)
Europe (0.14)
North America > Canada (0.14)
(2 more...)

Genre: Overview (1.00)

Industry:

Government > Space Agency (1.00)
Government > Regional Government > North America Government > United States Government (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Vision > Image Understanding (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)