debugger
AutoBridge: Automating Smart Device Integration with Centralized Platform
Liu, Siyuan, Yang, Zhice, Chen, Huangxun
Multimodal IoT systems coordinate diverse IoT devices to deliver human-centered services. The ability to incorporate new IoT devices under the management of a centralized platform is an essential requirement. However, it requires significant human expertise and effort to program the complex IoT integration code that enables the platform to understand and control the device functions. Therefore, we propose AutoBridge to automate IoT integration code generation. Specifically, AutoBridge adopts a divide-and-conquer strategy: it first generates device control logic by progressively retrieving device-specific knowledge, then synthesizes platform-compliant integration code using platform-specific knowledge. To ensure correctness, AutoBridge features a multi-stage debugging pipeline, including an automated debugger for virtual IoT device testing and an interactive hardware-in-the-loop debugger that requires only binary user feedback (yes and no) for real-device verification. We evaluate AutoBridge on a benchmark of 34 IoT devices across two open-source IoT platforms. The results demonstrate that AutoBridge achieves an average success rate of 93.87% and an average function coverage of 94.87%, without any human involvement. With minimal binary yes/no feedback from users, the code is then revised to reach 100% function coverage. A user study with 15 participants further shows that AutoBridge outperforms expert programmers by 50% to 80% in code accuracy, even when the programmers are allowed to use commercial code LLMs.
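The binary-feedback repair loop the abstract describes can be sketched as follows. This is a minimal illustration, not AutoBridge's actual implementation: the function names, the `verify`/`revise` callables, and the round limit are all our assumptions, standing in for the user's yes/no answers and the LLM revision step.

```python
# Hypothetical sketch of a hardware-in-the-loop binary-feedback repair loop
# (names are illustrative, not from the AutoBridge paper): each device
# function is verified against the real device, and a "no" answer triggers
# a revision of that function's generated code.

def binary_feedback_repair(functions, verify, revise, max_rounds=3):
    """Repair integration code until every function passes real-device checks.

    functions: dict mapping function name -> generated code (str)
    verify:    callable(name, code) -> bool  (stand-in for the user's yes/no)
    revise:    callable(name, code) -> str   (stand-in for LLM revision)
    """
    for _ in range(max_rounds):
        failing = [n for n, c in functions.items() if not verify(n, c)]
        if not failing:
            break  # 100% function coverage reached
        for name in failing:
            functions[name] = revise(name, functions[name])
    return functions

# Toy stand-ins: code is "correct" once it contains the token "fixed".
verify = lambda name, code: "fixed" in code
revise = lambda name, code: code + " fixed"
out = binary_feedback_repair(
    {"power_on": "stub", "set_brightness": "stub fixed"}, verify, revise)
```

The loop terminates either when every function passes or when the round budget is exhausted, mirroring the paper's claim that a small number of yes/no answers suffices to reach full coverage.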
AgentMesh: A Cooperative Multi-Agent Generative AI Framework for Software Development Automation
Software development is a complex, multi-phase process traditionally requiring collaboration among individuals with diverse expertise. We propose AgentMesh, a Python-based framework that uses multiple cooperating LLM-powered agents to automate software development tasks. In AgentMesh, specialized agents - a Planner, Coder, Debugger, and Reviewer - work in concert to transform a high-level requirement into fully realized code. The Planner agent first decomposes user requests into concrete subtasks; the Coder agent implements each subtask in code; the Debugger agent tests and fixes the code; and the Reviewer agent validates the final output for correctness and quality. We describe the architecture and design of these agents and their communication, and provide implementation details including prompt strategies and workflow orchestration. A case study illustrates AgentMesh handling a non-trivial development request via sequential task planning, code generation, iterative debugging, and final code review. We discuss how dividing responsibilities among cooperative agents leverages the strengths of large language models while mitigating single-agent limitations. Finally, we examine current limitations - such as error propagation and context scaling - and outline future work toward more robust, scalable multi-agent AI systems for software engineering automation.
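The Planner → Coder → Debugger → Reviewer hand-off described above can be sketched as a sequential pipeline. This is our illustration under stated assumptions, not AgentMesh's actual API: the agent callables below are toy stand-ins for LLM-powered agents.

```python
# Minimal sketch (illustrative, not AgentMesh's code) of the sequential
# Planner -> Coder -> Debugger -> Reviewer orchestration.
from typing import Callable, List

def run_pipeline(request: str,
                 planner: Callable[[str], List[str]],
                 coder: Callable[[str], str],
                 debugger: Callable[[str], str],
                 reviewer: Callable[[List[str]], bool]) -> List[str]:
    """Decompose a request into subtasks, implement and fix each, then review."""
    subtasks = planner(request)                    # Planner: request -> subtasks
    code = [debugger(coder(t)) for t in subtasks]  # Coder then Debugger per task
    if not reviewer(code):                         # Reviewer: final quality gate
        raise ValueError("review failed: pipeline output rejected")
    return code

# Toy agents standing in for LLM calls.
plan = lambda req: [f"{req}: step {i}" for i in (1, 2)]
code_agent = lambda task: f"# {task}"
debug_agent = lambda src: src + "  # tested"
review_agent = lambda outputs: all(o.endswith("# tested") for o in outputs)

result = run_pipeline("build CLI", plan, code_agent, debug_agent, review_agent)
```

Keeping each responsibility behind a plain callable makes the division of labor explicit: any single agent can be swapped or retried without touching the others, which is the single-agent limitation the paper argues this design mitigates.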
Review for NeurIPS paper: Synthesize, Execute and Debug: Learning to Repair for Neural Program Synthesis
Additional Feedback: I enjoyed reading your paper. The notion of a debugger is intuitive and seems to have a very positive impact. Unfortunately, it's not a novel contribution, since it was published in the earlier ICLR workshop paper, which steals the thunder from this submission a bit. Pre-training on synthetic mutations, before fine-tuning on real buggy synthesized programs seems simple and effective! The contribution of TraceEmbed appears to be more of a negative result, given that it doesn't add that much, and it only improves accuracy before any real improvements have been made by the debugger. Overall, this is a nice extended presentation of the previously published concept of a neural debugger.
Cogito, ergo sum: A Neurobiologically-Inspired Cognition-Memory-Growth System for Code Generation
Li, Yanlong, Li, Jindong, Wang, Qi, Yang, Menglin, Kong, He, Wang, Shengsheng
Large language model-based Multi-Agent Systems (MAS) have demonstrated promising performance for enhancing the efficiency and accuracy of code generation tasks. However, most existing methods follow a conventional sequence of planning, coding, and debugging, which contradicts the growth-driven nature of the human learning process. Additionally, the frequent information interaction between multiple agents inevitably involves high computational costs. In this paper, we propose Cogito, a neurobiologically inspired multi-agent framework to enhance the problem-solving capabilities in code generation tasks with lower cost. Specifically, Cogito adopts a reverse sequence: it first undergoes debugging, then coding, and finally planning. This approach mimics human learning and development, where knowledge is acquired progressively. Accordingly, a hippocampus-like memory module with different functions is designed to work with the pipeline to provide quick retrieval in similar tasks. Through this growth-based learning model, Cogito accumulates knowledge and cognitive skills at each stage, ultimately forming a Super Role, an all-capable agent, to perform the code generation task. Extensive experiments against representative baselines demonstrate the superior performance and efficiency of Cogito. The code is publicly available at https://anonymous.4open.science/r/Cogito-0083.
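The "quick retrieval in similar tasks" that the memory module provides can be sketched with a similarity-keyed cache. This is an illustrative stand-in, not Cogito's implementation: the class name, the string-similarity matching, and the threshold are all our assumptions.

```python
# Illustrative sketch (not Cogito's code) of a memory module that caches
# solved tasks and retrieves the solution of the most similar stored task,
# as the abstract describes for its hippocampus-like memory.
import difflib

class TaskMemory:
    def __init__(self, threshold=0.8):
        self.store = {}            # task description -> cached solution
        self.threshold = threshold  # minimum similarity for a recall hit

    def recall(self, task):
        """Return the cached solution of the most similar stored task, if any."""
        best = difflib.get_close_matches(
            task, list(self.store), n=1, cutoff=self.threshold)
        return self.store[best[0]] if best else None

    def remember(self, task, solution):
        self.store[task] = solution

mem = TaskMemory()
mem.remember("sort a list of integers", "sorted(xs)")
hit = mem.recall("sort a list of integer")    # near-duplicate task: hit
miss = mem.recall("parse a JSON file")        # unrelated task: no hit
```

A real system would key on semantic embeddings rather than string similarity, but the interface (remember on success, recall before re-solving) is the part the abstract emphasizes.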
ChatDBG: An AI-Powered Debugging Assistant
Levin, Kyla, van Kempen, Nicolas, Berger, Emery D., Freund, Stephen N.
This paper presents ChatDBG, the first AI-powered debugging assistant. ChatDBG integrates large language models (LLMs) to significantly enhance the capabilities and user-friendliness of conventional debuggers. ChatDBG lets programmers engage in a collaborative dialogue with the debugger, allowing them to pose complex questions about program state, perform root cause analysis for crashes or assertion failures, and explore open-ended queries like "why is x null?". To handle these queries, ChatDBG grants the LLM autonomy to take the wheel and drive debugging by issuing commands to navigate through stacks and inspect program state; it then reports its findings and yields back control to the programmer. Our ChatDBG prototype integrates with standard debuggers including LLDB, GDB, and WinDBG for native code and Pdb for Python. Our evaluation across a diverse set of code, including C/C++ code with known bugs and a suite of Python code including standalone scripts and Jupyter notebooks, demonstrates that ChatDBG can successfully analyze root causes, explain bugs, and generate accurate fixes for a wide range of real-world errors. For the Python programs, a single query led to an actionable bug fix 67% of the time; one additional follow-up query increased the success rate to 85%. ChatDBG has seen rapid uptake; it has already been downloaded nearly 30,000 times.
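The "take the wheel" loop described above can be sketched as follows. This is a hedged illustration, not ChatDBG's source: the `model` and `execute` callables are scripted stand-ins for the LLM and the underlying debugger (the command string mirrors Pdb's `p` command, but the dispatch is faked).

```python
# Hypothetical sketch of an LLM-driven debugging loop: the model proposes
# debugger commands, the debugger executes them and returns output, and
# control yields back to the programmer once the model reports its findings.

def drive_debugger(model, execute, max_steps=10):
    """model(transcript) -> ('cmd', command) or ('answer', findings);
    execute(command) -> debugger output string."""
    transcript = []
    for _ in range(max_steps):
        action = model(transcript)
        if action[0] == "answer":       # model reports findings, yields control
            return action[1], transcript
        output = execute(action[1])     # run one debugger command (e.g. 'p x')
        transcript.append((action[1], output))
    return None, transcript             # step budget exhausted, no conclusion

# Scripted stand-ins: inspect x once, then explain the failure.
def scripted_model(transcript):
    if not transcript:
        return ("cmd", "p x")
    return ("answer", "x is None: the caller never initialized it")

fake_pdb = {"p x": "None"}.get          # toy command table in place of Pdb
answer, log = drive_debugger(scripted_model, fake_pdb)
```

The transcript accumulates every command/output pair, so the model's final explanation is grounded in the state it actually inspected rather than in the source text alone.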
Improving Mask RCNN Convergence with PyTorch Lightning and SageMaker Debugger
MLPerf training times represent the state of the art in machine learning performance, in which AI industry leaders publish their best training times for a set of common machine learning models. But optimizing for training speed means these models are often complex and difficult to move to practical applications. Last year, we published SageMakerCV, a collection of computer vision models based on MLPerf, but with added flexibility and optimization for use on Amazon SageMaker. The recently published MLPerf 2.0 adds a series of new optimizations. In this blog, we discuss those optimizations, and how we can use PyTorch Lightning and the SageMaker Debugger to further improve training performance and flexibility.
Moving to SageMaker
Almost everything we see around us today comes from factories. However, manufacturing as we see it today is mostly outdated. Manufacturers lose up to 15–20% of their sales revenue to the cost of poor quality (COPQ) [link]. This includes the cost of detecting and preventing product failures. The later a defect is detected, the more resources have been wasted on the defective part.