Goto

Collaborating Authors

 codebase








Benchmarking and Analyzing 3D-aware Image Synthesis with a Modularized Codebase

Neural Information Processing Systems

Despite the rapid advance of 3D-aware image synthesis, existing studies usually adopt a mixture of techniques and tricks, leaving it unclear how each part contributes to the final performance in terms of generality. Following the most popular and effective paradigm in this field, which incorporates a neural radiance field (NeRF) into the generator of a generative adversarial network (GAN), we builda well-structured codebase through modularizing the generation process. Such a design allows researchers to develop and replace each module independently, and hence offers an opportunity to fairly compare various approaches and recognize their contributions from the module perspective. The reproduction of a range of cutting-edge algorithms demonstrates the availability of our modularized codebase. We also perform a variety of in-depth analyses, such as the comparison across different types of point feature, the necessity of the tailing upsampler in the generator, the reliance on the camera pose prior, etc., which deepen our understanding of existing methods and point out some further directions of the research work. Code and models will be made publicly available to facilitate the development and evaluation of this field.


DeepCode: Open Agentic Coding

Li, Zongwei, Li, Zhonghang, Guo, Zirui, Ren, Xubin, Huang, Chao

arXiv.org Artificial Intelligence

Recent advances in large language models (LLMs) have given rise to powerful coding agents, making it possible for code assistants to evolve into code engineers. However, existing methods still face significant challenges in achieving high-fidelity document-to-codebase synthesis--such as scientific papers to code--primarily due to a fundamental conflict between information overload and the context bottlenecks of LLMs. In this work, we introduce DeepCode, a fully autonomous framework that fundamentally addresses this challenge through principled information-flow management. By treating repository synthesis as a channel optimization problem, DeepCode seamlessly orchestrates four information operations to maximize task-relevant signals under finite context budgets: source compression via blueprint distillation, structured indexing using stateful code memory, conditional knowledge injection via retrieval-augmented generation, and closed-loop error correction. Extensive evaluations on the PaperBench benchmark demonstrate that DeepCode achieves state-of-the-art performance, decisively outperforming leading commercial agents such as Cursor and Claude Code, and crucially, surpassing PhD-level human experts from top institutes on key reproduction metrics. By systematically transforming paper specifications into production-grade implementations comparable to human expert quality, this work establishes new foundations for autonomous scientific reproduction that can accelerate research evaluation and discovery.


Beyond Greenfield: The D3 Framework for AI-Driven Productivity in Brownfield Engineering

Sharma, Krishna Kumaar

arXiv.org Artificial Intelligence

Brownfield engineering work involving legacy systems, incomplete documentation, and fragmented architectural knowledge poses unique challenges for the effective use of large language models (LLMs). Prior research has largely focused on greenfield or synthetic tasks, leaving a gap in structured workflows for complex, context-heavy environments. This paper introduces the Discover-Define-Deliver (D3) Framework, a disciplined LLM-assisted workflow that combines role-separated prompting strategies with applied best practices for navigating ambiguity in brownfield systems. The framework incorporates a dual-agent prompting architecture in which a Builder model generates candidate outputs and a Reviewer model provides structured critique to improve reliability. I conducted an exploratory survey study with 52 software practitioners who applied the D3 workflow to real-world engineering tasks such as legacy system exploration, documentation reconstruction, and architectural refactoring. Respondents reported perceived improvements in task clarity, documentation quality, and cognitive load, along with self-estimated productivity gains. In this exploratory study, participants reported a weighted average productivity improvement of 26.9%, reduced cognitive load for approximately 77% of participants, and 83% of participants spent less time fixing or rewriting code due to better initial planning with AI. As these findings are self-reported and not derived from controlled experiments, they should be interpreted as preliminary evidence of practitioner sentiment rather than causal effects. The results highlight both the potential and limitations of structured LLM workflows for legacy engineering systems and motivate future controlled evaluations.


LLM-CSEC: Empirical Evaluation of Security in C/C++ Code Generated by Large Language Models

Shahid, Muhammad Usman, Ahmed, Chuadhry Mujeeb, Ranjan, Rajiv

arXiv.org Artificial Intelligence

The security of code generated by large language models (LLMs) is a significant concern, as studies indicate that such code often contains vulnerabilities and lacks essential defensive programming constructs. This work focuses on examining and evaluating the security of LLM-generated code, particularly in the context of C/C++. We categorized known vulnerabilities using the Common Weakness Enumeration (CWE) and, to study their criticality, mapped them to CVEs. We used ten different LLMs for code generation and analyzed the outputs through static analysis. The amount of CWEs present in AI-generated code is concerning. Our findings highlight the need for developers to be cautious when using LLM-generated code. This study provides valuable insights to advance automated code generation and encourage further research in this domain.