DeepCode: Open Agentic Coding

Li, Zongwei, Li, Zhonghang, Guo, Zirui, Ren, Xubin, Huang, Chao

arXiv.org Artificial Intelligence

Recent advances in large language models (LLMs) have given rise to powerful coding agents, making it possible for code assistants to evolve into code engineers. However, existing methods still face significant challenges in achieving high-fidelity document-to-codebase synthesis--such as scientific papers to code--primarily due to a fundamental conflict between information overload and the context bottlenecks of LLMs. In this work, we introduce DeepCode, a fully autonomous framework that fundamentally addresses this challenge through principled information-flow management. By treating repository synthesis as a channel optimization problem, DeepCode seamlessly orchestrates four information operations to maximize task-relevant signals under finite context budgets: source compression via blueprint distillation, structured indexing using stateful code memory, conditional knowledge injection via retrieval-augmented generation, and closed-loop error correction. Extensive evaluations on the PaperBench benchmark demonstrate that DeepCode achieves state-of-the-art performance, decisively outperforming leading commercial agents such as Cursor and Claude Code, and crucially, surpassing PhD-level human experts from top institutes on key reproduction metrics. By systematically transforming paper specifications into production-grade implementations comparable to human expert quality, this work establishes new foundations for autonomous scientific reproduction that can accelerate research evaluation and discovery.
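The abstract's four information operations can be pictured as a pipeline. The sketch below is purely illustrative: every function name and data structure is hypothetical, and the paper's actual agentic implementation is not reproduced here.

```python
# Hypothetical sketch of the four operations named in the abstract:
# blueprint distillation, code memory, retrieval, and repair loop.

def distill_blueprint(paper_text: str) -> dict:
    """Source compression: reduce the paper to a structured blueprint."""
    return {"modules": ["model", "train", "eval"], "spec": paper_text[:200]}

def build_code_memory(blueprint: dict) -> dict:
    """Structured indexing: a stateful index of generated artifacts."""
    return {module: None for module in blueprint["modules"]}

def retrieve_knowledge(query: str, corpus: list) -> list:
    """Conditional knowledge injection: fetch only task-relevant snippets."""
    return [doc for doc in corpus if query in doc]

def repair_loop(code: str, run_tests) -> str:
    """Closed-loop error correction: iterate until the checks pass."""
    for _ in range(3):
        ok, feedback = run_tests(code)
        if ok:
            break
        code += f"\n# fix applied for: {feedback}"
    return code
```

Each stage narrows what reaches the LLM's context window, which is the "channel optimization" framing the abstract describes.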


Deepcode: Feedback Codes via Deep Learning

Neural Information Processing Systems

The design of codes for communicating reliably over a statistically well-defined channel is an important endeavor involving deep mathematical research and wide-ranging practical applications. In this work, we present the first family of codes obtained via deep learning, which significantly beats state-of-the-art codes designed over several decades of research. The communication channel under consideration is the Gaussian noise channel with feedback, whose study was initiated by Shannon; feedback is known theoretically to improve reliability of communication, but no practical codes that do so have ever been successfully constructed. We break this logjam by integrating information-theoretic insights harmoniously with recurrent-neural-network based encoders and decoders to create novel codes that outperform known codes by three orders of magnitude in reliability. We also demonstrate several desirable properties of the codes: (a) generalization to larger block lengths; (b) composability with known codes; (c) adaptation to practical constraints. This result also has broader ramifications for coding theory: even when the channel has a clear mathematical model, deep learning methodologies, when combined with channel-specific information-theoretic insights, can potentially beat state-of-the-art codes constructed over decades of mathematical research.



Reviews: Deepcode: Feedback Codes via Deep Learning

Neural Information Processing Systems

The formal noisy channel setting is similar to a standard autoencoder framework, with a few key differences. For one, we usually encode and transmit one bit of a message at a time due to channel limits, and second, we get feedback, usually in the form of a noisy version of each encoded bit. Due to the sequential nature of the problem, plus the availability of feedback, the authors apply an RNN architecture. The input to the encoder at each step is the next bit to encode plus an estimate of the noise from previous steps (derived from the difference between the encoded message and the received feedback). Experiments suggest that this approach significantly outperforms existing approaches.
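The per-step encoding the review describes can be sketched as follows. This is a toy stand-in, not the paper's scheme: the linear map from (bit, noise estimate) to a channel symbol replaces the trained RNN, and the feedback is taken as noiseless for simplicity.

```python
import numpy as np

rng = np.random.default_rng(0)

bits = rng.integers(0, 2, size=8).astype(float)  # message bits
forward_noise_std = 0.5                          # forward-channel noise

encoded, feedback = [], []
for b in bits:
    # Noise estimate from the previous step: what was sent minus what
    # the receiver reported back (zero at the first step).
    noise_est = encoded[-1] - feedback[-1] if encoded else 0.0
    # A trained RNN encoder would map [b, noise_est] to a symbol; here a
    # simple linear combination stands in for it.
    x = 2.0 * b - 1.0 - 0.5 * noise_est
    y = x + rng.normal(0.0, forward_noise_std)   # noisy channel output
    encoded.append(x)
    feedback.append(y)                           # noiseless feedback here
```

The key structural point survives the simplification: each transmitted symbol depends on both the fresh message bit and a correction term computed from the feedback.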


Robust Non-Linear Feedback Coding via Power-Constrained Deep Learning

Kim, Junghoon, Kim, Taejoon, Love, David, Brinton, Christopher

arXiv.org Artificial Intelligence

The design of codes for feedback-enabled communications has been a long-standing open problem. Recent research on non-linear, deep learning-based coding schemes has demonstrated significant improvements in communication reliability over linear codes, but such schemes are still vulnerable to the presence of forward and feedback noise over the channel. In this paper, we develop a new family of non-linear feedback codes that greatly enhance robustness to channel noise. Our autoencoder-based architecture is designed to learn codes over consecutive blocks of bits, which yields de-noising advantages over bit-by-bit processing and helps overcome the physical separation between the encoder and decoder over a noisy channel. Moreover, we develop a power control layer at the encoder to explicitly incorporate hardware constraints into the learning optimization, and prove that the resulting average power constraint is satisfied asymptotically. Numerical experiments demonstrate that our scheme outperforms state-of-the-art feedback codes by wide margins over practical forward and feedback noise regimes, and provide information-theoretic insights on the behavior of our non-linear codes. Moreover, we observe that, in a long blocklength regime, canonical error correction codes are still preferable to feedback codes when the feedback noise becomes high.
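A power control layer of the kind the abstract mentions can be sketched as a normalization of the encoder's output symbols. The function below is an assumption about the general shape of such a layer, not the paper's construction: it enforces the average power constraint exactly per block, whereas the paper proves the constraint holds asymptotically for its learned encoder.

```python
import numpy as np

def power_normalize(x: np.ndarray, power: float = 1.0) -> np.ndarray:
    """Scale a block of symbols so its average power equals `power`.

    Placing a layer like this at the encoder output makes the hardware
    constraint E[x^2] <= P part of the learning optimization itself,
    rather than an afterthought applied to trained codes.
    """
    rms = np.sqrt(np.mean(x ** 2))
    return np.sqrt(power) * x / (rms + 1e-12)  # epsilon guards x == 0

symbols = np.array([3.0, -1.0, 2.0, -2.0])     # raw encoder outputs
normalized = power_normalize(symbols, power=1.0)
```

Because the scaling is differentiable, gradients flow through it during training, which is what lets the constraint be learned jointly with the code.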


AttentionCode: Ultra-Reliable Feedback Codes for Short-Packet Communications

Shao, Yulin, Ozfatura, Emre, Perotti, Alberto, Popovic, Branislav, Gunduz, Deniz

arXiv.org Artificial Intelligence

Ultra-reliable short-packet communication is a major challenge in future wireless networks with critical applications. To achieve reliability beyond 99.999%, this paper envisions a new interaction-based communication paradigm that exploits feedback from the receiver. We present AttentionCode, a new class of feedback codes leveraging deep learning (DL) technologies. The underpinnings of AttentionCode are three architectural innovations: AttentionNet, input restructuring, and adaptation to fading channels, accompanied by several training methods, including large-batch training, distributed learning, the look-ahead optimizer, training-test signal-to-noise ratio (SNR) mismatch, and curriculum learning. The training methods can potentially be generalized to other wireless communication applications of machine learning. Numerical experiments verify that AttentionCode establishes a new state of the art among all DL-based feedback codes in both additive white Gaussian noise (AWGN) channels and fading channels. In AWGN channels with noiseless feedback, for example, AttentionCode achieves a block error rate (BLER) of $10^{-7}$ when the forward channel SNR is 0 dB for a block size of 50 bits, demonstrating the potential of AttentionCode to provide ultra-reliable short-packet communications.
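Of the training methods listed above, curriculum learning over SNR is easy to illustrate: training starts at an easy (high) SNR and anneals toward the harder target SNR. The schedule below is a hedged sketch; the endpoints, the linear shape, and the function name are illustrative assumptions, not values from the paper.

```python
def curriculum_snr(epoch: int, total_epochs: int,
                   start_db: float = 2.0, end_db: float = 0.0) -> float:
    """Curriculum learning on SNR: anneal the training SNR (in dB)
    linearly from an easy starting value toward the target value.

    Choosing end_db different from the evaluation SNR would additionally
    give the training-test SNR mismatch the abstract mentions.
    """
    frac = min(epoch / max(total_epochs - 1, 1), 1.0)
    return start_db + frac * (end_db - start_db)
```

In practice such a schedule is queried once per epoch to set the noise standard deviation used when simulating the channel in the training loop.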


Top 10 HTML Code Generators

#artificialintelligence

Artificial intelligence (AI) is a technology that enables machines to learn from experience. AI can be found in self-driving cars, smart homes, and chess computers, to name a few examples. Many of these systems are built on deep learning, which allows computers to execute complex tasks. As a result, businesses are increasingly adopting AI to gain a competitive advantage over their competitors.


Snyk Acquires DeepCode to Apply AI to DevSecOps - DevOps.com

#artificialintelligence

Snyk today announced it has agreed to acquire DeepCode as part of an effort to apply artificial intelligence (AI) to DevSecOps. DeepCode has developed an interpretable machine learning semantic code analysis tool that scans code anywhere from 10 to 50 times faster than existing approaches. DeepCode currently supports the Java, JavaScript, Python, TypeScript and C/C++ programming languages. Developers get started by connecting the DeepCode bot to their GitHub, Bitbucket or GitLab accounts, or directly within their integrated development environment (IDE). DeepCode then immediately starts reviewing each commit, with no additional coding required.


Deepcode and Modulo-SK are Designed for Different Settings

Kim, Hyeji, Jiang, Yihan, Kannan, Sreeram, Oh, Sewoong, Viswanath, Pramod

arXiv.org Machine Learning

We respond to [1], which claimed that the "Modulo-SK scheme outperforms Deepcode [2]". We demonstrate that this statement is not true: the two schemes are designed and evaluated for entirely different settings. Deepcode is designed and evaluated for the AWGN channel with (potentially delayed) uncoded output feedback. Modulo-SK is evaluated on the AWGN channel with coded feedback and unit delay. [1] also claimed an implementation of the Schalkwijk and Kailath (SK) scheme [3] that is numerically stable for any number of information bits and iterations. However, we observe that while their implementation does marginally improve over ours, it also suffers from a fundamental issue with precision. Finally, we show that Deepcode dominates the optimized performance of SK over a natural choice of parameterizations when the feedback is noisy.