Out of style: Misadventures with LLMs and code style transfer
Karl Munson, Chih-Kai Ting, Serenity Wade, Anish Savla, Julian Dolby, Kiran Kate, Kavitha Srinivas
Like text, programs have styles, and certain programming styles are more desirable than others for program readability, maintainability, and performance. Code style transfer, however, is difficult to automate except for trivial style guidelines such as limits on line length. Inspired by the success of using language models for text style transfer, we investigate whether code language models can perform code style transfer. Code style transfer, unlike text style transfer, has rigorous requirements: the system needs to identify the lines of code to change, change them correctly, and leave the rest of the program untouched. We designed CSB (Code Style Benchmark), a benchmark suite of code style transfer tasks across five categories, including converting for-loops to list comprehensions, eliminating duplication in code, and adding decorators to methods. We then used these tests to see whether large pre-trained code language models or fine-tuned models perform style transfer correctly, based on rigorous metrics that check both that the transfer occurred and that the code still passes functional tests. Surprisingly, language models failed to perform all of the tasks, suggesting that they perform poorly on tasks that require code understanding. We will make available the large-scale corpora to help the community build better code models.
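To make one of the benchmark's task categories concrete, here is a minimal, hypothetical sketch of the for-loop-to-list-comprehension transfer the abstract mentions; the function names and test values are illustrative, not drawn from CSB itself. A correct transfer must rewrite only the targeted construct while leaving behavior (and the rest of the program) unchanged.

```python
# Hypothetical illustration of one CSB task category: rewriting an
# accumulator-style for-loop as an equivalent list comprehension.

def squares_loop(nums):
    # Before: explicit loop with an accumulator list.
    result = []
    for n in nums:
        if n % 2 == 0:
            result.append(n * n)
    return result

def squares_comprehension(nums):
    # After: the same logic expressed as a list comprehension.
    return [n * n for n in nums if n % 2 == 0]

# A functional test must still pass after the style transfer.
assert squares_loop([1, 2, 3, 4]) == squares_comprehension([1, 2, 3, 4]) == [4, 16]
```

The point of the paired functional test is exactly the abstract's requirement: the metric checks both that the style change occurred and that the program's observable behavior is preserved.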
Investigating the Efficacy of Large Language Models for Code Clone Detection
Mohamad Khajezade, Jie JW Wu, Fatemeh Hendijani Fard, Gema Rodríguez-Pérez, Mohamed Sami Shehata
Large Language Models (LLMs) have demonstrated remarkable success in various natural language processing and software engineering tasks, such as code generation. LLMs are mainly utilized in the prompt-based zero/few-shot paradigm to guide the model in accomplishing the task. GPT-based models are among the most widely studied for tasks such as code comment generation or test generation. These tasks are `generative' tasks. However, there is limited research on the usage of LLMs for `non-generative' tasks such as classification using the prompt-based paradigm. In this preliminary exploratory study, we investigated the applicability of LLMs for Code Clone Detection (CCD), a non-generative task. By building a mono-lingual and cross-lingual CCD dataset derived from CodeNet, we first investigated two different prompts using ChatGPT to detect Type-4 code clones in Java-Java and Java-Ruby pairs in a zero-shot setting. We then conducted an analysis to understand the strengths and weaknesses of ChatGPT in CCD. ChatGPT surpasses the baselines in cross-language CCD, attaining an F1-score of 0.877, and achieves performance comparable to fully fine-tuned models for mono-lingual CCD, with an F1-score of 0.878. Both the prompt and the difficulty level of the problems have an impact on the performance of ChatGPT. Finally, we provide insights and future directions based on our initial analysis.
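To illustrate what a Type-4 clone pair and a zero-shot CCD prompt might look like, here is a small hypothetical sketch in Python; the prompt template, function names, and wording are assumptions for illustration, not the paper's actual prompts or data (which use Java and Ruby pairs from CodeNet).

```python
import inspect

# A hypothetical Type-4 clone pair: syntactically different code with the
# same functionality (here, summing the even numbers in a list).

def sum_even_iterative(nums):
    total = 0
    for n in nums:
        if n % 2 == 0:
            total += n
    return total

def sum_even_functional(nums):
    return sum(filter(lambda n: n % 2 == 0, nums))

# An assumed zero-shot prompt template: the model is asked a yes/no
# classification question rather than a generative one.
ZERO_SHOT_PROMPT = (
    "Do the following two code snippets solve the same problem? "
    "Answer yes or no.\n\nSnippet A:\n{a}\n\nSnippet B:\n{b}"
)

prompt = ZERO_SHOT_PROMPT.format(
    a=inspect.getsource(sum_even_iterative),
    b=inspect.getsource(sum_even_functional),
)

# The pair really is behaviorally equivalent despite differing syntax.
assert sum_even_iterative([1, 2, 3, 4]) == sum_even_functional([1, 2, 3, 4]) == 6
```

Framing clone detection as a yes/no question is what makes it a `non-generative' use of a prompt-based LLM: the output space is a label, not free-form code or text.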
Understanding Programs by Exploiting (Fuzzing) Test Cases
Jianyu Zhao, Yuyang Rong, Yiwen Guo, Yifeng He, Hao Chen
Semantic understanding of programs has attracted great attention in the community. Inspired by recent successes of large language models (LLMs) in natural language understanding, tremendous progress has been made by treating programming language as another sort of natural language and training LLMs on corpora of program code. However, programs are essentially different from texts, in the sense that they are normally heavily structured and syntax-strict. In particular, programs and their basic units (i.e., functions and subroutines) are designed to demonstrate a variety of behaviors and/or provide possible outputs, given different inputs. The relationship between inputs and possible outputs/behaviors represents the functions/subroutines and profiles the program as a whole. Therefore, we propose to incorporate such a relationship into learning, for achieving a deeper semantic understanding of programs. To obtain inputs that are representative enough to trigger the execution of most parts of the code, we resort to fuzz testing and propose fuzz tuning to boost the performance of program understanding and code representation learning, given a pre-trained LLM. The effectiveness of the proposed method is verified on two program understanding tasks, code clone detection and code classification, and it outperforms the current state of the art by large margins. Code is available at https://github.com/rabbitjy/FuzzTuning.
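The core intuition, that a function is characterized by its input/output relationship, can be sketched in a few lines. The toy below profiles functions by their outputs on randomly generated inputs; it is only an illustration of the I/O-signature idea under assumed function names, not the paper's fuzz-tuning pipeline, which feeds fuzzer-generated test cases into LLM training.

```python
import random

# Toy sketch: profile a function by the outputs it produces on many
# randomly generated inputs. Two functions with identical I/O signatures
# are behaviorally similar, which is useful signal for clone detection.

def io_signature(fn, trials=100, seed=0):
    rng = random.Random(seed)  # fixed seed so signatures are comparable
    inputs = [[rng.randint(-50, 50) for _ in range(5)] for _ in range(trials)]
    return tuple(fn(list(xs)) for xs in inputs)

def max_builtin(xs):
    return max(xs)

def max_manual(xs):
    best = xs[0]
    for x in xs[1:]:
        if x > best:
            best = x
    return best

# Identical signatures suggest the two implementations are clones.
assert io_signature(max_builtin) == io_signature(max_manual)
```

Random inputs stand in for the fuzzer here; actual fuzz testing generates inputs guided by coverage so that most parts of the code are exercised, which is precisely why the paper resorts to it for representative inputs.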
Programming in 'natural' language is coming sooner than you think
Sometimes major shifts happen virtually unnoticed. CodeNet is a follow-up to ImageNet, a large-scale dataset of images and their descriptions; the images are free for non-commercial uses. ImageNet is now central to the progress of deep learning computer vision. CodeNet is an attempt to do for Artificial Intelligence (AI) coding what ImageNet did for computer vision: it is a dataset of over 14 million code samples, covering 50 programming languages, intended to solve 4,000 coding problems. The dataset also contains additional metadata, such as the amount of memory each sample requires to run and the log output of running code.
Google and Microsoft are creating a monopoly on coding in plain language
Sometimes major shifts happen virtually unnoticed. On May 5, IBM announced Project CodeNet to very little media or academic attention. CodeNet is a follow-up to ImageNet, a large-scale dataset of images and their descriptions; the images are free for non-commercial uses. ImageNet is now central to the progress of deep learning computer vision. CodeNet is an attempt to do for Artificial Intelligence (AI) coding what ImageNet did for computer vision: it is a dataset of over 14 million code samples, covering 50 programming languages, intended to solve 4,000 coding problems.
IBM CodeNet: Artificial Intelligence That Can Program Computers And Solve A $100 Billion Legacy Code Problem
Computer scientists have long toyed with the idea of creating computers that could write programs for other computers. Artificial intelligence is an obvious technology for the task. It has previously been used for programming on a small scale, but the results have been limited. Artificial intelligence is one of our most powerful and versatile technologies in use today. It can understand and generate speech, analyze documents, recognize images and characters, drive cars, pilot war planes, write papers, and perform thousands of other valuable operations.
IBM's Project CodeNet will test how far you can push AI to write software
IBM's AI research division has released a 14-million-sample dataset to develop machine learning models that can help in programming tasks. Called Project CodeNet, the dataset takes its name after ImageNet, the famous repository of labeled photos that triggered a revolution in computer vision and deep learning. While there's a scant chance that machine learning models built on the CodeNet dataset will make human programmers redundant, there's reason to be hopeful that they will make developers more productive. In the early 2010s, impressive advances in machine learning triggered excitement (and fear) about artificial intelligence soon automating many tasks, including programming. But AI's penetration in software development has been extremely limited.
IBM's CodeNet dataset can teach AI to translate computer languages
AI and machine learning systems have become increasingly competent in recent years, capable of not just understanding the written word but writing it as well. But while these artificial intelligences have nearly mastered the English language, they have yet to become fluent in the language of computers -- that is, until now. IBM announced during its Think 2021 conference on Monday that its researchers have crafted a Rosetta Stone for programming code. Over the past decade, advancements in AI have mainly been "driven by deep neural networks, and even that, it was driven by three major factors: data with the availability of large data sets for training, innovations in new algorithms, and the massive acceleration of faster and faster compute hardware driven by GPUs," Ruchir Puri, IBM Fellow and Chief Scientist at IBM Research, said during his Think 2021 presentation, likening the new data set to the venerated ImageNet, which has spawned the recent computer vision land rush. "Software is eating the world," Marc Andreessen wrote in 2011.