Goto

Collaborating Authors

 typescript


Type-Constrained Code Generation with Language Models

arXiv.org Artificial Intelligence

Large language models (LLMs) have achieved notable success in code generation. However, they still frequently produce uncompilable output because their next-token inference procedure does not model formal aspects of code. Although constrained decoding is a promising approach to alleviate this issue, it has only been applied to handle either domain-specific languages or syntactic features of general-purpose programming languages. However, LLMs frequently generate code with typing errors, which are beyond the domain of syntax and generally hard to adequately constrain. To address this challenge, we introduce a type-constrained decoding approach that leverages type systems to guide code generation. For this purpose, we develop novel prefix automata and a search over inhabitable types, forming a sound approach to enforce well-typedness on LLM-generated code. We formalize our approach on a foundational simply-typed language and extend it to TypeScript to demonstrate practicality. Our evaluation on the HumanEval and MBPP datasets shows that our approach reduces compilation errors by more than half and significantly increases functional correctness in code synthesis, translation, and repair tasks across LLMs of various sizes and model families, including state-of-the-art open-weight models with more than 30B parameters. The results demonstrate the generality and effectiveness of our approach in constraining LLM code generation with formal rules of type systems.


Skeet: Towards a Lightweight Serverless Framework Supporting Modern AI-Driven App Development

arXiv.org Artificial Intelligence

The field of web and mobile software frameworks is relatively mature, with a large variety of tools in different languages that facilitate traditional app development where data in a relational database is displayed and modified. Our position is that many current frameworks became popular during single server deployment of MVC architecture apps, and do not facilitate modern aspects of app development such as cloud computing and the incorporation of emerging technologies such as AI. We present a novel framework which accomplishes these purposes, Skeet, which was recently released to general use, alongside an initial evaluation. Skeet provides an app structure that reflects current trends in architecture, and tool suites that allow developers with minimal knowledge of AI internals to easily incorporate such technologies into their apps and deploy them.


Activation Steering for Robust Type Prediction in CodeLLMs

arXiv.org Artificial Intelligence

Contemporary LLMs pretrained on code are capable of succeeding at a wide variety of programming tasks. However, their performance is very sensitive to syntactic features, such as the names of variables and types, the structure of code, and presence of type hints. We contribute an inference-time technique to make CodeLLMs more robust to syntactic distractors that are semantically irrelevant. Our methodology relies on activation steering, which involves editing internal model activations to steer the model towards the correct prediction. We contribute a novel way to construct steering vectors by taking inspiration from mutation testing, which constructs minimal semantics-breaking code edits. In contrast, we construct steering vectors from semantics-preserving code edits. We apply our approach to the task of type prediction for the gradually typed languages Python and TypeScript. This approach corrects up to 90% of type mispredictions. Finally, we show that steering vectors calculated from Python activations reliably correct type mispredictions in TypeScript, and vice versa. This result suggests that LLMs may be learning to transfer knowledge of types across programming languages.


Can Programming Languages Boost Each Other via Instruction Tuning?

arXiv.org Artificial Intelligence

When human programmers have mastered a programming language, it would be easier when they learn a new programming language. In this report, we focus on exploring whether programming languages can boost each other during the instruction fine-tuning phase of code large language models. We conduct extensive experiments of 8 popular programming languages (Python, JavaScript, TypeScript, C, C++, Java, Go, HTML) on StarCoder. Results demonstrate that programming languages can significantly improve each other. For example, CodeM-Python 15B trained on Python is able to increase Java by an absolute 17.95% pass@1 on HumanEval-X. More surprisingly, we found that CodeM-HTML 7B trained on the HTML corpus can improve Java by an absolute 15.24% pass@1. Our training data is released at https://github.com/NL2Code/CodeM.


Measuring The Impact Of Programming Language Distribution

arXiv.org Artificial Intelligence

Current benchmarks for evaluating neural code models focus on only a small subset of programming languages, excluding many popular languages such as Go or Rust. To ameliorate this issue, we present the BabelCode framework for execution-based evaluation of any benchmark in any language. BabelCode enables new investigations into the qualitative performance of models' memory, runtime, and individual test case results. Additionally, we present a new code translation dataset called Translating Python Programming Puzzles (TP3) from the Python Programming Puzzles (Schuster et al. 2021) benchmark that involves translating expert-level python functions to any language. With both BabelCode and the TP3 benchmark, we investigate if balancing the distributions of 14 languages in a training dataset improves a large language model's performance on low-resource languages. Training a model on a balanced corpus results in, on average, 12.34% higher $pass@k$ across all tasks and languages compared to the baseline. We find that this strategy achieves 66.48% better $pass@k$ on low-resource languages at the cost of only a 12.94% decrease to high-resource languages. In our three translation tasks, this strategy yields, on average, 30.77% better low-resource $pass@k$ while having 19.58% worse high-resource $pass@k$.


GitHub - transitive-bullshit/chatgpt-plugin-ts: Everything you need to start building ChatGPT Plugins in JS/TS 🔥

#artificialintelligence

This repo contains the chatgpt-plugin NPM package, with TS types and utilities for building ChatGPT Plugins with TypeScript. It also contains several high quality example plugins that you can use as a template for building your own plugins. The goal is to add more examples using different OpenAPI frameworks and hosting providers over time. Currently, all of the examples use Cloudflare Workers, but I'll add an example using Vercel serverless functions soon. If there's something missing that you'd like to see, please open an issue or join our ChatGPT Hackers community on Discord, with over 8000 developers who are building cool stuff with AI! TS code for all example plugins can be found in the examples directory.



MultiPL-E: A Scalable and Extensible Approach to Benchmarking Neural Code Generation

arXiv.org Artificial Intelligence

Large language models have demonstrated the ability to generate both natural language and programming language text. Such models open up the possibility of multi-language code generation: could code generation models generalize knowledge from one language to another? Although contemporary code generation models can generate semantically correct Python code, little is known about their abilities with other languages. We propose MultiPL-E, a system for translating unit test-driven code generation benchmarks to new languages. We create the first massively multilingual code generation benchmark by using MultiPL-E to translate two popular Python code generation benchmarks to 18 additional programming languages. We use MultiPL-E to extend the HumanEval benchmark and MBPP benchmark to 18 languages that encompass a range of programming paradigms and popularity. Using these new parallel benchmarks, we evaluate the multi-language performance of three state-of-the-art code generation models: Codex, CodeGen, and InCoder. We find that Codex matches or even exceeds its performance on Python for several other languages. The range of programming languages represented in MultiPL-E allow us to explore the impact of language frequency and language features on model performance. Finally, the MultiPL-E approach of compiling code generation benchmarks to new programming languages is both scalable and extensible, making it straightforward to evaluate new models, benchmarks, and languages.


Best language for machine learning in 2022: Is it Python?

#artificialintelligence

If you're new to the topic, the hardest part of mastering machine learning is figuring out where to start. It is normal to question the ideal language for machine learning, regardless of whether you are looking to brush up on your machine learning knowledge or completely change careers. Finding the ideal programming language for machine learning is undoubtedly difficult because over 700 distinct programming languages are widely used, and each has advantages and disadvantages. The good news is that you'll start to identify which programming language will best suit a business problem you are trying to address as you start your journey as a machine learning engineer. Which programming language is ideal for machine learning is certainly on your mind if you're considering a career in this area. While numerous options are available for various uses, in this post, we'll focus on the top machine learning languages. It's crucial to comprehend the fundamentals of creating an ML model before discovering why particular programming languages are better suited for ML.


10 Must-Know Patterns for Writing Clean Code with React and TypeScript 🛀

#artificialintelligence

React is a JavaScript library, and it is the most popular and industry-leading frontend development library today. JavaScript is a loosely typed language, and as a result, it catches runtime. The result of this is that JavaScript errors are caught very late and this can lead to nasty bugs. As a JavaScript library, React inherits this problem. Clean code is a consistent style of programming that makes your code easier to write, read, and maintain. Anyone can write code that a computer can understand but good developers write clean code – code that humans can understand.