 fortran



CodeRosetta: Pushing the Boundaries of Unsupervised Code Translation for Parallel Programming

Neural Information Processing Systems

Automatic translation of programming languages has garnered renewed interest, driven by recent advancements in large language models (LLMs). Encoder-decoder transformer models, in particular, have shown promise in translating between different programming languages. However, translating between a language and its high-performance computing (HPC) extension remains underexplored due to inherent challenges such as understanding complex parallel semantics. In this paper, we introduce CodeRosetta, an encoder-decoder transformer model explicitly designed for translating between programming languages and their HPC extensions. CodeRosetta is evaluated on C++ to CUDA and Fortran to C++ translation. It employs a customized learning-based framework with tailored pretraining and training objectives that enable it to effectively capture code semantics and parallel structural nuances, allowing for bidirectional code translation. Our results show that CodeRosetta outperforms state-of-the-art baselines in C++ to CUDA translation by 2.9 BLEU and 1.72 CodeBLEU points while improving compilation accuracy by 6.05%. Compared to general closed-source LLMs, our proposed bidirectional learning-based method improves C++ to CUDA translation by 22.08 BLEU and 14.39 CodeBLEU points, with 2.75% higher compilation accuracy. Finally, CodeRosetta exhibits proficiency in Fortran to parallel C++ translation, marking it, to our knowledge, as the first encoder-decoder model for such a complex translation task, improving CodeBLEU by at least 4.63 points compared to closed-source and open-code LLMs.



Agnostics: Learning to Code in Any Programming Language via Reinforcement with a Universal Learning Environment

Boruch-Gruszecki, Aleksander, Zi, Yangtian, Wu, Zixuan, Oberoi, Tejas, Anderson, Carolyn Jane, Biswas, Joydeep, Guha, Arjun

arXiv.org Artificial Intelligence

Large language models (LLMs) already excel at writing code in high-resource languages such as Python and JavaScript, yet stumble on low-resource languages that remain essential to science and engineering. Besides the obvious shortage of pre-training data, post-training itself is a bottleneck: every new language seems to require new datasets, test harnesses, and reinforcement-learning (RL) infrastructure. We introduce Agnostics, a language-agnostic post-training pipeline that eliminates this per-language engineering. The key idea is to judge code solely by its externally observable behavior, so a single verifier can test solutions written in any language. Concretely, we (i) use an LLM to rewrite existing unit-test datasets into an I/O format, (ii) supply a short configuration that tells the verifier how to compile and run a target language, and (iii) apply reinforcement learning with verifiable rewards (RLVR) in a robust code execution environment. Applied to five low-resource languages (Lua, Julia, R, OCaml, and Fortran), Agnostics (1) improves Qwen-3 4B to performance that rivals other 16B-70B open-weight models; (2) scales cleanly to larger and diverse model families (Qwen-3 8B, DeepSeek Coder 6.7B Instruct, Phi 4 Mini); and (3) for models with at most 16B parameters, sets new state-of-the-art pass@1 results on MultiPL-E and on a new multi-language version of LiveCodeBench that we introduce. We will release the language-agnostic training datasets (Ag-MBPP-X, Ag-Codeforces-X, Ag-LiveCodeBench-X), training code, and ready-to-use configurations, making RL post-training in any programming language as simple as editing a short YAML file.
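
The idea of judging code only by its observable behavior can be illustrated with a small sketch. This is not the Agnostics implementation: the names `LANG_CONFIG` and `verify` are hypothetical, the configuration format is an assumption, and only Python is configured so that the example runs without extra toolchains. Each test is a pair of standard input and expected standard output, and the verifier never inspects the source language beyond looking up how to run it.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical per-language configuration (an assumption, not the paper's format).
# Adding a language would mean adding an entry with its file extension and
# the command used to run it; "{src}" is replaced by the source file path.
LANG_CONFIG = {
    "python": {"ext": ".py", "run": [sys.executable, "{src}"]},
}


def verify(lang, source, tests, timeout=10.0):
    """Return True if the program maps every stdin in `tests` to the expected stdout."""
    cfg = LANG_CONFIG[lang]
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / ("solution" + cfg["ext"])
        src.write_text(source)
        cmd = [part.replace("{src}", str(src)) for part in cfg["run"]]
        for stdin_text, expected in tests:
            try:
                proc = subprocess.run(
                    cmd,
                    input=stdin_text,
                    capture_output=True,
                    text=True,
                    timeout=timeout,
                )
            except subprocess.TimeoutExpired:
                return False  # a hanging program counts as a failure
            # Compare only externally observable behavior: exit code and stdout.
            if proc.returncode != 0 or proc.stdout.strip() != expected.strip():
                return False
    return True


def reward(lang, source, tests):
    """Binary verifiable reward: 1.0 if all I/O tests pass, else 0.0."""
    return 1.0 if verify(lang, source, tests) else 0.0
```

A binary reward of this kind is one simple way to connect such a verifier to RL training; the abstract does not specify the exact reward shape used.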


LLM Benchmarking with LLaMA2: Evaluating Code Development Performance Across Multiple Programming Languages

Diehl, Patrick, Nader, Nojoud, Moraru, Maxim, Brandt, Steven R.

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have made significant advances in various code-related tasks, particularly in generating source code from natural language descriptions (Zhao et al., 2023; Chang et al., 2024). Their effectiveness is primarily driven by their large number of model parameters, the use of large and diverse datasets, and the immense computational resources employed during training (Kaplan et al., 2020). These models are typically trained on vast corpora sourced from the web. LLMs are capable of capturing intricate patterns, linguistic subtleties, and semantic relationships. A wide range of models is available for code generation. There are general-purpose models such as ChatGPT (Ouyang et al., 2022), GPT-4 (Achiam et al., 2023), and LLaMA (Touvron et al., 2023a), which are designed for a broad range of applications, as well as specialized models such as StarCoder, Code LLaMA (Roziere et al., 2023), DeepSeek-Coder, and Code Gemma, which are optimized for code-related tasks. The integration of code generation with the latest advances in LLM technology is now an essential tool for many businesses, as well as an essential target for LLM developers, as programming languages are considered to be different dialects of natural language (Athiwaratkun et al., 2022).


Native Fortran Implementation of TensorFlow-Trained Deep and Bayesian Neural Networks

Furlong, Aidan, Zhao, Xingang, Salko, Bob, Wu, Xu

arXiv.org Artificial Intelligence

Over the past decade, the investigation of machine learning (ML) within the field of nuclear engineering has grown significantly. With many approaches reaching maturity, the next phase of investigation will determine the feasibility and usefulness of ML model implementation in a production setting. Several of the codes used for reactor design and assessment are primarily written in the Fortran language, which is not immediately compatible with TensorFlow-trained ML models. This study presents a framework for implementing deep neural networks (DNNs) and Bayesian neural networks (BNNs) in Fortran, allowing for native execution without TensorFlow's C API, Python runtime, or ONNX conversion. Designed for ease of use and computational efficiency, the framework can be implemented in any Fortran code, supporting iterative solvers and uncertainty quantification (UQ) via ensembles or BNNs. Verification was performed using a two-input, one-output test case composed of a noisy sinusoid to compare Fortran-based predictions to those from TensorFlow. The DNN predictions showed negligible differences and achieved a 19.6x speedup, whereas the BNN predictions exhibited minor disagreement, plausibly due to differences in random number generation. An 8.0x speedup was noted for BNN inference. The approach was then further verified on a nuclear-relevant problem predicting critical heat flux (CHF), which demonstrated similar behavior along with significant computational gains. Discussion regarding the framework's successful integration into the CTF thermal-hydraulics code is also included, outlining its practical usefulness. Overall, this framework was shown to be effective at implementing both DNN and BNN model inference within Fortran, allowing for the continued study of ML-based methods in real-world nuclear applications.
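
Why native inference needs no ML runtime can be seen from a minimal sketch of a dense-network forward pass, written here in Python rather than Fortran for brevity. It is an illustration only, not the framework described above: the function names, the layer layout, and the weight values in `LAYERS` are hypothetical, standing in for parameters exported from a trained model. Once the weights and biases are available as plain arrays, a prediction is a sequence of matrix-vector products and activation functions.

```python
def relu(z):
    return max(0.0, z)


def identity(z):
    return z


def dense(x, weights, bias, activation):
    """One fully connected layer: activation(W x + b), with one row of W per output."""
    return [
        activation(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(weights, bias)
    ]


def forward(x, layers):
    """Apply the layers in order to the input vector x."""
    for weights, bias, activation in layers:
        x = dense(x, weights, bias, activation)
    return x


# Hypothetical parameters for a two-input, one-output network
# (made-up values, not taken from any trained model).
LAYERS = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, -1.0], relu),  # hidden layer, 2 units
    ([[2.0, 1.0]], [0.1], identity),                 # linear output layer
]

prediction = forward([3.0, 1.0], LAYERS)
```

The same loop structure maps directly onto Fortran arrays and intrinsic operations, which is presumably where the reported speedups over a Python-hosted runtime come from; the abstract does not give implementation details.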


CodeRosetta: Pushing the Boundaries of Unsupervised Code Translation for Parallel Programming

TehraniJamsaz, Ali, Bhattacharjee, Arijit, Chen, Le, Ahmed, Nesreen K., Yazdanbakhsh, Amir, Jannesari, Ali

arXiv.org Artificial Intelligence

Recent advancements in Large Language Models (LLMs) have renewed interest in automatic programming language translation. Encoder-decoder transformer models, in particular, have shown promise in translating between different programming languages. However, translating between a language and its high-performance computing (HPC) extensions remains underexplored due to challenges such as complex parallel semantics. In this paper, we introduce CodeRosetta, an encoder-decoder transformer model designed specifically for translating between programming languages and their HPC extensions. CodeRosetta is evaluated on C++ to CUDA and Fortran to C++ translation tasks. It uses a customized learning framework with tailored pretraining and training objectives to effectively capture both code semantics and parallel structural nuances, enabling bidirectional translation. Our results show that CodeRosetta outperforms state-of-the-art baselines in C++ to CUDA translation by 2.9 BLEU and 1.72 CodeBLEU points while improving compilation accuracy by 6.05%. Compared to general closed-source LLMs, our method improves C++ to CUDA translation by 22.08 BLEU and 14.39 CodeBLEU, with 2.75% higher compilation accuracy. Finally, CodeRosetta exhibits proficiency in Fortran to parallel C++ translation, marking it, to our knowledge, as the first encoder-decoder model for this complex task, improving CodeBLEU by at least 4.63 points compared to closed-source and open-code LLMs.


Evaluating AI-generated code for C++, Fortran, Go, Java, Julia, Matlab, Python, R, and Rust

Diehl, Patrick, Nader, Noujoud, Brandt, Steve, Kaiser, Hartmut

arXiv.org Artificial Intelligence

This study evaluates the capabilities of ChatGPT versions 3.5 and 4 in generating code across a diverse range of programming languages. Our objective is to assess the effectiveness of these AI models for generating scientific programs. To this end, we asked ChatGPT to generate three distinct codes: a simple numerical integration, a conjugate gradient solver, and a parallel 1D stencil-based heat equation solver. The focus of our analysis was on the compilation, runtime performance, and accuracy of the codes. While both versions of ChatGPT successfully created codes that compiled and ran (with some help), some languages were easier for the AI to use than others (possibly because of the size of the training sets used). Parallel codes, even the simple example we chose to study here, were also difficult for the AI to generate correctly.
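
For readers unfamiliar with the third task, the following is a minimal serial sketch of a 1D stencil-based heat equation solver. It is not code from the study: the function names and the choice of an explicit forward-time, centered-space scheme with fixed boundary values are assumptions made for illustration.

```python
def heat_step(u, alpha, dx, dt):
    """One explicit step of u_t = alpha * u_xx; the two boundary values stay fixed."""
    r = alpha * dt / (dx * dx)  # the explicit scheme is stable for r <= 0.5
    new = u[:]
    for i in range(1, len(u) - 1):
        # Three-point stencil: each point is updated from itself and its two neighbors.
        new[i] = u[i] + r * (u[i - 1] - 2.0 * u[i] + u[i + 1])
    return new


def solve(u0, alpha, dx, dt, steps):
    """Advance the initial profile u0 by the given number of time steps."""
    u = list(u0)
    for _ in range(steps):
        u = heat_step(u, alpha, dx, dt)
    return u
```

Each interior update reads only values from the previous time step, so the grid can be split into chunks that are updated independently, with neighboring chunks exchanging their edge values between steps. That exchange is the part a parallel version has to get right.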


The hard truth about AI? It might produce some better software

Naughton, John

The Guardian

As you have doubtless noticed, we are in the middle of a feeding frenzy about something called generative AI. Legions of hitherto normal people – and economists – are surfing a wave of irrational exuberance about its transformative potential. For anyone suffering from the fever, two antidotes are recommended. The first is the hype cycle monitor produced by consultants Gartner, which shows the technology currently perched on the "peak of inflated expectations", before a steep decline into the "trough of disillusionment". The other is Hofstadter's law, about the difficulty of estimating how long difficult tasks will take, which says that "It always takes longer than you expect, even when you take into account Hofstadter's law".