AITopics | Christodorescu, Mihai

Collaborating Authors

Christodorescu, Mihai

LLM-Driven Multi-step Translation from C to Rust using Static Analysis

Zhou, Tianyang, Lin, Haowen, Jha, Somesh, Christodorescu, Mihai, Levchenko, Kirill, Chandrasekaran, Varun

arXiv.org Artificial Intelligence, Mar-18-2025

Translating software written in legacy languages to modern languages, such as C to Rust, has significant benefits in improving memory safety while maintaining high performance. However, manual translation is cumbersome, error-prone, and produces unidiomatic code. Large language models (LLMs) have demonstrated promise in producing idiomatic translations, but offer no correctness guarantees, as they lack the ability to capture all the semantic differences between the source and target languages. To resolve this issue, we propose SACTOR, an LLM-driven C-to-Rust zero-shot translation tool using a two-step translation methodology: an "unidiomatic" step to translate C into Rust while preserving semantics, and an "idiomatic" step to refine the code to follow Rust's semantic standards. SACTOR utilizes information provided by static analysis of the source C program to address challenges such as pointer semantics and dependency resolution. To validate the correctness of the translated result from each step, we use end-to-end testing via the foreign function interface to embed our translated code segment into the original code. We evaluate the translation of 200 programs from two datasets and two case studies, comparing the performance of GPT-4o, Claude 3.5 Sonnet, Gemini 2.0 Flash, Llama 3.3 70B and DeepSeek-R1 in SACTOR. Our results demonstrate that SACTOR achieves high correctness and improved idiomaticity, with the best-performing models reaching 93% correctness on one dataset (DeepSeek-R1) and 84% on the other (GPT-4o, Claude 3.5, DeepSeek-R1), while producing more natural and Rust-compliant translations compared to existing methods.

large language model, machine learning, translation, (19 more...)

2503.12511

Country:

North America > United States > Illinois (0.14)
North America > United States > Wisconsin (0.14)

Genre: Research Report > New Finding (0.86)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Securing the Future of GenAI: Policy and Technology

Christodorescu, Mihai, Craven, Ryan, Feizi, Soheil, Gong, Neil, Hoffmann, Mia, Jha, Somesh, Jiang, Zhengyuan, Kamarposhti, Mehrdad Saberi, Mitchell, John, Newman, Jessica, Probasco, Emelia, Qi, Yanjun, Shams, Khawaja, Turek, Matthew

arXiv.org Artificial Intelligence, May-21-2024

The rise of Generative AI (GenAI) brings about transformative potential across sectors, but its dual-use nature also amplifies risks. Governments globally are grappling with the challenge of regulating GenAI, balancing innovation against safety. China, the United States (US), and the European Union (EU) are at the forefront with initiatives like the Management of Algorithmic Recommendations, the Executive Order, and the AI Act, respectively. However, the rapid evolution of GenAI capabilities often outpaces the development of comprehensive safety measures, creating a gap between regulatory needs and technical advancements. A workshop co-organized by Google, the University of Wisconsin-Madison (UW-Madison), and Stanford University aimed to bridge this gap between GenAI policy and technology. The diverse stakeholders of the GenAI space -- from the public and governments to academia and industry -- make any safety measures under consideration more complex, as both technical feasibility and regulatory guidance must be realized. This paper summarizes the discussions during the workshop, which addressed questions such as: How can regulation be designed without hindering technological progress? How can technology evolve to meet regulatory standards? The interplay between legislation and technology is a vast topic, and we do not claim that this paper is a comprehensive treatment of it. This paper is meant to capture findings based on the workshop and, we hope, to guide discussion on this topic.

large language model, machine learning, natural language, (21 more...)

2407.12999

Country:

Europe (1.00)
Asia (1.00)
North America > United States > Wisconsin > Dane County > Madison (0.24)

Genre:

Overview (0.92)
Research Report > New Finding (0.45)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
(3 more...)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(5 more...)

Do Large Code Models Understand Programming Concepts? A Black-box Approach

Hooda, Ashish, Christodorescu, Mihai, Allamanis, Miltos, Wilson, Aaron, Fawaz, Kassem, Jha, Somesh

arXiv.org Artificial Intelligence, Feb-8-2024

Large Language Models' success on text generation has also made them better at code generation and coding tasks. While a lot of work has demonstrated their remarkable performance on tasks such as code completion and editing, it is still unclear as to why. We help bridge this gap by exploring to what degree auto-regressive models understand the logical constructs of the underlying programs. We propose Counterfactual Analysis for Programming Concept Predicates (CACP) as a counterfactual testing framework to evaluate whether Large Code Models understand programming concepts. With only black-box access to the model, we use CACP to evaluate ten popular Large Code Models for four different programming concepts. Our findings suggest that current models lack understanding of concepts such as data flow and control flow.
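The idea of counterfactual testing with black-box access can be illustrated with a minimal sketch. This is not the CACP procedure itself: the two stub "models", the function `responds_to_counterfactual`, and the toy programs are all hypothetical, chosen only to show how a mutation that changes data flow should change the output of a model that tracks it.

```python
import re
from typing import Callable

# A black-box model is treated as a function from source text to a predicted result.
Model = Callable[[str], str]


def surface_model(source: str) -> str:
    """Hypothetical stub: 'predicts' the result by echoing the first integer literal."""
    match = re.search(r"\d+", source)
    return match.group() if match else ""


def executing_model(source: str) -> str:
    """Hypothetical reference stub: actually runs the snippet and reads `result`."""
    scope: dict = {}
    exec(source, {}, scope)
    return str(scope["result"])


def responds_to_counterfactual(model: Model, original: str, mutated: str) -> bool:
    """True if the model's output changes when the program's data flow changes."""
    return model(original) != model(mutated)


# The mutation swaps which variable flows into `result`; everything else is identical.
original = "x = 1\ny = 2\nresult = x\n"
mutated = "x = 1\ny = 2\nresult = y\n"

print(responds_to_counterfactual(surface_model, original, mutated))    # False
print(responds_to_counterfactual(executing_model, original, mutated))  # True
```

The surface-level stub gives the same answer for both programs, which is the kind of insensitivity a counterfactual test is meant to expose.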

artificial intelligence, large language model, natural language, (14 more...)

2402.05980

Country:

North America > United States > Wisconsin (0.14)
Europe > Portugal (0.14)

Genre: Research Report > New Finding (0.68)

Industry: Transportation > Air (0.62)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.70)

Identifying and Mitigating the Security Risks of Generative AI

Barrett, Clark, Boyd, Brad, Burzstein, Elie, Carlini, Nicholas, Chen, Brad, Choi, Jihye, Chowdhury, Amrita Roy, Christodorescu, Mihai, Datta, Anupam, Feizi, Soheil, Fisher, Kathleen, Hashimoto, Tatsunori, Hendrycks, Dan, Jha, Somesh, Kang, Daniel, Kerschbaum, Florian, Mitchell, Eric, Mitchell, John, Ramzan, Zulfikar, Shams, Khawaja, Song, Dawn, Taly, Ankur, Yang, Diyi

arXiv.org Artificial Intelligence, Dec-28-2023

Every major technical invention resurfaces the dual-use dilemma -- the new technology has the potential to be used for good as well as for harm. Generative AI (GenAI) techniques, such as large language models (LLMs) and diffusion models, have shown remarkable capabilities (e.g., in-context learning, code-completion, and text-to-image generation and editing). However, GenAI can be used just as well by attackers to generate new attacks and increase the velocity and efficacy of existing attacks. This paper reports the findings of a workshop held at Google (co-organized by Stanford University and the University of Wisconsin-Madison) on the dual-use dilemma posed by GenAI. This paper is not meant to be comprehensive, but is rather an attempt to synthesize some of the interesting findings from the workshop. We discuss short-term and long-term goals for the community on this topic. We hope this paper provides both a launching point for a discussion on this important topic as well as interesting problems that the research community can work to address.

large language model, machine learning, natural language, (18 more...)

doi: 10.1561/3300000041

2308.14840

Country:

Europe (1.00)
North America > United States > Wisconsin > Dane County > Madison (0.24)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Law (1.00)
Information Technology > Security & Privacy (1.00)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Generation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)

Robustness against Relational Adversary

Wang, Yizhen, Meng, Xiaozhu, Wang, Ke, Christodorescu, Mihai, Jha, Somesh

arXiv.org Machine Learning, Oct-29-2020

Test-time adversarial attacks have posed serious challenges to the robustness of machine-learning models, and in many settings the adversarial perturbation need not be bounded by small $\ell_p$-norms. Motivated by the semantics-preserving attacks in the vision and security domains, we investigate $\textit{relational adversaries}$, a broad class of attackers who create adversarial examples that are in a reflexive-transitive closure of a logical relation. We analyze the conditions for robustness and propose $\textit{normalize-and-predict}$ -- a learning framework with a provable robustness guarantee. We compare our approach with adversarial training and derive a unified framework that provides the benefits of both approaches. Guided by our theoretical findings, we apply our framework to image classification and malware detection. Results of both tasks show that attacks using relational adversaries frequently fool existing models, but our unified framework can significantly enhance their robustness.
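The normalize-and-predict idea can be sketched in a few lines: map every input to a canonical form before classifying, so that inputs related by the attacker's transformations receive the same prediction. The sketch below is an assumption-laden toy, not the paper's implementation: the relation (inserting `nop` tokens, changing letter case), the function names, and the rule-based `toy_classifier` are all hypothetical.

```python
from typing import Callable, Sequence, Tuple

Tokens = Tuple[str, ...]


def normalize(tokens: Sequence[str]) -> Tokens:
    """Map an input to a canonical form: lowercase tokens and drop no-op tokens."""
    return tuple(t.lower() for t in tokens if t.lower() != "nop")


def toy_classifier(tokens: Tokens) -> str:
    """Hypothetical stand-in model: flags sequences containing both tokens."""
    return "malicious" if {"download", "exec"} <= set(tokens) else "benign"


def normalize_and_predict(
    tokens: Sequence[str], model: Callable[[Tokens], str] = toy_classifier
) -> str:
    """Classify the canonical form instead of the raw input."""
    return model(normalize(tokens))


original = ["download", "exec"]
# Assumed attacker moves: insert no-ops and change case (semantics unchanged).
perturbed = ["NOP", "Download", "nop", "EXEC", "nop"]

print(toy_classifier(tuple(perturbed)))   # benign (raw model is fooled)
print(normalize_and_predict(perturbed))   # malicious
```

Because both inputs normalize to the same tuple, any model placed behind `normalize` gives them the same label, which is the source of robustness in this toy setting.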

accuracy, deep learning, neural network, (18 more...)

2007.00772

Country: North America > United States > Wisconsin (0.14)

Genre: Research Report (0.64)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

COSET: A Benchmark for Evaluating Neural Program Embeddings

Wang, Ke, Christodorescu, Mihai

arXiv.org Machine Learning, May-27-2019

Neural program embedding can be helpful in analyzing large software, a task that is challenging for traditional logic-based program analyses due to their limited scalability. A key focus of recent machine-learning advances in this area is on modeling program semantics instead of just syntax. Unfortunately, evaluating such advances is not straightforward, as program semantics does not lend itself to simple metrics. In this paper, we introduce a benchmarking framework called COSET for standardizing the evaluation of neural program embeddings. COSET consists of a diverse dataset of programs in source-code format, labeled by human experts according to a number of program properties of interest. A point of novelty is a suite of program transformations included in COSET. These transformations, when applied to the base dataset, can simulate natural changes to program code due to optimization and refactoring, and can serve as a "debugging" tool for classification mistakes. We conducted a pilot study on four prominent models: TreeLSTM [1], gated graph neural network (GGNN) [2], AST-Path neural network (APNN) [3], and DYPRO [4]. We found that COSET is useful in identifying the strengths and limitations of each model and in pinpointing specific syntactic and semantic characteristics of programs that pose challenges.
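A semantics-preserving program transformation of the kind described above can be sketched with Python's standard `ast` module. The abstract does not list COSET's actual transformations, so variable renaming is used here purely as an assumed example of a refactoring-style change; `RenameVariables`, `rename`, and `run` are hypothetical helpers (`ast.unparse` requires Python 3.9 or later).

```python
import ast


class RenameVariables(ast.NodeTransformer):
    """Rename variables and parameters according to a fixed mapping."""

    def __init__(self, mapping: dict):
        self.mapping = mapping

    def visit_Name(self, node: ast.Name) -> ast.Name:
        node.id = self.mapping.get(node.id, node.id)
        return node

    def visit_arg(self, node: ast.arg) -> ast.arg:
        node.arg = self.mapping.get(node.arg, node.arg)
        return node


def rename(source: str, mapping: dict) -> str:
    """Return the source with identifiers renamed; behavior is unchanged."""
    tree = RenameVariables(mapping).visit(ast.parse(source))
    return ast.unparse(tree)


def run(source: str) -> int:
    """Execute the snippet and call its `total` function on a fixed input."""
    scope: dict = {}
    exec(source, scope)
    return scope["total"]([1, 2, 3])


base = (
    "def total(values):\n"
    "    acc = 0\n"
    "    for v in values:\n"
    "        acc += v\n"
    "    return acc\n"
)
variant = rename(base, {"values": "a", "acc": "b", "v": "c"})

print(variant)
print(run(base) == run(variant))  # True: same semantics, different syntax
```

A model whose label for `variant` differs from its label for `base` is reacting to surface syntax rather than semantics, which is how such transformations can help debug classification mistakes.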

coset, deep learning, software engineering, (22 more...)

1905.11445

Country: North America > United States > California (0.28)

Genre: Research Report (0.82)

Industry:

Education > Educational Setting > Online (0.46)
Information Technology (0.46)

Technology:

Information Technology > Software Engineering (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
