 Generative AI


Challenges and Paths Towards AI for Software Engineering

arXiv.org Artificial Intelligence

AI for software engineering has made remarkable progress recently, becoming a notable success within generative AI. Despite this, there are still many challenges that need to be addressed before automated software engineering reaches its full potential. It should be possible to reach high levels of automation where humans can focus on the critical decisions of what to build and how to balance difficult tradeoffs while most routine development effort is automated away. Reaching this level of automation will require substantial research and engineering efforts across academia and industry. In this paper, we aim to discuss progress towards this in a threefold manner. First, we provide a structured taxonomy of concrete tasks in AI for software engineering, emphasizing the many other tasks in software engineering beyond code generation and completion. Second, we outline several key bottlenecks that limit current approaches. Finally, we provide an opinionated list of promising research directions toward making progress on these bottlenecks, hoping to inspire future research in this rapidly maturing field.


Comparing Methods for Bias Mitigation in Graph Neural Networks

arXiv.org Artificial Intelligence

This paper examines the critical role of Graph Neural Networks (GNNs) in data preparation for generative artificial intelligence (GenAI) systems, with a particular focus on addressing and mitigating biases. We present a comparative analysis of three distinct methods for bias mitigation: data sparsification, feature modification, and synthetic data augmentation. Through experimental analysis using the German Credit dataset, we evaluate these approaches using multiple fairness metrics, including statistical parity, equality of opportunity, and false positive rates. Our research demonstrates that while all methods improve fairness metrics compared to the original dataset, stratified sampling and synthetic data augmentation using GraphSAGE prove particularly effective in balancing demographic representation while maintaining model performance. The results provide practical insights for developing more equitable AI systems.


OpenAI releases impressive 4o image generator for free and paid users

PCWorld

Earlier this week, OpenAI released its "most advanced image generator yet" and made it available through ChatGPT using the GPT-4o model. ChatGPT previously relied on DALL-E to generate images. According to OpenAI, the improved 4o model is able to produce precise, accurate, and photorealistic results. The company claims that it is also particularly good at rendering text, following instructions precisely, and even understanding the context of a chat. This extends to transforming uploaded images or using them as visual inspiration.


OpenAI delays rollout of ChatGPT's image generator to free users

Engadget

Free ChatGPT users will have to wait a while longer to be able to use its built-in image generation capability. OpenAI has just launched a feature that will allow users to generate images directly inside of ChatGPT, and it was supposed to roll out to all Plus, Pro, Team and Free users. But according to company CEO Sam Altman, it has been way more popular than OpenAI had expected even though they already had high expectations to begin with. As such, its rollout to the free tier is "unfortunately going to be delayed for a while." People have been posting ChatGPT's output all over social media.


What is vibe coding, should you be doing it, and does it matter?

New Scientist

Getting an AI to write software for you? Want to write software, but haven't got the first clue where to start? Enter "vibe coding", a term that has swept the internet to describe the use of AI tools, including large language models (LLMs) like ChatGPT, to generate computer code even if you can't program. "Vibe coding basically refers to using generative AI not just to assist with coding, but to generate the entire code for an app," says Noah Giansiracusa at Bentley University in Waltham, Massachusetts. Users ask, or prompt, LLM-based models such as ChatGPT, Claude or Copilot to produce the code for an app or service, and the AI system does all the work.


I Opted Out of AI Training. Does This Reduce My Future Influence?

WIRED

If we all start opting out of our posts being used for training models, doesn't that reduce the influence of our unique voice and perspectives on those models? Increasingly, the models will be everyone's primary window into the rest of the world. It seems like the people who care the least about these things will be the ones with the most data that ends up training the models' default behavior. Honestly, it's frustrating to me that users of the internet are forced to opt out of artificial intelligence training as the default. Wouldn't it be nice if affirmative consent was the norm for generative AI companies as they scrape the web and any other data repositories they can find to build increasingly larger and larger frontier models?


OpenAI close to finalizing $40 billion SoftBank-led funding

The Japan Times

OpenAI is close to finalizing a $40 billion (¥6 trillion) funding round led by SoftBank Group -- with investors including Magnetar Capital, Coatue Management, Founders Fund and Altimeter Capital Management in talks to participate, according to people familiar with the matter. Magnetar Capital -- an Evanston, Illinois-based hedge fund -- could contribute up to $1 billion, according to multiple people, all of whom asked not to be identified because the information is private. The artificial intelligence developer's funding round would be the largest of all time, according to data compiled by research firm PitchBook. The deal is set to value the company at $300 billion including dollars raised -- almost double the ChatGPT maker's previous valuation of $157 billion from when it raised money in October.


Composable Prompting Workspaces for Creative Writing: Exploration and Iteration Using Dynamic Widgets

arXiv.org Artificial Intelligence

Generative AI models offer many possibilities for text creation and transformation. Current graphical user interfaces (GUIs) for prompting them lack support for iterative exploration, as they do not represent prompts as actionable interface objects. We propose the concept of a composable prompting canvas for text exploration and iteration using dynamic widgets. Users generate widgets through system suggestions, prompting, or manually to capture task-relevant facets that affect the generated text. In a comparative study with a baseline (conversational UI), 18 participants worked on two writing tasks, creating diverse prompting environments with custom widgets and spatial layouts. They reported having more control over the generated text and preferred our system over the baseline. Our design significantly outperformed the baseline on the Creativity Support Index, and participants felt the results were worth the effort. This work highlights the need for GUIs that support user-driven customization and (re-)structuring to increase both the flexibility and efficiency of prompting.


Unlocking the Potential of Past Research: Using Generative AI to Reconstruct Healthcare Simulation Models

arXiv.org Artificial Intelligence

Discrete-event simulation (DES) is widely used in healthcare Operations Research, but the models themselves are rarely shared. This limits their potential for reuse and long-term impact in the modelling and healthcare communities. This study explores the feasibility of using generative artificial intelligence (AI) to recreate published models using Free and Open Source Software (FOSS), based on the descriptions provided in an academic journal. Using a structured methodology, we successfully generated, tested and internally reproduced two DES models, including user interfaces. The reported results were replicated for one model, but not the other, likely due to missing information on distributions. These models are substantially more complex than AI-generated DES models published to date. Given the challenges we faced in prompt engineering, code generation, and model testing, we conclude that our iterative approach to model development, systematic comparison and testing, and the expertise of our team were necessary to the success of our recreated simulation models.


The Risks of Using Large Language Models for Text Annotation in Social Science Research

arXiv.org Artificial Intelligence

Generative artificial intelligence (GenAI) or large language models (LLMs) have the potential to revolutionize computational social science, particularly in automated textual analysis. In this paper, we conduct a systematic evaluation of the promises and risks of using LLMs for diverse coding tasks, with social movement studies serving as a case example. We propose a framework for social scientists to incorporate LLMs into text annotation, either as the primary coding decision-maker or as a coding assistant. This framework provides tools for researchers to develop the optimal prompt, and to examine and report the validity and reliability of LLMs as a methodological tool. Additionally, we discuss the associated epistemic risks related to validity, reliability, replicability, and transparency. We conclude with several practical guidelines for using LLMs in text annotation tasks, and how we can better communicate the epistemic risks in research.