prose
AI-Writing Scandals Are Getting Very Confusing
What counts as an acceptable use of AI has never been fuzzier. Steven Rosenbaum has decided that the real villain behind the bogus quotes in his book is a chatbot. Earlier this week, it was reported that Rosenbaum's much-discussed book about how AI shapes reality contains more than half a dozen fake or misattributed quotes. Rosenbaum pinned some of them on his use of AI. He claimed responsibility for the errors and said he was investigating what went wrong.
What if Readers Like A.I.-Generated Fiction?
Finally, he gave the summaries to his fine-tuned model, and he asked it to compose passages "in the style of Vauhini Vara." Going into all this, I was self-assured, even smug. I'd always felt that my style was original and, more important, that my books were totally distinct from one another. I figured that, even if the A.I. model could imitate my past books, it couldn't predict the style of the novel in progress. So, when Chakrabarty sent me the A.I.-generated imitations, I was genuinely confused.
The Strange Ways Writers Are Proving That Their Writing Isn't ChatGPT
The other week, I was reading an email I'd written when a strange notion occurred to me. Would it perhaps be better, an unsettling new voice suddenly whispered, to leave it in? This is a thought that would've appalled me a year ago. As a professional writer, I have long prided myself on impeccable grammar, judiciously wielded punctuation, and (at times indulgent) verbosity.
Aligning LLMs by Predicting Preferences from User Writing Samples
Aroca-Ouellette, Stéphane, Mackraz, Natalie, Theobald, Barry-John, Metcalf, Katherine
Accommodating human preferences is essential for creating aligned LLM agents that deliver personalized and effective interactions. Recent work has shown the potential for LLMs acting as writing agents to infer a description of user preferences. Agent alignment then comes from conditioning on the inferred preference description. However, existing methods often produce generic preference descriptions that fail to capture the unique and individualized nature of human preferences. This paper introduces PROSE, a method designed to enhance the precision of preference descriptions inferred from user writing samples. PROSE incorporates two key elements: (1) iterative refinement of inferred preferences, and (2) verification of inferred preferences across multiple user writing samples. We evaluate PROSE with several LLMs (i.e., Qwen2.5 7B and 72B Instruct, GPT-mini, and GPT-4o) on a summarization and an email writing task. We find that PROSE more accurately infers nuanced human preferences, improving the quality of the writing agent's generations over CIPHER (a state-of-the-art method for inferring preferences) by 33%. Lastly, we demonstrate that ICL and PROSE are complementary methods, and combining them provides up to a 9% improvement over ICL alone.
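The two elements named in the abstract, iterative refinement and verification across samples, can be pictured with a loose sketch. This is not the authors' implementation: the function `infer_preferences`, the callables `propose` and `supports`, the `max_rounds` and `min_support` parameters, and the toy keyword rules are all hypothetical stand-ins for steps that, in the paper, would be carried out by an LLM.

```python
from typing import Callable, List, Set


def infer_preferences(
    samples: List[str],
    propose: Callable[[str, Set[str]], Set[str]],
    supports: Callable[[str, str], bool],
    max_rounds: int = 3,
    min_support: float = 0.5,
) -> Set[str]:
    """Refine a preference set over several rounds, then verify it.

    `propose` and `supports` are hypothetical hooks; a real system would
    likely back them with LLM calls.
    """
    if not samples:
        raise ValueError("at least one writing sample is required")

    # (1) Iterative refinement: keep proposing until nothing new appears
    # or the round budget is spent.
    prefs: Set[str] = set()
    for _ in range(max_rounds):
        updated = set(prefs)
        for sample in samples:
            updated |= propose(sample, updated)
        if updated == prefs:
            break
        prefs = updated

    # (2) Verification: keep only preferences that enough samples support.
    verified = set()
    for pref in prefs:
        share = sum(supports(pref, s) for s in samples) / len(samples)
        if share >= min_support:
            verified.add(pref)
    return verified


# Toy stand-ins (assumptions for illustration only): simple keyword rules.
def toy_propose(sample: str, current: Set[str]) -> Set[str]:
    found = set()
    if "- " in sample:
        found.add("bullets")
    if "!" in sample:
        found.add("exclamations")
    return found - current


def toy_supports(pref: str, sample: str) -> bool:
    marker = {"bullets": "- ", "exclamations": "!"}[pref]
    return marker in sample


samples = ["- a\n- b", "- c\nHello!", "Plain text."]
result = infer_preferences(samples, toy_propose, toy_supports)
print(result)
```

In this toy run the "exclamations" preference is proposed from one sample but dropped at verification because only one of three samples supports it, which is the kind of overly specific or spurious description that cross-sample verification is meant to filter out.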
Generative Social Choice: The Next Generation
Boehmer, Niclas, Fish, Sara, Procaccia, Ariel D.
A key task in certain democratic processes is to produce a concise slate of statements that proportionally represents the full spectrum of user opinions. This task is similar to committee elections, but unlike traditional settings, the candidate set comprises all possible statements of varying lengths, and so it can only be accessed through specific queries. Combining social choice and large language models, prior work has approached this challenge through a framework of generative social choice. We extend the framework in two fundamental ways, providing theoretical guarantees even in the face of approximately optimal queries and a budget limit on the overall length of the slate. Using GPT-4o to implement queries, we showcase our approach on datasets related to city improvement measures and drug reviews, demonstrating its effectiveness in generating representative slates from unstructured user opinions.
The Great Language Flattening
In at least one crucial way, AI has already won its campaign for global dominance. An unbelievable volume of synthetic prose is published every moment of every day--heaping piles of machine-written news articles, text messages, emails, search results, customer-service chats, even scientific research. Chatbots learned from human writing. Now the influence may run in the other direction. Some people have hypothesized that the proliferation of generative-AI tools such as ChatGPT will seep into human communication, that the terse language we use when prompting a chatbot may lead us to dispose of any niceties or writerly flourishes when corresponding with friends and colleagues.
What Kind of Writer Is ChatGPT?
Last spring, a graduate student in social anthropology--let's call him Chris--sat down at his laptop and asked ChatGPT for help with a writing assignment. He pasted a few thousand words, a mix of rough summaries and jotted-down bullet points, into the text box that serves as ChatGPT's interface. "Here's my entire exam," he wrote. "Don't edit it, I will tell you what to do after you've read it." Chris was tackling a difficult paper about perspectivism, which is the anthropological principle that one's perspective inevitably shapes the observations one makes and the knowledge one acquires.
On Training a Neural Network to Explain Binaries
Interrante-Grant, Alexander, Davis, Andy, Preslier, Heather, Leek, Tim
In this work, we begin to investigate the possibility of training a deep neural network on the task of binary code understanding. Specifically, the network would take, as input, features derived directly from binaries and output English descriptions of functionality to aid a reverse engineer in investigating the capabilities of a piece of closed-source software, be it malicious or benign. Given recent success in applying large language models (generative AI) to the task of source code summarization, this seems a promising direction. However, in our initial survey of the available datasets, we found nothing of sufficiently high quality and volume to train these complex models. Instead, we build our own dataset derived from a capture of Stack Overflow containing 1.1M entries. A major result of our work is a novel dataset evaluation method using the correlation between two distances on sample pairs: one distance in the embedding space of inputs and the other in the embedding space of outputs. Intuitively, if two samples have inputs close in the input embedding space, their outputs should also be close in the output embedding space. We found this Embedding Distance Correlation (EDC) test to be highly diagnostic, indicating that our collected dataset and several existing open-source datasets are of low quality, as the distances are not well correlated. We proceed to explore the general applicability of EDC, applying it to a number of qualitatively known good datasets and a number of synthetically known bad ones, and find it to be a reliable indicator of dataset value.
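The idea behind the EDC test, correlating pairwise distances in the input embedding space with pairwise distances in the output embedding space, can be illustrated with a minimal sketch. The abstract does not say which distance or which correlation statistic the authors use, so Euclidean distance and Pearson correlation are assumptions here, as are the function names and the tiny hand-made embeddings.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("distances are constant; correlation is undefined")
    return cov / math.sqrt(var_x * var_y)


def embedding_distance_correlation(input_embs, output_embs):
    """Correlate input-space and output-space distances over all sample pairs.

    Assumption: Euclidean distance and Pearson correlation; the source
    abstract leaves both choices unspecified.
    """
    if len(input_embs) != len(output_embs):
        raise ValueError("need one output embedding per input embedding")
    if len(input_embs) < 3:
        raise ValueError("need at least three samples")
    pairs = list(combinations(range(len(input_embs)), 2))
    d_in = [euclidean(input_embs[i], input_embs[j]) for i, j in pairs]
    d_out = [euclidean(output_embs[i], output_embs[j]) for i, j in pairs]
    return pearson(d_in, d_out)


# Hand-made embeddings (illustrative only, not real data).
inputs = [[0.0], [1.0], [2.0], [3.0]]
aligned_outputs = [[0.0], [2.0], [4.0], [6.0]]   # outputs track inputs
shuffled_outputs = [[0.0], [3.0], [1.0], [2.0]]  # outputs unrelated to inputs

good = embedding_distance_correlation(inputs, aligned_outputs)
bad = embedding_distance_correlation(inputs, shuffled_outputs)
print(good, bad)
```

When outputs move with inputs, the two sets of distances are strongly correlated; when the pairing is scrambled, the correlation drops. Under the assumptions above, a low score would be the signal that a dataset's inputs and descriptions are poorly matched.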
GPT-4 and Safety Case Generation: An Exploratory Analysis
Sivakumar, Mithila, Belle, Alvine Boaye, Shan, Jinjun, Shahandashti, Kimya Khakzad
In the ever-evolving landscape of software engineering, the emergence of large language models (LLMs) and conversational interfaces, exemplified by ChatGPT, is nothing short of revolutionary. While their potential is undeniable across various domains, this paper sets out on a captivating expedition to investigate their uncharted territory, the exploration of generating safety cases. In this paper, our primary objective is to delve into the existing knowledge base of GPT-4, focusing specifically on its understanding of the Goal Structuring Notation (GSN), a well-established notation for visually representing safety cases. Subsequently, we perform four distinct experiments with GPT-4. These experiments are designed to assess its capacity for generating safety cases within a defined system and application domain. To measure the performance of GPT-4 in this context, we compare the results it generates with ground-truth safety cases created for an X-ray system and a Machine-Learning (ML)-enabled component for tire noise recognition (TNR) in a vehicle. This allowed us to gain valuable insights into the model's generative capabilities. Our findings indicate that GPT-4 demonstrates the capacity to produce safety arguments that are moderately accurate and reasonable. Furthermore, it exhibits the capability to generate safety cases that closely align with the semantic content of the reference safety cases used as ground-truths in our experiments.
PROSE: Predicting Operators and Symbolic Expressions using Multimodal Transformers
Liu, Yuxuan, Zhang, Zecheng, Schaeffer, Hayden
Approximating nonlinear differential equations using a neural network provides a robust and efficient tool for various scientific computing tasks, including real-time predictions, inverse problems, optimal controls, and surrogate modeling. Previous works have focused on embedding dynamical systems into networks through two approaches: learning a single solution operator (i.e., the mapping from input parametrized functions to solutions) or learning the governing system of equations (i.e., the constitutive model relative to the state variables). Both of these approaches yield different representations for the same underlying data or function. Additionally, observing that families of differential equations often share key characteristics, we seek one network representation across a wide range of equations. Our method, called Predicting Operators and Symbolic Expressions (PROSE), learns maps from multimodal inputs to multimodal outputs, capable of generating both numerical predictions and mathematical equations. By using a transformer structure and a feature fusion approach, our network can simultaneously embed sets of solution operators for various parametric differential equations using a single trained network. Detailed experiments demonstrate that the network benefits from its multimodal nature, resulting in improved prediction accuracy and better generalization. The network is shown to be able to handle noise in the data and errors in the symbolic representation, including noisy numerical values, model misspecification, and erroneous addition or deletion of terms. PROSE provides a new neural network framework for differential equations which allows for more flexibility and generality in learning operators and governing equations from data.