What if Readers Like A.I.-Generated Fiction?
Finally, he gave the summaries to his fine-tuned model, and he asked it to compose passages "in the style of Vauhini Vara." Going into all this, I was self-assured, even smug. I'd always felt that my style was original and, more important, that my books were totally distinct from one another. I figured that, even if the A.I. model could imitate my past books, it couldn't predict the style of the novel in progress. So, when Chakrabarty sent me the A.I.-generated imitations, I was genuinely confused.
The Strange Ways Writers Are Proving That Their Writing Isn't ChatGPT
The other week, I was reading an email I'd written when a strange notion occurred to me. Would it perhaps be better, an unsettling new voice suddenly whispered, to leave it in? This is a thought that would've appalled me a year ago. As a professional writer, I have long prided myself on impeccable grammar, judiciously wielded punctuation, and (at times indulgent) verbosity.
Aligning LLMs by Predicting Preferences from User Writing Samples
Aroca-Ouellette, Stéphane, Mackraz, Natalie, Theobald, Barry-John, Metcalf, Katherine
Accommodating human preferences is essential for creating aligned LLM agents that deliver personalized and effective interactions. Recent work has shown the potential for LLMs acting as writing agents to infer a description of user preferences; agent alignment then comes from conditioning on the inferred description. However, existing methods often produce generic preference descriptions that fail to capture the unique and individualized nature of human preferences. This paper introduces PROSE, a method designed to enhance the precision of preference descriptions inferred from user writing samples. PROSE incorporates two key elements: (1) iterative refinement of inferred preferences, and (2) verification of inferred preferences across multiple user writing samples. We evaluate PROSE with several LLMs (i.e., Qwen2.5 7B and 72B Instruct, GPT-mini, and GPT-4o) on a summarization task and an email-writing task. We find that PROSE more accurately infers nuanced human preferences, improving the quality of the writing agent's generations over CIPHER (a state-of-the-art method for inferring preferences) by 33%. Lastly, we demonstrate that in-context learning (ICL) and PROSE are complementary, and that combining them provides up to a 9% improvement over ICL alone.
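The two elements the abstract names can be pictured as a simple infer-verify-refine loop. This is a minimal sketch, not the paper's implementation: `llm` is a hypothetical callable (prompt string in, response string out), and the prompt wording is illustrative.

```python
def infer_preferences(llm, writing_samples, max_rounds=3):
    """Sketch of (1) iterative refinement of an inferred preference
    description and (2) verification against multiple writing samples.
    `llm` and the prompts are assumptions, not taken from the paper."""
    # Initial inference from the first sample.
    description = llm(
        f"Describe the writing preferences shown in this sample:\n{writing_samples[0]}"
    )
    for _ in range(max_rounds):
        # Verification step: check the description against every sample.
        failures = []
        for sample in writing_samples:
            verdict = llm(
                f"Preference description: {description}\n"
                f"Sample: {sample}\n"
                "Does the sample match the description? Answer yes or no."
            )
            if verdict.strip().lower() != "yes":
                failures.append(sample)
        if not failures:
            break  # Description is consistent with all samples.
        # Refinement step: revise the description using the failing samples.
        description = llm(
            f"Revise this preference description: {description}\n"
            f"so that it also fits these samples: {failures}"
        )
    return description
```

The verification pass is what distinguishes this shape of method from a single one-shot inference: a description is only kept once no sample contradicts it, or the refinement budget runs out.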
Generative Social Choice: The Next Generation
Boehmer, Niclas, Fish, Sara, Procaccia, Ariel D.
A key task in certain democratic processes is to produce a concise slate of statements that proportionally represents the full spectrum of user opinions. This task is similar to committee elections, but unlike traditional settings, the candidate set comprises all possible statements of varying lengths, and so it can only be accessed through specific queries. Combining social choice and large language models, prior work has approached this challenge through a framework of generative social choice. We extend the framework in two fundamental ways, providing theoretical guarantees even in the face of approximately optimal queries and a budget limit on the overall length of the slate. Using GPT-4o to implement queries, we showcase our approach on datasets related to city improvement measures and drug reviews, demonstrating its effectiveness in generating representative slates from unstructured user opinions.
The Great Language Flattening
In at least one crucial way, AI has already won its campaign for global dominance. An unbelievable volume of synthetic prose is published every moment of every day--heaping piles of machine-written news articles, text messages, emails, search results, customer-service chats, even scientific research. Chatbots learned from human writing. Now the influence may run in the other direction. Some people have hypothesized that the proliferation of generative-AI tools such as ChatGPT will seep into human communication, that the terse language we use when prompting a chatbot may lead us to dispose of any niceties or writerly flourishes when corresponding with friends and colleagues.
What Kind of Writer Is ChatGPT?
Last spring, a graduate student in social anthropology--let's call him Chris--sat down at his laptop and asked ChatGPT for help with a writing assignment. He pasted a few thousand words, a mix of rough summaries and jotted-down bullet points, into the text box that serves as ChatGPT's interface. "Here's my entire exam," he wrote. "Don't edit it, I will tell you what to do after you've read it." Chris was tackling a difficult paper about perspectivism, which is the anthropological principle that one's perspective inevitably shapes the observations one makes and the knowledge one acquires.
On Training a Neural Network to Explain Binaries
Interrante-Grant, Alexander, Davis, Andy, Preslier, Heather, Leek, Tim
In this work, we begin to investigate the possibility of training a deep neural network on the task of binary code understanding. Specifically, the network would take, as input, features derived directly from binaries and output English descriptions of functionality to aid a reverse engineer in investigating the capabilities of a piece of closed-source software, be it malicious or benign. Given recent success in applying large language models (generative AI) to the task of source code summarization, this seems a promising direction. However, in our initial survey of the available datasets, we found nothing of sufficiently high quality and volume to train these complex models. Instead, we build our own dataset derived from a capture of Stack Overflow containing 1.1M entries. A major result of our work is a novel dataset evaluation method using the correlation between two distances on sample pairs: one distance in the embedding space of inputs and the other in the embedding space of outputs. Intuitively, if two samples have inputs close in the input embedding space, their outputs should also be close in the output embedding space. We found this Embedding Distance Correlation (EDC) test to be highly diagnostic, indicating that our collected dataset and several existing open-source datasets are of low quality, as the distances are not well correlated. We proceed to explore the general applicability of EDC, applying it to a number of datasets qualitatively known to be good and a number synthetically constructed to be bad, and find it to be a reliable indicator of dataset value.
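The EDC test described above reduces to a correlation between two lists of pairwise distances. Here is a minimal, self-contained sketch under that reading; the function names and the choice of Euclidean distance with Pearson correlation are assumptions for illustration, not details taken from the paper.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    vy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (vx * vy)

def edc_score(input_embs, output_embs):
    """Embedding Distance Correlation: for every pair of samples,
    compare the distance between their inputs (in input-embedding
    space) with the distance between their outputs (in output-
    embedding space). A high correlation suggests that similar
    inputs map to similar outputs, i.e. a learnable dataset."""
    d_in, d_out = [], []
    n = len(input_embs)
    for i in range(n):
        for j in range(i + 1, n):
            d_in.append(euclidean(input_embs[i], input_embs[j]))
            d_out.append(euclidean(output_embs[i], output_embs[j]))
    return pearson(d_in, d_out)
```

On a dataset where outputs track inputs, the score approaches 1; on a dataset where outputs are unrelated to inputs, the distances decorrelate and the score falls toward 0, which is the low-quality signal the abstract describes.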
GPT-4 and Safety Case Generation: An Exploratory Analysis
Sivakumar, Mithila, Belle, Alvine Boaye, Shan, Jinjun, Shahandashti, Kimya Khakzad
In the ever-evolving landscape of software engineering, the emergence of large language models (LLMs) and conversational interfaces, exemplified by ChatGPT, is nothing short of revolutionary. While their potential is undeniable across various domains, this paper investigates a largely unexplored application: the generation of safety cases. Our primary objective is to probe the existing knowledge base of GPT-4, focusing specifically on its understanding of the Goal Structuring Notation (GSN), a well-established notation for visually representing safety cases. Subsequently, we perform four distinct experiments with GPT-4, designed to assess its capacity for generating safety cases within a defined system and application domain. To measure GPT-4's performance in this context, we compare the results it generates with ground-truth safety cases created for an X-ray system and a Machine-Learning (ML)-enabled component for tire noise recognition (TNR) in a vehicle. This comparison yields valuable insights into the model's generative capabilities. Our findings indicate that GPT-4 can produce safety arguments that are moderately accurate and reasonable, and that it can generate safety cases closely aligned with the semantic content of the reference safety cases used as ground truths in our experiments.
PROSE: Predicting Operators and Symbolic Expressions using Multimodal Transformers
Liu, Yuxuan, Zhang, Zecheng, Schaeffer, Hayden
Approximating nonlinear differential equations using a neural network provides a robust and efficient tool for various scientific computing tasks, including real-time predictions, inverse problems, optimal controls, and surrogate modeling. Previous works have focused on embedding dynamical systems into networks through two approaches: learning a single solution operator (i.e., the mapping from input parametrized functions to solutions) or learning the governing system of equations (i.e., the constitutive model relative to the state variables). Both of these approaches yield different representations for the same underlying data or function. Additionally, observing that families of differential equations often share key characteristics, we seek one network representation across a wide range of equations. Our method, called Predicting Operators and Symbolic Expressions (PROSE), learns maps from multimodal inputs to multimodal outputs, capable of generating both numerical predictions and mathematical equations. By using a transformer structure and a feature fusion approach, our network can simultaneously embed sets of solution operators for various parametric differential equations using a single trained network. Detailed experiments demonstrate that the network benefits from its multimodal nature, resulting in improved prediction accuracy and better generalization. The network is shown to be able to handle noise in the data and errors in the symbolic representation, including noisy numerical values, model misspecification, and erroneous addition or deletion of terms. PROSE provides a new neural network framework for differential equations which allows for more flexibility and generality in learning operators and governing equations from data.
Chatbots Sound Like They're Posting on LinkedIn
If you spend any time on the internet, you're likely now familiar with the gray-and-teal screenshots of AI-generated text. At first they were meant to illustrate ChatGPT's surprising competence at generating human-sounding prose, and then to demonstrate the occasionally unsettling answers that emerged once the general public could bombard it with prompts. OpenAI, the organization that is developing the tool, describes one of its biggest problems this way: "ChatGPT sometimes writes plausible-sounding but incorrect or nonsensical answers." In layman's terms, the chatbot makes stuff up. As similar services, such as Google's Bard, have rushed their tools into public testing, their screenshots have demonstrated the same capacity for fabricating people, historical events, research citations, and more, and for rendering those falsehoods in the same confident, tidy prose.