 prose


Amortized Sampling with Transferable Normalizing Flows

Neural Information Processing Systems

Efficient equilibrium sampling of molecular conformations remains a core challenge in computational chemistry and statistical inference. Classical approaches such as molecular dynamics or Markov chain Monte Carlo inherently lack amortization; the computational cost of sampling must be paid in full for each system of interest. The widespread success of generative models has inspired interest towards overcoming this limitation through learning sampling algorithms. Despite performing competitively with conventional methods when trained on a single system, learned samplers have so far demonstrated limited ability to transfer across systems. We demonstrate that deep learning enables the design of scalable and transferable samplers by introducing PROSE, a 285 million parameter all-atom transferable normalizing flow trained on a corpus of peptide molecular dynamics trajectories up to 8 residues in length. PROSE draws zero-shot uncorrelated proposal samples for arbitrary peptide systems, achieving the previously intractable transferability across sequence length, whilst retaining the efficient likelihood evaluation of normalizing flows. Through extensive empirical evaluation we demonstrate the efficacy of PROSE as a proposal for a variety of sampling algorithms, finding a simple importance sampling-based fine-tuning procedure to achieve competitive performance to established methods such as sequential Monte Carlo. We open-source the PROSE codebase, model weights, and training dataset, to further stimulate research into amortized sampling methods and objectives.


AI-Writing Scandals Are Getting Very Confusing

The Atlantic - Technology

What counts as an acceptable use of AI has never been fuzzier. Steven Rosenbaum has decided that the real villain behind the bogus quotes in his book is a chatbot. Earlier this week, it was reported that Rosenbaum's much-discussed book about how AI shapes reality contains more than half a dozen fake or misattributed quotes. Rosenbaum pinned some of them on his use of AI. He claimed responsibility for the errors and said he was investigating what went wrong.


What if Readers Like A.I.-Generated Fiction?

The New Yorker

Finally, he gave the summaries to his fine-tuned model, and he asked it to compose passages "in the style of Vauhini Vara." Going into all this, I was self-assured, even smug. I'd always felt that my style was original and, more important, that my books were totally distinct from one another. I figured that, even if the A.I. model could imitate my past books, it couldn't predict the style of the novel in progress. So, when Chakrabarty sent me the A.I.-generated imitations, I was genuinely confused.


The Strange Ways Writers Are Proving That Their Writing Isn't ChatGPT

Slate

The other week, I was reading an email I'd written when a strange notion occurred to me. Would it perhaps be better, an unsettling new voice suddenly whispered, to leave it in? This is a thought that would've appalled me a year ago. As a professional writer, I have long prided myself on impeccable grammar, judiciously wielded punctuation, and (at times indulgent) verbosity.


Aligning LLMs by Predicting Preferences from User Writing Samples

arXiv.org Artificial Intelligence

Accommodating human preferences is essential for creating aligned LLM agents that deliver personalized and effective interactions. Recent work has shown the potential for LLMs acting as writing agents to infer a description of user preferences. Agent alignment then comes from conditioning on the inferred preference description. However, existing methods often produce generic preference descriptions that fail to capture the unique and individualized nature of human preferences. This paper introduces PROSE, a method designed to enhance the precision of preference descriptions inferred from user writing samples. PROSE incorporates two key elements: (1) iterative refinement of inferred preferences, and (2) verification of inferred preferences across multiple user writing samples. We evaluate PROSE with several LLMs (i.e., Qwen2.5 7B and 72B Instruct, GPT-mini, and GPT-4o) on a summarization and an email writing task. We find that PROSE more accurately infers nuanced human preferences, improving the quality of the writing agent's generations over CIPHER (a state-of-the-art method for inferring preferences) by 33%. Lastly, we demonstrate that ICL and PROSE are complementary methods, and combining them provides up to a 9% improvement over ICL alone.


Generative Social Choice: The Next Generation

arXiv.org Artificial Intelligence

A key task in certain democratic processes is to produce a concise slate of statements that proportionally represents the full spectrum of user opinions. This task is similar to committee elections, but unlike traditional settings, the candidate set comprises all possible statements of varying lengths, and so it can only be accessed through specific queries. Combining social choice and large language models, prior work has approached this challenge through a framework of generative social choice. We extend the framework in two fundamental ways, providing theoretical guarantees even in the face of approximately optimal queries and a budget limit on the overall length of the slate. Using GPT-4o to implement queries, we showcase our approach on datasets related to city improvement measures and drug reviews, demonstrating its effectiveness in generating representative slates from unstructured user opinions.


The Great Language Flattening

The Atlantic - Technology

In at least one crucial way, AI has already won its campaign for global dominance. An unbelievable volume of synthetic prose is published every moment of every day--heaping piles of machine-written news articles, text messages, emails, search results, customer-service chats, even scientific research. Chatbots learned from human writing. Now the influence may run in the other direction. Some people have hypothesized that the proliferation of generative-AI tools such as ChatGPT will seep into human communication, that the terse language we use when prompting a chatbot may lead us to dispose of any niceties or writerly flourishes when corresponding with friends and colleagues.


What Kind of Writer Is ChatGPT?

The New Yorker

Last spring, a graduate student in social anthropology--let's call him Chris--sat down at his laptop and asked ChatGPT for help with a writing assignment. He pasted a few thousand words, a mix of rough summaries and jotted-down bullet points, into the text box that serves as ChatGPT's interface. "Here's my entire exam," he wrote. "Don't edit it, I will tell you what to do after you've read it." Chris was tackling a difficult paper about perspectivism, which is the anthropological principle that one's perspective inevitably shapes the observations one makes and the knowledge one acquires.


On Training a Neural Network to Explain Binaries

arXiv.org Artificial Intelligence

In this work, we begin to investigate the possibility of training a deep neural network on the task of binary code understanding. Specifically, the network would take, as input, features derived directly from binaries and output English descriptions of functionality to aid a reverse engineer in investigating the capabilities of a piece of closed-source software, be it malicious or benign. Given recent success in applying large language models (generative AI) to the task of source code summarization, this seems a promising direction. However, in our initial survey of the available datasets, we found nothing of sufficiently high quality and volume to train these complex models. Instead, we build our own dataset derived from a capture of Stack Overflow containing 1.1M entries. A major result of our work is a novel dataset evaluation method using the correlation between two distances on sample pairs: one distance in the embedding space of inputs and the other in the embedding space of outputs. Intuitively, if two samples have inputs close in the input embedding space, their outputs should also be close in the output embedding space. We found this Embedding Distance Correlation (EDC) test to be highly diagnostic, indicating that our collected dataset and several existing open-source datasets are of low quality, as the distances are not well correlated. We then explored the general applicability of EDC, applying it to a number of datasets known qualitatively to be good and a number of synthetically constructed bad ones, and found it to be a reliable indicator of dataset value.
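
The distance-correlation idea in the abstract above can be illustrated with a minimal sketch. The abstract does not specify the distance metric, the correlation statistic, or the embedding models, so the choices here (Euclidean distance, Pearson correlation, random synthetic embeddings) and the function names are assumptions made for illustration only, not the authors' implementation.

```python
import numpy as np


def pairwise_distances(emb):
    """Euclidean distance for every unordered pair (i < j) of rows in emb."""
    diff = emb[:, None, :] - emb[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    i, j = np.triu_indices(len(emb), k=1)
    return dist[i, j]


def embedding_distance_correlation(input_emb, output_emb):
    """Pearson correlation between input-space and output-space pair distances.

    Assumption: Euclidean distance and Pearson correlation stand in for
    whatever metric and statistic the paper actually uses.
    """
    d_in = pairwise_distances(np.asarray(input_emb, dtype=float))
    d_out = pairwise_distances(np.asarray(output_emb, dtype=float))
    return float(np.corrcoef(d_in, d_out)[0, 1])


# Synthetic stand-ins for real embeddings (hypothetical data).
rng = np.random.default_rng(0)
inputs = rng.normal(size=(50, 8))
aligned_outputs = inputs @ rng.normal(size=(8, 4))    # outputs depend on inputs
shuffled_outputs = rng.permutation(aligned_outputs)   # input/output pairing destroyed

edc_aligned = embedding_distance_correlation(inputs, aligned_outputs)
edc_shuffled = embedding_distance_correlation(inputs, shuffled_outputs)
print(f"aligned: {edc_aligned:.3f}, shuffled: {edc_shuffled:.3f}")
```

Under these assumptions, a dataset whose outputs genuinely depend on its inputs should score noticeably higher than one whose input/output pairing has been shuffled, which is the intuition the abstract describes.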