Large Language Model
Testing the General Deductive Reasoning Capacity of Large Language Models Using OOD Examples
Given the intractably large size of the space of proofs, any model that is capable of general deductive reasoning must generalize to proofs of greater complexity. Recent studies have shown that large language models (LLMs) possess some abstract deductive reasoning ability given chain-of-thought prompts. However, they have primarily been tested on proofs using modus ponens or of a specific size, and from the same distribution as the in-context examples. To measure the general deductive reasoning ability of LLMs, we test on a broad set of deduction rules and measure their ability to generalize to more complex proofs from simpler demonstrations from multiple angles: depth-, width-, and compositional generalization. To facilitate systematic exploration, we construct a new synthetic and programmable reasoning dataset that enables control over deduction rules and proof complexity. Our experiments on four LLMs of various sizes and training objectives show that they are able to generalize to compositional proofs. However, they have difficulty generalizing to longer proofs, and they require explicit demonstrations to produce hypothetical subproofs, specifically in proof by cases and proof by contradiction.
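To make depth generalization concrete, here is a minimal sketch of a programmable proof generator, assuming a toy setting that uses only modus ponens. The function name `make_chain_proof` and the proposition labels `P0`, `P1`, ... are hypothetical and do not come from the dataset described above.

```python
def make_chain_proof(depth):
    """Build a toy modus ponens chain with `depth` deduction steps.

    Hypothetical illustration only: not the dataset's actual generator.
    """
    props = [f"P{i}" for i in range(depth + 1)]
    # Premises: the starting fact plus one implication per step.
    premises = [props[0]] + [f"{a} -> {b}" for a, b in zip(props, props[1:])]
    # Each proof step applies modus ponens once.
    steps = [f"{a}, {a} -> {b} |- {b}" for a, b in zip(props, props[1:])]
    return premises, steps, props[-1]


# A shallow proof serves as the in-context demonstration,
# a deeper one as the out-of-distribution test query.
demo_premises, demo_steps, demo_goal = make_chain_proof(2)
test_premises, test_steps, test_goal = make_chain_proof(5)
```

Varying `depth` between the demonstration and the query is one simple way to probe whether a model generalizes to longer proofs than it was shown.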
How Shivon Zilis Operated as Elon Musk's OpenAI Insider
Messages presented at trial reveal how Zilis, the mother of four of Musk's children, acted as an intermediary between him and OpenAI. As the first week of trial comes to a close, one person has emerged as a critical behind-the-scenes manager of communications and egos in OpenAI's early years: Shivon Zilis. A longtime employee of Musk and the mother to four of his children, Zilis first joined OpenAI as an advisor in 2016. She later served as a director of its nonprofit board from 2020 until 2023 and has also worked as an executive at Musk's other companies, Neuralink and Tesla. When asked about the nature of his relationship with Zilis in court, Musk offered several answers.
Optimized Deferral for Imbalanced Settings
Cortes, Corinna, Mao, Anqi, Mohri, Mehryar, Zhong, Yutao
Learning algorithms can be significantly improved by routing complex or uncertain inputs to specialized experts, balancing accuracy with computational cost. This approach, known as learning to defer, is essential in domains like natural language generation, medical diagnosis, and computer vision, where an effective deferral can reduce errors at low extra resource consumption. However, the two-stage learning to defer setting, which leverages existing predictors such as a collection of LLMs or other classifiers, often faces challenges due to an expert imbalance problem. This imbalance can lead to suboptimal performance, with deferral algorithms favoring the majority expert. We present a comprehensive study of two-stage learning to defer in expert imbalance settings. We cast the deferral loss optimization as a novel cost-sensitive learning problem over the input-expert domain. We derive new margin-based loss functions and guarantees tailored to this setting, and develop novel algorithms for cost-sensitive learning. Leveraging these results, we design principled deferral algorithms, MILD (Margin-based Imbalanced Learning to Defer), specifically suited for expert imbalance settings. Extensive experiments demonstrate the effectiveness of our approach, showing clear improvements over existing baselines on both image classification and real-world Large Language Model (LLM) routing tasks.
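The expert imbalance problem can be illustrated with a small sketch. It assumes a simple per-expert cost (1 if the chosen expert errs, plus a fixed extra cost for consulting that expert); this cost form, the function name `deferral_loss`, and the toy data are assumptions for illustration, not the paper's formulation or the MILD algorithm.

```python
def deferral_loss(choices, expert_correct, extra_cost):
    """Average cost of routing input i to expert choices[i].

    Assumed cost: 1 if the chosen expert is wrong, plus that expert's
    fixed consultation cost. Hypothetical illustration only.
    """
    total = 0.0
    for i, j in enumerate(choices):
        error = 0.0 if expert_correct[i][j] else 1.0
        total += error + extra_cost[j]
    return total / len(choices)


# Four inputs, two experts. Expert 0 (the majority expert) is right on
# three inputs; expert 1 is right only on the last one.
expert_correct = [
    [True, False],
    [True, False],
    [True, False],
    [False, True],
]
extra_cost = [0.0, 0.1]

always_majority = deferral_loss([0, 0, 0, 0], expert_correct, extra_cost)
best_routing = deferral_loss([0, 0, 0, 1], expert_correct, extra_cost)
```

A router that always picks the majority expert looks reasonable on most inputs but pays the full error on the minority input, which is the failure mode that a cost-sensitive treatment over input-expert pairs is meant to address.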
Mind the Gap: Structure-Aware Consistency in Preference Learning
Abstract: Preference learning has become the foundation of aligning Large Language Models (LLMs) with human intent. Popular methods, such as Direct Preference Optimization (DPO), minimize surrogate [...] However, we demonstrate that for the equicontinuous hypothesis sets typical of neural networks, these standard surrogates are theoretically inconsistent, yielding vacuous general[...] To resolve this, we formulate LLM alignment within a margin-shifted ranking framework. [...] bounds that depend on enforcing a separation margin γ. Crucially, we extend this to Structure-Aware H-consistency, introducing a novel objective (SA-DPO) that adapts the margin based on the semantic distance between responses to handle synonyms and hard pairs. Finally, we analyze the trade-off between consistency and [...] the Polynomial Hinge family) offer superior consistency guarantees for capacity-bounded models compared to the standard logistic loss used in DPO.
1. Introduction: The alignment of Large Language Models (LLMs) has shifted from explicit Reward Modeling (Stiennon et al., [...] surrogate loss (e.g., the logistic loss) as a proxy for the true objective: the non-convex, discontinuous 0-1 ranking loss. This reliance raises a fundamental theoretical question that remains largely unanswered for deep networks: Does minimizing these surrogate losses actually guarantee the minimization of the true ranking error? In this work, we investigate this question through the lens of H-consistency (Mao, Mohri, and Zhong, 2023e). We formulate LLM preference learning as a pairwise ranking problem and derive a series of results that bridge the gap between learning theory and practical fine-tuning. [...] we identify a fundamental theoretical deficiency in standard [...] We demonstrate that for equicontinuous hypothesis sets, a property satisfied by neural networks, standard surrogate minimization yields vacuous consistency guarantees. Specifically, without explicit constraints, a model can achieve arbitrarily low surrogate risk while maintaining a high ranking error, effectively "cheating" the objective by shrinking score differences rather than learning the correct ordering. We prove that enforcing a confidence gap γ is not merely a heuristic, but a strict requirement for H-consistency in the deep learning regime. However, while a uniform margin restores consistency, it is a blunt instrument. We show that demanding a large, fixed margin on semantically identical pairs (synonyms) forces the model to hallucinate differences where none exist, introducing bias and instability. To address this, we propose Structure-Aware H-consistency and a corresponding objective, Structure-Aware DPO (SA-DPO).
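The idea of a margin that adapts to semantic distance can be sketched as follows. This is only an assumed form: a logistic loss on the score difference shifted by a margin, with the margin scaled linearly by a semantic distance in [0, 1]. The function names and the linear scaling are hypothetical, and the actual SA-DPO objective may differ.

```python
import math


def margin_logistic_loss(score_diff, margin=0.0):
    """Logistic loss on a score difference shifted by `margin`.

    score_diff is the preferred response's score minus the rejected
    response's score. Assumed form, for illustration only.
    """
    return math.log1p(math.exp(-(score_diff - margin)))


def structure_aware_margin(gamma, semantic_distance):
    """Assumed linear scaling: near-synonyms (distance ~ 0) get ~ no margin."""
    return gamma * semantic_distance


gamma = 1.0
score_diff = 0.5

# A hard pair (distance 1.0) must clear the full margin gamma.
hard_pair_loss = margin_logistic_loss(
    score_diff, structure_aware_margin(gamma, 1.0)
)
# A synonym pair (distance 0.0) is not pushed to separate.
synonym_loss = margin_logistic_loss(
    score_diff, structure_aware_margin(gamma, 0.0)
)
```

Under this assumed form, the same score difference is penalized more on a semantically distant pair than on a synonym pair, so the model is not asked to invent a gap between responses that mean the same thing.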
Elon Musk Says He's Suing OpenAI Because They Abandoned Their Mission. I Think His Real Reason Is Much More Embarrassing.
A new scale of humiliation ritual kicked off this week as Elon Musk's lawsuit against OpenAI went to trial in Silicon Valley. The Tesla CEO, who co-founded OpenAI, is suing the artificial intelligence firm and two of its other co-founders, Sam Altman and Greg Brockman, for diverting from its original nonprofit goal of developing A.I. for the public good in favor of for-profit motives. "This lawsuit is very simple: It is not OK to steal a charity," Musk said on the witness stand on Tuesday. The trial is big by every conceivable measure. Both Musk and OpenAI have mustered high-dollar legal armies who are prepared to wage potentially years of litigation, including this federal trial.
Elon Musk Seemingly Admits xAI Has Used OpenAI's Models to Train Its Own
While answering questions under oath, Musk argued it's standard practice for AI labs to use their competitors' models. While testifying on Thursday in federal court, Elon Musk seemed to indicate that his AI lab may have used OpenAI's models to train xAI's own. He touched upon the topic while sitting on the witness stand answering cross-examination questions from an OpenAI attorney amid his ongoing legal battle against the ChatGPT-maker. Do you know what distillation is? It means to use one AI model to train another AI model.
OpenAI Rolls Out 'Advanced' Security Mode for At-Risk Accounts
OpenAI is rolling out Advanced Account Security for people concerned that their ChatGPT or Codex accounts could be potential targets of phishing attacks. For anyone who fears their ChatGPT and Codex accounts might be targeted by attackers, OpenAI announced on Thursday that it is adding an optional new level of account protection that adds an extra layer of security. Dubbed Advanced Account Security, the feature enforces strict access controls that would make account takeover attacks very difficult. Such measures are not a new idea in the realm of account security. Google, for example, has offered its Advanced Protection account security tier for nearly a decade. But as mainstream AI services rapidly proliferate around the world, there is a pressing need for an array of basic protections to be put in place.
Sam Altman's ChatGPT Couldn't Stop Obsessing Over Goblins
OpenAI desires less regulation, but it still doesn't know how its chatbot works. OpenAI admitted it had to develop a specific instruction in the code of its latest model of ChatGPT to stop it from repeatedly referencing "goblins, gremlins, and other creatures." In an explanation posted Wednesday, the company said the "strange habit" came from its chatbot personality feature, specifically for users who chose the "Nerdy" personality. You are an unapologetically nerdy, playful and wise AI mentor to a human.
LLMDFA: Analyzing Dataflow in Code with Large Language Models
Dataflow analysis is a fundamental code analysis technique that identifies dependencies between program values. Traditional approaches typically necessitate successful compilation and expert customization, hindering their applicability and usability for analyzing uncompilable programs with evolving analysis needs in real-world scenarios. This paper presents LLMDFA, an LLM-powered compilation-free and customizable dataflow analysis framework. To address hallucinations for reliable results, we decompose the problem into several subtasks and introduce a series of novel strategies. Specifically, we leverage LLMs to synthesize code that outsources delicate reasoning to external expert tools, such as using a parsing library to extract program values of interest and invoking an automated theorem prover to validate path feasibility. Additionally, we adopt a few-shot chain-of-thought prompting to summarize dataflow facts in individual functions, aligning the LLMs with the program semantics of small code snippets to mitigate hallucinations. We evaluate LLMDFA on synthetic programs to detect three representative types of bugs and on real-world Android applications for customized bug detection. On average, LLMDFA achieves 87.10% precision and 80.77% recall, surpassing existing techniques with F1 score improvements of up to 0.35.
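For readers unfamiliar with dataflow analysis, the core question (does a value at one program point flow to another?) can be sketched as graph reachability. This is a generic illustration, not LLMDFA's pipeline; the function name `reaches`, the edge map, and the variable names are hypothetical.

```python
from collections import deque


def reaches(edges, source, sink):
    """Breadth-first search over value-dependency edges.

    edges maps a program value to the values it flows into.
    Hypothetical illustration only.
    """
    seen, queue = {source}, deque([source])
    while queue:
        node = queue.popleft()
        if node == sink:
            return True
        for nxt in edges.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


# Toy dependencies: x flows into y, y flows into z; a flows into b.
edges = {"x": ["y"], "y": ["z"], "a": ["b"]}

x_reaches_z = reaches(edges, "x", "z")
a_reaches_z = reaches(edges, "a", "z")
```

In a bug detector, `source` would be a value of interest (for example, a possibly null pointer) and `sink` a use site; a real analysis would also have to check that the connecting path is feasible, which is the step the abstract delegates to a theorem prover.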