Media
Fox News AI Newsletter: Amazon to cut workforce due to new tech
Amazon CEO Andy Jassy speaks during an Amazon Devices launch event in New York City, Feb. 26, 2025. TECH TAKEOVER: Amazon CEO Andy Jassy says artificial intelligence will "change the way" work is done and expects the company's total corporate workforce to be reduced as a result. 'GIANT OFFERS': Meta has allegedly tried to recruit employees from competitor OpenAI by offering bonuses as high as $100 million, OpenAI CEO Sam Altman claimed on a podcast that aired Tuesday. ENERGY OUTLOOK: The rise of artificial intelligence and the increasing popularity of cryptocurrency will continue to push electricity consumption to record highs in 2025 and 2026. POWER DRAIN CRISIS: Every time you ask ChatGPT a question, have it generate an image or let artificial intelligence summarize your email, something big is happening behind the scenes.
This May Be Trump's Most Consequential Decision Yet
This week, Emily Bazelon, John Dickerson, and David Plotz discuss whether the US should join Israel's war on Iran, the tragic Minnesota assassinations and why US political violence is surging now, and the Supreme Court's unsurprising but willfully obtuse decision to uphold Tennessee's youth transgender care ban.
Here are some notes and references from this week's show:
Alexander Ward, Lara Seligman, and Dustin Volz for The Wall Street Journal (Exclusive): Israel Built Its Case for War With Iran on New Intelligence. The U.S. Didn't Buy It.
Thomas L. Friedman for The New York Times (Opinion): The Smart Way for Trump to End the Israel-Iran War
Oren Cass for Understanding America (Substack): Is Israel the Ideal "America First" Ally?
Warren P. Strobel, Alex Horton, and Abigail Hauslohner for the Washington Post: Navigating Iran crisis, Trump relies on experience over star power
Amy Howe for SCOTUSblog: Court upholds Tennessee's ban on certain medical treatments for transgender minors
Abbie VanSickle for The New York Times: Sotomayor Writes the Court 'Abandons' Transgender Children to 'Political Whims'
Ella Lee for The Hill: Clarence Thomas urges courts to end deferring to 'experts' on gender-affirming care
Ian Millhiser for Vox: The Supreme Court's incoherent new attack on trans rights, explained
Here are this week's chatters:
Emily: A Family Matter by Claire Lynch; The Fall of Affirmative Action: Race, the Supreme Court, and the Future of Higher Education by Justin Driver; A Flower Traveled in My Blood: The Incredible True Story of the Grandmothers Who Fought to Find a Stolen Generation of Children by Haley Cohen Gilliland.
John: Mary Cunningham for CBS News: Federal Reserve holds its benchmark interest rate steady at today's FOMC meeting; ABA Banking Journal: Fed's Powell says some areas of U.S. may be 'uninsurable' in next decade
David: Trip Gabriel for the New York Times: William Langewiesche, the 'Steve McQueen of Journalism,' Dies at 70
For this week's Slate Plus bonus episode, Emily, John, and David discuss the exciting possibilities and likely limitations of using AI tools for historical research and writing.
'We're all connected – but it's not the connection I imagined': Hideo Kojima on Death Stranding 2
Hideo Kojima – the acclaimed video game director who helmed the stealth-action Metal Gear series for decades before founding his own company to make Death Stranding, a supernatural post-apocalyptic delivery game this publication described as "2019's most interesting blockbuster" – is still starstruck, or perhaps awestruck. "George [Miller] is my sensei, my God," he proclaims gleefully. Kojima is visiting Australia for a sold-out chat with Miller, the creator of the Mad Max film franchise, at the Sydney film festival. The two struck up an unlikely but fierce friendship nearly a decade ago, and Kojima says that, as a teenager, the first two Mad Max films inspired him to become a movie director and thus, eventually, a video game maker. At the panel later, Miller is equally effusive, calling Kojima "almost my brother"; the Australian even lent his appearance to a major character in Kojima's latest game, Death Stranding 2. It's actually because of Miller that much of this latest game is set in a heavily fictionalised version of Australia, Kojima jokes.
PhantomHunter: Detecting Unseen Privately-Tuned LLM-Generated Text via Family-Aware Learning
Shi, Yuhui, Yang, Yehan, Sheng, Qiang, Mi, Hao, Hu, Beizhe, Xu, Chaoxi, Cao, Juan
With the popularity of large language models (LLMs), undesirable societal problems like misinformation production and academic misconduct have become more severe, making LLM-generated text detection more important than ever. Although existing methods have made remarkable progress, a new challenge posed by text from privately tuned LLMs remains underexplored. Users can easily obtain private LLMs by fine-tuning an open-source one with private corpora, which causes a significant performance drop for existing detectors in practice. To address this issue, we propose PhantomHunter, an LLM-generated text detector specialized for detecting text from unseen, privately-tuned LLMs. Its family-aware learning framework captures family-level traits shared across the base models and their derivatives, instead of memorizing individual characteristics. Experiments on data from LLaMA, Gemma, and Mistral families show its superiority over 7 baselines and 3 industrial services, with F1 scores of over 96%.
CC-LEARN: Cohort-based Consistency Learning
Ye, Xiao, Shrivastava, Shaswat, Li, Zhaonan, Dineen, Jacob, Lu, Shijie, Ahuja, Avneet, Shen, Ming, Xu, Zhikun, Zhou, Ben
Large language models excel at many tasks but still struggle with consistent, robust reasoning. We introduce Cohort-based Consistency Learning (CC-Learn), a reinforcement learning framework that improves the reliability of LLM reasoning by training on cohorts of similar questions derived from shared programmatic abstractions. To enforce cohort-level consistency, we define a composite objective combining cohort accuracy, a retrieval bonus for effective problem decomposition, and a rejection penalty for trivial or invalid lookups that reinforcement learning can directly optimize, unlike supervised fine-tuning. Optimizing this reward guides the model to adopt uniform reasoning patterns across all cohort members. Experiments on challenging reasoning benchmarks (including ARC-Challenge and StrategyQA) show that CC-Learn boosts both accuracy and reasoning stability over pretrained and SFT baselines. These results demonstrate that cohort-level RL effectively enhances reasoning consistency in LLMs.
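To make the composite objective concrete, here is a minimal sketch of a cohort-level reward. The abstract names the three ingredients (cohort accuracy, a retrieval bonus, a rejection penalty) but not how they are combined, so the additive form, the weights `bonus_weight` and `penalty_weight`, and the `Rollout` fields are all assumptions made for illustration, not the authors' definition.

```python
from dataclasses import dataclass


@dataclass
class Rollout:
    """One model attempt at one question in a cohort (hypothetical structure)."""
    correct: bool            # final answer matched the reference
    useful_retrievals: int   # lookups that helped decompose the problem
    rejected_lookups: int    # trivial or invalid lookups


def cohort_reward(rollouts, bonus_weight=0.1, penalty_weight=0.1):
    """Assumed additive reward: accuracy + weighted bonus - weighted penalty.

    All three terms are averaged over the cohort, so the reward is shared
    by every member rather than computed per question.
    """
    if not rollouts:
        raise ValueError("cohort must contain at least one rollout")
    n = len(rollouts)
    accuracy = sum(r.correct for r in rollouts) / n
    bonus = bonus_weight * sum(r.useful_retrievals for r in rollouts) / n
    penalty = penalty_weight * sum(r.rejected_lookups for r in rollouts) / n
    return accuracy + bonus - penalty


cohort = [Rollout(True, 2, 0), Rollout(False, 0, 1)]
reward = cohort_reward(cohort)
```

Because the score is computed over the whole cohort, a model that solves one variant of a question but fails its siblings is rewarded less than one that answers all of them the same way, which is the consistency pressure the abstract describes.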
SonicVerse: Multi-Task Learning for Music Feature-Informed Captioning
Chopra, Anuradha, Roy, Abhinaba, Herremans, Dorien
Detailed captions that accurately reflect the characteristics of a music piece can enrich music databases and drive forward research in music AI. This paper introduces a multi-task music captioning model, SonicVerse, that integrates caption generation with auxiliary music feature detection tasks such as key detection, vocals detection, and more, so as to directly capture both low-level acoustic details and high-level musical attributes. The key contribution is a projection-based architecture that transforms audio input into language tokens, while simultaneously detecting music features through dedicated auxiliary heads. The outputs of these heads are also projected into language tokens, to enhance the captioning input. This framework not only produces rich, descriptive captions for short music fragments but also directly enables the generation of detailed time-informed descriptions for longer music pieces, by chaining the outputs using a large language model. To train the model, we extended the MusicBench dataset by annotating it with music features using MIRFLEX, a modular music feature extractor, resulting in paired audio, captions and music feature data. Experimental results show that incorporating features in this way improves the quality and detail of the generated captions.
Identifying economic narratives in large text corpora -- An integrated approach using Large Language Models
Schmidt, Tobias, Lange, Kai-Robin, Reccius, Matthias, Müller, Henrik, Roos, Michael, Jentsch, Carsten
As interest in economic narratives has grown in recent years, so has the number of pipelines dedicated to extracting such narratives from texts. Pipelines often employ a mix of state-of-the-art natural language processing techniques, such as BERT, to tackle this task. While effective on foundational linguistic operations essential for narrative extraction, such models lack the deeper semantic understanding required to distinguish extracting economic narratives from merely conducting classic tasks like Semantic Role Labeling. Instead of relying on complex model pipelines, we evaluate the benefits of Large Language Models (LLMs) by analyzing a corpus of Wall Street Journal and New York Times newspaper articles about inflation. We apply a rigorous narrative definition and compare GPT-4o outputs to gold-standard narratives produced by expert annotators. Our results suggest that GPT-4o is capable of extracting valid economic narratives in a structured format, but still falls short of expert-level performance when handling complex documents and narratives. Given the novelty of LLMs in economic research, we also provide guidance for future work in economics and the social sciences that employs LLMs to pursue similar objectives.
Enhancing Goal-oriented Proactive Dialogue Systems via Consistency Reflection and Correction
Zhang, Didi, Fan, Yaxin, Li, Peifeng, Zhu, Qiaoming
Goal-oriented proactive dialogue systems are designed to guide user conversations seamlessly towards specific objectives by planning a goal-oriented path. However, previous research has focused predominantly on optimizing these paths while neglecting the inconsistencies that may arise between generated responses and dialogue contexts, including user profiles, dialogue history, domain knowledge, and subgoals. To address this issue, we introduce a model-agnostic two-stage Consistency Reflection and Correction (CRC) framework. Specifically, in the consistency reflection stage, the model is prompted to reflect on the discrepancies between generated responses and dialogue contexts, identifying inconsistencies and suggesting possible corrections. In the consistency correction stage, the model generates responses that are more consistent with the dialogue context based on these reflection results. We conducted experiments on various model architectures with different parameter sizes, including encoder-decoder models (BART, T5) and decoder-only models (GPT-2, DialoGPT, Phi3, Mistral and LLaMA3), and the experimental results on three datasets demonstrate that our CRC framework significantly improves the consistency between generated responses and dialogue contexts.
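The two stages can be sketched as two chained model calls. This is only an illustration of the control flow described above: the `generate` callable stands in for any text-generation model, and the prompt wording and the function name `reflect_and_correct` are hypothetical, not taken from the paper.

```python
from typing import Callable


def reflect_and_correct(generate: Callable[[str], str], context: str, draft: str) -> str:
    """Run a reflection pass, then a correction pass conditioned on it."""
    # Stage 1 (consistency reflection): ask the model to compare the draft
    # response against the dialogue context and list any inconsistencies.
    reflection_prompt = (
        "Dialogue context:\n" + context + "\n\n"
        "Draft response:\n" + draft + "\n\n"
        "List any inconsistencies between the draft and the context, "
        "and suggest corrections."
    )
    reflection = generate(reflection_prompt)

    # Stage 2 (consistency correction): regenerate the response using the
    # reflection as additional guidance.
    correction_prompt = (
        "Dialogue context:\n" + context + "\n\n"
        "Draft response:\n" + draft + "\n\n"
        "Reflection:\n" + reflection + "\n\n"
        "Rewrite the response so that it is consistent with the context."
    )
    return generate(correction_prompt)
```

Since the function depends only on a string-to-string callable, the same wrapper applies to encoder-decoder and decoder-only models alike, which matches the model-agnostic claim in the abstract.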
Influential Bandits: Pulling an Arm May Change the Environment
While classical formulations of multi-armed bandit problems assume that each arm's reward is independent and stationary, real-world applications often involve non-stationary environments and interdependencies between arms. In particular, selecting one arm may influence the future rewards of other arms, a scenario not adequately captured by existing models such as rotting bandits or restless bandits. To address this limitation, we propose the influential bandit problem, which models inter-arm interactions through an unknown, symmetric, positive semi-definite interaction matrix that governs the dynamics of arm losses. We formally define this problem and establish two regret lower bounds, including a superlinear $\Omega(T^2 / \log^2 T)$ bound for the standard LCB algorithm (loss minimization version of UCB) and an algorithm-independent $\Omega(T)$ bound, which highlight the inherent difficulty of the setting. We then introduce a new algorithm based on a lower confidence bound (LCB) estimator tailored to the structure of the loss dynamics. Under mild assumptions, our algorithm achieves a regret of $O(KT \log T)$, which is nearly optimal in terms of its dependence on the time horizon. The algorithm is simple to implement and computationally efficient. Empirical evaluations on both synthetic and real-world datasets demonstrate the presence of inter-arm influence and confirm the superior performance of our method compared to conventional bandit algorithms.
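For orientation, the standard LCB baseline mentioned above (the loss-minimizing counterpart of UCB1) can be sketched as follows. This is the classical stationary algorithm, not the paper's tailored estimator, and the exploration term $\sqrt{2 \log t / n}$ is the usual UCB1 choice, assumed here. The function names and the toy loss functions are illustrative.

```python
import math
import random


def lcb_select(counts, mean_losses, t):
    """Pick the arm with the smallest lower confidence bound on its loss."""
    # Pull every arm once before trusting the confidence bounds.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    indices = [
        mean - math.sqrt(2.0 * math.log(t) / n)
        for mean, n in zip(mean_losses, counts)
    ]
    return min(range(len(indices)), key=indices.__getitem__)


def run_lcb(loss_fns, horizon, seed=0):
    """Play `horizon` rounds; each loss function takes a random.Random."""
    rng = random.Random(seed)
    k = len(loss_fns)
    counts = [0] * k
    means = [0.0] * k
    for t in range(1, horizon + 1):
        arm = lcb_select(counts, means, t)
        loss = loss_fns[arm](rng)
        counts[arm] += 1
        # Incremental update of the empirical mean loss.
        means[arm] += (loss - means[arm]) / counts[arm]
    return counts


# Toy stationary example: arm 0 always costs 0.1, arm 1 always costs 0.9.
pulls = run_lcb([lambda rng: 0.1, lambda rng: 0.9], horizon=500)
```

The index relies on past losses being representative of future ones. When pulling an arm shifts the losses of other arms, that assumption no longer holds, which is the failure mode behind the superlinear lower bound stated in the abstract.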
Contributions to Representation Learning with Graph Autoencoders and Applications to Music Recommendation
Graph autoencoders (GAE) and variational graph autoencoders (VGAE) emerged as two powerful groups of unsupervised node embedding methods, with various applications to graph-based machine learning problems such as link prediction and community detection. Nonetheless, at the beginning of this Ph.D. project, GAE and VGAE models also suffered from key limitations that prevented their adoption in industry. In this thesis, we present several contributions to improve these models, with the general aim of facilitating their use to address industrial-level problems involving graph representations. Firstly, we propose two strategies to overcome the scalability issues of previous GAE and VGAE models, making it possible to effectively train these models on large graphs with millions of nodes and edges. These strategies leverage graph degeneracy and stochastic subgraph decoding techniques, respectively. In addition, we introduce Gravity-Inspired GAE and VGAE, providing the first extensions of these models for directed graphs, which are ubiquitous in industrial applications. We also consider extensions of GAE and VGAE models for dynamic graphs. Furthermore, we argue that GAE and VGAE models are often unnecessarily complex, and we propose to simplify them by leveraging linear encoders. Lastly, we introduce Modularity-Aware GAE and VGAE to improve community detection on graphs, while jointly preserving good performance on link prediction. In the last part of this thesis, we evaluate our methods on several graphs extracted from the music streaming service Deezer. We put the emphasis on graph-based music recommendation problems. In particular, we show that our methods can improve the detection of communities of similar musical items to recommend to users, that they can effectively rank similar artists in a cold start setting, and that they permit modeling the music genre perception across cultures.
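As an illustration of the linear-encoder idea, the sketch below computes node embeddings as a single linear map of the normalized adjacency matrix and decodes edge probabilities with a sigmoid of inner products. The abstract does not spell out the formulation, so the featureless setting (identity node features), the symmetric normalization with self-loops, and the inner-product decoder are assumptions based on the common GAE setup. Pure Python lists are used to keep the example dependency-free.

```python
import math


def normalize_adjacency(adj):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2} with self-loops."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Plain matrix product for lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def linear_encode(adj, weights):
    """Linear encoder: embeddings Z = A_norm @ W (no hidden layer, no activation)."""
    return matmul(normalize_adjacency(adj), weights)


def decode(z):
    """Inner-product decoder: edge probability sigmoid(z_i . z_j)."""
    z_t = [list(col) for col in zip(*z)]
    logits = matmul(z, z_t)
    return [[1.0 / (1.0 + math.exp(-v)) for v in row] for row in logits]


# Two connected nodes with a 1-dimensional embedding; the weights are arbitrary.
adj = [[0.0, 1.0], [1.0, 0.0]]
z = linear_encode(adj, [[1.0], [3.0]])
probs = decode(z)
```

In training, the weight matrix would be learned by minimizing a reconstruction loss between the decoded probabilities and the observed adjacency; that step is omitted here.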