 nytimes


A supplementary for the paper Falconn++: A Locality-sensitive Filtering Approach for Approximate Nearest Neighbor Search

Neural Information Processing Systems

We define $\mu = \mu_1 - \mu_2 > 0$ and set the threshold $t = \mu_1 = (1 - r^2/2)\sqrt{2 \ln D}$. Since $\mu/\sigma_2$ is monotonic with respect to $c$, farther points have a higher probability of being discarded. Therefore, the second property holds for any far-away point $y$, i.e. $\|y - q\| \ge cr$. The first property holds for any close point $x$, i.e. $\|x - q\| \le r$, since its projection value onto $r_1$ follows a Gaussian distribution with mean at least $\mu_1$. Figure 1 shows the recall-speed comparison between Falconn++ and recent theoretical LSF frameworks [2, 3]. All 3 data sets use $L = 100$, $\alpha = \{0.1, 0.5\}$,



Retrieval and Argumentation Enhanced Multi-Agent LLMs for Judgmental Forecasting

arXiv.org Artificial Intelligence

Judgmental forecasting is the task of making predictions about future events based on human judgment. This task can be seen as a form of claim verification, where the claim corresponds to a future event and the task is to assess the plausibility of that event. In this paper, we propose a novel multi-agent framework for claim verification, whereby different agents may disagree on claim veracity and bring specific evidence for and against the claims, represented as quantitative bipolar argumentation frameworks (QBAFs). We then instantiate the framework for supporting claim verification, with a variety of agents realised with Large Language Models (LLMs): (1) ArgLLM agents, an existing approach for claim verification that generates and evaluates QBAFs; (2) RbAM agents, whereby LLM-empowered Relation-based Argument Mining (RbAM) from external sources is used to generate QBAFs; (3) RAG-ArgLLM agents, extending ArgLLM agents with a form of Retrieval-Augmented Generation (RAG) of arguments from external sources. Finally, we conduct experiments with two standard judgmental forecasting datasets, with instances of our framework with two or three agents, empowered by six different base LLMs. We observe that combining evidence from agents can improve forecasting accuracy, especially in the case of three agents, while providing an explainable combination of evidence for claim verification.


Digital Domination: A Case for Republican Liberty in Artificial Intelligence

arXiv.org Artificial Intelligence

Artificial intelligence is set to revolutionize social and political life in unpredictable ways, raising questions about the principles that ought to guide its development and regulation. By examining digital advertising and social media algorithms, this article highlights how artificial intelligence already poses a significant threat to the republican conception of liberty -- or freedom from unaccountable power -- and thereby highlights the necessity of protecting republican liberty when integrating artificial intelligence into society. At an individual level, these algorithms can subconsciously influence behavior and thought, and those subject to this influence have limited power over the algorithms they engage. At the political level, these algorithms give technology company executives and other foreign parties the power to influence domestic political processes, such as elections; the multinational nature of algorithm-based platforms and the speed with which technology companies innovate make incumbent state institutions ineffective at holding these actors accountable. At both levels, artificial intelligence has thus created a new form of unfreedom: digital domination. By drawing on the works of Quentin Skinner, Philip Pettit, and other republican theorists, this article asserts that individuals must have mechanisms to hold algorithms (and those who develop them) accountable in order to be truly free.


The Reasonable Person Standard for AI

arXiv.org Artificial Intelligence

As AI systems are increasingly incorporated into domains where human behavior has set the norm, a challenge for AI governance and AI alignment research is to regulate their behavior in a way that is useful and constructive for society. One way to answer this question is to ask: how do we govern the human behavior that the models are emulating? To evaluate human behavior, the American legal system often uses the "Reasonable Person Standard." The idea of "reasonable" behavior comes up in nearly every area of law. The legal system often judges the actions of parties with respect to what a reasonable person would have done under similar circumstances. This paper argues that the reasonable person standard provides useful guidelines for the type of behavior we should develop, probe, and stress-test in models. It explains how reasonableness is defined and used in key areas of the law using illustrative cases, how the reasonable person standard could apply to AI behavior in each of these areas and contexts, and how our societal understanding of "reasonable" behavior provides useful technical goals for AI researchers.


The New York Times is suing OpenAI and Microsoft for copyright infringement

Engadget

The New York Times is suing OpenAI and Microsoft for using published news articles to train their artificial intelligence chatbots without an agreement that compensates it for its intellectual property. The NYT did not specify how much it seeks in payout from the companies, but said that "this action seeks to hold them responsible for the billions of dollars in statutory and actual damages." The NYT claims that OpenAI and Microsoft, the makers of ChatGPT and Copilot, "seek to free-ride on The Times's massive investment in its journalism" without having any licensing agreements. In one part of the complaint, the NYT highlights that its domain (www.nytimes.com) … It alleges that more than 66 million records, ranging from breaking news articles to op-eds, published across the NYT websites and other affiliated brands were used to train the AI models.


Wordle: A Microcosm of Life. Luck, Skill, Cheating, Loyalty, and Influence!

arXiv.org Artificial Intelligence

Wordle is a popular, online word game offered by the New York Times (nytimes.com). Currently there are some 2 million players of the English version worldwide. Players have 6 attempts to guess the daily word (target word) and after each attempt, the player receives color-coded information about the correctness and position of each letter in the guess. After either a successful completion of the puzzle or the final unsuccessful attempt, software can assess the player's luck and skill using Information Theory and can display data for the first, second, ..., sixth guesses of a random sample of all players. Recently, I discovered that the latter data is presented in a format that can easily be copied and pasted into a spreadsheet. I compiled data on Wordle players' first guesses from May 2023 - August 2023 and inferred some interesting information about Wordle players. A) Every day, about 0.2-0.5% of players solve the puzzle in one attempt. Because the odds of guessing the one of 2,315 possible target words at random is 0.043%, this implies that 4,000 - 10,000 players cheat by obtaining the target word outside of playing the game! B) At least 1/3 of the players have a favorite starting word, or cycle through several. And even though players should be aware that target words are never repeated, most players appear to remain loyal to their starting word even after its appearance as a target word. C) On August 15, 2023, about 30,000 players abruptly changed their starting word, presumably based on a crossword puzzle clue! Wordle players can be influenced! This study goes beyond social media postings, surveys, and Google Trends to provide solid, quantitative evidence about cheating in Wordle.
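The arithmetic behind finding A can be checked with a short Python sketch. It uses only the figures quoted in the abstract (about 2 million players, 2,315 possible target words, 0.2-0.5% first-attempt solves). The function names and the assumption that an honest first guess is uniformly random over the target list are illustrative and not taken from the paper.

```python
# Figures quoted in the abstract.
PLAYERS = 2_000_000      # approximate number of English-version players
TARGET_WORDS = 2_315     # possible target words


def chance_of_first_guess_hit(words: int = TARGET_WORDS) -> float:
    """Probability of hitting the target on the first guess,
    assuming (hypothetically) a uniformly random guess."""
    return 1 / words


def expected_lucky_solvers(players: int = PLAYERS,
                           words: int = TARGET_WORDS) -> float:
    """Number of players expected to solve in one attempt by luck alone."""
    return players * chance_of_first_guess_hit(words)


def observed_solvers(rate: float, players: int = PLAYERS) -> float:
    """Number of players corresponding to an observed first-attempt rate."""
    return players * rate


if __name__ == "__main__":
    print(f"chance per player: {chance_of_first_guess_hit():.3%}")
    print(f"expected by luck:  {expected_lucky_solvers():.0f}")
    for rate in (0.002, 0.005):
        observed = observed_solvers(rate)
        excess = observed - expected_lucky_solvers()
        print(f"rate {rate:.1%}: observed {observed:.0f}, "
              f"excess over luck {excess:.0f}")
```

Under these assumptions the chance baseline is roughly 864 players per day, and the observed rates of 0.2% and 0.5% correspond to 4,000 and 10,000 players, which matches the range quoted in the abstract. Subtracting the baseline gives a somewhat lower excess, so the quoted range appears to be the raw observed count.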


Inductive detection of Influence Operations via Graph Learning

arXiv.org Artificial Intelligence

Influence operations are large-scale efforts to manipulate public opinion. The rapid detection and disruption of these operations is critical for healthy public discourse. Emergent AI technologies may enable novel operations which evade current detection methods and influence public discourse on social media with greater scale, reach, and specificity. New methods with inductive learning capacity will be needed to identify these novel operations before they indelibly alter public opinion and events. We develop an inductive learning framework which: 1) determines content- and graph-based indicators that are not specific to any operation; 2) uses graph learning to encode abstract signatures of coordinated manipulation; and 3) evaluates generalization capacity by training and testing models across operations originating from Russia, China, and Iran. We find that this framework enables strong cross-operation generalization while also revealing salient indicators, illustrating a generic approach which directly complements transductive methodologies, thereby enhancing detection coverage.


The Ezra Klein Show - Why A.I. Might Not Take Your Job or Supercharge the Economy

#artificialintelligence

Typically when we put out a call for audience questions, there’s no single topic that dominates. This time was different. The questions we received were overwhelmingly focused on artificial intelligence: Do A.I. systems pose an existential threat to humanity? Will robots take our jobs? How could these machines potentially make our lives, and the lives of our children, better? So I asked the show’s senior editor, Roge Karma, to join me to talk through them. We also discuss my mixed feelings about the calls to “pause” A.I. development, why I’m less worried about rogue A.I. systems than the incentives of the companies and countries developing A.I., the need for a “public vision” for A.I. development, whether A.I. companions can help address widespread loneliness, why I’m skeptical that A.I. advances will lead to skyrocketing economic productivity, the possibility that A.I. advances will lead to a post-work utopia, why I think of A.I. less as a normal technology and more as a “hyper object,” what A.I. systems are unveiling about what it means to be human, and more.

Mentioned:
“Natural Selection Favors AIs over Humans” by Dan Hendrycks
“2022 Expert Survey on Progress in AI”
God, Human, Animal, Machine by Meghan O’Gieblyn
“Resisting dehumanization in the age of A.I.” with Emily Bender
“The Moral Economy of High-Tech Modernism” by Henry Farrell and Marion Fourcade

Recommendations:
“Some of Us Are Brave” by Danielle Ponder
“In Memory of a Honeybee” by Felix Rösch
“Clouds” by Felix Rösch and Laura Masotto
“Driven” by Felix Rösch
Mabe Fratti
Trance Frendz by Ólafur Arnalds and Nils Frahm

Thoughts? Guest suggestions? Email us at ezrakleinshow@nytimes.com. You can find transcripts (posted midday) and more episodes of “The Ezra Klein Show” at nytimes.com/ezra-klein-podcast, and you can find Ezra on Twitter @ezraklein. Book recommendations from all our guests are listed at https://www.nytimes.com/article/ezra-klein-show-book-recs.

This episode of “The Ezra Klein Show” is produced by Roge Karma, Kristin Lin and Jeff Geld. Fact-checking by Michelle Harris and Kate Sinclair. Mixing by Jeff Geld. Original music by Isaac Jones. Audience strategy by Shannon Busta. The executive producer of New York Times Opinion Audio is Annie-Rose Strasser. Special thanks to Sonia Herrero and Kristina Samulewski.


ChatGPT: More than a Weapon of Mass Deception, Ethical challenges and responses from the Human-Centered Artificial Intelligence (HCAI) perspective

arXiv.org Artificial Intelligence

This article explores the ethical problems arising from the use of ChatGPT as a kind of generative AI and suggests responses based on the Human-Centered Artificial Intelligence (HCAI) framework. The HCAI framework is appropriate because it understands technology above all as a tool to empower, augment, and enhance human agency while referring to human wellbeing as a grand challenge, thus perfectly aligning itself with ethics, the science of human flourishing. Further, HCAI provides objectives, principles, procedures, and structures for reliable, safe, and trustworthy AI which we apply to our ChatGPT assessments. The main danger ChatGPT presents is the propensity to be used as a weapon of mass deception (WMD) and an enabler of criminal activities involving deceit. We review technical specifications to better comprehend its potentials and limitations. We then suggest both technical (watermarking, styleme, detectors, and fact-checkers) and non-technical measures (terms of use, transparency, educator considerations, HITL) to mitigate ChatGPT misuse or abuse and recommend best uses (creative writing, non-creative writing, teaching and learning). We conclude with considerations regarding the role of humans in ensuring the proper use of ChatGPT for individual and social wellbeing.