tragedy


I'm a drone CEO. America must protect its airspace now, before it's too late

FOX News

Drones have rapidly evolved from backyard novelties into critical components of today's infrastructure, and now into one of the fastest-growing threats to our national security. As the CEO of one of the nation's largest drone technology companies and a former naval officer, I've seen firsthand how powerful these tools can be. I've also seen how dangerous they are when left unregulated: they become liabilities, capable of disruption, destruction and danger. Just days ago, amid deadly flash floods in Texas, a private drone collided with a rescue helicopter during an active life-saving mission. The crash forced the crew to land, grounding a critical asset in the middle of an unfolding emergency.


Where Are All the AI Drugs?

WIRED

A new drug usually starts with a tragedy. Born in what is now Zimbabwe, the child of a mechanic and a radiology technician, Ray fled with his family to South Africa during the Zimbabwean War of Liberation. He remembers the journey there in 1980 in a convoy of armored cars. As the sun blazed down, a soldier taught 8-year-old Ray how to fire a machine gun. But his mother kept having to stop.


Leveraging Implicit Sentiments: Enhancing Reliability and Validity in Psychological Trait Evaluation of LLMs

Ma, Huanhuan, Gong, Haisong, Yi, Xiaoyuan, Xie, Xing, Xu, Dongkuan

arXiv.org Artificial Intelligence

Recent advancements in Large Language Models (LLMs) have led to their increasing integration into human life. As they shift from mere tools to human-like assistants, understanding their psychological aspects, such as emotional tendencies and personalities, becomes essential for ensuring their trustworthiness. However, current psychological evaluations of LLMs, often based on human psychological assessments like the BFI, face significant limitations: their results often lack reliability and have limited validity when predicting LLM behavior in real-world scenarios. In this work, we introduce a novel evaluation instrument specifically designed for LLMs, called the Core Sentiment Inventory (CSI). CSI is a bilingual tool, covering both English and Chinese, that implicitly evaluates models' sentiment tendencies, providing an insightful psychological portrait of LLMs across three dimensions: optimism, pessimism, and neutrality. Through extensive experiments, we demonstrate that: 1) CSI effectively captures nuanced emotional patterns, revealing significant variation in LLMs across languages and contexts; 2) compared to current approaches, CSI significantly improves reliability, yielding more consistent results; and 3) the correlation between CSI scores and the sentiment of LLMs' real-world outputs exceeds 0.85, demonstrating strong validity in predicting LLM behavior. We make CSI publicly available at https://github.com/dependentsign/CSI.
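The abstract does not spell out the probing mechanics, but the core idea of an implicit sentiment inventory can be sketched in a few lines of Python. In the minimal sketch below, query_model, the prompt template, and the word list are all illustrative stand-ins, not the paper's actual instrument; the released CSI at the repository above is the authoritative version.

    # Minimal sketch of an implicit sentiment probe in the spirit of CSI.
    # NOTE: query_model, PROMPT, and NEUTRAL_WORDS are hypothetical stand-ins,
    # not the instrument released by the paper.
    from collections import Counter
    from typing import Callable

    PROMPT = (
        "Complete the sentence with the first word that comes to mind. "
        "When I think of '{word}', I feel it is ... "
        "(answer with exactly one of: positive, negative, neutral)"
    )

    NEUTRAL_WORDS = ["river", "ladder", "Tuesday", "envelope", "gravel"]

    def sentiment_profile(query_model: Callable[[str], str],
                          words: list[str]) -> dict[str, float]:
        """Probe a model with nominally neutral words and tally its leanings."""
        counts = Counter()
        for word in words:
            reply = query_model(PROMPT.format(word=word)).strip().lower()
            if "positive" in reply:
                counts["optimism"] += 1
            elif "negative" in reply:
                counts["pessimism"] += 1
            else:
                counts["neutrality"] += 1
        total = sum(counts.values())
        return {d: counts[d] / total
                for d in ("optimism", "pessimism", "neutrality")}

    if __name__ == "__main__":
        # A trivially pessimistic stub model, just to exercise the interface.
        print(sentiment_profile(lambda prompt: "negative", NEUTRAL_WORDS))

Because the probe words carry no inherent sentiment, systematic deviation from neutrality reveals the model's implicit leaning rather than a property of the stimulus, which is what gives this style of instrument its reliability advantage over self-report questionnaires.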



Uncovering Differences in Persuasive Language in Russian versus English Wikipedia

Li, Bryan, Panasyuk, Aleksey, Callison-Burch, Chris

arXiv.org Artificial Intelligence

We study how differences in persuasive language across Wikipedia articles, written in either English or Russian, can uncover each culture's distinct perspective on different subjects. We develop a large language model (LLM) powered system to identify instances of persuasive language in multilingual texts. Instead of directly prompting LLMs to detect persuasion, which is subjective and difficult, we propose to reframe the task by asking high-level questions (HLQs) that capture different persuasive aspects. Importantly, these HLQs are authored by LLMs themselves: LLMs over-generate a large set of HLQs, which are subsequently filtered to a small set aligned with human labels for the original task. We then apply our approach to a large-scale, bilingual dataset of Wikipedia articles (88K total), using a two-stage identify-then-extract prompting strategy to find instances of persuasion. We quantify the amount of persuasion per article and explore the differences in persuasion through several experiments on the paired articles. Notably, we generate rankings of articles by persuasion in both languages. These rankings match our intuitions about culturally salient subjects: Russian Wikipedia highlights subjects related to Ukraine, while English Wikipedia highlights the Middle East. Grouping subjects into larger topics, we find that politically related events contain more persuasion than others. We further demonstrate that HLQs obtain similar performance when posed in either English or Russian. Our methodology enables cross-lingual, cross-cultural understanding at scale, and we release our code, prompts, and data.
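The two-stage identify-then-extract strategy is straightforward to picture in code. The Python sketch below is a hedged illustration only: ask_llm is a placeholder for any chat-completion call, and the two example HLQs are invented here for illustration, not the filtered set the authors release with their prompts.

    # Hypothetical sketch of two-stage identify-then-extract prompting with HLQs.
    # ask_llm and the example HLQs are stand-ins, not the paper's released
    # prompts; see the authors' released code and data for the real set.
    from typing import Callable

    HLQS = [
        "Does this passage appeal to the reader's emotions rather than evidence?",
        "Does this passage present one side of a dispute as obviously correct?",
    ]

    def identify_then_extract(ask_llm: Callable[[str], str],
                              article: str) -> list[str]:
        """Stage 1: answer each HLQ yes/no. Stage 2: extract spans when it fires."""
        findings = []
        for hlq in HLQS:
            verdict = ask_llm(
                f"Article:\n{article}\n\nQuestion: {hlq}\nAnswer yes or no."
            )
            if verdict.strip().lower().startswith("yes"):
                spans = ask_llm(
                    f"Article:\n{article}\n\nQuote the sentences that make the "
                    f"answer to this question 'yes': {hlq}"
                )
                findings.append(spans)
        return findings

Under this framing, per-article persuasion could be quantified as the number of HLQs that fire or the count of extracted spans, which is what makes the cross-language article rankings described above possible.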


"Martyr!" Plays Its Subject for Laughs but Is Also Deadly Serious

The New Yorker

A novel with the title "Martyr!" arrives on the scene preloaded and explosive. The word is fraught, even more so now than when the book's author, the Iranian American poet Kaveh Akbar, chose it. It signals that Akbar is fascinated with words in action, words that someone has reached for in a state of excitation, like joy or deep grief. The shouter of "Martyr!" bears something within him which he is determined to force the word to express. But the title's punctuation ironizes or undercuts this intention, as if to suggest that language signifies in ways that are impossible to control.


First Tragedy, then Parse: History Repeats Itself in the New Era of Large Language Models

Saphra, Naomi, Fleisig, Eve, Cho, Kyunghyun, Lopez, Adam

arXiv.org Artificial Intelligence

Many NLP researchers are experiencing an existential crisis triggered by the astonishing success of ChatGPT and other systems based on large language models (LLMs). After such a disruptive change to our understanding of the field, what is left to do? Taking a historical lens, we look for guidance from the first era of LLMs, which began in 2005 with large n-gram models for machine translation. We identify durable lessons from the first era, and more importantly, we identify evergreen problems where NLP researchers can continue to make meaningful contributions in areas where LLMs are ascendant. Among these lessons, we discuss the primacy of hardware advancement in shaping the availability and importance of scale, as well as the urgent challenge of quality evaluation, both automated and human. We argue that disparities in scale are transient and that researchers can work to reduce them; that data, rather than hardware, is still a bottleneck for many meaningful applications; that meaningful evaluation informed by actual use is still an open problem; and that there is still room for speculative approaches.


Preventing mass shootings with AI detection: Navy SEALs-inspired invention

FOX News

Maine, a state often admired for its serenity and scenic beauty, recently witnessed an unimaginable nightmare. Robert Card, an assault rifle-carrying gun instructor with documented mental troubles, gunned down 18 innocent people. The 40-year-old suspect was found dead two days later after an intense search by law enforcement. As families and communities mourn the loss, an important question is raised: Could alerting the police mere minutes earlier than the first 911 call have changed the outcome?


The Tragedy of Google Search

The Atlantic - Technology

What is Google Search in 2023? A site that started 25 years ago as a list of blue links has mutated beyond recognition. Today, Google isn't just an index to help sort through the endless libraries of online information; it's a reference guide for the physical world too, having mapped most corners of the Earth and cataloged its contents. It is part encyclopedia and part predictive engine, guessing what you might be typing or thinking, serving information based on what others before you typed. It is Moviefone and the stock ticker, a well-trained chatbot, an image repository, a shopping mall.


AI and the tyranny of the data commons

Al Jazeera

I am here to tell you the sad but true story of the demise of the sharing economy. Remember how we were told, back in the 1990s and 2000s, that we were contributing to the creation of the largest commons known to humanity? Well, to paraphrase The Lord of the Rings, we were all of us deceived, for another ring was made. Artificial intelligence (AI) is making that clearer than ever. The free data we generated by spending thousands of hours on Big Tech's platforms has been appropriated and converted into training data for AI models.