Media


Interpreting Learned Feedback Patterns in Large Language Models
Luke Marks, Amir Abdullah, Clement Neo

Neural Information Processing Systems

Reinforcement learning from human feedback (RLHF) is widely used to train large language models (LLMs). However, it is unclear whether LLMs accurately learn the underlying preferences in human feedback data. We coin the term Learned Feedback Pattern (LFP) for patterns in an LLM's activations learned during RLHF that improve its performance on the fine-tuning task. We hypothesize that LLMs whose LFPs are accurately aligned with the fine-tuning feedback exhibit consistent activation patterns for outputs that would have received similar feedback during RLHF. To test this, we train probes to estimate the feedback signal implicit in the activations of a fine-tuned LLM. We then compare these estimates to the true feedback, measuring how faithfully the LFPs track the fine-tuning feedback. Our probes are trained on a condensed, sparse, and interpretable representation of LLM activations, making it easier to correlate features of the input with our probes' predictions. We validate our probes by comparing the neural features they associate with positive feedback against the features that GPT-4 describes and classifies as related to LFPs. Understanding LFPs can help minimize discrepancies between LLM behavior and training objectives, which is essential for the safety and alignment of LLMs.
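A minimal sketch of this probing setup is below, assuming activations have already been encoded into a sparse dictionary basis (e.g., by a sparse autoencoder). The synthetic data, the ridge-regression probe, and every name here are illustrative assumptions, not the paper's actual pipeline.

```python
# Minimal sketch: regress a scalar feedback signal from sparse LLM features.
# The data is synthetic; real inputs would be sparse-autoencoder encodings
# of a fine-tuned LLM's activations, paired with the RLHF feedback signal.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

n_samples, n_features = 2048, 512            # outputs x dictionary size
X = rng.random((n_samples, n_features))
X[X < 0.95] = 0.0                            # most features inactive (sparse)
true_w = rng.normal(size=n_features)
y = X @ true_w + 0.1 * rng.normal(size=n_samples)  # stand-in feedback signal

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
probe = Ridge(alpha=1.0).fit(X_tr, y_tr)

# Features with the largest |weight| are the ones the probe most strongly
# associates with positive (or negative) implied feedback.
top = np.argsort(-np.abs(probe.coef_))[:10]
print("held-out R^2:", probe.score(X_te, y_te))
print("most feedback-correlated feature indices:", top)
```

The sparse basis is what makes the validation step possible: because each feature is (ideally) interpretable, the probe's largest weights can be read off and cross-checked against feature descriptions, as the abstract describes.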


5 AI terms you keep hearing and what they actually mean

FOX News

Whether it's powering your phone's autocorrect or helping someone create a new recipe with a few words, artificial intelligence (AI) is everywhere right now. But if you're still nodding along when someone mentions "neural networks" or "generative AI," you're not alone. Today I am breaking down five buzzy AI terms that you've probably seen in headlines, group chats or app updates, minus the tech talk. Understanding these basics will help you talk AI with confidence, even if you're not a programmer.


Localize, Understand, Collaborate: Semantic-Aware Dragging via Intention Reasoner

Neural Information Processing Systems

Flexible and accurate drag-based editing is a challenging task that has recently garnered significant attention. Current methods typically model this problem as automatically learning "how to drag" through point dragging and often produce a single deterministic estimate, which presents two key limitations: 1) overlooking the inherently ill-posed nature of drag-based editing, where multiple results may correspond to a given input, as illustrated in Figure 1; 2) ignoring the constraint of image quality, which may lead to unexpected distortion. To alleviate this, we propose LucidDrag, which shifts the focus from "how to drag" to a "what-then-how" paradigm. LucidDrag comprises an intention reasoner and a collaborative guidance sampling mechanism. The former infers several optimal editing strategies, identifying what content to edit and along what semantic direction. Based on these inferred intentions, the latter addresses "how to drag" by collaboratively integrating existing editing guidance with the newly proposed semantic guidance and quality guidance. Specifically, semantic guidance is derived by establishing a semantic editing direction based on the reasoned intentions, while quality guidance is achieved through classifier guidance using an image fidelity discriminator. Both qualitative and quantitative comparisons demonstrate the superiority of LucidDrag over previous methods.
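As a rough illustration of how several guidance terms can be combined in one denoising step, consider the sketch below. The function and argument names, the weights, and the random stand-in tensors are assumptions for exposition; they are not LucidDrag's actual interfaces.

```python
# Schematic sketch: folding editing, semantic, and quality guidance into a
# single adjusted noise estimate during diffusion sampling. All names and
# weights are illustrative placeholders.
import numpy as np

def guided_noise_estimate(eps_base, grad_edit, grad_semantic, grad_quality,
                          w_edit=1.0, w_sem=0.5, w_qual=0.3):
    """Shift the base noise prediction along each guidance gradient.

    eps_base      : the diffusion model's base noise prediction
    grad_edit     : gradient from the point-dragging (editing) objective
    grad_semantic : gradient toward the reasoned semantic direction
    grad_quality  : classifier gradient from an image-fidelity discriminator
    """
    return (eps_base
            - w_edit * grad_edit
            - w_sem * grad_semantic
            - w_qual * grad_quality)

# Toy call with random tensors standing in for real gradients.
rng = np.random.default_rng(0)
shape = (4, 64, 64)
eps = guided_noise_estimate(*(rng.normal(size=shape) for _ in range(4)))
print(eps.shape)  # (4, 64, 64)
```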


Rethinking Score Distillation as a Bridge Between Image Distributions
David McAllister, Songwei Ge, Jia-Bin Huang, David W. Jacobs

Neural Information Processing Systems

Score distillation sampling (SDS) has proven to be an important tool, enabling the use of large-scale diffusion priors for tasks operating in data-poor domains. Unfortunately, SDS has a number of characteristic artifacts that limit its usefulness in general-purpose applications. In this paper, we make progress toward understanding the behavior of SDS and its variants by viewing them as solving for an optimal-cost transport path from a source distribution to a target distribution. Under this new interpretation, these methods seek to transport corrupted images (source) to the natural image distribution (target). We argue that current methods' characteristic artifacts are caused by (1) linear approximation of the optimal path and (2) poor estimates of the source distribution. We show that calibrating the text conditioning of the source distribution can produce high-quality generation and translation results with little extra overhead. Our method can be easily applied across many domains, matching or beating the performance of specialized methods. We demonstrate its utility in text-to-2D, text-based NeRF optimization, translating paintings to real images, optical illusion generation, and 3D sketch-to-real. We compare our method to existing approaches for score distillation sampling and show that it can produce high-frequency details with realistic colors.
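For readers unfamiliar with SDS itself: the standard update nudges a differentiable image x along w(t) * (eps_hat(x_t; y, t) - eps), the gap between the diffusion prior's noise prediction and the noise actually added. Below is a minimal, self-contained sketch of one SDS step in PyTorch; the random noise predictor stands in for a pretrained diffusion UNet, and the schedule and weighting are simplified assumptions.

```python
# Minimal sketch of one score distillation sampling (SDS) step.
# `noise_predictor` is a stand-in for a frozen, pretrained diffusion model.
import torch

def sds_step(x, noise_predictor, alphas_cumprod, optimizer):
    """Apply one SDS update to a differentiable image/render x."""
    t = torch.randint(0, len(alphas_cumprod), (1,)).item()
    a = alphas_cumprod[t]
    noise = torch.randn_like(x)
    x_t = a.sqrt() * x + (1 - a).sqrt() * noise   # forward-diffuse x to time t
    with torch.no_grad():
        eps_pred = noise_predictor(x_t, t)        # frozen prior's noise guess
    w = 1 - a                                     # a common weighting choice
    grad = w * (eps_pred - noise)                 # SDS gradient direction
    optimizer.zero_grad()
    x.backward(gradient=grad)                     # inject grad, skip the UNet
    optimizer.step()

# Toy usage: optimize raw pixels against a random "prior".
x = torch.zeros(1, 3, 8, 8, requires_grad=True)
opt = torch.optim.Adam([x], lr=1e-2)
alphas = torch.linspace(0.9999, 0.01, 1000)       # stand-in cumulative alphas
sds_step(x, lambda x_t, t: torch.randn_like(x_t), alphas, opt)
print(float(x.abs().mean()))                      # pixels moved off zero
```

The paper's transport view reads this same update as a step along a source-to-target path, attributing the artifacts to the linear approximation of that path and to poor estimates of the source distribution.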


AI to monitor NYC subway safety as crime concerns rise

FOX News

Imagine having a tireless guardian watching over you during your subway commute. New York City's subway system is testing artificial intelligence to boost security and reduce crime. Michael Kemper, a 33-year NYPD veteran and the chief security officer for the Metropolitan Transportation Authority (MTA), which is the largest transit agency in the United States, is leading the rollout of AI software designed to spot suspicious behavior as it happens. The MTA says this technology represents the future of subway surveillance and reassures riders that privacy concerns are being taken seriously.


The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale

Neural Information Processing Systems

The performance of a large language model (LLM) depends heavily on the quality and size of its pretraining dataset. However, the pretraining datasets for state-of-the-art open LLMs like Llama 3 and Mixtral are not publicly available, and very little is known about how they were created. In this work, we introduce FineWeb, a 15-trillion-token dataset derived from 96 Common Crawl snapshots that produces better-performing LLMs than other open pretraining datasets. To advance the understanding of how best to curate high-quality pretraining datasets, we carefully document and ablate all of the design choices used in FineWeb, including in-depth investigations of deduplication and filtering strategies. In addition, we introduce FineWeb-Edu, a 1.3-trillion-token collection of educational text filtered from FineWeb.
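To give a flavor of the deduplication stage such pipelines ablate, here is an illustrative near-duplicate filter built on MinHash LSH (using the third-party datasketch package). The shingle size, permutation count, and similarity threshold are assumptions for the example, not FineWeb's actual settings.

```python
# Illustrative fuzzy deduplication with MinHash + LSH (pip install datasketch).
from datasketch import MinHash, MinHashLSH

def minhash(text, num_perm=128, shingle=5):
    """Hash a document's 5-word shingles into a MinHash signature."""
    m = MinHash(num_perm=num_perm)
    tokens = text.lower().split()
    for i in range(max(1, len(tokens) - shingle + 1)):
        m.update(" ".join(tokens[i:i + shingle]).encode("utf-8"))
    return m

docs = {
    "a": "the quick brown fox jumps over the lazy dog near the river bank",
    "b": "the quick brown fox jumps over the lazy dog near the river bank today",
    "c": "pretraining data quality matters more than raw token count",
}

lsh = MinHashLSH(threshold=0.8, num_perm=128)
kept = []
for doc_id, text in docs.items():
    sig = minhash(text)
    if lsh.query(sig):        # near-duplicate of an already-kept document
        continue
    lsh.insert(doc_id, sig)
    kept.append(doc_id)

# "b" should usually be dropped as a near-duplicate of "a" (MinHash is
# probabilistic, so the match is high-likelihood rather than guaranteed).
print("kept:", kept)
```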


'Frasier' star Kelsey Grammer voices growing alarm over AI manipulation

FOX News

While artificial intelligence (AI) is playing a bigger role than ever in Hollywood, award-winning actor Kelsey Grammer is warning it may be "dangerous." The "Karen: A Brother Remembers" author opened up about his growing concern over AI deepfakes and the blurred lines between reality and manipulation. "What I'm a little sad about is our prevalence these days to come up with so many, as they try to say deepfakes," he told Fox News Digital. "You know, the ones who say it usually are the ones who are actually doing it." AI-generated images, known as "deepfakes," often involve editing videos or photos of people to make them look like someone else by using artificial intelligence. While the "Frasier" star has acknowledged AI to be beneficial in some capacity, including in the medical field, Grammer shared his reservations about how the technology can fabricate someone's identity in seconds. "I recognize the validity and the potential in AI, especially in medicine and a number of other things," Grammer said. But he warned, "But AI still is...


MACD: Multilingual Abusive Comment Detection at Scale for Indic Languages

Neural Information Processing Systems

Social media platforms were conceived to act as online 'town squares' where people could get together, share information and communicate with each other peacefully. However, harmful content produced by bad actors constantly plagues these platforms, slowly converting them into 'mosh pits' where bad actors take the liberty to extensively abuse various marginalised groups. Accurate and timely detection of abusive content on social media platforms is therefore very important for facilitating safe interactions between users. However, due to the small scale and sparse linguistic coverage of Indic abusive speech datasets, the development of such algorithms for Indic social media users (one-sixth of the global population) is severely impeded.


Leak reveals what Sam Altman and Jony Ive are cooking up: 100 million AI companion devices

Mashable

OpenAI and Jony Ive's vision for their AI device is a screenless companion that knows everything about you. Details leaked to the Wall Street Journal give us a clearer picture of OpenAI's acquisition of io, co-founded by Ive, the iconic iPhone designer. The ChatGPT maker reportedly plans to ship 100 million AI devices designed to fit into users' everyday lives. "The product will be capable of being fully aware of a user's surroundings and life, will be unobtrusive, able to rest in one's pocket or on one's desk," according to a recording of an OpenAI staff meeting reviewed by the Journal. The device "will be a third core device a person would put on a desk after a MacBook Pro and an iPhone," per the meeting, which occurred the same day (Wednesday) that OpenAI announced its acquisition of Ive's company.


News/Media Alliance says Google's AI takes content by force

Mashable

Is Google's new AI Mode feature theft? The News/Media Alliance, a trade association representing news media organizations in the U.S. and Canada, certainly thinks so. At Google's I/O showcase earlier this week, the tech company announced the public release of AI Mode in Google Search. AI Mode expands AI Overviews in search and signifies a pivot away from Google's traditional search. Users will see a tab at the top of their Google Search page that takes them to a chatbot interface, much like ChatGPT, instead of the typical Google Search results.