Goto

Collaborating Authors

 information


The Feeling of Control Slipping Away

The Atlantic - Technology

AI is causing a crisis of agency. Back in the web-traffic-obsessed days of 2018, at a time of dawning awareness of how easily audiences online could be manipulated and spoofed by bots, the writer Max Read argued that the internet had crossed a threshold known as "the Inversion." Not only had bots proliferated across the internet; they had come to constitute it. In outnumbering humans, bots were also loosening everyone's grasp on the very reality of online experience. "What's gone from the internet, after all, isn't'truth,' but trust: the sense that the people and things we encounter are what they represent themselves to be," Read wrote.


Fake grant email promises 4.5 Million but could steal your identity

FOX News

IMF grant scam emails pressure recipients with a $4.5 million promise while requesting personal documents like passports. Learn the red flags that expose this phishing attempt.


Low Rank for Rank: Uncertainty-Aware Task-Specific LLM Ranking under Sparse Pairwise Comparisons

arXiv.org Machine Learning

Pairwise human-preference platforms such as Chatbot Arena have become central to large language model (LLM) evaluation, yet reliable task-specific ranking remains challenging. Global leaderboards mask task heterogeneity, while ranking each fine-grained task independently is unstable under sparse, imbalanced comparisons. We propose a low-rank framework for task-specific LLM ranking from sparse pairwise comparisons, modeling the task-by-model ability matrix $ฮ˜^\star \in \mathbb{R}^{d_t \times d_m}$ as low rank so that information is shared across related tasks while task-specific differences are preserved. We first develop a max-norm ($\ell_\infty$) accurate estimator for the latent scores, combining a convex initializer with alternating-minimization refinement, and prove task-wise top-$K$ recovery guarantees under sparse sampling. Our main contribution is an uncertainty quantification framework for task-specific ranking. We construct cross-fitted one-step debiased estimators for fixed score contrasts -- such as the task-specific ability gap between two models -- yielding asymptotically valid confidence intervals that attain the semiparametric efficiency bound. We then extend the inference to the high-dimensional ranking regime, where per-task ranks and top-$K$ membership are determined by many dependent score-gap hypotheses. Using Gaussian and multiplier-bootstrap calibration, we obtain simultaneous confidence sets for per-task ranks and valid top-$K$ membership tests across many tasks and models. Experiments on synthetic data and Chatbot Arena show that low-rank sharing improves sample efficiency over independent task-wise Bradley-Terry estimation and produces tighter, better-calibrated ranking certificates, with the largest gains in the sparse regime typical of real LLM benchmarks.


Constructing efficient channels for ideal observers using the conjugate gradient method

arXiv.org Machine Learning

Purpose: Task-based assessment of image quality (IQ) is critically important for the design and optimization of medical imaging systems. Ideal observers, including the Bayesian Ideal Observer (IO) and the ideal linear observer, i.e., the Hotelling observer (HO), provide objective figures of merit (FOMs) that quantify system performance on signal detection tasks. However, the application of ideal observers to high-dimensional image data is often computationally intractable. Channel mechanisms provide an effective framework for dimensionality reduction that can facilitate the computation of ideal observers. This work presents a conjugate gradient (CG)-based method to construct efficient channels for approximating the IO and HO performance.


Prediction-Powered Inference Across Many Tasks for AI Evaluation & Social Science Research

arXiv.org Machine Learning

Many applications require statistically valid inference across many related "tasks", while using only a handful of high-quality labels per hypothesis. In AI evaluation, these tasks may correspond to model behaviors across prompts, subgroups, or hypotheses; in social science surveys, they may correspond to related questions, populations, or measurement conditions. Prediction-powered inference (PPI) uses abundant but inexpensive proxy measurements to improve inference from limited, "ground-truth" labels, but commonly used methods treat tasks independently and therefore fail to exploit shared structure across related tasks. This limitation is especially important in settings where only a small number of labels are available per task. To address this issue, we introduce a multi-task prediction-powered inference framework that uses labeled data from related tasks to improve power while preserving task-specific inference. Our methods exploit the shared structure in the proxy-ground-truth relationship through cross-task recalibration, while retaining within-task rectification and power tuning to construct accurate point estimates and confidence intervals. We prove that efficiency gains beyond power-tuned PPI are only possible when the proxy-ground-truth relationship contains nonlinear structure; affine cross-task recalibrations are asymptotically equivalent to using the original proxy. We complement our theoretical findings with experiments on synthetic and semi-synthetic datasets, as well as a case study auditing language models on election-related information during the 2024 U.S. presidential election. Using a large human-annotation study, we show that cross-task recalibration can substantially reduce confidence interval widths when labels are scarce.


On Language Generation in the Limit with Bounded Memory

arXiv.org Machine Learning

We study language generation in the limit under bounded memory. In this task, a learner observes examples from an unknown target language one at a time and must eventually output only new valid examples. Prior work assumes access to the entire history, a strong assumption since realistic algorithms retain limited past information. Classical work in learning theory shows memory constraints dramatically alter learnability; we extend this to language generation. First, we study memoryless generators. Under a mild enumeration restriction, every countable collection of infinite languages remains generable without memory. Without this restriction, we exactly characterize when memoryless generation is possible. For finite collections, we characterize the optimal minimax density achievable by memoryless generators -- the best density guaranteed against any collection of a given size. This combinatorial bound relies on Sperner's theorem and symmetric chain decompositions. We further show that a sliding window of the last $W$ examples does not improve this worst-case density, whereas allowing it to store $b$ adaptively chosen past examples improves the achievable density for every $b \geq 1$. Finally, we revisit identification in the limit, where the learner must converge to a single correct hypothesis for the target language. We focus on its incremental variant, where the learner remembers only its previous guess. Here, although exact identification fails on a collection of just three languages, a mild relaxation requiring convergence to an ``approximate'' version of the target is achievable for every finite collection. These results show bounded memory affects these tasks differently: generation remains achievable for every countable collection, while density and identification are confined to finite collections, with guarantees weakening as the collection grows.


Could the 7-Eleven breach affect you?

FOX News

This material may not be published, broadcast, rewritten, or redistributed. Quotes displayed in real-time or delayed by at least 15 minutes. Market data provided by Factset . Powered and implemented by FactSet Digital Solutions . Mutual Fund and ETF data provided by LSEG .


Mosquitoes can learn that DEET means dinner is served

Popular Science

But don't throw away the bug spray just yet. More information Adding us as a Preferred Source in Google by using this link indicates that you would like to see more of our content in Google News results. DEET has helped repel mosquitoes for 80 years. Breakthroughs, discoveries, and DIY tips sent six days a week. By signing up, you confirm you are 16+, will receive newsletters and promotional content and agree to our Terms of Use and acknowledge the data practices in our Privacy Policy .


How to run a local AI chatbot on your iPhone

Engadget

When most of us think of AI chatbots, we think of complex systems running on powerful hardware in massive data centers. Ask ChatGPT or Gemini a question, then watch it think as it pings some faraway server network to process, before it generates an answer. The reality is that's just one way to interact with the latest AI models, and you can run an open-weight chatbots on a recent iPhone. A local chatbot might not be as powerful as its cloud counterparts, but there are compelling reasons to ditch ChatGPT, Claude and Gemini, which I'll go over in this guide. I'll also explain how to install a local AI model on your phone. It might seem complicated, but I promise it's easier than you think.


Scammers Are Using Your Real Hotel Reservations to Trick You With Spear-Phishing Attacks

WIRED

Customer data from more than 350 hotels around the world may have been accessed as part of realistic reservation-hijacking scams. Travelers' information and booking details may have been stolen from hundreds of hotels around the world, according to new findings from security researchers. These swiped trip details, such as booking names and reservation information, are then being repurposed by cybercriminals to create highly targeted phishing messages used to steal credit card information. At least 350 hotels, vacation rentals, motels, and guesthouses in 50 different countries have been caught up in so-called reservation hijacking scams, according to an analysis of phishing messages and cybercriminal infrastructure by security company Norton. Researchers say the use of legitimate booking information in phishing messages may increase the chances that someone clicks on a fraudulent link and hands over other sensitive details to criminals.