Media
Dual Ascent Diffusion for Inverse Problems
Kim, Minseo, Levy, Axel, Wetzstein, Gordon
Ill-posed inverse problems are fundamental in many domains, ranging from astrophysics to medical imaging. Emerging diffusion models provide a powerful prior for solving these problems. Existing maximum-a-posteriori (MAP) or posterior sampling approaches, however, rely on different computational approximations, leading to inaccurate or suboptimal samples. To address this issue, we introduce a new approach to solving MAP problems with diffusion model priors using a dual ascent optimization framework. Our framework achieves better image quality as measured by various metrics for image restoration problems, it is more robust to high levels of measurement noise, it is faster, and it estimates solutions that represent the observations more faithfully than the state of the art.
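As a rough illustration of the dual ascent idea named in the abstract, the sketch below applies textbook dual ascent to a toy equality-constrained quadratic problem. It is not the authors' algorithm and involves no diffusion prior; the objective, the function name `dual_ascent`, the step size, and the iteration count are all assumptions chosen for illustration.

```python
def dual_ascent(c, a, b, step=0.1, iters=200):
    """Minimise 0.5 * ||x - c||^2 subject to a.x = b with plain dual ascent.

    Toy problem only: the Lagrangian is
        L(x, y) = 0.5 * ||x - c||^2 + y * (a.x - b),
    so the primal minimiser for a fixed multiplier y is x = c - y * a.
    """
    y = 0.0  # dual variable (Lagrange multiplier), hypothetical starting value
    x = list(c)
    for _ in range(iters):
        # Primal step: minimise the Lagrangian over x for the current y.
        x = [ci - y * ai for ci, ai in zip(c, a)]
        # Dual step: gradient ascent on the dual, driven by the constraint residual.
        residual = sum(ai * xi for ai, xi in zip(a, x)) - b
        y += step * residual
    return x, y


# Project the point (3, 1) onto the line x0 + x1 = 2.
x, y = dual_ascent(c=[3.0, 1.0], a=[1.0, 1.0], b=2.0)
constraint_value = x[0] + x[1]
```

For this toy problem the iteration converges as long as the step size is below 2 / ||a||^2; larger steps make the multiplier update oscillate or diverge.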
CHAOS: Chart Analysis with Outlier Samples
Moured, Omar, Chen, Yufan, Liu, Ruiping, Reiß, Simon, Torr, Philip, Zhang, Jiaming, Stiefelhagen, Rainer
Charts play a critical role in data analysis and visualization, yet real-world applications often present charts with challenging or noisy features. Such "outlier charts" pose a substantial challenge even for Multimodal Large Language Models (MLLMs), which can struggle to interpret perturbed charts. In this work, we introduce CHAOS (CHart Analysis with Outlier Samples), a robustness benchmark to systematically evaluate MLLMs against chart perturbations. CHAOS encompasses five types of textual and ten types of visual perturbations, each presented at three levels of severity (easy, mid, hard) informed by the results of a human evaluation study. The benchmark includes 13 state-of-the-art MLLMs divided into three groups (i.e., general-, document-, and chart-specific models) according to their training scope and data. Comprehensive analysis involves two downstream tasks (ChartQA and Chart-to-Text). Extensive experiments and case studies highlight critical insights into the robustness of models across chart perturbations, aiming to guide future research in the chart understanding domain. Data and code are publicly available at: http://huggingface.co/datasets/omoured/CHAOS
Harry Potter is Still Here! Probing Knowledge Leakage in Targeted Unlearned Large Language Models via Automated Adversarial Prompting
This work presents LURK (Latent UnleaRned Knowledge), a novel framework that probes for hidden retained knowledge in unlearned LLMs through adversarial suffix prompting. LURK automatically generates adversarial prompt suffixes designed to elicit residual knowledge about the Harry Potter domain, a commonly used benchmark for unlearning. Our experiments reveal that even models deemed successfully unlearned can leak idiosyncratic information under targeted adversarial conditions, highlighting critical limitations of current unlearning evaluation standards. By uncovering latent knowledge through indirect probing, LURK offers a more rigorous and diagnostic tool for assessing the robustness of unlearning algorithms. All code will be publicly available.
Predictively Combatting Toxicity in Health-related Online Discussions through Machine Learning
Paz-Ruza, Jorge, Alonso-Betanzos, Amparo, Guijarro-Berdiñas, Bertha, Eiras-Franco, Carlos
In health-related topics, user toxicity in online discussions frequently becomes a source of social conflict or promotion of dangerous, unscientific behaviour; common approaches for battling it include different forms of detection, flagging and/or removal of existing toxic comments, which is often counterproductive for platforms and users alike. In this work, we propose the alternative of combatting user toxicity predictively, anticipating where a user could interact toxically in health-related online discussions. The hierarchical and decentralised structure made Reddit a hub of heated debate during the onset of the COVID pandemic, with over 200,000 related posts per day. [...] Center accredited by Galician University System, is funded by "Conseller [...] Conversely, volunteer-based moderation is generally more susceptible to bias and under-moderation, depending on the platform's audience. [...] The design of an adapted Leave Out Last Item data partitioning method suitable for binary classification-oriented Collaborative Filtering tasks. We remove "generic comments" from the set, i.e. those [...] Label comments as "generic" if they do not contain any words from [...] Authors have temporarily removed this link to the work's repository to [...] The majority of users do not post toxic comments when discussing health on Reddit, with 9.96% of toxic comments in the aggregate, similar to previous work. Furthermore, as Figure 2 shows, a user's toxicity on a subreddit tends to be consistent (toxic or non-toxic, as indicated by the peaks in the distribution at toxicities 0 [...] Note the logarithmic scale on the y-axis. To tag the toxicity of comments we use Detoxify-original [7], a pre-trained language model. Instead of only detecting and punishing the toxicity of existing interactions like common content moderation methods, which is ineffective and counterproductive in the long term, this work's proposal is to predict the toxicity of an unobserved interaction [...]
Figure 5. Topology of the Machine Learning model proposed to predict the toxicity of health-related conversations in unobserved user-subreddit interactions on the Reddit platform. We assessed the predictive ability of our model and baselines using classical binary classification metrics: sensitivity, specificity, and geometric mean (G.Mean) of the class-wise [...] We identify different avenues of future work. U. Naseem, J. Kim, M. Khushi, and A. G. Dunn, "Identification of disease or symptom terms in reddit to improve health mention classification," in [...] "R/redditsecurity - understanding hate on reddit, and the impact of our [...] Iii, "Toxicity detection is not all you need: Measuring the gaps to [...] "Meta to replace 'biased' fact-checkers with moderation by users -- [...] J. Brownlee, Imbalanced classification with Python: better metrics, balance skewed classes, cost-sensitive learning.
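The binary classification metrics named in the excerpt above (sensitivity, specificity, and their geometric mean) can be sketched as follows. The excerpt is truncated where it defines G.Mean, so the common definition, the square root of sensitivity times specificity, is assumed here; the function name `binary_metrics` and the example labels are hypothetical and are not data from the paper.

```python
import math


def binary_metrics(y_true, y_pred):
    """Return (sensitivity, specificity, g_mean) for 0/1 labels.

    Label 1 is treated as the positive (toxic) class. G-mean is assumed to be
    sqrt(sensitivity * specificity), the usual choice for imbalanced data.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    # Guard against empty classes so the function never divides by zero.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity, math.sqrt(sensitivity * specificity)


# Hypothetical labels: 4 toxic and 6 non-toxic interactions.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
sensitivity, specificity, g_mean = binary_metrics(y_true, y_pred)
```

Unlike plain accuracy, the geometric mean drops to zero whenever either class is entirely misclassified, which is why it is often preferred when one class (here, toxic comments) is rare.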
Embedding-to-Prefix: Parameter-Efficient Personalization for Pre-Trained Large Language Models
Huber, Bernd, Fazelnia, Ghazal, Damianou, Andreas, Peleato, Sebastian, Lefarov, Max, Ravichandran, Praveen, De Nadai, Marco, Lalmas-Roellke, Mounia, Bennett, Paul N.
Large language models (LLMs) excel at generating contextually relevant content. However, tailoring these outputs to individual users for effective personalization is a significant challenge. While rich user-specific information often exists as pre-existing user representations, such as embeddings learned from preferences or behaviors, current methods to leverage these for LLM personalization typically require costly fine-tuning or token-heavy prompting. We propose Embedding-to-Prefix (E2P), a parameter-efficient method that injects pre-computed context embeddings into an LLM's hidden representation space through a learned projection to a single soft token prefix. This enables effective personalization while keeping the backbone model frozen and avoiding expensive adaptation techniques. We evaluate E2P across two public datasets and in a production setting: dialogue personalization on Persona-Chat, contextual headline generation on PENS, and large-scale personalization for music and podcast consumption. Results show that E2P preserves contextual signals and achieves strong performance with minimal computational overhead, offering a scalable, efficient solution for contextualizing generative AI systems.
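The mechanism described above, projecting a pre-computed embedding into the model's hidden space and prepending it as one soft token, might look roughly like the sketch below. The abstract does not specify the form of the learned projection, so a single linear layer is assumed; the dimensions, the random initialisation, and the function names are all hypothetical, and plain Python lists stand in for tensors.

```python
import random


def init_projection(embed_dim, hidden_dim, seed=0):
    """Create a (hidden_dim x embed_dim) weight matrix and a zero bias.

    In a real system these would be the only trained parameters,
    while the backbone LLM stays frozen.
    """
    rng = random.Random(seed)
    weights = [[rng.gauss(0.0, 0.02) for _ in range(embed_dim)]
               for _ in range(hidden_dim)]
    bias = [0.0] * hidden_dim
    return weights, bias


def embedding_to_prefix(user_embedding, weights, bias):
    """Map a user embedding to one soft token in the LLM hidden space."""
    return [sum(w * u for w, u in zip(row, user_embedding)) + b
            for row, b in zip(weights, bias)]


def prepend_prefix(prefix_token, token_embeddings):
    """Place the soft token in front of the ordinary input token embeddings."""
    return [prefix_token] + list(token_embeddings)


EMBED_DIM, HIDDEN_DIM = 4, 6  # toy sizes, far smaller than any real model
weights, bias = init_projection(EMBED_DIM, HIDDEN_DIM)
user_embedding = [0.5, -1.0, 0.25, 2.0]
token_embeddings = [[0.0] * HIDDEN_DIM for _ in range(3)]

prefix = embedding_to_prefix(user_embedding, weights, bias)
inputs = prepend_prefix(prefix, token_embeddings)
```

The sequence grows by exactly one position regardless of how much user information the embedding encodes, which is the contrast with token-heavy prompting that the abstract draws.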
Adversarial Deep Metric Learning for Cross-Modal Audio-Text Alignment in Open-Vocabulary Keyword Spotting
Jung, Youngmoon, Lee, Yong-Hyeok, Jung, Myunghun, Roh, Jaeyoung, Han, Chang Woo, Cho, Hoon-Young
For text enrollment-based open-vocabulary keyword spotting (KWS), acoustic and text embeddings are typically compared at either the phoneme or utterance level. To facilitate this, we optimize acoustic and text encoders using deep metric learning (DML), enabling direct comparison of multi-modal embeddings in a shared embedding space. However, the inherent heterogeneity between audio and text modalities presents a significant challenge. To address this, we propose Modality Adversarial Learning (MAL), which reduces the domain gap in heterogeneous modality representations. Specifically, we train a modality classifier adversarially to encourage both encoders to generate modality-invariant embeddings. Additionally, we apply DML to achieve phoneme-level alignment between audio and text, and conduct extensive comparisons across various DML objectives. Experiments on the Wall Street Journal (WSJ) and LibriPhrase datasets demonstrate the effectiveness of the proposed approach.
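One common way to set up the adversarial objective described above is sketched below: a modality classifier minimises a cross-entropy loss, while the encoders minimise the metric-learning loss minus a weighted copy of that same loss, so they are rewarded when the classifier cannot tell audio from text. The abstract does not give the exact formulation, so the loss form, the weight `lam`, and all function names are assumptions for illustration only.

```python
import math


def modality_bce(p_audio, is_audio):
    """Mean binary cross-entropy of a modality classifier.

    p_audio:  predicted probability that each embedding came from audio.
    is_audio: 1 for audio embeddings, 0 for text embeddings.
    """
    eps = 1e-12  # clamp probabilities to avoid log(0)
    total = 0.0
    for p, y in zip(p_audio, is_audio):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(p_audio)


def encoder_objective(metric_loss, modality_loss, lam=0.1):
    """Loss minimised by the encoders (assumed form).

    Subtracting the classifier loss pushes the encoders towards
    embeddings whose modality is hard to predict.
    """
    return metric_loss - lam * modality_loss


labels = [1, 1, 0, 0]
chance = modality_bce([0.5, 0.5, 0.5, 0.5], labels)     # classifier fooled
confident = modality_bce([0.9, 0.9, 0.1, 0.1], labels)  # classifier succeeds
```

With the same metric loss, `encoder_objective` is lower when the classifier is at chance than when it is confident, which is the direction the encoders are pushed during training under this assumed setup.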
InfoDeepSeek: Benchmarking Agentic Information Seeking for Retrieval-Augmented Generation
Xi, Yunjia, Lin, Jianghao, Zhu, Menghui, Xiao, Yongzhao, Ou, Zhuoying, Liu, Jiaqi, Wan, Tong, Chen, Bo, Liu, Weiwen, Wang, Yasheng, Tang, Ruiming, Zhang, Weinan, Yu, Yong
Retrieval-Augmented Generation (RAG) enhances large language models (LLMs) by grounding responses with retrieved information. As an emerging paradigm, Agentic RAG further enhances this process by introducing autonomous LLM agents into the information seeking process. However, existing benchmarks fall short in evaluating such systems, as they are confined to a static retrieval environment with a fixed, limited corpus and simple queries that fail to elicit agentic behavior. Moreover, their evaluation protocols assess information seeking effectiveness by pre-defined gold sets of documents, making them unsuitable for the open-ended and dynamic nature of real-world web environments. To bridge this gap, we present InfoDeepSeek, a new benchmark with challenging questions designed for assessing agentic information seeking in real-world, dynamic web environments. We propose a systematic methodology for constructing challenging queries satisfying the criteria of determinacy, difficulty, and diversity. Based on this, we develop the first evaluation framework tailored to dynamic agentic information seeking, including fine-grained metrics about the accuracy, utility, and compactness of information seeking outcomes. Through extensive experiments across LLMs, search engines, and question types, InfoDeepSeek reveals nuanced agent behaviors and offers actionable insights for future research.
Fox News Entertainment Newsletter: Kris Jenner compared to daughters, 'Mormon Wives' stars' torn friendships
Kris Jenner gets compared to her daughters with new look; "Mormon Wives" stars say fame tore their friendships apart. Welcome to the Fox News Entertainment Newsletter. Billy Joel cancels all shows after a rare brain disorder diagnosis. MUSIC ON HOLD - Billy Joel cancels all concerts due to brain disorder diagnosis. ROCK TRAGEDY - Music exec and rock drummer among those killed in San Diego plane crash.
'Alexa, what do you know about us?' What I discovered when I asked Amazon to tell me everything my family's smart speaker had heard
She needs to be spoken to slowly and clearly, as you'd talk to an aged relative with diminished faculties. "Alexa, how long do wasps live for?" "Alexa, how long do wasps live if you hit them with a tea towel and then a saucepan?" In September 2016, a new presence appears in our house, squatting on the kitchen counter between the kettle and the coffee machine. It is blandly futuristic, a minimal cylinder with an LED ring that glows blue to alert us to the fact that it is ready, poised to answer our questions or carry out our instructions, as long as those instructions are clearly stated and fall within a narrow band of available "skills".
We have a chance to prevent AI decimating Britain's creative industries – but it's slipping away
Beeban Kidron
But opting out is impossible to do without AI transparency. The plan is a charter for theft, since creatives would have no idea who is taking what, when and from whom. When the government stoops to a preferred outcome that undermines the moral right to your work and income, you might reasonably be angered. As Elton John said last weekend: "The government have no right to do this to my songs. They have no right to do it to anybody's songs, or anybody's prose."