AITopics

Large Vision-Language Models (LVLMs) unlock powerful multimodal reasoning but also expand the attack surface, particularly through adversarial inputs that conceal harmful goals in benign prompts. We propose SHIELD, a lightweight, model-agnostic preprocessing framework that couples fine-grained safety classification with category-specific guidance and explicit actions (Block, Reframe, Forward). Unlike binary moderators, SHIELD composes tailored safety prompts that enforce nuanced refusals or safe redirection without retraining. Across five benchmarks and five representative LVLMs, SHIELD consistently lowers jailbreak and non-following rates while preserving utility. Our method is plug-and-play, incurs negligible overhead, and is easily extendable to new attack types -- serving as a practical safety patch for both weakly and strongly aligned LVLMs.
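As a rough illustration of the classify-then-dispatch idea described in the abstract, the sketch below routes a prompt to one of the three named actions (Block, Reframe, Forward). Everything beyond those action names is an assumption made for illustration: the category labels, the keyword-based `classify` stub, the `POLICY` table, and the choice to forward nothing on Block are hypothetical and are not taken from the paper.

```python
from typing import Optional, Tuple

# Hypothetical category -> (action, guidance) table. The abstract does not
# list SHIELD's actual categories or guidance text.
POLICY = {
    "benign": ("forward", ""),
    "self_harm": ("reframe", "Respond with supportive, non-instructional information."),
    "weapons": ("block", "Refuse and briefly explain that the request is unsafe."),
}


def classify(prompt: str) -> str:
    """Toy keyword stub standing in for a fine-grained safety classifier."""
    text = prompt.lower()
    if "explosive" in text:
        return "weapons"
    if "hurt myself" in text:
        return "self_harm"
    return "benign"


def shield_preprocess(prompt: str) -> Tuple[str, Optional[str]]:
    """Return (action, prompt_to_send); prompt_to_send is None when blocked."""
    action, guidance = POLICY[classify(prompt)]
    if action == "block":
        # Assumption: a blocked prompt is never passed to the model.
        return action, None
    if action == "reframe":
        # Prepend category-specific guidance so the model can redirect safely.
        return action, f"[Safety guidance: {guidance}]\n{prompt}"
    return action, prompt
```

Because the step only rewrites or withholds the prompt, a sketch like this can sit in front of any model without retraining, which matches the plug-and-play framing of the abstract.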

arxiv preprint, large language model, machine learning, (18 more...)

2510.1319

Country: North America > Mexico (0.28)

Genre: Research Report (0.50)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Government (0.94)
(2 more...)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.97)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)

Staab, Robin, Dekoninck, Jasper, Baader, Maximilian, Vechev, Martin

Adaptive Generation of Bias-Eliciting Questions for LLMs

Large language models (LLMs) are now widely deployed in user-facing applications, reaching hundreds of millions worldwide. As they become integrated into everyday tasks, growing reliance on their outputs raises significant concerns. In particular, users may unknowingly be exposed to model-inherent biases that systematically disadvantage or stereotype certain groups. However, existing bias benchmarks continue to rely on templated prompts or restrictive multiple-choice questions that are suggestive, simplistic, and fail to capture the complexity of real-world user interactions. In this work, we address this gap by introducing a counterfactual bias evaluation framework that automatically generates realistic, open-ended questions over sensitive attributes such as sex, race, or religion. By iteratively mutating and selecting bias-inducing questions, our approach systematically explores areas where models are most susceptible to biased behavior. Beyond detecting harmful biases, we also capture distinct response dimensions that are increasingly relevant in user interactions, such as asymmetric refusals and explicit acknowledgment of bias. Leveraging our framework, we construct CAB, a human-verified benchmark spanning diverse topics, designed to enable cross-model comparisons. Using CAB, we analyze a range of LLMs across multiple bias dimensions, revealing nuanced insights into how different models manifest bias. For instance, while GPT-5 outperforms other models, it nonetheless exhibits persistent biases in specific scenarios. These findings underscore the need for continual improvements to ensure fair model behavior.

large language model, machine learning, natural language, (21 more...)

2510.12857

Country:

Europe (1.00)
Asia > Middle East > UAE (0.45)
North America > United States > Minnesota (0.27)

Genre:

Personal > Interview (0.92)
Research Report > New Finding (0.87)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Law (1.00)
Health & Medicine > Therapeutic Area (1.00)
(8 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Lacunza, Iñaki, Gilabert, Javier Garcia, Fornaciari, Francesca De Luca, Aula-Blasco, Javier, Gonzalez-Agirre, Aitor, Melero, Maite, Villegas, Marta

ACADATA: Parallel Dataset of Academic Data for Machine Translation

We present ACADATA, a high-quality parallel dataset for academic translation that consists of two subsets: ACAD-TRAIN, which contains approximately 1.5 million author-generated paragraph pairs across 96 language directions, and ACAD-BENCH, a curated evaluation set of almost 6,000 translations covering 12 directions. To validate its utility, we fine-tune two Large Language Models (LLMs) on ACAD-TRAIN and benchmark them on ACAD-BENCH against specialized machine-translation systems, general-purpose open-weight LLMs, and several large-scale proprietary models. Experimental results demonstrate that fine-tuning on ACAD-TRAIN improves academic translation quality by +6.1 and +12.4 d-BLEU points on average for 7B and 2B models respectively, while also improving long-context translation in a general domain by up to 24.9% when translating out of English. The top-performing fine-tuned model surpasses the best proprietary and open-weight models in the academic translation domain. By releasing ACAD-TRAIN, ACAD-BENCH and the fine-tuned models, we provide the community with a valuable resource to advance research in academic-domain and long-context translation.

large language model, machine learning, natural language, (19 more...)

2510.12621

Country:

North America > United States (1.00)
Asia > Middle East (0.67)
Europe > United Kingdom > England (0.28)

Genre: Research Report > New Finding (0.48)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Government (0.92)
Leisure & Entertainment > Games (0.67)
Law (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Uncolorable Examples: Preventing Unauthorized AI Colorization via Perception-Aware Chroma-Restrictive Perturbation

Nii, Yuki, Waseda, Futa, Chang, Ching-Chun, Echizen, Isao

AI-based colorization has shown remarkable capability in generating realistic color images from grayscale inputs. However, it poses risks of copyright infringement -- for example, the unauthorized colorization and resale of monochrome manga and films. Despite these concerns, no effective method currently exists to prevent such misuse. To address this, we introduce the first defensive paradigm, Uncolorable Examples, which embed imperceptible perturbations into grayscale images to invalidate unauthorized colorization. To ensure real-world applicability, we establish four criteria: effectiveness, imperceptibility, transferability, and robustness. Our method, Perception-Aware Chroma-Restrictive Perturbation (PAChroma), generates Uncolorable Examples that meet these four criteria by optimizing imperceptible perturbations with a Laplacian filter to preserve perceptual quality, and applying diverse input transformations during optimization to enhance transferability across models and robustness against common post-processing (e.g., compression). Experiments on ImageNet and Danbooru datasets demonstrate that PAChroma effectively degrades colorization quality while maintaining the visual appearance. This work marks the first step toward protecting visual content from illegitimate AI colorization, paving the way for copyright-aware defenses in generative media.

artificial intelligence, colorization, machine learning, (16 more...)

2510.08979

Country: Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)

Genre: Research Report (0.51)

Industry: Law (0.55)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.70)

arXiv.org Machine Learning, Oct-16-2025

Gaussian Certified Unlearning in High Dimensions: A Hypothesis Testing Approach

Pandey, Aaradhya, Auddy, Arnab, Zou, Haolin, Maleki, Arian, Kulkarni, Sanjeev

Machine unlearning seeks to efficiently remove the influence of selected data while preserving generalization. Significant progress has been made in low dimensions $(p \ll n)$, but high dimensions pose serious theoretical challenges, as the standard optimization assumptions of $\Omega(1)$ strong convexity and $O(1)$ smoothness of the per-example loss $f$ rarely hold simultaneously in proportional regimes $(p\sim n)$. In this work, we introduce $\varepsilon$-Gaussian certifiability, a canonical and robust notion well-suited to high-dimensional regimes, that optimally captures a broad class of noise-adding mechanisms. We then theoretically analyze the performance of a widely used unlearning algorithm based on one step of the Newton method in the high-dimensional setting described above. Our analysis shows that a single Newton step, followed by well-calibrated Gaussian noise, is sufficient to achieve both privacy and accuracy in this setting. This result stands in sharp contrast to the only prior work that analyzes machine unlearning in high dimensions \citet{zou2025certified}, which relaxes some of the standard optimization assumptions for high-dimensional applicability, but operates under the notion of $\varepsilon$-certifiability. That work concludes that a single Newton step is insufficient even for removing a single data point, and that at least two steps are required to ensure both privacy and accuracy. Our result leads us to conclude that the discrepancy in the number of steps arises because of the suboptimality of the notion of $\varepsilon$-certifiability and its incompatibility with noise-adding mechanisms, which $\varepsilon$-Gaussian certifiability is able to overcome optimally.
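The "one Newton step, then Gaussian noise" recipe mentioned in the abstract can be illustrated on a deliberately tiny example. The sketch below uses scalar ridge regression, where the objective is quadratic and a single Newton step from the full-data solution lands exactly on the retrained solution. This is a toy stand-in rather than the paper's high-dimensional setting, and the noise scale `sigma` is a placeholder: the abstract does not give the calibration, so the function names and all values here are assumptions.

```python
import random


def fit_ridge(xs, ys, lam):
    """Closed-form minimizer of 0.5*sum((x*w - y)^2) + 0.5*lam*w^2 for scalar w."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)


def newton_unlearn(w_full, xs, ys, lam, forget_idx, sigma, rng):
    """One Newton step on the retained-data objective, plus Gaussian noise."""
    keep = [(x, y) for i, (x, y) in enumerate(zip(xs, ys)) if i != forget_idx]
    # Gradient and Hessian of the objective with the forgotten point removed,
    # both evaluated at the full-data solution w_full.
    grad = sum(x * (x * w_full - y) for x, y in keep) + lam * w_full
    hess = sum(x * x for x, _ in keep) + lam
    w_step = w_full - grad / hess
    # sigma is a placeholder; a certified method would calibrate it.
    return w_step + rng.gauss(0.0, sigma)


xs = [1.0, 2.0, 3.0, 4.0]
ys = [1.1, 1.9, 3.2, 3.9]
lam = 0.5

w_full = fit_ridge(xs, ys, lam)
# With sigma=0 the noise term vanishes, isolating the Newton step.
w_unlearned = newton_unlearn(w_full, xs, ys, lam, forget_idx=2, sigma=0.0,
                             rng=random.Random(0))
# Reference: retrain from scratch without the forgotten point.
w_retrained = fit_ridge(xs[:2] + xs[3:], ys[:2] + ys[3:], lam)
```

For non-quadratic losses the Newton step is only approximate, and the added noise is what masks the remaining gap between the updated and the retrained model.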

artificial intelligence, machine learning, natural language, (17 more...)

2510.13094

Country:

North America > United States (0.93)
Europe (0.67)

Genre: Research Report > New Finding (0.66)

Industry:

Law > Statutes (1.00)
Information Technology > Security & Privacy (1.00)
Government > Regional Government > North America Government (0.46)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Daily Mail - Science & tech, Oct-15-2025, 17:12:37 GMT

Astonishing interactive map lays bare where MILLIONS of homes will be submerged by water within a few years... are YOU at risk?

Millions of buildings and even more Americans could be at risk of sinking underwater by the end of the century. Researchers from McGill University in Canada warned that rising sea levels, resulting from continued greenhouse gas emissions, threaten to wipe out coastal cities worldwide. Sea level rise measures the change in the ocean's surface height over time.

astonishing interactive map lay, sea level, sea level rise, (11 more...)

Country:

North America > Canada > Quebec > Montreal (0.24)
South America > Brazil (0.24)
North America > United States > Michigan (0.24)
(26 more...)

Genre:

Personal (1.00)
Research Report > New Finding (0.68)

Industry:

Media > Television (1.00)
Media > Music (1.00)
Media > Film (1.00)
(7 more...)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence (0.94)

Daily Mail - Science & tech, Oct-15-2025, 15:14:36 GMT

Apple surprises fans with three brand NEW products - the iPad Pro, MacBook Pro and Vision Pro

It's barely been a month since Apple released its latest generation of iPhones, but the tech giant has already released three new products. In an unexpected launch, Apple has unveiled new models of the iPad Pro, MacBook Pro, and Vision Pro, all of which are now available to pre-order. All of the new devices feature the M5 chip, Apple's latest and most powerful in-house processor.

apple, macbook, vision, (12 more...)

Country:

South America > Brazil (0.24)
Europe > Ukraine > Vinnytsia Oblast > Vinnytsia (0.24)
North America > Canada > Alberta (0.14)
(13 more...)

Genre: Personal (1.00)

Industry:

Media > Music (1.00)
Media > Film (1.00)
Leisure & Entertainment (1.00)
(4 more...)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Communications > Mobile (1.00)
Information Technology > Artificial Intelligence (1.00)

Daily Mail - Science & tech, Oct-15-2025, 13:26:59 GMT

Explosive volcano eruption in Pacific Ring of Fire forces evacuations and grounds flights

eruption, explosive volcano eruption, force evacuation and ground flight, (10 more...)

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.24)
North America > Canada > Alberta (0.14)
North America > United States > New York (0.05)
(14 more...)

Genre: Personal (0.93)

Industry:

Media > Television (1.00)
Media > Music (1.00)
Media > Film (1.00)
(6 more...)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence (0.94)

Daily Mail - Science & tech, Oct-15-2025, 12:05:37 GMT

Police issue warning over 'stupid and dangerous' TikTok trend that sees teens using AI to pretend a homeless person has broken into their house

From the bizarre 'barefoot everywhere challenge' to the rise of so-called 'Sephora Kids', TikTok has given rise to many baffling trends. But the latest trend sweeping the social media platform has been dubbed 'stupid and dangerous' by police. The trend sees teens using artificial intelligence (AI) to pretend a homeless person has broken into their home.

britney spear, homeless person, police issue warning, (13 more...)

Country:

Asia > Middle East > Israel (0.54)
North America > United States > California > Los Angeles County > Los Angeles (0.24)
Europe > United Kingdom > Northern Ireland (0.24)
(12 more...)

Genre: Personal (1.00)

Industry:

Media > Television (1.00)
Leisure & Entertainment (1.00)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
(3 more...)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence (1.00)

Al Jazeera, Oct-15-2025, 08:25:31 GMT

'Surveillance pricing': Why you might be paying more than your neighbour

You go into your local supermarket to buy a two-litre bottle of milk and pay $3. But the person before you in the queue paid $3.50. And the person after you paid $2. What if those prices were based on your personal data or circumstances, or even the battery power on your phone? This may sound like science fiction, but it's not as far-fetched as you might think. In July, US group Delta Air Lines revealed that approximately 3 percent of its domestic fare pricing is determined using artificial intelligence (AI), although it has not elaborated on how this happens. The company said it aims to increase this figure to 20 percent by the end of this year.

consumer, customer, pricing, (13 more...)

Country:

North America > United States > New York (0.05)
North America > United States > California (0.05)
South America (0.04)
(9 more...)

Genre: Press Release (0.34)

Industry:

Transportation > Passenger (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Information Technology > Services (0.94)
(5 more...)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Applied AI (0.35)