AITopics

2410.03136

Country:

North America > United States > Washington > King County > Seattle (0.04)
North America > United States > Maryland (0.04)
North America > Canada > Ontario > Toronto (0.04)
Asia > Middle East > Jordan (0.04)

Genre:

Workflow (0.93)
Research Report (0.82)

Industry:

Law (0.67)
Energy (0.67)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.47)
Health & Medicine > Therapeutic Area > Immunology (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Thaler, Marion, Köksal, Abdullatif, Leidinger, Alina, Korhonen, Anna, Schütze, Hinrich

How far can bias go? -- Tracing bias from pretraining data to alignment

arXiv.org Artificial Intelligence Nov-28-2024

As LLMs are increasingly integrated into user-facing applications, addressing biases that perpetuate societal inequalities is crucial. While much work has gone into measuring or mitigating biases in these models, fewer studies have investigated their origins. Therefore, this study examines the correlation between gender-occupation bias in pre-training data and its manifestation in LLMs, focusing on the Dolma dataset and the OLMo model. Using zero-shot prompting and token co-occurrence analyses, we explore how biases in training data influence model outputs. Our findings reveal that biases present in pre-training data are amplified in model outputs. The study also examines the effects of prompt types, hyperparameters, and instruction-tuning on bias expression, finding that instruction-tuning partially alleviates representational bias while still maintaining overall stereotypical gender associations, whereas hyperparameters and prompting variation have a lesser effect on bias expression. Our research traces bias throughout the LLM development pipeline and underscores the importance of mitigating bias at the pretraining stage.

computational linguistic, large language model, machine learning, (17 more...)

2411.1924

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > New York (0.04)
North America > Canada > Ontario > Toronto (0.04)
(15 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Media (1.00)
Leisure & Entertainment (1.00)
Health & Medicine > Therapeutic Area (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

arXiv.org Artificial Intelligence Nov-28-2024

Extracting Training Data from Unconditional Diffusion Models

Chen, Yunhao, Wang, Shujie, Zou, Difan, Ma, Xingjun

As diffusion probabilistic models (DPMs) are being employed as mainstream models for Generative Artificial Intelligence (GenAI), the study of their memorization has attracted growing attention. Existing works in this field aim to establish an understanding of whether or to what extent DPMs learn via memorization. Such an understanding is crucial for identifying potential risks of data leakage and copyright infringement in diffusion models and, more importantly, for trustworthy application of GenAI. Existing works revealed that conditional DPMs are more prone to memorize training data than unconditional DPMs, and most data extraction methods developed so far target conditional DPMs. Although unconditional DPMs are less prone to data extraction, further investigation into these attacks remains essential since they serve as the foundation for conditional models like Stable Diffusion, and exploring these attacks will enhance our understanding of memorization in DPMs. In this work, we propose a novel data extraction method named Surrogate condItional Data Extraction (SIDE) that leverages a time-dependent classifier trained on generated data as surrogate conditions to extract training data from unconditional DPMs. Empirical results demonstrate that it can extract training data in challenging scenarios where previous methods fail, and it is, on average, over 50% more effective across different scales of the CelebA dataset. Furthermore, we provide a theoretical understanding of memorization in both conditional and unconditional DPMs and why SIDE is effective.

classifier, memorization, training data, (14 more...)

2410.02467

Country: Asia > China > Hong Kong (0.04)

Genre: Research Report > New Finding (0.87)

Industry:

Law > Intellectual Property & Technology Law (0.68)
Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.66)

BBC News Nov-27-2024, 15:18:22 GMT

US regulator says AI scanner 'deceived' users after BBC story

"The FTC has been clear that claims about technology – including artificial intelligence – need to be backed up", said Samuel Levine, Director of the Bureau of Consumer Protection. Evolv Technology's mission is to replace metal detectors with AI weapons scanners. It claims to do this with artificial intelligence, which can actively detect concealed weapons like bombs, knives and guns. The FTC's complaint alleges the company deceptively advertised its scanners would detect "all weapons". In 2022 the BBC outlined some of the impressive claims Evolv's then CEO had made about the technology.

artificial intelligence, evolv, us regulator, (6 more...)

Country: North America > United States > New York (0.07)

Industry:

Law (0.86)
Government > Regional Government (0.66)

Technology: Information Technology > Artificial Intelligence (1.00)

The New Yorker Nov-27-2024, 11:00:00 GMT

What Google Off-loading Chrome Would Mean for Users

Using "the Internet" sometimes seems disconcertingly synonymous with using Google. Google Search, the most popular search engine on the planet, indexes the open Internet, driving traffic to Web sites, and Google Ads provides the revenue that publishers survive on. Gmail is how some two billion people receive their e-mail; many Gmail in-boxes have been accumulating messages for a decade or more. Last, but certainly not least, the company's browser, Google Chrome, is what a staggering three billion people use to navigate the Internet. According to some estimates, Google holds nearly ninety per cent market share in search engines in the U.S. Chrome, in turn, provides the audience data that Google's ads leverage to target users, and links the company's other services together.

artificial intelligence, information management, machine learning, (18 more...)

Country:

North America > United States > California (0.15)
Europe (0.15)

Industry:

Information Technology (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Law > Statutes (0.71)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.31)

Safe + Safe = Unsafe? Exploring How Safe Images Can Be Exploited to Jailbreak Large Vision-Language Models

Cui, Chenhang, Deng, Gelei, Zhang, An, Zheng, Jingnan, Li, Yicong, Gao, Lianli, Zhang, Tianwei, Chua, Tat-Seng

Recent advances in Large Vision-Language Models (LVLMs) have showcased strong reasoning abilities across multiple modalities, achieving significant breakthroughs in various real-world applications. Despite this great success, the safety guardrail of LVLMs may not cover the unforeseen domains introduced by the visual modality. Existing studies primarily focus on eliciting LVLMs to generate harmful responses via carefully crafted image-based jailbreaks designed to bypass alignment defenses. In this study, we reveal that a safe image can be exploited to achieve the same jailbreak consequence when combined with additional safe images and prompts. This stems from two fundamental properties of LVLMs: universal reasoning capabilities and safety snowball effect. Building on these insights, we propose Safety Snowball Agent (SSA), a novel agent-based framework leveraging agents' autonomous and tool-using abilities to jailbreak LVLMs. SSA operates through two principal stages: (1) initial response generation, where tools generate or retrieve jailbreak images based on potential harmful intents, and (2) harmful snowballing, where refined subsequent prompts induce progressively harmful outputs. Our experiments demonstrate that SSA can use nearly any image to induce LVLMs to produce unsafe content, achieving high jailbreak success rates against the latest LVLMs. Unlike prior works that exploit alignment flaws, SSA leverages the inherent properties of LVLMs, presenting a profound challenge for enforcing safety in generative multimodal systems. Our code is available at https://github.com/gzcch/Safety_Snowball_Agent.

large language model, lvlm, machine learning, (20 more...)

2411.11496

Country:

North America > United States (1.00)
Asia > Russia (1.00)
Europe > Switzerland > Zürich > Zürich (0.14)
(7 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Media > Music (1.00)
Materials > Chemicals (1.00)
Leisure & Entertainment (1.00)
(11 more...)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

An indicator for effectiveness of text-to-image guardrails utilizing the Single-Turn Crescendo Attack (STCA)

Kwartler, Ted, Bagan, Nataliia, Banny, Ivan, Aqrawi, Alan, Abbasi, Arian

The Single-Turn Crescendo Attack (STCA), first introduced in Aqrawi and Abbasi [2024], is an innovative method designed to bypass the ethical safeguards of text-to-text AI models, compelling them to generate harmful content. This technique leverages a strategic escalation of context within a single prompt, combined with trust-building mechanisms, to subtly deceive the model into producing unintended outputs. Extending the application of STCA to text-to-image models, we demonstrate its efficacy by compromising the guardrails of a widely-used model, DALL-E 3, achieving outputs comparable to outputs from the uncensored model Flux Schnell, which served as a baseline control. This study provides a framework for researchers to rigorously evaluate the robustness of guardrails in text-to-image models and benchmark their resilience against adversarial attacks.

guardrail, machine learning, natural language, (19 more...)

2411.18699

Country: Europe > Netherlands > South Holland > Leiden (0.04)

Genre:

Research Report > Promising Solution (0.34)
Research Report > Experimental Study (0.34)

Industry:

Health & Medicine (0.95)
Information Technology > Security & Privacy (0.88)
Law (0.69)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.53)

Joshi, Ratnesh Kumar, Priya, Priyanshu, Desai, Vishesh, Dudhate, Saurav, Senapati, Siddhant, Ekbal, Asif, Ramnani, Roshni, Maitra, Anutosh, Sengupta, Shubhashis

Strategic Prompting for Conversational Tasks: A Comparative Analysis of Large Language Models Across Diverse Conversational Tasks

Given the advancements in conversational artificial intelligence, the evaluation and assessment of Large Language Models (LLMs) play a crucial role in ensuring optimal performance across various conversational tasks. In this paper, we present a comprehensive study that thoroughly evaluates the capabilities and limitations of five prevalent LLMs: Llama, OPT, Falcon, Alpaca, and MPT. The study encompasses various conversational tasks, including reservation, empathetic response generation, mental health and legal counseling, persuasion, and negotiation. To conduct the evaluation, an extensive test setup is employed, utilizing multiple evaluation criteria that span from automatic to human evaluation. This includes using generic and task-specific metrics to gauge the LMs' performance accurately. From our evaluation, no single model emerges as universally optimal for all tasks. Instead, their performance varies significantly depending on the specific requirements of each task. While some models excel in certain tasks, they may demonstrate comparatively poorer performance in others. These findings emphasize the importance of considering task-specific requirements and characteristics when selecting the most suitable LM for conversational applications.

large language model, machine learning, natural language, (18 more...)

2411.17204

Country:

Asia > India > Bihar > Patna (0.04)
North America > United States > New York (0.04)
Asia > India > Karnataka > Bengaluru (0.04)
(3 more...)

Genre: Research Report (0.81)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Law (1.00)
Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Spangher, Alexander, Huang, Kung-Hsiang, Cho, Hyundong, May, Jonathan

NewsEdits 2.0: Learning the Intentions Behind Updating News

As events progress, news articles often update with new information: if we are not cautious, we risk propagating outdated facts. In this work, we hypothesize that linguistic features indicate factual fluidity, and that we can predict which facts in a news article will update using solely the text of a news article (i.e. not external resources like search engines). We test this hypothesis, first, by isolating fact-updates in large news revisions corpora. News articles may update for many reasons (e.g. factual, stylistic, narrative). We introduce the NewsEdits 2.0 taxonomy, an edit-intentions schema that separates fact updates from stylistic and narrative updates in news writing. We annotate over 9,200 pairs of sentence revisions and train high-scoring ensemble models to apply this schema. Then, taking a large dataset of silver-labeled pairs, we show that we can predict when facts will update in older article drafts with high precision. Finally, to demonstrate the usefulness of these findings, we construct a language model question asking (LLM-QA) abstention task. We wish the LLM to abstain from answering questions when information is likely to become outdated. Using our predictions, we show that LLM abstention reaches near-oracle levels of accuracy.

information, large language model, natural language, (17 more...)

2411.18811

Country:

North America > United States > California (0.14)
North America > United States > New York (0.04)
Asia > Middle East > Syria (0.04)
(15 more...)

Genre: Research Report (0.64)

Industry:

Media > News (1.00)
Law (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
(4 more...)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

arXiv.org Machine Learning Nov-27-2024

The Return of Pseudosciences in Artificial Intelligence: Have Machine Learning and Deep Learning Forgotten Lessons from Statistics and History?

Sublime, Jérémie

In today's world, AI programs powered by Machine Learning are ubiquitous, and have achieved seemingly exceptional performance across a broad range of tasks, from medical diagnosis and credit rating in banking, to theft detection via video analysis, and even predicting political or sexual orientation from facial images. These predominantly deep learning methods excel due to their extraordinary capacity to process vast amounts of complex data to extract complex correlations and relationships from different levels of features. In this paper, we contend that the designers and final users of these ML methods have forgotten a fundamental lesson from statistics: correlation does not imply causation. Not only do most state-of-the-art methods neglect this crucial principle, but by doing so they often produce nonsensical or flawed causal models, akin to social astrology or physiognomy. Consequently, we argue that current efforts to make AI models more ethical by merely reducing biases in the training data are insufficient. Through examples, we will demonstrate that the potential for harm posed by these methods can only be mitigated by a complete rethinking of their core models, improved quality assessment metrics and policies, and by maintaining human oversight throughout the process.

application, artificial intelligence, machine learning, (16 more...)

2411.18656

Country:

North America > United States > New York (0.04)
Asia > China > Beijing > Beijing (0.04)
North America > Canada > Quebec > Montreal (0.04)
(8 more...)

Genre: Research Report (1.00)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Law (1.00)
Health & Medicine > Therapeutic Area (0.68)

Technology:

Information Technology > Artificial Intelligence > Vision > Face Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)