AITopics | Lakewood

Fine-Grained Detoxification via Instance-Level Prefixes for Large Language Models

Yi, Xin, Wang, Linlin, Wang, Xiaoling, He, Liang

arXiv.org Artificial Intelligence, Feb-25-2024

Impressive results have been achieved in natural language processing (NLP) tasks through the training of large language models (LLMs). However, these models occasionally produce toxic content such as insults, threats, and profanity in response to certain prompts, thereby constraining their practical utility. To tackle this issue, various finetuning-based and decoding-based approaches have been utilized to mitigate toxicity. However, these methods typically necessitate additional costs such as high-quality training data or auxiliary models. In this paper, we propose fine-grained detoxification via instance-level prefixes (FGDILP) to mitigate toxic text without additional cost. Specifically, FGDILP contrasts the contextualized representation in attention space using a positive prefix-prepended prompt against multiple negative prefix-prepended prompts at the instance level. This allows for constructing fine-grained subtoxicity vectors, which enables collaborative detoxification by fusing them to correct the normal generation process when provided with a raw prompt. We validate that FGDILP enables controlled text generation with regard to toxicity at both the utterance and context levels. Our method surpasses prompt-based baselines in detoxification, although at a slight cost to generation fluency and diversity.
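The contrast-and-fuse idea in this abstract can be illustrated with a toy sketch. This is not the authors' implementation: the representations are plain lists of floats rather than attention-space states, fusion by element-wise mean is an assumption (the abstract does not say how the vectors are fused), and the names `subtoxicity_vectors`, `fuse_mean`, `correct`, and the scale `alpha` are hypothetical.

```python
from typing import List

Vector = List[float]


def subtoxicity_vectors(positive: Vector, negatives: List[Vector]) -> List[Vector]:
    """One difference vector per negative prefix: negative minus positive."""
    return [[n - p for n, p in zip(neg, positive)] for neg in negatives]


def fuse_mean(vectors: List[Vector]) -> Vector:
    """Assumed fusion rule: element-wise mean of the subtoxicity vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def correct(raw: Vector, fused: Vector, alpha: float = 1.0) -> Vector:
    """Shift the raw-prompt representation away from the fused direction."""
    return [r - alpha * f for r, f in zip(raw, fused)]


# Toy representations (hypothetical values, two dimensions).
positive_rep = [0.0, 1.0]
negative_reps = [[1.0, 1.0], [3.0, 1.0]]
raw_rep = [2.0, 0.5]

fused = fuse_mean(subtoxicity_vectors(positive_rep, negative_reps))
corrected = correct(raw_rep, fused, alpha=0.5)
```

Here each negative prefix contributes its own direction, so the correction only touches the dimensions in which the negative prompts differ from the positive one.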

language model, toxicity, vector

arXiv:2402.15202

Country:

North America > United States > New York > Bronx County > New York City (0.04)
North America > United States > New Jersey > Ocean County > Lakewood (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)

Genre: Research Report (1.00)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.93)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.72)

Numeracy from Literacy: Data Science as an Emergent Skill from Large Language Models

Noever, David, McKee, Forrest

arXiv.org Artificial Intelligence, Jan-30-2023

Publicly available transformer models from eighteen months earlier, which were 1000 times smaller, failed at basic arithmetic. The statistical analysis of four complex datasets described here combines arithmetic manipulations that cannot be memorized or encoded by simple rules. The work examines whether next-token prediction extends beyond sentence completion into actual numerical understanding. For example, the work highlights descriptive statistics on in-memory datasets that the LLM either loads from memory or generates randomly using Python libraries. The resulting exploratory data analysis showcases the model's ability to group by or pivot categorical sums, infer feature importance, derive correlations, and predict unseen test cases using linear regression. To extend the model's testable range, the research deletes and appends random rows so that recall alone cannot explain the emergent numeracy.
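For reference, the kinds of operations this abstract lists (categorical sums, correlation, linear regression, and row deletion and appending) can be sketched in plain Python. The dataset, the column layout, and the helper names `group_sum`, `pearson`, `fit_line`, and `perturb` are all hypothetical; this is a minimal illustration of the arithmetic involved, not the paper's test harness.

```python
import random
from collections import defaultdict
from math import sqrt
from typing import Dict, List, Sequence, Tuple

Row = Tuple[str, float, float]  # (category, x, y): assumed toy layout


def group_sum(rows: Sequence[Row], key_index: int, value_index: int) -> Dict[str, float]:
    """Sum one numeric column per category, like a group-by."""
    totals: Dict[str, float] = defaultdict(float)
    for row in rows:
        totals[row[key_index]] += row[value_index]
    return dict(totals)


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares for y = slope * x + intercept."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def perturb(rows: Sequence[Row], rng: random.Random) -> List[Row]:
    """Delete one random row and append a new one that follows y = 2x."""
    copy = list(rows)
    copy.pop(rng.randrange(len(copy)))
    x = rng.uniform(0.0, 10.0)
    copy.append(("new", x, 2.0 * x))
    return copy


rows: List[Row] = [
    ("north", 1.0, 2.0),
    ("south", 2.0, 4.0),
    ("north", 3.0, 6.0),
    ("south", 4.0, 8.0),
]
xs = [r[1] for r in rows]
ys = [r[2] for r in rows]
slope, intercept = fit_line(xs, ys)
changed = perturb(rows, random.Random(0))
```

Because `perturb` changes the rows on each call, the correct answers for the changed table cannot be looked up from a fixed copy of the data, which is the point the abstract makes about recall.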

large language model, machine learning, natural language

arXiv:2301.13382

Country:

North America > United States > Alabama > Madison County > Huntsville (0.14)
Asia > Bangladesh (0.04)
North America > United States > New York > New York County > Manhattan (0.04)

Genre: Research Report (0.63)

Industry:

Health & Medicine (1.00)
Government > Regional Government (0.46)
Banking & Finance > Real Estate (0.46)
Transportation > Passenger (0.31)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.75)

Semblance: A Rank-Based Kernel on Probability Spaces for Niche Detection

Agarwal, Divyansh, Zhang, Nancy

arXiv.org Machine Learning, Aug-6-2018

Kernel methods provide a principled approach for detecting nonlinear relations using well-understood linear algorithms. In exploratory data analyses, when the underlying structure of the data's probability space is unclear, the choice of kernel is often arbitrary. Here, we present a novel kernel, Semblance, on a probability feature space. The advantage of Semblance lies in its distribution-free formulation and its ability to detect niche features by placing greater emphasis on similarity between observation pairs that fall at the tail ends of a distribution, as opposed to those that fall towards the mean. We prove that Semblance is a valid Mercer kernel and illustrate its applicability through simulations and real-world examples.

artificial intelligence, machine learning, semblance

arXiv:1808.02061

Country:

North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > New York (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
