llama3-instruct
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- Asia > Thailand > Bangkok > Bangkok (0.04)
- (19 more...)
- Personal (0.93)
- Research Report > New Finding (0.45)
- Research Report > Promising Solution (0.34)
- Leisure & Entertainment (1.00)
- Law (1.00)
- Information Technology > Security & Privacy (1.00)
- (3 more...)
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- Asia > Thailand > Bangkok > Bangkok (0.04)
- (19 more...)
- Personal (0.93)
- Research Report > New Finding (0.45)
- Leisure & Entertainment (1.00)
- Law (1.00)
- Information Technology > Security & Privacy (1.00)
- (3 more...)
Colombian Waitresses y Jueces canadienses: Gender and Country Biases in Occupation Recommendations from LLMs
Rodríguez, Elisa Forcada, Perez-de-Viñaspre, Olatz, Campos, Jon Ander, Klakow, Dietrich, Gautam, Vagrant
One of the goals of fairness research in NLP is to measure and mitigate stereotypical biases that are propagated by NLP systems. However, such work tends to focus on single axes of bias (most often gender) and the English language. Addressing these limitations, we contribute the first study of multilingual intersecting country and gender biases, with a focus on occupation recommendations generated by large language models. We construct a benchmark of prompts in English, Spanish and German, where we systematically vary country and gender, using 25 countries and four pronoun sets. Then, we evaluate a suite of 5 Llama-based models on this benchmark, finding that LLMs encode significant gender and country biases. Notably, we find that even when models show parity for gender or country individually, intersectional occupational biases based on both country and gender persist. We also show that the prompting language significantly affects bias, and instruction-tuned models consistently demonstrate the lowest and most stable levels of bias. Our findings highlight the need for fairness researchers to use intersectional and multilingual lenses in their work.
- North America > United States > Washington > King County > Seattle (0.14)
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Africa > Democratic Republic of the Congo (0.14)
- (33 more...)
Syntactic Control of Language Models by Posterior Inference
Xefteri, Vicky, Vieira, Tim, Cotterell, Ryan, Amini, Afra
Controlling the syntactic structure of text generated by language models is valuable for applications requiring clarity, stylistic consistency, or interpretability, yet it remains a challenging task. In this paper, we argue that sampling algorithms based on the posterior inference can effectively enforce a target constituency structure during generation. Our approach combines sequential Monte Carlo, which estimates the posterior distribution by sampling from a proposal distribution, with a syntactic tagger that ensures that each generated token aligns with the desired syntactic structure. Our experiments with GPT2 and Llama3-8B models show that with an appropriate proposal distribution, we can improve syntactic accuracy, increasing the F1 score from $12.31$ (GPT2-large) and $35.33$ (Llama3-8B) to about $93$ in both cases without compromising the language model's fluency. These results underscore both the complexity of syntactic control and the effectiveness of sampling algorithms, offering a promising approach for applications where precise control over syntax is essential.
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Improving Structural Diversity of Blackbox LLMs via Chain-of-Specification Prompting
Young, Halley, Zeng, Yimeng, Gardner, Jacob, Bastani, Osbert
The capability to generate diverse text is a key challenge facing large language models (LLMs). Thus far, diversity has been studied via metrics such as $n$-gram diversity or diversity of BERT embeddings. However, for these kinds of diversity, the user has little control over the dimensions along which diversity is considered. For example, in the poetry domain, one might desire diversity in terms of rhyme and meter, whereas in the code domain, one might desire diversity in terms of the kinds of expressions used to solve a problem. We propose a diversity metric called structural diversity, where the user provides a mapping from generated text to features capturing the kinds of diversity that they care about. In addition, we propose a novel strategy called chain-of-specification (CoS) prompting for improving diversity by first having the LLM generate a specification encoding one instance of structural features, and then prompting the LLM to generate text that satisfies these features; notably, our strategy works with blackbox LLMs. In our experiments, we show that for structural diversity in the poetry and code domains, CoS significantly improves diversity compared to several baselines.
- North America > United States > Pennsylvania (0.04)
- Europe > Italy > Tuscany > Florence (0.04)
RWKU: Benchmarking Real-World Knowledge Unlearning for Large Language Models
Jin, Zhuoran, Cao, Pengfei, Wang, Chenhao, He, Zhitao, Yuan, Hongbang, Li, Jiachun, Chen, Yubo, Liu, Kang, Zhao, Jun
Large language models (LLMs) inevitably memorize sensitive, copyrighted, and harmful knowledge from the training corpus; therefore, it is crucial to erase this knowledge from the models. Machine unlearning is a promising solution for efficiently removing specific knowledge by post hoc modifying models. In this paper, we propose a Real-World Knowledge Unlearning benchmark (RWKU) for LLM unlearning. RWKU is designed based on the following three key factors: (1) For the task setting, we consider a more practical and challenging unlearning setting, where neither the forget corpus nor the retain corpus is accessible. (2) For the knowledge source, we choose 200 real-world famous people as the unlearning targets and show that such popular knowledge is widely present in various LLMs. (3) For the evaluation framework, we design the forget set and the retain set to evaluate the model's capabilities across various real-world applications. Regarding the forget set, we provide four four membership inference attack (MIA) methods and nine kinds of adversarial attack probes to rigorously test unlearning efficacy. Regarding the retain set, we assess locality and utility in terms of neighbor perturbation, general ability, reasoning ability, truthfulness, factuality, and fluency. We conduct extensive experiments across two unlearning scenarios, two models and six baseline methods and obtain some meaningful findings. We release our benchmark and code publicly at http://rwku-bench.github.io for future work.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- North America > United States > Indiana (0.04)
- (21 more...)
- Personal (1.00)
- Research Report > Promising Solution (0.34)
- Media > Film (1.00)
- Leisure & Entertainment (1.00)
- Law (1.00)
- (3 more...)
InstructionCP: A fast approach to transfer Large Language Models into target language
Chen, Kuang-Ming, Lee, Hung-yi
The rapid development of large language models (LLMs) in recent years has largely focused on English, resulting in models that respond exclusively in English. To adapt these models to other languages, continual pre-training (CP) is often employed, followed by supervised fine-tuning (SFT) to maintain conversational abilities. However, CP and SFT can reduce a model's ability to filter harmful content. We propose Instruction Continual Pre-training (InsCP), which integrates instruction tags into the CP process to prevent loss of conversational proficiency while acquiring new languages. Our experiments demonstrate that InsCP retains conversational and Reinforcement Learning from Human Feedback (RLHF) abilities. Empirical evaluations on language alignment, reliability, and knowledge benchmarks confirm the efficacy of InsCP. Notably, this approach requires only 0.1 billion tokens of high-quality instruction-following data, thereby reducing resource consumption.