AITopics | alpaca

Collaborating Authors

alpaca

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

RRHF: Rank Responses to Align Language Models with Human Feedback

Neural Information Processing SystemsFeb-19-2026, 03:16:23 GMT

InstructGPT implements RLHF through several stages, including Supervised Fine-Tuning (SFT), reward model training, and Proximal Policy Optimization (PPO).

large language model, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.04)
Oceania > New Zealand (0.04)
Oceania > Australia > Tasmania (0.04)
(6 more...)

Industry: Leisure & Entertainment (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Interpretability at Scale: Identifying Causal Mechanisms in Alpaca

Neural Information Processing SystemsDec-27-2025, 05:53:45 GMT

Obtaining human-interpretable explanations of large, general-purpose language models is an urgent goal for AI safety. However, it is just as important that our interpretability methods are faithful to the causal dynamics underlying model behavior and able to robustly generalize to unseen inputs. Distributed Alignment Search (DAS) is a powerful gradient descent method grounded in a theory of causal abstraction that uncovered perfect alignments between interpretable symbolic algorithms and small deep learning models fine-tuned for specific tasks. In the present paper, we scale DAS significantly by replacing the remaining brute-force search steps with learned parameters -- an approach we call Boundless DAS. This enables us to efficiently search for interpretable causal structure in large language models while they follow instructions. We apply Boundless DAS to the Alpaca model (7B parameters), which, off the shelf, solves a simple numerical reasoning problem. With Boundless DAS, we discover that Alpaca does this by implementing a causal model with two interpretable boolean variables. Furthermore, we find that the alignment of neural representations with these variables is robust to changes in inputs and instructions. These findings mark a first step toward deeply understanding the inner-workings of our largest and most widely deployed language models.

identifying causal mechanism, interpretability, name change, (6 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.60)

Add feedback

Pay Attention to the Triggers: Constructing Backdoors That Survive Distillation

De Muri, Giovanni, Vero, Mark, Staab, Robin, Vechev, Martin

arXiv.org Artificial IntelligenceOct-22-2025

LLMs are often used by downstream users as teacher models for knowledge distillation, compressing their capabilities into memory-efficient models. However, as these teacher models may stem from untrusted parties, distillation can raise unexpected security risks. In this paper, we investigate the security implications of knowledge distillation from backdoored teacher models. First, we show that prior backdoors mostly do not transfer onto student models. Our key insight is that this is because existing LLM backdooring methods choose trigger tokens that rarely occur in usual contexts. We argue that this underestimates the security risks of knowledge distillation and introduce a new backdooring technique, T-MTB, that enables the construction and study of transferable backdoors. T-MTB carefully constructs a composite backdoor trigger, made up of several specific tokens that often occur individually in anticipated distillation datasets. As such, the poisoned teacher remains stealthy, while during distillation the individual presence of these tokens provides enough signal for the backdoor to transfer onto the student. Using T-MTB, we demonstrate and extensively study the security risks of transferable backdoors across two attack scenarios, jailbreaking and content modulation, and across four model families of LLMs.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2510.18541

Country: Europe (0.46)

Genre: Research Report > New Finding (0.68)

Industry:

Information Technology > Security & Privacy (1.00)
Government (1.00)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.74)
Education > Educational Technology > Educational Software (0.36)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Prompt-Aware Scheduling for Low-Latency LLM Serving

Tao, Yiheng, Zhang, Yihe, Dearing, Matthew T., Wang, Xin, Fan, Yuping, Lan, Zhiling

arXiv.org Artificial IntelligenceOct-13-2025

Abstract--Efficient scheduling of large language model (LLM) inference tasks is essential for achieving low latency and high throughput, particularly with the growing use of reasoning-capable LLMs. Traditional strategies like First Come, First-Serve (FCFS) often suffer from Head-of-Line (HOL) blocking, where long-running tasks delay shorter ones queued behind them. In this paper, we introduce PARS, a prompt-aware LLM task scheduler that improves serving efficiency by approximating shortest-job-first (SJF) scheduling through pairwise ranking with margin ranking loss. PARS focuses on impactful scheduling decisions and seamlessly integrates into the state-of-the-art LLM serving system vLLM. It effectively predicts response-length-based task ordering, reducing latency with minimal overhead. Extensive experiments across multiple LLMs and real-world inference datasets show that PARS significantly improves performance, including for reasoning workloads. Furthermore, our cross-model evaluations demonstrate that the design generalizes well, enabling effective scheduling even when predictors are trained on different LLMs. Large language models (LLMs) have emerged as core engines for artificial intelligence applications, demonstrating remarkable capabilities in a wide range of tasks, including question answering, code generation, and text classification.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2510.03243

Country: North America > United States > Illinois (0.15)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

b9cfe8b6042cf759dc4c0cccb27a6737-AuthorFeedback.pdf

Neural Information Processing SystemsAug-16-2025, 01:52:37 GMT

acceptance, alpaca, reviewer, (16 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Evolution without Large Models: Training Language Model with Task Principles

Zhu, Minghang, Gao, Shen, Shi, Zhengliang, Fang, Jiabao, Ren, Pengjie, Ren, Zhaochun, Chen, Zhumin, Shang, Shuo

arXiv.org Artificial IntelligenceJul-9-2025

A common training approach for language models involves using a large-scale language model to expand a human-provided dataset, which is subsequently used for model training.This method significantly reduces training costs by eliminating the need for extensive human data annotation. However, it still faces challenges such as high carbon emissions during data augmentation and the risk of data leakage when we use closed-source LLMs. To address these issues, we propose a self-evolution method for language models. First, we introduce the Multi-level Principle Generation, which enables a large-scale model to summarize task-completion principles based on a small amount of task data. Then, we propose the Principle-based Instance Generation, in which a smaller-scale language model uses these task principles to generate a large amount of data. This data is then used for model training. Experimental results show that our proposed method significantly improves model performance compared to directly using a smaller-scale language model to generate data. Additionally, since we only use the large-scale language model to generate the task-completion principles, the carbon emissions associated with training the model are greatly reduced.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2507.05991

Country: Asia > China (0.46)

Genre: Research Report > New Finding (0.66)

Industry:

Energy (1.00)
Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.97)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.96)

Add feedback

ProDS: Preference-oriented Data Selection for Instruction Tuning

Guo, Wenya, Zhang, Zhengkun, Liu, Xumeng, Zhang, Ying, Lu, Ziyu, Zhu, Haoze, Liu, Xubo, Yan, Ruxue

arXiv.org Artificial IntelligenceMay-20-2025

Instruction data selection aims to identify a high-quality subset from the training set that matches or exceeds the performance of the full dataset on target tasks. Existing methods focus on the instruction-to-response mapping, but neglect the human preference for diverse responses. In this paper, we propose Preference-oriented Data Selection method (ProDS) that scores training samples based on their alignment with preferences observed in the target set. Our key innovation lies in shifting the data selection criteria from merely estimating features for accurate response generation to explicitly aligning training samples with human preferences in target tasks. Specifically, direct preference optimization (DPO) is employed to estimate human preferences across diverse responses. Besides, a bidirectional preference synthesis strategy is designed to score training samples according to both positive preferences and negative preferences. Extensive experimental results demonstrate our superiority to existing task-agnostic and targeted methods.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2505.12754

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Austria > Vienna (0.14)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
(7 more...)

Genre: Research Report > New Finding (0.34)

Industry:

Government (1.00)
Leisure & Entertainment > Sports (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)

Add feedback

From Informal to Formal -- Incorporating and Evaluating LLMs on Natural Language Requirements to Verifiable Formal Proofs

Cao, Jialun, Lu, Yaojie, Li, Meiziniu, Ma, Haoyang, Li, Haokun, He, Mengda, Wen, Cheng, Sun, Le, Zhang, Hongyu, Qin, Shengchao, Cheung, Shing-Chi, Tian, Cong

arXiv.org Artificial IntelligenceFeb-17-2025

The research in AI-based formal mathematical reasoning has shown an unstop- pable growth trend. These studies have excelled in mathematical competitions like IMO and have made significant progress. This paper focuses on formal verification, an immediate application scenario of formal reasoning, and breaks it down into sub-tasks. We constructed 18k high-quality instruction-response pairs across five formal specification languages (Coq, Lean4, Dafny, ACSL, and TLA+) by distilling gpt-4o and evaluated against ten open-sourced LLMs, including recent popular DeepSeek-R1. We also fine-tuned several 7~8B small models to achieve comparable performance with Deepseek-R1-671B. Interestingly, we observed that fine-tuning with formal data also enhances mathematics, reasoning, and coding capabilities. Fine-tuned models are released at https: //huggingface.co/fm-universe.

large language model, machine learning, specification, (18 more...)

arXiv.org Artificial Intelligence

2501.16207

Country: