
Collaborating Authors

 Xu, Zhaozhuo


ALinFiK: Learning to Approximate Linearized Future Influence Kernel for Scalable Third-Party LLM Data Valuation

arXiv.org Artificial Intelligence

Large Language Models (LLMs) heavily rely on high-quality training data, making data valuation crucial for optimizing model performance, especially when working within a limited budget. In this work, we aim to offer a third-party data valuation approach that benefits both data providers and model developers. We introduce a linearized future influence kernel (LinFiK), which assesses the value of individual data samples in improving LLM performance during training. We further propose ALinFiK, a learning strategy to approximate LinFiK, enabling scalable data valuation. Our comprehensive evaluations show that this approach surpasses existing baselines in both effectiveness and efficiency, with significant scalability advantages as LLM parameter counts increase.
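The abstract does not spell out the kernel's formulation. As a rough, hypothetical illustration of the general idea behind linearized influence scoring, a training sample can be scored by the inner product between its loss gradient and the gradient of a target (e.g., validation) objective. The sketch below assumes a PyTorch model and hypothetical loss_fn, train_sample, and val_batch inputs; it is not the paper's exact LinFiK/ALinFiK method.

# Illustrative first-order influence score; not the paper's exact LinFiK/ALinFiK formulation.
import torch

def flat_grad(loss, model):
    # Flatten gradients of `loss` w.r.t. all trainable parameters into one vector.
    params = [p for p in model.parameters() if p.requires_grad]
    grads = torch.autograd.grad(loss, params)
    return torch.cat([g.reshape(-1) for g in grads])

def linearized_influence(model, loss_fn, train_sample, val_batch):
    # Score a training sample by how well its gradient aligns with the
    # validation-loss gradient; a positive score suggests the sample is helpful.
    # `loss_fn(model, batch)` is an assumed interface returning a scalar loss.
    g_train = flat_grad(loss_fn(model, train_sample), model)
    g_val = flat_grad(loss_fn(model, val_batch), model)
    return torch.dot(g_train, g_val).item()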


Fox-1 Technical Report

arXiv.org Artificial Intelligence

We present Fox-1, a series of small language models (SLMs) consisting of Fox-1-1.6B and Fox-1-1.6B-Instruct-v0.1. These models are pre-trained on 3 trillion tokens of web-scraped document data and fine-tuned with 5 billion tokens of instruction-following and multi-turn conversation data. To improve pre-training efficiency, the Fox-1-1.6B model introduces a novel 3-stage data curriculum over all the training data, with sequence lengths ranging from 2K to 8K. In architecture design, Fox-1 features a deeper layer structure, an expanded vocabulary, and Grouped Query Attention (GQA), offering a performant and efficient architecture compared to other SLMs. Fox-1 achieves better or on-par performance on various benchmarks compared to StableLM-2-1.6B, Gemma-2B, Qwen1.5-1.8B, and OpenELM-1.1B, with competitive inference speed and throughput. The model weights have been released under the Apache 2.0 license, with the aim of promoting the democratization of LLMs and making them fully accessible to the whole open-source community.
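As context for the GQA design choice: Grouped Query Attention shares each key/value head across a group of query heads, shrinking the key/value projections and the KV cache. The sketch below is a minimal PyTorch illustration with assumed dimensions and head counts; it is not Fox-1's actual configuration or implementation.

# Minimal sketch of Grouped Query Attention (GQA); all sizes below are
# illustrative assumptions, not Fox-1's actual configuration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GroupedQueryAttention(nn.Module):
    def __init__(self, d_model=2048, n_q_heads=32, n_kv_heads=8):
        super().__init__()
        assert n_q_heads % n_kv_heads == 0
        self.h_q, self.h_kv = n_q_heads, n_kv_heads
        self.d_head = d_model // n_q_heads
        self.q_proj = nn.Linear(d_model, n_q_heads * self.d_head, bias=False)
        self.k_proj = nn.Linear(d_model, n_kv_heads * self.d_head, bias=False)  # fewer KV heads
        self.v_proj = nn.Linear(d_model, n_kv_heads * self.d_head, bias=False)
        self.o_proj = nn.Linear(n_q_heads * self.d_head, d_model, bias=False)

    def forward(self, x):
        b, t, _ = x.shape
        q = self.q_proj(x).view(b, t, self.h_q, self.d_head).transpose(1, 2)
        k = self.k_proj(x).view(b, t, self.h_kv, self.d_head).transpose(1, 2)
        v = self.v_proj(x).view(b, t, self.h_kv, self.d_head).transpose(1, 2)
        # Each KV head serves h_q // h_kv query heads.
        k = k.repeat_interleave(self.h_q // self.h_kv, dim=1)
        v = v.repeat_interleave(self.h_q // self.h_kv, dim=1)
        out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        return self.o_proj(out.transpose(1, 2).reshape(b, t, -1))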


Alopex: A Computational Framework for Enabling On-Device Function Calls with LLMs

arXiv.org Artificial Intelligence

The rapid advancement of Large Language Models (LLMs) has led to their increased integration into mobile devices for personalized assistance, where calling external API functions can further enhance their capabilities. However, challenges such as data scarcity, ineffective question formatting, and catastrophic forgetting hinder the development of on-device LLM agents. To tackle these issues, we propose Alopex, a framework that enables precise on-device function calls using the Fox LLM. Alopex introduces a logic-based method for generating high-quality training data and a novel "description-question-output" format for fine-tuning, reducing the risk of function information leakage. Additionally, a data mixing strategy is used to mitigate catastrophic forgetting, combining function call data with textbook datasets to enhance performance across various tasks. Experimental results show that Alopex improves function call accuracy and significantly reduces catastrophic forgetting, providing a robust solution for integrating function call capabilities into LLMs without manual intervention.
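The abstract names a "description-question-output" fine-tuning format but does not specify its layout. A plausible rendering of one training sample is sketched below; the field names, the example API, and the prompt formatting are all assumptions, not Alopex's actual data schema.

# Hypothetical rendering of one "description-question-output" training sample.
# The exact layout used by Alopex is not given in the abstract; this is an assumption.
sample = {
    "description": (
        "Function: set_alarm(time: str, label: str) -> bool\n"
        "Sets an alarm on the device at the given 24-hour time with an optional label."
    ),
    "question": "Wake me up at 6:30 tomorrow for my flight.",
    "output": 'set_alarm(time="06:30", label="flight")',
}

def to_prompt(s):
    # Concatenate the fields into a single fine-tuning string (formatting is assumed).
    return (
        f"### Description\n{s['description']}\n\n"
        f"### Question\n{s['question']}\n\n"
        f"### Output\n{s['output']}"
    )

print(to_prompt(sample))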


Do LLMs Know to Respect Copyright Notice?

arXiv.org Artificial Intelligence

Prior studies show that LLMs sometimes generate content that violates copyright. In this paper, we study another important yet underexplored problem: will LLMs respect copyright information in user input and behave accordingly? The research problem is critical, as a negative answer would imply that LLMs will become the primary facilitator and accelerator of copyright infringement behavior. We conducted a series of experiments using a diverse set of language models, user prompts, and copyrighted materials, including books, news articles, API documentation, and movie scripts. Our study offers a conservative evaluation of the extent to which language models may infringe upon copyrights when processing user input containing protected material. This research emphasizes the need for further investigation and the importance of ensuring that LLMs respect copyright regulations when handling user input, in order to prevent unauthorized use or reproduction of protected content. We also release a benchmark dataset serving as a test bed for evaluating infringement behaviors by LLMs, and stress the need for future alignment.


Weighted Diversified Sampling for Efficient Data-Driven Single-Cell Gene-Gene Interaction Discovery

arXiv.org Artificial Intelligence

Gene-gene interactions play a crucial role in the manifestation of complex human diseases. Uncovering significant gene-gene interactions is a challenging task. Here, we present an innovative approach utilizing data-driven computational tools, leveraging an advanced Transformer model, to unearth noteworthy gene-gene interactions. Despite the efficacy of Transformer models, their parameter intensity presents a bottleneck in data ingestion, hindering data efficiency. To mitigate this, we introduce a novel weighted diversified sampling algorithm. This algorithm computes the diversity score of each data sample in just two passes of the dataset, facilitating efficient subset generation for interaction discovery. Our extensive experimentation demonstrates that by sampling a mere 1% of the single-cell dataset, we achieve performance comparable to that of utilizing the entire dataset.
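The abstract notes that diversity scores are computed in just two passes over the dataset. One plausible two-pass scheme, sketched below, estimates a centroid in the first pass, scores each sample by its distance from that centroid in the second, and then samples roughly 1% of cells in proportion to those scores. This is an assumption-based illustration; the paper's actual weighted diversified sampling algorithm may differ, and the embedding matrix X is a hypothetical stand-in.

# A plausible two-pass diversity-weighted sampler (illustrative only).
import numpy as np

def weighted_diversified_sample(X, frac=0.01, seed=0):
    # X: (n_samples, n_features) array of cell embeddings (hypothetical input).
    rng = np.random.default_rng(seed)
    centroid = X.mean(axis=0)                      # pass 1: dataset centroid
    scores = np.linalg.norm(X - centroid, axis=1)  # pass 2: per-sample diversity score
    probs = scores / scores.sum()
    k = max(1, int(frac * len(X)))
    return rng.choice(len(X), size=k, replace=False, p=probs)

# Example: keep ~1% of a 10,000-cell toy dataset.
X = np.random.default_rng(1).normal(size=(10_000, 64))
idx = weighted_diversified_sample(X, frac=0.01)
print(idx.shape)  # (100,)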


SpaLLM: Unified Compressive Adaptation of Large Language Models with Sketching

arXiv.org Artificial Intelligence

Compressive adaptation approaches, such as QLoRA, are widely popular alternatives for reducing memory requirements during fine-tuning of large language models (LLMs) while producing models capable of handling various downstream tasks. The key idea is to employ a "two-tower" architecture: compressing pretrained LLM parameters into compact representations and fine-tuning an additive full-precision adapter, which typically has few tunable parameters in low-rank format. However, strict algebraic assumptions, such as the low-rank assumption, and the complexity of composing two-tower architectures are known shortcomings, resulting in a poor accuracy-efficiency trade-off. In response to these known limitations, we propose SpaLLM (Sketched Parameter Adaptation of LLMs), a novel compressive adaptation approach for LLMs. This method is also the first to illustrate parameter-sharing compression methods for LLM fine-tuning, which, unlike QLoRA, are free from strict low-rank algebraic assumptions on adapters. This approach simplifies LLMs' compressive adaptation workflow, potentially improves multi-user serving efficiency, and delivers significantly better accuracy for both natural language understanding and generation tasks. Moreover, by avoiding the "two-tower" architecture, our framework requires only one compressed matrix multiplication per layer during inference, demonstrating superior inference efficiency compared to previous methods.

Recent advancements in Large Language Models (LLMs) have demonstrated exceptional performance in Natural Language Processing (NLP), enabling a broad spectrum of downstream applications. LLMs have demonstrated impressive generalization abilities across many downstream tasks in a zero-shot manner. However, compared to training-free methods such as in-context learning (Dong et al., 2022; Rubin et al., 2021) and few-shot prompting (Brown, 2020; Song et al., 2023), fine-tuning these LLMs is often the ideal method to achieve optimal performance on a specific downstream task (Ding et al., 2023). Clearly, full-precision fine-tuning of these LLMs is often impractical due to the massive requirement of high-performance computing devices such as GPUs. As a result, Parameter-Efficient Fine-Tuning (PEFT) methods, such as Low-Rank Adaptation (LoRA) (Hu et al., 2022), emerged as a less resource-intensive approach to fine-tuning while achieving reasonable performance. Clearly, there is a trade-off between accuracy and efficiency.
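The abstract does not detail the sketching scheme. As one hypothetical illustration of parameter-sharing compression with a single compressed matrix multiplication per layer, the sketch below hashes each weight position into a small shared bucket of tunable parameters (HashedNet-style); this is an assumption-based example, not SpaLLM's actual method.

# Minimal sketch of parameter-sharing compression via hashing ("sketched" weights).
# Illustrative only; SpaLLM's actual sketching scheme is not specified in the abstract.
import torch
import torch.nn as nn

class SketchedLinear(nn.Module):
    # Replace a dense weight matrix with lookups into a small shared, tunable bucket vector.
    def __init__(self, in_features, out_features, n_buckets=4096, seed=0):
        super().__init__()
        g = torch.Generator().manual_seed(seed)
        # Fixed random hash of each weight position into a bucket (not trained).
        self.register_buffer(
            "idx", torch.randint(n_buckets, (out_features, in_features), generator=g)
        )
        # Fixed random signs reduce correlation between positions sharing a bucket.
        self.register_buffer(
            "sign", torch.randint(0, 2, (out_features, in_features), generator=g) * 2.0 - 1.0
        )
        self.buckets = nn.Parameter(torch.zeros(n_buckets))  # the only tunable parameters

    def forward(self, x):
        w = self.buckets[self.idx] * self.sign  # reconstruct weight, one lookup per entry
        return x @ w.t()                        # single matrix multiplication per layer

layer = SketchedLinear(512, 512, n_buckets=2048)
print(sum(p.numel() for p in layer.parameters()))  # 2048 tunable values vs. 262144 for a dense weight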



KV Cache Compression, But What Must We Give in Return? A Comprehensive Benchmark of Long Context Capable Approaches

arXiv.org Artificial Intelligence

Long context capability is a crucial competency for large language models (LLMs), as it mitigates the human struggle to digest long-form texts. This capability enables complex task-solving scenarios such as book summarization, code assistance, and many other tasks that are traditionally manpower-intensive. However, transformer-based LLMs face significant challenges with long context input due to the growing size of the KV cache and the intrinsic complexity of attending to extended inputs. Multiple schools of efficiency-driven approaches -- such as KV cache quantization, token dropping, prompt compression, linear-time sequence models, and hybrid architectures -- have been proposed to produce efficient yet long context-capable models. Despite these advancements, no existing work has comprehensively benchmarked these methods in a reasonably aligned environment. In this work, we fill this gap by providing a taxonomy of current methods and evaluating 10+ state-of-the-art approaches across seven categories of long context tasks. Our work reveals numerous previously unknown phenomena and offers insights -- as well as a friendly workbench -- for the future development of long context-capable LLMs. The source code will be available at https://github.com/henryzhongsc/longctx_bench
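For concreteness, one benchmarked family is token dropping, which keeps only a subset of cached key/value entries. The sketch below shows a simplified "keep a few initial tokens plus a recent window" eviction rule; it is an illustrative assumption, not any specific method evaluated in the benchmark.

# Minimal sketch of KV cache token dropping (simplified illustration only).
import torch

def evict_kv(k_cache, v_cache, n_sink=4, window=1024):
    # k_cache, v_cache: (batch, heads, seq_len, head_dim) tensors (assumed layout).
    seq_len = k_cache.shape[2]
    if seq_len <= n_sink + window:
        return k_cache, v_cache
    keep = torch.cat([
        torch.arange(n_sink),                     # a few initial "sink" tokens
        torch.arange(seq_len - window, seq_len),  # most recent window
    ])
    return k_cache[:, :, keep, :], v_cache[:, :, keep, :]

k = torch.randn(1, 8, 5000, 64)
v = torch.randn(1, 8, 5000, 64)
k_small, v_small = evict_kv(k, v)
print(k_small.shape)  # torch.Size([1, 8, 1028, 64])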


TorchOpera: A Compound AI System for LLM Safety

arXiv.org Artificial Intelligence

We introduce TorchOpera, a compound AI system for enhancing the safety and quality of prompts and responses for Large Language Models. TorchOpera ensures that all user prompts are safe, contextually grounded, and effectively processed, while enhancing LLM responses to be relevant and high quality. TorchOpera utilizes a vector database for contextual grounding, rule-based wrappers for flexible modifications, and specialized mechanisms for detecting and adjusting unsafe or incorrect content. We also provide a view of the compound AI system to reduce the computational cost. Extensive experiments show that TorchOpera ensures the safety, reliability, and applicability of LLMs in real-world settings while maintaining the efficiency of LLM responses.
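The abstract describes composing contextual grounding, rule-based wrappers, and safety checks into one pipeline. The sketch below is a schematic illustration of such a composition; the function names, ordering, and toy stand-ins are assumptions, not TorchOpera's actual implementation.

# Schematic sketch of a compound safety pipeline (names and ordering are assumptions).
def check_safety(prompt: str) -> bool:
    # Placeholder safety check; a real system would use a trained detector plus rules.
    banned = ["make a bomb"]
    return not any(b in prompt.lower() for b in banned)

def ground_with_context(prompt: str, retrieve) -> str:
    # Prepend retrieved context; `retrieve` is a hypothetical vector-DB search callable.
    docs = retrieve(prompt, k=3)
    return "\n".join(docs) + "\n\n" + prompt

def apply_wrappers(response: str) -> str:
    # Rule-based post-processing of the model response.
    return response.strip()

def run_pipeline(prompt, retrieve, generate):
    if not check_safety(prompt):
        return "Request declined by safety policy."
    return apply_wrappers(generate(ground_with_context(prompt, retrieve)))

# Toy stand-ins so the sketch runs end to end.
toy_retrieve = lambda q, k=3: [f"[context {i} for: {q}]" for i in range(k)]
toy_generate = lambda grounded: f"  (model answer conditioned on {len(grounded)} chars)  "
print(run_pipeline("Summarize my meeting notes.", toy_retrieve, toy_generate))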


Zeroth-Order Fine-Tuning of LLMs with Extreme Sparsity

arXiv.org Artificial Intelligence

Zeroth-order optimization (ZO) is a memory-efficient strategy for fine-tuning Large Language Models using only forward passes. However, applying ZO fine-tuning in memory-constrained settings such as mobile phones and laptops remains challenging, since full-precision forward passes are infeasible. In this study, we address this limitation by integrating sparsity and quantization into ZO fine-tuning of LLMs. Specifically, we investigate the feasibility of fine-tuning an extremely small subset of LLM parameters using ZO. This approach allows the majority of un-tuned parameters to be quantized to accommodate the constraint of limited device memory. Our findings reveal that the pre-training process can identify a set of "sensitive parameters" that can guide ZO fine-tuning of LLMs on downstream tasks. Our results demonstrate that ZO fine-tuning of just the 0.1% of LLM parameters identified as sensitive can outperform full ZO fine-tuning while offering a wall-clock time speedup. Additionally, we show that ZO fine-tuning targeting these 0.1% sensitive parameters, combined with 4-bit quantization, enables efficient ZO fine-tuning of a Llama2-7B model on a GPU device with less than 8 GiB of memory and notably reduced latency.
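As background for the approach, zeroth-order fine-tuning estimates a gradient from two forward passes along a random direction, and restricting the perturbation to a sparse mask confines the update to the "sensitive parameters." The sketch below is a minimal, assumption-laden illustration (the mask construction, loss_fn interface, and hyperparameters are placeholders), not the paper's exact procedure.

# Minimal sketch of a two-forward-pass zeroth-order update restricted to a sparse mask.
import torch

@torch.no_grad()
def zo_sparse_step(model, loss_fn, batch, masks, lr=1e-6, eps=1e-3, seed=0):
    # masks: dict of parameter name -> boolean tensor marking "sensitive" entries (assumed given).
    params = {n: p for n, p in model.named_parameters() if n in masks}

    def perturb(scale):
        # Re-seeding reproduces the same random direction z on every call.
        torch.manual_seed(seed)
        for n, p in params.items():
            z = torch.randn_like(p) * masks[n]
            p.add_(scale * eps * z)

    perturb(+1)
    loss_plus = loss_fn(model, batch)   # forward pass at theta + eps*z
    perturb(-2)
    loss_minus = loss_fn(model, batch)  # forward pass at theta - eps*z
    perturb(+1)                         # restore theta
    grad_scale = (loss_plus - loss_minus) / (2 * eps)  # scalar projected-gradient estimate

    torch.manual_seed(seed)
    for n, p in params.items():
        z = torch.randn_like(p) * masks[n]
        p.add_(-lr * grad_scale * z)    # update only the masked (sensitive) entries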