Bridging the Safety Gap: A Guardrail Pipeline for Trustworthy LLM Inferences
Han, Shanshan, Avestimehr, Salman, He, Chaoyang
We present Wildflare GuardRail, a guardrail pipeline designed to enhance the safety and reliability of Large Language Model (LLM) inferences by systematically addressing risks across the entire processing workflow. Wildflare GuardRail integrates several core functional modules: Safety Detector, which identifies unsafe inputs and detects hallucinations in model outputs while generating root-cause explanations; Grounding, which contextualizes user queries with information retrieved from vector databases; Customizer, which adjusts outputs in real time using lightweight, rule-based wrappers; and Repairer, which corrects erroneous LLM outputs using the hallucination explanations provided by Safety Detector. Results show that the unsafe content detection model in Safety Detector achieves performance comparable to the OpenAI API, despite being trained on a small dataset constructed from several public datasets. Meanwhile, the lightweight wrappers address malicious URLs in model outputs in 1.06 s per query with 100% accuracy and without costly model calls. Moreover, the hallucination fixing model demonstrates effectiveness in reducing hallucinations with an accuracy of 80.7%.
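To make the pipeline's control flow concrete, the following minimal Python sketch composes the four modules in order. Every class, function, and the llm/retrieve/customize/repair callables are hypothetical stand-ins for illustration, not the paper's implementation.

```python
# Minimal sketch of the GuardRail control flow. All names here are
# hypothetical stand-ins, not the paper's implementation.

from dataclasses import dataclass

@dataclass
class Detection:
    unsafe: bool = False
    hallucinated: bool = False
    explanation: str = ""    # root-cause explanation consumed by Repairer

class SafetyDetector:
    def check_input(self, query: str) -> Detection:
        # Stand-in rule; the real detector is a trained classifier.
        return Detection(unsafe="ignore previous instructions" in query.lower())

    def check_output(self, answer: str) -> Detection:
        bad = "[unverified]" in answer        # stand-in hallucination signal
        return Detection(hallucinated=bad,
                         explanation="claim not grounded in context" if bad else "")

def guardrail(query, llm, retrieve, customize, repair):
    detector = SafetyDetector()
    if detector.check_input(query).unsafe:    # Safety Detector (input side)
        return "Request refused: unsafe input."
    context = retrieve(query)                 # Grounding via vector DB lookup
    answer = llm(f"Context: {context}\n\nQuestion: {query}")
    answer = customize(answer)                # Customizer: rule-based wrapper
    result = detector.check_output(answer)    # Safety Detector (output side)
    if result.hallucinated:
        answer = repair(answer, result.explanation)   # Repairer uses explanation
    return answer

print(guardrail(
    "What is federated learning?",
    llm=lambda p: "Federated learning trains models across devices.",
    retrieve=lambda q: "FL aggregates local updates instead of raw data.",
    customize=lambda a: a,
    repair=lambda a, why: a,
))
```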
Fox-1 Technical Report
Hu, Zijian, Zhang, Jipeng, Pan, Rui, Xu, Zhaozhuo, Han, Shanshan, Jin, Han, Shah, Alay Dilipbhai, Stripelis, Dimitris, Yao, Yuhang, Avestimehr, Salman, He, Chaoyang, Zhang, Tong
We present Fox-1, a series of small language models (SLMs) consisting of Fox-1-1.6B and Fox-1-1.6B-Instruct-v0.1. These models are pre-trained on 3 trillion tokens of web-scraped document data and fine-tuned on 5 billion tokens of instruction-following and multi-turn conversation data. To improve pre-training efficiency, the Fox-1-1.6B model introduces a novel 3-stage data curriculum across all training data, with sequence lengths ranging from 2K to 8K. In terms of architecture, Fox-1 features a deeper layer structure, an expanded vocabulary, and Grouped Query Attention (GQA), offering a performant and efficient design compared to other SLMs. Fox-1 achieves better or on-par performance on various benchmarks compared to StableLM-2-1.6B, Gemma-2B, Qwen1.5-1.8B, and OpenELM-1.1B, with competitive inference speed and throughput. The model weights have been released under the Apache 2.0 license, with the aim of promoting the democratization of LLMs and making them fully accessible to the whole open-source community.
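As a rough illustration of what a 3-stage length curriculum could look like in code, the sketch below grows the training sequence length from 2K to 8K tokens. The stage boundaries and token-budget shares are invented for illustration and do not reproduce Fox-1's actual schedule.

```python
# Hypothetical 3-stage length curriculum (2K -> 8K tokens). Stage budgets
# are invented for illustration, not Fox-1's actual schedule.

CURRICULUM = [
    # (stage name, max sequence length in tokens, share of total token budget)
    ("stage-1", 2048, 0.6),
    ("stage-2", 4096, 0.3),
    ("stage-3", 8192, 0.1),
]

def chunk(token_ids: list[int], max_len: int) -> list[list[int]]:
    """Split one tokenized document into fixed-length training chunks."""
    return [token_ids[i:i + max_len] for i in range(0, len(token_ids), max_len)]

def curriculum_stream(corpus: list[list[int]], total_tokens: int):
    """Yield (stage, chunk) pairs, re-chunking the corpus at each stage's length."""
    for stage, max_len, share in CURRICULUM:
        budget, seen = int(total_tokens * share), 0
        for doc in corpus:
            for piece in chunk(doc, max_len):
                if seen >= budget:
                    break
                seen += len(piece)
                yield stage, piece
            if seen >= budget:
                break

if __name__ == "__main__":
    toy_corpus = [list(range(5000)), list(range(3000))]
    for stage, piece in curriculum_stream(toy_corpus, total_tokens=8000):
        print(stage, len(piece))
```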
Alopex: A Computational Framework for Enabling On-Device Function Calls with LLMs
Ran, Yide, Xu, Zhaozhuo, Yao, Yuhang, Hu, Zijian, Han, Shanshan, Jin, Han, Shah, Alay Dilipbhai, Zhang, Jipeng, Stripelis, Dimitris, Zhang, Tong, Avestimehr, Salman, He, Chaoyang
The rapid advancement of Large Language Models (LLMs) has led to their increased integration into mobile devices for personalized assistance, enabling LLMs to call external API functions to enhance their capabilities. However, challenges such as data scarcity, ineffective question formatting, and catastrophic forgetting hinder the development of on-device LLM agents. To tackle these issues, we propose Alopex, a framework that enables precise on-device function calls using the Fox LLM. Alopex introduces a logic-based method for generating high-quality training data and a novel "description-question-output" format for fine-tuning that reduces the risk of function information leakage. Additionally, a data mixing strategy combines function call data with textbook datasets to mitigate catastrophic forgetting and enhance performance across a variety of tasks. Experimental results show that Alopex improves function call accuracy and significantly reduces catastrophic forgetting, providing a robust solution for integrating function call capabilities into LLMs without manual intervention.
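To make the "description-question-output" format concrete, here is a hypothetical sketch of how a single fine-tuning example might be serialized. The section markers and JSON schema are illustrative guesses, not Alopex's actual template.

```python
import json

# Hypothetical serialization of one "description-question-output" training
# example: the function description comes first, then the user question,
# then the expected function call. The template is illustrative only.

def build_example(func_description: dict, question: str, call: dict) -> str:
    prompt = (
        f"### Description\n{json.dumps(func_description, indent=2)}\n"
        f"### Question\n{question}\n"
        f"### Output\n"
    )
    return prompt + json.dumps(call)

example = build_example(
    func_description={
        "name": "get_weather",
        "parameters": {"city": "string", "unit": "celsius|fahrenheit"},
    },
    question="How hot is it in Los Angeles right now?",
    call={"name": "get_weather", "arguments": {"city": "Los Angeles", "unit": "celsius"}},
)
print(example)
```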
TorchOpera: A Compound AI System for LLM Safety
Han, Shanshan, Yao, Yuhang, Hu, Zijian, Stripelis, Dimitris, Xu, Zhaozhuo, He, Chaoyang
We introduce TorchOpera, a compound AI system for enhancing the safety and quality of prompts and responses for Large Language Models. TorchOpera ensures that all user prompts are safe, contextually grounded, and effectively processed, while enhancing LLM responses to be relevant and of high quality. TorchOpera utilizes a vector database for contextual grounding, rule-based wrappers for flexible modifications, and specialized mechanisms for detecting and adjusting unsafe or incorrect content. We also present a compound-AI-system view that reduces computational cost. Extensive experiments show that TorchOpera ensures the safety, reliability, and applicability of LLMs in real-world settings while maintaining the efficiency of LLM responses.
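As one example of what a rule-based wrapper might look like, the sketch below redacts blocklisted URLs from a response with a single regex pass and no model call. The blocklist and interface are invented for illustration and are not TorchOpera's actual wrappers.

```python
import re

# Hypothetical rule-based wrapper: redact URLs whose domain is blocklisted,
# using one regex pass and no model call. Blocklist and interface are
# invented for illustration.

BLOCKLIST = {"malicious.example.com", "phishing.example.net"}
URL_RE = re.compile(r"https?://([^/\s]+)\S*")

def redact_unsafe_urls(response: str) -> str:
    """Replace any blocklisted URL with a placeholder."""
    def swap(match: re.Match) -> str:
        return "[link removed]" if match.group(1).lower() in BLOCKLIST else match.group(0)
    return URL_RE.sub(swap, response)

print(redact_unsafe_urls(
    "Details at https://malicious.example.com/payload and https://docs.example.org/guide"
))
```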
LLM Multi-Agent Systems: Challenges and Open Problems
Han, Shanshan, Zhang, Qifan, Yao, Yuhang, Jin, Weizhao, Xu, Zhaozhuo, He, Chaoyang
This paper surveys existing work on LLM-based multi-agent systems and identifies challenges that remain inadequately addressed. By leveraging the diverse capabilities and roles of individual agents, these systems can tackle complex tasks through collaboration. We discuss optimizing task allocation, fostering robust reasoning through iterative debate, managing complex and layered context information, and enhancing memory management to support the intricate interactions within multi-agent systems. We also explore the potential application of multi-agent systems in blockchain systems, shedding light on their future development and deployment in real-world distributed systems.
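As a toy illustration of the iterative-debate pattern discussed above, the sketch below has two agents exchange answers over a shared transcript for a fixed number of rounds. The Agent interface is hypothetical; a real system would back respond() with LLM calls plus the context and memory management discussed here.

```python
# Toy sketch of iterative debate between two agents over a shared transcript.
# The Agent interface is hypothetical.

from typing import Callable

class Agent:
    def __init__(self, name: str, respond: Callable[[str, list[str]], str]):
        self.name = name
        self.respond = respond   # (question, transcript so far) -> answer

def debate(question: str, agents: list[Agent], rounds: int = 2) -> list[str]:
    transcript: list[str] = []
    for _ in range(rounds):
        for agent in agents:
            answer = agent.respond(question, transcript)
            transcript.append(f"{agent.name}: {answer}")
    return transcript

if __name__ == "__main__":
    a = Agent("Proposer", lambda q, t: "Initial answer." if not t else "Refined answer.")
    b = Agent("Critic", lambda q, t: f"Critique of: {t[-1]}")
    for line in debate("Is 91 prime?", [a, b]):
        print(line)
```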
Kick Bad Guys Out! Zero-Knowledge-Proof-Based Anomaly Detection in Federated Learning
Han, Shanshan, Wu, Wenxuan, Buyukates, Baturalp, Jin, Weizhao, Zhang, Qifan, Yao, Yuhang, Avestimehr, Salman, He, Chaoyang
Federated Learning (FL) systems are vulnerable to adversarial attacks, in which malicious clients submit poisoned models to prevent the global model from converging or to plant backdoors that induce the global model to misclassify certain samples. Current defense methods fall short in real-world FL systems, as they either rely on impractical prior knowledge or introduce accuracy loss even when no attack occurs. Moreover, these methods do not offer a protocol for verifying their execution, leaving participants doubtful about whether the mechanism was carried out correctly. To address these issues, we propose a novel anomaly detection strategy designed for real-world FL systems. Our approach activates the defense only when attacks occur and removes malicious models accurately, without affecting benign ones. Additionally, it incorporates zero-knowledge proofs to ensure the integrity of the defense mechanism. Experimental results demonstrate the effectiveness of our approach in enhancing the security of FL systems against adversarial attacks.
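The detection strategy can be pictured as a two-stage filter: a cheap trigger that decides whether the round looks attacked at all, followed by outlier removal only when it does, so benign rounds are aggregated untouched. The sketch below illustrates this idea with an invented statistic (update-norm z-scores); it is not the paper's actual detection rule, and the zero-knowledge proof of correct execution is omitted.

```python
import numpy as np

# Illustrative two-stage filter: a cheap trigger decides whether the round
# looks attacked, and outliers are removed only then. The z-score statistic
# and threshold are invented; the ZK proof of execution is omitted.

def norm_zscores(updates):
    norms = np.array([np.linalg.norm(u) for u in updates])
    return (norms - norms.mean()) / (norms.std() + 1e-8)

def aggregate(updates, z_thresh=2.5):
    z = norm_zscores(updates)
    if (np.abs(z) > z_thresh).any():       # trigger: round looks attacked
        updates = [u for u, zi in zip(updates, z) if abs(zi) <= z_thresh]
    return np.mean(updates, axis=0)        # plain FedAvg on what remains

rng = np.random.default_rng(0)
benign = [rng.normal(0, 1, 10) for _ in range(9)]
attacked = benign + [50.0 * rng.normal(0, 1, 10)]   # one scaled malicious update
print("filtered mean equals benign mean:",
      np.allclose(aggregate(attacked), np.mean(benign, axis=0)))
```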
A Data-Free Approach to Mitigate Catastrophic Forgetting in Federated Class Incremental Learning for Vision Tasks
Babakniya, Sara, Fabian, Zalan, He, Chaoyang, Soltanolkotabi, Mahdi, Avestimehr, Salman
Deep learning models often forget previously learned information when trained on new data. This problem is exacerbated in federated learning (FL), where data is distributed and can change independently for each user. Many solutions have been proposed to address this catastrophic forgetting in centralized settings; however, they do not apply directly to FL because of its unique complexities, such as privacy concerns and resource limitations. To overcome these challenges, this paper presents a framework for $\textbf{federated class incremental learning}$ that utilizes a generative model to synthesize samples from past distributions. This data can later be exploited alongside the training data to mitigate catastrophic forgetting. To preserve privacy, the generative model is trained on the server using data-free methods at the end of each task, without requesting data from clients. Moreover, our solution does not require users to store old data or models, giving them the freedom to join or leave training at any time. Additionally, we introduce SuperImageNet, a new regrouping of the ImageNet dataset specifically tailored for federated continual learning. Through extensive experiments on multiple datasets, we demonstrate significant improvements over existing baselines.
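A minimal sketch of the client-side training step under this scheme, assuming a server-trained conditional generator: each batch of new-task data is mixed with synthesized samples for previously seen classes, so no old data is stored. All models, shapes, and the generator interface below are toy stand-ins, not the paper's implementation.

```python
import torch
from torch import nn

# Toy client-side update with generative replay: synthetic samples for past
# classes are mixed into each new-task batch, so clients store no old data.

def client_update(model, generator, loader, old_classes, replay_per_batch=16):
    opt = torch.optim.SGD(model.parameters(), lr=0.01)
    loss_fn = nn.CrossEntropyLoss()
    for x_new, y_new in loader:
        # Synthesize replay data for past classes from the frozen generator.
        z = torch.randn(replay_per_batch, generator.latent_dim)
        y_old = torch.randint(0, old_classes, (replay_per_batch,))
        with torch.no_grad():
            x_old = generator(z, y_old)
        x, y = torch.cat([x_new, x_old]), torch.cat([y_new, y_old])
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()
    return model.state_dict()

class ToyGen(nn.Module):
    latent_dim = 8
    def __init__(self, num_classes=5, dim=16):
        super().__init__()
        self.emb = nn.Embedding(num_classes, 8)
        self.net = nn.Linear(16, dim)
    def forward(self, z, y):
        return self.net(torch.cat([z, self.emb(y)], dim=1))

model = nn.Linear(16, 10)                                     # 5 old + 5 new classes
loader = [(torch.randn(4, 16), torch.randint(5, 10, (4,)))]   # new-task batch
client_update(model, ToyGen(), loader, old_classes=5)
```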
FedML-HE: An Efficient Homomorphic-Encryption-Based Privacy-Preserving Federated Learning System
Jin, Weizhao, Yao, Yuhang, Han, Shanshan, Joe-Wong, Carlee, Ravi, Srivatsan, Avestimehr, Salman, He, Chaoyang
Federated Learning trains machine learning models on distributed devices by aggregating local model updates instead of local data. However, privacy concerns arise because the aggregated local models on the server may reveal sensitive personal information through inversion attacks. Privacy-preserving methods, such as homomorphic encryption (HE), thus become necessary for FL training. Despite HE's privacy advantages, its applications suffer from impractical overheads, especially for foundation models. In this paper, we present FedML-HE, the first practical federated learning system with efficient HE-based secure model aggregation. FedML-HE selectively encrypts sensitive parameters, significantly reducing both computation and communication overheads during training while providing customizable privacy preservation. Our optimized system demonstrates considerable overhead reduction, particularly for large foundation models (e.g., ~10x for ResNet-50 and up to ~40x for BERT), demonstrating the potential for scalable HE-based FL deployment.
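The selective-encryption idea can be sketched as follows: clients encrypt only the parameters marked sensitive by a mask and send the rest in plaintext, while the server sums the ciphertexts homomorphically. This toy uses the third-party python-paillier (phe) package as a stand-in for the HE scheme, a hard-coded mask instead of FedML-HE's parameter-selection method, and keeps both keys in one process for brevity; a real deployment would keep the private key away from the server.

```python
import numpy as np
from phe import paillier   # python-paillier: additively homomorphic encryption

# Toy selective encryption: only "sensitive" parameters are encrypted; the
# server sums those ciphertexts homomorphically and averages the rest in
# plaintext. Mask and key handling are simplified stand-ins.

pub, priv = paillier.generate_paillier_keypair(n_length=1024)

def client_encrypt(update, mask):
    """Split one update into encrypted sensitive entries + plaintext rest."""
    return [pub.encrypt(float(v)) for v in update[mask]], update[~mask]

def server_aggregate(enc_parts, plain_parts, mask, n_clients):
    # Homomorphic sum: the server never sees the sensitive values.
    enc_sum = enc_parts[0]
    for part in enc_parts[1:]:
        enc_sum = [a + b for a, b in zip(enc_sum, part)]
    agg = np.empty(mask.size)
    agg[mask] = [priv.decrypt(c) / n_clients for c in enc_sum]  # decrypt-side avg
    agg[~mask] = np.mean(plain_parts, axis=0)
    return agg

mask = np.array([True, True, False, False, False, False])  # 2 "sensitive" params
updates = [np.arange(6, dtype=float) + i for i in range(3)]
enc_parts, plain_parts = zip(*(client_encrypt(u, mask) for u in updates))
print(server_aggregate(list(enc_parts), list(plain_parts), mask, len(updates)))
```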
FedMLSecurity: A Benchmark for Attacks and Defenses in Federated Learning and Federated LLMs
Han, Shanshan, Buyukates, Baturalp, Hu, Zijian, Jin, Han, Jin, Weizhao, Sun, Lichao, Wang, Xiaoyang, Wu, Wenxuan, Xie, Chulin, Yao, Yuhang, Zhang, Kai, Zhang, Qifan, Zhang, Yuhui, Avestimehr, Salman, He, Chaoyang
This paper introduces FedMLSecurity, a benchmark designed to simulate adversarial attacks and corresponding defense mechanisms in Federated Learning (FL). As an integral module of the open-source library FedML, which facilitates FL algorithm development and performance comparison, FedMLSecurity enhances FedML's capabilities for evaluating security issues and potential remedies in FL. FedMLSecurity comprises two major components: FedMLAttacker, which simulates attacks injected during FL training, and FedMLDefender, which simulates defensive mechanisms that mitigate the impacts of those attacks. FedMLSecurity is open-sourced and can be customized for a wide range of machine learning models (e.g., Logistic Regression, ResNet, and GANs) and federated optimizers (e.g., FedAVG, FedOPT, and FedNOVA). FedMLSecurity can also be applied easily to Large Language Models (LLMs), demonstrating its adaptability across scenarios.
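For intuition about the kind of attack/defense pair such a benchmark simulates, here is a generic sketch: a sign-flipping attacker paired with a coordinate-wise median defense. This is deliberately not FedMLSecurity's API; the FedMLAttacker and FedMLDefender interfaces are not reproduced here.

```python
import numpy as np

# Generic attack/defense illustration: a sign-flipping attacker and a
# coordinate-wise median defense. Not FedMLSecurity's actual API.

def sign_flip_attack(update: np.ndarray, scale: float = 3.0) -> np.ndarray:
    """Malicious client: flip and scale its update to slow convergence."""
    return -scale * update

def median_defense(updates: list[np.ndarray]) -> np.ndarray:
    """Robust aggregation: coordinate-wise median instead of the mean."""
    return np.median(np.stack(updates), axis=0)

rng = np.random.default_rng(1)
honest = [rng.normal(1.0, 0.1, 4) for _ in range(8)]
updates = honest + [sign_flip_attack(honest[0]) for _ in range(2)]
print("mean   :", np.mean(updates, axis=0))    # dragged toward the attacker
print("median :", median_defense(updates))     # stays close to honest updates
```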
Don't Memorize; Mimic The Past: Federated Class Incremental Learning Without Episodic Memory
Babakniya, Sara, Fabian, Zalan, He, Chaoyang, Soltanolkotabi, Mahdi, Avestimehr, Salman
Deep learning models are prone to forgetting information learned in the past when trained on new data. This problem becomes even more pronounced in federated learning (FL), where data is decentralized and subject to independent changes for each user. Continual Learning (CL) studies this so-called \textit{catastrophic forgetting} phenomenon primarily in centralized settings, where the learner has direct access to the complete training dataset. However, applying CL techniques to FL is not straightforward due to privacy concerns and resource limitations. This paper presents a framework for federated class incremental learning that utilizes a generative model to synthesize samples from past distributions instead of storing part of the past data. Clients can then leverage the generative model to mitigate catastrophic forgetting locally. The generative model is trained on the server using data-free methods at the end of each task, without requesting data from clients, which reduces the risk of data leakage compared to training it on clients' private data. We demonstrate significant improvements on the CIFAR-100 dataset compared to existing baselines.
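Complementing the client-side replay sketch earlier in this list, the following is a hypothetical sketch of the server-side, data-free step: the generator is optimized so that the frozen global model assigns the intended labels to its synthesized samples, with no client data involved. The inversion-style loss and architecture are invented stand-ins, not the paper's actual method.

```python
import torch
from torch import nn

# Hypothetical server-side, data-free generator training: optimize the
# generator so the frozen global model predicts the intended label for each
# synthesized sample. No client data is used anywhere.

def train_generator(global_model, generator, num_classes, steps=50, batch=32):
    global_model.eval()                    # frozen; only supplies gradients
    opt = torch.optim.Adam(generator.parameters(), lr=1e-3)
    ce = nn.CrossEntropyLoss()
    for _ in range(steps):
        z = torch.randn(batch, generator.latent_dim)
        y = torch.randint(0, num_classes, (batch,))
        loss = ce(global_model(generator(z, y)), y)   # inversion-style loss
        opt.zero_grad()
        loss.backward()
        opt.step()
    return generator

class ToyGen(nn.Module):
    latent_dim = 8
    def __init__(self, num_classes=10, dim=16):
        super().__init__()
        self.emb = nn.Embedding(num_classes, 8)
        self.net = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, dim))
    def forward(self, z, y):
        return self.net(torch.cat([z, self.emb(y)], dim=1))

train_generator(nn.Linear(16, 10), ToyGen(), num_classes=10)
```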