AITopics | Zhang, Zijie

Collaborating Authors

Zhang, Zijie


Fisher Information-based Efficient Curriculum Federated Learning with Large Language Models

Liu, Ji, Ren, Jiaxiang, Jin, Ruoming, Zhang, Zijie, Zhou, Yang, Valduriez, Patrick, Dou, Dejing

arXiv.org Artificial Intelligence, Oct-18-2024

As a promising paradigm for collaboratively training models with decentralized data, Federated Learning (FL) can be exploited to fine-tune Large Language Models (LLMs). Because LLMs are huge and the scale of the training data increases significantly, fine-tuning incurs tremendous computation and communication costs. The training data is generally non-Independent and Identically Distributed (non-IID), which requires adaptive data processing within each device. Although Low Rank Adaptation (LoRA) can significantly reduce the number of parameters to update in the fine-tuning process, transferring the low-rank parameters of all the layers in LLMs still takes an unaffordable amount of time. In this paper, we propose a Fisher Information-based Efficient Curriculum Federated Learning framework (FibecFed) with two novel methods, i.e., adaptive federated curriculum learning and efficient sparse parameter update. First, we propose a Fisher information-based method to adaptively sample data within each device to improve the effectiveness of the FL fine-tuning process. Second, we dynamically select the proper layers for global aggregation and sparse parameters for local update with LoRA so as to improve the efficiency of the FL fine-tuning process. Extensive experimental results based on 10 datasets demonstrate that FibecFed yields excellent performance (up to 45.35% in terms of accuracy) and superb fine-tuning speed (up to 98.61% faster) compared with 17 baseline approaches.
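The abstract does not spell out how Fisher information drives the sampling, so the following is only an illustrative sketch and not FibecFed's actual method. It assumes a toy logistic-regression model, scores each local sample by the trace of its empirical Fisher information (the squared norm of the per-sample log-likelihood gradient), and orders samples from low to high score. The names `fisher_score` and `curriculum_order`, and the assumption that a lower score means an "easier" sample, are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fisher_score(w, x, y):
    """Trace of the per-sample empirical Fisher for logistic regression:
    the squared norm of the gradient of log p(y | x; w) with respect to w."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
    # For logistic regression, d/dw log p(y | x; w) = (y - p) * x.
    return sum(((y - p) * xi) ** 2 for xi in x)

def curriculum_order(w, samples):
    """Return sample indices sorted by ascending score.
    Assumption for illustration only: a lower score means an easier sample."""
    scores = [fisher_score(w, x, y) for x, y in samples]
    return sorted(range(len(samples)), key=lambda i: scores[i])

# Toy local dataset on one device: (features, label) pairs.
w = [1.0, -1.0]
samples = [
    ([2.0, 0.0], 1),  # model is confident and correct
    ([2.0, 0.0], 0),  # model is confident and wrong
    ([0.1, 0.1], 1),  # small input, small gradient
]
order = curriculum_order(w, samples)
print(order)
```

In this sketch the confidently misclassified sample receives the largest score and is scheduled last; a real system would compute such scores from the LLM's gradients on each device.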

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2410.00131

Country:

North America > United States (1.00)
Asia > China (0.92)

Genre: Research Report (0.70)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)


A Survey of Lottery Ticket Hypothesis

Liu, Bohan, Zhang, Zijie, He, Peixiong, Wang, Zhensen, Xiao, Yang, Ye, Ruimeng, Zhou, Yang, Ku, Wei-Shinn, Hui, Bo

arXiv.org Artificial Intelligence, Mar-12-2024

The Lottery Ticket Hypothesis (LTH) states that a dense neural network model contains a highly sparse subnetwork (i.e., a winning ticket) that can achieve even better performance than the original model when trained in isolation. While LTH has been supported both empirically and theoretically in many works, there are still open issues to be addressed, such as efficiency and scalability. In addition, the lack of open-source frameworks and of a consensus experimental setting poses a challenge to future research on LTH. We, for the first time, examine previous research and studies on LTH from different perspectives. We also discuss issues in existing works and list potential directions for further exploration. This survey aims to provide an in-depth look at the state of LTH and to develop a duly maintained platform for conducting experiments and comparing against the most up-to-date baselines.
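The abstract does not describe how winning tickets are found. As a rough sketch of the commonly used magnitude-pruning recipe (train, prune the smallest-magnitude weights, reset the survivors to their initial values), the snippet below works on a flat list of weights. The function names and the toy numbers are hypothetical, and real experiments would apply this per layer and over several prune-retrain rounds.

```python
def magnitude_mask(weights, sparsity):
    """Return a 0/1 mask that prunes the `sparsity` fraction of weights
    with the smallest absolute value (1 = keep, 0 = prune)."""
    n_prune = int(len(weights) * sparsity)
    ranked = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = set(ranked[:n_prune])
    return [0 if i in pruned else 1 for i in range(len(weights))]

def winning_ticket(init_weights, trained_weights, sparsity):
    """Build the mask from the trained weights, then rewind the surviving
    weights to their initial values; this sparse network is the candidate
    ticket that would be retrained in isolation."""
    mask = magnitude_mask(trained_weights, sparsity)
    ticket = [w0 * m for w0, m in zip(init_weights, mask)]
    return ticket, mask

# Toy example: four weights, prune half of them.
init = [0.5, -0.2, 0.1, 0.9]
trained = [0.05, -1.3, 0.7, -0.01]
ticket, mask = winning_ticket(init, trained, 0.5)
print(mask)    # which weights survive
print(ticket)  # surviving weights at their initial values
```

Note that the mask is chosen from the trained magnitudes, while the values that are kept come from the initialization; that pairing is what the hypothesis refers to as a ticket.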

artificial intelligence, machine learning, natural language, (14 more...)

arXiv.org Artificial Intelligence

2403.04861

Country:

Europe (1.00)
North America > United States > Maryland (0.14)
North America > United States > Hawaii (0.14)
(3 more...)

Genre:

Overview (1.00)
Contests & Prizes (1.00)
Research Report > Promising Solution (0.88)
Research Report > New Finding (0.67)

Industry: Leisure & Entertainment > Gambling (1.00)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)
