Yang, Fang
A Navigation System for ROV Inspection of Fish Net Cages
Ge, Zhikang, Yang, Fang, Lu, Wenwu, Wei, Peng, Ying, Yibin, Peng, Chen
In this paper, we retrofit an off-the-shelf ROV, the BlueROV2, with a ROS-based software framework and develop a localization module, a path planning system, and a control framework. For real-time local localization, we employ the open-source TagSLAM library. Additionally, we propose a control strategy based on a Nominal Feedback Controller (NFC) to achieve precise trajectory tracking. The proposed system has been implemented and validated through experiments in a controlled laboratory environment, demonstrating its effectiveness and its potential for real-world deployment.
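The abstract does not specify the NFC's internal design, so the following is only a rough sketch of the generic nominal-plus-feedback structure such a trajectory-tracking controller typically takes: a feedforward (nominal) input corrected by state-error feedback. The gains Kp and Kd, the 4-DOF state layout, and the function name are all hypothetical.

import numpy as np

# Hypothetical gains; the paper's actual NFC design, gains, and state
# layout are not given in the abstract.
Kp = np.diag([2.0, 2.0, 2.0, 1.0])  # assumed pose gains (x, y, z, yaw)
Kd = np.diag([0.5, 0.5, 0.5, 0.2])  # assumed velocity gains

def nfc_control(x, x_ref, u_nominal):
    # x, x_ref: [x, y, z, yaw, vx, vy, vz, yaw_rate]; u_nominal: feedforward thrust.
    e_pose = x_ref[:4] - x[:4]  # pose tracking error
    e_vel = x_ref[4:] - x[4:]   # velocity tracking error
    return u_nominal + Kp @ e_pose + Kd @ e_vel

# Example: hold station at the origin with zero nominal thrust.
x = np.zeros(8); x[0] = 0.5                      # ROV drifted 0.5 m in x
print(nfc_control(x, np.zeros(8), np.zeros(4)))  # -> corrective thrust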
Efficiently Training 7B LLM with 1 Million Sequence Length on 8 GPUs
Zhao, Pinxue, Zhang, Hailin, Fu, Fangcheng, Nie, Xiaonan, Liu, Qibin, Yang, Fang, Peng, Yuanbo, Jiao, Dian, Li, Shuaipeng, Xue, Jinbao, Tao, Yangyu, Cui, Bin
Large Language Models (LLMs) are increasingly trained with extended context lengths to enable more sophisticated applications. However, long-context training poses great challenges given the constraints of GPU memory: it not only leads to substantial activation memory consumption during training, but also incurs considerable memory fragmentation. To facilitate long-context training, existing frameworks have adopted strategies such as recomputation and various forms of parallelism. Nevertheless, these techniques rely on redundant computation or extensive communication, resulting in low Model FLOPS Utilization (MFU). In this paper, we propose MEMO, a novel LLM training framework designed for fine-grained activation memory management. Because computation scales quadratically and memory scales linearly with sequence length when using FlashAttention, we offload memory-consuming activations to CPU memory after each layer's forward pass and fetch them back during the backward pass. To maximize the swapping of activations without hindering computation, and to avoid exhausting limited CPU memory, we implement a token-wise activation recomputation and swapping mechanism. Furthermore, we tackle the memory fragmentation issue with a bi-level Mixed Integer Programming (MIP) approach that optimizes the reuse of memory across transformer layers. Empirical results demonstrate that MEMO achieves, on average, 2.42x and 2.26x the MFU of Megatron-LM and DeepSpeed, respectively. This improvement stems from MEMO's ability to minimize memory fragmentation, reduce recomputation and intensive communication, and avoid the delays of memory reorganization caused by fragmentation. By leveraging fine-grained activation memory management, MEMO enables efficient training of a 7B LLM with a 1-million-token sequence length on just 8 A800 GPUs, achieving an MFU of 52.30%.
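MEMO's token-wise swap/recompute scheduler and MIP-based memory planner are beyond an abstract-sized sketch, but the core offload-after-forward, fetch-before-backward idea can be illustrated with PyTorch's saved-tensor hooks. This is a generic sketch, not MEMO's implementation; the model and tensor shapes are placeholders.

import torch
import torch.nn as nn

def pack_to_cpu(t):
    # Offload each saved activation to pinned CPU memory as soon as the
    # forward pass saves it. Naive: parameters get offloaded too, and the
    # copy blocks; real systems overlap transfers with compute on
    # separate CUDA streams.
    buf = torch.empty(t.size(), dtype=t.dtype, pin_memory=True)
    buf.copy_(t)
    return buf

def unpack_to_gpu(buf):
    # Fetch the activation back to the GPU when backward needs it.
    return buf.to("cuda", non_blocking=True)

model = nn.Sequential(nn.Linear(1024, 4096), nn.GELU(), nn.Linear(4096, 1024)).cuda()
x = torch.randn(8, 1024, device="cuda", requires_grad=True)

with torch.autograd.graph.saved_tensors_hooks(pack_to_cpu, unpack_to_gpu):
    loss = model(x).sum()  # activations saved during forward go to CPU
loss.backward()            # activations are fetched back on demand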
Improving Sequential Recommendation Models with an Enhanced Loss Function
Li, Fangyu, Yu, Shenbao, Zeng, Feng, Yang, Fang
There has been growing interest in benchmarking sequential recommendation models and in reproducing and improving existing models. For example, Rendle et al. improved matrix factorization models by tuning their parameters and hyperparameters, and Petrov and Macdonald developed a more efficient and effective implementation of BERT4Rec, resolving inconsistencies in earlier performance comparisons between BERT4Rec and SASRec. Notably, BERT4Rec and SASRec share a similar network structure; the main difference lies in their training objective (loss function). We therefore analyze the advantages and disadvantages of the loss functions commonly used in sequential recommendation and propose an improved loss function that leverages their strengths. Extensive experiments on two influential open-source libraries demonstrate that our improved loss function significantly enhances the performance of the GRU4Rec, SASRec, SR-GNN, and S3Rec models. Furthermore, the improved SASRec benchmark outperforms BERT4Rec on the ML-1M and Beauty datasets and achieves results similar to BERT4Rec on the ML-20M and Steam datasets. We also reproduce the results of the BERT4Rec model on the Beauty dataset. Finally, we provide a comprehensive explanation of the effectiveness of our improved loss function through experiments. Our code is publicly available at https://github.com/Li-fAngyU/sequential_rec.
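The improved loss itself is left to the paper and the linked code; for context only, here is a sketch of the two commonly used losses the abstract contrasts, as typically implemented: SASRec-style sampled binary cross-entropy over one positive and one sampled negative item, and BERT4Rec-style full softmax cross-entropy over the whole catalog. Names, shapes, and sizes below are illustrative.

import torch
import torch.nn.functional as F

num_items, d = 10000, 64
item_emb = torch.nn.Embedding(num_items, d)
h = torch.randn(32, d)                     # hidden state per sequence position
pos = torch.randint(0, num_items, (32,))   # ground-truth next items
neg = torch.randint(0, num_items, (32,))   # sampled negatives

# SASRec-style sampled BCE: score one positive vs. one sampled negative.
pos_logit = (h * item_emb(pos)).sum(-1)
neg_logit = (h * item_emb(neg)).sum(-1)
bce = (F.binary_cross_entropy_with_logits(pos_logit, torch.ones_like(pos_logit))
       + F.binary_cross_entropy_with_logits(neg_logit, torch.zeros_like(neg_logit)))

# BERT4Rec-style full cross-entropy: softmax over the entire item catalog.
logits = h @ item_emb.weight.T             # (32, num_items)
ce = F.cross_entropy(logits, pos)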
Kernel Induced Random Survival Forests
Yang, Fang, Wang, Jiheng, Fan, Guangzhe
Kernel Induced Random Survival Forests (KIRSF) is a statistical learning algorithm that aims to improve prediction accuracy for survival data. As in Random Survival Forests (RSF), a cumulative hazard function is predicted for each individual in the test set. Prediction error is estimated using Harrell's concordance index (C-index) [Harrell et al. (1982)]. The C-index can be interpreted as a misclassification probability and does not depend on a single fixed evaluation time; it also specifically accounts for censoring. By utilizing kernel functions, KIRSF achieves better results than RSF in many situations. In this report, we show how to incorporate kernel functions into RSF, test the performance of KIRSF, and compare our method to RSF. We find that KIRSF outperforms RSF in many settings.
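As a reference for how the reported prediction error is computed, here is a minimal implementation of Harrell's C-index under right censoring: a pair (i, j) is comparable when the subject with the shorter observed time experienced an event, and concordant when that subject also has the higher predicted risk. Variable names are mine; this is not the authors' code.

def harrell_c_index(times, events, risks):
    # times: observed times; events: 1 if event, 0 if censored;
    # risks: predicted risk scores (higher = expected to fail earlier).
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # Pair is usable only if i had an event strictly before j's time.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1    # higher risk, earlier event: concordant
                elif risks[i] == risks[j]:
                    concordant += 0.5  # ties in risk count as half
    return concordant / comparable

# Toy example: three subjects, the last one censored.
print(harrell_c_index([2.0, 5.0, 8.0], [1, 1, 0], [0.9, 0.4, 0.1]))  # -> 1.0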