Xia, Qingrong
Beware of Calibration Data for Pruning Large Language Models
Ji, Yixin, Xiang, Yang, Li, Juntao, Xia, Qingrong, Li, Ping, Duan, Xinyu, Wang, Zhefeng, Zhang, Min
As large language models (LLMs) are widely applied across various fields, model compression has become increasingly crucial for reducing costs and improving inference efficiency. Post-training pruning is a promising method that does not require resource-intensive iterative training and only needs a small amount of calibration data to assess the importance of parameters. Previous research has primarily focused on designing advanced pruning methods, while the impact of different calibration data on pruning performance still lacks systematic exploration. We fill this gap and surprisingly observe that the choice of calibration data matters even more than the design of advanced pruning strategies, especially at high sparsity. Our preliminary exploration also discloses that using calibration data similar to the training data yields better performance. As pre-training data is usually inaccessible for advanced LLMs, we further propose a self-generating calibration data synthesis strategy to construct feasible calibration data. We conduct experiments on recent strong open-source LLMs (e.g., DCLM and LLaMA-3), and the results show that the proposed method outperforms commonly used calibration data and can effectively enhance strong pruning methods (e.g., Wanda, OWL).
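To make the setting concrete, the sketch below shows how a Wanda-style importance score combines weight magnitudes with activation norms collected from calibration data for a single linear layer. This is a minimal illustration, not the paper's pipeline: the random tensors stand in for hidden states of real calibration text, and the function name is hypothetical. Swapping in different calibration data changes only the activation statistics, which is exactly the knob the paper studies.

```python
# Minimal sketch of Wanda-style unstructured pruning for one linear layer.
# The calibration tensor here is random noise; the paper's point is that
# replacing it with text closer to the model's pre-training distribution
# (e.g., self-generated data) changes which weights look "important".
import torch

def wanda_prune_layer(weight: torch.Tensor, calib_inputs: torch.Tensor, sparsity: float) -> torch.Tensor:
    """weight: (out_features, in_features); calib_inputs: (n_tokens, in_features)."""
    # Per-input-feature L2 norm over the calibration tokens.
    act_norm = calib_inputs.norm(p=2, dim=0)            # (in_features,)
    importance = weight.abs() * act_norm.unsqueeze(0)   # |W| * ||X||_2
    # Zero out the lowest-importance weights within each output row.
    k = int(weight.shape[1] * sparsity)
    drop_idx = torch.topk(importance, k, dim=1, largest=False).indices
    mask = torch.ones_like(weight, dtype=torch.bool)
    mask.scatter_(1, drop_idx, False)
    return weight.masked_fill(~mask, 0.0)

# Toy usage: a 16x64 layer, 128 calibration tokens, 50% sparsity.
torch.manual_seed(0)
W = torch.randn(16, 64)
calib = torch.randn(128, 64)   # stand-in for calibration-text hidden states
pruned = wanda_prune_layer(W, calib, sparsity=0.5)
print((pruned == 0).float().mean())   # ~0.5
```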
OPT-Tree: Speculative Decoding with Adaptive Draft Tree Structure
Wang, Jikai, Su, Yi, Li, Juntao, Xia, Qingrong, Ye, Zi, Duan, Xinyu, Wang, Zhefeng, Zhang, Min
Autoregressive language models demonstrate excellent performance in various scenarios. However, their inference efficiency is limited by the one-step-one-word generation mode, which has become a pressing problem as models grow increasingly larger. Speculative decoding employs a "draft and then verify" mechanism to allow multiple tokens to be generated in one step, realizing lossless acceleration. Existing methods mainly adopt fixed heuristic draft structures, which fail to adapt to different situations to maximize the acceptance length during verification. To alleviate this dilemma, we propose OPT-Tree, an algorithm to construct adaptive and scalable draft trees. It searches for the tree structure that maximizes the mathematical expectation of the acceptance length in each decoding step. Experimental results reveal that OPT-Tree outperforms existing draft structures and achieves a speed-up ratio of up to 3.2 compared with autoregressive decoding. If the draft model is powerful enough and the node budget is sufficient, it can generate more than ten tokens in a single step. Our code is available at https://github.com/Jikai0Wang/OPT-Tree.
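As a rough illustration of the adaptive-tree idea (a sketch, not the paper's actual algorithm), the snippet below greedily grows a draft tree under a node budget by always expanding the candidate with the highest cumulative draft probability, and uses the sum of kept path probabilities as an estimate of the expected acceptance length. The function name, the expansion callback, and the toy probabilities are placeholders; in OPT-Tree the probabilities come from the draft model at each decoding step.

```python
# Sketch: greedy best-first construction of a draft tree under a node budget.
# Because a child's cumulative probability never exceeds its parent's,
# selecting nodes in best-first order always yields a connected tree.
import heapq

def build_draft_tree(root_children, expand_fn, budget):
    """root_children: list of (token, prob) for the first draft step;
    expand_fn(path) -> list of (token, prob) continuations of a path."""
    # Max-heap keyed by cumulative path probability (negated for heapq).
    heap = [(-p, (tok,), p) for tok, p in root_children]
    heapq.heapify(heap)
    selected = []
    while heap and len(selected) < budget:
        _, path, p = heapq.heappop(heap)
        selected.append((path, p))
        for tok, q in expand_fn(path):                 # lazily expand children
            heapq.heappush(heap, (-p * q, path + (tok,), p * q))
    # Sum of kept path probabilities estimates the expected acceptance length.
    expected_accept = sum(p for _, p in selected)
    return selected, expected_accept

# Toy usage with a fake two-way draft distribution at every depth.
fake_expand = lambda path: [("a", 0.7), ("b", 0.2)]
paths, e_len = build_draft_tree(fake_expand(()), fake_expand, budget=8)
print(len(paths), round(e_len, 3))
```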
A Survey on Arabic Named Entity Recognition: Past, Recent Advances, and Future Trends
Qu, Xiaoye, Gu, Yingjie, Xia, Qingrong, Li, Zechang, Wang, Zhefeng, Huai, Baoxing
As more and more Arabic texts emerge on the Internet, extracting important information from them becomes especially useful. As a fundamental technology, named entity recognition (NER) serves as the core component of information extraction, while also playing a critical role in many other Natural Language Processing (NLP) systems, such as question answering and knowledge graph building. In this paper, we provide a comprehensive review of the development of Arabic NER, especially the recent advances in deep learning and pre-trained language models. Specifically, we first introduce the background of Arabic NER, including the characteristics of Arabic and existing resources for Arabic NER. Then, we systematically review the development of Arabic NER methods. Traditional Arabic NER systems focus on feature engineering and designing domain-specific rules. In recent years, deep learning methods have achieved significant progress by representing texts via continuous vector representations. With the growth of pre-trained language models, Arabic NER has yielded further performance gains. Finally, we summarize the gap between Arabic NER methods and those for other languages, which helps outline future directions for Arabic NER.
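For readers unfamiliar with the PLM-based recipe the survey refers to, the sketch below frames NER as token classification on top of a pre-trained encoder using the Hugging Face transformers API. The checkpoint name and label set are placeholders (not real model ids), and a real system would fine-tune on annotated Arabic NER data before inference.

```python
# Sketch: NER as token classification over a pre-trained encoder.
# The checkpoint is a placeholder; any Arabic PLM with a token-classification
# head fine-tuned on BIO-tagged data would slot in here.
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

labels = ["O", "B-PER", "I-PER", "B-LOC", "I-LOC", "B-ORG", "I-ORG"]  # illustrative tag set
checkpoint = "some-arabic-plm"  # placeholder, not a real model id

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForTokenClassification.from_pretrained(checkpoint, num_labels=len(labels))

text = "..."  # an Arabic sentence
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits              # (1, seq_len, num_labels)
pred_ids = logits.argmax(dim=-1)[0].tolist()
print([labels[i] for i in pred_ids])             # one BIO tag per sub-token
```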
AraMUS: Pushing the Limits of Data and Model Scale for Arabic Natural Language Processing
Alghamdi, Asaad, Duan, Xinyu, Jiang, Wei, Wang, Zhenhai, Wu, Yimeng, Xia, Qingrong, Wang, Zhefeng, Zheng, Yi, Rezagholizadeh, Mehdi, Huai, Baoxing, Cheng, Peilun, Ghaddar, Abbas
Developing monolingual large Pre-trained Language Models (PLMs) has proven very successful for handling different tasks in Natural Language Processing (NLP). In this work, we present AraMUS, the largest Arabic PLM, with 11B parameters trained on 529GB of high-quality Arabic textual data. AraMUS achieves state-of-the-art performance on a diverse set of Arabic classification and generative tasks. Moreover, AraMUS shows impressive few-shot learning abilities compared with the best existing Arabic PLMs.