Su, Teng
CodeGeeX: A Pre-Trained Model for Code Generation with Multilingual Evaluations on HumanEval-X
Zheng, Qinkai, Xia, Xiao, Zou, Xu, Dong, Yuxiao, Wang, Shan, Xue, Yufei, Wang, Zihan, Shen, Lei, Wang, Andi, Li, Yang, Su, Teng, Yang, Zhilin, Tang, Jie
Large pre-trained code generation models, such as OpenAI Codex, can generate syntactically and functionally correct code, making programmers more productive and bringing us closer to artificial general intelligence. In this paper, we introduce CodeGeeX, a multilingual model with 13 billion parameters for code generation. CodeGeeX is pre-trained on 850 billion tokens of 23 programming languages as of June 2022. Our extensive experiments suggest that CodeGeeX outperforms multilingual code models of similar scale on both code generation and translation tasks on HumanEval-X. Building upon HumanEval (Python only), we develop the HumanEval-X benchmark for evaluating multilingual models by hand-writing solutions in C++, Java, JavaScript, and Go. In addition, we build CodeGeeX-based extensions for Visual Studio Code, JetBrains, and Cloud Studio, generating 4.7 billion tokens per week for tens of thousands of active users. Our user study demonstrates that CodeGeeX helps increase coding efficiency for 83.4% of its users. Finally, CodeGeeX is publicly accessible, and in Sep. 2022 we open-sourced its code, model weights (the 850B-token version), API, extensions, and HumanEval-X at https://github.com/THUDM/CodeGeeX.
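HumanEval-X, like HumanEval, scores functional correctness with the pass@k metric estimated from n sampled completions per problem. Below is a minimal sketch of the standard unbiased pass@k estimator used with HumanEval-style benchmarks; the function name and the example numbers are illustrative and not taken from the CodeGeeX repository.

```python
import math

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate: 1 - C(n-c, k) / C(n, k),
    where n = completions sampled per problem and c = completions passing the tests."""
    if n - c < k:
        return 1.0  # fewer failing samples than k, so every size-k draw contains a pass
    return 1.0 - math.comb(n - c, k) / math.comb(n, k)

# Example (hypothetical numbers): 200 samples for one problem, 37 pass the hidden tests.
print(pass_at_k(n=200, c=37, k=1))    # expected solve rate with 1 attempt
print(pass_at_k(n=200, c=37, k=10))   # expected solve rate with 10 attempts
```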
PanGu-Σ: Towards Trillion Parameter Language Model with Sparse Heterogeneous Computing
Ren, Xiaozhe, Zhou, Pingyi, Meng, Xinfan, Huang, Xinjing, Wang, Yadao, Wang, Weichao, Li, Pengfei, Zhang, Xiaoda, Podolskiy, Alexander, Arshinov, Grigory, Bout, Andrey, Piontkovskaya, Irina, Wei, Jiansheng, Jiang, Xin, Su, Teng, Liu, Qun, Yao, Jun
The scaling of large language models has greatly improved natural language understanding, generation, and reasoning. In this work, we develop a system that trained a trillion-parameter language model on a cluster of Ascend 910 AI processors with the MindSpore framework, and present the resulting 1.085T-parameter language model, PanGu-Σ. With parameters inherited from PanGu-α, we extend the dense Transformer model to a sparse one with Random Routed Experts (RRE) and efficiently train the model over 329B tokens using Expert Computation and Storage Separation (ECSS). This results in a 6.3x increase in training throughput through heterogeneous computing. Our experimental findings show that PanGu-Σ provides state-of-the-art performance in zero-shot learning on various Chinese NLP downstream tasks. Moreover, it demonstrates strong abilities when fine-tuned on application data for open-domain dialogue, question answering, machine translation, and code generation.
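To illustrate the sparse-expert idea behind Random Routed Experts, the toy sketch below routes each token to a single feed-forward expert via a fixed pseudo-random token-id-to-expert table instead of a learned gating network. This is a conceptual NumPy illustration under that assumption, not the authors' MindSpore implementation; all sizes, names, and weights are made up.

```python
import numpy as np

rng = np.random.default_rng(0)

N_EXPERTS, VOCAB = 8, 1000   # toy sizes, purely illustrative
D_MODEL, D_FF = 16, 64

# Fixed (never trained) pseudo-random token-id -> expert table: stands in for
# routing without a learned gating network.
token_to_expert = rng.integers(0, N_EXPERTS, size=VOCAB)

# Each expert is an independent two-layer feed-forward block (placeholder weights).
experts = [
    (rng.standard_normal((D_MODEL, D_FF)) * 0.02,
     rng.standard_normal((D_FF, D_MODEL)) * 0.02)
    for _ in range(N_EXPERTS)
]

def sparse_ffn(hidden: np.ndarray, token_ids: np.ndarray) -> np.ndarray:
    """Apply, per token, only the one expert its id maps to (sparse activation)."""
    out = np.empty_like(hidden)
    for i, (h, tid) in enumerate(zip(hidden, token_ids)):
        w_in, w_out = experts[token_to_expert[int(tid)]]
        out[i] = np.maximum(h @ w_in, 0.0) @ w_out   # ReLU FFN of the chosen expert
    return out

tokens = np.array([101, 7, 7, 42])                    # toy token ids
states = rng.standard_normal((len(tokens), D_MODEL))
print(sparse_ffn(states, tokens).shape)               # -> (4, 16)
```

Because only one expert's weights are touched per token, compute per token stays close to the dense baseline while total parameter count scales with the number of experts, which is the property the sparse extension exploits.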