AITopics | Wu, Zhongyuan

Collaborating Authors

Wu, Zhongyuan

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Empowering Character-level Text Infilling by Eliminating Sub-Tokens

Ren, Houxing, Zhan, Mingjie, Wu, Zhongyuan, Li, Hongsheng

arXiv.org Artificial IntelligenceJun-14-2024

In infilling tasks, sub-tokens, representing instances where a complete token is segmented into two parts, often emerge at the boundaries of prefixes, middles, and suffixes. Traditional methods focused on training models at the token level, leading to sub-optimal performance in character-level infilling tasks during the inference stage. Alternately, some approaches considered character-level infilling, but they relied on predicting sub-tokens in inference, yet this strategy diminished ability in character-level infilling tasks due to the large perplexity of the model on sub-tokens. In this paper, we introduce FIM-SE, which stands for Fill-In-the-Middle with both Starting and Ending character constraints. The proposed method addresses character-level infilling tasks by utilizing a line-level format to avoid predicting any sub-token in inference. In addition, we incorporate two special tokens to signify the rest of the incomplete lines, thereby enhancing generation guidance. Extensive experiments demonstrate that our proposed approach surpasses previous methods, offering a significant advantage. Code is available at https://github.com/SenseLLM/FIM-SE.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2405.17103

Country:

North America > United States > California > Los Angeles County > Long Beach (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

ReflectionCoder: Learning from Reflection Sequence for Enhanced One-off Code Generation

Ren, Houxing, Zhan, Mingjie, Wu, Zhongyuan, Zhou, Aojun, Pan, Junting, Li, Hongsheng

arXiv.org Artificial IntelligenceMay-27-2024

Code generation plays a crucial role in various tasks, such as code auto-completion and mathematical reasoning. Previous work has proposed numerous methods to enhance code generation performance, including integrating feedback from the compiler. Inspired by this, we present ReflectionCoder, a novel approach that effectively leverages reflection sequences constructed by integrating compiler feedback to improve one-off code generation performance. Furthermore, we propose reflection self-distillation and dynamically masked distillation to effectively utilize these reflection sequences. Extensive experiments on three benchmarks, i.e., HumanEval (+), MBPP (+), and MultiPl-E, demonstrate that models fine-tuned with our method achieve state-of-the-art performance. Notably, ReflectionCoder-DeepSeek-Coder-33B reaches pass@1 of 82.9 (76.8) on HumanEval (+) and 84.1 (72.0) on MBPP (+), on par with GPT-3.5-Turbo and Claude-3-opus, and surpasses early GPT-4. Beyond the code domain, we believe this approach can benefit other domains that focus on final results and require long reasoning paths. Code and data are available at https://github.com/SenseLLM/ReflectionCoder.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2405.17057

Country:

North America > United States > Hawaii (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Canada > Quebec (0.14)

Genre: Research Report > Promising Solution (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Automatic Programming (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

BigDL: A Distributed Deep Learning Framework for Big Data

Dai, Jason, Wang, Yiheng, Qiu, Xin, Ding, Ding, Zhang, Yao, Wang, Yanzhang, Jia, Xianyan, Zhang, Cherry, Wan, Yan, Li, Zhichao, Wang, Jiao, Huang, Shengsheng, Wu, Zhongyuan, Wang, Yang, Yang, Yuhao, She, Bowen, Shi, Dongjie, Lu, Qi, Huang, Kai, Song, Guoqiong

arXiv.org Artificial IntelligenceApr-22-2018

In this paper, we present BigDL, a distributed deep learning framework for Big Data platforms and workflows. It is implemented on top of Apache Spark, and allows users to write their deep learning applications as standard Spark programs (running directly on large-scale big data clusters in a distributed fashion). It provides an expressive, "data-analytics integrated" deep learning programming model, so that users can easily build the end-to-end analytics + AI pipelines under a unified programming paradigm; by implementing an AllReduce like operation using existing primitives in Spark (e.g., shuffle, broadcast, and in-memory data persistence), it also provides a highly efficient "parameter server" style architecture, so as to achieve highly scalable, data-parallel distributed training. Since its initial open source release, BigDL users have built many analytics and deep learning applications (e.g., object detection, sequence-to-sequence generation, visual similarity, neural recommendations, fraud detection, etc.) on Spark.

bigdl, deep learning, neural network, (17 more...)

arXiv.org Artificial Intelligence

1804.05839

Genre: Research Report (0.40)

Industry: Education (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback