AITopics | Zou, Xu

BIPro: Zero-shot Chinese Poem Generation via Block Inverse Prompting Constrained Generation Framework

Zou, Xu

arXiv.org Artificial Intelligence, Nov-20-2024

Recently, generative pre-trained models have made significant strides, particularly highlighted by the release of ChatGPT and GPT-4, which exhibit superior cross-domain capabilities. However, these models still face challenges on constrained writing tasks like poem generation under open-domain titles. In response to this challenge, we introduce the Block Inverse Prompting (BIPro) constrained generation framework. BIPro leverages two block inverse prompting methods, revise and rewrite, that mimic the process of human text writing using block generative models. It significantly improves the zero-shot generation quality on the formidable constrained generation task of open-domain traditional-form Chinese poem generation. Built on the less powerful block generative model GLM-10B-Chinese, poems composed via BIPro without priming or additional training outperform both the most advanced direct generative systems, such as GPT-4 or GLM-4, and the best domain-specific systems, such as Yusheng, Shisanbai, or Baidu Poetry Helper, in human evaluation by proficient poets. Finally, BIPro considerably narrows the gap between AI-generated works and short-listed human literary arts in another human evaluation, unveiling the promising potential of block generative models in improving the quality of constrained generation.

large language model, machine learning, natural language

arXiv:2411.13237

Country: North America > United States (0.14)

Genre: Research Report (0.64)

Industry: Social Sector (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


CodeGeeX: A Pre-Trained Model for Code Generation with Multilingual Evaluations on HumanEval-X

Zheng, Qinkai, Xia, Xiao, Zou, Xu, Dong, Yuxiao, Wang, Shan, Xue, Yufei, Wang, Zihan, Shen, Lei, Wang, Andi, Li, Yang, Su, Teng, Yang, Zhilin, Tang, Jie

arXiv.org Artificial Intelligence, Mar-30-2023

Large pre-trained code generation models, such as OpenAI Codex, can generate syntax- and function-correct code, making programmers more productive and bringing the pursuit of artificial general intelligence closer. In this paper, we introduce CodeGeeX, a multilingual model with 13 billion parameters for code generation. CodeGeeX is pre-trained on 850 billion tokens of 23 programming languages as of June 2022. Our extensive experiments suggest that CodeGeeX outperforms multilingual code models of similar scale on both code generation and code translation on HumanEval-X. Building upon HumanEval (Python only), we develop the HumanEval-X benchmark for evaluating multilingual models by hand-writing the solutions in C++, Java, JavaScript, and Go. In addition, we build CodeGeeX-based extensions on Visual Studio Code, JetBrains, and Cloud Studio, generating 4.7 billion tokens for tens of thousands of active users per week. Our user study demonstrates that CodeGeeX can help to increase coding efficiency for 83.4% of its users. Finally, CodeGeeX is publicly accessible, and in Sep. 2022 we open-sourced its code, model weights (the version of 850B tokens), API, extensions, and HumanEval-X at https://github.com/THUDM/CodeGeeX.

machine learning, natural language, programming language

arXiv:2303.17568

Genre:

Research Report (0.70)
Questionnaire & Opinion Survey (0.54)

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Automatic Programming (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)


Graph Robustness Benchmark: Benchmarking the Adversarial Robustness of Graph Machine Learning

Zheng, Qinkai, Zou, Xu, Dong, Yuxiao, Cen, Yukuo, Yin, Da, Xu, Jiarong, Yang, Yang, Tang, Jie

arXiv.org Artificial Intelligence, Nov-8-2021

Adversarial attacks on graphs have posed a major threat to the robustness of graph machine learning (GML) models. Naturally, there is an ever-escalating arms race between attackers and defenders. However, the strategies on both sides are often not fairly compared under the same realistic conditions. To bridge this gap, we present the Graph Robustness Benchmark (GRB) with the goal of providing a scalable, unified, modular, and reproducible evaluation for the adversarial robustness of GML models. GRB standardizes the process of attacks and defenses by 1) developing scalable and diverse datasets, 2) modularizing the attack and defense implementations, and 3) unifying the evaluation protocol in refined scenarios. By leveraging the GRB pipeline, end users can focus on the development of robust GML models with automated data processing and experimental evaluations. To support open and reproducible research on graph adversarial learning, GRB also hosts public leaderboards across different scenarios. As a starting point, we conduct extensive experiments to benchmark baseline techniques. GRB is open-source and welcomes contributions from the community.

artificial intelligence, gml model, machine learning

arXiv:2111.04314

Country: North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report (0.49)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military (0.89)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)


Controllable Generation from Pre-trained Language Models via Inverse Prompting

Zou, Xu, Yin, Da, Zhong, Qingyang, Yang, Hongxia, Yang, Zhilin, Tang, Jie

arXiv.org Artificial Intelligence, Mar-19-2021

Large-scale pre-trained language models have demonstrated strong capabilities for generating realistic text. However, it remains challenging to control the generation results. Previous approaches such as prompting are far from sufficient, which limits the usage of language models. To tackle this challenge, we propose an innovative method, inverse prompting, to better control text generation. The core idea of inverse prompting is to use generated text to inversely predict the prompt during beam search, which enhances the relevance between the prompt and the generated text and provides better controllability. Empirically, we pre-train a large-scale Chinese language model to perform a systematic study using human evaluation on the tasks of open-domain poem generation and open-domain long-form question answering. Our results show that our proposed method substantially outperforms the baselines and that our generation quality is close to human performance on some of the tasks. Narrators can try our poem generation demo at https://pretrain.aminer.cn/apps/poetry.html, while our QA demo can be found at https://pretrain.aminer.cn/app/qa. For researchers, the code is provided at https://github.com/THUDM/InversePrompting.
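
The core idea in the abstract above, scoring each beam candidate by how well it predicts the prompt, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `weight` parameter, the additive combination of forward and inverse scores, and the word-overlap scorer standing in for a real language model are all assumptions made for illustration.

```python
from typing import Callable, List, Tuple


def rerank_with_inverse_prompt(
    prompt: str,
    candidates: List[Tuple[str, float]],
    inverse_log_prob: Callable[[str, str], float],
    weight: float = 1.0,
) -> List[Tuple[str, float]]:
    """Re-score beam candidates by how well each one predicts the prompt.

    candidates holds (generated_text, forward_log_prob) pairs.
    inverse_log_prob(generated_text, prompt) is a hypothetical scorer that
    approximates log p(prompt | generated_text).
    """
    rescored = [
        (text, forward + weight * inverse_log_prob(text, prompt))
        for text, forward in candidates
    ]
    # Highest combined score first, as a beam search step would keep them.
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


def toy_inverse_log_prob(generated: str, prompt: str) -> float:
    """Toy stand-in for a language model (assumption, not a real scorer).

    Returns the share of prompt words found in the generated text, shifted
    by -1 so the value is at most 0, like a log-probability.
    """
    prompt_words = prompt.lower().split()
    generated_words = set(generated.lower().split())
    overlap = sum(word in generated_words for word in prompt_words)
    return overlap / len(prompt_words) - 1.0
```

In a real system the toy scorer would be replaced by a language model evaluating the prompt conditioned on the generated text, and the re-ranking would run at every beam search step rather than once over finished candidates.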

foreign policy, inverse, neural network

arXiv:2103.10685

Country:

North America > United States (0.93)
Asia (0.93)
Europe > United Kingdom (0.68)

Genre:

Research Report > New Finding (0.68)
Research Report > Promising Solution (0.48)

Industry:

Government > Regional Government (0.94)
Health & Medicine > Therapeutic Area (0.93)
Government > Foreign Policy (0.68)
Government > Commerce (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)


Controllable Multi-Interest Framework for Recommendation

Cen, Yukuo, Zhang, Jianwei, Zou, Xu, Zhou, Chang, Yang, Hongxia, Tang, Jie

arXiv.org Machine Learning, Aug-2-2020

Recently, neural networks have been widely used in e-commerce recommender systems, owing to the rapid development of deep learning. We formalize the recommender system as a sequential recommendation problem, intending to predict the next items that the user might interact with. Recent works usually give an overall embedding from a user's behavior sequence. However, a unified user embedding cannot reflect the user's multiple interests during a period. In this paper, we propose a novel controllable multi-interest framework for sequential recommendation, called ComiRec. Our multi-interest module captures multiple interests from user behavior sequences, which can be exploited for retrieving candidate items from the large-scale item pool. These items are then fed into an aggregation module to obtain the overall recommendation. The aggregation module leverages a controllable factor to balance the recommendation accuracy and diversity. We conduct experiments for sequential recommendation on two real-world datasets, Amazon and Taobao. Experimental results demonstrate that our framework achieves significant improvements over state-of-the-art models. Our framework has also been successfully deployed on the offline Alibaba distributed cloud platform.
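
The aggregation step described in the abstract above, where a controllable factor trades accuracy against diversity, can be sketched as a greedy selection. This is an illustrative assumption rather than the authors' implementation: the abstract does not define the diversity measure, so the sketch counts category differences between items, and the names `aggregate`, `scores`, `categories`, and `factor` are hypothetical.

```python
from typing import Dict, List


def aggregate(
    scores: Dict[str, float],
    categories: Dict[str, str],
    top_n: int,
    factor: float,
) -> List[str]:
    """Greedily pick top_n items, balancing relevance and diversity.

    scores maps each candidate item to a relevance score.
    categories maps each candidate item to a category label.
    factor is the controllable weight on diversity; 0 ranks purely by score.
    """
    selected: List[str] = []
    remaining = set(scores)
    while remaining and len(selected) < top_n:

        def gain(item: str) -> float:
            # Diversity bonus: one point per already-selected item
            # that belongs to a different category.
            diversity = sum(
                categories[item] != categories[chosen] for chosen in selected
            )
            return scores[item] + factor * diversity

        # Sort first so ties are broken deterministically by item name.
        best = max(sorted(remaining), key=gain)
        selected.append(best)
        remaining.remove(best)
    return selected
```

With `factor` set to 0 the result is the plain top-N by score; raising it makes the selection favor items from categories not yet represented.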

deep learning, neural network, recommendation

arXiv:2005.09347

Country: North America > Canada > Ontario > Toronto (0.14)

Genre:

Research Report > New Finding (0.34)
Research Report > Promising Solution (0.34)

Industry: Information Technology > Services (0.55)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)


Dimensional Reweighting Graph Convolutional Networks

Zou, Xu, Jia, Qiuye, Zhang, Jianwei, Zhou, Chang, Yang, Hongxia, Tang, Jie

arXiv.org Machine Learning, Jul-4-2019

Graph Convolution Networks (GCNs) are becoming more and more popular for learning node representations on graphs. Though there exist various developments on sampling and aggregation to accelerate the training process and improve performance, few works focus on dealing with the dimensional information imbalance of node representations. To bridge the gap, we propose a method named Dimensional reweighting Graph Convolution Network (DrGCN). We theoretically prove, via mean field theory, that DrGCN is guaranteed to improve the stability of GCNs. Our dimensional reweighting method is very flexible and can be easily combined with most sampling and aggregation techniques for GCNs. Experimental results demonstrate its superior performance on several challenging transductive and inductive node classification benchmark datasets. Our DrGCN also outperforms existing models on an industrial-sized Alibaba recommendation dataset.

dataset, deep learning, neural network

arXiv:1907.02237

Country: North America > United States (0.14)

Genre: Research Report > New Finding (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
