Li, Xiaopeng
Exploring Continual Learning for Code Generation Models
Yadav, Prateek, Sun, Qing, Ding, Hantian, Li, Xiaopeng, Zhang, Dejiao, Tan, Ming, Ma, Xiaofei, Bhatia, Parminder, Nallapati, Ramesh, Ramanathan, Murali Krishna, Bansal, Mohit, Xiang, Bing
Large-scale code generation models such as Codex and CodeT5 have achieved impressive performance. However, libraries are upgraded or deprecated very frequently, and re-training large-scale language models is computationally expensive. Therefore, Continual Learning (CL) is an important aspect that remains underexplored in the code domain. In this paper, we introduce a benchmark called CodeTask-CL that covers a wide range of tasks, including code generation, translation, summarization, and refinement, with different input and output programming languages. Next, on our CodeTask-CL benchmark, we compare popular CL techniques from the NLP and vision domains. We find that effective methods like Prompt Pooling (PP) suffer from catastrophic forgetting due to the unstable training of the prompt selection mechanism caused by stark distribution shifts in coding tasks. We address this issue with our proposed method, Prompt Pooling with Teacher Forcing (PP-TF), which stabilizes training by enforcing constraints on the prompt selection mechanism and leads to a 21.54% improvement over Prompt Pooling. Along with the benchmark, we establish a training pipeline that can be used for CL on code models, which we believe can motivate further development of CL methods for code models. Our code is available at https://github.com/amazon-science/codetaskcl-pptf
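As a rough illustration of the idea, the sketch below shows a prompt pool whose query-key matching is teacher-forced to a fixed set of task-assigned prompts during training on each task; the pool size, top-k matching, and per-task assignment scheme are illustrative assumptions rather than the paper's exact implementation (the official code is in the repository linked above).

    # Minimal sketch (assumptions noted above): a prompt pool whose selection is
    # restricted to task-assigned prompts, i.e. a teacher-forced selection step.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class TeacherForcedPromptPool(nn.Module):
        def __init__(self, pool_size=20, prompt_len=8, dim=768, top_k=4):
            super().__init__()
            self.keys = nn.Parameter(torch.randn(pool_size, dim))               # matching keys
            self.prompts = nn.Parameter(torch.randn(pool_size, prompt_len, dim))
            self.top_k = top_k

        def forward(self, query, allowed_ids=None):
            # query: (batch, dim) summary of the input, e.g. mean-pooled embeddings
            scores = F.cosine_similarity(query.unsqueeze(1), self.keys.unsqueeze(0), dim=-1)
            if allowed_ids is not None:
                # Teacher forcing: only prompts assigned to the current task may be
                # selected, which keeps the query-key matching stable across tasks.
                mask = torch.full_like(scores, float("-inf"))
                mask[:, allowed_ids] = 0.0
                scores = scores + mask
            idx = scores.topk(self.top_k, dim=-1).indices                       # (batch, top_k)
            selected = self.prompts[idx]                                        # (batch, top_k, prompt_len, dim)
            return selected.flatten(1, 2)                                       # prepend to input embeddings

At inference time the mask can be dropped so that selection uses ordinary query-key matching over the whole pool; that is a design choice of this sketch, not necessarily the paper's procedure.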
A Static Evaluation of Code Completion by Large Language Models
Ding, Hantian, Kumar, Varun, Tian, Yuchen, Wang, Zijian, Kwiatkowski, Rob, Li, Xiaopeng, Ramanathan, Murali Krishna, Ray, Baishakhi, Bhatia, Parminder, Sengupta, Sudipta, Roth, Dan, Xiang, Bing
Large language models trained on code have shown great potential to increase the productivity of software developers. Several execution-based benchmarks have been proposed to evaluate the functional correctness of model-generated code on simple programming problems. Nevertheless, it is expensive to perform the same evaluation on complex real-world projects given the execution cost. In contrast, static analysis tools such as linters, which can detect errors without running the program, have not been well explored for evaluating code generation models. In this work, we propose a static evaluation framework to quantify static errors in Python code completions by leveraging Abstract Syntax Trees. Compared with execution-based evaluation, our method is not only more efficient but also applicable to code in the wild. For experiments, we collect code context from open-source repositories to generate one million function bodies using public models. Our static analysis reveals that Undefined Name and Unused Variable are the most common errors made by language models. Through extensive studies, we also show the impact of sampling temperature, model size, and context on static errors in code completions.
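As a rough sketch of this kind of AST/linter-based checking, the snippet below parses a generated function body and counts linter diagnostics such as Undefined Name and Unused Variable; the use of pyflakes and the helper names here are assumptions made for illustration, not necessarily the authors' exact tooling.

    # Count static errors in a model-generated completion (sketch; see caveats above).
    import ast
    from collections import Counter
    import pyflakes.api

    class CollectingReporter:
        """Duck-typed pyflakes reporter that records diagnostics instead of printing."""
        def __init__(self):
            self.errors = []
        def unexpectedError(self, filename, msg):
            self.errors.append(("UnexpectedError", str(msg)))
        def syntaxError(self, filename, msg, lineno, offset, text):
            self.errors.append(("SyntaxError", str(msg)))
        def flake(self, message):
            # message is a pyflakes message object, e.g. UndefinedName or UnusedVariable
            self.errors.append((type(message).__name__, str(message)))

    def static_errors(completion_source: str) -> Counter:
        counts = Counter()
        try:
            ast.parse(completion_source)            # cheap syntax check via the AST
        except SyntaxError:
            counts["SyntaxError"] += 1
            return counts
        reporter = CollectingReporter()
        pyflakes.api.check(completion_source, "<completion>", reporter)
        counts.update(kind for kind, _ in reporter.errors)
        return counts

Aggregating such counters over a large set of generated function bodies yields an error breakdown of the kind reported in the paper.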
ContraCLM: Contrastive Learning For Causal Language Model
Jain, Nihal, Zhang, Dejiao, Ahmad, Wasi Uddin, Wang, Zijian, Nan, Feng, Li, Xiaopeng, Tan, Ming, Nallapati, Ramesh, Ray, Baishakhi, Bhatia, Parminder, Ma, Xiaofei, Xiang, Bing
Despite exciting progress in causal language models, the expressiveness of their representations is largely limited by poor discrimination ability. To remedy this issue, we present ContraCLM, a novel contrastive learning framework that operates at both the token level and the sequence level. We assess ContraCLM on a variety of downstream tasks. We show that ContraCLM enhances the discrimination of the representations and bridges the gap with encoder-only models, which makes causal language models better suited for tasks beyond language generation. Specifically, we attain a $44\%$ relative improvement on Semantic Textual Similarity tasks and $34\%$ on Code-to-Code Search tasks. Furthermore, by improving the expressiveness of the representations, ContraCLM also boosts source code generation capability, with a $9\%$ relative improvement in execution accuracy on the HumanEval benchmark.
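For intuition, a sequence-level contrastive term of the kind ContraCLM adds on top of the causal LM objective could look like the InfoNCE-style sketch below; the temperature, pooling, and use of two dropout-perturbed forward passes as positive pairs are illustrative assumptions rather than the paper's exact formulation.

    # Sequence-level contrastive loss sketch (assumptions noted above).
    import torch
    import torch.nn.functional as F

    def seq_contrastive_loss(h1, h2, temperature=0.05):
        # h1, h2: (batch, dim) representations of two views of the same sequences,
        # e.g. mean-pooled hidden states from two dropout-perturbed forward passes.
        h1 = F.normalize(h1, dim=-1)
        h2 = F.normalize(h2, dim=-1)
        logits = h1 @ h2.t() / temperature                 # (batch, batch) similarities
        labels = torch.arange(h1.size(0), device=h1.device)
        # The matching view is the positive; every other sequence in the batch is a negative.
        return F.cross_entropy(logits, labels)

The full training objective would combine this term (and its token-level analogue) with the usual next-token prediction loss.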
Multi-lingual Evaluation of Code Generation Models
Athiwaratkun, Ben, Gouda, Sanjay Krishna, Wang, Zijian, Li, Xiaopeng, Tian, Yuchen, Tan, Ming, Ahmad, Wasi Uddin, Wang, Shiqi, Sun, Qing, Shang, Mingyue, Gonugondla, Sujan Kumar, Ding, Hantian, Kumar, Varun, Fulton, Nathan, Farahani, Arash, Jain, Siddhartha, Giaquinto, Robert, Qian, Haifeng, Ramanathan, Murali Krishna, Nallapati, Ramesh, Ray, Baishakhi, Bhatia, Parminder, Sengupta, Sudipta, Roth, Dan, Xiang, Bing
We present new benchmarks for evaluating code generation models: MBXP, Multilingual HumanEval, and MathQA-X. These datasets cover more than 10 programming languages and are generated using a scalable conversion framework that transpiles prompts and test cases from the original Python datasets into the corresponding data in the target language. Using these benchmarks, we are able to assess the performance of code generation models in a multi-lingual fashion, and we discover the generalization ability of language models on out-of-domain languages, the advantages of multi-lingual models over mono-lingual ones, the ability of few-shot prompting to teach models new languages, and zero-shot translation abilities even in mono-lingual settings. Furthermore, we use our code generation model to perform large-scale bootstrapping to obtain synthetic canonical solutions in several languages, which can be used for other code-related evaluations such as code insertion, robustness, or summarization tasks. Overall, our benchmarks represent a significant step towards a deeper understanding of language models' code generation abilities. We publicly release our code and datasets at https://github.com/amazon-research/mxeval.
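Because these benchmarks are execution-based, results are typically reported as pass@k over sampled solutions. Below is the standard unbiased pass@k estimator; whether the mxeval release exposes exactly this helper is an assumption.

    # Unbiased pass@k estimator (as popularized by execution-based code benchmarks).
    import numpy as np

    def pass_at_k(n: int, c: int, k: int) -> float:
        """n: samples generated per problem, c: samples passing all tests, k: budget."""
        if n - c < k:
            return 1.0
        return 1.0 - np.prod(1.0 - k / np.arange(n - c + 1, n + 1))

Averaging pass_at_k over all problems of a given language then gives the per-language score.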
Greener yet Powerful: Taming Large Code Generation Models with Quantization
Wei, Xiaokai, Gonugondla, Sujan, Ahmad, Wasi, Wang, Shiqi, Ray, Baishakhi, Qian, Haifeng, Li, Xiaopeng, Kumar, Varun, Wang, Zijian, Tian, Yuchen, Sun, Qing, Athiwaratkun, Ben, Shang, Mingyue, Ramanathan, Murali Krishna, Bhatia, Parminder, Xiang, Bing
ML-powered code generation aims to assist developers in writing code more productively by intelligently generating code blocks based on natural language prompts. Recently, large pretrained deep learning models have substantially pushed the boundary of code generation and achieved impressive performance. Despite their great power, the huge number of model parameters poses a significant challenge to adopting them in a regular software development environment, where a developer might use a standard laptop or mid-size server to write code. Such large models incur significant resource usage (in terms of memory, latency, and dollars) as well as carbon footprint. Model compression is a promising approach to address these challenges. Several techniques have been proposed to compress large pretrained models typically used for vision or textual data. Out of the many available compression techniques, we identify quantization as the most applicable to the code generation task, since it does not require significant retraining. Because quantization represents model parameters with lower-bit integers (e.g., int8), both model size and runtime latency benefit from such an integer representation. We extensively study the impact of quantized models on code generation tasks across different dimensions: (i) resource usage and carbon footprint, (ii) accuracy, and (iii) robustness. Through systematic experiments, we find a quantization recipe that can run even a $6$B model on a regular laptop without significant accuracy or robustness degradation. We further find that the recipe is readily applicable to the code summarization task as well.
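As a concrete example of the kind of post-training quantization studied here, the sketch below applies PyTorch dynamic int8 quantization to a public code generation checkpoint; the model name and the choice of quantizing only Linear layers are illustrative assumptions, not the paper's exact recipe.

    # Post-training dynamic int8 quantization sketch (assumptions noted above).
    import torch
    from transformers import AutoModelForCausalLM

    model = AutoModelForCausalLM.from_pretrained("Salesforce/codegen-350M-mono")  # placeholder checkpoint
    model.eval()

    # Replace Linear weights with int8 tensors; activations are quantized on the fly,
    # so no retraining or calibration data is required.
    quantized = torch.quantization.quantize_dynamic(
        model, {torch.nn.Linear}, dtype=torch.qint8
    )
    torch.save(quantized.state_dict(), "codegen-350M-int8.pt")

The memory and latency savings come from the smaller weight representation and int8 matrix multiplications on CPU.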
Hard instance learning for quantum adiabatic prime factorization
Lin, Jian, Zhang, Zhengfeng, Zhang, Junping, Li, Xiaopeng
Prime factorization is a difficult problem for classical computing, whose exponential hardness is the foundation of Rivest-Shamir-Adleman (RSA) cryptography. With programmable quantum devices, adiabatic quantum computing has been proposed as a plausible approach to solve prime factorization, with a promising advantage over classical computing. Here, we find that certain hard instances are consistently intractable for both classical simulated annealing and un-configured adiabatic quantum computing (AQC). Aiming at an automated architecture for optimal configuration of quantum adiabatic factorization, we apply a deep reinforcement learning (RL) method to configure the AQC algorithm. By setting the success probability on the worst-case problem instances as the RL reward, we show that AQC performance on the hard instances is dramatically improved by RL configuration. The success probability also becomes more evenly distributed over different problem instances, meaning that the configured AQC is more stable than the un-configured case. Through a technique of transfer learning, we find prominent evidence that the framework of AQC configuration is scalable -- the configured AQC trained on five qubits continues to work efficiently on nine qubits with a minimal amount of additional training cost.
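For concreteness, the RL configuration can be phrased as maximizing a reward built from the worst-case success probability; the notation below is a generic formulation, not necessarily the paper's exact parameterization:

$$ R(\theta) = \min_{i \in \mathcal{I}_{\mathrm{hard}}} P^{(i)}_{\mathrm{success}}(\theta), \qquad P^{(i)}_{\mathrm{success}}(\theta) = \big|\langle \psi^{(i)}_{\mathrm{target}} \big| \psi^{(i)}(T;\theta) \rangle\big|^{2}, $$

where $\theta$ denotes the configuration parameters of the adiabatic schedule chosen by the RL agent, $\mathcal{I}_{\mathrm{hard}}$ the set of hard factorization instances, and $\psi^{(i)}(T;\theta)$ the state reached at the end of the anneal for instance $i$.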
Fermion Sampling Made More Efficient
Sun, Haoran, Zou, Jie, Li, Xiaopeng
Fermion sampling generates the probability distribution of a many-body Slater-determinant wavefunction, which is termed a "determinantal point process" in statistical analysis. Owing to its inherently embedded Pauli exclusion principle, its applications reach beyond simulating fermionic quantum many-body physics to constructing machine learning models for diversified datasets. Here we propose a fermion sampling algorithm with polynomial time complexity -- quadratic in the fermion number and linear in the system size. This algorithm is about 100% more efficient in computation time than the best known algorithms. In sampling the corresponding marginal distribution, our algorithm achieves an even more drastic improvement, namely a scaling advantage. We demonstrate its power on several test applications, including sampling fermions in a many-body system and a machine learning task of text summarization, and confirm its improved computational efficiency over other methods by counting floating-point operations.
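As background, a Slater determinant defines a projection determinantal point process over its orthonormal orbital matrix, and such a process can be sampled sequentially. The sketch below implements the textbook sequential procedure, not the paper's faster algorithm; the function and variable names are illustrative.

    # Textbook sequential sampling of a Slater-determinant (projection) DPP.
    import numpy as np

    def sample_slater_dpp(U, rng=None):
        """U: (N, M) matrix whose orthonormal columns are the occupied orbitals.
        Returns M sampled site indices distributed as |Slater determinant|^2."""
        rng = np.random.default_rng() if rng is None else rng
        N, M = U.shape
        V = U.astype(complex).copy()
        positions = []
        for _ in range(M):
            probs = np.sum(np.abs(V) ** 2, axis=1)
            probs /= probs.sum()                      # row norms give the next-site marginal
            x = int(rng.choice(N, p=probs))
            positions.append(x)
            # Combine columns so the remaining ones vanish at row x, then re-orthonormalize.
            j = int(np.argmax(np.abs(V[x, :])))
            col = V[:, j].copy()
            V = np.delete(V, j, axis=1)
            if V.shape[1]:
                V -= np.outer(col, V[x, :] / col[x])
                V, _ = np.linalg.qr(V)                # span unchanged, columns orthonormal again
        return sorted(positions)

For a free-fermion ground state, U would hold the M lowest single-particle eigenvectors; the paper's contribution is an algorithm with better scaling than this baseline.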
Reinforcement learning architecture for automated quantum-adiabatic-algorithm design
Lin, Jian, Lai, Zhong Yuan, Li, Xiaopeng
Quantum algorithm design lies at the heart of applications of quantum computation and quantum simulation. Here we put forward a deep reinforcement learning (RL) architecture for automated algorithm design within the framework of the quantum adiabatic algorithm, where the optimal Hamiltonian path to reach a quantum ground state that encodes a computation problem is obtained by RL techniques. We benchmark our approach on Grover search and 3-SAT problems, and find that the adiabatic algorithm obtained by our RL approach leads to significant improvement in success probability and computing speedups for both moderate and large numbers of qubits compared to conventional algorithms. The RL-designed algorithm is found to be qualitatively distinct from the linear algorithm in the resulting distribution of success probability. Considering the established complexity-equivalence of circuit and adiabatic quantum algorithms, we expect the RL-designed adiabatic algorithm to inspire novel circuit algorithms as well. Our approach offers a recipe for designing quantum algorithms for generic problems through an automated RL process, paving a new way toward automated quantum algorithm design using artificial intelligence, potentially applicable to different quantum simulation and computation platforms, from trapped ions and optical lattices to superconducting-qubit devices.
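In the adiabatic framework used here, the Hamiltonian path interpolates between an easily prepared Hamiltonian $H_B$ and the problem Hamiltonian $H_P$; a generic form (with the linear schedule $f(s)=s$ as the conventional baseline) is

$$ H(s) = \big[1 - f(s)\big]\, H_B + f(s)\, H_P, \qquad s = t/T, \quad f(0) = 0, \; f(1) = 1, $$

and the RL agent's task is to choose the path $f(s)$ (more generally, the full Hamiltonian path) that maximizes the success probability within a given total time $T$. The specific parameterization used in the paper may differ from this generic form.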
Review Helpfulness Assessment based on Convolutional Neural Network
Qu, Xianshan, Li, Xiaopeng, Rose, John R.
In this paper we describe the implementation of a convolutional neural network (CNN) used to assess online review helpfulness. To our knowledge, this is the first use of this architecture to address this problem. We explore two factors that affect CNN performance: different word embedding initializations and different input review lengths. We also propose an approach to combining rating star information with review text to further improve prediction accuracy, and demonstrate that this can improve the overall accuracy by 2%. Finally, we evaluate the method on a benchmark dataset and show an improvement in accuracy over published results for traditional methods: 2.5% for a model trained using only review text and 4.24% for a model trained on a combination of rating star information and review text.
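As a rough sketch of a CNN of this kind that also consumes the rating star, see below; the filter sizes, embedding dimension, and the point at which the rating feature is concatenated are illustrative assumptions rather than the paper's exact architecture.

    # Text-CNN helpfulness classifier with a rating-star feature (sketch; see caveats above).
    import torch
    import torch.nn as nn

    class HelpfulnessCNN(nn.Module):
        def __init__(self, vocab_size, embed_dim=300, num_filters=100, kernel_sizes=(3, 4, 5)):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
            self.convs = nn.ModuleList(
                [nn.Conv1d(embed_dim, num_filters, k) for k in kernel_sizes]
            )
            # +1 for the scalar rating-star feature concatenated with the text features.
            self.classifier = nn.Linear(num_filters * len(kernel_sizes) + 1, 2)

        def forward(self, token_ids, rating):
            # token_ids: (batch, seq_len) word indices; rating: (batch,) star rating, e.g. 1-5
            x = self.embed(token_ids).transpose(1, 2)        # (batch, embed_dim, seq_len)
            pooled = [torch.relu(conv(x)).max(dim=2).values for conv in self.convs]
            features = torch.cat(pooled + [rating.float().unsqueeze(1)], dim=1)
            return self.classifier(features)                 # helpful vs. not-helpful logits

The embedding layer can be initialized with pretrained word vectors, which is the kind of initialization choice explored in the paper.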