AITopics | Yang, Liqun

Yang, Liqun

Multi-Agent Collaboration for Multilingual Code Instruction Tuning

Yang, Jian, Zhang, Wei, Yang, Jiaxi, Miao, Yibo, Quan, Shanghaoran, Wu, Zhenhe, Peng, Qiyao, Yang, Liqun, Liu, Tianyu, Cui, Zeyu, Hui, Binyuan, Lin, Junyang

arXiv.org Artificial Intelligence, Feb-11-2025

Recent advances in code understanding and generation demonstrate that code LLMs fine-tuned on a high-quality instruction dataset can gain powerful capabilities to address wide-ranging code-related tasks. However, most existing methods view each programming language in isolation and ignore knowledge transfer among different programming languages. To bridge the gap among different programming languages, we introduce a novel multi-agent collaboration framework to enhance multilingual instruction tuning for code LLMs, where multiple language-specific intelligent agent components with generation memory work together to transfer knowledge from one language to another efficiently and effectively. Specifically, we first generate the language-specific instruction data from the code snippets and then provide the generated data as the seed data for language-specific agents. Multiple language-specific agents discuss and collaborate to formulate a new instruction and its corresponding solution (in a new or an existing programming language). To further encourage cross-lingual transfer, each agent stores its generation history as memory and then summarizes its merits and faults. Finally, the high-quality multilingual instruction data is used to encourage knowledge transfer among different programming languages to train Qwen2.5-xCoder. Experimental results on multilingual programming benchmarks demonstrate the superior performance of Qwen2.5-xCoder in sharing common knowledge, highlighting its potential to reduce the cross-lingual gap.

large language model, machine learning, qwen2, (18 more...)

arXiv: 2502.07487

Country:

North America > United States (0.14)
North America > Canada (0.14)
Asia > China (0.14)
Africa > Rwanda (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.96)

CryptoX: Compositional Reasoning Evaluation of Large Language Models

Shi, Jiajun, Wei, Chaoren, Yang, Liqun, Wang, Zekun Moore, Yang, Chenghao, Zhang, Ge, Huang, Stephen, Peng, Tao, Yang, Jian, Wen, Zhoufutu

arXiv.org Artificial Intelligence, Feb-8-2025

Compositional reasoning capacity has long been regarded as critical to the generalization and intelligence emergence of large language models (LLMs). However, despite numerous reasoning-related benchmarks, the compositional reasoning capacity of LLMs is rarely studied or quantified in the existing benchmarks. In this paper, we introduce CryptoX, an evaluation framework that, for the first time, combines existing benchmarks with cryptographic principles to quantify the compositional reasoning capacity of LLMs. Building upon CryptoX, we construct CryptoBench, which integrates these principles into several benchmarks for systematic evaluation. We conduct detailed experiments on widely used open-source and closed-source LLMs using CryptoBench, revealing a huge gap between open-source and closed-source LLMs. We further conduct thorough mechanistic interpretability experiments to reveal the inner mechanism of LLMs' compositional reasoning, involving subproblem decomposition, subproblem inference, and summarizing subproblem conclusions. Through analysis based on CryptoBench, we highlight the value of independently studying compositional reasoning and emphasize the need to enhance the compositional reasoning capabilities of LLMs.

large language model, machine learning, qwen2, (17 more...)

arXiv: 2502.07813

Country: Asia (0.45)

Genre: Research Report > New Finding (0.67)

Industry: Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

Evaluating and Aligning CodeLLMs on Human Preference

Yang, Jian, Yang, Jiaxi, Jin, Ke, Miao, Yibo, Zhang, Lei, Yang, Liqun, Cui, Zeyu, Zhang, Yichang, Hui, Binyuan, Lin, Junyang

arXiv.org Artificial Intelligence, Dec-6-2024

Code large language models (codeLLMs) have made significant strides in code generation. Most previous code-related benchmarks, which consist of various programming exercises along with the corresponding test cases, are used as a common measure to evaluate the performance and capabilities of code LLMs. However, current code LLMs focus on synthesizing the correct code snippet, ignoring the alignment with human preferences, where the query should be sampled from practical application scenarios and the model-generated responses should satisfy human preference. To bridge the gap between the model-generated response and human preference, we present a rigorous human-curated benchmark, CodeArena, to emulate the complexity and diversity of real-world coding tasks. It comprises 397 high-quality samples spanning 40 categories and 44 programming languages, carefully curated from user queries. Further, we propose a diverse synthetic instruction corpus, SynCode-Instruct (nearly 20B tokens), built by scaling instructions from the website, to verify the effectiveness of large-scale synthetic instruction fine-tuning, where Qwen2.5-SynCoder, trained entirely on synthetic instruction data, can achieve top-tier performance among open-source code LLMs. The results reveal performance differences between execution-based benchmarks and CodeArena. Our systematic experiments with CodeArena on 40+ LLMs reveal a notable performance gap between open SOTA code LLMs (e.g., Qwen2.5-Coder) and proprietary LLMs (e.g., OpenAI o1), underscoring the importance of human preference alignment (https://codearenaeval.github.io/).

large language model, machine learning, natural language, (20 more...)

arXiv: 2412.0521

Country:

Asia (0.93)
North America > United States (0.28)
Europe > Austria (0.28)

Genre: Research Report (0.84)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

MdEval: Massively Multilingual Code Debugging

Liu, Shukai, Chai, Linzheng, Yang, Jian, Shi, Jiajun, Zhu, He, Wang, Liran, Jin, Ke, Zhang, Wei, Zhu, Hualei, Guo, Shuyue, Sun, Tao, Liu, Jiaheng, Duan, Yunlong, Hao, Yu, Yang, Liqun, Niu, Guanglin, Zhang, Ge, Li, Zhoujun

arXiv.org Artificial Intelligence, Nov-4-2024

Code large language models (LLMs) have made significant progress in code debugging by directly generating the correct code based on the buggy code snippet. Programming benchmarks, typically consisting of buggy code snippets and their associated test cases, are used to assess the debugging capabilities of LLMs. However, many existing benchmarks primarily focus on Python and are often limited in terms of language diversity (e.g., DebugBench and DebugEval). To advance the field of multilingual debugging with LLMs, we propose the first massively multilingual debugging benchmark, which includes 3.6K test samples across 18 programming languages and covers the automated program repair (APR) task, the code review (CR) task, and the bug identification (BI) task. Further, we introduce the debugging instruction corpus MDEVAL-INSTRUCT, built by injecting bugs into correct multilingual queries and solutions (xDebugGen). In addition, a multilingual debugger, xDebugCoder, is trained on MDEVAL-INSTRUCT as a strong baseline specifically to handle the bugs of a wide range of programming languages (e.g., "Missing Mut" in Rust and "Misused Macro Definition" in C). Our extensive experiments on MDEVAL reveal a notable performance gap between open-source models and closed-source LLMs (e.g., GPT and Claude series), highlighting huge room for improvement in multilingual code debugging scenarios.

arxiv preprint arxiv, large language model, machine learning, (19 more...)

arXiv: 2411.0231

Country: North America > United States (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

UniCoder: Scaling Code Large Language Model via Universal Code

Sun, Tao, Chai, Linzheng, Yang, Jian, Yin, Yuwei, Guo, Hongcheng, Liu, Jiaheng, Wang, Bing, Yang, Liqun, Li, Zhoujun

arXiv.org Artificial Intelligence, Jun-24-2024

Intermediate reasoning or acting steps have successfully improved large language models (LLMs) for handling various downstream natural language processing (NLP) tasks. When applying LLMs for code generation, recent works mainly focus on directing the models to articulate intermediate natural-language reasoning steps, as in chain-of-thought (CoT) prompting, and then output code with the natural language or other structured intermediate steps. However, such output is not suitable for code translation or generation tasks, since standard CoT differs from code in its logical structures and forms of expression. In this work, we introduce the universal code (UniCode) as the intermediate representation. It is a description of algorithm steps using a mix of conventions of programming languages, such as the assignment operator, conditional operator, and loop. Hence, we collect an instruction dataset, UniCoder-Instruct, to train our model UniCoder on multi-task learning objectives. UniCoder-Instruct comprises natural-language questions, code solutions, and the corresponding universal code. The alignment between the intermediate universal code representation and the final code solution significantly improves the quality of the generated code. The experimental results demonstrate that UniCoder with the universal code outperforms the previous prompting methods by a large margin, showcasing the effectiveness of the structural clues in pseudo-code.

large language model, machine learning, preprint arxiv, (18 more...)

arXiv: 2406.16441

Country:

Asia (0.93)
North America > United States > Hawaii (0.14)
North America > United States > California (0.14)
(2 more...)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

m3P: Towards Multimodal Multilingual Translation with Multimodal Prompt

Yang, Jian, Guo, Hongcheng, Yin, Yuwei, Bai, Jiaqi, Wang, Bing, Liu, Jiaheng, Liang, Xinnian, Chai, Linzheng, Yang, Liqun, Li, Zhoujun

arXiv.org Artificial Intelligence, Mar-26-2024

Multilingual translation supports multiple translation directions by projecting all languages into a shared space, but the translation quality is undermined by the difference between languages in the text-only modality, especially when the number of languages is large. To bridge this gap, we introduce visual context as the universal language-independent representation to facilitate multilingual translation. In this paper, we propose a framework to leverage the multimodal prompt to guide the Multimodal Multilingual neural Machine Translation (m3P), which aligns the representations of different languages with the same meaning and generates the conditional vision-language memory for translation. We construct a multilingual multimodal instruction dataset (InstrMulti102) to support 102 languages. Our method aims to minimize the representation distance of different languages by regarding the image as a central language. Experimental results show that m3P outperforms previous text-only baselines and multilingual multimodal methods by a large margin. Furthermore, the probing experiments validate the effectiveness of our method in enhancing translation in low-resource and massively multilingual scenarios.

artificial intelligence, natural language, translation, (15 more...)

arXiv: 2403.17556

Country:

Europe (1.00)
Asia (0.93)
North America > United States > New York (0.14)
(3 more...)

Genre: Research Report (0.70)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

MT4CrossOIE: Multi-stage Tuning for Cross-lingual Open Information Extraction

Li, Tongliang, Wang, Zixiang, Chai, Linzheng, Yang, Jian, Bai, Jiaqi, Yin, Yuwei, Liu, Jiaheng, Guo, Hongcheng, Yang, Liqun, el-abidine, Hebboul Zine, Li, Zhoujun

arXiv.org Artificial Intelligence, Sep-20-2023

Cross-lingual open information extraction aims to extract structured information from raw text across multiple languages. Previous work uses a shared cross-lingual pre-trained model to handle the different languages but underuses the potential of the language-specific representation. In this paper, we propose an effective multi-stage tuning framework called MT4CrossOIE, designed for enhancing cross-lingual open information extraction by injecting language-specific knowledge into the shared model. Specifically, the cross-lingual pre-trained model is first tuned in a shared semantic space (e.g., embedding matrix) in the fixed encoder and then other components are optimized in the second stage. After enough training, we freeze the pre-trained model and tune the multiple extra low-rank language-specific modules using mixture-of-LoRAs for model-based cross-lingual transfer. In addition, we leverage two-stage prompting to encourage the large language model (LLM) to annotate the multi-lingual raw data for data-based cross-lingual transfer. The model is trained with multi-lingual objectives on our proposed dataset OpenIE4++ by combining the model-based and data-based transfer techniques. Experimental results on various benchmarks emphasize the importance of aggregating multiple plug-and-play language-specific modules and demonstrate the effectiveness of MT4CrossOIE in cross-lingual OIE (https://github.com/CSJianYang/Multilingual-Multimodal-NLP).

artificial intelligence, large language model, natural language, (3 more...)

arXiv: 2308.06552

Genre: Research Report (0.40)

Technology:

Information Technology > Data Science > Data Mining > Text Mining (0.80)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.80)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.53)

OWL: A Large Language Model for IT Operations

Guo, Hongcheng, Yang, Jian, Liu, Jiaheng, Yang, Liqun, Chai, Linzheng, Bai, Jiaqi, Peng, Junran, Hu, Xiaorong, Chen, Chao, Zhang, Dongfeng, Shi, Xu, Zheng, Tieqiao, Zheng, Liangfan, Zhang, Bo, Xu, Ke, Li, Zhoujun

arXiv.org Artificial Intelligence, Sep-17-2023

With the rapid development of IT operations, it has become increasingly crucial to efficiently manage and analyze large volumes of data for practical applications. The techniques of Natural Language Processing (NLP) have shown remarkable capabilities for various tasks, including named entity recognition, machine translation and dialogue systems. Recently, Large Language Models (LLMs) have achieved significant improvements across various NLP downstream tasks. However, there is a lack of specialized LLMs for IT operations. In this paper, we introduce Owl, a large language model trained on our collected Owl-Instruct dataset with a wide range of IT-related information, where the mixture-of-adapter strategy is proposed to improve the parameter-efficient tuning across different domains or tasks. Furthermore, we evaluate the performance of Owl on our Owl-Bench and on open IT-related benchmarks. Owl demonstrates superior performance on IT tasks, outperforming existing models by significant margins. Moreover, we hope that the findings of our work will provide more insights to revolutionize the techniques of IT operations with specialized LLMs.

large language model, machine learning, natural language, (18 more...)

arXiv: 2309.09298

Country:

North America > United States > Hawaii (0.14)
North America > United States > California (0.14)
Europe > Austria > Vienna (0.14)
Asia > Middle East > UAE (0.14)

Genre: Research Report (0.64)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

QURG: Question Rewriting Guided Context-Dependent Text-to-SQL Semantic Parsing

Chai, Linzheng, Xiao, Dongling, Yang, Jian, Yang, Liqun, Zhang, Qian-Wen, Cao, Yunbo, Li, Zhoujun, Yan, Zhao

arXiv.org Artificial Intelligence, May-16-2023

Context-dependent Text-to-SQL aims to translate multi-turn natural language questions into SQL queries. Although various methods have exploited context-dependence information implicitly for contextual SQL parsing, there are few attempts to explicitly address the dependencies between the current question and the question context. This paper presents QURG, a novel Question Rewriting Guided approach to help the models achieve adequate contextual understanding. Specifically, we first train a question rewriting model to complete the current question based on the question context, and convert them into a rewriting edit matrix. We further design a two-stream matrix encoder to jointly model the rewriting relations between question and context, and the schema linking relations between natural language and structured schema. Experimental results show that QURG significantly improves performance on two large-scale context-dependent datasets, SParC and CoSQL, especially for hard and long-turn questions.

artificial intelligence, computational linguistic, proceedings, (14 more...)

arXiv: 2305.06655

Country:

Europe (0.47)
Asia > China (0.29)

Genre: Research Report (0.84)

Technology: Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)

GanLM: Encoder-Decoder Pre-training with an Auxiliary Discriminator

Yang, Jian, Ma, Shuming, Dong, Li, Huang, Shaohan, Huang, Haoyang, Yin, Yuwei, Zhang, Dongdong, Yang, Liqun, Wei, Furu, Li, Zhoujun

arXiv.org Artificial Intelligence, May-9-2023

Pre-trained models have achieved remarkable success in natural language processing (NLP). However, existing pre-training methods underutilize the benefits of language understanding for generation. Inspired by the idea of Generative Adversarial Networks (GANs), we propose a GAN-style model for encoder-decoder pre-training by introducing an auxiliary discriminator, unifying the ability of language understanding and generation in a single model. Our model, named GanLM, is trained with two pre-training objectives: replaced token detection and replaced token denoising. Specifically, given masked source sentences, the generator outputs the target distribution and the discriminator predicts whether the target tokens sampled from the distribution are incorrect. The target sentence is replaced with misclassified tokens to construct noisy previous context, which is used to generate the gold sentence. In general, both tasks improve the ability of language understanding and generation by selectively using the denoising data. Extensive experiments on language generation benchmarks show that GanLM with the powerful language understanding capability outperforms various strong pre-trained language models (PLMs) and achieves state-of-the-art performance.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv: 2212.10218

Country: Asia (0.27)

Genre: Research Report (0.63)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.68)
