
Collaborating Authors

 Ouyang, Yuanxin


Enhancing Input-Label Mapping in In-Context Learning with Contrastive Decoding

arXiv.org Artificial Intelligence

Large language models (LLMs) excel at a range of tasks through in-context learning (ICL), where only a few task examples guide their predictions. However, prior research highlights that LLMs often overlook input-label mapping information in ICL, relying more on their pre-trained knowledge. To address this issue, we introduce In-Context Contrastive Decoding (ICCD), a novel method that emphasizes input-label mapping by contrasting the output distributions between positive and negative in-context examples. Experiments on 7 natural language understanding (NLU) tasks show that ICCD brings consistent and significant improvements (up to +2.1 on average) across LLMs of 6 different scales without requiring additional training. Our approach is versatile, enhancing performance with various demonstration selection methods, which demonstrates its broad applicability and effectiveness. The code and scripts will be publicly released.
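
The abstract does not spell out the exact decoding rule, but the core idea of contrasting output distributions from positive and negative in-context examples can be sketched as follows. The combination formula, the alpha weight, the prompt construction, and the "gpt2" model are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of in-context contrastive decoding (not the authors' code).
# Assumption: the next-token distribution obtained with correctly labeled (positive)
# demonstrations is contrasted against one obtained with mislabeled (negative)
# demonstrations, amplifying whatever the input-label mapping contributes.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder model
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

def next_token_logits(prompt: str) -> torch.Tensor:
    ids = tok(prompt, return_tensors="pt").input_ids
    with torch.no_grad():
        return model(ids).logits[0, -1]

def iccd_scores(pos_prompt: str, neg_prompt: str, alpha: float = 1.0) -> torch.Tensor:
    """Contrast positive and negative in-context prompts in logit space."""
    pos = next_token_logits(pos_prompt)   # demonstrations with correct labels
    neg = next_token_logits(neg_prompt)   # same inputs with wrong/shuffled labels
    return pos + alpha * (pos - neg)      # emphasize the input-label mapping signal

# Toy sentiment example: identical test input, differing demonstration labels.
pos_prompt = "Review: great movie Sentiment: positive\nReview: boring plot Sentiment:"
neg_prompt = "Review: great movie Sentiment: negative\nReview: boring plot Sentiment:"
scores = iccd_scores(pos_prompt, neg_prompt)
print(tok.decode([scores.argmax().item()]))
```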


AAKT: Enhancing Knowledge Tracing with Alternate Autoregressive Modeling

arXiv.org Artificial Intelligence

Knowledge Tracing (KT) aims to predict students' future performance based on their past exercises and additional information in educational settings. KT has received significant attention since it facilitates personalized learning experiences. Meanwhile, autoregressive modeling over the sequence of past exercises has proven effective for this task. One of the primary challenges in autoregressive modeling for Knowledge Tracing is effectively representing the anterior (pre-response) and posterior (post-response) states of learners across exercises. Existing methods often employ complex model architectures to update learner states using question and response records. In this study, we propose a novel perspective on the knowledge tracing task by treating it as a generative process, consistent with the principles of autoregressive models. We demonstrate that knowledge states can be directly represented through autoregressive encodings of a question-response alternate sequence, where the model generates the most probable representation in hidden-state space by analyzing historical interactions. This approach underpins our framework, termed Alternate Autoregressive Knowledge Tracing (AAKT). Additionally, we incorporate supplementary educational information, such as question-related skills, into our framework through an auxiliary task, and include extra exercise details, such as response time, as additional inputs. Our proposed framework is implemented using advanced autoregressive technologies from Natural Language Generation (NLG) for both training and prediction. Empirical evaluations on four real-world KT datasets indicate that AAKT consistently outperforms all baseline models in terms of AUC, ACC, and RMSE. Furthermore, extensive ablation studies and visualized analyses validate the effectiveness of the key components of AAKT.
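
A minimal sketch of the question-response alternate sequence idea, assuming a causal Transformer over interleaved question and response tokens. The vocabulary layout, hyperparameters, and prediction head are assumptions for illustration, not the AAKT implementation.

```python
# Sketch: interleave questions and responses into one autoregressive sequence and
# predict each response from the hidden state at the preceding question position.
import torch
import torch.nn as nn

class AlternateKT(nn.Module):
    def __init__(self, num_questions: int, d_model: int = 64, n_layers: int = 2):
        super().__init__()
        # tokens 0..num_questions-1 are questions; num_questions + {0, 1} are responses
        self.embed = nn.Embedding(num_questions + 2, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, 1)  # probability of a correct response
        self.num_questions = num_questions

    def forward(self, questions: torch.Tensor, responses: torch.Tensor) -> torch.Tensor:
        # Build the alternate sequence q1, r1, q2, r2, ...
        resp_tok = self.num_questions + responses
        seq = torch.stack([questions, resp_tok], dim=2).flatten(1)  # (B, 2T)
        T = seq.size(1)
        causal = torch.triu(torch.full((T, T), float("-inf")), diagonal=1)
        h = self.encoder(self.embed(seq), mask=causal)
        # Hidden states at question positions (even indices) see only past interactions
        # plus the current question, i.e. the anterior (pre-response) learner state.
        return torch.sigmoid(self.head(h[:, 0::2, :])).squeeze(-1)

# Toy usage: 2 students, 3 exercises each.
model = AlternateKT(num_questions=10)
q = torch.randint(0, 10, (2, 3))
r = torch.randint(0, 2, (2, 3))
print(model(q, r).shape)  # (2, 3) predicted correctness probabilities
```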


Explainable Few-shot Knowledge Tracing

arXiv.org Artificial Intelligence

Knowledge tracing (KT), which aims to infer students' mastery of knowledge from their exercise records and predict their performance on future test questions, is a critical task in educational assessment. While researchers have achieved tremendous success with the rapid development of deep learning techniques, current knowledge tracing formulations diverge from real-world teaching scenarios: they rely heavily on extensive student data and predict only numerical performance, whereas teachers assess students' knowledge state from limited practice and provide explanatory feedback. To fill this gap, we explore a new task formulation: Explainable Few-shot Knowledge Tracing. Leveraging the powerful reasoning and generation abilities of large language models (LLMs), we propose a cognition-guided framework that tracks student knowledge from a few student records while providing natural language explanations. Experimental results on three widely used datasets show that LLMs can perform comparably to, or better than, competitive deep knowledge tracing methods. We also discuss potential directions and call for future improvements on related topics.
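
To make the task formulation concrete, here is one way the few-shot, explanation-seeking setting could be posed to an LLM. The record format, instructions, and wording are illustrative assumptions, not the paper's cognition-guided framework.

```python
# Illustrative sketch only: framing explainable few-shot knowledge tracing as a prompt.
records = [
    {"question": "Solve 3x + 2 = 11", "skill": "linear equations", "correct": True},
    {"question": "Factor x^2 - 9", "skill": "factoring", "correct": False},
]
target = {"question": "Factor x^2 - 4x + 4", "skill": "factoring"}

history = "\n".join(
    f"- {r['question']} (skill: {r['skill']}): {'correct' if r['correct'] else 'incorrect'}"
    for r in records
)
prompt = (
    "You are a tutor tracking a student's knowledge state.\n"
    f"Past attempts:\n{history}\n\n"
    f"Will the student answer '{target['question']}' (skill: {target['skill']}) correctly?\n"
    "Answer yes or no, then explain your reasoning in one or two sentences."
)
print(prompt)  # send to any instruction-following LLM
```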


ProCQA: A Large-scale Community-based Programming Question Answering Dataset for Code Search

arXiv.org Artificial Intelligence

Retrieval-based code question answering seeks to match user queries in natural language to relevant code snippets. Previous approaches typically rely on pre-training models with crafted bimodal and unimodal datasets to align text and code representations. In this paper, we introduce ProCQA, a large-scale programming question answering dataset extracted from the StackOverflow community, offering naturally structured mixed-modal QA pairs. To validate its effectiveness, we propose a modality-agnostic contrastive pre-training approach to improve the alignment of text and code representations in current code language models. Compared to previous models that primarily employ bimodal and unimodal pairs extracted from CodeSearchNet for pre-training, our model exhibits significant performance improvements across a wide range of code retrieval benchmarks.
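
A minimal sketch of modality-agnostic contrastive pre-training as described: one encoder handles question and answer sides regardless of whether they contain text, code, or a mix, and in-batch negatives drive an InfoNCE-style loss. The encoder checkpoint, pooling choice, and temperature are assumptions, not the released ProCQA setup.

```python
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

name = "microsoft/codebert-base"  # placeholder encoder
tok = AutoTokenizer.from_pretrained(name)
enc = AutoModel.from_pretrained(name)

def embed(texts):
    batch = tok(texts, padding=True, truncation=True, return_tensors="pt")
    out = enc(**batch).last_hidden_state[:, 0]  # [CLS]-token pooling
    return F.normalize(out, dim=-1)

def info_nce(queries, positives, temperature: float = 0.05):
    q, p = embed(queries), embed(positives)
    logits = q @ p.T / temperature          # every other pair in the batch is a negative
    labels = torch.arange(len(queries))     # the matching answer sits on the diagonal
    return F.cross_entropy(logits, labels)

# Toy mixed-modal QA pairs: natural-language questions paired with code-bearing answers.
loss = info_nce(
    ["How do I reverse a list in Python?", "Sort a dict by value"],
    ["my_list[::-1] reverses the list.", "sorted(d.items(), key=lambda kv: kv[1])"],
)
print(loss.item())
```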


A Review of Data Mining in Personalized Education: Current Trends and Future Prospects

arXiv.org Artificial Intelligence

Personalized education, tailored to individual student needs, leverages educational technology and artificial intelligence (AI) in the digital age to enhance learning effectiveness. The integration of AI in educational platforms provides insights into academic performance, learning preferences, and behaviors, optimizing the personal learning process. Driven by data mining techniques, personalized education not only benefits students but also provides educators and institutions with tools to craft customized learning experiences. To offer a comprehensive review of recent advancements in personalized educational data mining, this paper focuses on four primary scenarios: educational recommendation, cognitive diagnosis, knowledge tracing, and learning analysis. This paper presents a structured taxonomy for each area, compiles commonly used datasets, and identifies future research directions, emphasizing the role of data mining in enhancing personalized education and paving the way for future exploration and innovation.


Revisiting Demonstration Selection Strategies in In-Context Learning

arXiv.org Artificial Intelligence

Large language models (LLMs) have shown an impressive ability to perform a wide range of tasks using in-context learning (ICL), where a few examples are used to describe a task to the model. However, the performance of ICL varies significantly with the choice of demonstrations, and it is still unclear why this happens or what factors influence the choice. In this work, we first revisit the factors contributing to this variance from both the data and the model perspective, and find that the choice of demonstration is both data- and model-dependent. We further propose a data- and model-dependent demonstration selection method, TopK + ConE, based on the assumption that the performance of a demonstration positively correlates with its contribution to the model's understanding of the test samples, resulting in a simple and effective recipe for ICL. Empirically, our method yields consistent improvements on both language understanding and generation tasks across different model scales. Further analyses confirm that, besides its generality and stability under different circumstances, our method provides a unified explanation for the effectiveness of previous methods. Code will be released.
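
A hedged sketch of a two-stage, data- and model-dependent selection recipe in the spirit described: retrieve the K most similar demonstrations, then rerank them by how much each one reduces the model's uncertainty over the test input. The uncertainty proxy (per-token loss), the retriever, and the models named here are assumptions, not the paper's exact method.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from sentence_transformers import SentenceTransformer, util

retriever = SentenceTransformer("all-MiniLM-L6-v2")   # placeholder retriever
tok = AutoTokenizer.from_pretrained("gpt2")           # placeholder inference LM
lm = AutoModelForCausalLM.from_pretrained("gpt2")

def lm_loss(text: str) -> float:
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        return lm(ids, labels=ids).loss.item()

def select_demonstrations(test_input, pool, k=8, m=2):
    # Stage 1 (data-dependent): TopK candidates by embedding similarity.
    k = min(k, len(pool))
    scores = util.cos_sim(retriever.encode([test_input]), retriever.encode(pool))[0]
    topk = [pool[i] for i in scores.topk(k).indices]
    # Stage 2 (model-dependent): keep demonstrations that most reduce the model's
    # uncertainty (here, language-modeling loss) over the test input.
    base = lm_loss(test_input)
    gains = [(base - lm_loss(d + "\n" + test_input), d) for d in topk]
    return [d for _, d in sorted(gains, reverse=True)[:m]]
```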


Towards Making the Most of ChatGPT for Machine Translation

arXiv.org Artificial Intelligence

ChatGPT shows remarkable capabilities for machine translation (MT). Several prior studies have shown that it achieves results comparable to commercial systems for high-resource languages, but lags behind on complex tasks, e.g., low-resource and distant-language-pair translation. However, these studies usually adopt simple prompts that cannot fully elicit ChatGPT's capability. In this paper, we aim to further mine ChatGPT's translation ability by revisiting several aspects: temperature, task information, and domain information, and correspondingly propose an optimal temperature setting and two simple but effective prompts: Task-Specific Prompts (TSP) and Domain-Specific Prompts (DSP). We show that: 1) the performance of ChatGPT depends largely on temperature, and a lower temperature usually achieves better performance; 2) emphasizing the task information further improves ChatGPT's performance, particularly on complex MT tasks; 3) introducing domain information elicits ChatGPT's generalization ability and improves its performance in the specific domain; 4) ChatGPT tends to generate hallucinations for non-English-centric MT tasks, which can be partially addressed by our proposed prompts but still needs attention from the MT/NLP community. We also explore the effects of advanced in-context learning strategies and report a negative but interesting observation: the powerful chain-of-thought prompt leads to word-by-word translation behavior, bringing significant translation degradation.
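
A brief sketch of how the low-temperature, task-aware, and domain-aware prompting ideas could be combined via the OpenAI Python SDK. The prompt wording, default temperature of 0.1, and model name are illustrative assumptions, not the paper's verbatim templates or settings.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def translate(text, src, tgt, domain=None, temperature=0.1):
    # Task-Specific Prompt (TSP): state explicitly that this is a translation task.
    system = f"You are a machine translation system. Translate from {src} to {tgt}."
    # Domain-Specific Prompt (DSP): optionally pin the domain to guide terminology.
    if domain:
        system += f" The sentence is from the {domain} domain."
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",
        temperature=temperature,  # lower temperature tends to translate more faithfully
        messages=[
            {"role": "system", "content": system},
            {"role": "user", "content": text},
        ],
    )
    return resp.choices[0].message.content

# Example: low-temperature, domain-aware translation of a medical sentence.
# print(translate("Der Patient klagt über Brustschmerzen.", "German", "English", domain="medical"))
```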


Question/Answer Matching for CQA System via Combining Lexical and Sequential Information

AAAI Conferences

Community-based Question Answering (CQA) has become popular on knowledge-sharing sites since it allows users to get answers to complex, detailed, and personal questions directly from other users. Large archives of historical questions and associated answers have accumulated, and retrieving the historical answers that best match a new question is an essential component of a CQA service. Most state-of-the-art approaches are based on bag-of-words models, which have proven successful in a range of text matching tasks but are insufficient for capturing the important word sequence information in short text matching. In this paper, a new architecture is proposed to more effectively model the complicated matching relations between questions and answers. It utilizes a similarity matrix that contains both lexical and sequential information; this information is then fed into a deep architecture to find potentially suitable answers. The experimental study shows its potential for improving the matching accuracy of questions and answers.
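
A minimal sketch of the similarity-matrix idea: each cell compares one question word with one answer word, the grid layout preserves word order (the sequential information), and a small convolutional network scores the matrix. The embedding source, network shape, and sizes are assumptions for illustration, not the paper's architecture.

```python
import torch
import torch.nn as nn

class MatrixMatcher(nn.Module):
    def __init__(self, vocab_size: int, dim: int = 50):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)  # stand-in for pretrained word vectors
        self.conv = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveMaxPool2d((4, 4)),
        )
        self.score = nn.Linear(8 * 4 * 4, 1)

    def forward(self, q_ids: torch.Tensor, a_ids: torch.Tensor) -> torch.Tensor:
        q, a = self.embed(q_ids), self.embed(a_ids)            # (B, Lq, d), (B, La, d)
        # Word-by-word similarity matrix; rows follow question order, columns answer order.
        sim = torch.einsum("bqd,bad->bqa", q, a).unsqueeze(1)  # (B, 1, Lq, La)
        feats = self.conv(sim).flatten(1)
        return self.score(feats).squeeze(-1)                   # matching score per pair

# Toy usage: two question-answer pairs with different lengths.
model = MatrixMatcher(vocab_size=1000)
q = torch.randint(0, 1000, (2, 7))
a = torch.randint(0, 1000, (2, 12))
print(model(q, a))  # higher score = better match
```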