AITopics | Instructional Material

Instructional Material

ReliCD: A Reliable Cognitive Diagnosis Framework with Confidence Awareness

Zhang, Yunfei, Qin, Chuan, Shen, Dazhong, Ma, Haiping, Zhang, Le, Zhang, Xingyi, Zhu, Hengshu

arXiv.org Artificial Intelligence, Dec-29-2023

During the past few decades, cognitive diagnostics modeling, which is capable of quantifying the learning status and knowledge mastery levels of students, has attracted increasing attention in computational education communities. Indeed, recent advances in neural networks have greatly enhanced the performance of traditional cognitive diagnosis models through learning deep representations of students and exercises. Nevertheless, existing approaches often suffer from overconfidence in predicting students' mastery levels, which is primarily caused by the unavoidable noise and sparsity in realistic student-exercise interaction data and severely hinders the educational application of diagnostic feedback. To address this, we propose a novel Reliable Cognitive Diagnosis (ReliCD) framework, which can quantify the confidence of the diagnosis feedback and is flexible for different cognitive diagnostic functions. Specifically, we first propose a Bayesian method to explicitly estimate the state uncertainty of different knowledge concepts for students, which enables the confidence quantification of diagnostic feedback. In particular, to account for potential differences, we suggest modeling individual prior distributions for the latent variables of different ability concepts using a pre-trained model. Additionally, we introduce a logical hypothesis for ranking confidence levels. Along this line, we design a novel calibration loss to optimize the confidence parameters by modeling the process of student performance prediction. Finally, extensive experiments on four real-world datasets clearly demonstrate the effectiveness of our ReliCD framework.

diagnostic feedback, knowledge concept, student, (17 more...)

arXiv.org Artificial Intelligence

2401.10749

Country:

Asia > China > Anhui Province (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Genre:

Instructional Material (0.93)
Research Report > Experimental Study (0.46)

Industry:

Education > Educational Setting (0.68)
Education > Assessment & Standards > Student Performance (0.48)
Health & Medicine > Diagnostic Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Natural Language (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
(2 more...)

ChatEd: A Chatbot Leveraging ChatGPT for an Enhanced Learning Experience in Higher Education

Wang, Kevin, Ramos, Jason, Lawrence, Ramon

arXiv.org Artificial Intelligence, Dec-29-2023

With the rapid evolution of Natural Language Processing (NLP), Large Language Models (LLMs) like ChatGPT have emerged as powerful tools capable of transforming various sectors. Their vast knowledge base and dynamic interaction capabilities represent significant potential in improving education by operating as a personalized assistant. However, the possibility of generating incorrect, biased, or unhelpful answers is a key challenge to resolve when deploying LLMs in an education context. This work introduces an innovative architecture that combines the strengths of ChatGPT with a traditional information-retrieval-based chatbot framework to offer enhanced student support in higher education. Our empirical evaluations underscore the high promise of this approach.

chatbot, chated, information, (16 more...)

arXiv.org Artificial Intelligence

2401.00052

Country:

North America > Canada > British Columbia > Regional District of Central Okanagan > Kelowna (0.05)
Oceania > Australia (0.04)

Genre:

Instructional Material > Course Syllabus & Notes (1.00)
Research Report (0.64)

Industry:

Education > Educational Setting > Higher Education (0.71)
Education > Educational Technology > Educational Software (0.47)
Education > Educational Setting > Online (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Building Efficient Universal Classifiers with Natural Language Inference

Laurer, Moritz, van Atteveldt, Wouter, Casas, Andreu, Welbers, Kasper

arXiv.org Artificial Intelligence, Dec-29-2023

Generative Large Language Models (LLMs) have become the mainstream choice for fewshot and zeroshot learning thanks to the universality of text generation. Many users, however, do not need the broad capabilities of generative LLMs when they only want to automate a classification task. Smaller BERT-like models can also learn universal tasks, which allow them to do any text classification task without requiring fine-tuning (zeroshot classification) or to learn new tasks with only a few examples (fewshot), while being significantly more efficient than generative LLMs. This paper (1) explains how Natural Language Inference (NLI) can be used as a universal classification task that follows principles similar to instruction fine-tuning of generative LLMs, (2) provides a step-by-step guide with reusable Jupyter notebooks for building a universal classifier, and (3) shares the resulting universal classifier that is trained on 33 datasets with 389 diverse classes. Parts of the code we share have been used to train our older zeroshot classifiers that have been downloaded more than 55 million times via the Hugging Face Hub as of December 2023. Our new classifier improves zeroshot performance by 9.4%.

classification, dataset, hypothesis, (12 more...)

arXiv.org Artificial Intelligence

2312.17543

Country:

Europe > Germany (0.28)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(5 more...)

Genre:

Research Report (0.64)
Instructional Material > Training Manual (0.48)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Generative AI for Math: Part I -- MathPile: A Billion-Token-Scale Pretraining Corpus for Math

Wang, Zengzhi, Xia, Rui, Liu, Pengfei

arXiv.org Artificial Intelligence, Dec-28-2023

High-quality, large-scale corpora are the cornerstone of building foundation models. In this work, we introduce MathPile, a diverse and high-quality math-centric corpus comprising about 9.5 billion tokens. Throughout its creation, we adhered to the principle of "less is more", firmly believing in the supremacy of data quality over quantity, even in the pre-training phase. Our meticulous data collection and processing efforts included a complex suite of preprocessing, prefiltering, language identification, cleaning, filtering, and deduplication, ensuring the high quality of our corpus. Furthermore, we performed data contamination detection on downstream benchmark test sets to eliminate duplicates. We hope our MathPile can help to enhance the mathematical reasoning abilities of language models. We plan to open-source different versions of MathPile with the scripts used for processing, to facilitate future developments in this field.

corpus, dataset, math, (16 more...)

arXiv.org Artificial Intelligence

2312.1712

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Canada > Ontario > Toronto (0.14)
North America > Canada > Saskatchewan (0.04)
(16 more...)

Genre:

Research Report (1.00)
Instructional Material (0.93)

Industry:

Health & Medicine (0.67)
Education (0.67)
Transportation (0.46)
Law (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.50)

PanGu-Draw: Advancing Resource-Efficient Text-to-Image Synthesis with Time-Decoupled Training and Reusable Coop-Diffusion

Lu, Guansong, Guo, Yuanfan, Han, Jianhua, Niu, Minzhe, Zeng, Yihan, Xu, Songcen, Huang, Zeyi, Zhong, Zhao, Zhang, Wei, Xu, Hang

arXiv.org Artificial Intelligence, Dec-28-2023

Current large-scale diffusion models represent a giant leap forward in conditional image synthesis, capable of interpreting diverse cues like text, human poses, and edges. However, their reliance on substantial computational resources and extensive data collection remains a bottleneck. On the other hand, the integration of existing diffusion models, each specialized for different controls and operating in unique latent spaces, poses a challenge due to incompatible image resolutions and latent space embedding structures, hindering their joint use. Addressing these constraints, we present "PanGu-Draw", a novel latent diffusion model designed for resource-efficient text-to-image synthesis that adeptly accommodates multiple control signals. We first propose a resource-efficient Time-Decoupling Training Strategy, which splits the monolithic text-to-image model into structure and texture generators. Each generator is trained using a regimen that maximizes data utilization and computational efficiency, cutting data preparation by 48% and reducing training resources by 51%. Secondly, we introduce "Coop-Diffusion", an algorithm that enables the cooperative use of various pre-trained diffusion models with different latent spaces and predefined resolutions within a unified denoising process. This allows for multi-control image synthesis at arbitrary resolutions without the necessity for additional data or retraining. Empirical validations of PanGu-Draw show its exceptional prowess in text-to-image and multi-control image generation, suggesting a promising direction for future model training efficiencies and generation versatility. The largest 5B T2I PanGu-Draw model is released on the Ascend platform. Project page: https://pangu-draw.github.io

diffusion model, efficiency, pangu-draw, (13 more...)

arXiv.org Artificial Intelligence

2312.16486

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
(2 more...)

Genre:

Research Report (0.82)
Instructional Material (0.66)

Industry: Leisure & Entertainment > Sports (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Matrix Decomposition and Applications

Lu, Jun

arXiv.org Artificial Intelligence, Dec-28-2023

In 1954, Alston S. Householder published Principles of Numerical Analysis, one of the first modern treatments on matrix decomposition that favored a (block) LU decomposition, the factorization of a matrix into the product of lower and upper triangular matrices. Today, matrix decomposition has become a core technology in machine learning, largely due to the development of the back propagation algorithm in fitting a neural network. The sole aim of this survey is to give a self-contained introduction to concepts and mathematical tools in numerical linear algebra and matrix analysis in order to seamlessly introduce matrix decomposition techniques and their applications in subsequent sections. However, we clearly realize our inability to cover all the useful and interesting results concerning matrix decomposition, given the limited scope of this discussion; for example, we omit the separate analysis of the Euclidean space, Hermitian space, Hilbert space, and results in the complex domain. We refer the reader to literature in the field of linear algebra for a more detailed introduction to the related fields.
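
The LU factorization named in this abstract can be illustrated with a minimal sketch. It assumes a small square matrix with nonzero pivots and uses the plain (scalar, unpivoted) Doolittle scheme rather than the block variant the abstract mentions; the function names `lu_decompose` and `matmul` are hypothetical and do not come from the survey.

```python
def lu_decompose(a):
    """Doolittle LU factorization without pivoting.

    Returns (lower, upper) such that a = lower * upper, where lower is
    unit lower triangular and upper is upper triangular.
    Assumption: every pivot is nonzero (no row exchanges are performed).
    """
    n = len(a)
    lower = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    upper = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Fill row i of the upper factor.
        for j in range(i, n):
            upper[i][j] = a[i][j] - sum(lower[i][k] * upper[k][j] for k in range(i))
        if upper[i][i] == 0.0:
            raise ZeroDivisionError("zero pivot; this sketch does no row exchanges")
        # Fill column i of the lower factor, below the diagonal.
        for j in range(i + 1, n):
            partial = sum(lower[j][k] * upper[k][i] for k in range(i))
            lower[j][i] = (a[j][i] - partial) / upper[i][i]
    return lower, upper


def matmul(x, y):
    """Plain triple-loop matrix product, used only to check the factors."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))]
            for i in range(len(x))]


if __name__ == "__main__":
    lower, upper = lu_decompose([[4.0, 3.0], [6.0, 3.0]])
    print(lower)  # unit lower triangular factor
    print(upper)  # upper triangular factor
    print(matmul(lower, upper))  # reproduces the input matrix
```

Practical code would normally add partial pivoting for numerical stability; it is left out here to keep the two-factor structure visible.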

row rank and column rank, skew-symmetric matrix, tridiagonal decomposition, (14 more...)

arXiv.org Artificial Intelligence

2201.00145

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(3 more...)

Genre:

Research Report > New Finding (0.45)
Instructional Material > Course Syllabus & Notes (0.45)

Industry:

Media > Film (0.92)
Leisure & Entertainment (0.67)

Technology:

Information Technology > Mathematics of Computing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
(3 more...)

Foundations of Reinforcement Learning and Interactive Decision Making

Foster, Dylan J., Rakhlin, Alexander

arXiv.org Machine Learning, Dec-27-2023

When we say interactive decision making, we are thinking of problems such as the following. Medical treatment: based on a patient's medical history and vital signs, we need to decide what treatment will lead to the most positive outcome. Controlling a robot: based on sensor signals, we need to decide what signals to send to a robot's actuators in order to navigate to a goal. For both problems, we (the learner/agent) are interacting with an unknown environment. In the robotics example, we do not necessarily know a priori how the signals we send to our robot's actuators change its configuration, or what the landscape it is trying to navigate looks like. However, because we are able to actively control the agent, we can learn to model the environment on the fly as we make decisions and collect data, which will reduce uncertainty and allow us to make better decisions in the future. The crux of the interactive decision making problem is to make decisions in a way that balances (i) exploring the environment to reduce our uncertainty and (ii) maximizing our overall performance (e.g., reaching a goal state as fast as possible). Figure 1 depicts an idealized interactive decision making setting, which we will return to throughout this course.
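
The balance between exploring and maximizing performance described in this abstract can be sketched with an epsilon-greedy strategy on a Bernoulli multi-armed bandit. This is a standard textbook illustration chosen here as an assumption, not a method taken from the course notes; the function name `epsilon_greedy` and its parameters are hypothetical.

```python
import random


def epsilon_greedy(reward_probs, steps, epsilon, seed=0):
    """Run epsilon-greedy on a Bernoulli bandit.

    reward_probs: success probability of each arm (unknown to the learner).
    epsilon: probability of picking a uniformly random arm (exploration).
    Returns (counts, estimates, total_reward).
    """
    rng = random.Random(seed)
    n_arms = len(reward_probs)
    counts = [0] * n_arms        # how often each arm was pulled
    estimates = [0.0] * n_arms   # running mean reward per arm
    total_reward = 0
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore: reduce uncertainty
        else:
            # Exploit: pick the arm with the best current estimate
            # (ties go to the lowest index).
            arm = max(range(n_arms), key=lambda i: estimates[i])
        reward = 1 if rng.random() < reward_probs[arm] else 0
        counts[arm] += 1
        # Incremental update of the running mean for the pulled arm.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward
    return counts, estimates, total_reward


if __name__ == "__main__":
    counts, estimates, total = epsilon_greedy([0.2, 0.8], steps=500, epsilon=0.1)
    print(counts, estimates, total)
```

Setting `epsilon` to 0 gives a purely greedy learner that never explores, while 1 gives uniform random play; values in between trade off the two goals.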

data mining, machine learning, reinforcement learning, (16 more...)

arXiv.org Machine Learning

2312.1673

Country: North America > United States (1.00)

Genre:

Research Report (1.00)
Instructional Material > Course Syllabus & Notes (0.45)

Industry:

Energy > Oil & Gas > Upstream (0.45)
Health & Medicine > Diagnostic Medicine > Vital Signs (0.34)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(3 more...)

Adapting Large Language Models for Education: Foundational Capabilities, Potentials, and Challenges

Li, Qingyao, Fu, Lingyue, Zhang, Weiming, Chen, Xianyu, Yu, Jingwei, Xia, Wei, Zhang, Weinan, Tang, Ruiming, Yu, Yong

arXiv.org Artificial Intelligence, Dec-27-2023

Online education platforms, leveraging the internet to distribute education resources, seek to provide convenient education but often fall short in real-time communication with students. They often struggle to offer personalized education resources due to the challenge of addressing the diverse obstacles students encounter throughout their learning journey. Recently, the emergence of large language models (LLMs), such as ChatGPT, offers the possibility of resolving this issue by comprehending individual requests. Although LLMs have been successful in various fields, creating an LLM-based education system is still challenging because of the wide range of educational skills required. This paper reviews recent LLM research related to educational capabilities, including mathematics, writing, programming, reasoning, and knowledge-based question answering, with the aim of exploring its potential in constructing the next-generation intelligent education system. Based on the current development status, we further outline two approaches for an LLM-based education system: a unified approach and a mixture-of-experts (MoE) approach. Finally, we explore the challenges and future directions, providing new research opportunities and perspectives on adapting LLMs for education.

arxiv preprint arxiv, language model, llm, (15 more...)

arXiv.org Artificial Intelligence

2401.08664

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > Germany > Berlin (0.04)
Asia > Middle East > Jordan (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Genre:

Research Report (1.00)
Overview (1.00)
Instructional Material (1.00)

Industry:

Education > Educational Setting > Online (0.87)
Education > Educational Technology > Educational Software > Computer Based Training (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Disentangled Continual Learning: Separating Memory Edits from Model Updates

Dziadzio, Sebastian, Yıldız, Çağatay, van de Ven, Gido M., Trzciński, Tomasz, Tuytelaars, Tinne, Bethge, Matthias

arXiv.org Artificial Intelligence, Dec-27-2023

[...] is hindered by catastrophic forgetting, the tendency of neural networks to overwrite existing knowledge when learning a new task. Existing continual learning methods alleviate this problem through regularisation, parameter isolation, or rehearsal, and are typically evaluated on benchmarks consisting of a handful of tasks. We propose a novel conceptual approach to continual classification that aims to disentangle class-specific information that needs to be memorised from the class-agnostic knowledge that encapsulates generalization. We store the former in a buffer that can be easily pruned or updated when new categories arrive, while the latter is represented with a neural network that generalizes across tasks. We show that the class-agnostic network does not suffer from catastrophic forgetting and by leveraging it to perform classification, we improve accuracy [...]

To mitigate this issue, continual learning methods employ strategies such as (i) regularization, which aims to preserve existing knowledge by limiting the plasticity of selected network weights [15, 17, 26, 36], (ii) parameter isolation or dynamic architectures, which effectively solve each task with a dedicated model [6, 33], or (iii) replay, which augments the training data with stored samples from past tasks [4, 12, 30, 32]. Most continual learning methods are evaluated on image classification benchmarks in which a discriminative model is transferred across tasks that typically involve disjoint sets of classes. We argue that this purely discriminative learning framework is not conducive to positive forward or backward transfer. Supervised classification networks tend to preserve only the features that are relevant for predicting the output labels in the training data [11, 35].

continual learning, exemplar, learning, (15 more...)

arXiv.org Artificial Intelligence

2312.16731

Country:

Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)
North America > United States (0.14)
Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.04)
Europe > Poland > Masovia Province > Warsaw (0.04)

Genre:

Instructional Material (0.68)
Research Report > Promising Solution (0.34)

Industry: Education (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

A Comprehensive Overview of Large Language Models

Naveed, Humza, Khan, Asad Ullah, Qiu, Shi, Saqib, Muhammad, Anwar, Saeed, Usman, Muhammad, Akhtar, Naveed, Barnes, Nick, Mian, Ajmal

arXiv.org Artificial Intelligence, Dec-27-2023

Large Language Models (LLMs) have recently demonstrated remarkable capabilities in natural language processing tasks and beyond. This success of LLMs has led to a large influx of research contributions in this direction. These works encompass diverse topics such as architectural innovations, better training strategies, context length improvements, fine-tuning, multi-modal LLMs, robotics, datasets, benchmarking, efficiency, and more. With the rapid development of techniques and regular breakthroughs in LLM research, it has become considerably challenging to perceive the bigger picture of the advances in this direction. Considering the rapidly emerging plethora of literature on LLMs, it is imperative that the research community is able to benefit from a concise yet comprehensive overview of the recent developments in this field. This article provides an overview of the existing literature on a broad range of LLM-related concepts. Our self-contained comprehensive overview of LLMs discusses relevant background concepts and covers the advanced topics at the frontier of LLM research. This review article is intended to provide not only a systematic survey but also a quick, comprehensive reference from which researchers and practitioners can draw insights, through extensive informative summaries of existing works, to advance LLM research.

arxiv preprint arxiv, language model, llm, (15 more...)

arXiv.org Artificial Intelligence

2307.06435

Country:

Asia > Middle East > Saudi Arabia > Eastern Province > Dhahran (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.13)
Oceania > Australia > New South Wales > Sydney (0.04)
(13 more...)

Genre:

Overview (1.00)
Instructional Material > Course Syllabus & Notes (0.33)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Education > Educational Setting (0.92)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
