Collaborating Authors

 Liu, Chuanren


Unified Uncertainty Estimation for Cognitive Diagnosis Models

arXiv.org Artificial Intelligence

Cognitive diagnosis models have been widely used in different areas, especially intelligent education, to measure users' proficiency levels on knowledge concepts, based on which users can get personalized instruction. As the measurement is not always reliable due to the weak links of the models and data, the uncertainty of measurement also offers important information for decisions. However, research on uncertainty estimation lags behind research on advanced model structures for cognitive diagnosis. Existing approaches have limited efficiency and leave a gap for sophisticated models that have interaction function parameters (e.g., deep learning-based models). To address these problems, we propose a unified uncertainty estimation approach for a wide range of cognitive diagnosis models. Specifically, based on the idea of estimating the posterior distributions of cognitive diagnosis model parameters, we first provide a unified objective function for mini-batch based optimization that can be more efficiently applied to a wide range of models and large datasets. Then, we modify the reparameterization approach in order to adapt to parameters defined on different domains. Furthermore, we decompose the uncertainty of diagnostic parameters into a data aspect and a model aspect, which better explains the source of uncertainty. Extensive experiments demonstrate that our method is effective and can provide useful insights into the uncertainty of cognitive diagnosis.
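As a rough illustration of reparameterized sampling for parameters on different domains, the sketch below draws Gaussian noise and maps it through a domain-specific transform. The function name `sample_reparameterized`, the domain labels, and the choice of softplus and sigmoid transforms are assumptions made for illustration; the abstract does not state which transforms the paper uses.

```python
import math
import random

def sample_reparameterized(mu, log_sigma, domain, rng):
    """Draw one sample as a deterministic transform of standard Gaussian noise.

    Hypothetical helper: the transforms below are common choices,
    not necessarily the ones used in the paper.
    """
    eps = rng.gauss(0.0, 1.0)            # noise that does not depend on mu or sigma
    z = mu + math.exp(log_sigma) * eps   # unconstrained Gaussian sample
    if domain == "real":
        return z
    if domain == "positive":
        # numerically stable softplus, maps the real line to (0, inf)
        return max(z, 0.0) + math.log1p(math.exp(-abs(z)))
    if domain == "unit":
        # sigmoid, maps the real line to (0, 1)
        return 1.0 / (1.0 + math.exp(-z))
    raise ValueError(f"unknown domain: {domain}")

rng = random.Random(0)
samples = [sample_reparameterized(0.5, -1.0, "unit", rng) for _ in range(1000)]
mean = sum(samples) / len(samples)
# the spread of the samples serves as a simple uncertainty summary
spread = math.sqrt(sum((s - mean) ** 2 for s in samples) / len(samples))
```

Because the randomness enters only through `eps`, gradients with respect to `mu` and `log_sigma` could flow through the transform in an automatic differentiation framework, which is what makes mini-batch optimization of such an objective practical.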


Continuous-Time User Preference Modelling for Temporal Sets Prediction

arXiv.org Artificial Intelligence

Given a sequence of sets, where each set has a timestamp and contains an arbitrary number of elements, temporal sets prediction aims to predict the elements in the subsequent set. Previous studies for temporal sets prediction mainly focus on the modelling of elements and implicitly represent each user's preference based on his/her interacted elements. However, user preferences are often continuously evolving and the evolutionary trend cannot be fully captured with the indirect learning paradigm of user preferences. To this end, we propose a continuous-time user preference modelling framework for temporal sets prediction, which explicitly models the evolving preference of each user by maintaining a memory bank to store the states of all the users and elements. Specifically, we first construct a universal sequence by arranging all the user-set interactions in a non-descending temporal order, and then chronologically learn from each user-set interaction. For each interaction, we continuously update the memories of the related user and elements based on their currently encoded messages and past memories. Moreover, we present a personalized user behavior learning module to discover user-specific characteristics based on each user's historical sequence, which aggregates the previously interacted elements from dual perspectives according to the user and elements. Finally, we develop a set-batch algorithm to improve the model efficiency, which can create time-consistent batches in advance and achieve 3.5x and 3.0x speedups in the training and evaluation process on average. Experiments on four real-world datasets demonstrate the superiority of our approach over state-of-the-art methods under both transductive and inductive settings. The good interpretability of our method is also shown.


Multi-Dimensional Ability Diagnosis for Machine Learning Algorithms

arXiv.org Artificial Intelligence

Machine learning algorithms have become ubiquitous in a number of applications (e.g. image classification). However, due to the insufficient measurement of traditional metrics (e.g. the coarse-grained Accuracy of each classifier), substantial gaps are usually observed between the real-world performance of these algorithms and their scores in standardized evaluations. In this paper, inspired by the psychometric theories from human measurement, we propose a task-agnostic evaluation framework Camilla, where a multi-dimensional diagnostic metric Ability is defined for collaboratively measuring the multifaceted strength of each machine learning algorithm. Specifically, given the response logs from different algorithms to data samples, we leverage cognitive diagnosis assumptions and neural networks to learn the complex interactions among algorithms, samples and the skills (explicitly or implicitly pre-defined) of each sample. In this way, both the abilities of each algorithm on multiple skills and some of the sample factors (e.g. sample difficulty) can be simultaneously quantified. We conduct extensive experiments with hundreds of machine learning algorithms on four public datasets, and our experimental results demonstrate that Camilla not only can capture the pros and cons of each algorithm more precisely, but also outperforms state-of-the-art baselines on the metric reliability, rank consistency and rank stability.


GraphMI: Extracting Private Graph Data from Graph Neural Networks

arXiv.org Artificial Intelligence

As machine learning becomes more widely used for critical applications, the need to study its privacy implications becomes urgent. Given access to the target model and auxiliary information, the model inversion attack aims to infer sensitive features of the training dataset, which leads to great privacy concerns. Despite its success in grid-like domains, directly applying model inversion techniques on non-grid domains such as graphs achieves poor attack performance due to the difficulty of fully exploiting the intrinsic properties of graphs and attributes of nodes used in Graph Neural Networks (GNN). To bridge this gap, we present Graph Model Inversion attack (GraphMI), which aims to extract private graph data of the training graph by inverting GNN, one of the state-of-the-art graph analysis tools. Specifically, we first propose a projected gradient module to tackle the discreteness of graph edges while preserving the sparsity and smoothness of graph features. Then we design a graph auto-encoder module to efficiently exploit graph topology, node attributes, and target model parameters for edge inference. With the proposed methods, we study the connection between model inversion risk and edge influence and show that edges with greater influence are more likely to be recovered. Extensive experiments over several public datasets demonstrate the effectiveness of our method. We also show that differential privacy in its canonical form can hardly defend against our attack while preserving decent utility.


Predicting Temporal Sets with Deep Neural Networks

arXiv.org Artificial Intelligence

Given a sequence of sets, where each set contains an arbitrary number of elements, the problem of temporal sets prediction aims to predict the elements in the subsequent set. In practice, temporal sets prediction is much more complex than predictive modelling of temporal events and time series, and is still an open problem. Many existing methods, if adapted for the problem of temporal sets prediction, usually follow a two-step strategy by first projecting temporal sets into latent representations and then learning a predictive model with the latent representations. The two-step approach often leads to information loss and unsatisfactory prediction performance. In this paper, we propose an integrated solution based on deep neural networks for temporal sets prediction. A unique perspective of our approach is to learn element relationships by constructing a set-level co-occurrence graph and then performing graph convolutions on the dynamic relationship graphs. Moreover, we design an attention-based module to adaptively learn the temporal dependency of elements and sets. Finally, we provide a gated updating mechanism to find the hidden shared patterns in different sequences and fuse both static and dynamic information to improve the prediction performance. Experiments on real-world data sets demonstrate that our approach can achieve competitive performance even with a portion of the training data and can outperform existing methods by a significant margin.
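The idea of a set-level co-occurrence graph can be sketched as follows: every pair of elements that appears together in a set is linked, and the edge weight counts how often that happens. The function name `cooccurrence_edges`, the toy data, and the use of raw counts as edge weights are assumptions for illustration; the abstract does not specify how the paper weights or normalizes edges.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_edges(sets):
    """Count how often each unordered pair of elements appears in the same set."""
    weights = Counter()
    for s in sets:
        # sorting gives every unordered pair one canonical (a, b) key
        for a, b in combinations(sorted(set(s)), 2):
            weights[(a, b)] += 1
    return weights

# hypothetical sequence of sets for a single user
sequence = [{"milk", "bread"}, {"milk", "bread", "eggs"}, {"eggs"}]
edges = cooccurrence_edges(sequence)
```

The resulting weighted edge list could then be turned into an adjacency matrix and fed to a graph convolution layer.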


Exploiting Cognitive Structure for Adaptive Learning

arXiv.org Machine Learning

Adaptive learning, also known as adaptive teaching, relies on learning path recommendation, which sequentially recommends personalized learning items (e.g., lectures, exercises) to satisfy the unique needs of each learner. Although it is well known that modeling the cognitive structure, including the knowledge level of learners and the knowledge structure (e.g., the prerequisite relations) of learning items, is important for learning path recommendation, existing methods for adaptive learning often separately focus on either knowledge levels of learners or knowledge structure of learning items. To fully exploit the multifaceted cognitive structure for learning path recommendation, we propose a Cognitive Structure Enhanced framework for Adaptive Learning, named CSEAL. By viewing path recommendation as a Markov Decision Process and applying an actor-critic algorithm, CSEAL can sequentially identify the right learning items for different learners. Specifically, we first utilize a recurrent neural network to trace the evolving knowledge levels of learners at each learning step. Then, we design a navigation algorithm on the knowledge structure to ensure the logicality of learning paths, which reduces the search space in the decision process. Finally, the actor-critic algorithm, whose parameters are dynamically updated along the learning path, is used to determine what to learn next. Extensive experiments on real-world data demonstrate the effectiveness and robustness of CSEAL.


Skeptical Deep Learning with Distribution Correction

arXiv.org Machine Learning

Recently deep neural networks have been successfully used for various classification tasks, especially for problems with massive perfectly labeled training data. However, it is often costly to have large-scale credible labels in real-world applications. One solution is to make supervised learning robust with imperfectly labeled input. In this paper, we develop a distribution correction approach that allows deep neural networks to avoid overfitting imperfect training data. Specifically, we treat the noisy input as samples from an incorrect distribution, which will be automatically corrected during our training process. We test our approach on several classification datasets with elaborately generated noisy labels. The results show significantly higher prediction and recovery accuracy with our approach compared to alternative methods.


Confidence-Aware Matrix Factorization for Recommender Systems

AAAI Conferences

Collaborative filtering (CF), particularly matrix factorization (MF) based methods, has been widely used in recommender systems. The literature has reported that matrix factorization methods often produce superior accuracy of rating prediction in recommender systems. However, existing matrix factorization methods rarely consider the confidence of the rating prediction and thus cannot support advanced recommendation tasks. In this paper, we propose a Confidence-aware Matrix Factorization (CMF) framework to simultaneously optimize the accuracy of rating prediction and measure the prediction confidence in the model. Specifically, we introduce variance parameters for both users and items in the matrix factorization process. Then, a prediction interval can be computed to measure confidence for each predicted rating. These confidence quantities can be used to enhance the quality of recommendation results based on Confidence-aware Ranking (CR). We also develop two effective implementations of our framework to compute the confidence-aware matrix factorization for large-scale data. Finally, extensive experiments on three real-world datasets demonstrate the effectiveness of our framework from multiple perspectives.
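A minimal sketch of how user and item variance parameters could yield a prediction interval is given below. It assumes the predicted rating is the dot product of the latent factors and that the user and item variances are independent and simply add; both the function name `predict_with_interval` and this additive Gaussian form are assumptions for illustration, not the formulation confirmed by the abstract.

```python
import math

def predict_with_interval(p_u, q_i, var_u, var_i, z=1.96):
    """Return a predicted rating and a symmetric interval around it.

    Hypothetical helper: assumes Gaussian noise whose variance is the
    sum of a user variance and an item variance.
    """
    mean = sum(a * b for a, b in zip(p_u, q_i))  # standard MF prediction
    std = math.sqrt(var_u + var_i)               # assumed additive variances
    return mean, (mean - z * std, mean + z * std)

# toy latent factors and variances, chosen only for illustration
mean, (low, high) = predict_with_interval([1.0, 0.5], [2.0, 2.0], 0.09, 0.16)
```

A narrower interval indicates a more confident prediction, so a ranking step could, for example, prefer items with a high lower bound over items with a high mean but a wide interval.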


A Context-Enriched Neural Network Method for Recognizing Lexical Entailment

AAAI Conferences

Recognizing lexical entailment (RLE), i.e., identifying whether one word entails another (for example, fox entails animal), plays an important role in natural language inference. In the literature, automatically recognizing lexical entailment for word pairs relies heavily on words' contextual representations. However, as a "prototype" vector, a single representation cannot reveal multifaceted aspects of the words due to their homonymy and polysemy. In this paper, we propose a supervised Context-Enriched Neural Network (CENN) method for recognizing lexical entailment. To be specific, we first utilize multiple embedding vectors from different contexts to represent the input word pairs. Then, through different combination methods and an attention mechanism, we integrate different embedding vectors and optimize their weights to predict whether there are entailment relations in word pairs. Moreover, our proposed framework is flexible and open to handling different word contexts and entailment perspectives in the text corpus. Extensive experiments on five datasets show that our approach significantly improves the performance of automatic RLE in comparison with several state-of-the-art methods.