Li, Xinyu
Practical and Ethical Challenges of Large Language Models in Education: A Systematic Scoping Review
Yan, Lixiang, Sha, Lele, Zhao, Linxuan, Li, Yuheng, Martinez-Maldonado, Roberto, Chen, Guanliang, Li, Xinyu, Jin, Yueqiao, Gašević, Dragan
Advancements in generative artificial intelligence (AI) and large language models (LLMs) have fueled the development of many educational technology innovations that aim to automate the often time-consuming and laborious tasks of generating and analysing textual content (e.g., generating open-ended questions and analysing student feedback surveys) (Kasneci et al., 2023; Wollny et al., 2021; Leiker et al., 2023). LLMs are generative artificial intelligence models trained on extensive text data and capable of generating human-like text from natural language inputs. Specifically, LLMs such as Bidirectional Encoder Representations from Transformers (BERT) (Devlin et al., 2018) and the Generative Pre-trained Transformer (GPT) (Brown et al., 2020) use deep learning and self-attention mechanisms (Vaswani et al., 2017) to attend selectively to different parts of the input text depending on the task at hand, allowing the model to learn complex patterns and relationships among textual contents, such as semantic, contextual, and syntactic relationships (Min et al., 2021; Liu et al., 2023). Because several LLMs (e.g., GPT-3 and Codex) have been pre-trained on massive amounts of data across multiple disciplines, they can complete natural language processing tasks with little (few-shot learning) or no additional training (zero-shot learning) (Brown et al., 2020; Wu et al., 2023). This could lower the technological barriers to LLM-based innovations, as researchers and practitioners can develop new educational technologies by fine-tuning LLMs on specific educational tasks rather than starting from scratch (Caines et al., 2023; Sridhar et al., 2023). The recent release of ChatGPT, an LLM-based generative AI chatbot that requires only natural language prompts, without additional model training or fine-tuning (OpenAI, 2023), has further lowered the barrier for individuals without a technological background to leverage the generative power of LLMs. Although educational research that leverages LLMs to automate educational tasks has yet to achieve its full potential (i.e., most works have focused on improving model performance (Kurdi et al., 2020; Ramesh and Sanampudi, 2022)), a growing body of literature hints at how different stakeholders could benefit from such innovations.
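To make the zero-shot idea concrete, here is a minimal sketch using the Hugging Face transformers pipeline to label a piece of student feedback without any task-specific training; the model choice and candidate labels are illustrative assumptions of this example, not taken from any study in the review.

```python
# Zero-shot classification of student feedback: no task-specific training
# data is required. Model and labels are illustrative assumptions.
from transformers import pipeline

classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

feedback = "The weekly quizzes kept me on track, but the lectures felt rushed."
result = classifier(feedback, candidate_labels=["positive", "negative", "mixed"])
print(result["labels"][0], round(result["scores"][0], 3))
```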
Functional sufficient dimension reduction through information maximization with application to classification
Li, Xinyu, Xu, Jianjun, Cui, Wenquan, Cheng, Haoyang
Considering the case where the response variable is categorical and the predictor is a random function, two novel functional sufficient dimension reduction (FSDR) methods are proposed based on mutual information and squared-loss mutual information. Compared to classical FSDR methods, such as functional sliced inverse regression and functional sliced average variance estimation, the proposed methods are appealing because they can estimate multiple effective dimension reduction directions even when the number of categories is small, especially for a binary response. Moreover, the proposed methods do not require the restrictive linear conditional mean assumption or the constant covariance assumption, and they avoid the inversion of the covariance operator that is often encountered in functional sufficient dimension reduction. Functional principal component analysis with truncation is used as a regularization mechanism. Under mild conditions, the statistical consistency of the proposed methods is established. Simulations and real data analyses demonstrate that the two methods are competitive with existing FSDR methods.
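For reference, the objects involved can be written in standard notation; this is an illustrative restatement of the setting and the two dependence measures, not a reproduction of the paper's estimators.

```latex
% X: random function in L^2;  Y: categorical response;  d: reduced dimension.
% FSDR seeks directions beta_1, ..., beta_d such that
\[
  Y \perp\!\!\!\perp X \mid \bigl(\langle \beta_1, X\rangle,\dots,\langle \beta_d, X\rangle\bigr),
  \qquad Z := \bigl(\langle \beta_1, X\rangle,\dots,\langle \beta_d, X\rangle\bigr),
\]
% and chooses them to maximize a dependence measure between Z and Y:
\[
  I(Z;Y) = \mathbb{E}_{p(z,y)}\!\left[\log \frac{p(z,y)}{p(z)\,p(y)}\right],
  \qquad
  \mathrm{SMI}(Z;Y) = \frac{1}{2}\,\mathbb{E}_{p(z)p(y)}\!\left[\left(\frac{p(z,y)}{p(z)\,p(y)} - 1\right)^{2}\right].
\]
```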
Adversarial Training for Gradient Descent: Analysis Through its Continuous-time Approximation
Gu, Haotian, Guo, Xin, Li, Xinyu
Adversarial training has gained great popularity as one of the most effective defenses for deep neural networks, and more generally for gradient-based machine learning models, against adversarial perturbations of data points. This paper establishes a continuous-time approximation for the mini-max game of adversarial training. The approximation allows precise, analytical comparisons between stochastic gradient descent and its adversarial training counterpart, and it theoretically confirms the robustness of adversarial training from a new gradient-flow viewpoint. The analysis is then corroborated through various analytical and numerical examples.
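For orientation, the standard objects involved are the min-max objective of adversarial training and the gradient-flow view of gradient descent; these are textbook formulations, and the paper's precise approximation theorems are not restated here.

```latex
% Min-max objective of adversarial training, for loss ell, model f_theta,
% and perturbation budget epsilon:
\[
  \min_{\theta}\; \mathbb{E}_{(x,y)}\Bigl[\max_{\|\delta\|\le\varepsilon} \ell\bigl(f_\theta(x+\delta),\,y\bigr)\Bigr].
\]
% Continuous-time view: the discrete update of gradient descent with step
% size eta on a loss L is approximated by a gradient flow:
\[
  \theta_{k+1} = \theta_k - \eta\,\nabla L(\theta_k)
  \quad\longrightarrow\quad
  \mathrm{d}\theta_t = -\nabla L(\theta_t)\,\mathrm{d}t .
\]
```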
LooPy: A Research-Friendly Mix Framework for Music Information Retrieval on Electronic Dance Music
Li, Xinyu
Music information retrieval (MIR) has undergone explosive development with the advancement of deep learning in recent years. However, genres such as electronic dance music (EDM) have been investigated relatively little compared to others. Considering its wide range of applications, we present a Python package for automated EDM audio generation as infrastructure for MIR on EDM songs, mitigating the difficulty of acquiring labelled data. It is a convenient tool that can easily be appended to the end of many symbolic music generation pipelines. The package provides a framework for building professional-level templates that can render a well-produced track from a specified melody and chords, or mass-produce tracks in a given key with our probabilistic symbolic melody generator. Experiments show that our mixes achieve quality comparable to that of reference songs produced by world-famous artists, with respect to both subjective and objective criteria. Our code is available at https://github.com/Gariscat/loopy , and the official project site is https://loopy4edm.com .
The impact of moving expenses on social segregation: a simulation with RL and ABM
Li, Xinyu
Over the past decades, breakthroughs such as reinforcement learning (RL) and agent-based modeling (ABM) have made simulations of economic models feasible. Recently, there has been increasing interest in applying ABM to study the impact of residential preferences on neighborhood segregation in the Schelling segregation model. In this paper, RL is combined with ABM to simulate a modified Schelling segregation model that incorporates moving expenses as an input parameter. In particular, a deep Q-network (DQN) is adopted as the RL agents' learning algorithm to simulate the behaviors and preferences of households. This paper studies the impact of moving expenses on the overall segregation pattern and their role in social integration. The result is a more comprehensive simulation of the segregation model that policymakers can use to forecast the potential consequences of their policies.
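As a toy illustration of how a moving expense enters the reward, here is a minimal sketch with tabular Q-learning standing in for the paper's DQN; the state discretization, utility function, and all numeric values are illustrative assumptions.

```python
# Toy sketch: tabular Q-learning for one household in a Schelling-style
# model. States discretize the fraction of same-type neighbours; actions
# are {stay, move}; moving subtracts a fixed expense from the reward.
import random

N_STATES, ACTIONS = 9, (0, 1)             # 0 = stay, 1 = move
MOVING_EXPENSE = 0.3                      # the key input parameter
ALPHA, GAMMA, EPS = 0.1, 0.95, 0.1

Q = [[0.0, 0.0] for _ in range(N_STATES)]

def utility(state):                       # satisfaction from similar neighbours
    return state / (N_STATES - 1)

def step(state, action):
    if action == 1:                       # relocate to a random vacancy
        next_state = random.randrange(N_STATES)
        reward = utility(next_state) - MOVING_EXPENSE
    else:                                 # neighbourhood composition drifts slightly
        next_state = min(N_STATES - 1, max(0, state + random.choice((-1, 0, 1))))
        reward = utility(next_state)
    return next_state, reward

state = random.randrange(N_STATES)
for _ in range(50_000):
    action = random.choice(ACTIONS) if random.random() < EPS else \
             max(ACTIONS, key=lambda a: Q[state][a])
    next_state, reward = step(state, action)
    Q[state][action] += ALPHA * (reward + GAMMA * max(Q[next_state]) - Q[state][action])
    state = next_state

# Higher MOVING_EXPENSE leaves "move" attractive in fewer (more hostile) states.
print([max(ACTIONS, key=lambda a: Q[s][a]) for s in range(N_STATES)])
```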
Robust Direct Learning for Causal Data Fusion
Li, Xinyu, Li, Yilin, Cui, Qing, Li, Longfei, Zhou, Jun
In the era of big data, the explosive growth of multi-source heterogeneous data offers many exciting challenges and opportunities for improving the inference of conditional average treatment effects. In this paper, we investigate homogeneous and heterogeneous causal data fusion problems under a general setting that allows for the presence of source-specific covariates. We provide a direct learning framework for integrating multi-source data that separates the treatment effect from other nuisance functions and achieves double robustness against certain model misspecifications. To improve estimation precision and stability, we propose a causal information-aware weighting function motivated by theoretical insights from semiparametric efficiency theory; it assigns larger weights to samples containing more causal information and is highly interpretable. We introduce a two-step algorithm, the weighted multi-source direct learner, based on constructing a pseudo-outcome and regressing it on the covariates under a weighted least squares criterion; it offers a powerful tool for causal data fusion, enjoying the advantages of easy implementation, double robustness, and model flexibility. In simulation studies, we demonstrate the effectiveness of the proposed methods in both homogeneous and heterogeneous causal data fusion scenarios.
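The pseudo-outcome-plus-weighted-regression recipe can be sketched in a single-source setting as follows; the standard doubly robust pseudo-outcome and the weight w(x) = e(x)(1 - e(x)) are illustrative stand-ins, and the paper's causal information-aware weights and multi-source handling are not reproduced.

```python
# Minimal single-source sketch: build a doubly robust pseudo-outcome,
# then regress it on covariates under a weighted least squares criterion.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression, LogisticRegression

rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 3))
e = 1 / (1 + np.exp(-X[:, 0]))                    # true propensity
A = rng.binomial(1, e)
tau = X[:, 1]                                     # true CATE
Y = X[:, 2] + A * tau + rng.normal(size=n)

# Step 1: nuisance estimates (propensity and arm-specific outcome models).
e_hat = LogisticRegression().fit(X, A).predict_proba(X)[:, 1]
mu1 = RandomForestRegressor().fit(X[A == 1], Y[A == 1]).predict(X)
mu0 = RandomForestRegressor().fit(X[A == 0], Y[A == 0]).predict(X)

# Step 2: doubly robust pseudo-outcome, then weighted regression on X.
psi = mu1 - mu0 + (A - e_hat) / (e_hat * (1 - e_hat)) * (Y - np.where(A == 1, mu1, mu0))
w = e_hat * (1 - e_hat)                           # illustrative weight function
cate = LinearRegression().fit(X, psi, sample_weight=w)
print("RMSE of CATE estimate:", np.sqrt(np.mean((cate.predict(X) - tau) ** 2)))
```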
Improving short-term bike sharing demand forecast through an irregular convolutional neural network
Li, Xinyu, Xu, Yang, Zhang, Xiaohu, Shi, Wenzhong, Yue, Yang, Li, Qingquan
As an important task in the management of bike sharing systems, accurate forecasting of travel demand can facilitate the dispatch and relocation of bicycles to improve user satisfaction. In recent years, many deep learning algorithms have been introduced to improve bicycle usage forecasts. A typical practice is to integrate convolutional (CNN) and recurrent neural networks (RNN) to capture spatial-temporal dependencies in historical travel demand. For a typical CNN, the convolution operation is conducted through a kernel that moves across a "matrix-format" city to extract features over spatially adjacent urban areas. This practice assumes that areas close to each other provide useful information that improves prediction accuracy. However, bicycle usage in neighboring areas might not always be similar, given spatial variations in built environment characteristics and travel behavior that affect cycling activities, while areas that are far apart can be relatively similar in temporal usage patterns. To utilize the hidden linkage among such distant urban areas, this study proposes an irregular convolutional Long Short-Term Memory model (IrConv+LSTM) to improve short-term bike sharing demand forecasts. The model modifies the traditional CNN with an irregular convolutional architecture that extracts dependencies among "semantic neighbors". The proposed model is evaluated against a set of benchmark models in five study sites: one dockless bike sharing system in Singapore and four station-based systems in Chicago, Washington, D.C., New York, and London. We find that IrConv+LSTM outperforms the benchmark models in all five cities, and it also achieves superior performance in areas with varying levels of bicycle usage and during peak periods. The findings suggest that "thinking beyond spatial neighbors" can further improve short-term travel demand prediction for urban bike sharing systems.
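A minimal sketch of the "semantic neighbor" idea follows; the correlation-based similarity, tensor shapes, and layer sizes are illustrative assumptions of this example, not the authors' released architecture.

```python
# Sketch: for each area, gather the k areas with the most similar historical
# demand ("semantic neighbours"), mix them with a learnable kernel in place
# of a spatial convolution, then model the time dimension with an LSTM.
import torch
import torch.nn as nn

T, N, K, H = 168, 50, 8, 32               # hours, areas, neighbours, hidden size
demand = torch.rand(T, N)                 # historical demand matrix

# Top-k "semantic neighbours" by correlation of demand series (self included).
corr = torch.corrcoef(demand.t())         # [N, N]
neighbours = corr.topk(K, dim=1).indices  # [N, K]

class IrConvLSTM(nn.Module):
    def __init__(self):
        super().__init__()
        self.irconv = nn.Linear(K, H)     # learnable kernel over the K neighbours
        self.lstm = nn.LSTM(H, H, batch_first=True)
        self.head = nn.Linear(H, 1)

    def forward(self, x, nbrs):           # x: [T, N], nbrs: [N, K]
        feats = x[:, nbrs]                # [T, N, K] neighbour demand per area
        z = torch.relu(self.irconv(feats))        # [T, N, H]
        z = z.permute(1, 0, 2)                    # [N, T, H]: one sequence per area
        out, _ = self.lstm(z)
        return self.head(out[:, -1]).squeeze(-1)  # next-step demand per area

model = IrConvLSTM()
pred = model(demand, neighbours)          # [N] forecasts
print(pred.shape)
```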
A new neighborhood structure for job shop scheduling problems
Xie, Jin, Li, Xinyu, Gao, Liang, Gui, Lin
The job shop scheduling problem (JSP) is a widely studied NP-complete combinatorial optimization problem, and neighborhood structures play a critical role in solving it. At present, there are three state-of-the-art neighborhood structures, i.e., N5, N6, and N7, and they have been instrumental in improving the upper bounds of several famous benchmarks. However, these existing neighborhood structures only consider moving critical operations within a critical block. According to our experiments, the makespan of a scheduling scheme can also be improved by moving a critical operation outside its critical block. Based on this finding, this paper proposes a new neighborhood structure, N8, which considers moving critical operations both within a critical block and outside it. In addition, a neighborhood clipping method is designed to avoid invalid moves, reducing computation time. Tabu search (TS) is a commonly used algorithmic framework for neighborhood structures, and this paper uses it to compare the N8 neighborhood structure with N5, N6, and N7 on four famous benchmarks. The experimental results verify that the N8 neighborhood structure is more effective and efficient in solving JSP than the other state-of-the-art neighborhood structures.
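To illustrate the machinery such neighborhoods operate on, here is a minimal sketch that evaluates a machine sequencing via longest paths in the disjunctive graph and enumerates naive adjacent-swap moves; this is a generic baseline move for illustration, not the N8 rules or the clipping method from the paper.

```python
# Illustrative JSP machinery: makespan evaluation plus adjacent-swap moves.
from collections import defaultdict, deque

def makespan(jobs, machine_seq):
    """jobs[j] = [(machine, ptime), ...]; machine_seq[m] = [(job, op), ...]."""
    ops = [(j, k) for j, job in enumerate(jobs) for k in range(len(job))]
    succ, indeg = defaultdict(list), defaultdict(int)
    for j, job in enumerate(jobs):                  # conjunctive (job) arcs
        for k in range(len(job) - 1):
            succ[(j, k)].append((j, k + 1)); indeg[(j, k + 1)] += 1
    for seq in machine_seq.values():                # disjunctive (machine) arcs
        for a, b in zip(seq, seq[1:]):
            succ[a].append(b); indeg[b] += 1
    start = {op: 0 for op in ops}
    queue = deque(op for op in ops if indeg[op] == 0)
    done = 0
    while queue:                                    # longest-path relaxation
        op = queue.popleft(); done += 1
        finish = start[op] + jobs[op[0]][op[1]][1]
        for nxt in succ[op]:
            start[nxt] = max(start[nxt], finish)
            indeg[nxt] -= 1
            if indeg[nxt] == 0:
                queue.append(nxt)
    if done < len(ops):
        return None                                 # cycle: infeasible sequencing
    return max(start[(j, k)] + jobs[j][k][1] for j, k in ops)

def adjacent_swaps(machine_seq):
    """Neighbors obtained by swapping two adjacent operations on one machine."""
    for m, seq in machine_seq.items():
        for i in range(len(seq) - 1):
            nbr = {mm: list(s) for mm, s in machine_seq.items()}
            nbr[m][i], nbr[m][i + 1] = nbr[m][i + 1], nbr[m][i]
            yield nbr

jobs = [[(0, 3), (1, 2)], [(1, 4), (0, 1)]]         # tiny 2-job, 2-machine instance
seq = {0: [(1, 1), (0, 0)], 1: [(1, 0), (0, 1)]}
feasible = [s for s in adjacent_swaps(seq) if makespan(jobs, s) is not None]
best = min(feasible, key=lambda s: makespan(jobs, s))
print(makespan(jobs, seq), "->", makespan(jobs, best))
```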
Video Contrastive Learning with Global Context
Kuang, Haofei, Zhu, Yi, Zhang, Zhi, Li, Xinyu, Tighe, Joseph, Schwertfeger, Sören, Stachniss, Cyrill, Li, Mu
Contrastive learning has revolutionized self-supervised image representation learning and has recently been adapted to the video domain. One of the greatest advantages of contrastive learning is that it allows us to flexibly define powerful loss objectives, as long as we can find a reasonable way to formulate positive and negative samples to contrast. However, existing approaches rely heavily on short-range spatiotemporal salience to form clip-level contrastive signals, which keeps them from using global context. In this paper, we propose a new video-level contrastive learning method that uses segments to formulate positive pairs. Our formulation is able to capture the global context of a video and is thus robust to temporal content changes. We also incorporate a temporal order regularization term to enforce the inherent sequential structure of videos. Extensive experiments show that our video-level contrastive learning framework (VCLR) outperforms previous state-of-the-art methods on five video datasets for downstream action classification, action localization, and video retrieval. Code is available at https://github.com/amazon-research/video-contrastive-learning.
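A minimal sketch of the segment-based positive-pair idea with a standard InfoNCE loss follows; the segment count, shapes, and averaging are illustrative assumptions, and the repository linked above is the reference implementation.

```python
# Sketch: two video-level "views" are built by sampling one clip embedding
# per segment and averaging, then contrasted with a standard InfoNCE loss.
import torch
import torch.nn.functional as F

def segment_view(clip_feats, n_segments=4):
    """clip_feats: [B, T, D] per-clip embeddings; sample one clip per segment."""
    B, T, D = clip_feats.shape
    bounds = torch.linspace(0, T, n_segments + 1).long()
    picks = [torch.randint(int(bounds[i]), int(bounds[i + 1]), (1,))
             for i in range(n_segments)]
    idx = torch.cat(picks)                       # one random clip per segment
    return clip_feats[:, idx].mean(dim=1)        # [B, D] video-level embedding

def info_nce(z1, z2, temperature=0.1):
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature           # [B, B]: diagonal = positives
    labels = torch.arange(z1.size(0))
    return F.cross_entropy(logits, labels)

clip_feats = torch.randn(8, 16, 128)             # stand-in for encoder outputs
loss = info_nce(segment_view(clip_feats), segment_view(clip_feats))
print(loss.item())
```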
Deep Survival Machines: Fully Parametric Survival Regression and Representation Learning for Censored Data with Competing Risks
Nagpal, Chirag, Li, Xinyu, Dubrawski, Artur
We describe a new approach to estimating relative risks in time-to-event prediction problems with censored data in a fully parametric manner. Our approach does not require the strong assumption of proportional hazards made by the Cox proportional hazards model. By jointly learning deep nonlinear representations of the input covariates, we demonstrate the benefits of our approach when used to estimate survival risks, through extensive experimentation on multiple real-world datasets with different levels of censoring. We further demonstrate the advantages of our model in the competing risks scenario. To the best of our knowledge, this is the first work on fully parametric estimation of survival times with competing risks in the presence of censoring.
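The fully parametric ingredient can be illustrated with a mixture-of-Weibulls censored likelihood: uncensored times contribute the density f(t), censored times the survival function S(t). This sketch uses fixed illustrative parameters; the full model's deep covariate-dependent parameters and competing-risks machinery are omitted.

```python
# Minimal sketch: censored log-likelihood of a K-component Weibull mixture.
import numpy as np

def weibull_mixture_loglik(t, event, shape, scale, pi):
    """t: times; event: 1 = observed, 0 = censored; shape/scale/pi: [K]."""
    t = t[:, None]                                        # broadcast [n, 1] vs [K]
    S_k = np.exp(-((t / scale) ** shape))                 # component survival
    f_k = (shape / scale) * (t / scale) ** (shape - 1) * S_k  # component density
    S, f = S_k @ pi, f_k @ pi                             # mixture S(t), f(t)
    return np.sum(event * np.log(f) + (1 - event) * np.log(S))

rng = np.random.default_rng(0)
t = rng.weibull(1.5, size=100) * 2.0                      # synthetic event times
event = rng.binomial(1, 0.7, size=100)                    # roughly 30% censoring
print(weibull_mixture_loglik(t, event,
                             shape=np.array([1.0, 2.0]),
                             scale=np.array([1.5, 3.0]),
                             pi=np.array([0.5, 0.5])))
```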