neural network-based model
Mitigating Forgetting in Online Continual Learning with Neuron Calibration
Inspired by human intelligence, the research on online continual learning aims to push the limits of the machine learning models to constantly learn from sequentially encountered tasks, with the data from each task being observed in an online fashion. Though recent studies have achieved remarkable progress in improving the online continual learning performance empowered by the deep neural networks-based models, many of today's approaches still suffer a lot from catastrophic forgetting, a persistent challenge for continual learning. In this paper, we present a novel method which attempts to mitigate catastrophic forgetting in online continual learning from a new perspective, i.e., neuron calibration. In particular, we model the neurons in the deep neural networks-based models as calibrated units under a general formulation. Then we formalize a learning framework to effectively train the calibrated model, where neuron calibration could give ubiquitous benefit to balance the stability and plasticity of online continual learning algorithms through influencing both their forward inference path and backward optimization path. Our proposed formulation for neuron calibration is lightweight and applicable to general feed-forward neural networks-based models. We perform extensive experiments to evaluate our method on four benchmark continual learning datasets. The results show that neuron calibration plays a vital role in improving online continual learning performance and our method could substantially improve the state-of-the-art performance on all~the~evaluated~datasets.
Mitigating Forgetting in Online Continual Learning with Neuron Calibration
Inspired by human intelligence, the research on online continual learning aims to push the limits of the machine learning models to constantly learn from sequentially encountered tasks, with the data from each task being observed in an online fashion. Though recent studies have achieved remarkable progress in improving the online continual learning performance empowered by the deep neural networks-based models, many of today's approaches still suffer a lot from catastrophic forgetting, a persistent challenge for continual learning. In this paper, we present a novel method which attempts to mitigate catastrophic forgetting in online continual learning from a new perspective, i.e., neuron calibration. In particular, we model the neurons in the deep neural networks-based models as calibrated units under a general formulation. Then we formalize a learning framework to effectively train the calibrated model, where neuron calibration could give ubiquitous benefit to balance the stability and plasticity of online continual learning algorithms through influencing both their forward inference path and backward optimization path.
Neural filtering for Neural Network-based Models of Dynamic Systems
Oveissi, Parham, Rozario, Turibius, Goel, Ankit
The application of neural networks in modeling dynamic systems has become prominent due to their ability to estimate complex nonlinear functions. Despite their effectiveness, neural networks face challenges in long-term predictions, where the prediction error diverges over time, thus degrading their accuracy. This paper presents a neural filter to enhance the accuracy of long-term state predictions of neural network-based models of dynamic systems. Motivated by the extended Kalman filter, the neural filter combines the neural network state predictions with the measurements from the physical system to improve the estimated state's accuracy. The neural filter's improvements in prediction accuracy are demonstrated through applications to four nonlinear dynamical systems. Numerical experiments show that the neural filter significantly improves prediction accuracy and bounds the state estimate covariance, outperforming the neural network predictions.
C-XGBoost: A tree boosting model for causal effect estimation
Kiriakidou, Niki, Livieris, Ioannis E., Diou, Christos
Causal effect estimation aims at estimating the Average Treatment Effect as well as the Conditional Average Treatment Effect of a treatment to an outcome from the available data. This knowledge is important in many safety-critical domains, where it often needs to be extracted from observational data. In this work, we propose a new causal inference model, named C-XGBoost, for the prediction of potential outcomes. The motivation of our approach is to exploit the superiority of tree-based models for handling tabular data together with the notable property of causal inference neural network-based models to learn representations that are useful for estimating the outcome for both the treatment and non-treatment cases. The proposed model also inherits the considerable advantages of XGBoost model such as efficiently handling features with missing values requiring minimum preprocessing effort, as well as it is equipped with regularization techniques to avoid overfitting/bias. Furthermore, we propose a new loss function for efficiently training the proposed causal inference model. The experimental analysis, which is based on the performance profiles of Dolan and Mor{\'e} as well as on post-hoc and non-parametric statistical tests, provide strong evidence about the effectiveness of the proposed approach.
An evaluation framework for comparing causal inference models
Kiriakidou, Niki, Diou, Christos
Estimation of causal effects is the core objective of many scientific disciplines. However, it remains a challenging task, especially when the effects are estimated from observational data. Recently, several promising machine learning models have been proposed for causal effect estimation. The evaluation of these models has been based on the mean values of the error of the Average Treatment Effect (ATE) as well as of the Precision in Estimation of Heterogeneous Effect (PEHE). In this paper, we propose to complement the evaluation of causal inference models using concrete statistical evidence, including the performance profiles of Dolan and Mor{\'e}, as well as non-parametric and post-hoc statistical tests. The main motivation behind this approach is the elimination of the influence of a small number of instances or simulation on the benchmarking process, which in some cases dominate the results. We use the proposed evaluation methodology to compare several state-of-the-art causal effect estimation models.
AI wave rolls through Microsoft's language translation technologies
A fresh wave of artificial intelligence rolling through Microsoft's language translation technologies is bringing more accurate speech recognition to more of the world's languages and higher quality machine-powered translations to all 60 languages supported by Microsoft's translation technologies. The advances were announced at Microsoft Tech Summit Sydney in Australia on November 16. "We've got a complex machine, and we're innovating on all fronts," said Olivier Fontana, the director of product strategy for Microsoft Translator, a platform for text and speech translation services. As the wave spreads, he added, these machine translation tools are allowing more people to grow businesses, build relationships and experience different cultures. Microsoft's research labs around the world are also building on top of these technologies to help people learn how to speak new languages, including a language learning application for non-native speakers of Chinese that also was announced at this week's tech summit. The new Microsoft Translator advances build on last year's switch to deep neural network-powered machine translations, which offer more fluent, human-sounding translations than the predecessor technology known as statistical machine translation.