prediction point
KnowIt: Deep Time Series Modeling and Interpretation
Theunissen, M. W., Rabe, R., Davel, M. H.
KnowIt (Knowledge discovery in time series data) is a flexible framework for building deep time series models and interpreting them. It is implemented as a Python toolkit, with source code and documentation available from https://must-deep-learning.github.io/KnowIt. It imposes minimal assumptions about task specifications and decouples the definition of dataset, deep neural network architecture, and interpretability technique through well defined interfaces. This ensures the ease of importing new datasets, custom architectures, and the definition of different interpretability paradigms while maintaining on-the-fly modeling and interpretation of different aspects of a user's own time series data. KnowIt aims to provide an environment where users can perform knowledge discovery on their own complex time series data through building powerful deep learning models and explaining their behavior. With ongoing development, collaboration and application our goal is to make this a platform to progress this underexplored field and produce a trusted tool for deep time series modeling.
Log Optimization Simplification Method for Predicting Remaining Time
Ye, Jianhong, Zhang, Siyuan, Lin, Yan
Information systems generate a large volume of event log data during business operations, much of which consists of low-value and redundant information. When performance predictions are made directly from these logs, the accuracy of the predictions can be compromised. Researchers have explored methods to simplify and compress these data while preserving their valuable components. Most existing approaches focus on reducing the dimensionality of the data by eliminating redundant and irrelevant features. However, there has been limited investigation into the efficiency of execution both before and after event log simplification. In this paper, we present a prediction point selection algorithm designed to avoid the simplification of all points that function similarly. We select sequences or self-loop structures to form a simplifiable segment, and we optimize the deviation between the actual simplifiable value and the original data prediction value to prevent over-simplification. Experiments indicate that the simplified event log retains its predictive performance and, in some cases, enhances its predictive accuracy compared to the original event log.
A parsimonious, computationally efficient machine learning method for spatial regression
Žukovič, Milan, Hristopulos, Dionissios T.
The spatial prediction (interpolation) problem arises in various fields of science and engineering that study spatially distributed variables. In the case of scattered data, filling gaps facilitates understanding of the spatial features, visualization of the observed process, and it is also necessary to obtain fully populated grids of spatially dependent parameters used in partial differential equations. Spatial prediction is highly relevant to many disciplines, such as environmental mapping, risk assessment(Christakos, 2012) and environmental health studies (Christakos and Hristopulos, 2013), subsurface hydrology (Kitanidis, 1997; Rubin, 2003), mining (Goovaerts, 1997), and oil reserves estimation (Hohn, 1988; Hamzehpour and Sahimi, 2006). In addition, remote sensing images often include gaps with missing data (e.g., clouds, snow, heavy precipitation, ground vegetation coverage, etc.) that need to be filled (Rossi et al, 1994). Spatial prediction is also useful in image analysis (Winkler, 2003; Gui and Wei, 2004) and signal processing (Unser and Blu, 2005; Ramani and Unser, 2006) including medical applications (Parrott et al, 1993; Cao and Worsley, 2001).
Automatically Reconciling the Trade-off between Prediction Accuracy and Earliness in Prescriptive Business Process Monitoring
Metzger, Andreas, Kley, Tristan, Rothweiler, Aristide, Pohl, Klaus
Prescriptive business process monitoring provides decision support to process managers on when and how to adapt an ongoing business process to prevent or mitigate an undesired process outcome. We focus on the problem of automatically reconciling the trade-off between prediction accuracy and prediction earliness in determining when to adapt. Adaptations should happen sufficiently early to provide enough lead time for the adaptation to become effective. However, earlier predictions are typically less accurate than later predictions. This means that acting on less accurate predictions may lead to unnecessary adaptations or missed adaptations. Different approaches were presented in the literature to reconcile the trade-off between prediction accuracy and earliness. So far, these approaches were compared with different baselines, and evaluated using different data sets or even confidential data sets. This limits the comparability and replicability of the approaches and makes it difficult to choose a concrete approach in practice. We perform a comparative evaluation of the main alternative approaches for reconciling the trade-off between prediction accuracy and earliness. Using four public real-world event log data sets and two types of prediction models, we assess and compare the cost savings of these approaches. The experimental results indicate which criteria affect the effectiveness of an approach and help us state initial recommendations for the selection of a concrete approach in practice.
Adaptive Neural Network Ensemble Using Frequency Distribution
Neural network (NN) ensembles can reduce large prediction variance of NN and improve prediction accuracy. For highly nonlinear problems with insufficient data set, the prediction accuracy of NN models becomes unstable, resulting in a decrease in the accuracy of ensembles. Therefore, this study proposes a frequency distribution-based ensemble that identifies core prediction values, which are expected to be concentrated near the true prediction value. The frequency distribution-based ensemble classifies core prediction values supported by multiple prediction values by conducting statistical analysis with a frequency distribution, which is based on various prediction values obtained from a given prediction point. The frequency distribution-based ensemble can improve predictive performance by excluding prediction values with low accuracy and coping with the uncertainty of the most frequent value. An adaptive sampling strategy that sequentially adds samples based on the core prediction variance calculated as the variance of the core prediction values is proposed to improve the predictive performance of the frequency distribution-based ensemble efficiently. Results of various case studies show that the prediction accuracy of the frequency distribution-based ensemble is higher than that of Kriging and other existing ensemble methods. In addition, the proposed adaptive sampling strategy effectively improves the predictive performance of the frequency distribution-based ensemble compared with the previously developed space-filling and prediction variance-based strategies.
A Correlation Information-based Spatiotemporal Network for Traffic Flow Forecasting
Zhu, Weiguo, Sun, Yongqi, Yi, Xintong, Wang, Yan
The technology of traffic flow forecasting plays an important role in intelligent transportation systems. Based on graph neural networks and attention mechanisms, most previous works utilize the transformer architecture to discover spatiotemporal dependencies and dynamic relationships. However, they have not considered correlation information among spatiotemporal sequences thoroughly. In this paper, based on the maximal information coefficient, we present two elaborate spatiotemporal representations, spatial correlation information (SCorr) and temporal correlation information (TCorr). Using SCorr, we propose a correlation information-based spatiotemporal network (CorrSTN) that includes a dynamic graph neural network component for integrating correlation information into spatial structure effectively and a multi-head attention component for modeling dynamic temporal dependencies accurately. Utilizing TCorr, we explore the correlation pattern among different periodic data to identify the most relevant data, and then design an efficient data selection scheme to further enhance model performance. The experimental results on the highway traffic flow (PEMS07 and PEMS08) and metro crowd flow (HZME inflow and outflow) datasets demonstrate that CorrSTN outperforms the state-of-the-art methods in terms of predictive performance. In particular, on the HZME (outflow) dataset, our model makes significant improvements compared with the ASTGNN model by 12.7%, 14.4% and 27.4% in the metrics of MAE, RMSE and MAPE, respectively.
A Method for Controlling Extrapolation when Visualizing and Optimizing the Prediction Profiles of Statistical and Machine Learning Models
Ash, Jeremy, Lancaster, Laura, Gotwalt, Chris
We present a novel method for controlling extrapolation in the prediction profiler in the JMP software. The prediction profiler is a graphical tool for exploring high dimensional prediction surfaces for statistical and machine learning models. The profiler contains interactive cross-sectional views, or profile traces, of the prediction surface of a model. Our method helps users avoid exploring predictions that should be considered extrapolation. It also performs optimization over a constrained factor region that avoids extrapolation using a genetic algorithm. In simulations and real world examples, we demonstrate how optimal factor settings without constraint in the profiler are frequently extrapolated, and how extrapolation control helps avoid these solutions with invalid factor settings that may not be useful to the user.
Unsupervised Document Embedding With CNNs
Liu, Chundi, Zhao, Shunan, Volkovs, Maksims
We propose a new model for unsupervised document embedding. Leading existing approaches either require complex inference or use recurrent neural networks (RNN) that are difficult to parallelize. We take a different route and develop a convolutional neural network (CNN) embedding model. Our CNN architecture is fully parallelizable resulting in over 10x speedup in inference time over RNN models. Parallelizable architecture enables to train deeper models where each successive layer has increasingly larger receptive field and models longer range semantic structure within the document. We additionally propose a fully unsupervised learning algorithm to train this model based on stochastic forward prediction. Empirical results on two public benchmarks show that our approach produces comparable to state-of-the-art accuracy at a fraction of computational cost.
Nested Kriging predictions for datasets with large number of observations
Rullière, Didier, Durrande, Nicolas, Bachoc, François, Chevalier, Clément
This work falls within the context of predicting the value of a real function at some input locations given a limited number of observations of this function. The Kriging interpolation technique (or Gaussian process regression) is often considered to tackle such a problem but the method suffers from its computational burden when the number of observation points is large. We introduce in this article nested Kriging predictors which are constructed by aggregating sub-models based on subsets of observation points. This approach is proven to have better theoretical properties than other aggregation methods that can be found in the literature. Contrarily to some other methods it can be shown that the proposed aggregation method is consistent. Finally, the practical interest of the proposed method is illustrated on simulated datasets and on an industrial test case with $10^4$ observations in a 6-dimensional space.