Towards Understanding Extrapolation: a Causal Lens
Canonical work on handling distribution shifts typically requires the entire target distribution to lie within the training distribution. However, practical scenarios often provide only a handful of target samples, potentially lying outside the training support, which calls for the capability of extrapolation. In this work, we aim to provide a theoretical understanding of when extrapolation is possible and to offer principled methods to achieve it without requiring an on-support target distribution. To this end, we formulate the extrapolation problem with a latent-variable model that embodies the minimal-change principle in causal mechanisms. Under this formulation, we cast the extrapolation problem as a latent-variable identification problem. We provide realistic conditions on shift properties and estimation objectives that lead to identification even when only one off-support target sample is available, tackling the most challenging scenario. Our theory reveals the intricate interplay between the underlying manifold's smoothness and the shift properties. We showcase how our theoretical results inform the design of practical adaptation algorithms.
Function Extrapolation with Neural Networks and Its Application for Manifolds
This paper addresses the problem of accurately estimating a function on one domain when only its discrete samples are available on another domain. To address this challenge, we employ a neural network trained to incorporate prior knowledge of the function. In addition, by carefully analyzing the problem, we obtain a bound on the error over the extrapolation domain and define a condition number for this problem that quantifies the difficulty of the setup. Compared to other machine learning methods that provide time-series prediction, such as transformers, our approach is suitable for setups where the interpolation and extrapolation regions are general subdomains and, in particular, manifolds. Moreover, our construction leads to an improved loss function that helps boost the accuracy and robustness of our neural network. We conduct comprehensive numerical tests and comparisons of our extrapolation method against standard approaches; the results illustrate the effectiveness of our approach in various scenarios.
Evolving Gaussian Process kernels from elementary mathematical expressions
Roman, Ibai, Santana, Roberto, Mendiburu, Alexander, Lozano, Jose A.
Choosing the most suitable kernel is crucial in many Machine Learning applications. The Gaussian Process is a state-of-the-art technique for regression and classification that heavily relies on a kernel function. However, in the Gaussian Process literature, kernels have usually been either designed ad hoc, selected from a predefined set, or searched for in a space of compositions of kernels defined a priori. In this paper, we propose a Genetic-Programming algorithm that represents a kernel function as a tree of elementary mathematical expressions. By means of this representation, a wider set of kernels can be modeled, where potentially better solutions can be found, although new challenges also arise. The proposed algorithm is able to overcome these difficulties and find kernels that accurately model the characteristics of the data. This method has been tested on several real-world time-series extrapolation problems, improving the state-of-the-art results while reducing the complexity of the kernels.
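The abstract's core idea of representing a kernel as a tree of elementary mathematical expressions can be sketched as follows. This is a minimal illustration, not the paper's actual grammar of operators and terminals: the chosen node set (`add`, `mul`, `sub`, `sq`, `exp`, variables, constants) is an assumption made for the example.

```python
import math

# A kernel encoded as an expression tree: internal nodes are elementary
# operations, leaves are the inputs x, y or constants. (Illustrative node
# set only; the paper's operator grammar is an assumption here.)
def make_node(op, *children):
    return (op, children)

def evaluate(node, x, y):
    op, children = node
    if op == "x":
        return x
    if op == "y":
        return y
    if op == "const":
        return children[0]          # leaf payload is a number, not a node
    if op == "add":
        return evaluate(children[0], x, y) + evaluate(children[1], x, y)
    if op == "mul":
        return evaluate(children[0], x, y) * evaluate(children[1], x, y)
    if op == "sub":
        return evaluate(children[0], x, y) - evaluate(children[1], x, y)
    if op == "sq":
        return evaluate(children[0], x, y) ** 2
    if op == "exp":
        return math.exp(evaluate(children[0], x, y))
    raise ValueError(f"unknown operator: {op}")

# Tree encoding the squared-exponential kernel k(x, y) = exp(-(x - y)^2):
sq_exp = make_node("exp",
                   make_node("mul",
                             make_node("const", -1.0),
                             make_node("sq",
                                       make_node("sub",
                                                 make_node("x"),
                                                 make_node("y")))))

print(evaluate(sq_exp, 1.0, 1.0))   # 1.0 (identical inputs, maximal similarity)
```

A genetic-programming search would then mutate and recombine such trees (swapping subtrees, changing operators) and score each candidate kernel by its Gaussian Process fit to the data.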
Regression-Enhanced Random Forests
Zhang, Haozhe, Nettleton, Dan, Zhu, Zhengyuan
In the last few years, there have been many methodological and theoretical advances in the random forests (RF) approach. Methodological developments and extensions include case-specific random forests [19], multivariate random forests [16], quantile regression forests [13], random survival forests [11], enriched random forests for microarray data [1], and predictor augmentation in random forests [18], among others. On the theoretical side, the statistical and asymptotic properties of random forests have been intensively investigated, with advances in areas such as consistency [2] [15], variable selection [8], and the construction of confidence intervals [17]. Although RF methodology has proven itself a reliable predictive approach in many application areas [3] [10], there are some cases where random forests may suffer. First, as a fully nonparametric predictive algorithm, random forests may not efficiently incorporate known relationships between the response and the predictors. Second, random forests may fail in extrapolation problems, where predictions are required at points outside the domain of the training dataset. For regression problems, a random forest prediction is an average of the predictions produced by the trees in the forest.
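The last sentence explains exactly why extrapolation fails: each tree predicts an average of training responses in some leaf, and the forest averages those, so the prediction can never leave the range of the training responses. A minimal sketch of this averaging (the per-tree leaf predictions here are hypothetical values, not output of a fitted forest):

```python
# A regression forest prediction is the mean of the per-tree predictions.
def forest_predict(tree_predictions):
    return sum(tree_predictions) / len(tree_predictions)

# Hypothetical scenario: a query point far outside the training domain.
# Each tree still falls back to a leaf mean computed from training
# responses, so every per-tree prediction lies inside the training range.
train_responses = [2.0, 3.5, 5.0, 4.0, 1.75]
tree_preds = [5.0, 4.0, 5.0, 3.5]   # leaf means drawn from train_responses

pred = forest_predict(tree_preds)
print(pred)                          # 4.375
# The forest average is bounded by the training responses -- it cannot
# extrapolate beyond [min(y), max(y)].
assert min(train_responses) <= pred <= max(train_responses)
```

This bounded-average behavior is what a regression-enhanced forest tries to overcome by combining the forest with a parametric regression component.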
Possibilistic decreasing persistence
Driankov, Dimiter, Lang, Jerome
A key issue in the handling of temporal data is the treatment of persistence. In most approaches it consists in inferring defeasible conclusions by extrapolating from the actual knowledge of the history of the world. We propose here a gradual modelling of persistence, following the idea that persistence is decreasing: the further we are from the last time point where a fluent is known to be true, the less certainly true the fluent is. It is based on possibility theory, which has strong relations with other well-known ordering-based approaches to nonmonotonic reasoning. We compare our approach with Dean and Kanazawa's probabilistic projection. We give a formal modelling of the decreasing persistence problem. Lastly, we show how to infer nonmonotonic conclusions using the principle of decreasing persistence.
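The central idea of decreasing persistence can be sketched numerically: the possibility degree that a fluent still holds decays with the time elapsed since it was last observed true. The exponential shape and the decay rate below are illustrative assumptions for the sketch, not the specific possibility distribution used in the paper.

```python
import math

def persistence_possibility(t, t_last_observed, decay_rate=0.5):
    """Possibility degree in (0, 1] that a fluent observed true at
    t_last_observed still holds at time t.
    The exponential form is an illustrative choice, not the paper's."""
    if t < t_last_observed:
        raise ValueError("t must not precede the last observation")
    return math.exp(-decay_rate * (t - t_last_observed))

# At the observation instant the fluent is fully possible; the degree of
# certainty then erodes monotonically as time passes.
print(persistence_possibility(0.0, 0.0))   # 1.0
print(persistence_possibility(4.0, 0.0))   # exp(-2.0), roughly 0.135
```

Any non-increasing function of elapsed time with value 1 at the observation instant would express the same qualitative principle; possibility theory only constrains the ordering of these degrees, not their exact numeric shape.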