interpolation point
Less is More: Non-uniform Road Segments are Efficient for Bus Arrival Prediction
Huang, Zhen, Deng, Jiaxin, Xu, Jiayu, Pang, Junbiao, Yu, Haitao
Abstract: In bus arrival time prediction, the process of organizing road infrastructure network data into homogeneous entities is known as segmentation. Segmenting a road network is widely recognized as the first and most critical step in developing an arrival time prediction system, particularly for auto-regressive-based approaches. Traditional methods typically employ a uniform segmentation strategy, which fails to account for varying physical constraints along roads, such as road conditions, intersections, and points of interest, thereby limiting prediction efficiency. In this paper, we propose a Reinforcement Learning (RL)-based approach to efficiently and adaptively learn non-uniform road segments for arrival time prediction. Our method decouples the prediction process into two stages: 1) non-uniform road segments are extracted based on their impact scores using the proposed RL framework; and 2) a linear prediction model is applied to the selected segments to make predictions. This method ensures optimal segment selection while maintaining computational efficiency, offering a significant improvement over traditional uniform approaches. Furthermore, our experimental results suggest that the linear approach can even achieve better performance than more complex methods. Extensive experiments demonstrate the superiority of the proposed method, which not only enhances efficiency but also improves learning performance on large-scale benchmarks.
- Transportation > Infrastructure & Services (1.00)
- Transportation > Ground > Road (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Towards Interpretable Deep Learning and Analysis of Dynamical Systems via the Discrete Empirical Interpolation Method
We present a differentiable framework that leverages the Discrete Empirical Interpolation Method (DEIM) for interpretable deep learning and dynamical system analysis. Although DEIM efficiently approximates nonlinear terms in projection-based reduced-order models (POD-ROM), its fixed interpolation points limit the adaptability to complex and time-varying dynamics. To address this limitation, we first develop a differentiable adaptive DEIM formulation for the one-dimensional viscous Burgers equation, which allows neural networks to dynamically select interpolation points in a computationally efficient and physically consistent manner. We then apply DEIM as an interpretable analysis tool for examining the learned dynamics of a pre-trained Neural Ordinary Differential Equation (NODE) on a two-dimensional vortex-merging problem. The DEIM trajectories reveal physically meaningful features in the learned dynamics of NODE and expose its limitations when extrapolating to unseen flow configurations. These findings demonstrate that DEIM can serve not only as a model reduction tool but also as a diagnostic framework for understanding and improving the generalization behavior of neural differential equation models.
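For readers unfamiliar with how DEIM picks its interpolation points, the following is a minimal sketch of the classical greedy selection, not the differentiable adaptive variant described in the abstract. The function names `solve_linear` and `deim_points`, and the representation of the basis as a list of columns, are assumptions made for illustration.

```python
def solve_linear(a, b):
    """Solve a x = b for a small dense system (Gaussian elimination, partial pivoting)."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Swap in the row with the largest pivot to keep the elimination stable.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def deim_points(basis):
    """Greedy DEIM index selection.

    basis: list of m linearly independent columns, each a list of n floats
    (hypothetical layout chosen for this sketch).
    Returns m row indices, one per basis column.
    """
    n = len(basis[0])
    # First point: largest-magnitude entry of the first basis vector.
    points = [max(range(n), key=lambda i: abs(basis[0][i]))]
    for l in range(1, len(basis)):
        # Fit the new column at the points chosen so far using the earlier columns.
        a = [[basis[j][p] for j in range(l)] for p in points]
        b = [basis[l][p] for p in points]
        coeffs = solve_linear(a, b)
        # The residual vanishes at the selected points; pick where it is largest.
        residual = [
            basis[l][i] - sum(coeffs[j] * basis[j][i] for j in range(l))
            for i in range(n)
        ]
        points.append(max(range(n), key=lambda i: abs(residual[i])))
    return points
```

Because the residual is zero at every previously selected index, each new point differs from the earlier ones as long as the basis columns are linearly independent. The fixed indices this procedure returns are what the abstract refers to as limiting adaptability.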
Scaling Gaussian Process Regression with Full Derivative Observations
We present DSoftKI, a scalable Gaussian Process (GP) method that can fit and predict full derivative observations. It extends SoftKI, a method that approximates a kernel via softmax interpolation from learned interpolation point locations, to the setting with derivatives. DSoftKI enhances SoftKI's interpolation scheme to incorporate the directional orientation of interpolation points relative to the data. This enables the construction of a scalable approximate kernel, including its first and second-order derivatives, through interpolation. We evaluate DSoftKI on a synthetic function benchmark and high-dimensional molecular force field prediction (100-1000 dimensions), demonstrating that DSoftKI is accurate and can scale to larger datasets with full derivative observations than previously possible.
- North America > United States (0.04)
- Europe > Spain > Andalusia > Cádiz Province > Cadiz (0.04)
Universal Approximation with Softmax Attention
Hu, Jerry Yao-Chieh, Liu, Hude, Chen, Hong-Yu, Wu, Weimin, Liu, Han
We prove that either two-layer self-attention or one-layer self-attention followed by a softmax (each equipped only with linear transformations) is capable of approximating any sequence-to-sequence continuous function on a compact domain. Different from previous studies [Yun et al., 2019, Jiang and Li, 2023, Takakura and Suzuki, 2023, Kajitsuka and Sato, 2023, Hu et al., 2024], our results highlight the expressive power of Transformers derived only from the attention module. By focusing exclusively on attention, our analysis demonstrates that the softmax operation itself suffices as a piecewise linear approximator. Furthermore, we extend this framework to broader applications, such as in-context learning [Brown et al., 2020, Bai et al., 2024], using the same attention-only architecture. Prior studies of Transformer-based universality lean on deep attention stacks [Yun et al., 2019] or feed-forward (FFN) sub-layers [Kajitsuka and Sato, 2023, Hu et al., 2024] or strong assumptions on data or architecture [Takakura and Suzuki, 2023, Petrov et al., 2024]. These results make it unclear whether attention alone is essential or auxiliary. To combat this, we develop a new interpolation-based technique for analyzing attention.
- North America > United States > Illinois > Cook County > Evanston (0.04)
- Asia > China > Shanghai > Shanghai (0.04)
Active Sampling of Interpolation Points to Identify Dominant Subspaces for Model Reduction
Reddig, Celine, Goyal, Pawan, Duff, Igor Pontes, Benner, Peter
Model reduction is an active research field to construct low-dimensional surrogate models of high fidelity to accelerate engineering design cycles. In this work, we investigate model reduction for linear structured systems using dominant reachable and observable subspaces. When the training set, which contains all possible interpolation points, is large, these subspaces can be determined by solving many large-scale linear systems. However, for high-fidelity models, this easily becomes computationally intractable. To circumvent this issue, we propose an active sampling strategy that samples only a few points from the given training set, which allows us to estimate those subspaces accurately. To this end, we formulate the identification of the subspaces as the solution of generalized Sylvester equations, guiding us to select the most relevant samples from the training set. Consequently, we construct solutions of the matrix equations in low-rank forms, which encode subspace information. We extensively discuss computational aspects and efficient usage of the low-rank factors in the process of obtaining reduced-order models. We illustrate the proposed active sampling scheme to obtain reduced-order models via dominant reachable and observable subspaces and compare it with the method where all the points from the training set are taken into account. It is shown that the active sampling strategy can provide a $17\times$ speed-up without any noticeable loss of accuracy.
- Europe > Germany > Saxony-Anhalt > Magdeburg (0.07)
- North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
- Europe > Germany > Baden-Württemberg > Stuttgart Region > Stuttgart (0.04)
- Workflow (0.46)
- Research Report (0.40)
Privacy-aware Berrut Approximated Coded Computing for Federated Learning
Luaña, Xavier Martínez, Redondo, Rebeca P. Díaz, Veiga, Manuel Fernández
Federated Learning (FL) is a strategy that enables the collaborative training of an AI model among different data owners without revealing their private datasets. Even so, FL has privacy vulnerabilities that researchers have tried to overcome with techniques such as Differential Privacy (DP), Homomorphic Encryption (HE), or Secure Multi-Party Computation (SMPC). However, these techniques have important drawbacks that may narrow their range of application: difficulty handling non-linear functions and large matrix multiplications, and high communication and computational costs for managing semi-honest nodes. In this context, we propose a solution to guarantee privacy in FL schemes that simultaneously solves the previously mentioned problems. Our proposal is based on Berrut Approximated Coded Computing, a technique from the Coded Distributed Computing paradigm, adapted to a Secret Sharing configuration, to provide input privacy to FL in a scalable way. It can be applied to compute non-linear functions and treats the special case of distributed matrix multiplication, a key primitive at the core of many automated learning tasks. Because of these characteristics, it could be applied in a wide range of FL scenarios, since it is independent of the machine learning models or aggregation algorithms used in the FL scheme. We provide an analysis of the achieved privacy and the complexity of our solution, and extensive numerical results show a good trade-off between privacy and precision.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Europe > Spain (0.04)
- North America > United States > Washington > King County > Redmond (0.04)
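As background for the Berrut-based entry above, the sketch below evaluates Berrut's first rational interpolant, the barycentric formula with alternating weights (-1)^i. It only illustrates the interpolation step that the technique's name refers to and is not the coded computing or secret sharing scheme from the paper. The function name `berrut_interpolate` and its argument layout are assumptions.

```python
def berrut_interpolate(nodes, values, x):
    """Evaluate Berrut's first rational interpolant at x.

    nodes: distinct interpolation points sorted in increasing order.
    values: function values at those nodes.
    Formula: r(x) = sum_i w_i f_i / (x - x_i) / sum_i w_i / (x - x_i), w_i = (-1)^i.
    """
    if not nodes or len(nodes) != len(values):
        raise ValueError("nodes and values must be non-empty and of equal length")
    numerator = 0.0
    denominator = 0.0
    for i, (xi, fi) in enumerate(zip(nodes, values)):
        if x == xi:
            # At a node the interpolant equals the data value exactly.
            return fi
        weight = (-1.0) ** i / (x - xi)
        numerator += weight * fi
        denominator += weight
    return numerator / denominator
```

The interpolant reproduces the data at the nodes and returns a constant for constant data. Between nodes it is an approximation rather than an exact reconstruction, which is consistent with the privacy versus precision trade-off the abstract mentions.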
Engineering software 2.0 by interpolating neural networks: unifying training, solving, and calibration
Park, Chanwook, Saha, Sourav, Guo, Jiachen, Xie, Xiaoyu, Mojumder, Satyajit, Bessa, Miguel A., Qian, Dong, Chen, Wei, Wagner, Gregory J., Cao, Jian, Liu, Wing Kam
The evolution of artificial intelligence (AI) and neural network theories has revolutionized the way software is programmed, shifting from a hard-coded series of codes to a vast neural network. However, this transition in engineering software has faced challenges such as data scarcity, multi-modality of data, low model accuracy, and slow inference. Here, we propose a new network based on interpolation theories and tensor decomposition, the interpolating neural network (INN). Instead of interpolating training data, a common notion in computer science, INN interpolates interpolation points in the physical space whose coordinates and values are trainable. It can also extrapolate if the interpolation points reside outside of the range of training data and the interpolation functions have a larger support domain. INN features orders of magnitude fewer trainable parameters, faster training, a smaller memory footprint, and higher model accuracy compared to feed-forward neural networks (FFNN) or physics-informed neural networks (PINN). INN is poised to usher in Engineering Software 2.0, a unified neural network that spans various domains of space, time, parameters, and initial/boundary conditions. This has previously been computationally prohibitive due to the exponentially growing number of trainable parameters, easily exceeding the parameter size of ChatGPT, which is over 1 trillion. INN addresses this challenge by leveraging tensor decomposition and tensor product, with adaptable network architecture.
- North America > United States > Illinois > Cook County > Evanston (0.06)
- North America > United States > Texas > Dallas County > Richardson (0.04)
- North America > United States > Rhode Island > Providence County > Providence (0.04)
- Africa > Senegal > Kolda Region > Kolda (0.04)
Risk-Sensitive and Robust Decision-Making: a CVaR Optimization Approach
In this paper we address the problem of decision making within a Markov decision process (MDP) framework where risk and modeling errors are taken into account. Our approach is to minimize a risk-sensitive conditional-value-at-risk (CVaR) objective, as opposed to a standard risk-neutral expectation. We refer to such a problem as a CVaR MDP. Our first contribution is to show that a CVaR objective, besides capturing risk sensitivity, has an alternative interpretation as expected cost under worst-case modeling errors, for a given error budget. This result, which is of independent interest, motivates CVaR MDPs as a unifying framework for risk-sensitive and robust decision making. Our second contribution is to present an approximate value-iteration algorithm for CVaR MDPs and analyze its convergence rate. To our knowledge, this is the first solution algorithm for CVaR MDPs that enjoys error guarantees. Finally, we present results from numerical experiments that corroborate our theoretical findings and show the practicality of our approach.
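To make the CVaR objective concrete, here is a small sketch that computes the CVaR of equally weighted cost samples with the Rockafellar-Uryasev formula, min over w of w + E[(Z - w)^+] / alpha. It assumes the convention in which alpha is the tail fraction, so alpha = 1 gives the plain expectation and small alpha approaches the worst case; the abstract does not state which convention the paper uses. The name `empirical_cvar` is hypothetical, and this is a single-distribution illustration, not the value-iteration algorithm.

```python
def empirical_cvar(costs, alpha):
    """CVaR at level alpha of equally weighted cost samples.

    Assumed convention: alpha in (0, 1] is the fraction of worst outcomes
    that are averaged (alpha = 1 recovers the mean).
    """
    if not costs:
        raise ValueError("costs must be non-empty")
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    n = len(costs)

    def objective(w):
        # w + E[(Z - w)^+] / alpha for the empirical distribution.
        excess = sum(max(z - w, 0.0) for z in costs) / n
        return w + excess / alpha

    # The objective is convex and piecewise linear in w with kinks only at
    # the sample values, so a minimizer can be found among the samples.
    return min(objective(w) for w in costs)
```

For costs 1, 2, 3, 4 and alpha = 0.5, the result is the mean of the worst half, 3.5, while alpha = 1 returns the ordinary mean 2.5.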
Fractal interpolation in the context of prediction accuracy optimization
Baicoianu, Alexandra, Gavrilă, Cristina Gabriela, Pacurar, Cristina Maria, Pacurar, Victor Dan
This paper focuses on the hypothesis of optimizing time series predictions using fractal interpolation techniques. In general, the accuracy of machine learning model predictions is closely related to the quality and quantity of the data used, following the principle of garbage-in, garbage-out. In order to quantitatively and qualitatively augment datasets, one of the most prevalent concerns of data scientists is to generate synthetic data, which should follow as closely as possible the actual pattern of the original data. This study proposes three different data augmentation strategies based on fractal interpolation, namely the Closest Hurst Strategy, Closest Values Strategy and Formula Strategy. To validate the strategies, we used four public datasets from the literature, as well as a private dataset obtained from meteorological records in the city of Brasov, Romania. The prediction results obtained with the LSTM model using the presented interpolation strategies showed a significant accuracy improvement compared to the raw datasets, thus providing a possible answer to practical problems in the field of remote sensing and sensor sensitivity. Moreover, our methodologies answer some optimization-related open questions for the fractal interpolation step using the Optuna framework.
- Europe > Romania > Centru Development Region > Brașov County > Brașov (0.25)
- Europe > Austria (0.04)
- North America > United States > Indiana (0.04)
- North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)
- Research Report > New Finding (0.68)
- Research Report > Experimental Study (0.46)
- Banking & Finance > Trading (0.46)
- Energy (0.34)