source system
Adaptive prediction theory combining offline and online learning
Real-world intelligence systems usually operate by combining offline learning and online adaptation with highly correlated and non-stationary system data or signals, which, however, has rarely been investigated theoretically in the literature. This paper initiates a theoretical investigation on the prediction performance of a two-stage learning framework combining offline and online algorithms for a class of nonlinear stochastic dynamical systems. For the offline-learning phase, we establish an upper bound on the generalization error for approximate nonlinear-least-squares estimation under general datasets with strong correlation and distribution shift, leveraging the Kullback-Leibler divergence to quantify the distributional discrepancies. For the online-adaptation phase, we address, on the basis of the offline-trained model, the possible uncertain parameter drift in real-world target systems by proposing a meta-LMS prediction algorithm. This two-stage framework, integrating offline learning with online adaptation, demonstrates superior prediction performances compared with either purely offline or online methods. Both theoretical guarantees and empirical studies are provided.
LogAction: Consistent Cross-system Anomaly Detection through Logs via Active Domain Adaptation
Duan, Chiming, He, Minghua, Xiao, Pei, Jia, Tong, Zhang, Xin, Zhong, Zhewei, Luo, Xiang, Niu, Yan, Zhang, Lingzhe, Wu, Yifan, Yu, Siyu, Hong, Weijie, Li, Ying, Huang, Gang
Log-based anomaly detection is a essential task for ensuring the reliability and performance of software systems. However, the performance of existing anomaly detection methods heavily relies on labeling, while labeling a large volume of logs is highly challenging. To address this issue, many approaches based on transfer learning and active learning have been proposed. Nevertheless, their effectiveness is hindered by issues such as the gap between source and target system data distributions and cold-start problems. In this paper, we propose LogAction, a novel log-based anomaly detection model based on active domain adaptation. LogAction integrates transfer learning and active learning techniques. On one hand, it uses labeled data from a mature system to train a base model, mitigating the cold-start issue in active learning. On the other hand, LogAction utilize free energy-based sampling and uncertainty-based sampling to select logs located at the distribution boundaries for manual labeling, thus addresses the data distribution gap in transfer learning with minimal human labeling efforts. Experimental results on six different combinations of datasets demonstrate that LogAction achieves an average 93.01% F1 score with only 2% of manual labels, outperforming some state-of-the-art methods by 26.28%. Website: https://logaction.github.io
On the Residual-based Neural Network for Unmodeled Distortions in Coordinate Transformation
Rofatto, Vinicius Francisco, de Almeida, Luiz Felipe Rodrigues, Matsuoka, Marcelo Tomio, Klein, Ivandro, Veronez, Mauricio Roberto, Junior, Luiz Gonzaga Da Silveira
Coordinate transformation models often fail to account for nonlinear and spatially dependent distortions, leading to significant residual errors in geospatial applications. Here we propose a residual-based neural correction strategy, in which a neural network learns to model only the systematic distortions left by an initial geometric transformation. By focusing solely on residual patterns, the proposed method reduces model complexity and improves performance, particularly in scenarios with sparse or structured control point configurations. We evaluate the method using both simulated datasets with varying distortion intensities and sampling strategies, as well as under the real-world image georeferencing tasks. Compared with direct neural network coordinate converter and classical transformation models, the residual-based neural correction delivers more accurate and stable results under challenging conditions, while maintaining comparable performance in ideal cases. These findings demonstrate the effectiveness of residual modelling as a lightweight and robust alternative for improving coordinate transformation accuracy.
Meta-Learning for Physically-Constrained Neural System Identification
Chakrabarty, Ankush, Wichern, Gordon, Deshpande, Vedang M., Vinod, Abraham P., Berntorp, Karl, Laughman, Christopher R.
We present a gradient-based meta-learning framework for rapid adaptation of neural state-space models (NSSMs) for black-box system identification. When applicable, we also incorporate domain-specific physical constraints to improve the accuracy of the NSSM. The major benefit of our approach is that instead of relying solely on data from a single target system, our framework utilizes data from a diverse set of source systems, enabling learning from limited target data, as well as with few online training iterations. Through benchmark examples, we demonstrate the potential of our approach, study the effect of fine-tuning subnetworks rather than full fine-tuning, and report real-world case studies to illustrate the practical application and generalizability of the approach to practical problems with physical-constraints. Specifically, we show that the meta-learned models result in improved downstream performance in model-based state estimation in indoor localization and energy systems.
Transfer Learning for Control Systems via Neural Simulation Relations
Nadali, Alireza, Zhong, Bingzhuo, Trivedi, Ashutosh, Zamani, Majid
Transfer learning is an umbrella term for machine learning approaches that leverage knowledge gained from solving one problem (the source domain) to improve speed, efficiency, and data requirements in solving a different but related problem (the target domain). The performance of the transferred model in the target domain is typically measured via some notion of loss function in the target domain. This paper focuses on effectively transferring control logic from a source control system to a target control system while providing approximately similar behavioral guarantees in both domains. However, in the absence of a complete characterization of behavioral specifications, this problem cannot be captured in terms of loss functions. To overcome this challenge, we use (approximate) simulation relations to characterize observational equivalence between the behaviors of two systems. Simulation relations ensure that the outputs of both systems, equipped with their corresponding controllers, remain close to each other over time, and their closeness can be quantified {\it a priori}. By parameterizing simulation relations with neural networks, we introduce the notion of \emph{neural simulation relations}, which provides a data-driven approach to transfer any synthesized controller, regardless of the specification of interest, along with its proof of correctness. Compared with prior approaches, our method eliminates the need for a closed-loop mathematical model and specific requirements for both the source and target systems. We also introduce validity conditions that, when satisfied, guarantee the closeness of the outputs of two systems equipped with their corresponding controllers, thus eliminating the need for post-facto verification. We demonstrate the effectiveness of our approach through case studies involving a vehicle and a double inverted pendulum.
Local transfer learning Gaussian process modeling, with applications to surrogate modeling of expensive computer simulators
Wang, Xinming, Mak, Simon, Miller, John, Wu, Jianguo
A critical bottleneck for scientific progress is the costly nature of computer simulations for complex systems. Surrogate models provide an appealing solution: such models are trained on simulator evaluations, then used to emulate and quantify uncertainty on the expensive simulator at unexplored inputs. In many applications, one often has available data on related systems. For example, in designing a new jet turbine, there may be existing studies on turbines with similar configurations. A key question is how information from such "source" systems can be transferred for effective surrogate training on the "target" system of interest. We thus propose a new LOcal transfer Learning Gaussian Process (LOL-GP) model, which leverages a carefully-designed Gaussian process to transfer such information for surrogate modeling. The key novelty of the LOL-GP is a latent regularization model, which identifies regions where transfer should be performed and regions where it should be avoided. This "local transfer" property is desirable in scientific systems: at certain parameters, such systems may behave similarly and thus transfer is beneficial; at other parameters, they may behave differently and thus transfer is detrimental. By accounting for local transfer, the LOL-GP can rectify a critical limitation of "negative transfer" in existing transfer learning models, where the transfer of information worsens predictive performance. We derive a Gibbs sampling algorithm for efficient posterior predictive sampling on the LOL-GP, for both the multi-source and multi-fidelity transfer settings. We then show, via a suite of numerical experiments and an application for jet turbine design, the improved surrogate performance of the LOL-GP over existing methods.
Transfer of Safety Controllers Through Learning Deep Inverse Dynamics Model
Nadali, Alireza, Trivedi, Ashutosh, Zamani, Majid
Control barrier certificates have proven effective in formally guaranteeing the safety of the control systems. However, designing a control barrier certificate is a time-consuming and computationally expensive endeavor that requires expert input in the form of domain knowledge and mathematical maturity. Additionally, when a system undergoes slight changes, the new controller and its correctness certificate need to be recomputed, incurring similar computational challenges as those faced during the design of the original controller. Prior approaches have utilized transfer learning to transfer safety guarantees in the form of a barrier certificate while maintaining the control invariant. Unfortunately, in practical settings, the source and the target environments often deviate substantially in their control inputs, rendering the aforementioned approach impractical. To address this challenge, we propose integrating \emph{inverse dynamics} -- a neural network that suggests required action given a desired successor state -- of the target system with the barrier certificate of the source system to provide formal proof of safety. In addition, we propose a validity condition that, when met, guarantees correctness of the controller. We demonstrate the effectiveness of our approach through three case studies.
Can Transformers Learn Optimal Filtering for Unknown Systems?
Balim, Haldun, Du, Zhe, Oymak, Samet, Ozay, Necmiye
Transformer models have shown great success in natural language processing; however, their potential remains mostly unexplored for dynamical systems. In this work, we investigate the optimal output estimation problem using transformers, which generate output predictions using all the past ones. Particularly, we train the transformer using various distinct systems and then evaluate the performance on unseen systems with unknown dynamics. Empirically, the trained transformer adapts exceedingly well to different unseen systems and even matches the optimal performance given by the Kalman filter for linear systems. In more complex settings with non-i.i.d. noise, time-varying dynamics, and nonlinear dynamics like a quadrotor system with unknown parameters, transformers also demonstrate promising results. To support our experimental findings, we provide statistical guarantees that quantify the amount of training data required for the transformer to achieve a desired excess risk. Finally, we point out some limitations by identifying two classes of problems that lead to degraded performance, highlighting the need for caution when using transformers for control and estimation.
6 Things You Need To Know About Data Management And Why It Matters For Computer Vision - KDnuggets
The race to Industry 4.0 and accelerated adoption of digital automation are pressure testing organizations across industries. It is becoming increasingly matter-of-fact that an enterprise's ability to leverage data is a key source of competitive advantage--this principle especially holds true when it comes to building and maintaining computer vision applications. Visual automation models are powered by images and videos that capture a digital representation of our physical world. In most enterprises, media is captured across multiple sensor edge devices and lives in siloed source systems, making the integration of media across environments a core challenge that needs to be solved when building a computer vision system that seeks to automate visual inspection. Media becomes even more important in a world where model architectures (the code that builds up the neural networks on which an AI model is trained) are increasingly commoditized and stable.
Evolution of intelligent data pipelines
The potential of artificial intelligence (AI) and machine learning (ML) seems almost unbounded in its ability to derive and drive new sources of customer, product, service, operational, environmental, and societal value. If your organization is to compete in the economy of the future, then AI must be at the core of your business operations. A study by Kearney titled "The Impact of Analytics in 2020" highlights the untapped profitability and business impact for organizations looking for justification to accelerate their data science (AI / ML) and data management investments: The business, operational, and societal impacts could be staggering except for one significant organizational challenge--data. "The model and the code for many applications are basically a solved problem. Now that the models have advanced to a certain point, we've got to make the data work as well."