Active Learning for Gaussian Process Considering Uncertainties with Application to Shape Control of Composite Fuselage
Yue, Xiaowei, Wen, Yuchen, Hunt, Jeffrey H., Shi, Jianjun
This paper has been accepted by IEEE Transactions on Automation Science and Engineering. This preprint is an accepted version, not the IEEE published version.
Abstract--In the machine learning domain, active learning is an iterative data selection algorithm for maximizing information acquisition and improving model performance with limited training samples. It is especially useful in industrial applications where training samples are expensive, time-consuming, or difficult to obtain. Existing methods mainly focus on active learning for classification; only a few are designed for regression, such as linear regression or Gaussian process. Uncertainties from measurement errors and intrinsic input noise inevitably exist in the experimental data, and they further affect modeling performance. Existing active learning methods for the Gaussian process do not incorporate these uncertainties. In this paper, we propose two new active learning algorithms for the Gaussian process with uncertainties: a variance-based weighted active learning algorithm and a D-optimal weighted active learning algorithm. Through a numerical study, we show that the proposed approach can incorporate the impact of uncertainties and achieve better prediction performance. This approach has been applied to improving the predictive modeling for automatic shape control of composite fuselage.
I. INTRODUCTION
Active learning is a type of iterative supervised learning that focuses on maximizing information acquisition with limited samples. In the statistics literature, this process is also called optimal experimental design or sequential design. The main idea of active learning is to iteratively pose a "query" or "design" that identifies the most informative new experimental samples, based on the information obtained from the current samples.
In many machine learning applications, especially in some industrial systems, the explanatory data are abundant and easy to obtain, but the response data are expensive, time-consuming, or difficult to obtain. For example, when training autonomous driving algorithms, large amounts of media (e.g., images, videos) must be marked by oracle users with particular labels, such as "vehicle", "street sign", or "road lines". Annotating many such instances can be tedious, redundant, and time-consuming.
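The iterative query idea described in the introduction can be illustrated with a minimal sketch. It is a generic maximum-variance selection rule for a Gaussian process, not the weighted algorithms proposed in the paper, and it does not model measurement error in the inputs. The kernel choice, the `length_scale` and `noise_var` values, the candidate pool, and all function names are assumptions made for illustration only.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=0.2):
    """Squared-exponential kernel between two sets of 1-D inputs (assumed kernel)."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)


def posterior_variance(x_train, x_pool, noise_var=1e-4):
    """GP posterior variance at each pool point, with unit prior variance."""
    k_tt = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    k_pt = rbf_kernel(x_pool, x_train)
    solved = np.linalg.solve(k_tt, k_pt.T)  # shape (n_train, n_pool)
    return 1.0 - np.sum(k_pt * solved.T, axis=1)


def select_queries(x_init, x_pool, n_queries):
    """Greedily query the pool point with the largest posterior variance."""
    x_train = np.array(x_init, dtype=float)
    chosen = []
    for _ in range(n_queries):
        var = posterior_variance(x_train, x_pool)
        idx = int(np.argmax(var))  # most uncertain candidate under the current model
        chosen.append(idx)
        x_train = np.append(x_train, x_pool[idx])
    return x_train, chosen


# Hypothetical setup: one initial sample and a pool of 101 candidate inputs.
x_pool = np.linspace(0.0, 1.0, 101)
x_train, chosen = select_queries([0.5], x_pool, n_queries=4)
```

With fixed kernel hyperparameters, the posterior variance depends only on the input locations, so this sketch can pick queries without calling the oracle. In practice the expensive response would be collected at each selected input and the model refitted before the next query.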
Apr-22-2020