rocket model
Detach-ROCKET: Sequential feature selection for time series classification with random convolutional kernels
Uribarri, Gonzalo, Barone, Federico, Ansuini, Alessio, Fransén, Erik
Time Series Classification (TSC) is essential in many fields, such as medicine, environmental science and finance, enabling tasks like disease diagnosis, anomaly detection, and stock price analysis. Machine learning models for TSC like Recurrent Neural Networks and InceptionTime, while successful in numerous applications, can face scalability limitations due to intensive computational requirements. To address this, efficient models such as ROCKET and its derivatives have emerged, simplifying training and achieving state-of-the-art performance by utilizing a large number of randomly generated features from time series data. However, due to their random nature, most of the generated features are redundant or non-informative, adding unnecessary computational load and compromising generalization. Here, we introduce Sequential Feature Detachment (SFD) as a method to identify and prune these non-essential features. SFD uses model coefficients to estimate feature importance and, unlike previous algorithms, can handle large feature sets without the need for complex hyperparameter tuning. Testing on the UCR archive demonstrates that SFD can produce models with $10\%$ of the original features while improving test accuracy by $0.2\%$. We also present Detach-ROCKET, an end-to-end procedure for determining an optimal balance between the number of features and model accuracy. When applied to the largest binary UCR dataset, Detach-ROCKET improves test accuracy by $0.6\%$ while reducing the number of features by $98.9\%$. Thus, our proposed procedure is not only lightweight to train and effective in reducing model size and enhancing generalization, but its significant reduction in feature count also paves the way for feature interpretation.
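The abstract describes SFD only at a high level: feature importance is estimated from model coefficients, and features are pruned sequentially. The following is a minimal sketch of that idea, not the authors' implementation. It assumes a closed-form ridge regression on ±1 labels as the linear model, and the names `ridge_fit`, `sequential_feature_detachment`, `keep_fraction`, `drop_rate` and `alpha` are hypothetical, chosen for illustration.

```python
import numpy as np


def ridge_fit(X, y, alpha=1.0):
    """Closed-form ridge regression on +/-1 targets.

    Assumption: no intercept is fitted, so X should be roughly centered.
    """
    n_features = X.shape[1]
    gram = X.T @ X + alpha * np.eye(n_features)
    return np.linalg.solve(gram, X.T @ y)


def sequential_feature_detachment(X, y, keep_fraction=0.1, drop_rate=0.05, alpha=1.0):
    """Return the sorted indices of the features that survive pruning.

    At each step the linear model is refitted on the active features and
    the fraction `drop_rate` with the smallest absolute coefficients is
    removed, until `keep_fraction` of the original features remain.
    (The schedule and stopping rule are assumptions, not from the paper.)
    """
    active = np.arange(X.shape[1])
    target = max(1, int(round(keep_fraction * X.shape[1])))
    while active.size > target:
        coef = ridge_fit(X[:, active], y, alpha)
        # Drop at least one feature per step, but never go below the target.
        n_drop = min(max(1, int(drop_rate * active.size)), active.size - target)
        order = np.argsort(np.abs(coef))  # ascending importance
        active = np.sort(active[order[n_drop:]])
    return active
```

In a ROCKET-style pipeline, `X` would be the matrix of random-kernel features (one row per time series) after standardization, and the returned indices would select the kernels to keep at inference time.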
Predicting Surface Texture in Steel Manufacturing at Speed
Milne, Alexander J. M., Xie, Xianghua
Control of the surface texture of steel strip during the galvanizing and temper rolling processes is essential to satisfy customer requirements; the texture is conventionally measured post-production using a stylus. In-production laser reflection measurement is less consistent than physical measurement but enables real-time adjustment of processing parameters to optimize product surface characteristics. We propose the use of machine learning to improve the accuracy of the transformation from inline laser reflection measurements to a prediction of surface properties. In addition to accuracy, model evaluation speed is important for fast feedback control. The ROCKET model is one of the fastest state-of-the-art models; however, it can be sped up further by utilizing a GPU. Our contribution is to implement the model in PyTorch for fast GPU kernel transforms and to provide a soft version of the Proportion of Positive Values (PPV) nonlinear pooling function, allowing gradient flow. We perform timing and performance experiments comparing the implementations
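
The abstract does not give the form of the soft PPV. As an illustration only, one common way to relax a hard "greater than zero" count is to replace the step function with a sigmoid, which has a non-zero gradient everywhere. The sketch below uses plain Python rather than PyTorch to stay dependency-free; the sigmoid choice and the names `ppv`, `soft_ppv` and `sharpness` are assumptions, not taken from the paper.

```python
import math


def ppv(values):
    """Hard PPV: fraction of convolution outputs that are strictly positive."""
    return sum(1 for v in values if v > 0) / len(values)


def soft_ppv(values, sharpness=10.0):
    """Smooth surrogate for PPV (assumed form: mean of sigmoids).

    Each hard indicator [v > 0] is replaced by sigmoid(sharpness * v),
    written with tanh so that large |v| cannot overflow. As `sharpness`
    grows, the result approaches the hard PPV for non-zero inputs.
    """
    return sum(0.5 * (1.0 + math.tanh(0.5 * sharpness * v)) for v in values) / len(values)
```

Under this assumed form, a larger `sharpness` tracks the hard PPV more closely but makes gradients vanish away from zero, so the value would need tuning if the kernels were trained by gradient descent.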