Goto

Collaborating Authors

 minirocket


HIT-ROCKET: Hadamard-vector Inner-product Transformer for ROCKET

Hao, Wang, Zhang, Kuang, Chengyu, Hou, Zhonghao, Yuan, Chenxing, Tan, Weifeng, Fu, Yangying, Zhu

arXiv.org Artificial Intelligence

Time series classification holds broad application value in communications, information countermeasures, finance, and medicine. However, state-of-the-art (SOTA) methods-including HIVE-COTE, Proximity Forest, and TS-CHIEF-exhibit high computational complexity, coupled with lengthy parameter tuning and training cycles. In contrast, lightweight solutions like ROCKET (Random Convolutional Kernel Transform) offer greater efficiency but leave substantial room for improvement in kernel selection and computational overhead. To address these challenges, we propose a feature extraction approach based on Hadamard convolutional transform, utilizing column or row vectors of Hadamard matrices as convolution kernels with extended lengths of varying sizes. This enhancement maintains full compatibility with existing methods (e.g., ROCKET) while leveraging kernel orthogonality to boost computational efficiency, robustness, and adaptability. Comprehensive experiments on multi-domain datasets-focusing on the UCR time series dataset-demonstrate SOTA performance: F1-score improved by at least 5% vs. ROCKET, with 50% shorter training time than miniROCKET (fastest ROCKET variant) under identical hyperparameters, enabling deployment on ultra-low-power embedded devices. All code is available on GitHub.


An Empirical Evaluation of Factors Affecting SHAP Explanation of Time Series Classification

Serramazza, Davide Italo, Papadeas, Nikos, Abdallah, Zahraa, Ifrim, Georgiana

arXiv.org Artificial Intelligence

Explainable AI (XAI) has become an increasingly important topic for understanding and attributing the predictions made by complex Time Series Classification (TSC) models. Among attribution methods, SHapley Additive exPlanations (SHAP) is widely regarded as an excellent attribution method, but its computational complexity, which scales exponentially with the number of features, limits its practicality for long time series. To address this, recent studies have shown that aggregating features via segmentation, to compute a single attribution value for a group of consecutive time points, drastically reduces SHAP running time. However, the choice of the optimal segmentation strategy remains an open question. In this work, we investigated eight different Time Series Segmentation algorithms to understand how segment compositions affect the explanation quality. We evaluate these approaches using two established XAI evaluation methodologies: InterpretTime and AUC Difference. Through experiments on both Multivariate (MTS) and Univariate Time Series (UTS), we find that the number of segments has a greater impact on explanation quality than the specific segmentation method. Notably, equal-length segmentation consistently outperforms most of the custom time series segmentation algorithms. Furthermore, we introduce a novel attribution normalisation technique that weights segments by their length and we show that it consistently improves attribution quality.


IdentiARAT: Toward Automated Identification of Individual ARAT Items from Wearable Sensors

Homm, Daniel, Carqueville, Patrick, Eichhorn, Christian, Weikert, Thomas, Menard, Thomas, Plecher, David A., Easthope, Chris Awai

arXiv.org Artificial Intelligence

This study explores the potential of using wrist-worn inertial sensors to automate the labeling of ARAT (Action Research Arm Test) items. While the ARAT is commonly used to assess upper limb motor function, its limitations include subjectivity and time consumption of clinical staff. By using IMU (Inertial Measurement Unit) sensors and MiniROCKET as a time series classification technique, this investigation aims to classify ARAT items based on sensor recordings. We test common preprocessing strategies to efficiently leverage included information in the data. Afterward, we use the best preprocessing to improve the classification. The dataset includes recordings of 45 participants performing various ARAT items. Results show that MiniROCKET offers a fast and reliable approach for classifying ARAT domains, although challenges remain in distinguishing between individual resembling items. Future work may involve improving classification through more advanced machine-learning models and data enhancements.


An Investigation into Seasonal Variations in Energy Forecasting for Student Residences

Danish, Muhammad Umair, Sureshkumar, Mathumitha, Fonseka, Thanuri, Uthayakumar, Umeshika, Galwaduge, Vinura

arXiv.org Artificial Intelligence

This research provides an in-depth evaluation of various machine learning models for energy forecasting, focusing on the unique challenges of seasonal variations in student residential settings. The study assesses the performance of baseline models, such as LSTM and GRU, alongside state-of-the-art forecasting methods, including Autoregressive Feedforward Neural Networks, Transformers, and hybrid approaches. Special attention is given to predicting energy consumption amidst challenges like seasonal patterns, vacations, meteorological changes, and irregular human activities that cause sudden fluctuations in usage. The findings reveal that no single model consistently outperforms others across all seasons, emphasizing the need for season-specific model selection or tailored designs. Notably, the proposed Hyper Network based LSTM and MiniAutoEncXGBoost models exhibit strong adaptability to seasonal variations, effectively capturing abrupt changes in energy consumption during summer months. This study advances the energy forecasting field by emphasizing the critical role of seasonal dynamics and model-specific behavior in achieving accurate predictions.


Time series classification with random convolution kernels based transforms: pooling operators and input representations matter

Lo, Mouhamadou Mansour, Morvan, Gildas, Rossi, Mathieu, Morganti, Fabrice, Mercier, David

arXiv.org Artificial Intelligence

This article presents a new approach based on MiniRocket, called SelF-Rocket, for fast time series classification (TSC). Unlike existing approaches based on random convolution kernels, it dynamically selects the best couple of input representations and pooling operator during the training process. SelF-Rocket achieves state-of-the-art accuracy on the University of California Riverside (UCR) TSC benchmark datasets.


RandomNet: Clustering Time Series Using Untrained Deep Neural Networks

Li, Xiaosheng, Xi, Wenjie, Lin, Jessica

arXiv.org Artificial Intelligence

Neural networks are widely used in machine learning and data mining. Typically, these networks need to be trained, implying the adjustment of weights (parameters) within the network based on the input data. In this work, we propose a novel approach, RandomNet, that employs untrained deep neural networks to cluster time series. RandomNet uses different sets of random weights to extract diverse representations of time series and then ensembles the clustering relationships derived from these different representations to build the final clustering results. By extracting diverse representations, our model can effectively handle time series with different characteristics. Since all parameters are randomly generated, no training is required during the process. We provide a theoretical analysis of the effectiveness of the method. To validate its performance, we conduct extensive experiments on all of the 128 datasets in the well-known UCR time series archive and perform statistical analysis of the results. These datasets have different sizes, sequence lengths, and they are from diverse fields. The experimental results show that the proposed method is competitive compared with existing state-of-the-art methods.


Comparative Analysis of XGBoost and Minirocket Algortihms for Human Activity Recognition

Alagoz, Celal

arXiv.org Artificial Intelligence

Human Activity Recognition (HAR) has been extensively studied, with recent emphasis on the implementation of advanced Machine Learning (ML) and Deep Learning (DL) algorithms for accurate classification. This study investigates the efficacy of two ML algorithms, eXtreme Gradient Boosting (XGBoost) and MiniRocket, in the realm of HAR using data collected from smartphone sensors. The experiments are conducted on a dataset obtained from the UCI repository, comprising accelerometer and gyroscope signals captured from 30 volunteers performing various activities while wearing a smartphone. The dataset undergoes preprocessing, including noise filtering and feature extraction, before being utilized for training and testing the classifiers. Monte Carlo cross-validation is employed to evaluate the models' robustness. The findings reveal that both XGBoost and MiniRocket attain accuracy, F1 score, and AUC values as high as 0.99 in activity classification. XGBoost exhibits a slightly superior performance compared to MiniRocket. Notably, both algorithms surpass the performance of other ML and DL algorithms reported in the literature for HAR tasks. Additionally, the study compares the computational efficiency of the two algorithms, revealing XGBoost's advantage in terms of training time. Furthermore, the performance of MiniRocket, which achieves accuracy and F1 values of 0.94, and an AUC value of 0.96 using raw data and utilizing only one channel from the sensors, highlights the potential of directly leveraging unprocessed signals. It also suggests potential advantages that could be gained by utilizing sensor fusion or channel fusion techniques. Overall, this research sheds light on the effectiveness and computational characteristics of XGBoost and MiniRocket in HAR tasks, providing insights for future studies in activity recognition using smartphone sensor data.


Exploring Hierarchical Classification Performance for Time Series Data: Dissimilarity Measures and Classifier Comparisons

Alagoz, Celal

arXiv.org Artificial Intelligence

The comparative performance of hierarchical classification (HC) and flat classification (FC) methodologies in the realm of time series data analysis is investigated in this study. Dissimilarity measures, including Jensen-Shannon Distance (JSD), Task Similarity Distance (TSD), and Classifier Based Distance (CBD), are leveraged alongside various classifiers such as MINIROCKET, STSF, and SVM. A subset of datasets from the UCR archive, focusing on multi-class cases comprising more than two classes, is employed for analysis. A significant trend is observed wherein HC demonstrates significant superiority over FC when paired with MINIROCKET utilizing TSD, diverging from conventional understandings. Conversely, FC exhibits consistent dominance across all configurations when employing alternative classifiers such as STSF and SVM. Moreover, TSD is found to consistently outperform both CBD and JSD across nearly all scenarios, except in instances involving the STSF classifier where CBD showcases superior performance. This discrepancy underscores the nuanced nature of dissimilarity measures and emphasizes the importance of their tailored selection based on the dataset and classifier employed. Valuable insights into the dynamic interplay between classification methodologies and dissimilarity measures in the realm of time series data analysis are provided by these findings. By elucidating the performance variations across different configurations, a foundation is laid for refining classification methodologies and dissimilarity measures to optimize performance in diverse analytical scenarios. Furthermore, the need for continued research aimed at elucidating the underlying mechanisms driving classification performance in time series data analysis is underscored, with implications for enhancing predictive modeling and decision-making in various domains.


P-ROCKET: Pruning Random Convolution Kernels for Time Series Classification from a Feature Selection Perspective

Chen, Shaowu, Sun, Weize, Huang, Lei, Li, Xiaopeng, Wang, Qingyuan, John, Deepu

arXiv.org Artificial Intelligence

In recent years, two competitive time series classification models, namely, ROCKET and MINIROCKET, have garnered considerable attention due to their low training cost and high accuracy. However, they require a large number of random 1-D convolutional kernels to comprehensively capture features, which is incompatible with resource-constrained devices. Despite the development of heuristic algorithms designed to recognize and prune redundant kernels, the inherent time-consuming nature of evolutionary algorithms hinders efficient evaluation. To effectively prune models, this paper removes redundant random kernels from a feature selection perspective by eliminating associating connections in the sequential classifier. Two innovative algorithms are proposed, where the first ADMM-based algorithm formulates the pruning challenge as a group elastic net classification problem, and the second core algorithm named P-ROCKET greatly accelerates the first one by bifurcating the problem into two sequential stages. Stage 1 of P-ROCKET introduces dynamically varying penalties to efficiently implement group-level regularization to delete redundant kernels, and Stage 2 employs element-level regularization on the remaining features to refit a linear classifier for better performance. Experimental results on diverse time series datasets show that P-ROCKET prunes up to 60% of kernels without a significant reduction in accuracy and performs 11 times faster than its counterparts. Our code is publicly available at https://github.com/ShaowuChen/P-ROCKET.


An ensemble of convolution-based methods for fault detection using vibration signals

Lee, Xian Yeow, Kumar, Aman, Vidyaratne, Lasitha, Rao, Aniruddha Rajendra, Farahat, Ahmed, Gupta, Chetan

arXiv.org Machine Learning

This paper focuses on solving a fault detection problem using multivariate time series of vibration signals collected from planetary gearboxes in a test rig. Various traditional machine learning and deep learning methods have been proposed for multivariate time-series classification, including distance-based, functional data-oriented, feature-driven, and convolution kernel-based methods. Recent studies have shown using convolution kernel-based methods like ROCKET, and 1D convolutional neural networks with ResNet and FCN, have robust performance for multivariate time-series data classification. We propose an ensemble of three convolution kernel-based methods and show its efficacy on this fault detection problem by outperforming other approaches and achieving an accuracy of more than 98.8\%.