Wuhan University
ECHO: Frequency-aware Hierarchical Encoding for Variable-length Signals
Zhang, Yucong, Liu, Juan, Li, Ming
Pre-trained foundation models have demonstrated remarkable success in audio, vision, and language, yet their potential for general machine-signal modeling at arbitrary sampling rates, covering acoustic, vibration, and other industrial sensor data, remains under-explored. In this work, we propose ECHO, a novel foundation model that integrates an advanced band-split architecture with frequency positional embeddings, enabling spectral localization across arbitrary sampling configurations. Moreover, the model incorporates sliding patches to support inputs of variable length without padding or cropping, producing a concise embedding that retains both temporal and spectral fidelity and naturally extends to streaming scenarios. We evaluate our method on various machine-signal datasets, including the DCASE Task 2 challenges (2020-2025) and widely used industrial signal corpora. Experimental results demonstrate consistent state-of-the-art performance in machine-signal anomaly detection and fault classification, confirming the effectiveness and generalization capability of the proposed model. We have open-sourced ECHO at https://github.com/yucongzh/ECHO.
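The band-split-plus-frequency-positional-embedding idea can be illustrated with a minimal sketch: key each band by its physical center frequency in Hz rather than by bin index, so that signals recorded at different sampling rates share one spectral coordinate system. The function name, dimensions, and normalization constant below are assumptions for illustration, not ECHO's actual implementation.

```python
import torch

def freq_positional_embedding(center_freqs_hz, dim=128, max_freq_hz=100_000.0):
    """Sinusoidal embedding of absolute band-center frequencies (Hz).

    Encoding bands by physical frequency rather than bin index lets a model
    localize spectral content across arbitrary sampling rates.
    `max_freq_hz` is a hypothetical normalization constant.
    """
    pos = torch.as_tensor(center_freqs_hz, dtype=torch.float32) / max_freq_hz
    i = torch.arange(dim // 2, dtype=torch.float32)
    denom = 10_000.0 ** (2 * i / dim)                       # (dim/2,)
    angles = pos[:, None] / denom[None, :]                  # (bands, dim/2)
    return torch.cat([angles.sin(), angles.cos()], dim=-1)  # (bands, dim)

# Bands are keyed by physical frequency, so a 16 kHz recording and a 48 kHz
# recording of the same machine share the same spectral coordinates.
emb = freq_positional_embedding([250.0, 750.0, 1250.0])  # one row per band
print(emb.shape)  # torch.Size([3, 128])
```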
- Asia > China > Hubei Province > Wuhan (0.77)
- Asia > South Korea > Seoul > Seoul (0.04)
Metadata-Guided Adaptable Frequency Scaling across Heterogeneous Applications and Devices
Yan, Jinqi, He, Fang, Sang, Qianlong, Tong, Bifeng, Sun, Peng, Gong, Yili, Hu, Chuang, Cheng, Dazhao
Dynamic Voltage and Frequency Scaling (DVFS) is essential for enhancing energy efficiency in mobile platforms. However, traditional heuristic-based governors are increasingly inadequate for managing the complexity of heterogeneous System-on-Chip designs and diverse application workloads. Although reinforcement learning approaches offer improved performance, their poor generalization capability and reliance on extensive retraining for each hardware-application combination lead to significant deployment costs. In this work, we observe that device and application metadata inherently encapsulate valuable knowledge for DVFS, presenting an opportunity to overcome these limitations. We formulate DVFS for heterogeneous devices and applications as a multi-task reinforcement learning problem and introduce MetaDVFS, a metadata-guided framework that systematically leverages metadata to discover and transfer shared knowledge across DVFS tasks. Evaluations on five Google Pixel devices running six applications show that MetaDVFS achieves up to 17% improvement in Performance-Power Ratio and up to 26% improvement in Quality of Experience. Compared to state-of-the-art methods, MetaDVFS delivers 70.8% faster adaptation (3.5 ± 1.1 vs. 11.8 ± 5.2 minutes) and improvements of 5.8-27.6%. These results establish MetaDVFS as an effective and scalable solution for DVFS deployment in heterogeneous mobile environments. DVFS is an essential technique for effectively improving energy efficiency in battery-powered mobile platforms; it adjusts the operating voltage and frequency of a device in response to current workload demands [1]. Experimental evaluations report energy savings exceeding 26% on mobile MPSoCs with DVFS enabled, compared to statically managed systems [2]. Traditional DVFS policies typically rely on heuristic-based governors, such as ondemand and schedutil, which make frequency decisions based primarily on simple utilization metrics. Jinqi Yan, Qianlong Sang, Yili Gong, Chuang Hu, and Dazhao Cheng are with the School of Computer Science, Wuhan University.
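How metadata conditioning enables transfer can be sketched with a toy policy: one shared network consumes both the runtime state and an embedding of device/application metadata, so a new hardware-application pair changes only the conditioning input rather than requiring retraining from scratch. All names and dimensions below are illustrative assumptions, not MetaDVFS's actual architecture.

```python
import torch
import torch.nn as nn

class MetadataConditionedPolicy(nn.Module):
    """Toy DVFS policy conditioned on device/application metadata.

    `state` could hold utilization and temperature features; `meta` is an
    embedding of device and application metadata. Conditioning a single
    shared policy on metadata is what lets knowledge transfer across tasks.
    """
    def __init__(self, state_dim=8, meta_dim=16, n_freq_levels=10):
        super().__init__()
        self.meta_encoder = nn.Sequential(nn.Linear(meta_dim, 32), nn.ReLU())
        self.policy = nn.Sequential(
            nn.Linear(state_dim + 32, 64), nn.ReLU(),
            nn.Linear(64, n_freq_levels),  # logits over discrete frequency levels
        )

    def forward(self, state, meta):
        z = self.meta_encoder(meta)
        logits = self.policy(torch.cat([state, z], dim=-1))
        return torch.distributions.Categorical(logits=logits)

# A new (device, app) pair only changes `meta`; the shared weights provide
# a warm start, which is where faster adaptation would come from.
pi = MetadataConditionedPolicy()
action = pi(torch.randn(1, 8), torch.randn(1, 16)).sample()
```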
- Asia > China > Hubei Province > Wuhan (0.25)
- Europe (0.04)
- Asia > China > Hong Kong (0.04)
- (3 more...)
- Semiconductors & Electronics (1.00)
- Information Technology (1.00)
- Education (1.00)
- Energy (0.88)
SmartPNT-MSF: A Multi-Sensor Fusion Dataset for Positioning and Navigation Research
Zhu, Feng, Zhang, Zihang, Teng, Kangcheng, Yakup, Abduhelil, Zhang, Xiaohong
High-precision navigation and positioning systems are critical for applications in autonomous vehicles and mobile mapping, where robust and continuous localization is essential. To test and enhance the performance of algorithms, some research institutions and companies have successively constructed and publicly released datasets. However, existing datasets still suffer from limitations in sensor diversity and environmental coverage. To address these shortcomings and advance development in related fields, the SmartPNT Multisource Integrated Navigation, Positioning, and Attitude Dataset has been developed. This dataset integrates data from multiple sensors, including Global Navigation Satellite Systems (GNSS), Inertial Measurement Units (IMU), optical cameras, and LiDAR, to provide a rich and versatile resource for research in multi-sensor fusion and high-precision navigation. The dataset construction process is thoroughly documented, encompassing sensor configurations, coordinate system definitions, and calibration procedures for both cameras and LiDAR. A standardized framework for data collection and processing ensures consistency and scalability, enabling large-scale analysis. Validation using state-of-the-art Simultaneous Localization and Mapping (SLAM) algorithms, such as VINS-Mono and LIO-SAM, demonstrates the dataset's applicability for advanced navigation research. Covering a wide range of real-world scenarios, including urban areas, campuses, tunnels, and suburban environments, the dataset offers a valuable tool for advancing navigation technologies and addressing challenges in complex environments. By providing a publicly accessible, high-quality dataset, this work aims to bridge gaps in sensor diversity, data accessibility, and environmental representation, fostering further innovation in the field. INTRODUCTION: The continuous advancement of positioning and navigation technologies has driven rapid development across various domains. Feng Zhu is with the School of Geodesy and Geomatics, Wuhan University, Wuhan, Hubei 430079, China, and also with the Hubei Luojia Laboratory, Wuhan, Hubei 430079, China (e-mail: fzhu@whu.edu.cn). Zihang Zhang, Kangcheng Teng, and Abduhelil Yakup are with the School of Geodesy and Geomatics, Wuhan University, Wuhan, Hubei 430079, China (e-mail: zihangzhang@whu.edu.cn;
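Using such a dataset's calibration products typically starts with expressing every sensor's measurements in a common body frame. A minimal sketch, assuming a rotation `R_bl` and lever arm `t_bl` from a LiDAR-to-IMU extrinsic calibration (the values below are placeholders, not the dataset's actual parameters):

```python
import numpy as np

def lidar_to_body(points_lidar, R_bl, t_bl):
    """Apply a LiDAR-to-body (IMU) extrinsic: p_b = R_bl @ p_l + t_bl.

    Fusing LiDAR with GNSS/IMU first requires expressing all measurements
    in one frame; the extrinsics come from the documented calibration.
    """
    return points_lidar @ R_bl.T + t_bl

R_bl = np.eye(3)                  # placeholder rotation from calibration
t_bl = np.array([0.1, 0.0, 0.5])  # placeholder lever arm in meters
points_body = lidar_to_body(np.random.rand(100, 3), R_bl, t_bl)
```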
- Asia > China > Hubei Province > Wuhan (1.00)
- Oceania > Australia > Queensland > Brisbane (0.04)
- North America > United States > Rhode Island > Providence County > Providence (0.04)
- (9 more...)
- Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
- Health & Medicine > Therapeutic Area > Immunology (1.00)
- Health & Medicine > Epidemiology (1.00)
MEET: A Million-Scale Dataset for Fine-Grained Geospatial Scene Classification with Zoom-Free Remote Sensing Imagery
Li, Yansheng, Wu, Yuning, Cheng, Gong, Tao, Chao, Dang, Bo, Wang, Yu, Zhang, Jiahao, Zhang, Chuge, Liu, Yiting, Tang, Xu, Ma, Jiayi, Zhang, Yongjun
Accurate fine-grained geospatial scene classification using remote sensing imagery is essential for a wide range of applications. However, existing approaches often rely on manually zooming remote sensing images at different scales to create typical scene samples. This approach fails to adequately support the fixed-resolution image interpretation requirements of real-world scenarios. To address this limitation, we introduce the Million-scale finE-grained geospatial scEne classification dataseT (MEET), which contains over 1.03 million zoom-free remote sensing scene samples, manually annotated into 80 fine-grained categories. In MEET, each scene sample follows a scene-in-scene layout, where the central scene serves as the reference and auxiliary scenes provide crucial spatial context for fine-grained classification. Moreover, to tackle the emerging challenge of scene-in-scene classification, we present the Context-Aware Transformer (CAT), a model specifically designed for this task that adaptively fuses spatial context by learning attentional features capturing the relationships between the center and auxiliary scenes. Based on MEET, we establish a comprehensive benchmark for fine-grained geospatial scene classification, evaluating CAT against 11 competitive baselines. The results demonstrate that CAT significantly outperforms these baselines, achieving 1.88% higher balanced accuracy (BA) with the Swin-Large backbone and a notable 7.87% improvement with the Swin-Huge backbone. Further experiments validate the effectiveness of each module in CAT and show its practical applicability in urban functional zone mapping. The source code and dataset will be publicly available at https://jerrywyn.github.io/project/MEET.html.
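The scene-in-scene fusion idea can be sketched as cross-attention in which the center-scene feature queries the auxiliary-scene features, so context is weighted by its relevance to the reference scene before classification. This mirrors the described mechanism only loosely; dimensions and layer choices are assumptions, not CAT's actual architecture.

```python
import torch
import torch.nn as nn

class CenterAuxAttention(nn.Module):
    """Illustrative scene-in-scene fusion via cross-attention."""
    def __init__(self, dim=256, n_classes=80):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads=8, batch_first=True)
        self.head = nn.Linear(dim, n_classes)

    def forward(self, center_feat, aux_feats):
        # center_feat: (B, 1, D) reference scene; aux_feats: (B, N, D) context
        ctx, _ = self.attn(query=center_feat, key=aux_feats, value=aux_feats)
        return self.head((center_feat + ctx).squeeze(1))  # logits, 80 classes

logits = CenterAuxAttention()(torch.randn(2, 1, 256), torch.randn(2, 8, 256))
```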
- Asia > China > Hubei Province > Wuhan (0.06)
- Asia > China > Shanghai > Shanghai (0.04)
- Asia > China > Shaanxi Province > Xi'an (0.04)
- (7 more...)
- Information Technology > Sensing and Signal Processing > Image Processing (1.00)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Aerodynamic sensors could speed up autonomous vehicles
If you live in one of the roughly dozen US cities where autonomous vehicles operate, you likely recognize them by their eye-catching, spinning rooftop units. These high-tech pods are filled with sensors, usually a mix of LiDAR, radar, and cameras, that serve as the eyes and ears AVs use to map the world around them. But those sensor stacks are often bulky, which can impede a car's ability to cut through the air around it. That drag penalty can force the car to use more energy to accelerate and ultimately limits its overall range. In current AVs, aerodynamic considerations can take a backseat to optimal sensor functionality.
- Asia > China > Hubei Province > Wuhan (0.06)
- North America > United States > Texas (0.05)
- Automobiles & Trucks (1.00)
- Transportation > Ground > Road (0.99)
SF-Loc: A Visual Mapping and Geo-Localization System based on Sparse Visual Structure Frames
Zhou, Yuxuan, Li, Xingxing, Li, Shengyu, Xia, Chunxi, Wang, Xuanbin, Feng, Shaoquan
For high-level geospatial applications and intelligent robotics, accurate global pose information is of crucial importance. Map-aided localization is a universal approach to overcoming the limitations of global navigation satellite systems (GNSS) in challenging environments. However, current solutions face challenges in mapping flexibility, storage burden, and re-localization performance. In this work, we present SF-Loc, a lightweight visual mapping and map-aided localization system whose core idea is a map representation based on sparse frames with dense but compact depth, termed visual structure frames. In the mapping phase, multi-sensor dense bundle adjustment (MS-DBA) is applied to construct geo-referenced visual structure frames. Local co-visibility is checked to keep the map sparse and achieve incremental mapping. In the localization phase, coarse-to-fine vision-based localization is performed, fully integrating multi-frame information and the map distribution. Specifically, the concept of spatially smoothed similarity (SSS) is proposed to overcome place ambiguity, and pairwise frame matching is applied for efficient and robust pose estimation. Experimental results on the cross-season dataset verify the effectiveness of the system. In complex urban road scenarios, the map size is down to 3 MB per kilometer, and stable decimeter-level re-localization can be achieved. The code will be made open-source soon (https://github.com/GREAT-WHU/SF-Loc).
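One plausible reading of spatially smoothed similarity is to average retrieval scores over spatially adjacent map frames, so a true match must be supported by its neighbors rather than by a single aliased hit. A sketch under that assumption (not the paper's exact formula):

```python
import numpy as np

def spatially_smoothed_similarity(sim, window=2):
    """Average query-to-map similarities over neighboring map frames.

    `sim` is (n_query, n_map) with map frames ordered along the trajectory.
    Smoothing suppresses isolated high scores caused by place ambiguity.
    """
    kernel = np.ones(2 * window + 1) / (2 * window + 1)
    return np.apply_along_axis(
        lambda row: np.convolve(row, kernel, mode="same"), axis=1, arr=sim)

sss = spatially_smoothed_similarity(np.random.rand(5, 100))
best_map_frame = sss.argmax(axis=1)  # coarse localization per query
```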
- Transportation > Ground > Road (0.48)
- Transportation > Infrastructure & Services (0.34)
- Information Technology > Artificial Intelligence > Robots (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
- Information Technology > Artificial Intelligence > Vision (0.90)
iKalibr-RGBD: Partially-Specialized Target-Free Visual-Inertial Spatiotemporal Calibration For RGBDs via Continuous-Time Velocity Estimation
Chen, Shuolong, Li, Xingxing, Li, Shengyu, Zhou, Yuxuan
Visual-inertial systems have been widely studied and applied over the last two decades, mainly due to their low cost and power consumption, small footprint, and high availability. This trend has simultaneously led to a large number of visual-inertial calibration methods being presented, as accurate spatiotemporal parameters between sensors are a prerequisite for visual-inertial fusion. In our previous work, iKalibr, a continuous-time-based visual-inertial calibration method was proposed as part of a one-shot, multi-sensor, resilient spatiotemporal calibration. While requiring no artificial target brings considerable convenience, computationally expensive pose estimation is demanded in initialization and batch optimization, limiting its availability. Fortunately, this can be vastly improved for RGBD cameras with additional depth information by employing mapping-free ego-velocity estimation instead of mapping-based pose estimation. In this paper, we present a continuous-time ego-velocity-estimation-based RGBD-inertial spatiotemporal calibration method, termed iKalibr-RGBD, which is likewise targetless but computationally efficient. The general pipeline of iKalibr-RGBD is inherited from iKalibr, composed of a rigorous initialization procedure and several continuous-time batch optimizations. The implementation of iKalibr-RGBD is open-sourced at https://github.com/Unsigned-Long/iKalibr to benefit the research community.
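The appeal of a continuous-time representation is that velocities can be queried analytically at any timestamp, including stamps shifted by an unknown time offset. A minimal sketch with an off-the-shelf cubic spline and synthetic data (not iKalibr-RGBD's actual spline machinery):

```python
import numpy as np
from scipy.interpolate import make_interp_spline

# Fit a cubic spline to discrete positions, then query velocity analytically.
t = np.linspace(0.0, 5.0, 50)
pos = np.stack([np.sin(t), np.cos(t), 0.2 * t], axis=1)  # synthetic trajectory

spline = make_interp_spline(t, pos, k=3)  # cubic B-spline through positions
velocity = spline.derivative()

t_query = 2.34 + 0.01  # e.g., a sensor stamp shifted by a candidate time offset
print(velocity(t_query))  # linear velocity at that exact instant
```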
- Asia > China > Hubei Province > Wuhan (0.06)
- Europe > Germany > Brandenburg > Potsdam (0.04)
- Asia > Middle East > Republic of Türkiye > Karaman Province > Karaman (0.04)
RIs-Calib: An Open-Source Spatiotemporal Calibrator for Multiple 3D Radars and IMUs Based on Continuous-Time Estimation
Chen, Shuolong, Li, Xingxing, Li, Shengyu, Zhou, Yuxuan, Wang, Shiwen
An aided inertial navigation system (INS), typically consisting of an inertial measurement unit (IMU) and an exteroceptive sensor, has been widely accepted as a feasible solution for navigation. Compared with vision-aided and LiDAR-aided INS, radar-aided INS can achieve better performance in adverse weather conditions, since radar uses low-frequency measuring signals that suffer less attenuation in atmospheric gases and rain. For such a radar-aided INS, accurate spatiotemporal transformation is a fundamental prerequisite to achieving optimal information fusion. In this work, we present RIs-Calib: a spatiotemporal calibrator for multiple 3D radars and IMUs based on continuous-time estimation, which enables accurate spatiotemporal calibration and does not require any additional artificial infrastructure or prior knowledge. Our approach starts with a rigorous and robust procedure for state initialization, followed by batch optimizations in which all parameters are steadily refined toward globally optimal states. We validate and evaluate RIs-Calib in both simulated and real-world experiments, and the results demonstrate that RIs-Calib is capable of accurate and consistent calibration. We open-source our implementation at https://github.com/Unsigned-Long/RIs-Calib to benefit the research community.
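A common building block in radar-inertial initialization is recovering the radar's ego-velocity from Doppler returns: for a static scene, each target satisfies v_r = -d^T v, so stacking targets gives an overdetermined linear system. A self-contained sketch of that standard trick (an assumption about the kind of measurement model such a calibrator uses, not RIs-Calib's code):

```python
import numpy as np

def ego_velocity_from_doppler(directions, radial_speeds):
    """Least-squares sensor ego-velocity from 3D radar Doppler returns.

    directions: (N, 3) unit vectors toward static targets.
    radial_speeds: (N,) measured radial speeds, v_r = -d^T v.
    """
    v, *_ = np.linalg.lstsq(-directions, radial_speeds, rcond=None)
    return v

dirs = np.random.randn(30, 3)
dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
v_true = np.array([1.0, 0.2, 0.0])
v_est = ego_velocity_from_doppler(dirs, -dirs @ v_true)
assert np.allclose(v_est, v_true)  # exact for noise-free measurements
```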
MSC-LIO: An MSCKF-Based LiDAR-Inertial Odometry with Same-Plane-Point Tracking
Zhang, Tisheng, Yuan, Man, Wei, Linfu, Tang, Hailiang, Niu, Xiaoji
The multi-state constraint Kalman filter (MSCKF) has been proven to be more efficient than graph optimization for visual odometry, with similar accuracy. However, it has not yet been properly considered and studied for LiDAR-based odometry. In this paper, we propose a novel tightly coupled LiDAR-inertial odometry based on the MSCKF framework, named MSC-LIO. An efficient LiDAR same-plane-point (LSPP) tracking method, without explicit feature extraction, is presented for frame-to-frame data association. The tracked LSPPs are employed to build an LSPP measurement model, which constructs a multi-state constraint. Besides, we propose an effective point-velocity-based LiDAR-IMU time-delay (LITD) estimation method, derived from the proposed LSPP tracking method. Extensive experiments were conducted on both public and private datasets. The results demonstrate that the proposed MSC-LIO yields higher accuracy and efficiency than state-of-the-art methods. The ablation results indicate that data-association efficiency is improved by nearly 3 times using the LSPP tracking method, and that the proposed LITD estimation method can effectively and accurately estimate the LITD.
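The kind of constraint an LSPP induces can be shown with a point-to-plane residual: once a tracked point is associated with a plane (unit normal n, offset d), its signed distance n^T p + d constrains every pose in the filter's sliding window, analogous to reprojection errors in visual MSCKF. Illustrative only; the paper's exact measurement model may differ.

```python
import numpy as np

def point_to_plane_residual(p_world, plane_n, plane_d):
    """Signed distance of a tracked point to its associated plane (||n|| = 1)."""
    return plane_n @ p_world + plane_d

n = np.array([0.0, 0.0, 1.0])  # e.g., a ground-plane normal
r = point_to_plane_residual(np.array([1.0, 2.0, 0.05]), n, 0.0)
print(r)  # 0.05 m: the residual the filter would drive toward zero
```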
- Asia > China > Hubei Province > Wuhan (0.06)
- North America > United States > Nevada > Clark County > Las Vegas (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- (8 more...)
- Research Report > New Finding (0.34)
- Research Report > Promising Solution (0.34)
On State Estimation in Multi-Sensor Fusion Navigation: Optimization and Filtering
Zhu, Feng, Xu, Zhuo, Zhang, Xveqing, Zhang, Yuantai, Chen, Weijie, Zhang, Xiaohong
The essence of navigation, perception, and decision-making, the basic tasks of intelligent robots, is to estimate the necessary system states. Among them, navigation is fundamental to other higher-level applications, providing precise position and orientation by integrating measurements from multiple sensors. With the observations of each sensor appropriately modelled, multi-sensor fusion for navigation reduces to a state estimation problem, which can be solved by two approaches: optimization and filtering. Recent research has shown that optimization-based frameworks outperform filtering-based ones in accuracy. However, both methods are based on maximum likelihood estimation (MLE) and should be theoretically equivalent given the same linearization points, observation model, measurements, and Gaussian noise assumption. In this paper, we dig deeply into the theories and existing strategies used in both optimization-based and filtering-based approaches. We demonstrate that the two methods are theoretically equal, but that this equivalence breaks down due to the different strategies applied in real-time operation. By adjusting the existing strategies of the filtering-based approaches, Monte-Carlo simulations and vehicular ablation experiments based on visual odometry (VO) indicate that the strategy-adjusted filter is strictly equivalent to optimization. Therefore, future research on sensor-fusion problems should concentrate on algorithms and strategies rather than on the choice of state estimation approach.
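The claimed equivalence is easy to verify numerically in the linear-Gaussian case: one Kalman measurement update and the MAP least-squares solution of the same problem coincide exactly. A self-contained check with toy numbers and the standard formulas:

```python
import numpy as np

x0, P = np.array([0.0, 0.0]), np.eye(2)  # prior mean and covariance
H = np.array([[1.0, 0.5]])               # measurement model
R = np.array([[0.1]])                    # measurement noise covariance
z = np.array([1.0])                      # measurement

# Filtering: standard Kalman update.
K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
x_kf = x0 + K @ (z - H @ x0)

# Optimization: minimize ||x - x0||^2_{P^-1} + ||z - H x||^2_{R^-1}
# via the normal equations.
A = np.linalg.inv(P) + H.T @ np.linalg.inv(R) @ H
b = np.linalg.inv(P) @ x0 + H.T @ np.linalg.inv(R) @ z
x_opt = np.linalg.solve(A, b)

assert np.allclose(x_kf, x_opt)  # identical estimates, as the theory predicts
```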
- Transportation (0.68)
- Information Technology (0.46)
- Information Technology > Artificial Intelligence > Robots (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.54)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.54)