Goto

Collaborating Authors

 bicycle



BikeScenes: Online LiDAR Semantic Segmentation for Bicycles

Goren, Denniz, Caesar, Holger

arXiv.org Artificial Intelligence

The vulnerability of cyclists, exacerbated by the rising popularity of faster e-bikes, motivates adapting automotive perception technologies for bicycle safety. We use our multi-sensor 'SenseBike' research platform to develop and evaluate a 3D LiDAR segmentation approach tailored to bicycles. To bridge the automotive-to-bicycle domain gap, we introduce the novel BikeScenes-lidarseg Dataset, comprising 3021 consecutive LiDAR scans around the university campus of the TU Delft, semantically annotated for 29 dynamic and static classes. By evaluating model performance, we demonstrate that fine-tuning on our BikeScenes dataset achieves a mean Intersection-over-Union (mIoU) of 63.6%, significantly outperforming the 13.8% obtained with SemanticKITTI pre-training alone. This result underscores the necessity and effectiveness of domain-specific training. We highlight key challenges specific to bicycle-mounted, hardware-constrained perception systems and contribute the BikeScenes dataset as a resource for advancing research in cyclist-centric LiDAR segmentation.


E-bike agents: Large Language Model-Driven E-Bike Accident Analysis and Severity Prediction

Yang, Zhichao, He, Jiashu, Al-Khasawneh, Mohammad B., Pandit, Darshan, Cinzia, Cirillo

arXiv.org Artificial Intelligence

E-bikes have rapidly gained popularity as a sustainable form of urban mobility, yet their safety implications remain underexplored. This paper analyzes injury incidents involving e-bikes and traditional bicycles using two sources of data, the CPSRMS (Consumer Product Safety Risk Management System Information Security Review Report) and NEISS (National Electronic Injury Surveillance System) datasets. We propose a standardized classification framework to identify and quantify injury causes and severity. By integrating incident narratives with demographic attributes, we reveal key differences in mechanical failure modes, injury severity patterns, and affected user groups. While both modes share common causes, such as loss of control and pedal malfunctions, e-bikes present distinct risks, including battery-related fires and brake failures. These findings highlight the need for tailored safety interventions and infrastructure design to support the safe integration of micromobility devices into urban transportation networks.


Quantum generative model on bicycle-sharing system and an application

Nemoto, Fumio, Koike, Nobuyuki, Sato, Daichi, Kawaai, Yuuta, Ohzeki, Masayuki

arXiv.org Artificial Intelligence

Recently, bicycle-sharing systems have been implemented in numerous cities, becoming integral to daily life. However, a prevalent issue arises when intensive commuting demand leads to bicycle shortages in specific areas and at particular times. To address this challenge, we employ a novel quantum machine learning model that analyzes time series data by fitting quantum time evolution to observed sequences. This model enables us to capture actual trends in bicycle counts at individual ports and identify correlations between different ports. Utilizing the trained model, we simulate the impact of proactively adding bicycles to high-demand ports on the overall rental number across the system. Given that the core of this method lies in a Monte Carlo simulation, it is anticipated to have a wide range of industrial applications.


Great GATsBi: Hybrid, Multimodal, Trajectory Forecasting for Bicycles using Anticipation Mechanism

Riehl, Kevin, El-Baklish, Shaimaa K., Kouvelas, Anastasios, Makridis, Michail A.

arXiv.org Artificial Intelligence

Accurate prediction of road user movement is increasingly required by many applications ranging from advanced driver assistance systems to autonomous driving, and especially crucial for road safety. Even though most traffic accident fatalities account to bicycles, they have received little attention, as previous work focused mainly on pedestrians and motorized vehicles. In this work, we present the Great GATsBi, a domain-knowledge-based, hybrid, multimodal trajectory prediction framework for bicycles. The model incorporates both physics-based modeling (inspired by motorized vehicles) and social-based modeling (inspired by pedestrian movements) to explicitly account for the dual nature of bicycle movement. The social interactions are modeled with a graph attention network, and include decayed historical, but also anticipated, future trajectory data of a bicycles neighborhood, following recent insights from psychological and social studies. The results indicate that the proposed ensemble of physics models -- performing well in the short-term predictions -- and social models -- performing well in the long-term predictions -- exceeds state-of-the-art performance. We also conducted a controlled mass-cycling experiment to demonstrate the framework's performance when forecasting bicycle trajectories and modeling social interactions with road users.




Expect the unexpected: Harnessing Sentence Completion for Sarcasm Detection

Joshi, Aditya, Agrawal, Samarth, Bhattacharyya, Pushpak, Carman, Mark

arXiv.org Artificial Intelligence

The trigram `I love being' is expected to be followed by positive words such as `happy'. In a sarcastic sentence, however, the word `ignored' may be observed. The expected and the observed words are, thus, incongruous. We model sarcasm detection as the task of detecting incongruity between an observed and an expected word. In order to obtain the expected word, we use Context2Vec, a sentence completion library based on Bidirectional LSTM. However, since the exact word where such an incongruity occurs may not be known in advance, we present two approaches: an All-words approach (which consults sentence completion for every content word) and an Incongruous words-only approach (which consults sentence completion for the 50% most incongruous content words). The approaches outperform reported values for tweets but not for discussion forum posts. This is likely to be because of redundant consultation of sentence completion for discussion forum posts. Therefore, we consider an oracle case where the exact incongruous word is manually labeled in a corpus reported in past work. In this case, the performance is higher than the all-words approach. This sets up the promise for using sentence completion for sarcasm detection.


Sanitizing Manufacturing Dataset Labels Using Vision-Language Models

Mahjourian, Nazanin, Nguyen, Vinh

arXiv.org Artificial Intelligence

The success of machine learning models in industrial applications is heavily dependent on the quality of the datasets used to train the models. However, large-scale datasets, specially those constructed from crowd-sourcing and web-scraping, often suffer from label noise, inconsistencies, and errors. This problem is particularly pronounced in manufacturing domains, where obtaining high-quality labels is costly and time-consuming. This paper introduces Vision-Language Sanitization and Refinement (VLSR), which is a vision-language-based framework for label sanitization and refinement in multi-label manufacturing image datasets. This method embeds both images and their associated textual labels into a shared semantic space leveraging the CLIP vision-language model. Then two key tasks are addressed in this process by computing the cosine similarity between embeddings. First, label sanitization is performed to identify irrelevant, misspelled, or semantically weak labels, and surface the most semantically aligned label for each image by comparing image-label pairs using cosine similarity between image and label embeddings. Second, the method applies density-based clustering on text embeddings, followed by iterative cluster merging, to group semantically similar labels into unified label groups. The Factorynet dataset, which includes noisy labels from both human annotations and web-scraped sources, is employed to evaluate the effectiveness of the proposed framework. Experimental results demonstrate that the VLSR framework successfully identifies problematic labels and improves label consistency. This method enables a significant reduction in label vocabulary through clustering, which ultimately enhances the dataset's quality for training robust machine learning models in industrial applications with minimal human intervention.


LazyVLM: Neuro-Symbolic Approach to Video Analytics

Jian, Xiangru, Pang, Wei, Dong, Zhengyuan, Zhang, Chao, Özsu, M. Tamer

arXiv.org Artificial Intelligence

Current video analytics approaches face a fundamental trade-off between flexibility and efficiency. End-to-end Vision Language Models (VLMs) often struggle with long-context processing and incur high computational costs, while neural-symbolic methods depend heavily on manual labeling and rigid rule design. In this paper, we introduce LazyVLM, a neuro-symbolic video analytics system that provides a user-friendly query interface similar to VLMs, while addressing their scalability limitation. LazyVLM enables users to effortlessly drop in video data and specify complex multi-frame video queries using a semi-structured text interface for video analytics. To address the scalability limitations of VLMs, LazyVLM decomposes multi-frame video queries into fine-grained operations and offloads the bulk of the processing to efficient relational query execution and vector similarity search. We demonstrate that LazyVLM provides a robust, efficient, and user-friendly solution for querying open-domain video data at scale.