
Collaborating Authors

 Mahbub, Upal


V2CE: Video to Continuous Events Simulator

arXiv.org Artificial Intelligence

Dynamic Vision Sensor (DVS)-based solutions have recently garnered significant interest across various computer vision tasks, offering notable benefits in terms of dynamic range, temporal resolution, and inference speed. However, as a relatively nascent vision sensor compared to Active Pixel Sensor (APS) devices such as RGB cameras, DVS suffers from a dearth of labeled datasets. Prior efforts to convert APS data into events often grapple with issues such as a considerable domain shift from real events, the absence of quantified validation, and layering problems along the time axis. In this paper, we present a novel method for video-to-events stream conversion from multiple perspectives, considering the specific characteristics of DVS. A series of carefully designed losses significantly enhances the quality of the generated event voxels. We also propose a novel local dynamic-aware timestamp inference strategy to accurately recover event timestamps from event voxels in a continuous fashion and eliminate the temporal layering problem. Rigorous validation through quantified metrics at all stages of the pipeline establishes our method as the current state-of-the-art (SOTA).
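The abstract's video-to-events conversion builds on the basic DVS operating principle: a pixel emits an event whenever its log intensity changes by more than a contrast threshold. The following minimal sketch illustrates only that baseline thresholding model (in the style of conventional frame-based simulators), not the paper's learned pipeline with voxel losses and timestamp inference; `frames_to_events` and its parameters are illustrative names, not the authors' API.

```python
import numpy as np

def frames_to_events(frames, timestamps, threshold=0.2):
    """Sketch of baseline video-to-events conversion via log-intensity
    thresholding. A pixel fires an event (t, x, y, polarity) when its
    log intensity drifts at least `threshold` from the reference level
    stored at its last event."""
    log_ref = np.log(frames[0].astype(np.float64) + 1e-6)
    events = []  # list of (timestamp, x, y, polarity)
    for frame, t in zip(frames[1:], timestamps[1:]):
        log_f = np.log(frame.astype(np.float64) + 1e-6)
        diff = log_f - log_ref
        ys, xs = np.where(np.abs(diff) >= threshold)
        for x, y in zip(xs, ys):
            polarity = 1 if diff[y, x] > 0 else -1
            events.append((t, int(x), int(y), polarity))
            # advance the per-pixel reference by one threshold step
            log_ref[y, x] += polarity * threshold
    return events
```

Because events inherit the frame timestamp here, they stack into discrete temporal layers; this is exactly the layering artifact the abstract's continuous timestamp inference strategy is designed to remove.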


Neuromorphic High-Frequency 3D Dancing Pose Estimation in Dynamic Environment

arXiv.org Artificial Intelligence

As a beloved sport worldwide, dancing is being integrated into traditional and virtual reality-based gaming platforms, opening up new opportunities in the technology-mediated dancing space. These platforms primarily rely on passive and continuous human pose estimation as an input capture mechanism. Existing solutions for dance games are mainly based on RGB or RGB-Depth cameras. The former suffers in low-lighting conditions due to motion blur and low sensitivity, while the latter is too power-hungry, has a low frame rate, and has a limited working distance. With its ultra-low latency, energy efficiency, and wide dynamic range, the event camera is a promising solution to overcome these shortcomings. We propose YeLan, an event camera-based 3-dimensional high-frequency human pose estimation (HPE) system that survives low-lighting conditions and dynamic backgrounds. We collected the world's first event camera dance dataset and developed a fully customizable motion-to-event physics-aware simulator. YeLan outperforms the baseline models in these challenging conditions and demonstrates robustness against different types of clothing, background motion, viewing angle, occlusion, and lighting fluctuations.


Continuous Authentication of Smartphones Based on Application Usage

arXiv.org Machine Learning

An empirical investigation of active/continuous authentication for smartphones is presented in this paper, exploiting users' unique application usage data, i.e., distinct patterns of use, modeled by a Markovian process. Variations of Hidden Markov Models (HMMs) are evaluated for continuous user verification, and challenges due to the sparsity of session-wise data, an explosion of states, and handling unforeseen events in the test data are tackled. Unlike traditional approaches, the proposed formulation does not depend on the top N apps, but rather uses the complete app-usage information to achieve low latency. The impact of unforeseen events, i.e., unknown applications and unforeseen observations, on user verification is empirically assessed via a modified edit-distance algorithm for simple sequence matching. It is found that, for enhanced verification performance, unforeseen events should be incorporated into the models by adopting smoothing techniques with HMMs. For validation, extensive experiments on two distinct datasets are performed. The marginal smoothing technique proves most effective for user verification in terms of equal error rate (EER); with a sampling rate of 1/30 s^{-1} (one observation every 30 seconds) and 30 minutes of historical data, the method can detect an intrusion within approximately 2.5 minutes of application use.
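The core idea above can be illustrated with a toy model: fit a sequence model to a user's enrolled app-launch history, smooth it so unseen apps and transitions (the "unforeseen events") retain nonzero probability, and threshold the likelihood of a new session to verify the user. The sketch below uses a smoothed first-order Markov chain as a simplified stand-in for the paper's HMM variants; all names (`train_markov`, `log_likelihood`, `alpha`) are hypothetical, and the smoothing shown is plain additive (Laplace) smoothing rather than the paper's marginal smoothing.

```python
from collections import defaultdict
import math

def train_markov(sequences, vocab, alpha=1.0):
    """Fit a first-order Markov model over app-launch sequences with
    additive smoothing, so unseen transitions keep nonzero mass."""
    counts = defaultdict(lambda: defaultdict(float))
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1.0
    V = len(vocab)
    probs = {}
    for prev in vocab:
        total = sum(counts[prev].values())
        probs[prev] = {nxt: (counts[prev][nxt] + alpha) / (total + alpha * V)
                       for nxt in vocab}
    return probs

def log_likelihood(seq, probs, vocab, alpha=1.0):
    """Average per-transition log-likelihood of a session; thresholding
    this score accepts or rejects the claimed user. Apps never seen in
    enrollment fall back to the uniform smoothed probability."""
    V = len(vocab)
    ll = 0.0
    for prev, nxt in zip(seq, seq[1:]):
        p = probs.get(prev, {}).get(nxt, alpha / (alpha * V))
        ll += math.log(p)
    return ll / max(1, len(seq) - 1)
```

A genuine session drawn from the enrolled user's habitual transitions scores a higher average log-likelihood than an impostor session over unfamiliar apps, which is the signal a continuous-authentication system thresholds in real time.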