
Collaborating Authors

McDuff




PhysDrive: A Multimodal Remote Physiological Measurement Dataset for In-vehicle Driver Monitoring

Wang, Jiyao, Yang, Xiao, Hu, Qingyong, Tang, Jiankai, Liu, Can, He, Dengbo, Wang, Yuntao, Chen, Yingcong, Wu, Kaishun

arXiv.org Artificial Intelligence

Robust and unobtrusive in-vehicle physiological monitoring is crucial for ensuring driving safety and user experience. While remote physiological measurement (RPM) offers a promising non-invasive solution, its translation to real-world driving scenarios is critically constrained by the scarcity of comprehensive datasets. Existing resources are often limited in scale, modality diversity, breadth of biometric annotations, and range of captured conditions, thereby omitting inherent real-world challenges in driving. Here, we present PhysDrive, the first large-scale multimodal dataset for contactless in-vehicle physiological sensing, with dedicated consideration of various modality settings and driving factors. PhysDrive collects data from 48 drivers, including synchronized RGB video, near-infrared video, and raw mmWave radar data, accompanied by six synchronized ground truths (ECG, BVP, respiration, HR, RR, and SpO2). It covers a wide spectrum of naturalistic driving conditions, including driver motions, dynamic natural light, vehicle types, and road conditions. We extensively evaluate both signal-processing and deep-learning methods on PhysDrive, establishing a comprehensive benchmark across all modalities, and release full open-source code compatible with mainstream public toolboxes. We envision that PhysDrive will serve as a foundational resource and accelerate research on multimodal driver monitoring and smart-cockpit systems.
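
The abstract notes that both classic signal-processing and deep-learning methods are benchmarked on PhysDrive. For orientation, one widely used signal-processing baseline in this family is the plane-orthogonal-to-skin (POS) method of Wang et al. (2017); the sketch below is a simplified illustration of that algorithm, not PhysDrive's released code, and the input name `rgb` (a (T, 3) array of spatially averaged skin-pixel values), the 30 fps sampling rate, and the 1.6 s window length are all assumptions.

```python
# A simplified sketch of the POS (plane-orthogonal-to-skin) rPPG baseline
# (Wang et al., 2017). Input `rgb` is assumed to be a (T, 3) array of
# spatially averaged skin-pixel intensities; fs and win_sec are assumptions.
import numpy as np

def pos_pulse(rgb, fs=30.0, win_sec=1.6):
    """Recover a blood-volume-pulse estimate from mean RGB traces."""
    rgb = np.asarray(rgb, dtype=float)
    T = rgb.shape[0]
    L = int(win_sec * fs)                          # ~1.6 s analysis window
    P = np.array([[0.0, 1.0, -1.0],
                  [-2.0, 1.0, 1.0]])               # projection onto the POS plane
    h = np.zeros(T)
    for start in range(T - L + 1):
        Cn = rgb[start:start + L]
        Cn = Cn / Cn.mean(axis=0)                  # temporal normalization
        S = Cn @ P.T                               # two chrominance projections
        s = S[:, 0] + (S[:, 0].std() / (S[:, 1].std() + 1e-9)) * S[:, 1]
        h[start:start + L] += s - s.mean()         # overlap-add the windows
    return h
```

Deep-learning baselines replace this hand-crafted projection with a mapping learned end-to-end from video to pulse.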


Camera-Based Remote Physiology Sensing for Hundreds of Subjects Across Skin Tones

Tang, Jiankai, Li, Xinyi, Liu, Jiacheng, Zhang, Xiyuxing, Wang, Zeyu, Wang, Yuntao

arXiv.org Artificial Intelligence

Remote photoplethysmography (rPPG) is emerging as a promising method for non-invasive, convenient measurement of vital signs, leveraging the widespread presence of cameras. Despite advancements, existing datasets fall short in size and diversity, limiting comprehensive evaluation under diverse conditions. This paper presents an in-depth analysis of the VitalVideo dataset, the largest real-world rPPG dataset to date, encompassing 893 subjects and six Fitzpatrick skin tones. Our experimentation with six unsupervised methods and three supervised models demonstrates that datasets comprising a few hundred subjects (i.e., 300 for UBFC-rPPG, 500 for PURE, and 700 for MMPD-Simple) are sufficient for effective rPPG model training. Our findings highlight the importance of diversity and consistency in skin tones for precise performance evaluation across different datasets.
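
Since the finding above stresses skin-tone diversity for fair evaluation, one natural protocol is to report error stratified by Fitzpatrick group rather than as a single pooled number. The snippet below is a minimal, hypothetical sketch of that idea; the function name and inputs are illustrative placeholders, not the VitalVideo release or any toolbox API.

```python
# A minimal sketch of skin-tone-stratified evaluation: report heart-rate
# error per Fitzpatrick group (1-6) instead of one pooled figure.
# All inputs here are illustrative placeholders, not VitalVideo data.
import numpy as np

def per_tone_mae(hr_pred, hr_true, fitzpatrick):
    """Mean absolute HR error (bpm) for each Fitzpatrick skin-tone group."""
    hr_pred, hr_true = np.asarray(hr_pred, float), np.asarray(hr_true, float)
    tones = np.asarray(fitzpatrick)
    return {int(t): float(np.mean(np.abs(hr_pred[tones == t] - hr_true[tones == t])))
            for t in np.unique(tones)}
```

For example, per_tone_mae([72, 88, 65], [70, 90, 66], [2, 4, 6]) yields an MAE of 2.0, 2.0, and 1.0 bpm for groups 2, 4, and 6 respectively, making any tone-dependent bias immediately visible.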


EfficientPhys: Enabling Simple, Fast and Accurate Camera-Based Vitals Measurement

Liu, Xin, Hill, Brian L., Jiang, Ziheng, Patel, Shwetak, McDuff, Daniel

arXiv.org Artificial Intelligence

Camera-based physiological measurement is a growing field, with neural models providing state-of-the-art performance. Prior research has explored various "end-to-end" models; however, these methods still require several preprocessing steps. These additional operations are often non-trivial to implement, making replication and deployment difficult, and they can even carry a higher computational budget than the "core" network itself. In this paper, we propose two novel and efficient neural models for camera-based physiological measurement, called EfficientPhys, that remove the need for face detection, segmentation, normalization, color space transformation, or any other preprocessing steps. Using an input of raw video frames, our models achieve state-of-the-art accuracy on three public datasets. We show that this is the case whether using a transformer or convolutional backbone. We further evaluate the latency of the proposed networks and show that our most lightweight network also achieves a 33% improvement in efficiency. Camera-based physiological measurement is a non-contact approach for capturing cardiac signals via light reflected from the body. The most common such signal is the blood volume pulse (BVP) measured via the photoplethysmogram (PPG). From this, heart rate (Takano & Ohta, 2007; Verkruysse et al., 2008), respiration rate (Poh et al., 2010), and pulse transit time (Shao et al., 2014) can be derived.
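
The closing sentences describe the usual downstream step: once a BVP/PPG trace has been recovered, heart rate can be read off as the dominant spectral peak within a physiologically plausible band. The snippet below is a minimal sketch of that derivation under assumed inputs (`bvp`, a 1-D trace sampled at `fs` Hz); it is an illustration, not the EfficientPhys code.

```python
# A minimal sketch of heart-rate estimation from a recovered BVP/PPG trace:
# pick the strongest frequency in a plausible heart-rate band (0.7-3.0 Hz,
# i.e., 42-180 bpm). `bvp` and `fs` are assumed inputs, not an EfficientPhys API.
import numpy as np

def heart_rate_bpm(bvp, fs=30.0, lo=0.7, hi=3.0):
    """Estimate heart rate (bpm) as the dominant spectral peak in [lo, hi] Hz."""
    bvp = np.asarray(bvp, dtype=float)
    bvp = bvp - bvp.mean()                         # remove the DC component
    freqs = np.fft.rfftfreq(bvp.size, d=1.0 / fs)  # frequency axis in Hz
    power = np.abs(np.fft.rfft(bvp)) ** 2          # power spectrum
    band = (freqs >= lo) & (freqs <= hi)           # restrict to plausible HR range
    return 60.0 * freqs[band][np.argmax(power[band])]
```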


Emotion AI Has Great Promise (When Used Responsibly)

#artificialintelligence

You need to speak with an agent quickly, but everyone is occupied, with lines stretching endlessly. So you go to the robot for help. The robot assistant answers your questions over the course of a genuine, back-and-forth conversation. And despite the noisy environment, it's able to register the stress in your voice -- along with a multitude of other verbal emotional cues -- and modulate its own tone in response. That scenario, laid out by Rana Gujral, CEO of Behavioral Signals, is still a hypothetical -- but it might be reality sooner than you think. "Within the next five years, you'll see some really amazing experiences come out," he said. Gujral isn't in the robotics or chatbot game, but he is in the business of emotion AI: artificial intelligence that detects and analyzes human emotional signals. Emotion AI isn't limited to voice.


What is emotion AI and why should you care? - KDnuggets

#artificialintelligence

By Natalia Modjeska, MBA, PhD, who helps organizations make sense of AI/ML. Recently I had the opportunity to attend the inaugural Emotion AI Conference, organized by Seth Grimes, a leading analyst and business consultant in the areas of natural language processing (NLP), text analytics, sentiment analysis, and their business applications. The conference was attended by about 70 people (including presenters and panelists) from industry and academia in the US, Canada, and Europe. Given the conference topic, what is emotion AI, why is it relevant, and what do you need to know about it? Read on to find out (warning: this is a long-ish article), but first, some background. We humans are highly emotional beings, and emotions impact everything we do, even if we are not, for the most part, aware of it.


Microsoft's Ada Is an AI Art Installation That Converts Emotions into a Beautiful Light Display - WinBuzzer

#artificialintelligence

The role AI plays today is largely behind the scenes. Other than the occasional industrial robot or self-driving car, the benefits we see are largely in opaque software features. By working with Novartis, Microsoft has created a much more visual representation of the emerging technology. Project Ada is a giant two-story structure that inhabits building 99 on Microsoft's campus. According to designer Jenny Sabin, it's the first time an architectural structure has been driven by AI in real time.


Smiles beam and walls blush: Architecture meets AI at Microsoft

#artificialintelligence

Jenny Sabin is perched high on a scissor lift, her head poking through an opening of the porous fabric structure that she's struggling to stretch onto the exoskeleton of her installation piece, which is suspended in the airy atrium of building 99 on Microsoft's Redmond, Washington, campus. Momentarily defeated, she pauses and looks up. "It's going to be gorgeous," she says. "It" is a glowing, translucent and ethereal pavilion that Sabin and her Microsoft collaborators describe as both a research tool and a glimpse into a future in which architecture and artificial intelligence merge. "To my knowledge, this installation is the first architectural structure to be driven by artificial intelligence in real time," said Sabin, principal designer at Jenny Sabin Studio in Ithaca, New York, who designed and built the pavilion as part of Microsoft's Artist in Residence program.


Deep learning tools help users dig into advanced analytics data

#artificialintelligence

At Twitter Inc., Hugo Larochelle's job is to develop an understanding of how users of the social network are connected to each other and what interests them in order to categorize and promote content that includes tweets, images and videos. To help accomplish that, he and his fellow data analysts use an emerging technology: deep learning tools. As Larochelle, a research scientist at Twitter, explained during a presentation at the Deep Learning Summit in Boston this month, deep learning is a category of machine learning that seeks to understand complex problems, such as interpreting images or text-based natural language. He and other proponents say deep learning techniques -- which lean heavily on the use of neural networks -- are more useful than traditional machine learning when data analytics applications involve unstructured data or require subjective interpretations. And deep learning is quickly becoming a hot field in the realm of advanced data analytics.