bystander
Real-Time Mobile Video Analytics for Pre-arrival Emergency Medical Services
Jin, Liuyi, Haroon, Amran, Stoleru, Radu, Gunawardena, Pasan, Middleton, Michael, Kim, Jeeeun
Timely and accurate pre-arrival video streaming and analytics are critical for emergency medical services (EMS) to deliver life-saving interventions. Yet, current-generation EMS infrastructure remains constrained by one-to-one video streaming and limited analytics capabilities, leaving dispatchers and EMTs to manually interpret overwhelming, often noisy or redundant information in high-stress environments. We present TeleEMS, a mobile live video analytics system that enables pre-arrival multimodal inference by fusing audio and video into a unified decision-making pipeline before EMTs arrive on scene. TeleEMS comprises two key components: TeleEMS Client and TeleEMS Server. The TeleEMS Client runs across phones, smart glasses, and desktops to support bystanders, EMTs en route, and 911 dispatchers. The TeleEMS Server, deployed at the edge, integrates EMSStream, a communication backbone that enables smooth multi-party video streaming. On top of EMSStream, the server hosts three real-time analytics modules: (1) audio-to-symptom analytics via EMSLlama, a domain-specialized LLM for robust symptom extraction and normalization; (2) video-to-vital analytics using state-of-the-art rPPG methods for heart rate estimation; and (3) joint text-vital analytics via PreNet, a multimodal multitask model predicting EMS protocols, medication types, medication quantities, and procedures. Evaluation shows that EMSLlama outperforms GPT-4o (exact-match 0.89 vs. 0.57) and that text-vital fusion improves inference robustness, enabling reliable pre-arrival intervention recommendations. TeleEMS demonstrates the potential of mobile live video analytics to transform EMS operations, bridging the gap between bystanders, dispatchers, and EMTs, and paving the way for next-generation intelligent EMS infrastructure.
- Asia > Middle East > Yemen > Amran Governorate > Amran (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- South America > Paraguay > Asunción > Asunción (0.04)
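The TeleEMS abstract above mentions video-to-vital analytics via rPPG (remote photoplethysmography), which recovers heart rate from subtle periodic intensity changes in skin pixels. As a rough illustration of the core idea only, not TeleEMS's actual pipeline, the sketch below picks the dominant frequency of a pulse trace within the physiological band; the synthetic signal, frame rate, and band limits are illustrative assumptions.

```python
import math

def estimate_hr_bpm(signal, fps, lo=0.7, hi=3.0):
    """Dominant frequency of a pulse trace via a direct DFT, restricted to
    the physiological band 0.7-3.0 Hz (42-180 bpm). Toy rPPG sketch."""
    n = len(signal)
    mean = sum(signal) / n
    x = [s - mean for s in signal]          # remove DC component
    best_f, best_p = 0.0, -1.0
    for k in range(1, n // 2):
        f = k * fps / n                     # frequency of DFT bin k
        if not (lo <= f <= hi):
            continue
        re = sum(x[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
        im = sum(x[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
        p = re * re + im * im               # spectral power at bin k
        if p > best_p:
            best_f, best_p = f, p
    return best_f * 60                      # convert Hz to beats per minute

# Synthetic 1.2 Hz (72 bpm) pulse sampled at 30 fps for 10 seconds
fps, secs = 30, 10
sig = [math.sin(2 * math.pi * 1.2 * t / fps) for t in range(fps * secs)]
print(round(estimate_hr_bpm(sig, fps)))  # → 72
```

Real rPPG methods must additionally handle motion, lighting changes, and skin-region tracking; this sketch only shows the frequency-domain step.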
When Empowerment Disempowers
Yang, Claire, Cakmak, Maya, Kleiman-Weiner, Max
Empowerment, a measure of an agent's ability to control its environment, has been proposed as a universal goal-agnostic objective for motivating assistive behavior in AI agents. While multi-human settings like homes and hospitals are promising for AI assistance, prior work on empowerment-based assistance assumes that the agent assists one human in isolation. We introduce an open source multi-human gridworld test suite Disempower-Grid. Using Disempower-Grid, we empirically show that assistive RL agents optimizing for one human's empowerment can significantly reduce another human's environmental influence and rewards - a phenomenon we formalize as disempowerment. We characterize when disempowerment occurs in these environments and show that joint empowerment mitigates disempowerment at the cost of the user's reward. Our work reveals a broader challenge for the AI alignment community: goal-agnostic objectives that seem aligned in single-agent settings can become misaligned in multi-agent contexts.
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > Colorado > Boulder County > Boulder (0.04)
- Asia > Japan > Honshū > Kantō > Ibaraki Prefecture > Tsukuba (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.93)
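Empowerment, as the abstract above uses it, is commonly formalized as the channel capacity between an agent's action sequences and its resulting future states; under deterministic dynamics this reduces to the log of the number of distinct reachable states. A minimal sketch of that special case on a hypothetical 1-D gridworld (not the Disempower-Grid environment itself):

```python
from math import log2

def n_step_empowerment(state, step, actions, n):
    """n-step empowerment as log2 of the number of distinct states
    reachable in n steps, assuming deterministic dynamics."""
    frontier = {state}
    for _ in range(n):
        frontier = {step(s, a) for s in frontier for a in actions}
    return log2(len(frontier))

# Toy 1-D gridworld on cells 0..4: moving off an edge keeps the agent in place.
def step(s, a):
    return min(4, max(0, s + a))

actions = (-1, 0, 1)
# From the center, two steps reach all 5 cells...
print(n_step_empowerment(2, step, actions, 2))  # log2(5) ≈ 2.32
# ...while a corner cell reaches only 3: less control, lower empowerment.
print(n_step_empowerment(0, step, actions, 2))  # log2(3) ≈ 1.58
```

The disempowerment phenomenon the paper formalizes would show up here as one agent's actions shrinking another agent's reachable set.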
NeoARCADE: Robust Calibration for Distance Estimation to Support Assistive Drones for the Visually Impaired
Raj, Suman, Madhabhavi, Bhavani A, Kumar, Madhav, Gupta, Prabhav, Simmhan, Yogesh
Autonomous navigation by drones using onboard sensors, combined with deep learning and computer vision algorithms, is impacting a number of domains. We examine the use of drones to autonomously follow and assist Visually Impaired People (VIPs) in navigating urban environments. Estimating the absolute distance between the drone and the VIP, and to nearby objects, is essential for designing obstacle avoidance algorithms. Here, we present NeoARCADE (Neo), which uses depth maps over monocular video feeds, common in consumer drones, to estimate absolute distances to the VIP and obstacles. Neo proposes a robust calibration technique based on depth score normalization and coefficient estimation to translate relative distances from the depth map into absolute ones. It further develops a dynamic recalibration method that can adapt to changing scenarios. We also develop two baseline models, Regression and Geometric, and compare Neo with SOTA depth map approaches and the baselines. We provide detailed evaluations to validate their robustness and generalizability for distance estimation to VIPs and other obstacles in diverse and dynamic conditions, using datasets collected in a campus environment. Neo predicts distances to the VIP with an error under 30 cm, and to different obstacles like cars and bicycles within a maximum error of 60 cm, outperforming the baselines. Neo also clearly outperforms SOTA depth map methods, reporting errors 5.3-14.6x lower.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- Asia > India > Karnataka > Bengaluru (0.04)
- North America > United States > Washington > King County > Seattle (0.04)
- Transportation (1.00)
- Information Technology > Robotics & Automation (0.88)
- Health & Medicine > Therapeutic Area > Ophthalmology/Optometry (0.46)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)
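Neo's calibration maps relative depth scores onto absolute distances. A common way to do this, sketched below purely for illustration, is to fit a scale and offset against a few ground-truth anchor measurements by least squares; the data points and the linear form are assumptions here, not the paper's actual coefficient-estimation method (its dynamic recalibration would then amount to refitting when residuals drift).

```python
def fit_depth_calibration(rel, abs_d):
    """Fit d_abs ≈ a * d_rel + b by least squares (closed form, 2 params)."""
    n = len(rel)
    mx, my = sum(rel) / n, sum(abs_d) / n
    a = (sum((x - mx) * (y - my) for x, y in zip(rel, abs_d))
         / sum((x - mx) ** 2 for x in rel))
    b = my - a * mx
    return a, b

# Hypothetical anchors: normalized depth scores vs. measured distances (m)
rel = [0.2, 0.4, 0.6, 0.8]
abs_d = [1.1, 2.0, 3.1, 4.0]
a, b = fit_depth_calibration(rel, abs_d)
est = a * 0.5 + b  # absolute distance estimate for a new depth score of 0.5
```

With these toy anchors the fit gives a = 4.9, b = 0.1, so a score of 0.5 maps to about 2.55 m. Monocular depth maps are typically correct only up to such a scale and shift, which is why a handful of absolute anchors suffices for this kind of calibration.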
Time to Talk: LLM Agents for Asynchronous Group Communication in Mafia Games
Eckhaus, Niv, Berger, Uri, Stanovsky, Gabriel
LLMs are used predominantly in synchronous communication, where a human user and a model communicate in alternating turns. In contrast, many real-world settings are asynchronous. For example, in group chats, online team meetings, or social games, there is no inherent notion of turns. In this work, we develop an adaptive asynchronous LLM agent consisting of two modules: a generator that decides what to say, and a scheduler that decides when to say it. To evaluate our agent, we collect a unique dataset of online Mafia games, where our agent plays with human participants. Overall, our agent performs on par with human players, both in game performance metrics and in its ability to blend in with the other human players. Our analysis shows that the agent's behavior in deciding when to speak closely mirrors human patterns, although differences emerge in message content. We make all of our code and data publicly available. This work paves the way for integration of LLMs into realistic human group settings, from assistance in team discussions to educational and professional environments where complex social dynamics must be navigated.
- North America > United States > Florida > Miami-Dade County > Miami (0.04)
- North America > United States > Washington > King County > Seattle (0.04)
- North America > United States > Hawaii (0.04)
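The generator/scheduler split described in the Mafia-agent abstract can be pictured with a toy agent: the scheduler decides when to speak (here, a quiet-period threshold plus a coin flip, a placeholder policy), and the generator decides what to say (here, a stub standing in for an LLM call). Everything below is a hypothetical sketch, not the paper's implementation.

```python
import random

class AsyncChatAgent:
    """Toy asynchronous agent with separate scheduler and generator modules."""

    def __init__(self, min_gap=5.0, speak_prob=0.3, seed=0):
        self.min_gap = min_gap        # seconds of quiet before considering a turn
        self.speak_prob = speak_prob  # chance to speak once the gap has passed
        self.rng = random.Random(seed)

    def scheduler(self, now, last_msg_time):
        # Decide WHEN: only consider speaking after a quiet period.
        return (now - last_msg_time) >= self.min_gap and self.rng.random() < self.speak_prob

    def generator(self, history):
        # Decide WHAT: stand-in for an LLM call conditioned on chat history.
        return f"(reply to {len(history)} messages)"

    def tick(self, now, last_msg_time, history):
        """Called periodically; returns a message or None (stay silent)."""
        if self.scheduler(now, last_msg_time):
            return self.generator(history)
        return None

agent = AsyncChatAgent(min_gap=5.0, speak_prob=1.0)
print(agent.tick(now=10.0, last_msg_time=0.0, history=["hi", "who's sus?"]))
# → (reply to 2 messages)
```

The key design point the abstract highlights is that "stay silent" is a first-class output: the agent is polled continuously and most ticks produce no message, which is what lets its timing mirror human patterns.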
NarraGuide: an LLM-based Narrative Mobile Robot for Remote Place Exploration
Hu, Yaxin, Sato, Arissa J., Du, Jingxin, Ye, Chenming, Zhu, Anjun, Praveena, Pragathi, Mutlu, Bilge
Robotic telepresence enables users to navigate and experience remote environments. However, effective navigation and situational awareness depend on users' prior knowledge of the environment, limiting the usefulness of these systems for exploring unfamiliar places. We explore how integrating location-aware LLM-based narrative capabilities into a mobile robot can support remote exploration. We developed a prototype system, called NarraGuide, that provides narrative guidance for users to explore and learn about a remote place through a dialogue-based interface. We deployed our prototype in a geology museum, where remote participants (n=20) used the robot to tour the museum. Our findings reveal how users perceived the robot's role, engaged in dialogue during the tour, and expressed preferences regarding bystander encounters. Our work demonstrates the potential of LLM-enabled robotic capabilities to deliver location-aware narrative guidance and enrich the experience of exploring remote environments.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
- Asia > South Korea > Busan > Busan (0.05)
- Health & Medicine > Consumer Health (0.46)
- Government > Military (0.34)
Woman kicks Southwest employee, punches computer monitors in violent airport meltdown
A woman was captured on video attacking Southwest Airlines staff at Orlando International Airport on August 14, 2025, during an apparent standby outburst. Video shows the woman, identified as Velez-Rodriguez, having a violent meltdown in the Southwest Airlines concourse. "Motherf- - - -r, are you kidding me?" she is heard saying on the video. Wearing a large backpack, shorts, a red long-sleeved shirt and a ballcap, she yells at an employee, saying she's trying to get to her destination to bury her brother.
- North America > United States > Massachusetts (0.06)
- North America > United States > Illinois (0.06)
- Transportation > Infrastructure & Services > Airport (1.00)
- Transportation > Air (1.00)
Meta is reportedly working on facial recognition for its AI glasses
Diminished tech privacy appears to be another ripple effect from Trump 2.0. The Information reported on Wednesday that Meta has changed its tune on facial recognition. After considering but ultimately bailing on the technology for the first version of its smart glasses, the company is now actively working on wearables that can identify nearby faces. Remember when being a "Glasshole" was considered a faux pas? According to The Information, Meta has recently discussed adding software to its smart glasses that scans bystanders' faces and identifies people by name.
- Information Technology > Artificial Intelligence > Vision > Face Recognition (0.67)
- Information Technology > Human Computer Interaction > Interfaces (0.62)
- Information Technology > Hardware (0.62)
HA-VLN: A Benchmark for Human-Aware Navigation in Discrete-Continuous Environments with Dynamic Multi-Human Interactions, Real-World Validation, and an Open Leaderboard
Dong, Yifei, Wu, Fengyi, He, Qi, Li, Heng, Li, Minghan, Cheng, Zebang, Zhou, Yuxuan, Sun, Jingdong, Dai, Qi, Cheng, Zhi-Qi, Hauptmann, Alexander G
Vision-and-Language Navigation (VLN) systems often focus on either discrete (panoramic) or continuous (free-motion) paradigms alone, overlooking the complexities of human-populated, dynamic environments. We introduce a unified Human-Aware VLN (HA-VLN) benchmark that merges these paradigms under explicit social-awareness constraints. Our contributions include: 1. A standardized task definition that balances discrete-continuous navigation with personal-space requirements; 2. An enhanced human motion dataset (HAPS 2.0) and upgraded simulators capturing realistic multi-human interactions, outdoor contexts, and refined motion-language alignment; 3. Extensive benchmarking on 16,844 human-centric instructions, revealing how multi-human dynamics and partial observability pose substantial challenges for leading VLN agents; 4. Real-world robot tests validating sim-to-real transfer in crowded indoor spaces; and 5. A public leaderboard supporting transparent comparisons across discrete and continuous tasks. Empirical results show improved navigation success and fewer collisions when social context is integrated, underscoring the need for human-centric design. By releasing all datasets, simulators, agent code, and evaluation tools, we aim to advance safer, more capable, and socially responsible VLN research.
- Workflow (0.67)
- Research Report > New Finding (0.34)
- Leisure & Entertainment (1.00)
- Media > Television (0.45)
Peek into the 'White-Box': A Field Study on Bystander Engagement with Urban Robot Uncertainty
Yu, Xinyan, Hoggenmueller, Marius, Tran, Tram Thi Minh, Wang, Yiyuan, Zhang, Qiuming, Tomitsch, Martin
Uncertainty inherently exists in the autonomous decision-making process of robots. Involving humans in resolving this uncertainty not only helps robots mitigate it but is also crucial for improving human-robot interactions. However, in public urban spaces filled with unpredictability, robots often face heightened uncertainty without direct human collaborators. This study investigates how robots can engage bystanders for assistance in public spaces when encountering uncertainty, and examines how these interactions impact bystanders' perceptions of and attitudes towards robots. We designed and tested a speculative 'peephole' concept that engages bystanders in resolving urban robot uncertainty. Our design is guided by considerations of non-intrusiveness and of eliciting initiative in an implicit manner, given bystanders' unique role as non-obligated participants in relation to urban robots. Drawing from field study findings, we highlight the potential of involving bystanders in mitigating urban robots' technological imperfections, both to address operational challenges and to foster public acceptance of urban robots. Furthermore, we offer design implications to encourage bystanders' involvement in mitigating these imperfections.
- Europe > Austria > Vienna (0.14)
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- Transportation (0.68)
- Health & Medicine (0.47)
- Leisure & Entertainment (0.46)
Analyzing Human Perceptions of a MEDEVAC Robot in a Simulated Evacuation Scenario
Jordan, Tyson, Pandey, Pranav, Doshi, Prashant, Parasuraman, Ramviyas, Goodie, Adam
The use of autonomous systems in medical evacuation (MEDEVAC) scenarios is promising, but existing implementations overlook key insights from human-robot interaction (HRI) research. Studies on human-machine teams demonstrate that human perceptions of a machine teammate are critical in governing the machine's performance. Here, we present a mixed factorial design to assess human perceptions of a MEDEVAC robot in a simulated evacuation scenario. Participants were assigned to the role of casualty (CAS) or bystander (BYS) and subjected to three within-subjects conditions based on the MEDEVAC robot's operating mode: autonomous-slow (AS), autonomous-fast (AF), and teleoperation (TO). During each trial, a MEDEVAC robot navigated an 11-meter path, acquiring a casualty and transporting them to an ambulance exchange point while avoiding an idle bystander. Following each trial, subjects completed a questionnaire measuring their emotional states, perceived safety, and social compatibility with the robot. Results indicate a consistent main effect of operating mode on reported emotional states and perceived safety. Pairwise analyses suggest that the employment of the AF operating mode negatively impacted perceptions along these dimensions. There were no persistent differences between casualty and bystander responses.
- North America > United States > Georgia > Clarke County > Athens (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- Asia > Middle East > Jordan (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Government > Military > Army (0.68)
- Education (0.68)
- Health & Medicine (0.68)