Smart Monitoring of Complex Public Scenes

AAAI Conferences

Security operators are increasingly interested in solutions that can provide an automatic understanding of potentially crowded public environments. This paper presents ongoing research on building a complex system that consists of three main components: human security operators carrying sensors, mobile robotic platforms carrying sensors, and a network of fixed sensors (i.e., cameras) installed in the environment. The main objectives of this research are: 1) to develop models and solutions for the intelligent integration of sensory information coming from different sources; 2) to develop effective human-robot interaction methods in the multi-human vs. multi-robot paradigm; and 3) to integrate all these components in a system that allows for robust and efficient coordination among robots, vision sensors, and human guards, in order to enhance surveillance in crowded public environments.


Cooperative Active Perception using POMDPs

AAAI Conferences

The fixed cameras provide a global but incomplete and possibly inaccurate view of the environment, which can be enhanced by a robot's local sensors. Active perception means that the robot considers the effects of its actions on its sensory capabilities. In particular, it tries to improve its sensors' performance, for instance by pointing a pan-and-tilt camera. In this paper, we present a decision-theoretic approach to cooperative active perception, formalizing the problem as a Partially Observable Markov Decision Process (POMDP). POMDPs provide an elegant way to model the interaction of an active sensor with its environment. The goal of this paper is to provide first steps towards an integrated decision-theoretic approach to cooperative active perception.
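As a rough illustration of how a POMDP models an active sensor, the sketch below performs the standard Bayesian belief update after the robot takes a sensing action and receives an observation. The two-location world, the action and observation encodings, and the detection probabilities are all hypothetical values chosen for this example; they are not taken from the paper.

```python
def belief_update(belief, action, observation, T, O):
    """Standard POMDP belief update.

    b'(s') is proportional to O[a][s'][o] * sum_s T[a][s][s'] * b(s).
    T[a][s][s2] is the transition probability, O[a][s2][o] the observation
    probability. Both tables are assumptions supplied by the caller.
    """
    n = len(belief)
    # Prediction step: push the belief through the transition model.
    predicted = [
        sum(T[action][s][s2] * belief[s] for s in range(n)) for s2 in range(n)
    ]
    # Correction step: weight each state by the observation likelihood.
    unnormalized = [O[action][s2][observation] * predicted[s2] for s2 in range(n)]
    total = sum(unnormalized)
    if total == 0:
        raise ValueError("observation has zero probability under this belief")
    return [u / total for u in unnormalized]


# Hypothetical toy problem: a person stands at location 0 or 1 and does not move.
# Action 0 points the camera at location 0, action 1 points it at location 1.
# Observation 1 means "person detected", observation 0 means "nothing detected".
identity = [[1.0, 0.0], [0.0, 1.0]]
T = [identity, identity]
O = [
    [[0.1, 0.9], [0.9, 0.1]],  # looking at location 0
    [[0.9, 0.1], [0.1, 0.9]],  # looking at location 1
]

prior = [0.5, 0.5]
posterior = belief_update(prior, action=0, observation=1, T=T, O=O)
```

In this toy setting, a detection while looking at location 0 shifts the belief from an even split to 0.9 for location 0. The choice of where to point the camera therefore changes what the robot can learn, which is the aspect of active perception that the POMDP formulation captures.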


A Decision-Theoretic Approach to Dynamic Sensor Selection in Camera Networks

AAAI Conferences

Nowadays many urban areas have been equipped with networks of surveillance cameras, which can be used for automatic localization and tracking of people. However, given the large resource demands of imaging sensors in terms of bandwidth and computing power, processing the image streams of all cameras simultaneously might not be feasible. In this paper, we consider the problem of dynamic sensor selection based on user-defined objectives, such as maximizing coverage or reducing localization uncertainty. We propose a decision-theoretic approach modeled as a POMDP, which selects k sensors to consider in the next time frame, incorporating all observations made in the past. We show how, by changing the POMDP's reward function, we can change the system's behavior in a straightforward manner, fulfilling the user's chosen objective. We successfully apply our techniques to a network of 10 cameras.
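The idea that swapping the reward function changes which sensors are selected can be sketched with a much simpler stand-in: a one-step (myopic) selection over a belief about a single person's location. This is not the paper's POMDP solution. The grid cells, camera names, fields of view, and the noiseless-detection assumption are all hypothetical and serve only to make the example concrete.

```python
from itertools import combinations
from math import log2


def coverage_reward(belief, selected, fov):
    """Probability that the person is inside the union of the selected views."""
    covered = set().union(*(fov[c] for c in selected))
    return sum(belief[cell] for cell in covered)


def localization_reward(belief, selected, fov):
    """Negative expected posterior entropy, assuming noiseless detections.

    Cells seen by exactly the same subset of selected cameras cannot be told
    apart, so the remaining uncertainty is the entropy within each such group.
    """
    groups = {}
    for cell, p in enumerate(belief):
        signature = frozenset(c for c in selected if cell in fov[c])
        groups.setdefault(signature, []).append(p)
    expected_entropy = 0.0
    for probs in groups.values():
        mass = sum(probs)
        if mass > 0:
            expected_entropy -= sum(p * log2(p / mass) for p in probs if p > 0)
    return -expected_entropy


def select_sensors(belief, fov, k, reward):
    """Exhaustively pick the k cameras that maximize the given reward."""
    return max(
        combinations(sorted(fov), k),
        key=lambda selected: reward(belief, selected, fov),
    )


# Hypothetical setup: four grid cells and three cameras.
belief = [0.4, 0.4, 0.1, 0.1]
fov = {"A": {0, 1}, "B": {0}, "C": {2, 3}}

best_for_coverage = select_sensors(belief, fov, 1, coverage_reward)
best_for_localization = select_sensors(belief, fov, 1, localization_reward)
```

With this toy belief, the coverage objective picks camera A, which sees the most probability mass, while the localization objective picks camera B, whose narrow view separates the two most likely cells. Exhaustive enumeration is only practical for small networks; it is used here for clarity.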


Watch their Moves: Applying Probabilistic Multiple Object Tracking to Autonomous Robot Soccer

AAAI Conferences

In many autonomous robot applications, robots must be capable of estimating the positions and motions of moving objects in their environments. In this paper, we apply probabilistic multiple object tracking to estimating the positions of opponent players in autonomous robot soccer. We extend an existing tracking algorithm to handle multiple mobile sensors with uncertain positions, discuss the specification of probabilistic models needed by the algorithm, and describe the required vision-interpretation algorithms. The multiple object tracking algorithm was applied successfully throughout the RoboCup 2001 world championship.


Understanding Activity: Learning the Language of Action

Randal Nelson and Yiannis Aloimonos, Univ. of Rochester and Maryland

1.1 Overview

AAAI Conferences

What does it mean for an intelligent agent to "understand" activity? This question borders on the philosophical and, given the current state of the art, could be debated endlessly. However, we feel there is some leverage to be gained by attempting to define, in a functional sense, what it might mean for an agent to "understand" activity that it observes. Several possibilities come to mind.

1. The agent is remembering what it observes. Specifically, it is storing a compact, indexed representation that captures salient aspects of the experience, related and linked to previous experience. In other words, understanding consists of constructing an integrated, accessible episodic memory. Abstracted "symbolic" structure arises as an emergent phenomenon from the requirements of compactness and indexability. Important issues include: How is saliency determined? How are the compact representations acquired?