robot photographer
PhotoBot: Reference-Guided Interactive Photography via Natural Language
Limoyo, Oliver, Li, Jimmy, Rivkin, Dmitriy, Kelly, Jonathan, Dudek, Gregory
We introduce PhotoBot, a framework for automated photo acquisition based on an interplay between high-level human language guidance and a robot photographer. We propose to communicate photography suggestions to the user via a reference picture that is retrieved from a curated gallery. We exploit a visual language model (VLM) and an object detector to characterize reference pictures via textual descriptions, and use a large language model (LLM) to retrieve relevant reference pictures based on a user's language query through text-based reasoning. To establish correspondences between the reference picture and the observed scene, we exploit pre-trained features from a vision transformer capable of capturing semantic similarity across significantly varying images. Using these features, we compute pose adjustments for an RGB-D camera by solving a Perspective-n-Point (PnP) problem. We demonstrate our approach on a real-world manipulator equipped with a wrist camera. Our user studies show that photos taken by PhotoBot are often more aesthetically pleasing than those taken by users themselves, as measured by human feedback.
- North America > Canada > Quebec > Montreal (0.14)
- North America > Canada > Ontario > Toronto (0.14)
- Research Report (0.82)
- Questionnaire & Opinion Survey (0.69)
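PhotoBot's camera-adjustment step, matching semantic features between the reference picture and the live RGB-D view and then solving PnP, can be illustrated with a minimal Direct Linear Transform (DLT) PnP solver. This is a sketch of the underlying geometry, not the paper's implementation (a real pipeline would more likely pair a robust matcher with an off-the-shelf solver such as OpenCV's `solvePnPRansac`); the function name and synthetic setup below are illustrative.

```python
import numpy as np

def dlt_pnp(pts3d, pts2d):
    """Recover a camera pose (R, t) from n >= 6 3D-2D correspondences.

    pts3d: (n, 3) scene points, e.g. back-projected from the RGB-D view.
    pts2d: (n, 2) matched points in normalized image coordinates
           (pixel coordinates pre-multiplied by K^-1).
    Satisfies lambda * [u, v, 1]^T = R @ X + t for each correspondence.
    """
    rows = []
    for X, (u, v) in zip(pts3d, pts2d):
        Xh = np.append(X, 1.0)  # homogeneous 3D point
        rows.append(np.concatenate([Xh, np.zeros(4), -u * Xh]))
        rows.append(np.concatenate([np.zeros(4), Xh, -v * Xh]))
    A = np.asarray(rows)
    # The projection matrix is the null vector of A (smallest singular value).
    P = np.linalg.svd(A)[2][-1].reshape(3, 4)
    M = P[:, :3]
    # Undo the arbitrary DLT scale/sign, then project M onto SO(3).
    s = np.sign(np.linalg.det(M)) * np.linalg.svd(M, compute_uv=False).mean()
    U, _, Vt = np.linalg.svd(M / s)
    return U @ Vt, P[:, 3] / s
```

With noiseless synthetic correspondences this recovers the exact pose; in practice the solver would be wrapped in RANSAC to reject bad semantic matches.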
ANSEL Photobot: A Robot Event Photographer with Semantic Intelligence
Rivkin, Dmitriy, Dudek, Gregory, Kakodkar, Nikhil, Meger, David, Limoyo, Oliver, Liu, Xue, Hogan, Francois
Our work examines how large language models can be used for robotic planning and sampling, specifically in the context of automated photographic documentation. We illustrate how to produce a photo-taking robot with an exceptional level of semantic awareness by leveraging recent advances in general-purpose language models (LMs) and vision-language models (VLMs). Given a high-level description of an event, we use an LM to generate a natural-language list of photo descriptions that one would expect a photographer to capture at the event. We then use a VLM to identify the best matches to these descriptions in the robot's video stream. The photo portfolios generated by our method are consistently rated as more appropriate to the event by human evaluators than those generated by existing methods.
- North America > United States > New York > New York County > New York City (0.04)
- Europe > Switzerland > Zürich > Zürich (0.04)
- Media > Photography (1.00)
- Consumer Products & Services > Food, Beverage, Tobacco & Cannabis > Beverages (0.47)
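ANSEL's matching step, scoring each frame in the video stream against each LM-generated photo description with a VLM, reduces to an assignment problem once the score matrix is computed. The abstract does not say how frames are assigned, so the greedy one-frame-per-description policy below is an assumption; `scores[i, j]` stands in for a VLM similarity (e.g., a CLIP-style cosine score) between frame `i` and description `j`.

```python
import numpy as np

def select_portfolio(scores):
    """Pick one distinct frame per photo description, greedily by VLM score.

    scores: (n_frames, n_descriptions) matrix of frame/description
            similarities. Returns the chosen frame index per description.
    """
    used, picks = set(), []
    for j in range(scores.shape[1]):
        ranked = np.argsort(-scores[:, j])  # best-scoring frames first
        best = int(next(i for i in ranked if i not in used))
        used.add(best)
        picks.append(best)
    return picks
```

For example, `select_portfolio(np.array([[0.9, 0.8], [0.2, 0.7], [0.1, 0.3]]))` returns `[0, 1]`: frame 0 wins the first description, so the second description falls to frame 1 even though frame 0 scored higher on it too.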
Shutter, the Robot Photographer: Leveraging Behavior Trees for Public, In-the-Wild Human-Robot Interactions
Lew, Alexander, Thompson, Sydney, Tsoi, Nathan, Vázquez, Marynel
Deploying interactive systems in-the-wild requires adaptability to situations not encountered in lab environments. Our work details our experience with the impact of architecture choice on behavior reusability and reactivity while deploying a public interactive system. In particular, we introduce Shutter, a robot photographer and a platform for public interaction. In designing Shutter's architecture, we focused on adaptability for in-the-wild deployment while developing a reusable platform to facilitate future research in public human-robot interaction. We find that behavior trees allow for reactivity, especially in group settings, and encourage the design of reusable behaviors.
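The behavior-tree pattern the Shutter work advocates can be sketched in a few lines: composite nodes (`Sequence`, `Fallback`) tick reusable leaf behaviors and route control based on their returned status. The photographer leaves below are hypothetical stand-ins, not Shutter's actual behaviors.

```python
SUCCESS, FAILURE, RUNNING = "success", "failure", "running"

class Leaf:
    """Wraps a callable behavior returning SUCCESS, FAILURE, or RUNNING."""
    def __init__(self, fn):
        self.fn = fn
    def tick(self):
        return self.fn()

class Sequence:
    """Ticks children in order; stops at the first child that does not succeed."""
    def __init__(self, children):
        self.children = children
    def tick(self):
        for child in self.children:
            status = child.tick()
            if status != SUCCESS:
                return status
        return SUCCESS

class Fallback:
    """Ticks children in order; later children act as fallbacks on FAILURE."""
    def __init__(self, children):
        self.children = children
    def tick(self):
        for child in self.children:
            status = child.tick()
            if status != FAILURE:
                return status
        return FAILURE

# Hypothetical leaves for a photographer robot, sharing a blackboard dict.
bb = {"person_visible": True, "photos_taken": 0, "idled": False}

def person_detected():
    return SUCCESS if bb["person_visible"] else FAILURE

def take_photo():
    bb["photos_taken"] += 1
    return SUCCESS

def idle():
    bb["idled"] = True
    return SUCCESS

tree = Fallback([
    Sequence([Leaf(person_detected), Leaf(take_photo)]),
    Leaf(idle),  # reusable fallback behavior when no one is around
])
```

Ticking the tree once with a person visible takes a photo; flipping `bb["person_visible"]` to False makes the same tree fall back to idling, which is the kind of reactivity and behavior reuse the authors attribute to behavior trees.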
Pixy drone hands-on: A flying robot photographer for Snapchat users
Drones are everywhere these days, filming dramatic reveals and awe-inspiring scenery for social media platforms. The problem is, they're not exactly approachable for beginners who have only ever used a smartphone. Last month, Snap debuted the $230 Pixy drone exactly for those people. It requires very little skill and acts like a personal robot photographer to help you produce nifty aerial shots. You don't need to pilot the Pixy.
- Information Technology > Communications > Social Media (1.00)
- Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (0.50)
Say Cheese! Experiences with a Robot Photographer
Byers, Zachary, Dixon, Michael, Smart, William D., Grimm, Cindy M.
We have developed an autonomous robot system that takes well-composed photographs of people at social events, such as weddings and conference receptions. In this article, we outline the overall architecture of the system and describe how the various components interrelate. We also describe our experiences deploying the robot photographer at a number of real-world events.
- Media > Photography (1.00)
- Information Technology > Robotics & Automation (1.00)
Say Cheese! Experiences with a Robot Photographer
Byers, Zachary, Dixon, Michael, Smart, William D., Grimm, Cindy M.
We introduced a sensor abstraction layer to separate the task layer from concerns about physical sensing devices. We process the sensor information (from the laser rangefinder in this application) into distance measurements from the center of the robot, thus allowing consideration of sensor error models and performance. This model makes system debugging significantly easier, because we know exactly what each sensor reading is at every point in the computation; something that would not be the case if we were reading from the sensors every time a reading was used in a calculation. This model also allows us to inject modified sensor readings into the system, as described in the next section.
- North America > United States > Texas > Bexar County > San Antonio (0.04)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- North America > United States > Nevada > Clark County > Las Vegas (0.04)
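The excerpt above describes two ideas: sensor readings are snapshotted once per computation cycle, so every use of a reading within a cycle sees the same value, and modified readings can be injected for debugging. A minimal sketch of that pattern, with made-up names rather than the paper's actual code, might look like:

```python
class SensorLayer:
    """Snapshots a raw sensor once per cycle and allows injected overrides.

    `read_raw` is any callable returning the device's current measurement
    (e.g. laser ranges already converted to distances from robot center).
    """
    def __init__(self, read_raw):
        self._read_raw = read_raw
        self._override = None
        self._snapshot = None

    def update(self):
        """Called once per computation cycle to refresh the snapshot."""
        raw = self._read_raw()
        self._snapshot = raw if self._override is None else self._override(raw)

    def read(self):
        """Every read within a cycle returns the same snapshotted value."""
        return self._snapshot

    def inject(self, transform):
        """Install a debugging hook that rewrites raw readings (None clears)."""
        self._override = transform
```

Wrapping a live sensor this way gives the debugging properties the excerpt claims: deterministic readings within each cycle, plus a single point where modified readings can be injected.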