AITopics | Pan, Boxiao

Collaborating Authors

Pan, Boxiao

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

COPILOT: Human-Environment Collision Prediction and Localization from Egocentric Videos

Pan, Boxiao, Shen, Bokui, Rempe, Davis, Paschalidou, Despoina, Mo, Kaichun, Yang, Yanchao, Guibas, Leonidas J.

arXiv.org Artificial IntelligenceMar-26-2023

The ability to forecast human-environment collisions from egocentric observations is vital to enable collision avoidance in applications such as VR, AR, and wearable assistive robotics. In this work, we introduce the challenging problem of predicting collisions in diverse environments from multi-view egocentric videos captured from body-mounted cameras. Solving this problem requires a generalizable perception system that can classify which human body joints will collide and estimate a collision region heatmap to localize collisions in the environment. To achieve this, we propose a transformer-based model called COPILOT to perform collision prediction and localization simultaneously, which accumulates information across multi-view inputs through a novel 4D space-time-viewpoint attention mechanism. To train our model and enable future research on this task, we develop a synthetic data generation framework that produces egocentric videos of virtual humans moving and colliding within diverse 3D environments. This framework is then used to establish a large-scale dataset consisting of 8.6M egocentric RGBD frames. Extensive experiments show that COPILOT generalizes to unseen synthetic as well as real-world scenes. We further demonstrate COPILOT outputs are useful for downstream collision avoidance through simple closed-loop control. Please visit our project webpage at https://sites.google.com/stanford.edu/copilot.

artificial intelligence, collision, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2210.01781

Country: North America > United States > California > Santa Clara County > Palo Alto (0.24)

Genre: Research Report (0.40)

Industry:

Transportation (0.55)
Health & Medicine (0.48)
Energy (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Human Computer Interaction > Interfaces > Virtual Reality (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

Add feedback

Efficient Geometry-aware 3D Generative Adversarial Networks

Chan, Eric R., Lin, Connor Z., Chan, Matthew A., Nagano, Koki, Pan, Boxiao, De Mello, Shalini, Gallo, Orazio, Guibas, Leonidas, Tremblay, Jonathan, Khamis, Sameh, Karras, Tero, Wetzstein, Gordon

arXiv.org Artificial IntelligenceDec-15-2021

Unsupervised generation of high-quality multi-view-consistent images and 3D shapes using only collections of single-view 2D photographs has been a long-standing challenge. Existing 3D GANs are either compute-intensive or make approximations that are not 3D-consistent; the former limits quality and resolution of the generated images and the latter adversely affects multi-view consistency and shape quality. In this work, we improve the computational efficiency and image quality of 3D GANs without overly relying on these approximations. For this purpose, we introduce an expressive hybrid explicit-implicit network architecture that, together with other design choices, synthesizes not only high-resolution multi-view-consistent images in real time but also produces high-quality 3D geometry. By decoupling feature generation and neural rendering, our framework is able to leverage state-of-the-art 2D CNN generators, such as StyleGAN2, and inherit their efficiency and expressiveness. We demonstrate state-of-the-art 3D-aware synthesis with FFHQ and AFHQ Cats, among other experiments.

artificial intelligence, machine learning, representation, (18 more...)

arXiv.org Artificial Intelligence

2112.07945

Country:

North America > United States (0.46)
Asia > Japan > Honshū > Chūbu (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision > Face Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback