AITopics | Strong, Matthew

Strong, Matthew

Next Best Sense: Guiding Vision and Touch with FisherRF for 3D Gaussian Splatting

Strong, Matthew, Lei, Boshu, Swann, Aiden, Jiang, Wen, Daniilidis, Kostas, Kennedy, Monroe III

arXiv.org Artificial Intelligence, Nov-19-2024

We propose a framework for active next-best-view and touch selection for robotic manipulators using 3D Gaussian Splatting (3DGS). 3DGS is emerging as a useful explicit 3D scene representation for robotics, as it can represent scenes in a manner that is both photorealistic and geometrically accurate. However, in real-world, online robotic scenes where efficiency requirements limit the number of views, random view selection for 3DGS becomes impractical because views are often overlapping and redundant. We address this issue by proposing an end-to-end online training and active view selection pipeline, which enhances the performance of 3DGS in few-view robotics settings. We first improve the performance of few-shot 3DGS with a novel semantic depth alignment method using Segment Anything Model 2 (SAM2), which we supplement with Pearson depth and surface normal losses to improve color and depth reconstruction of real-world scenes. We then extend FisherRF, a next-best-view selection method for 3DGS, to select views and touch poses based on depth uncertainty. We perform online view selection on a real robot system during live 3DGS training. We motivate our improvements to few-shot 3DGS scenes and extend depth-based FisherRF to them, demonstrating both qualitative and quantitative improvements on challenging robot scenes. For more information, please see our project page at https://arm.stanford.edu/next-best-sense.
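The abstract mentions a Pearson depth loss but does not give its form. As a minimal sketch, assuming the common formulation of one minus the Pearson correlation between rendered and reference depth values, it could look like the following. The function name `pearson_depth_loss` and the flat-list input are illustrative assumptions, not the authors' implementation.

```python
import math


def pearson_depth_loss(rendered, reference):
    """Return 1 - Pearson correlation between two equal-length depth lists.

    Assumed formulation (not taken from the paper): the loss is invariant to
    a positive scale and a shift of either depth map, which is why it suits
    supervision from relative (monocular) depth estimates.
    """
    if len(rendered) != len(reference) or len(rendered) < 2:
        raise ValueError("depth lists must have equal length of at least 2")

    n = len(rendered)
    mean_r = sum(rendered) / n
    mean_f = sum(reference) / n

    # Unnormalized covariance and variances; the 1/n factors cancel.
    cov = sum((r - mean_r) * (f - mean_f) for r, f in zip(rendered, reference))
    var_r = sum((r - mean_r) ** 2 for r in rendered)
    var_f = sum((f - mean_f) ** 2 for f in reference)

    denom = math.sqrt(var_r * var_f)
    if denom == 0.0:
        # Correlation is undefined for a constant depth map; treating it as
        # uncorrelated is an assumption made for this sketch.
        return 1.0
    return 1.0 - cov / denom
```

Under this formulation, a reference depth that is a positively scaled and shifted copy of the rendered depth yields a loss close to 0, while a sign-flipped copy yields a loss close to 2.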

artificial intelligence, fisherrf, gaussian splatting

2410.0468

Country:

North America > United States > Pennsylvania (0.28)
North America > United States > California > Santa Clara County > Palo Alto (0.24)

Genre: Research Report (0.82)

Industry:

Education > Educational Setting > Online (0.54)
Energy > Oil & Gas > Upstream (0.46)

Technology: Information Technology > Artificial Intelligence > Robots (1.00)

Touch-GS: Visual-Tactile Supervised 3D Gaussian Splatting

Swann, Aiden, Strong, Matthew, Do, Won Kyung, Camps, Gadiel Sznaier, Schwager, Mac, Kennedy, Monroe III

arXiv.org Artificial Intelligence, Mar-18-2024

In this work, we propose a novel method to supervise 3D Gaussian Splatting (3DGS) scenes using optical tactile sensors. Optical tactile sensors have become widespread in robotics for manipulation and object representation; however, raw optical tactile sensor data is unsuitable for directly supervising a 3DGS scene. Our representation leverages a Gaussian Process Implicit Surface to implicitly represent the object, combining many touches into a unified representation with uncertainty. We merge this model with a monocular depth estimation network, which is aligned in a two-stage process: coarse alignment with a depth camera, followed by fine adjustment to match our touch data. For every training image, our method produces a corresponding fused depth and uncertainty map. Using this additional information, we propose a new loss function, variance-weighted depth-supervised loss, for training the 3DGS scene model. We leverage the DenseTact optical tactile sensor and RealSense RGB-D camera to show that combining touch and vision in this manner leads to quantitatively and qualitatively better results than vision or touch alone in few-view scene synthesis on opaque as well as reflective and transparent objects. Please see our project page at http://armlabstanford.github.io/touch-gs
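The abstract names a variance-weighted depth-supervised loss without stating its formula. The sketch below assumes one plausible form, a mean squared depth error in which each pixel is weighted by the inverse of its fused-depth variance, so uncertain pixels contribute less. The function name, the `eps` stabilizer, and the flat-list inputs are hypothetical and may differ from the authors' definition.

```python
def variance_weighted_depth_loss(rendered, fused, variance, eps=1e-6):
    """Mean of squared depth errors, each divided by (variance + eps).

    Assumed form for illustration only: pixels whose fused depth has high
    variance (low confidence) are down-weighted in the supervision signal.
    """
    if not (len(rendered) == len(fused) == len(variance)):
        raise ValueError("all inputs must have the same length")
    if not rendered:
        raise ValueError("inputs must be non-empty")

    total = 0.0
    for depth, target, var in zip(rendered, fused, variance):
        # Inverse-variance weighting; eps guards against division by zero.
        total += (depth - target) ** 2 / (var + eps)
    return total / len(rendered)
```

With this weighting, the same depth error costs far more at a pixel backed by a confident touch measurement (small variance) than at a pixel covered only by an uncertain monocular estimate.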

artificial intelligence, image understanding, representation, (18 more...)

2403.09875

Country: North America > United States > California > Santa Clara County (0.14)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Vision > Image Understanding (0.88)
