AITopics | Shakhnarovich, Gregory

Collaborating Authors

Shakhnarovich, Gregory

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Transcrib3D: 3D Referring Expression Resolution through Large Language Models

Fang, Jiading, Tan, Xiangshan, Lin, Shengjie, Vasiljevic, Igor, Guizilini, Vitor, Mei, Hongyuan, Ambrus, Rares, Shakhnarovich, Gregory, Walter, Matthew R

arXiv.org Artificial IntelligenceApr-29-2024

If robots are to work effectively alongside people, they must be able to interpret natural language references to objects in their 3D environment. Understanding 3D referring expressions is challenging -- it requires the ability to both parse the 3D structure of the scene and correctly ground free-form language in the presence of distraction and clutter. We introduce Transcrib3D, an approach that brings together 3D detection methods and the emergent reasoning capabilities of large language models (LLMs). Transcrib3D uses text as the unifying medium, which allows us to sidestep the need to learn shared representations connecting multi-modal inputs, which would require massive amounts of annotated 3D data. As a demonstration of its effectiveness, Transcrib3D achieves state-of-the-art results on 3D reference resolution benchmarks, with a great leap in performance from previous multi-modality baselines. To improve upon zero-shot performance and facilitate local deployment on edge computers and robots, we propose self-correction for fine-tuning that trains smaller models, resulting in performance close to that of large models. We show that our method enables a real robot to perform pick-and-place tasks given queries that contain challenging referring expressions. Project site is at https://ripl.github.io/Transcrib3D.

artificial intelligence, large language model, natural language, (16 more...)

arXiv.org Artificial Intelligence

2404.19221

Country: North America > United States (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

6-DoF Stability Field via Diffusion Models

Yoneda, Takuma, Jiang, Tianchong, Shakhnarovich, Gregory, Walter, Matthew R.

arXiv.org Artificial IntelligenceOct-26-2023

A core capability for robot manipulation is reasoning over where and how to stably place objects in cluttered environments. Traditionally, robots have relied on object-specific, hand-crafted heuristics in order to perform such reasoning, with limited generalizability beyond a small number of object instances and object interaction patterns. Recent approaches instead learn notions of physical interaction, namely motion prediction, but require supervision in the form of labeled object information or come at the cost of high sample complexity, and do not directly reason over stability or object placement. We present 6-DoFusion, a generative model capable of generating 3D poses of an object that produces a stable configuration of a given scene. Underlying 6-DoFusion is a diffusion model that incrementally refines a randomly initialized SE(3) pose to generate a sample from a learned, context-dependent distribution over stable poses. We evaluate our model on different object placement and stacking tasks, demonstrating its ability to construct stable scenes that involve novel object classes as well as to improve the accuracy of state-of-the-art 3D pose estimation methods.

6-dof stability field, artificial intelligence, diffusion model

arXiv.org Artificial Intelligence

2310.17649

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Robots (0.73)
Information Technology > Artificial Intelligence > Machine Learning (0.60)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.53)

Add feedback

Natural and Adversarial Error Detection using Invariance to Image Transformations

Bahat, Yuval, Irani, Michal, Shakhnarovich, Gregory

arXiv.org Machine LearningFeb-1-2019

We propose an approach to distinguish between correct and incorrect image classifications. Our approach can detect misclassifications which either occurunintentionally ("natural errors"), or due to intentional adversarial attacks ("adversarial errors"),both in a single unified framework. Our approach is based on the observation that correctly classified images tend to exhibit robust and consistent classifications under certain image transformations (e.g., horizontal flip, small image translation, etc.). In contrast, incorrectly classified images (whether due to adversarial errors or natural errors)tend to exhibit large variations in classification resultsunder such transformations. Our approach does not require any modifications or retraining of the classifier, hence can be applied to any pre-trained classifier. We further use state of the art targeted adversarial attacks to demonstrate that even when the adversary has full knowledge of our method, the adversarial distortion needed for bypassing our detector is no longer imperceptible tothe human eye. Our approach obtains state-of-the-art results compared to previous adversarial detectionmethods, surpassing them by a large margin.

classifier, deep learning, neural network, (19 more...)

arXiv.org Machine Learning

1902.00236

Country: North America > United States (0.68)

Genre: Research Report (0.50)

Industry:

Government > Military (1.00)
Information Technology > Security & Privacy (0.87)

Technology:

Information Technology > Sensing and Signal Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Sparse Coding for Learning Interpretable Spatio-Temporal Primitives

Kim, Taehwan, Shakhnarovich, Gregory, Urtasun, Raquel

Neural Information Processing SystemsDec-31-2010

Sparse coding has recently become a popular approach in computer vision to learn dictionaries of natural images. In this paper we extend sparse coding to learn interpretable spatio-temporal primitives of human motion. We cast the problem of learning spatio-temporal primitives as a tensor factorization problem and introduce constraints to learn interpretable primitives. In particular, we use group norms over those tensors, diagonal constraints on the activations as well as smoothness constraints that are inherent to human motion. We demonstrate the effectiveness of our approach to learn interpretable representations of human motion from motion capture data, and show that our approach outperforms recently developed matching pursuit and sparse coding algorithms.

artificial intelligence, machine learning, sparse, (18 more...)

Neural Information Processing Systems

Country: North America > United States > Massachusetts (0.14)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.34)

Add feedback

Nonlinear physically-based models for decoding motor-cortical population activity

Shakhnarovich, Gregory, Kim, Sung-phil, Black, Michael J.

Neural Information Processing SystemsDec-31-2007

Neural motor prostheses (NMPs) require the accurate decoding of motor cortical population activity for the control of an artificial motor system. Previous work on cortical decoding for NMPs has focused on the recovery of hand kinematics. Human NMPs however may require the control of computer cursors or robotic devices with very different physical and dynamical properties. Here we show that the firing rates of cells in the primary motor cortex of nonhuman primates can be used to control the parameters of an artificial physical system exhibiting realistic dynamics. The model represents 2D hand motion in terms of a point mass connected to a system of idealized springs. The nonlinear spring coefficients are estimated from the firing rates of neurons in the motor cortex. We evaluate linear and a nonlinear decoding algorithms using neural recordings from two monkeys performing two different tasks. We found that the decoded spring coefficients produced accurate hand trajectories compared with state-of-the-art methods for direct decoding of hand kinematics. Furthermore, using a physically-based system produced decoded movements that were more "natural" in that their frequency spectrum more closely matched that of natural hand movements.

coefficient, neural network, neurology, (22 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Genre: Research Report (0.66)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.88)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.70)
Information Technology > Artificial Intelligence > Robots (0.69)
Information Technology > Artificial Intelligence > Cognitive Science > Neuroscience (0.69)

Add feedback

Boosted Dyadic Kernel Discriminants

Moghaddam, Baback, Shakhnarovich, Gregory

Neural Information Processing SystemsDec-31-2003

We introduce a novel learning algorithm for binary classification with hyperplane discriminants based on pairs of training points from opposite classes (dyadic hypercuts). This algorithm is further extended to nonlinear discriminants using kernel functions satisfying Mercer's conditions. An ensemble of simple dyadic hypercuts is learned incrementally by means of a confidence-rated version of AdaBoost, which provides a sound strategy for searching through the finite set of hypercut hypotheses. In experiments with real-world datasets from the UCI repository, the generalization performance of the hypercut classifiers was found to be comparable to that of SVMs and k-NN classifiers. Furthermore, the computational cost of classification (at run time) was found to be similar to, or better than, that of SVM. Similarly to SVMs, boosted dyadic kernel discriminants tend to maximize the margin (via AdaBoost). In contrast to SVMs, however, we offer an online and incremental learning machine for building kernel discriminants whose complexity (number of kernel evaluations) can be directly controlled (traded off for accuracy).

artificial intelligence, classifier, machine learning, (17 more...)

Neural Information Processing Systems

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

Boosted Dyadic Kernel Discriminants

Moghaddam, Baback, Shakhnarovich, Gregory

Neural Information Processing SystemsDec-31-2003

We introduce a novel learning algorithm for binary classification with hyperplane discriminants based on pairs of training points from opposite classes (dyadic hypercuts). This algorithm is further extended to nonlinear discriminants using kernel functions satisfying Mercer'sconditions. An ensemble of simple dyadic hypercuts is learned incrementally by means of a confidence-rated version of AdaBoost, whichprovides a sound strategy for searching through the finite set of hypercut hypotheses. In experiments with real-world datasets from the UCI repository, the generalization performance of the hypercut classifiers was found to be comparable to that of SVMs and k-NN classifiers. Furthermore, the computational cost of classification (at run time) was found to be similar to, or better than,that of SVM. Similarly to SVMs, boosted dyadic kernel discriminants tend to maximize the margin (via AdaBoost). In contrast to SVMs, however, we offer an online and incremental learning machine for building kernel discriminants whose complexity (numberof kernel evaluations) can be directly controlled (traded off for accuracy).

artificial intelligence, classifier, machine learning, (17 more...)

Neural Information Processing Systems

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)

Genre: Research Report > New Finding (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback