Bregler, Christoph
Catching Out-of-Context Misinformation with Self-supervised Learning
Aneja, Shivangi, Bregler, Christoph, Nießner, Matthias
Despite the recent attention to DeepFakes and other forms of image manipulation, one of the most prevalent ways to mislead audiences is the use of unaltered images in a new but false context. To address these challenges and support fact-checkers, we propose a new method that automatically detects out-of-context image and text pairs. Our core idea is a self-supervised training strategy where we only need images with matching (and non-matching) captions from different sources. At train time, our method learns to selectively align individual objects in an image with textual claims, without explicit supervision. At test time, we check, for a given pair of captions, whether both correspond to the same object(s) in the image while semantically conveying different descriptions, which allows us to make fairly accurate out-of-context predictions. Our method achieves 82% out-of-context detection accuracy. To facilitate training our method, we created a large-scale dataset of 203,570 images which we match with 456,305 textual captions from a variety of news websites, blogs, and social media posts; i.e., for each image, we obtained several captions.
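The abstract's test-time decision rule can be sketched as follows. This is a minimal illustration, not the paper's implementation: the alignment scores, embedding vectors, and thresholds are all hypothetical stand-ins for what the trained model would produce.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def is_out_of_context(align_1, align_2, cap_emb_1, cap_emb_2,
                      align_thresh=0.5, sim_thresh=0.5):
    """Flag a caption pair as out-of-context when both captions ground in
    the same image object (both alignment scores high) yet their text
    embeddings are semantically dissimilar. Threshold values here are
    illustrative, not taken from the paper."""
    same_object = align_1 > align_thresh and align_2 > align_thresh
    different_meaning = cosine(cap_emb_1, cap_emb_2) < sim_thresh
    return same_object and different_meaning
```

Captions that ground in different objects yield no verdict, mirroring the paper's requirement that both texts refer to the same object before a conflict is declared.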
Joint Training of a Convolutional Network and a Graphical Model for Human Pose Estimation
Tompson, Jonathan J., Jain, Arjun, LeCun, Yann, Bregler, Christoph
This paper proposes a new hybrid architecture that consists of a deep Convolutional Network and a Markov Random Field. We show how this architecture is successfully applied to the challenging problem of articulated human pose estimation in monocular images. The architecture can exploit structural domain constraints such as geometric relationships between body joint locations. We show that joint training of these two model paradigms improves performance and allows us to significantly outperform existing state-of-the-art techniques. Papers published at the Neural Information Processing Systems Conference.
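The coupling between the ConvNet and the graphical model can be illustrated with a toy 1D analogue: a joint's unary heatmap is gated by a neighbouring joint's heatmap convolved with a learned displacement prior. This is a hypothetical sketch of that idea, not the paper's (2D, log-domain, jointly trained) spatial model.

```python
import numpy as np

def conv_same_1d(signal, kernel):
    """'Same'-size 1D convolution with zero padding."""
    n, k = len(signal), len(kernel)
    pad = k // 2
    padded = np.concatenate([np.zeros(pad), signal, np.zeros(pad)])
    return np.array([np.dot(padded[i:i + k], kernel[::-1]) for i in range(n)])

def refine(unary_i, unary_j, prior_ji):
    """Belief for joint i = its ConvNet heatmap, gated by the message from
    neighbour joint j (j's heatmap convolved with a displacement prior),
    then renormalised. A 1D stand-in for the paper's MRF-style refinement."""
    msg = conv_same_1d(unary_j, prior_ji)
    belief = unary_i * msg
    return belief / belief.sum()
```

With two equally strong peaks in the unary heatmap, the neighbour's message disambiguates which peak is the true joint location, which is exactly the kind of structural constraint the hybrid architecture exploits.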
Pose-Sensitive Embedding by Nonlinear NCA Regression
Taylor, Graham W., Fergus, Rob, Williams, George, Spiro, Ian, Bregler, Christoph
This paper tackles the complex problem of visually matching people in similar pose but with different clothes, background, and other appearance changes. We achieve this with a novel method for learning a nonlinear embedding based on several extensions to the Neighborhood Component Analysis (NCA) framework. Our method is convolutional, enabling it to scale to realistically-sized images. By cheaply labeling the head and hands in large video databases through Amazon Mechanical Turk (a crowd-sourcing service), we can use the task of localizing the head and hands as a proxy for determining body pose. We apply our method to challenging real-world data and show that it can generalize beyond hand localization to infer a more general notion of body pose. We evaluate our method quantitatively against other embedding methods. We also demonstrate that real-world performance can be improved through the use of synthetic data.
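The NCA objective underlying the learned embedding can be written compactly: each point should, under a softmax over negative squared distances, select a neighbour with the same label (here, similar pose). A minimal sketch on already-embedded points, omitting the paper's convolutional encoder and its extensions:

```python
import numpy as np

def nca_loss(embeddings, labels):
    """Negative mean probability that each point picks a same-label
    neighbour under a softmax over negative squared distances -- the
    quantity NCA maximises, returned as a loss to minimise."""
    n = len(embeddings)
    total = 0.0
    for i in range(n):
        d2 = np.sum((embeddings - embeddings[i]) ** 2, axis=1)
        w = np.exp(-d2)
        w[i] = 0.0                      # a point never selects itself
        p = w / w.sum()
        total += p[labels == labels[i]].sum()
    return -total / n
```

An embedding that clusters same-pose examples drives this loss toward -1; in the paper the gradient of this objective is backpropagated through the convolutional network.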
Learning Motion Style Synthesis from Perceptual Observations
Torresani, Lorenzo, Hackney, Peggy, Bregler, Christoph
This paper presents an algorithm for synthesis of human motion in specified styles. We use a theory of movement observation (Laban Movement Analysis) to describe movement styles as points in a multidimensional perceptual space. We cast the task of learning to synthesize desired movement styles as a regression problem: sequences generated via space-time interpolation of motion capture data are used to learn a nonlinear mapping between animation parameters and movement styles in perceptual space. We demonstrate that the learned model can apply a variety of motion styles to prerecorded motion sequences and it can extrapolate styles not originally included in the training data.
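The regression formulation can be sketched as fitting a map from perceptual style coordinates (e.g. Laban axes) to animation/interpolation parameters. The polynomial basis and least-squares fit below are hypothetical simplifications of the paper's nonlinear mapping, chosen only to show the regression-plus-extrapolation idea.

```python
import numpy as np

def fit_style_map(styles, params, degree=2):
    """Fit a polynomial least-squares map from perceptual style
    coordinates to animation parameters (a stand-in for the paper's
    nonlinear regression)."""
    X = np.hstack([styles ** d for d in range(degree + 1)])
    W, *_ = np.linalg.lstsq(X, params, rcond=None)
    return W

def synthesize(W, style, degree=2):
    """Predict animation parameters for a (possibly unseen) style point."""
    x = np.hstack([style ** d for d in range(degree + 1)])
    return x @ W
```

Querying the fitted map at a style point outside the training range mirrors the abstract's claim that the model can extrapolate to styles not in the training data.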
Learning Non-Rigid 3D Shape from 2D Motion
Torresani, Lorenzo, Hertzmann, Aaron, Bregler, Christoph
This paper presents an algorithm for learning the time-varying shape of a nonrigid 3D object from uncalibrated 2D tracking data. We model shape motion as a rigid component (rotation and translation) combined with a nonrigid deformation. Reconstruction is ill-posed if arbitrary deformations are allowed. We constrain the problem by assuming that the object shape at each time instant is drawn from a Gaussian distribution. Based on this assumption, the algorithm simultaneously estimates 3D shape and motion for each time frame, learns the parameters of the Gaussian, and robustly fills-in missing data points. We then extend the algorithm to model temporal smoothness in object shape, thus allowing it to handle severe cases of missing data.
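The generative model assumed by the abstract decomposes each frame's shape into a mean shape plus a deformation that is linear in Gaussian latent coordinates, followed by a rigid transform and projection. A minimal sketch of the forward model only (the paper's contribution is the inverse problem, estimating all of these from 2D tracks):

```python
import numpy as np

def nonrigid_shape(mean_shape, basis, z):
    """Shape at one time instant: mean shape (P x 3) plus a deformation
    linear in the Gaussian latent coordinates z (K,), with basis (K x P x 3)."""
    return mean_shape + np.tensordot(z, basis, axes=1)

def project(shape3d, R, t):
    """Orthographic projection after rigid motion: R is a 2x3 truncated
    rotation, t a 2D translation."""
    return shape3d @ R.T + t
```

Because the latents z are modelled as Gaussian, missing 2D points can be filled in by inference under this prior, which is what makes reconstruction well-posed despite arbitrary per-frame deformation being ill-posed.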
Learning Appearance Based Models: Mixtures of Second Moment Experts
Bregler, Christoph, Malik, Jitendra
This paper describes a new technique for object recognition based on learning appearance models. The image is decomposed into local regions which are described by a new texture representation called "Generalized Second Moments" that are derived from the output of multiscale, multiorientation filter banks. Class-characteristic local texture features and their global composition are learned by a hierarchical mixture-of-experts architecture (Jordan & Jacobs). The technique is applied to a vehicle database consisting of 5 general car categories (Sedan, Van with backdoors, Van without backdoors, old Sedan, and Volkswagen Bug). This is a difficult problem with considerable in-class variation. The new technique has a 6.5% misclassification rate, compared to eigen-images, which give a 17.4% misclassification rate, and nearest neighbors, which give 15.7%.
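The mixture-of-experts combination rule referenced in the abstract (Jordan & Jacobs) can be sketched in a few lines: a softmax gating network weights the predictions of several linear experts. All weights below are hypothetical; the paper's architecture is hierarchical and trained end-to-end.

```python
import numpy as np

def softmax(v):
    """Numerically stable softmax."""
    e = np.exp(v - v.max())
    return e / e.sum()

def mixture_of_experts(x, gate_w, expert_ws):
    """One level of a Jordan & Jacobs mixture: a softmax gate over the
    input x weights each expert's linear prediction of x."""
    gates = softmax(gate_w @ x)                      # (K,) mixing weights
    preds = np.array([w @ x for w in expert_ws])     # (K, out_dim)
    return gates @ preds
```

In the paper's setting, x would be the Generalized Second Moment features of a local region, and the gate learns which expert specialises in which class-characteristic texture.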