Visual explanation for video recognition – twentybn – Medium

Aug-24-2017, 17:25:29 GMT – #artificialintelligence

This post describes how to obtain temporally sensitive saliency maps for deep neural networks designed for video recognition. Previous work [2, 3, 4] shows that saliency maps help visualize why a model produced a given prediction, can uncover artifacts in the data, and can point towards better model architectures. Task: Recognizing human actions in videos from our recently released dataset requires a fine-grained understanding of concepts such as three-dimensional geometry, material properties, object permanence, affordance and gravity [1]. The dataset, dubbed "Something-Something", consists of 100,000 videos across 174 categories covering concepts such as dropping, picking and pushing. Grad-CAM (Gradient-weighted Class Activation Mapping), proposed in [4], allows us to obtain a localization map for any target class. Please refer to [4] for more details.
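
To make the Grad-CAM step concrete, here is a minimal NumPy sketch of how a localization map could be computed for a video model. It follows the general Grad-CAM recipe from [4] (average the class-score gradients per channel, use them to weight the feature maps, then apply a ReLU). The function name grad_cam_3d, the (channels, time, height, width) layout, the averaging over the time axis and the final rescaling to [0, 1] are assumptions made for illustration, not details taken from the post. The random arrays stand in for the feature maps and gradients that a real network would provide.

```python
import numpy as np


def grad_cam_3d(activations, gradients):
    """Sketch of Grad-CAM for one video clip (illustrative, not the authors' code).

    activations: feature maps of the chosen conv layer, shape (K, T, H, W)
    gradients:   d(class score) / d(activations), same shape
    Returns a saliency volume of shape (T, H, W) scaled to [0, 1].
    """
    if activations.shape != gradients.shape:
        raise ValueError("activations and gradients must have the same shape")

    # Channel importance weights: global-average-pool the gradients
    # over time and space (assumed extension of the 2D recipe).
    weights = gradients.mean(axis=(1, 2, 3))  # shape (K,)

    # Weighted sum of the feature maps over the channel axis.
    cam = np.tensordot(weights, activations, axes=([0], [0]))  # (T, H, W)

    # Keep only features with a positive influence on the class score.
    cam = np.maximum(cam, 0.0)

    # Rescale for display; skip if the map is all zeros.
    peak = cam.max()
    if peak > 0:
        cam = cam / peak
    return cam


# Placeholder inputs: 8 channels, 4 time steps, 7x7 spatial grid.
rng = np.random.default_rng(0)
acts = rng.standard_normal((8, 4, 7, 7))
grads = rng.standard_normal((8, 4, 7, 7))
cam = grad_cam_3d(acts, grads)
print(cam.shape)
```

In practice the resulting (T, H, W) volume would be upsampled to the clip resolution and overlaid on the frames, which yields one saliency map per time step.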

