Facebook AI Introduces TimeSformer: A New Video Architecture Based Purely On Transformers
Facebook AI has built a new architecture for video understanding called TimeSformer, which is based purely on Transformers. Transformers have become the dominant approach for many natural language processing (NLP) applications, such as machine translation and general language understanding. TimeSformer achieves the best reported numbers on multiple challenging action recognition benchmarks, including the Kinetics-400 action recognition data set. Compared with modern 3D convolutional neural networks, it is nearly three times faster to train and requires less than one-tenth the compute for inference.
Mar-17-2021, 12:53:00 GMT
- Technology:
- Information Technology > Artificial Intelligence
- Machine Learning > Neural Networks (0.57)
- Natural Language (1.00)
- Vision > Video Understanding (0.39)