


Labelling unlabelled videos from scratch with multi-modal self-supervision

Neural Information Processing Systems

A large part of the current success of deep learning lies in the effectiveness of data -- more precisely: of labelled data. Yet, labelling a dataset with human annotation continues to carry high costs, especially for videos. While in the image domain recent methods have made it possible to generate meaningful (pseudo-) labels for unlabelled datasets without supervision, this development is missing for the video domain, where learning feature representations is the current focus. In this work, we a) show that unsupervised labelling of a video dataset does not come for free from strong feature encoders and b) propose a novel clustering method that allows pseudo-labelling of a video dataset without any human annotations, by leveraging the natural correspondence between the audio and visual modalities. An extensive analysis shows that the resulting clusters have a high semantic overlap with ground-truth human labels. We further introduce the first benchmarking results on unsupervised labelling of common video datasets.
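For intuition only, the sketch below shows one minimal way audio-visual correspondence could be exploited for pseudo-labelling: cluster the concatenation of normalised visual and audio embeddings and measure the semantic overlap of the resulting assignments with ground-truth labels via NMI. The encoders, feature dimensions, and the use of k-means are illustrative assumptions and not the paper's actual method.

```python
# Minimal, hypothetical sketch of clustering-based pseudo-labelling from
# paired audio and visual embeddings. This is NOT the paper's algorithm;
# the random stand-in features, dimensions, and k-means are assumptions
# made purely for illustration.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import normalized_mutual_info_score

rng = np.random.default_rng(0)
n_videos, d_vis, d_aud, n_clusters = 1000, 512, 128, 10

# Stand-ins for per-video features from pretrained visual and audio encoders.
visual_feats = rng.normal(size=(n_videos, d_vis))
audio_feats = rng.normal(size=(n_videos, d_aud))

def l2_normalise(x):
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + 1e-8)

# Exploit audio-visual correspondence in the simplest possible way:
# cluster the concatenation of the two (L2-normalised) modalities.
joint = np.concatenate([l2_normalise(visual_feats), l2_normalise(audio_feats)], axis=1)
pseudo_labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(joint)

# When ground-truth labels exist, NMI quantifies how well clusters match them.
ground_truth = rng.integers(0, n_clusters, size=n_videos)
print("NMI vs. ground truth:", normalized_mutual_info_score(ground_truth, pseudo_labels))
```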


Review for NeurIPS paper: Labelling unlabelled videos from scratch with multi-modal self-supervision

Neural Information Processing Systems

Weaknesses: Required clarifications: some parts of the work require clarification, see below:
* The description of the exact algorithm is not completely clear to me from the paper (and the appendix). I understand that code is provided, but the procedure should be clarified in the paper itself. In particular, is it a purely alternating approach? How many examples are sampled for the clustering stage (is N equal to the number of examples in the dataset)? If I understand correctly, thanks to the probabilistic formulation, once the data is re-clustered there is no need to re-initialise the last linear layer; is that correct?
- If not, it is unclear to me how to apply the algorithm in an online fashion (see later for a related question).
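The questions above concern the generic alternate clustering/training pattern (in the style of DeepCluster-like methods). The sketch below is a hypothetical illustration of that pattern under stated assumptions, not the paper's actual algorithm: it clusters features over the full dataset, optionally re-initialises the linear head after each re-clustering (left as an explicit flag, since whether this is needed depends on the formulation), and then trains on the pseudo-labels.

```python
# Hypothetical sketch of an alternating "cluster, then train on pseudo-labels"
# loop. The encoder, data, cluster count, and optimiser settings are toy
# assumptions for illustration only.
import torch
import torch.nn as nn
from sklearn.cluster import KMeans

def alternate_train(encoder, dataset, n_clusters=10, rounds=3, reinit_head=True):
    feat_dim = encoder(dataset[:1]).shape[1]
    head = nn.Linear(feat_dim, n_clusters)
    for _ in range(rounds):
        # 1) Clustering stage: assign pseudo-labels (here, over the full dataset).
        with torch.no_grad():
            feats = encoder(dataset)
        pseudo = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(feats.numpy())
        targets = torch.as_tensor(pseudo, dtype=torch.long)

        # 2) Optionally re-initialise the classification head, since cluster
        #    indices from a fresh k-means run carry no fixed meaning.
        if reinit_head:
            head = nn.Linear(feat_dim, n_clusters)

        # 3) Training stage: fit encoder + head on the current pseudo-labels.
        opt = torch.optim.SGD(list(encoder.parameters()) + list(head.parameters()), lr=1e-2)
        for _ in range(5):
            logits = head(encoder(dataset))
            loss = nn.functional.cross_entropy(logits, targets)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return encoder, head

# Toy usage with random data and a small MLP encoder.
data = torch.randn(256, 64)
enc = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 32))
alternate_train(enc, data)
```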

