AITopics | video sequence

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

baaa7b5b5bbaadca5023e1ab909b8af5-Paper-Conference.pdf

Neural Information Processing Systems | Jun-22-2026, 09:27:55 GMT

The real world is dynamic, yet most image fusion methods process static frames independently, ignoring temporal correlations in videos and leading to flickering and temporal inconsistency. To address this, we propose Unified Video Fusion (UniVF), a novel [...] frame learning [...] coherent video fusion.

artificial intelligence, machine learning, natural language

Country: Asia > China (0.28)

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)

b87bdcf963cad3d0b265fcb78ae7d11e-Paper-Conference.pdf

Neural Information Processing Systems | Apr-30-2026, 01:19:26 GMT

artificial intelligence, machine learning, texture

Country: Asia (0.46)

Genre: Research Report > Experimental Study (0.93)

Industry: Information Technology (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Vision > Face Recognition (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

License of the assets

Neural Information Processing Systems | Apr-24-2026, 12:31:04 GMT

License for the code: We use the code for MS-TCN [13], ASRF [24], and LAS [9], all of which are under the MIT License according to https://opensource.org/licenses/MIT. For the Jigsaws [18] dataset, we follow the data use agreement according to https://cs.jhu.

Action classification: Action classification is the task of identifying a single action, as opposed to a sequence of actions. Several methods use 2D CNNs to extract frame-wise features from an input video, which are then combined to predict a coarse action taking place in the video [56, 39, 59]. There also exist several works that perform action classification from kinematic data [2, 12].

Action segmentation: Action segmentation is the problem of segmenting an input stream of data, labeling each frame according to the action that is being carried out. Earlier methods for action segmentation employed hidden Markov models [33, 22]. More recently, convolutional neural networks [58, 26] and recurrent neural networks [50] have been applied to this problem. Inspired by the success of temporal convolutional networks (TCNs) in speech synthesis, [37] adapted these models to action segmentation. MS-TCN [13], which uses a multi-stage TCN architecture, has become one of the most widely used architectures for action segmentation. Although these methods achieve high frame-wise accuracy, they still produce a significant number of over-segmentation errors. To address this, several boundary-aware methods have been developed that perform temporal smoothing of the frame-wise predictions [57, 24]. These methods use ground-truth boundary information to train a binary classification network to perform boundary detection. The boundary estimates are then used to aggregate the frame-wise predictions either in a soft manner (boundary-aware pooling) or by setting a hard threshold. However, elemental actions, such as the functional primitives in the StrokeRehab dataset, have a very short duration. As a result, the boundaries between actions can be hard to detect or even hard to define (see Figure 4).

Sequence-to-sequence models: Our proposed method is based on sequence-to-sequence (seq2seq) models. These models allow us to learn a mapping from a variable-length input sequence to a variable-length output sequence [53].
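The hard-threshold variant of boundary-aware smoothing described above can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the implementation used in [57, 24]: the function names `to_segments` and `smooth_with_boundaries`, the threshold of 0.5, the majority-vote aggregation, and the toy inputs are all hypothetical choices made for this example.

```python
from collections import Counter
from itertools import groupby


def to_segments(frame_labels):
    """Collapse frame-wise labels into the ordered list of action segments."""
    return [label for label, _ in groupby(frame_labels)]


def smooth_with_boundaries(frame_labels, boundary_probs, threshold=0.5):
    """Majority-vote the labels between consecutive detected boundaries.

    A frame starts a new segment when its boundary probability exceeds
    `threshold` (a hypothetical default chosen for illustration).
    """
    if len(frame_labels) != len(boundary_probs):
        raise ValueError("labels and boundary probabilities must align")
    smoothed, start = [], 0
    for i in range(1, len(frame_labels) + 1):
        at_end = i == len(frame_labels)
        # Close the current segment at the end of the sequence or at a
        # frame whose boundary probability passes the hard threshold.
        if at_end or boundary_probs[i] > threshold:
            segment = frame_labels[start:i]
            majority = Counter(segment).most_common(1)[0][0]
            smoothed.extend([majority] * len(segment))
            start = i
    return smoothed


# Toy example (made-up data): one spurious "B" frame causes over-segmentation.
labels = ["A", "A", "B", "A", "A", "C", "C", "C"]
boundaries = [0.0, 0.1, 0.2, 0.1, 0.0, 0.9, 0.1, 0.0]

print(to_segments(labels))                                      # ['A', 'B', 'A', 'C']
print(to_segments(smooth_with_boundaries(labels, boundaries)))  # ['A', 'C']
```

In this toy case the raw frame-wise predictions yield four segments, while smoothing between detected boundaries recovers two. The sketch also hints at the limitation noted above: when actions last only a few frames, a missed or misplaced boundary merges or splits whole actions.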

artificial intelligence, deep learning, machine learning

Country: North America > United States (0.28)

Genre:

Research Report > New Finding (0.68)
Research Report > Experimental Study (0.46)

Industry:

Government (1.00)
Information Technology (0.93)
Law > Intellectual Property & Technology Law (0.46)
Health & Medicine > Therapeutic Area (0.32)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.88)

MBW: Multi-view Bootstrapping in the Wild

Neural Information Processing Systems | Mar-18-2026, 04:19:05 GMT

Labeling articulated objects in unconstrained settings has a wide variety of applications including entertainment, neuroscience, psychology, ethology, and many fields of medicine. Large offline labeled datasets do not exist for all but the most common articulated object categories (e.g., humans). Hand labeling these landmarks within a video sequence is a laborious task. Learned landmark detectors can help, but can be error-prone when trained from only a few examples. Multi-camera systems that train fine-grained detectors have shown significant promise in detecting such errors, allowing for self-supervised solutions that only need a small percentage of the video sequence to be hand-labeled.

artificial intelligence, machine learning, proceedings

Industry: Health & Medicine (0.59)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.58)

Trading robust representations for sample complexity through self-supervised visual experience

Neural Information Processing Systems | Mar-17-2026, 00:36:30 GMT

Learning in small-sample regimes is among the most remarkable features of the human perceptual system. This ability is related to robustness to transformations, which is acquired through visual experience in the form of weak or self-supervision during development. We explore the idea of allowing artificial systems to learn representations of visual stimuli through weak supervision prior to downstream supervised tasks. We introduce a novel loss function for representation learning using unlabeled image sets and video sequences, and experimentally demonstrate that these representations support one-shot learning and reduce the sample complexity of multiple recognition tasks. We establish the existence of a trade-off between the sizes of weakly supervised data sets, obtained automatically from video sequences, and fully supervised data sets. Our results suggest that equivalence sets other than class labels, which are abundant in unlabeled visual experience, can be used for self-supervised learning of semantically relevant image embeddings.

artificial intelligence, machine learning, proceedings

Genre: Research Report > New Finding (0.61)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.41)
