STREAMER: Streaming Representation Learning and Event Segmentation in a Hierarchical Manner

Mar-27-2025, 11:56:15 GMT – Neural Information Processing Systems

We present a novel self-supervised approach for hierarchical representation learning and segmentation of perceptual inputs in a streaming fashion. Our research addresses how to semantically group streaming inputs into chunks at various levels of a hierarchy while simultaneously learning, for each chunk, robust global representations throughout the domain. To achieve this, we propose STREAMER, an architecture that is trained layer-by-layer, adapting to the complexity of the input domain. In our approach, each layer is trained with two primary objectives: making accurate predictions into the future and providing necessary information to other levels for achieving the same objective. The event hierarchy is constructed by detecting prediction error peaks at different levels, where a detected boundary triggers a bottom-up information flow. At an event boundary, the encoded representation of inputs at one layer becomes the input to a higher-level layer.
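
The boundary-triggered, bottom-up flow described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: nothing is trained, the per-event mean stands in for a learned encoder and predictor, and a fixed squared-error threshold stands in for prediction-error peak detection. The names `ToyLayer`, `run_hierarchy`, and `threshold` are hypothetical and chosen only for this illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional

Vector = List[float]


def mean_vector(vectors: List[Vector]) -> Vector:
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def squared_error(a: Vector, b: Vector) -> float:
    """Sum of squared differences between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


@dataclass
class ToyLayer:
    """Hypothetical stand-in for one layer (assumption, not the paper's model)."""
    threshold: float                                   # assumed fixed error threshold
    buffer: List[Vector] = field(default_factory=list)  # inputs of the current event

    def step(self, x: Vector) -> Optional[Vector]:
        """Consume one input; return the finished event's summary at a boundary."""
        if not self.buffer:
            # First input ever seen: it starts the first event.
            self.buffer.append(x)
            return None
        # Toy predictor: the next input is predicted to equal the event mean.
        prediction = mean_vector(self.buffer)
        if squared_error(prediction, x) <= self.threshold:
            # Prediction was good enough: x belongs to the current event.
            self.buffer.append(x)
            return None
        # Prediction error is large: close the event and start a new one at x.
        summary = prediction
        self.buffer = [x]
        return summary


def run_hierarchy(stream: List[Vector], layers: List[ToyLayer]) -> List[List[int]]:
    """Return, for each layer, the time steps at which it detected a boundary."""
    boundaries: List[List[int]] = [[] for _ in layers]
    for t, x in enumerate(stream):
        signal: Optional[Vector] = x
        for level, layer in enumerate(layers):
            signal = layer.step(signal)
            if signal is None:
                # No boundary at this level, so nothing flows upward.
                break
            boundaries[level].append(t)
    return boundaries


stream = [[0.0], [0.0], [1.0], [1.0], [0.0], [0.0], [10.0], [10.0], [11.0], [11.0]]
layers = [ToyLayer(threshold=0.5), ToyLayer(threshold=25.0)]
boundaries = run_hierarchy(stream, layers)
print(boundaries)  # [[2, 4, 6, 8], [8]]
```

In this toy run the lower layer segments every small change, while the upper layer only receives an input when the lower layer closes an event and only reports a boundary at the large jump. Because a summary is passed upward when an event ends, the upper layer's boundary is reported with a delay relative to the raw stream.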

Country:
- Europe > Switzerland (0.28)
- North America > United States (0.28)

Genre:
- Research Report > New Finding (0.46)

Industry:
- Education (0.67)

Technology:
- Information Technology
  - Artificial Intelligence
    - Cognitive Science (1.00)
    - Machine Learning
      - Neural Networks > Deep Learning (1.00)
      - Statistical Learning (0.67)
    - Natural Language > Large Language Model (0.68)
    - Representation & Reasoning (1.00)
    - Vision (1.00)
  - Sensing and Signal Processing > Image Processing (0.93)