Mixture of Dynamical Variational Autoencoders for Multi-Source Trajectory Modeling and Separation

Lin, Xiaoyu, Girin, Laurent, Alameda-Pineda, Xavier

Dec-7-2023–arXiv.org Artificial Intelligence

In this paper, we propose a latent-variable generative model called mixture of dynamical variational autoencoders (MixDVAE) to model the dynamics of a system composed of multiple moving sources. A DVAE model is pre-trained on a single-source dataset to capture the source dynamics. Then, multiple instances of the pre-trained DVAE model are integrated into a multi-source mixture model with a discrete observation-to-source assignment latent variable. The posterior distributions of both the discrete observation-to-source assignment variable and the continuous DVAE variables representing the sources' content/position are estimated using a variational expectation-maximization algorithm, leading to multi-source trajectory estimation. We illustrate the versatility of the proposed MixDVAE model on two tasks: a computer vision task, namely multi-object tracking, and an audio processing task, namely single-channel audio source separation. Experimental results show that the proposed method works well on these two tasks and outperforms several baseline methods.
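
To make the role of the discrete observation-to-source assignment variable more concrete, here is a minimal sketch of how its posterior could be computed for one time frame. It is not the paper's implementation: the function name `assignment_posterior`, the isotropic Gaussian observation model, the shared `variance` parameter, and the uniform prior over sources are all assumptions made for illustration. In the actual model, the per-source position estimates would come from the DVAE instances rather than being fixed values.

```python
import math

def assignment_posterior(observations, source_means, variance=1.0):
    """Return r[k][n], the posterior probability that observation k came from source n.

    Hypothetical simplification: isotropic Gaussian observation model around
    each source's current position estimate, uniform prior over sources.
    """
    posteriors = []
    for obs in observations:
        # Log-likelihood of this observation under each source, up to a shared constant.
        log_liks = [
            -sum((o - m) ** 2 for o, m in zip(obs, mean)) / (2.0 * variance)
            for mean in source_means
        ]
        # Normalize with the log-sum-exp trick for numerical stability.
        max_ll = max(log_liks)
        weights = [math.exp(ll - max_ll) for ll in log_liks]
        total = sum(weights)
        posteriors.append([w / total for w in weights])
    return posteriors

# Two 2-D observations and two sources with illustrative position estimates.
observations = [(0.1, 0.0), (4.9, 5.2)]
source_means = [(0.0, 0.0), (5.0, 5.0)]
r = assignment_posterior(observations, source_means)
```

In a variational expectation-maximization loop, a step of this kind would typically alternate with an update of the per-source continuous variables, each update using the current estimate of the other.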

Country:
- South America > Chile
  - Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- North America
  - United States
    - New York (0.04)
    - Utah > Salt Lake County
      - Salt Lake City (0.04)
  - Canada > Ontario
    - Toronto (0.14)
- Europe
  - United Kingdom > England
    - West Midlands > Birmingham (0.04)
    - East Sussex > Brighton (0.04)
  - France > Auvergne-Rhône-Alpes
    - Isère > Grenoble (0.04)
  - Czechia > South Moravian Region
    - Brno (0.04)
- Asia > Middle East
  - Jordan (0.04)

Genre:
- Research Report > New Finding (0.66)

Industry:
- Leisure & Entertainment (0.67)
- Media (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Vision (1.00)
  - Representation & Reasoning > Uncertainty (1.00)
  - Machine Learning
    - Neural Networks > Deep Learning (1.00)
    - Learning Graphical Models > Undirected Networks
      - Markov Models (0.46)