trajectory matrix
Grouping-Based Low-Rank Trajectory Completion and 3D Reconstruction
Katerina Fragkiadaki, Marta Salas, Pablo Arbelaez, Jitendra Malik
Extracting 3D shape of deforming objects in monocular videos, a task known as non-rigid structure-from-motion (NRSfM), has so far been studied only on synthetic datasets and controlled environments. Typically, the objects to reconstruct are pre-segmented, they exhibit limited rotations and occlusions, or full-length trajectories are assumed. In order to integrate NRSfM into current video analysis pipelines, one needs to consider as input realistic (thus incomplete) tracking, and perform spatio-temporal grouping to segment the objects from their surroundings. Furthermore, NRSfM needs to be robust to noise in both segmentation and tracking, e.g., drifting, segmentation "leaking", optical flow "bleeding", etc. In this paper, we make a first attempt towards this goal, and propose a method that combines dense optical flow tracking, motion trajectory clustering and NRSfM for 3D reconstruction of objects in videos. For each trajectory cluster, we compute multiple reconstructions by minimizing the reprojection error and the rank of the 3D shape under different rank bounds of the trajectory matrix. We show that dense 3D shape is extracted and trajectories are completed across occlusions and low-textured regions, even under mild relative motion between the object and the camera. We achieve competitive results on a public NRSfM benchmark while using fixed parameters across all sequences and handling incomplete trajectories, in contrast to existing approaches.
Representing Neural Network Layers as Linear Operations via Koopman Operator Theory
Aswani, Nishant Suresh, Jabari, Saif Eddin, Shafique, Muhammad
The strong performance of simple neural networks is often attributed to their nonlinear activations. However, a linear view of neural networks makes understanding and controlling networks much more approachable. We draw from a dynamical systems view of neural networks, offering a fresh perspective by using Koopman operator theory and its connections with dynamic mode decomposition (DMD). Together, they offer a framework for linearizing dynamical systems by embedding the system into an appropriate observable space. By reframing a neural network as a dynamical system, we demonstrate that we can replace the nonlinear layer in a pretrained multi-layer perceptron (MLP) with a finite-dimensional linear operator. In addition, we analyze the eigenvalues of DMD and the right singular vectors of SVD to present evidence that time-delayed coordinates provide a straightforward and highly effective observable space for Koopman theory to linearize a network layer. Consequently, we replace layers of an MLP trained on the Yin-Yang dataset with predictions from a DMD model, achieving a model accuracy of up to 97.3%, compared to the original 98.4%. In addition, we replace layers in an MLP trained on the MNIST dataset, achieving up to 95.8%, compared to the original 97.2% on the test set.
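The combination of time-delayed coordinates and DMD mentioned in this abstract can be illustrated on a toy signal. The following is only a minimal sketch under stated assumptions, not the authors' procedure for replacing MLP layers: the helper names `delay_embed` and `dmd_operator`, the cosine test signal, and the choice of two delays are all hypothetical. It fits a least-squares linear operator to consecutive delay-embedded snapshots using a rank-truncated SVD.

```python
import numpy as np

def delay_embed(signal, delays):
    """Stack `delays` lagged copies of a 1D signal; each column is one snapshot."""
    signal = np.asarray(signal, dtype=float)
    n = signal.size - delays + 1
    return np.stack([signal[i:i + n] for i in range(delays)])  # shape (delays, n)

def dmd_operator(snapshots, rank):
    """Least-squares operator A with Y ~ A X, using a rank-truncated SVD of X."""
    x, y = snapshots[:, :-1], snapshots[:, 1:]
    u, s, vh = np.linalg.svd(x, full_matrices=False)
    u, s, vh = u[:, :rank], s[:rank], vh[:rank]
    # A = Y V S^{-1} U^T, i.e. Y times the truncated pseudoinverse of X.
    return y @ vh.T @ np.diag(1.0 / s) @ u.T

# Hypothetical toy signal: a cosine, which obeys a two-step linear recurrence.
t = np.arange(50)
signal = np.cos(0.3 * t)

snapshots = delay_embed(signal, delays=2)   # observables: (x_t, x_{t+1})
A = dmd_operator(snapshots, rank=2)
eigenvalues = np.linalg.eigvals(A)          # expected to lie on the unit circle
next_state = A @ snapshots[:, -1]           # one-step linear prediction
```

In this toy case the scalar signal is not a linear function of its current value alone, but it becomes exactly linear in the two-dimensional delay space, which is the intuition behind using delays as an observable space.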
Grouping-Based Low-Rank Trajectory Completion and 3D Reconstruction Marta Salas EECS, University of California, Universidad de Zaragoza, Berkeley, CA94720
Multi-agent statistical discriminative sub-trajectory mining and an application to NBA basketball
Bunker, Rory, Duy, Vo Nguyen Le, Tabei, Yasuo, Takeuchi, Ichiro, Fujii, Keisuke
Improvements in tracking technology through optical and computer vision systems have enabled a greater understanding of the movement-based behaviour of multiple agents, including in team sports. In this study, a Multi-Agent Statistically Discriminative Sub-Trajectory Mining (MA-Stat-DSM) method is proposed that takes a set of binary-labelled agent trajectory matrices as input and incorporates Hausdorff distance to identify sub-matrices that statistically significantly discriminate between the two groups of labelled trajectory matrices. Utilizing 2015/16 SportVU NBA tracking data, agent trajectory matrices representing attacks consisting of the trajectories of five agents (the ball, shooter, last passer, shooter defender, and last passer defender), were truncated to correspond to the time interval following the receipt of the ball by the last passer, and labelled as effective or ineffective based on a definition of attack effectiveness that we devise in the current study. After identifying appropriate parameters for MA-Stat-DSM by iteratively applying it to all matches involving the two top- and two bottom-placed teams from the 2015/16 NBA season, the method was then applied to selected matches and could identify and visualize the portions of plays, e.g., involving passing, on-, and/or off-the-ball movements, which were most relevant in rendering attacks effective or ineffective.
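For readers unfamiliar with the Hausdorff distance named in this abstract, a minimal sketch follows. It only shows the standard symmetric Hausdorff distance between two point sequences; how MA-Stat-DSM incorporates it is not specified here, and the function names and the sample coordinates are hypothetical.

```python
import math

def directed_hausdorff(a, b):
    """Largest distance from any point of `a` to its nearest point in `b`."""
    return max(min(math.dist(p, q) for q in b) for p in a)

def hausdorff(a, b):
    """Symmetric Hausdorff distance between two point sequences."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))

# Two hypothetical 2D sub-trajectories (made-up coordinates).
traj_a = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
traj_b = [(0.0, 1.0), (1.0, 1.0), (2.0, 3.0)]

d = hausdorff(traj_a, traj_b)  # dominated by the point (2, 3), which is 3 away from traj_a
```

The distance is driven by the single worst-matched point, so one outlying position in either trajectory determines the result.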
Multivariate Functional Singular Spectrum Analysis Over Different Dimensional Domains
Trinka, Jordan, Haghbin, Hossein, Maadooliat, Mehdi
A common problem in time series analysis is detection, extraction, and exploration of mean, seasonal, trend, and noise components in time series data. A technique known as singular spectrum analysis (SSA) has been developed as a nonparametric, exploratory method which can be used to identify such interesting components in ordinary time series where observations are scalars (Golyandina et al., 2001). Often, many variables are observed as a result of a single stochastic process, and investigation of time series components can be made richer by performing a multivariate analysis of these vector observations. The MSSA algorithm is a technique that has seen success over its univariate SSA counterpart in decomposing a multidimensional time series into components if the covariates are moderately correlated (Golyandina and Stepanov, 2012). MSSA has also been divided into two approaches, vertical MSSA (VMSSA) and horizontal MSSA (HMSSA), where VMSSA involves the vertical stacking of univariate Hankel trajectory matrices while HMSSA works with the horizontal stacking of the same elements (Hassani and Mahmoudvand, 2018). Over the course of the last 15 years, MSSA has seen significant success in various areas of application; see Groth and Ghil (2011); Golyandina and Stepanov (2012); Silva et al. (2018); Hassani et al. (2019). Functional data analysis embodies the evaluation and exploration of data that is comprised of functions such as curves or surfaces (Ramsay and Silverman, 2005). Functional PCA (FPCA) is a technique that is used to find the most informative directions in a time-independent collection of functional subjects (Ramsay and Silverman, 2005). Univariate Functional Singular Spectrum Analysis (FSSA) was developed by Haghbin et al. (2019) as a novel technique that is used to decompose a time-dependent collection of functional
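The vertical and horizontal stacking of univariate Hankel trajectory matrices described in this abstract can be sketched as follows. This is an illustrative sketch only: the function name `hankel_trajectory_matrix`, the window length, and the two short series are assumptions, and the decomposition steps of SSA that follow the embedding are not shown.

```python
import numpy as np

def hankel_trajectory_matrix(series, window):
    """Return the L x K Hankel trajectory matrix, with L = window and K = N - L + 1."""
    series = np.asarray(series, dtype=float)
    k = series.size - window + 1
    if window < 1 or k < 1:
        raise ValueError("window must satisfy 1 <= window <= len(series)")
    # Column j holds the lagged window series[j : j + window].
    return np.column_stack([series[j:j + window] for j in range(k)])

# Two hypothetical univariate series observed over the same time points.
x1 = [1, 2, 3, 4, 5]
x2 = [10, 20, 30, 40, 50]

h1 = hankel_trajectory_matrix(x1, window=3)
h2 = hankel_trajectory_matrix(x2, window=3)

vertical = np.vstack([h1, h2])    # VMSSA-style stacking, shape (6, 3)
horizontal = np.hstack([h1, h2])  # HMSSA-style stacking, shape (3, 6)
```

Each trajectory matrix has constant anti-diagonals, since entry (i, j) equals the series value at position i + j.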
Grouping-Based Low-Rank Trajectory Completion and 3D Reconstruction
Fragkiadaki, Katerina, Salas, Marta, Arbelaez, Pablo, Malik, Jitendra
Extracting 3D shape of deforming objects in monocular videos, a task known as non-rigid structure-from-motion (NRSfM), has so far been studied only on synthetic datasets and controlled environments. Typically, the objects to reconstruct are pre-segmented, they exhibit limited rotations and occlusions, or full-length trajectories are assumed. In order to integrate NRSfM into current video analysis pipelines, one needs to consider as input realistic (thus incomplete) tracking, and perform spatio-temporal grouping to segment the objects from their surroundings. Furthermore, NRSfM needs to be robust to noise in both segmentation and tracking, e.g., drifting, segmentation "leaking", optical flow "bleeding", etc. In this paper, we make a first attempt towards this goal, and propose a method that combines dense optical flow tracking, motion trajectory clustering and NRSfM for 3D reconstruction of objects in videos. For each trajectory cluster, we compute multiple reconstructions by minimizing the reprojection error and the rank of the 3D shape under different rank bounds of the trajectory matrix. We show that dense 3D shape is extracted and trajectories are completed across occlusions and low-textured regions, even under mild relative motion between the object and the camera. We achieve competitive results on a public NRSfM benchmark while using fixed parameters across all sequences and handling incomplete trajectories, in contrast to existing approaches. We further test our approach on popular video segmentation datasets. To the best of our knowledge, our method is the first to extract dense object models from realistic videos, such as those found on YouTube or in Hollywood movies, without object-specific priors.
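The idea of completing an incomplete trajectory matrix under a rank bound can be illustrated with a generic matrix-completion sketch. This is not the method of the paper, which also minimizes reprojection error and the rank of the 3D shape. It swaps in a simple iterative scheme that alternates a truncated SVD with re-imposing the observed entries. The function name `complete_low_rank`, the rank-1 toy matrix, and the mask of missing entries are all assumptions made for illustration.

```python
import numpy as np

def complete_low_rank(matrix, observed, rank, iters=200):
    """Fill unobserved entries by alternating rank truncation and data re-imposition."""
    filled = np.where(observed, matrix, 0.0)  # start with zeros in the gaps
    for _ in range(iters):
        u, s, vh = np.linalg.svd(filled, full_matrices=False)
        low_rank = (u[:, :rank] * s[:rank]) @ vh[:rank]
        # Keep observed entries fixed; take the low-rank estimate elsewhere.
        filled = np.where(observed, matrix, low_rank)
    return filled

# Hypothetical rank-1 "trajectory matrix" with three entries lost to occlusion.
truth = np.outer([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], [1.0, 2.0, 3.0, 4.0, 5.0])
observed = np.ones(truth.shape, dtype=bool)
for row, col in [(0, 1), (2, 3), (4, 0)]:
    observed[row, col] = False

measured = np.where(observed, truth, np.nan)  # NaN marks missing tracking data
completed = complete_low_rank(measured, observed, rank=1)
```

The rank passed to the function plays the role of the rank bound: running the same data under several bounds yields several candidate completions, in the spirit of the multiple reconstructions per cluster mentioned in the abstract.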