[Excerpt from f66340d6f28dae6aab0176892c9065e7-Supplemental-Conference.pdf]
Once closed-form expressions for these Jacobians are derived, it remains to substitute them into (16); the following identity, often termed the "vec" rule, is used for this purpose. To depict the spatial topographies of the latent components in the EEG and fMRI analyses, the "forward model" is computed. The results of the comparison are shown in Fig. S1, where the signal fidelity of the Granger components (right panel) significantly exceeds that of PCA (left) and ICA (middle); note that GCA is only able to recover sources with temporal dependencies. Both the single electrodes and the Granger components exhibit two pronounced peaks in their spectra, one near 2 Hz ("delta" band). Fig. S3 shows the corresponding result for the left motor imagery condition of the EEG motor imagery dataset described in the main text; for each technique, the first 6 components are presented.
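The displayed identity itself did not survive extraction; the "vec" rule in question is presumably the standard Kronecker-product identity, reproduced here as a reminder:

```latex
% Standard "vec" identity for conformable matrices A, B, C:
% stacking columns turns a matrix product into a Kronecker product.
\operatorname{vec}(ABC) = \left(C^{\top} \otimes A\right)\operatorname{vec}(B)
```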
- South America > Chile > Arica y Parinacota Region > Arica Province > Arica (0.05)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
A Table of Notations
[Excerpt, partially garbled in extraction] From Eq. (13), we have a bound on Var(…); putting it together, we obtain the bound on R(S…). Use the grouping of rows described in Step 2 to construct the block Householder quantizer. For the ResNet18/50 models, we adopt a slightly modified version, ResNetv1.5 […]. We train for 200 epochs. Due to limited device memory, we set the batch size to 50 per GPU with 8 GPUs in total; the initial learning rate is 0.4. For both datasets, we use a cosine learning rate schedule, following [45].
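As a concrete illustration of the training recipe above, a minimal PyTorch sketch might look as follows. This is a reconstruction under stated assumptions, not the authors' code: a stock torchvision ResNet-50 stands in for ResNetv1.5, the momentum value is assumed, and `train_one_epoch` is a hypothetical placeholder.

```python
import torch
from torchvision.models import resnet50  # stand-in for the authors' ResNetv1.5

model = resnet50().cuda()
# lr=0.4 as stated in the excerpt; momentum=0.9 is an assumption.
optimizer = torch.optim.SGD(model.parameters(), lr=0.4, momentum=0.9)
# Cosine learning-rate schedule over the full 200-epoch budget.
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=200)

for epoch in range(200):
    # train_one_epoch(model, optimizer, loader)  # hypothetical training loop
    scheduler.step()
```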
- North America > United States > California > Alameda County > Berkeley (0.04)
- North America > Canada (0.04)
- Asia > Middle East > Jordan (0.04)
Efficient Learning of Stationary Diffusions with Stein-type Discrepancies
Bleile, Fabian, Lumpp, Sarah, Drton, Mathias
Learning a stationary diffusion amounts to estimating the parameters of a stochastic differential equation whose stationary distribution matches a target distribution. We build on the recently introduced kernel deviation from stationarity (KDS), which enforces stationarity by evaluating expectations of the diffusion's generator in a reproducing kernel Hilbert space. Leveraging the connection between KDS and Stein discrepancies, we introduce the Stein-type KDS (SKDS) as an alternative formulation. We prove that a vanishing SKDS guarantees alignment of the learned diffusion's stationary distribution with the target. Furthermore, under broad parametrizations, SKDS is convex with an empirical version that is $\varepsilon$-quasiconvex with high probability. Empirically, learning with SKDS attains comparable accuracy to KDS while substantially reducing computational cost and yields improvements over the majority of competitive baselines.
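The abstract's key object, an expectation of a diffusion's generator against an RKHS of test functions, has the same shape as the classical kernel Stein discrepancy. As a hedged illustration of that general construction (not the paper's SKDS, whose diffusion parametrization is richer), here is a minimal NumPy estimate of the Langevin-Stein discrepancy with an RBF kernel:

```python
import numpy as np

def ksd_vstat(samples, score, h=1.0):
    """V-statistic estimate of the kernel Stein discrepancy (RBF kernel).

    samples: (n, d) points; score(x): gradient of the log target density.
    Illustrates the kind of RKHS expectation of a (Langevin) generator
    that a Stein-type stationarity objective evaluates.
    """
    n, d = samples.shape
    s = np.stack([score(x) for x in samples])         # (n, d) score matrix
    diff = samples[:, None, :] - samples[None, :, :]  # pairwise x_i - x_j
    sq = np.sum(diff ** 2, axis=-1)
    k = np.exp(-sq / (2 * h ** 2))                    # RBF kernel matrix
    term1 = (s @ s.T) * k                                    # s(x)·s(x') k
    term2 = np.einsum('id,ijd->ij', s, diff) * k / h ** 2    # s(x)·∇_{x'} k
    term3 = -np.einsum('jd,ijd->ij', s, diff) * k / h ** 2   # s(x')·∇_{x} k
    term4 = (d / h ** 2 - sq / h ** 4) * k                   # tr(∇_x ∇_{x'} k)
    return np.mean(term1 + term2 + term3 + term4)

# Example: normal samples scored against a standard normal target.
rng = np.random.default_rng(0)
x = rng.standard_normal((200, 2))
print(ksd_vstat(x, score=lambda v: -v))  # small value: samples match target
```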
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- North America > United States > New York (0.04)
- Asia > Japan > Honshū > Kantō > Kanagawa Prefecture (0.04)
- (4 more...)
Sparse Tucker Decomposition and Graph Regularization for High-Dimensional Time Series Forecasting
Xia, Sijia, Ng, Michael K., Zhang, Xiongjun
Existing vector autoregressive methods for multivariate time series analysis use low-rank matrix approximation or Tucker decomposition to alleviate the over-parameterization issue. In this paper, we propose a sparse Tucker decomposition method with graph regularization for high-dimensional vector autoregressive time series. By stacking the time-series transition matrices into a third-order tensor, the sparse Tucker decomposition is employed to characterize important interactions within the transition tensor and reduce the number of parameters. Moreover, graph regularization is employed to measure the local consistency of the response, predictor, and temporal factor matrices in the vector autoregressive model. The two proposed regularization techniques can be shown to yield more accurate parameter estimation. A non-asymptotic error bound for the estimator of the proposed method is established, which is lower than those of existing matrix- or tensor-based methods. A proximal alternating linearized minimization algorithm is designed to solve the resulting model, and its global convergence is established under very mild conditions. Extensive numerical experiments on synthetic data and real-world datasets verify the superior performance of the proposed method over existing state-of-the-art methods.
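To make the stacking construction concrete, here is a minimal sketch using TensorLy (an assumed dependency), with hypothetical sizes. It performs a plain Tucker decomposition of the stacked transition tensor; the paper's actual method additionally imposes sparsity on the core and graph regularization on the factor matrices, neither of which is reproduced here.

```python
import numpy as np
import tensorly as tl
from tensorly.decomposition import tucker

# Hypothetical sizes: N series, lag order P. A[:, :, p] holds the lag-(p+1)
# transition matrix of the VAR model, stacked into a third-order tensor.
N, P = 20, 5
rng = np.random.default_rng(0)
A = rng.standard_normal((N, N, P))

# Plain Tucker decomposition with multilinear ranks (4, 4, 2): the three
# factors play the role of response, predictor, and temporal factor matrices.
core, factors = tucker(tl.tensor(A), rank=(4, 4, 2))
A_hat = tl.tucker_to_tensor((core, factors))
print(np.linalg.norm(A - A_hat) / np.linalg.norm(A))  # relative fit error
```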
- Asia > China > Zhejiang Province > Hangzhou (0.04)
- Asia > China > Hong Kong (0.04)
- Asia > China > Hubei Province > Wuhan (0.04)
- (4 more...)
Nearly Optimal Approximation of Matrix Functions by the Lanczos Method
Approximating the action of a matrix function $f(\vec{A})$ on a vector $\vec{b}$ is an increasingly important primitive in machine learning, data science, and statistics, with applications such as sampling high-dimensional Gaussians, Gaussian process regression and Bayesian inference, principal component analysis, and approximating Hessian spectral densities. Over the past decade, a number of algorithms enjoying strong theoretical guarantees have been proposed for this task. Many of the most successful belong to a family of algorithms called Krylov subspace methods. Remarkably, a classic Krylov subspace method, called the Lanczos method for matrix functions (Lanczos-FA), frequently outperforms newer methods in practice. Our main result is a theoretical justification for this finding: we show that, for a natural class of rational functions, Lanczos-FA matches the error of the best possible Krylov subspace method up to a multiplicative approximation factor. The approximation factor depends on the degree of $f(x)$'s denominator and the condition number of $\vec{A}$, but not on the number of iterations $k$. Our result provides a strong justification for the excellent performance of Lanczos-FA, especially on functions that are well approximated by rationals, such as the matrix square root.
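To illustrate the primitive in question, here is a minimal NumPy/SciPy sketch of the generic, textbook Lanczos-FA iteration (with full reorthogonalization for numerical safety; not the paper's code): run $k$ Lanczos iterations on symmetric $\vec{A}$ to build an orthonormal basis $Q$ and tridiagonal $T$, then return $\|\vec{b}\|\, Q f(T) e_1$ as the approximation to $f(\vec{A})\vec{b}$.

```python
import numpy as np
from scipy.linalg import eigh_tridiagonal

def lanczos_fa(A, b, k, f):
    """Approximate f(A) @ b for symmetric A via k Lanczos iterations."""
    n = len(b)
    Q = np.zeros((n, k))
    alpha, beta = np.zeros(k), np.zeros(k - 1)
    Q[:, 0] = b / np.linalg.norm(b)
    for j in range(k):
        w = A @ Q[:, j]
        alpha[j] = Q[:, j] @ w
        # Full reorthogonalization against all previous Lanczos vectors.
        w -= Q[:, :j + 1] @ (Q[:, :j + 1].T @ w)
        if j < k - 1:
            beta[j] = np.linalg.norm(w)
            Q[:, j + 1] = w / beta[j]
    # f(T) e_1 via the eigendecomposition of the tridiagonal matrix T.
    evals, evecs = eigh_tridiagonal(alpha, beta)
    fT_e1 = evecs @ (f(evals) * evecs[0])
    return np.linalg.norm(b) * (Q @ fT_e1)

# Example: matrix square root of a random well-conditioned SPD matrix.
rng = np.random.default_rng(0)
M = rng.standard_normal((100, 100))
A = M @ M.T + 100 * np.eye(100)
b = rng.standard_normal(100)
approx = lanczos_fa(A, b, k=20, f=np.sqrt)
```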