

A Organization of the Appendix

Neural Information Processing Systems

The appendix includes the missing proofs and detailed discussions of some arguments in the main body. The proof of the infeasibility condition (Theorem 3.2) is provided in Section B. Explanations of the conditions derived in Theorem 3.2 are included in Section C. The proofs of the properties of the proposed models (r)LogSpecT (Proposition 3.4) are in Section D. The truncated-Hausdorff-distance-based proof details of Theorem 4.1 and Corollary 4.4 are in Section E. Details of L-ADMM and its convergence analysis are in Section F. Additional experiments and discussions on synthetic data are included in Section G. Again, from Farkas' lemma, this implies that the following linear system does not have a solution. From Example 3.1 we know δ = 2|h [...]. Since the constraint set S is a cone, it follows that γS = S for all γ > 0, and hence Opt(C, α) = α Opt(C, 1), which completes the proof. The proof is conducted by constructing a feasible solution for rLogSpecT: since LogSpecT is a convex problem and Slater's condition holds, the KKT conditions hold at its optimum, and we show that the resulting point is feasible for rLogSpecT. For a function f taking values in R, its epigraph is defined as epi f := {(x, y) | y ≥ f(x)}. Before presenting the proof, we first introduce the following lemma.
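The scaling step Opt(C, α) = α Opt(C, 1) admits a one-line justification. A minimal sketch, assuming (the excerpt does not show the full problem) that the objective f and the constraint function g are positively homogeneous of degree one and that S is a cone:

\[
\mathrm{Opt}(C,\alpha)
= \min_{s \in S,\; g(s) \le \alpha} f(s)
= \min_{s' \in S,\; g(s') \le 1} f(\alpha s')
= \alpha \min_{s' \in S,\; g(s') \le 1} f(s')
= \alpha\,\mathrm{Opt}(C,1),
\]

where the change of variables s = α s' is valid precisely because γS = S for every γ > 0.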


A Proofs

Neural Information Processing Systems

We further take the usual assumption that X is compact. Let us start with Proposition 3, a central observation needed in Theorem 2. Put into words: [...]. Now we can proceed to prove the universality part of Theorem 2. Since the task admits a smooth separator, by Fubini's theorem and Proposition 3 we have F [...]. The reader can think of λ as a uniform distribution over G (as in Theorem 2). The result follows directly from the combination of de Finetti's theorem [...]; combining this with Kallenberg's noise transfer theorem, we have that the weights [...]. Either the task satisfies Assumption 1, or (ii) it is an inner-product decision graph problem as in Definition 3; further, the task has infinitely [...] (as in Theorem 2). Finally, we follow the proof of Proposition 2, simply replacing de Finetti's theorem with the Aldous-Hoover theorem. Define an RLC that samples the linear coefficients as follows.
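For reference, the two representation theorems invoked above are the classical ones; the statements below use standard notation rather than the paper's. De Finetti's theorem: for an infinite exchangeable sequence X_1, X_2, ... there exists a random probability measure μ such that, conditionally on μ, the X_i are i.i.d. with law μ:

\[
\Pr\left(X_1 \in A_1, \dots, X_n \in A_n\right)
= \mathbb{E}\left[\prod_{i=1}^{n} \mu(A_i)\right]
\quad \text{for all } n \ge 1.
\]

The Aldous-Hoover theorem plays the analogous role for jointly exchangeable arrays: there exist a measurable function f and i.i.d. Uniform(0,1) variables U, (U_i), (U_{\{i,j\}}) such that

\[
(X_{ij}) \overset{d}{=} \Big( f\big(U,\, U_i,\, U_j,\, U_{\{i,j\}}\big) \Big)_{i \neq j}.
\]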



Supplement to "Estimating Riemannian Metric with Noise-Contaminated Intrinsic Distance"

Neural Information Processing Systems

Unlike distance metric learning, where the usual focus is on the subsequent tasks that utilize the estimated distance metric, our proposal focuses on the estimated metric itself as a characterization of the geometric structure. Beyond the illustrated taxi and MNIST examples, it remains open to find more compelling applications that target the geometry of the data space. Interpreting mathematical concepts such as the Riemannian metric and geodesics in the context of potential applications (e.g., cognition and perception research, where similarity measures are common) could be inspiring. Our proposal requires sufficiently dense data, which can be demanding, especially for high-dimensional data, due to the curse of dimensionality. Dimension reduction (e.g., manifold embedding, as in the MNIST example) can substantially alleviate the curse of dimensionality, making the dense-data requirement more likely to hold.




Leveraging the two-timescale regime to demonstrate convergence of neural networks

Neural Information Processing Systems

Artificial neural networks are among the most successful modern machine learning methods, in particular because their non-linear parametrization provides a flexible way to implement feature learning (see, e.g., Goodfellow et al., 2016, chapter 15).