A Fast Scale-Invariant Algorithm for Non-negative Least Squares with Non-negative Data

Neural Information Processing Systems

Nonnegative (linear) least squares problems are a fundamental class of problems that is well studied in statistical learning and for which solvers have been implemented in many of the standard programming languages used within the machine learning community. The existing off-the-shelf solvers view the nonnegativity constraint in these problems as an obstacle and, compared to unconstrained least squares, expend additional effort to address it. However, in many of the typical applications, the data itself is nonnegative as well, and we show that the nonnegativity in this case makes the problem easier. In particular, while the worst-case dimension-independent oracle complexity for unconstrained least squares problems necessarily scales with one of the data matrix constants (typically the spectral norm) and these problems are solved to additive error, we show that nonnegative least squares problems with nonnegative data are solvable to multiplicative error and with complexity independent of any matrix constants. The algorithm we introduce is accelerated and based on a primal-dual perspective. We further show how to provably obtain linear convergence using adaptive restart coupled with our method and demonstrate its effectiveness on large-scale data via numerical experiments.
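To make the problem setting concrete, here is a minimal pure-Python sketch of nonnegative least squares with entrywise nonnegative data. It uses the classical multiplicative update rule, which is a swapped-in textbook technique and not the accelerated primal-dual algorithm the abstract describes. The function names, the iteration count, and the small `eps` guard are assumptions made for illustration.

```python
def nnls_multiplicative(A, b, iters=500, eps=1e-12):
    """Approximately minimise 0.5 * ||Ax - b||^2 over x >= 0.

    Assumes every entry of A and b is nonnegative, so the update
    x_j <- x_j * (A^T b)_j / (A^T A x)_j keeps x nonnegative.
    This is NOT the paper's method, only a simple baseline.
    """
    m, n = len(A), len(A[0])
    # A^T b does not change between iterations.
    Atb = [sum(A[i][j] * b[i] for i in range(m)) for j in range(n)]
    x = [1.0] * n  # strictly positive starting point
    for _ in range(iters):
        Ax = [sum(A[i][j] * x[j] for j in range(n)) for i in range(m)]
        AtAx = [sum(A[i][j] * Ax[i] for i in range(m)) for j in range(n)]
        # eps only guards against division by zero.
        x = [x[j] * Atb[j] / (AtAx[j] + eps) for j in range(n)]
    return x


def objective(A, b, x):
    """Return 0.5 * ||Ax - b||^2."""
    residual = [sum(a * t for a, t in zip(row, x)) - bi for row, bi in zip(A, b)]
    return 0.5 * sum(r * r for r in residual)


A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b = [1.0, 2.0, 3.0]
x = nnls_multiplicative(A, b)
```

Rescaling `A` by a positive constant rescales the iterates of this update by the inverse constant, which gives a small, informal taste of why nonnegative data can remove the dependence on matrix constants.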



Finer Metagenomic Reconstruction via Biodiversity Optimization

Neural Information Processing Systems

In previous work [12, 13], a method was introduced that leverages compressive sensing techniques to find the fewest taxa that fit the frequency of short sequences of nucleotides (i.e., k-mers) in a given sample. Consider, for instance, an environment/sample made of $s$ bacterial species but where two of them are almost identical: one would wish to say that the concentration vector is almost $(s-1)$-sparse rather than $s$-sparse!


C An Illustrative Example. We provide an illustrative counterexample showing that the FS-WBP in Eq. (10) is not an MCF problem when $m = 3$ and $n = 3$. Example C.1. When $m = 3$ and $n = 3$, the constraint matrix is

Neural Information Processing Systems

When $n = 2$, the constraint matrix $A$ has $E = I_2 \otimes 1_2^\top$ and $G = 1_2^\top \otimes I_2$. Now we simplify the matrix $A$ by removing a specific set of redundant rows. Furthermore, the rows of $A$ are categorized into a single set so that the criterion in Proposition 3.2 holds true (the dashed line in the formulation of $A$ serves as a partition of this single set into two sets). We use proof by contradiction. In particular, assume that problem (10) is an MCF problem when $m \ge 3$ and $n \ge 3$; Proposition 3.3 then implies that the constraint matrix $A$ is TU.
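As a hedged illustration of the objects named above, the following sketch builds the two blocks for $n = 2$ and brute-forces a total unimodularity (TU) check on a small matrix. It assumes that the operators lost in extraction are Kronecker products, and it stacks the two blocks into a classical transportation-style matrix purely for demonstration; this stacked matrix is not the full constraint matrix $A$ of the FS-WBP, and all function names are hypothetical.

```python
from itertools import combinations


def kron(A, B):
    """Kronecker product of two matrices given as lists of rows."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]


def det(M):
    """Integer determinant by Laplace expansion (fine for tiny matrices)."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, entry in enumerate(M[0]):
        if entry:
            minor = [row[:j] + row[j + 1:] for row in M[1:]]
            total += (-1) ** j * entry * det(minor)
    return total


def is_totally_unimodular(M):
    """Check that every square submatrix has determinant in {-1, 0, 1}."""
    rows, cols = len(M), len(M[0])
    for k in range(1, min(rows, cols) + 1):
        for r in combinations(range(rows), k):
            for c in combinations(range(cols), k):
                sub = [[M[i][j] for j in c] for i in r]
                if det(sub) not in (-1, 0, 1):
                    return False
    return True


I2 = [[1, 0], [0, 1]]
ones_row = [[1, 1]]          # the row vector 1_2^T
E = kron(I2, ones_row)       # assumed reading: E = I_2 (x) 1_2^T
G = kron(ones_row, I2)       # assumed reading: G = 1_2^T (x) I_2
stacked = E + G              # illustration only, not the paper's A
```

The brute-force check enumerates every square submatrix, so it is only practical for very small matrices such as the ones in this example.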


Appendices for "Pruning Randomly Initialized Neural Networks with Iterative Randomization"

Neural Information Processing Systems

We consider a target neural network $f: \mathbb{R}^{d_0} \to \mathbb{R}^{d_l}$ of depth $l$, which is described as follows. Similar to the previous works [6, 7], we assume that $g(x)$ is twice as deep as the target network $f(x)$. Thus, $g(x)$ can be described as $g(x) = G_{2l}\,\sigma(G_{2l-1}\,\sigma(\cdots G_1(x)))$, (2) where $G_j$ is a $\tilde{d}_j \times \tilde{d}_{j-1}$ matrix ($\tilde{d}_j \in \mathbb{N}_{\ge 1}$ for $j = 1, \dots, 2l$) with $\tilde{d}_{2i} = d_i$. Under this re-sampling assumption, we describe our main theorem as follows. Theorem A.1 (Main Theorem) Fix $\epsilon, \delta > 0$, and we assume that $\|F_i\|_{\mathrm{Frob}} \le 1$. Let $R \in \mathbb{N}$ and we assume that each element of $G_i$ can be re-sampled with replacement from the uniform distribution $U[-1, 1]$ up to $R - 1$ times. If $n \ge 2\log(1/\delta)$ holds, then with probability at least $1 - \delta$, we have $|\alpha - X_i|$, (5) for some $i \in \{1, \dots, n\}$.
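The structure of $g$ in Eq. (2) can be sketched in a few lines of pure Python. This is only an illustration under stated assumptions: the excerpt does not define $\sigma$, so ReLU is assumed here, and the helper names (`sample_layer`, `build_network`, `forward`) as well as the example widths are hypothetical. The sketch draws each weight from $U[-1, 1]$ once and does not implement the re-sampling or pruning procedure itself.

```python
import random


def relu(v):
    """Elementwise ReLU, assumed here as the activation sigma."""
    return [max(0.0, t) for t in v]


def matvec(G, x):
    """Multiply matrix G (list of rows) by vector x."""
    return [sum(g * t for g, t in zip(row, x)) for row in G]


def sample_layer(rows, cols, rng):
    """Draw a rows x cols weight matrix with entries from U[-1, 1]."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def build_network(widths, rng):
    """widths = [d~_0, ..., d~_2l]; layer j has shape d~_j x d~_{j-1}."""
    return [sample_layer(widths[j], widths[j - 1], rng)
            for j in range(1, len(widths))]


def forward(layers, x):
    """Compute G_2l sigma(G_{2l-1} sigma(... G_1 x)): no activation after the last layer."""
    h = x
    for G in layers[:-1]:
        h = relu(matvec(G, h))
    return matvec(layers[-1], h)


rng = random.Random(0)
# Hypothetical example with l = 2: even-indexed widths play the role of
# the target widths d_0 = 3, d_1 = 2, d_2 = 2.
widths = [3, 5, 2, 5, 2]
layers = build_network(widths, rng)
output = forward(layers, [1.0, -0.5, 0.25])
```

Keeping the layers as plain lists of rows makes it easy to swap a single entry for a fresh draw from `rng.uniform(-1.0, 1.0)` when experimenting with re-sampling.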