Goto

Collaborating Authors

 Statistical Learning








min

Neural Information Processing Systems

LetAbean nHermitian matrixandletBbea(n 1) (n 1)matrixwhich is constructed by deleting thei-th row andi-th column ofA. Denote thatฮฆ = [ฯ•(x1),...,ฯ•(xn)] Rn D, where D is the dimension of feature spaceH. Performing rank-n singular value decomposition (SVD) onฮฆ, we have ฮฆ = HฮฃV, where H Rn n, ฮฃ Rn n is a diagonal matrix whose diagonal elements are the singular values of ฮฆ,andV RD n. F(ฮฑ) in Eq.(21) is proven differentiable and thep-th component of the gradient is F(ฮฑ) ฮฑp = Then, a reduced gradient descent algorithm [26] is adopted to optimize Eq.(21). The three deep neural networks are pre-trained on the ImageNet[5].