Local stability and robustness of sparse dictionary learning in the presence of noise
Jenatton, Rodolphe, Gribonval, Rémi, Bach, Francis
Modelling signals as sparse linear combinations of atoms selected from a dictionary has become a popular paradigm in many fields, including signal processing, statistics, and machine learning. This line of research has witnessed the development of several well-founded theoretical frameworks (see, e.g., Wainwright [2009], Zhang [2009]) and efficient algorithmic tools (see, e.g., Bach et al. [2011] and references therein). However, the performance of such approaches hinges on the representation of the signals, which makes the question of designing "good" dictionaries prominent. A great deal of effort has been dedicated to come up with efficient predefined dictionaries, e.g., the various types of wavelets [Mallat, 2008]. These representations have notably contributed to many successful image processing applications such as compression, denoising and deblurring. More recently, the idea of simultaneously learning the dictionary and the sparse decompositions of the signals--also known as sparse dictionary learning, or simply, sparse coding--has emerged as a powerful framework, with state-of-the-art performance in many tasks, including inpainting and image classification (see, e.g., Mairal et al. [2010] and references therein). Although sparse dictionary learning can sometimes be formulated as convex[Bach et al., 2008, Bradley and Bagnell, 2009], nonparametric Bayesian [Zhou et al., 2009] and submodular [Krause and Cevher, 2010] problems, the most popular and widely used definition of sparse coding brings into play a non-convex optimization problem. Despite its empirical and practical success, there has only been little theoretical analysis of the properties of sparse dictionary learning.
Oct-2-2012