Model Complexity, Goodness of Fit and Diminishing Returns

Cadez, Igor V., Smyth, Padhraic

Neural Information Processing Systems 

Igor V. Cadez, Information and Computer Science, University of California, Irvine, CA 92697-3425, U.S.A.
Padhraic Smyth, Information and Computer Science, University of California, Irvine, CA 92697-3425, U.S.A.

Abstract

We investigate a general characteristic of the tradeoff in learning problems between goodness-of-fit and model complexity. Specifically, we characterize a general class of learning problems in which the goodness-of-fit function can be shown to be convex, to first order, as a function of model complexity. This general property of "diminishing returns" is illustrated on a number of real data sets and learning problems, including finite mixture modeling and multivariate linear regression.

1 Introduction, Motivation, and Related Work

Assume we have a data set D to which we wish to fit a model. Such learning tasks can typically be characterized by the existence of a model and a loss function. A fitted model of complexity k is a function of the data points D and depends on a specific set of fitted parameters θ. The loss function (goodness-of-fit) is a functional of the model and maps each specific model to a scalar used to evaluate the model, e.g., likelihood for density estimation or sum-of-squares for regression. Figure 1 illustrates a typical empirical curve of loss function versus complexity, for mixtures of Markov models fitted to a large data set of 900,000 sequences.
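The diminishing-returns pattern is easy to reproduce for one of the paper's example settings, multivariate linear regression. The following is an illustrative sketch (not code from the paper): with nested models that use the first k predictors, the training sum-of-squares error can only decrease as k grows, and in typical data the per-predictor improvement shrinks. The data-generating coefficients here are arbitrary choices for the illustration.

```python
# Sketch: goodness-of-fit (training SSE) versus model complexity k for
# nested linear regression models. Synthetic data; the coefficient
# values are assumptions chosen so that early predictors matter most.
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 8
X = rng.standard_normal((n, p))
# Response depends mainly on the first few predictors.
beta = np.array([3.0, 2.0, 1.0, 0.5, 0.25, 0.1, 0.05, 0.01])
y = X @ beta + rng.standard_normal(n)

sse = []
for k in range(1, p + 1):
    Xk = X[:, :k]                              # nested model of complexity k
    coef, *_ = np.linalg.lstsq(Xk, y, rcond=None)
    resid = y - Xk @ coef
    sse.append(float(resid @ resid))

# Adding a predictor never increases training SSE (nested least squares),
# and the successive reductions typically shrink: diminishing returns.
for a, b in zip(sse, sse[1:]):
    assert b <= a + 1e-9
```

Plotting `sse` against k yields the same qualitative shape as the paper's Figure 1: a steep initial drop followed by an increasingly flat tail.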
