Model Complexity, Goodness of Fit and Diminishing Returns
Neural Information Processing Systems
We consider modeling the data set D using models indexed by a complexity index k, 1 ≤ k ≤ kmax. For example, the models could be finite mixture probability density functions (PDFs) for vectors x_i, where model complexity is indexed by the number of components k in the mixture. Alternatively, the modeling task could be to fit a conditional regression model y = g(z_k) + e, where y is one of the variables in the vector x and z_k is some subset of size k of the remaining components of x. Such learning tasks can typically be characterized by the existence of a model and a loss function. A fitted model of complexity k is a function of the data points D and depends on a specific set of fitted parameters θ. The loss function (goodness-of-fit) is a functional of the model and maps each specific model to a scalar used to evaluate the model, e.g., likelihood for density estimation or sum-of-squares for regression. Figure 1 illustrates a typical empirical curve of loss function versus complexity, for mixtures of Markov models fitted to a large data set of 900,000 sequences. The complexity k is the number of Markov models in the mixture (see Cadez et al. (2000) for further details on the model and the data set). The empirical curve has a distinctly concave appearance, with large relative gains in fit for low-complexity models and much more modest relative gains for high-complexity models.
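The concave loss-versus-complexity behavior described above can be reproduced with a much simpler stand-in than the paper's Markov-model mixtures. The sketch below (an illustrative assumption, not the authors' experiment) fits polynomial regression models of increasing degree k to synthetic data and records the training sum-of-squares loss: because the models are nested, the loss is non-increasing in k, with large gains at low complexity and diminishing returns thereafter.

```python
import numpy as np

# Synthetic data from a cubic signal plus noise (hypothetical example,
# standing in for the paper's 900,000-sequence data set).
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 200)
y = 1.0 + 2.0 * x - 3.0 * x**3 + rng.normal(scale=0.1, size=x.shape)

def training_sse(degree):
    """Sum-of-squares training loss for a degree-`degree` polynomial fit."""
    coeffs = np.polyfit(x, y, degree)
    residuals = y - np.polyval(coeffs, x)
    return float(residuals @ residuals)

# Complexity index k = polynomial degree, 1 <= k <= kmax.
degrees = range(1, 9)
sse = [training_sse(k) for k in degrees]

# The gain from the first complexity increments dwarfs later ones,
# giving the concave loss-vs-complexity curve of Figure 1.
for k, s in zip(degrees, sse):
    print(f"k={k}: SSE={s:.3f}")
```

Here the early jump (degree 1 to 3, which captures the true cubic) removes most of the loss, while degrees 4 through 8 only chase noise, mirroring the diminishing returns seen for mixture components.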