Understanding the Pseudo-Truth as an Optimal Approximation
One of the things that set statistics apart from the rest of applied mathematics is an interest in the problems introduced by sampling: how can we learn about a model if we're given only a finite and potentially noisy sample of data? Although frequently important, the issues introduced by sampling can be a distraction when the core difficulties you face would persist even with access to an infinite supply of noiseless data. For example, if you're fitting a misspecified model \(m_1\) to data generated by a model \(m_2\), this misspecification will persist even as the supply of data becomes infinite. In this setting, the issues introduced by sampling can be irrelevant: it's often more important to know whether or not the misspecified model, \(m_1\), could ever act as an acceptable approximation to the true model, \(m_2\). Until recently, I knew very little about these sorts of issues.
Mar-25-2016, 22:06:12 GMT
- Technology: