Probabilistic learning of boolean functions applied to the binary classification problem with categorical covariates

Hubert, Paulo

arXiv.org Machine Learning 

Consider a sample $y \in \{0, 1\}^n$ generated by two different Bernoulli distributions with parameters $\pi_0$ and $\pi_1$, and let $S \subseteq \{1, \ldots, n\}$ be the set of all indices $i$ such that $P(y_i = 1) = \pi_1$. Assuming that the components of the vector $y$ are conditionally independent given $\theta = (S, \pi_0, \pi_1)$, the likelihood function is the product of two Binomial distribution functions, and attains a global maximum at the set $S = L(y) = \{i : 1 \le i \le n \wedge y_i = 1\}$ (let us call this set the onset of the vector $y$), with maximum likelihood estimators $\hat{\pi}_0 = 0$ and $\hat{\pi}_1 = 1$. Now consider a design matrix $X \in \mathbb{R}^{n \times p}$ and a function $f: \mathbb{R}^p \to \{0, 1\}$ such that $f(X_i) = 1 \iff i \in S$, where $X_i$ is the $i$-th row of $X$. Again, if the function $f$ is not constrained in any way, the problem is the same and the same trivial solution applies, with $f$ defined only on the set of rows of $X$. In this extreme case, the solution will usually not generalize well, and it will not provide any interesting interpretation either (since $f$ is just an enumeration based on the onset of $y$). Standard methods for the binary classification problem estimate $f$ under constraints of different kinds, chosen so that this trivial solution (associated with the problem of overfitting) is avoided.
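The trivial solution described above can be checked numerically on a toy sample. The following is a minimal sketch, not the author's implementation: it assumes the likelihood is written as a product of per-observation Bernoulli terms (binomial coefficients omitted), uses 0-based indices, and the names `profile_likelihood` and `onset` are hypothetical. For each candidate set $S$ it plugs in the maximizing $\pi_0$ and $\pi_1$, then enumerates all subsets by brute force.

```python
from itertools import combinations


def profile_likelihood(y, S):
    """Likelihood of y for a fixed index set S, maximized over pi_0 and pi_1."""

    def bernoulli_max(values):
        # Maximized Bernoulli likelihood p^k (1-p)^(m-k), attained at p = k/m.
        m, k = len(values), sum(values)
        if m == 0:
            return 1.0  # an empty group contributes a factor of 1
        p = k / m
        return (p ** k) * ((1 - p) ** (m - k))

    inside = [y[i] for i in range(len(y)) if i in S]       # governed by pi_1
    outside = [y[i] for i in range(len(y)) if i not in S]  # governed by pi_0
    return bernoulli_max(inside) * bernoulli_max(outside)


def onset(y):
    """Indices (0-based) at which y equals 1."""
    return frozenset(i for i, v in enumerate(y) if v == 1)


y = [1, 0, 0, 1, 1, 0]  # toy sample, chosen only for illustration
n = len(y)

# Enumerate every subset of {0, ..., n-1}; feasible only for small n.
subsets = [frozenset(c) for r in range(n + 1) for c in combinations(range(n), r)]
best = max(profile_likelihood(y, S) for S in subsets)

print(sorted(onset(y)), profile_likelihood(y, onset(y)), best)
```

At the onset the two groups are pure (all ones inside $S$, all zeros outside), so the plug-in estimates are $\hat{\pi}_1 = 1$ and $\hat{\pi}_0 = 0$ and the likelihood equals 1, which no other subset can exceed. Under this unconstrained formulation the maximizer need not be unique: the complement of the onset with the two parameters swapped reaches the same value.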
