Probabilistic learning of boolean functions applied to the binary classification problem with categorical covariates

Mar-20-2020–arXiv.org Machine Learning

Consider a sample $y \in \{0, 1\}^n$ generated by two different Bernoulli distributions with parameters $\pi_0$ and $\pi_1$, and let $S \subseteq \{1, \ldots, n\}$ be the set of all indices $i$ such that $P(y_i = 1) = \pi_1$. Assuming that the components of the vector $y$ are conditionally independent given $\theta = (S, \pi_0, \pi_1)$, the likelihood function is the product of two Binomial distribution functions and attains a global maximum at the set $S = L(y) = \{i : 1 \le i \le n,\ y_i = 1\}$ (call this set the onset of the vector $y$), with maximum likelihood estimators $\hat{\pi}_0 = 0$ and $\hat{\pi}_1 = 1$. Now consider a design matrix $X \in \mathbb{R}^{n \times p}$ and a function $f: \mathbb{R}^p \to \{0, 1\}$ such that $f(X_i) = 1 \iff i \in S$, where $X_i$ is the $i$-th row of $X$. Again, if the function $f$ is not constrained in any way, the problem is the same and the same trivial solution applies, with $f$ defined only on the set of rows of $X$. In this extreme case, the solution will usually not generalize well and will not provide any interesting interpretation (since $f$ is just an enumeration based on the onset of $y$). Standard methods for binary classification estimate $f$ under constraints of various kinds, so that this trivial solution (associated with the problem of overfitting) is avoided.
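The trivial solution described above can be checked numerically. The following is a minimal sketch, not the authors' code: the function names `onset`, `bernoulli_loglik`, and `profile_loglik` are hypothetical, and the likelihood is written as a product of Bernoulli terms (the binomial coefficients are omitted, which matches treating the observations as an ordered sequence). For a fixed index set S, the log-likelihood is maximized over the two parameters by the within-group sample proportions, and the result is compared across choices of S.

```python
import math


def onset(y):
    """Return the 1-based indices i with y_i = 1."""
    return {i for i, yi in enumerate(y, start=1) if yi == 1}


def bernoulli_loglik(k, m, p):
    """Log-likelihood of k ones among m Bernoulli(p) draws, with 0 * log(0) = 0."""
    ll = 0.0
    for count, prob in ((k, p), (m - k, 1.0 - p)):
        if count > 0:
            if prob == 0.0:
                return -math.inf
            ll += count * math.log(prob)
    return ll


def profile_loglik(y, S):
    """Log-likelihood of y for a fixed index set S, maximized over both parameters.

    Assumption for this sketch: indices in S share one Bernoulli parameter,
    all other indices share another, and each is estimated by its group mean.
    """
    n = len(y)
    inside = [y[i - 1] for i in range(1, n + 1) if i in S]
    outside = [y[i - 1] for i in range(1, n + 1) if i not in S]
    total = 0.0
    for group in (inside, outside):
        if group:
            p_hat = sum(group) / len(group)
            total += bernoulli_loglik(sum(group), len(group), p_hat)
    return total


y = [1, 0, 1, 1, 0]
best = profile_loglik(y, onset(y))      # S equal to the onset of y
other = profile_loglik(y, {1, 2})       # an arbitrary different S
print(onset(y), best, other)
```

With S equal to the onset, every observation inside S is 1 and every observation outside S is 0, so the estimated parameters are 1 and 0 and the log-likelihood reaches its upper bound of 0. Any S that mixes zeros and ones in a group gives a strictly negative value, which is why an unconstrained fit simply memorizes the labels.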



