Connections: Log Likelihood, Cross Entropy, KL Divergence, Logistic Regression, and Neural Networks

Dec-9-2019, 04:19:04 GMT–#artificialintelligence

Maximizing the (log) likelihood is equivalent to minimizing the binary cross entropy. There is literally no difference between the two objective functions, so there can be no difference between the resulting model or its characteristics. This of course, can be extended quite simply to the multiclass case using softmax cross-entropy and the so-called multinoulli likelihood, so there is no difference when doing this for multiclass cases as is typical in, say, neural networks. The difference between MLE and cross-entropy is that MLE represents a structured and principled approach to modeling and training, and binary/softmax cross-entropy simply represent special cases of that applied to problems that people typically care about. After that aside on maximum likelihood estimation, let's delve more into the relationship between negative log likelihood and cross entropy.

cross entropy, likelihood, log likelihood, (11 more...)

#artificialintelligence

Dec-9-2019, 04:19:04 GMT

News Web Page

Add feedback

Genre:
- Research Report
  - New Finding (0.42)
  - Experimental Study (0.42)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning > Uncertainty
    - Bayesian Inference (0.73)
  - Machine Learning
    - Statistical Learning (1.00)
    - Neural Networks (1.00)
    - Learning Graphical Models > Directed Networks
      - Bayesian Learning (0.59)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found