The Naive Bayes Classifier explained
With the Naive Bayes model, we take into account not just a small set of positive and negative words, but every word the NB classifier was trained on, i.e. every word present in the training set. If a word did not appear in the training set, we have no data for it and apply Laplacian smoothing (use 1 instead of the conditional probability of the word). The probability that a document belongs to a class C is proportional to the class probability P(C) multiplied by the product of the conditional probabilities of each word in the document given that class, where the conditional probability of word d_i is estimated as count(d_i, C) / V_C. Here count(d_i, C) is the number of occurrences of word d_i in class C, V_C is the total number of words in class C, and n is the number of words in the document we are currently classifying, so the product runs over i = 1, ..., n. In general we want the training set to be as large as possible, since more training data tends to improve accuracy.
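The scoring rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the article's own implementation: the function names `train` and `classify` and the toy data layout are hypothetical. Words never seen in training contribute a factor of 1 (they are skipped), as described above. As an additional assumption not stated in the text, words that are in the vocabulary but absent from a given class get add-one smoothing, (count(d_i, C) + 1) / (V_C + |vocabulary|), so that a single missing word does not force the whole product to zero. Logarithms are summed instead of multiplying probabilities to avoid numerical underflow on long documents.

```python
import math
from collections import Counter, defaultdict


def train(docs):
    """Collect counts from docs, a list of (list_of_words, label) pairs."""
    class_docs = Counter()              # number of training documents per class
    word_counts = defaultdict(Counter)  # count(word, C) for each class C
    for words, label in docs:
        class_docs[label] += 1
        word_counts[label].update(words)
    vocab = {w for counts in word_counts.values() for w in counts}
    return class_docs, word_counts, vocab


def classify(words, model):
    """Return (best_label, log_scores) for a tokenized document."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    scores = {}
    for label in class_docs:
        v_c = sum(word_counts[label].values())  # V_C: total words in class C
        score = math.log(class_docs[label] / total_docs)  # log P(C)
        for w in words:
            if w not in vocab:
                continue  # unseen in training: factor 1, i.e. log(1) = 0
            # Assumption: add-one smoothing for words that are in the vocabulary.
            score += math.log((word_counts[label][w] + 1) / (v_c + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get), scores
```

Calling `classify("great fun movie".split(), train(training_docs))` on a small labeled corpus returns the most likely class together with the per-class log scores; the scores are only comparable to each other and are not normalized probabilities.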
Nov-14-2016, 04:33:40 GMT