Automatic Webpage Classification • /r/MachineLearning
I'm trying to create a document classifier, but I can't think of which features to use. Does anybody have experience with this? I used Beautiful Soup to remove the tags. I know tf-idf can be used, but I'm not exactly sure how. Suggestions on how to 'clean' the data better (e.g. removing stop words, stemming, etc.) are also welcome.
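For what it's worth, here is a rough sketch of how tf-idf features could be computed once the tags are stripped. It uses only the standard library; the stop-word list, the sample pages, and the function names `tokenize` and `tfidf` are made up for illustration, and the weighting is the plain textbook form (term frequency times log of inverse document frequency), which may differ from what a given library does by default.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"a", "an", "and", "the", "is", "of", "to", "in", "on", "for"}


def tokenize(text):
    """Lowercase, keep runs of letters, and drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def tfidf(docs):
    """Return one {term: weight} dict per document.

    tf  = term count / number of tokens in the document
    idf = log(N / number of documents containing the term)
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid dividing by zero on empty pages
        vectors.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical page texts, already stripped of HTML tags.
pages = [
    "The football match ended in a draw",
    "The election results and the new parliament",
    "A football transfer and the match schedule",
]
vectors = tfidf(pages)
```

Each dict in `vectors` is a sparse feature vector for one page: words that occur in only one page (such as "ended") get a higher weight than words shared across pages (such as "football"). These vectors can then be fed to whatever classifier you pick. Stemming is not shown here; it would go inside `tokenize`.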
Apr-21-2016, 23:48:10 GMT