State-of-the-Art Machine Learning Automation with HDT
The number of "feature values" is the total number of key-value pairs found, including the small, unstable ones, regardless of whether they are classified as good or bad. Any article with a pv above the arbitrary threshold pv_threshold = 7.1 (see source code) is considered good. This corresponds to articles having about 1.3 times more traffic than average, since we use a log scale and the average pv is 6.81. The traffic for articles classified as good by the algorithm (pv 8.23) is about 4.2 times the traffic that an average article receives. Also note that we correctly identify the vast majority of good articles, but this is because we work with small nodes. Finally, an article is marked as good if it triggers at least one node marked as good (that is, a node satisfying the criterion defined in the next sub-section). Besides pv_threshold, the algorithm uses 12 parameters to identify a usable, stable node classified as good.
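The log-scale arithmetic and the "at least one good node" rule can be illustrated with a minimal Python sketch. It assumes that pv is the natural logarithm of page views (the text says only "log scale"), and the names traffic_ratio, is_good_article, triggered_nodes and good_nodes are hypothetical, not taken from the author's source code. Under that assumption, the ratios come out close to the approximate figures quoted above.

```python
import math

PV_THRESHOLD = 7.1   # threshold quoted in the text
AVERAGE_PV = 6.81    # average pv quoted in the text


def traffic_ratio(pv, average_pv=AVERAGE_PV):
    """Traffic multiple relative to an average article.

    Assumption: pv is a natural log of page views, so a difference
    in pv becomes a ratio of raw traffic after exponentiation.
    """
    return math.exp(pv - average_pv)


def is_good_article(triggered_nodes, good_nodes):
    """Flag an article as good if it triggers at least one good node.

    triggered_nodes: iterable of node identifiers the article falls into
    good_nodes: set of node identifiers already marked as good
    (both are hypothetical representations used for illustration)
    """
    return any(node in good_nodes for node in triggered_nodes)


if __name__ == "__main__":
    print(round(traffic_ratio(PV_THRESHOLD), 2))  # articles at the threshold
    print(round(traffic_ratio(8.23), 2))          # articles classified as good
    print(is_good_article(["node_a", "node_b"], {"node_b"}))
```

How a node comes to be marked as good (the 12 parameters mentioned above) is deliberately left out of this sketch, since the criterion is defined elsewhere in the article.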
Feb-14-2017, 03:50:04 GMT