10 Modern Statistical Concepts Discovered by Data Scientists

Dec-25-2016, 00:45:04 GMT–@machinelearnbot

Clustering using tagging or indexation methods (see section 3 after clicking on the link) lets you cluster text (articles, websites) much faster than any traditional statistical technique, with a scalable algorithm that is very easy to implement.

Bucketization: the science and art of identifying the right homogeneous data buckets (millions of buckets among billions of observations) to provide highly localized (or segment-targeted) predictions, or to smooth regression parameters across similar buckets, with strong statistical significance. It is equivalent to joint (not sequential) binning in multiple dimensions, which is a combinatorial optimization problem. While decision trees also produce some bucketization, the data science approach is more robust, simple, scalable and model-free. It does not directly produce decision trees, and it leads to easy interpretation (in a fraud detection problem, for example, each data bucket corresponds to a specific type of fraud). A related problem is bucket clustering, via standard hierarchical clustering techniques.
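To make the idea of joint binning concrete, here is a minimal Python sketch. It is not the article's algorithm: it assumes the cut points are already given, whereas the text describes choosing them as a combinatorial optimization problem. The feature names (`amount`, `hour`), the `fraud` label, the `min_count` threshold and both function names are hypothetical and exist only for illustration.

```python
from collections import defaultdict


def bucket_key(row, edges):
    """Map a row to a tuple of bin indices, one per feature.

    All features are binned together, so the tuple identifies one cell
    of the multi-dimensional grid (joint, not sequential, binning).
    """
    key = []
    for name, cuts in edges.items():
        # Bin index = number of cut points that the value reaches or exceeds.
        key.append(sum(row[name] >= c for c in cuts))
    return tuple(key)


def bucket_rates(rows, edges, label="fraud", min_count=2):
    """Return the observed label rate per bucket.

    Buckets with fewer than `min_count` observations are dropped, a crude
    stand-in for the statistical-significance requirement in the text.
    """
    totals = defaultdict(lambda: [0, 0])  # key -> [n_obs, n_positive]
    for row in rows:
        k = bucket_key(row, edges)
        totals[k][0] += 1
        totals[k][1] += row[label]
    return {k: pos / n for k, (n, pos) in totals.items() if n >= min_count}


# Toy data: hypothetical transactions with an amount and an hour of day.
rows = [
    {"amount": 5, "hour": 2, "fraud": 0},
    {"amount": 8, "hour": 3, "fraud": 0},
    {"amount": 950, "hour": 2, "fraud": 1},
    {"amount": 990, "hour": 4, "fraud": 1},
    {"amount": 970, "hour": 14, "fraud": 0},
]

# Assumed, hand-picked cut points: one per feature.
edges = {"amount": [100], "hour": [6]}

rates = bucket_rates(rows, edges)
print(rates)
```

Each key in `rates` is a bucket and each value is that bucket's local prediction, so a bucket such as "large amount, night-time" can be read directly as a fraud segment. Real data would need far more observations per bucket and a principled way to choose the cut points.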

