SAS Visual Data Mining and Machine Learning propels powerful self-learning analytics to produce insight that matters


The relentless increase in computing power and the accumulation of big data over the years have sparked intense interest in machine learning and its associated techniques. The new SAS Visual Data Mining and Machine Learning software will feed this need for smarter analytics. Advanced analytics offer insight to businesses, but machine learning and deep learning algorithms go deeper, revealing insights that were previously out of reach. Applications of machine learning include facial recognition in security systems, speech recognition in customer service applications, accurate product recommendations in e-commerce, self-driving cars and medical diagnostics. "SAS Visual Data Mining and Machine Learning shatters barriers related to data volume and variety, limited analytical depth and computational bottlenecks.

Data Mining and Machine Learning: Fundamental Concepts and Algorithms: The Free eBook - KDnuggets


We are pleased to announce the second edition of our book Data Mining and Machine Learning: Fundamental Concepts and Algorithms, Second Edition, by Mohammed J. Zaki and Wagner Meira, Jr., published by Cambridge University Press, 2020. The entire book is available to read online for free and the site includes video lectures and other resources. New to this edition is an entire part devoted to regression and deep learning. The fundamental algorithms in data mining and machine learning form the basis of data science, utilizing automated methods to analyze patterns and models for all kinds of data in applications ranging from scientific discovery to business analytics. This textbook for senior undergraduate and graduate courses provides a comprehensive, in-depth overview of data mining, machine learning and statistics, offering solid guidance for students, researchers, and practitioners.

60 Free Books on Big Data, Data Science, Data Mining, Machine Learning, Python, R, and more


Think Python: How to Think Like a Computer Scientist, Allen Downey, 2012
Automate the Boring Stuff with Python: Practical Programming for Total Beginners, Al Sweigart, 2015
Learn Python the Hard Way, Zed A. Shaw, 2013

Your Ultimate Data Mining & Machine Learning Cheat Sheet


Dimensionality reduction is the process of expressing high-dimensional data in fewer dimensions such that each retained dimension carries as much information as possible. It may be used to visualize high-dimensional data or to speed up machine learning models by removing low-information or correlated features. Principal Component Analysis, or PCA, is a popular method of reducing the dimensionality of data: it finds several orthogonal (perpendicular) vectors in the feature space and projects the data onto them. The number of vectors chosen is the number of dimensions the reduced data will have. For visualization, for example, it would be two.
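As a rough illustration of the PCA idea described above, here is a minimal sketch using NumPy's singular value decomposition. The function name `pca_reduce`, the parameter `n_components`, and the toy data are assumptions made for this example and do not come from the original cheat sheet; a library implementation such as scikit-learn's would normally be used in practice.

```python
import numpy as np


def pca_reduce(data, n_components):
    """Project the rows of `data` onto the top `n_components` principal directions.

    Hypothetical helper for illustration only.
    """
    X = np.asarray(data, dtype=float)
    # PCA works on mean-centered features.
    centered = X - X.mean(axis=0)
    # Rows of `vt` are orthonormal directions, ordered by decreasing singular value.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    # Variance of the data along each retained direction.
    explained_variance = singular_values[:n_components] ** 2 / (len(X) - 1)
    reduced = centered @ components.T
    return reduced, components, explained_variance


# Toy data (made up): 200 samples, 5 features, two of which are
# nearly redundant combinations of the first three.
rng = np.random.default_rng(0)
base = rng.normal(size=(200, 3))
data = np.column_stack([
    base,
    2.0 * base[:, 0] + 0.01 * rng.normal(size=200),
    base[:, 1] - base[:, 2],
])

# Two components, as one would choose for a 2-D scatter plot.
reduced, components, explained_variance = pca_reduce(data, n_components=2)
print(reduced.shape)
print(explained_variance)
```

The rows of `components` are mutually perpendicular unit vectors, and the columns of `reduced` are the coordinates of each sample along them, which is what would be plotted in a two-dimensional visualization.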

Network Based Pricing for 3D Printing Services in Two-Sided Manufacturing-as-a-Service Marketplace

This paper presents approaches to determine network based pricing for 3D printing services in the context of a two-sided manufacturing-as-a-service marketplace. The intent is to provide cost analytics that enable service bureaus to compete better in the market by moving away from ad-hoc and subjective prices. A data mining approach with machine learning methods is used to estimate a price range based on the profile characteristics of 3D printing service suppliers. The model considers factors such as supplier experience, supplier capabilities, customer reviews and ratings from past orders, and scale of operations, among others, to estimate a price range for suppliers' services. Data were gathered from existing marketplace websites and then used to train and test the model. The model achieves an accuracy of 65% for US based suppliers and 59% for Europe based suppliers in classifying a supplier's 3D printer listing into one of seven price categories. The improvement over the baseline accuracy of 25% demonstrates that machine learning based methods are promising for network based pricing in manufacturing marketplaces. Conventional methodologies for pricing services through activity based costing are inefficient for strategically pricing 3D printing service offerings in a connected marketplace. Rather than determining prices arbitrarily, this work proposes using data mining methods to estimate competitive prices. Such tools can be built into online marketplaces to help independent service bureaus determine their service rates.