Collaborating Authors



I would add HDT, Jackknife regression, density estimation, attribution modeling (to optimize marketing mix), linkage (in fraud detection), indexation (to create taxonomies or for clustering large data sets consisting of text), bucketisation, and time series algorithms.

6. Machine Learning Algorithms -- Python 3: from None to Machine Learning


Algorithms are often grouped by similarity in terms of their function (how they work), for example tree-based methods and neural network inspired methods. I think this is the most useful way to group algorithms, and it is the approach we will use here. The grouping is not perfect, however: some algorithms fit just as easily into multiple categories, such as Learning Vector Quantization, which is both a neural network inspired method and an instance-based method.
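To make the overlap concrete, here is a minimal sketch that stores the groups as sets and looks up algorithms that land in more than one of them. The `GROUPS` table and the helper names are hypothetical and only illustrative; the memberships other than Learning Vector Quantization are assumptions added for the example, not a list taken from the article.

```python
from collections import defaultdict

# Hypothetical, deliberately tiny taxonomy: group name -> algorithms in it.
GROUPS = {
    "tree-based": {"CART", "Random Forest"},
    "neural-network-inspired": {"Perceptron", "Learning Vector Quantization"},
    "instance-based": {"k-Nearest Neighbors", "Learning Vector Quantization"},
}


def groups_by_algorithm(groups):
    """Invert the taxonomy: algorithm name -> set of groups it belongs to."""
    index = defaultdict(set)
    for group, algorithms in groups.items():
        for name in algorithms:
            index[name].add(group)
    return dict(index)


def multi_group(groups):
    """Return only the algorithms that appear in more than one group."""
    return {
        name: sorted(member_of)
        for name, member_of in groups_by_algorithm(groups).items()
        if len(member_of) > 1
    }


overlaps = multi_group(GROUPS)
print(overlaps)
```

Keeping each group as a set rather than assigning every algorithm a single label is what allows an algorithm to sit in two groups at once.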

10 Best Books to Learn Data Structure and Algorithms in Java, Python, C, and C++


The current edition of this book is the 3rd Edition, and I strongly suggest that every programmer have it on their bookshelf, but only for short reading sessions and reference. It's not possible to finish this book in one sitting, and some of you may find it difficult to read as well, but don't worry: you can combine your learning with an online course like Data Structures and Algorithms: Deep Dive Using Java along with this book. This gives you the best of both worlds: you learn basic algorithms quickly in an online course and then cement that knowledge by going through the book, which will make more sense to you once you have gone through a course.

Step-By-Step Framework for Imbalanced Classification Projects


Classification predictive modeling problems involve predicting a class label for a given set of inputs. It is a challenging problem in general, especially if little is known about the dataset, as there are tens, if not hundreds, of machine learning algorithms to choose from. The problem is made significantly more difficult if the distribution of examples across the classes is imbalanced. This requires the use of specialized methods to either change the dataset or change the learning algorithm to handle the skewed class distribution. A common way to deal with feeling overwhelmed on a new classification project is to fall back on a favorite machine learning algorithm like Random Forest or a favorite data sampling technique like SMOTE. Another common approach is to scour the research literature for descriptions of vaguely similar problems and attempt to re-implement the algorithms and configurations that are described. These approaches can be effective, although they are hit-or-miss and time-consuming, respectively.
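The two routes mentioned above, changing the dataset or changing the learning algorithm, can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the framework the article goes on to describe: `random_oversample` duplicates minority-class rows at random (a much simpler stand-in for SMOTE, which synthesizes new points instead), and `balanced_class_weights` computes inverse-frequency weights that a cost-sensitive learner could use. Both function names and the toy data are hypothetical.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Change the dataset: duplicate rows of smaller classes at random
    until every class has as many examples as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        extra = [rng.choice(pool) for _ in range(target - n)]
        out_rows.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_rows, out_labels


def balanced_class_weights(labels):
    """Change the algorithm: weight each class by n_samples / (n_classes * count),
    so errors on rare classes cost more during training."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * n) for cls, n in counts.items()}


# Toy data (made up for illustration): 8 majority examples, 2 minority examples.
rows = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.6], [0.7], [0.8], [5.0], [5.2]]
labels = [0] * 8 + [1] * 2

bal_rows, bal_labels = random_oversample(rows, labels)
weights = balanced_class_weights(labels)
print(Counter(bal_labels), weights)
```

Oversampling should be applied to the training split only; resampling before the train/test split leaks duplicated rows into the evaluation data.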

The ethics of algorithms: Mapping the debate


In information societies, operations, decisions and choices previously left to humans are increasingly delegated to algorithms, which may advise, if not decide, about how data should be interpreted and what actions should be taken as a result. Examples abound. Profiling and classification algorithms determine how individuals and groups are shaped and managed (Floridi, 2012). Recommendation systems give users directions about when and how to exercise, what to buy, which route to take, and who to contact (Vries, 2010: 81). Data mining algorithms are said to show promise in helping make sense of emerging streams of behavioural data generated by the 'Internet of Things' (Portmess and Tower, 2014: 1). Online service providers continue to mediate how information is accessed with personalisation and filtering algorithms (Newell and Marabelli, 2015; Taddeo and Floridi, 2015). Machine learning algorithms automatically identify misleading, biased or inaccurate knowledge at the point of creation (e.g.