It is, however, a thoughtful introduction to and overview of machine-learning methods, one that keeps the context and life cycle of an ML project in view and stays hands-on with small Python examples, while managing not to fall into catalogue mode. I have seen other books try this before. "Doing Data Science" by O'Neil and Schutt comes to mind first: long on enthusiasm but a little short on quality. Then there is Manning's own "Practical Data Science with R" by Zumel and Mount. Among the three, RWML looks like a clear winner. If I had to pick on something, I would register disappointment with the book's one extended exercise, based on the NYC taxi dataset.
Python and R are among the most frequently mentioned skills in job postings for data science positions. But reports on which programming language is actually used most often on the job for these professionals are conflicting, according to a Thursday report from Cloud Academy. The TIOBE Programming Community Index shows R as being on a downward trend this year in terms of search engine requests. However, a Kaggle survey of 16,000 data professionals found that while Python was the most popular programming language overall, statisticians and data scientists were more likely to report using R at work than other roles. Among data scientists, 87% reported using Python and 71% reported using R at work, that report found.
Another great resource for learning R. While it is frustrating that all these books cover the same basic information, they each cover it slightly differently. This book, coming from RStudio's chief trainer, is a well-designed book that covers many aspects not covered as well by other books. R as a programming language has also evolved so much over the past five years that I find the newer books a better start for beginners, not that the classics should be skipped. This book has a cleaner, narrower focus and is a great fit for someone new to R. It uses fewer libraries, and the libraries it does use are clean and make working with R easier. Also, I couldn't imagine working with R without RStudio, and this book shows shortcuts for the language's best IDE, which is free for personal use.
Despite the success of neural networks (NNs), there is still a concern among many over their "black box" nature. Why do they work? Here we present a simple analytic argument that NNs are in fact essentially polynomial regression models. This view has various implications for NNs: for example, it explains why convergence problems arise in NNs and gives rough guidance on avoiding overfitting. In addition, we use this phenomenon to predict and confirm a multicollinearity property of NNs not previously reported in the literature. Most importantly, given this loose correspondence, one may choose to routinely use polynomial models instead of NNs, thus avoiding some major problems of the latter, such as having to set many tuning parameters and dealing with convergence issues. We present a number of empirical results; in each case, the accuracy of the polynomial approach matches or exceeds that of NN approaches. A many-featured, open-source software package, polyreg, is available.
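To make the claim concrete, here is a minimal sketch of the kind of substitution the abstract proposes — fitting a plain polynomial regression where one might otherwise reach for a small NN. This is illustrative only and does not use the authors' polyreg package; the toy data, the degree choice, and the scikit-learn pipeline are all my own assumptions.

```python
# Hypothetical example (not from the paper): polynomial regression as a
# drop-in alternative to a small neural network on a toy regression task.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(200, 2))
# Target with an interaction term and a quadratic term, plus noise
y = 1.5 * X[:, 0] * X[:, 1] + X[:, 0] ** 2 + rng.normal(0, 0.1, 200)

# Polynomial regression: expand features to degree 2, then ordinary
# least squares. Note there are no tuning parameters beyond the degree,
# and no iterative convergence issues -- the fit is a closed-form solve.
model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
model.fit(X, y)
r2 = model.score(X, y)  # should be close to 1 on this quadratic target
print(r2)
```

The degree of the expansion plays a role loosely analogous to network depth: raising it adds higher-order interaction terms, which is also where the multicollinearity the abstract mentions tends to appear.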