Data Science

Resilience and Vibrancy: The 2020 Data & AI Landscape


In a year like no other in recent memory, the data ecosystem is showing not just remarkable resilience but exciting vibrancy. When COVID hit the world a few months ago, an extended period of gloom seemed all but inevitable. Cloud and data technologies (data infrastructure, machine learning / artificial intelligence, data-driven applications) are at the heart of digital transformation. As a result, many companies in the data ecosystem have not just survived but thrived in an otherwise challenging political and economic context. Perhaps most emblematic of this is the blockbuster IPO of Snowflake, a data warehouse provider, which took place a couple of weeks ago and catapulted Snowflake to a $69B market cap at the time of writing, making it the biggest software IPO ever (see our S-1 teardown).

KDD 2020 Recognizes Winning Teams of 24th Annual KDD Cup


KDD Cup Track 3: Automated Machine Learning Competition – AutoML for Graph Representation Learning; KDD Cup Track 4: Reinforcement Learning …

The Best Free Data Science eBooks: 2020 Update - KDnuggets


Description: This book provides essential language and tools for understanding statistics, randomness, and uncertainty. The book explores a wide variety of applications and examples, ranging from coincidences and paradoxes to Google PageRank and Markov chain Monte Carlo (MCMC). Additional application areas explored include genetics, medicine, computer science, and information theory. The authors present the material in an accessible style and motivate concepts using real-world examples. Be prepared, it is a big book! Also, check out their great probability cheat sheet here.

Michael Cavaretta, Ph.D. posted on LinkedIn


How to understand the history of artificial intelligence in the popular press in five easy steps - 1. This technology is amazing! 2. We thought it was amazing, but it's actually terrible! We've moved on to something else. 5. Repeat. I've seen this for data mining, big data, machine learning and deep learning. What's the next AI technology that will be run through the cycle?

Harnessing big data and artificial intelligence to predict future pandemic spread


During COVID-19, artificial intelligence (AI) has been used to enhance diagnostic efforts, deliver medical supplies and even assess risk factors from blood tests. Now, artificial intelligence is being used to forecast future COVID-19 cases. Texas A&M University researchers, led by Dr. Ali Mostafavi, have developed a powerful deep-learning computational model that uses artificial intelligence and existing big data related to population activities and mobility to help predict the future spread of COVID-19 cases at a county level. The researchers published their results in IEEE Access. The spread of pandemics is influenced by complex relationships related to features including mobility, population activities and sociodemographic characteristics. However, typical mathematical epidemiological models only account for a small subset of relevant features.

14 open source tools to make the most of machine learning


Spam filtering, face recognition, recommendation engines -- when you have a large data set on which you'd like to perform predictive analysis or pattern recognition, machine learning is the way to go. The proliferation of free open source software has made machine learning easier to implement both on single machines and at scale, and in most popular programming languages. These open source tools include libraries for the likes of Python, R, C, Java, Scala, Clojure, JavaScript, and Go. Apache Mahout provides a way to build environments for hosting machine learning applications that can be scaled quickly and efficiently to meet demand. Mahout works mainly with another well-known Apache project, Spark. It was originally devised to work with Hadoop for running distributed applications, but has been extended to work with other distributed back ends such as Flink and H2O. Mahout uses a domain-specific language in Scala.

New Books and Resources for DSC Members


We are in the process of writing and adding new material (compact eBooks) exclusively available to our members, written in simple English by world-leading experts in AI, data science, and machine learning. We invite you to sign up here so you do not miss these free books. This book is intended for busy professionals working with data of any kind: engineers, BI analysts, statisticians, operations researchers, AI and machine learning professionals, economists, data scientists, biologists, and quants, ranging from beginners to executives. In about 300 pages and 28 chapters it covers many new topics, offering a fresh perspective on the subject, including rules of thumb and recipes that are easy to automate or integrate in black-box systems, as well as new model-free, data-driven foundations for statistical science and predictive analytics. The approach focuses on robust techniques; it is bottom-up (from applications to theory), in contrast to the traditional top-down approach. The material is accessible to practitioners with a one-year college-level exposure to statistics and probability.

Data Science to Accelerate Drug Discovery with Artificial Intelligence and Machine Learning, Says Frost & Sullivan


"Applying data science tools in healthcare, especially for drug discovery, has a huge potential to systematically change the entire existing practices and methods," said Aarthi Janakiraman, Technical Insights Research Manager at Frost & Sullivan. "Additionally, pharmaceutical companies and hospitals are adopting this system rapidly, and its application is going to be established in all branches of healthcare." Janakiraman added: "Integrating AI and ML methods into drug discovery pipelines would cut down cost and time, and increase the efficiency of the entire research and development (R&D) process. Going forward, big pharma and mid-sized biotech companies can benefit by partnering with core AI startups and reducing the costs involved in setting up their own capabilities."

Practical Artificial Intelligence (AI) with H2O in Python


Machine learning has finally come of age. With H2O software, you can perform machine learning and data analysis using a simple open source framework that's easy to use, has a wide range of OS and language support, and scales for big data. This hands-on guide teaches you how to use H2O with only minimal math and theory behind the learning algorithms. This course covers the main aspects of the H2O package for data science in Python. If you take this course, you can do away with taking other courses or buying books on Python-based data science, as you will have the keys to a very powerful Python-supported data science framework.