Big Data: Overviews


Top September Stories: Essential Math for Data Science: Why and How; Machine Learning Cheat Sheets

#artificialintelligence

Here are the most popular posts in KDnuggets in September, based on the number of unique page views (UPV), and social share counts from Facebook, Twitter, and Addthis.

Most Shareable (Viral) Blogs

Among the top blogs, here are the 5 blogs with the highest ratio of shares/unique views, which suggests that people who read it really liked it.

You Aren't So Smart: Cognitive Biases are Making Sure of It, by Matthew Mayo
A Winning Game Plan For Building Your Data Science Team, by William Schmarzo
What on earth is data science?, by Cassie Kozyrkov
Everything You Need to Know About AutoML and Neural Architecture Search, by George Seif
The Data Science of "Someone Like You" or Sentiment Analysis of Adele's Songs, by Preetish Panda

How many data scientists are there and is there a shortage?, by Gregory Piatetsky
Neural Networks and Deep Learning: A Textbook, by Charu Aggarwal
5 Resources to Inspire Your Next Data Science Project, by Conor Dewey
Hadoop for Beginners, by Aafreen Dabhoiwala
6 Steps To Write Any Machine Learning Algorithm From Scratch: Perceptron Case Study, by John Sullivan
Deep Learning for NLP: An Overview of Recent Trends, by Elvis Saravia (*)
Ultimate Guide to Getting Started with TensorFlow, by Brian Zhang (*)
How many data scientists are there and is there a shortage?, by Gregory Piatetsky
Essential Math for Data Science: 'Why' and 'How', by Tirthajyoti Sarkar
Journey to Machine Learning - 100 Days of ML Code, by Avik Jain
You Aren't So Smart: Cognitive Biases are Making Sure of It, by Matthew Mayo
Neural Networks and Deep Learning: A Textbook, by Charu Aggarwal (*)
You Aren't So Smart: Cognitive Biases are Making Sure of It, by Matthew Mayo
How many data scientists are there and is there a shortage?, by Gregory Piatetsky
You Aren't So Smart: Cognitive Biases are Making Sure of It, by Matthew Mayo
A Winning Game Plan For Building Your Data Science Team, by William Schmarzo
What on earth is data science?, by Cassie Kozyrkov
Everything You Need to Know About AutoML and Neural Architecture Search, by George Seif
The Data Science of "Someone Like You" or Sentiment Analysis of Adele's Songs, by Preetish Panda
You Aren't So Smart: Cognitive Biases are Making Sure of It, by Matthew Mayo
What on earth is data science?, by Cassie Kozyrkov
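
The shares/unique views ratio used for the "Most Shareable" ranking can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not KDnuggets' actual method: the `Post` type, the function names, and all of the counts below are hypothetical and invented for the example.

```python
from typing import NamedTuple


class Post(NamedTuple):
    """Hypothetical record for one blog post."""
    title: str
    unique_views: int
    shares: int


def share_ratio(post: Post) -> float:
    """Shares per unique page view; 0.0 when a post has no recorded views."""
    if post.unique_views == 0:
        return 0.0
    return post.shares / post.unique_views


def most_shareable(posts: list[Post], top_n: int = 5) -> list[Post]:
    """Return the top_n posts ordered by descending shares/unique-views ratio."""
    return sorted(posts, key=share_ratio, reverse=True)[:top_n]


# Invented counts, for illustration only.
posts = [
    Post("Post A", unique_views=10_000, shares=500),
    Post("Post B", unique_views=2_000, shares=300),
    Post("Post C", unique_views=50_000, shares=1_000),
]

for post in most_shareable(posts, top_n=2):
    print(f"{post.title}: {share_ratio(post):.2f}")
```

Note that a post with modest traffic ("Post B" here) can top this ranking, because the ratio rewards how often readers share rather than how many readers there are.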


Machine learning and AI – ensuring fairness in smart cities

#artificialintelligence

Digital technologies and AI offer a new wave of opportunities to turn data into actionable insights – creating a balance between social, environmental, and economic opportunities. In 2018, it's safe to say that the Internet, the World Wide Web, and the myriad of technologies derived from their development are all here to stay. With the ceaseless amalgamation of these various innovations, engineers are creating a cyber-physical world where pervasively interconnected objects, things, and processes can potentially unlock a breadth of unprecedented opportunities. However, I should point out that encapsulating the entire medley of possibilities afforded by these technologies is a considerable endeavour requiring a far longer and more comprehensive overview – perhaps in the form of a book, or three – than this article can offer in isolation. More specifically, I'll be focusing on the potential for us to optimally – and transparently – manage and operate city-wide infrastructure.


Big data in GIS environment - Geospatial World

#artificialintelligence

GIS is a virtual world, one represented by points, polygons, lines, and graphs. Processing these datasets has been a challenge since the day GIS was established as a field. Processing huge volumes of data has been a long-standing problem not only in traditional Information Technology (IT) sectors but also in the geospatial domain. However, recent developments in both hardware and software infrastructure have enabled the processing of huge data sets. This has given a big push and a new direction to industries that were held back by slow data processing capabilities.


Artificial Intelligence, Machine Learning and Big Data - A Comprehensive Report

#artificialintelligence

Artificial Intelligence and Machine Learning are the hottest jobs in the industry right now. For instance, did you know that more than 50,000 positions related to Data and Analytics are currently vacant in India? We are excited to release a comprehensive report together with Great Learning on how AI, ML and Big Data are changing and evolving the world around us. Additionally, this report aims to provide an overview of the kind of career opportunities available in these fields right now, and the different roles we might see in the future. The aim behind creating this report is to provide our Data Science community with the context of changes happening at a macro level, and how they can best prepare for these upcoming changes.


A Primer on Causality in Data Science

arXiv.org Machine Learning

Many questions in Data Science are fundamentally causal in that our objective is to learn the effect of some exposure (randomized or not) on an outcome of interest. Even studies that are seemingly non-causal (e.g. prediction or prevalence estimation) have causal elements, such as differential censoring or measurement. As a result, we, as Data Scientists, need to consider the underlying causal mechanisms that gave rise to the data, rather than simply the pattern or association observed in the data. In this work, we review the "Causal Roadmap", a formal framework to augment our traditional statistical analyses in an effort to answer the causal questions driving our research. Specific steps of the Roadmap include clearly stating the scientific question, defining the causal model, translating the scientific question into a causal parameter, assessing the assumptions needed to translate the causal parameter into a statistical estimand, implementing statistical estimators (including parametric and semi-parametric methods), and interpreting our findings. Throughout, we focus on the effect of an exposure occurring at a single time point and provide extensions to more advanced settings.
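
The gap between an observed association and a causal effect can be sketched with a small worked example. The code below is an illustration chosen for this listing, not the paper's own implementation: it assumes a single binary confounder W, a binary exposure A, and a binary outcome Y, and it assumes no unmeasured confounding and positivity. Under those assumptions the average effect of A on Y corresponds to the statistical estimand sum over w of P(W=w) * (E[Y | A=1, W=w] - E[Y | A=0, W=w]), which is estimated here by a simple plug-in (standardization) estimator. All counts and function names are hypothetical.

```python
from collections import defaultdict


def build_rows(cells):
    """Expand (w, a, n, n_events) cells into individual (w, a, y) rows."""
    rows = []
    for w, a, n, n_events in cells:
        rows += [(w, a, 1)] * n_events + [(w, a, 0)] * (n - n_events)
    return rows


def naive_difference(rows):
    """Unadjusted association: E[Y | A=1] - E[Y | A=0]."""
    def mean_y(a_val):
        ys = [y for _, a, y in rows if a == a_val]
        return sum(ys) / len(ys)
    return mean_y(1) - mean_y(0)


def standardized_difference(rows):
    """Plug-in estimate of sum_w P(W=w) * (E[Y|A=1,W=w] - E[Y|A=0,W=w])."""
    totals = defaultdict(int)    # (w, a) -> number of rows
    events = defaultdict(int)    # (w, a) -> number of rows with y == 1
    w_counts = defaultdict(int)  # w -> number of rows
    for w, a, y in rows:
        totals[(w, a)] += 1
        events[(w, a)] += y
        w_counts[w] += 1

    estimate = 0.0
    for w, n_w in w_counts.items():
        # Positivity: both exposure levels must be observed in every stratum.
        if totals[(w, 1)] == 0 or totals[(w, 0)] == 0:
            raise ValueError(f"positivity violated in stratum W={w}")
        diff = events[(w, 1)] / totals[(w, 1)] - events[(w, 0)] / totals[(w, 0)]
        estimate += (n_w / len(rows)) * diff
    return estimate


# Invented counts: (W, A, group size, number with Y=1).
# W raises both the chance of exposure and the chance of the outcome.
rows = build_rows([
    (0, 1, 20, 8),
    (0, 0, 80, 16),
    (1, 1, 80, 64),
    (1, 0, 20, 12),
])

print(f"naive difference:        {naive_difference(rows):.2f}")         # 0.44
print(f"standardized difference: {standardized_difference(rows):.2f}")  # 0.20
```

In this made-up data the exposure raises the outcome probability by 0.20 within each stratum of W, but the unadjusted comparison reports 0.44 because exposed individuals are concentrated in the higher-risk stratum. Whether the adjusted number deserves a causal interpretation depends entirely on the stated assumptions, which is the point of writing them down explicitly.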


AI, Machine Learning and Data Science Roundup: August 2018

#artificialintelligence

This is an eclectic collection of interesting blog posts, software announcements and data applications I've noted over the past month or so. ONNX Model Zoo is now available, providing a library of pre-trained state-of-the-art models in deep learning in the ONNX format. In the 2018 IEEE Spectrum Top Programming Languages rankings, Python takes the top spot and R ranks #7. Julia 1.0 has been released, marking the stabilization of the scientific computing language and promising forward compatibility. Google announces Cloud AutoML, a beta service to train vision, text categorization, or language translation models from provided data.


Blog

#artificialintelligence

The Dragonfly Machine Learning Engine (MLE) provides the machine learning and data science capabilities included within OPNids. Data science and machine learning promise to counteract the dynamic threat environment created by growing network traffic and increasing threat actor sophistication. This post provides an overview of the MLE itself, the reasons why data science and cybersecurity go together, and some insight into using the MLE as part of the OPNids system. The Dragonfly MLE provides a powerful framework for deploying anomaly detection algorithms, threat intelligence lookups, and machine learning predictions within a network security infrastructure.


Taking the pulse of machine learning adoption - ZDNet

#artificialintelligence

A few months back, we gave our take on a survey from the O'Reilly folks regarding interest in deep learning. The survey reported that interest was more than latent, but there's little question that the bulk of the action today is in the (relatively) better understood confines of machine learning (ML). So on this go round, O'Reilly jumped into the shallower side of the pond to survey the people who subscribe to its publications and go to its big data-related Strata and AI conferences regarding ML. Before diving in, let's put some perspective on this cohort: it's likely a group that on average is ahead of the curve by virtue of its attendance at these big data events or consumption of O'Reilly learning services that are skewing increasingly toward the AI domain. Nonetheless, it provides a useful counterpoint to their earlier work exploring interest in deep learning.


Machine Learning and Data Science Redefining the African Continent

#artificialintelligence

As technology continues to evolve and new developments arise from the need to deliver an excellent digital consumer experience efficiently, more women are taking charge of this evolution through local communities built to share resources and current trends in data science and machine learning. The WiMLDS community, which comprises data scientists and machine learners, aims to increase the representation of women data scientists in the tech space. The lack thereof presented an opportunity to build up this local community, where the majority are self-taught and are therefore able to keep up with rapidly advancing technology. "We are all largely self taught so we found each other while looking for data science and machine learning communities to aid our learning journeys. There was no such community in existence and the opportunity presented itself to start a local chapter of Women in Machine Learning and Data Science. So we jumped at it and now it has been almost 2 years," says Kathleen Siminyu, Head of Data Science at Africa's Talking.