Collaborating Authors

Data Mining

KDD 2020 Recognizes Winning Teams of 24th Annual KDD Cup


KDD Cup Track 3: Automated Machine Learning Competition – AutoML for Graph Representation Learning; KDD Cup Track 4: Reinforcement Learning …

Michael Cavaretta, Ph.D. posted on LinkedIn


How to understand the history of artificial intelligence in the popular press in five easy steps - 1. This technology is amazing! 2. We thought it was amazing, but it's actually terrible! 3. We've moved on to something else. 4. Repeat. I've seen this for data mining, big data, machine learning and deep learning. What's the next AI technology that will be run through the cycle?

Harnessing big data and artificial intelligence to predict future pandemic spread


During COVID-19, artificial intelligence (AI) has been used to enhance diagnostic efforts, deliver medical supplies and even assess risk factors from blood tests. Now, AI is being used to forecast future COVID-19 cases. Texas A&M University researchers, led by Dr. Ali Mostafavi, have developed a deep-learning computational model that uses existing big data on population activities and mobility to help predict the future spread of COVID-19 cases at the county level. The researchers published their results in IEEE Access. The spread of pandemics is influenced by complex relationships among features including mobility, population activities and sociodemographic characteristics. However, typical mathematical epidemiological models account for only a small subset of the relevant features.

14 open source tools to make the most of machine learning


Spam filtering, face recognition, recommendation engines -- when you have a large data set on which you'd like to perform predictive analysis or pattern recognition, machine learning is the way to go. The proliferation of free open source software has made machine learning easier to implement, both on single machines and at scale, and in most popular programming languages. These open source tools include libraries for the likes of Python, R, C++, Java, Scala, Clojure, JavaScript, and Go. Apache Mahout provides a way to build environments for hosting machine learning applications that can be scaled quickly and efficiently to meet demand. Mahout works mainly with another well-known Apache project, Spark. It was originally devised to work with Hadoop for the sake of running distributed applications, but has been extended to work with other distributed back ends like Flink and H2O. Mahout uses a domain-specific language written in Scala.
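To make the spam-filtering example concrete, here is a minimal sketch of a multinomial Naive Bayes text classifier in pure Python. It is illustrative only -- the class name, training phrases, and word-splitting tokenizer are assumptions for the sketch, and the open source libraries discussed in the article provide far more robust implementations.

```python
import math
from collections import Counter

class NaiveBayesSpamFilter:
    """Toy multinomial Naive Bayes classifier for spam vs. ham."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.label_counts = Counter()

    def train(self, text, label):
        # Count words per class; whitespace tokenization keeps the sketch simple.
        self.label_counts[label] += 1
        self.word_counts[label].update(text.lower().split())

    def predict(self, text):
        vocab = len(set(self.word_counts["spam"]) | set(self.word_counts["ham"]))
        scores = {}
        for label in ("spam", "ham"):
            # Log prior from class frequencies.
            score = math.log(self.label_counts[label] / sum(self.label_counts.values()))
            total = sum(self.word_counts[label].values())
            for word in text.lower().split():
                # Laplace smoothing so unseen words don't zero out the score.
                score += math.log((self.word_counts[label][word] + 1) / (total + vocab))
            scores[label] = score
        return max(scores, key=scores.get)

filt = NaiveBayesSpamFilter()
filt.train("win free money now", "spam")
filt.train("free prize claim now", "spam")
filt.train("meeting notes for tomorrow", "ham")
filt.train("project schedule and notes", "ham")
print(filt.predict("claim your free money"))  # -> spam
```

The same log-probability structure is what library implementations compute, just with real tokenizers, sparse matrices, and distributed training behind the scenes.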

New Books and Resources for DSC Members


We are in the process of writing and adding new material (compact eBooks) exclusively available to our members, written in simple English by world-leading experts in AI, data science, and machine learning. We invite you to sign up here so you don't miss these free books. This book is intended for busy professionals working with data of any kind: engineers, BI analysts, statisticians, operations researchers, AI and machine learning professionals, economists, data scientists, biologists, and quants, ranging from beginners to executives. In about 300 pages and 28 chapters it covers many new topics, offering a fresh perspective on the subject, including rules of thumb and recipes that are easy to automate or integrate into black-box systems, as well as new model-free, data-driven foundations for statistical science and predictive analytics. The approach focuses on robust techniques; it is bottom-up (from applications to theory), in contrast to the traditional top-down approach. The material is accessible to practitioners with one year of college-level exposure to statistics and probability.



Woodgrove Bank, which provides payment processing services for commerce, is looking to design and implement a proof-of-concept (PoC) of an innovative fraud detection solution. They want to provide new services to their merchant customers, helping them save costs by applying machine learning and advanced analytics to detect fraudulent transactions. Their customers are located around the world, so the right solution minimizes latency by distributing as much of the solution as possible, as close as possible, to the regions in which customers use the service. In this workshop, you will learn to design a data pipeline solution that leverages Cosmos DB for both the scalable ingest of streaming data and the globally distributed serving of both pre-scored data and machine learning models. The solution leverages the Cosmos DB change feed in concert with Azure Databricks Delta to enable a modern data warehouse solution that can be used to build risk reduction solutions, scoring transactions for fraud both in an offline, batch approach and in a near real-time, request/response approach.
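The two scoring paths the workshop describes -- offline batch pre-scoring and near real-time request/response scoring -- can be sketched in a few lines of plain Python. This is a hedged illustration, not the workshop's actual code: the toy logistic scorer, its weights, and the transaction field names (`amount`, `is_foreign`, `night_hour`) are all assumptions standing in for whatever model the pipeline would train and serve.

```python
import math

# Illustrative model parameters (assumptions, not from the workshop).
WEIGHTS = {"amount": 0.002, "is_foreign": 1.5, "night_hour": 0.8}
BIAS = -4.0
THRESHOLD = 0.5

def score_transaction(txn):
    """Return a fraud probability for one transaction dict (toy logistic model)."""
    z = BIAS + sum(WEIGHTS[k] * txn.get(k, 0) for k in WEIGHTS)
    return 1 / (1 + math.exp(-z))

def score_batch(transactions):
    """Offline path: pre-score a batch, e.g. documents read from a change feed."""
    return [dict(txn, fraud_score=score_transaction(txn)) for txn in transactions]

def score_request(txn):
    """Near real-time path: score a single incoming transaction on request."""
    p = score_transaction(txn)
    return {"fraud_score": p, "flagged": p >= THRESHOLD}

batch = score_batch([
    {"amount": 25, "is_foreign": 0, "night_hour": 0},
    {"amount": 2400, "is_foreign": 1, "night_hour": 1},
])
print(score_request({"amount": 2400, "is_foreign": 1, "night_hour": 1}))
```

In the architecture described above, `score_batch` would run as a Databricks job over change-feed data while `score_request` sits behind a low-latency endpoint, with both reading the same served model.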

Here are "Programming" Ivy League courses you can take online right now for free


We’re in the business of helping companies unlock competitive advantages through the use of advanced analytics tools, data science solutions, machine learning and artificial intelligence.

Feature Extraction for Graphs


Heads up: I've structured this article similarly to the Graph Representation Learning book by William L. Hamilton [1]. One of the simplest ways to capture information from graphs is to create individual features for each node. These features can capture information both from a node's close neighbourhood and, using iterative methods, from a more distant K-hop neighbourhood. Node degree is a simple metric, defined as the number of edges incident to a node. It is often used to initialize algorithms that generate more complex graph-level features, such as the Weisfeiler-Lehman kernel.
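The two ideas above -- node degree as a base feature, and iterative refinement to fold in K-hop neighbourhood information -- can be sketched in pure Python. The small adjacency-list graph below is an assumed example; a library such as networkx would normally manage the graph structure, and this loop is a simplified Weisfeiler-Lehman-style relabelling, not a full kernel implementation.

```python
# Example graph as adjacency lists (undirected; an illustrative assumption).
adjacency = {
    "a": ["b", "c"],
    "b": ["a", "c", "d"],
    "c": ["a", "b"],
    "d": ["b"],
}

def node_degrees(adj):
    """Degree = number of edges incident to each node."""
    return {node: len(neighbours) for node, neighbours in adj.items()}

def wl_labels(adj, iterations=2):
    """Iteratively relabel each node from its own label plus the sorted
    multiset of its neighbours' labels; each iteration pulls in roughly
    one more hop of neighbourhood information."""
    labels = {node: str(deg) for node, deg in node_degrees(adj).items()}
    for _ in range(iterations):
        labels = {
            node: labels[node] + "|" + ",".join(sorted(labels[n] for n in adj[node]))
            for node in adj
        }
    return labels

print(node_degrees(adjacency))   # {'a': 2, 'b': 3, 'c': 2, 'd': 1}
print(wl_labels(adjacency, 1))
```

The degree dictionary is exactly the initialization the article mentions; the WL kernel then compares graphs by how these iteratively refined labels are distributed.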

Predictive engineering analytics, big data and the future of design


By combining physics-based simulations, data mining, statistical modelling and machine learning techniques, predictive engineering analytics can analyse patterns in the data to construct models of how the systems that generated the data behave. IoT devices and sensors are already transforming products, and mining the stream of information they produce will be critical for maintaining products and designing their replacements. For many industries, the products they create are no longer purely mechanical; they're complex devices combining mechanical and electrical controls. That means engineering different systems, and the ways they interface with each other and with the outside world. At one level you're coping with electromechanical controls; at another, you're creating a design that covers the cooling requirements for the electronics.