
Complete Machine Learning & Data Science Bootcamp 2022


This is a brand new Machine Learning and Data Science course, just launched and updated this month with the latest trends and skills for 2022! Become a complete Data Scientist and Machine Learning engineer! Join a live online community of 400,000 engineers and a course taught by industry experts who have actually worked for large companies in places like Silicon Valley and Toronto. Graduates of Andrei's courses are now working at Google, Tesla, Amazon, Apple, IBM, JP Morgan, Facebook, and other top tech companies. You will go from zero to mastery!

Papers to Read on using Artificial Intelligence with Rainfall


Abstract: We propose a didactic approach to using the Machine Learning protocol to perform weather forecasting. This study is motivated by the possibility of applying this method to predict weather conditions in proximity to the Etna and Stromboli volcanic areas, located in Sicily (southern Italy). Here the complex orography may significantly influence the weather conditions due to Stau and Foehn effects, with possible impact on the air traffic of the nearby Catania and Reggio Calabria airports. We first introduce a simple thermodynamic approach, suited to provide information on temperature and pressure when the Stau and Foehn effects take place. In order to gain information on the rainfall accumulation, the Machine Learning approach is presented: according to this protocol, the model is able to "learn" from a set of input data which are the meteorological conditions (in our case dry, light rain, moderate rain and heavy rain) associated with the rainfall, measured in mm.
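To make the didactic protocol concrete, here is a minimal supervised-classification sketch in the spirit of the abstract: synthetic meteorological features are mapped to the four rainfall classes with a nearest-centroid rule. All feature choices, centre values, and noise scales below are hypothetical illustrations, not the paper's data or model.

```python
import numpy as np

# The four rainfall classes named in the abstract.
CLASSES = ["dry", "light rain", "moderate rain", "heavy rain"]

rng = np.random.default_rng(0)

# Hypothetical per-class centres: (temperature C, pressure hPa, humidity %).
# Purely illustrative, not fitted to real Etna/Stromboli observations.
centres = np.array([
    [25.0, 1018.0, 40.0],   # dry
    [20.0, 1012.0, 65.0],   # light rain
    [17.0, 1006.0, 80.0],   # moderate rain
    [15.0, 1000.0, 92.0],   # heavy rain
])

def train(n_per_class=50):
    """Estimate per-class feature means from noisy synthetic samples."""
    means = []
    for c in centres:
        samples = c + rng.normal(scale=[2.0, 3.0, 5.0], size=(n_per_class, 3))
        means.append(samples.mean(axis=0))
    return np.array(means)

def predict(means, x):
    """Classify one observation by its nearest learned class mean."""
    d = np.linalg.norm(means - np.asarray(x), axis=1)
    return CLASSES[int(np.argmin(d))]

means = train()
print(predict(means, [24.5, 1017.0, 42.0]))  # near the "dry" centre
```

A real forecast model would of course be trained on measured station data and likely use a richer classifier; the sketch only shows the "learn classes from labelled meteorological inputs" shape of the protocol.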

The Application of Machine Learning Techniques for Predicting Match Results in Team Sport: A Review

Journal of Artificial Intelligence Research

Predicting the results of matches in sport is a challenging and interesting task. In this paper, we review a selection of studies from 1996 to 2019 that used machine learning for predicting match results in team sport. Considering both invasion sports and striking/fielding sports, we discuss commonly applied machine learning algorithms, as well as common approaches related to data and evaluation. Our study considers accuracies that have been achieved across different sports, and explores whether evidence exists to support the notion that outcomes of some sports may be inherently more difficult to predict. We also uncover common themes of future research directions and propose recommendations for future researchers. Although there remains a lack of benchmark datasets (apart from in soccer), and the differences between sports, datasets and features make between-study comparisons difficult, it is possible, as we discuss, to evaluate accuracy performance in other ways. Artificial Neural Networks were commonly applied in early studies; however, our findings suggest that a range of models should instead be compared. Selecting and engineering an appropriate feature set appears to be more important than having a large number of instances. For feature selection, we see potential for greater inter-disciplinary collaboration between sport performance analysis, a sub-discipline of sport science, and machine learning.

Predicting Decisions in Language Based Persuasion Games

Journal of Artificial Intelligence Research

Sender-receiver interactions, and specifically persuasion games, are widely researched in economic modeling and artificial intelligence, and serve as a solid foundation for powerful applications. However, in the classic persuasion games setting, the messages sent from the expert to the decision-maker are abstract or well-structured application-specific signals rather than natural (human) language messages, although natural language is a very common communication signal in real-world persuasion setups. This paper addresses the use of natural language in persuasion games, exploring its impact on the decisions made by the players and aiming to construct effective models for the prediction of these decisions. For this purpose, we conduct an online repeated interaction experiment. At each trial of the interaction, an informed expert aims to sell an uninformed decision-maker a vacation in a hotel, by sending her a review that describes the hotel. While the expert is exposed to several scored reviews, the decision-maker observes only the single review sent by the expert, and her payoff in case she chooses to take the hotel is a random draw from the review score distribution available to the expert only. The expert’s payoff, in turn, depends on the number of times the decision-maker chooses the hotel. We also compare the behavioral patterns in this experiment to the equivalent patterns in similar experiments where the communication is based on the numerical values of the reviews rather than the reviews’ text, and observe substantial differences which can be explained through an equilibrium analysis of the game. We consider a number of modeling approaches for our verbal communication setup, differing from each other in the model type (deep neural network (DNN) vs. linear classifier), the type of features used by the model (textual, behavioral or both) and the source of the textual features (DNN-based vs. hand-crafted). 
Our results demonstrate that given a prefix of the interaction sequence, our models can predict the future decisions of the decision-maker, particularly when a sequential modeling approach and hand-crafted textual features are applied. Further analysis of the hand-crafted textual features allows us to make initial observations about the aspects of text that drive decision making in our setup.

Nothing deep about deep learning


At Chicago, I recall undergraduate students gawking about deep learning to Professor Lafferty after class. I recall Professor Lafferty had hesitation in his voice at the time. It felt as though he was discussing a controversial, politically sensitive issue. At that time, we knew only a fraction of what we know now, and many of us were still wondering how deep learning could be anything more than non-linear regression. I had no motivation or curiosity to understand the subject, and even the trio at Stanford--the ones who gave us the best-selling ML book of all time--only put a few paragraphs in the first edition of their textbook saying just that.

Visually Grounded Models of Spoken Language: A Survey of Datasets, Architectures and Evaluation Techniques

Journal of Artificial Intelligence Research

This survey provides an overview of the evolution of visually grounded models of spoken language over the last 20 years. Such models are inspired by the observation that when children pick up a language, they rely on a wide range of indirect and noisy clues, crucially including signals from the visual modality co-occurring with spoken utterances. Several fields have made important contributions to this approach to modeling or mimicking the process of learning language: Machine Learning, Natural Language and Speech Processing, Computer Vision and Cognitive Science. The current paper brings together these contributions in order to provide a useful introduction and overview for practitioners in all these areas. We discuss the central research questions addressed, the timeline of developments, and the datasets which enabled much of this work. We then summarize the main modeling architectures and offer an exhaustive overview of the evaluation metrics and analysis techniques.

How to learn the maths of Data Science using your high school maths knowledge


This post is a part of my forthcoming book on the mathematical foundations of Data Science. In this post, we use the Perceptron algorithm to bridge the gap between high school maths and deep learning. As part of my role as course director of the Artificial Intelligence: Cloud and Edge Computing course at the University…, I see more students who are familiar with programming than with mathematics. They last learnt maths years ago at university. Then, suddenly, they encounter matrices, linear algebra, etc., when they start learning Data Science.
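The perceptron the post refers to really is high-school maths: a weighted sum followed by a threshold, with weights nudged towards misclassified examples. A minimal sketch (learning logical AND, a standard linearly separable toy problem):

```python
def predict(weights, bias, x):
    """Step activation applied to the weighted sum w.x + b."""
    s = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if s >= 0 else 0

def train(data, labels, lr=0.1, epochs=20):
    """Classic perceptron update rule: w += lr * (y - y_hat) * x."""
    weights = [0.0] * len(data[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(data, labels):
            err = y - predict(weights, bias, x)
            weights = [w + lr * err * xi for w, xi in zip(weights, x)]
            bias += lr * err
    return weights, bias

# Learn logical AND: output 1 only when both inputs are 1.
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [0, 0, 0, 1]
w, b = train(X, y)
print([predict(w, b, x) for x in X])  # -> [0, 0, 1] pattern: [0, 0, 0, 1]
```

Replacing the step function with a differentiable one and stacking such units is exactly the jump from this algorithm to the neural networks the post builds towards.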

Geometric Graph Representation Learning via Maximizing Rate Reduction

Artificial Intelligence

Learning discriminative node representations benefits various downstream tasks in graph analysis, such as community detection and node classification. Existing graph representation learning methods (e.g., based on random walk and contrastive learning) are limited to maximizing the local similarity of connected nodes. Such pair-wise learning schemes can fail to capture the global distribution of representations, since they place no explicit constraints on the global geometric properties of the representation space. To this end, we propose Geometric Graph Representation Learning (G2R) to learn node representations in an unsupervised manner via maximizing rate reduction. In this way, G2R maps nodes in distinct groups (implicitly stored in the adjacency matrix) into different subspaces, such that each subspace is compact and different subspaces are dispersedly distributed. G2R adopts a graph neural network as the encoder and maximizes the rate reduction with the adjacency matrix. Furthermore, we theoretically and empirically demonstrate that rate reduction maximization is equivalent to maximizing the principal angles between different subspaces. Experiments on real-world datasets show that G2R outperforms various baselines on node classification and community detection tasks.
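The rate reduction quantity referenced here comes from the coding-rate literature (maximal coding rate reduction): the coding rate of all embeddings minus the group-weighted rates of each group's embeddings. A rough numerical sketch of that quantity, assuming the common log-det form; the precision parameter `eps` and the grouping interface are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def rate(Z, eps=0.05):
    """Coding rate R(Z) = 1/2 * logdet(I + d/(n*eps^2) * Z @ Z.T)
    for a d x n matrix of n embeddings of dimension d."""
    d, n = Z.shape
    alpha = d / (n * eps ** 2)
    return 0.5 * np.linalg.slogdet(np.eye(d) + alpha * Z @ Z.T)[1]

def rate_reduction(Z, groups, eps=0.05):
    """Delta R = R(Z) - sum_j (n_j/n) * R(Z_j): the rate of the whole set
    minus the weighted rates of each group's columns. Maximizing it pushes
    groups into compact, mutually dispersed subspaces."""
    n = Z.shape[1]
    r_groups = sum((len(g) / n) * rate(Z[:, g], eps) for g in groups)
    return rate(Z) - r_groups
```

Intuitively, embeddings whose groups occupy orthogonal subspaces yield a much larger reduction than groups collapsed onto one direction, which matches the abstract's point about maximizing principal angles between subspaces.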

State-of-the-Art Review of Design of Experiments for Physics-Informed Deep Learning

Machine Learning

This paper presents a comprehensive review of the designs of experiments used in surrogate models. In particular, this study demonstrates the necessity of design-of-experiment schemes for the Physics-Informed Neural Network (PINN), which belongs to the supervised learning class. Many complex partial differential equations (PDEs) do not have any analytical solution; only numerical methods are used to solve the equations, which is computationally expensive. In recent decades, PINN has gained popularity as a replacement for numerical methods to reduce the computational budget. PINN uses physical information in the form of differential equations to enhance the performance of the neural networks. Though it works efficiently, the choice of the design-of-experiment scheme is important, as the accuracy of the responses predicted using PINN depends on the training data. In this study, five different PDEs are used for numerical purposes, i.e., the viscous Burgers' equation, the Schrödinger equation, the heat equation, the Allen-Cahn equation, and the Korteweg-de Vries equation. A comparative study is performed to establish the necessity of the selection of a DoE scheme. It is seen that the Hammersley sampling-based PINN performs better than other DoE sampling strategies.
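For readers unfamiliar with the winning scheme: a Hammersley point set is a low-discrepancy design whose first coordinate is i/n and whose remaining coordinates are radical inverses in successive prime bases. A minimal sketch of how such training/collocation points could be generated (the paper's exact setup, dimensions, and scaling to each PDE's domain may differ):

```python
def radical_inverse(i, base):
    """Van der Corput radical inverse: mirror the base-b digits of i
    about the radix point, giving a value in [0, 1)."""
    inv, f = 0.0, 1.0 / base
    while i > 0:
        i, digit = divmod(i, base)
        inv += digit * f
        f /= base
    return inv

def hammersley(n, dim):
    """n-point Hammersley set in [0, 1)^dim: first coordinate i/n,
    remaining coordinates radical inverses in the first primes."""
    primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29][:dim - 1]
    return [[i / n] + [radical_inverse(i, b) for b in primes]
            for i in range(n)]

# e.g. 2-D design points over a unit (x, t) domain for a PDE surrogate
points = hammersley(8, 2)
```

Unlike pseudo-random sampling, these points fill the domain evenly, which is the usual argument for their better PINN training accuracy.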

Private Adaptive Optimization with Side Information

Machine Learning

Adaptive optimization methods have become the default solvers for many machine learning tasks. Unfortunately, the benefits of adaptivity may degrade when training with differential privacy, as the noise added to ensure privacy reduces the effectiveness of the adaptive preconditioner. To this end, we propose AdaDPS, a general framework that uses non-sensitive side information to precondition the gradients, allowing the effective use of adaptive methods in private settings. We formally show AdaDPS reduces the amount of noise needed to achieve similar privacy guarantees, thereby improving optimization performance. Empirically, we leverage simple and readily available side information to explore the performance of AdaDPS in practice, comparing to strong baselines in both centralized and federated settings. Our results show that AdaDPS improves accuracy by 7.7% (absolute) on average -- yielding state-of-the-art privacy-utility trade-offs on large-scale text and image benchmarks.
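The general idea in the abstract can be sketched as: use second-moment statistics from non-sensitive (public) data to precondition each gradient, then apply the usual differentially private clip-and-noise step. The function below is a hypothetical illustration of that shape only; the names, hyperparameters, and exact order of operations are assumptions, not the paper's AdaDPS algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

def private_preconditioned_step(w, grad, side_moment, lr=0.1,
                                clip=1.0, noise_mult=1.0, eps=1e-8):
    """One SGD-style step: precondition with second-moment estimates from
    NON-SENSITIVE side information, then clip and add Gaussian noise.

    w           -- current parameters
    grad        -- gradient computed on private data
    side_moment -- squared-gradient estimates from public/side data
    """
    g = grad / (np.sqrt(side_moment) + eps)      # adaptivity from public data
    g = g / max(1.0, np.linalg.norm(g) / clip)   # clip to bound sensitivity
    g = g + rng.normal(scale=noise_mult * clip, size=g.shape)  # DP noise
    return w - lr * g
```

Because the preconditioner depends only on non-sensitive data, it consumes no privacy budget, which is the abstract's argument for recovering adaptivity at the same privacy guarantee.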