Collaborating Authors


Active Learning overview: Strategies and Uncertainty Measures.


Active learning is also referred to as optimal experimental design. It reduces labeling effort by prioritizing which examples the experts should label. The key idea behind active learning is that a machine learning algorithm can achieve greater accuracy with fewer training labels if it is allowed to choose the data from which it learns. Instead of collecting labels for all the data at once, active learning identifies the data the model is most confused about and requests labels for just those examples. The model then trains on that small amount of labeled data, and then asks again for labels for the most confusing remaining data.
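
The "most confused" selection step described above can be sketched as follows. This is a minimal illustration, assuming the model exposes class probabilities and that confusion is measured by predictive entropy (one common uncertainty measure, not necessarily the one the article uses). The names `entropy_uncertainty` and `select_queries` are hypothetical.

```python
import numpy as np

def entropy_uncertainty(probs):
    """Predictive entropy per row; higher means the model is less certain."""
    probs = np.clip(probs, 1e-12, 1.0)  # avoid log(0)
    return -np.sum(probs * np.log(probs), axis=1)

def select_queries(probs, k):
    """Return the indices of the k unlabeled examples with the highest entropy."""
    scores = entropy_uncertainty(probs)
    return np.argsort(-scores, kind="stable")[:k]

# Hypothetical predicted class probabilities for four unlabeled examples.
probs = np.array([
    [0.98, 0.02],
    [0.55, 0.45],
    [0.80, 0.20],
    [0.50, 0.50],
])

# The two rows closest to a 50/50 split are chosen for labeling.
query_idx = select_queries(probs, k=2)
```

In a full loop, the selected examples would be sent to an expert, added to the labeled set, and the model retrained before the next round of selection.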

How I made my first SciPy contribution


Making contributions to famous, open source repositories popular with practitioners in your field is a great way to demonstrate expertise. Being able to say, for instance: "if you use this method in SciPy, you're using my code" is certain to get attention and make you stand out (we will be using SciPy as an example, but this applies to any popular library). In addition, you will find many top-notch researchers in fields ranging from AI to distributed computing mention such contributions on their profiles and resumes. But if doing this were easy, everyone would be doing it. For example, there are about a thousand total contributors to SciPy.

Linear Regression: Mathematical Intuition


Since the start of your data science journey, you have probably been familiar with this machine learning algorithm. Linear regression is the basic, foundational algorithm we generally start with when analysing regression problems. As the word linear suggests, it assumes a linear relationship between the input variables (x) and the dependent output variable (y). Linear regression predicts the output variable by modelling its relationship with the independent variables (x). The best output is found by fitting the predicted line as closely as possible to the best-fit line.
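
Fitting a line to data can be sketched with ordinary least squares. This is a minimal example, assuming a single input variable and NumPy's `np.linalg.lstsq` solver; the helper name `fit_line` and the toy data are illustrative only.

```python
import numpy as np

def fit_line(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    # Design matrix with a column of ones for the intercept term.
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[0], coef[1]

# Toy data generated from y = 2x + 1 with no noise.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 * x + 1.0

intercept, slope = fit_line(x, y)
```

On noise-free data the fit recovers the generating line; with real data it returns the line that minimises the sum of squared residuals.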

What's the Difference Between a Metric and a Loss Function?


Have you been using your loss function for evaluating your machine learning system's performance? That's a mistake, but don't worry, you're not alone. It's a widespread misunderstanding that may have something to do with software defaults, college course format, and decision-maker absenteeism in AI. In this article, I'll explain why you need two separate model scoring functions for evaluation and optimization… and possibly a third one for statistical testing. Throughout data science, you'll see scoring functions (like the MSE, for example) being used for three main purposes: evaluation, optimization, and statistical testing. These three are subtly, but importantly, different from one another, so let's take a deeper look at what makes a function "good" for each purpose.
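
A small sketch can make the distinction concrete. It assumes log loss as the optimization function and accuracy as the evaluation metric, which are common choices but not ones the article specifies. The two sets of predictions are hypothetical.

```python
import numpy as np

def log_loss(y_true, p):
    """Binary cross-entropy: a smooth function suited to optimization."""
    p = np.clip(p, 1e-12, 1 - 1e-12)  # avoid log(0)
    return float(-np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p)))

def accuracy(y_true, p, threshold=0.5):
    """Fraction of correct decisions: a metric suited to evaluation."""
    return float(np.mean((p >= threshold) == (y_true == 1)))

y_true = np.array([1, 0, 1, 0])
p_confident = np.array([0.9, 0.1, 0.9, 0.1])  # hypothetical model A
p_hesitant = np.array([0.6, 0.4, 0.6, 0.4])   # hypothetical model B

# Both models make identical decisions, so the metric cannot tell them apart,
# while the loss still separates them.
same_accuracy = accuracy(y_true, p_confident) == accuracy(y_true, p_hesitant)
loss_gap = log_loss(y_true, p_hesitant) - log_loss(y_true, p_confident)
```

The two functions answer different questions: the loss guides training, while the metric reports how useful the resulting decisions are.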

Inductive Logic Programming At 30: A New Introduction

Journal of Artificial Intelligence Research

Inductive logic programming (ILP) is a form of machine learning. The goal of ILP is to induce a hypothesis (a set of logical rules) that generalises training examples. As ILP turns 30, we provide a new introduction to the field. We introduce the necessary logical notation and the main learning settings; describe the building blocks of an ILP system; compare several systems on several dimensions; describe four systems (Aleph, TILDE, ASPAL, and Metagol); highlight key application areas; and, finally, summarise current limitations and directions for future research.

3 Ways Understanding Bayes Theorem Will Improve Your Data Science - KDnuggets


Bayes Theorem gives us a way of updating our beliefs in light of new evidence, taking into account the strength of our prior beliefs. Applying Bayes Theorem, you seek to answer the question: what is the probability of my hypothesis in light of new evidence? In this article, we'll talk about three ways that Bayes Theorem can improve your practice of Data Science. By the end, you'll possess a deep understanding of the foundational concept. Bayes Theorem provides a structure for testing a hypothesis, taking into account the strength of prior assumptions and the new evidence. This process is referred to as Bayesian Updating.
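
Bayesian updating can be sketched for a single binary hypothesis. The function name `bayes_update` and the numbers below are illustrative assumptions, not taken from the article.

```python
def bayes_update(prior, likelihood, false_positive_rate):
    """Posterior P(H | E) from P(H), P(E | H), and P(E | not H)."""
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence

# Hypothetical numbers: a 1% prior, evidence seen 90% of the time when the
# hypothesis is true and 5% of the time when it is false.
first = bayes_update(prior=0.01, likelihood=0.9, false_positive_rate=0.05)

# The posterior becomes the new prior when a second, independent piece of
# evidence of the same kind arrives.
second = bayes_update(prior=first, likelihood=0.9, false_positive_rate=0.05)
```

Even strong evidence leaves the first posterior modest because the prior was weak; repeating the update shows how belief accumulates.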

How ML with Titanic Dataset Could be Misleading? - Analytics Vidhya


This article was published as a part of the Data Science Blogathon. The sinking of the Titanic is one of the most infamous shipwrecks in history. The luxury liner, touted as one of the safest ships when launched, sank after colliding with an iceberg. Of the estimated 2,224 passengers and crew on board, 1,502 died in the shipwreck. The accident has made some researchers wonder what could have led to the survival of some and the demise of others.

Understanding Linear Regression


Linear regression is a regression model that outputs a numeric value. It is used to predict an outcome from a linear combination of the inputs. As you might guess, this function represents a straight line in the coordinate system. The hypothesis function (h0) approximates the output for a given input. A linear regression model can represent either a univariate or a multivariate problem.
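
A hypothesis function of this kind can be sketched as below. It assumes the parameters are stored in a vector `theta` whose first entry is the intercept, a common convention that the excerpt does not confirm. The same function covers the univariate and multivariate cases.

```python
import numpy as np

def hypothesis(theta, x):
    """Linear hypothesis: theta[0] + theta[1] * x_1 + ... + theta[n] * x_n."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    return float(theta[0] + np.dot(theta[1:], x))

# Univariate: one input, so theta holds an intercept and one slope.
univariate = hypothesis(np.array([1.0, 2.0]), 3.0)

# Multivariate: two inputs, so theta holds an intercept and two weights.
multivariate = hypothesis(np.array([1.0, 2.0, -1.0]), [3.0, 4.0])
```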

Talk like a President:


How do different presidents present themselves differently in different contexts? How do different ways of presenting oneself influence presidential popularity? This paper aims to answer these questions by implementing a multi-modal deep learning pipeline, extracting information from the text, audio, and image data from presidential speeches. This paper will first walk through the motivations and (brief) literature review. Then, we will introduce in order the seven models we ran for encoding the text (FastText and BERT), audio (CNN audio classifier, CNN emotion recognition), image models (EfficientNet, CNN emotion recognition), and multimodal prediction (self-defined RankNet).

Children who attend schools with more traffic noise have worse memory, study warns

Daily Mail - Science & tech

It's a widespread problem at schools in cities around the world, and now a new study has warned that noise pollution can affect children's memory. Researchers from the Barcelona Institute for Global Health studied children attending 38 schools in Barcelona. They found that children at schools with higher traffic noise had slower cognitive development. 'Our study supports the hypothesis that childhood is a vulnerable period during which external stimuli such as noise can affect the rapid process of cognitive development that takes place before adolescence,' said Jordi Sunyer, an author of the study. In the study, the researchers studied 2,680 children aged 7-10, who attended 38 schools across Barcelona.