 South America


20 Questions to Ace Before Getting a Machine Learning Job

#artificialintelligence

Call it a lucky find on Twitter. Santiago tweeted 20 questions you need to ace before getting a machine learning job. I figured I'd use these questions to understand developers' work better and maybe get a glimpse into future applications. The first questions were about various basic concepts of machine learning. Let's imagine, for example, that we are given a puzzle as a gift.


Not the US or China, but Japan leads the world in AI

#artificialintelligence

Some of the largest digital consultancies across the globe have come together to assess the global state of artificial intelligence (AI), revealing that Japanese businesses lead the way when it comes to AI adoption. The study was conducted by US-based research firm ESI ThoughtLab in collaboration with a consortium of digital services and consulting firms operating at the cutting edge of AI. Deloitte, Publicis Sapient, Cognizant, Appen, Dataiku and DataRobot were all involved in the study, which surveyed more than 1,000 companies across 15 countries. The goal was to understand the size and scale of AI initiatives across the world. According to the report, Japan emerges as a surprise leader in AI adoption.


This AI lyrics generator strings your random words into songs

#artificialintelligence

Songwriter's block can be a problem for even the world's most successful musicians. They can sometimes overcome it by taking breaks, seeking new forms of inspiration, or simply pushing through. And if none of that works, they could try out a new AI lyrics generator called keyword2lyrics. "Sometimes I have a few ideas that I want to turn into a song, but I'm too lazy for that, so I thought it would be cool to make a program that generates lyrics from isolated keywords or phrases." Gatthi developed the tool by training OpenAI's GPT-2 language model on songs that Google lists when you search for "top artists 20th century" and "top artists 21st century," and extracted keywords from them using a tool called yake.


Estimating COVID-19 cases and outbreaks on-stream through phone-calls

arXiv.org Machine Learning

One of the main problems in controlling the spread of the COVID-19 epidemic is the delay in confirming cases. Having information on changes in the epidemic's evolution, or on emerging outbreaks, before lab confirmation is crucial for public health decision making. We present an algorithm to estimate on-stream the number of COVID-19 cases using data from telephone calls to a COVID-line. By modeling the calls as background (proportional to population) plus signal (proportional to infected), we fit the calls in the Province of Buenos Aires (Argentina) with a coefficient of determination $R^2 > 0.85$. This result allows us to estimate the number of cases from the number of calls in a specific district, days before the lab results are available. We validate the algorithm with real data. We show how to use the algorithm to track the epidemic on-stream, and present the Early Outbreak Alarm to detect outbreaks ahead of lab results. One key point of the algorithm is its detailed tracking of the uncertainties in the estimations, since the alarm uses the significance of the observables as a main indicator to detect an anomaly. We present the details of the explicit example of Villa Azul (Quilmes), where this tool proved crucial in controlling an outbreak in time. The presented tools were designed under urgency with the data available at the time of development, and therefore have limitations, which we describe and discuss. We consider possible improvements to the tools, many of which are currently under development.
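The background-plus-signal idea in the abstract above can be illustrated with a minimal sketch. It assumes the simplest possible reading, calls ≈ a × population + b × infected, fitted by ordinary least squares with no intercept. The function names and the district figures below are hypothetical and chosen for illustration only; the paper's actual model, data, and uncertainty tracking are not reproduced here.

```python
# Minimal sketch (assumption): calls ~ a * population + b * infected,
# fitted by least squares. All data below are made up for illustration.

def fit_background_signal(population, infected, calls):
    """Solve the 2x2 normal equations for the coefficients a and b."""
    spp = sum(p * p for p in population)
    sii = sum(i * i for i in infected)
    spi = sum(p * i for p, i in zip(population, infected))
    spc = sum(p * c for p, c in zip(population, calls))
    sic = sum(i * c for i, c in zip(infected, calls))
    det = spp * sii - spi * spi
    if det == 0:
        raise ValueError("population and infected are collinear")
    a = (spc * sii - sic * spi) / det
    b = (sic * spp - spc * spi) / det
    return a, b

def r_squared(population, infected, calls, a, b):
    """Coefficient of determination of the fitted model."""
    mean_calls = sum(calls) / len(calls)
    ss_res = sum((c - (a * p + b * i)) ** 2
                 for p, i, c in zip(population, infected, calls))
    ss_tot = sum((c - mean_calls) ** 2 for c in calls)
    return 1.0 - ss_res / ss_tot

def estimate_infected(calls, population, a, b):
    """Invert the model: subtract the background, then rescale the signal."""
    return (calls - a * population) / b

# Hypothetical districts: population, confirmed cases, calls to the line.
population = [100_000, 250_000, 80_000, 400_000]
infected = [50, 300, 20, 100]
calls = [0.001 * p + 0.5 * i for p, i in zip(population, infected)]

a, b = fit_background_signal(population, infected, calls)
fit_quality = r_squared(population, infected, calls, a, b)
new_district_cases = estimate_infected(calls=260.0, population=200_000, a=a, b=b)
```

Because the synthetic calls are generated exactly from the model, the fit recovers the coefficients used to build them; real call data would be noisy, which is why the abstract stresses tracking the uncertainty of each estimate.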


Anomaly Detection based on Zero-Shot Outlier Synthesis and Hierarchical Feature Distillation

arXiv.org Machine Learning

Anomaly detection suffers from unbalanced data since anomalies are quite rare. Synthetically generated anomalies are a solution to such ill-defined or not fully defined data. However, synthesis requires an expressive representation to guarantee the quality of the generated data. In this paper, we propose a two-level hierarchical latent space representation that distills inliers' feature descriptors (through autoencoders) into more robust representations based on a variational family of distributions (through a variational autoencoder) for zero-shot anomaly generation. From the learned latent distributions, we select those that lie on the outskirts of the training data as synthetic-outlier generators. We then synthesize from them, i.e., generate negative samples without having seen them before, to train binary classifiers. We found that using the proposed hierarchical structure for feature distillation and fusion creates robust and general representations that allow us to synthesize pseudo-outlier samples and, in turn, to train robust binary classifiers for true outlier detection (without the need for actual outliers during training). We demonstrate the performance of our proposal on several benchmarks for anomaly detection.


Helpfulness as a Key Metric of Human-Robot Collaboration

arXiv.org Artificial Intelligence

As robotic teammates become more common in society, people will assess the robots' roles in their interactions along many dimensions. One such dimension is effectiveness: people will ask whether their robotic partners are trustworthy and effective collaborators. This raises a crucial question: how can we quantitatively measure the helpfulness of a robotic partner for a given task at hand? This paper seeks to answer this question with regard to the interactive robot's decision making. We describe a clear, concise, and task-oriented metric applicable to many different planning and execution paradigms. The proposed helpfulness metric is fundamental to assessing the benefit that a partner has on a team for a given task. In this paper, we define helpfulness, illustrate it on concrete examples from a variety of domains, discuss its properties and ramifications for planning interactions with humans, and present preliminary results.


Live facial recognition is tracking kids suspected of being criminals

MIT Technology Review

Now a new investigation from Human Rights Watch has found that not only are children regularly added to CONARC, but the database also powers a live facial recognition system in Buenos Aires deployed by the city government. This makes the system likely the first known instance of its kind being used to hunt down kids suspected of criminal activity. "It's completely outrageous," says Hye Jung Han, a children's rights advocate at Human Rights Watch, who led the research. Buenos Aires first began trialing live facial recognition on April 24, 2019. Implemented without any public consultation, the system sparked immediate resistance.


The Future of Fake News

#artificialintelligence

Is Bitcoin the revolution against unequal economic systems, or a scam and money-laundering mechanism? Will artificial intelligence (AI) improve and boost humankind, or terminate our species? These questions present incompatible scenarios, but you will find supporters for all of them. They cannot all be right, so who's wrong then? Ideas spread because they are attractive, whether they are good or bad, right or wrong.


The Impact of Artificial Intelligence on Surgery

#artificialintelligence

"We've witnessed ten years of change in a month" is a typical description of how the pandemic is accelerating the use of telemedicine. Before the virus, video appointments made up only 1% of the 350m consultations which Britain's National Health Service handles each year. Companies like Docly, eConsult and AccuRx are changing that. The latter claims that 90% of primary care clinics in England are now using its video-calling system. The most dramatic form of telemedicine is remote surgery.


Rare-Event Simulation for Neural Network and Random Forest Predictors

arXiv.org Machine Learning

We study rare-event simulation for a class of problems where the target hitting sets of interest are defined via modern machine learning tools such as neural networks and random forests. This problem is motivated by fast-emerging studies on the safety evaluation of intelligent systems, robustness quantification of learning models, and other potential applications to large-scale simulation in which machine learning tools can be used to approximate complex rare-event set boundaries. We investigate an importance sampling scheme that integrates the dominating point machinery in large deviations and sequential mixed integer programming to locate the underlying dominating points. Our approach works for a range of neural network architectures including fully connected layers, rectified linear units, normalization, pooling and convolutional layers, and random forests built from standard decision trees. We provide efficiency guarantees and a numerical demonstration of our approach using a classification model in the UCI Machine Learning Repository.
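To make the importance sampling idea in the abstract above concrete, here is a one-dimensional toy sketch rather than the paper's method. It estimates P(Z > c) for a standard normal Z by sampling from a normal distribution whose mean is shifted to c, the dominating point of the half-line {x > c}. In the paper the rare-event set is defined by a neural network or random forest and the dominating points are located with mixed integer programming; here the set is simple enough that the point is known in advance. The function names, sample size, and seed are illustrative assumptions.

```python
import math
import random

def rare_event_probability_is(c, n_samples, seed=0):
    """Estimate P(Z > c), Z ~ N(0, 1), by sampling from N(c, 1).

    Each sample is weighted by the likelihood ratio
    phi(x) / phi(x - c) = exp(-c * x + c**2 / 2).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x = rng.gauss(c, 1.0)  # proposal centered at the dominating point
        if x > c:              # indicator of the rare-event set
            total += math.exp(-c * x + 0.5 * c * c)
    return total / n_samples

def exact_tail_probability(c):
    """Closed-form Gaussian tail, used only to check the estimate."""
    return 0.5 * math.erfc(c / math.sqrt(2.0))

estimate = rare_event_probability_is(c=4.0, n_samples=100_000)
exact = exact_tail_probability(4.0)
relative_error = abs(estimate - exact) / exact
```

Naive Monte Carlo with the same budget would see only a handful of samples beyond c = 4, whereas about half of the shifted samples land in the rare-event set, which is what makes the weighted estimate usable.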