South America
On dynamic ensemble selection and data preprocessing for multi-class imbalance learning
Cruz, Rafael M. O., Sabourin, Robert, Cavalcanti, George D. C.
Class imbalance refers to classification problems in which many more instances are available for certain classes than for others. Such imbalanced datasets require special attention because traditional classifiers generally favor the majority class, which has a large number of instances. Ensembles of classifiers have been reported to yield promising results. However, the majority of ensemble methods applied to imbalanced learning are static ones. Moreover, they only deal with binary imbalanced problems. Hence, this paper presents an empirical analysis of dynamic selection techniques and data preprocessing methods for dealing with multi-class imbalanced problems. We considered five variations of preprocessing methods and four dynamic selection methods. Our experiments conducted on 26 multi-class imbalanced problems show that the dynamic ensemble improves the F-measure and the G-mean as compared to the static ensemble. Moreover, data preprocessing plays an important role in such cases.
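The abstract does not spell out how the F-measure and G-mean are computed. As a minimal sketch, assuming the common multi-class conventions (G-mean as the geometric mean of per-class recall, F-measure macro-averaged over classes), the two metrics could be written as follows. The function names are illustrative and are not taken from the paper.

```python
import math
from collections import Counter


def per_class_stats(y_true, y_pred):
    """Return per-class recall and precision dictionaries."""
    tp, fn, fp = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fn[t] += 1  # true class t was missed
            fp[p] += 1  # class p was predicted wrongly
    labels = set(y_true) | set(y_pred)
    recall = {c: tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0 for c in labels}
    precision = {c: tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0 for c in labels}
    return recall, precision


def g_mean(y_true, y_pred):
    """Geometric mean of recall over the classes present in y_true (assumed definition)."""
    recall, _ = per_class_stats(y_true, y_pred)
    values = [recall[c] for c in set(y_true)]
    return math.prod(values) ** (1.0 / len(values))


def macro_f_measure(y_true, y_pred):
    """Unweighted mean of per-class F1 over the classes present in y_true (assumed definition)."""
    recall, precision = per_class_stats(y_true, y_pred)
    scores = []
    for c in set(y_true):
        p, r = precision[c], recall[c]
        scores.append(2 * p * r / (p + r) if p + r else 0.0)
    return sum(scores) / len(scores)
```

A classifier that ignores one minority class entirely gets a G-mean of zero no matter how accurate it is on the majority class, which is why this metric is popular for imbalanced problems.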
Your Dog Knows How You Feel--Here's How
Does it ever seem like your dog is in tune with your emotions? You may be on to something. In new experiments, dogs showed signs of understanding whether a human or a dog was happy or mad based on facial expressions and vocalizations. The research, published in the journal Biology Letters, set out to explore the emotional connection between man and his best friend. "We still didn't know if the dogs ... can somehow understand that, say, a happy facial expression is positive and a negative facial expression is negative," says study leader Natalia de Souza Albuquerque, a Ph.D. student in experimental psychology at the University of São Paulo in Brazil.
City-wide Analysis of Electronic Health Records Reveals Gender and Age Biases in the Administration of Known Drug-Drug Interactions
Correia, Rion Brattig, de Araújo, Luciana P., Mattos, Mauro M., Wild, David, Rocha, Luis M.
From a public-health perspective, the occurrence of drug-drug interactions (DDI) from multiple drug prescriptions is a serious problem, especially in the elderly population. This is true both for individuals and the system itself since patients with complications due to DDI will likely re-enter the system at a costlier level. We conducted an 18-month study of DDI occurrence in Blumenau (Brazil; pop. 340,000) using city-wide drug dispensing data from both primary and secondary-care levels. Our goal is also to identify possible risk factors in a large population, ultimately characterizing the burden of DDI for patients, doctors and the public system itself. We found 181 distinct DDI being prescribed concomitantly to almost 5% of the city population. We also discovered that women are at a 60% risk increase of DDI when compared to men, while only having a 6% co-administration risk increase. Analysis of the DDI co-occurrence network reveals which DDI pairs are most associated with the observed greater DDI risk for females, demonstrating that contraception and hormone therapy are not the main culprits of the gender disparity, which is maximized after the reproductive years. Furthermore, DDI risk increases dramatically with age, with patients aged 70-79 having a 50-fold risk increase in comparison to patients aged 0-19. Interestingly, several null models demonstrate that this risk increase is not due to increased polypharmacy with age. Finally, we demonstrate that while the number of drugs and co-administrations help predict a patient's number of DDI ($R^2=.413$), they are not sufficient to flag these patients accurately, which we achieve by training classifiers with additional data (MCC=.83, F1=.72). These results demonstrate that accurate warning systems for known DDI can be devised for public and private systems alike, resulting in substantial prevention of DDI-related ADR and savings.
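For readers unfamiliar with the two classifier scores quoted above, here is a minimal sketch of the standard binary definitions of the Matthews correlation coefficient (MCC) and the F1 score, computed from confusion-matrix counts. The counts used in the usage note are hypothetical and are not data from the study.

```python
import math


def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when any marginal is empty."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

With hypothetical counts of 8 true positives, 90 true negatives, 1 false positive and 1 false negative, `f1_score(8, 1, 1)` is 16/18 and `mcc(8, 90, 1, 1)` is 719/819. Unlike F1, MCC also takes the true negatives into account, so the two scores can differ noticeably when the flagged group is a small minority.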
Scoring Formulation for Multi-Condition Joint PLDA
The joint PLDA model is a generalization of PLDA where the nuisance variable is no longer considered independent across samples, but potentially shared (tied) across samples that correspond to the same nuisance condition. The original work considered a single nuisance condition, deriving the EM and scoring formulas for this scenario. In this document, we show how to obtain likelihood ratios for scoring when multiple nuisance conditions are allowed in the model.
How can machine learning boost 5G networks? Submit your papers!
Smart 5G systems will enable a range of emerging technologies that have the potential to improve lives at a pace and scale not seen before. And machine learning holds great promise to optimize 5G and future networks. This will affect ITU's standardization work in fields such as coding algorithms; data collection, storage and management; and network management and orchestration – raising a host of important questions. These questions will be central to ITU's 10th annual Kaleidoscope academic conference from 26-28 November in Santa Fe, Argentina. "Kaleidoscope 2018: Machine learning for a 5G future" is the tenth in a series of peer-reviewed academic conferences organized by ITU to bring together a wide range of views from universities, industry and research institutions. The aim of the Kaleidoscope conferences is to identify emerging developments in information and communication technologies (ICTs) and, in particular, areas in need of international standards to aid the healthy development of the Information Society.
Geographic Information Systems (GIS) Field Upended by Neural Networks
On today's episode of "The Interview" with The Next Platform, we focus on how the field of geographic information systems (GIS) is being revolutionized by deep learning. This stands to reason given the large volumes of satellite image data and robust deep learning frameworks that excel at image classification and analysis, a volume issue that has been compounded by more satellites with ever-higher resolution output. Unlike other areas of large-scale scientific data analysis that have traditionally relied on massive supercomputers, our audio interview (player below) reveals that a great deal of GIS analysis can be done on smaller systems. However, with the addition of deep learning, the field could be investing in more GPU systems for training and still others for inference at scale. Using lower-end TitanX GPUs from Nvidia, the team, which includes Sudeep Sarkar and Mauricio Pamplona Segunda and which created the CNN approach to GIS land classification described here, showed that deep learning can be a successful tool in the box of GIS analysts.
AI Has a Dirty Little Secret: It's Powered by People
There's a dirty little secret about artificial intelligence: It's powered by hundreds of thousands of real people. From makeup artists in Venezuela to women in conservative parts of India, people around the world are doing the digital equivalent of needlework -- drawing boxes around cars in street photos, tagging images, and transcribing snatches of speech that computers can't quite make out. Such data feeds directly into "machine learning" algorithms that help self-driving cars wind through traffic and let Alexa figure out that you want the lights on. These repetitive tasks pay pennies apiece. But in bulk, this work can offer a decent wage in many parts of the world -- even in the U.S.
Roborace is building a 300kph AI supercar – no driver required
The Argentinian summer Sun beat down on the Buenos Aires city circuit as the cars approached the penultimate turn. It was February 18, 2017, the Saturday of Formula E's South American weekend, and two cars jostled for first place. The second car, though, was being too aggressive. Nearing the corner's apex, the vehicle misjudged its position and speed. The vehicle slammed into the blue safety walls surrounding the track. As the wreckage crumpled to a stop, a detached wheel rolled freely across the hot asphalt. The scene was eerie: though the marshals were alerted to the smash, the usual scramble to rush paramedics to the scene didn't happen.
Detecting Fake News, Fake Reviews, Fake Accounts, Fake Pictures
A while back, I was reading an article posted on Facebook, about Clovis people found alive and well living in Florida, with a picture featuring tribesmen (see below). The quality of the picture was poor, and the URL was very suspicious: baynews9.com.ddwg.clonezone.link, chosen to make it appear that the story came from Baynews9.com. It turned out that the picture (and thus the whole story) was fake: these people are real people living in Peru, see here for a YouTube video about them. My question is: how can one detect that a story is fake? The picture might have metadata embedded in it, allowing the data scientist to find the real source, unless it is a screenshot.
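The suspicious URL above embeds a trusted name as a subdomain prefix of an unrelated host. One simple signal, sketched here under the assumption that a list of trusted domains is available, is to check whether the trusted name appears in the hostname without the hostname actually ending in that domain. The function name `impersonates` is illustrative only, and a production check would also need to handle public suffixes and look-alike characters.

```python
from urllib.parse import urlparse


def impersonates(url, trusted_domain):
    """Return True if the host mentions trusted_domain but does not belong to it."""
    # urlparse only fills in the hostname when a scheme is present.
    if "://" not in url:
        url = "http://" + url
    host = (urlparse(url).hostname or "").lower()
    trusted = trusted_domain.lower()
    # A genuine host is the domain itself or one of its subdomains.
    genuine = host == trusted or host.endswith("." + trusted)
    return trusted in host and not genuine
```

For the URL quoted above, `impersonates("baynews9.com.ddwg.clonezone.link", "baynews9.com")` returns `True`, while a genuine address such as `https://www.baynews9.com/story` returns `False`.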
Consequentialist conditional cooperation in social dilemmas with imperfect information
Peysakhovich, Alexander, Lerer, Adam
Social dilemmas, where mutual cooperation can lead to high payoffs but participants face incentives to cheat, are ubiquitous in multi-agent interaction. We wish to construct agents that cooperate with pure cooperators, avoid exploitation by pure defectors, and incentivize cooperation from the rest. However, often the actions taken by a partner are (partially) unobserved or the consequences of individual actions are hard to predict. We show that in a large class of games good strategies can be constructed by conditioning one's behavior solely on outcomes (i.e., one's past rewards). We call this consequentialist conditional cooperation. We show how to construct such strategies using deep reinforcement learning techniques and demonstrate, both analytically and experimentally, that they are effective in social dilemmas beyond simple matrix games. We also show the limitations of relying purely on consequences and discuss the need for understanding both the consequences of and the intentions behind an action.
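The idea of conditioning behavior solely on one's own past rewards can be illustrated with a toy iterated prisoner's dilemma. This is a minimal sketch and not the authors' deep reinforcement learning construction: the payoff values, the threshold of 2.0 and the class name `OutcomeConditionedAgent` are all assumptions chosen for illustration.

```python
# Assumed payoffs for (my action, partner action); "C" = cooperate, "D" = defect.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}


class OutcomeConditionedAgent:
    """Cooperates while its average past reward stays at or above a threshold."""

    def __init__(self, threshold=2.0):
        self.threshold = threshold  # assumed cut-off between cooperative and defecting payoffs
        self.rewards = []

    def act(self):
        if not self.rewards:
            return "C"  # start by cooperating
        average = sum(self.rewards) / len(self.rewards)
        return "C" if average >= self.threshold else "D"

    def observe(self, reward):
        # The agent never sees the partner's action, only its own reward.
        self.rewards.append(reward)


def play(agent, partner_policy, rounds=5):
    """Run the game and return the agent's sequence of actions."""
    actions = []
    for _ in range(rounds):
        mine = agent.act()
        theirs = partner_policy()
        agent.observe(PAYOFF[(mine, theirs)])
        actions.append(mine)
    return actions
```

Against a pure cooperator (`lambda: "C"`) the agent keeps cooperating, while against a pure defector (`lambda: "D"`) it is exploited once and then defects, without ever observing the partner's action directly.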