Context-Driven Data Mining through Bias Removal and Data Incompleteness Mitigation

arXiv.org Machine Learning

The results of data mining endeavors are largely driven by data quality. Throughout these deployments, serious show-stopper problems remain unresolved, such as data collection ambiguities, data imbalance, hidden biases in data, the lack of domain information, and data incompleteness. This paper is based on the premise that context can aid in mitigating these issues. In a traditional data science lifecycle, context is not considered. The Context-driven Data Science Lifecycle (C-DSL), the main contribution of this paper, is developed to address these challenges. Two case studies (using datasets from sports events) are developed to test C-DSL. Results from both case studies are evaluated using common data mining metrics such as the coefficient of determination (R² value) and confusion matrices. The work presented in this paper aims to redefine the lifecycle and introduce tangible improvements to its outcomes.
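The two evaluation metrics named in this abstract can be sketched in a few lines. This is a minimal illustration, not the paper's evaluation code; the function names `r_squared` and `confusion_matrix` and the toy inputs are assumptions made for the example.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def confusion_matrix(y_true, y_pred, labels):
    """Rows are true labels, columns are predicted labels."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


# Hypothetical toy data, for illustration only.
perfect_fit = r_squared([1, 2, 3], [1, 2, 3])
mean_only_fit = r_squared([1, 2, 3], [2, 2, 2])
cm = confusion_matrix(["a", "b", "a", "b"], ["a", "a", "a", "b"], ["a", "b"])
```

A model that always predicts the mean scores an R² of 0, and a perfect fit scores 1, which is why the metric is read as the fraction of variance explained.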


Machine Learning Systems for Highly-Distributed and Rapidly-Growing Data

arXiv.org Machine Learning

The usability and practicality of any machine learning (ML) application are largely influenced by two critical but hard-to-attain factors: low latency and low cost. Unfortunately, achieving low latency and low cost is very challenging when ML depends on real-world data that are highly distributed and rapidly growing (e.g., data collected by mobile phones and video cameras all over the world). Such real-world data pose many challenges in communication and computation. For example, when training data are distributed across data centers that span multiple continents, communication among data centers can easily overwhelm the limited wide-area network bandwidth, leading to prohibitively high latency and high cost. In this dissertation, we demonstrate that the latency and cost of ML on highly-distributed and rapidly-growing data can be improved by one to two orders of magnitude by designing ML systems that exploit the characteristics of ML algorithms, ML model structures, and ML training/serving data. We support this thesis statement with three contributions. First, we design a system that provides both low-latency and low-cost ML serving (inferencing) over large-scale and continuously-growing datasets, such as videos. Second, we build a system that makes ML training over geo-distributed datasets as fast as training within a single data center. Third, we present a first detailed study and a system-level solution on a fundamental and largely overlooked problem: ML training over non-IID (i.e., not independent and identically distributed) data partitions (e.g., facial images collected by cameras vary according to the demographics of each camera's location).


On the Sample Complexity of Actor-Critic Method for Reinforcement Learning with Function Approximation

arXiv.org Machine Learning

Reinforcement learning, mathematically described by Markov Decision Problems, may be approached either through dynamic programming or policy search. Actor-critic algorithms combine the merits of both approaches by alternating between steps to estimate the value function and policy gradient updates. Because the updates exhibit correlated noise and biased gradient updates, only the asymptotic behavior of actor-critic is known, established by connecting its behavior to dynamical systems. This work puts forth a new variant of actor-critic that employs Monte Carlo rollouts during the policy search updates, which results in controllable bias that depends on the number of critic evaluations. As a result, we are able to provide for the first time the convergence rate of actor-critic algorithms when the policy search step employs policy gradient, agnostic to the choice of policy evaluation technique. In particular, we establish conditions under which the sample complexity is comparable to that of the stochastic gradient method for non-convex problems, or slower as a result of the critic estimation error, which is the main complexity bottleneck. These results hold in continuous state and action spaces with linear function approximation for the value function. We then specialize these conceptual results to the case where the critic is estimated by Temporal Difference, Gradient Temporal Difference, and Accelerated Gradient Temporal Difference. These learning rates are then corroborated on a navigation problem involving an obstacle, which suggests that learning more slowly may lead to improved limit points, providing insight into the interplay between optimization and generalization in reinforcement learning.
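For readers unfamiliar with the critic step mentioned here, the following is a minimal sketch of a textbook TD(0) update with linear function approximation. It is not the paper's algorithm or its Monte Carlo rollout variant; the function name `td0_update`, the feature vectors, and the step sizes are assumptions chosen for illustration.

```python
def td0_update(weights, phi_s, phi_next, reward, gamma=0.99, alpha=0.1):
    """One TD(0) step for a linear value estimate V(s) = weights . phi(s)."""
    v_s = sum(w * f for w, f in zip(weights, phi_s))
    v_next = sum(w * f for w, f in zip(weights, phi_next))
    # Temporal-difference error: one-step bootstrapped target minus estimate.
    td_error = reward + gamma * v_next - v_s
    # Semi-gradient update: move the weights along the features of state s.
    return [w + alpha * td_error * f for w, f in zip(weights, phi_s)]


# Hypothetical transition with one-hot features, for illustration only.
new_weights = td0_update([0.0, 0.0], [1.0, 0.0], [0.0, 1.0],
                         reward=1.0, gamma=0.9, alpha=0.5)
```

In an actor-critic loop, the actor's policy gradient step would consume the value estimate produced by repeated updates of this kind, so errors in the critic carry over into the policy update.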


Fully Parallel Hyperparameter Search: Reshaped Space-Filling

arXiv.org Machine Learning

Space-filling designs such as scrambled-Hammersley, Latin Hypercube Sampling and Jittered Sampling have been proposed for fully parallel hyperparameter search, and were shown to be more effective than random or grid search. In this paper, we show that these designs only improve over random search by a constant factor. In contrast, we introduce a new approach based on reshaping the search distribution, which leads to substantial gains over random search, both theoretically and empirically. We propose two flavors of reshaping. First, when the distribution of the optimum is some known $P_0$, we propose Recentering, which uses as its search distribution a modified version of $P_0$ tightened closer to the center of the domain, in a dimension-dependent and budget-dependent manner. Second, we show that in a wide range of experiments with $P_0$ unknown, using a proposed Cauchy transformation, which simultaneously has a heavier tail (for unbounded hyperparameters) and is closer to the boundaries (for bounded hyperparameters), leads to improved performance. Besides artificial experiments and simple real-world tests on clustering or Salmon mappings, we check our proposed methods on expensive artificial intelligence tasks such as attend/infer/repeat, video next frame segmentation forecasting and progressive generative adversarial networks.
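A short sketch may help make the two ingredients concrete: a space-filling design and a heavy-tailed reshaping of it. The code below implements standard Latin Hypercube Sampling on the unit cube and then pushes each coordinate through the Cauchy inverse CDF. The function names, the `scale` parameter, and the choice to apply the transform coordinate-wise are assumptions for illustration; the paper's exact Cauchy transformation and Recentering rule are not reproduced here.

```python
import math
import random


def latin_hypercube(n, dim, rng):
    """n points in [0, 1)^dim with exactly one point per stratum per axis."""
    columns = []
    for _ in range(dim):
        strata = list(range(n))
        rng.shuffle(strata)
        # Jitter uniformly inside each of the n equal-width strata.
        columns.append([(s + rng.random()) / n for s in strata])
    return [[columns[d][i] for d in range(dim)] for i in range(n)]


def cauchy_reshape(point, scale=1.0):
    """Map uniform coordinates to Cauchy-distributed ones (inverse CDF)."""
    return [scale * math.tan(math.pi * (u - 0.5)) for u in point]


rng = random.Random(0)
design = latin_hypercube(8, 3, rng)
candidates = [cauchy_reshape(p) for p in design]  # all evaluated in parallel
```

Because every candidate is generated up front, the whole batch can be evaluated in one fully parallel round, which is the setting the abstract addresses.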


Employees' trust in workplace AI growing (HRExecutive.com)

#artificialintelligence

There was a time in the not-too-distant past when we feared the oncoming hordes of robots in the workplace. That time is no longer. People now have more trust in robots than in their managers, according to the second annual AI at Work study conducted by Oracle and research firm Future Workplace. The study of 8,370 employees, managers and HR leaders across 10 countries found that AI has changed the relationship between people and technology at work and is reshaping the role HR teams and managers need to play in attracting, retaining and developing talent. The latest advancements in machine learning and artificial intelligence are rapidly reaching the mainstream, resulting in a massive shift in the way people across the world interact with technology and their teams, says Emily He, senior vice president, human capital management for Oracle's cloud business group.


Artificial intelligence and farmer knowledge boost smallholder maize yields

#artificialintelligence

The situation called for a new approach. They needed information services that would help them decide what varieties to plant, when they should sow and how they should manage their crops. A consortium formed with the government, Colombia's National Cereals and Legumes Federation (FENALCE), and big-data scientists at the International Center for Tropical Agriculture (CIAT). The researchers used big-data tools, based on the data farmers helped collect, and yields increased substantially. The study, published in September in Global Food Security, shows how machine learning of data from multiple sources can help make farming more efficient and productive even as the climate changes. "Today we can collect massive amounts of data, but you can't just bulk it, process it in a machine and make a decision," said Daniel Jimenez, a data scientist at CIAT and the study's lead author.


futureofwork _2019-10-16_18-33-37.xlsx

#artificialintelligence

The graph represents a network of 3,752 Twitter users whose tweets in the requested range contained "futureofwork ", or who were replied to or mentioned in those tweets. The network was obtained from the NodeXL Graph Server on Thursday, 17 October 2019 at 01:35 UTC. The requested start date was Monday, 14 October 2019 at 00:01 UTC and the maximum number of days (going backward) was 14. The maximum number of tweets collected was 5,000. The tweets in the network were tweeted over the 2-day, 16-hour, 29-minute period from Friday, 11 October 2019 at 07:31 UTC to Monday, 14 October 2019 at 00:00 UTC.


A Deep Learning-based Framework for the Detection of Schools of Herring in Echograms

arXiv.org Machine Learning

Tracking the abundance of underwater species is crucial for understanding the effects of climate change on marine ecosystems. Biologists typically monitor underwater sites with echosounders and visualize data as 2D images (echograms); they interpret these data manually or semi-automatically, which is time-consuming and prone to inconsistencies. This paper proposes a deep learning framework for the automatic detection of schools of herring from echograms. Experiments demonstrated that our approach outperforms a traditional machine learning algorithm using hand-crafted features. Our framework could easily be expanded to detect more species of interest to sustainable fisheries.


KDE sampling for imbalanced class distribution

arXiv.org Machine Learning

Imbalanced response variable distribution is not an uncommon occurrence in data science. One common way to combat class imbalance is through resampling the minority class to achieve a more balanced distribution. In this paper, we investigate the performance of the sampling method based on kernel density estimate (KDE). We illustrate how KDE is less prone to overfitting than other standard sampling methods. Numerical experiments show that KDE can outperform other sampling techniques on a range of classifiers and real life datasets.
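The idea of KDE-based resampling can be sketched briefly. Drawing from a Gaussian kernel density estimate is equivalent to picking a stored minority point at random and adding Gaussian noise whose standard deviation is the bandwidth. The sketch below assumes a diagonal Gaussian kernel and a Scott-style rule of thumb for the bandwidth; both are assumptions for illustration, as is the function name `kde_oversample`, and the paper's own kernel and bandwidth choices may differ.

```python
import random
import statistics


def kde_oversample(minority, n_new, rng, bandwidths=None):
    """Draw n_new synthetic rows from a Gaussian KDE fitted to `minority`."""
    n, dim = len(minority), len(minority[0])
    if bandwidths is None:
        # Assumed rule of thumb: h_d = sigma_d * n^(-1 / (dim + 4)).
        factor = n ** (-1.0 / (dim + 4))
        bandwidths = [
            statistics.stdev([row[d] for row in minority]) * factor
            for d in range(dim)
        ]
    samples = []
    for _ in range(n_new):
        centre = rng.choice(minority)  # pick one kernel uniformly at random
        samples.append([rng.gauss(centre[d], bandwidths[d]) for d in range(dim)])
    return samples


# Hypothetical minority-class rows, for illustration only.
minority_rows = [[0.0, 0.0], [1.0, 1.0], [0.5, 0.2]]
synthetic = kde_oversample(minority_rows, 5, random.Random(1))
```

Unlike duplicating minority rows verbatim, each synthetic row is a perturbed copy, which is one intuition for why such sampling may be less prone to overfitting.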


Single Episode Policy Transfer in Reinforcement Learning

arXiv.org Machine Learning

Transfer and adaptation to new unknown environmental dynamics is a key challenge for reinforcement learning (RL). An even greater challenge is performing near-optimally in a single attempt at test time, possibly without access to dense rewards, which is not addressed by current methods that require multiple experience rollouts for adaptation. To achieve single episode transfer in a family of environments with related dynamics, we propose a general algorithm that optimizes a probe and an inference model to rapidly estimate underlying latent variables of test dynamics, which are then immediately used as input to a universal control policy. This modular approach enables integration of state-of-the-art algorithms for variational inference or RL. Moreover, our approach does not require access to rewards at test time, allowing it to perform in settings where existing adaptive approaches cannot. In diverse experimental domains with a single episode test constraint, our method significantly outperforms existing adaptive approaches and shows favorable performance against baselines for robust transfer.