 Evolutionary Systems



Recycling Privileged Learning and Distribution Matching for Fairness

Neural Information Processing Systems

Equipping machine learning models with ethical and legal constraints is a serious issue; without this, the future of machine learning is at risk. This paper takes a step forward in this direction and focuses on ensuring machine learning models deliver fair decisions. In legal scholarship, the notion of fairness itself is evolving and multi-faceted. We set an overarching goal to develop a unified machine learning framework that is able to handle any definition of fairness, combinations of definitions, and also new definitions that might be stipulated in the future. To achieve our goal, we recycle two well-established machine learning techniques, privileged learning and distribution matching, and harmonize them for satisfying multi-faceted fairness definitions. We consider protected characteristics such as race and gender as privileged information that is available at training but not at test time; this accelerates model training and delivers fairness through unawareness. Further, we cast demographic parity, equalized odds, and equality of opportunity as a classical two-sample problem of conditional distributions, which can be solved in a general form by using distance measures in Hilbert space. We show that several existing models are special cases of ours. Finally, we advocate returning the Pareto frontier of multi-objective minimization of error and unfairness in predictions. This will help decision makers select an operating point and be accountable for it.
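The two-sample view in the abstract above can be illustrated with a small sketch. The abstract does not say which Hilbert-space distance is used, so the choice of the maximum mean discrepancy (MMD) with a Gaussian kernel is an assumption here, as are the function names, the bandwidth, and the example scores. The idea shown is that demographic parity holds approximately when the distance between the score distributions of two protected groups is close to zero.

```python
import math


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian (RBF) kernel between two scalar prediction scores."""
    return math.exp(-((a - b) ** 2) / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two samples."""

    def mean_kernel(us, vs):
        total = sum(gaussian_kernel(u, v, bandwidth) for u in us for v in vs)
        return total / (len(us) * len(vs))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


# Hypothetical classifier scores for two protected groups (illustrative only).
scores_group_a = [0.2, 0.4, 0.5, 0.7]
scores_group_b = [0.6, 0.8, 0.9, 0.95]

gap = mmd_squared(scores_group_a, scores_group_b)   # larger means less parity
same = mmd_squared(scores_group_a, scores_group_a)  # identical samples give ~0
```

A quantity like `gap` could serve as the unfairness objective that is traded off against prediction error; conditioning the samples on the true label would give analogous checks for equalized odds and equality of opportunity.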


Generalizing GANs: A Turing Perspective

Neural Information Processing Systems

Recently, a new class of machine learning algorithms has emerged, where models and discriminators are generated in a competitive setting. The most prominent example is Generative Adversarial Networks (GANs). In this paper we examine how these algorithms relate to the Turing test, and derive what - from a Turing perspective - can be considered their defining features. Based on these features, we outline directions for generalizing GANs - resulting in the family of algorithms referred to as Turing Learning. One such direction is to allow the discriminators to interact with the processes from which the data samples are obtained, making them "interrogators", as in the Turing test. We validate this idea using two case studies. In the first case study, a computer infers the behavior of an agent while controlling its environment. In the second case study, a robot infers its own sensor configuration while controlling its movements. The results confirm that by allowing discriminators to interrogate, the accuracy of models is improved.


Scalable Prototype Selection by Genetic Algorithms and Hashing

arXiv.org Machine Learning

Classification in the dissimilarity space has become a very active research area since it provides a possibility to learn from data given in the form of pairwise non-metric dissimilarities, which otherwise would be difficult to cope with. The selection of prototypes is a key step for the further creation of the space. However, despite previous efforts to find good prototypes, how to select the best representation set remains an open issue. In this paper we propose scalable methods to select the set of prototypes out of very large datasets. The methods are based on genetic algorithms, dissimilarity-based hashing, and two scalable criteria, one unsupervised and one supervised. The unsupervised criterion is based on the Minimum Spanning Tree of the graph created by the prototypes as nodes and the dissimilarities as edges. The supervised criterion is based on counting matching labels of objects and their closest prototypes. The suitability of these types of algorithms is analyzed for the specific case of dissimilarity representations. The experimental results show that the methods select good prototypes, taking advantage of the large datasets, and that they do so at low runtimes.

Preprint submitted to Elsevier, December 27, 2017.

1. Introduction

The vector space representation is a common option to represent the data for learning tasks since many statistical techniques are applicable for this kind of representation. However, there is an increasing number of real-world problems which are not vectorial. Instead, the data are given in terms of pairwise dissimilarities which may be non-Euclidean and even non-metric.
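The supervised criterion mentioned in the abstract above (counting matching labels of objects and their closest prototypes) can be sketched as follows. This is a minimal illustration under assumptions, not the authors' implementation: the function name, the toy dissimilarity matrix, and the decision to let prototypes count themselves are all hypothetical. A score like this could act as the fitness of a candidate prototype set in a genetic algorithm.

```python
def matching_label_score(dissimilarity, labels, prototypes):
    """Count objects whose label equals the label of their closest prototype.

    dissimilarity[i][j] is the (possibly non-metric) dissimilarity from
    object i to object j; prototypes is a list of object indices.
    """
    matches = 0
    for i, row in enumerate(dissimilarity):
        # Closest prototype for object i under the given dissimilarities.
        nearest = min(prototypes, key=lambda p: row[p])
        if labels[nearest] == labels[i]:
            matches += 1
    return matches


# Toy data (illustrative only): two objects per class.
D = [
    [0.0, 0.2, 0.9, 0.8],
    [0.2, 0.0, 0.7, 0.9],
    [0.9, 0.7, 0.0, 0.1],
    [0.8, 0.9, 0.1, 0.0],
]
labels = ["a", "a", "b", "b"]

balanced = matching_label_score(D, labels, [0, 2])  # one prototype per class
skewed = matching_label_score(D, labels, [0, 1])    # both prototypes in class "a"
```

On this toy data the prototype set with one prototype per class scores higher than the set drawn from a single class, which is the behaviour such a criterion is meant to reward.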


Profit Driven Decision Trees for Churn Prediction

arXiv.org Machine Learning

Customer retention campaigns increasingly rely on predictive models to detect potential churners in a vast customer base. From the perspective of machine learning, the task of predicting customer churn can be presented as a binary classification problem. Using data on historic behavior, classification algorithms are built with the purpose of accurately predicting the probability of a customer defecting. The predictive churn models are then commonly selected based on accuracy-related performance measures such as the area under the ROC curve (AUC). However, these models are often not well aligned with the core business requirement of profit maximization, in the sense that they take into account neither misclassification costs nor the benefits originating from a correct classification. Therefore, the aim is to construct churn prediction models that are profitable and preferably interpretable too. The recently developed expected maximum profit measure for customer churn (EMPC) has been proposed in order to select the most profitable churn model. We present a new classifier that integrates the EMPC metric directly into the model construction. Our technique, called ProfTree, uses an evolutionary algorithm for learning profit-driven decision trees. In a benchmark study with real-life data sets from various telecommunication service providers, we show that ProfTree achieves significant profit improvements compared to classic accuracy-driven tree-based methods.


Natural selection is still at work in humans, study finds

Daily Mail - Science & tech

Humans aren't quite done evolving, a dramatic new study has found. Researchers analyzing genetic and health data on hundreds of thousands of people uncovered evidence to suggest natural selection has an ongoing, albeit small, effect on modern humans. According to the new study, natural selection appears to favour larger, 'hunkier' men with a greater body mass index, and younger mothers. A new study found that natural selection appears to favour women who get a young start on having a family. Researchers examined data from the UK Biobank, looking at genetic variants and their correlation to the number of children people had.


overnewser, The best real-time news sites information.

#artificialintelligence

In this contributed article, Sharmistha Sarkar of India-based Progressive Markets highlights a handful of compelling technology advancements that are helping to drive the evolution of artificial intelligence. The industry is expected to grow at a CAGR of 46.5% from 2017 to 2025. The market is growing fast due to improved productivity through AI, its diversified application areas, and big data integration drive....


Machine learning is not just for the buy side - Risk.net

#artificialintelligence

The most common application being researched for machine learning is optimal execution. When large trades are executed in the market, they can push prices in an unfavourable direction, so it makes sense that traders are keen on optimising this cost. So far, most of the interest in applying machine learning technology to reduce trading costs has been from the buy side. However, recent research by quants from Standard Chartered shows this may be about to change. In this month's first technical, Evolutionary algos for optimising MVA, Alexei Kondratyev, a managing director at Standard Chartered in London, and George Giorgidze, a senior quantitative developer in the strats team within the same bank, propose machine learning techniques to optimise initial margin costs through trade selection.


Highly Efficient Human Action Recognition with Quantum Genetic Algorithm Optimized Support Vector Machine

arXiv.org Machine Learning

In this paper we propose the use of a quantum genetic algorithm to optimize the support vector machine (SVM) for human action recognition. The Microsoft Kinect sensor can be used for skeleton tracking, which provides the joints' position data. However, how to extract the motion features for representing the dynamics of a human skeleton is still a challenge due to the complexity of human motion. We present a highly efficient feature extraction method for action classification, that is, using the joint angles to represent a human skeleton and calculating the variance of each angle during an action time window. Using the proposed representation, we compared the human action classification accuracy of two approaches: the optimized SVM based on the quantum genetic algorithm and the conventional SVM with grid search. Experimental results on the MSR-12 dataset show that the conventional SVM achieved an accuracy of 93.85%. The proposed approach outperforms the conventional method with an accuracy of 96.15%.
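The feature extraction idea in the abstract above (joint angles per frame, then the variance of each angle over a time window) can be sketched as follows. The joint names, the choice of which joint triples define an angle, and the use of population variance are assumptions made for illustration; the abstract does not specify them.

```python
import math


def joint_angle(a, b, c):
    """Angle in radians at joint b, formed by the 3-D points a-b-c.

    Assumes a and c do not coincide with b (non-zero bone lengths).
    """
    v1 = [a[i] - b[i] for i in range(3)]
    v2 = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm = math.sqrt(sum(x * x for x in v1)) * math.sqrt(sum(y * y for y in v2))
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    return math.acos(max(-1.0, min(1.0, dot / norm)))


def angle_variance_features(frames, triples):
    """Population variance of each joint angle over a window of frames.

    frames: list of dicts mapping joint name -> (x, y, z).
    triples: list of (a, b, c) joint names; the angle is measured at b.
    """
    features = []
    for a, b, c in triples:
        angles = [joint_angle(f[a], f[b], f[c]) for f in frames]
        mean = sum(angles) / len(angles)
        features.append(sum((x - mean) ** 2 for x in angles) / len(angles))
    return features


# Hypothetical two-frame window: the forearm swings from bent to straight.
window = [
    {"shoulder": (0, 1, 0), "elbow": (0, 0, 0), "wrist": (1, 0, 0)},
    {"shoulder": (0, 1, 0), "elbow": (0, 0, 0), "wrist": (0, -1, 0)},
]
elbow_feature = angle_variance_features(window, [("shoulder", "elbow", "wrist")])
```

The resulting vector has one entry per angle regardless of window length, which is what makes it a compact input for a classifier such as an SVM.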


Evolving Spatially Aggregated Features from Satellite Imagery for Regional Modeling

arXiv.org Machine Learning

Satellite imagery and remote sensing provide explanatory variables at relatively high resolutions for modeling geospatial phenomena, yet regional summaries are often desirable for analysis and actionable insight. In this paper, we propose a novel method of inducing spatial aggregations as a component of the machine learning process, yielding regional model features whose construction is driven by model prediction performance rather than prior assumptions. Our results demonstrate that Genetic Programming is particularly well suited to this type of feature construction because it can automatically synthesize appropriate aggregations, as well as better incorporate them into predictive models compared to other regression methods we tested. In our experiments we consider a specific problem instance and real-world dataset relevant to predicting snow properties in high-mountain Asia.