Google researchers release audit framework to close AI accountability gap

#artificialintelligence

Researchers associated with Google and the Partnership on AI have created a framework to help companies and their engineering teams audit AI systems before deploying them. The framework, intended to add a layer of quality assurance to businesses launching AI, translates into practice values often espoused in AI ethics principles and tackles an accountability gap authors say exists in AI today. The work, titled "Closing the AI Accountability Gap: Defining an End-to-End Framework for Internal Algorithmic Auditing," is one of a handful of outstanding AI ethics research papers accepted for publication as part of the Fairness, Accountability, and Transparency (FAT) conference, which takes place this week in Barcelona, Spain. "The proposed auditing framework is intended to contribute to closing the development and deployment accountability gap of large-scale artificial intelligence systems by embedding a robust process to ensure audit integrity," the paper reads. "At a minimum, the internal audit process should enable critical reflections on the potential impact of a system, serving as internal education and training on ethical awareness in addition to leaving what we refer to as a 'transparency trail' of documentation at each step of the development cycle." The framework is also intended to identify risks and reduce them to the lowest degree possible, as well as to map out what can be done differently in the future or how to respond to a failure after launch.


Error-feedback Stochastic Configuration Strategy on Convolutional Neural Networks for Time Series Forecasting

arXiv.org Machine Learning

Despite the superiority of convolutional neural networks demonstrated in time series modeling and forecasting, the design of the network architecture and the tuning of the hyper-parameters have not been fully explored. Inspired by the iterative construction strategy for building a random multilayer perceptron, we propose a novel Error-feedback Stochastic Configuration (ESC) strategy to construct a random Convolutional Neural Network (ESC-CNN) for time series forecasting tasks, which builds the network architecture adaptively. The ESC strategy suggests that random filters and neurons of the error-feedback fully connected layer are incrementally added in a manner that they can steadily compensate the prediction error during the construction process, and a filter selection strategy is introduced to ensure that ESC-CNN holds the universal approximation property, providing helpful information at each iterative process for the prediction. The performance of ESC-CNN is justified on its prediction accuracy for one-step-ahead and multi-step-ahead forecasting tasks. Comprehensive experiments on a synthetic dataset and two real-world datasets show that the proposed ESC-CNN not only outperforms the state-of-the-art random neural networks, but also exhibits strong predictive power in comparison to trained Convolutional Neural Networks and Long Short-Term Memory models, demonstrating the effectiveness of ESC-CNN in time series forecasting. Time series forecasting, especially computational intelligence enabled time series forecasting, is of great importance for a learning system in dynamic environments, and plays a vital role in applications such as finance [1]-[3], energy [4]-[6], traffic [7]-[9], and electric load [10]-[12].
Recently, convolutional neural networks (CNNs) have been successfully implemented for time series forecasting tasks, benefiting from their strength in extracting local features via multiple convolutional filters and learning representations by fully connected layers [13]-[16].


A Deep Learning Approach for the Computation of Curvature in the Level-Set Method

arXiv.org Machine Learning

We propose a deep learning strategy to compute the mean curvature of an implicit level-set representation of an interface. Our approach is based on fitting neural networks to synthetic datasets of pairs of nodal $\phi$ values and curvatures obtained from circular interfaces immersed in different uniform resolutions. These neural networks are multilayer perceptrons that ingest sample level-set values of grid points along a free boundary and output the dimensionless curvature at the center vertices of each sampled neighborhood. Evaluations with irregular (smooth and sharp) interfaces, in both uniform and adaptive meshes, show that our deep learning approach is systematically superior to conventional numerical approximation in the $L^2$ and $L^\infty$ norms. Our methodology is also less sensitive to steep curvatures and approximates them well with samples collected with fewer iterations of the reinitialization equation, often needed to regularize the underlying implicit function. Additionally, we show that an application-dependent map of local resolutions to neural networks can be constructed and employed to estimate interface curvatures more efficiently than using typically expensive numerical schemes while still attaining comparable or higher precision.
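For context, the conventional numerical approximation that the abstract compares against can be sketched as follows. This is only an illustrative baseline, not the paper's neural method: it assumes a two-dimensional level-set function, second-order central differences, and the standard formula $\kappa = (\phi_{xx}\phi_y^2 - 2\phi_x\phi_y\phi_{xy} + \phi_{yy}\phi_x^2) / (\phi_x^2 + \phi_y^2)^{3/2}$. The names `circle_phi` and `curvature` are hypothetical.

```python
import math

def circle_phi(radius):
    """Signed distance to a circle centered at the origin (assumed test interface)."""
    return lambda x, y: math.hypot(x, y) - radius

def curvature(phi, x, y, h):
    """Finite-difference curvature of the level set of phi through (x, y).

    Uses second-order central differences with grid spacing h.
    Multiply the result by h to get the dimensionless curvature.
    """
    px = (phi(x + h, y) - phi(x - h, y)) / (2 * h)
    py = (phi(x, y + h) - phi(x, y - h)) / (2 * h)
    pxx = (phi(x + h, y) - 2 * phi(x, y) + phi(x - h, y)) / h**2
    pyy = (phi(x, y + h) - 2 * phi(x, y) + phi(x, y - h)) / h**2
    pxy = (phi(x + h, y + h) - phi(x + h, y - h)
           - phi(x - h, y + h) + phi(x - h, y - h)) / (4 * h**2)
    grad_sq = px**2 + py**2
    return (pxx * py**2 - 2 * px * py * pxy + pyy * px**2) / grad_sq**1.5

if __name__ == "__main__":
    phi = circle_phi(0.5)
    h = 0.01
    kappa = curvature(phi, 0.3, 0.4, h)  # point on the circle of radius 0.5
    print("curvature:", kappa, "dimensionless:", h * kappa)
```

For a circle of radius 0.5 the exact curvature on the interface is 2, so the printed value should be close to that; the discretization error of this scheme grows as the radius approaches the grid spacing, which is the regime the abstract describes as steep curvature.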


Neuro-evolutionary Frameworks for Generalized Learning Agents

arXiv.org Artificial Intelligence

The ultimate aim of artificial intelligence research is to develop agents with truly intelligent behaviors, akin to those found in humans and animals. To this end, a number of tools and techniques have been developed. In recent years, two approaches in particular, deep learning (DL) and reinforcement learning (RL), seem to have made considerable progress towards this goal. Both these fields have been widely studied, with numerous successful examples [22, 29, 42, 25, 40] reported, particularly in recent years. However, even with the unprecedented success of recent approaches such as deep RL [28, 27, 36], poor sample efficiency and limited generalization remain major concerns to be addressed, keeping in view the ultimate goal of developing general purpose agents. The poor generalization capability of DL is exposed by its liability to deception when presented with adversarial examples [30, 39]. Recent work [38] showed that it was possible to hurt the performance of DL-based image recognition systems by carefully altering just a single pixel.


Improving Efficiency in Large-Scale Decentralized Distributed Training

arXiv.org Machine Learning

Decentralized Parallel SGD (D-PSGD) and its asynchronous variant Asynchronous Parallel SGD (AD-PSGD) form a family of distributed learning algorithms that have been demonstrated to perform well for large-scale deep learning tasks. One drawback of (A)D-PSGD is that the spectral gap of the mixing matrix decreases when the number of learners in the system increases, which hampers convergence. In this paper, we investigate techniques to accelerate (A)D-PSGD based training by improving the spectral gap while minimizing the communication cost. We demonstrate the effectiveness of our proposed techniques by running experiments on the 2000-hour Switchboard speech recognition task and the ImageNet computer vision task. On an IBM P9 supercomputer, our system is able to train an LSTM acoustic model in 2.28 hours with 7.5% WER on the Hub5-2000 Switchboard (SWB) test set and 13.3% WER on the CallHome (CH) test set using 64 V100 GPUs and in 1.98 hours with 7.7% WER on SWB and 13.3% WER on CH using 128 V100 GPUs, the fastest training time reported to date. Index Terms: distributed training, decentralized SGD, parallel computing, automatic speech recognition, image recognition.
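The shrinking spectral gap mentioned in the abstract can be illustrated with a small sketch. It assumes a ring topology in which each learner averages itself and its two neighbours with weight 1/3; this topology and the helper names are assumptions for illustration, not necessarily the paper's configuration. For such a symmetric circulant mixing matrix the eigenvalues have the closed form $\lambda_k = 1/3 + (2/3)\cos(2\pi k/n)$, and the spectral gap is taken as $1 - \max_{k \neq 0} |\lambda_k|$.

```python
import math

def ring_mixing_matrix(n):
    """Doubly stochastic mixing matrix for a ring of n >= 3 learners.

    Each learner averages itself and its two neighbours with weight 1/3
    (an assumed topology, used here only for illustration).
    """
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in (i - 1, i, i + 1):
            W[i][j % n] += 1.0 / 3.0
    return W

def ring_spectral_gap(n):
    """Spectral gap 1 - max_{k != 0} |lambda_k| of the ring mixing matrix.

    Uses the closed-form eigenvalues of a symmetric circulant matrix:
    lambda_k = 1/3 + (2/3) * cos(2 * pi * k / n), for k = 0, ..., n - 1.
    """
    eigs = [1 / 3 + 2 / 3 * math.cos(2 * math.pi * k / n) for k in range(1, n)]
    return 1.0 - max(abs(e) for e in eigs)

if __name__ == "__main__":
    for n in (8, 16, 32, 64, 128):
        print(n, ring_spectral_gap(n))
```

Under these assumptions the gap falls roughly as $1/n^2$, so doubling the number of learners cuts it by about a factor of four. A smaller gap means information mixes more slowly across learners, which is the convergence problem the abstract sets out to address.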


End-to-End Models for the Analysis of System 1 and System 2 Interactions based on Eye-Tracking Data

arXiv.org Machine Learning

While theories postulating a dual cognitive system take hold, quantitative confirmations are still needed to understand and identify interactions between the two systems or conflict events. Eye movements are among the most direct markers of the individual attentive load and may serve as an important proxy of information. In this work we propose a computational method, within a modified visual version of the well-known Stroop test, for the identification of different tasks and potential conflict events between the two systems through the collection and processing of data related to eye movements. A statistical analysis shows that the selected variables can characterize the variation of attentive load within different scenarios. Moreover, we show that Machine Learning techniques make it possible to distinguish between different tasks with good classification accuracy and to investigate the gaze dynamics in more depth.


Adversarial Machine Learning -- Industry Perspectives

arXiv.org Machine Learning

Based on interviews with 28 organizations, we found that industry practitioners are not equipped with tactical and strategic tools to protect, detect and respond to attacks on their Machine Learning (ML) systems. We leverage the insights from the interviews and we enumerate the gaps in perspective in securing machine learning systems when viewed in the context of traditional software security development. We write this paper from the perspective of two personas: developers/ML engineers and security incident responders who are tasked with securing ML systems as they are designed, developed and deployed. The goal of this paper is to engage researchers to revise and amend the Security Development Lifecycle for industrial-grade software in the adversarial ML era.


Machine Learning Based Channel Modeling for Vehicular Visible Light Communication

arXiv.org Machine Learning

Optical Wireless Communication (OWC) propagation channel characterization plays a key role in the design and performance analysis of Vehicular Visible Light Communication (VVLC) systems. Current OWC channel models based on deterministic and stochastic methods fail to address mobility induced ambient light, optical turbulence and road reflection effects on channel characterization. Therefore, alternative machine learning (ML) based schemes, considering ambient light, optical turbulence and road reflection effects in addition to intervehicular distance and geometry, are proposed to obtain accurate VVLC channel loss and channel frequency response (CFR). This work demonstrates synthesis of ML based VVLC channel model frameworks through multi layer perceptron feed-forward neural network (MLP), radial basis function neural network (RBF-NN) and Random Forest ensemble learning algorithms. Predictor and response variables, collected through practical road measurements, are employed to train and validate proposed models for various conditions. Additionally, the importance of different predictor variables on channel loss and CFR is assessed, and the normalized importance of features for the measured VVLC channel is introduced. We show that RBF-NN, Random Forest and MLP based models yield more accurate channel loss estimations with 3.53 dB, 3.81 dB, 3.95 dB root mean square error (RMSE), respectively, when compared to fitting curve based VVLC channel model with 7 dB RMSE. Moreover, RBF-NN and MLP models are demonstrated to predict VVLC CFR with respect to distance, ambient light and receiver inclination angle predictor variables with 3.78 dB and 3.60 dB RMSE respectively.


Generating Digital Twins with Multiple Sclerosis Using Probabilistic Neural Networks

arXiv.org Machine Learning

Multiple Sclerosis (MS) is a neurodegenerative disorder characterized by a complex set of clinical assessments. We use an unsupervised machine learning model called a Conditional Restricted Boltzmann Machine (CRBM) to learn the relationships between covariates commonly used to characterize subjects and their disease progression in MS clinical trials. A CRBM is capable of generating digital twins, which are simulated subjects having the same baseline data as actual subjects. Digital twins allow for subject-level statistical analyses of disease progression. The CRBM is trained using data from 2395 subjects enrolled in the placebo arms of clinical trials across the three primary subtypes of MS. We discuss how CRBMs are trained and show that digital twins generated by the model are statistically indistinguishable from their actual subject counterparts along a number of measures.


Understanding the dynamics of message passing algorithms: a free probability heuristics

arXiv.org Machine Learning

A major task is to compute statistics of unobserved random variables using distributions of these variables conditioned on observed data. An exact computation of the corresponding expectations in the multivariate case is usually not possible except for simple cases. Hence, one has to resort to methods which approximate the necessary high-dimensional sums or integrals and which are often based on ideas of statistical physics [1]. A class of such approximation algorithms is often termed message passing. Prominent examples are belief propagation [2], which was developed for inference in probabilistic Bayesian networks with sparse couplings, and expectation propagation (EP), which is also applicable for networks with dense coupling matrices [3]. Both types of algorithms make assumptions on weak dependencies between random variables which motivate the approximation of certain expectations by Gaussian random variables invoking central limit theorem arguments [4]. Using ideas of the statistical physics of disordered systems, such arguments can be justified for the fixed points of such algorithms for large network models where couplings are drawn from random, rotation invariant matrix distributions. This extra assumption of randomness allows for further simplifications of message passing approaches [5, 6], leading e.g. to the approximate message passing (AMP) or VAMP algorithms, see [7, 8, 9].