AITopics

Error-feedback Stochastic Configuration Strategy on Convolutional Neural Networks for Time Series Forecasting

Zhang, Xinze, He, Kun, Bao, Yukun

Despite the superiority of convolutional neural networks demonstrated in time series modeling and forecasting, the design of the network architecture and the tuning of the hyper-parameters have not been fully explored. Inspired by the iterative construction strategy for building a random multilayer perceptron, we propose a novel Error-feedback Stochastic Configuration (ESC) strategy to construct a random Convolutional Neural Network (ESC-CNN) for time series forecasting tasks, which builds the network architecture adaptively. Under the ESC strategy, random filters and neurons of the error-feedback fully connected layer are incrementally added so that they steadily compensate the prediction error during the construction process, and a filter selection strategy is introduced to ensure that ESC-CNN holds the universal approximation property, providing helpful information for the prediction at each iteration. The performance of ESC-CNN is validated by its prediction accuracy on one-step-ahead and multi-step-ahead forecasting tasks. Comprehensive experiments on a synthetic dataset and two real-world datasets show that the proposed ESC-CNN not only outperforms the state-of-the-art random neural networks, but also exhibits strong predictive power in comparison to trained Convolutional Neural Networks and Long Short-Term Memory models, demonstrating the effectiveness of ESC-CNN in time series forecasting.
Time series forecasting, especially computational intelligence enabled time series forecasting, is of great importance for a learning system in dynamic environments, and plays a vital role in applications such as finance [1]-[3], energy [4]-[6], traffic [7]-[9], and electric load [10]-[12]. Recently, convolutional neural networks (CNNs) have been successfully implemented for time series forecasting tasks, benefiting from their strength in extracting local features via multiple convolutional filters and learning representations through fully connected layers [13]-[16].

deep learning, neural network, time series forecasting, (14 more...)

2002.00717

Country: Asia > China (0.14)

Genre: Research Report (0.64)

Industry: Energy > Oil & Gas (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Cárdenas, Luis Ángel Larios, Gibou, Frederic

A Deep Learning Approach for the Computation of Curvature in the Level-Set Method

We propose a deep learning strategy to compute the mean curvature of an implicit level-set representation of an interface. Our approach is based on fitting neural networks to synthetic datasets of pairs of nodal $\phi$ values and curvatures obtained from circular interfaces immersed in different uniform resolutions. These neural networks are multilayer perceptrons that ingest sample level-set values of grid points along a free boundary and output the dimensionless curvature at the center vertices of each sampled neighborhood. Evaluations with irregular (smooth and sharp) interfaces, in both uniform and adaptive meshes, show that our deep learning approach is systematically superior to conventional numerical approximation in the $L^2$ and $L^\infty$ norms. Our methodology is also less sensitive to steep curvatures and approximates them well with samples collected with fewer iterations of the reinitialization equation, often needed to regularize the underlying implicit function. Additionally, we show that an application-dependent map of local resolutions to neural networks can be constructed and employed to estimate interface curvatures more efficiently than using typically expensive numerical schemes while still attaining comparable or higher precision.
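For context on the "conventional numerical approximation" mentioned in the abstract above, the following is a minimal sketch of one common finite-difference estimate of level-set curvature in 2D, $\kappa = (\phi_{xx}\phi_y^2 - 2\phi_x\phi_y\phi_{xy} + \phi_{yy}\phi_x^2)/(\phi_x^2+\phi_y^2)^{3/2}$, checked on a circular interface where the exact value is $1/r$. The function names, the grid spacing `h`, and the choice of central differences are assumptions made for illustration; the baseline scheme used in the paper may differ.

```python
import math

def phi_circle(x, y, r):
    """Signed distance to a circle of radius r centered at the origin."""
    return math.hypot(x, y) - r

def curvature_fd(phi, x, y, h):
    """Curvature of the level set of phi through (x, y), using second-order
    central differences with spacing h (an illustrative choice)."""
    px = (phi(x + h, y) - phi(x - h, y)) / (2 * h)
    py = (phi(x, y + h) - phi(x, y - h)) / (2 * h)
    pxx = (phi(x + h, y) - 2 * phi(x, y) + phi(x - h, y)) / h**2
    pyy = (phi(x, y + h) - 2 * phi(x, y) + phi(x, y - h)) / h**2
    pxy = (phi(x + h, y + h) - phi(x + h, y - h)
           - phi(x - h, y + h) + phi(x - h, y - h)) / (4 * h**2)
    grad_sq = px**2 + py**2
    return (pxx * py**2 - 2 * px * py * pxy + pyy * px**2) / grad_sq**1.5

r, h, theta = 0.5, 0.01, 0.3                 # hypothetical test values
x0, y0 = r * math.cos(theta), r * math.sin(theta)
kappa = curvature_fd(lambda x, y: phi_circle(x, y, r), x0, y0, h)
h_kappa = h * kappa                          # dimensionless curvature
```

Here `kappa` should be close to the exact value `1 / r`, and `h_kappa` is the dimensionless quantity the abstract refers to.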

deep learning, neural network, upstream oil & gas, (19 more...)

2002.02804

Country: North America > United States > California > Santa Barbara County > Santa Barbara (0.14)

Genre: Research Report > New Finding (0.46)

Industry: Energy > Oil & Gas > Upstream (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Karimpanal, Thommen George

Neuro-evolutionary Frameworks for Generalized Learning Agents

arXiv.org Artificial Intelligence, Feb-3-2020

The ultimate aim of artificial intelligence research is to develop agents with truly intelligent behaviors, akin to those found in humans and animals. To this end, a number of tools and techniques have been developed. In recent years, two approaches in particular, deep learning (DL) and reinforcement learning (RL), seem to have made considerable progress towards this goal. Both fields have been widely studied, with numerous successful examples [22, 29, 42, 25, 40] reported, particularly in recent years. However, even with the unprecedented success of recent approaches such as deep RL [28, 27, 36], poor sample efficiency and limited generalization remain major concerns to be addressed, keeping in view the ultimate goal of developing general-purpose agents. The poor generalization capability of DL is exposed by its susceptibility to deception when presented with adversarial examples [30, 39]. Recent work [38] showed that it was possible to hurt the performance of DL-based image recognition systems by carefully altering just a single pixel.

evolutionary algorithm, machine learning, reinforcement learning, (18 more...)

2002.01088

Country:

North America > United States (0.14)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (1.00)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Improving Efficiency in Large-Scale Decentralized Distributed Training

Zhang, Wei, Cui, Xiaodong, Kayi, Abdullah, Liu, Mingrui, Finkler, Ulrich, Kingsbury, Brian, Saon, George, Mroueh, Youssef, Buyuktosunoglu, Alper, Das, Payel, Kung, David, Picheny, Michael

Decentralized Parallel SGD (D-PSGD) and its asynchronous variant Asynchronous Parallel SGD (AD-PSGD) are a family of distributed learning algorithms that have been demonstrated to perform well for large-scale deep learning tasks. One drawback of (A)D-PSGD is that the spectral gap of the mixing matrix decreases when the number of learners in the system increases, which hampers convergence. In this paper, we investigate techniques to accelerate (A)D-PSGD based training by improving the spectral gap while minimizing the communication cost. We demonstrate the effectiveness of our proposed techniques by running experiments on the 2000-hour Switchboard speech recognition task and the ImageNet computer vision task. On an IBM P9 supercomputer, our system is able to train an LSTM acoustic model in 2.28 hours with 7.5% WER on the Hub5-2000 Switchboard (SWB) test set and 13.3% WER on the CallHome (CH) test set using 64 V100 GPUs and in 1.98 hours with 7.7% WER on SWB and 13.3% WER on CH using 128 V100 GPUs, the fastest training time reported to date. Index Terms: distributed training, decentralized SGD, parallel computing, automatic speech recognition, image recognition.
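The spectral-gap drawback noted in the abstract above can be illustrated with a small sketch. It assumes a ring topology in which each learner averages itself and its two neighbours with weight 1/3; this topology, the function names, and the tolerance are illustrative assumptions, not details taken from the paper. For this circulant mixing matrix the eigenvalues are $\lambda_k = (1 + 2\cos(2\pi k/n))/3$, so the gap $1 - \max_{k \neq 0} |\lambda_k|$ shrinks as the number of learners $n$ grows, and plain averaging takes more rounds to reach consensus.

```python
import math

def ring_spectral_gap(n):
    """Spectral gap 1 - max_{k != 0} |lambda_k| of the assumed ring mixing
    matrix, using the closed-form eigenvalues of a circulant matrix."""
    second = max(abs((1 + 2 * math.cos(2 * math.pi * k / n)) / 3)
                 for k in range(1, n))
    return 1.0 - second

def rounds_to_consensus(n, tol=1e-3):
    """Count ring-averaging rounds until every value is within tol of the mean."""
    x = [float(i) for i in range(n)]      # arbitrary initial values
    mean = sum(x) / n                     # preserved by the averaging step
    rounds = 0
    while max(abs(v - mean) for v in x) > tol:
        # x[i - 1] wraps around for i == 0, closing the ring
        x = [(x[i - 1] + x[i] + x[(i + 1) % n]) / 3 for i in range(n)]
        rounds += 1
    return rounds

sizes = (4, 8, 16, 32)
gaps = [ring_spectral_gap(n) for n in sizes]
rounds = [rounds_to_consensus(n) for n in sizes]
```

Under these assumptions `gaps` decreases and `rounds` increases with the number of learners, which is the convergence penalty the abstract describes.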

artificial intelligence, learner, machine learning, (19 more...)

2002.01119

Country: North America > United States > Iowa (0.04)

Genre: Research Report (0.40)

Industry: Information Technology (0.51)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

End-to-End Models for the Analysis of System 1 and System 2 Interactions based on Eye-Tracking Data

Rossi, Alessandro, Ermini, Sara, Bernabini, Dario, Zanca, Dario, Todisco, Marino, Genovese, Alessandro, Rizzo, Antonio

While theories postulating a dual cognitive system take hold, quantitative confirmations are still needed to understand and identify interactions between the two systems or conflict events. Eye movements are among the most direct markers of an individual's attentive load and may serve as an important proxy of information. In this work we propose a computational method, within a modified visual version of the well-known Stroop test, for the identification of different tasks and potential conflict events between the two systems through the collection and processing of data related to eye movements. A statistical analysis shows that the selected variables can characterize the variation of attentive load within different scenarios. Moreover, we show that Machine Learning techniques make it possible to distinguish between different tasks with good classification accuracy and to investigate gaze dynamics in more depth.

eye movement, fixation, interference, (15 more...)

2002.11192

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > Italy (0.04)

Genre: Research Report > Experimental Study (0.47)

Industry: Health & Medicine (0.47)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

Adversarial Machine Learning -- Industry Perspectives

Kumar, Ram Shankar Siva, Nyström, Magnus, Lambert, John, Marshall, Andrew, Goertzel, Mario, Comissoneru, Andi, Swann, Matt, Xia, Sharon

Based on interviews with 28 organizations, we found that industry practitioners are not equipped with tactical and strategic tools to protect, detect and respond to attacks on their Machine Learning (ML) systems. We leverage the insights from the interviews and enumerate the gaps in perspective in securing machine learning systems when viewed in the context of traditional software security development. We write this paper from the perspective of two personas: developers/ML engineers and security incident responders, who are tasked with securing ML systems as they are designed, developed and deployed. The goal of this paper is to engage researchers to revise and amend the Security Development Lifecycle for industrial-grade software in the adversarial ML era.

arxiv preprint arxiv, ml system, vulnerability, (9 more...)

2002.05646

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Europe (0.04)

Genre:

Questionnaire & Opinion Survey (0.48)
Research Report (0.40)

Industry: Information Technology > Security & Privacy (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Turan, Bugra, Coleri, Sinem

Machine Learning Based Channel Modeling for Vehicular Visible Light Communication

Optical Wireless Communication (OWC) propagation channel characterization plays a key role in the design and performance analysis of Vehicular Visible Light Communication (VVLC) systems. Current OWC channel models based on deterministic and stochastic methods fail to address mobility-induced ambient light, optical turbulence and road reflection effects on channel characterization. Therefore, alternative machine learning (ML) based schemes, considering ambient light, optical turbulence and road reflection effects in addition to intervehicular distance and geometry, are proposed to obtain accurate VVLC channel loss and channel frequency response (CFR). This work demonstrates synthesis of ML based VVLC channel model frameworks through multilayer perceptron feed-forward neural network (MLP), radial basis function neural network (RBF-NN) and Random Forest ensemble learning algorithms. Predictor and response variables, collected through practical road measurements, are employed to train and validate the proposed models for various conditions. Additionally, the importance of different predictor variables on channel loss and CFR is assessed, and the normalized importance of features for the measured VVLC channel is introduced. We show that RBF-NN, Random Forest and MLP based models yield more accurate channel loss estimations with 3.53 dB, 3.81 dB and 3.95 dB root mean square error (RMSE), respectively, when compared to a fitting curve based VVLC channel model with 7 dB RMSE. Moreover, RBF-NN and MLP models are demonstrated to predict VVLC CFR with respect to distance, ambient light and receiver inclination angle predictor variables with 3.78 dB and 3.60 dB RMSE, respectively.

ambient light, path loss, vvlc channel, (13 more...)

2002.03774

Country:

Asia > China (0.04)
North America > United States > New York (0.04)
Europe > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.04)
Asia > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.04)

Genre: Research Report (0.64)

Industry:

Energy (0.46)
Materials (0.46)
Transportation (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.86)

Walsh, Jonathan R., Smith, Aaron M., Pouliot, Yannick, Li-Bland, David, Loukianov, Anton, Fisher, Charles K.

Generating Digital Twins with Multiple Sclerosis Using Probabilistic Neural Networks

Multiple Sclerosis (MS) is a neurodegenerative disorder characterized by a complex set of clinical assessments. We use an unsupervised machine learning model called a Conditional Restricted Boltzmann Machine (CRBM) to learn the relationships between covariates commonly used to characterize subjects and their disease progression in MS clinical trials. A CRBM is capable of generating digital twins, which are simulated subjects having the same baseline data as actual subjects. Digital twins allow for subject-level statistical analyses of disease progression. The CRBM is trained using data from 2395 subjects enrolled in the placebo arms of clinical trials across the three primary subtypes of MS. We discuss how CRBMs are trained and show that digital twins generated by the model are statistically indistinguishable from their actual subject counterparts along a number of measures.

covariate, crbm, digital twin, (15 more...)

2002.02779

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > United Kingdom > England (0.04)
Europe > Poland > Lublin Province > Lublin (0.04)
(4 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology > Multiple Sclerosis (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.66)

Opper, Manfred, Çakmak, Burak

Understanding the dynamics of message passing algorithms: a free probability heuristics

A major task is to compute statistics of unobserved random variables using distributions of these variables conditioned on observed data. An exact computation of the corresponding expectations in the multivariate case is usually not possible except for simple cases. Hence, one has to resort to methods which approximate the necessary high-dimensional sums or integrals and which are often based on ideas of statistical physics [1]. A class of such approximation algorithms is often termed message passing. Prominent examples are belief propagation [2], which was developed for inference in probabilistic Bayesian networks with sparse couplings, and expectation propagation (EP), which is also applicable for networks with dense coupling matrices [3]. Both types of algorithms make assumptions on weak dependencies between random variables, which motivate the approximation of certain expectations by Gaussian random variables invoking central limit theorem arguments [4]. Using ideas of the statistical physics of disordered systems, such arguments can be justified for the fixed points of such algorithms for large network models where couplings are drawn from random, rotation-invariant matrix distributions. This extra assumption of randomness allows for further simplifications of message passing approaches [5, 6], leading, e.g., to the approximate message passing (AMP) or VAMP algorithms; see [7, 8, 9].

algorithm, matrix, random matrix, (15 more...)

2002.02533

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > New Jersey > Middlesex County > Piscataway (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
(2 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Architecture > Distributed Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)