Continuous-time Discounted Mirror-Descent Dynamics in Monotone Concave Games
In this paper, we consider concave continuous-kernel games characterized by monotonicity properties and propose discounted mirror descent-type dynamics. We introduce two classes of dynamics whereby the associated mirror map is constructed based on a strongly convex or a Legendre regularizer. Depending on the properties of the regularizer we show that these new dynamics can converge asymptotically in concave games with monotone (negative) pseudo-gradient. Furthermore, we show that when the regularizer enjoys strong convexity, the resulting dynamics can converge even in games with hypo-monotone (negative) pseudo-gradient, which corresponds to a shortage of monotonicity.
A novel guided deep learning algorithm to design low-cost SPP films
The design of surface plasmon polariton (SPP) films is an ill-posed inverse problem: there is a many-to-one correspondence between structures and user needs. We present a novel guided deep learning algorithm to find optimal solutions (with both high accuracy and low cost). To achieve this goal, we use a low-cost sample replacement algorithm in the training process, so that the deep CNN gradually learns a better model from samples with lower cost. We have successfully applied this algorithm to the design of low-cost SPP films. Our model learned to replace precious metals with ordinary metals to reduce cost, so the cost of the predicted structure is much lower than that of a standard deep CNN, while the average relative error of the spectrum is less than 5%. The source code is available at https://github.com/closest-git/MetaLab.
No-Regret Exploration in Goal-Oriented Reinforcement Learning
Tarbouriech, Jean, Garcelon, Evrard, Valko, Michal, Pirotta, Matteo, Lazaric, Alessandro
Many popular reinforcement learning problems (e.g., navigation in a maze, some Atari games, mountain car) are instances of the so-called episodic setting or stochastic shortest path (SSP) problem, where an agent has to achieve a predefined goal state (e.g., the top of the hill) while maximizing the cumulative reward or minimizing the cumulative cost. Despite its popularity, most of the literature studying the exploration-exploitation dilemma either focused on different problems (i.e., fixed-horizon and infinite-horizon) or made the restrictive loop-free assumption (which implies that no same state can be visited twice during any episode). In this paper, we study the general SSP setting and introduce the algorithm UC-SSP whose regret scales as $\displaystyle \widetilde{O}(c_{\max}^{3/2} c_{\min}^{-1/2} D S \sqrt{ A D K})$ after $K$ episodes for any unknown SSP with $S$ non-terminal states, $A$ actions, an SSP-diameter of $D$ and positive costs in $[c_{\min}, c_{\max}]$. UC-SSP is thus the first learning algorithm with vanishing regret in the theoretically challenging setting of episodic RL.
Detecting Cyberattacks in Industrial Control Systems Using Online Learning Algorithms
Li, Guangxia, Shen, Yulong, Zhao, Peilin, Lu, Xiao, Liu, Jia, Liu, Yangyang, Hoi, Steven C. H.
Industrial control systems are critical to the operation of industrial facilities, especially for critical infrastructures, such as refineries, power grids, and transportation systems. Similar to other information systems, a significant threat to industrial control systems is the attack from cyberspace--the offensive maneuvers launched by "anonymous" in the digital world that target computer-based assets with the goal of compromising a system's functions or probing for information. Owing to the importance of industrial control systems, and the possibly devastating consequences of being attacked, significant endeavors have been attempted to secure industrial control systems from cyberattacks. Among them are intrusion detection systems that serve as the first line of defense by monitoring and reporting potentially malicious activities. Classical machine-learning-based intrusion detection methods usually generate prediction models by learning modest-sized training samples all at once. Such an approach is not always applicable to industrial control systems, as industrial control systems must process continuous control commands with limited computational resources in a nonstop way. To satisfy such requirements, we propose using online learning to learn prediction models from the controlling data stream. We introduce several state-of-the-art online learning algorithms categorically, and illustrate their efficacies on two typically used testbeds--power system and gas pipeline. Further, we explore a new cost-sensitive online learning algorithm to solve the class-imbalance problem that is pervasive in industrial intrusion detection systems. Our experimental results indicate that the proposed algorithm can achieve an overall improvement in the detection rate of cyberattacks in industrial control systems.
Modern industrial control systems are microprocessor-equipped devices and associated communication networks used to monitor and operate physical equipment in the industrial environment.
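The abstract above does not specify the proposed cost-sensitive algorithm, so the following is only a minimal sketch of the general idea, using a plain perceptron as a stand-in: the model is updated one sample at a time from a stream, and mistakes on the rare attack class trigger a larger update than mistakes on the normal class. The function names, the cost values, and the toy stream are all hypothetical and are not taken from the paper.

```python
def train_cost_sensitive_perceptron(stream, n_features,
                                    attack_cost=5.0, normal_cost=1.0):
    """Single pass over (features, label) pairs; label is +1 (attack) or -1 (normal).

    Assumption: a perceptron with class-dependent step sizes stands in for
    the (unspecified) cost-sensitive online learner.
    """
    weights = [0.0] * n_features
    bias = 0.0
    for features, label in stream:
        score = sum(w * x for w, x in zip(weights, features)) + bias
        if label * score <= 0:  # mistake or zero margin: update the model
            # Missed attacks are penalised more heavily than false alarms.
            step = attack_cost if label == 1 else normal_cost
            weights = [w + step * label * x for w, x in zip(weights, features)]
            bias += step * label
    return weights, bias


def predict(weights, bias, features):
    """Return +1 (attack) if the linear score is positive, else -1 (normal)."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if score > 0 else -1


# Hypothetical two-feature stream: mostly normal traffic, few attacks.
stream = [
    ([0.0, 1.0], -1),
    ([0.1, 0.9], -1),
    ([1.0, 0.0], 1),
    ([0.0, 1.0], -1),
    ([0.9, 0.2], 1),
]
weights, bias = train_cost_sensitive_perceptron(stream, n_features=2)
```

Each sample is seen once and then discarded, which is what makes this style of learner suited to continuous command streams on limited hardware. The asymmetric step size shifts the decision boundary toward flagging attacks, at the price of more false alarms on normal traffic.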
Heuristic Approach for Jointly Optimizing FeICIC and UAV Locations in Multi-Tier LTE-Advanced Public Safety HetNet
Kumbhar, Abhaykumar, Binol, Hamidullah, Singh, Simran, Guvenc, Ismail, Akkaya, Kemal
UAV-enabled communications and networking can enhance wireless connectivity and support emerging services. However, this would require system-level understanding to modify and extend the existing terrestrial network infrastructure. In this paper, we integrate UAVs both as user equipment and base stations into an existing LTE-Advanced heterogeneous network (HetNet) and provide system-level insights of this three-tier LTE-Advanced air-ground HetNet (AG-HetNet). This AG-HetNet leverages cell range expansion (CRE), ICIC, 3D beamforming, and enhanced support for UAVs. Using system-level understanding and through a brute-force technique and heuristic algorithms, we evaluate the performance of AG-HetNet in terms of fifth percentile spectral efficiency (5pSE) and coverage probability. We compare 5pSE and coverage probability when aerial base-stations (UABS) are deployed on a fixed hexagonal grid and when their locations are optimized using a genetic algorithm (GA) and an elitist harmony search algorithm based on genetic algorithm (eHSGA). Our simulation results show that the heuristic algorithms outperform the brute-force technique and achieve better peak values of coverage probability and 5pSE. Simulation results also show that a trade-off exists between peak values and computation time when using heuristic algorithms. Furthermore, the three-tier hierarchical structuring of FeICIC provides considerably better 5pSE and coverage probability than eICIC.
Individual predictions matter: Assessing the effect of data ordering in training fine-tuned CNNs for medical imaging
Zech, John R., Forde, Jessica Zosa, Littman, Michael L.
We reproduced the results of CheXNet with fixed hyperparameters and 50 different random seeds to identify 14 findings in chest radiographs (x-rays). Because CheXNet fine-tunes a pre-trained DenseNet, the random seed affects the ordering of the batches of training data but not the initialized model weights. We found substantial variability in predictions for the same radiograph across model runs (mean ln[(maximum probability)/(minimum probability)] 2.45, coefficient of variation 0.543). This individual radiograph-level variability was not fully reflected in the variability of AUC on a large test set. Averaging predictions from 10 models reduced variability by nearly 70% (mean coefficient of variation from 0.543 to 0.169, t-test 15.96, p-value < 0.0001). We encourage researchers to be aware of the potential variability of CNNs and ensemble predictions from multiple models to minimize the effect this variability may have on the care of individual patients when these models are deployed clinically.
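The two variability measures named in the abstract above can be sketched for a single radiograph as follows. This is an illustrative sketch only: the probabilities are made up, the use of the sample standard deviation in the coefficient of variation is an assumption, and grouping consecutive runs into ensembles is one possible reading of "averaging predictions", not necessarily the authors' procedure.

```python
import math
import statistics


def log_max_min_ratio(probs):
    """ln(maximum probability / minimum probability) across model runs."""
    return math.log(max(probs) / min(probs))


def coefficient_of_variation(probs):
    """Sample standard deviation divided by the mean (assumed definition)."""
    return statistics.stdev(probs) / statistics.mean(probs)


def ensemble_means(probs, group_size):
    """Average consecutive, non-overlapping groups of `group_size` runs."""
    return [
        statistics.mean(probs[i:i + group_size])
        for i in range(0, len(probs) - group_size + 1, group_size)
    ]


# Hypothetical predicted probabilities for one finding on one radiograph,
# from six training runs that differ only in the random seed.
runs = [0.10, 0.30, 0.20, 0.40, 0.15, 0.35]
ensembles = ensemble_means(runs, 3)

print("single runs:", log_max_min_ratio(runs), coefficient_of_variation(runs))
print("ensembles:  ", log_max_min_ratio(ensembles), coefficient_of_variation(ensembles))
```

On this toy input both measures are smaller for the ensembled predictions than for the individual runs, which is the qualitative effect the abstract reports at a much larger scale.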
Exploring the Back Alleys: Analysing The Robustness of Alternative Neural Network Architectures against Adversarial Attacks
Tan, Yi Xiang Marcus, Elovici, Yuval, Binder, Alexander
Recent discoveries in the field of adversarial machine learning have shown that Artificial Neural Networks (ANNs) are susceptible to adversarial attacks. These attacks cause misclassification of specially crafted adversarial samples. In light of this phenomenon, it is worth investigating whether other types of neural networks are less susceptible to adversarial attacks. In this work, we applied standard attack methods originally aimed at conventional ANNs to stochastic ANNs and also to Spiking Neural Networks (SNNs), across three different datasets, namely MNIST, CIFAR-10 and Patch Camelyon. We analysed their adversarial robustness against attacks performed in the raw image space of the different model variants. We employ a variety of attacks, namely the Basic Iterative Method (BIM), the Carlini & Wagner L2 attack (CWL2) and the Boundary attack. Our results suggest that SNNs and stochastic ANNs exhibit some degree of adversarial robustness compared to their ANN counterparts under certain attack methods. Namely, we found that the Boundary and the state-of-the-art CWL2 attacks are largely ineffective against stochastic ANNs. Following this observation, we proposed a modified version of the CWL2 attack and analysed the impact of this attack on the models' adversarial robustness. Our results suggest that with this modified CWL2 attack, many models are more easily fooled than with the vanilla CWL2 attack, albeit with an increase in the L2 norms of the adversarial perturbations. Lastly, we also investigate the resilience of alternative neural networks against adversarial samples transferred from ResNet18. We show that the modified CWL2 attack provides improved cross-architecture transferability compared to other attacks.
PIDForest: Anomaly Detection via Partial Identification
Gopalan, Parikshit, Sharan, Vatsal, Wieder, Udi
We consider the problem of detecting anomalies in a large dataset. We propose a framework called Partial Identification which captures the intuition that anomalies are easy to distinguish from the overwhelming majority of points by relatively few attribute values. Formalizing this intuition, we propose a geometric anomaly measure for a point that we call PIDScore, which measures the minimum density of data points over all subcubes containing the point. We present PIDForest: a random forest based algorithm that finds anomalies based on this definition. We show that it performs favorably in comparison to several popular anomaly detection methods, across a broad range of benchmarks. PIDForest also provides a succinct explanation for why a point is labelled anomalous, by providing a set of features and ranges for them which are relatively uncommon in the dataset.
Neural Networks with Cheap Differential Operators
Chen, Ricky T. Q., Duvenaud, David
Gradients of neural networks can be computed efficiently for any architecture, but some applications require differential operators with higher time complexity. We describe a family of restricted neural network architectures that allow efficient computation of a family of differential operators involving dimension-wise derivatives, used in cases such as computing the divergence. Our proposed architecture has a Jacobian matrix composed of diagonal and hollow (non-diagonal) components. We can then modify the backward computation graph to extract dimension-wise derivatives efficiently with automatic differentiation. We demonstrate these cheap differential operators for solving root-finding subproblems in implicit ODE solvers, exact density evaluation for continuous normalizing flows, and evaluating the Fokker--Planck equation for training stochastic differential equation models.
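To make the quantity in the abstract above concrete: the divergence of a vector field f is the sum of its dimension-wise derivatives, the diagonal entries of the Jacobian. The sketch below computes it the naive way, with one pair of function evaluations per dimension using central finite differences. It is not the restricted architecture described in the abstract; it only illustrates the baseline cost that grows with dimension. The field and function names are hypothetical.

```python
def divergence(f, x, eps=1e-5):
    """Approximate sum_i d f_i / d x_i at point x with central differences.

    Cost: two evaluations of f per input dimension.
    """
    total = 0.0
    for i in range(len(x)):
        plus = list(x)
        minus = list(x)
        plus[i] += eps
        minus[i] -= eps
        # Only the i-th output component contributes to the i-th diagonal term.
        total += (f(plus)[i] - f(minus)[i]) / (2 * eps)
    return total


def field(x):
    """Toy field f(x, y) = (x^2 + y, x * y), whose divergence is 2x + x = 3x."""
    return [x[0] ** 2 + x[1], x[0] * x[1]]


approx = divergence(field, [2.0, 1.0])
print(approx)  # analytically 3 * 2.0 = 6.0, up to finite-difference error
```

For a d-dimensional input this loop needs 2d evaluations of the field, which is the kind of per-dimension cost that motivates architectures in which the dimension-wise derivatives can be read off more cheaply.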
Deep Variable-Block Chain with Adaptive Variable Selection
Zhang, Lixiang, Lin, Lin, Li, Jia
The architectures of deep neural networks (DNN) rely heavily on the underlying grid structure of variables, for instance, the lattice of pixels in an image. For general high dimensional data with variables not associated with a grid, the multi-layer perceptron and deep belief network are often used. However, it is frequently observed that those networks do not perform competitively and they are not helpful for identifying important variables. In this paper, we propose a framework that imposes on blocks of variables a chain structure obtained by step-wise greedy search so that the DNN architecture can leverage the constructed grid. We call this new neural network Deep Variable-Block Chain (DVC). Because the variable blocks are used for classification in a sequential manner, we further develop the capacity of selecting variables adaptively according to a number of regions trained by a decision tree. Our experiments show that DVC outperforms other generic DNNs and other strong classifiers. Moreover, DVC can achieve high accuracy at much reduced dimensionality and sometimes reveals drastically different sets of relevant variables for different regions.