Progressive Graph Convolutional Networks for Semi-Supervised Node Classification

arXiv.org Machine Learning

Graph convolutional networks have been successful in addressing graph-based tasks such as semi-supervised node classification. Existing methods use a network structure defined by the user based on experimentation with a fixed number of layers and employ a layer-wise propagation rule to obtain the node embeddings. Designing an automatic process to define a problem-dependent architecture for graph convolutional networks can greatly help to reduce the computational complexity of the training process. In this paper, we propose a method to automatically build compact and task-specific graph convolutional networks. Experimental results on widely used publicly available datasets indicate that the proposed method outperforms the related graph-based learning algorithms in terms of classification performance and network compactness.


A light neural network for modulation detection under impairments

arXiv.org Machine Learning

We present a neural network architecture able to efficiently detect modulation techniques in a portion of I/Q signals. This network is lighter by up to two orders of magnitude than other architectures working on the same or similar tasks. Moreover, the number of parameters does not depend on the signal duration, which allows processing streams of data and results in a signal-length invariant network. In addition, we develop a custom simulator able to model the different impairments the propagation channel and the demodulator can bring to the recorded I/Q signal: random phase shifts, delays, roll-off, sampling rates, and frequency offsets. We use this data set to train our neural network to be invariant to impairments and quantify its accuracy at distinguishing between modulations under realistic conditions.


MiLeNAS: Efficient Neural Architecture Search via Mixed-Level Reformulation

arXiv.org Machine Learning

Many recently proposed methods for Neural Architecture Search (NAS) can be formulated as bilevel optimization. For efficient implementation, its solution requires approximations of second-order methods. In this paper, we demonstrate that gradient errors caused by such approximations lead to suboptimality, in the sense that the optimization procedure fails to converge to a (locally) optimal solution. To remedy this, this paper proposes MiLeNAS, a mixed-level reformulation for NAS that can be optimized efficiently and reliably. It is shown that even when using a simple first-order method on the mixed-level formulation, MiLeNAS can achieve a lower validation error for NAS problems. Consequently, architectures obtained by our method achieve consistently higher accuracies than those obtained from bilevel optimization. Moreover, MiLeNAS proposes a framework beyond DARTS. It is upgraded via model size-based search and early stopping strategies to complete the search process in around 5 hours. Extensive experiments within the convolutional architecture search space validate the effectiveness of our approach.


Piecewise linear activations substantially shape the loss surfaces of neural networks

arXiv.org Machine Learning

Understanding the loss surface of a neural network is fundamentally important to the understanding of deep learning. This paper presents how piecewise linear activation functions substantially shape the loss surfaces of neural networks. We first prove that the loss surfaces of many neural networks have infinite spurious local minima, which are defined as the local minima with higher empirical risks than the global minima. Our result demonstrates that the networks with piecewise linear activations possess substantial differences to the well-studied linear neural networks. This result holds for any neural network with arbitrary depth and arbitrary piecewise linear activation functions (excluding linear functions) under most loss functions in practice. Essentially, the underlying assumptions are consistent with most practical circumstances where the output layer is narrower than any hidden layer. In addition, the loss surface of a neural network with piecewise linear activations is partitioned into multiple smooth and multilinear cells by nondifferentiable boundaries. The constructed spurious local minima are concentrated in one cell as a valley: they are connected with each other by a continuous path, on which empirical risk is invariant. Further for one-hidden-layer networks, we prove that all local minima in a cell constitute an equivalence class; they are concentrated in a valley; and they are all global minima in the cell.


Quantum Semantic Learning by Reverse Annealing an Adiabatic Quantum Computer

arXiv.org Machine Learning

Boltzmann Machines constitute a class of neural networks with applications to image reconstruction, pattern classification and unsupervised learning in general. Their most common variants, called Restricted Boltzmann Machines (RBMs), exhibit a good trade-off between computability on existing silicon-based hardware and generality of possible applications. Still, the diffusion of RBMs is quite limited, since their training process proves to be hard. The advent of commercial Adiabatic Quantum Computers (AQCs) raised the expectation that the implementations of RBMs on such quantum devices could increase the training speed with respect to conventional hardware. To date, however, the implementation of RBM networks on AQCs has been limited by the low qubit connectivity when each qubit acts as a node of the neural network. Here we demonstrate the feasibility of a complete RBM on AQCs, thanks to an embedding that associates its nodes to virtual qubits, thus outperforming previous implementations based on incomplete graphs. Moreover, to accelerate the learning, we implement a semantic quantum search which, contrary to previous proposals, takes the input data as initial boundary conditions to start each learning step of the RBM, thanks to a reverse annealing schedule. Such an approach, unlike the more conventional forward annealing schedule, allows sampling configurations in a meaningful neighborhood of the training data, mimicking the behavior of the classical Gibbs sampling algorithm. We show that the learning based on reverse annealing quickly raises the sampling probability of a meaningful subset of the set of the configurations. Even without a proper optimization of the annealing schedule, the RBM semantically trained by reverse annealing achieves better scores on reconstruction tasks.


Adaptive Reward-Poisoning Attacks against Reinforcement Learning

arXiv.org Artificial Intelligence

In reward-poisoning attacks against reinforcement learning (RL), an attacker can perturb the environment reward $r_t$ into $r_t+\delta_t$ at each step, with the goal of forcing the RL agent to learn a nefarious policy. We categorize such attacks by the infinity-norm constraint on $\delta_t$: We provide a lower threshold below which reward-poisoning attack is infeasible and RL is certified to be safe; we provide a corresponding upper threshold above which the attack is feasible. Feasible attacks can be further categorized as non-adaptive where $\delta_t$ depends only on $(s_t,a_t, s_{t+1})$, or adaptive where $\delta_t$ depends further on the RL agent's learning process at time $t$. Non-adaptive attacks have been the focus of prior works. However, we show that under mild conditions, adaptive attacks can achieve the nefarious policy in steps polynomial in state-space size $|S|$, whereas non-adaptive attacks require exponential steps. We provide a constructive proof that a Fast Adaptive Attack strategy achieves the polynomial rate. Finally, we show that empirically an attacker can find effective reward-poisoning attacks using state-of-the-art deep RL techniques.
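The reward perturbation described in this abstract can be illustrated with a minimal sketch. It only shows a non-adaptive attacker, whose $\delta_t$ is looked up from the transition $(s_t, a_t, s_{t+1})$ and clipped to an infinity-norm bound. The names `clip_perturbation`, `poisoned_reward`, `delta_table`, and `bound` are hypothetical and do not come from the paper, and the sketch does not implement the paper's Fast Adaptive Attack.

```python
from typing import Dict, Tuple

# A transition (s_t, a_t, s_{t+1}) with integer state and action ids (assumption).
Transition = Tuple[int, int, int]


def clip_perturbation(delta: float, bound: float) -> float:
    """Enforce the infinity-norm constraint |delta_t| <= bound."""
    return max(-bound, min(bound, delta))


def poisoned_reward(
    reward: float,
    transition: Transition,
    delta_table: Dict[Transition, float],
    bound: float,
) -> float:
    """Return r_t + delta_t for a non-adaptive attacker.

    delta_table is a hypothetical lookup that depends only on the
    transition; transitions missing from it are left unperturbed.
    """
    delta = clip_perturbation(delta_table.get(transition, 0.0), bound)
    return reward + delta


if __name__ == "__main__":
    table = {(0, 1, 2): 5.0, (1, 0, 1): -0.2}
    # The requested perturbation 5.0 exceeds the bound and is clipped to 0.5.
    print(poisoned_reward(1.0, (0, 1, 2), table, bound=0.5))
```

An adaptive attacker would replace the lookup table with a function that also reads the agent's current learning state, which this sketch deliberately leaves out.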


Identification of Choquet capacity in multicriteria sorting problems through stochastic inverse analysis

arXiv.org Artificial Intelligence

In multicriteria decision aiding (MCDA), the Choquet integral has been used as an aggregation operator to deal with the case of interacting decision criteria. While the application of the Choquet integral to ranking problems has been receiving most of the attention, this paper instead focuses on multicriteria sorting problems (MCSP). In the Choquet integral context, a practical problem that arises is related to the elicitation of parameters known as the Choquet capacities. We address the problem of Choquet capacity identification for MCSP by applying Stochastic Multicriteria Acceptability Analysis (SMAA), proposing the SMAA-S-Choquet method. The proposed method is also able to model uncertain data that may be present in both the decision matrix and the limiting profiles, the latter being a parameter associated with the sorting problematic. We also introduce two new descriptive measures in order to conduct reverse analysis regarding the capacities: the Scenario Acceptability Index and the Scenario Central Capacity vector.
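For readers unfamiliar with the aggregation operator named in this abstract, the following is a minimal sketch of the standard discrete Choquet integral for non-negative criterion values. It is not the SMAA-S-Choquet method itself; the function name `choquet_integral` and the dictionary-based capacity representation are assumptions made for illustration.

```python
from typing import Dict, FrozenSet


def choquet_integral(
    values: Dict[str, float],
    capacity: Dict[FrozenSet[str], float],
) -> float:
    """Discrete Choquet integral of non-negative criterion values.

    values maps each criterion to its score. capacity maps coalitions
    (frozensets of criteria) to their weight; it must contain every
    coalition formed by the top-k criteria in ascending score order.
    """
    order = sorted(values, key=values.get)  # criteria by ascending score
    total = 0.0
    previous = 0.0
    for i, criterion in enumerate(order):
        # Coalition of all criteria scoring at least as high as this one.
        coalition = frozenset(order[i:])
        total += (values[criterion] - previous) * capacity[coalition]
        previous = values[criterion]
    return total


if __name__ == "__main__":
    scores = {"a": 0.5, "b": 1.0}
    # A non-additive capacity: each criterion alone counts for little,
    # but the pair together has full weight (positive interaction).
    mu = {
        frozenset({"a"}): 0.2,
        frozenset({"b"}): 0.2,
        frozenset({"a", "b"}): 1.0,
    }
    print(choquet_integral(scores, mu))
```

When the capacity is additive, the result reduces to an ordinary weighted sum, which is why capacities are the parameters that encode interaction between criteria.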


Generation of Consistent Sets of Multi-Label Classification Rules with a Multi-Objective Evolutionary Algorithm

arXiv.org Artificial Intelligence

Multi-label classification consists in classifying an instance into two or more classes simultaneously. It is a very challenging task present in many real-world applications, such as classification of biological, image, video, audio, and text data. Recently, the interest in interpretable classification models has grown, partially as a consequence of regulations such as the General Data Protection Regulation. In this context, we propose a multi-objective evolutionary algorithm that generates multiple rule-based multi-label classification models, allowing users to choose among models that offer different compromises between predictive power and interpretability. An important contribution of this work is that, unlike most algorithms, which usually generate models based on lists (ordered collections) of rules, our algorithm generates models based on sets (unordered collections) of rules, increasing interpretability. Also, by employing a conflict avoidance algorithm during rule creation, every rule within a given model is guaranteed to be consistent with every other rule in the same model. Thus, no conflict resolution strategy is required, evolving simpler models. We conducted experiments on synthetic and real-world datasets, compared our results with state-of-the-art algorithms in terms of predictive performance (F-Score) and interpretability (model size), and demonstrated that our best models had comparable F-Score and smaller model sizes.


Bayesian Hierarchical Multi-Objective Optimization for Vehicle Parking Route Discovery

arXiv.org Artificial Intelligence

Discovering an optimal route to the most feasible parking lot is a concern for any driver, and the problem worsens during peak hours of the day and at congested places, leading to considerable waste of time and fuel. This paper proposes a Bayesian hierarchical technique for obtaining an optimal route to a parking lot. The route selection is based on conflicting objectives and hence the problem belongs to the domain of multi-objective optimization. A probabilistic data-driven method has been used to overcome the inherent problem of weight selection in the popular weighted sum technique. The weights of these conflicting objectives have been refined using a Bayesian hierarchical model based on Multinomial and Dirichlet prior. A genetic algorithm has been used to obtain optimal solutions. Simulated data has been used to obtain routes which are in close agreement with real-life situations.
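One way to picture data-driven weight selection for a weighted sum is the textbook Dirichlet-Multinomial update, sketched below. This is an assumption about the general idea, not the paper's hierarchical model: the function names, the prior values, and the notion of counting how often each objective was preferred are all hypothetical.

```python
from typing import List, Sequence


def posterior_mean_weights(
    prior_alpha: Sequence[float],
    counts: Sequence[int],
) -> List[float]:
    """Posterior mean of a Dirichlet prior after Multinomial counts.

    With prior Dirichlet(alpha) and observed counts n, the posterior is
    Dirichlet(alpha + n), whose mean is (alpha_i + n_i) / sum(alpha + n).
    """
    posterior = [a + c for a, c in zip(prior_alpha, counts)]
    total = sum(posterior)
    return [p / total for p in posterior]


def weighted_sum_cost(objectives: Sequence[float], weights: Sequence[float]) -> float:
    """Scalarize several objective values into one cost to minimize."""
    return sum(w * o for w, o in zip(weights, objectives))


if __name__ == "__main__":
    # Hypothetical objectives for one candidate route: distance, time, fuel.
    route = [2.0, 4.0, 6.0]
    # Uniform prior, then counts of how often each objective mattered most.
    weights = posterior_mean_weights([1.0, 1.0, 1.0], [2, 1, 0])
    print(weights, weighted_sum_cost(route, weights))
```

A search procedure such as a genetic algorithm could then use `weighted_sum_cost` as its fitness function, with the weights refreshed as more data arrives.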


Rolling Horizon Evolutionary Algorithms for General Video Game Playing

arXiv.org Artificial Intelligence

Game-playing Evolutionary Algorithms, specifically Rolling Horizon Evolutionary Algorithms, have recently managed to beat the state of the art in performance across many games. However, the best results per game are highly dependent on the specific configuration of modifications and hybrids introduced over several works, each described as parameters in the algorithm. Moreover, the search for the best parameters has been reduced to several human-picked combinations, as the possibility space has grown beyond exhaustive search. This paper presents the state of the art in Rolling Horizon Evolutionary Algorithms, combining all modifications described in the literature and some additional ones into one large hybrid. It then uses a parameter optimiser, the N-Tuple Bandit Evolutionary Algorithm, to find the best combination of parameters in 20 games with various properties from the General Video Game AI Framework. We highlight the resulting noisy optimisation problem, as both the games and the algorithm being optimised are stochastic. We then analyse the algorithm's parameters and interesting combinations revealed through the parameter optimisation process. Lastly, we show that it is possible to automatically explore a large parameter space and find configurations which outperform the state of the art on several games.