Model-Free Learning of Optimal Ergodic Policies in Wireless Systems
Kalogerias, Dionysios S., Eisen, Mark, Pappas, George J., Ribeiro, Alejandro
Learning optimal resource allocation policies in wireless systems can be effectively achieved by formulating finite dimensional constrained programs which depend on system configuration, as well as the adopted learning parameterization. The interest here is in cases where system models are unavailable, prompting methods that probe the wireless system with candidate policies, and then use observed performance to determine better policies. This generic procedure is difficult because of the need to cull accurate gradient estimates out of these limited system queries. This paper constructs and exploits smoothed surrogates of constrained ergodic resource allocation problems, the gradients of the former being representable exactly as averages of finite differences that can be obtained through limited system probing. Leveraging this unique property, we develop a new model-free primal-dual algorithm for learning optimal ergodic resource allocations, while we rigorously analyze the relationships between original policy search problems and their surrogates, in both primal and dual domains. First, we show that both primal and dual domain surrogates are uniformly consistent approximations of their corresponding original finite dimensional counterparts. Upon further assuming the use of near-universal policy parameterizations, we also develop explicit bounds on the gap between optimal values of initial, infinite dimensional resource allocation problems, and dual values of their parameterized smoothed surrogates. In fact, we show that this duality gap decreases at a linear rate relative to smoothing and universality parameters. Thus, it can be made arbitrarily small at will, also justifying our proposed primal-dual algorithmic recipe. Numerical simulations confirm the effectiveness of our approach.
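The abstract's central idea, a smoothed surrogate whose gradient is an average of finite differences obtained by probing, can be illustrated with a minimal sketch. It assumes Gaussian smoothing, which the abstract does not specify, and uses a hypothetical `black_box` function in place of a real wireless system. It covers only the gradient estimate, not the primal-dual algorithm. The function names, `mu`, and `num_probes` are illustrative choices.

```python
import random


def smoothed_grad_estimate(f, x, mu=1e-2, num_probes=4000, rng=None):
    """Estimate the gradient of the Gaussian-smoothed surrogate of f at x.

    Only evaluations of f are used (no model, no analytic gradient):
    the estimate averages finite differences along random directions.
    """
    rng = rng or random.Random(0)
    fx = f(x)  # baseline query at the current point
    grad = [0.0] * len(x)
    for _ in range(num_probes):
        # Random Gaussian probing direction (assumed smoothing distribution).
        u = [rng.gauss(0.0, 1.0) for _ in x]
        probe = [xi + mu * ui for xi, ui in zip(x, u)]
        diff = (f(probe) - fx) / mu  # finite difference along u
        for i, ui in enumerate(u):
            grad[i] += diff * ui / num_probes
    return grad


def black_box(x):
    # Hypothetical stand-in for an observed system performance measure.
    return sum(xi * xi for xi in x)


# The true gradient of black_box at (1, -2) is (2, -4); the estimate is noisy.
estimate = smoothed_grad_estimate(black_box, [1.0, -2.0])
```

The estimate is a Monte Carlo average, so its accuracy improves with the number of probes, while `mu` controls how far the surrogate departs from the original function.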
Preventing Posterior Collapse in Sequence VAEs with Pooling
Long, Teng, Cao, Yanshuai, Cheung, Jackie Chi Kit
Variational Autoencoders (VAEs) hold great potential for modelling text, as they could in theory separate high-level semantic and syntactic properties from local regularities of natural language. Practically, however, VAEs with autoregressive decoders often suffer from posterior collapse, a phenomenon where the model learns to ignore the latent variables, causing the sequence VAE to degenerate into a language model. Previous works attempt to solve this problem with complex architectural changes or costly optimization schemes. In this paper, we argue that posterior collapse is caused in part by the encoder network failing to capture the input variabilities. We verify this hypothesis empirically and propose a straightforward fix using pooling. This simple technique effectively prevents posterior collapse, allowing the model to achieve significantly better data log-likelihood than standard sequence VAEs. Compared to the previous SOTA on preventing posterior collapse, we achieve comparable performance while being significantly faster.
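As a rough illustration of what pooling can mean for a sequence encoder, the sketch below contrasts taking only the last hidden state with pooling over all time steps. It assumes the pooling is applied across the encoder's per-step hidden states, which the abstract does not spell out. The function names and toy values are hypothetical.

```python
def last_state(hidden_states):
    """Use only the final time step as the sequence summary."""
    return hidden_states[-1]


def mean_pool(hidden_states):
    """Average each feature dimension over all time steps."""
    steps = len(hidden_states)
    return [sum(column) / steps for column in zip(*hidden_states)]


def max_pool(hidden_states):
    """Take the per-dimension maximum over all time steps."""
    return [max(column) for column in zip(*hidden_states)]


# Toy hidden states: 3 time steps, 2 feature dimensions each.
states = [[0.0, 2.0], [1.0, -1.0], [2.0, 0.5]]

summary_last = last_state(states)  # [2.0, 0.5]
summary_mean = mean_pool(states)   # [1.0, 0.5]
summary_max = max_pool(states)     # [2.0, 2.0]
```

Unlike the last state, the pooled summaries depend on every time step, so information from early in the sequence reaches the summary directly.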
IrisNet: Deep Learning for Automatic and Real-time Tongue Contour Tracking in Ultrasound Video Data using Peripheral Vision
Mozaffari, M. Hamed, Ratul, Md. Aminur Rab, Lee, Won-Sook
The progress of deep convolutional neural networks has been successfully exploited in various real-time computer vision tasks such as image classification and segmentation. Owing to the development of computational units, availability of digital datasets, and improved performance of deep learning models, fully automatic and accurate tracking of tongue contours in real-time ultrasound data became practical only in recent years. Recent studies have shown that the performance of deep learning techniques is significant in the tracking of ultrasound tongue contours in real-time applications such as pronunciation training using multimodal ultrasound-enhanced approaches. Due to the high correlation between ultrasound tongue datasets, it is feasible to have a general model that accomplishes automatic tongue tracking for almost all datasets. In this paper, we propose a deep learning model comprising a convolutional module that mimics the peripheral vision ability of the human eye to handle real-time, accurate, and fully automatic tongue contour tracking tasks, applicable to almost all primary ultrasound tongue datasets. Qualitative and quantitative assessment of IrisNet on different ultrasound tongue datasets and PASCAL VOC2012 revealed its outstanding generalization in comparison with similar techniques.
TSK-Streams: Learning TSK Fuzzy Systems on Data Streams
Shaker, Ammar, Hüllermeier, Eyke
In many practical applications of machine learning and predictive modeling, data is produced incrementally in the course of time and observed in the form of a continuous, potentially unbounded stream of observations. Correspondingly, the problem of learning from data streams has recently received increasing attention (Gama, 2012). Algorithms for learning on streams must be able to process the data in a single pass, which implies an incremental mode of learning, and to adapt to changes of the underlying data-generating process (Domingos and Hulten, 2003). A popular approach for learning on data streams, both for classification and regression, is rule induction, in the fuzzy logic and computational intelligence community also known as "evolving fuzzy systems" (Lughofer, 2011). Shaker et al. (2017) proposed a method for regression that builds on a very efficient and effective technique for rule induction, which is inspired by the state-of-the-art machine learning algorithm AMRules, and combines it with the strengths of fuzzy modeling. Thus, the method induces a set of fuzzy rules, which, compared to conventional rules with Boolean antecedents, has the advantage of producing smooth regression functions. The method presented in this paper, called TSK-Streams, is a revised and improved variant. The main modifications and novel contributions are as follows.
Interpretable Multiple-Kernel Prototype Learning for Discriminative Representation and Feature Selection
Hosseini, Babak, Hammer, Barbara
Prototype-based methods are of particular interest to domain specialists and practitioners as they summarize a dataset by a small set of representatives. Therefore, in a classification setting, interpretability of the prototypes is as significant as the prediction accuracy of the algorithm. Nevertheless, the state-of-the-art methods make inefficient trade-offs between these concerns by sacrificing one in favor of the other, especially if the given data has a kernel-based representation. In this paper, we propose a novel interpretable multiple-kernel prototype learning (IMKPL) method to construct highly interpretable prototypes in the feature space, which are also efficient for the discriminative representation of the data. Our method focuses on the local discrimination of the classes in the feature space and on shaping the prototypes based on condensed class-homogeneous neighborhoods of data. In addition, IMKPL learns a combined embedding in the feature space in which the above objectives are better fulfilled. When the base kernels coincide with the data dimensions, this embedding results in a discriminative feature selection. We evaluate IMKPL on several benchmarks from different domains, which demonstrate its superiority to the related state-of-the-art methods regarding both interpretability and discriminative representation.
Symmetrical Gaussian Error Linear Units (SGELUs)
In this paper, a novel neural network activation function, called Symmetrical Gaussian Error Linear Unit (SGELU), is proposed to obtain high performance. It is achieved by effectively integrating the property of the stochastic regularizer in the Gaussian Error Linear Unit (GELU) with symmetrical characteristics. Combining these two merits, the proposed unit introduces the capability of bidirectional convergence to successfully optimize the network without the gradient diminishing problem. Evaluations of SGELU against GELU and the Linearly Scaled Hyperbolic Tangent (LiSHT) have been carried out on MNIST classification and an MNIST auto-encoder, which validate the proposed unit in terms of performance and convergence rate across these applications.
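For orientation, the sketch below implements the standard GELU, x · Φ(x), next to one symmetric variant. The abstract does not state the SGELU formula, so the form `alpha * x * erf(x / sqrt(2))` and the scale parameter `alpha` are assumptions made purely for illustration, chosen because this form is an even function of its input.

```python
import math


def gelu(x):
    """Standard GELU: x times the standard normal CDF at x."""
    return x * 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def symmetric_gelu(x, alpha=1.0):
    """Assumed symmetric variant (not confirmed by the abstract).

    Both x and erf are odd functions, so their product is even:
    symmetric_gelu(-x) == symmetric_gelu(x).
    """
    return alpha * x * math.erf(x / math.sqrt(2.0))


# GELU treats negative and positive inputs differently, while the
# assumed symmetric form returns the same value for x and -x.
values = [(x, gelu(x), symmetric_gelu(x)) for x in (-2.0, -0.5, 0.0, 0.5, 2.0)]
```

Under this assumed form, negative pre-activations produce the same output as positive ones of equal magnitude, which is one way to read the "symmetrical characteristics" the abstract mentions.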
Improving Node Classification by Co-training Node Pair Classification: A Novel Training Framework for General Graph Neural Networks
Chen, Deli, Liu, Xiaoqian, Lin, Yankai, Li, Peng, Zhou, Jie, Su, Qi, Sun, Xu
Semi-supervised learning is a widely used training framework for graph node classification. However, this learning method has two problems: (1) the original graph topology may not be perfectly aligned with the node classification task; (2) the supervision information in the training set has not been fully used. To tackle these two problems, we design a new task, node pair classification, to assist in training GNN models for the target node classification task. We further propose a novel training framework named Adaptive Co-training, which jointly trains the node classification and the node pair classification after the optimization of graph topology. Extensive experimental results on four representative GNN models demonstrate that our proposed training framework significantly outperforms baseline methods across three benchmark graph datasets.
Deep Reinforcement Learning Based Dynamic Trajectory Control for UAV-assisted Mobile Edge Computing
Wang, Liang, Wang, Kezhi, Pan, Cunhua, Xu, Wei, Aslam, Nauman, Nallanathan, Arumugam
In this paper, we consider a platform of flying mobile edge computing (F-MEC), where unmanned aerial vehicles (UAVs) serve as equipment providing computation resources and enable task offloading from user equipment (UE). We aim to minimize the energy consumption of all the UEs by optimizing the user association, resource allocation, and the trajectory of the UAVs. To this end, we first propose a Convex optimizAtion based Trajectory control algorithm (CAT), which solves the problem iteratively using the block coordinate descent (BCD) method. Then, to make real-time decisions while taking into account the dynamics of the environment (i.e., UAVs may take off from different locations), we propose a deep Reinforcement leArning based Trajectory control algorithm (RAT). In RAT, we apply Prioritized Experience Replay (PER) to improve the convergence of the training procedure. Unlike the convex optimization based algorithm, which may be susceptible to the initial points and requires iterations, RAT can be adapted to any take-off points of the UAVs and can obtain the solution more rapidly than CAT once the training process has been completed. Simulation results show that the proposed CAT and RAT achieve similar performance and both outperform traditional algorithms.
Liang, Kezhi and Nauman are with the Department of Computer and Information Science, Northumbria University, Newcastle upon Tyne, UK, NE1 8ST. Cunhua and Arumugam are with the School of Electronic Engineering and Computer Science, Queen Mary University of London, E1 4NS, U.K. Wei is with the National Mobile Communications Research Lab, Southeast University, China.
INTRODUCTION
With the popularity of computationally intensive tasks, e.g., smart navigation and augmented reality, people are expecting to enjoy a more convenient life than ever before. However, current smart devices and user equipment (UEs), due to their small size and limited resources, e.g., computation and battery, may not be able to provide satisfactory Quality of Service (QoS) and Quality of Experience (QoE) in executing those highly demanding tasks. Mobile edge computing (MEC) has been proposed to move computation resources to the network edge, and it has been shown to greatly enhance a UE's ability to execute computation-hungry tasks [1].
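The abstract of this entry notes that RAT applies Prioritized Experience Replay (PER). The sketch below shows the proportional-prioritization sampling rule commonly associated with PER: transitions are drawn with probability proportional to priority raised to `alpha`, and importance weights correct for the resulting bias. The exponents, function names, and toy priorities are illustrative assumptions, not values from the paper. Normalizing weights by the batch maximum is one common convention.

```python
import random


def sampling_probabilities(priorities, alpha=0.6):
    """P(i) = p_i**alpha / sum_k p_k**alpha (alpha=0 gives uniform sampling)."""
    scaled = [p ** alpha for p in priorities]
    total = sum(scaled)
    return [s / total for s in scaled]


def sample_batch(priorities, batch_size, alpha=0.6, beta=0.4, rng=None):
    """Draw replay indices by priority and return importance weights."""
    rng = rng or random.Random(0)
    probs = sampling_probabilities(priorities, alpha)
    n = len(priorities)
    indices = rng.choices(range(n), weights=probs, k=batch_size)
    # Importance weights w_i = (N * P(i))**(-beta), scaled so the largest
    # weight in this batch equals 1.
    weights = [(n * probs[i]) ** (-beta) for i in indices]
    max_weight = max(weights)
    return indices, [w / max_weight for w in weights]


# Toy priorities, e.g. absolute TD errors of three stored transitions.
toy_priorities = [1.0, 2.0, 4.0]
toy_probs = sampling_probabilities(toy_priorities)
toy_indices, toy_weights = sample_batch(toy_priorities, batch_size=5)
```

Transitions with larger priority are replayed more often, and the importance weights down-weight their updates so that the learning signal stays closer to what uniform sampling would give.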
Pre-train and Plug-in: Flexible Conditional Text Generation with Variational Auto-Encoders
Duan, Yu, Pei, Jiaxin, Xu, Canwen, Li, Chenliang
Current neural Natural Language Generation (NLG) models cannot handle emerging conditions due to their joint end-to-end learning fashion. When the need for generating text under a new condition emerges, these techniques require not only sufficient supplementary labeled data but also a full re-training of the existing model. In this paper, we present a new framework named Hierarchical Neural Auto-Encoder (HAE) toward flexible conditional text generation. HAE decouples the text generation module from the condition representation module to allow "one-to-many" conditional generation. When a fresh condition emerges, only a lightweight network needs to be trained, which then works as a plug-in for HAE; this is efficient and desirable for real-world applications. Extensive experiments demonstrate the superiority of HAE over the existing alternatives, with much less training time and fewer model parameters.
Location Attention for Extrapolation to Longer Sequences
Dubois, Yann, Dagan, Gautier, Hupkes, Dieuwke, Bruni, Elia
Neural networks are surprisingly good at interpolating and perform remarkably well when the training set examples resemble those in the test set. However, they are often unable to extrapolate patterns beyond the seen data, even when the abstractions required for such patterns are simple. In this paper, we first review the notion of extrapolation, why it is important and how one could hope to tackle it. We then focus on a specific type of extrapolation which is especially useful for natural language processing: generalization to sequences that are longer than the training ones. We hypothesize that models with separate content- and location-based attention are more likely to extrapolate than those with common attention mechanisms. We empirically support our claim for recurrent seq2seq models with our proposed attention on variants of the Lookup Table task. This sheds light on some striking failures of neural models for sequences and on possible methods for approaching such issues.