Calgary
Language Both Enraptures and Deceives Us - Issue 76: Language
The purpose of language is to reveal the contents of our minds, says Julie Sedivy. We are social animals and language is what springs us from our isolated selves and connects us with others. Sedivy has taught linguistics and psychology at Brown University and the University of Calgary. She specializes in psycholinguistics, the psychology of language, notably the psychological pressures that give birth to language and comprehension.
Teaching technological stewardship makes future engineers more agile and responsible
The scale, pace and breadth of technological development in artificial intelligence, robotics, computing, biotechnology, materials science and beyond have ushered in the Fourth Industrial Revolution. In an interview with journalist Thomas Friedman, Google executive Eric Teller argues that humanity's 21st century challenge is to become as good at shaping the positive impacts of technologies as we are at inventing the technologies in the first place. Teller says the problem is that the political, economic, legal, organizational and educational systems in which we operate are not agile enough to respond to the scale and pace of technological change. My professional life is focused on how to educate aspiring engineers to be agile. I teach ethics, professionalism and communication in the Faculty of Engineering and Applied Science at Memorial University.
Universal mCloud Strengthens AssetCare Business for Oil and Gas with Key Appointment and New Hire
Universal mCloud is creating a more efficient future with the use of AI and analytics, curbing energy waste, maximizing energy production, and getting the most out of critical energy infrastructure. Through mCloud's AI-powered AssetCare platform, mCloud offers complete asset management solutions to three distinct segments: smart facilities, power generation, and process industries including oil and gas. IoT sensors bring data from connected assets into the cloud, where AI and analytics are applied to maximize their performance. Headquartered in Vancouver, Canada with offices in locations worldwide including Calgary, San Francisco, and Beijing, the mCloud family includes an ecosystem of operating subsidiaries that deliver high-performance IoT, AI, 3D, and mobile capabilities to customers, all integrated into AssetCare. With over 100 blue-chip customers and more than 35,000 assets connected in thousands of locations worldwide, mCloud is changing the way energy assets are managed.
Standalone and RTK GNSS on 30,000 km of North American Highways
Reid, Tyler G. R., Pervez, Nahid, Ibrahim, Umair, Houts, Sarah E., Pandey, Gaurav, Alla, Naveen K. R., Hsia, Andy
There is a growing need for vehicle positioning information to support Advanced Driver Assistance Systems (ADAS), Connectivity (V2X), and Automated Driving (AD) features. These needs range from road determination (<5 meters) and lane determination (<1.5 meters) to determining where the vehicle is within the lane (<0.3 meters). This work examines the performance of Global Navigation Satellite Systems (GNSS) on 30,000 km of North American highways to better understand the automotive positioning needs it meets today and what might be possible in the near future with wide area GNSS correction services and multi-frequency receivers. This includes data from a representative automotive production GNSS used primarily for turn-by-turn navigation as well as an Inertial Navigation System which couples two survey grade GNSS receivers with a tactical grade Inertial Measurement Unit (IMU) to act as ground truth. The latter utilized networked Real-Time Kinematic (RTK) GNSS corrections delivered over a cellular modem in real-time. We assess on-road GNSS accuracy, availability, and continuity. Availability and continuity are broken down in terms of satellite visibility, satellite geometry, position type (RTK fixed, RTK float, or standard positioning), and RTK correction latency over the network. Results show that current automotive solutions are best suited to meet road determination requirements at 98% availability but are less suitable for lane determination at 57%. Multi-frequency receivers with RTK corrections were found more capable with road determination at 99.5%, lane determination at 98%, and highway-level lane departure protection at 91%.
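As a rough illustration of how the accuracy thresholds above translate into availability percentages, the sketch below counts the fraction of position epochs whose horizontal error falls under each threshold. The thresholds come from the abstract; the function names, the dictionary keys, and the sample error values are hypothetical, and the paper's own availability definition may include further conditions (for example satellite geometry or position type).

```python
# Thresholds from the abstract, as upper bounds on horizontal error in meters.
REQUIREMENTS_M = {
    "road": 5.0,      # road determination
    "lane": 1.5,      # lane determination
    "in_lane": 0.3,   # position within the lane
}


def availability(errors_m, threshold_m):
    """Fraction of epochs whose horizontal error is below threshold_m."""
    if not errors_m:
        raise ValueError("need at least one error sample")
    return sum(1 for e in errors_m if e < threshold_m) / len(errors_m)


def summarize(errors_m):
    """Availability for every requirement level (illustrative helper)."""
    return {name: availability(errors_m, t) for name, t in REQUIREMENTS_M.items()}


# Hypothetical per-epoch errors against ground truth, in meters.
sample_errors_m = [0.1, 0.25, 0.8, 1.2, 2.0, 4.0, 6.5, 0.2]
print(summarize(sample_errors_m))
```

With real data, `errors_m` would be the per-epoch difference between the receiver under test and the ground-truth system.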
On the Veracity of Cyber Intrusion Alerts Synthesized by Generative Adversarial Networks
Sweet, Christopher, Moskal, Stephen, Yang, Shanchieh Jay
Recreating cyber-attack alert data with a high level of fidelity is challenging due to the intricate interaction between features, non-homogeneity of alerts, and potential for rare yet critical samples. Generative Adversarial Networks (GANs) have been shown to effectively learn complex data distributions with the intent of creating increasingly realistic data. This paper presents the application of GANs to cyber-attack alert data and shows that GANs not only successfully learn to generate realistic alerts, but also reveal feature dependencies within alerts. This is accomplished by reviewing the intersection of histograms for varying alert-feature combinations between the ground truth and generated datasets. Traditional statistical metrics, such as conditional and joint entropy, are also employed to verify the accuracy of these dependencies. Finally, it is shown that a Mutual Information constraint on the network can be used to increase the generation of low probability, critical, alert values. By mapping alerts to a set of attack stages it is shown that the output of these low probability alerts has a direct contextual meaning for Cyber Security analysts. Overall, this work provides the basis for generating new cyber intrusion alerts and provides evidence that synthesized alerts emulate critical dependencies from the source dataset.
INTRODUCTION
Classifying, predicting, and generating cyber-attack alert data presents a unique set of challenges due to imbalance and a lack of homogeneity in alert datasets. Compounding these challenges, critical exploits in a network are often rare and difficult to identify. Despite this, it has been shown that alert data can be used to identify anomalous traffic [1] [2] [3], find network vulnerabilities [4], and profile bad actor behavior [5]. However, to fully realize the potential of cyber-attack alert data, a means to acquire more data and analyze critical dependencies within alerts is needed.
This work seeks to provide solutions to these challenges by showing that deep learning models are able to recreate cyber-attack alert data when given representative real world data. This includes a means for driving better coverage of the feature domain in model outputs, allowing more rare but critical events to be synthesized.
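The two families of fidelity checks named in the abstract, histogram intersection over alert-feature combinations and conditional/joint entropy, can be sketched as follows. This is a minimal illustration using textbook definitions, not the authors' evaluation code; the function names and the toy alerts (destination port, alert category) are hypothetical.

```python
from collections import Counter
from math import log2


def normalized_histogram(values):
    """Map each distinct value to its relative frequency."""
    counts = Counter(values)
    total = sum(counts.values())
    return {k: c / total for k, c in counts.items()}


def histogram_intersection(real, generated):
    """Sum of bin-wise minima of two normalized histograms (1.0 = identical)."""
    p = normalized_histogram(real)
    q = normalized_histogram(generated)
    return sum(min(p.get(k, 0.0), q.get(k, 0.0)) for k in set(p) | set(q))


def entropy(values):
    """Shannon entropy in bits of the empirical distribution of values."""
    return -sum(p * log2(p) for p in normalized_histogram(values).values())


def joint_entropy(xs, ys):
    """H(X, Y), computed over paired observations."""
    return entropy(list(zip(xs, ys)))


def conditional_entropy(ys, xs):
    """H(Y | X) = H(X, Y) - H(X); low values mean X largely determines Y."""
    return joint_entropy(xs, ys) - entropy(xs)


# Hypothetical alerts as (destination port, alert category) combinations.
real = [(80, "scan"), (80, "scan"), (22, "brute"), (443, "exploit")]
fake = [(80, "scan"), (22, "brute"), (22, "brute"), (80, "exploit")]
print(histogram_intersection(real, fake))
```

Treating each feature combination as a single tuple-valued bin lets the same intersection function compare one feature or several at once. Comparing `conditional_entropy` between the real and generated sets is one way to check whether a dependency between two features has been preserved.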
DNN-based Speaker Embedding Using Subjective Inter-speaker Similarity for Multi-speaker Modeling in Speech Synthesis
Saito, Yuki, Takamichi, Shinnosuke, Saruwatari, Hiroshi
This paper proposes novel algorithms for speaker embedding using subjective inter-speaker similarity based on deep neural networks (DNNs). Although conventional DNN-based speaker embedding such as a $d$-vector can be applied to multi-speaker modeling in speech synthesis, it does not correlate with the subjective inter-speaker similarity and is not necessarily an appropriate speaker representation for open speakers whose speech utterances are not included in the training data. We propose two training algorithms for a DNN-based speaker embedding model using an inter-speaker similarity matrix obtained by large-scale subjective scoring. One is based on similarity vector embedding and trains the model to predict a vector of the similarity matrix as speaker representation. The other is based on similarity matrix embedding and trains the model to minimize the squared Frobenius norm between the similarity matrix and the Gram matrix of $d$-vectors, i.e., the inter-speaker similarity derived from the $d$-vectors. We crowdsourced the inter-speaker similarity scores of 153 Japanese female speakers, and the experimental results demonstrate that our algorithms learn speaker embedding that is highly correlated with the subjective similarity. We also apply the proposed speaker embedding to multi-speaker modeling in DNN-based speech synthesis and reveal that the proposed similarity vector embedding improves synthetic speech quality for open speakers whose speech utterances are unseen during the training.
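The similarity matrix embedding objective described above can be sketched in a few lines: form the Gram matrix of the $d$-vectors and take the squared Frobenius norm of its difference from the subjective similarity matrix. This is only an illustration of the stated loss; the function names and toy values are hypothetical, and the paper may add details not given in the abstract, such as normalization or special treatment of diagonal entries.

```python
def gram_matrix(d_vectors):
    """Gram matrix K with K[i][j] = <d_i, d_j> for a list of d-vectors."""
    return [[sum(a * b for a, b in zip(u, v)) for v in d_vectors]
            for u in d_vectors]


def squared_frobenius_loss(similarity, d_vectors):
    """Squared Frobenius norm of (similarity - Gram matrix of d_vectors)."""
    gram = gram_matrix(d_vectors)
    return sum((s - g) ** 2
               for s_row, g_row in zip(similarity, gram)
               for s, g in zip(s_row, g_row))


# Hypothetical example: two speakers with orthonormal d-vectors, while
# listeners rated the pair as moderately similar (0.5).
d_vectors = [[1.0, 0.0], [0.0, 1.0]]
similarity = [[1.0, 0.5], [0.5, 1.0]]
print(squared_frobenius_loss(similarity, d_vectors))  # 0.5
```

In training, this quantity would be minimized with respect to the parameters of the network that produces the $d$-vectors, pulling the embedding geometry toward the subjective scores.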
The Roadmap to 6G -- AI Empowered Wireless Networks
Letaief, Khaled B., Chen, Wei, Shi, Yuanming, Zhang, Jun, Zhang, Ying-Jun Angela
The recent upsurge of diversified mobile applications, especially those supported by Artificial Intelligence (AI), is spurring heated discussions on the future evolution of wireless communications. While 5G is being deployed around the world, efforts from industry and academia have started to look beyond 5G and conceptualize 6G. We envision 6G to undergo an unprecedented transformation that will make it substantially different from the previous generations of wireless cellular systems. In particular, 6G will go beyond mobile Internet and will be required to support ubiquitous AI services from the core to the end devices of the network. Meanwhile, AI will play a critical role in designing and optimizing 6G architectures, protocols, and operations. In this article, we discuss potential technologies for 6G to enable mobile AI applications, as well as AI-enabled methodologies for 6G network design and optimization. Key trends in the evolution to 6G will also be discussed.
Perceptual Generative Autoencoders
Zhang, Zijun, Zhang, Ruixiang, Li, Zongpeng, Bengio, Yoshua, Paull, Liam
Modern generative models are usually designed to match target distributions directly in the data space, where the intrinsic dimensionality of data can be much lower than the ambient dimensionality. We argue that this discrepancy may contribute to the difficulties in training generative models. We therefore propose to map both the generated and target distributions to the latent space using the encoder of a standard autoencoder, and train the generator (or decoder) to match the target distribution in the latent space. The resulting method, perceptual generative autoencoder (PGA), is then incorporated with a maximum likelihood or variational autoencoder (VAE) objective to train the generative model. With maximum likelihood, PGAs generalize the idea of reversible generative models to unrestricted neural network architectures and arbitrary latent dimensionalities. When combined with VAEs, PGAs can generate sharper samples than vanilla VAEs. Compared to other autoencoder-based generative models using simple priors, PGAs achieve state-of-the-art FID scores on CIFAR-10 and CelebA.
Cumulative Adaptation for BLSTM Acoustic Models
Kitza, Markus, Golik, Pavel, Schlüter, Ralf, Ney, Hermann
This paper addresses the robust speech recognition problem as an adaptation task. Specifically, we investigate the cumulative application of adaptation methods. A bidirectional Long Short-Term Memory (BLSTM) based neural network, capable of learning temporal relationships and translation invariant representations, is used for robust acoustic modelling. Further, i-vectors were used as an input to the neural network to perform instantaneous speaker and environment adaptation, providing 8% relative improvement in word error rate on the NIST Hub5 2000 evaluation test set. By enhancing the first-pass i-vector based adaptation with a second-pass adaptation using speaker and environment dependent transformations within the network, a further relative improvement of 5% in word error rate was achieved. We have reevaluated the features used to estimate i-vectors and their normalization to achieve the best performance in a modern large scale automatic speech recognition system.
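To see how the two reported gains combine, the sketch below compounds relative word error rate reductions. It assumes the 5% figure is measured relative to the system already adapted with i-vectors, which the abstract suggests but does not state outright. The baseline WER is a hypothetical value and the function names are illustrative.

```python
def apply_relative_improvements(baseline_wer, relative_improvements):
    """WER after applying each relative reduction in sequence."""
    wer = baseline_wer
    for r in relative_improvements:
        wer *= 1.0 - r
    return wer


def combined_relative_improvement(relative_improvements):
    """Overall relative reduction implied by successive relative reductions."""
    remaining = 1.0
    for r in relative_improvements:
        remaining *= 1.0 - r
    return 1.0 - remaining


# 8% from first-pass i-vector adaptation, then a further 5% from second-pass
# adaptation (assumed to be relative to the first-pass result).
steps = [0.08, 0.05]
print(round(combined_relative_improvement(steps), 3))  # 0.126

# Hypothetical baseline WER of 10.0%, chosen only for illustration.
print(round(apply_relative_improvements(10.0, steps), 2))  # 8.74
```

Under this assumption the cumulative gain is about 12.6% relative, slightly less than the 13% a simple sum would suggest.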
One-Way Prototypical Networks
Few-shot models have become a popular topic of research in recent years. They offer the possibility to determine class membership for unseen examples using just a handful of examples for each class. Such models are trained on a wide range of classes and their respective examples, learning a decision metric in the process. Types of few-shot models include matching networks and prototypical networks. We show a new way of training prototypical few-shot models for just a single class. These models have the ability to predict the likelihood of an unseen query belonging to a group of examples without any given counterexamples. The difficulty here lies in the fact that no relative distance to other classes can be calculated via softmax. We solve this problem by introducing a "null class" centered around zero, and enforcing centering with batch normalization. Trained on the commonly used Omniglot data set, we obtain a classification accuracy of .98 on the matched test set, and of .8 on unmatched MNIST data. On the more complex MiniImageNet data set, test accuracy is .8. In addition, we propose a novel Gaussian layer for distance calculation in a prototypical network, which takes the support examples' distribution rather than just their centroid into account. This extension shows promising results when a higher number of support examples is available.
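The "null class" idea can be illustrated with a small sketch: the class prototype is the centroid of the support embeddings, the null prototype is the origin, and a two-way softmax over negative squared distances gives the probability that a query belongs to the class. This assumes the embeddings have already been produced by a trained encoder, and it uses squared Euclidean distance as in standard prototypical networks. The function names and toy vectors are hypothetical, and the batch normalization and Gaussian layer mentioned in the abstract are not modeled.

```python
from math import exp


def centroid(vectors):
    """Element-wise mean of equally sized embedding vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def squared_distance(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def membership_probability(query, support):
    """P(query belongs to the support class) versus a null class at zero."""
    prototype = centroid(support)
    null_prototype = [0.0] * len(prototype)
    logit_class = -squared_distance(query, prototype)
    logit_null = -squared_distance(query, null_prototype)
    # Two-way softmax, shifted by the larger logit for numerical stability.
    shift = max(logit_class, logit_null)
    e_class = exp(logit_class - shift)
    e_null = exp(logit_null - shift)
    return e_class / (e_class + e_null)


# Hypothetical 2-D embeddings for a single class.
support = [[2.0, 2.0], [2.0, 2.0]]
print(membership_probability([2.0, 2.0], support))  # close to 1
print(membership_probability([1.0, 1.0], support))  # 0.5, equidistant
print(membership_probability([0.0, 0.0], support))  # close to 0
```

Because the null prototype sits at the origin, the scheme only discriminates well if class embeddings lie away from zero while unrelated inputs map near it, which is presumably why the abstract pairs the null class with enforced centering.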