Overview
geomstats: a Python Package for Riemannian Geometry in Machine Learning
Miolane, Nina, Mathe, Johan, Donnat, Claire, Jorda, Mikael, Pennec, Xavier
We introduce geomstats, a Python package that performs computations on manifolds such as hyperspheres, hyperbolic spaces, spaces of symmetric positive definite matrices and Lie groups of transformations. We provide efficient and extensively unit-tested implementations of these manifolds, together with useful Riemannian metrics and associated Exponential and Logarithm maps. The corresponding geodesic distances provide a range of intuitive choices of loss functions for Machine Learning. We also give the corresponding Riemannian gradients. The operations implemented in geomstats are available with different computing backends such as numpy, tensorflow and keras. We have enabled GPU implementation and integrated geomstats' manifold computations into keras' deep learning framework. This paper also presents a review of manifolds in machine learning and an overview of the geomstats package with examples demonstrating its use for efficient and user-friendly Riemannian geometry.
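To make the Exponential map, Logarithm map and geodesic distance concrete, here is a minimal sketch for the unit hypersphere using only the Python standard library. It does not use the geomstats API; the function names `sphere_exp`, `sphere_log` and `sphere_dist` are hypothetical and chosen only for illustration, and the formulas are the standard closed forms for the round sphere.

```python
import math


def dot(u, v):
    """Euclidean inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def norm(u):
    """Euclidean norm of a vector."""
    return math.sqrt(dot(u, u))


def sphere_exp(p, v):
    """Exponential map at p: follow the great circle leaving p with tangent velocity v.

    Assumes p has unit norm and v is orthogonal to p.
    """
    t = norm(v)
    if t < 1e-12:
        return list(p)  # zero velocity: stay at p
    return [math.cos(t) * a + math.sin(t) * b / t for a, b in zip(p, v)]


def sphere_log(p, q):
    """Logarithm map at p: tangent vector at p that the exponential map sends to q.

    Undefined for antipodal points; this sketch does not handle that case.
    """
    c = max(-1.0, min(1.0, dot(p, q)))  # clamp against rounding error
    theta = math.acos(c)                # angle between p and q
    w = [b - c * a for a, b in zip(p, q)]  # component of q orthogonal to p
    n = norm(w)
    if n < 1e-12:
        return [0.0] * len(p)
    return [theta * x / n for x in w]


def sphere_dist(p, q):
    """Geodesic distance on the unit sphere: the angle between p and q."""
    return math.acos(max(-1.0, min(1.0, dot(p, q))))


north_pole = [0.0, 0.0, 1.0]
velocity = [math.pi / 2, 0.0, 0.0]        # tangent at the north pole
point = sphere_exp(north_pole, velocity)   # lands on the equator
recovered = sphere_log(north_pole, point)  # should reproduce `velocity`
```

The geodesic distance is the norm of the Logarithm map, which is why it can serve as a loss between two points on the manifold.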
ReLeQ: A Reinforcement Learning Approach for Deep Quantization of Neural Networks
Elthakeb, Ahmed T., Pilligundla, Prannoy, Yazdanbakhsh, Amir, Kinzer, Sean, Esmaeilzadeh, Hadi
Despite numerous state-of-the-art applications of Deep Neural Networks (DNNs) in a wide range of real-world tasks, two major challenges hinder further advances in DNNs: hyperparameter optimization and lack of computing power. DNNs become increasingly difficult to train and deploy as they grow in size due to both computational intensity and the large resulting memory footprint. Recent efforts show that quantizing the weights and activations of DNN layers to lower bitwidths takes a significant step toward reducing memory bandwidth and power consumption by using limited computing resources. This paper builds upon the algorithmic insight that the bitwidth of operations in DNNs can be reduced without compromising their classification accuracy. While the use of eight-bit weights and activations during inference maintains the accuracy in most cases, lower bitwidths can achieve the same accuracy while utilizing less power. However, deep quantization (quantizing to bitwidths below eight) while maintaining accuracy requires a great deal of trial-and-error, fine-tuning as well as retraining. We tackle this issue by formulating the quantization bitwidth as a hyperparameter to be optimized and by leveraging a state-of-the-art policy gradient based Reinforcement Learning (RL) algorithm called Proximal Policy Optimization [10] (PPO) to efficiently explore a large design space of DNN quantization. The proposed technique also opens up the possibility of performing heterogeneous quantization of the network (e.g., quantizing each layer to a different bitwidth) as the RL agent learns the sensitivity of each layer with respect to accuracy in order to perform quantization of the entire network. We evaluated our method on several neural networks, including networks for MNIST, CIFAR10 and SVHN; the RL agent quantizes these networks to average bitwidths of 2.25, 5 and 4 respectively, with less than 0.3% accuracy loss in all cases.
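As a rough illustration of what quantizing a layer to a given bitwidth means, the sketch below applies a generic uniform quantizer with a different bitwidth per layer. This is not the ReLeQ method: the quantizer, the layer names and the weight values are all assumptions made for the example, and the bitwidths are fixed by hand rather than chosen by an RL agent.

```python
def quantize_uniform(weights, bits):
    """Snap each weight to the nearest of 2**bits evenly spaced levels in [min, max].

    Assumes bits >= 1. This is a generic uniform quantizer, used here only
    to show how the bitwidth controls the approximation error.
    """
    lo, hi = min(weights), max(weights)
    if hi == lo:
        return list(weights)  # a constant layer needs no rounding
    step = (hi - lo) / (2 ** bits - 1)
    return [lo + round((w - lo) / step) * step for w in weights]


def mean_squared_error(a, b):
    """Average squared difference between two equal-length lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Hypothetical layers and hand-picked bitwidths (heterogeneous quantization).
layers = {
    "conv1": [-1.0, -0.4, 0.1, 0.5, 1.0],
    "fc1": [-0.2, -0.05, 0.0, 0.07, 0.3],
}
bitwidths = {"conv1": 2, "fc1": 4}

errors = {}
for name, weights in layers.items():
    quantized = quantize_uniform(weights, bitwidths[name])
    errors[name] = mean_squared_error(weights, quantized)
```

Lower bitwidths leave fewer representable levels and so a larger rounding error, which is the trade-off against accuracy that a per-layer bitwidth search has to balance.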
IIT Madras Hosts Conclave To Boost AI And ML Ecosystem In Chennai
Indian Institute of Technology Madras undertook a major effort to give a boost to the Artificial Intelligence (AI) and Machine Learning (ML) sectors in Chennai. The Robert Bosch Center for Data Science and Artificial Intelligence, IIT Madras, organized the 'Artificial Intelligence and Machine Learning Conclave', focused on understanding cutting-edge technology and innovation in the field, with participation from top technology firms and think-tanks including Google, Amazon, Foxconn and TVS group among others. Prof Bhaskar Ramamurthi, Director, IIT Madras, inaugurated the Conclave, which was held on 23rd October 2018. The conclave aimed at generating a greater realization of the AI/ML ecosystem in and around Chennai and gave stakeholders an opportunity to brainstorm about what this ecosystem needs to thrive and grow further. For the first time, this event brought together a significant number of AI/ML deep technology start-ups in Chennai on a single platform.
Security for Machine Learning-based Systems: Attacks and Challenges during Training and Inference
Khalid, Faiq, Hanif, Muhammad Abdullah, Rehman, Semeen, Shafique, Muhammad
The exponential increase in dependencies between the cyber and physical world leads to an enormous amount of data which must be efficiently processed and stored. Therefore, computing paradigms are evolving towards machine learning (ML)-based systems because of their ability to efficiently and accurately process the enormous amount of data. Although ML-based solutions address the efficient computing requirements of big data, they introduce (new) security vulnerabilities into the systems, which cannot be addressed by traditional monitoring-based security measures. Therefore, this paper first presents a brief overview of various security threats in machine learning, their respective threat models and associated research challenges to develop robust security measures. To illustrate the security vulnerabilities of ML during training, inference and hardware implementation, we demonstrate some key security threats on ML using LeNet and VGGNet for MNIST and German Traffic Sign Recognition Benchmarks (GTSRB), respectively. Moreover, based on the security analysis of ML training, we also propose an attack that has very little impact on the inference accuracy. Towards the end, we highlight the associated research challenges in developing security measures and provide a brief overview of the techniques used to mitigate such security threats.
Variational Bayes Inference in Digital Receivers
The digital telecommunications receiver is an important context for inference methodology, the key objective being to minimize the expected loss function in recovering the transmitted information. For that criterion, the optimal decision is the Bayesian minimum-risk estimator. However, the computational load of the Bayesian estimator is often prohibitive and, hence, efficient computational schemes are required. The design of novel schemes, striking new balances between accuracy and computational load, is the primary concern of this thesis. Two popular techniques, one exact and one approximate, will be studied. The exact scheme is a recursive one, namely the generalized distributive law (GDL), whose purpose is to distribute all operators across the conditionally independent (CI) factors of the joint model, so as to reduce the total number of operators required. In a novel theorem derived in this thesis, GDL, if applicable, will be shown to guarantee such a reduction in all cases. An associated lemma also quantifies this reduction. For practical use, two novel algorithms, namely the no-longer-needed (NLN) algorithm and the generalized form of the Markovian Forward-Backward (FB) algorithm, recursively factorize and compute the CI factors of an arbitrary model, respectively. The approximate scheme is an iterative one, namely the Variational Bayes (VB) approximation, whose purpose is to find the independent (i.e. zero-order Markov) model closest to the true joint model in the minimum Kullback-Leibler divergence (KLD) sense. Despite being computationally efficient, this naive mean field approximation confers only modest performance for highly correlated models. A novel approximation, namely Transformed Variational Bayes (TVB), will be designed in the thesis in order to relax the zero-order constraint in the VB approximation, further reducing the KLD of the optimal approximation.
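The VB approximation described above can be sketched for a small discrete joint model. The code below is a generic textbook mean-field iteration, not the thesis's algorithm: it assumes two discrete variables, a strictly positive joint table `p[i][j]`, and the divergence direction KL(q || p) with q = q1 * q2, which the abstract does not state explicitly. All function names are hypothetical.

```python
import math


def normalize(weights):
    """Scale non-negative weights so that they sum to one."""
    total = sum(weights)
    return [w / total for w in weights]


def kl_factorized(q1, q2, p):
    """KL(q1 * q2 || p) for a strictly positive joint table p[i][j]."""
    return sum(
        q1[i] * q2[j] * math.log(q1[i] * q2[j] / p[i][j])
        for i in range(len(q1))
        for j in range(len(q2))
        if q1[i] * q2[j] > 0
    )


def mean_field_vb(p, iters=50):
    """Alternately update q1 and q2, each update minimizing the KLD with the other held fixed."""
    n, m = len(p), len(p[0])
    q1 = [1.0 / n] * n
    q2 = [1.0 / m] * m
    history = []  # KLD after each full sweep
    for _ in range(iters):
        # q1(i) is proportional to exp(E_{q2}[log p(i, j)])
        q1 = normalize([
            math.exp(sum(q2[j] * math.log(p[i][j]) for j in range(m)))
            for i in range(n)
        ])
        # q2(j) is proportional to exp(E_{q1}[log p(i, j)])
        q2 = normalize([
            math.exp(sum(q1[i] * math.log(p[i][j]) for i in range(n)))
            for j in range(m)
        ])
        history.append(kl_factorized(q1, q2, p))
    return q1, q2, history


# A correlated joint model: no independent model can match it exactly.
correlated = [[0.5, 0.1], [0.1, 0.3]]
q1, q2, history = mean_field_vb(correlated)
```

For an independent joint model the iteration recovers the marginals and the KLD drops to zero, while for a correlated model such as the one above a residual KLD remains, which is the limitation that motivates relaxing the zero-order constraint.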
The 6 trends that will define intelligent manufacturing in 2019 - Microsoft Industry Blogs
Since the start of the First Industrial Revolution, manufacturing has been the force pushing industrial and societal transformation forward. Today, we're in the midst of the Fourth Industrial Revolution, as a new generation of sophisticated technologies is transforming manufacturing into a highly connected, intelligent, and ultimately, more productive industry. The manpowered shop floor of the past is being replaced by smart manufacturing facilities where tech-savvy workers, aided by intelligent robots, are creating the products and services of the future. As we approach 2019, we're looking ahead to the trends that will define intelligent manufacturing, as well as help empower clients to better evaluate and manage operations, build innovative products and services, and grow their manufacturing businesses. These trends are detailed in our new 2019 Manufacturing Trends Report.
Machine learning architectures to predict motion sickness using a Virtual Reality rollercoaster simulation tool
Hell, Stefan, Argyriou, Vasileios
Virtual Reality (VR) can cause an unprecedented immersion and feeling of presence, yet a lot of users experience motion sickness when moving through a virtual environment. Rollercoaster rides are popular in Virtual Reality but have to be well designed to limit the amount of nausea the user may feel. This paper describes a novel framework to get automated ratings on motion sickness using Neural Networks. An application that lets users create rollercoasters directly in VR, share them with other users and ride and rate them is used to gather real-time data related to the in-game behaviour of the player, the track itself and users' ratings based on a Simulator Sickness Questionnaire (SSQ) integrated into the application. Machine learning architectures based on deep neural networks are trained using this data, aiming to predict motion sickness levels. While this paper focuses on rollercoasters, this framework could help to rate any VR application that involves camera movement on motion sickness and intensity. A new well-defined dataset is provided in this paper and the performance of the proposed architectures is evaluated in a comparative study.
Unsupervised Learning of Interpretable Dialog Models
Madan, Dhiraj, Raghu, Dinesh, Pandey, Gaurav, Joshi, Sachindra
Recently, several deep learning based models have been proposed for end-to-end learning of dialogs. While these models can be trained from data without the need for any additional annotations, it is hard to interpret them. On the other hand, there exist traditional state based dialog systems, where the states of the dialog are discrete and hence easy to interpret. However, these states need to be handcrafted and annotated in the data. To achieve the best of both worlds, we propose the Latent State Tracking Network (LSTN), with which we learn an interpretable model in an unsupervised manner. The model defines a discrete latent variable at each turn of the conversation which can take a finite set of values. Since these discrete variables are not present in the training data, we use the EM algorithm to train our model in an unsupervised manner. In the experiments, we show that LSTN can help achieve interpretability in dialog models without much decrease in performance compared to end-to-end approaches.
A Survey on Natural Language Processing for Fake News Detection
Oshikawa, Ray, Qian, Jing, Wang, William Yang
Fake news detection is a critical yet challenging problem in Natural Language Processing (NLP). The rapid rise of social networking platforms has not only yielded a vast increase in information accessibility but has also accelerated the spread of fake news. Given the massive amount of Web content, automatic fake news detection is a practical NLP problem required by all online content providers. This paper presents a survey on fake news detection. Our survey introduces the challenges of automatic fake news detection. We systematically review the datasets and NLP solutions that have been developed for this task. We also discuss the limits of these datasets and problem formulations, our insights, and recommended solutions.
AI for the Common Good?! Pitfalls, challenges, and Ethics Pen-Testing
Recently, many AI researchers and practitioners have embarked on research visions that involve doing AI for "Good". This is part of a general drive towards infusing AI research and practice with ethical thinking. One frequent theme in current ethical guidelines is the requirement that AI be good for all, or: contribute to the Common Good. But what is the Common Good, and is it enough to want to be good? Via four lead questions, I will illustrate challenges and pitfalls when determining, from an AI point of view, what the Common Good is and how it can be enhanced by AI. The questions are: What is the problem / What is a problem?, Who defines the problem?, What is the role of knowledge?, and What are important side effects and dynamics? The illustration will use an example from the domain of "AI for Social Good", more specifically "Data Science for Social Good". Even if the importance of these questions may be known at an abstract level, they do not get asked sufficiently in practice, as shown by an exploratory study of 99 contributions to recent conferences in the field. Turning these challenges and pitfalls into a positive recommendation, as a conclusion I will draw on another characteristic of computer-science thinking and practice to make these impediments visible and attenuate them: "attacks" as a method for improving design. This results in the proposal of ethics pen-testing as a method for helping AI designs to better contribute to the Common Good.