Harvesting Ambient RF for Presence Detection Through Deep Learning

arXiv.org Machine Learning

This paper explores the use of ambient radio frequency (RF) signals for human presence detection through deep learning. Using WiFi signals as an example, we demonstrate that the channel state information (CSI) obtained at the receiver contains rich information about the propagation environment. Through judicious pre-processing of the estimated CSI followed by deep learning, reliable presence detection can be achieved. Several challenges in passive RF sensing are addressed. With presence detection, how to collect training data with human presence can have a significant impact on the performance. This is in contrast to activity detection, where a specific motion pattern is of interest. A second challenge is that RF signals are complex-valued. Handling complex-valued input in deep learning requires careful data representation and network architecture design. Finally, human presence affects CSI variation along multiple dimensions; such variation, however, is often masked by system impediments such as timing or frequency offset. Addressing these challenges, the proposed learning system uses pre-processing to preserve human-motion-induced channel variation while insulating against other impairments. A convolutional neural network (CNN) properly trained with both magnitude and phase information is then designed to achieve reliable presence detection. Extensive experiments are conducted. Using off-the-shelf WiFi devices, the proposed deep-learning-based RF sensing achieves near-perfect presence detection during multiple extended periods of test and exhibits superior performance compared with leading-edge passive infrared sensors. The learning-based passive RF sensing thus provides a viable and promising alternative for presence or occupancy detection.
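The abstract notes that CSI is complex-valued and that the CNN is trained with both magnitude and phase information. A minimal sketch of one common way to turn complex samples into two real-valued channels is given below. The function name `to_magnitude_phase` and the toy CSI values are illustrative assumptions, and the paper's actual pre-processing (which also guards against timing and frequency offsets) is not reproduced here.

```python
import cmath

def to_magnitude_phase(csi_row):
    """Split complex CSI samples into magnitude and phase lists.

    Hypothetical helper for illustration only; it performs no offset
    compensation or other pre-processing.
    """
    magnitudes = [abs(h) for h in csi_row]        # |h| per subcarrier
    phases = [cmath.phase(h) for h in csi_row]    # angle in (-pi, pi]
    return magnitudes, phases

# Toy CSI values for three subcarriers (made up for the example).
csi = [1 + 1j, -2 + 0j, 0 - 3j]
mag, ph = to_magnitude_phase(csi)
print(mag)
print(ph)
```

The two lists could then be stacked as separate input channels of a real-valued network, which is one way to avoid complex-valued layers.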



Active Learning for Sound Event Detection

arXiv.org Machine Learning

This paper proposes an active learning system for sound event detection (SED). It aims at maximizing the accuracy of a learned SED model with limited annotation effort. The proposed system analyzes an initially unlabeled audio dataset, from which it selects sound segments for manual annotation. The candidate segments are generated based on a proposed change point detection approach, and the selection is based on the principle of mismatch-first farthest-traversal. During the training of SED models, recordings are used as training inputs, preserving the long-term context for annotated segments. The proposed system clearly outperforms reference methods in the two datasets used for evaluation (TUT Rare Sound 2017 and TAU Spatial Sound 2019). Training with recordings as context outperforms training with only annotated segments. Mismatch-first farthest-traversal outperforms reference sample selection methods based on random sampling and uncertainty sampling. Remarkably, the required annotation effort can be greatly reduced on the dataset where target sound events are rare: by annotating only 2% of the training data, the achieved SED performance is similar to annotating all the training data.
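The selection principle named in the abstract builds on farthest traversal. The sketch below shows only the generic farthest-first traversal idea on toy 2-D points: each new pick is the candidate farthest from everything already selected. The mismatch-first criterion and the paper's segment representation are not shown; `farthest_first`, the Euclidean distance, and the sample points are assumptions for illustration.

```python
import math

def farthest_first(points, k, start=0):
    """Greedy farthest-first traversal: return indices of k points.

    Each step picks the unselected point whose distance to the nearest
    already-selected point is largest.
    """
    selected = [start]
    # Distance from every point to its nearest selected point so far.
    min_dist = [math.dist(p, points[start]) for p in points]
    while len(selected) < min(k, len(points)):
        remaining = [i for i in range(len(points)) if i not in selected]
        nxt = max(remaining, key=lambda i: min_dist[i])
        selected.append(nxt)
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], math.dist(p, points[nxt]))
    return selected

# Toy feature vectors (made up for the example).
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 0.0), (5.0, 5.0), (0.0, 4.0)]
order = farthest_first(points, 3)
print(order)
```

Picking spread-out samples this way tends to cover the data distribution with few annotations, which is the motivation for using it in sample selection.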


A Speaker Verification Backend for Improved Calibration Performance across Varying Conditions

arXiv.org Machine Learning

In a recent work, we presented a discriminative backend for speaker verification that achieved good out-of-the-box calibration performance on most tested conditions containing varying levels of mismatch to the training conditions. This backend mimics the standard PLDA-based backend process used in most current speaker verification systems, including the calibration stage. All parameters of the backend are jointly trained to optimize the binary cross-entropy for the speaker verification task. Calibration robustness is achieved by making the parameters of the calibration stage a function of vectors representing the conditions of the signal, which are extracted using a model trained to predict condition labels. In this work, we propose a simplified version of this backend where the vectors used to compute the calibration parameters are estimated within the backend, without the need for a condition prediction model. We show that this simplified method provides similar performance to the previously proposed method while being simpler to implement and having fewer requirements on the training data. Further, we provide an analysis of different aspects of the method, including the effect of initialization, the nature of the vectors used to compute the calibration parameters, and the effect that the random seed and the number of training epochs have on performance. We also compare the proposed method with the trial-based calibration (TBC) method that, to our knowledge, was the state-of-the-art for achieving good calibration across varying conditions. We show that the proposed method outperforms TBC while also being several orders of magnitude faster to run, comparable to the standard PLDA baseline.


Stochastic L-system Inference from Multiple String Sequence Inputs

arXiv.org Artificial Intelligence

Lindenmayer systems (L-systems) are grammar systems that consist of string rewriting rules. The rules replace every symbol in a string in parallel with a successor to produce the next string, and this procedure iterates. In a stochastic context-free L-system (S0L-system), every symbol may have one or more rewriting rules, each with an associated probability of selection. Properly constructed rewriting rules have been found to be useful for modeling and simulating some natural and human-engineered processes where each derived string describes a step in the simulation. Typically, processes are modeled by experts who meticulously construct the rules based on measurements or domain knowledge of the process. This paper presents an automated approach to finding stochastic L-systems, given a set of string sequences as input. The implemented tool is called the Plant Model Inference Tool for S0L-systems (PMIT-S0L). PMIT-S0L is evaluated using 960 procedurally generated S0L-systems in a test suite, which are each used to generate input strings, and PMIT-S0L is then used to infer the system from only the sequences. The evaluation shows that PMIT-S0L infers S0L-systems with up to 9 rewriting rules each in under 12 hours. Additionally, it is found that 3 sequences of strings are sufficient to find the correct original rewriting rules in 100% of the cases in the test suite, and 6 sequences of strings reduce the difference in the associated probabilities to approximately 1% or less.
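The parallel, probabilistic rewriting described above can be sketched in a few lines. This is only an illustration of how an S0L-system generates a string sequence (the forward direction), not of PMIT-S0L's inference procedure. The function `derive`, the rule table `RULES`, and its probabilities are made-up assumptions.

```python
import random

def derive(axiom, rules, steps, seed=None):
    """Generate a sequence of strings from a stochastic context-free L-system.

    rules maps a symbol to a list of (successor, probability) pairs.
    Symbols without a rule are copied unchanged (identity rewriting).
    """
    rng = random.Random(seed)
    strings = [axiom]
    current = axiom
    for _ in range(steps):
        next_parts = []
        # Every symbol is rewritten in the same step (parallel rewriting).
        for symbol in current:
            options = rules.get(symbol)
            if options is None:
                next_parts.append(symbol)
            else:
                successors = [s for s, _ in options]
                weights = [p for _, p in options]
                next_parts.append(rng.choices(successors, weights=weights, k=1)[0])
        current = "".join(next_parts)
        strings.append(current)
    return strings

# Hypothetical rule set: A has two competing rules, B has one.
RULES = {"A": [("AB", 0.7), ("A", 0.3)], "B": [("A", 1.0)]}
sequence = derive("A", RULES, steps=4, seed=0)
print(sequence)
```

Running `derive` several times with different seeds yields the kind of multiple string sequences that an inference tool would take as input.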


Multi-task Learning for Speaker Verification and Voice Trigger Detection

arXiv.org Machine Learning

Automatic speech transcription and speaker recognition are usually treated as separate tasks even though they are interdependent. In this study, we investigate training a single network to perform both tasks jointly. We train the network in a supervised multi-task learning setup, where the speech transcription branch of the network is trained to minimise a phonetic connectionist temporal classification (CTC) loss while the speaker recognition branch of the network is trained to label the input sequence with the correct label for the speaker. We present a large-scale empirical study where the model is trained using several thousand hours of labelled training data for each task. We evaluate the speech transcription branch of the network on a voice trigger detection task while the speaker recognition branch is evaluated on a speaker verification task. Results demonstrate that the network is able to encode both phonetic and speaker information in its learnt representations while yielding accuracies at least as good as the baseline models for each task, with the same number of parameters as the independent models.


Ensemble Noise Simulation to Handle Uncertainty about Gradient-based Adversarial Attacks

arXiv.org Machine Learning

Adversarial attacks on neural networks pose a serious threat to safety-critical systems that rely on the high accuracies of these neural networks. The imperceptibility of additive evasion attacks makes it difficult to even detect their existence. Recent work has attempted to tackle this issue by designing defenses against such attacks, mostly focusing on a scenario where the assumption is that the attacker has significant knowledge of the victim classifier, and hence will design an attack to optimally destroy the accuracy of that particular classifier. However, there is no guarantee that the attacker will choose to do so. Furthermore, adversarial examples transfer across classifiers, and an adversary could take advantage of this property by crafting an attack based on a different classifier. The attacker would do this when having only partial knowledge about the victim classifier, or when attempting to confuse the defender on purpose. Alternatively, another scenario is that the attacker is limited in computational resources, and may be trying to attack multiple classifiers at once. This is why they would tailor the attack to only one classifier, and use that to attack all classifiers. R. Mahfuz, R. Sahay, and A. El Gamal are with the Department of Electrical and Computer Engineering, Purdue University, West Lafayette, IN, USA.


The productive software engineer with Dr. Tom Zimmermann

#artificialintelligence

If you're in software development, Dr. Tom Zimmermann, a senior researcher at Microsoft Research in Redmond, wants you to be more productive, and he's here to help. Well, while productivity can be hard to measure, his research in the Empirical Software Engineering group is attempting to do just that by using insights from actual data, rather than just gut feelings, to improve the software development process. On today's podcast, Dr. Zimmermann talks about why we need to rethink productivity in software engineering, explains why work environments matter, tells us how AI and machine learning are impacting traditional software workflows, and reveals the difference between a typical day and a good day in the life of a software developer, and what it would take to make a good day typical! Tom Zimmermann: If you think of a typical software engineer at Microsoft, they spend about half of a day on development related activities, and the other half of the day is spent on other activities like coordinating with other people in meetings, sending emails… So, there's actually not that much time that they can spend on writing code, and the time they spend writing code, on a good day, it's actually only 96 minutes, and on a bad day it's, on average, 66 minutes. And half an hour writing code actually can make the difference between a bad and a good workday. Host: You're listening to the Microsoft Research Podcast, a show that brings you closer to the cutting-edge of technology research and the scientists behind it. Host: You have a cool nickname. Why do people call you that? Tom Zimmermann: So, it goes back to when I started at Microsoft.


What Artificial Intelligence Says About the Perfect Running Stride

#artificialintelligence

The physiologist and coach Jack Daniels once filmed a bunch of runners in stride, then showed the footage to coaches and biomechanists to see if they could eyeball who was the most efficient. "They couldn't tell," Daniels later recalled. "No way at all." Famously awkward-looking runners like Paula Radcliffe and Alberto Salazar sometimes turn out to be extraordinarily efficient. Smooth-striding beauties sometimes finish at the back of the pack. The act of running, it turns out, is surprisingly complicated.