Calgary
HapticLever: Kinematic Force Feedback using a 3D Pantograph
Friedel, Marcus, Sharlin, Ehud, Suzuki, Ryo
HapticLever is a new kinematic approach for VR haptics which uses a 3D pantograph to stiffly render large-scale surfaces using small-scale proxies. The HapticLever approach does not consume power to render forces, but rather puts a mechanical constraint on the end effector using a small-scale proxy surface. The HapticLever approach provides stiff force feedback when the user interacts with a static virtual surface, but allows the user to move their arm freely when moving through free virtual space. We present the problem space, the related work, and the HapticLever design approach.
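As a rough illustration of how a pantograph relates a large-scale end effector to a small-scale proxy, the sketch below scales the hand position toward a fixed pivot. The function names, the pivot, and the scale factor are hypothetical and are not taken from the paper; the real HapticLever linkage may differ.

```python
def to_proxy(hand, pivot, scale):
    """Map a large-scale hand position to the small-scale proxy position.

    Assumes an ideal pantograph that shrinks motion about `pivot` by `scale`.
    """
    return tuple(p + (h - p) / scale for h, p in zip(hand, pivot))


def to_hand(proxy, pivot, scale):
    """Inverse mapping: small-scale proxy position back to the hand position."""
    return tuple(p + (q - p) * scale for q, p in zip(proxy, pivot))
```

Under this assumption, a rigid proxy surface that blocks the small-scale point also blocks the hand at the corresponding large-scale location, without any actuator rendering the force.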
Next Generation Monitoring and Detection - Smart Cities Tech
Fotech has launched two next-generation Helios DAS systems at the International Pipeline Expo in Calgary, Canada, between 27 and 29 September 2022. The new Helios DAS TL4 (single-channel) and the Helios DAS TX4 (dual-channel) interrogators deliver lower false alarm rates and enhanced monitoring and incident detection. They incorporate new machine learning capabilities, which allow a faster, more cost-effective and more systematic deployment of solutions in long linear assets such as pipelines and perimeters. Pedro Barbosa, Senior Product Manager at Fotech, says, "The new Helios DAS TL4 and Helios DAS TX4 interrogators take monitoring of pipelines, critical infrastructure and perimeters to the next level. The machine learning that is built into them means they deliver exceptional accuracy with a much-reduced false alarm rate. As a result, users have extremely high confidence in alarms, and don't waste precious time or resource investigating false alarms."
UltraBots: Large-Area Mid-Air Haptics for VR with Robotically Actuated Ultrasound Transducers
Faridan, Mehrad, Friedel, Marcus, Suzuki, Ryo
We introduce UltraBots, a system that combines ultrasound haptic feedback and robotic actuation for large-area mid-air haptics for VR. Ultrasound haptics can provide precise mid-air haptic feedback and versatile shape rendering, but the interaction area is often limited by the small size of the ultrasound devices, restricting the possible interactions for VR. To address this problem, this paper introduces a novel approach that combines robotic actuation with ultrasound haptics. More specifically, we will attach ultrasound transducer arrays to tabletop mobile robots or robotic arms for scalable, extendable, and translatable interaction areas. We plan to use Sony Toio robots for 2D translation and/or commercially available robotic arms for 3D translation. Using robotic actuation and hand tracking measured by a VR HMD (e.g., Oculus Quest), our system can keep the ultrasound transducers underneath the user's hands to provide on-demand haptics. We demonstrate applications with workspace environments, medical training, education and entertainment.
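The idea of keeping the transducers underneath the user's hand can be sketched as a simple follow loop for the 2D tabletop case. This is a minimal sketch, not the authors' controller: the function names, the rectangular workspace bounds, the per-tick speed limit, and the convention that `z` is the height axis are all assumptions.

```python
import math


def clamp(value, lo, hi):
    return max(lo, min(hi, value))


def robot_target(hand_xyz, bounds):
    """Project the tracked hand onto the table plane and clamp it to the
    area the robot can reach. `bounds` is ((xmin, xmax), (ymin, ymax))."""
    (xmin, xmax), (ymin, ymax) = bounds
    x, y, _height = hand_xyz  # assumption: z is height above the table
    return (clamp(x, xmin, xmax), clamp(y, ymin, ymax))


def step_towards(current, target, max_step):
    """Move from `current` toward `target` by at most `max_step` per tick."""
    dx, dy = target[0] - current[0], target[1] - current[1]
    dist = math.hypot(dx, dy)
    if dist <= max_step:
        return target
    s = max_step / dist
    return (current[0] + dx * s, current[1] + dy * s)
```

Calling `step_towards(robot_xy, robot_target(hand, bounds), max_step)` once per tracking frame would move the array toward the hand while respecting the workspace limits.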
Unsupervised Speech Enhancement using Dynamical Variational Auto-Encoders
Bie, Xiaoyu, Leglaive, Simon, Alameda-Pineda, Xavier, Girin, Laurent
Dynamical variational autoencoders (DVAEs) are a class of deep generative models with latent variables, dedicated to modeling time series of high-dimensional data. DVAEs can be considered as extensions of the variational autoencoder (VAE) that include temporal dependencies between successive observed and/or latent vectors. Previous work has shown the benefit of using DVAEs over the VAE for speech spectrogram modeling. Independently, the VAE has been successfully applied to speech enhancement in noise, in an unsupervised noise-agnostic set-up that requires neither noise samples nor noisy speech samples at training time, but only clean speech signals. In this paper, we extend these works to DVAE-based single-channel unsupervised speech enhancement, hence exploiting both unsupervised representation learning and dynamics modeling of speech signals. We propose an unsupervised speech enhancement algorithm that combines a DVAE speech prior pre-trained on clean speech signals with a noise model based on nonnegative matrix factorization, and we derive a variational expectation-maximization (VEM) algorithm to perform speech enhancement. The algorithm is presented with the most general DVAE formulation and is then applied with three specific DVAE models to illustrate the versatility of the framework. Experimental results show that the proposed DVAE-based approach outperforms its VAE-based counterpart, as well as several supervised and unsupervised noise-dependent baselines, especially when the noise type is unseen during training.
Meta-Learning a Cross-lingual Manifold for Semantic Parsing
Sherborne, Tom, Lapata, Mirella
Localizing a semantic parser to support new languages requires effective cross-lingual generalization. Recent work has found success with machine-translation or zero-shot methods, although these approaches can struggle to model how native speakers ask questions. We consider how to effectively leverage minimal annotated examples in new languages for few-shot cross-lingual semantic parsing. We introduce a first-order meta-learning algorithm to train a semantic parser with maximal sample efficiency during cross-lingual transfer. Our algorithm uses high-resource languages to train the parser and simultaneously optimizes for cross-lingual generalization for lower-resource languages. Results across six languages on ATIS demonstrate that our combination of generalization steps yields accurate semantic parsers sampling ≤10% of source training data in each new language. Our approach also trains a competitive model on Spider using English with generalization to Chinese, similarly sampling ≤10% of training data.
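To make "first-order meta-learning" concrete, here is a toy sketch in the style of Reptile, one well-known first-order method. The abstract does not say which update rule the authors use, so this is a stand-in rather than their algorithm; the one-parameter model `y ≈ w * x`, the task data, and all hyperparameters are invented for illustration.

```python
def grad(w, data):
    """Gradient of the mean squared error of y ≈ w * x with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in data) / len(data)


def inner_sgd(w, data, lr=0.1, steps=5):
    """Adapt w to a single task with a few plain gradient steps."""
    for _ in range(steps):
        w -= lr * grad(w, data)
    return w


def reptile(tasks, w=0.0, meta_lr=0.5, rounds=50):
    """First-order meta-update: move the shared initialization toward the
    task-adapted weights, with no second-order derivatives involved."""
    for r in range(rounds):
        adapted = inner_sgd(w, tasks[r % len(tasks)])
        w += meta_lr * (adapted - w)
    return w


# Two "high-resource" toy tasks with slopes 1 and 3.
tasks = [[(1.0, 1.0), (2.0, 2.0)], [(1.0, 3.0), (2.0, 6.0)]]
w_meta = reptile(tasks)

# A "new language" toy task with slope 2, adapted from the meta-initialization.
w_new = inner_sgd(w_meta, [(1.0, 2.0), (2.0, 4.0)])
```

The meta-learned initialization ends up between the two training tasks, so a handful of gradient steps on a small sample is enough to fit the new task.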
Multi-Task Adversarial Training Algorithm for Multi-Speaker Neural Text-to-Speech
Nakai, Yusuke, Saito, Yuki, Udagawa, Kenta, Saruwatari, Hiroshi
We propose a novel training algorithm for a multi-speaker neural text-to-speech (TTS) model based on multi-task adversarial training. A conventional generative adversarial network (GAN)-based training algorithm significantly improves the quality of synthetic speech by reducing the statistical difference between natural and synthetic speech. However, the algorithm does not guarantee the generalization performance of the trained TTS model in synthesizing voices of unseen speakers who are not included in the training data. Our algorithm alternately trains two deep neural networks: a multi-task discriminator and a multi-speaker neural TTS model (i.e., the generator of the GAN). The discriminator is trained not only to distinguish between natural and synthetic speech but also to verify whether the speaker of the input speech is existent or non-existent (i.e., newly generated by interpolating seen speakers' embedding vectors). Meanwhile, the generator is trained to minimize the weighted sum of the speech reconstruction loss and the adversarial loss for fooling the discriminator, which achieves high-quality multi-speaker TTS even if the target speaker is unseen. Experimental evaluation shows that our algorithm improves the quality of synthetic speech more than a conventional GANSpeech algorithm.
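The two objectives described above can be sketched with scalar placeholders. This is an illustrative sketch only: the use of binary cross-entropy, the unweighted sum in the discriminator loss, the linear interpolation of embeddings, the choice of "natural and existent" as the generator's adversarial target, and the value of `adv_weight` are all assumptions not stated in the abstract.

```python
import math


def interpolate(emb_a, emb_b, alpha):
    """Create a non-existent speaker by mixing two seen speakers' embeddings."""
    return [alpha * a + (1 - alpha) * b for a, b in zip(emb_a, emb_b)]


def bce(p, label, eps=1e-7):
    """Binary cross-entropy for one predicted probability p and a 0/1 label."""
    p = min(max(p, eps), 1 - eps)
    return -(label * math.log(p) + (1 - label) * math.log(1 - p))


def discriminator_loss(p_natural, is_natural, p_existent, is_existent):
    """Multi-task loss: natural vs. synthetic plus existent vs. non-existent."""
    return bce(p_natural, is_natural) + bce(p_existent, is_existent)


def generator_loss(recon_loss, p_natural, p_existent, adv_weight=0.1):
    """Weighted sum of reconstruction loss and adversarial loss. Here the
    generator is rewarded when the discriminator outputs 'natural' and
    'existent' for synthetic speech (an assumed target)."""
    adv_loss = bce(p_natural, 1) + bce(p_existent, 1)
    return recon_loss + adv_weight * adv_loss
```

In alternating training, one step would lower `discriminator_loss` with the generator fixed, and the next would lower `generator_loss` with the discriminator fixed.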
AARGH! End-to-end Retrieval-Generation for Task-Oriented Dialog
Nekvinda, Tomáš, Dušek, Ondřej
We introduce AARGH, an end-to-end task-oriented dialog system combining retrieval and generative approaches in a single model, aiming at improving dialog management and lexical diversity of outputs. The model features a new response selection method based on an action-aware training objective and a simplified single-encoder retrieval architecture which allow us to build an end-to-end retrieval-enhanced generation model where retrieval and generation share most of the parameters. On the MultiWOZ dataset, we show that our approach produces more diverse outputs while maintaining or improving state tracking and context-to-response generation performance, compared to state-of-the-art baselines.
Valuation of Public Bus Electrification with Open Data
Vijay, Upadhi, Woo, Soomin, Moura, Scott J., Jain, Akshat, Rodriguez, David, Gambacorta, Sergio, Ferrara, Giuseppe, Lanuzza, Luigi, Zulberti, Christian, Mellekas, Erika, Papa, Carlo
This research provides a novel framework to estimate the economic, environmental, and social values of electrifying public transit buses, for cities across the world, based on open-source data. Electric buses are a compelling candidate to replace diesel buses because of their environmental and social benefits. However, the state-of-the-art models to evaluate the value of bus electrification are limited in applicability because they require granular and bespoke data on bus operation that can be difficult to procure. Our valuation tool uses General Transit Feed Specification, a standard data format used by transit agencies worldwide, to provide high-level guidance on developing a prioritization strategy for electrifying a bus fleet. We develop physics-informed machine learning models to evaluate the energy consumption, the carbon emissions, the health impacts, and the total cost of ownership for each transit route. We demonstrate the scalability of our tool with a case study of the bus lines in the Greater Boston and Milan metropolitan areas. Detailed Affiliation: U.Vijay, S.Woo, and S.J.Moura are at Department of Civil and Environmental Engineering, University of California-Berkeley, Davis Hall, Berkeley, California, 94720, USA. A.Jain is at Department of Electrical Engineering and Computer Sciences, University of California-Berkeley, Soda Hall, Berkeley, California, 94720, USA. D.Rodriguez and E.Mellekas are at Enel X, North America, Inc., One Marina Park Drive, Boston, 02210, MA, USA. S.Gambacorta is at Enel X, Innovation and Sustainability Global, Smart City, Viale Tor di Quinto, Rome, 00191, Italy. G.Ferrara is at Enel X, Innovation and Sustainability Global, Smart City, Passo Martino, Catania, 95121, Italy. L.Lanuzza is at Enel X, Innovation and Sustainability B2C & B2B Innovation Factory, Viale Tor di Quinto, Rome, 00191, Italy. C.Zulberti and C.Papa are at Enel Foundation, Via Bellini, Rome, 00198, Italy.
Vehicle electrification is crucial for reducing the climate impact of the transportation sector, which currently accounts for 16.2% of the global greenhouse gas emissions [22]. Zero-emission electric vehicles can significantly improve the air quality, health, and environmental equity [23], [24].
Read, Revise, Repeat: A System Demonstration for Human-in-the-loop Iterative Text Revision
Du, Wanyu, Kim, Zae Myung, Raheja, Vipul, Kumar, Dhruv, Kang, Dongyeop
Revision is an essential part of the human writing process. It tends to be strategic, adaptive, and, more importantly, iterative in nature. Despite the success of large language models on text revision tasks, they are limited to non-iterative, one-shot revisions. Examining and evaluating the capability of large language models for making continuous revisions and collaborating with human writers is a critical step towards building effective writing assistants. In this work, we present a human-in-the-loop iterative text revision system, Read, Revise, Repeat (R3), which aims at achieving high-quality text revisions with minimal human effort by reading model-generated revisions and user feedback, revising documents, and repeating human-machine interactions. In R3, a text revision model provides text editing suggestions for human writers, who can accept or reject the suggested edits. The accepted edits are then incorporated into the model for the next iteration of document revision. Writers can therefore revise documents iteratively by interacting with the system and simply accepting/rejecting its suggested edits until the text revision model stops making further revisions or reaches a predefined maximum number of revisions. Empirical experiments show that R3 can generate revisions with an acceptance rate comparable to that of human writers at early revision depths, and the human-machine interaction can get higher quality revisions with fewer iterations and edits. The collected human-model interaction dataset and system code are available at https://github.com/vipulraheja/IteraTeR. Our system demonstration is available at https://youtu.be/lK08tIpEoaE.
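The accept/reject loop described above can be sketched as follows. This is a minimal sketch, not the R3 implementation: `suggest_edits` stands in for the text revision model, `accept` stands in for the human writer, edits are simplified to `(old, new)` string pairs, and the toy typo table is invented for illustration.

```python
def revise_iteratively(document, suggest_edits, accept, max_depth=3):
    """Repeat: the model suggests edits, the writer accepts or rejects each,
    and accepted edits are applied. Stops when the model has no further
    suggestions or the maximum revision depth is reached."""
    for _depth in range(max_depth):
        edits = suggest_edits(document)
        if not edits:
            break  # the model stopped making revisions
        for old, new in edits:
            if accept((old, new)):
                document = document.replace(old, new, 1)
    return document


# Hypothetical stand-in for the revision model: a fixed table of typo fixes.
FIXES = {"teh": "the", "recieve": "receive"}


def toy_model(document):
    return [(old, new) for old, new in FIXES.items() if old in document]


draft = "teh users recieve teh file"
accepted_all = revise_iteratively(draft, toy_model, accept=lambda edit: True)
partly = revise_iteratively(draft, toy_model, accept=lambda e: e[0] != "recieve")
```

Because each pass applies only one occurrence per suggestion, the second `teh` is fixed on the next iteration, which mirrors how repeated interaction converges on a final text.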
Leveraging Joint-Diagonalization in Transform-Learning NMF
Zhang, Sixin, Soubies, Emmanuel, Févotte, Cédric
Non-negative matrix factorization with transform learning (TL-NMF) is a recent idea that aims at learning data representations suited to NMF. In this work, we relate TL-NMF to the classical matrix joint-diagonalization (JD) problem. We show that, when the number of data realizations is sufficiently large, TL-NMF can be replaced by a two-step approach -- termed JD+NMF -- that estimates the transform through JD, prior to NMF computation. In contrast, we found that when the number of data realizations is limited, not only is JD+NMF no longer equivalent to TL-NMF, but the inherent low-rank constraint of TL-NMF turns out to be an essential ingredient to learn meaningful transforms for NMF.