Overview
From Continual Learning to Causal Discovery in Robotics
Castri, Luca, Mghames, Sariah, Bellotto, Nicola
Reconstructing accurate causal models of dynamic systems from time series of sensor data is a key problem in many real-world scenarios. In this paper, we present an overview, based on our experience, of the practical challenges that causal analysis encounters when applied to autonomous robots, and of how Continual Learning (CL) could help to overcome them. We propose a possible way to leverage the CL paradigm to make causal discovery feasible for robotics applications where computational resources are limited, while at the same time exploiting the robot as an active agent that helps to increase the quality of the reconstructed causal models.
Intelligence at the Extreme Edge: A Survey on Reformable TinyML
Rajapakse, Visal, Karunanayake, Ishan, Ahmed, Nadeem
Tiny Machine Learning (TinyML) is a rapidly growing research field that proposes to democratize the use of Machine Learning and Deep Learning on highly energy-efficient, frugal Microcontroller Units. Although TinyML is generally assumed to support only inference, growing interest in the domain has led to work that makes TinyML reformable, i.e., solutions that permit models to improve once deployed. This work presents a survey on reformable TinyML solutions and proposes a novel taxonomy, discussing the suitability of each hierarchical layer for reformability. Furthermore, we explore the workflow of TinyML and analyze the identified deployment schemes, the available tools, and the scarce benchmarking tools. Finally, we discuss how reformable TinyML can impact a few selected industrial areas, along with the challenges and future directions.
On adversarial robustness and the use of Wasserstein ascent-descent dynamics to enforce it
Trillos, Camilo Garcia, Trillos, Nicolas Garcia
We propose iterative algorithms to solve adversarial problems in a variety of supervised learning settings of interest. Our algorithms, which can be interpreted as suitable ascent-descent dynamics in Wasserstein spaces, take the form of a system of interacting particles. These interacting particle dynamics are shown to converge toward appropriate mean-field limit equations in certain large number of particles regimes. In turn, we prove that, under certain regularity assumptions, these mean-field equations converge, in the large time limit, toward approximate Nash equilibria of the original adversarial learning problems. We present results for nonconvex-nonconcave settings, as well as for nonconvex-concave ones. Numerical experiments illustrate our results.
Enabling AI-Generated Content (AIGC) Services in Wireless Edge Networks
Du, Hongyang, Li, Zonghang, Niyato, Dusit, Kang, Jiawen, Xiong, Zehui, Shen, Xuemin, Kim, Dong In
Artificial Intelligence-Generated Content (AIGC) refers to the use of AI to automate the information creation process while fulfilling the personalized requirements of users. However, due to the instability of AIGC models, e.g., the stochastic nature of diffusion models, the quality and accuracy of the generated content can vary significantly. In wireless edge networks, the transmission of incorrectly generated content may unnecessarily consume network resources. Thus, a dynamic AIGC service provider (ASP) selection scheme is required to enable users to connect to the most suitable ASP, improving user satisfaction and the quality of generated content. In this article, we first review AIGC techniques and their applications in wireless networks. We then present the AIGC-as-a-service (AaaS) concept and discuss the challenges in deploying AaaS in edge networks. Because performance metrics are essential for evaluating the accuracy of AIGC services, we introduce several image-based perceived quality evaluation metrics. Then, we propose a general and effective model to illustrate the relationship between computational resources and user-perceived quality evaluation metrics. To achieve efficient AaaS and maximize the quality of generated content in wireless edge networks, we propose a deep reinforcement learning-enabled algorithm for optimal ASP selection. Simulation results show that the proposed algorithm provides a higher quality of generated content to users and results in fewer crashed tasks than four benchmarks, i.e., the overloading-avoidance, random, and round-robin policies, and the upper-bound scheme.
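The abstract names round-robin and overloading-avoidance policies as baselines for ASP selection but does not define them. The following is a minimal Python sketch of one plausible reading of each; the class `RoundRobinSelector`, the function `least_loaded`, and the notion of integer per-ASP loads and capacities are illustrative assumptions, not the article's definitions.

```python
from typing import List, Optional


class RoundRobinSelector:
    """Hypothetical baseline: cycle through ASP indices in a fixed order."""

    def __init__(self, num_asps: int) -> None:
        self.num_asps = num_asps
        self._next = 0  # index of the ASP that receives the next task

    def select(self) -> int:
        chosen = self._next
        self._next = (self._next + 1) % self.num_asps
        return chosen


def least_loaded(loads: List[int], capacities: List[int]) -> Optional[int]:
    """Hypothetical overloading-avoidance baseline.

    Returns the index of the ASP with the most free capacity,
    or None when every ASP is already at capacity.
    """
    free = [cap - load for load, cap in zip(loads, capacities)]
    best = max(range(len(free)), key=lambda i: free[i])
    return best if free[best] > 0 else None


rr = RoundRobinSelector(num_asps=3)
rr_choices = [rr.select() for _ in range(5)]
balanced_choice = least_loaded(loads=[2, 0, 5], capacities=[3, 4, 5])
```

Neither baseline looks at content quality, which is the gap a learned selection policy is meant to close.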
Mining Healthcare Procurement Data Using Text Mining and Natural Language Processing -- Reflection From An Industrial Project
Zhang, Ziqi, Jasaitis, Tomas, Freeman, Richard, Alfrjani, Rowida, Funk, Adam
While text mining and NLP research has been established for decades, there remain gaps in the literature that reports the use of these techniques in building real-world applications. For example, existing studies typically look at single and sometimes simplified tasks, and do not discuss in depth the data heterogeneity and inconsistency that are common in real-world problems or their implications for method development. Also, few prior works have focused on the healthcare domain. In this work, we describe an industry project that developed text mining and NLP solutions to mine millions of heterogeneous, multilingual procurement documents in the healthcare sector. We extract structured procurement contract data that is used to power a platform for dynamically assessing supplier risks. Our work makes unique contributions in a number of ways. First, we deal with highly heterogeneous, multilingual data and we document our approach to tackling these challenges. This is mainly based on a method that effectively uses domain knowledge and generalises to multiple text mining and NLP tasks and languages. Second, applying this method to mine millions of procurement documents, we develop the first structured procurement contract database that will help facilitate the tendering process. Finally, we discuss lessons learned for practical text mining/NLP development, and make recommendations for future research and practice.
Efficient Attack Detection in IoT Devices using Feature Engineering-Less Machine Learning
Through the generalization of deep learning, the research community has addressed critical challenges in the network security domain, like malware identification and anomaly detection. However, deploying these models on Internet of Things (IoT) devices for day-to-day operations has yet to be discussed. IoT devices are often limited in memory and processing power, rendering the compute-intensive deep learning environment unusable. This research proposes a way to overcome this barrier by bypassing feature engineering in the deep learning pipeline and using raw packet data as input. We introduce a feature engineering-less machine learning (ML) process to perform malware detection on IoT devices. Our proposed model, "Feature engineering-less ML (FEL-ML)," is a lighter-weight detection algorithm that expends no extra computations on "engineered" features. It effectively accelerates the low-powered IoT edge. It is trained on unprocessed byte-streams of packets. Aside from providing better results, it is quicker than traditional feature-based methods. FEL-ML facilitates resource-sensitive network traffic security with the added benefit of eliminating the significant investment by subject matter experts in feature engineering.
INTRODUCTION
Cybersecurity experts have found pivotal features in network traffic, including packet captures (pcap). Data scientists have used them to fashion impressive models capable of differentiating malicious traffic from benign [1]. However, most network traffic is now emitted over encrypted channels. This security measure has limited experts' ability to contrive meaningful features for machine learning (ML), and the features they do contrive can soon become obsolete. This challenge has given birth to analyzing raw bytes to detect malicious behavior in internet flows. In the Internet of Things (IoT) domain, devices are sensors that interact with the environment. They conditionally react to changes in the environment and exchange information over the internet about these changes.
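Feeding unprocessed packet bytes to a model, rather than hand-engineered features, can be sketched as follows. This is only an illustration in Python: the function name `packet_to_vector`, the fixed input length, and the zero-padding and scaling choices are assumptions made here, not details taken from FEL-ML.

```python
from typing import List


def packet_to_vector(packet: bytes, length: int = 64) -> List[float]:
    """Turn raw packet bytes into a fixed-length numeric input.

    The packet is truncated to `length` bytes or zero-padded up to it,
    and every byte is scaled from 0..255 into the range [0, 1].
    No protocol fields are parsed and no features are engineered.
    """
    clipped = packet[:length]
    padded = clipped + bytes(length - len(clipped))  # bytes(n) is n zero bytes
    return [b / 255.0 for b in padded]


# A three-byte toy "packet" padded to four inputs.
example = packet_to_vector(b"\x45\x00\xff", length=4)
```

A fixed-length vector like this can be passed to any classifier; the per-packet cost is one slice and one pass over the bytes, which suits memory-constrained devices.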
A Survey of Zero-shot Generalisation in Deep Reinforcement Learning
Kirk, Robert (University College London), Zhang, Amy, Grefenstette, Edward, Rocktäschel, Tim
The study of zero-shot generalisation (ZSG) in deep Reinforcement Learning (RL) aims to produce RL algorithms whose policies generalise well to novel unseen situations at deployment time, avoiding overfitting to their training environments. Tackling this is vital if we are to deploy reinforcement learning algorithms in real world scenarios, where the environment will be diverse, dynamic and unpredictable. This survey is an overview of this nascent field. We rely on a unifying formalism and terminology for discussing different ZSG problems, building upon previous works. We go on to categorise existing benchmarks for ZSG, as well as current methods for tackling these problems. Finally, we provide a critical discussion of the current state of the field, including recommendations for future work. Among other conclusions, we argue that taking a purely procedural content generation approach to benchmark design is not conducive to progress in ZSG, we suggest fast online adaptation and tackling RL-specific problems as some areas for future work on methods for ZSG, and we recommend building benchmarks in underexplored problem settings such as offline RL ZSG and reward-function variation.
5 Strange new inventions arriving in 2023
This year's Consumer Electronics Show debuted tons of state-of-the-art technology, and people are already going nuts over it. There's a lot to be excited about, and a bit weirded out about, too, from bird feeders with cameras to pillows that breathe and even a self-driving stroller. Not sure that is mom approved. The AI-powered hummingbird feeder comes with a camera that can capture photos and videos of over 350 different hummingbird species. This just might be the coolest bird feeder around.
Equivariant and Steerable Neural Networks: A review with special emphasis on the symmetric group
Krüger, Patrick, Gottschalk, Hanno
Convolutional neural networks revolutionized computer vision and natural language processing. Their efficiency, as compared to fully connected neural networks, has its origin in the architecture, where convolutions reflect the translation invariance in space and time in pattern or speech recognition tasks. Recently, Cohen and Welling have put this in the broader perspective of invariance under symmetry groups, which leads to the concept of group equivariant neural networks and, more generally, steerable neural networks. In this article, we review the architecture of such networks, including equivariant layers and filter banks, activation with capsules, and group pooling. We apply this formalism to the symmetric group, for which we work out a number of details on representations and capsules that are not found in the literature.
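Equivariance under the symmetric group means that permuting the inputs of a layer permutes its outputs in the same way. A minimal Python sketch of this property is given below, using a standard permutation-equivariant linear map, y_i = a * x_i + b * sum_j x_j. This particular layer and the helper names are chosen here for illustration and are not taken from the reviewed article.

```python
from itertools import permutations
from typing import List, Sequence


def equivariant_layer(x: Sequence[float], a: float, b: float) -> List[float]:
    """Apply y_i = a * x_i + b * sum(x), which commutes with permutations."""
    total = sum(x)
    return [a * xi + b * total for xi in x]


def permute(x: Sequence[float], perm: Sequence[int]) -> List[float]:
    """Reorder x so that position i holds x[perm[i]]."""
    return [x[p] for p in perm]


def is_equivariant(x: Sequence[float], a: float, b: float, tol: float = 1e-9) -> bool:
    """Check layer(permute(x)) == permute(layer(x)) for every permutation."""
    for perm in permutations(range(len(x))):
        left = equivariant_layer(permute(x, perm), a, b)
        right = permute(equivariant_layer(x, a, b), perm)
        if any(abs(l - r) > tol for l, r in zip(left, right)):
            return False
    return True


outputs = equivariant_layer([1.0, 2.0, 4.0], a=2.0, b=0.5)
check = is_equivariant([1.0, 2.0, 4.0], a=2.0, b=0.5)
```

The check enumerates all n! permutations, so it is only practical for small inputs; it verifies the property numerically rather than proving it.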
Foldsformer: Learning Sequential Multi-Step Cloth Manipulation With Space-Time Attention
Mo, Kai, Xia, Chongkun, Wang, Xueqian, Deng, Yuhong, Gao, Xuehai, Liang, Bin
Sequential multi-step cloth manipulation is a challenging problem in robotic manipulation, requiring a robot to perceive the cloth state and plan a sequence of chained actions leading to the desired state. Most previous works address this problem in a goal-conditioned way, and a goal observation must be given for each specific task and cloth configuration, which is neither practical nor efficient. Thus, we present a novel multi-step cloth manipulation planning framework named Foldsformer. Foldsformer can complete similar tasks with only a general demonstration and utilizes a space-time attention mechanism to capture the instruction information behind this demonstration. We experimentally evaluate Foldsformer on four representative sequential multi-step manipulation tasks and show that Foldsformer significantly outperforms state-of-the-art approaches in simulation. Foldsformer can complete multi-step cloth manipulation tasks even when configurations of the cloth (e.g., size and pose) vary from configurations in the general demonstrations. Furthermore, our approach can be transferred from simulation to the real world without additional training or domain randomization. Despite training on rectangular clothes, we also show that our approach can generalize to unseen cloth shapes (T-shirts and shorts). Videos and source code are available at: https://sites.google.com/view/foldsformer.