Overview
Self-Supervised Learning for Time Series Analysis: Taxonomy, Progress, and Prospects
Zhang, Kexin, Wen, Qingsong, Zhang, Chaoli, Cai, Rongyao, Jin, Ming, Liu, Yong, Zhang, James, Liang, Yuxuan, Pang, Guansong, Song, Dongjin, Pan, Shirui
Self-supervised learning (SSL) has recently achieved impressive performance on various time series tasks. The most prominent advantage of SSL is that it reduces the dependence on labeled data. Based on the pre-training and fine-tuning strategy, even a small amount of labeled data can achieve high performance. Compared with many published self-supervised surveys on computer vision and natural language processing, a comprehensive survey for time series SSL is still missing. To fill this gap, we review current state-of-the-art SSL methods for time series data in this article. To this end, we first comprehensively review existing surveys related to SSL and time series, and then provide a new taxonomy of existing time series SSL methods by summarizing them from three perspectives: generative-based, contrastive-based, and adversarial-based. These methods are further divided into ten subcategories with detailed reviews and discussions about their key intuitions, main frameworks, advantages and disadvantages. To facilitate the experiments and validation of time series SSL methods, we also summarize datasets commonly used in time series forecasting, classification, anomaly detection, and clustering tasks. Finally, we present the future directions of SSL for time series analysis.
A Survey of Knowledge Graph Reasoning on Graph Types: Static, Dynamic, and Multimodal
Liang, Ke, Meng, Lingyuan, Liu, Meng, Liu, Yue, Tu, Wenxuan, Wang, Siwei, Zhou, Sihang, Liu, Xinwang, Sun, Fuchun
Knowledge graph reasoning (KGR), aiming to deduce new facts from existing facts based on mined logic rules underlying knowledge graphs (KGs), has become a fast-growing research direction. It has been proven to significantly benefit the usage of KGs in many AI applications, such as question answering, recommendation systems, and etc. According to the graph types, existing KGR models can be roughly divided into three categories, i.e., static models, temporal models, and multi-modal models. Early works in this domain mainly focus on static KGR, and recent works try to leverage the temporal and multi-modal information, which are more practical and closer to real-world. However, no survey papers and open-source repositories comprehensively summarize and discuss models in this important direction. To fill the gap, we conduct a first survey for knowledge graph reasoning tracing from static to temporal and then to multi-modal KGs. Concretely, the models are reviewed based on bi-level taxonomy, i.e., top-level (graph types) and base-level (techniques and scenarios). Besides, the performances, as well as datasets, are summarized and presented. Moreover, we point out the challenges and potential opportunities to enlighten the readers. The corresponding open-source repository is shared on GitHub https://github.com/LIANGKE23/Awesome-Knowledge-Graph-Reasoning.
Imitation Learning with Human Eye Gaze via Multi-Objective Prediction
Thakur, Ravi Kumar, Sunbeam, MD-Nazmus Samin, Goecks, Vinicius G., Novoseller, Ellen, Bera, Ritwik, Lawhern, Vernon J., Gremillion, Gregory M., Valasek, John, Waytowich, Nicholas R.
Approaches for teaching learning agents via human demonstrations have been widely studied and successfully applied to multiple domains. However, the majority of imitation learning work utilizes only behavioral information from the demonstrator, i.e. which actions were taken, and ignores other useful information. In particular, eye gaze information can give valuable insight towards where the demonstrator is allocating visual attention, and holds the potential to improve agent performance and generalization. In this work, we propose Gaze Regularized Imitation Learning (GRIL), a novel context-aware, imitation learning architecture that learns concurrently from both human demonstrations and eye gaze to solve tasks where visual attention provides important context. We apply GRIL to a visual navigation task, in which an unmanned quadrotor is trained to search for and navigate to a target vehicle in a photorealistic simulated environment. We show that GRIL outperforms several state-of-the-art gaze-based imitation learning algorithms, simultaneously learns to predict human visual attention, and generalizes to scenarios not present in the training data. Supplemental videos and code can be found at https://sites.google.com/view/gaze-regularized-il/.
Deep Injective Prior for Inverse Scattering
Khorashadizadeh, AmirEhsan, Khorashadizadeh, Vahid, Eskandari, Sepehr, Vandenbosch, Guy A. E., Dokmaniฤ, Ivan
In electromagnetic inverse scattering, the goal is to reconstruct object permittivity using scattered waves. While deep learning has shown promise as an alternative to iterative solvers, it is primarily used in supervised frameworks which are sensitive to distribution drift of the scattered fields, common in practice. Moreover, these methods typically provide a single estimate of the permittivity pattern, which may be inadequate or misleading due to noise and the ill-posedness of the problem. In this paper, we propose a data-driven framework for inverse scattering based on deep generative models. Our approach learns a low-dimensional manifold as a regularizer for recovering target permittivities. Unlike supervised methods that necessitate both scattered fields and target permittivities, our method only requires the target permittivities for training; it can then be used with any experimental setup. We also introduce a Bayesian framework for approximating the posterior distribution of the target permittivity, enabling multiple estimates and uncertainty quantification. Extensive experiments with synthetic and experimental data demonstrate that our framework outperforms traditional iterative solvers, particularly for strong scatterers, while achieving comparable reconstruction quality to state-of-the-art supervised learning methods like the U-Net.
Modeling Events and Interactions through Temporal Processes -- A Survey
Liguori, Angelica, Caroprese, Luciano, Minici, Marco, Veloso, Bruno, Spinnato, Francesco, Nanni, Mirco, Manco, Giuseppe, Gama, Joao
This problem is of scientific and practical relevance since event data is common in many real-world scenarios and sparks interest in many fields including medicine, epidemiology, engineering, earth science, economics, finance, and social science. In medicine, events can represent various situations, such as incidents, test results, diagnoses and symptoms, and medications. The advent of wearable devices and apps also allows tracking human activities, such as eating, working, sleeping, traveling, etc. Events also characterize movement patterns such as trajectories or taxi/car/public transportation adoptions. In engineering, events can represent phenomena occurring in complex environments, such as failures occurring in industrial processes. In earth science, monitoring and modeling phenomena such as volcano eruptions, seismic events, or floods are of crucial importance.
The Two Faces of AI in Green Mobile Computing: A Literature Review
Siemers, Wander, Sallou, June, Cruz, Luรญs
Artificial intelligence is bringing ever new functionalities to the realm of mobile devices that are now considered essential (e.g., camera and voice assistants, recommender systems). Yet, operating artificial intelligence takes up a substantial amount of energy. However, artificial intelligence is also being used to enable more energy-efficient solutions for mobile systems. Hence, artificial intelligence has two faces in that regard, it is both a key enabler of desired (efficient) mobile functionalities and a major power draw on these devices, playing a part in both the solution and the problem. In this paper, we present a review of the literature of the past decade on the usage of artificial intelligence within the realm of green mobile computing. From the analysis of 34 papers, we highlight the emerging patterns and map the field into 13 main topics that are summarized in details. Our results showcase that the field is slowly increasing in the past years, more specifically, since 2019. Regarding the double impact AI has on the mobile energy consumption, the energy consumption of AI-based mobile systems is under-studied in comparison to the usage of AI for energy-efficient mobile computing, and we argue for more exploratory studies in that direction. We observe that although most studies are framed as solution papers (94%), the large majority do not make those solutions publicly available to the community. Moreover, we also show that most contributions are purely academic (28 out of 34 papers) and that we need to promote the involvement of the mobile software industry in this field.
P\=uioio: On-device Real-Time Smartphone-Based Automated Exercise Repetition Counting System
Sinclair, Adam, Kautai, Kayla, Shahamiri, Seyed Reza
Automated exercise repetition counting has applications across the physical fitness realm, from personal health to rehabilitation. Motivated by the ubiquity of mobile phones and the benefits of tracking physical activity, this study explored the feasibility of counting exercise repetitions in real-time, using only on-device inference, on smartphones. In this work, after providing an extensive overview of the state-of-the-art automatic exercise repetition counting methods, we introduce a deep learning based exercise repetition counting system for smartphones consisting of five components: (1) Pose estimation, (2) Thresholding, (3) Optical flow, (4) State machine, and (5) Counter. The system is then implemented via a cross-platform mobile application named P\=uioio that uses only the smartphone camera to track repetitions in real time for three standard exercises: Squats, Push-ups, and Pull-ups. The proposed system was evaluated via a dataset of pre-recorded videos of individuals exercising as well as testing by subjects exercising in real time. Evaluation results indicated the system was 98.89% accurate in real-world tests and up to 98.85% when evaluated via the pre-recorded dataset. This makes it an effective, low-cost, and convenient alternative to existing solutions since the proposed system has minimal hardware requirements without requiring any wearable or specific sensors or network connectivity.
Model Compression Methods for YOLOv5: A Review
Jani, Mohammad, Fayyad, Jamil, Al-Younes, Younes, Najjaran, Homayoun
Over the past few years, extensive research has been devoted to enhancing YOLO object detectors. Since its introduction, eight major versions of YOLO have been introduced with the purpose of improving its accuracy and efficiency. While the evident merits of YOLO have yielded to its extensive use in many areas, deploying it on resource-limited devices poses challenges. To address this issue, various neural network compression methods have been developed, which fall under three main categories, namely network pruning, quantization, and knowledge distillation. The fruitful outcomes of utilizing model compression methods, such as lowering memory usage and inference time, make them favorable, if not necessary, for deploying large neural networks on hardware-constrained edge devices. In this review paper, our focus is on pruning and quantization due to their comparative modularity. We categorize them and analyze the practical results of applying those methods to YOLOv5. By doing so, we identify gaps in adapting pruning and quantization for compressing YOLOv5, and provide future directions in this area for further exploration. Among several versions of YOLO, we specifically choose YOLOv5 for its excellent trade-off between recency and popularity in literature. This is the first specific review paper that surveys pruning and quantization methods from an implementation point of view on YOLOv5. Our study is also extendable to newer versions of YOLO as implementing them on resource-limited devices poses the same challenges that persist even today. This paper targets those interested in the practical deployment of model compression methods on YOLOv5, and in exploring different compression techniques that can be used for subsequent versions of YOLO.
Project Florida: Federated Learning Made Easy
Diaz, Daniel Madrigal, Manoel, Andre, Chen, Jialei, Singal, Nalin, Sim, Robert
We present Project Florida, a system architecture and software development kit (SDK) enabling deployment of large-scale Federated Learning (FL) solutions across a heterogeneous device ecosystem. Federated learning is an approach to machine learning based on a strong data sovereignty principle, i.e., that privacy and security of data is best enabled by storing it at its origin, whether on end-user devices or in segregated cloud storage silos. Federated learning enables model training across devices and silos while the training data remains within its security boundary, by distributing a model snapshot to a client running inside the boundary, running client code to update the model, and then aggregating updated snapshots across many clients in a central orchestrator. Deploying a FL solution requires implementation of complex privacy and security mechanisms as well as scalable orchestration infrastructure. Scale and performance is a paramount concern, as the model training process benefits from full participation of many client devices, which may have a wide variety of performance characteristics. Project Florida aims to simplify the task of deploying cross-device FL solutions by providing cloud-hosted infrastructure and accompanying task management interfaces, as well as a multi-platform SDK supporting most major programming languages including C++, Java, and Python, enabling FL training across a wide range of operating system (OS) and hardware specifications. The architecture decouples service management from the FL workflow, enabling a cloud service provider to deliver FL-as-a-service (FLaaS) to ML engineers and application developers. We present an overview of Florida, including a description of the architecture, sample code, and illustrative experiments demonstrating system capabilities.
Statement-based Memory for Neural Source Code Summarization
Bansal, Aakash, Jiang, Siyuan, Haque, Sakib, McMillan, Collin
Source code summarization is the task of writing natural language descriptions of source code behavior. Code summarization underpins software documentation for programmers. Short descriptions of code help programmers understand the program quickly without having to read the code itself. Lately, neural source code summarization has emerged as the frontier of research into automated code summarization techniques. By far the most popular targets for summarization are program subroutines. The idea, in a nutshell, is to train an encoder-decoder neural architecture using large sets of examples of subroutines extracted from code repositories. The encoder represents the code and the decoder represents the summary. However, most current approaches attempt to treat the subroutine as a single unit. For example, by taking the entire subroutine as input to a Transformer or RNN-based encoder. But code behavior tends to depend on the flow from statement to statement. Normally dynamic analysis may shed light on this flow, but dynamic analysis on hundreds of thousands of examples in large datasets is not practical. In this paper, we present a statement-based memory encoder that learns the important elements of flow during training, leading to a statement-based subroutine representation without the need for dynamic analysis. We implement our encoder for code summarization and demonstrate a significant improvement over the state-of-the-art.