Overview
The Vulnerability-Adaptive Protection Paradigm
We present a comprehensive review of the design landscape for resilient autonomous machines. We show that existing techniques are of a "one-size-fits-all" nature, where the same protection scheme is applied to the entire software stack, leading to either high overhead or low protection strength. We provide a thorough characterization of the inherent resilience of different tasks in widely used, open source software stacks for autonomous vehicles (AutoWare) and drones (MAVBench). We show that different tasks vary significantly in their resilience under hardware faults. In particular, front-end machine vision tasks that operate on massive visual data are much more resilient to faults than back-end tasks, such as planning and control, which operate on smaller data but are more sensitive to faults. We propose VAP for resilient autonomous machines. In VAP, we spend less protection efforts on front-end machine-vision tasks and more budget on back-end planning and control tasks. Experimentally, we show that the VAP mechanism provides high protection coverage while maintaining low protection overhead on both autonomous vehicle and drone systems.
Using a CNN Model to Assess Visual Artwork's Creativity
Zhang, Zhehan, Qian, Meihua, Luo, Li, Saha, Ripon, Gao, Qianyi, Song, Xinxin
Assessing artistic creativity has long challenged researchers, with traditional methods proving time-consuming. Recent studies have applied machine learning to evaluate creativity in drawings, but not paintings. Our research addresses this gap by developing a CNN model to automatically assess the creativity of human paintings. Using a dataset of six hundred paintings by professionals and children, our model achieved 90% accuracy and faster evaluation times than human raters. This approach demonstrates the potential of machine learning in advancing artistic creativity assessment, offering a more efficient alternative to traditional methods.
ASVspoof 5: Crowdsourced Speech Data, Deepfakes, and Adversarial Attacks at Scale
Wang, Xin, Delgado, Hector, Tak, Hemlata, Jung, Jee-weon, Shim, Hye-jin, Todisco, Massimiliano, Kukanov, Ivan, Liu, Xuechen, Sahidullah, Md, Kinnunen, Tomi, Evans, Nicholas, Lee, Kong Aik, Yamagishi, Junichi
ASVspoof 5 is the fifth edition in a series of challenges that promote the study of speech spoofing and deepfake attacks, and the design of detection solutions. Compared to previous challenges, the ASVspoof 5 database is built from crowdsourced data collected from a vastly greater number of speakers in diverse acoustic conditions. Attacks, also crowdsourced, are generated and tested using surrogate detection models, while adversarial attacks are incorporated for the first time. New metrics support the evaluation of spoofing-robust automatic speaker verification (SASV) as well as stand-alone detection solutions, i.e., countermeasures without ASV. We describe the two challenge tracks, the new database, the evaluation metrics, baselines, and the evaluation platform, and present a summary of the results. Attacks significantly compromise the baseline systems, while submissions bring substantial improvements.
Overview of the BioLaySumm 2024 Shared Task on the Lay Summarization of Biomedical Research Articles
Goldsack, Tomas, Scarton, Carolina, Shardlow, Matthew, Lin, Chenghua
This paper presents the setup and results of the second edition of the BioLaySumm shared task on the Lay Summarisation of Biomedical Research Articles, hosted at the BioNLP Workshop at ACL 2024. In this task edition, we aim to build on the first edition's success by further increasing research interest in this important task and encouraging participants to explore novel approaches that will help advance the state-of-the-art. Encouragingly, we found research interest in the task to be high, with this edition of the task attracting a total of 53 participating teams, a significant increase in engagement from the previous edition. Overall, our results show that a broad range of innovative approaches were adopted by task participants, with a predictable shift towards the use of Large Language Models (LLMs).
System Identification For Constrained Robots
Zhang, Bohao, Haugk, Daniel, Vasudevan, Ram
Identifying the parameters of robotic systems, such as motor inertia or joint friction, is critical to satisfactory controller synthesis, model analysis, and observer design. Conventional identification techniques are designed primarily for unconstrained systems, such as robotic manipulators. In contrast, the growing importance of legged robots that feature closed kinematic chains or other constraints, poses challenges to these traditional methods. This paper introduces a system identification approach for constrained systems that relies on iterative least squares to identify motor inertia and joint friction parameters from data. The proposed approach is validated in simulation and in the real-world on Digit, which is a 20 degree-of-freedom humanoid robot built by Agility Robotics. In these experiments, the parameters identified by the proposed method enable a model-based controller to achieve better tracking performance than when it uses the default parameters provided by the manufacturer. The implementation of the approach is available at https://github.com/roahmlab/ConstrainedSysID.
Diffusion Model for Planning: A Systematic Literature Review
Ubukata, Toshihide, Li, Jialong, Tei, Kenji
Diffusion models, which leverage stochastic processes to capture complex data distributions effectively, have shown their performance as generative models, achieving notable success in image-related tasks through iterative denoising processes. Recently, diffusion models have been further applied and show their strong abilities in planning tasks, leading to a significant growth in related publications since 2023. To help researchers better understand the field and promote the development of the field, we conduct a systematic literature review of recent advancements in the application of diffusion models for planning. Specifically, this paper categorizes and discusses the current literature from the following perspectives: (i) relevant datasets and benchmarks used for evaluating diffusion modelbased planning; (ii) fundamental studies that address aspects such as sampling efficiency; (iii) skill-centric and condition-guided planning for enhancing adaptability; (iv) safety and uncertainty managing mechanism for enhancing safety and robustness; and (v) domain-specific application such as autonomous driving. Finally, given the above literature review, we further discuss the challenges and future directions in this field.
A survey on secure decentralized optimization and learning
Liu, Changxin, Bastianello, Nicola, Huo, Wei, Shi, Yang, Johansson, Karl H.
Decentralized optimization has become a standard paradigm for solving large-scale decision-making problems and training large machine learning models without centralizing data. However, this paradigm introduces new privacy and security risks, with malicious agents potentially able to infer private data or impair the model accuracy. Over the past decade, significant advancements have been made in developing secure decentralized optimization and learning frameworks and algorithms. This survey provides a comprehensive tutorial on these advancements. We begin with the fundamentals of decentralized optimization and learning, highlighting centralized aggregation and distributed consensus as key modules exposed to security risks in federated and distributed optimization, respectively. Next, we focus on privacy-preserving algorithms, detailing three cryptographic tools and their integration into decentralized optimization and learning systems. Additionally, we examine resilient algorithms, exploring the design and analysis of resilient aggregation and consensus protocols that support these systems. We conclude the survey by discussing current trends and potential future directions.
A Multivocal Literature Review on Privacy and Fairness in Federated Learning
Balbierer, Beatrice, Heinlein, Lukas, Zipperling, Domenique, Kühl, Niklas
Federated Learning presents a way to revolutionize AI applications by eliminating the necessity for data sharing. Yet, research has shown that information can still be extracted during training, making additional privacy-preserving measures such as differential privacy imperative. To implement real-world federated learning applications, fairness, ranging from a fair distribution of performance to non-discriminative behaviour, must be considered. Particularly in high-risk applications (e.g. healthcare), avoiding the repetition of past discriminatory errors is paramount. As recent research has demonstrated an inherent tension between privacy and fairness, we conduct a multivocal literature review to examine the current methods to integrate privacy and fairness in federated learning. Our analyses illustrate that the relationship between privacy and fairness has been neglected, posing a critical risk for real-world applications. We highlight the need to explore the relationship between privacy, fairness, and performance, advocating for the creation of integrated federated learning frameworks.
An Empirical Examination of Balancing Strategy for Counterfactual Estimation on Time Series
Huang, Qiang, Meng, Chuizheng, Cao, Defu, Huang, Biwei, Chang, Yi, Liu, Yan
Counterfactual estimation from observations represents a critical endeavor in numerous application fields, such as healthcare and finance, with the primary challenge being the mitigation of treatment bias. The balancing strategy aimed at reducing covariate disparities between different treatment groups serves as a universal solution. However, when it comes to the time series data, the effectiveness of balancing strategies remains an open question, with a thorough analysis of the robustness and applicability of balancing strategies still lacking. This paper revisits counterfactual estimation in the temporal setting and provides a brief overview of recent advancements in balancing strategies. More importantly, we conduct a critical empirical examination for the effectiveness of the balancing strategies within the realm of temporal counterfactual estimation in various settings on multiple datasets. Our findings could be of significant interest to researchers and practitioners and call for a reexamination of the balancing strategy in time series settings.
Evolving Text Data Stream Mining
A text stream is an ordered sequence of text documents generated over time. A massive amount of such text data is generated by online social platforms every day. Designing an algorithm for such text streams to extract useful information is a challenging task due to unique properties of the stream such as infinite length, data sparsity, and evolution. Thereby, learning useful information from such streaming data under the constraint of limited time and memory has gained increasing attention. During the past decade, although many text stream mining algorithms have proposed, there still exists some potential issues. First, high-dimensional text data heavily degrades the learning performance until the model either works on subspace or reduces the global feature space. The second issue is to extract semantic text representation of documents and capture evolving topics over time. Moreover, the problem of label scarcity exists, whereas existing approaches work on the full availability of labeled data. To deal with these issues, in this thesis, new learning models are proposed for clustering and multi-label learning on text streams.