Overview
Quantum Circuit Synthesis and Compilation Optimization: Overview and Prospects
Ge, Yan, Wenjie, Wu, Yuheng, Chen, Kaisen, Pan, Xudong, Lu, Zixiang, Zhou, Yuhan, Wang, Ruocheng, Wang, Junchi, Yan
Quantum computing is regarded as a promising paradigm that may overcome the current computational power bottlenecks in the post-Moore era. The increasing maturity of quantum processors, especially superconducting ones, provides more possibilities for the development and implementation of quantum algorithms. As crucial stages of quantum algorithm implementation, logic circuit design and quantum compiling have also received significant attention, covering key technologies such as quantum logic circuit synthesis (also widely known as quantum architecture search) and optimization, as well as qubit mapping and routing. Recent studies suggest that the scale and precision of related algorithms are steadily increasing, especially with the integration of artificial intelligence methods. In this survey, we systematically review and summarize a vast body of literature, exploring the feasibility of an integrated design and optimization scheme that spans from the algorithmic level to quantum hardware, combining the steps of logic circuit design and compilation optimization. Leveraging the exceptional cognitive and learning capabilities of AI algorithms, one can reduce manual design costs, enhance the precision and efficiency of execution, and facilitate the implementation and validation of the superiority of quantum algorithms on hardware.
A Survey on Deep Clustering: From the Prior Perspective
Lu, Yiding, Li, Haobin, Li, Yunfan, Lin, Yijie, Peng, Xi
Facilitated by the powerful feature extraction ability of neural networks, deep clustering has achieved great success in analyzing high-dimensional and complex real-world data. The performance of deep clustering methods is affected by various factors such as network structures and learning objectives. However, as pointed out in this survey, the essence of deep clustering lies in the incorporation and utilization of prior knowledge, which is largely ignored by existing works. From pioneering deep clustering methods based on data structure assumptions to recent contrastive clustering methods based on data augmentation invariances, the development of deep clustering intrinsically corresponds to the evolution of prior knowledge. In this survey, we provide a comprehensive review of deep clustering methods by categorizing them into six types of prior knowledge. We find that in general the prior innovation follows two trends, namely, i) from mining to constructing, and ii) from internal to external. Besides, we provide a benchmark on five widely-used datasets and analyze the performance of methods with diverse priors. By providing a novel prior knowledge perspective, we hope this survey could provide some novel insights and inspire future research in the deep clustering community.
A Survey of Optimization-based Task and Motion Planning: From Classical To Learning Approaches
Zhao, Zhigen, Cheng, Shuo, Ding, Yan, Zhou, Ziyi, Zhang, Shiqi, Xu, Danfei, Zhao, Ye
Task and Motion Planning (TAMP) integrates high-level task planning and low-level motion planning to equip robots with the autonomy to effectively reason over long-horizon, dynamic tasks. Optimization-based TAMP focuses on hybrid optimization approaches that define goal conditions via objective functions and are capable of handling open-ended goals, robotic dynamics, and physical interaction between the robot and the environment. Therefore, optimization-based TAMP is particularly suited to solve highly complex, contact-rich locomotion and manipulation problems. This survey provides a comprehensive review of optimization-based TAMP, covering (i) planning domain representations, including action description languages and temporal logic, (ii) individual solution strategies for components of TAMP, including AI planning and trajectory optimization (TO), and (iii) the dynamic interplay between logic-based task planning and model-based TO. A particular focus of this survey is to highlight the algorithm structures to efficiently solve TAMP, especially hierarchical and distributed approaches. Additionally, the survey emphasizes the synergy between the classical methods and contemporary learning-based innovations such as large language models. Furthermore, the future research directions for TAMP are discussed in this survey, highlighting both algorithmic and application-specific challenges.
Meta-Learning Loss Functions for Deep Neural Networks
Humans can often quickly and efficiently solve complex new learning tasks given only a small set of examples. In contrast, modern artificially intelligent systems often require thousands or millions of observations in order to solve even the most basic tasks. Meta-learning aims to resolve this issue by leveraging past experiences from similar learning tasks to embed the appropriate inductive biases into the learning system. Historically, methods for meta-learning components such as optimizers, parameter initializations, and more have led to significant performance increases. This thesis aims to explore the concept of meta-learning to improve performance through the often-overlooked component of the loss function. The loss function is a vital component of a learning system, as it represents the primary learning objective, where success is determined and quantified by the system's ability to optimize for that objective successfully.
Characterizing Continual Learning Scenarios and Strategies for Audio Analysis
Bhatt, Ruchi, Kumari, Pratibha, Mahapatra, Dwarikanath, Saddik, Abdulmotaleb El, Saini, Mukesh
Audio analysis is useful in many application scenarios. State-of-the-art audio analysis approaches assume that the data distribution at training and deployment time will be the same. However, due to various real-life environmental factors, the data may drift in its distribution or new classes may appear in the future. Thus, a one-time trained model might not perform adequately. In this paper, we characterize continual learning (CL) approaches in audio analysis, intended to tackle catastrophic forgetting arising due to drifts. As there is no CL dataset for audio analysis, we use DCASE 2020 to 2023 datasets to create various CL scenarios for audio-based monitoring tasks. We have investigated the following CL and non-CL approaches: EWC, LwF, SI, GEM, A-GEM, GDumb, Replay, Naive, cumulative, and joint training. The study is beneficial for researchers and practitioners working in the area of audio analysis for developing adaptive models. We observed that Replay achieved better results than other methods on the DCASE challenge data. It achieved an accuracy of 70.12% for the domain incremental scenario and an accuracy of 96.98% for the class incremental scenario.
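To make the Replay strategy named in the abstract above more concrete, the following is a minimal sketch of a rehearsal buffer. The abstract does not describe how the authors implemented Replay, so the class name `ReplayBuffer`, the use of reservoir sampling, and the `mixed_batch` helper are all assumptions made for illustration only.

```python
import random


class ReplayBuffer:
    """Hypothetical fixed-size rehearsal memory (not the authors' code).

    Reservoir sampling keeps a uniform random subset of every sample
    seen so far, so older tasks stay represented in the memory.
    """

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.samples = []
        self.seen = 0
        self.rng = random.Random(seed)

    def add(self, sample):
        """Offer one sample to the buffer."""
        self.seen += 1
        if len(self.samples) < self.capacity:
            self.samples.append(sample)
        else:
            # Replace a stored sample with probability capacity / seen.
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.samples[j] = sample

    def mixed_batch(self, new_batch, k):
        """Return the new batch followed by up to k replayed old samples."""
        replayed = self.rng.sample(self.samples, min(k, len(self.samples)))
        return list(new_batch) + replayed


# Usage: store samples from an earlier task, then mix them into a later batch.
buffer = ReplayBuffer(capacity=4)
for i in range(10):
    buffer.add(("task_a", i))
batch = buffer.mixed_batch([("task_b", 0), ("task_b", 1)], k=2)
```

Training on `batch` rather than on the new samples alone is the basic idea behind rehearsal: the model keeps seeing examples from earlier distributions while it adapts to the new one.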
Quantum Algorithms for Weighted Constrained Sampling and Weighted Model Counting
Given a Boolean formula and a function assigning weights to assignments of values to the Boolean variables, we consider the problems of Weighted Constrained Sampling (WCS) and Weighted Model Counting (WMC). The first, also called distribution-aware sampling (Chakraborty et al, 2014), involves sampling assignments to the Boolean variables with a probability proportional to their weight given that the formula is satisfied. The latter (Sang et al, 2005) consists in computing the sum of the weights of the models of the formula, i.e. the weighted model count. WCS has important applications in a variety of domains, including statistical physics (Jerrum and Sinclair, 1996), statistics (Madras and Piccioni, 1999), hardware verification (Naveh et al, 2006), and probabilistic reasoning, where it can be used to solve the problem of Most Probable Explanation (MPE) and Maximum A Posteriori (MAP). MPE (Sang et al, 2007) involves finding an assignment to all variables that satisfies a Boolean formula and has the maximum weight. The related MAP problem means finding an assignment of a subset of the variables such that the sum of the weights of the models of the formula that agree on the assignment is maximum. WMC has been successfully applied, among others, to the problem of performing inference in graphical models (Chavira and Darwiche, 2008; Sang et al, 2005).
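The definitions of WMC, WCS, and MPE above can be illustrated with a small classical brute-force sketch. This is not the quantum algorithm the paper studies; it only enumerates all assignments to show what each problem computes. The toy formula `x OR y`, the per-literal weights, and all function names are assumptions chosen for illustration.

```python
from fractions import Fraction
from itertools import product


def models(formula, variables):
    """Yield every assignment (as a dict) that satisfies the formula."""
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if formula(assignment):
            yield assignment


def weighted_model_count(formula, weight, variables):
    """WMC: sum of the weights of all models of the formula."""
    return sum(weight(a) for a in models(formula, variables))


def wcs_distribution(formula, weight, variables):
    """WCS target distribution: each model's weight divided by the WMC."""
    sat = list(models(formula, variables))
    total = sum(weight(a) for a in sat)
    return [(a, Fraction(weight(a), total)) for a in sat]


def most_probable_explanation(formula, weight, variables):
    """MPE: the model with maximum weight."""
    return max(models(formula, variables), key=weight)


# Hypothetical toy instance: formula (x OR y), assignment weight is the
# product of per-literal integer weights given as (weight if False, weight if True).
variables = ["x", "y"]
literal_weight = {"x": (1, 3), "y": (1, 2)}


def formula(a):
    return a["x"] or a["y"]


def weight(a):
    result = 1
    for v in variables:
        result *= literal_weight[v][int(a[v])]
    return result


wmc = weighted_model_count(formula, weight, variables)
dist = wcs_distribution(formula, weight, variables)
mpe = most_probable_explanation(formula, weight, variables)
```

Here the three models have weights 2, 3, and 6, so the weighted model count is 11, a WCS sampler should return them with probabilities 2/11, 3/11, and 6/11, and the MPE is the assignment with both variables true. Enumeration takes time exponential in the number of variables, which is why more efficient algorithms for these problems are of interest.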
Leveraging Ontologies to Document Bias in Data
Russo, Mayra, Vidal, Maria-Esther
The breakthroughs and benefits attributed to big data and, consequently, to machine learning (ML) or AI systems [1, 2], have also made evident how these systems are capable of producing unexpected, biased, and in some cases, undesirable output [3, 4, 5]. Seminal work on bias (i.e., prejudice for, or against one person, or group, especially in a way considered to be unfair) in the context of ML systems demonstrates how facial recognition tools and popular search engines can exacerbate demographic disparities, worsening the marginalization of minorities at the individual and group level [6, 7]. Further, biases in news recommenders and social media feeds actively play a role in conditioning and manipulating people's behavior and amplifying individual and public opinion polarization [8, 9]. In this context, the last few years have seen the consolidation of the Trustworthy AI framework, led in large part by regulatory bodies [10], with the objective of guiding commercial AI development to proactively account for ethical, legal, and technical dimensions [11]. Furthermore, this framework is also accompanied by the call to establish standards across the field in order to ensure AI systems are safe, secure and fair upon deployment [11]. In terms of AI bias, many efforts have been concentrated on devising methods that can improve its identification, understanding, measurement, and mitigation [12]. For example, the special publication prepared by the National Institute of Standards and Technology (NIST) proposes a thorough, though not exhaustive, categorization of different types of bias in AI beyond common computational definitions (see Figure 1 for core hierarchy) [13].
In this same direction, some scholars advocate for practices that account for the characteristics of ML pipelines (i.e., datasets, ML algorithms, and user interaction loop) [14] to enable actors concerned with its research, development, regulation, and use, to inspect all the actions performed across the engineering process, with the objective to increase trust placed not only on the development processes, but on the systems themselves [15, 16, 17, 18].
BioKGBench: A Knowledge Graph Checking Benchmark of AI Agent for Biomedical Science
Lin, Xinna, Ma, Siqi, Shan, Junjie, Zhang, Xiaojing, Hu, Shell Xu, Guo, Tiannan, Li, Stan Z., Yu, Kaicheng
Pursuing artificial intelligence for biomedical science, a.k.a. AI Scientist, draws increasing attention, where one common approach is to build a copilot agent driven by Large Language Models (LLMs). However, to evaluate such systems, people either rely on direct Question-Answering (QA) to the LLM itself, or evaluate in a biomedical experimental manner. How to precisely benchmark biomedical agents from an AI Scientist perspective remains largely unexplored. To this end, we draw inspiration from one of the most important abilities of scientists, understanding the literature, and introduce BioKGBench. In contrast to traditional evaluation benchmarks that only focus on factual QA, where the LLMs are known to have hallucination issues, we first disentangle "Understanding Literature" into two atomic abilities, i) "Understanding" the unstructured text from research papers by performing scientific claim verification, and ii) the ability to interact with structured Knowledge-Graph Question-Answering (KGQA) as a form of "Literature" grounding. We then formulate a novel agent task, dubbed KGCheck, using KGQA and domain-based Retrieval-Augmented Generation (RAG) to identify the factual errors of existing large-scale knowledge graph databases. We collect over two thousand data samples for the two atomic tasks and 225 high-quality annotated data samples for the agent task. Surprisingly, we discover that state-of-the-art agents, both those for daily scenarios and biomedical ones, either fail or perform poorly on our benchmark. We then introduce a simple yet effective baseline, dubbed BKGAgent. On a widely used knowledge graph, we discover over 90 factual errors which provide scenarios for agents to make discoveries and demonstrate the effectiveness of our approach. The code and data are available at https://github.com/westlake-autolab/BioKGBench.
Continual Learning of Large Language Models: A Comprehensive Survey
Shi, Haizhou, Xu, Zihao, Wang, Hengyi, Qin, Weiyi, Wang, Wenyuan, Wang, Yibin, Wang, Zifeng, Ebrahimi, Sayna, Wang, Hao
The recent success of large language models (LLMs) trained on static, pre-collected, general datasets has sparked numerous research directions and applications. One such direction addresses the non-trivial challenge of integrating pre-trained LLMs into dynamic data distributions, task structures, and user preferences. Pre-trained LLMs, when tailored for specific needs, often experience significant performance degradation in previous knowledge domains -- a phenomenon known as "catastrophic forgetting". While extensively studied in the continual learning (CL) community, it presents new manifestations in the realm of LLMs. In this survey, we provide a comprehensive overview of the current research progress on LLMs within the context of CL. This survey is structured into four main sections: we first describe an overview of continually learning LLMs, consisting of two directions of continuity: vertical continuity (or vertical continual learning), i.e., continual adaptation from general to specific capabilities, and horizontal continuity (or horizontal continual learning), i.e., continual adaptation across time and domains (Section 3). We then summarize three stages of learning LLMs in the context of modern CL: Continual Pre-Training (CPT), Domain-Adaptive Pre-training (DAP), and Continual Fine-Tuning (CFT) (Section 4). Then we provide an overview of evaluation protocols for continual learning with LLMs, along with the current available data sources (Section 5). Finally, we discuss intriguing questions pertaining to continual learning for LLMs (Section 6). The full list of papers examined in this survey is available at https://github.com/Wang-ML-Lab/llm-continual-learning-survey.
IoT-Based Preventive Mental Health Using Knowledge Graphs and Standards for Better Well-Being
Gyrard, Amelie, Mohammadi, Seyedali, Gaur, Manas, Kung, Antonio
Sustainable Development Goals (SDGs) give the UN a road map for development with Agenda 2030 as a target. SDG3 "Good Health and Well-Being" ensures healthy lives and promotes well-being for all ages. Digital technologies can support SDG3. Burnout and even depression could be reduced by encouraging better preventive health. Due to the lack of patient knowledge and focus to take care of their health, it is necessary to help patients before it is too late. New trends such as positive psychology and mindfulness are highly encouraged in the USA. Digital Twin (DT) can help with the continuous monitoring of emotion using physiological signals (e.g., collected via wearables). Digital twins facilitate monitoring and provide constant health insight to improve quality of life and well-being with better personalization. Healthcare DT challenges include standardizing data formats, communication protocols, and data exchange mechanisms. To address those data integration and knowledge challenges, we designed the Mental Health Knowledge Graph (ontology and dataset) to boost mental health. The Knowledge Graph (KG) acquires knowledge from ontology-based mental health projects classified within the LOV4IoT ontology catalog (Emotion, Depression, and Mental Health). Furthermore, the KG is mapped to standards (e.g., ontologies) when possible. Standards from ETSI SmartM2M, ITU/WHO, ISO, W3C, NIST, and IEEE are relevant to mental health.