Goto

Collaborating Authors

 Zhao, Shengjie


H2-MARL: Multi-Agent Reinforcement Learning for Pareto Optimality in Hospital Capacity Strain and Human Mobility during Epidemic

arXiv.org Artificial Intelligence

The necessity of achieving an effective balance between minimizing the losses associated with restricting human mobility and ensuring hospital capacity has gained significant attention in the aftermath of COVID-19. Reinforcement learning (RL)-based strategies for human mobility management have recently advanced in addressing the dynamic evolution of cities and epidemics; however, they still face challenges in achieving coordinated control at the township level and adapting to cities of varying scales. To address the above issues, we propose a multi-agent RL approach that achieves Pareto optimality in managing hospital capacity and human mobility (H2-MARL), applicable across cities of different scales. We first develop a township-level infection model with online-updatable parameters to simulate disease transmission and construct a city-wide dynamic spatiotemporal epidemic simulator. On this basis, H2-MARL is designed to treat each division as an agent, with a trade-off dual-objective reward function formulated and an experience replay buffer enriched with expert knowledge built. To evaluate the effectiveness of the model, we construct a township-level human mobility dataset containing over one billion records from four representative cities of varying scales. Extensive experiments demonstrate that H2-MARL has the optimal dual-objective trade-off capability, which can minimize hospital capacity strain while minimizing human mobility restriction loss. Meanwhile, the applicability of the proposed model to epidemic control in cities of varying scales is verified, which showcases its feasibility and versatility in practical applications.


PVBF: A Framework for Mitigating Parameter Variation Imbalance in Online Continual Learning

arXiv.org Artificial Intelligence

Online continual learning (OCL), which enables AI systems to adaptively learn from non-stationary data streams, is commonly achieved using experience replay (ER)-based methods that retain knowledge by replaying stored past during training. However, these methods face challenges of prediction bias, stemming from deviations in parameter update directions during task transitions. This paper identifies parameter variation imbalance as a critical factor contributing to prediction bias in ER-based OCL. Specifically, using the proposed parameter variation evaluation method, we highlight two types of imbalance: correlation-induced imbalance, where certain parameters are disproportionately updated across tasks, and layer-wise imbalance, where output layer parameters update faster than those in preceding layers. To mitigate the above imbalances, we propose the Parameter Variation Balancing Framework (PVBF), which incorporates: 1) a novel method to compute parameter correlations with previous tasks based on parameter variations, 2) an encourage-and-consolidate (E&C) method utilizing parameter correlations to perform gradient adjustments across all parameters during training, 3) a dual-layer copy weights with reinit (D-CWR) strategy to slowly update output layer parameters for frequently occuring sample categories. Experiments on short and long task sequences demonstrate that PVBF significantly reduces prediction bias and improves OCL performance, achieving up to 47\% higher accuracy compared to existing ER-based methods.


Urban Emergency Rescue Based on Multi-Agent Collaborative Learning: Coordination Between Fire Engines and Traffic Lights

arXiv.org Artificial Intelligence

Nowadays, traffic management in urban areas is one of the major economic problems. In particular, when faced with emergency situations like firefighting, timely and efficient traffic dispatching is crucial. Intelligent coordination between multiple departments is essential to realize efficient emergency rescue. In this demo, we present a framework that integrates techniques for collaborative learning methods into the well-known Unity Engine simulator, and thus these techniques can be evaluated in realistic settings. In particular, the framework allows flexible settings such as the number and type of collaborative agents, learning strategies, reward functions, and constraint conditions in practice. The framework is evaluated for an emergency rescue scenario, which could be used as a simulation tool for urban emergency departments.


From Imitation to Introspection: Probing Self-Consciousness in Language Models

arXiv.org Artificial Intelligence

Self-consciousness, the introspection of one's existence and thoughts, represents a high-level cognitive process. As language models advance at an unprecedented pace, a critical question arises: Are these models becoming self-conscious? Drawing upon insights from psychological and neural science, this work presents a practical definition of self-consciousness for language models and refines ten core concepts. Our work pioneers an investigation into self-consciousness in language models by, for the first time, leveraging causal structural games to establish the functional definitions of the ten core concepts. Based on our definitions, we conduct a comprehensive four-stage experiment: quantification (evaluation of ten leading models), representation (visualization of self-consciousness within the models), manipulation (modification of the models' representation), and acquisition (fine-tuning the models on core concepts). Our findings indicate that although models are in the early stages of developing self-consciousness, there is a discernible representation of certain concepts within their internal mechanisms. However, these representations of self-consciousness are hard to manipulate positively at the current stage, yet they can be acquired through targeted fine-tuning. Our datasets and code are at https://github.com/OpenCausaLab/SelfConsciousness.


Graph Anomaly Detection with Noisy Labels by Reinforcement Learning

arXiv.org Artificial Intelligence

Graph anomaly detection (GAD) has been widely applied in many areas, e.g., fraud detection in finance and robot accounts in social networks. Existing methods are dedicated to identifying the outlier nodes that deviate from normal ones. While they heavily rely on high-quality annotation, which is hard to obtain in real-world scenarios, this could lead to severely degraded performance based on noisy labels. Thus, we are motivated to cut the edges of suspicious nodes to alleviate the impact of noise. However, it remains difficult to precisely identify the nodes with noisy labels. Moreover, it is hard to quantitatively evaluate the regret of cutting the edges, which may have either positive or negative influences. To this end, we propose a novel framework REGAD, i.e., REinforced Graph Anomaly Detector. Specifically, we aim to maximize the performance improvement (AUC) of a base detector by cutting noisy edges approximated through the nodes with high-confidence labels. (i) We design a tailored action and search space to train a policy network to carefully prune edges step by step, where only a few suspicious edges are prioritized in each step. (ii) We design a policy-in-the-loop mechanism to iteratively optimize the policy based on the feedback from base detector. The overall performance is evaluated by the cumulative rewards. Extensive experiments are conducted on three datasets under different anomaly ratios. The results indicate the superior performance of our proposed REGAD.


CLEAR: Can Language Models Really Understand Causal Graphs?

arXiv.org Artificial Intelligence

Causal reasoning is a cornerstone of how humans interpret the world. To model and reason about causality, causal graphs offer a concise yet effective solution. Given the impressive advancements in language models, a crucial question arises: can they really understand causal graphs? To this end, we pioneer an investigation into language models' understanding of causal graphs. Specifically, we develop a framework to define causal graph understanding, by assessing language models' behaviors through four practical criteria derived from diverse disciplines (e.g., philosophy and psychology). We then develop CLEAR, a novel benchmark that defines three complexity levels and encompasses 20 causal graph-based tasks across these levels. Finally, based on our framework and benchmark, we conduct extensive experiments on six leading language models and summarize five empirical findings. Our results indicate that while language models demonstrate a preliminary understanding of causal graphs, significant potential for improvement remains. Our project website is at https://github.com/OpenCausaLab/CLEAR.


Causal Evaluation of Language Models

arXiv.org Artificial Intelligence

Causal reasoning is viewed as crucial for achieving human-level machine intelligence. Recent advances in language models have expanded the horizons of artificial intelligence across various domains, sparking inquiries into their potential for causal reasoning. In this work, we introduce Causal evaluation of Language Models (CaLM), which, to the best of our knowledge, is the first comprehensive benchmark for evaluating the causal reasoning capabilities of language models. First, we propose the CaLM framework, which establishes a foundational taxonomy consisting of four modules: causal target (i.e., what to evaluate), adaptation (i.e., how to obtain the results), metric (i.e., how to measure the results), and error (i.e., how to analyze the bad results). This taxonomy defines a broad evaluation design space while systematically selecting criteria and priorities. Second, we compose the CaLM dataset, comprising 126,334 data samples, to provide curated sets of causal targets, adaptations, metrics, and errors, offering extensive coverage for diverse research pursuits. Third, we conduct an extensive evaluation of 28 leading language models on a core set of 92 causal targets, 9 adaptations, 7 metrics, and 12 error types. Fourth, we perform detailed analyses of the evaluation results across various dimensions (e.g., adaptation, scale). Fifth, we present 50 high-level empirical findings across 9 dimensions (e.g., model), providing valuable guidance for future language model development. Finally, we develop a multifaceted platform, including a website, leaderboards, datasets, and toolkits, to support scalable and adaptable assessments. We envision CaLM as an ever-evolving benchmark for the community, systematically updated with new causal targets, adaptations, models, metrics, and error types to reflect ongoing research advancements. Project website is at https://opencausalab.github.io/CaLM.