Goto

Collaborating Authors

 Wei, Song


Assessing Electricity Service Unfairness with Transfer Counterfactual Learning

arXiv.org Artificial Intelligence

Energy justice is a growing area of interest in interdisciplinary energy research. However, identifying systematic biases in the energy sector remains challenging due to confounding variables, intricate heterogeneity in counterfactual effects, and limited data availability. First, this paper demonstrates how one can evaluate counterfactual unfairness in a power system by analyzing the average causal effect of a specific protected attribute. Subsequently, we use subgroup analysis to handle model heterogeneity and introduce a novel method for estimating counterfactual unfairness based on transfer learning, which helps to alleviate the data scarcity in each subgroup. In our numerical analysis, we apply our method to a unique large-scale customer-level power outage data set and investigate the counterfactual effect of demographic factors, such as income and age of the population, on power outage durations. Our results indicate that low-income and elderly-populated areas consistently experience longer power outages under both daily and post-disaster operations, and such discrimination is exacerbated under severe conditions. These findings suggest a widespread, systematic issue of injustice in the power service systems and emphasize the necessity for focused interventions in disadvantaged communities.


INTAGS: Interactive Agent-Guided Simulation

arXiv.org Artificial Intelligence

In many applications involving multi-agent system (MAS), it is imperative to test an experimental (Exp) autonomous agent in a high-fidelity simulator prior to its deployment to production, to avoid unexpected losses in the real-world. Such a simulator acts as the environmental background (BG) agent(s), called agent-based simulator (ABS), aiming to replicate the complex real MAS. However, developing realistic ABS remains challenging, mainly due to the sequential and dynamic nature of such systems. To fill this gap, we propose a metric to distinguish between real and synthetic multi-agent systems, which is evaluated through the live interaction between the Exp and BG agents to explicitly account for the systems' sequential nature. Specifically, we characterize the system/environment by studying the effect of a sequence of BG agents' responses to the environment state evolution and take such effects' differences as MAS distance metric; The effect estimation is cast as a causal inference problem since the environment evolution is confounded with the previous environment state. Importantly, we propose the Interactive Agent-Guided Simulation (INTAGS) framework to build a realistic ABS by optimizing over this novel metric. To adapt to any environment with interactive sequential decision making agents, INTAGS formulates the simulator as a stochastic policy in reinforcement learning. Moreover, INTAGS utilizes the policy gradient update to bypass differentiating the proposed metric such that it can support non-differentiable operations of multi-agent environments. Through extensive experiments, we demonstrate the effectiveness of INTAGS on an equity stock market simulation example. We show that using INTAGS to calibrate the simulator can generate more realistic market data compared to the state-of-the-art conditional Wasserstein Generative Adversarial Network approach.


Online Kernel CUSUM for Change-Point Detection

arXiv.org Machine Learning

We present a computationally efficient online kernel Cumulative Sum (CUSUM) method for change-point detection that utilizes the maximum over a set of kernel statistics to account for the unknown change-point location. Our approach exhibits increased sensitivity to small changes compared to existing kernel-based change-point detection methods, including Scan-B statistic, corresponding to a non-parametric Shewhart chart-type procedure. We provide accurate analytic approximations for two key performance metrics: the Average Run Length (ARL) and Expected Detection Delay (EDD), which enable us to establish an optimal window length to be on the order of the logarithm of ARL to ensure minimal power loss relative to an oracle procedure with infinite memory. Moreover, we introduce a recursive calculation procedure for detection statistics to ensure constant computational and memory complexity, which is essential for online implementation. Through extensive experiments on both simulated and real data, we demonstrate the competitive performance of our method and validate our theoretical results.


Scene-Driven Multimodal Knowledge Graph Construction for Embodied AI

arXiv.org Artificial Intelligence

Embodied AI is one of the most popular studies in artificial intelligence and robotics, which can effectively improve the intelligence of real-world agents (i.e. robots) serving human beings. Scene knowledge is important for an agent to understand the surroundings and make correct decisions in the varied open world. Currently, knowledge base for embodied tasks is missing and most existing work use general knowledge base or pre-trained models to enhance the intelligence of an agent. For conventional knowledge base, it is sparse, insufficient in capacity and cost in data collection. For pre-trained models, they face the uncertainty of knowledge and hard maintenance. To overcome the challenges of scene knowledge, we propose a scene-driven multimodal knowledge graph (Scene-MMKG) construction method combining conventional knowledge engineering and large language models. A unified scene knowledge injection framework is introduced for knowledge representation. To evaluate the advantages of our proposed method, we instantiate Scene-MMKG considering typical indoor robotic functionalities (Manipulation and Mobility), named ManipMob-MMKG. Comparisons in characteristics indicate our instantiated ManipMob-MMKG has broad superiority in data-collection efficiency and knowledge quality. Experimental results on typical embodied tasks show that knowledge-enhanced methods using our instantiated ManipMob-MMKG can improve the performance obviously without re-designing model structures complexly. Our project can be found at https://sites.google.com/view/manipmob-mmkg


Causal Graph Discovery from Self and Mutually Exciting Time Series

arXiv.org Artificial Intelligence

We present a generalized linear structural causal model, coupled with a novel data-adaptive linear regularization, to recover causal directed acyclic graphs (DAGs) from time series. By leveraging a recently developed stochastic monotone Variational Inequality (VI) formulation, we cast the causal discovery problem as a general convex optimization. Furthermore, we develop a non-asymptotic recovery guarantee and quantifiable uncertainty by solving a linear program to establish confidence intervals for a wide range of non-linear monotone link functions. We validate our theoretical results and show the competitive performance of our method via extensive numerical experiments. Most importantly, we demonstrate the effectiveness of our approach in recovering highly interpretable causal DAGs over Sepsis Associated Derangements (SADs) while achieving comparable prediction performance to powerful ``black-box'' models such as XGBoost. Thus, the future adoption of our proposed method to conduct continuous surveillance of high-risk patients by clinicians is much more likely.


Granger Causal Chain Discovery for Sepsis-Associated Derangements via Continuous-Time Hawkes Processes

arXiv.org Machine Learning

Modern health care systems are conducting continuous, automated surveillance of the electronic medical record (EMR) to identify adverse events with increasing frequency; however, many events such as sepsis do not have elucidated prodromes (i.e., event chains) that can be used to identify and intercept the adverse event early in its course. Clinically relevant and interpretable results require a framework that can (i) infer temporal interactions across multiple patient features found in EMR data (e.g., Labs, vital signs, etc.) and (ii) identify patterns that precede and are specific to an impending adverse event (e.g., sepsis). In this work, we propose a linear multivariate Hawkes process model, coupled with ReLU link function, to recover a Granger Causal (GC) graph with both exciting and inhibiting effects. We develop a scalable two-phase gradient-based method to obtain a maximum surrogate-likelihood estimator, which is shown to be effective via extensive numerical simulation. Our method is subsequently extended to a data set of patients admitted to Grady hospital system in Atlanta, GA, USA, where the estimated GC graph identifies several highly interpretable GC chains that precede sepsis. The code is available at \url{https://github.com/SongWei-GT/two-phase-MHP}.


Causal Structural Learning from Time Series: A Convex Optimization Approach

arXiv.org Artificial Intelligence

Structural learning, which aims to learn directed acyclic graphs (DAGs) from observational data, is foundational to causal reasoning and scientific discovery. Recent advancements formulate structural learning into a continuous optimization problem; however, DAG learning remains a highly non-convex problem, and there has not been much work on leveraging well-developed convex optimization techniques for causal structural learning. We fill this gap by proposing a data-adaptive linear approach for causal structural learning from time series data, which can be conveniently cast into a convex optimization problem using a recently developed monotone operator variational inequality (VI) formulation. Furthermore, we establish non-asymptotic recovery guarantee of the VI-based approach and show the superior performance of our proposed method on structure recovery over existing methods via extensive numerical experiments.


Causal Graph Discovery from Self and Mutually Exciting Time Series

arXiv.org Artificial Intelligence

We present a generalized linear structural causal model, coupled with a novel data-adaptive linear regularization, to recover causal directed acyclic graphs (DAGs) from time series. By leveraging a recently developed stochastic monotone Variational Inequality (VI) formulation, we cast the causal discovery problem as a general convex optimization. Furthermore, we develop a non-asymptotic recovery guarantee and quantifiable uncertainty by solving a linear program to establish confidence intervals for a wide range of non-linear monotone link functions. We validate our theoretical results and show the competitive performance of our method via extensive numerical experiments. Most importantly, we demonstrate the effectiveness of our approach in recovering highly interpretable causal DAGs over Sepsis Associated Derangements (SADs) while achieving comparable prediction performance to powerful ``black-box'' models such as XGBoost. Thus, the future adoption of our proposed method to conduct continuous surveillance of high-risk patients by clinicians is much more likely.


Optimal Sub-sampling to Boost Power of Kernel Sequential Change-point Detection

arXiv.org Machine Learning

We present a novel scheme to boost detection power for kernel maximum mean discrepancy based sequential change-point detection procedures. Our proposed scheme features an optimal sub-sampling of the history data before the detection procedure, in order to tackle the power loss incurred by the random sub-sample from the enormous history data. We apply our proposed scheme to both Scan $B$ and Kernel Cumulative Sum (CUSUM) procedures, and improved performance is observed from extensive numerical experiments.


Noisy Gradient Descent Converges to Flat Minima for Nonconvex Matrix Factorization

arXiv.org Machine Learning

Nonconvex optimization has been widely adopted in various domains, including image recognition (Hinton et al., 2012; Krizhevsky et al., 2012), Bayesian graphical models (Jordan et al., 2004; Attias, 2000), recommendation systems (Salakhutdinov et al., 2007), etc. Despite the fact that solving a nonconvex problem is generally difficult, empirical evidences have shown that simple first order algorithms such as stochastic gradient descent (SGD), are able to solve a majority of the aforementioned nonconvex problems efficiently. The theory behind these empirical observations, however, is still largely unexplored. In classical optimization literature, there have been fruitful results on characterizing the convergence of SGD to first-order stationary points for nonconvex problems. However, these types of results fall short of explaining the empirical evidences that SGD often converges to global minima for a wide class of nonconvex problems used in practice. More recently, understanding the role of noise in the algorithmic behavior of SGD has received significant attention. For instance, Jin et al. (2017) show that a perturbed form of gradient descent is able to escape from strict saddle points and converge to second-order stationary points (i.e., local minima). Zhou et al. (2019) further show that noise in the update can help SGD to escape from spurious local minima and converge to the global minima.