 canonical model


Turbo Autoencoder: Deep learning based channel codes for point-to-point communication channels

Neural Information Processing Systems

Designing codes that combat the noise in a communication medium has remained a significant area of research in information theory as well as wireless communications. Asymptotically optimal channel codes have been developed by mathematicians for communicating under canonical models after over 60 years of research. On the other hand, in many non-canonical channel settings, optimal codes do not exist and the codes designed for canonical models are adapted via heuristics to these channels and are thus not guaranteed to be optimal. In this work, we make significant progress on this problem by designing a fully end-to-end jointly trained neural encoder and decoder, namely, Turbo Autoencoder (TurboAE), with the following contributions: (a) under moderate block lengths, TurboAE approaches state-of-the-art performance under canonical channels; (b) moreover, TurboAE outperforms the state-of-the-art codes under non-canonical settings in terms of reliability. TurboAE shows that the development of channel coding design can be automated via deep learning, with near-optimal performance.


Causal Discovery for Linear DAGs with Dependent Latent Variables via Higher-order Cumulants

arXiv.org Machine Learning

This paper addresses the problem of estimating causal directed acyclic graphs in linear non-Gaussian acyclic models with latent confounders (LvLiNGAM). Existing methods assume mutually independent latent confounders or cannot properly handle models with causal relationships among observed variables. We propose a novel algorithm that identifies causal DAGs in LvLiNGAM, allowing causal structures among latent variables, among observed variables, and between the two. The proposed method leverages higher-order cumulants of observed data to identify the causal structure. Extensive simulations and experiments with real-world data demonstrate the validity and practical utility of the proposed algorithm.

Introduction. Estimating causal directed acyclic graphs (DAGs) in the presence of latent confounders has been a major challenge in causal analysis. Conventional causal discovery methods, such as the Peter-Clark (PC) algorithm [1], Greedy Equivalence Search (GES) [2], and the Linear Non-Gaussian Acyclic Model (LiNGAM) [3, 4], address only causal models without latent confounders. Fast Causal Inference (FCI) [1] extends the PC algorithm to handle latent variables, recovering a partial ancestral graph (PAG) under the faithfulness assumption. Greedy Fast Causal Inference (GFCI) [6] hybridizes GES and FCI but inherits the limitation of FCI. Assuming linearity and non-Gaussian disturbances in the causal model enables the identification of causal structures beyond the PAG. The linear non-Gaussian acyclic model with latent confounders (LvLiNGAM) is an extension of LiNGAM that incorporates latent confounders. Hoyer et al. [7] demonstrated that LvLiNGAM can be transformed into a canonical model in which all latent variables are mutually independent and causally precede the observed variables.
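To illustrate why higher-order cumulants are informative in the non-Gaussian setting described above, the following is a minimal sketch, not the algorithm proposed in the paper. It estimates the fourth cumulant of a single variable, which is zero for a Gaussian and nonzero for, say, a uniform distribution. The function name `fourth_cumulant`, the sample sizes, and the choice of distributions are assumptions made purely for illustration.

```python
import random


def fourth_cumulant(samples):
    """Sample estimate of the fourth cumulant: m4 - 3 * m2**2,
    where m2 and m4 are the second and fourth central moments."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [x - mean for x in samples]
    m2 = sum(c ** 2 for c in centered) / n
    m4 = sum(c ** 4 for c in centered) / n
    return m4 - 3.0 * m2 ** 2


# Illustrative data only: one Gaussian and one non-Gaussian (uniform) sample.
rng = random.Random(0)
gaussian = [rng.gauss(0.0, 1.0) for _ in range(100_000)]
uniform = [rng.uniform(-1.0, 1.0) for _ in range(100_000)]

k4_gauss = fourth_cumulant(gaussian)  # close to 0 for Gaussian data
k4_unif = fourth_cumulant(uniform)    # close to -2/15 for Uniform(-1, 1)

print(f"Gaussian: {k4_gauss:.4f}, uniform: {k4_unif:.4f}")
```

For Uniform(-1, 1) the exact value is 1/5 - 3 * (1/3)^2 = -2/15, so the estimate should sit well away from zero, whereas the Gaussian estimate should fluctuate around zero. Statistics that vanish under Gaussianity are what make structure identifiable beyond second-order (covariance) information.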


Data-Driven Modeling and Verification of Perception-Based Autonomous Systems

arXiv.org Artificial Intelligence

This paper addresses the problem of data-driven modeling and verification of perception-based autonomous systems. We assume the perception model can be decomposed into a canonical model (obtained from first principles or a simulator) and a noise model that contains the measurement noise introduced by the real environment. We focus on two types of noise, benign and adversarial noise, and develop a data-driven model for each type using generative models and classifiers, respectively. We show that the trained models perform well according to a variety of evaluation metrics based on downstream tasks such as state estimation and control. Finally, we verify the safety of two systems with high-dimensional data-driven models, namely an image-based version of mountain car (a reinforcement learning benchmark) as well as the F1/10 car, which uses LiDAR measurements to navigate a racing track.


PPFL: A Personalized Federated Learning Framework for Heterogeneous Population

arXiv.org Artificial Intelligence

Personalization aims to characterize individual preferences and is widely applied across many fields. However, conventional personalized methods operate in a centralized manner and potentially expose the raw data when pooling individual information. In this paper, with privacy considerations, we develop a flexible and interpretable personalized framework within the paradigm of Federated Learning, called PPFL (Population Personalized Federated Learning). By leveraging canonical models to capture fundamental characteristics among the heterogeneous population and employing membership vectors to reveal clients' preferences, it models the heterogeneity as clients' varying preferences for these characteristics and provides substantial insights into client characteristics, which are lacking in existing Personalized Federated Learning (PFL) methods. Furthermore, we explore the relationship between our method and three main branches of PFL methods: multi-task PFL, clustered FL, and decoupling PFL, and demonstrate the advantages of PPFL. To solve PPFL (a non-convex constrained optimization problem), we propose a novel random block coordinate descent algorithm and present its convergence property. We conduct experiments on both pathological and practical datasets, and the results validate the effectiveness of PPFL.
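One way to picture the combination of canonical models and membership vectors is sketched below. This is an assumption for illustration, not the formulation confirmed by the abstract: it supposes that each client's parameters are a convex combination of the canonical models' parameters, with the membership vector lying on the probability simplex. The function name `personalized_weights` and the toy numbers are hypothetical.

```python
def personalized_weights(canonical_models, membership):
    """Combine canonical parameter vectors into one client-specific vector.

    Assumption (not stated in the abstract): the membership vector is
    non-negative, sums to 1, and weights the canonical models linearly.
    """
    if len(membership) != len(canonical_models):
        raise ValueError("one membership weight is needed per canonical model")
    if any(m < 0 for m in membership) or abs(sum(membership) - 1.0) > 1e-9:
        raise ValueError("membership vector must lie on the probability simplex")
    dim = len(canonical_models[0])
    return [
        sum(m * model[j] for m, model in zip(membership, canonical_models))
        for j in range(dim)
    ]


# Two hypothetical canonical models with two parameters each.
canonical = [[1.0, 0.0], [0.0, 2.0]]
client_membership = [0.25, 0.75]  # this client leans toward the second model

client_params = personalized_weights(canonical, client_membership)
print(client_params)
```

Under this reading, the membership vector is directly interpretable: each entry says how strongly a client draws on one shared characteristic of the population.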


Online Modeling and Monitoring of Dependent Processes under Resource Constraints

arXiv.org Artificial Intelligence

Adaptive monitoring of a large population of dynamic processes is critical for the timely detection of abnormal events under limited resources in many healthcare and engineering systems. Examples include risk-based disease screening and condition-based process monitoring. However, existing adaptive monitoring models either ignore the dependency among processes or overlook the uncertainty in process modeling. To design an optimal monitoring strategy that accurately monitors the processes with poor health conditions and actively collects information for uncertainty reduction, this study proposes a novel online collaborative learning method. The method uses a collaborative learning-based upper confidence bound (CL-UCB) algorithm to optimally balance the exploitation and exploration of dependent processes under limited resources. The efficiency of the proposed method is demonstrated through theoretical analysis, simulation studies, and an empirical study of adaptive cognitive monitoring in Alzheimer's disease.
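The exploitation and exploration trade-off under a monitoring budget can be sketched with the classic UCB1 rule. This is a generic illustration, not the CL-UCB algorithm from the paper: it treats processes as independent and ignores the collaborative learning component. The function name `ucb_select`, the `budget` parameter, and the toy statistics are assumptions.

```python
import math


def ucb_select(means, counts, t, budget):
    """Return indices of the `budget` processes with the largest UCB1 scores.

    means[i]  : current estimate for process i (higher = more in need of monitoring)
    counts[i] : how many times process i has been observed so far
    t         : total number of observations made so far (t >= 1)
    """
    scores = []
    for i, (mean, n) in enumerate(zip(means, counts)):
        if n == 0:
            score = math.inf  # never-observed processes are explored first
        else:
            score = mean + math.sqrt(2.0 * math.log(t) / n)
        scores.append((score, i))
    # Sort by descending score, breaking ties by the lower index.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [i for _, i in scores[:budget]]


# Hypothetical state: four processes, only two can be monitored this round.
means = [0.9, 0.2, 0.5, 0.0]
counts = [50, 50, 2, 0]
chosen = ucb_select(means, counts, t=102, budget=2)
print(chosen)
```

In this toy state the unobserved process and the rarely observed one are selected ahead of the process with the highest estimate, because their uncertainty bonus dominates. A dependency-aware method such as the one described above would additionally share information across related processes.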


Logic of Awareness in Agent's Reasoning

arXiv.org Artificial Intelligence

The aim of this study is to formally express awareness for modeling practical agent communication. The notion of awareness has been proposed as a set of propositions for each agent, to which the agent pays attention, and has contributed to avoiding logical omniscience. However, when an agent guesses another agent's knowledge states, what matters is not propositions but accessible possible worlds. Therefore, we introduce a partition of possible worlds connected to awareness, that is, an equivalence relation, to denote indistinguishable worlds. Our logic is called Awareness Logic with Partition ($\mathcal{ALP}$). In this paper, we first show a running example to illustrate a practical social game. Thereafter, we introduce the syntax and Kripke semantics of the logic and prove its completeness. Finally, we outline an idea to incorporate some epistemic actions with dynamic operators that change the state of awareness.


Epistemic Syllogistic: First Steps

arXiv.org Artificial Intelligence

Although modal logic is regarded as a relatively young field, its origins can be traced back to Aristotle, who explored syllogistic reasoning patterns that incorporated modalities. However, in contrast to his utterly successful assertoric syllogistic, Aristotle's examination of modal syllogisms is often viewed as error-prone and controversial, thus receiving less attention from logicians. In the literature, a large body of research on Aristotle's modal syllogistic primarily centers on the possibility of a coherent interpretation of his proposed modal systems grounded by his philosophy on necessity and contingency (see, e.g., [11, 5, 12]). We adopt a more liberal view on Aristotle's modal syllogistic, considering it as a source of inspiration for formalizing natural reasoning patterns involving modalities, rather than scrutinizing the coherence of the original systems. Our approach is encouraged by the fruitful research program of natural logic, which explores "light" logic systems that admit intuitive reasoning patterns in natural languages while balancing expressivity and computational complexity [1, 8]. In particular, various extensions of the assertoric syllogistic have been proposed and studied [8]. In this paper, we propose a systematic study on epistemic syllogistic to initiate our technical investigations of (extensions of) modal syllogistic. The choice for the epistemic modality is intentional for its ubiquitous use in natural languages. Consider the following syllogism:

All C are B
Some C is known to be A
Some B is known to be A

Taking the intuitive de re reading, the second premise and the conclusion above can be formalized as $\exists x (Cx \land KAx)$ and $\exists x (Bx \land KAx)$ respectively in first-order modal logic (FOML).


Reverse Engineering of Temporal Queries Mediated by LTL Ontologies

arXiv.org Artificial Intelligence

In reverse engineering of database queries, we aim to construct a query from a given set of answers and non-answers; it can then be used to explore the data further or as an explanation of the answers and non-answers. We investigate this query-by-example problem for queries formulated in positive fragments of linear temporal logic LTL over timestamped data, focusing on the design of suitable query languages and the combined and data complexity of deciding whether there exists a query in the given language that separates the given answers from non-answers. We consider both plain LTL queries and those mediated by LTL-ontologies.


Decision trees compensate for model misspecification

arXiv.org Artificial Intelligence

The best-performing models in ML are not interpretable. If we can explain why they outperform, we may be able to replicate these mechanisms and obtain both interpretability and performance. One example is decision trees and their descendants, gradient boosting machines (GBMs). These perform well in the presence of complex interactions, with tree depth governing the order of interactions. However, interactions cannot fully account for the depth of trees found in practice. We confirm 5 alternative hypotheses about the role of tree depth in performance in the absence of true interactions, and present results from experiments on a battery of datasets. Part of the success of tree models is due to their robustness to various forms of misspecification.