Goto

Collaborating Authors

 civ


Miniature Testbed for Validating Multi-Agent Cooperative Autonomous Driving

Bae, Hyunchul, Lee, Eunjae, Han, Jehyeop, Kang, Minhee, Kim, Jaehyeon, Seo, Junggeun, Noh, Minkyun, Ahn, Heejin

arXiv.org Artificial Intelligence

Cooperative autonomous driving, which extends vehicle autonomy by enabling real-time collaboration between vehicles and smart roadside infrastructure, remains a challenging yet essential problem. However, none of the existing testbeds employ smart infrastructure equipped with sensing, edge computing, and communication capabilities. To address this gap, we design and implement a 1:15-scale miniature testbed, CIVAT, for validating cooperative autonomous driving, consisting of a scaled urban map, autonomous vehicles with onboard sensors, and smart infrastructure. The proposed testbed integrates V2V and V2I communication with the publish-subscribe pattern through a shared Wi-Fi and ROS2 framework, enabling information exchange between vehicles and infrastructure to realize cooperative driving functionality. As a case study, we validate the system through infrastructure-based perception and intersection management experiments.


Can AI Keep a Secret? Contextual Integrity Verification: A Provable Security Architecture for LLMs

Gupta, Aayush

arXiv.org Artificial Intelligence

Large language models (LLMs) remain acutely vulnerable to prompt injection and related jailbreak attacks; heuristic guardrails (rules, filters, LLM judges) are routinely bypassed. We present Contextual Integrity Verification (CIV), an inference-time security architecture that attaches cryptographically signed provenance labels to every token and enforces a source-trust lattice inside the transformer via a pre-softmax hard attention mask (with optional FFN/residual gating). CIV provides deterministic, per-token non-interference guarantees on frozen models: lower-trust tokens cannot influence higher-trust representations. On benchmarks derived from recent taxonomies of prompt-injection vectors (Elite-Attack + SoK-246), CIV attains 0% attack success rate under the stated threat model while preserving 93.1% token-level similarity and showing no degradation in model perplexity on benign tasks; we note a latency overhead attributable to a non-optimized data path. Because CIV is a lightweight patch -- no fine-tuning required -- we demonstrate drop-in protection for Llama-3-8B and Mistral-7B. We release a reference implementation, an automated certification harness, and the Elite-Attack corpus to support reproducible research.


Using Large Language Models to Assess Teachers' Pedagogical Content Knowledge

Yang, Yaxuan, Wang, Shiyu, Zhai, Xiaoming

arXiv.org Artificial Intelligence

Assessing teachers' pedagogical content knowledge (PCK) through performance-based tasks is both time and effort-consuming. While large language models (LLMs) offer new opportunities for efficient automatic scoring, little is known about whether LLMs introduce construct-irrelevant variance (CIV) in ways similar to or different from traditional machine learning (ML) and human raters. This study examines three sources of CIV -- scenario variability, rater severity, and rater sensitivity to scenario -- in the context of video-based constructed-response tasks targeting two PCK sub-constructs: analyzing student thinking and evaluating teacher responsiveness. Using generalized linear mixed models (GLMMs), we compared variance components and rater-level scoring patterns across three scoring sources: human raters, supervised ML, and LLM. Results indicate that scenario-level variance was minimal across tasks, while rater-related factors contributed substantially to CIV, especially in the more interpretive Task II. The ML model was the most severe and least sensitive rater, whereas the LLM was the most lenient. These findings suggest that the LLM contributes to scoring efficiency while also introducing CIV as human raters do, yet with varying levels of contribution compared to supervised ML. Implications for rater training, automated scoring design, and future research on model interpretability are discussed.


Vision language models have difficulty recognizing virtual objects

Tran, Tyler, Khemlani, Sangeet, Trafton, J. G.

arXiv.org Artificial Intelligence

Vision language models (VLMs) are AI systems paired with both language and vision encoders to process multimodal input. They are capable of performing complex semantic tasks such as automatic captioning, but it remains an open question about how well they comprehend the visuospatial properties of scenes depicted in the images they process. We argue that descriptions of virtual objects -- objects that are not visually represented in an image -- can help test scene comprehension in these AI systems. For example, an image that depicts a person standing under a tree can be paired with the following prompt: imagine that a kite is stuck in the tree. VLMs that comprehend the scene should update their representations and reason sensibly about the spatial relations between all three objects. We describe systematic evaluations of state-of-the-art VLMs and show that their ability to process virtual objects is inadequate.


Identifying Elasticities in Autocorrelated Time Series Using Causal Graphs

Tiedemann, Silvana, Canales, Jorge Sanchez, Schur, Felix, Sgarlato, Raffaele, Hirth, Lion, Ruhnau, Oliver, Peters, Jonas

arXiv.org Machine Learning

The price elasticity of demand can be estimated from observational data using instrumental variables (IV). However, naive IV estimators may be inconsistent in settings with autocorrelated time series. We argue that causal time graphs can simplify IV identification and help select consistent estimators. To do so, we propose to first model the equilibrium condition by an unobserved confounder, deriving a directed acyclic graph (DAG) while maintaining the assumption of a simultaneous determination of prices and quantities. We then exploit recent advances in graphical inference to derive valid IV estimators, including estimators that achieve consistency by simultaneously estimating nuisance effects. We further argue that observing significant differences between the estimates of presumably valid estimators can help to reject false model assumptions, thereby improving our understanding of underlying economic dynamics. We apply this approach to the German electricity market, estimating the price elasticity of demand on simulated and real-world data. The findings underscore the importance of accounting for structural autocorrelation in IV-based analysis.


Conditional Instrumental Variable Regression with Representation Learning for Causal Inference

Cheng, Debo, Xu, Ziqi, Li, Jiuyong, Liu, Lin, Liu, Jixue, Le, Thuc Duy

arXiv.org Artificial Intelligence

This paper studies the challenging problem of estimating causal effects from observational data, in the presence of unobserved confounders. The two-stage least square (TSLS) method and its variants with a standard instrumental variable (IV) are commonly used to eliminate confounding bias, including the bias caused by unobserved confounders, but they rely on the linearity assumption. Besides, the strict condition of unconfounded instruments posed on a standard IV is too strong to be practical. To address these challenging and practical problems of the standard IV method (linearity assumption and the strict condition), in this paper, we use a conditional IV (CIV) to relax the unconfounded instrument condition of standard IV and propose a non-linear CIV regression with Confounding Balancing Representation Learning, CBRL.CIV, for jointly eliminating the confounding bias from unobserved confounders and balancing the observed confounders, without the linearity assumption. We theoretically demonstrate the soundness of CBRL.CIV. Extensive experiments on synthetic and two real-world datasets show the competitive performance of CBRL.CIV against state-of-the-art IV-based estimators and superiority in dealing with the non-linear situation.


Learning Conditional Instrumental Variable Representation for Causal Effect Estimation

Cheng, Debo, Xu, Ziqi, Li, Jiuyong, Liu, Lin, Le, Thuc Duy, Liu, Jixue

arXiv.org Artificial Intelligence

One of the fundamental challenges in causal inference is to estimate the causal effect of a treatment on its outcome of interest from observational data. However, causal effect estimation often suffers from the impacts of confounding bias caused by unmeasured confounders that affect both the treatment and the outcome. The instrumental variable (IV) approach is a powerful way to eliminate the confounding bias from latent confounders. However, the existing IV-based estimators require a nominated IV, and for a conditional IV (CIV) the corresponding conditioning set too, for causal effect estimation. This limits the application of IV-based estimators. In this paper, by leveraging the advantage of disentangled representation learning, we propose a novel method, named DVAE.CIV, for learning and disentangling the representations of CIV and the representations of its conditioning set for causal effect estimations from data with latent confounders. Extensive experimental results on both synthetic and real-world datasets demonstrate the superiority of the proposed DVAE.CIV method against the existing causal effect estimators.


Causal Inference with Conditional Instruments using Deep Generative Models

Cheng, Debo, Xu, Ziqi, Li, Jiuyong, Liu, Lin, Liu, Jixue, Le, Thuc Duy

arXiv.org Artificial Intelligence

The instrumental variable (IV) approach is a widely used way to estimate the causal effects of a treatment on an outcome of interest from observational data with latent confounders. A standard IV is expected to be related to the treatment variable and independent of all other variables in the system. However, it is challenging to search for a standard IV from data directly due to the strict conditions. The conditional IV (CIV) method has been proposed to allow a variable to be an instrument conditioning on a set of variables, allowing a wider choice of possible IVs and enabling broader practical applications of the IV approach. Nevertheless, there is not a data-driven method to discover a CIV and its conditioning set directly from data. To fill this gap, in this paper, we propose to learn the representations of the information of a CIV and its conditioning set from data with latent confounders for average causal effect estimation. By taking advantage of deep generative models, we develop a novel data-driven approach for simultaneously learning the representation of a CIV from measured variables and generating the representation of its conditioning set given measured variables. Extensive experiments on synthetic and real-world datasets show that our method outperforms the existing IV methods.


Turing Completeness and Sid Meier's Civilization

de Wynter, Adrian

arXiv.org Artificial Intelligence

We prove that three strategy video games from the Sid Meier's Civilization series: Sid Meier's Civilization: Beyond Earth, Sid Meier's Civilization V, and Sid Meier's Civilization VI, are Turing complete. We achieve this by building three universal Turing machines-one for each game-using only the elements present in the games, and using their internal rules and mechanics as the transition function. The existence of such machines imply that under the assumptions made, the games are undecidable. We show constructions of these machines within a running game session, and we provide a sample execution of an algorithm-the three-state Busy Beaver-with one of our machines.