Partial Structure Discovery is Sufficient for No-regret Learning in Causal Bandits

Jun-1-2025, 14:42:21 GMT–Neural Information Processing Systems

Causal knowledge about the relationships among decision variables and a reward variable in a bandit setting can accelerate the learning of an optimal decision. Existing work often assumes that the causal graph is known, but this knowledge may not be available a priori. Motivated by this challenge, we focus on the causal bandit problem in scenarios where the underlying causal graph is unknown and may include latent confounders. While intervening on the parents of the reward node is optimal in the absence of latent confounders, this is not necessarily the case in general. Instead, one must consider a set of possibly optimal arms/interventions, each being a special subset of the ancestors of the reward node, which makes causal discovery beyond the parents of the reward node essential.
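
To make the distinction between the parents and the ancestors of the reward node concrete, the following is a minimal sketch on a hypothetical toy graph. The graph, the variable names (`X1`, `X2`, `X3`, `Y`), and the helper `ancestors` are assumptions introduced for illustration only; the sketch does not model latent confounders and is not the paper's discovery or bandit algorithm.

```python
from typing import Dict, List, Set


def ancestors(parents: Dict[str, List[str]], node: str) -> Set[str]:
    """Return every node that has a directed path into `node`."""
    seen: Set[str] = set()
    stack = list(parents.get(node, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            # Walk further up the graph through this node's own parents.
            stack.extend(parents.get(current, []))
    return seen


# Hypothetical toy graph (an assumption, not taken from the paper):
#   X1 -> X2 -> Y   and   X3 -> Y
toy_parents = {"Y": ["X2", "X3"], "X2": ["X1"], "X1": [], "X3": []}

reward_parents = set(toy_parents["Y"])       # direct causes of the reward
reward_ancestors = ancestors(toy_parents, "Y")  # direct and indirect causes

# Ancestors of the reward that are not its parents.
beyond_parents = reward_ancestors - reward_parents
```

In this toy graph, `beyond_parents` contains the single node `X1`. Candidate interventions drawn from subsets of the ancestors can therefore involve variables that a parents-only search would never examine.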



Genre:
- Research Report > Experimental Study (0.92)

Industry:
- Health & Medicine (0.68)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning (1.00)
  - Natural Language (0.67)
  - Representation & Reasoning > Uncertainty (0.67)