COPP
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- Asia > Singapore (0.04)
A Proofs
In this proof, we use the notion of weighted exchangeability as defined in Section 3.2 of [27].

A.2 Proof of Proposition 4.2
The following proof is an adaptation of [14, Proposition 1] to our setting. To get from (32) to (33), we use Assumption 2 and Markov's inequality.

B.1 Further comments on the differences between [14] and COPP
In this subsection, we elaborate on the differences between our work and [14]. As mentioned in the main text, because we integrate out the action in Eq. 7, we are able to use the full dataset when constructing the CP intervals.
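For reference, the Markov's inequality step invoked above is the standard form: for a nonnegative random variable Z and any t > 0,

```latex
\Pr(Z \ge t) \;\le\; \frac{\mathbb{E}[Z]}{t}, \qquad Z \ge 0,\; t > 0.
```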
Conformal Off-policy Prediction
Zhang, Yingying, Shi, Chengchun, Luo, Shikai
Off-policy evaluation is critical in a number of applications where new policies need to be evaluated offline before online deployment. Most existing methods focus on the expected return, define the target parameter through averaging and provide a point estimator only. In this paper, we develop a novel procedure to produce reliable interval estimators for a target policy's return starting from any initial state. Our proposal accounts for the variability of the return around its expectation, focuses on the individual effect and offers valid uncertainty quantification. Our main idea lies in designing a pseudo policy that generates subsamples as if they were sampled from the target policy so that existing conformal prediction algorithms are applicable to prediction interval construction. Our methods are justified by theories, synthetic data and real data from short-video platforms.
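The pseudo-policy idea described above can be illustrated with rejection sampling: keep each logged transition with probability proportional to the target policy's probability of the logged action, so the retained subsample behaves as if its actions had been drawn from the target policy, and an off-the-shelf conformal procedure can then be run on it. The sketch below is a generic illustration under toy assumptions (uniform behavioral policy, hand-coded point predictor), not the paper's exact construction:

```python
import numpy as np

rng = np.random.default_rng(1)

# Logged bandit data under a uniform behavioral policy over 3 actions.
n, n_actions = 5000, 3
x = rng.normal(size=n)
a = rng.integers(0, n_actions, size=n)
y = x + a + rng.normal(scale=0.5, size=n)

def pi_target(action, context):
    """Toy target policy: softmax over actions, sharper for larger context."""
    logits = np.arange(n_actions) * context
    p = np.exp(logits - logits.max())
    return (p / p.sum())[action]

pi_behave = 1.0 / n_actions

# Rejection step: accept each logged point w.p. ratio / M, where M bounds the
# density ratio so every acceptance probability stays <= 1.
ratio = np.array([pi_target(a[i], x[i]) / pi_behave for i in range(n)])
M = ratio.max()
keep = rng.uniform(size=n) < ratio / M

# The accepted subsample is (approximately) distributed as if actions were
# drawn from the target policy; run split conformal prediction on it.
xs, ys = x[keep], y[keep]
mu = lambda c: c + 1.0                      # stand-in point predictor
scores = np.abs(ys - mu(xs))                # conformity scores
q = np.quantile(scores, 0.9)                # split-conformal quantile, alpha = 0.1
print(f"kept {keep.sum()} of {n} points, interval half-width {q:.2f}")
```

Rejection sampling shrinks the usable sample (only `keep.sum()` points survive), which is one motivation for weighted approaches that reuse the full dataset instead.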
Conformal Off-Policy Prediction in Contextual Bandits
Taufiq, Muhammad Faaiz, Ton, Jean-Francois, Cornish, Rob, Teh, Yee Whye, Doucet, Arnaud
Most off-policy evaluation methods for contextual bandits have focused on the expected outcome of a policy, which is estimated via methods that at best provide only asymptotic guarantees. However, in many applications, the expectation may not be the best measure of performance as it does not capture the variability of the outcome. In addition, particularly in safety-critical settings, stronger guarantees than asymptotic correctness may be required. To address these limitations, we consider a novel application of conformal prediction to contextual bandits. Given data collected under a behavioral policy, we propose \emph{conformal off-policy prediction} (COPP), which can output reliable predictive intervals for the outcome under a new target policy. We provide theoretical finite-sample guarantees without making any additional assumptions beyond the standard contextual bandit setup, and empirically demonstrate the utility of COPP compared with existing methods on synthetic and real-world data.
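Weighted conformal prediction, which COPP builds on, can be sketched in a few lines: calibration residuals are reweighted by the density ratio between the target and behavioral policies before the quantile is taken, so the interval remains calibrated under the policy shift. The function names, the quadratic data-generating process, and the particular policies below are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def weighted_quantile(scores, weights, alpha):
    """Smallest score s such that the weighted mass of {scores <= s} is >= 1 - alpha."""
    order = np.argsort(scores)
    s, w = scores[order], weights[order]
    cum = np.cumsum(w) / np.sum(w)
    idx = np.searchsorted(cum, 1 - alpha)
    return s[min(idx, len(s) - 1)]

# Toy calibration set: outcome depends on context x and binary action a.
n = 2000
x = rng.normal(size=n)
a = rng.binomial(1, 0.5, size=n)           # behavioral policy: uniform over {0, 1}
y = x + a * x**2 + rng.normal(scale=0.3, size=n)

mu = lambda x, a: x + a * x**2             # stand-in point predictor (fit elsewhere)
resid = np.abs(y - mu(x, a))               # conformity scores on calibration data

# Target policy: choose a = 1 with probability sigmoid(x).
p1 = 1 / (1 + np.exp(-x))
pi_target = np.where(a == 1, p1, 1 - p1)
pi_behave = np.full(n, 0.5)
w = pi_target / pi_behave                  # importance (density-ratio) weights

q = weighted_quantile(resid, w, alpha=0.1)
# 90% predictive interval for a new context x0 under the target policy, action a0:
x0, a0 = 0.5, 1
print((mu(x0, a0) - q, mu(x0, a0) + q))
```

Because the weighting integrates policy-shift information into the quantile itself, every calibration point contributes, in contrast to subsampling schemes that discard rejected points.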
Planning for Proactive Assistance in Environments with Partial Observability
Kulkarni, Anagha, Srivastava, Siddharth, Kambhampati, Subbarao
This paper addresses the problem of synthesizing the behavior of an AI agent that provides proactive task assistance to a human in settings like factory floors where they may coexist in a common environment. Unlike in the case of requested assistance, the human may not be expecting proactive assistance, and hence it is crucial for the agent to ensure that the human is aware of how the assistance affects her task. This becomes harder when there is a possibility that the human may neither have full knowledge of the AI agent's capabilities nor have full observability of its activities.

The AI agent and the human coexist, and have partial observability of each other's activities. There are several real-world workspaces like factory floors, warehouses, restaurants, nursing homes for the elderly, disaster response areas, etc., where this problem of providing proactive task assistance to the involved humans is important. Our formulation considers a scenario where the AI agent is aware of the tasks being allocated to the human by the ecosystem and may also know the rules and protocols of the ecosystem. We assume that the agent has access to an input that captures the human's planning process for her goals. For instance, prior works that study the problem of action model acquisition [Zhuo and Yang, 2014; Zhuo and Kambhampati, 2013] can be used to derive the human's
How AI Can Solve Your Worst Corporate Nightmare - Texas CEO Magazine
Volkswagen made international headlines this year when the company had to shell out $15 billion after a high-profile emissions scandal rocked the automaker and left its future uncertain. It's a prime example of the type of situation David Copps is trying to prevent with his Dallas-based AI business, Brainspace. "If they had known who said what and when early on in terms of all the emissions problems that they're having, they potentially could've saved $12 billion," Copps says. That's where he sets his sights with Brainspace. The company uses AI and machine learning to enable organizations to garner insights from the data they accumulate each business day, faster than ever before.
Augmenting Human Intelligence
As what was once mere data evolves into actionable intelligence, the context that binds that data becomes ever more essential. Take the word "java." With no context around those four letters, you might not understand the reference or make any sort of connection. But if you add just one word to "java," such as "development," "island," or "coffee," the reference changes completely, and that's with just a single word of context. This is the type of active context and connection that the Brainspace engine provides. "Context is a very important part of what we do. When we analyze documents, we take the context into consideration," says Ravi Sathyanna, vice president of technology and product management at Brainspace.
- North America > United States (0.05)
- Europe (0.05)
- Law (0.50)
- Automobiles & Trucks > Manufacturer (0.48)