hca
Vision LLMs Are Bad at Hierarchical Visual Understanding, and LLMs Are the Bottleneck
Tan, Yuwen, Qing, Yuan, Gong, Boqing
This paper reveals that many state-of-the-art large language models (LLMs) lack hierarchical knowledge about our visual world, unaware of even well-established biology taxonomies. This shortcoming makes LLMs a bottleneck for vision LLMs' hierarchical visual understanding (e.g., recognizing Anemone Fish but not Vertebrate). We arrive at these findings using about one million four-choice visual question answering (VQA) tasks constructed from six taxonomies and four image datasets. Interestingly, finetuning a vision LLM using our VQA tasks reaffirms LLMs' bottleneck effect to some extent because the VQA tasks improve the LLM's hierarchical consistency more than the vision LLM's. We conjecture that one cannot make vision LLMs understand visual concepts in a fully hierarchical way until LLMs possess corresponding taxonomy knowledge.
Would I have gotten that reward? Long-term credit assignment by counterfactual contribution analysis
To make reinforcement learning more sample efficient, we need better credit assignment methods that measure an action's influence on future rewards. Building upon Hindsight Credit Assignment (HCA), we introduce Counterfactual Contribution Analysis (COCOA), a new family of model-based credit assignment algorithms. Our algorithms achieve precise credit assignment by measuring the contribution of actions upon obtaining subsequent rewards, by quantifying a counterfactual query: 'Would the agent still have reached this reward if it had taken another action?'. We show that measuring contributions w.r.t. We run experiments on a suite of problems specifically designed to evaluate long-term credit assignment capabilities.
Human-Centered Automation
The rapid advancement of Generative Artificial Intelligence (AI), such as Large Language Models (LLMs) and Multimodal Large Language Models (MLLMs), has the potential to revolutionize the way we work and interact with digital systems across various industries. However, the current state of software automation, such as Robotic Process Automation (RPA) frameworks, often requires domain expertise and lacks visibility and intuitive interfaces, making it challenging for users to fully leverage these technologies. This position paper argues for the emerging area of Human-Centered Automation (HCA), which prioritizes user needs and preferences in the design and development of automation systems. Drawing on empirical evidence from human-computer interaction research and case studies, we highlight the importance of considering user perspectives in automation and propose a framework for designing human-centric automation solutions. The paper discusses the limitations of existing automation approaches, the challenges in integrating AI and RPA, and the benefits of human-centered automation for productivity, innovation, and democratizing access to these technologies. We emphasize the importance of open-source solutions and provide examples of how HCA can empower individuals and organizations in the era of rapidly progressing AI, helping them remain competitive. The paper also explores pathways to achieve more advanced and context-aware automation solutions. We conclude with a call to action for researchers and practitioners to focus on developing automation technologies that adapt to user needs, provide intuitive interfaces, and leverage the capabilities of high-end AI to create a more accessible and user-friendly future of automation.
HiQA: A Hierarchical Contextual Augmentation RAG for Massive Documents QA
Chen, Xinyue, Gao, Pengyu, Song, Jiangjiang, Tan, Xiaoyang
As language model agents leveraging external tools rapidly evolve, significant progress has been made in question-answering (QA) methodologies utilizing supplementary documents and the Retrieval-Augmented Generation (RAG) approach. This advancement has improved the response quality of language models and alleviated hallucination. However, these methods exhibit limited retrieval accuracy when faced with massive indistinguishable documents, presenting notable challenges in their practical application. In response to these emerging challenges, we present HiQA, an advanced framework for multi-document question-answering (MDQA) that integrates cascading metadata into content as well as a multi-route retrieval mechanism. We also release a benchmark called MasQA for evaluation and research in MDQA. Finally, HiQA demonstrates state-of-the-art performance in multi-document environments.
Structural Causal Models Reveal Confounder Bias in Linear Program Modelling
Zečević, Matej, Dhami, Devendra Singh, Kersting, Kristian
Recent years have been marked by extended research on adversarial attacks, especially on deep neural networks. With this work we intend to pose and investigate the question of whether the phenomenon might be more general in nature, that is, adversarial-style attacks outside classical classification tasks. Specifically, we investigate optimization problems as they constitute a fundamental part of modern AI research. To this end, we consider the base class of optimizers, namely Linear Programs (LPs). On our initial attempt at a naïve mapping between the formalism of adversarial examples and LPs, we quickly identify the key ingredients missing for making sense of a reasonable notion of adversarial examples for LPs. Intriguingly, the formalism of Pearl's notion of causality allows for the right description of adversarial-like examples for LPs. Characteristically, we show the direct influence of the Structural Causal Model (SCM) onto the subsequent LP optimization, which ultimately exposes a notion of confounding in LPs (inherited by said SCM) that allows for adversarial-style attacks. We provide both the general proof formally alongside existential proofs of such intriguing LP-parameterizations based on SCM for three combinatorial problems, namely Linear Assignment, Shortest Path and a real world problem of energy systems.
An Active Learning-based Approach for Hosting Capacity Analysis in Distribution Systems
Lee, Kiyeob, Zhao, Peng, Bhattacharya, Anirban, Mallick, Bani K., Xie, Le
With the increasing integration of distributed energy resources (DERs), there is a significant need to model and analyze hosting capacity (HC) for future electric distribution grids. Hosting capacity analysis (HCA) examines the amount of DERs that can be safely integrated into the grid and is a challenging task in full generality because there are many possible integrations of DERs in foresight. That is, there are numerous extreme points between feasible and infeasible sets. Moreover, HC depends on multiple factors such as (a) adoption patterns of DERs that depend on socio-economic behaviors and (b) how DERs are controlled and managed. These two factors are intrinsic to the problem space because not all integrations of DERs may be centrally planned, and could largely change our understanding about HC. This paper addresses the research gap by capturing the two factors (a) and (b) in HCA and by identifying a few of the most insightful HC scenarios at the cost of domain knowledge. We propose a data-driven HCA framework and introduce active learning in HCA to effectively explore scenarios. Active learning in HCA and characteristics of HC with respect to the two factors (a) and (b) are illustrated in a 3-bus example. Next, detailed large-scale studies are proposed to understand the significance of (a) and (b). Our findings suggest that HC and its interpretations significantly change subject to the two factors (a) and (b).
Towards Practical Credit Assignment for Deep Reinforcement Learning
Alipov, Vyacheslav, Simmons-Edler, Riley, Putintsev, Nikita, Kalinin, Pavel, Vetrov, Dmitry
Credit assignment is a fundamental problem in reinforcement learning, the problem of measuring an action's influence on future rewards. Improvements in credit assignment methods have the potential to boost the performance of RL algorithms on many tasks, but thus far have not seen widespread adoption. Recently, a family of methods called Hindsight Credit Assignment (HCA) was proposed, which explicitly assign credit to actions in hindsight based on the probability of the action having led to an observed outcome. This approach is appealing as a means to more efficient data usage, but remains a largely theoretical idea applicable to a limited set of tabular RL tasks, and it is unclear how to extend HCA to Deep RL environments. In this work, we explore the use of HCA-style credit in a deep RL context. We first describe the limitations of existing HCA algorithms in deep RL, then propose several theoretically-justified modifications to overcome them. Based on this exploration, we present a new algorithm, Credit-Constrained Advantage Actor-Critic (C2A2C), which ignores policy updates for actions which don't affect future outcomes based on credit in hindsight, while updating the policy as normal for those that do. We find that C2A2C outperforms Advantage Actor-Critic (A2C) on the Arcade Learning Environment (ALE) benchmark, showing broad improvements over A2C and motivating further work on credit-constrained update rules for deep RL methods.