Entropy Regularization for Population Estimation

Chugg, Ben, Henderson, Peter, Goldin, Jacob, Ho, Daniel E.

Aug-24-2022–arXiv.org Artificial Intelligence

While most frameworks for online sequential decision-making focus on maximizing reward, in practice this is rarely the sole objective. Other considerations may include budget constraints, fair treatment, or the estimation of population characteristics. There is growing recognition that these other objectives must be formally integrated into sequential decision-making frameworks, especially if such algorithms are to be used in sensitive application areas [21]. In this work, we focus on the problem of maximizing reward while simultaneously estimating the population total (equivalently, the mean) in a structured bandit setting. The most natural approach to this problem from a machine learning perspective is to use a model to predict the mean. However, adaptively collected data are subject to bias, which in turn biases the model's estimates [29].
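To make the bias concern concrete, the following is a minimal simulation sketch and not the method of the paper. It assumes a hypothetical fixed population, a reward-seeking sampling design whose inclusion probabilities increase with each unit's value (in practice they would depend on a predicted reward), and it compares the naive sample mean with the standard Horvitz-Thompson estimator from survey sampling, which reweights each sampled unit by the inverse of its inclusion probability. The function name `simulate` and all parameter values are illustrative assumptions.

```python
import random


def simulate(n_units=200, n_reps=2000, seed=0):
    """Compare a naive sample mean with a Horvitz-Thompson estimate
    when high-value units are more likely to be sampled."""
    rng = random.Random(seed)

    # Hypothetical population: one fixed value per unit (assumption).
    y = [rng.uniform(0.0, 10.0) for _ in range(n_units)]
    true_mean = sum(y) / n_units

    # Assumed reward-seeking design: the inclusion probability grows
    # linearly with the unit's value and stays within [0.05, 0.55].
    pi = [0.05 + 0.05 * v for v in y]

    naive_means, ht_means = [], []
    for _ in range(n_reps):
        # Poisson sampling: each unit is included independently.
        sampled = [i for i in range(n_units) if rng.random() < pi[i]]

        # Naive estimator: plain average of the sampled values.
        if sampled:
            naive_means.append(sum(y[i] for i in sampled) / len(sampled))

        # Horvitz-Thompson estimator of the population mean:
        # weight each sampled value by 1 / inclusion probability.
        ht_means.append(sum(y[i] / pi[i] for i in sampled) / n_units)

    def average(xs):
        return sum(xs) / len(xs)

    return true_mean, average(naive_means), average(ht_means)


if __name__ == "__main__":
    true_mean, naive, ht = simulate()
    print(f"true mean:              {true_mean:.3f}")
    print(f"naive mean (avg):       {naive:.3f}")
    print(f"Horvitz-Thompson (avg): {ht:.3f}")
```

Under these assumptions, the naive average drifts above the true mean because high-value units are over-represented in each sample, while the inverse-probability-weighted estimate stays close to it on average. The correction depends on every unit having a known, nonzero inclusion probability, which a purely greedy selection rule does not provide.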

Country:
- North America > United States > Illinois > Cook County > Chicago (0.04)

Genre:
- Research Report (1.00)

Industry:
- Government (0.68)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning > Optimization (0.68)
  - Machine Learning
    - Statistical Learning (1.00)
    - Reinforcement Learning (0.68)
