Learning from Logged Implicit Exploration Data

Strehl, Alex, Langford, John, Li, Lihong, Kakade, Sham M.

Dec-31-2010–Neural Information Processing Systems

We provide a sound and consistent foundation for the use of nonrandom exploration data in "contextual bandit" or "partially labeled" settings where only the value of a chosen action is learned. The primary challenge in a variety of settings is that the exploration policy, in which "offline" data is logged, is not explicitly known. Prior solutions here require either control of the actions during the learning process, recorded random exploration, or actions chosen obliviously in a repeated manner. The techniques reported here lift these restrictions, allowing the learning of a policy for choosing actions given features from historical data where no randomization occurred or was logged. We empirically verify our solution on two reasonably sized sets of real-world data obtained from an Internet advertising company.
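
The abstract does not spell out an estimator, so the following is only an illustrative sketch of one common way to evaluate a policy from logged data when the logging policy is unknown: first estimate how often each action was chosen in each context, then reweight observed rewards by the inverse of that estimate, clipped from below by a threshold. The function names, the toy log, the discrete hashable contexts, and the threshold `tau` are all assumptions made for this example. In practice contexts are feature vectors and the propensities would come from a learned model rather than raw counts.

```python
from collections import Counter, defaultdict


def estimate_propensities(logs):
    """Estimate pi_hat(a | x) as the empirical frequency of each action per context.

    Assumes contexts are hashable (a simplification for illustration).
    """
    counts = defaultdict(Counter)
    for context, action, _reward in logs:
        counts[context][action] += 1
    return {
        context: {a: c / sum(actions.values()) for a, c in actions.items()}
        for context, actions in counts.items()
    }


def clipped_ips_value(logs, policy, propensities, tau=0.05):
    """Average of r * 1[policy(x) == a] / max(pi_hat(a | x), tau) over the log."""
    total = 0.0
    for context, action, reward in logs:
        # Only logged events where the evaluated policy agrees with the
        # logged action contribute to the estimate.
        if policy(context) == action:
            p_hat = propensities[context][action]
            total += reward / max(p_hat, tau)
    return total / len(logs)


# Hypothetical toy log of (context, action, reward) tuples.
logs = [
    ("sports", "ad_a", 1.0),
    ("sports", "ad_a", 0.0),
    ("sports", "ad_b", 1.0),
    ("news", "ad_b", 1.0),
]

propensities = estimate_propensities(logs)
# Evaluate a hypothetical policy that always shows "ad_b".
value = clipped_ips_value(logs, lambda context: "ad_b", propensities)
print(value)
```

Clipping with `tau` trades a small bias for lower variance when an estimated propensity is close to zero. The right threshold depends on the data and is not something this sketch tries to tune.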

artificial intelligence, data mining, machine learning

Country:
- North America > United States
  - Pennsylvania (0.28)
  - California (0.28)

Industry:
- Marketing (0.46)
- Information Technology (0.46)

Technology:
- Information Technology
  - Data Science > Data Mining (1.00)
  - Artificial Intelligence
    - Representation & Reasoning (0.93)
    - Machine Learning > Statistical Learning (0.46)
