Reviews: Stochastic Structured Prediction under Bandit Feedback

Jan-20-2025, 13:27:09 GMT–Neural Information Processing Systems

Summary: This paper proposes a stochastic online learning method for structured prediction. In this setting, the learner does not receive the correct structured output during training. Instead, it only receives bandit feedback from the labeler. The paper first proposes an online learning algorithm that learns the model parameters via stochastic gradient descent, then generalizes the learning method to pairwise comparisons of structured outputs, provides an optimization approach based on cross-entropy minimization, and theoretically analyzes the convergence properties of the optimization approach. Pros: The paper proposes an online stochastic learning algorithm for minimizing the expected loss of structured predictions, gives a method for learning from pairwise comparisons, and theoretically analyzes the convergence rate.
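To make the setting concrete, here is a minimal sketch of expected-loss minimization under bandit feedback. It is an illustration under stated assumptions, not the paper's algorithm: it assumes a log-linear model over a small finite output set, a score-function (sampling-based) stochastic gradient, and a 0/1 task loss. The names `features`, `output_distribution`, `expected_loss`, and `bandit_sgd`, as well as the toy one-hot feature map, are hypothetical.

```python
import math
import random


def features(x, y, n_outputs, dim):
    """Hypothetical joint feature map phi(x, y): one-hot indicator of the pair."""
    phi = [0.0] * dim
    phi[x * n_outputs + y] = 1.0
    return phi


def output_distribution(w, x, n_outputs):
    """Assumed log-linear model: p_w(y | x) proportional to exp(w . phi(x, y))."""
    dim = len(w)
    scores = [
        sum(wi * fi for wi, fi in zip(w, features(x, y, n_outputs, dim)))
        for y in range(n_outputs)
    ]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def expected_loss(w, gold, n_outputs):
    """Average probability of sampling a wrong output (0/1 task loss)."""
    return sum(
        1.0 - output_distribution(w, x, n_outputs)[g] for x, g in enumerate(gold)
    ) / len(gold)


def bandit_sgd(gold, n_outputs, steps=5000, lr=0.5, seed=0):
    """SGD on the expected loss, using only the loss of the sampled output."""
    rng = random.Random(seed)
    dim = len(gold) * n_outputs
    w = [0.0] * dim
    for _ in range(steps):
        x = rng.randrange(len(gold))
        probs = output_distribution(w, x, n_outputs)
        y = rng.choices(range(n_outputs), weights=probs)[0]
        # Bandit feedback: a scalar loss for the sampled y only.
        # gold[x] simulates the labeler and is never shown to the learner.
        loss = 0.0 if y == gold[x] else 1.0
        phi = features(x, y, n_outputs, dim)
        expected_phi = [0.0] * dim
        for y_alt, p in enumerate(probs):
            for i, f in enumerate(features(x, y_alt, n_outputs, dim)):
                expected_phi[i] += p * f
        # Score-function gradient estimate: loss * (phi(x, y) - E_p[phi(x, .)]).
        for i in range(dim):
            w[i] -= lr * loss * (phi[i] - expected_phi[i])
    return w


if __name__ == "__main__":
    gold = [2, 0, 3]  # toy "correct outputs", used only to simulate feedback
    w = bandit_sgd(gold, n_outputs=4)
    print(expected_loss(w, gold, n_outputs=4))
```

The update uses the identity that the gradient of the expected loss under a log-linear model equals the expectation of the loss times the difference between the sampled and the expected feature vector, so a single sampled output and its scalar loss give an unbiased gradient estimate.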

model parameter, relation, stochastic structured prediction

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning > Optimization (0.60)
  - Machine Learning
    - Supervised Learning (0.84)
    - Inductive Learning (0.84)
    - Statistical Learning > Gradient Descent (0.58)