Export Reviews, Discussions, Author Feedback and Meta-Reviews
–Neural Information Processing Systems
This paper presents a hybrid approach that combines crowdsourced labels with an incrementally (online) trained model to address prediction problems. The core idea is to lean heavily on the crowd while the system is ramping up, learn from the labels thus acquired, and then query the crowd less and less often as the model becomes more confident. This is done via a sophisticated framing of the problem as a stochastic game, based on a CRF prediction model, in which the system and the crowd are both players. The system can issue one or more queries q for tokens x (with true label y), which elicit responses r; each outcome has a utility U(q,r), and the system attempts to pick the actions that maximize the expected utility. Furthermore, the queries are not issued all at once but at times s (with response times t); utility is maximized with respect to a deadline t_deadline by which an answer must be computed, which in turn determines how many queries are sent out, at what rate, etc. Computing this expected utility requires the simulation dynamics model P(y,r,t | x,q,s) in order to compute the utilities as in (4). Given the utility values, the optimal action could in principle be chosen; however, the introduction of continuous time makes this optimization intractable, so an approximation based on Monte Carlo Tree Search and TD learning is used (Algorithm 1).
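To make the expected-utility idea concrete, here is a minimal Python sketch of choosing how many crowd queries to issue for a single token. It is not the paper's method: it uses plain Monte Carlo rollouts over a small discrete action set instead of Monte Carlo Tree Search with TD learning, it issues all queries at time zero, and it replaces the CRF with a fixed belief over labels. The worker noise model (`error_rate`), the exponential response delay (`mean_delay`), the utility (1 for a correct answer minus `query_cost` per query) and all function names are assumptions introduced for illustration.

```python
import random


def posterior(belief, responses, error_rate):
    """Bayes update of the label belief given independent noisy responses.

    Assumes at least two labels; a worker reports the true label with
    probability 1 - error_rate, otherwise a uniformly random wrong label.
    """
    k = len(belief)
    post = dict(belief)
    for r in responses:
        for y in post:
            post[y] *= (1 - error_rate) if r == y else error_rate / (k - 1)
    z = sum(post.values())
    return {y: p / z for y, p in post.items()}


def simulate_utility(belief, n_queries, rng, error_rate, mean_delay,
                     deadline, query_cost):
    """One rollout: sample y, responses r and response times t, then score."""
    labels = list(belief)
    y_true = rng.choices(labels, weights=[belief[y] for y in labels])[0]
    responses = []
    for _ in range(n_queries):
        t = rng.expovariate(1.0 / mean_delay)  # hypothetical delay model
        if t > deadline:
            continue  # response arrives too late to be used
        if rng.random() < error_rate:
            r = rng.choice([y for y in labels if y != y_true])
        else:
            r = y_true
        responses.append(r)
    post = posterior(belief, responses, error_rate)
    y_hat = max(post, key=post.get)
    return (1.0 if y_hat == y_true else 0.0) - query_cost * n_queries


def best_action(belief, max_queries=3, n_samples=2000, seed=0,
                error_rate=0.2, mean_delay=1.0, deadline=2.0,
                query_cost=0.05):
    """Estimate expected utility for each query count and return the best."""
    rng = random.Random(seed)
    expected = {}
    for n in range(max_queries + 1):
        total = sum(
            simulate_utility(belief, n, rng, error_rate, mean_delay,
                             deadline, query_cost)
            for _ in range(n_samples)
        )
        expected[n] = total / n_samples
    return max(expected, key=expected.get), expected


if __name__ == "__main__":
    print(best_action({"PER": 0.99, "O": 0.01}))  # confident model
    print(best_action({"PER": 0.5, "O": 0.5}))    # uncertain model
```

Under these assumptions, a confident belief makes querying not worth its cost, while an uncertain belief favors asking the crowd, which mirrors the behavior the review describes: heavy reliance on the crowd early on, tapering off as the model improves.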
Feb-6-2025, 18:17:32 GMT