Efficient Resources Allocation for Markov Decision Processes
Assume that we model a complex decision-making problem under uncertainty by a finite MDP. Because of the limited resources used, the parameters of the MDP (transition probabilities and rewards) are uncertain: we assume that we only know a belief state over their possible values. If we select the most probable values of the parameters, we can build an MDP and solve it to deduce the corresponding optimal policy. However, because of the uncertainty over the true parameters, this policy may not be the one that maximizes the expected cumulative rewards of the true (but partially unknown) decision-making problem. We can nevertheless use sampling techniques to estimate the expected loss of using this policy.
A Gradient-Based Boosting Algorithm for Regression Problems
Zemel, Richard S., Pitassi, Toniann
Adaptive boosting methods are simple modular algorithms that operate as follows. Let g: X → Y be the function to be learned, where the label set Y is finite, typically binary-valued. The algorithm uses a learning procedure, which has access to n training examples, {(x1, y1), ..., (xn, yn)}, drawn randomly from X × Y according to distribution D; it outputs a hypothesis f.
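For the binary-valued case the abstract describes, the classic adaptive reweighting loop can be sketched as follows. The decision-stump weak learner, the toy one-dimensional data, and all names here are illustrative assumptions in the standard AdaBoost setting, not the regression algorithm the paper itself develops.

```python
import numpy as np

def stump(X, y, w):
    """Best single-threshold classifier on 1-D inputs under example weights w."""
    best = None
    for thr in np.unique(X):
        for sign in (1, -1):
            pred = sign * np.where(X >= thr, 1, -1)
            err = w[pred != y].sum()
            if best is None or err < best[0]:
                best = (err, thr, sign)
    return best  # (weighted error, threshold, sign)

def adaboost(X, y, rounds=10):
    n = len(X)
    w = np.full(n, 1 / n)                 # distribution over examples, starts uniform
    ensemble = []
    for _ in range(rounds):
        err, thr, sign = stump(X, y, w)
        err = min(max(err, 1e-12), 1 - 1e-12)
        alpha = 0.5 * np.log((1 - err) / err)   # weak hypothesis weight
        pred = sign * np.where(X >= thr, 1, -1)
        w *= np.exp(-alpha * y * pred)          # up-weight the mistakes
        w /= w.sum()
        ensemble.append((alpha, thr, sign))
    return ensemble

def predict(ensemble, X):
    score = sum(a * s * np.where(X >= t, 1, -1) for a, t, s in ensemble)
    return np.sign(score)

# toy separable 1-D data
X = np.array([-3.0, -2.0, -1.0, 1.0, 2.0, 3.0])
y = np.array([-1, -1, -1, 1, 1, 1])
ens = adaboost(X, y, rounds=5)
```

Each round the procedure concentrates the distribution D on examples the previous weak hypotheses got wrong, which is exactly the "modular" structure the paragraph refers to: any weak learner that beats chance under a reweighted distribution can be plugged in for `stump`.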
Maximum Conditional Likelihood via Bound Maximization and the CEM Algorithm
Advantages in feature selection, robustness and limited resource allocation have been studied. Ultimately, tasks such as regression and classification reduce to the evaluation of a conditional density. However, the popularity of maximum joint likelihood and EM techniques remains strong, in part due to their elegance and convergence properties. Thus, many conditional problems are solved by first estimating joint models and then conditioning them.
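The "estimate a joint model, then condition it" route is easiest to see in the simplest case: fit a joint Gaussian over (x, y) by maximum joint likelihood, then read off the conditional mean in closed form. The synthetic data and names below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# synthetic joint data: y depends linearly on x
x = rng.normal(size=1000)
y = 2 * x + 0.1 * rng.normal(size=1000)
Z = np.stack([x, y], axis=1)

# 1) maximum joint likelihood fit of a bivariate Gaussian: sample mean and covariance
mu = Z.mean(axis=0)
S = np.cov(Z.T)

# 2) condition the joint model: E[y | x] = mu_y + (S_yx / S_xx) * (x - mu_x)
def cond_mean(x_query):
    return mu[1] + S[1, 0] / S[0, 0] * (x_query - mu[0])
```

Note that the fitting step optimized the joint likelihood of (x, y), even though only the conditional p(y | x) is ever used for prediction; that mismatch between the objective fitted and the quantity needed is what conditional approaches such as CEM address.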
Convergence of the Wake-Sleep Algorithm
Ikeda, Shiro, Amari, Shun-ichi, Nakahara, Hiroyuki
The W-S (Wake-Sleep) algorithm is a simple learning rule for models with hidden variables. It is shown that this algorithm can be applied to a factor analysis model, which is a linear version of the Helmholtz machine. But even for a factor analysis model, the general convergence is not proved theoretically. In this article, we describe a geometrical understanding of the W-S algorithm in contrast with the EM (Expectation Maximization) algorithm and the em algorithm. As a result, we prove the convergence of the W-S algorithm for the factor analysis model. We also show the condition for the convergence in general models.
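The factor analysis case admits a compact sketch: a one-factor generative model x = g·z + noise with a Gaussian recognition model z ≈ r·x. The wake phase refits the generative weight on data completed by the recognition model; the sleep phase refits the recognition weight on "dreamed" samples from the generative model. The setup below (scalar weights, noise level assumed known, full refit per phase) is a simplified illustration, not the paper's analysis.

```python
import numpy as np

rng = np.random.default_rng(0)

# one-factor model (a linear Helmholtz machine):
#   generative:  z ~ N(0, 1),  x | z ~ N(g * z, SIGMA^2)
#   recognition: z | x ~ N(r * x, tau^2)
G_TRUE, SIGMA, N = 1.5, 0.5, 2000          # noise level assumed known here
x = G_TRUE * rng.normal(size=N) + SIGMA * rng.normal(size=N)

g, r, tau = 0.5, 0.2, 1.0
for _ in range(300):
    # wake phase: complete the data with the recognition model, refit the generative weight
    z = r * x + tau * rng.normal(size=N)
    g = np.sum(x * z) / np.sum(z * z)
    # sleep phase: dream (z, x) pairs from the generative model, refit the recognition model
    zd = rng.normal(size=N)
    xd = g * zd + SIGMA * rng.normal(size=N)
    r = np.sum(zd * xd) / np.sum(xd * xd)
    tau = np.sqrt(np.mean((zd - r * xd) ** 2))
```

In this scalar setup the fixed point satisfies r = g / (g² + SIGMA²) and tau² = SIGMA² / (g² + SIGMA²), which together force g² = E[x²] − SIGMA², i.e. the true factor loading; what the paper establishes is that the iteration actually converges there.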
Maximum Conditional Likelihood via Bound Maximization and the CEM Algorithm
We present the CEM (Conditional Expectation Maximization) algorithm as an extension of the EM (Expectation Maximization) algorithm to conditional density estimation under missing data. A bounding and maximization process is given to specifically optimize conditional likelihood instead of the usual joint likelihood. We apply the method to conditioned mixture models and use bounding techniques to derive the model's update rules. Monotonic convergence, computational efficiency and regression results superior to EM are demonstrated.
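The quantity CEM optimizes can be written down directly: for a mixture model over (x, y), the conditional likelihood is p(y | x) = p(x, y) / p(x), with both terms available in closed form. Below is a small evaluation-only sketch for an axis-aligned Gaussian mixture; the two-component parameters are hypothetical, and the paper's bound-and-maximize update rules are not reproduced here.

```python
import numpy as np

def log_gauss(v, mu, var):
    return -0.5 * (np.log(2 * np.pi * var) + (v - mu) ** 2 / var)

def logsumexp(a):
    m = np.max(a)
    return m + np.log(np.sum(np.exp(a - m)))

def cond_loglik(x, y, mix, mux, muy, varx, vary):
    """log p(y | x) = log p(x, y) - log p(x) for an axis-aligned Gaussian mixture."""
    joint = np.array([np.log(m) + log_gauss(x, mx, vx) + log_gauss(y, my, vy)
                      for m, mx, my, vx, vy in zip(mix, mux, muy, varx, vary)])
    marg = np.array([np.log(m) + log_gauss(x, mx, vx)
                     for m, mx, vx in zip(mix, mux, varx)])
    return logsumexp(joint) - logsumexp(marg)

# hypothetical two-component joint model over (x, y)
mix, mux, muy = [0.5, 0.5], [-1.0, 1.0], [-1.0, 1.0]
varx, vary = [0.5, 0.5], [0.5, 0.5]
```

Because this objective scores only the p(y | x) factor, maximizing it can trade joint fit for predictive fit; with the parameters above, for instance, cond_loglik(1.0, 1.0, ...) is much larger than cond_loglik(1.0, -1.0, ...), since observing x = 1 makes the second component dominate.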