 Fuzzy Logic


A Note on the Representational Incompatibility of Function Approximation and Factored Dynamics

Neural Information Processing Systems

We establish a new hardness result that shows that the difficulty of planning in factored Markov decision processes is representational rather than just computational. More precisely, we give a fixed family of factored MDPs with linear rewards whose optimal policies and value functions simply cannot be represented succinctly in any standard parametric form. Previous hardness results indicated that computing good policies from the MDP parameters was difficult, but left open the possibility of succinct function approximation for any fixed factored MDP. Our result applies even to policies which yield a polynomially poor approximation to the optimal value, and highlights interesting connections with the complexity class of Arthur-Merlin games.


A fuzzy set AHP-based DFM tool for rotational parts

#artificialintelligence

Design for manufacturability (DFM) requires product designers to simultaneously consider the manufacturing issues of a product along with the geometrical and design aspects. This paper reports a computer-aided DFM tool for product designers to evaluate the manufacturability of their designs. A fuzzy set-based manufacturability evaluation algorithm is formulated to generate relative manufacturability indices (MIs) to provide product designers with a better understanding of the relative ease or difficulty of machining the features in their designs. This computer-aided DFM system is developed for rotational parts. The MI of machining a part is decomposed into three components, namely, the support index, the clamping index, and the feature index.


Stabilizing Value Function Approximation with the BFBP Algorithm

Neural Information Processing Systems

However, online RL algorithms such as SARSA(λ) have been shown experimentally to have difficulty converging when applied with function approximators. Theoretical analysis has not been able to prove convergence, even in the case of linear function approximators.
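For readers unfamiliar with the setting, the following is a minimal sketch of a single online SARSA(λ) update with a linear function approximator and accumulating eligibility traces. It is the textbook form of the online update that the abstract refers to, not the BFBP algorithm itself. The function name `sarsa_lambda_step` and all parameter names are hypothetical and chosen for illustration only.

```python
import numpy as np


def sarsa_lambda_step(theta, trace, phi, phi_next, reward, alpha, gamma, lam):
    """One online SARSA(lambda) update with linear function approximation.

    theta    : weight vector, so Q(s, a) is approximated by theta @ phi(s, a)
    trace    : eligibility trace vector (same shape as theta)
    phi      : feature vector of the current state-action pair
    phi_next : feature vector of the next state-action pair
    All names here are illustrative assumptions, not taken from the paper.
    """
    # Temporal-difference error under the current weights.
    delta = reward + gamma * (theta @ phi_next) - (theta @ phi)
    # Accumulating trace: decay the old trace, then add the current features.
    trace = gamma * lam * trace + phi
    # Move the weights along the trace, scaled by the TD error.
    theta = theta + alpha * delta * trace
    return theta, trace


# Usage: two one-hot features, starting from zero weights and a zero trace.
theta, trace = sarsa_lambda_step(
    theta=np.zeros(2),
    trace=np.zeros(2),
    phi=np.array([1.0, 0.0]),
    phi_next=np.array([0.0, 1.0]),
    reward=1.0,
    alpha=0.5,
    gamma=0.9,
    lam=0.8,
)
```

Because each update changes the weights used to compute the next TD error, the target moves with every step, which is one common explanation for the convergence difficulties mentioned above.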



Batch Value Function Approximation via Support Vectors

Neural Information Processing Systems

Virtually all existing work on value function approximation and policy-gradient methods starts with a parameterized formula for the value function or policy and then seeks to find the best policy that can be represented in that parameterized form. This can give rise to very difficult search problems for which the Bellman equation is of little or no use. In this paper, we take a different approach: rather than fixing the form of the function approximator and searching for a representable policy, we instead identify a good policy and then search for a function approximator that can represent it. Our approach exploits the ability of mathematical programming to represent a variety of constraints including those that derive from supervised learning, from advantage learning (Baird, 1993), and from the Bellman equation. By combining the kernel trick with mathematical programming, we obtain a function approximator that seeks to find the smallest number of support vectors sufficient to represent the desired policy.


Linking Motor Learning to Function Approximation: Learning in an Unlearnable Force Field

Neural Information Processing Systems

Reaching movements require the brain to generate motor commands that rely on an internal model of the task's dynamics. Here we consider the errors that subjects make early in their reaching trajectories to various targets as they learn an internal model. Using a framework from function approximation, we argue that the sequence of errors should reflect the process of gradient descent. If so, then the sequence of errors should obey hidden state transitions of a simple dynamical system. Fitting the system to human data, we find a surprisingly good fit accounting for 98% of the variance. This allows us to draw tentative conclusions about the basis elements used by the brain in transforming sensory space to motor commands. To test the robustness of the results, we estimate the shape of the basis elements under two conditions: in a traditional learning paradigm with a consistent force field, and in a random sequence of force fields where learning is not possible. Remarkably, we find that the basis remains invariant.
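The step from gradient descent to a hidden-state dynamical system can be illustrated with a small sketch. It assumes a scalar force prediction that is linear in fixed basis features, trained by the LMS (gradient descent) rule; this is a simplification for illustration and not the authors' fitted model. The names `G`, `eta`, `targets`, and `forces` are hypothetical. Under these assumptions, the per-trial errors produced by updating the weights are identical to those produced by a linear system whose hidden state is the vector of current predictions for every target.

```python
import numpy as np


def errors_from_gradient_descent(G, targets, forces, eta):
    """Train weights w by LMS; row G[k] holds the basis features of target k."""
    w = np.zeros(G.shape[1])
    errors = []
    for k, f in zip(targets, forces):
        e = f - G[k] @ w          # prediction error on this trial
        w = w + eta * e * G[k]    # gradient step on the squared error
        errors.append(e)
    return np.array(errors)


def errors_from_hidden_state(G, targets, forces, eta):
    """Same errors, tracked through the hidden state z = G @ w."""
    B = G @ G.T                   # how an error at one target generalizes to others
    z = np.zeros(G.shape[0])      # current prediction for every target
    errors = []
    for k, f in zip(targets, forces):
        e = f - z[k]
        z = z + eta * e * B[:, k]  # state transition driven by the error
        errors.append(e)
    return np.array(errors)


rng = np.random.default_rng(0)
G = rng.normal(size=(8, 5))             # 8 targets, 5 basis functions (assumed sizes)
targets = rng.integers(0, 8, size=200)  # random target sequence
forces = np.ones(200)                   # constant force to be learned

errs_gd = errors_from_gradient_descent(G, targets, forces, eta=0.05)
errs_ds = errors_from_hidden_state(G, targets, forces, eta=0.05)
```

In this sketch the matrix `B` depends only on the overlap between basis features, which is why fitting such a system to an error sequence can say something about the shape of the basis elements.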

