AITopics | pvar

Percentile Criterion Optimization in Offline Reinforcement Learning

Neural Information Processing Systems, Apr-25-2026, 15:45:43 GMT

In reinforcement learning, robust policies for high-stakes decision-making problems with limited data are usually computed by optimizing the percentile criterion. The percentile criterion is approximately solved by constructing an ambiguity set that contains the true model with high probability and optimizing the policy for the worst model in the set. Since the percentile criterion is non-convex, constructing ambiguity sets is often challenging. Existing work uses Bayesian credible regions as ambiguity sets, but they are often unnecessarily large and result in learning overly conservative policies. To overcome these shortcomings, we propose a novel Value-at-Risk based dynamic programming algorithm to optimize the percentile criterion without explicitly constructing any ambiguity sets. Our theoretical and empirical results show that our algorithm implicitly constructs much smaller ambiguity sets and learns less conservative robust policies.
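As a rough illustration of the percentile criterion described in the abstract above, the sketch below scores each candidate policy by an empirical Value-at-Risk over returns obtained under sampled models, then picks the policy with the best score. It is not the paper's dynamic programming algorithm: the function names, the policy names, and the return values are all made up for the example, and the returns are assumed to have been computed elsewhere (one number per sampled model).

```python
import math


def value_at_risk(returns, delta):
    """Empirical lower VaR: the largest sampled return y such that at
    least a (1 - delta) fraction of the samples are >= y."""
    if not returns:
        raise ValueError("returns must be non-empty")
    if not 0.0 <= delta < 1.0:
        raise ValueError("delta must lie in [0, 1)")
    ordered = sorted(returns)
    # Index k = floor(delta * n); the small epsilon guards against
    # floating-point results such as 1.9999999 instead of 2.0.
    k = min(int(math.floor(delta * len(ordered) + 1e-12)), len(ordered) - 1)
    return ordered[k]


def best_policy(returns_by_policy, delta):
    """Pick the policy whose VaR at confidence 1 - delta is largest."""
    return max(
        returns_by_policy,
        key=lambda name: value_at_risk(returns_by_policy[name], delta),
    )


# Hypothetical returns of two policies under 10 sampled models.
returns_by_policy = {
    "cautious": [5.0, 5.2, 5.1, 4.9, 5.0, 5.3, 5.1, 4.8, 5.0, 5.2],
    "aggressive": [9.0, 8.0, 10.0, 1.0, 9.0, 2.0, 8.0, 9.0, 10.0, 9.0],
}

print(best_policy(returns_by_policy, delta=0.1))  # cautious
print(best_policy(returns_by_policy, delta=0.2))  # aggressive
```

With delta = 0.1 the aggressive policy is penalized for its two bad outcomes and the cautious one wins; relaxing to delta = 0.2 lets the criterion ignore those outcomes, which is the sense in which a smaller implied ambiguity set yields a less conservative choice.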

ambiguity, machine learning, reinforcement learning, (16 more...)

Neural Information Processing Systems

Country: North America > United States > Massachusetts (0.28)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Predicted Variables in Programming

Carbune, Victor, Coppey, Thierry, Daryin, Alexander, Deselaers, Thomas, Sarda, Nikhil, Yagnik, Jay

arXiv.org Machine Learning, Oct-1-2018

We present Predicted Variables (PVars), an approach to making machine learning (ML) a first-class citizen in programming languages. There is a growing divide in approaches to building systems: using human experts (e.g. programming) on the one hand, and using behavior learned from data (e.g. ML) on the other hand. PVars aim to make ML in programming as easy as 'if' statements and with that hybridize ML with programming. We leverage the existing concept of variables and create a new type, a predicted variable. PVars are akin to native variables with one important distinction: PVars determine their value using ML when evaluated. We describe PVars and their interface, how they can be used in programming, and demonstrate the feasibility of our approach on three algorithmic problems: binary search, Quicksort, and caches. We show experimentally that PVars are able to improve over the commonly used heuristics and lead to a better performance than the original algorithms. As opposed to previous work applying ML to algorithmic problems, PVars have the advantage that they can be used within the existing frameworks and do not require the existing domain knowledge to be replaced. PVars allow for a seamless integration of ML into existing systems and algorithms. Our PVars implementation currently relies on standard Reinforcement Learning (RL) methods. To learn faster, PVars use the heuristic function, which they are replacing, as an initial function. We show that PVars quickly pick up the behavior of the initial function and then improve performance beyond that without ever performing substantially worse -- allowing for a safe deployment in critical applications.

artificial intelligence, machine learning, programming language, (18 more...)

arXiv.org Machine Learning

1810.00619

Genre: Research Report (0.82)

Industry: Education (0.46)

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Neural Network Implementation Approaches for the Connection Machine

Brown, Nathan H., Jr.

Neural Information Processing Systems, Dec-31-1988

Two approaches are described which allow parallel computation of a model's nonlinear functions, parallel modification of a model's weights, and parallel propagation of a model's activation and error. Each approach also allows a model's interconnect structure to be physically dynamic. A Hopfield model is implemented with each approach at six sizes over the same number of CM processors to provide a performance comparison. INTRODUCTION Simulations of neural network models on digital computers perform various computations by applying linear or nonlinear functions, defined in a program, to weighted sums of integer or real numbers retrieved and stored by array reference. The numerical values are model-dependent parameters like time-averaged spiking frequency (activation), synaptic efficacy (weight), the error in error back propagation models, and computational temperature in thermodynamic models. The interconnect structure of a particular model is implied by indexing relationships between arrays defined in a program. On the Connection Machine (CM), these relationships are expressed in hardware processors interconnected by a 16-dimensional hypercube communication network. Mappings are constructed to define higher dimensional interconnectivity between processors on top of the fundamental geometry of the communication network.
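The 16-dimensional hypercube mentioned in the abstract above can be pictured with a small sketch. It assumes, purely for illustration, that each node carries a 16-bit integer address and that two nodes are directly linked when their addresses differ in exactly one bit, which is the standard definition of a hypercube graph. The function names are hypothetical, and the routing shown is generic dimension-order routing, not a description of the CM's actual router.

```python
DIMENSIONS = 16  # dimensionality stated in the abstract


def neighbors(address, dimensions=DIMENSIONS):
    """Addresses one hop away: flip each of the address bits in turn."""
    return [address ^ (1 << bit) for bit in range(dimensions)]


def hop_distance(a, b):
    """Minimum number of hops between two nodes (Hamming distance)."""
    return bin(a ^ b).count("1")


def route(src, dst, dimensions=DIMENSIONS):
    """Dimension-order route: fix differing bits from lowest to highest."""
    path = [src]
    current = src
    for bit in range(dimensions):
        mask = 1 << bit
        if (current ^ dst) & mask:
            current ^= mask
            path.append(current)
    return path


print(len(neighbors(0)))            # 16 direct neighbors per node
print(hop_distance(0, 0xFFFF))      # 16 hops between opposite corners
print(route(0b0101, 0b0011, 4))     # [5, 7, 3]
```

A mapping of a model's interconnect onto such a network amounts to choosing which address holds which unit, so that units that communicate often sit few hops apart.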

activation, processor, pvar, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > Virginia > Fairfax County > Oakton (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Neural Network Implementation Approaches for the Connection Machine

Brown, Nathan H., Jr.

Neural Information Processing Systems, Dec-31-1988

Two approaches are described which allow parallel computation of a model's nonlinear functions, parallel modification of a model's weights, and parallel propagation of a model's activation and error. Each approach also allows a model's interconnect structure to be physically dynamic. A Hopfield model is implemented with each approach at six sizes over the same number of CM processors to provide a performance comparison. INTRODUCTION Simulations of neural network models on digital computers perform various computations by applying linear or nonlinear functions, defined in a program, to weighted sums of integer or real numbers retrieved and stored by array reference. The numerical values are model-dependent parameters like time-averaged spiking frequency (activation), synaptic efficacy (weight), the error in error back propagation models, and computational temperature in thermodynamic models. The interconnect structure of a particular model is implied by indexing relationships between arrays defined in a program. On the Connection Machine (CM), these relationships are expressed in hardware processors interconnected by a 16-dimensional hypercube communication network. Mappings are constructed to define higher dimensional interconnectivity between processors on top of the fundamental geometry of the communication network.

artificial intelligence, machine learning, processor, (18 more...)

Neural Information Processing Systems

Technology: