SAPE: Spatially-Adaptive Progressive Encoding for Neural Optimization

Neural Information Processing Systems

MLPs with"noencoding" struggle tofit high frequencysegments (see appendix for train details). Our workenables MLP networks toadaptivelyfitavarying spectrum offine details that previous methods struggle to capture in a single shot, without involved tuning of parameters or domain specific preprocessing.


SAPE: Spatially-Adaptive Progressive Encoding for Neural Optimization

Neural Information Processing Systems

Multilayer perceptrons (MLPs) are known to struggle to learn high-frequency functions and, in particular, signals with wide frequency bands. We present a progressive mapping scheme for input signals of MLP networks, enabling them to better fit a wide range of frequencies without sacrificing training stability or requiring any domain-specific preprocessing. We introduce Spatially-Adaptive Progressive Encoding (SAPE) layers, which gradually unmask signal components with increasing frequencies as a function of time and space. The progressive exposure of frequencies is monitored by a feedback loop throughout the neural optimization process, allowing changes to propagate at different rates among local spatial portions of the signal space. We demonstrate the advantage of our method on a variety of domains and applications: regression of low-dimensional signals and images, representation learning of occupancy networks, and a geometric task of mesh transfer between 3D shapes.
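
The frequency-unmasking idea lends itself to a compact illustration. Below is a minimal NumPy sketch, assuming octave-spaced Fourier features and a simple linear blending mask; the per-sample exposure level `alpha` stands in for the feedback-driven, spatially varying schedule the abstract describes. Function names and the masking rule are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def fourier_features(x, n_freqs):
    """Encode scalar inputs x in [0, 1] with sin/cos at octave frequencies."""
    freqs = np.pi * 2.0 ** np.arange(n_freqs)             # (F,)
    angles = x[:, None] * freqs[None, :]                  # (N, F)
    return np.concatenate([np.sin(angles), np.cos(angles)], axis=1)

def progressive_mask(alpha, n_freqs):
    """Soft mask over frequency bands: the integer part of alpha fully
    exposes that many bands; the fractional part blends in the next one."""
    band = np.clip(alpha - np.arange(n_freqs), 0.0, 1.0)  # (F,)
    return np.concatenate([band, band])                   # mask sin and cos

# Spatially adaptive: every sample carries its own exposure level alpha,
# which a feedback loop on the local training error would advance at its
# own rate (here the levels are simply fixed for illustration).
x = np.linspace(0.0, 1.0, 8)
alpha = np.array([0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 5.0, 6.0])

enc = fourier_features(x, n_freqs=6)                      # (8, 12)
masks = np.stack([progressive_mask(a, 6) for a in alpha]) # (8, 12)
mlp_input = enc * masks   # low frequencies pass everywhere; high ones
print(mlp_input.shape)    # pass only where alpha has grown -> (8, 12)
```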



SAPE: Spatially-Adaptive Progressive Encoding for Neural Optimization

Neural Information Processing Systems

Implementing implicit neural representations with common neural structures, e.g., multilayer perceptrons with ReLU activations (ReLU MLPs), proves to be challenging in the presence of signals with high frequencies.


Off-policy Evaluation in Infinite-Horizon Reinforcement Learning with Latent Confounders

Bennett, Andrew, Kallus, Nathan, Li, Lihong, Mousavi, Ali

arXiv.org Artificial Intelligence

A fundamental question in offline reinforcement learning (RL) is how to estimate the value of some target evaluation policy, defined as the long-run average reward obtained by following the policy, using data logged by running a different behavior policy. This question, known as off-policy evaluation (OPE), often arises in applications such as healthcare, education, or robotics, where experimenting with running the target policy can be expensive or even impossible, but we have data logged following business as usual or current standards of care. A central concern in using such passively observed data is that the observed actions, rewards, and transitions may be confounded by unobserved variables, which can bias standard OPE methods that assume no unobserved confounders, or equivalently that a standard Markov decision process (MDP) model holds with fully observed state. Consider, for example, evaluating a new smartphone app that helps people living with type-1 diabetes time their insulin injections by monitoring their blood glucose level using a wearable device. Rather than risking giving bad advice that may harm individuals, we may consider first evaluating our injection-timing policy using existing longitudinal observations of individuals' blood glucose levels over time and the timing of insulin injections.
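
To make the setting concrete, here is a minimal sketch of the kind of standard OPE baseline the abstract contrasts against: self-normalized importance weighting of logged rewards under the no-unobserved-confounders assumption. It corrects only the action distribution, not the shift in the state distribution and not confounding, which is precisely the gap the paper's confounder-aware approach targets. All data and policy tables below are synthetic and illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic logged data: states, actions, rewards produced by a uniform
# behavior policy (all quantities are illustrative, not from the paper).
n, n_states, n_actions = 5000, 4, 2
states = rng.integers(0, n_states, size=n)
actions = rng.integers(0, n_actions, size=n)
rewards = rng.normal(loc=0.1 * states + 0.5 * actions, scale=1.0)

# Behavior and target policies as (state, action) probability tables.
pi_b = np.full((n_states, n_actions), 1.0 / n_actions)
pi_e = np.tile([0.1, 0.9], (n_states, 1))   # target prefers action 1

# Self-normalized importance weighting of one-step rewards. This assumes
# the logged state is fully observed (no latent confounders) and treats
# logged states as draws from the behavior policy's stationary
# distribution; it reweights actions only, so it is a naive baseline,
# not the confounder-robust estimator developed in the paper.
w = pi_e[states, actions] / pi_b[states, actions]
v_hat = np.sum(w * rewards) / np.sum(w)
print(f"estimated average reward under target policy: {v_hat:.3f}")
```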


Balanced Policy Evaluation and Learning

Kallus, Nathan

Neural Information Processing Systems

We present a new approach to the problems of evaluating and learning personalized decision policies from observational data of past contexts, decisions, and outcomes. Only the outcome of the enacted decision is available, and the historical policy is unknown. These problems arise in personalized medicine using electronic health records and in internet advertising. Existing approaches use inverse propensity weighting (or doubly robust versions) to make historical outcome (or residual) data look as if it were generated by the new policy being evaluated or learned. But this relies on a plug-in approach that rejects data points whose decision disagrees with the new policy, leading to high-variance estimates and ineffective learning. We propose a new, balance-based approach that also makes the data look like the new policy, but does so directly by finding weights that optimize for balance between the weighted data and the target policy in the given, finite sample, which is equivalent to minimizing worst-case or posterior conditional mean square error. Our policy learner proceeds as a two-level optimization problem over policies and weights. We demonstrate that this approach markedly outperforms existing ones in both evaluation and learning, which is unsurprising given the wider support of balance-based weights. We establish extensive theoretical consistency guarantees and regret bounds that support this empirical success.
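
A minimal sketch of the balance idea, under simplifying assumptions: weights are chosen to minimize an RKHS discrepancy between the weighted logged (context, decision) pairs and the target policy's (context, decision) pairs, with a ridge penalty and an unconstrained solve standing in for the paper's constrained quadratic program. The kernel choice, regularization, and synthetic data are all illustrative, not the paper's estimator.

```python
import numpy as np

rng = np.random.default_rng(1)

def rbf_kernel(A, B, gamma=1.0):
    """RBF kernel between the rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

# Observational data: contexts x, binary decisions t, observed outcomes y.
n = 200
x = rng.normal(size=(n, 2))
t = rng.integers(0, 2, size=n)
y = x[:, 0] + 2.0 * t + rng.normal(scale=0.5, size=n)

# Deterministic target policy to evaluate (illustrative: treat if x0 > 0).
t_target = (x[:, 0] > 0).astype(int)

# Features are (context, decision) pairs; balance is measured in an RKHS.
phi_data = np.column_stack([x, t])
phi_target = np.column_stack([x, t_target])

# Ridge-regularized balancing weights: minimize the RKHS discrepancy
# || sum_i W_i phi(x_i, t_i) - mean_j phi(x_j, pi(x_j)) ||^2 + lam ||W||^2,
# a simplified, unconstrained stand-in for the paper's quadratic program.
lam = 1e-2
K = rbf_kernel(phi_data, phi_data)
k = rbf_kernel(phi_data, phi_target).mean(axis=1)
W = np.linalg.solve(K + lam * np.eye(n), k)

v_hat = W @ y   # balance-weighted estimate of the target policy's value
print(f"estimated policy value: {v_hat:.3f}")
```

Note how no logged data point is discarded: every observation contributes with a weight determined by how much it helps match the target policy's distribution, which is the source of the wider support the abstract mentions relative to propensity-based plug-in weights.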


SAPE: A System for Situation-Aware Public Security Evaluation

Wu, Shu (Institute of Automation, Chinese Academy of Sciences) | Liu, Qiang (Institute of Automation, Chinese Academy of Sciences) | Bai, Ping (Institute of Automation, Chinese Academy of Sciences) | Wang, Liang (Institute of Automation, Chinese Academy of Sciences) | Tan, Tieniu (Institute of Automation, Chinese Academy of Sciences)

AAAI Conferences

Public security events are occurring all over the world, threatening personal and property safety as well as homeland security. It is vital to construct an effective model to evaluate and predict public security. In this work, we establish a Situation-Aware Public Security Evaluation (SAPE) platform. Building on conventional recurrent neural networks (RNNs), we develop a new RNN variant to handle temporal contexts in public security event datasets. The proposed model achieves better performance than the compared state-of-the-art methods. SAPE offers two demonstrations: global public security evaluation and China public security evaluation. In the global part, based on the Global Terrorism Database from UMD, SAPE can predict, for each country, the risk level and the top-n potential terrorist organizations that might attack the country. Users can also view the actual attacking organizations alongside the predicted results. For each province in China, SAPE can predict the risk level and the probability scores of different event types in the next month. Users can also view the actual numbers of events and the predicted risk levels over the past year.
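
The abstract gives no architectural details of the RNN variant, so the sketch below is only a conventional GRU baseline for the stated task shape: per-region sequences of monthly event counts mapped to a discrete risk level. All dimensions, names, and the synthetic input are hypothetical.

```python
import torch
import torch.nn as nn

class RiskLevelRNN(nn.Module):
    """Generic GRU baseline: reads a sequence of monthly event-count
    vectors for a region and predicts a discrete risk level. The paper's
    model is a custom RNN variant for temporal contexts; this is only a
    conventional stand-in with illustrative dimensions."""

    def __init__(self, n_event_types=8, hidden=32, n_risk_levels=4):
        super().__init__()
        self.rnn = nn.GRU(n_event_types, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_risk_levels)

    def forward(self, monthly_counts):            # (batch, months, types)
        _, h = self.rnn(monthly_counts)           # h: (1, batch, hidden)
        return self.head(h.squeeze(0))            # risk-level logits

model = RiskLevelRNN()
histories = torch.randn(16, 24, 8)                # 16 regions, 24 months
logits = model(histories)
print(logits.shape)                               # torch.Size([16, 4])
```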