Neural Information Processing Systems
ffeed84c7cb1ae7bf4ec4bd78275bb98-Reviews.html
This paper considers the problem of option pricing in finance. Using a minimax approach, the authors construct a game between Nature and an Investor, prove that the value of this game converges to the classic Black-Scholes option price, and give an explicit hedging strategy that achieves this value. Clarity: This is a very math-heavy paper. Unfortunately, I am not very knowledgeable in the area of stochastic calculus, so I am unable to verify the correctness of the proofs in the paper and in the 14-page supplementary material. The authors do provide a reference to a standard book on stochastic calculus, but unfortunately I do not have the time to familiarize myself with the material.
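For context, the limiting object the review mentions is the classic Black-Scholes price. A minimal sketch of the call-price formula (all parameter values below are illustrative, not from the paper):

```python
# Black-Scholes European call price -- the classical limit the game value
# is shown to converge to. Parameter values here are purely illustrative.
from math import log, sqrt, exp, erf

def norm_cdf(x):
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def black_scholes_call(S, K, T, r, sigma):
    # S: spot price, K: strike, T: time to maturity (years),
    # r: risk-free rate, sigma: volatility.
    d1 = (log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return S * norm_cdf(d1) - K * exp(-r * T) * norm_cdf(d2)

price = black_scholes_call(S=100.0, K=100.0, T=1.0, r=0.05, sigma=0.2)
```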
fec8d47d412bcbeece3d9128ae855a7a-Reviews.html
The paper aims to predict fixations and saccadic motions of the gaze for human subjects observing still images. A new dataset of eye movements is collected for VOC 2012 Actions images. Two separate sets of eye movements are collected from two groups of subjects given two tasks: (a) recognizing the action and (b) recognizing objects in the background. The consistency of subjects' fixations and eye movements is analyzed by modeling the gaze as transitions between Gaussian states (Areas of Interest). Gaze locations are predicted by training an SVM on HOG features sampled at Areas of Interest for positives and at other random locations for negatives.
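The positives-vs-random-negatives SVM setup can be sketched as follows; the synthetic vectors below merely stand in for real HOG descriptors, and all sizes are illustrative assumptions:

```python
# Sketch of the gaze-prediction classifier: a linear SVM trained on
# descriptors sampled at Areas of Interest (positives) vs. random image
# locations (negatives). Synthetic vectors stand in for HOG features.
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
dim = 36                                       # illustrative descriptor length
pos = rng.normal(0.5, 0.1, size=(200, dim))    # patches at Areas of Interest
neg = rng.normal(0.0, 0.1, size=(200, dim))    # random background patches

X = np.vstack([pos, neg])
y = np.array([1] * 200 + [0] * 200)

clf = LinearSVC(C=1.0).fit(X, y)
acc = clf.score(X, y)
```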
fe709c654eac84d5239d1a12a4f71877-Reviews.html
The main idea is to sample several determinizations of the system in the form of roll-out trees, where each state/action pair has only one sampled successor. A combination of breadth-first and best-first search is used to explore the deterministic trees, which are then recombined into a stochastic model from which a policy can be calculated. The algorithm is proven to be consistent: as the number of trees and the number of nodes in each tree both approach infinity, the value at the root can be approximated arbitrarily well with high probability. The algorithm is empirically compared to a planning algorithm that requires a full transition model, and performs well in comparison.
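The one-sampled-successor idea can be sketched on a toy MDP. Note this sketch just averages the root values of many determinized trees, whereas the paper recombines the trees into a stochastic model before planning; the toy generative model and all numbers are my own illustration:

```python
# Sketch of determinized roll-out trees: each (state, action) pair gets
# exactly one sampled successor, the tree is evaluated by depth-limited
# backup, and root values are averaged over many trees. Toy MDP only.
import random

ACTIONS = [0, 1]

def step(state, action, rng):
    # Toy generative model: action 1 pays 1 with prob 0.7, action 0 with 0.4.
    p = 0.7 if action == 1 else 0.4
    reward = 1.0 if rng.random() < p else 0.0
    return state + 1, reward

def det_tree_value(state, depth, rng):
    # In a determinized tree each (state, action) has one successor only.
    if depth == 0:
        return 0.0
    best = float('-inf')
    for a in ACTIONS:
        nxt, r = step(state, a, rng)              # single sampled outcome
        best = max(best, r + det_tree_value(nxt, depth - 1, rng))
    return best

def estimate_root_value(n_trees=500, depth=3, seed=0):
    rng = random.Random(seed)
    return sum(det_tree_value(0, depth, rng) for _ in range(n_trees)) / n_trees

v = estimate_root_value()
```

Averaging the max over determinizations is optimistically biased, which is one reason the paper's recombination step matters.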
fd5c905bcd8c3348ad1b35d7231ee2b1-Reviews.html
This paper is situated in the context of making Learning from Demonstration (LfD) more robust when only a limited number of demonstrations are available. Many of the low-level trajectory-learning LfD approaches suffer from fragile policies. This paper proposes to use reinforcement learning to overcome this limitation. The paper falls squarely in the LfD field and does not tackle inverse reinforcement learning, i.e., the reward function is assumed to be known to the agent rather than inferred from demonstrations. One work with a very similar flavor is that of Smart, W. and Kaelbling, L.P., "Effective Reinforcement Learning for Mobile Robots," ICRA 2002.
fc8001f834f6a5f0561080d134d53d29-Reviews.html
Summary: The paper presents a method that learns a pruning algorithm for a VP-tree in non-metric spaces. The idea is to estimate the decision function for approximate nearest neighbor search in the VP-tree by sampling, and to approximate it with a piecewise linear function. The learning-to-prune method is validated for search efficiency against relevant pruning baselines, and outperforms them substantially when the intrinsic dimensionality of the data is small. Clarity: The paper is mostly clearly written but sometimes does not really explain the implementation details and the choice of some parameters (for example, why choose K = 100, m = 7, rho = 8, and a bucket size of 10^5? Lines 185, 227, 315). Originality: To the extent of my knowledge, this is the first work that 'learns to prune' for approximate nearest neighbor search on a VP-tree. Significance: Nearest neighbor search is a fundamental topic in search and classification; thus this learning-to-prune method, which approximates the nearest neighbor search with a non-linear function, would be of some interest to a wide audience. However, the datasets chosen for the experiments seem rather simple and low-dimensional, which is far from realistic.
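To make the "decision function" concrete: at a VP-tree node with vantage point and median radius, the metric-space pruning rule follows from the triangle inequality, and the paper's contribution is to learn a piecewise-linear replacement for it in non-metric spaces. A minimal sketch with illustrative names (not the paper's implementation):

```python
# Pruning decision at one VP-tree node. In a metric space the triangle
# inequality yields the rule in visit_children; the learned method replaces
# it with a sampled, piecewise-linear decision function. Names illustrative.

def visit_children(dist_to_pivot, median, tau):
    """Which subtrees can contain a point within distance tau of the
    query, given the query's distance to the vantage point?"""
    inside = dist_to_pivot - tau <= median    # near subtree may contain hits
    outside = dist_to_pivot + tau >= median   # far subtree may contain hits
    return inside, outside

def piecewise_linear(d, knots, slopes, intercepts):
    """A learned stand-in for the pruning threshold: evaluate the linear
    piece whose interval contains d (knots sorted ascending)."""
    for k, m, b in zip(knots, slopes, intercepts):
        if d <= k:
            return m * d + b
    return slopes[-1] * d + intercepts[-1]    # last piece extends to infinity
```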
fb89705ae6d743bf1e848c206e16a1d7-Reviews.html
Overview: The authors propose the Gibbs error criterion for active learning, seeking the samples that maximize the expected Gibbs error under the current posterior. They propose a greedy algorithm that maximizes this criterion (Max-GEC). The objective reduces to maximizing a specific instance of the Tsallis entropy of the predictive distribution, which is very similar to Maximum Entropy Sampling (MES), which uses the Shannon entropy of the predictive distribution. They consider the non-adaptive, adaptive, and batch settings separately, and in each setting they use submodularity results to prove that the greedy approach achieves near-maximal performance compared to the optimal policy. They show how to implement their fully adaptive policy (approximately) in CRFs with application to named entity recognition, and implement the batch algorithm with a Naive Bayes classifier, with application to a text classification task.
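The criterion itself is simple: for a predictive distribution p over labels, the Gibbs error is 1 - Σ_y p(y)², the order-2 Tsallis entropy, and the greedy step picks the candidate maximizing it. A minimal sketch with made-up candidate distributions:

```python
# Greedy Gibbs-error selection: compute 1 - sum_y p(y)^2 for each
# candidate's predictive distribution and pick the maximizer.
# Candidate distributions below are illustrative.

def gibbs_error(p):
    # Gibbs error = 1 - sum p^2 (Tsallis entropy of order 2);
    # 0 for a point mass, maximal for the uniform distribution.
    return 1.0 - sum(pi * pi for pi in p)

def select_max_gec(candidates):
    # candidates: dict mapping example id -> predictive distribution.
    return max(candidates, key=lambda x: gibbs_error(candidates[x]))

cands = {
    'a': [0.9, 0.1],   # confident prediction -> low Gibbs error
    'b': [0.5, 0.5],   # maximally uncertain -> highest Gibbs error
    'c': [0.7, 0.3],
}
pick = select_max_gec(cands)
```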
f9b902fc3289af4dd08de5d1de54f68f-Reviews.html
This paper proposes a discriminative clustering algorithm inspired by mean shift and the idea of finding local maxima of the density ratio (ratio of the densities of positive and negative points). The work is motivated by recent approaches of [4,8,16] aimed at discovering distinctive mid-level parts or patches for various recognition tasks. In the authors' own words from the Intro, "The idea is to search for clusters of image patches that are both 1) representative... and 2) visually discriminative. Unfortunately, finding patches that fit these criteria remains rather ad-hoc and poorly understood. While most current algorithms use a discriminative clustering-like procedure, they generally don't optimize elements for these criteria, or do so in an indirect, procedural way that is difficult to analyze. Hence, our goal in this work is to quantify the terms 'representative' and 'discriminative', and show that a generalization of the well-known, well-understood Mean-Shift algorithm can produce visual elements that are more representative and discriminative than those of previous approaches."
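The density-ratio objective can be sketched with kernel density estimates: estimate p+(x)/p-(x) from positive and negative samples and climb toward a local maximum. The data, bandwidth, and the crude gradient-free hill-climb below are my own illustration, not the paper's mean-shift update:

```python
# Sketch of the density-ratio idea: KDE estimates of the positive and
# negative densities, and a simple hill-climb toward a local maximum of
# their ratio. Data, bandwidth, and the climb rule are illustrative.
import numpy as np

def kde(x, pts, h):
    # Isotropic Gaussian kernel density estimate (up to normalization).
    d2 = np.sum((pts - x) ** 2, axis=1)
    return np.mean(np.exp(-d2 / (2 * h * h)))

def density_ratio(x, pos, neg, h=0.5, eps=1e-8):
    return kde(x, pos, h) / (kde(x, neg, h) + eps)

rng = np.random.default_rng(0)
pos = rng.normal([2.0, 0.0], 0.3, size=(100, 2))   # discriminative cluster
neg = rng.normal([0.0, 0.0], 1.0, size=(300, 2))   # background patches

# Accept random perturbations only when they increase the ratio.
x = np.array([1.0, 0.0])
for _ in range(50):
    cand = x + rng.normal(0, 0.1, size=2)
    if density_ratio(cand, pos, neg) > density_ratio(x, pos, neg):
        x = cand
```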
f7e6c85504ce6e82442c770f7c8606f0-Reviews.html
The title of this paper is much like the paper itself: to-the-point, descriptive, and readable. "A simple example of Dirichlet process mixture inconsistency for the number of components" delivers on its promise by providing two easy-to-understand demonstrations of the severity of the problem of using Dirichlet process mixtures to estimate the number of components in a mixture model. The authors start by demonstrating that making such a component-cardinality estimate is widespread in the literature (and therefore a problem deserving of interest), briefly describe the Dirichlet process mixture (DPM) model (with particular emphasis on the popular normal likelihood case), and then demonstrate with a simple single-component mixture example how poorly estimation of component cardinality can go (their convincing answer: very poorly). Not only was the paper enjoyable to read but, refreshingly, it didn't try to fit 20 pages of material into an 8-page limit. One potential criticism of this paper is that this result should be well-known in some sense in the community.
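A related (though weaker) observation than the paper's inconsistency result can be simulated directly: under the Chinese Restaurant Process representation of the DP, the expected number of clusters is Σ_i α/(α+i-1) ≈ α·log n, so the prior alone keeps spawning small extra clusters as data grows. The simulation below is my own illustration of that prior behavior, not the paper's posterior argument; α and n are arbitrary:

```python
# Illustrative sketch (not the paper's proof): simulate the Chinese
# Restaurant Process and compare the simulated table count with the
# exact expectation sum_i alpha / (alpha + i - 1) ~ alpha * log(n).
import random

def crp_tables(n, alpha, rng):
    # Seat n customers by the CRP rule and return the number of tables.
    counts = []
    for i in range(n):
        r = rng.random() * (alpha + i)
        if r < alpha:
            counts.append(1)            # open a new table
        else:
            r -= alpha
            for t in range(len(counts)):
                if r < counts[t]:
                    counts[t] += 1      # join an existing table
                    break
                r -= counts[t]
    return len(counts)

def expected_tables(n, alpha):
    # Exact expectation of the CRP table count.
    return sum(alpha / (alpha + i) for i in range(n))

rng = random.Random(0)
mean_tables = sum(crp_tables(500, 1.0, rng) for _ in range(30)) / 30
```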