
Collaborating Authors

 Research Report


A New Study Details How Cats Almost Always Land on Their Feet

WIRED

The secret to this acrobatic skill lies in an extremely flexible part of the spine that allows cats to twist in the air and land safely. It's well established that when cats fall, they're able to land perfectly most of the time, nimbly maneuvering to right themselves before they hit the ground. Now, researchers at Japan's Yamaguchi University have advanced our understanding of this extraordinary ability, focusing on the mechanical properties of feline spines. What they found, as detailed in a recent study in the journal The Anatomical Record, is that those sure-footed landings are due in part to the fact that a cat's thoracic region is much more flexible than its lumbar region. While a cat's ability to rotate in the air without anything to push against seems to defy the laws of physics, it is in fact a complex righting maneuver.


On Learning Intrinsic Rewards for Policy Gradient Methods

Neural Information Processing Systems

In many sequential decision making tasks, it is challenging to design reward functions that help an RL agent efficiently learn behavior that is considered good by the agent designer. A number of different formulations of the reward-design problem, or close variants thereof, have been proposed in the literature. In this paper we build on the Optimal Rewards Framework of Singh et al. that defines the optimal intrinsic reward function as one that, when used by an RL agent, achieves behavior that optimizes the task-specifying or extrinsic reward function. Previous work in this framework has shown how good intrinsic reward functions can be learned for lookahead-search-based planning agents. Whether it is possible to learn intrinsic reward functions for learning agents remains an open problem. In this paper we derive a novel algorithm for learning intrinsic rewards for policy-gradient-based learning agents. We compare the performance of an augmented agent that uses our algorithm to provide additive intrinsic rewards to an A2C-based policy learner (for Atari games) and a PPO-based policy learner (for Mujoco domains) with a baseline agent that uses the same policy learners but with only extrinsic rewards. Our results show improved performance on most but not all of the domains.
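The additive scheme described above, in which the agent learns from the extrinsic reward plus a weighted learned intrinsic bonus, can be sketched as follows. This is a minimal illustration, not the paper's learning algorithm; the weight `lam`, the discount `gamma`, and the function name are assumptions for the example.

```python
def augmented_returns(extrinsic, intrinsic, lam=0.01, gamma=0.99):
    """Discounted returns from extrinsic rewards augmented with an
    additive intrinsic bonus weighted by `lam` (illustrative values)."""
    rewards = [re + lam * ri for re, ri in zip(extrinsic, intrinsic)]
    returns, g = [], 0.0
    # Accumulate discounted returns backward through the episode.
    for r in reversed(rewards):
        g = r + gamma * g
        returns.append(g)
    return list(reversed(returns))
```

A policy-gradient learner would then use these augmented returns in place of purely extrinsic ones, while the intrinsic reward function itself is updated so that the resulting policy improves the extrinsic objective.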


The world's oldest wild bird has a new grandchick

Popular Science

Biologists have been tracking Wisdom, the roughly 75-year-old Laysan albatross, since the 1950s. The U.S. Fish and Wildlife Service is shining a light on a new member of a famous feathered family -- that of the world's oldest known breeding bird, a Laysan albatross called Wisdom. The agency posted a video on social media featuring a scruffy-looking hatchling seemingly yawning as it hangs out in the sand in close contact with a giant bird -- presumably one of its parents.



Conditional Generative Moment-Matching Networks

Neural Information Processing Systems

Maximum mean discrepancy (MMD) has been successfully applied to learn deep generative models for characterizing a joint distribution of variables via kernel mean embedding. In this paper, we present conditional generative moment-matching networks (CGMMN), which learn a conditional distribution given some input variables based on a conditional maximum mean discrepancy (CMMD) criterion. The learning is performed by stochastic gradient descent with the gradient calculated by back-propagation. We evaluate CGMMN on a wide range of tasks, including predictive modeling, contextual generation, and Bayesian dark knowledge, which distills knowledge from a Bayesian model by learning a relatively small CGMMN student network. Our results demonstrate competitive performance in all the tasks.
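A minimal sketch of the (unconditional) MMD criterion that CMMD builds on, using a biased empirical estimator with an RBF kernel. The conditional-embedding machinery the paper adds is omitted here; the function names and the bandwidth `sigma` are illustrative assumptions.

```python
import numpy as np

def rbf(x, y, sigma=1.0):
    # Gaussian (RBF) kernel between two vectors.
    return np.exp(-np.sum((x - y) ** 2) / (2 * sigma ** 2))

def mmd2(X, Y, sigma=1.0):
    """Biased empirical estimate of squared maximum mean discrepancy
    between sample sets X and Y: E[k(x,x')] + E[k(y,y')] - 2 E[k(x,y)]."""
    kxx = np.mean([rbf(a, b, sigma) for a in X for b in X])
    kyy = np.mean([rbf(a, b, sigma) for a in Y for b in Y])
    kxy = np.mean([rbf(a, b, sigma) for a in X for b in Y])
    return kxx + kyy - 2 * kxy
```

Minimizing such a discrepancy between generated and real samples by stochastic gradient descent is the training principle the abstract describes; CMMD replaces the marginal embeddings with conditional ones.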


Forget Viagra! 'Arousal training' app can help men last TWICE as long in bed, scientists say

Daily Mail - Science & tech

An 'arousal training' app could help men last twice as long in bed, a study has found.
The Melonga App guides users through a number of therapeutic techniques, tips and exercises designed by urologists and psychologists. It is designed to help men manage arousal better and includes elements of cognitive behavioural therapy and physical exercises to improve ejaculation control without taking medicine. The at-home self-help tool could benefit men who are hesitant to seek help because they are ashamed, researchers said. And it could help the 20 to 30 per cent of men in the UK who are estimated to suffer from the issue, which is defined by ejaculating sooner than wanted during sex.


Towards Understanding Acceleration Tradeoff between Momentum and Asynchrony in Nonconvex Stochastic Optimization

Neural Information Processing Systems

Asynchronous momentum stochastic gradient descent algorithms (Async-MSGD) have been widely used in distributed machine learning, e.g., training large collaborative filtering systems and deep neural networks. Due to current technical limitations, however, establishing convergence properties of Async-MSGD for these highly complicated nonconvex problems is generally infeasible. Therefore, we propose to analyze the algorithm through a simpler but nontrivial nonconvex problem --- streaming PCA. This allows us to make progress toward understanding Async-MSGD and gaining new insights for more general problems. Specifically, by exploiting the diffusion approximation of stochastic optimization, we establish the asymptotic rate of convergence of Async-MSGD for streaming PCA. Our results indicate a fundamental tradeoff between asynchrony and momentum: to ensure convergence and acceleration through asynchrony, we have to reduce the momentum (compared with Sync-MSGD). To the best of our knowledge, this is the first theoretical attempt at understanding Async-MSGD for distributed nonconvex stochastic optimization. Numerical experiments on both streaming PCA and training deep neural networks are provided to support our findings for Async-MSGD.
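The update Async-MSGD performs at each step is ordinary momentum SGD applied to a possibly stale gradient; the tradeoff above says the momentum coefficient must shrink as staleness (asynchrony) grows. A minimal sketch of that single update, with illustrative parameter values:

```python
def msgd_step(w, grad_stale, velocity, lr=0.01, momentum=0.9):
    """One momentum-SGD update using a possibly stale gradient
    `grad_stale` computed on an older iterate by an async worker.
    Per the tradeoff above, `momentum` should be reduced when
    staleness is large (values here are illustrative)."""
    velocity = momentum * velocity + grad_stale
    w = w - lr * velocity
    return w, velocity
```

In a distributed setting, workers compute `grad_stale` against whatever copy of `w` they last read, and the parameter server applies this update as gradients arrive.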


Sample-Efficient Reinforcement Learning with Stochastic Ensemble Value Expansion

Jacob Buckman, Danijar Hafner, George Tucker, Eugene Brevdo, Honglak Lee

Neural Information Processing Systems

We propose stochastic ensemble value expansion (STEVE), a novel model-based technique that addresses the issue of compounding model errors in value estimation. By dynamically interpolating between model rollouts of various horizon lengths for each individual example, STEVE ensures that the model is only utilized when doing so does not introduce significant errors.
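The interpolation idea can be sketched as inverse-variance weighting of the candidate targets: each rollout horizon contributes an ensemble of target estimates, and horizons whose ensemble disagrees (high variance, suggesting model error) get less weight. This is a simplified sketch of that weighting, not the full STEVE implementation; the function name and array layout are assumptions.

```python
import numpy as np

def steve_target(candidates):
    """Combine value targets from rollouts of different horizons.
    Each row of `candidates` holds one horizon's ensemble of target
    estimates; rows are weighted inversely to their ensemble variance."""
    means = candidates.mean(axis=1)
    variances = candidates.var(axis=1) + 1e-8  # avoid division by zero
    weights = 1.0 / variances
    weights /= weights.sum()  # normalize to a convex combination
    return float(np.dot(weights, means))
```

With this weighting, a horizon whose ensemble members all agree dominates the target, which is what lets the model be "only utilized when doing so does not introduce significant errors."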


Tight Bounds for Collaborative PAC Learning via Multiplicative Weights

Neural Information Processing Systems

We study the collaborative PAC learning problem recently proposed in Blum et al.~\cite{BHPQ17}, in which we have $k$ players and they want to learn a target function collaboratively, such that the learned function approximates the target function well on all players' distributions simultaneously. The quality of the collaborative learning algorithm is measured by the ratio between the sample complexity of the algorithm and that of the learning algorithm for a single distribution (called the overhead). We obtain a collaborative learning algorithm with overhead $O(\ln k)$, improving the one with overhead $O(\ln^2 k)$ in \cite{BHPQ17}. We also show that an $\Omega(\ln k)$ overhead is inevitable when $k$ is polynomially bounded in the VC dimension of the hypothesis class. Finally, our experiments demonstrate the superiority of our algorithm compared with the one in Blum et al.~\cite{BHPQ17} on real-world datasets.
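The multiplicative-weights idea in the title can be sketched as maintaining a weight per player and multiplicatively boosting players whose current hypothesis still errs, so later rounds focus samples on them. This is an illustrative sketch of the weighting scheme only, not the paper's algorithm or its guarantees; the function name, `eta`, and the error encoding are assumptions.

```python
def multiplicative_weights(errors_per_round, eta=0.5):
    """Maintain one weight per player; boost a player's weight by
    (1 + eta) in every round where its test fails.
    `errors_per_round[t][i]` is 1 if player i's hypothesis failed
    the accuracy test in round t, else 0. Returns normalized weights."""
    k = len(errors_per_round[0])
    w = [1.0] * k
    for errs in errors_per_round:
        w = [wi * ((1 + eta) if e else 1.0) for wi, e in zip(w, errs)]
    total = sum(w)
    return [wi / total for wi in w]
```

Sampling each round in proportion to these weights concentrates effort on the hardest distributions, which is how the $O(\ln k)$ overhead bound is driven down from $O(\ln^2 k)$.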


Robust Detection of Adversarial Attacks by Modeling the Intrinsic Properties of Deep Neural Networks

Neural Information Processing Systems

It has been shown that deep neural network (DNN) based classifiers are vulnerable to human-imperceptible adversarial perturbations which can cause DNN classifiers to output wrong predictions with high confidence. We propose an unsupervised learning approach to detect adversarial inputs without any knowledge of attackers. Our approach tries to capture the intrinsic properties of a DNN classifier and uses them to detect adversarial inputs. The intrinsic properties used in this study are the output distributions of the hidden neurons in a DNN classifier presented with natural images. Our approach can be easily applied to any DNN classifier or combined with other defense strategies to improve robustness. Experimental results show that our approach demonstrates state-of-the-art robustness in defending against black-box and gray-box attacks.
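The core idea, modeling the distribution of hidden-neuron outputs on natural inputs and flagging inputs that deviate from it, can be sketched with a simple per-neuron z-score test. This is an illustrative stand-in, not the paper's detector (which models the distributions more fully); the function names and the threshold are assumptions.

```python
import numpy as np

def fit_neuron_stats(activations):
    """Per-neuron mean and std of hidden activations collected on
    natural inputs; rows are inputs, columns are neurons."""
    return activations.mean(axis=0), activations.std(axis=0) + 1e-8

def is_adversarial(sample, mean, std, z_thresh=3.0):
    """Flag an input whose hidden activations deviate strongly from
    the natural-input statistics (simple z-score rule for illustration)."""
    z = np.abs((sample - mean) / std)
    return bool(z.max() > z_thresh)
```

Because the statistics are fit only on natural images, the detector needs no adversarial examples at training time, which is what makes the approach unsupervised with respect to the attacker.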