Forward Dynamics Modeling of Speech Motor Control Using Physiological Data

Neural Information Processing Systems

We propose a paradigm for modeling speech production based on neural networks, focusing on characteristics of the musculoskeletal system. Using real physiological data (articulator movements and EMG from muscle activity), a neural network learns the forward dynamics relating motor commands to muscles and the ensuing articulator behavior. After learning, simulated perturbations were used to assess properties of the acquired model, such as natural frequency, damping, and interarticulator couplings. Finally, a cascade neural network is used to generate continuous motor commands from a sequence of discrete articulatory targets.
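A rough sketch of the forward-dynamics learning step described above (not the authors' implementation): a feed-forward network maps EMG-derived motor commands plus the current articulator state to the articulator state at the next time step. The architecture, dimensions, and synthetic stand-in data are all illustrative assumptions.

```python
import torch
import torch.nn as nn

class ForwardDynamicsNet(nn.Module):
    """Hypothetical forward model: (EMG, articulator state) -> next state."""
    def __init__(self, n_emg=8, n_artic=4, n_hidden=64):
        super().__init__()
        # Input: EMG channels + articulator position/velocity at time t.
        self.net = nn.Sequential(
            nn.Linear(n_emg + 2 * n_artic, n_hidden),
            nn.Tanh(),
            nn.Linear(n_hidden, 2 * n_artic),  # predicted position/velocity at t+1
        )

    def forward(self, emg_t, state_t):
        return self.net(torch.cat([emg_t, state_t], dim=-1))

# Training sketch on random stand-in data (real inputs would be recorded
# EMG frames and measured articulator positions/velocities).
model = ForwardDynamicsNet()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
emg = torch.randn(256, 8)
state = torch.randn(256, 8)
next_state = torch.randn(256, 8)
for _ in range(100):
    loss = nn.functional.mse_loss(model(emg, state), next_state)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Once such a model is learned, simulated perturbations can be applied to its inputs to probe properties like damping or inter-articulator coupling, as the abstract describes.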


Inverse Dynamics of Speech Motor Control

Neural Information Processing Systems

This inverse dynamics model allows the use of a faster speech motor control scheme, which can be applied to phoneme-to-speech synthesis via musculo-skeletal system dynamics, or to future use in speech recognition. The forward acoustic model, which is the mapping from articulator trajectories to the acoustic parameters, was improved by adding velocity and voicing information inputs to distinguish acoustic
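A minimal sketch of the improved forward acoustic model as the abstract describes it: articulator positions, augmented with velocities and a voicing flag, are mapped to acoustic parameters. The network shape and all dimensions are assumptions for illustration, not values from the paper.

```python
import torch
import torch.nn as nn

n_artic, n_acoustic = 4, 12  # assumed dimensions

# Positions + velocities + a scalar voicing flag as inputs, per the abstract.
forward_acoustic = nn.Sequential(
    nn.Linear(2 * n_artic + 1, 64),
    nn.Tanh(),
    nn.Linear(64, n_acoustic),
)

pos = torch.randn(1, n_artic)  # articulator positions at one frame
vel = torch.randn(1, n_artic)  # articulator velocities at that frame
voicing = torch.ones(1, 1)     # 1.0 = voiced frame, 0.0 = unvoiced
acoustic_params = forward_acoustic(torch.cat([pos, vel, voicing], dim=-1))
```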


Learning from Reinforcement and Advice Using Composite Reward Functions

AAAI Conferences

Reinforcement learning has become a widely used methodology for creating intelligent agents across a broad range of applications. However, its performance deteriorates in tasks with sparse feedback or lengthy inter-reinforcement times. This paper presents an extension that makes use of an advisory entity to provide additional feedback to the agent. The agent incorporates both the rewards provided by the environment and the advice to attain faster learning and policies that are tuned towards the preferences of the advisor while still achieving the underlying task objective. The advice is converted to "tuning" or user rewards that, together with the task rewards, define a composite reward function that more accurately reflects the advisor's perception of the task. At the same time, the formation of erroneous loops due to incorrect user rewards is avoided by imposing formal bounds on the user reward component. This approach is illustrated on a robot navigation task.
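The composite-reward mechanism can be sketched in a few lines: advice is converted into a bounded user reward that is added to the task reward inside an otherwise standard Q-learning update. The fixed clipping constant below is a hypothetical stand-in for the paper's formal bound on the user reward component.

```python
import random
from collections import defaultdict

ALPHA, GAMMA, EPS = 0.1, 0.95, 0.1
USER_REWARD_BOUND = 0.5  # hypothetical cap standing in for the formal bound

Q = defaultdict(float)  # tabular action-value function Q[(state, action)]

def composite_reward(task_reward, user_reward):
    # Clamp the advisory component so incorrect advice cannot create
    # self-reinforcing reward loops.
    clipped = max(-USER_REWARD_BOUND, min(USER_REWARD_BOUND, user_reward))
    return task_reward + clipped

def select_action(s, actions):
    # Epsilon-greedy action selection over the composite value estimates.
    if random.random() < EPS:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(s, a)])

def q_update(s, a, task_r, user_r, s_next, actions):
    # Standard Q-learning, except the reward is the composite signal.
    r = composite_reward(task_r, user_r)
    best_next = max(Q[(s_next, a2)] for a2 in actions)
    Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])
```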


Variable Impedance Control in End-Effector Space: An Action Space for Reinforcement Learning in Contact-Rich Tasks

arXiv.org Artificial Intelligence

Reinforcement Learning (RL) of contact-rich manipulation tasks has yielded impressive results in recent years. While many studies in RL focus on varying the observation space or reward model, few have focused on the choice of action space (e.g. joint or end-effector space, position, velocity, etc.). However, studies in robot motion control indicate that choosing an action space that conforms to the characteristics of the task can simplify exploration and improve robustness to disturbances. This paper studies the effect of different action spaces in deep RL and advocates for Variable Impedance Control in End-effector Space (VICES) as an advantageous action space for constrained and contact-rich tasks. We evaluate multiple action spaces on three prototypical manipulation tasks: Path Following (a task with no contact), Door Opening (a task with kinematic constraints), and Surface Wiping (a task with continuous contact). We show that VICES improves sample efficiency, maintains low energy consumption, and ensures safety across all three experimental setups. Further, RL policies learned with VICES can transfer across different robot models in simulation, and from simulation to the real world for the same robot. Further information is available at https://stanfordvl.github.io/vices.
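The core of the VICES action space can be illustrated with a short sketch: each policy action specifies a desired end-effector pose offset together with stiffness gains, from which a task-space impedance law computes the commanded wrench. The gain limits, the critical-damping heuristic, and the flat 6-DoF pose representation are simplifying assumptions, not the authors' controller code; their materials are at the URL above.

```python
import numpy as np

def vices_action_to_wrench(action, x, x_dot):
    """action = [pose_offset(6), stiffness(6)]; x, x_dot = current EE pose/velocity."""
    delta_pose = action[:6]
    k = np.clip(action[6:12], 0.0, 300.0)  # assumed stiffness limits
    d = 2.0 * np.sqrt(k)                   # critical-damping heuristic
    x_des = x + delta_pose
    # Task-space impedance law: stiffness toward the target, damping on velocity.
    return k * (x_des - x) - d * x_dot

# Example: a small offset on every axis with moderate stiffness.
x, x_dot = np.zeros(6), np.zeros(6)
action = np.concatenate([0.01 * np.ones(6), 150.0 * np.ones(6)])
wrench = vices_action_to_wrench(action, x, x_dot)
```

Because the policy controls stiffness directly, it can soften on contact-rich segments and stiffen for precise free-space motion, which is the property the paper credits for the safety and energy results.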


Probabilistic Movement Primitives

Neural Information Processing Systems

Movement Primitives (MPs) are a well-established approach for representing modular and re-usable robot movement generators. Many state-of-the-art robot learning successes are based on MPs, due to their compact representation of the inherently continuous and high-dimensional robot movements. A major goal in robot learning is to combine multiple MPs as building blocks in a modular control architecture to solve complex tasks. To this end, an MP representation has to allow for blending between motions, adapting to altered task variables, and co-activating multiple MPs in parallel. We present a probabilistic formulation of the MP concept that maintains a distribution over trajectories. Our probabilistic approach allows for the derivation of new operations which are essential for implementing all the aforementioned properties in one framework. In order to use such a trajectory distribution for robot movement control, we analytically derive a stochastic feedback controller which reproduces the given trajectory distribution. We evaluate and compare our approach to existing methods on several simulated as well as real robot scenarios.
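A toy numerical sketch of the probabilistic MP idea (not the paper's code): a trajectory is a linear combination of radial basis functions with Gaussian-distributed weights, so the primitive is a distribution over trajectories, and conditioning on a via-point is a closed-form Gaussian update. The basis functions, noise level, and random stand-in weights are illustrative assumptions.

```python
import numpy as np

T, n_basis = 100, 10
t = np.linspace(0, 1, T)
centers = np.linspace(0, 1, n_basis)
Phi = np.exp(-0.5 * ((t[:, None] - centers[None, :]) / 0.1) ** 2)  # RBF features

mu_w = np.random.randn(n_basis)   # stand-in for weights fitted from demonstrations
Sigma_w = 0.05 * np.eye(n_basis)  # weight covariance -> distribution over trajectories

# Condition on passing through y_star at time index t_star (Gaussian update).
t_star, y_star, sigma_y = 50, 1.0, 1e-4
phi = Phi[t_star]
gain = Sigma_w @ phi / (sigma_y + phi @ Sigma_w @ phi)
mu_w_cond = mu_w + gain * (y_star - phi @ mu_w)
Sigma_w_cond = Sigma_w - np.outer(gain, phi @ Sigma_w)

mean_traj = Phi @ mu_w_cond  # conditioned mean trajectory passes near y_star
```

Because blending, temporal modulation, and co-activation all reduce to operations on these weight-space Gaussians, the single representation supports every property listed in the abstract.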