Collaborating Authors

 movellan


A Robot With a Delicate Touch

AITopics Original Links

Here in a brick factory that was once one of the first electrified manufacturing sites in New England, Rodney A. Brooks, the legendary roboticist who is Rethink's founder, demonstrates the robot's safety by placing his head in the path of Baxter's arm while it moves objects on an assembly line. The arm senses his head and abruptly stops moving with a soft clunk. Dr. Brooks, unfazed, points out that the arm is what roboticists call "compliant": intended to sense unexpected obstacles and adjust itself accordingly. The $22,000 robot that Rethink will begin selling in October is the clearest evidence yet that robotics is more than a laboratory curiosity or a tool only for large companies with vast amounts of capital. The company is betting it can broaden the market for robots by selling an inexpensive machine that can collaborate with human workers, the way the computer industry took off in the 1980s when the prices of PCs fell sharply and people without programming experience could start using them right out of the box. "It feels like a true Macintosh moment for the robot world," said Tony Fadell, the former Apple executive who oversaw the development of the iPod and the iPhone.


UCSD's robot baby Diego-san appears on video for the first time

AITopics Original Links

A new android infant has been born thanks to the University of California San Diego's Machine Perception Lab. The lab received funding from the National Science Foundation to contract Kokoro Co. Ltd. and Hanson Robotics, two companies that specialize in building lifelike animatronics and androids, to build a replicant based on a one-year-old baby. The resulting robot, which has been a couple of years in development, has finally been completed, and you can watch it smile and make cute faces after the break. Diego-san, as it's called, is actually much larger than a typical one-year-old, mainly because miniaturizing the parts would have been too costly. It stands about 4 feet 3 inches (130 cm) tall and weighs 66 pounds (30 kg), and its body has a total of 44 pneumatic joints.


Apple's New AI will decode the 43 muscles in your face and help Siri2 understand you better.

#artificialintelligence

Computers Don't Know When You Are Happy: Apple Is Adding a Previously Unseen Dimension to Your Device. From the moment we are born, assuming normal eyesight, we open our eyes and fixate on the faces of our parents, where 43 muscles control thousands of nuances of facial expression and emotional intent. These expressions shape how we interpret the world, acting as an extended sensor that helps us learn the basic emotions and reactions to the world around us. "Emotient is the leading authority on facial expression recognition and analysis technologies that are enabling a future of emotion aware computing." In the spring of 2013, a team of scientists and researchers at the Machine Perception Lab at the University of California, San Diego, was developing the technology and the basic elements of what was to become Emotient. The founding team was widely regarded as having spearheaded the use of machine learning for facial expression analysis, with over 20 years of experience pioneering machine learning and computer vision technology for facial behavior analysis. The team has published hundreds of peer-reviewed scientific publications, starting in 1995, which have been cited by thousands of other researchers in the field. Building on the work of Paul Ekman, Ph.D. [1], a pioneer in the study of emotions and facial expressions and a professor emeritus of psychology in the Department of Psychiatry at the University of California Medical School (UCSF), where he has been active for 32 years, Emotient applied machine learning to his groundbreaking research in micro-emotions.


Minimum Probability Flow Learning

arXiv.org Machine Learning

Fitting probabilistic models to data is often difficult, due to the general intractability of the partition function and its derivatives. Here we propose a new parameter estimation technique that does not require computing an intractable normalization factor or sampling from the equilibrium distribution of the model. This is achieved by establishing dynamics that would transform the observed data distribution into the model distribution, and then setting as the objective the minimization of the KL divergence between the data distribution and the distribution produced by running the dynamics for an infinitesimal time. Score matching, minimum velocity learning, and certain forms of contrastive divergence are shown to be special cases of this learning technique. We demonstrate parameter estimation in Ising models, deep belief networks and an independent component analysis model of natural scenes. In the Ising model case, current state of the art techniques are outperformed by at least an order of magnitude in learning time, with lower error in recovered coupling parameters.
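The abstract above does not write the objective out. As a rough sketch only, the form usually given for minimum probability flow can be illustrated on a small Ising model. The assumptions here are ours, not the abstract's: an energy E(x) = -0.5 xᵀJx over spins in {-1, +1}, a symmetric coupling matrix J with zero diagonal, and dynamics that connect each observed state only to its single-bit-flip neighbours. Under those assumptions the objective is the data average of the sum over bits n of exp(0.5 [E(x) - E(x with bit n flipped)]), which needs no partition function. The function names are hypothetical.

```python
import numpy as np

def ising_energy(x, J):
    """Energy of one spin configuration x in {-1, +1}^d (assumed form)."""
    return -0.5 * x @ J @ x

def mpf_objective(X, J):
    """Probability-flow objective with single-bit-flip connectivity.

    X: (N, d) array of observed spin configurations in {-1, +1}.
    J: (d, d) symmetric coupling matrix with zero diagonal.
    Flipping bit n changes the energy by 2 * x_n * (J x)_n, so
    exp(0.5 * [E(x) - E(x_flipped)]) = exp(-x_n * (J x)_n).
    """
    field = X @ J  # (N, d): local field at every spin (J is symmetric)
    return np.mean(np.sum(np.exp(-X * field), axis=1))

def mpf_objective_bruteforce(X, J):
    """Same quantity computed directly from energy differences (slow check)."""
    total = 0.0
    for x in X:
        for n in range(x.size):
            x_flipped = x.copy()
            x_flipped[n] *= -1
            total += np.exp(0.5 * (ising_energy(x, J) - ising_energy(x_flipped, J)))
    return total / len(X)
```

Minimizing `mpf_objective` over J with any gradient-based optimizer would then give a parameter estimate; the brute-force version is included only to make the energy-difference form explicit.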


An Alternative to Low-level-Synchrony-Based Methods for Speech Detection

Neural Information Processing Systems

Determining whether someone is talking has applications in many areas such as speech recognition, speaker diarization, social robotics, facial expression recognition, and human-computer interaction. One popular approach to this problem is audio-visual synchrony detection. A candidate speaker is deemed to be talking if the visual signal around that speaker correlates with the auditory signal. Here we show that with the proper visual features (in this case movements of various facial muscle groups), a very accurate detector of speech can be created that does not use the audio signal at all. Further, we show that this person-independent visual-only detector can be used to train very accurate audio-based person-dependent voice models. The voice model has the advantage of being able to identify when a particular person is speaking even when they are not visible to the camera (e.g. in the case of a mobile robot). Moreover, we show that a simple sensory fusion scheme between the auditory and visual models improves performance on the task of talking detection. The work here provides dramatic evidence about the efficacy of two very different approaches to multimodal speech detection on a challenging database.
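For orientation, the synchrony baseline that the abstract contrasts itself with can be sketched in a few lines. This is an illustration of the general idea only, not the paper's method or features: the per-frame audio energy, the per-frame mouth-motion measure, the use of Pearson correlation, and the 0.5 threshold are all assumptions chosen for the example.

```python
import numpy as np

def is_talking_by_synchrony(audio_energy, visual_motion, threshold=0.5):
    """Deem a candidate speaker to be talking if the visual signal around
    them correlates with the audio signal (hypothetical threshold)."""
    corr = np.corrcoef(audio_energy, visual_motion)[0, 1]
    return bool(corr > threshold)

# Toy per-frame signals (made up for illustration).
audio = np.array([0.1, 0.9, 0.2, 0.8, 0.1, 0.9])
mouth_a = np.array([0.2, 1.0, 0.3, 0.7, 0.2, 0.8])  # moves with the audio
mouth_b = np.array([0.5, 0.5, 0.6, 0.4, 0.5, 0.5])  # nearly still

speaker_a_talking = is_talking_by_synchrony(audio, mouth_a)
speaker_b_talking = is_talking_by_synchrony(audio, mouth_b)
```

A detector of this kind needs the audio channel at test time, which is the dependency the visual-only detector described above avoids.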


Bayesian Robustification for Audio Visual Fusion

Neural Information Processing Systems

Department of Cognitive Science, University of California, San Diego, La Jolla, CA 92092-0515. Abstract: We discuss the problem of catastrophic fusion in multimodal recognition systems. This problem arises in systems that need to fuse different channels in non-stationary environments. Practice shows that when recognition modules within each modality are tested in contexts inconsistent with their assumptions, their influence on the fused product tends to increase, with catastrophic results. We explore a principled solution to this problem based upon Bayesian ideas of competitive models and inference robustification: each sensory channel is provided with simple white-noise context models, and the perceptual hypothesis and context are jointly estimated. Consequently, context deviations are interpreted as changes in white noise contamination strength, automatically adjusting the influence of the module.
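The joint estimation of hypothesis and context can be illustrated with a toy example. The sketch below is a simplified reading of the idea, not the paper's implementation. It assumes each channel scores a word hypothesis with an isotropic Gaussian around a template, and that the white-noise context is a single variance per channel, estimated by maximum likelihood for each hypothesis and never allowed below the variance seen in training. All names, templates and numbers are hypothetical.

```python
import numpy as np

def gaussian_loglik(residual, var):
    """Log-likelihood of a residual under isotropic Gaussian noise."""
    n = residual.size
    return -0.5 * n * np.log(2 * np.pi * var) - 0.5 * np.sum(residual ** 2) / var

def channel_loglik(obs, template, trained_var, robust):
    """Score one hypothesis in one channel.

    With robust=True the noise variance (the "context") is re-estimated
    jointly with the hypothesis, floored at the training variance.
    """
    residual = obs - template
    var = max(trained_var, np.mean(residual ** 2)) if robust else trained_var
    return gaussian_loglik(residual, var)

def fuse(observations, templates, trained_vars, robust):
    """Sum channel log-likelihoods per hypothesis and return the best index."""
    n_words = len(next(iter(templates.values())))
    scores = np.zeros(n_words)
    for channel, obs in observations.items():
        for w in range(n_words):
            scores[w] += channel_loglik(obs, templates[channel][w],
                                        trained_vars[channel], robust)
    return int(np.argmax(scores))

# Toy setup: the true word is 0; the audio channel is heavily contaminated.
templates = {
    "audio": [np.ones(4), -np.ones(4)],
    "video": [np.ones(2), -np.ones(2)],
}
trained_vars = {"audio": 0.1, "video": 0.1}
observations = {
    "audio": np.array([-5.0, 4.0, -6.0, 3.0]),  # far outside training conditions
    "video": np.array([0.9, 1.1]),              # clean
}

classical_choice = fuse(observations, templates, trained_vars, robust=False)
robust_choice = fuse(observations, templates, trained_vars, robust=True)
```

With the variance fixed at its training value, the contaminated audio channel produces large log-likelihood differences and dominates the fused decision. Once the variance is re-estimated, the audio scores flatten out and the clean video channel decides, which is the automatic adjustment of influence the abstract describes.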


Bayesian Robustification for Audio Visual Fusion

Neural Information Processing Systems

Department of Cognitive Science, University of California, San Diego, La Jolla, CA 92092-0515. Abstract: We discuss the problem of catastrophic fusion in multimodal recognition systems. This problem arises in systems that need to fuse different channels in non-stationary environments. Practice shows that when recognition modules within each modality are tested in contexts inconsistent with their assumptions, their influence on the fused product tends to increase, with catastrophic results. We explore a principled solution to this problem based upon Bayesian ideas of competitive models and inference robustification: each sensory channel is provided with simple white-noise context models, and the perceptual hypothesis and context are jointly estimated. Consequently, context deviations are interpreted as changes in white noise contamination strength, automatically adjusting the influence of the module. The approach is tested on a fixed lexicon automatic audiovisual speech recognition problem with very good results. 1 Introduction. In this paper we address the problem of catastrophic fusion in automatic multimodal recognition systems.