Scientists have developed a new program that can identify people based on how they dance

Daily Mail - Science & tech

A team of researchers from the University of Jyväskylä in Finland has developed a new computer system that can identify individuals not by their faces or fingerprints, but simply by watching them dance. For the experiment, the team analyzed the movements of 73 participants as they danced to music in eight different genres, including blues, country, metal, reggae, and rap. They developed a machine learning program that analyzed 21 points of articulation on each dancer's body through a motion capture camera, and combined that data with some general information about each participant. The team found that the program was able to identify which of the 73 participants was dancing, just from their captured movements, 94 percent of the time. 'It seems as though a person's dance movements are a kind of fingerprint,' researcher Dr. Pasi Saari told EurekAlert.
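The "movement fingerprint" idea can be sketched with a simple nearest-centroid classifier. This is an illustrative toy, not the Jyväskylä team's actual model: the feature vectors, noise levels, and dancer profiles below are invented, and a real system would extract far richer statistics from the 21 tracked joints.

```python
import math
import random

# Illustrative sketch (not the published model): identify a dancer by
# comparing a movement-feature vector against stored "fingerprints"
# (here, the mean of each dancer's training recordings).

N_JOINTS = 21  # points of articulation tracked per dancer

def centroid(vectors):
    """Mean feature vector over one dancer's training recordings."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def identify(sample, fingerprints):
    """Return the dancer ID whose fingerprint is closest to the sample."""
    return min(fingerprints, key=lambda d: math.dist(sample, fingerprints[d]))

# Toy data: three dancers with distinct (synthetic) movement profiles.
random.seed(0)
profiles = {d: [float(d)] * N_JOINTS for d in (1, 2, 3)}
train = {d: [[x + random.gauss(0, 0.1) for x in p] for _ in range(5)]
         for d, p in profiles.items()}
fingerprints = {d: centroid(vs) for d, vs in train.items()}

# A fresh recording of dancer 2 maps back to dancer 2.
probe = [x + random.gauss(0, 0.1) for x in profiles[2]]
print(identify(probe, fingerprints))  # → 2
```

In a real experiment the per-dancer separation would come from learned features rather than hand-set profiles, but the recognition step is the same: a new recording is attributed to whichever stored fingerprint it most resembles.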

Zach Pardos is Using Machine Learning to Broaden Pathways from Community College


UC Berkeley Assistant Professor Zachary Pardos and his team have developed a machine learning approach that promises to help more community college students position themselves to transfer and succeed at four-year colleges and universities. Along the way, they've discovered that considering course enrollment patterns -- or the classes that students take before, along with, and after a particular course -- can help provide a more complete picture of what courses should "count" when students transfer. Roughly 80% of community college students aim to continue their education at four-year institutions, but the vast majority never make the transfer. Contributing to the problem are the complexities of "articulation," or determining which course at one institution will count for credit at another. This entails assessing the similarity of thousands, or potentially even millions, of pairs of courses, an endeavor that's impossible to comprehensively achieve and keep current across all institutions manually.
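The enrollment-pattern idea can be sketched as follows. Everything here is invented for illustration, not Pardos's actual system: two courses are represented by the other courses students take alongside them, and courses with similar enrollment contexts are flagged as candidates for articulation.

```python
from collections import Counter
import math

# Hypothetical sketch: represent each course by its co-enrollment profile,
# then compare courses across institutions by cosine similarity.

transcripts = [
    ["CALC1", "PHYS1", "ENG1"],        # community college naming
    ["CALC1", "PHYS1", "CHEM1"],
    ["MATH1A", "PHYSICS7A", "ENG1"],   # four-year institution naming
    ["MATH1A", "PHYSICS7A", "CHEM1"],
]

def context_profile(course):
    """Count the courses that co-occur with `course` across transcripts."""
    profile = Counter()
    for t in transcripts:
        if course in t:
            profile.update(c for c in t if c != course)
    return profile

def cosine(p, q):
    dot = sum(p[k] * q[k] for k in p)
    norm = lambda v: math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm(p) * norm(q))

# CALC1 and MATH1A share enrollment context (ENG1, CHEM1), hinting that
# they may articulate even though their names and physics partners differ.
sim = cosine(context_profile("CALC1"), context_profile("MATH1A"))
print(round(sim, 3))  # → 0.333
```

A real system would learn dense course embeddings from millions of enrollments rather than raw co-occurrence counts, but the payoff is the same: similarity can be computed at scale instead of being assessed pair-by-pair by hand.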

An Introduction to a Formal Machine Learning Model


In everyday life, people constantly face decisions. For a machine to make these kinds of decisions, the natural route is to express the problem as a mathematical formulation, which can be derived directly from the problem's context. Machine learning provides such a formal learning model. For example, a vending machine could use the weight and security features of coins and notes to detect counterfeit payment.
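The vending-machine example can be written down as a simple decision rule over measurements. This is a minimal sketch: the coin specifications and tolerance below are stated for illustration, and a real machine would check several sensors, not just weight and diameter.

```python
# A decision expressed as a mathematical rule over measurements:
# accept a coin only if its measured weight and diameter fall within
# a tolerance of some known denomination.

COIN_SPECS = {          # (weight in grams, diameter in mm)
    "1 euro": (7.5, 23.25),
    "2 euro": (8.5, 25.75),
}
TOLERANCE = 0.2  # illustrative margin for sensor noise

def classify_coin(weight, diameter):
    """Return the matching denomination, or None for a suspect coin."""
    for name, (w, d) in COIN_SPECS.items():
        if abs(weight - w) <= TOLERANCE and abs(diameter - d) <= TOLERANCE:
            return name
    return None

print(classify_coin(7.52, 23.30))  # within tolerance of a 1 euro coin
print(classify_coin(7.0, 23.3))    # off-spec weight, rejected
```

A learned model would fit the acceptance regions from labeled examples instead of hard-coding them, but the formal structure, mapping measurements to a decision, is the same.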

Deep Learning based human pose estimation with OpenCV


In today's post, we will run single-person pose estimation using OpenCV. For now we will use only the confidence maps to locate the keypoints; to keep this post simple, connecting the keypoints of multiple people using Part Affinity Fields will be covered in a separate post next week. We will use the pretrained model trained by the OpenPose team with Caffe on the MPII dataset, which defines 15 keypoints for the human body. We also define the pose pairs, which specify the limbs: each limb connects two keypoints, and the Part Affinity Fields are used to predict these limbs.
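The keypoint-extraction step can be sketched without the network itself: each output channel is a confidence map, and the keypoint is taken at the map's maximum, subject to a threshold. The tiny 5x5 maps below stand in for the real network output, and the two-entry pose-pair list is a toy, not the full 15-keypoint MPII skeleton.

```python
# Sketch of keypoint extraction from confidence maps: take the peak of
# each map (if above a threshold), then join keypoints listed in POSE_PAIRS.

THRESHOLD = 0.1

def keypoint_from_map(conf_map):
    """Return (x, y) of the peak confidence, or None if below threshold."""
    best, best_xy = -1.0, None
    for y, row in enumerate(conf_map):
        for x, v in enumerate(row):
            if v > best:
                best, best_xy = v, (x, y)
    return best_xy if best > THRESHOLD else None

# Two toy confidence maps: "head" peaks at (2, 0), "neck" at (2, 2).
head = [[0, 0, 0.9, 0, 0]] + [[0] * 5 for _ in range(4)]
neck = [[0] * 5, [0] * 5, [0, 0, 0.8, 0, 0], [0] * 5, [0] * 5]

points = [keypoint_from_map(m) for m in (head, neck)]
POSE_PAIRS = [(0, 1)]  # one limb: keypoint 0 (head) to keypoint 1 (neck)
limbs = [(points[a], points[b]) for a, b in POSE_PAIRS
         if points[a] is not None and points[b] is not None]
print(limbs)  # → [((2, 0), (2, 2))]
```

With the real model, the confidence maps come from the network's forward pass and are resized back to the input image's scale before taking the peak; the thresholding and pair-joining logic is otherwise the same.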

Towards Explainable Music Emotion Recognition: The Route via Mid-level Features Machine Learning

Emotional aspects play an important part in our interaction with music. However, modelling these aspects in MIR systems has been notoriously challenging, since emotion is an inherently abstract and subjective experience, which makes it difficult both to quantify or predict in the first place and to make sense of the predictions afterwards. In an attempt to create a model that can give a musically meaningful and intuitive explanation for its predictions, we propose a VGG-style deep neural network that learns to predict emotional characteristics of a musical piece together with (and based on) human-interpretable, mid-level perceptual features. We compare this to predicting emotion directly with an identical network that does not take into account the mid-level features, and observe that the loss in predictive performance of going through the mid-level features is surprisingly low, on average. The design of our network allows us to visualize the effects of perceptual features on individual emotion predictions, and we argue that the small loss in performance in going through the mid-level features is justified by the gain in explainability of the predictions.
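The explainability mechanism can be illustrated with a drastically simplified stand-in for the paper's network: if emotion is a linear function of predicted mid-level features, each prediction decomposes into per-feature contributions. The feature names and weights below are invented for illustration, not the paper's learned values.

```python
# Two-stage sketch: mid-level perceptual features -> emotion, with the
# final (here linear) stage making each prediction decomposable.

MID_LEVEL = ["melodiousness", "dissonance", "rhythmic_stability"]
# weights[emotion][feature]: how each mid-level feature drives an emotion
WEIGHTS = {
    "valence": {"melodiousness": 0.8, "dissonance": -0.6, "rhythmic_stability": 0.2},
    "arousal": {"melodiousness": -0.1, "dissonance": 0.5, "rhythmic_stability": 0.4},
}

def explainable_emotion(mid_values):
    """Return, per emotion, the score and each feature's contribution."""
    out = {}
    for emotion, w in WEIGHTS.items():
        contrib = {f: w[f] * mid_values[f] for f in MID_LEVEL}
        out[emotion] = (sum(contrib.values()), contrib)
    return out

# Mid-level values as the first network stage might predict them.
piece = {"melodiousness": 0.9, "dissonance": 0.2, "rhythmic_stability": 0.7}
score, contrib = explainable_emotion(piece)["valence"]
print(round(score, 2))                # 0.8*0.9 - 0.6*0.2 + 0.2*0.7 = 0.74
print(max(contrib, key=contrib.get))  # melodiousness drives the prediction
```

In the actual model the mid-level features are themselves predicted by a deep network from audio, but the same decomposition of the final stage is what makes the emotion predictions inspectable.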

Attention model for articulatory features detection Machine Learning

Articulatory distinctive features, as well as phonetic transcription, play an important role in speech-related tasks: computer-assisted pronunciation training, text-to-speech conversion (TTS), studying speech production mechanisms, and speech recognition for low-resourced languages. End-to-end approaches to speech-related tasks have gained a lot of traction in recent years. We apply the Listen, Attend and Spell~(LAS)~\cite{Chan-LAS2016} architecture to phone recognition on a small training set, such as TIMIT~\cite{TIMIT-1992}. We also introduce a novel decoding technique that allows training detectors for manners and places of articulation end-to-end using attention models, and we explore joint phone recognition and articulatory feature detection in a multitask learning setting.
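The attention step at the heart of LAS-style decoders can be sketched in a few lines: the decoder state queries the encoder states, and a softmax over the similarity scores weights a context vector. The 2-d toy vectors below are illustrative; a real model uses learned, high-dimensional states.

```python
import math

# Minimal dot-product attention sketch: score each encoder state against
# the query, softmax the scores, and form a weighted context vector.

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attend(query, encoder_states):
    """Return (weights, context) for dot-product attention."""
    scores = [sum(q * h for q, h in zip(query, state))
              for state in encoder_states]
    weights = softmax(scores)
    context = [sum(w * state[i] for w, state in zip(weights, encoder_states))
               for i in range(len(query))]
    return weights, context

encoder = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, context = attend([1.0, 0.0], encoder)
print([round(w, 3) for w in weights])  # highest weight on states aligned with the query
```

Training such attention end-to-end is what lets the proposed detectors learn which parts of the acoustic sequence carry each manner or place of articulation.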

Neuroscientists Transform Brain Activity to Speech with AI


Artificial intelligence is enabling many scientific breakthroughs, especially in fields of study that generate high volumes of complex data such as neuroscience. As impossible as it may seem, neuroscientists are making strides in decoding neural activity into speech using artificial neural networks. Yesterday, the neuroscience team of Gopala K. Anumanchipalli, Josh Chartier, and Edward F. Chang of the University of California, San Francisco (UCSF) published their study in Nature, using artificial intelligence and a state-of-the-art brain-machine interface to produce synthetic speech from brain recordings. The concept is relatively straightforward--record the brain activity and audio of participants while they are reading aloud in order to create a system that decodes brain signals for vocal tract movements, then synthesize speech from the decoded movements. The execution of the concept required sophisticated finessing of cutting-edge AI techniques and tools.

Spatial-Temporal Graph Convolutional Networks for Sign Language Recognition Machine Learning

Abstract--The recognition of sign language is a challenging task with an important role in society: facilitating the communication of deaf persons. We propose a new Spatial-Temporal Graph Convolutional Network approach to sign language recognition based on human skeletal movements. The method uses graphs to capture the dynamics of signs in two dimensions, spatial and temporal, considering the complex aspects of the language. Additionally, we present a new dataset of human skeletons for sign language based on ASLLVD to contribute to future related studies. I. INTRODUCTION Sign language is a visual communication skill that enables individuals with different types of hearing impairment to communicate in society. It is the language used by most deaf people in their daily lives; moreover, it is the symbol of identification between the members of that community and the main force that unites them. Sign language has a very close relationship with the culture of a country or even a region, and for this reason, each nation has its own language [1]. According to the World Health Organization, the number of deaf people is about 466 million, and the organization estimates that by 2050 this number will exceed 900 million, equivalent to a forecast of 1 in 10 individuals around the world [2].
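The spatial half of a spatial-temporal graph convolution can be sketched on a tiny skeleton: joints are graph nodes, bones are edges, and each convolution step aggregates a joint's features with those of its neighbours via a normalised adjacency matrix. The 3-joint skeleton and 1-d features below are a toy, not the dataset's actual skeleton.

```python
# One spatial graph-convolution step on a skeleton graph: features are
# averaged over each joint's neighbourhood (row-normalised A with
# self-loops), the core operation an ST-GCN applies at every frame.

# Toy skeleton: joint 0 (head) - joint 1 (neck) - joint 2 (shoulder)
edges = [(0, 1), (1, 2)]
n = 3

# Adjacency with self-loops, then row-normalise.
A = [[0.0] * n for _ in range(n)]
for i in range(n):
    A[i][i] = 1.0
for i, j in edges:
    A[i][j] = A[j][i] = 1.0
A_hat = [[v / sum(row) for v in row] for row in A]

def graph_conv(X):
    """Aggregate each joint's features from its graph neighbourhood."""
    return [[sum(A_hat[i][k] * X[k][c] for k in range(n))
             for c in range(len(X[0]))]
            for i in range(n)]

# One frame of 1-d joint features (e.g. the x-coordinate of each joint).
X = [[0.0], [3.0], [6.0]]
print(graph_conv(X))  # ≈ [[1.5], [3.0], [4.5]], up to float rounding
```

A full ST-GCN follows this spatial aggregation with a learned weight matrix and a temporal convolution across frames, which is what lets it model a sign's motion over time.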

Learning to Fingerprint the Latent Structure in Question Articulation Artificial Intelligence

Abstract Machine understanding of questions is tightly related to recognition of articulation in the context of the computational capabilities of an underlying processing algorithm. In this paper, a mathematical model to capture and distinguish the latent structure in the articulation of questions is presented. We propose an objective-driven approach to represent this latent structure and show that such an approach is beneficial when examples of complementary objectives are not available. We show that the latent structure can be represented as a system that maximizes a cost function related to the underlying objective. Further, we show that the optimization formulation can be approximated by building a memory of patterns, represented as a trained neural auto-encoder. Experimental evaluation using many clusters of questions, each related to an objective, shows 80% recognition accuracy and a negligible false-positive rate across these clusters. We then extend the same memory to a related task where the goal is to iteratively refine a dataset of questions based on the latent articulation. We also demonstrate a refinement scheme, called K-fingerprints, that achieves nearly 100% recognition with a negligible false-positive rate across the different clusters of questions.
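The memory-based recognition scheme can be illustrated with a crude stand-in for the paper's trained auto-encoder: each objective cluster stores a "fingerprint" of its questions, and a new question is assigned to the cluster that reconstructs it best. Here, word overlap plays the role of low reconstruction error; the clusters, questions, and threshold are all invented for illustration.

```python
# Stand-in for an auto-encoder memory: per-cluster vocabulary fingerprints,
# with word overlap substituting for reconstruction quality.

clusters = {
    "booking": ["how do i book a flight", "can i reserve a seat"],
    "refunds": ["how do i get a refund", "can i cancel my ticket"],
}

def fingerprint(questions):
    return set(w for q in questions for w in q.split())

memory = {name: fingerprint(qs) for name, qs in clusters.items()}

def recognise(question, threshold=0.5):
    """Return the best-matching cluster, or None below the threshold
    (the analogue of a high reconstruction error / false-positive guard)."""
    words = set(question.split())
    best = max(memory, key=lambda c: len(words & memory[c]) / len(words))
    score = len(words & memory[best]) / len(words)
    return best if score >= threshold else None

print(recognise("can i book a flight"))        # matches the booking cluster
print(recognise("what is the weather today"))  # no cluster fits: rejected
```

The rejection threshold is what keeps false positives low: a question that no stored fingerprint reconstructs well is reported as outside all clusters rather than forced into the nearest one.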

Learning About Learning & How AI Informs Enterprise Analytics


This little vignette from my childhood highlights one of the first ways that machines "learn" -- through training. Give a machine learning algorithm authoritative sources -- dictionaries or large collections of words in common usage, for example -- and then ask the machine to tell you whether something "looks" right. It will be highly accurate, according to the creators of such methods. If you ask how they measure accuracy, they will tell you that it is a comparison of the algorithm's results against the training sets. If the training is "right," then accuracy can be measured by comparison. Of course, if usage changes -- in this simple example, imagine that we also wanted to include alternative spellings (the British vs. American spelling of "color," for example) -- we need to include training examples of the alternate spellings.
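The spelling example above can be reduced to a toy: a "looks right" check is only as good as its training set, so accepting new usage means adding training examples, not writing a new rule. The word lists below are illustrative.

```python
# A trivial "looks right" checker: membership in the training vocabulary.
# Its accuracy is measured against that same vocabulary, so shifts in
# usage require retraining, i.e. more examples.

training_set = {"color", "flavor", "neighbor"}  # American spellings only

def looks_right(word, vocabulary):
    return word in vocabulary

print(looks_right("color", training_set))   # True
print(looks_right("colour", training_set))  # False: never seen in training

# To accept British usage, the fix is more training data, not new logic.
training_set |= {"colour", "flavour", "neighbour"}
print(looks_right("colour", training_set))  # True
```

The same dynamic scales up to real machine learning systems: when the world drifts away from the training data, the model's notion of "right" drifts with it until it is retrained.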