 Media


Share Your Science: Microsoft Developing Applications for the Visually Impaired

#artificialintelligence

Ken Tran, Senior Research Engineer at Microsoft Research, shares how his team is using deep learning to create applications for people who are blind or visually impaired. Using Tesla M40 and TITAN X GPUs with the cuDNN-accelerated Caffe deep learning framework, they have trained their language model to describe images or scenes in natural language. The work is being used in Seeing AI, a Microsoft research project that uses computer vision and natural language processing to describe a person's surroundings, read text, answer questions and even identify emotions on people's faces. Watch more scientists and researchers share how accelerated computing is benefiting their work at http://nvda.ly/X7WpH


Using the Cloud Natural Language API to analyze Harry Potter and The New York Times | Google Cloud Big Data and Machine Learning Blog

#artificialintelligence

Ever wanted a way to easily extract and analyze meaningful data from text? The new Cloud Natural Language API has a feature that lets you extract entities from text, such as people, places and events, with a single API call. Let's take the following sentence from a recent news article: LONDON -- J. K. Rowling always said that the seventh Harry Potter book, "Harry Potter and the Deathly Hallows," would be the last in the series, and so far she has kept to her word. I could write my own algorithm to find the people and locations mentioned in this sentence, but that would be difficult. And it would be even more difficult if I wanted to gather more data on each of the mentioned entities, or analyze entities across thousands of sentences, while accounting for mentions of the same entity that are phrased differently (e.g., "Rowling" and "J.K. Rowling"). With the Cloud Natural Language API, I can analyze the above sentence with the API's analyzeEntities method.
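As a rough illustration of what a call to analyzeEntities might look like over REST, the sketch below builds the JSON request body for the example sentence. The endpoint version (`v1`), the use of an API key in the query string, and the helper names `build_entities_request` and `analyze_entities` are assumptions made for this example, not details taken from the article; the version that was current when the article was written may differ.

```python
import json
import urllib.request

# Assumed endpoint: the v1 REST path for entity analysis.
ENDPOINT = "https://language.googleapis.com/v1/documents:analyzeEntities"


def build_entities_request(text):
    """Build the JSON body for an analyzeEntities call on plain text."""
    return {
        "document": {"type": "PLAIN_TEXT", "content": text},
        "encodingType": "UTF8",
    }


def analyze_entities(text, api_key):
    """Send the request. Needs network access and a valid key.

    `api_key` is a placeholder for your own credential (hypothetical here).
    """
    body = json.dumps(build_entities_request(text)).encode("utf-8")
    request = urllib.request.Request(
        f"{ENDPOINT}?key={api_key}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


sentence = (
    'LONDON -- J. K. Rowling always said that the seventh Harry Potter book, '
    '"Harry Potter and the Deathly Hallows," would be the last in the series, '
    'and so far she has kept to her word.'
)

# Only the payload is built and printed here; nothing is sent over the network.
payload = build_entities_request(sentence)
print(json.dumps(payload, indent=2))
```

Calling `analyze_entities(sentence, api_key)` with a real key should return a JSON object containing a list of detected entities; that call is left out of the runnable part so the sketch works offline.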


Inside story of how Ghostbot handles texting creepers

#artificialintelligence

Last month, in collaboration with the folks behind the fantastic Burner app, my company Voxable launched our latest chatbot project, Ghostbot for Burner. Ghostbot runs on top of Burner's "open" phone numbers as an agent inside your text message inbox, allowing you to "ghost" away from fleeting relationships by responding to unwanted or creepy texters on your behalf. Since the launch, Ghostbot has been covered in over one hundred media outlets. Chris Messina was also kind enough to post it to Product Hunt, calling it "the first bot-as-personal-firewall" he had seen. The opportunity to break new ground in the coming era of bot-mediated communication was what initially both excited and terrified us about the project.


Linear Algebraic Structure of Word Senses, with Applications to Polysemy

arXiv.org Machine Learning

Word embeddings are ubiquitous in NLP and information retrieval, but it's unclear what they represent when the word is polysemous, i.e., has multiple senses. Here it is shown that multiple word senses reside in linear superposition within the word embedding and can be recovered by simple sparse coding. The success of the method, which applies to several embedding methods including word2vec, is mathematically explained using the random walk on discourses model (Arora et al., 2016). A novel aspect of our technique is that each word sense is also accompanied by one of about 2000 discourse atoms that give a succinct description of which other words co-occur with that word sense. Discourse atoms seem to be of independent interest, and make the method potentially more useful than the traditional clustering-based approaches to polysemy.
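The idea of senses living in linear superposition can be illustrated with a toy example. The sketch below is not the authors' method: it assumes the atoms are already known (the paper learns them), uses random unit vectors as stand-in atoms, and swaps in plain greedy matching pursuit as the sparse coder. All names, dimensions, and coefficients are hypothetical.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def unit(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def matching_pursuit(x, atoms, k):
    """Greedily pick k atoms that best explain x (a simple sparse coder)."""
    residual = list(x)
    chosen = []
    for _ in range(k):
        scores = [dot(residual, atom) for atom in atoms]
        best = max(range(len(atoms)), key=lambda j: abs(scores[j]))
        chosen.append((best, scores[best]))
        # Remove the part of the residual explained by the chosen atom.
        residual = [r - scores[best] * a for r, a in zip(residual, atoms[best])]
    return chosen


rng = random.Random(0)
DIM, N_ATOMS = 200, 20  # toy sizes, far smaller than the ~2000 atoms in the abstract

# Stand-in "discourse atoms": random unit vectors (an assumption for this demo).
atoms = [unit([rng.gauss(0, 1) for _ in range(DIM)]) for _ in range(N_ATOMS)]

# A toy polysemous word: a weighted sum of two sense atoms.
word_vec = [0.8 * a + 0.6 * b for a, b in zip(atoms[3], atoms[7])]

senses = matching_pursuit(word_vec, atoms, k=2)
recovered = sorted(index for index, _ in senses)
print("recovered atom indices:", recovered)
```

Because random high-dimensional unit vectors are nearly orthogonal, the two atoms mixed into `word_vec` stand out clearly from the rest, which is the intuition behind recovering senses from a single embedding.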


Left/Right Hand Segmentation in Egocentric Videos

arXiv.org Artificial Intelligence

Wearable cameras allow people to record their daily activities from a user-centered (First Person Vision) perspective. Due to their favorable location, wearable cameras frequently capture the hands of the user, and may thus represent a promising user-machine interaction tool for different applications. Existing First Person Vision methods handle hand segmentation as a background-foreground problem, ignoring two important facts: i) hands are not a single "skin-like" moving element, but a pair of interacting cooperative entities, ii) close hand interactions may lead to hand-to-hand occlusions and, as a consequence, create a single hand-like segment. These facts complicate a proper understanding of hand movements and interactions. Our approach extends traditional background-foreground strategies by including a hand-identification step (left-right) based on a Maxwell distribution of angle and position. Hand-to-hand occlusions are addressed by exploiting temporal superpixels. The experimental results show that, in addition to a reliable left/right hand-segmentation, our approach considerably improves the traditional background-foreground hand-segmentation.

Keywords: Hand-Segmentation, Hand-identification, Egocentric Vision, First Person Vision

1. Introduction

The recent widespread availability of wearable devices has quickly attracted the interest of researchers, computer scientists and high-tech companies [1]. The 1990s idea of a body-worn device that is always ready to be used is nowadays possible, and its potential applicability to real problems is evident. In general, the wearable sensor that has most attracted researchers' attention is the video camera: while enjoying a unique position to record what the user is seeing, it suffers from important issues and technical challenges [2]. Images and videos recorded from this perspective are commonly referred to as First-Person Vision (FPV) or Egocentric videos [2].


Baidu AI Composer creates music inspired by art

#artificialintelligence

Baidu, the Chinese internet giant, has created a new AI program to explore the connection between art and music. The Baidu AI Composer creates original music inspired by different pieces of art, evoking the mood of each picture in a musical representation. According to the promotional video released by Baidu, the Baidu AI Composer uses image recognition, connected to the 'world's largest neural network', to identify the subject, mood, and even cultural signifiers of a piece of art. These are filtered through a matrix of hundreds of billions of samples and AI training features, using trillions of parameters, to create a complete and original piece of music inspired by the specific piece of art observed. The AI program first identifies elements of the picture – are people represented, or is the focus on nature, or objects, or is it an abstract piece? The AI is trained to extract attributes from labeled image data, assigning a mood to elements of the picture, for example, deciding if the overall tone is warm, upbeat, or melancholy.


Watch Room - An Artificial Intelligence Thriller

#artificialintelligence

With Watch Room, our goal is to contribute to the budding conversation around the promise and perils of Artificial Intelligence research, in a way that respects the complexities involved. As such, we've done our best to create a story that touches on everything from simulation theory, to brain emulation, to Roko's Basilisk... to that most hallowed of science fiction questions: "What makes us human?" Another goal of ours is to illustrate the possibilities within the realm of virtual reality. Of course, Watch Room's scientific roots drink deeply from rich dramatic soil. On one level, we're just plain old excited to make a film that's a joy to watch: smart and twisting in a way that respects the audience and keeps you guessing right up to the end.


IBM is making a music app that can create entirely new songs just for you

#artificialintelligence

IBM Watson wants to take your music to another level. Most people know Watson for its legendary performance on "Jeopardy!" But IBM's supercomputer has picked up a host of other skills since its 2011 trivia debut. Watson's artificial intelligence can help doctors diagnose cancer, help teach a graduate-level class, and even analyze characters in Harry Potter. And soon, IBM's Watson will be able to create entirely new music on a convenient app.


Is this the smartest drone yet? UAV that travels at more than 70mph has a supercomputer that lets it fly unaided

Daily Mail - Science & tech

From striking aerial photography to monitoring hard-to-reach areas as part of search and rescue missions, the list of tasks that drones can do is ever growing. While most drones require a driver to control their movements from the ground, one new model can fly itself unaided. The Teal quad drone contains a supercomputer which allows it to fly autonomously as well as recognise images. The Teal quad is the brainchild of founder George Matus, who, at only 18 years old, wanted to create a device that was 'fast, versatile, smart and break the limits of what drones could do.' Mr Matus told MailOnline: 'In regards to flight speed, it depends on the environment, altitude, wind, and battery life, so we say 70 mph to cover all of that.' While this high speed could be seen as dangerous, on its website Teal states: 'Teal is built on both the hardware side and software side to make it as safe and easy as possible to fly, while still allowing mind-blowing manoeuvrability and speed.'


Meet Jeff Walker, the man who brought Hollywood to Comic-Con

Los Angeles Times

In a galaxy known as New York, in a drab age before Trekkies and light sabers, there lived a curious boy who liked comic books, time travel and Elvis. He read Isaac Asimov and tuned in to John Zacherle, this Phantom of the Opera-type guy with scraggly hair and a creepy laugh who introduced horror movies on Channel 9, which, if you were a kid at the time, was something close to splendid. This boy, let's call him Jeffrey Walker, who incidentally would later watch "2001: A Space Odyssey" 29 times, tended toward the imaginative. His mother was a beauty queen, his father a clothier. Walker grew up to be many things, including an actor who played poker with Jesus Christ, shot a nude scene with Al Pacino and was hugged by Natalie Wood.