Deep Learning
Artificial intelligence machine out-plays gamers in video game - The Tartan
When it comes to gaming, Carnegie Mellon students are usually the ones beating the computer. But this time, the computer beat the students. Carnegie Mellon University computer science students, Devendra Chaplot and Guillaume Lample, recently made an artificial intelligence (AI) agent in the video game Doom that outplays computer-generated agents and human gamers. They accomplished this by applying deep-learning techniques that taught their agent, Arnold, to manipulate the game's 3-D design. "The work is purely a result of our passion for artificial intelligence and video games," Chaplot said.
Guest Opinion: Time to go Deep on Deep Learning
A core difference between Machine Learning and Deep Learning is in the feature selection process, which is the function by which data is chosen in creating a predictive model. In Machine Learning, domain expertise is required to code the inputs used to build a model. For example, let's say you are building a model for facial recognition. You might start by determining where the eyes, nose and mouth are located. Doing this is something we humans can do very easily; however, for a machine, it's not so simple.
[Discussion] Advice on a Deep Learning Mobile Workstation/Laptop • /r/MachineLearning
Hi Reddit Machine Learning Community, I don't know if this question suits this subreddit, so apologies for my ignorance. I am planning to get my feet wet in deep learning and eventually use Amazon/Azure for data processing. But until then I want to experiment with introductory Kaggle datasets like Titanic, MNIST, etc., and I am planning to buy a gaming laptop for that. I am not in a position to buy a desktop as I don't have the space for it (currently sharing a room with roommates) and I want my workstation to be mobile. Can anyone provide any suggestions or opinions?
Machine Learning Top 10 in September
In this observation, we ranked nearly 1,900 articles posted in September 2016 about machine learning, deep learning and AI. Many people these days claim they use machine learning, especially companies that are trying to sell the vision of AI (including Mybridge). Mybridge AI evaluates the quality of content and ranks the best articles for professionals. This list is competitive and carefully includes quality content for you to read. You may find this condensed list useful in learning and working more productively in the field of machine learning.
Unsupervised Learning for Physical Interaction through Video Prediction
Finn, Chelsea, Goodfellow, Ian, Levine, Sergey
A core challenge for an agent learning to interact with the world is to predict how its actions affect objects in its environment. Many existing methods for learning the dynamics of physical interactions require labeled object information. However, to scale real-world interaction learning to a variety of scenes and objects, acquiring labeled data becomes increasingly impractical. To learn about physical object motion without labels, we develop an action-conditioned video prediction model that explicitly models pixel motion, by predicting a distribution over pixel motion from previous frames. Because our model explicitly predicts motion, it is partially invariant to object appearance, enabling it to generalize to previously unseen objects. To explore video prediction for real-world interactive agents, we also introduce a dataset of 59,000 robot interactions involving pushing motions, including a test set with novel objects. In this dataset, accurate prediction of videos conditioned on the robot's future actions amounts to learning a "visual imagination" of different futures based on different courses of action. Our experiments show that our proposed method produces more accurate video predictions both quantitatively and qualitatively, when compared to prior methods.
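To make "predicting a distribution over pixel motion" concrete, here is a toy sketch in Python. It is not the authors' network: the function name `apply_motion_distribution`, the 1-D frame, and the hand-written probabilities are all assumptions made for illustration. In the paper the distribution would be predicted by a learned model conditioned on the robot's action; here it is simply given.

```python
def apply_motion_distribution(frame, motion_probs):
    """Predict the next frame as an expectation over pixel shifts.

    frame        -- list of pixel intensities (1-D for simplicity)
    motion_probs -- dict mapping an integer shift to its probability
                    (hypothetical input; assumed to sum to 1)
    """
    n = len(frame)
    predicted = [0.0] * n
    for shift, prob in motion_probs.items():
        for i in range(n):
            src = i - shift  # pixel that would land on position i
            if 0 <= src < n:
                predicted[i] += prob * frame[src]
    return predicted


# A single bright pixel that stays put with probability 0.25
# and moves one step to the right with probability 0.75.
frame = [0.0, 1.0, 0.0, 0.0]
predicted = apply_motion_distribution(frame, {0: 0.25, 1: 0.75})
print(predicted)
```

Because the output is built by moving existing pixels rather than generating new ones, the same motion distribution applies to any object appearance, which is the intuition behind the abstract's claim of partial invariance to appearance.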
SR-Clustering: Semantic Regularized Clustering for Egocentric Photo Streams Segmentation
Dimiccoli, Mariella, Bolaños, Marc, Talavera, Estefania, Aghaei, Maedeh, Nikolov, Stavri G., Radeva, Petia
While wearable cameras are becoming increasingly popular, locating relevant information in large unstructured collections of egocentric images is still a tedious and time-consuming process. This paper addresses the problem of organizing egocentric photo streams acquired by a wearable camera into semantically meaningful segments, hence making an important step towards the goal of automatically annotating these photos for browsing and retrieval. In the proposed method, first, contextual and semantic information is extracted for each image by employing a Convolutional Neural Network approach. Later, a vocabulary of concepts is defined in a semantic space by relying on linguistic information. Finally, by exploiting the temporal coherence of concepts in photo streams, images which share contextual and semantic attributes are grouped together. The resulting temporal segmentation is particularly suited for further analysis, ranging from event recognition to semantic indexing and summarization. Experimental results over an egocentric set of nearly 31,000 images show the prominence of the proposed approach over state-of-the-art methods. Keywords: temporal segmentation, egocentric vision, photo streams clustering. 1. Introduction. Among the advances in wearable technology during the last few years, wearable cameras specifically have gained more popularity [5].
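The grouping step in the abstract above (consecutive images that share attributes end up in the same segment) can be illustrated with a much simpler stand-in. The sketch below is not SR-Clustering: it only cuts the stream where the cosine similarity between neighbouring feature vectors drops below a threshold. The names `segment_stream` and `threshold`, the 0.8 default, and the two-dimensional feature vectors are assumptions for illustration; real features would come from a CNN.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def segment_stream(features, threshold=0.8):
    """Split a photo stream into runs of consecutive, similar images.

    features  -- per-image feature vectors, in temporal order
    threshold -- hypothetical similarity cut-off; below it a new segment starts
    Returns a list of segments, each a list of image indices.
    """
    if not features:
        return []
    segments = [[0]]
    for i in range(1, len(features)):
        if cosine(features[i - 1], features[i]) >= threshold:
            segments[-1].append(i)   # same event continues
        else:
            segments.append([i])     # similarity dropped: start a new segment
    return segments


# Four images: the first two look alike, then the scene changes.
stream = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
segments = segment_stream(stream)
print(segments)
```

A pure threshold like this is sensitive to single noisy frames, which is one reason a method would combine temporal coherence with clustering over semantic concepts, as the abstract describes.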
X-CNN: Cross-modal Convolutional Neural Networks for Sparse Datasets
Veličković, Petar, Wang, Duo, Lane, Nicholas D., Liò, Pietro
In this paper we propose cross-modal convolutional neural networks (X-CNNs), a novel biologically inspired type of CNN architecture, treating gradient descent-specialised CNNs as individual units of processing in a larger-scale network topology, while allowing for unconstrained information flow and/or weight sharing between analogous hidden layers of the network, thus generalising the already well-established concept of neural network ensembles (where information typically may flow only between the output layers of the individual networks). The constituent networks are individually designed to learn the output function on their own subset of the input data, after which cross-connections between them are introduced after each pooling operation to periodically allow for information exchange between them. This injection of knowledge into a model (by prior partition of the input data through domain knowledge or unsupervised methods) is expected to yield greatest returns in sparse data environments, which are typically less suitable for training CNNs. For evaluation purposes, we have compared a standard four-layer CNN as well as a sophisticated FitNet4 architecture against their cross-modal variants on the CIFAR-10 and CIFAR-100 datasets with differing percentages of the training data being removed, and find that at lower levels of data availability, the X-CNNs significantly outperform their baselines (typically providing a 2-6% benefit, depending on the dataset size and whether data augmentation is used), while still maintaining an edge on all of the full dataset tests.
Honeypot Turing Test
Honeypot design and deployment is a tradeoff between realism and simplicity; this tradeoff can be characterized as the difference between high and low interaction honeypots. A realistic design could use an actual operating system instrumented to detect and capture intruders (known as a high interaction honeypot). However, the detection would be greatly complicated, because it is difficult to distinguish between normal traffic on the system and the attacker's. It is a low signal to noise detection problem due to the complexity of modern operating systems running hundreds of threads generating large volumes of traffic with complex signatures. A honeypot that is designed only to superficially mimic an OS (low interaction honeypot) can easily detect the attacker's actions, since there is no background noise.
Short Term Memory Boosts Google Learning AI
Google has tweaked its "deep learning" AI to use an external memory bank. It's an attempt to replicate the way human brains use short term memory to simplify reasoning. The company demonstrated the approach by having the system teach itself the London Underground (subway) map and figure out the quickest route between stops. It's a simple task to humans, but the process – which involves comparing multiple branching options with 270 stops over 11 lines – is exactly the type of problem that poses a challenge to artificial intelligence. Because the system was allowed temporary access to stored memory, it was able to more effectively process and categorize the possible routes without having to start from scratch each time. That's similar to how a human brain could use short term memory to filter down all the possible routes by ruling out every one that involves travelling in a particular direction from a specified stop, repeating the process until only the optimum answer remained.
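For reference, the route-finding task itself has a classical solution. The Python sketch below uses breadth-first search to find a route with the fewest stops. It is not Google's memory-augmented network, which has to learn the task from examples, and `toy_map` with its stop names A to E is a made-up graph, not the real Underground map.

```python
from collections import deque


def fewest_stops_route(graph, start, goal):
    """Breadth-first search over an adjacency-list graph.

    Returns a list of stops from start to goal with the fewest hops,
    or None if the goal cannot be reached.
    """
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        station = path[-1]
        if station == goal:
            return path
        for neighbour in graph.get(station, ()):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(path + [neighbour])
    return None


# Hypothetical five-stop network; each key lists its adjacent stops.
toy_map = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}
route = fewest_stops_route(toy_map, "A", "E")
print(route)
```

The `visited` set plays a role loosely comparable to the memory described above: it records what has already been explored so the search never starts over from scratch.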
[discussion] Auto Grouping of Faces using deep learning • /r/MachineLearning
I am a 3rd year CS student and I am trying to create an application which will auto-group photos of the same person like Google Photos does. I understand that I can use methods such as k-means, but that would be too slow. Is there a deep learning alternative I can use? I have alright knowledge of basic neural networks and CNNs, having done the cs231n course. But all of the methods they discuss are supervised ones... can someone point me in the right direction?