Classifying and visualizing with fastText and tSNE


Previously I wrote a three-part series on classifying text, in which I walked through the creation of a text classifier from the bottom up. It was interesting but it was purely an academic exercise. Here I'm going to use methods suitable for scaling up to large datasets, preferring tools written by others to those written by myself. The end goal is the same: classifying and visualizing relationships between blocks of text. I'm thinking of the classifier as a different representation of the block of text, so (1) and (2) are similar.

TensorFlow for Short-Term Stocks Prediction


In machine learning, a convolutional neural network (CNN, or ConvNet) is a class of neural networks that has been applied successfully to image recognition and analysis. In this project I apply this class of models to stock market prediction, combining stock prices with sentiment analysis. The network is implemented in TensorFlow, starting from the online tutorial. In this article, I will describe the following steps: dataset creation, CNN training, and evaluation of the model. This section briefly describes the procedure used to build the dataset, the data sources, and the sentiment analysis performed.

Modeling ensembles: making predictive analytics more accurate


Selecting the appropriate target audience effectively is a key to fulfilling those goals. With the wider availability of data sources such as social media and web analytics, marketers are moving more toward using predictive analytics to help separate likely respondents from non-respondents. The adoption of machine learning can be demonstrated in many ways, but one of the most dramatic statistics comes from Google search query volumes. Interest in "machine learning" has doubled in just the past four years, leaping from an index of 45 to 100 as of April 2015 (source: Google Trends). Similarly, it is estimated that marketing departments will outspend traditional IT departments in total budget expenditures for the technology to handle these demands.

Modern copyright law can't keep pace with thinking machines


This past April, engineer Alex Reben developed and posted to YouTube "Deeply Artificial Trees", an art piece powered by machine learning that leveraged old Joy of Painting videos. It generates gibberish audio in the speaking style and tone of Bob Ross, the show's host. Bob Ross' estate was not amused, subsequently issuing a DMCA takedown request and having the video knocked offline until very recently.

Is the Stanford Rare Word Similarity dataset a reliable evaluation benchmark?


Rare word representation is one of the active areas in lexical semantics; it deals with inducing embeddings for rare and unseen words (those with no or very few occurrences in the training corpus). Since its creation, the Stanford Rare Word (RW) Similarity dataset has been regarded as a standard evaluation benchmark for rare word representation techniques. The dataset has 2034 word pairs, selected to reflect words with low occurrence frequency in Wikipedia and rated on a similarity scale of [0, 10]. Created by Minh-Thang Luong, Richard Socher, and Christopher D. Manning (2013), the RW dataset is one of the many recent word similarity datasets that acquire their similarity judgements from crowdsourcing. In this case, (Amazon Mechanical) Turkers have provided up to ten scores for each word pair.
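As a rough illustration of how crowdsourced ratings like these might be inspected, the sketch below summarizes the scores for a single word pair by their mean and spread. It assumes each pair carries a list of up to ten ratings in [0, 10]; the function name `summarize_pair` and the example ratings are hypothetical and are not taken from the RW dataset or its authors' tooling.

```python
from statistics import mean, pstdev


def summarize_pair(scores):
    """Return (mean, population standard deviation) of one pair's ratings."""
    if not scores:
        raise ValueError("need at least one rating")
    if any(s < 0 or s > 10 for s in scores):
        raise ValueError("ratings must lie in [0, 10]")
    return mean(scores), pstdev(scores)


# Hypothetical ratings, invented for illustration only.
ratings = {
    ("word_a", "word_b"): [7, 8, 7, 9, 8],   # annotators largely agree
    ("word_c", "word_d"): [0, 10, 2, 9, 5],  # annotators disagree strongly
}

for pair, scores in ratings.items():
    avg, spread = summarize_pair(scores)
    print(pair, round(avg, 2), round(spread, 2))
```

A large spread for a pair would suggest that annotators disagreed about it, which is one simple signal to look at when asking how reliable a crowdsourced benchmark is.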



Though nearly every industry is finding applications for machine learning--the artificial intelligence technology that feeds on data to automatically discover patterns and anomalies and make predictions--most companies are not yet taking advantage. However, five vectors of progress are making it easier, faster, and cheaper to deploy machine learning and could eventually help to bring the technology into the mainstream. With barriers to use beginning to fall, every enterprise can begin exploring applications of this transformative technology. Machine learning is one of the most powerful and versatile information technologies available today. But most companies have not begun to put it to use.

A Wearable Chip to Predict Seizures


One of the toughest aspects of having epilepsy is not knowing when the next seizure will strike. A wearable warning system that detects pre-seizure brain activity and alerts people of its onset could alleviate some of that stress and make the disorder more manageable. To that end, IBM researchers say they have developed a portable chip that can do the job; they described their invention today in the Lancet's open access journal eBioMedicine. The scientists built the system on a mountain of brainwave data collected from epilepsy patients. The dataset, reported by a separate group in 2013, included over 16 years of continuous electroencephalography (EEG) recordings of brain activity, and thousands of seizures, from patients who had had electrodes surgically implanted in their brains.



Artificial intelligence will be one of the key drivers of the economic growth in the next few years. But what will drive the AI industry itself? Some consider AI technologies a secret weapon of a few high-paid engineers. In fact, the success of an AI solution is mainly defined by the low-paid workers in developing countries. By 2025, AI technologies and AI-driven services will become a nearly $60 billion market -- $59.75 billion in Tractica's view, an increase from less than $1.38 billion in 2016.
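To put the two market figures in perspective, the implied compound annual growth rate can be worked out with a few lines of arithmetic. This is only a sketch: the helper `cagr` is a hypothetical name, and it treats $1.38 billion as the 2016 starting value even though the text says "less than", so the result is best read as a lower bound on the implied rate.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values over a number of years."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


# Figures quoted in the text, in billions of dollars (2016 -> 2025).
rate = cagr(1.38, 59.75, 2025 - 2016)
print(f"Implied annual growth: {rate:.1%}")
```

Under these assumptions the forecast corresponds to growth of a little over 50% per year, sustained for nine years.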

Image classification with Keras and deep learning - PyImageSearch


The Christmas season holds a special place in my heart. Not because I'm particularly religious or spiritual. Not because I enjoy cold weather. And certainly not because I relish the taste of eggnog (the consistency alone makes my stomach turn). Instead, Christmas means a lot to me because of my dad.

New AI That Makes Fake Videos May Be the End of Reality as We Know It


A new artificial intelligence (AI) algorithm is capable of manufacturing simulated video imagery that is indiscernible from reality, say researchers at Nvidia, a California-based tech company. AI developers at the company have released details of a new project that allows its AI to generate fake videos using only minimal raw input data. The technology can render a flawlessly realistic sequence showing what a sunny street looks like when it's raining, for example, as well as what a cat or dog looks like as a different breed or even a person's face with a different facial expression. And this is video -- not photo. For their work, researchers tweaked a familiar algorithm, known as a generative adversarial network (GAN), to allow their AI to create fresh visual data.