New Optimizations Improve Deep Learning Frameworks For CPUs

#artificialintelligence

Since most of us need more than a "machine learning only" server, I'll focus on the reality of how Intel Xeon SP Platinum processors remain the best choice for servers, including servers that need to do machine learning as part of their workload. Here is a partial rundown of key software that accelerates deep learning on Intel Xeon Platinum processors enough that the best performance advantage of GPUs is closer to 2X than to 100X. There is also a good article in Parallel Universe Magazine, Issue 28, starting on page 26, titled "Solving Real-World Machine Learning Problems with Intel Data Analytics Acceleration Library." High-core-count CPUs (the Intel Xeon Phi processors, in particular the upcoming "Knights Mill" version) and FPGAs (Intel Xeon processors coupled with Intel/Altera FPGAs) offer highly flexible options with excellent price/performance and power efficiency.
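The digest doesn't show any of this software in action. As a rough sketch of the kind of CPU-side tuning that Intel's MKL-backed framework builds depend on, here is how one might pin OpenMP threads and size TensorFlow's thread pools on a Xeon, using the current TensorFlow API; the thread counts and affinity settings below are illustrative assumptions, not Intel's recommendation for any particular SKU.

```python
import os

# Intel's OpenMP runtime reads these before TensorFlow is imported.
# The variables are real Intel OpenMP knobs; the values are examples.
os.environ["OMP_NUM_THREADS"] = "28"   # e.g. physical cores per socket
os.environ["KMP_AFFINITY"] = "granularity=fine,compact,1,0"
os.environ["KMP_BLOCKTIME"] = "1"      # ms a worker spins after finishing

import tensorflow as tf

# Match TensorFlow's own thread pools to the hardware.
tf.config.threading.set_intra_op_parallelism_threads(28)  # within one op
tf.config.threading.set_inter_op_parallelism_threads(2)   # concurrent ops
```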


Search for the fastest Deep Learning Framework supported by Keras

@machinelearnbot

Currently the official Keras release already supports Google's TensorFlow and Microsoft's CNTK deep learning libraries, besides other popular libraries like Theano. Keras also enables developers to quickly test relative performance across the supported deep learning frameworks, though MXNet is a partial exception: it doesn't yet support the newer Keras functions, so the scripts would have needed significant changes before running on MXNet. In a standard deep neural network test using the MNIST dataset, CNTK, TensorFlow and Theano achieve similar scores (2.5 – 2.7 s/epoch), but MXNet blows them out of the water with a 1.4 s/epoch time.
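The benchmark script itself isn't reproduced in the digest. Here is a minimal sketch of the kind of MNIST timing test described, assuming the multi-backend Keras that selects its engine from the KERAS_BACKEND environment variable; the model shape and epoch count are my choices, not the article's.

```python
import os
import time

# Switch to "cntk" or "theano" and rerun to compare s/epoch figures.
os.environ.setdefault("KERAS_BACKEND", "tensorflow")

from keras.datasets import mnist
from keras.layers import Dense
from keras.models import Sequential
from keras.utils import to_categorical

(x_train, y_train), _ = mnist.load_data()
x_train = x_train.reshape(60000, 784).astype("float32") / 255.0
y_train = to_categorical(y_train, 10)

model = Sequential([
    Dense(512, activation="relu", input_shape=(784,)),
    Dense(10, activation="softmax"),
])
model.compile(optimizer="rmsprop", loss="categorical_crossentropy")

epochs = 5
start = time.time()
model.fit(x_train, y_train, batch_size=128, epochs=epochs, verbose=0)
print("mean s/epoch:", (time.time() - start) / epochs)
```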


PyTorch or TensorFlow?

@machinelearnbot

PyTorch is essentially a GPU-enabled drop-in replacement for NumPy, equipped with higher-level functionality for building and training deep neural networks. In PyTorch the graph construction is dynamic, meaning the graph is built at run-time. TensorFlow does have dynamic_rnn for the more common constructs, but creating custom dynamic computations is more difficult. I haven't found the tools for data loading in TensorFlow (readers, queues, queue runners, etc.) especially useful.
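To make the dynamic-graph point concrete, here is a small PyTorch example (mine, not the article's) in which the depth of the computation is decided by a runtime value rather than declared up front:

```python
import torch

# PyTorch rebuilds the graph on every forward pass, so plain Python
# control flow can change the computation from run to run.
x = torch.randn(3, requires_grad=True)
y = x
# Data-dependent loop: how many multiply nodes end up in the graph
# depends on the runtime norm of y, not on a pre-declared structure.
while y.norm() < 10:
    y = y * 2
y.sum().backward()
print(x.grad)  # gradients flow back through however many steps ran
```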


Machine Learning using Spark and R - Dataconomy

#artificialintelligence

In Spark 1.x there was no support for accessing the Spark ML (machine learning) libraries from R, and the performance of R code on Spark was also considerably worse than could be achieved using, say, Scala. With Spark 2.1, however, we now have access to many of Spark's machine learning algorithms from SparkR. We're going to look at using machine learning to predict wine quality based on various characteristics of the wine. Spark is built for much larger datasets, but here we'll just convert our small wine data frame to a distributed data frame.
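The Dataconomy walkthrough is written in SparkR; since the examples in this digest are in Python, here is the analogous flow sketched in PySpark instead, assuming a local copy of the UCI red-wine CSV (semicolon-separated, with a numeric "quality" column) under the hypothetical name winequality-red.csv.

```python
from pyspark.sql import SparkSession
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.regression import LinearRegression

spark = SparkSession.builder.appName("wine-quality").getOrCreate()

# Read the small local CSV into a distributed data frame.
wine = spark.read.csv("winequality-red.csv", sep=";",
                      header=True, inferSchema=True)

# Spark ML estimators expect a single vector column of features.
features = [c for c in wine.columns if c != "quality"]
assembled = VectorAssembler(inputCols=features,
                            outputCol="features").transform(wine)

# Fit a simple regression of quality on the wine's characteristics.
model = LinearRegression(featuresCol="features",
                         labelCol="quality").fit(assembled)
print("RMSE:", model.summary.rootMeanSquaredError)
```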


Which GPU(s) to Get for Deep Learning

@machinelearnbot

With a good, solid GPU, one can quickly iterate over deep learning networks and run experiments in days instead of months, hours instead of days, minutes instead of hours. Later I ventured further down the road and developed a new 8-bit compression technique which enables you to parallelize dense or fully connected layers much more efficiently with model parallelism than 32-bit methods do. For example, if you have differently sized fully connected layers or dropout layers, the Xeon Phi is slower than the CPU. GPUs excel at problems that involve large amounts of memory due to their memory bandwidth.
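That last point about bandwidth is easy to sanity-check with back-of-envelope arithmetic. The hardware figures below are round illustrative numbers, not measurements of any specific card:

```python
# A fully connected layer y = Wx reads the whole weight matrix once per
# batch, so at small batch sizes its speed is set by memory bandwidth,
# not by raw FLOPs.
n_in, n_out, batch = 4096, 4096, 4
flops = 2 * n_in * n_out * batch    # multiply-adds in the matmul
bytes_moved = 4 * n_in * n_out      # fp32 weights streamed once

peak_flops = 10e12                  # ~10 TFLOP/s (illustrative GPU)
peak_bw = 700e9                     # ~700 GB/s memory bandwidth

print(f"compute-bound time:   {flops / peak_flops * 1e6:.1f} us")
print(f"bandwidth-bound time: {bytes_moved / peak_bw * 1e6:.1f} us")
# The memory term dominates here (~96 us vs ~13 us), i.e. the layer is
# bandwidth-bound, which is why high memory bandwidth pays off.
```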


OpenText brings AI-powered search to eDOCS - OpenText Blogs

#artificialintelligence

That's why we're announcing that we are bringing OpenText Decisiv Search to OpenText eDOCS as its default search engine. When a Decisiv user types "asia," the system instantly and automatically retrieves documents that (1) contain the word Asia and/or (2) are conceptually related to Asia. Machine learning makes such conceptual analysis automatic and highly scalable, and humans get the benefit. With this announcement, and the bundling of Decisiv into eDOCS DM as the index and search engine in 2018, companies will be able to quickly extract results from within the firm's eDOCS library or libraries.
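OpenText hasn't published how Decisiv works internally. Purely as an illustration of what embedding-based "conceptual" retrieval looks like, here is a toy sketch in which made-up word vectors stand in for a trained model; every name and number is hypothetical.

```python
import numpy as np

# Toy embeddings; a real system would learn these from the corpus.
embeddings = {
    "asia":    np.array([0.9, 0.1, 0.0]),
    "tokyo":   np.array([0.8, 0.2, 0.1]),   # conceptually near "asia"
    "invoice": np.array([0.0, 0.1, 0.9]),   # unrelated concept
}

def doc_vector(text):
    # Represent a document as the mean of its known word vectors.
    vecs = [embeddings[w] for w in text.lower().split() if w in embeddings]
    return np.mean(vecs, axis=0) if vecs else np.zeros(3)

def search(query, docs, threshold=0.7):
    q = doc_vector(query)
    hits = []
    for d in docs:
        v = doc_vector(d)
        sim = float(q @ v / (np.linalg.norm(q) * np.linalg.norm(v) + 1e-9))
        if sim > threshold:   # literal hits and near-concepts both pass
            hits.append((sim, d))
    return sorted(hits, reverse=True)

# "tokyo office lease" matches "asia" conceptually; the invoice doesn't.
print(search("asia", ["tokyo office lease", "invoice for services"]))
```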


IBM's Breakthrough Distributed Computation for Deep Learning Workloads

@machinelearnbot

"Deep learning is considered to be a subset, or a particular method, within this bigger term, which is machine learning," Sumit Gupta, IBM Cognitive Systems Vice-President of High Performance Computing and Data Analytics, told eWEEK. Gupta said IBM Research posted close to ideal scaling with its new distributed deep learning software that achieved record low communication overhead and 95 percent scaling efficiency on the open source Caffe deep learning framework over 256 GPUs in 64 IBM Power systems. Using this software, IBM Research achieved a new image recognition accuracy of 33.8 percent for a neural network trained on a very large data set (7.5 million images). But progress in accuracy and the practicality of deploying deep learning at scale is gated by technical challenges running massive deep learning based AI models, with training times measured in days and weeks, Gupta said.


Do-it-yourself NLP for bot developers – Rasa Blog – Medium

#artificialintelligence

In a previous post I mentioned that tools like wit and LUIS make intent classification and entity extraction so simple that you can build a bot like this during a hackathon. There are many ways we could combine word vectors to represent a sentence, but again we're going to do the simplest thing possible: add them up. Amazingly, this is enough to correctly generalise, and to pick out "indian" as a cuisine type, based on its similarity to the reference words. This is why I say that once you have a good change of variables, problems become easy.
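As a toy illustration of both ideas, spotting "indian" as a cuisine by similarity to reference words and summing word vectors into a sentence vector, here is a sketch with made-up three-dimensional vectors; the actual post uses pre-trained GloVe embeddings loaded via spaCy, so treat every number below as illustrative.

```python
import numpy as np

# Made-up toy vectors standing in for real pre-trained embeddings.
vec = {
    "chinese": np.array([1.0, 0.9, 0.1]),
    "mexican": np.array([0.9, 1.0, 0.2]),
    "indian":  np.array([0.95, 0.85, 0.15]),  # never seen in training data
    "i":       np.array([0.1, 0.0, 0.8]),
    "want":    np.array([0.0, 0.1, 0.9]),
    "food":    np.array([0.3, 0.2, 0.7]),
}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Entity extraction: a word counts as a cuisine type if it sits close
# to the reference cuisine words in vector space.
references = ["chinese", "mexican"]
for word in "i want indian food".split():
    score = max(cosine(vec[word], vec[r]) for r in references)
    if score > 0.95:
        print(word, "-> cuisine type, similarity", round(score, 3))

# Sentence representation: the simplest possible combination, add them up.
sentence = np.sum([vec[w] for w in "i want indian food".split()], axis=0)
print("sentence vector:", sentence)
```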