Results


Natural Language Processing Library for Apache Spark – free to use

@machinelearnbot

Apache Spark is a general-purpose cluster computing framework, with native support for distributed SQL, streaming, graph processing, and machine learning. Now, the Spark ecosystem also has a Spark Natural Language Processing library. Get it on GitHub or begin with the quickstart tutorial. The John Snow Labs NLP Library is under the Apache 2.0 license, written in Scala with no dependencies on other NLP or ML libraries. It natively extends the Spark ML Pipeline API.


Top 10 deep learning Frameworks everyone should know

#artificialintelligence

This is the age of artificial intelligence. Machine Learning and predictive analytics are now established and integral to just about every modern business, but artificial intelligence expands the scale of what's possible within those fields. It's what makes deep learning possible. Systems with greater ostensible autonomy and complexity can solve similarly complex problems. If Deep Learning is able to solve more complex problems and perform tasks of greater sophistication, building such systems is naturally a bigger challenge for data scientists and engineers.


Optimizing Machine Learning with TensorFlow

#artificialintelligence

In our webinar "Optimizing Machine Learning with TensorFlow" we gave an overview of some of the impressive optimizations Intel has made to TensorFlow when using their hardware. You can find a link to the archived video here. During the webinar, Mohammad Ashraf Bhuiyan, Senior Software Engineer in Intel's Artificial Intelligence Group, and I spoke about some of the common use cases that require optimization as well as benchmarks demonstrating order-of-magnitude speed improvements when running on Intel hardware. TensorFlow, Google's library for machine learning (ML), has become the most popular machine learning library in a fast-growing ecosystem. This library has over 77k stars on GitHub and is widely used in a growing number of business-critical applications.


Building a natural language processing library for Apache Spark

@machinelearnbot

Check out David Talby's tutorial "Natural language understanding at scale with spaCy and Spark NLP" at the Strata Data Conference in San Jose, March 5-8, 2018. When I first discovered and started using Apache Spark, a majority of the use cases I used it for involved unstructured text.


spotify/annoy

@machinelearnbot

Annoy (Approximate Nearest Neighbors Oh Yeah) is a C++ library with Python bindings to search for points in space that are close to a given query point. It also creates large read-only file-based data structures that are mmapped into memory so that many processes may share the same data. To install, simply do sudo pip install annoy to pull down the latest version from PyPI. There are some other libraries to do nearest neighbor search. Annoy is almost as fast as the fastest libraries (see below), but there is actually another feature that really sets Annoy apart: it has the ability to use static files as indexes.
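
The idea of a static, memory-mapped index file can be illustrated with a minimal standard-library sketch. This is not Annoy's file format or its approximate search algorithm: it performs an exact brute-force scan, and the names `write_index`, `nearest`, `DIM`, and `RECORD` are hypothetical, chosen only for this illustration. What it does show is a read-only file of fixed-size vector records being mapped into memory and queried without loading it into Python objects first.

```python
import math
import mmap
import os
import struct
import tempfile

DIM = 3  # vector dimensionality (illustrative assumption)
RECORD = struct.Struct(f"<{DIM}f")  # one little-endian float32 vector per record


def write_index(path, vectors):
    """Write vectors to a flat binary file, one fixed-size record each."""
    with open(path, "wb") as fh:
        for vec in vectors:
            fh.write(RECORD.pack(*vec))


def nearest(path, query, k=1):
    """Return indices of the k stored vectors closest to query (Euclidean)."""
    with open(path, "rb") as fh:
        # Read-only mapping: the OS can share these pages between processes.
        with mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            count = len(mm) // RECORD.size
            scored = []
            for i in range(count):
                vec = RECORD.unpack_from(mm, i * RECORD.size)
                scored.append((math.dist(vec, query), i))
    # Exact brute-force ranking; Annoy itself uses an approximate method.
    return [i for _, i in sorted(scored)[:k]]


points = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0), (5.0, 5.0, 5.0)]
index_path = os.path.join(tempfile.mkdtemp(), "points.idx")
write_index(index_path, points)
print(nearest(index_path, (0.9, 1.1, 1.0), k=2))  # prints [1, 0]
```

Because the file is only ever read after it is written, any number of processes could map the same path at once, which is the property the snippet above attributes to Annoy's indexes.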


Phones don't need an NPU to benefit from machine learning

#artificialintelligence

Neural Networks and Machine Learning are some of this year's biggest buzzwords in the world of smartphone processors. Huawei's HiSilicon Kirin 970, Apple's A11 Bionic, and the image processing unit (IPU) inside the Google Pixel 2 all boast dedicated hardware support for this emerging technology. The trend so far has suggested that machine learning requires a dedicated piece of hardware, like a Neural Processing Unit (NPU), IPU, or "Neural Engine", as Apple would call it. However, the reality is that these are all just fancy names for custom digital signal processors (DSPs), that is, hardware specialized in performing complex mathematical functions quickly. Today's latest custom silicon has been specifically optimized around machine learning and neural network operations, the most common of which include dot products and matrix multiplication.
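
To make the two operations named above concrete, here is a minimal pure-Python sketch of a dot product, a matrix multiply built from it, and one fully connected neural network layer that uses both. The function names and the ReLU activation are assumptions made for illustration; real mobile hardware runs heavily optimized, often low-precision versions of these loops rather than anything like this code.

```python
def dot(u, v):
    """Dot product: sum of element-wise products of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(u, v))


def matmul(a, b):
    """Matrix multiply: entry (i, j) is the dot product of row i of a and column j of b."""
    cols_b = list(zip(*b))  # transpose b so its columns can be iterated
    return [[dot(row, col) for col in cols_b] for row in a]


def dense_layer(inputs, weights, biases):
    """One fully connected layer with ReLU: max(0, inputs . weights + bias)."""
    (pre,) = matmul([inputs], weights)  # treat the input as a 1 x n matrix
    return [max(0.0, z + bias) for z, bias in zip(pre, biases)]


inputs = [1.0, 2.0]
weights = [[0.5, -1.0], [0.25, 1.0]]  # shape: 2 inputs x 2 outputs
biases = [0.0, -2.0]
print(dense_layer(inputs, weights, biases))  # prints [1.0, 0.0]
```

Almost all of the work in `dense_layer` is multiply-and-add, which is why hardware that accelerates exactly that pattern helps neural network workloads.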


Intel's New Processors: A Machine-learning Perspective - insideBIGDATA

@machinelearnbot

Machine learning and its younger sibling deep learning continue to accelerate, increasing the value of enterprise data assets across a variety of problem domains. A recent talk by Dr. Amitai Armon, Chief Data Scientist of Intel's Advanced Analytics department, at the O'Reilly Artificial Intelligence conference in New York on September 27, 2016, focused on the use of Intel's new server processors for various machine learning tasks as well as considerations in choosing and matching processors for specific machine learning tasks. Intel formed a machine learning task force with a mission to determine how the company can advance the machine learning domain. The vast majority of machine learning code today runs on Intel servers, but the company wanted to do even better for present and future use cases. "We need to understand the needs for these domains and prepare processors for those needs," said Dr. Amitai Armon.


TensorFlow (GPU) Setup for Developers – Michael Ramos – Medium

@machinelearnbot

This probably isn't for the professional data scientists or anyone creating actual models -- I imagine their setups are a bit more verbose. This blog post will cover my manual process for setting up TensorFlow with GPU support. I've spent hours reading posts and going through walkthroughs… and learned a ton from them… so I pieced together this installation guide, which I've been routinely using since (should have a CloudFormation script soon). This installation guide is for simple/default configurations and settings. These were chosen specifically for what we want to do, which is to run intense computations on the GPU.


Introducing the Natural Language Processing Library for Apache Spark - The Databricks Blog

@machinelearnbot

This is a community blog and effort from the engineering team at John Snow Labs, explaining their contribution to an open-source Apache Spark Natural Language Processing (NLP) library. Apache Spark is a general-purpose cluster computing framework, with native support for distributed SQL, streaming, graph processing, and machine learning. Now, the Spark ecosystem also has a Spark Natural Language Processing library. Get it on GitHub or begin with the quickstart tutorial. The John Snow Labs NLP Library is under the Apache 2.0 license, written in Scala with no dependencies on other NLP or ML libraries.