700 SQL Queries per Second in Apache Spark with FiloDB


Apache Spark is increasingly seen as the new jack-of-all-trades distributed platform for big data crunching: traditional MapReduce-style batch jobs, streaming, graph computation, statistics, and machine learning, all in one package. Aside from Spark Streaming and its micro-batches, though, Spark is aimed mostly at higher-latency, rich and complex analytics workloads. What about using Spark as an embedded, web-speed, low-latency query engine? This post dives into using Apache Spark for low-latency, high-concurrency reporting, dashboard, and SQL-style applications - up to hundreds of queries a second!

Launching a Spark application on a cluster, or even on localhost, carries a pretty high overhead.
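One way around that overhead is to pay the startup cost once: keep a single long-lived SparkSession, cache the dataset in memory, and fire many SQL queries at it concurrently. The sketch below illustrates that pattern with plain Spark SQL (Spark 2.x); the dataset path, table name, and column names are hypothetical, and it reads Parquet as a stand-in, whereas the post's setup would load data through FiloDB's Spark data source instead.

```scala
import org.apache.spark.sql.SparkSession
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object EmbeddedQueryServer {
  def main(args: Array[String]): Unit = {
    // Build one long-lived SparkSession: the startup cost is paid once,
    // not once per query.
    val spark = SparkSession.builder()
      .appName("embedded-query-server")
      .master("local[*]")   // or point at a long-running cluster
      .getOrCreate()

    // Hypothetical dataset; with FiloDB you would load via its Spark
    // data source rather than Parquet.
    val df = spark.read.parquet("/data/events")
    df.createOrReplaceTempView("events")
    spark.sql("CACHE TABLE events")   // keep the data in memory for low latency

    // Serve many concurrent queries against the same session;
    // SparkSession is safe to share across threads for queries.
    val queries = (1 to 100).map { _ =>
      Future {
        spark.sql(
          "SELECT actor, COUNT(*) AS c FROM events GROUP BY actor ORDER BY c DESC LIMIT 10"
        ).collect()
      }
    }
    Await.result(Future.sequence(queries), 5.minutes)

    spark.stop()
  }
}
```

With the session and cached table held in memory, each query skips application launch, scheduling of a new driver, and data loading, which is what makes query rates in the hundreds per second plausible.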