Spark Platform Engineer - San Francisco. Posted Jan 20, 2017 - Requisition No. 56588. The Spark Platform team is building low-latency, distributed analytics infrastructure for Bloomberg based on Apache Spark. Instead of building dozens of isolated Spark applications for individual problem domains, we are building a platform that makes it easy for teams to plug in their business logic without duplicating common functionality. That common functionality includes connecting to various datastores or real-time streams, figuring out a way to serverize Spark transforms, and re-implementing transforms such as currency conversion that are very common in financial analytics. But we can't just use Apache Spark as is. We need to enhance open source Spark to fit our low-latency, high-throughput, and high-availability contexts.
With Early Release ebooks, you get books in their earliest form--the author's raw and unedited content as he or she writes--so you can take advantage of these technologies long before the official release of these titles. You'll also receive updates when significant changes are made, new chapters are available, and the final ebook bundle is released. If you've successfully used Apache Spark to solve medium-sized problems, but still struggle to realize the "Spark promise" of unparalleled performance on big data, this book is for you. High Performance Spark shows you how to take advantage of Spark at scale, so you can grow beyond the novice level.
One of the biggest players in the eCommerce world is going through a massive data & analytics transformation and introducing some world-class analytics techniques using cutting-edge real-time/streaming tools (e.g. ...). This is an awesome opportunity to join an exciting data innovations team - your platform will help to generate advanced commercial insight through machine learning and recommendation systems. The Data Science Engineer will likely be a Big Data Engineer with stats/analytics experience, or a Data Scientist with a Computer Science/programming background. Please register your interest by sending your CV via the Apply link on this page. If you decide this role isn't the one for you, please contact Jethro Willett at Harnham - the Big Data market has never been busier and we're working on 20 live roles.
Join us at Spark Summit to hear more about new functionalities of Apache Spark. Use the code Databricks20 to receive a 20% discount! As many data scientists and engineers can attest, the majority of the time is spent not on the models themselves but on the supporting infrastructure. Key issues include the ability to easily visualize, share, deploy, and schedule jobs. More disconcerting is the need for data engineers to re-implement the models developed by data scientists for production.