Rise of the Robots: From big data to artificial intelligence


Worldwide spending on artificial intelligence and big data will reach the tens of billions by 2025. Michael Finnigan finds out what family-run operations need to know about the rise of the robots. The county of Wiltshire in the United Kingdom might seem like an unlikely setting for one of the world's most advanced artificial intelligence (AI) laboratories. It is best known for its Neolithic monuments and iconic stone circles, most famously Stonehenge, but beneath its prehistoric landscape the future is unfolding. At family-run technology design firm Dyson, a team of engineers is using artificial intelligence to get a leg up on the competition.

Pre-Spark Summit Meetup in Dublin, Ireland


Since the creation of Apache Spark, I/O throughput has increased at a faster pace than processing speed. In a lot of big data applications, the bottleneck is increasingly the CPU. With the release of Apache Spark 2.0 and Project Tungsten, Spark runs a number of control operations close to the metal. At the same time, there has been a surge of interest in using GPUs (the Graphics Processing Units of video cards) for general purpose applications, and a number of frameworks have been proposed to do numerical computations on GPUs. In this talk, we will discuss how to combine Apache Spark with TensorFlow, a new framework from Google that provides building blocks for Machine Learning computations on GPUs.

Optimization of a SSP's Header Bidding Strategy using Thompson Sampling

arXiv.org Machine Learning

Over the last decade, digital media (web or app publishers) have generalized the use of real-time ad auctions to sell their ad spaces. Multiple auction platforms, also called Supply-Side Platforms (SSPs), were created. Because of this multiplicity, publishers started to create competition between SSPs. In this setting, there are two successive auctions: a second-price auction in each SSP and a secondary, first-price auction, called the header bidding auction, between SSPs. In this paper, we consider an SSP competing with other SSPs for ad spaces. The SSP acts as an intermediary between an advertiser wanting to buy ad spaces and a web publisher wanting to sell its ad spaces, and needs to define a bidding strategy to be able to deliver to the advertisers as many ads as possible while spending as little as possible. The revenue optimization of this SSP can be written as a contextual bandit problem, where the context consists of the information available about the ad opportunity, such as properties of the internet user or of the ad placement. Using classical multi-armed bandit strategies (such as the original versions of UCB and EXP3) is inefficient in this setting and yields a low convergence speed, as the arms are highly correlated. In this paper we design and experiment with a version of the Thompson Sampling algorithm that easily takes this correlation into account. We combine this Bayesian algorithm with a particle filter, which makes it possible to handle non-stationarity by sequentially estimating the distribution of the highest bid to beat in order to win an auction. We apply this methodology to two real auction datasets, and show that it significantly outperforms more classical approaches. The strategy defined in this paper is being developed to be deployed on thousands of publishers worldwide.
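The general idea of pairing Thompson Sampling with a particle filter can be illustrated with a small Python sketch. This is not the authors' implementation. It rests on simplifying assumptions that do not come from the abstract: there is no context, the highest competing bid is taken to be log-normal with a known spread `sigma` and an unknown location `mu`, the SSP values each ad space at a fixed `value`, and the only feedback is whether the bid won. All class, function, and parameter names are hypothetical. Because one shared parameter drives the win probability of every bid level, an outcome observed at one bid updates beliefs about all the others, which is one simple way of accounting for correlated arms.

```python
import math
import random


def lognormal_cdf(x, mu, sigma):
    """P(X <= x) for X ~ LogNormal(mu, sigma)."""
    if x <= 0:
        return 0.0
    return 0.5 * (1.0 + math.erf((math.log(x) - mu) / (sigma * math.sqrt(2.0))))


class ParticleThompsonBidder:
    """Thompson Sampling over a bid grid, with a particle posterior on mu.

    Assumption (not from the source): the highest competing bid follows
    LogNormal(mu, sigma) with known sigma and unknown mu.
    """

    def __init__(self, bid_grid, value, n_particles=300, sigma=0.5,
                 mu_range=(-2.0, 2.0), jitter=0.02, seed=0):
        self.rng = random.Random(seed)
        self.bid_grid = list(bid_grid)
        self.value = value
        self.sigma = sigma
        self.jitter = jitter
        # Each particle is one hypothesis about mu; weights start uniform.
        self.particles = [self.rng.uniform(*mu_range) for _ in range(n_particles)]
        self.weights = [1.0 / n_particles] * n_particles

    def choose_bid(self):
        # Thompson step: draw one hypothesis from the current posterior,
        # then bid as if that hypothesis were true (first-price payoff).
        mu = self.rng.choices(self.particles, weights=self.weights, k=1)[0]
        return max(self.bid_grid,
                   key=lambda b: (self.value - b) * lognormal_cdf(b, mu, self.sigma))

    def update(self, bid, won):
        # Reweight every particle by the likelihood of the win/loss outcome.
        new_weights = []
        for mu, w in zip(self.particles, self.weights):
            p_win = lognormal_cdf(bid, mu, self.sigma)
            new_weights.append(w * (p_win if won else 1.0 - p_win))
        total = sum(new_weights)
        n = len(self.particles)
        if total <= 0.0:
            # Degenerate case: fall back to uniform weights.
            self.weights = [1.0 / n] * n
            return
        self.weights = [w / total for w in new_weights]
        # Resample when the effective sample size drops below half; the
        # small jitter lets the particles follow a slowly drifting mu.
        ess = 1.0 / sum(w * w for w in self.weights)
        if ess < 0.5 * n:
            drawn = self.rng.choices(self.particles, weights=self.weights, k=n)
            self.particles = [mu + self.rng.gauss(0.0, self.jitter) for mu in drawn]
            self.weights = [1.0 / n] * n

    def posterior_mean(self):
        return sum(w * mu for mu, w in zip(self.particles, self.weights))


def simulate(rounds=1000, true_mu=0.0, seed=1):
    """Run the bidder against a synthetic, stationary stream of competing bids."""
    env = random.Random(seed)
    bidder = ParticleThompsonBidder(bid_grid=[0.25 * k for k in range(1, 17)],
                                    value=4.0)
    for _ in range(rounds):
        bid = bidder.choose_bid()
        highest_other = env.lognormvariate(true_mu, bidder.sigma)
        bidder.update(bid, won=bid >= highest_other)
    return bidder


if __name__ == "__main__":
    b = simulate()
    print("posterior mean of mu:", round(b.posterior_mean(), 3))
```

On synthetic data like this, the posterior mean of `mu` should settle near the simulated value. A fuller treatment would condition on the ad-opportunity context and use real auction logs, as the abstract describes.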

Upcoming Practical Data Science courses in London, Chicago, Zurich, Oslo and Stockholm


If you'd like to learn how to run R within Azure Machine Learning and SQL Server, you may be interested in these upcoming 4-day Practical Data Science courses, presented by Rafal Lukawiecki from Project Botticelli. In this classroom-based course, you will learn machine learning, data mining, some statistics, data preparation, and how to interpret the results. You will also learn how to formulate business questions in terms of data science hypotheses and experiments, and how to prepare inputs to answer those questions. Rafal will share his decade of hands-on experience while teaching you about Azure Machine Learning (Azure ML), which is the foundation of Cortana Analytics Suite, and its highly visual, on-premises companion, the SQL Server Analysis Services Data Mining engine, supplemented with the free Microsoft R Open and Microsoft R Server software. By the end of this course you will be able to plan and run data science projects.