Data Science

BHP lifts lid on major data science project


BHP is applying data science to understand how it services machines located across its mines, in the hope of saving $79 million this financial year alone. The miner revealed plans late last year to set up a maintenance centre of excellence (MCoE) based out of Brisbane. The MCoE will standardise maintenance systems and processes for BHP's worldwide operations, replacing the previous model of having 40 different maintenance organisations globally, each with its own way of working. One of the keys to the MCoE model is its reliance on data science techniques, such as machine learning, to understand how maintenance is performed at each site and where improvements can be made. Like other projects since BHP relaunched its technology function at the start of this year, the idea behind the MCoE is to create repeatable processes for its business operations across the world.

AVORA secures €1.7 million to bring machine-learning powered data intelligence to businesses


AVORA, a London-based company that delivers next-generation Business Intelligence (BI) and machine learning as a service, announces that it has raised €1.7 million in funding from institutional and angel investors. New investor Crane Venture Partners joins angel investors Peter Simon, founder of retailer Monsoon, and Steve Garnett, former chairman of Salesforce EMEA. According to analyst firm Gartner, through 2017, a full 60% of big data projects will fail to go beyond piloting and experimentation, and will be abandoned. AVORA was founded by serial entrepreneur Ricky Thomas, who previously established and sold two online companies – DatingUK and PetMeds. After experiencing data challenges firsthand while running those businesses, Thomas developed AVORA, a Software as a Service solution that redefines how companies get value from their data.

Call centers leveraging artificial intelligence


There are several forms of artificial intelligence. The aspects of greatest relevance to the call center manager are Natural Language Processing and Speech Recognition; these provide the basis for platforms that allow for business-to-customer or business-to-business interactions through the call center model. According to Erni Medeovic, a Technical Architect for the Patent Transformation Project, the past five years have seen a steady rise in the use of artificial intelligence in the call center model. This includes analyzing big data sets and making use of predictive analytics, so that automated and personalized customer services can be improved.

Interpreting big data

With these key metrics, artificial intelligence technology can be used to interpret big data to identify customer browsing patterns, purchase history, recent access to customer devices, and most-visited webpages.

AI and HPC: Inferencing, Platforms & Infrastructure


This feature continues our series of articles surveying the landscape of HPC and AI. This post focuses on inferencing, platforms, and infrastructure at the convergence of HPC and AI. Inferencing is the operation that makes data-derived models valuable, because they can predict the future and perform recognition tasks better than humans. Inferencing works because once the model is trained (meaning the bumpy surface has been fitted), the ANN can interpolate between known points on the surface to correctly make predictions for data points it has never seen before, meaning points that were not in the original training data. Without getting too technical: during inferencing, ANNs perform this interpolation on a nonlinear (bumpy) surface, which means they can outperform the straight-line interpolation a conventional linear method provides.
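As an illustration of inference as interpolation, here is a minimal sketch (the network shape and weights are invented for illustration, not taken from the article): inference is a single forward pass through fixed, pre-trained weights, and the nonlinear activation is what lets the fitted surface bend between training points.

```python
import numpy as np

def forward(x, layers):
    # Inference is just a forward pass through fixed, pre-trained weights;
    # no gradients or weight updates happen at this stage.
    for i, (w, b) in enumerate(layers):
        x = x @ w + b
        if i < len(layers) - 1:
            x = np.tanh(x)  # the nonlinearity makes the fitted surface "bumpy"
    return x

# Toy 1-4-1 network with hand-picked weights (purely illustrative).
layers = [
    (np.full((1, 4), 0.5), np.zeros(4)),
    (np.ones((4, 1)), np.zeros(1)),
]
y = forward(np.array([[0.2]]), layers)  # a point not in any training set
```

A linear model could only draw a straight line through its training points; with the `tanh` layer removed, the sketch above collapses to exactly that.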

With IBM POWER9, we're all riding the AI wave - IBM Systems Blog: In the Making


There's a big connection between my love for water sports and hardware design -- both involve observing waves and planning several moves ahead. Four years ago, when we started sketching the POWER9 chip from scratch, we saw an upsurge of modern workloads driven by artificial intelligence and massive data sets. We are now ready to ride this new tide of computing with POWER9. It is a transformational architecture and an evolutionary shift from the archaic ways of computing promoted by x86. POWER9 is loaded with industry-leading new technologies designed for AI to thrive.

ROC curves and Area Under the Curve explained (video)


While competing in a Kaggle competition this summer, I came across a simple visualization (created by a fellow competitor) that helped me to gain a better intuitive understanding of ROC curves and Area Under the Curve (AUC). I created a video explaining this visualization to serve as a learning aid for my Data Science students, and decided to share it publicly to help others understand this complex topic. An ROC curve is the most commonly used way to visualize the performance of a binary classifier, and AUC is (arguably) the best way to summarize its performance in a single number. As such, gaining a deep understanding of ROC curves and AUC is beneficial for data scientists, machine learning practitioners, and medical researchers (among others). The 14-minute video is embedded below, followed by the complete transcript (including graphics).
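The core intuition from the visualization, that AUC equals the probability that a randomly chosen positive example is scored higher than a randomly chosen negative one, can be sketched directly (a toy implementation for illustration, not the code used in the video):

```python
import numpy as np

def roc_auc(y_true, y_score):
    # AUC = P(score of random positive > score of random negative),
    # computed here by brute-force pairwise comparison; ties count half.
    pos = y_score[y_true == 1]
    neg = y_score[y_true == 0]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

y_true = np.array([0, 0, 1, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8])
auc = roc_auc(y_true, y_score)  # → 0.75 (3 of 4 pos/neg pairs ranked correctly)
```

This pairwise formulation gives the same number as integrating the area under the ROC curve, which is why AUC summarizes the whole curve in a single, threshold-free value.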

Context Levels in Data Science Solutioning in the Real World


A data-science-based solution needs to address problems at multiple levels. While it addresses a business problem, computationally it comprises a pipeline of algorithms which, in turn, operates on relevant data presented in a proper format. Contrary to popular belief, almost all non-trivial data science solutions need to be built from the ground up, with minute and interrelated attention to the details of the problem at all three levels. In what follows we shall try to understand this with the help of a running example: aspects of a churn analysis solution. It is vital to understand that in most real-world cases we are repurposing the data for building the solution.
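To make the three levels concrete, here is a minimal churn-scoring sketch (all field names, weights, and records are hypothetical, invented for illustration): the data level cleans repurposed raw records, the algorithmic level derives features and a score, and the business level reads that score as churn risk.

```python
def clean(records):
    # Data level: repurposed raw records, filtered into a consistent format.
    return [r for r in records if r.get("tenure_months") is not None]

def featurize(record):
    # Algorithmic level: derive model inputs from a cleaned record.
    return [record["tenure_months"], record["support_calls"]]

def churn_score(features, weights=(-0.05, 0.3)):
    # Business level: a toy linear risk score (illustrative weights only);
    # longer tenure lowers risk, frequent support contact raises it.
    return sum(w * f for w, f in zip(weights, features))

records = [
    {"tenure_months": 24, "support_calls": 1},
    {"tenure_months": None, "support_calls": 5},   # dropped at the data level
    {"tenure_months": 3, "support_calls": 4},
]
scores = [churn_score(featurize(r)) for r in clean(records)]
# the short-tenure, high-contact customer gets the higher risk score
```

The point of the sketch is the interdependence: a change at the data level (which records survive cleaning) changes what the algorithmic level sees, which changes the business-level answer.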

GE Healthcare turns to Nvidia for AI boost in medical imaging


GE Healthcare is set to speed up the time taken to process medical images, thanks to a pair of partnerships announced on Sunday. The global giant will team up with Nvidia to update its 500,000 medical imaging devices worldwide with Revolution Frontier CT, which is claimed to be two times faster than the previous-generation image processor. GE said the speedier Revolution Frontier would be better at liver lesion detection and kidney lesion characterisation, and has the potential to reduce the number of follow-up appointments and non-interpretable scans. GE Healthcare is also making use of Nvidia in its new analytics platform, with sections of it to be placed in the Nvidia GPU Cloud. An average hospital generates 50 petabytes of data annually, GE said, but only 3 percent of that data is analysed, tagged, or made actionable.

An Overview of ResNet and its Variants – Towards Data Science


After the celebrated victory of AlexNet [1] at the ILSVRC 2012 classification contest, the deep Residual Network [2] was arguably the most groundbreaking work in the computer vision/deep learning community in the last few years. ResNet makes it possible to train networks of hundreds or even thousands of layers that still achieve compelling performance. Taking advantage of its powerful representational ability, many computer vision applications beyond image classification have been boosted, such as object detection and face recognition. Since ResNet blew people's minds in 2015, many in the research community have dived into the secrets of its success, and many refinements have been made to the architecture. This article is divided into two parts: in the first, I give a little background for those who are unfamiliar with ResNet; in the second, I review some of the papers I read recently regarding different variants and interpretations of the ResNet architecture.
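The core idea behind ResNet can be sketched in a few lines (a toy fully-connected residual block in NumPy, not the convolutional blocks of the original paper): the block learns a residual function F(x) and adds the input back through an identity shortcut.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def residual_block(x, w1, w2):
    # F(x): two linear transforms with a ReLU in between.
    f = relu(x @ w1) @ w2
    # Identity shortcut: the output is relu(F(x) + x), so even when F(x)
    # is driven toward zero the block still passes its input through --
    # this is what makes stacks of hundreds of layers trainable.
    return relu(f + x)

rng = np.random.default_rng(0)
x = np.abs(rng.standard_normal((1, 8)))   # non-negative toy activations
w1 = rng.standard_normal((8, 8)) * 0.01   # small weights, so F(x) ≈ 0
w2 = rng.standard_normal((8, 8)) * 0.01
y = residual_block(x, w1, w2)
# with F(x) ≈ 0 the block is close to the identity mapping
```

With small weights the block starts near the identity, so adding more blocks cannot easily make the network worse; each layer only has to learn a small correction on top of what is passed through.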

HPE pushes toward autonomous data center with InfoSight AI recommendation engine


HPE is adding an AI-based recommendation engine to the InfoSight predictive analytics platform for flash storage, taking another step toward what it calls the autonomous data center, where systems modify themselves to run more efficiently. The ultimate goal is to simplify and automate infrastructure management in order to cut operating expenses. HPE acquired InfoSight as part of its $1 billion deal earlier this year for Nimble Storage, a maker of all-flash and hybrid flash storage products. Along with the announcement of the new recommendation engine, HPE also said on Tuesday that it is extending InfoSight to work with the 3PAR high-end storage technology it acquired in 2010. HPE says that is only the beginning of what it is doing to develop InfoSight's ability to monitor infrastructure, predict possible problems and recommend ways to enhance performance.