Collaborating Authors


Debugging Incidents in Google's Distributed Systems

Communications of the ACM

Google has published two books about Site Reliability Engineering (SRE) principles, best practices, and practical applications.1,2 In the heat of the moment when handling a production incident, however, a team's actual response and debugging approaches often differ from ideal best practices. This article covers the outcomes of research performed in 2019 on how engineers at Google debug production issues, including the types of tools, high-level strategies, and low-level tasks that engineers use in varying combinations to debug effectively. It examines the research approach used to capture data, summarizing the common engineering journeys for production investigations and sharing examples of how experts debug complex distributed systems. Finally, the article extends the Google specifics of this research to provide some practical strategies that you can apply in your organization. As this study began, its focus was on developing an empirical understanding of the debugging process, with the overarching goal of creating optimal product solutions that met the needs of Google engineers. We wanted to capture the data that engineers need when debugging, when they need it, the communication process among the teams involved, and the types of mitigations that are successful.

Amazing footage shows giant 60-foot robot 'taking the knee' during a test in Yokohama

Daily Mail - Science & tech

A giant 60ft tall Transformer-like robot based on the Japanese anime series Mobile Suit Gundam was filmed pointing, walking and 'taking the knee' during a recent test. The humanoid robot was built as part of a new attraction at Yamashita Pier by engineers at Gundam Factory in the Port of Yokohama. It was due to become the centrepiece of the factory on October 1, but the coronavirus pandemic delayed the opening until later this year, operators confirmed. The Gundam anime series has sparked a multi-billion dollar industry with movies, manga, plastic models and video games. Video footage shows the massive 25 tonne robot moving its right arm and fingers, lifting its legs and kneeling while workers watch from a nearby observation deck.

Artificial Intelligence Advances Food Safety


Machine vision has long found a place in food safety, working 24/7 without fatigue. But as data access increases and processing power improves, machine vision is finding even more opportunities through the added capabilities of artificial intelligence (AI). To take one example, traditional machine vision tends to struggle to inspect for contamination in sun-dried tomatoes. But it's an application that's well suited to AI. "Similar to a human, AI is very good at dealing with a lot of variations in whatever's being looked at," says Quinn Killough, senior business development manager for Landing AI, a company that provides end-to-end AI platforms for manufacturing. "That type of application, because there's so much variability in what a tomato could look like or what kind of contamination could be on it, it was a pretty tough machine vision problem in general. A human can do it easily. And it turns out AI can do it fairly easily as well. Being able to deal with all that variation in what you're looking at, it makes it very well suited for AI."

10 Azure ML Code Examples Every Cloud AI Developer Should Know


TL;DR: The Azure ML Python SDK enables data scientists, AI engineers, and MLOps developers to be productive in the cloud. This post highlights 10 examples every cloud AI developer should know to be successful with Azure ML. If you are new to Azure you can get a free subscription using the link below. The scripts in this example are used to classify iris flower images to build a machine learning model based on scikit-learn's iris dataset; the code can easily be adapted to any scikit-learn estimator. This example shows you how to run your TensorFlow training scripts at scale using Azure Machine Learning's TensorFlow estimator class.

The Musk Method: Learn from partners then go it alone

The Japan Times

Elon Musk is hailed as an innovator and disrupter who went from knowing next to nothing about building cars to running the world's most valuable automaker in the space of 16 years. But his record shows he is more of a fast learner who forged alliances with firms that had technology Tesla lacked, hired some of their most talented people, and then powered through the boundaries that limited more risk-averse partners. Now, Musk and his team are preparing to outline new steps in Tesla's drive to become a more self-sufficient company less reliant on suppliers at its "Battery Day" event on Tuesday. Musk has been dropping hints for months that significant advances in technology will be announced as Tesla strives to produce the low-cost, long-lasting batteries that could put its electric cars on a more equal footing with cheaper gasoline vehicles. New battery cell designs, chemistries and manufacturing processes are just some of the developments that would allow Tesla to reduce its reliance on its long-time battery partner, Japan's Panasonic, people familiar with the situation said.

Qualcomm Institute: Machine Learning Workshop – 3 Day Online Workshop


The demand for Data Scientists, Machine Learning Engineers, and Data Engineers is unprecedented, as demonstrated by the ever-growing number of …

Maximizing the Impact of ML in Production - insideBIGDATA


In this special guest feature, Emily Kruger, Vice President of Product at Kaskada, discusses the topic that is on the minds of many data scientists and data engineers these days, maximizing the impact of machine learning in production environments. Kaskada is a machine learning company that enables collaboration among data scientists and data engineers. Kaskada develops a machine learning studio for feature engineering using event-based data. Kaskada's platform allows data scientists to unify the feature engineering process across their organizations with a single platform for feature creation and feature serving. Machine learning is changing the way the world does business.

This Raspberry Pi-powered AI helps robots to sort through your recycling


Raspberry Pi fans have never been short of ideas to put the tool to good use, for applications as wacky as they are useful. Now researchers are fitting the low-cost computer with artificial intelligence, high-resolution cameras and robots, to sort through rubbish and reduce the amount of waste going to landfills. Engineers from Liverpool Hope University played around with a Raspberry Pi 3 model, combining the device with optical sensors and computer vision algorithms, to create a tool that can distinguish between paper, glass, plastic, metal and cardboard. Set up in a material recovery facility (MRF), where household rubbish is usually sent to be sorted, the technology could spot different materials on the conveyor belt where waste is dumped, and accordingly instruct robots to recycle specific objects as they come towards them. Karl Myers, from Liverpool Hope University's department of mathematics and computer science, told ZDNet: "It is designed to be integrated with any of the robotic systems that are on the market at the moment. The Raspberry Pi sends a signal via serial communication to the robotic arm about the position of the recyclables, and the robot just grabs the object."
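The serial hand-off Myers describes can be sketched as follows. This is a minimal illustration only: the excerpt does not specify the message format, so the comma-separated layout and the names `encode_position` and `decode_position` are assumptions made for the example, not the researchers' protocol.

```python
# Hypothetical wire format (an assumption, not from the article):
#   "<material>,<x_mm>,<y_mm>\n" encoded as ASCII.

def encode_position(material: str, x_mm: int, y_mm: int) -> bytes:
    """Pack a detected object's material and belt position into one line of bytes."""
    return f"{material},{x_mm},{y_mm}\n".encode("ascii")


def decode_position(message: bytes) -> tuple[str, int, int]:
    """Parse a line produced by encode_position back into its fields."""
    material, x_mm, y_mm = message.decode("ascii").strip().split(",")
    return material, int(x_mm), int(y_mm)


# On the Raspberry Pi side, the bytes would be written to a serial port
# (for instance with the pyserial package); here we only round-trip them.
message = encode_position("plastic", 120, 45)
parsed = decode_position(message)
```

A newline-terminated text line is chosen here because it is easy to inspect and to frame on the receiving side; a real integration would follow whatever format the robotic arm's controller expects.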

AutoML: Bridging the skills gap with machine learning


Is there anything that can stop AI? As the novel Covid-19 pandemic forces the world to put on its brakes, AI technologies like machine learning – AutoML in particular – have been continuing to develop at break-neck speeds at the beginning of the new decade. Following a recent breakthrough by Google scientists at the start of a period of enforced lockdown, AutoML is seeing a wave of new progress in correlation with the explosion of big data, advanced analytics and predictive models. The increasing amount of viable data has meant that AI, machine learning (ML) and data science are being fed reams of data and training that have served to boost the technology exponentially. AutoML in 2020 can perform data pre-processing, as well as Extraction, Transformation and Loading (ETL) tasks.

Testing and Monitoring Machine Learning Model Deployments


Learn how to test & monitor production machine learning models. You've taken your model from a Jupyter notebook and rewritten it in your production system. Are you sure there weren't any mistakes when you moved from the research environment to the production system? How can you control the risk before your deployment? ML-specific unit, integration and differential tests can help you to minimize the risk.
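A differential test of the kind mentioned above might look like the sketch below. It assumes each model can be called as a plain function returning a number; `research_model`, `production_model`, and the tolerance value are hypothetical stand-ins chosen for illustration, not part of the course material.

```python
from typing import Callable, Sequence


def differential_test(
    old_model: Callable[[float], float],
    new_model: Callable[[float], float],
    inputs: Sequence[float],
    tolerance: float = 0.05,
) -> list[tuple[float, float, float]]:
    """Return (input, old_prediction, new_prediction) for every input
    where the two models disagree by more than the tolerance."""
    mismatches = []
    for x in inputs:
        old_pred = old_model(x)
        new_pred = new_model(x)
        if abs(old_pred - new_pred) > tolerance:
            mismatches.append((x, old_pred, new_pred))
    return mismatches


# Hypothetical models: the production rewrite has a bug for inputs >= 10.
def research_model(x: float) -> float:
    return 2.0 * x + 1.0


def production_model(x: float) -> float:
    return 2.0 * x + 1.0 if x < 10 else 2.0 * x + 1.5


inputs = [0.0, 1.0, 5.0, 12.0]
mismatches = differential_test(research_model, production_model, inputs)
```

Running the same inputs through both versions and comparing the outputs catches discrepancies introduced during the move to production without needing ground-truth labels.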