Research Team Wins Award for Machine Learning Diagnostic

@machinelearnbot 

A team of scientists hailing from the Sandia National Laboratories and Boston University developed an experimental algorithm that could automatically diagnose problems in supercomputers. There is an array of internal and external issues that could arise with these powerful machines. For instance, factors like physical parts breaking can occur or previous programs performing "zombie processes" that prevent the computer from functioning properly. Furthermore, the repair process for these devices can take an extended period of time, which raises another issue since these computers perform critical tasks like forecasting the weather and ensuring the U.S. nuclear arsenal is safe and reliable without needing to do underground testing. To develop the algorithm, the team took a multi-step approach. First, the engineers created a suite of issues they became familiar with over the time they spent working on various supercomputers, which was then followed by them writing specific codes to re-create these anomalies.