linear algebra operation
Technical Perspective: Supporting Linear Algebra Operations in SQL
Linear algebra operations are at the core of machine learning. Multiple specialized systems have emerged for the scalable, distributed execution of matrix and vector operations. The relationship of such computations to data management and databases, however, creates friction. It is well known that a great deal of human and machine time is now spent on fetching data out of the database and performing the computation on a specialized system. One answer to the issue is that we truly need a new kind of non-SQL database that is tuned to these computations.
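To make the idea of linear algebra inside a SQL engine concrete, here is a minimal sketch of matrix multiplication written as a join and an aggregation. It assumes a sparse (row, column, value) table layout and uses Python's built-in sqlite3 module; the table names `a` and `b` and the function `matmul_sql` are hypothetical and chosen for illustration only. This is a common textbook encoding, not necessarily the approach the article discusses.

```python
import sqlite3


def matmul_sql(a_entries, b_entries):
    """Multiply two matrices stored as (row, col, value) triples using plain SQL.

    Assumption: each matrix is a table with one row per non-zero entry.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE a (i INTEGER, j INTEGER, val REAL)")
    conn.execute("CREATE TABLE b (i INTEGER, j INTEGER, val REAL)")
    conn.executemany("INSERT INTO a VALUES (?, ?, ?)", a_entries)
    conn.executemany("INSERT INTO b VALUES (?, ?, ?)", b_entries)

    # C[i, j] = sum over k of A[i, k] * B[k, j]:
    # the join matches A's column index with B's row index,
    # and GROUP BY collapses the shared index k.
    rows = conn.execute(
        "SELECT a.i, b.j, SUM(a.val * b.val) "
        "FROM a JOIN b ON a.j = b.i "
        "GROUP BY a.i, b.j "
        "ORDER BY a.i, b.j"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    # A = [[1, 2], [3, 4]] and B = [[5, 6], [7, 8]] as triples.
    A = [(0, 0, 1), (0, 1, 2), (1, 0, 3), (1, 1, 4)]
    B = [(0, 0, 5), (0, 1, 6), (1, 0, 7), (1, 1, 8)]
    for i, j, val in matmul_sql(A, B):
        print(i, j, val)
```

The query is correct but treats every matrix entry as a separate tuple, which hints at why dense linear algebra is awkward to express and optimize in a purely relational form.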
Scheduling optimization of parallel linear algebra algorithms using Supervised Learning
Laberge, G., Shirzad, S., Diehl, P., Kaiser, H., Prudhomme, S., Lemoine, A.
Linear algebra algorithms are used widely in a variety of domains, e.g., machine learning, numerical physics, and video game graphics. For all these applications, loop-level parallelism is required to achieve high performance. However, finding the optimal way to schedule the workload between threads is a non-trivial problem because it depends on the structure of the algorithm being parallelized and the hardware the executable is run on. In the realm of Asynchronous Many-Task runtime systems, a key aspect of the scheduling problem is predicting the proper chunk-size, where the chunk-size is defined as the number of iterations of a for-loop assigned to a thread as one task. In this paper, we study the application of supervised learning models to predict the chunk-size that yields maximum performance on multiple parallel linear algebra operations using the HPX backend of the Blaze linear algebra library. More precisely, we generate our training and test sets by measuring the performance of the application with different chunk-sizes for multiple linear algebra operations: vector addition, matrix-vector multiplication, matrix-matrix addition, and matrix-matrix multiplication. We compare logistic regression, neural networks, and decision trees with a newly developed decision-tree-based model in order to predict the optimal value for chunk-size. Our results show that classical decision trees and our custom decision tree model are able to forecast a chunk-size that results in good performance for the linear algebra operations.
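The notion of chunk-size defined in the abstract above can be illustrated with a small sketch. It uses Python's standard thread pool rather than HPX or Blaze, so it shows only how a loop is cut into tasks, not the performance behavior the paper measures; the function names `make_chunks` and `chunked_vector_add` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def make_chunks(n_iterations, chunk_size):
    """Split range(n_iterations) into consecutive (start, stop) tasks.

    Each task covers at most chunk_size loop iterations.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    return [
        (start, min(start + chunk_size, n_iterations))
        for start in range(0, n_iterations, chunk_size)
    ]


def chunked_vector_add(x, y, chunk_size, n_threads=4):
    """Add two equal-length vectors, scheduling one task per chunk."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    result = [0] * len(x)

    def run_chunk(bounds):
        # One task: the loop iterations in [start, stop).
        start, stop = bounds
        for i in range(start, stop):
            result[i] = x[i] + y[i]

    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        # list() forces completion and re-raises any worker exception.
        list(pool.map(run_chunk, make_chunks(len(x), chunk_size)))
    return result


if __name__ == "__main__":
    print(make_chunks(10, 4))
    print(chunked_vector_add([1, 2, 3, 4, 5], [10, 20, 30, 40, 50], chunk_size=2))
```

A small chunk-size creates many tasks and more scheduling overhead, while a large one creates few tasks and can leave threads idle; that trade-off is what makes the choice worth predicting.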
Technical Perspective: Compressing Matrices for Large-Scale Machine Learning
Demand for more powerful big data analytics solutions has spurred the development of novel programming models, abstractions, and platforms for next-generation systems. For these problems, a complete solution would address data wrangling and processing, and it would support analytics over data of any modality or scale. It would support a wide array of machine learning algorithms, but also provide primitives for building new ones. It would be customizable, scale to vast volumes of data, and map to modern multicore, GPU, coprocessor, and compute cluster hardware. In pursuit of these goals, novel techniques and solutions are being developed by machine learning researchers [4, 6, 7], in the database and distributed systems research communities [2, 5, 8], and by major players in industry [1, 3].
An Engineering View on Real-Time Machine Learning – MemSQL Blog
About Thorn: Thorn partners across the tech industry, government, and NGOs, leveraging technology to combat predatory behavior, rescue victims, and protect vulnerable children. About Eric Boutin: Eric leads an engineering team for MemSQL in our Seattle office. This is background information from Eric on our work with Thorn. I was introduced to Federico Gomez Suarez, a volunteer working with Thorn, by a common friend. I was impressed by the work Thorn was doing, and excited about the opportunity to help them.