Three years ago, we had maybe six or fewer AI accelerators; today there are more than two dozen, and more are coming. One of the first commercially available AI training accelerators was the GPU, and the undisputed leader of that segment was Nvidia. Nvidia was already preeminent in machine learning (ML) and deep learning (DL) applications, and adding neural net acceleration was a logical and rather straightforward step for the company. Nvidia also brought to its GPUs a treasure trove of applications based on CUDA, the company's proprietary development language. The company developed CUDA in 2006 and empowered hundreds of universities to give courses on it.
This week is the eighth annual International Workshop on OpenCL, SYCL, Vulkan, and SPIR-V, and the event is available online for the first time in its history because of the coronavirus pandemic. One of the event organizers, and the conference chair, is Simon McIntosh-Smith, who is a professor of high performance computing at Bristol University in Great Britain and also the head of its Microelectronics Group. Among other things, McIntosh-Smith was a microprocessor architect at STMicroelectronics, where he designed SIMD units for the dual-core, superscalar Chameleon and SH5 set-top box ASICs back in the late 1990s. In 1999, McIntosh-Smith moved to Pixelfusion, which created the first general purpose GPU, arguably eight or nine years before Nvidia did it with its Tesla GPUs. There he was an architect on the 1,536-core chip and software manager for two years. In 2002, McIntosh-Smith was one of the co-founders of ClearSpeed, which created floating point math accelerators used in HPC systems before GPU accelerators came along. At ClearSpeed he was first director of architecture and applications and then vice president of applications.
Announced at its Software Technology Day in London, Intel's new Data Parallel C++ (DPC++) aims to provide a unified, cross-industry, single-source language to program heterogeneous architectures. Data Parallel C++ will be based on C++, hence its name, with the goal of making it possible to write portable programs that can exploit the inherent parallelism of certain algorithms. This is still a hard problem because the existing architectures that exploit hardware parallelism are so heterogeneous. It is not clear at the moment how much DPC++ will differ from standard C++, but Intel confirmed it will incorporate SYCL, Khronos Group's high-level programming model, based on OpenCL, for single-source programs running on heterogeneous platforms. SYCL shares some of DPC++'s goals, namely the attempt to enable writing single-source C++ programs that can be run on multiple, heterogeneous architectures, including CPUs, GPUs, DSPs, FPGAs, and other kinds of processing units used as hardware accelerators.
In the context of machine learning, a tensor is the multidimensional array used in the mathematical models that describe neural networks. In other words, a tensor is usually a higher-dimensional generalization of a matrix or a vector. Through a simple notation that uses a rank to show the number of dimensions, tensors allow the representation of complex n-dimensional vectors and hyper-shapes as n-dimensional arrays. Tensors have two properties: a datatype and a shape. TensorFlow is an open-source deep learning framework that was released in late 2015 under the Apache 2.0 license.
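To make rank, shape, and datatype concrete, here is a minimal sketch in plain Python that models a tensor as a nested list. It is not TensorFlow's API: the helper names shape_of, rank_of, and dtype_of are hypothetical, and the code assumes non-empty, rectangular (non-ragged) nesting.

```python
# Illustrative sketch only: a tensor modeled as a nested Python list.
# shape_of, rank_of and dtype_of are hypothetical helpers, not TensorFlow calls.
# Assumption: every nesting level is non-empty and rectangular (not ragged).

def shape_of(tensor):
    """Return the size of each dimension, outermost first."""
    shape = []
    level = tensor
    while isinstance(level, list):
        shape.append(len(level))
        level = level[0]  # descend into the first element of this dimension
    return tuple(shape)


def rank_of(tensor):
    """The rank is the number of dimensions, i.e. the length of the shape."""
    return len(shape_of(tensor))


def dtype_of(tensor):
    """Return the Python type of the first innermost element."""
    level = tensor
    while isinstance(level, list):
        level = level[0]
    return type(level)


scalar = 3.0                              # rank 0
vector = [1.0, 2.0, 3.0]                  # rank 1
matrix = [[1, 2, 3], [4, 5, 6]]           # rank 2
cube = [[[0.0, 1.0], [2.0, 3.0]],
        [[4.0, 5.0], [6.0, 7.0]],
        [[8.0, 9.0], [10.0, 11.0]]]       # rank 3

for name, t in [("scalar", scalar), ("vector", vector),
                ("matrix", matrix), ("cube", cube)]:
    print(name, "rank:", rank_of(t), "shape:", shape_of(t),
          "dtype:", dtype_of(t).__name__)
```

A scalar has rank 0 and an empty shape, a vector rank 1, a matrix rank 2, and so on. Real frameworks store the data in contiguous typed buffers rather than nested lists, but the same two properties, datatype and shape, describe the tensor.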
AMD's $5.4 billion purchase of ATI Technologies in 2006 seemed like an odd match. Not only were the companies in separate markets, but they were far apart geographically, with ATI in the Toronto, Canada, region and AMD in Sunnyvale, California. They made it work, and arguably the deal saved AMD from extinction, because it was the graphics business that kept the company afloat while the Athlon/Opteron business was going nowhere. There were many quarters when graphics brought in more revenue than CPUs, which likely kept the company out of bankruptcy. But those days are over: AMD is once again a highly competitive CPU company, and quarterly sales are getting very close to the $2 billion mark.