Collaborating Authors


All-flash Vast Data storage trains AI data at LightSpeed


Vast Data Universal Storage serves all data from flash. Its "shared everything" architecture uses lower-cost QLC SSDs for bulk storage with more expensive Intel Optane storage-class memory for acceleration, connected via NVMe over Fabrics in the box. File and object protocols are presented as container-based microservices. Vast claims its method ensures data remains highly available on the fastest storage media. The Vast Data storage targets applications that hinge on fast random access, said Eric Burgener, a research vice president of storage at IT analyst firm IDC.

Domain-Specific Hardware Accelerators

Communications of the ACM

From the simple embedded processor in your washing machine to powerful processors in data center servers, most computing today takes place on general-purpose programmable processors or CPUs. CPUs are attractive because they are easy to program and because large code bases exist for them. The programmability of CPUs stems from their execution of sequences of simple instructions, such as ADD or BRANCH; however, the energy required to fetch and interpret an instruction is 10x to 4000x more than that required to perform a simple operation such as ADD. This high overhead was acceptable when processor performance and efficiency were scaling according to Moore's Law.[32] One could simply wait and an existing application would run faster and more efficiently. Our economy has become dependent on these increases in computing performance and efficiency to enable new features and new applications. Today, Moore's Law has largely ended,[12] and we must look to alternative architectures with lower overhead, such as domain-specific accelerators, to continue scaling of performance and efficiency. There are several ways to realize domain-specific accelerators as discussed in the sidebar on accelerator options. A domain-specific accelerator is a hardware computing engine that is specialized for a particular domain of applications. Accelerators have been designed for graphics,[26] deep learning,[16] simulation,[2] bioinformatics,[49] image processing,[38] and many other tasks. Accelerators can offer orders of magnitude improvements in performance/cost and performance/W compared to general-purpose computers. For example, our bioinformatics accelerator, Darwin,[49] is up to 15,000x faster than a CPU at reference-based, long-read assembly. The performance and efficiency of accelerators are due to a combination of specialized operations, parallelism, efficient memory systems, and reduction of overhead.
Domain-specific accelerators[7] are becoming more pervasive and more visible, because they are one of the few remaining ways to continue to improve performance and efficiency now that Moore's Law has ended.[22] Most applications require modifications to achieve high speedup on domain-specific accelerators. These applications are highly tuned to balance the performance of conventional processors with their memory systems.
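
The instruction-overhead range quoted above (10x to 4000x) can be turned into a rough efficiency figure. The following is a minimal sketch, not a calculation from the article: it assumes the total energy per instruction is simply the operation energy plus the overhead, and the function name `useful_energy_fraction` is hypothetical.

```python
def useful_energy_fraction(overhead_ratio: float) -> float:
    """Fraction of per-instruction energy spent on the operation itself.

    Assumption (not from the article): total energy = E_op + overhead_ratio * E_op,
    so the useful fraction is 1 / (1 + overhead_ratio).
    """
    if overhead_ratio < 0:
        raise ValueError("overhead_ratio must be non-negative")
    return 1.0 / (1.0 + overhead_ratio)


# The two ends of the range quoted in the text.
for ratio in (10, 4000):
    fraction = useful_energy_fraction(ratio)
    print(f"overhead {ratio}x -> {fraction:.4%} of energy does useful work")
```

Under this simplified model, even the low end of the range leaves under a tenth of the energy for the arithmetic itself, which is the overhead an accelerator tries to remove.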

A Survey on Edge Intelligence

Artificial Intelligence

Edge intelligence refers to a set of connected systems and devices for data collection, caching, processing, and analysis in locations close to where data is captured based on artificial intelligence. The aim of edge intelligence is to enhance the quality and speed of data processing and protect the privacy and security of the data. Although recently emerged, spanning the period from 2011 to now, this field of research has shown explosive growth over the past five years. In this paper, we present a thorough and comprehensive survey on the literature surrounding edge intelligence. We first identify four fundamental components of edge intelligence, namely edge caching, edge training, edge inference, and edge offloading, based on theoretical and practical results pertaining to proposed and deployed systems. We then aim for a systematic classification of the state of the solutions by examining research results and observations for each of the four components and present a taxonomy that includes practical problems, adopted techniques, and application goals. For each category, we elaborate, compare and analyse the literature from the perspectives of adopted techniques, objectives, performance, advantages and drawbacks, etc. This survey article provides a comprehensive introduction to edge intelligence and its application areas. In addition, we summarise the development of the emerging research field and the current state-of-the-art and discuss the important open issues and possible theoretical and technical solutions.

Hitachi Vantara CTO on quantum, data ethics, and public trust


When quantum computing moves from the theoretical world into the applied space it threatens to break apart the accepted modus operandi of much of the technology industry, something Hubert Yoshida, the CTO of Hitachi Vantara, is keenly aware of. Search giant Google made a surprise announcement that it had reached quantum supremacy last month, raising serious questions about how organisations can manage and secure data in the future. Nowhere is this more important than in the domain of cryptography. Where once it could take hundreds of years to crack encryption methods with traditional computing, quantum computing techniques could lower that to just seconds. "We have to keep one step ahead and find different ways of doing encryption in the face of new technologies," Yoshida told Computerworld, speaking during the Hitachi Next conference at the MGM Grand, Las Vegas, last week.

Can AMD convert its growing GPU presence into a data center play?


AMD's $5.4 billion purchase of ATI Technologies in 2006 seemed like an odd match. Not only were the companies in separate markets, but they were on separate coasts, with ATI in the Toronto, Canada, region and AMD in Sunnyvale, California. They made it work, and arguably it saved AMD from extinction because it was the graphics business that kept the company afloat while the Athlon/Opteron business was going nowhere. There were many quarters where graphics brought in more revenue than CPUs and likely saved the company from bankruptcy. But those days are over, and AMD is once again a highly competitive CPU company, and quarterly sales are getting very close to the $2 billion mark.

A New Golden Age for Computer Architecture

Communications of the ACM

We began our Turing Lecture June 4, 2018,[11] with a review of computer architecture since the 1960s. In addition to that review, here, we highlight current challenges and identify future opportunities, projecting another golden age for the field of computer architecture in the next decade, much like the 1980s when we did the research that led to our award, delivering gains in cost, energy, and security, as well as performance.
"Those who cannot remember the past are condemned to repeat it." --George Santayana
Software talks to hardware through a vocabulary called an instruction set architecture (ISA). By the early 1960s, IBM had four incompatible lines of computers, each with its own ISA, software stack, I/O system, and market niche--targeting small business, large business, scientific, and real time, respectively. IBM engineers, including ACM A.M. Turing Award laureate Fred Brooks, Jr., thought they could create a single ISA that would efficiently unify all four of these ISA bases. They needed a technical solution for how computers as inexpensive as those with 8-bit data paths and as fast as those with 64-bit data paths could share a single ISA. The data paths are the "brawn" of the processor in that they perform the arithmetic but are relatively easy to "widen" or "narrow." The greatest challenge for computer designers then and now is the "brains" of the processor--the control hardware. Inspired by software programming, computing pioneer and Turing laureate Maurice Wilkes proposed how to simplify control. Control was specified as a two-dimensional array he called a "control store." Each column of the array corresponded to one control line, each row was a microinstruction, and writing microinstructions was called microprogramming.[39] A control store contains an ISA interpreter written using microinstructions, so execution of a conventional instruction takes several microinstructions. The control store was implemented through memory, which was much less costly than logic gates.
The table here lists four models of the new System/360 ISA IBM announced April 7, 1964. The data paths vary by a factor of 8, memory capacity by a factor of 16, clock rate by nearly 4, performance by 50, and cost by nearly 6.
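
The control store described above can be illustrated with a toy model. This is a sketch under stated assumptions, not Wilkes's design or the System/360 microcode: the two-instruction ISA (`LOAD`, `ADD`), the four control lines, and the `run` function are all hypothetical names chosen for illustration. Each row of `CONTROL_STORE` is one microinstruction and each column is one control line, so a conventional instruction is carried out by stepping through several rows.

```python
# Hypothetical control lines; each is one column of the control store.
CONTROL_LINES = ("MEM_READ", "MDR_TO_ACC", "ALU_ADD", "END")

# Each row is one microinstruction: a bit per control line.
CONTROL_STORE = [
    (1, 0, 0, 0),  # row 0: LOAD step 1, read memory into the data register
    (0, 1, 0, 1),  # row 1: LOAD step 2, copy data register to accumulator, done
    (1, 0, 0, 0),  # row 2: ADD step 1, read memory into the data register
    (0, 0, 1, 1),  # row 3: ADD step 2, add data register to accumulator, done
]

# Maps each conventional opcode to the first row of its microprogram.
DISPATCH = {"LOAD": 0, "ADD": 2}


def run(program, memory):
    """Interpret (opcode, address) pairs by stepping through the control store.

    Returns the final accumulator value and the number of microinstructions run.
    """
    acc = 0  # accumulator register
    mdr = 0  # memory data register
    micro_steps = 0
    for opcode, address in program:
        row = DISPATCH[opcode]
        while True:
            signals = dict(zip(CONTROL_LINES, CONTROL_STORE[row]))
            if signals["MEM_READ"]:
                mdr = memory[address]
            if signals["MDR_TO_ACC"]:
                acc = mdr
            if signals["ALU_ADD"]:
                acc += mdr
            micro_steps += 1
            if signals["END"]:
                break
            row += 1
    return acc, micro_steps


result, steps = run([("LOAD", 0), ("ADD", 1)], memory=[7, 35])
print(result, steps)  # prints "42 4": two instructions took four microinstructions
```

In this toy, changing the behavior of an instruction means editing rows of a table rather than rewiring logic, which mirrors the point that the control store lives in memory rather than in gates.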

I played Shadow of the Tomb Raider over 5G, and it didn't suck


Anyone who's experimented with a cloud gaming service knows that wired Ethernet is almost required. At AT&T's Spark conference in San Francisco on Monday, I had a chance to try out Nvidia's GeForce Now service for PCs running over AT&T's 5G service, playing the newly released Shadow of the Tomb Raider game on a generic Lenovo ThinkPad. The traditional way to run a PC game is locally, running the game off a hard drive or SSD on your PC, using the CPU and GPU to render the game as fast as it can. The downside, of course, is that you have to buy all of that hardware yourself. The trade-off is that the 3D rendering takes place on a remote server--a cheaper solution than buying a high-end graphics card, at least in the short term.

Google Strategy Teardown: Google Is Turning Itself Into An AI Company As It Seeks To Win New Markets Like Cloud And Transportation


Alphabet is broken out into its core Google business and a number of other subsidiaries, which it deems "Other Bets." The majority of Google's business comes from advertising revenues, which the company generates through its search engine as well as a number of other Google-affiliated and partnership websites. Outside of search and advertising, Google generates revenue from products including cloud and enterprise, consumer hardware, mapping, and YouTube. In addition to Google, Alphabet encompasses a host of other subsidiaries called "Other Bets." These companies are more experimental in nature, and as a result are not material to Alphabet's bottom line.

Intel continues to optimize its products around AI


Normally, this is the time of year when Intel would hold its Intel Developer Forum conference, which would be replete with new product announcements. But with the demise of the show last year, the company instead held an all-day event that it live-streamed over the web. The company's Data Centric Innovation Summit was the backdrop for a series of processor and memory announcements aimed at the data center and artificial intelligence, in particular. Even though Intel is without a leader, it still has considerable momentum. Navin Shenoy, executive vice president and general manager of the Data Center Group, did the heavy lifting.

Why Micron Is So Excited About Artificial Intelligence


Memory specialist Micron (NASDAQ:MU) sells both DRAM, a type of computer memory that's used in virtually every kind of computing device, and NAND flash, which is rapidly gaining traction for high-performance data storage applications as it's quicker and more efficient than hard disk drive-based storage. Micron's business has continued to benefit from what seems like an insatiable amount of demand for both DRAM and NAND in applications such as mobile phones and data center servers. One of the sub-segments within data center servers is the market for servers that handle machine learning, commonly referred to as artificial intelligence, processing tasks. That sub-segment is small today, with data center chip giant Intel estimating the market at around 7% of total data center server shipments in 2016, but it's also, according to Intel, the fastest growing. The companies that make the processors that perform these machine learning computations are clearly very excited about the artificial intelligence opportunity as it means they'll get to sell a lot more computing power over the years.