MOUNTAIN VIEW, Calif., April 13, 2020 -- Synopsys, Inc. announced that Groq has adopted the Synopsys ZeBu Server 4 emulation solution for development of its Tensor Streaming Processor (TSP) architecture. ZeBu Server 4 performance and capacity enabled first-silicon success of Groq's TSP architecture for artificial intelligence (AI) and machine learning platforms, and allowed Groq to optimize and validate the architecture prior to silicon, resulting in strong throughput and latency.

"As we redefine compute technology with our unique single-core architecture, we are enabling the development of artificial intelligence and machine learning platforms that offer twice the inference performance while drastically reducing infrastructure costs," said Adrian Mendes, chief operating officer at Groq. "The Synopsys ZeBu Server 4 Cloud solution delivered the performance and capacity required to efficiently analyze the performance of our Tensor Streaming Processor, enabling us to focus on silicon innovation."

Synopsys describes ZeBu Server 4 as the industry's fastest emulation system, claiming 2X higher performance over competitive solutions.
Integrating a deep neural network (DNN) accelerator, a vector digital signal processor (DSP), and a vector floating-point unit (FPU), the DesignWare ARC EV7x Vision Processors' heterogeneous architecture delivers up to 35 tera operations per second (TOPS) for artificial intelligence systems-on-chip (AI SoCs), according to Synopsys, providing sufficient performance for AI-intensive edge applications. The ARC EV7x Vision Processors integrate up to four enhanced vector processing units (VPUs) and a DNN accelerator with up to 14,080 MACs to deliver up to 35 TOPS in 16-nm FinFET process technologies under typical conditions, which Synopsys reports is four times the performance of the ARC EV6x processors. Each EV7x VPU includes a 32-bit scalar unit and a 512-bit-wide vector DSP, and can be configured for 8-, 16-, or 32-bit operations to perform simultaneous multiply-accumulates on different streams of data. The optional DNN accelerator scales from 880 to 14,080 MACs and employs a specialized architecture for faster memory access, higher performance, and better power efficiency than alternative neural network IP.
In my previous post on the recent Linley Processor Conference, I wrote about the ways that semiconductor companies are developing heterogeneous systems to reach higher levels of performance and efficiency than with traditional hardware. One of the areas where this is most urgently needed is vision processing, a challenge that got a lot of attention at this year's conference.
Cadence's Christine Young presents two views on the challenges of teaching physical design and some creative approaches to get students involved in solving complex problems. In his latest video, Mentor's Colin Walls ponders the mysteries of the increment operator in C/C++ and how to use it most efficiently. Synopsys' Anand Shirahatti, Mohd Adil Khan, and Jamshed Alum look at two key features in 16 GT/s PCIe Gen 4 that are gaining traction in the quest for full bandwidth utilization. In this week's top five tech picks, selected by Ansys' Bill Vandermark, a previously unknown property of batteries has been detected that could lead to a big leap forward. When will AI be considered successful?
The company's shares rose nearly 5 percent to $99.40 in after-market trading, on track to open at a record high on Thursday. Synopsys, whose clients include Intel Corp and IBM Corp, receives more than half of its revenue from supplying electronic design automation (EDA) software to chipmakers, which they use to design and test chips. The company is also set to benefit from emerging technology areas such as artificial intelligence, autonomous driving and the Internet of Things (IoT), analysts have said.