AITopics | grayskull

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question "What is Artificial Intelligence?" and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Assessing Tenstorrent's RISC-V MatMul Acceleration Capabilities

Cavagna, Hiari Pizzini; Cesarini, Daniele; Bartolini, Andrea

arXiv.org Artificial Intelligence, Jun-23-2025

The increasing demand for generative AI services based on Large Language Models (LLMs) has driven the need for specialized hardware architectures that optimize computational efficiency and energy consumption. This paper evaluates the performance of the Tenstorrent Grayskull e75 RISC-V accelerator on basic linear algebra kernels at reduced numerical precision, which are fundamental operations in LLM computations. We present a detailed characterization of how Grayskull's execution model, grid size, matrix dimensions, data formats, and numerical precision affect computational efficiency. Furthermore, we compare Grayskull's performance against state-of-the-art architectures with tensor acceleration, including Intel Sapphire Rapids processors and two NVIDIA GPUs (V100 and A100). Whilst NVIDIA GPUs dominate raw performance, Grayskull demonstrates a competitive trade-off between power consumption and computational throughput, reaching a peak of 1.55 TFLOPs/Watt with BF16.
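As a rough illustration of how a throughput-per-watt figure like the one in the abstract can be derived, the sketch below counts the floating-point operations of a dense matrix multiplication and divides the resulting throughput by an average power draw. The function names and the sample inputs are hypothetical and are not taken from the paper; the paper's own measurement methodology may differ.

```python
def matmul_flops(m: int, n: int, k: int) -> int:
    """FLOPs for C[m, n] = A[m, k] @ B[k, n]: one multiply and one add per term."""
    return 2 * m * n * k


def tflops_per_watt(m: int, n: int, k: int, seconds: float, watts: float) -> float:
    """Throughput in TFLOP/s divided by average power in watts."""
    tflops = matmul_flops(m, n, k) / seconds / 1e12
    return tflops / watts


if __name__ == "__main__":
    # Hypothetical inputs for illustration only, not measurements from the paper.
    efficiency = tflops_per_watt(m=1024, n=1024, k=1024, seconds=1e-3, watts=50.0)
    print(f"{efficiency:.4f} TFLOP/s per watt")
```

The same formula applies to any data format; only the measured runtime and power change with the precision used.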

large language model, machine learning, natural language

arXiv:2505.06085

Country: Europe > Italy (0.14)

Genre: Research Report (1.00)

Industry: Information Technology > Hardware (0.70)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Attention in SRAM on Tenstorrent Grayskull

Thüning, Moritz

arXiv.org Artificial Intelligence, Jul-18-2024

When implementations of the Transformer's self-attention layer use SRAM instead of DRAM, they can achieve significant speedups. The Tenstorrent Grayskull architecture provides a large SRAM, distributed across a grid of cores. This work presents a fused kernel for Grayskull that exclusively uses its large SRAM by combining matrix multiplication, attention score scaling and Softmax operations. Additionally, a dedicated Softmax kernel using the SRAM and a CPU implementation serving as a baseline are presented. The Softmax operation consumes most of the runtime in the computation of attention weights from queries and keys on Grayskull. The speedup of the dedicated Softmax kernel compared to the CPU implementation is up to $10\times$, and the Softmax implementation inside the fused kernel is approximately $1.8\times$ faster than the dedicated Softmax kernel. The time and memory complexity of all implementations is quadratic in sequence length. Currently, the Grayskull e150 is approximately $30\times$ cheaper for the general public than an Nvidia H100 PCIe (a state-of-the-art GPU) and offers approximately $1.5\times$ more SRAM.
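The three operations that the abstract says are fused (matrix multiplication, attention score scaling, and Softmax) can be sketched on a CPU in plain Python as follows. This is only a reference-style illustration, not the paper's Grayskull kernel or its CPU baseline; the function name is hypothetical, and the $1/\sqrt{d}$ scaling factor is assumed from the standard Transformer formulation.

```python
import math


def attention_weights(queries, keys):
    """Compute softmax(Q K^T / sqrt(d)) row by row.

    queries, keys: lists of n vectors, each of length d.
    Returns an n x n list of lists, so memory grows quadratically with n.
    """
    d = len(queries[0])
    scale = 1.0 / math.sqrt(d)  # assumed standard Transformer scaling
    weights = []
    for q in queries:
        # Matrix multiplication and scaling: one score per key.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        # Numerically stable Softmax: subtract the row maximum first.
        row_max = max(scores)
        exps = [math.exp(s - row_max) for s in scores]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


if __name__ == "__main__":
    q = [[1.0, 0.0], [0.0, 1.0]]
    k = [[1.0, 0.0], [0.0, 1.0]]
    for row in attention_weights(q, k):
        print(row)
```

Because every query is scored against every key, both the loop count and the size of the returned matrix scale with the square of the sequence length, which matches the quadratic complexity stated in the abstract.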

implementation, operation, Tensix core

arXiv:2407.13885

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > California > San Diego County > Carlsbad (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)

Genre: Research Report (0.40)

Industry: Information Technology (0.35)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Another deep learning processor appears in the ring: Grayskull from Tenstorrent

#artificialintelligence, Nov-5-2020, 05:45:19 GMT

Tenstorrent describes the technology behind the processor as: "The first conditional execution architecture for artificial intelligence facilitating scalable deep learning. Tenstorrent has taken an approach that dynamically eliminates unnecessary computation, thus breaking the direct link between model size growth and compute/memory bandwidth requirements." According to the company, "Conditional computation enables adaptation to both inference and training of a model to the exact input that was presented, like adjusting NLP model computations to the exact length of the text presented, and dynamically pruning portions of the model based on input characteristics." Grayskull has eight channels of LPDDR4 supporting up to 16 Gbyte of external DRAM, and 16 lanes of PCIe Gen 4. Each Tensix core has a packet processor, a programmable SIMD and maths computation block, five single-issue RISC cores and 1 Mbyte of RAM. "The array of Tensix cores is stitched together with a double 2D torus network-on-chip, which facilitates multi-cast flexibility, along with minimal software burden for scheduling coarse-grain data transfers," according to the company.
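To make the torus topology mentioned above more concrete, the sketch below computes the four neighbours of a core in a generic 2D torus, where links wrap around at the grid edges. The function name and the grid dimensions are hypothetical, and the sketch models a single torus only; it does not attempt to represent how Tenstorrent's "double" torus or its routing is actually implemented.

```python
def torus_neighbors(row: int, col: int, rows: int, cols: int) -> dict:
    """Return the coordinates of the four neighbours of (row, col) in a 2D torus.

    Edges wrap around, so a core on the boundary is still directly linked
    to the core on the opposite side of the grid.
    """
    return {
        "north": ((row - 1) % rows, col),
        "south": ((row + 1) % rows, col),
        "west": (row, (col - 1) % cols),
        "east": (row, (col + 1) % cols),
    }


if __name__ == "__main__":
    # Hypothetical 3 x 4 grid; the corner core wraps to the far edges.
    print(torus_neighbors(0, 0, rows=3, cols=4))
```

The wraparound links mean that no core sits on a true edge of the network, which bounds the worst-case hop count between any two cores to roughly half of each grid dimension.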

deep learning processor, grayskull, tenstorrent

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.63)
