 risc-v


Binary-30K: A Heterogeneous Dataset for Deep Learning in Binary Analysis and Malware Detection

Bommarito, Michael J. II

arXiv.org Artificial Intelligence

Deep learning research for binary analysis faces a critical infrastructure gap. Today, existing datasets target single platforms, require specialized tooling, or provide only hand-engineered features incompatible with modern neural architectures; no single dataset supports accessible research and pedagogy on realistic use cases. To solve this, we introduce Binary-30K, the first heterogeneous binary dataset designed for sequence-based models like transformers. Critically, Binary-30K covers Windows, Linux, macOS, and Android across 15+ CPU architectures. With 29,793 binaries and approximately 26.93% malware representation, Binary-30K enables research on platform-invariant detection, cross-target transfer learning, and long-context binary understanding. The dataset provides pre-computed byte-level BPE tokenization alongside comprehensive structural metadata, supporting both sequence modeling and structure-aware approaches. Platform-first stratified sampling ensures representative coverage across operating systems and architectures, while distribution via Hugging Face with official train/validation/test splits enables reproducible benchmarking. The dataset is publicly available at https://huggingface.co/datasets/mjbommar/binary-30k, providing an accessible resource for researchers, practitioners, and students alike.
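To make the phrase "byte-level BPE tokenization" concrete, here is a minimal sketch of a single byte-pair-encoding merge step over raw bytes. It is a generic illustration only: the abstract does not state Binary-30K's vocabulary size, merge table, or tokenizer implementation, and the helper names `most_frequent_pair` and `merge_pair`, the sample bytes, and the new token ID 256 are all assumptions made for this example.

```python
from collections import Counter

def most_frequent_pair(ids):
    """Return the most common adjacent token pair, or None if there is none."""
    pairs = Counter(zip(ids, ids[1:]))
    return max(pairs, key=pairs.get) if pairs else None

def merge_pair(ids, pair, new_id):
    """Replace every non-overlapping occurrence of `pair` with `new_id`."""
    out, i = [], 0
    while i < len(ids):
        if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out

# Hypothetical byte string standing in for a slice of a binary file.
data = bytes.fromhex("48 89 e5 48 89 c3 48 89")
ids = list(data)                    # byte-level base vocabulary: IDs 0..255
pair = most_frequent_pair(ids)      # the pair (0x48, 0x89) occurs three times
merged = merge_pair(ids, pair, 256) # 256 is the first ID beyond the byte range
print(pair, merged)
```

A real tokenizer would repeat this merge step many times over a large corpus and store the resulting merge table; the sketch shows only one round.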


Decentor-V: Lightweight ML Training on Low-Power RISC-V Edge Devices

Ribeiro, Marcelo, Costa, Diogo, Moreira, Gonçalo, Pinto, Sandro, Gomes, Tiago

arXiv.org Artificial Intelligence

Modern IoT devices increasingly rely on machine learning solutions to process data locally. However, the lack of graphics processing units (GPUs) or dedicated accelerators on most platforms makes on-device training largely infeasible, so training is often offloaded to cloud-based services. This offloading raises privacy concerns and creates a dependency on reliable, always-on connectivity. Federated Learning (FL) is a new trend that addresses these issues by enabling decentralized and collaborative training directly on devices, but it requires highly efficient optimization algorithms. L-SGD, a lightweight variant of stochastic gradient descent, has enabled neural network training on Arm Cortex-M Microcontroller Units (MCUs). This work extends L-SGD to RISC-V-based MCUs, an open and emerging architecture that still lacks robust support for on-device training. L-SGD was evaluated on both Arm and RISC-V platforms using 32-bit floating-point arithmetic, highlighting the performance impact of the absence of Floating-Point Units (FPUs) in RISC-V MCUs. To mitigate these limitations, we introduce an 8-bit quantized version of L-SGD for RISC-V, which achieves a nearly 4x reduction in memory usage and a 2.2x speedup in training time, with negligible accuracy degradation.
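The memory figure follows from storage width: a 32-bit float takes four bytes and an 8-bit integer takes one. The sketch below shows a generic symmetric per-tensor int8 quantization to illustrate that ratio. It is not the paper's L-SGD implementation, whose quantization scheme the abstract does not describe; the function names `quantize_int8` and `dequantize` and the sample weights are assumptions for illustration.

```python
from array import array

def quantize_int8(weights):
    """Symmetric quantization: w is approximated by scale * q, q in [-127, 127]."""
    max_abs = max(abs(w) for w in weights) or 1.0  # avoid a zero scale
    scale = max_abs / 127.0
    q = array("b", (max(-127, min(127, round(w / scale))) for w in weights))
    return q, scale

def dequantize(q, scale):
    """Map int8 codes back to approximate float values."""
    return [scale * v for v in q]

# Hypothetical weights stored as 32-bit floats (array type code "f").
weights = array("f", [0.5, -1.0, 0.25, 0.0])
q, scale = quantize_int8(weights)

# Storage ratio of the weight buffers alone: 4 bytes vs 1 byte per value.
ratio = (weights.itemsize * len(weights)) / (q.itemsize * len(q))
print(list(q), ratio)
```

The ratio for the raw buffers is exactly 4; a real system also stores scales and other state, which is one plausible reason a reported reduction would be "nearly" 4x rather than exactly 4x.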


Design and Implementation of a RISC-V SoC with Custom DSP Accelerators for Edge Computing

Yadav, Priyanshu

arXiv.org Artificial Intelligence

RISC-V [1] is rapidly gaining traction as an open, modular, and royalty-free Instruction Set Architecture (ISA). Unlike proprietary ISAs, RISC-V's openness allows researchers and designers to customize the core to application-specific requirements, enabling novel architectural extensions and accelerators. In domains such as wireless communications and edge Machine Learning, one-dimensional (1D) convolutions (and related dot products) are ubiquitous: they underlie Finite Impulse Response (FIR) filters, matched filtering, correlation and synchronization in wireless systems, and convolutional layers in neural networks for time-series data (e.g., audio processing, sensor data analysis). Despite RISC-V's flexibility, a scalar, in-order implementation of the RV32I base ISA (32-bit integer) lacks specialized instructions for the numerous multiply-accumulate (MAC) operations required by convolution. Software implementations on such a core execute a sequence of load, multiply, add, and store instructions for each convolution tap, resulting in high cycle counts and energy consumption, which is especially problematic in real-time, battery-powered edge deployments.
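The per-tap cost described above can be seen in a direct-form FIR filter, where every output sample needs one multiply-accumulate per tap. The following is a minimal software sketch, not the paper's accelerator or its firmware; the function name `fir_filter`, the MAC counter, and the sample data are assumptions for illustration.

```python
def fir_filter(samples, taps):
    """Direct-form FIR: y[n] = sum over k of taps[k] * samples[n - k].

    Returns the valid outputs and the number of multiply-accumulates used.
    """
    n_taps = len(taps)
    out = []
    mac_count = 0
    for n in range(n_taps - 1, len(samples)):
        acc = 0
        for k in range(n_taps):
            # On a scalar core, each pass loads two operands, multiplies,
            # and adds into the accumulator.
            acc += taps[k] * samples[n - k]
            mac_count += 1
        out.append(acc)
    return out, mac_count

# Hypothetical input: a ramp filtered by a simple difference kernel.
signal = [1, 2, 3, 4, 5, 6]
taps = [1, 0, -1]
y, macs = fir_filter(signal, taps)
print(y, macs)  # [2, 2, 2, 2] 12
```

Four outputs with three taps cost twelve MACs, so the work grows as (number of outputs) x (number of taps), which is the loop a dedicated MAC or convolution unit would aim to shorten.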


China bets on open-source chips as U.S. export controls mount

The Japan Times

When a Beijing-based military institute in September published a patent for a new high-performance chip, it offered a glimpse of China's bid to remake the half-trillion dollar global chip market and withstand U.S. sanctions. The People's Liberation Army's (PLA) Academy of Military Sciences had used an open-source standard known as RISC-V to reduce malfunctions in chips for cloud computing and smart cars, the patent filing shows. RISC-V is an instruction set architecture, a computer language used to design anything from smartphone chips to advanced processors for artificial intelligence.


Risc-V in orbit // eeNews Europe Newsletter 220819 Yeah, finally - Wisse Hettinga on LinkedIn

#artificialintelligence

Digital Twins are Virtually a Reality // eeNews Europe Newsletter 220729 Digital twins are becoming more and more of a reality. The ultimate digital twin is of course a virtual replica of the human being - the robot. That robot is 'under development', but look at where we are! We see stumbling, crawling, dancing, fighting, cooking, steel muscle constructions that require massive battery packs and a lot of remote control - not exactly a 'twin' that can do your work or can do your cleaning or shopping. Digital twins in manufacturing are doing much better.


GPU From Imagination Works With RISC-V - AI Summary

#artificialintelligence

The activity around creating a legit graphics processor for RISC-V chip designs, an emerging competitor to x86 and ARM, is gaining steam. Special interest groups at RISC-V next year will expand the focus on extensions for shaders and advanced matrix operations, which is important for artificial intelligence and machine learning, Mark Himelstein, chief technology officer at RISC-V, told The Register. "There is no reason why you could not integrate C-series -- which is the part that has ray tracing -- with RISC-V," David Harold, chief marketing officer at Imagination, told The Register. Andes Technology, which creates RISC-V chip designs, has verified that Imagination's GPUs work with RISC-V, and so has RIOS Lab, which has David Patterson, vice chair of the Board at RISC-V Foundation, on staff. The need for a GPU on RISC-V could be fundamental as the chip architecture gains importance, Shreyas Derashri, vice president of compute at Imagination, told The Register.


GPU from Imagination works with RISC-V

#artificialintelligence

The activity around creating a legit graphics processor for RISC-V chip designs, an emerging competitor to x86 and ARM, is gaining steam. Special interest groups at RISC-V next year will expand the focus on extensions for shaders and advanced matrix operations, which is important for artificial intelligence and machine learning, Mark Himelstein, chief technology officer at RISC-V, told The Register. RISC-V International, which developed the instruction set architecture, has interest groups develop extensions that users can add to their chip designs. In 2021, 16 RISC-V extensions were ratified, Himelstein said, and that number will grow next year. Many new extensions were part of mainstream computing chips announced this year at the RISC-V Summit.


RISC-V International Ratifies 15 New Specifications, Opening Up New Possibilities for RISC-V Designs - RISC-V International

#artificialintelligence

ZURICH – Dec. 2, 2021 – RISC-V International, a global open hardware standards organization, today announced that RISC-V members have ratified 15 new specifications – representing more than 40 extensions – for the free and open RISC-V instruction set architecture (ISA). Most notably, RISC-V members ratified the Vector, Scalar Cryptography, and Hypervisor specifications which will help unlock new opportunities for developers creating RISC-V applications for artificial intelligence (AI) and machine learning (ML), the Internet of Things (IoT), connected and autonomous cars, data centers, and beyond. "In 2021, RISC-V International made huge leaps in our technical progress as we ratified 15 specifications that are critical for the future of computing," said Krste Asanović, Chair of the RISC-V International Board of Directors. "The development of these specifications really showcased the incredible benefits of open collaboration across companies and geographies as members worked together to develop novel approaches for the latest computing requirements." The RISC-V Vector specification will help accelerate the computation of data intensive operations like ML inference for audio, vision, and voice processing.


China's Chip-Independence Goals Helped by U.S.-Developed Tech

WSJ.com: WSJD - Technology

A U.S.-born approach to defining how computer processors work presents a potential steppingstone to chip independence for Chinese tech companies that face growing limits from Washington on buying American semiconductors. The so-called RISC-V technology offers an openly accessible approach to running the brains that power personal computers, smartphones and servers. It is an emerging rival to two long-dominant proprietary models from Intel Corp. and Arm Holdings Ltd., a British company that U.S. graphics-chip maker Nvidia Inc. agreed in September to acquire for $40 billion. The standard is winning global interest, and early users include Chinese online retail and tech giant Alibaba Group Holding Ltd., which developed what some industry insiders consider the highest-performance RISC-V chip in production. Alibaba has said it is using that chip in its data centers to perform artificial intelligence calculations, and is selling versions of it.


Esperanto Unveils ML Chip with Nearly 1,100 RISC-V Cores

#artificialintelligence

At the RISC-V Summit today, Art Swift, CEO of Esperanto Technologies, announced a new, RISC-V based chip aimed at machine learning and containing nearly 1,100 low-power cores based on the open-source RISC-V architecture. Esperanto Technologies, headquartered in Mountain View, Calif., with other sites across the U.S. and Europe, was created in 2014 "with the goal of making RISC-V the architecture of choice for compute-intensive applications such as AI and machine learning." Swift traced the history of the new chip back to 2017, when Dave Ditzel – the founder and chairman of Esperanto – laid out the vision for Esperanto at the seventh RISC-V workshop. At that workshop, Ditzel set a goal of "laying down 4,000 or more cores on a single device." Ditzel called for both a simple instruction set through RISC-V and innovation in the realms of custom microarchitectures and proprietary low-power design techniques.