Feng, Chenghao
A Hardware-Efficient Photonic Tensor Core: Accelerating Deep Neural Networks with Structured Compression
Ning, Shupeng, Zhu, Hanqing, Feng, Chenghao, Gu, Jiaqi, Pan, David Z., Chen, Ray T.
Recent advancements in artificial intelligence (AI) and deep neural networks (DNNs) have revolutionized numerous fields, enabling complex tasks by extracting intricate features from large datasets. However, the exponential growth in computational demands has outstripped the capabilities of traditional electrical hardware accelerators. Optical computing offers a promising alternative thanks to its inherent parallelism, high computational speed, and low power consumption. Yet current photonic integrated circuits (PICs) designed for general matrix multiplication (GEMM) are constrained by large footprints, costly electro-optical (E-O) interfaces, and high control complexity, limiting their scalability. To overcome these challenges, we introduce a block-circulant photonic tensor core (CirPTC) for a structure-compressed optical neural network (StrC-ONN) architecture. By imposing a structured compression strategy on the weight matrices, StrC-ONN significantly reduces model parameters and hardware requirements while preserving the universal representability of the network and maintaining comparable expressivity. We further propose a hardware-aware training framework that compensates for on-chip nonidealities, improving model robustness and accuracy. We experimentally demonstrate image processing and classification tasks, achieving up to a 74.91% reduction in trainable parameters while maintaining competitive accuracy. Performance analysis projects a computational density of 5.84 tera operations per second per square millimeter (TOPS/mm^2) and a power efficiency of 47.94 TOPS/W, a 6.87-fold improvement achieved through the hardware-software co-design approach. By reducing both hardware requirements and control complexity across multiple dimensions, this work explores a new pathway to push the limits of optical computing in the pursuit of high efficiency and scalability.
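The compression arithmetic behind these numbers follows directly from the block-circulant structure: each k x k block of a weight matrix is defined by a single length-k vector, so trainable parameters shrink by a factor of k per block, and each block's matrix-vector product reduces to a circular convolution computable via FFT. A minimal NumPy sketch of this standard technique (illustrative names, not the paper's released code):

    import numpy as np

    def block_circulant_matvec(blocks, x, k):
        """y = W @ x for a block-circulant W, where blocks[i][j] holds the
        length-k defining vector (first column) of the (i, j) circulant block.
        Each k x k block is stored with k parameters instead of k**2."""
        p, q = len(blocks), len(blocks[0])
        x_f = np.fft.fft(x.reshape(q, k), axis=-1)   # FFT of each input segment
        y = np.empty((p, k))
        for i in range(p):
            acc = np.zeros(k, dtype=complex)
            for j in range(q):
                # circulant matvec = circular convolution, done in the Fourier domain
                acc += np.fft.fft(blocks[i][j]) * x_f[j]
            y[i] = np.fft.ifft(acc).real
        return y.reshape(p * k)

    # Example: a 6x6 weight matrix stored as 2x2 circulant blocks of size k=3,
    # i.e. 12 parameters instead of 36.
    rng = np.random.default_rng(0)
    k, p, q = 3, 2, 2
    blocks = [[rng.standard_normal(k) for _ in range(q)] for _ in range(p)]
    x = rng.standard_normal(q * k)
    print(block_circulant_matvec(blocks, x, k))

For k = 4, the per-block parameter reduction is 75%, the same order as the 74.91% overall figure reported above.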
M3ICRO: Machine Learning-Enabled Compact Photonic Tensor Core based on PRogrammable Multi-Operand Multimode Interference
Gu, Jiaqi, Zhu, Hanqing, Feng, Chenghao, Jiang, Zixuan, Chen, Ray T., Pan, David Z.
Photonic computing shows promise for transformative advances in machine learning (ML) acceleration, offering ultra-fast speed, massive parallelism, and high energy efficiency. However, current photonic tensor core (PTC) designs built from standard optical components limit scalability and compute density because of their large spatial footprint. To address this, we propose an ultra-compact PTC based on customized programmable multi-operand multimode interference (MOMMI) devices, named M3ICRO. The programmable MOMMI leverages the intrinsic light-propagation principle to provide a single-device programmable matrix unit, going beyond the conventional computing paradigm of one multiply-accumulate (MAC) operation per device. Because optimizing such customized devices typically requires time-consuming simulation, we apply ML for optics to predict device behavior and enable a differentiable optimization flow. We thoroughly investigate the reconfigurability and matrix expressivity of our customized PTC and introduce a novel block unfolding method that fully exploits the computing capabilities of a complex-valued PTC for near-universal real-valued linear transformations. Extensive evaluations demonstrate that M3ICRO achieves a 3.4-9.6x smaller footprint, 1.6-4.4x higher speed, 10.6-42x higher compute density, 3.7-12x higher system throughput, and superior noise robustness compared to state-of-the-art coherent PTC designs, while maintaining close-to-digital task accuracy across various ML benchmarks. Our code is open-sourced at https://github.com/JeremieMelo/M3ICRO-MOMMI.
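The abstract leaves the block unfolding method itself to the paper and the linked repository. As plain background on why some unfolding is needed at all, a complex-valued n x n operator W = A + iB can emulate a real-valued linear map through the standard real-imaginary block embedding, sketched below in NumPy. This classic construction is an illustration of the complex-to-real gap only, not M3ICRO's scheme, which achieves the near-universal real-valued expressivity claimed above:

    import numpy as np

    def real_embedding(W):
        """Real 2n x 2n matrix acting on [Re(x); Im(x)] that reproduces the
        complex n x n product W @ x. Background illustration only; M3ICRO's
        block unfolding method is a different, more expressive construction."""
        A, B = W.real, W.imag
        return np.block([[A, -B],
                         [B,  A]])

    rng = np.random.default_rng(1)
    n = 4
    W = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    x = rng.standard_normal(n) + 1j * rng.standard_normal(n)

    y = W @ x
    y_real = real_embedding(W) @ np.concatenate([x.real, x.imag])
    assert np.allclose(y_real, np.concatenate([y.real, y.imag]))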
Integrated multi-operand optical neurons for scalable and hardware-efficient deep learning
Feng, Chenghao, Gu, Jiaqi, Zhu, Hanqing, Tang, Rongxing, Ning, Shupeng, Hlaing, May, Midkiff, Jason, Jain, Sourabh, Pan, David Z., Chen, Ray T.
The optical neural network (ONN) is a promising hardware platform for next-generation neuromorphic computing due to its high parallelism, low latency, and low energy consumption. However, previous integrated photonic tensor cores (PTCs) consume numerous single-operand optical modulators for signal and weight encoding, incurring large area costs and high propagation loss when implementing large tensor operations. This work proposes a scalable and efficient optical dot-product engine based on customized multi-operand photonic devices, termed multi-operand optical neurons (MOONs). We experimentally demonstrate the utility of a MOON using a multi-operand Mach-Zehnder interferometer (MOMZI) in image recognition tasks. Specifically, our MOMZI-based ONN achieves a measured accuracy of 85.89% on the street view house number (SVHN) recognition dataset with 4-bit voltage control precision. Furthermore, our performance analysis reveals that 128x128 MOMZI-based PTCs outperform their single-operand-MZI counterparts by one to two orders of magnitude in propagation loss, optical delay, and total device footprint, with comparable matrix expressivity.
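To see why multi-operand devices save hardware, consider an idealized model in which the phase shifts of cascaded segments on one MZI arm simply add, so a single device responds to a sum of operands instead of one value. A minimal sketch under that assumption (lossless device, textbook cos^2 MZI transfer; the measured MOMZI response and its 4-bit voltage control in the paper are more involved):

    import numpy as np

    def momzi_intensity(phases):
        """Output intensity of an idealized multi-operand MZI: the phase
        shifts of the cascaded segments on one arm add, so one device
        responds to the sum of its operands rather than to a single value."""
        total_phase = np.sum(phases)              # phases accumulate in one arm
        return np.cos(total_phase / 2.0) ** 2     # standard lossless MZI transfer

    # One MOMZI with four phase segments encodes four operands at once,
    # replacing four single-operand MZIs plus the optics to combine them.
    phases = np.array([0.1, 0.4, -0.2, 0.3])
    print(momzi_intensity(phases))                # cos(0.3)**2 ~= 0.9127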