phase shifter
Gradients of unitary optical neural networks using parameter-shift rule
Jiang, Jinzhe, Zhao, Yaqian, Zhang, Xin, Li, Chen, Yu, Yunlong, Liu, Hailing
This paper explores the application of the parameter-shift rule (PSR) for computing gradients in unitary optical neural networks (UONNs). While backpropagation has been fundamental to training conventional neural networks, its implementation in optical neural networks faces significant challenges due to the physical constraints of optical systems. We demonstrate how PSR, which calculates gradients by evaluating functions at shifted parameter values, can be effectively adapted for training UONNs constructed from Mach-Zehnder interferometer meshes. The method leverages the inherent Fourier series nature of optical interference in these systems to compute exact analytical gradients directly from hardware measurements. This approach offers a promising alternative to traditional in silico training methods and circumvents the limitations of both finite difference approximations and all-optical backpropagation implementations. We present the theoretical framework and practical methodology for applying PSR to optimize phase parameters in optical neural networks, potentially advancing the development of efficient hardware-based training strategies for optical computing systems.
Leveraging machine learning features for linear optical interferometer control
Kuzmin, Sergei S., Dyakonov, Ivan V., Straupe, Stanislav S.
We have developed an algorithm that constructs a model of a reconfigurable optical interferometer, independent of specific architectural constraints. The programming of unitary transformations on the interferometer's optical modes relies on either an analytical method for deriving the unitary matrix from a set of phase shifts or an optimization routine when such decomposition is not available. Our algorithm employs a supervised learning approach, aligning the interferometer model with a training set derived from the device being studied. A straightforward optimization procedure leverages this trained model to determine the phase shifts of the interferometer with a specific architecture, obtaining the required unitary transformation. This approach enables the effective tuning of interferometers without requiring a precise analytical solution, paving the way for the exploration of new interferometric circuit architectures.
- Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.05)
- Asia > Russia (0.05)
Beyond Terabit/s Integrated Neuromorphic Photonic Processor for DSP-Free Optical Interconnects
Wang, Benshan, Xiao, Qiarong, Xu, Tengji, Fan, Li, Liu, Shaojie, Dong, Jianji, Zhang, Junwen, Huang, Chaoran
The rapid expansion of generative AI is driving unprecedented demands for high-performance computing. Training large-scale AI models now requires vast interconnected GPU clusters across multiple data centers. Multi-scale AI training and inference demand uniform, ultra-low latency, and energy-efficient links to enable massive GPUs to function as a single cohesive unit. However, traditional electrical and optical interconnects, which rely on conventional digital signal processors (DSPs) for signal distortion compensation, are increasingly unable to meet these stringent requirements. To overcome these limitations, we present an integrated neuromorphic optical signal processor (OSP) that leverages deep reservoir computing and achieves DSP-free, all-optical, real-time processing. Experimentally, our OSP achieves a 100 Gbaud PAM4 per lane, 1.6 Tbit/s data center interconnect over a 5 km optical fiber in the C-band (equivalent to over 80 km optical fiber in the O-band), far exceeding the reach of state-of-the-art DSP solutions, which are fundamentally constrained by chromatic dispersion 1 arXiv:2504.15044v1 Simultaneously, it delivers a four-orders-of-magnitude reduction in processing latency and a three-orders-of-magnitude reduction in energy consumption. Unlike DSPs, which introduce increased latency at high data rates, our OSP maintains consistent, ultra-low latency regardless of data rate scaling, making it an ideal solution for future optical interconnects. Moreover, the OSP retains full optical field information for better impairment compensation and adapts to various modulation formats, data rates, and wavelengths. Fabricated using a mature silicon photonic process, the OSP can be monolithically integrated with silicon photonic transceivers, enhancing the compactness and reliability of all-optical interconnects. This research provides a highly scalable, energy-efficient, and high-speed solution, paving the way for next-generation AI infrastructure. Keywords: Photonic neural network, optical interconnect, AI infrastructure, data center 1 Introduction The surging demand for artificial intelligence and machine learning (AI/ML), especially in generative AI, has driven unprecedented requirements for high-performance computing infrastructure.
- Asia > China > Hubei Province > Wuhan (0.04)
- Asia > China > Hong Kong (0.04)
- North America > United States > California > San Diego County > San Diego (0.04)
- Asia > China > Shanghai > Shanghai (0.04)
- Information Technology > Services (0.75)
- Energy (0.68)
Rapid and Power-Aware Learned Optimization for Modular Receive Beamforming
Multiple-input multiple-output (MIMO) systems play a key role in wireless communication technologies. A widely considered approach to realize scalable MIMO systems involves architectures comprised of multiple separate modules, each with its own beamforming capability. Such models accommodate cell-free massive MIMO and partially connected hybrid MIMO architectures. A core issue with the implementation of modular MIMO arises from the need to rapidly set the beampatterns of the modules, while maintaining their power efficiency. This leads to challenging constrained optimization that should be repeatedly solved on each coherence duration. In this work, we propose a power-oriented optimization algorithm for beamforming in uplink modular hybrid MIMO systems, which learns from data to operate rapidly. We derive our learned optimizer by tackling the rate maximization objective using projected gradient ascent steps with momentum. We then leverage data to tune the hyperparameters of the optimizer, allowing it to operate reliably in a fixed and small number of iterations while completely preserving its interpretable operation. We show how power efficient beamforming can be encouraged by the learned optimizer, via boosting architectures with low-resolution phase shifts and with deactivated analog components. Numerical results show that our learn-to-optimize method notably reduces the number of iterations and computation latency required to reliably tune modular MIMO receivers, and that it allows obtaining desirable balances between power efficient designs and throughput.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > Middle East > Israel (0.04)
SCATTER: Algorithm-Circuit Co-Sparse Photonic Accelerator with Thermal-Tolerant, Power-Efficient In-situ Light Redistribution
Yin, Ziang, Gangi, Nicholas, Zhang, Meng, Zhang, Jeff, Huang, Rena, Gu, Jiaqi
Photonic computing has emerged as a promising solution for accelerating computation-intensive artificial intelligence (AI) workloads. However, limited reconfigurability, high electrical-optical conversion cost, and thermal sensitivity limit the deployment of current optical analog computing engines to support power-restricted, performance-sensitive AI workloads at scale. Sparsity provides a great opportunity for hardware-efficient AI accelerators. However, current dense photonic accelerators fail to fully exploit the power-saving potential of algorithmic sparsity. It requires sparsity-aware hardware specialization with a fundamental re-design of photonic tensor core topology and cross-layer device-circuit-architecture-algorithm co-optimization aware of hardware non-ideality and power bottleneck. To trim down the redundant power consumption while maximizing robustness to thermal variations, we propose SCATTER, a novel algorithm-circuit co-sparse photonic accelerator featuring dynamically reconfigurable signal path via thermal-tolerant, power-efficient in-situ light redistribution and power gating. A power-optimized, crosstalk-aware dynamic sparse training framework is introduced to explore row-column structured sparsity and ensure marginal accuracy loss and maximum power efficiency. The extensive evaluation shows that our cross-stacked optimized accelerator SCATTER achieves a 511X area reduction and 12.4X power saving with superior crosstalk tolerance that enables unprecedented circuit layout compactness and on-chip power efficiency.
The Impact of Feature Representation on the Accuracy of Photonic Neural Networks
de Queiroz, Mauricio Gomes, Jimenez, Paul, Cardoso, Raphael, Costa, Mateus Vidaletti, Abdalla, Mohab, O'Connor, Ian, Bosio, Alberto, Pavanello, Fabio
Photonic Neural Networks (PNNs) are gaining significant interest in the research community due to their potential for high parallelization, low latency, and energy efficiency. PNNs compute using light, which leads to several differences in implementation when compared to electronics, such as the need to represent input features in the photonic domain before feeding them into the network. In this encoding process, it is common to combine multiple features into a single input to reduce the number of inputs and associated devices, leading to smaller and more energy-efficient PNNs. Although this alters the network's handling of input data, its impact on PNNs remains understudied. This paper addresses this open question, investigating the effect of commonly used encoding strategies that combine features on the performance and learning capabilities of PNNs. Here, using the concept of feature importance, we develop a mathematical methodology for analyzing feature combination. Through this methodology, we demonstrate that encoding multiple features together in a single input determines their relative importance, thus limiting the network's ability to learn from the data. Given some prior knowledge of the data, however, this can also be leveraged for higher accuracy. By selecting an optimal encoding method, we achieve up to a 12.3% improvement in accuracy of PNNs trained on the Iris dataset compared to other encoding techniques, surpassing the performance of networks where features are not combined. These findings highlight the importance of carefully choosing the encoding to the accuracy and decision-making strategies of PNNs, particularly in size or power constrained applications.
TeMPO: Efficient Time-Multiplexed Dynamic Photonic Tensor Core for Edge AI with Compact Slow-Light Electro-Optic Modulator
Zhang, Meng, Yin, Dennis, Gangi, Nicholas, Begović, Amir, Chen, Alexander, Huang, Zhaoran Rena, Gu, Jiaqi
Electronic-photonic computing systems offer immense potential in energy-efficient artificial intelligence (AI) acceleration tasks due to the superior computing speed and efficiency of optics, especially for real-time, low-energy deep neural network (DNN) inference tasks on resource-restricted edge platforms. However, current optical neural accelerators based on foundry-available devices and conventional system architecture still encounter a performance gap compared to highly customized electronic counterparts. To bridge the performance gap due to lack of domain specialization, we present a time-multiplexed dynamic photonic tensor accelerator, dubbed TeMPO, with cross-layer device/circuit/architecture customization. At the device level, we present foundry-compatible, customized photonic devices, including a slow-light electro-optic modulator with experimental demonstration, optical splitters, and phase shifters that significantly reduce the footprint and power in input encoding and dot-product calculation. At the circuit level, partial products are hierarchically accumulated via parallel photocurrent aggregation, lightweight capacitive temporal integration, and sequential digital summation, considerably relieving the analog-to-digital conversion bottleneck. We also employ a multi-tile, multi-core architecture to maximize hardware sharing for higher efficiency. Across diverse edge AI workloads, TeMPO delivers digital-comparable task accuracy with superior quantization/noise tolerance. We achieve a 368.6 TOPS peak performance, 22.3 TOPS/W energy efficiency, and 1.2 TOPS/mm$^2$ compute density, pushing the Pareto frontier in edge AI hardware. This work signifies the power of cross-layer co-design and domain-specific customization, paving the way for future electronic-photonic accelerators with even greater performance and efficiency.
- Europe (0.04)
- South America > Chile (0.04)
- North America > United States > New York > Rensselaer County > Troy (0.04)
- (2 more...)
Accelerating DNN Training With Photonics: A Residue Number System-Based Design
Demirkiran, Cansu, Yang, Guowei, Bunandar, Darius, Joshi, Ajay
Photonic computing is a compelling avenue for performing highly efficient matrix multiplication, a crucial operation in Deep Neural Networks (DNNs). While this method has shown great success in DNN inference, meeting the high precision demands of DNN training proves challenging due to the precision limitations imposed by costly data converters and the analog noise inherent in photonic hardware. This paper proposes Mirage, a photonic DNN training accelerator that overcomes the precision challenges in photonic hardware using the Residue Number System (RNS). RNS is a numeral system based on modular arithmetic$\unicode{x2014}$allowing us to perform high-precision operations via multiple low-precision modular operations. In this work, we present a novel micro-architecture and dataflow for an RNS-based photonic tensor core performing modular arithmetic in the analog domain. By combining RNS and photonics, Mirage provides high energy efficiency without compromising precision and can successfully train state-of-the-art DNNs achieving accuracy comparable to FP32 training. Our study shows that on average across several DNNs when compared to systolic arrays, Mirage achieves more than $23.8\times$ faster training and $32.1\times$ lower EDP in an iso-energy scenario and consumes $42.8\times$ lower power with comparable or better EDP in an iso-area scenario.
Towards interpretable quantum machine learning via single-photon quantum walks
Flamini, Fulvio, Krumm, Marius, Fiderer, Lukas J., Müller, Thomas, Briegel, Hans J.
Variational quantum algorithms represent a promising approach to quantum machine learning where classical neural networks are replaced by parametrized quantum circuits. However, both approaches suffer from a clear limitation, that is a lack of interpretability. Here, we present a variational method to quantize projective simulation (PS), a reinforcement learning model aimed at interpretable artificial intelligence. Decision making in PS is modeled as a random walk on a graph describing the agent's memory. To implement the quantized model, we consider quantum walks of single photons in a lattice of tunable Mach-Zehnder interferometers trained via variational algorithms. Using an example from transfer learning, we show that the quantized PS model can exploit quantum interference to acquire capabilities beyond those of its classical counterpart. Finally, we discuss the role of quantum interference for training and tracing the decision making process, paving the way for realizations of interpretable quantum learning agents.
- Europe > Austria > Tyrol > Innsbruck (0.04)
- North America > United States (0.04)
- Europe > Slovenia (0.04)
- (2 more...)
- Government (0.46)
- Education (0.46)