AITopics | Yan, Siqi

Collaborating Authors

Yan, Siqi

Addressing Data Scarcity in Optical Matrix Multiplier Modeling Using Transfer Learning

Cem, Ali, Jovanovic, Ognjen, Yan, Siqi, Ding, Yunhong, Zibar, Darko, Da Ros, Francesco

arXiv.org Artificial Intelligence, Nov-13-2023

We present and experimentally evaluate the use of transfer learning to address experimental data scarcity when training neural network (NN) models for Mach-Zehnder interferometer mesh-based optical matrix multipliers. Our approach involves pre-training the model using synthetic data generated from a less accurate analytical model and fine-tuning with experimental data. Our investigation demonstrates that this method yields significant reductions in modeling errors compared to using an analytical model or a standalone NN model when training data is limited. Utilizing regularization techniques and ensemble averaging, we achieve < 1 dB root-mean-square error on the matrix weights implemented by a 3x3 photonic chip while using only 25% of the available data.
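The pre-train then fine-tune idea in this abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the authors' setup: a two-parameter linear model stands in for the NN, `analytical` is a hypothetical "less accurate" model, `experiment` is a hypothetical ground truth, and a single measurement plays the role of scarce experimental data.

```python
def predict(params, x):
    """Linear stand-in for the NN model: y = w * x + b."""
    w, b = params
    return w * x + b

def train(params, data, steps, lr=0.1):
    """Plain batch gradient descent on mean squared error."""
    w, b = params
    n = len(data)
    for _ in range(steps):
        grad_w = sum((w * x + b - y) * x for x, y in data) / n
        grad_b = sum((w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)

def rmse(params, data):
    """Root-mean-square error of the model on (x, y) pairs."""
    return (sum((predict(params, x) - y) ** 2 for x, y in data) / len(data)) ** 0.5

def analytical(x):
    """Hypothetical, deliberately inaccurate analytical model."""
    return 2.0 * x

def experiment(x):
    """Hypothetical 'true' device response (unknown in practice)."""
    return 2.2 * x + 0.3

# Plentiful synthetic data from the analytical model, one real measurement.
synthetic = [(x, analytical(x)) for x in (0.0, 0.5, 1.0, 1.5, 2.0)]
scarce = [(1.0, experiment(1.0))]
held_out = [(x, experiment(x)) for x in (0.0, 2.0)]

pretrained = train((0.0, 0.0), synthetic, steps=2000)   # pre-training
fine_tuned = train(pretrained, scarce, steps=200)       # fine-tuning
from_scratch = train((0.0, 0.0), scarce, steps=200)     # standalone model

for name, params in [("analytical only", pretrained),
                     ("from scratch", from_scratch),
                     ("fine-tuned", fine_tuned)]:
    print(name, round(rmse(params, held_out), 3))
```

With one measurement the two parameters are underdetermined, so gradient descent settles near whichever starting point it was given. In this toy case that is why the fine-tuned model should show a lower held-out error than both the analytical model alone and the model trained from scratch.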

artificial intelligence, machine learning, tl-nn, (15 more...)

arXiv.org Artificial Intelligence

2308.1163

Country:

Europe > Denmark > Capital Region > Kongens Lyngby (0.14)
Asia > China > Hubei Province (0.14)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Data-driven Modeling of Mach-Zehnder Interferometer-based Optical Matrix Multipliers

Cem, Ali, Yan, Siqi, Ding, Yunhong, Zibar, Darko, Da Ros, Francesco

arXiv.org Artificial Intelligence, Mar-6-2023

Photonic integrated circuits are facilitating the development of optical neural networks, which have the potential to be both faster and more energy efficient than their electronic counterparts since optical signals are especially well-suited for implementing matrix multiplications. However, accurate programming of photonic chips for optical matrix multiplication remains a difficult challenge. Here, we describe both simple analytical models and data-driven models for offline training of optical matrix multipliers. We train and evaluate the models using experimental data obtained from a fabricated chip featuring a Mach-Zehnder interferometer mesh implementing 3-by-3 matrix multiplication. The neural network-based models outperform the simple physics-based models in terms of prediction error. Furthermore, the neural network models are also able to predict the spectral variations in the matrix weights for up to 100 frequency channels covering the C-band. The use of neural network models for programming the chip for optical matrix multiplication yields increased performance on multiple machine learning tasks.

artificial intelligence, machine learning, voltage, (18 more...)

arXiv.org Artificial Intelligence

2210.09171

Country:

Europe (0.28)
Asia > China > Hubei Province (0.14)

Genre: Research Report (0.64)

Industry: Semiconductors & Electronics (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Data-efficient Modeling of Optical Matrix Multipliers Using Transfer Learning

Cem, Ali, Jovanovic, Ognjen, Yan, Siqi, Ding, Yunhong, Zibar, Darko, Da Ros, Francesco

arXiv.org Artificial Intelligence, Nov-29-2022

Various photonic integrated circuit (PIC) architectures have been proposed for implementing linear layers through optical matrix multiplication (OMM). Specifically, performing unitary transformations using Mach-Zehnder interferometer (MZI) meshes has drawn considerable attention in recent years [1, 2]. For OMM, MZI meshes are programmed by tuning the phase shifters associated with the individual MZIs such that a desired weight matrix is realized. A common choice is to use thermo-optic phase shifters, for which analytical models relating the heater voltages to the phase shifts exist [3]. Programming a chip accurately using such models may be challenging due to fabrication errors and thermal crosstalk, which has led to the emergence of a variety of offline and online calibration techniques [4, 5].
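As a rough illustration of how phase shifters set the weights of a single MZI, here is a minimal sketch of an idealized, lossless 2x2 MZI. The quadratic voltage-to-phase relation and its parameters `phi0` and `k` are assumptions chosen for illustration (the abstract only says that analytical models exist), and the sketch ignores the fabrication errors and thermal crosstalk mentioned above.

```python
import cmath
import math

def heater_phase(voltage, phi0=0.0, k=1.0):
    """Assumed thermo-optic model: phase = phi0 + k * V**2.

    phi0 (rad) and k (rad/V^2) are hypothetical fitting parameters.
    """
    return phi0 + k * voltage ** 2

def matmul2(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][m] * b[m][j] for m in range(2)) for j in range(2)]
            for i in range(2)]

def mzi_matrix(theta, phi):
    """Ideal MZI: input phase shifter, 50:50 coupler, internal phase
    shifter, 50:50 coupler (both shifters on the upper arm)."""
    s = 1 / math.sqrt(2)
    coupler = [[s, 1j * s], [1j * s, s]]
    inner = [[cmath.exp(1j * theta), 0], [0, 1]]
    outer = [[cmath.exp(1j * phi), 0], [0, 1]]
    return matmul2(coupler, matmul2(inner, matmul2(coupler, outer)))

def power_weights(t):
    """Power coupling |t_ij|^2 between each input and output port."""
    return [[abs(x) ** 2 for x in row] for row in t]

# Example: program the internal phase from a heater voltage.
theta = heater_phase(voltage=1.0, phi0=0.0, k=math.pi / 2)
weights = power_weights(mzi_matrix(theta, phi=0.0))
```

In this idealized model the internal phase `theta` alone sets the power split (the bar-port weight is sin^2(theta/2)), while `phi` only affects the optical phase. A mesh would cascade several such blocks to build a larger matrix.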

artificial intelligence, machine learning, nn model, (12 more...)

arXiv.org Artificial Intelligence

2211.16038

Country:

Europe > Denmark > Capital Region > Kongens Lyngby (0.15)
Asia > China > Hubei Province (0.15)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Captum: A unified and generic model interpretability library for PyTorch

Kokhlikyan, Narine, Miglani, Vivek, Martin, Miguel, Wang, Edward, Alsallakh, Bilal, Reynolds, Jonathan, Melnikov, Alexander, Kliushkina, Natalia, Araya, Carlos, Yan, Siqi, Reblitz-Richardson, Orion

arXiv.org Artificial Intelligence, Sep-16-2020

In this paper we introduce a novel, unified, open-source model interpretability library for PyTorch [12]. The library contains generic implementations of a number of gradient and perturbation-based attribution algorithms, also known as feature, neuron and layer importance algorithms, as well as a set of evaluation metrics for these algorithms. It can be used for both classification and non-classification models, including graph-structured models built on Neural Networks (NN). In this paper we give a high-level overview of supported attribution algorithms and show how to perform memory-efficient and scalable computations. We emphasize that the three main characteristics of the library are multimodality, extensibility and ease of use. Multimodality means support for different input modalities such as image, text, audio or video. Extensibility allows adding new algorithms and features. The library is also designed for easy understanding and use. We also introduce an interactive visualization tool called Captum Insights, which is built on top of the Captum library and allows sample-based model debugging and visualization using feature importance metrics.
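To make "perturbation-based attribution" concrete, the following is a minimal plain-Python sketch of feature ablation: each input feature is replaced by a baseline value in turn, and the resulting change in the model output is taken as that feature's importance. It deliberately does not use Captum's API; `ablation_attribution` and `toy_model` are hypothetical names introduced here for illustration.

```python
def ablation_attribution(model, inputs, baseline=0.0):
    """Attribute the model output to each feature by ablating it.

    For feature i, the attribution is
    model(inputs) - model(inputs with feature i set to baseline).
    """
    full_output = model(inputs)
    attributions = []
    for i in range(len(inputs)):
        perturbed = list(inputs)      # copy so the original stays intact
        perturbed[i] = baseline       # ablate one feature
        attributions.append(full_output - model(perturbed))
    return attributions

def toy_model(x):
    """Hypothetical model: the third feature has no influence."""
    return 3.0 * x[0] - 2.0 * x[1] + 0.0 * x[2]

scores = ablation_attribution(toy_model, [1.0, 2.0, 5.0])
print(scores)  # [3.0, -4.0, 0.0]
```

For a linear model like this one the ablation scores equal weight times input, so the unused third feature receives zero importance. Real models are nonlinear, and the result then depends on the choice of baseline.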

algorithm, deep learning, neural network, (18 more...)

arXiv.org Artificial Intelligence

2009.07896

Country: North America > United States > Oregon (0.14)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)
