
Collaborating Authors

 Pei, Yan Ru


Let SSMs be ConvNets: State-space Modeling with Optimal Tensor Contractions

arXiv.org Artificial Intelligence

We introduce Centaurus, a class of networks composed of generalized state-space model (SSM) blocks, where the SSM operations can be treated as tensor contractions during training. The optimal order of tensor contractions can then be systematically determined for every SSM block to maximize training efficiency. This allows more flexibility in designing SSM blocks beyond the depthwise-separable configuration commonly implemented. The new design choices take inspiration from classical convolutional blocks, including group convolutions, full convolutions, and bottleneck blocks. We architect the Centaurus network with a mixture of these blocks to balance between network size and performance, as well as memory and computational efficiency during both training and inference. We show that this heterogeneous network design outperforms its homogeneous counterparts in raw audio processing tasks including keyword spotting, speech denoising, and automatic speech recognition (ASR). For ASR, Centaurus is the first network with competitive performance that can be made fully state-space based, without using any nonlinear recurrence (LSTMs), explicit convolutions (CNNs), or (surrogate) attention mechanisms.

Sequence or temporal modeling encompasses a wide range of tasks, from audio processing to language modeling. Traditionally, many (related) statistical methods have been employed (Box et al., 2015). In the age of deep learning, neural networks have been predominantly used (LeCun et al., 2015), including recurrent neural networks (RNNs), convolutional neural networks (CNNs), transformers (Vaswani, 2017), and neural ODEs (Chen et al., 2018). In many cases, the model inevitably suffers from one of two drawbacks: 1) it cannot be efficiently trained (or fitted) in parallel due to the sequential nature of the model, or 2) it cannot be efficiently configured for online inference due to its large memory and computational requirements. To address this, deep state-space models (SSMs) were adapted for sequence modeling, and have shown incredible potential across a wide range of tasks (Gu et al., 2021; Goel et al., 2022; Gu & Dao, 2023). Due to the linearity of the SSM layers, they can be configured not only for efficient online inference with small memory and computational resources, but also for efficient training on parallel hardware with unrolling strategies (Gu et al., 2022; Smith et al., 2022; Dao & Gu, 2024; Heinsen, 2023). Currently, most deep SSM networks (along with most neural networks in general) follow the architectural recipe of transformers: they are composed of uniform "SSM blocks" throughout the network, with little to no variation in the shapes of the intermediate features or weights. This simplifies the design of deep SSM networks, but may sacrifice performance and efficiency in practice.
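The central trick, writing an entire SSM block as a single tensor contraction and letting a contraction-order optimizer pick the cheapest evaluation order, can be illustrated with NumPy's einsum machinery. The shapes, kernel parameterization, and einsum expression below are illustrative assumptions rather than the paper's exact formulation:

import numpy as np

# Toy shapes (illustrative assumptions, not the paper's parameterization):
# batch B, sequence length L, input channels D, state size N, output channels E.
B, L, D, N, E = 4, 128, 32, 16, 32

u = np.random.randn(B, L, D)          # input sequence
Bmat = np.random.randn(D, N)          # input -> state projection
C = np.random.randn(N, E)             # state -> output projection

# Causal temporal kernel per state, materialized as a Toeplitz tensor
# K[n, t, s] = k[n, t - s] for t >= s (zero otherwise).
k = np.exp(-0.05 * np.arange(L))[None, :] * np.random.rand(N, 1)
K = np.zeros((N, L, L))
for t in range(L):
    K[:, t, : t + 1] = k[:, t::-1]

# The whole SSM block as one tensor contraction.  einsum_path searches over
# contraction orders and returns the cheapest one it found, which is the
# spirit of determining the optimal contraction order per SSM block.
subs = "bsd,dn,nts,ne->bte"
path, info = np.einsum_path(subs, u, Bmat, K, C, optimize="optimal")
print(info)                            # chosen order and FLOP estimate
y = np.einsum(subs, u, Bmat, K, C, optimize=path)
print(y.shape)                         # (B, L, E)

Depending on the relative sizes of B, L, D, and N, the cheapest order may contract the channel projections first or the temporal kernel first, which mirrors the design flexibility the abstract describes.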


TENNs-PLEIADES: Building Temporal Kernels with Orthogonal Polynomials

arXiv.org Artificial Intelligence

We introduce a neural network named PLEIADES (PoLynomial Expansion In Adaptive Distributed Event-based Systems), belonging to the TENNs (Temporal Neural Networks) architecture. We focus on interfacing these networks with event-based data to perform online spatiotemporal classification and detection with low latency. By virtue of using structured temporal kernels and event-based data, we have the freedom to vary the sample rate of the data along with the discretization step-size of the network without additional finetuning. We experimented with three event-based benchmarks and obtained state-of-the-art results on all three by large margins with significantly smaller memory and compute costs. We achieved: 1) 99.59% accuracy with 192K parameters on the DVS128 hand gesture recognition dataset and 100% with a small additional output filter; 2) 99.58% test accuracy with 277K parameters on the AIS 2024 eye tracking challenge; and 3) 0.556 mAP with 576K parameters on the PROPHESEE 1 Megapixel Automotive Detection Dataset.
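The claim that the sample rate and discretization step size can be varied without finetuning follows from parameterizing each temporal kernel as a continuous function, a fixed-degree orthogonal-polynomial expansion, and simply re-sampling it at the desired resolution. A minimal sketch using a Chebyshev basis (the specific basis, degree, and normalization are assumptions for illustration):

import numpy as np

def sample_kernel(coeffs, kernel_len):
    """Sample a continuous-time kernel, parameterized as an orthogonal-
    polynomial expansion on [-1, 1], at `kernel_len` discrete taps.
    Because the kernel is a continuous function, changing `kernel_len`
    (i.e. the discretization step size) needs no retraining."""
    t = np.linspace(-1.0, 1.0, kernel_len)
    return np.polynomial.chebyshev.chebval(t, coeffs)

# coeffs would be the learned parameters; random here for illustration.
coeffs = np.random.randn(8)            # degree-7 expansion
k_coarse = sample_kernel(coeffs, 16)   # e.g. a low event-camera sample rate
k_fine = sample_kernel(coeffs, 64)     # same kernel, 4x finer step size

Since the coefficients, not the sampled taps, are the learned parameters, k_coarse and k_fine represent the same continuous kernel at two step sizes.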


Event-Based Eye Tracking. AIS 2024 Challenge Survey

arXiv.org Artificial Intelligence

This survey reviews the AIS 2024 Event-Based Eye Tracking (EET) Challenge. The challenge task focuses on processing eye movements recorded with event cameras and predicting the pupil center of the eye. The challenge emphasizes efficient eye tracking with event cameras, to achieve a good trade-off between task accuracy and efficiency. During the challenge period, 38 participants registered for the Kaggle competition, and 8 teams submitted a challenge factsheet. The novel and diverse methods from the submitted factsheets are reviewed and analyzed in this survey to advance future event-based eye tracking research.


A Lightweight Spatiotemporal Network for Online Eye Tracking with Event Camera

arXiv.org Artificial Intelligence

Event-based data are commonly encountered in edge computing environments where efficiency and low latency are critical. To interface with such data and leverage their rich temporal features, we propose a causal spatiotemporal convolutional network. This solution targets efficient implementation on resource-limited, edge-appropriate hardware in three ways: 1) it deliberately uses a simple architecture and set of operations (convolutions, ReLU activations); 2) it can be configured to perform online inference efficiently via buffering of layer outputs; and 3) it can achieve more than 90% activation sparsity through regularization during training, enabling very significant efficiency gains on event-based processors. In addition, we propose a general affine augmentation strategy acting directly on the events, which alleviates the problem of dataset scarcity for event-based systems. We apply our model to the AIS 2024 event-based eye tracking challenge, reaching a score of 0.9916 p10 accuracy on the Kaggle private test set.
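Point 2), online inference via buffering of layer outputs, amounts to each layer keeping a FIFO of its most recent inputs so that a new frame costs one small incremental computation rather than re-running the layer over the whole history. A single-layer sketch (the class name, shapes, and einsum formulation are illustrative assumptions):

import numpy as np

class CausalConvBuffer:
    """Streaming (online) evaluation of a 1D causal convolution.
    Keeps the last kernel_size inputs in a FIFO buffer so each new
    frame costs one small contraction instead of re-running the layer
    over the whole history."""

    def __init__(self, weight):           # weight: (out_ch, in_ch, kernel_size)
        self.weight = weight
        out_ch, in_ch, ksize = weight.shape
        self.buf = np.zeros((in_ch, ksize))  # zero history = causal padding

    def step(self, x):                    # x: (in_ch,) one new frame
        self.buf = np.roll(self.buf, -1, axis=1)
        self.buf[:, -1] = x               # newest frame at the end
        # y[o] = sum_{i, t} weight[o, i, t] * buf[i, t]
        return np.einsum("oit,it->o", self.weight, self.buf)

# Usage: feed frames one at a time; stacking layers chains .step() calls.
w = np.random.randn(8, 4, 5)              # 8 out, 4 in, kernel size 5
layer = CausalConvBuffer(w)
for frame in np.random.randn(10, 4):
    y = layer.step(frame)                 # (8,) output per incoming frame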


Mode-Assisted Unsupervised Learning of Restricted Boltzmann Machines

arXiv.org Machine Learning

Restricted Boltzmann machines (RBMs) are a powerful class of generative models, but their training requires computing a gradient that, unlike supervised backpropagation on typical loss functions, is notoriously difficult even to approximate. Here, we show that properly combining standard gradient updates with an off-gradient direction, constructed from samples of the RBM ground state (mode), improves their training dramatically over traditional gradient methods. This approach, which we call mode training, promotes faster training and stability, in addition to lower converged relative entropy (KL divergence). Along with proofs of stability and convergence of this method, we also demonstrate its efficacy on synthetic datasets where we can compute KL divergences exactly, as well as on a larger machine learning standard, MNIST. The mode training we suggest is quite versatile, as it can be applied in conjunction with any given gradient method, and is easily extended to more general energy-based neural network structures such as deep, convolutional, and unrestricted Boltzmann machines.
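In outline, a mode-assisted update mixes the standard contrastive-divergence step with an occasional step whose negative statistics come from the RBM's ground state (mode), driving the mode toward the data. The sketch below is a simplified rendering of that idea: biases are omitted, the mode finder is left as a placeholder, and the mixing probability and learning rate are assumptions.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_negative(W, v):
    """Standard CD-1 negative statistics from a data batch v of shape (n, nv)."""
    h = (sigmoid(v @ W) > np.random.rand(v.shape[0], W.shape[1])).astype(float)
    v_neg = (sigmoid(h @ W.T) > np.random.rand(*v.shape)).astype(float)
    h_neg = sigmoid(v_neg @ W)
    return v_neg.T @ h_neg / len(v)

def mode_update(W, v_data, find_mode, p_mode=0.1, lr=1e-2):
    """One weight update of mode-assisted training (sketch).
    With probability p_mode, the negative phase uses the RBM's ground
    state (mode) instead of a CD sample."""
    pos = v_data.T @ sigmoid(v_data @ W) / len(v_data)  # data statistics
    if np.random.rand() < p_mode:
        v_m, h_m = find_mode(W)           # placeholder: ground state of the energy
        neg = np.outer(v_m, h_m)          # mode statistics
    else:
        neg = cd1_negative(W, v_data)     # usual CD-1 statistics
    return W + lr * (pos - neg)

The find_mode routine is deliberately unspecified here; in practice it would be some discrete optimizer over the RBM energy, and the schedule for p_mode matters for the stability results the abstract mentions.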


Generating Weighted MAX-2-SAT Instances of Tunable Difficulty with Frustrated Loops

arXiv.org Machine Learning

Many optimization problems can be cast into the maximum satisfiability (MAX-SAT) form, and many solvers have been developed for tackling such problems. To evaluate the performance of a MAX-SAT solver, it is convenient to generate difficult MAX-SAT instances with solutions known in advance. Here, we propose a method of generating weighted MAX-2-SAT instances inspired by the frustrated-loop algorithm used by the quantum annealing community to generate Ising spin-glass instances with nearest-neighbor coupling. Our algorithm extends to instances whose underlying coupling graph is general, though we focus here on the case of bipartite coupling, with the associated energy being the restricted Boltzmann machine (RBM) energy. It is shown that any MAX-2-SAT problem can be reduced to the problem of minimizing an RBM energy over the nodal values. The algorithm is designed such that the difficulty of the generated instances can be tuned through a central parameter known as the frustration index. Two versions of the algorithm are presented: the random- and structured-loop algorithms. For the random-loop algorithm, we provide a thorough theoretical and empirical analysis of its mathematical properties from the perspective of frustration, and observe empirically, using simulated annealing, a double phase transition in the difficulty scaling driven by the frustration index. For the structured-loop algorithm, we show that it offers an improvement in the difficulty of the generated instances over the random-loop algorithm, with the improvement factor scaling super-exponentially with respect to the frustration index for instances at high loop density. At the end of the paper, we provide a brief discussion of the relevance of this work to the pre-training of RBMs.
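The reduction mentioned in the abstract rests on a standard observation: a weighted clause (l_i OR l_j) is violated exactly when both literals are false, so its weight-w penalty is a quadratic monomial in the binary variables, and the total violated weight is a quadratic pseudo-Boolean energy. A tiny sketch of this encoding, with brute-force minimization purely for illustration (the clause representation below is the standard penalty construction, not necessarily the paper's exact mapping):

import itertools

def max2sat_energy(clauses, x):
    """Energy = total weight of violated clauses.
    clauses: list of (w, i, si, j, sj), where si, sj in {+1, -1} give the
    sign of variables i and j in the clause; x: 0/1 assignment, indexable.
    Clause (l_i OR l_j) is violated iff both literals are false, so its
    penalty w * f_i * f_j is quadratic in the binary variables."""
    def false_lit(i, s):                  # 1 if the literal is false under x
        return 1 - x[i] if s > 0 else x[i]
    return sum(w * false_lit(i, si) * false_lit(j, sj)
               for (w, i, si, j, sj) in clauses)

# Tiny instance: (x0 or x1, weight 2) and (not x0 or x1, weight 1).
clauses = [(2, 0, +1, 1, +1), (1, 0, -1, 1, +1)]
best = min(itertools.product([0, 1], repeat=2),
           key=lambda x: max2sat_energy(clauses, x))
print(best, max2sat_energy(clauses, best))   # (0, 1) with energy 0

Minimizing this quadratic energy over the nodal values is equivalent to solving the weighted MAX-2-SAT instance; when the variables split into two layers with couplings only across layers, the energy is of bipartite (RBM) form, which is the case the abstract focuses on.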