AITopics | Wang, Shida

Collaborating Authors

Wang, Shida

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Learning Relational Tabular Data without Shared Features

Wu, Zhaomin, Wang, Shida, Wang, Ziyang, He, Bingsheng

arXiv.org Artificial IntelligenceFeb-14-2025

Learning relational tabular data has gained significant attention recently, but most studies focus on single tables, overlooking the potential of cross-table learning. Cross-table learning, especially in scenarios where tables lack shared features and pre-aligned data, offers vast opportunities but also introduces substantial challenges. The alignment space is immense, and determining accurate alignments between tables is highly complex. We propose Latent Entity Alignment Learning (Leal), a novel framework enabling effective cross-table training without requiring shared features or pre-aligned data. Leal operates on the principle that properly aligned data yield lower loss than misaligned data, a concept embodied in its soft alignment mechanism. This mechanism is coupled with a differentiable cluster sampler module, ensuring efficient scaling to large relational tables. Furthermore, we provide a theoretical proof of the cluster sampler's approximation capacity. Extensive experiments on five real-world and five synthetic datasets show that Leal achieves up to a 26.8% improvement in predictive performance compared to state-of-the-art methods, demonstrating its effectiveness and scalability.

artificial intelligence, data mining, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2502.10125

Country: Asia > Singapore (0.15)

Genre:

Research Report > Promising Solution (0.48)
Research Report > New Finding (0.46)

Industry: Banking & Finance (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science > Data Mining (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

LongSSM: On the Length Extension of State-space Models in Language Modelling

Wang, Shida

arXiv.org Artificial IntelligenceJun-4-2024

In this paper, we investigate the length-extension of state-space models (SSMs) in language modeling. Length extension involves training models on short sequences and testing them on longer ones. We show that state-space models trained with zero hidden states initialization have difficulty doing length extension. We explain this difficulty by pointing out the length extension is equivalent to polynomial extrapolation. Based on the theory, we propose a simple yet effective method - changing the hidden states initialization scheme - to improve the length extension. Moreover, our method shows that using long training sequence length is beneficial but not necessary to length extension. Changing the hidden state initialization enables the efficient training of long-memory model with a smaller training context length.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2406.0208

Country: North America > United States (0.14)

Genre: Research Report (0.82)

Industry: Education (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

Integrating Deep Learning and Synthetic Biology: A Co-Design Approach for Enhancing Gene Expression via N-terminal Coding Sequences

Yan, Zhanglu, Chu, Weiran, Sheng, Yuhua, Tang, Kaiwen, Wang, Shida, Liu, Yanfeng, Wong, Weng-Fai

arXiv.org Artificial IntelligenceFeb-20-2024

N-terminal coding sequence (NCS) influences gene expression by impacting the translation initiation rate. The NCS optimization problem is to find an NCS that maximizes gene expression. The problem is important in genetic engineering. However, current methods for NCS optimization such as rational design and statistics-guided approaches are labor-intensive yield only relatively small improvements. This paper introduces a deep learning/synthetic biology co-designed few-shot training workflow for NCS optimization. Our method utilizes k-nearest encoding followed by word2vec to encode the NCS, then performs feature extraction using attention mechanisms, before constructing a time-series network for predicting gene expression intensity, and finally a direct search algorithm identifies the optimal NCS with limited training data. We took green fluorescent protein (GFP) expressed by Bacillus subtilis as a reporting protein of NCSs, and employed the fluorescence enhancement factor as the metric of NCS optimization. Within just six iterative experiments, our model generated an NCS (MLD62) that increased average GFP expression by 5.41-fold, outperforming the state-of-the-art NCS designs. Extending our findings beyond GFP, we showed that our engineered NCS (MLD62) can effectively boost the production of N-acetylneuraminic acid by enhancing the expression of the crucial rate-limiting GNA1 gene, demonstrating its practical utility. We have open-sourced our NCS expression database and experimental procedures for public use.

artificial intelligence, expression, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2402.13297

Country: Asia > China (0.14)

Genre: Research Report > New Finding (0.88)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.96)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Inverse Approximation Theory for Nonlinear Recurrent Neural Networks

Wang, Shida, Li, Zhong, Li, Qianxiao

arXiv.org Artificial IntelligenceFeb-6-2024

We prove an inverse approximation theorem for the approximation of nonlinear sequence-to-sequence relationships using recurrent neural networks (RNNs). This is a so-called Bernstein-type result in approximation theory, which deduces properties of a target function under the assumption that it can be effectively approximated by a hypothesis space. In particular, we show that nonlinear sequence relationships that can be stably approximated by nonlinear RNNs must have an exponential decaying memory structure - a notion that can be made precise. This extends the previously identified curse of memory in linear RNNs into the general nonlinear setting, and quantifies the essential limitations of the RNN architecture for learning sequential relationships with long-term memory. Based on the analysis, we propose a principled reparameterization method to overcome the limitations. Our theoretical results are confirmed by numerical experiments. The code has been released in https://github.com/radarFudan/Curse-of-memory

approximation, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2305.1919

Country:

Asia (0.14)
Europe > Portugal (0.14)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

StableSSM: Alleviating the Curse of Memory in State-space Models through Stable Reparameterization

Wang, Shida, Li, Qianxiao

arXiv.org Artificial IntelligenceNov-24-2023

In this paper, we investigate the long-term memory learning capabilities of state-space models (SSMs) from the perspective of parameterization. We prove that state-space models without any reparameterization exhibit a memory limitation similar to that of traditional RNNs: the target relationships that can be stably approximated by state-space models must have an exponential decaying memory. Our analysis identifies this "curse of memory" as a result of the recurrent weights converging to a stability boundary, suggesting that a reparameterization technique can be effective. To this end, we introduce a class of reparameterization techniques for SSMs that effectively lift its memory limitations. Besides improving approximation capabilities, we further illustrate that a principled choice of reparameterization scheme can also enhance optimization stability. We validate our findings using synthetic datasets and language models.

artificial intelligence, machine learning, reparameterization, (16 more...)

arXiv.org Artificial Intelligence

2311.14495

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

State-space Models with Layer-wise Nonlinearity are Universal Approximators with Exponential Decaying Memory

Wang, Shida, Xue, Beichen

arXiv.org Artificial IntelligenceNov-1-2023

State-space models have gained popularity in sequence modelling due to their simple and efficient network structures. However, the absence of nonlinear activation along the temporal direction limits the model's capacity. In this paper, we prove that stacking state-space models with layer-wise nonlinear activation is sufficient to approximate any continuous sequence-to-sequence relationship. Our findings demonstrate that the addition of layer-wise nonlinear activation enhances the model's capacity to learn complex sequence patterns. Meanwhile, it can be seen both theoretically and empirically that the state-space models do not fundamentally resolve the issue of exponential decaying memory. Theoretical results are justified by numerical verifications.

exponential decaying memory, layer-wise nonlinearity, universal approximator, (1 more...)

arXiv.org Artificial Intelligence

2309.13414

Genre: Research Report > New Finding (0.53)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)

Add feedback

Efficient Hyperdimensional Computing

Yan, Zhanglu, Wang, Shida, Tang, Kaiwen, Wong, Weng-Fai

arXiv.org Artificial IntelligenceOct-12-2023

Hyperdimensional computing (HDC) is a method to perform classification that uses binary vectors with high dimensions and the majority rule. This approach has the potential to be energy-efficient and hence deemed suitable for resource-limited platforms due to its simplicity and massive parallelism. However, in order to achieve high accuracy, HDC sometimes uses hypervectors with tens of thousands of dimensions. This potentially negates its efficiency advantage. In this paper, we examine the necessity of such high dimensions and conduct a detailed theoretical analysis of the relationship between hypervector dimensions and accuracy. Our results demonstrate that as the dimension of the hypervectors increases, the worst-case/average-case HDC prediction accuracy with the majority rule decreases. Building on this insight, we develop HDC models that use binary hypervectors with dimensions orders of magnitude lower than those of state-of-the-art HDC models while maintaining equivalent or even improved accuracy and efficiency. For instance, on the MNIST dataset, we achieve 91.12% HDC accuracy in image classification with a dimension of only 64. Our methods perform operations that are only 0.35% of other HDC models with dimensions of 10,000. Furthermore, we evaluate our methods on ISOLET, UCI-HAR, and Fashion-MNIST datasets and investigate the limits of HDC computing.

artificial intelligence, dimension, machine learning, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.1007/978-3-031-43415-0_9

2301.10902

Country: Asia > Singapore (0.14)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Add feedback

HyperSNN: A new efficient and robust deep learning model for resource constrained control applications

Yan, Zhanglu, Wang, Shida, Tang, Kaiwen, Wong, Weng-Fai

arXiv.org Artificial IntelligenceAug-17-2023

In light of the increasing adoption of edge computing in areas such as intelligent furniture, robotics, and smart homes, this paper introduces HyperSNN, an innovative method for control tasks that uses spiking neural networks (SNNs) in combination with hyperdimensional computing. HyperSNN substitutes expensive 32-bit floating point multiplications with 8-bit integer additions, resulting in reduced energy consumption while enhancing robustness and potentially improving accuracy. Our model was tested on AI Gym benchmarks, including Cartpole, Acrobot, MountainCar, and Lunar Lander. HyperSNN achieves control accuracies that are on par with conventional machine learning methods but with only 1.36% to 9.96% of the energy expenditure. Furthermore, our experiments showed increased robustness when using HyperSNN. We believe that HyperSNN is especially suitable for interactive, mobile, and wearable devices, promoting energy-efficient and robust system design. Furthermore, it paves the way for the practical implementation of complex algorithms like model predictive control (MPC) in real-world industrial scenarios.

artificial intelligence, deep learning, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2308.08222

Country:

Asia > Singapore (0.15)
North America > United States (0.14)

Genre: Research Report > New Finding (0.88)

Industry:

Information Technology (1.00)
Health & Medicine (1.00)
Energy > Oil & Gas > Upstream (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Improve Long-term Memory Learning Through Rescaling the Error Temporally

Wang, Shida, Yan, Zhanglu

arXiv.org Artificial IntelligenceJul-21-2023

This paper studies the error metric selection for long-term memory learning in sequence modelling. We examine the bias towards short-term memory in commonly used errors, including mean absolute/squared error. Our findings show that all temporally positive-weighted errors are biased towards short-term memory in learning linear functionals. To reduce this bias and improve long-term memory learning, we propose the use of a temporally rescaled error. In addition to reducing the bias towards short-term memory, this approach can also alleviate the vanishing gradient issue. We conduct numerical experiments on different long-memory tasks and sequence models to validate our claims. Numerical results confirm the importance of appropriate temporally rescaled error for effective long-term memory learning. To the best of our knowledge, this is the first work that quantitatively analyzes different errors' memory bias towards short-term memory in sequence modelling.

artificial intelligence, machine learning, natural language, (15 more...)

arXiv.org Artificial Intelligence

2307.11462

Country:

Asia > China (0.14)
Europe > Portugal (0.14)
Europe > Germany (0.14)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.72)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.68)

Add feedback

A Brief Survey on the Approximation Theory for Sequence Modelling

Jiang, Haotian, Li, Qianxiao, Li, Zhong, Wang, Shida

arXiv.org Artificial IntelligenceFeb-27-2023

The modelling of relationships between sequences is an important task that enables a wide array of applications, including classical time-series prediction problems in finance [1], and modern machine learning problems in natural language processing [2]. Another class of engineering applications involving sequential relationships are control systems, which study the dependence of a dynamical trajectory on an input control sequence [3]. In general, sequence-to-sequence relationships can be very complex. For example, when the index set for the sequences is infinite, one can understand these relationships as mappings between infinite-dimensional spaces. Thus, traditional modelling techniques are limited in their efficacy, especially when there is little prior knowledge on the system of interest.

artificial intelligence, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2302.13752

Country:

North America > United States (0.46)
Europe (0.46)

Genre: Overview (1.00)

Industry: Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback