AITopics | Tang, Kaiwen

Tang, Kaiwen

Sorbet: A Neuromorphic Hardware-Compatible Transformer-Based Spiking Language Model

Tang, Kaiwen, Yan, Zhanglu, Wong, Weng-Fai

arXiv.org Artificial Intelligence, Sep-4-2024

For reasons such as privacy, there are use cases for language models at the edge. This has given rise to small language models (SLMs) targeted for deployment in resource-constrained devices where energy efficiency is a significant concern. Spiking neural networks (SNNs) offer a promising solution due to their energy efficiency, and there are already works on realizing transformer-based models on SNNs. However, key operations like softmax and layer normalization (LN) are difficult to implement on neuromorphic hardware, and many of these early works sidestepped them. To address these challenges, we introduce Sorbet, a transformer-based spiking language model that is more neuromorphic hardware-compatible. Sorbet incorporates a novel shifting-based softmax called PTsoftmax and a power normalization method using bit-shifting (BSPN), both designed to replace the respective energy-intensive operations. By leveraging knowledge distillation and model quantization, Sorbet achieved a highly compressed binary weight model that maintains competitive performance while significantly reducing energy consumption. We validate Sorbet's effectiveness through extensive testing on the GLUE benchmark and a series of ablation studies, demonstrating its potential as an energy-efficient solution for language model inference.
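The abstract above mentions a shifting-based replacement for softmax. As a rough illustration of how a softmax-like normalization can be built from bit shifts alone, here is a minimal Python sketch. It is not the paper's PTsoftmax: the function name `shift_softmax`, the use of base-2 instead of base-e exponentials, the integer-logit input, and the rounding of the denominator up to a power of two are all assumptions made for this example.

```python
def shift_softmax(logits, frac_bits=16):
    """Softmax-like normalization using only shifts and additions.

    Assumptions (illustrative, not taken from the paper):
    - logits are integers (e.g. already quantized),
    - 2**x stands in for e**x,
    - the denominator is rounded up to the next power of two,
      so the division becomes a right shift.
    Returns fixed-point integers with `frac_bits` fractional bits.
    """
    m = max(logits)
    one = 1 << frac_bits
    # 2**(x - m) in fixed point: the largest logit maps to `one`,
    # every other logit is `one` shifted right by (m - x) >= 0.
    nums = [one >> (m - x) for x in logits]
    total = sum(nums)
    # Smallest k with 2**k >= total; total >= one, so k >= frac_bits.
    k = (total - 1).bit_length()
    # Dividing by 2**k while keeping frac_bits fractional bits
    # is a right shift by (k - frac_bits).
    shift = k - frac_bits
    return [n >> shift for n in nums]
```

Because the denominator is rounded up, the outputs sum to at most 1.0 in fixed point (that is, at most `1 << frac_bits`), so this approximates rather than reproduces a true softmax.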

large language model, machine learning, natural language, (18 more...)

arXiv: 2409.15298

Country: Asia > Singapore (0.14)

Genre: Research Report > Promising Solution (0.66)

Industry:

Energy (0.69)
Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Integrating Deep Learning and Synthetic Biology: A Co-Design Approach for Enhancing Gene Expression via N-terminal Coding Sequences

Yan, Zhanglu, Chu, Weiran, Sheng, Yuhua, Tang, Kaiwen, Wang, Shida, Liu, Yanfeng, Wong, Weng-Fai

arXiv.org Artificial Intelligence, Feb-20-2024

The N-terminal coding sequence (NCS) influences gene expression by impacting the translation initiation rate. The NCS optimization problem is to find an NCS that maximizes gene expression. The problem is important in genetic engineering. However, current methods for NCS optimization, such as rational design and statistics-guided approaches, are labor-intensive and yield only relatively small improvements. This paper introduces a deep learning/synthetic biology co-designed few-shot training workflow for NCS optimization. Our method utilizes k-nearest encoding followed by word2vec to encode the NCS, then performs feature extraction using attention mechanisms, before constructing a time-series network for predicting gene expression intensity, and finally a direct search algorithm identifies the optimal NCS with limited training data. We took green fluorescent protein (GFP) expressed by Bacillus subtilis as a reporting protein of NCSs, and employed the fluorescence enhancement factor as the metric of NCS optimization. Within just six iterative experiments, our model generated an NCS (MLD62) that increased average GFP expression by 5.41-fold, outperforming the state-of-the-art NCS designs. Extending our findings beyond GFP, we showed that our engineered NCS (MLD62) can effectively boost the production of N-acetylneuraminic acid by enhancing the expression of the crucial rate-limiting GNA1 gene, demonstrating its practical utility. We have open-sourced our NCS expression database and experimental procedures for public use.

artificial intelligence, expression, machine learning, (19 more...)

arXiv: 2402.13297

Country: Asia > China (0.14)

Genre: Research Report > New Finding (0.88)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.96)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Efficient Hyperdimensional Computing

Yan, Zhanglu, Wang, Shida, Tang, Kaiwen, Wong, Weng-Fai

arXiv.org Artificial Intelligence, Oct-12-2023

Hyperdimensional computing (HDC) is a method to perform classification that uses binary vectors with high dimensions and the majority rule. This approach has the potential to be energy-efficient and is hence deemed suitable for resource-limited platforms due to its simplicity and massive parallelism. However, in order to achieve high accuracy, HDC sometimes uses hypervectors with tens of thousands of dimensions. This potentially negates its efficiency advantage. In this paper, we examine the necessity of such high dimensions and conduct a detailed theoretical analysis of the relationship between hypervector dimensions and accuracy. Our results demonstrate that as the dimension of the hypervectors increases, the worst-case/average-case HDC prediction accuracy with the majority rule decreases. Building on this insight, we develop HDC models that use binary hypervectors with dimensions orders of magnitude lower than those of state-of-the-art HDC models while maintaining equivalent or even improved accuracy and efficiency. For instance, on the MNIST dataset, we achieve 91.12% HDC accuracy in image classification with a dimension of only 64. Our methods require only 0.35% of the operations of other HDC models with dimensions of 10,000. Furthermore, we evaluate our methods on the ISOLET, UCI-HAR, and Fashion-MNIST datasets and investigate the limits of HDC computing.
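The abstract above describes HDC as classification with binary hypervectors and the majority rule. The following Python sketch shows that general idea in its simplest form; it is not the authors' model. The function names, the tie-breaking rule (ties resolve to 1), and nearest-prototype classification by Hamming distance are assumptions chosen for illustration, and the step that encodes raw inputs into hypervectors is left out entirely.

```python
def majority_bundle(vectors):
    """Combine binary hypervectors with a bitwise majority vote.

    Each position of the result is 1 when at least half of the
    input vectors have a 1 there (ties resolve to 1, an assumption).
    """
    n = len(vectors)
    return [1 if 2 * sum(bits) >= n else 0 for bits in zip(*vectors)]


def hamming(a, b):
    """Number of positions where two binary vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def train(samples_by_label):
    """Build one prototype hypervector per class by majority bundling."""
    return {label: majority_bundle(vs) for label, vs in samples_by_label.items()}


def classify(query, prototypes):
    """Return the label whose prototype is closest in Hamming distance."""
    return min(prototypes, key=lambda label: hamming(query, prototypes[label]))
```

For example, `train({"a": [[0, 0, 0, 1], [0, 0, 0, 0], [0, 1, 0, 0]], "b": [[1, 1, 1, 1], [1, 0, 1, 1], [1, 1, 1, 0]]})` yields one 4-bit prototype per class, and `classify` then assigns a query to whichever prototype it differs from in the fewest bits. All operations are bit comparisons and integer additions, which is the property that makes the dimension of the hypervectors the main cost factor.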

artificial intelligence, dimension, machine learning, (16 more...)

doi: 10.1007/978-3-031-43415-0_9

arXiv: 2301.10902

Country: Asia > Singapore (0.14)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

HyperSNN: A new efficient and robust deep learning model for resource constrained control applications

Yan, Zhanglu, Wang, Shida, Tang, Kaiwen, Wong, Weng-Fai

arXiv.org Artificial Intelligence, Aug-17-2023

In light of the increasing adoption of edge computing in areas such as intelligent furniture, robotics, and smart homes, this paper introduces HyperSNN, an innovative method for control tasks that uses spiking neural networks (SNNs) in combination with hyperdimensional computing. HyperSNN substitutes expensive 32-bit floating point multiplications with 8-bit integer additions, resulting in reduced energy consumption while enhancing robustness and potentially improving accuracy. Our model was tested on AI Gym benchmarks, including Cartpole, Acrobot, MountainCar, and Lunar Lander. HyperSNN achieves control accuracies that are on par with conventional machine learning methods but with only 1.36% to 9.96% of the energy expenditure. Furthermore, our experiments showed increased robustness when using HyperSNN. We believe that HyperSNN is especially suitable for interactive, mobile, and wearable devices, promoting energy-efficient and robust system design. Furthermore, it paves the way for the practical implementation of complex algorithms like model predictive control (MPC) in real-world industrial scenarios.

artificial intelligence, deep learning, machine learning, (16 more...)

arXiv: 2308.08222

Country:

Asia > Singapore (0.15)
North America > United States (0.14)

Genre: Research Report > New Finding (0.88)

Industry:

Information Technology (1.00)
Health & Medicine (1.00)
Energy > Oil & Gas > Upstream (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

BertNet: Harvesting Knowledge Graphs with Arbitrary Relations from Pretrained Language Models

Hao, Shibo, Tan, Bowen, Tang, Kaiwen, Ni, Bin, Shao, Xiyan, Zhang, Hengzhe, Xing, Eric P., Hu, Zhiting

arXiv.org Artificial Intelligence, Jun-2-2023

It is crucial to automatically construct knowledge graphs (KGs) of diverse new relations to support knowledge discovery and broad applications. Previous KG construction methods, based on either crowdsourcing or text mining, are often limited to a small predefined set of relations due to manual cost or restrictions in the text corpus. Recent research proposed to use pretrained language models (LMs) as implicit knowledge bases that accept knowledge queries with prompts. Yet, the implicit knowledge lacks many desirable properties of a full-scale symbolic KG, such as easy access, navigation, editing, and quality assurance. In this paper, we propose a new approach of harvesting massive KGs of arbitrary relations from pretrained LMs. With minimal input of a relation definition (a prompt and a few example entity pairs), the approach efficiently searches in the vast entity pair space to extract diverse, accurate knowledge of the desired relation. We develop an effective search-and-rescore mechanism for improved efficiency and accuracy. We deploy the approach to harvest KGs of over 400 new relations from different LMs. Extensive human and automatic evaluations show our approach manages to extract diverse, accurate knowledge, including tuples of complex relations (e.g., "A is capable of but not good at B"). The resulting KGs, as a symbolic interpretation of the source LMs, also reveal new insights into the LMs' knowledge capacities.

data mining, machine learning, relation, (24 more...)

arXiv: 2206.14268

Country: North America > United States (0.93)

Genre: Research Report (0.82)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (0.71)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (0.68)
