
Collaborating Authors

Niu, Xin


Self-Evolution Learning for Mixup: Enhance Data Augmentation on Few-Shot Text Classification Tasks

arXiv.org Artificial Intelligence

Text classification tasks often encounter few-shot scenarios with limited labeled data, so addressing data scarcity is crucial. Data augmentation with mixup has been shown to be effective on various text classification tasks. However, most mixup methods do not consider the varying degree of learning difficulty at different stages of training, and they generate new samples with one-hot labels, resulting in model overconfidence. In this paper, we propose a self-evolution learning (SE) based mixup approach for data augmentation in text classification, which can generate more adaptive and model-friendly pseudo samples for model training. SE focuses on the variation of the model's learning ability. To alleviate model overconfidence, we introduce a novel instance-specific label smoothing approach, which linearly interpolates the model's output and the one-hot labels of the original samples to generate new soft labels for label mixing up. Through experimental analysis, we demonstrate that, in addition to improving classification accuracy, SE also enhances the model's generalization ability.
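As a rough illustration of the instance-specific label smoothing described above, the following Python sketch linearly interpolates a one-hot label with the model's own output distribution to produce soft targets for mixup. The function name and the fixed interpolation weight alpha are illustrative assumptions, not the paper's released code.

import torch
import torch.nn.functional as F

def instance_specific_soft_labels(logits, labels, num_classes, alpha=0.1):
    """Blend one-hot labels with the model's predicted distribution."""
    one_hot = F.one_hot(labels, num_classes).float()
    probs = F.softmax(logits.detach(), dim=-1)  # model's current belief
    # alpha is a hypothetical smoothing weight; the paper adapts this per instance.
    return (1 - alpha) * one_hot + alpha * probs

# Example: a batch of 4 samples with 3 classes.
logits = torch.randn(4, 3)
labels = torch.tensor([0, 2, 1, 0])
soft = instance_specific_soft_labels(logits, labels, num_classes=3)
print(soft.sum(dim=-1))  # each soft label still sums to 1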


EMQ: Evolving Training-free Proxies for Automated Mixed Precision Quantization

arXiv.org Artificial Intelligence

Mixed-Precision Quantization (MQ) can achieve a competitive accuracy-complexity trade-off for models. Conventional training-based search methods require time-consuming candidate training to find optimized per-layer bit-width configurations in MQ. Recently, some training-free approaches have proposed various MQ proxies and significantly improved search efficiency. However, the correlation between these proxies and quantization accuracy is poorly understood. To address this gap, we first build MQ-Bench-101, which covers different bit configurations and their quantization results. We then observe that existing training-free proxies correlate only weakly with quantization accuracy on MQ-Bench-101. To efficiently discover superior proxies, we develop a framework that automatically searches for MQ proxies via an evolutionary algorithm. In particular, we devise an elaborate search space involving the existing proxies and perform an evolutionary search to discover the best-correlated MQ proxy. We propose a diversity-prompting selection strategy and a compatibility screening protocol to avoid premature convergence and improve search efficiency. In this way, our Evolving proxies for Mixed-precision Quantization (EMQ) framework allows the auto-generation of proxies without heavy tuning or expert knowledge. Extensive experiments on ImageNet with various ResNet and MobileNet families demonstrate that EMQ outperforms state-of-the-art mixed-precision methods at a significantly reduced cost. The code will be released.
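The core loop of such a proxy search can be sketched as follows: score each candidate proxy by its rank correlation with measured quantization accuracy, then evolve the best-correlated candidate. Everything here, including the toy statistics, the synthetic accuracy data, and the mutation rule, is a placeholder assumption, not the paper's search space or MQ-Bench-101.

import random
from scipy.stats import spearmanr

# Toy benchmark: per-configuration statistics and a synthetic "measured" accuracy.
configs = [{"weight_norm": random.random(), "grad_norm": random.random()}
           for _ in range(50)]
accuracy = [0.7 * c["weight_norm"] + 0.3 * random.random() for c in configs]

def fitness(proxy):
    """Spearman rank correlation between proxy scores and measured accuracy."""
    scores = [proxy(c) for c in configs]
    return spearmanr(scores, accuracy).correlation

# A tiny proxy population; a real search space combines many more statistics.
population = [
    lambda c: c["weight_norm"],
    lambda c: c["grad_norm"],
    lambda c: c["weight_norm"] * c["grad_norm"],
    lambda c: c["weight_norm"] + c["grad_norm"],
]

for gen in range(5):
    population.sort(key=fitness, reverse=True)
    best = population[0]
    # Replace the worst proxy with a mutated copy of the best (toy mutation rule).
    scale = random.uniform(0.5, 2.0)
    population[-1] = lambda c, s=scale, p=best: p(c) * s + c["grad_norm"]
    print(f"gen {gen}: best correlation = {fitness(best):.3f}")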


Multi-trial Neural Architecture Search with Lottery Tickets

arXiv.org Artificial Intelligence

Neural architecture search (NAS) has brought significant progress to recent image recognition tasks. Most existing NAS methods apply restricted search spaces, which limits the upper-bound performance of the searched models. To address this issue, we propose a new search space named MobileNet3-MT. By reducing human prior knowledge across all dimensions of the networks, MobileNet3-MT accommodates more potential candidates. For searching in this challenging space, we present an efficient Multi-trial Evolution-based NAS method termed MENAS. Specifically, we accelerate the evolutionary search process by gradually pruning the models in the population: each model is trained with early stopping and then replaced by its Lottery Ticket (the explored optimal pruned network). In this way, we avoid fully training cumbersome networks and automatically generate more efficient ones. Extensive experimental results on ImageNet-1K, CIFAR-10, and CIFAR-100 demonstrate that MENAS achieves state-of-the-art performance.
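The lottery-ticket replacement step can be sketched roughly as below: train a candidate briefly, prune its smallest-magnitude weights, and rewind the survivors to their initial values. The toy model, pruning ratio, and training loop are assumptions for illustration, not MENAS's exact procedure.

import torch
import torch.nn as nn

model = nn.Linear(32, 10)
init_state = {k: v.clone() for k, v in model.state_dict().items()}

# Early-stopped training stand-in: a few steps on random data.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
for _ in range(10):
    loss = model(torch.randn(8, 32)).pow(2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Prune: keep the largest 50% of weights by magnitude.
w = model.weight.data
threshold = w.abs().flatten().kthvalue(w.numel() // 2).values
mask = (w.abs() > threshold).float()

# Rewind the surviving weights to initialization (the "winning ticket").
model.load_state_dict(init_state)
model.weight.data *= mask
print(f"kept {int(mask.sum())} of {mask.numel()} weights")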


DMFormer: Closing the Gap Between CNN and Vision Transformers

arXiv.org Artificial Intelligence

Vision transformers have shown excellent performance in computer vision tasks. Because the computational cost of their self-attention mechanism is high, recent works have tried to replace the self-attention mechanism in vision transformers with convolutional operations, which are more efficient and carry a built-in inductive bias. However, these efforts either ignore multi-level features or lack dynamic properties, leading to sub-optimal performance. In this paper, we propose a Dynamic Multi-level Attention mechanism (DMA), which captures different patterns of input images through multiple kernel sizes and enables input-adaptive weights via a gating mechanism. Based on DMA, we present an efficient backbone network named DMFormer. DMFormer adopts the overall architecture of vision transformers while replacing the self-attention mechanism with our proposed DMA. Extensive experimental results on the ImageNet-1K and ADE20K datasets demonstrate that DMFormer achieves state-of-the-art performance, outperforming similarly sized vision transformers (ViTs) and convolutional neural networks (CNNs).
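A minimal sketch of such a block is given below: parallel depthwise convolutions with different kernel sizes, fused by input-adaptive gates computed from pooled features. The specific layers, kernel sizes, and class name are assumptions in the spirit of DMA, not the paper's exact design.

import torch
import torch.nn as nn

class DynamicMultiLevelMixer(nn.Module):
    def __init__(self, dim, kernel_sizes=(3, 5, 7)):
        super().__init__()
        # One depthwise convolution per kernel size captures multi-level patterns.
        self.branches = nn.ModuleList(
            nn.Conv2d(dim, dim, k, padding=k // 2, groups=dim)
            for k in kernel_sizes
        )
        # Gating: pooled features -> one weight per branch, per sample.
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(dim, len(kernel_sizes), 1),
        )

    def forward(self, x):
        gates = torch.softmax(self.gate(x), dim=1)  # (B, n_branches, 1, 1)
        outs = [branch(x) for branch in self.branches]
        return sum(gates[:, i:i + 1] * out for i, out in enumerate(outs))

# Example: a drop-in token mixer for a 64-channel feature map.
block = DynamicMultiLevelMixer(dim=64)
y = block(torch.randn(2, 64, 56, 56))
print(y.shape)  # torch.Size([2, 64, 56, 56])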


STMARL: A Spatio-Temporal Multi-Agent Reinforcement Learning Approach for Traffic Light Control

arXiv.org Artificial Intelligence

The development of intelligent traffic light control systems is essential for smart transportation management. While some efforts have been made to optimize individual traffic lights in isolation, related studies have largely ignored the fact that multi-intersection traffic lights influence one another spatially and that current traffic light control depends temporally on historical traffic status. To that end, in this paper we propose a novel Spatio-Temporal Multi-Agent Reinforcement Learning (STMARL) framework for effectively capturing the spatio-temporal dependencies of multiple related traffic lights and controlling these lights in a coordinated way. Specifically, we first construct a traffic light adjacency graph based on the spatial structure among the traffic lights. Historical traffic records are then integrated with the current traffic status via a Recurrent Neural Network. Moreover, based on this temporally dependent traffic information, we design a Graph Neural Network based model to represent the relationships among multiple traffic lights, and the decision for each traffic light is made in a distributed way by deep Q-learning. Finally, experimental results on both synthetic and real-world data demonstrate the effectiveness of our STMARL framework, which also provides an insightful understanding of the influence mechanism among multi-intersection traffic lights.
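The per-light decision step might look roughly like the sketch below: each intersection encodes its traffic history with a recurrent network, aggregates messages from its neighbors over the adjacency graph, and greedily selects a phase from per-light Q-values. All shapes, layers, and the single message-passing round are illustrative assumptions, not the paper's architecture.

import torch
import torch.nn as nn

n_lights, state_dim, hidden, n_phases = 4, 8, 16, 4
adj = torch.tensor([[0., 1., 0., 0.],
                    [1., 0., 1., 0.],
                    [0., 1., 0., 1.],
                    [0., 0., 1., 0.]])  # traffic light adjacency graph

encoder = nn.GRU(state_dim, hidden, batch_first=True)  # temporal dependency
message = nn.Linear(hidden, hidden)                    # one GNN message round
q_head = nn.Linear(2 * hidden, n_phases)               # shared per-light Q-head

history = torch.randn(n_lights, 5, state_dim)  # 5 past traffic snapshots per light
_, h = encoder(history)                        # h: (1, n_lights, hidden)
h = h.squeeze(0)

neighbor_sum = adj @ message(h)                # aggregate neighbors' messages
q_values = q_head(torch.cat([h, neighbor_sum], dim=-1))

actions = q_values.argmax(dim=-1)              # distributed greedy phase choice
print(actions)  # one phase index per traffic light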


DropFilter: A Novel Regularization Method for Learning Convolutional Neural Networks

arXiv.org Machine Learning

The past few years have witnessed the rapid development of regularization methods for deep learning models such as fully-connected deep neural networks (DNNs) and Convolutional Neural Networks (CNNs). Most previous methods, such as Dropout, Cutout, and DropBlock, drop features from the input data or hidden layers, while DropConnect drops connections between fully-connected layers. By randomly discarding some features or connections, these methods mitigate overfitting and improve the performance of neural networks. In this paper, we propose two novel regularization methods, namely DropFilter and DropFilter-PLUS, for learning CNNs. Unlike previous methods, DropFilter and DropFilter-PLUS modify the convolution filters themselves. For DropFilter-PLUS, we derive a suitable way to accelerate the learning process based on theoretical analysis. Experimental results on MNIST show that DropFilter and DropFilter-PLUS can improve performance on image classification tasks.
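One plausible reading of filter-level dropping is sketched below: during training, whole convolution filters are randomly zeroed (with inverted-dropout rescaling of the survivors), rather than individual activations as in Dropout. This is an illustrative interpretation, not the exact DropFilter or DropFilter-PLUS update rule.

import torch
import torch.nn as nn
import torch.nn.functional as F

class DropFilterConv2d(nn.Conv2d):
    def __init__(self, *args, drop_prob=0.1, **kwargs):
        super().__init__(*args, **kwargs)
        self.drop_prob = drop_prob  # hypothetical per-filter drop probability

    def forward(self, x):
        weight = self.weight
        if self.training and self.drop_prob > 0:
            # One keep/drop decision per output filter.
            keep = (torch.rand(weight.size(0), device=weight.device)
                    > self.drop_prob).float()
            scale = keep / (1.0 - self.drop_prob)  # inverted-dropout rescaling
            weight = weight * scale.view(-1, 1, 1, 1)
        return F.conv2d(x, weight, self.bias, self.stride,
                        self.padding, self.dilation, self.groups)

conv = DropFilterConv2d(3, 16, kernel_size=3, padding=1, drop_prob=0.2)
y = conv(torch.randn(1, 3, 28, 28))
print(y.shape)  # torch.Size([1, 16, 28, 28])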