AITopics | Wang, Yaonan

Collaborating Authors

Wang, Yaonan

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

TransKD: Transformer Knowledge Distillation for Efficient Semantic Segmentation

Liu, Ruiping, Yang, Kailun, Roitberg, Alina, Zhang, Jiaming, Peng, Kunyu, Liu, Huayao, Wang, Yaonan, Stiefelhagen, Rainer

arXiv.org Artificial IntelligenceDec-24-2023

Semantic segmentation benchmarks in the realm of autonomous driving are dominated by large pre-trained transformers, yet their widespread adoption is impeded by substantial computational costs and prolonged training durations. To lift this constraint, we look at efficient semantic segmentation from a perspective of comprehensive knowledge distillation and consider to bridge the gap between multi-source knowledge extractions and transformer-specific patch embeddings. We put forward the Transformer-based Knowledge Distillation (TransKD) framework which learns compact student transformers by distilling both feature maps and patch embeddings of large teacher transformers, bypassing the long pre-training process and reducing the FLOPs by >85.0%. Specifically, we propose two fundamental and two optimization modules: (1) Cross Selective Fusion (CSF) enables knowledge transfer between cross-stage features via channel attention and feature map distillation within hierarchical transformers; (2) Patch Embedding Alignment (PEA) performs dimensional transformation within the patchifying process to facilitate the patch embedding distillation; (3) Global-Local Context Mixer (GL-Mixer) extracts both global and local information of a representative embedding; (4) Embedding Assistant (EA) acts as an embedding method to seamlessly bridge teacher and student models with the teacher's number of channels. Experiments on Cityscapes, ACDC, NYUv2, and Pascal VOC2012 datasets show that TransKD outperforms state-of-the-art distillation frameworks and rivals the time-consuming pre-training method. The source code is publicly available at https://github.com/RuipingL/TransKD.

distillation, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2202.13393

Country:

Asia (0.28)
Europe > Germany > Baden-Württemberg (0.14)

Genre: Research Report (1.00)

Industry:

Education (1.00)
Transportation > Ground > Road (0.49)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

Add feedback

OAFuser: Towards Omni-Aperture Fusion for Light Field Semantic Segmentation

Teng, Fei, Zhang, Jiaming, Peng, Kunyu, Wang, Yaonan, Stiefelhagen, Rainer, Yang, Kailun

arXiv.org Artificial IntelligenceDec-21-2023

Light field cameras, by harnessing the power of micro-lens array, are capable of capturing intricate angular and spatial details. This allows for acquiring complex light patterns and details from multiple angles, significantly enhancing the precision of image semantic segmentation, a critical aspect of scene interpretation in vision intelligence. However, the extensive angular information of light field cameras contains a large amount of redundant data, which is overwhelming for the limited hardware resources of intelligent vehicles. Besides, inappropriate compression leads to information corruption and data loss. To excavate representative information, we propose a new paradigm, Omni-Aperture Fusion model (OAFuser), which leverages dense context from the central view and discovers the angular information from sub-aperture images to generate a semantically consistent result. To avoid feature loss during network propagation and simultaneously streamline the redundant information from the light field camera, we present a simple yet very effective Sub-Aperture Fusion Module (SAFM) to embed sub-aperture images into angular features without any additional memory cost. Furthermore, to address the mismatched spatial information across viewpoints, we present a Center Angular Rectification Module (CARM) to realize feature resorting and prevent feature occlusion caused by asymmetric information. Our proposed OAFuser achieves state-of-the-art performance on the UrbanLF-Real and -Syn datasets and sets a new record of 84.93% in mIoU on the UrbanLF-Real Extended dataset, with a gain of +4.53%. The source code of OAFuser will be available at https://github.com/FeiBryantkit/OAFuser.

artificial intelligence, machine learning, segmentation, (14 more...)

arXiv.org Artificial Intelligence

2307.15588

Country: North America > United States > Oklahoma > Beaver County (1.00)

Genre: Research Report (0.64)

Industry: Information Technology (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
(2 more...)

Add feedback

Towards Anytime Optical Flow Estimation with Event Cameras

Ye, Yaozu, Shi, Hao, Yang, Kailun, Wang, Ze, Yin, Xiaoting, Lin, Yining, Liu, Mao, Wang, Yaonan, Wang, Kaiwei

arXiv.org Artificial IntelligenceOct-19-2023

Optical flow estimation is a fundamental task in the field of autonomous driving. Event cameras are capable of responding to log-brightness changes in microseconds. Its characteristic of producing responses only to the changing region is particularly suitable for optical flow estimation. In contrast to the super low-latency response speed of event cameras, existing datasets collected via event cameras, however, only provide limited frame rate optical flow ground truth, (e.g., at 10Hz), greatly restricting the potential of event-driven optical flow. To address this challenge, we put forward a high-frame-rate, low-latency event representation Unified Voxel Grid, sequentially fed into the network bin by bin. We then propose EVA-Flow, an EVent-based Anytime Flow estimation network to produce high-frame-rate event optical flow with only low-frame-rate optical flow ground truth for supervision. The key component of our EVA-Flow is the stacked Spatiotemporal Motion Refinement (SMR) module, which predicts temporally dense optical flow and enhances the accuracy via spatial-temporal motion refinement. The time-dense feature warping utilized in the SMR module provides implicit supervision for the intermediate optical flow. Additionally, we introduce the Rectified Flow Warp Loss (RFWL) for the unsupervised evaluation of intermediate optical flow in the absence of ground truth. This is, to the best of our knowledge, the first work focusing on anytime optical flow estimation via event cameras. A comprehensive variety of experiments on MVSEC, DESC, and our EVA-FlowSet demonstrates that EVA-Flow achieves competitive performance, super-low-latency (5ms), fastest inference (9.2ms), time-dense motion estimation (200Hz), and strong generalization. Our code will be available at https://github.com/Yaozhuwa/EVA-Flow.

anytime optical flow estimation, artificial intelligence, event camera

arXiv.org Artificial Intelligence

2307.05033

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Vision (1.00)

Add feedback

A Graph Reconstruction by Dynamic Signal Coefficient for Fault Classification

He, Wenbin, Mao, Jianxu, Wang, Yaonan, Li, Zhe, Fang, Qiu, Wu, Haotian

arXiv.org Artificial IntelligenceSep-29-2023

To improve the performance in identifying the faults under strong noise for rotating machinery, this paper presents a dynamic feature reconstruction signal graph method, which plays the key role of the proposed end-to-end fault diagnosis model. Specifically, the original mechanical signal is first decomposed by wavelet packet decomposition (WPD) to obtain multiple subbands including coefficient matrix. Then, with originally defined two feature extraction factors MDD and DDD, a dynamic feature selection method based on L2 energy norm (DFSL) is proposed, which can dynamically select the feature coefficient matrix of WPD based on the difference in the distribution of norm energy, enabling each sub-signal to take adaptive signal reconstruction. Next the coefficient matrices of the optimal feature sub-bands are reconstructed and reorganized to obtain the feature signal graphs. Finally, deep features are extracted from the feature signal graphs by 2D-Convolutional neural network (2D-CNN). Experimental results on a public data platform of a bearing and our laboratory platform of robot grinding show that this method is better than the existing methods under different noise intensities.

artificial intelligence, data mining, machine learning, (3 more...)

arXiv.org Artificial Intelligence

2306.05281

Genre: Research Report (0.66)

Technology:

Information Technology > Data Science > Data Mining (0.53)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.53)

Add feedback

Towards Source-free Domain Adaptive Semantic Segmentation via Importance-aware and Prototype-contrast Learning

Cao, Yihong, Zhang, Hui, Lu, Xiao, Xiao, Zheng, Yang, Kailun, Wang, Yaonan

arXiv.org Artificial IntelligenceJul-3-2023

Domain adaptive semantic segmentation enables robust pixel-wise understanding in real-world driving scenes. Source-free domain adaptation, as a more practical technique, addresses the concerns of data privacy and storage limitations in typical unsupervised domain adaptation methods. It utilizes a well-trained source model and unlabeled target data to achieve adaptation in the target domain. However, in the absence of source data and target labels, current solutions cannot sufficiently reduce the impact of domain shift and fully leverage the information from the target data. In this paper, we propose an end-to-end source-free domain adaptation semantic segmentation method via Importance-Aware and Prototype-Contrast (IAPC) learning. The proposed IAPC framework effectively extracts domain-invariant knowledge from the well-trained source model and learns domain-specific knowledge from the unlabeled target domain. Specifically, considering the problem of domain shift in the prediction of the target domain by the source model, we put forward an importance-aware mechanism for the biased target prediction probability distribution to extract domain-invariant knowledge from the source model. We further introduce a prototype-contrast strategy, which includes a prototype-symmetric cross-entropy loss and a prototype-enhanced cross-entropy loss, to learn target intra-domain knowledge without relying on labels. A comprehensive variety of experiments on two domain adaptive semantic segmentation benchmarks demonstrates that the proposed end-to-end IAPC solution outperforms existing state-of-the-art methods. Code will be made publicly available at https://github.com/yihong-97/Source-free_IAPC.

artificial intelligence, machine learning, target domain, (17 more...)

arXiv.org Artificial Intelligence

2306.01598

Genre: Research Report (0.84)

Industry: Information Technology > Security & Privacy (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.88)
Information Technology > Artificial Intelligence > Vision (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)

Add feedback

A Signed Subgraph Encoding Approach via Linear Optimization for Link Sign Prediction

Fang, Zhihong, Tan, Shaolin, Wang, Yaonan

arXiv.org Artificial IntelligenceMay-16-2023

In this paper, we consider the problem of inferring the sign of a link based on limited sign data in signed networks. Regarding this link sign prediction problem, SDGNN (Signed Directed Graph Neural Networks) provides the best prediction performance currently to the best of our knowledge. In this paper, we propose a different link sign prediction architecture call SELO (Subgraph Encoding via Linear Optimization), which obtains overall leading prediction performances compared the state-of-the-art algorithm SDGNN. The proposed model utilizes a subgraph encoding approach to learn edge embeddings for signed directed networks. In particular, a signed subgraph encoding approach is introduced to embed each subgraph into a likelihood matrix instead of the adjacency matrix through a linear optimization method. Comprehensive experiments are conducted on six real-world signed networks with AUC, F1, micro-F1, and Macro-F1 as the evaluation metrics. The experiment results show that the proposed SELO model outperforms existing baseline feature-based methods and embedding-based methods on all the six real-world networks and in all the four evaluation metrics.

artificial intelligence, machine learning, subgraph, (18 more...)

arXiv.org Artificial Intelligence

2305.09869

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.88)

Add feedback

Artificial intelligence empowered multi-AGVs in manufacturing systems

Li, Dong, Ouyang, Bo, Wu, Duanpo, Wang, Yaonan

arXiv.org Artificial IntelligenceSep-7-2019

AGVs are driverless robotic vehicles that picks up and delivers materials. How to improve the efficiency while preventing deadlocks is the core issue in designing AGV systems. In this paper, we propose an approach to tackle this problem.The proposed approach includes a traditional AGV scheduling algorithm, which aims at solving deadlock problems, and an artificial neural network based component, which predict future tasks of the AGV system, and make decisions on whether to send an AGV to the predicted starting location of the upcoming task,so as to save the time of waiting for an AGV to go to there first when the upcoming task is created. Simulation results show that the proposed method significantly improves the efficiency as against traditional method, up to 20% to 30%.

deep learning, efficiency, neural network, (19 more...)

arXiv.org Artificial Intelligence

1909.03373

Country: Asia > China (0.30)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.99)

Add feedback