AITopics | Pan, Xin

Collaborating Authors

Pan, Xin

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Improving the Language Understanding Capabilities of Large Language Models Using Reinforcement Learning

Hu, Bokai, Somayajula, Sai Ashish, Pan, Xin, Huang, Zihan, Xie, Pengtao

arXiv.org Artificial IntelligenceOct-21-2024

Large language models (LLMs), built on decoder-only transformers, excel in natural language generation and adapt to diverse tasks using zero-shot and few-shot prompting. However, these prompting methods often struggle on natural language understanding (NLU) tasks, where encoder-only models like BERT-base outperform LLMs on benchmarks like GLUE and SuperGLUE. This paper explores two approaches-supervised fine-tuning (SFT) and proximal policy optimization (PPO)-to enhance LLMs' NLU abilities. To reduce the cost of full-model fine-tuning, we integrate low-rank adaptation (LoRA) layers, limiting updates to these layers during both SFT and PPO. In SFT, task-specific prompts are concatenated with input queries and ground-truth labels, optimizing with next-token prediction. Despite this, LLMs still underperform compared to models like BERT-base on several NLU tasks. To close this gap, we apply PPO, a reinforcement learning technique that treats each token generation as an action and uses a reward function based on alignment with ground-truth answers. PPO then updates the model to maximize these rewards, aligning outputs with correct labels. Our experiments with LLAMA2-7B show that PPO improves performance, with a 6.3-point gain over SFT on GLUE. PPO exceeds zero-shot by 38.7 points and few-shot by 26.1 points on GLUE, while surpassing these by 28.8 and 28.5 points on SuperGLUE. Additionally, PPO outperforms BERT-large by 2.7 points on GLUE and 9.3 points on SuperGLUE. The improvements are consistent across models like Qwen2.5-7B and MPT-7B, highlighting PPO's robustness in enhancing LLMs' NLU capabilities.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2410.1102

Genre: Research Report > New Finding (1.00)

Industry: Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Add feedback

Development of a Simple and Novel Digital Twin Framework for Industrial Robots in Intelligent robotics manufacturing

Xiang, Tianyi, Li, Borui, Pan, Xin, Zhang, Quan

arXiv.org Artificial IntelligenceOct-18-2024

This paper has proposed an easily replicable and novel approach for developing a Digital Twin (DT) system for industrial robots in intelligent manufacturing applications. Our framework enables effective communication via Robot Web Service (RWS), while a real-time simulation is implemented in Unity 3D and Web-based Platform without any other 3rd party tools. The framework can do real-time visualization and control of the entire work process, as well as implement real-time path planning based on algorithms executed in MATLAB. Results verify the high communication efficiency with a refresh rate of only $17 ms$. Furthermore, our developed web-based platform and Graphical User Interface (GUI) enable easy accessibility and user-friendliness in real-time control.

artificial intelligence, communication, robot, (16 more...)

arXiv.org Artificial Intelligence

2410.14934

Genre: Research Report (1.00)

Industry: Information Technology > Robotics & Automation (0.40)

Technology: Information Technology > Artificial Intelligence > Robots > Robots in the Workplace (0.69)

Add feedback

Mammo-Clustering:A Weakly Supervised Multi-view Global-Local Context Clustering Network for Detection and Classification in Mammography

Yang, Shilong, Zhang, Chulong, Zang, Qi, Yu, Juan, Zeng, Liang, Luo, Xiao, Xing, Yexuan, Pan, Xin, Li, Qi, Liang, Xiaokun, Xie, Yaoqin

arXiv.org Artificial IntelligenceSep-23-2024

Breast cancer has long posed a significant threat to women's health, making early screening crucial for mitigating its impact. However, mammography, the preferred method for early screening, faces limitations such as the burden of double reading by radiologists, challenges in widespread adoption in remote and underdeveloped areas, and obstacles in intelligent early screening development due to data constraints. To address these challenges, we propose a weakly supervised multi-view mammography early screening model for breast cancer based on context clustering. Context clustering, a feature extraction structure that is neither CNN nor transformer, combined with multi-view learning for information complementation, presents a promising approach. The weak supervision design specifically addresses data limitations. Our model achieves state-of-the-art performance with fewer parameters on two public datasets, with an AUC of 0.828 on the Vindr-Mammo dataset and 0.805 on the CBIS-DDSM dataset. Our model shows potential in reducing the burden on doctors and increasing the feasibility of breast cancer screening for women in underdeveloped regions.

information, machine learning, natural language, (13 more...)

arXiv.org Artificial Intelligence

2409.14876

Country: Asia > China (0.29)

Genre: Research Report > Promising Solution (0.48)

Industry:

Health & Medicine > Therapeutic Area > Oncology > Breast Cancer (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
(3 more...)

Add feedback

SparseByteNN: A Novel Mobile Inference Acceleration Framework Based on Fine-Grained Group Sparsity

Xu, Haitao, Liu, Songwei, Xu, Yuyang, Wang, Shuai, Li, Jiashi, Yan, Chenqian, Li, Liangqiang, Fu, Lean, Pan, Xin, Chen, Fangmin

arXiv.org Artificial IntelligenceOct-30-2023

To address the challenge of increasing network size, researchers have developed sparse models through network pruning. However, maintaining model accuracy while achieving significant speedups on general computing devices remains an open problem. In this paper, we present a novel mobile inference acceleration framework SparseByteNN, which leverages fine-grained kernel sparsity to achieve real-time execution as well as high accuracy. Our framework consists of two parts: (a) A fine-grained kernel sparsity schema with a sparsity granularity between structured pruning and unstructured pruning. It designs multiple sparse patterns for different operators. Combined with our proposed whole network rearrangement strategy, the schema achieves a high compression rate and high precision at the same time. (b) Inference engine co-optimized with the sparse pattern. The conventional wisdom is that this reduction in theoretical FLOPs does not translate into real-world efficiency gains. We aim to correct this misconception by introducing a family of efficient sparse kernels for ARM and WebAssembly. Equipped with our efficient implementation of sparse primitives, we show that sparse versions of MobileNet-v1 outperform strong dense baselines on the efficiency-accuracy curve. Experimental results on Qualcomm 855 show that for 30% sparse MobileNet-v1, SparseByteNN achieves 1.27x speedup over the dense version and 1.29x speedup over the state-of-the-art sparse inference engine MNN with a slight accuracy drop of 0.224%. The source code of SparseByteNN will be available at https://github.com/lswzjuer/SparseByteNN

artificial intelligence, machine learning, pruning, (19 more...)

arXiv.org Artificial Intelligence

2310.19509

Genre: Research Report > New Finding (0.46)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.86)
Information Technology > Artificial Intelligence > Vision (0.69)

Add feedback

AlignDet: Aligning Pre-training and Fine-tuning in Object Detection

Li, Ming, Wu, Jie, Wang, Xionghui, Chen, Chen, Qin, Jie, Xiao, Xuefeng, Wang, Rui, Zheng, Min, Pan, Xin

arXiv.org Artificial IntelligenceAug-13-2023

The paradigm of large-scale pre-training followed by downstream fine-tuning has been widely employed in various object detection algorithms. In this paper, we reveal discrepancies in data, model, and task between the pre-training and fine-tuning procedure in existing practices, which implicitly limit the detector's performance, generalization ability, and convergence speed. To this end, we propose AlignDet, a unified pre-training framework that can be adapted to various existing detectors to alleviate the discrepancies. AlignDet decouples the pre-training process into two stages, i.e., image-domain and box-domain pre-training. The image-domain pre-training optimizes the detection backbone to capture holistic visual abstraction, and box-domain pre-training learns instance-level semantics and task-aware concepts to initialize the parts out of the backbone. By incorporating the self-supervised pre-trained backbones, we can pre-train all modules for various detectors in an unsupervised paradigm. As depicted in Figure 1, extensive experiments demonstrate that AlignDet can achieve significant improvements across diverse protocols, such as detection algorithm, model backbone, data setting, and training schedule. For example, AlignDet improves FCOS by 5.3 mAP, RetinaNet by 2.1 mAP, Faster R-CNN by 3.3 mAP, and DETR by 2.3 mAP under fewer epochs.

artificial intelligence, backbone, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2307.11077

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Add feedback