AITopics | Chen, Jinyu

Chen, Jinyu

Deploying Foundation Model Powered Agent Services: A Survey

Xu, Wenchao, Chen, Jinyu, Zheng, Peirong, Yi, Xiaoquan, Tian, Tianyi, Zhu, Wenhui, Wan, Quan, Wang, Haozhao, Fan, Yunfeng, Su, Qinliang, Shen, Xuemin

arXiv.org Artificial Intelligence, Dec-17-2024

Foundation model (FM) powered agent services are regarded as a promising solution to develop intelligent and personalized applications for advancing toward Artificial General Intelligence (AGI). To achieve high reliability and scalability in deploying these agent services, it is essential to collaboratively optimize computational and communication resources, thereby ensuring effective resource allocation and seamless service delivery. In pursuit of this vision, this paper proposes a unified framework aimed at providing a comprehensive survey on deploying FM-based agent services across heterogeneous devices, with the emphasis on the integration of model and resource optimization to establish a robust infrastructure for these services. Particularly, this paper begins with exploring various low-level optimization strategies during inference and studies approaches that enhance system scalability, such as parallelism techniques and resource scaling methods. The paper then discusses several prominent FMs and investigates research efforts focused on inference acceleration, including techniques such as model compression and token reduction. Moreover, the paper also investigates critical components for constructing agent services and highlights notable intelligent applications. Finally, the paper presents potential research directions for developing real-time agent services with high Quality of Service (QoS).

large language model, machine learning, real time system, (24 more...)

arXiv.org Artificial Intelligence

2412.13437

Country:

Asia > China (0.67)
North America (0.45)

Genre:

Research Report > Promising Solution (1.00)
Overview (1.00)

Industry:

Health & Medicine (1.00)
Energy (1.00)
Education (0.93)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
(9 more...)

Towards Realistic UAV Vision-Language Navigation: Platform, Benchmark, and Methodology

Wang, Xiangyu, Yang, Donglin, Wang, Ziqin, Kwan, Hohin, Chen, Jinyu, Wu, Wenjun, Li, Hongsheng, Liao, Yue, Liu, Si

arXiv.org Artificial Intelligence, Oct-10-2024

Developing agents capable of navigating to a target location based on language instructions and visual information, known as vision-language navigation (VLN), has attracted widespread interest. Most research has focused on ground-based agents, while UAV-based VLN remains relatively underexplored. Recent efforts in UAV vision-language navigation predominantly adopt ground-based VLN settings, relying on predefined discrete action spaces and neglecting the inherent disparities in agent movement dynamics and the complexity of navigation tasks between ground and aerial environments. To address these disparities and challenges, we propose solutions from three perspectives: platform, benchmark, and methodology. To enable realistic UAV trajectory simulation in VLN tasks, we propose the OpenUAV platform, which features diverse environments, realistic flight control, and extensive algorithmic support. We further construct a target-oriented VLN dataset consisting of approximately 12k trajectories on this platform, serving as the first dataset specifically designed for realistic UAV VLN tasks. To tackle the challenges posed by complex aerial environments, we propose an assistant-guided UAV object search benchmark called UAV-Need-Help, which provides varying levels of guidance information to help UAVs better accomplish realistic VLN tasks. We also propose a UAV navigation LLM that, given multi-view images, task descriptions, and assistant instructions, leverages the multimodal understanding capabilities of the MLLM to jointly process visual and textual information, and performs hierarchical trajectory generation. The evaluation results of our method significantly outperform the baseline models, while there remains a considerable gap between our results and those achieved by human operators, underscoring the challenge presented by the UAV-Need-Help task. 
Constructing embodied agents capable of understanding human commands remains a long-term objective in the field of artificial intelligence. Among these tasks (Qi et al., 2020; Ku et al., 2020; Shridhar et al., 2020; Shen et al., 2021), vision-language navigation (VLN), that is, navigating to a target location based on language instructions and visual information, has garnered significant research interest. Current research in VLN focuses primarily on ground-based agents (Krantz et al., 2020; Blukis et al., 2018), while UAV-based VLN has received comparatively less attention.

artificial intelligence, natural language, trajectory, (18 more...)

arXiv.org Artificial Intelligence

2410.07087

Genre: Research Report > New Finding (0.34)

Industry: Information Technology > Robotics & Automation (0.68)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)

Controllable Navigation Instruction Generation with Chain of Thought Prompting

Kong, Xianghao, Chen, Jinyu, Wang, Wenguan, Su, Hang, Hu, Xiaolin, Yang, Yi, Liu, Si

arXiv.org Artificial Intelligence, Jul-16-2024

Instruction generation is a vital and multidisciplinary research area with broad applications. Existing instruction generation models are limited to generating instructions in a single style from a particular dataset, and the style and content of generated instructions cannot be controlled. Moreover, most existing instruction generation methods disregard the spatial modeling of the navigation environment. Leveraging the capabilities of Large Language Models (LLMs), we propose C-Instructor, which uses chain-of-thought-style prompts for style-controllable and content-controllable instruction generation. First, we propose a Chain of Thought with Landmarks (CoTL) mechanism, which guides the LLM to identify key landmarks and then generate complete instructions. CoTL makes generated instructions easier to follow and offers greater controllability over the manipulation of landmark objects. Furthermore, we present a Spatial Topology Modeling Task to facilitate the understanding of the spatial structure of the environment. Finally, we introduce a Style-Mixed Training policy, harnessing the prior knowledge of LLMs to enable style control for instruction generation based on different prompts within a single model instance. Extensive experiments demonstrate that instructions generated by C-Instructor outperform those generated by previous methods in text metrics, navigation guidance evaluation, and user studies.

artificial intelligence, large language model, natural language, (18 more...)

arXiv.org Artificial Intelligence

2407.07433

Genre: Research Report (1.00)

Industry: Education (0.46)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.97)

FedDWA: Personalized Federated Learning with Dynamic Weight Adjustment

Liu, Jiahao, Wu, Jiang, Chen, Jinyu, Hu, Miao, Zhou, Yipeng, Wu, Di

arXiv.org Artificial Intelligence, Jul-16-2023

Unlike conventional federated learning, personalized federated learning (PFL) trains a customized model for each individual client according to its unique requirements. The mainstream approach is to adopt a weighted aggregation method to generate personalized models, in which weights are determined by the loss values or model parameters of different clients. However, such methods require clients to download other clients' models, which not only sharply increases communication traffic but also potentially infringes on data privacy. In this paper, we propose a new PFL algorithm called FedDWA (Federated Learning with Dynamic Weight Adjustment) to address the above problem, which leverages the parameter server (PS) to compute personalized aggregation weights based on the models collected from clients. In this way, FedDWA can capture similarities between clients with much less communication overhead. More specifically, we formulate the PFL problem as an optimization problem that minimizes the distance between personalized models and guidance models, so as to customize aggregation weights for each client. Guidance models are obtained by local one-step-ahead adaptation on individual clients. Finally, we conduct extensive experiments using five real datasets, and the results demonstrate that FedDWA can significantly reduce communication traffic and achieve much higher model accuracy than state-of-the-art approaches.
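The server-side idea in this abstract can be illustrated with a rough sketch. The abstract does not give the weight formula, so the inverse-squared-distance rule below is an assumption chosen only for illustration, and the names `personalized_weights` and `aggregate` are hypothetical. The sketch shows a parameter server giving more aggregation weight to client models that lie closer to one client's guidance model, so clients never download each other's models.

```python
from typing import List

Vector = List[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two flattened models."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def personalized_weights(guidance: Vector, client_models: List[Vector],
                         eps: float = 1e-12) -> List[float]:
    """Weights for one client, computed on the server.

    Assumption: weight is proportional to the inverse squared distance
    between this client's guidance model and each collected model.
    The abstract only says that a distance is minimized; this rule is
    an illustrative stand-in, not the paper's confirmed formula.
    """
    inverse = [1.0 / (squared_distance(guidance, m) + eps) for m in client_models]
    total = sum(inverse)
    return [v / total for v in inverse]


def aggregate(weights: List[float], client_models: List[Vector]) -> Vector:
    """Weighted average of the collected models, coordinate by coordinate."""
    dim = len(client_models[0])
    return [sum(w * m[k] for w, m in zip(weights, client_models)) for k in range(dim)]


# Toy example: three collected models and one client's guidance model.
models = [[0.0, 0.0], [1.0, 1.0], [10.0, 10.0]]
guidance = [0.9, 0.9]
weights = personalized_weights(guidance, models)
personalized_model = aggregate(weights, models)
```

In this toy run the second model is closest to the guidance model, so it dominates the personalized aggregate, while the distant third model contributes almost nothing.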

artificial intelligence, machine learning, optimization problem, (15 more...)

arXiv.org Artificial Intelligence

2305.06124

Country:

Asia > China (0.14)
Oceania > Australia (0.14)

Genre: Research Report > New Finding (0.34)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.34)

FedTune: A Deep Dive into Efficient Federated Fine-Tuning with Pre-trained Transformers

Chen, Jinyu, Xu, Wenchao, Guo, Song, Wang, Junxiao, Zhang, Jie, Wang, Haozhao

arXiv.org Artificial Intelligence, Nov-15-2022

Federated Learning (FL) is an emerging paradigm that enables distributed users to collaboratively and iteratively train machine learning models without sharing their private data. Motivated by the effectiveness and robustness of self-attention-based architectures, researchers are turning to using pre-trained Transformers (i.e., foundation models) instead of traditional convolutional neural networks in FL to leverage their excellent transfer learning capabilities. Despite recent progress, how pre-trained Transformer models play a role in FL remains obscure, that is, how to efficiently fine-tune these pre-trained models in FL and how FL users could benefit from this new paradigm. In this paper, we explore this issue and demonstrate that the fine-tuned Transformers achieve extraordinary performance on FL, and that the lightweight fine-tuning method facilitates a fast convergence rate and low communication costs. Concretely, we conduct a rigorous empirical study of three tuning methods (i.e., modifying the input, adding extra modules, and adjusting the backbone) using two types of pre-trained models (i.e., vision-language models and vision models) for FL. Our experiments show that 1) Fine-tuning the bias term of the backbone performs best when relying on a strong pre-trained model; 2) The vision-language model (e.g., CLIP) outperforms the pure vision model (e.g., ViT) and is more robust to the few-shot settings; 3) Compared to pure local training, FL with pre-trained models has a higher accuracy because it alleviates the problem of over-fitting. We will release our code and encourage further exploration of pre-trained Transformers and FL.
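As a rough illustration of why bias-only fine-tuning keeps communication costs low, the sketch below has clients upload only their bias terms, which the server averages and merges back into the frozen backbone. This is a minimal sketch under stated assumptions, not the authors' released code: the parameter naming convention (names ending in `.bias`), the plain unweighted average, and the helper names are all hypothetical.

```python
from typing import Dict, List

Params = Dict[str, List[float]]


def trainable_subset(params: Params) -> Params:
    """Keep only bias terms; assumes bias parameters are named '*.bias'."""
    return {name: list(values) for name, values in params.items()
            if name.endswith(".bias")}


def federated_average(updates: List[Params]) -> Params:
    """Unweighted average of the parameters uploaded by each client."""
    count = len(updates)
    first = updates[0]
    return {
        name: [sum(u[name][k] for u in updates) / count
               for k in range(len(first[name]))]
        for name in first
    }


def apply_update(global_params: Params, averaged: Params) -> Params:
    """Merge averaged biases into the global model; other weights stay frozen."""
    merged = {name: list(values) for name, values in global_params.items()}
    for name, values in averaged.items():
        merged[name] = list(values)
    return merged


# Toy round: two clients fine-tune locally, then upload only their biases.
global_model = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
client_a = {"layer.weight": [1.0, 2.0], "layer.bias": [1.0]}
client_b = {"layer.weight": [1.0, 2.0], "layer.bias": [3.0]}

uploads = [trainable_subset(client_a), trainable_subset(client_b)]
new_global = apply_update(global_model, federated_average(uploads))
```

Each upload here carries one number instead of three; in a real Transformer the bias terms are a very small fraction of all parameters, which is where the communication saving comes from.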

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2211.08025

Genre: Research Report (0.64)

Industry: Information Technology (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)
