Wang, Zehao
Multi-Span Optical Power Spectrum Evolution Modeling using ML-based Multi-Decoder Attention Framework
Raj, Agastya, Wang, Zehao, Slyne, Frank, Chen, Tingjun, Kilper, Dan, Ruffini, Marco
We implement an ML-based attention framework with component-specific decoders, improving optical power spectrum prediction in multi-span networks. By reducing the need for in-depth training on each component, the framework can be scaled to multi-span topologies with minimal data collection, making it suitable for brown-field scenarios.
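As a rough illustration of the architecture named above (a shared attention encoder with component-specific decoders), the sketch below shows one way such a model could be wired up; the layer sizes, component names, and channel count are placeholders, not the paper's configuration.

```python
# Hedged sketch only: shared attention encoder with per-component decoder heads
# for power-spectrum prediction. Sizes and component names are illustrative.
import torch
import torch.nn as nn

class MultiDecoderSpectrumModel(nn.Module):
    def __init__(self, n_channels=95, d_model=128, components=("edfa", "fiber", "wss")):
        super().__init__()
        self.embed = nn.Linear(n_channels, d_model)
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        # one lightweight decoder per component type, so new spans reuse the
        # shared encoder instead of requiring in-depth per-component training
        self.decoders = nn.ModuleDict({
            c: nn.Sequential(nn.Linear(d_model, d_model), nn.ReLU(),
                             nn.Linear(d_model, n_channels))
            for c in components})

    def forward(self, spectra, component):
        # spectra: (batch, n_components_so_far, n_channels) power spectra along the span
        h = self.encoder(self.embed(spectra))
        return self.decoders[component](h[:, -1])  # spectrum after the next component

model = MultiDecoderSpectrumModel()
print(model(torch.randn(2, 4, 95), "edfa").shape)  # torch.Size([2, 95])
```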
A Comprehensive Framework for Evaluating API-oriented Code Generation in Large Language Models
Wu, Yixi, He, Pengfei, Wang, Zehao, Wang, Shaowei, Tian, Yuan, Chen, Tse-Hsun
Large language models (LLMs) like GitHub Copilot and ChatGPT have emerged as powerful tools for code generation, significantly enhancing productivity and accelerating software development. However, existing benchmarks primarily focus on general code generation without considering API-oriented code generation, i.e., generating code that invokes APIs from specific libraries. Given the growing demand for API-oriented code generation, there is a pressing need for a systematic and automated approach to evaluate LLMs on API-oriented code generation. To address this gap, we propose AutoAPIEval, a lightweight and automated framework designed to evaluate the capabilities of LLMs in API-oriented code generation. Our framework works with any library that provides API documentation and focuses on two unit tasks: API recommendation and code example generation, along with four metrics to evaluate the generated APIs and code examples, such as the proportion of incorrect API recommendations for Task 1, and the proportions of code examples that invoke no specific API and of uncompilable/unexecutable code examples for Task 2. In addition, we conducted a case study on three LLMs (ChatGPT, MagiCoder, and DeepSeek Coder) and Java Runtime Environment 8 to demonstrate the framework's effectiveness. Our findings reveal substantial variability in LLM performance across tasks, with ChatGPT adhering to instructions better than its counterparts (i.e., MagiCoder and DeepSeek Coder) while achieving similar effectiveness in code example generation. We also identify key factors associated with code quality, such as API popularity and model confidence, and build classifiers that achieve high accuracy in detecting incorrect API recommendations and erroneous code examples. Retrieval-augmented generation enhances the quality of code generated by LLMs, though its effectiveness varies across different LLMs.
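To make the Task 1 metric concrete, here is a minimal sketch of how a "proportion of incorrect API recommendations" could be computed against a library's documented API set; the function and data below are hypothetical, not AutoAPIEval's implementation.

```python
# Hypothetical sketch: flag recommended APIs that do not exist in the documentation.
def incorrect_api_rate(recommendations, documented_apis):
    """recommendations: API names returned by the LLM for one library.
    documented_apis: set of fully qualified API names from the documentation."""
    if not recommendations:
        return 0.0
    wrong = [api for api in recommendations if api not in documented_apis]
    return len(wrong) / len(recommendations)

documented = {"java.util.List.add", "java.util.List.remove", "java.util.Map.put"}
recs = ["java.util.List.add", "java.util.List.append"]  # the second API does not exist
print(incorrect_api_rate(recs, documented))  # 0.5
```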
Navigating the Nuances: A Fine-grained Evaluation of Vision-Language Navigation
Wang, Zehao, Wu, Minye, Cao, Yixin, Ma, Yubo, Chen, Meiqi, Tuytelaars, Tinne
This study presents a novel evaluation framework for the Vision-Language Navigation (VLN) task. It aims to diagnose current models across various instruction categories at a fine-grained level. The framework is structured around the context-free grammar (CFG) of the task. The CFG serves as the basis for the problem decomposition and the core premise of the instruction-category design. We propose a semi-automatic method for CFG construction with the help of Large Language Models (LLMs). We then systematically generate data spanning five principal instruction categories (i.e., direction change, landmark recognition, region recognition, vertical movement, and numerical comprehension). Our analysis of different models reveals notable performance discrepancies and recurrent issues. The stagnation of numerical comprehension, heavy selective biases toward directional concepts, and other findings inform the development of future language-guided navigation systems.
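To illustrate the CFG-driven data generation at a toy scale, the sketch below expands a tiny hand-written grammar into instruction templates for a few of the categories; the grammar and fillers are invented for illustration and are not the CFG constructed in the paper.

```python
# Toy sketch: expand a miniature grammar into category-labeled instruction templates.
import itertools

GRAMMAR = {                      # category -> templates (illustrative only)
    "direction change": ["turn {dir} at the {landmark}"],
    "vertical movement": ["go {vert} the stairs"],
    "numerical comprehension": ["pass {num} doors, then stop"],
}
FILLERS = {"dir": ["left", "right"], "vert": ["up", "down"],
           "landmark": ["sofa", "plant"], "num": ["two", "three"]}

def expand(template):
    slots = [s.split("}")[0] for s in template.split("{")[1:]]
    for combo in itertools.product(*(FILLERS[s] for s in slots)):
        yield template.format(**dict(zip(slots, combo)))

for category, templates in GRAMMAR.items():
    for t in templates:
        for instruction in expand(t):
            print(f"[{category}] {instruction}")
```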
FlightLLM: Efficient Large Language Model Inference with a Complete Mapping Flow on FPGAs
Zeng, Shulin, Liu, Jun, Dai, Guohao, Yang, Xinhao, Fu, Tianyu, Wang, Hongyi, Ma, Wenheng, Sun, Hanbo, Li, Shiyao, Huang, Zixiao, Dai, Yadong, Li, Jintao, Wang, Zehao, Zhang, Ruoyu, Wen, Kairui, Ning, Xuefei, Wang, Yu
Transformer-based Large Language Models (LLMs) have made a significant impact on various domains. However, LLM efficiency suffers from both heavy computation and memory overheads. Compression techniques like sparsification and quantization are commonly used to mitigate the gap between LLMs' computation/memory overheads and hardware capacity. However, existing GPU and transformer-based accelerators cannot efficiently process compressed LLMs, due to the following unresolved challenges: low computational efficiency, underutilized memory bandwidth, and large compilation overheads. This paper proposes FlightLLM, enabling efficient LLM inference with a complete mapping flow on FPGAs. In FlightLLM, we highlight an innovative solution: the computation and memory overheads of LLMs can be addressed by utilizing FPGA-specific resources (e.g., DSP48 and the heterogeneous memory hierarchy). First, we propose a configurable sparse DSP chain to support different sparsity patterns with high computation efficiency. Second, we propose an always-on-chip decode scheme to boost memory bandwidth with mixed-precision support. Finally, to make FlightLLM available for real-world LLMs, we propose a length-adaptive compilation method to reduce the compilation overhead. Implemented on the Xilinx Alveo U280 FPGA, FlightLLM achieves 6.0$\times$ higher energy efficiency and 1.8$\times$ better cost efficiency than commercial GPUs (e.g., NVIDIA V100S) on modern LLMs (e.g., LLaMA2-7B) using vLLM and SmoothQuant at a batch size of one. FlightLLM beats the NVIDIA A100 GPU with 1.2$\times$ higher throughput using the latest Versal VHK158 FPGA.
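For context on the sparsity patterns mentioned above, the snippet below shows a generic 2:4 structured-sparsity mask of the kind sparsity-aware accelerators commonly target; it is a standalone illustration of the compression idea, not FlightLLM's sparse DSP chain or its compiler.

```python
# Generic illustration: keep the 2 largest-magnitude weights in every group of 4.
import numpy as np

def two_four_sparsify(weights):
    w = weights.reshape(-1, 4)
    drop_idx = np.argsort(np.abs(w), axis=1)[:, :2]   # 2 smallest per group
    mask = np.ones_like(w)
    np.put_along_axis(mask, drop_idx, 0.0, axis=1)
    return (w * mask).reshape(weights.shape)

w = np.random.randn(8)
print(w.round(2))
print(two_four_sparsify(w).round(2))  # exactly two non-zeros per group of four
```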
From Isolated Islands to Pangea: Unifying Semantic Space for Human Action Understanding
Li, Yong-Lu, Wu, Xiaoqian, Liu, Xinpeng, Wang, Zehao, Dou, Yiming, Ji, Yikun, Zhang, Junyi, Li, Yixing, Tan, Jingru, Lu, Xudong, Lu, Cewu
Action understanding is a vital step toward intelligent agents and has attracted long-term attention. It can be formulated as a mapping from the physical action space to the semantic space. Typically, researchers have built action datasets with idiosyncratic class definitions to push the envelope of their respective benchmarks. Thus, datasets are incompatible with each other, like "Isolated Islands", due to semantic gaps and differing class granularities, e.g., do housework in dataset A and wash plate in dataset B. We argue that a more principled semantic space is urgently needed to concentrate community efforts and enable us to use all datasets together in pursuit of generalizable action learning. To this end, we design a structured action semantic space based on a verb taxonomy hierarchy that covers a massive set of actions. By aligning the classes of previous datasets to our semantic space, we gather (image/video/skeleton/MoCap) datasets into a unified database with a unified label system, i.e., bridging "isolated islands" into a "Pangea". Accordingly, we propose a novel model that maps from the physical space to the semantic space to fully exploit Pangea. In extensive experiments, our new system shows significant superiority, especially in transfer learning. Code and data will be made publicly available.
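A minimal sketch of the alignment step described above: dataset-specific class names are mapped onto nodes of a shared verb hierarchy so that samples from different datasets land in one label space. The mapping table and sample format below are invented for illustration, not the paper's actual alignment.

```python
# Hypothetical sketch: re-label samples from separate datasets into a unified
# semantic space via a class-to-hierarchy-node alignment table.
LABEL_ALIGNMENT = {                         # (dataset, native class) -> unified node
    ("dataset_A", "do housework"): "clean/do housework",
    ("dataset_B", "wash plate"): "clean/wash/wash plate",
}

def unify(samples):
    """samples: iterable of (dataset, native_label, data_id).
    Yields (unified_label, data_id), dropping classes that cannot be aligned."""
    for dataset, label, data_id in samples:
        node = LABEL_ALIGNMENT.get((dataset, label))
        if node is not None:
            yield node, data_id

print(list(unify([("dataset_A", "do housework", "img_001"),
                  ("dataset_B", "wash plate", "vid_042")])))
```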
Self-Normalizing Neural Network, Enabling One Shot Transfer Learning for Modeling EDFA Wavelength Dependent Gain
Raj, Agastya, Wang, Zehao, Slyne, Frank, Chen, Tingjun, Kilper, Dan, Ruffini, Marco
We present a novel ML framework for modeling the wavelength-dependent gain of multiple EDFAs, based on semi-supervised, self-normalizing neural networks, enabling one-shot transfer learning. Our experiments on 22 EDFAs in the Open Ireland and COSMOS testbeds show high-accuracy transfer learning even across different amplifier types.
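A self-normalizing network in the sense of Klambauer et al. pairs SELU activations with alpha dropout and LeCun-style initialization so activations stay roughly zero-mean and unit-variance through depth. The sketch below shows such a regressor for per-channel gain; the layer sizes, inputs, and channel count are assumptions, not the paper's model.

```python
# Minimal self-normalizing MLP sketch (SELU + AlphaDropout) for gain regression.
import torch
import torch.nn as nn

class GainModel(nn.Module):
    def __init__(self, n_channels=95):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_channels + 1, 256), nn.SELU(), nn.AlphaDropout(0.05),
            nn.Linear(256, 256), nn.SELU(), nn.AlphaDropout(0.05),
            nn.Linear(256, n_channels),               # per-channel gain (dB)
        )
        for m in self.net:                            # LeCun-normal initialization
            if isinstance(m, nn.Linear):
                nn.init.kaiming_normal_(m.weight, nonlinearity="linear")
                nn.init.zeros_(m.bias)

    def forward(self, channel_loading, gain_setting):
        return self.net(torch.cat([channel_loading, gain_setting], dim=-1))

model = GainModel()
loading = torch.randint(0, 2, (4, 95)).float()        # on/off channel occupancy (assumed input)
setting = torch.full((4, 1), 18.0)                    # target gain in dB (assumed input)
print(model(loading, setting).shape)                  # torch.Size([4, 95])
```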
Few-shot Event Detection: An Empirical Study and a Unified View
Ma, Yubo, Wang, Zehao, Cao, Yixin, Sun, Aixin
Few-shot event detection (ED) has been widely studied, but this breadth brings noticeable discrepancies, e.g., various motivations, tasks, and experimental settings, that hinder the understanding of models and future progress. This paper presents a thorough empirical study, a unified view of ED models, and a better unified baseline. For fair evaluation, we compare 12 representative methods on three datasets, roughly grouped into prompt-based and prototype-based models for detailed analysis. Experiments consistently demonstrate that prompt-based methods, including ChatGPT, still significantly trail prototype-based methods in overall performance. To investigate the superiority of prototype-based methods, we break down their design elements along several dimensions and build a unified framework over them. Under this unified view, each prototype-based method can be seen as a combination of modules drawn from these design elements. We further combine all advantageous modules and propose a simple yet effective baseline, which outperforms existing methods by a large margin (e.g., 2.7% F1 gains under the low-resource setting).
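For readers unfamiliar with the prototype-based family that the unified view builds on, here is a bare-bones prototypical classifier: class prototypes are the mean support embeddings, and each query is assigned to the nearest prototype. This is the textbook mechanism rather than the paper's full baseline, and the embeddings and event types below are made up.

```python
# Bare-bones prototypical classification sketch over precomputed embeddings.
import numpy as np

def build_prototypes(support_emb, support_labels):
    classes = sorted(set(support_labels))
    protos = np.stack([support_emb[[l == c for l in support_labels]].mean(axis=0)
                       for c in classes])
    return classes, protos

def classify(query_emb, classes, protos):
    dists = np.linalg.norm(query_emb[:, None, :] - protos[None, :, :], axis=-1)
    return [classes[i] for i in dists.argmin(axis=1)]

support = np.random.randn(6, 16)                      # toy trigger embeddings
labels = ["Attack", "Attack", "Meet", "Meet", "Transfer", "Transfer"]
classes, protos = build_prototypes(support, labels)
print(classify(np.random.randn(2, 16), classes, protos))
```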
AeCoM: An Aerial Continuum Manipulator with Precise Kinematic Modeling for Variable Loading and Tendon-slacking Prevention
Peng, Rui, Wang, Zehao, Lu, Peng
Aerial robotic systems have attracted growing interest in recent years. In this article, we propose a novel aerial manipulator system that is significantly different from conventional aerial discrete manipulators: an Aerial Continuum Manipulator (AeCoM). The AeCoM compactly integrates a quadrotor with a tendon-driven continuum robotic manipulator. Owing to the compact design and the payload-bearing ability of tendon-driven continuum arms, the proposed system resolves the conflict between payload capacity and dexterity found in conventional aerial manipulators. This paper makes two contributions: 1) a sensor-based kinematic model for precise modeling under variable loading; and 2) a tendon-slacking prevention mechanism for aggressive motions. We present the detailed design of the system and perform extensive experiments to validate self-initialization, payload capacity, precise kinematic modeling with variable end-effector (EE) loading during aerial grasping, and tendon-slacking prevention. The experimental results demonstrate that the proposed aerial continuum manipulator overcomes the constraints of conventional aerial manipulators and has broader potential applications in cluttered environments.
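For context on continuum-arm kinematics, the snippet below evaluates the standard constant-curvature forward kinematics of a single segment (tip position from curvature, bending-plane angle, and arc length). This is the textbook model, not AeCoM's sensor-based formulation, which refines such modeling under variable loading.

```python
# Textbook constant-curvature forward kinematics for one continuum segment.
import numpy as np

def tip_position(kappa, phi, length):
    """kappa: curvature (1/m), phi: bending-plane angle (rad), length: arc length (m)."""
    if abs(kappa) < 1e-9:                 # straight segment
        return np.array([0.0, 0.0, length])
    r = 1.0 / kappa
    return np.array([np.cos(phi) * r * (1.0 - np.cos(kappa * length)),
                     np.sin(phi) * r * (1.0 - np.cos(kappa * length)),
                     r * np.sin(kappa * length)])

print(tip_position(kappa=2.0, phi=np.pi / 4, length=0.3))
```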
Layout-aware Dreamer for Embodied Referring Expression Grounding
Li, Mingxiao, Wang, Zehao, Tuytelaars, Tinne, Moens, Marie-Francine
In this work, we study the problem of Embodied Referring Expression Grounding, where an agent needs to navigate in a previously unseen environment and localize a remote object described by a concise high-level natural language instruction. When facing such a situation, a human tends to imagine what the destination may look like and to explore the environment based on prior knowledge of environmental layout, such as the fact that a bathroom is more likely to be found near a bedroom than near a kitchen. We design an autonomous agent called Layout-aware Dreamer (LAD), comprising two novel modules, the Layout Learner and the Goal Dreamer, to mimic this cognitive decision process. The Layout Learner infers the room-category distribution of neighboring unexplored areas along the path for coarse layout estimation, which effectively introduces layout common sense about room-to-room transitions to our agent. To learn effective exploration of the environment, the Goal Dreamer imagines the destination beforehand. Our agent achieves new state-of-the-art performance on the public leaderboard of the REVERIE dataset in challenging unseen test environments, improving navigation success (SR) by 4.02% and remote grounding success (RGS) by 3.43% over the previous state of the art. The code is released at https://github.com/zehao-wang/LAD
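As a rough illustration of the Layout Learner's output described above, the snippet below predicts a room-category distribution for an unexplored viewpoint with a small softmax head; the feature size, categories, and head design are assumptions, not the released LAD code.

```python
# Illustrative sketch: a softmax head producing a room-category distribution
# for a neighboring unexplored viewpoint. Names and sizes are placeholders.
import torch
import torch.nn as nn

ROOM_CATEGORIES = ["bedroom", "bathroom", "kitchen", "living room", "hallway"]

layout_head = nn.Sequential(
    nn.Linear(512, 256), nn.ReLU(),
    nn.Linear(256, len(ROOM_CATEGORIES)),
)

view_feature = torch.randn(1, 512)                    # feature of an unexplored viewpoint
room_dist = layout_head(view_feature).softmax(dim=-1)
print(dict(zip(ROOM_CATEGORIES, room_dist.squeeze(0).tolist())))
```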