Chen, Xinyi
Leveraging LLM Agents for Translating Network Configurations
Wei, Yunze, Xie, Xiaohui, Zuo, Yiwei, Hu, Tianshuo, Chen, Xinyi, Chi, Kaiwen, Cui, Yong
Configuration translation is a critical and frequent task in network operations. When a network device is damaged or outdated, administrators need to replace it to maintain service continuity. The replacement devices may originate from different vendors, necessitating configuration translation to ensure seamless network operation. However, translating configurations manually is a labor-intensive and error-prone process. In this paper, we propose an intent-based framework for translating network configurations with Large Language Model (LLM) Agents. The core of our approach is an Intent-based Retrieval Augmented Generation (IRAG) module that systematically splits a configuration file into fragments, extracts intents, and generates accurate translations. We also design a two-stage verification method to validate the syntactic and semantic correctness of the translated configurations. We implement and evaluate the proposed method on real-world network configurations. Experimental results show that our method achieves 97.74% syntax correctness, outperforming state-of-the-art methods in translation accuracy.
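To make the fragment-level workflow concrete, here is a minimal sketch of an intent-based retrieval-augmented translation loop with a syntax-verification retry. The `call_llm`, `retrieve`, and `check_syntax` callables are hypothetical stand-ins; the paper's actual IRAG module and two-stage verifier are not reproduced here.

```python
from dataclasses import dataclass

@dataclass
class Fragment:
    text: str         # one configuration fragment (e.g., an interface stanza)
    intent: str = ""  # natural-language intent extracted by the LLM

def split_config(config: str) -> list:
    # Naive splitter: treat blank-line-separated stanzas as fragments.
    return [Fragment(block) for block in config.split("\n\n") if block.strip()]

def translate_config(config, target_vendor, call_llm, retrieve, check_syntax):
    translated = []
    for frag in split_config(config):
        frag.intent = call_llm(f"Describe the intent of this config:\n{frag.text}")
        examples = retrieve(frag.intent, target_vendor)  # intent-keyed retrieval
        draft = call_llm(
            f"Translate to {target_vendor} syntax.\nIntent: {frag.intent}\n"
            f"Source:\n{frag.text}\nReference examples:\n{examples}"
        )
        ok, err = check_syntax(draft, target_vendor)     # stage 1: syntax check
        if not ok:  # feed the parser error back for one repair attempt
            draft = call_llm(f"Fix this {target_vendor} syntax error: {err}\n{draft}")
        translated.append(draft)
    return "\n".join(translated)
```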
Neural Reflectance Fields for Radio-Frequency Ray Tracing
Jia, Haifeng, Chen, Xinyi, Wei, Yichen, Sun, Yifei, Pi, Yibo
Ray tracing is widely employed to model the propagation of radio-frequency (RF) signals in complex environments. The modelling performance depends greatly on how accurately the target scene can be depicted, including its geometry and surface material properties. Advances in computer vision and LiDAR make scene geometry estimation increasingly accurate, but scalable and efficient approaches to estimating material reflectivity in real-world environments are still lacking. In this work, we tackle this problem by learning material reflectivity efficiently from the path loss of RF signals travelling from transmitters to receivers. Specifically, we want the learned material reflection coefficients to minimize the gap between the predicted and measured receiver powers. We achieve this by translating the neural reflectance field from optics to the RF domain, modelling both the amplitude and phase of RF signals to account for multipath effects. We further propose a differentiable RF ray tracing framework that optimizes the neural reflectance field to match signal strength measurements. We simulate a complex real-world environment for experiments, and our results show that the neural reflectance field can successfully learn the reflection coefficients for all incident angles. As a result, our approach achieves better accuracy in predicting receiver powers with significantly less training data than existing approaches.
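As a rough illustration of the idea, the sketch below fits angle-dependent reflection coefficients to a received-power measurement by gradient descent. The path lengths, incident angles, the tiny MLP standing in for the neural reflectance field, and all numbers are illustrative assumptions, not the paper's setup.

```python
import torch

wavelength = 0.125                  # metres (~2.4 GHz); illustrative value
k = 2 * torch.pi / wavelength

# Hypothetical traced paths: one length and one bounce angle per path.
path_len = torch.tensor([10.0, 13.2, 17.5])
cos_theta = torch.tensor([1.0, 0.71, 0.45])  # cos of incident angle at bounce

# "Neural reflectance field": maps cos(theta) -> reflection coefficient in (0, 1).
field = torch.nn.Sequential(torch.nn.Linear(1, 16), torch.nn.ReLU(),
                            torch.nn.Linear(16, 1), torch.nn.Sigmoid())
opt = torch.optim.Adam(field.parameters(), lr=1e-2)

measured_power = torch.tensor(2.1e-4)        # hypothetical receiver measurement

for _ in range(500):
    refl = field(cos_theta.unsqueeze(-1)).squeeze(-1)  # per-path reflectivity
    amp = refl / path_len                              # free-space decay x loss
    phase = k * path_len                               # propagation phase
    # Coherent multipath sum: power = |sum_p a_p exp(i phi_p)|^2
    re = (amp * torch.cos(phase)).sum()
    im = (amp * torch.sin(phase)).sum()
    loss = (re**2 + im**2 - measured_power) ** 2
    opt.zero_grad(); loss.backward(); opt.step()
```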
S3TU-Net: Structured Convolution and Superpixel Transformer for Lung Nodule Segmentation
Wu, Yuke, Liu, Xiang, Shi, Yunyu, Chen, Xinyi, Wang, Zhenglei, Xu, YuQing, Wang, Shuo Hong
The irregular and challenging characteristics of lung adenocarcinoma nodules in computed tomography (CT) images complicate staging diagnosis, making accurate segmentation critical for clinicians to extract detailed lesion information. In this study, we propose a segmentation model, S3TU-Net, which integrates multi-dimensional spatial connectors and a superpixel-based visual transformer. S3TU-Net is built on a multi-view CNN-Transformer hybrid architecture, incorporating superpixel algorithms, structured weighting, and spatial shifting techniques to achieve superior segmentation performance. The model leverages structured convolution blocks (DWF-Conv/D2BR-Conv) to extract multi-scale local features while mitigating overfitting. To enhance multi-scale feature fusion, we introduce the S2-MLP Link, integrating spatial shifting and attention mechanisms at the skip connections. Additionally, the residual-based superpixel visual transformer (RM-SViT) effectively merges global and local features by employing sparse correlation learning and multi-branch attention to capture long-range dependencies, with residual connections enhancing stability and computational efficiency. Experimental results on the LIDC-IDRI dataset demonstrate that S3TU-Net achieves a DSC, precision, and IoU of 89.04%, 90.73%, and 90.70%, respectively. Compared to recent methods, S3TU-Net improves DSC by 4.52% and sensitivity by 3.16%, with other metrics showing an approximate 2% increase. In addition to comparison and ablation studies, we validated the generalization ability of our model on the EPDB private dataset, achieving a DSC of 86.40%.
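The S2-MLP Link builds on the spatial-shift operation; for reference, here is a minimal sketch of the generic four-direction channel-group shift (the full link module, with its attention mechanism, is not reproduced here).

```python
import torch

def spatial_shift(x: torch.Tensor) -> torch.Tensor:
    """x: (B, C, H, W); shifts four channel groups one pixel in four directions."""
    out = x.clone()
    g = x.shape[1] // 4
    out[:, 0*g:1*g, :, 1:] = x[:, 0*g:1*g, :, :-1]   # shift right along W
    out[:, 1*g:2*g, :, :-1] = x[:, 1*g:2*g, :, 1:]   # shift left along W
    out[:, 2*g:3*g, 1:, :] = x[:, 2*g:3*g, :-1, :]   # shift down along H
    out[:, 3*g:4*g, :-1, :] = x[:, 3*g:4*g, 1:, :]   # shift up along H
    return out
```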
Provable Length Generalization in Sequence Prediction via Spectral Filtering
Marsden, Annie, Dogariu, Evan, Agarwal, Naman, Chen, Xinyi, Suo, Daniel, Hazan, Elad
Sequence prediction is a fundamental problem in machine learning with widespread applications in natural language processing, time-series forecasting, and control systems. In this setting, a learner observes a sequence of tokens and iteratively predicts the next token, suffering a loss that measures the discrepancy between the predicted and the true token. Predicting future elements of a sequence based on historical data is crucial for tasks ranging from language modeling to autonomous control. A key challenge in sequence prediction is understanding the role of context length--the number of previous tokens used to make the upcoming prediction--and designing predictors that perform well with limited context due to computational and memory constraints. These resource constraints become particularly significant during the training phase of a predictor, where the computational cost of using long sequences can be prohibitive. Consequently, it is beneficial to design predictors that can learn from a smaller context length while still generalizing well to longer sequences. This leads us to the central question of our investigation: Can we develop algorithms that learn effectively using short contexts but perform comparably to models that use longer contexts?
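For readers unfamiliar with spectral filtering, a minimal sketch of the classic construction follows: features are projections of the input history onto the top eigenvectors of a fixed Hankel matrix with entries $H_{ij} = 2/((i+j)^3 - (i+j))$, scaled by $\sigma_j^{1/4}$. This illustrates the feature map only, not the paper's exact predictor.

```python
import numpy as np

def spectral_filters(context_len: int, num_filters: int) -> np.ndarray:
    i = np.arange(1, context_len + 1)
    s = i[:, None] + i[None, :]
    H = 2.0 / (s**3 - s)                         # fixed Hankel matrix
    eigvals, eigvecs = np.linalg.eigh(H)         # ascending eigenvalues
    top = eigvecs[:, -num_filters:]              # top-k eigenvectors
    return top * eigvals[-num_filters:] ** 0.25  # sigma^{1/4} scaling

T, k = 64, 8
filters = spectral_filters(T, k)  # (T, k)
history = np.random.randn(T, 3)   # T past inputs of dimension 3
features = filters.T @ history    # (k, 3): k filtered views of the history
```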
Toward Understanding In-context vs. In-weight Learning
Chan, Bryan, Chen, Xinyi, György, András, Schuurmans, Dale
It has recently been demonstrated empirically that in-context learning emerges in transformers when certain distributional properties are present in the training data, but this ability can also diminish upon further training. We provide a new theoretical understanding of these phenomena by identifying simplified distributional properties that give rise to the emergence and eventual disappearance of in-context learning. We do so by first analyzing a simplified model that uses a gating mechanism to choose between an in-weight and an in-context predictor. Through a combination of generalization-error and regret analyses, we identify conditions under which in-context and in-weight learning emerge. These theoretical findings are then corroborated experimentally by comparing the behaviour of a full transformer on the simplified distributions to that of the stylized model, demonstrating aligned results. We then extend the study to a full large language model, showing how fine-tuning on various collections of natural language prompts can elicit similar in-context and in-weight learning behaviour.
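A stylized version of such a gated model might look as follows: a learned scalar gate mixes an in-weight predictor (an embedding table) with an in-context predictor (softmax attention over the prompt's exemplars). The shapes and components are illustrative assumptions, not the paper's exact construction.

```python
import torch

class GatedPredictor(torch.nn.Module):
    def __init__(self, vocab: int, dim: int):
        super().__init__()
        self.in_weight = torch.nn.Embedding(vocab, dim)  # memorized mapping
        self.gate = torch.nn.Parameter(torch.zeros(1))   # learned mixing weight

    def forward(self, token_id, query, ctx_keys, ctx_vals):
        attn = torch.softmax(ctx_keys @ query, dim=0)    # attend to exemplars
        icl = attn @ ctx_vals                            # in-context prediction
        iwl = self.in_weight(token_id)                   # in-weight prediction
        g = torch.sigmoid(self.gate)
        return g * icl + (1 - g) * iwl

model = GatedPredictor(vocab=100, dim=8)
pred = model(torch.tensor(7), torch.randn(8), torch.randn(5, 8), torch.randn(5, 8))
```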
FutureFill: Fast Generation from Convolutional Sequence Models
Agarwal, Naman, Chen, Xinyi, Dogariu, Evan, Feinberg, Vlad, Suo, Daniel, Bartlett, Peter, Hazan, Elad
We address the challenge of efficient auto-regressive generation in sequence prediction models by introducing FutureFill - a method for fast generation that applies to any sequence prediction algorithm based on convolutional operators. Our approach reduces the generation time requirement from quadratic to quasilinear relative to the context length. Additionally, FutureFill requires a prefill cache sized only by the number of tokens generated, which is smaller than the cache requirements for standard convolutional and attention-based models. We validate our theoretical findings with experimental evidence demonstrating correctness and efficiency gains in a synthetic generation task.
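To see where the quasilinear bound comes from, consider a single convolutional filter: one full (FFT-able) convolution during prefill caches the prompt's contribution to every future output, after which each decoding step only needs the short convolution over already-generated tokens. The sketch below is an illustrative reading of this idea (and its naive suffix sum is itself quadratic in the number of generated tokens; the paper's scheme amortizes that part as well).

```python
import numpy as np

def futurefill_generate(prompt, filt, steps, readout):
    """prompt: 1-D array of past inputs; filt: convolutional filter taps."""
    L = len(prompt)
    full = np.convolve(prompt, filt)  # one full convolution; FFT-able
    cache = full[L - 1:]              # cache[t]: prompt's part of output t
    generated = []
    for t in range(steps):
        prompt_part = cache[t] if t < len(cache) else 0.0
        # Short convolution over tokens generated so far.
        gen_part = sum(filt[j] * generated[t - 1 - j]
                       for j in range(min(t, len(filt))))
        generated.append(readout(prompt_part + gen_part))
    return generated

out = futurefill_generate(np.array([1.0, 0.5, -0.2]), np.array([0.9, 0.1]),
                          steps=4, readout=np.tanh)
```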
LLM$\times$MapReduce: Simplified Long-Sequence Processing using Large Language Models
Zhou, Zihan, Li, Chong, Chen, Xinyi, Wang, Shuo, Chao, Yu, Li, Zhili, Wang, Haoyu, An, Rongqiao, Shi, Qi, Tan, Zhixing, Han, Xu, Shi, Xiaodong, Liu, Zhiyuan, Sun, Maosong
Enlarging the context window of large language models (LLMs) has become a crucial research area, particularly for applications involving extremely long texts. In this work, we propose a novel training-free framework for processing long texts, utilizing a divide-and-conquer strategy to achieve comprehensive document understanding. The proposed LLM$\times$MapReduce framework splits the entire document into several chunks for LLMs to read and then aggregates the intermediate answers to produce the final output. The main challenge for divide-and-conquer long text processing frameworks lies in the risk of losing essential long-range information when splitting the document, which can lead the model to produce incomplete or incorrect answers based on the segmented texts. Disrupted long-range information can be classified into two categories: inter-chunk dependency and inter-chunk conflict. We design a structured information protocol to better cope with inter-chunk dependency and an in-context confidence calibration mechanism to resolve inter-chunk conflicts. Experimental results demonstrate that LLM$\times$MapReduce can outperform representative open-source and commercial long-context LLMs, and is applicable to several different models.
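A minimal training-free sketch of the map and reduce stages is shown below; the `llm` callable and the toy JSON record stand in for the paper's structured information protocol and confidence calibration mechanism.

```python
def map_reduce_qa(document, question, llm, chunk_size=4000):
    chunks = [document[i:i + chunk_size]
              for i in range(0, len(document), chunk_size)]
    records = []
    for chunk in chunks:  # map stage: one structured record per chunk
        records.append(llm(
            f"Context:\n{chunk}\n\nQuestion: {question}\n"
            "Reply as JSON with keys answer, rationale, confidence (0-1)."))
    # Reduce stage: merge records, resolving inter-chunk conflicts by
    # confidence and stitching together inter-chunk dependencies.
    return llm(
        "Merge these partial answers, preferring higher-confidence ones and "
        f"combining dependent evidence:\n{records}\n\nQuestion: {question}")
```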
UniQuad: A Unified and Versatile Quadrotor Platform Series for UAV Research and Application
Zhang, Yichen, Chen, Xinyi, Liu, Peize, Wang, Junzhe, Zou, Hetai, Pan, Neng, Gao, Fei, Shen, Shaojie
As quadrotors take on an increasingly diverse range of roles, researchers often need to develop new hardware platforms tailored for specific tasks, introducing significant engineering overhead. In this article, we introduce the UniQuad series, a unified and versatile quadrotor platform series that offers high flexibility to adapt to a wide range of common tasks, excellent customizability for advanced demands, and easy maintenance in case of crashes. This project is fully open-source at https://hkust-aerial-robotics.github.io/UniQuad.
The SIFo Benchmark: Investigating the Sequential Instruction Following Ability of Large Language Models
Chen, Xinyi, Liao, Baohao, Qi, Jirui, Eustratiadis, Panagiotis, Monz, Christof, Bisazza, Arianna, de Rijke, Maarten
Following multiple instructions is a crucial ability for large language models (LLMs). Evaluating this ability comes with significant challenges: (i) limited coherence between multiple instructions, (ii) positional bias where the order of instructions affects model performance, and (iii) a lack of objectively verifiable tasks. To address these issues, we introduce a benchmark designed to evaluate models' abilities to follow multiple instructions through sequential instruction following (SIFo) tasks. In SIFo, the successful completion of multiple instructions is verifiable by examining only the final instruction. Our benchmark evaluates instruction following using four tasks (text modification, question answering, mathematics, and security rule following), each assessing different aspects of sequential instruction following. Our evaluation of popular LLMs, both closed-source and open-source, shows that more recent and larger models significantly outperform their older and smaller counterparts on the SIFo tasks, validating the benchmark's effectiveness. All models struggle with following sequences of instructions, hinting at an important lack of robustness in today's language models.
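The verification idea can be illustrated with a toy harness in the spirit of the text-modification task: the ground-truth instructions are applied in order, and a single comparison on the final string verifies the whole chain. The instructions below are toy examples, not items from the benchmark.

```python
def eval_sequential(model_output, initial, instructions):
    expected = initial
    for apply_fn in instructions:      # ground-truth chain of edits
        expected = apply_fn(expected)
    return model_output.strip() == expected

instructions = [
    lambda s: s.replace("cat", "dog"), # instruction 1: substitute a word
    lambda s: s.upper(),               # instruction 2: uppercase the result
]
print(eval_sequential("THE DOG SAT.", "the cat sat.", instructions))  # True
```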
Making Recommender Systems More Knowledgeable: A Framework to Incorporate Side Information
Jiang, Yukun, Guo, Leo, Chen, Xinyi, Liu, Jing Xi
Session-based recommender systems typically use only the triplet (user_id, timestamp, item_id) to predict users' next actions. In this paper, we aim to utilize side information to help recommender systems catch patterns and signals that would otherwise go undetected. Specifically, we propose a general framework for incorporating item-specific side information into the recommender system to enhance its performance without much modification to the original model architecture. Experimental results on several models and datasets show that with side information, our recommender system outperforms state-of-the-art models by a considerable margin and converges much faster. Additionally, we propose a new type of loss to regularize the attention mechanism used by recommender systems and evaluate its influence on model performance. Furthermore, through analysis, we put forward a few insights on potential further improvements.
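One simple way to realize such a framework is sketched below under illustrative assumptions: the feature types, fusion by summation, and the entropy-style attention regularizer are stand-ins, not the paper's exact design.

```python
import torch

class SideInfoEmbedding(torch.nn.Module):
    def __init__(self, n_items, n_cats, n_brands, dim):
        super().__init__()
        self.item = torch.nn.Embedding(n_items, dim)
        self.cat = torch.nn.Embedding(n_cats, dim)      # side info: category
        self.brand = torch.nn.Embedding(n_brands, dim)  # side info: brand

    def forward(self, item_ids, cat_ids, brand_ids):
        # Fused representation feeds the original sequence model unchanged.
        return self.item(item_ids) + self.cat(cat_ids) + self.brand(brand_ids)

def attention_entropy_penalty(attn, eps=1e-9):
    # One possible attention regularizer: minimizing this term penalizes
    # overly diffuse attention distributions.
    return -(attn * (attn + eps).log()).sum(-1).mean()
```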