AITopics | Xiao, Xiao

Collaborating Authors

Xiao, Xiao

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

IDGen: Item Discrimination Induced Prompt Generation for LLM Evaluation

Lin, Fan, Xie, Shuyi, Dai, Yong, Yao, Wenlin, Lang, Tianjiao, Xu, Zishan, Hu, Zhichao, Xiao, Xiao, Liu, Yuhong, Zhang, Yu

arXiv.org Artificial IntelligenceOct-5-2024

As Large Language Models (LLMs) grow increasingly adept at managing complex tasks, the evaluation set must keep pace with these advancements to ensure it remains sufficiently discriminative. Item Discrimination (ID) theory, which is widely used in educational assessment, measures the ability of individual test items to differentiate between high and low performers. Inspired by this theory, we propose an ID-induced prompt synthesis framework for evaluating LLMs to ensure the evaluation set can continually update and refine according to model abilities. Our data synthesis framework prioritizes both breadth and specificity. It can generate prompts that comprehensively evaluate the capabilities of LLMs while revealing meaningful performance differences between models, allowing for effective discrimination of their relative strengths and weaknesses across various tasks and domains. To produce high-quality data, we incorporate a self-correct mechanism into our generalization framework, and develop two models to predict prompt discrimination and difficulty score to facilitate our data synthesis framework, contributing valuable tools to evaluation data synthesis research. We apply our generated data to evaluate five SOTA models. Our data achieves an average score of 51.92, accompanied by a variance of 10.06. By contrast, previous works (i.e., SELF-INSTRUCT and WizardLM) obtain an average score exceeding 67, with a variance below 3.2. The results demonstrate that the data generated by our framework is more challenging and discriminative compared to previous works. We will release a dataset of over 3,000 carefully crafted prompts to facilitate evaluation research of LLMs.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2409.18892

Country: Asia > China (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.96)

Add feedback

ITERTL: An Iterative Framework for Fine-tuning LLMs for RTL Code Generation

Wu, Peiyang, Guo, Nan, Xiao, Xiao, Li, Wenming, Ye, Xiaochun, Fan, Dongrui

arXiv.org Artificial IntelligenceJun-27-2024

Recently, large language models (LLMs) have demonstrated excellent performance in understanding human instructions and generating code, which has inspired researchers to explore the feasibility of generating RTL code with LLMs. However, the existing approaches to fine-tune LLMs on RTL codes typically are conducted on fixed datasets, which do not fully stimulate the capability of LLMs and require large amounts of reference data. To mitigate these issues , we introduce a simple yet effective iterative training paradigm named ITERTL. During each iteration, samples are drawn from the model trained in the previous cycle. Then these new samples are employed for training in this loop. Through this iterative approach, the distribution mismatch between the model and the training samples is reduced. Additionally, the model is thus enabled to explore a broader generative space and receive more comprehensive feedback. Theoretical analyses are conducted to investigate the mechanism of the effectiveness. Experimental results show the model trained through our proposed approach can compete with and even outperform the state-of-the-art (SOTA) open-source model with nearly 37\% reference samples, achieving remarkable 42.9\% and 62.2\% pass@1 rate on two VerilogEval evaluation datasets respectively. While using the same amount of reference samples, our method can achieved a relative improvement of 16.9\% and 12.5\% in pass@1 compared to the non-iterative method. This study facilitates the application of LLMs for generating RTL code in practical scenarios with limited data.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2407.12022

Country:

Asia > China (0.16)
North America > United States (0.14)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)

Add feedback

Augmenting Biomedical Named Entity Recognition with General-domain Resources

Yin, Yu, Kim, Hyunjae, Xiao, Xiao, Wei, Chih Hsuan, Kang, Jaewoo, Lu, Zhiyong, Xu, Hua, Fang, Meng, Chen, Qingyu

arXiv.org Artificial IntelligenceJun-18-2024

Training a neural network-based biomedical named entity recognition (BioNER) model usually requires extensive and costly human annotations. While several studies have employed multi-task learning with multiple BioNER datasets to reduce human effort, this approach does not consistently yield performance improvements and may introduce label ambiguity in different biomedical corpora. We aim to tackle those challenges through transfer learning from easily accessible resources with fewer concept overlaps with biomedical datasets. In this paper, we proposed GERBERA, a simple-yet-effective method that utilized a general-domain NER dataset for training. Specifically, we performed multi-task learning to train a pre-trained biomedical language model with both the target BioNER dataset and the general-domain dataset. Subsequently, we fine-tuned the models specifically for the BioNER dataset. We systematically evaluated GERBERA on five datasets of eight entity types, collectively consisting of 81,410 instances. Despite using fewer biomedical resources, our models demonstrated superior performance compared to baseline models trained with multiple additional BioNER datasets. Specifically, our models consistently outperformed the baselines in six out of eight entity types, achieving an average improvement of 0.9% over the best baseline performance across eight biomedical entity types sourced from five different corpora. Our method was especially effective in amplifying performance on BioNER datasets characterized by limited data, with a 4.7% improvement in F1 scores on the JNLPBA-RNA dataset.

artificial intelligence, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2406.10671

Country:

North America > United States > Maryland (0.14)
North America > United States > Connecticut (0.14)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.94)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

STEMO: Early Spatio-temporal Forecasting with Multi-Objective Reinforcement Learning

Shao, Wei, Kang, Yufan, Peng, Ziyan, Xiao, Xiao, Wang, Lei, Yang, Yuhui, Salim, Flora D

arXiv.org Artificial IntelligenceJun-18-2024

Accuracy and timeliness are indeed often conflicting goals in prediction tasks. Premature predictions may yield a higher rate of false alarms, whereas delaying predictions to gather more information can render them too late to be useful. In applications such as wildfires, crimes, and traffic jams, timely forecasting are vital for safeguarding human life and property. Consequently, finding a balance between accuracy and timeliness is crucial. In this paper, we propose an early spatio-temporal forecasting model based on Multi-Objective reinforcement learning that can either implement an optimal policy given a preference or infer the preference based on a small number of samples. The model addresses two primary challenges: 1) enhancing the accuracy of early forecasting and 2) providing the optimal policy for determining the most suitable prediction time for each area. Our method demonstrates superior performance on three large-scale real-world datasets, surpassing existing methods in early spatio-temporal forecasting tasks.

artificial intelligence, machine learning, node, (17 more...)

arXiv.org Artificial Intelligence

2406.04035

Country:

North America > United States > New York (0.14)
Oceania > Australia > Victoria (0.14)
Oceania > Australia > New South Wales (0.14)
North America > United States > Hawaii (0.14)

Genre: Research Report (0.64)

Industry:

Law Enforcement & Public Safety (0.46)
Transportation (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Non-destructive Degradation Pattern Decoupling for Ultra-early Battery Prototype Verification Using Physics-informed Machine Learning

Tao, Shengyu, Zhang, Mengtian, Zhao, Zixi, Li, Haoyang, Ma, Ruifei, Che, Yunhong, Sun, Xin, Su, Lin, Chen, Xiangyu, Zhou, Zihao, Chang, Heng, Cao, Tingwei, Xiao, Xiao, Liu, Yaojun, Yu, Wenjun, Xu, Zhongling, Li, Yang, Hao, Han, Zhang, Xuan, Hu, Xiaosong, ZHou, Guangmin

arXiv.org Artificial IntelligenceMay-31-2024

Manufacturing complexities and uncertainties have impeded the transition from material prototypes to commercial batteries, making prototype verification critical to quality assessment. A fundamental challenge involves deciphering intertwined chemical processes to characterize degradation patterns and their quantitative relationship with battery performance. Here we show that a physics-informed machine learning approach can quantify and visualize temporally resolved losses concerning thermodynamics and kinetics only using electric signals. Our method enables non-destructive degradation pattern characterization, expediting temperature-adaptable predictions of entire lifetime trajectories, rather than end-of-life points. The verification speed is 25 times faster yet maintaining 95.1% accuracy across temperatures. Such advances facilitate more sustainable management of defective prototypes before massive production, establishing a 19.76 billion USD scrap material recycling market by 2060 in China. By incorporating stepwise charge acceptance as a measure of the initial manufacturing variability of normally identical batteries, we can immediately identify long-term degradation variations. We attribute the predictive power to interpreting machine learning insights using material-agnostic featurization taxonomy for degradation pattern decoupling. Our findings offer new possibilities for dynamic system analysis, such as battery prototype degradation, demonstrating that complex pattern evolutions can be accurately predicted in a non-destructive and data-driven fashion by integrating physics-informed machine learning.

artificial intelligence, deep learning, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2406.00276

Country:

Europe (0.67)
Asia > China (0.66)
North America > United States > California (0.14)

Genre:

Workflow (0.93)
Research Report > New Finding (0.48)

Industry:

Transportation > Ground > Road (1.00)
Energy > Oil & Gas > Upstream (1.00)
Energy > Energy Storage (1.00)
(6 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Android in the Zoo: Chain-of-Action-Thought for GUI Agents

Zhang, Jiwen, Wu, Jihao, Teng, Yihua, Liao, Minghui, Xu, Nuo, Xiao, Xiao, Wei, Zhongyu, Tang, Duyu

arXiv.org Artificial IntelligenceMar-5-2024

Large language model (LLM) leads to a surge of autonomous GUI agents for smartphone, which completes a task triggered by natural language through predicting a sequence of actions of API. Even though the task highly relies on past actions and visual observations, existing studies typical consider little semantic information carried out by intermediate screenshots and screen operations. To address this, this work presents Chain-of-Action-Thought (dubbed CoAT), which takes the description of the previous actions, the current screen, and more importantly the action thinking of what actions should be performed and the outcomes led by the chosen action. We demonstrate that, in a zero-shot setting upon an off-the-shell LLM, CoAT significantly improves the goal progress compared to standard context modeling. To further facilitate the research in this line, we construct a benchmark Android-In-The-Zoo (AitZ), which contains 18,643 screen-action pairs together with chain-of-action-thought annotations. Experiments show that fine-tuning a 200M model on our AitZ dataset achieves on par performance with CogAgent-Chat-18B.

artificial intelligence, large language model, natural language, (17 more...)

arXiv.org Artificial Intelligence

2403.02713

Country:

North America > United States (0.14)
North America > Mexico (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Mitigating Pooling Bias in E-commerce Search via False Negative Estimation

Wang, Xiaochen, Xiao, Xiao, Zhang, Ruhan, Zhang, Xuan, Na, Taesik, Tenneti, Tejaswi, Wang, Haixun, Ma, Fenglong

arXiv.org Artificial IntelligenceNov-18-2023

Efficient and accurate product relevance assessment is critical for user experiences and business success. Training a proficient relevance assessment model requires high-quality query-product pairs, often obtained through negative sampling strategies. Unfortunately, current methods introduce pooling bias by mistakenly sampling false negatives, diminishing performance and business impact. To address this, we present Bias-mitigating Hard Negative Sampling (BHNS), a novel negative sampling strategy tailored to identify and adjust for false negatives, building upon our original False Negative Estimation algorithm. Our experiments in the Instacart search setting confirm BHNS as effective for practical e-commerce use. Furthermore, comparative analyses on public dataset showcase its domain-agnostic potential for diverse applications.

artificial intelligence, machine learning, query, (13 more...)

arXiv.org Artificial Intelligence

2311.06444

Country:

North America > United States > Virginia (0.14)
North America > Canada > Quebec (0.14)

Genre: Research Report (1.00)

Industry: Information Technology > Services > e-Commerce Services (0.73)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Add feedback

Disturbance Rejection Control for Autonomous Trolley Collection Robots with Prescribed Performance

Xi, Rui-Dong, Lu, Liang, Zhang, Xue, Xiao, Xiao, Xia, Bingyi, Wang, Jiankun, Meng, Max Q. -H.

arXiv.org Artificial IntelligenceSep-22-2023

Trajectory tracking control of autonomous trolley collection robots (ATCR) is an ambitious work due to the complex environment, serious noise and external disturbances. This work investigates a control scheme for ATCR subjecting to severe environmental interference. A kinematics model based adaptive sliding mode disturbance observer with fast convergence is first proposed to estimate the lumped disturbances. On this basis, a robust controller with prescribed performance is proposed using a backstepping technique, which improves the transient performance and guarantees fast convergence. Simulation outcomes have been provided to illustrate the effectiveness of the proposed control scheme.

artificial intelligence, controller, robot, (14 more...)

arXiv.org Artificial Intelligence

2309.1266

Country: Asia > China (0.15)

Genre: Research Report (0.64)

Industry: Energy > Oil & Gas > Upstream (0.47)

Technology: Information Technology > Artificial Intelligence > Robots (1.00)

Add feedback

Dynamic Graph Neural Network with Adaptive Edge Attributes for Air Quality Predictions

Xu, Jing, Wang, Shuo, Ying, Na, Xiao, Xiao, Zhang, Jiang, Cheng, Yun, Jin, Zhiling, Zhang, Gangfeng

arXiv.org Artificial IntelligenceFeb-20-2023

Air quality prediction is a typical spatio-temporal modeling problem, which always uses different components to handle spatial and temporal dependencies in complex systems separately. Previous models based on time series analysis and Recurrent Neural Network (RNN) methods have only modeled time series while ignoring spatial information. Previous GCNs-based methods usually require providing spatial correlation graph structure of observation sites in advance. The correlations among these sites and their strengths are usually calculated using prior information. However, due to the limitations of human cognition, limited prior information cannot reflect the real station-related structure or bring more effective information for accurate prediction. To this end, we propose a novel Dynamic Graph Neural Network with Adaptive Edge Attributes (DGN-AEA) on the message passing network, which generates the adaptive bidirected dynamic graph by learning the edge attributes as model parameters. Unlike prior information to establish edges, our method can obtain adaptive edge information through end-to-end training without any prior information. Thus reduced the complexity of the problem. Besides, the hidden structural information between the stations can be obtained as model by-products, which can help make some subsequent decision-making analyses. Experimental results show that our model received state-of-the-art performance than other baselines.

artificial intelligence, machine learning, prediction, (18 more...)

arXiv.org Artificial Intelligence

2302.09977

Country: Asia > China (0.95)

Genre: Research Report > New Finding (0.48)

Industry: Health & Medicine (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback