AITopics | Wang, Li

Collaborating Authors

Wang, Li

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Honest AI: Fine-Tuning "Small" Language Models to Say "I Don't Know", and Reducing Hallucination in RAG

Chen, Xinxi, Wang, Li, Wu, Wei, Tang, Qi, Liu, Yiyao

arXiv.org Artificial IntelligenceOct-12-2024

Hallucination is a key roadblock for applications of Large Language Models (LLMs), particularly for enterprise applications that are sensitive to information accuracy. To address this issue, two general approaches have been explored: Retrieval-Augmented Generation (RAG) to supply LLMs with updated information as context, and fine-tuning the LLMs with new information and desired output styles. In this paper, we propose Honest AI: a novel strategy to fine-tune "small" language models to say "I don't know" to reduce hallucination, along with several alternative RAG approaches. The solution ranked 1st in Task 2 for the false premise question. The alternative approaches include using RAG with search engine and knowledge graph results, fine-tuning base LLMs with new information and combinations of both approaches. Although all approaches improve the performance of the LLMs, RAG alone does not significantly improve the performance and fine-tuning is needed for better results. Finally, the hybrid approach achieved the highest score in the CRAG benchmark. In addition, our approach emphasizes the use of relatively small models with fewer than 10 billion parameters, promoting resource efficiency.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2410.09699

Country:

North America > United States (1.00)
Europe (1.00)

Genre: Research Report (0.65)

Industry:

Media > Film (1.00)
Leisure & Entertainment (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Two-Stage Radio Map Construction with Real Environments and Sparse Measurements

Wang, Yifan, Sun, Shu, Liu, Na, Xu, Lianming, Wang, Li

arXiv.org Artificial IntelligenceOct-8-2024

Radio map construction based on extensive measurements is accurate but expensive and time-consuming, while environment-aware radio map estimation reduces the costs at the expense of low accuracy. Considering accuracy and costs, a first-predict-then-correct (FPTC) method is proposed by leveraging generative adversarial networks (GANs). A primary radio map is first predicted by a radio map prediction GAN (RMP-GAN) taking environmental information as input. Then, the prediction result is corrected by a radio map correction GAN (RMC-GAN) with sparse measurements as guidelines. Specifically, the self-attention mechanism and residual-connection blocks are introduced to RMP-GAN and RMC-GAN to improve the accuracy, respectively. Experimental results validate that the proposed FPTC-GANs method achieves the best radio map construction performance, compared with the state-of-the-art methods.

artificial intelligence, information, machine learning, (13 more...)

arXiv.org Artificial Intelligence

2410.18092

Country: Asia > China (0.29)

Genre: Research Report > Promising Solution (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Stochastic Inverse Problem: stability, regularization and Wasserstein gradient flow

Li, Qin, Oprea, Maria, Wang, Li, Yang, Yunan

arXiv.org Machine LearningSep-30-2024

Inverse problems in physical or biological sciences often involve recovering an unknown parameter that is random. The sought-after quantity is a probability distribution of the unknown parameter, that produces data that aligns with measurements. Consequently, these problems are naturally framed as stochastic inverse problems. In this paper, we explore three aspects of this problem: direct inversion, variational formulation with regularization, and optimization via gradient flows, drawing parallels with deterministic inverse problems. A key difference from the deterministic case is the space in which we operate. Here, we work within probability space rather than Euclidean or Sobolev spaces, making tools from measure transport theory necessary for the study. Our findings reveal that the choice of metric -- both in the design of the loss function and in the optimization process -- significantly impacts the stability and properties of the optimizer.

artificial intelligence, machine learning, review purpose only, (17 more...)

arXiv.org Machine Learning

2410.00229

Country:

North America > United States > New York > New York County > New York City (0.14)
North America > United States > Wisconsin > Dane County > Madison (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report (1.00)

Industry: Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)

Add feedback

Generative AI for RF Sensing in IoT systems

Wang, Li, Zhang, Chao, Zhao, Qiyang, Zou, Hang, Lasaulce, Samson, Valenzise, Giuseppe, He, Zhuo, Debbah, Merouane

arXiv.org Artificial IntelligenceJul-10-2024

The development of wireless sensing technologies, using signals such as Wi-Fi, infrared, and RF to gather environmental data, has significantly advanced within Internet of Things (IoT) systems. Among these, Radio Frequency (RF) sensing stands out for its cost-effective and non-intrusive monitoring of human activities and environmental changes. However, traditional RF sensing methods face significant challenges, including noise, interference, incomplete data, and high deployment costs, which limit their effectiveness and scalability. This paper investigates the potential of Generative AI (GenAI) to overcome these limitations within the IoT ecosystem. We provide a comprehensive review of state-of-the-art GenAI techniques, focusing on their application to RF sensing problems. By generating high-quality synthetic data, enhancing signal quality, and integrating multi-modal data, GenAI offers robust solutions for RF environment reconstruction, localization, and imaging. Additionally, GenAI's ability to generalize enables IoT devices to adapt to new environments and unseen tasks, improving their efficiency and performance. The main contributions of this article include a detailed analysis of the challenges in RF sensing, the presentation of innovative GenAI-based solutions, and the proposal of a unified framework for diverse RF sensing tasks. Through case studies, we demonstrate the effectiveness of integrating GenAI models, leading to advanced, scalable, and intelligent IoT systems.

artificial intelligence, machine learning, natural language, (22 more...)

arXiv.org Artificial Intelligence

2407.07506

Country:

Europe > France (0.14)
North America > United States (0.14)

Genre:

Research Report (1.00)
Overview (1.00)

Industry: Information Technology > Smart Houses & Appliances (1.00)

Technology:

Information Technology > Internet of Things (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.63)

Add feedback

Solving General Natural-Language-Description Optimization Problems with Large Language Models

Zhang, Jihai, Wang, Wei, Guo, Siyan, Wang, Li, Lin, Fangquan, Yang, Cheng, Yin, Wotao

arXiv.org Artificial IntelligenceJul-9-2024

Optimization problems seek to find the best solution to an objective under a set of constraints, and have been widely investigated in real-world applications. Modeling and solving optimization problems in a specific domain typically require a combination of domain knowledge, mathematical skills, and programming ability, making it difficult for general users and even domain professionals. In this paper, we propose a novel framework called OptLLM that augments LLMs with external solvers. Specifically, OptLLM accepts user queries in natural language, convert them into mathematical formulations and programming codes, and calls the solvers to calculate the results for decision-making. In addition, OptLLM supports multi-round dialogues to gradually refine the modeling and solving of optimization problems. To illustrate the effectiveness of OptLLM, we provide tutorials on three typical optimization applications and conduct experiments on both prompt-based GPT models and a fine-tuned Qwen model using a large-scale selfdeveloped optimization dataset. Experimental results show that OptLLM works with various LLMs, and the fine-tuned model achieves an accuracy boost compared to the promptbased models. Some features of OptLLM framework have been available for trial since June 2023 (https://opt.alibabacloud.com/chat or https://opt.aliyun.com/chat).

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2407.07924

Genre: Research Report > New Finding (0.34)

Industry: Information Technology > Security & Privacy (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)

Add feedback

Evaluation of Bias Towards Medical Professionals in Large Language Models

Chen, Xi, Xu, Yang, You, MingKe, Wang, Li, Liu, WeiZhi, Li, Jian

arXiv.org Artificial IntelligenceJun-30-2024

This study evaluates whether large language models (LLMs) exhibit biases towards medical professionals. Fictitious candidate resumes were created to control for identity factors while maintaining consistent qualifications. Three LLMs (GPT-4, Claude-3-haiku, and Mistral-Large) were tested using a standardized prompt to evaluate resumes for specific residency programs. Explicit bias was tested by changing gender and race information, while implicit bias was tested by changing names while hiding race and gender. Physician data from the Association of American Medical Colleges was used to compare with real-world demographics. 900,000 resumes were evaluated. All LLMs exhibited significant gender and racial biases across medical specialties. Gender preferences varied, favoring male candidates in surgery and orthopedics, while preferring females in dermatology, family medicine, obstetrics and gynecology, pediatrics, and psychiatry. Claude-3 and Mistral-Large generally favored Asian candidates, while GPT-4 preferred Black and Hispanic candidates in several specialties. Tests revealed strong preferences towards Hispanic females and Asian males in various specialties. Compared to real-world data, LLMs consistently chose higher proportions of female and underrepresented racial candidates than their actual representation in the medical workforce. GPT-4, Claude-3, and Mistral-Large showed significant gender and racial biases when evaluating medical professionals for residency selection. These findings highlight the potential for LLMs to perpetuate biases and compromise healthcare workforce diversity if used without proper bias mitigation strategies.

large language model, machine learning, medicine, (20 more...)

arXiv.org Artificial Intelligence

2407.12031

Country: Asia > China > Sichuan Province (0.14)

Genre: Research Report > New Finding (0.67)

Industry: Health & Medicine > Therapeutic Area > Orthopedics/Orthopedic Surgery (0.36)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.83)

Add feedback

Failure-Resilient Distributed Inference with Model Compression over Heterogeneous Edge Devices

Wang, Li, Li, Liang, Xu, Lianming, Peng, Xian, Fei, Aiguo

arXiv.org Artificial IntelligenceJun-20-2024

The distributed inference paradigm enables the computation workload to be distributed across multiple devices, facilitating the implementations of deep learning based intelligent services on extremely resource-constrained Internet of Things (IoT) scenarios. Yet it raises great challenges to perform complicated inference tasks relying on a cluster of IoT devices that are heterogeneous in their computing/communication capacity and prone to crash or timeout failures. In this paper, we present RoCoIn, a robust cooperative inference mechanism for locally distributed execution of deep neural network-based inference tasks over heterogeneous edge devices. It creates a set of independent and compact student models that are learned from a large model using knowledge distillation for distributed deployment. In particular, the devices are strategically grouped to redundantly deploy and execute the same student model such that the inference process is resilient to any local failures, while a joint knowledge partition and student model assignment scheme are designed to minimize the response latency of the distributed inference system in the presence of devices with diverse capacities. Extensive simulations are conducted to corroborate the superior performance of our RoCoIn for distributed inference compared to several baselines, and the results demonstrate its efficacy in timely inference and failure resiliency.

artificial intelligence, edge device, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2406.14185

Country:

Asia > China (0.14)
North America > Canada (0.14)

Genre: Research Report (0.70)

Industry:

Education (1.00)
Information Technology > Smart Houses & Appliances (0.34)

Technology:

Information Technology > Internet of Things (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Short-Long Convolutions Help Hardware-Efficient Linear Attention to Focus on Long Sequences

Liu, Zicheng, Li, Siyuan, Wang, Li, Wang, Zedong, Liu, Yunfan, Li, Stan Z.

arXiv.org Artificial IntelligenceJun-13-2024

To mitigate the computational complexity in the self-attention mechanism on long sequences, linear attention utilizes computation tricks to achieve linear complexity, while state space models (SSMs) popularize a favorable practice of using non-data-dependent memory pattern, i.e., emphasize the near and neglect the distant, to processing sequences. Recent studies have shown the priorities by combining them as one. However, the efficiency of linear attention remains only at the theoretical level in a causal setting, and SSMs require various designed constraints to operate effectively on specific data. Therefore, in order to unveil the true power of the hybrid design, the following two issues need to be addressed: (1) hardware-efficient implementation for linear attention and (2) stabilization of SSMs. To achieve this, we leverage the thought of tiling and hierarchy to propose CHELA (short-long Convolutions with Hardware-Efficient Linear Attention), which replaces SSMs with short-long convolutions and implements linear attention in a divide-and-conquer manner. This approach enjoys global abstraction and data-dependent selection from stable SSM and linear attention while maintaining real linear complexity. Our comprehensive experiments on the Long Range Arena benchmark and language modeling tasks demonstrate the effectiveness of the proposed method.

convolution, machine learning, natural language, (15 more...)

arXiv.org Artificial Intelligence

2406.08128

Country: North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

On-site scale factor linearity calibration of MEMS triaxial gyroscopes

Li, Yaqi, Wang, Li, Wang, Zhitao, Li, Xiangqing, Li, Jiaojiao, Su, Steven Weidong

arXiv.org Artificial IntelligenceJun-10-2024

The calibration of MEMS triaxial gyroscopes is crucial for achieving precise attitude estimation for various wearable health monitoring applications. However, gyroscope calibration poses greater challenges compared to accelerometers and magnetometers. This paper introduces an efficient method for calibrating MEMS triaxial gyroscopes via only a servo motor, making it well-suited for field environments. The core strategy of the method involves utilizing the fact that the dot product of the measured gravity and the rotational speed in a fixed frame remains constant. To eliminate the influence of rotating centrifugal force on the accelerometer, the accelerometer data is measured while stationary. The proposed calibration experiment scheme, which allows gyroscopic measurements when operating each axis at a specific rotation speed, making it easier to evaluate the linearity across a related speed range constituted by a series of rotation speeds. Moreover, solely the classical least squares algorithm proves adequate for estimating the scale factor, notably streamlining the analysis of the calibration process. Extensive numerical simulations were conducted to analyze the proposed method's performance in calibrating a triaxial gyroscope model. Experimental validation was also carried out using a commercially available MEMS inertial measurement unit (LSM9DS1 from Arduino nano 33 BLE SENSE) and a servo motor capable of controlling precise speed. The experimental results effectively demonstrate the efficacy of the proposed calibration approach.

artificial intelligence, gyroscope, modeling & simulation, (18 more...)

arXiv.org Artificial Intelligence

2405.03393

Country: Asia > China > Shandong Province (0.14)

Genre: Research Report (1.00)

Industry: Health & Medicine > Consumer Health (0.54)

Technology:

Information Technology > Modeling & Simulation (0.68)
Information Technology > Artificial Intelligence > Robots (0.46)

Add feedback

MoFormer: Multi-objective Antimicrobial Peptide Generation Based on Conditional Transformer Joint Multi-modal Fusion Descriptor

Wang, Li, Fu, Xiangzheng, Yang, Jiahao, Zhang, Xinyi, Ye, Xiucai, Liu, Yiping, Sakurai, Tetsuya, Zeng, Xiangxiang

arXiv.org Artificial IntelligenceJun-3-2024

Deep learning holds a big promise for optimizing existing peptides with more desirable properties, a critical step towards accelerating new drug discovery. Despite the recent emergence of several optimized Antimicrobial peptides(AMP) generation methods, multi-objective optimizations remain still quite challenging for the idealism-realism tradeoff. Here, we establish a multi-objective AMP synthesis pipeline (MoFormer) for the simultaneous optimization of multi-attributes of AMPs. MoFormer improves the desired attributes of AMP sequences in a highly structured latent space, guided by conditional constraints and fine-grained multi-descriptor.We show that MoFormer outperforms existing methods in the generation task of enhanced antimicrobial activity and minimal hemolysis. We also utilize a Pareto-based non-dominated sorting algorithm and proxies based on large model fine-tuning to hierarchically rank the candidates. We demonstrate substantial property improvement using MoFormer from two perspectives: (1) employing molecular simulations and scoring interactions among amino acids to decipher the structure and functionality of AMPs; (2) visualizing latent space to examine the qualities and distribution features, verifying an effective means to facilitate multi-objective optimization AMPs with design constraints.

evolutionary algorithm, machine learning, mofomer, (18 more...)

arXiv.org Artificial Intelligence

2406.0261

Country: Asia > China (0.46)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Add feedback