AITopics | Xing, Lei

Collaborating Authors

Xing, Lei

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Deep-and-Wide Learning: Enhancing Data-Driven Inference via Synergistic Learning of Inter- and Intra-Data Representations

Islam, Md Tauhidul, Xing, Lei

arXiv.org Artificial IntelligenceJan-28-2025

Advancements in deep learning are revolutionizing science and engineering. The immense success of deep learning is largely due to its ability to extract essential high-dimensional (HD) features from input data and make inference decisions based on this information. However, current deep neural network (DNN) models face several challenges, such as the requirements of extensive amounts of data and computational resources. Here, we introduce a new learning scheme, referred to as deep-and-wide learning (DWL), to systematically capture features not only within individual input data (intra-data features) but also across the data (inter-data features). Furthermore, we propose a dual-interactive-channel network (D-Net) to realize the DWL, which leverages our Bayesian formulation of low-dimensional (LD) inter-data feature extraction and its synergistic interaction with the conventional HD representation of the dataset, for substantially enhanced computational efficiency and inference. The proposed technique has been applied to data across various disciplines for both classification and regression tasks. Our results demonstrate that DWL surpasses state-of-the-art DNNs in accuracy by a substantial margin with limited training data and improves the computational efficiency by order(s) of magnitude. The proposed DWL strategy dramatically alters the data-driven learning techniques, including emerging large foundation models, and sheds significant insights into the evolving field of AI.

accuracy, artificial intelligence, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2501.17347

Country:

North America > United States > California (0.14)
North America > United States > New York (0.14)
Europe > United Kingdom > England (0.14)

Genre: Research Report > New Finding (0.86)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Therapeutic Area > Dermatology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (0.94)
Health & Medicine > Pharmaceuticals & Biotechnology (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

MS-Glance: Bio-Insipred Non-semantic Context Vectors and their Applications in Supervising Image Reconstruction

Gao, Ziqi, Yang, Wendi, Li, Yujia, Xing, Lei, Zhou, S. Kevin

arXiv.org Artificial IntelligenceNov-23-2024

Non-semantic context information is crucial for visual recognition, as the human visual perception system first uses global statistics to process scenes rapidly before identifying specific objects. However, while semantic information is increasingly incorporated into computer vision tasks such as image reconstruction, non-semantic information, such as global spatial structures, is often overlooked. To bridge the gap, we propose a biologically informed non-semantic context descriptor, \textbf{MS-Glance}, along with the Glance Index Measure for comparing two images. A Global Glance vector is formulated by randomly retrieving pixels based on a perception-driven rule from an image to form a vector representing non-semantic global context, while a local Glance vector is a flattened local image window, mimicking a zoom-in observation. The Glance Index is defined as the inner product of two standardized sets of Glance vectors. We evaluate the effectiveness of incorporating Glance supervision in two reconstruction tasks: image fitting with implicit neural representation (INR) and undersampled MRI reconstruction. Extensive experimental results show that MS-Glance outperforms existing image restoration losses across both natural and medical images. The code is available at \url{https://github.com/Z7Gao/MSGlance}.

artificial intelligence, bio-insipred non-semantic context vector, supervising image reconstruction, (1 more...)

arXiv.org Artificial Intelligence

2410.23577

Genre: Research Report (0.69)

Technology: Information Technology > Artificial Intelligence > Vision (1.00)

Add feedback

Generalized Trusted Multi-view Classification Framework with Hierarchical Opinion Aggregation

Shi, Long, Tang, Chuanqing, Deng, Huangyi, Xu, Cai, Xing, Lei, Chen, Badong

arXiv.org Artificial IntelligenceNov-6-2024

Recently, multi-view learning has witnessed a considerable interest on the research of trusted decision-making. Previous methods are mainly inspired from an important paper published by Han et al. in 2021, which formulates a Trusted Multi-view Classification (TMC) framework that aggregates evidence from different views based on Dempster's combination rule. All these methods only consider inter-view aggregation, yet lacking exploitation of intra-view information. In this paper, we propose a generalized trusted multi-view classification framework with hierarchical opinion aggregation. This hierarchical framework includes a two-phase aggregation process: the intra-view and inter-view aggregation hierarchies. In the intra aggregation, we assume that each view is comprised of common information shared with other views, as well as its specific information. We then aggregate both the common and specific information. This aggregation phase is useful to eliminate the feature noise inherent to view itself, thereby improving the view quality. In the inter-view aggregation, we design an attention mechanism at the evidence level to facilitate opinion aggregation from different views. To the best of our knowledge, this is one of the pioneering efforts to formulate a hierarchical aggregation framework in the trusted multi-view learning domain. Extensive experiments show that our model outperforms some state-of-art trust-related baselines.

aggregation, artificial intelligence, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2411.03713

Country: Asia > China > Sichuan Province (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)

Add feedback

ST-NeRP: Spatial-Temporal Neural Representation Learning with Prior Embedding for Patient-specific Imaging Study

Qiu, Liang, Shen, Liyue, Liu, Lianli, Liu, Junyan, Chen, Yizheng, Xing, Lei

arXiv.org Artificial IntelligenceOct-24-2024

During and after a course of therapy, imaging is routinely used to monitor the disease progression and assess the treatment responses. Despite of its significance, reliably capturing and predicting the spatial-temporal anatomic changes from a sequence of patient-specific image series presents a considerable challenge. Thus, the development of a computational framework becomes highly desirable for a multitude of practical applications. In this context, we propose a strategy of Spatial-Temporal Neural Representation learning with Prior embedding (ST-NeRP) for patient-specific imaging study. Our strategy involves leveraging an Implicit Neural Representation (INR) network to encode the image at the reference time point into a prior embedding. Subsequently, a spatial-temporally continuous deformation function is learned through another INR network. This network is trained using the whole patient-specific image sequence, enabling the prediction of deformation fields at various target time points. The efficacy of the ST-NeRP model is demonstrated through its application to diverse sequential image series, including 4D CT and longitudinal CT datasets within thoracic and abdominal imaging. The proposed ST-NeRP model exhibits substantial potential in enabling the monitoring of anatomical changes within a patient throughout the therapeutic journey.

artificial intelligence, machine learning, registration, (21 more...)

arXiv.org Artificial Intelligence

2410.19283

Country: North America > United States > Michigan (0.28)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.92)

Add feedback

Reducing Hallucinations in Vision-Language Models via Latent Space Steering

Liu, Sheng, Ye, Haotian, Xing, Lei, Zou, James

arXiv.org Artificial IntelligenceOct-22-2024

Hallucination poses a challenge to the deployment of large vision-language models (LVLMs) in applications. Unlike in large language models (LLMs), hallucination in LVLMs often arises from misalignments between visual inputs and textual outputs. This paper investigates the underlying mechanisms of hallucination, focusing on the unique structure of LVLMs that distinguishes them from large language models (LLMs). We identify that hallucinations often arise from the sensitivity of text decoders to vision inputs, a natural phenomenon when image encoders and text decoders are pre-trained separately. Inspired by this, we introduce Visual and Textual Intervention (VTI), a novel technique designed to reduce hallucinations by steering latent space representations during inference to enhance the stability of vision features. As a task-agnostic test-time intervention, VTI can be easily applied to any problem without additional cost. Extensive experiments demonstrate that it can effectively reduce hallucinations and outperform baseline methods across multiple metrics, highlighting the critical role of vision feature stability in LVLMs.

artificial intelligence, large language model, natural language, (14 more...)

arXiv.org Artificial Intelligence

2410.15778

Country: Europe > Switzerland (0.28)

Genre: Research Report > Promising Solution (0.48)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Large Language Model-Augmented Auto-Delineation of Treatment Target Volume in Radiation Therapy

Rajendran, Praveenbalaji, Yang, Yong, Niedermayr, Thomas R., Gensheimer, Michael, Beadle, Beth, Le, Quynh-Thu, Xing, Lei, Dai, Xianjin

arXiv.org Artificial IntelligenceJul-9-2024

Radiation therapy (RT) is one of the most effective treatments for cancer, and its success relies on the accurate delineation of targets. However, target delineation is a comprehensive medical decision that currently relies purely on manual processes by human experts. Manual delineation is time-consuming, laborious, and subject to interobserver variations. Although the advancements in artificial intelligence (AI) techniques have significantly enhanced the auto-contouring of normal tissues, accurate delineation of RT target volumes remains a challenge. In this study, we propose a visual language model-based RT target volume auto-delineation network termed Radformer. The Radformer utilizes a hierarichal vision transformer as the backbone and incorporates large language models to extract text-rich features from clinical data. We introduce a visual language attention module (VLAM) for integrating visual and linguistic features for language-aware visual encoding (LAVE). The Radformer has been evaluated on a dataset comprising 2985 patients with head-and-neck cancer who underwent RT. Metrics, including the Dice similarity coefficient (DSC), intersection over union (IOU), and 95th percentile Hausdorff distance (HD95), were used to evaluate the performance of the model quantitatively. Our results demonstrate that the Radformer has superior segmentation performance compared to other state-of-the-art models, validating its potential for adoption in RT practice.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2407.07296

Country:

North America > United States > California > Santa Clara County (0.15)
North America > United States > Massachusetts (0.14)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area > Oncology > Head & Neck Cancer (0.70)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Automated radiotherapy treatment planning guided by GPT-4Vision

Liu, Sheng, Pastor-Serrano, Oscar, Chen, Yizheng, Gopaulchan, Matthew, Liang, Weixing, Buyyounouski, Mark, Pollom, Erqi, Le, Quynh-Thu, Gensheimer, Michael, Dong, Peng, Yang, Yong, Zou, James, Xing, Lei

arXiv.org Artificial IntelligenceJul-1-2024

Radiotherapy treatment planning is a time-consuming and potentially subjective process that requires the iterative adjustment of model parameters to balance multiple conflicting objectives. Recent advancements in large foundation models offer promising avenues for addressing the challenges in planning and clinical decision-making. This study introduces GPT-RadPlan, a fully automated treatment planning framework that harnesses prior radiation oncology knowledge encoded in multi-modal large language models, such as GPT-4Vision (GPT-4V) from OpenAI. GPT-RadPlan is made aware of planning protocols as context and acts as an expert human planner, capable of guiding a treatment planning process. Via in-context learning, we incorporate clinical protocols for various disease sites as prompts to enable GPT-4V to acquire treatment planning domain knowledge. The resulting GPT-RadPlan agent is integrated into our in-house inverse treatment planning system through an API. The efficacy of the automated planning system is showcased using multiple prostate and head & neck cancer cases, where we compared GPT-RadPlan results to clinical plans. In all cases, GPT-RadPlan either outperformed or matched the clinical plans, demonstrating superior target coverage and organ-at-risk sparing. Consistently satisfying the dosimetric objectives in the clinical protocol, GPT-RadPlan represents the first multimodal large language model agent that mimics the behaviors of human planners in radiation oncology clinics, achieving remarkable results in automating the treatment planning process without the need for additional training.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2406.15609

Country: North America > United States > California > Santa Clara County > Palo Alto (0.14)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Nuclear Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

In-context Vectors: Making In Context Learning More Effective and Controllable Through Latent Space Steering

Liu, Sheng, Xing, Lei, Zou, James

arXiv.org Artificial IntelligenceNov-16-2023

Large language models (LLMs) demonstrate emergent in-context learning capabilities, where they adapt to new tasks based on example demonstrations. However, in-context learning has seen limited effectiveness in many settings, is difficult to quantitatively control and takes up context window space. To overcome these limitations, we propose an alternative approach that recasts in-context learning as in-context vectors (ICV). Using ICV has two steps. We first use a forward pass on demonstration examples to create the in-context vector from the latent embedding of the LLM. This vector captures essential information about the intended task. On a new query, instead of adding demonstrations to the prompt, we shift the latent states of the LLM using the ICV. The ICV approach has several benefits: 1) it enables the LLM to more effectively follow the demonstration examples; 2) it's easy to control by adjusting the magnitude of the ICV; 3) it reduces the length of the prompt by removing the in-context demonstrations; 4) ICV is computationally much more efficient than fine-tuning. We demonstrate that ICV achieves better performance compared to standard in-context learning and fine-tuning on diverse tasks including safety, style transfer, role-playing and formatting. Moreover, we show that we can flexibly teach LLM to simultaneously follow different types of instructions by simple vector arithmetics on the corresponding ICVs.

large language model, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2311.06668

Country: North America > United States > California > San Francisco County > San Francisco (0.14)

Genre:

Research Report (0.64)
Overview (0.46)

Industry:

Health & Medicine (0.93)
Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

Development and external validation of a lung cancer risk estimation tool using gradient-boosting

Benveniste, Pierre-Louis, Alberge, Julie, Xing, Lei, Bibault, Jean-Emmanuel

arXiv.org Artificial IntelligenceAug-23-2023

Lung cancer is a significant cause of mortality worldwide, emphasizing the importance of early detection for improved survival rates. In this study, we propose a machine learning (ML) tool trained on data from the PLCO Cancer Screening Trial and validated on the NLST to estimate the likelihood of lung cancer occurrence within five years. The study utilized two datasets, the PLCO (n=55,161) and NLST (n=48,595), consisting of comprehensive information on risk factors, clinical measurements, and outcomes related to lung cancer. Data preprocessing involved removing patients who were not current or former smokers and those who had died of causes unrelated to lung cancer. Additionally, a focus was placed on mitigating bias caused by censored data. Feature selection, hyper-parameter optimization, and model calibration were performed using XGBoost, an ensemble learning algorithm that combines gradient boosting and decision trees. The ML model was trained on the pre-processed PLCO dataset and tested on the NLST dataset. The model incorporated features such as age, gender, smoking history, medical diagnoses, and family history of lung cancer. The model was well-calibrated (Brier score=0.044). ROC-AUC was 82% on the PLCO dataset and 70% on the NLST dataset. PR-AUC was 29% and 11% respectively. When compared to the USPSTF guidelines for lung cancer screening, our model provided the same recall with a precision of 13.1% vs. 9.3% on the PLCO dataset and 3.2% vs. 3.1% on the NLST dataset. The developed ML tool provides a freely available web application for estimating the likelihood of developing lung cancer within five years. By utilizing risk factors and clinical data, individuals can assess their risk and make informed decisions regarding lung cancer screening. This research contributes to the efforts in early detection and prevention strategies, aiming to reduce lung cancer-related mortality rates.

artificial intelligence, cancer, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2308.12188

Country: North America > United States > California > San Francisco County > San Francisco (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases (1.00)
Health & Medicine > Therapeutic Area > Oncology > Lung Cancer (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)

Add feedback

Towards Medical Artificial General Intelligence via Knowledge-Enhanced Multimodal Pretraining

Lin, Bingqian, Chen, Zicong, Li, Mingjie, Lin, Haokun, Xu, Hang, Zhu, Yi, Liu, Jianzhuang, Cai, Wenjia, Yang, Lei, Zhao, Shen, Wu, Chenfei, Chen, Ling, Chang, Xiaojun, Yang, Yi, Xing, Lei, Liang, Xiaodan

arXiv.org Artificial IntelligenceApr-25-2023

Medical artificial general intelligence (MAGI) enables one foundation model to solve different medical tasks, which is very practical in the medical domain. It can significantly reduce the requirement of large amounts of task-specific data by sufficiently sharing medical knowledge among different tasks. However, due to the challenges of designing strongly generalizable models with limited and complex medical data, most existing approaches tend to develop task-specific models. To take a step towards MAGI, we propose a new paradigm called Medical-knOwledge-enhanced mulTimOdal pretRaining (MOTOR). In MOTOR, we combine two kinds of basic medical knowledge, i.e., general and specific knowledge, in a complementary manner to boost the general pretraining process. As a result, the foundation model with comprehensive basic knowledge can learn compact representations from pretraining radiographic data for better cross-modal alignment. MOTOR unifies the understanding and generation, which are two kinds of core intelligence of an AI system, into a single medical foundation model, to flexibly handle more diverse medical tasks. To enable a comprehensive evaluation and facilitate further research, we construct a medical multimodal benchmark including a wide range of downstream tasks, such as chest x-ray report generation and medical visual question answering. Extensive experiments on our benchmark show that MOTOR obtains promising results through simple task-oriented adaptation. The visualization shows that the injected knowledge successfully highlights key information in the medical data, demonstrating the excellent interpretability of MOTOR. Our MOTOR successfully mimics the human practice of fulfilling a "medical student" to accelerate the process of becoming a "specialist". We believe that our work makes a significant stride in realizing MAGI.

artificial intelligence, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2304.14204

Country: Asia > China (0.28)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Nuclear Medicine (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
(2 more...)

Add feedback