Jiang, Chen
Aneumo: A Large-Scale Comprehensive Synthetic Dataset of Aneurysm Hemodynamics
Li, Xigui, Zhou, Yuanye, Xiao, Feiyang, Guo, Xin, Zhang, Yichi, Jiang, Chen, Ge, Jianchao, Wang, Xiansheng, Wang, Qimeng, Zhang, Taiwei, Lin, Chensen, Cheng, Yuan, Qi, Yuan
Intracranial aneurysm (IA) is a common cerebrovascular disease that is usually asymptomatic but may cause severe subarachnoid hemorrhage (SAH) if ruptured. Although clinical practice usually relies on individual factors and the morphological features of the aneurysm, its pathophysiology and hemodynamic mechanisms remain controversial. To address the limitations of current research, this study constructed a comprehensive hemodynamic dataset of intracranial aneurysms. The dataset is based on 466 real aneurysm models, from which 10,000 synthetic models were generated through resection and deformation operations, comprising 466 aneurysm-free models and 9,534 deformed aneurysm models. The dataset also provides segmentation mask files in a medical-image-like format to support further analysis. In addition, the dataset contains hemodynamic data computed at eight steady-state flow rates (0.001 to 0.004 kg/s), including critical parameters such as flow velocity, pressure, and wall shear stress, providing a valuable resource for investigating aneurysm pathogenesis and clinical prediction. This dataset will help advance the understanding of the pathological features and hemodynamic mechanisms of intracranial aneurysms and support in-depth research in related fields. Dataset hosted at https://github.com/Xigui-Li/Aneumo.
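As a rough illustration of how such a dataset might be consumed, the sketch below loads one case's steady-state fields and summarizes wall shear stress across the eight flow rates. The file layout and array keys are assumptions for illustration, not the repository's documented format.

```python
# Hypothetical access pattern for the Aneumo dataset; file names and array
# keys are assumptions; consult the repository for the actual layout.
import numpy as np

def load_case(case_dir, flow_rate):
    """Load one steady-state solution for a given inlet flow rate (kg/s)."""
    data = np.load(f"{case_dir}/flow_{flow_rate:.4f}.npz")  # assumed per-rate archive
    return data["velocity"], data["pressure"], data["wall_shear_stress"]

# Summarize wall shear stress over the eight flow rates (0.001 to 0.004 kg/s).
for q in np.linspace(0.001, 0.004, 8):
    _, _, wss = load_case("aneumo/case_0001", q)
    print(f"Q = {q:.4f} kg/s  mean WSS = {wss.mean():.3f}  max WSS = {wss.max():.3f}")
```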
Personalize to generalize: Towards a universal medical multi-modality generalization through personalization
Tan, Zhaorui, Yang, Xi, Pan, Tan, Liu, Tianyi, Jiang, Chen, Guo, Xin, Wang, Qiufeng, Nguyen, Anh, Qi, Yuan, Huang, Kaizhu, Cheng, Yuan
The differences among medical imaging modalities, driven by distinct underlying principles, pose significant challenges for generalization in multi-modal medical tasks. Beyond modality gaps, individual variations, such as differences in organ size and metabolic rate, further impede a model's ability to generalize effectively across both modalities and diverse populations. Despite the importance of personalization, existing approaches to multi-modal generalization often neglect individual differences, focusing solely on common anatomical features. This limitation may result in weakened generalization in various medical tasks. In this paper, we show that personalization is critical for multi-modal generalization. Specifically, we propose an approach that achieves personalized generalization by approximating the underlying personalized invariant representation ${X}_h$ across various modalities, leveraging individual-level constraints and a learnable biological prior. We validate the feasibility and benefits of learning a personalized ${X}_h$, showing that this representation is highly generalizable and transferable across various multi-modal medical tasks. Extensive experimental results consistently show that the additionally incorporated personalization significantly improves performance and generalization across diverse scenarios, confirming its effectiveness.
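A minimal sketch of the idea, assuming a PyTorch setup in which a learnable per-subject embedding stands in for the biological prior and an individual-level constraint pulls two modalities of the same subject toward a shared representation. The module names and loss composition are illustrative, not the paper's exact formulation.

```python
# Illustrative only: a per-subject learnable prior plus an individual-level
# invariance term approximating a personalized representation X_h.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PersonalizedEncoder(nn.Module):
    def __init__(self, in_dim=1024, dim=256, num_subjects=100):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(in_dim, dim), nn.ReLU(), nn.Linear(dim, dim))
        self.prior = nn.Embedding(num_subjects, dim)  # learnable "biological prior" (assumed form)

    def forward(self, x, subject_id):
        # Condition the shared representation on the individual.
        return self.backbone(x) + self.prior(subject_id)

def individual_invariance(z_ct, z_mri):
    # Same subject seen in two modalities: encourage a modality-invariant X_h.
    return F.mse_loss(z_ct, z_mri)
```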
On the Sequence Evaluation based on Stochastic Processes
Zhang, Tianhao, Lin, Zhexiao, Sheng, Zhecheng, Jiang, Chen, Kang, Dongyeop
Modeling and analyzing long sequences of text is an essential task for Natural Language Processing. Success in capturing long-text dynamics with neural language models would facilitate many downstream tasks, such as coherence evaluation, text generation, and machine translation. This paper presents a novel approach to modeling sequences through a stochastic process. We introduce a likelihood-based training objective for the text encoder and design a more thorough measurement (score) for long-text evaluation than the previous approach. The proposed training objective effectively preserves sequence coherence, while the new score comprehensively captures both temporal and spatial dependencies. Theoretical properties of our new score show its advantages in sequence evaluation. Experimental results show superior performance in various sequence evaluation tasks, including global and local discrimination within and between documents of different lengths. We also demonstrate that the encoder achieves competitive results in discriminating human-written and AI-written text.
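To make the stochastic-process view concrete, here is a hedged sketch of a Brownian-bridge negative log-likelihood over a document's sentence-embedding trajectory; the paper's actual objective and parameterization may differ.

```python
# Sketch: score intermediate sentence embeddings against the Brownian bridge
# pinned at the first and last embedding; not the paper's exact objective.
import math
import torch

def bridge_nll(z, sigma=1.0):
    """z: (T+1, d) latent trajectory of one document, in sentence order."""
    T = z.shape[0] - 1
    d = z.shape[1]
    nll = z.new_zeros(())
    for t in range(1, T):
        alpha = t / T
        mean = (1 - alpha) * z[0] + alpha * z[-1]  # bridge conditional mean
        var = sigma**2 * t * (T - t) / T           # bridge conditional variance
        nll = nll + ((z[t] - mean) ** 2).sum() / (2 * var) \
                  + 0.5 * d * math.log(2 * math.pi * var)
    return nll / max(T - 1, 1)
```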
Dual-Modal Attention-Enhanced Text-Video Retrieval with Triplet Partial Margin Contrastive Learning
Jiang, Chen, Liu, Hong, Yu, Xuzheng, Wang, Qing, Cheng, Yuan, Xu, Jia, Liu, Zhongyi, Guo, Qingpei, Chu, Wei, Yang, Ming, Qi, Yuan
In recent years, the explosion of web videos has made text-video retrieval increasingly essential and popular for video filtering, recommendation, and search. Text-video retrieval aims to rank relevant text/video higher than irrelevant ones. The core of this task is to precisely measure the cross-modal similarity between texts and videos. Recently, contrastive learning methods have shown promising results for text-video retrieval, most of which focus on constructing positive and negative pairs to learn text and video representations. Nevertheless, they do not pay enough attention to hard negative pairs and lack the ability to model different levels of semantic similarity. To address these two issues, this paper improves contrastive learning with two novel techniques. First, to exploit hard examples for robust discriminative power, we propose a novel Dual-Modal Attention-Enhanced Module (DMAE) to mine hard negative pairs from textual and visual clues. By further introducing a Negative-aware InfoNCE (NegNCE) loss, we are able to adaptively identify these hard negatives and explicitly highlight their impact in the training loss. Second, our work argues that triplet samples can better model fine-grained semantic similarity than pairwise samples. We thereby present a new Triplet Partial Margin Contrastive Learning (TPM-CL) module to construct partial-order triplet samples by automatically generating fine-grained hard negatives for matched text-video pairs. The proposed TPM-CL designs an adaptive token masking strategy with cross-modal interaction to model subtle semantic differences. Extensive experiments demonstrate that the proposed approach outperforms existing methods on four widely used text-video retrieval datasets: MSR-VTT, MSVD, DiDeMo, and ActivityNet.
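The following PyTorch fragment sketches what a negative-aware InfoNCE might look like: the standard InfoNCE term plus an explicit penalty on pairs flagged as hard negatives (e.g., by a module like DMAE). The weighting scheme is an assumption for illustration, not the paper's exact loss.

```python
# Schematic NegNCE-style loss; `hard_mask` marks mined hard negative pairs.
import torch
import torch.nn.functional as F

def neg_aware_nce(sim, hard_mask, tau=0.05, beta=0.5):
    """sim: (B, B) text-video similarities, diagonal = matched pairs.
    hard_mask: (B, B) bool, True where a pair was mined as a hard negative."""
    labels = torch.arange(sim.size(0), device=sim.device)
    base = F.cross_entropy(sim / tau, labels)  # standard InfoNCE term
    if hard_mask.any():
        # Explicitly suppress the similarity of mined hard negatives.
        return base + beta * sim[hard_mask].mean()
    return base
```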
BBScore: A Brownian Bridge Based Metric for Assessing Text Coherence
Sheng, Zhecheng, Zhang, Tianhao, Jiang, Chen, Kang, Dongyeop
Measuring the coherence of text is a vital aspect of evaluating the quality of written content. Recent advancements in neural coherence modeling have demonstrated their efficacy in capturing entity coreference and discourse relations, thereby enhancing coherence evaluation. However, many existing methods depend heavily on static embeddings or focus narrowly on nearby context, constraining their capacity to measure the overarching coherence of long texts. In this paper, we posit that coherent texts inherently manifest a sequential and cohesive interplay among sentences, effectively conveying the central theme, purpose, or standpoint. To explore this abstract relationship, we introduce "BBScore", a novel reference-free metric grounded in Brownian bridge theory for assessing text coherence. Our findings show that, when combined with a simple additional classification component, this metric attains a performance level comparable to state-of-the-art techniques on standard artificial discrimination tasks. We also show in downstream tasks that this metric effectively differentiates between human-written documents and text generated by large language models within a specific domain. Furthermore, we illustrate the efficacy of this approach in detecting writing styles attributed to diverse large language models, underscoring its potential for generalizability. In summary, we present a novel Brownian bridge coherence metric capable of measuring both local and global text coherence, while circumventing the need for end-to-end model training. This flexibility allows for its application in various downstream tasks.
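In the same spirit, a reference-free score can be computed as the average deviation of a document's sentence-embedding trajectory from the Brownian bridge pinned at its endpoints. This numpy sketch illustrates the idea; the published metric's normalization may differ.

```python
# Illustrative bridge-deviation score: lower values mean the trajectory looks
# more like a Brownian bridge (i.e., more coherent under this model).
import numpy as np

def bridge_deviation(z):
    """z: (T+1, d) sentence embeddings in document order."""
    T = len(z) - 1
    if T < 2:
        return 0.0
    devs = []
    for t in range(1, T):
        alpha = t / T
        mean = (1 - alpha) * z[0] + alpha * z[-1]  # bridge expectation at step t
        var = alpha * (1 - alpha) * T              # bridge variance, up to scale
        devs.append(((z[t] - mean) ** 2).sum() / var)
    return float(np.mean(devs))
```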
Bridging Low-level Geometry to High-level Concepts in Visual Servoing of Robot Manipulation Task Using Event Knowledge Graphs and Vision-Language Models
Jiang, Chen, Jagersand, Martin
In this paper, we propose a framework for building knowledgeable robot control in the scope of smart human-robot interaction, empowering a basic uncalibrated visual servoing controller with contextual knowledge through the joint use of event knowledge graphs (EKGs) and large-scale pretrained vision-language models (VLMs). The framework is twofold: first, we interpret low-level image geometry as high-level concepts, allowing us to prompt VLMs and to select geometric features of points and lines for motor control skills; then, we create an event knowledge graph (EKG) to conceptualize a robot manipulation task of interest, where the main body of the EKG is characterized by an executable behavior tree and the leaves by semantic concepts relevant to the manipulation context. We demonstrate, in an uncalibrated environment with real robot trials, that our method reduces reliance on human annotation during task interfacing, allows the robot to perform activities of daily living more easily by treating low-level geometry-based motor control skills as high-level concepts, and is beneficial for building cognitive capabilities into smart robot applications.
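A toy sketch of the executable-behavior-tree idea, where leaves carry semantic concepts that a VLM would ground in the current image; the node types and grounding interface below are placeholders, not the paper's implementation.

```python
# Toy behavior tree with concept-carrying leaves; grounding is stubbed out.
class Leaf:
    def __init__(self, concept, skill):
        self.concept, self.skill = concept, skill

    def tick(self, perception):
        # In the full system, a VLM would ground `concept` in the image here.
        feature = perception.get(self.concept)
        return self.skill(feature) if feature is not None else "FAILURE"

class Sequence:
    def __init__(self, children):
        self.children = children

    def tick(self, perception):
        for child in self.children:
            if child.tick(perception) == "FAILURE":
                return "FAILURE"
        return "SUCCESS"

# e.g., "pick up the cup": ground the handle concept, then run a servo skill.
tree = Sequence([Leaf("cup_handle", lambda xy: "SUCCESS")])
print(tree.tick({"cup_handle": (120, 88)}))  # grounded pixel feature -> SUCCESS
```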
CLIPUNetr: Assisting Human-robot Interface for Uncalibrated Visual Servoing Control with CLIP-driven Referring Expression Segmentation
Jiang, Chen, Yang, Yuchen, Jagersand, Martin
The classical human-robot interface in uncalibrated image-based visual servoing (UIBVS) relies on either human annotations or semantic segmentation with categorical labels. Neither method matches natural human communication or conveys rich semantics in manipulation tasks as effectively as natural language expressions. In this paper, we tackle this problem by using referring expression segmentation, a prompt-based approach, to provide richer information for robot perception. To generate high-quality segmentation predictions from referring expressions, we propose CLIPUNetr, a new CLIP-driven referring expression segmentation network. CLIPUNetr leverages CLIP's strong vision-language representations to segment regions from referring expressions, while using its "U-shaped" encoder-decoder architecture to generate predictions with sharper boundaries and finer structures. Furthermore, we propose a new pipeline to integrate CLIPUNetr into UIBVS and apply it to control robots in real-world environments. In experiments, our method improves boundary and structure measurements by an average of 120% and can successfully assist real-world UIBVS control in an unstructured manipulation environment.
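The glue between segmentation and servoing might look like the sketch below: reduce a predicted mask to a trackable point feature and feed its image-space error into a Broyden-style uncalibrated servo update. The `clipunetr.segment` call and the Jacobian handling are hypothetical placeholders.

```python
# Hypothetical pipeline fragment: referring-expression mask -> point feature
# -> one uncalibrated visual-servoing step. API names are placeholders.
import numpy as np

def mask_to_feature(mask):
    """Collapse a binary mask to its centroid as a trackable image feature."""
    ys, xs = np.nonzero(mask)
    return np.array([xs.mean(), ys.mean()])

def servo_step(feature, target, J_hat, gain=0.5):
    """One UIBVS update: command joint motion that shrinks image-space error."""
    error = target - feature
    return gain * np.linalg.pinv(J_hat) @ error  # joint velocity command

# mask = clipunetr.segment(image, "the red mug handle")  # hypothetical call
# dq = servo_step(mask_to_feature(mask), goal_xy, J_hat)
```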
Using Adamic-Adar Index Algorithm to Predict Volunteer Collaboration: Less is More
Wu, Chao, Chen, Peng, Yin, Baiqiao, Lin, Zijuan, Jiang, Chen, Yu, Di, Zou, Changhong, Lui, Chunwang
Social networks exhibit a complex graph-like structure due to the uncertainty surrounding potential collaborations among participants. Machine learning algorithms generally perform well across many real-world prediction tasks, but whether they outperform algorithms specifically designed for graph link prediction remains unclear. To address this question, the Adamic-Adar Index (AAI), Jaccard Coefficient (JC), and Common Neighbor Centrality (CNC), as representatives of graph-specific algorithms, were applied to predict potential collaborations using data from volunteer activities during the COVID-19 pandemic in Shenzhen, alongside classical machine learning algorithms such as random forest, support vector machine, and gradient boosting, used both as single predictors and as components of ensemble learning. This paper shows that the AAI algorithm outperformed the traditional JC and CNC, as well as the machine learning algorithms, in analyzing graph node attributes for this task.
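For reference, the Adamic-Adar Index of a pair (u, v) sums 1/log(degree(w)) over their common neighbors w, so rare shared contacts count for more than popular ones. The snippet below computes it with networkx on a toy graph; the edge list is made up for illustration.

```python
# Adamic-Adar Index with networkx: AAI(u, v) = sum over common neighbors w
# of 1 / log(degree(w)). Toy edge list, for illustration only.
import networkx as nx

G = nx.Graph()
G.add_edges_from([("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"), ("c", "d")])

for u, v, score in nx.adamic_adar_index(G, [("a", "d")]):
    print(f"AAI({u}, {v}) = {score:.3f}")  # two shared neighbors of degree 3
```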
Single-photon Image Super-resolution via Self-supervised Learning
Chen, Yiwei, Jiang, Chen, Pan, Yu
Single-Photon Image Super-Resolution (SPISR) aims to recover a high-resolution volumetric photon-counting cube from a noisy low-resolution one using computational imaging algorithms. In real-world scenarios, pairs of training samples are often expensive or impossible to obtain. By extending Equivariant Imaging (EI) to volumetric single-photon data, we propose a self-supervised learning framework for the SPISR task. In particular, using the Poisson unbiased Kullback-Leibler risk estimator and equivariance, our method is able to learn from noisy measurements without ground truth. Comprehensive experiments on simulated and real-world datasets demonstrate that the proposed method achieves performance comparable to supervised learning and outperforms interpolation-based methods.
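A schematic of an equivariant-imaging style training step under stated assumptions: the Poisson unbiased Kullback-Leibler risk estimator is abstracted into a generic `data_fidelity` term, and the transformation group is left as a callable. This is a sketch of the training principle, not the paper's code.

```python
# Schematic EI-style self-supervised step for SPISR; PUKL is abstracted away.
import torch

def ei_loss(f, y, A, transform, data_fidelity):
    """f: reconstruction network; y: noisy low-res photon cube; A: downsampling op.
    transform: random group action (e.g., shift/rotation) on volumes."""
    x_hat = f(y)                      # reconstruct from the measurement alone
    mc = data_fidelity(A(x_hat), y)   # measurement consistency (PUKL in the paper)
    t_x = transform(x_hat)            # act on the reconstruction...
    eq = torch.mean((f(A(t_x)) - t_x) ** 2)  # ...and require equivariance
    return mc + eq
```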
InGVIO: A Consistent Invariant Filter for Fast and High-Accuracy GNSS-Visual-Inertial Odometry
Liu, Changwu, Jiang, Chen, Wang, Haowen
Combining a Global Navigation Satellite System (GNSS) with visual and inertial sensors can give smooth pose estimation without drift. The fusion system gradually degrades to Visual-Inertial Odometry (VIO) as the number of visible satellites decreases, which guarantees robust global navigation in GNSS-unfriendly environments. In this letter, we propose an open-source invariant filter-based platform, InGVIO, to tightly fuse monocular/stereo visual-inertial measurements with raw GNSS data. InGVIO gives highly competitive results in terms of computational load compared to current graph-based algorithms, while possessing the same or an even better level of accuracy. Thanks to our proposed marginalization strategies, the baseline for triangulation remains large even though only a few cloned poses are kept. Moreover, we define the infinitesimal symmetries of the system and exploit the structures of its symmetry group, which differ from the total symmetries of the VIO case; this elegantly yields the pattern of degenerate motions and the structure of the unobservable subspaces. We prove that the properly chosen invariant error is compatible with all possible symmetry-group structures of InGVIO and has intrinsic consistency properties. Besides, InGVIO has strictly linear error propagation without linearization error. InGVIO is tested on both open datasets and our proposed fixed-wing datasets with variable levels of difficulty and various numbers of satellites. The latter are, to the best of our knowledge, the first datasets recorded on a fixed-wing aircraft with raw GNSS to be open-sourced to the community.
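One property worth illustrating numerically is why invariant errors behave so well: when a rotation state integrates a known angular increment, the right-invariant error is unchanged by the propagation, independent of the estimate itself. The toy check below uses plain SO(3) rather than InGVIO's full state, purely to demonstrate the principle.

```python
# Toy illustration of estimate-independent (hence linear) propagation of a
# right-invariant rotation error under a shared, noise-free angular increment.
import numpy as np

def hat(w):
    return np.array([[0, -w[2], w[1]], [w[2], 0, -w[0]], [-w[1], w[0], 0]])

def exp_so3(w):
    """Rodrigues formula: map an axis-angle vector to a rotation matrix."""
    th = np.linalg.norm(w)
    if th < 1e-9:
        return np.eye(3)
    K = hat(w / th)
    return np.eye(3) + np.sin(th) * K + (1 - np.cos(th)) * K @ K

R_true = exp_so3(np.array([0.10, 0.20, -0.10]))
R_est = exp_so3(np.array([0.12, 0.18, -0.09]))
w_dt = np.array([0.05, -0.02, 0.03])  # measured angular increment

# The right-invariant error R_true @ R_est.T is unchanged by propagation.
err_before = R_true @ R_est.T
err_after = (R_true @ exp_so3(w_dt)) @ (R_est @ exp_so3(w_dt)).T
print(np.allclose(err_before, err_after))  # True
```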