Cheng, Zehua
Affinity-Graph-Guided Contrastive Learning for Pretext-Free Medical Image Segmentation with Minimal Annotation
Cheng, Zehua, Yuan, Di, Lukasiewicz, Thomas
The combination of semi-supervised learning (SemiSL) and contrastive learning (CL) has been successful in medical image segmentation with limited annotations. However, these works often rely on pretext tasks that lack the specificity required for pixel-level segmentation, and they still face overfitting due to the weak supervision signal provided by so few annotations. This paper therefore proposes an affinity-graph-guided semi-supervised contrastive learning framework (Semi-AGCL) that establishes additional affinity-graph-based supervision signals between the student and teacher networks, achieving medical image segmentation with minimal annotations and without pretext tasks. The framework first designs an average-patch-entropy-driven inter-patch sampling method, which provides a robust initial feature space without relying on pretext tasks. It further designs an affinity-graph-guided loss function, which improves the quality of the learned representations and the model's generalization ability by exploiting the inherent structure of the data, thus mitigating overfitting. Our experiments indicate that with merely 10% of the complete annotation set, our model approaches the accuracy of the fully annotated baseline, with a marginal deviation of only 2.52%. Under the stringent condition where only 5% of the annotations are employed, our model significantly outperforms the second-best baseline, surpassing it by 23.09% on the Dice metric and achieving an improvement of 26.57% on the notably challenging CRAG and ACDC datasets.
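A minimal sketch of what an affinity-graph-based consistency term between student and teacher features could look like, assuming cosine-similarity affinity graphs over per-patch embeddings; the names affinity_graph and agcl_loss and the MSE formulation are illustrative assumptions, not the paper's exact loss:

```python
# Hypothetical sketch: penalize disagreement between the student's and teacher's
# patch-affinity graphs so the data's inherent structure supervises the student.
import torch
import torch.nn.functional as F

def affinity_graph(feats: torch.Tensor) -> torch.Tensor:
    """feats: (B, N, D) per-patch embeddings -> (B, N, N) cosine-affinity graph."""
    feats = F.normalize(feats, dim=-1)
    return feats @ feats.transpose(1, 2)

def agcl_loss(student_feats: torch.Tensor, teacher_feats: torch.Tensor) -> torch.Tensor:
    """Affinity-graph consistency between student and teacher (assumed MSE form)."""
    a_student = affinity_graph(student_feats)
    with torch.no_grad():                      # teacher supplies the target graph
        a_teacher = affinity_graph(teacher_feats)
    return F.mse_loss(a_student, a_teacher)

# Example: 4 images, 196 patches, 128-dim embeddings
s = torch.randn(4, 196, 128, requires_grad=True)
t = torch.randn(4, 196, 128)
agcl_loss(s, t).backward()
```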
Does DetectGPT Fully Utilize Perturbation? Selective Perturbation on Model-Based Contrastive Learning Detector would be Better
Liu, Shengchao, Liu, Xiaoming, Wang, Yichen, Cheng, Zehua, Li, Chengzhengxu, Zhang, Zhaohan, Lan, Yu, Shen, Chao
The burgeoning capabilities of large language models (LLMs) have raised growing concerns about abuse. DetectGPT, a zero-shot metric-based unsupervised machine-generated-text detector, first introduced perturbation and showed great performance improvement. However, DetectGPT's random perturbation strategy may introduce noise, limiting distinguishability and further performance gains. Moreover, its logit regression module relies on a manually set threshold, which harms its generalizability and its applicability to individual or small-batch inputs. Hence, we propose a novel detector, Pecola, which uses selective-strategy perturbation to relieve the information loss caused by random masking, and multi-pair contrastive learning to capture implicit pattern information during perturbation, facilitating few-shot performance. Experiments show that Pecola outperforms the SOTA method by 1.20% in accuracy on average across four public datasets. We further analyze the effectiveness, robustness, and generalization of our perturbation method.
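To illustrate the idea of selective rather than random perturbation, here is a small sketch in which only low-importance tokens are eligible for masking, so content-bearing words survive; the stop-word/frequency scoring rule, the mask token, and the function name selective_mask are assumptions for illustration, not Pecola's actual procedure:

```python
# Illustrative selective masking: score tokens, then mask only from the
# low-importance pool instead of masking spans uniformly at random.
import random
from collections import Counter

STOPWORDS = {"the", "a", "an", "of", "to", "and", "in", "is", "it", "that"}

def selective_mask(text: str, mask_ratio: float = 0.15, mask_token: str = "<mask>") -> str:
    tokens = text.split()
    freq = Counter(t.lower() for t in tokens)
    # Lower score = less important = preferred masking candidate (assumed heuristic).
    scores = sorted((0 if t.lower() in STOPWORDS else freq[t.lower()], i)
                    for i, t in enumerate(tokens))
    n_mask = max(1, int(len(tokens) * mask_ratio))
    candidates = [i for _, i in scores[: n_mask * 2]]   # restrict pool to low-importance tokens
    for i in random.sample(candidates, min(n_mask, len(candidates))):
        tokens[i] = mask_token
    return " ".join(tokens)

print(selective_mask("The model of the detector is trained to spot machine generated text"))
```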
Bandwidth Reduction using Importance Weighted Pruning on Ring AllReduce
Cheng, Zehua, Xu, Zhenghua
Training large deep learning models inevitably requires a large-scale cluster equipped with accelerators. Deep gradient compression can greatly improve bandwidth utilization and speed up training, but it is hard to implement on a ring structure. In this paper, we find that redundant gradients and gradient staleness have a negative effect on training. We observe that in different epochs and at different steps, the neural network focuses on updating different layers and different parameters. To save communication bandwidth and preserve accuracy on a ring structure, which avoids the communication bottleneck as the number of nodes increases, we propose a new algorithm that measures the importance of gradients on a large-scale cluster implementing ring all-reduce, based on the magnitude of the ratio of each parameter's gradient to its value. Our importance-weighted pruning approach achieves gradient compression ratios of 64x on AlexNet and 58.8x on ResNet-50 on ImageNet. Meanwhile, to maintain the sparsity of gradient propagation, each node randomly broadcasts the indices of its important gradients, while the remaining nodes gather the gradients at those indices and perform the all-reduce update. This speeds up model convergence and preserves training accuracy.
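A minimal sketch of the gradient-selection step, assuming importance is the magnitude of the gradient-to-parameter ratio described above; the function name select_important_gradients and the keep ratio are illustrative, and the index broadcast and ring all-reduce steps are omitted:

```python
# Hypothetical importance-weighted gradient pruning: keep only the gradients
# whose |grad / param| ratio is largest, and communicate just their indices and values.
import torch

def select_important_gradients(param: torch.Tensor, grad: torch.Tensor, keep_ratio: float = 0.01):
    """Return (indices, values) of the most important gradients to communicate."""
    importance = (grad / (param.abs() + 1e-12)).abs().flatten()
    k = max(1, int(importance.numel() * keep_ratio))
    indices = torch.topk(importance, k).indices
    values = grad.flatten()[indices]
    return indices, values

w = torch.randn(1000)
g = torch.randn(1000)
idx, vals = select_important_gradients(w, g)
# In a ring all-reduce setting, `idx` would be broadcast from one node and all
# nodes would then all-reduce only the gradient values gathered at those indices.
```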