Huang, Zheng
"I know myself better, but not really greatly": Using LLMs to Detect and Explain LLM-Generated Texts
Ji, Jiazhou, Guo, Jie, Qiu, Weidong, Huang, Zheng, Xu, Yang, Lu, Xinru, Jiang, Xiaoyu, Li, Ruizhe, Li, Shujun
Large language models (LLMs) have demonstrated impressive capabilities in generating human-like texts, but the potential misuse of such LLM-generated texts raises the need to distinguish between human-generated and LLM-generated content. This paper explores the detection and explanation capabilities of LLM-based detectors of LLM-generated texts, in the context of a binary classification task (human-generated texts vs LLM-generated texts) and a ternary classification task (human-generated texts, LLM-generated texts, and undecided). By evaluating six closed/open-source LLMs of different sizes, our findings reveal that while self-detection consistently outperforms cross-detection, i.e., LLMs can detect texts generated by themselves more accurately than those generated by other LLMs, the performance of self-detection is still far from ideal, indicating that further improvements are needed. We also show that extending the binary task to the ternary classification task with a new class "Undecided" can enhance both detection accuracy and explanation quality, with improvements being statistically significant and consistent across all LLMs. Finally, we conduct comprehensive qualitative and quantitative analyses of the explanation errors, which fall into three types: reliance on inaccurate features (the most frequent error), hallucinations, and incorrect reasoning. These findings, together with our human-annotated dataset, emphasize the need for further research into improving both self-detection and self-explanation, particularly to address overfitting issues that may hinder generalization.
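
For readers who want a concrete picture, a minimal sketch of how such an LLM-based ternary detection query might look follows; the prompt wording, the label parsing, and the query_llm stub are illustrative assumptions, not the paper's exact protocol.

    # Hypothetical sketch of ternary LLM-based self-detection with explanation.
    # The prompt template and query_llm() stub are assumptions for illustration;
    # the paper's actual prompts and models may differ.

    TERNARY_PROMPT = (
        "Decide whether the following text was written by a human, generated "
        "by an LLM, or is undecidable. Answer with one label from "
        "{{human, llm, undecided}} on the first line, then a one-sentence "
        "explanation.\n\nText: {text}\n"
    )

    def query_llm(prompt: str) -> str:
        """Placeholder for a call to a closed- or open-source LLM."""
        raise NotImplementedError("Wire this to your LLM of choice.")

    def detect(text: str) -> tuple[str, str]:
        """Return (label, explanation) parsed from the model's reply."""
        reply = query_llm(TERNARY_PROMPT.format(text=text))
        label, _, explanation = reply.partition("\n")
        label = label.strip().lower()
        if label not in {"human", "llm", "undecided"}:
            label = "undecided"  # fall back when the reply is malformed
        return label, explanation.strip()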
MindLLM: A Subject-Agnostic and Versatile Model for fMRI-to-Text Decoding
Qiu, Weikang, Huang, Zheng, Hu, Haoyu, Feng, Aosong, Yan, Yujun, Ying, Rex
Decoding functional magnetic resonance imaging (fMRI) signals into text has been a key challenge in the neuroscience community, with the potential to advance brain-computer interfaces and uncover deeper insights into brain mechanisms. However, existing approaches often struggle with suboptimal predictive performance, limited task variety, and poor generalization across subjects. In response to this, we propose MindLLM, a model designed for subject-agnostic and versatile fMRI-to-text decoding. MindLLM consists of an fMRI encoder and an off-the-shelf LLM. The fMRI encoder employs a neuroscience-informed attention mechanism, which is capable of accommodating subjects with varying input shapes and thus achieves high-performance subject-agnostic decoding. Moreover, we introduce Brain Instruction Tuning (BIT), a novel approach that enhances the model's ability to capture diverse semantic representations from fMRI signals, facilitating more versatile decoding. We evaluate MindLLM on comprehensive fMRI-to-text benchmarks. Results demonstrate that our model outperforms the baselines, improving downstream tasks by 12.0%, unseen subject generalization by 16.4%, and novel task adaptation by 25.0%. Furthermore, the attention patterns in MindLLM provide interpretable insights into its decision-making process.
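
As a rough illustration of subject-agnostic encoding, the sketch below uses a fixed set of learned queries that cross-attend over a variable number of voxel features, producing fixed-size tokens an LLM could consume. The dimensions and module names are made up, and MindLLM's neuroscience-informed priors are omitted; this is not the released model.

    # Illustrative PyTorch sketch: fixed learned queries cross-attend over a
    # variable number of voxels, yielding a fixed-size output. Dimensions and
    # names are assumptions, not MindLLM's code.
    import torch
    import torch.nn as nn

    class SubjectAgnosticEncoder(nn.Module):
        def __init__(self, voxel_dim: int = 1, hidden: int = 256, n_queries: int = 32):
            super().__init__()
            self.proj = nn.Linear(voxel_dim, hidden)           # per-voxel embedding
            self.queries = nn.Parameter(torch.randn(n_queries, hidden))
            self.attn = nn.MultiheadAttention(hidden, num_heads=4, batch_first=True)

        def forward(self, voxels: torch.Tensor) -> torch.Tensor:
            # voxels: (batch, n_voxels, voxel_dim); n_voxels may vary per subject
            kv = self.proj(voxels)
            q = self.queries.unsqueeze(0).expand(voxels.size(0), -1, -1)
            out, _ = self.attn(q, kv, kv)                      # (batch, n_queries, hidden)
            return out                                         # fixed-size tokens for the LLM

    enc = SubjectAgnosticEncoder()
    tokens = enc(torch.randn(2, 1500, 1))   # e.g., 1500 voxels for one subject
    print(tokens.shape)                     # torch.Size([2, 32, 256])

Because the query set is fixed while keys and values come from however many voxels a subject has, the same encoder weights apply across subjects with different input shapes.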
Detecting Machine-Generated Texts: Not Just "AI vs Humans" and Explainability is Complicated
Ji, Jiazhou, Li, Ruizhe, Li, Shujun, Guo, Jie, Qiu, Weidong, Huang, Zheng, Chen, Chiyu, Jiang, Xiaoyu, Lu, Xinru
As LLMs rapidly advance, concerns are growing about the actual authorship of texts we see online and in the real world. The task of distinguishing LLM-authored texts is complicated by the nuanced and overlapping behaviors of both machines and humans. In this paper, we challenge the current practice of treating LLM-generated text detection as a binary classification task of differentiating human from AI. Instead, we introduce a novel ternary text classification scheme, adding an "undecided" category for texts that could be attributed to either source, and we show that this new category is crucial for making detection results more explainable to lay users. This research shifts the paradigm from merely classifying to explaining machine-generated texts, emphasizing the need for detectors to provide clear and understandable explanations to users. Our study involves creating four new datasets comprising texts from various LLMs and human authors. Based on these datasets, we performed binary classification tests to identify the most effective SOTA detection methods and the SOTA LLMs capable of producing harder-to-detect texts. We then constructed a new dataset of texts generated by the two top-performing LLMs and by human authors, and asked three human annotators to produce ternary labels with explanation notes. This dataset was used to investigate how three top-performing SOTA detectors behave in the new ternary classification context. Our results highlight why the "undecided" category is much needed from the viewpoint of explainability. Additionally, we analyzed the explainability of the three best-performing detectors and the explanation notes of the human annotators, revealing insights about the complexity of explainable detection of machine-generated texts. Finally, we propose guidelines for developing future detection systems with improved explanatory power.
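
One simple way to realize an "undecided" category on top of an existing binary detector is an abstention band around the decision threshold, as in the sketch below; the threshold values are arbitrary illustrations, not the paper's calibration.

    # Sketch: turning a binary detector's score into a ternary label via an
    # abstention band. The thresholds are illustrative, not the paper's values.
    def ternary_label(p_llm: float, low: float = 0.35, high: float = 0.65) -> str:
        """p_llm is the detector's probability that the text is LLM-generated."""
        if p_llm >= high:
            return "llm"
        if p_llm <= low:
            return "human"
        return "undecided"   # too close to call; flag for explanation or human review

    assert ternary_label(0.9) == "llm"
    assert ternary_label(0.1) == "human"
    assert ternary_label(0.5) == "undecided"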
Enhancing Size Generalization in Graph Neural Networks through Disentangled Representation Learning
Huang, Zheng, Yang, Qihui, Zhou, Dawei, Yan, Yujun
Although most graph neural networks (GNNs) can operate on graphs of any size, their classification performance often declines on graphs larger than those encountered during training. Existing methods insufficiently address the removal of size information from graph representations, resulting in sub-optimal performance and reliance on backbone models. In response, we propose DISGEN, a novel and model-agnostic framework designed to disentangle size factors from graph representations. DISGEN employs size- and task-invariant augmentations and introduces a decoupling loss that minimizes shared information in hidden representations, with theoretical guarantees for its effectiveness. Our empirical results show that DISGEN outperforms the state-of-the-art models by up to 6% on real-world datasets, underscoring its effectiveness in enhancing the size generalizability of GNNs. Our code is available at: https://github.com/GraphmindDartmouth/DISGEN.
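
The disentanglement idea can be illustrated by a loss that discourages a size-related embedding and a task-related embedding from sharing information; the cosine-similarity penalty below is a simplified stand-in, not DISGEN's actual decoupling loss.

    # Simplified PyTorch sketch of a decoupling loss: penalize the (squared)
    # cosine similarity between the size embedding and the task embedding so
    # they carry different information. Only an illustration of the idea.
    import torch
    import torch.nn.functional as F

    def decoupling_loss(h_size: torch.Tensor, h_task: torch.Tensor) -> torch.Tensor:
        # h_size, h_task: (batch, dim) hidden representations from two heads
        cos = F.cosine_similarity(h_size, h_task, dim=-1)
        return (cos ** 2).mean()   # 0 when the two embeddings are orthogonal

    h_size, h_task = torch.randn(8, 64), torch.randn(8, 64)
    loss = decoupling_loss(h_size, h_task)   # add to the task loss with a weight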
Support or Refute: Analyzing the Stance of Evidence to Detect Out-of-Context Mis- and Disinformation
Yuan, Xin, Guo, Jie, Qiu, Weidong, Huang, Zheng, Li, Shujun
Mis- and disinformation online have become a major societal problem and a source of online harms of many kinds. One common form of mis- and disinformation is out-of-context (OOC) information, where different pieces of information are falsely associated, e.g., a real image combined with a false textual caption or a misleading textual description. Although some past studies have attempted to defend against OOC mis- and disinformation through external evidence, they tend to disregard the role of different pieces of evidence with different stances. Motivated by the intuition that the stance of evidence represents a bias towards different detection results, we propose a stance extraction network (SEN) that can extract the stances of different pieces of multi-modal evidence in a unified framework. Moreover, we introduce a support-refutation score, calculated from the co-occurrence relations of named entities, into the textual SEN. Extensive experiments on a public large-scale dataset demonstrated that our proposed method outperformed the state-of-the-art baselines, with the best model achieving a performance gain of 3.2% in accuracy.
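
To give a flavor of how named-entity co-occurrence can inform a support-refutation signal, the toy sketch below scores the entity overlap between a caption and one piece of textual evidence; this Jaccard overlap is only an illustration, not the paper's formulation.

    # Toy sketch of an entity-overlap score between a caption and a piece of
    # textual evidence, in the spirit of a support-refutation signal. The real
    # score is computed from named-entity co-occurrence relations; this
    # simplistic overlap is only an illustration.
    def entity_overlap(caption_entities: set[str], evidence_entities: set[str]) -> float:
        if not caption_entities or not evidence_entities:
            return 0.0
        inter = caption_entities & evidence_entities
        union = caption_entities | evidence_entities
        return len(inter) / len(union)   # high overlap hints at supporting evidence

    score = entity_overlap({"Paris", "Eiffel Tower"}, {"Paris", "Louvre"})
    print(f"{score:.2f}")   # 0.33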
Deep adaptive fuzzy clustering for evolutionary unsupervised representation learning
Tan, Dayu, Huang, Zheng, Peng, Xin, Zhong, Weimin, Mahalec, Vladimir
Cluster assignment of large and complex images is a crucial but challenging task in pattern recognition and computer vision. In this study, we explore the possibility of employing fuzzy clustering in a deep neural network framework, and present a novel evolutionary unsupervised representation learning model with iterative optimization. It implements the deep adaptive fuzzy clustering (DAFC) strategy, which learns a convolutional neural network classifier from unlabeled data samples alone. DAFC consists of a deep feature quality-verifying model and a fuzzy clustering model, implementing a deep feature representation learning loss function and embedded fuzzy clustering with weighted adaptive entropy. We couple fuzzy clustering with the deep reconstruction model, in which fuzzy membership is utilized to represent a clear structure of deep cluster assignments, and jointly optimize deep representation learning and clustering. The joint model also evaluates current clustering performance by inspecting whether the data re-sampled from the estimated bottleneck space have consistent clustering properties, progressively improving the deep clustering model. Comprehensive experiments on a variety of datasets show that the proposed method obtains substantially better performance in both reconstruction and clustering quality than other state-of-the-art deep clustering methods, as demonstrated by the in-depth analysis in our extensive experiments.
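
The fuzzy-membership component that DAFC embeds can be illustrated with a standard entropy-regularized fuzzy assignment (a softmax over negative squared distances); the fixed weight lambda_ below stands in for an adaptive entropy weight, and the sketch is not the paper's full model.

    # Standard entropy-regularized fuzzy assignment, shown only to illustrate
    # the fuzzy-membership component embedded in a deep clustering model;
    # lambda_ plays the role of the entropy weight and is fixed for simplicity.
    import numpy as np

    def fuzzy_memberships(X: np.ndarray, centers: np.ndarray, lambda_: float = 1.0) -> np.ndarray:
        # X: (n, d) features from the encoder; centers: (k, d) cluster centers
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)   # (n, k) squared distances
        logits = -d2 / lambda_
        logits -= logits.max(axis=1, keepdims=True)                 # numerical stability
        u = np.exp(logits)
        return u / u.sum(axis=1, keepdims=True)                     # each row sums to 1

    X = np.random.randn(100, 16)
    centers = np.random.randn(3, 16)
    U = fuzzy_memberships(X, centers)   # soft assignments usable in a joint loss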
ICDAR2019 Competition on Scanned Receipt OCR and Information Extraction
Huang, Zheng, Chen, Kai, He, Jianhua, Bai, Xiang, Karatzas, Dimosthenis, Lu, Shijian, Jawahar, C. V.
Scanned receipt OCR and key information extraction (SROIE) refer to the processes of recognizing text from scanned receipts, extracting key texts from them, and saving the extracted texts to structured documents. SROIE plays a critical role in many document analysis applications and holds great commercial potential, but very little research and few advances have been published in this area. In recognition of the technical challenges, importance and huge commercial potential of SROIE, we organized the ICDAR 2019 competition on SROIE. In this competition, we set up three tasks, namely Scanned Receipt Text Localisation (Task 1), Scanned Receipt OCR (Task 2) and Key Information Extraction from Scanned Receipts (Task 3). A new dataset with 1000 whole scanned receipt images and annotations was created for the competition. In this report we present the motivation, competition datasets, task definitions, evaluation protocol, submission statistics, performance of the submitted methods and an analysis of the results.
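
Key information extraction in such a setting is commonly scored by exact matching of extracted fields against the ground truth; the sketch below shows one plausible per-field scoring scheme, not the competition's official evaluation script.

    # Plausible per-field exact-match scoring for key information extraction
    # (e.g., company, date, address, total); an illustrative sketch only, not
    # the competition's official evaluation protocol.
    def field_accuracy(pred: dict[str, str], gold: dict[str, str]) -> float:
        keys = gold.keys()
        correct = sum(pred.get(k, "").strip() == gold[k].strip() for k in keys)
        return correct / len(keys) if keys else 0.0

    gold = {"company": "ACME", "date": "2019-01-01", "total": "12.50"}
    pred = {"company": "ACME", "date": "2019-01-01", "total": "12.00"}
    print(field_accuracy(pred, gold))   # 0.666..., two of three fields match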
Double Supervised Network with Attention Mechanism for Scene Text Recognition
Gao, Yuting, Huang, Zheng, Dai, Yuchen
In this paper, we propose the Double Supervised Network with Attention Mechanism (DSAN), a novel end-to-end trainable framework for scene text recognition. It incorporates a text attention module during feature extraction that forces the model to focus on text regions, and the whole framework is supervised by two branches: one supervision branch comes from context-level modelling, and the other comes from an extra supervision enhancement branch that tackles inexplicit semantic information at the character level. These two supervisions benefit each other and yield better performance. The proposed approach can recognize text of arbitrary length and does not need any predefined lexicon. Our method outperforms the current state-of-the-art methods on three text recognition benchmarks, IIIT5K, ICDAR2013 and SVT, reaching accuracies of 88.6%, 92.3% and 84.1% respectively, which suggests the effectiveness of the proposed method.
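
The double-supervision idea amounts to summing a context-level (sequence) loss with an auxiliary character-level loss; the sketch below illustrates this with cross-entropy losses and an arbitrary weight alpha, which are assumptions rather than DSAN's exact formulation.

    # Illustrative PyTorch sketch of combining two supervision branches: a
    # context-level (sequence) loss plus a character-level auxiliary loss.
    # The weight alpha and the loss choices are assumptions for illustration.
    import torch
    import torch.nn as nn

    seq_loss_fn = nn.CrossEntropyLoss()    # context-level branch (per decoding step)
    char_loss_fn = nn.CrossEntropyLoss()   # extra character-level supervision branch

    def double_supervised_loss(seq_logits, seq_targets, char_logits, char_targets, alpha=0.5):
        # seq_logits: (N, C) decoder outputs; char_logits: (M, C) per-character outputs
        return seq_loss_fn(seq_logits, seq_targets) + alpha * char_loss_fn(char_logits, char_targets)

    # 37 classes here assumes lexicon-free recognition over digits and letters.
    loss = double_supervised_loss(
        torch.randn(10, 37), torch.randint(0, 37, (10,)),
        torch.randn(24, 37), torch.randint(0, 37, (24,)),
    )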