Bao, Guangsheng
Decoupling Content and Expression: Two-Dimensional Detection of AI-Generated Text
Bao, Guangsheng, Rong, Lihua, Zhao, Yanbin, Zhou, Qiji, Zhang, Yue
The wide use of LLMs raises critical requirements for detecting AI participation in texts. Existing studies investigate such detection in scattered contexts, leaving a systematic and unified approach unexplored. In this paper, we present HART, a hierarchical framework of AI risk levels, each corresponding to a detection task. To address these tasks, we propose a novel 2D detection method that decouples a text into content and language expression. Our findings show that content is resistant to surface-level changes, which can serve as a key feature for detection. Experiments demonstrate that the 2D method significantly outperforms existing detectors, improving AUROC from 0.705 to 0.849 on level-2 detection and from 0.807 to 0.886 on RAID. We release our data and code at https://github.com/baoguangsheng/truth-mirror.
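A rough sketch of the two-dimensional idea: score a text separately on its language expression (surface form) and on its content, where content is approximated here by what survives a paraphrase. The `scorer` and `paraphraser` interfaces, the paraphrase proxy, and the linear combination are illustrative assumptions, not the paper's actual method.

```python
def expression_score(text: str, scorer) -> float:
    # Surface-level signal, e.g. a zero-shot statistic such as
    # conditional probability curvature over the raw tokens.
    return scorer.curvature(text)

def content_score(text: str, paraphraser, scorer) -> float:
    # Content is resistant to surface-level changes, so one proxy is
    # to paraphrase the text and score what survives the rewrite.
    rewritten = paraphraser.rewrite(text)
    return scorer.curvature(rewritten)

def detect_2d(text: str, paraphraser, scorer, w: float = 0.5) -> float:
    # Combine the two axes; a higher score suggests AI participation.
    e = expression_score(text, scorer)
    c = content_score(text, paraphraser, scorer)
    return w * c + (1.0 - w) * e
```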
Direct Value Optimization: Improving Chain-of-Thought Reasoning in LLMs with Refined Values
Zhang, Hongbo, Cui, Han, Bao, Guangsheng, Yang, Linyi, Wang, Jun, Zhang, Yue
We introduce Direct Value Optimization (DVO), an innovative reinforcement learning framework for enhancing large language models in complex reasoning tasks. Unlike traditional methods that rely on preference labels, DVO utilizes value signals at individual reasoning steps, optimizing models via a mean squared error loss. The key benefit of DVO lies in its fine-grained supervision, circumventing the need for labor-intensive human annotations. Target values in DVO are estimated using either Monte Carlo Tree Search or an outcome value model. Our empirical analysis on both mathematical and commonsense reasoning tasks shows that DVO consistently outperforms existing offline preference optimization techniques, even with fewer training steps. These findings underscore the importance of value signals in advancing reasoning capabilities and highlight DVO as a superior methodology in scenarios lacking explicit human preference information.
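The training signal can be pictured as a per-step regression. Below is a minimal sketch of a mean-squared-error loss over step values, assuming targets were estimated offline (e.g., by Monte Carlo Tree Search or an outcome value model); the shapes and masking are illustrative, not the paper's exact objective.

```python
import torch

def dvo_step_loss(step_values: torch.Tensor,
                  target_values: torch.Tensor,
                  step_mask: torch.Tensor) -> torch.Tensor:
    """MSE between the model's implied value at each reasoning step and
    an offline target value. Illustrative shapes: [batch, num_steps];
    step_mask holds 1.0 for valid steps and 0.0 for padding."""
    sq_err = (step_values - target_values) ** 2
    return (sq_err * step_mask).sum() / step_mask.sum().clamp(min=1)
```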
Glimpse: Enabling White-Box Methods to Use Proprietary Models for Zero-Shot LLM-Generated Text Detection
Bao, Guangsheng, Zhao, Yanbin, He, Juncai, Zhang, Yue
Advanced large language models (LLMs) can generate text almost indistinguishable from human-written text, highlighting the importance of LLM-generated text detection. However, current zero-shot techniques face challenges: white-box methods are restricted to weaker open-source LLMs, and black-box methods are limited by partial observations from stronger proprietary LLMs. It seems impossible to enable white-box methods to use proprietary models, because API-level access provides neither full predictive distributions nor internal embeddings. To bridge this divide, we propose Glimpse, a probability distribution estimation approach that predicts full distributions from partial observations. Despite its simplicity, Glimpse successfully extends white-box methods such as Entropy, Rank, Log-Rank, and Fast-DetectGPT to the latest proprietary models. Experiments show that Glimpse with Fast-DetectGPT and GPT-3.5 achieves an average AUROC of about 0.95 across five of the latest source models, improving the score by 51% relative to the remaining space of the open-source baseline. This demonstrates that the latest LLMs can effectively detect their own outputs, suggesting that advanced LLMs may be the best shield against themselves.
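A hedged sketch of the distribution-estimation step: given only the top-k token probabilities an API returns, extrapolate a plausible tail and renormalize, so that white-box statistics (entropy, rank, log-rank) become computable. The geometric-decay tail used here is an illustrative assumption, not necessarily Glimpse's actual estimator.

```python
import numpy as np

def estimate_full_distribution(topk_probs: np.ndarray, vocab_size: int) -> np.ndarray:
    """Extend a partial (top-k) observation to a full distribution by
    continuing a fitted geometric decay over the unseen tail, then
    renormalizing. Purely illustrative of the idea."""
    topk = np.sort(topk_probs)[::-1]                  # descending
    remaining = max(1.0 - topk.sum(), 0.0)            # mass of the unseen tail
    tail_len = vocab_size - len(topk)
    # Fit a per-rank decay ratio to the observed top-k and continue it.
    ratio = (topk[-1] / topk[0]) ** (1.0 / max(len(topk) - 1, 1))
    tail = topk[-1] * ratio ** np.arange(1, tail_len + 1)
    tail *= remaining / tail.sum() if tail.sum() > 0 else 0.0
    full = np.concatenate([topk, tail])
    return full / full.sum()
```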
CycleResearcher: Improving Automated Research via Automated Review
Weng, Yixuan, Zhu, Minjun, Bao, Guangsheng, Zhang, Hongbo, Wang, Jindong, Zhang, Yue, Yang, Linyi
The automation of scientific discovery has been a long-standing goal within the research community, driven by the potential to accelerate knowledge creation. While significant progress has been made using commercial large language models (LLMs) as research assistants or idea generators, the possibility of automating the entire research process with open-source LLMs remains largely unexplored. This paper explores the feasibility of using open-source post-trained LLMs as autonomous agents capable of performing the full cycle of automated research and review, from literature review and manuscript preparation to peer review and paper revision. Our iterative preference training framework consists of CycleResearcher, which conducts research tasks, and CycleReviewer, which simulates the peer review process, providing iterative feedback via reinforcement learning. To train these models, we develop two new datasets, Review-5k and Research-14k, reflecting real-world machine learning research and peer review dynamics. Our results demonstrate that CycleReviewer achieves a 26.89% improvement in mean absolute error (MAE) over individual human reviewers in predicting paper scores, indicating that LLMs can surpass expert-level performance in research evaluation. On the research side, papers generated by the CycleResearcher model achieve a score of 5.36 in simulated peer review, surpassing the preprint-level score of 5.24 from human experts and approaching the accepted-paper level of 5.69. This work represents a significant step toward fully automated scientific inquiry, providing ethical safeguards and advancing AI-driven research capabilities. The code, datasets, and model weights are released at https://github.com/minjun-zhu/Researcher.
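One round of the research-review loop might look like the sketch below: the researcher drafts candidate papers, the reviewer scores them, and the ranked drafts become preference pairs for the next training iteration. All interfaces are hypothetical stand-ins for the framework, not its released API.

```python
def cycle_training_round(researcher, reviewer, topics):
    """One illustrative iteration: draft, review, and collect preference
    pairs for the researcher's next preference-training update."""
    preference_data = []
    for topic in topics:
        drafts = [researcher.write(topic) for _ in range(2)]
        scored = sorted(drafts, key=reviewer.score, reverse=True)
        # Best-scored draft is "chosen", worst is "rejected".
        preference_data.append({"chosen": scored[0], "rejected": scored[-1]})
    researcher.preference_update(preference_data)
    return preference_data
```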
NovelQA: Benchmarking Question Answering on Documents Exceeding 200K Tokens
Wang, Cunxiang, Ning, Ruoxi, Pan, Boqi, Wu, Tonghui, Guo, Qipeng, Deng, Cheng, Bao, Guangsheng, Hu, Xiangkun, Zhang, Zheng, Wang, Qian, Zhang, Yue
The rapid advancement of Large Language Models (LLMs) has introduced a new frontier in natural language processing, particularly in understanding and processing long-context information. However, evaluating these models' long-context abilities remains a challenge due to the limitations of current benchmarks. To address this gap, we introduce NovelQA, a benchmark specifically designed to test the capabilities of LLMs with extended texts. Constructed from English novels, NovelQA offers a unique blend of complexity, length, and narrative coherence, making it an ideal tool for assessing deep textual understanding in LLMs. This paper presents the design and construction of NovelQA, highlighting its manual annotation and diverse question types. Our evaluation of long-context LLMs on NovelQA reveals significant insights into the models' performance, particularly emphasizing the challenges they face with multi-hop reasoning, detail-oriented questions, and extremely long inputs averaging more than 200,000 tokens. The results underscore the necessity for further advancements in LLMs to improve their long-context comprehension.
LLMs with Chain-of-Thought Are Non-Causal Reasoners
Bao, Guangsheng, Zhang, Hongbo, Yang, Linyi, Wang, Cunxiang, Zhang, Yue
This paper explores the role of the Chain of Thought (CoT) in Large Language Model (LLM) reasoning. Despite its potential to improve task performance, our analysis reveals a surprising frequency of correct answers following incorrect CoTs, and vice versa. We employ causal analysis to assess the cause-effect relationship between CoTs/instructions and answers in LLMs, uncovering the Structural Causal Model (SCM) that LLMs approximate. By comparing the implied SCM with that of human reasoning, we highlight discrepancies between LLM and human reasoning processes. We further examine the factors influencing the causal structure of the implied SCM, revealing that in-context learning, supervised fine-tuning, and reinforcement learning from human feedback significantly impact the causal relations. We release the code and results at https://github.com/StevenZHB/CoT_Causal_Analysis.
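The headline observation, correct answers following incorrect CoTs and vice versa, reduces to a simple contingency count over labeled runs. A minimal sketch, assuming records annotated with CoT and answer correctness (the record schema is hypothetical):

```python
from collections import Counter

def cot_answer_contingency(records):
    """Tally the four (CoT correct?, answer correct?) combinations over
    records like {"cot_correct": bool, "ans_correct": bool}. A high count
    of (False, True) is the non-causal pattern the paper reports."""
    table = Counter((r["cot_correct"], r["ans_correct"]) for r in records)
    return {
        "correct_cot_correct_ans": table[(True, True)],
        "incorrect_cot_correct_ans": table[(False, True)],
        "correct_cot_incorrect_ans": table[(True, False)],
        "incorrect_cot_incorrect_ans": table[(False, False)],
    }
```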
Supervised Knowledge Makes Large Language Models Better In-context Learners
Yang, Linyi, Zhang, Shuibai, Yu, Zhuohao, Bao, Guangsheng, Wang, Yidong, Wang, Jindong, Xu, Ruochen, Ye, Wei, Xie, Xing, Chen, Weizhu, Zhang, Yue
Large Language Models (LLMs) exhibit emerging in-context learning abilities through prompt engineering. The recent progress in large-scale generative models has further expanded their use in real-world language applications. However, the critical challenge of improving the generalizability and factuality of LLMs in natural language understanding and question answering remains under-explored. While previous in-context learning research has focused on enhancing models to adhere to users' specific instructions and quality expectations, and to avoid undesired outputs, little to no work has explored the use of task-Specific finetuned Language Models (SLMs) to improve LLMs' in-context learning during the inference stage. Our primary contribution is a simple yet effective framework that enhances the reliability of LLMs as it 1) generalizes to out-of-distribution data, 2) elucidates how LLMs benefit from discriminative models, and 3) minimizes hallucinations in generative tasks. Using our proposed plug-in method, enhanced versions of Llama 2 and ChatGPT surpass their original versions in generalizability and factuality. We offer a comprehensive suite of resources, including 16 curated datasets, prompts, model checkpoints, and LLM outputs across 9 distinct tasks. Our empirical analysis sheds light on the advantages of incorporating discriminative models into LLMs and highlights the potential of our methodology in fostering more reliable LLMs.
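A minimal sketch of the plug-in idea: a task-specific finetuned small model (SLM) first makes a discriminative prediction, which is then surfaced to the LLM as supervised knowledge inside the prompt. The prompt wording and the `slm` interface are illustrative assumptions, not the paper's exact design.

```python
def slm_augmented_prompt(question: str, slm, label_names) -> str:
    """Build an LLM prompt that carries the SLM's prediction and
    confidence as an explicit hint. Purely illustrative."""
    probs = slm.predict_proba(question)          # e.g. softmax over labels
    best = max(range(len(probs)), key=probs.__getitem__)
    hint = (f"A task-specific model predicts '{label_names[best]}' "
            f"with confidence {probs[best]:.2f}.")
    return f"{question}\n\n{hint}\nConsidering this evidence, answer the question."
```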
Non-Autoregressive Document-Level Machine Translation
Bao, Guangsheng, Teng, Zhiyang, Zhou, Hao, Yan, Jianhao, Zhang, Yue
Non-autoregressive translation (NAT) models achieve comparable performance and superior speed relative to autoregressive translation (AT) models in sentence-level machine translation (MT). However, their abilities remain unexplored in document-level MT, hindering their use in real-world scenarios. In this paper, we conduct a comprehensive examination of typical NAT models in the context of document-level MT and further propose a simple but effective design of sentence alignment between source and target. Experiments show that NAT models achieve high acceleration on documents and that sentence alignment significantly enhances their performance. However, current NAT models still exhibit a significant performance gap compared to their AT counterparts. Further investigation reveals that NAT models suffer more severely from multi-modality and misalignment issues in document-level MT and struggle to exploit document context and handle discourse phenomena. We delve into these challenges and provide our code at https://github.com/baoguangsheng/nat-on-doc.
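The sentence-alignment design can be pictured as a cross-attention mask that confines each target sentence to its corresponding source sentence, so sentences translate in parallel without drifting across boundaries. A sketch, assuming token-to-sentence index arrays are available; this illustrates the constraint, not the paper's exact implementation.

```python
import numpy as np

def sentence_alignment_mask(src_sent_ids: np.ndarray,
                            tgt_sent_ids: np.ndarray) -> np.ndarray:
    """Return a boolean [tgt_len, src_len] mask where mask[i, j] is True
    iff target token i belongs to the same sentence index as source
    token j, restricting cross-attention to aligned sentences."""
    return tgt_sent_ids[:, None] == src_sent_ids[None, :]
```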
GEMINI: Controlling the Sentence-level Writing Style for Abstractive Text Summarization
Bao, Guangsheng, Ou, Zebin, Zhang, Yue
Human experts write summaries using different techniques, including extracting a sentence from the document and rewriting it, or fusing various pieces of information from the document to abstract it. These techniques are flexible and thus difficult to imitate with any single method. To address this issue, we propose an adaptive model, GEMINI, that integrates a rewriter and a generator to mimic the sentence-rewriting and abstracting techniques, respectively. GEMINI adaptively chooses to rewrite a specific document sentence or to generate a summary sentence from scratch. Experiments demonstrate that our adaptive approach outperforms pure abstractive and rewriting baselines on three benchmark datasets, achieving the best results on WikiHow. Interestingly, empirical results show that the styles of human-written summary sentences are consistently predictable given their context. We release our code and model at https://github.com/baoguangsheng/gemini.
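The adaptive decoding loop can be sketched as a controller choosing, per summary sentence, between rewriting a specific document sentence and generating one from scratch. The three component interfaces below are hypothetical, not the released model's API.

```python
def gemini_style_decode(document_sents, controller, rewriter, generator):
    """Produce a summary sentence by sentence, with the controller
    deciding per step between "rewrite" and "abstract". Illustrative."""
    summary = []
    while not controller.done(summary):
        choice, sent_idx = controller.next_action(document_sents, summary)
        if choice == "rewrite":
            summary.append(rewriter.rewrite(document_sents[sent_idx]))
        else:  # "abstract": fuse information from the whole document
            summary.append(generator.generate(document_sents, summary))
    return summary
```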
Fast-DetectGPT: Efficient Zero-Shot Detection of Machine-Generated Text via Conditional Probability Curvature
Bao, Guangsheng, Zhao, Yanbin, Teng, Zhiyang, Yang, Linyi, Zhang, Yue
Table 4: Details of the source models used to produce machine-generated text.

We assess the performance of our methodologies using text generations sourced from the various models outlined in Table 4. The models are ordered by parameter count; those with fewer than 20 billion parameters are run locally on a Tesla A100 GPU (80GB). For models with more than 6 billion parameters we employ half precision (float16); otherwise we use full precision (float32). For larger models such as GPT-3, ChatGPT, and GPT-4, we rely on the OpenAI API for evaluation. We additionally list the training corpus associated with each model, which we believe is pertinent to understanding the detection accuracy of different sampling and scoring models when applied to text generations originating from diverse source models, domains, and languages.
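A minimal sketch of the stated precision rule, assuming a Hugging Face-style loading interface; the 6-billion-parameter threshold comes from the passage, everything else (function name, parameters) is illustrative.

```python
import torch
from transformers import AutoModelForCausalLM

def load_scoring_model(name: str, num_params_b: float):
    """Load a local source/scoring model with half precision (float16)
    above 6B parameters and full precision (float32) otherwise, per the
    setup described in the text."""
    dtype = torch.float16 if num_params_b > 6 else torch.float32
    return AutoModelForCausalLM.from_pretrained(
        name, torch_dtype=dtype, device_map="auto")
```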