Applying Ensemble Methods to Model-Agnostic Machine-Generated Text Detection
arXiv.org Artificial Intelligence
In this paper, we study the problem of detecting machine-generated text when the large language model (LLM) it is possibly derived from is unknown. We do so by applying ensembling methods to the outputs from DetectGPT classifiers (Mitchell et al., 2023), a zero-shot model for machine-generated text detection that is highly accurate when the generative (or base) language model is the same as the discriminative (or scoring) language model. We find that simple summary statistics of DetectGPT sub-model outputs yield an AUROC of 0.73 (relative to 0.61) while retaining its zero-shot nature, and that supervised learning methods sharply boost the accuracy to an AUROC of 0.94.

These can range from logistic regression models to convolutional neural networks (Weller and Woo, 2019) or LSTM models (Kudugunta and Ferrara, 2018). These binary classifiers can also act as base learners in ensemble methods (Fayaz et al., 2020). These features can also be augmented with additional information such as account data in the context of social media bot detection. However, high classification accuracy for these methods depends on sufficiently long text and on a training corpus of machine-generated samples that is sufficiently diverse in its stylometric and linguistic characteristics to prevent overfitting. As such, these classifiers need to be continually retrained and updated, limiting their usefulness (Pegoraro et al., 2023).
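The zero-shot ensembling idea from the abstract (collapsing the outputs of several DetectGPT sub-models into one score with a simple summary statistic) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names `ensemble_scores` and `auroc`, the toy score matrix, and the convention that a higher score means "more likely machine-generated" are all assumptions made for this example, and the numbers are invented rather than taken from the paper.

```python
from statistics import mean, median


def ensemble_scores(sub_model_scores, reducer=mean):
    """Collapse one row of per-scoring-model scores into a single score per text.

    `reducer` is any summary statistic over a row (e.g. mean, median, max).
    No labels are used, so the ensemble stays zero-shot.
    """
    return [reducer(row) for row in sub_model_scores]


def auroc(scores, labels):
    """AUROC as the probability that a random positive outscores a random
    negative, counting ties as one half (pairwise definition)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: each row is one text, each column is the score from
# one scoring model. Each machine-generated text is scored highly only by the
# scoring model that happens to match its (unknown) generator.
toy_scores = [
    [0.9, 0.2, 0.4],  # machine-generated
    [0.1, 0.8, 0.3],  # machine-generated
    [0.2, 0.1, 0.9],  # machine-generated
    [0.3, 0.3, 0.2],  # human-written
    [0.2, 0.4, 0.1],  # human-written
]
toy_labels = [1, 1, 1, 0, 0]

# Compare a single scoring model against summary-statistic ensembles.
single_model = [row[0] for row in toy_scores]
for name, scores in [
    ("single model", single_model),
    ("mean", ensemble_scores(toy_scores, mean)),
    ("median", ensemble_scores(toy_scores, median)),
    ("max", ensemble_scores(toy_scores, max)),
]:
    print(f"{name}: AUROC = {auroc(scores, toy_labels):.3f}")
```

On this contrived data a single scoring model separates the classes poorly because it only recognizes text from its own generator, while a row-wise statistic such as the maximum recovers the signal. The supervised variant mentioned in the abstract would instead fit a classifier on the same score matrix, at the cost of needing labeled training data.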
Jun-18-2024
- Country:
  - North America > United States
    - Washington > King County
      - Seattle (0.04)
    - Georgia > Fulton County
      - Atlanta (0.04)
    - California > Santa Clara County
      - Palo Alto (0.04)
  - Europe
    - Germany > Berlin (0.04)
    - Belgium > Brussels-Capital Region
      - Brussels (0.04)
- Genre:
  - Research Report > New Finding (0.35)
- Industry:
  - Information Technology > Security & Privacy (0.48)
- Technology: