
xlnet



dc6a7e655d7e5840e66733e9ee67cc69-AuthorFeedback.pdf

Neural Information Processing Systems

We thank all the reviewers for their helpful suggestions; we will incorporate the following analysis into our revision. First, we found four typical attention patterns shared by both XLNet and BERT, as shown in Figure 1 (caption: "Attention patterns shared by XLNet and BERT"; rows and columns represent queries and keys, respectively).
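
The query-key attention maps referenced here can be reproduced directly from pretrained checkpoints. Below is a minimal sketch, assuming the Hugging Face transformers library and an illustrative BERT checkpoint; it is not the authors' analysis code.

```python
# Extract per-head attention maps for inspection; rows index queries and
# columns index keys, matching the convention in the figure caption above.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")  # illustrative checkpoint
model = AutoModel.from_pretrained("bert-base-cased")

inputs = tokenizer("XLNet and BERT share attention patterns.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs, output_attentions=True)

# outputs.attentions holds one tensor per layer, each of shape
# (batch, num_heads, seq_len, seq_len).
layer0_head0 = outputs.attentions[0][0, 0]
print(layer0_head0.shape)  # (seq_len, seq_len) map for layer 0, head 0
```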



MPNet: Masked and Permuted Pre-training for Language Understanding

Neural Information Processing Systems

BERT adopts masked language modeling (MLM) for pre-training and is one of the most successful pre-training models. Since BERT neglects dependency among predicted tokens, XLNet introduces permuted language modeling (PLM) for pre-training to address this problem. However, XLNet does not leverage the full position information of a sentence and thus suffers from a position discrepancy between pre-training and fine-tuning. In this paper, we propose MPNet, a novel pre-training method that inherits the advantages of BERT and XLNet and avoids their limitations. MPNet leverages the dependency among predicted tokens through permuted language modeling (vs. MLM in BERT), and takes auxiliary position information as input to make the model see a full sentence, thus reducing the position discrepancy (vs. PLM in XLNet).
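
The contrast the abstract draws between PLM and MPNet-style inputs can be made concrete with a toy example. The sketch below, in plain Python, follows only the description above; the sentence, context size, and (token, position) encoding are illustrative, not MPNet's actual implementation.

```python
# Toy illustration of the two input layouts: PLM conditions only on the
# permuted context tokens, while an MPNet-style input also exposes the
# positions of the tokens still to be predicted via [MASK] placeholders.
import random

tokens = ["the", "task", "is", "sentiment", "analysis"]
perm = list(range(len(tokens)))  # a random factorization order
random.shuffle(perm)

c = 3  # number of non-predicted (context) tokens
context, predicted = perm[:c], perm[c:]

# PLM (XLNet-style): positions of yet-unpredicted tokens are not visible.
plm_input = [(tokens[i], i) for i in context]

# MPNet-style: append [MASK] tokens carrying the predicted positions, so
# the model always sees the position information of the full sentence.
mpnet_input = plm_input + [("[MASK]", i) for i in predicted]

print("PLM input:  ", plm_input)
print("MPNet input:", mpnet_input)
```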


Health App Reviews for Privacy & Trust (HARPT): A Corpus for Analyzing Patient Privacy Concerns, Trust in Providers and Trust in Applications

Kelly, Timoteo, Korkmaz, Abdulkadir, Mallet, Samuel, Souders, Connor, Aliakbarpour, Sadra, Rao, Praveen

arXiv.org Artificial Intelligence

Background: User reviews of Telehealth and Patient Portal mobile applications (apps), hereon referred to as electronic health (eHealth) apps, are a rich source of unsolicited patient feedback, revealing critical insights into patient perceptions. However, the lack of large-scale, annotated datasets specific to privacy and trust has limited researchers' ability to systematically analyze these concerns using natural language processing (NLP) techniques. Objective: This study aims to develop and benchmark Health App Reviews for Privacy & Trust (HARPT), a large-scale annotated corpus of patient reviews from eHealth apps, to advance research in patient privacy and trust. Methods: We employed a multistage data-construction strategy that integrated keyword-based filtering, iterative manual labeling with review, targeted data augmentation, and weak supervision using transformer-based classifiers. A curated subset of 7,000 reviews was manually annotated to support machine learning model development and evaluation, and the resulting dataset was used to benchmark a broad range of models. Results: The HARPT corpus comprises 480,000 patient reviews annotated across seven categories capturing critical aspects of trust in the application (TA), trust in the provider (TP), and privacy concerns (PC). We provide comprehensive benchmark performance for a range of machine learning models on the manually annotated subset, establishing a baseline for future research. Conclusions: The HARPT corpus is a significant resource for advancing the study of privacy and trust in the eHealth domain. By providing a large-scale, annotated dataset and initial benchmarks, this work supports reproducible research in usable privacy and trust within health informatics. HARPT is released under an open resource license.
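
To make the benchmarking setup concrete, the sketch below trains a simple TF-IDF baseline on the annotated subset with scikit-learn. It is not the paper's benchmark code; the file name and the "review_text"/"label" column names are hypothetical placeholders.

```python
# Baseline classifier for the seven-category task: TF-IDF features plus
# logistic regression, evaluated on a stratified held-out split.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

df = pd.read_csv("harpt_annotated_subset.csv")  # hypothetical file name
X_train, X_test, y_train, y_test = train_test_split(
    df["review_text"], df["label"],
    test_size=0.2, stratify=df["label"], random_state=0,
)

clf = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), min_df=2),
    LogisticRegression(max_iter=1000),
)
clf.fit(X_train, y_train)
print(classification_report(y_test, clf.predict(X_test)))
```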


XLNet: Generalized Autoregressive Pretraining for Language Understanding

Zhilin Yang, Zihang Dai, Yiming Yang, Jaime Carbonell, Russ R. Salakhutdinov, Quoc V. Le

Neural Information Processing Systems

With the capability of modeling bidirectional contexts, denoising autoencoding based pretraining like BERT achieves better performance than pretraining approaches based on autoregressive language modeling. However, relying on corrupting the input with masks, BERT neglects dependency between the masked positions and suffers from a pretrain-finetune discrepancy. In light of these pros and cons, we propose XLNet, a generalized autoregressive pretraining method that (1) enables learning bidirectional contexts by maximizing the expected likelihood over all permutations of the factorization order and (2) overcomes the limitations of BERT thanks to its autoregressive formulation.
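
The permutation objective named in point (1) can be written compactly. The LaTeX fragment below follows the paper's formulation, where $\mathcal{Z}_T$ is the set of all permutations of the index sequence $[1, \dots, T]$ and $z$ is one factorization order:

```latex
% Maximize the expected log-likelihood of sequence x over all
% factorization orders z; x_{z_t} is the token at the t-th index of z.
\max_{\theta} \; \mathbb{E}_{z \sim \mathcal{Z}_T}
  \left[ \sum_{t=1}^{T} \log p_{\theta}\!\left( x_{z_t} \mid x_{z_{<t}} \right) \right]
```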