susceptibility score
Context versus Prior Knowledge in Language Models
Du, Kevin, Snæbjarnarson, Vésteinn, Stoehr, Niklas, White, Jennifer C., Schein, Aaron, Cotterell, Ryan
To answer a question, language models often need to integrate prior knowledge learned during pretraining and new information presented in context. We hypothesize that models perform this integration in a predictable way across different questions and contexts: models will rely more on prior knowledge for questions about entities (e.g., persons, places, etc.) that they are more familiar with due to higher exposure in the training corpus, and be more easily persuaded by some contexts than others. To formalize this problem, we propose two mutual information-based metrics to measure a model's dependency on a context and on its prior about an entity: first, the persuasion score of a given context represents how much a model depends on the context in its decision, and second, the susceptibility score of a given entity represents how much the model can be swayed away from its original answer distribution about an entity. We empirically test our metrics for their validity and reliability. Finally, we explore and find a relationship between the scores and the model's expected familiarity with an entity, and provide two use cases to illustrate their benefits.
- North America > United States > Illinois > Cook County > Chicago (0.04)
- Europe > Slovenia > Central Slovenia > Municipality of Ljubljana > Ljubljana (0.04)
- North America > United States > New York (0.04)
- Research Report > New Finding (0.98)
- Research Report > Experimental Study (0.68)
- Media (0.93)
- Government > Regional Government > North America Government > United States Government (0.67)
- Leisure & Entertainment > Sports > Soccer (0.67)
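The persuasion and susceptibility scores in the abstract above are both defined via mutual information between the context shown to the model and its answer distribution. A minimal sketch of that quantity, assuming discrete answer distributions; the function names and toy numbers here are illustrative, not taken from the paper:

```python
import math

def kl_divergence(p, q):
    """KL(p || q) for discrete distributions given as equal-length lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def mutual_information(context_weights, answer_dists):
    """I(answer; context) = sum_c p(c) * KL( p(answer | c) || p(answer) ),
    where the marginal p(answer) is the weighted mixture over contexts."""
    n = len(answer_dists[0])
    marginal = [sum(w * d[i] for w, d in zip(context_weights, answer_dists))
                for i in range(n)]
    return sum(w * kl_divergence(d, marginal)
               for w, d in zip(context_weights, answer_dists))

# Two contexts that pull a binary answer in opposite directions:
# a high score means the contexts sway the model's answer a lot.
dists = [[0.9, 0.1], [0.2, 0.8]]
weights = [0.5, 0.5]
score = mutual_information(weights, dists)
```

If the answer distribution is identical under every context, the score is zero; the more the contexts move the answers apart, the closer it gets to the entropy of the answer variable.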
From Scroll to Misbelief: Modeling the Unobservable Susceptibility to Misinformation on Social Media
Liu, Yanchen, Ma, Mingyu Derek, Qin, Wenna, Zhou, Azure, Chen, Jiaao, Shi, Weiyan, Wang, Wei, Yang, Diyi
Susceptibility to misinformation describes the extent to which people believe unverifiable claims; it is hidden in people's mental processes and infeasible to observe directly. Existing susceptibility studies rely heavily on self-reported beliefs, making any downstream applications of susceptibility hard to scale. To address these limitations, in this work we propose a computational model that infers users' susceptibility levels from their activities. Since a user's susceptibility is a key indicator of their reposting behavior, we use supervision from observable sharing behavior to infer the underlying susceptibility tendency. Our evaluation shows that the model yields estimates highly aligned with human judgments of comparisons between users' susceptibility levels. Building on this large-scale susceptibility labeling, we further conduct a comprehensive analysis of how different social factors relate to susceptibility. We find that political leanings and psychological factors are associated with susceptibility to varying degrees.
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Wyoming (0.04)
- North America > United States > Wisconsin (0.04)
- Media > News (1.00)
- Health & Medicine > Therapeutic Area > Immunology (1.00)
- Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.71)
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (0.34)
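The core idea in the abstract above — treating susceptibility as a latent per-user score trained against observable resharing — can be caricatured as a one-parameter-per-user logistic model. This is a deliberately oversimplified sketch under my own assumptions (the real model is richer); every name and the toy data are hypothetical:

```python
import math

def train_susceptibility(events, n_users, lr=0.1, epochs=200):
    """Fit one latent susceptibility score per user so that it predicts
    whether the user reshared a misinformation post. `events` is a list
    of (user_id, shared) pairs with shared in {0, 1}."""
    s = [0.0] * n_users          # latent susceptibility per user
    b = 0.0                      # global resharing bias
    for _ in range(epochs):
        for user, shared in events:
            p = 1 / (1 + math.exp(-(s[user] + b)))   # predicted reshare prob.
            grad = p - shared                         # logistic-loss gradient
            s[user] -= lr * grad
            b -= lr * grad / n_users
    return s

# Toy data: user 0 reshares misinformation often, user 1 rarely.
events = [(0, 1)] * 8 + [(0, 0)] * 2 + [(1, 1)] * 1 + [(1, 0)] * 9
scores = train_susceptibility(events, n_users=2)
```

The fitted scores order users by how readily they reshare, which is the sense in which observable sharing supervises the unobservable susceptibility.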
Why adversarial training can hurt robust accuracy
Clarysse, Jacob, Hörrmann, Julia, Yang, Fanny
Machine learning classifiers with high test accuracy often perform poorly under adversarial attacks. It is commonly believed that adversarial training alleviates this issue. In this paper, we demonstrate that, surprisingly, the opposite may be true: even though adversarial training helps when enough data is available, it may hurt robust generalization in the small-sample-size regime. We first prove this phenomenon for a high-dimensional linear classification setting with noiseless observations. Our proof provides explanatory insights that may also transfer to feature-learning models. Further, we observe in experiments on standard image datasets that the same behavior occurs for perceptible attacks that effectively reduce class information, such as mask attacks and object corruptions.
- North America > United States > California (0.04)
- Europe > Switzerland > Zürich > Zürich (0.04)
- Asia > Middle East > Jordan (0.04)
- Research Report > Experimental Study (0.49)
- Research Report > New Finding (0.47)
- Information Technology > Security & Privacy (0.34)
- Government > Military (0.34)
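For the linear classification setting the abstract above analyzes, adversarial training has a convenient property: the inner maximization over l-infinity perturbations is closed-form. A hedged sketch of one such training loop — the hinge-loss setup and all names are my own simplification for illustration, not the paper's construction:

```python
def adversarial_train_step(w, X, y, eps, lr):
    """One epoch of adversarial training for a linear classifier
    sign(w . x) with labels y in {-1, +1}, under l_inf perturbations
    of radius eps. For a linear model the worst-case perturbation is
    closed-form: each coordinate shifts by eps against the label,
    i.e. x_adv = x - eps * y * sign(w)."""
    sign = lambda v: (v > 0) - (v < 0)
    for xi, yi in zip(X, y):
        x_adv = [x - eps * yi * sign(wj) for x, wj in zip(xi, w)]
        margin = yi * sum(wj * xj for wj, xj in zip(w, x_adv))
        if margin < 1:  # hinge-loss subgradient step on the adversarial point
            w = [wj + lr * yi * xj for wj, xj in zip(w, x_adv)]
    return w

# Toy separable data in 2D; only the first coordinate carries the label.
X = [[1.0, 0.0], [-1.0, 0.0]]
y = [1, -1]
w = [0.0, 0.0]
for _ in range(20):
    w = adversarial_train_step(w, X, y, eps=0.1, lr=0.1)
```

The paper's point is that in the small-sample regime this kind of margin-maximizing loop on perturbed points can generalize worse robustly than standard training, even though the per-step logic is sound.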