Collaborating Authors

 Fabbri, Daniel


A Game-Theoretic Approach for Alert Prioritization

AAAI Conferences

The quantity of information that is collected and stored in computer systems continues to grow rapidly. At the same time, the sensitivity of such information (e.g., detailed medical records) often makes it valuable to both external attackers, who may obtain information by compromising a system, and malicious insiders, who may misuse information by exercising their authorization. To mitigate compromises and deter misuse, the security administrators of these resources often deploy various types of intrusion and misuse detection systems, which provide alerts of suspicious events that are worthy of follow-up review. However, in practice, these systems may generate a large number of false alerts, wasting the time of investigators. Given that security administrators have a limited budget for investigating alerts, they must prioritize certain types of alerts over others. An important challenge in alert prioritization is that adversaries may take advantage of such behavior to evade detection, specifically by mounting attacks that trigger alerts that are less likely to be investigated. In this paper, we model alert prioritization with adaptive adversaries using a Stackelberg game and introduce an approach to compute the optimal prioritization of alert types. We evaluate our approach using both synthetic data and a real-world dataset of alerts generated from the audit logs of an electronic medical record system in use at a large academic medical center.
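To make the leader-follower idea concrete, here is a minimal sketch of a Stackelberg-style prioritization. It is not the paper's model or algorithm: it assumes a zero-sum game, a single attacker who picks one alert type after observing the defender's plan, and a brute-force search over integer audit plans. All names (`alert_counts`, `audit_costs`, `gain`, `penalty`) and all numbers are hypothetical.

```python
from itertools import product


def attacker_best_utility(coverage, gain, penalty):
    """Follower's best response: pick the alert type with the highest expected utility.

    An attack that triggers type t is investigated with probability coverage[t].
    The attacker earns gain[t] if it goes uninvestigated and pays penalty[t] otherwise.
    """
    return max((1 - c) * g - c * p for c, g, p in zip(coverage, gain, penalty))


def best_prioritization(alert_counts, audit_costs, budget, gain, penalty):
    """Leader's problem: choose how many alerts of each type to investigate.

    Assumption: the game is zero-sum, so the defender minimizes the attacker's
    best-response utility. Every count in alert_counts must be positive.
    """
    best_plan, best_value = None, float("inf")
    for plan in product(*(range(n + 1) for n in alert_counts)):
        # Skip plans that exceed the investigation budget.
        if sum(k * cost for k, cost in zip(plan, audit_costs)) > budget:
            continue
        # Fraction of each alert type that gets investigated.
        coverage = [k / n for k, n in zip(plan, alert_counts)]
        value = attacker_best_utility(coverage, gain, penalty)
        if value < best_value:
            best_plan, best_value = plan, value
    return best_plan, best_value


# Hypothetical toy instance: two alert types, four alerts each, budget for four audits.
plan, value = best_prioritization(
    alert_counts=[4, 4],
    audit_costs=[1, 1],
    budget=4,
    gain=[10, 2],
    penalty=[5, 5],
)
print(plan, value)
```

In this toy instance the defender does not spend the whole budget on the high-gain alert type; leaving the low-gain type entirely uncovered would simply redirect the attacker there. Exhaustive search only works for very small instances, so a realistic setting would need an optimization method suited to its scale.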


#PrayForDad: Learning the Semantics Behind Why Social Media Users Disclose Health Information

AAAI Conferences

User-generated content in social media is increasingly acknowledged as a rich resource for research into health problems. One particular area of interest is the semantics individuals evoke, because they can influence when health-related information is disclosed. While there have been multiple investigations into why self-disclosure occurs, much less is known about when individuals choose to disclose information about other people (e.g., a relative), which is a significant privacy concern. In this paper, we introduce a novel framework to investigate how semantics influence disclosure routines for 34 health issues. This framework begins with a supervised classification model to distinguish tweets that communicate personal health issues from confounding concepts (e.g., metaphorical statements that include a health-related keyword). Next, we annotate tweets for each health issue with linguistic and psychological categories (e.g., social processes, affective processes, and personal concerns). Then, we apply a non-negative matrix factorization over a health issue-by-language category space. Finally, the factorized basis space is leveraged to group health issues into natural aggregations based on how they are discussed. We evaluate this framework with four months of tweets (over 200 million) and show that certain semantics correspond to whom a health mention pertains. Our findings show that health issues related to family members, high medical cost, and social support (e.g., Alzheimer's Disease, cancer, and Down syndrome) lead to tweets that are more likely to disclose another individual's health status, while tweets about more benign health issues (e.g., allergy, arthritis, and bronchitis) that involve biological processes (e.g., health and ingestion) and negative emotions are more likely to contain self-disclosures.
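The factorization and grouping steps can be sketched as follows. This is an illustration only, not the authors' implementation: it assumes the standard multiplicative-update rule for non-negative matrix factorization under squared error, assigns each health issue to its dominant factor, and uses a tiny matrix of invented scores. The row and column labels, the helper names, and every number are hypothetical.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def frobenius_error(v, w, h):
    """Squared reconstruction error ||V - WH||^2."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2 for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factor V (issues x categories) into non-negative W (issues x rank) and H (rank x categories)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)] for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)] for i in range(n)]
    return w, h


def group_by_dominant_factor(w):
    """Assign each row (health issue) to the factor with the largest weight."""
    return [max(range(len(row)), key=lambda k: row[k]) for row in w]


# Invented toy scores, NOT data from the paper.
# Rows: hypothetical issues; columns: hypothetical language categories.
issues = ["cancer", "alzheimers", "allergy", "bronchitis"]
categories = ["family", "social", "bio", "negemo"]
V = [
    [5.0, 4.0, 1.0, 0.5],
    [4.0, 5.0, 0.5, 1.0],
    [0.5, 1.0, 3.0, 4.0],
    [1.0, 0.5, 4.0, 3.0],
]

W, H = nmf(V, rank=2)
for issue, group in zip(issues, group_by_dominant_factor(W)):
    print(issue, "-> factor", group)
```

Assigning each issue to its largest factor weight is only one simple way to form groups; clustering the rows of `W` would be another. The result of a factorization like this depends on the random initialization and on the chosen rank.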