Montasser, Omar
VC Classes are Adversarially Robustly Learnable, but Only Improperly
Montasser, Omar, Hanneke, Steve, Srebro, Nathan
We study the question of learning an adversarially robust predictor. We show that any hypothesis class $\mathcal{H}$ with finite VC dimension is robustly PAC learnable with an improper learning rule. The requirement of being improper is necessary as we exhibit examples of hypothesis classes $\mathcal{H}$ with finite VC dimension that are not robustly PAC learnable with any proper learning rule.
Predicting Demographics of High-Resolution Geographies with Geotagged Tweets
Montasser, Omar (Pennsylvania State University) | Kifer, Daniel (Pennsylvania State University)
In this paper, we consider the problem of predicting demographics of geographic units given geotagged Tweets that are composed within these units. Traditional survey methods that offer demographics estimates are usually limited in terms of geographic resolution, geographic boundaries, and time intervals. Thus, it would be highly useful to develop computational methods that can complement traditional survey methods by offering demographics estimates at finer geographic resolutions, with flexible geographic boundaries (i.e. not confined to administrative boundaries), and at different time intervals. While prior work has focused on predicting demographics and health statistics at relatively coarse geographic resolutions such as the county-level or state-level, we introduce an approach to predict demographics at finer geographic resolutions such as the blockgroup-level. For the task of predicting gender and race/ethnicity counts at the blockgroup-level, an approach adapted from prior work to our problem achieves an average correlation of 0.389 (gender) and 0.569 (race) on a held-out test dataset. Our approach outperforms this prior approach with an average correlation of 0.671 (gender) and 0.692 (race).
Predicting Demographics of High-Resolution Geographies with Geotagged Tweets
Montasser, Omar, Kifer, Daniel
In this paper, we consider the problem of predicting demographics of geographic units given geotagged Tweets that are composed within these units. Traditional survey methods that offer demographics estimates are usually limited in terms of geographic resolution, geographic boundaries, and time intervals. Thus, it would be highly useful to develop computational methods that can complement traditional survey methods by offering demographics estimates at finer geographic resolutions, with flexible geographic boundaries (i.e. not confined to administrative boundaries), and at different time intervals. While prior work has focused on predicting demographics and health statistics at relatively coarse geographic resolutions such as the county-level or state-level, we introduce an approach to predict demographics at finer geographic resolutions such as the blockgroup-level. For the task of predicting gender and race/ethnicity counts at the blockgroup-level, an approach adapted from prior work to our problem achieves an average correlation of 0.389 (gender) and 0.569 (race) on a held-out test dataset. Our approach outperforms this prior approach with an average correlation of 0.671 (gender) and 0.692 (race).