Automatic Extraction of Opt-Out Choices from Privacy Policies
Sathyendra, Kanthashree Mysore (Carnegie Mellon University) | Schaub, Florian (University of Michigan) | Wilson, Shomir (University of Cincinnati) | Sadeh, Norman (Carnegie Mellon University)
Online “notice and choice” is an essential concept in the US FTC’s Fair Information Practice Principles. Privacy laws based on these principles require providing notice about data practices and allowing individuals to exercise control over those practices. Internet users need control over their privacy, but their options are hidden in long privacy policies that are cumbersome to read and understand. In this paper, we describe several approaches for automatically extracting choice instances from privacy policy documents using natural language processing and machine learning techniques. We define a choice instance as a statement in a privacy policy indicating that the user has discretion over the collection, use, sharing, or retention of their data. We describe supervised machine learning approaches for automatically extracting instances that contain opt-out hyperlinks, and we evaluate the proposed methods on the OPP-115 Corpus, a dataset of annotated privacy policies. Extracting information about privacy choices and controls enables the development of concise, usable interfaces that help Internet users better understand the choices offered by online services. The focus of this paper, however, is on methods for automatically extracting useful opt-out hyperlinks from privacy policies.
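To make the idea of supervised extraction of opt-out hyperlinks concrete, the following is a minimal sketch, not the authors' method. The abstract does not name the classifier or the features, so a simple perceptron over bag-of-words features from the sentence, the anchor text, and the URL is used as a stand-in. The function names, the feature prefixes, the example.com URLs, and the four toy training examples are all hypothetical and are not drawn from the OPP-115 Corpus.

```python
import re

def tokens(text):
    """Lowercase alphabetic tokens; punctuation and digits are dropped."""
    return re.findall(r"[a-z]+", text.lower())

def features(sentence, anchor, url):
    """Binary bag-of-words features, prefixed by where the token came from."""
    feats = {"s:" + t for t in tokens(sentence)}   # sentence around the link
    feats |= {"a:" + t for t in tokens(anchor)}    # hyperlink anchor text
    feats |= {"u:" + t for t in tokens(url)}       # tokens found in the URL
    return feats

# Invented toy data: ((sentence, anchor text, URL), label),
# where +1 marks an opt-out hyperlink and -1 marks any other hyperlink.
TOY_DATA = [
    (("To opt out of targeted ads, visit this page.",
      "this page", "https://example.com/opt-out"), 1),
    (("You can opt out of marketing emails here.",
      "here", "https://example.com/preferences"), 1),
    (("Read our terms of service for details.",
      "terms of service", "https://example.com/terms"), -1),
    (("Contact us with questions about this policy.",
      "Contact us", "https://example.com/contact"), -1),
]

def score(weights, feats):
    """Linear score: bias plus the weights of all active features."""
    return weights.get("bias", 0.0) + sum(weights.get(f, 0.0) for f in feats)

def train(data, max_epochs=200):
    """Perceptron training; stops early once an epoch makes no mistakes."""
    weights = {}
    for _ in range(max_epochs):
        mistakes = 0
        for args, label in data:
            feats = features(*args)
            if label * score(weights, feats) <= 0:  # misclassified example
                for f in feats:
                    weights[f] = weights.get(f, 0.0) + label
                weights["bias"] = weights.get("bias", 0.0) + label
                mistakes += 1
        if mistakes == 0:
            break
    return weights

def is_opt_out(weights, sentence, anchor, url):
    """Predict whether a hyperlink in context is an opt-out link."""
    return score(weights, features(sentence, anchor, url)) > 0

weights = train(TOY_DATA)
```

Calling `is_opt_out(weights, sentence, anchor, url)` on a new hyperlink returns a boolean prediction. With only four invented examples the learned weights are not meaningful; a realistic setup would train and evaluate on annotated policies such as those in the OPP-115 Corpus.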
Nov-19-2016