We shift this perspective with the Privatech project to focus on corporations and law firms as agents of compliance. To comply with data protection laws, data processors must implement accountability measures to assess and document compliance in relation to both privacy documents and privacy practices. In this paper, we survey current research on GDPR automation on the one hand and, on the other, the operational challenges that corporations face in complying with the GDPR and that may benefit from new forms of automation. We attempt to bridge the gap between the two. We provide a roadmap for compliance assessment and generation by identifying compliance issues, breaking them down into tasks that can be addressed through machine learning and automation, and providing notes about related developments in the Privatech project.
Website and mobile application privacy policies are intended to describe a system’s data practices. However, they are often written in non-standard formats and contain ambiguities that make them difficult for users to read and comprehend. We propose a crowdsourcing approach to extracting data practices from privacy policies, both to provide users with more concise and usable privacy notices and to support the analysis of stated data practices. To that end, we designed a hierarchical task workflow for crowdsourcing the extraction of data practices from privacy policies. We discuss our workflow design and report preliminary results.
Ramanath, Rohan (Carnegie Mellon University) | Schaub, Florian (Carnegie Mellon University) | Wilson, Shomir (Carnegie Mellon University) | Liu, Fei (Carnegie Mellon University) | Sadeh, Norman (Carnegie Mellon University) | Smith, Noah A (Carnegie Mellon University)
In today's age of big data, websites are collecting an increasingly wide variety of information about their users. The texts of websites' privacy policies, which serve as legal agreements between service providers and users, are often long and difficult to understand. Automated analysis of those texts has the potential to help users better understand the implications of agreeing to such policies. In this work, we present a scalable, fast, and cost-effective technique that combines machine learning and crowdsourcing to semi-automatically extract key aspects of website privacy policies.