There are now several large scale deployments of differential privacy used to collect statistical information about users. However, these deployments periodically recollect the data and recompute the statistics using algorithms designed for a single use. As a result, these systems do not provide meaningful privacy guarantees over long time scales. Moreover, existing techniques to mitigate this effect do not apply in the local model'' of differential privacy that these systems use. In this paper, we introduce a new technique for local differential privacy that makes it possible to maintain up-to-date statistics over time, with privacy guarantees that degrade only in the number of changes in the underlying distribution rather than the number of collection periods.
Privacy issues sit at the forefront of online activity, business actions, and government decisions. This is largely in response to the breaches, scandals, and personal data leaks that have eroded confidence in technology and information systems. The National Security Telecommunications Advisory Committee's (NSTAC) Report to the President on a Cybersecurity Moonshot says that privacy is a crucial component of cybersecurity and that we must flip the narrative to restore the trust Americans place in information systems. To achieve this, by 2028, Americans need to be "guaranteed" that technological advancements will no longer threaten privacy but will instead enhance privacy assurance through the safety and security of their personal data. One critical element in future technology advancements and online security is the increased development of artificial intelligence (AI).
Differential Privacy is a popular and well-studied notion of privacy. In the era ofbig data that we are in, privacy concerns are becoming ever more prevalent and thusdifferential privacy is being turned to as one such solution. A popular method forensuring differential privacy of a classifier is known as subsample-and-aggregate,in which the dataset is divided into distinct chunks and a model is learned on eachchunk, after which it is aggregated. This approach allows for easy analysis of themodel on the data and thus differential privacy can be easily applied. In this paper,we extend this approach by dividing the data several times (rather than just once)and learning models on each chunk within each division.
In light of the many news headlines and scandals over data privacy today, many companies are afraid to use customer data because of concerns over potential privacy violations. There is also a growing concern over being legally compliant but still making customers unhappy or uncomfortable, much like what happened with Target in 2012. Target legally used their customers' data to create targeted ads, but the personal nature of the ads still upset customers. Target wasn't doing anything wrong with data monetization, but it still negatively impacted customers. Many companies opt not to use data out of fear, but that comes at a huge loss of revenue.
In this paper we initiate the study of adaptive composition in differential privacy when the length of the composition, and the privacy parameters themselves can be chosen adaptively, as a function of the outcome of previously run analyses. This case is much more delicate than the setting covered by existing composition theorems, in which the algorithms themselves can be chosen adaptively, but the privacy parameters must be fixed up front. Indeed, it isn't even clear how to define differential privacy in the adaptive parameter setting. We proceed by defining two objects which cover the two main use cases of composition theorems. A privacy filter is a stopping time rule that allows an analyst to halt a computation before his pre-specified privacy budget is exceeded.