private data
Fast Distributed Submodular Cover: Public-Private Data Summarization
In this paper, we introduce the public-private framework of data summarization motivated by privacy concerns in personalized recommender systems and online social services. Such systems have usually access to massive data generated by a large pool of users. A major fraction of the data is public and is visible to (and can be used for) all users. However, each user can also contribute some private data that should not be shared with other users to ensure her privacy. The goal is to provide a succinct summary of massive dataset, ideally as small as possible, from which customized summaries can be built for each user, i.e. it can contain elements from the public data (for diversity) and users' private data (for personalization). To formalize the above challenge, we assume that the scoring function according to which a user evaluates the utility of her summary satisfies submodularity, a widely used notion in data summarization applications.
What the Moltbook experiment is teaching us about AI
What happens when you create a social media platform that only AI bots can post to? The answer, it turns out, is both entertaining and concerning. Moltbook is exactly that - a platform where artificial intelligence agents chat amongst themselves and humans can only watch from the sidelines. When ChatGPT gets the result, it treats it just like you had entered it yourself, and uses the result of the program to generate another response. It performs this process over and over again until the AI is satisfied that the task is complete.
- Government (1.00)
- Information Technology > Security & Privacy (0.70)
- Information Technology > Communications > Social Media (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.50)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.50)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.36)
Differentially Private Truncation of Unbounded Data via Public Second Moments
Cao, Zilong, Bi, Xuan, Zhang, Hai
Data privacy is important in the AI era, and differential privacy (DP) is one of the golden solutions. However, DP is typically applicable only if data have a bounded underlying distribution. We address this limitation by leveraging second-moment information from a small amount of public data. We propose Public-moment-guided Truncation (PMT), which transforms private data using the public second-moment matrix and applies a principled truncation whose radius depends only on non-private quantities: data dimension and sample size. This transformation yields a well-conditioned second-moment matrix, enabling its inversion with a significantly strengthened ability to resist the DP noise. Furthermore, we demonstrate the applicability of PMT by using penalized and generalized linear regressions. Specifically, we design new loss functions and algorithms, ensuring that solutions in the transformed space can be mapped back to the original domain. We have established improvements in the models' DP estimation through theoretical error bounds, robustness guarantees, and convergence results, attributing the gains to the conditioning effect of PMT. Experiments on synthetic and real datasets confirm that PMT substantially improves the accuracy and stability of DP models.
- North America > United States > Minnesota (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > Middle East > Jordan (0.04)
- North America > Canada > Ontario > Toronto (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- Europe > Italy > Veneto > Venice (0.04)
- Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
- Information Technology > Security & Privacy (1.00)
- Government (0.67)
- Asia > Taiwan (0.04)
- Europe > Netherlands > North Holland > Amsterdam (0.04)
- Asia > Middle East > Jordan (0.04)
- North America > United States > Illinois > Cook County > Chicago (0.05)
- Asia > China (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (6 more...)
- Information Technology > Security & Privacy (1.00)
- Consumer Products & Services (0.67)
- Energy (0.67)
- North America > United States > Illinois > Cook County > Chicago (0.05)
- Asia > China (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (6 more...)
- Information Technology > Security & Privacy (1.00)
- Consumer Products & Services (0.67)
- Energy (0.67)
- Information Technology > Data Science > Data Mining (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
fef6f971605336724b5e6c0c12dc2534-Supplemental.pdf
I W scalars. Taking an expectation on both sides of (17) we obtain { } The next lemma characterizes the spectral properties of the disagreement matrix, used in Lemma 4. W is also a stochastic matrix. W are that of I W, each with multiplicity K. W) = 1 with multiplicity K. Again we can check that the eigenspace of ( λ We prove this result by induction on n. For n = 1 it is trivial. Now assume that the inequality holds for all l n 1. We provide the proof here for completeness.
- North America > United States > Massachusetts (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- North America > Canada (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)