personal data

No, the Freecash App Won't Pay You to Scroll TikTok

WIRED

Freecash does pay money out to users, but not for watching videos. This misleading marketing coincides with the app's rising popularity. I first encountered the Freecash app after clicking on a sponsored TikTok video with dubious claims. The advertisement didn't promote the app by name; rather, it showed a young woman expressing her excitement about seemingly getting hired by TikTok at $35 an hour to watch videos on her "For You" page. When I tapped the link to "order now," it sent me to a website bearing TikTok and Freecash logos and featuring a download link for the Freecash app.


On the Epistemic Limits of Personalized Prediction

Neural Information Processing Systems

Machine learning models are often personalized by using group attributes that encode personal characteristics (e.g., sex, age group, HIV status). In such settings, individuals expect to receive more accurate predictions in return for disclosing group attributes to the personalized model. We study when we can tell that a personalized model upholds this principle for every group who provides personal data. We introduce a metric called the benefit of personalization (BoP) to measure the smallest gain in accuracy that any group expects to receive from a personalized model. We describe how the BoP can be used to carry out basic routines to audit a personalized model, including: (i) hypothesis tests to check that a personalized model improves performance for every group; (ii) estimation procedures to bound the minimum gain in personalization. We characterize the reliability of these routines in a finite-sample regime and present minimax bounds on both the probability of error for BoP hypothesis tests and the mean-squared error of BoP estimates. Our results show that we can only claim that personalization improves performance for each group who provides data when we explicitly limit the number of group attributes used by a personalized model. In particular, we show that it is impossible to reliably verify that a personalized classifier with $k \geq 19$ binary group attributes will benefit every group who provides personal data using a dataset of $n = 8\times10^9$ samples -- one for each person in the world.
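The benefit of personalization (BoP) described above is the smallest accuracy gain any group receives from a personalized model relative to a generic one. A minimal empirical sketch of that quantity, assuming per-sample group labels and predictions from both models (the function name and data layout here are illustrative, not the paper's code):

```python
# Hedged sketch: empirical benefit of personalization (BoP) --
# the minimum, over groups, of the accuracy gain a personalized
# model achieves relative to a generic model on that group's data.
from collections import defaultdict

def benefit_of_personalization(y_true, y_generic, y_personal, groups):
    """Return min over groups of (personalized accuracy - generic accuracy)."""
    correct_generic = defaultdict(int)   # correct predictions, generic model
    correct_personal = defaultdict(int)  # correct predictions, personalized model
    count = defaultdict(int)             # samples per group
    for yt, yg, yp, g in zip(y_true, y_generic, y_personal, groups):
        count[g] += 1
        correct_generic[g] += (yt == yg)
        correct_personal[g] += (yt == yp)
    gains = [(correct_personal[g] - correct_generic[g]) / count[g] for g in count]
    return min(gains)
```

A nonnegative BoP on held-out data is necessary, but per the paper's minimax results, not sufficient, to certify that every group benefits: with many group attributes the per-group sample sizes become too small for the estimate to be reliable.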


Parajudica: An RDF-Based Reasoner and Metamodel for Multi-Framework Context-Dependent Data Compliance Assessments

Moreau, Luc, Rossi, Alfred, Stalla-Bourdillon, Sophie

arXiv.org Artificial Intelligence

We demonstrate the utility of Parajudica and its accompanying metamodel through application to existing legal frameworks and industry standards, offering insights for comparative framework analysis. Applications include compliance policy enforcement, compliance monitoring, data discovery, and risk assessment.


Holiday travel privacy risks and how to stay safe

FOX News

Holiday travelers face increased scammer attacks using leaked personal data from airlines and hotels to send fake flight cancellations and payment requests.


Your Data Might Determine How Much You Pay for Eggs

WIRED

A newly enacted New York law requires retailers to say whether your data influences the price of basic goods like a dozen eggs or toilet paper, but not how. If you're near Rochester, New York, the price for a carton of Target's Good & Gather eggs is listed as $1.99 on its website. Prices for the same product can differ from shopper to shopper, and a new notice on Target's website offers a potential hint as to why: "This price was set by an algorithm using your personal data." A recently enacted New York State law requires businesses that algorithmically set prices using customers' personal data to disclose the practice. According to the law, personal data includes any data that can be "linked or reasonably linked, directly or indirectly, with a specific consumer or device." The law doesn't require businesses to explicitly state what information about a person or device is being used or how each piece of information affects the final price a customer sees.


A Longitudinal Measurement of Privacy Policy Evolution for Large Language Models

Tao, Zhen, Pan, Shidong, Xing, Zhenchang, Black, Emily, Gillis, Talia, Chen, Chunyang

arXiv.org Artificial Intelligence

Large language model (LLM) services have been rapidly integrated into people's daily lives as chatbots and agentic systems. They are nourished by collecting rich streams of data, raising privacy concerns around excessive collection of sensitive personal information. Privacy policies are the fundamental mechanism for informing users about data practices in the modern information-privacy paradigm. Although traditional web and mobile policies are well studied, the privacy policies of LLM providers, their LLM-specific content, and their evolution over time remain largely underexplored. In this paper, we present the first longitudinal empirical study of privacy policies for mainstream LLM providers worldwide. We curate a chronological dataset of 74 historical privacy policies and 115 supplemental privacy documents from 11 LLM providers across 5 countries up to August 2025, and extract over 3,000 sentence-level edits between consecutive policy versions. We compare LLM privacy policies to those of other software formats, propose a taxonomy tailored to LLM privacy policies, annotate policy edits, and align them with a timeline of key LLM ecosystem events. Results show that LLM privacy policies are substantially longer than those of other software formats, demand college-level reading ability, and remain highly vague. Our taxonomy analysis reveals patterns in how providers disclose LLM-specific practices and highlights regional disparities in coverage. Policy edits are concentrated in first-party data collection and international/specific-audience sections, and product releases and regulatory actions are their primary drivers, shedding light on the status quo and the evolution of LLM privacy policies.
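The core measurement step, extracting sentence-level edits between consecutive policy versions, can be sketched with a standard diff over sentence lists. This is a minimal illustration of the idea, not the authors' pipeline; the naive regex sentence splitter is an assumption (a real study would use a proper sentence tokenizer):

```python
# Minimal sketch: sentence-level edit extraction between two versions
# of a privacy policy, using a longest-matching-subsequence diff.
import difflib
import re

def split_sentences(text):
    # Naive splitter on sentence-ending punctuation followed by whitespace.
    return [s.strip() for s in re.split(r'(?<=[.!?])\s+', text) if s.strip()]

def sentence_edits(old_policy, new_policy):
    """Return (op, old_sentences, new_sentences) for each changed span."""
    old = split_sentences(old_policy)
    new = split_sentences(new_policy)
    matcher = difflib.SequenceMatcher(a=old, b=new)
    edits = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op != 'equal':   # keep only insert / delete / replace spans
            edits.append((op, old[i1:i2], new[j1:j2]))
    return edits
```

Each extracted edit could then be annotated against a taxonomy (e.g., first-party collection, audience-specific sections) and aligned with dated ecosystem events, as the paper describes.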


Protect your data before holiday shopping scams strike

FOX News

Holiday shopping scams spike during Black Friday and Cyber Monday as data brokers sell personal information to cybercriminals who target shoppers with realistic fake emails and texts.


On the Epistemic Limits of Personalized Prediction

Neural Information Processing Systems

We characterize the reliability of these routines in a finite-sample regime and present minimax bounds on both the probability of error for BoP hypothesis tests and the mean-squared error of BoP estimates.


Policy-as-Prompt: Turning AI Governance Rules into Guardrails for AI Agents

Kholkar, Gauri, Ahuja, Ratinder

arXiv.org Artificial Intelligence

As autonomous AI agents are used in regulated and safety-critical settings, organizations need effective ways to turn policy into enforceable controls. We introduce a regulatory machine learning framework that converts unstructured design artifacts (like PRDs, TDDs, and code) into verifiable runtime guardrails. Our Policy as Prompt method reads these documents and risk controls to build a source-linked policy tree. This tree is then compiled into lightweight, prompt-based classifiers for real-time runtime monitoring. The system is built to enforce least privilege and data minimization. For conformity assessment, it provides complete provenance, traceability, and audit logging, all integrated with a human-in-the-loop review process. Evaluations show our system reduces prompt-injection risk, blocks out-of-scope requests, and limits toxic outputs. It also generates auditable rationales aligned with AI governance frameworks. By treating policies as executable prompts (a policy-as-code for agents), this approach enables secure-by-design deployment, continuous compliance, and scalable AI safety and AI security assurance for regulatable ML.
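The "policy as prompt" idea, a source-linked policy tree whose leaves compile into lightweight prompt-based classifiers, can be illustrated with a small sketch. The node fields, provenance format, and prompt template below are assumptions for illustration, not the authors' implementation:

```python
# Illustrative sketch: policy clauses extracted from design documents
# are arranged in a source-linked tree; each leaf compiles into a
# prompt template for a lightweight runtime guardrail classifier.
from dataclasses import dataclass, field

@dataclass
class PolicyNode:
    rule: str                 # the policy statement to enforce
    source: str               # provenance: document/section it came from
    children: list = field(default_factory=list)

def compile_to_prompts(node, path=()):
    """Flatten the tree into (provenance trail, classifier prompt) pairs."""
    trail = path + (node.source,)
    prompts = []
    if not node.children:     # leaves become runtime classifiers
        prompts.append((
            " > ".join(trail),
            f"Does the following agent action violate this rule? "
            f"Rule: {node.rule}\nAnswer YES or NO.\nAction: {{action}}",
        ))
    for child in node.children:
        prompts.extend(compile_to_prompts(child, trail))
    return prompts
```

Keeping the provenance trail alongside each compiled prompt is what enables the traceability and audit logging the abstract mentions: every runtime block can be traced back to the clause and source document that triggered it.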


Stop foreign-owned apps from harvesting your personal data

FOX News

Foreign-owned apps secretly collect personal data from users and sell it to overseas data brokers, with retirees being particularly vulnerable to targeted scams.