Law
Let's Measure the Elephant in the Room: Facilitating Personalized Automated Analysis of Privacy Policies at Scale
Zhao, Rui, Melnychuk, Vladyslav, Zhao, Jun, Wright, Jesse, Shadbolt, Nigel
In modern times, people have numerous online accounts, but they rarely read the Terms of Service or Privacy Policy of those sites despite claiming otherwise. This paper introduces PoliAnalyzer, a neuro-symbolic system that assists users with personalized privacy policy analysis. PoliAnalyzer uses Natural Language Processing (NLP) to extract formal representations of data usage practices from policy texts. In favor of determinism, logical inference is applied to compare user preferences with the formal privacy policy representation and produce a compliance report. To achieve this, we extend an existing formal Data Terms of Use policy language to model privacy policies as app policies and user preferences as data policies. In our evaluation using our enriched PolicyIE dataset curated by legal experts, PoliAnalyzer demonstrated high accuracy in identifying relevant data usage practices, achieving F1-scores of 90-100% across most tasks. Additionally, we demonstrate how PoliAnalyzer can model diverse user data-sharing preferences, derived from prior research as 23 user profiles, and perform compliance analysis against the top 100 most-visited websites. This analysis revealed that, on average, 95.2% of a privacy policy's segments do not conflict with the analyzed user preferences, enabling users to concentrate on understanding the 4.8% (636 / 13205) that violate preferences, significantly reducing cognitive burden. Further, we identified common practices in privacy policies that violate user expectations, such as the sharing of location data with third parties. This paper demonstrates that PoliAnalyzer can support automated personalized privacy policy analysis at scale using off-the-shelf NLP tools. This sheds light on a pathway to help individuals regain control over their data and encourage societal discussions on platform data practices to promote a fairer power dynamic.
PRM-Free Security Alignment of Large Models via Red Teaming and Adversarial Training
Large Language Models (LLMs) have demonstrated remarkable capabilities across diverse applications, yet they pose significant security risks that threaten their safe deployment in critical domains. Current security alignment methodologies predominantly rely on Process Reward Models (PRMs) to evaluate intermediate reasoning steps, introducing substantial computational overhead and scalability constraints. This paper presents a novel PRM-free security alignment framework that leverages automated red teaming and adversarial training to achieve robust security guarantees while maintaining computational efficiency. Our approach systematically identifies vulnerabilities through sophisticated attack strategies including genetic algorithm optimization, multi-agent simulation, and advanced prompt mutation techniques. The framework enhances model robustness via targeted adversarial training with curriculum learning and adaptive regularization mechanisms. Comprehensive experimental evaluation across five state-of-the-art LLMs demonstrates that our method achieves superior security alignment performance compared to PRM-based approaches while reducing computational costs by 61%. The framework incorporates transparent reporting and continuous audit mechanisms that enable iterative security improvement and regulatory compliance. Our contributions advance the field of efficient LLM security alignment by democratizing access to robust security measures for resource-constrained organizations and providing a scalable foundation for addressing evolving adversarial threats.
From Cell Towers to Satellites: A 2040 Blueprint for Urban-Grade Direct-to-Device Mobile Networks
In 2023, satellite and mobile networks crossed a historic threshold: standard smartphones, using unmodified 3GPP protocols, connected directly to low Earth orbit (LEO) satellites. This first wave of direct-to-device (D2D) demonstrations validated the physical feasibility of satellite-based mobile access. However, these systems remain fallback-grade: rural-only, bandwidth-limited, and fully dependent on Earth-based mobile cores for identity, session, and policy control. This paper asks a more ambitious question: Can a complete mobile network, including radio access, core functions, traffic routing, and content delivery, operate entirely from orbit? And can it deliver sustained, urban-grade service in the world's densest cities? We present the first end-to-end system architecture for a fully orbital telco, integrating electronically steered phased arrays with 1000-beam capacity, space-based deployment of 5G core functions (UPF, AMF), and inter-satellite laser mesh backhaul. We analyze spectral efficiency, beam capacity, and link budgets under dense urban conditions, accounting for path loss, Doppler, and multipath. Simulations show that rooftop and line-of-sight users can sustain 64-QAM throughput, while street-level access is feasible with relay or assisted beam modes. The paper outlines the remaining constraints (power, thermal dissipation, compute radiation hardening, and regulatory models) and demonstrates that these are engineering bottlenecks, not physical limits. Finally, we propose a staged 15-year roadmap from today's fallback D2D systems to autonomous orbital overlays delivering 50-100 Mbps to handhelds in megacities, with zero reliance on terrestrial infrastructure.
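To give a rough sense of the path-loss and Doppler terms that enter such a link budget, here is a minimal Python sketch. It uses the standard free-space path loss formula and a simple upper bound on Doppler shift. The altitude (550 km), carrier frequency (2 GHz), and orbital speed (7.6 km/s) are illustrative assumptions, not values taken from the paper, and the sketch ignores multipath, antenna gains, and atmospheric losses.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def fspl_db(distance_km, freq_mhz):
    """Free-space path loss in dB, for distance in km and frequency in MHz."""
    return 20 * math.log10(distance_km) + 20 * math.log10(freq_mhz) + 32.44


def max_doppler_hz(speed_m_s, freq_hz):
    """Upper bound on Doppler shift: assumes the full speed is along the line of sight."""
    return speed_m_s / C * freq_hz


# Hypothetical inputs, chosen only for illustration.
ALTITUDE_KM = 550.0      # assumed LEO altitude, satellite directly overhead
CARRIER_MHZ = 2000.0     # assumed carrier frequency
ORBITAL_SPEED = 7600.0   # assumed orbital speed in m/s

if __name__ == "__main__":
    print(f"FSPL: {fspl_db(ALTITUDE_KM, CARRIER_MHZ):.1f} dB")
    print(f"Max Doppler: {max_doppler_hz(ORBITAL_SPEED, CARRIER_MHZ * 1e6) / 1e3:.1f} kHz")
```

Doubling the slant range adds about 6 dB of path loss, which is one reason low-elevation links are harder to close than overhead ones.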
Domain-Adaptive Small Language Models for Structured Tax Code Prediction
Nath, Souvik, Wadhwa, Sumit, Perez, Luis
Every day, multinational firms process thousands of transactions, each of which must adhere to tax regulations that vary by jurisdiction and are often nuanced. The determination of product and service tax codes, such as HSN or SAC, is a major use case in tax compliance. An accurate determination of such codes is imperative to avoid tax penalties. This paper proposes a domain-adaptive small language model (SLM) with an encoder-decoder architecture for the enhanced prediction of product and service tax codes. In this approach, we address the problem of predicting hierarchical tax code sequences using unstructured product and services data. We employ an SLM based on an encoder-decoder architecture, as this enables sequential generation of tax codes to capture the hierarchical dependencies present within the tax codes. Our experiments demonstrate that encoder-decoder SLMs can be successfully applied to the sequential prediction of structured tax codes, a domain that remains comparatively unexplored in current NLP research. In this paper, we demonstrate the superior performance of the domain-adaptive encoder-decoder SLMs over flat classifiers when applied to the Harmonized System of Nomenclature (HSN), and achieve superior results compared to decoder-only and encoder-only architectures for structured sequence generation tasks. This approach can also be scaled to other government-mandated tax commodity codes, such as United Nations Standard Products and Services Codes (UNSPSC), or Brazil's Nomenclatura Comum do Mercosul (NCM).
"Before, I Asked My Mom, Now I Ask ChatGPT": Visual Privacy Management with Generative AI for Blind and Low-Vision People
Sharma, Tanusree, Tseng, Yu-Yun, Zhang, Lotus, Ide, Ayae, Mack, Kelly Avery, Findlater, Leah, Gurari, Danna, Wang, Yang
Blind and low vision (BLV) individuals use Generative AI (GenAI) tools to interpret and manage visual content in their daily lives. While such tools can enhance the accessibility of visual content and so enable greater user independence, they also introduce complex challenges around visual privacy. In this paper, we investigate the current practices and future design preferences of blind and low vision individuals through an interview study with 21 participants. Our findings reveal a range of current practices with GenAI that balance privacy, efficiency, and emotional agency, with users accounting for privacy risks across six key scenarios, such as self-presentation, indoor/outdoor spatial privacy, social sharing, and handling professional content. They also reveal design preferences, including on-device processing, zero-retention guarantees, sensitive content redaction, privacy-aware appearance indicators, and multimodal tactile mirrored interaction methods. We conclude with actionable design recommendations to support user-centered visual privacy through GenAI, expanding the notion of privacy and responsible handling of others' data.
From Mind to Machine: The Rise of Manus AI as a Fully Autonomous Digital Agent
Shen, Minjie, Li, Yanshu, Chen, Lulu, Yang, Qikai
Manus AI is a general-purpose AI agent introduced in early 2025, marking a significant advancement in autonomous artificial intelligence. Developed by the Chinese startup Monica.im, Manus is designed to bridge the gap between "mind" and "hand" - combining the reasoning and planning capabilities of large language models with the ability to execute complex, end-to-end tasks that produce tangible outcomes. This paper presents a comprehensive overview of Manus AI, exploring its core technical architecture, diverse applications across sectors such as healthcare, finance, manufacturing, robotics, and gaming, as well as its key strengths, current limitations, and future potential. Positioned as a preview of what lies ahead, Manus AI represents a shift toward intelligent agents that can translate high-level intentions into real-world actions, heralding a new era of human-AI collaboration.
Elon Musk unveils bizarre new kids project after humiliating anti-Semitism disaster
Just a few weeks after Elon Musk's chatbot praised Hitler and denied the Holocaust, he's now looking to turn it into a playmate for kids. Musk is calling the version Baby Grok, and added it would offer 'kid-friendly content' through a new app developed by his company xAI. He made the announcement Saturday night on X, where the post quickly drew over 28 million views within 24 hours. The move left many stunned, coming just two weeks after Grok 4, the latest version of Elon Musk's AI chatbot, sparked backlash for repeating far-right hate speech and white nationalist talking points when asked about politics, race, and recent news events. Multiple users reported on July 8 and July 9 that Grok echoed anti-Semitic conspiracy theories, including claims that Jewish people control Hollywood, promote hatred toward white people, and should be imprisoned in camps, though it is still unclear how many of these posts were confirmed before xAI took them down.
England players racially abused during Argentina game
England's players were racially abused during their second Test victory over Argentina in San Juan on 12 July. Team officials lodged a complaint to governing body World Rugby over the incident that occurred when the visitors' replacements were warming up in the first half. "While it is clear that an incident took place, we regret that the individuals responsible could not be identified," said World Rugby, adding their investigation included witness statements and video analysis. "Intense efforts were made to identify the small group of five or seven individuals responsible within a crowd of over 20,000 spectators," said Gabriel Travaglini, president of the Union Argentina de Rugby (UAR). "Unfortunately, despite an exhaustive search, it was not possible to identify the perpetrators. "We strongly condemn all acts of racism and stand in solidarity with the England rugby players who felt aggrieved." He added that the UAR would work with World Rugby to educate fans. There have been several recent high-profile cases of discriminatory behaviour in Argentine sport. In 2020, Pablo Matera and Guido Petti, both of whom played in the match in San Juan, were suspended from the team after racist remarks they had made on social media several years earlier were unearthed. In 2024, Chelsea footballer Enzo Fernandez apologised to team-mates after being filmed joining in with a chant that questioned the heritage of France's black and mixed race players. "Rugby completely condemns discriminatory behaviour of any kind," said World Rugby chairman Brett Robinson. "We offer our full support to the players involved and want them to know that rugby stands with them in opposing racism.
Step-DAD: Semi-Amortized Policy-Based Bayesian Experimental Design
Hedman, Marcel, Ivanova, Desi R., Guan, Cong, Rainforth, Tom
We develop a semi-amortized, policy-based approach to Bayesian experimental design (BED) called Stepwise Deep Adaptive Design (Step-DAD). Like existing, fully amortized, policy-based BED approaches, Step-DAD trains a design policy upfront before the experiment. However, rather than keeping this policy fixed, Step-DAD periodically updates it as data is gathered, refining it to the particular experimental instance. This test-time adaptation improves both the flexibility and the robustness of the design strategy compared with existing approaches. Empirically, Step-DAD consistently demonstrates superior decision-making and robustness compared with current state-of-the-art BED methods.
Honesty in Causal Forests: When It Helps and When It Hurts
Hou, Yanfang, Fernández-Loría, Carlos
Causal forests have become a popular tool for estimating how treatment effects vary across individuals (Wager and Athey, 2018). They are used in a growing number of domains--including marketing, operations, economics, and public policy--to personalize interventions and inform targeting strategies. Since 2019, dozens of papers in INFORMS journals alone have applied causal forests to experimental or observational data (see Appendix C), often with the goal of estimating individual-level treatment effects. The method builds on a familiar idea: instead of estimating a single average effect for the whole population, we split the population into subgroups based on observed features and estimate effects within each group. This is conceptually similar to how random forests estimate outcomes, except now the goal is to estimate causal effects. But there is a crucial modeling difference: unlike random forests, which typically use the full training data for both splitting and estimation, causal forests often divide the training data in two--using one part to decide how to form the subgroups, and the other to estimate effects within them. This practice, known as honest estimation, is meant to prevent overfitting and selection bias (Athey and Imbens, 2016). It is the default in widely used software packages such as grf (Athey et al., 2019) and EconML (Battocchi et al., 2019), and is commonly recommended in applied research. But is this default always a good idea?
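The split-then-estimate idea behind honest estimation can be illustrated with a deliberately small Python sketch. It is not the grf or EconML algorithm: it builds a single two-way split on one feature (a "stump") rather than a forest, and the function names, the candidate thresholds, and the toy data generator are all hypothetical. One half of the data chooses the threshold; the other half estimates the treatment effect inside each resulting subgroup.

```python
import random
from statistics import mean


def effect(rows):
    """Difference in mean outcome between treated and control rows (x, t, y)."""
    treated = [y for _, t, y in rows if t == 1]
    control = [y for _, t, y in rows if t == 0]
    if not treated or not control:
        return None  # effect cannot be estimated in this subgroup
    return mean(treated) - mean(control)


def best_threshold(rows, candidates):
    """Pick the threshold on x whose two subgroups differ most in estimated effect."""
    best, best_gap = None, -1.0
    for c in candidates:
        left = effect([r for r in rows if r[0] <= c])
        right = effect([r for r in rows if r[0] > c])
        if left is None or right is None:
            continue
        gap = abs(left - right)
        if gap > best_gap:
            best, best_gap = c, gap
    if best is None:
        raise ValueError("no candidate threshold yields two estimable subgroups")
    return best


def honest_stump(rows, candidates, seed=0):
    """Choose the split on one half of the data, estimate effects on the other half."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    split_half, est_half = shuffled[:half], shuffled[half:]
    threshold = best_threshold(split_half, candidates)
    left = effect([r for r in est_half if r[0] <= threshold])
    right = effect([r for r in est_half if r[0] > threshold])
    return threshold, left, right


def demo_rows(n=200):
    """Toy noiseless data: treatment raises y by 2 only when x > 0.5."""
    rows = []
    for i in range(n):
        x, t = i / n, i % 2
        y = 2.0 * t if x > 0.5 else 0.0
        rows.append((x, t, y))
    return rows


if __name__ == "__main__":
    print(honest_stump(demo_rows(), [0.25, 0.5, 0.75]))
```

A non-honest variant would call `best_threshold` and `effect` on the same rows. With noisy outcomes, that reuse is what lets the chosen split flatter the estimated effects, at the cost that honest estimation only uses half of the data for each step.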