 Law


Long document summarization using page specific target text alignment and distilling page importance

arXiv.org Artificial Intelligence

The rapid growth of textual data across news, legal, medical, and scientific domains makes it challenging to access and understand large volumes of content efficiently. It is increasingly difficult for users to consume and extract meaningful information, which raises the need for summarization. Unlike short document summarization, long document abstractive summarization is resource-intensive, and very little literature exists in this direction. BART is a widely used, efficient sequence-to-sequence (seq-to-seq) model. However, when it comes to summarizing long documents, the length of its context window limits its capabilities. We propose a model called PTS (Page-specific Target-text alignment Summarization) that extends the seq-to-seq method for abstractive summarization by dividing the source document into several pages. PTS aligns each page with the relevant part of the target summary for better supervision. Partial summaries are generated for each page of the document. We also propose a model called PTSPI (Page-specific Target-text alignment Summarization with Page Importance), an extension of PTS in which an additional layer is placed before the partial summaries are merged into the final summary. This layer provides dynamic page weightage and explicit supervision to focus on the most informative pages. We performed experiments on the benchmark dataset and found that PTSPI outperformed the SOTA by 6.32% in ROUGE-1 and 8.08% in ROUGE-2 scores.
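As a rough illustration of the page-splitting and page-to-target alignment idea described in this abstract, here is a minimal sketch. The abstract does not say how PTS performs the alignment, so the word-overlap heuristic, the function names `split_into_pages` and `align_summary_to_pages`, and the toy data are all assumptions made for illustration, not the authors' method.

```python
import re

def split_into_pages(tokens, page_size):
    """Split a token list into consecutive pages of at most page_size tokens."""
    return [tokens[i:i + page_size] for i in range(0, len(tokens), page_size)]

def align_summary_to_pages(pages, summary_sentences):
    """Assign each summary sentence to the page sharing the most distinct words.

    Assumption: unigram overlap stands in for whatever alignment PTS uses.
    """
    page_vocab = [set(word.lower() for word in page) for page in pages]
    targets = [[] for _ in pages]
    for sentence in summary_sentences:
        words = set(re.findall(r"\w+", sentence.lower()))
        overlaps = [len(words & vocab) for vocab in page_vocab]
        best = overlaps.index(max(overlaps))  # ties go to the earliest page
        targets[best].append(sentence)
    return targets

# Toy document and reference summary (hypothetical data).
document = ("the court ruled on the merger case today . "
            "the company plans to appeal the ruling next month .")
pages = split_into_pages(document.split(), page_size=10)
summary = ["The court ruled on the merger.", "The company will appeal."]
page_targets = align_summary_to_pages(pages, summary)
```

Each entry of `page_targets` would then serve as the supervision target for the partial summary of the corresponding page.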


Google experiences deja vu as second monopoly trial begins in US

The Guardian

After deflecting the US Department of Justice's attack on its illegal monopoly in online search, Google is facing another attempt to dismantle its internet empire in a trial focused on abusive tactics in digital advertising. The trial that opened Monday in an Alexandria, Virginia, federal court revolves around the harmful conduct that resulted in US district Judge Leonie Brinkema declaring parts of Google's digital advertising technology to be an illegal monopoly in April. The judge found that Google has been engaging in behavior that stifles competition to the detriment of online publishers that depend on the system for revenue. Google and the justice department will spend the next two weeks in court presenting evidence in a "remedy" trial that will culminate in Brinkema issuing a ruling on how to restore fair market conditions. If the justice department gets its way, Brinkema will order Google to sell parts of its ad technology - a proposal that the company's lawyers warned would "invite disruption and damage" to consumers and the internet's ecosystem.


As Good as a Coin Toss: Human Detection of AI-Generated Content

Communications of the ACM

With only a 50-50 chance of detecting synthetic media online, users are more vulnerable than ever to being duped. Advances in generative AI technology have made it easier than ever for anyone to manufacture increasingly realistic synthetic media (colloquially known as deepfakes) at faster speeds, larger scales, and with more customization than ever. This in turn has led to synthetic media increasingly being used for harmful purposes, including disinformation campaigns, nonconsensual pornography, financial fraud, child sexual abuse and exploitation, and espionage. As of today, the principal defense to combat deceptive synthetic media depends in large part on the human observer's perceptual detection capabilities--their ability to visually or auditorily identify AI-generated content when they encounter it. Yet the growing realism of synthetic media impedes this ability, heightening people's vulnerability to weaponized synthetic content. Moreover, people overestimate how capable they are at identifying synthetic media, further exacerbating the problem. As synthetic media continues to advance in sophistication, so too does the threat posed by its growing weaponization, from financial fraud to the production of nonconsensual intimate materials of adults and children.


Instagram tightens its teen policy: Meta-owned app begins using AI to find accounts belonging to under-18s - even if they list an adult birthday

Daily Mail - Science & tech



Predator drones shift from border patrol to protest surveillance

Los Angeles Times

An unmanned Predator drone flies over Kandahar Air Field in southern Afghanistan in 2010. MQ-9 Predator drones were deployed over Los Angeles to monitor anti-ICE protests in June.


Fairness-in-the-Workflow: How Machine Learning Practitioners at Big Tech Companies Approach Fairness in Recommender Systems

arXiv.org Artificial Intelligence

Recommender systems (RS), which are widely deployed across high-stakes domains, are susceptible to biases that can cause large-scale societal impacts. Researchers have proposed methods to measure and mitigate such biases -- but translating academic theory into practice is inherently challenging. RS practitioners must balance the competing interests of diverse stakeholders, including providers and users, and operate in dynamic environments. Through a semi-structured interview study (N=11), we map the RS practitioner workflow within large technology companies, focusing on how technical teams consider fairness internally and in collaboration with other (legal, data, and fairness) teams. We identify key challenges to incorporating fairness into existing RS workflows: defining fairness in RS contexts, particularly when navigating multi-stakeholder and dynamic fairness considerations. We also identify key organization-wide challenges: making time for fairness work and facilitating cross-team communication. Finally, we offer actionable recommendations for the RS community, including HCI researchers and practitioners.


Algorithmic Fairness: Not a Purely Technical but Socio-Technical Property

arXiv.org Artificial Intelligence

The rapid trend of deploying artificial intelligence (AI) and machine learning (ML) systems in socially consequential domains has raised growing concerns about their trustworthiness, including potential discriminatory behaviours. Research in algorithmic fairness has generated a proliferation of mathematical definitions and metrics, yet persistent misconceptions and limitations, both within and beyond the fairness community, limit their effectiveness. These include a lack of consensus on what fairness means, prevailing measures tailored primarily to binary group settings, and superficial handling of intersectional contexts. Here we critically examine these misconceptions and argue that fairness cannot be reduced to purely technical constraints on models. We also examine the limitations of existing fairness measures through conceptual analysis and empirical illustrations, showing their limited applicability in complex real-world scenarios; challenge prevailing views on the incompatibility between accuracy and fairness, as well as among fairness measures themselves; and outline three principles worth considering in the design of fairness measures. We believe these findings will help bridge the gap between technical formalisation and social realities and meet the challenges of real-world AI/ML deployment.


CIDER: A Causal Cure for Brand-Obsessed Text-to-Image Models

arXiv.org Artificial Intelligence

Text-to-image (T2I) models exhibit a significant yet under-explored "brand bias", a tendency to generate content featuring dominant commercial brands from generic prompts, posing ethical and legal risks. We propose CIDER, a novel, model-agnostic framework that mitigates this bias at inference time through prompt refinement, avoiding costly retraining. CIDER uses a lightweight detector to identify branded content and a Vision-Language Model (VLM) to generate stylistically divergent alternatives. We introduce the Brand Neutrality Score (BNS) to quantify this issue and perform extensive experiments on leading T2I models. Results show CIDER significantly reduces both explicit and implicit biases while maintaining image quality and aesthetic appeal. Our work offers a practical solution for more original and equitable content, contributing to the development of trustworthy generative AI.


Stress Testing Deliberative Alignment for Anti-Scheming Training

arXiv.org Artificial Intelligence

Highly capable AI systems could secretly pursue misaligned goals -- what we call "scheming". Because a scheming AI would deliberately try to hide its misaligned goals and actions, measuring and mitigating scheming requires different strategies than are typically used in ML. We propose that assessing anti-scheming interventions requires at least (1) testing propensity to scheme on far out-of-distribution (OOD) tasks, (2) evaluating whether lack of scheming is driven by situational awareness, and (3) checking for robustness to pre-existing misaligned goals. We use a broad category of "covert actions" -- such as secretly breaking rules or intentionally underperforming in tests -- as a proxy for scheming, and design evaluations for covert actions. We then stress-test deliberative alignment as a case study for anti-scheming. Across 26 OOD evaluations (180+ environments), deliberative alignment reduces covert action rates (OpenAI o3: 13% → 0.4%) but does not fully eliminate them. Our mitigation is also able to largely stop agents from pursuing a hidden goal previously trained into the model, but we still find misbehavior after additional red-teaming. We find that models' chain-of-thought (CoT) often demonstrates awareness of being evaluated for alignment, and show causal evidence that this awareness decreases covert behavior, while unawareness increases it. Therefore, we cannot exclude that the observed reductions in covert action rates are at least partially driven by situational awareness. While we rely on human-legible CoT for training, studying situational awareness, and demonstrating clear evidence of misalignment, our ability to rely on this degrades as models continue to depart from reasoning in standard English. We encourage research into alignment mitigations for scheming and their assessment, especially for the adversarial case of deceptive alignment, which this paper does not address.


Pre-Forgettable Models: Prompt Learning as a Native Mechanism for Unlearning

arXiv.org Artificial Intelligence

Foundation models have transformed multimedia analysis by enabling robust and transferable representations across diverse modalities and tasks. However, their static deployment conflicts with growing societal and regulatory demands -- particularly the need to unlearn specific data upon request, as mandated by privacy frameworks such as the GDPR. Traditional unlearning approaches, including retraining, activation editing, or distillation, are often computationally expensive, fragile, and ill-suited for real-time or continuously evolving systems. In this paper, we propose a paradigm shift: rethinking unlearning not as a retroactive intervention but as a built-in capability. We introduce a prompt-based learning framework that unifies knowledge acquisition and removal within a single training phase. Rather than encoding information in model weights, our approach binds class-level semantics to dedicated prompt tokens. This design enables instant unlearning simply by removing the corresponding prompt -- without retraining, model modification, or access to original data. Experiments demonstrate that our framework preserves predictive performance on retained classes while effectively erasing forgotten ones. Beyond utility, our method exhibits strong privacy and security guarantees: it is resistant to membership inference attacks, and prompt removal prevents any residual knowledge extraction, even under adversarial conditions. This ensures compliance with data protection principles and safeguards against unauthorized access to forgotten information, making the framework suitable for deployment in sensitive and regulated environments. Overall, by embedding removability into the architecture itself, this work establishes a new foundation for designing modular, scalable and ethically responsive AI models.
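The core idea in this abstract, binding class-level knowledge to removable prompts rather than to model weights, can be illustrated with a toy sketch. Everything here is an assumption for illustration: the `PromptClassifier` class, the hand-picked prompt vectors, and the cosine-similarity matching are hypothetical stand-ins, not the paper's architecture, which learns prompt tokens for a foundation model.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

class PromptClassifier:
    """Toy model: each class is represented only by its own prompt vector."""

    def __init__(self):
        self.prompts = {}  # class name -> prompt vector (hypothetical storage)

    def learn(self, name, vector):
        # Knowledge about a class lives entirely in its prompt entry.
        self.prompts[name] = list(vector)

    def forget(self, name):
        # Unlearning is deleting the prompt; nothing else is modified.
        self.prompts.pop(name, None)

    def predict(self, features):
        # Return the remaining class whose prompt best matches the features.
        if not self.prompts:
            return None
        return max(self.prompts, key=lambda n: cosine(self.prompts[n], features))

model = PromptClassifier()
model.learn("cat", [1.0, 0.0])
model.learn("dog", [0.0, 1.0])
before = model.predict([0.9, 0.1])
model.forget("cat")
after = model.predict([0.9, 0.1])
```

In this sketch, `forget` needs neither retraining nor access to the original data, which mirrors the property the abstract claims; it says nothing about the privacy guarantees the paper reports.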