Law
Model Recycling Framework for Multi-Source Data-Free Supervised Transfer Learning
This situation can give rise to privacy concerns, as organizations may not want to share sensitive information; for instance, healthcare providers may be reluctant to share patient information, and security system maintainers may not want to risk sharing facial recognition data for system performance updates. Additionally, the source data may be hard to obtain because of technical difficulties or intellectual property restrictions (Li et al., 2020b; Chen et al., 2021; Liang et al., 2020; Ahmed et al., 2021b). Recent advances in source-free unsupervised domain adaptation (SFUDA) offer solutions for the scenario where source data is not accessible (Fang et al., 2022). Specifically, SFUDA uses pre-trained source models to improve the generalization of a model on an unlabeled target dataset. Our work is similar to other SFUDA approaches (Li et al., 2020b; Chen et al., 2021; Liang et al., 2020; Ahmed et al., 2021b) in that it addresses the practical scenario where source data is not available during training. However, the majority of SFUDA studies overlook a crucial point: when source data is assumed to be inaccessible, there is no guarantee that the available source models were trained on domains related to the target task. And yet, most works have experimented only on classic domain adaptation benchmarks, which are somewhat related by design, e.g., Digits-Five (Peng et al., 2019), Office-31 (Saenko et al., 2010), and Office-Home (Venkateswara et al., 2017), i.e., domains that share the same labels but are dissimilar in feature (and ambient) space. Our approach is unique in that we consider such a source-free supervised transfer learning (SFSTL) setting (Lee et al., 2019), where we do not assume source models are trained on tasks with similar feature spaces or
Actionable Counterfactual Explanations Using Bayesian Networks and Path Planning with Applications to Environmental Quality Improvement
Valero-Leal, Enrique, Larrañaga, Pedro, Bielza, Concha
Counterfactual explanations study what should have changed in order to get an alternative result, enabling end-users to understand machine learning mechanisms with counterexamples. Actionability is defined as the ability to transform the original case to be explained into a counterfactual one. We develop a method for actionable counterfactual explanations that, unlike predecessors, does not directly leverage training data. Rather, data is only used to learn a density estimator, creating a search landscape in which to apply path planning algorithms to solve the problem and masking the endogenous data, which can be sensitive or private. We put special focus on estimating the data density using Bayesian networks, demonstrating how their enhanced interpretability is useful in high-stakes scenarios in which fairness is raising concern. Using a synthetic benchmark comprised of 15 datasets, our proposal finds more actionable and simpler counterfactuals than the current state-of-the-art algorithms. We also test our algorithm with a real-world Environmental Protection Agency dataset, facilitating a more efficient and equitable study of policies to improve the quality of life in United States of America counties. Our proposal captures the interaction of variables, ensuring equity in decisions, as policies to improve certain domains of study (air, water quality, etc.) can be detrimental in others. In particular, the sociodemographic domain is often involved, where we find important variables related to the ongoing housing crisis that can potentially have a severe negative impact on communities.
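The idea of searching a density landscape with a path planner can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a hypothetical 2D grid of density values in (0, 1] (standing in for a learned density estimator), and it uses plain Dijkstra search with a cost of -log(density) for entering a cell, so that paths through likely regions are cheaper. The function name `densest_path` and the grid are illustrative only.

```python
import heapq
import math


def densest_path(density, start, goal):
    """Dijkstra on a 2D grid; entering a cell costs -log(density of that cell).

    Assumes every density value lies in (0, 1], so all step costs are >= 0.
    Returns the list of cells from start to goal, or None if unreachable.
    """
    rows, cols = len(density), len(density[0])
    dist = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    while heap:
        d, cell = heapq.heappop(heap)
        if cell == goal:
            break
        if d > dist.get(cell, math.inf):
            continue  # stale heap entry
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and density[nr][nc] > 0:
                nd = d - math.log(density[nr][nc])
                if nd < dist.get((nr, nc), math.inf):
                    dist[(nr, nc)] = nd
                    prev[(nr, nc)] = cell
                    heapq.heappush(heap, (nd, (nr, nc)))
    if goal not in dist:
        return None
    # Walk the predecessor links back from the goal to the start.
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return path[::-1]


# Hypothetical landscape: a low-density "wall" separates start and goal.
grid = [
    [0.9, 0.9, 0.9],
    [0.9, 0.01, 0.9],
    [0.9, 0.01, 0.9],
]
route = densest_path(grid, (2, 0), (2, 2))
```

In this toy grid the planner avoids the two low-density cells and detours through the top row, which mirrors the intuition that a counterfactual should be reached through plausible intermediate states rather than by the shortest geometric route.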
Stakeholder Perspectives on Humanistic Implementation of Computer Perception in Healthcare: A Qualitative Study
Kostick-Quenet, Kristin M., Hurley, Meghan E., Ayaz, Syed, Herrington, John, Zampella, Casey, Parish-Morris, Julia, Tunç, Birkan, Lázaro-Muñoz, Gabriel, Blumenthal-Barby, J. S., Storch, Eric A.
Computer perception (CP) technologies (digital phenotyping, affective computing and related passive sensing approaches) offer unprecedented opportunities to personalize healthcare, but provoke concerns about privacy, bias and the erosion of empathic, relationship-centered practice. A comprehensive understanding of perceived risks, benefits, and implementation challenges from those who design, deploy and experience these tools in real-world settings remains elusive. This study provides the first evidence-based account of key stakeholder perspectives on the relational, technical, and governance challenges raised by the integration of CP technologies into patient care. We conducted in-depth, semi-structured interviews with 102 stakeholders: adolescent patients and their caregivers, frontline clinicians, technology developers, and ethics, legal, policy or philosophy scholars. Transcripts underwent thematic analysis by a multidisciplinary team; reliability was enhanced through double coding and consensus adjudication. Stakeholders articulated seven interlocking concern domains: (1) trustworthiness and data integrity; (2) patient-specific relevance; (3) utility and workflow integration; (4) regulation and governance; (5) privacy and data protection; (6) direct and indirect patient harms; and (7) philosophical critiques of reductionism. To operationalize humanistic safeguards, we propose "personalized roadmaps": co-designed plans that predetermine which metrics will be monitored, how and when feedback is shared, thresholds for clinical action, and procedures for reconciling discrepancies between algorithmic inferences and lived experience. By translating these insights into personalized roadmaps, we offer a practical framework for developers, clinicians and policymakers seeking to harness continuous behavioral data while preserving the humanistic core of care.
Censored Sampling for Topology Design: Guiding Diffusion with Human Preferences
Kim, Euihyun, Park, Keun, Kim, Yeoneung
Recent advances in denoising diffusion models have enabled rapid generation of optimized structures for topology optimization. However, these models often rely on surrogate predictors to enforce physical constraints, which may fail to capture subtle yet critical design flaws such as floating components or boundary discontinuities that are obvious to human experts. In this work, we propose a novel human-in-the-loop diffusion framework that steers the generative process using a lightweight reward model trained on minimal human feedback. Inspired by preference alignment techniques in generative modeling, our method learns to suppress unrealistic outputs by modulating the reverse diffusion trajectory using gradients of human-aligned rewards. Specifically, we collect binary human evaluations of generated topologies and train classifiers to detect floating material and boundary violations. These reward models are then integrated into the sampling loop of a pre-trained diffusion generator, guiding it to produce designs that are not only structurally performant but also physically plausible and manufacturable. Our approach is modular and requires no retraining of the diffusion model. Preliminary results show substantial reductions in failure modes and improved design realism across diverse test conditions. This work bridges the gap between automated design generation and expert judgment, offering a scalable solution to trustworthy generative design.
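As a rough illustration of steering a frozen sampler with reward gradients, the sketch below shifts the mean of one reverse-diffusion step along the gradient of a log-reward, in the style of classifier guidance. This is an assumption about the general mechanism, not the paper's implementation: the logistic reward model, the weight vector `w`, and the function names are hypothetical, and the denoiser output is supplied by hand.

```python
import numpy as np


def log_reward_grad(x, w):
    """Gradient of log sigmoid(w . x) with respect to x (toy logistic reward)."""
    p_ok = 1.0 / (1.0 + np.exp(-(w @ x)))
    return (1.0 - p_ok) * w


def guided_reverse_step(mean, variance, reward_grad, scale, rng=None):
    """One reverse step whose mean is shifted along the reward gradient.

    mean, variance : what the frozen denoiser predicts for this step
    reward_grad    : gradient of the log-reward at the current sample
    scale          : guidance strength (0 disables guidance)
    rng            : optional numpy Generator; if None, no noise is added
    """
    shifted = mean + scale * variance * reward_grad
    noise = rng.standard_normal(mean.shape) if rng is not None else 0.0
    return shifted + np.sqrt(variance) * noise


# Toy usage: the reward prefers samples with a larger first coordinate.
x_t = np.zeros(2)
w = np.array([1.0, 0.0])
x_next = guided_reverse_step(
    mean=np.zeros(2),
    variance=0.5,
    reward_grad=log_reward_grad(x_t, w),
    scale=2.0,
)
```

Because only the sampling step changes, the generator itself needs no retraining, which matches the modularity claimed in the abstract.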
LetheViT: Selective Machine Unlearning for Vision Transformers via Attention-Guided Contrastive Learning
Tong, Yujia, Zhang, Tian, Yuan, Jingling, Wang, Yuze, Hu, Chuang
Vision Transformers (ViTs) have revolutionized computer vision tasks with their exceptional performance. However, the introduction of privacy regulations such as GDPR and CCPA has brought new challenges to them. These laws grant users the right to withdraw their data, necessitating not only the deletion of data but also the complete removal of its influence from trained models. Machine unlearning emerges as a critical solution, with exact unlearning being computationally prohibitive and approximate methods offering a more practical approach. This work addresses the particularly challenging scenario of random data forgetting in ViTs, where the model must forget specific samples while retaining others, even within the same class. We first reveal the core characteristics of ViTs through selective masking experiments: when high-attention areas are masked, the model retains its recognition capability but significantly weakens its memorization ability. Based on the above insights, we propose LetheViT, a contrastive unlearning method tailored for ViTs. LetheViT uses masked image inputs to generate positive logits and original image inputs to generate negative logits, guiding the model to forget specific details while retaining the general category outlines. Experimental results demonstrate that LetheViT achieves state-of-the-art performance, effectively balancing privacy compliance with model efficacy.
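The positive/negative logit idea can be sketched with a generic two-term contrastive loss. The abstract does not state the actual objective, so everything here is an assumption: the cosine similarity, the temperature, and the function names are hypothetical, and the vectors stand in for logits produced from masked images (positive) and original images (negative).

```python
import numpy as np


def cosine(a, b):
    """Cosine similarity between two 1D vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def contrastive_forget_loss(anchor, positive, negative, temperature=0.5):
    """InfoNCE-style loss with one positive and one negative.

    The loss is low when `anchor` (current logits on a forget sample) is
    closer to `positive` (logits from the masked image) than to `negative`
    (logits from the original image).
    """
    pos = np.exp(cosine(anchor, positive) / temperature)
    neg = np.exp(cosine(anchor, negative) / temperature)
    return float(-np.log(pos / (pos + neg)))


# Toy logits: the first anchor sits near the positive, the second near the negative.
positive = np.array([1.0, 0.1])
negative = np.array([0.0, 1.0])
loss_near_positive = contrastive_forget_loss(np.array([1.0, 0.0]), positive, negative)
loss_near_negative = contrastive_forget_loss(np.array([0.0, 1.0]), positive, negative)
```

Minimizing such a loss on the forget set would pull the model's outputs toward what it predicts without the high-attention detail, which is one plausible reading of "forgetting specific details while retaining the general category outlines".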
Multispin Physics of AI Tipping Points and Hallucinations
Johnson, Neil F., Huo, Frank Yingjie
Output from generative AI, such as ChatGPT, can be repetitive and biased. But more worrying is that this output can mysteriously tip mid-response from good (correct) to bad (misleading or wrong) without the user noticing. In 2024 alone, this reportedly caused $67 billion in losses and several deaths. Establishing a mathematical mapping to a multispin thermal system, we reveal a hidden tipping instability at the scale of the AI's 'atom' (basic Attention head). We derive a simple but essentially exact formula for this tipping point which shows directly the impact of a user's prompt choice and the AI's training bias. We then show how the output tipping can get amplified by the AI's multilayer architecture. As well as helping improve AI transparency, explainability and performance, our results open a path to quantifying users' AI risk and legal liabilities.
Academic Vibe Coding: Opportunities for Accelerating Research in an Era of Resource Constraint
Crowson, Matthew G., Celi, Leo A.
Academic laboratories face mounting resource constraints: budgets are tightening, grant overheads are potentially being capped, and the market rate for data-science talent significantly outstrips university compensation. Vibe coding, which is structured, prompt-driven code generation with large language models (LLMs) embedded in reproducible workflows, offers one pragmatic response. It aims to compress the idea-to-analysis timeline, reduce staffing pressure on specialized data roles, and maintain rigorous, version-controlled outputs. This article defines the vibe coding concept, situates it against the current academic resourcing crisis, details a beginner-friendly toolchain for its implementation, and analyzes inherent limitations that necessitate governance and mindful application.
ff4ERA: A new Fuzzy Framework for Ethical Risk Assessment in AI
Dyoub, Abeer, Letteri, Ivan, Lisi, Francesca A.
The emergence of Symbiotic AI (SAI) introduces new challenges to ethical decision-making as it deepens human-AI collaboration. As symbiosis grows, AI systems pose greater ethical risks, including harm to human rights and trust. Ethical Risk Assessment (ERA) thus becomes crucial for guiding decisions that minimize such risks. However, ERA is hindered by uncertainty, vagueness, and incomplete information, and morality itself is context-dependent and imprecise. This motivates the need for a flexible, transparent, yet robust framework for ERA. Our work supports ethical decision-making by quantitatively assessing and prioritizing multiple ethical risks so that artificial agents can select actions aligned with human values and acceptable risk levels. We introduce ff4ERA, a fuzzy framework that integrates Fuzzy Logic, the Fuzzy Analytic Hierarchy Process (FAHP), and Certainty Factors (CF) to quantify ethical risks via an Ethical Risk Score (ERS) for each risk type. The final ERS combines the FAHP-derived weight, propagated CF, and risk level. The framework offers a robust mathematical approach for collaborative ERA modeling and systematic, step-by-step analysis. A case study confirms that ff4ERA yields context-sensitive, ethically meaningful risk scores reflecting both expert input and sensor-based evidence. Risk scores vary consistently with relevant factors while remaining robust to unrelated inputs. Local sensitivity analysis shows predictable, mostly monotonic behavior across perturbations, and global Sobol analysis highlights the dominant influence of expert-defined weights and certainty factors, validating the model design. Overall, the results demonstrate ff4ERA's ability to produce interpretable, traceable, and risk-aware ethical assessments, enabling what-if analyses and guiding designers in calibrating membership functions and expert judgments for reliable ethical decision support.
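To make the three ingredients of the score concrete, the sketch below combines two certainty factors with the classic MYCIN-style rule and then forms a score from a weight, the combined CF, and a risk level. The abstract says only that the final ERS "combines" these quantities, so the simple product used here is an assumption, as are the function names and the example numbers; the paper's actual aggregation and fuzzy inference may differ.

```python
def combine_cf(cf1, cf2):
    """MYCIN-style combination of two certainty factors in [-1, 1]."""
    if cf1 >= 0 and cf2 >= 0:
        return cf1 + cf2 * (1 - cf1)
    if cf1 < 0 and cf2 < 0:
        return cf1 + cf2 * (1 + cf1)
    denom = 1 - min(abs(cf1), abs(cf2))
    if denom == 0:
        # Fully contradictory evidence (+1 and -1) has no defined combination.
        raise ValueError("cannot combine certainty factors +1 and -1")
    return (cf1 + cf2) / denom


def ethical_risk_score(weight, cf, risk_level):
    """Hypothetical ERS: product of weight, certainty factor, and risk level."""
    return weight * cf * risk_level


# Illustrative numbers only: two pieces of supporting evidence for one risk type.
cf = combine_cf(0.6, 0.5)                      # 0.6 + 0.5 * (1 - 0.6)
score = ethical_risk_score(weight=0.5, cf=cf, risk_level=0.7)
```

With a product form, the score drops to zero whenever the weight, the certainty, or the risk level is zero, which is one simple way to obtain the robustness to unrelated inputs that the case study reports.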
HoneyImage: Verifiable, Harmless, and Stealthy Dataset Ownership Verification for Image Models
Zhu, Zhihao, Han, Jiale, Yang, Yi
Image-based AI models are increasingly deployed across a wide range of domains, including healthcare, security, and consumer applications. However, many image datasets carry sensitive or proprietary content, raising critical concerns about unauthorized data usage. Data owners therefore need reliable mechanisms to verify whether their proprietary data has been misused to train third-party models. Existing solutions, such as backdoor watermarking and membership inference, face inherent trade-offs between verification effectiveness and preservation of data integrity. In this work, we propose HoneyImage, a novel method for dataset ownership verification in image recognition models. HoneyImage selectively modifies a small number of hard samples to embed imperceptible yet verifiable traces, enabling reliable ownership verification while maintaining dataset integrity. Extensive experiments across four benchmark datasets and multiple model architectures show that HoneyImage consistently achieves strong verification accuracy with minimal impact on downstream performance while remaining imperceptible. The proposed HoneyImage method could provide data owners with a practical mechanism to protect ownership over valuable image datasets, encouraging safe sharing and unlocking the full transformative potential of data-driven AI.
Patents as Knowledge Artifacts: An Information Science Perspective on Global Innovation
Rajeevan, M. S., Devi, B. Mini
In an age of fast-paced technological change, patents have evolved into not only legal mechanisms of intellectual property but also structured repositories of knowledge, rich in metadata, categories, and formalized innovation. This chapter reframes patents in the context of information science by treating them as knowledge artifacts that are fundamentally tied to the global movement of scientific and technological knowledge. Focusing on three areas (AI-generated inventions, biotechnology patents, and international patent competition), this work considers how new technologies are challenging traditional notions of inventorship, access, and moral accountability. The chapter provides a critical analysis of AI's implications for patent authorship and prior art searches, of the ownership issues and ethical dilemmas arising from proprietary claims in biotechnology, and of the problem of using patents for strategic advantage in a global context of innovation competition. In this analysis, the chapter identifies the importance of organizing information, creating metadata standards about originality, implementing retrieval systems to access previous works, and ethical contemplation about patenting unseen relationships in innovation ecosystems. Ultimately, the chapter calls for a collaborative, transparent, and ethically grounded approach to managing knowledge in the patenting environment, highlighting the role of information professionals and policy in contributing to access equity in innovation.