 Generative AI


Concerns raised over AI trained on 57 million NHS medical records

New Scientist

An artificial intelligence model trained on the medical data of 57 million people who have used the National Health Service in England could one day assist doctors in predicting disease or forecasting hospitalisation rates, its creators have claimed. However, other researchers say there are still significant privacy and data protection concerns around such large-scale use of health data, while even the AI's architects say they can't guarantee that it won't inadvertently reveal sensitive patient data. The model, called Foresight, was first developed in 2023. That initial version used OpenAI's GPT-3, the large language model (LLM) behind the first version of ChatGPT, and trained on 1.5 million real patient records from two London hospitals. Now, Chris Tomlinson at University College London and his colleagues have scaled up Foresight to create what they say is the world's first "national-scale generative AI model of health data" and the largest of its kind.


This patient's Neuralink brain implant gets a boost from generative AI

MIT Technology Review

Smith was about to get brain surgery, but Musk's virtual appearance foretold a greater transformation. Smith's brain was about to be inducted into a much larger technology and media ecosystem--one of whose goals, the billionaire has said, is to achieve a "symbiosis" of humans and AI. Consider what unfolded on April 27, the day Smith announced on X that he'd received the brain implant and wanted to take questions. One of the first came from "Adrian Dittmann," an account often suspected of being Musk's alter ego. "Can you describe how it feels to type and interact with technology overall using the Neuralink?" "It feels wild, like I'm a cyborg from a sci-fi movie, moving a cursor just by thinking about it. At first, it was a struggle--my cursor acted like a drunk mouse, barely hitting targets, but after weeks of training with imagined hand and jaw movements, it clicked, almost like riding a bike."


Snakemaker: Seamlessly transforming ad-hoc analyses into sustainable Snakemake workflows with generative AI

arXiv.org Artificial Intelligence

Reproducibility and sustainability present significant challenges in bioinformatics software development, where rapidly evolving tools and complex workflows often result in short-lived or difficult-to-adapt pipelines. This paper introduces Snakemaker, a tool that leverages generative AI to help researchers build sustainable data analysis pipelines by converting unstructured code into well-defined Snakemake workflows. Snakemaker non-invasively tracks the work performed in the terminal by the researcher, analyzes execution patterns, and generates Snakemake workflows that can be integrated into existing pipelines. Snakemaker also supports the transformation of monolithic IPython Notebooks into modular Snakemake pipelines, resolving the global state of the notebook into discrete, file-based interactions between rules. An integrated chat assistant provides users with fine-grained control through natural language instructions. Snakemaker generates high-quality Snakemake workflows by adhering to best practices, including Conda environment tracking, generic rule generation and loop unrolling. By lowering the barrier between prototype and production-quality code, Snakemaker addresses a critical gap in computational reproducibility for bioinformatics research.


BCause: Human-AI collaboration to improve hybrid mapping and ideation in argumentation-grounded deliberation

arXiv.org Artificial Intelligence

Public deliberation, as in open discussion of issues of public concern, often suffers from scattered and shallow discourse, poor sensemaking, and a disconnect from actionable policy outcomes. This paper introduces BCause, a discussion system leveraging generative AI and human-machine collaboration to transform unstructured dialogue around public issues (such as urban living, policy changes, and current socio-economic transformations) into structured, actionable democratic processes. We present three innovations: (i) importing and transforming unstructured transcripts into argumentative discussions, (ii) geo-deliberated problem-sensing via a Telegram bot for local issue reporting, and (iii) smart reporting with customizable widgets (e.g., summaries, topic modelling, policy recommendations, clustered arguments). The system's human-AI partnership preserves critical human participation to ensure ethical oversight, contextual relevance, and creative synthesis.


Lesion-Aware Generative Artificial Intelligence for Virtual Contrast-Enhanced Mammography in Breast Cancer

arXiv.org Artificial Intelligence

Contrast-Enhanced Spectral Mammography (CESM) is a dual-energy mammographic technique that improves lesion visibility through the administration of an iodinated contrast agent. It acquires both a low-energy image, comparable to standard mammography, and a high-energy image, which are then combined to produce a dual-energy subtracted image highlighting lesion contrast enhancement. While CESM offers superior diagnostic accuracy compared to standard mammography, its use entails higher radiation exposure and potential side effects associated with the contrast medium. To address these limitations, we propose Seg-CycleGAN, a generative deep learning framework for Virtual Contrast Enhancement in CESM. The model synthesizes high-fidelity dual-energy subtracted images from low-energy images, leveraging lesion segmentation maps to guide the generative process and improve lesion reconstruction. Building upon the standard CycleGAN architecture, Seg-CycleGAN introduces localized loss terms focused on lesion areas, enhancing the synthesis of diagnostically relevant regions. Experiments on the CESM@UCBM dataset demonstrate that Seg-CycleGAN outperforms the baseline in terms of PSNR and SSIM, while maintaining competitive MSE and VIF. Qualitative evaluations further confirm improved lesion fidelity in the generated images. These results suggest that segmentation-aware generative models offer a viable pathway toward contrast-free CESM alternatives.
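The abstract says Seg-CycleGAN adds localized loss terms focused on lesion areas but does not give their form. The following is only a minimal sketch of one plausible shape for such a term, assuming a mean absolute error over the whole image plus a weighted mean absolute error restricted to pixels inside the segmentation mask. The function name `lesion_weighted_l1` and the parameter `lesion_weight` are hypothetical and do not come from the paper.

```python
def lesion_weighted_l1(generated, target, lesion_mask, lesion_weight=10.0):
    """Global L1 error plus a weighted L1 error over lesion pixels only.

    All three inputs are 2D nested lists of the same shape; lesion_mask
    holds truthy values where a pixel belongs to a lesion. This form is an
    illustrative assumption, not the loss used by Seg-CycleGAN.
    """
    all_errors = []
    lesion_errors = []
    for gen_row, tgt_row, mask_row in zip(generated, target, lesion_mask):
        for gen_px, tgt_px, in_lesion in zip(gen_row, tgt_row, mask_row):
            err = abs(gen_px - tgt_px)
            all_errors.append(err)
            if in_lesion:
                lesion_errors.append(err)

    global_term = sum(all_errors) / len(all_errors)
    # An image without lesions contributes no localized penalty.
    local_term = sum(lesion_errors) / len(lesion_errors) if lesion_errors else 0.0
    return global_term + lesion_weight * local_term
```

With a term like this, an error inside the lesion costs more than the same error in background tissue, which is the general idea behind guiding synthesis with segmentation maps.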


Ensuring Reproducibility in Generative AI Systems for General Use Cases: A Framework for Regression Testing and Open Datasets

arXiv.org Artificial Intelligence

Reproducibility and reliability remain pressing challenges for generative AI systems whose behavior can drift with each model update or prompt revision. We introduce GPR-bench, a lightweight, extensible benchmark that operationalizes regression testing for general purpose use cases. GPR-bench couples an open, bilingual (English and Japanese) dataset covering eight task categories (e.g., text generation, code generation, and information retrieval) and 10 scenarios in each task category (80 total test cases for each language) with an automated evaluation pipeline that employs "LLM-as-a-Judge" scoring of correctness and conciseness. Experiments across three recent model versions - gpt-4o-mini, o3-mini, and o4-mini - and two prompt configurations (default versus concise-writing instruction) reveal heterogeneous quality. Our results show that newer models generally improve correctness, but the differences are modest and not statistically significant, suggesting that GPR-bench may not be sufficiently challenging to differentiate between recent model versions. In contrast, the concise-writing instruction significantly enhances conciseness (+12.37 pp, Mann-Whitney U test: p < 0.001, effect size r = 0.2995) with minimal degradation in accuracy (-1.7 pp), demonstrating the effectiveness of prompt engineering. Released under the MIT License, GPR-bench lowers the barrier to initiating reproducibility monitoring and provides a foundation for community-driven extensions, while also raising important considerations about benchmark design for rapidly evolving language models.
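The abstract reports a Mann-Whitney U test with an effect size r but does not say how r was computed. As a rough sketch, one common convention is r = |Z| / sqrt(N), where Z comes from the normal approximation of U and N is the combined sample size. The code below follows that convention with no tie correction in the variance; the function names are hypothetical, and it is not claimed to reproduce the paper's figures.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_effect_size(sample_a, sample_b):
    """Return (U for sample_a, Z from the normal approximation, r = |Z|/sqrt(N))."""
    n1, n2 = len(sample_a), len(sample_b)
    ranks = average_ranks(list(sample_a) + list(sample_b))
    rank_sum_a = sum(ranks[:n1])
    u_a = rank_sum_a - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    # Assumption: plain variance formula, without a correction for ties.
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u_a - mean_u) / sd_u
    r = abs(z) / math.sqrt(n1 + n2)
    return u_a, z, r
```

In a regression-testing setting, `sample_a` and `sample_b` could hold per-test-case conciseness scores under the two prompt configurations. For real analyses, an established statistics library is the safer choice.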


Real-World Gaps in AI Governance Research

arXiv.org Artificial Intelligence

Drawing on 1,178 safety and reliability papers from 9,439 generative AI papers (January 2020 - March 2025), we compare research outputs of leading AI companies (Anthropic, Google DeepMind, Meta, Microsoft, and OpenAI) and AI universities (CMU, MIT, NYU, Stanford, UC Berkeley, and University of Washington). We find that corporate AI research increasingly concentrates on pre-deployment areas -- model alignment and testing & evaluation -- while attention to deployment-stage issues such as model bias has waned. Significant research gaps exist in high-risk deployment domains, including healthcare, finance, misinformation, persuasive and addictive features, hallucinations, and copyright. Without improved observability into deployed AI, growing corporate concentration could deepen knowledge deficits. We recommend expanding external researcher access to deployment data and systematic observability of in-market AI behaviors.


OpenAI's new for-profit plan leaves many unanswered questions

Engadget

OpenAI has abandoned its controversial restructuring plan. In a dramatic reversal, the company said Monday it would no longer try to separate control of its for-profit arm from the non-profit board that currently oversees operations. "We made the decision for the nonprofit to retain control of OpenAI after hearing from civic leaders and engaging in constructive dialogue with the offices of the Attorney General of Delaware and the Attorney General of California," said Bret Taylor, the chairman of OpenAI. OpenAI had originally argued its existing structure would not allow its nonprofit to "easily do more than control the for-profit." It also said it needed more money, a mere two months after securing $6.6 billion in new investment.


OpenAI dials back conversion plan, with nonprofit to retain control

The Japan Times

OpenAI has dialed back a significant restructuring plan, with its nonprofit parent retaining control in a move that is likely to limit CEO Sam Altman's power over the pioneering maker of ChatGPT. The announcement follows a storm of criticism and legal challenges, including a high-profile lawsuit filed by rival and co-founder Elon Musk, who has accused OpenAI of straying from its founding mission to develop artificial intelligence for the benefit of humanity. "OpenAI was founded as a non-profit, is today a non-profit that oversees and controls the for-profit, and going forward will remain a non-profit that oversees and controls the for-profit. That will not change," Altman said in a blog post Monday.


Task-Oriented Semantic Communication in Large Multimodal Models-based Vehicle Networks

arXiv.org Artificial Intelligence

Task-oriented semantic communication has emerged as a fundamental approach for enhancing performance in various communication scenarios. While recent advances in Generative Artificial Intelligence (GenAI), such as Large Language Models (LLMs), have been applied to semantic communication designs, the potential of Large Multimodal Models (LMMs) remains largely unexplored. In this paper, we investigate an LMM-based vehicle AI assistant using a Large Language and Vision Assistant (LLaVA) and propose a task-oriented semantic communication framework to facilitate efficient interaction between users and cloud servers. To reduce computational demands and shorten response time, we optimize LLaVA's image slicing to selectively focus on areas of utmost interest to users. Additionally, we assess the importance of image patches by combining objective and subjective user attention, adjusting energy usage for transmitting semantic information. This strategy optimizes resource utilization, ensuring precise transmission of critical information. We construct a Visual Question Answering (VQA) dataset for traffic scenarios to evaluate effectiveness. Experimental results show that our semantic communication framework significantly increases accuracy in answering questions under the same channel conditions, performing particularly well in environments with poor Signal-to-Noise Ratios (SNR). Accuracy improves by 13.4% at an SNR of 12 dB and by 33.1% at 10 dB.
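The abstract describes scoring image patches by combining objective and subjective user attention and then adjusting transmission energy, without giving the rule. Purely as an illustration, the sketch below assumes a linear blend of the two attention scores and an energy budget split in proportion to the blended importance. The function `allocate_energy`, the blend weight `alpha`, and the proportional rule are all hypothetical and are not taken from the paper.

```python
def allocate_energy(objective, subjective, total_energy, alpha=0.5):
    """Split total_energy across patches in proportion to blended importance.

    objective and subjective are equal-length lists of non-negative
    attention scores, one per image patch. alpha weights the objective
    score; (1 - alpha) weights the subjective one. Both the blend and the
    proportional split are illustrative assumptions.
    """
    if len(objective) != len(subjective):
        raise ValueError("attention lists must have the same length")
    if not objective:
        return []

    importance = [
        alpha * obj + (1 - alpha) * subj
        for obj, subj in zip(objective, subjective)
    ]
    total_importance = sum(importance)
    if total_importance <= 0:
        # No patch stands out, so fall back to a uniform split.
        return [total_energy / len(importance)] * len(importance)
    return [total_energy * w / total_importance for w in importance]
```

Under this kind of rule, patches that matter more to the user's question receive more transmit energy, so they are more likely to survive a noisy channel.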