 indication



Disney and OpenAI have made a surprise deal – what happens next?

New Scientist

The world's best-known AI company and the world's best-known entertainment firm have come to a surprise agreement to allow AI versions of some of the most iconic characters in film, TV and cartoons to be used in generative AI videos and images. The Walt Disney Company has signed a deal with OpenAI that will allow the AI firm's Sora video generation tool and ChatGPT image creator to use more than 200 of Disney's most iconic characters. Meanwhile, Disney remains in dispute with another AI firm, Midjourney, over alleged infringement of its intellectual property (IP), claiming Midjourney aims to "blatantly incorporate and copy Disney's and Universal's famous characters" into its image-generating tool. The characters now deemed fair game for OpenAI users include the likes of Mickey and Minnie Mouse, Simba and Mufasa from The Lion King, and Moana, as well as Marvel and Lucasfilm characters, including some of their most well-known names.


Statistical NLP for Optimization of Clinical Trial Success Prediction in Pharmaceutical R&D

Doane, Michael R.

arXiv.org Artificial Intelligence

This work presents the development and evaluation of an NLP-enabled probabilistic classifier designed to estimate the probability of technical and regulatory success (pTRS) for clinical trials in the field of neuroscience. While pharmaceutical R&D is plagued by high attrition rates and enormous costs, particularly within neuroscience, where success rates are below 10%, timely identification of promising programs can streamline resource allocation and reduce financial risk. Leveraging data from the ClinicalTrials.gov database and success labels from the recently developed Clinical Trial Outcome dataset, the classifier extracts text-based clinical trial features using statistical NLP techniques. These features were integrated into several non-LLM frameworks (logistic regression, gradient boosting, and random forest) to generate calibrated probability scores. Model performance was assessed on a retrospective dataset of 101,145 completed clinical trials spanning 1976-2024, achieving an overall ROC-AUC of 0.64. An LLM-based predictive model was then built using BioBERT, a domain-specific language representation encoder. The BioBERT-based model achieved an overall ROC-AUC of 0.74 and a Brier Score of 0.185, indicating its predictions had, on average, 40% less squared error than would be observed using industry benchmarks. The BioBERT-based model also made trial outcome predictions that were superior to benchmark values 70% of the time overall. By integrating NLP-driven insights into drug development decision-making, this work aims to enhance strategic planning and optimize investment allocation in neuroscience programs.
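The Brier score quoted in the abstract above is the mean squared difference between predicted probabilities and binary outcomes. The following is a minimal sketch of that calculation and of a relative reduction against a benchmark. The toy outcomes, the model probabilities and the flat benchmark rate are all hypothetical, and reading "40% less squared error" as one minus the ratio of the two Brier scores is an assumption, not something the abstract states.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    if len(probs) != len(outcomes) or not probs:
        raise ValueError("probs and outcomes must be non-empty and equally long")
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def relative_reduction(model_brier, benchmark_brier):
    """Fraction by which the model's Brier score undercuts the benchmark's."""
    return 1.0 - model_brier / benchmark_brier


# Hypothetical toy data: 1 = trial succeeded, 0 = trial failed.
outcomes = [1, 0, 0, 1, 0]
model_probs = [0.7, 0.2, 0.1, 0.6, 0.3]
# Hypothetical benchmark: the same flat success rate assigned to every trial.
benchmark_probs = [0.1] * len(outcomes)

model_bs = brier_score(model_probs, outcomes)
benchmark_bs = brier_score(benchmark_probs, outcomes)
print(f"model Brier score:     {model_bs:.3f}")
print(f"benchmark Brier score: {benchmark_bs:.3f}")
print(f"relative reduction:    {relative_reduction(model_bs, benchmark_bs):.1%}")
```

Under that reading, a model score of 0.185 with a 40% reduction would correspond to a benchmark score of roughly 0.185 / 0.6, or about 0.31.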



Interpretable Data Mining of Follicular Thyroid Cancer Ultrasound Features Using Enhanced Association Rules

Zhou, Songlin, Zhou, Tao, Li, Xin, Yau, Stephen Shing-Toung

arXiv.org Artificial Intelligence

Purpose: Thyroid cancer is a common cancer, and papillary thyroid cancer and follicular thyroid cancer are its two most common types. Follicular thyroid cancer lacks distinctive ultrasound signs and is more difficult to diagnose preoperatively than the more prevalent papillary thyroid cancer, and the clinical studies associated with it are less well established. We aimed to analyze clinical data on follicular thyroid cancer with a novel data mining tool to identify clinical indications that may help in preoperative diagnosis. Methods: We performed a retrospective analysis of case data collected by the Department of General Surgery of Peking University Third Hospital between 2010 and 2023. Rather than relying on traditional statistical methods, we improved association rule mining, a classical data mining method, and proposed new analytical metrics that reflect the malignant association between clinical indications and cancer, drawing on the idea behind the SHAP method in interpretable machine learning. Results: After preprocessing, the dataset contained 1673 cases (counted as nodules rather than patients), of which 1414 were benign and 259 were malignant. Our analysis showed that, in addition to some common indicators (e.g., irregular or lobulated nodule margins, a halo of uneven thickness, hypoechogenicity), other indicators also had strong malignant associations, such as a nodule-in-nodule pattern, a trabecular pattern, and low TSH scores. In addition, our results suggest that concurrent Hashimoto's thyroiditis may also have a strong malignant association. Conclusion: In the preoperative diagnosis of nodules suspected of follicular thyroid cancer, multiple clinical indications should be considered for a more accurate diagnosis. The diverse malignant associations identified in our study may serve as a reference for clinicians in related fields.
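For readers unfamiliar with association rule mining, the sketch below computes the three classical metrics (support, confidence, lift) for a rule of the form "ultrasound feature → malignant". It does not reproduce the enhanced, SHAP-inspired metrics the abstract mentions, which are not specified there. The totals (1673 cases, 259 malignant) come from the abstract; the feature counts are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class RuleMetrics:
    support: float     # P(feature and malignant)
    confidence: float  # P(malignant | feature)
    lift: float        # confidence / P(malignant)


def rule_metrics(n_total, n_antecedent, n_consequent, n_both):
    """Classical association rule metrics for the rule antecedent -> consequent."""
    if min(n_total, n_antecedent, n_consequent) <= 0:
        raise ValueError("counts must be positive")
    if n_both > min(n_antecedent, n_consequent):
        raise ValueError("joint count cannot exceed either marginal count")
    support = n_both / n_total
    confidence = n_both / n_antecedent
    lift = confidence / (n_consequent / n_total)
    return RuleMetrics(support, confidence, lift)


# Totals from the abstract: 1673 cases, of which 259 are malignant.
# Hypothetical counts: 100 cases show the feature, 40 of those are malignant.
metrics = rule_metrics(n_total=1673, n_antecedent=100, n_consequent=259, n_both=40)
print(metrics)
```

A lift above 1 means the feature co-occurs with malignancy more often than the base rate alone would predict.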


Predicting effect of novel treatments using molecular pathways and real-world data

Couetoux, Adrien, Devenyns, Thomas, Diagne, Lise, Champagne, David, Mousset, Pierre-Yves, Anagnostopoulos, Chris

arXiv.org Artificial Intelligence

In pharmaceutical R&D, predicting the efficacy of a pharmaceutical in treating a particular disease prior to clinical testing or any real-world use has been challenging. In this paper, we propose a flexible and modular machine learning-based approach for predicting the efficacy of an untested pharmaceutical for treating a disease. We train a machine learning model using sets of pharmaceutical-pathway weight impact scores and patient data, which can include patient characteristics and observed clinical outcomes. The resulting model then analyses weighted impact scores of an untested pharmaceutical across human biological molecule-protein pathways to generate a predicted efficacy value. We demonstrate how the method works on a real-world dataset with patient treatments and outcomes, with two different weight impact score algorithms. We include methods for evaluating the generalisation performance on unseen treatments, and to characterise conditions under which the approach can be expected to be most predictive. We discuss specific ways in which our approach can be iterated on, making it an initial framework to support future work on predicting the effect of untested drugs, leveraging RWD clinical data and drug embeddings.
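One common way to estimate generalisation to unseen treatments is to hold out every record of one treatment at a time. The sketch below shows only that splitting step. The abstract does not describe the authors' actual evaluation protocol, so the function name, the record layout and the placeholder treatment names are all assumptions.

```python
def leave_one_treatment_out(records):
    """Yield (held_out, train, test) splits, one per distinct treatment.

    Each test fold holds every record for one treatment, so a model fitted
    on the train fold has never seen that treatment.
    """
    treatments = sorted({r["treatment"] for r in records})
    for held_out in treatments:
        train = [r for r in records if r["treatment"] != held_out]
        test = [r for r in records if r["treatment"] == held_out]
        yield held_out, train, test


# Hypothetical records: pathway impact scores plus an observed outcome.
records = [
    {"treatment": "drug_a", "pathway_scores": [0.2, 0.7], "outcome": 1},
    {"treatment": "drug_a", "pathway_scores": [0.2, 0.7], "outcome": 0},
    {"treatment": "drug_b", "pathway_scores": [0.9, 0.1], "outcome": 1},
    {"treatment": "drug_c", "pathway_scores": [0.4, 0.4], "outcome": 0},
]

folds = list(leave_one_treatment_out(records))
for held_out, train, test in folds:
    print(held_out, len(train), len(test))
```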


OwkinZero: Accelerating Biological Discovery with AI

Bigaud, Nathan, Cabeli, Vincent, Gürel, Meltem, Pignet, Arthur, Klein, John, Wainrib, Gilles, Durand, Eric

arXiv.org Artificial Intelligence

While large language models (LLMs) are rapidly advancing scientific research, they continue to struggle with core biological reasoning tasks essential for translational and biomedical discovery. To address this limitation, we created and curated eight comprehensive benchmark datasets comprising over 300,000 verifiable question-and-answer pairs, each targeting critical challenges in drug discovery including target druggability, modality suitability, and drug perturbation effects. Using this resource, we developed the OwkinZero models by post-training open-source LLMs through a Reinforcement Learning from Verifiable Rewards strategy. Our results demonstrate that specialized 8-32B OwkinZero models substantially outperform larger, state-of-the-art commercial LLMs on these biological benchmarks. Remarkably, we uncover evidence of a key aspect of generalization: specialist models trained on a single task consistently outperform their base models on previously unseen tasks. This generalization effect is further amplified in our comprehensive OwkinZero models, which were trained on a mixture of datasets and achieve even broader cross-task improvements. This study represents a significant step toward addressing the biological reasoning blind spot in current LLMs, demonstrating that targeted reinforcement learning on carefully curated data can unlock generalizable performance in specialized models, thereby accelerating AI-driven biological discovery.
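Reinforcement Learning from Verifiable Rewards depends on a reward that can be checked automatically against a known answer. The sketch below shows one simple form of such a check for question-and-answer pairs. The "Answer:" output convention, the function names and the exact-match rule are assumptions made for illustration; the abstract does not describe the reward used for the OwkinZero models.

```python
import re

# Assumed output convention: the completion ends with a line like "Answer: yes".
ANSWER_PATTERN = re.compile(r"answer\s*:\s*(.+)", re.IGNORECASE)


def extract_answer(completion):
    """Return the last 'Answer: ...' value in the completion, normalised, or None."""
    matches = ANSWER_PATTERN.findall(completion)
    if not matches:
        return None
    return matches[-1].strip().rstrip(".").lower()


def verifiable_reward(completion, gold):
    """Return 1.0 if the extracted answer equals the gold label, else 0.0."""
    predicted = extract_answer(completion)
    if predicted is None:
        return 0.0
    return 1.0 if predicted == gold.strip().lower() else 0.0


# Hypothetical completion for a druggability-style yes/no question.
completion = "The target has a known binding pocket.\nAnswer: Yes"
print(verifiable_reward(completion, "yes"))
```

A binary exact-match reward is the simplest choice; graded or format-aware rewards are possible but are not shown here.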



Artificial intelligence for sustainable wine industry: AI-driven management in viticulture, wine production and enotourism

Sidorkiewicz, Marta, Królikowska, Karolina, Dyczek, Berenika, Pijet-Migon, Edyta, Dubel, Anna

arXiv.org Artificial Intelligence

ABSTRACT Purpose: This study examines the role of Artificial Intelligence (AI) in enhancing sustainability and efficiency within the wine industry. It focuses on AI-driven intelligent management in viticulture, wine production, and enotourism. Need for the Study: As the wine industry faces environmental and economic challenges, AI offers innovative solutions to optimize resource use, reduce environmental impact, and improve customer engagement. Understanding AI's potential in sustainable winemaking is crucial for fostering responsible and efficient industry practices. Methodology: The research is based on a questionnaire survey conducted among Polish winemakers, combined with a comprehensive analysis of AI methods applicable to viticulture, production, and tourism. Key AI technologies, including predictive analytics, machine learning, and computer vision, are explored. Findings: AI enhances vineyard monitoring, optimizes irrigation, and streamlines production processes, contributing to sustainable resource management. In enotourism, AI-powered chatbots, recommendation systems, and virtual tastings personalize consumer experiences. The study underscores AI's impact on economic, environmental, and social sustainability, supporting local wine enterprises and cultural heritage. Practical Implications: AI in winemaking and enotourism can lead to more efficient, sustainable operations that benefit producers and consumers. AI-driven solutions promote responsible tourism, enhance wine tourism experiences, and ensure the industry's long-term viability. Keywords: Artificial Intelligence, Sustainable Development, AI-Driven Management, Viticulture, Wine Production, Enotourism, Wine Enterprises, Local Communities JEL codes: A13, A14, C55, D81, L66, L83, M31, O33, Q01, Q13, Q16, Z32 1. INTRODUCTION Sustainability in the wine industry encompasses environmental stewardship, economic viability, and social responsibility. Sustainable viticulture aims to minimize environmental impacts while maintaining product quality.


medicX-KG: A Knowledge Graph for Pharmacists' Drug Information Needs

Farrugia, Lizzy, Azzopardi, Lilian M., Debattista, Jeremy, Abela, Charlie

arXiv.org Artificial Intelligence

The role of pharmacists is evolving from medicine dispensing to delivering comprehensive pharmaceutical services within multidisciplinary healthcare teams. Central to this shift is access to accurate, up-to-date medicinal product information supported by robust data integration. Leveraging artificial intelligence and semantic technologies, Knowledge Graphs (KGs) uncover hidden relationships and enable data-driven decision-making. This paper presents medicX-KG, a pharmacist-oriented knowledge graph supporting clinical and regulatory decisions. It forms the semantic layer of the broader medicX platform, powering predictive and explainable pharmacy services. medicX-KG integrates data from three sources: the British National Formulary (BNF), DrugBank, and the Malta Medicines Authority (MMA). It addresses Malta's regulatory landscape, which combines European Medicines Agency alignment with partial UK supply dependence. The KG tackles the absence of a unified national drug repository, reducing pharmacists' reliance on fragmented sources. Its design was informed by interviews with practicing pharmacists to ensure real-world applicability. We detail the KG's construction, including data extraction, ontology design, and semantic mapping. Evaluation demonstrates that medicX-KG effectively supports queries about drug availability, interactions, adverse reactions, and therapeutic classes. Limitations, including missing detailed dosage encoding and real-time updates, are discussed alongside directions for future enhancements.
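The kinds of query listed in the abstract (availability, interactions, adverse reactions, therapeutic classes) can be pictured as pattern matches over subject-predicate-object triples. The sketch below is a toy in-memory version. The predicate names, the placeholder drug names and the `query` helper are hypothetical and do not reflect the medicX-KG ontology or its data.

```python
def query(triples, subject=None, predicate=None, obj=None):
    """Return every triple matching the given fields; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


# Hypothetical triples with placeholder names (not real drug data).
triples = [
    ("DrugA", "hasTherapeuticClass", "ClassX"),
    ("DrugA", "interactsWith", "DrugB"),
    ("DrugA", "hasAdverseReaction", "ReactionY"),
    ("DrugA", "availableIn", "Malta"),
    ("DrugB", "hasTherapeuticClass", "ClassX"),
]

# Which drugs does DrugA interact with?
interactions = query(triples, subject="DrugA", predicate="interactsWith")
# Which drugs share therapeutic class ClassX?
same_class = query(triples, predicate="hasTherapeuticClass", obj="ClassX")
print(interactions)
print(same_class)
```

A production knowledge graph would typically use an RDF store and SPARQL rather than list filtering; the list form is used here only to keep the example self-contained.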