 rash


Text Knows What, Tables Know When: Clinical Timeline Reconstruction via Retrieval-Augmented Multimodal Alignment

arXiv.org Machine Learning

Reconstructing precise clinical timelines is essential for modeling patient trajectories and forecasting risk in complex, heterogeneous conditions like sepsis. While unstructured clinical narratives offer semantically rich and contextually complete descriptions of a patient's course, they often lack temporal precision and contain ambiguous event timing. Conversely, structured electronic health record (EHR) data provides precise temporal anchors but misses a substantial portion of clinically meaningful events. We introduce a retrieval-augmented multimodal alignment framework that bridges this gap to improve the temporal precision of absolute clinical timelines extracted from text. Our approach formulates timeline reconstruction as a graph-based multistep process: it first extracts central anchor events from narratives to build an initial temporal scaffold, places non-central events relative to this backbone, and then calibrates the timeline using retrieved structured EHR rows as external temporal evidence. Evaluated using instruction-tuned large language models on the i2m4 benchmark spanning MIMIC-III and MIMIC-IV, our multimodal pipeline consistently improves absolute timestamp accuracy (AULTC) and improves temporal concordance across nearly all evaluated models over unimodal text-only reconstruction, without compromising event match rates. Furthermore, our empirical gap analysis reveals that 34.8% of text-derived events are entirely absent from tabular records, demonstrating that aligning these modalities can produce a more temporally faithful and clinically informative reconstruction of patient trajectories than either source alone.


How Pokémon Go is giving delivery robots an inch-perfect view of the world

MIT Technology Review

Niantic's AI spinout is training a new world model using 30 billion images of urban landmarks crowdsourced from players. Pokémon Go was the world's first augmented-reality megahit. Released in 2016 by the Google spinout Niantic, the AR twist on the juggernaut Pokémon franchise fast became a global phenomenon. From Chicago to Oslo to Enoshima, players hit the streets in the urgent hope of catching a Jigglypuff or a Squirtle or (with a huge amount of luck) an ultra-rare Galarian Zapdos hovering just out of reach, superimposed on the everyday world. "Five hundred million people installed that app in 60 days," says Brian McClendon, CTO at Niantic Spatial, an AI company that Niantic spun out in May last year. According to the video-game firm Scopely, which bought Pokémon Go from Niantic at the same time, the game still drew more than 100 million players in 2024, eight years after it launched.


Judging with Confidence: Calibrating Autoraters to Preference Distributions

arXiv.org Artificial Intelligence

The alignment of large language models (LLMs) with human values increasingly relies on using other LLMs as automated judges, or "autoraters". However, their reliability is limited by a foundational issue: they are trained on discrete preference labels, forcing a single ground truth onto tasks that are often subjective, ambiguous, or nuanced. We argue that a reliable autorater must learn to model the full distribution of preferences defined by a target population. In this paper, we propose a general framework for calibrating probabilistic autoraters to any given preference distribution. We formalize the problem and present two learning methods tailored to different data conditions: 1) direct supervised fine-tuning for dense, probabilistic labels, and 2) a reinforcement learning approach for sparse, binary labels. Our empirical results show that fine-tuning autoraters with a distribution-matching objective leads to verbalized probability predictions that are better aligned with the target preference distribution, with improved calibration and significantly lower positional bias, all while preserving performance on objective tasks.
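One common way to express a distribution-matching objective for dense, probabilistic labels is cross-entropy against a soft target. The sketch below is illustrative only and is not taken from the paper: the function name `soft_label_cross_entropy` and the example probabilities are assumptions, and the paper's actual training objective may differ.

```python
import math

def soft_label_cross_entropy(p_target: float, q_pred: float, eps: float = 1e-12) -> float:
    """Cross-entropy between a target preference probability and a prediction.

    p_target: fraction of the target population preferring response A (assumed given).
    q_pred:   the autorater's stated probability that A is preferred.
    """
    # Clamp the prediction so log() never receives 0.
    q = min(max(q_pred, eps), 1.0 - eps)
    return -(p_target * math.log(q) + (1.0 - p_target) * math.log(1.0 - q))

# Hypothetical case: 70% of raters prefer A.
p = 0.7
loss_matched = soft_label_cross_entropy(p, 0.7)        # prediction equals the target
loss_overconfident = soft_label_cross_entropy(p, 0.99)  # prediction collapses to one answer
```

For a fixed target, this loss is smallest when the prediction equals the target probability, so an autorater that collapses to a confident single answer is penalized on subjective items.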


CDC warns of 'enhanced' virus risk for travelers amid outbreak spread by mosquitoes

FOX News

Fox News senior medical analyst Dr. Marc Siegel shares his perspective on whether the mosquito-borne virus in China will spread to the United States and how AI can be detrimental to children's and young adults' mental health on 'Fox Report.' The U.S. Centers for Disease Control and Prevention (CDC) is warning that travelers to China face an "enhanced" risk of contracting a virus spread by mosquitoes. There has been an outbreak of chikungunya in Guangdong Province, which can cause fever, joint pain, headache, muscle pain, joint swelling, and rash. Recently, the CDC raised the warning related to chikungunya in China from Level 1: "Practice Usual Precautions" to Level 2: "Practice Enhanced Precautions." The CDC says there are no medicines to treat chikungunya, and recommends preventing it by wearing insect repellent, wearing long sleeves and pants, or staying in places that have air conditioning or screens on the windows and doors.


Enhancing Homophily-Heterophily Separation: Relation-Aware Learning in Heterogeneous Graphs

arXiv.org Artificial Intelligence

Real-world networks usually have a property of node heterophily, that is, the connected nodes usually have different features or different labels. This heterophily issue has been extensively studied in homogeneous graphs but remains under-explored in heterogeneous graphs, where there are multiple types of nodes and edges. Capturing node heterophily in heterogeneous graphs is very challenging since both node/edge heterogeneity and node heterophily should be carefully taken into consideration. Existing methods typically convert heterogeneous graphs into homogeneous ones to learn node heterophily, which will inevitably lose the potential heterophily conveyed by heterogeneous relations. To bridge this gap, we propose Relation-Aware Separation of Homophily and Heterophily (RASH), a novel contrastive learning framework that explicitly models high-order semantics of heterogeneous interactions and adaptively separates homophilic and heterophilic patterns. Particularly, RASH introduces dual heterogeneous hypergraphs to encode multi-relational bipartite subgraphs and dynamically constructs homophilic graphs and heterophilic graphs based on relation importance. A multi-relation contrastive loss is designed to align heterogeneous and homophilic/heterophilic views by maximizing mutual information. In this way, RASH simultaneously resolves the challenges of heterogeneity and heterophily in heterogeneous graphs. Extensive experiments on benchmark datasets demonstrate the effectiveness of RASH across various downstream tasks. The code is available at: https://github.com/zhengziyu77/RASH.
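The notion of heterophily described above is often quantified with an edge homophily ratio: the fraction of edges whose endpoints share a label. The following is a minimal sketch of that general measure, not of RASH itself; the function name, the toy graph, and the labels are hypothetical.

```python
def edge_homophily(edges, labels):
    """Fraction of edges that connect two nodes with the same label.

    edges:  iterable of (u, v) node pairs.
    labels: mapping from node id to class label.
    Values near 1 indicate homophily; values near 0 indicate heterophily.
    """
    edges = list(edges)
    if not edges:
        raise ValueError("edge list is empty")
    same = sum(1 for u, v in edges if labels[u] == labels[v])
    return same / len(edges)

# Toy graph: two of the four edges join same-label nodes.
toy_edges = [(0, 1), (1, 2), (2, 3), (0, 3)]
toy_labels = {0: "a", 1: "a", 2: "b", 3: "b"}
ratio = edge_homophily(toy_edges, toy_labels)
```

In a heterogeneous graph this ratio could be computed separately per relation type, which is one simple way to see that different relations may carry different degrees of heterophily.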


Enhancing Adverse Drug Event Detection with Multimodal Dataset: Corpus Creation and Model Development

arXiv.org Artificial Intelligence

The mining of adverse drug events (ADEs) is pivotal in pharmacovigilance, enhancing patient safety by identifying potential risks associated with medications, facilitating early detection of adverse events, and guiding regulatory decision-making. Traditional ADE detection methods are reliable but slow, not easily adaptable to large-scale operations, and offer limited information. With the exponential increase in data sources like social media content, biomedical literature, and Electronic Medical Records (EMR), extracting relevant ADE-related information from these unstructured texts is imperative. Previous ADE mining studies have focused on text-based methodologies, overlooking visual cues, limiting contextual comprehension, and hindering accurate interpretation. To address this gap, we present a MultiModal Adverse Drug Event (MMADE) detection dataset, merging ADE-related textual information with visual aids. Additionally, we introduce a framework that leverages the capabilities of LLMs and VLMs for ADE detection by generating detailed descriptions of medical images depicting ADEs, aiding healthcare professionals in visually identifying adverse events. Using our MMADE dataset, we showcase the significance of integrating visual cues from images to enhance overall performance. This approach holds promise for patient safety, ADE awareness, and healthcare accessibility, paving the way for further exploration in personalized healthcare.


MedSumm: A Multimodal Approach to Summarizing Code-Mixed Hindi-English Clinical Queries

arXiv.org Artificial Intelligence

In the healthcare domain, summarizing medical questions posed by patients is critical for improving doctor-patient interactions and medical decision-making. Although medical data has grown in complexity and quantity, the current body of research in this domain has primarily concentrated on text-based methods, overlooking the integration of visual cues. Also, prior work in the area of medical question summarization has been limited to the English language. This work introduces the task of multimodal medical question summarization for codemixed input in a low-resource setting. To address this gap, we introduce the Multimodal Medical Codemixed Question Summarization (MMCQS) dataset, which combines Hindi-English codemixed medical queries with visual aids. This integration enriches the representation of a patient's medical condition, providing a more comprehensive perspective. We also propose a framework named MedSumm that leverages the power of LLMs and VLMs for this task. By utilizing our MMCQS dataset, we demonstrate the value of integrating visual information from images to improve the creation of medically detailed summaries. This multimodal strategy not only improves healthcare decision-making but also promotes a deeper comprehension of patient queries, paving the way for future exploration in personalized and responsive medical care. Our dataset, code, and pre-trained models will be made publicly available.


Startup lets doctors classify skin conditions with the snap of a picture

#artificialintelligence

At the age of 22, when Susan Conover wanted to get a strange-looking mole checked out, she was told it would take three months to see a dermatologist. When the mole was finally removed and biopsied, doctors determined it was cancerous. At the time, no one could be sure the cancer hadn't spread to other parts of her body -- the difference between stage 2 and stage 3 or 4 melanoma. Thankfully, the mole ended up being confined to one spot. But the experience launched Conover into the world of skin diseases and dermatology.


AI can analyze smartphone 'rash selfies' to diagnose Lyme disease

Daily Mail - Science & tech

Artificial intelligence can be used to evaluate smartphone photos of suspicious rashes and detect Lyme disease earlier, according to a new study. Lyme disease affects roughly 300,000 people in the US every year and is transmitted through the bite of an infected deer tick. A painless rash, called Erythema migrans (EM), usually appears a week or so later, followed by more serious symptoms including fever, headache, chills, joint pain and swollen lymph glands. Lyme disease is most effectively treated if caught early. Untreated, it can cause cognitive impairment, chronic fatigue, heart palpitations and painful swelling that can last from months to years.


Measles Rash Identification Using Residual Deep Convolutional Neural Network

arXiv.org Artificial Intelligence

Measles is extremely contagious and is one of the leading causes of vaccine-preventable illness and death in developing countries, claiming more than 100,000 lives each year. Measles was declared eliminated in the US in 2000 after decades of successful vaccination. As a result, an increasing number of US healthcare professionals and members of the public have never seen the disease. Unfortunately, measles resurged in the US in 2019 with 1,282 confirmed cases. To assist in diagnosing measles, we collected more than 1,300 images of a variety of skin conditions and used them to train a residual deep convolutional neural network to distinguish measles rash from other skin conditions, with the aim of creating a phone application in the future. On our image dataset, our model reaches a classification accuracy of 95.2%, sensitivity of 81.7%, and specificity of 97.1%, indicating the model is effective in facilitating accurate detection of measles to help contain measles outbreaks.
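Accuracy, sensitivity, and specificity as reported above are all derived from the four cells of a binary confusion matrix. The sketch below shows the standard definitions; the counts are made up for illustration and are not the paper's data, and the function name `binary_metrics` is an assumption.

```python
def binary_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Standard binary classification metrics from confusion-matrix counts.

    tp/fn: positive cases (e.g. measles) classified correctly / missed.
    tn/fp: negative cases (other skin conditions) classified correctly / flagged.
    """
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,     # share of all images classified correctly
        "sensitivity": tp / (tp + fn),     # share of true positives that were detected
        "specificity": tn / (tn + fp),     # share of true negatives that were cleared
    }

# Illustrative counts only (hypothetical, not from the study).
metrics = binary_metrics(tp=8, fn=2, tn=45, fp=5)
```

When negatives greatly outnumber positives, accuracy is dominated by specificity, which is why a model can show high accuracy alongside a noticeably lower sensitivity.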