Toward an AI Reasoning-Enabled System for Patient-Clinical Trial Matching

Leach, Caroline N., Klusty, Mitchell A., Armstrong, Samuel E., Pickarski, Justine C., Hankins, Kristen L., Collier, Emily B., Shah, Maya, Mullen, Aaron D., Bumgardner, V. K. Cody

arXiv.org Artificial Intelligence

Screening patients for clinical trial eligibility remains a manual, time-consuming, and resource-intensive process. We present a secure, scalable proof-of-concept system for Artificial Intelligence (AI)-augmented patient-trial matching that addresses key implementation challenges: integrating heterogeneous electronic health record (EHR) data, facilitating expert review, and maintaining rigorous security standards. Leveraging open-source, reasoning-enabled large language models (LLMs), the system moves beyond binary classification to generate structured eligibility assessments with interpretable reasoning chains that support human-in-the-loop review. This decision support tool represents eligibility as a dynamic state rather than a fixed determination, identifying matches when available and offering actionable recommendations that could render a patient eligible in the future. The system aims to reduce coordinator burden, intelligently broaden the set of trials considered for each patient, and guarantee comprehensive auditability of all AI-generated outputs. Introduction: Applications of artificial intelligence (AI) in healthcare are increasingly focused on improving administrative efficiency and optimizing clinical workflows. Identifying relevant trials and screening them for a particular patient is traditionally manual, time-consuming, and heavily reliant on clinical expertise.
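The abstract's core design choice, eligibility as a dynamic state with reasoning chains and actionable recommendations rather than a yes/no label, can be illustrated with a minimal sketch. All names, fields, and the toy hemoglobin rule below are invented for illustration; they are not the paper's actual schema.

```python
from dataclasses import dataclass, field
from enum import Enum

class EligibilityStatus(Enum):
    ELIGIBLE = "eligible"
    INELIGIBLE = "ineligible"
    POTENTIALLY_ELIGIBLE = "potentially_eligible"  # could qualify after an action

@dataclass
class EligibilityAssessment:
    trial_id: str
    status: EligibilityStatus
    reasoning: list = field(default_factory=list)        # interpretable chain for reviewer
    recommendations: list = field(default_factory=list)  # actions that could change status

def assess(patient_ehr: dict, criteria: dict) -> EligibilityAssessment:
    """Toy rule: a lab value below the trial minimum blocks eligibility but is actionable."""
    reasoning, recs = [], []
    status = EligibilityStatus.ELIGIBLE
    if patient_ehr.get("hemoglobin", 0) < criteria.get("min_hemoglobin", 0):
        reasoning.append("Hemoglobin below trial minimum.")
        recs.append("Re-screen after treatment for anemia.")
        status = EligibilityStatus.POTENTIALLY_ELIGIBLE
    return EligibilityAssessment(criteria["trial_id"], status, reasoning, recs)

a = assess({"hemoglobin": 9.0}, {"trial_id": "NCT000", "min_hemoglobin": 10.0})
print(a.status.value)  # potentially_eligible
```

The point of the structure is that a human reviewer receives both the verdict and the chain that produced it, so every AI-generated output stays auditable.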


Hybrid LLM and Higher-Order Quantum Approximate Optimization for CSA Collateral Management

Jin, Tao, Florescu, Stuart, Heyu, Jin

arXiv.org Artificial Intelligence

We address finance-native collateral optimization under ISDA Credit Support Annexes (CSAs), where integer lots, Schedule A haircuts, RA/MTA gating, and issuer/currency/class caps create rugged, legally bounded search spaces. We introduce a certifiable hybrid pipeline purpose-built for this domain: (i) an evidence-gated LLM that extracts CSA terms to a normalized JSON (abstain-by-default, span-cited); (ii) a quantum-inspired explorer that interleaves simulated annealing with micro higher-order QAOA (HO-QAOA) on binding sub-QUBOs (subset size n <= 16, order k <= 4) to coordinate multi-asset moves across caps and RA-induced discreteness; (iii) a weighted risk-aware objective (Movement, CVaR, funding-priced overshoot) with an explicit coverage window U <= Reff+B; and (iv) CP-SAT as single arbiter to certify feasibility and gaps, including a U-cap pre-check that reports the minimal feasible buffer B*. Encoding caps/rounding as higher-order terms lets HO-QAOA target the domain couplings that defeat local swaps. On government bond datasets and multi-CSA inputs, the hybrid improves a strong classical baseline (BL-3) by 9.1%, 9.6%, and 10.7% across representative harnesses, delivering better cost-movement-tail frontiers under governance settings. We release governance-grade artifacts (span citations, valuation matrix audit, weight provenance, QUBO manifests, and CP-SAT traces) to make results auditable and reproducible.
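The classical half of the explorer, simulated annealing over a small binding sub-QUBO, can be sketched in a few lines. This is a generic Metropolis annealer on a toy 3-variable QUBO, a stand-in for the paper's SA + micro HO-QAOA interleave; the matrix, step count, and cooling schedule are all illustrative assumptions.

```python
import math
import random

def qubo_energy(Q, x):
    """E(x) = sum_{i,j} Q[i][j] * x_i * x_j for a binary vector x."""
    n = len(x)
    return sum(Q[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

def anneal_subqubo(Q, steps=2000, T0=2.0, seed=0):
    """Single-bit-flip simulated annealing on a small sub-QUBO (n <= 16 in the paper)."""
    rng = random.Random(seed)
    n = len(Q)
    x = [rng.randint(0, 1) for _ in range(n)]
    e = qubo_energy(Q, x)
    best_x, best_e = x[:], e
    for t in range(steps):
        T = T0 * (1 - t / steps) + 1e-9   # linear cooling schedule
        i = rng.randrange(n)
        x[i] ^= 1                          # propose a single-bit flip
        e_new = qubo_energy(Q, x)
        if e_new <= e or rng.random() < math.exp((e - e_new) / T):
            e = e_new                      # accept (always downhill, Metropolis uphill)
            if e < best_e:
                best_x, best_e = x[:], e
        else:
            x[i] ^= 1                      # reject: undo the flip
    return best_x, best_e

# Toy QUBO whose unique minimum is x = (1, 1, 0) with energy -6.
Q = [[-2, -1, 0],
     [-1, -2, 0],
     [0,   0, 1]]
x, e = anneal_subqubo(Q)
print(x, e)
```

In the paper's pipeline this local search is interleaved with HO-QAOA passes so that higher-order cap/rounding terms can coordinate multi-bit moves that single flips cannot reach.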


Recommending Clinical Trials for Online Patient Cases using Artificial Intelligence

Chan, Joey, Jin, Qiao, Wan, Nicholas, Floudas, Charalampos S., Xue, Elisabetta, Lu, Zhiyong

arXiv.org Artificial Intelligence

Clinical trials are crucial for assessing new treatments; however, recruitment challenges - such as limited awareness, complex eligibility criteria, and referral barriers - hinder their success. With the growth of online platforms, patients increasingly turn to social media and health communities for support, research, and advocacy, expanding recruitment pools beyond established enrollment pathways. Recognizing this potential, we utilized TrialGPT, a framework that leverages a large language model (LLM) as its backbone, to match 50 online patient cases (collected from published case reports and a social media website) to clinical trials and evaluate performance against traditional keyword-based searches. Our results show that TrialGPT outperforms traditional methods by 46% in identifying eligible trials, with each patient, on average, being eligible for around 7 trials. Additionally, our outreach efforts to case authors and trial organizers regarding these patient-trial matches yielded highly positive feedback, which we present from both perspectives.


Program Synthesis Dialog Agents for Interactive Decision-Making

Toles, Matthew, Balwani, Nikhil, Singh, Rattandeep, Rodriguez, Valentina Giulia Sartori, Yu, Zhou

arXiv.org Artificial Intelligence

Many real-world eligibility problems, ranging from medical diagnosis to tax planning, can be mapped to decision problems expressed in natural language, wherein a model must make a binary choice based on user features. Large-scale domains such as legal codes or frequently updated funding opportunities render human annotation (e.g., web forms or decision trees) impractical, highlighting the need for agents that can automatically assist in decision-making. Since relevant information is often only known to the user, it is crucial that these agents ask the right questions. As agents determine when to terminate a conversation, they face a trade-off between accuracy and the number of questions asked, a key metric for both user experience and cost. To evaluate this task, we propose BeNYfits, a new benchmark for determining user eligibility for multiple overlapping social benefits opportunities through interactive decision-making. Our experiments show that current language models struggle with frequent hallucinations, with GPT-4o scoring only 35.7 F1 using a ReAct-style chain-of-thought. To address this, we introduce ProADA, a novel approach that leverages program synthesis to assist in decision-making by mapping dialog planning to a code generation problem and using gaps in structured data to determine the best next action. Our agent, ProADA, improves the F1 score to 55.6 while maintaining nearly the same number of dialog turns.
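The "gaps in structured data determine the best next action" idea can be sketched concretely: eligibility rules are ordinary functions over user features, and the first missing field a rule trips over becomes the next question. The benefit names, rules, and questions below are invented for illustration, not part of the BeNYfits benchmark.

```python
# Toy benefit rules over a user-feature dict (names invented for illustration).
RULES = {
    "childcare_voucher": lambda u: u["income"] < 40000 and u["num_children"] > 0,
    "senior_meals": lambda u: u["age"] >= 60,
}

QUESTIONS = {
    "income": "What is your annual household income?",
    "num_children": "How many children live with you?",
    "age": "How old are you?",
}

def next_action(user: dict):
    """Return a final decision if all needed fields are known, else the next question.

    A KeyError from a rule identifies exactly which structured field is missing,
    so the dialog only asks questions that change the outcome.
    """
    for rule in RULES.values():
        try:
            rule(user)
        except KeyError as missing:
            return ("ask", QUESTIONS[missing.args[0]])
    return ("decide", {name: rule(user) for name, rule in RULES.items()})

kind, payload = next_action({"income": 30000})
print(kind, payload)  # the income rule succeeds partway, then trips on num_children
```

Because the planner is generated code rather than free-form LLM output, termination and question selection are deterministic, which is the property the paper credits for reducing hallucinations.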


Getting in the Door: Streamlining Intake in Civil Legal Services with Large Language Models

Steenhuis, Quinten, Westermann, Hannes

arXiv.org Artificial Intelligence

Legal intake, the process of finding out if an applicant is eligible for help from a free legal aid program, takes significant time and resources. In part this is because eligibility criteria are nuanced, open-textured, and require frequent revision as grants start and end. In this paper, we investigate the use of large language models (LLMs) to reduce this burden. We describe a digital intake platform that combines logical rules with LLMs to offer eligibility recommendations, and we evaluate the ability of 8 different LLMs to perform this task. We find promising results for this approach to help close the access to justice gap, with the best model reaching an F1 score of 0.82, while minimizing false negatives.
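The rules-plus-LLM split the abstract describes can be sketched as follows: bright-line criteria stay as deterministic rules, and only open-textured judgment is deferred to a model. The field names, thresholds, and `llm_judge` callable are illustrative assumptions, not the platform's actual interface.

```python
def recommend_eligibility(applicant: dict, llm_judge=None) -> str:
    """Hard logical rules first; nuanced, open-textured criteria go to an LLM.

    `llm_judge` is a stand-in callable for one of the 8 evaluated models.
    """
    # Deterministic screens: bright-line criteria remain ordinary rules,
    # which are cheap to audit and to revise as grants start and end.
    if applicant["income"] > 2 * applicant.get("poverty_line", 15000):
        return "ineligible"
    if applicant.get("case_type") not in {"housing", "benefits", "family"}:
        return "ineligible"
    # Open-textured judgment (e.g. "emergency circumstances") is deferred.
    if llm_judge is not None:
        return llm_judge(applicant["narrative"])
    # Bias toward human review rather than auto-rejection, to minimize
    # false negatives as the paper emphasizes.
    return "needs_review"

result = recommend_eligibility(
    {"income": 20000, "case_type": "housing", "narrative": "facing eviction"})
print(result)  # needs_review (no LLM wired in)
```

Routing only the genuinely fuzzy criteria to the LLM keeps most decisions explainable while still handling the nuance that defeats pure rule systems.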


Controlled LLM-based Reasoning for Clinical Trial Retrieval

Jullien, Mael, Bogatu, Alex, Unsworth, Harriet, Freitas, Andre

arXiv.org Artificial Intelligence

Matching patients to clinical trials demands a systematic and reasoned interpretation of documents which requires significant expert-level background knowledge, over a complex set of well-defined eligibility criteria. Moreover, this interpretation process needs to operate at scale, over vast knowledge bases of trials. In this paper, we propose a scalable method that extends the capabilities of LLMs in the direction of systematizing the reasoning over sets of medical eligibility criteria, evaluating it in the context of real-world cases. The proposed method overlays a set-guided reasoning layer on LLMs. The proposed framework is evaluated on TREC 2022 Clinical Trials, achieving results superior to the state-of-the-art: NDCG@10 of 0.693 and Precision@10 of 0.73.
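One plausible reading of set-guided reasoning is: let the LLM judge each criterion independently, then combine the verdicts with plain set algebra so the final step is systematic and auditable. The `classify` stub and the sample criteria below are invented for illustration; the real system would make an LLM call per criterion.

```python
def classify(criterion: str, patient_note: str) -> bool:
    """Stub for a per-criterion LLM call: does the patient satisfy `criterion`?

    A real implementation would prompt a model; here we use a toy
    keyword heuristic so the sketch is runnable.
    """
    return criterion.split()[0].lower() in patient_note.lower()

def match(patient_note: str, inclusion: set, exclusion: set):
    """Combine per-criterion verdicts with set operations."""
    met = {c for c in inclusion if classify(c, patient_note)}
    hit = {c for c in exclusion if classify(c, patient_note)}
    eligible = (met == inclusion) and not hit
    # Return the verdict plus the unmet inclusions and tripped exclusions,
    # so the reasoning behind the verdict is fully inspectable.
    return eligible, inclusion - met, hit

ok, unmet, tripped = match(
    "adult patient with diabetes, no recent chemotherapy noted",
    inclusion={"diabetes diagnosis", "adult age"},
    exclusion={"pregnancy"},
)
print(ok, unmet, tripped)
```

Keeping the combination step outside the LLM means a wrong verdict can always be traced to one specific criterion judgment rather than an opaque end-to-end answer.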


DABL: Detecting Semantic Anomalies in Business Processes Using Large Language Models

Guan, Wei, Cao, Jian, Gao, Jianqi, Zhao, Haiyan, Qian, Shiyou

arXiv.org Artificial Intelligence

Detecting anomalies in business processes is crucial for ensuring operational success. While many existing methods rely on statistical frequency to detect anomalies, it's important to note that infrequent behavior doesn't necessarily imply undesirability. To address this challenge, detecting anomalies from a semantic viewpoint proves to be a more effective approach. However, current semantic anomaly detection methods treat a trace (i.e., process instance) as multiple event pairs, disrupting long-distance dependencies. In this paper, we introduce DABL, a novel approach for detecting semantic anomalies in business processes using large language models (LLMs). We collect 143,137 real-world process models from various domains. By generating normal traces through the playout of these process models and simulating both ordering and exclusion anomalies, we fine-tune Llama 2 using the resulting log. Through extensive experiments, we demonstrate that DABL surpasses existing state-of-the-art semantic anomaly detection methods in terms of both generalization ability and learning of given processes. Users can directly apply DABL to detect semantic anomalies in their own datasets without the need for additional training. Furthermore, DABL offers the capability to interpret the causes of anomalies in natural language, providing valuable insights into the detected anomalies.
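The two simulated anomaly types used to build DABL's fine-tuning log can be sketched directly. This is a simplified stand-in for the paper's playout-based generation: an ordering anomaly perturbs event order, and an exclusion anomaly skips an event; the sample trace is invented.

```python
import random

def ordering_anomaly(trace, rng):
    """Swap two adjacent events so the trace violates the model's ordering constraints."""
    t = list(trace)
    i = rng.randrange(len(t) - 1)
    t[i], t[i + 1] = t[i + 1], t[i]
    return t

def exclusion_anomaly(trace, rng):
    """Drop one event, as if a mandatory step were skipped."""
    t = list(trace)
    del t[rng.randrange(len(t))]
    return t

normal = ["register", "check_stock", "ship", "invoice", "archive"]
rng = random.Random(0)
print(ordering_anomaly(normal, rng))  # same events, one adjacent pair swapped
print(exclusion_anomaly(normal, rng))  # one event missing
```

Pairing such corrupted traces with their normal counterparts yields labeled data for fine-tuning, and because whole traces (not event pairs) are shown to the model, long-distance dependencies are preserved.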


In India an algorithm declares them dead; they have to prove they're alive

Al Jazeera

This story was produced with support from the Pulitzer Center's AI Accountability Network. Rohtak and New Delhi, India: Dhuli Chand was 102 years old on September 8, 2022, when he led a wedding procession in Rohtak, a district town in the north Indian state of Haryana. As is customary in north Indian weddings, he sat on a chariot in his wedding finery, wearing garlands of Indian rupee notes, while a band played celebratory music and family members and villagers accompanied him. But instead of a bride, Chand was on his way to meet government officials. Chand resorted to the antic to prove to officials that he was not only alive but also lively.


How an algorithm denied food to thousands of poor in India's Telangana

Al Jazeera

This story was produced with support from the Pulitzer Center's AI Accountability Network. Hyderabad and New Delhi, India – Bismillah Bee can't conceive of owning a car. The 67-year-old widow and 12 members of her family live in a cramped three-room house in an urban slum in Hyderabad, the capital of the Indian state of Telangana. Since her rickshaw puller husband's death two years ago of mouth cancer, Bee makes a living by peeling garlic for a local business. But an algorithmic system, which the Telangana government deploys to digitally profile its more than 30 million residents, tagged Bee's husband as a car owner in 2021, when he was still alive.


Distilling Large Language Models for Matching Patients to Clinical Trials

Nievas, Mauro, Basu, Aditya, Wang, Yanshan, Singh, Hrituraj

arXiv.org Artificial Intelligence

The recent success of large language models (LLMs) has paved the way for their adoption in the high-stakes domain of healthcare. Specifically, the application of LLMs in patient-trial matching, which involves assessing patient eligibility against clinical trials' nuanced inclusion and exclusion criteria, has shown promise. Recent research has shown that GPT-3.5, a widely recognized LLM developed by OpenAI, can outperform existing methods with minimal 'variable engineering' by simply comparing clinical trial information against patient summaries. However, there are significant challenges associated with using closed-source proprietary LLMs like GPT-3.5 in practical healthcare applications, such as cost, privacy and reproducibility concerns. To address these issues, this study presents the first systematic examination of the efficacy of both proprietary (GPT-3.5 and GPT-4) and open-source LLMs (LLAMA 7B, 13B, and 70B) for the task of patient-trial matching. Employing a multifaceted evaluation framework, we conducted extensive automated and human-centric assessments coupled with a detailed error analysis for each model. To enhance the adaptability of open-source LLMs, we have created a specialized synthetic dataset utilizing GPT-4, enabling effective fine-tuning under constrained data conditions. Our findings reveal that open-source LLMs, when fine-tuned on this limited and synthetic dataset, demonstrate performance parity with their proprietary counterparts. This presents a massive opportunity for their deployment in real-world healthcare applications. To foster further research and applications in this field, we release both the annotated evaluation dataset along with the fine-tuned LLM -- Trial-LLAMA -- for public use.