

Machine learning for fraud detection in digital banking: a systematic literature review

arXiv.org Artificial Intelligence

This systematic literature review examines the role of machine learning in fraud detection within digital banking, synthesizing evidence from 118 peer-reviewed studies and institutional reports. Following the PRISMA guidelines, the review applied a structured identification, screening, eligibility, and inclusion process to ensure methodological rigor and transparency. The findings reveal that supervised learning methods, such as decision trees, logistic regression, and support vector machines, remain the dominant paradigm due to their interpretability and established performance, while unsupervised anomaly detection approaches are increasingly adopted to address novel fraud patterns in highly imbalanced datasets. Deep learning architectures, particularly recurrent and convolutional neural networks, have emerged as transformative tools capable of modeling sequential transaction data and detecting complex fraud typologies, though challenges of interpretability and real-time deployment persist. Hybrid models that combine supervised, unsupervised, and deep learning strategies demonstrate superior adaptability and detection accuracy, highlighting their potential as convergent solutions.


A Weighted U Statistic for Genetic Association Analyses of Sequencing Data

arXiv.org Artificial Intelligence

With advancements in next generation sequencing technology, massive amounts of sequencing data are being generated, offering a great opportunity to comprehensively investigate the role of rare variants in the genetic etiology of complex diseases. Nevertheless, this poses a great challenge for the statistical analysis of high-dimensional sequencing data. The association analyses based on traditional statistical methods suffer substantial power loss because of the low frequency of genetic variants and the extremely high dimensionality of the data. We developed a weighted U statistic, referred to as WU-SEQ, for the high-dimensional association analysis of sequencing data. Based on a non-parametric U statistic, WU-SEQ makes no assumption of the underlying disease model and phenotype distribution, and can be applied to a variety of phenotypes. Through simulation studies and an empirical study, we showed that WU-SEQ outperformed the commonly used SKAT method when the underlying assumptions were violated (e.g., the phenotype followed a heavy-tailed distribution). Even when the assumptions were satisfied, WU-SEQ still attained comparable performance to SKAT. Finally, we applied WU-SEQ to sequencing data from the Dallas Heart Study (DHS), and detected an association between ANGPTL4 and very low density lipoprotein cholesterol.
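To make the idea of a weighted U statistic concrete, here is a minimal sketch of a generic order-2 weighted U statistic. It is not the WU-SEQ definition from the paper: the Gaussian-style genetic similarity weight, the product kernel on phenotypes, and the function names `genetic_similarity` and `weighted_u` are all assumptions chosen for illustration.

```python
import math
from itertools import permutations


def genetic_similarity(g_i, g_j):
    """Hypothetical weight: exp(-squared Euclidean distance) between genotype vectors."""
    dist = sum((a - b) ** 2 for a, b in zip(g_i, g_j))
    return math.exp(-dist)


def weighted_u(phenotypes, genotypes):
    """Generic weighted U statistic: average of w_ij * h(y_i, y_j) over ordered pairs i != j.

    Here h(y_i, y_j) = y_i * y_j is an assumed phenotype kernel, not the one used by WU-SEQ.
    """
    n = len(phenotypes)
    if n < 2 or n != len(genotypes):
        raise ValueError("need at least two subjects and matching lengths")
    total = 0.0
    for i, j in permutations(range(n), 2):
        weight = genetic_similarity(genotypes[i], genotypes[j])
        total += weight * phenotypes[i] * phenotypes[j]
    return total / (n * (n - 1))


# Toy, made-up data: centered phenotypes and rare-variant genotype counts.
toy_phenotypes = [1.2, 0.9, -1.0, -1.1]
toy_genotypes = [[1, 0, 0], [1, 0, 0], [0, 0, 0], [0, 0, 0]]
print(weighted_u(toy_phenotypes, toy_genotypes))
```

In this toy setup the statistic grows when genetically similar subjects also have similar phenotypes, which is the intuition behind similarity-based association tests.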


Development of Application-Specific Large Language Models to Facilitate Research Ethics Review

arXiv.org Artificial Intelligence

Institutional review boards (IRBs) play a crucial role in ensuring the ethical conduct of human subjects research, but face challenges including inconsistency, delays, and inefficiencies. We propose the development and implementation of application-specific large language models (LLMs) to facilitate IRB review processes. These IRB-specific LLMs would be fine-tuned on IRB-specific literature and institutional datasets, and equipped with retrieval capabilities to access up-to-date, context-relevant information. We outline potential applications, including pre-review screening, preliminary analysis, consistency checking, and decision support. While addressing concerns about accuracy, context sensitivity, and human oversight, we acknowledge remaining challenges such as over-reliance on AI and the need for transparency. By enhancing the efficiency and quality of ethical review while maintaining human judgment in critical decisions, IRB-specific LLMs offer a promising tool to improve research oversight. We call for pilot studies to evaluate the feasibility and impact of this approach.


Toward Non-Invasive Diagnosis of Bankart Lesions with Deep Learning

arXiv.org Artificial Intelligence

Bankart lesions, or anterior-inferior glenoid labral tears, are diagnostically challenging on standard MRIs due to their subtle imaging features, often necessitating invasive MRI arthrograms (MRAs). This study develops deep learning (DL) models to detect Bankart lesions on both standard MRIs and MRAs, aiming to improve diagnostic accuracy and reduce reliance on MRAs. We curated a dataset of 586 shoulder MRIs (335 standard, 251 MRAs) from 558 patients who underwent arthroscopy. Ground truth labels were derived from intraoperative findings, the gold standard for Bankart lesion diagnosis. Separate DL models for MRAs and standard MRIs were trained using the Swin Transformer architecture, pre-trained on a public knee MRI dataset. Predictions from sagittal, axial, and coronal views were ensembled to optimize performance. The models were evaluated on a 20% hold-out test set (117 MRIs: 46 MRAs, 71 standard MRIs). Bankart lesions were identified in 31.9% of MRAs and 8.6% of standard MRIs. The models achieved AUCs of 0.87 (86% accuracy, 83% sensitivity, 86% specificity) and 0.90 (85% accuracy, 82% sensitivity, 86% specificity) on standard MRIs and MRAs, respectively. These results match or surpass radiologist performance on our dataset and reported literature metrics. Notably, our model's performance on non-invasive standard MRIs matched or surpassed the radiologists interpreting MRAs. This study demonstrates the feasibility of using DL to address the diagnostic challenges posed by subtle pathologies like Bankart lesions. Our models demonstrate potential to improve diagnostic confidence, reduce reliance on invasive imaging, and enhance accessibility to care.
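As an illustration of how per-view predictions can be ensembled and then scored, the sketch below averages sagittal, axial, and coronal probabilities and computes accuracy, sensitivity, and specificity. The abstract does not say how the views were combined, so simple averaging, the 0.5 threshold, the function names, and the toy values are all assumptions.

```python
from statistics import mean


def ensemble_probability(view_probs):
    """Average the per-view probabilities for one exam (assumed ensembling rule)."""
    return mean(view_probs.values())


def confusion_metrics(y_true, y_prob, threshold=0.5):
    """Accuracy, sensitivity, and specificity at an assumed decision threshold."""
    tp = fp = tn = fn = 0
    for truth, prob in zip(y_true, y_prob):
        pred = 1 if prob >= threshold else 0
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 1 and pred == 0:
            fn += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fp += 1
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else float("nan"),
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Toy, made-up exams: (surgical ground truth, per-view model probabilities).
toy_exams = [
    (1, {"sagittal": 0.9, "axial": 0.8, "coronal": 0.7}),
    (1, {"sagittal": 0.2, "axial": 0.4, "coronal": 0.3}),
    (0, {"sagittal": 0.1, "axial": 0.2, "coronal": 0.3}),
    (0, {"sagittal": 0.6, "axial": 0.7, "coronal": 0.8}),
]
toy_labels = [label for label, _ in toy_exams]
toy_scores = [ensemble_probability(views) for _, views in toy_exams]
print(confusion_metrics(toy_labels, toy_scores))
```

Because Bankart lesions are rare on standard MRIs (8.6% in this dataset), sensitivity and specificity are more informative than accuracy alone, which is why the abstract reports all three alongside the AUC.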


A Data Envelopment Analysis Approach for Assessing Fairness in Resource Allocation: Application to Kidney Exchange Programs

arXiv.org Artificial Intelligence

Kidney exchange programs have significantly increased transplantation rates but raise pressing questions about fairness in organ allocation. We present a novel framework leveraging Data Envelopment Analysis (DEA) to evaluate multiple fairness criteria--Priority, Access, and Outcome--within a single model, capturing complexities that may be overlooked in single-metric analyses. Using data from the United Network for Organ Sharing, we analyze these criteria individually, measuring Priority fairness through waitlist durations, Access fairness through Kidney Donor Profile Index scores, and Outcome fairness through graft lifespan. We then apply our DEA model to demonstrate significant disparities in kidney allocation efficiency across ethnic groups. To quantify uncertainty, we employ conformal prediction within the DEA framework, yielding group-conditional prediction intervals with finite sample coverage guarantees. Our findings show notable differences in efficiency distributions between ethnic groups. Our study provides a rigorous framework for evaluating fairness in complex resource allocation systems, where resource scarcity and mutual compatibility constraints exist. All code for using the proposed method and reproducing results is available on GitHub.


Semantic Scaling: Bayesian Ideal Point Estimates with Large Language Models

arXiv.org Artificial Intelligence

This paper introduces "Semantic Scaling," a novel method for ideal point estimation from text. I leverage large language models to classify documents based on their expressed stances and extract survey-like data. I then use item response theory to scale subjects from these data. Semantic Scaling significantly improves on existing text-based scaling methods, and allows researchers to explicitly define the ideological dimensions they measure. This represents the first scaling approach that allows such flexibility outside of survey instruments and opens new avenues of inquiry for populations difficult to survey. Additionally, it works with documents of varying length, and produces valid estimates of both mass and elite ideology. I demonstrate that the method can differentiate between policy preferences and in-group/out-group affect. Among the public, Semantic Scaling outperforms Tweetscores according to human judgement; in Congress, it recaptures the first dimension of DW-NOMINATE while allowing for greater flexibility in resolving construct validity challenges.


A robophysical model of spacetime dynamics

arXiv.org Artificial Intelligence

Systems consisting of spheres rolling on elastic membranes have been used to introduce a core conceptual idea of General Relativity (GR): how curvature guides the movement of matter. However, such schemes cannot accurately represent relativistic dynamics in the laboratory because of the dominance of dissipation and external gravitational fields. Here we demonstrate that an "active" object (a wheeled robot), which moves in a straight line on level ground and can alter its speed depending on the curvature of the deformable terrain it moves on, can exactly capture dynamics in curved relativistic spacetimes. Via the systematic study of the robot's dynamics in the radial and orbital directions, we develop a mapping of the emergent trajectories of a wheeled vehicle on a spandex membrane to the motion in a curved spacetime. Our mapping demonstrates how the driven robot's dynamics mix space and time in a metric, and shows how active particles do not necessarily follow geodesics in the real space but instead follow geodesics in a fiducial spacetime. The mapping further reveals how parameters such as the membrane elasticity and instantaneous speed allow the programming of a desired spacetime, such as the Schwarzschild metric near a non-rotating black hole. Our mapping and framework facilitate creation of a robophysical analog to a general relativistic system in the laboratory at low cost that can provide insights into active matter in deformable environments and robot exploration in complex landscapes.


Natural Language Processing for Policymaking

arXiv.org Artificial Intelligence

Language is an important form of data in politics. Constituents express their stances and needs in text such as social media and survey responses. Politicians conduct campaigns through debates, statements of policy positions, and social media. Government staff needs to compile information from various documents to assist in decision-making. Textual data is also prevalent through the documents and debates in the legislation process, negotiations and treaties to resolve international conflicts, and media such as news reports, social media, party platforms, and manifestos. Natural language processing (NLP) is the study of computational methods to automatically analyze text and extract meaningful information for subsequent analysis. The importance of NLP for policymaking has been highlighted since the last century (Gigley, 1993).


Machine learning models predict hepatocellular carcinoma treatment response

#artificialintelligence

Leesburg, VA, August 17, 2022--According to ARRS' American Journal of Roentgenology (AJR), machine learning models applied to presently underutilized imaging features could help construct more reliable criteria for organ allocation and liver transplant eligibility. "The findings suggest that machine learning-based models can predict recurrence before therapy allocation in patients with early-stage hepatocellular carcinoma (HCC) initially eligible for liver transplant," wrote corresponding author Julius Chapiro from the department of radiology and biomedical imaging at Yale University School of Medicine in New Haven, CT. Chapiro and colleagues' proof-of-concept study included 120 patients (88 men, 32 women; median age, 60 years) diagnosed with early-stage HCC between June 2005 and March 2018, who were initially eligible for liver transplant and underwent treatment by transplant, resection, or thermal ablation. Patients underwent pretreatment MRI and posttreatment imaging surveillance, and imaging features were extracted from postcontrast phases of pretreatment MRI examinations using a pretrained convolutional neural network (VGG-16). Pretreatment clinical characteristics (including laboratory data) and extracted imaging features were integrated to develop three ML models--clinical, imaging, combined--for recurrence prediction within 1–6 years posttreatment. Ultimately, all three models predicted posttreatment recurrence for early-stage HCC from pretreatment clinical (AUC 0.60–0.78,


Deep learning, subtraction technique optimal for coronary stent evaluation by CTA

#artificialintelligence

Both readers provided a diagnosis of in-stent restenosis only for subtraction HIR and subtraction DLR. Diagnostic confidence score for the four methods for reader 1 was 2, 2, 3, and 4, respectively, and for reader 2 was 1, 2, 3, and 3, respectively. Patient subsequently underwent invasive catheter angiography. Fluoroscopic images obtained (E) before and (F) after contrast media injection demonstrate in-stent restenosis of proximal aspect of stent (arrow). Leesburg, VA, August 10, 2022--According to ARRS' American Journal of Roentgenology (AJR), the combination of deep-learning reconstruction (DLR) and a subtraction technique yielded optimal diagnostic performance for the detection of in-stent restenosis by coronary CTA.