 Support Vector Machines


Fault Diagnosis of 3D-Printed Scaled Wind Turbine Blades

arXiv.org Artificial Intelligence

This study presents an integrated methodology for fault detection in wind turbine blades using 3D-printed scaled models, finite element simulations, experimental modal analysis, and machine learning techniques. A scaled model of the NREL 5MW blade was fabricated using 3D printing, and crack-type damages were introduced at critical locations. Finite Element Analysis was employed to predict the impact of these damages on the natural frequencies, with the results validated through controlled hammer impact tests. Vibration data was processed to extract both time-domain and frequency-domain features, and key discriminative variables were identified using statistical analyses (ANOVA). Machine learning classifiers, including Support Vector Machine and K-Nearest Neighbors, achieved classification accuracies exceeding 94%. The results revealed that vibration modes 3, 4, and 6 are particularly sensitive to structural anomalies for this blade. This integrated approach confirms the feasibility of combining numerical simulations with experimental validations and paves the way for structural health monitoring systems in wind energy applications.
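As a rough illustration of the time-domain feature extraction step, the sketch below computes a few statistics commonly used for vibration signals. The abstract does not list the features the authors extracted, so the choice of RMS, peak, kurtosis, and crest factor, the function name `time_domain_features`, and the synthetic decaying sinusoid are all assumptions made for illustration.

```python
import math


def time_domain_features(signal):
    """Compute a few common time-domain statistics of a vibration record.

    The feature set is an assumption; the study's actual features are not
    given in the abstract.
    """
    n = len(signal)
    mean = sum(signal) / n
    rms = math.sqrt(sum(x * x for x in signal) / n)
    peak = max(abs(x) for x in signal)
    variance = sum((x - mean) ** 2 for x in signal) / n
    fourth_moment = sum((x - mean) ** 4 for x in signal) / n
    # Non-excess kurtosis: fourth central moment over squared variance.
    kurtosis = fourth_moment / variance ** 2 if variance > 0 else 0.0
    # Crest factor: ratio of peak amplitude to RMS level.
    crest_factor = peak / rms if rms > 0 else 0.0
    return {
        "rms": rms,
        "peak": peak,
        "kurtosis": kurtosis,
        "crest_factor": crest_factor,
    }


# Synthetic decaying sinusoid standing in for a hammer-impact response.
signal = [math.exp(-0.01 * t) * math.sin(0.3 * t) for t in range(500)]
features = time_domain_features(signal)
```

Feature dictionaries like this one, computed per test and per damage state, could then be screened with ANOVA and passed to a classifier such as an SVM or KNN.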


Exploration of COVID-19 Discourse on Twitter: American Politician Edition

arXiv.org Artificial Intelligence

The advent of the COVID-19 pandemic has undoubtedly affected the political scene worldwide, and the introduction of new terminology and public opinions regarding the virus has further polarized partisan stances. Using a collection of tweets gathered from leading American political figures online (Republican and Democratic), we explored the partisan differences in approach, response, and attitude towards handling the international crisis. Bag-of-words, bigram, and TF-IDF models were used to identify and analyze keywords, topics, and overall sentiments from each party. Results suggest that Democrats are more concerned with the casualties of the pandemic and give more medical precautions and recommendations to the public, whereas Republicans are more invested in political responsibilities such as keeping the public updated through media and carefully watching the progress of the virus. We propose a systematic approach to predict and distinguish a tweet's political stance (left or right leaning) based on its COVID-19 related terms using different classification algorithms on different language models.
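To make the TF-IDF weighting mentioned above concrete, here is a minimal sketch in plain Python. It assumes whitespace tokenization and the basic formula tf × log(N / df); the abstract does not say which variant or library the authors used. The function name `tf_idf` and the toy sentences are invented for illustration and are not taken from the paper's data.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Weight = (term count / document length) * log(N / document frequency).
    This is the textbook form; library implementations often add smoothing.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: number of documents containing each term.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return weights


# Invented toy texts, for illustration only.
toy_docs = ["wear a mask", "watch the news", "wear a helmet"]
toy_weights = tf_idf(toy_docs)
```

Terms that appear in every document get weight zero, so the highest-weighted terms per group are the ones that distinguish it from the others.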


Risk Analysis and Design Against Adversarial Actions

arXiv.org Machine Learning

In particular, Theorem 5 applies when $\widetilde{A}_\delta = \{\delta\}$, i.e., when $\theta_{\widetilde{A}}$ is just a standard, non-robust solution. This is different from [56], whose main result is only applicable to solutions satisfying the infinitely many constraints $f(\theta, \delta) \le 0$, $\forall \delta \in A_{\delta_i}$, $i = 1, \ldots, N$, where $A_{\delta_i}$ is tuned to the Wasserstein bound. As previously noted, $R$ plays the role of a tunable parameter, and the result in Theorem 5 holds for any choice of the value of $R$. As a consequence, the user can adjust $R$ to optimize the bound on $\mathrm{Risk}(\theta_{\widetilde{A}})$ given in Theorem 5. As $R$ increases, $s_{A,\widetilde{A}}$ (and, thereby, $\varepsilon(s_{A,\widetilde{A}})$) tends to increase while $\mu/R$ diminishes. While the best compromise is difficult to foresee, one can experimentally try various choices $R_1 < R_2 < \cdots < R_i < \cdots < R_h$ and select the one giving the best result. The corresponding confidence level can be bounded as follows:

$$
\mathbb{P}^N\Big\{ D : \mathrm{Risk}(\theta_{\widetilde{A}}) > \varepsilon(s_{A,\widetilde{A},i}) + \frac{\mu}{R_i} \ \text{for at least one } i \in \{1, \ldots, h\} \Big\}
\le \sum_{i=1}^{h} \mathbb{P}^N\Big\{ D : \mathrm{Risk}(\theta_{\widetilde{A}}) > \varepsilon(s_{A,\widetilde{A},i}) + \frac{\mu}{R_i} \Big\}
\le \sum_{i=1}^{h} \beta = h\beta,
$$

from which

$$
\mathbb{P}^N\Big\{ D : \mathrm{Risk}(\theta_{\widetilde{A}}) \le \varepsilon(s_{A,\widetilde{A},i}) + \frac{\mu}{R_i} \ \text{for all } i = 1, \ldots, h \Big\} \ge 1 - h\beta.
$$
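The selection procedure described in this excerpt (try h values of R, keep the one with the smallest bound, and pay for it with confidence 1 − hβ) can be sketched numerically. The function names and all numbers below are hypothetical placeholders. In the paper the ε term comes from Theorem 5, which this excerpt does not reproduce, so here it is simply passed in as a list.

```python
def joint_confidence(beta, h):
    """Union-bound confidence that all h risk bounds hold at once: 1 - h*beta."""
    return max(0.0, 1.0 - h * beta)


def best_bound(eps_values, mu, radii):
    """Pick the index i minimizing eps_i + mu / R_i over the candidate radii.

    eps_values[i] stands in for the epsilon term obtained with radius
    radii[i]; how it is computed is outside this sketch.
    """
    bounds = [eps + mu / r for eps, r in zip(eps_values, radii)]
    i = min(range(len(bounds)), key=bounds.__getitem__)
    return i, bounds[i]


# Hypothetical numbers: epsilon grows with R while mu / R shrinks.
radii = [1.0, 2.0, 4.0]
eps_values = [0.05, 0.08, 0.15]
index, bound = best_bound(eps_values, mu=0.1, radii=radii)
confidence = joint_confidence(beta=0.001, h=len(radii))
```

Because the confidence degrades linearly in h, trying a handful of radii costs little when β is small.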


Multivariate Conformal Selection

arXiv.org Machine Learning

Selecting high-quality candidates from large datasets is critical in applications such as drug discovery, precision medicine, and alignment of large language models (LLMs). While Conformal Selection (CS) provides rigorous uncertainty quantification, it is limited to univariate responses and scalar criteria. To address this issue, we propose Multivariate Conformal Selection (mCS), a generalization of CS designed for multivariate response settings. Our method introduces regional monotonicity and employs multivariate nonconformity scores to construct conformal p-values, enabling finite-sample False Discovery Rate (FDR) control. We present two variants: mCS-dist, using distance-based scores, and mCS-learn, which learns optimal scores via differentiable optimization. Experiments on simulated and real-world datasets demonstrate that mCS significantly improves selection power while maintaining FDR control, establishing it as a robust framework for multivariate selection tasks.
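For orientation, the sketch below shows the generic univariate building blocks that the abstract alludes to: conformal p-values computed from calibration scores, followed by the Benjamini-Hochberg (BH) procedure at level alpha. It is not the mCS method. The convention that a larger nonconformity score means a more unusual point, the use of BH as the selection step, and the function names are assumptions; the multivariate scores and regional monotonicity of mCS are not modeled here.

```python
def conformal_p_values(calib_scores, test_scores):
    """Generic conformal p-value for each test score.

    p = (1 + number of calibration scores >= test score) / (n + 1),
    so an unusually large test score yields a small p-value.
    """
    n = len(calib_scores)
    return [
        (1 + sum(c >= t for c in calib_scores)) / (n + 1) for t in test_scores
    ]


def benjamini_hochberg(p_values, alpha):
    """Return sorted indices selected by the BH procedure at level alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda j: p_values[j])
    k = 0
    # Find the largest rank whose p-value is below its BH threshold.
    for rank, j in enumerate(order, start=1):
        if p_values[j] <= alpha * rank / m:
            k = rank
    return sorted(order[:k])


# Hypothetical scores, for illustration only.
p_vals = conformal_p_values([1.0, 2.0, 3.0], [2.5, 10.0])
selected = benjamini_hochberg([0.01, 0.02, 0.03, 0.5], alpha=0.05)
```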


Can a Quantum Support Vector Machine algorithm be utilized to identify Key Biomarkers from Multi-Omics data of COVID19 patients?

arXiv.org Artificial Intelligence

The unprecedented global COVID-19 pandemic has prompted researchers to investigate both the biochemical changes associated with acute infection and the long-term effects of COVID-19, with the goal of elucidating underlying mechanisms [1-4]. Among the diverse biochemical alterations observed in COVID-19, changes in metabolomic and proteomic profiles have drawn particular attention due to their roles in fundamental biological processes, including protein expression and metabolic pathways [5, 6]. Early in the pandemic, several studies highlighted the significance of certain biomarkers for diagnosing COVID-19 and assessing disease severity [7, 8]. These initial findings revealed that specific biomarkers are involved in COVID-19 pathogenesis and correlate with disease severity. Subsequent research into post-acute sequelae of COVID-19 (PASC, or long COVID) has further shown that variations in these biomarkers are associated with neurological and respiratory complications [9, 10]. Collectively, these studies highlight the importance of identifying key biomarkers to support both acute COVID-19 detection and the understanding of long COVID.


Real-Time Sleepiness Detection for Driver State Monitoring System

arXiv.org Artificial Intelligence

A driver face monitoring system can use computer vision techniques to detect driver fatigue, which is an important factor in a large number of accidents. In this paper we present a real-time technique for driver eye state detection. First, the face is detected, and the eyes are located inside the face region for tracking. An online dynamic template matching technique based on normalized cross-correlation, combined with Kalman filter tracking, is proposed to track the detected eye positions in subsequent image frames. A support vector machine with histogram of oriented gradients (HOG) features is used to classify the state of the eyes as open or closed. If the eyes are detected as closed for a specified amount of time, the driver is considered to be sleeping and an alarm is generated.
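The final decision rule (raise an alarm once the eyes have stayed closed for a specified time) can be sketched as a run-length check over per-frame classifier outputs. The function name, the frame rate, and the threshold below are hypothetical; the abstract does not give the actual values used.

```python
def drowsiness_alarm(eye_closed_flags, fps, max_closed_seconds):
    """Return the index of the first frame at which the alarm fires, or None.

    eye_closed_flags holds one boolean per frame (True = eyes classified as
    closed). The alarm fires once the eyes have been closed for at least
    max_closed_seconds of consecutive frames.
    """
    # Number of consecutive closed frames that triggers the alarm (at least 1).
    limit = max(1, int(max_closed_seconds * fps))
    run = 0
    for i, closed in enumerate(eye_closed_flags):
        run = run + 1 if closed else 0  # reset the count on any open frame
        if run >= limit:
            return i
    return None


# Hypothetical per-frame classifier outputs at 2 frames per second.
flags = [False, True, True, False, True, True, True, True]
alarm_frame = drowsiness_alarm(flags, fps=2, max_closed_seconds=1.5)
```

Resetting the counter on every open frame means ordinary blinks do not accumulate toward the threshold.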


A computational framework for longitudinal medication adherence prediction in breast cancer survivors: A social cognitive theory based approach

arXiv.org Artificial Intelligence

Non-adherence to medications is a critical concern since nearly half of patients with chronic illnesses do not follow their prescribed medication regimens, leading to increased mortality, costs, and preventable human distress. Amongst stage 0-3 breast cancer survivors, adherence to long-term adjuvant endocrine therapy (i.e., Tamoxifen and aromatase inhibitors) is associated with a significant increase in recurrence-free survival. This work aims to develop multi-scale models of medication adherence to understand the significance of different factors influencing adherence across varying time frames. We introduce a computational framework guided by Social Cognitive Theory for multi-scale (daily and weekly) modeling of longitudinal medication adherence. Our models employ both dynamic medication-taking patterns in the recent past (dynamic factors) as well as less frequently changing factors (static factors) for adherence prediction. Additionally, we assess the significance of various factors in influencing adherence behavior across different time scales. Our models outperform traditional machine learning counterparts in both daily and weekly tasks in terms of both accuracy and specificity. Daily models achieved an accuracy of 87.25%, and weekly models, an accuracy of 76.04%. Notably, dynamic past medication-taking patterns prove most valuable for predicting daily adherence, while a combination of dynamic and static factors is significant for macro-level weekly adherence patterns.


Accelerating Clinical NLP at Scale with a Hybrid Framework with Reduced GPU Demands: A Case Study in Dementia Identification

arXiv.org Artificial Intelligence

Clinical natural language processing (NLP) is increasingly in demand in both clinical research and operational practice. However, most of the state-of-the-art solutions are transformers-based and require high computational resources, limiting their accessibility. We propose a hybrid NLP framework that integrates rule-based filtering, a Support Vector Machine (SVM) classifier, and a BERT-based model to improve efficiency while maintaining accuracy. We applied this framework in a dementia identification case study involving 4.9 million veterans with incident hypertension, analyzing 2.1 billion clinical notes. At the patient level, our method achieved a precision of 0.90, a recall of 0.84, and an F1-score of 0.87. Additionally, this NLP approach identified over three times as many dementia cases as structured data methods. All processing was completed in approximately two weeks using a single machine with dual A40 GPUs. This study demonstrates the feasibility of hybrid NLP solutions for large-scale clinical text analysis, making state-of-the-art methods more accessible to healthcare organizations with limited computational resources.
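As a quick arithmetic check, the reported F1-score is the harmonic mean of the reported precision and recall. The helper name `f1_score` below is chosen for illustration.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Patient-level precision and recall as reported in the abstract.
f1 = f1_score(0.90, 0.84)
print(round(f1, 2))  # 0.87
```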


Specialized text classification: an approach to classifying Open Banking transactions

arXiv.org Artificial Intelligence

Duc Tuyen Ta, Wajdi Ben Saad, Ji Young Oh (Data Science Team, Oney Bank, France). Abstract: With the introduction of the PSD2 regulation in the EU, which established the Open Banking framework, a new window of opportunities has opened for banks and fintechs to explore and enrich bank transaction descriptions with the aim of building a better understanding of customer behavior, while using this understanding to prevent fraud, reduce risks, and offer more competitive and tailored services. Although the use of natural language processing models and techniques has seen incredible progress in various applications and domains over the past few years, custom applications based on domain-specific text corpora remain unaddressed, especially in the banking sector. In this paper, we introduce a language-based Open Banking transaction classification system with a focus on the French market and French-language text. The system encompasses data collection, labeling, preprocessing, modeling, and evaluation stages. Unlike previous studies that focus on general classification approaches, this system is specifically tailored to address the challenges posed by training a language model with a specialized text corpus (banking data in the French context).


SPreV

arXiv.org Machine Learning

SPREV, short for hyperSphere Reduced to two-dimensional Regular Polygon for Visualisation, is a novel dimensionality reduction technique developed to address the challenges of reducing dimensions and visualizing labeled datasets that exhibit a unique combination of three characteristics: small class size, high dimensionality, and low sample size. SPREV is designed not only to uncover but also to visually represent hidden patterns within such datasets. Its distinctive integration of geometric principles, adapted for discrete computational environments, makes it an indispensable tool in the modern data science toolkit, enabling users to identify trends, extract insights, and navigate complex data efficiently and effectively.