Support Vector Machines
Experiments with Optimal Model Trees
Roselli, Sabino Francesco, Frank, Eibe
Model trees provide an appealing way to perform interpretable machine learning for both classification and regression problems. In contrast to ``classic'' decision trees with constant values in their leaves, model trees can use linear combinations of predictor variables in their leaf nodes to form predictions, which can help achieve higher accuracy and smaller trees. Typical algorithms for learning model trees from training data work in a greedy fashion, growing the tree in a top-down manner by recursively splitting the data into smaller and smaller subsets. Crucially, the selected splits are only locally optimal, potentially rendering the tree overly complex and less accurate than a tree whose structure is globally optimal for the training data. In this paper, we empirically investigate the effect of constructing globally optimal model trees for classification and regression with linear support vector machines at the leaf nodes. To this end, we present mixed-integer linear programming formulations to learn optimal trees, compute such trees for a large collection of benchmark data sets, and compare their performance against greedily grown model trees in terms of interpretability and accuracy. We also compare to classic optimal and greedily grown decision trees, random forests, and support vector machines. Our results show that optimal model trees can achieve competitive accuracy with very small trees. We also investigate the effect on the accuracy of replacing axis-parallel splits with multivariate ones, foregoing interpretability while potentially obtaining greater accuracy.
Designing and Deploying AI Models for Sustainable Logistics Optimization: A Case Study on Eco-Efficient Supply Chains in the USA
Shawon, Reza E Rabbi, Hasan, MD Rokibul, Rahman, Md Anisur, Ghandri, Mohamed, Lamari, Iman Ahmed, Kawsar, Mohammed, Akter, Rubi
The rapid evolution of Artificial Intelligence (AI) and Machine Learning (ML) has significantly transformed logistics and supply chain management, particularly in the pursuit of sustainability and eco-efficiency. This study explores AI-based methodologies for optimizing logistics operations in the USA, focusing on reducing environmental impact, improving fuel efficiency, and minimizing costs. Key AI applications include predictive analytics for demand forecasting, route optimization through machine learning, and AI-powered fuel efficiency strategies. Various models, such as Linear Regression, XGBoost, Support Vector Machine, and Neural Networks, are applied to real-world logistics datasets to reduce carbon emissions based on logistics operations, optimize travel routes to minimize distance and travel time, and predict future deliveries to plan optimal routes. Other models such as K-Means and DBSCAN are also used to optimize travel routes to minimize distance and travel time for logistics operations. This study utilizes datasets from logistics companies' databases. The study also assesses model performance using metrics such as mean absolute error (MAE), mean squared error (MSE), and R2 score. This study also explores how these models can be deployed to various platforms for real-time logistics and supply chain use. The models are also examined through a thorough case study, highlighting best practices and regulatory frameworks that promote sustainability. The findings demonstrate AI's potential to enhance logistics efficiency, reduce carbon footprints, and contribute to a more resilient and adaptive supply chain ecosystem.
Further Exploration of Precise Binding Energies from Physics Informed Machine Learning and the Development of a Practical Ensemble Model
Bentley, I., Tedder, J., Gebran, M., Paul, A.
Sixteen new physics informed machine learning models have been trained on binding energy residuals from modern mass models that leverage shape parameters and other physical features. The models have been trained on a subset of AME 2012 data and have been verified with a subset of the AME 2020 data. Among the machine learning approaches tested in this work, the preferred approach is the least squares boosted ensemble of trees which appears to have a superior ability to both interpolate and extrapolate binding energy residuals. The machine learning models for four mass models created from the ensemble of trees approach have been combined to create a composite model called the Four Model Tree Ensemble (FMTE). The FMTE model predicts binding energy values from AME 2020 with a standard deviation of 76 keV and a mean deviation of 34 keV for all nuclei with N > 7 and Z > 7. A comparison with new mass measurements for 33 isotopes not included in AME 2012 or AME 2020 indicates that the FMTE performs better than all mass models that were tested.
cantnlp@DravidianLangTech2025: A Bag-of-Sounds Approach to Multimodal Hate Speech Detection
This paper presents the systems and results for the Multimodal Social Media Data Analysis in Dravidian Languages (MSMDA-DL) shared task at the Fifth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages (DravidianLangTech-2025). We took a `bag-of-sounds' approach by training our hate speech detection system on the speech (audio) data using transformed Mel spectrogram measures. While our candidate model performed poorly on the test set, our approach offered promising results during training and development for Malayalam and Tamil. With sufficient and well-balanced training data, our results show that it is feasible to use both text and speech (audio) data in the development of multimodal hate speech detection systems.
Quantum-Assisted Support Vector Regression
Dalal, Archismita, Bagherimehrab, Mohsen, Sanders, Barry C.
A popular machine-learning model for regression tasks, including stock-market prediction, weather forecasting and real-estate pricing, is the classical support vector regression (SVR). However, a practically realisable quantum SVR remains to be formulated. We devise annealing-based algorithms, namely simulated and quantum-classical hybrid, for training two SVR models and compare their empirical performances against the SVR implementation of Python's scikit-learn package for facial-landmark detection (FLD), a particular use case for SVR. Our method is to derive a quadratic-unconstrained-binary formulation for the optimisation problem used for training a SVR model and solve this problem using annealing. Using D-Wave's hybrid solver, we construct a quantum-assisted SVR model, thereby demonstrating a slight advantage over classical models regarding FLD accuracy. Furthermore, we observe that annealing-based SVR models predict landmarks with lower variances compared to the SVR models trained by gradient-based methods. Our work is a proof-of-concept example for applying quantum-assisted SVR to a supervised-learning task with a small training dataset.
Medifact at PerAnsSumm 2025: Leveraging Lightweight Models for Perspective-Specific Summarization of Clinical Q&A Forums
The PerAnsSumm 2025 challenge focuses on perspective-aware healthcare answer summarization (Agarwal et al., 2025). This work proposes a few-shot learning framework using a Snorkel-BART-SVM pipeline for classifying and summarizing open-ended healthcare community question-answering (CQA). An SVM model is trained with weak supervision via Snorkel, enhancing zero-shot learning. Extractive classification identifies perspective-relevant sentences, which are then summarized using a pretrained BART-CNN model. The approach achieved 12th place among 100 teams in the shared task, demonstrating computational efficiency and contextual accuracy. By leveraging pretrained summarization models, this work advances medical CQA research and contributes to clinical decision support systems.
Machine Learning-Based Model for Postoperative Stroke Prediction in Coronary Artery Disease
Pan, Haonan, Chen, Shuheng, Pishgar, Elham, Alaei, Kamiar, Placencia, Greg, Pishgar, Maryam
Coronary artery disease remains one of the leading causes of mortality globally. Despite advances in revascularization treatments like PCI and CABG, postoperative stroke is inevitable. This study aims to develop and evaluate a sophisticated machine learning prediction model to assess postoperative stroke risk in coronary revascularization patients.This research employed data from the MIMIC-IV database, consisting of a cohort of 7023 individuals. Study data included clinical, laboratory, and comorbidity variables. To reduce multicollinearity, variables with over 30% missing values and features with a correlation coefficient larger than 0.9 were deleted. The dataset has 70% training and 30% test. The Random Forest technique interpolated residual dataset missing values. Numerical values were normalized, whereas categorical variables were one-hot encoded. LASSO regularization selected features, and grid search found model hyperparameters. Finally, Logistic Regression, XGBoost, SVM, and CatBoost were employed for predictive modeling, and SHAP analysis assessed stroke risk for each variable. AUC of 0.855 (0.829-0.878) showed that SVM model outperformed logistic regression and CatBoost models in prior research. SHAP research showed that the Charlson Comorbidity Index (CCI), diabetes, chronic kidney disease, and heart failure are significant prognostic factors for postoperative stroke. This study shows that improved machine learning reduces overfitting and improves model predictive accuracy. Models using the CCI alone cannot predict postoperative stroke risk as accurately as those using independent comorbidity variables. The suggested technique provides a more thorough and individualized risk assessment by encompassing a wider range of clinically relevant characteristics, making it a better reference for preoperative risk assessments and targeted intervention.
Evaluating a Novel Neuroevolution and Neural Architecture Search System
Winter, Benjamin David, Teahan, William John
The choice of neural network features can have a large impact on both the accuracy and speed of the network. Despite the current industry shift towards large transformer models, specialized binary classifiers remain critical for numerous practical applications where computational efficiency and low latency are essential. Neural network features tend to be developed homogeneously, resulting in slower or less accurate networks when testing against multiple datasets. In this paper, we show the effectiveness of Neuvo NAS+ a novel Python implementation of an extended Neural Architecture Search (NAS+) which allows the user to optimise the training parameters of a network as well as the network's architecture. We provide an in-depth analysis of the importance of catering a network's architecture to each dataset. We also describe the design of the Neuvo NAS+ system that selects network features on a task-specific basis including network training hyper-parameters such as the number of epochs and batch size. Results show that the Neuvo NAS+ task-specific approach significantly outperforms several machine learning approaches such as Naive Bayes, C4.5, Support Vector Machine and a standard Artificial Neural Network for solving a range of binary classification problems in terms of accuracy. Our experiments demonstrate substantial diversity in evolved network architectures across different datasets, confirming the value of task-specific optimization. Additionally, Neuvo NAS+ outperforms other evolutionary algorithm optimisers in terms of both accuracy and computational efficiency, showing that properly optimized binary classifiers can match or exceed the performance of more complex models while requiring significantly fewer computational resources.
Predicting Treatment Response in Body Dysmorphic Disorder with Interpretable Machine Learning
Costilla-Reyes, Omar, Talbot, Morgan
Body Dysmorphic Disorder (BDD) is a highly prevalent and frequently underdiagnosed condition characterized by persistent, intrusive preoccupations with perceived defects in physical appearance. In this extended analysis, we employ multiple machine learning approaches to predict treatment outcomes -- specifically treatment response and remission -- with an emphasis on interpretability to ensure clinical relevance and utility. Across the various models investigated, treatment credibility emerged as the most potent predictor, surpassing traditional markers such as baseline symptom severity or comorbid conditions. Notably, while simpler models (e.g., logistic regression and support vector machines) achieved competitive predictive performance, decision tree analyses provided unique insights by revealing clinically interpretable threshold values in credibility scores. These thresholds can serve as practical guideposts for clinicians when tailoring interventions or allocating treatment resources. We further contextualize our findings within the broader literature on BDD, addressing technology-based therapeutics, digital interventions, and the psychosocial determinants of treatment engagement. An extensive array of references situates our results within current research on BDD prevalence, suicidality risks, and digital innovation. Our work underscores the potential of integrating rigorous statistical methodologies with transparent machine learning models. By systematically identifying modifiable predictors -- such as treatment credibility -- we propose a pathway toward more targeted, personalized, and ultimately efficacious interventions for individuals with BDD.
GBSVR: Granular Ball Support Vector Regression
Rastogi, Reshma, Bisht, Ankush, Kumar, Sanjay, Chandra, Suresh
Support Vector Regression (SVR) and its variants are widely used to handle regression tasks, however, since their solution involves solving an expensive quadratic programming problem, it limits its application, especially when dealing with large datasets. Additionally, SVR uses an epsilon-insensitive loss function which is sensitive to outliers and therefore can adversely affect its performance. We propose Granular Ball Support Vector Regression (GBSVR) to tackle problem of regression by using granular ball concept. These balls are useful in simplifying complex data spaces for machine learning tasks, however, to the best of our knowledge, they have not been sufficiently explored for regression problems. Granular balls group the data points into balls based on their proximity and reduce the computational cost in SVR by replacing the large number of data points with far fewer granular balls. This work also suggests a discretization method for continuous-valued attributes to facilitate the construction of granular balls. The effectiveness of the proposed approach is evaluated on several benchmark datasets and it outperforms existing state-of-the-art approaches