Support Vector Machines
Kernel Approximation using Analog In-Memory Computing
Büchel, Julian, Camposampiero, Giacomo, Vasilopoulos, Athanasios, Lammie, Corey, Gallo, Manuel Le, Rahimi, Abbas, Sebastian, Abu
Kernel functions are vital ingredients of several machine learning algorithms, but often incur significant memory and computational costs. We introduce an approach to kernel approximation in machine learning algorithms suitable for mixed-signal Analog In-Memory Computing (AIMC) architectures. Analog In-Memory Kernel Approximation addresses the performance bottlenecks of conventional kernel-based methods by executing most operations in approximate kernel methods directly in memory. The IBM HERMES Project Chip, a state-of-the-art phase-change memory based AIMC chip, is utilized for the hardware demonstration of kernel approximation. Experimental results show that our method maintains high accuracy, with less than a 1% drop in kernel-based ridge classification benchmarks and within 1% accuracy on the Long Range Arena benchmark for kernelized attention in Transformer neural networks. Compared to traditional digital accelerators, our approach is estimated to deliver superior energy efficiency and lower power consumption. These findings highlight the potential of heterogeneous AIMC architectures to enhance the efficiency and scalability of machine learning applications.
Ozone level forecasting in Mexico City with temporal features and interactions
Cerritos, J. M. Sánchez, Martínez-Cadena, J. A., Marín-López, A., Delgado-Fernández, J.
Precursor concentration and solar radiation intensity determine the dynamic equilibrium between ozone creation and destruction. Tropospheric ozone is a dangerous pollutant that can lead to a number of health problems as well as environmental difficulties. In contrast, stratospheric ozone creates a protective ozone layer. Exposure to high levels of tropospheric ozone can cause a range of respiratory problems, including coughing, throat irritation, and worsening of asthma symptoms. Long-term exposure can lead to more severe health issues such as chronic respiratory diseases, reduced lung function, and increased mortality rates. Children, the elderly, and individuals with pre-existing health conditions are particularly vulnerable to the adverse effects of ozone. Ground-level ozone can also damage flora, which can result in decreased agricultural production, damage to forests, and a decline in biodiversity. It prevents plants from photosynthesizing, which slows down their growth and increases their vulnerability to pests, illnesses, and harsh weather condition.
New random projections for isotropic kernels using stable spectral distributions
Langrené, Nicolas, Warin, Xavier, Gruet, Pierre
Rahimi and Recht [31] introduced the idea of decomposing shift-invariant kernels by randomly sampling from their spectral distribution. This famous technique, known as Random Fourier Features (RFF), is in principle applicable to any shift-invariant kernel whose spectral distribution can be identified and simulated. In practice, however, it is usually applied to the Gaussian kernel because of its simplicity, since its spectral distribution is also Gaussian. Clearly, simple spectral sampling formulas would be desirable for broader classes of kernel functions. In this paper, we propose to decompose spectral kernel distributions as a scale mixture of $\alpha$-stable random vectors. This provides a simple and ready-to-use spectral sampling formula for a very large class of multivariate shift-invariant kernels, including exponential power kernels, generalized Mat\'ern kernels, generalized Cauchy kernels, as well as newly introduced kernels such as the Beta, Kummer, and Tricomi kernels. In particular, we show that the spectral densities of all these kernels are scale mixtures of the multivariate Gaussian distribution. This provides a very simple way to modify existing Random Fourier Features software based on Gaussian kernels to cover a much richer class of multivariate kernels. This result has broad applications for support vector machines, kernel ridge regression, Gaussian processes, and other kernel-based machine learning techniques for which the random Fourier features technique is applicable.
Multi Modal Information Fusion of Acoustic and Linguistic Data for Decoding Dairy Cow Vocalizations in Animal Welfare Assessment
Jobarteh, Bubacarr, Mincu, Madalina, Dinu, Gavojdian, Neethirajan, Suresh
Understanding animal vocalizations through multi-source data fusion is crucial for assessing emotional states and enhancing animal welfare in precision livestock farming. This study aims to decode dairy cow contact calls by employing multi-modal data fusion techniques, integrating transcription, semantic analysis, contextual and emotional assessment, and acoustic feature extraction. We utilized the Natural Language Processing model to transcribe audio recordings of cow vocalizations into written form. By fusing multiple acoustic features frequency, duration, and intensity with transcribed textual data, we developed a comprehensive representation of cow vocalizations. Utilizing data fusion within a custom-developed ontology, we categorized vocalizations into high frequency calls associated with distress or arousal, and low frequency calls linked to contentment or calmness. Analyzing the fused multi dimensional data, we identified anxiety related features indicative of emotional distress, including specific frequency measurements and sound spectrum results. Assessing the sentiment and acoustic features of vocalizations from 20 individual cows allowed us to determine differences in calling patterns and emotional states. Employing advanced machine learning algorithms, Random Forest, Support Vector Machine, and Recurrent Neural Networks, we effectively processed and fused multi-source data to classify cow vocalizations. These models were optimized to handle computational demands and data quality challenges inherent in practical farm environments. Our findings demonstrate the effectiveness of multi-source data fusion and intelligent processing techniques in animal welfare monitoring. This study represents a significant advancement in animal welfare assessment, highlighting the role of innovative fusion technologies in understanding and improving the emotional wellbeing of dairy cows.
Towards Convexity in Anomaly Detection: A New Formulation of SSLM with Unique Optimal Solutions
Liu, Hongying, Wang, Hao, Chu, Haoran, Wu, Yibo
An unsolved issue in widely used methods such as Support Vector Data Description (SVDD) and Small Sphere and Large Margin SVM (SSLM) for anomaly detection is their nonconvexity, which hampers the analysis of optimal solutions in a manner similar to SVMs and limits their applicability in large-scale scenarios. In this paper, we introduce a novel convex SSLM formulation which has been demonstrated to revert to a convex quadratic programming problem for hyperparameter values of interest. Leveraging the convexity of our method, we derive numerous results that are unattainable with traditional nonconvex approaches. We conduct a thorough analysis of how hyperparameters influence the optimal solution, pointing out scenarios where optimal solutions can be trivially found and identifying instances of ill-posedness. Most notably, we establish connections between our method and traditional approaches, providing a clear determination of when the optimal solution is unique -- a task unachievable with traditional nonconvex methods. We also derive the {\nu}-property to elucidate the interactions between hyperparameters and the fractions of support vectors and margin errors in both positive and negative classes.
Enhancing binary classification: A new stacking method via leveraging computational geometry
Wu, Wei, Tang, Liang, Zhao, Zhongjie, Teo, Chung-Piaw
Binary classification is a fundamental task in machine learning and data science, with applications spanning numerous domains, including spam detection, medical diagnostics, image recognition, credit scoring. The goal is to predict a binary outcome--typically labeled as 0 or 1--based on a set of input features. Various machine learning algorithms, such as logistic regression (LR), k-nearest neighbors (kNN), support vector machines (SVM), and neural network (NN), are commonly employed for binary classification tasks. These algorithms can be mainly divided into two categories: those with interpretability, which are convenient for analysis and control (e.g., LR); and those without interpretability but with potentially good classification performance (e.g., NN). Ensemble learning, a powerful technique in predictive modeling, has gained widespread recognition for its ability to improve model performance by combining the strengths of multiple learning algorithms [1]. Among ensemble methods, stacking stands out by integrating the predictions of diverse base models (different learning algorithms) through a meta-model, resulting in enhanced prediction accuracy compared to only using the best base model [2]. Stacking has demonstrated significant applications in classification problems such as network intrusion detection [3, 4], cancer type classification [5], credit lending [6], and protein-protein binding affinity prediction [7].
A Federated Distributionally Robust Support Vector Machine with Mixture of Wasserstein Balls Ambiguity Set for Distributed Fault Diagnosis
Ibrahim, Michael, Rozas, Heraldo, Gebraeel, Nagi, Xie, Weijun
The training of classification models for fault diagnosis tasks using geographically dispersed data is a crucial task for original equipment manufacturers (OEMs) seeking to provide long-term service contracts (LTSCs) to their customers. Due to privacy and bandwidth constraints, such models must be trained in a federated fashion. Moreover, due to harsh industrial settings the data often suffers from feature and label uncertainty. Therefore, we study the problem of training a distributionally robust (DR) support vector machine (SVM) in a federated fashion over a network comprised of a central server and $G$ clients without sharing data. We consider the setting where the local data of each client $g$ is sampled from a unique true distribution $\mathbb{P}_g$, and the clients can only communicate with the central server. We propose a novel Mixture of Wasserstein Balls (MoWB) ambiguity set that relies on local Wasserstein balls centered at the empirical distribution of the data at each client. We study theoretical aspects of the proposed ambiguity set, deriving its out-of-sample performance guarantees and demonstrating that it naturally allows for the separability of the DR problem. Subsequently, we propose two distributed optimization algorithms for training the global FDR-SVM: i) a subgradient method-based algorithm, and ii) an alternating direction method of multipliers (ADMM)-based algorithm. We derive the optimization problems to be solved by each client and provide closed-form expressions for the computations performed by the central server during each iteration for both algorithms. Finally, we thoroughly examine the performance of the proposed algorithms in a series of numerical experiments utilizing both simulation data and popular real-world datasets.
A Systematic Review of Machine Learning in Sports Betting: Techniques, Challenges, and Future Directions
Galekwa, René Manassé, Tshimula, Jean Marie, Tajeuna, Etienne Gael, Kyandoghere, Kyamakya
The sports betting industry has experienced rapid growth, driven largely by technological advancements and the proliferation of online platforms. Machine learning (ML) has played a pivotal role in the transformation of this sector by enabling more accurate predictions, dynamic odds-setting, and enhanced risk management for both bookmakers and bettors. This systematic review explores various ML techniques, including support vector machines, random forests, and neural networks, as applied in different sports such as soccer, basketball, tennis, and cricket. These models utilize historical data, in-game statistics, and real-time information to optimize betting strategies and identify value bets, ultimately improving profitability. For bookmakers, ML facilitates dynamic odds adjustment and effective risk management, while bettors leverage data-driven insights to exploit market inefficiencies. This review also underscores the role of ML in fraud detection, where anomaly detection models are used to identify suspicious betting patterns. Despite these advancements, challenges such as data quality, real-time decision-making, and the inherent unpredictability of sports outcomes remain. Ethical concerns related to transparency and fairness are also of significant importance. Future research should focus on developing adaptive models that integrate multimodal data and manage risk in a manner akin to financial portfolios. This review provides a comprehensive examination of the current applications of ML in sports betting, and highlights both the potential and the limitations of these technologies.
Intuitionistic Fuzzy Universum Twin Support Vector Machine for Imbalanced Data
One of the major difficulties in machine learning methods is categorizing datasets that are imbalanced. This problem may lead to biased models, where the training process is dominated by the majority class, resulting in inadequate representation of the minority class. Universum twin support vector machine (UTSVM) produces a biased model towards the majority class, as a result, its performance on the minority class is often poor as it might be mistakenly classified as noise. Moreover, UTSVM is not proficient in handling datasets that contain outliers and noises. Inspired by the concept of incorporating prior information about the data and employing an intuitionistic fuzzy membership scheme, we propose intuitionistic fuzzy universum twin support vector machines for imbalanced data (IFUTSVM-ID). We use an intuitionistic fuzzy membership scheme to mitigate the impact of noise and outliers. Moreover, to tackle the problem of imbalanced class distribution, data oversampling and undersampling methods are utilized. Prior knowledge about the data is provided by universum data. This leads to better generalization performance. UTSVM is susceptible to overfitting risks due to the omission of the structural risk minimization (SRM) principle in their primal formulations. However, the proposed IFUTSVM-ID model incorporates the SRM principle through the incorporation of regularization terms, effectively addressing the issue of overfitting. We conduct a comprehensive evaluation of the proposed IFUTSVM-ID model on benchmark datasets from KEEL and compare it with existing baseline models. Furthermore, to assess the effectiveness of the proposed IFUTSVM-ID model in diagnosing Alzheimer's disease (AD), we applied them to the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset. Experimental results showcase the superiority of the proposed IFUTSVM-ID models compared to the baseline models.
Mechanism learning: Reverse causal inference in the presence of multiple unknown confounding through front-door causal bootstrapping
A major limitation of machine learning (ML) prediction models is that they recover associational, rather than causal, predictive relationships between variables. In high-stakes automation applications of ML this is problematic, as the model often learns spurious, non-causal associations. This paper proposes mechanism learning, a simple method which uses front-door causal bootstrapping to deconfound observational data such that any appropriate ML model is forced to learn predictive relationships between effects and their causes (reverse causal inference), despite the potential presence of multiple unknown and unmeasured confounding. Effect variables can be very high dimensional, and the predictive relationship nonlinear, as is common in ML applications. This novel method is widely applicable, the only requirement is the existence of a mechanism variable mediating the cause (prediction target) and effect (feature data), which is independent of the (unmeasured) confounding variables. We test our method on fully synthetic, semi-synthetic and real-world datasets, demonstrating that it can discover reliable, unbiased, causal ML predictors where by contrast, the same ML predictor trained naively using classical supervised learning on the original observational data, is heavily biased by spurious associations. We provide code to implement the results in the paper, online.