Support Vector Machines
Big Data - Supply Chain Management Framework for Forecasting: Data Preprocessing and Machine Learning Techniques
Jahin, Md Abrar, Shovon, Md Sakib Hossain, Shin, Jungpil, Ridoy, Istiyaque Ahmed, Tomioka, Yoichi, Mridha, M. F.
This article intends to systematically identify and comparatively analyze state-of-the-art supply chain (SC) forecasting strategies and technologies. A novel framework has been proposed incorporating Big Data Analytics in SC Management (problem identification, data sources, exploratory data analysis, machine-learning model training, hyperparameter tuning, performance evaluation, and optimization), forecasting effects on human-workforce, inventory, and overall SC. Initially, the need to collect data according to SC strategy and how to collect them has been discussed. The article discusses the need for different types of forecasting according to the period or SC objective. The SC KPIs and the error-measurement systems have been recommended to optimize the top-performing model. The adverse effects of phantom inventory on forecasting and the dependence of managerial decisions on the SC KPIs for determining model performance parameters and improving operations management, transparency, and planning efficiency have been illustrated. The cyclic connection within the framework introduces preprocessing optimization based on the post-process KPIs, optimizing the overall control process (inventory management, workforce determination, cost, production and capacity planning). The contribution of this research lies in the standard SC process framework proposal, recommended forecasting data analysis, forecasting effects on SC performance, machine learning algorithms optimization followed, and in shedding light on future research.
Localisation of Regularised and Multiview Support Vector Machine Learning
Gheondea, Aurelian, Tilki, Cankat
We prove a few representer theorems for a localised version of the regularised and multiview support vector machine learning problem introduced by H.Q.~Minh, L.~Bazzani, and V.~Murino, \textit{Journal of Machine Learning Research}, \textbf{17}(2016) 1--72, that involves operator valued positive semidefinite kernels and their reproducing kernel Hilbert spaces. The results concern general cases when convex or nonconvex loss functions and finite or infinite dimensional input spaces are considered. We show that the general framework allows infinite dimensional input spaces and nonconvex loss functions for some special cases, in particular in case the loss functions are G\^ateaux differentiable. Detailed calculations are provided for the exponential least squares loss functions that leads to partially nonlinear problems.
On Comparing Fair Classifiers under Data Bias
Sharma, Mohit, Deshpande, Amit, Shah, Rajiv Ratn
In this paper, we consider a theoretical model for injecting data bias, namely, under-representation and label bias (Blum & Stangl, 2019). We empirically study the effect of varying data biases on the accuracy and fairness of fair classifiers. Through extensive experiments on both synthetic and real-world datasets (e.g., Adult, German Credit, Bank Marketing, COMPAS), we empirically audit pre-, in-, and post-processing fair classifiers from standard fairness toolkits for their fairness and accuracy by injecting varying amounts of under-representation and label bias in their training data (but not the test data). Our main observations are: 1. The fairness and accuracy of many standard fair classifiers degrade severely as the bias injected in their training data increases, 2. A simple logistic regression model trained on the right data can often outperform, in both accuracy and fairness, most fair classifiers trained on biased training data, and 3. A few, simple fairness techniques (e.g., reweighing, exponentiated gradients) seem to offer stable accuracy and fairness guarantees even when their training data is injected with under-representation and label bias. Our experiments also show how to integrate a measure of data bias risk in the existing fairness dashboards for real-world deployments.
Large-Scale Quantum Separability Through a Reproducible Machine Learning Lens
Casalรฉ, Balthazar, Di Molfetta, Giuseppe, Anthoine, Sandrine, Kadri, Hachem
The quantum separability problem consists in deciding whether a bipartite density matrix is entangled or separable. In this work, we propose a machine learning pipeline for finding approximate solutions for this NP-hard problem in large-scale scenarios. We provide an efficient Frank-Wolfe-based algorithm to approximately seek the nearest separable density matrix and derive a systematic way for labeling density matrices as separable or entangled, allowing us to treat quantum separability as a classification problem. Our method is applicable to any two-qudit mixed states. Numerical experiments with quantum states of 3- and 7-dimensional qudits validate the efficiency of the proposed procedure, and demonstrate that it scales up to thousands of density matrices with a high quantum entanglement detection accuracy. This takes a step towards benchmarking quantum separability to support the development of more powerful entanglement detection techniques.
Loss-Optimal Classification Trees: A Generalized Framework and the Logistic Case
Aldinucci, Tommaso, Lapucci, Matteo
The Classification Tree (CT) is one of the most common models in interpretable machine learning. Although such models are usually built with greedy strategies, in recent years, thanks to remarkable advances in Mixer-Integer Programming (MIP) solvers, several exact formulations of the learning problem have been developed. In this paper, we argue that some of the most relevant ones among these training models can be encapsulated within a general framework, whose instances are shaped by the specification of loss functions and regularizers. Next, we introduce a novel realization of this framework: specifically, we consider the logistic loss, handled in the MIP setting by a linear piece-wise approximation, and couple it with $\ell_1$-regularization terms. The resulting Optimal Logistic Tree model numerically proves to be able to induce trees with enhanced interpretability features and competitive generalization capabilities, compared to the state-of-the-art MIP-based approaches.
A Comprehensive Review of Visual-Textual Sentiment Analysis from Social Media Networks
Al-Tameemi, Israa Khalaf Salman, Feizi-Derakhshi, Mohammad-Reza, Pashazadeh, Saeed, Asadpour, Mohammad
Social media networks have become a significant aspect of people's lives, serving as a platform for their ideas, opinions and emotions. Consequently, automated sentiment analysis (SA) is critical for recognising people's feelings in ways that other information sources cannot. The analysis of these feelings revealed various applications, including brand evaluations, YouTube film reviews and healthcare applications. As social media continues to develop, people post a massive amount of information in different forms, including text, photos, audio and video. Thus, traditional SA algorithms have become limited, as they do not consider the expressiveness of other modalities. By including such characteristics from various material sources, these multimodal data streams provide new opportunities for optimising the expected results beyond text-based SA. Our study focuses on the forefront field of multimodal SA, which examines visual and textual data posted on social media networks. Many people are more likely to utilise this information to express themselves on these platforms. To serve as a resource for academics in this rapidly growing field, we introduce a comprehensive overview of textual and visual SA, including data pre-processing, feature extraction techniques, sentiment benchmark datasets, and the efficacy of multiple classification methodologies suited to each field. We also provide a brief introduction of the most frequently utilised data fusion strategies and a summary of existing research on visual-textual SA. Finally, we highlight the most significant challenges and investigate several important sentiment applications.
TSVR+: Twin support vector regression with privileged information
In the realm of machine learning, the data may contain additional attributes, known as privileged information (PI). The main purpose of PI is to assist in the training of the model and then utilize the acquired knowledge to make predictions for unseen samples. Support vector regression (SVR) is an effective regression model, however, it has a low learning speed due to solving a convex quadratic problem (QP) subject to a pair of constraints. In contrast, twin support vector regression (TSVR) is more efficient than SVR as it solves two QPs each subject to one set of constraints. However, TSVR and its variants are trained only on regular features and do not use privileged features for training. To fill this gap, we introduce a fusion of TSVR with learning using privileged information (LUPI) and propose a novel approach called twin support vector regression with privileged information (TSVR+). The regularization terms in the proposed TSVR+ capture the essence of statistical learning theory and implement the structural risk minimization principle. We use the successive overrelaxation (SOR) technique to solve the optimization problem of the proposed TSVR+, which enhances the training efficiency. As far as our knowledge extends, the integration of the LUPI concept into twin variants of regression models is a novel advancement. The numerical experiments conducted on UCI, stock and time series data collectively demonstrate the superiority of the proposed model.
Corporate Bankruptcy Prediction with Domain-Adapted BERT
This study performs BERT-based analysis, which is a representative contextualized language model, on corporate disclosure data to predict impending bankruptcies. Prior literature on bankruptcy prediction mainly focuses on developing more sophisticated prediction methodologies with financial variables. However, in our study, we focus on improving the quality of input dataset. Specifically, we employ BERT model to perform sentiment analysis on MD&A disclosures. We show that BERT outperforms dictionary-based predictions and Word2Vec-based predictions in terms of adjusted R-square in logistic regression, k-nearest neighbor (kNN-5), and linear kernel support vector machine (SVM). Further, instead of pre-training the BERT model from scratch, we apply self-learning with confidence-based filtering to corporate disclosure data (10-K). We achieve the accuracy rate of 91.56% and demonstrate that the domain adaptation procedure brings a significant improvement in prediction accuracy.
Modelling human logical reasoning process in dynamic environmental stress with cognitive agents
Modelling human cognition can provide key insights into behavioral dynamics under changing conditions. This enables synthetic data generation and guides adaptive interventions for cognitive regulation. Challenges arise when environments are highly dynamic, obscuring stimulus-behavior relationships. We propose a cognitive agent integrating drift-diffusion with deep reinforcement learning to simulate granular stress effects on logical reasoning process. Leveraging a large dataset of 21,157 logical responses, we investigate performance impacts of dynamic stress. This prior knowledge informed model design and evaluation. Quantitatively, the framework improves cognition modelling by capturing both subject-specific and stimuli-specific behavioural differences. Qualitatively, it captures general trends in human logical reasoning under stress. Our approach is extensible to examining diverse environmental influences on cognition and behavior. Overall, this work demonstrates a powerful, data-driven methodology to simulate and understand the vagaries of human logical reasoning process in dynamic contexts.
Anomaly Detection Under Uncertainty Using Distributionally Robust Optimization Approach
Noormohammadia, Amir Hossein, MirHassania, Seyed Ali, Khaligh, Farnaz Hooshmand
Anomaly detection is defined as the problem of finding data points that do not follow the patterns of the majority. Among the various proposed methods for solving this problem, classification-based methods, including one-class Support Vector Machines (SVM) are considered effective and state-of-the-art. The one-class SVM method aims to find a decision boundary to distinguish between normal data points and anomalies using only the normal data. On the other hand, most real-world problems involve some degree of uncertainty, where the true probability distribution of each data point is unknown, and estimating it is often difficult and costly. Assuming partial distribution information such as the first and second-order moments is known, a distributionally robust chance-constrained model is proposed in which the probability of misclassification is low. By utilizing a mapping function to a higher dimensional space, the proposed model will be capable of classifying origin-inseparable datasets. Also, by adopting the kernel idea, the need for explicitly knowing the mapping is eliminated, computations can be performed in the input space, and computational complexity is reduced. Computational results validate the robustness of the proposed model under different probability distributions and also the superiority of the proposed model compared to the standard one-class SVM in terms of various evaluation metrics.