Support Vector Machines
Effective Defect Detection Using Instance Segmentation for NDI
Rahman, Ashiqur, Seethi, Venkata Devesh Reddy, Yunker, Austin, Kral, Zachary, Kettimuthu, Rajkumar, Alhoori, Hamed
Ultrasonic testing is a common Non-Destructive Inspection (NDI) method used in aerospace manufacturing. However, the complexity and size of the ultrasonic scans make it challenging to identify defects through visual inspection or machine learning models. Using computer vision techniques to identify defects from ultrasonic scans is an evolving research area. In this study, we used instance segmentation to identify the presence of defects in the ultrasonic scan images of composite panels that are representative of real components manufactured in aerospace. We used two models based on Mask-RCNN (Detectron 2) and YOLO 11 respectively. Additionally, we implemented a simple statistical pre-processing technique that reduces the burden of requiring custom-tailored pre-processing techniques. Our study demonstrates the feasibility and effectiveness of using instance segmentation in the NDI pipeline by significantly reducing data pre-processing time, inspection time, and overall costs.
Review for NeurIPS paper: Fast Epigraphical Projection-based Incremental Algorithms for Wasserstein Distributionally Robust Support Vector Machine
Summary and Contributions: The authors have proposed two new algorithms to solve the timely and important distributionally robust SVM problem. These new algorithms are instances of incremental projected subgradient descent and incremental proximal point algorithm. The main novelty of ISG and IPPA proposed in this work is that they developed efficient algorithms (in linear time) for the subproblems in these solving strategies, which are interesting and potentially beneficial for other problems with similar structures. Besides, the authors carefully analyze the iteration complexity of their new algorithms under the so-called BLR condition. They show if the BLR condition is satisfied, the exponent in the Holderian growth condition could be explicitly determined.
Review for NeurIPS paper: Fast Epigraphical Projection-based Incremental Algorithms for Wasserstein Distributionally Robust Support Vector Machine
This paper addresses important problems, presents novel techinical analysis, and neumerical results are encouraging. All of the reviewers and I have agreed to accept this paper. However, several reviewers (especially R1 and R4) have pointed out some important concerns. Please consider revising your paper to address them before submitting a camera-ready.
Manifold learning and optimization using tangent space proxies
Robinett, Ryan A., Orecchia, Lorenzo, Riesenfeld, Samantha J.
In machine learning, Riemannian manifolds offer a useful abstraction for approximating commonly encountered, non-Euclidean empirical data distributions and optimization state spaces. While Euclidean machine learning algorithms have been adapted to Riemannian manifolds, these adaptations rely on computationally intensive differential-geometric primitives, such as exponential maps and parallel transports. Here, we present a framework for efficiently approximating differentialgeometric primitives on arbitrary manifolds via construction of an atlas graph representation, which leverages the canonical characterization of a manifold as a finite collection, or atlas, of overlapping coordinate charts. We first show the utility of this framework in a setting where the manifold is expressed in closed form, specifically, a runtime advantage, compared with state-of-the-art approaches, for first-order optimization over the Grassmann manifold. Moreover, using point cloud data for which a complex manifold structure was previously established, i.e., high-contrast image patches, we show that an atlas graph with the correct geometry can be directly learned from the point cloud. Finally, we demonstrate that learning an atlas graph enables downstream key machine learning tasks. In particular, we implement a Riemannian generalization of support vector machines that uses the learned atlas graph to approximate complex differential-geometric primitives, including Riemannian logarithms and vector transports. These settings suggest the potential of this framework for even more complex settings, where ambient dimension and noise levels may be much higher. Taken together, our results speed up and render more broadly applicable Riemannian optimization routines at the forefront of modern data science and machine learning.
Review for NeurIPS paper: A Scalable MIP-based Method for Learning Optimal Multivariate Decision Trees
This paper is about employing advances in computational efficiency of mixed integer programming methods towards decision tree construction problems. While locally optimal methods can achieve an upper bound on the minimization problem efficiently, closing the optimality gap requires tight lower bounds. The authors use an interval relaxation and a support-vector machine procedure to tighten the lower bound. To scale the algorithm, the authors use a LP-based data selection procedure, and perform all experiments using this procedure. It is not clear whether the global optimality properties of the MIP formulation carry through with the data-selection procedure.
Transductive Conformal Inference for Ranking
Fermanian, Jean-Baptiste, Humbert, Pierre, Blanchard, Gilles
We introduce a method based on Conformal Prediction (CP) to quantify the uncertainty of full ranking algorithms. We focus on a specific scenario where $n + m$ items are to be ranked by some ''black box'' algorithm. It is assumed that the relative (ground truth) ranking of n of them is known. The objective is then to quantify the error made by the algorithm on the ranks of the m new items among the total $(n + m)$. In such a setting, the true ranks of the n original items in the total $(n + m)$ depend on the (unknown) true ranks of the m new ones. Consequently, we have no direct access to a calibration set to apply a classical CP method. To address this challenge, we propose to construct distribution-free bounds of the unknown conformity scores using recent results on the distribution of conformal p-values. Using these scores upper bounds, we provide valid prediction sets for the rank of any item. We also control the false coverage proportion, a crucial quantity when dealing with multiple prediction sets. Finally, we empirically show on both synthetic and real data the efficiency of our CP method.
Towards a Unified Information-Theoretic Framework for Generalization
In this work, we investigate the expressiveness of the "conditional mutual information" (CMI) framework of Steinke and Zakynthinou (2020) and the prospect of using it to provide a unified framework for proving generalization bounds in the realizable setting. We first demonstrate that one can use this framework to express non-trivial (but sub-optimal) bounds for any learning algorithm that outputs hypotheses from a class of bounded VC dimension. We then explore two directions of strengthening this bound: (i) Can the CMI framework express optimal bounds for VC classes? With respect to Item (i) we prove that the CMI framework yields the optimal bound on the expected risk of Support Vector Machines (SVMs) for learning halfspaces. This result is an application of our general result showing that stable compression schemes Bousquet al. (2020) of size k have uniformly bounded CMI of order O(k) .
Benign Overfitting in Multiclass Classification: All Roads Lead to Interpolation
The growing literature on "benign overfitting" in overparameterized models has been mostly restricted to regression or binary classification settings; however, most success stories of modern machine learning have been recorded in multiclass settings. Motivated by this discrepancy, we study benign overfitting in multiclass linear classification. Specifically, we consider the following popular training algorithms on separable data: (i) empirical risk minimization (ERM) with cross-entropy loss, which converges to the multiclass support vector machine (SVM) solution; (ii) ERM with least-squares loss, which converges to the min-norm interpolating (MNI) solution; and, (iii) the one-vs-all SVM classifier. Our first key finding is that under a simple sufficient condition, all three algorithms lead to classifiers that interpolate the training data and have equal accuracy. When the data is generated from Gaussian mixtures or a multinomial logistic model, this condition holds under high enough effective overparameterization. Second, we derive novel error bounds on the accuracy of the MNI classifier, thereby showing that all three training algorithms lead to benign overfitting under sufficient overparameterization. Ultimately, our analysis shows that good generalization is possible for SVM solutions beyond the realm in which typical margin-based bounds apply.
On the Equivalence between Neural Network and Support Vector Machine
Recent research shows that the dynamics of an infinitely wide neural network (NN) trained by gradient descent can be characterized by Neural Tangent Kernel (NTK) \citep{jacot2018neural}. Under the squared loss, the infinite-width NN trained by gradient descent with an infinitely small learning rate is equivalent to kernel regression with NTK \citep{arora2019exact}. However, the equivalence is only known for ridge regression currently \citep{arora2019harnessing}, while the equivalence between NN and other kernel machines (KMs), e.g. Therefore, in this work, we propose to establish the equivalence between NN and SVM, and specifically, the infinitely wide NN trained by soft margin loss and the standard soft margin SVM with NTK trained by subgradient descent. Our main theoretical results include establishing the equivalence between NN and a broad family of \ell_2 regularized KMs with finite-width bounds, which cannot be handled by prior work, and showing that every finite-width NN trained by such regularized loss functions is approximately a KM.
Risk Analysis of Flowlines in the Oil and Gas Sector: A GIS and Machine Learning Approach
Chittumuri, I., Alshehab, N., Voss, R. J., Douglass, L. L., Kamrava, S., Fan, Y., Miskimins, J., Fleckenstein, W., Bandyopadhyay, S.
This paper presents a risk analysis of flowlines in the oil and gas sector using Geographic Information Systems (GIS) and machine learning (ML). Flowlines, vital conduits transporting oil, gas, and water from wellheads to surface facilities, often face under-assessment compared to transmission pipelines. This study addresses this gap using advanced tools to predict and mitigate failures, improving environmental safety and reducing human exposure. Extensive datasets from the Colorado Energy and Carbon Management Commission (ECMC) were processed through spatial matching, feature engineering, and geometric extraction to build robust predictive models. Various ML algorithms, including logistic regression, support vector machines, gradient boosting decision trees, and K-Means clustering, were used to assess and classify risks, with ensemble classifiers showing superior accuracy, especially when paired with Principal Component Analysis (PCA) for dimensionality reduction. Finally, a thorough data analysis highlighted spatial and operational factors influencing risks, identifying high-risk zones for focused monitoring. Overall, the study demonstrates the transformative potential of integrating GIS and ML in flowline risk management, proposing a data-driven approach that emphasizes the need for accurate data and refined models to improve safety in petroleum extraction.