 Support Vector Machines


Dual Decomposed Learning with Factorwise Oracle for Structural SVM of Large Output Domain

Neural Information Processing Systems

Many applications of machine learning involve structured outputs with large domains, where learning a structured predictor is prohibitive due to repetitive calls to an expensive inference oracle. In this work, we show that, by decomposing training of a Structural Support Vector Machine (SVM) into a series of multiclass SVM problems connected through messages, one can replace the expensive structured oracle with a Factorwise Maximization Oracle (FMO) that allows an efficient implementation with complexity sublinear in the size of the factor domain. A Greedy Direction Method of Multiplier (GDMM) algorithm is proposed to exploit the sparsity of messages; it guarantees $\epsilon$ sub-optimality after $O(\log(1/\epsilon))$ passes of FMO calls. We conduct experiments on chain-structured and fully-connected problems with large output domains. The proposed approach is orders of magnitude faster than state-of-the-art training algorithms for Structural SVM.
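As a toy illustration of the core primitive (not the paper's implementation, which also exploits message sparsity for sublinear complexity), a factorwise maximization oracle over a single unary factor simply returns the label maximizing the factor score augmented by incoming dual messages:

```python
def factorwise_max_oracle(scores, messages):
    """Toy FMO for one unary factor: return the label index maximizing
    the factor score augmented by dual messages, plus that value.
    Hypothetical sketch; names and setup are illustrative only."""
    augmented = [s + m for s, m in zip(scores, messages)]
    best = max(range(len(augmented)), key=augmented.__getitem__)
    return best, augmented[best]

label, value = factorwise_max_oracle([0.2, 1.5, -0.3], [0.0, -1.0, 1.0])
# augmented scores are [0.2, 0.5, 0.7], so label 2 wins
```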


Parametric Simplex Method for Sparse Learning

Neural Information Processing Systems

High-dimensional sparse learning has imposed a great computational challenge on large-scale data analysis. In this paper, we investigate a broad class of sparse learning approaches formulated as linear programs parametrized by a {\em regularization factor}, and solve them by the parametric simplex method (PSM). PSM offers significant advantages over other competing methods: (1) PSM naturally obtains the complete solution path for all values of the regularization parameter; (2) PSM provides a high-precision dual certificate stopping criterion; (3) PSM yields sparse solutions through very few iterations, and the solution sparsity significantly reduces the computational cost per iteration. In particular, we demonstrate the superiority of PSM over various sparse learning approaches, including the Dantzig selector for sparse linear regression, the sparse support vector machine for sparse linear classification, and sparse differential network estimation. We then provide sufficient conditions under which PSM always outputs sparse solutions, so that its computational performance can be significantly boosted. Thorough numerical experiments demonstrate the outstanding performance of PSM.
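PSM itself pivots through LP bases, but the solution-path idea it exploits can be seen in miniature: for the one-dimensional lasso the path in the regularization parameter is piecewise linear and given in closed form by soft-thresholding. A loose analogy, not the PSM algorithm:

```python
def soft_threshold(z, lam):
    """Closed-form solution of argmin_b 0.5*(b - z)^2 + lam*|b|.
    Illustrates the piecewise-linear solution path in lam that
    path-following methods trace exactly in higher dimensions."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

# sweeping lam shrinks the solution linearly, then snaps it to zero
path = [soft_threshold(2.0, lam) for lam in (0.0, 1.0, 2.5)]
# → [2.0, 1.0, 0.0]
```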


Variational Autoencoder for Deep Learning of Images, Labels and Captions

Neural Information Processing Systems

A novel variational autoencoder is developed to model images, as well as associated labels or captions. The Deep Generative Deconvolutional Network (DGDN) is used as a decoder of the latent image features, and a deep Convolutional Neural Network (CNN) is used as an image encoder; the CNN is used to approximate a distribution for the latent DGDN features/code. The latent code is also linked to generative models for labels (Bayesian support vector machine) or captions (recurrent neural network). When predicting a label/caption for a new image at test time, averaging is performed across the distribution of latent codes; this is computationally efficient as a consequence of the learned CNN-based encoder. Since the framework is capable of modeling the image in the presence/absence of associated labels/captions, a new semi-supervised setting is manifested for CNN learning with images; the framework even allows unsupervised CNN learning, based on images alone.
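The test-time averaging described above amounts to a Monte Carlo estimate of the predictive probability under the encoder's approximate posterior over latent codes. A minimal pure-Python sketch, with a stand-in decoder and hypothetical Gaussian posterior parameters (not the paper's DGDN/CNN architecture):

```python
import math
import random

def predict_label(z):
    # stand-in decoder: logistic function of a linear score of the code
    return 1.0 / (1.0 + math.exp(-sum(z)))

def averaged_prediction(mu, sigma, n_samples=1000, seed=0):
    """Approximate E_{q(z|x)}[p(label | z)] by sampling latent codes
    z = mu + sigma * eps from the encoder's Gaussian posterior
    (reparameterization form). Illustrative sketch only."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        z = [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mu, sigma)]
        total += predict_label(z)
    return total / n_samples

p = averaged_prediction([0.5, -0.2], [0.1, 0.1])
# with a small posterior variance, p stays close to sigmoid(0.3) ≈ 0.57
```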


Process-constrained batch Bayesian optimisation

Pratibha Vellanki, Santu Rana, Sunil Gupta, David Rubin, Alessandra Sutti, Thomas Dorin, Murray Height, Paul Sanders, Svetha Venkatesh

Neural Information Processing Systems

Prevailing batch Bayesian optimisation (BO) methods allow all control variables to be freely altered at each iteration. Real-world experiments, however, often have physical limitations that make it time-consuming to alter all settings for each recommendation in a batch. This gives rise to a unique problem in BO: in a recommended batch, a set of variables that are expensive to change experimentally needs to be fixed, while the remaining control variables can be varied. We formulate this as a process-constrained batch Bayesian optimisation problem. We propose two algorithms, pc-BO(basic) and pc-BO(nested).
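The constraint structure can be illustrated with a toy batch selector that picks one promising point, freezes its expensive-to-change dimensions, and fills the rest of the batch by varying only the cheap dimensions. This is random search over a made-up objective, not pc-BO's acquisition machinery:

```python
import random

def pc_batch(objective, fixed_dims, bounds, batch_size=3, n_cand=200, seed=0):
    """Toy process-constrained batch selection (hypothetical sketch):
    choose the best sampled candidate, then keep its 'expensive'
    dimensions fixed for every other point in the batch."""
    rng = random.Random(seed)

    def sample():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    best = max((sample() for _ in range(n_cand)), key=objective)
    batch = [best]
    while len(batch) < batch_size:
        x = sample()
        for d in fixed_dims:          # constrained variables stay fixed
            x[d] = best[d]
        batch.append(x)
    return batch

# maximize a simple concave objective; dimension 0 is expensive to change
f = lambda x: -(x[0] - 0.3) ** 2 - (x[1] - 0.7) ** 2
batch = pc_batch(f, fixed_dims=[0], bounds=[(0.0, 1.0), (0.0, 1.0)])
```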


Learning Confidence Sets using Support Vector Machines

Neural Information Processing Systems

The goal of confidence-set learning in the binary classification setting is to construct two sets, each with a specific probability guarantee to cover a class. An observation outside the overlap of the two sets is deemed to be from one of the two classes, while the overlap is an ambiguity region which could belong to either class. Instead of plug-in approaches, we propose a support vector classifier to construct confidence sets in a flexible manner. Theoretically, we show that the proposed learner can control the non-coverage rates and minimize the ambiguity with high probability. Efficient algorithms are developed and numerical studies illustrate the effectiveness of the proposed method.
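The set geometry is easy to picture with a real-valued classifier score and two thresholds: scores above the lower threshold fall in the positive-class set, scores below the upper threshold fall in the negative-class set, and the overlap is the ambiguity region. Illustrative thresholds only, not the paper's SVM-based construction:

```python
def confidence_set_decision(score, t_minus, t_plus):
    """Toy confidence-set rule: C_+ = {score >= t_minus},
    C_- = {score <= t_plus}; their overlap [t_minus, t_plus] is
    the ambiguity region. Thresholds here are hypothetical."""
    in_pos = score >= t_minus
    in_neg = score <= t_plus
    if in_pos and in_neg:
        return "ambiguous"
    return "+1" if in_pos else "-1"

labels = [confidence_set_decision(s, -0.5, 0.5) for s in (-1.2, 0.1, 0.9)]
# → ['-1', 'ambiguous', '+1']
```

Widening the gap between the two thresholds lowers each class's non-coverage rate at the cost of a larger ambiguity region, which is the trade-off the learner controls.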


BrainRotViT: Transformer-ResNet Hybrid for Explainable Modeling of Brain Aging from 3D sMRI

Wasif Jalal, Md Nafiu Rahman, Atif Hasan Rahman, M. Sohel Rahman

arXiv.org Artificial Intelligence

The human brain undergoes continuous transformations across the lifespan, a natural component of aging that does not inherently signal pathological conditions [1]. Neurodegenerative disorders such as dementia can compromise brain structure and accelerate aging processes. Understanding and characterizing healthy brain aging patterns therefore becomes essential for distinguishing normal aging from pathological neurodegeneration, potentially enabling earlier detection of neurodegenerative diseases. The Brain Age-Gap (BAG), i.e. the discrepancy between predicted brain age and chronological age, has emerged as a robust biomarker that captures pathological brain processes and offers insights into the rate at which an individual's brain ages in comparison to others in the population [2, 3]. It is not only associated with various neurological disorders, such as Alzheimer's disease, cognitive impairment, and Autism Spectrum Disorder, but also serves as an indicator of all-cause mortality [4, 5, 6, 7, 8].

Brain age estimation has been approached through both conventional and machine learning techniques, analyzing either the whole brain, specific regions, or localized patches [9, 10, 11]. One particular study presented a method using T1-weighted MRI to predict age through region-level and voxel-level metrics [12]. Regression-based machine learning has shown promise for brain age prediction, with kernel regression applied to whole-brain MRI across diverse age ranges [13]. Various algorithms, including Support Vector Regression and Binary Decision Trees, have been compared for their brain age prediction capabilities [14]. Additional regression techniques such as Relevance Vector Regression, Twin Support Vector Regression, and Gaussian Process Regression have been explored across different imaging modalities for age estimation and mortality prediction [11, 15, 16, 17].
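The BAG definition above is a simple per-subject difference; a minimal sketch with made-up ages:

```python
def brain_age_gap(predicted_ages, chronological_ages):
    """Brain Age-Gap (BAG) per subject: predicted minus chronological
    age. Positive values suggest accelerated brain aging relative to
    the population; ages below are illustrative only."""
    return [p - c for p, c in zip(predicted_ages, chronological_ages)]

gaps = brain_age_gap([72.4, 55.1], [70.0, 58.0])
# first subject appears ~2.4 years "older" than chronological age,
# the second ~2.9 years "younger"
```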