Chakraborty, Amit
Interpretable Deep Learning for Two-Prong Jet Classification with Jet Spectra
Chakraborty, Amit, Lim, Sung Hak, Nojiri, Mihoko M.
Deep learning has recently attracted significant interest in the field of collider data analysis. A primary motivation is to extract the maximum information from complex collision events. Deep learning in collider physics benefits from a large influx of experimental data, more precise theoretical predictions, significant improvements in computing power, and ongoing progress in the field of machine learning itself. Such techniques offer advances in areas ranging from event selection to particle identification. The large center-of-mass energy at the Large Hadron Collider (LHC) enables the production of boosted particles whose decay products are highly collimated. These collimated objects are reconstructed as a single jet, which is often misidentified as a QCD jet originating from light quarks or gluons. Many jet substructure techniques using subjet information [1-10] and the distribution of jet constituents [11-17] have been developed in order to improve the sensitivity of tagging and classifying these boosted particle jets.
Convolutional Neural Knowledge Graph Learning
Zhao, Feipeng, Min, Martin Renqiang, Shen, Chen, Chakraborty, Amit
Previous models for learning entity and relationship embeddings of knowledge graphs, such as TransE, TransH, and TransR, aim to discover new links based on learned representations. However, these models interpret relationships as simple translations on entity embeddings. In this paper, we try to learn more complex connections between entities and relationships. In particular, we use a Convolutional Neural Network (CNN) to learn entity and relationship representations in knowledge graphs. In our model, we treat entities and relationships as one-dimensional numerical sequences of the same length. We then stack each triplet of head, relationship, and tail into a matrix of height 3. A CNN is applied to the triplets to produce confidence scores. Positive and manually corrupted negative triplets are used to train the embeddings and the CNN model simultaneously. Experimental results on public benchmark datasets show that the proposed model outperforms state-of-the-art models at discovering unseen relationships, demonstrating that CNNs are effective at learning complex interaction patterns between entities and relationships.
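The triplet-as-matrix idea above can be sketched in a few lines. The following is a minimal illustration, not the paper's architecture: the embedding dimension, number of filters, window size, and the linear readout are all illustrative assumptions, and the parameters are random rather than trained.

```python
import numpy as np

rng = np.random.default_rng(0)

dim = 8          # shared embedding length for entities and relationships (assumed)
n_filters = 4    # illustrative filter count
k = 3            # convolution window along the embedding axis

# Toy embedding tables: 5 entities, 2 relationships.
entity_emb = rng.normal(size=(5, dim))
relation_emb = rng.normal(size=(2, dim))

# Each filter spans the full height of the stacked triplet (3 rows).
filters = rng.normal(size=(n_filters, 3, k))
readout = rng.normal(size=(n_filters * (dim - k + 1),))

def score(h, r, t):
    """Confidence score for the triplet (h, r, t)."""
    # Stack head, relationship, and tail embeddings into a 3 x dim matrix.
    m = np.stack([entity_emb[h], relation_emb[r], entity_emb[t]])
    # Slide each filter along the width, apply ReLU, then a linear readout.
    feats = []
    for f in filters:
        for j in range(dim - k + 1):
            feats.append(max(0.0, float(np.sum(f * m[:, j:j + k]))))
    return float(np.dot(readout, np.array(feats)))

s_pos = score(0, 1, 2)   # a candidate triplet
s_neg = score(0, 1, 4)   # the same triplet with a corrupted tail
```

In training, the embeddings, filters, and readout would be fit jointly so that positive triplets score higher than corrupted ones.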
Proximal gradient method for huberized support vector machine
Xu, Yangyang, Akrotirianakis, Ioannis, Chakraborty, Amit
The Support Vector Machine (SVM) has been used in a wide variety of classification problems. The original SVM uses the hinge loss function, which is non-differentiable and makes the problem difficult to solve, particularly for regularized SVMs such as those with $\ell_1$-regularization. This paper considers the Huberized SVM (HSVM), which uses a differentiable approximation of the hinge loss function. We first explore the use of the Proximal Gradient (PG) method to solve the binary-class HSVM (B-HSVM) and then generalize it to the multi-class HSVM (M-HSVM). Under strong convexity assumptions, we show that our algorithm converges linearly. In addition, we give a finite convergence result for the support of the solution, based on which we further accelerate the algorithm via a two-stage method. We present extensive numerical experiments on both synthetic and real datasets which demonstrate the superiority of our methods over several state-of-the-art methods for both binary- and multi-class SVMs.
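The two ingredients above can be sketched concretely: a huberized hinge loss (smooth everywhere, quadratic near the hinge, linear far from it) and a proximal gradient step that handles the $\ell_1$ term via soft thresholding. This is a minimal sketch of the general technique, not the paper's exact algorithm; the smoothing parameter and step size are illustrative.

```python
import numpy as np

def huber_hinge(t, delta=0.5):
    """Huberized hinge loss of the margin t = y * w^T x (smooth everywhere)."""
    if t > 1:
        return 0.0
    if t > 1 - delta:
        return (1 - t) ** 2 / (2 * delta)   # quadratic piece near the hinge
    return 1 - t - delta / 2                # linear piece, matched at t = 1 - delta

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def pg_step(w, X, y, lam, step, delta=0.5):
    """One proximal-gradient step for the l1-regularized huberized SVM."""
    margins = y * (X @ w)
    # Derivative of the huberized hinge with respect to the margin.
    g = np.where(margins > 1, 0.0,
                 np.where(margins > 1 - delta, (margins - 1) / delta, -1.0))
    grad = X.T @ (g * y) / len(y)
    # Gradient step on the smooth part, prox step on the l1 part.
    return soft_threshold(w - step * grad, step * lam)
```

The finite support identification mentioned in the abstract is what makes a two-stage scheme natural: once the zero pattern of the solution stabilizes, the problem can be restricted to the active coordinates.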
Alternating direction method of multipliers for regularized multiclass support vector machines
Xu, Yangyang, Akrotirianakis, Ioannis, Chakraborty, Amit
The support vector machine (SVM) was originally designed for binary classification. Much effort has gone into generalizing the binary SVM to the multiclass SVM (MSVM), which is a more complex problem. Initially, MSVMs were solved via their dual formulations, which are quadratic programs and can be solved by standard second-order methods. However, the duals of MSVMs with regularizers are usually more difficult to formulate and computationally very expensive to solve. This paper focuses on several regularized MSVMs and extends the alternating direction method of multipliers (ADMM) to these MSVMs. Using a splitting technique, all considered MSVMs are written as two-block convex programs, for which ADMM has global convergence guarantees. Numerical experiments on synthetic and real data demonstrate the high efficiency and accuracy of our algorithms.
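The two-block splitting can be illustrated on a simpler stand-in problem. The sketch below runs ADMM on the lasso, $\min_x \tfrac12\|Ax-b\|^2 + \lambda\|z\|_1$ subject to $x = z$: one block has a closed-form linear solve, the other a closed-form prox, and a dual update ties them together. This is the generic two-block structure, not the paper's MSVM formulation; the penalty parameter and iteration count are illustrative.

```python
import numpy as np

def admm_lasso(A, b, lam=0.1, rho=1.0, iters=200):
    """Two-block ADMM on min 0.5||Ax-b||^2 + lam||z||_1 s.t. x = z."""
    n = A.shape[1]
    x = np.zeros(n); z = np.zeros(n); u = np.zeros(n)
    # The x-update solves a fixed linear system, so form it once.
    Q = A.T @ A + rho * np.eye(n)
    Atb = A.T @ b
    for _ in range(iters):
        x = np.linalg.solve(Q, Atb + rho * (z - u))                    # smooth block
        z = np.sign(x + u) * np.maximum(np.abs(x + u) - lam / rho, 0)  # prox block
        u = u + x - z                                                  # dual ascent
    return z

# Toy noiseless recovery problem with a sparse ground truth.
rng = np.random.default_rng(1)
A = rng.normal(size=(30, 10))
x_true = np.zeros(10)
x_true[:3] = [1.5, -2.0, 1.0]
b = A @ x_true
x_hat = admm_lasso(A, b, lam=0.05)
```

For the regularized MSVMs in the paper, the same pattern applies with the classification loss in one block and the regularizer in the other; only the two subproblem solvers change.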
HIPAD - A Hybrid Interior-Point Alternating Direction algorithm for knowledge-based SVM and feature selection
Qin, Zhiwei, Tang, Xiaocheng, Akrotirianakis, Ioannis, Chakraborty, Amit
We consider classification tasks in the regime of scarce labeled training data in high dimensional feature space, where specific expert knowledge is also available. We propose a new hybrid optimization algorithm that solves the elastic-net support vector machine (SVM) through an alternating direction method of multipliers in the first phase, followed by an interior-point method for the classical SVM in the second phase. Both SVM formulations are adapted to knowledge incorporation. Our proposed algorithm addresses the challenges of automatic feature selection, high optimization accuracy, and algorithmic flexibility for taking advantage of prior knowledge. We demonstrate the effectiveness and efficiency of our algorithm and compare it with existing methods on a collection of synthetic and real-world data.
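The two-phase structure can be sketched as follows. This is a simplified stand-in, not the HIPAD algorithm: phase 1 here solves an elastic-net squared-hinge SVM by proximal gradient (the paper uses ADMM) to select features, and phase 2 refits a plain $\ell_2$ SVM on the selected support by gradient descent (the paper uses an interior-point method). All regularization parameters, step sizes, and iteration counts are illustrative, and no expert-knowledge terms are included.

```python
import numpy as np

def phase1_screen(X, y, lam1=0.05, lam2=0.1, step=0.01, iters=500):
    """Phase 1 stand-in: elastic-net squared-hinge SVM via proximal gradient,
    used only to select features (the nonzero coordinates of w)."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(iters):
        m = y * (X @ w)
        # Gradient of the squared hinge plus the l2 part of the elastic net.
        g = X.T @ (np.where(m < 1, 2 * (m - 1), 0.0) * y) / n + lam2 * w
        v = w - step * g
        w = np.sign(v) * np.maximum(np.abs(v) - step * lam1, 0.0)  # l1 prox
    return np.flatnonzero(np.abs(w) > 1e-8)

def phase2_refit(X, y, support, step=0.01, iters=500):
    """Phase 2 stand-in: re-solve an l2-regularized SVM on the selected
    features (gradient descent here; the paper uses an interior-point method)."""
    Xs = X[:, support]
    w = np.zeros(len(support))
    for _ in range(iters):
        m = y * (Xs @ w)
        g = Xs.T @ (np.where(m < 1, 2 * (m - 1), 0.0) * y) / len(y) + 0.01 * w
        w -= step * g
    return w

# Toy data: only features 0 and 1 carry the label.
rng = np.random.default_rng(2)
X = rng.normal(size=(120, 20))
y = np.where(X[:, 0] - X[:, 1] > 0, 1.0, -1.0)
support = phase1_screen(X, y)
w2 = phase2_refit(X, y, support)
```

The design rationale in the abstract maps directly onto this split: the first-order phase is cheap and produces sparsity, while the second phase solves a smaller, dense problem to high accuracy on the surviving features.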