 Regression


Online Multi-Task Learning with Recursive Least Squares and Recursive Kernel Methods

arXiv.org Artificial Intelligence

This paper introduces two novel approaches for online Multi-Task Learning (MTL) regression problems. We employ a high-performance graph-based MTL formulation and develop its recursive versions based on the Weighted Recursive Least Squares (WRLS) and the Online Sparse Least Squares Support Vector Regression (OSLSSVR). Adopting task-stacking transformations, we demonstrate the existence of a single matrix that incorporates the relationship between multiple tasks and provides structural information to be embodied by the MT-WRLS method in its initialization procedure and by the MT-OSLSSVR in its multi-task kernel function. In contrast to the existing literature, which is mostly based on Online Gradient Descent (OGD) or cubic inexact approaches, we achieve exact and approximate recursions with a per-instance cost that is quadratic in the dimension of the input space (MT-WRLS) or in the size of the dictionary of instances (MT-OSLSSVR). We compare our online MTL methods to other contenders in a real-world wind speed forecasting case study, which shows a significant gain in performance for both proposed approaches.
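To make the "quadratic per-instance cost" claim concrete, the following is a minimal sketch of the textbook single-task recursive least squares update with a forgetting factor. It is not the authors' MT-WRLS: the function names `rls_init` and `rls_update`, the initial scale `delta`, and the forgetting factor `lam` are illustrative assumptions, and the multi-task structure matrix described in the abstract is not modelled here.

```python
def rls_init(dim, delta=1e3):
    """Return zero weights and P = delta * I (assumed initialisation)."""
    w = [0.0] * dim
    P = [[delta if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    return w, P


def rls_update(w, P, x, y, lam=1.0):
    """One RLS step for the sample (x, y); every loop is at most O(d^2)."""
    d = len(w)
    # Px = P @ x
    Px = [sum(P[i][j] * x[j] for j in range(d)) for i in range(d)]
    # Gain vector k = P x / (lam + x^T P x)
    denom = lam + sum(x[i] * Px[i] for i in range(d))
    k = [v / denom for v in Px]
    # A-priori prediction error on the new sample
    err = y - sum(w[i] * x[i] for i in range(d))
    w_new = [w[i] + k[i] * err for i in range(d)]
    # P <- (P - k (P x)^T) / lam, using the symmetry of P
    P_new = [[(P[i][j] - k[i] * Px[j]) / lam for j in range(d)] for i in range(d)]
    return w_new, P_new


def fit_stream(samples, dim, lam=1.0):
    """Feed (x, y) pairs one at a time and return the final weights."""
    w, P = rls_init(dim)
    for x, y in samples:
        w, P = rls_update(w, P, x, y, lam)
    return w
```

Each update touches the d-by-d matrix `P` a constant number of times and never inverts a matrix, which is where the quadratic cost in the input dimension comes from.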


Investigation on Machine Learning Based Approaches for Estimating the Critical Temperature of Superconductors

arXiv.org Artificial Intelligence

Superconductors have been among the most fascinating substances: the fundamental concept of superconductivity and the correlation between critical temperature and superconductive materials have been the focus of extensive investigation since their discovery. However, superconductors at normal temperatures have yet to be identified. Additionally, there are still many unknown factors and gaps in understanding regarding this unique phenomenon, particularly the connection between superconductivity and the fundamental criteria for estimating the critical temperature. To bridge the gap, numerous machine learning techniques have been developed to estimate critical temperatures, which are extremely challenging to determine. Furthermore, various machine learning approaches strongly emphasize the need for a sophisticated and feasible method for determining the temperature range that goes beyond the scope of the standard empirical formula. This paper uses a stacking machine learning approach, trained on the complex characteristics of superconductive materials, to accurately predict critical temperatures. Compared with previously available studies, this model demonstrated promising performance, with an RMSE of 9.68 and an R² score of 0.922. The findings presented here could shed new light on the efficient implementation of the stacking ensemble method with hyperparameter optimization (HPO).
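For readers unfamiliar with the two metrics quoted in the abstract above, here is a small sketch of how RMSE and the R² score are conventionally computed. The helper names `rmse` and `r2_score` and the toy numbers are illustrative assumptions; they are unrelated to the paper's data or code.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between targets and predictions."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy values only, not taken from the paper.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 5.0]
print(rmse(y_true, y_pred), r2_score(y_true, y_pred))
```

RMSE is expressed in the units of the target, while R² is unitless and measures the fraction of target variance explained by the predictions.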


Mapping Computer Science Research: Trends, Influences, and Predictions

arXiv.org Artificial Intelligence

This paper explores the current trending research areas in the field of Computer Science (CS) and investigates the factors contributing to their emergence. Leveraging a comprehensive dataset comprising papers, citations, and funding information, we employ machine learning techniques, including Decision Tree and Logistic Regression models, to predict trending research areas. Our analysis reveals that the number of references cited in research papers (Reference Count) plays a pivotal role in determining trending research areas, making it the most relevant factor driving trends in the CS field. Additionally, the influence of NSF grants and patents on trending topics has increased over time. The Logistic Regression model outperforms the Decision Tree model in predicting trends, exhibiting higher accuracy, precision, recall, and F1 score. By surpassing a random guess baseline, our data-driven approach demonstrates higher accuracy and efficacy in identifying trending research areas. The results offer valuable insights into the trending research areas, providing researchers and institutions with a data-driven foundation for decision-making and future research direction.
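The four metrics used in the abstract above to compare the two classifiers can be sketched as follows for a binary "trending / not trending" label. The function name `binary_metrics` and the toy labels are illustrative assumptions, not the paper's evaluation code.

```python
def binary_metrics(y_true, y_pred):
    """Return accuracy, precision, recall and F1 for 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels only: 1 = trending, 0 = not trending.
print(binary_metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]))
```

Reporting all four together matters when trending areas are rare, since accuracy alone can look high for a model that rarely predicts the positive class.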


Robust Linear Regression: Phase-Transitions and Precise Tradeoffs for General Norms

arXiv.org Artificial Intelligence

In this paper, we investigate the impact of test-time adversarial attacks on linear regression models and determine the optimal level of robustness that any model can reach while maintaining a given level of standard predictive performance (accuracy). Through quantitative estimates, we uncover fundamental tradeoffs between adversarial robustness and accuracy in different regimes. We obtain a precise characterization which distinguishes between regimes where robustness is achievable without hurting standard accuracy and regimes where a tradeoff might be unavoidable. Our findings are empirically confirmed with simple experiments that represent a variety of settings. This work applies to feature covariance matrices and attack norms of any nature, and extends beyond previous works in this area.


Copula for Instance-wise Feature Selection and Ranking

arXiv.org Artificial Intelligence

Instance-wise feature selection and ranking methods can achieve a good selection of task-friendly features for each sample in the context of neural networks. However, existing approaches that assume feature subsets to be independent are imperfect when considering the dependency between features. To address this limitation, we propose ...

The identification of feature correlations can minimize the redundancy of features. Yet, in the literature of instance-wise feature selection and ranking methods [Chen et al., 2018, Yoon et al., 2019, Abid et al., 2019, Masoomi et al., 2020, Wu and Liu, 2018] that follow the context of neural networks, the dependencies between features have not been considered manifestly. For instance, L2X [Chen et al., 2018] performs a feature selection for maximizing the mutual information between selected feature subsets and corresponding outputs.


Investigating the Learning Behaviour of In-context Learning: A Comparison with Supervised Learning

arXiv.org Artificial Intelligence

Large language models (LLMs) have shown remarkable capacity for in-context learning (ICL), where learning a new task from just a few training examples is done without being explicitly pre-trained. However, despite the success of LLMs, there has been little understanding of how ICL learns the knowledge from the given prompts. In this paper, to make progress toward understanding the learning behaviour of ICL, we train the same LLMs with the same demonstration examples via ICL and supervised learning (SL), respectively, and investigate their performance under label perturbations (i.e., noisy labels and label imbalance) on a range of classification tasks. First, via extensive experiments, we find that gold labels have significant impacts on the downstream in-context performance, especially for large language models; however, imbalanced labels matter little to ICL across all model sizes.

However, despite the advantages of ICL, it is still unclear how ICL learns knowledge from the given prompts without updating its model parameters. Preliminary research [1, 11] compared ICL with simple machine learning models, such as logistic regression and shallow neural networks. In this paper, we take a further step and investigate the learning behaviour differences between ICL and supervised learning (SL). Specifically, we train three LLMs with the same training data via in-context learning and supervised learning separately and analyze their generated outputs. While SL is a well-established approach that uses labelled data to train models to make accurate predictions, ICL takes a different approach by leveraging the context of the text to learn from unlabeled data in order to improve the accuracy of the predictions. By comparing the performance of ICL and SL, we gain


A Unified Analysis of Multi-task Functional Linear Regression Models with Manifold Constraint and Composite Quadratic Penalty

arXiv.org Machine Learning

This work studies the multi-task functional linear regression models where both the covariates and the unknown regression coefficients (called slope functions) are curves. For slope function estimation, we employ penalized splines to balance bias, variance, and computational complexity. The power of multi-task learning is brought in by imposing additional structures over the slope functions. We propose a general model with double regularization over the spline coefficient matrix: i) a matrix manifold constraint, and ii) a composite penalty as a summation of quadratic terms. Many multi-task learning approaches can be treated as special cases of this proposed model, such as a reduced-rank model and a graph Laplacian regularized model. We show the composite penalty induces a specific norm, which helps to quantify the manifold curvature and determine the corresponding proper subset in the manifold tangent space. The complexity of tangent space subset is then bridged to the complexity of geodesic neighbor via generic chaining. A unified convergence upper bound is obtained and specifically applied to the reduced-rank model and the graph Laplacian regularized model. The phase transition behaviors for the estimators are examined as we vary the configurations of model parameters.


Beam Detection Based on Machine Learning Algorithms

arXiv.org Artificial Intelligence

The free electron laser (FEL) at Stanford Linear Accelerator Center (SLAC) is an ultra-fast X-ray laser. As one of the most advanced X-ray light sources [5] [6], it is famous for its high brightness and short pulse duration: it is 10 billion times brighter than the world's second brightest light source, and its pulse duration is several tens of femtoseconds. It plays a pivotal role in both fundamental science research and applied research [6]. The mechanism behind this laser is very delicate [5], so keeping the laser in optimal working condition is challenging. The positions of the electron beams and the laser beams are of fundamental importance in the control and maintenance of this FEL. Currently, the task of locating beam spots depends heavily on human labor. This is mainly attributed to the wide variety of beam spots and the presence of strong noise, as demonstrated in Figure 1, where the white square marks the boundary of the beam spot. Each picture requires a long sequence of signal processing methods to mark the beam position.


Best-Subset Selection in Generalized Linear Models: A Fast and Consistent Algorithm via Splicing Technique

arXiv.org Artificial Intelligence

In high-dimensional generalized linear models, it is crucial to identify a sparse model that adequately accounts for response variation. Although best subset selection has been widely regarded as the Holy Grail of problems of this type, achieving either computational efficiency or statistical guarantees is challenging. In this article, we intend to surmount this obstacle by utilizing a fast algorithm to select the best subset with high certainty. We propose and illustrate an algorithm for best subset recovery under regularity conditions. Under mild conditions, the computational complexity of our algorithm scales polynomially with sample size and dimension. In addition to demonstrating the statistical properties of our method, extensive numerical experiments reveal that it outperforms existing methods for variable selection and coefficient estimation. The runtime analysis shows that our implementation achieves approximately a fourfold speedup compared to popular variable selection toolkits like glmnet and ncvreg.


Pupil Learning Mechanism

arXiv.org Artificial Intelligence

Studies on artificial neural networks rarely address both vanishing gradients and overfitting issues. In this study, we follow the pupil learning procedure, which has the features of interpreting, picking, understanding, cramming, and organizing, to derive the pupil learning mechanism (PLM), which modifies the network structure and weights of 2-layer neural networks (2LNNs). The PLM consists of modules for sequential learning, adaptive learning, perfect learning, and less-overfitted learning. Using a copper price forecasting dataset, we conduct one experiment to validate the PLM module design and another to evaluate the performance of the PLM. The empirical results support the PLM module design and show the superiority of the proposed PLM model over the linear regression model and the conventional backpropagation-based 2LNN model.