AITopics

2303.16535

Country:

Asia > Japan > Honshū > Tōhoku > Iwate Prefecture > Morioka (0.04)
Europe > Finland > Uusimaa > Helsinki (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
(6 more...)

Genre:

Overview (1.00)
Research Report > New Finding (0.46)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.93)

Sasai, Takeyuki, Fujisawa, Hironori

Outlier Robust and Sparse Estimation of Linear Regression Coefficients

arXiv.org Machine LearningSep-5-2023

We consider outlier-robust and sparse estimation of linear regression coefficients, when the covariates and the noises are contaminated by adversarial outliers and noises are sampled from a heavy-tailed distribution. Our results present sharper error bounds under weaker assumptions than prior studies that share similar interests with this study. Our analysis relies on some sharp concentration inequalities resulting from generic chaining.

artificial intelligence, machine learning, proposition 5, (19 more...)

2208.11592

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.70)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Robust penalized least squares of depth trimmed residuals regression for high-dimensional data

Zuo, Yijun

Challenges with data in the big-data era include (i) the dimension $p$ is often larger than the sample size $n$ (ii) outliers or contaminated points are frequently hidden and more difficult to detect. Challenge (i) renders most conventional methods inapplicable. Thus, it attracts tremendous attention from statistics, computer science, and bio-medical communities. Numerous penalized regression methods have been introduced as modern methods for analyzing high-dimensional data. Disproportionate attention has been paid to the challenge (ii) though. Penalized regression methods can do their job very well and are expected to handle the challenge (ii) simultaneously. Most of them, however, can break down by a single outlier (or single adversary contaminated point) as revealed in this article. The latter systematically examines leading penalized regression methods in the literature in terms of their robustness, provides quantitative assessment, and reveals that most of them can break down by a single outlier. Consequently, a novel robust penalized regression method based on the least sum of squares of depth trimmed residuals is proposed and studied carefully. Experiments with simulated and real data reveal that the newly proposed method can outperform some leading competitors in estimation and prediction accuracy in the cases considered.

artificial intelligence, data mining, machine learning, (19 more...)

2309.01666

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Michigan > Ingham County > Lansing (0.04)
North America > United States > Michigan > Ingham County > East Lansing (0.04)
(3 more...)

Genre: Research Report (0.64)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.93)
Health & Medicine > Therapeutic Area > Oncology (0.67)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Heitmeier, Maria, Chuang, Yu-Ying, Baayen, R. Harald

How trial-to-trial learning shapes mappings in the mental lexicon: Modelling Lexical Decision with Linear Discriminative Learning

arXiv.org Artificial IntelligenceSep-4-2023

Trial-to-trial effects have been found in a number of studies, indicating that processing a stimulus influences responses in subsequent trials. A special case are priming effects which have been modelled successfully with error-driven learning (Marsolek, 2008), implying that participants are continuously learning during experiments. This study investigates whether trial-to-trial learning can be detected in an unprimed lexical decision experiment. We used the Discriminative Lexicon Model (DLM; Baayen et al., 2019), a model of the mental lexicon with meaning representations from distributional semantics, which models error-driven incremental learning with the Widrow-Hoff rule. We used data from the British Lexicon Project (BLP; Keuleers et al., 2012) and simulated the lexical decision experiment with the DLM on a trial-by-trial basis for each subject individually. Then, reaction times were predicted with Generalised Additive Models (GAMs), using measures derived from the DLM simulations as predictors. We extracted measures from two simulations per subject (one with learning updates between trials and one without), and used them as input to two GAMs. Learning-based models showed better model fit than the non-learning ones for the majority of subjects. Our measures also provide insights into lexical processing and individual differences. This demonstrates the potential of the DLM to model behavioural data and leads to the conclusion that trial-to-trial learning can indeed be detected in unprimed lexical decision. Our results support the possibility that our lexical knowledge is subject to continuous changes.

nonword, predictor, reaction time, (15 more...)

arXiv.org Artificial Intelligence

2207.0043

Country:

Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)
Africa > Kenya > Mandera County > Mandera (0.04)
North America > United States > Pennsylvania (0.04)
(5 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Education (0.67)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
(3 more...)

Mirzaeifard, Reza, Venkategowda, Naveen K. D., Gogineni, Vinay Chakravarthi, Werner, Stefan

Smoothing ADMM for Sparse-Penalized Quantile Regression with Non-Convex Penalties

This paper investigates quantile regression in the presence of non-convex and non-smooth sparse penalties, such as the minimax concave penalty (MCP) and smoothly clipped absolute deviation (SCAD). The non-smooth and non-convex nature of these problems often leads to convergence difficulties for many algorithms. While iterative techniques like coordinate descent and local linear approximation can facilitate convergence, the process is often slow. This sluggish pace is primarily due to the need to run these approximation techniques until full convergence at each step, a requirement we term as a \emph{secondary convergence iteration}. To accelerate the convergence speed, we employ the alternating direction method of multipliers (ADMM) and introduce a novel single-loop smoothing ADMM algorithm with an increasing penalty parameter, named SIAD, specifically tailored for sparse-penalized quantile regression. We first delve into the convergence properties of the proposed SIAD algorithm and establish the necessary conditions for convergence. Theoretically, we confirm a convergence rate of $o\big({k^{-\frac{1}{4}}}\big)$ for the sub-gradient bound of augmented Lagrangian. Subsequently, we provide numerical results to showcase the effectiveness of the SIAD algorithm. Our findings highlight that the SIAD method outperforms existing approaches, providing a faster and more stable solution for sparse-penalized quantile regression.

algorithm, artificial intelligence, machine learning, (16 more...)

2309.03094

Country:

Europe > Sweden > Östergötland County > Linköping (0.04)
Europe > Norway > Central Norway > Trøndelag > Trondheim (0.04)
Europe > Denmark > Southern Denmark (0.04)

Genre: Research Report > New Finding (0.48)

Industry: Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Data Science (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Mendes, Eduardo F., Pinto, Gabriel J. P.

Generalized Information Criteria for Structured Sparse Models

Regularized m-estimators are widely used due to their ability of recovering a low-dimensional model in high-dimensional scenarios. Some recent efforts on this subject focused on creating a unified framework for establishing oracle bounds, and deriving conditions for support recovery. Under this same framework, we propose a new Generalized Information Criteria (GIC) that takes into consideration the sparsity pattern one wishes to recover. We obtain non-asymptotic model selection bounds and sufficient conditions for model selection consistency of the GIC. Furthermore, we show that the GIC can also be used for selecting the regularization parameter within a regularized $m$-estimation framework, which allows practical use of the GIC for model selection in high-dimensional scenarios. We provide examples of group LASSO in the context of generalized linear regression and low rank matrix regression.

artificial intelligence, machine learning, subspace, (17 more...)

2309.01764

Country:

South America > Brazil > São Paulo (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)

Boschi, Tobia, Testa, Lorenzo, Chiaromonte, Francesca, Reimherr, Matthew

FAStEN: an efficient adaptive method for feature selection and estimation in high-dimensional functional regressions

Functional regression analysis is an established tool for many contemporary scientific applications. Regression problems involving large and complex data sets are ubiquitous, and feature selection is crucial for avoiding overfitting and achieving accurate predictions. We propose a new, flexible and ultra-efficient approach to perform feature selection in a sparse high dimensional function-on-function regression problem, and we show how to extend it to the scalar-on-function framework. Our method, called FAStEN, combines functional data, optimization, and machine learning techniques to perform feature selection and parameter estimation simultaneously. We exploit the properties of Functional Principal Components and the sparsity inherent to the Dual Augmented Lagrangian problem to significantly reduce computational cost, and we introduce an adaptive scheme to improve selection accuracy. In addition, we derive asymptotic oracle properties, which guarantee estimation and selection consistency for the proposed FAStEN estimator. Through an extensive simulation study, we benchmark our approach to the best existing competitors and demonstrate a massive gain in terms of CPU time and selection performance, without sacrificing the quality of the coefficients' estimation. The theoretical derivations and the simulation study provide a strong motivation for our approach. Finally, we present an application to brain fMRI data from the AOMIC PIOP1 study.

artificial intelligence, machine learning, regression, (19 more...)

2303.14801

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Europe > Italy (0.04)

Genre: Research Report > Experimental Study (0.68)

Industry: Health & Medicine > Therapeutic Area (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.66)

arXiv.org Artificial IntelligenceSep-2-2023

CausalLM is not optimal for in-context learning

Ding, Nan, Levinboim, Tomer, Wu, Jialin, Goodman, Sebastian, Soricut, Radu

Recent empirical evidence indicates that transformer based in-context learning performs better when using a prefix language model (prefixLM), in which in-context samples can all attend to each other, compared to causal language models (causalLM), which use auto-regressive attention that prohibits in-context samples to attend to future samples. While this result is intuitive, it is not understood from a theoretical perspective. In this paper we take a theoretical approach and analyze the convergence behavior of prefixLM and causalLM under a certain parameter construction. Our analysis shows that both LM types converge to their stationary points at a linear rate, but that while prefixLM converges to the optimal solution of linear regression, causalLM convergence dynamics follows that of an online gradient descent algorithm, which is not guaranteed to be optimal even as the number of samples grows infinitely. We supplement our theoretical claims with empirical experiments over synthetic and real tasks and using various types of transformers. Our experiments verify that causalLM consistently underperforms prefixLM in all settings.

causallm, linear regression, transformer, (14 more...)

arXiv.org Artificial Intelligence

2308.06912

Country: Oceania > Australia > Victoria > Melbourne (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Löfström, Tuwe, Löfström, Helena, Johansson, Ulf, Sönströd, Cecilia, Matela, Rudy

Calibrated Explanations for Regression

arXiv.org Machine LearningSep-1-2023

Artificial Intelligence (AI) is often an integral part of modern decision support systems (DSSs). The best-performing predictive models used in AI-based DSSs lack transparency. Explainable Artificial Intelligence (XAI) aims to create AI systems that can explain their rationale to human users. Local explanations in XAI can provide information about the causes of individual predictions in terms of feature importance. However, a critical drawback of existing local explanation methods is their inability to quantify the uncertainty associated with a feature's importance. This paper introduces an extension of a feature importance explanation method, Calibrated Explanations (CE), previously only supporting classification, with support for standard regression and probabilistic regression, i.e., the probability that the target is above an arbitrary threshold. The extension for regression keeps all the benefits of CE, such as calibration of the prediction from the underlying model with confidence intervals, uncertainty quantification of feature importance, and allows both factual and counterfactual explanations. CE for standard regression provides fast, reliable, stable, and robust explanations. CE for probabilistic regression provides an entirely new way of creating probabilistic explanations from any ordinary regression model and with a dynamic selection of thresholds. The performance of CE for probabilistic regression regarding stability and speed is comparable to LIME. The method is model agnostic with easily understood conditional rules. An implementation in Python is freely available on GitHub and for installation using pip making the results in this paper easily replicable.

explanation, prediction, probability, (16 more...)

2308.16245

Country:

North America > United States > California (0.05)
Europe > Sweden > Jönköping County > Jönköping (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (0.81)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.75)

Dominguez, Alejandro Rodriguez, Shahzad, Muhammad, Hong, Xia

Structured Radial Basis Function Network: Modelling Diversity for Multiple Hypotheses Prediction

arXiv.org Machine LearningSep-1-2023

Multi-modal regression is important in forecasting nonstationary processes or with a complex mixture of distributions. It can be tackled with multiple hypotheses frameworks but with the difficulty of combining them efficiently in a learning model. A Structured Radial Basis Function Network is presented as an ensemble of multiple hypotheses predictors for regression problems. The predictors are regression models of any type that can form centroidal Voronoi tessellations which are a function of their losses during training. It is proved that this structured model can efficiently interpolate this tessellation and approximate the multiple hypotheses target distribution and is equivalent to interpolating the meta-loss of the predictors, the loss being a zero set of the interpolation error. This model has a fixed-point iteration algorithm between the predictors and the centers of the basis functions. Diversity in learning can be controlled parametrically by truncating the tessellation formation with the losses of individual predictors. A closed-form solution with least-squares is presented, which to the authors knowledge, is the fastest solution in the literature for multiple hypotheses and structured predictions. Superior generalization performance and computational efficiency is achieved using only two-layer neural networks as predictors controlling diversity as a key component of success. A gradient-descent approach is introduced which is loss-agnostic regarding the predictors. The expected value for the loss of the structured model with Gaussian basis functions is computed, finding that correlation between predictors is not an appropriate tool for diversification. The experiments show outperformance with respect to the top competitors in the literature.

artificial intelligence, machine learning, predictor, (17 more...)

2309.00781

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Asia > Middle East > Jordan (0.04)
North America > United States > New York > New York County > New York City (0.04)
(4 more...)

Genre: Research Report (0.82)

Industry: Banking & Finance (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.35)