AITopics

doi: 10.13140/RG.2.2.29687.19365

2305.0812

Country: Asia > India (0.04)

Genre: Research Report > New Finding (0.69)

Industry:

Information Technology > Services (0.68)
Information Technology > Security & Privacy (0.68)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.49)

Jorgensen, Palle E. T., Song, Myung-Sin, Tian, James

Conditional mean embeddings and optimal feature selection via positive definite kernels

arXiv.org Artificial Intelligence, May-14-2023

Motivated by applications, we consider here new operator theoretic approaches to conditional mean embeddings (CME). Our present results combine a spectral analysis-based optimization scheme with the use of kernels, stochastic processes, and constructive learning algorithms. For initially given non-linear data, we consider optimization-based feature selections. This entails the use of convex sets of positive definite (p.d.) kernels in a construction of optimal feature selection via regression algorithms from learning models. Thus, with initial inputs of training data (for a suitable learning algorithm), each choice of p.d. kernel $K$ in turn yields a variety of Hilbert spaces and realizations of features. A novel idea here is that we shall allow an optimization over selected sets of kernels $K$ from a convex set $C$ of positive definite kernels $K$. Hence our "optimal" choices of feature representations will depend on a secondary optimization over p.d. kernels $K$ within a specified convex set $C$.

artificial intelligence, kernel, machine learning, (17 more...)

2305.081

Country:

North America > United States > Iowa > Johnson County > Iowa City (0.14)
North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > United States > New Jersey > Bergen County > Hackensack (0.04)

Genre: Research Report (0.84)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)

Fahrbach, Matthew, Fu, Thomas, Ghadiri, Mehrdad

Subquadratic Kronecker Regression with Applications to Tensor Decomposition

arXiv.org Artificial Intelligence, May-12-2023

Kronecker regression is a highly-structured least squares problem $\min_{\mathbf{x}} \lVert \mathbf{K}\mathbf{x} - \mathbf{b} \rVert_{2}^2$, where the design matrix $\mathbf{K} = \mathbf{A}^{(1)} \otimes \cdots \otimes \mathbf{A}^{(N)}$ is a Kronecker product of factor matrices. This regression problem arises in each step of the widely-used alternating least squares (ALS) algorithm for computing the Tucker decomposition of a tensor. We present the first subquadratic-time algorithm for solving Kronecker regression to a $(1+\varepsilon)$-approximation that avoids the exponential term $O(\varepsilon^{-N})$ in the running time. Our techniques combine leverage score sampling and iterative methods. By extending our approach to block-design matrices where one block is a Kronecker product, we also achieve subquadratic-time algorithms for (1) Kronecker ridge regression and (2) updating the factor matrices of a Tucker decomposition in ALS, which is not a pure Kronecker regression problem, thereby improving the running time of all steps of Tucker ALS. We demonstrate the speed and accuracy of this Kronecker regression algorithm on synthetic data and real-world image tensors.
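
For orientation, the structure this abstract exploits can be seen in the exact two-factor case ($N = 2$). The identity below is the standard pseudoinverse property of Kronecker products, not the sampling-based algorithm of the paper; the symbols $\mathbf{M}$ and $\mathbf{X}$ (the right-hand side and the unknown reshaped as matrices, with column-stacking $\operatorname{vec}$) are introduced here only for illustration.

$$
\begin{aligned}
(\mathbf{A}^{(1)} \otimes \mathbf{A}^{(2)})\,\operatorname{vec}(\mathbf{X}) &= \operatorname{vec}\!\left(\mathbf{A}^{(2)}\,\mathbf{X}\,\mathbf{A}^{(1)\top}\right), \\
(\mathbf{A}^{(1)} \otimes \mathbf{A}^{(2)})^{+} &= \mathbf{A}^{(1)+} \otimes \mathbf{A}^{(2)+}, \\
\mathbf{x}^{\star} = (\mathbf{A}^{(1)+} \otimes \mathbf{A}^{(2)+})\,\mathbf{b} &= \operatorname{vec}\!\left(\mathbf{A}^{(2)+}\,\mathbf{M}\,\mathbf{A}^{(1)+\top}\right), \qquad \mathbf{b} = \operatorname{vec}(\mathbf{M}).
\end{aligned}
$$

The minimum-norm least squares solution can therefore be computed from the factor pseudoinverses without ever forming $\mathbf{K}$, although this exact solve still touches every entry of $\mathbf{b}$.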

artificial intelligence, machine learning, matrix, (18 more...)

2209.04876

Country:

Africa > Senegal > Kolda Region > Kolda (0.04)
North America > United States > Virginia (0.04)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)

Santos, Milene Regina dos, Izbicki, Rafael

Expertise-based Weighting for Regression Models with Noisy Labels

arXiv.org Artificial Intelligence, May-12-2023

Regression methods assume that accurate labels are available for training. However, in certain scenarios, obtaining accurate labels may not be feasible, and relying on multiple specialists with differing opinions becomes necessary. Existing approaches addressing noisy labels often impose restrictive assumptions on the regression function. In contrast, this paper presents a novel, more flexible approach. Our method consists of two steps: estimating each labeler's expertise and combining their opinions using learned weights. We then regress the weighted average against the input features to build the prediction model. The proposed method is formally justified and empirically demonstrated to outperform existing techniques on simulated and real data. Furthermore, its flexibility enables the utilization of any machine learning technique in both steps. In summary, this method offers a simple, fast, and effective solution for training regression models with noisy labels derived from diverse expert opinions.

artificial intelligence, dependent variable, machine learning, (17 more...)

2305.0743

Country:

North America > United States > California > Alameda County > Berkeley (0.14)
Asia (0.04)
South America > Brazil > São Paulo (0.04)

Genre: Research Report (0.82)

Industry: Health & Medicine (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Wycoff, Nathan, Arab, Ali, Donato, Katharine M., Singh, Lisa O.

Sparse Bayesian Lasso via a Variable-Coefficient $\ell_1$ Penalty

arXiv.org Machine Learning, May-12-2023

Modern statistical learning algorithms are capable of amazing flexibility, but struggle with interpretability. One possible solution is sparsity: making inference such that many of the parameters are estimated as being identically 0, which may be imposed through the use of nonsmooth penalties such as the $\ell_1$ penalty. However, the $\ell_1$ penalty introduces significant bias when high sparsity is desired. In this article, we retain the $\ell_1$ penalty, but define learnable penalty weights $\lambda_p$ endowed with hyperpriors. We start the article by investigating the optimization problem this poses, developing a proximal operator associated with the $\ell_1$ norm. We then study the theoretical properties of this variable-coefficient $\ell_1$ penalty in the context of penalized likelihood. Next, we investigate application of this penalty to Variational Bayes, developing a model we call the Sparse Bayesian Lasso which allows for behavior qualitatively like Lasso regression to be applied to arbitrary variational models. In simulation studies, this gives us the Uncertainty Quantification and low bias properties of simulation-based approaches with an order of magnitude less computation. Finally, we apply our methodology to a Bayesian lagged spatiotemporal regression model of internal displacement that occurred during the Iraqi Civil War of 2013-2017.
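
As background for the proximal operator mentioned above, here is a minimal sketch of the classical soft-thresholding operator for a weighted $\ell_1$ penalty with fixed per-coordinate weights. This is the textbook fixed-weight case, not the variable-coefficient operator developed in the paper, where the weights $\lambda_p$ are themselves learned. The function names are hypothetical.

```python
def soft_threshold(x, lam):
    """Proximal operator of lam * |x| for a scalar x and fixed lam >= 0."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    # Values within [-lam, lam] are set exactly to zero, which produces sparsity.
    return 0.0


def prox_weighted_l1(xs, lams):
    """Apply soft-thresholding coordinate-wise with fixed weights lams[p].

    Assumption: the weights are given constants here; the paper instead
    treats them as learnable parameters with hyperpriors.
    """
    return [soft_threshold(x, lam) for x, lam in zip(xs, lams)]
```

For example, `prox_weighted_l1([3.0, -0.5, -2.0], [1.0, 1.0, 0.5])` shrinks the first and third coordinates toward zero and sets the second exactly to zero. Every surviving coordinate is shifted by its weight, which is the bias the abstract refers to.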

artificial intelligence, bayesian inference, machine learning, (18 more...)

2211.05089

Country:

Asia > Middle East > Iraq > Baghdad Governorate > Baghdad (0.04)
North America > United States (0.04)
North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)

Genre: Research Report (0.81)

Industry: Government > Immigration & Customs (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)

A data-driven rutting depth short-time prediction model with metaheuristic optimization for asphalt pavements based on RIOHTrack

Li, Zhuoxuan, Korovin, Iakov, Shi, Xinli, Gorbachev, Sergey, Gorbacheva, Nadezhda, Huang, Wei, Cao, Jinde

Rutting of asphalt pavements is a crucial design criterion in various pavement design guides. A sound road base helps secure the road transportation of oil and gas. This study attempts to develop a robust artificial intelligence model to estimate the rutting depth of different asphalt pavements, with rutting depth clips, temperature, and load axes as primary characteristics. The experiment data were obtained from 19 asphalt pavements with different crude oil sources on a 2.038 km long full-scale field accelerated pavement test track (RIOHTrack, Road Track Institute) in Tongzhou, Beijing. In addition, this paper proposes to build complex networks from the rutting depths of different pavements using complex network methods and the Louvain algorithm for community detection. The most critical structural elements can be selected from different asphalt pavement rutting data, and similar structural elements can be found. An extreme learning machine algorithm with residual correction (RELM) is designed and optimized using an independent adaptive particle swarm algorithm. The experimental results of the proposed method are compared with several classical machine learning algorithms, with the Average Root Mean Squared Error, Average Mean Absolute Error, and Average Mean Absolute Percentage Error over the 19 asphalt pavements reaching 1.742, 1.363, and 1.94%, respectively. The experiments demonstrate that the RELM algorithm has an advantage over classical machine learning methods in dealing with non-linear problems in road engineering. Notably, the method ensures the adaptation of the simulated environment to different levels of abstraction through the cognitive analysis of the production environment parameters.
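
The three error measures reported above have standard definitions, sketched below for a single series of predictions. The function names are hypothetical, and how the paper averages across the 19 pavements is an assumption (presumably the mean of the per-pavement values).

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent.

    Assumes no true value is zero, since each error is divided by it.
    """
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)
```

RMSE and MAE are expressed in the units of the target (here rutting depth), while MAPE is unit-free, which is why only the third reported figure carries a percent sign.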

artificial intelligence, evolutionary algorithm, machine learning, (20 more...)

doi: 10.1109/JAS.2023.123192

2305.06707

Country:

Europe (0.93)
Asia > China > Beijing > Beijing (0.24)

Genre:

Research Report > New Finding (0.68)
Research Report > Experimental Study (0.66)

Industry:

Transportation > Ground > Road (1.00)
Energy > Oil & Gas > Upstream (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Jeanselme, Vincent, Yoon, Chang Ho, Tom, Brian, Barrett, Jessica

Neural Fine-Gray: Monotonic neural networks for competing risks

Time-to-event modelling, known as survival analysis, differs from standard regression as it addresses censoring in patients who do not experience the event of interest. Despite competitive performances in tackling this problem, machine learning methods often ignore other competing risks that preclude the event of interest. This practice biases the survival estimation. Extensions to address this challenge often rely on parametric assumptions or numerical estimations leading to sub-optimal survival approximations. This paper leverages constrained monotonic neural networks to model each competing survival distribution. This modelling choice ensures the exact likelihood maximisation at a reduced computational cost by using automatic differentiation. The effectiveness of the solution is demonstrated on one synthetic and three medical datasets. Finally, we discuss the implications of considering competing risks when developing risk scores for medical practice.

artificial intelligence, machine learning, neuralfg 0, (16 more...)

2305.06703

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report > Experimental Study (0.67)

Industry: Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)

Enhancing Robustness of Gradient-Boosted Decision Trees through One-Hot Encoding and Regularization

Cui, Shijie, Sudjianto, Agus, Zhang, Aijun, Li, Runze

Gradient-boosted decision trees (GBDT) are a widely used and highly effective machine learning approach for tabular data modeling. However, their complex structure may lead to low robustness against small covariate perturbations in unseen data. In this study, we apply one-hot encoding to convert a GBDT model into a linear framework by encoding each tree leaf as one dummy variable. This allows for the use of linear regression techniques, plus a novel risk decomposition for assessing the robustness of a GBDT model against covariate perturbations. We propose to enhance the robustness of GBDT models by refitting their linear regression forms with $L_1$ or $L_2$ regularization. Theoretical results are obtained about the effect of regularization on the model performance and robustness. It is demonstrated through numerical experiments that the proposed regularization approach can enhance the robustness of the one-hot-encoded GBDT models.
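
The leaf-to-dummy encoding described above can be illustrated with a toy ensemble. The sketch below uses two hand-written depth-1 trees with made-up thresholds and leaf values (all hypothetical, and ignoring any learning rate or initial offset) and checks that the sum of tree outputs equals a linear model in the one-hot leaf indicators. The regularized refit proposed in the paper is not shown.

```python
# Each hypothetical stump: (feature index, threshold, left leaf value, right leaf value).
TREES = [
    (0, 0.5, -1.0, 2.0),
    (1, 1.5, 0.25, -0.75),
]


def leaf_index(tree, x):
    """Return 0 for the left leaf, 1 for the right leaf."""
    feature, threshold, _, _ = tree
    return 0 if x[feature] <= threshold else 1


def ensemble_predict(x):
    """Usual additive prediction: sum the value of the leaf reached in each tree."""
    return sum(tree[2 + leaf_index(tree, x)] for tree in TREES)


def one_hot_leaves(x):
    """One dummy variable per leaf, concatenated over all trees."""
    row = []
    for tree in TREES:
        onehot = [0.0, 0.0]
        onehot[leaf_index(tree, x)] = 1.0
        row.extend(onehot)
    return row


# The leaf values, in the same order as the dummies, act as linear coefficients.
COEFFS = [value for tree in TREES for value in tree[2:]]


def linear_predict(x):
    """Linear form of the same model: dot product of coefficients and dummies."""
    return sum(c * d for c, d in zip(COEFFS, one_hot_leaves(x)))
```

Because the two prediction functions agree on every input, the coefficients can in principle be re-estimated by penalized linear regression on the dummy matrix, which is the step the abstract proposes for improving robustness.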

artificial intelligence, machine learning, robustness, (17 more...)

2304.13761

Country:

North America > United States > California (0.04)
North America > United States > Pennsylvania > Centre County > University Park (0.04)
North America > United States > North Carolina > Mecklenburg County > Charlotte (0.04)

Genre: Research Report > New Finding (0.48)

Industry: Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.89)

Schwienhorst, Benedikt Lütke, Kock, Lucas, Nott, David J., Klein, Nadja

Dropout Regularization in Extended Generalized Linear Models based on Double Exponential Families

Even though dropout is a popular regularization technique, its theoretical properties are not fully understood. In this paper we study dropout regularization in extended generalized linear models based on double exponential families, for which the dispersion parameter can vary with the features. A theoretical analysis shows that dropout regularization prefers rare but important features in both the mean and dispersion, generalizing an earlier result for conventional generalized linear models. Training is performed using stochastic gradient descent with an adaptive learning rate. To illustrate, we apply dropout to adaptive smoothing with B-splines, where both the mean and dispersion parameters are modelled flexibly. The important B-spline basis functions can be thought of as rare features, and we confirm in experiments that dropout is an effective form of regularization for mean and dispersion parameters that improves on a penalized maximum likelihood approach with an explicit smoothness penalty.

artificial intelligence, bayesian inference, machine learning, (18 more...)

2305.06625

Country:

North America > United States > New York (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
Europe > Slovenia > Drava > Municipality of Benedikt > Benedikt (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

da Silva, Giovana D., Silva, Filipi N., de Arruda, Henrique F., Souza, Bárbara C. e, Costa, Luciano da F., Amancio, Diego R.

Using Full-Text Content to Characterize and Identify Best Seller Books

Artistic pieces can be studied from several perspectives, one example being their reception among readers over time. In the present work, we approach this topic from the standpoint of literary works, particularly assessing the task of predicting whether a book will become a best seller. Unlike previous approaches, we focused on the full content of books and considered visualization and classification tasks. We employed visualization for the preliminary exploration of the data structure and properties, involving SemAxis and linear discriminant analyses. Then, to obtain quantitative and more objective results, we employed various classifiers. These approaches were applied to a dataset containing (i) books published from 1895 to 1924 and listed as best sellers by the Publishers Weekly Bestseller Lists and (ii) literary works published in the same period but not mentioned in those lists. Our comparison of methods revealed that the best result, obtained by combining a bag-of-words representation with a logistic regression classifier, led to an average accuracy of 0.75 for both the leave-one-out and 10-fold cross-validations. Such an outcome suggests that it is unfeasible to predict the success of books with high accuracy using only the full content of the texts. Nevertheless, our findings provide insights into the factors leading to the relative success of a literary work.

artificial intelligence, machine learning, representation, (15 more...)

2210.02334

Country:

South America > Brazil (0.04)
North America > United States > New York (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Media > Publishing (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)