AITopics

2309.02014

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.75)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)

Griebel, Michael, Oswald, Peter

Convergence analysis of online algorithms for vector-valued kernel regression

arXiv.org Machine LearningSep-14-2023

We consider the problem of approximating the regression function from noisy vector-valued data by an online learning algorithm using an appropriate reproducing kernel Hilbert space (RKHS) as prior. In an online algorithm, i.i.d. samples become available one by one by a random process and are successively processed to build approximations to the regression function. We are interested in the asymptotic performance of such online approximation algorithms and show that the expected squared error in the RKHS norm can be bounded by $C^2 (m+1)^{-s/(2+s)}$, where $m$ is the current number of processed data, the parameter $0

algorithm, artificial intelligence, machine learning, (18 more...)

2309.07779

Country:

North America > United States > New York (0.04)
Europe > Germany > North Rhine-Westphalia > Cologne Region > Bonn (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report (0.64)

Industry: Education (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.94)

Maia, Mateus, Murphy, Keefe, Parnell, Andrew C.

GP-BART: a novel Bayesian additive regression trees approach using Gaussian processes

arXiv.org Machine LearningSep-14-2023

The Bayesian additive regression trees (BART) model is an ensemble method extensively and successfully used in regression tasks due to its consistently strong predictive performance and its ability to quantify uncertainty. BART combines "weak" tree models through a set of shrinkage priors, whereby each tree explains a small portion of the variability in the data. However, the lack of smoothness and the absence of an explicit covariance structure over the observations in standard BART can yield poor performance in cases where such assumptions would be necessary. The Gaussian processes Bayesian additive regression trees (GP-BART) model is an extension of BART which addresses this limitation by assuming Gaussian process (GP) priors for the predictions of each terminal node among all trees. The model's effectiveness is demonstrated through applications to simulated and real-world data, surpassing the performance of traditional modeling approaches in various scenarios.

artificial intelligence, gp-bart, machine learning, (17 more...)

2204.02112

Country:

Europe > Ireland (0.14)
Oceania > New Zealand > North Island > Auckland Region > Auckland (0.04)
Oceania > Australia > Australian Capital Territory > Canberra (0.04)
(2 more...)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

arXiv.org Artificial IntelligenceSep-13-2023

Remote Inference of Cognitive Scores in ALS Patients Using a Picture Description

Agurto, Carla, Cecchi, Guillermo, Wen, Bo, Fraenkel, Ernest, Berry, James, Navar, Indu, Norel, Raquel

Amyotrophic lateral sclerosis is a fatal disease that not only affects movement, speech, and breath but also cognition. Recent studies have focused on the use of language analysis techniques to detect ALS and infer scales for monitoring functional progression. In this paper, we focused on another important aspect, cognitive impairment, which affects 35-50% of the ALS population. In an effort to reach the ALS population, which frequently exhibits mobility limitations, we implemented the digital version of the Edinburgh Cognitive and Behavioral ALS Screen (ECAS) test for the first time. This test which is designed to measure cognitive impairment was remotely performed by 56 participants from the EverythingALS Speech Study. As part of the study, participants (ALS and non-ALS) were asked to describe weekly one picture from a pool of many pictures with complex scenes displayed on their computer at home. We analyze the descriptions performed within +/- 60 days from the day the ECAS test was administered and extract different types of linguistic and acoustic features. We input those features into linear regression models to infer 5 ECAS sub-scores and the total score. Speech samples from the picture description are reliable enough to predict the ECAS subs-scores, achieving statistically significant Spearman correlation values between 0.32 and 0.51 for the model's performance using 10-fold cross-validation.

cognitive score, picture description, remote inference

doi: 10.1109/ICDH60066.2023.00017

2309.06989

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology > Amyotrophic Lateral Sclerosis (ALS) (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.53)

arXiv.org Artificial IntelligenceSep-13-2023

Optimizing Offensive Gameplan in the National Basketball Association with Machine Learning

Mukhopadhyay, Eamon

Throughout the analytical revolution that has occurred in the NBA, the development of specific metrics and formulas has given teams, coaches, and players a new way to see the game. However - the question arises - how can we verify any metrics? One method would simply be eyeball approximation (trying out many different gameplans) and/or trial and error - an estimation-based and costly approach. Another approach is to try to model already existing metrics with a unique set of features using machine learning techniques. The key to this approach is that with these features that are selected, we can try to gauge the effectiveness of these features combined, rather than using individual analysis in simple metric evaluation. If we have an accurate model, it can particularly help us determine the specifics of gameplan execution. In this paper, the statistic ORTG (Offensive Rating, developed by Dean Oliver) was found to have a correlation with different NBA playtypes using both a linear regression model and a neural network regression model, although ultimately, a neural network worked slightly better than linear regression. Using the accuracy of the models as a justification, the next step was to optimize the output of the model with test examples, which would demonstrate the combination of features to best achieve a highly functioning offense.

machine learning, national basketball association, optimizing offensive gameplan

2308.06851

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.93)

arXiv.org Machine LearningSep-12-2023

A Consistent and Scalable Algorithm for Best Subset Selection in Single Index Models

Tang, Borui, Zhu, Jin, Zhu, Junxian, Wang, Xueqin, Zhang, Heping

Analysis of high-dimensional data has led to increased interest in both single index models (SIMs) and best subset selection. SIMs provide an interpretable and flexible modeling framework for high-dimensional data, while best subset selection aims to find a sparse model from a large set of predictors. However, best subset selection in high-dimensional models is known to be computationally intractable. Existing methods tend to relax the selection, but do not yield the best subset solution. In this paper, we directly tackle the intractability by proposing the first provably scalable algorithm for best subset selection in high-dimensional SIMs. Our algorithmic solution enjoys the subset selection consistency and has the oracle property with a high probability. The algorithm comprises a generalized information criterion to determine the support size of the regression coefficients, eliminating the model selection tuning. Moreover, our method does not assume an error distribution or a specific link function and hence is flexible to apply. Extensive simulation results demonstrate that our method is not only computationally efficient but also able to exactly recover the best subset in various settings (e.g., linear regression, Poisson regression, heteroscedastic models).

algorithm, selection, selection consistency, (15 more...)

2309.0623

Country:

Asia > China (0.04)
North America > United States > New York > New York County > New York City (0.04)
Asia > Singapore (0.04)

Genre: Research Report > New Finding (0.66)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.86)

Beachy, Atticus, Bae, Harok, Camberos, Jose, Grandhi, Ramana

Epistemic Modeling Uncertainty of Rapid Neural Network Ensembles for Adaptive Learning

arXiv.org Artificial IntelligenceSep-12-2023

Emulator embedded neural networks, which are a type of physics informed neural network, leverage multi-fidelity data sources for efficient design exploration of aerospace engineering systems. Multiple realizations of the neural network models are trained with different random initializations. The ensemble of model realizations is used to assess epistemic modeling uncertainty caused due to lack of training samples. This uncertainty estimation is crucial information for successful goal-oriented adaptive learning in an aerospace system design exploration. However, the costs of training the ensemble models often become prohibitive and pose a computational challenge, especially when the models are not trained in parallel during adaptive learning. In this work, a new type of emulator embedded neural network is presented using the rapid neural network paradigm. Unlike the conventional neural network training that optimizes the weights and biases of all the network layers by using gradient-based backpropagation, rapid neural network training adjusts only the last layer connection weights by applying a linear regression technique. It is found that the proposed emulator embedded neural network trains near-instantaneously, typically without loss of prediction accuracy. The proposed method is demonstrated on multiple analytical examples, as well as an aerospace flight parameter study of a generic hypersonic vehicle.

adaptive learning, epistemic modeling uncertainty, rapid neural network ensemble

2309.06628

Genre: Research Report (0.40)

Industry:

Aerospace & Defense (1.00)
Transportation > Air (0.53)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.53)

arXiv.org Artificial IntelligenceSep-12-2023

AKEM: Aligning Knowledge Base to Queries with Ensemble Model for Entity Recognition and Linking

Lu, Di, Liang, Zhongping, Yuan, Caixia, Wang, Xiaojie

This paper presents a novel approach to address the Entity Recognition and Linking Challenge at NLPCC 2015. The task involves extracting named entity mentions from short search queries and linking them to entities within a reference Chinese knowledge base. To tackle this problem, we first expand the existing knowledge base and utilize external knowledge to identify candidate entities, thereby improving the recall rate. Next, we extract features from the candidate entities and utilize Support Vector Regression and Multiple Additive Regression Tree as scoring functions to filter the results. Additionally, we apply rules to further refine the results and enhance precision. Our method is computationally efficient and achieves an F1 score of 0.535.

aligning knowledge base, ensemble model, entity recognition, (2 more...)

2309.06175

Genre: Research Report (0.69)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (0.80)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.53)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.53)

Padilla, Carlos Misael Madrid, Padilla, Oscar Hernan Madrid, Wang, Daren

Temporal-spatial model via Trend Filtering

arXiv.org Machine LearningSep-12-2023

This research focuses on the estimation of a non-parametric regression function designed for data with simultaneous time and space dependencies. In such a context, we study the Trend Filtering, a nonparametric estimator introduced by \cite{mammen1997locally} and \cite{rudin1992nonlinear}. For univariate settings, the signals we consider are assumed to have a kth weak derivative with bounded total variation, allowing for a general degree of smoothness. In the multivariate scenario, we study a $K$-Nearest Neighbor fused lasso estimator as in \cite{padilla2018adaptive}, employing an ADMM algorithm, suitable for signals with bounded variation that adhere to a piecewise Lipschitz continuity criterion. By aligning with lower bounds, the minimax optimality of our estimators is validated. A unique phase transition phenomenon, previously uncharted in Trend Filtering studies, emerges through our analysis. Both Simulation studies and real data applications underscore the superior performance of our method when compared with established techniques in the existing literature.

artificial intelligence, inequality, machine learning, (18 more...)

2308.16172

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.13)
Europe > Spain > Galicia > Madrid (0.05)
Asia > Japan > Honshū > Kansai > Wakayama Prefecture > Wakayama (0.04)
(6 more...)

Genre: Research Report > New Finding (0.45)

Industry: Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.87)

Ghanem, Elsayed, Yoosefi, Moein, Hatefi, Armin

Liu-type Shrinkage Estimators for Mixture of Poisson Regressions with Experts: A Heart Disease Study

arXiv.org Machine LearningSep-11-2023

Count data play a critical role in medical research, such as heart disease. The Poisson regression model is a common technique for evaluating the impact of a set of covariates on the count responses. The mixture of Poisson regression models with experts is a practical tool to exploit the covariates, not only to handle the heterogeneity in the Poisson regressions but also to learn the mixing structure of the population. Multicollinearity is one of the most common challenges with regression models, leading to ill-conditioned design matrices of Poisson regression components and expert classes. The maximum likelihood method produces unreliable and misleading estimates for the effects of the covariates in multicollinearity. In this research, we develop Ridge and Liu-type methods as two shrinkage approaches to cope with the ill-conditioned design matrices of the mixture of Poisson regression models with experts. Through various numerical studies, we demonstrate that the shrinkage methods offer more reliable estimates for the coefficients of the mixture model in multicollinearity while maintaining the classification performance of the ML method. The shrinkage methods are finally applied to a heart study to analyze the heart disease rate stages.

artificial intelligence, machine learning, regression model, (16 more...)

2309.05838

Country:

Europe > United Kingdom (0.14)
North America > United States > California > Orange County > Irvine (0.04)
North America > Greenland (0.04)
(2 more...)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)