AITopics

2403.17296

Country:

North America > United States > Texas > Dallas County > Dallas (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > United States > California > Santa Clara County > Mountain View (0.04)

Genre: Research Report > New Finding (0.55)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.88)

Toner, William, Darlow, Luke

An Analysis of Linear Time Series Forecasting Models

arXiv.org Artificial Intelligence, Mar-25-2024

Despite their simplicity, linear models perform well at time series forecasting, even when pitted against deeper and more expensive models. A number of variations to the linear model have been proposed, often including some form of feature normalisation that improves model generalisation. In this paper we analyse the sets of functions expressible using these linear model architectures. In so doing we show that several popular variants of linear models for time series forecasting are equivalent and functionally indistinguishable from standard, unconstrained linear regression. We characterise the model classes for each linear variant. We demonstrate that each model can be reinterpreted as unconstrained linear regression over a suitably augmented feature set, and therefore admits a closed-form solution when using a mean-squared loss function. We provide experimental evidence that the models under inspection learn nearly identical solutions, and finally demonstrate that the simpler closed-form solutions are superior forecasters across 72% of test settings.
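As a rough illustration of the closed-form claim in the abstract above, the sketch below fits a linear forecaster by ordinary least squares on sliding windows of a series. It is not the authors' code: the helper names (`make_windows`, `fit_closed_form`, `predict`), the synthetic series, and the window sizes are all assumptions chosen for the example, and the constant column is only one simple way of augmenting the features.

```python
import numpy as np

def make_windows(series, context_len, horizon):
    """Slice a 1-D series into (context, target) pairs of fixed length."""
    n = len(series) - context_len - horizon + 1
    X = np.stack([series[i:i + context_len] for i in range(n)])
    Y = np.stack([series[i + context_len:i + context_len + horizon] for i in range(n)])
    return X, Y

def fit_closed_form(X, Y):
    """Least-squares solution of Y ~ [X, 1] W; the constant column acts as a bias."""
    X_aug = np.hstack([X, np.ones((X.shape[0], 1))])
    W, *_ = np.linalg.lstsq(X_aug, Y, rcond=None)
    return W

def predict(W, X):
    """Apply the fitted weights to new context windows."""
    X_aug = np.hstack([X, np.ones((X.shape[0], 1))])
    return X_aug @ W

# Hypothetical data: a noisy trend plus a seasonal component.
rng = np.random.default_rng(0)
t = np.arange(300, dtype=float)
series = 0.05 * t + np.sin(2 * np.pi * t / 24) + 0.1 * rng.normal(size=t.size)

X, Y = make_windows(series, context_len=48, horizon=12)
W = fit_closed_form(X, Y)

# Compare the training error with a "repeat the last value" baseline.
mse_ols = float(np.mean((predict(W, X) - Y) ** 2))
mse_naive = float(np.mean((X[:, -1:] - Y) ** 2))
```

Because repeating the last observed value is itself a linear map of the context window, the least-squares fit cannot have a larger training error than that baseline; out-of-sample behaviour is a separate question that this sketch does not address.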

forecasting, matrix, submission and formatting instruction

2403.14587

Country: North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.69)

Gupta, Avani, Narayanan, P J

A survey on Concept-based Approaches For Model Improvement

arXiv.org Artificial Intelligence, Mar-23-2024

The focus of recent research has shifted from merely improving the metrics-based performance of Deep Neural Networks (DNNs) to DNNs which are more interpretable to humans. The field of eXplainable Artificial Intelligence (XAI) has observed various techniques, including saliency-based and concept-based approaches. Concept-based approaches explain the model's decisions in simple, human-understandable terms called concepts. Concepts are known to be the thinking ground of humans. Explanations in terms of concepts enable detecting spurious correlations, inherent biases, or Clever Hans effects. With the advent of concept-based explanations, a range of concept representation methods and automatic concept discovery algorithms have been introduced. Some recent works also use concepts for model improvement in terms of interpretability and generalization. We provide a systematic review and taxonomy of various concept representations and their discovery algorithms in DNNs, specifically in vision. We also provide details on the concept-based model improvement literature, marking the first comprehensive survey of these methods.

arxiv preprint arxiv, explanation, representation

2403.14566

Country:

North America > United States > New York > New York County > New York City (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
Asia > India > Telangana > Hyderabad (0.04)

Genre:

Overview (1.00)
Research Report > Promising Solution (0.46)

Industry:

Health & Medicine (1.00)
Education (1.00)
Leisure & Entertainment > Games > Chess (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.93)
Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (0.88)

Hejazi, Taha-Hossein, Ghadimkhani, Zahra, Borji, Arezoo

A learning-based solution approach to the application placement problem in mobile edge computing under uncertainty

arXiv.org Artificial Intelligence, Mar-23-2024

Placing applications in mobile edge computing servers presents a complex challenge involving many servers, users, and their requests. Existing algorithms take a long time to solve high-dimensional problems with significant uncertainty scenarios. Therefore, an efficient approach is required to maximize the quality of service while considering all technical constraints. One such approach is machine learning, which emulates optimal solutions for application placement in edge servers. Machine learning models are expected to learn how to allocate user requests to servers based on the spatial positions of users and servers. In this study, the problem is formulated as a two-stage stochastic program. A sufficient number of training records is generated by varying parameters such as user locations and request rates and then solving the optimization model. Then, based on the distance features of each user from the available servers and their request rates, machine learning models generate decision variables for the first stage of the stochastic optimization model, which is the user-to-server request allocation, and are employed as independent decision agents that reliably mimic the optimization model. Support Vector Machines (SVM) and Multi-layer Perceptron (MLP) are used in this research to achieve practical decisions from the stochastic optimization models. The performance of each model has shown an execution effectiveness of over 80%. This research aims to provide a more efficient approach for tackling high-dimensional problems and scenarios with uncertainties in mobile edge computing by leveraging machine learning models for optimal decision-making in request allocation to edge servers. These results suggest that machine learning models can significantly improve solution times compared to conventional approaches.

accuracy, application, server

2403.11259

Country:

Asia > Middle East > Iran > Tehran Province > Tehran (0.04)
Europe > Germany > Hamburg (0.04)
Asia > Taiwan > Taiwan Province > Taipei (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Transportation (0.93)
Energy (0.93)
Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.68)

arXiv.org Machine Learning, Mar-23-2024

Near-Optimal differentially private low-rank trace regression with guaranteed private initialization

Zha, Mengyue

We study differentially private (DP) estimation of a rank-$r$ matrix $M \in \mathbb{R}^{d_1\times d_2}$ under the trace regression model with Gaussian measurement matrices. Theoretically, the sensitivity of non-private spectral initialization is precisely characterized, and the differential-privacy-constrained minimax lower bound for estimating $M$ under the Schatten-$q$ norm is established. Methodologically, the paper introduces a computationally efficient algorithm for DP-initialization with a sample size of $n \geq \widetilde O (r^2 (d_1\vee d_2))$. Under certain regularity conditions, the DP-initialization falls within a local ball surrounding $M$. We also propose a differentially private algorithm for estimating $M$ based on Riemannian optimization (DP-RGrad), which achieves a near-optimal convergence rate with the DP-initialization and sample size of $n \geq \widetilde O(r (d_1 + d_2))$. Finally, the paper discusses the non-trivial gap between the minimax lower bound and the upper bound of low-rank matrix estimation under the trace regression model. It is shown that the estimator given by DP-RGrad attains the optimal convergence rate in a weaker notion of differential privacy. Our powerful technique for analyzing the sensitivity of initialization requires no eigengap condition between $r$ non-zero singular values.

matrix, privacy, probability

2403.15999

Country:

Asia > China > Hong Kong (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > New Jersey > Mercer County > Princeton (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.70)

Industry: Information Technology > Security & Privacy (0.92)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.69)

Statistical Inference For Noisy Matrix Completion Incorporating Auxiliary Information

Ma, Shujie, Niu, Po-Yao, Zhang, Yichong, Zhu, Yinchu

This paper investigates statistical inference for noisy matrix completion in a semi-supervised model when auxiliary covariates are available. The model consists of two parts. One part is a low-rank matrix induced by unobserved latent factors; the other part models the effects of the observed covariates through a coefficient matrix which is composed of high-dimensional column vectors. We model the observational pattern of the responses through a logistic regression of the covariates, and allow its probability to go to zero as the sample size increases. We apply an iterative least squares (LS) estimation approach in our considered context. The iterative LS methods in general enjoy a low computational cost, but deriving the statistical properties of the resulting estimators is a challenging task. We show that our method only needs a few iterations, and the resulting entry-wise estimators of the low-rank matrix and the coefficient matrix are guaranteed to have asymptotic normal distributions. As a result, individual inference can be conducted for each entry of the unknown matrices. We also propose a simultaneous testing procedure with multiplier bootstrap for the high-dimensional coefficient matrix. This simultaneous inferential tool can help us further investigate the effects of covariates for the prediction of missing entries.

assumption 1, estimator, max 1

doi: 10.1080/01621459.2024.2335591

2403.14899

Country:

North America > United States > California > Riverside County > Riverside (0.04)
Asia > Singapore (0.04)
North America > United States > Minnesota (0.04)

Genre: Research Report > Experimental Study (0.66)

Industry:

Media > Film (0.67)
Leisure & Entertainment (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)

A Transfer Learning Causal Approach to Evaluate Racial/Ethnic and Geographic Variation in Outcomes Following Congenital Heart Surgery

Han, Larry, Zhang, Yi, Nathan, Meena, Mayer, John E. Jr., Pasquali, Sara K., Zelevinsky, Katya, Duan, Rui, Normand, Sharon-Lise T.

Congenital heart defects (CHD) are the most prevalent birth defects in the United States and surgical outcomes vary considerably across the country. The outcomes of treatment for CHD differ for specific patient subgroups, with non-Hispanic Black and Hispanic populations experiencing higher rates of mortality and morbidity. A valid comparison of outcomes within racial/ethnic subgroups is difficult given large differences in case-mix and small subgroup sizes. We propose a causal inference framework for outcome assessment and leverage advances in transfer learning to incorporate data from both target and source populations to help estimate causal effects while accounting for different sources of risk factor and outcome differences across populations. Using the Society of Thoracic Surgeons' Congenital Heart Surgery Database (STS-CHSD), we focus on a national cohort of patients undergoing the Norwood operation from 2016-2022 to assess operative mortality and morbidity outcomes across U.S. geographic regions by race/ethnicity. We find racial and ethnic outcome differences after controlling for potential confounding factors. While geography does not have a causal effect on outcomes for non-Hispanic Caucasian patients, non-Hispanic Black patients experience wide variability in outcomes with estimated 30-day mortality ranging from 5.9% (standard error 2.2%) to 21.6% (4.4%) across U.S. regions.

non-hispanic black patient, norwood procedure, target population

2403.14573

Country:

North America > United States > Texas (0.14)
North America > United States > Georgia (0.14)
North America > United States > Michigan (0.04)

Genre: Research Report > Experimental Study (0.90)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Surgery (1.00)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.62)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.47)

Fuhr, Jonathan, Berens, Philipp, Papies, Dominik

Estimating Causal Effects with Double Machine Learning -- A Method Evaluation

The estimation of causal effects with observational data continues to be a very active research area. In recent years, researchers have developed new frameworks which use machine learning to relax classical assumptions necessary for the estimation of causal effects. In this paper, we review one of the most prominent methods - "double/debiased machine learning" (DML) - and empirically evaluate it by comparing its performance on simulated data relative to more traditional statistical methods, before applying it to real-world data. Our findings indicate that the application of a suitably flexible machine learning algorithm within DML improves the adjustment for various nonlinear confounding relationships. This advantage enables a departure from traditional functional form assumptions typically necessary in causal effect estimation. However, we demonstrate that the method continues to critically depend on standard assumptions about causal structure and identification. When estimating the effects of air pollution on housing prices in our application, we find that DML estimates are consistently larger than estimates of less flexible methods. From our overall results, we provide actionable recommendations for specific choices researchers must make when applying DML in practice.
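For readers unfamiliar with DML, the following is a minimal sketch of the widely used partialling-out estimator with cross-fitting for a partially linear model. It is not the evaluation code of the paper above: the polynomial least-squares nuisance learner stands in for a flexible machine learning model, and the function names, the simulated data, and the true effect of 2.0 are assumptions made for illustration.

```python
import numpy as np

def poly_features(x, degree=3):
    """Polynomial basis 1, x, ..., x^degree (a stand-in for a flexible ML learner)."""
    return np.column_stack([x ** k for k in range(degree + 1)])

def cross_fit_residuals(x, target, n_folds=2, degree=3):
    """Out-of-fold residuals: fit on the other folds, predict on the held-out fold."""
    folds = np.arange(len(x)) % n_folds
    residuals = np.empty(len(x))
    for k in range(n_folds):
        train, test = folds != k, folds == k
        coef, *_ = np.linalg.lstsq(poly_features(x[train], degree), target[train], rcond=None)
        residuals[test] = target[test] - poly_features(x[test], degree) @ coef
    return residuals

def dml_partialling_out(x, d, y, n_folds=2, degree=3):
    """Regress the outcome residuals on the treatment residuals."""
    res_d = cross_fit_residuals(x, d, n_folds, degree)
    res_y = cross_fit_residuals(x, y, n_folds, degree)
    return float(res_d @ res_y / (res_d @ res_d))

# Simulated data (hypothetical): x confounds both the treatment d and the outcome y.
rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
d = 0.5 * x + rng.normal(size=n)
y = 2.0 * d + x + x ** 2 + rng.normal(size=n)

theta_hat = dml_partialling_out(x, d, y)
```

The estimate is only meaningful when the confounder is actually observed and included in the nuisance models, which mirrors the abstract's caution that the method still depends on standard assumptions about causal structure and identification.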

confounder, dml, functional form

2403.14385

Country:

Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)
North America > United States > New York (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (1.00)

Industry:

Government (1.00)
Health & Medicine > Therapeutic Area (0.93)
Law (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.67)

Mankovich, Nathan, Durand, Homer, Diaz, Emiliano, Varando, Gherardo, Camps-Valls, Gustau

Recovering Latent Confounders from High-dimensional Proxy Variables

Detecting latent confounders from proxy variables is an essential problem in causal effect estimation. Previous approaches are limited to low-dimensional proxies, sorted proxies, and binary treatments. We remove these assumptions and present a novel Proxy Confounder Factorization (PCF) framework for continuous treatment effect estimation when latent confounders manifest through high-dimensional, mixed proxy variables. For specific sample sizes, our two-step PCF implementation, using Independent Component Analysis (ICA-PCF), and the end-to-end implementation, using Gradient Descent (GD-PCF), achieve high correlation with the latent confounder and low absolute error in causal effect estimation with synthetic datasets in the high sample size regime. Even when faced with climate data, ICA-PCF recovers four components that explain $75.9\%$ of the variance in the North Atlantic Oscillation, a known confounder of precipitation patterns in Europe. Code for our PCF implementations and experiments can be found here: https://github.com/IPL-UV/confound_it. The proposed methodology constitutes a stepping stone towards discovering latent confounders and can be applied to many problems in disciplines dealing with high-dimensional observed proxies, e.g., spatiotemporal fields.

confounder, estimation, implementation

2403.14228

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
Europe > Portugal > Azores > Ponta Delgada (0.05)
Europe > Iceland > Capital Region > Reykjavik (0.05)

Genre: Research Report > Experimental Study (0.46)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.48)

Bhattacharjee, Abhinab, Popov, Andrey A., Sarshar, Arash, Sandu, Adrian

Improving the Adaptive Moment Estimation (ADAM) stochastic optimizer through an Implicit-Explicit (IMEX) time-stepping approach

arXiv.org Artificial Intelligence, Mar-20-2024

The Adam optimizer, often used in Machine Learning for neural network training, corresponds to an underlying ordinary differential equation (ODE) in the limit of very small learning rates. This work shows that the classical Adam algorithm is a first order implicit-explicit (IMEX) Euler discretization of the underlying ODE. Employing the time discretization point of view, we propose new extensions of the Adam scheme obtained by using higher order IMEX methods to solve the ODE. Based on this approach, we derive a new optimization algorithm for neural network training that performs better than classical Adam on several regression and classification problems.
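As a point of reference for the abstract above, here is a minimal sketch of the classical Adam update, the scheme the authors interpret as a first order IMEX Euler discretization. It does not implement their higher order IMEX extensions; the function name `adam_step`, the quadratic test objective, and the hyperparameter values are assumptions chosen for the example.

```python
import numpy as np

def adam_step(x, grad, m, v, t, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8):
    """One classical Adam update with bias-corrected moment estimates."""
    m = beta1 * m + (1.0 - beta1) * grad        # first moment (running mean of gradients)
    v = beta2 * v + (1.0 - beta2) * grad ** 2   # second moment (running mean of squares)
    m_hat = m / (1.0 - beta1 ** t)              # bias correction, t starts at 1
    v_hat = v / (1.0 - beta2 ** t)
    x = x - lr * m_hat / (np.sqrt(v_hat) + eps)
    return x, m, v

def grad_f(x):
    """Gradient of the hypothetical objective f(x) = (x - 3)^2."""
    return 2.0 * (x - 3.0)

x, m, v = 0.0, 0.0, 0.0
trajectory = []
for t in range(1, 5001):
    x, m, v = adam_step(x, grad_f(x), m, v, t)
    trajectory.append(float(x))
```

With zero-initialised moments, the bias correction makes the first step have magnitude close to `lr` regardless of the gradient scale, which is why the iterate moves from 0 to roughly 0.01 on the first update.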

algorithm, equation, gradient evaluation

2403.13704

Country:

Europe > Russia (0.04)
Asia > Russia (0.04)
North America > United States > Virginia > Montgomery County > Blacksburg (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)