AITopics

2209.09404

Country:

North America > Canada > Ontario > Toronto (0.29)
North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > United States > Arizona (0.04)
Europe (0.04)

Genre: Research Report (1.00)

Industry:

Transportation > Infrastructure & Services (1.00)
Government (1.00)
Transportation > Ground > Road (0.68)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

arXiv.org Machine LearningOct-2-2023

Optimal Estimator for Linear Regression with Shuffled Labels

Zhang, Hang, Li, Ping

This paper considers the task of linear regression with shuffled labels, i.e., $\mathbf Y = \mathbf \Pi \mathbf X \mathbf B + \mathbf W$, where $\mathbf Y \in \mathbb R^{n\times m}, \mathbf Pi \in \mathbb R^{n\times n}, \mathbf X\in \mathbb R^{n\times p}, \mathbf B \in \mathbb R^{p\times m}$, and $\mathbf W\in \mathbb R^{n\times m}$, respectively, represent the sensing results, (unknown or missing) corresponding information, sensing matrix, signal of interest, and additive sensing noise. Given the observation $\mathbf Y$ and sensing matrix $\mathbf X$, we propose a one-step estimator to reconstruct $(\mathbf \Pi, \mathbf B)$. From the computational perspective, our estimator's complexity is $O(n^3 + np^2m)$, which is no greater than the maximum complexity of a linear assignment algorithm (e.g., $O(n^3)$) and a least square algorithm (e.g., $O(np^2 m)$). From the statistical perspective, we divide the minimum $snr$ requirement into four regimes, e.g., unknown, hard, medium, and easy regimes; and present sufficient conditions for the correct permutation recovery under each regime: $(i)$ $snr \geq \Omega(1)$ in the easy regime; $(ii)$ $snr \geq \Omega(\log n)$ in the medium regime; and $(iii)$ $snr \geq \Omega((\log n)^{c_0}\cdot n^{{c_1}/{srank(\mathbf B)}})$ in the hard regime ($c_0, c_1$ are some positive constants and $srank(\mathbf B)$ denotes the stable rank of $\mathbf B$). In the end, we also provide numerical experiments to confirm the above claims.

artificial intelligence, machine learning, piq, (16 more...)

2310.01326

Country:

North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
North America > United States > Washington > King County > Bellevue (0.04)
(3 more...)

Genre: Research Report (1.00)

Industry: Government (0.45)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.70)

arXiv.org Machine LearningOct-2-2023

Understanding Transferable Representation Learning and Zero-shot Transfer in CLIP

Chen, Zixiang, Deng, Yihe, Li, Yuanzhi, Gu, Quanquan

Multi-modal learning (Ngiam et al., 2011) integrates information from a variety of data types, resulting in AI systems that are both robust and precise. Recently, CLIP (Radford et al., 2021) emerged as a milestone work that leverages vision-language contrastive pretraining to jointly learn image and text embeddings, using the vast amounts of image-text data available on the web. During the training process, CLIP considers image-text data that appear together as positive pairs and other combinations as negative pairs. The goal is to maximize the embedding similarity for the positive pairs while minimizing it for the negative pairs. Remarkably, this approach has achieved significant success in zero-shot transfer (Lei Ba et al., 2015), indicating the model's ability to handle a great variety of tasks without prior exposure to any of their training data. Inspired by CLIP's groundbreaking zero-shot capabilities, subsequent studies (Yao et al., 2022; Li et al., 2022; Mu et al., 2022; Goel et al., 2022; Zhai et al., 2022; Alayrac et al., 2022) emerged with the primary objective of further enhancing CLIP's zero-shot performance. Despite the empirical success of CLIP in zero-shot transfer, the theoretical understanding of how it works remains elusive. An intriguing inquiry is thus: How does CLIP learn representations that are transferable to the various downstream tasks?

large language model, machine learning, natural language, (20 more...)

2310.00927

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.28)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)
(3 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

arXiv.org Machine LearningOct-2-2023

Heteroscedastic sparse high-dimensional linear regression with a partitioned empirical Bayes ECM algorithm

Zgodic, Anja, Bai, Ray, Zhang, Jiajia, Wang, Yuan, Rorden, Chris, McLain, Alexander

Sparse linear regression methods for high-dimensional data often assume that residuals have constant variance. When this assumption is violated, it can lead to bias in estimated coefficients, prediction intervals (PI) with improper length, and increased type I errors. We propose a heteroscedastic high-dimensional linear regression model through a partitioned empirical Bayes Expectation Conditional Maximization (H-PROBE) algorithm. H-PROBE is a computationally efficient maximum a posteriori estimation approach based on a Parameter-Expanded Expectation-Conditional-Maximization algorithm. It requires minimal prior assumptions on the regression parameters through plug-in empirical Bayes estimates of hyperparameters. The variance model uses a multivariate log-Gamma prior on coefficients that can incorporate covariates hypothesized to impact heterogeneity. The motivation of our approach is a study relating Aphasia Quotient (AQ) to high-resolution T2 neuroimages of brain damage in stroke patients. AQ is a vital measure of language impairment and informs treatment decisions, but it is challenging to measure and subject to heteroscedastic errors. It is, therefore, of clinical importance -- and the goal of this paper -- to use high-dimensional neuroimages to predict and provide PIs for AQ that accurately reflect the heterogeneity in residual variance. Our analysis demonstrates that H-PROBE can use markers of heterogeneity to provide narrower PI widths than standard methods without sacrificing coverage. Through extensive simulation studies, we exhibit that H-PROBE results in superior prediction, variable selection, and predictive inference than competing methods.

artificial intelligence, bayesian inference, machine learning, (17 more...)

2309.08783

Country:

North America > United States > New York (0.04)
North America > United States > South Carolina (0.04)

Genre: Research Report > Experimental Study (0.67)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

arXiv.org Artificial IntelligenceOct-1-2023

Asymptotically Efficient Online Learning for Censored Regression Models Under Non-I.I.D Data

Zhang, Lantian, Guo, Lei

The asymptotically efficient online learning problem is investigated for stochastic censored regression models, which arise from various fields of learning and statistics but up to now still lacks comprehensive theoretical studies on the efficiency of the learning algorithms. For this, we propose a two-step online algorithm, where the first step focuses on achieving algorithm convergence, and the second step is dedicated to improving the estimation performance. Under a general excitation condition on the data, we show that our algorithm is strongly consistent and asymptotically normal by employing the stochastic Lyapunov function method and limit theories for martingales. Moreover, we show that the covariances of the estimates can achieve the Cramér-Rao(C-R) bound asymptotically, indicating that the performance of the proposed algorithm is the best possible that one can expect in general. Unlike most of the existing works, our results are obtained without resorting to the traditionally used but stringent conditions such as independent and identically distributed (i.i.d) assumption on the data, and thus our results do not exclude applications to stochastic dynamical systems with feedback. A numerical example is also provided to illustrate the superiority of the proposed online algorithm over the existing related ones in the literature. Keywords: stochastic dynamical systems, censored regression models, online learning, non-i.i.d data, cramér-Rao bound.

algorithm, assumption 2, dynamical system, (12 more...)

2309.09454

Country:

Asia > China > Beijing > Beijing (0.04)
North America > United States > New York (0.04)

Genre: Research Report > New Finding (0.54)

Industry:

Education > Educational Setting > Online (0.82)
Law > Civil Rights & Constitutional Law (0.81)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

arXiv.org Artificial IntelligenceOct-1-2023

A Decision Making Framework for Recommended Maintenance of Road Segments

Sun, Haoyu, Yan, Yan

Due to limited budgets allocated for road maintenance projects in various countries, road management departments face difficulties in making scientific maintenance decisions. This paper aims to provide road management departments with more scientific decision tools and evidence. The framework proposed in this paper mainly has the following four innovative points: 1) Predicting pavement performance deterioration levels of road sections as decision basis rather than accurately predicting specific indicator values; 2) Determining maintenance route priorities based on multiple factors; 3) Making maintenance plan decisions by establishing deep reinforcement learning models to formulate predictive strategies based on past maintenance performance evaluations, while considering both technical and management indicators; 4) Determining repair section priorities according to actual and suggested repair effects. By resolving these four issues, the framework can make intelligent decisions regarding optimal maintenance plans and sections, taking into account limited funds and historical maintenance management experiences.

maintenance, maintenance plan, probability, (17 more...)

2307.10085

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
Asia > China > Xinjiang Uygur Autonomous Region (0.04)

Genre: Research Report > New Finding (0.68)

Industry:

Transportation > Infrastructure & Services (1.00)
Transportation > Ground > Road (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Mougan, Carlos, State, Laura, Ferrara, Antonio, Ruggieri, Salvatore, Staab, Steffen

Beyond Demographic Parity: Redefining Equal Treatment

arXiv.org Artificial IntelligenceOct-1-2023

Liberalism-oriented political philosophy reasons that all individuals should be treated equally independently of their protected characteristics. Related work in machine learning has translated the concept of \emph{equal treatment} into terms of \emph{equal outcome} and measured it as \emph{demographic parity} (also called \emph{statistical parity}). Our analysis reveals that the two concepts of equal outcome and equal treatment diverge; therefore, demographic parity does not faithfully represent the notion of \emph{equal treatment}. We propose a new formalization for equal treatment by (i) considering the influence of feature values on predictions, such as computed by Shapley values decomposing predictions across its features, (ii) defining distributions of explanations, and (iii) comparing explanation distributions between populations with different protected characteristics. We show the theoretical properties of our notion of equal treatment and devise a classifier two-sample test based on the AUC of an equal treatment inspector. We study our formalization of equal treatment on synthetic and natural data. We release \texttt{explanationspace}, an open-source Python package with methods and tutorials.

auc, equal treatment, shapley value, (12 more...)

2303.0804

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > West Virginia (0.04)
North America > United States > New York (0.04)
(6 more...)

Genre:

Research Report > Experimental Study (0.46)
Research Report > Promising Solution (0.46)

Industry:

Law (1.00)
Government > Regional Government (0.45)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)
(4 more...)

Leplat, Valentin, Merkulov, Daniil, Katrutsa, Aleksandr, Bershatsky, Daniel, Tsymboi, Olga, Oseledets, Ivan

NAG-GS: Semi-Implicit, Accelerated and Robust Stochastic Optimizer

arXiv.org Artificial IntelligenceSep-30-2023

Classical machine learning models such as deep neural networks are usually trained by using Stochastic Gradient Descent-based (SGD) algorithms. The classical SGD can be interpreted as a discretization of the stochastic gradient flow. In this paper we propose a novel, robust and accelerated stochastic optimizer that relies on two key elements: (1) an accelerated Nesterov-like Stochastic Differential Equation (SDE) and (2) its semi-implicit Gauss-Seidel type discretization. The convergence and stability of the obtained method, referred to as NAG-GS, are first studied extensively in the case of the minimization of a quadratic function. This analysis allows us to come up with an optimal learning rate in terms of the convergence rate while ensuring the stability of NAG-GS. This is achieved by the careful analysis of the spectral radius of the iteration matrix and the covariance matrix at stationarity with respect to all hyperparameters of our method. Further, we show that NAG- GS is competitive with state-of-the-art methods such as momentum SGD with weight decay and AdamW for the training of machine learning models such as the logistic regression model, the residual networks models on standard computer vision datasets, Transformers in the frame of the GLUE benchmark and the recent Vision Transformers.

convergence, nag-gs method, step size, (15 more...)

2209.14937

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.04)
Asia > Russia (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.87)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.54)

Lomurno, Eugenio, Toffoli, Simone, Di Febbo, Davide, Matteucci, Matteo, Lunardini, Francesca, Ferrante, Simona

Age Group Discrimination via Free Handwriting Indicators

arXiv.org Artificial IntelligenceSep-29-2023

The growing global elderly population is expected to increase the prevalence of frailty, posing significant challenges to healthcare systems. Frailty, a syndrome associated with ageing, is characterised by progressive health decline, increased vulnerability to stressors and increased risk of mortality. It represents a significant burden on public health and reduces the quality of life of those affected. The lack of a universally accepted method to assess frailty and a standardised definition highlights a critical research gap. Given this lack and the importance of early prevention, this study presents an innovative approach using an instrumented ink pen to ecologically assess handwriting for age group classification. Content-free handwriting data from 80 healthy participants in different age groups (20-40, 41-60, 61-70 and 70+) were analysed. Fourteen gesture- and tremor-related indicators were computed from the raw data and used in five classification tasks. These tasks included discriminating between adjacent and non-adjacent age groups using Catboost and Logistic Regression classifiers. Results indicate exceptional classifier performance, with accuracy ranging from 82.5% to 97.5%, precision from 81.8% to 100%, recall from 75% to 100% and ROC-AUC from 92.2% to 100%. Model interpretability, facilitated by SHAP analysis, revealed age-dependent sensitivity of temporal and tremor-related handwriting features. Importantly, this classification method offers potential for early detection of abnormal signs of ageing in uncontrolled settings such as remote home monitoring, thereby addressing the critical issue of frailty detection and contributing to improved care for older adults.

age group discrimination, free handwriting indicator

2309.17156

Genre: Research Report (1.00)

Industry: Health & Medicine (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.87)

Hoffmann, Till, Onnela, Jukka-Pekka

Minimising the Expected Posterior Entropy Yields Optimal Summary Statistics

arXiv.org Machine LearningSep-29-2023

Extracting low-dimensional summary statistics from large datasets is essential for efficient (likelihood-free) inference. We characterise different classes of summaries and demonstrate their importance for correctly analysing dimensionality reduction algorithms. We propose obtaining summaries by minimising the expected posterior entropy (EPE) under the prior predictive distribution of the model. Many existing methods are equivalent to or are special or limiting cases of minimising the EPE. We develop a method to obtain high-fidelity summaries that minimise the EPE; we apply it to benchmark and real-world examples. We both offer a unifying perspective for obtaining informative summaries and provide concrete recommendations for practitioners.

artificial intelligence, information, machine learning, (16 more...)

2206.0234

Country:

North America > United States > Massachusetts > Suffolk County > Boston (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > Experimental Study (0.46)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.93)