AITopics

2407.10442

Country:

North America > United States > Vermont (0.04)
North America > United States > District of Columbia (0.04)

Genre: Research Report > Experimental Study (0.68)

Industry:

Health & Medicine (0.68)
Government > Voting & Elections (0.67)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.88)

arXiv.org Artificial IntelligenceJul-14-2024

Learning to Represent Surroundings, Anticipate Motion and Take Informed Actions in Unstructured Environments

Zhi, Weiming

Contemporary robots have become exceptionally skilled at achieving specific tasks in structured environments. However, they often fail when faced with the limitless permutations of real-world unstructured environments. This motivates robotics methods which learn from experience, rather than follow a pre-defined set of rules. In this thesis, we present a range of learning-based methods aimed at enabling robots, operating in dynamic and unstructured environments, to better understand their surroundings, anticipate the actions of others, and take informed actions accordingly. In the first part of the thesis, we investigate methods which leverage learning to represent the structure and motion in a robot's operating environment, in a continuous manner.

computer vision and pattern recognition, sequential quadratic programming, stochastic process representation, (16 more...)

2407.10383

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.13)
Europe > Germany > Baden-Württemberg > Freiburg (0.04)
Asia > Middle East > Jordan (0.04)
(5 more...)

Genre:

Summary/Review (1.00)
Research Report > New Finding (1.00)
Instructional Material > Course Syllabus & Notes (0.92)

Industry:

Transportation (1.00)
Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
(12 more...)

Gunonu, Seyma, Altun, Gizem, Cavus, Mustafa

Explainable bank failure prediction models: Counterfactual explanations to reduce the failure risk

arXiv.org Artificial IntelligenceJul-14-2024

The accuracy and understandability of bank failure prediction models are crucial. While interpretable models like logistic regression are favored for their explainability, complex models such as random forest, support vector machines, and deep learning offer higher predictive performance but lower explainability. These models, known as black boxes, make it difficult to derive actionable insights. To address this challenge, using counterfactual explanations is suggested. These explanations demonstrate how changes in input variables can alter the model output and suggest ways to mitigate bank failure risk. The key challenge lies in selecting the most effective method for generating useful counterfactuals, which should demonstrate validity, proximity, sparsity, and plausibility. The paper evaluates several counterfactual generation methods: WhatIf, Multi Objective, and Nearest Instance Counterfactual Explanation, and also explores resampling methods like undersampling, oversampling, SMOTE, and the cost sensitive approach to address data imbalance in bank failure prediction in the US. The results indicate that the Nearest Instance Counterfactual Explanation method yields higher quality counterfactual explanations, mainly using the cost sensitive approach. Overall, the Multi Objective Counterfactual and Nearest Instance Counterfactual Explanation methods outperform others regarding validity, proximity, and sparsity metrics, with the cost sensitive approach providing the most desirable counterfactual explanations. These findings highlight the variability in the performance of counterfactual generation methods across different balancing strategies and machine learning models, offering valuable strategies to enhance the utility of black box bank failure prediction models.

bank failure prediction model, cost-sensitive approach, counterfactual explanation, (9 more...)

2407.11089

Country:

North America > United States (0.48)
Asia > Middle East > Republic of Türkiye > Eskisehir Province > Eskisehir (0.04)
Europe > Spain > Castile and León > Salamanca Province > Salamanca (0.04)
(2 more...)

Genre:

Research Report > New Finding (0.48)
Research Report > Experimental Study (0.34)

Industry:

Information Technology > Security & Privacy (1.00)
Banking & Finance (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.54)
(2 more...)

arXiv.org Machine LearningJul-14-2024

An integrated perspective of robustness in regression through the lens of the bias-variance trade-off

Okuno, Akifumi

The concept of robustness is of paramount importance across a variety of fields, particularly those involving practical statistical parameter estimation based on real-world observations. However, robust estimation techniques introduced in various methodologies aim to achieve different objectives, and each technique has been examined within individual frameworks. It is crucial to reexamine the purpose behind robust estimation and provide an integrated perspective across disciplinary boundaries. To facilitate this, this study initially classifies the goals of robust estimation methods into three categories: resistance to (1) outlier contamination (see, e.g., Huber and Ronchetti (1981) and Hampel et al. (1986)), (2) user-specified imaginary dataset-perturbation (see, e.g., Ben-Tal and Nemirovski (2002) and Biggio et al. (2013)), and (3) model misspecification. Notably, (3) can be addressed using expressive models in certain cases; (3) will be discussed later but will not be the main focus. Therefore, this study primarily focuses on the following two categories within the context of linear regression: (1) Outlier-resistance. Outliers are data points that deviate significantly from the overall trend of the other observations in a dataset. Since the presence of outliers can affect statistical parameter estimation, potentially leading to unintended results, outlier-resistant estimation has been a focus for many decades (Huber and Ronchetti, 1981; Hampel et al., 1986; Maronna et al., 2006) mainly in the field of statistics. Originating from the works of Tukey (1960) and Huber (1964), many outlier-resistant estimations are designed by modifying the loss function.

loss function, optimization, regression, (15 more...)

2407.10418

Country:

North America > United States (0.14)
North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.51)

arXiv.org Artificial IntelligenceJul-13-2024

Byzantine-Robust Decentralized Federated Learning

Fang, Minghong, Zhang, Zifan, Hairi, null, Khanduri, Prashant, Liu, Jia, Lu, Songtao, Liu, Yuchen, Gong, Neil

Federated learning (FL) enables multiple clients to collaboratively train machine learning models without revealing their private training data. In conventional FL, the system follows the server-assisted architecture (server-assisted FL), where the training process is coordinated by a central server. However, the server-assisted FL framework suffers from poor scalability due to a communication bottleneck at the server, and trust dependency issues. To address challenges, decentralized federated learning (DFL) architecture has been proposed to allow clients to train models collaboratively in a serverless and peer-to-peer manner. However, due to its fully decentralized nature, DFL is highly vulnerable to poisoning attacks, where malicious clients could manipulate the system by sending carefully-crafted local models to their neighboring clients. To date, only a limited number of Byzantine-robust DFL methods have been proposed, most of which are either communication-inefficient or remain vulnerable to advanced poisoning attacks. In this paper, we propose a new algorithm called BALANCE (Byzantine-robust averaging through local similarity in decentralization) to defend against poisoning attacks in DFL. In BALANCE, each client leverages its own local model as a similarity reference to determine if the received model is malicious or benign. We establish the theoretical convergence guarantee for BALANCE under poisoning attacks in both strongly convex and non-convex settings. Furthermore, the convergence rate of BALANCE under poisoning attacks matches those of the state-of-the-art counterparts in Byzantine-free settings. Extensive experiments also demonstrate that BALANCE outperforms existing DFL methods and effectively defends against poisoning attacks.

dataset, local model, poisoning attack, (14 more...)

2406.10416

Country:

North America > United States > Utah > Salt Lake County > Salt Lake City (0.05)
North America > United States > North Carolina (0.04)
North America > United States > Wisconsin (0.04)
(5 more...)

Genre: Research Report (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (0.67)
Government > Military (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.68)

arXiv.org Artificial IntelligenceJul-13-2024

LFFR: Logistic Function For (single-output) Regression

Chiang, John

Privacy-preserving regression in machine learning is a crucial area of research, aimed at enabling the use of powerful machine learning techniques while protecting individuals' privacy. In this paper, we implement privacy-preserving regression training using data encrypted under a fully homomorphic encryption scheme. We first examine the common linear regression algorithm and propose a (simplified) fixed Hessian for linear regression training, which can be applied for any datasets even not normalized into the range $[0, 1]$. We also generalize this constant Hessian matrix to the ridge regression version, namely linear regression which includes a regularization term to penalize large coefficients. However, our main contribution is to develop a novel and efficient algorithm called LFFR for homomorphic regression using the logistic function, which could model more complex relations between input values and output prediction in comparison with linear regression. We also find a constant simplified Hessian to train our LFFR algorithm using the Newton-like method and compare it against to with our new fixed Hessian linear regression training over two real-world datasets. We suggest normalizing not only the data but also the target predictions even for the original linear regression used in a privacy-preserving manner, which is helpful to remain weights in a small range, say $[-5, +5]$ good for refreshing ciphertext setting parameters, and avoid tuning the regularization parameter $\lambda$ via cross validation. The linear regression with normalized predictions could be a viable alternative to ridge regression.

algorithm, dataset, regression, (16 more...)

2407.09955

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > Santa Barbara County > Santa Barbara (0.04)
Asia > Middle East > Israel (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report (0.86)

Industry: Information Technology > Security & Privacy (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Olmin, Amanda, Lindsten, Fredrik

Towards understanding epoch-wise double descent in two-layer linear neural networks

arXiv.org Machine LearningJul-13-2024

Epoch-wise double descent is the phenomenon where generalisation performance improves beyond the point of overfitting, resulting in a generalisation curve exhibiting two descents under the course of learning. Understanding the mechanisms driving this behaviour is crucial not only for understanding the generalisation behaviour of machine learning models in general, but also for employing conventional selection methods, such as the use of early stopping to mitigate overfitting. While we ultimately want to draw conclusions of more complex models, such as deep neural networks, a majority of theoretical conclusions regarding the underlying cause of epoch-wise double descent are based on simple models, such as standard linear regression. To start bridging this gap, we study epoch-wise double descent in two-layer linear neural networks. First, we derive a gradient flow for the linear two-layer model, that bridges the learning dynamics of the standard linear regression model, and the linear two-layer diagonal network with quadratic weights. Second, we identify additional factors of epoch-wise double descent emerging with the extra model layer, by deriving necessary conditions for the generalisation error to follow a double descent pattern. While epoch-wise double descent in linear regression has been attributed to differences in input variance, in the two-layer model, also the singular values of the input-output covariance matrix play an important role. This opens up for further questions regarding unidentified factors of epoch-wise double descent for truly deep models.

double descent, epoch-wise double descent, inflection point, (17 more...)

2407.09845

Country:

North America > United States (0.14)
Europe > Sweden > Östergötland County > Linköping (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.94)

Baptista, Ricardo, O'Reilly, Eliza, Xie, Yangxinyu

TrIM: Transformed Iterative Mondrian Forests for Gradient-based Dimension Reduction and High-Dimensional Regression

arXiv.org Machine LearningJul-13-2024

We propose a computationally efficient algorithm for gradient-based linear dimension reduction and high-dimensional regression. The algorithm initially computes a Mondrian forest and uses this estimator to identify a relevant feature subspace of the inputs from an estimate of the expected gradient outer product (EGOP) of the regression function. In addition, we introduce an iterative approach known as Transformed Iterative Mondrian (TrIM) forest to improve the Mondrian forest estimator by using the EGOP estimate to update the set of features and weights used by the Mondrian partitioning mechanism. We obtain consistency guarantees and convergence rates for the estimation of the EGOP matrix and the random forest estimator obtained from one iteration of the TrIM algorithm. Lastly, we demonstrate the effectiveness of our proposed algorithm for learning the relevant feature subspace across a variety of settings with both simulated and real data.

estimator, reilly, subspace, (15 more...)

2407.09964

Country:

Africa > West Africa (0.04)
Africa > Sierra Leone (0.04)
North America > United States > Pennsylvania (0.04)
(4 more...)

Genre: Research Report > New Finding (0.67)

Industry:

Health & Medicine (0.95)
Government > Regional Government (0.45)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)

arXiv.org Artificial IntelligenceJul-12-2024

Beyond Euclid: An Illustrated Guide to Modern Machine Learning with Geometric, Topological, and Algebraic Structures

Sanborn, Sophia, Mathe, Johan, Papillon, Mathilde, Buracas, Domas, Lillemark, Hansen J, Shewmake, Christian, Bertics, Abby, Pennec, Xavier, Miolane, Nina

The enduring legacy of Euclidean geometry underpins classical machine learning, which, for decades, has been primarily developed for data lying in Euclidean space. Yet, modern machine learning increasingly encounters richly structured data that is inherently nonEuclidean. This data can exhibit intricate geometric, topological and algebraic structure: from the geometry of the curvature of space-time, to topologically complex interactions between neurons in the brain, to the algebraic transformations describing symmetries of physical systems. Extracting knowledge from such non-Euclidean data necessitates a broader mathematical perspective. Echoing the 19th-century revolutions that gave rise to non-Euclidean geometry, an emerging line of research is redefining modern machine learning with non-Euclidean structures. Its goal: generalizing classical methods to unconventional data types with geometry, topology, and algebra. In this review, we provide an accessible gateway to this fast-growing field and propose a graphical taxonomy that integrates recent advances into an intuitive unified framework. We subsequently extract insights into current challenges and highlight exciting opportunities for future development in this field.

euclidean signal, group action, manifold, (15 more...)

2407.09468

Country:

North America > United States > New York (0.04)
North America > United States > New Mexico > Los Alamos County > Los Alamos (0.04)
North America > United States > Montana (0.04)
(4 more...)

Genre:

Overview (1.00)
Research Report (0.82)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (0.93)
Health & Medicine > Pharmaceuticals & Biotechnology (0.67)
Education (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

arXiv.org Artificial IntelligenceJul-12-2024

FastImpute: A Baseline for Open-source, Reference-Free Genotype Imputation Methods -- A Case Study in PRS313

Ge, Aaron, Balasubramanian, Jeya, Wu, Xueyao, Kraft, Peter, Almeida, Jonas S.

Genotype imputation enhances genetic data by predicting missing SNPs using reference haplotype information. Traditional methods leverage linkage disequilibrium (LD) to infer untyped SNP genotypes, relying on the similarity of LD structures between genotyped target sets and fully sequenced reference panels. Recently, reference-free deep learning-based methods have emerged, offering a promising alternative by predicting missing genotypes without external databases, thereby enhancing privacy and accessibility. However, these methods often produce models with tens of millions of parameters, leading to challenges such as the need for substantial computational resources to train and inefficiency for client-sided deployment. Our study addresses these limitations by introducing a baseline for a novel genotype imputation pipeline that supports client-sided imputation models generalizable across any genotyping chip and genomic region. This approach enhances patient privacy by performing imputation directly on edge devices. As a case study, we focus on PRS313, a polygenic risk score comprising 313 SNPs used for breast cancer risk prediction. Utilizing consumer genetic panels such as 23andMe, our model democratizes access to personalized genetic insights by allowing 23andMe users to obtain their PRS313 score. We demonstrate that simple linear regression can significantly improve the accuracy of PRS313 scores when calculated using SNPs imputed from consumer gene panels, such as 23andMe. Our linear regression model achieved an R^2 of 0.86, compared to 0.33 without imputation and 0.28 with simple imputation (substituting missing SNPs with the minor allele frequency). These findings suggest that popular SNP analysis libraries could benefit from integrating linear regression models for genotype imputation, providing a viable and light-weight alternative to reference based imputation.

prs313 snp, regression model, snp, (14 more...)

2407.09355

Country:

North America > United States > Michigan (0.04)
North America > United States > Maryland (0.04)

Genre: Research Report > New Finding (0.69)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)