AITopics

2411.01001

Country:

North America > United States > Nebraska > Lancaster County > Lincoln (0.14)
Oceania > Australia (0.04)
Asia > Japan > Honshū > Tōhoku > Fukushima Prefecture > Fukushima (0.04)
(5 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.68)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.89)

Porwal, Anupreet, Rodriguez, Abel

Dirichlet process mixtures of block $g$ priors for model selection and prediction in linear models

arXiv.org Artificial Intelligence, Nov-1-2024

This paper introduces Dirichlet process mixtures of block $g$ priors for model selection and prediction in linear models. These priors are extensions of traditional mixtures of $g$ priors that allow for differential shrinkage for various (data-selected) blocks of parameters while fully accounting for the predictors' correlation structure, providing a bridge between the literatures on model selection and continuous shrinkage priors. We show that Dirichlet process mixtures of block $g$ priors are consistent in various senses and, in particular, that they avoid the conditional Lindley "paradox" highlighted by Som et al. (2016). Further, we develop a Markov chain Monte Carlo algorithm for posterior inference that requires only minimal ad-hoc tuning. Finally, we investigate the empirical performance of the prior in various real and simulated datasets. In the presence of a small number of very large effects, Dirichlet process mixtures of block $g$ priors lead to higher power for detecting smaller but significant effects with only a minimal increase in the number of false discoveries.
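For orientation, the classical Zellner $g$ prior that mixtures of $g$ priors build on can be written as below. This is the standard textbook form in generic notation ($y$, $X$, $\beta$, $\sigma^2$, $\alpha$ are assumed symbols), not the paper's own block formulation, which is not given in the abstract.

```latex
\begin{aligned}
y &= \alpha \mathbf{1}_n + X\beta + \varepsilon,
  \qquad \varepsilon \sim \mathcal{N}\!\left(0, \sigma^2 I_n\right), \\
\beta \mid g, \sigma^2 &\sim \mathcal{N}\!\left(0,\; g\,\sigma^2 \left(X^\top X\right)^{-1}\right).
\end{aligned}
```

A mixture of $g$ priors places a hyperprior on the single scalar $g$, so all coefficients share one shrinkage factor. Block variants, as the abstract describes, instead allow different blocks of coefficients to be shrunk by different amounts.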

bayes factor, coefficient, procedure, (14 more...)

2411.00471

Country:

North America > United States > Ohio (0.04)
North America > United States > California > Los Angeles County > Los Angeles (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.93)

arXiv.org Machine Learning, Nov-1-2024

Classification problem in liability insurance using machine learning models: a comparative study

Qazvini, Marjan

Insurance companies use different factors to classify policyholders. In this study, we apply several machine learning models, such as nearest neighbour and logistic regression, to the Actuarial Challenge dataset used by Qazvini (2019) to classify liability insurance policies into two groups: (1) policies with claims and (2) policies without claims. The applications of Machine Learning (ML) models and Artificial Intelligence (AI) in areas such as medical diagnosis, economics, banking, fraud detection, agriculture, etc., have been known for quite a number of years. ML models have changed these industries remarkably. However, despite their high predictive power and their capability to identify nonlinear transformations and interactions between variables, they are only slowly being introduced into the insurance industry and actuarial fields.

artificial intelligence, machine learning, policy and claim frequency, (16 more...)

2411.00354

Country:

Europe > France > Île-de-France (0.68)
Europe > France > Nouvelle-Aquitaine (0.68)
Europe > France > Provence-Alpes-Côte d'Azur (0.46)
(3 more...)

Genre: Research Report > New Finding (0.70)

Industry:

Banking & Finance > Insurance (1.00)
Transportation > Ground > Road (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.37)

AIHub, Oct-31-2024, 10:30:11 GMT

Building trust in AI: Transparent models for better decisions

AI is becoming a part of our daily lives, from approving loans to diagnosing diseases. AI model outputs are used to make increasingly important decisions, based on smart algorithms and data. But if we can't understand these decisions, how can we trust them? One approach to making AI decisions more understandable is to use models that are inherently interpretable. These are models that are designed in such a way that consumers of the model outputs can infer the model's behaviour by reading the parameters of the model. Popular inherently interpretable models include Decision Trees and Linear Regression.
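As a rough illustration of what "reading the parameters" means for Linear Regression, the sketch below fits a one-feature model by ordinary least squares and prints its two parameters. The data and the names `income` and `credit_limit` are hypothetical and chosen only for this example; nothing here comes from the article itself.

```python
from statistics import mean


def fit_simple_linear_regression(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of squared deviations of x, and cross-deviations of x and y.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical toy data: yearly income vs. approved credit limit (both in thousands).
income = [20, 30, 40, 50, 60]
credit_limit = [3.0, 5.0, 7.0, 9.0, 11.0]

intercept, slope = fit_simple_linear_regression(income, credit_limit)

# The fitted parameters are the explanation: each extra unit of income
# changes the predicted credit limit by `slope` units.
print(f"intercept={intercept:.2f}, slope={slope:.2f}")
```

Here the slope can be read directly as "how much the prediction changes per unit of the input", which is the kind of inference a consumer of the model output can make without extra tooling.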

logistic model, probability, regression, (15 more...)

AIHub

Industry: Banking & Finance (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.59)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.40)

arXiv.org Machine Learning, Oct-30-2024

Explainable Artificial Intelligence for Dependent Features: Additive Effects of Collinearity

Salih, Ahmed M

Explainable Artificial Intelligence (XAI) emerged to reveal the internal mechanism of machine learning models and how the features affect the prediction outcome. Collinearity is one of the major issues that XAI methods face when identifying the most informative features in a model. Current XAI approaches assume that the features in a model are independent and calculate the effect of each feature on the model prediction independently of the other features. However, such an assumption is not realistic in real-life applications. We propose Additive Effects of Collinearity (AEC), a novel XAI method that aims to account for collinearity when modelling the effect of each feature on the outcome. AEC is based on the idea of dividing multivariate models into several univariate models in order to examine their impact on each other and consequently on the outcome. The proposed method is applied to simulated and real data to validate its efficiency in comparison with a state-of-the-art XAI method. The results indicate that AEC is more robust and stable against the impact of collinearity when explaining AI models than the state-of-the-art XAI method.

artificial intelligence, collinearity, machine learning, (16 more...)

2411.00846

Country:

Europe > United Kingdom > England > Greater London > London (0.15)
North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > United Kingdom > England > Leicestershire > Leicester (0.05)
(3 more...)

Genre: Research Report (0.85)

Industry: Health & Medicine > Therapeutic Area (0.71)

Technology:

Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.31)

Rezapour, Mostafa, Narayanan, Aarthi, Mowery, Wyatt H., Gurcan, Metin Nafi

Assessing Concordance between RNA-Seq and NanoString Technologies in Ebola-Infected Nonhuman Primates Using Machine Learning

arXiv.org Artificial Intelligence, Oct-30-2024

This study evaluates the concordance between RNA sequencing (RNA-Seq) and NanoString technologies for gene expression analysis in non-human primates (NHPs) infected with Ebola virus (EBOV). We performed a detailed comparison of both platforms, demonstrating a strong correlation between them, with Spearman coefficients for 56 out of 62 samples ranging from 0.78 to 0.88, with a mean of 0.83 and a median of 0.85. Bland-Altman analysis further confirmed high consistency, with most measurements falling within 95% confidence limits. A machine learning approach, using the Supervised Magnitude-Altitude Scoring (SMAS) method trained on NanoString data, identified OAS1 as a key marker for distinguishing RT-qPCR positive from negative samples. Remarkably, when applied to RNA-Seq data, OAS1 also achieved 100% accuracy in differentiating infected from uninfected samples using logistic regression, demonstrating its robustness across platforms. Further differential expression analysis identified 12 common genes including ISG15, OAS1, IFI44, IFI27, IFIT2, IFIT3, IFI44L, MX1, MX2, OAS2, RSAD2, and OASL which demonstrated the highest levels of statistical significance and biological relevance across both platforms. Gene Ontology (GO) analysis confirmed that these genes are directly involved in key immune and viral infection pathways, reinforcing their importance in EBOV infection. In addition, RNA-Seq uniquely identified genes such as CASP5, USP18, and DDX60, which play key roles in immune regulation and antiviral defense. This finding highlights the broader detection capabilities of RNA-Seq and underscores the complementary strengths of both platforms in providing a comprehensive and accurate assessment of gene expression changes during Ebola virus infection.
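The two agreement measures named in the abstract, the Spearman coefficient and Bland-Altman analysis, can be sketched in a few lines. This is a minimal illustration on made-up numbers, not the authors' pipeline: `platform_a` and `platform_b` are hypothetical expression values for one sample, and the Bland-Altman limits use the conventional bias ± 1.96 standard deviations of the paired differences.

```python
from statistics import mean, stdev


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    a_bar, b_bar = mean(a), mean(b)
    cov = sum((x - a_bar) * (y - b_bar) for x, y in zip(a, b))
    var_a = sum((x - a_bar) ** 2 for x in a)
    var_b = sum((y - b_bar) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def spearman(a, b):
    """Spearman coefficient: Pearson correlation of the ranks."""
    return pearson(ranks(a), ranks(b))


def bland_altman_limits(a, b):
    """Return (bias, lower limit, upper limit) of the paired differences."""
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread


# Hypothetical expression values for the same genes on two platforms.
platform_a = [2.1, 3.4, 5.0, 6.2, 7.8, 9.1]
platform_b = [2.0, 3.9, 4.8, 7.7, 7.5, 9.4]

rho = spearman(platform_a, platform_b)
bias, lower, upper = bland_altman_limits(platform_a, platform_b)
print(f"Spearman rho={rho:.3f}")
print(f"Bland-Altman bias={bias:.3f}, limits=({lower:.3f}, {upper:.3f})")
```

Spearman measures whether the two platforms rank genes in the same order, while the Bland-Altman limits describe how far individual paired measurements typically disagree, so the two checks answer different questions.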

platform, regulation, rna-seq, (15 more...)

2410.23433

Country:

North America > United States > Virginia > Fairfax County > Fairfax (0.04)
North America > United States > North Carolina > Forsyth County > Winston-Salem (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.35)

Wang, Huan-Chih, Wu, Ja-Ling

A Study of Secure Algorithms for Vertical Federated Learning: Take Secure Logistic Regression as an Example

arXiv.org Artificial Intelligence, Oct-30-2024

With the arrival of the era of big data, more and more companies build services with machine learning techniques. However, it is costly for companies to collect data and extract helpful handcrafted features on their own. Combining their data with that of other companies can boost a model's performance, but this approach may be prohibited by law. In other words, finding the balance between sharing data with others and protecting data from privacy leakage is a crucial topic worthy of close attention. This paper focuses on distributed data and conducts secure model-training tasks on a vertical federated learning scheme. Here, secure implies that the whole process is executed in the encrypted domain; therefore, the privacy concern is alleviated.

dataset, logistic regression model, regression model, (8 more...)

2410.2296

Country:

Asia > Taiwan (0.04)
Europe > France (0.04)
Asia > China > Hong Kong (0.04)

Genre:

Research Report > Experimental Study (0.58)
Research Report > New Finding (0.58)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (0.95)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.62)

Buriticá, Gloria, Engelke, Sebastian

Progression: an extrapolation principle for regression

arXiv.org Machine Learning, Oct-30-2024

The problem of regression extrapolation, or out-of-distribution generalization, arises when predictions are required at test points outside the range of the training data. In such cases, the non-parametric guarantees for regression methods from both statistics and machine learning typically fail. Based on the theory of tail dependence, we propose a novel statistical extrapolation principle. After a suitable, data-adaptive marginal transformation, it assumes a simple relationship between predictors and the response at the boundary of the training predictor samples. This assumption holds for a wide range of models, including non-parametric regression functions with additive noise. Our semi-parametric method, progression, leverages this extrapolation principle and offers guarantees on the approximation error beyond the training data range. We demonstrate how this principle can be effectively integrated with existing approaches, such as random forests and additive models, to improve extrapolation performance on out-of-distribution samples.

assumption, extrapolation, extrapolation principle, (15 more...)

2410.23246

Country:

North America > United States > New York (0.04)
North America > United States > Florida > Palm Beach County > Boca Raton (0.04)
Europe > United Kingdom > England > West Sussex (0.04)
(4 more...)

Genre:

Research Report (0.50)
Overview (0.45)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

arXiv.org Machine Learning, Oct-30-2024

Very fast Bayesian Additive Regression Trees on GPU

Petrillo, Giacomo

Bayesian Additive Regression Trees (BART) is a nonparametric Bayesian regression method, introduced by Chipman, George, and McCulloch (2006, 2010). It defines a prior distribution over the space of functions by representing them as a sum of binary decision trees, and then specifying a stochastic tree generation process. The posterior is then obtained with Metropolis-Gibbs sampling over the trees. See Hill, Linero, and Murray (2020) for a review, and Daniels, Linero, and Roy (2023, ch. 5) for a textbook treatment. BART has proven empirically effective, and is gaining popularity (consider, e.g., Tan and Roy 2019). The Atlantic Causal Inference Conference (ACIC) Data Challenge has confirmed BART as one of the best regression methods for causal inference (Dorie et al. 2019; Gruber et al. 2019; Hahn, Dorie, and Murray 2019; Thal and Finucane 2023). Many BART variants have been developed throughout the years, adding features such as variable selection (Linero 2018).

bart, bayesian additive regression tree, implementation, (15 more...)

2410.23244

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > New York (0.04)
Europe > Italy (0.04)

Genre: Research Report (0.50)

Industry: Health & Medicine (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Fernsel, Linda, Kalff, Yannick, Simbeck, Katharina

Assessing the Auditability of AI-integrating Systems: A Framework and Learning Analytics Case Study

arXiv.org Artificial Intelligence, Oct-29-2024

Audits contribute to the trustworthiness of Learning Analytics (LA) systems that integrate Artificial Intelligence (AI) and may be legally required in the future. We argue that the efficacy of an audit depends on the auditability of the audited system. Therefore, systems need to be designed with auditability in mind. We present a framework for assessing the auditability of AI-integrating systems that consists of three parts: (1) verifiable claims about the validity, utility and ethics of the system, (2) evidence on subjects (data, models or the system) in different types (documentation, raw sources and logs) to back or refute claims, and (3) technical means (APIs, monitoring tools, explainable AI, etc.) that make the evidence accessible to auditors. We apply the framework to assess the auditability of Moodle's dropout prediction system and a prototype AI-based LA. We find that Moodle's auditability is limited by incomplete documentation, insufficient monitoring capabilities and a lack of available test data. The framework supports assessing the auditability of AI-based LA systems in use and improves the design of auditable systems and thus of audits.

auditability, data mining, machine learning, (15 more...)

2411.08906

Country:

North America > United States > New York (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
North America > United States > California > Los Angeles County > Los Angeles (0.04)
(4 more...)

Genre:

Research Report (1.00)
Instructional Material > Course Syllabus & Notes (0.93)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Education > Educational Setting > Online (1.00)
(2 more...)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
(2 more...)