AITopics

2311.11491

Country:

North America > Canada (0.04)
North America > United States > Virginia (0.04)
North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
(2 more...)

Genre:

Overview (1.00)
Research Report (0.64)

Industry:

Banking & Finance (1.00)
Information Technology > Security & Privacy (0.93)
Health & Medicine > Therapeutic Area > Oncology (0.34)

Technology:

Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.67)
(3 more...)

Jafarigol, Elaheh, Trafalis, Theodore, Razzaghi, Talayeh, Zamankhani, Mona

Exploring Machine Learning Models for Federated Learning: A Review of Approaches, Performance, and Limitations

arXiv.org Artificial IntelligenceNov-17-2023

In the growing world of artificial intelligence, federated learning is a distributed learning framework enhanced to preserve the privacy of individuals' data. Federated learning lays the groundwork for collaborative research in areas where the data is sensitive. Federated learning has several implications for real-world problems. In times of crisis, when real-time decision-making is critical, federated learning allows multiple entities to work collectively without sharing sensitive data. This distributed approach enables us to leverage information from multiple sources and gain more diverse insights. This paper is a systematic review of the literature on privacy-preserving machine learning in the last few years based on the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. Specifically, we have presented an extensive review of supervised/unsupervised machine learning algorithms, ensemble methods, meta-heuristic approaches, blockchain technology, and reinforcement learning used in the framework of federated learning, in addition to an overview of federated learning applications. This paper reviews the literature on the components of federated learning and its applications in the last few years. The main purpose of this work is to provide researchers and practitioners with a comprehensive overview of federated learning from the machine learning point of view. A discussion of some open problems and future research directions in federated learning is also provided.

application, federated learning, learning, (12 more...)

2311.10832

Country:

North America > United States > Oklahoma (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
North America > United States > New York > New York County > New York City (0.04)
(2 more...)

Genre:

Overview (1.00)
Research Report > New Finding (0.93)

Industry:

Law Enforcement & Public Safety (1.00)
Information Technology > Security & Privacy (1.00)
Health & Medicine > Health Care Technology (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
(5 more...)

Cavalheiro, Lucca Portes, Bernard, Simon, Barddal, Jean Paul, Heutte, Laurent

Random Forest Kernel for High-Dimension Low Sample Size Classification

arXiv.org Machine LearningNov-17-2023

High dimension, low sample size (HDLSS) problems are numerous among real-world applications of machine learning. From medical images to text processing, traditional machine learning algorithms are usually unsuccessful in learning the best possible concept from such data. In a previous work, we proposed a dissimilarity-based approach for multi-view classification, the Random Forest Dissimilarity (RFD), that perfoms state-of-the-art results for such problems. In this work, we transpose the core principle of this approach to solving HDLSS classification problems, by using the RF similarity measure as a learned precomputed SVM kernel (RFSVM). We show that such a learned similarity measure is particularly suited and accurate for this classification context. Experiments conducted on 40 public HDLSS classification datasets, supported by rigorous statistical analyses, show that the RFSVM method outperforms existing methods for the majority of HDLSS problems and remains at the same time very competitive for low or non-HDLSS problems.

artificial intelligence, decision tree learning, machine learning, (17 more...)

doi: 10.1007/s11222-023-10309-0

2310.1471

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > France > Normandy > Seine-Maritime > Rouen (0.04)
South America > Brazil > Paraná > Curitiba (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report > New Finding (0.93)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.68)
Health & Medicine > Diagnostic Medicine > Imaging (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.87)

Goedhart, Jeroen M., Klausch, Thomas, Janssen, Jurriaan, van de Wiel, Mark A.

Co-data Learning for Bayesian Additive Regression Trees

arXiv.org Machine LearningNov-16-2023

Medical prediction applications often need to deal with small sample sizes compared to the number of covariates. Such data pose problems for prediction and variable selection, especially when the covariate-response relationship is complicated. To address these challenges, we propose to incorporate co-data, i.e. external information on the covariates, into Bayesian additive regression trees (BART), a sum-of-trees prediction model that utilizes priors on the tree parameters to prevent overfitting. To incorporate co-data, an empirical Bayes (EB) framework is developed that estimates, assisted by a co-data model, prior covariate weights in the BART model. The proposed method can handle multiple types of co-data simultaneously. Furthermore, the proposed EB framework enables the estimation of the other hyperparameters of BART as well, rendering an appealing alternative to cross-validation. We show that the method finds relevant covariates and that it improves prediction compared to default BART in simulations. If the covariate-response relationship is nonlinear, the method benefits from the flexibility of BART to outperform regression-based co-data learners. Finally, the use of co-data enhances prediction in an application to diffuse large B-cell lymphoma prognosis based on clinical covariates, gene mutations, DNA translocations, and DNA copy number data. Keywords: Bayesian additive regression trees; Empirical Bayes; Co-data; High-dimensional data; Omics; Prediction

artificial intelligence, covariate, machine learning, (20 more...)

2311.09997

Country:

Europe > Netherlands > North Holland > Amsterdam (0.04)
North America > United States (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.66)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.87)
Health & Medicine > Therapeutic Area > Oncology > Lymphoma (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Khosravi, Hamed, Farhadpour, Sarah, Grandhi, Manikanta, Raihan, Ahmed Shoyeb, Das, Srinjoy, Ahmed, Imtiaz

Strategic Data Augmentation with CTGAN for Smart Manufacturing: Enhancing Machine Learning Predictions of Paper Breaks in Pulp-and-Paper Production

arXiv.org Artificial IntelligenceNov-15-2023

A significant challenge for predictive maintenance in the pulp-and-paper industry is the infrequency of paper breaks during the production process. In this article, operational data is analyzed from a paper manufacturing machine in which paper breaks are relatively rare but have a high economic impact. Utilizing a dataset comprising 18,398 instances derived from a quality assurance protocol, we address the scarcity of break events (124 cases) that pose a challenge for machine learning predictive models. With the help of Conditional Generative Adversarial Networks (CTGAN) and Synthetic Minority Oversampling Technique (SMOTE), we implement a novel data augmentation framework. This method ensures that the synthetic data mirrors the distribution of the real operational data but also seeks to enhance the performance metrics of predictive modeling. Before and after the data augmentation, we evaluate three different machine learning algorithms-Decision Trees (DT), Random Forest (RF), and Logistic Regression (LR). Utilizing the CTGAN-enhanced dataset, our study achieved significant improvements in predictive maintenance performance metrics. The efficacy of CTGAN in addressing data scarcity was evident, with the models' detection of machine breaks (Class 1) improving by over 30% for Decision Trees, 20% for Random Forest, and nearly 90% for Logistic Regression. With this methodological advancement, this study contributes to industrial quality control and maintenance scheduling by addressing rare event prediction in manufacturing processes.

class 1, dataset, real data, (12 more...)

2311.09333

Country: North America > United States > West Virginia > Monongalia County > Morgantown (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Materials > Paper & Forest Products > Paper Products (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.69)
(2 more...)

arXiv.org Artificial IntelligenceNov-15-2023

In-vehicle Sensing and Data Analysis for Older Drivers with Mild Cognitive Impairment

Moshfeghi, Sonia, Jan, Muhammad Tanveer, Conniff, Joshua, Ghoreishi, Seyedeh Gol Ara, Jang, Jinwoo, Furht, Borko, Yang, Kwangsoo, Rosselli, Monica, Newman, David, Tappen, Ruth, Smith, Dana

Driving is a complex daily activity indicating age and disease related cognitive declines. Therefore, deficits in driving performance compared with ones without mild cognitive impairment (MCI) can reflect changes in cognitive functioning. There is increasing evidence that unobtrusive monitoring of older adults driving performance in a daily-life setting may allow us to detect subtle early changes in cognition. The objectives of this paper include designing low-cost in-vehicle sensing hardware capable of obtaining high-precision positioning and telematics data, identifying important indicators for early changes in cognition, and detecting early-warning signs of cognitive impairment in a truly normal, day-to-day driving condition with machine learning approaches. Our statistical analysis comparing drivers with MCI to those without reveals that those with MCI exhibit smoother and safer driving patterns. This suggests that drivers with MCI are cognizant of their condition and tend to avoid erratic driving behaviors. Furthermore, our Random Forest models identified the number of night trips, number of trips, and education as the most influential factors in our data evaluation.

cognitive impairment, dementia, older driver, (11 more...)

2311.09273

Country:

North America > United States > Florida > Palm Beach County > Boca Raton (0.06)
North America > United States > Florida > Hillsborough County > University (0.05)
North America > United States > Nevada (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
Health & Medicine > Therapeutic Area > Neurology > Alzheimer's Disease (0.69)
Government > Regional Government > North America Government > United States Government (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.35)

Williams, Elizabeth, Gartner, Daniel, Harper, Paul

Predictive and Prescriptive Analytics for Multi-Site Modeling of Frail and Elderly Patient Services

arXiv.org Artificial IntelligenceNov-13-2023

Recent research has highlighted the potential of linking predictive and prescriptive analytics. However, it remains widely unexplored how both paradigms could benefit from one another to address today's major challenges in healthcare. One of these is smarter planning of resource capacities for frail and elderly inpatient wards, addressing the societal challenge of an aging population. Frail and elderly patients typically suffer from multimorbidity and require more care while receiving medical treatment. The aim of this research is to assess how various predictive and prescriptive analytical methods, both individually and in tandem, contribute to addressing the operational challenges within an area of healthcare that is growing in demand. Clinical and demographic patient attributes are gathered from more than 165,000 patient records and used to explain and predict length of stay. To that extent, we employ Classification and Regression Trees (CART) analysis to establish this relationship. On the prescriptive side, deterministic and two-stage stochastic programs are developed to determine how to optimally plan for beds and ward staff with the objective to minimize cost. Furthermore, the two analytical methodologies are linked by generating demand for the prescriptive models using the CART groupings. The results show the linked methodologies provided different but similar results compared to using averages and in doing so, captured a more realistic real-world variation in the patient length of stay. Our research reveals that healthcare managers should consider using predictive and prescriptive models to make more informed decisions. By combining predictive and prescriptive analytics, healthcare managers can move away from relying on averages and incorporate the unique characteristics of their patients to create more robust planning decisions, mitigating risks caused by variations in demand.

frail and elderly patient service, multi-site modeling, predictive and prescriptive analytic

2311.07283

Genre: Research Report (1.00)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Data Science > Data Mining (0.80)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.53)

Papacharalampous, Georgia, Tyralis, Hristos, Doulamis, Nikolaos, Doulamis, Anastasios

Machine learning for uncertainty estimation in fusing precipitation observations from satellites and ground-based gauges

arXiv.org Machine LearningNov-13-2023

To form precipitation datasets that are accurate and, at the same time, have high spatial densities, data from satellites and gauges are often merged in the literature. However, uncertainty estimates for the data acquired in this manner are scarcely provided, although the importance of uncertainty quantification in predictive modelling is widely recognized. Furthermore, the benefits that machine learning can bring to the task of providing such estimates have not been broadly realized and properly explored through benchmark experiments. The present study aims at filling in this specific gap by conducting the first benchmark tests on the topic. On a large dataset that comprises 15-year-long monthly data spanning across the contiguous United States, we extensively compared six learners that are, by their construction, appropriate for predictive uncertainty quantification. These are the quantile regression (QR), quantile regression forests (QRF), generalized random forests (GRF), gradient boosting machines (GBM), light gradient boosting machines (LightGBM) and quantile regression neural networks (QRNN). The comparison referred to the competence of the learners in issuing predictive quantiles at nine levels that facilitate a good approximation of the entire predictive probability distribution, and was primarily based on the quantile and continuous ranked probability skill scores. Three types of predictor variables (i.e., satellite precipitation variables, distances between a point of interest and satellite grid points, and elevation at a point of interest) were used in the comparison and were additionally compared with each other. This additional comparison was based on the explainable machine learning concept of feature importance. The results suggest that the order from the best to the worst of the learners for the task investigated is the following: LightGBM, QRF, GRF, GBM, QRNN and QR...

artificial intelligence, learner, machine learning, (15 more...)

2311.07511

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > Austria > Vienna (0.14)
Europe > Greece (0.05)
(6 more...)

Genre: Research Report > New Finding (0.34)

Industry:

Energy (0.68)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

arXiv.org Artificial IntelligenceNov-12-2023

A GPU-Accelerated Moving-Horizon Algorithm for Training Deep Classification Trees on Large Datasets

Ren, Jiayang, Osuna-Enciso, Valentín, Okamoto, Morimasa, Mao, Qiangqiang, Ji, Chaojie, Cao, Liang, Hua, Kaixun, Cao, Yankai

Decision trees are essential yet NP-complete to train, prompting the widespread use of heuristic methods such as CART, which suffers from sub-optimal performance due to its greedy nature. Recently, breakthroughs in finding optimal decision trees have emerged; however, these methods still face significant computational costs and struggle with continuous features in large-scale datasets and deep trees. To address these limitations, we introduce a moving-horizon differential evolution algorithm for classification trees with continuous features (MH-DEOCT). Our approach consists of a discrete tree decoding method that eliminates duplicated searches between adjacent samples, a GPU-accelerated implementation that significantly reduces running time, and a moving-horizon strategy that iteratively trains shallow subtrees at each node to balance the vision and optimizer capability. Comprehensive studies on 68 UCI datasets demonstrate that our approach outperforms the heuristic method CART on training and testing accuracy by an average of 3.44% and 1.71%, respectively. Moreover, these numerical studies empirically demonstrate that MH-DEOCT achieves near-optimal performance (only 0.38% and 0.06% worse than the global optimal method on training and testing, respectively), while it offers remarkable scalability for deep trees (e.g., depth=8) and large-scale datasets (e.g., ten million samples).

dataset, deoct mh-deoct cart, mh-deoct, (15 more...)

2311.06952

Country:

North America > Canada > British Columbia (0.04)
North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)
Europe > Italy (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)

Genre:

Research Report > Promising Solution (0.92)
Research Report > New Finding (0.92)

Industry: Health & Medicine > Therapeutic Area (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)

arXiv.org Machine LearningNov-12-2023

ODTlearn: A Package for Learning Optimal Decision Trees for Prediction and Prescription

Vossler, Patrick, Aghaei, Sina, Justin, Nathan, Jo, Nathanael, Gómez, Andrés, Vayanos, Phebe

ODTLearn is an open-source Python package that provides methods for learning optimal decision trees for high-stakes predictive and prescriptive tasks based on the mixed-integer optimization (MIO) framework proposed in (Aghaei et al., 2019) and several of its extensions. The current version of the package provides implementations for learning optimal classification trees, optimal fair classification trees, optimal classification trees robust to distribution shifts, and optimal prescriptive trees from observational data. We have designed the package to be easy to maintain and extend as new optimal decision tree problem classes, reformulation strategies, and solution algorithms are introduced. To this end, the package follows object-oriented design principles and supports both commercial (Gurobi) and open source (COIN-OR branch and cut) solvers.

classification tree, machine learning, object-oriented architecture, (18 more...)

2307.15691

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > Texas > Travis County > Austin (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.90)
Information Technology > Artificial Intelligence > Representation & Reasoning > Object-Oriented Architecture (0.69)