AITopics

2404.13056

Country:

North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)
Asia > Middle East > Jordan (0.04)
North America > United States > Texas > Brazoria County > Lake Jackson (0.04)
(5 more...)

Genre: Research Report > New Finding (0.34)

Industry: Automobiles & Trucks > Manufacturer (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Zhou, Youran, Aryal, Sunil, Bouadjenek, Mohamed Reda

Review for Handling Missing Data with special missing mechanism

arXiv.org Artificial IntelligenceApr-7-2024

Missing data poses a significant challenge in data science, affecting decision-making processes and outcomes. Understanding what missing data is, how it occurs, and why it is crucial to handle it appropriately is paramount when working with real-world data, especially in tabular data, one of the most commonly used data types in the real world. Three missing mechanisms are defined in the literature: Missing Completely At Random (MCAR), Missing At Random (MAR), and Missing Not At Random (MNAR), each presenting unique challenges in imputation. Most existing work are focused on MCAR that is relatively easy to handle. The special missing mechanisms of MNAR and MAR are less explored and understood. This article reviews existing literature on handling missing values. It compares and contrasts existing methods in terms of their ability to handle different missing mechanisms and data types. It identifies research gap in the existing literature and lays out potential directions for future research in the field. The information in this review will help data analysts and researchers to adopt and promote good practices for handling missing data in real-world problems.

dataset, imputation, mechanism, (17 more...)

2404.04905

Country:

Oceania > Australia (0.04)
North America > United States > District of Columbia > Washington (0.04)
North America > Canada > Ontario > Toronto (0.04)
(3 more...)

Genre:

Research Report > Promising Solution (1.00)
Overview (1.00)
Research Report > New Finding (0.92)
Research Report > Experimental Study (0.67)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Information Technology > Security & Privacy (0.67)

Technology:

Information Technology > Data Science > Data Quality (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(5 more...)

Agarwal, Rohit, Das, Arijit, Horsch, Alexander, Agarwal, Krishna, Prasad, Dilip K.

Online Learning under Haphazard Input Conditions: A Comprehensive Review and Analysis

arXiv.org Artificial IntelligenceApr-7-2024

The domain of online learning has experienced multifaceted expansion owing to its prevalence in real-life applications. Nonetheless, this progression operates under the assumption that the input feature space of the streaming data remains constant. In this survey paper, we address the topic of online learning in the context of haphazard inputs, explicitly foregoing such an assumption. We discuss, classify, evaluate, and compare the methodologies that are adept at modeling haphazard inputs, additionally providing the corresponding code implementations and their carbon footprint. Moreover, we classify the datasets related to the field of haphazard inputs and introduce evaluation metrics specifically designed for datasets exhibiting imbalance. The code of each methodology can be found at https://github.com/Rohit102497/HaphazardInputsReview

dataset, haphazard input, time 0, (16 more...)

2404.04903

Country:

Asia > India > Jharkhand > Dhanbad (0.04)
Europe > Norway > Northern Norway > Troms > Tromsø (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
(6 more...)

Genre: Overview (1.00)

Industry:

Information Technology (1.00)
Health & Medicine > Therapeutic Area (1.00)
Energy (1.00)
Education > Educational Setting > Online (0.81)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
(5 more...)

Islam, Taminul, Sheakh, Md. Alif, Tahosin, Mst. Sazia, Hena, Most. Hasna, Akash, Shopnil, Jardan, Yousef A. Bin, Wondmie, Gezahign Fentahun, Nafidi, Hiba-Allah, Bourhia, Mohammed

Predictive Modeling for Breast Cancer Classification in the Context of Bangladeshi Patients: A Supervised Machine Learning Approach with Explainable AI

arXiv.org Artificial IntelligenceApr-6-2024

Breast cancer has rapidly increased in prevalence in recent years, making it one of the leading causes of mortality worldwide. Among all cancers, it is by far the most common. Diagnosing this illness manually requires significant time and expertise. Since detecting breast cancer is a time-consuming process, preventing its further spread can be aided by creating machine-based forecasts. Machine learning and Explainable AI are crucial in classification as they not only provide accurate predictions but also offer insights into how the model arrives at its decisions, aiding in the understanding and trustworthiness of the classification results. In this study, we evaluate and compare the classification accuracy, precision, recall, and F-1 scores of five different machine learning methods using a primary dataset (500 patients from Dhaka Medical College Hospital). Five different supervised machine learning techniques, including decision tree, random forest, logistic regression, naive bayes, and XGBoost, have been used to achieve optimal results on our dataset. Additionally, this study applied SHAP analysis to the XGBoost model to interpret the model's predictions and understand the impact of each feature on the model's output. We compared the accuracy with which several algorithms classified the data, as well as contrasted with other literature in this field. After final evaluation, this study found that XGBoost achieved the best model accuracy, which is 97%.

accuracy, algorithm, dataset, (14 more...)

2404.04686

Country:

Asia > Bangladesh > Dhaka Division > Dhaka District > Dhaka (0.25)
North America > United States > Wisconsin (0.05)
Asia > Middle East > Saudi Arabia > Riyadh Province > Riyadh (0.04)
(5 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology > Breast Cancer (1.00)
Health & Medicine > Therapeutic Area > Obstetrics/Gynecology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(4 more...)

Samani, Rasoul, Dehghani, Mohammad, Shahrokh, Fahime

Enhancing Readmission Prediction with Deep Learning: Extracting Biomedical Concepts from Clinical Texts

arXiv.org Artificial IntelligenceApr-6-2024

Hospital readmission, defined as patients being re-hospitalized shortly after discharge, is a critical concern as it impacts patient outcomes and healthcare costs. Identifying patients at risk of readmission allows for timely interventions, reducing re-hospitalization rates and overall treatment costs. This study focuses on predicting patient readmission within less than 30 days using text mining techniques applied to discharge report texts from electronic health records (EHR). Various machine learning and deep learning methods were employed to develop a classification model for this purpose. A novel aspect of this research involves leveraging the Bio-Discharge Summary Bert (BDSS) model along with principal component analysis (PCA) feature extraction to preprocess data for deep learning model input. Our analysis of the MIMIC-III dataset indicates that our approach, which combines the BDSS model with a multilayer perceptron (MLP), outperforms state-of-the-art methods. This model achieved a recall of 94% and an area under the curve (AUC) of 75%, showcasing its effectiveness in predicting patient readmissions. This study contributes to the advancement of predictive modeling in healthcare by integrating text mining techniques with deep learning algorithms to improve patient outcomes and optimize resource allocation.

dataset, hospital readmission, readmission, (13 more...)

2403.09722

Country:

North America > United States > Maryland > Baltimore (0.04)
North America > United States > California > San Diego County > San Diego (0.04)

Genre:

Research Report > New Finding (0.48)
Research Report > Promising Solution (0.34)

Industry:

Health & Medicine > Health Care Providers & Services (1.00)
Health & Medicine > Therapeutic Area (0.94)
Health & Medicine > Health Care Technology > Medical Record (0.87)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.94)

arXiv.org Artificial IntelligenceApr-6-2024

PoLLMgraph: Unraveling Hallucinations in Large Language Models via State Transition Dynamics

Zhu, Derui, Chen, Dingfan, Li, Qing, Chen, Zongxiong, Ma, Lei, Grossklags, Jens, Fritz, Mario

Despite tremendous advancements in large language models (LLMs) over recent years, a notably urgent challenge for their practical deployment is the phenomenon of hallucination, where the model fabricates facts and produces non-factual statements. In response, we propose PoLLMgraph, a Polygraph for LLMs, as an effective model-based white-box detection and forecasting approach. PoLLMgraph distinctly differs from the large body of existing research that concentrates on addressing such challenges through black-box evaluations. In particular, we demonstrate that hallucination can be effectively detected by analyzing the LLM's internal state transition dynamics during generation via tractable probabilistic models. Experimental results on various open-source LLMs confirm the efficacy of PoLLMgraph, outperforming state-of-the-art methods by a considerable margin, evidenced by over 20% improvement in AUC-ROC on common benchmarking datasets like TruthfulQA. Our work paves a new way for model-based white-box analysis of LLMs, motivating the research community to further explore, understand, and refine the intricate dynamics of LLM behaviors.

dataset, hallucination, pollmgraph, (14 more...)

2404.04722

Country:

North America > United States (0.46)
North America > Canada > Alberta (0.14)
Europe > Norway > Western Norway > Rogaland > Stavanger (0.04)
(4 more...)

Genre: Research Report > Promising Solution (0.34)

Industry: Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

arXiv.org Machine LearningApr-6-2024

Bayesian Inference for Consistent Predictions in Overparameterized Nonlinear Regression

Wakayama, Tomoya

The remarkable generalization performance of overparameterized models has challenged the conventional wisdom of statistical learning theory. While recent theoretical studies have shed light on this behavior in linear models or nonlinear classifiers, a comprehensive understanding of overparameterization in nonlinear regression models remains lacking. This paper explores the predictive properties of overparameterized nonlinear regression within the Bayesian framework, extending the methodology of adaptive prior based on the intrinsic spectral structure of the data. We establish posterior contraction for single-neuron models with Lipschitz continuous activation functions and for generalized linear models, demonstrating that our approach achieves consistent predictions in the overparameterized regime. Moreover, our Bayesian framework allows for uncertainty estimation of the predictions. The proposed method is validated through numerical simulations and a real data application, showcasing its ability to achieve accurate predictions and reliable uncertainty estimates. Our work advances the theoretical understanding of the blessing of overparameterization and offers a principled Bayesian approach for prediction in large nonlinear models.

posterior distribution, prediction, regression, (13 more...)

2404.04498

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
Asia > Japan > Honshū > Kansai > Wakayama Prefecture > Wakayama (0.04)
North America > United States > Virginia (0.04)
(2 more...)

Genre: Research Report > New Finding (0.89)

Industry: Health & Medicine > Therapeutic Area > Oncology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Böttcher, Lucas, Wheeler, Gregory

Statistical Mechanics and Artificial Neural Networks: Principles, Models, and Applications

arXiv.org Artificial IntelligenceApr-5-2024

The field of neuroscience and the development of artificial neural networks (ANNs) have mutually influenced each other, drawing from and contributing to many concepts initially developed in statistical mechanics. Notably, Hopfield networks and Boltzmann machines are versions of the Ising model, a model extensively studied in statistical mechanics for over a century. In the first part of this chapter, we provide an overview of the principles, models, and applications of ANNs, highlighting their connections to statistical mechanics and statistical learning theory. Artificial neural networks can be seen as high-dimensional mathematical functions, and understanding the geometric properties of their loss landscapes (i.e., the high-dimensional space on which one wishes to find extrema or saddles) can provide valuable insights into their optimization behavior, generalization abilities, and overall performance. Visualizing these functions can help us design better optimization methods and improve their generalization abilities. Thus, the second part of this chapter focuses on quantifying geometric properties and visualizing loss functions associated with deep ANNs.

curvature, neural network, projection, (16 more...)

2405.10957

Country:

North America > United States > New York > Richmond County > New York City (0.14)
North America > United States > New York > Queens County > New York City (0.14)
North America > United States > New York > New York County > New York City (0.14)
(31 more...)

Genre:

Overview (0.68)
Research Report > New Finding (0.46)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.92)

arXiv.org Machine LearningApr-5-2024

Bayesian Additive Regression Networks

Van Boxel, Danielle

We apply Bayesian Additive Regression Tree (BART) principles to training an ensemble of small neural networks for regression tasks. Using Markov Chain Monte Carlo, we sample from the posterior distribution of neural networks that have a single hidden layer. To create an ensemble of these, we apply Gibbs sampling to update each network against the residual target value (i.e. subtracting the effect of the other networks). We demonstrate the effectiveness of this technique on several benchmark regression problems, comparing it to equivalent shallow neural networks, BART, and ordinary least squares. Our Bayesian Additive Regression Networks (BARN) provide more consistent and often more accurate results. On test data benchmarks, BARN averaged between 5 to 20 percent lower root mean square error. This error performance does come at the cost, however, of greater computation time. BARN sometimes takes on the order of a minute where competing methods take a second or less. But, BARN without cross-validated hyperparameter tuning takes about the same amount of computation time as tuned other methods. Yet BARN is still typically more accurate.

bart, ensemble, neural network, (15 more...)

2404.04425

Country:

North America > United States > Arizona > Pima County > Tucson (0.14)
North America > United States > Wisconsin (0.05)
North America > United States > California (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)

arXiv.org Machine LearningApr-5-2024

Score identity Distillation: Exponentially Fast Distillation of Pretrained Diffusion Models for One-Step Generation

Zhou, Mingyuan, Zheng, Huangjie, Wang, Zhendong, Yin, Mingzhang, Huang, Hai

We introduce Score identity Distillation (SiD), an innovative data-free method that distills the generative capabilities of pretrained diffusion models into a single-step generator. SiD not only facilitates an exponentially fast reduction in Fr\'echet inception distance (FID) during distillation but also approaches or even exceeds the FID performance of the original teacher diffusion models. By reformulating forward diffusion processes as semi-implicit distributions, we leverage three score-related identities to create an innovative loss mechanism. This mechanism achieves rapid FID reduction by training the generator using its own synthesized images, eliminating the need for real data or reverse-diffusion-based generation, all accomplished within significantly shortened generation time. Upon evaluation across four benchmark datasets, the SiD algorithm demonstrates high iteration efficiency during distillation and surpasses competing distillation approaches, whether they are one-step or few-step, data-free, or dependent on training data, in terms of generation quality. This achievement not only redefines the benchmarks for efficiency and effectiveness in diffusion distillation but also in the broader field of diffusion-based generation. Our PyTorch implementation will be publicly accessible on GitHub.

diffusion model, international conference, neural information processing system, (12 more...)

2404.04057

Country:

North America > United States > Texas > Travis County > Austin (0.04)
Europe > Germany > Baden-Württemberg > Freiburg (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology > Security & Privacy (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)