 Al Qadisiyah Governorate



High-dimensional Bayesian Tobit regression for censored response with Horseshoe prior

arXiv.org Machine Learning

Censored response variables, where outcomes are only partially observed due to known bounds, arise in numerous scientific domains and present serious challenges for regression analysis. The Tobit model, a classical solution for handling left-censoring, has been widely used in economics and beyond. However, with the increasing prevalence of high-dimensional data, where the number of covariates exceeds the sample size, traditional Tobit methods become inadequate. While frequentist approaches for high-dimensional Tobit regression have recently been developed, notably through Lasso-based estimators, the Bayesian literature remains sparse and lacks theoretical guarantees. In this work, we propose a novel Bayesian framework for high-dimensional Tobit regression that addresses both censoring and sparsity. Our method leverages the Horseshoe prior to induce shrinkage and employs a data augmentation strategy to facilitate efficient posterior computation via Gibbs sampling. We establish posterior consistency and derive concentration rates under sparsity, providing the first theoretical results for Bayesian Tobit models in high dimensions. Numerical experiments show that our approach compares favorably with the recent Lasso-Tobit method. Our method is implemented in the R package tobitbayes, which can be found on GitHub.
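
As a rough illustration of the censoring mechanism described above, the following Python sketch evaluates the classical left-censored Tobit log-likelihood. It is not taken from the tobitbayes package: the function names, argument names, and the default censoring point `c = 0` are assumptions made for illustration, and neither the Horseshoe prior nor the Gibbs sampler is shown.

```python
import math


def normal_logpdf(z):
    """Log-density of the standard normal distribution at z."""
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)


def normal_cdf(z):
    """Cumulative distribution function of the standard normal at z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def tobit_loglik(y, X, beta, sigma, c=0.0):
    """Log-likelihood of a left-censored Tobit model (illustrative sketch).

    Latent model: y_star = x . beta + noise, noise ~ N(0, sigma^2).
    Observed:     y = y_star if y_star > c, otherwise y = c.
    The censoring point c and all names here are assumptions.
    """
    total = 0.0
    for yi, xi in zip(y, X):
        mu = sum(b * x for b, x in zip(beta, xi))
        if yi > c:
            # Uncensored observation: ordinary Gaussian density term.
            total += normal_logpdf((yi - mu) / sigma) - math.log(sigma)
        else:
            # Censored observation: probability that the latent value is <= c.
            # (For extremely negative arguments the CDF can underflow to 0.)
            total += math.log(normal_cdf((c - mu) / sigma))
    return total
```

A censored observation contributes only the probability mass below the bound, which is why ordinary least squares on the observed responses is biased in this setting.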


The importance of the clustering model to detect new types of intrusion in data traffic

arXiv.org Artificial Intelligence

In the current digital age, the volume of data generated by various cyber activities has become enormous and is constantly increasing. The data may contain valuable insights that can be harnessed to improve cyber security measures. However, much of this data is unclassified and qualitative, which poses significant challenges to traditional analysis methods. Clustering facilitates the identification of hidden patterns and structures in data by grouping similar data points, which makes it simpler to identify and address threats. Clustering can be defined as a data mining (DM) approach that uses similarity calculations to divide a data set into several categories. Hierarchical, density-based, and partitioning clustering algorithms are typical. The presented work uses the K-means algorithm, which is a popular clustering technique. Using the K-means algorithm, we worked with two different types of data. First, we worked with data we gathered ourselves, applying the XGBoost algorithm after completing the aggregation with the K-means algorithm. The data was gathered using the Kali Linux environment, CICFlowMeter traffic, and PuTTY software tools, with diverse and simple attacks. The concept could assist in identifying new attack types that are distinct from known attacks, and in labeling them based on the characteristics they exhibit, since the dynamic nature of cyber threats means that new attack types often emerge for which labeled data might not yet exist. The model counted the attacks and assigned a number to each of them. Second, we repeated the same work on a ready-made dataset from the Kaggle repository called "Intrusion Detection in Internet of Things Network", and the clustering model worked well and detected the number of attacks correctly, as shown in the results section.
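
The clustering step described above can be sketched with a minimal K-means (Lloyd's algorithm) in plain Python. This is only an illustration under stated assumptions: the deterministic farthest-first initialization, the function names, and the toy two-dimensional points are hypothetical choices, not the authors' pipeline or their traffic features.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def farthest_first_init(points, k):
    """Pick k starting centroids: the first point, then repeatedly the point
    farthest from all centroids chosen so far (an assumed initialization)."""
    centroids = [points[0]]
    while len(centroids) < k:
        farthest = max(
            points,
            key=lambda p: min(squared_distance(p, c) for c in centroids),
        )
        centroids.append(farthest)
    return centroids


def kmeans(points, k, max_iters=100):
    """Lloyd's algorithm: alternate assignment and centroid update steps."""
    centroids = farthest_first_init(points, k)
    labels = None
    for _ in range(max_iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # Assignments are stable, so the algorithm has converged.
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # Keep the old centroid if a cluster is empty.
                centroids[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


# Toy data: two well-separated groups standing in for two traffic types.
toy_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
toy_labels, toy_centroids = kmeans(toy_points, k=2)
```

Each cluster index acts as the "number" assigned to a group of similar flows; choosing `k` and mapping clusters to attack types are separate questions that this sketch does not address.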


FGR-Net: Interpretable fundus image gradeability classification based on deep reconstruction learning

arXiv.org Artificial Intelligence

The performance of Computer-Aided Diagnosis (CAD) systems for retinal diseases depends on the quality of the retinal images being screened. Thus, many studies have been developed to evaluate and assess the quality of such retinal images. However, most of them did not investigate the relationship between the accuracy of the developed models and the quality of the visualization of interpretability methods for distinguishing between gradable and non-gradable retinal images. Consequently, this paper presents a novel framework called FGR-Net to automatically assess and interpret underlying fundus image quality by merging an autoencoder network with a classifier network. The FGR-Net model also provides an interpretable quality assessment through visualizations. In particular, FGR-Net uses a deep autoencoder to reconstruct the input image in order to extract the visual characteristics of the input fundus images based on self-supervised learning. The features extracted by the autoencoder are then fed into a deep classifier network to distinguish between gradable and ungradable fundus images. FGR-Net is evaluated with different interpretability methods, which indicate that the autoencoder is a key factor in forcing the classifier to focus on the relevant structures of the fundus images, such as the fovea, optic disk, and prominent blood vessels. Additionally, the interpretability methods can provide visual feedback for ophthalmologists to understand how our model evaluates the quality of fundus images. The experimental results showed the superiority of FGR-Net over the state-of-the-art quality assessment methods, with an accuracy of 89% and an F1-score of 87%.


Adaptive USVs Swarm Optimization for Target Tracking in Dynamic Environments

arXiv.org Artificial Intelligence

This research investigates the performance and efficiency of Unmanned Surface Vehicles (USVs) in multi-target tracking scenarios using the Adaptive Particle Swarm Optimization with k-Nearest Neighbors (APSO-kNN) algorithm. The study explores various search patterns (Random Walk, Spiral, Lawnmower, and Cluster Search) to assess their effectiveness in dynamic environments. Through extensive simulations, we evaluate the impact of different search strategies, varying the number of targets and the USVs' sensing capabilities, and integrating a Pursuit-Evasion model to test adaptability. Our findings demonstrate that systematic search patterns like Spiral and Lawnmower provide superior coverage and tracking accuracy, making them ideal for thorough area exploration. In contrast, the Random Walk pattern, while highly adaptable, shows lower accuracy due to its non-deterministic nature, and Cluster Search maintains group cohesion but is heavily dependent on target distribution. The mixed strategy, combining multiple patterns, offers robust performance across varied scenarios, while APSO-kNN effectively balances exploration and exploitation, making it a promising approach for real-world applications such as surveillance, search and rescue, and environmental monitoring. This study provides valuable insights into optimizing search strategies and sensing configurations for USV swarms, ultimately enhancing their operational efficiency and success in complex environments.
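
To make one of the systematic patterns named above concrete, here is a minimal sketch of Lawnmower (back-and-forth sweep) waypoint generation over a rectangular area. The function name, the rectangular search area anchored at the origin, and the `spacing` parameter are assumptions for illustration; the APSO-kNN tracking logic itself is not shown.

```python
def lawnmower_waypoints(width, height, spacing):
    """Return the turning points of a back-and-forth sweep (illustrative sketch).

    The area is assumed to be the rectangle [0, width] x [0, height].
    Sweep lines run parallel to the x-axis, `spacing` apart, and the
    direction of travel alternates on each line.
    """
    if spacing <= 0:
        raise ValueError("spacing must be positive")
    waypoints = []
    n_rows = int(height // spacing) + 1  # Number of sweep lines that fit.
    for row in range(n_rows):
        y = row * spacing
        left, right = (0.0, y), (float(width), y)
        if row % 2 == 0:
            waypoints += [left, right]   # Even rows: travel left to right.
        else:
            waypoints += [right, left]   # Odd rows: travel right to left.
    return waypoints


# A 10 x 4 area swept with lines 2 units apart gives three sweep lines.
sweep = lawnmower_waypoints(10, 4, 2)
```

In practice the spacing would presumably be tied to the sensing range of each vehicle, so that adjacent sweep lines leave no uncovered strip.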


MLtoGAI: Semantic Web based with Machine Learning for Enhanced Disease Prediction and Personalized Recommendations using Generative AI

arXiv.org Artificial Intelligence

In modern healthcare, addressing the complexities of accurate disease prediction and personalized recommendations is both crucial and challenging. This research introduces MLtoGAI, which integrates Semantic Web technology with Machine Learning (ML) to enhance disease prediction and offer user-friendly explanations through ChatGPT. The system comprises three key components: a reusable disease ontology that incorporates detailed knowledge about various diseases, a diagnostic classification model that uses patient symptoms to detect specific diseases accurately, and the integration of Semantic Web Rule Language (SWRL) with ontology and ChatGPT to generate clear, personalized health advice. This approach significantly improves prediction accuracy and ensures results that are easy to understand, addressing the complexity of diseases and diverse symptoms. The MLtoGAI system demonstrates substantial advancements in accuracy and user satisfaction, contributing to developing more intelligent and accessible healthcare solutions. This innovative approach combines the strengths of ML algorithms with the ability to provide transparent, human-understandable explanations through ChatGPT, achieving significant improvements in prediction accuracy and user comprehension. By leveraging semantic technology and explainable AI, the system enhances the accuracy of disease prediction and ensures that the recommendations are relevant and easily understood by individual patients. Our research highlights the potential of integrating advanced technologies to overcome existing challenges in medical diagnostics, paving the way for future developments in intelligent healthcare systems. Additionally, the system is validated using 200 synthetic patient data records, ensuring robust performance and reliability.


Decoding Multilingual Topic Dynamics and Trend Identification through ARIMA Time Series Analysis on Social Networks: A Novel Data Translation Framework Enhanced by LDA/HDP Models

arXiv.org Artificial Intelligence

In this study, we present a novel methodology for decoding multilingual topic dynamics and identifying communication trends during crises. We focus on dialogues within Tunisian social networks during the Coronavirus Pandemic and on other notable themes like sports and politics. We start by aggregating a varied multilingual corpus of comments relevant to these subjects. This dataset undergoes rigorous refinement during data preprocessing. We then introduce our No-English-to-English Machine Translation approach to handle linguistic differences. Empirical tests of this method showed high accuracy and F1 scores, highlighting its suitability for linguistically coherent tasks. Next, advanced modeling techniques, specifically LDA and HDP models, are employed to extract pertinent topics from the translated content. We then apply ARIMA time series analysis to decode evolving topic trends. Applying our method to a multilingual Tunisian dataset, we effectively identified key topics mirroring public sentiment. Such insights prove vital for organizations and governments striving to understand public perspectives during crises. Compared to standard approaches, our model performs better, as confirmed by metrics like Coherence Score, U-mass, and Topic Coherence. Additionally, an in-depth assessment of the identified topics revealed notable thematic shifts in discussions, with our trend identification indicating impressive accuracy, backed by RMSE-based analysis.
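
The trend-forecasting step can be illustrated with a deliberately simplified stand-in for ARIMA: an ARIMA(1,1,0)-style model without intercept, fitted by conditional least squares in plain Python, together with the RMSE used to score forecasts. The model order, the estimator, the function names, and the toy series are all assumptions; the abstract does not state which ARIMA configuration or software was used.

```python
import math


def difference(series):
    """First-order differencing (the 'I' part of ARIMA with d = 1)."""
    return [b - a for a, b in zip(series, series[1:])]


def fit_ar1(diffs):
    """Least-squares AR(1) coefficient, without intercept, on the differences."""
    num = sum(cur * prev for prev, cur in zip(diffs, diffs[1:]))
    den = sum(prev * prev for prev in diffs[:-1])
    if den == 0:
        raise ValueError("differenced series has no variation to fit")
    return num / den


def forecast_next(series):
    """One-step-ahead forecast: last value plus the predicted next difference."""
    diffs = difference(series)
    phi = fit_ar1(diffs)
    return series[-1] + phi * diffs[-1]


def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Hypothetical daily comment counts for one topic (not data from the study).
topic_counts = [0, 8, 12, 14, 15]
next_count = forecast_next(topic_counts)
```

On this toy series each increase is half the previous one, so the fitted coefficient is 0.5 and the forecast continues the flattening trend.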


Long-term Neurological Sequelae in Post-COVID-19 Patients: A Machine Learning Approach to Predict Outcomes

arXiv.org Artificial Intelligence

The COVID-19 pandemic has brought to light a concerning aspect of long-term neurological complications in post-recovery patients. This study delved into the investigation of such neurological sequelae in a cohort of 500 post-COVID-19 patients, encompassing individuals with varying illness severity. The primary aim was to predict outcomes using a machine learning approach based on diverse clinical data and neuroimaging parameters. The results revealed that 68% of the post-COVID-19 patients reported experiencing neurological symptoms, with fatigue, headache, and anosmia being the most common manifestations. Moreover, 22% of the patients exhibited more severe neurological complications, including encephalopathy and stroke. The application of machine learning models showed promising results in predicting long-term neurological outcomes. Notably, the Random Forest model achieved an accuracy of 85%, sensitivity of 80%, and specificity of 90% in identifying patients at risk of developing neurological sequelae. These findings underscore the importance of continuous monitoring and follow-up care for post-COVID-19 patients, particularly in relation to potential neurological complications. The integration of machine learning-based outcome prediction offers a valuable tool for early intervention and personalized treatment strategies, aiming to improve patient care and clinical decision-making. In conclusion, this study sheds light on the prevalence of long-term neurological complications in post-COVID-19 patients and demonstrates the potential of machine learning in predicting outcomes, thereby contributing to enhanced patient management and better health outcomes. Further research and larger studies are warranted to validate and refine these predictive models and to gain deeper insights into the underlying mechanisms of post-COVID-19 neurological sequelae.
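
For readers less familiar with the three metrics quoted above, the sketch below shows how accuracy, sensitivity, and specificity follow from a binary confusion matrix. The counts are toy numbers chosen only to make the arithmetic easy to follow; the abstract does not report a confusion matrix or the class balance, and the function name is an assumption.

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity, and specificity from confusion-matrix counts.

    tp/fn: at-risk patients predicted correctly / missed.
    tn/fp: not-at-risk patients predicted correctly / flagged wrongly.
    """
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),  # Share of true positives detected.
        "specificity": tn / (tn + fp),  # Share of true negatives detected.
    }


# Toy example with 20 hypothetical patients (10 at risk, 10 not at risk).
toy_metrics = classification_metrics(tp=8, fp=1, tn=9, fn=2)
```

With equally sized classes, accuracy is the average of sensitivity and specificity; with imbalanced classes it is not, which is one reason all three are usually reported together.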


Advancements In Crowd-Monitoring System: A Comprehensive Analysis of Systematic Approaches and Automation Algorithms: State-of-The-Art

arXiv.org Artificial Intelligence

Growing apprehensions surrounding public safety have captured the attention of numerous governments and security agencies across the globe. These entities are increasingly acknowledging the imperative need for reliable and secure crowd-monitoring systems to address these concerns. Effectively managing human gatherings necessitates proactive measures to prevent unforeseen events or complications, ensuring a safe and well-coordinated environment. The scarcity of research focusing on crowd monitoring systems and their security implications has given rise to a burgeoning area of investigation, exploring potential approaches to safeguard human congregations effectively. Crowd monitoring systems depend on a bifurcated approach, encompassing vision-based and non-vision-based technologies. An in-depth analysis of these two methodologies will be conducted in this research. The efficacy of these approaches is contingent upon the specific environment and temporal context in which they are deployed, as they each offer distinct advantages. This paper endeavors to present an in-depth analysis of the recent incorporation of artificial intelligence (AI) algorithms and models into automated systems, emphasizing their contemporary applications and effectiveness in various contexts.


Comparative Study of MPPT and Parameter Estimation of PV cells

arXiv.org Artificial Intelligence

Solar energy has emerged in the past few years as a better alternative to fossil fuels. It is a renewable and inexhaustible source of energy that does not harm the environment. It is also cheap and easily accessible, making it a better alternative for both personal and commercial purposes. Solar arrays are formed by connecting together the PV modules used in solar panels. When sunlight falls on solar panels, they produce energy that can be used in place of energy produced from fossil fuels. When operating a PV system under different conditions, estimating the parameters of the PV model plays an important role, because it enables us to optimise the design of the system, which leads to increased energy production and improved performance. If a PV system is not performing as expected, identifying the parameters of the PV model helps to find the root cause of the problem. This could be due to factors such as shading, module mismatch, or degradation over time. By accurately estimating the parameters, we can determine the best method to improve its performance.
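
To show why the model parameters matter, the following sketch uses an idealized single-diode PV model (no series or shunt resistance) and locates the maximum power point by a brute-force voltage scan. The abstract does not say which PV model or MPPT algorithm the study compares, so the model, the function names, and every numeric parameter value below are assumptions for illustration only; the scan is not an online MPPT method such as perturb and observe.

```python
import math


def diode_current(v, i_ph, i_0, n, v_t):
    """Output current of an ideal single-diode cell at terminal voltage v.

    i_ph: photo-generated current, i_0: diode saturation current,
    n: ideality factor, v_t: thermal voltage.
    """
    return i_ph - i_0 * (math.exp(v / (n * v_t)) - 1.0)


def open_circuit_voltage(i_ph, i_0, n, v_t):
    """Voltage at which the output current drops to zero."""
    return n * v_t * math.log(i_ph / i_0 + 1.0)


def max_power_point(i_ph, i_0, n, v_t, steps=1000):
    """Scan voltages from 0 to the open-circuit voltage and keep the best power."""
    v_oc = open_circuit_voltage(i_ph, i_0, n, v_t)
    best_v, best_p = 0.0, 0.0
    for s in range(steps + 1):
        v = v_oc * s / steps
        p = v * diode_current(v, i_ph, i_0, n, v_t)
        if p > best_p:
            best_v, best_p = v, p
    return best_v, best_p


# Hypothetical parameter values, not taken from the study.
params = dict(i_ph=5.0, i_0=1e-9, n=1.3, v_t=0.0257)
v_oc = open_circuit_voltage(**params)
v_mpp, p_mpp = max_power_point(**params)
```

Changing any of the four parameters shifts the current-voltage curve and therefore the maximum power point, which is why an inaccurate parameter estimate can lead to a poorly chosen operating voltage.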