Goto

Collaborating Authors

 traditional machine


SliceVision-F2I: A Synthetic Feature-to-Image Dataset for Visual Pattern Representation on Network Slices

Rafi, Md. Abid Hasan, Johora, Mst. Fatematuj, Bhowmik, Pankaj

arXiv.org Artificial Intelligence

The emergence of 5G and 6G networks has established network slicing as a significant part of future service-oriented architectures, demanding refined identification methods supported by robust datasets. The article presents SliceVision-F2I, a dataset of synthetic samples for studying feature visualization in network slicing for next-generation networking systems. The dataset transforms multivariate Key Performance Indicator (KPI) vectors into visual representations through four distinct encoding methods: physically inspired mappings, Perlin noise, neural wallpapering, and fractal branching. For each encoding method, 30,000 samples are generated, each comprising a raw KPI vector and a corresponding RGB image at low-resolution pixels. The dataset simulates realistic and noisy network conditions to reflect operational uncertainties and measurement imperfections. SliceVision-F2I is suitable for tasks involving visual learning, network state classification, anomaly detection, and benchmarking of image-based machine learning techniques applied to network data. The dataset is publicly available and can be reused in various research contexts, including multivariate time series analysis, synthetic data generation, and feature-to-image transformations.


Window-Based Feature Engineering for Cognitive Workload Detection

Hallam, Andrew, Gayathri, R G, Lee, Glory, Sajjanhar, Atul

arXiv.org Artificial Intelligence

Cognitive workload is a topic of increasing interest across various fields such as health, psychology, and defense applications. In this research, we focus on classifying cognitive workload using the COLET dataset, employing a window-based approach for feature generation and machine/deep learning techniques for classification. We apply window-based temporal partitioning to enhance features used in existing research, followed by machine learning and deep learning models to classify different levels of cognitive workload. The results demonstrate that deep learning models, particularly tabular architectures, outperformed traditional machine learning methods in precision, F1-score, accuracy, and classification precision. This study highlights the effectiveness of window-based temporal feature extraction and the potential of deep learning techniques for real-time cognitive workload assessment in complex and dynamic tasks.


Hope Speech Detection in Social Media English Corpora: Performance of Traditional and Transformer Models

Ramos, Luis, Calvo, Hiram, Kolesnikova, Olga

arXiv.org Artificial Intelligence

The identification of hope speech has become a promised NLP task, considering the need to detect motivational expressions of agency and goal-directed behaviour on social media platforms. This proposal evaluates traditional machine learning models and fine-tuned transformers for a previously split hope speech dataset as train, development and test set. On development test, a linear-kernel SVM and logistic regression both reached a macro-F1 of 0.78; SVM with RBF kernel reached 0.77, and Naïve Bayes hit 0.75. Transformer models delivered better results, the best model achieved weighted precision of 0.82, weighted recall of 0.80, weighted F1 of 0.79, macro F1 of 0.79, and 0.80 accuracy. These results suggest that while optimally configured traditional machine learning models remain agile, transformer architectures detect some subtle semantics of hope to achieve higher precision and recall in hope speech detection, suggesting that larges transformers and LLMs could perform better in small datasets.


When Curiosity Signals Danger: Predicting Health Crises Through Online Medication Inquiries

Goncharok, Dvora, Shifman, Arbel, Apartsin, Alexander, Aperstein, Yehudit

arXiv.org Artificial Intelligence

Online medical forums are a rich and underutilized source of insight into patient concerns, especially regarding medication use. Some of the many questions users pose may signal confusion, misuse, or even the early warning signs of a developing health crisis. Detecting these critical questions that may precede severe adverse events or life-threatening complications is vital for timely intervention and improving patient safety. This study introduces a novel annotated dataset of medication-related questions extracted from online forums. Each entry is manually labelled for criticality based on clinical risk factors. We benchmark the performance of six traditional machine learning classifiers using TF-IDF textual representations, alongside three state-of-the-art large language model (LLM)-based classification approaches that leverage deep contextual understanding. Our results highlight the potential of classical and modern methods to support real-time triage and alert systems in digital health spaces. The curated dataset is made publicly available to encourage further research at the intersection of patient-generated data, natural language processing, and early warning systems for critical health events. The dataset and benchmark are available at: https://github.com/Dvora-coder/LLM-Medication-QA-Risk-Classifier-MediGuard.


Comparative Analysis of Transformer Models in Disaster Tweet Classification for Public Safety

Zisad, Sharif Noor, Chowdhury, N. M. Istiak, Hasan, Ragib

arXiv.org Artificial Intelligence

Twitter and other social media platforms have become vital sources of real time information during disasters and public safety emergencies. Automatically classifying disaster related tweets can help emergency services respond faster and more effectively. Traditional Machine Learning (ML) models such as Logistic Regression, Naive Bayes, and Support Vector Machines have been widely used for this task, but they often fail to understand the context or deeper meaning of words, especially when the language is informal, metaphorical, or ambiguous. We posit that, in this context, transformer based models can perform better than traditional ML models. In this paper, we evaluate the effectiveness of transformer based models, including BERT, DistilBERT, RoBERTa, and DeBERTa, for classifying disaster related tweets. These models are compared with traditional ML approaches to highlight the performance gap. Experimental results show that BERT achieved the highest accuracy (91%), significantly outperforming traditional models like Logistic Regression and Naive Bayes (both at 82%). The use of contextual embeddings and attention mechanisms allows transformer models to better understand subtle language in tweets, where traditional ML models fall short. This research demonstrates that transformer architectures are far more suitable for public safety applications, offering improved accuracy, deeper language understanding, and better generalization across real world social media text.


Google secretly 'enhanced' YouTube Shorts videos with AI filters

PCWorld

YouTube is being overrun with AI slop. And it probably doesn't help that YouTube itself, and owner Google, is where a lot of it is coming from. Using AI-powered tools to "enhance" videos, without telling anyone--including the creators who made said videos. YouTube viewers and video producers like Rhett Shull have noticed a certain sheen and smoothness to some videos that wasn't intentionally done by the original uploaders. This sort of filtering isn't new--in fact, you've probably seen it over-applied to old movie clips uploaded to TikTok and YouTube shorts, giving them an unnaturally smooth motion and overly glossy look for things like human skin.


Bangla BERT for Hyperpartisan News Detection: A Semi-Supervised and Explainable AI Approach

Hasan, Mohammad Mehadi, Hassan, Fatema Binte, Jubair, Md Al, Ahmed, Zobayer, Yeakin, Sazzatul, Billah, Md Masum

arXiv.org Artificial Intelligence

In the current digital landscape, misinformation circulates rapidly, shaping public perception and causing societal divisions. It is difficult to identify hyperpartisan news in Bangla since there aren't many sophisticated natural language processing methods available for this low-resource language. Without effective detection methods, biased content can spread unchecked, posing serious risks to informed discourse. To address this gap, our research fine-tunes Bangla BERT. This is a state-of-the-art transformer-based model, designed to enhance classification accuracy for hyperpartisan news. We evaluate its performance against traditional machine learning models and implement semi-supervised learning to enhance predictions further. Not only that, we use LIME to provide transparent explanations of the model's decision-making process, which helps to build trust in its outcomes. With a remarkable accuracy score of 95.65%, Bangla BERT outperforms conventional approaches, according to our trial data. The findings of this study demonstrate the usefulness of transformer models even in environments with limited resources, which opens the door to further improvements in this area.


Outcome-Based Education: Evaluating Students' Perspectives Using Transformer

Das, Shuvra Smaran, Anik, Anirban Saha, Morol, Md Kishor, Mahmood, Mohammad Sakib

arXiv.org Artificial Intelligence

Outcome-Based Education (OBE) emphasizes the development of specific competencies through student-centered learning. In this study, we reviewed the importance of OBE and implemented transformer-based models, particularly DistilBERT, to analyze an NLP dataset that includes student feedback. Our objective is to assess and improve educational outcomes. Our approach is better than other machine learning models because it uses the transformer's deep understanding of language context to classify sentiment better, giving better results across a wider range of matrices. Our work directly contributes to OBE's goal of achieving measurable outcomes by facilitating the identification of patterns in student learning experiences. We have also applied LIME (local interpretable model-agnostic explanations) to make sure that model predictions are clear. This gives us understandable information about how key terms affect sentiment. Our findings indicate that the combination of transformer models and LIME explanations results in a strong and straightforward framework for analyzing student feedback. This aligns more closely with the principles of OBE and ensures the improvement of educational practices through data-driven insights.


Advancing Harmful Content Detection in Organizational Research: Integrating Large Language Models with Elo Rating System

Akben, Mustafa, Satko, Aaron

arXiv.org Artificial Intelligence

Large language models (LLMs) offer promising opportunities for organizational research. However, their built-in moderation systems can create problems when researchers try to analyze harmful content, often refusing to follow certain instructions or producing overly cautious responses that undermine validity of the results. This is particularly problematic when analyzing organizational conflicts such as microaggressions or hate speech. This paper introduces an Elo rating-based method that significantly improves LLM performance for harmful content analysis In two datasets, one focused on microaggression detection and the other on hate speech, we find that our method outperforms traditional LLM prompting techniques and conventional machine learning models on key measures such as accuracy, precision, and F1 scores. Advantages include better reliability when analyzing harmful content, fewer false positives, and greater scalability for large-scale datasets. This approach supports organizational applications, including detecting workplace harassment, assessing toxic communication, and fostering safer and more inclusive work environments.


Detection of Suicidal Risk on Social Media: A Hybrid Model

Yang, Zaihan, Leonard, Ryan, Tran, Hien, Driscoll, Rory, Davis, Chadbourne

arXiv.org Artificial Intelligence

--Suicidal thoughts and behaviors are increasingly recognized as a critical societal concern, highlighting the urgent need for effective tools to enable early detection of suicidal risk. In this work, we develop robust machine learning models that leverage Reddit posts to automatically classify them into four distinct levels of suicide risk severity. We frame this as a multi-class classification task and propose a RoBERT a-TF-IDF-PCA Hybrid model, integrating the deep contextual embeddings from Robustly Optimized BERT Approach (RoBERT a), a state-of-the-art deep learning transformer model, with the statistical term-weighting of TF-IDF, further compressed with PCA, to boost the accuracy and reliability of suicide risk assessment. T o address data imbalance and overfitting, we explore various data resampling techniques and data augmentation strategies to enhance model generalization. Additionally, we compare our model's performance against that of using RoBERT a only, the BERT model and other traditional machine learning classifiers. Suicidal thoughts and behaviors are increasingly becoming a significant societal concern. As of the latest estimates from World Health Organization, approximately 700,000 to 800,000 people die by suicide globally each year. In the U.S., suicide is the second leading cause of death for individuals aged 10-34 and the fourth leading cause for those aged 35-64. Suicidal thoughts can vary in severity, ranging from explicit and repetitive suicidal feelings, to actively planning suicide or engaging in self-harm behaviors like cutting or burning, and ultimately to making actual attempts through methods such as cutting, jumping, drug overdose, or using firearms.