AITopics

Predictive maintenance (PdM) is crucial for optimizing efficiency and minimizing downtime of electric buses. While these vehicles provide environmental benefits, they pose challenges for PdM due to complex electric transmission and battery systems. Traditional maintenance, often based on scheduled inspections, struggles to capture anomalies in multi-dimensional real-time CAN Bus data. This study employs a graph-based feature selection method to analyze relationships among CAN Bus parameters of electric buses and investigates the prediction performance of targeted alarms using artificial intelligence techniques. The raw data collected over two years underwent extensive preprocessing to ensure data quality and consistency. A hybrid graph-based feature selection tool was developed by combining statistical filtering (Pearson correlation, Cramer's V, ANOVA F-test) with optimization-based community detection algorithms (InfoMap, Leiden, Louvain, Fast Greedy). Machine learning models, including SVM, Random Forest, and XGBoost, were optimized through grid and random search with data balancing via SMOTEEN and binary search-based down-sampling. Model interpretability was achieved using LIME to identify the features influencing predictions. The results demonstrate that the developed system effectively predicts vehicle alarms, enhances feature interpretability, and supports proactive maintenance strategies aligned with Industry 4.0 principles.

artificial intelligence, dataset, machine learning, (17 more...)

2510.23879

Country:

Asia (0.46)
Europe > Netherlands > South Holland > Leiden (0.25)

Genre: Research Report > New Finding (0.48)

Industry:

Automobiles & Trucks (1.00)
Transportation > Ground > Road (0.91)
Transportation > Passenger (0.81)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

On the Societal Impact of Machine Learning

Baumann, Joachim

This PhD thesis investigates the societal impact of machine learning (ML). ML increasingly informs consequential decisions and recommendations, significantly affecting many aspects of our lives. As these data-driven systems are often developed without explicit fairness considerations, they carry the risk of discriminatory effects. The contributions in this thesis enable more appropriate measurement of fairness in ML systems, systematic decomposition of ML systems to anticipate bias dynamics, and effective interventions that reduce algorithmic discrimination while maintaining system utility. I conclude by discussing ongoing challenges and future research directions as ML systems, including generative artificial intelligence, become increasingly integrated into society. This work offers a foundation for ensuring that ML's societal impact aligns with broader social values.

artificial intelligence, machine learning, proceedings, (16 more...)

2510.23693

Country:

North America > United States (1.00)
Europe (0.93)

Genre: Research Report > Experimental Study (0.92)

Industry:

Social Sector (1.00)
Law (1.00)
Health & Medicine (1.00)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.34)

Anshul, Ashutosh, Rehman, Mohammad Zia Ur, Kadali, Sri Akash, Kumar, Nagendra

RoGBot: Relationship-Oblivious Graph-based Neural Network with Contextual Knowledge for Bot Detection

Abstract--Detecting automated accounts (bots) among genuine users on platforms like Twitter remains a challenging task due to the evolving behaviors and adaptive strategies of such accounts. While recent methods have achieved strong detection performance by combining text, metadata, and user relationship information within graph-based frameworks, many of these models heavily depend on explicit user-user relationship data. This reliance limits their applicability in scenarios where such information is unavailable. T o address this limitation, we propose a novel multimodal framework that integrates detailed textual features with enriched user metadata while employing graph-based reasoning without requiring follower-following data. Our method uses transformer-based models (e.g., BERT) to extract deep semantic embeddings from tweets, which are aggregated using max pooling to form comprehensive user-level representations. These are further combined with auxiliary behavioral features and passed through a GraphSAGE model to capture both local and global patterns in user behavior . Experimental results on the Cresci-15, Cresci-17, and PAN 2019 datasets demonstrate the robustness of our approach, achieving accuracies of 99.8%, 99.1%, and 96.8%, respectively, and highlighting its effectiveness against increasingly sophisticated bot strategies. Social media platforms have become essential tools for communication, sharing news, and fostering public engagement. However, the increasing presence of automated accounts, commonly known as bots, has raised serious concerns.

detection, machine learning, natural language, (20 more...)

2510.23648

Country: North America > United States (0.46)

Genre: Research Report (1.00)

Industry: Information Technology > Services (0.94)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Structure-Aware Fusion with Progressive Injection for Multimodal Molecular Representation Learning

Jing, Zihao, Sun, Yan, Li, Yan Yi, Janarthanan, Sugitha, Deng, Alana, Hu, Pingzhao

Multimodal molecular models often suffer from 3D conformer unreliability and modality collapse, limiting their robustness and generalization. We propose MuMo, a structured multimodal fusion framework that addresses these challenges in molecular representation through two key strategies. To reduce the instability of conformer-dependent fusion, we design a Structured Fusion Pipeline (SFP) that combines 2D topology and 3D geometry into a unified and stable structural prior. To mitigate modality collapse caused by naive fusion, we introduce a Progressive Injection (PI) mechanism that asymmetrically integrates this prior into the sequence stream, preserving modality-specific modeling while enabling cross-modal enrichment. Built on a state space backbone, MuMo supports long-range dependency modeling and robust information propagation. Across 29 benchmark tasks from Therapeutics Data Commons (TDC) and MoleculeNet, MuMo achieves an average improvement of 2.7% over the best-performing baseline on each task, ranking first on 22 of them, including a 27% improvement on the LD50 task. These results validate its robustness to 3D conformer noise and the effectiveness of multimodal fusion in molecular representation. The code is available at: github.com/selmiss/MuMo.

data mining, machine learning, natural language, (22 more...)

2510.2364

Country: North America > Canada > Ontario (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Data Science > Data Mining (0.92)
(3 more...)

Gerhart, Alyssa, Iyangar, Balaji

Adversarially-Aware Architecture Design for Robust Medical AI Systems

Adversarial attacks pose a severe risk to AI systems used in healthcare, capable of misleading models into dangerous misclassifications that can delay treatments or cause misdiagnoses. These attacks, often imperceptible to human perception, threaten patient safety, particularly in underserved populations. Our study explores these vulnerabilities through empirical experimentation on a dermatological dataset, where adversarial methods significantly reduce classification accuracy. Through detailed threat modeling, experimental benchmarking, and model evaluation, we demonstrate both the severity of the threat and the partial success of defenses like adversarial training and distillation. Our results show that while defenses reduce attack success rates, they must be balanced against model performance on clean data. We conclude with a call for integrated technical, ethical, and policy-based approaches to build more resilient, equitable AI in healthcare.

adversarial training, artificial intelligence, machine learning, (18 more...)

2510.23622

Country: North America > United States (0.69)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Government (1.00)
Health & Medicine > Therapeutic Area > Dermatology (0.95)
Health & Medicine > Diagnostic Medicine > Imaging (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.67)

Understanding AI Trustworthiness: A Scoping Review of AIES & FAccT Articles

Mehrotra, Siddharth, Huang, Jin, Fu, Xuelong, Dobbe, Roel, Sánchez, Clara I., de Rijke, Maarten

Background: Trustworthy AI serves as a foundational pillar for two major AI ethics conferences: AIES and FAccT. However, current research often adopts techno-centric approaches, focusing primarily on technical attributes such as reliability, robustness, and fairness, while overlooking the sociotechnical dimensions critical to understanding AI trustworthiness in real-world contexts. Objectives: This scoping review aims to examine how the AIES and FAccT communities conceptualize, measure, and validate AI trustworthiness, identifying major gaps and opportunities for advancing a holistic understanding of trustworthy AI systems. Methods: We conduct a scoping review of AIES and FAccT conference proceedings to date, systematically analyzing how trustworthiness is defined, operationalized, and applied across different research domains. Our analysis focuses on conceptualization approaches, measurement methods, verification and validation techniques, application areas, and underlying values. Results: While significant progress has been made in defining technical attributes such as transparency, accountability, and robustness, our findings reveal critical gaps. Current research often predominantly emphasizes technical precision at the expense of social and ethical considerations. The sociotechnical nature of AI systems remains less explored and trustworthiness emerges as a contested concept shaped by those with the power to define it. Conclusions: An interdisciplinary approach combining technical rigor with social, cultural, and institutional considerations is essential for advancing trustworthy AI. We propose actionable measures for the AI ethics community to adopt holistic frameworks that genuinely address the complex interplay between AI systems and society, ultimately promoting responsible technological development that benefits all stakeholders.

artificial intelligence, machine learning, trustworthiness, (13 more...)

2510.21293

Country:

North America > United States (1.00)
Europe > United Kingdom > England (0.28)

Genre:

Questionnaire & Opinion Survey (1.00)
Research Report > Experimental Study (0.67)
Research Report > New Finding (0.66)

Industry:

Information Technology > Security & Privacy (1.00)
Education (1.00)
Social Sector (0.93)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.67)

Detecting Latin in Historical Books with Large Language Models: A Multimodal Benchmark

Wu, Yu, Shu, Ke, Fischer, Jonas, Pivovarova, Lidia, Rosson, David, Mäkelä, Eetu, Tolonen, Mikko

This paper presents a novel task of extracting Latin fragments from mixed-language historical documents with varied layouts. We benchmark and evaluate the performance of large foundation models against a multimodal dataset of 724 annotated pages. The results demonstrate that reliable Latin detection with contemporary models is achievable. Our study provides the first comprehensive analysis of these models' capabilities and limits for this task.

category, large language model, machine learning, (20 more...)

2510.19585

Country: Europe > France (0.14)

Genre: Research Report > New Finding (0.87)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Raszewski, Luc, De Kock, Christine

Detecting Sockpuppetry on Wikipedia Using Meta-Learning

Malicious sockpuppet detection on Wikipedia is critical to preserving access to reliable information on the internet and preventing the spread of disinformation. Prior machine learning approaches rely on stylistic and meta-data features, but do not prioritise adaptability to author-specific behaviours. As a result, they struggle to effectively model the behaviour of specific sockpuppet-groups, especially when text data is limited. To address this, we propose the application of meta-learning, a machine learning technique designed to improve performance in data-scarce settings by training models across multiple tasks. Meta-learning optimises a model for rapid adaptation to the writing style of a new sockpuppet-group. Our results show that meta-learning significantly enhances the precision of predictions compared to pre-trained models, marking an advancement in combating sockpuppetry on open editing platforms. We release a new dataset of sockpuppet investigations to foster future research in both sockpuppetry and meta-learning fields.

artificial intelligence, contribution, machine learning, (18 more...)

doi: 10.18653/v1/2025.acl-long.1083

2506.10314

Genre: Research Report > New Finding (1.00)

Industry:

Media > News (0.48)
Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Apollo: A Posteriori Label-Only Membership Inference Attack Towards Machine Unlearning

Tang, Liou, Joshi, James, Kundu, Ashish

Machine Unlearning (MU) aims to update Machine Learning (ML) models following requests to remove training samples and their influences on a trained model efficiently without retraining the original ML model from scratch. While MU itself has been employed to provide privacy protection and regulatory compliance, it can also increase the attack surface of the model. Existing privacy inference attacks towards MU that aim to infer properties of the unlearned set rely on the weaker threat model that assumes the attacker has access to both the unlearned model and the original model, limiting their feasibility toward real-life scenarios. We propose a novel privacy attack, A Posteriori Label-Only Membership Inference Attack towards MU, Apollo, that infers whether a data sample has been unlearned, following a strict threat model where an adversary has access to the label-output of the unlearned model only. We demonstrate that our proposed attack, while requiring less access to the target model compared to previous attacks, can achieve relatively high precision on the membership status of the unlearned samples.

artificial intelligence, machine learning, nlearning, (17 more...)

2506.09923

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

arXiv.org Machine LearningOct-29-2025

Understanding Fairness and Prediction Error through Subspace Decomposition and Influence Analysis

Shi, Enze, Bhagwat, Pankaj, Yang, Zhixian, Kong, Linglong, Jiang, Bei

Machine learning models have achieved widespread success but often inherit and amplify historical biases, resulting in unfair outcomes. Traditional fairness methods typically impose constraints at the prediction level, without addressing underlying biases in data representations. In this work, we propose a principled framework that adjusts data representations to balance predictive utility and fairness. Using sufficient dimension reduction, we decompose the feature space into target-relevant, sensitive, and shared components, and control the fairness-utility trade-off by selectively removing sensitive information. We provide a theoretical analysis of how prediction error and fairness gaps evolve as shared subspaces are added, and employ influence functions to quantify their effects on the asymptotic behavior of parameter estimates. Experiments on both synthetic and real-world datasets validate our theoretical insights and show that the proposed method effectively improves fairness while preserving predictive performance.

artificial intelligence, machine learning, representation, (17 more...)

arXiv.org Machine Learning

2510.23935

Country:

North America > Canada (0.28)
Europe (0.28)

Genre: Research Report > New Finding (0.46)

Industry:

Law (0.93)
Information Technology > Security & Privacy (0.49)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)