AITopics

2505.15294

Country: Asia > China (0.14)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.55)

D'Amico, Francesco, Bocchi, Dario, Negri, Matteo

Implicit bias produces neural scaling laws in learning curves, from perceptrons to deep networks

arXiv.org Machine Learning, May-20-2025

Scaling laws in deep learning - empirical power-law relationships linking model performance to resource growth - have emerged as simple yet striking regularities across architectures, datasets, and tasks. These laws are particularly impactful in guiding the design of state-of-the-art models, since they quantify the benefits of increasing data or model size, and hint at the foundations of interpretability in machine learning. However, most studies focus on asymptotic behavior at the end of training or on the optimal training time given the model size. In this work, we uncover a richer picture by analyzing the entire training dynamics through the lens of spectral complexity norms. We identify two novel dynamical scaling laws that govern how performance evolves during training. These laws together recover the well-known test error scaling at convergence, offering a mechanistic explanation of generalization emergence. Our findings are consistent across CNNs, ResNets, and Vision Transformers trained on MNIST, CIFAR-10 and CIFAR-100. Furthermore, we provide analytical support using a solvable model: a single-layer perceptron trained with binary cross-entropy. In this setting, we show that the growth of spectral complexity driven by the implicit bias mirrors the generalization behavior observed at fixed norm, allowing us to connect the performance dynamics to classical learning rules in the perceptron.
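The power-law relationships this abstract refers to can be illustrated with a small fit. The following is a minimal sketch, not the authors' analysis: it estimates the exponent of a relation of the assumed form y = a * x^(-b) by least squares in log-log space. The function name `fit_power_law` and the synthetic data are hypothetical and chosen only for illustration.

```python
import math

def fit_power_law(xs, ys):
    """Fit y = a * x**(-b) by ordinary least squares on (log x, log y).

    Returns (a, b). Assumes all xs and ys are strictly positive.
    """
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((u - mean_x) ** 2 for u in lx)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # The slope in log-log space is -b; the intercept is log(a).
    return math.exp(intercept), -slope

# Synthetic, illustrative data only (not from the paper): error = 2.0 * n**(-0.5)
sizes = [100, 200, 400, 800, 1600]
errors = [2.0 * n ** -0.5 for n in sizes]
a, b = fit_power_law(sizes, errors)
```

On noiseless synthetic data like this the fit recovers the generating prefactor and exponent up to floating-point rounding. Real learning curves would need care about which range of the curve is actually in the power-law regime.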

artificial intelligence, machine learning, perceptron, (17 more...)


2505.1323

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.66)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (1.00)

Haque, Md. Ehsanul, Polash, Md. Saymon Hosen, Simla, Md Al-Imran Sanjida, Hossain, Md Alomgir, Jahan, Sarwar

Enhancing IoT Cyber Attack Detection in the Presence of Highly Imbalanced Data

arXiv.org Artificial Intelligence, May-19-2025

Due to the rapid growth in the number of Internet of Things (IoT) networks, cyber risk has increased exponentially, so effective intrusion detection systems (IDS) that work well with highly imbalanced datasets are needed. Traditional machine learning models tend to struggle to identify attacks when the volume of normal data far exceeds the volume of attack data, which can result in a high rate of missed threats. For example, the dataset used in this study shows a strong class imbalance, with 94,659 instances of the majority class and only 28 instances of the minority class, making it challenging to detect rare attacks accurately. This research addresses these challenges with hybrid sampling techniques designed to improve detection accuracy on imbalanced data in IoT domains. After applying these techniques, we evaluate the performance of several machine learning models, including Random Forest, Soft Voting, Support Vector Classifier (SVC), K-Nearest Neighbors (KNN), Multi-Layer Perceptron (MLP), and Logistic Regression, on the classification of cyber-attacks. The results indicate that the Random Forest model achieved the best performance, with a Kappa score of 0.9903, test accuracy of 0.9961, and AUC of 0.9994. The Soft Voting model also performs strongly, with an accuracy of 0.9952 and AUC of 0.9997, indicating the benefits of combining model predictions. Overall, this work demonstrates the value of hybrid sampling combined with robust model and feature selection for significantly improving IoT security against cyber-attacks, especially in highly imbalanced data environments.
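To see why a chance-corrected score such as Kappa is reported alongside accuracy, it may help to work through the class counts quoted in the abstract. The sketch below is illustrative and not taken from the paper: the function `accuracy_and_kappa` and the "always predict the majority class" classifier are hypothetical, and the minority class is assumed to be the positive (attack) class.

```python
def accuracy_and_kappa(tp, fp, fn, tn):
    """Return (accuracy, Cohen's kappa) for a binary confusion matrix."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Chance agreement computed from the marginal totals of the
    # true labels and the predicted labels.
    expected = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / (n * n)
    if expected == 1.0:
        return observed, 0.0  # degenerate case: a single class everywhere
    return observed, (observed - expected) / (1.0 - expected)

# Class counts from the abstract: 94,659 majority and 28 minority instances.
# Hypothetical classifier that labels every instance as the majority class,
# so all 28 minority instances become false negatives.
acc, kappa = accuracy_and_kappa(tp=0, fp=0, fn=28, tn=94659)
```

Under these assumptions the accuracy is roughly 0.9997 even though no attack is ever detected, while kappa is zero up to rounding, because the classifier agrees with the labels no more often than chance predicts from the marginals.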

accuracy, artificial intelligence, machine learning, (11 more...)

doi: 10.1109/CSNT64827.2025.10968583

2505.106

Country: Asia (0.46)

Genre: Research Report > New Finding (0.89)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military > Cyberwarfare (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.71)

Muscarnera, Luca, Loreti, Luigi, Todeschini, Giovanni, Fumagalli, Alessio, Regazzoni, Francesco

Emergence of Structure in Ensembles of Random Neural Networks

arXiv.org Artificial Intelligence, May-16-2025

Randomness is ubiquitous in many applications across data science and machine learning. Remarkably, systems composed of random components often display emergent global behaviors that appear deterministic, manifesting a transition from microscopic disorder to macroscopic organization. In this work, we introduce a theoretical model for studying the emergence of collective behaviors in ensembles of random classifiers. We argue that, if the ensemble is weighted through the Gibbs measure defined by adopting the classification loss as an energy, then there exists a finite temperature parameter for the distribution such that the classification is optimal with respect to the loss (or the energy). Interestingly, for the case in which samples are generated by a Gaussian distribution and labels are constructed by employing a teacher perceptron, we analytically prove and numerically confirm that such optimal temperature depends neither on the teacher classifier (which is, by construction of the learning problem, unknown) nor on the number of random classifiers, highlighting the universal nature of the observed behavior. Experiments on the MNIST dataset underline the relevance of this phenomenon in high-quality, noiseless datasets. Finally, a physical analogy allows us to shed light on the self-organizing nature of the studied phenomenon.
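The weighting scheme described in this abstract can be sketched in a few lines. This is a toy illustration under stated assumptions, not the authors' model: the energy is taken to be the 0-1 loss on the sampled points, `beta` denotes an inverse temperature (1/T), the ensemble prediction is a weighted vote, and all names and sizes are hypothetical.

```python
import math
import random

def gibbs_weights(losses, beta):
    """Weights proportional to exp(-beta * loss), normalized to sum to 1."""
    m = min(losses)  # shift by the minimum for numerical stability
    raw = [math.exp(-beta * (loss - m)) for loss in losses]
    z = sum(raw)
    return [r / z for r in raw]

def sign(x):
    return 1 if x >= 0 else -1

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

random.seed(0)
dim, n_samples, n_classifiers = 5, 200, 50

# Gaussian inputs labeled by a teacher perceptron, as in the abstract's setting.
teacher = [random.gauss(0, 1) for _ in range(dim)]
xs = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(n_samples)]
ys = [sign(dot(teacher, x)) for x in xs]

# Random (untrained) linear classifiers.
students = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(n_classifiers)]

# Energy of each classifier: its 0-1 loss on the samples (an assumed choice).
losses = [sum(sign(dot(w, x)) != y for x, y in zip(xs, ys)) / n_samples
          for w in students]

def ensemble_error(beta):
    """0-1 error of the Gibbs-weighted vote, measured on the same samples."""
    weights = gibbs_weights(losses, beta)
    wrong = 0
    for x, y in zip(xs, ys):
        vote = sum(p * sign(dot(w, x)) for p, w in zip(weights, students))
        wrong += sign(vote) != y
    return wrong / n_samples

errors_by_beta = {beta: ensemble_error(beta) for beta in (0.0, 1.0, 10.0, 100.0)}
```

At `beta = 0` every classifier gets the same weight (a plain majority vote), while large `beta` concentrates the weight on the lowest-loss classifiers. Scanning `errors_by_beta` over a finer grid is one way to look for an intermediate optimum of the kind the abstract describes, though this toy setup makes no claim to reproduce the paper's result.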

artificial intelligence, classifier, machine learning, (19 more...)

2505.10331

Country: Europe (0.46)

Genre: Research Report > New Finding (0.68)

Industry: Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.51)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Kościałkowski, Jan, Marcinkowski, Paweł

Divide (Text) and Conquer (Sentiment): Improved Sentiment Classification by Constituent Conflict Resolution

Sentiment classification, a complex task in natural language processing, becomes even more challenging when analyzing passages with multiple conflicting tones. Typically, longer passages exacerbate this issue, leading to decreased model performance. The aim of this paper is to introduce novel methodologies for isolating conflicting sentiments and aggregating them to effectively predict the overall sentiment of such passages. One of the aggregation strategies involves a Multi-Layer Perceptron (MLP) model which outperforms baseline models across various datasets, including Amazon, Twitter, and SST while costing $\sim$1/100 of what fine-tuning the baseline would take.

artificial intelligence, machine learning, natural language, (17 more...)

2505.0632

Country:

North America > United States > Washington > King County > Seattle (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
Europe > Monaco (0.04)
(2 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.69)

Ma, Yizhou, Yang, Zhuoqin, Ibáñez, Luis-Daniel

Enhancing Federated Learning with Kolmogorov-Arnold Networks: A Comparative Study Across Diverse Aggregation Strategies

Multilayer Perceptron (MLP), as a simple yet powerful model, continues to be widely used in classification and regression tasks. However, traditional MLPs often struggle to efficiently capture nonlinear relationships in load data when dealing with complex datasets. Kolmogorov-Arnold Networks (KAN), inspired by the Kolmogorov-Arnold representation theorem, have shown promising capabilities in modeling complex nonlinear relationships. In this study, we explore the performance of KANs within federated learning (FL) frameworks and compare them to traditional Multilayer Perceptrons. Our experiments, conducted across four diverse datasets, demonstrate that KANs consistently outperform MLPs in terms of accuracy, stability, and convergence efficiency. KANs exhibit remarkable robustness under varying client numbers and non-IID data distributions, maintaining superior performance even as client heterogeneity increases. Notably, KANs require fewer communication rounds to converge than MLPs, highlighting their efficiency in FL scenarios. Additionally, we evaluate multiple parameter aggregation strategies, with trimmed mean and FedProx emerging as the most effective for optimizing KAN performance. These findings establish KANs as a robust and scalable alternative to MLPs for federated learning tasks, paving the way for their application in decentralized and privacy-preserving environments.
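Of the aggregation strategies named in this abstract, the trimmed mean is simple enough to sketch directly. The following is a generic illustration of a coordinate-wise trimmed mean, not the paper's implementation: the function name, the flat-list representation of client parameters, and the `trim` parameter are all assumptions made for the example.

```python
def trimmed_mean_aggregate(client_updates, trim):
    """Coordinate-wise trimmed mean of client parameter vectors.

    For each coordinate, sort the client values, drop the `trim` smallest
    and the `trim` largest, and average what remains.
    """
    n = len(client_updates)
    if 2 * trim >= n:
        raise ValueError("trim would discard every client")
    aggregated = []
    for values in zip(*client_updates):
        kept = sorted(values)[trim:n - trim]
        aggregated.append(sum(kept) / len(kept))
    return aggregated

# Four hypothetical clients with two parameters each; the last is an outlier.
updates = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [100.0, -100.0]]
result = trimmed_mean_aggregate(updates, trim=1)
```

With `trim=1` the outlying client's values are discarded in both coordinates, which is the property that makes this kind of aggregation less sensitive to a few divergent clients than a plain average.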

aggregation strategy, artificial intelligence, machine learning, (16 more...)

2505.07629

Country: Europe > United Kingdom (0.28)

Genre: Research Report > New Finding (0.89)

Industry: Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.96)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (1.00)

Mishra, Aditya, Lone, Haroon

RiM: Record, Improve and Maintain Physical Well-being using Federated Learning

In academic settings, the demanding environment often forces students to prioritize academic performance over their physical well-being. Moreover, privacy concerns and the inherent risk of data breaches hinder the deployment of traditional machine learning techniques for addressing these health challenges. In this study, we introduce RiM: Record, Improve, and Maintain, a mobile application which incorporates a novel personalized machine learning framework that leverages federated learning to enhance students' physical well-being by analyzing their lifestyle habits. Our approach involves pre-training a multilayer perceptron (MLP) model on a large-scale simulated dataset to generate personalized recommendations. Subsequently, we employ federated learning to fine-tune the model using data from IISER Bhopal students, thereby ensuring its applicability in real-world scenarios. The federated learning approach guarantees differential privacy by exclusively sharing model weights rather than raw data. Experimental results show that the FedAvg-based RiM model achieves an average accuracy of 60.71% and a mean absolute error of 0.91--outperforming the FedPer variant (average accuracy 46.34%, MAE 1.19)--thereby demonstrating its efficacy in predicting lifestyle deficits under privacy-preserving constraints.
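The FedAvg aggregation step mentioned in this abstract is, in its standard form, a sample-count-weighted average of client model weights. The sketch below shows only that averaging step under simplifying assumptions: models are flat lists of floats, and the function name and example values are hypothetical rather than taken from RiM.

```python
def fedavg(client_weights, client_sizes):
    """Average client weight vectors, weighted by local sample counts."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]

# Two hypothetical clients holding 1 and 3 local samples respectively.
global_weights = fedavg([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

Only the weight vectors and sample counts leave each client in this step; the raw records used to produce them stay local.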

application, artificial intelligence, machine learning, (16 more...)

2505.06384

Country: Asia > India > Madhya Pradesh > Bhopal (0.25)

Genre: Research Report > New Finding (0.68)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
Health & Medicine > Consumer Health (1.00)
Education (1.00)
Information Technology > Security & Privacy (0.86)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.68)

Peter, Jacques, Bennehard, Quentin, Heib, Sébastien, Hantrais-Gervois, Jean-Luc, Moëns, Frédéric

ONERA's CRM WBPN database for machine learning activities, related regression challenge and first results

This paper presents a new Computational Fluid Dynamics database, developed at ONERA, to support the advancement of machine learning techniques for aerodynamic field prediction. It contains 468 Reynolds-Averaged Navier-Stokes simulations using the Spalart-Allmaras turbulence model, performed on the NASA/Boeing Common Research Model wing-body-pylon-nacelle configuration. The database spans a wide range of flow conditions, varying Mach number (including transonic regimes), angle of attack (capturing flow separation), and Reynolds number (based on three stagnation pressures, with one setting matching wind tunnel experiments). The quality of the database is assessed by checking the convergence level of each computation. Based on these data, a regression challenge is defined. It consists of predicting the wall distributions of pressure and friction coefficients for unseen aerodynamic conditions. The 468 simulations are split into training and testing sets, with the training data made publicly available on the Codabench platform. The paper further evaluates several classical machine learning regressors on this task. Tested pointwise methods include Multi-Layer Perceptrons, $λ$-DNNs, and Decision Trees, while global methods include Multi-Layer Perceptron, k-Nearest Neighbors, Proper Orthogonal Decomposition and IsoMap. Initial performance results, using $R^2$ scores and worst relative mean absolute error metrics, are presented, offering insights into the capabilities of these techniques for the challenge and references for future work.
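The $R^2$ score mentioned in this abstract has a standard definition, one minus the ratio of the residual sum of squares to the total sum of squares, which can be sketched as follows. The function name and the toy values are illustrative assumptions; the paper's exact evaluation protocol, including its relative mean absolute error metric, is not reproduced here.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Assumes y_true is not constant, so that SS_tot is nonzero.
    """
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Toy values: a perfect prediction and a constant prediction at the mean.
perfect = r2_score([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
baseline = r2_score([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])
```

A perfect prediction scores 1, and always predicting the mean of the targets scores 0, which gives a reference point when comparing regressors on the same test set.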

artificial intelligence, machine learning, regressor, (19 more...)

2505.06265

Genre: Research Report (1.00)

Industry:

Education (0.76)
Aerospace & Defense (0.67)
Government (0.55)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (1.00)

arXiv.org Artificial Intelligence, May-8-2025

Explainable AI for Correct Root Cause Analysis of Product Quality in Injection Moulding

Muaz, Muhammad, Sajid, Sameed, Schulze, Tobias, Liu, Chang, Klasen, Nils, Drescher, Benny

If a product deviates from its desired properties in the injection moulding process, its root cause analysis can be aided by models that relate the input machine settings to the output quality characteristics. The machine learning models tested in quality prediction are mostly black boxes; therefore, no direct explanation of their prognosis is given, which restricts their applicability in quality control. Previously attempted explainability methods are either restricted to tree-based algorithms or do not emphasize the fact that some explainability methods can lead to wrong root cause identification of a product's deviation from its desired properties. This study first shows that interactions among the multiple input machine settings do exist in real experimental data collected as per a central composite design. Then, model-agnostic explainable AI methods are compared for the first time to show that different explainability methods indeed lead to different feature impact analyses in injection moulding. Moreover, it is shown that better feature attribution translates to correct cause identification and actionable insights for the injection moulding process. Because the methods are model-agnostic, explanations are performed on both random forest and multilayer perceptron models for the cause analysis, as both models have a mean absolute percentage error of less than 0.05% on the experimental dataset.

artificial intelligence, interaction, machine learning, (18 more...)

doi: 10.1016/j.jmapro.2025.03.114

2505.01445

Country:

Asia > China > Hong Kong (0.14)
Europe > Switzerland > St. Gallen > St. Gallen (0.04)
Europe > Sweden > Vaestra Goetaland > Gothenburg (0.04)
(2 more...)

Genre: Research Report > Experimental Study (0.46)

Industry: Materials (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.55)

Wang, Shuyu, Saillet, Angélique, Gall, Philomène Le, Lacroux, Alain, Martin-Lacroux, Christelle, Brault, Vincent

Study of the influence of a biased database on the prediction of standard algorithms for selecting the best candidate for an interview

arXiv.org Artificial Intelligence, May-6-2025

Artificial Intelligence (AI) is extensively used across various stages of the recruitment process, from automated candidate sourcing on social media platforms to asynchronous video recruitment methods. A study of Human Resources (HR) professionals representing 500 mid-sized organisations from diverse industries across five countries revealed that 24% of businesses have already implemented AI for recruitment purposes, while 56% of hiring managers plan to adopt it within the next year [Sage, 2020]. AI is employed to augment human decision-making regarding job candidates (such as determining who should receive a job offer) and to support the actions of human decision-makers throughout the process (such as data collection and analysis; Gonzalez, Liu, Shirase, Tomczak, Lobbe, Justenhoven, and Martin [2022]). Some applications incorporating AI algorithms are widely accepted and relatively uncontroversial.

artificial intelligence, classification, machine learning, (16 more...)

2505.02609

Country:

Europe > France (0.29)
Europe > Austria (0.28)

Genre: Research Report (1.00)

Industry: Law > Civil Rights & Constitutional Law (0.49)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.48)