AITopics | Perceptrons

Collaborating Authors

Perceptrons

News Overviews Instructional Materials AI-Alerts Classics

What Causes Postoperative Aspiration?

Nagesh, Supriya, Covarrubias, Karina, El-Kareh, Robert, Kasiviswanathan, Shiva Prasad, Mishra, Nina

arXiv.org Artificial IntelligenceOct-30-2025

Background: Aspiration, the inhalation of foreign material into the lungs, significantly impacts surgical patient morbidity and mortality. This study develops a machine learning (ML) model to predict postoperative aspiration, enabling timely preventative interventions. Methods: From the MIMIC-IV database of over 400,000 hospital admissions, we identified 826 surgical patients (mean age: 62, 55.7\% male) who experienced aspiration within seven days post-surgery, along with a matched non-aspiration cohort. Three ML models: XGBoost, Multilayer Perceptron, and Random Forest were trained using pre-surgical hospitalization data to predict postoperative aspiration. To investigate causation, we estimated Average Treatment Effects (ATE) using Augmented Inverse Probability Weighting. Results: Our ML model achieved an AUROC of 0.86 and 77.3\% sensitivity on a held-out test set. Maximum daily opioid dose, length of stay, and patient age emerged as the most important predictors. ATE analysis identified significant causative factors: opioids (0.25 +/- 0.06) and operative site (neck: 0.20 +/- 0.13, head: 0.19 +/- 0.13). Despite equal surgery rates across genders, men were 1.5 times more likely to aspirate and received 27\% higher maximum daily opioid dosages compared to women. Conclusion: ML models can effectively predict postoperative aspiration risk, enabling targeted preventative measures. Maximum daily opioid dosage and operative site significantly influence aspiration risk. The gender disparity in both opioid administration and aspiration rates warrants further investigation. These findings have important implications for improving postoperative care protocols and aspiration prevention strategies.

artificial intelligence, aspiration, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2510.21779

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Surgery (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Health Care Providers & Services (1.00)
(6 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.68)

Add feedback

K-DAREK: Distance Aware Error for Kurkova Kolmogorov Networks

Ataei, Masoud, Dhiman, Vikas, Khojasteh, Mohammad Javad

arXiv.org Machine LearningOct-28-2025

Neural networks are parametric and powerful tools for function approximation, and the choice of architecture heavily influences their interpretability, efficiency, and generalization. In contrast, Gaussian processes (GPs) are nonparametric probabilistic models that define distributions over functions using a kernel to capture correlations among data points. However, these models become computationally expensive for large-scale problems, as they require inverting a large covariance matrix. Kolmogorov- Arnold networks (KANs), semi-parametric neural architectures, have emerged as a prominent approach for modeling complex functions with structured and efficient representations through spline layers. Kurkova Kolmogorov-Arnold networks (KKANs) extend this idea by reducing the number of spline layers in KAN and replacing them with Chebyshev layers and multi-layer perceptrons, thereby mapping inputs into higher-dimensional spaces before applying spline-based transformations. Compared to KANs, KKANs perform more stable convergence during training, making them a strong architecture for estimating operators and system modeling in dynamical systems. By enhancing the KKAN architecture, we develop a novel learning algorithm, distance-aware error for Kurkova-Kolmogorov networks (K-DAREK), for efficient and interpretable function approximation with uncertainty quantification. Our approach establishes robust error bounds that are distance-aware; this means they reflect the proximity of a test point to its nearest training points. Through case studies on a safe control task, we demonstrate that K-DAREK is about four times faster and ten times higher computationally efficiency than Ensemble of KANs, 8.6 times more scalable than GP by increasing the data size, and 50% safer than our previous work distance-aware error for Kolmogorov networks (DAREK).

artificial intelligence, k-darek, machine learning, (16 more...)

arXiv.org Machine Learning

2510.22021

Country:

North America > United States > Maine > Penobscot County > Orono (0.14)
North America > United States > New York > Monroe County > Rochester (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.54)

Add feedback

An MLP Baseline for Handwriting Recognition Using Planar Curvature and Gradient Orientation

Nouri, Azam

arXiv.org Artificial IntelligenceOct-27-2025

This study investigates whether second-order geometric cues - planar curvature magnitude, curvature sign, and gradient orientation - are sufficient on their own to drive a multilayer perceptron (MLP) classifier for handwritten character recognition (HCR), offering an alternative to convolutional neural networks (CNNs). Using these three handcrafted feature maps as inputs, our curvature-orientation MLP achieves 97 percent accuracy on MNIST digits and 89 percent on EMNIST letters. These results underscore the discriminative power of curvature-based representations for handwritten character images and demonstrate that the advantages of deep learning can be realized even with interpretable, hand-engineered features.

machine learning, orientation, recognition, (12 more...)

arXiv.org Artificial Intelligence

doi: 10.5120/ijca2025925791

2508.11803

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.56)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.55)

Add feedback

CosmoCore Affective Dream-Replay Reinforcement Learning for Code Generation

Ravindran, Santhosh Kumar

arXiv.org Artificial IntelligenceOct-23-2025

We introduce CosmoCore, a neuroscience-inspired reinforcement learning (RL) architecture that integrates affective signals to enhance code generation in large language models (LLMs). Motivated by human and animal learning where embarrassment from mistakes drives rapid correction, as observed in training a puppy to avoid repeating errors after a single scolding CosmoCore tags code generation trajectories with valence and surprise using a lightweight multi-layer perceptron (MLP). High-negative valence (cringe) episodes, such as buggy code outputs, are prioritized in a Dream Queue for five-fold replay during off-policy updates, while low-surprise successes are pruned to prevent overconfidence and buffer bloat. Evaluated on code generation benchmarks like HumanEval and BigCodeBench, alongside simulations with a custom data pipeline environment, CosmoCore reduces hallucinated code (e.g., syntax errors or logical bugs) by 48\% and accelerates self-correction by 45\%. Local experiments using Hugging Face models in a PySpark environment validate these gains, with code snippets provided for replication. Ablations confirm valence tagging boosts curiosity in exploration, and pruning mitigates inefficiency. This framework extends RL from human feedback (RLHF) for more emotionally aware code assistants, with applications in IDEs and data pipelines. Code and the custom mini-world simulation are released.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

arXiv.org Artificial Intelligence

2510.18895

Country: North America > United States (0.14)

Genre:

Research Report (0.50)
Overview (0.46)

Industry:

Information Technology (0.46)
Health & Medicine (0.35)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Automatic Programming (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.55)

Add feedback

Overlap-weighted orthogonal meta-learner for treatment effect estimation over time

Hess, Konstantin, Frauen, Dennis, van der Schaar, Mihaela, Feuerriegel, Stefan

arXiv.org Artificial IntelligenceOct-23-2025

Estimating heterogeneous treatment effects (HTEs) in time-varying settings is particularly challenging, as the probability of observing certain treatment sequences decreases exponentially with longer prediction horizons. Thus, the observed data contain little support for many plausible treatment sequences, which creates severe overlap problems. Existing meta-learners for the time-varying setting typically assume adequate treatment overlap, and thus suffer from exploding estimation variance when the overlap is low. To address this problem, we introduce a novel overlap-weighted orthogonal (WO) meta-learner for estimating HTEs that targets regions in the observed data with high probability of receiving the interventional treatment sequences. This offers a fully data-driven approach through which our WO-learner can counteract instabilities as in existing meta-learners and thus obtain more reliable HTE estimates. Methodologically, we develop a novel Neyman-orthogonal population risk function that minimizes the overlap-weighted oracle risk. We show that our WO-learner has the favorable property of Neyman-orthogonality, meaning that it is robust against misspecification in the nuisance functions. Further, our WO-learner is fully model-agnostic and can be applied to any machine learning model. Through extensive experiments with both transformer and LSTM backbones, we demonstrate the benefits of our novel WO-learner.

artificial intelligence, estimation, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2510.19643

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.46)

Add feedback

Visual Space Optimization for Zero-shot Learning

Wang, Xinsheng, Pang, Shanmin, Zhu, Jihua, Li, Zhongyu, Tian, Zhiqiang, Li, Yaochen

arXiv.org Artificial IntelligenceOct-22-2025

Zero-shot learning, which aims to recognize new categories that are not included in the training set, has gained popularity owing to its potential ability in the real-word applications. Zero-shot learning models rely on learning an embedding space, where both semantic descriptions of classes and visual features of instances can be embedded for nearest neighbor search. Recently, most of the existing works consider the visual space formulated by deep visual features as an ideal choice of the embedding space. However, the discrete distribution of instances in the visual space makes the data structure unremarkable. We argue that optimizing the visual space is crucial as it allows semantic vectors to be embedded into the visual space more effectively. In this work, we propose two strategies to accomplish this purpose. One is the visual prototype based method, which learns a visual prototype for each visual class, so that, in the visual space, a class can be represented by a prototype feature instead of a series of discrete visual features. The other is to optimize the visual feature structure in an intermediate embedding space, and in this method we successfully devise a multilayer perceptron framework based algorithm that is able to learn the common intermediate embedding space and meanwhile to make the visual data structure more distinctive. Through extensive experimental evaluation on four benchmark datasets, we demonstrate that optimizing visual space is beneficial for zero-shot learning. Besides, the proposed prototype based method achieves the new state-of-the-art performance.

large language model, machine learning, natural language, (14 more...)

arXiv.org Artificial Intelligence

1907.0033

Country: Asia (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Vision > Image Understanding (0.99)
(2 more...)

Add feedback

RAISE: A Unified Framework for Responsible AI Scoring and Evaluation

Nguyen, Loc Phuc Truong, Do, Hung Thanh

arXiv.org Artificial IntelligenceOct-22-2025

As AI systems enter high-stakes domains, evaluation must extend beyond predictive accuracy to include explainability, fairness, robustness, and sustainability. We introduce RAISE (Responsible AI Scoring and Evaluation), a unified framework that quantifies model performance across these four dimensions and aggregates them into a single, holistic Responsibility Score. We evaluated three deep learning models: a Multilayer Perceptron (MLP), a Tabular ResNet, and a Feature Tokenizer Transformer, on structured datasets from finance, healthcare, and socioeconomics. Our findings reveal critical trade-offs: the MLP demonstrated strong sustainability and robustness, the Transformer excelled in explainability and fairness at a very high environmental cost, and the Tabular ResNet offered a balanced profile. These results underscore that no single model dominates across all responsibility criteria, highlighting the necessity of multi-dimensional evaluation for responsible model selection. Our implementation is available at: https://github.com/raise-framework/raise.

artificial intelligence, machine learning, responsible ai scoring, (13 more...)

arXiv.org Artificial Intelligence

2510.18559

Country: Europe > Germany (0.14)

Genre: Research Report > New Finding (0.68)

Industry: Health & Medicine (0.51)

Technology:

Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.54)

Add feedback

A Primer on Kolmogorov-Arnold Networks (KANs) for Probabilistic Time Series Forecasting

Vaca-Rubio, Cristian J., Pereira, Roberto, Blanco, Luis, Zeydan, Engin, Caus, Màrius

arXiv.org Artificial IntelligenceOct-21-2025

This work introduces Probabilistic Kolmogorov-Arnold Network (P-KAN), a novel probabilistic extension of Kolmogorov-Arnold Networks (KANs) for time series forecasting. By replacing scalar weights with spline-based functional connections and directly parameterizing predictive distributions, P-KANs offer expressive yet parameter-efficient models capable of capturing nonlinear and heavy-tailed dynamics. We evaluate P-KANs on satellite traffic forecasting, where uncertainty-aware predictions enable dynamic thresholding for resource allocation. Results show that P-KANs consistently outperform Multi Layer Perceptron (MLP) baselines in both accuracy and calibration, achieving superior efficiency-risk trade-offs while using significantly fewer parameters. We build up P-KANs on two distributions, namely Gaussian and Student-t distributions. The Gaussian variant provides robust, conservative forecasts suitable for safety-critical scenarios, whereas the Student-t variant yields sharper distributions that improve efficiency under stable demand. These findings establish P-KANs as a powerful framework for probabilistic forecasting with direct applicability to satellite communications and other resource-constrained domains.

allocation, artificial intelligence, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2510.1694

Country: Europe (0.46)

Genre: Research Report > New Finding (0.34)

Industry: Telecommunications (0.89)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.55)

Add feedback

PAD: Phase-Amplitude Decoupling Fusion for Multi-Modal Land Cover Classification

Zheng, Huiling, Zhong, Xian, Liu, Bin, Xiao, Yi, Wen, Bihan, Li, Xiaofeng

arXiv.org Artificial IntelligenceOct-20-2025

The fusion of Synthetic Aperture Radar (SAR) and RGB imagery for land cover classification remains challenging due to modality heterogeneity and underexploited spectral complementarity. Existing approaches often fail to decouple shared structural features from modality-complementary radiometric attributes, resulting in feature conflicts and information loss. To address this, we propose Phase-Amplitude Decoupling (PAD), a frequency-aware framework that separates phase (modality-shared) and amplitude (modality-complementary) components in the Fourier domain. This design reinforces shared structures while preserving complementary characteristics, thereby enhancing fusion quality. Unlike previous methods that overlook the distinct physical properties encoded in frequency spectra, PAD explicitly introduces amplitude-phase decoupling for multi-modal fusion. Specifically, PAD comprises two key components: 1) Phase Spectrum Correction (PSC), which aligns cross-modal phase features via convolution-guided scaling to improve geometric consistency; and 2) Amplitude Spectrum Fusion (ASF), which dynamically integrates high- and low-frequency patterns using frequency-adaptive multilayer perceptrons, effectively exploiting SAR's morphological sensitivity and RGB's spectral richness. Extensive experiments on WHU-OPT-SAR and DDHR-SK demonstrate state-of-the-art performance. This work establishes a new paradigm for physics-aware multi-modal fusion in remote sensing. The code will be available at https://github.com/RanFeng2/PAD.

artificial intelligence, deep learning, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2504.19136

Country:

North America > United States (1.00)
Asia > China > Hubei Province (0.15)

Genre: Research Report > New Finding (0.93)

Industry: Education (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

Add feedback

From Universal Approximation Theorem to Tropical Geometry of Multi-Layer Perceptrons

Chu, Yi-Shan, Kuo, Yueh-Cheng

arXiv.org Machine LearningOct-20-2025

We revisit the Universal Approximation Theorem(UAT) through the lens of the tropical geometry of neural networks and introduce a constructive, geometry-aware initialization for sigmoidal multi-layer perceptrons (MLPs). Tropical geometry shows that Rectified Linear Unit (ReLU) networks admit decision functions with a combinatorial structure often described as a tropical rational, namely a difference of tropical polynomials. Focusing on planar binary classification, we design purely sigmoidal MLPs that adhere to the finite-sum format of UAT: a finite linear combination of shifted and scaled sigmoids of affine functions. The resulting models yield decision boundaries that already align with prescribed shapes at initialization and can be refined by standard training if desired. This provides a practical bridge between the tropical perspective and smooth MLPs, enabling interpretable, shape-driven initialization without resorting to ReLU architectures. We focus on the construction and empirical demonstrations in two dimensions; theoretical analysis and higher-dimensional extensions are left for future work.

artificial intelligence, construction, machine learning, (15 more...)

arXiv.org Machine Learning

2510.15012

Country: Europe (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (1.00)

Add feedback