AITopics | Perceptrons

Collaborating Authors

Perceptrons

News Overviews Instructional Materials AI-Alerts Classics

Urinary Tract Infection Detection in Digital Remote Monitoring: Strategies for Managing Participant-Specific Prediction Complexity

Fan, Kexin, Capstick, Alexander, Nilforooshan, Ramin, Barnaghi, Payam

arXiv.org Artificial IntelligenceFeb-18-2025

Urinary tract infections (UTIs) are a significant health concern, particularly for people living with dementia (PLWD), as they can lead to severe complications if not detected and treated early. This study builds on previous work that utilised machine learning (ML) to detect UTIs in PLWD by analysing in-home activity and physiological data collected through low-cost, passive sensors. The current research focuses on improving the performance of previous models, particularly by refining the Multilayer Perceptron (MLP), to better handle variations in home environments and improve sex fairness in predictions by making use of concepts from multitask learning. This study implemented three primary model designs: feature clustering, loss-dependent clustering, and participant ID embedding which were compared against a baseline MLP model. The results demonstrated that the loss-dependent MLP achieved the most significant improvements, increasing validation precision from 48.92% to 72.60% and sensitivity from 27.44% to 70.52%, while also enhancing model fairness across sexes. These findings suggest that the refined models offer a more reliable and equitable approach to early UTI detection in PLWD, addressing participant-specific data variations and enabling clinicians to detect and screen for UTI risks more effectively, thereby facilitating earlier and more accurate treatment decisions.

dataset, participant, tract infection, (14 more...)

arXiv.org Artificial Intelligence

2502.17484

Genre:

Research Report > New Finding (0.48)
Research Report > Experimental Study (0.46)

Industry:

Health & Medicine > Therapeutic Area > Urology (0.72)
Health & Medicine > Therapeutic Area > Nephrology (0.62)
Health & Medicine > Therapeutic Area > Internal Medicine (0.62)
Health & Medicine > Therapeutic Area > Neurology > Dementia (0.36)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.47)

Add feedback

Comparative Analysis of FPGA and GPU Performance for Machine Learning-Based Track Reconstruction at LHCb

Giasemis, Fotis I., Lončar, Vladimir, Granado, Bertrand, Gligorov, Vladimir Vava

arXiv.org Artificial IntelligenceFeb-16-2025

In high-energy physics, the increasing luminosity and detector granularity at the Large Hadron Collider are driving the need for more efficient data processing solutions. Machine Learning has emerged as a promising tool for reconstructing charged particle tracks, due to its potentially linear computational scaling with detector hits. The recent implementation of a graph neural network-based track reconstruction pipeline in the first level trigger of the LHCb experiment on GPUs serves as a platform for comparative studies between computational architectures in the context of high-energy physics. This paper presents a novel comparison of the throughput of ML model inference between FPGAs and GPUs, focusing on the first step of the track reconstruction pipeline$\unicode{x2013}$an implementation of a multilayer perceptron. Using HLS4ML for FPGA deployment, we benchmark its performance against the GPU implementation and demonstrate the potential of FPGAs for high-throughput, low-latency inference without the need for an expertise in FPGA development and while consuming significantly less power.

artificial intelligence, implementation, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2502.02304

Country:

North America > United States (0.47)
Europe (0.29)

Genre:

Research Report (0.64)
Workflow (0.49)

Industry:

Information Technology (0.50)
Energy > Power Industry (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Add feedback

CL-MFAP: A Contrastive Learning-Based Multimodal Foundation Model for Molecular Property Prediction and Antibiotic Screening

Zhou, Gen, Janarthanan, Sugitha, Lu, Yutong, Hu, Pingzhao

arXiv.org Artificial IntelligenceFeb-16-2025

Due to the rise in antimicrobial resistance, identifying novel compounds with antibiotic potential is crucial for combatting this global health issue. However, traditional drug development methods are costly and inefficient. Recognizing the pressing need for more effective solutions, researchers have turned to machine learning techniques to streamline the prediction and development of novel antibiotic compounds. While foundation models have shown promise in antibiotic discovery, current mainstream efforts still fall short of fully leveraging the potential of multimodal molecular data. Recent studies suggest that contrastive learning frameworks utilizing multimodal data exhibit excellent performance in representation learning across various domains. Building upon this, we introduce CL-MFAP, an unsupervised contrastive learning (CL)-based multimodal foundation (MF) model specifically tailored for discovering small molecules with potential antibiotic properties (AP) using three types of molecular data. This model employs 1.6 million bioactive molecules with drug-like properties from the ChEMBL dataset to jointly pretrain three encoders: (1) a transformer-based encoder with rotary position embedding for processing SMILES strings; (2) another transformerbased encoder, incorporating a novel bi-level routing attention mechanism to handle molecular graph representations; and (3) a Morgan fingerprint encoder using a multilayer perceptron, to achieve the contrastive learning purpose. The CL-MFAP outperforms baseline models in antibiotic property prediction by effectively utilizing different molecular modalities and demonstrates superior domain-specific performance when fine-tuned for antibiotic-related property prediction tasks. Bacteria play a pivotal role in a diverse array of diseases within the human body, serving as either the primary cause or a contributing factor. A promising and sometimes sole treatment for these diseases is antibiotics, a specialized class of drugs designed to target pathogenic bacteria. Despite advancements, a lack of antibiotics for many pathogenic bacteria persists, and antibiotic resistance allows bacteria to survive once effective treatments. Consequently, there is a pressing demand for the continual development of antibiotics.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2502.11001

Country:

North America > United States (0.68)
North America > Canada > Ontario (0.28)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.54)

Add feedback

To Bin or not to Bin: Alternative Representations of Mass Spectra

de Jonge, Niek, van der Hooft, Justin J. J., Probst, Daniel

arXiv.org Artificial IntelligenceFeb-15-2025

Mass spectrometry, especially so-called tandem mass spectrometry, is commonly used to assess the chemical diversity of samples. The resulting mass fragmentation spectra are representations of molecules of which the structure may have not been determined. This poses the challenge of experimentally determining or computationally predicting molecular structures from mass spectra. An alternative option is to predict molecular properties or molecular similarity directly from spectra. Various methodologies have been proposed to embed mass spectra for further use in machine learning tasks. However, these methodologies require preprocessing of the spectra, which often includes binning or sub-sampling peaks with the main reasoning of creating uniform vector sizes and removing noise. Here, we investigate two alternatives to the binning of mass spectra before down-stream machine learning tasks, namely, set-based and graph-based representations. Comparing the two proposed representations to train a set transformer and a graph neural network on a regression task, respectively, we show that they both perform substantially better than a multilayer perceptron trained on binned data.

artificial intelligence, machine learning, mass spectra, (14 more...)

arXiv.org Artificial Intelligence

2502.10851

Country:

Africa (0.29)
North America > United States > Hawaii (0.15)

Genre: Research Report > New Finding (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.56)

Add feedback

LEAPS: A discrete neural sampler via locally equivariant networks

Holderrieth, Peter, Albergo, Michael S., Jaakkola, Tommi

arXiv.org Machine LearningFeb-15-2025

We propose LEAPS, an algorithm to sample from discrete distributions known up to normalization by learning a rate matrix of a continuous-time Markov chain (CTMC). LEAPS can be seen as a continuous-time formulation of annealed importance sampling and sequential Monte Carlo methods, extended so that the variance of the importance weights is offset by the inclusion of the CTMC. To derive these importance weights, we introduce a set of Radon-Nikodym derivatives of CTMCs over their path measures. Because the computation of these weights is intractable with standard neural network parameterizations of rate matrices, we devise a new compact representation for rate matrices via what we call locally equivariant functions. To parameterize them, we introduce a family of locally equivariant multilayer perceptrons, attention layers, and convolutional networks, and provide an approach to make deep networks that preserve the local equivariance. This property allows us to propose a scalable training algorithm for the rate matrix such that the variance of the importance weights associated to the CTMC are minimal. We demonstrate the efficacy of LEAPS on problems in statistical physics.

artificial intelligence, discrete neural sampler, machine learning, (1 more...)

arXiv.org Machine Learning

2502.10843

Genre: Research Report (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.53)

Add feedback

USER-VLM 360: Personalized Vision Language Models with User-aware Tuning for Social Human-Robot Interactions

Rahimi, Hamed, Bahaj, Adil, Abrini, Mouad, Khoramshahi, Mahdi, Ghogho, Mounir, Chetouani, Mohamed

arXiv.org Artificial IntelligenceFeb-14-2025

The integration of vision-language models into robotic systems constitutes a significant advancement in enabling machines to interact with their surroundings in a more intuitive manner. While VLMs offer rich multimodal reasoning, existing approaches lack user-specific adaptability, often relying on generic interaction paradigms that fail to account for individual behavioral, contextual, or socio-emotional nuances. When customization is attempted, ethical concerns arise from unmitigated biases in user data, risking exclusion or unfair treatment. To address these dual challenges, we propose User-VLM 360{\deg}, a holistic framework integrating multimodal user modeling with bias-aware optimization. Our approach features: (1) user-aware tuning that adapts interactions in real time using visual-linguistic signals; (2) bias mitigation via preference optimization; and (3) curated 360{\deg} socio-emotive interaction datasets annotated with demographic, emotion, and relational metadata. Evaluations across eight benchmarks demonstrate state-of-the-art results: +35.3% F1 in personalized VQA, +47.5% F1 in facial features understanding, 15% bias reduction, and 30X speedup over baselines. Ablation studies confirm component efficacy, and deployment on the Pepper robot validates real-time adaptability across diverse users. We open-source parameter-efficient 3B/10B models and an ethical verification framework for responsible adaptation.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2502.10636

Country:

Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)
Asia > Middle East > UAE (0.04)
Africa > Middle East > Morocco > Rabat-Salé-Kénitra Region > Rabat (0.04)

Genre: Research Report > Experimental Study (0.46)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(2 more...)

Add feedback

Unlocking the Potential of Classic GNNs for Graph-level Tasks: Simple Architectures Meet Excellence

Luo, Yuankai, Shi, Lei, Wu, Xiao-Ming

arXiv.org Artificial IntelligenceFeb-13-2025

Message-passing Graph Neural Networks (GNNs) are often criticized for their limited expressiveness, issues like over-smoothing and over-squashing, and challenges in capturing long-range dependencies, while Graph Transformers (GTs) are considered superior due to their global attention mechanisms. Literature frequently suggests that GTs outperform GNNs, particularly in graph-level tasks such as graph classification and regression. In this study, we explore the untapped potential of GNNs through an enhanced framework, GNN+, which integrates six widely used techniques: edge feature integration, normalization, dropout, residual connections, feed-forward networks, and positional encoding, to effectively tackle graph-level tasks. We conduct a systematic evaluation of three classic GNNs, namely GCN, GIN, and GatedGCN, enhanced by the GNN+ framework across 14 well-known graph-level datasets. Our results show that, contrary to the prevailing belief, classic GNNs excel in graph-level tasks, securing top three rankings across all datasets and achieving first place in eight, while also demonstrating greater efficiency than GTs. This highlights the potential of simple GNN architectures, challenging the belief that complex mechanisms in GTs are essential for superior graph-level performance.

artificial intelligence, dataset, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2502.09263

Country: Asia > China > Hong Kong (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.48)

Add feedback

Shortcut Learning Susceptibility in Vision Classifiers

Suhail, Pirzada, Sethi, Amit

arXiv.org Artificial IntelligenceFeb-13-2025

Shortcut learning, where machine learning models exploit spurious correlations in data instead of capturing meaningful features, poses a significant challenge to building robust and generalizable models. This phenomenon is prevalent across various machine learning applications, including vision, natural language processing, and speech recognition, where models may find unintended cues that minimize training loss but fail to capture the underlying structure of the data. Vision classifiers such as Convolutional Neural Networks (CNNs), Multi-Layer Perceptrons (MLPs), and Vision Transformers (ViTs) leverage distinct architectural principles to process spatial and structural information, making them differently susceptible to shortcut learning. In this study, we systematically evaluate these architectures by introducing deliberate shortcuts into the dataset that are positionally correlated with class labels, creating a controlled setup to assess whether models rely on these artificial cues or learn actual distinguishing features. We perform both quantitative evaluation by training on the shortcut-modified dataset and testing them on two different test sets -- one containing the same shortcuts and another without them -- to determine the extent of reliance on shortcuts. Additionally, qualitative evaluation is performed by using network inversion-based reconstruction techniques to analyze what the models internalize in their weights, aiming to reconstruct the training data as perceived by the classifiers. We evaluate shortcut learning behavior across multiple benchmark datasets, including MNIST, Fashion-MNIST, SVHN, and CIFAR-10, to compare the susceptibility of different vision classifier architectures to shortcut reliance and assess their varying degrees of sensitivity to spurious correlations.

shortcut, susceptibility, training data, (13 more...)

arXiv.org Artificial Intelligence

2502.0915

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > Switzerland (0.04)
Asia > India > Maharashtra > Mumbai (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.69)

Add feedback

On Herding and the Perceptron Cycling Theorem

Andrew Gelfand, Yutian Chen, Laurens Maaten, Max Welling

Neural Information Processing SystemsFeb-12-2025, 02:24:40 GMT

The paper develops a connection between traditional perceptron algorithms and recently introduced herding algorithms. It is shown that both algorithms can be viewed as an application of the perceptron cycling theorem. This connection strengthens some herding results and suggests new (supervised) herding algorithms that, like CRFs or discriminative RBMs, make predictions by conditioning on the input attributes. We develop and investigate variants of conditional herding, and show that conditional herding leads to practical algorithms that perform better than or on par with related classifiers such as the voted perceptron and the discriminative RBM.

algorithm, artificial intelligence, machine learning, (15 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.28)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)

Add feedback

Estimating Probabilities of Causation with Machine Learning Models

Wang, Shuai, Li, Ang

arXiv.org Artificial IntelligenceFeb-12-2025

Probabilities of causation play a crucial role in modern decision-making. This paper addresses the challenge of predicting probabilities of causation for subpopulations with insufficient data using machine learning models. Tian and Pearl first defined and derived tight bounds for three fundamental probabilities of causation: the probability of necessity and sufficiency (PNS), the probability of sufficiency (PS), and the probability of necessity (PN). However, estimating these probabilities requires both experimental and observational distributions specific to each subpopulation, which are often unavailable or impractical to obtain with limited population-level data. We assume that the probabilities of causation for each subpopulation are determined by its characteristics. To estimate these probabilities for subpopulations with insufficient data, we propose using machine learning models that draw insights from subpopulations with sufficient data. Our evaluation of multiple machine learning models indicates that, given sufficient population-level data and an appropriate choice of machine learning model and activation function, PNS can be effectively predicted. Through simulation studies, we show that our multilayer perceptron (MLP) model with the Mish activation function achieves a mean absolute error (MAE) of approximately 0.02 in predicting PNS for 32,768 subpopulations using data from around 2,000 subpopulations.

artificial intelligence, machine learning, probability, (16 more...)

arXiv.org Artificial Intelligence

2502.08858

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.28)
North America > United States > Florida > Leon County > Tallahassee (0.04)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
(2 more...)

Genre: Research Report (0.82)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.46)

Add feedback