Goto

Collaborating Authors

 odel


Almost Surely Asymptotically Constant Graph Neural Networks

Neural Information Processing Systems

We present a new angle on the expressive power of graph neural networks (GNNs) by studying how the predictions of real-valued GNN classifiers, such as those classifying graphs probabilistically, evolve as we apply them on larger graphs drawn from some random graph model. We show that the output converges to a constant function, which upper-bounds what these classifiers can uniformly express. This strong convergence phenomenon applies to a very wide class of GNNs, including state of the art models, with aggregates including mean and the attention-based mechanism of graph transformers. Our results apply to a broad class of random graph models, including sparse and dense variants of the Erdős-Rényi model, the stochastic block model, and the Barabási-Albert model. We empirically validate these findings, observing that the convergence phenomenon appears not only on random graphs but also on some real-world graphs.


Particle swarm optimization for online sparse streaming feature selection under uncertainty

Xu, Ruiyang

arXiv.org Artificial Intelligence

In real - world applications involving high - dimensional streaming dat a, online streaming feature selection (OSFS) is widely adopt ed. Yet, practical deployments frequently face data incompleteness due to sensor failures or technical constraints. While online sparse streaming feature selection (OS FS) mitigates this issue via latent factor analysis - based imputation, existing methods s truggle with uncertain feature - label correlations, leading to inflexible models and degraded performance. To address these gaps, this work proposes P OS FS -- an uncertainty - aware online sparse stream ing feature selection framework enhanced by particle swarm optimization (PSO). The approach introduces: 1) PSO - driven supervision to reduce uncertainty in feature - label relationships; 2) Three - way decision theory to manage feature fuzziness in supervised l earning. Rigorous testing on six real - world datasets confirms P OS FS outperforms conventional OSFS and OS FS techniques, delivering higher accuracy through more robust feature subset selection.


DragonFruitQualityNet: A Lightweight Convolutional Neural Network for Real-Time Dragon Fruit Quality Inspection on Mobile Devices

Haquea, Md Zahurul, Sarker, Yeahyea, Mahi, Muhammed Farhan Sadique, Jaman, Syed Jubayer, Islam, Md Robiul

arXiv.org Artificial Intelligence

Dragon fruit, renowned for its nutritional benefits and economic value, has experienced rising global demand due to its affordability and local availability. As dragon fruit cultivation expands, efficient pre- and post-harvest quality inspection has become essential for improving agricultural productivity and minimizing post-harvest losses. This study presents DragonFruitQualityNet, a lightweight Convolutional Neural Network (CNN) optimized for real-time quality assessment of dragon fruits on mobile devices. We curated a diverse dataset of 13,789 images, integrating self-collected samples with public datasets (dataset from Mendeley Data), and classified them into four categories: fresh, immature, mature, and defective fruits to ensure robust model training. The proposed model achieves an impressive 93.98% accuracy, outperforming existing methods in fruit quality classification. To facilitate practical adoption, we embedded the model into an intuitive mobile application, enabling farmers and agricultural stakeholders to conduct on-device, real-time quality inspections. This research provides an accurate, efficient, and scalable AI-driven solution for dragon fruit quality control, supporting digital agriculture and empowering smallholder farmers with accessible technology. By bridging the gap between research and real-world application, our work advances post-harvest management and promotes sustainable farming practices.


Almost Surely Asymptotically Constant Graph Neural Networks

Neural Information Processing Systems

We present a new angle on the expressive power of graph neural networks (GNNs) by studying how the predictions of real-valued GNN classifiers, such as those classifying graphs probabilistically, evolve as we apply them on larger graphs drawn from some random graph model. We show that the output converges to a constant function, which upper-bounds what these classifiers can uniformly express. This strong convergence phenomenon applies to a very wide class of GNNs, including state of the art models, with aggregates including mean and the attention-based mechanism of graph transformers. Our results apply to a broad class of random graph models, including sparse and dense variants of the Erdős-Rényi model, the stochastic block model, and the Barabási-Albert model. We empirically validate these findings, observing that the convergence phenomenon appears not only on random graphs but also on some real-world graphs.


A Computational Model of Learning and Memory Using Structurally Dynamic Cellular Automata

Singh, Jeet

arXiv.org Artificial Intelligence

In the fields of computation and neuroscience, much is still unknown about the underlying computations that enable key cognitive functions including learning, memory, abstraction and behavior. This paper proposes a mathematical and computational model of learning and memory based on a small set of bio-plausible functions that include coincidence detection, signal modulation, and reward/penalty mechanisms. Our theoretical approach proposes that these basic functions are sufficient to establish and modulate an information space over which computation can be carried out, generating signal gradients usable for inference and behavior. The computational method used to test this is a structurally dynamic cellular automaton with continuous-valued cell states and a series of recursive steps propagating over an undirected graph with the memory function embedded entirely in the creation and modulation of graph edges. The experimental results show: that the toy model can make near-optimal choices to re-discover a reward state after a single training run; that it can avoid complex penalty configurations; that signal modulation and network plasticity can generate exploratory behaviors in sparse reward environments; that the model generates context-dependent memory representations; and that it exhibits high computational efficiency because of its minimal, single-pass training requirements combined with flexible and contextual memory representation.


New Insight in Cervical Cancer Diagnosis Using Convolution Neural Network Architecture

Khozaimi, Ach., Mahmudy, Wayan Firdaus

arXiv.org Artificial Intelligence

The Pap smear is a screening method for early cervical cancer diagnosis. The selection of the right optimizer in the convolutional neural network (CNN) model is key to the success of the CNN in image classification, including the classification of cervical cancer Pap smear images. In this study, stochastic gradient descent (SGD), RMSprop, Adam, AdaGrad, AdaDelta, Adamax, and Nadam optimizers were used to classify cervical cancer Pap smear images from the SipakMed dataset. Resnet-18, Resnet-34, and VGG-16 are the CNN architectures used in this study, and each architecture uses a transfer-learning model. Based on the test results, we conclude that the transfer learning model performs better on all CNNs and optimization techniques and that in the transfer learning model, the optimization has little influence on the training of the model. Adamax, with accuracy values of 72.8% and 66.8%, had the best accuracy for the VGG-16 and Resnet-18 architectures, respectively. Resnet-34 had 54.0%. This is 0.034% lower than Nadam. Overall, Adamax is a suitable optimizer for CNN in cervical cancer classification on Resnet-18, Resnet-34, and VGG-16 architectures. This study provides new insights into the configuration of CNN models for Pap smear image analysis.


Neural Superstatistics for Bayesian Estimation of Dynamic Cognitive Models

Schumacher, Lukas, Bürkner, Paul-Christian, Voss, Andreas, Köthe, Ullrich, Radev, Stefan T.

arXiv.org Machine Learning

Mathematical models of cognition are often memoryless and ignore potential fluctuations of their parameters. However, human cognition is inherently dynamic. Thus, we propose to augment mechanistic cognitive models with a temporal dimension and estimate the resulting dynamics from a superstatistics perspective. Such a model entails a hierarchy between a low-level observation model and a high-level transition model. The observation model describes the local behavior of a system, and the transition model specifies how the parameters of the observation model evolve over time. To overcome the estimation challenges resulting from the complexity of superstatistical models, we develop and validate a simulation-based deep learning method for Bayesian inference, which can recover both time-varying and time-invariant parameters. We first benchmark our method against two existing frameworks capable of estimating time-varying parameters. We then apply our method to fit a dynamic version of the diffusion decision model to long time series of human response times data. Our results show that the deep learning approach is very efficient in capturing the temporal dynamics of the model. Furthermore, we show that the erroneous assumption of static or homogeneous parameters will hide important temporal information.


EdnaML: A Declarative API and Framework for Reproducible Deep Learning

Suprem, Abhijit, Vaidya, Sanjyot, Venugopal, Avinash, Ferreira, Joao Eduardo, Pu, Calton

arXiv.org Artificial Intelligence

Machine Learning has become the bedrock of recent advances in text, image, video, and audio processing and generation. Most production systems deal with several models during deployment and training, each with a variety of tuned hyperparameters. Furthermore, data collection and processing aspects of ML pipelines are receiving increasing interest due to their importance in creating sustainable high-quality classifiers. We present EdnaML, a framework with a declarative API for reproducible deep learning. EdnaML provides low-level building blocks that can be composed manually, as well as a high-level pipeline orchestration API to automate data collection, data processing, classifier training, classifier deployment, and model monitoring. Our layered API allows users to manage ML pipelines at high-level component abstractions, while providing flexibility to modify any part of it through the building blocks. We present several examples of ML pipelines with EdnaML, including a large-scale fake news labeling and classification system with six sub-pipelines managed by EdnaML.


Rethinking and Recomputing the Value of ML Models

Sayin, Burcu, Casati, Fabio, Passerini, Andrea, Yang, Jie, Chen, Xinyue

arXiv.org Artificial Intelligence

In this paper, we argue that the way we have been training and evaluating ML models has largely forgotten the fact that they are applied in an organization or societal context as they provide value to people. We show that with this perspective we fundamentally change how we evaluate, select and deploy ML models - and to some extent even what it means to learn. Specifically, we stress that the notion of value plays a central role in learning and evaluating, and different models may require different learning practices and provide different values based on the application context they are applied. We also show that this concretely impacts how we select and embed models into human workflows based on experimental datasets. Nothing of what is presented here is hard: to a large extent is a series of fairly trivial observations with massive practical implications.


Neural Networks for Local Search and Crossover in Vehicle Routing: A Possible Overkill?

Santana, Ítalo, Lodi, Andrea, Vidal, Thibaut

arXiv.org Artificial Intelligence

Extensive research has been conducted, over recent years, on various ways of enhancing heuristic search for combinatorial optimization problems with machine learning algorithms. In this study, we investigate the use of predictions from graph neural networks (GNNs) in the form of heatmaps to improve the Hybrid Genetic Search (HGS), a state-of-the-art algorithm for the Capacitated Vehicle Routing Problem (CVRP). The crossover and local-search components of HGS are instrumental in finding improved solutions, yet these components essentially rely on simple greedy or random choices. It seems intuitive to attempt to incorporate additional knowledge at these levels. Throughout a vast experimental campaign on more than 10,000 problem instances, we show that exploiting more sophisticated strategies using measures of node relatedness (heatmaps, or simply distance) within these algorithmic components can significantly enhance performance. However, contrary to initial expectations, we also observed that heatmaps did not present significant advantages over simpler distance measures for these purposes. Therefore, we faced a common -- though rarely documented -- situation of overkill: GNNs can indeed improve performance on an important optimization task, but an ablation analysis demonstrated that simpler alternatives perform equally well.