Perceptrons
Evaluating the Generalization Ability of Spatiotemporal Model in Urban Scenario
Wang, Hongjun, Chen, Jiyuan, Pan, Tong, Dong, Zheng, Zhang, Lingyu, Jiang, Renhe, Song, Xuan
Spatiotemporal neural networks have shown great promise in urban scenarios by effectively capturing temporal and spatial correlations. However, urban environments are constantly evolving, and current model evaluations are often limited to traffic scenarios and mainly use data collected only a few weeks after the training period to evaluate model performance. The generalization ability of these models remains largely unexplored. To address this, we propose a Spatiotemporal Out-of-Distribution (ST-OOD) benchmark, which comprises six urban scenarios: bike-sharing, 311 services, pedestrian counts, traffic speed, traffic flow, and ride-hailing demand, each with in-distribution (same year) and out-of-distribution (next years) settings. We extensively evaluate state-of-the-art spatiotemporal models and find that their performance degrades significantly in out-of-distribution settings, with most models performing even worse than a simple Multi-Layer Perceptron (MLP). Our findings suggest that current leading methods tend to over-rely on parameters to overfit training data, which may lead to good performance on in-distribution data but often results in poor generalization. We also investigated whether dropout could mitigate the negative effects of overfitting. Our results showed that a slight dropout rate could significantly improve generalization performance on most datasets, with minimal impact on in-distribution performance. However, balancing in-distribution and out-of-distribution performance remains a challenging problem. We hope that the proposed benchmark will encourage further research on this critical issue.
Reviews: Incorporating Side Information by Adaptive Convolution
Summary of the Paper: This work proposes to use adaptive convolutions (also called 'cross convolutions') to incorporate side information (e.g., camera angle) into CNN architectures for vision tasks (e.g., crowd counting). The filter weights in each adaptive convolution layer are predicted by a separate neural network (one network for each set of filter weights), which is a multi-layer perceptron. This network, referred to as the 'Filter Manifold Network', takes the auxiliary side information as input and predicts the filter weights. Experiments on three vision tasks (crowd counting, digit recognition, and image deconvolution) indicate the potential of the proposed technique for incorporating auxiliary information. In addition, this paper contributes a new dataset for crowd counting with different camera heights and angles.
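The idea summarized above can be illustrated with a minimal one-dimensional sketch. It is not the paper's implementation: the function names, the single hidden layer, and the 1D "valid" sliding-window convolution (cross-correlation, as is usual in CNNs) are all assumptions made for illustration. A small MLP maps the side information to a kernel, and that kernel is then applied to the signal.

```python
def filter_manifold_network(side_info, W1, b1, W2, b2):
    """Hypothetical one-hidden-layer MLP: side information -> filter weights."""
    # Hidden layer with ReLU activation.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, side_info)) + b)
              for row, b in zip(W1, b1)]
    # Linear output layer: one output per filter tap.
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(W2, b2)]


def adaptive_conv1d(signal, side_info, params):
    """Slide a kernel predicted from side_info over the signal (valid mode)."""
    kernel = filter_manifold_network(side_info, *params)
    k = len(kernel)
    return [sum(kernel[j] * signal[i + j] for j in range(k))
            for i in range(len(signal) - k + 1)]


# Toy parameters (assumed): the kernel is [s, 2s] for scalar side info s >= 0.
toy_params = ([[1.0]], [0.0], [[1.0], [2.0]], [0.0, 0.0])
```

With these toy parameters, the same input signal is filtered differently depending on the side information, which is the behavior the review describes: the filter weights are a function of the auxiliary input rather than fixed after training.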
Reviews: A simple neural network module for relational reasoning
The paper proposes a plug and play module (called Relation Networks (RNs)) specialized for relational reasoning. The module is composed of Multi Layer Perceptrons and considers relations between all pairs of objects. The proposed module when plugged into traditional networks achieves state of the art performance on the CLEVR visual question answering dataset, state of the art (with joint training for all tasks) on the bAbI textual question answering dataset and high performance (93% on one task and 95% on another) on a newly collected dataset of simulated physical mass-spring systems. The paper also collects a dataset similar to CLEVR to demonstrate the effectiveness of the proposed RNs for relational questions. The proposed Relation Network is a novel neural network specialized for relational reasoning.
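As a rough sketch of the structure described above, a Relation Network applies one function g to every pair of objects, sums the results, and passes the sum through a second function f. In the paper both g and f are MLPs; here they are passed in as plain callables, and the `dense` helper and all names are assumptions made for illustration only.

```python
from itertools import product


def dense(weights, bias, x, relu=True):
    """One fully connected layer; an MLP is a chain of these (illustrative helper)."""
    out = [sum(w * xi for w, xi in zip(row, x)) + b
           for row, b in zip(weights, bias)]
    return [max(0.0, v) for v in out] if relu else out


def relation_network(objects, g, f):
    """Compute f(sum over all ordered pairs (o_i, o_j) of g(o_i, o_j))."""
    if not objects:
        raise ValueError("need at least one object")
    pair_sum = None
    for o_i, o_j in product(objects, repeat=2):
        r = g(o_i, o_j)
        # Element-wise sum of the pairwise relation vectors.
        pair_sum = r if pair_sum is None else [a + b for a, b in zip(pair_sum, r)]
    return f(pair_sum)
```

Because the pairwise outputs are summed, the result does not depend on the order in which the objects are listed (up to floating-point rounding), which is one reason the module can be plugged into different upstream networks.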
FreSh: Frequency Shifting for Accelerated Neural Representation Learning
Kania, Adam, Mihajlovic, Marko, Prokudin, Sergey, Tabor, Jacek, Spurek, Przemysław
Implicit Neural Representations (INRs) have recently gained attention as a powerful approach for continuously representing signals such as images, videos, and 3D shapes using multilayer perceptrons (MLPs). However, MLPs are known to exhibit a low-frequency bias, limiting their ability to capture high-frequency details accurately. This limitation is typically addressed by incorporating high-frequency input embeddings or specialized activation layers. In this work, we demonstrate that these embeddings and activations are often configured with hyperparameters that perform well on average but are suboptimal for specific input signals under consideration, necessitating a costly grid search to identify optimal settings. Our key observation is that the initial frequency spectrum of an untrained model's output correlates strongly with the model's eventual performance on a given target signal. Leveraging this insight, we propose frequency shifting (or FreSh), a method that selects embedding hyperparameters to align the frequency spectrum of the model's initial output with that of the target signal. We show that this simple initialization technique improves performance across various neural representation methods and tasks, achieving results comparable to extensive hyperparameter sweeps but with only marginal computational overhead compared to training a single model with default hyperparameters. Implicit Neural Representations (INRs) are advancing computer graphics research by integrating classical algorithms with continuous signal representations. They have been successfully applied in signal representation and inverse problems, with notable applications in neural rendering, compression, and 2D and 3D signal reconstruction (Xie et al., 2022). INRs primarily rely on multilayer perceptrons (MLPs), making them susceptible to spectral bias, which refers to the slower convergence of MLPs when approximating high-frequency components of the target signal (Rahaman et al., 2019).
Reviews: Decoupling "when to update" from "how to update"
Summary: The paper proposes a meta-algorithm for training any binary classifier in a manner that is robust to label noise. A model trained with noisy labels will overfit them if trained for too long. Instead, one can train two models at the same time, initialized at random, and update by disagreement: the updates are performed only when the two models' predictions differ, a sign that they are still learning from the genuine signal in the data (not the noise). Conversely, and defensively, if the models agree on their predictions and the respective ground truth label is different, they should not perform an update, because this is a sign of potential label noise. A key element is the random initialization of the models, since the assumption is that the two should not give the same prediction unless they are close to convergence; this fits well with deep neural networks, the target of this work. The paper provides a proof of convergence in the case of linear models (updated with the perceptron algorithm and in the realizable case) and a proof that the optimal model cannot be reached in general, unless we resort to restrictive distributional assumptions (this is nice since it also shows a theoretical limitation of the meta-algorithm).
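The update rule described above can be sketched for the linear case with two perceptrons. This is a minimal illustration under stated assumptions, not the authors' code: labels are taken to be in {-1, +1}, initial weights are supplied by the caller (random in practice), and the function names are hypothetical.

```python
def predict(w, x):
    """Sign of the linear score; a score of exactly 0 is treated as +1."""
    s = sum(wi * xi for wi, xi in zip(w, x))
    return 1 if s >= 0 else -1


def update_by_disagreement(w1, w2, data, epochs=10):
    """Train two perceptrons, updating only on examples where they disagree."""
    w1, w2 = list(w1), list(w2)  # work on copies of the initial weights
    for _ in range(epochs):
        for x, y in data:
            p1, p2 = predict(w1, x), predict(w2, x)
            if p1 == p2:
                # Agreement: skip, even if both contradict the (possibly noisy) label.
                continue
            # Disagreement: standard perceptron step for the model that is wrong.
            for w, p in ((w1, p1), (w2, p2)):
                if p != y:
                    for i, xi in enumerate(x):
                        w[i] += y * xi
    return w1, w2
```

For example, starting from `w1 = [1.0]` and `w2 = [2.0]`, an example `([1.0], -1)` triggers no update because both models predict +1; the rule treats a label that contradicts both models as potential noise. This also shows why the initialization matters: two models that start out identical never disagree and therefore never update.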
Reviews: Revisiting Perceptron: Efficient and Label-Optimal Learning of Halfspaces
It shows that their algorithm learns using a nearly tight number of samples under random independent noise of bounded rate. Previous work had exponentially worse dependence on the noise rate. In addition, it shows that this algorithm can deal with adversarial noise of sufficiently low rate. The latter result improves polynomially on the sample complexity but requires a stronger condition on the noise rate. The assumptions in this setting are very strong and as a result are highly unlikely to hold in any realistic problem.
Reviews: Nonparametric Bayesian Lomax delegate racing for survival analysis with competing risks
The model has two appealing characteristics. First, it allows predictors to affect the hazard function non-linearly. Second, the non-linearity is formulated using latent "sub-events" that compete to determine when an observable event of interest will occur. This arguably makes the non-linearity more interpretable than approaches like random forests or multilayer perceptrons. Moreover, the number of sub-events is specified using a nonparametric Bayesian model and so model complexity can adapt to the problem.
Easydiagnos: a framework for accurate feature selection for automatic diagnosis in smart healthcare
Maji, Prasenjit, Mondal, Amit Kumar, Mondal, Hemanta Kumar, Mohanty, Saraju P.
The rapid advancements in artificial intelligence (AI) have revolutionized smart healthcare, driving innovations in wearable technologies, continuous monitoring devices, and intelligent diagnostic systems. However, security, explainability, robustness, and performance optimization challenges remain critical barriers to widespread adoption in clinical environments. This research presents an innovative algorithmic method using the Adaptive Feature Evaluator (AFE) algorithm to improve feature selection in healthcare datasets and overcome these problems. By integrating Genetic Algorithms (GA), Explainable Artificial Intelligence (XAI), and Permutation Combination Techniques (PCT), the algorithm optimizes Clinical Decision Support Systems (CDSS), thereby enhancing predictive accuracy and interpretability. The proposed method is validated across three diverse healthcare datasets using six distinct machine learning algorithms, demonstrating its robustness and superiority over conventional feature selection techniques. The results underscore the transformative potential of AFE in smart healthcare, enabling personalized and transparent patient care. Notably, the AFE algorithm, when combined with a Multi-layer Perceptron (MLP), achieved an accuracy of up to 98.5%, highlighting its capability to improve clinical decision-making processes in real-world healthcare applications.
Using fractal dimension to predict the risk of intra cranial aneurysm rupture with machine learning
Elavarthi, Pradyumna, Ralescu, Anca, Johnson, Mark D., Prestigiacomo, Charles J.
Intracranial aneurysms (IAs) that rupture result in significant morbidity and mortality. While traditional risk models such as the PHASES score are useful in clinical decision making, machine learning (ML) models offer the potential for greater accuracy. In this study, we compared the performance of four different machine learning algorithms, Random Forest (RF), XGBoost (XGB), Support Vector Machine (SVM), and Multi Layer Perceptron (MLP), on clinical and radiographic features to predict the rupture status of intracranial aneurysms. Among the models, RF achieved the highest accuracy (85%) with balanced precision and recall, while MLP had the lowest overall performance (accuracy of 63%). Fractal dimension ranked as the most important feature for model performance across all models.
Generating peak-aware pseudo-measurements for low-voltage feeders using metadata of distribution system operators
Treutlein, Manuel, Schmidt, Marc, Hahn, Roman, Hertel, Matthias, Heidrich, Benedikt, Mikut, Ralf, Hagenmeyer, Veit
Distribution system operators (DSOs) must cope with new challenges such as the reconstruction of distribution grids along climate neutrality pathways or the ability to manage and control consumption and generation in the grid. In order to meet the challenges, measurements within the distribution grid often form the basis for DSOs. Hence, it is an urgent problem that measurement devices are not installed in many low-voltage (LV) grids. In order to overcome this problem, we present an approach to estimate pseudo-measurements for non-measured LV feeders based on the metadata of the respective feeder using regression models. The feeder metadata comprise information about the number of grid connection points, the installed power of consumers and producers, and billing data in the downstream LV grid. Additionally, we use weather data, calendar data and timestamp information as model features. The existing measurements are used as model target. We extensively evaluate the estimated pseudo-measurements on a large real-world dataset with 2,323 LV feeders characterized by both consumption and feed-in. For this purpose, we introduce peak metrics inspired by the BigDEAL challenge for the peak magnitude, timing and shape for both consumption and feed-in. As regression models, we use XGBoost, a multilayer perceptron (MLP) and a linear regression (LR). We observe that XGBoost and MLP outperform the LR. Furthermore, the results show that the approach adapts to different weather, calendar and timestamp conditions and produces realistic load curves based on the feeder metadata. In the future, the approach can be adapted to other grid levels like substation transformers and can supplement research fields like load modeling, state estimation and LV load forecasting.