DRMD: Deep Reinforcement Learning for Malware Detection under Concept Drift
McFadden, Shae, Foley, Myles, D'Onghia, Mario, Hicks, Chris, Mavroudis, Vasilios, Paoletti, Nicola, Pierazzi, Fabio
Malware detection in real-world settings must deal with evolving threats, limited labeling budgets, and uncertain predictions. Traditional classifiers, without additional mechanisms, struggle to maintain performance under concept drift in malware domains, as their supervised learning formulation cannot optimize when to defer decisions to manual labeling and adaptation. Modern malware detection pipelines combine classifiers with monthly active learning (AL) and rejection mechanisms to mitigate the impact of concept drift. In this work, we develop a novel formulation of malware detection as a one-step Markov Decision Process and train a deep reinforcement learning (DRL) agent, simultaneously optimizing sample classification performance and rejecting high-risk samples for manual labeling. We evaluated the joint detection and drift mitigation policy learned by the DRL-based Malware Detection (DRMD) agent through time-aware evaluations on Android malware datasets subject to realistic drift requiring multi-year performance stability. The policies learned under these conditions achieve a higher Area Under Time (AUT) performance compared to standard classification approaches used in the domain, showing improved resilience to concept drift. Specifically, the DRMD agent achieved an average AUT improvement of 8.66 and 10.90 for the classification-only and classification-rejection policies, respectively. Our results demonstrate for the first time that DRL can facilitate effective malware detection and improved resiliency to concept drift in the dynamic setting of Android malware detection.
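For readers unfamiliar with the AUT metric mentioned above, the sketch below computes it as it is commonly defined in time-aware malware evaluation: the trapezoidal area under a per-period metric curve, normalized by the number of intervals. The function name `aut` and the monthly F1 values are illustrative assumptions, not figures or code from this work.

```python
def aut(scores):
    """Area Under Time: normalized trapezoidal area under a per-period metric curve.

    `scores` holds one metric value (e.g. F1) per evaluation period, in time order.
    """
    if len(scores) < 2:
        raise ValueError("need at least two evaluation periods")
    # Sum the trapezoid areas between consecutive periods (unit spacing assumed).
    area = sum((a + b) / 2 for a, b in zip(scores, scores[1:]))
    # Normalize by the number of intervals so a constant metric of 1.0 gives AUT = 1.0.
    return area / (len(scores) - 1)


# Hypothetical monthly F1 scores for a detector that degrades under drift.
monthly_f1 = [0.90, 0.80, 0.70, 0.60]
print(f"AUT = {aut(monthly_f1):.3f}")
```

A detector whose metric stays flat over time scores the same AUT as its per-period value, while one that decays is penalized in proportion to how early and how far it drops.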
Supplementary Material
We use the PyTorch framework for our experiments. Similar to TD3, we implement our GRU-ODE in SAC. In this ablation study, we ask two questions in relation to numerical integration. Thus, simple numerical solvers are enough. We evaluate the time costs of different baselines on Walker-P environments.
Logic Gate Neural Networks are Good for Verification
Kresse, Fabian, Yu, Emily, Lampert, Christoph H., Henzinger, Thomas A.
Learning-based systems are increasingly deployed across various domains, yet the complexity of traditional neural networks poses significant challenges for formal verification. Unlike conventional neural networks, learned Logic Gate Networks (LGNs) replace multiplications with Boolean logic gates, yielding a sparse, netlist-like architecture that is inherently more amenable to symbolic verification, while still delivering promising performance. In this paper, we introduce a SAT encoding for verifying global robustness and fairness in LGNs. We evaluate our method on five benchmark datasets, including a newly constructed 5-class variant, and find that LGNs are both verification-friendly and maintain strong predictive performance.
Consistency of Feature Attribution in Deep Learning Architectures for Multi-Omics
Claborne, Daniel, Flores, Javier, Erwin, Samantha, Durell, Luke, Richardson, Rachel, Fore, Ruby, Bramer, Lisa
Machine and deep learning have grown in popularity and use in biological research over the last decade but still present challenges in interpretability of the fitted model. The development and use of metrics to determine features driving predictions and increase model interpretability continues to be an open area of research. We investigate the use of Shapley Additive Explanations (SHAP) on a multi-view deep learning model applied to multi-omics data for the purposes of identifying biomolecules of interest. Rankings of features via these attribution methods are compared across various architectures to evaluate consistency of the method. We perform multiple computational experiments to assess the robustness of SHAP and investigate modeling approaches and diagnostics to increase and measure the reliability of the identification of important features. Accuracy of a random-forest model fit on subsets of features selected as being most influential as well as clustering quality using only these features are used as a measure of effectiveness of the attribution method. Our findings indicate that the rankings of features resulting from SHAP are sensitive to the choice of architecture as well as different random initializations of weights, suggesting caution when using attribution methods on multi-view deep learning models applied to multi-omics data. We present an alternative, simple method to assess the robustness of identification of important biomolecules.
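One simple way to quantify how consistent feature rankings are across architectures or random seeds is the overlap of their top-k features. The sketch below uses Jaccard similarity on top-k sets. This particular measure, the function names, and the toy attribution scores are assumptions for illustration and are not necessarily the diagnostics used in the study.

```python
def top_k(scores, k):
    """Return the set of the k feature names with the largest absolute attribution."""
    ranked = sorted(scores, key=lambda name: abs(scores[name]), reverse=True)
    return set(ranked[:k])


def jaccard_top_k(scores_a, scores_b, k):
    """Jaccard similarity between the top-k feature sets of two attribution runs."""
    a, b = top_k(scores_a, k), top_k(scores_b, k)
    union = a | b
    if not union:
        return 1.0  # two empty selections are trivially identical
    return len(a & b) / len(union)


# Hypothetical mean attribution values per feature from two training runs.
run_1 = {"gene_a": 0.9, "gene_b": 0.5, "protein_c": -0.4, "lipid_d": 0.1}
run_2 = {"gene_a": 0.8, "gene_b": 0.05, "protein_c": 0.6, "lipid_d": -0.3}
print(f"top-2 overlap: {jaccard_top_k(run_1, run_2, k=2):.2f}")
```

A value near 1 means both runs highlight the same features, while a value near 0 signals the kind of sensitivity to architecture or initialization that the abstract warns about.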
Real-Time Moving Flock Detection in Pedestrian Trajectories Using Sequential Deep Learning Models
Sanjjamts, Amartaivan, Morita, Hiroshi, Enkhtogtokh, Togootogtokh
The analysis of pedestrian trajectories has become an essential aspect of understanding human mobility patterns in various environments such as urban spaces, transportation systems, and public gatherings. In particular, the identification of pedestrian groups or "flocks" moving together in real time is a challenging but crucial task. A flock can be defined as a group of individuals whose movements are highly correlated over time, often indicating a shared goal or destination. Detecting such flocks is not only important for crowd management and safety but also for enhancing the effectiveness of autonomous systems, such as self-driving cars, and improving human-robot interaction. Collective motion in trajectory data can be categorized into different types, including flocks, convoys, and swarms [1]. A flock is a set of agents moving together within a limited spatial region over a specific time interval. A convoy extends this definition by maintaining the same group structure over longer periods, making it more stable in dynamic environments. A swarm represents a more loosely connected group, where individuals exhibit similar movement patterns but do not necessarily maintain fixed spatial relationships. In this study, we focus on moving flock detection, where groups of pedestrians dynamically form and dissolve while moving together over short time intervals.
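The flock definition above (agents staying within a limited spatial region over a time interval) can be illustrated with a purely geometric check. This is a minimal sketch and not the sequential deep learning approach of the study: the function `is_flock`, the centroid-based radius test, and the parameters `radius` and `min_steps` are all assumptions chosen for illustration.

```python
import math


def is_flock(trajectories, radius, min_steps):
    """Check whether all agents stay within `radius` of their common centroid
    for at least `min_steps` consecutive time steps.

    trajectories: dict mapping agent id -> list of (x, y) positions, one per step.
    """
    tracks = list(trajectories.values())
    if len(tracks) < 2:
        return False  # a flock needs at least two agents
    n_steps = min(len(track) for track in tracks)
    run = 0  # length of the current streak of "together" steps
    for t in range(n_steps):
        points = [track[t] for track in tracks]
        cx = sum(x for x, _ in points) / len(points)
        cy = sum(y for _, y in points) / len(points)
        together = all(math.hypot(x - cx, y - cy) <= radius for x, y in points)
        run = run + 1 if together else 0
        if run >= min_steps:
            return True
    return False


# Hypothetical data: two pedestrians walking side by side, a third veering away.
walkers = {
    "p1": [(float(t), 0.0) for t in range(5)],
    "p2": [(float(t), 0.5) for t in range(5)],
}
print(is_flock(walkers, radius=1.0, min_steps=3))
walkers["p3"] = [(float(t), 3.0 * t) for t in range(5)]
print(is_flock(walkers, radius=1.0, min_steps=3))
```

A convoy or swarm check would differ mainly in how long the grouping must persist and whether the spatial constraint is enforced at every step.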
MuJoCo Playground
Zakka, Kevin, Tabanpour, Baruch, Liao, Qiayuan, Haiderbhai, Mustafa, Holt, Samuel, Luo, Jing Yuan, Allshire, Arthur, Frey, Erik, Sreenath, Koushil, Kahrs, Lueder A., Sferrazza, Carmelo, Tassa, Yuval, Abbeel, Pieter
We introduce MuJoCo Playground, a fully open-source framework for robot learning built with MJX, with the express goal of streamlining simulation, training, and sim-to-real transfer onto robots. With a simple "pip install playground", researchers can train policies in minutes on a single GPU. Playground supports diverse robotic platforms, including quadrupeds, humanoids, dexterous hands, and robotic arms, enabling zero-shot sim-to-real transfer from both state and pixel inputs. This is achieved through an integrated stack comprising a physics engine, batch renderer, and training environments. Along with video results, the entire framework is freely available at playground.mujoco.org
Training Multi-Layer Binary Neural Networks With Local Binary Error Signals
Colombo, Luca, Pittorino, Fabrizio, Roveri, Manuel
Binary Neural Networks (BNNs) hold the potential for significantly reducing computational complexity and memory demand in machine and deep learning. However, most successful training algorithms for BNNs rely on quantization-aware floating-point Stochastic Gradient Descent (SGD), with full-precision hidden weights used during training. The binarized weights are only used at inference time, hindering the full exploitation of binary operations during the training process. In contrast to the existing literature, we introduce, for the first time, a multi-layer training algorithm for BNNs that does not require the computation of back-propagated full-precision gradients. Specifically, the proposed algorithm is based on local binary error signals and binary weight updates, employing integer-valued hidden weights that serve as a synaptic metaplasticity mechanism, thereby establishing it as a neurobiologically plausible algorithm. The binary-native and gradient-free algorithm proposed in this paper is capable of training binary multi-layer perceptrons (BMLPs) with binary inputs, weights, and activations, by using exclusively XNOR, Popcount, and increment/decrement operations, hence effectively paving the way for a new class of operation-optimized training algorithms. Experimental results on BMLPs fully trained in a binary-native and gradient-free manner on multi-class image classification benchmarks demonstrate an accuracy improvement of up to +13.36% compared to the fully binary state-of-the-art solution, showing minimal accuracy degradation compared to the same architecture trained with full-precision SGD and floating-point weights, activations, and inputs. The proposed algorithm is made available to the scientific community as a public repository.
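To make the XNOR and Popcount primitives concrete, the sketch below computes the dot product of two {-1, +1} vectors using only bitwise operations, which is the standard identity behind binary-network arithmetic. It illustrates the forward primitive only and does not reproduce the training algorithm proposed in the paper; the helper names `pack` and `binary_dot` and the bit encoding (bit 1 for +1, bit 0 for -1) are assumptions.

```python
def pack(vec):
    """Pack a list of +1/-1 values into an int (bit i is 1 when vec[i] == +1)."""
    bits = 0
    for i, value in enumerate(vec):
        if value == 1:
            bits |= 1 << i
    return bits


def binary_dot(a_bits, b_bits, n):
    """Dot product of two {-1, +1}^n vectors packed as n-bit integers."""
    mask = (1 << n) - 1
    # XNOR marks positions where the two vectors agree (product is +1).
    xnor = ~(a_bits ^ b_bits) & mask
    matches = bin(xnor).count("1")  # popcount
    # Agreements contribute +1 and disagreements -1: matches - (n - matches).
    return 2 * matches - n


a = [1, -1, 1, 1]
b = [1, 1, -1, 1]
print(binary_dot(pack(a), pack(b), len(a)))  # same value as sum(x * y for x, y in zip(a, b))
```

Because the result depends only on a count of matching bits, a whole neuron's pre-activation can be computed without any multiplications or floating-point arithmetic.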