Goto

Collaborating Authors

 batchnorm1d


SupplementaryMaterialsforHouseofCans: Covert TransmissionofInternalDatasetsviaCapacity-Aware NeuronSteganography

Neural Information Processing Systems

However, considering the ever-evolving paradigms in deep learning, employees with ulterior motivesmay fabricate reasons such asthe requirements ofdata augmentation [6]orthe purpose of multimodal learning [3] to apply for relevant and irrelevant private datasets, which is common in social engineering [4].


Industrial Steel Slag Flow Data Loading Method for Deep Learning Applications

arXiv.org Artificial Intelligence

Steel casting processes are vulnerable to financial losses due to slag flow contamination, making accurate slag flow condition detection essential. This study introduces a novel cross-domain diagnostic method using vibration data collected from an industrial steel foundry to identify various stages of slag flow. A hybrid deep learning model combining one-dimensional convolutional neural networks and long short-term memory layers is implemented, tested, and benchmarked against a standard one-dimensional convolutional neural network. The proposed method processes raw time-domain vibration signals from accelerometers and evaluates performance across 16 distinct domains using a realistic cross-domain dataset split. Results show that the hybrid convolutional neural network and long short-term memory architecture, when combined with root mean square preprocessing and a selective embedding data loading strategy, achieves robust classification accuracy, outperforming traditional models and loading techniques. The highest test accuracy of 99.10 +/- 0.30 demonstrates the method's capability for generalization and industrial relevance. This work presents a practical and scalable solution for real-time slag flow monitoring, contributing to improved reliability and operational efficiency in steel manufacturing.



Decoding EEG Speech Perception with Transformers and VAE-based Data Augmentation

arXiv.org Artificial Intelligence

Decoding speech from non-invasive brain signals, such as electroencephalography (EEG), has the potential to advance brain-computer interfaces (BCIs), with applications in silent communication and assistive technologies for individuals with speech impairments. However, EEG-based speech decoding faces major challenges, such as noisy data, limited datasets, and poor performance on complex tasks like speech perception. This study attempts to address these challenges by employing variational autoencoders (VAEs) for EEG data augmentation to improve data quality and applying a state-of-the-art (SOTA) sequence-to-sequence deep learning architecture, originally successful in electromyography (EMG) tasks, to EEG-based speech decoding. Additionally, we adapt this architecture for word classification tasks. Using the Brennan dataset, which contains EEG recordings of subjects listening to narrated speech, we preprocess the data and evaluate both classification and sequence-to-sequence models for EEG-to-words/sentences tasks. Our experiments show that VAEs have the potential to reconstruct artificial EEG data for augmentation. Meanwhile, our sequence-to-sequence model achieves more promising performance in generating sentences compared to our classification model, though both remain challenging tasks. These findings lay the groundwork for future research on EEG speech perception decoding, with possible extensions to speech production tasks such as silent or imagined speech.


Sampling from Bayesian Neural Network Posteriors with Symmetric Minibatch Splitting Langevin Dynamics

arXiv.org Machine Learning

We propose a scalable kinetic Langevin dynamics algorithm for sampling parameter spaces of big data and AI applications. Our scheme combines a symmetric forward/backward sweep over minibatches with a symmetric discretization of Langevin dynamics. For a particular Langevin splitting method (UBU), we show that the resulting Symmetric Minibatch Splitting-UBU (SMS-UBU) integrator has bias $O(h^2 d^{1/2})$ in dimension $d>0$ with stepsize $h>0$, despite only using one minibatch per iteration, thus providing excellent control of the sampling bias as a function of the stepsize. We apply the algorithm to explore local modes of the posterior distribution of Bayesian neural networks (BNNs) and evaluate the calibration performance of the posterior predictive probabilities for neural networks with convolutional neural network architectures for classification problems on three different datasets (Fashion-MNIST, Celeb-A and chest X-ray). Our results indicate that BNNs sampled with SMS-UBU can offer significantly better calibration performance compared to standard methods of training and stochastic weight averaging.


Sketch-Plan-Generalize: Continual Few-Shot Learning of Inductively Generalizable Spatial Concepts

arXiv.org Artificial Intelligence

Our goal is to enable embodied agents to learn inductively generalizable spatial concepts, e.g., learning staircase as an inductive composition of towers of increasing height. Given a human demonstration, we seek a learning architecture that infers a succinct ${program}$ representation that explains the observed instance. Additionally, the approach should generalize inductively to novel structures of different sizes or complex structures expressed as a hierarchical composition of previously learned concepts. Existing approaches that use code generation capabilities of pre-trained large (visual) language models, as well as purely neural models, show poor generalization to a-priori unseen complex concepts. Our key insight is to factor inductive concept learning as (i) ${\it Sketch:}$ detecting and inferring a coarse signature of a new concept (ii) ${\it Plan:}$ performing MCTS search over grounded action sequences (iii) ${\it Generalize:}$ abstracting out grounded plans as inductive programs. Our pipeline facilitates generalization and modular reuse, enabling continual concept learning. Our approach combines the benefits of the code generation ability of large language models (LLM) along with grounded neural representations, resulting in neuro-symbolic programs that show stronger inductive generalization on the task of constructing complex structures in relation to LLM-only and neural-only approaches. Furthermore, we demonstrate reasoning and planning capabilities with learned concepts for embodied instruction following.


Learning Galaxy Intrinsic Alignment Correlations

arXiv.org Artificial Intelligence

The intrinsic alignments (IA) of galaxies, regarded as a contaminant in weak lensing analyses, represents the correlation of galaxy shapes due to gravitational tidal interactions and galaxy formation processes. As such, understanding IA is paramount for accurate cosmological inferences from weak lensing surveys; however, one limitation to our understanding and mitigation of IA is expensive simulation-based modeling. In this work, we present a deep learning approach to emulate galaxy position-position ($\xi$), position-orientation ($\omega$), and orientation-orientation ($\eta$) correlation function measurements and uncertainties from halo occupation distribution-based mock galaxy catalogs. We find strong Pearson correlation values with the model across all three correlation functions and further predict aleatoric uncertainties through a mean-variance estimation training procedure. $\xi(r)$ predictions are generally accurate to $\leq10\%$. Our model also successfully captures the underlying signal of the noisier correlations $\omega(r)$ and $\eta(r)$, although with a lower average accuracy. We find that the model performance is inhibited by the stochasticity of the data, and will benefit from correlations averaged over multiple data realizations. Our code will be made open source upon journal publication.


DeepGraviLens: a Multi-Modal Architecture for Classifying Gravitational Lensing Data

arXiv.org Artificial Intelligence

In astrophysics, a gravitational lens is a matter distribution (e.g., a black hole) able to bend the trajectory of transiting light, similar to an optical lens. Such apparent distortion is caused by the curvature of the geometry of space-time around the massive body acting as a lens, a phenomenon that forces the light to travel along the geodesics (i.e., the shortest paths in the curved space-time). Strong and weak gravitational lensing focus on the effects produced by particularly massive bodies (e.g., galaxies and black holes), while microlensing addresses the consequences produced by lighter entities (e.g., stars). This research proposes an approach to automatically classify strong gravitational lenses with respect to the lensed objects and to their evolution through time. Automatically finding and classifying gravitational lenses is a major challenge in astrophysics. As [103, 91, 39, 44] show, gravitational lensing systems can be complex, ubiquitous and hard to detect without computer-aided data processing. The volumes of data gathered by contemporary instruments make manual inspection unfeasible. As an example, the Vera C. Rubin Observatory is expected to collect petabytes of data [108]. Moreover, strong lensing is involved in major astrophysical problems: studying massive bodies that are too faint to be analyzed with current instrumentation; characterizing the geometry, content and kinematics of the universe; and investigating mass distribution in the galaxy formation process [103].


Clustering-Induced Generative Incomplete Image-Text Clustering (CIGIT-C)

arXiv.org Artificial Intelligence

The target of image-text clustering (ITC) is to find correct clusters by integrating complementary and consistent information of multi-modalities for these heterogeneous samples. However, the majority of current studies analyse ITC on the ideal premise that the samples in every modality are complete. This presumption, however, is not always valid in real-world situations. The missing data issue degenerates the image-text feature learning performance and will finally affect the generalization abilities in ITC tasks. Although a series of methods have been proposed to address this incomplete image text clustering issue (IITC), the following problems still exist: 1) most existing methods hardly consider the distinct gap between heterogeneous feature domains. 2) For missing data, the representations generated by existing methods are rarely guaranteed to suit clustering tasks. 3) Existing methods do not tap into the latent connections both inter and intra modalities. In this paper, we propose a Clustering-Induced Generative Incomplete Image-Text Clustering(CIGIT-C) network to address the challenges above. More specifically, we first use modality-specific encoders to map original features to more distinctive subspaces. The latent connections between intra and inter-modalities are thoroughly explored by using the adversarial generating network to produce one modality conditional on the other modality. Finally, we update the corresponding modalityspecific encoders using two KL divergence losses. Experiment results on public image-text datasets demonstrated that the suggested method outperforms and is more effective in the IITC job.


Bayesian Learning via Neural Schr\"odinger-F\"ollmer Flows

arXiv.org Machine Learning

In this work we explore a new framework for approximate Bayesian inference in large datasets based on stochastic control. We advocate stochastic control as a finite time and low variance alternative to popular steady-state methods such as stochastic gradient Langevin dynamics (SGLD). Furthermore, we discuss and adapt the existing theoretical guarantees of this framework and establish connections to already existing VI routines in SDE-based models.