Intelligent Neural Networks: From Layered Architectures to Graph-Organized Intelligence

Salomon, Antoine

arXiv.org Artificial Intelligence

Biological neurons exhibit remarkable intelligence: they maintain internal states, communicate selectively with other neurons, and self-organize into complex graphs rather than rigid hierarchical layers. What if artificial intelligence could emerge from similarly intelligent computational units? We introduce Intelligent Neural Networks (INN), a paradigm shift where neurons are first-class entities with internal memory and learned communication patterns, organized in complete graphs rather than sequential layers. Each Intelligent Neuron combines selective state-space dynamics (knowing when to activate) with attention-based routing (knowing to whom to send signals), enabling emergent computation through graph-structured interactions. On the standard Text8 character modeling benchmark, INN achieves 1.705 bits per character (BPC), significantly outperforming a comparable Transformer (2.055 BPC) and matching a highly optimized LSTM baseline. Crucially, a parameter-matched baseline of stacked Mamba blocks fails to converge (>3.4 BPC) under the same training protocol, demonstrating that INN's graph topology provides essential training stability. Ablation studies confirm this: removing inter-neuron communication degrades performance or leads to instability, proving the value of learned neural routing. This work demonstrates that neuron-centric design with graph organization is not merely bio-inspired -- it is computationally effective, opening new directions for modular, interpretable, and scalable neural architectures.



Introducing Interval Neural Networks for Uncertainty-Aware System Identification

Ferah, Mehmet Ali, Kumbasar, Tufan

arXiv.org Artificial Intelligence

System Identification (SysID) is crucial for modeling and understanding dynamical systems using experimental data. While traditional SysID methods emphasize linear models, their inability to fully capture nonlinear dynamics has driven the adoption of Deep Learning (DL) as a more powerful alternative. However, the lack of uncertainty quantification (UQ) in DL-based models poses challenges for reliability and safety, highlighting the necessity of incorporating UQ. This paper introduces a systematic framework for constructing and learning Interval Neural Networks (INNs) to perform UQ in SysID tasks. INNs are derived by transforming the learnable parameters (LPs) of pre-trained neural networks into interval-valued LPs without relying on probabilistic assumptions. By employing interval arithmetic throughout the network, INNs can generate Prediction Intervals (PIs) that capture target coverage effectively. We extend Long Short-Term Memory (LSTM) and Neural Ordinary Differential Equations (Neural ODEs) into Interval LSTM (ILSTM) and Interval NODE (INODE) architectures, providing the mathematical foundations for their application in SysID. To train INNs, we propose a DL framework that integrates a UQ loss function and parameterization tricks to handle constraints arising from interval LPs. We introduce the novel concept of "elasticity" to characterize the underlying causes of uncertainty, and validate ILSTM and INODE in SysID experiments, demonstrating their effectiveness.
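The core mechanism this abstract describes, propagating intervals over the learnable parameters through the network via interval arithmetic, can be sketched for a single linear layer. The NumPy function below is a minimal illustration under our own assumptions (layer shapes, function names, and the bound-propagation rule are illustrative, not the paper's ILSTM/INODE formulation):

```python
import numpy as np

def interval_linear(x_lo, x_hi, W_lo, W_hi, b_lo, b_hi):
    """Interval-arithmetic forward pass of a linear layer y = W x + b,
    where inputs, weights, and biases are all intervals [lo, hi]."""
    # Each elementwise product of two intervals is bounded by the min/max
    # of the four corner products (standard interval multiplication).
    cands = np.stack([W_lo * x_lo, W_lo * x_hi, W_hi * x_lo, W_hi * x_hi])
    y_lo = cands.min(axis=0).sum(axis=1) + b_lo
    y_hi = cands.max(axis=0).sum(axis=1) + b_hi
    # A monotone activation (e.g. ReLU, tanh) preserves interval ordering,
    # so it could simply be applied endpoint-wise afterwards.
    return y_lo, y_hi
```

With degenerate intervals (lo equal to hi everywhere) the layer reduces to an ordinary linear map; widening the weight intervals widens the resulting prediction interval, which is the quantity the paper's UQ loss would then calibrate to a target coverage.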


Rigorous Probabilistic Guarantees for Robust Counterfactual Explanations

Marzari, Luca, Leofante, Francesco, Cicalese, Ferdinando, Farinelli, Alessandro

arXiv.org Artificial Intelligence

We study the problem of assessing the robustness of counterfactual explanations for deep learning models. We focus on $\textit{plausible model shifts}$ altering model parameters and propose a novel framework to reason about the robustness property in this setting. To motivate our solution, we begin by showing for the first time that computing the robustness of counterfactuals with respect to plausible model shifts is NP-complete. As this (practically) rules out the existence of scalable algorithms for exactly computing robustness, we propose a novel probabilistic approach which is able to provide tight estimates of robustness with strong guarantees while preserving scalability. Remarkably, and differently from existing solutions targeting plausible model shifts, our approach does not impose requirements on the network to be analyzed, thus enabling robustness analysis on a wider range of architectures. Experiments on four binary classification datasets indicate that our method improves the state of the art in generating robust explanations, outperforming existing methods on a range of metrics.


Engineering software 2.0 by interpolating neural networks: unifying training, solving, and calibration

Park, Chanwook, Saha, Sourav, Guo, Jiachen, Xie, Xiaoyu, Mojumder, Satyajit, Bessa, Miguel A., Qian, Dong, Chen, Wei, Wagner, Gregory J., Cao, Jian, Liu, Wing Kam

arXiv.org Artificial Intelligence

The evolution of artificial intelligence (AI) and neural network theories has revolutionized the way software is programmed, shifting it from hard-coded sequences of instructions to vast neural networks. However, this transition in engineering software has faced challenges such as data scarcity, multi-modality of data, low model accuracy, and slow inference. Here, we propose a new network based on interpolation theories and tensor decomposition, the interpolating neural network (INN). Instead of interpolating training data, a common notion in computer science, INN interpolates interpolation points in the physical space whose coordinates and values are trainable. It can also extrapolate if the interpolation points reside outside of the range of training data and the interpolation functions have a larger support domain. INN features orders of magnitude fewer trainable parameters, faster training, a smaller memory footprint, and higher model accuracy compared to feed-forward neural networks (FFNN) or physics-informed neural networks (PINN). INN is poised to usher in Engineering Software 2.0, a unified neural network that spans various domains of space, time, parameters, and initial/boundary conditions. This has previously been computationally prohibitive due to the exponentially growing number of trainable parameters, easily exceeding the parameter size of ChatGPT, which is over 1 trillion. INN addresses this challenge by leveraging tensor decomposition and tensor products, with an adaptable network architecture.
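The central idea, interpolating trainable points placed in physical space rather than interpolating training data, can be illustrated in one dimension. The sketch below fixes interpolation points on a uniform grid and trains only their values by gradient descent, using a piecewise-linear (hat-function) basis; the grid, target function, and training loop are our illustrative assumptions, not the paper's tensor-decomposition formulation (which also makes the coordinates trainable):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy training data: samples of a smooth target function (assumed example)
xs = rng.uniform(0.0, 1.0, 200)
ys = np.sin(2 * np.pi * xs)

# Interpolation points on a uniform grid; only their values are trainable
# here, so the model has just n_knots parameters.
n_knots = 9
knots = np.linspace(0.0, 1.0, n_knots)
h = knots[1] - knots[0]
values = np.zeros(n_knots)

def basis(x):
    # Hat (piecewise-linear) basis functions: a partition of unity on the grid
    return np.maximum(0.0, 1.0 - np.abs(x[:, None] - knots[None, :]) / h)

def predict(x, values):
    return basis(x) @ values

# Gradient descent on the mean-squared error w.r.t. the knot values
Phi = basis(xs)
for _ in range(2000):
    resid = Phi @ values - ys
    values -= 0.5 * (2.0 / len(xs)) * (Phi.T @ resid)
```

After training, the nine knot values reproduce the target to within the resolution of the grid, which is the sense in which such a model needs orders of magnitude fewer parameters than a dense FFNN for the same accuracy.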


LlaMaVAE: Guiding Large Language Model Generation via Continuous Latent Sentence Spaces

Zhang, Yingji, Carvalho, Danilo S., Pratt-Hartmann, Ian, Freitas, André

arXiv.org Artificial Intelligence

Deep generative neural networks, such as Variational AutoEncoders (VAEs), offer an opportunity to better understand and control language models from the perspective of sentence-level latent spaces. To combine the controllability of VAE latent spaces with the state-of-the-art performance of recent large language models (LLMs), we present in this work LlaMaVAE, which combines expressive encoder and decoder models (sentenceT5 and LlaMA) with a VAE architecture, aiming to provide better text generation control to LLMs. In addition, to conditionally guide the VAE generation, we investigate a new approach based on flow-based invertible neural networks (INNs) named Invertible CVAE. Experimental results reveal that LlaMaVAE can outperform the previous state-of-the-art VAE language model, Optimus, across various tasks, including language modelling, semantic textual similarity and definition modelling. Qualitative analysis on interpolation and traversal experiments also indicates an increased degree of semantic clustering and geometric consistency, which enables better generation control.


Distributionally Robust Statistical Verification with Imprecise Neural Networks

Dutta, Souradeep, Caprio, Michele, Lin, Vivian, Cleaveland, Matthew, Jang, Kuk Jin, Ruchkin, Ivan, Sokolsky, Oleg, Lee, Insup

arXiv.org Artificial Intelligence

A particularly challenging problem in AI safety is providing guarantees on the behavior of high-dimensional autonomous systems. Verification approaches centered around reachability analysis fail to scale, and purely statistical approaches are constrained by the distributional assumptions about the sampling process. Instead, we pose a distributionally robust version of the statistical verification problem for black-box systems, where our performance guarantees hold over a large family of distributions. This paper proposes a novel approach based on a combination of active learning, uncertainty quantification, and neural network verification. A central piece of our approach is an ensemble technique called Imprecise Neural Networks, which provides the uncertainty to guide active learning. The active learning uses the exhaustive neural-network verification tool Sherlock to collect samples. An evaluation on multiple physical simulators in the OpenAI Gym MuJoCo environments with reinforcement-learned controllers demonstrates that our approach can provide useful and scalable guarantees for high-dimensional systems.
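As a toy illustration of the ensemble idea behind Imprecise Neural Networks, a predictor whose output is the interval spanned by its members, and whose interval width can guide active learning, consider the sketch below. The one-hidden-layer members and the widest-interval acquisition rule are hypothetical stand-ins for the paper's actual models and procedure:

```python
import numpy as np

rng = np.random.default_rng(2)

# An "imprecise" regressor: K randomly-initialized one-hidden-layer nets;
# its prediction at x is the interval [min_k f_k(x), max_k f_k(x)].
K = 5
models = [(rng.normal(size=(8, 1)), rng.normal(size=(1, 8))) for _ in range(K)]

def member_predict(params, x):
    W1, W2 = params
    return float(W2 @ np.tanh(W1 @ x[None, :]))

def imprecise_predict(x):
    preds = np.array([member_predict(p, x) for p in models])
    return preds.min(), preds.max()

# Active-learning acquisition: query the candidate input where the members
# disagree the most, i.e. where the prediction interval is widest.
candidates = np.linspace(-2.0, 2.0, 41)
widths = np.array([np.ptp([member_predict(p, np.array([c])) for p in models])
                   for c in candidates])
next_x = candidates[np.argmax(widths)]
```

In the paper's setting the queried samples would additionally be checked with the Sherlock verifier; here the acquisition step only shows how interval width serves as the uncertainty signal.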


CommIN: Semantic Image Communications as an Inverse Problem with INN-Guided Diffusion Models

Chen, Jiakang, You, Di, Gündüz, Deniz, Dragotti, Pier Luigi

arXiv.org Artificial Intelligence

Joint source-channel coding schemes based on deep neural networks (DeepJSCC) have recently achieved remarkable performance for wireless image transmission. However, these methods usually focus only on the distortion of the reconstructed signal at the receiver side with respect to the source at the transmitter side, rather than the perceptual quality of the reconstruction which carries more semantic information. As a result, severe perceptual distortion can be introduced under extreme conditions such as low bandwidth and low signal-to-noise ratio. In this work, we propose CommIN, which views the recovery of high-quality source images from degraded reconstructions as an inverse problem. To address this, CommIN combines Invertible Neural Networks (INN) with diffusion models, aiming for superior perceptual quality. Through experiments, we show that our CommIN significantly improves the perceptual quality compared to DeepJSCC under extreme conditions and outperforms other inverse problem approaches used in DeepJSCC.


On the Approximation of Bi-Lipschitz Maps by Invertible Neural Networks

Jin, Bangti, Zhou, Zehui, Zou, Jun

arXiv.org Artificial Intelligence

Invertible neural networks (INNs) are a class of neural network (NN) architectures that are invertible by design, via special invertible layers called flow layers. INNs often enjoy tractable numerical algorithms to compute the inverse map and Jacobian determinant, e.g., with explicit inversion formulas. These distinct features have made them very attractive for a variety of machine learning tasks, e.g., generative modeling [16, 31, 29], probabilistic modeling [38, 17, 23, 6], solving inverse problems [2, 1, 3], modeling nonlinear dynamics [9], and point cloud generation [44]. There are several different classes of INNs, including invertible residual networks (iResNet) [7, 43], neural ordinary differential equations (NODEs) [11, 13, 18], and coupling-based neural networks [16, 17, 25, 31, 2]. For iResNet, Behrmann et al. [7] leveraged the viewpoint of ResNets as an Euler discretization of ODEs and proved that the standard ResNet architecture can be made invertible by adding a simple normalization step to control the Lipschitz constant of the NN during training. The inverse is not available in closed form but can be obtained through a fixed-point iteration. Chen et al. [13] proposed using black-box ODE solvers as a model component and developed a class of new models, NODEs, for time-series modeling, supervised learning, density estimation, etc. NODEs indirectly model an invertible function by transforming an input vector through an ordinary differential equation (ODE). Dupont and Doucet [18] introduced a class of more expressive and empirically stable models, augmented neural ODEs (ANODEs), which have a lower computational cost.
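The iResNet construction summarized above is straightforward to demonstrate: constrain the Lipschitz constant of the residual branch below one, and the residual block becomes invertible via a Banach fixed-point iteration. The NumPy sketch below uses a direct spectral-norm rescaling of a single weight matrix, a simplification of the power-iteration-based spectral normalization used in practice:

```python
import numpy as np

rng = np.random.default_rng(1)

# Residual block f(x) = x + g(x); it is invertible whenever Lip(g) < 1.
W = rng.normal(size=(4, 4))
W = 0.7 * W / np.linalg.norm(W, 2)   # rescale so the spectral norm is 0.7

def g(x):
    # tanh is 1-Lipschitz, so Lip(g) <= ||W||_2 = 0.7 < 1
    return np.tanh(W @ x)

def forward(x):
    return x + g(x)

def inverse(y, n_iter=50):
    # Fixed-point iteration x <- y - g(x): a contraction with factor 0.7,
    # so the error shrinks geometrically toward the unique preimage of y.
    x = y.copy()
    for _ in range(n_iter):
        x = y - g(x)
    return x
```

With a contraction factor of 0.7, fifty iterations reduce the initial error by roughly 0.7^50, so the recovered input agrees with the original to high precision; this is the fixed-point inversion the text refers to as not being available in closed form.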