B'MOJO: Hybrid State Space Realizations of Foundation Models with Eidetic and Fading Memory
We describe a family of architectures to support transductive inference by allowing memory to grow to a finite but a priori unknown bound while making efficient use of finite resources for inference. Current architectures use such resources to represent data either eidetically over a finite span ('context' in Transformers), or fading over an infinite span (in State Space Models, or SSMs). Recent hybrid architectures have combined eidetic and fading memory, but with limitations that do not allow the designer or the learning process to seamlessly modulate the two, nor to extend the eidetic memory span. We leverage ideas from Stochastic Realization Theory to develop a class of models called B'MOJO to seamlessly combine eidetic and fading memory within an elementary composable module. The overall architecture can be used to implement models that can access short-term eidetic memory 'in-context,' permanent structural memory 'in-weights,' fading memory 'in-state,' and long-term eidetic memory 'in-storage' by natively incorporating retrieval from an asynchronously updated memory.
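To make the eidetic/fading distinction concrete, the sketch below is a minimal, hypothetical PyTorch module (not the paper's B'MOJO implementation or released code): a diagonal recurrence whose state decays exponentially stands in for fading memory 'in-state,' and attention restricted to a short sliding window stands in for eidetic memory 'in-context.' All names (HybridMemoryBlock, window, log_decay) and design choices here are illustrative assumptions.

import torch
import torch.nn as nn


class HybridMemoryBlock(nn.Module):
    """Toy hybrid block: an SSM-style recurrent state provides fading memory,
    while attention over the last `window` tokens provides eidetic memory."""

    def __init__(self, dim: int, window: int = 16):
        super().__init__()
        self.window = window
        # Per-channel decay in (0, 1) after the sigmoid: exponentially fading memory.
        self.log_decay = nn.Parameter(torch.randn(dim))
        self.in_proj = nn.Linear(dim, dim)
        self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.out_proj = nn.Linear(2 * dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, dim)
        decay = torch.sigmoid(self.log_decay)
        u = self.in_proj(x)

        # Fading path: sequential diagonal recurrence (illustrative, not a fast scan).
        state = torch.zeros_like(x[:, 0])
        fading = []
        for t in range(x.shape[1]):
            state = decay * state + u[:, t]
            fading.append(state)
        fading = torch.stack(fading, dim=1)

        # Eidetic path: causal attention limited to the last `window` tokens.
        seq = x.shape[1]
        idx = torch.arange(seq, device=x.device)
        mask = (idx[None, :] > idx[:, None]) | (idx[:, None] - idx[None, :] >= self.window)
        eidetic, _ = self.attn(x, x, x, attn_mask=mask)

        # Combine the two memory paths; a learned mixture could modulate them instead.
        return self.out_proj(torch.cat([fading, eidetic], dim=-1))


if __name__ == "__main__":
    block = HybridMemoryBlock(dim=64, window=8)
    y = block(torch.randn(2, 32, 64))
    print(y.shape)  # torch.Size([2, 32, 64])

In this toy setup, widening the attention window lengthens the exact (eidetic) span at higher cost, while the decayed state carries a lossy summary of everything older; the paper's module is designed so these two regimes can be modulated seamlessly rather than fixed by hand.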
Neural Information Processing Systems
May-27-2025, 20:32:52 GMT