Stuffed Mamba: State Collapse and State Capacity of RNN-Based Long-Context Modeling
