S4ND: Modeling Images and Videos as Multidimensional Signals Using State Spaces