Mamba stacks mixer layers, which are the equivalent of attention layers. The core logic of Mamba lives in the MambaMixer class.
It begins with a linear projection that expands the input embeddings.
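To make the expansion step concrete, here is a minimal sketch in plain Python. It is not the MambaMixer implementation. It assumes an expansion factor of 2 and that the projected vector is split into a hidden branch and a gate branch, as described in the Mamba paper. The names `make_linear`, `linear`, `expand_and_split`, `d_model`, and `d_inner` are hypothetical and chosen only for illustration.

```python
import random


def make_linear(in_features, out_features, seed=0):
    """Build a (out_features x in_features) weight matrix of small random values."""
    rng = random.Random(seed)
    return [
        [rng.uniform(-0.1, 0.1) for _ in range(in_features)]
        for _ in range(out_features)
    ]


def linear(weight, vector):
    """Apply a bias-free linear map: one dot product per output row."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weight]


def expand_and_split(embeddings, weight, d_inner):
    """Project each token embedding to 2 * d_inner, then split into two branches.

    Assumption: the first half feeds the sequence-mixing path and the
    second half acts as a gate.
    """
    projected = [linear(weight, token) for token in embeddings]
    hidden = [p[:d_inner] for p in projected]
    gate = [p[d_inner:] for p in projected]
    return hidden, gate


d_model = 4                 # embedding size (illustrative)
expand = 2                  # assumed expansion factor
d_inner = expand * d_model  # width of each branch after expansion

in_proj_weight = make_linear(d_model, 2 * d_inner)

# Three toy token embeddings of width d_model.
embeddings = [[0.1 * (i + j) for j in range(d_model)] for i in range(3)]

hidden, gate = expand_and_split(embeddings, in_proj_weight, d_inner)
print(len(hidden), len(hidden[0]), len(gate[0]))  # 3 8 8
```

Each of the three tokens goes from width 4 to two branches of width 8. In a real model the same projection would normally be a single batched matrix multiply in a tensor library rather than Python loops.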