Mamba Paper Fundamentals Explained
This model inherits from PreTrainedModel. Check the superclass documentation for the generic methods the library implements for all its models (such as downloading or saving, resizing the input embeddings, and pruning heads). If a previous state is passed along, the model uses it in all of the blocks (which is able to give th