MambaTab
Overview
MambaTab is exposed as a stable Mamba-family model, but the current DeepTab forward path behaves as a lightweight projected-feature network: it concatenates input features, projects them to d_model, normalizes and activates the representation, then predicts with MLPhead.
Use it as a compact baseline in the current release. For an active Mamba sequence model over feature tokens, prefer Mambular or MambAttention.
Architectural Details
The current MambaTab forward path is:
1. Concatenate all input tensors.
2. Apply `initial_layer` from input dimension to `d_model`.
3. Temporarily unsqueeze along the configured axis, apply `LayerNorm`, and apply `embedding_activation`.
4. Squeeze back to a row representation.
5. Predict with `MLPhead`.
features -> concat -> Linear(input_dim, d_model) -> LayerNorm -> activation -> MLPhead
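The pipeline above can be sketched in plain PyTorch. This is an illustrative stand-in, not DeepTab's actual class: the name `ProjectedFeatureNet`, the attribute `axis`, the ReLU activation, and the two-layer head are assumptions chosen to mirror the described steps.

```python
import torch
import torch.nn as nn


class ProjectedFeatureNet(nn.Module):
    """Hypothetical stand-in for the forward path described above."""

    def __init__(self, input_dim: int, d_model: int = 64, out_dim: int = 1, axis: int = 1):
        super().__init__()
        self.axis = axis  # assumed name for the temporary unsqueeze axis
        self.initial_layer = nn.Linear(input_dim, d_model)
        self.norm = nn.LayerNorm(d_model)
        self.activation = nn.ReLU()  # stand-in for embedding_activation
        # Stand-in for MLPhead: one hidden layer, then the output layer.
        self.head = nn.Sequential(
            nn.Linear(d_model, 128), nn.ReLU(), nn.Linear(128, out_dim)
        )

    def forward(self, *features: torch.Tensor) -> torch.Tensor:
        x = torch.cat(features, dim=1)   # concat -> (batch, input_dim)
        x = self.initial_layer(x)        # (batch, d_model)
        x = x.unsqueeze(self.axis)       # (batch, 1, d_model) when axis=1
        x = self.activation(self.norm(x))
        x = x.squeeze(self.axis)         # back to (batch, d_model)
        return self.head(x)


num = torch.randn(8, 5)  # e.g. 5 numerical columns
cat = torch.randn(8, 3)  # e.g. 3 encoded categorical columns
model = ProjectedFeatureNet(input_dim=8)
out = model(num, cat)
print(out.shape)  # torch.Size([8, 1])
```

Note that no sequence-mixing step appears anywhere in this path: each row is processed as a single vector, which is why the model behaves like a projected-feature MLP.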
Main Building Blocks
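| Component | DeepTab implementation | Role |
|---|---|---|
| Input path | Concatenation of input tensors | Uses raw/preprocessed feature tensors directly. |
| Projection | `initial_layer` | Maps input vector to `d_model`. |
| Normalization | `LayerNorm` | Stabilizes projected representation. |
| Head | `MLPhead` | Produces predictions. |
| Mamba block | `self.mamba` | Instantiated, but not applied in the current forward path. |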
Implementation Notes
The presence of Mamba-related config fields (d_state, d_conv, expand_factor, mamba_version, bidirectional) does not mean they affect the current forward pass. They configure the instantiated self.mamba module, but that module is not applied before the head.
This distinction matters for research comparisons: document the DeepTab version and verify the forward path if you report MambaTab as a state-space model.
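One way to verify which submodules actually run is to attach PyTorch forward hooks and record which ones fire. The sketch below uses a hypothetical `Toy` module in place of a real DeepTab model; the helper name `modules_called` is also an assumption, and how to reach the underlying `nn.Module` of a DeepTab estimator is not covered here.

```python
import torch
import torch.nn as nn


def modules_called(model: nn.Module, *inputs: torch.Tensor) -> set:
    """Return the names of submodules whose forward ran for the given inputs."""
    called = set()
    handles = []
    for name, module in model.named_modules():
        if name:  # skip the root module itself
            handles.append(
                # The hook returns None, so the module output is left unchanged.
                module.register_forward_hook(lambda m, i, o, n=name: called.add(n))
            )
    try:
        with torch.no_grad():
            model(*inputs)
    finally:
        for handle in handles:
            handle.remove()
    return called


class Toy(nn.Module):
    """Hypothetical model with an instantiated but unused block."""

    def __init__(self):
        super().__init__()
        self.proj = nn.Linear(4, 8)
        self.mamba = nn.Linear(8, 8)  # placeholder for an unused Mamba block
        self.head = nn.Linear(8, 1)

    def forward(self, x):
        return self.head(self.proj(x))  # self.mamba is never applied


called = modules_called(Toy(), torch.randn(2, 4))
print(sorted(called))  # ['head', 'proj']
```

If the name of the Mamba submodule is missing from the returned set, its config fields cannot influence predictions in that version.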
Practical Config
    from deeptab.configs import MambaTabConfig, PreprocessingConfig, TrainerConfig
    from deeptab.models import MambaTabRegressor

    model = MambaTabRegressor(
        model_config=MambaTabConfig(
            d_model=64,
            dropout=0.05,
            head_layer_sizes=[128],
            head_dropout=0.1,
        ),
        preprocessing_config=PreprocessingConfig(numerical_preprocessing="standard"),
        trainer_config=TrainerConfig(lr=1e-3, batch_size=256, max_epochs=100),
        random_state=101,
    )
Key settings in the current forward path:
| Setting | Typical range | Effect |
|---|---|---|
| `d_model` | | Width of the projected representation. |
| `embedding_activation` | | Activation after projection/norm. |
| `head_layer_sizes` | | Extra MLPhead capacity. |
| `head_dropout` | | Head regularization. |
| | | Temporary unsqueeze axis before normalization. |
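To see what the temporary unsqueeze does in practice, the following sketch compares `LayerNorm` applied directly to a row representation with `LayerNorm` applied after unsqueezing along axis 1. Both the axis value 1 and a norm built over `d_model=64` are assumptions for illustration. Because `nn.LayerNorm` normalizes over the trailing dimension, the two results should agree.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
norm = nn.LayerNorm(64)    # normalizes over the last dimension (d_model)
x = torch.randn(16, 64)    # (batch, d_model)

direct = norm(x)
via_axis = norm(x.unsqueeze(1)).squeeze(1)  # (batch, 1, d_model) and back

print(direct.shape, via_axis.shape)
print(torch.allclose(direct, via_axis, atol=1e-6))
```

Under these assumptions the unsqueeze only adds a length-one sequence dimension; it does not turn features into tokens.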
When To Use
Use MambaTab when you want a lightweight projection baseline from the Mamba-family API. Use Mambular for sequence modeling experiments where the Mamba block must be active.
References
Gu and Dao, Mamba: Linear-Time Sequence Modeling with Selective State Spaces.
Thielmann et al., Mambular: A Sequential Model for Tabular Deep Learning.