Mambular
Overview
Mambular treats tabular columns as a sequence of feature tokens and processes that sequence with Mamba-style state-space blocks. It is DeepTab’s main stable state-space model for tabular data.
Use Mambular when you want to compare sequence modeling over columns against attention models such as FTTransformer and SAINT.
Architectural Details
DeepTab’s Mambular pipeline is:
EmbeddingLayertokenizes numerical, categorical, and embedding features.Optional feature-token shuffling is applied when
shuffle_embeddings=True.A Mamba block stack processes the token sequence.
pool_sequenceaggregates the sequence.MLPheadpredicts the target.
feature tokens -> optional shuffle -> Mamba/MambaOriginal -> pooling -> MLPhead
Main Building Blocks
Component |
DeepTab implementation |
Role |
|---|---|---|
Tokenizer |
|
Converts columns to a token sequence. |
Sequence block |
|
Applies selective state-space sequence processing. |
Pooling |
|
Reduces tokens to a row representation. |
Head |
|
Task-specific prediction. |
Implementation Notes
The default config uses d_model=64, n_layers=4, d_state=128, d_conv=4, expand_factor=2, norm="RMSNorm", and pooling_method="avg".
mamba_version="mamba-torch" selects DeepTab’s local Mamba block; other values select MambaOriginal. bidirectional, use_learnable_interaction, and use_pscan expose implementation variants for research comparisons.
Practical Config
from deeptab.configs import MambularConfig, PreprocessingConfig, TrainerConfig
from deeptab.models import MambularClassifier
model = MambularClassifier(
model_config=MambularConfig(
d_model=64,
n_layers=4,
d_state=128,
d_conv=4,
pooling_method="avg",
),
preprocessing_config=PreprocessingConfig(numerical_preprocessing="quantile"),
trainer_config=TrainerConfig(lr=3e-4, batch_size=128, max_epochs=100),
random_state=101,
)
Key settings:
Setting |
Typical range |
Effect |
|---|---|---|
|
|
Token width. |
|
|
Number of Mamba blocks. |
|
|
State-space memory size. |
|
|
Local convolution width inside Mamba. |
|
|
Whether to process feature order in both directions. |
When To Use
Use Mambular when feature order or sequential token mixing is part of the model hypothesis. Because tabular columns do not have a natural order, compare against shuffled-token variants and attention baselines.
References
Gu and Dao, Mamba: Linear-Time Sequence Modeling with Selective State Spaces.
Thielmann et al., Mambular: A Sequential Model for Tabular Deep Learning.