TabulaRNN
Overview
TabulaRNN treats tabular columns as a sequence and processes the feature tokens with recurrent layers plus a depthwise convolution. It is useful when you want a sequence-model baseline that is simpler than Mamba and does not rely on self-attention.
Use it for experiments on ordered feature sequences, sequentially engineered tabular features, or ablations against Mambular.
Architectural Details
DeepTab’s TabulaRNN pipeline is:
1. `EmbeddingLayer` converts features to `(batch, n_features, d_model)` tokens.
2. `ConvRNN` applies depthwise convolution and an RNN-family layer across the sequence.
3. A residual summary `z` is computed by averaging input embeddings and projecting with `linear`.
4. The recurrent output is pooled and added to `z`.
5. Optional normalization and `MLPhead` produce predictions.
feature tokens -> ConvRNN -> pooling
feature tokens -> mean -> Linear
pooled recurrent state + projected mean -> optional norm -> MLPhead
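The data flow above can be sketched in plain PyTorch. This is a minimal illustration, not DeepTab's implementation: the class name `TabulaRNNSketch`, the kernel size, the symmetric padding, the use of `LayerNorm`, and the two-layer head are all assumptions, and the input is assumed to be already-embedded feature tokens.

```python
import torch
from torch import nn


class TabulaRNNSketch(nn.Module):
    """Hypothetical stand-in for the pipeline above; not DeepTab code."""

    def __init__(self, d_model=128, hidden=256, n_layers=3,
                 kernel_size=3, dropout=0.2, n_outputs=2):
        super().__init__()
        # Depthwise convolution: groups=d_model gives one filter per channel.
        # Symmetric padding (an assumption) keeps the sequence length unchanged.
        self.conv = nn.Conv1d(d_model, d_model, kernel_size,
                              padding=kernel_size // 2, groups=d_model)
        # Recurrent block; a GRU is used here as one RNN-family choice.
        self.rnn = nn.GRU(d_model, hidden, num_layers=n_layers,
                          batch_first=True, dropout=dropout)
        # Projection for the residual summary z.
        self.linear = nn.Linear(d_model, hidden)
        self.norm = nn.LayerNorm(hidden)
        # Simple MLP head (assumed shape, for illustration only).
        self.head = nn.Sequential(
            nn.Linear(hidden, hidden), nn.ReLU(), nn.Linear(hidden, n_outputs)
        )

    def forward(self, tokens):
        # tokens: (batch, n_features, d_model)
        z = self.linear(tokens.mean(dim=1))            # projected mean of tokens
        x = self.conv(tokens.transpose(1, 2)).transpose(1, 2)  # local mixing
        out, _ = self.rnn(x)                           # (batch, n_features, hidden)
        pooled = out.mean(dim=1)                       # "avg" pooling over tokens
        return self.head(self.norm(pooled + z))


tokens = torch.randn(8, 10, 128)  # 8 rows, 10 feature tokens, d_model=128
model = TabulaRNNSketch().eval()
with torch.no_grad():
    logits = model(tokens)
print(logits.shape)
```

The residual branch bypasses both the convolution and the recurrent layers, so the head always sees an order-independent summary of the embeddings alongside the order-dependent recurrent state.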
Main Building Blocks
| Component | DeepTab implementation | Role |
|---|---|---|
| Tokenizer | | Builds sequence tokens. |
| Local filter | depthwise | Adds local token mixing. |
| Recurrent block | | Sequential feature processing. |
| Residual summary | | Preserves direct feature-token information. |
| Head | | Final prediction. |
Implementation Notes
The config field model_type selects the recurrent cell family. Valid values follow the ConvRNN mapping: "RNN", "LSTM", "GRU", "mLSTM", and "sLSTM" if the corresponding blocks are available.
The default config uses d_model=128, model_type="RNN", n_layers=4, rnn_dropout=0.2, dim_feedforward=256, and pooling_method="avg".
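As a rough illustration of how a string such as `model_type` can select a recurrent family, the sketch below maps the three names that have PyTorch built-ins to their classes. The helper `build_recurrent` and the `CELLS` dictionary are hypothetical and are not DeepTab's `ConvRNN` mapping; `"mLSTM"` and `"sLSTM"` are left out because PyTorch does not ship them. The default arguments mirror the default config values listed above, with `dim_feedforward` assumed to be the recurrent hidden size.

```python
from torch import nn

# Hypothetical mapping for illustration; only PyTorch built-ins are covered.
CELLS = {"RNN": nn.RNN, "LSTM": nn.LSTM, "GRU": nn.GRU}


def build_recurrent(model_type="RNN", d_model=128, dim_feedforward=256,
                    n_layers=4, rnn_dropout=0.2):
    """Return a batch-first recurrent layer for the requested family."""
    try:
        cell_cls = CELLS[model_type]
    except KeyError:
        raise ValueError(
            f"model_type={model_type!r} is not available in this sketch; "
            f"choose one of {sorted(CELLS)}"
        ) from None
    return cell_cls(d_model, dim_feedforward, num_layers=n_layers,
                    batch_first=True, dropout=rnn_dropout)


layer = build_recurrent("LSTM")
print(type(layer).__name__, layer.num_layers, layer.hidden_size)
```

Failing early on an unknown name keeps a typo in the config from surfacing later as a shape or attribute error.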
Practical Config
from deeptab.configs import PreprocessingConfig, TabulaRNNConfig, TrainerConfig
from deeptab.models import TabulaRNNClassifier

model = TabulaRNNClassifier(
    model_config=TabulaRNNConfig(
        d_model=128,
        model_type="GRU",
        n_layers=3,
        rnn_dropout=0.2,
        dim_feedforward=256,
        pooling_method="avg",
    ),
    preprocessing_config=PreprocessingConfig(numerical_preprocessing="quantile"),
    trainer_config=TrainerConfig(lr=3e-4, batch_size=128, max_epochs=100),
    random_state=101,
)
Key settings:
| Setting | Typical range | Effect |
|---|---|---|
| | | Recurrent cell family. |
| | | Feature-token width. |
| | | Recurrent depth. |
| | | Hidden size consumed by the head. |
| | | Depthwise convolution width. |
When To Use
Use TabulaRNN when you want a recurrent sequence baseline over feature tokens. Because column order is not always meaningful, compare with shuffled or alternative feature orderings when making architectural claims.
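One way to set up the ordering comparison is to generate a few seeded column permutations and train the same configuration on each. The sketch below uses only the standard library; the helper `feature_orderings` and the column names are hypothetical, and the training step is deliberately left out.

```python
import random


def feature_orderings(columns, n_shuffles=3, seed=0):
    """Return the original column order plus n_shuffles seeded permutations."""
    rng = random.Random(seed)  # local RNG so the orderings are reproducible
    orderings = [list(columns)]
    for _ in range(n_shuffles):
        perm = list(columns)
        rng.shuffle(perm)
        orderings.append(perm)
    return orderings


# Hypothetical column names, for illustration only.
columns = ["age", "income", "tenure", "region", "plan"]
orderings = feature_orderings(columns)
for order in orderings:
    # With a pandas DataFrame `df`, `df[order]` would reorder the columns
    # before fitting, so each run sees a different feature sequence.
    print(order)
```

If validation metrics vary about as much across orderings as across random seeds, the result says little about whether the recurrent model is exploiting column order.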
References
Hochreiter and Schmidhuber, Long Short-Term Memory.
Cho et al., Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation.