ResNet
Overview
ResNet is DeepTab’s residual feed-forward architecture for tabular data. It keeps the simplicity and speed of an MLP while adding residual blocks that make deeper nonlinear transformations easier to optimize.
Use ResNet when an MLP underfits, when you want a stronger classical neural baseline, or when you need a model that is still much cheaper than attention or retrieval-based methods.
Architectural Details
DeepTab’s ResNet pipeline is:
Concatenate preprocessed features, or embed features with
EmbeddingLayerand flatten tokens.Project the input vector with
initial_layer.Apply
num_blocksresidual blocks.Use a final linear output layer for the target task.
The residual blocks are implemented with deeptab.nn.blocks.resnet.ResidualBlock and use the configured activation, dropout, and optional normalization.
features -> optional embeddings -> initial Linear -> ResidualBlock x num_blocks -> output
Main Building Blocks
Component |
DeepTab implementation |
Role |
|---|---|---|
Input representation |
Raw concatenation or |
Converts heterogeneous columns to a tensor. |
Initial projection |
|
Sets hidden width. |
Residual body |
|
Learns transformations with skip paths. |
Output layer |
|
Produces task outputs. |
Implementation Notes
num_blocks controls how many residual blocks are instantiated. Each block uses layer_sizes[i] as input width and layer_sizes[i + 1] when available, otherwise the last width is reused. Keep num_blocks aligned with the length of layer_sizes; if num_blocks exceeds the number of transitions, later blocks stay at the final width.
Practical Config
from deeptab.configs import PreprocessingConfig, ResNetConfig, TrainerConfig
from deeptab.models import ResNetRegressor
model = ResNetRegressor(
model_config=ResNetConfig(
layer_sizes=[256, 128, 64],
num_blocks=3,
dropout=0.2,
norm=True,
),
preprocessing_config=PreprocessingConfig(numerical_preprocessing="standard"),
trainer_config=TrainerConfig(lr=1e-3, batch_size=256, max_epochs=100),
random_state=101,
)
Key settings:
Setting |
Typical range |
Effect |
|---|---|---|
|
|
Width schedule. |
|
|
Depth of residual processing. |
|
|
Regularization. |
|
|
Enables normalization inside residual blocks. |
|
|
Useful for categorical-heavy data. |
When To Use
Use ResNet as a default stable baseline beside MLP and TabM. It is a good choice when you want a stronger inductive bias than a plain MLP but do not want the memory and tuning cost of Transformer models.
References
He et al., Deep Residual Learning for Image Recognition.
Gorishniy et al., Revisiting Deep Learning Models for Tabular Data.