DeepTab: Tabular Deep Learning Made Simple
DeepTab is a Python library for deep learning on tabular data, built on PyTorch and Lightning with a scikit-learn-compatible API. It offers 15 neural architectures, from Mamba-inspired state space models and Transformers to tree-inspired models and MLP baselines, each available as a classifier, regressor, or distributional (LSS) model. One fit/predict/evaluate workflow covers everyday modeling, architecture research, and production deployment.
from deeptab.models import MambularClassifier
model = MambularClassifier()
model.fit(X_train, y_train, max_epochs=50)
predictions = model.predict(X_test)
probabilities = model.predict_proba(X_test)
Why DeepTab
Familiar interface. A scikit-learn fit/predict/evaluate API that drops into existing pipelines, including GridSearchCV.
Automatic preprocessing. Feature-type detection, encoding, scaling, and missing-value handling are powered by PreTab and applied for you.
One model, three tasks. Every architecture ships as a classifier, a regressor, and a distributional (LSS) variant for uncertainty quantification.
A broad model zoo. 15 stable architectures plus experimental models, all behind the same interface, with selection guidance.
Built for real data. Mixed feature types, class imbalance, GPU acceleration, and early stopping work out of the box.
Installation
pip install deeptab
DeepTab requires Python 3.10+ and installs PyTorch automatically. See Installation for GPU setup and the optional Mamba CUDA kernels.
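Before installing, it can help to confirm that the interpreter meets the 3.10+ requirement stated above. The following is a minimal sketch using only the standard library; the helper name meets_python_requirement is made up for this example and is not part of DeepTab.

```python
import sys

# Minimum interpreter version stated in the installation notes above.
MIN_PYTHON = (3, 10)


def meets_python_requirement(version_info=None):
    """Return True if the given (or current) interpreter is at least MIN_PYTHON.

    Hypothetical helper for illustration; DeepTab does not ship this function.
    """
    if version_info is None:
        version_info = sys.version_info
    # Compare only (major, minor); the patch level does not matter here.
    return tuple(version_info[:2]) >= MIN_PYTHON


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "OK" if meets_python_requirement() else "too old for DeepTab"
    print(f"Python {current}: {status}")
```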
What’s New in v2.0
v2.0 is a ground-up restructuring of DeepTab. The high-level estimator API stays familiar, while the package layout, configuration objects, and import paths have been updated.
Split-config API: separate model, preprocessing, and training configuration objects, so each concern can be tuned on its own. This is the first thing you reach for in v2.
New models: AutoInt, ENODE, and TabR (stable); Tangos, Trompt, and ModernNCA (experimental).
Observability: ObservabilityConfig adds structured lifecycle logging via structlog and one-line MLflow or TensorBoard tracking, opt-in and silent by default.
Deployment-safe inference: InferenceModel exposes a read-only prediction surface with schema validation, so a served model cannot be re-fitted by accident.
Self-describing artifacts: a single .deeptab save format bundles the architecture, feature schema, preprocessing, and versions with the weights.
Registry-driven training: optimizers, schedulers, and losses are selectable by name through TrainerConfig and extensible at runtime.
Unified metrics: 25+ metric classes auto-selected per task across regression, classification, and distributional models.
Typed data layer: TabularDataset, TabularDataModule, and FeatureSchema give the pipeline an inspectable contract.
Reproducibility: cross-platform seeding across CPU, CUDA, and MPS.
Rebuilt docs and tutorials: refreshed guides plus end-to-end, Colab-ready tutorials for classification, regression, and uncertainty quantification.
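The registry-driven training item above describes components that are selected by name and can be extended at runtime. The sketch below shows that general pattern in plain Python. It is not DeepTab's implementation: Registry, LOSSES, and the two loss functions are hypothetical names chosen for illustration, and the actual TrainerConfig fields are documented in the API Reference.

```python
from typing import Callable, Dict, List


class Registry:
    """Maps string names to callables so a config can refer to them by name."""

    def __init__(self, kind: str) -> None:
        self.kind = kind
        self._items: Dict[str, Callable] = {}

    def register(self, name: str) -> Callable[[Callable], Callable]:
        """Decorator that stores a callable under `name`; duplicates are rejected."""
        def decorator(fn: Callable) -> Callable:
            if name in self._items:
                raise ValueError(f"{self.kind} '{name}' is already registered")
            self._items[name] = fn
            return fn
        return decorator

    def get(self, name: str) -> Callable:
        """Look up a callable by name, listing the valid options on failure."""
        try:
            return self._items[name]
        except KeyError:
            options = ", ".join(sorted(self._items))
            raise KeyError(f"unknown {self.kind} '{name}'; available: {options}") from None

    def names(self) -> List[str]:
        """Return all registered names in sorted order."""
        return sorted(self._items)


# Hypothetical registry of loss functions.
LOSSES = Registry("loss")


@LOSSES.register("mse")
def mean_squared_error(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# "Extensible at runtime": user code can add entries after the library loads.
@LOSSES.register("mae")
def mean_absolute_error(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# A config would carry only the string; the trainer resolves it when needed.
loss_fn = LOSSES.get("mae")
print(LOSSES.names(), loss_fn([1.0, 2.0], [1.0, 4.0]))
```

Keeping only the name in the configuration means saved settings stay plain and serialisable, while the lookup fails early with a list of valid options if a name is misspelled.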
Warning
v2.0 is not backward compatible with v1, and v1 is no longer maintained. Three things changed that affect existing code:
Import paths were reorganised under the deeptab namespace.
Config classes lost their Default prefix, so DefaultMambularConfig is now MambularConfig. Settings are also split across MambularConfig (architecture), PreprocessingConfig (feature handling), and TrainerConfig (optimisation).
Data modules were renamed to TabularDataModule and TabularDataset; the old Mambular* aliases are deprecated.
If you need to stay on v1 for now, pin deeptab<2.0. Note that the v1 branch receives
no bug fixes or security updates. See the migration guide
for a step-by-step upgrade walkthrough, and the FAQ for the
full support policy.
The high-level fit/predict/evaluate workflow is unchanged. In most cases only
the imports and config construction need updating:
# v1: settings passed as flat keyword arguments on the estimator
from deeptab.models import MambularClassifier
model = MambularClassifier(d_model=128, n_layers=4, numerical_preprocessing="ple")
model.fit(X_train, y_train, max_epochs=50)
# v2: settings grouped into focused config objects
from deeptab.models import MambularClassifier
from deeptab.configs import MambularConfig, PreprocessingConfig
model = MambularClassifier(
    model_config=MambularConfig(d_model=128, n_layers=4),  # architecture
    preprocessing_config=PreprocessingConfig(numerical_preprocessing="ple"),  # features
)
model.fit(X_train, y_train, max_epochs=50)
Note
You only pass the configs you want to customise. MambularClassifier() with no
arguments uses sensible defaults for the architecture, preprocessing, and training.
The flat keyword-argument style from v1 is no longer accepted, so settings must go
through the relevant config object.
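When porting many v1 call sites, the regrouping shown in the example above can be done mechanically. The helper below is a hypothetical sketch, not a DeepTab utility: it sorts a flat v1-style keyword dictionary into the groups that v2 expects. Only the three keys from the example above are mapped; extend the table from the migration guide rather than guessing at other names.

```python
from typing import Any, Dict, List

# Which v2 config argument each flat v1 keyword belongs to. Only keys that
# appear in the example above are listed; treat this table as an assumption.
KEY_TO_GROUP = {
    "d_model": "model_config",
    "n_layers": "model_config",
    "numerical_preprocessing": "preprocessing_config",
}


def split_flat_kwargs(flat: Dict[str, Any]) -> Dict[str, Dict[str, Any]]:
    """Group flat v1-style keyword arguments by their target v2 config.

    Hypothetical helper for illustration. Unmapped keys raise an error
    instead of being silently dropped, so nothing is lost in migration.
    """
    grouped: Dict[str, Dict[str, Any]] = {}
    unknown: List[str] = []
    for key, value in flat.items():
        group = KEY_TO_GROUP.get(key)
        if group is None:
            unknown.append(key)
            continue
        grouped.setdefault(group, {})[key] = value
    if unknown:
        raise KeyError(f"no config group known for: {sorted(unknown)}")
    return grouped


v1_kwargs = {"d_model": 128, "n_layers": 4, "numerical_preprocessing": "ple"}
groups = split_flat_kwargs(v1_kwargs)
print(groups)
```

Each group can then be unpacked into its config object, for example MambularConfig(**groups["model_config"]), assuming the v2 field names match the v1 keywords as they do in the example above.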
See the Overview for the full picture.
Available Models
DeepTab provides 15 stable architectures across five families: State Space Models (Mambular, MambaTab, MambAttention), Transformers (FTTransformer, TabTransformer, SAINT, AutoInt), residual networks (ResNet, TabR), tree-inspired models (NODE, ENODE, NDTF), and general baselines (MLP, TabM, TabulaRNN). Three experimental models (ModernNCA, Tangos, Trompt) are under evaluation for promotion.
Each architecture comes in three variants, *Classifier, *Regressor, and *LSS, sharing one interface so you can swap models without changing code. See the Model Zoo for comparisons and selection guidance.
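The family listing and naming convention above can be written down as data, which is handy when setting up benchmark loops. The sketch below only builds class names as strings from the lists in this section; it does not import DeepTab. Names other than MambularClassifier are inferred from the *Classifier, *Regressor, and *LSS pattern, so confirm them against the API Reference before use. The dictionary keys are labels invented for this example.

```python
from typing import Dict, List

# Stable architectures grouped by family, copied from the listing above.
FAMILIES: Dict[str, List[str]] = {
    "state_space": ["Mambular", "MambaTab", "MambAttention"],
    "transformer": ["FTTransformer", "TabTransformer", "SAINT", "AutoInt"],
    "residual": ["ResNet", "TabR"],
    "tree_inspired": ["NODE", "ENODE", "NDTF"],
    "baseline": ["MLP", "TabM", "TabulaRNN"],
}

# Task suffixes from the naming convention described above.
TASK_SUFFIXES = ("Classifier", "Regressor", "LSS")


def estimator_names(architecture: str) -> List[str]:
    """Return the three expected estimator class names for one architecture."""
    return [f"{architecture}{suffix}" for suffix in TASK_SUFFIXES]


# Flatten the families into one list of stable architectures.
stable = [arch for family in FAMILIES.values() for arch in family]

# Every architecture crossed with every task variant.
all_estimators = [name for arch in stable for name in estimator_names(arch)]

print(len(stable), "architectures,", len(all_estimators), "estimator names")
print(estimator_names("Mambular"))
```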
Documentation
Note
These are starting points for each area. The sidebar navigation is the source of truth, so check the individual documentation sections for the latest updates and further detail.
Getting Started: Begin with the Overview and Quickstart, then see Installation for GPU setup.
Core Concepts: How DeepTab works, including the sklearn API, the Config System, and Training and Evaluation.
Tutorials: End-to-end, Colab-ready walkthroughs for regression, classification, and uncertainty quantification.
Model Zoo: Browse the Stable Models and Experimental Models, or use the Comparison Tables for selection guidance.
API Reference: Full reference for Models, Configs, and the rest of the public API.
Developer Guide: Contributing, Testing, and the Release Process for maintainers.
Citation
If you use DeepTab in your research, please cite:
@article{thielmann2024mambular,
title={Mambular: A Sequential Model for Tabular Deep Learning},
author={Thielmann, Anton Frederik and Kumar, Manish and Weisser, Christoph and Reuter, Arik and S{\"a}fken, Benjamin and Samiee, Soheila},
journal={arXiv preprint arXiv:2408.06291},
year={2024}
}
@article{thielmann2024efficiency,
title={On the Efficiency of NLP-Inspired Methods for Tabular Deep Learning},
author={Thielmann, Anton Frederik and Samiee, Soheila},
journal={arXiv preprint arXiv:2411.17207},
year={2024}
}
License
DeepTab is licensed under the MIT License. See LICENSE for details.