FAQ
Frequently asked questions about DeepTab and troubleshooting common issues.
General
What’s the difference between DeepTab v1 and v2?
Version 2.0 introduces a fully typed data layer (TabularDataset, TabularDataModule, FeatureSchema, TabularBatch) that makes it easier to work with tabular data at a lower level. The high-level estimator API remains unchanged and is still the recommended interface for most users.
Key changes in v2.0:
Automatic stratification for classification tasks
Typed batch containers with device management
Feature schema tracking with metadata
Consistent label shapes across tasks
Deprecated MambularDataset/MambularDataModule aliases (use TabularDataset/TabularDataModule)
Important
Note on v1 support: DeepTab v1 is no longer supported following the v2.0 release. The changes in package structure and API design were substantial enough that maintaining backward compatibility would have compromised the improvements in v2. If you’re using v1 in production, we recommend planning a migration to v2. Pin your dependency to deeptab<2.0 if you need to continue using v1, but be aware that no bug fixes or security updates will be provided for the v1 branch.
See the Overview for details on the new data API.
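Before pinning to deeptab<2.0 or planning a migration, it can help to confirm which major version is installed. The following is a minimal sketch using only the standard library. The helper names major_version and installed_major are illustrative and not part of DeepTab.

```python
from importlib.metadata import PackageNotFoundError, version


def major_version(version_string):
    """Return the leading integer of a version string such as '2.0.0'."""
    return int(version_string.split(".")[0])


def installed_major(package="deeptab"):
    """Return the installed major version of a package, or None if it is absent."""
    try:
        return major_version(version(package))
    except PackageNotFoundError:
        return None


# Prints 1 or 2 when DeepTab is installed, None otherwise
print(installed_major())
```

A result of 1 means the environment is still on the unsupported v1 branch.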
Which model should I use?
Tip
When in doubt, start with MambularClassifier or MambularRegressor.
Mambular tends to work well across a variety of tabular problems. For a full selection guide by dataset size, feature type, and compute constraints, see the Model Comparison page.
Quick pointers:
Strong general-purpose baseline → TabM or Mambular
Many categorical features → TabTransformer
Fastest baseline → MLP or ResNet
Uncertainty estimates → any LSS variant
Interpretability → NODE or NDTF
Do I need a GPU?
No, but it helps significantly for larger datasets and more complex architectures. The short answer:
MLP, ResNet, TabM, MambaTab: train comfortably on CPU up to ~100K to 500K rows.
Mambular, TabulaRNN, TabTransformer, NODE: CPU is fine up to ~10K to 20K rows; GPU recommended beyond that.
FTTransformer, AutoInt, MambAttention, ENODE, NDTF, TabR: GPU recommended above ~5K to 10K rows.
SAINT: GPU strongly recommended above ~2K rows (row attention makes every batch expensive).
For a full per-model breakdown including the cost driver for each architecture, see the Model Zoo Comparison Tables in the Model Zoo.
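The row thresholds above can be encoded as a small lookup for quick checks in a script. This is only a sketch: gpu_recommended and GPU_ROW_THRESHOLDS are hypothetical names, not DeepTab APIs, and the cut-offs use the lower end of each quoted range as a conservative assumption.

```python
# Lower bounds of the row ranges quoted above, used as conservative cut-offs
GPU_ROW_THRESHOLDS = {
    100_000: ("MLP", "ResNet", "TabM", "MambaTab"),
    10_000: ("Mambular", "TabulaRNN", "TabTransformer", "NODE"),
    5_000: ("FTTransformer", "AutoInt", "MambAttention", "ENODE", "NDTF", "TabR"),
    2_000: ("SAINT",),
}


def gpu_recommended(model_name, n_rows):
    """Return True when the row count exceeds the CPU-friendly cut-off."""
    for threshold, names in GPU_ROW_THRESHOLDS.items():
        if model_name in names:
            return n_rows > threshold
    raise KeyError(f"No guidance recorded for model {model_name!r}")


print(gpu_recommended("SAINT", 5_000))
print(gpu_recommended("MLP", 50_000))
```

Treat the result as a rough hint. Actual training time also depends on feature count, batch size, and hardware.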
How do I know if my GPU is being used?
Check CUDA availability:
import torch
print(f"CUDA available: {torch.cuda.is_available()}")
DeepTab will automatically use the first available GPU. If CUDA is available but you’re not seeing speedups, ensure you’re training on a reasonably large dataset, since small batches may not benefit from GPU parallelism.
Can I use DeepTab with PyTorch dataloaders?
Note
The high-level API uses TabularDataModule internally, but you can access TabularDataset directly for custom data loading.
Yes. The internal TabularDataModule creates PyTorch DataLoader instances. If you need custom data loading logic, you can use TabularDataset directly:
from deeptab.data import TabularDataset
from torch.utils.data import DataLoader
dataset = TabularDataset(
cat_feature_list=[...],
num_feature_list=[...],
embedding_feature_list=None,
y=labels,
)
dataloader = DataLoader(dataset, batch_size=128, shuffle=True)
Data and preprocessing
What data types are supported?
DeepTab automatically handles:
Numerical: int, float dtypes
Categorical: object, category, bool dtypes
Embeddings: Pass pre-computed embeddings via the embeddings parameter of fit()
How do I handle missing values?
Tip
No manual imputation needed! DeepTab handles missing values automatically.
DeepTab handles missing values internally during preprocessing:
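import numpy as np
import pandas as pd
from deeptab.models import MambularClassifier
# DataFrame with missing values
df = pd.DataFrame({
"age": [25, np.nan, 47, 51],
"city": ["NYC", "Boston", None, "Chicago"],
})
# y holds the target labels, one per row of df
# Works without manual imputation
model = MambularClassifier()
model.fit(df, y, max_epochs=50)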
The pretab preprocessor (used internally) applies median imputation for numerical features and mode imputation for categoricals by default.
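To make the default strategy concrete, here is a standard-library sketch of median and mode imputation on two small columns. It illustrates the idea only and is not pretab's implementation. None stands in for NaN, and the impute helper and sample values are made up for this example.

```python
from statistics import median, mode


def impute(values, fill):
    """Replace missing entries (None) with the given fill value."""
    return [fill if v is None else v for v in values]


ages = [25, None, 47, 51]
cities = ["NYC", "Boston", None, "NYC"]

# Statistics are computed from the observed values only
age_fill = median(v for v in ages if v is not None)
city_fill = mode(v for v in cities if v is not None)

print(impute(ages, age_fill))      # [25, 47, 47, 51]
print(impute(cities, city_fill))   # ['NYC', 'Boston', 'NYC', 'NYC']
```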
Can I use NumPy arrays instead of DataFrames?
Yes. DeepTab accepts both:
import numpy as np
from deeptab.models import MambularClassifier
# NumPy arrays work
X = np.random.randn(1000, 10)
y = np.random.randint(0, 2, size=1000)
model = MambularClassifier()
model.fit(X, y, max_epochs=50)
However, DataFrames are recommended because they preserve column names and types, which helps with feature type detection and preprocessing.
How do I tell DeepTab which columns are categorical?
DeepTab infers feature types from DataFrame dtypes:
# Ensure categorical columns have the right dtype
df["city"] = df["city"].astype("category")
df["user_id"] = df["user_id"].astype("category") # Numeric ID, but categorical
model = MambularClassifier()
model.fit(df, y, max_epochs=50)
If you’re using NumPy arrays, all features are treated as numerical by default.
What if I have text or image data?
DeepTab is designed for tabular data. For text or images:
Use a pre-trained encoder to generate embeddings
Pass embeddings via the embeddings parameter of fit()
from sentence_transformers import SentenceTransformer
# Encode text to embeddings
text_model = SentenceTransformer("all-MiniLM-L6-v2")
text_embeddings = text_model.encode(df["description"].tolist())
# Pass embeddings alongside tabular features
X_tabular = df.drop(columns=["description", "target"])
model = MambularClassifier()
model.fit(X_tabular, y, embeddings=text_embeddings, max_epochs=50)
Can I customize preprocessing per feature?
Not directly. PreprocessingConfig applies the same strategy to all numerical features. If you need per-feature preprocessing, apply it manually before passing to DeepTab:
# Custom preprocessing
df["log_income"] = np.log1p(df["income"])
df["age_binned"] = pd.cut(df["age"], bins=5).astype("category")
# Then fit DeepTab
model = MambularClassifier()
model.fit(df, y, max_epochs=50)
Training and performance
How do I speed up training?
Tip
Combine GPU acceleration with larger batch sizes and early stopping for fastest training.
Several options:
Use a GPU: install CUDA-enabled PyTorch
Increase batch size: larger batches are more efficient when memory allows (TrainerConfig(batch_size=...))
Reduce epochs: rely on early stopping instead of a fixed epoch count
Use multi-worker data loading: pass num_workers through dataloader_kwargs in fit()
from deeptab.configs import TrainerConfig
model = MambularClassifier(
trainer_config=TrainerConfig(
batch_size=512, # Larger batch size
patience=10, # Early stopping
)
)
# num_workers is a DataLoader option, so pass it via dataloader_kwargs
model.fit(X_train, y_train, dataloader_kwargs={"num_workers": 4}, max_epochs=100)
Training is slow on GPU
Note
GPUs need larger batch sizes to show a speedup over CPU. Small batches or datasets may run faster on CPU.
Ensure you’re using GPU:
import torch
print(torch.cuda.is_available()) # Should be True
If True but still slow:
Small batches: GPU efficiency requires larger batches (try 256+)
Small dataset: for < 1K samples, CPU may be faster due to transfer overhead
CPU bottleneck: increase num_workers via dataloader_kwargs in fit() for faster data loading
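One way to see why batch size matters: every optimizer step carries fixed overhead (kernel launches, host-to-device transfers), so fewer and larger steps spread that cost over more samples. The arithmetic can be sketched as follows. steps_per_epoch is a hypothetical helper for illustration, not part of DeepTab.

```python
import math


def steps_per_epoch(n_samples, batch_size):
    """Number of optimizer steps needed to see every sample once."""
    return math.ceil(n_samples / batch_size)


for batch_size in (32, 128, 512):
    print(batch_size, steps_per_epoch(100_000, batch_size))
```

For 100,000 rows, moving from a batch size of 32 to 512 cuts the step count from 3125 to 196.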
How do I use early stopping?
Early stopping is enabled by default. Adjust patience:
from deeptab.configs import TrainerConfig
model = MambularClassifier(
trainer_config=TrainerConfig(
patience=15, # Stop if no improvement for 15 epochs
)
)
Provide an explicit validation set for better early stopping:
model.fit(
X_train, y_train,
X_val=X_val, y_val=y_val,
max_epochs=100,
)
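The patience setting can be pictured with a few lines of plain Python. This sketches the general early-stopping rule (stop after patience consecutive epochs without a new best validation loss) and is not DeepTab's or Lightning's actual implementation. The function name and loss values are made up.

```python
def early_stopping_epoch(val_losses, patience):
    """Return the 0-based epoch at which training would stop, or None."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            # New best validation loss: reset the counter
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None


losses = [0.9, 0.7, 0.6, 0.61, 0.62, 0.65, 0.5]
print(early_stopping_epoch(losses, patience=3))
print(early_stopping_epoch(losses, patience=5))
```

With patience=3 the run stops at epoch 5 and never reaches the better loss at epoch 6. With patience=5 it keeps going, which is the trade-off a larger patience buys.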
How do I save a trained model?
Use the .deeptab extension. DeepTab warns when a different extension is used.
# Save
model.save("my_model.deeptab")
# Load
from deeptab.models import MambularClassifier
loaded = MambularClassifier.load("my_model.deeptab")
predictions = loaded.predict(X_test)
The artifact includes weights, fitted preprocessor, feature schema, and task metadata.
Can I resume training from a checkpoint?
Not directly through the estimator API. If you need this, consider using TabularDataModule with PyTorch Lightning’s checkpointing directly.
How do I monitor training metrics?
DeepTab shows a progress bar by default. For richer per-epoch metrics, pass
train_metrics/val_metrics dicts to fit(), or attach an experiment tracker
through ObservabilityConfig:
from deeptab.core.observability import ObservabilityConfig
model = MambularClassifier(
observability_config=ObservabilityConfig(verbosity=2, experiment_trackers=["tensorboard"]),
)
For fully custom metrics, use Lightning callbacks (advanced usage, see the Lightning docs).
Errors and troubleshooting
CUDA out of memory
Warning
GPU memory errors usually indicate that the batch size is too large for your GPU.
Reduce batch size:
from deeptab.configs import TrainerConfig
model = MambularClassifier(
trainer_config=TrainerConfig(batch_size=64) # Smaller batch size
)
Or force CPU training by passing the Lightning accelerator to fit():
model = MambularClassifier()
model.fit(X_train, y_train, accelerator="cpu")
ValueError: could not convert string to float
Tip
This usually means categorical features weren’t properly detected. Explicitly set dtypes.
This happens when categorical features are not properly encoded. Ensure they have the right dtype:
df["city"] = df["city"].astype("category")
Or check for unexpected non-numeric values in numerical columns.
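A quick way to find the offending values is to try converting each entry and record the ones that fail. The sketch below uses plain Python lists. find_non_numeric is a hypothetical helper, and None is treated as a missing value rather than an error.

```python
def find_non_numeric(values):
    """Return (index, value) pairs for entries that cannot be parsed as floats."""
    bad = []
    for index, value in enumerate(values):
        if value is None:
            continue  # missing values are not a conversion problem
        try:
            float(value)
        except (TypeError, ValueError):
            bad.append((index, value))
    return bad


column = ["1.5", "2", "n/a", 3, "12,5", None]
print(find_non_numeric(column))
```

With a DataFrame, the same check can be run on df["some_column"].tolist(), where some_column is whichever numerical column triggers the error.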
ImportError: No module named 'deeptab'
Ensure DeepTab is installed in the active environment:
pip list | grep deeptab
If not listed:
pip install deeptab
AttributeError: ‘TabularDataModule’ object has no attribute ‘embedding_feature_info’
This was a bug in early v2.0 pre-releases. Upgrade to v2.0.0 or later:
pip install --upgrade deeptab
Training is unstable (loss explodes)
Warning
Exploding gradients usually indicate that the learning rate is too high or that the data contains extreme values.
Try reducing the learning rate:
from deeptab.configs import TrainerConfig
model = MambularClassifier(
trainer_config=TrainerConfig(lr=1e-4) # Lower learning rate
)
Or enable gradient clipping, which is off by default. Pass it to fit() as a Lightning trainer argument:
model = MambularClassifier()
model.fit(X_train, y_train, gradient_clip_val=0.5)
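For intuition, clipping by norm (the default algorithm in Lightning when gradient_clip_val is set) rescales the whole gradient vector whenever its L2 norm exceeds the limit. A minimal pure-Python sketch follows. clip_by_norm is an illustrative name, and real implementations operate on tensors across all parameters.

```python
import math


def clip_by_norm(gradients, max_norm):
    """Rescale gradients so their L2 norm does not exceed max_norm."""
    total_norm = math.sqrt(sum(g * g for g in gradients))
    if total_norm <= max_norm:
        return list(gradients)  # already within the limit
    scale = max_norm / total_norm
    return [g * scale for g in gradients]


clipped = clip_by_norm([3.0, 4.0], max_norm=0.5)
print(clipped)
```

The direction of the update is preserved and only its length shrinks, which is why clipping tames loss spikes without changing what the model learns from a typical batch.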
RuntimeError: Expected all tensors to be on the same device
Note
The high-level estimator API handles device management automatically. This error typically occurs only with custom training loops.
Ensure all tensors are on the same device:
batch = batch.to("cuda") # Move entire batch
The estimator API handles this automatically.
Model-specific
What’s the difference between Mambular and MambaTab?
Both use Mamba (State Space Model) blocks, but differ in how they process features:
Mambular: Sequential model. Processes features one at a time in sequence, learning dependencies between features.
MambaTab: Joint model. Applies Mamba to a concatenated representation of all features at once.
Mambular tends to work better for datasets where feature order matters or where you want to learn sequential dependencies.
When should I use distributional regression (LSS)?
Tip
Use LSS models when you need uncertainty estimates, not just point predictions.
Use LSS models when you need:
Uncertainty quantification: Know when predictions are confident vs uncertain
Prediction intervals: Generate lower and upper bounds for each prediction (e.g., 95% intervals)
Heteroscedastic noise: Model varying noise levels across inputs
Risk-aware decisions: Use full distributions for downstream optimization
Example:
from deeptab.models import MambularLSS
model = MambularLSS()
model.fit(X_train, y_train, family="normal", max_epochs=50)
# Get mean and std for each prediction
params = model.predict(X_test)
mean = params[:, 0]
std = params[:, 1]
# 95% prediction interval
lower = mean - 1.96 * std
upper = mean + 1.96 * std
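The factor 1.96 is the 97.5th percentile of the standard normal distribution, so other coverage levels only need a different quantile. The sketch below derives it with the standard library. normal_interval is a hypothetical helper, and it assumes the predicted distribution is normal, as in the example above.

```python
from statistics import NormalDist


def normal_interval(mean, std, coverage=0.95):
    """Central prediction interval for a normal distribution."""
    z = NormalDist().inv_cdf(0.5 + coverage / 2)
    return mean - z * std, mean + z * std


lower_95, upper_95 = normal_interval(10.0, 2.0)
lower_80, upper_80 = normal_interval(10.0, 2.0, coverage=0.80)
print(lower_95, upper_95)
print(lower_80, upper_80)
```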
Can I use my own custom architecture?
Yes, but it requires subclassing BaseTaskModel. See the source code for examples of how to extend the base classes.
Do experimental models work the same way as stable models?
Yes, the API is identical. The only difference is that experimental models may change without a deprecation cycle:
from deeptab.models.experimental import TromptClassifier
# Same API as stable models
model = TromptClassifier()
model.fit(X_train, y_train, max_epochs=50)
Integration
Can I use DeepTab with scikit-learn pipelines?
Yes:
from sklearn.pipeline import Pipeline
from deeptab.models import MambularClassifier
pipeline = Pipeline([
("model", MambularClassifier()),
])
pipeline.fit(X_train, y_train)
predictions = pipeline.predict(X_test)
Note: DeepTab does its own preprocessing, so additional preprocessing steps in the pipeline may be redundant.
Does GridSearchCV work?
Yes:
from sklearn.model_selection import GridSearchCV
search = GridSearchCV(
estimator=MambularClassifier(),
param_grid={
"model_config__d_model": [64, 128],
"trainer_config__lr": [1e-3, 5e-4],
},
cv=5,
)
search.fit(X_train, y_train)
Note: Set n_jobs=1 in GridSearchCV when training on a GPU. With parallel jobs, every worker would try to use the same GPU at once.
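Because every candidate trains a neural network once per fold, it is worth counting the fits before launching a search. The sketch below expands the grid from the example with itertools. The extra fit accounts for GridSearchCV refitting the best candidate on the full training data, which is its default behavior (refit=True).

```python
from itertools import product

param_grid = {
    "model_config__d_model": [64, 128],
    "trainer_config__lr": [1e-3, 5e-4],
}
cv = 5

# Every combination of parameter values is one candidate
candidates = [
    dict(zip(param_grid, values)) for values in product(*param_grid.values())
]

# One fit per candidate per fold, plus the final refit of the best candidate
n_fits = len(candidates) * cv + 1
print(len(candidates), n_fits)
```

Here 4 candidates and 5 folds give 21 training runs, so the grid grows expensive quickly as parameters are added.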
Can I deploy DeepTab models?
Yes. For deployment, use InferenceModel. It validates the input schema and exposes only the inference surface, preventing accidental retraining in production:
# Training environment
model.save("model.deeptab")
# Deployment environment
from deeptab import InferenceModel
model = InferenceModel.from_path("model.deeptab")
X_clean = model.validate_input(X_new) # raises on schema mismatch
predictions = model.predict(X_clean)
See the Inference Model guide for the full deployment workflow.
Advanced usage
How do I access the underlying PyTorch model?
For most inspection needs, use the public helpers model.summary(),
model.describe(), and model.parameter_table(). They work once the model is
built or fitted and do not require touching internals.
model = MambularClassifier()
model.fit(X_train, y_train, max_epochs=50)
print(model.summary()) # human-readable overview
info = model.describe() # structured dict (architecture, task, params, ...)
If you need direct access for advanced work, the fitted Lightning module lives
in the private model._task_model attribute, and the raw nn.Module
architecture is model._task_model.estimator. These are internal and may change
between releases.
Can I use custom loss functions?
Not directly through the estimator API. If you need custom losses, use TabularDataModule with a custom Lightning module.
How do I extract learned features?
Intermediate representations come from the underlying architecture, which you can reach on the fitted model:
model = MambularClassifier()
model.fit(X_train, y_train, max_epochs=50)
# The raw architecture lives on the fitted Lightning module (internal API)
architecture = model._task_model.estimator
This is an advanced use case. See the source code for details.
Can I use multiple GPUs?
DeepTab uses the first available GPU by default. For multi-GPU training, use Lightning’s distributed strategies directly with TabularDataModule (advanced usage).
Contributing and support
How do I report a bug?
Open an issue on GitHub with:
DeepTab version (import deeptab; print(deeptab.__version__))
Python version
PyTorch version
Minimal reproducible example
Full error traceback
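The version details in this list can be gathered in one go. The following sketch uses only the standard library and reports "not installed" for missing packages. environment_report is a hypothetical helper, and the package names assume the PyPI distributions deeptab and torch.

```python
import platform
from importlib.metadata import PackageNotFoundError, version


def environment_report(packages=("deeptab", "torch")):
    """Collect Python and package versions for a bug report."""
    report = {"python": platform.python_version()}
    for name in packages:
        try:
            report[name] = version(name)
        except PackageNotFoundError:
            report[name] = "not installed"
    return report


for key, value in environment_report().items():
    print(f"{key}: {value}")
```

Paste the printed lines into the issue alongside the reproducible example and traceback.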
How do I request a feature?
Open a feature request on GitHub describing:
The use case
Why existing features don’t solve it
Proposed API (if applicable)
How do I contribute?
See the Contributing guide for:
Setting up the development environment
Running tests
Code style guidelines
Submitting pull requests
Where can I get help?
Check this FAQ first
Search GitHub issues
Open a new issue for bugs or questions
Join discussions on the GitHub repo
Performance comparisons
How does DeepTab compare to XGBoost?
It depends on the dataset:
Small datasets (< 1K samples): XGBoost often wins
Large datasets (> 10K samples): DeepTab is often competitive or better, especially with complex feature interactions
Categorical-heavy data: XGBoost may be more efficient
Need for uncertainty: DeepTab LSS models provide distributional predictions
Use both and compare on your specific data. DeepTab makes experimentation easy.
Is DeepTab faster than training PyTorch manually?
No, DeepTab uses PyTorch under the hood. It provides convenience, not speed improvements. However, it does:
Apply sensible defaults (early stopping, LR scheduling)
Handle device management automatically
Provide efficient data loading
So while not “faster”, it helps you get to a working model more quickly.
Still have questions?
If your question isn’t answered here:
Check the Core Concepts guide
Browse the Tutorials
Search GitHub issues
Open a new issue on GitHub