FAQ

Frequently asked questions about DeepTab and troubleshooting common issues.

General

What’s the difference between DeepTab v1 and v2?

Version 2.0 introduces a fully typed data layer (TabularDataset, TabularDataModule, FeatureSchema, TabularBatch) that makes it easier to work with tabular data at a lower level. The high-level estimator API remains unchanged and is still the recommended interface for most users.

Key changes in v2.0:

Automatic stratification for classification tasks
Typed batch containers with device management
Feature schema tracking with metadata
Consistent label shapes across tasks
Deprecated MambularDataset/MambularDataModule aliases (use TabularDataset/TabularDataModule)

Important

Note on v1 support: DeepTab v1 is no longer supported following the v2.0 release. The changes in package structure and API design were substantial enough that maintaining backward compatibility would have compromised the improvements in v2. If you’re using v1 in production, we recommend planning a migration to v2. Pin your dependency to deeptab<2.0 if you need to continue using v1, but be aware that no bug fixes or security updates will be provided for the v1 branch.

See the Overview for details on the new data API.

Which model should I use?

Tip

When in doubt, start with MambularClassifier or MambularRegressor.

Mambular tends to work well across a variety of tabular problems. For a full selection guide by dataset size, feature type, and compute constraints, see the Model Comparison page.

Quick pointers:

Strong general-purpose baseline → TabM or Mambular
Many categorical features → TabTransformer
Fastest baseline → MLP or ResNet
Uncertainty estimates → any LSS variant
Interpretability → NODE or NDTF

Do I need a GPU?

No, but a GPU helps significantly with larger datasets and more complex architectures. As a rough guide:

MLP, ResNet, TabM, MambaTab: train comfortably on CPU up to ~100K to 500K rows.
Mambular, TabulaRNN, TabTransformer, NODE: CPU is fine up to ~10K to 20K rows; GPU recommended beyond that.
FTTransformer, AutoInt, MambAttention, ENODE, NDTF, TabR: GPU recommended above ~5K to 10K rows.
SAINT: GPU strongly recommended above ~2K rows (row attention makes every batch expensive).

For a full per-model breakdown including the cost driver for each architecture, see the Model Zoo Comparison Tables in the Model Zoo.
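The rough guide above can be turned into a small lookup for scripts that choose hardware automatically. This is only a sketch: `CPU_ROW_LIMITS` and `gpu_recommended` are hypothetical names, not part of DeepTab, and the limits are the lower end of each approximate range listed above.

```python
# Conservative CPU row limits: the lower end of each range quoted in this FAQ.
CPU_ROW_LIMITS = {
    100_000: ("MLP", "ResNet", "TabM", "MambaTab"),
    10_000: ("Mambular", "TabulaRNN", "TabTransformer", "NODE"),
    5_000: ("FTTransformer", "AutoInt", "MambAttention", "ENODE", "NDTF", "TabR"),
    2_000: ("SAINT",),
}


def gpu_recommended(architecture: str, n_rows: int) -> bool:
    """Return True when n_rows exceeds the conservative CPU limit for the architecture."""
    for limit, names in CPU_ROW_LIMITS.items():
        if architecture in names:
            return n_rows > limit
    raise KeyError(f"No guidance recorded for {architecture!r}")


print(gpu_recommended("SAINT", 3_000))   # True
print(gpu_recommended("MLP", 50_000))    # False
```

Treat the result as a starting point; the actual break-even point depends on feature count, batch size, and hardware.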

How do I know if my GPU is being used?

Check CUDA availability:

import torch
print(f"CUDA available: {torch.cuda.is_available()}")

DeepTab will automatically use the first available GPU. If CUDA is available but you’re not seeing speedups, check that the dataset and batch size are reasonably large; small batches may not benefit from GPU parallelism.
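For a slightly more informative check, the sketch below also reports which CUDA device PyTorch would pick. It uses only standard `torch.cuda` calls; the helper name `describe_accelerator` is made up for this example and is not a DeepTab function.

```python
def describe_accelerator() -> str:
    """Summarize the accelerator PyTorch sees in the current environment."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed in this environment"
    if not torch.cuda.is_available():
        return "CUDA not available; training will run on CPU"
    index = torch.cuda.current_device()
    return f"CUDA device {index}: {torch.cuda.get_device_name(index)}"


print(describe_accelerator())
```

If this reports a CPU-only setup on a machine with a GPU, the installed PyTorch build is probably the CPU-only variant.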

Can I use DeepTab with PyTorch dataloaders?

Note

The high-level API uses TabularDataModule internally, but you can access TabularDataset directly for custom data loading.

Yes. TabularDataModule creates standard PyTorch DataLoader instances internally. For custom data loading logic, wrap a TabularDataset in your own DataLoader:

from deeptab.data import TabularDataset
from torch.utils.data import DataLoader

dataset = TabularDataset(
    cat_feature_list=[...],
    num_feature_list=[...],
    embedding_feature_list=None,
    y=labels,
)

dataloader = DataLoader(dataset, batch_size=128, shuffle=True)

Data and preprocessing

What data types are supported?

DeepTab automatically handles:

Numerical: int, float dtypes
Categorical: object, category, bool dtypes
Embeddings: Pass pre-computed embeddings via the embeddings parameter of fit()

How do I handle missing values?

Tip

No manual imputation needed! DeepTab handles missing values automatically.

Missing values are imputed during preprocessing, so a DataFrame with gaps can be passed to fit() as-is:

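import numpy as np
import pandas as pd
from deeptab.models import MambularClassifier

# DataFrame with missing values
df = pd.DataFrame({
    "age": [25, np.nan, 47, 51],
    "city": ["NYC", "Boston", None, "Chicago"],
})

# Works without manual imputation (y holds one label per row of df)
model = MambularClassifier()
model.fit(df, y, max_epochs=50)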

The pretab preprocessor (used internally) applies median imputation for numerical features and mode imputation for categoricals by default.
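To make the two default strategies concrete, here is a minimal standard-library sketch of median and mode imputation. It illustrates the idea only and is not pretab’s implementation; the function names are invented for this example.

```python
import math
from statistics import median, mode


def impute_numeric(values):
    """Replace None/NaN entries with the median of the observed values."""
    observed = [v for v in values if v is not None and not math.isnan(v)]
    fill = median(observed)
    return [fill if v is None or math.isnan(v) else v for v in values]


def impute_categorical(values):
    """Replace None entries with the most frequent observed category."""
    observed = [v for v in values if v is not None]
    fill = mode(observed)
    return [fill if v is None else v for v in values]


print(impute_numeric([25, float("nan"), 47, 51]))          # [25, 47, 47, 51]
print(impute_categorical(["NYC", "Boston", None, "NYC"]))  # ['NYC', 'Boston', 'NYC', 'NYC']
```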

Can I use NumPy arrays instead of DataFrames?

Yes. DeepTab accepts both:

import numpy as np
from deeptab.models import MambularClassifier

# NumPy arrays work
X = np.random.randn(1000, 10)
y = np.random.randint(0, 2, size=1000)

model = MambularClassifier()
model.fit(X, y, max_epochs=50)

However, DataFrames are recommended because they preserve column names and types, which helps with feature type detection and preprocessing.

How do I tell DeepTab which columns are categorical?

DeepTab infers feature types from DataFrame dtypes:

# Ensure categorical columns have the right dtype
df["city"] = df["city"].astype("category")
df["user_id"] = df["user_id"].astype("category")  # Numeric ID, but categorical

model = MambularClassifier()
model.fit(df, y, max_epochs=50)

If you’re using NumPy arrays, all features are treated as numerical by default.
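Before fitting, it can help to preview how columns are likely to be classified. The sketch below approximates the dtype rule from “What data types are supported?” (int and float are numerical; object, category, and bool are categorical) using pandas’ own dtype helpers. `guess_feature_kinds` is a hypothetical helper, and DeepTab’s actual detection logic may differ in edge cases such as datetime columns.

```python
import pandas as pd
from pandas.api.types import is_bool_dtype, is_numeric_dtype


def guess_feature_kinds(df: pd.DataFrame) -> dict:
    """Map each column name to 'numerical' or 'categorical' based on its dtype."""
    kinds = {}
    for name, column in df.items():
        # bool counts as numeric in pandas, so it is checked first.
        if is_bool_dtype(column) or not is_numeric_dtype(column):
            kinds[name] = "categorical"
        else:
            kinds[name] = "numerical"
    return kinds


df = pd.DataFrame({
    "age": [25, 31, 47],
    "income": [1.5, 2.0, 3.2],
    "city": ["NYC", "Boston", "NYC"],
    "is_member": [True, False, True],
    "user_id": pd.Series([101, 102, 103]).astype("category"),
})
print(guess_feature_kinds(df))
```

Here `user_id` shows up as categorical only because of the explicit cast; left as an integer column, it would be treated as numerical.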

What if I have text or image data?

DeepTab is designed for tabular data. For text or images:

Use a pre-trained encoder to generate embeddings
Pass embeddings via the embeddings parameter of fit()

from sentence_transformers import SentenceTransformer

# Encode text to embeddings
text_model = SentenceTransformer("all-MiniLM-L6-v2")
text_embeddings = text_model.encode(df["description"].tolist())

# Pass embeddings alongside tabular features
X_tabular = df.drop(columns=["description", "target"])
model = MambularClassifier()
model.fit(X_tabular, y, embeddings=text_embeddings, max_epochs=50)

Can I customize preprocessing per feature?

Not directly. PreprocessingConfig applies the same strategy to all numerical features. If you need per-feature preprocessing, apply it manually before passing to DeepTab:

# Custom preprocessing
df["log_income"] = np.log1p(df["income"])
df["age_binned"] = pd.cut(df["age"], bins=5).astype("category")

# Then fit DeepTab
model = MambularClassifier()
model.fit(df, y, max_epochs=50)

Training and performance

How do I speed up training?

Tip

Combine GPU acceleration with larger batch sizes and early stopping for the fastest training.

Several options:

Use a GPU: install CUDA-enabled PyTorch
Increase batch size: larger batches are more efficient when memory allows (TrainerConfig(batch_size=...))
Reduce epochs: rely on early stopping instead of a fixed epoch count
Use multi-worker data loading: pass num_workers through dataloader_kwargs in fit()

from deeptab.configs import TrainerConfig

model = MambularClassifier(
    trainer_config=TrainerConfig(
        batch_size=512,   # Larger batch size
        patience=10,      # Early stopping
    )
)

# num_workers is a DataLoader option, so pass it via dataloader_kwargs
model.fit(X_train, y_train, dataloader_kwargs={"num_workers": 4}, max_epochs=100)

Training is slow on GPU

Note

GPUs need larger batch sizes to show a speedup over CPU. Small batches or datasets may run faster on CPU.

Ensure you’re using GPU:

import torch
print(torch.cuda.is_available())  # Should be True

If True but still slow:

Small batches: GPU efficiency requires larger batches (try 256+)
Small dataset: for < 1K samples, CPU may be faster due to transfer overhead
CPU bottleneck: increase num_workers via dataloader_kwargs in fit() for faster data loading

How do I use early stopping?

Early stopping is enabled by default. Adjust patience:

from deeptab.configs import TrainerConfig

model = MambularClassifier(
    trainer_config=TrainerConfig(
        patience=15,  # Stop if no improvement for 15 epochs
    )
)

Provide an explicit validation set for better early stopping:

model.fit(
    X_train, y_train,
    X_val=X_val, y_val=y_val,
    max_epochs=100,
)
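The patience rule itself is simple: training stops once the validation loss has failed to improve for `patience` consecutive epochs. The pure-Python sketch below illustrates that rule; it is not DeepTab’s or Lightning’s implementation, and `stopping_epoch` is a name invented for this example.

```python
def stopping_epoch(val_losses, patience):
    """Return the 0-based epoch at which training would stop, or None if it never stops."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None


losses = [1.0, 0.8, 0.7, 0.71, 0.72, 0.73, 0.6]
print(stopping_epoch(losses, patience=3))  # 5
```

With `patience=3` the run ends at epoch 5 and never reaches the better loss at epoch 6, which is why a patience that is too small can cut training short on noisy validation curves.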

How do I save a trained model?

Use the .deeptab extension. DeepTab warns when a different extension is used.

# Save
model.save("my_model.deeptab")

# Load
from deeptab.models import MambularClassifier
loaded = MambularClassifier.load("my_model.deeptab")
predictions = loaded.predict(X_test)

The artifact includes weights, fitted preprocessor, feature schema, and task metadata.

Can I resume training from a checkpoint?

Not directly through the estimator API. If you need this, consider using TabularDataModule with PyTorch Lightning’s checkpointing directly.

How do I monitor training metrics?

DeepTab shows a progress bar by default. For richer per-epoch metrics, pass train_metrics/val_metrics dicts to fit(), or attach an experiment tracker through ObservabilityConfig:

from deeptab.core.observability import ObservabilityConfig

model = MambularClassifier(
    observability_config=ObservabilityConfig(verbosity=2, experiment_trackers=["tensorboard"]),
)

For fully custom metrics, use Lightning callbacks (advanced usage, see the Lightning docs).

Errors and troubleshooting

CUDA out of memory

Warning

GPU memory errors usually indicate that the batch size is too large for your GPU.

Reduce batch size:

from deeptab.configs import TrainerConfig

model = MambularClassifier(
    trainer_config=TrainerConfig(batch_size=64)  # Smaller batch size
)

Or force CPU training by passing the Lightning accelerator to fit():

model = MambularClassifier()
model.fit(X_train, y_train, accelerator="cpu")

ValueError: could not convert string to float

Tip

This usually means categorical features weren’t properly detected. Explicitly set dtypes.

Cast string-valued columns to a categorical dtype before fitting:

df["city"] = df["city"].astype("category")

If the column is meant to be numerical, check it for unexpected non-numeric values instead.

ImportError: No module named 'deeptab'

Ensure DeepTab is installed in the active environment:

pip list | grep deeptab

If not listed:

pip install deeptab
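A common cause is installing into one environment and running another. The following standard-library sketch prints the interpreter in use and whether `deeptab` can be imported from it; `is_installed` is a helper defined here for illustration only.

```python
import importlib.util
import sys


def is_installed(package: str) -> bool:
    """Return True if the top-level package can be found by the current interpreter."""
    return importlib.util.find_spec(package) is not None


print(f"Interpreter: {sys.executable}")
print(f"deeptab importable: {is_installed('deeptab')}")
```

If the interpreter path is not the one you expected, install with that interpreter explicitly, for example by running pip as `python -m pip` from the same environment.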

AttributeError: 'TabularDataModule' object has no attribute 'embedding_feature_info'

This was a bug in early v2.0 pre-releases. Upgrade to v2.0.0 or later:

pip install --upgrade deeptab

Training is unstable (loss explodes)

Warning

Exploding gradients indicate that the learning rate may be too high or that the data contains extreme values.

Try reducing learning rate:

from deeptab.configs import TrainerConfig

model = MambularClassifier(
    trainer_config=TrainerConfig(lr=1e-4)  # Lower learning rate
)

Or enable gradient clipping, which is off by default. Pass it to fit() as a Lightning trainer argument:

model = MambularClassifier()
model.fit(X_train, y_train, gradient_clip_val=0.5)
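To see what the clip value does, here is a minimal sketch of norm-based clipping on a plain list of gradient values. It assumes clipping by global L2 norm, which we understand to be Lightning’s default algorithm; the real implementation operates on tensors and differs in numerical details.

```python
import math


def clip_by_norm(gradients, max_norm):
    """Rescale the gradient vector so its L2 norm does not exceed max_norm."""
    total_norm = math.sqrt(sum(g * g for g in gradients))
    if total_norm <= max_norm:
        return list(gradients)
    scale = max_norm / total_norm
    return [g * scale for g in gradients]


clipped = clip_by_norm([3.0, 4.0], max_norm=0.5)
print(clipped)  # approximately [0.3, 0.4]
```

The direction of the update is preserved; only its length is capped, which is why clipping tames occasional spikes without changing well-behaved steps.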

RuntimeError: Expected all tensors to be on the same device

Note

The high-level estimator API handles device management automatically. This error typically occurs only with custom training loops.

In a custom training loop, move each batch to the same device as the model before the forward pass:

batch = batch.to("cuda")  # Move entire batch

Model-specific

What’s the difference between Mambular and MambaTab?

Both use Mamba (State Space Model) blocks, but differ in how they process features:

Mambular: Sequential model. Processes features one at a time in sequence, learning dependencies between features.
MambaTab: Joint model. Applies Mamba to a concatenated representation of all features at once.

Mambular tends to work better for datasets where feature order matters or where you want to learn sequential dependencies.

When should I use distributional regression (LSS)?

Tip

Use LSS models when you need uncertainty estimates, not just point predictions.

Use LSS models when you need:

Uncertainty quantification: Know when predictions are confident vs uncertain
Prediction intervals: Generate bounds expected to contain the target with a chosen probability (e.g., 95% intervals)
Heteroscedastic noise: Model varying noise levels across inputs
Risk-aware decisions: Use full distributions for downstream optimization

Example:

from deeptab.models import MambularLSS

model = MambularLSS()
model.fit(X_train, y_train, family="normal", max_epochs=50)

# Get mean and std for each prediction
params = model.predict(X_test)
mean = params[:, 0]
std = params[:, 1]

# 95% prediction interval
lower = mean - 1.96 * std
upper = mean + 1.96 * std
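The constant 1.96 is the normal quantile for 95% coverage. For other coverage levels, the multiplier can be computed with the standard library, as in this sketch. It assumes a normal predictive distribution, and `normal_interval` is a helper written for this example rather than a DeepTab function.

```python
from statistics import NormalDist


def normal_interval(mean, std, coverage=0.95):
    """Return (lower, upper) bounds of a central interval under a normal distribution."""
    z = NormalDist().inv_cdf(0.5 + coverage / 2)
    return mean - z * std, mean + z * std


lower, upper = normal_interval(mean=10.0, std=2.0, coverage=0.95)
print(round(lower, 2), round(upper, 2))  # 6.08 13.92
```

For other distribution families the interval should come from that family’s quantiles instead of the normal ones.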

Can I use my own custom architecture?

Yes, but it requires subclassing BaseTaskModel. See the source code for examples of how to extend the base classes.

Do experimental models work the same way as stable models?

Yes, the API is identical. The only difference is that experimental models may change without a deprecation cycle:

from deeptab.models.experimental import TromptClassifier

# Same API as stable models
model = TromptClassifier()
model.fit(X_train, y_train, max_epochs=50)

Integration

Can I use DeepTab with scikit-learn pipelines?

Yes:

from sklearn.pipeline import Pipeline
from deeptab.models import MambularClassifier

pipeline = Pipeline([
    ("model", MambularClassifier()),
])
pipeline.fit(X_train, y_train)
predictions = pipeline.predict(X_test)

Note: DeepTab does its own preprocessing, so additional preprocessing steps in the pipeline may be redundant.

Does GridSearchCV work?

Yes:

from sklearn.model_selection import GridSearchCV

search = GridSearchCV(
    estimator=MambularClassifier(),
    param_grid={
        "model_config__d_model": [64, 128],
        "trainer_config__lr": [1e-3, 5e-4],
    },
    cv=5,
)
search.fit(X_train, y_train)

Note: Keep n_jobs=1 in GridSearchCV when training on a GPU. With parallel jobs, every worker would try to use the same GPU at once.
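Because every grid point trains a full neural network once per fold, it is worth counting the fits before launching a search. The sketch below does that for the grid above using only the standard library; the extra fit accounts for scikit-learn’s default `refit=True`.

```python
from itertools import product

param_grid = {
    "model_config__d_model": [64, 128],
    "trainer_config__lr": [1e-3, 5e-4],
}
cv = 5

# Expand the grid into one dict per parameter combination.
combinations = [dict(zip(param_grid, values)) for values in product(*param_grid.values())]

# One fit per combination per fold, plus the final refit on the full training set.
total_fits = len(combinations) * cv + 1
print(len(combinations), total_fits)  # 4 21
```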

Can I deploy DeepTab models?

Yes. For deployment, use InferenceModel. It validates the input schema and exposes only the inference surface, preventing accidental retraining in production:

# Training environment
model.save("model.deeptab")

# Deployment environment
from deeptab import InferenceModel
model = InferenceModel.from_path("model.deeptab")

X_clean = model.validate_input(X_new)  # raises on schema mismatch
predictions = model.predict(X_clean)

See the Inference Model guide for the full deployment workflow.

Advanced usage

How do I access the underlying PyTorch model?

For most inspection needs, use the public helpers model.summary(), model.describe(), and model.parameter_table(). They work once the model is built or fitted and do not require touching internals.

model = MambularClassifier()
model.fit(X_train, y_train, max_epochs=50)

print(model.summary())        # human-readable overview
info = model.describe()       # structured dict (architecture, task, params, ...)

If you need direct access for advanced work, the fitted Lightning module lives in the private model._task_model attribute, and the raw nn.Module architecture is model._task_model.estimator. These are internal and may change between releases.

Can I use custom loss functions?

Not directly through the estimator API. If you need custom losses, use TabularDataModule with a custom Lightning module.

How do I extract learned features?

Start from the underlying architecture, whose layers produce the intermediate representations:

model = MambularClassifier()
model.fit(X_train, y_train, max_epochs=50)

# The raw architecture lives on the fitted Lightning module (internal API)
architecture = model._task_model.estimator

This is an advanced use case. See the source code for details.
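One general PyTorch technique for reading intermediate outputs is a forward hook. The sketch below demonstrates it on a small stand-in network, not on a DeepTab model. Which submodule to hook on `model._task_model.estimator`, and how its forward pass expects inputs, depends on the architecture and is not covered here.

```python
import torch
from torch import nn

# Stand-in network used only to demonstrate the hook mechanism.
network = nn.Sequential(
    nn.Linear(10, 32),
    nn.ReLU(),
    nn.Linear(32, 2),
)

captured = {}


def save_output(module, inputs, output):
    # Store a detached copy of the layer output for later inspection.
    captured["hidden"] = output.detach()


handle = network[1].register_forward_hook(save_output)
with torch.no_grad():
    network(torch.randn(4, 10))
handle.remove()  # Always remove hooks once the features are collected.

print(captured["hidden"].shape)  # torch.Size([4, 32])
```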

Can I use multiple GPUs?

DeepTab uses the first available GPU by default. For multi-GPU training, use Lightning’s distributed strategies directly with TabularDataModule (advanced usage).

Contributing and support

How do I report a bug?

Open an issue on GitHub with:

DeepTab version (import deeptab; print(deeptab.__version__))
Python version
PyTorch version
Minimal reproducible example
Full error traceback
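The version details can be collected in one go with a short standard-library script like the sketch below. The package names are the pip distribution names; `environment_report` is a helper written for this example.

```python
import platform
from importlib.metadata import PackageNotFoundError, version


def environment_report(packages=("deeptab", "torch")):
    """Collect the Python version and installed versions of the given packages."""
    report = {"python": platform.python_version()}
    for name in packages:
        try:
            report[name] = version(name)
        except PackageNotFoundError:
            report[name] = "not installed"
    return report


for key, value in environment_report().items():
    print(f"{key}: {value}")
```

Paste the output into the issue alongside the reproducible example and traceback.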

How do I request a feature?

Open a feature request on GitHub describing:

The use case
Why existing features don’t solve it
Proposed API (if applicable)

How do I contribute?

See the Contributing guide for:

Setting up the development environment
Running tests
Code style guidelines
Submitting pull requests

Where can I get help?

Check this FAQ first
Search GitHub issues
Open a new issue for bugs or questions
Join discussions on the GitHub repo

Performance comparisons

How does DeepTab compare to XGBoost?

It depends on the dataset:

Small datasets (< 1K samples): XGBoost often wins
Large datasets (> 10K samples): DeepTab is often competitive or better, especially with complex feature interactions
Categorical-heavy data: XGBoost may be more efficient
Need for uncertainty: DeepTab LSS models provide distributional predictions

Use both and compare on your specific data. DeepTab makes experimentation easy.

Is DeepTab faster than training PyTorch manually?

No. DeepTab uses PyTorch under the hood, so it provides convenience rather than speed improvements. However, it does:

Apply sensible defaults (early stopping, LR scheduling)
Handle device management automatically
Provide efficient data loading

So while not “faster”, it helps you get to a working model more quickly.

Still have questions?

If your question isn’t answered here:

Check the Core Concepts guide
Browse the Tutorials
Search GitHub issues
Open a new issue on GitHub