Configurations API
Base configs
These three classes form the core of the split-config API and are shared across all models.
- class deeptab.configs.TrainerConfig(max_epochs=100, batch_size=128, val_size=0.2, shuffle=True, stratify=True, patience=15, monitor='val_loss', mode='min', lr=0.0001, lr_patience=10, lr_factor=0.1, weight_decay=1e-06, optimizer_type='Adam', optimizer_kwargs=None, scheduler_type='ReduceLROnPlateau', scheduler_kwargs=None, scheduler_monitor=None, scheduler_interval='epoch', scheduler_frequency=1, no_weight_decay_for_bias_and_norm=False, checkpoint_path='model_checkpoints')[source]
Configuration for training loop, optimizer, and runtime execution.
These settings are entirely separate from model architecture. They control how a model is trained and executed, not what the model is.
- Parameters:
max_epochs (int) – Maximum number of training epochs.
batch_size (int) – Number of samples per gradient update.
val_size (float) – Fraction of the training data held out for validation when no explicit validation set is provided.
shuffle (bool) – Whether to shuffle training data before each epoch.
stratify (bool) – Whether to stratify the validation split on y for classification tasks, so the train and validation sets keep the same class proportions. Has no effect on regression, where a continuous target cannot be stratified. Set to False to draw a purely random split.
patience (int) – Number of epochs with no improvement on monitor before early stopping is triggered.
monitor (str) – Metric name to monitor for early stopping and checkpoint selection.
mode (str) – Whether the monitored metric should be minimised ("min") or maximised ("max").
lr (float) – Learning rate for the optimizer.
lr_patience (int) – Number of epochs with no improvement before the learning rate is reduced by lr_factor.
lr_factor (float) – Multiplicative factor applied to the learning rate when patience is exceeded.
weight_decay (float) – L2 regularisation coefficient (weight decay) for the optimizer.
optimizer_type (str) – Optimizer class name. Must be a valid torch.optim class name or a name registered in the project’s optimizer registry.
optimizer_kwargs (dict | None) – Extra keyword arguments forwarded to the optimizer constructor.
scheduler_type (str | None) – LR-scheduler class name (case-insensitive), or None/"none" to disable the scheduler entirely.
scheduler_kwargs (dict | None) – Extra keyword arguments forwarded to the scheduler constructor. factor and patience are synthesised from lr_factor and lr_patience for ReduceLROnPlateau when absent here.
scheduler_monitor (str | None) – Metric name for the scheduler to monitor. Falls back to the value of monitor when None.
scheduler_interval (str) – Lightning scheduling granularity: "epoch" or "step".
scheduler_frequency (int) – How often the scheduler steps at the given interval.
no_weight_decay_for_bias_and_norm (bool) – When True, bias vectors and normalisation-layer scale/shift parameters receive zero weight decay. Recommended for transformer-style models with LayerNorm.
checkpoint_path (str) – Directory where PyTorch Lightning model checkpoints are saved.
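The scheduler fields interact: scheduler_type can switch the scheduler off, and for ReduceLROnPlateau the factor and patience entries of scheduler_kwargs are filled in from lr_factor and lr_patience when they are missing. The following is a minimal sketch of that resolution rule as described above. The helper name resolve_scheduler_kwargs is hypothetical and not part of DeepTab, and the library's own implementation may differ in detail.

```python
from typing import Optional


def resolve_scheduler_kwargs(
    scheduler_type: Optional[str],
    scheduler_kwargs: Optional[dict],
    lr_factor: float = 0.1,
    lr_patience: int = 10,
) -> Optional[dict]:
    """Hypothetical helper mirroring the documented fallback rules."""
    # None or "none" (any casing) disables the scheduler entirely.
    if scheduler_type is None or scheduler_type.lower() == "none":
        return None

    # Copy so the caller's dict is never mutated.
    kwargs = dict(scheduler_kwargs or {})

    # Only ReduceLROnPlateau receives the synthesised defaults;
    # explicit entries in scheduler_kwargs always win.
    if scheduler_type.lower() == "reducelronplateau":
        kwargs.setdefault("factor", lr_factor)
        kwargs.setdefault("patience", lr_patience)
    return kwargs


print(resolve_scheduler_kwargs("ReduceLROnPlateau", {"patience": 3}))
# {'patience': 3, 'factor': 0.1}
print(resolve_scheduler_kwargs("none", None))
# None
```

Using setdefault keeps user-supplied values intact, which matches the "when absent here" wording of scheduler_kwargs.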
- get_params(deep=True)
Get parameters for this estimator.
- Parameters:
deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
params – Parameter names mapped to their values.
- Return type:
dict
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
**params (dict) – Estimator parameters.
- Returns:
self – Estimator instance.
- Return type:
estimator instance
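The get_params and set_params methods follow the scikit-learn estimator convention: the first returns a name-to-value mapping, the second updates values and returns the instance itself. A minimal sketch of that contract is shown below using a stand-in dataclass. ToyConfig is hypothetical and reproduces only two fields; it is not the DeepTab implementation and does not handle the nested <component>__<parameter> form.

```python
from dataclasses import dataclass, fields


@dataclass
class ToyConfig:
    """Stand-in for a config class; not the DeepTab implementation."""
    lr: float = 1e-4
    batch_size: int = 128

    def get_params(self, deep=True):
        # Map every dataclass field name to its current value.
        return {f.name: getattr(self, f.name) for f in fields(self)}

    def set_params(self, **params):
        valid = {f.name for f in fields(self)}
        for name, value in params.items():
            if name not in valid:
                raise ValueError(f"Invalid parameter {name!r}")
            setattr(self, name, value)
        return self  # returning self allows call chaining


cfg = ToyConfig().set_params(lr=3e-4)
print(cfg.get_params())
# {'lr': 0.0003, 'batch_size': 128}
```

Because the mapping returned by get_params can be passed straight back into set_params, this contract is what lets tools written against the scikit-learn convention, such as hyperparameter search, copy and modify a configuration.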
- class deeptab.configs.PreprocessingConfig(numerical_preprocessing=None, categorical_preprocessing=None, n_bins=None, feature_preprocessing=None, use_decision_tree_bins=None, binning_strategy=None, task=None, cat_cutoff=None, treat_all_integers_as_numerical=None, degree=None, scaling_strategy=None, n_knots=None, use_decision_tree_knots=None, knots_strategy=None, spline_implementation=None)[source]
Configuration for input feature preprocessing.
All fields map directly to arguments accepted by pretab.preprocessor.Preprocessor. Using None for any field leaves the preprocessor default in effect.
- Parameters:
numerical_preprocessing (str | None) – Strategy for transforming numerical features (e.g. "ple", "quantile", "standard"). None uses the preprocessor’s built-in default.
categorical_preprocessing (str | None) – Strategy for transforming categorical features (e.g. "int", "one-hot"). None uses the preprocessor’s built-in default.
n_bins (int | None) – Number of bins for numerical binning. None uses the preprocessor default.
feature_preprocessing (str | None) – General feature-level preprocessing override.
use_decision_tree_bins (bool | None) – Whether to use decision-tree-derived bin edges.
binning_strategy (str | None) – Strategy for choosing bin edges (e.g. "uniform", "quantile").
task (str | None) – Task type passed to the preprocessor for task-aware transformations (e.g. "regression", "classification").
cat_cutoff (float | None) – Threshold for treating integer columns as categorical.
treat_all_integers_as_numerical (bool | None) – When True, integer columns are never converted to categorical.
degree (int | None) – Polynomial / spline degree for numerical feature expansion.
scaling_strategy (str | None) – Scaling method applied to numerical features (e.g. "standard", "minmax", "robust").
n_knots (int | None) – Number of knots for spline preprocessing.
use_decision_tree_knots (bool | None) – Whether to use decision-tree-derived knot positions.
knots_strategy (str | None) – Strategy for knot placement.
spline_implementation (str | None) – Backend used for spline transformations.
- get_params(deep=True)
Get parameters for this estimator.
- Parameters:
deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
params – Parameter names mapped to their values.
- Return type:
dict
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
**params (dict) – Estimator parameters.
- Returns:
self – Estimator instance.
- Return type:
estimator instance
- to_preprocessor_kwargs()[source]
Return a dict of non-None fields suitable for passing to Preprocessor(**...).
- Returns:
Mapping of field name → value for every field that is not None.
- Return type:
dict
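The filtering performed by to_preprocessor_kwargs can be illustrated with a small stand-in. The sketch below assumes a dataclass-style config; ToyPreprocessingConfig is hypothetical, carries only three of the documented fields, and is not the DeepTab class.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class ToyPreprocessingConfig:
    """Stand-in with three of the documented fields; not the DeepTab class."""
    numerical_preprocessing: Optional[str] = None
    n_bins: Optional[int] = None
    use_decision_tree_bins: Optional[bool] = None

    def to_preprocessor_kwargs(self) -> dict:
        # Keep only fields that were set explicitly, so the
        # preprocessor's own defaults apply to everything else.
        return {
            f.name: getattr(self, f.name)
            for f in fields(self)
            if getattr(self, f.name) is not None
        }


cfg = ToyPreprocessingConfig(
    numerical_preprocessing="ple",
    use_decision_tree_bins=False,
)
print(cfg.to_preprocessor_kwargs())
# {'numerical_preprocessing': 'ple', 'use_decision_tree_bins': False}
```

The test is "is not None" rather than truthiness, so an explicit False or 0 is still forwarded; only unset fields are dropped.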
- class deeptab.configs.BaseModelConfig(use_embeddings=False, embedding_activation=Identity(), embedding_type='linear', embedding_bias=False, layer_norm_after_embedding=False, d_model=32, plr_lite=False, n_frequencies=48, frequencies_init_scale=0.01, embedding_projection=True, batch_norm=False, layer_norm=False, layer_norm_eps=1e-05, activation=ReLU(), cat_encoding='int')[source]
Shared architecture hyperparameters for all DeepTab models.
This class contains only architectural / structural configuration. Training-related parameters (lr, weight_decay, max_epochs, …) belong in TrainerConfig. Preprocessing parameters belong in PreprocessingConfig.
- Parameters:
use_embeddings (bool) – Whether to use embedding layers for numerical/categorical features.
embedding_activation (Callable) – Activation function applied to embeddings.
embedding_type (str) – Type of embedding ("linear", "plr", etc.).
embedding_bias (bool) – Whether to add a bias term to embedding layers.
layer_norm_after_embedding (bool) – Whether to apply layer normalisation after the embedding layer.
d_model (int) – Embedding / model dimensionality.
plr_lite (bool) – Whether to use the lightweight PLR embedding variant.
n_frequencies (int) – Number of frequency components for PLR embeddings.
frequencies_init_scale (float) – Initial scale for PLR frequency components.
embedding_projection (bool) – Whether to apply a linear projection after embeddings.
batch_norm (bool) – Whether to use batch normalisation in the model body.
layer_norm (bool) – Whether to use layer normalisation in the model body.
layer_norm_eps (float) – Epsilon for layer normalisation numerical stability.
activation (Callable) – Activation function used throughout the model body.
cat_encoding (str) – How categorical features are encoded at the model input ("int", "one-hot", "linear").
- get_params(deep=True)
Get parameters for this estimator.
- Parameters:
deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
params – Parameter names mapped to their values.
- Return type:
dict
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
**params (dict) – Estimator parameters.
- Returns:
self – Estimator instance.
- Return type:
estimator instance
Model architecture configs
Each class below extends BaseModelConfig and adds the hyperparameters
specific to one model family.
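The pattern of extending a shared base, overriding some defaults (for example d_model) and adding model-specific fields can be sketched with plain dataclasses. This is an illustration under the assumption that the configs are dataclass-like, which the <factory> defaults in the signatures suggest. ToyBaseConfig and ToyTransformerConfig are hypothetical stand-ins, not DeepTab classes.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ToyBaseConfig:
    """Stand-in for the shared base; only two fields are reproduced."""
    d_model: int = 32
    cat_encoding: str = "int"


@dataclass
class ToyTransformerConfig(ToyBaseConfig):
    """Hypothetical subclass: overrides one default and adds new fields."""
    d_model: int = 128  # model-specific override of the base default
    n_layers: int = 4
    # A mutable default must come from a factory so that instances
    # do not share one list object.
    head_layer_sizes: List[int] = field(default_factory=list)


a = ToyTransformerConfig()
b = ToyTransformerConfig(n_layers=2)
a.head_layer_sizes.append(64)
print(a.d_model, a.cat_encoding, b.n_layers, b.head_layer_sizes)
# 128 int 2 []
```

Inherited fields such as cat_encoding stay available on the subclass, which is why every model config signature repeats the base parameters before its own.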
- class deeptab.configs.AutoIntConfig(use_embeddings=False, embedding_activation=Identity(), embedding_type='linear', embedding_bias=False, layer_norm_after_embedding=False, d_model=128, plr_lite=False, n_frequencies=48, frequencies_init_scale=0.01, embedding_projection=True, batch_norm=False, layer_norm=False, layer_norm_eps=1e-05, activation=ReLU(), cat_encoding='int', n_layers=4, n_heads=8, attn_dropout=0.2, transformer_dim_feedforward=256, fprenorm=False, bias=True, use_cls=False, kv_compression=0.5, kv_compression_sharing='key-value')[source]
Architecture-only configuration for AutoInt models (DeepTab 2.0 API).
- Parameters:
d_model (int) – Dimensionality of the transformer model.
n_layers (int) – Number of transformer layers.
n_heads (int) – Number of attention heads in the transformer.
attn_dropout (float) – Dropout rate for the attention mechanism.
transformer_dim_feedforward (int) – Dimensionality of the feed-forward layers in the transformer.
fprenorm (bool) – Whether to apply pre-normalization in attention layers.
bias (bool) – Whether to use bias in linear layers.
use_cls (bool) – Whether to use a CLS token for pooling instead of averaging.
kv_compression (float) – Compression ratio for key-value pairs.
kv_compression_sharing (str) – Sharing strategy for key-value compression (‘headwise’ or ‘key-value’).
- get_params(deep=True)
Get parameters for this estimator.
- Parameters:
deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
params – Parameter names mapped to their values.
- Return type:
dict
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
**params (dict) – Estimator parameters.
- Returns:
self – Estimator instance.
- Return type:
estimator instance
- class deeptab.configs.ENODEConfig(use_embeddings=False, embedding_activation=Identity(), embedding_type='linear', embedding_bias=False, layer_norm_after_embedding=False, d_model=8, plr_lite=False, n_frequencies=48, frequencies_init_scale=0.01, embedding_projection=True, batch_norm=False, layer_norm=False, layer_norm_eps=1e-05, activation=ReLU(), cat_encoding='int', num_layers=4, layer_dim=64, tree_dim=1, depth=6, norm=None, head_layer_sizes=<factory>, head_dropout=0.3, head_skip_layers=False, head_activation=ReLU(), head_use_batch_norm=False)[source]
Architecture-only configuration for ENODE models (DeepTab 2.0 API).
- Parameters:
d_model (int) – Hidden dimensionality used in the ENODE model.
activation (Callable) – Activation function for the internal ENODE layers.
num_layers (int) – Number of dense layers in the model.
layer_dim (int) – Dimensionality of each dense layer.
tree_dim (int) – Dimensionality of the output from each tree leaf.
depth (int) – Depth of each decision tree in the ensemble.
norm (str | None) – Type of normalization to use in the model.
head_layer_sizes (list) – Sizes of the layers in the model’s head.
head_dropout (float) – Dropout rate for the head layers.
head_skip_layers (bool) – Whether to skip layers in the head.
head_activation (Callable) – Activation function for the head layers.
head_use_batch_norm (bool) – Whether to use batch normalization in the head layers.
- get_params(deep=True)
Get parameters for this estimator.
- Parameters:
deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
params – Parameter names mapped to their values.
- Return type:
dict
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
**params (dict) – Estimator parameters.
- Returns:
self – Estimator instance.
- Return type:
estimator instance
- class deeptab.configs.FTTransformerConfig(use_embeddings=False, embedding_activation=Identity(), embedding_type='linear', embedding_bias=False, layer_norm_after_embedding=False, d_model=128, plr_lite=False, n_frequencies=48, frequencies_init_scale=0.01, embedding_projection=True, batch_norm=False, layer_norm=False, layer_norm_eps=1e-05, activation=SELU(), cat_encoding='int', n_layers=4, n_heads=8, attn_dropout=0.2, ff_dropout=0.1, norm='LayerNorm', transformer_activation=ReGLU(), transformer_dim_feedforward=256, norm_first=False, bias=True, head_layer_sizes=<factory>, head_dropout=0.5, head_skip_layers=False, head_activation=SELU(), head_use_batch_norm=False, pooling_method='avg', use_cls=False)[source]
Architecture-only configuration for FTTransformer models (DeepTab 2.0 API).
- Parameters:
d_model (int) – Dimensionality of the transformer model.
activation (Callable) – Activation function for the transformer layers.
n_layers (int) – Number of transformer layers.
n_heads (int) – Number of attention heads in the transformer.
attn_dropout (float) – Dropout rate for the attention mechanism.
ff_dropout (float) – Dropout rate for the feed-forward layers.
norm (str) – Type of normalization to be used (‘LayerNorm’, ‘RMSNorm’, etc.).
transformer_activation (Callable) – Activation function for the transformer feed-forward layers.
transformer_dim_feedforward (int) – Dimensionality of the feed-forward layers in the transformer.
norm_first (bool) – Whether to apply normalization before other operations in each transformer block.
bias (bool) – Whether to use bias in linear layers.
head_layer_sizes (list) – Sizes of the fully connected layers in the model’s head.
head_dropout (float) – Dropout rate for the head layers.
head_skip_layers (bool) – Whether to use skip connections in the head layers.
head_activation (Callable) – Activation function for the head layers.
head_use_batch_norm (bool) – Whether to use batch normalization in the head layers.
pooling_method (str) – Pooling method to be used (‘cls’, ‘avg’, etc.).
use_cls (bool) – Whether to use a CLS token for pooling.
- get_params(deep=True)
Get parameters for this estimator.
- Parameters:
deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
params – Parameter names mapped to their values.
- Return type:
dict
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
**params (dict) – Estimator parameters.
- Returns:
self – Estimator instance.
- Return type:
estimator instance
- class deeptab.configs.MambaTabConfig(use_embeddings=False, embedding_activation=Identity(), embedding_type='linear', embedding_bias=False, layer_norm_after_embedding=False, d_model=64, plr_lite=False, n_frequencies=48, frequencies_init_scale=0.01, embedding_projection=True, batch_norm=False, layer_norm=False, layer_norm_eps=1e-05, activation=ReLU(), cat_encoding='int', n_layers=1, expand_factor=2, bias=False, d_conv=16, conv_bias=True, dropout=0.05, dt_rank='auto', d_state=128, dt_scale=1.0, dt_init='random', dt_max=0.1, dt_min=0.0001, dt_init_floor=0.0001, axis=1, head_layer_sizes=<factory>, head_dropout=0.0, head_skip_layers=False, head_activation=ReLU(), head_use_batch_norm=False, norm='LayerNorm', use_pscan=False, mamba_version='mamba-torch', bidirectional=False)[source]
Architecture-only configuration for MambaTab models (DeepTab 2.0 API).
- Parameters:
d_model (int) – Dimensionality of the model.
n_layers (int) – Number of layers in the model.
expand_factor (int) – Expansion factor for the inner dimension of each Mamba block.
bias (bool) – Whether to use bias in the linear layers.
d_conv (int) – Kernel size of the 1D convolution.
conv_bias (bool) – Whether to use bias in the convolutional layers.
dropout (float) – Dropout rate for regularization.
dt_rank (str) – Rank of the projection that produces the state-space time step Δ (dt).
d_state (int) – Dimensionality of the state in the state-space (recurrent) layers.
dt_scale (float) – Scaling factor for the initialization of the time-step (Δ) projection.
dt_init (str) – Initialization method for the time-step (Δ) projection.
dt_max (float) – Maximum value of the initial time step Δ.
dt_min (float) – Minimum value of the initial time step Δ.
dt_init_floor (float) – Floor value applied to the initial time step Δ.
axis (int) – Axis along which operations are applied, if applicable.
head_layer_sizes (list) – Sizes of the fully connected layers in the model’s head.
head_dropout (float) – Dropout rate for the head layers.
head_skip_layers (bool) – Whether to skip layers in the head.
head_activation (Callable) – Activation function for the head layers.
head_use_batch_norm (bool) – Whether to use batch normalization in the head layers.
norm (str) – Type of normalization to be used (‘LayerNorm’, ‘RMSNorm’, etc.).
use_pscan (bool) – Whether to use PSCAN for the state-space model.
mamba_version (str) – Version of the Mamba model to use (‘mamba-torch’, ‘mamba1’, ‘mamba2’).
bidirectional (bool) – Whether to process data bidirectionally.
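In Mamba-style state-space models, dt denotes the discretisation time step Δ, and the dt_* settings control how it is parameterised and initialised. The sketch below follows the convention of the reference Mamba implementation: dt_rank="auto" resolves to ceil(d_model / 16), the initial Δ is drawn log-uniformly between dt_min and dt_max and clamped at dt_init_floor, and the bias is stored through an inverse softplus. Whether DeepTab matches this exactly is an assumption, and the helper names are hypothetical.

```python
import math
import random


def resolve_dt_rank(dt_rank, d_model):
    # Reference Mamba convention: "auto" means ceil(d_model / 16).
    return math.ceil(d_model / 16) if dt_rank == "auto" else int(dt_rank)


def sample_initial_dt(dt_min=1e-4, dt_max=0.1, dt_init_floor=1e-4, rng=None):
    rng = rng or random.Random(0)
    # Log-uniform sample in [dt_min, dt_max], then clamp from below.
    log_dt = rng.uniform(math.log(dt_min), math.log(dt_max))
    return max(math.exp(log_dt), dt_init_floor)


def inverse_softplus(dt):
    # The bias is stored so that softplus(bias) == dt at initialisation.
    return dt + math.log(-math.expm1(-dt))


dt = sample_initial_dt()
bias = inverse_softplus(dt)
print(resolve_dt_rank("auto", 64))
# 4
```

Sampling on a log scale spreads the initial time steps evenly across orders of magnitude, which is why dt_min and dt_max are given as a range rather than a single value.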
- get_params(deep=True)
Get parameters for this estimator.
- Parameters:
deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
params – Parameter names mapped to their values.
- Return type:
dict
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
**params (dict) – Estimator parameters.
- Returns:
self – Estimator instance.
- Return type:
estimator instance
- class deeptab.configs.MambAttentionConfig(use_embeddings=False, embedding_activation=Identity(), embedding_type='linear', embedding_bias=False, layer_norm_after_embedding=False, d_model=64, plr_lite=False, n_frequencies=48, frequencies_init_scale=0.01, embedding_projection=True, batch_norm=False, layer_norm=False, layer_norm_eps=1e-05, activation=SiLU(), cat_encoding='int', n_layers=4, expand_factor=2, n_heads=8, last_layer='attn', n_mamba_per_attention=1, bias=False, d_conv=4, conv_bias=True, dropout=0.0, attn_dropout=0.2, dt_rank='auto', d_state=128, dt_scale=1.0, dt_init='random', dt_max=0.1, dt_min=0.0001, dt_init_floor=0.0001, norm='LayerNorm', AD_weight_decay=True, BC_layer_norm=False, shuffle_embeddings=False, head_layer_sizes=<factory>, head_dropout=0.5, head_skip_layers=False, head_activation=SELU(), head_use_batch_norm=False, pooling_method='avg', bidirectional=False, use_learnable_interaction=False, use_cls=False, use_pscan=False, n_attention_layers=1)[source]
Architecture-only configuration for MambAttention models (DeepTab 2.0 API).
- Parameters:
d_model (int) – Dimensionality of the model.
activation (Callable) – Activation function for the model.
n_layers (int) – Number of layers in the model.
expand_factor (int) – Expansion factor for the inner dimension of each Mamba block.
n_heads (int) – Number of attention heads in the model.
last_layer (str) – Type of the last layer (e.g., ‘attn’).
n_mamba_per_attention (int) – Number of Mamba blocks per attention layer.
bias (bool) – Whether to use bias in the linear layers.
d_conv (int) – Kernel size of the 1D convolution.
conv_bias (bool) – Whether to use bias in the convolutional layers.
dropout (float) – Dropout rate for regularization.
attn_dropout (float) – Dropout rate for the attention mechanism.
dt_rank (str) – Rank of the projection that produces the state-space time step Δ (dt).
d_state (int) – Dimensionality of the state in the state-space (recurrent) layers.
dt_scale (float) – Scaling factor for the initialization of the time-step (Δ) projection.
dt_init (str) – Initialization method for the time-step (Δ) projection.
dt_max (float) – Maximum value of the initial time step Δ.
dt_min (float) – Minimum value of the initial time step Δ.
dt_init_floor (float) – Floor value applied to the initial time step Δ.
norm (str) – Type of normalization used in the model.
AD_weight_decay (bool) – Whether weight decay is applied to the A and D matrices.
BC_layer_norm (bool) – Whether to apply layer normalization to the B and C matrices.
shuffle_embeddings (bool) – Whether to shuffle embeddings before passing to Mamba layers.
head_layer_sizes (list) – Sizes of the fully connected layers in the model’s head.
head_dropout (float) – Dropout rate for the head layers.
head_skip_layers (bool) – Whether to use skip connections in the head layers.
head_activation (Callable) – Activation function for the head layers.
head_use_batch_norm (bool) – Whether to use batch normalization in the head layers.
pooling_method (str) – Pooling method to be used (‘avg’, ‘max’, etc.).
bidirectional (bool) – Whether to process input sequences bidirectionally.
use_learnable_interaction (bool) – Whether to use learnable feature interactions before passing through Mamba blocks.
use_cls (bool) – Whether to append a CLS token for sequence pooling.
use_pscan (bool) – Whether to use PSCAN for the state-space model.
n_attention_layers (int) – Number of attention layers in the model.
- get_params(deep=True)
Get parameters for this estimator.
- Parameters:
deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
params – Parameter names mapped to their values.
- Return type:
dict
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
**params (dict) – Estimator parameters.
- Returns:
self – Estimator instance.
- Return type:
estimator instance
- class deeptab.configs.MambularConfig(use_embeddings=False, embedding_activation=Identity(), embedding_type='linear', embedding_bias=False, layer_norm_after_embedding=False, d_model=64, plr_lite=False, n_frequencies=48, frequencies_init_scale=0.01, embedding_projection=True, batch_norm=False, layer_norm=False, layer_norm_eps=1e-05, activation=SiLU(), cat_encoding='int', n_layers=4, d_conv=4, dilation=1, expand_factor=2, bias=False, dropout=0.0, dt_rank='auto', d_state=128, dt_scale=1.0, dt_init='random', dt_max=0.1, dt_min=0.0001, dt_init_floor=0.0001, norm='RMSNorm', conv_bias=False, AD_weight_decay=True, BC_layer_norm=False, shuffle_embeddings=False, head_layer_sizes=<factory>, head_dropout=0.5, head_skip_layers=False, head_activation=SELU(), head_use_batch_norm=False, pooling_method='avg', bidirectional=False, use_learnable_interaction=False, use_cls=False, use_pscan=False, mamba_version='mamba-torch')[source]
Architecture-only configuration for Mambular models (DeepTab 2.0 API).
- Parameters:
d_model (int) – Dimensionality of the model.
activation (Callable) – Activation function for the model.
n_layers (int) – Number of layers in the model.
d_conv (int) – Size of convolution over columns.
dilation (int) – Dilation factor for the convolution.
expand_factor (int) – Expansion factor for the inner dimension of each Mamba block.
bias (bool) – Whether to use bias in the linear layers.
dropout (float) – Dropout rate for regularization.
dt_rank (str) – Rank of the projection that produces the state-space time step Δ (dt).
d_state (int) – Dimensionality of the state in the state-space (recurrent) layers.
dt_scale (float) – Scaling factor for the initialization of the time-step (Δ) projection.
dt_init (str) – Initialization method for the time-step (Δ) projection.
dt_max (float) – Maximum value of the initial time step Δ.
dt_min (float) – Minimum value of the initial time step Δ.
dt_init_floor (float) – Floor value applied to the initial time step Δ.
norm (str) – Type of normalization used (‘RMSNorm’, etc.).
conv_bias (bool) – Whether to use a bias in the 1D convolution before each Mamba block.
AD_weight_decay (bool) – Whether to also apply weight decay to the A and D matrices in Mamba.
BC_layer_norm (bool) – Whether to use layer norm on the B and C matrices.
shuffle_embeddings (bool) – Whether to shuffle embeddings before being passed to Mamba layers.
head_layer_sizes (list) – Sizes of the layers in the model’s head.
head_dropout (float) – Dropout rate for the head layers.
head_skip_layers (bool) – Whether to skip layers in the head.
head_activation (Callable) – Activation function for the head layers.
head_use_batch_norm (bool) – Whether to use batch normalization in the head layers.
pooling_method (str) – Pooling method to use (‘avg’, ‘max’, etc.).
bidirectional (bool) – Whether to process data bidirectionally.
use_learnable_interaction (bool) – Whether to use learnable feature interactions before passing through Mamba blocks.
use_cls (bool) – Whether to append a CLS token to the input sequences.
use_pscan (bool) – Whether to use PSCAN for the state-space model.
mamba_version (str) – Version of the Mamba model to use (‘mamba-torch’, ‘mamba1’, ‘mamba2’).
- get_params(deep=True)
Get parameters for this estimator.
- Parameters:
deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
params – Parameter names mapped to their values.
- Return type:
dict
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
**params (dict) – Estimator parameters.
- Returns:
self – Estimator instance.
- Return type:
estimator instance
- class deeptab.configs.MLPConfig(use_embeddings=False, embedding_activation=Identity(), embedding_type='linear', embedding_bias=False, layer_norm_after_embedding=False, d_model=32, plr_lite=False, n_frequencies=48, frequencies_init_scale=0.01, embedding_projection=True, batch_norm=False, layer_norm=False, layer_norm_eps=1e-05, activation=ReLU(), cat_encoding='int', layer_sizes=<factory>, dropout=0.2, use_glu=False, skip_connections=False)[source]
Architecture-only configuration for MLP models (DeepTab 2.0 API).
Contains only structural hyperparameters. Training parameters (lr, max_epochs, …) go in TrainerConfig and preprocessing parameters go in PreprocessingConfig.
- Parameters:
layer_sizes (list) – Number of units in each hidden layer.
activation (Callable) – Activation function for the MLP layers.
skip_layers (bool, default=False) – Whether to include skip layers.
dropout (float) – Dropout rate applied after each hidden layer.
use_glu (bool) – Whether to use Gated Linear Units instead of the plain activation.
skip_connections (bool) – Whether to use residual/skip connections between layers.
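Since layer_sizes lists the number of units per hidden layer, the shapes of the linear layers follow directly once the input and output widths are known. The sketch below shows that derivation. The helper linear_layer_shapes and the example sizes are illustrative assumptions; they are neither DeepTab code nor the library's default layer_sizes.

```python
def linear_layer_shapes(input_dim, layer_sizes, output_dim):
    """Return (in_features, out_features) for each linear layer."""
    dims = [input_dim, *layer_sizes, output_dim]
    # Pair each width with the next one: layer i maps dims[i] -> dims[i + 1].
    return list(zip(dims[:-1], dims[1:]))


shapes = linear_layer_shapes(10, [256, 128, 32], 1)
print(shapes)
# [(10, 256), (256, 128), (128, 32), (32, 1)]

# Weight count (biases excluded) implied by those shapes.
print(sum(i * o for i, o in shapes))
# 39456
```

A list of N hidden sizes therefore yields N + 1 linear layers, the last of which is the output projection.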
- get_params(deep=True)
Get parameters for this estimator.
- Parameters:
deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
params – Parameter names mapped to their values.
- Return type:
dict
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
**params (dict) – Estimator parameters.
- Returns:
self – Estimator instance.
- Return type:
estimator instance
- class deeptab.configs.NDTFConfig(use_embeddings=False, embedding_activation=Identity(), embedding_type='linear', embedding_bias=False, layer_norm_after_embedding=False, d_model=32, plr_lite=False, n_frequencies=48, frequencies_init_scale=0.01, embedding_projection=True, batch_norm=False, layer_norm=False, layer_norm_eps=1e-05, activation=ReLU(), cat_encoding='int', min_depth=4, max_depth=16, temperature=0.1, node_sampling=0.3, lamda=0.3, n_ensembles=12, penalty_factor=1e-08)[source]
Architecture-only configuration for NDTF models (DeepTab 2.0 API).
- Parameters:
min_depth (int) – Minimum depth of trees in the forest. Controls the simplest model structure.
max_depth (int) – Maximum depth of trees in the forest. Controls the maximum complexity of the trees.
temperature (float) – Temperature parameter for softening the node decisions during path probability calculation.
node_sampling (float) – Fraction of nodes sampled for regularization penalty calculation. Reduces computation by focusing on a subset of nodes.
lamda (float) – Regularization parameter to control the complexity of the paths, penalizing overconfident or imbalanced paths.
n_ensembles (int) – Number of trees in the forest.
penalty_factor (float) – Factor with which the penalty is multiplied.
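The role of temperature can be illustrated with a generic soft decision tree, where each internal node routes a sample to the right with a probability given by a temperature-scaled sigmoid and a path probability is the product of the routing probabilities along it. This is a sketch of the general technique under that assumption, not the NDTF implementation; both function names are hypothetical.

```python
import math


def route_right_probability(logit, temperature=0.1):
    # Temperature-scaled sigmoid: small temperatures give near-hard splits,
    # large temperatures push every decision towards 0.5.
    z = logit / temperature
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)  # numerically stable branch for negative z
    return e / (1.0 + e)


def path_probability(logits, go_right, temperature=0.1):
    """Probability of following one root-to-leaf path in a soft tree."""
    prob = 1.0
    for logit, right in zip(logits, go_right):
        p_right = route_right_probability(logit, temperature)
        prob *= p_right if right else 1.0 - p_right
    return prob


for t in (0.1, 1.0, 10.0):
    print(t, round(route_right_probability(0.2, t), 3))
```

With the same node logit, lowering the temperature moves the routing probability away from 0.5, so paths become more decisive; raising it spreads probability mass more evenly over the leaves.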
- get_params(deep=True)
Get parameters for this estimator.
- Parameters:
deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
params – Parameter names mapped to their values.
- Return type:
dict
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
**params (dict) – Estimator parameters.
- Returns:
self – Estimator instance.
- Return type:
estimator instance
- class deeptab.configs.NODEConfig(use_embeddings=False, embedding_activation=Identity(), embedding_type='linear', embedding_bias=False, layer_norm_after_embedding=False, d_model=32, plr_lite=False, n_frequencies=48, frequencies_init_scale=0.01, embedding_projection=True, batch_norm=False, layer_norm=False, layer_norm_eps=1e-05, activation=ReLU(), cat_encoding='int', num_layers=4, layer_dim=128, tree_dim=1, depth=6, norm=None, head_layer_sizes=<factory>, head_dropout=0.3, head_skip_layers=False, head_activation=ReLU(), head_use_batch_norm=False)[source]
Architecture-only configuration for NODE models (DeepTab 2.0 API).
- Parameters:
num_layers (int) – Number of dense layers in the model.
layer_dim (int) – Dimensionality of each dense layer.
tree_dim (int) – Dimensionality of the output from each tree leaf.
depth (int) – Depth of each decision tree in the ensemble.
norm (str | None) – Type of normalization to use in the model.
head_layer_sizes (list) – Sizes of the layers in the model’s head.
head_dropout (float) – Dropout rate for the head layers.
head_skip_layers (bool) – Whether to skip layers in the head.
head_activation (Callable) – Activation function for the head layers.
head_use_batch_norm (bool) – Whether to use batch normalization in the head layers.
- get_params(deep=True)
Get parameters for this estimator.
- Parameters:
deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
params – Parameter names mapped to their values.
- Return type:
dict
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
**params (dict) – Estimator parameters.
- Returns:
self – Estimator instance.
- Return type:
estimator instance
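To give a feel for how depth and tree_dim interact, here is a small arithmetic sketch. It assumes each tree is a complete binary tree, so a tree of depth d has 2**d leaves; whether deeptab's NODE implementation stores exactly this many response values is an assumption, and leaves_per_tree is a hypothetical helper.

```python
def leaves_per_tree(depth: int) -> int:
    """Number of leaves in a complete binary tree of the given depth."""
    if depth < 0:
        raise ValueError("depth must be non-negative")
    return 2 ** depth


depth, tree_dim = 6, 1  # defaults from the NODEConfig signature
n_leaves = leaves_per_tree(depth)
# Assumption: each leaf holds a response vector of size tree_dim.
response_values_per_tree = n_leaves * tree_dim
print(n_leaves, response_values_per_tree)  # 64 64
```

Each extra level of depth doubles the number of leaves, so depth is the parameter to watch when memory is tight.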
- class deeptab.configs.ResNetConfig(use_embeddings=False, embedding_activation=Identity(), embedding_type='linear', embedding_bias=False, layer_norm_after_embedding=False, d_model=32, plr_lite=False, n_frequencies=48, frequencies_init_scale=0.01, embedding_projection=True, batch_norm=False, layer_norm=False, layer_norm_eps=1e-05, activation=SELU(), cat_encoding='int', layer_sizes=<factory>, dropout=0.5, norm=False, num_blocks=3)[source]
Architecture-only configuration for ResNet models (DeepTab 2.0 API).
- Parameters:
activation (Callable) – Activation function for the ResNet layers.
layer_sizes (list) – Sizes of the layers in the ResNet.
dropout (float) – Dropout rate for regularization.
norm (bool) – Whether to use normalization in the ResNet.
num_blocks (int) – Number of residual blocks in the ResNet.
- get_params(deep=True)
Get parameters for this estimator.
- Parameters:
deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
params – Parameter names mapped to their values.
- Return type:
dict
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
**params (dict) – Estimator parameters.
- Returns:
self – Estimator instance.
- Return type:
estimator instance
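The layer_sizes=<factory> default in the signature is how Python renders a dataclass field declared with default_factory. The sketch below assumes the config classes are dataclasses (the signature suggests it, but this page does not state it); ToyResNetConfig and its list values are hypothetical.

```python
import inspect
from dataclasses import dataclass, field


@dataclass
class ToyResNetConfig:
    """Hypothetical stand-in; the layer sizes are illustrative only."""
    layer_sizes: list = field(default_factory=lambda: [128, 64])
    dropout: float = 0.5
    num_blocks: int = 3


# The generated __init__ shows the mutable default as <factory>.
signature_text = str(inspect.signature(ToyResNetConfig))

# Each instance receives its own list, so edits do not leak across configs.
a = ToyResNetConfig()
b = ToyResNetConfig()
a.layer_sizes.append(32)
```

This is why a list-valued default can be modified on one config without affecting configs created later.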
- class deeptab.configs.SAINTConfig(use_embeddings=False, embedding_activation=Identity(), embedding_type='linear', embedding_bias=False, layer_norm_after_embedding=False, d_model=128, plr_lite=False, n_frequencies=48, frequencies_init_scale=0.01, embedding_projection=True, batch_norm=False, layer_norm=False, layer_norm_eps=1e-05, activation=GELU(approximate='none'), cat_encoding='int', n_layers=1, n_heads=2, attn_dropout=0.2, ff_dropout=0.1, norm='LayerNorm', norm_first=False, bias=True, head_layer_sizes=<factory>, head_dropout=0.5, head_skip_layers=False, head_activation=SELU(), head_use_batch_norm=False, pooling_method='cls', use_cls=True)[source]
Architecture-only configuration for SAINT models (DeepTab 2.0 API).
- Parameters:
d_model (int) – Dimensionality of embeddings or model representations.
activation (Callable) – Activation function for the transformer layers.
n_layers (int) – Number of transformer layers.
n_heads (int) – Number of attention heads in the transformer.
attn_dropout (float) – Dropout rate for the attention mechanism.
ff_dropout (float) – Dropout rate for the feed-forward layers.
norm (str) – Type of normalization to be used (‘LayerNorm’, ‘RMSNorm’, etc.).
norm_first (bool) – Whether to apply normalization before other operations in each transformer block.
bias (bool) – Whether to use bias in linear layers.
head_layer_sizes (list) – Sizes of the fully connected layers in the model’s head.
head_dropout (float) – Dropout rate for the head layers.
head_skip_layers (bool) – Whether to use skip connections in the head layers.
head_activation (Callable) – Activation function for the head layers.
head_use_batch_norm (bool) – Whether to use batch normalization in the head layers.
pooling_method (str) – Pooling method to be used (‘cls’, ‘avg’, etc.).
use_cls (bool) – Whether to use a CLS token for pooling.
- get_params(deep=True)
Get parameters for this estimator.
- Parameters:
deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
params – Parameter names mapped to their values.
- Return type:
dict
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
**params (dict) – Estimator parameters.
- Returns:
self – Estimator instance.
- Return type:
estimator instance
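When changing d_model or n_heads, it can help to check that they are compatible. In standard multi-head attention the model dimension is split evenly across heads, so d_model must be divisible by n_heads. The helper below is a hypothetical pre-flight check under that assumption, not a deeptab function.

```python
def head_dim(d_model: int, n_heads: int) -> int:
    """Per-head dimensionality in standard multi-head attention."""
    if n_heads <= 0:
        raise ValueError("n_heads must be positive")
    if d_model % n_heads != 0:
        raise ValueError(
            f"d_model={d_model} is not divisible by n_heads={n_heads}"
        )
    return d_model // n_heads


# Defaults from the SAINTConfig signature: d_model=128, n_heads=2.
print(head_dim(128, 2))  # 64
```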
- class deeptab.configs.TabMConfig(use_embeddings=False, embedding_activation=Identity(), embedding_type='linear', embedding_bias=False, layer_norm_after_embedding=False, d_model=32, plr_lite=False, n_frequencies=48, frequencies_init_scale=0.01, embedding_projection=True, batch_norm=False, layer_norm=False, layer_norm_eps=1e-05, activation=ReLU(), cat_encoding='int', layer_sizes=<factory>, dropout=0.5, norm=None, use_glu=False, ensemble_size=32, ensemble_scaling_in=True, ensemble_scaling_out=True, ensemble_bias=True, scaling_init='ones', average_ensembles=False, model_type='mini', average_embeddings=True)[source]
Architecture-only configuration for TabM models (DeepTab 2.0 API).
- Parameters:
layer_sizes (list) – Sizes of the layers in the model.
dropout (float) – Dropout rate for regularization.
norm (str | None) – Normalization method to be used, if any.
use_glu (bool) – Whether to use Gated Linear Units (GLU) in the model.
ensemble_size (int) – Number of ensemble members for batch ensembling.
ensemble_scaling_in (bool) – Whether to use input scaling for each ensemble member.
ensemble_scaling_out (bool) – Whether to use output scaling for each ensemble member.
ensemble_bias (bool) – Whether to use a unique bias term for each ensemble member.
scaling_init (Literal['ones', 'random-signs', 'normal']) – Initialization method for scaling weights.
average_ensembles (bool) – Whether to average the outputs of the ensembles.
model_type (Literal['mini', 'full']) – Model type to use (‘mini’ for reduced version, ‘full’ for complete model).
average_embeddings (bool) – Whether to average per-ensemble-member embeddings before the head.
- get_params(deep=True)
Get parameters for this estimator.
- Parameters:
deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
params – Parameter names mapped to their values.
- Return type:
dict
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
**params (dict) – Estimator parameters.
- Returns:
self – Estimator instance.
- Return type:
estimator instance
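The three scaling_init options can be illustrated with a plain-Python sketch that draws one scaling vector per ensemble member. init_scaling is a hypothetical helper; in particular, using a standard normal for 'normal' is an assumption, since this page does not specify the distribution deeptab uses.

```python
import random


def init_scaling(ensemble_size, n_features, scaling_init="ones", seed=0):
    """Return ensemble_size scaling vectors, each of length n_features."""
    rng = random.Random(seed)
    if scaling_init == "ones":
        draw = lambda: 1.0
    elif scaling_init == "random-signs":
        draw = lambda: rng.choice((-1.0, 1.0))
    elif scaling_init == "normal":
        # Assumption: zero mean, unit variance.
        draw = lambda: rng.gauss(0.0, 1.0)
    else:
        raise ValueError(f"Unknown scaling_init: {scaling_init!r}")
    return [[draw() for _ in range(n_features)] for _ in range(ensemble_size)]


ones = init_scaling(4, 3, "ones")
signs = init_scaling(4, 3, "random-signs")
normal = init_scaling(4, 3, "normal")
```

With 'ones', every member starts from the same scaling and only diverges during training; the other two options make members differ from the first step.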
- class deeptab.configs.TabRConfig(use_embeddings=False, embedding_activation=Identity(), embedding_type='plr', embedding_bias=False, layer_norm_after_embedding=False, d_model=32, plr_lite=True, n_frequencies=75, frequencies_init_scale=0.045, embedding_projection=True, batch_norm=False, layer_norm=False, layer_norm_eps=1e-05, activation=ReLU(), cat_encoding='int', d_main=256, context_dropout=0.38920071545944357, d_multiplier=2, encoder_n_blocks=0, predictor_n_blocks=1, mixer_normalization='auto', dropout0=0.38852797479169876, dropout1=0.0, normalization='LayerNorm', memory_efficient=False, candidate_encoding_batch_size=0, context_size=96)[source]
Architecture-only configuration for TabR models (DeepTab 2.0 API).
Training fields (lr, weight_decay, lr_factor) are configured via TrainerConfig.
- Parameters:
embedding_type (str) – Type of feature embedding to use (e.g., ‘plr’, ‘ple’).
plr_lite (bool) – Whether to use the lightweight PLR embedding variant.
n_frequencies (int) – Number of random Fourier feature frequencies.
frequencies_init_scale (float) – Scale for initializing Fourier feature frequencies.
d_main (int) – Main hidden dimensionality of the predictor network.
context_dropout (float) – Dropout applied to context (candidate) representations.
d_multiplier (int) – Multiplier for intermediate dimensions inside the predictor.
encoder_n_blocks (int) – Number of residual blocks in the feature encoder.
predictor_n_blocks (int) – Number of residual blocks in the predictor network.
mixer_normalization (str) – Normalization strategy for the mixer ('auto' selects adaptively).
dropout0 (float) – Dropout rate on the first linear projection.
dropout1 (float) – Dropout rate on the second linear projection.
normalization (str) – Type of normalization layer to use.
memory_efficient (bool) – Whether to trade compute for lower memory in candidate lookups.
candidate_encoding_batch_size (int) – Batch size for encoding candidates (0 = full batch).
context_size (int) – Number of nearest-neighbour candidates to retrieve per sample.
- get_params(deep=True)
Get parameters for this estimator.
- Parameters:
deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
params – Parameter names mapped to their values.
- Return type:
dict
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
**params (dict) – Estimator parameters.
- Returns:
self – Estimator instance.
- Return type:
estimator instance
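The role of context_size can be pictured as a nearest-neighbour lookup: for each sample, the context_size closest candidates are retrieved. The sketch below uses Euclidean distance on raw vectors, which is an assumption made for illustration (a real model would typically compare learned representations); retrieve_context is a hypothetical helper.

```python
import heapq
import math


def retrieve_context(query, candidates, context_size):
    """Return indices of the context_size candidates closest to query."""
    k = min(context_size, len(candidates))
    return heapq.nsmallest(
        k,
        range(len(candidates)),
        key=lambda i: math.dist(query, candidates[i]),
    )


candidates = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (0.1, 0.0)]
neighbours = retrieve_context((0.0, 0.0), candidates, context_size=2)
print(neighbours)  # [0, 3]
```

If context_size exceeds the number of candidates, this sketch simply returns all of them, ordered by distance.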
- class deeptab.configs.TabTransformerConfig(use_embeddings=False, embedding_activation=Identity(), embedding_type='linear', embedding_bias=False, layer_norm_after_embedding=False, d_model=128, plr_lite=False, n_frequencies=48, frequencies_init_scale=0.01, embedding_projection=True, batch_norm=False, layer_norm=False, layer_norm_eps=1e-05, activation=SELU(), cat_encoding='int', n_layers=4, n_heads=8, attn_dropout=0.2, ff_dropout=0.1, norm='LayerNorm', transformer_activation=ReGLU(), transformer_dim_feedforward=512, norm_first=True, bias=True, head_layer_sizes=<factory>, head_dropout=0.5, head_skip_layers=False, head_activation=SELU(), head_use_batch_norm=False, pooling_method='avg')[source]
Architecture-only configuration for TabTransformer models (DeepTab 2.0 API).
- Parameters:
d_model (int) – Dimensionality of embeddings or model representations.
activation (Callable) – Activation function for the transformer layers.
n_layers (int) – Number of layers in the transformer.
n_heads (int) – Number of attention heads in the transformer.
attn_dropout (float) – Dropout rate for the attention mechanism.
ff_dropout (float) – Dropout rate for the feed-forward layers.
norm (str) – Normalization method to be used.
transformer_activation (Callable) – Activation function for the transformer layers.
transformer_dim_feedforward (int) – Dimensionality of the feed-forward layers in the transformer.
norm_first (bool) – Whether to apply normalization before other operations in each transformer block.
bias (bool) – Whether to use bias in the linear layers.
head_layer_sizes (list) – Sizes of the layers in the model’s head.
head_dropout (float) – Dropout rate for the head layers.
head_skip_layers (bool) – Whether to skip layers in the head.
head_activation (Callable) – Activation function for the head layers.
head_use_batch_norm (bool) – Whether to use batch normalization in the head layers.
pooling_method (str) – Pooling method to be used (‘cls’, ‘avg’, etc.).
- get_params(deep=True)
Get parameters for this estimator.
- Parameters:
deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
params – Parameter names mapped to their values.
- Return type:
dict
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
**params (dict) – Estimator parameters.
- Returns:
self – Estimator instance.
- Return type:
estimator instance
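The difference between the ‘cls’ and ‘avg’ pooling methods can be shown on a toy sequence of token embeddings. This sketch assumes the CLS token sits at position 0 of the sequence, which is a common convention but is not confirmed by this page; pool_tokens is a hypothetical helper.

```python
def pool_tokens(tokens, pooling_method="avg"):
    """Reduce a sequence of token embeddings to a single vector."""
    if not tokens:
        raise ValueError("tokens must not be empty")
    if pooling_method == "cls":
        # Assumption: the CLS token is the first element of the sequence.
        return list(tokens[0])
    if pooling_method == "avg":
        n = len(tokens)
        return [sum(column) / n for column in zip(*tokens)]
    raise ValueError(f"Unknown pooling_method: {pooling_method!r}")


tokens = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
print(pool_tokens(tokens, "avg"))  # [3.0, 4.0]
print(pool_tokens(tokens, "cls"))  # [1.0, 2.0]
```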
- class deeptab.configs.TabulaRNNConfig(use_embeddings=False, embedding_activation=Identity(), embedding_type='linear', embedding_bias=False, layer_norm_after_embedding=False, d_model=128, plr_lite=False, n_frequencies=48, frequencies_init_scale=0.01, embedding_projection=True, batch_norm=False, layer_norm=False, layer_norm_eps=1e-05, activation=SELU(), cat_encoding='int', model_type='RNN', n_layers=4, rnn_dropout=0.2, norm='RMSNorm', residuals=False, norm_first=False, bias=True, rnn_activation='relu', dim_feedforward=256, d_conv=4, dilation=1, conv_bias=True, head_layer_sizes=<factory>, head_dropout=0.5, head_skip_layers=False, head_activation=SELU(), head_use_batch_norm=False, pooling_method='avg')[source]
Architecture-only configuration for TabulaRNN models (DeepTab 2.0 API).
- Parameters:
d_model (int) – Dimensionality of embeddings or model representations.
activation (Callable) – Activation function for the RNN layers.
model_type (str) – Type of model, one of “RNN”, “LSTM”, “GRU”, “mLSTM”, “sLSTM”.
n_layers (int) – Number of layers in the RNN.
rnn_dropout (float) – Dropout rate for the RNN layers.
norm (str) – Normalization method to be used.
residuals (bool) – Whether to include residual connections in the RNN.
norm_first (bool) – Whether to apply normalization before other operations in each block.
bias (bool) – Whether to use bias in the linear layers.
rnn_activation (str) – Activation function for the RNN layers.
dim_feedforward (int) – Size of the feedforward network.
d_conv (int) – Size of the convolutional layer for embedding features.
dilation (int) – Dilation factor for the convolution.
conv_bias (bool) – Whether to use bias in the convolutional layers.
head_layer_sizes (list) – Sizes of the layers in the head of the model.
head_dropout (float) – Dropout rate for the head layers.
head_skip_layers (bool) – Whether to skip layers in the head.
head_activation (Callable) – Activation function for the head layers.
head_use_batch_norm (bool) – Whether to use batch normalization in the head layers.
pooling_method (str) – Pooling method to be used (‘avg’, ‘cls’, etc.).
- get_params(deep=True)
Get parameters for this estimator.
- Parameters:
deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
params – Parameter names mapped to their values.
- Return type:
dict
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
**params (dict) – Estimator parameters.
- Returns:
self – Estimator instance.
- Return type:
estimator instance
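The d_conv and dilation parameters together determine how many positions a single convolution output covers. Assuming d_conv is the kernel size of a 1D convolution (an interpretation of the description above, not something this page states), the covered span follows the usual formula dilation * (kernel_size - 1) + 1. conv_span is a hypothetical helper.

```python
def conv_span(d_conv: int, dilation: int = 1) -> int:
    """Number of input positions covered by one dilated 1D convolution."""
    if d_conv < 1 or dilation < 1:
        raise ValueError("d_conv and dilation must be at least 1")
    return dilation * (d_conv - 1) + 1


print(conv_span(4, 1))  # 4, the TabulaRNNConfig defaults
print(conv_span(4, 2))  # 7
```

Raising dilation widens the span without adding weights, since the kernel still has d_conv taps.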
Experimental model configs
- class deeptab.configs.ModernNCAConfig(use_embeddings=False, embedding_activation=Identity(), embedding_type='plr', embedding_bias=False, layer_norm_after_embedding=False, d_model=32, plr_lite=True, n_frequencies=75, frequencies_init_scale=0.045, embedding_projection=True, batch_norm=False, layer_norm=False, layer_norm_eps=1e-05, activation=ReLU(), cat_encoding='int', dim=128, d_block=512, n_blocks=4, dropout=0.1, temperature=0.75, sample_rate=0.5, num_embeddings=None, head_layer_sizes=<factory>, head_dropout=0.5, head_skip_layers=False, head_activation=SELU(), head_use_batch_norm=False)[source]
Architecture-only configuration for ModernNCA models (DeepTab 2.0 API).
- Parameters:
embedding_type (str) – Type of feature embedding to use (e.g., ‘plr’, ‘ple’).
plr_lite (bool) – Whether to use the lightweight PLR embedding variant.
n_frequencies (int) – Number of random Fourier feature frequencies.
frequencies_init_scale (float) – Scale for initializing Fourier feature frequencies.
dim (int) – Embedding dimensionality per feature.
d_block (int) – Hidden size of each residual block.
n_blocks (int) – Number of residual blocks.
dropout (float) – Dropout rate applied inside each block.
temperature (float) – Temperature scaling for NCA softmax similarity.
sample_rate (float) – Fraction of training candidates used per forward pass.
num_embeddings (dict | None) – Optional dict mapping feature indices to embedding sizes.
head_layer_sizes (list) – Sizes of the fully connected layers in the prediction head.
head_dropout (float) – Dropout rate for the head layers.
head_skip_layers (bool) – Whether to use skip connections in the head layers.
head_activation (Callable) – Activation function for the head layers.
head_use_batch_norm (bool) – Whether to use batch normalization in the head layers.
- get_params(deep=True)
Get parameters for this estimator.
- Parameters:
deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
params – Parameter names mapped to their values.
- Return type:
dict
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
**params (dict) – Estimator parameters.
- Returns:
self – Estimator instance.
- Return type:
estimator instance
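The effect of temperature on an NCA-style softmax can be sketched as follows: distances to candidates are negated, divided by the temperature, and normalised into weights. That exact form is an assumption made for illustration and may differ from deeptab's ModernNCA code; nca_weights is a hypothetical helper.

```python
import math


def nca_weights(distances, temperature=0.75):
    """Softmax over negative distances, scaled by a temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    logits = [-d / temperature for d in distances]
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(logit - shift) for logit in logits]
    total = sum(exps)
    return [e / total for e in exps]


distances = [0.5, 1.0, 2.0]
default_weights = nca_weights(distances, temperature=0.75)
sharp_weights = nca_weights(distances, temperature=0.1)
```

A lower temperature concentrates more weight on the nearest candidate, while a higher one spreads it more evenly.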
- class deeptab.configs.TangosConfig(use_embeddings=False, embedding_activation=Identity(), embedding_type='linear', embedding_bias=False, layer_norm_after_embedding=False, d_model=32, plr_lite=False, n_frequencies=48, frequencies_init_scale=0.01, embedding_projection=True, batch_norm=False, layer_norm=False, layer_norm_eps=1e-05, activation=ReLU(), cat_encoding='int', layer_sizes=<factory>, skip_layers=False, dropout=0.2, use_glu=False, skip_connections=False, lamda1=0.5, lamda2=0.1, subsample=0.5)[source]
Architecture-only configuration for Tangos models (DeepTab 2.0 API).
- Parameters:
activation (Callable) – Activation function for the TANGOS layers.
layer_sizes (list) – Sizes of the layers in the TANGOS.
skip_layers (bool) – Whether to skip layers in the TANGOS.
dropout (float) – Dropout rate for regularization.
use_glu (bool) – Whether to use Gated Linear Units (GLU) in the TANGOS.
skip_connections (bool) – Whether to use skip connections in the TANGOS.
lamda1 (float) – Weight on the task-specific orthogonality regularisation term.
lamda2 (float) – Weight on the cross-task specialisation regularisation term.
subsample (float) – Fraction of features subsampled for regularisation estimation.
- get_params(deep=True)
Get parameters for this estimator.
- Parameters:
deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
params – Parameter names mapped to their values.
- Return type:
dict
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
**params (dict) – Estimator parameters.
- Returns:
self – Estimator instance.
- Return type:
estimator instance
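One way to read lamda1 and lamda2 is as weights in a combined training objective: the task loss plus two weighted regularisation terms. The additive form and the placeholder names reg_term_1 and reg_term_2 are assumptions for illustration; how deeptab computes the two terms is not described on this page.

```python
def tangos_objective(task_loss, reg_term_1, reg_term_2,
                     lamda1=0.5, lamda2=0.1):
    """Task loss plus two regularisation terms weighted by lamda1, lamda2."""
    return task_loss + lamda1 * reg_term_1 + lamda2 * reg_term_2


# Illustrative numbers only: 1.0 + 0.5 * 0.2 + 0.1 * 0.3
total = tangos_objective(task_loss=1.0, reg_term_1=0.2, reg_term_2=0.3)
no_reg = tangos_objective(1.0, 0.2, 0.3, lamda1=0.0, lamda2=0.0)
```

Setting both weights to zero recovers the plain task loss, which is a convenient baseline when tuning them.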
- class deeptab.configs.TromptConfig(use_embeddings=False, embedding_activation=Identity(), embedding_type='linear', embedding_bias=False, layer_norm_after_embedding=False, d_model=128, plr_lite=False, n_frequencies=48, frequencies_init_scale=0.01, embedding_projection=True, batch_norm=False, layer_norm=False, layer_norm_eps=1e-05, activation=ReLU(), cat_encoding='int', n_cycles=6, n_cells=4, P=128)[source]
Architecture-only configuration for Trompt models (DeepTab 2.0 API).
- Parameters:
d_model (int) – Dimensionality of the transformer model.
n_cycles (int) – Number of cycles in the Trompt model.
n_cells (int) – Number of cells in each cycle.
P (int) – Number of steps in the Trompt model.
- get_params(deep=True)
Get parameters for this estimator.
- Parameters:
deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
params – Parameter names mapped to their values.
- Return type:
dict
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
**params (dict) – Estimator parameters.
- Returns:
self – Estimator instance.
- Return type:
estimator instance