Using configs in deep learning
Introduction
Yesterday, I trained many variations of my model to
see which one gives the best result.
To do that, I used Hydra and OmegaConf.
OmegaConf is a powerful tool for managing your configs.
One of the best things about OmegaConf is that it supports
dataclasses.
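Here is a minimal sketch of that dataclass support (DBConfig is a made-up example, not one of my project configs):

from dataclasses import dataclass
from omegaconf import OmegaConf

@dataclass
class DBConfig:
    host: str = "localhost"
    port: int = 5432

cfg = OmegaConf.structured(DBConfig)
print(cfg.port)  # 5432
# cfg.port = "not-a-port"  # would raise a ValidationError: the int type is enforced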
Hydra is built on top of OmegaConf to manage the configs
of more complex systems.
With Hydra, you can easily override the config from the command line.
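For example, with the setup I show later in this post, any field can be overridden at launch time (train.py is a hypothetical script name):

python train.py max_epochs=100 patience=10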
Base config
I personally prefer to use dataclasses for my configs.
They are far easier to manage, and you can use Python code in them,
which makes defining parameters much easier.
To manage my configs better, I have made a base_config module with the
content below:
# base_config.py
from dataclasses import dataclass, field
from pathlib import Path

# Sensor channels fed to the model.
features_to_use = [
    "acc_x",
    "acc_y",
    "acc_z",
    "rot_w",
    "rot_x",
    "rot_y",
    "rot_z",
]

# Mapping from gesture label to class index.
label_category_to_int = {
    "Above ear - pull hair": 0,
    "Cheek - pinch skin": 1,
    "Drink from bottle/cup": 2,
    "Eyebrow - pull hair": 3,
    "Eyelash - pull hair": 4,
    "Feel around in tray and pull out an object": 5,
    "Forehead - pull hairline": 6,
    "Forehead - scratch": 7,
    "Glasses on/off": 8,
    "Neck - pinch skin": 9,
    "Neck - scratch": 10,
    "Pinch knee/leg skin": 11,
    "Pull air toward your face": 12,
    "Scratch knee/leg skin": 13,
    "Text on phone": 14,
    "Wave hello": 15,
    "Write name in air": 16,
    "Write name on leg": 17,
}


@dataclass
class ModelConfig:
    num_classes: int = 18
    input_channel: int = len(features_to_use)
    d_length: int = 103


@dataclass
class GaussianNoiseConfig:
    probability: float = 0.5
    noise_std: float = 0.01


@dataclass
class ShrinkOneSequenceConfig:
    probability: float = 0.2
    low: int = 70
    high: int = 90


@dataclass
class CropStartSequenceConfig:
    probability: float = 0.2
    low: int = 10
    high: int = 50


@dataclass
class StretchOneSequenceConfig:
    probability: float = 0.2
    low: int = 20
    high: int = 50


@dataclass
class TimeWarpConfig:
    probability: float = 0.2
    n_knots: int = 4


@dataclass
class BaseConfig:
    name: str = f"{Path(__file__).stem}"
    max_sequence_count: int = 103
    num_classes: int = 18
    features_to_use: list[str] = field(default_factory=lambda: features_to_use)
    features_to_check: list[str] = field(
        default_factory=lambda: [
            "acc_x",
            "acc_y",
            "acc_z",
            "rot_w",
            "rot_x",
            "rot_y",
            "rot_z",
        ]
    )
    target_label: str = "gesture"
    id_features: list[str] = field(
        default_factory=lambda: ["sequence_id", "sequence_counter"]
    )
    label_category_to_int: dict[str, int] = field(
        default_factory=lambda: label_category_to_int
    )
    model: ModelConfig = field(default_factory=ModelConfig)
    # Augmentation settings, grouped into their own dataclasses.
    gaussian_noise: GaussianNoiseConfig = field(default_factory=GaussianNoiseConfig)
    shrink_one_sequence: ShrinkOneSequenceConfig = field(
        default_factory=ShrinkOneSequenceConfig
    )
    crop_start_sequence: CropStartSequenceConfig = field(
        default_factory=CropStartSequenceConfig
    )
    stretch_one_sequence: StretchOneSequenceConfig = field(
        default_factory=StretchOneSequenceConfig
    )
    time_warp: TimeWarpConfig = field(default_factory=TimeWarpConfig)
    valid_split: float = 0.2
    max_epochs: int = 200
    patience: int = 6
    min_delta: float = 0.0
    use_tensorboard: bool = True
    fill_value: float = 0.0


def register_config():
    # Hydra is imported lazily so this module also works where Hydra
    # is not installed (e.g. Colab/Kaggle).
    from hydra.core.config_store import ConfigStore

    cs = ConfigStore.instance()
    cs.store(name="config", node=BaseConfig)
The config above consists of the important parameters for training my model.
It also has parameters for controlling the augmentations (GaussianNoiseConfig,
ShrinkOneSequenceConfig, CropStartSequenceConfig,
StretchOneSequenceConfig, TimeWarpConfig).
To manage it with Hydra, I have added a register_config() function.
The reason I put the hydra import inside register_config()
is that I use Hydra only for training my model and collecting logs
on my computer.
To run the final result on Google Colab or Kaggle, I don't need
Hydra.
OmegaConf is pre-installed on both platforms, so there is no need to
install Hydra there.
Make variations
If my model is different from the model
I defined in base_config, I can simply make a subclass of the
base config and change whatever I want.
For example:
# a1_tcnt_1_config.py
from dataclasses import dataclass, field
from pathlib import Path

from cmi_sensor_kaggle.configs.base_config import BaseConfig, ModelConfig

# Extended feature set for this model variant.
features_to_use = [
    "acc_x",
    "acc_y",
    "acc_z",
    "rot_w",
    "rot_x",
    "rot_y",
    "rot_z",
    "acc_gx",
    "acc_gy",
    "acc_gz_no_gravity",
    "acc_magnitude",
    "alpha",
    "beta",
    "gamma",
]

# One weight per gesture class (18 entries).
weight = [
    0.70977011,
    0.71088435,
    2.8126294,
    0.70977011,
    0.70755208,
    2.8126294,
    0.70755208,
    0.70755208,
    2.8126294,
    0.70755208,
    0.70755208,
    2.8126294,
    0.94933613,
    2.8126294,
    0.70755208,
    0.94735007,
    0.94933613,
    2.8126294,
]


@dataclass
class AttentionModelConfig(ModelConfig):
    d_model: int = 16
    n_head: int = 4
    num_layers: int = 2
    num_classes: int = 18
    input_dim: int = len(features_to_use)


@dataclass
class TransformTcnGlobalFeaturesConfig(BaseConfig):
    name: str = f"{Path(__file__).stem}"
    features_to_use: list[str] = field(default_factory=lambda: features_to_use)
    model: AttentionModelConfig = field(default_factory=AttentionModelConfig)
    weight: list[float] = field(default_factory=lambda: weight)
    patience: int = 10


def register_config():
    from hydra.core.config_store import ConfigStore

    cs = ConfigStore.instance()
    cs.store(name="config", node=TransformTcnGlobalFeaturesConfig)
In the code above, I import from base_config, then make
subclasses to apply my changes.
As you can see, in register_config() I use the new class
as my node.
Then, if I want to make another variation of the above config, I can do something like below:
# a1_tcnt_d_model_128.py
from dataclasses import dataclass, field

from .a1_tcnt_1_config import AttentionModelConfig, TransformTcnGlobalFeaturesConfig


@dataclass
class AttentionModelConfig2(AttentionModelConfig):
    d_model: int = 128


@dataclass
class TransformTcnGlobalFeaturesConfig2(TransformTcnGlobalFeaturesConfig):
    name: str = "a1_tcnt_d_model_128"
    model: AttentionModelConfig = field(default_factory=AttentionModelConfig2)


def register_config():
    from hydra.core.config_store import ConfigStore

    cs = ConfigStore.instance()
    cs.store(name="config", node=TransformTcnGlobalFeaturesConfig2)
In the code above, I just changed d_model to 128.
As you can see, managing config variations is now super simple.
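A quick sanity check (just a sketch) shows that the subclass picks up the override while inheriting everything else:

from omegaconf import OmegaConf

from cmi_sensor_kaggle.configs.tcnt_weight_global.a1_tcnt_d_model_128 import (
    TransformTcnGlobalFeaturesConfig2,
)

cfg = OmegaConf.structured(TransformTcnGlobalFeaturesConfig2)
assert cfg.model.d_model == 128  # overridden in AttentionModelConfig2
assert cfg.patience == 10  # still inherited from TransformTcnGlobalFeaturesConfig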
Use the configs
When I want to use the configs, I take one of two approaches. First, for training on my computer, I use them like below:
import hydra
from omegaconf import DictConfig, OmegaConf

from cmi_sensor_kaggle.configs.tcnt_weight_global.a1_tcnt_d_model_128 import (
    register_config,
)

register_config()


@hydra.main(
    version_base=None,
    config_name="config",
)
def main(cfg: DictConfig) -> None:
    # -------------------[ Load config ]-------------------
    print(OmegaConf.to_yaml(cfg))
    # rest of the code
    # ...


if __name__ == "__main__":
    main()
Second, when I want to run it in a notebook on Google Colab or Kaggle,
I use it like this:
from omegaconf import OmegaConf

from cmi_sensor_kaggle.configs.tcnt_weight_global.a1_tcnt_d_model_128 import (
    TransformTcnGlobalFeaturesConfig2,
)

# %%
cfg = OmegaConf.structured(TransformTcnGlobalFeaturesConfig2)
print(OmegaConf.to_yaml(cfg))
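If I also want to tweak values inside the notebook, OmegaConf can merge overrides written in the same dotted style that Hydra accepts on the command line (a small sketch continuing from the block above; the values are arbitrary):

overrides = OmegaConf.from_dotlist(["max_epochs=50", "model.d_model=64"])
cfg = OmegaConf.merge(cfg, overrides)
print(cfg.max_epochs)  # 50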
Final thoughts
OmegaConf and Hydra are great tools for managing config files.
They let you define a hierarchy of configs, which keeps the code
clean and cuts down on duplication.
I have really enjoyed using them, and they have made managing configs
much easier for me.
