Working with Time of Flight data

Introduction

Yesterday, I mostly focused on preparing Time of flight sensor data to give it to a deep model. The data that I’m working on belongs to a competition on Kaggle, and here is the link to it: CMI - Detect Behavior with Sensor Data They have used 5 Time of flight sensors. The output of each sensor is an 8x8 matrix, but they have stored it as a 64 array. So, I was trying to write a Preprocess module to make the data into an 8x8 matrix. Then, write a Transform to make all 5 sensors into a tensor like below:

[ Sequence_length, 5, 8, 8]

The reason that I wanted to make the preprocess module for turning data into 8x8 matrices was to being able to do other image related preprocesses on them. If I have written it in the Transform module, I might have lost some important preprocesses. But I should see.

Preprocess module

To change the Time of Flight data into to 2d format, I have written the module below:

import polars as pl
import polars.selectors as cs


class ChangeTOFTo2D:

    def __init__(
            self,
    ):
        pass

    @classmethod
    def from_config(
            cls,
            cfg: DictConfig,
    ) -> "ChangeTOFTo2D":
        return cls()

    def __call__(self, df: pl.DataFrame) -> pl.DataFrame:
        new_df = df.clone()
        df_tof = new_df.select(cs.starts_with("tof"))
        for i in range(1, 6):
            s_i = df_tof.select(cs.starts_with(f"tof_{i}_v"))
            s_i_n = s_i.to_numpy()
            s_i_n = s_i_n.reshape((-1, 8, 8))
            new_df = new_df.with_columns(pl.Series(f"tof_{i}_2d", s_i_n))
            new_df = new_df.drop(cs.starts_with(f"tof_{i}_v"))
        return new_df

As you can see, at first, I select all the Time of Flight data. Then I select each sensor’s data in a for loop, reshape them, and add them to a new Dataframe.

Transform module

To transform a Time of Flight data to a tensor, I have written the module below:


class TOFTransform:
    def __init__(
            self,
            features_to_use: list[str],
            max_sequence_count,
    ):
        self.features_to_use = features_to_use
        self.max_sequence_count = max_sequence_count

    @classmethod
    def from_config(cls, cfg: DictConfig) -> "TOFTransform":
        if "features_to_use" not in cfg:
            raise ValueError("features_to_use is required")

        if "max_sequence_count" not in cfg:
            raise ValueError("max_sequence_count is required")

        return cls(
            cfg.features_to_use,
            cfg.max_sequence_count,
        )

    def __call__(
            self,
            sequence: pl.DataFrame,
    ) -> tuple[torch.Tensor, torch.Tensor]:

        num_features = len(self.features_to_use)
        sequence_np = sequence.select(self.features_to_use).to_numpy()

        all_zeros = np.zeros((self.max_sequence_count, num_features, 8, 8))
        mask = np.zeros(self.max_sequence_count, dtype=bool)

        result_list = []
        for i in range(sequence_np.shape[0]):
            result_row = []
            for j in range(sequence_np.shape[1]):
                result_row.append(sequence_np[i][j])
            result_list.append(result_row)

        result_np = np.array(result_list)

        all_zeros[: result_np.shape[0]] = result_np
        mask[: result_np.shape[0]] = True

        data = torch.from_numpy(all_zeros).float()
        attention_mask = torch.from_numpy(mask).bool()

        return data, attention_mask

As you can see, at first, I create a sequence full of zeros. The reason for doing that is to zero-pad the sequences to have all the sequences in the same shape. When I convert the sequence to numpy, the result of that would be: [sequence_length, 5]. The 8x8 matrices are being seen as objects. But my desired output is: [sequence_length, 5, 8, 8]. So, to achieve that, I defined result_list, and with two for loops, I changed the objects to 8x8 arrays.

Final thoughts

Time of Flight sensor’s data can be seen as images. I’m trying to assess their importance in the CMI - Detect Behavior with Sensor Data competition on Kaggle. In my opinion, they can contain more important features than IMU, but I’m not sure until I give it a try.

© 2025 LiterallyTheOne