Inconsistency in TensorDataset

len(val_ds.dataset) and len(val_ds) return different values. The same thing happens with the train split — I don't understand why.

If anyone can help — I couldn't figure this out from the PyTorch documentation.

From what I remember, the subsets produced by random_split keep a reference to the original dataset, from which the examples are then chosen by index.

So train_ds and val_ds hold the same underlying data (the .dataset field points to the same object), but they "return" different examples when asked for a batch, because each one only indexes into its own split.
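A minimal sketch of this (the 10-sample dataset and the 8/2 split are just illustrative numbers): random_split returns Subset objects, and each Subset stores the whole original dataset plus its own list of indices, so len(subset) and len(subset.dataset) differ.

```python
import torch
from torch.utils.data import TensorDataset, random_split

# A small illustrative dataset of 10 samples, split 8/2.
full_ds = TensorDataset(torch.arange(10).unsqueeze(1))
train_ds, val_ds = random_split(full_ds, [8, 2])

# Each Subset keeps a reference to the *entire* original dataset
# and only selects its own examples via .indices.
print(len(val_ds))           # 2  -- size of the validation split
print(len(val_ds.dataset))   # 10 -- size of the underlying full dataset
print(train_ds.dataset is val_ds.dataset)  # True -- one shared object
```

So len(val_ds.dataset) reports the full dataset's length, not the split's; use len(val_ds) (or len(val_ds.indices)) for the split size.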

I suggest looking at the source of this function. This behavior surprised me as well: I had defined different transforms for the train and validation sets, but when I changed one transform, the other changed too, because both subsets share the same underlying dataset.
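A sketch of that shared-transform pitfall (the SquareDataset class and its transform attribute are hypothetical, just to illustrate; TensorDataset itself has no transform field):

```python
from torch.utils.data import Dataset, random_split

class SquareDataset(Dataset):
    # Minimal illustrative dataset with a mutable transform attribute.
    def __init__(self, n, transform=None):
        self.data = list(range(n))
        self.transform = transform

    def __len__(self):
        return len(self.data)

    def __getitem__(self, i):
        x = self.data[i]
        return self.transform(x) if self.transform else x

ds = SquareDataset(4)
train_ds, val_ds = random_split(ds, [3, 1])

# Setting the transform through one subset changes it for BOTH,
# because train_ds.dataset and val_ds.dataset are the same object.
train_ds.dataset.transform = lambda x: x * 10
print(val_ds[0])  # already multiplied by 10, even though we "only" touched train
```

A common workaround is to create two separate dataset instances (one per transform) and split them with the same index lists, e.g. via Subset and a shared permutation.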
