RuntimeError: mat1 dim 1 must match mat2 dim 0

I have mostly just replicated all the classes and functions from Lesson 5 and run them on a new dataset (Intel Images from Kaggle); the images are 150x150. I got the error below, so I first tried starting with tt.Resize((36,36)) (make sure you use two brackets, as it is a tuple - see the small illustration after the transform code below. An hour of my life lost figuring that one out.)

history = [evaluate(model, valid_dl)]
history

/usr/local/lib/python3.6/dist-packages/torch/autograd/grad_mode.py in decorate_context(*args, **kwargs)
     24     def decorate_context(*args, **kwargs):
     25         with self.__class__():
---> 26             return func(*args, **kwargs)
     27     return cast(F, decorate_context)
     28

in evaluate(model, val_loader)
      2 def evaluate(model, val_loader):
      3     model.eval()
----> 4     outputs = [model.validation_step(batch) for batch in val_loader]
      5     return model.validation_epoch_end(outputs)
      6

in <listcomp>(.0)
      2 def evaluate(model, val_loader):
      3     model.eval()
----> 4     outputs = [model.validation_step(batch) for batch in val_loader]
      5     return model.validation_epoch_end(outputs)
      6

in validation_step(self, batch)
     12     def validation_step(self, batch):
     13         images, labels = batch
---> 14         out = self(images)                    # Generate predictions
     15         loss = F.cross_entropy(out, labels)   # Calculate loss
     16         acc = accuracy(out, labels)           # Calculate accuracy

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
    725             result = self._slow_forward(*input, **kwargs)
    726         else:
--> 727             result = self.forward(*input, **kwargs)
    728         for hook in itertools.chain(
    729                 _global_forward_hooks.values(),

in forward(self, xb)
     32         out = self.conv4(out)
     33         out = self.res2(out) + out
---> 34         out = self.classifier(out)
     35         return out

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
    725             result = self._slow_forward(*input, **kwargs)
    726         else:
--> 727             result = self.forward(*input, **kwargs)
    728         for hook in itertools.chain(
    729                 _global_forward_hooks.values(),

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/container.py in forward(self, input)
    115     def forward(self, input):
    116         for module in self:
--> 117             input = module(input)
    118         return input
    119

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
    725             result = self._slow_forward(*input, **kwargs)
    726         else:
--> 727             result = self.forward(*input, **kwargs)
    728         for hook in itertools.chain(
    729                 _global_forward_hooks.values(),

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/linear.py in forward(self, input)
     91 
     92     def forward(self, input: Tensor) -> Tensor:
---> 93         return F.linear(input, self.weight, self.bias)
     94 
     95     def extra_repr(self) -> str:

/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py in linear(input, weight, bias)
   1688     if input.dim() == 2 and bias is not None:
   1689         # fused op is marginally faster
-> 1690         ret = torch.addmm(bias, input, weight.t())
   1691     else:
   1692         output = input.matmul(weight.t())

RuntimeError: mat1 dim 1 must match mat2 dim 0

I ran the following,
for images, labels in train_dl:
    print(images.size())
    break

torch.Size([500, 3, 32, 32]), which is the same as the original from Lesson 5.

I have made no changes to the original code from Lesson 5 other than just trying to get the files loaded, and I added the Resize to the transforms:

train_tfms = tt.Compose([tt.Resize((32,32)),
                         tt.RandomCrop(32, padding=4, padding_mode='reflect'),
                         tt.RandomHorizontalFlip(),
                         tt.RandomRotation([-15, 15]),
                         tt.RandomResizedCrop(32),
                         tt.ColorJitter(brightness=0.1, contrast=0.1, saturation=0.1, hue=0.1),
                         tt.ToTensor(),
                         tt.Normalize(*stats,inplace=True)])
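
Just to illustrate the tuple point from above, a minimal sketch of the torchvision behaviour (the 200x150 image here is hypothetical, not from the dataset):

# Resize(32) only scales the *shorter* side to 32 and keeps the aspect
# ratio, while Resize((32, 32)) forces every image to exactly 32x32.
from PIL import Image
import torchvision.transforms as tt

img = Image.new('RGB', (200, 150))      # hypothetical non-square image
print(tt.Resize(32)(img).size)          # shorter side becomes 32, still not square
print(tt.Resize((32, 32))(img).size)    # (32, 32)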

-- Updated --
I commented out the following and it worked - trained to 82%:
tt.RandomCrop(32, padding=4, padding_mode='reflect'),
tt.RandomHorizontalFlip(),
tt.RandomRotation([-15, 15]),
tt.RandomResizedCrop(32),
tt.ColorJitter
Then I changed Resize to (150,150), as this is needed to clean up about 40 images that are not 150x150, and I am back to "RuntimeError: mat1 dim 1 must match mat2 dim 0".


@gdavey1168 Can you please share a link to the whole notebook? Are you using the CNN or the ResNet notebook of Lesson 5?

I am having the same problem as well; as long as I do not resize all images to 32x32, this error comes up. Any ideas?

This problem comes from inconsistent data in the dataset. Make sure all the images (or tensors) are of the same size; the exact size does not matter here, but you need the same tensor shape within each batch of data.

Use a padding technique to pad the tensors to a common size if needed.

I was not aware of this beforehand; I only understood it while doing my project. I chose to work on audio, and the audio was not ready to be fed into the model. I spent a good amount of time (6 days, LOL) processing the data.

You can check out my notebook to see how I handled the padding (I used pad_sequence to do the job).
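
For context, a minimal sketch of the pad_sequence idea (the waveform lengths below are made up, not taken from my notebook):

# Stack variable-length 1-D tensors into one batch with
# torch.nn.utils.rnn.pad_sequence, zero-padding the shorter ones
# to the length of the longest.
import torch
from torch.nn.utils.rnn import pad_sequence

waveforms = [torch.randn(16000), torch.randn(12000), torch.randn(20000)]
batch = pad_sequence(waveforms, batch_first=True)
print(batch.shape)  # torch.Size([3, 20000])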

Of course, my project is far from complete, but I think it can help and save some time (maybe).

hope it helps.

Hello Greg,

I faced the same issue when I sent an image bigger than 32x32 (the training size) through the model. My solution was to add an adaptive pooling layer at the beginning of the model's __init__ method. Now I can feed any image size to the model.

nn.AdaptiveAvgPool2d(output_size=32)

class CyrilliclettersModel(ImageClassificationBase):
    def __init__(self):
        super().__init__()
        self.network = nn.Sequential(
            nn.AdaptiveAvgPool2d(output_size=32),

            nn.Conv2d(1, 32, kernel_size=3, padding=1),
            .
            .
            .

Thanks Danieldownload, I think we found a similar hack; I just do it in the transform, as shown below.

train_tfms = tt.Compose([tt.Resize((32,32)),
                         #tt.RandomCrop(32, padding=4, padding_mode='reflect'), 
                         #tt.RandomHorizontalFlip(), 
                         #tt.RandomRotation([-15, 15]),
                         #tt.RandomResizedCrop(32), 
                         #tt.ColorJitter(brightness=0.1, contrast=0.1, saturation=0.1, hue=0.1),
                         tt.ToTensor(), 
                         tt.Normalize(*stats,inplace=True)])

However, I would like to be able to send larger, more detailed images to the initial convolution layer instead of tossing out 75% of the pixels. Is there a difference between the way Resize does it vs. AdaptiveAvgPool2d?

The trouble seems to be in the forward function.
in forward(self, xb)
     32         out = self.conv4(out)
     33         out = self.res2(out) + out
---> 34         out = self.classifier(out)
     35         return out


A bit of an update. I think the problem is in the ResNet9 layers. The input to the first layer is expected to be 3x32x32; the first convolution takes it to 64x32x32, then pooling brings it to 128x16x16, and eventually down to 512x1x1. However, if it starts at anything larger than 32, it will never flatten out to 512x1x1. So I added an additional convolution layer with MaxPool to get it back down to 512x1x1 at the end.
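
A rough back-of-the-envelope check of that reasoning. This is only a sketch: final_feature_size is a made-up helper, and it assumes each pool=True conv_block halves the spatial size and the classifier starts with MaxPool2d(4), as in the Lesson 5 ResNet9:

def final_feature_size(image_size, num_pools=3, final_pool=4):
    size = image_size
    for _ in range(num_pools):
        size //= 2              # each pool=True conv_block halves H and W
    return size // final_pool   # MaxPool2d(4) in the classifier

print(final_feature_size(32))   # 1 -> Flatten gives 512 features, matching Linear(512, ...)
print(final_feature_size(150))  # 4 -> Flatten gives 512*4*4 = 8192 features, hence the mat1/mat2 error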

class ResNet9(ImageClassificationBase):
    def __init__(self, in_channels, num_classes):
        super().__init__()
        # 3 x 64 x 64

        self.conv1 = conv_block(in_channels, 128)        #128 x 64 x 64
        self.conv2 = conv_block(128, 256, pool=True)     #256 x 32 x 32
        self.res1 = nn.Sequential(conv_block(256, 256),
                                  conv_block(256, 256))  #256 x 32 x 32

        self.conv3 = conv_block(in_channels, 256)         #256 x 32 x 32
        self.conv4 = conv_block(256, 256, pool=True)      #256 x 16 x 16
        self.res2 = nn.Sequential(conv_block(256, 256),
                                  conv_block(256, 256))   #256 x 16 x 16

        self.conv6 = conv_block(256, 256, pool=True)      #256 x 8 x 8
        self.conv7 = conv_block(256, 512, pool=True)      #512 x 4 x 4
        self.res3 = nn.Sequential(conv_block(512, 512),
                                  conv_block(512, 512))   #512 x 4 x 4

        self.classifier = nn.Sequential(nn.MaxPool2d(4),  # 512 x 1 x 1
                                        nn.Flatten(),     #512
                                        nn.Dropout(0.2),  #512
                                        nn.Linear(512, num_classes))  #6

    def forward(self, xb):
        out = self.conv1(xb)
        out = self.conv2(out)
        out = self.res1(out) + out
        out = self.conv3(xb)
        out = self.conv4(out)
        out = self.res2(out) + out
        out = self.conv6(out)
        out = self.conv7(out)
        out = self.res3(out) + out
        out = self.classifier(out)
        return out

Now when I run
history = [evaluate(model, valid_dl)]
I get a good output.
[{'val_acc': 0.19766665995121002, 'val_loss': 1.7888643741607666}]

When I try running
history += fit_one_cycle(epochs, max_lr, model, train_dl, valid_dl,
                         grad_clip=grad_clip,
                         weight_decay=weight_decay,
                         opt_func=opt_func)

RuntimeError: mat1 dim 1 must match mat2 dim 0

Oh Well… seemed like I was on the right track.


Awesome notebook - those damn dogs!!

It is the notebook from Lesson 5. Nothing changed except loading the dataset at the beginning and changing num_classes from 10 to 6. I have added a lot more examples of what I have attempted below. I still cannot quite figure out why it is breaking, other than that it seems to be in the classifier.
     35         out = self.res3(out) + out
---> 36         out = self.classifier(out)
     37         return out

I tried the resizing option before too, but it was not useful; hence the adaptive pooling…

Thanks, I will give your solution a try next.
Happy New Year!

Yeah, try it out. Btw, it took me 4 days to figure it out :grin:

4 Days! … I have been working on this since last year :wink:

Thanks for the help!

Whatever your input image's size is, just make sure that the input to MaxPool2d(4) is divisible by 4 (if the argument given is 4). It took me 3 days to figure that out.

The trick to the solution had two elements.
In the transform, I needed Resize(image_size) and CenterCrop(image_size) to get all the images to be the same size.

I played around with different numbers and even added an extra pass through conv_block, with limited success.

train_mean_std = tt.Compose([tt.Resize((image_size)),
                             tt.CenterCrop(image_size)

The next critical thing is in the ResNet9 class: make sure the 'out' channels of each convolution match the 'in' channels of the next, and that they all finish at 512 x 1 x 1 before the flatten. This was really easy to mess up. Note: if you change your input image_size from 64 to 128, you need to rework your layers so you still end up with 512 x 1 x 1 (a quick shape check is sketched after the class below).

class ResNet9(ImageClassificationBase):
    def __init__(self, in_channels, num_classes):
        super().__init__()
        # Start with 3 x 64 x 64
        self.conv1 = conv_block(in_channels, 128)        #128 x 64 x 64
        self.conv2 = conv_block(128, 256, pool=True)     #256 x 32 x 32
        self.res1 = nn.Sequential(conv_block(256, 256), 
                                  conv_block(256, 256))  #256 x 32 x 32
                      
        self.conv3 = conv_block(256, 256)                 #256 x 32 x 32
        self.conv4 = conv_block(256, 512, pool=True)      #512 x 16 x 16
        self.res2 = nn.Sequential(conv_block(512, 512), 
                                  conv_block(512, 512))   #512 x 16 x 16
        
        self.conv6 = conv_block(512, 512, pool=True)      #512 x 8 x 8
        self.conv7 = conv_block(512, 512, pool=True)      #512 x 4 x 4
        self.res3 = nn.Sequential(conv_block(512, 512), 
                                  conv_block(512, 512))   #512 x 4 x 4
        
        self.classifier = nn.Sequential(nn.MaxPool2d(4), # 512 x 1 x 1
                                        nn.Flatten(),     #512
                                        nn.Dropout(0.2),  #512
                                        nn.Linear(512, num_classes))  #6
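
Here is the quick shape check mentioned above. It is only a sketch: it assumes the Lesson 5 conv_block helper and the ResNet9 class above are already defined, and the 64x64 dummy image is hypothetical:

# Push one dummy 3 x 64 x 64 image through the conv stack and confirm
# the feature map is 512 x 4 x 4 before the classifier, so that
# MaxPool2d(4) + Flatten yields exactly 512 features for the Linear layer.
import torch

model = ResNet9(in_channels=3, num_classes=6)
dummy = torch.randn(1, 3, 64, 64)        # one fake 3 x 64 x 64 image

out = model.conv1(dummy)
out = model.conv2(out)
out = model.res1(out) + out
out = model.conv3(out)
out = model.conv4(out)
out = model.res2(out) + out
out = model.conv6(out)
out = model.conv7(out)
out = model.res3(out) + out
print(out.shape)                         # expect torch.Size([1, 512, 4, 4])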

I had this error too, when I defined RandomResizedCrop() in the train transforms only.

I added the same RandomResizedCrop() with the same integer value to the validation transforms as well, and it fixed my error.

train_tfms = transforms.Compose([transforms.RandomRotation(45),
                                  transforms.RandomResizedCrop(224), #if this param is used, it will need to be added in the validation dataset as well or it will throw an error in size
                                  transforms.RandomHorizontalFlip(),
                                  transforms.RandomCrop(224, padding=4, padding_mode='reflect'),
                                  transforms.ToTensor(),
                                  transforms.Normalize(*stats,inplace=True)])  

val_tfms = transforms.Compose([transforms.RandomResizedCrop(224), transforms.ToTensor(), transforms.Normalize(*stats,inplace=True)])

Also, check your layer names in the 'def forward' function code (if you're using the feed-forward model), especially if you've added nn.Linear layers.

I hope this helps.