RuntimeError: CUDA error: device-side assert triggered
Are you having this problem while doing semantic image segmentation with PyTorch? Here is the solution that worked for me.
The original article is available on my blog.
I am currently working on a project about whole-body semantic tumor segmentation in CT scans. Problems like this are never easy, so in this article I will talk about a famous error that you have probably faced when using PyTorch.
As you can see, the error message does not point to the exact source of the problem. In my case, I kept printing the shapes and values of the images and masks (labels) to find out where the error actually was.
I will show you where I found the errors in my project and how to solve them; hopefully the same fix will work for you too.
Where is the problem?
For me, the problem was in the labels, and chances are yours is too. Let me describe the problem I found, then we will fix it.
My problem was in the intensity values of the labels; they were wrong in two ways. The first error was that the program detected multiple classes instead of two (background and tumor). Each patient has a different number of tumors, and I am working on volumetric segmentation, which means one input is a set of slices (a NIfTI file with at least 300 slices), though that does not matter for this problem. Because each tumor was stored with a different intensity value, the program saw several distinct values and treated them as separate classes, instead of understanding that we only need to segment background versus tumor.
If you are using this kind of data, you should verify that the program is not treating the tumors as different classes. To check, plot a label that contains more than one tumor region; if you get something like the following figure, you have my first problem.
In this slice, the two tumors should have the same color, but instead they have two different colors, which means the program is seeing three classes (including the background).
Note that to see this difference you have to pass your label through the preprocessing function and then plot it. At first I tried plotting the mask before preprocessing, or with external software, and I could not see the difference.
Here you can see a mask with three tumors. If you plot it without preprocessing, all the tumors appear to have the same color (visually), but in reality they all have different intensity values, so they will be detected as different classes.
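Here is a minimal sketch of that check, using a toy mask instead of real patient data (the array and its values are invented for illustration): two tumor regions saved with different intensities show up in two different colors under a default colormap.

```python
import numpy as np
import matplotlib.pyplot as plt

# Toy slice (not real data): two "tumors" stored with different
# intensity values, as in my first problem.
mask = np.zeros((64, 64), dtype=np.int64)
mask[10:20, 10:20] = 1   # first tumor
mask[40:50, 40:50] = 2   # second tumor, different intensity

# With a default colormap the two tumors get two different colors,
# revealing the hidden third class.
plt.imshow(mask)
plt.colorbar()
plt.show()
```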
As I said, the main problem is the mask’s intensity values, which differ from one tumor to another.
You should also note that the mask’s values should be either 0 or 1: for binary semantic segmentation, no other values are needed.
To check the values of your mask, you can use the NumPy function ‘unique’, which returns the distinct values found in the mask; there you can spot the second error (if you have the same one I did).
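The check looks something like this (the toy array stands in for a real mask; loading it from a NIfTI file, e.g. with nibabel, is left as a comment):

```python
import numpy as np

# In practice you would load your label volume, for example:
# mask = nib.load("patient_mask.nii").get_fdata()   # requires nibabel
# Here, a toy stand-in with the kind of stray values I found:
mask = np.array([0, 0, 63, 63, 127, 255])

# np.unique lists every distinct value in the mask.
print(np.unique(mask))  # four values instead of the expected {0, 1}
```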
There are many ways to fix this problem, but the easiest is to convert the mask values to booleans, so the mask contains only ‘true’ or ‘false’ (false for the background, which has the value 0, and true for every value different from 0).
You only need to add this line before the training and the validation steps (if you have validation):
label = label > 0
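In context, a sketch of what that line does to a batch of labels (the tensor values here are hypothetical; note that most loss functions expect floats or longs, not booleans, so a cast afterwards is usually needed):

```python
import torch

# Hypothetical label batch with stray intensity values.
label = torch.tensor([[0, 63], [127, 255]])

# Binarize: background stays False (0), every non-zero value becomes True (1).
label = label > 0

# Cast back to float for the loss function.
label = label.float()
print(label)
# tensor([[0., 1.],
#         [1., 1.]])
```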
Otherwise, if you want to work with 0 and 1, the fix depends on which problem you have. If your only issue is a single non-zero intensity value greater than 1, you can divide the label values by their max. If you have the multi-class problem, I recommend fixing it outside the training loop, so that you do not slow down the training process.
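A sketch of that one-time offline fix (the helper name and toy values are mine; in practice you would run it once over your dataset and save the cleaned masks, e.g. back to NIfTI files):

```python
import numpy as np

def binarize_mask(mask: np.ndarray) -> np.ndarray:
    """Collapse every non-zero intensity into the single class 1.

    Covers both cases: a mask whose only non-zero value is greater
    than 1, and a mask where each tumor has its own intensity value.
    """
    return np.where(mask > 0, 1, 0).astype(np.uint8)

# Toy mask with three different tumor intensities.
mask = np.array([[0, 63, 0], [127, 0, 255]])
fixed = binarize_mask(mask)
print(np.unique(fixed))  # only 0 and 1 remain: exactly two classes
```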