modules.resnet.BlockPlain3d¶
layer = mdnc.modules.resnet.BlockPlain3d(
in_planes, out_planes,
kernel_size=3, stride=1, padding=1, output_size=None,
normalizer='pinst', activator='prelu', layer_order='new', scaler='down'
)
In the following paper, the authors propose two structures of the residual block:
Deep Residual Learning for Image Recognition
This is the implementation of the plain (first-type) residual block. The residual block could be divided into two branches (input + conv). In this plain implementation, the convolutional branch is composed of two convolutional layers, as shown in the following chart:
flowchart TB
in((" ")) --> conv1[Modern<br>convolution] --> conv2[Modern<br>convolution] --> plus(("+")):::diagramop --> out((" "))
in --> plus
classDef diagramop fill:#FFB11B, stroke:#AF811B;
If the channel number or the size of the output changes, a projection layer, implemented by a convolution with kernel_size=1, is required for mapping the input branch to the output space:
flowchart TB
in((" ")) --> conv1[Modern<br>convolution] --> conv2[Modern<br>convolution] --> plus(("+")):::diagramop --> out((" "))
in --> pconv[Projection<br>convolution] --> plus
classDef diagramop fill:#FFB11B, stroke:#AF811B;
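For intuition, the following sketch shows how the two branches are combined, including the kernel_size=1 projection when the channel number or the spatial size changes. This is only a simplified illustration, not the MDNC implementation: the normalization and activation sub-layers of the modern convolutions are omitted here, and the class name is hypothetical.
import torch
import torch.nn as nn

class PlainBlockSketch(nn.Module):
    def __init__(self, in_planes, out_planes, stride=1):
        super().__init__()
        # Convolutional branch: two convolutional layers; the second one
        # changes the channel number and performs the scaling.
        self.conv1 = nn.Conv3d(in_planes, in_planes, 3, padding=1)
        self.conv2 = nn.Conv3d(in_planes, out_planes, 3, stride=stride, padding=1)
        # Input branch: identity, or a kernel_size=1 projection when the
        # channel number or the spatial size changes.
        if in_planes != out_planes or stride != 1:
            self.proj = nn.Conv3d(in_planes, out_planes, 1, stride=stride)
        else:
            self.proj = nn.Identity()

    def forward(self, x):
        # Residual connection: conv branch + (projected) input branch.
        return self.conv2(self.conv1(x)) + self.proj(x)

y = PlainBlockSketch(16, 32, stride=2)(torch.randn(1, 16, 8, 8, 8))
print(y.shape)  # torch.Size([1, 32, 4, 4, 4])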
In the following paper, a new sub-layer composition order is proposed for building the residual block:
Identity Mappings in Deep Residual Networks
This implementation, called "pre-activation", changes the order of the sub-layers in the modern convolutional layer (see mdnc.modules.conv.ConvModern3d). We support and recommend this implementation; set layer_order='new'
to enable it.
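Roughly speaking, the two sub-layer orders of one modern convolutional layer would look like the sketch below. This is a simplified illustration with instance normalization and PReLU (matching the defaults), not the ConvModern3d source.
import torch.nn as nn

# Post-activation order (the original residual-network design):
# convolution first, then normalization and activation.
post_act = nn.Sequential(
    nn.Conv3d(16, 16, 3, padding=1),
    nn.InstanceNorm3d(16, affine=True),
    nn.PReLU(16),
)

# Pre-activation order (enabled by layer_order='new'):
# normalization and activation are applied before the convolution.
pre_act = nn.Sequential(
    nn.InstanceNorm3d(16, affine=True),
    nn.PReLU(16),
    nn.Conv3d(16, 16, 3, padding=1),
)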
Arguments¶
Requires
Argument | Type | Description |
---|---|---|
in_planes | int | The channel number of the input data. |
out_planes | int | The channel number of the output data. |
kernel_size | int or (int, int, int) | The kernel size of this layer. |
stride | int or (int, int, int) | The stride size of this layer. When scaler='down', this argument serves as the down-sampling factor. When scaler='up', this argument serves as the up-sampling factor. |
padding | int or (int, int, int) | The padding size of this layer. The zero padding would be performed on both edges of the input before the convolution. |
output_size | int or (int, int, int) | The size of the output data. This option is only used when scaler='up'. When setting this value, the size of the up-sampling would be given explicitly and the argument stride would not be used. |
normalizer | str | The normalization method (default: 'pinst'). |
activator | str | The activation method, could be: 'prelu', 'relu', 'null'. |
layer_order | str | The sub-layer composition order (default: 'new'). Set 'new' to enable the pre-activation design described above. |
scaler | str | The scaling method, could be: 'down' (down-sampling), 'up' (up-sampling). |
Operators¶
__call__¶
y = layer(x)
The forward operator implemented by the forward()
method. The input is a 5D tensor representing 3D data, and the output is the final output of this layer.
Requires
Argument | Type | Description |
---|---|---|
x | torch.Tensor | A 5D tensor; the size should be (B, C, L1, L2, L3), where B is the batch size, C is the input channel number, and (L1, L2, L3) is the input data size. |
Returns
Argument | Description |
---|---|
y | A 5D tensor; the size should be (B, C, L1, L2, L3), where B is the batch size, C is the output channel number, and (L1, L2, L3) is the output data size. |
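For instance, the layer could be applied like this (the shapes here are chosen arbitrarily for illustration):
import torch
import mdnc

layer = mdnc.modules.resnet.BlockPlain3d(16, 32, kernel_size=3, stride=2, padding=1)
x = torch.randn(4, 16, 32, 32, 32)  # (B, C, L1, L2, L3)
y = layer(x)
print(y.shape)  # torch.Size([4, 32, 16, 16, 16]), i.e. ½ down-sampling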
Examples¶
In the first example, we build a plain residual block with ½ down-sampling and same padding.
Example 1
import mdnc

# Arguments reconstructed from the summary report below (per-axis kernel, stride and padding).
layer = mdnc.modules.resnet.BlockPlain3d(16, 32, kernel_size=(3, 1, 3), stride=(2, 1, 2), padding=(1, 0, 1))
mdnc.contribs.torchsummary.summary(layer, (16, 32, 4, 63), device='cpu')
----------------------------------------------------------------
Layer (type) Output Shape Param #
================================================================
InstanceNorm3d-1 [-1, 16, 32, 4, 63] 32
PReLU-2 [-1, 16, 32, 4, 63] 16
Conv3d-3 [-1, 16, 32, 4, 63] 2,304
InstanceNorm3d-4 [-1, 16, 32, 4, 63] 32
PReLU-5 [-1, 16, 32, 4, 63] 16
Conv3d-6 [-1, 32, 16, 4, 32] 4,608
Conv3d-7 [-1, 32, 16, 4, 32] 512
InstanceNorm3d-8 [-1, 32, 16, 4, 32] 64
BlockPlain3d-9 [-1, 32, 16, 4, 32] 0
================================================================
Total params: 7,584
Trainable params: 7,584
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.49
Forward/backward pass size (MB): 6.92
Params size (MB): 0.03
Estimated Total Size (MB): 7.44
----------------------------------------------------------------
Note that the output size would be (16, 4, 32)
in this example, because same padding is used along all three axes of the input. In this case, if we want to build a reverse layer, we could specify the output_size
for the up-sampling layer, for example:
Example 2
import mdnc

# Arguments reconstructed from the summary report below; output_size gives the up-sampled size explicitly.
layer = mdnc.modules.resnet.BlockPlain3d(32, 16, kernel_size=(3, 1, 3), padding=(1, 0, 1), output_size=(32, 4, 63), scaler='up')
mdnc.contribs.torchsummary.summary(layer, (32, 16, 4, 32), device='cpu')
----------------------------------------------------------------
Layer (type) Output Shape Param #
================================================================
InstanceNorm3d-1 [-1, 32, 16, 4, 32] 64
PReLU-2 [-1, 32, 16, 4, 32] 32
Conv3d-3 [-1, 32, 16, 4, 32] 9,216
InstanceNorm3d-4 [-1, 32, 16, 4, 32] 64
PReLU-5 [-1, 32, 16, 4, 32] 32
Upsample-6 [-1, 32, 32, 4, 63] 0
Conv3d-7 [-1, 16, 32, 4, 63] 4,608
Upsample-8 [-1, 32, 32, 4, 63] 0
Conv3d-9 [-1, 16, 32, 4, 63] 512
InstanceNorm3d-10 [-1, 16, 32, 4, 63] 32
BlockPlain3d-11 [-1, 16, 32, 4, 63] 0
================================================================
Total params: 14,560
Trainable params: 14,560
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.25
Forward/backward pass size (MB): 10.38
Params size (MB): 0.06
Estimated Total Size (MB): 10.68
----------------------------------------------------------------
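As a quick sanity check, chaining the two blocks reconstructed in the examples above shows that the up-sampling layer with output_size restores the original data size. This sketch assumes the same reconstructed arguments as Examples 1 and 2.
import torch
import mdnc

down = mdnc.modules.resnet.BlockPlain3d(16, 32, kernel_size=(3, 1, 3), stride=(2, 1, 2), padding=(1, 0, 1), scaler='down')
up = mdnc.modules.resnet.BlockPlain3d(32, 16, kernel_size=(3, 1, 3), padding=(1, 0, 1), output_size=(32, 4, 63), scaler='up')

x = torch.randn(1, 16, 32, 4, 63)
print(up(down(x)).shape)  # torch.Size([1, 16, 32, 4, 63])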