modules.resnet.BlockPlain2d

Class · nn.Module · Source

layer = mdnc.modules.resnet.BlockPlain2d(
    in_planes, out_planes,
    kernel_size=3, stride=1, padding=1, output_size=None,
    normalizer='pinst', activator='prelu', layer_order='new', scaler='down'
)

In the following paper, the authors propose two structures of the residual block.

Deep Residual Learning for Image Recognition

This is the implementation of the plain (first-type) residual block. The residual block can be divided into two branches (input + conv). In this plain implementation, the convolutional branch is composed of two convolutional layers, as shown in the following chart:

flowchart TB
    in((" ")) --> conv1[Modern<br>convolution] --> conv2[Modern<br>convolution] --> plus(("+")):::diagramop --> out((" "))
    in --> plus
    classDef diagramop fill:#FFB11B, stroke:#AF811B;

If the channel number or the size of the output changes, a projection layer, implemented by a convolution with kernel_size=1, is required to map the input branch to the output space:

flowchart TB
    in((" ")) --> conv1[Modern<br>convolution] --> conv2[Modern<br>convolution] --> plus(("+")):::diagramop --> out((" "))
    in --> pconv[Projection<br>convolution] --> plus
    classDef diagramop fill:#FFB11B, stroke:#AF811B;
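To make the data flow in the two charts concrete, the following is a minimal PyTorch sketch (illustrative only, not the MDNC implementation; the real block composes mdnc.modules.conv.ConvModern2d sub-layers, whose normalization and activation are omitted here for brevity):

import torch.nn as nn

class PlainBlockSketch(nn.Module):
    # Hypothetical, simplified illustration of the plain residual block.
    def __init__(self, in_planes, out_planes, stride=1):
        super().__init__()
        # Convolutional branch: two convolutions (the "Modern convolution"
        # nodes in the charts, without their norm/activation sub-layers).
        self.conv1 = nn.Conv2d(in_planes, in_planes, 3, padding=1)
        self.conv2 = nn.Conv2d(in_planes, out_planes, 3, stride=stride, padding=1)
        # Input branch: identity, or a kernel_size=1 projection when the
        # channel number or the size of the output changes.
        if in_planes != out_planes or stride != 1:
            self.proj = nn.Conv2d(in_planes, out_planes, 1, stride=stride)
        else:
            self.proj = nn.Identity()

    def forward(self, x):
        return self.conv2(self.conv1(x)) + self.proj(x)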

In the following paper, a new composition order of the sub-layer operations is proposed for building the residual block:

Identity Mappings in Deep Residual Networks

This implementation, called "pre-activation", changes the order of the sub-layers in the modern convolutional layer (see mdnc.modules.conv.ConvModern2d). We support and recommend this implementation; set layer_order='new' to enable it.
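The two orders differ only in where the convolution sits relative to its normalization and activation. Assuming the default normalizer='pinst' and activator='prelu', a single sub-layer could be approximated as below (a rough sketch; see mdnc.modules.conv.ConvModern2d for the actual implementation):

import torch.nn as nn

def modern_conv_sketch(in_planes, out_planes, layer_order='new'):
    if layer_order == 'new':
        # Pre-activation: normalization + activation + convolution.
        return nn.Sequential(
            nn.InstanceNorm2d(in_planes, affine=True),   # 'pinst'
            nn.PReLU(num_parameters=in_planes),          # 'prelu'
            nn.Conv2d(in_planes, out_planes, 3, padding=1))
    else:
        # Post-activation: convolution + normalization + activation.
        return nn.Sequential(
            nn.Conv2d(in_planes, out_planes, 3, padding=1),
            nn.InstanceNorm2d(out_planes, affine=True),
            nn.PReLU(num_parameters=out_planes))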

Arguments

Requires

| Argument | Type | Description |
| :--- | :--- | :--- |
| in_planes | int | The channel number of the input data. |
| out_planes | int | The channel number of the output data. |
| kernel_size | int or (int, int) | The kernel size of this layer. |
| stride | int or (int, int) | The stride size of this layer. When scaler='down', this argument serves as the down-sampling factor. When scaler='up', it serves as the up-sampling factor. |
| padding | int or (int, int) | The padding size of this layer. Zero padding is performed on both edges of the input before the convolution. |
| output_size | int or (int, int) | The size of the output data. This option is only used when scaler='up'. When this value is set, the size of the up-sampling is given explicitly and the argument stride is not used. |
| normalizer | str | The normalization method, could be:<br>• 'batch': Batch normalization.<br>• 'inst': Instance normalization.<br>• 'pinst': Instance normalization with tunable rescaling parameters.<br>• 'null': No normalization; falls back to the "convolution + activation" form. In this case, layer_order='new' does not take effect. |
| activator | str | The activation method, could be: 'prelu', 'relu', 'null'. |
| layer_order | str | The sub-layer composition order, could be:<br>• 'new': normalization + activation + convolution.<br>• 'old': convolution + normalization + activation. |
| scaler | str | The scaling method, could be:<br>• 'down': the argument stride is used for down-sampling.<br>• 'up': the argument stride is used for up-sampling (equivalent to transposed convolution). |
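When scaler='down', the output size can be predicted with the standard strided-convolution formula (an assumption that BlockPlain2d follows torch.nn.Conv2d semantics, which matches Example 1 below):

def down_size(l_in, kernel_size=3, stride=2, padding=1):
    # floor((L + 2 * padding - kernel_size) / stride) + 1
    return (l_in + 2 * padding - kernel_size) // stride + 1

print(down_size(255))  # 128, the second-axis size in Example 1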

Operators

__call__

y = layer(x)

The forward operator, implemented by the forward() method. The input is a 2D tensor, and the output is the final output of this layer.

Requires

| Argument | Type | Description |
| :--- | :--- | :--- |
| x | torch.Tensor | A 2D tensor; the size should be (B, C, L1, L2), where B is the batch size, C is the input channel number, and (L1, L2) is the input data size. |

Returns

| Argument | Description |
| :--- | :--- |
| y | A 2D tensor; the size should be (B, C, L1, L2), where B is the batch size, C is the output channel number, and (L1, L2) is the output data size. |
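A minimal usage sketch of the forward operator (the expected shape here follows from the down-sampling formula above, under the same Conv2d-semantics assumption):

import torch
import mdnc

layer = mdnc.modules.resnet.BlockPlain2d(16, 32, stride=2, scaler='down')
x = torch.rand(4, 16, 64, 64)  # (B, C, L1, L2)
y = layer(x)
print(y.shape)  # expected: torch.Size([4, 32, 32, 32])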

Examples

In the first example, we build a plain residual block with ½ down-sampling on the second axis and same padding.

Example 1
import mdnc

layer = mdnc.modules.resnet.BlockPlain2d(16, 32, kernel_size=3, stride=(1, 2), padding=1, scaler='down')
mdnc.contribs.torchsummary.summary(layer, (16, 4, 255), device='cpu')
----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
    InstanceNorm2d-1           [-1, 16, 4, 255]              32
             PReLU-2           [-1, 16, 4, 255]              16
            Conv2d-3           [-1, 16, 4, 255]           2,304
    InstanceNorm2d-4           [-1, 16, 4, 255]              32
             PReLU-5           [-1, 16, 4, 255]              16
            Conv2d-6           [-1, 32, 4, 128]           4,608
            Conv2d-7           [-1, 32, 4, 128]             512
    InstanceNorm2d-8           [-1, 32, 4, 128]              64
      BlockPlain2d-9           [-1, 32, 4, 128]               0
================================================================
Total params: 7,584
Trainable params: 7,584
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.06
Forward/backward pass size (MB): 1.12
Params size (MB): 0.03
Estimated Total Size (MB): 1.21
----------------------------------------------------------------

Note that the output size is (4, 128) in this example, because same padding is used on both axes of the input and the stride (1, 2) only down-samples the second axis. In this case, if we want to build a reverse layer, we can specify output_size for the up-sampling layer, for example:

Example 2
import mdnc

layer = mdnc.modules.resnet.BlockPlain2d(32, 16, kernel_size=3, output_size=(4, 255), padding=1, scaler='up')
mdnc.contribs.torchsummary.summary(layer, (32, 4, 128), device='cpu')
----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
    InstanceNorm2d-1           [-1, 32, 4, 128]              64
             PReLU-2           [-1, 32, 4, 128]              32
            Conv2d-3           [-1, 32, 4, 128]           9,216
    InstanceNorm2d-4           [-1, 32, 4, 128]              64
             PReLU-5           [-1, 32, 4, 128]              32
          Upsample-6           [-1, 32, 4, 255]               0
            Conv2d-7           [-1, 16, 4, 255]           4,608
          Upsample-8           [-1, 32, 4, 255]               0
            Conv2d-9           [-1, 16, 4, 255]             512
   InstanceNorm2d-10           [-1, 16, 4, 255]              32
     BlockPlain2d-11           [-1, 16, 4, 255]               0
================================================================
Total params: 14,560
Trainable params: 14,560
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.06
Forward/backward pass size (MB): 1.62
Params size (MB): 0.06
Estimated Total Size (MB): 1.74
----------------------------------------------------------------
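As a quick check, the two examples above form a reverse pair, so chaining them should restore the input shape (a sketch under the shapes reported in the two summaries above):

import torch
import mdnc

down = mdnc.modules.resnet.BlockPlain2d(16, 32, kernel_size=3, stride=(1, 2), padding=1, scaler='down')
up = mdnc.modules.resnet.BlockPlain2d(32, 16, kernel_size=3, output_size=(4, 255), padding=1, scaler='up')
x = torch.rand(1, 16, 4, 255)
y = up(down(x))
print(y.shape)  # expected: torch.Size([1, 16, 4, 255])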

Last update: March 14, 2021
