
modules.conv.ConvModern3d

Class · nn.Module · Source

layer = mdnc.modules.conv.ConvModern3d(
    in_planes, out_planes,
    kernel_size=3, stride=1, padding=1, output_size=None,
    normalizer='pinst', activator='prelu', layer_order='new', scaler='down'
)

The implementation of the 3D modern convolutional layer. It supports both down-sampling and up-sampling modes. The modern convolutional layer is a stack of convolution, normalization, and activation, as shown in the following chart:

```mermaid
flowchart TB
    conv[Convolution] --> norm[Normalization] --> actv[Activation]
```

The following paper proposes a new composition order for building residual blocks, which may improve performance:

Identity Mappings in Deep Residual Networks

The basic idea of this method is shown in the following diagram:

```mermaid
flowchart TB
    actv[Activation]  --> norm[Normalization] --> conv[Convolution]
```

This idea is called "pre-activation" in some works, and this implementation supports it as well. Setting the argument layer_order='new' builds the layer with the "pre-activation" order.
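The two orders can be sketched with plain PyTorch building blocks (a minimal illustration only, not the mdnc implementation; the channel counts and spatial sizes are arbitrary):

```python
import torch
import torch.nn as nn

in_planes, out_planes = 16, 32

# 'old' order: convolution -> normalization -> activation
old_order = nn.Sequential(
    nn.Conv3d(in_planes, out_planes, kernel_size=3, padding=1),
    nn.InstanceNorm3d(out_planes, affine=True),  # roughly 'pinst'
    nn.PReLU(out_planes),
)

# 'new' (pre-activation) order: normalization -> activation -> convolution
new_order = nn.Sequential(
    nn.InstanceNorm3d(in_planes, affine=True),
    nn.PReLU(in_planes),
    nn.Conv3d(in_planes, out_planes, kernel_size=3, padding=1),
)

x = torch.randn(1, in_planes, 8, 8, 8)
print(old_order(x).shape)  # torch.Size([1, 32, 8, 8, 8])
print(new_order(x).shape)  # torch.Size([1, 32, 8, 8, 8])
```

Both stacks keep the spatial size unchanged here (kernel 3, padding 1, stride 1); only the position of the convolution relative to the normalization and activation differs.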

Arguments

Requires

| Argument | Type | Description |
| :------- | :--- | :---------- |
| `in_planes` | `int` | The channel number of the input data. |
| `out_planes` | `int` | The channel number of the output data. |
| `kernel_size` | `int` or `(int, int, int)` | The kernel size of this layer. |
| `stride` | `int` or `(int, int, int)` | The stride size of this layer. When `scaler='down'`, this argument serves as the down-sampling factor. When `scaler='up'`, it serves as the up-sampling factor. |
| `padding` | `int` or `(int, int, int)` | The padding size of this layer. Zero padding is applied to both edges of the input before the convolution. |
| `output_size` | `int` or `(int, int, int)` | The size of the output data. This option is only used when `scaler='up'`. When this value is set, the up-sampled size is given explicitly and the argument `stride` is not used. |
| `normalizer` | `str` | The normalization method, could be:<br>• `'batch'`: Batch normalization.<br>• `'inst'`: Instance normalization.<br>• `'pinst'`: Instance normalization with tunable rescaling parameters.<br>• `'null'`: No normalization; falls back to the "convolution + activation" form. In this case, `layer_order='new'` would not take effect. |
| `activator` | `str` | The activation method, could be: `'prelu'`, `'relu'`, `'null'`. |
| `layer_order` | `str` | The sub-layer composition order, could be:<br>• `'new'`: normalization + activation + convolution.<br>• `'old'`: convolution + normalization + activation. |
| `scaler` | `str` | The scaling method, could be:<br>• `'down'`: the argument `stride` is used for down-sampling.<br>• `'up'`: the argument `stride` is used for up-sampling (equivalent to a transposed convolution). |
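As a rough sketch of how `stride` drives the down-sampling, the standard convolution output-size formula `L_out = floor((L_in + 2·padding - kernel) / stride) + 1` can be evaluated per axis (plain arithmetic only; the exact padding behavior of mdnc may differ):

```python
def conv_out_size(length, kernel_size, stride=1, padding=0):
    """Standard output length of a convolution along one axis."""
    return (length + 2 * padding - kernel_size) // stride + 1

# Per-axis sizes for kernel (3, 1, 3), stride (2, 1, 2), padding (1, 0, 1),
# applied to an input of spatial size (32, 4, 63):
sizes = [conv_out_size(l, k, s, p)
         for l, k, s, p in zip((32, 4, 63), (3, 1, 3), (2, 1, 2), (1, 0, 1))]
print(sizes)  # [16, 4, 32]
```

With this configuration the stride of 2 halves the first and last axes while the middle axis is left untouched, matching Example 1 below.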

Operators

__call__

y = layer(x)

The forward operator, implemented by the forward() method. The input is a tensor of 3D data (5 dimensions in total), and the output is the final output of this layer.

Requires

| Argument | Type | Description |
| :------- | :--- | :---------- |
| `x` | `torch.Tensor` | A tensor of 3D data, with size `(B, C, L1, L2, L3)`, where `B` is the batch size, `C` is the input channel number, and `(L1, L2, L3)` is the input data size. |

Returns

| Argument | Description |
| :------- | :---------- |
| `y` | A tensor of 3D data, with size `(B, C, L1, L2, L3)`, where `B` is the batch size, `C` is the output channel number, and `(L1, L2, L3)` is the output data size. |

Examples

In the first example, we build a modern convolutional layer with ½ down-sampling and same padding.

Example 1
```python
import mdnc

layer = mdnc.modules.conv.ConvModern3d(16, 32, kernel_size=(3, 1, 3), stride=(2, 1, 2), padding=(1, 0, 1), scaler='down')
mdnc.contribs.torchsummary.summary(layer, (16, 32, 4, 63), device='cpu')
```

```
----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
    InstanceNorm3d-1        [-1, 16, 32, 4, 63]              32
             PReLU-2        [-1, 16, 32, 4, 63]              16
            Conv3d-3        [-1, 32, 16, 4, 32]           4,608
      ConvModern3d-4        [-1, 32, 16, 4, 32]               0
================================================================
Total params: 4,656
Trainable params: 4,656
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.49
Forward/backward pass size (MB): 2.97
Params size (MB): 0.02
Estimated Total Size (MB): 3.48
----------------------------------------------------------------
```

Note that the output size is (16, 4, 32) in this example, because "same" padding is used on all three axes of the input. In this case, if we want to build a reverse layer, we can specify output_size for the up-sampling layer, for example:

Example 2
```python
import mdnc

layer = mdnc.modules.conv.ConvModern3d(32, 16, kernel_size=(3, 1, 3), output_size=(32, 4, 63), padding=(1, 0, 1), scaler='up')
mdnc.contribs.torchsummary.summary(layer, (32, 16, 4, 32), device='cpu')
```

```
----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
    InstanceNorm3d-1        [-1, 32, 16, 4, 32]              64
             PReLU-2        [-1, 32, 16, 4, 32]              32
          Upsample-3        [-1, 32, 32, 4, 63]               0
            Conv3d-4        [-1, 16, 32, 4, 63]           4,608
      ConvModern3d-5        [-1, 16, 32, 4, 63]               0
================================================================
Total params: 4,704
Trainable params: 4,704
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.25
Forward/backward pass size (MB): 4.94
Params size (MB): 0.02
Estimated Total Size (MB): 5.21
----------------------------------------------------------------
```
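The need for output_size in Example 2 can be checked with plain arithmetic: stride-based up-sampling multiplies each axis by the stride, which cannot recover an odd length such as 63 (a minimal sketch, independent of mdnc):

```python
def upsample_by_stride(length, stride):
    """Length after naive stride-based up-sampling (scale-factor interpolation)."""
    return length * stride

# Trying to reverse the down-sampling (32, 4, 63) -> (16, 4, 32)
# with the same strides (2, 1, 2):
recovered = [upsample_by_stride(l, s) for l, s in zip((16, 4, 32), (2, 1, 2))]
print(recovered)  # [32, 4, 64] -- the last axis overshoots 63
```

Specifying output_size=(32, 4, 63) sidesteps this ambiguity by naming the target size explicitly, which is why the Upsample sub-layer in the summary above restores the last axis to 63 rather than 64.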

Last update: March 14, 2021
