modules.resnet.BlockBottleneck1d
layer = mdnc.modules.resnet.BlockBottleneck1d(
in_planes, out_planes,
kernel_size=3, stride=1, padding=1, output_size=None,
normalizer='pinst', activator='prelu', layer_order='new', scaler='down'
)
In the following paper, the authors propose two structures of the residual block:
Deep Residual Learning for Image Recognition
This is the implementation of the bottleneck (second-type) residual block. The residual block can be divided into two branches (input + conv). In this implementation, the convolutional branch is composed of three convolutional layers, as shown in the following chart:
flowchart TB
in((" ")) --> conv1[Projection<br>convolution] --> conv2[Modern<br>convolution] --> conv3[Projection<br>convolution] --> plus(("+")):::diagramop --> out((" "))
in --> plus
classDef diagramop fill:#FFB11B, stroke:#AF811B;
where the projection convolutional layer is implemented by a convolution with kernel_size=1. Compared to the plain block, this implementation requires fewer parameters but provides a deeper stack.
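The parameter saving can be illustrated with a small, dependency-free sketch that counts only the convolution weights. It assumes bias-free convolutions and a hidden width equal to in_planes (which is what the summary in Example 1 suggests). The layout used for the plain block, two full-kernel convolutions, is a hypothetical baseline for comparison and is not taken from the library.

```python
def conv1d_weights(in_ch, out_ch, kernel_size):
    """Number of weights in a bias-free 1D convolution."""
    return in_ch * out_ch * kernel_size


def bottleneck_conv_weights(in_planes, out_planes, kernel_size=3):
    """Weights of the conv branch: 1x1 projection, k-sized conv, 1x1 projection."""
    hidden = in_planes  # assumption: the hidden width equals in_planes
    return (
        conv1d_weights(in_planes, hidden, 1)
        + conv1d_weights(hidden, hidden, kernel_size)
        + conv1d_weights(hidden, out_planes, 1)
    )


def plain_conv_weights(in_planes, out_planes, kernel_size=3):
    """Hypothetical plain block: two k-sized convolutions (in -> out, out -> out)."""
    return (
        conv1d_weights(in_planes, out_planes, kernel_size)
        + conv1d_weights(out_planes, out_planes, kernel_size)
    )


print(bottleneck_conv_weights(16, 32))  # 256 + 768 + 512 = 1536
print(plain_conv_weights(16, 32))       # 1536 + 3072 = 4608
```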
If the number of output channels or the output length differs from that of the input, a projection layer is required to map the input branch to the output space:
flowchart TB
in((" ")) --> conv1[Projection<br>convolution] --> conv2[Modern<br>convolution] --> conv3[Projection<br>convolution] --> plus(("+")):::diagramop --> out((" "))
in --> pconv[Projection<br>convolution] --> plus
classDef diagramop fill:#FFB11B, stroke:#AF811B;
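The condition for adding the projection on the input branch can be sketched as a simple predicate. This is only an illustration of the rule stated above: the function name is hypothetical, and treating any stride other than 1 (or any explicit output_size) as a change of length is an assumption, not the library's actual check.

```python
def needs_shortcut_projection(in_planes, out_planes, stride=1, output_size=None):
    """Return True when the input branch cannot be added to the conv branch as-is."""
    channels_change = in_planes != out_planes
    # Assumption: a non-unit stride or an explicit output_size changes the length.
    length_changes = stride != 1 or output_size is not None
    return channels_change or length_changes


print(needs_shortcut_projection(16, 16))            # False: identity shortcut
print(needs_shortcut_projection(16, 32, stride=2))  # True: channels and length change
```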
In the following paper, a new composition order of the operations is proposed for building residual blocks:
Identity Mappings in Deep Residual Networks
This implementation, called "pre-activation", changes the order of the sub-layers in the modern convolutional layer (see mdnc.modules.conv.ConvModern1d). We support and recommend this implementation; set layer_order='new' to enable it.
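The difference between the two orders can be sketched by listing the operations of the convolutional branch. For layer_order='new', the order normalization, activation, convolution matches the layer summaries in the examples on this page. The other order follows the original residual-network paper in simplified form (that paper applies the last activation after the addition); mapping every value other than 'new' to it is an assumption made for this illustration.

```python
PRE_ACTIVATION = ("norm", "act", "conv")   # layer_order='new'
POST_ACTIVATION = ("conv", "norm", "act")  # original order (simplified, assumed)


def conv_branch(layer_order="new", stages=3):
    """List the operations of the conv branch, one group per convolution stage."""
    order = PRE_ACTIVATION if layer_order == "new" else POST_ACTIVATION
    return [f"{op}{i}" for i in range(1, stages + 1) for op in order]


print(conv_branch("new"))
# ['norm1', 'act1', 'conv1', 'norm2', 'act2', 'conv2', 'norm3', 'act3', 'conv3']
```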
Arguments
Requires
Argument | Type | Description |
---|---|---|
in_planes | int | The channel number of the input data. |
out_planes | int | The channel number of the output data. |
kernel_size | int | The kernel size of this layer. |
stride | int | The stride size of this layer. When scaler='down', this argument serves as the down-sampling factor. When scaler='up', this argument serves as the up-sampling factor. |
padding | int | The padding size of this layer. Zero padding is applied to both edges of the input before the convolution. |
output_size | int | The length of the output data. This option is only used when scaler='up'. When this value is set, the up-sampled size is given explicitly and the argument stride is not used. |
normalizer | str | The normalization method (default: 'pinst'). |
activator | str | The activation method, could be: 'prelu', 'relu', 'null'. |
layer_order | str | The sub-layer composition order (default: 'new', the pre-activation order). |
scaler | str | The scaling method, for example 'down' (down-sampling, the default) or 'up' (up-sampling). |
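How stride, padding and output_size determine the output length can be sketched with two small helpers. The down-sampling formula is the standard length formula of a strided convolution. The up-sampling helper assumes that the length is multiplied by stride unless output_size is given, and that the following convolution preserves the length; both helper names are hypothetical.

```python
def down_length(length, kernel_size=3, stride=1, padding=1):
    """Output length of a strided 1D convolution with zero padding on both edges."""
    return (length + 2 * padding - kernel_size) // stride + 1


def up_length(length, stride=1, output_size=None):
    """Output length after up-sampling; output_size overrides stride (assumed rule)."""
    return output_size if output_size is not None else length * stride


print(down_length(255, stride=2))                 # 128
print(up_length(128, stride=2))                   # 256, does not restore 255
print(up_length(128, stride=2, output_size=255))  # 255
```

The last two calls show why output_size is useful: an input of odd length cannot be recovered by the up-sampling factor alone.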
Operators
__call__
y = layer(x)
The forward operator implemented by the forward() method. The input is a 1D tensor, and the output is the final output of this layer.
Requires
Argument | Type | Description |
---|---|---|
x | torch.Tensor | A 1D tensor, the size should be (B, C, L), where B is the batch size, C is the input channel number, and L is the input data length. |
Returns
Argument | Description |
---|---|
y | A 1D tensor, the size should be (B, C, L), where B is the batch size, C is the output channel number, and L is the output data length. |
Examples
In the first example, we build a bottleneck residual block with ½ down-sampling and same padding.
Example 1
----------------------------------------------------------------
Layer (type) Output Shape Param #
================================================================
InstanceNorm1d-1 [-1, 16, 255] 32
PReLU-2 [-1, 16, 255] 16
Conv1d-3 [-1, 16, 255] 256
InstanceNorm1d-4 [-1, 16, 255] 32
PReLU-5 [-1, 16, 255] 16
Conv1d-6 [-1, 16, 128] 768
InstanceNorm1d-7 [-1, 16, 128] 32
PReLU-8 [-1, 16, 128] 16
Conv1d-9 [-1, 32, 128] 512
Conv1d-10 [-1, 32, 128] 512
InstanceNorm1d-11 [-1, 32, 128] 64
BlockBottleneck1d-12 [-1, 32, 128] 0
================================================================
Total params: 2,256
Trainable params: 2,256
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.02
Forward/backward pass size (MB): 0.33
Params size (MB): 0.01
Estimated Total Size (MB): 0.35
----------------------------------------------------------------
Note that the output length is 128 in this example, because same padding is applied to the input (of length 255) before the stride-2 convolution. To build the reverse layer in this case, we can specify output_size for the up-sampling layer, for example:
Example 2
----------------------------------------------------------------
Layer (type) Output Shape Param #
================================================================
InstanceNorm1d-1 [-1, 32, 128] 64
PReLU-2 [-1, 32, 128] 32
Conv1d-3 [-1, 32, 128] 1,024
InstanceNorm1d-4 [-1, 32, 128] 64
PReLU-5 [-1, 32, 128] 32
Upsample-6 [-1, 32, 255] 0
Conv1d-7 [-1, 32, 255] 3,072
InstanceNorm1d-8 [-1, 32, 255] 64
PReLU-9 [-1, 32, 255] 32
Conv1d-10 [-1, 16, 255] 512
Upsample-11 [-1, 32, 255] 0
Conv1d-12 [-1, 16, 255] 512
InstanceNorm1d-13 [-1, 16, 255] 32
BlockBottleneck1d-14 [-1, 16, 255] 0
================================================================
Total params: 5,440
Trainable params: 5,440
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.02
Forward/backward pass size (MB): 0.59
Params size (MB): 0.02
Estimated Total Size (MB): 0.63
----------------------------------------------------------------
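The totals in the two summaries can be cross-checked with a short, dependency-free calculation. The sketch below reads the channel numbers from the tables (16 to 32 channels in Example 1, 32 to 16 in Example 2) and assumes bias-free convolutions, a hidden width equal to in_planes, affine instance normalization (two parameters per channel) and PReLU with one parameter per channel. The function names are hypothetical, and the sketch only models blocks that include the projection on the input branch.

```python
def conv_params(in_ch, out_ch, kernel_size):
    """Weights of a bias-free 1D convolution."""
    return in_ch * out_ch * kernel_size


def norm_act_params(channels):
    """Affine InstanceNorm1d (weight + bias) plus per-channel PReLU."""
    return 2 * channels + channels


def bottleneck_params(in_planes, out_planes, kernel_size=3):
    """Total parameters of a bottleneck block with a projection shortcut."""
    hidden = in_planes  # assumption read off the summaries
    branch = (
        norm_act_params(in_planes) + conv_params(in_planes, hidden, 1)
        + norm_act_params(hidden) + conv_params(hidden, hidden, kernel_size)
        + norm_act_params(hidden) + conv_params(hidden, out_planes, 1)
    )
    # Shortcut: 1x1 projection followed by a normalization without activation.
    shortcut = conv_params(in_planes, out_planes, 1) + 2 * out_planes
    return branch + shortcut


print(bottleneck_params(16, 32))  # 2256, matches Example 1
print(bottleneck_params(32, 16))  # 5440, matches Example 2
```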