# modules.conv.ConvModern1d
```python
layer = mdnc.modules.conv.ConvModern1d(
    in_planes, out_planes,
    kernel_size=3, stride=1, padding=1, output_size=None,
    normalizer='pinst', activator='prelu', layer_order='new', scaler='down'
)
```
The implementation of the 1D modern convolutional layer. It supports both down-sampling and up-sampling modes. The modern convolutional layer is a stack of convolution, normalization, and activation, as shown in the following chart:
```mermaid
flowchart TB
    conv[Convolution] --> norm[Normalization] --> actv[Activation]
```
In the following paper, a new composition order of these operations is proposed for building residual blocks. This idea may help improve performance:

[Identity Mappings in Deep Residual Networks](https://arxiv.org/abs/1603.05027)
The basic idea of this method is shown in the following diagram:
```mermaid
flowchart TB
    norm[Normalization] --> actv[Activation] --> conv[Convolution]
```
This idea is called "pre-activation" in some works, and this implementation supports it as well. Setting the argument `layer_order='new'` makes the layer use the "pre-activation" order.
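As an illustration only, the two orders correspond roughly to the following plain PyTorch stacks (a hypothetical sketch, not the library's actual code; the channel sizes and `bias=False` are borrowed from the example summaries below):

```python
import torch.nn as nn

# layer_order='old': convolution -> normalization -> activation
old_order = nn.Sequential(
    nn.Conv1d(16, 32, kernel_size=3, stride=2, padding=1, bias=False),
    nn.InstanceNorm1d(32, affine=True),
    nn.PReLU(32),
)

# layer_order='new' ("pre-activation"): normalization and activation
# are applied to the input before the convolution.
new_order = nn.Sequential(
    nn.InstanceNorm1d(16, affine=True),
    nn.PReLU(16),
    nn.Conv1d(16, 32, kernel_size=3, stride=2, padding=1, bias=False),
)
```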
## Arguments
Requires
Argument | Type | Description |
---|---|---|
`in_planes` | `int` | The channel number of the input data. |
`out_planes` | `int` | The channel number of the output data. |
`kernel_size` | `int` | The kernel size of this layer. |
`stride` | `int` | The stride size of this layer. When `scaler='down'`, this argument serves as the down-sampling factor. When `scaler='up'`, it serves as the up-sampling factor. |
`padding` | `int` | The padding size of this layer. Zero padding is applied to both edges of the input before the convolution. |
`output_size` | `int` | The length of the output data. This option is only used when `scaler='up'`. When it is set, the size of the up-sampling is given explicitly and the argument `stride` is not used (see the sketch after this table). |
`normalizer` | `str` | The normalization method, could be: `'batch'`, `'inst'`, `'pinst'`, `'null'`. The default `'pinst'` is instance normalization with learnable affine parameters, as used in the examples below. |
`activator` | `str` | The activation method, could be: `'prelu'`, `'relu'`, `'null'`. |
`layer_order` | `str` | The sub-layer composition order, could be: `'new'` (the "pre-activation" order: normalization → activation → convolution) or `'old'` (convolution → normalization → activation). |
`scaler` | `str` | The scaling method, could be: `'down'` (down-sampling by a strided convolution) or `'up'` (up-sampling followed by a convolution). |
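The relation between the input and output lengths follows standard strided-convolution arithmetic. The following sketch (an illustration based on the PyTorch `Conv1d` shape formula, not code from the library) shows how the expected output length can be estimated for both modes:

```python
import math

def expected_out_length(L_in, kernel_size=3, stride=2, padding=1,
                        scaler='down', output_size=None):
    """Estimate the output length of a ConvModern1d-style layer.

    A back-of-the-envelope helper, not part of mdnc; the 'up' branch
    assumes the convolution after up-sampling uses same padding.
    """
    if scaler == 'down':
        # Standard PyTorch Conv1d shape formula.
        return math.floor((L_in + 2 * padding - kernel_size) / stride) + 1
    # scaler == 'up': output_size, when given, overrides stride.
    if output_size is not None:
        return output_size
    return L_in * stride  # up-sampling by the stride factor

print(expected_out_length(255, scaler='down'))                 # 128
print(expected_out_length(128, scaler='up', output_size=255))  # 255
```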
## Operators
### `__call__`
```python
y = layer(x)
```
The forward operator, implemented by the `forward()` method. The input is a 1D tensor, and the output is the final output of this layer.
Requires
Argument | Type | Description |
---|---|---|
`x` | `torch.Tensor` | A 1D tensor; the size should be `(B, C, L)`, where `B` is the batch size, `C` is the input channel number, and `L` is the input data length. |
Returns
Argument | Description |
---|---|
`y` | A 1D tensor; the size should be `(B, C, L)`, where `B` is the batch size, `C` is the output channel number, and `L` is the output data length. |
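A minimal usage sketch (the layer configuration and the expected shapes mirror Example 1 below):

```python
import torch
import mdnc

layer = mdnc.modules.conv.ConvModern1d(16, 32, kernel_size=3, stride=2,
                                       padding=1, scaler='down')
x = torch.randn(4, 16, 255)  # (B, C, L)
y = layer(x)
print(y.shape)  # torch.Size([4, 32, 128])
```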
## Examples
In the first example, we build a modern convolutional layer with ½ down-sampling and same padding.
Example 1
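A snippet along the following lines produces this summary (a reconstruction from the summary itself; the `mdnc.contribs.torchsummary.summary` helper and its exact signature are assumptions):

```python
import mdnc

layer = mdnc.modules.conv.ConvModern1d(16, 32, kernel_size=3, stride=2,
                                       padding=1, scaler='down')
mdnc.contribs.torchsummary.summary(layer, input_size=(16, 255), device='cpu')
```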
```
----------------------------------------------------------------
Layer (type) Output Shape Param #
================================================================
InstanceNorm1d-1 [-1, 16, 255] 32
PReLU-2 [-1, 16, 255] 16
Conv1d-3 [-1, 32, 128] 1,536
ConvModern1d-4 [-1, 32, 128] 0
================================================================
Total params: 1,584
Trainable params: 1,584
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.02
Forward/backward pass size (MB): 0.12
Params size (MB): 0.01
Estimated Total Size (MB): 0.15
----------------------------------------------------------------
```
Note that the output length is 128 in this example, because same padding is used for the input. In this case, if we want to build a reverse layer, we can specify `output_size` for the up-sampling layer, for example:
Example 2
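As before, a snippet along these lines reproduces the summary (a reconstruction; the helper call is an assumption):

```python
import mdnc

layer = mdnc.modules.conv.ConvModern1d(32, 16, kernel_size=3, stride=2,
                                       padding=1, output_size=255, scaler='up')
mdnc.contribs.torchsummary.summary(layer, input_size=(32, 128), device='cpu')
```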
```
----------------------------------------------------------------
Layer (type) Output Shape Param #
================================================================
InstanceNorm1d-1 [-1, 32, 128] 64
PReLU-2 [-1, 32, 128] 32
Upsample-3 [-1, 32, 255] 0
Conv1d-4 [-1, 16, 255] 1,536
ConvModern1d-5 [-1, 16, 255] 0
================================================================
Total params: 1,632
Trainable params: 1,632
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.02
Forward/backward pass size (MB): 0.19
Params size (MB): 0.01
Estimated Total Size (MB): 0.21
----------------------------------------------------------------
```