Networks
We now have a fully working deep learning library with most of the features of a real industrial system like Torch. To take advantage of this hard work, this module is entirely based on using the software framework. In particular, we are going to build an image recognition system. We will do this by building the infrastructure for a version of LeNet on MNIST, a classic convolutional neural network (CNN) for digit recognition, and for a 1D convolution for NLP sentiment classification.
You need the files from previous assignments, so make sure to pull them over to your new repo. We recommend getting familiar with tensor.py, since you might find some of its functions useful for implementing this module.
Task 4.1: 1D Convolution
You will implement the 1D convolution in Numba. This function gets used by the forward and backward pass of conv1d.
Todo
Complete the following function in minitorch/fast_conv.py, and pass tests marked as task4_1.
minitorch._tensor_conv1d(out: Tensor, out_shape: Shape, out_strides: Strides, out_size: int, input: Tensor, input_shape: Shape, input_strides: Strides, weight: Tensor, weight_shape: Shape, weight_strides: Strides, reverse: bool) -> None
1D Convolution implementation.
Given input tensor of batch, in_channels, width and weight tensor out_channels, in_channels, k_width, computes padded output of batch, out_channels, width.
reverse decides if weight is anchored left (False) or right (True).
(See diagrams)
Parameters:
- out (Storage) – storage for out tensor.
- out_shape (Shape) – shape for out tensor.
- out_strides (Strides) – strides for out tensor.
- out_size (int) – size of the out tensor.
- input (Storage) – storage for input tensor.
- input_shape (Shape) – shape for input tensor.
- input_strides (Strides) – strides for input tensor.
- weight (Storage) – storage for weight tensor.
- weight_shape (Shape) – shape for weight tensor.
- weight_strides (Strides) – strides for weight tensor.
- reverse (bool) – anchor weight at left or right
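To make the anchoring concrete, here is a minimal reference sketch in plain NumPy. It works on whole arrays rather than on storage, shape, and strides, so it is not the Numba function the task asks for. The name conv1d_reference is hypothetical, and the reading of "anchored left" as reading forward from the output position (and "anchored right" as reading backward), with out-of-range input positions counted as zero, is an assumption about the intended semantics.

```python
import numpy as np


def conv1d_reference(inp, weight, reverse=False):
    """Naive 1D convolution; out-of-range input positions count as zero."""
    batch, in_channels, width = inp.shape
    out_channels, in_channels_w, k_width = weight.shape
    assert in_channels == in_channels_w
    out = np.zeros((batch, out_channels, width))
    for b in range(batch):
        for oc in range(out_channels):
            for w in range(width):
                total = 0.0
                for ic in range(in_channels):
                    for k in range(k_width):
                        # Assumed semantics: the left anchor reads forward
                        # from w, the right anchor reads backward from w.
                        pos = w - k if reverse else w + k
                        if 0 <= pos < width:
                            total += inp[b, ic, pos] * weight[oc, ic, k]
                out[b, oc, w] = total
    return out


x = np.array([[[1.0, 2.0, 3.0, 4.0]]])  # batch=1, in_channels=1, width=4
w = np.array([[[1.0, 10.0]]])           # out_channels=1, in_channels=1, k_width=2
left = conv1d_reference(x, w, reverse=False)   # [[[21, 32, 43, 4]]]
right = conv1d_reference(x, w, reverse=True)   # [[[1, 12, 23, 34]]]
```

A slow reference like this can be handy for checking a strided implementation on small inputs.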
Task 4.2: 2D Convolution
You will implement the 2D convolution in Numba. This function gets used by the forward and backward pass of conv2d.
Todo
Complete the following function in minitorch/fast_conv.py, and pass tests marked as task4_2.
minitorch._tensor_conv2d(out: Tensor, out_shape: Shape, out_strides: Strides, out_size: int, input: Tensor, input_shape: Shape, input_strides: Strides, weight: Tensor, weight_shape: Shape, weight_strides: Strides, reverse: bool) -> None
2D Convolution implementation.
Given input tensor of batch, in_channels, height, width and weight tensor out_channels, in_channels, k_height, k_width, computes padded output of batch, out_channels, height, width.
reverse decides if weight is anchored top-left (False) or bottom-right (True).
(See diagrams)
Parameters:
- out (Storage) – storage for out tensor.
- out_shape (Shape) – shape for out tensor.
- out_strides (Strides) – strides for out tensor.
- out_size (int) – size of the out tensor.
- input (Storage) – storage for input tensor.
- input_shape (Shape) – shape for input tensor.
- input_strides (Strides) – strides for input tensor.
- weight (Storage) – storage for weight tensor.
- weight_shape (Shape) – shape for weight tensor.
- weight_strides (Strides) – strides for weight tensor.
- reverse (bool) – anchor weight at top-left or bottom-right
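The 2D case can be sketched the same way. The block below is a plain NumPy reference, not the strided Numba kernel. The name conv2d_reference is hypothetical, and it assumes that a top-left anchor reads down and to the right from each output position, a bottom-right anchor reads up and to the left, and positions outside the image count as zero.

```python
import numpy as np


def conv2d_reference(inp, weight, reverse=False):
    """Naive 2D convolution; out-of-range input positions count as zero."""
    batch, in_channels, height, width = inp.shape
    out_channels, in_channels_w, kh, kw = weight.shape
    assert in_channels == in_channels_w
    out = np.zeros((batch, out_channels, height, width))
    # Assumed semantics: +1 walks away from a top-left anchor,
    # -1 walks away from a bottom-right anchor.
    sign = -1 if reverse else 1
    for b, oc, h, w in np.ndindex(batch, out_channels, height, width):
        total = 0.0
        for ic, i, j in np.ndindex(in_channels, kh, kw):
            r, c = h + sign * i, w + sign * j
            if 0 <= r < height and 0 <= c < width:
                total += inp[b, ic, r, c] * weight[oc, ic, i, j]
        out[b, oc, h, w] = total
    return out


x = np.arange(1.0, 10.0).reshape(1, 1, 3, 3)  # rows: 1 2 3 / 4 5 6 / 7 8 9
k = np.ones((1, 1, 2, 2))
top_left = conv2d_reference(x, k)
bottom_right = conv2d_reference(x, k, reverse=True)
```

With an all-ones 2x2 kernel, each output is the sum of the input values the kernel covers, so the corner values are easy to verify by hand.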
Task 4.3: Pooling
You will implement 2D pooling on tensors with an average operation.
Todo
Complete the following functions in minitorch/nn.py, and pass tests marked as task4_3.
minitorch.tile(input: Tensor, kernel: Tuple[int, int]) -> Tuple[Tensor, int, int]
Reshape an image tensor for 2D pooling
Parameters:
- input (Tensor) – batch x channel x height x width
- kernel (Tuple[int, int]) – height x width of pooling
Returns:
- Tuple[Tensor, int, int] – Tensor of size batch x channel x new_height x new_width x (kernel_height * kernel_width) as well as the new_height and new_width values.
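The reshape described above can be illustrated with NumPy arrays. This is only a sketch of the idea: tile_numpy is a hypothetical name, the array methods used here belong to NumPy rather than to the minitorch Tensor API, and it assumes the height and width divide evenly by the kernel size.

```python
import numpy as np


def tile_numpy(inp, kernel):
    """Group each pooling window's values into one trailing axis."""
    batch, channel, height, width = inp.shape
    kh, kw = kernel
    # Assumption: the image divides evenly into pooling windows.
    assert height % kh == 0 and width % kw == 0
    new_height, new_width = height // kh, width // kw
    # Split each spatial axis into (window index, offset inside window).
    x = inp.reshape(batch, channel, new_height, kh, new_width, kw)
    # Move the two offset axes next to each other at the end.
    x = x.transpose(0, 1, 2, 4, 3, 5)
    # Merge the offsets into a single axis of length kh * kw.
    x = x.reshape(batch, channel, new_height, new_width, kh * kw)
    return x, new_height, new_width


img = np.arange(16.0).reshape(1, 1, 4, 4)
tiled, nh, nw = tile_numpy(img, (2, 2))
# tiled[0, 0, 0, 0] holds the top-left 2x2 window: [0, 1, 4, 5]
```

Once the window values share one axis, any pooling operation becomes a reduction over that last axis.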
minitorch.avgpool2d(input: Tensor, kernel: Tuple[int, int]) -> Tensor
Tiled average pooling 2D
Parameters:
- input – batch x channel x height x width
- kernel – height x width of pooling
Returns:
- Tensor – Pooled tensor
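As a small worked example of what average pooling computes, the NumPy sketch below averages each non-overlapping window. The name avgpool2d_numpy is hypothetical, and the sketch assumes the image divides evenly by the kernel. In minitorch itself the same result would presumably come from tiling and then reducing over the last dimension.

```python
import numpy as np


def avgpool2d_numpy(inp, kernel):
    """Average over non-overlapping kh x kw windows."""
    batch, channel, height, width = inp.shape
    kh, kw = kernel
    # Assumption: height and width are multiples of the kernel size.
    windows = inp.reshape(batch, channel, height // kh, kh, width // kw, kw)
    # Axes 3 and 5 index positions inside each window.
    return windows.mean(axis=(3, 5))


img = np.arange(16.0).reshape(1, 1, 4, 4)
pooled = avgpool2d_numpy(img, (2, 2))
# Top-left window is 0, 1, 4, 5, so its average is 2.5.
```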
Task 4.4: Softmax and Dropout
You will implement max, softmax, and log softmax on tensors as well as the dropout and max-pooling operations.
Todo
- Complete the following functions in minitorch/nn.py, and pass tests marked as task4_4.
- Add a property test for the function in test/test_nn.py and ensure that you understand its gradient computation.
minitorch.max(input: Tensor, dim: int) -> Tensor
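For intuition about the gradient of a max reduction, here is a NumPy sketch that compares a one-hot mask gradient against central differences. The names max_forward and max_backward are hypothetical, keeping the reduced dimension with size 1 is an assumption, and the check uses an input without ties, since the derivative is not well defined where two entries share the maximum.

```python
import numpy as np


def max_forward(x, dim):
    # Assumption: the reduced dimension is kept with size 1.
    return x.max(axis=dim, keepdims=True)


def max_backward(x, dim, grad_out):
    # Gradient flows only to positions that hold the maximum.
    mask = (x == x.max(axis=dim, keepdims=True)).astype(x.dtype)
    return mask * grad_out


x = np.array([[1.0, 5.0, 2.0], [7.0, 3.0, 4.0]])  # no ties in either row
grad = max_backward(x, 1, np.ones((2, 1)))

# Central-difference estimate of d(sum of row maxima) / dx.
eps = 1e-4
numeric = np.zeros_like(x)
for idx in np.ndindex(*x.shape):
    up, down = x.copy(), x.copy()
    up[idx] += eps
    down[idx] -= eps
    diff = max_forward(up, 1).sum() - max_forward(down, 1).sum()
    numeric[idx] = diff / (2 * eps)
```

The analytic and numeric gradients should agree: each is 1 at the position of the row maximum and 0 elsewhere.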
minitorch.softmax(input: Tensor, dim: int) -> Tensor
Compute the softmax as a tensor.
\(z_i = \frac{e^{x_i}}{\sum_j e^{x_j}}\)
Parameters:
- input – input tensor
- dim – dimension to apply softmax
Returns:
- Tensor – softmax tensor
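The formula can be checked numerically with a short NumPy sketch. The name softmax_numpy is hypothetical. Subtracting the maximum before exponentiating is a common stabilization added here, not something the definition above requires. It leaves the result unchanged because the shift cancels between numerator and denominator.

```python
import numpy as np


def softmax_numpy(x, dim):
    """Softmax along `dim`, shifted by the max to avoid overflow."""
    shifted = x - x.max(axis=dim, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=dim, keepdims=True)


x = np.array([[1.0, 2.0, 3.0], [1000.0, 1000.0, 1000.0]])
probs = softmax_numpy(x, dim=1)
# Each row sums to 1; the second row is uniform even with large inputs.
```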
minitorch.logsoftmax(input: Tensor, dim: int) -> Tensor
Compute the log of the softmax as a tensor.
\(z_i = x_i - \log \sum_j e^{x_j}\)
See https://en.wikipedia.org/wiki/LogSumExp#log-sum-exp_trick_for_log-domain_calculations
Parameters:
- input – input tensor
- dim – dimension to apply log-softmax
Returns:
- Tensor – log of softmax tensor
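The log-sum-exp trick referenced above can be sketched in NumPy as follows. The name logsoftmax_numpy is hypothetical. The sketch pulls the maximum m out of the sum, using log sum exp(x) = m + log sum exp(x - m), so that no exponent is larger than zero.

```python
import numpy as np


def logsoftmax_numpy(x, dim):
    """Log-softmax along `dim` using the log-sum-exp trick."""
    m = x.max(axis=dim, keepdims=True)
    # log(sum(exp(x))) computed without overflowing for large x.
    lse = m + np.log(np.exp(x - m).sum(axis=dim, keepdims=True))
    return x - lse


x = np.array([[1.0, 2.0, 3.0], [1000.0, 1001.0, 1002.0]])
out = logsoftmax_numpy(x, dim=1)
# Both rows give the same result: adding a constant to every entry
# of a row does not change its log-softmax.
```

A direct evaluation of exp(1002.0) would overflow to infinity in double precision, which is why the shifted form is preferred.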
minitorch.maxpool2d(input: Tensor, kernel: Tuple[int, int]) -> Tensor
Tiled max pooling 2D
Parameters:
- input (Tensor) – batch x channel x height x width
- kernel (Tuple[int, int]) – height x width of pooling
Returns:
- Tensor – pooled tensor
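Max pooling follows the same window structure as average pooling, with the mean replaced by a maximum. The NumPy sketch below uses the hypothetical name maxpool2d_numpy and assumes the image divides evenly by the kernel.

```python
import numpy as np


def maxpool2d_numpy(inp, kernel):
    """Maximum over non-overlapping kh x kw windows."""
    batch, channel, height, width = inp.shape
    kh, kw = kernel
    # Assumption: height and width are multiples of the kernel size.
    windows = inp.reshape(batch, channel, height // kh, kh, width // kw, kw)
    # Axes 3 and 5 index positions inside each window.
    return windows.max(axis=(3, 5))


img = np.arange(16.0).reshape(1, 1, 4, 4)
pooled_max = maxpool2d_numpy(img, (2, 2))
# Values increase left to right and top to bottom, so each window's
# maximum is its bottom-right entry: 5, 7, 13, 15.
```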
minitorch.dropout(input: Tensor, rate: float, ignore: bool = False) -> Tensor
Dropout positions based on random noise.
Parameters:
- input – input tensor
- rate – probability [0, 1) of dropping out each position
- ignore – skip dropout, i.e. do nothing at all
Returns:
- Tensor – tensor with random positions dropped out
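One way to read this description is as a random keep mask multiplied into the input. The NumPy sketch below takes that reading. The name dropout_numpy and the rng parameter are hypothetical additions for reproducibility, and the sketch does not rescale the kept values, since the description above does not mention rescaling.

```python
import numpy as np


def dropout_numpy(x, rate, ignore=False, rng=None):
    """Zero each position independently with probability `rate`."""
    if ignore:
        # Skip dropout entirely and return the input untouched.
        return x
    rng = np.random.default_rng() if rng is None else rng
    # Uniform noise in [0, 1); a position is kept when its noise exceeds rate.
    keep = rng.random(x.shape) > rate
    return x * keep


x = np.ones((100, 100))
untouched = dropout_numpy(x, 0.5, ignore=True)
dropped = dropout_numpy(x, 0.5, rng=np.random.default_rng(0))
# Roughly half of the entries in `dropped` are zero, the rest are one.
```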
Task 4.4b: Extra Credit
Implementing convolution and pooling efficiently is critical for large-scale image recognition. However, both are a bit harder than some of the basic CUDA functions we have written so far. For this task, add an extra file cuda_conv.py that implements conv1d and conv2d on CUDA. Show the output on Colab.
Task 4.5: Training an Image Classifier
If your code works, you should now be able to move on to the NLP and CV training scripts in project/run_sentiment.py and project/run_mnist_multiclass.py. These scripts have the same basic training setup as Module 3, but now adapted to sentiment and image classification. You need to implement Conv1D, Conv2D, and Network for both files.
We recommend running on the command line when testing, but you can also use the Streamlit visualization to view the hidden states of your model.
Todo
- Train a model on Sentiment (SST2), and add your training printout logs as a text file sentiment.txt to the repo. It should show train loss, train accuracy and validation accuracy. (The model should get to >70% best validation accuracy.)
- Train a model on Digit classification (MNIST), and add your training printout logs as a text file mnist.txt to the repo. It should show train loss and validation accuracy out of 16