Networks

We now have a fully working deep learning library with most of the features of a real industrial system like Torch. To take advantage of this hard work, this module is based entirely on using the software framework. In particular, we are going to build an image recognition system by building the infrastructure for a version of LeNet on MNIST, a classic convolutional neural network (CNN) for digit recognition. We will also build a 1D convolution for NLP sentiment classification.

You need the files from previous assignments, so make sure to pull them over to your new repo. We recommend getting familiar with tensor.py, since you might find some of its functions useful for implementing this module.

Task 4.1: 1D Convolution

You will implement the 1D convolution in Numba. This function gets used by the forward and backward pass of conv1d.

Todo

Complete the following function in minitorch/fast_conv.py, and pass tests marked as task4_1.

minitorch._tensor_conv1d(out: Tensor, out_shape: Shape, out_strides: Strides, out_size: int, input: Tensor, input_shape: Shape, input_strides: Strides, weight: Tensor, weight_shape: Shape, weight_strides: Strides, reverse: bool) -> None

1D Convolution implementation.

Given input tensor of

batch, in_channels, width

and weight tensor

out_channels, in_channels, k_width

Computes padded output of

batch, out_channels, width

reverse decides whether the weight is anchored at the left (False) or at the right (True). (See diagrams.)

Parameters:

  • out (Storage) –

    storage for out tensor.

  • out_shape (Shape) –

    shape for out tensor.

  • out_strides (Strides) –

    strides for out tensor.

  • out_size (int) –

    size of the out tensor.

  • input (Storage) –

    storage for input tensor.

  • input_shape (Shape) –

    shape for input tensor.

  • input_strides (Strides) –

    strides for input tensor.

  • weight (Storage) –

    storage for weight tensor.

  • weight_shape (Shape) –

    shape for weight tensor.

  • weight_strides (Strides) –

    strides for weight tensor.

  • reverse (bool) –

    anchor weight at left or right
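
As a readable reference for checking results, the padded 1D convolution can be sketched in plain NumPy. This is a minimal sketch and not the Numba kernel the task asks for: it works on shaped arrays rather than on storage and strides, the name conv1d_reference is hypothetical, and it assumes that "anchored left" means output position w reads input positions w, w + 1, ..., that "anchored right" reads w, w - 1, ..., and that positions outside the input count as zero.

```python
import numpy as np


def conv1d_reference(inp: np.ndarray, weight: np.ndarray, reverse: bool = False) -> np.ndarray:
    """Hypothetical reference for the padded 1D convolution (shaped arrays, no strides)."""
    batch, in_channels, width = inp.shape
    out_channels, in_channels_w, k_width = weight.shape
    assert in_channels == in_channels_w, "input and weight must agree on in_channels"

    out = np.zeros((batch, out_channels, width))
    for b in range(batch):
        for oc in range(out_channels):
            for w in range(width):
                total = 0.0
                for ic in range(in_channels):
                    for k in range(k_width):
                        # Assumed convention: anchored left reads to the right of w,
                        # anchored right (reverse=True) reads to the left of w.
                        iw = w - k if reverse else w + k
                        # Positions outside the input are treated as zero padding.
                        if 0 <= iw < width:
                            total += inp[b, ic, iw] * weight[oc, ic, k]
                out[b, oc, w] = total
    return out


x = np.array([[[1.0, 2.0, 3.0, 4.0]]])  # batch=1, in_channels=1, width=4
w = np.array([[[1.0, 10.0]]])           # out_channels=1, in_channels=1, k_width=2
print(conv1d_reference(x, w, reverse=False))  # values 21, 32, 43, 4
print(conv1d_reference(x, w, reverse=True))   # values 1, 12, 23, 34
```

Note that the output keeps the input width in both cases; only the side on which the kernel runs off the edge changes.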

Task 4.2: 2D Convolution

You will implement the 2D convolution in Numba. This function gets used by the forward and backward pass of conv2d.

Todo

Complete the following function in minitorch/fast_conv.py, and pass tests marked as task4_2.

minitorch._tensor_conv2d(out: Tensor, out_shape: Shape, out_strides: Strides, out_size: int, input: Tensor, input_shape: Shape, input_strides: Strides, weight: Tensor, weight_shape: Shape, weight_strides: Strides, reverse: bool) -> None

2D Convolution implementation.

Given input tensor of

batch, in_channels, height, width

and weight tensor

out_channels, in_channels, k_height, k_width

Computes padded output of

batch, out_channels, height, width

reverse decides whether the weight is anchored at the top-left (False) or at the bottom-right (True). (See diagrams.)

Parameters:

  • out (Storage) –

    storage for out tensor.

  • out_shape (Shape) –

    shape for out tensor.

  • out_strides (Strides) –

    strides for out tensor.

  • out_size (int) –

    size of the out tensor.

  • input (Storage) –

    storage for input tensor.

  • input_shape (Shape) –

    shape for input tensor.

  • input_strides (Strides) –

    strides for input tensor.

  • weight (Storage) –

    storage for weight tensor.

  • weight_shape (Shape) –

    shape for weight tensor.

  • weight_strides (Strides) –

    strides for weight tensor.

  • reverse (bool) –

    anchor weight at top-left or bottom-right
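
The 2D case can be checked against a small NumPy reference as well. The sketch below is an illustration under assumptions, not the required Numba implementation: conv2d_reference is a hypothetical name, it operates on shaped arrays instead of storage and strides, and it assumes that top-left anchoring reads input rows h, h + 1, ... and columns w, w + 1, ..., that bottom-right anchoring reads h, h - 1, ... and w, w - 1, ..., and that out-of-range positions are zero. Zero padding is applied on the side the kernel extends toward, then each kernel offset contributes one shifted window.

```python
import numpy as np


def conv2d_reference(inp: np.ndarray, weight: np.ndarray, reverse: bool = False) -> np.ndarray:
    """Hypothetical reference for the padded 2D convolution (shaped arrays, no strides)."""
    batch, in_channels, height, width = inp.shape
    out_channels, in_channels_w, k_height, k_width = weight.shape
    assert in_channels == in_channels_w, "input and weight must agree on in_channels"

    # Pad with zeros on the side the kernel extends toward:
    # bottom/right when anchored top-left, top/left when anchored bottom-right.
    if reverse:
        pad = ((0, 0), (0, 0), (k_height - 1, 0), (k_width - 1, 0))
    else:
        pad = ((0, 0), (0, 0), (0, k_height - 1), (0, k_width - 1))
    padded = np.pad(inp, pad)

    out = np.zeros((batch, out_channels, height, width))
    for i in range(k_height):
        for j in range(k_width):
            # Offset of the shifted window inside the padded input.
            top = k_height - 1 - i if reverse else i
            left = k_width - 1 - j if reverse else j
            window = padded[:, :, top:top + height, left:left + width]
            # Sum over in_channels for this kernel offset.
            out += np.einsum("bihw,oi->bohw", window, weight[:, :, i, j])
    return out


x = np.array([[[[1.0, 2.0], [3.0, 4.0]]]])            # 1 x 1 x 2 x 2
w = np.array([[[[1.0, 10.0], [100.0, 1000.0]]]])      # 1 x 1 x 2 x 2
print(conv2d_reference(x, w, reverse=False))  # top-left entry is 4321
print(conv2d_reference(x, w, reverse=True))   # bottom-right entry is 1234
```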

Task 4.3: Pooling

You will implement 2D pooling on tensors with an average operation.

Todo

Complete the following functions in minitorch/nn.py, and pass tests marked as task4_3.

minitorch.tile(input: Tensor, kernel: Tuple[int, int]) -> Tuple[Tensor, int, int]

Reshape an image tensor for 2D pooling

Parameters:

  • input (Tensor) –

    batch x channel x height x width

  • kernel (Tuple[int, int]) –

    height x width of pooling

Returns:

  • Tuple[Tensor, int, int]

    Tensor of size batch x channel x new_height x new_width x (kernel_height * kernel_width) as well as the new_height and new_width value.

minitorch.avgpool2d(input: Tensor, kernel: Tuple[int, int]) -> Tensor

Tiled average pooling 2D

Parameters:

  • input

    batch x channel x height x width

  • kernel

    height x width of pooling

Returns:

  • Tensor

    Pooled tensor
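
The reshaping that tile describes can be illustrated with NumPy arrays standing in for minitorch tensors. This is a sketch under assumptions: tile_np and avgpool2d_np are hypothetical names, and the input height and width are assumed to be divisible by the kernel height and width. The idea is to split each spatial axis into (blocks, kernel), move the two kernel axes to the end, and flatten them into one axis that pooling can reduce over.

```python
from typing import Tuple

import numpy as np


def tile_np(inp: np.ndarray, kernel: Tuple[int, int]) -> Tuple[np.ndarray, int, int]:
    """Hypothetical NumPy stand-in for tile: group each pooling window into the last axis."""
    batch, channel, height, width = inp.shape
    kh, kw = kernel
    # Assumption: the image divides evenly into pooling windows.
    assert height % kh == 0 and width % kw == 0
    new_height, new_width = height // kh, width // kw

    tiled = (
        inp.reshape(batch, channel, new_height, kh, new_width, kw)  # split both spatial axes
        .transpose(0, 1, 2, 4, 3, 5)                                # bring kh, kw together
        .reshape(batch, channel, new_height, new_width, kh * kw)    # flatten each window
    )
    return tiled, new_height, new_width


def avgpool2d_np(inp: np.ndarray, kernel: Tuple[int, int]) -> np.ndarray:
    """Average over the flattened window axis produced by tile_np."""
    tiled, _, _ = tile_np(inp, kernel)
    return tiled.mean(axis=-1)


image = np.arange(16.0).reshape(1, 1, 4, 4)
tiled, new_h, new_w = tile_np(image, (2, 2))
print(tiled.shape)                     # (1, 1, 2, 2, 4)
print(tiled[0, 0, 0, 0])               # the top-left 2 x 2 window: 0, 1, 4, 5
print(avgpool2d_np(image, (2, 2)))     # window means: 2.5, 4.5, 10.5, 12.5
```

Once the windows live in a single trailing axis, average pooling is just a reduction over that axis, and the same tiling can be reused for other reductions.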

Task 4.4: Softmax and Dropout

You will implement max, softmax, and log softmax on tensors as well as the dropout and max-pooling operations.

Todo

  • Complete the following functions in minitorch/nn.py, and pass tests marked as task4_4.

  • Add a property test for the function in test/test_nn.py and ensure that you understand its gradient computation.

minitorch.max(input: Tensor, dim: int) -> Tensor

minitorch.softmax(input: Tensor, dim: int) -> Tensor

Compute the softmax as a tensor.

\(z_i = \frac{e^{x_i}}{\sum_j e^{x_j}}\)

Parameters:

  • input

    input tensor

  • dim

    dimension to apply softmax

Returns:

  • Tensor

    softmax tensor

minitorch.logsoftmax(input: Tensor, dim: int) -> Tensor

Compute the log of the softmax as a tensor.

\(z_i = x_i - \log \sum_j e^{x_j}\)

See https://en.wikipedia.org/wiki/LogSumExp#log-sum-exp_trick_for_log-domain_calculations

Parameters:

  • input

    input tensor

  • dim

    dimension to apply log-softmax

Returns:

  • Tensor

    log of softmax tensor
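
The log-sum-exp trick referenced above can be sketched in NumPy. This is a minimal illustration with hypothetical names (softmax_np, logsoftmax_np), using NumPy arrays in place of minitorch tensors: subtracting the maximum along dim before exponentiating leaves the result unchanged mathematically but keeps the exponentials from overflowing.

```python
import numpy as np


def softmax_np(x: np.ndarray, dim: int) -> np.ndarray:
    """Softmax along dim, shifted by the max for numerical stability."""
    shifted = x - x.max(axis=dim, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=dim, keepdims=True)


def logsoftmax_np(x: np.ndarray, dim: int) -> np.ndarray:
    """Log-softmax along dim using the log-sum-exp trick."""
    m = x.max(axis=dim, keepdims=True)
    # log sum_j e^{x_j} = m + log sum_j e^{x_j - m}
    log_sum_exp = m + np.log(np.exp(x - m).sum(axis=dim, keepdims=True))
    return x - log_sum_exp


x = np.array([[1000.0, 1000.0], [0.0, 1.0]])
print(softmax_np(x, dim=1).sum(axis=1))  # each row sums to 1
print(logsoftmax_np(x, dim=1)[0])        # both entries equal -log(2), with no overflow
```

A naive np.exp(1000.0) overflows to infinity, so the first row is a case where the shift matters.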

minitorch.maxpool2d(input: Tensor, kernel: Tuple[int, int]) -> Tensor

Tiled max pooling 2D

Parameters:

  • input (Tensor) –

    batch x channel x height x width

  • kernel (Tuple[int, int]) –

    height x width of pooling

Returns:

  • Tensor

    pooled tensor

minitorch.dropout(input: Tensor, rate: float, ignore: bool = False) -> Tensor

Dropout positions based on random noise.

Parameters:

  • input

    input tensor

  • rate

    probability in [0, 1) of dropping out each position

  • ignore

    skip dropout, i.e. do nothing at all

Returns:

  • Tensor

    tensor with random positions dropped out
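
The dropout behavior described above can be sketched with NumPy arrays in place of minitorch tensors. This is an illustration under assumptions: dropout_np and its rng parameter are hypothetical, and the sketch simply zeroes dropped positions without rescaling the kept ones, which the description above does not specify either way.

```python
from typing import Optional

import numpy as np


def dropout_np(
    inp: np.ndarray,
    rate: float,
    ignore: bool = False,
    rng: Optional[np.random.Generator] = None,
) -> np.ndarray:
    """Hypothetical NumPy stand-in for dropout: zero each position with probability rate."""
    if ignore:
        # Skip dropout entirely and return the input unchanged.
        return inp
    rng = np.random.default_rng() if rng is None else rng
    # rng.random draws from [0, 1), so a position is dropped with probability rate.
    keep = rng.random(inp.shape) >= rate
    return inp * keep


x = np.ones((2, 3))
print(dropout_np(x, rate=0.5, rng=np.random.default_rng(0)))  # a random mix of 0s and 1s
print(dropout_np(x, rate=0.5, ignore=True))                   # unchanged input
```

The ignore flag is what lets the same network code run with dropout during training and without it during evaluation.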

Task 4.4b: Extra Credit

Implementing convolution and pooling efficiently is critical for large-scale image recognition. However, both are a bit harder than some of the basic CUDA functions we have written so far. For this task, add an extra file cuda_conv.py that implements conv1d and conv2d on CUDA. Show the output on Colab.

Task 4.5: Training an Image Classifier

If your code works, you should now be able to move on to the NLP and CV training scripts in project/run_sentiment.py and project/run_mnist_multiclass.py. These scripts have the same basic training setup as Module 3, but are now adapted to sentiment and image classification. You need to implement Conv1D, Conv2D, and Network for both files.

We recommend running on the command line when testing, but you can also use the Streamlit visualization to view the hidden states of your model.

Todo

  • Train a model on Sentiment (SST2), and add your training printout logs as a text file sentiment.txt to the repo. It should show train loss, train accuracy and validation accuracy. (The model should get to >70% best validation accuracy.)

  • Train a model on digit classification (MNIST), and add your training printout logs as a text file mnist.txt to the repo. It should show train loss and validation accuracy out of 16.