==============================
Tensors
==============================

.. jupyter-execute::
   :hide-code:

   import minitorch
   import sys
   sys.path.append("../project/")
   from show_tensor import *

Tensor is a fancy name for a simple concept. A tensor is a
`multi-dimensional array` of arbitrary dimensions. It is a convenient
and efficient way to hold data, which becomes much more powerful when
paired with fast operators and autodifferentiation.

Tensor Shapes
**************

So far we have focused on `scalars`, which correspond to 0-dimensional
tensors. Next, we consider a 1-dimensional tensor (`vector`):

.. jupyter-execute::
   :hide-code:

   plot_matrix(
       np.array([1, 1, 1, 1, 1]),
       np.array([1, 2, 3, 4, 5]),
       "vector",
       bg="#fcfcfc",
   )

Then a 2-dimensional tensor (`matrix`):

.. jupyter-execute::
   :hide-code:

   mat = np.vstack([np.hstack([1, 2, 3, 4, 5] * 2),
                    np.hstack([np.ones(5), np.ones(5) * 2])])
   x, y = mat
   plot_matrix(x, y, "matrix", w=480, h=310, bg="#fcfcfc")

In addition to its dimension (`dims`), other critical aspects of a
tensor are its `shape` and `size`. The shape of the above vector is
(5,) and its size (i.e. the number of squares in the graph) is 5. The
shape of the above matrix is (2, 5) and its size is 10. A 3-dimensional
tensor with shape (2, 3, 3) and size 18 looks like this:

.. jupyter-execute::
   :hide-code:

   tensor_figure(2, 3, 3, None, title=None, slider=False,
                 axisTitles=['', '', ''])

We access an element of the tensor by tensor index notation:
`tensor[i]` for 1 dimension, `tensor[i, j]` for 2 dimensions,
`tensor[i, j, k]` for 3 dimensions, and so forth. For example,
`tensor[0, 1, 2]` would give this blue cube:

.. jupyter-execute::
   :hide-code:

   tensor_figure(2, 3, 3, 5, title="Tensor Index: (0, 1, 2)")

Typically, we access tensors just like multi-dimensional arrays, but
there are some special geometric properties that make tensors
different.

First, tensors make it easy to change the order of the dimensions. For
example, we can `transpose` the dimensions of a matrix. For a general
tensor, we refer to this operation as `permute`. Calling `permute`
arbitrarily reorders the dimensions of the input tensor. For example,
as shown below, calling `permute(1, 0)` on a matrix of shape (2, 5)
gives a matrix of shape (5, 2). To index into the permuted matrix, we
access elements using `tensor[j, i]` instead of `tensor[i, j]`.

.. image:: figs/Tensors/matrix1.png
.. image:: figs/Tensors/matrix2.png

Second, tensors make it really easy to add or remove extra
dimensions. Note that a matrix of shape (5, 2) can store the same
amount of data as a tensor of shape (1, 5, 2), so they have the same
size, as shown below:

.. image:: figs/Tensors/matrix2.png
.. image:: figs/Tensors/broad.png

We would like to easily increase or decrease the dimension of a tensor
without changing the data. We will do this with a `view` function: use
`view(1, 5, 2)` for the above example. Element `tensor[i, j]` in the
(5, 2) matrix is now `tensor[0, i, j]` in the 3-dimensional tensor.

Critically, neither of these operations changes anything about the
input tensor itself. Both `view` and `permute` are `tensor tricks`,
i.e. operations that only modify how we look at the tensor, not any of
its data. Another way to say this is that they do not move or copy the
data in any way; they only change the external tensor wrapper.
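To see these tricks in action, here is a minimal sketch. It assumes
the `minitorch.tensor` constructor and the indexing, `permute`, and
`view` methods described above (exact names may vary by module)::

  t = minitorch.tensor([[1, 2, 3, 4, 5],
                        [6, 7, 8, 9, 10]])  # shape (2, 5), size 10

  t2 = t.permute(1, 0)         # shape (5, 2): dimensions reordered
  print(t[0, 3], t2[3, 0])     # the same element twice: 4.0 4.0

  t3 = t.view(1, 2, 5)         # shape (1, 2, 5): an extra dimension
  print(t[0, 3], t3[0, 0, 3])  # again the same element

Neither `t2` nor `t3` copies any data; all three tensors share one
underlying storage.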
Tensor Strides
**************

Users of a tensor library only have to be aware of the `shape` and
`size` of a tensor. However, there are important implementation
details that we need to keep track of. To make our code a bit cleaner,
we separate out the internal `tensor data` from the user-facing
tensor. In addition to the `shape`, :class:`minitorch.TensorData`
manages tensor `storage` and `strides`:

* **Storage** is where the core data of the tensor is kept. It is
  always a 1-D array of numbers of length `size`, no matter the
  dimensionality or `shape` of the tensor. Keeping a 1-D storage
  allows tensors with different shapes to point to the same type of
  underlying data.

* **Strides** is a tuple that provides the mapping from user indices
  to positions in the 1-D `storage`.

Strides can get a bit confusing to think about, so let's go over an
example. Consider a matrix of shape (5, 2). The standard mapping is to
walk left-to-right, top-to-bottom, ordering the matrix into the 1-D
`storage`:

.. image:: figs/Tensors/stride2.png
   :align: center
   :width: 400px

We call this a `contiguous` mapping, since it follows the natural
counting order (bigger strides on the left). Here the strides are
:math:`(2, 1)`. We read this as: each column moves 1 step in storage
and each row moves 2 steps.

We can have different strides for the same shape. For instance, if we
were walking top-to-bottom, left-to-right, we would have the following
stride map:

.. image:: figs/Tensors/stride1.png
   :align: center
   :width: 400px

Contiguous strides are generally preferred, but non-contiguous strides
can be quite useful as well. Consider transposing the above matrix and
using strides (1, 2):

.. image:: figs/Tensors/stride3.png
   :align: center
   :width: 400px

The transposed matrix has new strides (1, 2) and new shape (2, 5), in
contrast to the previous (2, 1) strides on the (5, 2) matrix, but
notably there is no change to the `storage`. This is one of the tensor
superpowers mentioned above: we can easily manipulate how we view the
same underlying `storage`.

Strides naturally extend to higher-dimensional tensors.

.. image:: figs/Tensors/stride4.png
   :align: center

Finally, strides can be used to implement indexing into the
tensor. Assuming the strides are :math:`(s_1, s_2)` and we want to
look up `tensor[i, j]`, we can use the strides directly to find its
position in the `storage`::

  storage[s1 * i + s2 * j]

Or, in general::

  storage[s1 * index1 + s2 * index2 + s3 * index3 ...]
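To make the stride arithmetic concrete, here is a standalone sketch in
plain Python, independent of the minitorch classes and similar in
spirit to the index-to-position routine inside
:class:`minitorch.TensorData`::

  def index_to_position(index, strides):
      # The dot product of the index with the strides gives the
      # position in the flat 1-D storage.
      return sum(i * s for i, s in zip(index, strides))

  # A (5, 2) matrix stored contiguously: strides (2, 1).
  storage = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

  pos = index_to_position((3, 1), (2, 1))
  print(pos, storage[pos])      # 3 * 2 + 1 * 1 = 7 -> value 7

  # The transposed view, shape (2, 5) with strides (1, 2), reads the
  # same storage: tensor[1, 3] lands on the same position.
  pos_t = index_to_position((1, 3), (1, 2))
  print(pos_t, storage[pos_t])  # 1 * 1 + 3 * 2 = 7 -> value 7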