Module 2.1 - Tensors¶

Intuition: Split 1¶

In [2]:
yellow = Linear(-1, 0, 0.25)
ycolor = Color("#fde699")
draw_with_hard_points(yellow, ycolor, Color("white"))
Out[2]:

Reshape: ReLU¶

In [3]:
graph(
    minitorch.operators.relu,
    [yellow.forward(*pt) for pt in s2_hard],
    [yellow.forward(*pt) for pt in s1_hard],
    3,
    0.25,
    c=ycolor,
)
Out[3]:

Math View¶

$$ \begin{eqnarray*} h_1 &=& \text{ReLU}(\text{lin}(x; w^0, b^0)) \end{eqnarray*} $$

Intuition: Split 2¶

In [4]:
green = Linear(1, 0, -0.8)
gcolor = Color("#d1e9c3")
draw_with_hard_points(green, gcolor, Color("white"))
Out[4]:

Math View¶

$$ \begin{eqnarray*} h_2 &=& \text{ReLU}(\text{lin}(x; w^1, b^1)) \end{eqnarray*} $$

Reshape: ReLU¶

In [5]:
graph(
    minitorch.operators.relu,
    [green.forward(*pt) for pt in s2_hard],
    [green.forward(*pt) for pt in s1_hard],
    3,
    0.25,
    c=gcolor,
)
Out[5]:

Network View¶

In [6]:
draw_nn_graph(green, yellow)
Out[6]:

Math View (Alt)¶

$$ \begin{eqnarray*} \text{lin}(x; w, b) &=& x_1 \times w_1 + x_2 \times w_2 + b \\ h_1 &=& \text{ReLU}(\text{lin}(x; w^0, b^0)) \\ h_2 &=& \text{ReLU}(\text{lin}(x; w^1, b^1)) \\ m(x_1, x_2) &=& \text{lin}(h; w, b) \end{eqnarray*} $$

Code View¶

Model

In [7]:
class Network(minitorch.Module):
    def __init__(self):
        super().__init__()
        self.unit1 = LinearModule()
        self.unit2 = LinearModule()
        self.classify = LinearModule()

    def forward(self, x):
        # yellow
        h1 = self.unit1.forward(x).relu()
        # green
        h2 = self.unit2.forward(x).relu()
        return self.classify.forward((h1, h2))

Quiz¶

Outline¶

  • Tensors
  • Operations
  • Strides

Tensors¶

Motivation¶

$$ \begin{eqnarray*} \text{lin}(x; w, b) &=& x_1 \times w_1 + x_2 \times w_2 + b \\ h_1 &=& \text{ReLU}(\text{lin}(x; w^0, b^0)) \\ h_2 &=& \text{ReLU}(\text{lin}(x; w^1, b^1)) \\ m(x_1, x_2) &=& \text{lin}(h; w, b) \end{eqnarray*} $$

Parameters: $w_1, w_2, w^0_1, w^0_2, w^1_1, w^1_2, b, b^0, b^1$

  • This is really messy!

Matrix Form¶

$$ \begin{eqnarray*} \mathbf{h} &=& \text{ReLU}(\mathbf{W}^{(0)} \mathbf{x} + \mathbf{b}^{(0)}) \\ m(\mathbf{x}) &=& \mathbf{W} \mathbf{h} + \mathbf{b} \end{eqnarray*} $$

Parameters: $\mathbf{W}, \mathbf{b}, \mathbf{W}^{(0)}, \mathbf{b}^{(0)}$

  • Matrix form computes a whole layer of linears at once (possibly more than 2!), as the sketch below shows.
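
As a concrete sketch of the matrix form, here is the same two-unit model in NumPy (NumPy stands in for the tensors we build in this module; the classifier weights $\mathbf{W}, \mathbf{b}$ are made-up values for illustration):

import numpy as np

W0 = np.array([[-1.0, 0.0],   # yellow unit weights
               [ 1.0, 0.0]])  # green unit weights
b0 = np.array([0.25, -0.8])
W = np.array([[1.0, 1.0]])    # hypothetical classifier weights
b = np.array([0.0])

def model(x):
    h = np.maximum(W0 @ x + b0, 0.0)  # ReLU(W^(0) x + b^(0))
    return W @ h + b                  # W h + b

model(np.array([0.5, 0.5]))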

Matrix / Tensors¶

  • Multi-dimensional arrays
  • The basis for mathematical programming
  • A similar foundation across many libraries (MATLAB, NumPy, etc.)

Terminology¶

  • 0-Dimensional - Scalar

  • The Scalar from Module-0

Terminology¶

  • 1-Dimensional - Vector
In [8]:
matrix(5, 1)
Out[8]:

Terminology¶

  • 2-Dimensional - Matrix
In [9]:
matrix(3, 5)
Out[9]:

Terminology¶

  • n-dimensions - Tensor
In [10]:
tensor(0.75, 2, 3, 5)
Out[10]:

Terminology¶

  • Dims - # dimensions (x.dims)
  • Shape - # cells per dimension (x.shape)
  • Size - # cells (x.size)
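
A quick sketch of the three properties (assuming the `minitorch.rand` constructor; the exact API may differ slightly):

x = minitorch.rand((2, 3, 5))
x.dims   # 3 - number of dimensions
x.shape  # (2, 3, 5) - cells per dimension
x.size   # 30 - total number of cells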

Visual Convention¶

  • depth
  • row
  • column

Example¶

  • dims: 2
  • shape: (3, 5)
  • size: 15
In [11]:
matrix(3, 5)
Out[11]:

Example¶

  • dims: ?
  • shape: ?
  • size: ?
In [12]:
matrix(4, 3)
Out[12]:

Indexing¶

  • Indexing syntax: x[0, 1, 2]
In [13]:
tensor(0.75, 2, 3, 5,
       colormap=lambda i, j, k: drawing.aqua if (i, j, k) == (0, 1, 2) else drawing.white)
Out[13]:
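
In code, the index tuple picks out a single cell. A sketch (assuming standard minitorch indexing):

x = minitorch.rand((2, 3, 5))   # same tensor as in the sketch above
value = x[0, 1, 2]              # the cell at depth 0, row 1, column 2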

Implementing Tensors¶

Why not just use lists?¶

  • Functions to manipulate shape
  • Mathematical notation
  • Enables autodiff
  • Efficient control of memory (Module-3)

Tensor Usage¶

Unary

In [14]:
new_tensor = x.log()

Binary (for now, only tensors of the same shape)

In [15]:
new_tensor = x + x

Reductions

In [16]:
new_tensor = x.sum()
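
Putting the three styles together in one sketch (assuming the standard `minitorch.tensor` constructor):

x = minitorch.tensor([[1.0, 2.0], [3.0, 4.0]])

y = x.log()   # unary: applied elementwise, same shape
z = x + x     # binary: elementwise over two same-shaped tensors
s = x.sum()   # reduction: collapses all cells to one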

Immutable Operations¶

  • We never change the tensor itself (mostly)
  • All operations return a new tensor (just like `Scalar`)
In [17]:
set_svg_height(200)
draw_boxes(["$x$", "$f(x)$"], [1])
Out[17]:

What's bad about tensors?¶

  • Hard to grow or shrink
  • Only numerical values
  • Lose list comprehensions / Python built-ins
  • Shapes are easy to mess up

Next Couple Lectures¶

  • No autodifferentiation for now
  • Only consider forward tensor operations
  • Add autodiff afterwards

Tensor Internals¶

How does this work¶

  • Storage: a 1-D array of numbers of length size

  • Strides: a tuple that maps a user index to a position in the 1-D storage (see the sketch below)
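
All of this machinery reduces to one formula: multiply each index entry by its stride and sum. A minimal sketch in the spirit of the `index_to_position` function you implement in `tensor_data.py` (not the actual solution):

def index_to_position(index, strides):
    "Multiply each index entry by its stride and sum the results."
    position = 0
    for i, s in zip(index, strides):
        position += i * s
    return position

# Shape (5, 2) with strides (1, 5), as on the next slide:
assert index_to_position((3, 1), (1, 5)) == 8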

Strides¶

  • Shape: $(5, 2)$
  • Stride: $(1, 5)$
  • Example: index $(3, 1)$ is stored at position $3 \times 1 + 1 \times 5 = 8$
In [18]:
set_svg_height(200)
d = (
    matrix(5, 2, "n", colormap=color(5, 2))
    / vstrut(1)
    / matrix(1, 10, "s", colormap=lambda i, j: color(5, 2)(j % 5, j // 5))
)
d.connect(("n", 3, 0), ("s", 0, 3)).connect(("n", 3, 1), ("s", 0, 8))
Out[18]:

Strides¶

  • Shape: $(2, 5)$
  • Stride: $(1, 2)$
In [19]:
d = (
    matrix(2, 5, "n", colormap=lambda i, j: color(5, 2)(j, i))
    / vstrut(1)
    / matrix(1, 10, "s", colormap=color(1, 10))
)
d.connect(("n", 0, 3), ("s", 0, 6)).connect(("n", 1, 3), ("s", 0, 7))
Out[19]:

Strides¶

  • Shape: $(2, 2, 3)$
  • Stride: $(6, 3, 1)$
In [20]:
d = (
    tensor(0.5, 2, 2, 3, "n", colormap=lambda i, j, k: color(4, 3)(i * 2 + j, k))
    / vstrut(1)
    / matrix(1, 12, "s", colormap=color(1, 12))
)
d.connect(("n", 0, 1, 1), ("s", 0, 4)).connect_perim(
    ("n", 1, 0, 2), ("s", 0, 2 + 6), unit_x - unit_y, -unit_y
)
Out[20]:
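
Checking the two highlighted cells with the `index_to_position` sketch from earlier:

assert index_to_position((0, 1, 1), (6, 3, 1)) == 4
assert index_to_position((1, 0, 2), (6, 3, 1)) == 8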

Which do we use?¶

  • Contiguous: bigger strides on the left, $s_1 \geq s_2 \geq s_3$ (see the sketch below)
  • However, we need to handle all cases.
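
For the contiguous case, strides fall out of the shape as a right-to-left running product. A sketch (`tensor_data.py` has a `strides_from_shape` helper in this spirit):

def strides_from_shape(shape):
    "Contiguous strides: the rightmost dimension moves fastest."
    strides = [1]
    for dim in reversed(shape[1:]):
        strides.append(strides[-1] * dim)
    return tuple(reversed(strides))

assert strides_from_shape((2, 2, 3)) == (6, 3, 1)
assert strides_from_shape((5, 2)) == (2, 1)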

Strides are useful: Transpose¶

Can transpose without copying.

In [21]:
matrix(2, 5, colormap=color(2, 5)) | chalk.hstrut(1) |  matrix(5, 2, colormap=lambda i,j: color(2, 5)(j,i))
Out[21]:
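
The picture above is pure bookkeeping: permute the shape and the strides, and the storage never moves (minitorch exposes this idea as a `permute` operation). A sketch:

storage = list(range(10))

shape, strides = (2, 5), (5, 1)      # contiguous view
t_shape, t_strides = (5, 2), (1, 5)  # transposed view: both tuples permuted

# x[i, j] and its transpose x_t[j, i] read the same storage cell
i, j = 1, 3
assert storage[i * strides[0] + j * strides[1]] == \
       storage[j * t_strides[0] + i * t_strides[1]]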

Operation 1: Indexing¶

  • $x[i, j, k]$

How do we find a data point in the 1-D storage?

Operation 2: Movement¶

How do I move to the next element in a row? In a column?

Operation 3: Reverse Indexing¶

How do I find the index for a given storage position?

Stride Intuition¶

  • Strides work like numerical bases
  • Index for position 0? Position 1? Position 2?
In [22]:
tensor(0.75, 2, 2, 2)
Out[22]:

Stride Intuition¶

  • Index for position 0? Position 1? Position 2?

  • $[0, 0, 0], [0, 0, 1], [0, 1, 0]$

In [23]:
(
    tensor(0.5, 2, 2, 2, "n", colormap=lambda i, j, k: color(4, 2)(i * 2 + j, k))
    / vstrut(1)
    / matrix(1, 8, "s", colormap=color(1, 8))
)
Out[23]:
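
With shape $(2, 2, 2)$ the contiguous strides are $(4, 2, 1)$, so walking storage positions $0 \dots 7$ in order is exactly counting in binary (a quick check in plain Python):

for p in range(8):
    index = (p // 4 % 2, p // 2 % 2, p % 2)
    print(p, index)   # 0 -> (0, 0, 0), 1 -> (0, 0, 1), 2 -> (0, 1, 0), ...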

Conversion Formula¶

  • Divide and mod
  • $k = p \bmod s_2$
  • $j = \lfloor p / s_2 \rfloor \bmod s_1$
  • ...
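
Generalizing the divide-and-mod pattern recovers an index from a storage position. A sketch for the contiguous case (minitorch's `to_index` in `tensor_data.py` plays this role, with a slightly different signature):

def to_index(position, shape):
    "Recover the multidimensional index of a contiguous storage position."
    index = []
    for dim in reversed(shape):
        index.append(position % dim)
        position //= dim
    return tuple(reversed(index))

assert to_index(8, (2, 2, 3)) == (1, 0, 2)   # matches the stride example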

Implementation¶

  • TensorData: manager of strides and storage

Module-2¶

Overview¶

  • tensor.py - Tensor Variable
  • tensor_functions.py - Tensor Functions
  • tensor_data.py - Storage and Indexing
  • tensor_ops.py - Low-level tensor operations

Q&A¶