Module 2.3 - Advanced Tensors¶

Map¶

In [2]:
set_svg_draw_height(200)
set_svg_height(200)
opts = ArrowOpts(arc_height=0.5, shaft_style=astyle)
d = hcat([matrix(3, 2, "a"), right_arrow, matrix(3, 2, "b")], 1)
d.connect(("a", 0, 0), ("b", 0, 0), opts).connect(("a", 1, 0), ("b", 1, 0), opts)
Out[2]:
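
A map applies a one-argument function to every element of a tensor, producing an output of the same shape. A minimal minitorch example (negation is implemented as a map):

import minitorch

x = minitorch.tensor([[1, 2], [3, 4], [5, 6]])
# Negation touches each cell independently: out[i, j] = -x[i, j].
print((-x).shape)  # (3, 2)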

Zip¶

In [3]:
opts = ArrowOpts(arc_height=0.5, shaft_style=astyle)
opts2 = ArrowOpts(arc_height=0.2, shaft_style=astyle)

d = hcat([matrix(3, 2, "a"), matrix(3, 2, "b"), right_arrow, matrix(3, 2, "c")], 1)
d.connect(("a", 0, 0), ("c", 0, 0), opts).connect(
    ("a", 1, 0), ("c", 1, 0), opts
).connect(("b", 0, 0), ("c", 0, 0), opts2).connect(("b", 1, 0), ("c", 1, 0), opts2)
Out[3]:
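
A zip combines two tensors of the same shape elementwise. Addition is the canonical example:

import minitorch

a = minitorch.tensor([[1, 2], [3, 4], [5, 6]])
b = minitorch.tensor([[10, 20], [30, 40], [50, 60]])
# Each output cell pairs the matching cells: out[i, j] = a[i, j] + b[i, j].
print((a + b).shape)  # (3, 2)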

Reduce¶

In [4]:
opts = ArrowOpts(shaft_style=astyle)
d = hcat([matrix(3, 2, "a"), right_arrow, matrix(1, 2, "c")], 1)
d.connect(("a", 2, 0), ("a", 0, 0), opts).connect(("a", 2, 1), ("a", 0, 1), opts)
Out[4]:
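
A reduce folds a two-argument function along one dimension. As the diagram shows, minitorch keeps the reduced dimension around with size 1 (assuming the Module 2 convention):

import minitorch

x = minitorch.tensor([[1, 2], [3, 4], [5, 6]])
# Summing over dim 0 folds the 3 rows into a single row.
print(x.sum(0).shape)  # (1, 2)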

Quiz¶

Outline¶

  • Broadcasting
  • Gradients
  • Tensor Puzzles

Broadcasting¶

Motivation: Scalar Addition¶

vector1 + 10
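
Even this one-liner needs broadcasting: the scalar has no dimensions, so it is implicitly stretched to the vector's shape. A quick torch check (vector1 is just an illustrative name from the slide):

import torch

vector1 = torch.tensor([1, 2, 3])
# The scalar 10 is broadcast against every entry of the vector.
print(vector1 + 10)  # tensor([11, 12, 13])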

Zip Broadcasting¶

In [5]:
opts = ArrowOpts(arc_height=0.5, shaft_style=astyle)
opts2 = ArrowOpts(arc_height=0.2, shaft_style=astyle)

d = hcat([matrix(3, 1, "a"), matrix(1, 2, "b"), right_arrow, matrix(3, 2, "c")], 1)
d.connect(("a", 0, 0), ("c", 0, 0), opts).connect(
    ("a", 1, 0), ("c", 1, 1), opts
).connect(("b", 0, 0), ("c", 0, 0), opts2).connect(("b", 0, 1), ("c", 1, 1), opts2)
Out[5]:
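
The diagram zips a (3, 1) column with a (1, 2) row: each size-1 dimension stretches to match the other side, giving a (3, 2) result. The same example in torch:

import torch

a = torch.tensor([[0], [1], [2]])  # shape (3, 1)
b = torch.tensor([[10, 20]])       # shape (1, 2)
c = a + b                          # both stretch: result has shape (3, 2)
print(c)
# tensor([[10, 20],
#         [11, 21],
#         [12, 22]])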

Rules¶

  • Rule 1: A dimension of size 1 broadcasts with a dimension of any size
  • Rule 2: Extra dimensions of size 1 can be added with view
  • Rule 3: Zip automatically adds starting dimensions of size 1 to the shorter shape

Applying the Rules¶

  A          B          =
  (3, 4, 5)  (3, 1, 5)  (3, 4, 5)
  (3, 4, 1)  (3, 1, 5)  (3, 4, 5)
  (3, 4, 1)  (1, 5)     (3, 4, 5)
  (3, 4, 1)  (3, 5)     Fail
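
Each row of the table can be checked mechanically in torch (the last row raises a RuntimeError):

import torch

print((torch.zeros(3, 4, 5) + torch.zeros(3, 1, 5)).shape)  # (3, 4, 5): Rule 1
print((torch.zeros(3, 4, 1) + torch.zeros(3, 1, 5)).shape)  # (3, 4, 5): Rule 1 twice
print((torch.zeros(3, 4, 1) + torch.zeros(1, 5)).shape)     # (3, 4, 5): Rules 3 + 1
try:
    torch.zeros(3, 4, 1) + torch.zeros(3, 5)
except RuntimeError as err:
    print("Fail:", err)  # middle dims 4 and 3 are neither equal nor 1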

Broadcasting Example¶

  3  4  1
     1  5
  -------
  3  4  5

View (adding dim)¶

In [6]:
import torch

x = torch.tensor([1, 2, 3])
print(x.shape)
print(x.view(3, 1).shape)
print(x.view(1, 3).shape)
torch.Size([3])
torch.Size([3, 1])
torch.Size([1, 3])

Matrix-Vector¶

In [7]:
x = minitorch.tensor([[1, 2], [3, 4], [5, 6]])
y = minitorch.tensor([[1], [3], [5]])
z = x + y
z.shape
Out[7]:
(3, 2)

Matrix-Vector¶

In [8]:
def col(c):
    return (
        matrix(3, 2).line_color(c).align_t().align_l()
        + matrix(3, 1).align_t().align_l()
    ).center_xy()


hcat(
    [
        matrix(3, 2),
        col(drawing.white),
        right_arrow,
        matrix(3, 2),
        col(drawing.papaya),
        right_arrow,
        matrix(3, 2),
    ],
    0.4,
)
Out[8]:

Matrix-Matrix¶

In [10]:
x = minitorch.zeros((4, 5))
y = minitorch.zeros((3, 1, 5))
z = x + y
z.shape
Out[10]:
(3, 4, 5)

Matrix-Matrix¶

In [11]:
def t(d, r, c, n=""):
    return tensor(0.5, d, r, c, n).fill_color(drawing.white)


d, r, c = 3, 4, 5
base = t(d, r, c).line_color(drawing.papaya)
chalk.set_svg_height(300)
hcat(
    [
        t(1, r, c),
        t(d, 1, c),
        right_arrow,
        (base + t(1, r, c)),
        (base + t(d, 1, c)),
    ],
    sep=2.5,
) / vstrut(1) / hcat([right_arrow, t(d, r, c)], sep=2.5).align_l()
Out[11]:

Matrix multiplication-ish¶

In [12]:
x = minitorch.zeros((3, 2))
y = minitorch.zeros((2, 1, 2))
z = (x * y).sum(2)
z.shape
Out[12]:
(2, 3, 1)
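
To see why this is "matmul-ish", run the recipe in torch with real values. Note that minitorch's sum keeps the reduced dimension, so the torch version needs keepdim=True to match:

import torch

x = torch.arange(6.0).view(3, 2)
y = torch.arange(4.0).view(2, 1, 2)
z = (x * y).sum(2, keepdim=True)  # broadcast to (2, 3, 2), then reduce dim 2
print(z.shape)                    # torch.Size([2, 3, 1])
# The same numbers as an ordinary matrix multiply of the aligned views:
assert torch.equal(z.squeeze(2), y.view(2, 2) @ x.t())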

Matrix multiplication-ish¶

In [13]:
d, r, c = 2, 3, 2
base = t(d, r, c).line_color(drawing.papaya)
d = hcat(
    [
        t(1, r, c),
        t(d, 1, c),
        right_arrow,
        (base + t(1, r, c)),
        (base + t(d, 1, c)),
    ],
    sep=2.5,
) / vstrut(1) / hcat(
    [right_arrow, t(d, r, c, "s"), right_arrow, t(2, 3, 1), right_arrow, matrix(3, 2)],
    sep=2.5,
)
d.connect(("s", 0, 0, 1), ("s", 0, 0, 0)).connect(
    ("s", 0, 1, 1), ("s", 0, 1, 0)
).connect(("s", 0, 2, 1), ("s", 0, 2, 0))
Out[13]:

Implementation¶

Broadcast Implementation¶

  • Never materialize the intermediate broadcast tensor in memory.
  • Instead, keep an implicit map between output space and input space

Broadcast Functions¶

  • shape_broadcast - compute the broadcast shape from two input shapes
  • broadcast_index - map an index in the broadcast shape back to an index in the original shape (sketched below)
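
A sketch of what these two functions compute, written with plain tuples rather than minitorch's exact signatures (the real broadcast_index fills a preallocated index array instead of returning one):

from itertools import zip_longest


def shape_broadcast(shape1, shape2):
    "Compute the broadcast shape, or raise if the shapes are incompatible."
    out = []
    # Walk both shapes from the right, padding the shorter with 1s (Rule 3).
    for a, b in zip_longest(reversed(shape1), reversed(shape2), fillvalue=1):
        if a == 1 or a == b:
            out.append(b)
        elif b == 1:
            out.append(a)
        else:
            raise IndexError(f"cannot broadcast {shape1} with {shape2}")
    return tuple(reversed(out))


def broadcast_index(big_index, big_shape, shape):
    "Map an index in the broadcast shape back into the original shape."
    offset = len(big_shape) - len(shape)
    # Size-1 dims always read position 0 (Rule 1); extra leading
    # broadcast dims simply drop away (Rule 3).
    return tuple(
        big_index[offset + i] if shape[i] > 1 else 0 for i in range(len(shape))
    )


print(shape_broadcast((3, 4, 1), (1, 5)))             # (3, 4, 5)
print(broadcast_index((2, 3, 4), (3, 4, 5), (4, 1)))  # (3, 0)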

Low-level Operations¶

  • map
  • zip
  • reduce

Backends¶

  • Simple backend for debugging
  • CPU implementation
  • GPU implementation
  • ...

Where is the backend?¶

  • Torch: Stored on the tensor

Other Options:

  • Inferred by environment
  • Compiled

Low-level Operations¶

from typing import Callable


class TensorOps:
    @staticmethod
    def map(fn: Callable[[float], float]) -> Callable[[Tensor], Tensor]:
        "Lift a unary scalar function to an elementwise tensor function."
        pass

    @staticmethod
    def zip(fn: Callable[[float, float], float]) -> Callable[[Tensor, Tensor], Tensor]:
        "Lift a binary scalar function to a pairwise tensor function."
        pass

    @staticmethod
    def reduce(
        fn: Callable[[float, float], float], start: float = 0.0
    ) -> Callable[[Tensor, int], Tensor]:
        "Lift a binary scalar function to a reduction along one dimension."
        pass
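
To make the higher-order interface concrete, here is a sketch of a "simple backend" over flat Python lists, deliberately ignoring shapes, strides, and broadcasting (minitorch's real versions loop over strided storage):

def simple_map(fn):
    "Return a function applying fn to every element."
    return lambda t: [fn(v) for v in t]


def simple_zip(fn):
    "Return a function combining two equal-length tensors elementwise."
    return lambda a, b: [fn(x, y) for x, y in zip(a, b)]


def simple_reduce(fn, start=0.0):
    "Return a function folding fn over all elements, seeded with start."
    def _reduce(t):
        acc = start
        for v in t:
            acc = fn(acc, v)
        return acc
    return _reduce


neg_map = simple_map(lambda x: -x)
add_zip = simple_zip(lambda x, y: x + y)
sum_reduce = simple_reduce(lambda x, y: x + y)
print(neg_map([1.0, 2.0]))              # [-1.0, -2.0]
print(add_zip([1.0, 2.0], [3.0, 4.0]))  # [4.0, 6.0]
print(sum_reduce([1.0, 2.0, 3.0]))      # 6.0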

Constructed Operations¶

  • Stored on the tensor (see tensor_ops.py)
self.neg_map = ops.map(operators.neg)
self.sigmoid_map = ops.map(operators.sigmoid)
self.relu_map = ops.map(operators.relu)
self.log_map = ops.map(operators.log)
self.exp_map = ops.map(operators.exp)
self.id_map = ops.map(operators.id)

How to use¶

In [14]:
t1 = minitorch.tensor([1, 2, 3])
t1.f.neg_map(t1)
Out[14]:
[-1.00 -2.00 -3.00]

Implementation Tips¶

  • Map
  • Zip
  • Reduce

Tensor Puzzles¶

Special PyTorch Syntax¶

(Not available in minitorch)

In [15]:
import torch

x = torch.tensor([1, 2, 3])
print(x.shape)
print(x[None].shape)
print(x[:, None].shape)
torch.Size([3])
torch.Size([1, 3])
torch.Size([3, 1])

Tensor Function¶

In [16]:
x = torch.arange(10)
print(x)

y = torch.arange(4)
print(y)
tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
tensor([0, 1, 2, 3])

Tensor Function¶

In [17]:
x = torch.where(
    torch.tensor([True, False]),
    torch.tensor([1, 1]),
    torch.tensor([0, 0]),
)
print(x)
tensor([1, 0])

Tensor Puzzles¶

Tensor Puzzles

Q&A¶