Module 3.0 - Real Neural Networks

Map Gradient
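For an elementwise map $G^{i}(x) = g(x_i)$, the chain rule gives $f'_{x_i}(G(x)) = g'(x_i) * d_i$: the backward pass is just another zip of the local derivative with the incoming gradient $d$.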

Example: Tensor Inversion

  • $G^{i}(x) = 1 / x_i$
  • $G'^{i}_{x_i}(x) = -(x_i)^{-2}$

  • $f'_{x_i}(G(x)) = -(x_i)^{-2} * d_i$

Example: Inv

In [2]:
class Inv(minitorch.Function):
    @staticmethod
    def forward(ctx, t1: Tensor) -> Tensor:
        ctx.save_for_backward(t1)
        return t1.f.inv_map(t1)

    @staticmethod
    def backward(ctx, d: Tensor) -> Tensor:
        (t1,) = ctx.saved_values
        return d.f.inv_back_zip(t1, d)

Example: Multiplication

  • $G^{i}(x, y) = x_i * y_i$
  • $G'^{i}_{x_i}(x, y) = y_i$

  • $f'_{x_i}(G(x, y)) = y_i * d_i$

Example: Mult

In [3]:
class Mul(minitorch.Function):
    @staticmethod
    def forward(ctx, t1: Tensor, t2: Tensor) -> Tensor:
        ctx.save_for_backward(t1, t2)
        return t1.f.mul_zip(t1, t2)

    @staticmethod
    def backward(ctx, d: Tensor) -> Tuple[Tensor, Tensor]:
        (t1, t2) = ctx.saved_values
        return d.f.mul_zip(t2, d), d.f.mul_zip(t1, d)

Example: Sum

  • $G(x) = \sum_i x_i$
  • $G'_{x_i}(x) = 1$

  • $f'_{x_i}(G(x)) = d$

Reduce Gradient
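For a reduce like sum, every input position has local derivative 1, so the backward pass sends the incoming gradient $d$ back to every position of the input. A minimal sketch in the style of Inv and Mul above, assuming the backend's add_reduce and relying on minitorch's autodiff machinery to expand the returned gradient back to the input shape:

class Sum(minitorch.Function):
    @staticmethod
    def forward(ctx, t1: Tensor) -> Tensor:
        # Reduce with + over dimension 0.
        return t1.f.add_reduce(t1, 0)

    @staticmethod
    def backward(ctx, d: Tensor) -> Tensor:
        # dG/dx_i = 1 for every i, so each input position
        # receives the same incoming derivative d.
        return d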

Quiz

Outline

  • Training
  • Simple NLP

Training

Model: Math

$$ \begin{eqnarray*} \text{lin}(x; w, b) &=& x_1 \times w_1 + x_2 \times w_2 + b \\ h_1 &=& \text{ReLU}(\text{lin}(x; w^0, b^0)) \\ h_2 &=& \text{ReLU}(\text{lin}(x; w^1, b^1)) \\ m(x) &=& \text{lin}(h; w, b) \end{eqnarray*} $$

Simple Dataset

In [4]:
split_graph(s1, s2)
Out[4]:

Parameter Fitting

  1. Compute the loss function, $\mathcal{L}(\theta)$
  2. See how small changes to the parameters would change the loss
  3. Update the parameters to locally reduce the loss

Batching

How to Compute Loss

# X - (BATCH, FEATURES)
out = model.forward(X)

# out - (BATCH)
l = loss(out)

# l - (1)
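The loss itself is left abstract above. As one concrete possibility (a sketch, not the slide's definition: the labels y and the exact loss form are assumptions), a sigmoid followed by negative log likelihood over 0/1 labels:

# out - (BATCH) raw scores; y - (BATCH) labels in {0, 1} (assumed)
prob = out.sigmoid()

# Probability the model assigns to the correct label of each example.
correct = prob * y + (-prob + 1.0) * (-y + 1.0)
l = -correct.log().sum()

# l - (1)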

Model: Code

class Network(minitorch.Module):
  def __init__(self):
    ...
    self.layer1 = Linear(FEATURES, HIDDEN)
    self.layer2 = Linear(HIDDEN, HIDDEN)
    self.layer3 = Linear(HIDDEN, 1)
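The constructor above elides forward. A minimal sketch consistent with these layers, assuming ReLU activations as in the math and a final view to drop the trailing dimension of size 1:

  def forward(self, x):
    # x - (BATCH, FEATURES)
    h = self.layer1.forward(x).relu()
    h = self.layer2.forward(h).relu()
    # (BATCH, 1) -> (BATCH)
    return self.layer3.forward(h).view(x.shape[0])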

Layer 1: Weight

Layer 1: Bias

Linear Model

  • Use broadcasting to implement the linear function (see the sketch after this list)
  • Sizes
    • X - (BATCH, FEATURES)
    • weight - (FEATURES, HIDDEN)
    • bias - (HIDDEN)
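One way to write it with views and broadcasting (a sketch; note that minitorch's sum(1) keeps the reduced dimension as size 1, hence the final view):

# X      - (BATCH, FEATURES)
# weight - (FEATURES, HIDDEN)
# bias   - (HIDDEN)
def linear(X, weight, bias):
    batch, features = X.shape
    hidden = weight.shape[1]
    # Zip to (BATCH, FEATURES, HIDDEN), then reduce out FEATURES.
    out = (X.view(batch, features, 1) * weight.view(1, features, hidden)).sum(1)
    return out.view(batch, hidden) + bias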

How Much Does it Cost?

  • Counting the operations
  • Counting the memory
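For the broadcast implementation sketched above, the zip materializes a (BATCH, FEATURES, HIDDEN) intermediate: roughly $\text{BATCH} \times \text{FEATURES} \times \text{HIDDEN}$ multiplications, the same number of additions in the reduce, and that many temporary values held in memory.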

Layer 2: Weights

Compute Derivatives

Step 2

loss.backward()
print(model.layer1.w_1.value.grad)

Layer 1: Weight Grad

Update Parameters

Step 3

for p in model.parameters():
  if p.value.grad is not None:
    p.update(p.value - RATE * (p.value.grad / float(data.N)))
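Putting the three steps together, one pass of training might look like the following (a sketch: the loop structure and EPOCHS are assumptions; loss is the batched loss from above):

for epoch in range(EPOCHS):
    # Step 1: forward pass and loss over the batch.
    out = model.forward(X)
    l = loss(out)

    # Step 2: backpropagate a derivative to every parameter.
    l.backward()

    # Step 3: take a small step against the gradient.
    # (A real loop would also zero gradients between steps.)
    for p in model.parameters():
        if p.value.grad is not None:
            p.update(p.value - RATE * (p.value.grad / float(data.N)))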

Broadcasting

  • Batches
  • Loss computation
  • Linear computation
  • Autodifferentiation
  • Gradient updates

Observations

  • Exactly the same function as Module-1
  • No Python loops over individual tensor elements

Simple NLP

Sentiment Classification

  • Canonical sentence classification problem
  • Given a sentence, predict its sentiment class
  • Key aspect: word polarity

Data

What is a word?

  • Treat each word as an index into the vocabulary
  • Represent it as a one-hot vector

Layer 1

  • The first layer is called an embedding

Hidden vector for a word

Get word vector

# One-hot vector of size VOCAB selecting `word`.
word_one_hot = tensor([0 if i != word else 1
                       for i in range(VOCAB)])

# Broadcast against the embedding weights and reduce out the
# vocabulary dimension to pull out a single word vector.
embedding = (layer1 * word_one_hot).sum(1)
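Multiplying by a one-hot vector and summing simply selects one column of the weight matrix: it is a differentiable way of writing an index lookup. Practical embedding layers index the weights directly instead of materializing this product.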

How does this share information?

  • Similar words get similar embedding vectors
  • Dot product - an easy way to measure similarity
(word_emb1 * word_emb2).sum()
  • Differentiable!

Where do these come from?

  • Trained with a different model
  • Extracted and posted for reuse
  • (Many more details in an NLP class)

Examples

Embeddings

embedding.weights.value.update(pretrained_weights)
  • https://projector.tensorflow.org/

Examples

Query 1

   ^(lisbon|portugal|america|washington|rome|athens|london|england|greece|italy)$

Query 2

   ^(doctor|patient|lawyer|client|clerk|customer|author|reader)$

Sentence Length

  • Examples may be of different lengths
  • All need to be converted to vectors and combined into a fixed-size representation

Value Transformation

  • batch x length x vocab (one-hot words)
  • batch x length x feature (embedded words)
  • batch x feature (pooled sentence vector)
  • batch x hidden (hidden layer)
  • batch (prediction)

Pooling
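A sketch of the pooled pipeline, assuming emb holds the embeddings for a padded batch and mean-pooling over the length dimension (the names emb, linear1, and linear2 are assumptions):

# emb - (batch, length, feature) word embeddings for a padded batch
batch, length, feature = emb.shape
pooled = emb.mean(1).view(batch, feature)       # batch x feature
h = linear1.forward(pooled).relu()              # batch x hidden
out = linear2.forward(h).sigmoid().view(batch)  # batch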

Benefits

  • Extremely simple
  • Embeddings encode key information
  • Have all the tools we need

Full Model

Issues

  • Ignores relative order
  • Ignores absolute order
  • Embeddings for all words, even rare ones