Tensors
We now have a fully developed autodifferentiation system built around
scalars. This system is correct, but you saw during training that it
is inefficient: every scalar number requires building an object, and
each operation requires storing a graph of all the values that we
have previously created. Training repeats these operations over and
over, and running a model, such as a linear model, requires a for
loop over every term in the network.
This module introduces and implements a tensor object that will solve these problems. Tensors group together many repeated operations to save Python overhead and to pass off grouped operations to faster implementations.
Guides
For this module we have implemented the skeleton tensor.py
file for
you. This is similar in spirit to scalar.py
from the last
assignment. Before starting, it is worth reading through this file to
have a sense of what a Tensor does. Each of the following tasks asks
you to implement the methods this file relies on:
- tensor_data.py: Indexing, strides, and storage
- tensor_ops.py: Higher-order tensor operations
- tensor_functions.py: Autodifferentiation-ready functions
Tasks 2.1: Tensor Data - Indexing
The MiniTorch library implements the core tensor backend as
minitorch.TensorData. This class handles indexing, storage,
transposition, and low-level details such as strides. You will first
implement these core functions before turning to the user-facing
class minitorch.Tensor.
Todo
Complete the following functions in minitorch/tensor_data.py
, and pass
tests marked as task2_1
.
minitorch.index_to_position(index: Index, strides: Strides) -> int
Converts a multidimensional tensor index into a single-dimensional
position in storage based on strides.
Parameters:
- index – index tuple of ints
- strides – tensor strides
Returns:
- int – position in storage
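As a sketch of the stride arithmetic (using plain Python sequences in place of the project's Index/Strides aliases), the position is simply the dot product of the index with the strides:

```python
from typing import Sequence

def index_to_position(index: Sequence[int], strides: Sequence[int]) -> int:
    # Position in flat storage = sum of index[i] * strides[i] over all dims.
    return sum(i * s for i, s in zip(index, strides))

# A contiguous (2, 3) tensor has strides (3, 1), so element (1, 2)
# lives at position 1 * 3 + 2 * 1 = 5 in storage.
print(index_to_position((1, 2), (3, 1)))
```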
minitorch.to_index(ordinal: int, shape: Shape, out_index: OutIndex) -> None
Convert an ordinal to an index in the shape.
Should ensure that enumerating position 0 ... size of a
tensor produces every index exactly once. It
may not be the inverse of index_to_position.
Parameters:
- ordinal (int) – ordinal position to convert.
- shape – tensor shape.
- out_index – return index corresponding to position.
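One valid enumeration is row-major order, with the last dimension varying fastest; this choice is a sketch on our part, since the docstring only requires that every index is produced exactly once:

```python
from typing import MutableSequence, Sequence

def to_index(ordinal: int, shape: Sequence[int], out_index: MutableSequence[int]) -> None:
    # Peel off the last dimension first (row-major enumeration).
    for i in range(len(shape) - 1, -1, -1):
        out_index[i] = ordinal % shape[i]
        ordinal //= shape[i]

out = [0, 0]
to_index(5, (2, 3), out)  # ordinal 5 in a (2, 3) tensor -> index [1, 2]
```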
minitorch.tensor_data.TensorData.permute(*order: int) -> TensorData
Permute the dimensions of the tensor.
Parameters:
- *order (int) – a permutation of the dimensions
Returns:
- TensorData – new TensorData with the same storage and a new dimension order.
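The key idea is that permuting only reorders shape and strides together; the storage is never copied. A standalone sketch on plain tuples (rather than TensorData) makes this visible:

```python
def permute(shape, strides, order):
    # Reorder shape and strides in lockstep; storage stays exactly as it was.
    return tuple(shape[o] for o in order), tuple(strides[o] for o in order)

# Transposing a contiguous (2, 3) tensor:
new_shape, new_strides = permute((2, 3), (3, 1), (1, 0))
# new_shape == (3, 2), new_strides == (1, 3)
```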
Tasks 2.2: Tensor Broadcasting
Todo
Complete the following functions in minitorch/tensor_data.py
and pass tests marked as task2_2.
minitorch.shape_broadcast(shape1: UserShape, shape2: UserShape) -> UserShape
Broadcast two shapes to create a new union shape.
Parameters:
- shape1 – first shape
- shape2 – second shape
Returns:
- UserShape – broadcasted shape
Raises:
- IndexingError – if the shapes cannot be broadcast
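A sketch of the NumPy-style rule: right-align the two shapes, pad the shorter with leading 1s, and let any size-1 dimension stretch to match. (The real function should raise the project's IndexingError; this standalone version uses ValueError.)

```python
def shape_broadcast(shape1, shape2):
    a, b = list(shape1), list(shape2)
    # Pad the shorter shape with leading 1s so both align on the right.
    while len(a) < len(b):
        a.insert(0, 1)
    while len(b) < len(a):
        b.insert(0, 1)
    out = []
    for d1, d2 in zip(a, b):
        if d1 == d2 or d1 == 1 or d2 == 1:
            out.append(max(d1, d2))  # a size-1 dim stretches to match
        else:
            raise ValueError(f"cannot broadcast {shape1} with {shape2}")
    return tuple(out)

print(shape_broadcast((2, 1, 3), (4, 3)))  # (2, 4, 3)
```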
minitorch.broadcast_index(big_index: Index, big_shape: Shape, shape: Shape, out_index: OutIndex) -> None
Convert a big_index into big_shape to a smaller out_index
into shape following broadcasting rules. big_shape may be
larger or have more dimensions than the given shape;
additional dimensions may need to be mapped to 0 or removed.
Parameters:
- big_index – multidimensional index of bigger tensor
- big_shape – tensor shape of bigger tensor
- shape – tensor shape of smaller tensor
- out_index – multidimensional index of smaller tensor
Returns:
- None
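A sketch of the mapping: align the trailing dimensions of the two shapes, drop the extra leading dimensions of the big shape, and send any dimension that was stretched from size 1 back to index 0:

```python
def broadcast_index(big_index, big_shape, shape, out_index):
    # Extra leading dims of big_shape have no counterpart in shape.
    offset = len(big_shape) - len(shape)
    for i in range(len(shape)):
        # Dimensions of size 1 were stretched by broadcasting,
        # so every big index maps back to 0 there.
        out_index[i] = big_index[i + offset] if shape[i] > 1 else 0

out = [0, 0]
broadcast_index((1, 3, 2), (2, 4, 3), (1, 3), out)  # out becomes [0, 2]
```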
Tasks 2.3: Tensor Operations
Tensor operations apply high-level, higher-order operations to all
elements in a tensor simultaneously. In particular, you can map,
zip, and reduce tensor data objects together. On top of this
foundation, we can build up a Function
class for Tensor, similar to
what we did for ScalarFunction. In this task, you will first
implement generic tensor operations and then use them to implement
forward
for specific operations.
We have built a debugging tool for you to observe the workings of
your expressions and see how the graph is built. You can alter the
expression in Streamlit to view the graph:

y = x * z + 10.0

>>> python project/show_expression.py
Todo
Add functions in minitorch/tensor_ops.py
and
minitorch/tensor_functions.py
for each of the following, and pass tests
marked as task2_3
.
minitorch.tensor_ops.tensor_map(fn: Callable[[float], float]) -> Callable[[Storage, Shape, Strides, Storage, Shape, Strides], None]
Low-level implementation of tensor map between tensors with possibly different strides.
Simple version:
- Fill in the out array by applying fn to each value of in_storage, assuming out_shape and in_shape are the same size.
Broadcasted version:
- Fill in the out array by applying fn to each value of in_storage, assuming out_shape and in_shape broadcast (in_shape must be smaller than out_shape).
Parameters:
- fn (Callable[[float], float]) – float-to-float function to apply
Returns:
- Callable[[Storage, Shape, Strides, Storage, Shape, Strides], None] – Tensor map function.
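A stride-aware sketch of the simple (non-broadcasted) version, assuming in and out share a shape; it inlines the to_index/index_to_position logic rather than importing from minitorch:

```python
def tensor_map(fn):
    def _map(out, out_shape, out_strides, in_storage, in_shape, in_strides):
        # Total number of elements to visit.
        n = 1
        for d in out_shape:
            n *= d
        index = [0] * len(out_shape)
        for ordinal in range(n):
            # Ordinal -> multidimensional index (row-major).
            rem = ordinal
            for i in range(len(out_shape) - 1, -1, -1):
                index[i] = rem % out_shape[i]
                rem //= out_shape[i]
            # Index -> storage positions, via each tensor's own strides.
            out_pos = sum(i * s for i, s in zip(index, out_strides))
            in_pos = sum(i * s for i, s in zip(index, in_strides))
            out[out_pos] = fn(in_storage[in_pos])
    return _map

neg = tensor_map(lambda x: -x)
out = [0.0] * 6
neg(out, (2, 3), (3, 1), [1.0, 2.0, 3.0, 4.0, 5.0, 6.0], (2, 3), (3, 1))
# out becomes [-1.0, -2.0, -3.0, -4.0, -5.0, -6.0]
```

The broadcasted version adds one step: convert the out index to an in index with broadcast_index before computing in_pos.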
minitorch.tensor_ops.tensor_zip(fn: Callable[[float, float], float]) -> Callable[[Storage, Shape, Strides, Storage, Shape, Strides, Storage, Shape, Strides], None]
Low-level implementation of tensor zip between tensors with possibly different strides.
Simple version:
- Fill in the out array by applying fn to each pair of values from a_storage and b_storage, assuming out_shape and a_shape are the same size.
Broadcasted version:
- Fill in the out array by applying fn to each pair of values from a_storage and b_storage, assuming a_shape and b_shape broadcast to out_shape.
Parameters:
- fn (Callable[[float, float], float]) – function mapping two floats to a float
Returns:
- Callable[[Storage, Shape, Strides, Storage, Shape, Strides, Storage, Shape, Strides], None] – Tensor zip function.
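For intuition only, here is the degenerate case where all three tensors are contiguous with identical strides, so storage positions coincide with ordinals; the full version must compute positions through each tensor's strides (and broadcast indices), exactly as in tensor_map:

```python
def tensor_zip(fn):
    def _zip(out, out_shape, out_strides,
             a_storage, a_shape, a_strides,
             b_storage, b_shape, b_strides):
        # Simplifying assumption: identical contiguous layouts, so the
        # ordinal IS the storage position for all three tensors.
        for pos in range(len(out)):
            out[pos] = fn(a_storage[pos], b_storage[pos])
    return _zip

add = tensor_zip(lambda a, b: a + b)
out = [0.0] * 3
add(out, (3,), (1,),
    [1.0, 2.0, 3.0], (3,), (1,),
    [10.0, 20.0, 30.0], (3,), (1,))
# out becomes [11.0, 22.0, 33.0]
```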
minitorch.tensor_ops.tensor_reduce(fn: Callable[[float, float], float]) -> Callable[[Storage, Shape, Strides, Storage, Shape, Strides, int], None]
Low-level implementation of tensor reduce.
out_shape will be the same as a_shape, except with reduce_dim turned to size 1.
Parameters:
- fn (Callable[[float, float], float]) – reduction function mapping two floats to a float
Returns:
- Callable[[Storage, Shape, Strides, Storage, Shape, Strides, int], None] – Tensor reduce function.
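A sketch of the accumulation pattern: visit every element of a, zero out the reduced dimension in its index, and fold it into the matching out position. This sketch assumes out is pre-filled with the reduction's start value:

```python
def tensor_reduce(fn):
    def _reduce(out, out_shape, out_strides,
                a_storage, a_shape, a_strides, reduce_dim):
        n = 1
        for d in a_shape:
            n *= d
        index = [0] * len(a_shape)
        for ordinal in range(n):
            # Ordinal -> index in a (row-major).
            rem = ordinal
            for i in range(len(a_shape) - 1, -1, -1):
                index[i] = rem % a_shape[i]
                rem //= a_shape[i]
            a_pos = sum(i * s for i, s in zip(index, a_strides))
            # Zero the reduced dimension to land on the out index.
            index[reduce_dim] = 0
            out_pos = sum(i * s for i, s in zip(index, out_strides))
            out[out_pos] = fn(out[out_pos], a_storage[a_pos])
    return _reduce

summed = tensor_reduce(lambda a, b: a + b)
out = [0.0, 0.0]  # shape (2, 1), strides (1, 1), start value 0.0 for sum
summed(out, (2, 1), (1, 1),
       [1.0, 2.0, 3.0, 4.0, 5.0, 6.0], (2, 3), (3, 1), 1)
# out becomes [6.0, 15.0]: each row of the (2, 3) tensor summed
```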
minitorch.tensor_functions.Mul.forward(ctx: Context, a: Tensor, b: Tensor) -> Tensor
staticmethod
minitorch.tensor_functions.Sigmoid.forward(ctx: Context, t1: Tensor) -> Tensor
staticmethod
minitorch.tensor_functions.ReLU.forward(ctx: Context, t1: Tensor) -> Tensor
staticmethod
minitorch.tensor_functions.Log.forward(ctx: Context, t1: Tensor) -> Tensor
staticmethod
minitorch.tensor_functions.Exp.forward(ctx: Context, t1: Tensor) -> Tensor
staticmethod
minitorch.tensor_functions.LT.forward(ctx: Context, a: Tensor, b: Tensor) -> Tensor
staticmethod
minitorch.tensor_functions.EQ.forward(ctx: Context, a: Tensor, b: Tensor) -> Tensor
staticmethod
minitorch.tensor_functions.Permute.forward(ctx: Context, a: Tensor, order: Tensor) -> Tensor
staticmethod
minitorch.tensor_functions.IsClose.forward(ctx: Context, a: Tensor, b: Tensor) -> Tensor
staticmethod
Tasks 2.4: Gradients and Autograd
Similar to minitorch.Scalar
, minitorch.Tensor
is a Variable that supports autodifferentiation. In this task, you
will implement backward
functions for tensor operations.
Todo
Complete the following functions in minitorch/tensor_functions.py
, and pass
tests marked as task2_4
.
minitorch.tensor_functions.Mul.backward(ctx: Context, grad_output: Tensor) -> Tuple[Tensor, Tensor]
staticmethod
minitorch.tensor_functions.Sigmoid.backward(ctx: Context, grad_output: Tensor) -> Tensor
staticmethod
minitorch.tensor_functions.ReLU.backward(ctx: Context, grad_output: Tensor) -> Tensor
staticmethod
minitorch.tensor_functions.Log.backward(ctx: Context, grad_output: Tensor) -> Tensor
staticmethod
minitorch.tensor_functions.Exp.backward(ctx: Context, grad_output: Tensor) -> Tensor
staticmethod
minitorch.tensor_functions.LT.backward(ctx: Context, grad_output: Tensor) -> Tuple[Tensor, Tensor]
staticmethod
minitorch.tensor_functions.EQ.backward(ctx: Context, grad_output: Tensor) -> Tuple[Tensor, Tensor]
staticmethod
minitorch.tensor_functions.Permute.backward(ctx: Context, grad_output: Tensor) -> Tuple[Tensor, float]
staticmethod
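For intuition, here are the chain-rule formulas behind two of these, sketched with plain floats standing in for elementwise tensor operations; the real versions apply the same formulas through your tensor ops, using values saved in ctx during forward:

```python
def mul_backward(a, b, grad_output):
    # d(a*b)/da = b and d(a*b)/db = a, each scaled by the incoming gradient.
    return grad_output * b, grad_output * a

def sigmoid_backward(sig, grad_output):
    # If forward saved sig = sigmoid(t1), then d(sig)/dt1 = sig * (1 - sig).
    return sig * (1.0 - sig) * grad_output

print(mul_backward(2.0, 3.0, 1.0))   # (3.0, 2.0)
print(sigmoid_backward(0.5, 1.0))    # 0.25
```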
Task 2.5: Training
If your code works, you should now be able to move on to the tensor
training script in project/run_tensor.py. This code runs the same
basic training setup as in module1, but now utilizes your tensor
code.
Todo
- Implement the missing forward functions in project/run_tensor.py. They should do exactly the same thing as the corresponding functions in project/run_scalar.py, but now use the tensor code base.
- Train a tensor model and add your results for all datasets to the README.
- Record the time per epoch reported by the trainer. (It is okay if it is slow.)