========
Modules
========

Researchers often disagree on exactly what the term `deep` learning
means, but one aspect that everyone agrees on is that deep models are
big and complex. Common models can include hundreds of millions of
learned `parameters` spanning hundreds of informal `module` groups. To
work with such complex systems, it is important to have data structures
that abstract away this complexity, making it easier to access and
manipulate specific components and to group together shared regions.

On the programming side, `Modules` have become a popular paradigm for
grouping parameters together so that they are easy to manage, access,
and address. There is nothing specific to machine learning about this
setup (and everything in MiniTorch could be done without modules), but
modules make life easier and keep code organized.

First, let's define a Parameter. For now, think of a Parameter simply
as a holder: a special object that stores a value.

.. autoclass:: minitorch.Parameter

Parameters become more interesting when they are grouped with
`Modules`. Modules provide a way of storing and finding these
parameters. Let's look at the `Module` class to see how this works.

Modules form a recursive, tree-shaped data structure. Each module can
store three things: 1) parameters, 2) non-parameter data, and 3) other
modules. The user stores each of these directly on `self`, and the
module spies under the hood to determine the type of each assignment.

Here is an example of the simplest usage of a module:

.. jupyter-execute::

    from minitorch import *

    class OtherModule(Module):
        pass

    class MyModule(Module):
        def __init__(self, arg):
            # Initialize the super-class (so it can spy).
            super().__init__()

            # A parameter member (subclass of Parameter).
            self.parameter1 = Parameter(15)

            # A non-parameter member.
            self.data = 25

            # A module member (subclass of Module).
            self.sub_module = OtherModule(arg, arg + 10)

.. warning::
   All subclasses must begin their initialization by calling ::

       super().__init__()

   This allows the module to capture any members of type
   :class:`Module` or :class:`Parameter` and store them in special
   dictionaries.

Internally, parameters (type 1) are stored in :attr:`_parameters`,
non-parameter data (type 2) is stored on `self`, and modules (type 3)
are stored in :attr:`_modules`.

The main benefit of this infrastructure is that it allows us to
`flatten` a module to get all of its parameters using
:func:`named_parameters`. This function returns a dictionary of all of
the parameters in the module and in all descendant sub-modules. The
keys of the dictionary give the path to each parameter in the tree
(similar to Python dot notation). Critically, this function does not
just return the current module's parameters: it recursively collects
the parameters of all sub-modules below it as well.

Here is an example of how you can create a tree of modules and then
extract the flattened parameters:

.. jupyter-execute::

    class Module1(Module):
        def __init__(self):
            super().__init__()
            self.p1 = Parameter(5)
            self.a = Module2()
            self.b = Module3()

    class Module2(Module):
        def __init__(self):
            super().__init__()
            self.p2 = Parameter(10)

    class Module3(Module):
        def __init__(self):
            super().__init__()
            self.c = Module4()

    class Module4(Module):
        def __init__(self):
            super().__init__()
            self.p3 = Parameter(15)

    np = Module1().named_parameters()
    assert np["b.c.p3"].value == 15

.. image:: figs/Module/module.png
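To make the recursion concrete, here is a minimal sketch of how such a
flattening could be written. This is not the MiniTorch implementation;
it only assumes, as described above, that a module keeps its own
parameters in :attr:`_parameters` and its child modules in
:attr:`_modules`.

.. code-block:: python

    # A minimal sketch (not the MiniTorch implementation) of recursive
    # parameter flattening over a module tree.
    class SketchModule:
        def __init__(self):
            self._parameters = {}   # name -> parameter (type 1)
            self._modules = {}      # name -> child module (type 3)

        def named_parameters(self):
            # Start with this module's own parameters.
            flat = dict(self._parameters)
            # Recurse into each child, prefixing every key with the
            # child's name to build the dotted path in the tree.
            for name, child in self._modules.items():
                for sub_name, param in child.named_parameters().items():
                    flat[name + "." + sub_name] = param
            return flat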
Additionally, a module can have a :attr:`mode` indicating how it is
currently operating. The mode should propagate to all of its child
modules. For simplicity, we only consider the train and eval modes.

.. autoclass:: minitorch.Module
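For intuition, here is a similar sketch of how a mode flag could
propagate down the tree. Again, this is only an illustration under the
same assumption that child modules live in :attr:`_modules`; the helper
names ``train``, ``eval``, and ``_set_mode`` are hypothetical.

.. code-block:: python

    # A sketch (not the MiniTorch implementation) of mode propagation.
    class SketchModule:
        def __init__(self):
            self._modules = {}    # name -> child module
            self.mode = "train"   # current mode of this module

        def train(self):
            self._set_mode("train")

        def eval(self):
            self._set_mode("eval")

        def _set_mode(self, mode):
            # Set this module's mode, then recursively update every
            # child so the whole subtree ends up in the same mode.
            self.mode = mode
            for child in self._modules.values():
                child._set_mode(mode)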