Module flowcon.nn.nets.extended_basic_nets

Classes

class ExtendedLinear (in_features: int, out_features: int, bias: bool = True, device=None, dtype=None)

Applies an affine linear transformation to the incoming data: y = xA^T + b.

This module supports TensorFloat32.

On certain ROCm devices, when using float16 inputs this module will use different precision for backward.

Args

in_features
size of each input sample
out_features
size of each output sample
bias
If set to False, the layer will not learn an additive bias. Default: True

Shape

  • Input: (*, H_in) where * means any number of dimensions, including none, and H_in = in_features.
  • Output: (*, H_out) where all but the last dimension are the same shape as the input and H_out = out_features.

Attributes

weight
the learnable weights of the module of shape (out_features, in_features). The values are initialized from U(-sqrt(k), sqrt(k)), where k = 1 / in_features
bias
the learnable bias of the module of shape (out_features). If bias is True, the values are initialized from U(-sqrt(k), sqrt(k)), where k = 1 / in_features

Examples::

>>> m = nn.Linear(20, 30)
>>> input = torch.randn(128, 20)
>>> output = m(input)
>>> print(output.size())
torch.Size([128, 30])

Initialize internal Module state, shared by both nn.Module and ScriptModule.

Expand source code
class ExtendedLinear(nn.Linear):
    def build_clone(self):
        # Return a plain nn.Linear carrying detached, non-trainable copies of
        # this layer's weight and (if present) bias.
        with torch.no_grad():
            weight = self.weight.detach().requires_grad_(False)
            # weight = self.compute_weight(update=False).detach().requires_grad_(False)
            if self.bias is not None:
                bias = self.bias.detach().requires_grad_(False)
            m = nn.Linear(self.in_features, self.out_features, bias=self.bias is not None, device=self.weight.device)
            m.weight.data.copy_(weight)
            if self.bias is not None:
                m.bias.data.copy_(bias)
            return m

    def build_jvp_net(self, x):
        '''
        Bias is omitted in contrast to self.build_clone().
        '''
        with torch.no_grad():
            # weight = self.compute_weight(update=False).detach().requires_grad_(False)
            weight = self.weight.detach().requires_grad_(False)
            m = nn.Linear(self.in_features, self.out_features, bias=None, device=self.weight.device)
            m.weight.data.copy_(weight)
            return m, self.forward(x).detach().clone()

Ancestors

  • torch.nn.modules.linear.Linear
  • torch.nn.modules.module.Module

Class variables

var in_features : int
var out_features : int
var weight : torch.Tensor

Methods

def build_clone(self)
def build_jvp_net(self, x)

Bias is omitted in contrast to self.build_clone().
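
Usage example (a minimal sketch; the import path follows the module header above)::

>>> import torch
>>> from flowcon.nn.nets.extended_basic_nets import ExtendedLinear
>>> layer = ExtendedLinear(20, 30)
>>> x = torch.randn(128, 20)
>>> clone = layer.build_clone()          # plain nn.Linear with detached parameters
>>> torch.allclose(clone(x), layer(x))
True
>>> jvp_net, y = layer.build_jvp_net(x)  # bias-free copy plus detached forward output
>>> y.size()
torch.Size([128, 30])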

class ExtendedSequential (*args)

A sequential container.

Modules will be added to it in the order they are passed in the constructor. Alternatively, an OrderedDict of modules can be passed in. The forward() method of Sequential accepts any input and forwards it to the first module it contains. It then "chains" outputs to inputs sequentially for each subsequent module, finally returning the output of the last module.

The value a Sequential provides over manually calling a sequence of modules is that it allows treating the whole container as a single module, such that performing a transformation on the Sequential applies to each of the modules it stores (which are each a registered submodule of the Sequential).

What's the difference between a Sequential and a torch.nn.ModuleList? A ModuleList is exactly what it sounds like: a list for storing Modules! On the other hand, the layers in a Sequential are connected in a cascading way.

Example::

# Using Sequential to create a small model. When model is run,
# input will first be passed to Conv2d(1,20,5). The output of
# Conv2d(1,20,5) will be used as the input to the first
# ReLU; the output of the first ReLU will become the input
# for Conv2d(20,64,5). Finally, the output of
# Conv2d(20,64,5) will be used as input to the second ReLU.
model = nn.Sequential(
          nn.Conv2d(1,20,5),
          nn.ReLU(),
          nn.Conv2d(20,64,5),
          nn.ReLU()
        )

# Using Sequential with OrderedDict. This is functionally the
# same as the above code
model = nn.Sequential(OrderedDict([
          ('conv1', nn.Conv2d(1,20,5)),
          ('relu1', nn.ReLU()),
          ('conv2', nn.Conv2d(20,64,5)),
          ('relu2', nn.ReLU())
        ]))

Initialize internal Module state, shared by both nn.Module and ScriptModule.

Expand source code
class ExtendedSequential(nn.Sequential):
    def build_clone(self):
        # Clone each submodule via its own build_clone() and wrap the results
        # in a new ExtendedSequential.
        modules = []
        for m in self:
            modules.append(m.build_clone())
        return ExtendedSequential(*modules)

    def build_jvp_net(self, *args):
        # Chain per-module JVP nets: each submodule returns a (bias-free) copy
        # together with its forward output, which feeds the next submodule.
        with torch.no_grad():
            modules = []
            y = args
            for m in self:
                jvp_net_and_y = m.build_jvp_net(*y)
                jvp_net = jvp_net_and_y[0]
                y = jvp_net_and_y[1:]
                modules.append(jvp_net)
            return ExtendedSequential(*modules), *y

Ancestors

  • torch.nn.modules.container.Sequential
  • torch.nn.modules.module.Module

Methods

def build_clone(self)
def build_jvp_net(self, *args)
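
Usage example (a minimal sketch; the import path follows the module header above)::

>>> import torch
>>> from flowcon.nn.nets.extended_basic_nets import ExtendedLinear, ExtendedSequential
>>> net = ExtendedSequential(ExtendedLinear(20, 30), ExtendedLinear(30, 10))
>>> x = torch.randn(128, 20)
>>> clone = net.build_clone()            # ExtendedSequential of detached per-module clones
>>> torch.allclose(clone(x), net(x))
True
>>> jvp_net, y = net.build_jvp_net(x)    # bias-free stack plus detached final activations
>>> y.size()
torch.Size([128, 10])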