# layer_norms

`optimus_dl.modules.model.blocks.layer_norms`

## LayerNorm

Bases: `Module`

LayerNorm with optional bias.

PyTorch's standard `LayerNorm` always expects a bias if `elementwise_affine` is `True`. This implementation allows a more flexible `bias=False` option, as seen in some LLM architectures.
Attributes:

| Name | Type | Description |
|---|---|---|
| `weight` | `Tensor` | Affine scale parameter. |
| `bias` | `Tensor \| None` | Optional affine bias parameter. |
Source code in optimus_dl/modules/model/blocks/layer_norms.py
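A bias-optional LayerNorm is commonly written as a thin wrapper around `torch.nn.functional.layer_norm`, which accepts `bias=None`. The sketch below is illustrative only; the constructor signature (`dim`, `bias`, `eps`) is an assumption, not necessarily the module's actual API:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class LayerNorm(nn.Module):
    """LayerNorm with an optional bias (sketch, not the actual implementation)."""

    def __init__(self, dim: int, bias: bool = True, eps: float = 1e-5):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))
        # bias=False omits the additive parameter entirely.
        self.bias = nn.Parameter(torch.zeros(dim)) if bias else None

    def forward(self, input: torch.Tensor) -> torch.Tensor:
        # F.layer_norm normalizes over the trailing dims given by weight.shape
        # and simply skips the additive term when bias is None.
        return F.layer_norm(input, self.weight.shape, self.weight, self.bias, self.eps)
```

With `bias=False` the module holds a single learnable parameter (`weight`), which slightly reduces parameter count and matches the convention used by several recent LLMs.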
### forward(input)

Apply layer normalization.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `input` | `Tensor` | Input tensor. | *required* |

Returns:

| Type | Description |
|---|---|
| `Tensor` | Normalized tensor. |
Source code in optimus_dl/modules/model/blocks/layer_norms.py
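Layer normalization standardizes each row of the last dimension to zero mean and unit variance before the affine transform. This can be seen numerically with PyTorch's built-in module (shown here only to illustrate the math):

```python
import torch
import torch.nn as nn

ln = nn.LayerNorm(4)  # weight initialized to ones, bias to zeros
x = torch.tensor([[1.0, 2.0, 3.0, 4.0]])
y = ln(x)

# After normalization each row has mean ~0 and (biased) variance ~1.
print(y.mean(dim=-1), y.var(dim=-1, unbiased=False))
```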
## RMSNorm

Bases: `Module`

Root Mean Square Layer Normalization (RMSNorm).

RMSNorm is a simplification of LayerNorm that only scales the input by the root mean square of the activations, omitting the mean subtraction and bias.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `dim` | `int` | Input dimension. | *required* |
| `eps` | `float` | Small value for numerical stability. | `1e-06` |
| `use_liger` | `bool \| None` | If `True`, uses the high-performance Liger kernel. If `None`, automatically enables it when available. | `None` |
Source code in optimus_dl/modules/model/blocks/layer_norms.py
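The pure-PyTorch fallback for RMSNorm can be sketched as below. This is a minimal illustration of the math, not the module's actual code, and it omits the Liger-kernel dispatch that `use_liger` would control:

```python
import torch
import torch.nn as nn


class RMSNorm(nn.Module):
    """RMSNorm sketch: scale by the root mean square; no centering, no bias."""

    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # rms(x) = sqrt(mean(x^2) + eps); rsqrt fuses the reciprocal and sqrt.
        inv_rms = torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + self.eps)
        return x * inv_rms * self.weight
```

Dropping the mean subtraction saves a reduction pass and, empirically, often matches LayerNorm quality in transformer stacks.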
### forward(x)

Perform the forward pass.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `x` | `Tensor` | Input tensor. | *required* |

Returns:

| Type | Description |
|---|---|
| `Tensor` | RMS-normalized tensor. |
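The practical difference between the two norms is easy to verify: RMSNorm only rescales, so its output generally keeps a nonzero mean, whereas LayerNorm centers first. A small standalone check (plain tensor math, independent of either module):

```python
import torch

x = torch.tensor([[1.0, 2.0, 3.0, 4.0]])

# RMSNorm: scale only -- the output mean stays nonzero for this input.
y_rms = x * torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + 1e-6)

# LayerNorm (affine params omitted): center, then scale -- output mean ~0.
y_ln = (x - x.mean(dim=-1, keepdim=True)) / torch.sqrt(
    x.var(dim=-1, unbiased=False, keepdim=True) + 1e-6
)
```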