utils
optimus_dl.modules.model.presets.utils
Utility functions for loading Hugging Face models.
WeightMapper
Helper to map and copy weights from HF state dict to local model.
Source code in optimus_dl/modules/model/presets/utils.py
copy(src_key, dest_key, permute=False, n_heads=None, head_dim=None, transpose=False, dim=0)
Copy weight from HF state dict to local state dict.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `src_key` | `str` | Key in HF state dict. | required |
| `dest_key` | `str` | Key in local state dict. | required |
| `permute` | `bool` | If True, permutes the weight for RoPE. | `False` |
| `n_heads` | `int \| None` | Number of heads (required for permutation). | `None` |
| `head_dim` | `int \| None` | Dimension per head (required for permutation). | `None` |
| `transpose` | `bool` | If True, transposes the weight. | `False` |
| `dim` | `int` | Dimension to permute (0 for output, 1 for input). | `0` |
validate(tie_word_embeddings=False, ignore_patterns=None)
Validate that all expected keys were loaded.
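The copy-then-validate workflow can be illustrated with a minimal, self-contained sketch. `TinyMapper` below is a hypothetical stand-in, not the library's `WeightMapper`: its constructor, the use of nested lists instead of tensors, the `missing` helper, and the choice of regular expressions for `ignore_patterns` are all assumptions made for illustration. The `tie_word_embeddings` flag is not modeled.

```python
import re


class TinyMapper:
    """Hypothetical stand-in for a weight mapper (NOT the real WeightMapper)."""

    def __init__(self, hf_state, local_state):
        self.hf_state = hf_state        # source weights, keyed by HF names
        self.local_state = local_state  # destination weights, keyed by local names
        self.loaded = set()             # destination keys filled so far

    def copy(self, src_key, dest_key, transpose=False):
        """Copy one weight and record that dest_key has been loaded."""
        w = self.hf_state[src_key]
        if transpose:
            # Transpose a 2-D weight stored as a list of rows.
            w = [list(col) for col in zip(*w)]
        self.local_state[dest_key] = w
        self.loaded.add(dest_key)

    def missing(self, ignore_patterns=None):
        """Return local keys never written, skipping ignored patterns."""
        # Assumption: patterns are regular expressions matched with search().
        patterns = [re.compile(p) for p in (ignore_patterns or [])]
        return sorted(
            key
            for key in self.local_state
            if key not in self.loaded
            and not any(p.search(key) for p in patterns)
        )


hf_state = {"a.weight": [[1, 2], [3, 4]]}
local_state = {"x.weight": None, "y.bias": None, "rope.inv_freq": None}

mapper = TinyMapper(hf_state, local_state)
mapper.copy("a.weight", "x.weight", transpose=True)

print(mapper.missing())
print(mapper.missing(ignore_patterns=[r"inv_freq$"]))
```

Tracking loaded keys at copy time keeps the validation step a simple set comparison, which is one plausible reason to route every copy through a single helper.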
permute_rope_weight(w, n_heads, head_dim, interleaved=True)
Permute weights for Rotary Positional Embeddings.
HF typically uses a half-half split within each head (the first half of `head_dim` is cos, the second half is sin), whereas Optimus-DL uses an interleaved layout (cos, sin, cos, sin, ...).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `w` | `Tensor` | Weight tensor of shape `(n_heads * head_dim, input_dim)` or `(n_heads * head_dim,)`. | required |
| `n_heads` | `int` | Number of attention heads. | required |
| `head_dim` | `int` | Dimension of each head. | required |
| `interleaved` | `bool` | If True, permutes to interleaved format. If False, returns as is. | `True` |
Returns:
| Type | Description |
|---|---|
| `Tensor` | Permuted weight tensor. |
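The half-split to interleaved reordering can be sketched as a row-index permutation. The example below is a pure-Python illustration under stated assumptions, not the library's implementation: `interleaved_row_order` and `permute_rows` are hypothetical names, rows are plain list items rather than tensor rows, and the sketch assumes that element `i` of the first half pairs with element `i` of the second half within each head.

```python
def interleaved_row_order(n_heads: int, head_dim: int) -> list[int]:
    """Source row indices that turn a half-split layout into an interleaved one."""
    if head_dim % 2 != 0:
        raise ValueError("head_dim must be even")
    half = head_dim // 2
    order = []
    for head in range(n_heads):
        base = head * head_dim  # first row of this head
        for i in range(half):
            order.append(base + i)         # i-th row of the first half
            order.append(base + half + i)  # matching row of the second half
    return order


def permute_rows(rows: list, n_heads: int, head_dim: int) -> list:
    """Reorder rows (hypothetical helper; a tensor version would index dim 0)."""
    if len(rows) != n_heads * head_dim:
        raise ValueError("expected n_heads * head_dim rows")
    return [rows[j] for j in interleaved_row_order(n_heads, head_dim)]


# One head with head_dim=4: first half is (a0, a1), second half is (b0, b1).
print(permute_rows(["a0", "a1", "b0", "b1"], n_heads=1, head_dim=4))
# -> ['a0', 'b0', 'a1', 'b1']
```

With a tensor, the same order could be applied by indexing along the dimension being permuted, which corresponds to the `dim` argument of `copy` (0 for output rows, 1 for input columns).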
update_config_from_hf(optimus_cfg, hf_config, head_dim_fallback=None)
Update Optimus-DL config from HF config with common attributes.
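As a rough sketch of what such a config update might look like, the example below copies a few common HF config attributes onto a local config object. Everything on the local side is an assumption: `update_from_hf` is a hypothetical function, the destination field names (`n_layers`, `n_heads`, and so on) are invented for illustration, and the rule that `head_dim_fallback` applies only when the HF config has no `head_dim` is a guess. Only the HF attribute names (`vocab_size`, `hidden_size`, `num_hidden_layers`, `num_attention_heads`) are standard.

```python
from types import SimpleNamespace


def update_from_hf(cfg, hf_config, head_dim_fallback=None):
    """Hypothetical sketch: copy common attributes from an HF-style config."""
    # HF attribute name -> local field name (local names are assumptions).
    mapping = {
        "vocab_size": "vocab_size",
        "hidden_size": "hidden_size",
        "num_hidden_layers": "n_layers",
        "num_attention_heads": "n_heads",
    }
    for hf_name, local_name in mapping.items():
        if hasattr(hf_config, hf_name):
            setattr(cfg, local_name, getattr(hf_config, hf_name))

    # Assumption: prefer an explicit head_dim, else use the fallback if given.
    head_dim = getattr(hf_config, "head_dim", None)
    if head_dim is None:
        head_dim = head_dim_fallback
    if head_dim is not None:
        cfg.head_dim = head_dim
    return cfg


hf_config = SimpleNamespace(
    vocab_size=32000, hidden_size=64, num_hidden_layers=2, num_attention_heads=4
)
cfg = update_from_hf(SimpleNamespace(), hf_config, head_dim_fallback=16)
print(cfg.n_heads, cfg.head_dim)
```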