common
optimus_dl.modules.metrics.common
Common meter implementations and logging utilities.
This module provides standard meter types (averages, sums, frequencies, etc.) and convenience functions for logging meter values during training. All meters support distributed aggregation and checkpointing.
AverageMeter
Bases: BaseMeter
Meter that computes a weighted average of logged values.
This meter accumulates weighted values and computes the average when
compute() is called. Useful for values like loss, accuracy, etc.
that should be averaged over batches.
Attributes:
| Name | Type | Description |
|---|---|---|
| round | | Number of decimal places to round the result to (None = no rounding). |
| sum | | Accumulated sum of (value * weight). |
| count | | Accumulated sum of weights. |
Example
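The following is a minimal sketch of the weighted-average bookkeeping described above, not the library's own example. The class name `WeightedAverage`, the state keys `"sum"` and `"count"`, and the value returned by an empty meter are assumptions made for illustration.

```python
from typing import Any, Dict, Optional


class WeightedAverage:
    """Hypothetical stand-in for a weighted-average meter (not the library class)."""

    def __init__(self, round: Optional[int] = None) -> None:
        self.round = round
        self.sum = 0.0    # accumulated sum of value * weight
        self.count = 0.0  # accumulated sum of weights

    def log(self, value: float, weight: float) -> None:
        self.sum += value * weight
        self.count += weight

    def state_dict(self) -> Dict[str, Any]:
        # Assumed key names; the real state layout is not shown in this page.
        return {"sum": self.sum, "count": self.count}

    def merge(self, other_state: Dict[str, Any]) -> None:
        # Merging adds the raw accumulators, so the result stays a true weighted mean.
        self.sum += other_state["sum"]
        self.count += other_state["count"]

    def compute(self) -> float:
        if self.count == 0:
            return 0.0  # assumption: an empty meter reports 0.0
        avg = self.sum / self.count
        return avg if self.round is None else round(avg, self.round)


rank0 = WeightedAverage(round=3)
rank0.log(2.0, weight=1)
rank0.log(4.0, weight=3)

rank1 = WeightedAverage()
rank1.log(1.0, weight=4)

rank0.merge(rank1.state_dict())
merged_average = rank0.compute()
print(merged_average)
```

Merging accumulators rather than per-rank averages keeps the result correct when ranks see different total weights.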
Source code in optimus_dl/modules/metrics/common.py
__init__(round=None)
Initialize the average meter.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| round | int \| None | Number of decimal places to round results to. If None, results are not rounded. | None |
compute()
Compute the weighted average.
Returns:
| Type | Description |
|---|---|
| float \| int | Weighted average of all logged values, optionally rounded. |
log(value, weight)
Log a value with an associated weight.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| value | float \| int | The value to add to the average. | required |
| weight | float \| int | The weight for this value (typically batch size). | required |
merge(other_state)
Merge state from another meter instance (for distributed aggregation).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| other_state | dict[str, Any] | State dictionary from another AverageMeter instance. | required |
AveragedExponentMeter
Bases: BaseMeter
Meter that computes the exponent of a weighted average.
Commonly used for computing perplexity (exp(loss)).
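As a rough illustration of why the exponent is applied after averaging, the sketch below computes perplexity from per-batch mean losses weighted by token counts. The function name `perplexity` and the sample numbers are made up for this example and are not part of the library.

```python
import math
from typing import List, Tuple


def perplexity(batches: List[Tuple[float, float]]) -> float:
    """exp() of the weighted average of log-scale values (hypothetical helper)."""
    total = sum(loss * weight for loss, weight in batches)
    count = sum(weight for _, weight in batches)
    return math.exp(total / count)


# (mean loss, number of tokens) for two batches; illustrative values only.
batches = [(2.0, 100), (3.0, 300)]
ppl = perplexity(batches)

# Averaging exp(loss) per batch instead gives a different, larger number.
mean_of_exps = sum(math.exp(l) * w for l, w in batches) / sum(w for _, w in batches)
print(ppl, mean_of_exps)
```

Because exp is convex, averaging in log space first yields a value no larger than the weighted mean of per-batch exponents, which is why the two orders of operation are not interchangeable.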
Attributes:
| Name | Type | Description |
|---|---|---|
| _internal | | An |
| round | | Number of decimal places to round the final result to. |
__init__(round=None)
Initialize the averaged exponent meter.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| round | int \| None | Number of decimal places to round results to. If None, results are not rounded. | None |
compute()
Return the exponent of the average.
Computes the weighted average of logged values using the internal
AverageMeter and then returns exp() of that average.
Returns:
| Type | Description |
|---|---|
| float \| int | The exponent of the weighted average, optionally rounded. |
load_state_dict(state_dict)
Restore state.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| state_dict | dict[str, Any] | State dictionary to restore from. | required |
log(value, weight)
Log a log-scale value with its weight to the internal average meter.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| value | float \| int | The log-scale value to add. | required |
| weight | float \| int | The weight for this value. | required |
merge(other_state)
Merge state from another AveragedExponentMeter.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| other_state | dict[str, Any] | State dictionary from another AveragedExponentMeter instance. | required |
state_dict()
Collect state for checkpointing.
Returns:
| Type | Description |
|---|---|
| dict[str, Any] | A dictionary containing the internal meter state and the rounding precision. |
CachedLambda
Wrapper that caches the result of a callable function.
This is useful for expensive computations that are used multiple times in meter logging. The function is only called once, and subsequent calls return the cached result.
Example
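The caching behaviour can be sketched as follows. This is an illustrative re-implementation under assumed names (`CachedCall`, `_UNSET`), not the library's `CachedLambda` source; a sentinel is used so that a cached result of `None` is still treated as cached.

```python
from typing import Any, Callable, List


class CachedCall:
    """Hypothetical wrapper that runs a zero-argument callable at most once."""

    _UNSET = object()  # sentinel meaning "not computed yet"

    def __init__(self, func: Callable[[], Any]) -> None:
        self._func = func
        self._value: Any = CachedCall._UNSET

    def __call__(self) -> Any:
        if self._value is CachedCall._UNSET:
            self._value = self._func()  # first call: execute and remember
        return self._value              # later calls: return the cached value


calls: List[int] = []


def expensive() -> int:
    calls.append(1)  # record each real execution
    return 42


cached = CachedCall(expensive)
first = cached()
second = cached()
print(first, second, len(calls))
```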
__call__()
Call the function, caching the result.
Returns:
| Type | Description |
|---|---|
| Any | The result of the function call. On first call, the function is executed and the result is cached. On subsequent calls, the cached value is returned. |
__init__(func)
Initialize the cached lambda.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| func | Callable[[], Any] | Callable function that takes no arguments and returns a value. | required |
FrequencyMeter
Bases: BaseMeter
Meter that tracks how often an event occurs, reported as the average duration per call.
Measures time between successive calls to log().
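A minimal sketch of interval timing between successive calls, assuming nanosecond timestamps and a result in milliseconds as described in this section. The class name `IntervalMeter` and the injectable `clock` argument are hypothetical; the clock is injected only so the example is deterministic.

```python
import time
from typing import Callable, Optional


class IntervalMeter:
    """Hypothetical meter averaging the time between successive log() calls."""

    def __init__(self, clock: Callable[[], int] = time.perf_counter_ns) -> None:
        self._clock = clock
        self.start: Optional[int] = None  # timestamp of the previous call
        self.elapsed = 0                  # total nanoseconds between calls
        self.counter = 0                  # number of measured intervals

    def log(self) -> None:
        now = self._clock()
        if self.start is not None:
            # Every call after the first closes one interval.
            self.elapsed += now - self.start
            self.counter += 1
        self.start = now

    def compute(self) -> float:
        if self.counter == 0:
            return 0
        return self.elapsed / self.counter / 1e6  # nanoseconds to milliseconds


# Fake clock returning 0 ms, 2 ms and 6 ms (in nanoseconds).
ticks = iter([0, 2_000_000, 6_000_000])
meter = IntervalMeter(clock=lambda: next(ticks))
for _ in range(3):
    meter.log()
average_ms = meter.compute()
print(average_ms)
```

Three calls produce two intervals, so the first call only initializes the start time.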
Attributes:
| Name | Type | Description |
|---|---|---|
| round | | Rounding precision for the result. |
| start | int \| None | Internal timestamp of the last call to |
| elapsed | int | Total elapsed time in nanoseconds. |
| counter | int | Number of events recorded. |
__init__(round=None)
Initialize the frequency meter.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| round | int \| None | Number of decimal places to round results to. If None, results are not rounded. | None |
compute()
Compute average time per occurrence in milliseconds.
Returns:
| Type | Description |
|---|---|
| float \| int \| dict[str, float \| int] | The average time per occurrence in milliseconds, optionally rounded. Returns 0 if no events have been recorded. |
load_state_dict(state_dict)
Restore state and reset the start timer.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| state_dict | dict[str, Any] | State dictionary to restore from. | required |
log()
Record an occurrence and compute elapsed time since the last call.
If this is the first call, it initializes the start time. Subsequent calls update the elapsed time and increment the counter.
merge(other_state)
Merge state from another FrequencyMeter.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| other_state | dict[str, Any] | State dictionary from another FrequencyMeter instance. | required |
GatherMeter
Bases: BaseMeter
Accumulator that gathers all raw values across the entire dataset.
Use this for meters that require full dataset context (e.g., BLEU, ROC-AUC).
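To illustrate why some metrics need every raw value, the sketch below gathers values from two simulated ranks and computes a median, which cannot be recovered from per-batch summaries. The class `GatherAll`, its method names, and the `"values"` state key are assumptions; this page does not document the actual GatherMeter interface.

```python
import statistics
from typing import Any, Dict, Iterable, List


class GatherAll:
    """Hypothetical accumulator that keeps every raw value it is given."""

    def __init__(self) -> None:
        self.values: List[float] = []

    def log(self, batch_values: Iterable[float]) -> None:
        self.values.extend(batch_values)

    def state_dict(self) -> Dict[str, Any]:
        return {"values": list(self.values)}

    def merge(self, other_state: Dict[str, Any]) -> None:
        # Concatenate raw values from another rank instead of combining summaries.
        self.values.extend(other_state["values"])

    def compute(self) -> float:
        # Median stands in for any metric that needs the full dataset.
        return statistics.median(self.values)


rank0 = GatherAll()
rank0.log([1, 9])
rank0.log([2])

rank1 = GatherAll()
rank1.log([3, 100])

rank0.merge(rank1.state_dict())
dataset_median = rank0.compute()
print(dataset_median)
```

The trade-off is memory: state grows with the dataset, so this pattern suits evaluation passes better than long-running training counters.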
MaxMeter
Bases: BaseMeter
Meter that tracks the maximum logged value.
This meter accumulates values and computes the maximum when
compute() is called. Useful for tracking peak memory, max loss, etc.
Attributes:
| Name | Type | Description |
|---|---|---|
| round | | Number of decimal places to round the result to (None = no rounding). |
| value | | Accumulated maximum value. |
__init__(round=None)
Initialize the max meter.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| round | int \| None | Number of decimal places to round results to. If None, results are not rounded. | None |
compute()
Compute the maximum value.
Returns:
| Type | Description |
|---|---|
| float \| int | Maximum of all logged values, optionally rounded. |
log(value)
Log a value to be tracked for maximum.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| value | float \| int | The value to accumulate to the max. | required |
merge(other_state)
Merge state from another meter instance (for distributed aggregation).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| other_state | dict[str, Any] | State dictionary from another MaxMeter instance. | required |
MinMeter
Bases: MaxMeter
Meter that tracks the minimum logged value.
This meter accumulates values and computes the minimum when
compute() is called. Useful for tracking minimum learning rate, etc.
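One way a minimum tracker can reuse a maximum tracker is to swap only the comparison. The sketch below shows that idea; the names `RunningMax`, `RunningMin`, the `_pick` hook, and the `"value"` state key are assumptions, and the library may structure the inheritance differently.

```python
from typing import Any, Dict, Optional


class RunningMax:
    """Hypothetical extreme-value tracker; subclasses override _pick."""

    def __init__(self) -> None:
        self.value: Optional[float] = None  # None until something is logged

    def _pick(self, a: float, b: float) -> float:
        return max(a, b)

    def log(self, value: float) -> None:
        self.value = value if self.value is None else self._pick(self.value, value)

    def merge(self, other_state: Dict[str, Any]) -> None:
        other = other_state["value"]
        if other is not None:
            self.log(other)  # merging is just logging the other side's extreme

    def compute(self) -> Optional[float]:
        return self.value


class RunningMin(RunningMax):
    def _pick(self, a: float, b: float) -> float:
        return min(a, b)


peak = RunningMax()
for v in (3.0, 7.5, 5.0):
    peak.log(v)

lowest = RunningMin()
for v in (1e-3, 5e-4, 2e-3):
    lowest.log(v)
lowest.merge({"value": 1e-4})  # state from another (simulated) rank

print(peak.compute(), lowest.compute())
```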
StopwatchMeter
Bases: BaseMeter
Meter that acts as a manual stopwatch for measuring event durations.
Expects pairs of log(mode="start") and log(mode="end") calls.
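The start/end pairing can be sketched as below. This is an illustrative stand-in named `Stopwatch`, with an injectable `clock` added so the example is deterministic. Raising on an `"end"` without a preceding `"start"` is an assumption about the intended behaviour, since the corresponding description is truncated in this page.

```python
import time
from typing import Callable, Optional


class Stopwatch:
    """Hypothetical stopwatch averaging durations of start/end pairs."""

    def __init__(self, clock: Callable[[], int] = time.perf_counter_ns) -> None:
        self._clock = clock
        self._start: Optional[int] = None
        self.elapsed = 0   # total nanoseconds across completed intervals
        self.counter = 0   # number of completed intervals

    def log(self, mode: str) -> None:
        if mode == "start":
            self._start = self._clock()
        elif mode == "end":
            # Assumed check: ending a stopwatch that was never started is an error.
            assert self._start is not None, "end logged without a matching start"
            self.elapsed += self._clock() - self._start
            self.counter += 1
            self._start = None
        else:
            raise AssertionError(f"unknown mode: {mode!r}")

    def compute(self) -> float:
        if self.counter == 0:
            return 0
        return self.elapsed / self.counter / 1e6  # nanoseconds to milliseconds


# Fake clock: first interval lasts 1 ms, second lasts 3 ms.
ticks = iter([0, 1_000_000, 5_000_000, 8_000_000])
watch = Stopwatch(clock=lambda: next(ticks))
for mode in ("start", "end", "start", "end"):
    watch.log(mode)
average_ms = watch.compute()
print(average_ms)
```

Unlike the interval-based frequency measurement, time spent between an `"end"` and the next `"start"` is not counted.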
Attributes:
| Name | Type | Description |
|---|---|---|
| round | | Rounding precision for the result. |
| _start | float \| None | Internal timestamp of when the stopwatch was started. |
| elapsed | int | Total elapsed time in nanoseconds across all recorded intervals. |
| counter | int | Number of completed intervals. |
__init__(round=None)
Initialize the stopwatch meter.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| round | int \| None | Number of decimal places to round results to. If None, results are not rounded. | None |
compute()
Compute average duration in milliseconds.
Returns:
| Type | Description |
|---|---|
| float \| int \| dict[str, float \| int] | The average duration in milliseconds, optionally rounded. Returns 0 if no intervals have been recorded. |
end()
Stop the timer and record the duration.
Raises:
| Type | Description |
|---|---|
| AssertionError | If |
load_state_dict(state_dict)
Restore state and reset current timer.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| state_dict | dict[str, Any] | State dictionary to restore from. | required |
log(mode)
Start or stop the timer.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| mode | str | "start" to begin timing, "end" to stop and record duration. | required |

Raises:
| Type | Description |
|---|---|
| AssertionError | If an unknown mode is provided. |
merge(other_state)
Merge state from another StopwatchMeter.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| other_state | dict[str, Any] | State dictionary from another StopwatchMeter instance. | required |
SummedMeter
Bases: BaseMeter
Meter that sums logged values.
This meter simply accumulates values without averaging. Useful for values like total tokens processed, total examples seen, etc.
Attributes:
| Name | Type | Description |
|---|---|---|
| round | | Number of decimal places to round results to (None = no rounding). |
| sum | | Accumulated sum of all logged values. |
__init__(round=None)
Initialize the summed meter.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| round | int \| None | Number of decimal places to round results to. If None, results are not rounded. | None |
compute()
Compute the sum.
Returns:
| Type | Description |
|---|---|
| float \| int | Sum of all logged values, optionally rounded. |
log(value)
Log a value to add to the sum.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| value | float \| int | The value to add to the sum. | required |
merge(other_state)
Merge state from another meter instance (for distributed aggregation).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| other_state | dict[str, Any] | State dictionary from another SummedMeter instance. | required |
cached_lambda(x)
Create a cached lambda wrapper.
Convenience function for creating a CachedLambda instance.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| x | Callable[[], Any] | Callable function to cache. | required |

Returns:
| Type | Description |
|---|---|
| CachedLambda | CachedLambda instance that caches the function's result. |
log_averaged(name, value, weight=1.0, round=None, reset=True, priority=100)
Log a value to an averaged meter.
This is a convenience function for logging values to an AverageMeter.
The value and weight can be callables (lambdas) that are only evaluated
when the meter is actually logged (lazy evaluation).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Name of the meter (e.g., "train/loss"). | required |
| value | DelayedScalar | Value to log. Can be a number or a callable that returns a number. | required |
| weight | DelayedScalar | Weight for this value (typically batch size). Can be a number or a callable. Defaults to 1.0. | 1.0 |
| round | int \| None | Number of decimal places to round the result to. | None |
| reset | bool | If True, the meter is reset after logging (for per-iteration meters). If False, the meter accumulates across iterations. | True |
| priority | int | Priority for meter ordering when logging. Higher priority meters appear first. | 100 |
Example
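The lazy-evaluation idea can be sketched without the library itself. In the snippet below, `resolve`, `log_if_enabled`, the `store` dictionary, and the `enabled` flag are all hypothetical; they only illustrate how accepting either a number or a zero-argument callable lets an expensive value be skipped when nothing is recorded. How the real function decides whether a meter is "actually logged" is not shown in this page.

```python
from typing import Callable, Dict, List, Tuple, Union

Scalar = Union[int, float]
Delayed = Union[Scalar, Callable[[], Scalar]]  # number or zero-argument callable


def resolve(x: Delayed) -> Scalar:
    """Evaluate a delayed scalar only at the moment it is needed."""
    return x() if callable(x) else x


def log_if_enabled(
    store: Dict[str, Tuple[float, float]],
    name: str,
    value: Delayed,
    weight: Delayed = 1.0,
    enabled: bool = True,  # hypothetical gate standing in for the library's logic
) -> None:
    if not enabled:
        return  # callables are never invoked on this path
    total, count = store.get(name, (0.0, 0.0))
    v, w = resolve(value), resolve(weight)
    store[name] = (total + v * w, count + w)


evaluated: List[int] = []


def costly_loss() -> float:
    evaluated.append(1)  # track how many times the value was really computed
    return 0.5


store: Dict[str, Tuple[float, float]] = {}
log_if_enabled(store, "train/loss", costly_loss, weight=32, enabled=False)
skipped = len(evaluated)  # still 0: the callable was not run

log_if_enabled(store, "train/loss", costly_loss, weight=32)
print(skipped, store["train/loss"])
```

Passing a lambda is most useful when producing the number forces a device synchronization or another costly step.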
log_averaged_exponent(name, value, weight=1.0, round=None, reset=True, priority=100)
Log a value to an averaged exponent meter.
This is a convenience function for logging values to an AveragedExponentMeter.
The value and weight can be callables (lambdas) for lazy evaluation.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Name of the meter (e.g., "train/perplexity"). | required |
| value | DelayedScalar | Log-scale value to log. Can be a number or a callable. | required |
| weight | DelayedScalar | Weight for this value. Can be a number or a callable. Defaults to 1.0. | 1.0 |
| round | int \| None | Number of decimal places to round the result to. | None |
| reset | bool | If True, the meter is reset after logging. | True |
| priority | int | Priority for meter ordering. | 100 |
log_event_end(name, round=None, reset=True, priority=100)
End timing an event.
This function stops a stopwatch that was started with log_event_start().
The duration between start and end is recorded and averaged across
multiple occurrences.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Name of the event (must match the name used in | required |
| round | int \| None | Number of decimal places to round the duration to (in milliseconds). | None |
| reset | bool | If True, the meter is reset after logging. | True |
| priority | int | Priority for meter ordering when logging. | 100 |

Raises:
| Type | Description |
|---|---|
| AssertionError | If |
log_event_occurence(name, round=None, reset=True, priority=100)
Log an occurrence of an event and track its frequency.
This function uses a FrequencyMeter to measure the time between
successive calls to log_event_occurence() for a given name.
It can be used to track the rate at which certain events happen.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Name of the event to track (e.g., "perf/dataloader_ready"). | required |
| round | int \| None | Number of decimal places to round the duration to (in milliseconds). | None |
| reset | bool | If True, the meter is reset after logging. | True |
| priority | int | Priority for meter ordering when logging. | 100 |
log_event_start(name, round=None, reset=True, priority=100)
Start timing an event.
This function starts a stopwatch for measuring the duration of an event.
Call log_event_end() with the same name to stop timing and record the
duration. The duration is automatically averaged across multiple occurrences.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Name of the event to time (e.g., "perf/forward_pass"). | required |
| round | int \| None | Number of decimal places to round the duration to (in milliseconds). | None |
| reset | bool | If True, the meter is reset after logging. | True |
| priority | int | Priority for meter ordering when logging. | 100 |
log_max(name, value, round=None, reset=True, priority=100)
Log a value to a max meter.
This is a convenience function for logging values to a MaxMeter.
The value can be a callable (lambda) for lazy evaluation.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Name of the meter (e.g., "perf/max_memory"). | required |
| value | DelayedScalar | Value to log. Can be a number or a callable. | required |
| round | int \| None | Number of decimal places to round the result to. | None |
| reset | bool | If True, the meter is reset after logging. | True |
| priority | int | Priority for meter ordering when logging. | 100 |
log_min(name, value, round=None, reset=True, priority=100)
Log a value to a min meter.
This is a convenience function for logging values to a MinMeter.
The value can be a callable (lambda) for lazy evaluation.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Name of the meter (e.g., "optimization/min_lr"). | required |
| value | DelayedScalar | Value to log. Can be a number or a callable. | required |
| round | int \| None | Number of decimal places to round the result to. | None |
| reset | bool | If True, the meter is reset after logging. | True |
| priority | int | Priority for meter ordering when logging. | 100 |
log_summed(name, value, round=None, reset=True, priority=100)
Log a value to a summed meter.
This is a convenience function for logging values to a SummedMeter.
The value can be a callable (lambda) that is only evaluated when the
meter is actually logged (lazy evaluation).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Name of the meter (e.g., "train/tokens_processed"). | required |
| value | DelayedScalar | Value to add to the sum. Can be a number or a callable that returns a number. | required |
| round | int \| None | Number of decimal places to round the result to. | None |
| reset | bool | If True, the meter is reset after logging. If False, the meter accumulates across iterations. | True |
| priority | int | Priority for meter ordering when logging. | 100 |