vllm.model_executor.model_loader.reload.utils
get_info_size
get_info_size(info: LayerReloadingInfo) -> int
Calculate the number of bytes used by loaded weights for a given layer.
:param info: layerwise info to get the size of
:return: number of bytes used by loaded weights
Source code in vllm/model_executor/model_loader/reload/utils.py
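The byte count described above can be illustrated with a minimal sketch. It does not use vLLM or its `LayerReloadingInfo` type, whose fields are not shown on this page. `FakeTensor` and `loaded_bytes` are hypothetical names, and the sketch assumes that the size of a tensor is its element count times the byte width of one element.

```python
from dataclasses import dataclass
from math import prod


@dataclass(frozen=True)
class FakeTensor:
    """Hypothetical stand-in for a tensor: a shape and a per-element byte width."""

    shape: tuple[int, ...]
    element_size: int  # bytes per element, e.g. 2 for float16, 4 for float32

    def nbytes(self) -> int:
        # Number of elements times bytes per element.
        return prod(self.shape) * self.element_size


def loaded_bytes(tensors: dict[str, FakeTensor]) -> int:
    """Sum the byte sizes of every tensor in the mapping (assumed layout)."""
    return sum(t.nbytes() for t in tensors.values())


weights = {
    "weight": FakeTensor(shape=(4, 8), element_size=2),
    "bias": FakeTensor(shape=(8,), element_size=4),
}
total = loaded_bytes(weights)
```

With real PyTorch tensors the same quantity is usually obtained from `tensor.numel() * tensor.element_size()`.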
get_layer_params_buffers
get_layer_params_buffers(layer: Module) -> LayerTensors
Get all parameters and buffers of a module as a tuple of dicts.
get_layer_size
Calculate the total number of elements across loadable tensors in a layer.
Excludes SKIP_TENSORS (e.g. _expert_map) which are never moved to meta device and never loaded via weight_loader during layerwise reload.
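A minimal sketch of counting elements while skipping excluded names, under stated assumptions: tensors are represented only by their shapes, and `SKIP_NAMES` is a hypothetical stand-in for `SKIP_TENSORS`, of which this page names only `_expert_map`. The function name `count_loadable_elements` is likewise illustrative and not part of vLLM.

```python
from math import prod

# Assumption: only `_expert_map` is known from the description above;
# the real SKIP_TENSORS may contain other names.
SKIP_NAMES = frozenset({"_expert_map"})


def count_loadable_elements(shapes: dict[str, tuple[int, ...]]) -> int:
    """Sum element counts over all tensors whose name is not skipped."""
    return sum(
        prod(shape)
        for name, shape in shapes.items()
        if name not in SKIP_NAMES
    )


layer_shapes = {
    "weight": (16, 4),     # 64 elements
    "bias": (16,),         # 16 elements
    "_expert_map": (8,),   # skipped, never counted
}
n_elements = count_loadable_elements(layer_shapes)
```

Note that this counts elements, not bytes, which matches the wording of the description above.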
get_layer_tensors
Get all parameters and buffers from a module as a dict.
has_device_tensors
has_device_tensors(bound_args: BoundArguments) -> bool
Return True if the loaded weights exist on an accelerator device.
:param bound_args: args to load weights
:return: True if weights are on accelerator device
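The sketch below shows one way such a check over `inspect.BoundArguments` could look. It is not the vLLM implementation. `FakeTensor`, `load_weight`, and `any_on_accelerator` are hypothetical, and treating every device other than `cpu` and `meta` as an accelerator is an assumption made for illustration.

```python
import inspect
from dataclasses import dataclass


@dataclass
class FakeTensor:
    """Hypothetical stand-in for a tensor that only records its device type."""

    device_type: str  # e.g. "cpu", "meta", "cuda"


def load_weight(param, loaded_weight, shard_id=None):
    """Hypothetical weight-loader signature, used only to build BoundArguments."""


def any_on_accelerator(bound_args: inspect.BoundArguments) -> bool:
    # Assumption: anything that is not on "cpu" or "meta" counts as an accelerator.
    return any(
        isinstance(value, FakeTensor) and value.device_type not in ("cpu", "meta")
        for value in bound_args.arguments.values()
    )


sig = inspect.signature(load_weight)
gpu_call = sig.bind(FakeTensor("meta"), FakeTensor("cuda"))
cpu_call = sig.bind(FakeTensor("meta"), FakeTensor("cpu"), shard_id=0)

gpu_result = any_on_accelerator(gpu_call)
cpu_result = any_on_accelerator(cpu_call)
```

Binding the arguments first lets the check inspect positional and keyword arguments uniformly through `bound_args.arguments`.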