vllm.model_executor.layers.rotary_embedding.yarn_scaling_rope
YaRNScalingRotaryEmbedding
Bases: RotaryEmbedding
RotaryEmbedding extended with the YaRN method.
Credits to Peng et al.: github.com/jquesnelle/yarn
Source code in vllm/model_executor/layers/rotary_embedding/yarn_scaling_rope.py
mscale instance-attribute
mscale = (
    float(yarn_get_mscale(scaling_factor) * attn_factor)
    if apply_yarn_scaling
    else float(attn_factor)
)
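The expression above can be illustrated with a minimal, standalone sketch. The body of yarn_get_mscale is not shown on this page; the version below assumes the attention-scaling formula from the YaRN paper (0.1 * ln(s) + 1 for s > 1, otherwise 1), and select_mscale is a hypothetical helper that only mirrors the conditional used for the attribute.

```python
import math


def yarn_get_mscale(scale: float = 1.0) -> float:
    # Assumed formula (YaRN paper): no correction when the context is not extended.
    if scale <= 1:
        return 1.0
    return 0.1 * math.log(scale) + 1.0


def select_mscale(
    scaling_factor: float,
    attn_factor: float = 1.0,
    apply_yarn_scaling: bool = True,
) -> float:
    # Hypothetical helper mirroring the `mscale` expression documented above.
    if apply_yarn_scaling:
        return float(yarn_get_mscale(scaling_factor) * attn_factor)
    return float(attn_factor)


if __name__ == "__main__":
    for factor in (1.0, 2.0, 4.0, 8.0):
        print(factor, select_mscale(factor), select_mscale(factor, apply_yarn_scaling=False))
```

Under this assumption, mscale grows only logarithmically with scaling_factor, and setting apply_yarn_scaling to False leaves attn_factor as the sole multiplier.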
__init__
__init__(
    head_size: int,
    rotary_dim: int,
    max_position_embeddings: int,
    base: float,
    is_neox_style: bool,
    scaling_factor: float,
    dtype: dtype,
    *,
    extrapolation_factor: float = 1,
    attn_factor: float = 1,
    beta_fast: int = 32,
    beta_slow: int = 1,
    apply_yarn_scaling: bool = True,
) -> None
_compute_cos_sin_cache
_compute_cos_sin_cache() -> Tensor
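The page gives only the signature of this method, so the following is a rough pure-Python sketch of how a YaRN cos/sin cache is commonly built, not the vLLM implementation itself. It assumes the approach from the YaRN reference code: blend interpolated and extrapolated inverse frequencies with a linear ramp between the dimensions selected by beta_fast and beta_slow, then scale cos and sin by mscale. All function names here (correction_dim, correction_range, yarn_inv_freq, cos_sin_cache) are hypothetical, and nested lists stand in for the Tensor the real method returns.

```python
import math


def correction_dim(num_rotations: float, dim: int, base: float, max_pos: int) -> float:
    # Dimension index at which a frequency completes `num_rotations` over max_pos tokens.
    return dim * math.log(max_pos / (num_rotations * 2 * math.pi)) / (2 * math.log(base))


def correction_range(beta_fast, beta_slow, dim, base, max_pos):
    low = math.floor(correction_dim(beta_fast, dim, base, max_pos))
    high = math.ceil(correction_dim(beta_slow, dim, base, max_pos))
    return max(low, 0), min(high, dim - 1)


def yarn_inv_freq(rotary_dim, base, max_pos, scaling_factor,
                  beta_fast=32, beta_slow=1, extrapolation_factor=1.0):
    low, high = correction_range(beta_fast, beta_slow, rotary_dim, base, max_pos)
    if low == high:
        high += 0.001  # avoid division by zero in the ramp
    inv_freq = []
    for i in range(rotary_dim // 2):
        pos_freq = base ** (2 * i / rotary_dim)
        extrapolated = 1.0 / pos_freq                      # original RoPE frequency
        interpolated = 1.0 / (scaling_factor * pos_freq)   # position-interpolated frequency
        ramp = min(max((i - low) / (high - low), 0.0), 1.0)
        mask = (1.0 - ramp) * extrapolation_factor         # 1 keeps the original frequency
        inv_freq.append(interpolated * (1.0 - mask) + extrapolated * mask)
    return inv_freq


def cos_sin_cache(rotary_dim, base, max_pos, scaling_factor, mscale=1.0):
    inv_freq = yarn_inv_freq(rotary_dim, base, max_pos, scaling_factor)
    cache = []
    for t in range(int(max_pos * scaling_factor)):  # one row per extended position
        angles = [t * f for f in inv_freq]
        cache.append([math.cos(a) * mscale for a in angles]
                     + [math.sin(a) * mscale for a in angles])
    return cache


if __name__ == "__main__":
    cache = cos_sin_cache(rotary_dim=8, base=10000.0, max_pos=16, scaling_factor=2.0)
    print(len(cache), len(cache[0]))
```

In this sketch the cache covers max_position_embeddings * scaling_factor positions, and each row concatenates the cos half and the sin half, so its width equals rotary_dim.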