vllm.attention.backends.registry ¶
Attention backend registry
AttentionBackendEnum ¶
Bases: Enum
Enumeration of all supported attention backends.
The enum value is the default class path, but this can be overridden at runtime using register_backend().
To get the actual backend class (respecting overrides), use: backend.get_class()
Source code in vllm/attention/backends/registry.py
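The class-path-valued enum pattern described above can be illustrated with a minimal, self-contained sketch. The names here (`BackendSketch`, the stdlib `json.decoder.JSONDecoder` path) are stand-ins for illustration, not vLLM's actual members or paths:

```python
import importlib
from enum import Enum


class BackendSketch(Enum):
    # Each member's *value* is a default fully qualified class path.
    # A stdlib class path is used here as a stand-in for a backend path.
    JSON_DECODER = "json.decoder.JSONDecoder"

    def get_class(self) -> type:
        """Import the module named by the path and return the class."""
        module_name, _, class_name = self.value.rpartition(".")
        # Raises ImportError if the module cannot be imported
        module = importlib.import_module(module_name)
        return getattr(module, class_name)
```

Resolving a member this way defers the import until the backend is actually requested, so unavailable backends do not break enum definition.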
CPU_ATTN class-attribute instance-attribute ¶
CUTLASS_MLA class-attribute instance-attribute ¶
FLASHINFER class-attribute instance-attribute ¶
FLASHINFER_MLA class-attribute instance-attribute ¶
FLASHMLA class-attribute instance-attribute ¶
FLASHMLA_SPARSE class-attribute instance-attribute ¶
FLASH_ATTN class-attribute instance-attribute ¶
FLASH_ATTN_MLA class-attribute instance-attribute ¶
FLEX_ATTENTION class-attribute instance-attribute ¶
IPEX class-attribute instance-attribute ¶
NO_ATTENTION class-attribute instance-attribute ¶
PALLAS class-attribute instance-attribute ¶
ROCM_AITER_FA class-attribute instance-attribute ¶
ROCM_AITER_MLA class-attribute instance-attribute ¶
ROCM_AITER_UNIFIED_ATTN class-attribute instance-attribute ¶
ROCM_AITER_UNIFIED_ATTN = "vllm.v1.attention.backends.rocm_aiter_unified_attn.RocmAiterUnifiedAttentionBackend"
ROCM_ATTN class-attribute instance-attribute ¶
TREE_ATTN class-attribute instance-attribute ¶
TRITON_ATTN class-attribute instance-attribute ¶
TRITON_MLA class-attribute instance-attribute ¶
XFORMERS class-attribute instance-attribute ¶
clear_override ¶
get_class ¶
get_class() -> type[AttentionBackend]
Get the backend class (respects overrides).
Returns:
| Type | Description |
|---|---|
type[AttentionBackend] | The backend class |
Raises:
| Type | Description |
|---|---|
ImportError | If the backend class cannot be imported |
ValueError | If AttentionBackendEnum.CUSTOM is used without being registered
get_path ¶
get_path() -> str
Get the class path for this backend (respects overrides).
Returns:
| Type | Description |
|---|---|
str | The fully qualified class path string |
Raises:
| Type | Description |
|---|---|
ValueError | If AttentionBackendEnum.CUSTOM is used without being registered
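The override lifecycle (`get_path` preferring a runtime override, `clear_override` restoring the default) can be sketched in a self-contained form. The enum, the override dict, and the stdlib paths below are all illustrative stand-ins, not vLLM's implementation:

```python
from enum import Enum

# Runtime overrides: enum member -> replacement class path (illustrative)
_overrides: dict["Backend", str] = {}


class Backend(Enum):
    # A stdlib class path stands in for a real backend path
    DECODER = "json.decoder.JSONDecoder"

    def get_path(self) -> str:
        """Return the class path; an override takes precedence."""
        return _overrides.get(self, self.value)

    def clear_override(self) -> None:
        """Drop any runtime override, restoring the default path."""
        _overrides.pop(self, None)


# Point DECODER somewhere else at runtime, then restore the default
_overrides[Backend.DECODER] = "json.encoder.JSONEncoder"
overridden = Backend.DECODER.get_path()
Backend.DECODER.clear_override()
restored = Backend.DECODER.get_path()
```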
_AttentionBackendEnumMeta ¶
Bases: EnumMeta
Metaclass for AttentionBackendEnum to provide better error messages.
__getitem__ ¶
__getitem__(name: str)
Get backend by name with helpful error messages.
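One common way such a metaclass produces helpful error messages is to catch the lookup failure and suggest near-miss names. This sketch uses `difflib` for the suggestions; the class names and members are hypothetical, not vLLM's actual code:

```python
import difflib
from enum import Enum, EnumMeta


class FriendlyEnumMeta(EnumMeta):
    def __getitem__(cls, name: str):
        try:
            return super().__getitem__(name)
        except KeyError:
            # Replace the bare KeyError with valid names and a suggestion
            valid = [member.name for member in cls]
            close = difflib.get_close_matches(name, valid, n=1)
            hint = f" Did you mean {close[0]!r}?" if close else ""
            raise KeyError(
                f"Unknown backend {name!r}. Valid names: {valid}.{hint}"
            ) from None


class Backend(Enum, metaclass=FriendlyEnumMeta):
    FLASH_ATTN = "flash_attn.path"
    TRITON_ATTN = "triton_attn.path"
```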
_Backend ¶
Deprecated: Use AttentionBackendEnum instead.
This class is provided for backwards compatibility with plugins and will be removed in a future release.
_BackendMeta ¶
Bases: type
Metaclass to provide deprecation warnings when accessing _Backend.
__getattribute__ ¶
__getattribute__(name: str)
register_backend ¶
register_backend(
backend: AttentionBackendEnum,
class_path: str | None = None,
) -> Callable[[type], type]
Register or override a backend implementation.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
backend | AttentionBackendEnum | The AttentionBackendEnum member to register | required |
class_path | str | None | Optional class path. If not provided and used as decorator, will be auto-generated from the class. | None |
Returns:
| Type | Description |
|---|---|
Callable[[type], type] | Decorator function if class_path is None, otherwise a no-op |
Examples:
Override an existing backend¶

```python
@register_backend(AttentionBackendEnum.FLASH_ATTN)
class MyCustomFlashAttn: ...
```

Register a custom third-party backend¶

```python
@register_backend(AttentionBackendEnum.CUSTOM)
class MyCustomBackend: ...
```

Direct registration¶

```python
register_backend(
    AttentionBackendEnum.CUSTOM,
    "my.module.MyCustomBackend",
)
```