Commit ce80498

Merge pull request #3 from slyalin/window_and_alibi
Passing alibi_slopes and sliding_window to PagedAttention extension
2 parents 0460167 + 6511e66 commit ce80498

1 file changed, with 3 additions and 1 deletion

vllm/worker/model_runner.py

@@ -206,7 +206,9 @@ def wrapper(module, target_op, *args, **kwargs):
                 args[5].max_context_len,
                 args[5].context_lens,
                 args[5].block_tables,
-                torch.tensor(module.scale) # wrap in a tensor, otherwise it will not appear in the trace
+                torch.tensor(module.scale), # wrap in a tensor, otherwise it will not appear in the trace
+                torch.tensor(module.alibi_slopes if module.alibi_slopes is not None else [], dtype=torch.float32), # alibi_slopes
+                torch.tensor(module.sliding_window if module.sliding_window is not None else 0, dtype=torch.int32) # sliding_window
             )

         with torch.no_grad():
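
The change above replaces missing values with sentinels before wrapping them in tensors: an empty list when `alibi_slopes` is `None`, and `0` when `sliding_window` is `None`. Below is a minimal sketch of that convention in plain Python, without the `torch.tensor(...)` wrapping. The helper names `encode_optional_args` and `decode_optional_args` are hypothetical and do not appear in the commit. The decoding side is an assumption about how a receiver might interpret the sentinels; the commit does not show the extension's behavior.

```python
from typing import List, Optional, Sequence, Tuple

# Sentinel taken from the diff: 0 stands for "no sliding window".
NO_SLIDING_WINDOW = 0


def encode_optional_args(
    alibi_slopes: Optional[Sequence[float]],
    sliding_window: Optional[int],
) -> Tuple[List[float], int]:
    """Map None to the sentinels used in the diff ([] and 0).

    Hypothetical helper: the commit inlines this logic in the call.
    """
    slopes = [float(s) for s in alibi_slopes] if alibi_slopes is not None else []
    window = int(sliding_window) if sliding_window is not None else NO_SLIDING_WINDOW
    return slopes, window


def decode_optional_args(
    slopes: Sequence[float],
    window: int,
) -> Tuple[Optional[List[float]], Optional[int]]:
    """Assumed receiver-side inverse: sentinels map back to None."""
    decoded_slopes = list(slopes) if len(slopes) > 0 else None
    decoded_window = window if window != NO_SLIDING_WINDOW else None
    return decoded_slopes, decoded_window


if __name__ == "__main__":
    print(encode_optional_args(None, None))
    print(encode_optional_args([0.5, 0.25], 4096))
    print(decode_optional_args(*encode_optional_args(None, None)))
```

One consequence of this encoding is that an explicitly empty slope list and a `sliding_window` of `0` become indistinguishable from `None` after the round trip, which is only acceptable if neither is a meaningful configuration.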
