Activity
Attempt to use symbolic for MatMul axis tracking, blocked by insuffic…
Fuse selected token indices pruning into the model
Flattening of input_ids and position_ids. Adapt to a new signature in…
Fixed dimensions in HW-dependent part of PA transformation
Deduce the number of KV heads and head_size from the model without re…
Merge remote-tracking branch 'lavrenov/openvino-model-executor' into …
Enabled int8 weights by default for performance benchmarking purposes
Merge branch 'openvino-model-executor' into transformers_4_39
Describe weight compression option in the documentation
Merge remote-tracking branch 'lavrenov/openvino-model-executor' into …
Disable weight compression on optimum-intel path if model is being co…
Update Dockerfile.openvino
Removed op factory and dependency on openvino_contrib.
Update vllm/model_executor/openvino_model_loader.py
Merge remote-tracking branch 'lavrenov/openvino-model-executor' into …
Update vllm/model_executor/openvino_model_loader.py
Load from OpenVINO IR if it exists in model id directory
Passthrough trust_remote_code
Removed part of the debug output
vLLM modeling works with PythonOp, less debug output, no explicit add…
Env var control for optimum-intel switch. Debug output in evaluate.
Using stub-like Python Op for PagedAttentionExtension instead of real…
Enabled optimum-intel path, fixed recent regression with int32/int64 …
Set position_ids name for optimum-intel based modeling. Fix for model…