Activity

Attempt to use symbolic for MatMul axis tracking, blocked by insuffic…

slyalin created use_symbolics_for_selected_tokens_fusion • a435a33 • 
on May 27, 2024

Fuse selected token indices pruning into the model

slyalin created fuse_selected_token_indices • a7e119e • 
on May 26, 2024

Flattening of input_ids and position_ids. Adapt to a new signature in…

slyalin pushed 1 commit to new_paged_attention • da7211c…b237fdf • 
on May 15, 2024

WIP: migrate to the new PA

slyalin created new_paged_attention • da7211c • 
on May 13, 2024

Fixed dimensions in HW dependent part of PA transformation

slyalin pushed 1 commit to paged_attention_transformation • 642f87a…0bf2880 • 
on May 7, 2024

Deduce the number of KV heads and head_size from the model without re…

slyalin pushed 4 commits to paged_attention_transformation • 3193fa8…642f87a • 
on May 6, 2024

Merge remote-tracking branch 'lavrenov/openvino-model-executor' into …

slyalin created paged_attention_transformation • 3193fa8 • 
on Apr 22, 2024

Enabled int8 weights by default for performance benchmarking purposes

slyalin created int8_enabled_by_default • 4931727 • 
on Apr 19, 2024

Merge branch 'openvino-model-executor' into transformers_4_39

slyalin pushed 51 commits to transformers_4_39 • 56bf5b1…fc6302a • 
on Apr 17, 2024

Describe weights compression option in the documentation

slyalin pushed 1 commit to disable_int8 • 786c6e5…02a108a • 
on Apr 16, 2024

Merge remote-tracking branch 'lavrenov/openvino-model-executor' into …

slyalin pushed 7 commits to disable_int8 • 0acb46c…786c6e5 • 
on Apr 15, 2024

Transformers 4.39

slyalin created for_health_check • 4b2ec6e • 
on Apr 4, 2024

Disable weight compression on optimum-intel path if model is being co…

slyalin created disable_int8 • 0acb46c • 
on Apr 4, 2024

Update Dockerfile.openvino

ilya-lavrenov pushed 1 commit to paged_attention_in_openvino • b691a92…c790199 • 
on Apr 4, 2024

Merged from remote

slyalin created transform_fixes • aa002b6 • 
on Apr 3, 2024

Removed op factory and dependency on openvino_contrib.

slyalin created paged_attention_in_openvino • b691a92 • 
on Apr 3, 2024

Update vllm/model_executor/openvino_model_loader.py

slyalin pushed 1 commit to extended_optimum_transform_plus • e09a0f1…d073339 • 
on Apr 1, 2024

Merge remote-tracking branch 'lavrenov/openvino-model-executor' into …

slyalin created extended_optimum_transform_plus • e09a0f1 • 
on Apr 1, 2024

Update vllm/model_executor/openvino_model_loader.py

slyalin pushed 1 commit to load_from_ir • 9dd49cb…06cf403 • 
on Mar 26, 2024

Load from OpenVINO IR if it exists in model id directory

slyalin created load_from_ir • 9dd49cb • 
on Mar 26, 2024

Upgrade transformers to 4.39

slyalin created transformers_4_39 • 56bf5b1 • 
on Mar 22, 2024

Passthrough trust_remote_code

slyalin created trust_remote_code_passthrough • d617241 • 
on Mar 22, 2024

Reverted accidental change

slyalin pushed 2 commits to python_op • 15eaeec…e683a9a • 
on Mar 21, 2024

Removed part of the debug output

slyalin pushed 1 commit to python_op • 14b1dc2…15eaeec • 
on Mar 21, 2024

vLLM modeling works with PythonOp, less debug output, no explicit add…

slyalin pushed 8 commits to python_op • 6d87df2…14b1dc2 • 
on Mar 21, 2024

Env var control for optimum-intel switch. Debug output in evaluate.

slyalin pushed 6 commits to python_op • 5c21861…6d87df2 • 
on Mar 21, 2024

Using stub-like Python Op for PagedAttentionExtension instead of real…

slyalin created python_op • 5c21861 • 
on Mar 19, 2024

Enabled optimum-intel path, fixed recent regression with int32/int64 …

slyalin created optimum_models_after_reorg • b98f5ba • 
on Mar 18, 2024

Merge pull request #5 from slyalin/fixed_parameter_types

ilya-lavrenov pushed 6 commits to openvino • 8a9862f…30605c8 • 
on Mar 8, 2024

Set position_ids name for optimum-intel based modeling. Fix for model…

slyalin pushed 2 commits to fixed_parameter_types • 0d2ba62…504704c • 
on Mar 8, 2024