Commit c86fd77

Wovchena, akladiev, popovaan, nikita-malinin, and nyatarkan authored
Merge releases/2024/3 into master (openvinotoolkit#666)
Co-authored-by: Alina Kladieva <alina.kladieva@intel.com>
Co-authored-by: Anastasiia Pnevskaia <anastasiia.pnevskaia@intel.com>
Co-authored-by: Nikita Malinin <nikita.malinin@intel.com>
Co-authored-by: Yaroslav Tarkan <yaroslav.tarkan@intel.com>
Co-authored-by: Anatoliy Talamanov <anatoliy.talamanov@intel.com>
Co-authored-by: Pavel Esir <pavel.esir@gmail.com>
Co-authored-by: Miłosz Żeglarski <milosz.zeglarski@intel.com>
Co-authored-by: Alexander Suvorov <alexander.suvorov@intel.com>
Co-authored-by: Xiake Sun <xiake.sun@intel.com>
1 parent 5d21486 commit c86fd77

File tree

27 files changed, +212 -101 lines changed


.github/workflows/causal_lm_cpp.yml

+33-33
Large diffs are not rendered by default.

.github/workflows/genai_package.yml

+9-9
@@ -5,9 +5,9 @@ concurrency:
   group: ${{ github.workflow }}-${{ github.head_ref || github.ref_name }}
   cancel-in-progress: true
 env:
-  l_ov_link: https://storage.openvinotoolkit.org/repositories/openvino/packages/nightly/2024.3.0-15945-a349dc82f9a/l_openvino_toolkit_ubuntu20_2024.3.0.dev20240708_x86_64.tgz
-  m_ov_link: https://storage.openvinotoolkit.org/repositories/openvino/packages/nightly/2024.3.0-15945-a349dc82f9a/m_openvino_toolkit_macos_12_6_2024.3.0.dev20240708_x86_64.tgz
-  w_ov_link: https://storage.openvinotoolkit.org/repositories/openvino/packages/nightly/2024.3.0-15945-a349dc82f9a/w_openvino_toolkit_windows_2024.3.0.dev20240708_x86_64.zip
+  l_ov_link: https://storage.openvinotoolkit.org/repositories/openvino/packages/pre-release/2024.3.0rc1/linux/l_openvino_toolkit_ubuntu20_2024.3.0.dev20240711_x86_64.tgz
+  m_ov_link: https://storage.openvinotoolkit.org/repositories/openvino/packages/pre-release/2024.3.0rc1/macos/m_openvino_toolkit_macos_12_6_2024.3.0.dev20240711_x86_64.tgz
+  w_ov_link: https://storage.openvinotoolkit.org/repositories/openvino/packages/pre-release/2024.3.0rc1/windows/w_openvino_toolkit_windows_2024.3.0.dev20240711_x86_64.zip
 jobs:
   ubuntu_genai_package:
     strategy:
@@ -28,8 +28,8 @@ jobs:
     - run: sudo ./ov/install_dependencies/install_openvino_dependencies.sh
     - run: source ./ov/setupvars.sh && cmake -DCMAKE_BUILD_TYPE=${{ matrix.build-type }} -S ./ -B ./build/
     - run: source ./ov/setupvars.sh && cmake --build ./build/ --config ${{ matrix.build-type }} --target package -j
-    - run: source ./ov/setupvars.sh && python -m pip install ./thirdparty/openvino_tokenizers/[transformers] --pre --extra-index-url https://storage.openvinotoolkit.org/simple/wheels/nightly
-    - run: source ./ov/setupvars.sh && python -m pip install --upgrade-strategy eager -r ./samples/requirements.txt --pre --extra-index-url https://storage.openvinotoolkit.org/simple/wheels/nightly
+    - run: source ./ov/setupvars.sh && python -m pip install ./thirdparty/openvino_tokenizers/[transformers] --extra-index-url https://storage.openvinotoolkit.org/simple/wheels/pre-release
+    - run: source ./ov/setupvars.sh && python -m pip install --upgrade-strategy eager -r ./samples/requirements.txt --extra-index-url https://storage.openvinotoolkit.org/simple/wheels/pre-release
     - run: source ./ov/setupvars.sh && optimum-cli export openvino --trust-remote-code --model TinyLlama/TinyLlama-1.1B-Chat-v1.0 TinyLlama-1.1B-Chat-v1.0
     - run: source ./ov/setupvars.sh && cmake --install ./build/ --config ${{ matrix.build-type }} --prefix ov
     - run: ov/samples/cpp/build_samples.sh -i ${{ github.workspace }}/s\ pace
@@ -57,8 +57,8 @@
     - run: brew install coreutils scons
     - run: source ./ov/setupvars.sh && cmake -DCMAKE_BUILD_TYPE=${{ matrix.build-type }} -S ./ -B ./build/
     - run: source ./ov/setupvars.sh && cmake --build ./build/ --config ${{ matrix.build-type }} --target package -j
-    - run: source ./ov/setupvars.sh && python -m pip install ./thirdparty/openvino_tokenizers/[transformers] --pre --extra-index-url https://storage.openvinotoolkit.org/simple/wheels/nightly
-    - run: source ./ov/setupvars.sh && python -m pip install --upgrade-strategy eager -r ./samples/requirements.txt --pre --extra-index-url https://storage.openvinotoolkit.org/simple/wheels/nightly
+    - run: source ./ov/setupvars.sh && python -m pip install ./thirdparty/openvino_tokenizers/[transformers] --extra-index-url https://storage.openvinotoolkit.org/simple/wheels/pre-release
+    - run: source ./ov/setupvars.sh && python -m pip install --upgrade-strategy eager -r ./samples/requirements.txt --extra-index-url https://storage.openvinotoolkit.org/simple/wheels/pre-release
     - run: source ./ov/setupvars.sh && optimum-cli export openvino --trust-remote-code --model TinyLlama/TinyLlama-1.1B-Chat-v1.0 TinyLlama-1.1B-Chat-v1.0
     - run: source ./ov/setupvars.sh && cmake --install ./build/ --config ${{ matrix.build-type }} --prefix ov
     - run: ov/samples/cpp/build_samples.sh -i ${{ github.workspace }}/s\ pace
@@ -100,8 +100,8 @@
       shell: bash
     - run: call ov\setupvars.bat && cmake -DCMAKE_BUILD_TYPE=${{ matrix.build-type }} -S ./ -B ./build/
     - run: call ov\setupvars.bat && cmake --build ./build/ --config ${{ matrix.build-type }} --target package -j
-    - run: call ov\setupvars.bat && python -m pip install ./thirdparty/openvino_tokenizers/[transformers] --pre --extra-index-url https://storage.openvinotoolkit.org/simple/wheels/nightly
-    - run: call ov\setupvars.bat && python -m pip install --upgrade-strategy eager -r ./samples/requirements.txt --pre --extra-index-url https://storage.openvinotoolkit.org/simple/wheels/nightly
+    - run: call ov\setupvars.bat && python -m pip install ./thirdparty/openvino_tokenizers/[transformers] --extra-index-url https://storage.openvinotoolkit.org/simple/wheels/pre-release
+    - run: call ov\setupvars.bat && python -m pip install --upgrade-strategy eager -r ./samples/requirements.txt --extra-index-url https://storage.openvinotoolkit.org/simple/wheels/pre-release
     - run: call ov\setupvars.bat && optimum-cli export openvino --trust-remote-code --weight-format fp16 --model TinyLlama/TinyLlama-1.1B-Chat-v1.0 TinyLlama-1.1B-Chat-v1.0
     - run: call ov\setupvars.bat && cmake --install ./build/ --config ${{ matrix.build-type }} --prefix ov
     - run: call ov\samples\cpp\build_samples_msvc.bat -i "${{ github.workspace }}/samples_install"

.github/workflows/genai_python_lib.yml

+8-11
@@ -5,9 +5,9 @@ concurrency:
   group: ${{ github.workflow }}-${{ github.head_ref || github.ref_name }}
   cancel-in-progress: true
 env:
-  l_ov_centos_link: https://storage.openvinotoolkit.org/repositories/openvino/packages/nightly/2024.3.0-15945-a349dc82f9a/l_openvino_toolkit_centos7_2024.3.0.dev20240708_x86_64.tgz
-  m_ov_link: https://storage.openvinotoolkit.org/repositories/openvino/packages/nightly/2024.3.0-15945-a349dc82f9a/m_openvino_toolkit_macos_12_6_2024.3.0.dev20240708_x86_64.tgz
-  w_ov_link: https://storage.openvinotoolkit.org/repositories/openvino/packages/nightly/2024.3.0-15945-a349dc82f9a/w_openvino_toolkit_windows_2024.3.0.dev20240708_x86_64.zip
+  l_ov_centos_link: https://storage.openvinotoolkit.org/repositories/openvino/packages/pre-release/2024.3.0rc1/linux/l_openvino_toolkit_centos7_2024.3.0.dev20240711_x86_64.tgz
+  m_ov_link: https://storage.openvinotoolkit.org/repositories/openvino/packages/pre-release/2024.3.0rc1/macos/m_openvino_toolkit_macos_12_6_2024.3.0.dev20240711_x86_64.tgz
+  w_ov_link: https://storage.openvinotoolkit.org/repositories/openvino/packages/pre-release/2024.3.0rc1/windows/w_openvino_toolkit_windows_2024.3.0.dev20240711_x86_64.zip
 jobs:
   ubuntu_genai_python_lib:
     # A tokenizers' dependency fails to compile on ubuntu-20 n CenOS7 env.
@@ -29,7 +29,7 @@
     - run: sudo ./ov/install_dependencies/install_openvino_dependencies.sh
     - run: source ./ov/setupvars.sh && cmake -DCMAKE_BUILD_TYPE=Release -S ./ -B ./build/
     - run: source ./ov/setupvars.sh && cmake --build ./build/ --config Release -j
-    - run: source ./ov/setupvars.sh && python -m pip install ./thirdparty/openvino_tokenizers/[transformers] -r ./tests/python_tests/requirements.txt --pre --extra-index-url https://storage.openvinotoolkit.org/simple/wheels/nightly --upgrade-strategy eager
+    - run: source ./ov/setupvars.sh && python -m pip install ./thirdparty/openvino_tokenizers/[transformers] -r ./tests/python_tests/requirements.txt --extra-index-url https://storage.openvinotoolkit.org/simple/wheels/pre-release --upgrade-strategy eager
     - run: source ./ov/setupvars.sh && PYTHONPATH=./build/:$PYTHONPATH python -m pytest ./tests/python_tests/
     - run: source ./ov/setupvars.sh && python -m pip install . --verbose
     - run: python -m pytest ./tests/python_tests/
@@ -52,7 +52,7 @@
     - run: brew install coreutils scons
     - run: source ./ov/setupvars.sh && cmake -DCMAKE_BUILD_TYPE=Release -S ./ -B ./build/
     - run: source ./ov/setupvars.sh && cmake --build ./build/ --config Release -j
-    - run: source ./ov/setupvars.sh && python -m pip install ./thirdparty/openvino_tokenizers/[transformers] -r ./tests/python_tests/requirements.txt --pre --extra-index-url https://storage.openvinotoolkit.org/simple/wheels/nightly --upgrade-strategy eager
+    - run: source ./ov/setupvars.sh && python -m pip install ./thirdparty/openvino_tokenizers/[transformers] -r ./tests/python_tests/requirements.txt --extra-index-url https://storage.openvinotoolkit.org/simple/wheels/pre-release --upgrade-strategy eager
     - run: source ./ov/setupvars.sh && PYTHONPATH=./build/:$PYTHONPATH python -m pytest ./tests/python_tests/
     - run: source ./ov/setupvars.sh && python -m pip install . --verbose
     - run: python -c "from openvino_genai import LLMPipeline"
@@ -79,12 +79,9 @@
         unzip -d ov ov.zip
         dirs=(ov/*) && mv ov/*/* ov && rmdir "${dirs[@]}"
       shell: bash
-    - name: Install dependencies and build
-      run: |
-        call .\ov\setupvars.bat
-        python -m pip install ./thirdparty/openvino_tokenizers/[transformers] -r ./tests/python_tests/requirements.txt --pre --extra-index-url https://storage.openvinotoolkit.org/simple/wheels/nightly --upgrade-strategy eager
-        cmake -DCMAKE_BUILD_TYPE=Release -S ./ -B ./build/
-        cmake --build ./build/ --config Release -j
+    - run: call ./ov/setupvars.bat && cmake -DCMAKE_BUILD_TYPE=Release -S ./ -B ./build/
+    - run: call ./ov/setupvars.bat && cmake --build ./build/ --config Release -j
+    - run: call ./ov/setupvars.bat && python -m pip install ./thirdparty/openvino_tokenizers/[transformers] -r ./tests/python_tests/requirements.txt --extra-index-url https://storage.openvinotoolkit.org/simple/wheels/pre-release --upgrade-strategy eager
     # cmd evaluates variables in a different way. Setting PYTHONPATH before setupvars.bat instead of doing that after solves that.
     - run: set "PYTHONPATH=./build/" && call ./ov/setupvars.bat && python -m pytest ./tests/python_tests/
     - run: call ./ov/setupvars.bat && python -m pip install . --verbose

.github/workflows/lcm_dreamshaper_cpp.yml

+4-4
@@ -50,8 +50,8 @@ jobs:
       working-directory: ${{ env.working_directory }}
       run: |
         conda activate openvino_lcm_cpp
-        python -m pip install ../../../thirdparty/openvino_tokenizers/[transformers] --pre --extra-index-url https://storage.openvinotoolkit.org/simple/wheels/nightly
-        python -m pip install -r ../../requirements.txt --pre --extra-index-url https://storage.openvinotoolkit.org/simple/wheels/nightly
+        python -m pip install ../../../thirdparty/openvino_tokenizers/[transformers] --extra-index-url https://storage.openvinotoolkit.org/simple/wheels/pre-release
+        python -m pip install -r ../../requirements.txt --extra-index-url https://storage.openvinotoolkit.org/simple/wheels/pre-release
 
     - name: Download and convert model and tokenizer
       working-directory: ${{ env.working_directory }}
@@ -95,8 +95,8 @@
       working-directory: ${{ env.working_directory }}
       run: |
         conda activate openvino_lcm_cpp
-        python -m pip install ../../../thirdparty/openvino_tokenizers/[transformers] --pre --extra-index-url https://storage.openvinotoolkit.org/simple/wheels/nightly
-        python -m pip install -r ../../requirements.txt --pre --extra-index-url https://storage.openvinotoolkit.org/simple/wheels/nightly
+        python -m pip install ../../../thirdparty/openvino_tokenizers/[transformers] --extra-index-url https://storage.openvinotoolkit.org/simple/wheels/pre-release
+        python -m pip install -r ../../requirements.txt --extra-index-url https://storage.openvinotoolkit.org/simple/wheels/pre-release
 
     - name: Download and convert model and tokenizer
       working-directory: ${{ env.working_directory }}

.github/workflows/stable_diffusion_1_5_cpp.yml

+2-2
@@ -49,8 +49,8 @@
       working-directory: ${{ env.working_directory }}
       run: |
         conda activate openvino_sd_cpp
-        python -m pip install ../../../thirdparty/openvino_tokenizers/[transformers] --pre --extra-index-url https://storage.openvinotoolkit.org/simple/wheels/nightly
-        python -m pip install -r ../../requirements.txt --pre --extra-index-url https://storage.openvinotoolkit.org/simple/wheels/nightly
+        python -m pip install ../../../thirdparty/openvino_tokenizers/[transformers] --extra-index-url https://storage.openvinotoolkit.org/simple/wheels/pre-release
+        python -m pip install -r ../../requirements.txt --extra-index-url https://storage.openvinotoolkit.org/simple/wheels/pre-release
 
     - name: Download and convert model and tokenizer
       working-directory: ${{ env.working_directory }}

CMakeLists.txt

+16
@@ -57,11 +57,27 @@ if(ENABLE_PYTHON)
     endif()
 endif()
 
+if(ENABLE_PYTHON)
+    # the following two calls are required for cross-compilation
+    if(OpenVINODeveloperPackage_DIR)
+        ov_find_python3(REQUIRED)
+        ov_detect_python_module_extension()
+    else()
+        if(CMAKE_VERSION VERSION_GREATER_EQUAL 3.18)
+            find_package(Python3 REQUIRED COMPONENTS Interpreter Development.Module)
+        else()
+            find_package(Python3 REQUIRED COMPONENTS Interpreter Development)
+        endif()
+    endif()
+endif()
+
 add_subdirectory(thirdparty)
 add_subdirectory(src)
 add_subdirectory(samples)
 add_subdirectory(tests/cpp)
 
+install(FILES LICENSE DESTINATION docs/licensing COMPONENT licensing_genai RENAME LICENSE-GENAI)
+install(FILES third-party-programs.txt DESTINATION docs/licensing COMPONENT licensing_genai RENAME third-party-programs-genai.txt)
 install(FILES LICENSE DESTINATION docs/licensing COMPONENT licensing_genai RENAME LICENSE-GENAI)
 install(FILES third-party-programs.txt DESTINATION docs/licensing COMPONENT licensing_genai RENAME third-party-programs-genai.txt)
 set(CPACK_ARCHIVE_COMPONENT_INSTALL ON)

samples/cpp/beam_search_causal_lm/README.md

+1-1
@@ -1,4 +1,4 @@
-# Text generation C++ sample that supports most popular models like LLaMA 2
+# Text generation C++ sample that supports most popular models like LLaMA 3
 
 This example showcases inference of text-generation Large Language Models (LLMs): `chatglm`, `LLaMA`, `Qwen` and other models with the same signature. The application doesn't have many configuration options to encourage the reader to explore and modify the source code. It's only possible to change the device for inference to a differnt one, GPU for example, from the command line interface. The sample fearures `ov::genai::LLMPipeline` and configures it to use multiple beam grops. There is also a Jupyter [notebook](https://github.com/openvinotoolkit/openvino_notebooks/tree/latest/notebooks/llm-chatbot) which provides an example of LLM-powered Chatbot in Python.
 

samples/cpp/chat_sample/README.md

+1-1
@@ -1,4 +1,4 @@
-# C++ chat_sample that supports most popular models like LLaMA 2
+# C++ chat_sample that supports most popular models like LLaMA 3
 
 This example showcases inference of text-generation Large Language Models (LLMs): `chatglm`, `LLaMA`, `Qwen` and other models with the same signature. The application doesn't have many configuration options to encourage the reader to explore and modify the source code. For example, change the device for inference to GPU. The sample fearures `ov::genai::LLMPipeline` and configures it for the chat scenario. There is also a Jupyter [notebook](https://github.com/openvinotoolkit/openvino_notebooks/tree/latest/notebooks/llm-chatbot) which provides an example of LLM-powered Chatbot in Python.
 
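For context, the chat scenario this README configures reduces to a handful of calls in the project's Python binding. A minimal sketch, assuming a model already exported with optimum-cli as in the workflows above (the path is illustrative):

```python
import openvino_genai

# Illustrative path: a model exported with optimum-cli, as in the workflows above.
pipe = openvino_genai.LLMPipeline("TinyLlama-1.1B-Chat-v1.0", "CPU")

config = openvino_genai.GenerationConfig()
config.max_new_tokens = 100

pipe.start_chat()   # keeps the KV-cache and chat history across generate() calls
while True:
    prompt = input("question:\n")
    if not prompt:
        break
    print(pipe.generate(prompt, config))
pipe.finish_chat()  # drops the accumulated history
```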

samples/cpp/greedy_causal_lm/README.md

+1-1
@@ -1,4 +1,4 @@
-# Text generation C++ greedy_causal_lm that supports most popular models like LLaMA 2
+# Text generation C++ greedy_causal_lm that supports most popular models like LLaMA 3
 
 This example showcases inference of text-generation Large Language Models (LLMs): `chatglm`, `LLaMA`, `Qwen` and other models with the same signature. The application doesn't have many configuration options to encourage the reader to explore and modify the source code. For example, change the device for inference to GPU. The sample fearures `ov::genai::LLMPipeline` and configures it to run the simplest deterministic greedy sampling algorithm. There is also a Jupyter [notebook](https://github.com/openvinotoolkit/openvino_notebooks/tree/latest/notebooks/llm-chatbot) which provides an example of LLM-powered Chatbot in Python.
 
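Since greedy decoding is the library's default behavior (a single beam, no sampling), the setup this README describes needs almost no configuration. A rough Python sketch (model path illustrative):

```python
import openvino_genai

pipe = openvino_genai.LLMPipeline("TinyLlama-1.1B-Chat-v1.0", "CPU")  # illustrative path

# Greedy decoding is the default, so only the output length needs to be set.
config = openvino_genai.GenerationConfig()
config.max_new_tokens = 100

print(pipe.generate("Why is the Sun yellow?", config))
```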

samples/cpp/multinomial_causal_lm/CMakeLists.txt

+1-1
@@ -11,7 +11,7 @@ set_target_properties(multinomial_causal_lm PROPERTIES
     COMPILE_PDB_NAME multinomial_causal_lm
     # Ensure out of box LC_RPATH on macOS with SIP
     INSTALL_RPATH_USE_LINK_PATH ON)
-target_compile_features(greedy_causal_lm PRIVATE cxx_std_11)
+target_compile_features(multinomial_causal_lm PRIVATE cxx_std_11)
 install(TARGETS multinomial_causal_lm
     RUNTIME DESTINATION samples_bin/
     COMPONENT samples_bin

samples/cpp/multinomial_causal_lm/README.md

+1-1
@@ -1,4 +1,4 @@
-# Text generation C++ multinomial_causal_lm that supports most popular models like LLaMA 2
+# Text generation C++ multinomial_causal_lm that supports most popular models like LLaMA 3
 
 This example showcases inference of text-generation Large Language Models (LLMs): `chatglm`, `LLaMA`, `Qwen` and other models with the same signature. The application doesn't have many configuration options to encourage the reader to explore and modify the source code. For example, change the device for inference to GPU. The sample fearures `ov::genai::LLMPipeline` and configures it to run random sampling algorithm. There is also a Jupyter [notebook](https://github.com/openvinotoolkit/openvino_notebooks/tree/latest/notebooks/llm-chatbot) which provides an example of LLM-powered Chatbot in Python.
 
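The random-sampling setup this README refers to maps onto a few `GenerationConfig` fields in the Python binding. The values below are illustrative, not the sample's exact settings:

```python
import openvino_genai

pipe = openvino_genai.LLMPipeline("TinyLlama-1.1B-Chat-v1.0", "CPU")  # illustrative path

config = openvino_genai.GenerationConfig()
config.do_sample = True      # switch from greedy to multinomial (random) sampling
config.temperature = 0.8     # sharpen or flatten the next-token distribution
config.top_k = 50            # sample only among the 50 most likely tokens
config.top_p = 0.9           # nucleus cutoff on cumulative probability
config.max_new_tokens = 100

print(pipe.generate("What is OpenVINO?", config))
```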

samples/cpp/prompt_lookup_decoding_lm/README.md

+1-1
@@ -1,4 +1,4 @@
-# prompt_lookup_decoding_lm C++ sample that supports most popular models like LLaMA 2
+# prompt_lookup_decoding_lm C++ sample that supports most popular models like LLaMA 3
 
 [Prompt Lookup decoding](https://github.com/apoorvumang/prompt-lookup-decoding) is [assested-generation](https://huggingface.co/blog/assisted-generation#understanding-text-generation-latency) technique where the draft model is replaced with simple string matching the prompt to generate candidate token sequences. This method highly effective for input grounded generation (summarization, document QA, multi-turn chat, code editing), where there is high n-gram overlap between LLM input (prompt) and LLM output. This could be entity names, phrases, or code chunks that the LLM directly copies from the input while generating the output. Prompt lookup exploits this pattern to speed up autoregressive decoding in LLMs. This results in significant speedups with no effect on output quality.
 
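The n-gram matching the README describes needs no draft model at all. A toy Python sketch of how candidate tokens are pulled from the prompt (the function name and token ids are hypothetical, for illustration only):

```python
def prompt_lookup_candidates(tokens, ngram_size=3, num_candidates=5):
    # Take the last ngram_size tokens and scan backwards for an earlier
    # occurrence of the same n-gram in the sequence.
    tail = tokens[-ngram_size:]
    for start in range(len(tokens) - ngram_size - 1, -1, -1):
        if tokens[start:start + ngram_size] == tail:
            # The tokens that followed the match become draft candidates
            # for the main model to verify in a single forward pass.
            follow = tokens[start + ngram_size:start + ngram_size + num_candidates]
            if follow:
                return follow
    return []  # no n-gram overlap: fall back to plain autoregressive decoding

ids = [1, 7, 9, 4, 2, 7, 9, 4, 8, 5, 7, 9, 4]
print(prompt_lookup_candidates(ids))  # -> [8, 5, 7, 9, 4]
```

If the main model accepts the whole candidate run, several tokens are produced for the price of one forward pass, which is where the speedup comes from.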

samples/cpp/speculative_decoding_lm/README.md

+1-1
@@ -1,4 +1,4 @@
-# speculative_decoding_lm C++ sample that supports most popular models like LLaMA 2
+# speculative_decoding_lm C++ sample that supports most popular models like LLaMA 3
 
 Speculative decoding (or [assisted-generation](https://huggingface.co/blog/assisted-generation#understanding-text-generation-latency) in HF terminology) is a recent technique, that allows to speed up token generation when an additional smaller draft model is used alonside with the main model.
 
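One speculative step under a simple greedy-verification scheme can be sketched in plain Python; `draft_model` and `main_model` are hypothetical callables standing in for the two networks:

```python
def speculative_step(main_model, draft_model, tokens, k=4):
    # 1. The cheap draft model proposes k tokens autoregressively.
    ctx = list(tokens)
    draft = []
    for _ in range(k):
        t = draft_model(ctx)            # hypothetical: greedy next-token id
        draft.append(t)
        ctx.append(t)
    # 2. The main model checks all k positions in a single forward pass,
    #    returning its own greedy token at each of those positions.
    main = main_model(list(tokens), draft)  # hypothetical signature
    # 3. Keep the agreeing prefix. The first disagreement is replaced by the
    #    main model's token, so every step accepts at least one token and
    #    the output matches what the main model would have produced alone.
    accepted = []
    for d, m in zip(draft, main):
        accepted.append(m)
        if d != m:
            break
    return tokens + accepted
```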

samples/python/beam_search_causal_lm/README.md

+1-1
@@ -1,4 +1,4 @@
-# Text generation Python sample that supports most popular models like LLaMA 2
+# Text generation Python sample that supports most popular models like LLaMA 3
 
 This example showcases inference of text-generation Large Language Models (LLMs): `chatglm`, `LLaMA`, `Qwen` and other models with the same signature. The application doesn't have many configuration options to encourage the reader to explore and modify the source code. It's only possible to change the device for inference to a differnt one, GPU for example, from the command line interface. The sample fearures `openvino_genai.LLMPipeline` and configures it to use multiple beam grops. There is also a Jupyter [notebook](https://github.com/openvinotoolkit/openvino_notebooks/tree/latest/notebooks/llm-chatbot) which provides an example of LLM-powered Chatbot in Python.
 
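The beam groups mentioned above are ordinary `GenerationConfig` fields in `openvino_genai`. A minimal sketch with illustrative values (the path is whatever optimum-cli exported):

```python
import openvino_genai

pipe = openvino_genai.LLMPipeline("TinyLlama-1.1B-Chat-v1.0", "CPU")  # illustrative path

config = openvino_genai.GenerationConfig()
config.max_new_tokens = 20
config.num_beams = 15            # total beams kept alive
config.num_beam_groups = 3       # split into 3 groups of 5 beams
config.diversity_penalty = 1.0   # push the groups toward different continuations
config.num_return_sequences = config.num_beams

print(pipe.generate("What is OpenVINO?", config))
```

Grouped beam search trades a wider search for diversity: beams within a group compete as usual, while the diversity penalty discourages different groups from converging on the same continuation.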
