
Commit 3f953f4

[DOCS] benchmark content restructuring (openvinotoolkit#26918)
1 parent 4043e15 commit 3f953f4

File tree

7 files changed: +273 −290 lines changed

docs/articles_en/about-openvino/performance-benchmarks.rst

+41-70
@@ -16,14 +16,12 @@ Performance Benchmarks
    Getting Performance Numbers <performance-benchmarks/getting-performance-numbers>


-This page presents benchmark results for
+This page presents benchmark results for the
 `Intel® Distribution of OpenVINO™ toolkit <https://software.intel.com/content/www/us/en/develop/tools/openvino-toolkit.html>`__
 and :doc:`OpenVINO Model Server <../openvino-workflow/model-server/ovms_what_is_openvino_model_server>`, for a representative
 selection of public neural networks and Intel® devices. The results may help you decide which
 hardware to use in your applications or plan AI workload for the hardware you have already
 implemented in your solutions. Click the buttons below to see the chosen benchmark data.
-For a more detailed view of performance numbers for generative AI models, check the
-:doc:`Generative AI Benchmark Results <./performance-benchmarks/generative-ai-performance>`

 .. grid:: 1 1 2 2
    :gutter: 4
@@ -36,7 +34,7 @@ For a more detailed view of performance numbers for generative AI models, check
          :outline:
          :expand:

-         :material-regular:`bar_chart;1.4em` OpenVINO Benchmark Graphs
+         :material-regular:`bar_chart;1.4em` OpenVINO Benchmark Graphs (general)

    .. grid-item::

@@ -46,10 +44,35 @@ For a more detailed view of performance numbers for generative AI models, check
          :outline:
          :expand:

-         :material-regular:`bar_chart;1.4em` OVMS Benchmark Graphs
+         :material-regular:`bar_chart;1.4em` OVMS Benchmark Graphs (general)
+
+   .. grid-item::
+
+      .. button-link:: ./performance-benchmarks/generative-ai-performance.html
+         :class: ov-toolkit-benchmark-genai
+         :color: primary
+         :outline:
+         :expand:
+
+         :material-regular:`table_view;1.4em` LLM performance for AI PC
+
+   .. grid-item::
+
+      .. button-link:: #
+         :class: ovms-toolkit-benchmark-llm
+         :color: primary
+         :outline:
+         :expand:
+
+         :material-regular:`bar_chart;1.4em` OVMS for GenAI (coming soon)
+
+
+
+


-Key performance indicators and workload parameters.
+
+**Key performance indicators and workload parameters**

 .. tab-set::

@@ -65,13 +88,13 @@ Key performance indicators and workload parameters.
       .. tab-item:: Latency
          :sync: latency

-         For Vision and NLP models this mhis measures the synchronous execution of inference requests and is reported in
-         milliseconds. Each inference request (for example: preprocess, infer, postprocess) is
-         allowed to complete before the next is started. This performance metric is relevant in
-         usage scenarios where a single image input needs to be acted upon as soon as possible. An
-         example would be the healthcare sector where medical personnel only request analysis of a
-         single ultra sound scanning image or in real-time or near real-time applications for
-         example an industrial robot's response to actions in its environment or obstacle avoidance
+         For Vision and NLP models this measures the synchronous execution of inference requests and
+         is reported in milliseconds. Each inference request (for example: preprocess, infer,
+         postprocess) is allowed to complete before the next one starts. This performance metric is
+         relevant in usage scenarios where a single image input needs to be acted upon as soon as
+         possible. An example would be the healthcare sector where medical personnel only request
+         analysis of a single ultra sound scanning image or in real-time or near real-time applications
+         such as an industrial robot's response to actions in its environment or obstacle avoidance
          for autonomous vehicles.
          For Transformer models like Stable-Diffusion this measures the time it takes to convert the prompt
          or input text into a finished image. It is presented in seconds.
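The latency KPI described in this hunk (each synchronous request runs to completion before the next one starts, reported in milliseconds) can be sketched generically. This is a minimal illustration, not OpenVINO's benchmark application; the helper name `measure_sync_latency` and the stand-in workload are hypothetical.

```python
import statistics
import time

def measure_sync_latency(infer_fn, request, warmup=2, iterations=10):
    """Time synchronous inference: each request completes before the
    next one starts; latency is reported in milliseconds."""
    for _ in range(warmup):              # warm-up runs are not measured
        infer_fn(request)
    samples_ms = []
    for _ in range(iterations):
        start = time.perf_counter()
        infer_fn(request)                # preprocess/infer/postprocess happen here
        samples_ms.append((time.perf_counter() - start) * 1000.0)
    median_ms = statistics.median(samples_ms)
    fps = 1000.0 / median_ms if median_ms > 0 else float("inf")
    return median_ms, fps

# Stand-in workload; in practice this would wrap a compiled model's
# synchronous infer call.
latency_ms, fps = measure_sync_latency(lambda x: sum(i * i for i in x), range(1000))
```

Reporting the median (or another percentile) rather than the mean is a common way to keep a single slow outlier from skewing the result.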
@@ -97,9 +120,10 @@ Key performance indicators and workload parameters.
      * input token length: 1024 (the tokens for GenAI models are in English).


-.. raw:: html
+**Platforms, Configurations, Methodology**

-   <h2>Platforms, Configurations, Methodology</h2>
+To see the methodology used to obtain the numbers and learn how to test performance yourself,
+see the guide on :doc:`getting performance numbers <performance-benchmarks/getting-performance-numbers>`.

 For a listing of all platforms and configurations used for testing, refer to the following:

@@ -130,59 +154,10 @@ For a listing of all platforms and configurations used for testing, refer to the
          :material-regular:`download;1.5em` Click for Performance Data [XLSX]


-The OpenVINO benchmark setup includes a single system with OpenVINO™, as well as the benchmark
-application installed. It measures the time spent on actual inference (excluding any pre or post
-processing) and then reports on the inferences per second (or Frames Per Second).
-
-OpenVINO™ Model Server (OVMS) employs the Intel® Distribution of OpenVINO™ toolkit runtime
-libraries and exposes a set of models via a convenient inference API over gRPC or HTTP/REST.
-Its benchmark results are measured with the configuration of multiple-clients-single-server,
-using two hardware platforms connected by ethernet. Network bandwidth depends on both platforms
-and models used. It is set not to be a bottleneck for workload intensity. The connection is
-dedicated only to measuring performance.
-
-.. dropdown:: See more details about OVMS benchmark setup
-
-   The benchmark setup for OVMS consists of four main parts:

-   .. image:: ../assets/images/performance_benchmarks_ovms_02.png
-      :alt: OVMS Benchmark Setup Diagram

-   * **OpenVINO™ Model Server** is launched as a docker container on the server platform and it
-     listens to (and answers) requests from clients. OpenVINO™ Model Server is run on the same
-     system as the OpenVINO™ toolkit benchmark application in corresponding benchmarking. Models
-     served by OpenVINO™ Model Server are located in a local file system mounted into the docker
-     container. The OpenVINO™ Model Server instance communicates with other components via ports
-     over a dedicated docker network.

-   * **Clients** are run in separated physical machine referred to as client platform. Clients
-     are implemented in Python3 programming language based on TensorFlow* API and they work as
-     parallel processes. Each client waits for a response from OpenVINO™ Model Server before it
-     will send a new next request. The role played by the clients is also verification of
-     responses.
-
-   * **Load balancer** works on the client platform in a docker container. HAProxy is used for
-     this purpose. Its main role is counting of requests forwarded from clients to OpenVINO™
-     Model Server, estimating its latency, and sharing this information by Prometheus service.
-     The reason of locating the load balancer on the client site is to simulate real life
-     scenario that includes impact of physical network on reported metrics.
-
-   * **Execution Controller** is launched on the client platform. It is responsible for
-     synchronization of the whole measurement process, downloading metrics from the load
-     balancer, and presenting the final report of the execution.
-
-
-
-.. raw:: html
-
-   <h2>Test performance yourself</h2>
-
-You can also test performance for your system yourself, following the guide on
-:doc:`getting performance numbers <performance-benchmarks/getting-performance-numbers>`.
-
-.. raw:: html
-
-   <h2>Disclaimers</h2>
+**Disclaimers**

 * Intel® Distribution of OpenVINO™ toolkit performance results are based on release
   2024.3, as of July 31, 2024.
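The OVMS setup removed in the hunk above describes a multiple-clients-single-server configuration in which each client sends requests synchronously (waiting for and verifying each response before sending the next) while many clients run in parallel. A minimal, generic sketch of that pattern, using threads and a stand-in for the gRPC/REST call (the `run_benchmark` and `client_worker` helpers are hypothetical, not part of OVMS):

```python
import threading

def client_worker(send_request, n_requests, results, idx):
    """One synchronous client: waits for a response and verifies it
    before issuing the next request."""
    completed = 0
    for _ in range(n_requests):
        reply = send_request()           # blocks until the server responds
        if reply is not None:            # clients also verify responses
            completed += 1
    results[idx] = completed

def run_benchmark(send_request, n_clients=4, n_requests=5):
    """Run several synchronous clients in parallel and report the total
    number of completed requests (the multiple-clients-single-server shape)."""
    results = [0] * n_clients
    clients = [
        threading.Thread(target=client_worker,
                         args=(send_request, n_requests, results, i))
        for i in range(n_clients)
    ]
    for c in clients:
        c.start()
    for c in clients:
        c.join()
    return sum(results)

# Stand-in for a gRPC/HTTP call to a model server.
total_served = run_benchmark(lambda: {"status": "ok"})
```

In the real setup the request counting and latency estimation happen in the HAProxy load balancer rather than in the clients, precisely so that the physical network's impact is included in the reported metrics.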
@@ -192,22 +167,18 @@ You can also test performance for your system yourself, following the guide on

 The results may not reflect all publicly available updates. Intel technologies' features and
 benefits depend on system configuration and may require enabled hardware, software, or service
-activation. Learn more at intel.com, or from the OEM or retailer.
+activation. Learn more at intel.com, the OEM, or retailer.

 See configuration disclosure for details. No product can be absolutely secure.
 Performance varies by use, configuration and other factors. Learn more at
 `www.intel.com/PerformanceIndex <https://www.intel.com/PerformanceIndex>`__.
-Your costs and results may vary.
 Intel optimizations, for Intel compilers or other products, may not optimize to the same degree
 for non-Intel products.




-
-
-
 .. raw:: html

    <link rel="stylesheet" type="text/css" href="../_static/css/benchmark-banner.css">

docs/articles_en/about-openvino/performance-benchmarks/generative-ai-performance.rst

+19-9
@@ -4,7 +4,7 @@ Most Efficient Large Language Models for AI PC
 This page is regularly updated to help you identify the best-performing LLMs on the
 Intel® Core™ Ultra processor family and AI PCs.

-The tables below list the key performance indicators for a selection of Large Language Models,
+The tables below list key performance indicators for a selection of Large Language Models,
 running on an Intel® Core™ Ultra 7-165H based system, on built-in GPUs.


@@ -23,24 +23,34 @@ running on an Intel® Core™ Ultra 7-165H based system, on built-in GPUs.
    :class: modeldata stripe
    :name: supportedModelsTableOv
    :header-rows: 1
-   :file: ../../_static/download/llm_models.csv
+   :file: ../../_static/benchmarks_files/llm_models.csv


-For complete information on the system config, see:
-`Hardware Platforms [PDF] <https://docs.openvino.ai/2024/_static/benchmarks_files/OV-2024.4-platform_list.pdf>`__
-
-To view the data in an editable form, you can download the .csv file here:
-
 .. grid:: 1 1 2 2
    :gutter: 4

    .. grid-item::

-      .. button-link:: ../../_static/download/llm_models.csv
+      All models listed here were tested with the following parameters:
+
+      * Framework: PyTorch
+      * Model precision: INT4
+      * Beam: 1
+      * Batch size: 1
+
+   .. grid-item::
+
+      .. button-link:: https://docs.openvino.ai/2024/_static/benchmarks_files/OV-2024.4-platform_list.pdf
          :color: primary
          :outline:
          :expand:

-         :material-regular:`download;1.5em` Click for OpenVINO LLM results [CSV]
+         :material-regular:`download;1.5em` Get full system info [PDF]
+
+      .. button-link:: ../../_static/benchmarks_files/llm_models.csv
+         :color: primary
+         :outline:
+         :expand:

+         :material-regular:`download;1.5em` Get the data in .csv [CSV]

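The test parameters added in this hunk include "Beam: 1" and "Batch size: 1". With those settings, text generation reduces to greedy decoding: at every step the single most likely next token is taken, with no beam search over alternatives. A toy sketch of that reduction (the `greedy_generate` helper and the 4-token toy model are purely illustrative, not the actual benchmark harness):

```python
def greedy_generate(next_token_logits, prompt, max_new_tokens=3, eos_id=0):
    """With beam size 1 and batch size 1, generation is greedy decoding:
    pick the argmax token at every step until EOS or the token budget."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        logits = next_token_logits(tokens)
        next_id = max(range(len(logits)), key=logits.__getitem__)  # argmax
        if next_id == eos_id:        # stop on end-of-sequence token
            break
        tokens.append(next_id)
    return tokens

# Toy "model" over a 4-token vocabulary: always favors (last token + 1) % 4.
def toy_model(tokens):
    favored = (tokens[-1] + 1) % 4
    return [1.0 if i == favored else 0.0 for i in range(4)]

generated = greedy_generate(toy_model, [1])  # → [1, 2, 3]
```

Beam 1 / batch 1 is the common setting for latency-oriented KPIs such as time to first token, since it mirrors a single interactive user rather than batched serving throughput.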
