Commit 3b5eee2

Add an asyncio-based load generator (#935)
With this commit we add a new experimental subcommand `race-async` to Rally. It allows specifying significantly more clients than the current `race` subcommand because, under the hood, `race-async` uses `asyncio` and runs all clients in a single event loop. In contrast, `race` uses an actor system under the hood and maps each client to one process. As the new subcommand is very experimental and not yet meant to be used broadly, there is no accompanying user documentation in this PR. Instead, we plan to build on top of this PR and expand the load generator to take advantage of multiple cores before we consider this usable in production (it will likely keep its experimental status though). In this PR we also implement a compatibility layer in the current load generator so that both now work internally with `asyncio`. Consequently, we have already adapted all Rally tracks with a backwards-compatibility layer (see elastic/rally-tracks#97 and elastic/rally-eventdata-track#80). Closes #852 Relates #916
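To make the single-event-loop model described above more concrete, here is a minimal sketch. It is not Rally's load generator: `simulated_client` and `run_clients` are hypothetical names, and `asyncio.sleep(0)` merely stands in for awaiting an Elasticsearch request.

```python
import asyncio


async def simulated_client(client_id, requests_per_client):
    # Hypothetical stand-in for one load generator client.
    completed = 0
    for _ in range(requests_per_client):
        # Yield control to the event loop, as awaiting a network call would.
        await asyncio.sleep(0)
        completed += 1
    return client_id, completed


async def run_clients(num_clients, requests_per_client):
    # All clients are coroutines scheduled on the same event loop,
    # i.e. they share one process and one thread.
    clients = [simulated_client(i, requests_per_client) for i in range(num_clients)]
    return await asyncio.gather(*clients)


results = asyncio.run(run_clients(num_clients=1000, requests_per_client=3))
total_requests = sum(completed for _, completed in results)
```

Because each client is only a coroutine, a thousand of them cost far less than a thousand processes would. The trade-off is that they all share a single core.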
1 parent b33da19 commit 3b5eee2

25 files changed

+2656
-1099
lines changed

create-notice.sh

+7
@@ -43,19 +43,26 @@ function main {
4343
printf "The source code can be obtained at https://github.com/certifi/python-certifi\n" >> "${OUTPUT_FILE}"
4444
add_license "certifi" "https://raw.githubusercontent.com/certifi/python-certifi/master/LICENSE"
4545
add_license "elasticsearch" "https://raw.githubusercontent.com/elastic/elasticsearch-py/master/LICENSE"
46+
add_license "elasticsearch-async" "https://raw.githubusercontent.com/elastic/elasticsearch-py-async/master/LICENSE"
4647
add_license "jinja2" "https://raw.githubusercontent.com/pallets/jinja/master/LICENSE.rst"
4748
add_license "jsonschema" "https://raw.githubusercontent.com/Julian/jsonschema/master/COPYING"
4849
add_license "psutil" "https://raw.githubusercontent.com/giampaolo/psutil/master/LICENSE"
4950
add_license "py-cpuinfo" "https://raw.githubusercontent.com/workhorsy/py-cpuinfo/master/LICENSE"
5051
add_license "tabulate" "https://bitbucket.org/astanin/python-tabulate/raw/03182bf9b8a2becbc54d17aa7e3e7dfed072c5f5/LICENSE"
5152
add_license "thespian" "https://raw.githubusercontent.com/kquick/Thespian/master/LICENSE.txt"
5253
add_license "boto3" "https://raw.githubusercontent.com/boto/boto3/develop/LICENSE"
54+
add_license "yappi" "https://raw.githubusercontent.com/sumerc/yappi/master/LICENSE"
55+
add_license "ijson" "https://raw.githubusercontent.com/ICRAR/ijson/master/LICENSE.txt"
5356

5457
# transitive dependencies
5558
# Jinja2 -> Markupsafe
5659
add_license "Markupsafe" "https://raw.githubusercontent.com/pallets/markupsafe/master/LICENSE.rst"
5760
# elasticsearch -> urllib3
5861
add_license "urllib3" "https://raw.githubusercontent.com/shazow/urllib3/master/LICENSE.txt"
62+
# elasticsearch_async -> aiohttp
63+
add_license "aiohttp" "https://raw.githubusercontent.com/aio-libs/aiohttp/master/LICENSE.txt"
64+
# elasticsearch_async -> async_timeout
65+
add_license "async_timeout" "https://raw.githubusercontent.com/aio-libs/async-timeout/master/LICENSE"
5966
# boto3 -> s3transfer
6067
add_license "s3transfer" "https://raw.githubusercontent.com/boto/s3transfer/develop/LICENSE.txt"
6168
# boto3 -> jmespath

docs/adding_tracks.rst

+30-16
@@ -881,17 +881,15 @@ In ``track.json`` set the ``operation-type`` to "percolate" (you can choose this
881881

882882
Then create a file ``track.py`` next to ``track.json`` and implement the following two functions::
883883

884-
def percolate(es, params):
885-
es.percolate(
886-
index="queries",
887-
doc_type="content",
888-
body=params["body"]
889-
)
890-
884+
async def percolate(es, params):
885+
await es.percolate(
886+
index="queries",
887+
doc_type="content",
888+
body=params["body"]
889+
)
891890

892891
def register(registry):
893-
registry.register_runner("percolate", percolate)
894-
892+
registry.register_runner("percolate", percolate, async_runner=True)
895893

896894
The function ``percolate`` is the actual runner and takes the following parameters:
897895

@@ -906,11 +904,25 @@ This function can return:
906904

907905
Similar to a parameter source you also need to bind the name of your operation type to the function within ``register``.
908906

907+
To illustrate how to use custom return values, suppose we want to implement a custom runner that calls the `pending tasks API <https://www.elastic.co/guide/en/elasticsearch/reference/current/cluster-pending.html>`_ and returns the number of pending tasks as additional meta-data::
908+
909+
async def pending_tasks(es, params):
910+
response = await es.cluster.pending_tasks()
911+
return {
912+
"weight": 1,
913+
"unit": "ops",
914+
"pending-tasks-count": len(response["tasks"])
915+
}
916+
917+
def register(registry):
918+
registry.register_runner("pending-tasks", pending_tasks, async_runner=True)
919+
920+
909921
If you need more control, you can also implement a runner class. The example above, implemented as a class looks as follows::
910922

911923
class PercolateRunner:
912-
def __call__(self, es, params):
913-
es.percolate(
924+
async def __call__(self, es, params):
925+
await es.percolate(
914926
index="queries",
915927
doc_type="content",
916928
body=params["body"]
@@ -920,10 +932,12 @@ If you need more control, you can also implement a runner class. The example abo
920932
return "percolate"
921933

922934
def register(registry):
923-
registry.register_runner("percolate", PercolateRunner())
935+
registry.register_runner("percolate", PercolateRunner(), async_runner=True)
936+
924937

938+
The actual runner is implemented in the method ``__call__`` and the same return value conventions apply as for functions. For debugging purposes you should also implement ``__repr__`` and provide a human-readable name for your runner. Finally, you need to register your runner in the ``register`` function.
925939

926-
The actual runner is implemented in the method ``__call__`` and the same return value conventions apply as for functions. For debugging purposes you should also implement ``__repr__`` and provide a human-readable name for your runner. Finally, you need to register your runner in the ``register`` function. Runners also support Python's `context manager <https://docs.python.org/3/library/stdtypes.html#typecontextmanager>`_ interface. Rally uses a new context for each request. Implementing the context manager interface can be handy for cleanup of resources after executing an operation. Rally uses it, for example, to clear open scrolls.
940+
Runners also support Python's `asynchronous context manager <https://docs.python.org/3/reference/datamodel.html#async-context-managers>`_ interface. Rally uses a new context for each request. Implementing the asynchronous context manager interface can be handy for cleanup of resources after executing an operation. Rally uses it, for example, to clear open scrolls.
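A minimal sketch of a runner that implements the asynchronous context manager interface for cleanup might look as follows. It rests on assumptions: `FakeClient`, `ScrollingRunner` and `execute_once` are hypothetical names, the fake client only imitates an async Elasticsearch client, and `execute_once` merely approximates how a new context could be entered for each request.

```python
import asyncio


class FakeClient:
    # Hypothetical stand-in for the async Elasticsearch client.
    def __init__(self):
        self.open_scrolls = set()

    async def search(self, scroll_id):
        self.open_scrolls.add(scroll_id)
        return {"_scroll_id": scroll_id}

    async def clear_scroll(self, scroll_id):
        self.open_scrolls.discard(scroll_id)


class ScrollingRunner:
    def __init__(self):
        self.es = None
        self.scroll_id = None

    async def __aenter__(self):
        return self

    async def __call__(self, es, params):
        self.es = es
        response = await es.search(params["scroll-id"])
        # Remember the resource so that __aexit__ can release it.
        self.scroll_id = response["_scroll_id"]
        return {"weight": 1, "unit": "ops"}

    async def __aexit__(self, exc_type, exc_val, exc_tb):
        # Runs after the operation, even if it raised an exception.
        if self.scroll_id is not None:
            await self.es.clear_scroll(self.scroll_id)
            self.scroll_id = None
        # Returning False lets exceptions propagate to the caller.
        return False

    def __repr__(self, *args, **kwargs):
        return "scrolling-runner"


async def execute_once(runner, es, params):
    # Assumption: one context per request.
    async with runner:
        return await runner(es, params)


client = FakeClient()
result = asyncio.run(execute_once(ScrollingRunner(), client, {"scroll-id": "abc"}))
```

After `execute_once` returns, `client.open_scrolls` is empty again because `__aexit__` released the scroll.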
927941

928942
If you have specified multiple Elasticsearch clusters using :ref:`target-hosts <command_line_reference_advanced_topics>` you can make Rally pass a dictionary of client connections instead of one for the ``default`` cluster in the ``es`` parameter.
929943

@@ -938,14 +952,14 @@ Example (assuming Rally has been invoked specifying ``default`` and ``remote`` i
938952
class CreateIndexInRemoteCluster:
939953
multi_cluster = True
940954

941-
def __call__(self, es, params):
942-
es['remote'].indices.create(index='remote-index')
955+
async def __call__(self, es, params):
956+
await es["remote"].indices.create(index="remote-index")
943957

944958
def __repr__(self, *args, **kwargs):
945959
return "create-index-in-remote-cluster"
946960

947961
def register(registry):
948-
registry.register_runner("create-index-in-remote-cluster", CreateIndexInRemoteCluster())
962+
registry.register_runner("create-index-in-remote-cluster", CreateIndexInRemoteCluster(), async_runner=True)
949963

950964

951965
.. note::

docs/migrate.rst

+49
@@ -9,6 +9,55 @@ Minimum Python version is 3.8.0
99

1010
Rally 1.5.0 requires Python 3.8.0. Check the :ref:`updated installation instructions <install_python>` for more details.
1111

12+
Meta-data for queries is omitted
13+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14+
15+
Rally 1.5.0 does not determine query meta-data anymore by default to reduce the risk of client-side bottlenecks. The following meta-data fields are affected:
16+
17+
* ``hits``
18+
* ``hits_relation``
19+
* ``timed_out``
20+
* ``took``
21+
22+
If you still want to retrieve them (risking skewed results due to additional overhead), set the new property ``detailed-results`` to ``true`` for any operation of type ``search``.
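To illustrate where these four fields come from, here is a rough sketch that derives them from a search response body. It is not Rally's implementation: `extract_detailed_results` is a hypothetical helper, and the sample response is hand-written for illustration.

```python
def extract_detailed_results(response):
    # Hypothetical helper: pulls the four meta-data fields out of a parsed search response.
    total = response["hits"]["total"]
    if isinstance(total, dict):
        # Elasticsearch 7 and later report the total as an object with a relation.
        hits, hits_relation = total["value"], total["relation"]
    else:
        # Older versions report a plain integer, which is always exact.
        hits, hits_relation = total, "eq"
    return {
        "hits": hits,
        "hits_relation": hits_relation,
        "timed_out": response["timed_out"],
        "took": response["took"],
    }


# Hand-written sample response, for illustration only.
sample_response = {
    "took": 5,
    "timed_out": False,
    "hits": {"total": {"value": 10000, "relation": "gte"}, "hits": []},
}
detailed = extract_detailed_results(sample_response)
```

Parsing the response body for every request is the client-side overhead that motivates leaving these fields out by default.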
23+
24+
Runner API uses asyncio
25+
^^^^^^^^^^^^^^^^^^^^^^^
26+
27+
In order to support more concurrent clients in the future, Rally is moving from a synchronous model to an asynchronous model internally. With Rally 1.5.0 all custom runners need to be implemented using async APIs and a new bool argument ``async_runner=True`` needs to be provided upon registration. Below is an example of how to migrate a custom runner function.
28+
29+
A custom runner prior to Rally 1.5.0::
30+
31+
def percolate(es, params):
32+
es.percolate(
33+
index="queries",
34+
doc_type="content",
35+
body=params["body"]
36+
)
37+
38+
def register(registry):
39+
registry.register_runner("percolate", percolate)
40+
41+
With Rally 1.5.0, the implementation changes as follows::
42+
43+
async def percolate(es, params):
44+
await es.percolate(
45+
index="queries",
46+
doc_type="content",
47+
body=params["body"]
48+
)
49+
50+
def register(registry):
51+
registry.register_runner("percolate", percolate, async_runner=True)
52+
53+
Apply the following changes to each custom runner:
54+
55+
* Prefix the function signature with ``async``.
56+
* Add an ``await`` keyword before each Elasticsearch API call.
57+
* Add ``async_runner=True`` as the last argument to the ``register_runner`` function.
58+
59+
For more details please refer to the updated documentation on :ref:`custom runners <adding_tracks_custom_runners>`.
60+
1261
Migrating to Rally 1.4.1
1362
------------------------
1463

docs/track.rst

+8
@@ -402,9 +402,17 @@ With the operation type ``search`` you can execute `request body searches <http:
402402
2. Rally will not attempt to serialize the parameters and pass them as is. Always use "true" / "false" strings for boolean parameters (see example below).
403403

404404
* ``body`` (mandatory): The query body.
405+
* ``detailed-results`` (optional, defaults to ``false``): Records more detailed meta-data about queries. As it analyzes the corresponding response in more detail, this might incur additional overhead which can skew measurement results. This flag is ineffective for scroll queries.
405406
* ``pages`` (optional): Number of pages to retrieve. If this parameter is present, a scroll query will be executed. If you want to retrieve all result pages, use the value "all".
406407
* ``results-per-page`` (optional): Number of documents to retrieve per page for scroll queries.
407408

409+
If ``detailed-results`` is set to ``true``, the following meta-data properties will be determined and stored:
410+
411+
* ``hits``
412+
* ``hits_relation``
413+
* ``timed_out``
414+
* ``took``
415+
408416
Example::
409417

410418
{

esrally/async_connection.py

+134
@@ -0,0 +1,134 @@
1+
import asyncio
2+
import ssl
3+
import warnings
4+
5+
import aiohttp
6+
from aiohttp.client_exceptions import ServerFingerprintMismatch
7+
import async_timeout
8+
9+
from elasticsearch.exceptions import ConnectionError, ConnectionTimeout, ImproperlyConfigured, SSLError
10+
from elasticsearch.connection import Connection
11+
from elasticsearch.compat import urlencode
12+
from elasticsearch.connection.http_urllib3 import create_ssl_context
13+
14+
15+
# This is only needed because https://github.com/elastic/elasticsearch-py-async/pull/68 is not merged yet
16+
# In addition we have raised the connection limit in TCPConnector from 100 to 100000.
17+
18+
# We want to keep the diff as small as possible thus suppressing pylint warnings that we would not allow in Rally
19+
# pylint: disable=W0706
20+
class AIOHttpConnection(Connection):
21+
def __init__(self, host='localhost', port=9200, http_auth=None,
22+
use_ssl=False, verify_certs=False, ca_certs=None, client_cert=None,
23+
client_key=None, loop=None, use_dns_cache=True, headers=None,
24+
ssl_context=None, trace_config=None, **kwargs):
25+
super().__init__(host=host, port=port, **kwargs)
26+
27+
self.loop = asyncio.get_event_loop() if loop is None else loop
28+
29+
if http_auth is not None:
30+
if isinstance(http_auth, str):
31+
http_auth = tuple(http_auth.split(':', 1))
32+
33+
if isinstance(http_auth, (tuple, list)):
34+
http_auth = aiohttp.BasicAuth(*http_auth)
35+
36+
headers = headers or {}
37+
headers.setdefault('content-type', 'application/json')
38+
39+
# if providing an SSL context, raise error if any other SSL related flag is used
40+
if ssl_context and (verify_certs or ca_certs):
41+
raise ImproperlyConfigured("When using `ssl_context`, `use_ssl`, `verify_certs`, `ca_certs` are not permitted")
42+
43+
if use_ssl or ssl_context:
44+
cafile = ca_certs
45+
if not cafile and not ssl_context and verify_certs:
46+
# If no ca_certs and no sslcontext passed and asking to verify certs
47+
# raise error
48+
raise ImproperlyConfigured("Root certificates are missing for certificate "
49+
"validation. Either pass them in using the ca_certs parameter or "
50+
"install certifi to use it automatically.")
51+
if verify_certs or ca_certs:
52+
warnings.warn('Use of `verify_certs`, `ca_certs` have been deprecated in favor of using SSLContext`', DeprecationWarning)
53+
54+
if not ssl_context:
55+
# if SSLContext hasn't been passed in, create one.
56+
# need to skip if sslContext isn't avail
57+
try:
58+
ssl_context = create_ssl_context(cafile=cafile)
59+
except AttributeError:
60+
ssl_context = None
61+
62+
if not verify_certs and ssl_context is not None:
63+
ssl_context.check_hostname = False
64+
ssl_context.verify_mode = ssl.CERT_NONE
65+
warnings.warn(
66+
'Connecting to %s using SSL with verify_certs=False is insecure.' % host)
67+
if ssl_context:
68+
verify_certs = True
69+
use_ssl = True
70+
71+
trace_configs = [trace_config] if trace_config else None
72+
73+
self.session = aiohttp.ClientSession(
74+
auth=http_auth,
75+
timeout=self.timeout,
76+
connector=aiohttp.TCPConnector(
77+
loop=self.loop,
78+
verify_ssl=verify_certs,
79+
use_dns_cache=use_dns_cache,
80+
ssl_context=ssl_context,
81+
# this has been changed from the default (100)
82+
limit=100000
83+
),
84+
headers=headers,
85+
trace_configs=trace_configs
86+
)
87+
88+
self.base_url = 'http%s://%s:%d%s' % (
89+
's' if use_ssl else '',
90+
host, port, self.url_prefix
91+
)
92+
93+
@asyncio.coroutine
94+
def close(self):
95+
yield from self.session.close()
96+
97+
@asyncio.coroutine
98+
def perform_request(self, method, url, params=None, body=None, timeout=None, ignore=(), headers=None):
99+
url_path = url
100+
if params:
101+
url_path = '%s?%s' % (url, urlencode(params or {}))
102+
url = self.base_url + url_path
103+
104+
start = self.loop.time()
105+
response = None
106+
try:
107+
with async_timeout.timeout(timeout or self.timeout.total, loop=self.loop):
108+
response = yield from self.session.request(method, url, data=body, headers=headers)
109+
raw_data = yield from response.text()
110+
duration = self.loop.time() - start
111+
112+
except asyncio.CancelledError:
113+
raise
114+
115+
except Exception as e:
116+
self.log_request_fail(method, url, url_path, body, self.loop.time() - start, exception=e)
117+
if isinstance(e, ServerFingerprintMismatch):
118+
raise SSLError('N/A', str(e), e)
119+
if isinstance(e, asyncio.TimeoutError):
120+
raise ConnectionTimeout('TIMEOUT', str(e), e)
121+
raise ConnectionError('N/A', str(e), e)
122+
123+
finally:
124+
if response is not None:
125+
yield from response.release()
126+
127+
# raise errors based on http status codes, let the client handle those if needed
128+
if not (200 <= response.status < 300) and response.status not in ignore:
129+
self.log_request_fail(method, url, url_path, body, duration, status_code=response.status, response=raw_data)
130+
self._raise_error(response.status, raw_data)
131+
132+
self.log_request_success(method, url, url_path, body, response.status, raw_data, duration)
133+
134+
return response.status, response.headers, raw_data
