Halyard benchmarking -- how to improve? #32
Comments
Update: I did not add the benchmarking results to GitHub; they are available here: https://www.dropbox.com/s/st5sz0hu7eoxj8l/benchmark_results.tar.bz2?dl=0 |
I'll take a look at it. There might be many configuration reasons why HBase does not perform well on a single-node cluster. The cause might also lie in Halyard's query evaluation or in the benchmarking queries. |
I have found better performance when the 'Push' option is not enabled. There are probably issues with some queries (such as path queries) with that option. Have you tested without the Push option enabled? |
Hi, I'm sorry for the late response. I've noticed that our mail system filtered out your mail with the data attached. Could you please send it to my Gmail address? I'll profile and compare both options.
Thanks, Adam
BTW, there may always be a bug in the push strategy. However, because of its multi-threaded architecture, a significant performance degradation for path queries is theoretically possible only on systems very low on resources (1 CPU core, low memory, etc.).
|
@earthquakesan Hi Ivan, have you gotten it to work on a multi-node cluster as well? Thanks |
Hi Adam @asotona!
I have benchmarked Halyard on a single-node setup (i7-3770 3.4GHz, 32GB RAM, regular HDD) running HDFS + YARN + HBase + Halyard. The querying was done via the rdf4j-server SPARQL endpoint, e.g.:
I used the FEASIBLE [1] benchmark queries and IGUANA [2]. The benchmarking configuration is available in the halyard docker repository [3] (iguana-config.tar.bz2).
As you can see from the benchmarking results, Halyard could answer only 6 queries for the smallest size; for the larger sizes (50 and 100) it answered 0 queries.
From preliminary discussions: it is possible to query Halyard through the Java interface, and that should improve performance. Is there an example of how to do that?
[1] http://aksw.org/Projects/FEASIBLE.html
[2] http://aksw.org/Projects/IGUANA.html
[3] https://github.com/earthquakesan/docker-halyard
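For anyone who wants to reproduce a single measurement outside IGUANA, here is a minimal sketch of timing one query against an rdf4j-server SPARQL endpoint using only the Python standard library. It is not the IGUANA setup used above. The host, port, and repository ID in `ENDPOINT` are placeholders, and the helper names `build_request` and `timed_select` are made up for illustration.

```python
import json
import time
import urllib.parse
import urllib.request

# Hypothetical endpoint: host, port and repository ID are placeholders.
ENDPOINT = "http://localhost:8080/rdf4j-server/repositories/halyard"


def build_request(endpoint, query):
    """Build a SPARQL 1.1 Protocol GET request asking for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def timed_select(endpoint, query, timeout=60):
    """Run one SELECT query; return (number of rows, elapsed seconds)."""
    start = time.perf_counter()
    request = build_request(endpoint, query)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        results = json.load(response)
    elapsed = time.perf_counter() - start
    return len(results["results"]["bindings"]), elapsed


if __name__ == "__main__":
    rows, seconds = timed_select(ENDPOINT, "SELECT * WHERE { ?s ?p ?o } LIMIT 10")
    print(f"{rows} rows in {seconds:.3f} s")
```

Running the same query several times and comparing the timings with and without the push strategy would give a rough baseline before digging into the HBase configuration.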