-
Notifications
You must be signed in to change notification settings - Fork 46
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor: performance optimization in tracker announce #751
Refactor: performance optimization in tracker announce #751
Conversation
ACK 59a68d3 |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## develop #751 +/- ##
===========================================
- Coverage 74.06% 74.02% -0.05%
===========================================
Files 145 145
Lines 8958 8923 -35
===========================================
- Hits 6635 6605 -30
+ Misses 2323 2318 -5 ☔ View full report in Codecov by Sentry. |
cd9f82c
to
4a17fa1
Compare
The tracker announce method uses two locks: 1. To write the new peer in the torrent entry peer list (write lock). 2. To read the peers list from the torrent entry (read lock). I was wondering if the repo would be faster removing the second lock since we can get the peer list in the first lock. This are the benchmarking results: ```output tokio::sync::RwLock<std::collections::BTreeMap<InfoHash, Entry>> add_one_torrent: Avg/AdjAvg: (57ns, 59ns) update_one_torrent_in_parallel: Avg/AdjAvg: (15.769541ms, 16.69948ms) add_multiple_torrents_in_parallel: Avg/AdjAvg: (20.6541ms, 20.010245ms) update_multiple_torrents_in_parallel: Avg/AdjAvg: (18.326633ms, 18.326633ms) std::sync::RwLock<std::collections::BTreeMap<InfoHash, Entry>> add_one_torrent: Avg/AdjAvg: (38ns, 39ns) update_one_torrent_in_parallel: Avg/AdjAvg: (7.154301ms, 7.296907ms) add_multiple_torrents_in_parallel: Avg/AdjAvg: (8.370377ms, 8.370377ms) update_multiple_torrents_in_parallel: Avg/AdjAvg: (6.902327ms, 6.902327ms) std::sync::RwLock<std::collections::BTreeMap<InfoHash, Arc<std::sync::Mutex<Entry>>>> add_one_torrent: Avg/AdjAvg: (44ns, 40ns) update_one_torrent_in_parallel: Avg/AdjAvg: (6.023789ms, 6.023789ms) add_multiple_torrents_in_parallel: Avg/AdjAvg: (12.306719ms, 12.306719ms) update_multiple_torrents_in_parallel: Avg/AdjAvg: (8.429203ms, 8.429203ms) tokio::sync::RwLock<std::collections::BTreeMap<InfoHash, Arc<std::sync::Mutex<Entry>>>> add_one_torrent: Avg/AdjAvg: (71ns, 69ns) update_one_torrent_in_parallel: Avg/AdjAvg: (7.022376ms, 7.022376ms) add_multiple_torrents_in_parallel: Avg/AdjAvg: (28.125712ms, 27.811162ms) update_multiple_torrents_in_parallel: Avg/AdjAvg: (8.908545ms, 9.036288ms) tokio::sync::RwLock<std::collections::BTreeMap<InfoHash, Arc<tokio::sync::Mutex<Entry>>>> add_one_torrent: Avg/AdjAvg: (85ns, 81ns) update_one_torrent_in_parallel: Avg/AdjAvg: (15.285006ms, 15.555171ms) add_multiple_torrents_in_parallel: Avg/AdjAvg: (29.794849ms, 30.254531ms) update_multiple_torrents_in_parallel: Avg/AdjAvg: (9.302051ms, 9.302051ms) ``` ```output Requests out: 384701.13/second Responses in: 345642.53/second - Connect responses: 171189.76 - Announce responses: 171187.17 - Scrape responses: 3265.60 - Error responses: 0.00 Peers per announce response: 0.00 Announce responses per info hash: - p10: 1 - p25: 1 - p50: 1 - p75: 1 - p90: 2 - p95: 3 - p99: 104 - p99.9: 281 - p100: 347 ``` ```output tokio::sync::RwLock<std::collections::BTreeMap<InfoHash, Entry>> add_one_torrent: Avg/AdjAvg: (64ns, 64ns) update_one_torrent_in_parallel: Avg/AdjAvg: (15.652934ms, 15.54617ms) add_multiple_torrents_in_parallel: Avg/AdjAvg: (23.834278ms, 22.935421ms) update_multiple_torrents_in_parallel: Avg/AdjAvg: (17.973855ms, 18.28944ms) std::sync::RwLock<std::collections::BTreeMap<InfoHash, Entry>> add_one_torrent: Avg/AdjAvg: (43ns, 39ns) update_one_torrent_in_parallel: Avg/AdjAvg: (6.362147ms, 6.362147ms) add_multiple_torrents_in_parallel: Avg/AdjAvg: (8.869102ms, 8.869102ms) update_multiple_torrents_in_parallel: Avg/AdjAvg: (6.589264ms, 6.510441ms) std::sync::RwLock<std::collections::BTreeMap<InfoHash, Arc<std::sync::Mutex<Entry>>>> add_one_torrent: Avg/AdjAvg: (48ns, 49ns) update_one_torrent_in_parallel: Avg/AdjAvg: (6.194211ms, 6.194211ms) add_multiple_torrents_in_parallel: Avg/AdjAvg: (12.50352ms, 12.50352ms) update_multiple_torrents_in_parallel: Avg/AdjAvg: (8.411279ms, 8.411279ms) tokio::sync::RwLock<std::collections::BTreeMap<InfoHash, Arc<std::sync::Mutex<Entry>>>> add_one_torrent: Avg/AdjAvg: (79ns, 79ns) update_one_torrent_in_parallel: Avg/AdjAvg: (6.966462ms, 6.966462ms) add_multiple_torrents_in_parallel: Avg/AdjAvg: (27.584361ms, 27.411202ms) update_multiple_torrents_in_parallel: Avg/AdjAvg: (6.965519ms, 7.474346ms) tokio::sync::RwLock<std::collections::BTreeMap<InfoHash, Arc<tokio::sync::Mutex<Entry>>>> add_one_torrent: Avg/AdjAvg: (97ns, 97ns) update_one_torrent_in_parallel: Avg/AdjAvg: (16.046499ms, 16.430313ms) add_multiple_torrents_in_parallel: Avg/AdjAvg: (28.42327ms, 27.876115ms) update_multiple_torrents_in_parallel: Avg/AdjAvg: (6.576256ms, 6.326548ms) ``` Only in a couple of cases is a little bit faster. But I think the code is simpler. However, the UDP load test looks better. From 171187.17 announce requests to 185922.40. ```output Requests out: 416413.93/second Responses in: 374655.61/second - Connect responses: 185740.08 - Announce responses: 185922.40 - Scrape responses: 2993.13 - Error responses: 0.00 Peers per announce response: 0.00 Announce responses per info hash: - p10: 1 - p25: 1 - p50: 1 - p75: 1 - p90: 2 - p95: 3 - p99: 105 - p99.9: 301 - p100: 365 ``` In general, It's not a great performance increase. I would merge it only because the code is cleaner. This is the initial version but with a cleaner code. It's the final version in this PR. - Write lock to update the peer list. - Read lock to get stats and the peer list. ```output tokio::sync::RwLock<std::collections::BTreeMap<InfoHash, Entry>> add_one_torrent: Avg/AdjAvg: (75ns, 73ns) update_one_torrent_in_parallel: Avg/AdjAvg: (13.987718ms, 13.987718ms) add_multiple_torrents_in_parallel: Avg/AdjAvg: (18.375705ms, 17.887158ms) update_multiple_torrents_in_parallel: Avg/AdjAvg: (18.242291ms, 18.242291ms) std::sync::RwLock<std::collections::BTreeMap<InfoHash, Entry>> add_one_torrent: Avg/AdjAvg: (55ns, 54ns) update_one_torrent_in_parallel: Avg/AdjAvg: (9.550877ms, 9.80194ms) add_multiple_torrents_in_parallel: Avg/AdjAvg: (7.861103ms, 7.861103ms) update_multiple_torrents_in_parallel: Avg/AdjAvg: (7.80655ms, 7.80655ms) std::sync::RwLock<std::collections::BTreeMap<InfoHash, Arc<std::sync::Mutex<Entry>>>> add_one_torrent: Avg/AdjAvg: (61ns, 59ns) update_one_torrent_in_parallel: Avg/AdjAvg: (9.649314ms, 9.521965ms) add_multiple_torrents_in_parallel: Avg/AdjAvg: (12.225382ms, 12.582905ms) update_multiple_torrents_in_parallel: Avg/AdjAvg: (10.700225ms, 10.700225ms) tokio::sync::RwLock<std::collections::BTreeMap<InfoHash, Arc<std::sync::Mutex<Entry>>>> add_one_torrent: Avg/AdjAvg: (83ns, 82ns) update_one_torrent_in_parallel: Avg/AdjAvg: (12.059674ms, 12.059674ms) add_multiple_torrents_in_parallel: Avg/AdjAvg: (23.095544ms, 24.209862ms) update_multiple_torrents_in_parallel: Avg/AdjAvg: (11.731907ms, 11.731907ms) tokio::sync::RwLock<std::collections::BTreeMap<InfoHash, Arc<tokio::sync::Mutex<Entry>>>> add_one_torrent: Avg/AdjAvg: (105ns, 104ns) update_one_torrent_in_parallel: Avg/AdjAvg: (12.257331ms, 12.257331ms) add_multiple_torrents_in_parallel: Avg/AdjAvg: (23.032795ms, 23.541457ms) update_multiple_torrents_in_parallel: Avg/AdjAvg: (11.880602ms, 11.880602ms) ``` ```output Requests out: 416171.50/second Responses in: 374553.95/second - Connect responses: 186039.97 - Announce responses: 185587.24 - Scrape responses: 2926.74 - Error responses: 0.00 Peers per announce response: 0.00 Announce responses per info hash: - p10: 1 - p25: 1 - p50: 1 - p75: 1 - p90: 2 - p95: 3 - p99: 105 - p99.9: 301 - p100: 393 ```
4a17fa1
to
59a68d3
Compare
I would prefer to merge #690 before this. I've just got to finish a couple more tests and then it is ready to review. (I will prob need to do rebase to squash a few comments also.) |
Hey, yes I didn't want to merge it before you finish yours. Anyway It's not an important change. Take a look when you have time and let me know if you think the code is simpler. Regarding performance It looks like it's not a big change. |
I'm closing this. I think it's a good refactor. I would like to decouple peer lists from statistics. Statistics could even be enabled/disabled in config. Since the performance is the same, I would prefer to split the method In short, I think it would be better because:
|
The tracker announce method uses two locks:
I was wondering if the repo would be faster removing the second lock since we can get the peer list in the first lock.
These are the benchmarking results:
Before the refactor
With only a write lock
Only in a couple of cases is a little bit faster. But I think the code is simpler.
However, the UDP load test looks better. From 171187.17 announce requests to 185922.40.
In general, It's not a great performance increase. I would merge it only because the code is cleaner.
With independent write and read lock
This is the initial version but with a cleaner code. It's the final version in this PR.