Refactor: performance optimization in tracker announce #751

josecelano · 2024-03-22T09:05:52Z

The tracker announce method uses two locks:

To write the new peer in the torrent entry peer list (write lock).
To read the peers list from the torrent entry (read lock).

I was wondering if the repo would be faster removing the second lock since we can get the peer list in the first lock.

These are the benchmarking results:

Before the refactor

tokio::sync::RwLock<std::collections::BTreeMap<InfoHash, Entry>> add_one_torrent: Avg/AdjAvg: (57ns, 59ns)
update_one_torrent_in_parallel: Avg/AdjAvg: (15.769541ms, 16.69948ms) add_multiple_torrents_in_parallel: Avg/AdjAvg: (20.6541ms, 20.010245ms) update_multiple_torrents_in_parallel: Avg/AdjAvg: (18.326633ms, 18.326633ms)

std::sync::RwLock<std::collections::BTreeMap<InfoHash, Entry>> add_one_torrent: Avg/AdjAvg: (38ns, 39ns)
update_one_torrent_in_parallel: Avg/AdjAvg: (7.154301ms, 7.296907ms) add_multiple_torrents_in_parallel: Avg/AdjAvg: (8.370377ms, 8.370377ms) update_multiple_torrents_in_parallel: Avg/AdjAvg: (6.902327ms, 6.902327ms)

std::sync::RwLock<std::collections::BTreeMap<InfoHash, Arc<std::sync::Mutex<Entry>>>> add_one_torrent: Avg/AdjAvg: (44ns, 40ns)
update_one_torrent_in_parallel: Avg/AdjAvg: (6.023789ms, 6.023789ms) add_multiple_torrents_in_parallel: Avg/AdjAvg: (12.306719ms, 12.306719ms) update_multiple_torrents_in_parallel: Avg/AdjAvg: (8.429203ms, 8.429203ms)

tokio::sync::RwLock<std::collections::BTreeMap<InfoHash, Arc<std::sync::Mutex<Entry>>>> add_one_torrent: Avg/AdjAvg: (71ns, 69ns)
update_one_torrent_in_parallel: Avg/AdjAvg: (7.022376ms, 7.022376ms) add_multiple_torrents_in_parallel: Avg/AdjAvg: (28.125712ms, 27.811162ms) update_multiple_torrents_in_parallel: Avg/AdjAvg: (8.908545ms, 9.036288ms)

tokio::sync::RwLock<std::collections::BTreeMap<InfoHash, Arc<tokio::sync::Mutex<Entry>>>> add_one_torrent: Avg/AdjAvg: (85ns, 81ns)
update_one_torrent_in_parallel: Avg/AdjAvg: (15.285006ms, 15.555171ms) add_multiple_torrents_in_parallel: Avg/AdjAvg: (29.794849ms, 30.254531ms) update_multiple_torrents_in_parallel: Avg/AdjAvg: (9.302051ms, 9.302051ms)

Requests out: 384701.13/second
Responses in: 345642.53/second
  - Connect responses:  171189.76
  - Announce responses: 171187.17
  - Scrape responses:   3265.60
  - Error responses:    0.00
Peers per announce response: 0.00
Announce responses per info hash:
  - p10: 1
  - p25: 1
  - p50: 1
  - p75: 1
  - p90: 2
  - p95: 3
  - p99: 104
  - p99.9: 281
  - p100: 347

With only a write lock

tokio::sync::RwLock<std::collections::BTreeMap<InfoHash, Entry>> add_one_torrent: Avg/AdjAvg: (64ns, 64ns)
update_one_torrent_in_parallel: Avg/AdjAvg: (15.652934ms, 15.54617ms) add_multiple_torrents_in_parallel: Avg/AdjAvg: (23.834278ms, 22.935421ms) update_multiple_torrents_in_parallel: Avg/AdjAvg: (17.973855ms, 18.28944ms)

std::sync::RwLock<std::collections::BTreeMap<InfoHash, Entry>> add_one_torrent: Avg/AdjAvg: (43ns, 39ns)
update_one_torrent_in_parallel: Avg/AdjAvg: (6.362147ms, 6.362147ms) add_multiple_torrents_in_parallel: Avg/AdjAvg: (8.869102ms, 8.869102ms) update_multiple_torrents_in_parallel: Avg/AdjAvg: (6.589264ms, 6.510441ms)

std::sync::RwLock<std::collections::BTreeMap<InfoHash, Arc<std::sync::Mutex<Entry>>>> add_one_torrent: Avg/AdjAvg: (48ns, 49ns)
update_one_torrent_in_parallel: Avg/AdjAvg: (6.194211ms, 6.194211ms) add_multiple_torrents_in_parallel: Avg/AdjAvg: (12.50352ms, 12.50352ms) update_multiple_torrents_in_parallel: Avg/AdjAvg: (8.411279ms, 8.411279ms)

tokio::sync::RwLock<std::collections::BTreeMap<InfoHash, Arc<std::sync::Mutex<Entry>>>> add_one_torrent: Avg/AdjAvg: (79ns, 79ns)
update_one_torrent_in_parallel: Avg/AdjAvg: (6.966462ms, 6.966462ms) add_multiple_torrents_in_parallel: Avg/AdjAvg: (27.584361ms, 27.411202ms) update_multiple_torrents_in_parallel: Avg/AdjAvg: (6.965519ms, 7.474346ms)

tokio::sync::RwLock<std::collections::BTreeMap<InfoHash, Arc<tokio::sync::Mutex<Entry>>>> add_one_torrent: Avg/AdjAvg: (97ns, 97ns)
update_one_torrent_in_parallel: Avg/AdjAvg: (16.046499ms, 16.430313ms) add_multiple_torrents_in_parallel: Avg/AdjAvg: (28.42327ms, 27.876115ms) update_multiple_torrents_in_parallel: Avg/AdjAvg: (6.576256ms, 6.326548ms)

Only in a couple of cases is a little bit faster. But I think the code is simpler.

However, the UDP load test looks better. From 171187.17 announce requests to 185922.40.

Requests out: 416413.93/second
Responses in: 374655.61/second
  - Connect responses:  185740.08
  - Announce responses: 185922.40
  - Scrape responses:   2993.13
  - Error responses:    0.00
Peers per announce response: 0.00
Announce responses per info hash:
  - p10: 1
  - p25: 1
  - p50: 1
  - p75: 1
  - p90: 2
  - p95: 3
  - p99: 105
  - p99.9: 301
  - p100: 365

In general, It's not a great performance increase. I would merge it only because the code is cleaner.

With independent write and read lock

This is the initial version but with a cleaner code. It's the final version in this PR.

Write lock to update the peer list.
Read lock to get stats and the peer list.

tokio::sync::RwLock<std::collections::BTreeMap<InfoHash, Entry>>
add_one_torrent: Avg/AdjAvg: (75ns, 73ns)
update_one_torrent_in_parallel: Avg/AdjAvg: (13.987718ms, 13.987718ms)
add_multiple_torrents_in_parallel: Avg/AdjAvg: (18.375705ms, 17.887158ms)
update_multiple_torrents_in_parallel: Avg/AdjAvg: (18.242291ms, 18.242291ms)

std::sync::RwLock<std::collections::BTreeMap<InfoHash, Entry>>
add_one_torrent: Avg/AdjAvg: (55ns, 54ns)
update_one_torrent_in_parallel: Avg/AdjAvg: (9.550877ms, 9.80194ms)
add_multiple_torrents_in_parallel: Avg/AdjAvg: (7.861103ms, 7.861103ms)
update_multiple_torrents_in_parallel: Avg/AdjAvg: (7.80655ms, 7.80655ms)

std::sync::RwLock<std::collections::BTreeMap<InfoHash, Arc<std::sync::Mutex<Entry>>>>
add_one_torrent: Avg/AdjAvg: (61ns, 59ns)
update_one_torrent_in_parallel: Avg/AdjAvg: (9.649314ms, 9.521965ms)
add_multiple_torrents_in_parallel: Avg/AdjAvg: (12.225382ms, 12.582905ms)
update_multiple_torrents_in_parallel: Avg/AdjAvg: (10.700225ms, 10.700225ms)

tokio::sync::RwLock<std::collections::BTreeMap<InfoHash, Arc<std::sync::Mutex<Entry>>>>
add_one_torrent: Avg/AdjAvg: (83ns, 82ns)
update_one_torrent_in_parallel: Avg/AdjAvg: (12.059674ms, 12.059674ms)
add_multiple_torrents_in_parallel: Avg/AdjAvg: (23.095544ms, 24.209862ms)
update_multiple_torrents_in_parallel: Avg/AdjAvg: (11.731907ms, 11.731907ms)

tokio::sync::RwLock<std::collections::BTreeMap<InfoHash, Arc<tokio::sync::Mutex<Entry>>>>
add_one_torrent: Avg/AdjAvg: (105ns, 104ns)
update_one_torrent_in_parallel: Avg/AdjAvg: (12.257331ms, 12.257331ms)
add_multiple_torrents_in_parallel: Avg/AdjAvg: (23.032795ms, 23.541457ms)
update_multiple_torrents_in_parallel: Avg/AdjAvg: (11.880602ms, 11.880602ms)

Requests out: 416171.50/second
Responses in: 374553.95/second
  - Connect responses:  186039.97
  - Announce responses: 185587.24
  - Scrape responses:   2926.74
  - Error responses:    0.00
Peers per announce response: 0.00
Announce responses per info hash:
  - p10: 1
  - p25: 1
  - p50: 1
  - p75: 1
  - p90: 2
  - p95: 3
  - p99: 105
  - p99.9: 301
  - p100: 393

josecelano · 2024-03-22T09:06:32Z

ACK 59a68d3

codecov · 2024-03-22T09:12:45Z

Codecov Report

Attention: Patch coverage is 60.34483% with 23 lines in your changes are missing coverage. Please review.

Project coverage is 74.02%. Comparing base (3bd2a9c) to head (59a68d3).
Report is 10 commits behind head on develop.

Files	Patch %	Lines
src/core/torrent/repository.rs	46.87%	17 Missing ⚠️
.../torrent-repository-benchmarks/src/benches/asyn.rs	0.00%	6 Missing ⚠️

Additional details and impacted files

@@             Coverage Diff             @@
##           develop     #751      +/-   ##
===========================================
- Coverage    74.06%   74.02%   -0.05%     
===========================================
  Files          145      145              
  Lines         8958     8923      -35     
===========================================
- Hits          6635     6605      -30     
+ Misses        2323     2318       -5

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

The tracker announce method uses two locks: 1. To write the new peer in the torrent entry peer list (write lock). 2. To read the peers list from the torrent entry (read lock). I was wondering if the repo would be faster removing the second lock since we can get the peer list in the first lock. This are the benchmarking results: ```output tokio::sync::RwLock<std::collections::BTreeMap<InfoHash, Entry>> add_one_torrent: Avg/AdjAvg: (57ns, 59ns) update_one_torrent_in_parallel: Avg/AdjAvg: (15.769541ms, 16.69948ms) add_multiple_torrents_in_parallel: Avg/AdjAvg: (20.6541ms, 20.010245ms) update_multiple_torrents_in_parallel: Avg/AdjAvg: (18.326633ms, 18.326633ms) std::sync::RwLock<std::collections::BTreeMap<InfoHash, Entry>> add_one_torrent: Avg/AdjAvg: (38ns, 39ns) update_one_torrent_in_parallel: Avg/AdjAvg: (7.154301ms, 7.296907ms) add_multiple_torrents_in_parallel: Avg/AdjAvg: (8.370377ms, 8.370377ms) update_multiple_torrents_in_parallel: Avg/AdjAvg: (6.902327ms, 6.902327ms) std::sync::RwLock<std::collections::BTreeMap<InfoHash, Arc<std::sync::Mutex<Entry>>>> add_one_torrent: Avg/AdjAvg: (44ns, 40ns) update_one_torrent_in_parallel: Avg/AdjAvg: (6.023789ms, 6.023789ms) add_multiple_torrents_in_parallel: Avg/AdjAvg: (12.306719ms, 12.306719ms) update_multiple_torrents_in_parallel: Avg/AdjAvg: (8.429203ms, 8.429203ms) tokio::sync::RwLock<std::collections::BTreeMap<InfoHash, Arc<std::sync::Mutex<Entry>>>> add_one_torrent: Avg/AdjAvg: (71ns, 69ns) update_one_torrent_in_parallel: Avg/AdjAvg: (7.022376ms, 7.022376ms) add_multiple_torrents_in_parallel: Avg/AdjAvg: (28.125712ms, 27.811162ms) update_multiple_torrents_in_parallel: Avg/AdjAvg: (8.908545ms, 9.036288ms) tokio::sync::RwLock<std::collections::BTreeMap<InfoHash, Arc<tokio::sync::Mutex<Entry>>>> add_one_torrent: Avg/AdjAvg: (85ns, 81ns) update_one_torrent_in_parallel: Avg/AdjAvg: (15.285006ms, 15.555171ms) add_multiple_torrents_in_parallel: Avg/AdjAvg: (29.794849ms, 30.254531ms) update_multiple_torrents_in_parallel: Avg/AdjAvg: (9.302051ms, 9.302051ms) ``` ```output Requests out: 384701.13/second Responses in: 345642.53/second - Connect responses: 171189.76 - Announce responses: 171187.17 - Scrape responses: 3265.60 - Error responses: 0.00 Peers per announce response: 0.00 Announce responses per info hash: - p10: 1 - p25: 1 - p50: 1 - p75: 1 - p90: 2 - p95: 3 - p99: 104 - p99.9: 281 - p100: 347 ``` ```output tokio::sync::RwLock<std::collections::BTreeMap<InfoHash, Entry>> add_one_torrent: Avg/AdjAvg: (64ns, 64ns) update_one_torrent_in_parallel: Avg/AdjAvg: (15.652934ms, 15.54617ms) add_multiple_torrents_in_parallel: Avg/AdjAvg: (23.834278ms, 22.935421ms) update_multiple_torrents_in_parallel: Avg/AdjAvg: (17.973855ms, 18.28944ms) std::sync::RwLock<std::collections::BTreeMap<InfoHash, Entry>> add_one_torrent: Avg/AdjAvg: (43ns, 39ns) update_one_torrent_in_parallel: Avg/AdjAvg: (6.362147ms, 6.362147ms) add_multiple_torrents_in_parallel: Avg/AdjAvg: (8.869102ms, 8.869102ms) update_multiple_torrents_in_parallel: Avg/AdjAvg: (6.589264ms, 6.510441ms) std::sync::RwLock<std::collections::BTreeMap<InfoHash, Arc<std::sync::Mutex<Entry>>>> add_one_torrent: Avg/AdjAvg: (48ns, 49ns) update_one_torrent_in_parallel: Avg/AdjAvg: (6.194211ms, 6.194211ms) add_multiple_torrents_in_parallel: Avg/AdjAvg: (12.50352ms, 12.50352ms) update_multiple_torrents_in_parallel: Avg/AdjAvg: (8.411279ms, 8.411279ms) tokio::sync::RwLock<std::collections::BTreeMap<InfoHash, Arc<std::sync::Mutex<Entry>>>> add_one_torrent: Avg/AdjAvg: (79ns, 79ns) update_one_torrent_in_parallel: Avg/AdjAvg: (6.966462ms, 6.966462ms) add_multiple_torrents_in_parallel: Avg/AdjAvg: (27.584361ms, 27.411202ms) update_multiple_torrents_in_parallel: Avg/AdjAvg: (6.965519ms, 7.474346ms) tokio::sync::RwLock<std::collections::BTreeMap<InfoHash, Arc<tokio::sync::Mutex<Entry>>>> add_one_torrent: Avg/AdjAvg: (97ns, 97ns) update_one_torrent_in_parallel: Avg/AdjAvg: (16.046499ms, 16.430313ms) add_multiple_torrents_in_parallel: Avg/AdjAvg: (28.42327ms, 27.876115ms) update_multiple_torrents_in_parallel: Avg/AdjAvg: (6.576256ms, 6.326548ms) ``` Only in a couple of cases is a little bit faster. But I think the code is simpler. However, the UDP load test looks better. From 171187.17 announce requests to 185922.40. ```output Requests out: 416413.93/second Responses in: 374655.61/second - Connect responses: 185740.08 - Announce responses: 185922.40 - Scrape responses: 2993.13 - Error responses: 0.00 Peers per announce response: 0.00 Announce responses per info hash: - p10: 1 - p25: 1 - p50: 1 - p75: 1 - p90: 2 - p95: 3 - p99: 105 - p99.9: 301 - p100: 365 ``` In general, It's not a great performance increase. I would merge it only because the code is cleaner. This is the initial version but with a cleaner code. It's the final version in this PR. - Write lock to update the peer list. - Read lock to get stats and the peer list. ```output tokio::sync::RwLock<std::collections::BTreeMap<InfoHash, Entry>> add_one_torrent: Avg/AdjAvg: (75ns, 73ns) update_one_torrent_in_parallel: Avg/AdjAvg: (13.987718ms, 13.987718ms) add_multiple_torrents_in_parallel: Avg/AdjAvg: (18.375705ms, 17.887158ms) update_multiple_torrents_in_parallel: Avg/AdjAvg: (18.242291ms, 18.242291ms) std::sync::RwLock<std::collections::BTreeMap<InfoHash, Entry>> add_one_torrent: Avg/AdjAvg: (55ns, 54ns) update_one_torrent_in_parallel: Avg/AdjAvg: (9.550877ms, 9.80194ms) add_multiple_torrents_in_parallel: Avg/AdjAvg: (7.861103ms, 7.861103ms) update_multiple_torrents_in_parallel: Avg/AdjAvg: (7.80655ms, 7.80655ms) std::sync::RwLock<std::collections::BTreeMap<InfoHash, Arc<std::sync::Mutex<Entry>>>> add_one_torrent: Avg/AdjAvg: (61ns, 59ns) update_one_torrent_in_parallel: Avg/AdjAvg: (9.649314ms, 9.521965ms) add_multiple_torrents_in_parallel: Avg/AdjAvg: (12.225382ms, 12.582905ms) update_multiple_torrents_in_parallel: Avg/AdjAvg: (10.700225ms, 10.700225ms) tokio::sync::RwLock<std::collections::BTreeMap<InfoHash, Arc<std::sync::Mutex<Entry>>>> add_one_torrent: Avg/AdjAvg: (83ns, 82ns) update_one_torrent_in_parallel: Avg/AdjAvg: (12.059674ms, 12.059674ms) add_multiple_torrents_in_parallel: Avg/AdjAvg: (23.095544ms, 24.209862ms) update_multiple_torrents_in_parallel: Avg/AdjAvg: (11.731907ms, 11.731907ms) tokio::sync::RwLock<std::collections::BTreeMap<InfoHash, Arc<tokio::sync::Mutex<Entry>>>> add_one_torrent: Avg/AdjAvg: (105ns, 104ns) update_one_torrent_in_parallel: Avg/AdjAvg: (12.257331ms, 12.257331ms) add_multiple_torrents_in_parallel: Avg/AdjAvg: (23.032795ms, 23.541457ms) update_multiple_torrents_in_parallel: Avg/AdjAvg: (11.880602ms, 11.880602ms) ``` ```output Requests out: 416171.50/second Responses in: 374553.95/second - Connect responses: 186039.97 - Announce responses: 185587.24 - Scrape responses: 2926.74 - Error responses: 0.00 Peers per announce response: 0.00 Announce responses per info hash: - p10: 1 - p25: 1 - p50: 1 - p75: 1 - p90: 2 - p95: 3 - p99: 105 - p99.9: 301 - p100: 393 ```

da2ce7 · 2024-03-22T20:26:49Z

I would prefer to merge #690 before this. I've just got to finish a couple more tests and then it is ready to review. (I will prob need to do rebase to squash a few comments also.)

josecelano · 2024-03-22T20:56:18Z

I would prefer to merge #690 before this. I've just got to finish a couple more tests and then it is ready to review. (I will prob need to do rebase to squash a few comments also.)

Hey, yes I didn't want to merge it before you finish yours. Anyway It's not an important change. Take a look when you have time and let me know if you think the code is simpler. Regarding performance It looks like it's not a big change.

josecelano · 2024-04-12T07:22:52Z

I'm closing this. I think it's a good refactor. I would like to decouple peer lists from statistics. Statistics could even be enabled/disabled in config. Since the performance is the same, I would prefer to split the method update_torrent_with_peer_and_get_stats (command/query segregation).

In short, I think it would be better because:

Stats could be optional.
Methos are simpler command/query segregation.

josecelano requested a review from da2ce7 March 22, 2024 09:05

josecelano temporarily deployed to coverage March 22, 2024 09:05 — with GitHub Actions Inactive

josecelano added the Optimization Make it Faster label Mar 22, 2024

josecelano force-pushed the performance-optimization-announce-merge-update-and-fetch branch from cd9f82c to 4a17fa1 Compare March 22, 2024 09:59

josecelano temporarily deployed to coverage March 22, 2024 09:59 — with GitHub Actions Inactive

josecelano force-pushed the performance-optimization-announce-merge-update-and-fetch branch from 4a17fa1 to 59a68d3 Compare March 22, 2024 11:19

josecelano temporarily deployed to coverage March 22, 2024 11:19 — with GitHub Actions Inactive

josecelano mentioned this pull request Mar 25, 2024

Torrent Repository Work #690

Merged

6 tasks

da2ce7 added the Needs Rebase Base Branch has Incompatibilities label Mar 25, 2024

josecelano closed this Apr 12, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Refactor: performance optimization in tracker announce #751

Refactor: performance optimization in tracker announce #751

josecelano commented Mar 22, 2024 •

edited

Loading

josecelano commented Mar 22, 2024 •

edited

Loading

codecov bot commented Mar 22, 2024 •

edited

Loading

da2ce7 commented Mar 22, 2024

josecelano commented Mar 22, 2024

josecelano commented Apr 12, 2024

Refactor: performance optimization in tracker announce #751

Refactor: performance optimization in tracker announce #751

Conversation

josecelano commented Mar 22, 2024 • edited Loading

Before the refactor

With only a write lock

With independent write and read lock

josecelano commented Mar 22, 2024 • edited Loading

codecov bot commented Mar 22, 2024 • edited Loading

Codecov Report

da2ce7 commented Mar 22, 2024

josecelano commented Mar 22, 2024

josecelano commented Apr 12, 2024

josecelano commented Mar 22, 2024 •

edited

Loading

josecelano commented Mar 22, 2024 •

edited

Loading

codecov bot commented Mar 22, 2024 •

edited

Loading