Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor: performance optimization in tracker announce #751

Conversation

josecelano
Copy link
Member

@josecelano josecelano commented Mar 22, 2024

The tracker announce method uses two locks:

  1. To write the new peer in the torrent entry peer list (write lock).
  2. To read the peers list from the torrent entry (read lock).

I was wondering if the repo would be faster removing the second lock since we can get the peer list in the first lock.

These are the benchmarking results:

Before the refactor

tokio::sync::RwLock<std::collections::BTreeMap<InfoHash, Entry>> add_one_torrent: Avg/AdjAvg: (57ns, 59ns)
update_one_torrent_in_parallel: Avg/AdjAvg: (15.769541ms, 16.69948ms) add_multiple_torrents_in_parallel: Avg/AdjAvg: (20.6541ms, 20.010245ms) update_multiple_torrents_in_parallel: Avg/AdjAvg: (18.326633ms, 18.326633ms)

std::sync::RwLock<std::collections::BTreeMap<InfoHash, Entry>> add_one_torrent: Avg/AdjAvg: (38ns, 39ns)
update_one_torrent_in_parallel: Avg/AdjAvg: (7.154301ms, 7.296907ms) add_multiple_torrents_in_parallel: Avg/AdjAvg: (8.370377ms, 8.370377ms) update_multiple_torrents_in_parallel: Avg/AdjAvg: (6.902327ms, 6.902327ms)

std::sync::RwLock<std::collections::BTreeMap<InfoHash, Arc<std::sync::Mutex<Entry>>>> add_one_torrent: Avg/AdjAvg: (44ns, 40ns)
update_one_torrent_in_parallel: Avg/AdjAvg: (6.023789ms, 6.023789ms) add_multiple_torrents_in_parallel: Avg/AdjAvg: (12.306719ms, 12.306719ms) update_multiple_torrents_in_parallel: Avg/AdjAvg: (8.429203ms, 8.429203ms)

tokio::sync::RwLock<std::collections::BTreeMap<InfoHash, Arc<std::sync::Mutex<Entry>>>> add_one_torrent: Avg/AdjAvg: (71ns, 69ns)
update_one_torrent_in_parallel: Avg/AdjAvg: (7.022376ms, 7.022376ms) add_multiple_torrents_in_parallel: Avg/AdjAvg: (28.125712ms, 27.811162ms) update_multiple_torrents_in_parallel: Avg/AdjAvg: (8.908545ms, 9.036288ms)

tokio::sync::RwLock<std::collections::BTreeMap<InfoHash, Arc<tokio::sync::Mutex<Entry>>>> add_one_torrent: Avg/AdjAvg: (85ns, 81ns)
update_one_torrent_in_parallel: Avg/AdjAvg: (15.285006ms, 15.555171ms) add_multiple_torrents_in_parallel: Avg/AdjAvg: (29.794849ms, 30.254531ms) update_multiple_torrents_in_parallel: Avg/AdjAvg: (9.302051ms, 9.302051ms)
Requests out: 384701.13/second
Responses in: 345642.53/second
  - Connect responses:  171189.76
  - Announce responses: 171187.17
  - Scrape responses:   3265.60
  - Error responses:    0.00
Peers per announce response: 0.00
Announce responses per info hash:
  - p10: 1
  - p25: 1
  - p50: 1
  - p75: 1
  - p90: 2
  - p95: 3
  - p99: 104
  - p99.9: 281
  - p100: 347

With only a write lock

tokio::sync::RwLock<std::collections::BTreeMap<InfoHash, Entry>> add_one_torrent: Avg/AdjAvg: (64ns, 64ns)
update_one_torrent_in_parallel: Avg/AdjAvg: (15.652934ms, 15.54617ms) add_multiple_torrents_in_parallel: Avg/AdjAvg: (23.834278ms, 22.935421ms) update_multiple_torrents_in_parallel: Avg/AdjAvg: (17.973855ms, 18.28944ms)

std::sync::RwLock<std::collections::BTreeMap<InfoHash, Entry>> add_one_torrent: Avg/AdjAvg: (43ns, 39ns)
update_one_torrent_in_parallel: Avg/AdjAvg: (6.362147ms, 6.362147ms) add_multiple_torrents_in_parallel: Avg/AdjAvg: (8.869102ms, 8.869102ms) update_multiple_torrents_in_parallel: Avg/AdjAvg: (6.589264ms, 6.510441ms)

std::sync::RwLock<std::collections::BTreeMap<InfoHash, Arc<std::sync::Mutex<Entry>>>> add_one_torrent: Avg/AdjAvg: (48ns, 49ns)
update_one_torrent_in_parallel: Avg/AdjAvg: (6.194211ms, 6.194211ms) add_multiple_torrents_in_parallel: Avg/AdjAvg: (12.50352ms, 12.50352ms) update_multiple_torrents_in_parallel: Avg/AdjAvg: (8.411279ms, 8.411279ms)

tokio::sync::RwLock<std::collections::BTreeMap<InfoHash, Arc<std::sync::Mutex<Entry>>>> add_one_torrent: Avg/AdjAvg: (79ns, 79ns)
update_one_torrent_in_parallel: Avg/AdjAvg: (6.966462ms, 6.966462ms) add_multiple_torrents_in_parallel: Avg/AdjAvg: (27.584361ms, 27.411202ms) update_multiple_torrents_in_parallel: Avg/AdjAvg: (6.965519ms, 7.474346ms)

tokio::sync::RwLock<std::collections::BTreeMap<InfoHash, Arc<tokio::sync::Mutex<Entry>>>> add_one_torrent: Avg/AdjAvg: (97ns, 97ns)
update_one_torrent_in_parallel: Avg/AdjAvg: (16.046499ms, 16.430313ms) add_multiple_torrents_in_parallel: Avg/AdjAvg: (28.42327ms, 27.876115ms) update_multiple_torrents_in_parallel: Avg/AdjAvg: (6.576256ms, 6.326548ms)

Only in a couple of cases is a little bit faster. But I think the code is simpler.

However, the UDP load test looks better. From 171187.17 announce requests to 185922.40.

Requests out: 416413.93/second
Responses in: 374655.61/second
  - Connect responses:  185740.08
  - Announce responses: 185922.40
  - Scrape responses:   2993.13
  - Error responses:    0.00
Peers per announce response: 0.00
Announce responses per info hash:
  - p10: 1
  - p25: 1
  - p50: 1
  - p75: 1
  - p90: 2
  - p95: 3
  - p99: 105
  - p99.9: 301
  - p100: 365

In general, It's not a great performance increase. I would merge it only because the code is cleaner.

With independent write and read lock

This is the initial version but with a cleaner code. It's the final version in this PR.

  • Write lock to update the peer list.
  • Read lock to get stats and the peer list.
tokio::sync::RwLock<std::collections::BTreeMap<InfoHash, Entry>>
add_one_torrent: Avg/AdjAvg: (75ns, 73ns)
update_one_torrent_in_parallel: Avg/AdjAvg: (13.987718ms, 13.987718ms)
add_multiple_torrents_in_parallel: Avg/AdjAvg: (18.375705ms, 17.887158ms)
update_multiple_torrents_in_parallel: Avg/AdjAvg: (18.242291ms, 18.242291ms)

std::sync::RwLock<std::collections::BTreeMap<InfoHash, Entry>>
add_one_torrent: Avg/AdjAvg: (55ns, 54ns)
update_one_torrent_in_parallel: Avg/AdjAvg: (9.550877ms, 9.80194ms)
add_multiple_torrents_in_parallel: Avg/AdjAvg: (7.861103ms, 7.861103ms)
update_multiple_torrents_in_parallel: Avg/AdjAvg: (7.80655ms, 7.80655ms)

std::sync::RwLock<std::collections::BTreeMap<InfoHash, Arc<std::sync::Mutex<Entry>>>>
add_one_torrent: Avg/AdjAvg: (61ns, 59ns)
update_one_torrent_in_parallel: Avg/AdjAvg: (9.649314ms, 9.521965ms)
add_multiple_torrents_in_parallel: Avg/AdjAvg: (12.225382ms, 12.582905ms)
update_multiple_torrents_in_parallel: Avg/AdjAvg: (10.700225ms, 10.700225ms)

tokio::sync::RwLock<std::collections::BTreeMap<InfoHash, Arc<std::sync::Mutex<Entry>>>>
add_one_torrent: Avg/AdjAvg: (83ns, 82ns)
update_one_torrent_in_parallel: Avg/AdjAvg: (12.059674ms, 12.059674ms)
add_multiple_torrents_in_parallel: Avg/AdjAvg: (23.095544ms, 24.209862ms)
update_multiple_torrents_in_parallel: Avg/AdjAvg: (11.731907ms, 11.731907ms)

tokio::sync::RwLock<std::collections::BTreeMap<InfoHash, Arc<tokio::sync::Mutex<Entry>>>>
add_one_torrent: Avg/AdjAvg: (105ns, 104ns)
update_one_torrent_in_parallel: Avg/AdjAvg: (12.257331ms, 12.257331ms)
add_multiple_torrents_in_parallel: Avg/AdjAvg: (23.032795ms, 23.541457ms)
update_multiple_torrents_in_parallel: Avg/AdjAvg: (11.880602ms, 11.880602ms)
Requests out: 416171.50/second
Responses in: 374553.95/second
  - Connect responses:  186039.97
  - Announce responses: 185587.24
  - Scrape responses:   2926.74
  - Error responses:    0.00
Peers per announce response: 0.00
Announce responses per info hash:
  - p10: 1
  - p25: 1
  - p50: 1
  - p75: 1
  - p90: 2
  - p95: 3
  - p99: 105
  - p99.9: 301
  - p100: 393

@josecelano
Copy link
Member Author

josecelano commented Mar 22, 2024

ACK 59a68d3

@josecelano josecelano added the Optimization Make it Faster label Mar 22, 2024
Copy link

codecov bot commented Mar 22, 2024

Codecov Report

Attention: Patch coverage is 60.34483% with 23 lines in your changes are missing coverage. Please review.

Project coverage is 74.02%. Comparing base (3bd2a9c) to head (59a68d3).
Report is 10 commits behind head on develop.

Files Patch % Lines
src/core/torrent/repository.rs 46.87% 17 Missing ⚠️
.../torrent-repository-benchmarks/src/benches/asyn.rs 0.00% 6 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop     #751      +/-   ##
===========================================
- Coverage    74.06%   74.02%   -0.05%     
===========================================
  Files          145      145              
  Lines         8958     8923      -35     
===========================================
- Hits          6635     6605      -30     
+ Misses        2323     2318       -5     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@josecelano josecelano force-pushed the performance-optimization-announce-merge-update-and-fetch branch from cd9f82c to 4a17fa1 Compare March 22, 2024 09:59
The tracker announce method uses two locks:

1. To write the new peer in the torrent entry peer list (write lock).
2. To read the peers list from the torrent entry (read lock).

I was wondering if the repo would be faster removing the second lock since we can get the peer list in the first lock.

This are the benchmarking results:

```output
tokio::sync::RwLock<std::collections::BTreeMap<InfoHash, Entry>> add_one_torrent: Avg/AdjAvg: (57ns, 59ns)
update_one_torrent_in_parallel: Avg/AdjAvg: (15.769541ms, 16.69948ms) add_multiple_torrents_in_parallel: Avg/AdjAvg: (20.6541ms, 20.010245ms) update_multiple_torrents_in_parallel: Avg/AdjAvg: (18.326633ms, 18.326633ms)

std::sync::RwLock<std::collections::BTreeMap<InfoHash, Entry>> add_one_torrent: Avg/AdjAvg: (38ns, 39ns)
update_one_torrent_in_parallel: Avg/AdjAvg: (7.154301ms, 7.296907ms) add_multiple_torrents_in_parallel: Avg/AdjAvg: (8.370377ms, 8.370377ms) update_multiple_torrents_in_parallel: Avg/AdjAvg: (6.902327ms, 6.902327ms)

std::sync::RwLock<std::collections::BTreeMap<InfoHash, Arc<std::sync::Mutex<Entry>>>> add_one_torrent: Avg/AdjAvg: (44ns, 40ns)
update_one_torrent_in_parallel: Avg/AdjAvg: (6.023789ms, 6.023789ms) add_multiple_torrents_in_parallel: Avg/AdjAvg: (12.306719ms, 12.306719ms) update_multiple_torrents_in_parallel: Avg/AdjAvg: (8.429203ms, 8.429203ms)

tokio::sync::RwLock<std::collections::BTreeMap<InfoHash, Arc<std::sync::Mutex<Entry>>>> add_one_torrent: Avg/AdjAvg: (71ns, 69ns)
update_one_torrent_in_parallel: Avg/AdjAvg: (7.022376ms, 7.022376ms) add_multiple_torrents_in_parallel: Avg/AdjAvg: (28.125712ms, 27.811162ms) update_multiple_torrents_in_parallel: Avg/AdjAvg: (8.908545ms, 9.036288ms)

tokio::sync::RwLock<std::collections::BTreeMap<InfoHash, Arc<tokio::sync::Mutex<Entry>>>> add_one_torrent: Avg/AdjAvg: (85ns, 81ns)
update_one_torrent_in_parallel: Avg/AdjAvg: (15.285006ms, 15.555171ms) add_multiple_torrents_in_parallel: Avg/AdjAvg: (29.794849ms, 30.254531ms) update_multiple_torrents_in_parallel: Avg/AdjAvg: (9.302051ms, 9.302051ms)
```

```output
Requests out: 384701.13/second
Responses in: 345642.53/second
  - Connect responses:  171189.76
  - Announce responses: 171187.17
  - Scrape responses:   3265.60
  - Error responses:    0.00
Peers per announce response: 0.00
Announce responses per info hash:
  - p10: 1
  - p25: 1
  - p50: 1
  - p75: 1
  - p90: 2
  - p95: 3
  - p99: 104
  - p99.9: 281
  - p100: 347
```

```output
tokio::sync::RwLock<std::collections::BTreeMap<InfoHash, Entry>> add_one_torrent: Avg/AdjAvg: (64ns, 64ns)
update_one_torrent_in_parallel: Avg/AdjAvg: (15.652934ms, 15.54617ms) add_multiple_torrents_in_parallel: Avg/AdjAvg: (23.834278ms, 22.935421ms) update_multiple_torrents_in_parallel: Avg/AdjAvg: (17.973855ms, 18.28944ms)

std::sync::RwLock<std::collections::BTreeMap<InfoHash, Entry>> add_one_torrent: Avg/AdjAvg: (43ns, 39ns)
update_one_torrent_in_parallel: Avg/AdjAvg: (6.362147ms, 6.362147ms) add_multiple_torrents_in_parallel: Avg/AdjAvg: (8.869102ms, 8.869102ms) update_multiple_torrents_in_parallel: Avg/AdjAvg: (6.589264ms, 6.510441ms)

std::sync::RwLock<std::collections::BTreeMap<InfoHash, Arc<std::sync::Mutex<Entry>>>> add_one_torrent: Avg/AdjAvg: (48ns, 49ns)
update_one_torrent_in_parallel: Avg/AdjAvg: (6.194211ms, 6.194211ms) add_multiple_torrents_in_parallel: Avg/AdjAvg: (12.50352ms, 12.50352ms) update_multiple_torrents_in_parallel: Avg/AdjAvg: (8.411279ms, 8.411279ms)

tokio::sync::RwLock<std::collections::BTreeMap<InfoHash, Arc<std::sync::Mutex<Entry>>>> add_one_torrent: Avg/AdjAvg: (79ns, 79ns)
update_one_torrent_in_parallel: Avg/AdjAvg: (6.966462ms, 6.966462ms) add_multiple_torrents_in_parallel: Avg/AdjAvg: (27.584361ms, 27.411202ms) update_multiple_torrents_in_parallel: Avg/AdjAvg: (6.965519ms, 7.474346ms)

tokio::sync::RwLock<std::collections::BTreeMap<InfoHash, Arc<tokio::sync::Mutex<Entry>>>> add_one_torrent: Avg/AdjAvg: (97ns, 97ns)
update_one_torrent_in_parallel: Avg/AdjAvg: (16.046499ms, 16.430313ms) add_multiple_torrents_in_parallel: Avg/AdjAvg: (28.42327ms, 27.876115ms) update_multiple_torrents_in_parallel: Avg/AdjAvg: (6.576256ms, 6.326548ms)
```

Only in a couple of cases is a little bit faster. But I think the code is simpler.

However, the UDP load test looks better. From 171187.17 announce requests to 185922.40.

```output
Requests out: 416413.93/second
Responses in: 374655.61/second
  - Connect responses:  185740.08
  - Announce responses: 185922.40
  - Scrape responses:   2993.13
  - Error responses:    0.00
Peers per announce response: 0.00
Announce responses per info hash:
  - p10: 1
  - p25: 1
  - p50: 1
  - p75: 1
  - p90: 2
  - p95: 3
  - p99: 105
  - p99.9: 301
  - p100: 365
```

In general, It's not a great performance increase. I would merge it only because the code is cleaner.

This is the initial version but with a cleaner code. It's the final version in this PR.

- Write lock to update the peer list.
- Read lock to get stats and the peer list.

```output
tokio::sync::RwLock<std::collections::BTreeMap<InfoHash, Entry>>
add_one_torrent: Avg/AdjAvg: (75ns, 73ns)
update_one_torrent_in_parallel: Avg/AdjAvg: (13.987718ms, 13.987718ms)
add_multiple_torrents_in_parallel: Avg/AdjAvg: (18.375705ms, 17.887158ms)
update_multiple_torrents_in_parallel: Avg/AdjAvg: (18.242291ms, 18.242291ms)

std::sync::RwLock<std::collections::BTreeMap<InfoHash, Entry>>
add_one_torrent: Avg/AdjAvg: (55ns, 54ns)
update_one_torrent_in_parallel: Avg/AdjAvg: (9.550877ms, 9.80194ms)
add_multiple_torrents_in_parallel: Avg/AdjAvg: (7.861103ms, 7.861103ms)
update_multiple_torrents_in_parallel: Avg/AdjAvg: (7.80655ms, 7.80655ms)

std::sync::RwLock<std::collections::BTreeMap<InfoHash, Arc<std::sync::Mutex<Entry>>>>
add_one_torrent: Avg/AdjAvg: (61ns, 59ns)
update_one_torrent_in_parallel: Avg/AdjAvg: (9.649314ms, 9.521965ms)
add_multiple_torrents_in_parallel: Avg/AdjAvg: (12.225382ms, 12.582905ms)
update_multiple_torrents_in_parallel: Avg/AdjAvg: (10.700225ms, 10.700225ms)

tokio::sync::RwLock<std::collections::BTreeMap<InfoHash, Arc<std::sync::Mutex<Entry>>>>
add_one_torrent: Avg/AdjAvg: (83ns, 82ns)
update_one_torrent_in_parallel: Avg/AdjAvg: (12.059674ms, 12.059674ms)
add_multiple_torrents_in_parallel: Avg/AdjAvg: (23.095544ms, 24.209862ms)
update_multiple_torrents_in_parallel: Avg/AdjAvg: (11.731907ms, 11.731907ms)

tokio::sync::RwLock<std::collections::BTreeMap<InfoHash, Arc<tokio::sync::Mutex<Entry>>>>
add_one_torrent: Avg/AdjAvg: (105ns, 104ns)
update_one_torrent_in_parallel: Avg/AdjAvg: (12.257331ms, 12.257331ms)
add_multiple_torrents_in_parallel: Avg/AdjAvg: (23.032795ms, 23.541457ms)
update_multiple_torrents_in_parallel: Avg/AdjAvg: (11.880602ms, 11.880602ms)
```

```output
Requests out: 416171.50/second
Responses in: 374553.95/second
  - Connect responses:  186039.97
  - Announce responses: 185587.24
  - Scrape responses:   2926.74
  - Error responses:    0.00
Peers per announce response: 0.00
Announce responses per info hash:
  - p10: 1
  - p25: 1
  - p50: 1
  - p75: 1
  - p90: 2
  - p95: 3
  - p99: 105
  - p99.9: 301
  - p100: 393
```
@josecelano josecelano force-pushed the performance-optimization-announce-merge-update-and-fetch branch from 4a17fa1 to 59a68d3 Compare March 22, 2024 11:19
@da2ce7
Copy link
Contributor

da2ce7 commented Mar 22, 2024

I would prefer to merge #690 before this. I've just got to finish a couple more tests and then it is ready to review. (I will prob need to do rebase to squash a few comments also.)

@josecelano
Copy link
Member Author

I would prefer to merge #690 before this. I've just got to finish a couple more tests and then it is ready to review. (I will prob need to do rebase to squash a few comments also.)

Hey, yes I didn't want to merge it before you finish yours. Anyway It's not an important change. Take a look when you have time and let me know if you think the code is simpler. Regarding performance It looks like it's not a big change.

@josecelano josecelano mentioned this pull request Mar 25, 2024
6 tasks
@da2ce7 da2ce7 added the Needs Rebase Base Branch has Incompatibilities label Mar 25, 2024
@josecelano
Copy link
Member Author

I'm closing this. I think it's a good refactor. I would like to decouple peer lists from statistics. Statistics could even be enabled/disabled in config. Since the performance is the same, I would prefer to split the method update_torrent_with_peer_and_get_stats (command/query segregation).

In short, I think it would be better because:

  • Stats could be optional.
  • Methos are simpler command/query segregation.

@josecelano josecelano closed this Apr 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Needs Rebase Base Branch has Incompatibilities Optimization Make it Faster
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants