Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: debug mempool lock contention in tx submit peer client #11

Merged

Conversation

lancevincentsalera
Copy link
Contributor

@lancevincentsalera lancevincentsalera commented Jan 29, 2025

REASON FOR THIS FIX

CODE FLOW (Context: TxSubmitPeer)

  1. The TxSubmitPeer will have the following fields: mempool, client, peer_addr, network_magic, and unfulfilled_request
  2. Initialize peer client (Transaction Provider) to start a connection to config provided peers (Transaction Consumer). Send MsgInit to the consumer to officially start the tx-submission mini-protocol.
  3. A background task will run after formal initialization of fields. The background task will spawn a task that lives inside a tokio runtime. It will do the following in an indefinite loop:
    • We check for an outstanding request in each loop iteration.
    • If there’s no outstanding requests yet, we simply wait for the Transaction Consumer to send a MsgRequestTxIdsBlocking.
    • If there are any outstanding requests, we process it first before we can listen for another consumer request. We then reply with MsgReplyTxIds or MsgReplyTxs depending on the type of request the consumer has sent us.
    • If we listened and received a MsgRequestTxIdsBlocking, MsgRequestTxIdsNonBlocking, MsgRequestTxs from the consumer, we process request and reply depending on its type with MsgReplyTxIds or MsgReplyTxs accessing the mempool.
    • If the mempool is empty, we store the consumer's request in the unfulfilled_request field.
    • Repeat indefinitely until the connection closes or an error occurs.

ISSUE

  • We noticed that adding new transactions (add_tx(...)) could be delayed by up to 10 seconds.
  • The background loop was holding the mempool lock while sleeping (to re-check unfulfilled requests), causing lock contention.
  • As a result, incoming TX submissions were blocked from acquiring the mempool lock, leading to noticeable delays.

SOLUTION

  • Drop the mempool lock before sleeping in the background loop.
  • Only lock the mempool when actually checking pending_total() or replying to the consumer with “TxIds/Txs.”
  • This ensures new transactions can be received in microseconds even if the system is waiting 10 seconds to re-check an unfulfilled request.

RESULT

  • Substantial improvement in throughput and responsiveness for new TX arrivals.
  • Confirmed by logs showing add_tx calls complete in microseconds, without multi-second blocking.
  • Maintains the 10-second retry logic for unfulfilled requests in the Tx-Submission protocol.

TESTING

  • Manually tested by sending multiple transactions both immediately and after extended idle periods.
  • Observed consistently low lock acquisition times in logs, confirming the fix.

@lancevincentsalera lancevincentsalera marked this pull request as draft January 30, 2025 12:10
@Mercurial Mercurial marked this pull request as ready for review January 31, 2025 12:48
@Mercurial Mercurial merged commit 5629004 into txpipe:main Jan 31, 2025
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants