Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Yan/lbm1 17384 - reconnecto to rabbitmq in case of failure #68

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

yant-lb
Copy link

@yant-lb yant-lb commented Aug 30, 2021

Problem
As seen on ticket #LBM1-17384 we got sometimes failures in while calling t rabbitmq.

I've implemented a reconnection that is being done on the Connection thread.
Currently it tries to reconnect once, if it fails it will continue with the original failure flow (i.e set pdb trace)

Tests
Iv'e created a test branch: https://github.com/LightBitsLabs/inaugurator/tree/yan/LBM1-17384-TEST which is a branch of the current branch.
Basically what it does, is throwing the ConnectionClosed exception once in the code (in order to fail the run in the first time, then in the second run it will reconnect and reproduce the original function)

The tests were done on rack03-server61
I've tested the following flows:

  1. checked_in_down -> hard_reclam -> wait -> WORKS
  2. checked_in_down -> hard_reclam -> wait -> allocate -> WORKS
  3. checked_in_down -> hard_reclam -> allocate -> WORKS
  4. checked_in_down -> allocate -> WORKS
  5. hard_reclam -> allocate -> WORKS
  6. hard_reclam -> WORKS

- disable main loop while reconnecting
- do not set connection as finished on reconnect
- invoke cleanup in rabbitmq connection thread

Signed-off-by: Yan <yan@lightbitslabs.com>
Signed-off-by: Yan <yan@lightbitslabs.com>
@yant-lb yant-lb requested review from yuval-lb and amittar-lb August 30, 2021 16:23
Comment on lines +90 to +91
except Exception as e:
logging.exception("Couldnt to reconnect RabbitMQ.")
Copy link

@amittar-lb amittar-lb Aug 31, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it will be good to log the exception itself as well. (f"Couldnt to reconnect RabbitMQ. Reaseon: {e}")
It can help us understand why the connection failed.
Other then that - looks good!

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks.
Yes, I used logging.exception for that: https://stackoverflow.com/a/5191885
"logger.exception will output a stack trace alongside the error message."

I think it should be enough

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it will be good

@yant-lb
Copy link
Author

yant-lb commented Sep 1, 2021

@yuval-lb ping 👍

@yant-lb yant-lb requested review from dean-lb and amittar-lb August 25, 2022 11:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants