From b1794eb2b6696150ac5207ae92c90bda7b3fcb2f Mon Sep 17 00:00:00 2001
From: Joe Rafaniello <jrafanie@gmail.com>
Date: Wed, 14 Sep 2022 15:29:51 -0400
Subject: [PATCH] Don't wait so long for primary/standby info and changes

While monitoring for failover, we pick up changes, such as "standby was just
added" or a failover and promotion, as much as five minutes after they happen.
This is far too long.  Polling every 2 minutes is still conservative toward
postgres without leaving us that far behind.

We may find that we can drop this further to 90 seconds or even 60 seconds, but
this change seems like an obvious improvement with very little downside, such as
the risk of too many connections to postgres.

We establish 2 very quick connections for each iteration of this loop[1], one for
the logical replication connection and one for the rails connection[2].  This
means we make 2 connections every 2 minutes.  This should not be a big concern
even if we have tens of appliances in a complex.
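
As a rough back-of-the-envelope check of the numbers above, here is a minimal
sketch.  It assumes every appliance in the complex runs this loop and opens
exactly 2 short-lived connections per iteration.  The helper name and the
appliance count of 30 are illustrative only and are not part of the codebase.

```ruby
# Hypothetical helper (not part of manageiq): estimates how many short-lived
# connections the monitor loop opens per hour across a whole complex.
# Assumption: each appliance runs the loop and opens 2 connections per check.
CONNECTIONS_PER_CHECK = 2

def connections_per_hour(db_check_frequency, appliances)
  checks_per_hour = 3600.0 / db_check_frequency
  (checks_per_hour * CONNECTIONS_PER_CHECK * appliances).round
end

appliances = 30 # illustrative value for "tens of appliances"
puts connections_per_hour(300, appliances) # old setting, 12 checks/hour each
puts connections_per_hour(120, appliances) # new setting, 30 checks/hour each
```

Under these assumptions the new setting works out to about 30 brief
connections per minute across the whole complex, which is why the extra load
looks small.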

[1] https://github.com/ManageIQ/manageiq-postgres_ha_admin/blob/e7e87af12da82f86e967a22700505442f61bb7b1/lib/manageiq/postgres_ha_admin/failover_monitor.rb#L51-L58
[2] https://github.com/ManageIQ/manageiq/blob/17feafb6138749996fc0e529c42e1928abb18968/lib/evm_database.rb#L174-L175
---
 config/ha_admin.yml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/config/ha_admin.yml b/config/ha_admin.yml
index ea1a0287306..5934d2cfe08 100644
--- a/config/ha_admin.yml
+++ b/config/ha_admin.yml
@@ -1,4 +1,4 @@
 ---
 failover_attempts: 10
-db_check_frequency: 300
+db_check_frequency: 120
 failover_check_frequency: 60