[FRR] Zebra BGP enhancements to better handle memory during route churns #99

Closed
wants to merge 3 commits into from
[FRR]Porting Zebra backpressure and bgpd enhancements
dgsudharsan committed Jun 27, 2024
commit 365bdf3ffab2729fa4b7ab96b820a0a2cdf8a51d
@@ -0,0 +1,37 @@
From df79586612fb0445fbdf6b191747294e5494ece2 Mon Sep 17 00:00:00 2001
From: Christian Hopps <chopps@labn.net>
Date: Thu, 26 Oct 2023 22:51:08 -0400
Subject: [PATCH 01/11] isisd: staticd: need to link directly against libyang

Signed-off-by: Christian Hopps <chopps@labn.net>
(cherry picked from commit 81d1d399521bb18f3fdd5353c9d58c4b3988f225)

diff --git a/isisd/subdir.am b/isisd/subdir.am
index dabf6a925e..10f9f5b964 100644
--- a/isisd/subdir.am
+++ b/isisd/subdir.am
@@ -92,7 +92,7 @@ ISIS_SOURCES = \
isisd/isis_pfpacket.c \
# end

-ISIS_LDADD_COMMON = lib/libfrr.la $(LIBCAP)
+ISIS_LDADD_COMMON = lib/libfrr.la $(LIBCAP) $(LIBYANG_LIBS)

# Building isisd

diff --git a/staticd/subdir.am b/staticd/subdir.am
index 022428281f..07ebe3c02c 100644
--- a/staticd/subdir.am
+++ b/staticd/subdir.am
@@ -36,7 +36,7 @@ clippy_scan += \
# end

staticd_staticd_SOURCES = staticd/static_main.c
-staticd_staticd_LDADD = staticd/libstatic.a lib/libfrr.la $(LIBCAP)
+staticd_staticd_LDADD = staticd/libstatic.a lib/libfrr.la $(LIBCAP) $(LIBYANG_LIBS)

nodist_staticd_staticd_SOURCES = \
yang/frr-bfdd.yang.c \
--
2.17.1

@@ -0,0 +1,174 @@
From 7c711ff437985b23a4dd919a98b22b8ea14ef553 Mon Sep 17 00:00:00 2001
From: Rajasekar Raja <rajasekarr@nvidia.com>
Date: Mon, 12 Feb 2024 10:44:18 -0800
Subject: [PATCH 02/11] zebra: backpressure - Zebra push back on Buffer/Stream
creation

Currently, the way zebra works is that it creates a pthread per client (BGP is
of interest in this case) and this thread loops in zserv_read()
to check for any incoming data. If there is some, then it reads,
validates and adds it to the ibuf_fifo, signalling the main thread to
process the message. The main thread, when it gets a chance, processes
the message and invokes the function pointer registered for the header
command (e.g. zserv_handlers).

Finally, if all of this was successful, this task reschedules itself and
loops in zserv_read() again.

However, if there are already items on the ibuf FIFO, that means zebra is
slow at processing them. With the current mechanism, if the zebra main
thread is busy, the ibuf FIFO keeps growing and holds up memory.

Show memory zebra (example: 15k streams hoarding ~160 MB of data):
--- qmem libfrr ---
Stream : 44 variable 3432352 15042 161243800

Fix:
Client IO Thread: (zserv_read)
- Stop scheduling read events when we know there are already X items
on the FIFO (X = zebra zapi-packets <1-10000>, default 1000).

- Determine the number of items on the zserv->ibuf_fifo. Subtract this
from the configured limit and only pull off the number of items that
would take us back up to X items on the ibuf_fifo.

- If the number of items in the ibuf_fifo has reached the maximum,
* either initially when zserv_read() is called, or
* when processing the remainder of the incoming buffer,
the client IO thread is woken by the zebra main thread.

Main thread: (zserv_process_message)
After processing items from the client's ibuf, always schedule a wakeup to
the client IO thread to read more items from the socket buffer. This way we
ensure
- the client IO thread always tries to read the socket buffer and add more
items to the ibuf_fifo (until the max limit)
- the hidden config change (zebra zapi-packets <>) is taken into account

Ticket: #3390099

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Signed-off-by: Rajasekar Raja <rajasekarr@nvidia.com>
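
A minimal, standalone sketch of the read-budget arithmetic described above
(not part of the patch; packets_to_process and ibuf_fifo are the patch's
names, everything else is a simplified stand-in):

#include <stdint.h>
#include <stdio.h>

/* Sketch: how many packets may the client I/O thread read this pass?
 * Returns 0 when the input FIFO already holds the configured maximum,
 * in which case the patched zserv_read() returns without rescheduling. */
static uint32_t read_budget(uint32_t packets_to_process, size_t ibuf_fifo_cnt)
{
	if (ibuf_fifo_cnt >= packets_to_process)
		return 0;                              /* p2p_avail <= 0 */
	return packets_to_process - (uint32_t)ibuf_fifo_cnt;
}

int main(void)
{
	printf("%u\n", read_budget(1000, 0));    /* 1000: FIFO empty, full budget */
	printf("%u\n", read_budget(1000, 940));  /* 60: partial budget */
	printf("%u\n", read_budget(1000, 1000)); /* 0: FIFO full, skip the read */
	return 0;
}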

diff --git a/zebra/zserv.c b/zebra/zserv.c
index 2024f34534..de6e404fc4 100644
--- a/zebra/zserv.c
+++ b/zebra/zserv.c
@@ -318,6 +318,14 @@ zwrite_fail:
* this task reschedules itself.
*
* Any failure in any of these actions is handled by terminating the client.
+ *
+ * The client's input buffer ibuf_fifo can hold at most as many items as
+ * configured in packets_to_process. This way we are not filling up the FIFO
+ * more than the maximum when the zebra main is busy. If the fifo has space, we
+ * reschedule ourselves to read more.
+ *
+ * The main thread processes the items in ibuf_fifo and always signals the
+ * client IO thread.
*/
static void zserv_read(struct thread *thread)
{
@@ -325,15 +333,25 @@ static void zserv_read(struct thread *thread)
int sock;
size_t already;
struct stream_fifo *cache;
- uint32_t p2p_orig;
-
- uint32_t p2p;
+ uint32_t p2p; /* Temp p2p used to process */
+ uint32_t p2p_orig; /* Configured p2p (Default-1000) */
+ int p2p_avail; /* How much space is available for p2p */
struct zmsghdr hdr;
+ size_t client_ibuf_fifo_cnt = stream_fifo_count_safe(client->ibuf_fifo);

p2p_orig = atomic_load_explicit(&zrouter.packets_to_process,
memory_order_relaxed);
+ p2p_avail = p2p_orig - client_ibuf_fifo_cnt;
+
+ /*
+ * Do nothing if ibuf_fifo count has reached its max limit. Otherwise
+ * proceed and reschedule ourselves if there is space in the ibuf_fifo.
+ */
+ if (p2p_avail <= 0)
+ return;
+
+ p2p = p2p_avail;
cache = stream_fifo_new();
- p2p = p2p_orig;
sock = THREAD_FD(thread);

while (p2p) {
@@ -433,7 +451,7 @@ static void zserv_read(struct thread *thread)
p2p--;
}

- if (p2p < p2p_orig) {
+ if (p2p < (uint32_t)p2p_avail) {
uint64_t time_now = monotime(NULL);

/* update session statistics */
@@ -447,19 +465,23 @@ static void zserv_read(struct thread *thread)
while (cache->head)
stream_fifo_push(client->ibuf_fifo,
stream_fifo_pop(cache));
+ /* Need to update count as main thread could have processed few */
+ client_ibuf_fifo_cnt =
+ stream_fifo_count_safe(client->ibuf_fifo);
}

/* Schedule job to process those packets */
zserv_event(client, ZSERV_PROCESS_MESSAGES);
-
}

if (IS_ZEBRA_DEBUG_PACKET)
- zlog_debug("Read %d packets from client: %s", p2p_orig - p2p,
- zebra_route_string(client->proto));
+ zlog_debug("Read %d packets from client: %s. Current ibuf fifo count: %zu. Conf P2p %d",
+ p2p_avail - p2p, zebra_route_string(client->proto),
+ client_ibuf_fifo_cnt, p2p_orig);

- /* Reschedule ourselves */
- zserv_client_event(client, ZSERV_CLIENT_READ);
+ /* Reschedule ourselves since we have space in ibuf_fifo */
+ if (client_ibuf_fifo_cnt < p2p_orig)
+ zserv_client_event(client, ZSERV_CLIENT_READ);

stream_fifo_free(cache);

@@ -495,14 +517,20 @@ static void zserv_client_event(struct zserv *client,
* as the task argument.
*
* Each message is popped off the client's input queue and the action associated
- * with the message is executed. This proceeds until there are no more messages,
- * an error occurs, or the processing limit is reached.
+ * with the message is executed. This proceeds until an error occurs, or the
+ * processing limit is reached.
*
* The client's I/O thread can push at most zrouter.packets_to_process messages
* onto the input buffer before notifying us there are packets to read. As long
* as we always process zrouter.packets_to_process messages here, then we can
* rely on the read thread to handle queuing this task enough times to process
* everything on the input queue.
+ *
+ * After processing the client's ibuf, always schedule a wakeup to the client
+ * IO thread to read more items from the socket buffer. This way we ensure
+ * - Client IO thread always tries to read the socket buffer and add more
+ * items to the ibuf_fifo (until max limit)
+ * - the hidden config change (zebra zapi-packets <>) is taken into account.
*/
static void zserv_process_messages(struct thread *thread)
{
@@ -538,6 +566,9 @@ static void zserv_process_messages(struct thread *thread)
/* Reschedule ourselves if necessary */
if (need_resched)
zserv_event(client, ZSERV_PROCESS_MESSAGES);
+
+ /* Ensure to include the read socket in the select/poll/etc.. */
+ zserv_client_event(client, ZSERV_CLIENT_READ);
}

int zserv_send_message(struct zserv *client, struct stream *msg)
--
2.17.1

@@ -0,0 +1,90 @@
From 7796ce2bb6eb1650ae1bec41ab2f270807b33c62 Mon Sep 17 00:00:00 2001
From: Donald Sharp <sharpd@nvidia.com>
Date: Thu, 25 Jan 2024 12:53:24 -0500
Subject: [PATCH 03/11] bgpd: backpressure - Add a typesafe list for Zebra
Announcement

Modify the bgp master to hold a type safe list for bgp_dests that need
to be passed to zebra.

Future commits will use this.

Ticket: #3390099

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Signed-off-by: Rajasekar Raja <rajasekarr@nvidia.com>
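
A sketch of how this typesafe list is meant to be used by the later commits
in the series (simplified; it assumes FRR's lib/typesafe.h macros, and the
queue_for_zebra/drain_to_zebra helpers here are illustrative names, not the
real functions):

#include "lib/typesafe.h"              /* PREDECL_LIST / DECLARE_LIST */

PREDECL_LIST(zebra_announce);

struct bgp_dest {                      /* simplified stand-in */
	struct zebra_announce_item zai;    /* list linkage, as added to bgp_table.h */
};

DECLARE_LIST(zebra_announce, struct bgp_dest, zai);

/* Producer: best-path selection queues the dest instead of calling zebra
 * synchronously. */
static void queue_for_zebra(struct zebra_announce_head *head,
			    struct bgp_dest *dest)
{
	zebra_announce_add_tail(head, dest);
}

/* Consumer: a scheduled event later drains the FIFO in order, preserving
 * the ordering of installations into zebra across all VRFs. */
static void drain_to_zebra(struct zebra_announce_head *head)
{
	struct bgp_dest *dest;

	while ((dest = zebra_announce_pop(head)) != NULL) {
		/* install or withdraw the route for this dest via zebra */
	}
}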

diff --git a/bgpd/bgp_main.c b/bgpd/bgp_main.c
index 90ae580bab..e28dde5d16 100644
--- a/bgpd/bgp_main.c
+++ b/bgpd/bgp_main.c
@@ -214,6 +214,8 @@ static __attribute__((__noreturn__)) void bgp_exit(int status)
bgp_evpn_mh_finish();
bgp_l3nhg_finish();

+ zebra_announce_fini(&bm->zebra_announce_head);
+
/* reverse bgp_dump_init */
bgp_dump_finish();

diff --git a/bgpd/bgp_table.h b/bgpd/bgp_table.h
index 121afc481f..d43bf86eb9 100644
--- a/bgpd/bgp_table.h
+++ b/bgpd/bgp_table.h
@@ -101,6 +101,8 @@ struct bgp_node {

STAILQ_ENTRY(bgp_dest) pq;

+ struct zebra_announce_item zai;
+
uint64_t version;

mpls_label_t local_label;
@@ -121,6 +123,8 @@ struct bgp_node {
enum bgp_path_selection_reason reason;
};

+DECLARE_LIST(zebra_announce, struct bgp_dest, zai);
+
extern void bgp_delete_listnode(struct bgp_dest *dest);
/*
* bgp_table_iter_t
diff --git a/bgpd/bgpd.c b/bgpd/bgpd.c
index 023047050b..392423e028 100644
--- a/bgpd/bgpd.c
+++ b/bgpd/bgpd.c
@@ -8017,6 +8017,8 @@ void bgp_master_init(struct thread_master *master, const int buffer_size,
memset(&bgp_master, 0, sizeof(bgp_master));

bm = &bgp_master;
+
+ zebra_announce_init(&bm->zebra_announce_head);
bm->bgp = list_new();
bm->listen_sockets = list_new();
bm->port = BGP_PORT_DEFAULT;
diff --git a/bgpd/bgpd.h b/bgpd/bgpd.h
index 72b5b50fb4..55f53bf9d3 100644
--- a/bgpd/bgpd.h
+++ b/bgpd/bgpd.h
@@ -32,6 +32,8 @@
#include "srv6.h"
#include "iana_afi.h"

+PREDECL_LIST(zebra_announce);
+
/* For union sockunion. */
#include "queue.h"
#include "sockunion.h"
@@ -180,6 +182,9 @@ struct bgp_master {
uint32_t inq_limit;
uint32_t outq_limit;

+ /* To preserve ordering of installations into zebra across all Vrfs */
+ struct zebra_announce_head zebra_announce_head;
+
QOBJ_FIELDS;
};
DECLARE_QOBJ_TYPE(bgp_master);
--
2.17.1

@@ -0,0 +1,158 @@
From 69e38aa82f325129ebad4535be5d834c599b5c0b Mon Sep 17 00:00:00 2001
From: Philippe Guibert <philippe.guibert@6wind.com>
Date: Wed, 7 Feb 2024 22:34:34 +0100
Subject: [PATCH 04/11] bgpd: fix flushing ipv6 flowspec entries when peering
stops

When a BGP flowspec peering stops, the BGP RIB entries for IPv6
flowspec entries are removed, but not the ZEBRA RIB IPv6 entries.

Actually, when bgp_zebra_withdraw() is called, only the AFI_IP
parameter is passed to the bgp_pbr_update_entry() function, which is
in charge of the flowspec add/delete in zebra. Fix this by passing
the AFI parameter to the bgp_zebra_withdraw() function.

Note that topotests do not show the problem, as the flowspec driver
code is not present and was refused upstream. Without it, routes are
not installed, and so cannot be uninstalled.

Fixes: 529efa234655 ("bgpd: allow flowspec entries to be announced to zebra")
Link: https://github.com/FRRouting/frr/pull/2025

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>

diff --git a/bgpd/bgp_route.c b/bgpd/bgp_route.c
index fbff57634a..455cd6cdbb 100644
--- a/bgpd/bgp_route.c
+++ b/bgpd/bgp_route.c
@@ -3312,7 +3312,8 @@ static void bgp_process_main_one(struct bgp *bgp, struct bgp_dest *dest,
*/
if (old_select &&
is_route_parent_evpn(old_select))
- bgp_zebra_withdraw(p, old_select, bgp, safi);
+ bgp_zebra_withdraw(p, old_select, bgp, afi,
+ safi);

bgp_zebra_announce(dest, p, new_select, bgp, afi, safi);
} else {
@@ -3322,7 +3323,8 @@ static void bgp_process_main_one(struct bgp *bgp, struct bgp_dest *dest,
|| old_select->sub_type == BGP_ROUTE_AGGREGATE
|| old_select->sub_type == BGP_ROUTE_IMPORTED))

- bgp_zebra_withdraw(p, old_select, bgp, safi);
+ bgp_zebra_withdraw(p, old_select, bgp, afi,
+ safi);
}
}

@@ -4201,7 +4203,7 @@ void bgp_update(struct peer *peer, const struct prefix *p, uint32_t addpath_id,
if (pi && pi->attr->rmap_table_id != new_attr.rmap_table_id) {
if (CHECK_FLAG(pi->flags, BGP_PATH_SELECTED))
/* remove from RIB previous entry */
- bgp_zebra_withdraw(p, pi, bgp, safi);
+ bgp_zebra_withdraw(p, pi, bgp, afi, safi);
}

if (peer->sort == BGP_PEER_EBGP) {
@@ -5841,7 +5843,7 @@ bool bgp_inbound_policy_exists(struct peer *peer, struct bgp_filter *filter)
}

static void bgp_cleanup_table(struct bgp *bgp, struct bgp_table *table,
- safi_t safi)
+ afi_t afi, safi_t safi)
{
struct bgp_dest *dest;
struct bgp_path_info *pi;
@@ -5865,7 +5867,8 @@ static void bgp_cleanup_table(struct bgp *bgp, struct bgp_table *table,
|| pi->sub_type == BGP_ROUTE_IMPORTED)) {

if (bgp_fibupd_safi(safi))
- bgp_zebra_withdraw(p, pi, bgp, safi);
+ bgp_zebra_withdraw(p, pi, bgp, afi,
+ safi);
}

bgp_path_info_reap(dest, pi);
@@ -5882,7 +5885,7 @@ void bgp_cleanup_routes(struct bgp *bgp)
for (afi = AFI_IP; afi < AFI_MAX; ++afi) {
if (afi == AFI_L2VPN)
continue;
- bgp_cleanup_table(bgp, bgp->rib[afi][SAFI_UNICAST],
+ bgp_cleanup_table(bgp, bgp->rib[afi][SAFI_UNICAST], afi,
SAFI_UNICAST);
/*
* VPN and ENCAP and EVPN tables are two-level (RD is top level)
@@ -5894,7 +5897,7 @@ void bgp_cleanup_routes(struct bgp *bgp)
dest = bgp_route_next(dest)) {
table = bgp_dest_get_bgp_table_info(dest);
if (table != NULL) {
- bgp_cleanup_table(bgp, table, safi);
+ bgp_cleanup_table(bgp, table, afi, safi);
bgp_table_finish(&table);
bgp_dest_set_bgp_table_info(dest, NULL);
bgp_dest_unlock_node(dest);
@@ -5905,7 +5908,7 @@ void bgp_cleanup_routes(struct bgp *bgp)
dest = bgp_route_next(dest)) {
table = bgp_dest_get_bgp_table_info(dest);
if (table != NULL) {
- bgp_cleanup_table(bgp, table, safi);
+ bgp_cleanup_table(bgp, table, afi, safi);
bgp_table_finish(&table);
bgp_dest_set_bgp_table_info(dest, NULL);
bgp_dest_unlock_node(dest);
@@ -5917,7 +5920,7 @@ void bgp_cleanup_routes(struct bgp *bgp)
dest = bgp_route_next(dest)) {
table = bgp_dest_get_bgp_table_info(dest);
if (table != NULL) {
- bgp_cleanup_table(bgp, table, SAFI_EVPN);
+ bgp_cleanup_table(bgp, table, afi, SAFI_EVPN);
bgp_table_finish(&table);
bgp_dest_set_bgp_table_info(dest, NULL);
bgp_dest_unlock_node(dest);
diff --git a/bgpd/bgp_zebra.c b/bgpd/bgp_zebra.c
index ff79746b4c..69240a3b83 100644
--- a/bgpd/bgp_zebra.c
+++ b/bgpd/bgp_zebra.c
@@ -1761,7 +1761,7 @@ void bgp_zebra_announce_table_all_subtypes(struct bgp *bgp, afi_t afi,
}

void bgp_zebra_withdraw(const struct prefix *p, struct bgp_path_info *info,
- struct bgp *bgp, safi_t safi)
+ struct bgp *bgp, afi_t afi, safi_t safi)
{
struct zapi_route api;
struct peer *peer;
@@ -1780,7 +1780,7 @@ void bgp_zebra_withdraw(const struct prefix *p, struct bgp_path_info *info,

if (safi == SAFI_FLOWSPEC) {
peer = info->peer;
- bgp_pbr_update_entry(peer->bgp, p, info, AFI_IP, safi, false);
+ bgp_pbr_update_entry(peer->bgp, p, info, afi, safi, false);
return;
}

@@ -1821,7 +1821,7 @@ void bgp_zebra_withdraw_table_all_subtypes(struct bgp *bgp, afi_t afi, safi_t sa
if (CHECK_FLAG(pi->flags, BGP_PATH_SELECTED)
&& (pi->type == ZEBRA_ROUTE_BGP))
bgp_zebra_withdraw(bgp_dest_get_prefix(dest),
- pi, bgp, safi);
+ pi, bgp, afi, safi);
}
}
}
diff --git a/bgpd/bgp_zebra.h b/bgpd/bgp_zebra.h
index 0a41069411..a5fe8d7ace 100644
--- a/bgpd/bgp_zebra.h
+++ b/bgpd/bgp_zebra.h
@@ -49,7 +49,7 @@ extern void bgp_zebra_announce(struct bgp_dest *dest, const struct prefix *p,
extern void bgp_zebra_announce_table(struct bgp *bgp, afi_t afi, safi_t safi);
extern void bgp_zebra_withdraw(const struct prefix *p,
struct bgp_path_info *path, struct bgp *bgp,
- safi_t safi);
+ afi_t afi, safi_t safi);

/* Announce routes of any bgp subtype of a table to zebra */
extern void bgp_zebra_announce_table_all_subtypes(struct bgp *bgp, afi_t afi,
--
2.17.1

@@ -0,0 +1,234 @@
From 679ad9ee5f3c15570d697506183d37aa29f6ebf2 Mon Sep 17 00:00:00 2001
From: Donald Sharp <sharpd@nvidia.com>
Date: Thu, 25 Jan 2024 13:07:37 -0500
Subject: [PATCH 05/11] bgpd: backpressure - cleanup bgp_zebra_XX func args

Since installing/withdrawing routes into zebra is going to be changed
to be dest-based via a list,
- retrieve the afi/safi to use from the dest's table
instead of passing them in, and
- the prefix is known from the dest, so remove this arg as well.

Ticket: #3390099

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Signed-off-by: Rajasekar Raja <rajasekarr@nvidia.com>
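
A small standalone sketch of the resulting call pattern (stand-in types; in
bgpd the real accessors are bgp_dest_table() and bgp_dest_get_prefix()):

#include <stdio.h>

typedef int afi_t;                     /* stand-ins for the real typedefs */
typedef int safi_t;

struct bgp_table { afi_t afi; safi_t safi; };
struct bgp_dest  { struct bgp_table *table; };

/* Before: announce(dest, p, info, bgp, afi, safi)
 * After:  announce(dest, info, bgp) - afi, safi and prefix come from dest. */
static void announce(struct bgp_dest *dest)
{
	struct bgp_table *table = dest->table;   /* models bgp_dest_table(dest) */

	printf("announce afi=%d safi=%d\n", table->afi, table->safi);
}

int main(void)
{
	struct bgp_table t = { 2 /* AFI_IP6 */, 1 /* SAFI_UNICAST */ };
	struct bgp_dest d = { &t };

	announce(&d);
	return 0;
}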

diff --git a/bgpd/bgp_route.c b/bgpd/bgp_route.c
index 455cd6cdbb..d19f27110e 100644
--- a/bgpd/bgp_route.c
+++ b/bgpd/bgp_route.c
@@ -3214,8 +3214,8 @@ static void bgp_process_main_one(struct bgp *bgp, struct bgp_dest *dest,
|| new_select->sub_type
== BGP_ROUTE_IMPORTED))

- bgp_zebra_announce(dest, p, old_select,
- bgp, afi, safi);
+ bgp_zebra_announce(dest, old_select,
+ bgp);
}
}

@@ -3312,10 +3312,9 @@ static void bgp_process_main_one(struct bgp *bgp, struct bgp_dest *dest,
*/
if (old_select &&
is_route_parent_evpn(old_select))
- bgp_zebra_withdraw(p, old_select, bgp, afi,
- safi);
+ bgp_zebra_withdraw(dest, old_select, bgp);

- bgp_zebra_announce(dest, p, new_select, bgp, afi, safi);
+ bgp_zebra_announce(dest, new_select, bgp);
} else {
/* Withdraw the route from the kernel. */
if (old_select && old_select->type == ZEBRA_ROUTE_BGP
@@ -3323,8 +3322,7 @@ static void bgp_process_main_one(struct bgp *bgp, struct bgp_dest *dest,
|| old_select->sub_type == BGP_ROUTE_AGGREGATE
|| old_select->sub_type == BGP_ROUTE_IMPORTED))

- bgp_zebra_withdraw(p, old_select, bgp, afi,
- safi);
+ bgp_zebra_withdraw(dest, old_select, bgp);
}
}

@@ -4203,7 +4201,7 @@ void bgp_update(struct peer *peer, const struct prefix *p, uint32_t addpath_id,
if (pi && pi->attr->rmap_table_id != new_attr.rmap_table_id) {
if (CHECK_FLAG(pi->flags, BGP_PATH_SELECTED))
/* remove from RIB previous entry */
- bgp_zebra_withdraw(p, pi, bgp, afi, safi);
+ bgp_zebra_withdraw(dest, pi, bgp);
}

if (peer->sort == BGP_PEER_EBGP) {
@@ -5867,8 +5865,7 @@ static void bgp_cleanup_table(struct bgp *bgp, struct bgp_table *table,
|| pi->sub_type == BGP_ROUTE_IMPORTED)) {

if (bgp_fibupd_safi(safi))
- bgp_zebra_withdraw(p, pi, bgp, afi,
- safi);
+ bgp_zebra_withdraw(dest, pi, bgp);
}

bgp_path_info_reap(dest, pi);
diff --git a/bgpd/bgp_zebra.c b/bgpd/bgp_zebra.c
index 69240a3b83..920df835a4 100644
--- a/bgpd/bgp_zebra.c
+++ b/bgpd/bgp_zebra.c
@@ -1292,9 +1292,8 @@ static bool bgp_zebra_use_nhop_weighted(struct bgp *bgp, struct attr *attr,
return true;
}

-void bgp_zebra_announce(struct bgp_dest *dest, const struct prefix *p,
- struct bgp_path_info *info, struct bgp *bgp, afi_t afi,
- safi_t safi)
+void bgp_zebra_announce(struct bgp_dest *dest, struct bgp_path_info *info,
+ struct bgp *bgp)
{
struct zapi_route api = { 0 };
struct zapi_nexthop *api_nh;
@@ -1321,6 +1320,8 @@ void bgp_zebra_announce(struct bgp_dest *dest, const struct prefix *p,
uint32_t ttl = 0;
uint32_t bos = 0;
uint32_t exp = 0;
+ struct bgp_table *table = bgp_dest_table(dest);
+ const struct prefix *p = bgp_dest_get_prefix(dest);

/*
* BGP is installing this route and bgp has been configured
@@ -1339,9 +1340,9 @@ void bgp_zebra_announce(struct bgp_dest *dest, const struct prefix *p,
if (bgp->main_zebra_update_hold)
return;

- if (safi == SAFI_FLOWSPEC) {
- bgp_pbr_update_entry(bgp, bgp_dest_get_prefix(dest), info, afi,
- safi, true);
+ if (table->safi == SAFI_FLOWSPEC) {
+ bgp_pbr_update_entry(bgp, p, info, table->afi, table->safi,
+ true);
return;
}

@@ -1354,7 +1355,7 @@ void bgp_zebra_announce(struct bgp_dest *dest, const struct prefix *p,
/* Make Zebra API structure. */
api.vrf_id = bgp->vrf_id;
api.type = ZEBRA_ROUTE_BGP;
- api.safi = safi;
+ api.safi = table->safi;
api.prefix = *p;
SET_FLAG(api.message, ZAPI_MESSAGE_NEXTHOP);

@@ -1458,12 +1459,13 @@ void bgp_zebra_announce(struct bgp_dest *dest, const struct prefix *p,
}
}

- if (bgp->table_map[afi][safi].name) {
+ if (bgp->table_map[table->afi][table->safi].name) {
/* Copy info and attributes, so the route-map
apply doesn't modify the BGP route info. */
local_attr = *mpinfo->attr;
mpinfo_cp->attr = &local_attr;
- if (!bgp_table_map_apply(bgp->table_map[afi][safi].map,
+ if (!bgp_table_map_apply(bgp->table_map[table->afi]
+ [table->safi].map,
p, mpinfo_cp))
continue;

@@ -1619,7 +1621,7 @@ void bgp_zebra_announce(struct bgp_dest *dest, const struct prefix *p,
api.tag = tag;
}

- distance = bgp_distance_apply(p, info, afi, safi, bgp);
+ distance = bgp_distance_apply(p, info, table->afi, table->safi, bgp);
if (distance) {
SET_FLAG(api.message, ZAPI_MESSAGE_DISTANCE);
api.distance = distance;
@@ -1731,9 +1733,7 @@ void bgp_zebra_announce_table(struct bgp *bgp, afi_t afi, safi_t safi)
&& (pi->sub_type == BGP_ROUTE_NORMAL
|| pi->sub_type == BGP_ROUTE_IMPORTED)))

- bgp_zebra_announce(dest,
- bgp_dest_get_prefix(dest),
- pi, bgp, afi, safi);
+ bgp_zebra_announce(dest, pi, bgp);
}

/* Announce routes of any bgp subtype of a table to zebra */
@@ -1755,16 +1755,16 @@ void bgp_zebra_announce_table_all_subtypes(struct bgp *bgp, afi_t afi,
for (pi = bgp_dest_get_bgp_path_info(dest); pi; pi = pi->next)
if (CHECK_FLAG(pi->flags, BGP_PATH_SELECTED) &&
pi->type == ZEBRA_ROUTE_BGP)
- bgp_zebra_announce(dest,
- bgp_dest_get_prefix(dest),
- pi, bgp, afi, safi);
+ bgp_zebra_announce(dest, pi, bgp);
}

-void bgp_zebra_withdraw(const struct prefix *p, struct bgp_path_info *info,
- struct bgp *bgp, afi_t afi, safi_t safi)
+void bgp_zebra_withdraw(struct bgp_dest *dest, struct bgp_path_info *info,
+ struct bgp *bgp)
{
struct zapi_route api;
struct peer *peer;
+ struct bgp_table *table = bgp_dest_table(dest);
+ const struct prefix *p = bgp_dest_get_prefix(dest);

/*
* If we are withdrawing the route, we don't need to have this
@@ -1778,16 +1778,17 @@ void bgp_zebra_withdraw(const struct prefix *p, struct bgp_path_info *info,
if (!bgp_install_info_to_zebra(bgp))
return;

- if (safi == SAFI_FLOWSPEC) {
+ if (table->safi == SAFI_FLOWSPEC) {
peer = info->peer;
- bgp_pbr_update_entry(peer->bgp, p, info, afi, safi, false);
+ bgp_pbr_update_entry(peer->bgp, p, info, table->afi,
+ table->safi, false);
return;
}

memset(&api, 0, sizeof(api));
api.vrf_id = bgp->vrf_id;
api.type = ZEBRA_ROUTE_BGP;
- api.safi = safi;
+ api.safi = table->safi;
api.prefix = *p;

if (info->attr->rmap_table_id) {
@@ -1820,8 +1821,7 @@ void bgp_zebra_withdraw_table_all_subtypes(struct bgp *bgp, afi_t afi, safi_t sa
for (pi = bgp_dest_get_bgp_path_info(dest); pi; pi = pi->next) {
if (CHECK_FLAG(pi->flags, BGP_PATH_SELECTED)
&& (pi->type == ZEBRA_ROUTE_BGP))
- bgp_zebra_withdraw(bgp_dest_get_prefix(dest),
- pi, bgp, afi, safi);
+ bgp_zebra_withdraw(dest, pi, bgp);
}
}
}
diff --git a/bgpd/bgp_zebra.h b/bgpd/bgp_zebra.h
index a5fe8d7ace..b77e423f8f 100644
--- a/bgpd/bgp_zebra.h
+++ b/bgpd/bgp_zebra.h
@@ -43,13 +43,11 @@ extern void bgp_zebra_destroy(void);
extern int bgp_zebra_get_table_range(uint32_t chunk_size,
uint32_t *start, uint32_t *end);
extern int bgp_if_update_all(void);
-extern void bgp_zebra_announce(struct bgp_dest *dest, const struct prefix *p,
- struct bgp_path_info *path, struct bgp *bgp,
- afi_t afi, safi_t safi);
+extern void bgp_zebra_announce(struct bgp_dest *dest,
+ struct bgp_path_info *path, struct bgp *bgp);
extern void bgp_zebra_announce_table(struct bgp *bgp, afi_t afi, safi_t safi);
-extern void bgp_zebra_withdraw(const struct prefix *p,
- struct bgp_path_info *path, struct bgp *bgp,
- afi_t afi, safi_t safi);
+extern void bgp_zebra_withdraw(struct bgp_dest *dest,
+ struct bgp_path_info *path, struct bgp *bgp);

/* Announce routes of any bgp subtype of a table to zebra */
extern void bgp_zebra_announce_table_all_subtypes(struct bgp *bgp, afi_t afi,
--
2.17.1

Large diffs are not rendered by default.

Large diffs are not rendered by default.

@@ -0,0 +1,29 @@
From b6caf88bd3bce9673d49435453991c49712287aa Mon Sep 17 00:00:00 2001
From: Rajasekar Raja <rajasekarr@nvidia.com>
Date: Thu, 11 Apr 2024 22:27:37 -0700
Subject: [PATCH 08/11] zebra: backpressure - Fix Null ptr access (Coverity
Issue)

Fix dereferencing NULL ptr making coverity happy.

Ticket :#3390099

Signed-off-by: Rajasekar Raja <rajasekarr@nvidia.com>

diff --git a/bgpd/bgp_zebra.c b/bgpd/bgp_zebra.c
index b81acaf8ec..524551b1e0 100644
--- a/bgpd/bgp_zebra.c
+++ b/bgpd/bgp_zebra.c
@@ -1809,8 +1809,7 @@ static void bgp_handle_route_announcements_to_zebra(struct thread *e)
table = bgp_dest_table(dest);
install =
CHECK_FLAG(dest->flags, BGP_NODE_SCHEDULE_FOR_INSTALL);
- if (table && table->afi == AFI_L2VPN &&
- table->safi == SAFI_EVPN)
+ if (table->afi == AFI_L2VPN && table->safi == SAFI_EVPN)
is_evpn = true;

if (BGP_DEBUG(zebra, ZEBRA))
--
2.17.1

@@ -0,0 +1,116 @@
From 7166c2222cb82885510c3e8c7906c1d7de950f9b Mon Sep 17 00:00:00 2001
From: Donald Sharp <sharpd@nvidia.com>
Date: Thu, 11 Apr 2024 13:28:30 -0400
Subject: [PATCH 09/11] bgpd: Increase install/uninstall speed of evpn vpn
vni's

BGP receives a notification from zebra about a vpn that
needs to be installed into the evpn tables. Unfortunately
this function was walking the entirety of the evpn tables
3 times. Modify the code to walk the tree 1 time and
to just look for the needed route types as it goes.

This reduces, in a scaled environment, the processing
time of the zclient_read function from 130 seconds
to 95 seconds for an up / down / up interface
scenario.

Signed-off-by: Rajasekar Raja <rajasekarr@vndia.com>
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
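
A standalone sketch of the change in walking strategy (the route-type names
and table contents here are made up for illustration):

#include <stdbool.h>
#include <stdio.h>

enum rtype { EVPN_AD, EVPN_MAC_IP, EVPN_IMET, EVPN_ES, EVPN_PREFIX };

/* After this patch: one predicate applied during a single walk, instead of
 * three full walks that each filtered on one route type. */
static bool wanted(enum rtype t)
{
	return t == EVPN_IMET || t == EVPN_AD || t == EVPN_MAC_IP;
}

int main(void)
{
	enum rtype table[] = { EVPN_MAC_IP, EVPN_ES, EVPN_IMET, EVPN_AD, EVPN_PREFIX };
	int considered = 0;

	/* Single pass over the table, filtering by type as we go. */
	for (size_t i = 0; i < sizeof(table) / sizeof(table[0]); i++)
		if (wanted(table[i]))
			considered++;

	printf("routes matched in one walk: %d\n", considered);
	return 0;
}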

diff --git a/bgpd/bgp_evpn.c b/bgpd/bgp_evpn.c
index 622fd6afd2..eb5aa9f077 100644
--- a/bgpd/bgp_evpn.c
+++ b/bgpd/bgp_evpn.c
@@ -3752,9 +3752,7 @@ static int install_uninstall_routes_for_vrf(struct bgp *bgp_vrf, int install)
* particular VNI.
*/
static int install_uninstall_routes_for_vni(struct bgp *bgp,
- struct bgpevpn *vpn,
- bgp_evpn_route_type rtype,
- int install)
+ struct bgpevpn *vpn, int install)
{
afi_t afi;
safi_t safi;
@@ -3785,7 +3783,9 @@ static int install_uninstall_routes_for_vni(struct bgp *bgp,
(const struct prefix_evpn *)bgp_dest_get_prefix(
dest);

- if (evp->prefix.route_type != rtype)
+ if (evp->prefix.route_type != BGP_EVPN_IMET_ROUTE &&
+ evp->prefix.route_type != BGP_EVPN_AD_ROUTE &&
+ evp->prefix.route_type != BGP_EVPN_MAC_IP_ROUTE)
continue;

for (pi = bgp_dest_get_bgp_path_info(dest); pi;
@@ -3812,7 +3812,8 @@ static int install_uninstall_routes_for_vni(struct bgp *bgp,
bgp->vrf_id,
install ? "install"
: "uninstall",
- rtype == BGP_EVPN_MAC_IP_ROUTE
+ evp->prefix.route_type ==
+ BGP_EVPN_MAC_IP_ROUTE
? "MACIP"
: "IMET",
vpn->vni);
@@ -3845,23 +3846,11 @@ static int install_routes_for_vrf(struct bgp *bgp_vrf)
*/
static int install_routes_for_vni(struct bgp *bgp, struct bgpevpn *vpn)
{
- int ret;
-
- /* Install type-3 routes followed by type-2 routes - the ones applicable
+ /*
+ * Install type-3 routes followed by type-2 routes - the ones applicable
* for this VNI.
*/
- ret = install_uninstall_routes_for_vni(bgp, vpn, BGP_EVPN_IMET_ROUTE,
- 1);
- if (ret)
- return ret;
-
- ret = install_uninstall_routes_for_vni(bgp, vpn, BGP_EVPN_AD_ROUTE,
- 1);
- if (ret)
- return ret;
-
- return install_uninstall_routes_for_vni(bgp, vpn, BGP_EVPN_MAC_IP_ROUTE,
- 1);
+ return install_uninstall_routes_for_vni(bgp, vpn, 1);
}

/* uninstall routes from l3vni vrf. */
@@ -3877,25 +3866,11 @@ static int uninstall_routes_for_vrf(struct bgp *bgp_vrf)
*/
static int uninstall_routes_for_vni(struct bgp *bgp, struct bgpevpn *vpn)
{
- int ret;
-
- /* Uninstall type-2 routes followed by type-3 routes - the ones
- * applicable
- * for this VNI.
+ /*
+ * Uninstall type-2 routes followed by type-3 routes - the ones
+ * applicable for this VNI.
*/
- ret = install_uninstall_routes_for_vni(bgp, vpn, BGP_EVPN_MAC_IP_ROUTE,
- 0);
- if (ret)
- return ret;
-
- ret = install_uninstall_routes_for_vni(bgp, vpn, BGP_EVPN_AD_ROUTE,
- 0);
- if (ret)
- return ret;
-
-
- return install_uninstall_routes_for_vni(bgp, vpn, BGP_EVPN_IMET_ROUTE,
- 0);
+ return install_uninstall_routes_for_vni(bgp, vpn, 0);
}

/*
--
2.17.1

@@ -0,0 +1,49 @@
From 72781109e5dadafe55f10d72ae2b3505bf0ccb93 Mon Sep 17 00:00:00 2001
From: Donald Sharp <sharpd@nvidia.com>
Date: Mon, 8 Apr 2024 16:24:05 -0400
Subject: [PATCH 10/11] zebra: Actually display I/O buffer sizes

An operator found a situation where zebra was
backing up in a significant way towards BGP,
with EVPN changes taking up serious amounts
of memory. The key lines that would have clued
us in on it were behind a dev build. Let's change
this.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
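
A standalone model of the high-water-mark tracking that this patch (and the
next one) exposes unconditionally (slightly simplified; FRR provides its own
atomic wrappers and stream_fifo type):

#include <stdatomic.h>
#include <stdio.h>

struct fifo_stats {
	atomic_size_t count;      /* current number of streams in the fifo */
	atomic_size_t max_count;  /* highest count ever observed */
};

static void on_push(struct fifo_stats *s)
{
	size_t now = atomic_fetch_add_explicit(&s->count, 1,
					       memory_order_release) + 1;
	size_t curmax = atomic_load_explicit(&s->max_count,
					     memory_order_relaxed);

	if (now > curmax)
		atomic_store_explicit(&s->max_count, now, memory_order_relaxed);
}

int main(void)
{
	struct fifo_stats s = { 0, 0 };

	for (int i = 0; i < 5; i++)
		on_push(&s);

	/* "Input Fifo: current:max" style display, as in the zserv change */
	printf("Fifo: %zu:%zu\n", (size_t)atomic_load(&s.count),
	       (size_t)atomic_load(&s.max_count));
	return 0;
}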

diff --git a/lib/stream.h b/lib/stream.h
index a3c148c9c9..7acbbd2dc7 100644
--- a/lib/stream.h
+++ b/lib/stream.h
@@ -120,9 +120,7 @@ struct stream_fifo {

/* number of streams in this fifo */
atomic_size_t count;
-#if defined DEV_BUILD
atomic_size_t max_count;
-#endif

struct stream *head;
struct stream *tail;
diff --git a/zebra/zserv.c b/zebra/zserv.c
index de6e404fc4..daccfa59d5 100644
--- a/zebra/zserv.c
+++ b/zebra/zserv.c
@@ -1130,12 +1130,10 @@ static void zebra_show_client_detail(struct vty *vty, struct zserv *client)
vty_out(vty, "ES-EVI %-12u%-12u%-12u\n",
client->local_es_evi_add_cnt, 0, client->local_es_evi_del_cnt);
vty_out(vty, "Errors: %u\n", client->error_cnt);
-
-#if defined DEV_BUILD
vty_out(vty, "Input Fifo: %zu:%zu Output Fifo: %zu:%zu\n",
client->ibuf_fifo->count, client->ibuf_fifo->max_count,
client->obuf_fifo->count, client->obuf_fifo->max_count);
-#endif
+
vty_out(vty, "\n");
}

--
2.17.1

@@ -0,0 +1,49 @@
From 981bfca72414bfa7a97b6526e4736ea4f6834620 Mon Sep 17 00:00:00 2001
From: Rajasekar Raja <rajasekarr@nvidia.com>
Date: Mon, 22 Apr 2024 13:50:47 -0400
Subject: [PATCH 11/11] zebra: Actually display I/O buffer sizes (part-2)

An extension of commit 8d8f12ba8e5cd11c189b8475b05539fa8415ccb9

Removing ifdef DEV_BUILD in stream_fifo_push as well to make the 'sh
zebra client' display the current I/O fifo along with max fifo items.

TICKET :#3390099

Signed-off-by: Rajasekar Raja <rajasekarr@nvidia.com>

diff --git a/lib/stream.c b/lib/stream.c
index 2de3abdf45..3aef70794e 100644
--- a/lib/stream.c
+++ b/lib/stream.c
@@ -1256,9 +1256,7 @@ void stream_fifo_init(struct stream_fifo *fifo)
/* Add new stream to fifo. */
void stream_fifo_push(struct stream_fifo *fifo, struct stream *s)
{
-#if defined DEV_BUILD
size_t max, curmax;
-#endif

if (fifo->tail)
fifo->tail->next = s;
@@ -1267,15 +1265,11 @@ void stream_fifo_push(struct stream_fifo *fifo, struct stream *s)

fifo->tail = s;
fifo->tail->next = NULL;
-#if !defined DEV_BUILD
- atomic_fetch_add_explicit(&fifo->count, 1, memory_order_release);
-#else
max = atomic_fetch_add_explicit(&fifo->count, 1, memory_order_release);
curmax = atomic_load_explicit(&fifo->max_count, memory_order_relaxed);
if (max > curmax)
atomic_store_explicit(&fifo->max_count, max,
memory_order_relaxed);
-#endif
}

void stream_fifo_push_safe(struct stream_fifo *fifo, struct stream *s)
--
2.43.2

11 changes: 11 additions & 0 deletions src/sonic-frr/patch/series
@@ -27,3 +27,14 @@
0027-lib-Do-not-convert-EVPN-prefixes-into-IPv4-IPv6-if-n.patch
0028-zebra-fix-parse-attr-problems-for-encap.patch
0029-zebra-nhg-fix-on-intf-up.patch
0030-isisd-staticd-need-to-link-directly-against-libyang.patch
0031-zebra-backpressure-Zebra-push-back-on-Buffer-Stream-.patch
0032-bgpd-backpressure-Add-a-typesafe-list-for-Zebra-Anno.patch
0033-bgpd-fix-flushing-ipv6-flowspec-entries-when-peering.patch
0034-bgpd-backpressure-cleanup-bgp_zebra_XX-func-args.patch
0035-gpd-backpressure-Handle-BGP-Zebra-Install-evt-Creat.patch
0036-bgpd-backpressure-Handle-BGP-Zebra-EPVN-Install-evt-.patch
0037-zebra-backpressure-Fix-Null-ptr-access-Coverity-Issu.patch
0038-bgpd-Increase-install-uninstall-speed-of-evpn-vpn-vn.patch
0039-zebra-Actually-display-I-O-buffer-sizes.patch
0040-zebra-Actually-display-I-O-buffer-sizes-part-2.patch