prov/efa: Second fix for efa-direct's max_msg_size
Commit f3e26d6 introduced another bug:
when FI_RMA is requested, the returned
max_msg_size can be larger than the
max_msg_size of efa_prov_info. This causes
the fi_endpoint() call to fail when comparing
the user info against the prov info.

This patch fixes the issue by making the
prov_info always use the largest max_msg_size
and conditionally reducing the returned
user_info to the smaller max_msg_size
when FI_RMA is not requested.

Signed-off-by: Shi Jin <sjina@amazon.com>
shijin-aws committed Feb 21, 2025
1 parent b135e2e commit 6597c8b
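
The interaction described in the commit message can be sketched as a small standalone C model. This is only an illustration under stated assumptions, not the provider's code: `sketch_device`, `SKETCH_CAP_RMA`, and the three `sketch_*` functions are hypothetical names, and the rule "the user value must not exceed the provider value" is an assumption about what the comparison inside fi_endpoint() checks.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX(a, b) ((a) > (b) ? (a) : (b))
#define SKETCH_CAP_RMA (1u << 0) /* hypothetical stand-in for FI_RMA */

/* Hypothetical stand-in for the two device limits the patch reads. */
struct sketch_device {
	size_t max_msg_sz;    /* limit for MSG operations */
	size_t max_rdma_size; /* limit for RMA operations */
};

/* Provider side: advertise the larger limit whenever RMA is supported. */
static size_t sketch_prov_max_msg_size(const struct sketch_device *dev,
				       unsigned int prov_caps)
{
	if (prov_caps & SKETCH_CAP_RMA)
		return SKETCH_MAX(dev->max_msg_sz, dev->max_rdma_size);
	return dev->max_msg_sz;
}

/* User side: fall back to the MSG-only limit when RMA was not requested. */
static size_t sketch_user_max_msg_size(const struct sketch_device *dev,
				       unsigned int prov_caps,
				       unsigned int hint_caps)
{
	if (!(hint_caps & SKETCH_CAP_RMA))
		return dev->max_msg_sz;
	return sketch_prov_max_msg_size(dev, prov_caps);
}

/* Assumed validity rule: 0 if the user value fits, -1 if it exceeds the provider's. */
static int sketch_check_max_msg_size(size_t user_size, size_t prov_size)
{
	return user_size <= prov_size ? 0 : -1;
}
```

With made-up limits of 8192 bytes for MSG and 1 GiB for RMA, a request without RMA gets 8192 and passes the check against the provider's 1 GiB. The situation before this patch (user value 1 GiB, provider value 8192) is the case the check rejects.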
Showing 3 changed files with 16 additions and 5 deletions.
8 changes: 7 additions & 1 deletion prov/efa/src/efa_prov_info.c
@@ -153,8 +153,14 @@ void efa_prov_info_set_ep_attr(struct fi_info *prov_info,
 	prov_info->ep_attr->max_msg_size = device->ibv_port_attr.max_msg_sz;
 	prov_info->ep_attr->type = ep_type;

-	if (ep_type == FI_EP_DGRAM)
+	if (ep_type == FI_EP_RDM) {
+		/* ep_attr->max_msg_size is the maximum of both MSG and RMA operations */
+		if (prov_info->caps & FI_RMA)
+			prov_info->ep_attr->max_msg_size = MAX(device->ibv_port_attr.max_msg_sz, device->max_rdma_size);
+	} else {
+		assert(ep_type == FI_EP_DGRAM);
 		prov_info->ep_attr->msg_prefix_size = 40;
+	}
 }

 /**
8 changes: 4 additions & 4 deletions prov/efa/src/efa_user_info.c
@@ -380,11 +380,11 @@ int efa_user_info_alter_direct(int version, struct fi_info *info, const struct f
 			EFA_INFO(FI_LOG_CORE,
 				"FI_MSG_PREFIX size = %ld\n", info->ep_attr->msg_prefix_size);
 		}
-		/* When user requests FI_RMA and it's supported, the max_msg_size should be returned
-		 * as the maximum of both MSG and RMA operations
+		/* When user doesn't request FI_RMA, the max_msg_size should be returned
+		 * as the MSG only as RMA will not be used.
 		 */
-		if (hints->caps & FI_RMA)
-			info->ep_attr->max_msg_size = MAX(g_device_list[0].ibv_port_attr.max_msg_sz, g_device_list[0].max_rdma_size);
+		if (!(hints->caps & FI_RMA))
+			info->ep_attr->max_msg_size = g_device_list[0].ibv_port_attr.max_msg_sz;
 	}

 	/* Print a warning and use FI_AV_TABLE if the app requests FI_AV_MAP */
5 changes: 5 additions & 0 deletions prov/efa/test/efa_unit_test_info.c
@@ -116,6 +116,11 @@ static void test_info_direct_attributes_impl(struct fi_info *hints,
 		assert_false(info->tx_attr->msg_order & FI_ORDER_SAS);
 		assert_int_equal(info->domain_attr->progress, FI_PROGRESS_AUTO);
 		assert_int_equal(info->domain_attr->control_progress, FI_PROGRESS_AUTO);
+		assert_int_equal(
+			g_device_list[0].rdm_info->ep_attr->max_msg_size,
+			(info->caps & FI_RMA) ?
+				g_device_list[0].max_rdma_size :
+				g_device_list[0].ibv_port_attr.max_msg_sz);
 		assert_int_equal(
 			info->ep_attr->max_msg_size,
 			(hints->caps & FI_RMA) ?