v2.1.0rc1
Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
j-xiong committed Feb 28, 2025
1 parent 1224c82 commit 16223f1
Showing 7 changed files with 246 additions and 6 deletions.
14 changes: 14 additions & 0 deletions AUTHORS
@@ -65,17 +65,21 @@ Cody Mann <cody.mann@cornelisnetworks.com>
Coni Gehler <cgehler@cray.com>
ct-clmsn <ct.clmsn@gmail.com>
Dardo D Kleiner <dkleiner@cmf.nrl.navy.mil>
Dariusz Sciebura <dariusz.sciebura@gmail.com>
Dariusz Sciebura <dariuszs@graphcore.ai>
dariuszsciebura <93722774+dariuszsciebura@users.noreply.github.com>
Darryl Abbate <drl@amazon.com>
Dave Goodell <dgoodell@cisco.com>
David Noel <david.noel19@gmail.com>
Denis Maryin <denis.maryin@intel.com>
dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Derek Shinaberry <dshinaberry@mru.medical.canon>
Di Wang <ddiwang@google.com>
Dipti Kothari <dkothar@amazon.com>
Dmitry Durnov <dmitry.durnov@intel.com>
Dmitry Gladkov <dmitry.gladkov@intel.com>
Doug Oucharek <dougso@me.com>
dsciebu <dariusz.sciebura@gmail.com>
Edgar Gabriel <Edgar.Gabriel@amd.com>
Elias Kozah <elias.elkozah@cornelisnetworks.com>
Elias Kozah <Elias.Kozah@cornelisnetworks.com>
@@ -126,6 +130,7 @@ Jeff Hammond <jeff.science@gmail.com>
Jeff Squyres <jsquyres@cisco.com>
Jerome Berryhill <Jerome.Berryhill@Intel.com>
Jerome Boyd Berryhill <JeromeBerryhill@Intel.com>
Jerome Soumagne <jerome.soumagne@hpe.com>
Jerome Soumagne <jsoumagne@hdfgroup.org>
Jessie Yang <jiaxiyan@amazon.com>
Jiakun Yan <jiakunyan1998@gmail.com>
@@ -171,6 +176,7 @@ Luke Robison <lrbison@amazon.com>
Marcin Salnik <marcin.salnik@intel.com>
Martin Kontsek <mkontsek@cisco.com>
Matt Koop <mkoop@amazon.com>
Md Bulbul Sharif <md-bulbul.sharif@hpe.com>
Miao Luo <miao.luo@intel.com>
Michael Blocksome <michael.blocksome@intel.com>
Michael Chuvelev <michael.chuvelev@intel.com>
@@ -218,7 +224,9 @@ Peter Gottesman <pgottesm@cisco.com>
Phil Carns <carns@mcs.anl.gov>
Philip Davis <philipdavis01@gmail.com>
Pierre Roux <piroux@cisco.com>
Piotr Chmiel <piotrc@graphcore.ai>
Prankur Gupta <prankgup@cisco.com>
PukNgae Cryolitia <Cryolitia@gmail.com>
Quentin Boyer <quentin.boyer@eviden.com>
Quincey Koziol <qkoziol@amazon.com>
Raghu Raja <craghun@amazon.com>
@@ -230,6 +238,7 @@ Reese Faucette <rfaucett@cisco.com>
Rich Welch <rlwelch@amazon.com>
Richard Halkyard <rhalkyard@cray.com>
Robert Wespetal <wesper@amazon.com>
Roger Connaughty <roger.connaughty@hpe.com>
Rohit Zambre <rzambre@uci.edu>
Ryan Hankins <rqh@amazon.com>
Ryan Hankins <rqh@dev-dsk-rqh-1d-28b29c44.us-east-1.amazon.com>
@@ -240,26 +249,31 @@ Sannikov, Alexander <alexander.sannikov@intel.com>
Sayantan Sur <sayantan.sur@intel.com>
Scott Breyer <scott.breyer@intel.com>
Sean Hefty <sean.hefty@intel.com>
Sean Pollard <sean.pollard@hpe.com>
Sergey Fedorov <vital.had@gmail.com>
Sergey Oblomov <sergey.oblomov@intel.com>
Seth Zegelstein <szegel@amazon.com>
Shantonu Hossain <shantonu.hossain@intel.com>
Shi Jin <53314885+shijin-aws@users.noreply.github.com>
Shi Jin <sjina@amazon.com>
Siarhei Volkau <lis8215@gmail.com>
soblomov <sergey.oblomov@intel.com>
Solovyev, Dmitriy <dmitriy.solovyev@intel.com>
Soumendu Satapathy <soumendu.satapathy@hpe.com>
Spruit, Neil R <neil.r.spruit@intel.com>
Srdjan Milakovic <srdjan@rice.edu>
Srikrishna Gurugubelli <gurugubs@amazon.com>
Stan Smith <stan.smith@intel.com>
Stephen Oost <stephen.oost@intel.com>
Steve Welch <swelch@systemfabricworks.com>
Steve Welch <welch@hpe.com>
Steven Dashevsky <sdashevsky@habana.ai>
Steven Vormwald <sdvormwa@cray.com>
Sung-Eun Choi <sungeun@cray.com>
Sung-Eun Choi <sungeunchoi@users.noreply.github.com>
Sylvain Didelot <didelot.sylvain@gmail.com>
Sylvain Didelot <sdidelot@ddn.com>
szegel <szegel@ip-10-0-2-236.corp.pcluster.com>
Tadeusz Struk <tstruk@gigaio.com>
Tang, Jingyin <jytang@amazon.com>
Thananon Patinyasakdikul <apatinya@cisco.com>
2 changes: 1 addition & 1 deletion Makefile.am
@@ -225,7 +225,7 @@ src_libfabric_la_LIBADD =
src_libfabric_la_DEPENDENCIES = libfabric.map

if !EMBEDDED
-src_libfabric_la_LDFLAGS += -version-info 27:0:26
+src_libfabric_la_LDFLAGS += -version-info 28:0:27
endif
src_libfabric_la_LDFLAGS += -export-dynamic \
$(libfabric_version_script)
228 changes: 227 additions & 1 deletion NEWS.md
@@ -6,11 +6,215 @@ bug fixes (and other actions) for each version of Libfabric since
version 1.0. New major releases include all fixes from minor
releases with earlier release dates.

v2.1.0, Sat Mar 15, 2025
========================

## Core

- hmem: Fix missing rocr dlopen function assignments
- Fix data race on log_prefix
- hmem: Define ofi_hmem_put_dmabuf_fd and add support for cuda and rocr
- Fix a few minor man page issues

## CXI

- Fix peer CQ support
- Added collectives logical operators
- Fix bug in constrained LE test cases in test.sh and test_sw.sh
- Fix unit test missing pthread initialization
- Add FI_WAIT_YIELD EQ support
- Make string setup of FI_CXI_CURL_LIB_PATH safe
- Add FI_CXI_CURL_LIB_PATH #define from autoconf
- Test CUDA with DMA buf FD recycling
- Test ROCR with DMA buf FD recycling
- Test ROCR with DMA buf offset
- Integrate with ofi_hmem_put_dmabuf_fd
- Test monitor unsubscribe
- Fix fi_cq_strerror
- CXI EQ does not support wait objects
- Fix CQ wait FD logic
- Disable retry logic for experimental collectives
- Ignore drop count during init
- Remove CXI_MAP_IOVA_ALLOC flag.
- Synchronous fi_close on collective multicast
- Fix deferred work test
- Deprecate FI_CXI_WEAK_FENCE
- Update message and target ordering doc
- Define FI_CXI_MR_TARGET_ORDERING
- Remove FI_CXI_ENABLE_UNRESTRICTED_RO
- Set MR relax order on EP order size
- Fix RMA/AMO network ordering
- Update CXI provider max order size

## EFA

- Release matched rxe before destroying the srx rx_pool
- Fix the error code from ibv wr API
- Fix the clean up issue for efa_util_prov
- Fix the cntr interface for efa-direct
- Add unit test for efa-direct progress model
- Fix the max_msg_size reporting for efa-direct
- Clean up rxe map during rxe release
- rdm: Do not claim support for FI_PROGRESS_AUTO
- Always return efa_prov in EFA_INI
- Do not write cq error for ope from internal operations
- Remove unused field efa_domain->mr_mode
- Do GDRCopy registrations only in the EFA RDM path
- Reset g_efa_hmem_info after each test
- Fix the unexp_pkt clean up.
- Call efa_fork_support_enable_if_requested earlier
- Check efa_prov_info_set_fabric_name return code
- Clean up efa_prov_info_set_hmem_flags
- Bug fix in the RDM path with FI_MSG_PREFIX mode
- Rework the efa_cq unit tests
- Improve efa_cq's completion report
- Unit test additions and fixes for efa-direct
- Remove incorrect usage of rdm_info->ep_attr->max_msg_size
- Add new efa-direct fi_info objects
- Cleanup efa_user_info
- Add debug log for efa-direct data transfer
- Use cuda_put_dmabuf_fd
- Fix leak of dmabuf fd in cuda p2p probe
- Implement FI_CONTEXT2 in EFA Direct
- Remove x86-64 architecture check for static_assert
- Do infinite rnr retry for base ep by default
- Extend efa_ep interface
- Migrate efa_dgram_ep to efa_ep
- Adjust the logging level for unreleased rxe
- Regulate the usage of optnames
- Move struct efa_ep_addr to efa_base_ep
- Remove util_av_fi_addr from efa_conn
- Make efa_rdm_cq use efa_cq
- Deprecate FI_AV_MAP
- Remove inline write logic for rma inject
- Add missing mock for wc_is_unsolicited in unit test
- Implement the cq progress
- Remove err_msg from efa_rdm_ep
- Move raw addr functions
- Move efa_rdm_cq_wc_is_unsolicited to efa_cq
- Correct the error code for IBV_WC_RECV_RDMA_WITH_IMM
- Add missing locks in efa_msg and efa_rma
- Move fork handler installation to efa_domain_open
- Detect unsolicited write recv support status on both sides
- Add unit tests for efa_rma
- Add tracepoints for efa_msg and efa_rma
- Add unit tests for efa_msg
- Add tracepoint for poll cq ope
- Adjust the error code for flushed receive

## LPP

- Add check for atomics

## OPX

- Move CUDA sync attribute setting to mr registration
- Add HMEM handle for GDRCopy in GET/PUT
- Add newline to trace entry
- Add debug trace messages to RMA functions
- Disable out of order RC if TID is enabled
- Unexpected packet processing modifications
- Use inlined call to process_header for payloadless RZV_DATA (TID) packets
- Run opx-format on upstream opx provider change
- Remove reliability handshake
- Add PR close event to Cornelis Networks internal workflow triggers
- Use cycle timer as long as all set CPUs are same socket
- fi_opx_addr changes as pre-context sharing and pre-CYR
- Replace intranode hashmap with array
- Default RTS/CTS to in-order route control
- Write CQ entry for successful data transfer operation by default
- Resolve OPX fi_writedata() reliability errors
- Remove extraneous warning
- Enable TID by default.
- Fixed OPX trace points
- Set route control based on packet type
- Implement FI_MR_VIRT_ADDR in OPX
- Use reliability timer for link bounce status check
- Link bounce for JKR
- Fix debug print array indexing
- Resolve new Coverity scan defects
- Enhanced simulation and debug support
- Add HFI1 Direct Verbs support
- Making pkey related failures more obvious
- Reformat full OPX provider
- Add .clang-format file for OPX provider
- Identify and resolve new Coverity scan defects
- Changing default pkey to fetch from pkey table index 0
- Fix wrong function name for getting hmem iface.
- Handle Cuda Managed/Unified memory
- Fix OPX hint checking and capability setting
- Implement fi_writedata()
- Set rate control defaults
- Process RZV payload immediately
- CN5000/JKR 16B: 3B Lid changes
- Set entropy to rx/tx pair
- Don't send immediate data in send_rzv when send buffer is not host memory
- Use `page_sizes[OFI_PAGE_SIZE]-1` instead of `PAGE_MASK`

## RXM

- Fix rxm multi recv getopt segfault

## SHM

- Remove prefix from map inserts
- Fix name compare bug

## TCP

- Only disable ep if the failure can not be retried
- Fix data race caused by parallel access to xnet_rdm_fid_ops
- Fix FI_MULTI_RECV not set on error
- Fix race in writing to xnet_ep_fi_ops

## Util

- Change util_av lock to genlock
- roundup_power_of_two: Remove unnecessary decrement
- Enhance performance of roundup_power_of_two
- Fix FI_MULTI_RECV not set on FI_ECANCELED
- Fix flag initialization for generic receive of unexpected entry
- Add fabric argument to pingpong test
- Statically set uffd callbacks
- Fix ROCR and memhooks deadlock
- Support mem monitors with per sub ctx
- Separate uffd and import mem monitors
- pingpong: close mr after ep close

## Verbs

- Always return vrb_prov in VERBS_INI
- Fix data race in vrb_open_ep function

## Fabtests

- efa: Add remote exit early test with post recv
- Do not require FI_TAGGED for fi_av_xfer test
- efa: print err for recv failure
- efa: Add fabtests for efa-direct
- Set the min of tx/rx_mr_size
- efa: Add remote exit early test
- efa: Fix the rnr read cq error test for efa-direct
- multi_ep: Support customized transfer size
- Re-enable psm3 rdm_tagged_peek
- Disable multi_recv
- Run efa tests with efa fabric name
- Add fabric argument to ClientServerTest
- efa: add rdma check for unsolicited write recv
- Add support for FI_CONTEXT2
- Bugfixes for neuron
- Corrected flags argument type in ft_sendmsg/ft_recvmsg functions
- pytest/efa: Avoid duplicate completion semantic for RMA test
- pytest/efa: merge memory_type and check_rma_bw_memory_type


v2.0.0, Fri Dec 13, 2024
========================

## Core

- hmem/cuda: avoid stub loading at runtime
- Makefile.am: Keep using libfabric.so.1 as the soname
- xpmem: Cleanup xpmem before monitors
- Remove redundant windows.h
- hmem/cuda: Add env variable to enable/disable CUDA DMABUF
@@ -40,6 +244,8 @@ v2.0.0, Fri Dec 13, 2024

## EFA

- Skip rx pkt refill under certain threshold
- Fix efa multi recv setopt segfault
- Add tracepoints for rma operations
- Adjust the location of tracepoint
- Implement the rma interface
@@ -61,13 +267,27 @@ v2.0.0, Fri Dec 13, 2024

## Hook

-Fix the preprocessor
+- Fix the preprocessor

## LNX

- Initialize flags to 0
- Convert peer table to use buffer pools
- Fix av strncpy
- Fix various issues with initial commit

## PSM2

- Check return value of asprintf

## RXM

- Fix rxm multi recv setopt segfault
- Replace rxm managed srx with util srx, support FI_PEER
- Add rxm support for using peer CQs and counters
- Add FI_AV_USER_ID support
- Fix definition of the rxm SAR segment enum

## SHM

- Cleanup op flags
@@ -76,6 +296,11 @@ Fix the preprocessor

- Fixed coverity issue for unchecked return value.

## UCX

- Fix segfault in ucx_send_callback
- Fix incorrect return value checking for fi_param_get()

## Util

- Set srx completion flags and msg_len properly
@@ -88,6 +313,7 @@ Fix the preprocessor

## Fabtests

- Add opts.min_multi_recv_size to set opt before enable
- Add FI_MORE pytest for fi_recv in zcpy recv mode
- Allow tests with FI_MORE flag by using fi_recvmsg
- New fabtest fi_flood to test over subscription of resources
2 changes: 1 addition & 1 deletion configure.ac
@@ -9,7 +9,7 @@ dnl
dnl Process this file with autoconf to produce a configure script.

AC_PREREQ([2.60])
-AC_INIT([libfabric], [2.1.0a1], [ofiwg@lists.openfabrics.org])
+AC_INIT([libfabric], [2.1.0rc1], [ofiwg@lists.openfabrics.org])
AC_CONFIG_SRCDIR([src/fabric.c])
AC_CONFIG_AUX_DIR(config)
AC_CONFIG_MACRO_DIR(config)
2 changes: 1 addition & 1 deletion fabtests/configure.ac
@@ -5,7 +5,7 @@ dnl
dnl Process this file with autoconf to produce a configure script.

AC_PREREQ(2.57)
-AC_INIT([fabtests], [2.1.0a1], [ofiwg@lists.openfabrics.org])
+AC_INIT([fabtests], [2.1.0rc1], [ofiwg@lists.openfabrics.org])
AC_CONFIG_AUX_DIR(config)
AC_CONFIG_MACRO_DIR(config)
AC_CONFIG_HEADERS(config.h)
2 changes: 1 addition & 1 deletion include/rdma/fabric.h
@@ -73,7 +73,7 @@ extern "C" {
#endif

#define FI_MAJOR_VERSION 2
-#define FI_MINOR_VERSION 0
+#define FI_MINOR_VERSION 1
#define FI_REVISION_VERSION 0

/* Removing these breaks the build for some apps.