Comparison of the verbs implementation vs. the specifications
- 1 Missing functionality
- 2 Changed functionality
The InfiniBand spec defines several features and verbs that the verbs implementation (i.e. the RDMA stack in the Linux kernel and libibverbs) either doesn't implement or implements in a different way.
In this post I will cover the verbs and functionality that are defined in the specifications but are missing from the implementation, as well as those that were implemented differently.
Reliable Datagram (RD)
The RDMA stack in the kernel and libibverbs don't support RD at all: RD isn't a valid transport type when creating a QP, and none of the following verbs, which manage its related resources, were implemented:
- Allocate Reliable Datagram Domain
- Deallocate Reliable Datagram Domain
- Create EE Context
- Modify EE Context Attributes
- Query EE Context
- Destroy EE Context
Address Handle (AH)
- Modify Address Handle
- Query Address Handle
The RDMA stack in the kernel supports those verbs, but most low-level drivers don't support them.
libibverbs doesn't support those verbs at all.
Memory Region (MR)
- Reregister Memory Region
libibverbs has preparations to support this verb. However, there isn't any implementation of it (yet?).
- Register Shared Memory Region
Libibverbs doesn't support this verb at all.
Memory Window (MW)
- Allocate Memory Window
- Query Memory Window
- Bind Memory Window
- Deallocate Memory Window
The RDMA stack in the kernel supports those verbs and some of the low-level drivers support them as well.
libibverbs has preparations to support these verbs. However, there isn't any implementation of them (yet?).
Memory Region (MR)
- Query Memory Region
The RDMA stack in the kernel supports this verb, but most low-level drivers don't support it.
libibverbs doesn't support this verb at all. However, the attributes addr and length (which were provided when registering the MR) and lkey and rkey (which were filled in by ibv_reg_mr()) are part of struct ibv_mr, and this replaces the need for this verb. The only attribute that cannot be retrieved from the MR after its creation is its access permissions.
Completion Queue (CQ)
- Query Completion Queue
The RDMA stack in the kernel and libibverbs don't support this verb at all. However, the attribute cqe is part of struct ibv_cq, and this replaces the need for this verb.
- Set Completion Event Handler
The RDMA stack in the kernel and libibverbs don't support this verb at all. However, when calling ib_create_cq() in the RDMA stack in the kernel the client code can specify a CQ event handler.
In libibverbs the client code can create a thread that calls ibv_req_notify_cq(), ibv_get_cq_event() and ibv_ack_cq_events(); such a thread effectively behaves as a Completion Event Handler.
- Set Asynchronous Event Handler
The RDMA stack in the kernel and libibverbs don't fully support this verb. However, when calling ib_create_cq(), ib_create_srq() or ib_create_qp() in the RDMA stack in the kernel, the client code can specify an Asynchronous Event Handler for those resources. Furthermore, the user can call ib_register_event_handler() to register an event handler for the RDMA device's events.
In libibverbs the client code can create a thread that calls ibv_get_async_event() and ibv_ack_async_event(); such a thread effectively behaves as an Asynchronous Event Handler.
eXtended Reliable Connected (XRC)
Annex A14 adds XRC to the IB spec. The following verbs were added:
- Allocate XRC Domain
- Deallocate XRC Domain
- Create XRC Shared Receive Queue
- Query XRC Shared Receive Queue
- Modify XRC Shared Receive Queue
- Destroy XRC Shared Receive Queue
- Create XRC Target Queue Pair
- Query XRC Target Queue Pair
- Modify XRC Target Queue Pair
- Destroy XRC Target Queue Pair
Most of this functionality was added to the RDMA stack in the kernel, either by adding new verbs or by extending the functionality of existing ones (for example: instead of adding a new verb for creating an XRC Shared Receive Queue, ib_create_srq() was extended to support the creation of XRC SRQs as well).
However, libibverbs doesn't support XRC at all.
1) There are some OFED distributions (such as MLNX-OFED) that have XRC support.
2) Patches that extend libibverbs to support XRC were sent to the mailing list, but they haven't been accepted (yet?) into upstream libibverbs.
Fast Memory Region (FMR)
The IB spec defines registering an FMR in a Send Request. However, the RDMA stack in the kernel also has verbs that allow creating FMR pools, so this can be done using verbs and not only using Send Requests.
The InfiniBand spec defines special return values for errors that may happen when calling the verbs (for example: Invalid HCA handle, Invalid protection domain, Insufficient resources to complete request and more). The RDMA stack in the kernel and libibverbs use errno values instead.
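To illustrate the errno-based convention in userspace, here is a small sketch of a helper that turns the int result of a verbs call into a printable message. The name verbs_result_str is hypothetical and not part of libibverbs. The sketch assumes the two conventions documented in the libibverbs man pages: some calls return the errno value directly, while others return -1 and leave the reason in errno.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical helper (not a libibverbs API): map the int result of a
 * userspace verbs call to a message. Assumes a positive result is an errno
 * value returned directly, and a negative result means errno was set. */
static const char *verbs_result_str(int ret)
{
    if (ret == 0)
        return "success";
    return strerror(ret > 0 ? ret : errno);
}
```

For example, the result of ibv_post_send() could be passed straight to this helper, since that call returns the errno value on failure.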