libibverbs thread-safety level
Let's start with the bottom line: the verbs API is fully thread safe, and verbs can be called from any thread in the process. Part of the thread safety is implemented at the libibverbs level and part of it is implemented at the low-level driver library level.
The same resource can even be handled from different threads (the atomicity of the operations is guaranteed). The supported operations that can be performed in multiple threads include, but are not limited to:
- Opening a context using an RDMA device.
- Reading Asynchronous events - each event will be read by exactly one thread.
- Acknowledging an Asynchronous event.
- Creating RDMA resources which are associated with the same object (Context, PD, CQ, SRQ). It is guaranteed that each newly created RDMA resource will have its own unique number. For example: a specific QP number will be assigned at a given time to only one QP. Same goes for l_key and r_key.
- Destroying RDMA resources which are associated with the same object (Context, PD, CQ, SRQ).
- Querying or modifying RDMA resources.
- Posting Work Requests to any Queue (QP or SRQ) - different threads may post to different RDMA resources, and different threads may post to the same RDMA resource.
- Polling for Work Completions on a specific CQ - each Work Completion will be read by exactly one of the polling threads.
- Requesting notifications from a CQ.
- Reading a Completion event.
- Acknowledging a Completion Event.
- Attaching/detaching a QP to/from multicast groups.
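To make the "read by exactly one thread" guarantee concrete, here is a minimal sketch of a toy completion queue polled by several threads. It does not use libibverbs at all: `struct toy_cq`, `toy_poll_cq()` and `toy_cq_demo()` are hypothetical names invented for this illustration, and the spinlock is only an assumption about how such a queue could be protected. With a real CQ the polling call would be `ibv_poll_cq()`.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

#define TOY_CQ_DEPTH 1024
#define TOY_POLLERS  4

/* Hypothetical stand-in for a CQ: an array of completion IDs and a head index. */
struct toy_cq {
    pthread_spinlock_t lock;
    int entries[TOY_CQ_DEPTH];      /* toy completion IDs */
    int head;                       /* next entry to hand out */
    int count;                      /* number of valid entries */
    atomic_int seen[TOY_CQ_DEPTH];  /* how many times each ID was delivered */
};

/* Returns 1 and stores one completion ID in *wc_id, or 0 if the queue is empty. */
static int toy_poll_cq(struct toy_cq *cq, int *wc_id)
{
    int got = 0;

    pthread_spin_lock(&cq->lock);
    if (cq->head < cq->count) {
        *wc_id = cq->entries[cq->head++];
        got = 1;
    }
    pthread_spin_unlock(&cq->lock);
    return got;
}

/* Each poller drains the queue and records which IDs it received. */
static void *toy_poller(void *arg)
{
    struct toy_cq *cq = arg;
    int id;

    while (toy_poll_cq(cq, &id))
        atomic_fetch_add(&cq->seen[id], 1);
    return NULL;
}

/* Returns how many completions were delivered exactly once. */
int toy_cq_demo(void)
{
    static struct toy_cq cq;
    pthread_t threads[TOY_POLLERS];
    int exactly_once = 0;

    pthread_spin_init(&cq.lock, PTHREAD_PROCESS_PRIVATE);
    cq.head = 0;
    cq.count = TOY_CQ_DEPTH;
    for (int i = 0; i < TOY_CQ_DEPTH; i++) {
        cq.entries[i] = i;
        atomic_store(&cq.seen[i], 0);
    }

    for (int i = 0; i < TOY_POLLERS; i++) {
        int rc = pthread_create(&threads[i], NULL, toy_poller, &cq);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < TOY_POLLERS; i++)
        pthread_join(threads[i], NULL);

    for (int i = 0; i < TOY_CQ_DEPTH; i++)
        if (atomic_load(&cq.seen[i]) == 1)
            exactly_once++;

    pthread_spin_destroy(&cq.lock);
    return exactly_once;
}
```

Calling `toy_cq_demo()` from `main()` should report that every toy completion was delivered to one poller and to one poller only, which is the behavior the list above describes for a real CQ.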
A little information about the internal implementation:
This thread safety is guaranteed by libibverbs and the low-level driver libraries through pthread primitives such as mutexes, spinlocks and condition variables. In general:
- Mutexes are used to protect critical sections in the control path.
- Condition variables are used for reference counting of resources in the control path.
- Spinlocks are used in RDMA object areas that may be accessed in the data path, for example: CQ, QP and SRQ. This allows fast scheduling of threads.
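The reference-counting role of condition variables can be sketched with plain pthreads. The code below is a toy model under stated assumptions, not the library's source: `struct toy_resource`, `toy_ack_events()` and `toy_destroy_wait()` are hypothetical names, and the two counters stand in for whatever bookkeeping the real implementation keeps. The idea is that the destroy path blocks until every delivered event has been acknowledged.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>

#define TOY_EVENTS 8

/* Hypothetical resource that counts delivered and acknowledged events. */
struct toy_resource {
    pthread_mutex_t mutex;
    pthread_cond_t cond;
    unsigned events_delivered;
    unsigned events_acked;
};

/* Control path: acknowledge n events and wake a waiting destroyer. */
static void toy_ack_events(struct toy_resource *res, unsigned n)
{
    pthread_mutex_lock(&res->mutex);
    res->events_acked += n;
    pthread_cond_signal(&res->cond);
    pthread_mutex_unlock(&res->mutex);
}

/* Wait step of a destroy call: block until all events are acknowledged.
 * Returns the number of acknowledged events seen at that point. */
static unsigned toy_destroy_wait(struct toy_resource *res)
{
    unsigned acked;

    pthread_mutex_lock(&res->mutex);
    while (res->events_acked != res->events_delivered)
        pthread_cond_wait(&res->cond, &res->mutex);
    acked = res->events_acked;
    pthread_mutex_unlock(&res->mutex);
    return acked;
}

/* A second thread acknowledges the events one at a time. */
static void *toy_acker(void *arg)
{
    struct toy_resource *res = arg;

    for (int i = 0; i < TOY_EVENTS; i++)
        toy_ack_events(res, 1);
    return NULL;
}

/* Returns the acknowledged count observed when the destroy wait finished. */
unsigned toy_refcount_demo(void)
{
    struct toy_resource res;
    pthread_t acker;
    unsigned acked;
    int rc;

    pthread_mutex_init(&res.mutex, NULL);
    pthread_cond_init(&res.cond, NULL);
    res.events_delivered = TOY_EVENTS;
    res.events_acked = 0;

    rc = pthread_create(&acker, NULL, toy_acker, &res);
    assert(rc == 0);
    (void)rc;

    acked = toy_destroy_wait(&res);   /* blocks until the acker is done */
    pthread_join(acker, NULL);

    pthread_cond_destroy(&res.cond);
    pthread_mutex_destroy(&res.mutex);
    return acked;
}
```

The mutex keeps the counters consistent and the condition variable lets the destroying thread sleep instead of spinning, which is reasonable in the control path where latency matters less than in the data path.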
However, creating RDMA resources usually involves dynamic memory allocation, and destroying RDMA resources usually involves releasing that memory. The same resource cannot be destroyed more than once, in any thread, and a resource cannot be used after it was destroyed. It is up to the user to follow those rules; not doing so may result in a segmentation fault.
A good practice is to release every RDMA resource in the same thread that created it. This isn't mandatory, but it is a good way to prevent double destruction or use of a destroyed resource.
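One way to follow this practice is sketched below with a toy resource instead of real verbs objects. Everything here is hypothetical and for illustration only: `struct toy_res`, `toy_res_create()` and `toy_res_destroy()` are made-up names, and clearing the handle after destruction is an extra safeguard assumed for this sketch, not something the verbs API does for you.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

#define TOY_THREADS 4

/* Hypothetical resource backed by dynamic memory. */
struct toy_res {
    int id;
};

/* Counts how many resources were really released. */
static atomic_int toy_destroy_count;

static struct toy_res *toy_res_create(int id)
{
    struct toy_res *res = malloc(sizeof(*res));

    if (res != NULL)
        res->id = id;
    return res;
}

/* Releases *handle and clears it, so a repeated call becomes a no-op. */
static void toy_res_destroy(struct toy_res **handle)
{
    if (*handle == NULL)
        return;
    free(*handle);
    *handle = NULL;
    atomic_fetch_add(&toy_destroy_count, 1);
}

/* Each thread creates, uses and destroys its own resource. */
static void *toy_worker(void *arg)
{
    struct toy_res *res = toy_res_create(*(int *)arg);

    if (res == NULL)
        return NULL;

    /* The resource is only used between create and destroy. */

    toy_res_destroy(&res);
    toy_res_destroy(&res);  /* harmless here: the handle is already NULL */
    return NULL;
}

/* Returns the number of resources that were actually destroyed. */
int toy_owner_demo(void)
{
    pthread_t threads[TOY_THREADS];
    int ids[TOY_THREADS];

    atomic_store(&toy_destroy_count, 0);
    for (int i = 0; i < TOY_THREADS; i++) {
        int rc;

        ids[i] = i;
        rc = pthread_create(&threads[i], NULL, toy_worker, &ids[i]);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < TOY_THREADS; i++)
        pthread_join(threads[i], NULL);

    return atomic_load(&toy_destroy_count);
}
```

Because each handle lives and dies inside one thread, no other thread can hold a stale pointer to it, and the cleared handle turns an accidental second destroy into a no-op rather than a crash.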
Are the RDMA verbs thread safe or do I have to protect the RDMA code with a mutex?
The RDMA verbs are fully thread safe, so there is no need to protect them with a mutex.
What are the limitations of working with RDMA verbs in threads?
There aren't any limitations. Not destroying a resource more than once and not working with a resource that was already destroyed are rules that aren't specific to multi-threaded programming.