- 1 General
- 2 Thread safe
- 3 Fork safe
- 4 Library API
- 4.1 Library functions
- 4.2 Device functions
- 4.3 Context functions
- 4.4 Queries
- 4.5 Asynchronous events
- 4.6 Protection Domains
- 4.7 Memory Regions
- 4.8 Address Handles
- 4.9 Completion event channels
- 4.10 Completion Queues control
- 4.11 Shared Receive Queue control
- 4.12 Queue Pair control
- 4.13 Posting Work Requests to QPs/SRQs
- 4.14 Reading Completions from CQ
- 4.15 Requesting / Managing CQ events
- 4.16 Multicast group
- 4.17 General functions
- 5 Resource creation dependency
- 6 Typical error messages
- 7 Summary
General
libibverbs is an implementation of the RDMA verbs for both InfiniBand (according to the InfiniBand specifications) and iWARP (according to the iWARP verbs specifications). It handles the control path of creating, modifying, querying and destroying resources such as Protection Domains (PD), Completion Queues (CQ), Queue Pairs (QP), Shared Receive Queues (SRQ), Address Handles (AH) and Memory Regions (MR). It also handles the data path: sending and receiving data posted to QPs and SRQs, and getting completions from CQs using polling and completion events.
The control path is implemented through system calls to the uverbs kernel module, which in turn calls the low-level HW driver. The data path is implemented through calls to the low-level HW library which, in most cases, interacts directly with the HW, providing kernel and network stack bypass (saving context/mode switches) along with zero copy and an asynchronous I/O model.
Typically, in network and RDMA programming, some operations involve interaction with remote peers (such as address resolution and connection establishment) or remote entities (such as route resolution and joining a multicast group under IB), and a resource managed through IB verbs, such as a QP or an AH, is eventually created or affected by this interaction. In such cases, applications whose addressing semantics are based on IP can use librdmacm, which works in conjunction with libibverbs.
Thread safe
This library is thread safe, and verbs can be called from any thread in the process. The same resource can even be handled from different threads (the atomicity of the operations is guaranteed). However, it is up to the user to stop working with a resource after it has been destroyed (by the same thread or by any other thread); not doing so may result in a segmentation fault.
Fork safe
As a general rule of thumb, fork() should be avoided when using libibverbs, whether it is called explicitly or implicitly (through library functions that call it, such as system(), popen(), etc.).
However, if one must use fork(), please read the documentation of ibv_fork_init().
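For applications that cannot avoid fork(), the ibv_fork_init() documentation describes an alternative to calling the verb directly: setting the RDMAV_FORK_SAFE (or IBV_FORK_SAFE) environment variable. The following is a minimal sketch of requesting that mode from inside the process. The helper name request_fork_safety() is hypothetical, and the sketch assumes the variable is set before the first verbs call, since the library reads it only when it initializes.

```c
#define _POSIX_C_SOURCE 200112L
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: request fork()-safe mode through the environment
 * instead of calling ibv_fork_init() directly. The variable is read when
 * libibverbs initializes, so this must run before any other verbs call.
 *
 * Returns 0 on success, -1 if the environment could not be updated.
 */
int request_fork_safety(void)
{
    /* Any value works: the library only checks that the variable is set. */
    if (setenv("RDMAV_FORK_SAFE", "1", 1 /* overwrite */) != 0) {
        perror("setenv(RDMAV_FORK_SAFE)");
        return -1;
    }
    return 0;
}
```

Calling request_fork_safety() at the very start of main() keeps the request in the program itself, so it does not depend on how the process was launched. If fork()-safe mode cannot be enabled, the library reports it with a warning on stderr.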
Library API
The interfaces of the library are declared as functions; some of them may also be defined as macros.
In order to use libibverbs, the following line must be included in the source code:
#include <infiniband/verbs.h>
Library functions
int ibv_fork_init(void);
Device functions
struct ibv_device **ibv_get_device_list(int *num_devices);
void ibv_free_device_list(struct ibv_device **list);
const char *ibv_get_device_name(struct ibv_device *device);
uint64_t ibv_get_device_guid(struct ibv_device *device);
Context functions
struct ibv_context *ibv_open_device(struct ibv_device *device);
int ibv_close_device(struct ibv_context *context);
Queries
int ibv_query_device(struct ibv_context *context, struct ibv_device_attr *device_attr);
int ibv_query_port(struct ibv_context *context, uint8_t port_num, struct ibv_port_attr *port_attr);
int ibv_query_pkey(struct ibv_context *context, uint8_t port_num, int index, uint16_t *pkey);
int ibv_query_gid(struct ibv_context *context, uint8_t port_num, int index, union ibv_gid *gid);
Asynchronous events
int ibv_get_async_event(struct ibv_context *context, struct ibv_async_event *event);
void ibv_ack_async_event(struct ibv_async_event *event);
Protection Domains
struct ibv_pd *ibv_alloc_pd(struct ibv_context *context);
int ibv_dealloc_pd(struct ibv_pd *pd);
Memory Regions
struct ibv_mr *ibv_reg_mr(struct ibv_pd *pd, void *addr, size_t length, enum ibv_access_flags access);
int ibv_dereg_mr(struct ibv_mr *mr);
Address Handles
struct ibv_ah *ibv_create_ah(struct ibv_pd *pd, struct ibv_ah_attr *attr);
int ibv_init_ah_from_wc(struct ibv_context *context, uint8_t port_num, struct ibv_wc *wc, struct ibv_grh *grh, struct ibv_ah_attr *ah_attr);
struct ibv_ah *ibv_create_ah_from_wc(struct ibv_pd *pd, struct ibv_wc *wc, struct ibv_grh *grh, uint8_t port_num);
int ibv_destroy_ah(struct ibv_ah *ah);
Completion event channels
struct ibv_comp_channel *ibv_create_comp_channel(struct ibv_context *context);
int ibv_destroy_comp_channel(struct ibv_comp_channel *channel);
Completion Queues control
struct ibv_cq *ibv_create_cq(struct ibv_context *context, int cqe, void *cq_context, struct ibv_comp_channel *channel, int comp_vector);
int ibv_destroy_cq(struct ibv_cq *cq);
int ibv_resize_cq(struct ibv_cq *cq, int cqe);
Shared Receive Queue control
struct ibv_srq *ibv_create_srq(struct ibv_pd *pd, struct ibv_srq_init_attr *srq_init_attr);
int ibv_destroy_srq(struct ibv_srq *srq);
int ibv_modify_srq(struct ibv_srq *srq, struct ibv_srq_attr *srq_attr, enum ibv_srq_attr_mask srq_attr_mask);
int ibv_query_srq(struct ibv_srq *srq, struct ibv_srq_attr *srq_attr);
Queue Pair control
struct ibv_qp *ibv_create_qp(struct ibv_pd *pd, struct ibv_qp_init_attr *qp_init_attr);
int ibv_destroy_qp(struct ibv_qp *qp);
int ibv_modify_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr, enum ibv_qp_attr_mask attr_mask);
int ibv_query_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr, enum ibv_qp_attr_mask attr_mask, struct ibv_qp_init_attr *init_attr);
Posting Work Requests to QPs/SRQs
int ibv_post_send(struct ibv_qp *qp, struct ibv_send_wr *wr, struct ibv_send_wr **bad_wr);
int ibv_post_recv(struct ibv_qp *qp, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad_wr);
int ibv_post_srq_recv(struct ibv_srq *srq, struct ibv_recv_wr *recv_wr, struct ibv_recv_wr **bad_recv_wr);
Reading Completions from CQ
int ibv_poll_cq(struct ibv_cq *cq, int num_entries, struct ibv_wc *wc);
Requesting / Managing CQ events
int ibv_req_notify_cq(struct ibv_cq *cq, int solicited_only);
int ibv_get_cq_event(struct ibv_comp_channel *channel, struct ibv_cq **cq, void **cq_context);
void ibv_ack_cq_events(struct ibv_cq *cq, unsigned int nevents);
Multicast group
int ibv_attach_mcast(struct ibv_qp *qp, union ibv_gid *gid, uint16_t lid);
int ibv_detach_mcast(struct ibv_qp *qp, union ibv_gid *gid, uint16_t lid);
General functions
int ibv_rate_to_mult(enum ibv_rate rate);
enum ibv_rate mult_to_ibv_rate(int mult);
const char *ibv_node_type_str(enum ibv_node_type node_type);
const char *ibv_port_state_str(enum ibv_port_state port_state);
const char *ibv_event_type_str(enum ibv_event_type event);
const char *ibv_wc_status_str(enum ibv_wc_status status);
Resource creation dependency
Typical error messages
Here is a list of the typical error messages, which may be printed to stderr when executing a libibverbs application, and how to solve them:
- libibverbs: Fatal: couldn't read uverbs abi version
  - Reason: libibverbs failed to find the file (/sys/class/infiniband_verbs/abi_version) that indicates the ABI (Application Binary Interface) version between the kernel and libibverbs.
  - Cause: this usually happens when the module ib_uverbs isn't loaded.
  - Solution: if the RDMA package (OFED) was installed, reboot the machine. Otherwise, load the RDMA stack drivers using the proper service file.
- libibverbs: Fatal: kernel ABI version X doesn't match library version Y
  - Reason: the available RDMA kernel stack isn't supported by libibverbs (this is what a wrong ABI means).
  - Cause: this usually happens when the kernel part and libibverbs don't come from the same source (e.g. OFED, inbox, or built manually).
  - Solution: uninstall the RDMA packages that are currently installed and install a fresh OFED distribution or the packages that come with the Linux distribution.
- libibverbs: Warning: couldn't open config directory '/etc/libibverbs.d'
  - Reason: libibverbs failed to open the directory that holds information about the installed userspace low-level driver libraries.
  - Cause: this usually happens when libibverbs was configured and compiled with different parameters (the --sysconfdir that was provided to "configure") than the userspace low-level driver libraries.
  - Solution: uninstall the userspace low-level drivers and libibverbs and install them from a consistent source, or recompile all those libraries with the same parameters.
- libibverbs: Warning: fork()-safety requested but init failed
  - Reason: libibverbs tried to work in fork()-safe mode, according to the user's request, but failed.
  - Cause: this usually happens in old Linux kernels (older than 2.6.12).
  - Solution: move to a newer Linux kernel or disable the fork()-safety request (the environment variable or the verb).
- libibverbs: Warning: no userspace device-specific driver found
  - Reason: libibverbs failed to find a userspace low-level driver for a specific RDMA device.
  - Cause: the userspace low-level driver for this RDMA device is missing.
  - Solution: install the missing low-level driver according to the HW that exists in your computer (lspci may be handy).
- libibverbs: Warning: couldn't load driver
  - Reason: libibverbs failed to load the userspace low-level driver library for a specific RDMA device.
  - Cause: this usually happens when the userspace low-level driver library (.so file) for this RDMA device is missing, corrupted or inconsistent with libibverbs (in terms of supported features).
  - Solution: if the userspace low-level driver library for this RDMA device is missing, install it. If it is already installed, uninstall it and reinstall it from the same source that libibverbs came from.
- libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes
  - Reason: libibverbs checked the amount of memory that can be locked by the running process and detected that this value is 32 KB or less.
  - Cause: working with RDMA requires pinning (i.e. locking) system memory. A low limit on lockable memory will cause failures when creating a Completion Queue, Queue Pair, Shared Receive Queue or Memory Region.
  - Solution: increase the amount of memory which can be locked by any process to a higher value ("unlimited" is preferred).
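Two of the conditions in the list above can be checked by the application itself before it touches libibverbs: whether the uverbs ABI version file is readable, and whether RLIMIT_MEMLOCK is at or below the 32 KB threshold. The following is a minimal sketch using only standard POSIX calls. All function and macro names in it are hypothetical helpers, not part of libibverbs, and raising the soft limit to the hard limit is only an assumption about what may help when the hard limit is already large enough.

```c
#include <stdio.h>
#include <sys/resource.h>

/* Path quoted in the "couldn't read uverbs abi version" entry above. */
#define UVERBS_ABI_PATH "/sys/class/infiniband_verbs/abi_version"

/* Threshold from the RLIMIT_MEMLOCK entry above: 32 KB or less is flagged. */
#define MEMLOCK_WARN_BYTES 32768UL

/*
 * Read the uverbs ABI version from `path`.
 * Returns the version (>= 0), or -1 if the file is missing or cannot be
 * parsed, which usually means that ib_uverbs is not loaded.
 */
int read_uverbs_abi(const char *path)
{
    FILE *f = fopen(path, "r");
    int version = -1;

    if (f == NULL)
        return -1;
    if (fscanf(f, "%d", &version) != 1 || version < 0)
        version = -1;
    fclose(f);
    return version;
}

/* Return 1 if a soft limit of `cur` bytes is at or below the threshold. */
int memlock_is_low(rlim_t cur)
{
    return cur != RLIM_INFINITY && cur <= MEMLOCK_WARN_BYTES;
}

/*
 * Raise the soft RLIMIT_MEMLOCK to the current hard limit.
 * Returns 0 on success, -1 on failure. This does not help when the hard
 * limit itself is too low; that has to be changed by the administrator.
 */
int raise_memlock_to_hard_limit(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_MEMLOCK, &rl) != 0)
        return -1;
    rl.rlim_cur = rl.rlim_max;
    return setrlimit(RLIMIT_MEMLOCK, &rl);
}

/* Print a short report and return the number of problems found. */
int rdma_preflight(void)
{
    struct rlimit rl;
    int problems = 0;
    int abi = read_uverbs_abi(UVERBS_ABI_PATH);

    if (abi < 0) {
        printf("uverbs ABI version not readable: is ib_uverbs loaded?\n");
        problems++;
    } else {
        printf("uverbs ABI version: %d\n", abi);
    }

    if (getrlimit(RLIMIT_MEMLOCK, &rl) != 0) {
        perror("getrlimit(RLIMIT_MEMLOCK)");
        problems++;
    } else if (memlock_is_low(rl.rlim_cur)) {
        printf("RLIMIT_MEMLOCK is only %llu bytes\n",
               (unsigned long long)rl.rlim_cur);
        problems++;
    }
    return problems;
}
```

An application could call rdma_preflight() at startup and, when the lockable memory is reported as low, try raise_memlock_to_hard_limit() before opening any device. The checks only mirror the conditions described above; they do not replace the messages printed by the library.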
Summary
In this post, we described libibverbs.
In the next posts, we will cover the API in detail.
Tell us what you think.