ibv_get_async_event()

int ibv_get_async_event(struct ibv_context *context,
                        struct ibv_async_event *event);

Description

ibv_get_async_event() reads the next asynchronous event for an RDMA device context context.

After ibv_open_device() is called, all of the asynchronous events are enqueued to this context, and ibv_get_async_event() reads them one by one, in the order in which they were generated. Even if ibv_get_async_event() is called a long time after the events were generated, it still reads the oldest events first. Unfortunately, the events don't carry a timestamp, so the user can't know when they occurred.

By default, ibv_get_async_event() is a blocking function: if there isn't any asynchronous event to read, it waits until the next event is generated. It can be useful to have a dedicated thread that waits for the next event to occur. However, events can also be read in a non-blocking way. One can configure the file descriptor of the event file in the device context context to be non-blocking using fcntl(), and then monitor this file descriptor using poll()/epoll()/select() in order to determine whether there is an event waiting to be read. There is an example of how to do this in this post.

Calling ibv_get_async_event() is atomic: even if it is called from more than one thread, it is guaranteed that the same event won't be read by more than one thread.

Each event that was received using ibv_get_async_event() must be acknowledged using ibv_ack_async_event().

Here is the full description of struct ibv_async_event:

Name        Description
element     A union of several fields, of which only one is valid, depending on the event type:
              CQ events: element.cq is valid
              QP events: element.qp is valid
              SRQ events: element.srq is valid
              Port events: element.port_num is valid
              RDMA device events: no field is valid
event_type  Enumerated value which describes the type of the event

Here is a full description of the possible events:

QP events

Here is the description of the affiliated events that may occur for QPs. For those events, the field event->element.qp contains the handle of the QP that got this asynchronous event. Those events are generated only for the context that this QP belongs to.

IBV_EVENT_COMM_EST

A QP whose state is IBV_QPS_RTR received the first packet in its Receive Queue, and the packet was processed without any error.

This event is mainly relevant to connection-oriented QPs, i.e. RC and UC QPs. It may be generated for UD QPs as well; this is driver implementation specific.

IBV_EVENT_SQ_DRAINED

A QP, whose state was changed from IBV_QPS_RTS to IBV_QPS_SQD, completed sending all of the outstanding messages that were in progress in its Send Queue when the state change was requested. For an RC QP, this means that all of those messages received acknowledgments, if applicable.

Most of the time, this event is generated when the (internal) QP state changes from SQD.draining to SQD.drained. However, this event may also be generated if the transition to the IBV_QPS_SQD state was aborted because of a transition (either by the RDMA device or by the user) into the IBV_QPS_SQE, IBV_QPS_ERR or IBV_QPS_RESET QP states.

After this event, if the QP is in the IBV_QPS_SQD state, it is safe for the user to start modifying the Send Queue attributes, since there aren't any messages being sent.

IBV_EVENT_PATH_MIG

Indicates the connection has migrated to the alternate path. This event is relevant only to connection oriented QPs, i.e. RC and UC QPs.

This means that the alternate path attributes are now being used as the primary path attributes. If another alternate path needs to be loaded, the user can now set its attributes.

IBV_EVENT_QP_LAST_WQE_REACHED

A QP, which is associated with an SRQ, was transitioned to the IBV_QPS_ERR state, either automatically by the RDMA device or explicitly by the user, and one of the following occurred:

  • A completion with error was generated for the last WQE
  • The QP transitioned to the IBV_QPS_ERR state and there are no more WQEs on the Receive Queue of that QP

This event actually means that WQEs won't be consumed anymore from the SRQ by this QP.

If a QP experienced an error and this event wasn't generated, the user must destroy all of the QPs that are associated with this SRQ, and the SRQ itself, in order to reclaim all of the WQEs associated with that QP.

IBV_EVENT_QP_FATAL

A QP experienced an error that prevents the generation of completions while accessing or processing the Work Queue, either Send or Receive Queue.

If the problem that caused this event is in the CQ of that Work Queue, the appropriate CQ will get the IBV_EVENT_CQ_ERR event too.

IBV_EVENT_QP_REQ_ERR

The transport layer of the RDMA device detected a transport error violation on the responder side. This error may be one of the following:

  • Unsupported or reserved opcode
  • Out of sequence opcode

Those errors are rare and may happen when there are problems in the subnet or when an RDMA device sends illegal packets.

When this happens, the QP is transitioned automatically to the IBV_QPS_ERR state by the RDMA device.

This event is relevant only to RC QPs.

IBV_EVENT_QP_ACCESS_ERR

The transport layer of the RDMA device detected a request error violation on the responder side. This error may be one of the following:

  • Misaligned atomic request
  • Too many RDMA Read or Atomic requests
  • R_Key violation
  • Length errors without immediate data

Those errors usually happen due to bugs in the user code.

When this happens, the QP is transitioned automatically to the IBV_QPS_ERR state by the RDMA device.

This event is relevant only to RC QPs.

IBV_EVENT_PATH_MIG_ERR

A QP that has alternate path attributes loaded tried to perform a path migration, either initiated by the RDMA device or explicitly by the user, and an error prevented it from moving to that alternate path.

This error usually happens if the alternate path attributes on both sides aren't consistent.

CQ events

Here is the description of the affiliated events that may occur for CQs. For those events, the field event->element.cq contains the handle of the CQ that got this asynchronous event. Those events are generated only for the context that this CQ belongs to.

IBV_EVENT_CQ_ERR

An error occurred when writing a completion to the CQ. This event may occur when there is a protection error (a rare condition) or when there is a CQ overrun (most likely).

When the CQ has an error, it isn't guaranteed that completions can be pulled from that CQ. All of the QPs that are associated with this CQ, either through their RQ or through their SQ, will get the IBV_EVENT_QP_FATAL event too.

SRQ events

Here is the description of the affiliated events that may occur for SRQs. For those events, the field event->element.srq contains the handle of the SRQ that got this asynchronous event. Those events are generated only for the context that this SRQ belongs to.

IBV_EVENT_SRQ_LIMIT_REACHED

An SRQ was armed, and the number of RRs in that SRQ dropped below its limit value. When this event is generated, the limit value of the SRQ is set to zero.

Most likely, when this event happens, the user will post more RRs to that SRQ and rearm it.

IBV_EVENT_SRQ_ERR

An error occurred that prevents the RDMA device from dequeuing RRs from that SRQ and from reporting receive completions.

If an SRQ experiences an error, all of the QPs that are associated with this SRQ will be transitioned to the IBV_QPS_ERR state, and the IBV_EVENT_QP_FATAL asynchronous event will be generated for them.

Port events

Here is the description of the unaffiliated events that may occur for RDMA device ports. For those events, the field event->element.port_num contains the number of the port that got this asynchronous event. Those events are generated for all of the contexts that use the RDMA device whose port got the event.

IBV_EVENT_PORT_ACTIVE

The link became active and is now available to send/receive packets.

The port_attr.state was in one of the following states: IBV_PORT_DOWN, IBV_PORT_INIT or IBV_PORT_ARMED, and it moved to one of the following states: IBV_PORT_ACTIVE or IBV_PORT_ACTIVE_DEFER. This can happen when the SM configures the port.

This event will be generated by the device only if IBV_DEVICE_PORT_ACTIVE_EVENT is set in dev_cap.device_cap_flags.

IBV_EVENT_LID_CHANGE

LID was changed on a port by the SM. If this is not the first time that the SM configures the port LID, this may indicate that there is a new SM in the subnet, or the SM reconfigures the subnet. QPs which send/receive data may experience connection failures (if the LIDs in the subnet were changed).

IBV_EVENT_PKEY_CHANGE

The P_Key table was changed on a port by the SM. Since QPs use P_Key table indexes rather than absolute values, it is suggested that the client check that the P_Key indexes which its QPs use weren't changed.

IBV_EVENT_GID_CHANGE

The GID table was changed on a port by the SM. Since QPs use GID table indexes rather than absolute values (as the source GID), it is suggested that the client check that the GID indexes which its QPs use weren't changed.

IBV_EVENT_SM_CHANGE

There is a new SM in the subnet that the port belongs to, and the client should reregister all of the subscriptions previously requested from this port, for example (but not limited to) joining a multicast group.

IBV_EVENT_CLIENT_REREGISTER

The SM requests that the client reregister all of the subscriptions previously requested from this port, for example (but not limited to) joining a multicast group. This event may be generated when the SM suffered a failure that caused it to lose its records, or when there is a new SM in the subnet.

This event will be generated by the device only if the bit that indicates that client reregister is supported is set in port_attr.port_cap_flags.

IBV_EVENT_PORT_ERR

The link became inactive and is now unavailable to send/receive packets.

The port_attr.state was in either the IBV_PORT_ACTIVE or the IBV_PORT_ACTIVE_DEFER state, and it moved to one of the following states: IBV_PORT_DOWN, IBV_PORT_INIT or IBV_PORT_ARMED. This can happen when there are problems with the link (for example: the cable was removed).

This will not affect the states of the QPs that are associated with this port. However, if they are reliable and try to send data, they may experience retry exceeded errors.

Device events

Here are the unaffiliated events that may occur in RDMA devices. Those events will be generated for all of the contexts that use the RDMA device that got the events.

IBV_EVENT_DEVICE_FATAL

The RDMA device suffered an error which isn't related to one of the above asynchronous events. When this event occurs, the behavior of the RDMA device is undefined, and it is highly recommended to close the process immediately, since an attempt to destroy the RDMA resources may fail.

Summary

The following table summarizes the behavior of the asynchronous events:

Event name                     Element type  Event type  Protocol
IBV_EVENT_COMM_EST             QP            Info        IB, RoCE
IBV_EVENT_SQ_DRAINED           QP            Info        IB, RoCE
IBV_EVENT_PATH_MIG             QP            Info        IB, RoCE
IBV_EVENT_QP_LAST_WQE_REACHED  QP            Info        IB, RoCE
IBV_EVENT_QP_FATAL             QP            Error       IB, RoCE, iWARP
IBV_EVENT_QP_REQ_ERR           QP            Error       IB, RoCE, iWARP
IBV_EVENT_QP_ACCESS_ERR        QP            Error       IB, RoCE, iWARP
IBV_EVENT_PATH_MIG_ERR         QP            Error       IB, RoCE
IBV_EVENT_CQ_ERR               CQ            Error       IB, RoCE, iWARP
IBV_EVENT_SRQ_LIMIT_REACHED    SRQ           Info        IB, RoCE, iWARP
IBV_EVENT_SRQ_ERR              SRQ           Error       IB, RoCE, iWARP
IBV_EVENT_PORT_ACTIVE          Port          Info        IB, RoCE, iWARP
IBV_EVENT_LID_CHANGE           Port          Info        IB
IBV_EVENT_PKEY_CHANGE          Port          Info        IB
IBV_EVENT_GID_CHANGE           Port          Info        IB, RoCE
IBV_EVENT_SM_CHANGE            Port          Info        IB
IBV_EVENT_CLIENT_REREGISTER    Port          Info        IB
IBV_EVENT_PORT_ERR             Port          Error       IB, RoCE, iWARP
IBV_EVENT_DEVICE_FATAL         Device        Error       IB, RoCE, iWARP

Parameters

Name Direction Description
context in RDMA device context that was returned from ibv_open_device()
event out The asynchronous event that occurred

Return Values

Value  Description
0      On success
-1     In blocking mode: there is an error
       In non-blocking mode: there isn't any async event to read, or there is an error
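In non-blocking mode, the -1 return value alone doesn't tell "nothing to read" apart from a real failure. One possible way to separate the two is to look at errno right after the call. This is a sketch under an assumption that isn't stated in this post: that ibv_get_async_event() leaves errno as set by the underlying read of the event file, so that an empty queue shows up as EAGAIN. The names async_read_status and classify_async_read() are hypothetical.

```c
#include <errno.h>

/* Hypothetical result type for one attempt to read an async event. */
enum async_read_status {
	ASYNC_READ_OK,		/* an event was read and must be acked */
	ASYNC_READ_EMPTY,	/* non-blocking mode and no event is pending */
	ASYNC_READ_ERROR	/* a real failure */
};

/*
 * ret is the return value of ibv_get_async_event() and err is the value
 * of errno sampled immediately after the call.
 * Assumption: an empty non-blocking queue is reported with errno EAGAIN.
 */
static enum async_read_status classify_async_read(int ret, int err)
{
	if (ret == 0)
		return ASYNC_READ_OK;
	if (err == EAGAIN || err == EWOULDBLOCK)
		return ASYNC_READ_EMPTY;
	return ASYNC_READ_ERROR;
}
```

A caller would write something like: ret = ibv_get_async_event(ctx, &event); status = classify_async_read(ret, errno); and go back to polling on ASYNC_READ_EMPTY instead of treating it as fatal.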

Examples

1) Reading asynchronous events (in a blocking way) and printing their content:

 
/* helper function to print the content of the async event */
static void print_async_event(struct ibv_context *ctx,
			      struct ibv_async_event *event)
{
	switch (event->event_type) {
	/* QP events */
	case IBV_EVENT_QP_FATAL:
		printf("QP fatal event for QP with handle %p\n", event->element.qp);
		break;
	case IBV_EVENT_QP_REQ_ERR:
		printf("QP Requestor error for QP with handle %p\n", event->element.qp);
		break;
	case IBV_EVENT_QP_ACCESS_ERR:
		printf("QP access error event for QP with handle %p\n", event->element.qp);
		break;
	case IBV_EVENT_COMM_EST:
		printf("QP communication established event for QP with handle %p\n", event->element.qp);
		break;
	case IBV_EVENT_SQ_DRAINED:
		printf("QP Send Queue drained event for QP with handle %p\n", event->element.qp);
		break;
	case IBV_EVENT_PATH_MIG:
		printf("QP Path migration loaded event for QP with handle %p\n", event->element.qp);
		break;
	case IBV_EVENT_PATH_MIG_ERR:
		printf("QP Path migration error event for QP with handle %p\n", event->element.qp);
		break;
	case IBV_EVENT_QP_LAST_WQE_REACHED:
		printf("QP last WQE reached event for QP with handle %p\n", event->element.qp);
		break;
 
	/* CQ events */
	case IBV_EVENT_CQ_ERR:
		printf("CQ error for CQ with handle %p\n", event->element.cq);
		break;
 
	/* SRQ events */
	case IBV_EVENT_SRQ_ERR:
		printf("SRQ error for SRQ with handle %p\n", event->element.srq);
		break;
	case IBV_EVENT_SRQ_LIMIT_REACHED:
		printf("SRQ limit reached event for SRQ with handle %p\n", event->element.srq);
		break;
 
	/* Port events */
	case IBV_EVENT_PORT_ACTIVE:
		printf("Port active event for port number %d\n", event->element.port_num);
		break;
	case IBV_EVENT_PORT_ERR:
		printf("Port error event for port number %d\n", event->element.port_num);
		break;
	case IBV_EVENT_LID_CHANGE:
		printf("LID change event for port number %d\n", event->element.port_num);
		break;
	case IBV_EVENT_PKEY_CHANGE:
		printf("P_Key table change event for port number %d\n", event->element.port_num);
		break;
	case IBV_EVENT_GID_CHANGE:
		printf("GID table change event for port number %d\n", event->element.port_num);
		break;
	case IBV_EVENT_SM_CHANGE:
		printf("SM change event for port number %d\n", event->element.port_num);
		break;
	case IBV_EVENT_CLIENT_REREGISTER:
		printf("Client reregister event for port number %d\n", event->element.port_num);
		break;
 
	/* RDMA device events */
	case IBV_EVENT_DEVICE_FATAL:
		printf("Fatal error event for device %s\n", ibv_get_device_name(ctx->device));
		break;
 
	default:
		printf("Unknown event (%d)\n", event->event_type);
	}
}
 
 
 
/* the actual code that reads the events in a loop and prints them */
struct ibv_async_event event;
int ret;
 
while (1) {
	/* wait for the next async event */
	ret = ibv_get_async_event(ctx, &event);
	if (ret) {
		fprintf(stderr, "Error, ibv_get_async_event() failed\n");
		return -1;
	}
 
	/* print the event */
	print_async_event(ctx, &event);
 
	/* ack the event */
	ibv_ack_async_event(&event);
}

2) Reading asynchronous events (in a non-blocking way) and printing their content:

struct ibv_async_event event;
int flags;
int ret;
 
printf("Changing the mode of events read to be non-blocking\n");
 
/* change the blocking mode of the async event queue */
flags = fcntl(ctx->async_fd, F_GETFL);
if (flags < 0) {
	fprintf(stderr, "Error, failed to get the flags of the async event queue file descriptor\n");
	return -1;
}
ret = fcntl(ctx->async_fd, F_SETFL, flags | O_NONBLOCK);
if (ret < 0) {
	fprintf(stderr, "Error, failed to change file descriptor of async event queue\n");
	return -1;
}
 
while (1) {
	struct pollfd my_pollfd;
	int ms_timeout = 100;
 
	/*
	 * poll the queue until it has an event and sleep ms_timeout
	 * milliseconds between any iteration
	 */
	my_pollfd.fd      = ctx->async_fd;
	my_pollfd.events  = POLLIN;
	my_pollfd.revents = 0;
	do {
		ret = poll(&my_pollfd, 1, ms_timeout);
	} while (ret == 0);
	if (ret < 0) {
		fprintf(stderr, "poll failed\n");
		return -1;
	}
 
	/* we know that there is an event, so we just need to read it */
	ret = ibv_get_async_event(ctx, &event);
	if (ret) {
		fprintf(stderr, "Error, ibv_get_async_event() failed\n");
		return -1;
	}
 
	/* print the event */
	print_async_event(ctx, &event);
 
	/* ack the event */
	ibv_ack_async_event(&event);
}

async_event.c
async_event_nonblocking.c

FAQs

Do I have to read the asynchronous events?

No. The asynchronous events mechanism is a way to provide extra information about things that happen in the CQs, QPs, SRQs, ports and devices. The user doesn't have to use it, but it is highly recommended to do so.

Can I read the events once in a while (for example, every few minutes)?

Yes, you can. The downside is that you won't know when the event happened, and by the time you read it the information may no longer be relevant.

Is this verb thread-safe?

Yes, this verb is thread-safe (just like the rest of the verbs).

I got a QP/CQ/SRQ event. Will other processes get this event too?

No. Affiliated events are generated only for the context that this resource belongs to. Other contexts won't even know that this event occurred.

Comments

  1. Junhyun Shim says: December 20, 2018

    Hi Dotan, how can another node polling cq for recv event tell if its peer process has crashed?
    I ran a quick experiment and it seems the alive node is simply polling the CQ without any erroneous work completion. I made it sidetrack to calling ibv_get_async_event occasionally and no new event from that fd either.

    • Dotan Barak says: December 24, 2018

      Hi.

      If you are using only the verbs - you won't be able to know that the remote side crashed.
      If you'll use CM/CMA you'll get an event about it.

      However, you can implement a simple keep alive mechanism:
      RDMA Write of zero-bytes message (assuming that this is an RC QP).

      Thanks
      Dotan

  2. Kethiri says: July 11, 2019

    What are the possible reasons to get async event IBV_EVENT_PATH_MIG. I am handling high number of traffic in one path. In that case I am getting IBV_EVENT_PATH_MIG. Are those correlated ?

    Regards,
    Kethiri

    • Dotan Barak says: July 13, 2019

      Hi.

      Automatic Path Migration (APM) starts when there is a transport error with a connection
      (on a reliable Connection).

      I must admit that I don't understand what "High number of traffic" means;
      But maybe now (when there are many packets in the path), the QP timeout isn't enough - and you should increase it.

      Thanks
      Dotan
