
ibv_ack_cq_events()

void ibv_ack_cq_events(struct ibv_cq *cq, unsigned int nevents);

Description

ibv_ack_cq_events() acknowledges Completion events.

In order to prevent races, all of the Completion events that were read using ibv_get_cq_event() must be acknowledged using ibv_ack_cq_events().

Calling ibv_ack_cq_events() may be relatively expensive in the data path, since it uses mutual exclusion object(s) internally. It may therefore be better to amortize this cost by keeping a count of the events that need acknowledgment and acknowledging several Completion events in a single call to ibv_ack_cq_events().

Parameters

Name     Direction  Description
cq       in         CQ that was returned from ibv_create_cq()
nevents  in         Number of Completion events to acknowledge

Return Values

None (this function always succeeds).

Examples

Read a Completion event and acknowledge it:

struct ibv_context *context;
struct ibv_comp_channel *channel;
struct ibv_cq *cq;
void *ev_ctx = NULL; /* can be initialized with other values for the CQ context */
int ret;
 
/* Create a Completion Event Channel */
channel = ibv_create_comp_channel(context);
if (!channel) {
        fprintf(stderr, "Failed to create a Completion Event Channel\n");
        return -1;
}
 
/* Create a CQ, which is associated with the Completion Event Channel */
cq = ibv_create_cq(context, 1, ev_ctx, channel, 0);
if (!cq) {
        fprintf(stderr, "Failed to create CQ\n");
        return -1;
}
 
/* Request notification before any completion can be created (to prevent races) */
ret = ibv_req_notify_cq(cq, 0);
if (ret) {
        fprintf(stderr, "Couldn't request CQ notification\n");
        return -1;
}
 
.
. /* Perform an operation that will eventually end with Work Completion */
.
 
/* The following code will be called each time you need to read a Work Completion */
struct ibv_cq *ev_cq;
void *ev_ctx;
struct ibv_wc wc;
int ret;
int ne;
 
/* Wait for the Completion event */
ret = ibv_get_cq_event(channel, &ev_cq, &ev_ctx);
if (ret) {
        fprintf(stderr, "Failed to get CQ event\n");
        return -1;
}
 
/* Request notification upon the next completion event */
ret = ibv_req_notify_cq(ev_cq, 0);
if (ret) {
        fprintf(stderr, "Couldn't request CQ notification\n");
        return -1;
}
 
/* Empty the CQ: poll all of the completions from the CQ (if any exist) */
do {
        ne = ibv_poll_cq(ev_cq, 1, &wc);
        if (ne < 0) {
                fprintf(stderr, "Failed to poll completions from the CQ: ret = %d\n",
                        ne);
                return -1;
        }

        /* there may be an extra event with no completion in the CQ */
        if (ne == 0)
                continue;

        if (wc.status != IBV_WC_SUCCESS) {
                fprintf(stderr, "Completion with status 0x%x was found\n",
                        wc.status);
                return -1;
        }
} while (ne);
 
/* Ack the event */
ibv_ack_cq_events(ev_cq, 1);

FAQs

Why do I need to call ibv_ack_cq_events() anyway?

This verb prevents internal races between reading a Completion event and destroying the CQ that generated it.

What will happen if I don't acknowledge all of the Completion events?

If one doesn't acknowledge all of the Completion events that were read using ibv_get_cq_event(), destroying the CQ that received those events will block forever. This behavior prevents an acknowledgment on a resource that has already been destroyed.

What will happen if I read a Completion event and my process is terminated, intentionally (for example, by calling exit()) or unintentionally (for example, by a segmentation fault), before I could acknowledge the event?

Even if there are unacknowledged Completion events when the process terminates, no matter the reason, all of its resources will be cleaned up.

Should I acknowledge the Completion events one by one, or acknowledge several Completion events at once?

Acknowledging Completion events requires the use of mutual exclusion object(s), so it is highly advised to acknowledge several Completion events at once (especially in the data path).


Comments

Tell us what you think.

  1. DjvuLee says: February 26, 2015

    What does "prevent internal races" mean? Can you explain in more depth? If only one thread runs ibv_get_cq_event(), how can a race be introduced?

    • Dotan Barak says: February 26, 2015

      Hi.

      Yes, but libibverbs is thread-safe, so it should be ready to work with multiple threads.

      Furthermore, when reading a Completion event from a channel, you get the handle of the CQ that got the event.
      Without proper protection, you may get a handle to a CQ which was already destroyed,
      which may lead to inconsistency or a segmentation fault.

      Thanks
      Dotan

      • DjvuLee says: March 4, 2015

        Thanks very much for your answer! I missed your wonderful answer in my Email box.

        I have one more question.
        In a client-server scenario, if the server doesn't know the size of the request message that the client will send, how can the server register the memory? A bigger memory region can't completely solve this problem, can it?
        In TCP/IP there is flow control inside the protocol: if the sender sends too much, the receiver can tell it to slow down. How can we deal with this in an RDMA protocol?

      • Dotan Barak says: March 4, 2015

        Hi.

        One can register a big memory buffer (the maximum possible value) and be ready for this scenario;
        or the client side can split the message according to the maximum buffer size reported by the server.

        You can register big buffers without any problem (as long as you don't try to register most of the machine memory).

        The flow control that you described: do you mean in the application layer or in the protocol?

        Let's assume that you are talking about the protocol:
        what do you mean by "send too much"? With RDMA Reads/Writes there (almost) isn't any problem.
        If Receive Requests need to be consumed on the remote side, there is the RNR flow.

        Besides what I described, AFAIK, the protocol doesn't start slowing down the traffic.

        Thanks
        Dotan

  2. DjvuLee says: March 4, 2015

    The answer is very clear! Yes, I mean the receiver not being ready in the Send/Recv flow; I understand this now.

    But after the client receives the RNR notification from the server, when can the client send messages again? Will the server send a message to notify the client to continue sending?

    Consider another question: if the client wants to use the RDMA Write operation to write a message to the server, how can the client get the rkey of the buffer on the server? I think the server may send the buffer's rkey and location to the client, but this may introduce some latency, because the client has to wait for the server to send this info first and can only send the message afterwards. If the client will write many times to the server, is there any way to avoid this or cut down the cost?

    • Dotan Barak says: March 4, 2015

      No.

      The 'min_rnr_timer' specifies how much time the remote side should wait before resending the message in the RNR flow.

      The answer to the second question: the client can't perform an RDMA Write to the server's memory unless it knows the
      server's Memory Region attributes (size, key, address). This can only be done by:
      1) When connecting (for example, using rdmacm), sending this info as private data
      2) Sending RDMA messages (for example, using the Send opcode)
      3) Using an out-of-band protocol/system

      Sorry, no free lunches here ...

      Thanks
      Dotan

      • DjvuLee says: March 4, 2015

        Got it! Thanks for your nice answer!

  3. neuralcn says: April 24, 2015

    But if min_rnr_timer is set, when an RNR occurs the transaction response time may become larger. If so, how do we deal with it?

    • Dotan Barak says: April 24, 2015

      Hi.

      I don't really understand the question here. Can you please rephrase it?

      Thanks
      Dotan

      • neuralcn says: April 27, 2015

        I mean: when min_rnr_timer is set, the remote side should wait and retry, but the server side may have already posted a Receive Request, so this will not be real time.

      • Dotan Barak says: April 28, 2015

        The min_rnr_timer gives the receiver enough time to post more Receive Requests to its Receive Queue.
        Since the receiver knows its priorities and algorithms, it specifies the time that the sender should wait before resending the message.

        I must admit that I don't understand what "real time" means here.

        There won't be any message from the receiver telling the sender "hey, now I have Receive Requests; you can (re)send the message now".
        The sender will wait until (at least) the min_rnr_timer expires, even if many Receive Requests were posted just after the receiver sent the RNR NACK ...

        Thanks
        Dotan

  4. CQ Tang says: August 4, 2016

    About the race condition. Here is your statement:

    when reading Completion event from a channel - you get the CQ handler that got the event.
    Without a proper protection, you may get a handler to a CQ which was already destroyed;
    which may lead to inconsistency or segmentation fault.

    But after you call ibv_ack_cq_events(), the CQ can be destroyed immediately; then, later, as in your example code, ibv_req_notify_cq() and ibv_poll_cq() will use the destroyed CQ handle.

    Does that mean we should call ibv_ack_cq_events() last in your example code, instead of right after the ibv_get_cq_event() call?

    • Dotan Barak says: August 11, 2016

      Hi.

      Any operation on a destroyed CQ may end with a segmentation fault.
      I understand what you mean; I illustrated a code sample of how to use a CQ.

      If one wishes to be on the safe side, he can move the ibv_ack_cq_events() call to after the CQ is used.
      However, to destroy the CQ, the associated QPs should be destroyed first,
      so this is part of the teardown flow...

      Nice comment though
      Dotan

      • Dotan Barak says: August 18, 2016

        After rethinking this, since I believe in defensive programming,
        I agree.

        The ack of the event was moved to the end of the code sample.

        Thanks!
        Dotan
