ibv_attach_mcast()

int ibv_attach_mcast(struct ibv_qp *qp, const union ibv_gid *gid,
                     uint16_t lid);

Description

ibv_attach_mcast() attaches a Queue Pair to a multicast group.

After the attachment completes, the QP receives a copy of every multicast message that is addressed to the group specified by gid and arrives on the RDMA device port with which the QP is associated. ibv_attach_mcast() only affects the local RDMA device. For multicast messages to reach this RDMA device at all, a join request for the multicast group must be sent to the Subnet Administrator (SA), so that the fabric's multicast routing is configured to deliver messages to the local port.

Only QPs of Transport Service Type IBV_QPT_UD may be attached to multicast groups.

Attaching a QP to the same multicast group more than once will succeed. However, the QP is attached to the multicast group only once, and it will still receive only a single copy of every message sent to this multicast group.

If the value of dev_cap.max_mcast_grp is zero, the RDMA device doesn't support unreliable multicast groups. Otherwise, this value specifies the total number of supported multicast groups.

The same QP can be attached to one or more multicast groups.
One or more QPs can be attached to the same multicast group. The value dev_cap.max_mcast_qp_attach specifies the maximum number of QPs that can be attached to a single multicast group.

dev_cap.max_total_mcast_qp_attach specifies the total number of QP attachments allowed across all multicast groups. This value is less than or equal to dev_cap.max_mcast_grp * dev_cap.max_mcast_qp_attach.
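The relationship between the three limits can be sketched in plain C. This is only an illustration: struct mcast_caps and both helper functions are hypothetical names, standing in for the max_mcast_grp, max_mcast_qp_attach and max_total_mcast_qp_attach fields that ibv_query_device() fills in struct ibv_device_attr.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the three multicast-related fields of
 * struct ibv_device_attr; real code would copy them from the result
 * of ibv_query_device(). */
struct mcast_caps {
	int max_mcast_grp;              /* number of supported multicast groups */
	int max_mcast_qp_attach;        /* max QPs attached to a single group */
	int max_total_mcast_qp_attach;  /* max attachments across all groups */
};

/* Returns 1 if the device supports multicast groups, 0 otherwise. */
int mcast_supported(const struct mcast_caps *caps)
{
	assert(caps != NULL);
	return caps->max_mcast_grp > 0;
}

/* Returns the effective ceiling on QP attachments across all groups.
 * The device-wide limit should never exceed groups * per-group limit,
 * so taking the minimum is only a defensive measure. */
int64_t mcast_total_attach_limit(const struct mcast_caps *caps)
{
	assert(caps != NULL);
	int64_t product = (int64_t)caps->max_mcast_grp * caps->max_mcast_qp_attach;
	int64_t total = caps->max_total_mcast_qp_attach;
	return total < product ? total : product;
}
```

An application that plans to attach many QPs could call mcast_supported() once at startup and compare its planned number of attachments against mcast_total_attach_limit() before creating the QPs.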

Attaching a QP to a multicast group can occur in any QP state.

Parameters

Name Direction Description
qp in QP that was returned from ibv_create_qp()
gid in Multicast GID that the QP will be attached to
lid in Multicast LID that the QP will be attached to

Return Values

Value Description
0 On success
errno On failure
EINVAL Invalid gid (not a multicast GID) or qp isn't a UD QP
ENOMEM Not enough resources to complete this operation
ENOSYS Multicast groups aren't supported by this device
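Since the function returns the error code directly, the table above can be turned into a small diagnostic helper. The sketch below is a hypothetical convenience function (attach_mcast_strerror() is not part of libibverbs); it assumes only the three error values listed in the table and falls back to strerror() for anything else.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Hypothetical helper: translate the value returned by
 * ibv_attach_mcast() into the explanation from the table above. */
const char *attach_mcast_strerror(int rc)
{
	assert(rc >= 0); /* the return value is 0 or a positive errno value */

	switch (rc) {
	case 0:
		return "success";
	case EINVAL:
		return "invalid gid (not a multicast GID) or qp isn't a UD QP";
	case ENOMEM:
		return "not enough resources to complete this operation";
	case ENOSYS:
		return "multicast groups aren't supported by this device";
	default:
		/* Any other value: use the generic system description. */
		return strerror(rc);
	}
}
```

A caller would store the return value of ibv_attach_mcast() in an int and pass it to this helper when printing the failure message.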

Examples

Attach a QP to a multicast group and detach it:

struct ibv_qp *qp; /* A pointer to a QP that was created before */
union ibv_gid mgid;
uint16_t mlid;
 
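/* mgid and mlid must be initialized with the values obtained from the SA */
/* (an array cannot be assigned directly, so copy the bytes; needs <string.h>) */
static const uint8_t mgid_raw[16] = {255,1,0,0,0,2,201,133,0,0,0,0,0,0,0,0};
memcpy(mgid.raw, mgid_raw, sizeof(mgid.raw));
mlid = 0xc001;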
 
if (ibv_attach_mcast(qp, &mgid, mlid)) {
	fprintf(stderr, "Error, ibv_attach_mcast() failed\n");
	return -1;
}
 
if (ibv_detach_mcast(qp, &mgid, mlid)) {
	fprintf(stderr, "Error, ibv_detach_mcast() failed\n");
	return -1;
}
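Before calling ibv_attach_mcast(), it can help to sanity-check the values obtained from the SA. The following is a minimal sketch with hypothetical helper names; it relies on two InfiniBand addressing conventions: a multicast GID starts with the byte 0xFF, and multicast LIDs lie in the range 0xC000 to 0xFFFE (0xFFFF is the permissive LID).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if the 16 raw GID bytes
 * (e.g. the raw member of union ibv_gid) form a multicast GID. */
int is_multicast_gid(const uint8_t raw[16])
{
	assert(raw != NULL);
	return raw[0] == 0xFF;
}

/* Hypothetical helper: returns 1 if lid is in the multicast LID range. */
int is_multicast_lid(uint16_t lid)
{
	return lid >= 0xC000 && lid <= 0xFFFE;
}
```

With the values from the example above, is_multicast_gid(mgid.raw) and is_multicast_lid(mlid) both return 1. A check like this only catches malformed input early; the device and the SA remain the authority on whether the group actually exists.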

FAQs

Can I attach any QP to a multicast group?

No. Only Unreliable Datagram QPs can be attached to multicast groups.

How do I get the value for gid and lid?

One should get those values from the SA.

I called ibv_attach_mcast() and I didn't get multicast messages. Why?

ibv_attach_mcast() only configures the local RDMA device to deliver a copy of every incoming multicast message to the attached QPs. However, the RDMA device won't receive multicast messages from the fabric unless the Subnet Manager has configured the fabric to route them to it. To make that happen, one should send a join request for the multicast group to the SA.

How can I send a join request for a multicast group to the SA?

One can do this by using the librdmacm library.

Can I attach one QP to the same multicast group more than once?

Yes, the call will succeed. However, the QP will still receive only one copy of each multicast message.

Can I attach the same QP to more than one multicast group?

Yes, you can.

Comments

Tell us what you think.

  1. Lluis says: December 2, 2014

    I noticed that with multicast QPs, sometimes the node sending a message receives a copy of it and sometimes not. What determines whether nodes get a copy of their own messages or not?

    Thanks,

    • Dotan Barak says: December 4, 2014

      Hi Lluis.

      In InfiniBand, if a UD QP sends a message to a multicast group and it is a member of that multicast group,
      it will get the message too
      (as long as it has an outstanding Receive Request in its Receive Queue to hold the incoming message).

      Thanks
      Dotan

  2. AlexChes says: June 23, 2015

    Hi Dotan,

    Thank you for your excellent blog!
    I made some working send-style test programs with different QP types (RC, UC, UD). Now I'm trying to test multicasting. I just can't understand how to obtain a multicast GID. (I work under Windows Server 2008 (x64). Here there is an IB Access Layer which differs from what you use under Linux.) Could you tell me in a little more detail how to apply your advice "should get those values from the SA"?

    Thanks, Alex.

    • Dotan Barak says: June 23, 2015

      Hi Alex.

      Thanks for the feedback
      :)

      I assume that you are referring to IBAL.
      Unlike the Linux stack, which only attaches the QP to the multicast group locally,
      IBAL sends a join message to the SA as well.

      However, in the WinOF Release Notes one can see that this stack is deprecated,
      and Network Direct (ND) should be used (ND doesn't support UD QPs).

      Did I answer your question?
      Or are you asking how one can decide which multicast GID to use?
      (and the interface with the SA)

      Thanks
      Dotan

      • AlexChes says: June 24, 2015

        Hi Dotan,

        Thank you for the reply!

        >>However, In the WinOF Release Notes one can see that this stack is deprecated,
        and Network Direct (ND) should be used (ND doesn't support UD QPs).

        Sad news. The quality and completeness of the IBAL documentation are not very high. That documentation is clear only if a developer has deep knowledge of IB. I don't. On the other hand, I can't find any documents/tutorials for NetworkDirect at all.
        Do you think that IBAL will be excluded from future releases of WinOF? (Actually, I read Release Notes (Rev 4.95.50000) but I didn't see information that IBAL is deprecated.)

        >>Or do you ask me how one can decide with is the multicast GID to use? (and the interface with the SA)

        Yes, I do. I'm creating a multicast group and I have some parameters which must be set, but I don't know how to fill them:

        ib_mcast_req_t mcast_req;

        memset(&mcast_req, 0, sizeof(mcast_req));

        mcast_req.create = 1;
        mcast_req.flags = IB_FLAGS_SYNC;
        mcast_req.mcast_context = pClientContext;
        mcast_req.pfn_mcast_cb = IBOnMulticastJoin;
        mcast_req.pkey_index = 0;
        mcast_req.port_guid = mpIBChannelAdapterAttributes->p_port_attr[mParams.IBPort - 1].port_guid;
        mcast_req.retry_cnt = 5;
        mcast_req.timeout_ms = 1000;

        mcast_req.member_rec.pkey = 0;
        mcast_req.member_rec.qkey = 0;

        mcast_req.member_rec.mgid.raw = ; // ?
        mcast_req.member_rec.scope_state = ; // ?

        mcast_req.member_rec.sl_flow_hop = ; // ?
        mcast_req.member_rec.tclass = ;// ?

        ib_api_status = ib_join_mcast(pClientContext->IBQueuePairHandle, &mcast_req, &mIBMulticastHandle);

        if(ib_api_status != IB_SUCCESS)
        {
        ...
        }

        Could you give me any advice on how to solve this problem?

        Thanks, Alex.

      • Dotan Barak says: June 29, 2015

        Hi.

        You can use the following values for the attributes you mentioned:
        * mcast_req.member_rec.mgid.raw: the multicast GID of the group that you want to be a member of
        (for example: 255,1,0,0,0,0,0,0,0,2,201,0,1,0,208,100)
        * mcast_req.member_rec.scope_state: ib_member_set_scope_state( scope, state );
        scope: according to "Table 3 Multicast Address Scope",
        the value 2 (i.e. Link-local) can be used.
        state: according to the joining mode,
        the value IB_MC_REC_STATE_FULL_MEMBER can be used.
        * mcast_req.member_rec.sl_flow_hop: ib_member_set_sl_flow_hop( sl, flow_label, hop_limit );
        set sl/flow_label/hop_limit to the values that you need.
        You can use the values: 0, 0, 0.
        * mcast_req.member_rec.tclass: according to the traffic class that you want to use.
        You can use the value 0.

        I hope that this helped you.

        Thanks
        Dotan

  3. Vasilis_G says: May 1, 2017

    Hello Dotan,

    Thank you very much for this blog.
    I have the following question on multicast:
    I am wondering whether the multicast happens in software or hardware (i.e. does the sender NIC send one message, or multiple messages?)

    Does the multicast mean that the sender NIC will send a single message and the switch will propagate it to all QPs that have joined the multicast group?

    That would imply some existing support/functionality from the switch. Should I expect that such support exists if the cluster supports RDMA?

    Apologies if the question seems a bit silly, but I have not been able to derive a definitive answer on this from the blog, the book or the documentation (probably because of my limited knowledge on the matters).

    Vasilis

    • Dotan Barak says: July 3, 2017

      Hi.

      Here are some clarifications on how multicast works in InfiniBand:
      * HCAs join the multicast group by an SA join request
      (and the SM configures the subnet according to this)
      * The HCA will send one multicast message and the switch will duplicate it to the ports that should get the message
      (this will cause the message to propagate in the subnet)
      * One message will be received by each remote HCA, and the HCA will duplicate the message to all the relevant QPs.

      And there isn't any silly question
      :)

      Thanks
      Dotan

      • huhu says: December 1, 2017

        Does RDMA multicast work only in InfiniBand? Does RoCE support this operation?

      • Dotan Barak says: December 6, 2017

        No.

        You have multicast over UDP sockets in RoCE/Ethernet.

        Thanks
        Dotan

  4. TomS says: January 14, 2019

    I am new to IB and am confused when RDMA and multicast are discussed together. Isn't RDMA a one-to-one (point-to-point) transfer, whereas multicast is one-to-many?

    Also, does IB multicast imply IPoIB is the actual protocol? If so, can IPoIB bypass the kernel?

    Thanks

    • Dotan Barak says: January 15, 2019

      Hi.

      RDMA is a network ability that allows Remote DMA (Direct Memory Access).
      InfiniBand for example, is a network technology which supports RDMA.

      In InfiniBand one can perform RDMA Write in Connected transport types (i.e. one to one).

      However, InfiniBand supports Unreliable Datagram transport types as well,
      which allows performing one to many (i.e. multicast).

      IPoIB is an upper layer protocol that uses InfiniBand packets,
      and it uses multicast as part of its work (to imitate "flood", as done in an Ethernet fabric).

      IPoIB doesn't bypass the kernel; there is a kernel module which implements it
      (and it uses the TCP/IP stack that exists in the kernel's network stack).

      Thanks
      Dotan
