Diffstat (limited to 'static/netbsd/man4/multicast.4 3.html')
| -rw-r--r-- | static/netbsd/man4/multicast.4 3.html | 717 |
1 files changed, 0 insertions, 717 deletions
diff --git a/static/netbsd/man4/multicast.4 3.html b/static/netbsd/man4/multicast.4 3.html deleted file mode 100644 index f4cbdddc..00000000 --- a/static/netbsd/man4/multicast.4 3.html +++ /dev/null @@ -1,717 +0,0 @@ -<table class="head"> - <tr> - <td class="head-ltitle">MULTICAST(4)</td> - <td class="head-vol">Device Drivers Manual</td> - <td class="head-rtitle">MULTICAST(4)</td> - </tr> -</table> -<div class="manual-text"> -<section class="Sh"> -<h1 class="Sh" id="NAME"><a class="permalink" href="#NAME">NAME</a></h1> -<p class="Pp"><code class="Nm">multicast</code> — - <span class="Nd">Multicast Routing</span></p> -</section> -<section class="Sh"> -<h1 class="Sh" id="SYNOPSIS"><a class="permalink" href="#SYNOPSIS">SYNOPSIS</a></h1> -<p class="Pp"><code class="Cd">options MROUTING</code></p> -<p class="Pp"> - <br/> - <code class="In">#include <<a class="In">sys/types.h</a>></code> - <br/> - <code class="In">#include <<a class="In">sys/socket.h</a>></code> - <br/> - <code class="In">#include <<a class="In">netinet/in.h</a>></code> - <br/> - <code class="In">#include <<a class="In">netinet/ip_mroute.h</a>></code> - <br/> - <code class="In">#include - <<a class="In">netinet6/ip6_mroute.h</a>></code></p> -<p class="Pp"><var class="Ft">int</var> - <br/> - <code class="Fn">getsockopt</code>(<var class="Fa" style="white-space: nowrap;">int - s</var>, <var class="Fa" style="white-space: nowrap;">IPPROTO_IP</var>, - <var class="Fa" style="white-space: nowrap;">MRT_INIT</var>, - <var class="Fa" style="white-space: nowrap;">void *optval</var>, - <var class="Fa" style="white-space: nowrap;">socklen_t *optlen</var>);</p> -<p class="Pp"><var class="Ft">int</var> - <br/> - <code class="Fn">setsockopt</code>(<var class="Fa" style="white-space: nowrap;">int - s</var>, <var class="Fa" style="white-space: nowrap;">IPPROTO_IP</var>, - <var class="Fa" style="white-space: nowrap;">MRT_INIT</var>, - <var class="Fa" style="white-space: nowrap;">const void *optval</var>, - <var class="Fa" 
style="white-space: nowrap;">socklen_t optlen</var>);</p> -<p class="Pp"><var class="Ft">int</var> - <br/> - <code class="Fn">getsockopt</code>(<var class="Fa" style="white-space: nowrap;">int - s</var>, <var class="Fa" style="white-space: nowrap;">IPPROTO_IPV6</var>, - <var class="Fa" style="white-space: nowrap;">MRT6_INIT</var>, - <var class="Fa" style="white-space: nowrap;">void *optval</var>, - <var class="Fa" style="white-space: nowrap;">socklen_t *optlen</var>);</p> -<p class="Pp"><var class="Ft">int</var> - <br/> - <code class="Fn">setsockopt</code>(<var class="Fa" style="white-space: nowrap;">int - s</var>, <var class="Fa" style="white-space: nowrap;">IPPROTO_IPV6</var>, - <var class="Fa" style="white-space: nowrap;">MRT6_INIT</var>, - <var class="Fa" style="white-space: nowrap;">const void *optval</var>, - <var class="Fa" style="white-space: nowrap;">socklen_t optlen</var>);</p> -</section> -<section class="Sh"> -<h1 class="Sh" id="DESCRIPTION"><a class="permalink" href="#DESCRIPTION">DESCRIPTION</a></h1> -<p class="Pp">Multicast routing is used to efficiently propagate data packets to - a set of multicast listeners in multipoint networks. If unicast is used to - replicate the data to all listeners, then some of the network links may - carry multiple copies of the same data packets. With multicast routing, the - overhead is reduced to one copy (at most) per network link.</p> -<p class="Pp">All multicast-capable routers must run a common multicast routing - protocol. The Distance Vector Multicast Routing Protocol (DVMRP) was the - first developed multicast routing protocol. 
Later, other protocols such as - Multicast Extensions to OSPF (MOSPF), Core Based Trees (CBT), Protocol - Independent Multicast - Sparse Mode (PIM-SM), and Protocol Independent - Multicast - Dense Mode (PIM-DM) were developed as well.</p> -<p class="Pp">To start multicast routing, the user must enable multicast - forwarding in the kernel (see <a class="Sx" href="#SYNOPSIS">SYNOPSIS</a> - about the kernel configuration options), and must run a multicast routing - capable user-level process. From developer's point of view, the programming - guide described in the <a class="Sx" href="#Programming_Guide">Programming - Guide</a> section should be used to control the multicast forwarding in the - kernel.</p> -<section class="Ss"> -<h2 class="Ss" id="Programming_Guide"><a class="permalink" href="#Programming_Guide">Programming - Guide</a></h2> -<p class="Pp">This section provides information about the basic multicast - routing API. The so-called “advanced multicast API” is - described in the - <a class="Sx" href="#Advanced_Multicast_API_Programming_Guide">Advanced - Multicast API Programming Guide</a> section.</p> -<p class="Pp">First, a multicast routing socket must be open. That socket would - be used to control the multicast forwarding in the kernel. Note that most - operations below require certain privilege (i.e., root privilege):</p> -<div class="Bd Pp Li"> -<pre>/* IPv4 */ -int mrouter_s4; -mrouter_s4 = socket(AF_INET, SOCK_RAW, IPPROTO_IGMP);</pre> -</div> -<div class="Bd Pp Li"> -<pre>int mrouter_s6; -mrouter_s6 = socket(AF_INET6, SOCK_RAW, IPPROTO_ICMPV6);</pre> -</div> -<p class="Pp">Note that if the router needs to open an IGMP or ICMPv6 socket (in - case of IPv4 and IPv6 respectively) for sending or receiving of IGMP or MLD - multicast group membership messages, then the same - <var class="Va">mrouter_s4</var> or <var class="Va">mrouter_s6</var> sockets - should be used for sending and receiving respectively IGMP or MLD messages. 
- In case of <span class="Ux">BSD</span>-derived kernel, it may be possible to - open separate sockets for IGMP or MLD messages only. However, some other - kernels (e.g., Linux) require that the multicast routing socket must be used - for sending and receiving of IGMP or MLD messages. Therefore, for - portability reason the multicast routing socket should be reused for IGMP - and MLD messages as well.</p> -<p class="Pp">After the multicast routing socket is open, it can be used to - enable or disable multicast forwarding in the kernel:</p> -<div class="Bd Pp Li"> -<pre>/* IPv4 */ -int v = 1; /* 1 to enable, or 0 to disable */ -setsockopt(mrouter_s4, IPPROTO_IP, MRT_INIT, (void *)&v, sizeof(v));</pre> -</div> -<div class="Bd Pp Li"> -<pre>/* IPv6 */ -int v = 1; /* 1 to enable, or 0 to disable */ -setsockopt(mrouter_s6, IPPROTO_IPV6, MRT6_INIT, (void *)&v, sizeof(v)); -... -/* If necessary, filter all ICMPv6 messages */ -struct icmp6_filter filter; -ICMP6_FILTER_SETBLOCKALL(&filter); -setsockopt(mrouter_s6, IPPROTO_ICMPV6, ICMP6_FILTER, (void *)&filter, - sizeof(filter));</pre> -</div> -<p class="Pp">After multicast forwarding is enabled, the multicast routing - socket can be used to enable PIM processing in the kernel if we are running - PIM-SM or PIM-DM (see <a class="Xr">pim(4)</a>).</p> -<p class="Pp">For each network interface (e.g., physical or a virtual tunnel) - that would be used for multicast forwarding, a corresponding multicast - interface must be added to the kernel:</p> -<div class="Bd Pp Li"> -<pre>/* IPv4 */ -struct vifctl vc; -memset(&vc, 0, sizeof(vc)); -/* Assign all vifctl fields as appropriate */ -vc.vifc_vifi = vif_index; -vc.vifc_flags = vif_flags; -vc.vifc_threshold = min_ttl_threshold; -vc.vifc_rate_limit = max_rate_limit; -memcpy(&vc.vifc_lcl_addr, &vif_local_address, sizeof(vc.vifc_lcl_addr)); -if (vc.vifc_flags & VIFF_TUNNEL) - memcpy(&vc.vifc_rmt_addr, &vif_remote_address, - sizeof(vc.vifc_rmt_addr)); -setsockopt(mrouter_s4, IPPROTO_IP, 
MRT_ADD_VIF, (void *)&vc, - sizeof(vc));</pre> -</div> -<p class="Pp">The <var class="Va">vif_index</var> must be unique per vif. The - <var class="Va">vif_flags</var> contains the <code class="Dv">VIFF_*</code> - flags as defined in - <code class="In"><<a class="In">netinet/ip_mroute.h</a>></code>. The - <var class="Va">min_ttl_threshold</var> contains the minimum TTL a multicast - data packet must have to be forwarded on that vif. Typically, it has - a value of 1. The <var class="Va">max_rate_limit</var> contains the maximum - rate (in bits/s) of the multicast data packets forwarded on that vif. A value - of 0 means no limit. The <var class="Va">vif_local_address</var> contains - the local IP address of the corresponding local interface. The - <var class="Va">vif_remote_address</var> contains the remote IP address in - case of DVMRP multicast tunnels.</p> -<div class="Bd Pp Li"> -<pre>/* IPv6 */ -struct mif6ctl mc; -memset(&mc, 0, sizeof(mc)); -/* Assign all mif6ctl fields as appropriate */ -mc.mif6c_mifi = mif_index; -mc.mif6c_flags = mif_flags; -mc.mif6c_pifi = pif_index; -setsockopt(mrouter_s6, IPPROTO_IPV6, MRT6_ADD_MIF, (void *)&mc, - sizeof(mc));</pre> -</div> -<p class="Pp">The <var class="Va">mif_index</var> must be unique per mif. The - <var class="Va">mif_flags</var> contains the <code class="Dv">MIFF_*</code> - flags as defined in - <code class="In"><<a class="In">netinet6/ip6_mroute.h</a>></code>. 
The - <var class="Va">pif_index</var> is the physical interface index of the - corresponding local interface.</p> -<p class="Pp">A multicast interface is deleted by:</p> -<div class="Bd Pp Li"> -<pre>/* IPv4 */ -vifi_t vifi = vif_index; -setsockopt(mrouter_s4, IPPROTO_IP, MRT_DEL_VIF, (void *)&vifi, - sizeof(vifi));</pre> -</div> -<div class="Bd Pp Li"> -<pre>/* IPv6 */ -mifi_t mifi = mif_index; -setsockopt(mrouter_s6, IPPROTO_IPV6, MRT6_DEL_MIF, (void *)&mifi, - sizeof(mifi));</pre> -</div> -<p class="Pp">After the multicast forwarding is enabled, and the multicast - virtual interfaces are added, the kernel may deliver upcall messages (also - called signals later in this text) on the multicast routing socket that was - open earlier with <code class="Dv">MRT_INIT</code> or - <code class="Dv">MRT6_INIT</code>. The IPv4 upcalls have - <var class="Vt">struct igmpmsg</var> header (see - <code class="In"><<a class="In">netinet/ip_mroute.h</a>></code>) with - field <var class="Va">im_mbz</var> set to zero. Note that this header - follows the structure of <var class="Vt">struct ip</var> with the protocol - field <var class="Va">ip_p</var> set to zero. The IPv6 upcalls have - <var class="Vt">struct mrt6msg</var> header (see - <code class="In"><<a class="In">netinet6/ip6_mroute.h</a>></code>) - with field <var class="Va">im6_mbz</var> set to zero. Note that this header - follows the structure of <var class="Vt">struct ip6_hdr</var> with the next - header field <var class="Va">ip6_nxt</var> set to zero.</p> -<p class="Pp">The upcall header contains field <var class="Va">im_msgtype</var> - and <var class="Va">im6_msgtype</var> with the type of the upcall - <code class="Dv">IGMPMSG_*</code> and <code class="Dv">MRT6MSG_*</code> for - IPv4 and IPv6 respectively. 
The values of the rest of the upcall header - fields and the body of the upcall message depend on the particular upcall - type.</p> -<p class="Pp">If the upcall message type is - <code class="Dv">IGMPMSG_NOCACHE</code> or - <code class="Dv">MRT6MSG_NOCACHE</code>, this is an indication that a - multicast packet has reached the multicast router, but the router has no - forwarding state for that packet. Typically, the upcall would be a signal - for the multicast routing user-level process to install the appropriate - Multicast Forwarding Cache (MFC) entry in the kernel.</p> -<p class="Pp">An MFC entry is added by:</p> -<div class="Bd Pp Li"> -<pre>/* IPv4 */ -struct mfcctl mc; -memset(&mc, 0, sizeof(mc)); -memcpy(&mc.mfcc_origin, &source_addr, sizeof(mc.mfcc_origin)); -memcpy(&mc.mfcc_mcastgrp, &group_addr, sizeof(mc.mfcc_mcastgrp)); -mc.mfcc_parent = iif_index; -for (i = 0; i < maxvifs; i++) - mc.mfcc_ttls[i] = oifs_ttl[i]; -setsockopt(mrouter_s4, IPPROTO_IP, MRT_ADD_MFC, - (void *)&mc, sizeof(mc));</pre> -</div> -<div class="Bd Pp Li"> -<pre>/* IPv6 */ -struct mf6cctl mc; -memset(&mc, 0, sizeof(mc)); -memcpy(&mc.mf6cc_origin, &source_addr, sizeof(mc.mf6cc_origin)); -memcpy(&mc.mf6cc_mcastgrp, &group_addr, sizeof(mc.mf6cc_mcastgrp)); -mc.mf6cc_parent = iif_index; -for (i = 0; i < maxvifs; i++) - if (oifs_ttl[i] > 0) - IF_SET(i, &mc.mf6cc_ifset); -setsockopt(mrouter_s6, IPPROTO_IPV6, MRT6_ADD_MFC, - (void *)&mc, sizeof(mc));</pre> -</div> -<p class="Pp">The <var class="Va">source_addr</var> and - <var class="Va">group_addr</var> are the source and group addresses of the - multicast packet (as set in the upcall message). The - <var class="Va">iif_index</var> is the virtual interface index of the - multicast interface the multicast packets for this specific source and group - address should be received on. The <var class="Va">oifs_ttl[]</var> array - contains the minimum TTL (per interface) a multicast packet should have to - be forwarded on an outgoing interface. 
If the TTL value is zero, the - corresponding interface is not included in the set of outgoing interfaces. - Note that in case of IPv6 only the set of outgoing interfaces can be - specified.</p> -<p class="Pp">An MFC entry is deleted by:</p> -<div class="Bd Pp Li"> -<pre>/* IPv4 */ -struct mfcctl mc; -memset(&mc, 0, sizeof(mc)); -memcpy(&mc.mfcc_origin, &source_addr, sizeof(mc.mfcc_origin)); -memcpy(&mc.mfcc_mcastgrp, &group_addr, sizeof(mc.mfcc_mcastgrp)); -setsockopt(mrouter_s4, IPPROTO_IP, MRT_DEL_MFC, - (void *)&mc, sizeof(mc));</pre> -</div> -<div class="Bd Pp Li"> -<pre>/* IPv6 */ -struct mf6cctl mc; -memset(&mc, 0, sizeof(mc)); -memcpy(&mc.mf6cc_origin, &source_addr, sizeof(mc.mf6cc_origin)); -memcpy(&mc.mf6cc_mcastgrp, &group_addr, sizeof(mc.mf6cc_mcastgrp)); -setsockopt(mrouter_s6, IPPROTO_IPV6, MRT6_DEL_MFC, - (void *)&mc, sizeof(mc));</pre> -</div> -<p class="Pp">The following method can be used to get various statistics per - installed MFC entry in the kernel (e.g., the number of forwarded packets per - source and group address):</p> -<div class="Bd Pp Li"> -<pre>/* IPv4 */ -struct sioc_sg_req sgreq; -memset(&sgreq, 0, sizeof(sgreq)); -memcpy(&sgreq.src, &source_addr, sizeof(sgreq.src)); -memcpy(&sgreq.grp, &group_addr, sizeof(sgreq.grp)); -ioctl(mrouter_s4, SIOCGETSGCNT, &sgreq);</pre> -</div> -<div class="Bd Pp Li"> -<pre>/* IPv6 */ -struct sioc_sg_req6 sgreq; -memset(&sgreq, 0, sizeof(sgreq)); -memcpy(&sgreq.src, &source_addr, sizeof(sgreq.src)); -memcpy(&sgreq.grp, &group_addr, sizeof(sgreq.grp)); -ioctl(mrouter_s6, SIOCGETSGCNT_IN6, &sgreq);</pre> -</div> -<p class="Pp">The following method can be used to get various statistics per - multicast virtual interface in the kernel (e.g., the number of forwarded - packets per interface):</p> -<div class="Bd Pp Li"> -<pre>/* IPv4 */ -struct sioc_vif_req vreq; -memset(&vreq, 0, sizeof(vreq)); -vreq.vifi = vif_index; -ioctl(mrouter_s4, SIOCGETVIFCNT, &vreq);</pre> -</div> -<div class="Bd Pp Li"> -<pre>/* IPv6 
*/ -struct sioc_mif_req6 mreq; -memset(&mreq, 0, sizeof(mreq)); -mreq.mifi = vif_index; -ioctl(mrouter_s6, SIOCGETMIFCNT_IN6, &mreq);</pre> -</div> -</section> -<section class="Ss"> -<h2 class="Ss" id="Advanced_Multicast_API_Programming_Guide"><a class="permalink" href="#Advanced_Multicast_API_Programming_Guide">Advanced - Multicast API Programming Guide</a></h2> -<p class="Pp">If we want to add new features in the kernel, it becomes difficult - to preserve backward compatibility (binary and API), and at the same time to - allow user-level processes to take advantage of the new features (if the - kernel supports them).</p> -<p class="Pp">One of the mechanisms that allows us to preserve the backward - compatibility is a sort of negotiation between the user-level process and - the kernel:</p> -<ol class="Bl-enum"> - <li>The user-level process tries to enable in the kernel the set of new - features (and the corresponding API) it would like to use.</li> - <li>The kernel returns the (sub)set of features it knows about and is willing - to be enabled.</li> - <li>The user-level process uses only that set of features the kernel has - agreed on.</li> -</ol> -<p class="Pp">To support backward compatibility, if the user-level process does - not ask for any new features, the kernel defaults to the basic multicast API - (see the <a class="Sx" href="#Programming_Guide">Programming Guide</a> - section). Currently, the advanced multicast API exists only for IPv4; in the - future there will be IPv6 support as well.</p> -<p class="Pp">Below is a summary of the expandable API solution. 
Note that all - new options and structures are defined in - <code class="In"><<a class="In">netinet/ip_mroute.h</a>></code> and - <code class="In"><<a class="In">netinet6/ip6_mroute.h</a>></code>, - unless stated otherwise.</p> -<p class="Pp" id="getsockopt">The user-level process uses new - <a class="permalink" href="#getsockopt"><code class="Fn">getsockopt</code></a>()/<code class="Fn">setsockopt</code>() - options to perform the API features negotiation with the kernel. This - negotiation must be performed right after the multicast routing socket is - opened. The set of desired/allowed features is stored in a bitset (currently, - in <var class="Vt">uint32_t</var>; i.e., a maximum of 32 new features). The - new - <code class="Fn">getsockopt</code>()/<code class="Fn">setsockopt</code>() - options are <code class="Dv">MRT_API_SUPPORT</code> and - <code class="Dv">MRT_API_CONFIG</code>. Example:</p> -<div class="Bd Pp Li"> -<pre>uint32_t v; -socklen_t len = sizeof(v); /* getsockopt() takes a pointer to the length */ -getsockopt(sock, IPPROTO_IP, MRT_API_SUPPORT, (void *)&v, &len);</pre> -</div> -<p class="Pp" id="getsockopt~2">would set in <var class="Va">v</var> the - pre-defined bits that the kernel API supports. The eight least significant - bits in <var class="Vt">uint32_t</var> are the same as the eight possible flags - <code class="Dv">MRT_MFC_FLAGS_*</code> that can be used in - <var class="Va">mfcc_flags</var> as part of the new definition of - <var class="Vt">struct mfcctl</var> (see below about those flags), which - leaves 24 flags for other new features. 
The value returned by - <a class="permalink" href="#getsockopt~2"><code class="Fn">getsockopt</code></a>(<var class="Fa">MRT_API_SUPPORT</var>) - is read-only; in other words, - <code class="Fn">setsockopt</code>(<var class="Fa">MRT_API_SUPPORT</var>) - would fail.</p> -<p class="Pp">To modify the API and enable a specific feature in the - kernel:</p> -<div class="Bd Pp Li"> -<pre>uint32_t v = MRT_MFC_FLAGS_DISABLE_WRONGVIF; -if (setsockopt(sock, IPPROTO_IP, MRT_API_CONFIG, (void *)&v, sizeof(v)) - != 0) { - return (ERROR); -} -if (v & MRT_MFC_FLAGS_DISABLE_WRONGVIF) - return (OK); /* Success */ -else - return (ERROR);</pre> -</div> -<p class="Pp" id="setsockopt">In other words, when - <a class="permalink" href="#setsockopt"><code class="Fn">setsockopt</code></a>(<var class="Fa">MRT_API_CONFIG</var>) - is called, the argument to it specifies the desired set of features to be - enabled in the API and the kernel. The return value in - <var class="Va">v</var> is the actual (sub)set of features that were enabled - in the kernel. To retrieve the set of enabled features at a later - time:</p> -<div class="Bd Pp Li"> -<pre>socklen_t len = sizeof(v); /* getsockopt() takes a pointer to the length */ -getsockopt(sock, IPPROTO_IP, MRT_API_CONFIG, (void *)&v, &len);</pre> -</div> -<p class="Pp" id="setsockopt~2">The set of enabled features is global. 
In other - words, - <a class="permalink" href="#setsockopt~2"><code class="Fn">setsockopt</code></a>(<var class="Fa">MRT_API_CONFIG</var>) - should be called right after - <code class="Fn">setsockopt</code>(<var class="Fa">MRT_INIT</var>).</p> -<p class="Pp">Currently, the following set of new features is defined:</p> -<div class="Bd Pp Li"> -<pre>#define MRT_MFC_FLAGS_DISABLE_WRONGVIF (1 << 0) /* disable WRONGVIF signals */ -#define MRT_MFC_FLAGS_BORDER_VIF (1 << 1) /* border vif */ -#define MRT_MFC_RP (1 << 8) /* enable RP address */ -#define MRT_MFC_BW_UPCALL (1 << 9) /* enable bw upcalls */</pre> -</div> -<p class="Pp">The advanced multicast API uses a newly defined - <var class="Vt">struct mfcctl2</var> instead of the traditional - <var class="Vt">struct mfcctl</var>. The original <var class="Vt">struct - mfcctl</var> is kept as is. The new <var class="Vt">struct mfcctl2</var> - is:</p> -<div class="Bd Pp Li"> -<pre>/* - * The new argument structure for MRT_ADD_MFC and MRT_DEL_MFC overlays - * and extends the old struct mfcctl. - */ -struct mfcctl2 { - /* the mfcctl fields */ - struct in_addr mfcc_origin; /* ip origin of mcasts */ - struct in_addr mfcc_mcastgrp; /* multicast group associated*/ - vifi_t mfcc_parent; /* incoming vif */ - u_char mfcc_ttls[MAXVIFS];/* forwarding ttls on vifs */ - - /* extension fields */ - uint8_t mfcc_flags[MAXVIFS];/* the MRT_MFC_FLAGS_* flags*/ - struct in_addr mfcc_rp; /* the RP address */ -};</pre> -</div> -<p class="Pp">The new fields are <var class="Va">mfcc_flags[MAXVIFS]</var> and - <var class="Va">mfcc_rp</var>. Note that for compatibility reasons they are - added at the end.</p> -<p class="Pp">The <var class="Va">mfcc_flags[MAXVIFS]</var> field is used to set - various flags per interface per (S,G) entry. 
Currently, the defined flags - are:</p> -<div class="Bd Pp Li"> -<pre>#define MRT_MFC_FLAGS_DISABLE_WRONGVIF (1 << 0) /* disable WRONGVIF signals */ -#define MRT_MFC_FLAGS_BORDER_VIF (1 << 1) /* border vif */</pre> -</div> -<p class="Pp">The <code class="Dv">MRT_MFC_FLAGS_DISABLE_WRONGVIF</code> flag is - used to explicitly disable the <code class="Dv">IGMPMSG_WRONGVIF</code> - kernel signal at the (S,G) granularity if a multicast data packet arrives on - the wrong interface. Usually, this signal is used to complete the - shortest-path switch in case of PIM-SM multicast routing, or to trigger a - PIM assert message. However, it should not be delivered for interfaces that - are not in the outgoing interface set, and that are not expecting to become - an incoming interface. Hence, if the - <code class="Dv">MRT_MFC_FLAGS_DISABLE_WRONGVIF</code> flag is set for some - of the interfaces, then a data packet that arrives on that interface for - that MFC entry will NOT trigger a WRONGVIF signal. If that flag is not set, - then a signal is triggered (the default action).</p> -<p class="Pp">The <code class="Dv">MRT_MFC_FLAGS_BORDER_VIF</code> flag is used - to specify whether the Border-bit in PIM Register messages should be set (in - case when the Register encapsulation is performed inside the kernel). If it - is set for the special PIM Register kernel virtual interface (see - <a class="Xr">pim(4)</a>), the Border-bit in the Register messages sent to - the RP will be set.</p> -<p class="Pp">The remaining six bits are reserved for future usage.</p> -<p class="Pp" id="setsockopt~3">The <var class="Va">mfcc_rp</var> field is used - to specify the RP address (in case of PIM-SM multicast routing) for a - multicast group G if we want to perform kernel-level PIM Register - encapsulation. 
The <var class="Va">mfcc_rp</var> field is used only if the - <code class="Dv">MRT_MFC_RP</code> advanced API flag/capability has been - successfully set by - <a class="permalink" href="#setsockopt~3"><code class="Fn">setsockopt</code></a>(<var class="Fa">MRT_API_CONFIG</var>).</p> -<p class="Pp" id="setsockopt~4">If the <code class="Dv">MRT_MFC_RP</code> flag - was successfully set by - <a class="permalink" href="#setsockopt~4"><code class="Fn">setsockopt</code></a>(<var class="Fa">MRT_API_CONFIG</var>), - then the kernel will attempt to perform the PIM Register encapsulation - itself instead of sending the multicast data packets to user level (inside - <code class="Dv">IGMPMSG_WHOLEPKT</code> upcalls) for user-level - encapsulation. The RP address would be taken from the - <var class="Va">mfcc_rp</var> field inside the new <var class="Vt">struct - mfcctl2</var>. However, even if the <code class="Dv">MRT_MFC_RP</code> flag - was successfully set, if the <var class="Va">mfcc_rp</var> field was set to - <code class="Dv">INADDR_ANY</code>, then the kernel will still deliver an - <code class="Dv">IGMPMSG_WHOLEPKT</code> upcall with the multicast data - packet to the user-level process.</p> -<p class="Pp">In addition, if the multicast data packet is too large to fit - within a single IP packet after the PIM Register encapsulation (e.g., if its - size was on the order of 65500 bytes), the data packet will be fragmented, - and then each of the fragments will be encapsulated separately. Note that - typically a multicast data packet can be that large only if it was - originated locally from the same hosts that performs the encapsulation; - otherwise the transmission of the multicast data packet over Ethernet for - example would have fragmented it into much smaller pieces.</p> -<p class="Pp">Typically, a multicast routing user-level process would need to - know the forwarding bandwidth for some data flow. 
For example, the multicast - routing process may want to timeout idle MFC entries, or in case of PIM-SM - it can initiate (S,G) shortest-path switch if the bandwidth rate is above a - threshold for example.</p> -<p class="Pp">The original solution for measuring the bandwidth of a dataflow - was that a user-level process would periodically query the kernel about the - number of forwarded packets/bytes per (S,G), and then based on those numbers - it would estimate whether a source has been idle, or whether the source's - transmission bandwidth is above a threshold. That solution is far from being - scalable, hence the need for a new mechanism for bandwidth monitoring.</p> -<p class="Pp">Below is a description of the bandwidth monitoring mechanism.</p> -<ul class="Bl-bullet"> - <li>If the bandwidth of a data flow satisfies some pre-defined filter, the - kernel delivers an upcall on the multicast routing socket to the multicast - routing process that has installed that filter.</li> - <li>The bandwidth-upcall filters are installed per (S,G). There can be more - than one filter per (S,G).</li> - <li>Instead of supporting all possible comparison operations (i.e., < <= - == != > >= ), there is support only for the <= and >= - operations, because this makes the kernel-level implementation simpler, - and because practically we need only those two. Further, the missing - operations can be simulated by secondary user-level filtering of those - <= and >= filters. 
For example, to simulate !=, then we need to - install filter “bw <= 0xffffffff”, and after an upcall is - received, we need to check whether “measured_bw != - expected_bw”.</li> - <li id="setsockopt~5">The bandwidth-upcall mechanism is enabled by - <a class="permalink" href="#setsockopt~5"><code class="Fn">setsockopt</code></a>(<var class="Fa">MRT_API_CONFIG</var>) - for the <code class="Dv">MRT_MFC_BW_UPCALL</code> flag.</li> - <li>The bandwidth-upcall filters are added/deleted by the new - <code class="Fn">setsockopt</code>(<var class="Fa">MRT_ADD_BW_UPCALL</var>) - and - <code class="Fn">setsockopt</code>(<var class="Fa">MRT_DEL_BW_UPCALL</var>) - respectively (with the appropriate <var class="Vt">struct bw_upcall</var> - argument of course).</li> -</ul> -<p class="Pp">From application point of view, a developer needs to know about - the following:</p> -<div class="Bd Pp Li"> -<pre>/* - * Structure for installing or delivering an upcall if the - * measured bandwidth is above or below a threshold. - * - * User programs (e.g. daemons) may have a need to know when the - * bandwidth used by some data flow is above or below some threshold. - * This interface allows the userland to specify the threshold (in - * bytes and/or packets) and the measurement interval. Flows are - * all packet with the same source and destination IP address. - * At the moment the code is only used for multicast destinations - * but there is nothing that prevents its use for unicast. - * - * The measurement interval cannot be shorter than some Tmin (currently, 3s). - * The threshold is set in packets and/or bytes per_interval. - * - * Measurement works as follows: - * - * For >= measurements: - * The first packet marks the start of a measurement interval. - * During an interval we count packets and bytes, and when we - * pass the threshold we deliver an upcall and we are done. - * The first packet after the end of the interval resets the - * count and restarts the measurement. 
- * - * For <= measurement: - * We start a timer to fire at the end of the interval, and - * then for each incoming packet we count packets and bytes. - * When the timer fires, we compare the value with the threshold, - * schedule an upcall if we are below, and restart the measurement - * (reschedule timer and zero counters). - */ - -struct bw_data { - struct timeval b_time; - uint64_t b_packets; - uint64_t b_bytes; -}; - -struct bw_upcall { - struct in_addr bu_src; /* source address */ - struct in_addr bu_dst; /* destination address */ - uint32_t bu_flags; /* misc flags (see below) */ -#define BW_UPCALL_UNIT_PACKETS (1 << 0) /* threshold (in packets) */ -#define BW_UPCALL_UNIT_BYTES (1 << 1) /* threshold (in bytes) */ -#define BW_UPCALL_GEQ (1 << 2) /* upcall if bw >= threshold */ -#define BW_UPCALL_LEQ (1 << 3) /* upcall if bw <= threshold */ -#define BW_UPCALL_DELETE_ALL (1 << 4) /* delete all upcalls for s,d*/ - struct bw_data bu_threshold; /* the bw threshold */ - struct bw_data bu_measured; /* the measured bw */ -}; - -/* max. number of upcalls to deliver together */ -#define BW_UPCALLS_MAX 128 -/* min. threshold time interval for bandwidth measurement */ -#define BW_UPCALL_THRESHOLD_INTERVAL_MIN_SEC 3 -#define BW_UPCALL_THRESHOLD_INTERVAL_MIN_USEC 0</pre> -</div> -<p class="Pp" id="setsockopt~6">The <var class="Vt">bw_upcall</var> structure is - used as an argument to - <a class="permalink" href="#setsockopt~6"><code class="Fn">setsockopt</code></a>(<var class="Fa">MRT_ADD_BW_UPCALL</var>) - and - <code class="Fn">setsockopt</code>(<var class="Fa">MRT_DEL_BW_UPCALL</var>). 
- Each - <code class="Fn">setsockopt</code>(<var class="Fa">MRT_ADD_BW_UPCALL</var>) - installs a filter in the kernel for the source and destination address in - the <var class="Vt">bw_upcall</var> argument, and that filter will trigger - an upcall according to the following pseudo-algorithm:</p> -<div class="Bd Pp Li"> -<pre> if (bw_upcall_oper IS ">=") { - if (((bw_upcall_unit & PACKETS == PACKETS) && - (measured_packets >= threshold_packets)) || - ((bw_upcall_unit & BYTES == BYTES) && - (measured_bytes >= threshold_bytes))) - SEND_UPCALL("measured bandwidth is >= threshold"); - } - if (bw_upcall_oper IS "<=" && measured_interval >= threshold_interval) { - if (((bw_upcall_unit & PACKETS == PACKETS) && - (measured_packets <= threshold_packets)) || - ((bw_upcall_unit & BYTES == BYTES) && - (measured_bytes <= threshold_bytes))) - SEND_UPCALL("measured bandwidth is <= threshold"); - }</pre> -</div> -<p class="Pp">In the same <var class="Vt">bw_upcall</var> the unit can be - specified in both BYTES and PACKETS. However, the GEQ and LEQ flags are - mutually exclusive.</p> -<p class="Pp">Basically, an upcall is delivered if the measured bandwidth is - >= or <= the threshold bandwidth (within the specified measurement - interval). For practical reasons, the smallest value for the measurement - interval is 3 seconds. If smaller values are allowed, then the bandwidth - estimation may be less accurate, or the potentially very high frequency of - the generated upcalls may introduce too much overhead. For the >= - operation, the answer may be known before the end of - <var class="Va">threshold_interval</var>, therefore the upcall may be - delivered earlier. 
For the <= operation however, we must wait until the - threshold interval has expired to know the answer.</p> -<p class="Pp">Example of usage:</p> -<div class="Bd Pp Li"> -<pre>struct bw_upcall bw_upcall; -/* Assign all bw_upcall fields as appropriate */ -memset(&bw_upcall, 0, sizeof(bw_upcall)); -memcpy(&bw_upcall.bu_src, &source, sizeof(bw_upcall.bu_src)); -memcpy(&bw_upcall.bu_dst, &group, sizeof(bw_upcall.bu_dst)); -bw_upcall.bu_threshold.b_time = threshold_interval; -bw_upcall.bu_threshold.b_packets = threshold_packets; -bw_upcall.bu_threshold.b_bytes = threshold_bytes; -if (is_threshold_in_packets) - bw_upcall.bu_flags |= BW_UPCALL_UNIT_PACKETS; -if (is_threshold_in_bytes) - bw_upcall.bu_flags |= BW_UPCALL_UNIT_BYTES; -do { - if (is_geq_upcall) { - bw_upcall.bu_flags |= BW_UPCALL_GEQ; - break; - } - if (is_leq_upcall) { - bw_upcall.bu_flags |= BW_UPCALL_LEQ; - break; - } - return (ERROR); -} while (0); -setsockopt(mrouter_s4, IPPROTO_IP, MRT_ADD_BW_UPCALL, - (void *)&bw_upcall, sizeof(bw_upcall));</pre> -</div> -<p class="Pp">To delete a single filter, use - <code class="Dv">MRT_DEL_BW_UPCALL</code> with the fields of bw_upcall - set exactly as they were when <code class="Dv">MRT_ADD_BW_UPCALL</code> was - called.</p> -<p class="Pp">To delete all bandwidth filters for a given (S,G), set only the - <var class="Va">bu_src</var> and <var class="Va">bu_dst</var> fields in - <var class="Vt">struct bw_upcall</var>, and set - only the <code class="Dv">BW_UPCALL_DELETE_ALL</code> flag in field - <var class="Va">bw_upcall.bu_flags</var>.</p> -<p class="Pp">The bandwidth upcalls are delivered aggregated in the new - upcall message:</p> -<div class="Bd Pp Li"> -<pre>#define IGMPMSG_BW_UPCALL 4 /* BW monitoring upcall */</pre> -</div> -<p class="Pp" id="ioctl">This message is an array of <var class="Vt">struct - bw_upcall</var> elements (up to <code class="Dv">BW_UPCALLS_MAX</code> = - 128). 
The upcalls are delivered when there are 128 pending upcalls, or when - 1 second has elapsed since the previous upcall (whichever comes first). In - a <var class="Vt">struct bw_upcall</var> element, the - <var class="Va">bu_measured</var> field is filled in to indicate the - particular measured values. However, because of the way the particular - intervals are measured, the user should be careful how - <var class="Va">bu_measured.b_time</var> is used. For example, if the filter - is installed to trigger an upcall if the number of packets is >= 1, then - the measured interval in <var class="Va">bu_measured.b_time</var> may be zero in the upcalls - after the first one, because the measured interval for >= filters is - “clocked” by the forwarded packets. Hence, this upcall - mechanism should not be used for measuring the exact value of the bandwidth - of the forwarded data. To measure the exact bandwidth, the user would need - to get the forwarded packets statistics with the - <a class="permalink" href="#ioctl"><code class="Fn">ioctl</code></a>(<var class="Fa">SIOCGETSGCNT</var>) - mechanism (see the <a class="Sx" href="#Programming_Guide">Programming - Guide</a> section).</p> -<p class="Pp">Note that the upcalls for a filter are delivered until the - specific filter is deleted, but no more frequently than once per - <var class="Va">bu_threshold.b_time</var>. 
For example, if the filter is - specified to deliver a signal if bw >= 1 packet, the first packet will - trigger a signal, but the next upcall will be triggered no earlier than - <var class="Va">bu_threshold.b_time</var> after the previous upcall.</p> -</section> -</section> -<section class="Sh"> -<h1 class="Sh" id="SEE_ALSO"><a class="permalink" href="#SEE_ALSO">SEE - ALSO</a></h1> -<p class="Pp"><a class="Xr">getsockopt(2)</a>, <a class="Xr">recvfrom(2)</a>, - <a class="Xr">recvmsg(2)</a>, <a class="Xr">setsockopt(2)</a>, - <a class="Xr">socket(2)</a>, <a class="Xr">icmp6(4)</a>, - <a class="Xr">inet(4)</a>, <a class="Xr">inet6(4)</a>, - <a class="Xr">intro(4)</a>, <a class="Xr">ip(4)</a>, - <a class="Xr">ip6(4)</a>, <a class="Xr">pim(4)</a></p> -</section> -<section class="Sh"> -<h1 class="Sh" id="AUTHORS"><a class="permalink" href="#AUTHORS">AUTHORS</a></h1> -<p class="Pp">The original multicast code was written by <span class="An">David - Waitzman</span> (BBN Labs), and later modified by the following individuals: - <span class="An">Steve Deering</span> (Stanford), <span class="An">Mark J. - Steiglitz</span> (Stanford), <span class="An">Van Jacobson</span> (LBL), - <span class="An">Ajit Thyagarajan</span> (PARC), <span class="An">Bill - Fenner</span> (PARC). The IPv6 multicast support was implemented by the KAME - project - (<a class="Lk" href="https://www.kame.net">https://www.kame.net</a>), and - was based on the IPv4 multicast code. The advanced multicast API and the - multicast bandwidth monitoring were implemented by <span class="An">Pavlin - Radoslavov</span> (ICSI) in collaboration with <span class="An">Chris - Brown</span> (NextHop).</p> -<p class="Pp">This manual page was written by <span class="An">Pavlin - Radoslavov</span> (ICSI).</p> -</section> -</div> -<table class="foot"> - <tr> - <td class="foot-date">September 4, 2003</td> - <td class="foot-os">NetBSD 10.1</td> - </tr> -</table> |
