IP Multicast Extensions for 4.3BSD UNIX and related systems
(MULTICAST 1.2 Release)

June 24, 1989

Steve Deering
Stanford University

This note describes the installation and use of some extensions to Berkeley 4.3BSD UNIX and related systems (SunOS, Ultrix) to support IP multicasting, as specified in RFC-1054. Included in the release is an experimental multicast routing demon which implements the Distance-Vector Multicast Routing Protocol (DVMRP), an earlier version of which is specified in RFC-1075. See Section 15 for a list of changes since Release 1.0.

CONTENTS

     1. SUPPORTED CONFIGURATIONS
     2. HOW TO OBTAIN THIS RELEASE AND WHAT YOU GET
     3. HOW TO USE THE MULTICAST EXTENSIONS
            Sending IP Multicast Datagrams
            Receiving IP Multicast Datagrams
     4. KERNEL MODIFICATIONS
            4.3 BSD
            SunOS 4.0
            Ultrix 3.0
     5. ESTABLISHING A DEFAULT MULTICAST INTERFACE
     6. WARNINGS
     7. MODIFYING OTHER NETWORK DRIVERS
     8. MROUTED
            Invocation
            Configuration
            Signals
            Changes from RFC-1075 DVMRP
     9. MTEST
    10. NETSTAT
    11. PING
    12. RWHOD
    13. ADMINISTRATION
    14. ACKNOWLEDGMENTS
    15. CHANGE HISTORY

1. SUPPORTED CONFIGURATIONS

This release includes support for the following hardware/OS configurations:

    Machines           Operating Systems    Network Interfaces
    --------           -----------------    ------------------
    Vax or Microvax    4.3+ or 4.3-tahoe    de, qe, sl*, lo
    Decstation 3100    Ultrix 3.0           se, lo
    Sun 2*, 3 or 4*    SunOS 4.0            ie, le, ec*, lo

"4.3+" is Berkeley's 4.3BSD release plus the networking software released to the public on 4/4/88, available via anonymous FTP from ucbarpa.Berkeley.EDU.

    "de" is DEC's DEUNA Ethernet interface.
    "qe" is DEC's DEQNA Ethernet interface.
    "se" is DEC's AMD LANCE Ethernet interface.
    "ie" is Sun's Intel 82586 Ethernet interface.
    "le" is Sun's AMD LANCE Ethernet interface.
    "ec" is Sun's 3Com Ethernet interface.
    "sl" is Rick Adams's Serial Line IP (SLIP) driver for 4.3BSD. (Allows the use of IP multicast addressing on a point-to-point link.)
    "lo" is the BSD loopback driver.
    (Allows the use of IP multicast for intra-machine IPC, whether or not there are multicast-capable interfaces present.)

THE SOFTWARE HAS NOT BEEN TESTED ON MACHINES OR INTERFACES MARKED WITH "*". The Sun 4, in particular, may present some porting difficulties, due to its stricter requirements on alignment, procedure arguments, etc.

The 4.3BSD changes require access to the kernel source files. The SunOS and Ultrix changes require access to only those source files included in a binary distribution.

I encourage ports to other machines, operating systems, and interfaces. I would appreciate receiving copies of any changes required to support other configurations, for inclusion in future releases. I would also appreciate reports of bugs and suggestions for improvement. Discussion of changes, bugs, etc. takes place on the VMTP-IP mailing list, which you may join by sending a message to vmtp-ip-request@gregorio.Stanford.EDU.

2. HOW TO OBTAIN THIS RELEASE AND WHAT YOU GET

This release may be obtained by anonymous FTP from the "vmtp-ip" directory on host gregorio.Stanford.EDU, in the compressed tar file "ipmulticast.tar.Z". A non-compressed version is available in "ipmulticast.tar", and a copy of this document by itself is in "ipmulticast.README".

This release includes the following components:

    README            this document.
    RFC-1054          the Draft Standard for IP multicast host requirements.
    lispm-fix         some software patches for Symbolics Lisp Machines, to suppress incorrect ICMP error responses to IP multicasts.
    mrouted/          an experimental IP multicast routing demon.
    mtest/            a test program for manipulating IP and link-level multicast address filters.
    netstat.bsd/      a modified version of the 4.3+ netstat program which displays multicast-related kernel state.
    netstat.sun/      the same modified netstat program, compilable under SunOS.
    netstat.ultrix/   the same modified netstat program, compilable under Ultrix.
    ping/             a modified version of the BRL ping program with options for multicast pinging.
    rwhod/            a modified version of the BSD rwho demon which can use IP multicast, rather than broadcast, for its periodic messages.
    sys.bsd/          modifications for the 4.3BSD kernel.
    sys.sun/          modifications for the SunOS 4.0 kernel.
    sys.ultrix/       modifications for the Ultrix 3.0 kernel.

3. HOW TO USE THE MULTICAST EXTENSIONS

Sending IP Multicast Datagrams

IP multicasting is currently supported only on AF_INET sockets of type SOCK_DGRAM and SOCK_RAW, and only on subnetworks for which the interface driver has been modified to support multicasting.

To send a multicast datagram, specify an IP multicast address in the range 224.0.0.0 to 239.255.255.255 as the destination address in a sendto() call.

By default, IP multicast datagrams are sent with a time-to-live (TTL) of 1, which prevents them from being forwarded beyond a single subnetwork. A new socket option allows the TTL for subsequent multicast datagrams to be set to any value from 0 to 255, in order to control the scope of the multicasts:

    u_char ttl;
    setsockopt(sock, IPPROTO_IP, IP_MULTICAST_TTL, &ttl, sizeof(ttl))

Multicast datagrams with a TTL of 0 will not be transmitted on any subnet, but may be delivered locally if the sending host belongs to the destination group and if multicast loopback has not been disabled on the sending socket (see below). Multicast datagrams with TTL greater than one may be delivered to more than one subnet if there are one or more multicast routers attached to the first-hop subnet. To provide meaningful scope control, the multicast routers support the notion of TTL "thresholds", which prevent datagrams with less than a certain TTL from traversing certain subnets.
The thresholds enforce the following convention:

    multicast datagrams with initial TTL   0 are restricted to the same host
    multicast datagrams with initial TTL   1 are restricted to the same subnet
    multicast datagrams with initial TTL  32 are restricted to the same site
    multicast datagrams with initial TTL  64 are restricted to the same region
    multicast datagrams with initial TTL 128 are restricted to the same continent
    multicast datagrams with initial TTL 255 are unrestricted in scope.

"Sites" and "regions" are not strictly defined, and sites may be further subdivided into smaller administrative units, as a local matter.

An application may choose an initial TTL other than the ones listed above. For example, an application might perform an "expanding-ring search" for a network resource by sending a multicast query, first with a TTL of 0, and then with larger and larger TTLs, until a reply is received, perhaps using the TTL sequence 0, 1, 2, 4, 8, 16, 32.

The multicast router accompanying this release refuses to forward any multicast datagram with a destination address between 224.0.0.0 and 224.0.0.255, inclusive, regardless of its TTL. This range of addresses is reserved for the use of routing protocols and other low-level topology discovery or maintenance protocols, such as gateway discovery and group membership reporting. The current specification for IP multicasting requires this behavior only for addresses 224.0.0.0 and 224.0.0.1; the next revision of the specification is expected to contain this more general restriction.

Each multicast transmission is sent from a single network interface, even if the host has more than one multicast-capable interface. (If the host is also serving as a multicast router, a multicast may be FORWARDED to interfaces other than the originating interface, provided that the TTL is greater than 1.) The system manager establishes the default interface to be used for multicasting as part of the installation procedure, described below.
A socket option is available to override the default for subsequent transmissions from a given socket:

    struct in_addr addr;
    setsockopt(sock, IPPROTO_IP, IP_MULTICAST_IF, &addr, sizeof(addr))

where "addr" is the local IP address of the desired outgoing interface. An address of INADDR_ANY may be used to revert to the default interface. The local IP address of an interface can be obtained via the SIOCGIFCONF ioctl. To determine if an interface supports multicasting, fetch the interface flags via the SIOCGIFFLAGS ioctl and see if the IFF_MULTICAST flag is set. (Normal applications should not need to use this option; it is intended primarily for multicast routers and other system services specifically concerned with internet topology.)

If a multicast datagram is sent to a group to which the sending host itself belongs (on the outgoing interface), a copy of the datagram is, by default, looped back by the IP layer for local delivery. Another socket option gives the sender explicit control over whether or not subsequent datagrams are looped back:

    u_char loop;
    setsockopt(sock, IPPROTO_IP, IP_MULTICAST_LOOP, &loop, sizeof(loop))

where "loop" is 0 to disable loopback, and 1 to enable loopback. This option provides a performance benefit for applications that may have no more than one instance on a single host (such as a router or a mail demon), by eliminating the overhead of receiving their own transmissions. It should generally not be used by applications for which there may be more than one instance on a single host (such as a conferencing program) or for which the sender does not belong to the destination group (such as a time querying program).

A multicast datagram sent with an initial TTL greater than 1 may be delivered to the sending host on a different interface from that on which it was sent, if the host belongs to the destination group on that other interface. The loopback control option has no effect on such delivery.
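The sending options described in this section can be combined roughly as follows. This is only a sketch, not code from the release: the function names (is_multicast, open_mcast_sender, send_to_group) are hypothetical, and it assumes a present-day POSIX socket library (inet_pton(), the IN_MULTICAST() macro and the headers shown), which may differ from the 1989 headers. "unsigned char" stands in for u_char.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Return 1 if the dotted-quad string lies in 224.0.0.0 - 239.255.255.255. */
int is_multicast(const char *dotted)
{
    struct in_addr a;

    if (inet_pton(AF_INET, dotted, &a) != 1)
        return 0;
    return IN_MULTICAST(ntohl(a.s_addr)) ? 1 : 0;
}

/*
 * Open a UDP socket for multicast sending (hypothetical helper).
 * ttl:    scope of subsequent multicasts, 0..255
 * loop:   0 disables local loopback, 1 enables it
 * ifaddr: local address of the outgoing interface, or NULL to keep
 *         the system default
 * Returns the socket descriptor, or -1 on error.
 */
int open_mcast_sender(unsigned char ttl, unsigned char loop, const char *ifaddr)
{
    struct in_addr addr;
    int sock = socket(AF_INET, SOCK_DGRAM, 0);

    if (sock < 0)
        return -1;
    if (setsockopt(sock, IPPROTO_IP, IP_MULTICAST_TTL, &ttl, sizeof(ttl)) < 0 ||
        setsockopt(sock, IPPROTO_IP, IP_MULTICAST_LOOP, &loop, sizeof(loop)) < 0)
        goto fail;
    if (ifaddr != NULL) {
        if (inet_pton(AF_INET, ifaddr, &addr) != 1 ||
            setsockopt(sock, IPPROTO_IP, IP_MULTICAST_IF, &addr, sizeof(addr)) < 0)
            goto fail;
    }
    return sock;
fail:
    close(sock);
    return -1;
}

/* Send one datagram to group:port; returns bytes sent, or -1 on error. */
long send_to_group(int sock, const char *group, unsigned short port,
                   const void *buf, size_t len)
{
    struct sockaddr_in dst;

    assert(buf != NULL || len == 0);
    if (!is_multicast(group))
        return -1;               /* refuse non-multicast destinations */
    memset(&dst, 0, sizeof(dst));
    dst.sin_family = AF_INET;
    dst.sin_port = htons(port);  /* addresses and ports go in network order */
    inet_pton(AF_INET, group, &dst.sin_addr);
    return (long)sendto(sock, buf, len, 0, (struct sockaddr *)&dst, sizeof(dst));
}
```

For instance, open_mcast_sender(32, 1, NULL) followed by send_to_group(sock, "224.0.1.3", 5000, "hi", 2) would attempt a site-scoped transmission on the default interface; the port number 5000 is an arbitrary choice for illustration.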
Receiving IP Multicast Datagrams

Before a host can receive IP multicast datagrams, it must become a member of one or more IP multicast groups. A process can ask the host to join a multicast group by using the following socket option:

    struct ip_mreq mreq;
    setsockopt(sock, IPPROTO_IP, IP_ADD_MEMBERSHIP, &mreq, sizeof(mreq))

where "mreq" is the following structure:

    struct ip_mreq {
        struct in_addr imr_multiaddr; /* multicast group to join */
        struct in_addr imr_interface; /* interface to join on    */
    };

Every membership is associated with a single interface, and it is possible to join the same group on more than one interface. "imr_interface" should be INADDR_ANY to choose the default multicast interface, or one of the host's local addresses to choose a particular (multicast-capable) interface. Up to IP_MAX_MEMBERSHIPS (currently 20) memberships may be added on a single socket.

To drop a membership, use:

    struct ip_mreq mreq;
    setsockopt(sock, IPPROTO_IP, IP_DROP_MEMBERSHIP, &mreq, sizeof(mreq))

where "mreq" contains the same values as used to add the membership. The memberships associated with a socket are also dropped when the socket is closed or the process holding the socket is killed. However, more than one socket may claim a membership in a particular group, and the host will remain a member of that group until the last claim is dropped.

The memberships associated with a socket do not necessarily determine which datagrams are received on that socket. Incoming multicast packets are accepted by the kernel IP layer if any socket has claimed a membership in the destination group of the datagram; however, delivery of a multicast datagram to a particular socket is based on the destination port (or protocol type, for raw sockets), just as with unicast datagrams. To receive multicast datagrams sent to a particular port, it is necessary to bind to that local port, leaving the local address unspecified (i.e., INADDR_ANY).
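The receiving steps above (bind to the port with the local address unspecified, then claim a membership) might be put together as in the following sketch. It is not taken from the release: the helper names (fill_mreq, open_mcast_receiver, close_mcast_receiver) are hypothetical, and it assumes a present-day POSIX socket library. The _DEFAULT_SOURCE definition is an assumption about glibc, where struct ip_mreq may otherwise be hidden in strict ISO C mode.

```c
#define _DEFAULT_SOURCE 1   /* assumption: glibc needs this for struct ip_mreq */
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Fill in a membership request (hypothetical helper).
 * group:  dotted-quad multicast group to join
 * ifaddr: local address of the interface to join on, or NULL for the
 *         default multicast interface (INADDR_ANY)
 * Returns 0 on success, -1 if either address is unusable.
 */
int fill_mreq(struct ip_mreq *mreq, const char *group, const char *ifaddr)
{
    assert(mreq != NULL);
    memset(mreq, 0, sizeof(*mreq));
    if (inet_pton(AF_INET, group, &mreq->imr_multiaddr) != 1)
        return -1;
    if (!IN_MULTICAST(ntohl(mreq->imr_multiaddr.s_addr)))
        return -1;
    if (ifaddr == NULL) {
        mreq->imr_interface.s_addr = htonl(INADDR_ANY);
        return 0;
    }
    return inet_pton(AF_INET, ifaddr, &mreq->imr_interface) == 1 ? 0 : -1;
}

/*
 * Open a UDP socket bound to INADDR_ANY:port and join "group" on the
 * default interface. The request is left in *mreq so that the same
 * values can be used to drop the membership later.
 * Returns the socket descriptor, or -1 on error.
 */
int open_mcast_receiver(const char *group, unsigned short port,
                        struct ip_mreq *mreq)
{
    struct sockaddr_in local;
    int sock;

    if (fill_mreq(mreq, group, NULL) < 0)
        return -1;
    sock = socket(AF_INET, SOCK_DGRAM, 0);
    if (sock < 0)
        return -1;
    memset(&local, 0, sizeof(local));
    local.sin_family = AF_INET;
    local.sin_port = htons(port);
    local.sin_addr.s_addr = htonl(INADDR_ANY);  /* local address unspecified */
    if (bind(sock, (struct sockaddr *)&local, sizeof(local)) < 0 ||
        setsockopt(sock, IPPROTO_IP, IP_ADD_MEMBERSHIP, mreq, sizeof(*mreq)) < 0) {
        close(sock);
        return -1;
    }
    return sock;
}

/* Drop the membership explicitly, then close the socket. */
int close_mcast_receiver(int sock, const struct ip_mreq *mreq)
{
    (void)setsockopt(sock, IPPROTO_IP, IP_DROP_MEMBERSHIP, mreq, sizeof(*mreq));
    return close(sock);
}
```

After open_mcast_receiver() succeeds, datagrams addressed to the group and port can be read with recvfrom() in the usual way. The explicit drop in close_mcast_receiver() is optional, since closing the socket also drops its memberships.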
More than one process may bind to the same SOCK_DGRAM UDP port if the bind() is preceded by:

    int one = 1;
    setsockopt(sock, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one))

In this case, every incoming multicast or broadcast UDP datagram destined to the shared port is delivered to all sockets bound to the port. For backwards compatibility reasons, THIS DOES NOT APPLY TO INCOMING UNICAST DATAGRAMS -- unicast datagrams are never delivered to more than one socket, regardless of how many sockets are bound to the datagram's destination port. SOCK_RAW sockets do not require the SO_REUSEADDR option to share a single IP protocol type.

The definitions required for the new, multicast-related socket options are found in <netinet/in.h>. All IP addresses are passed in network byte-order.

A final multicast-related extension is independent of IP: two new ioctls, SIOCADDMULTI and SIOCDELMULTI, are available to add or delete link-level (e.g., Ethernet) multicast addresses accepted by a particular interface. The address to be added or deleted is passed as a sockaddr structure of family AF_UNSPEC, within the standard ifreq structure. These ioctls are for the use of protocols other than IP, and require superuser privileges. A link-level multicast address added via SIOCADDMULTI is not automatically deleted when the socket used to add it goes away; it must be explicitly deleted. It is inadvisable to delete a link-level address that may be in use by IP. (These ioctls already exist in SunOS and Ultrix; they are new to BSD Unix.)

Drivers that have been modified to support multicasting also support the IFF_PROMISC and IFF_ALLMULTI interface flags, to the degree possible.

The kernel modification required to support Van Jacobson's traceroute program is also included in this release.

Examples of usage of the above facilities can be found in the programs accompanying this distribution, such as "ping", "mtest" and "rwhod".

4.
KERNEL MODIFICATIONS 4.3 BSD I assume that you have sources for 4.3 BSD with the 4/4/88 or tahoe networking code, and that you know how to build and install a new kernel. The "sys.bsd" directory contains the following files: netinet/if_ether.c net/if.c netinet/if_ether.h net/if.h netinet/igmp.c net/if_loop.c netinet/igmp.h net/if_sl.c netinet/igmp_var.h net/raw_cb.c netinet/in.c net/raw_cb.h netinet/in.h netinet/in_pcb.c h/ioctl.h.diff.4.3 netinet/in_pcb.h h/ioctl.h.diff.tahoe netinet/in_proto.c h/mbuf.h.diff.4.3 netinet/in_var.h h/mbuf.h.diff.tahoe netinet/ip_icmp.c netinet/ip_input.c vaxif/if_de.c.diff.4.3 netinet/ip_mroute.c vaxif/if_de.c.diff.tahoe netinet/ip_mroute.h vaxif/if_qe.c.diff.4.3 netinet/ip_output.c vaxif/if_qe.c.diff.tahoe netinet/ip_var.h vaxif/if_qereg.h.diff netinet/raw_ip.c netinet/udp_usrreq.c conf/files.diff The files under netinet and net should replace (or be merged with) the files of the same names in your kernel source directories. Note that netinet has three new files, igmp.c, igmp.h, and igmp_var.h, that implement the Internet Group Management Protocol (IGMP), and two more files, ip_mroute.c and ip_mroute.h, that implement the kernel part of DVMRP, a multicast routing protocol. The other three directories, h, vaxif and conf, contain modifications for other kernel files which are not part of the freely-distributable BSD networking code (and hence may not be distributed whole). Some of the modifications are different for the original 4.3 release and for the tahoe release. The diffs are suitable for use by the "patch" program. If you have Ethernet interfaces other than DEUNAs or DEQNAs, the if_de* and if_qe* files may be used as examples of the kind of modifications that must be made to support IP multicasting. They will be less helpful for modifying interface drivers for other kinds of networks. 
The changes made to support multicasting are bracketed with #ifdef MULTICAST/ #endif MULTICAST in all of the kernel ".c" files, and the multicast routing code is similarly bracketed with #ifdef MROUTING/#endif MROUTING. To build a system that includes the multicast services, the lines "options MULTICAST" and "options MROUTING" must be added after "options INET" in the configuration file for the target kernel, in the conf directory. The MROUTING option is needed only for those machines that are to serve as IP multicast routers. All ".h" files should also be installed (or symbolically-linked) under the appropriate subdirectories of /usr/include, so that they may be included by programs outside the kernel. SunOS 4.0 I assume that you have either a binary SunOS 4.0 distribution or sources for either SunOS 4.0 or SunOS 4.1beta, and that you know how to build and install a new kernel. The "sys.sun" directory contains the following files: netinet/if_ether.c net/if.c netinet/if_ether.h net/if.h netinet/if_loop.c net/raw_cb.c netinet/igmp.c net/raw_cb.h netinet/igmp.h net/route.h netinet/igmp_var.h netinet/in.c sys/mbuf.h.diff netinet/in.h netinet/in_pcb.c sunif/if_ec.c.diff netinet/in_pcb.h sunif/if_ecreg.h.diff netinet/in_proto.c sunif/if_ie.c.diff netinet/in_var.h sunif/if_iereg.h.diff netinet/ip.h sunif/if_le.c.diff netinet/ip_icmp.c sunif/if_subr.c.diff netinet/ip_icmp.h netinet/ip_input.c conf.common/files.cmn.diff netinet/ip_mroute.c netinet/ip_mroute.h sun2/OBJ/if_ec.o netinet/ip_output.c sun2/OBJ/if_ie.o netinet/ip_var.h sun2/OBJ/if_subr.o netinet/raw_ip.c netinet/tcp_input.c sun3/OBJ/if_ie.o netinet/tcp_output.c sun3/OBJ/if_le.o netinet/tcp_subr.c sun3/OBJ/if_subr.o netinet/tcp_timer.c netinet/tcp_timer.h netinet/tcp_usrreq.c netinet/udp_usrreq.c The files under netinet and net should replace (or be merged with) the files of the same names in your kernel source directories. If you have a binary-only distribution, just add them to the appropriate directories. 
Note that netinet has three new files, igmp.c, igmp.h, and igmp_var.h, that implement the Internet Group Management Protocol (IGMP), and two more files, ip_mroute.c and ip_mroute.h, that implement the kernel part of DVMRP, a multicast routing protocol. (The TCP files do not contain any multicast- related changes, but are included for compatibility with some of the multicast-modified .h files.) The sys, sunif and conf.common directories contain modifications for other kernel files which are not part of the freely-distributable BSD networking code (and hence may not be distributed whole). The diffs are suitable for use by the "patch" program. If you have a binary-only distribution, you will not have the .c files under the sunif directory. In that case, you should copy the already-compiled binary (.o) files into the appropriate OBJ directories; this distribution includes binaries for the following configurations only: Sun 2 with ec or ie interfaces. Sun 3 with le or ie interfaces. (If someone sends me .o files for other configurations, I will add them to the release.) The changes made to support multicasting are bracketed with #ifdef MULTICAST/ #endif MULTICAST in all of the kernel ".c" files, and the multicast routing code is similarly bracketed with #ifdef MROUTING/#endif MROUTING. To build a system that includes the multicast services, the lines "options MULTICAST" and "options MROUTING" must be added after "options INET" in the configuration file for the target kernel, in the sunX/conf directory. The MROUTING option is needed only for those machines that are to serve as IP multicast routers. All ".h" files should also be installed (or symbolically-linked) under the appropriate subdirectories of /usr/include, so that they may be included by programs outside the kernel. Ultrix 3.0 I assume that you have either a binary or source distribution of Ultrix 3.0, and that you know how to build and install a new kernel. 
The "sys.ultrix" directory contains the following files: net/netinet/if_ether.c.diff net/dli/dli_setopt.c.diff net/netinet/if_ether.h.diff net/netinet/igmp.c h/kmalloc.h.diff net/netinet/igmp.h h/mbuf.h.diff net/netinet/igmp_var.h net/netinet/in.c.diff mips/mips/if_ln.c.diff net/netinet/in.h.diff net/netinet/in_pcb.c.diff conf/mips/files.diff net/netinet/in_pcb.h.diff conf/mips/files.mips.diff net/netinet/in_proto.c.diff net/netinet/in_var.h.diff b.mips/BINARY/dli_setopt.o net/netinet/ip_icmp.c.diff b.mips/BINARY/if_ether.o net/netinet/ip_input.c.diff b.mips/BINARY/if_ln.o net/netinet/ip_mroute.c b.mips/BINARY/if_loop.o net/netinet/ip_mroute.h b.mips/BINARY/igmp.o net/netinet/ip_output.c.diff b.mips/BINARY/in.o net/netinet/ip_var.h.diff b.mips/BINARY/in_pcb.o net/netinet/raw_ip.c b.mips/BINARY/ip_icmp.o net/netinet/udp_usrreq.c.diff b.mips/BINARY/ip_input.o b.mips/BINARY/ip_mroute.o net/net/if.h.diff b.mips/BINARY/ip_output.o net/net/if_loop.c.diff b.mips/BINARY/raw_cb.o net/net/raw_cb.c.diff b.mips/BINARY/raw_ip.o net/net/raw_cb.h.diff b.mips/BINARY/udp_usrreq.o If you have a binary distribution, you should update net/netinet/in_proto.c and all the given .h files in the net/netinet, net/net, and h directories. (The diffs are suitable for use by the "patch" program; those .h files that are not diffs---igmp.h, igmp_var.h and ip_mroute.h---should just be installed as is.) Ignore any other .c files. Update conf/mips/files and conf/mips/files.mips. Install all .o files in the b.mips/BINARY directory. If you have a source distribution, update all .h and .c files in the net/netinet, net/net, net/dli, h, and mips/mips directories. Those files that are not diffs should just be installed as is. 
Modify conf/mips/files and conf/mips/files.mips as indicated by the diffs, and then ensure that the lines for all of the updated or new .c files do NOT contain the word "Binary"; the line for if_ln.c is in conf/mips/files.mips, the lines for all of the other .c files are in conf/mips/files. Do NOT install the files in the b.mips/BINARY directory. The changes made to support multicasting are bracketed with #ifdef MULTICAST/ #endif MULTICAST in all of the kernel ".c" files, and the multicast routing code is similarly bracketed with #ifdef MROUTING/#endif MROUTING. To build a system that includes the multicast services, the lines "options MULTICAST" and "options MROUTING" must be added after "options INET" in the configuration file for the target kernel, in the conf/mips directory. The MROUTING option is needed only for those machines that are to serve as IP multicast routers. (The .o files in the BINARY directory were compiled with both MULTICAST and MROUTING defined.) All ".h" files should also be installed (or symbolically-linked) under the appropriate subdirectories of /usr/include, so that they may be included by programs outside the kernel. 
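As an illustration of the configuration step described above, the relevant part of a kernel configuration file might look like the fragment below. Only the three option names come from this note; the spacing is illustrative, and the rest of the file is whatever your existing configuration already contains. The MROUTING line would be omitted on machines that are not to serve as IP multicast routers.

```
options         INET
options         MULTICAST
options         MROUTING
```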
The file multi-pmax3.1c.tar contains the following files: multi.dist/Ultrix3.1c.patch multi.dist/MIPS/BINARY/dli_setopt.o multi.dist/MIPS/BINARY/if_ether.o multi.dist/MIPS/BINARY/if_ln.o multi.dist/MIPS/BINARY/if_loop.o multi.dist/MIPS/BINARY/igmp.o multi.dist/MIPS/BINARY/in.o multi.dist/MIPS/BINARY/in_pcb.o multi.dist/MIPS/BINARY/ip_icmp.o multi.dist/MIPS/BINARY/ip_input.o multi.dist/MIPS/BINARY/ip_mroute.o multi.dist/MIPS/BINARY/ip_output.o multi.dist/MIPS/BINARY/raw_cb.o multi.dist/MIPS/BINARY/raw_ip.o multi.dist/MIPS/BINARY/udp_usrreq.o of these, the patch file (using patch -p0) will modify (or create) the following files: multi.dist/net/netinet/if_ether.c.diff multi.dist/net/netinet/if_ether.h.diff multi.dist/net/netinet/in.c.diff multi.dist/net/netinet/in.h.diff multi.dist/net/netinet/in_pcb.c.diff multi.dist/net/netinet/in_pcb.h.diff multi.dist/net/netinet/in_proto.c.diff multi.dist/net/netinet/in_var.h.diff multi.dist/net/netinet/ip_icmp.c.diff multi.dist/net/netinet/ip_input.c.diff multi.dist/net/netinet/ip_output.c.diff multi.dist/net/netinet/ip_var.h.diff multi.dist/net/netinet/raw_ip.c.diff multi.dist/net/netinet/udp_usrreq.c.diff multi.dist/net/netinet/igmp.c multi.dist/net/netinet/igmp.h multi.dist/net/netinet/igmp_var.h multi.dist/net/netinet/ip_mroute.c multi.dist/net/netinet/ip_mroute.h multi.dist/net/net/if.h.diff multi.dist/net/net/if_loop.c.diff multi.dist/net/net/raw_cb.c.diff multi.dist/net/net/raw_cb.h.diff multi.dist/net/dli/dli_setopt.c.diff multi.dist/h/kmalloc.h.diff multi.dist/h/mbuf.h.diff multi.dist/io/netif/if_ln.c.diff multi.dist/conf/mips/files.diff multi.dist/conf/mips/files.mips.diff multi.dist/conf/mips/BINARY multi.dist/conf/mips/newvers.sh If you only have a binary distribution, only the last four diff files will be of interest. Note that we have added the options MULTICAST and MULTIROUTING to the config file for BINARY; this makes it easier to distribute and install in a binary configuration. 5. 
ESTABLISHING A DEFAULT MULTICAST INTERFACE

Selection of the default multicast interface is controlled via the kernel (unicast) routing table. If there is no multicast route in the table, all multicasts will, by default, be sent on the interface associated with the default gateway. If that interface does not support multicast, attempts to send will receive an ENETUNREACH error.

A route may be added for a particular multicast address or for all multicast addresses, to direct them to a different default interface. For example, to specify that multicast datagrams addressed to 224.0.1.3 should, by default, be sent on the interface with local address 36.2.0.8, use the following:

    /etc/route add 224.0.1.3 36.2.0.8 0

To set the default for all multicast addresses, other than those with individual routes, to be the interface with local address 36.11.0.1, use:

    /etc/route add 224.0.0.0 36.11.0.1 0

If you point a multicast route at an interface that does not support multicasting, an attempt to multicast via that route will receive an ENETUNREACH error.

If needed, these commands normally would be added to the /etc/rc.ip or /etc/rc.local file, to take effect every time the system is booted.

6. WARNINGS

These extensions to 4.3BSD, SunOS and Ultrix are experimental and subject to change in future releases from Stanford, Berkeley, Sun or DEC. In particular, there may be changes in the way a process overrides the default interface for sending multicast datagrams and for joining multicast groups. This ability to override the default interface is intended mainly for routing demons; normal applications should not be concerned with specific interfaces.

These modifications have not yet been tested in a wide variety of environments and may interact in undesirable ways with other hosts or with other operations on the same host. The most common problem is with hosts that respond incorrectly to IP multicasts.
These responses typically take the form of ICMP network unreachable, redirect, or time-exceeded error messages, which are a nuisance but mostly harmless. These responses are in violation of the current IP specification and, with luck, will disappear over time. (This release includes some patches to correct this problem in Symbolics Lisp Machines, which seem to be the most commonly encountered offenders.) I don't know if routing demons such as routed or gated will be confused to find multicast routes in the kernel's unicast routing table. I deleted one test in the in_pcblookup function to allow multiple sockets to be bound to the same port, even if all other parts of the address are unbound, if SO_REUSEADDR is specified. The logic of in_pcblookup is rather opaque and there is not a single comment, so it is quite possible that I have broken some important piece of binding machinery. I would appreciate suggestions of better ways to accomplish what I want. The multicast router included in this release and described in section 8, below, is intended as an experimental, interim solution to the multicast routing problem, pending availability of multicast routing support in production Internet gateways. Three problems with the current router are: (1) It does not properly handle physical networks that are assigned more than one IP subnet number. If you install the router on a network that is shared by more than one subnet, you should make sure that all multicast-capable hosts and all multicast routers have the same subnet number. This problem will probably be fixed in a future release. (2) The tunnel mechanism used to interconnect instances of the multicast router that are separated by non-multicast-capable gateways depends on IP source routing. Some sites have disabled source routing in their gateways; those sites will not be able to use tunnels. 
(3) A host must be attached to at least one physical, multicast-capable network in order to originate or receive internet multicasts. A future version of the router may allow an isolated host (say, connected only to an X.25 network) to run a copy of the router with tunnel links to other routers in order to be a source and sink of multicasts.

7. MODIFYING OTHER NETWORK DRIVERS

Ethernet drivers other than those included in this release, and drivers for other kinds of networks, must be modified to support multicasting. Note that hardware multicast support is not essential, as long as it is possible to send a packet to all members of a local multicast group. For example, on the 3Mbit Experimental Ethernet, which has no multicast addressing mode, all multicast packets could be sent as broadcasts. On point-to-point links, there can be only two members of any group, so multicasting is trivial. (The SLIP driver in sys.bsd/net/if_sl.c exemplifies the minimal modifications required.)

In general, the following changes must be made before a driver is entitled to set the IFF_MULTICAST flag:

  - the destination IP address of outgoing multicast packets must be translated into an appropriate link-level address. For Ethernets, this is taken care of by arpresolve().

  - the driver must ensure that it does NOT receive or loop back its own multicast transmissions. This is opposite to the way IP broadcasts are handled.

  - the driver must accept without error SIOCADDMULTI and SIOCDELMULTI ioctls that refer to AF_INET family addresses. If the interface provides multicast filtering hardware, the IP addresses should be mapped to appropriate link-level addresses and the filter updated accordingly. If the number of addresses exceeds the capacity of the filter, the driver should open up the interface to receive all multicasts.
    For Ethernets, the functions ether_addmulti() and ether_delmulti() can be used to maintain a list of multicast addresses, and the macros ETHER_FIRST_MULTI() and ETHER_NEXT_MULTI() can be used to scan the list when updating the filter. See the various driver changes included with this release for examples of filter manipulation.

  - if possible, the driver should handle changes in the IFF_PROMISC and IFF_ALLMULTI interface flags. These often must be or can be coordinated with the handling of the filter list.

  - the IFF_MULTICAST flag should be set in the "attach" routine.

8. MROUTED

The "mrouted" directory contains the sources for a multicast router. This program is covered by the license in mrouted/LICENSE. If you are unwilling to abide by the conditions of the license, do not use the mrouted program.

Mrouted is an implementation of the Distance-Vector Multicast Routing Protocol (DVMRP), an earlier version of which is specified in RFC-1075. It maintains topological knowledge via a distance-vector routing protocol (like RIP, described in RFC-1058), upon which it implements a multicast forwarding algorithm called Truncated Reverse Path Broadcasting (TRPB). TRPB is described, along with other multicast routing algorithms, in the paper "Multicast Routing in Internetworks and Extended LANs" by S. Deering, in the Proceedings of the ACM SIGCOMM '88 Conference.

Mrouted forwards a multicast datagram along a shortest (reverse) path tree rooted at the subnet on which the datagram originates. It is a BROADCAST tree, which means it includes ALL subnets reachable by a cooperating set of mrouted routers. However, the datagram will not be forwarded onto LEAF subnets of the tree if those subnets do not have members of the destination group. Furthermore, the IP time-to-live of a multicast datagram may prevent it from being forwarded along the entire tree.
In order to support multicasting among subnets that are separated by (unicast) routers that do not support IP multicasting, mrouted includes support for "tunnels", which are virtual point-to-point links between pairs of mrouteds located anywhere in the internet. IP multicast packets are encapsulated for transmission through tunnels, so that they look like normal unicast datagrams to intervening routers and subnets. The encapsulation takes the form of an IP source route which is inserted on entry to a tunnel, and stripped out on exit from a tunnel.

The tunnel mechanism allows mrouted to establish a virtual internet, for the purpose of multicasting only, which is independent of the physical internet, and which may span multiple Autonomous Systems. This capability is intended for experimental support of internet multicasting only, pending widespread support for multicast routing by the regular (unicast) routers. Mrouted suffers from the well-known scaling problems of any distance-vector routing protocol, and does not (yet) support hierarchical multicast routing or inter-operation with other multicast routing protocols.

Mrouted handles multicast routing only; there may or may not be a unicast router running on the same host as mrouted. With the use of tunnels, it is not necessary for mrouted to have access to more than one physical subnet in order to perform multicast forwarding.

Invocation

Mrouted must be run as root. It is invoked as follows:

    mrouted [ -d [ debug_level ] ]

If no "-d" option is given, or if the debug level is specified as 0, mrouted detaches from the invoking terminal. Otherwise, it remains attached to the invoking terminal and responsive to signals from that terminal. If "-d" is given with no argument, the debug level defaults to 2. Regardless of the debug level, mrouted always writes warning and error messages to the system log demon. Non-zero debug levels have the following effects:

    level 1: all syslog'ed messages are also printed to stderr.
    level 2: all level 1 messages plus notifications of "significant" events are printed to stderr.
    level 3: all level 2 messages plus notifications of all packet arrivals and departures are printed to stderr.

Configuration

Mrouted automatically configures itself to forward on all multicast-capable interfaces, i.e. interfaces that have the IFF_MULTICAST flag set (excluding the loopback "interface"), and it finds other mrouteds directly reachable via those interfaces. To override the default configuration, or to add tunnel links to other mrouteds, configuration commands may be placed in /etc/mrouted.conf. There are two types of configuration command:

    phyint <local-addr> [disable] [metric <m>] [threshold <t>]
    tunnel <local-addr> <remote-addr> [metric <m>] [threshold <t>]

The phyint command can be used to disable multicast routing on the physical interface identified by local IP address <local-addr>, or to associate a non-default metric or threshold with the specified physical interface. Phyint commands should precede tunnel commands.

The tunnel command can be used to establish a tunnel link between local IP address <local-addr> and remote IP address <remote-addr>, and to associate a non-default metric or threshold with that tunnel. The tunnel must be set up in the mrouted.conf files of both ends before it will be used.

The metric is the "cost" associated with sending a datagram on the given interface or tunnel; it may be used to influence the choice of routes. The metric defaults to 1. Metrics should be kept as small as possible, because mrouted cannot route along paths with a sum of metrics greater than 31. When in doubt, the following metrics are recommended:

    LAN, or tunnel across a single LAN: 1
    serial link, or tunnel across a single serial link: 2
    multi-hop tunnel: 3

The threshold is the minimum IP time-to-live required for a multicast datagram to be forwarded to the given interface or tunnel. It is used to control the scope of multicast datagrams. (The TTL of forwarded packets is only compared to the threshold; it is not decremented by the threshold.
Every multicast router decrements the TTL by 1.) The default threshold is 1.

    links that separate sites should have a threshold of 32.
    links that separate regions should have a threshold of 64.
    links that separate continents should have a threshold of 128.

In general, all mrouteds connected to a particular subnet or tunnel should use the same metric and threshold for that subnet or tunnel.

Mrouted will not initiate execution if it has fewer than two enabled vifs, where a vif (virtual interface) is either a physical multicast-capable interface or a tunnel. It will log a warning if all of its vifs are tunnels; such an mrouted configuration would be better replaced by more direct tunnels (i.e., eliminate the middle man).

The "mrouted" directory includes a sample mrouted.conf file.

Signals

Mrouted responds to the following signals:

    HUP, TERM, INT: terminates execution gracefully (i.e., by sending good-bye messages to all neighboring routers). The INT signal is typically bound to the CTL-C key.

    USR1: dumps the internal routing tables to /usr/tmp/mrouted.dump.

    QUIT: dumps the internal routing tables to stderr (only if mrouted was invoked with a non-zero debug level). The QUIT signal is typically bound to the CTL-\ key.

The routing tables look like this:

    Virtual Interface Table
     Vif  Local-Address                     Metric  Thresh  Flags
      0   36.2.0.8      subnet: 36.2           1       1    querier
                        groups: 224.0.2.1
                                224.0.0.4

      1   36.11.0.1     subnet: 36.11          1       1    querier
                        groups: 224.0.2.1
                                224.0.1.0
                                224.0.0.4

      2   36.2.0.8      tunnel: 36.8.0.77      3       1
                        peers : 36.8.0.77

      3   36.2.0.8      tunnel: 36.8.0.110     3       1

    Multicast Routing Table
     Origin-Subnet   From-Gateway    Metric  In-Vif  Out-Vifs
     36.2                               1       0    1* 2  3*
     36.8            36.8.0.77          4       2    0* 1* 3*
     36.11                              1       1    0* 2  3*

In this example, there are four vifs connecting to two subnets and two tunnels. The vif 3 tunnel is not in use (no peer address). The vif 0 and vif 1 subnets have some groups present; tunnels never have any groups.
This instance of mrouted is the one responsible for sending periodic group membership queries on the vif 0 and vif 1 subnets, as indicated by the "querier" flags. Associated with each subnet from which a multicast datagram can originate is the address of the previous hop gateway (unless the subnet is directly-connected), the metric of the path back to the origin, the incoming vif for multicasts from that origin, and a list of outgoing vifs. "*" means that the outgoing vif is connected to a leaf of the broadcast tree rooted at the origin, and a multicast datagram from that origin will be forwarded on that outgoing vif only if there are members of the destination group on that leaf.

Changes from RFC-1075 DVMRP

The version of DVMRP implemented by mrouted differs significantly from the specification in RFC-1075. Among the most important changes are:

  - more compact (but less general) packet format.

  - routes are "poisoned" for split-horizon by adding infinity to their metric; this makes it possible to distinguish poisoned routes from unreachable routes without using a separate flag.

  - no longer allows infinity to differ between routes or between routers; it's now a compiled-in constant.

  - no longer tries to hide subnet structure from routers outside of a subnetted network; the children/leaf algorithms wouldn't work with subnet hiding.

  - topologies now include only multicast-capable nets and tunnels; other types of links (such as pure broadcast) must be excluded because they can't carry IP multicast packets (even though they could carry routing updates).

  - most timer values have changed.

  - leaf timeouts now done per-route rather than per-vif.

  - periodic route reports are replaced by periodic probe messages when no neighbors can be heard over a vif; this prevents routes from being established in the absence of two-way communication capability (which absence is much more likely with tunnels than with physical links).

  - tunneled packets now retain their original source address, rather than carrying the address of the tunnel source in the IP source address field, despite the problem of IGMP messages; this is the lesser of two evils.

  - detailed changes in the leaf algorithms, to recognize some leaf links more quickly (especially tunnels).

The only specification of the current version of DVMRP is the mrouted sources.


9. MTEST

The mtest directory contains a small program for testing the multicast membership sockopts and ioctls. It accepts the following commands, interactively:

    j g.g.g.g i.i.i.i       - join IP multicast group
    l g.g.g.g i.i.i.i       - leave IP multicast group
    a ifname e.e.e.e.e.e    - add ether multicast address
    d ifname e.e.e.e.e.e    - del ether multicast address
    m ifname 1/0            - set/clear ether allmulti flag
    p ifname 1/0            - set/clear ether promisc flag
    q                       - quit

where

    g.g.g.g       is an IP multicast address, e.g., 224.0.2.1
    i.i.i.i       is the IP address of a local interface or 0.0.0.0
    ifname        is an interface name, e.g., qe0
    e.e.e.e.e.e   is an Ethernet address in hex, e.g., 1.0.5e.0.2.1
    1/0           is a 1 or a 0, to turn the flag on or off

The "p" command to change the promiscuous flag does not work under SunOS, because it uses a different ioctl for that purpose.

Mtest is useful for establishing targets for multicast ping testing. The results of mtest filter manipulation can be seen by using the "netstat -nia" command (see next section).


10. NETSTAT

The netstat.bsd directory contains a modified version of the netstat program included with the 4/4/88 Berkeley networking release. Note that it is NOT the tahoe version of netstat; it ought to be upgraded to the tahoe version.

The netstat.sun directory contains the same program as in netstat.bsd, with non-Sun supported stuff ripped out (mainly AF_NS stuff). Note that it is NOT the SunOS Release 4.0 netstat; it ought to be merged with the Sun version.

The netstat.ultrix directory contains the same program as in netstat.bsd, with non-DEC supported stuff ripped out (mainly AF_NS stuff). Note that it is NOT the Ultrix 3.0 netstat; it ought to be merged with the DEC version.

These versions of netstat support the following new features:

  - "netstat -m" recognizes four new, multicast-related mbuf types.

  - IGMP statistics are now included in the "netstat -s" output, or explicitly with "netstat -p igmp".

  - the "-a" option may be used when printing interface information (for example, "netstat -nia") to print out all addresses (unicast and multicast) associated with each interface, including link-level addresses. The only link-level addresses currently recognized are Ethernet addresses, and only for some types of Ethernet interface; additional interface types may be added in netstat/if.c.

  - the "-M" option prints the kernel's multicast routing tables. The kernel's tables are a subset of mrouted's tables (see section 8, above). "-Ms" prints miscellaneous statistics related to multicast routing.

Before the "-p igmp" option will work, you must add the following line to /etc/protocols:

    igmp    2    IGMP    # internet group management protocol


11. PING

The ping directory contains a modified version of BRL's public domain ping program. The modified ping provides control of the multicast transmission sockopts via the following new options:

    -l          - inhibit multicast loopback.

    -t <ttl>    - set the multicast time-to-live to <ttl>.

    -i <addr>   - send multicasts from the interface with local address <addr>, given in a.b.c.d format.

This version of ping also supports the -R (record route) option.


12. RWHOD

The rwhod directory contains a modified version of Berkeley's rwho demon. It supports a new command line option, as follows:

    -m          - causes rwhod to use IP multicast (instead of broadcast or unicast) on all interfaces that have the IFF_MULTICAST flag set in their "ifnet" structs (excluding the loopback interface).
                  The multicast reports are sent with a time-to-live of 1, to prevent forwarding beyond the directly-connected subnet(s).

    -m <ttl>    - causes rwhod to send IP multicast datagrams with a time-to-live of <ttl>, via a SINGLE interface rather than all interfaces. <ttl> must be between 0 and 32.

Note that "-m 1" is different from "-m", in that "-m 1" specifies transmission on one interface only.

When "-m" is used without a <ttl> argument, the program accepts multicast rwhod reports from all multicast-capable interfaces. If a <ttl> argument is given, it accepts multicast reports from only one interface, the one on which reports are sent (which may be controlled via the host's routing table; see section 5, above). Regardless of the "-m" option, the program accepts broadcast or unicast reports from all interfaces. Thus, this program will hear the reports of old, non-multicasting rwhods, but, if multicasting is used, those old rwhods won't hear the reports generated by this program.


13. ADMINISTRATION

IP multicast addresses are officially assigned by the Internet Numbers Czar. Contact Joyce Reynolds to apply for a multicast address assignment.

The interconnection of sites by tunnels will be informally coordinated by me (Steve Deering). If you wish to connect your site to the "multicast internet", please drop me a note.

Bug reports and suggestions for changes should be sent to the mailing list vmtp-ip@gregorio.Stanford.EDU. To join that list, send a request to vmtp-ip-request@gregorio.Stanford.EDU.


14. ACKNOWLEDGMENTS

I would like to acknowledge my debt to David Cheriton of Stanford, who supervised this work and provided the development resources, Karen Lam of BBN Labs, who did the first IP multicast implementation for BSD Unix, Erik Nordmark of Stanford, who did the first port to SunOS, Bill Nowicki of Sun, who encouraged and facilitated the SunOS port, Tony Mason of Stanford, who did the PMAX port, David Waitzman of BBN STC, who did the first DVMRP implementation, Joe Pallas of Stanford, who provided frequent wizardly insights, and the members of the End-to-End Protocols Task Force, who continue to encourage and guide this work.


15. CHANGE HISTORY

Release 1.0    May 1988

  - first release based on BSD 4.3+ networking code.

Release 1.1    March 21, 1989

  - updated to BSD 4.3-tahoe sources.

  - added support for SunOS 4.0.

  - added support for multicasting to the loopback "interface", and allowed TTL 0 multicasts to be delivered locally.

  - added the experimental DVMRP multicast routing demon, along with associated kernel multicast routing support.

  - included kernel change required to support Jacobson's traceroute program.

  - added multicast scope-control option to rwhod.

  - added mtest and modified ping programs.

Release 1.2    June 24, 1989

  - added support for Ultrix 3.0 on the PMAX (Decstation 3100).

  - modified rwhod to use its new, officially-assigned IP multicast address (224.0.1.3).

  - fixed a byte-order bug in mrouted.

  - fixed an error logging bug in mrouted.

  - changed mrouted to accept an all-ones subnet mask in reports, for eventual support of "host routes".

  - deleted dependency lines generated by "mkdep" from the mrouted and netstat Makefiles, to accommodate different file system layouts.

  - included a set of patches for Symbolics Lisp Machines to eliminate anti-social ICMP error responses to IP multicasts.

  - included more source files in the SunOS netinet directory to fix problems with undefined symbols and unsatisfied references.

  - changed the kernel to deliver only multicasts and broadcasts to multiple UDP sockets bound to the same port; unicasts are now delivered to one socket only, for compatibility with existing 4.3BSD applications.

  - changed the format of some function declarations to conform to BSD kernel style.

  - fixed a route caching bug in the kernel's multicast routing code.

  - moved the kernel's multicast routing stub routines from ip_output.c to ip_mroute.c.

  - rearranged code in kernel's ip_output() routine to avoid checking for multicast addresses twice.

  - added a missing call to splx() in SunOS kernel's arpresolve() routine.

  - renamed SunOS kernel's "udp_cksum" variable to "udpcksum" to conform to binary SunOS 4.0 usage.

  - installed Sun's fix to the "le" Ethernet driver for a bug triggered by mbufs with zero data length.