.\"     $NetBSD: ip.4,v 1.36 2013/07/13 09:24:25 njoly Exp $
.\"
.\" Copyright (c) 1983, 1991, 1993
.\"     The Regents of the University of California.  All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. Neither the name of the University nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\"     @(#)ip.4        8.2 (Berkeley) 11/30/93
.\"
.Dd June 27, 2013
.Dt IP 4
.Os
.Sh NAME
.Nm ip
.Nd Internet Protocol
.Sh SYNOPSIS
.In sys/socket.h
.In netinet/in.h
.Ft int
.Fn socket AF_INET SOCK_RAW proto
.Sh DESCRIPTION
.Tn IP
is the network layer protocol used by the Internet protocol family.
Options may be set at the
.Tn IP
level when using higher-level protocols that are based on
.Tn IP
(such as
.Tn TCP
and
.Tn UDP ) .
It may also be accessed through a
.Dq raw socket
when developing new protocols or special-purpose applications.
.Pp
There are several
.Tn IP-level
.Xr setsockopt 2 Ns / Ns Xr getsockopt 2
options.
.Dv IP_OPTIONS
may be used to provide
.Tn IP
options to be transmitted in the
.Tn IP
header of each outgoing packet
or to examine the header options on incoming packets.
.Tn IP
options may be used with any socket type in the Internet family.
The format of
.Tn IP
options to be sent is that specified by the
.Tn IP
protocol specification (RFC 791), with one exception:
the list of addresses for Source Route options must include the first-hop
gateway at the beginning of the list of gateways.
The first-hop gateway address will be extracted from the option list
and the size adjusted accordingly before use.
To disable previously specified options, use a zero-length buffer:
.Bd -literal
setsockopt(s, IPPROTO_IP, IP_OPTIONS, NULL, 0);
.Ed
.Pp
.Dv IP_TOS
and
.Dv IP_TTL
may be used to set the type-of-service and time-to-live fields in the
.Tn IP
header for
.Dv SOCK_STREAM
and
.Dv SOCK_DGRAM
sockets.
For example,
.Bd -literal
int tos = IPTOS_LOWDELAY;       /* see \*[Lt]netinet/ip.h\*[Gt] */
setsockopt(s, IPPROTO_IP, IP_TOS, \*[Am]tos, sizeof(tos));

int ttl = 60;                   /* max = 255 */
setsockopt(s, IPPROTO_IP, IP_TTL, \*[Am]ttl, sizeof(ttl));
.Ed
.Pp
.Dv IP_IPSEC_POLICY
controls IPsec policy for sockets.
For example,
.Bd -literal
const char *policy = "in ipsec ah/transport//require";
char *buf = ipsec_set_policy(policy, strlen(policy));
setsockopt(s, IPPROTO_IP, IP_IPSEC_POLICY, buf, ipsec_get_policylen(buf));
.Ed
.Pp
The
.Dv IP_PKTINFO
option can be used to enable reception of information about the source
address of the packet and the interface index.
The information is passed in a
.Vt struct in_pktinfo
structure, which contains
.Bd -literal
    struct in_addr ipi_addr;    /* the source or destination address */
    unsigned int ipi_ifindex;   /* the interface index */
.Ed
and added to the control portion of the message.
The cmsghdr fields have the following values:
.Bd -literal
cmsg_len = CMSG_LEN(sizeof(struct in_pktinfo))
cmsg_level = IPPROTO_IP
cmsg_type = IP_PKTINFO
.Ed
.Pp
The
.Dv IP_PORTALGO
option can be used to randomize the port selection.
Valid algorithms are described in
.Xr rfc6056 7
and their respective constants are in
.In netinet/portalgo.h .
For example,
.Bd -literal
int algo = PORTALGO_ALGO_RANDOM_PICK;       /* see \*[Lt]netinet/portalgo.h\*[Gt] */
setsockopt(s, IPPROTO_IP, IP_PORTALGO, \*[Am]algo, sizeof(algo));
.Ed
.Pp
The port selection can also be viewed and controlled at a global level for all
.Tn IP
sockets using the following
.Xr sysctl 7
variables:
.Dv net.inet.ip.anonportalgo.available
and
.Dv net.inet.ip.anonportalgo.selected .
.Pp
.Dv IP_PORTRANGE
controls how ephemeral ports are allocated for
.Dv SOCK_STREAM
and
.Dv SOCK_DGRAM
sockets.
For example,
.Bd -literal
int range = IP_PORTRANGE_LOW;       /* see \*[Lt]netinet/in.h\*[Gt] */
setsockopt(s, IPPROTO_IP, IP_PORTRANGE, \*[Am]range, sizeof(range));
.Ed
.Pp
If the
.Dv IP_RECVDSTADDR
option is enabled on a
.Dv SOCK_DGRAM
or
.Dv SOCK_RAW
socket,
the
.Xr recvmsg 2
call will return the destination
.Tn IP
address for a
.Tn UDP
datagram.
The msg_control field in the msghdr structure points to a buffer
that contains a cmsghdr structure followed by the
.Tn IP
address.
The cmsghdr fields have the following values:
.Bd -literal
cmsg_len = CMSG_LEN(sizeof(struct in_addr))
cmsg_level = IPPROTO_IP
cmsg_type = IP_RECVDSTADDR
.Ed
.Pp
If the
.Dv IP_RECVIF
option is enabled on a
.Dv SOCK_DGRAM
or
.Dv SOCK_RAW
socket,
the
.Xr recvmsg 2
call will return a struct sockaddr_dl corresponding to
the interface on which the packet was received.
The msg_control field in the msghdr structure points to a buffer
that contains a cmsghdr structure followed by the struct sockaddr_dl.
The cmsghdr fields have the following values:
.Bd -literal
cmsg_len = CMSG_LEN(sizeof(struct sockaddr_dl))
cmsg_level = IPPROTO_IP
cmsg_type = IP_RECVIF
.Ed
.Pp
The
.Dv IP_RECVPKTINFO
option is similar to the
.Dv IP_PKTINFO
one, only in this case the inbound information is returned.
.Pp
If the
.Dv IP_RECVTTL
option is enabled on a
.Dv SOCK_DGRAM
socket, the
.Xr recvmsg 2
call will return the
.Tn TTL
of the received datagram.
The msg_control field in the msghdr structure points to a buffer
that contains a cmsghdr structure followed by the
.Tn TTL
value.
The cmsghdr fields have the following values:
.Bd -literal
cmsg_len = CMSG_LEN(sizeof(uint8_t))
cmsg_level = IPPROTO_IP
cmsg_type = IP_RECVTTL
.Ed
.Pp
The
.Dv IP_MINTTL
option may be used on
.Dv SOCK_DGRAM
or
.Dv SOCK_STREAM
sockets to discard packets with a TTL lower than the option value.
This can be used to implement the
.Em Generalized TTL Security Mechanism (GTSM)
according to RFC 3682.
To discard all packets with a TTL lower than 255:
.Bd -literal -offset indent
int minttl = 255;
setsockopt(s, IPPROTO_IP, IP_MINTTL, \*[Am]minttl, sizeof(minttl));
.Ed
.Ss MULTICAST OPTIONS
.Tn IP
multicasting is supported only on
.Dv AF_INET
sockets of type
.Dv SOCK_DGRAM
and
.Dv SOCK_RAW ,
and only on networks where the interface driver supports multicasting.
.Pp
The
.Dv IP_MULTICAST_TTL
option changes the time-to-live (TTL) for outgoing multicast datagrams
in order to control the scope of the multicasts:
.Bd -literal
u_char ttl;     /* range: 0 to 255, default = 1 */
setsockopt(s, IPPROTO_IP, IP_MULTICAST_TTL, \*[Am]ttl, sizeof(ttl));
.Ed
.Pp
Datagrams with a TTL of 1 are not forwarded beyond the local network.
Multicast datagrams with a TTL of 0 will not be transmitted on any network,
but may be delivered locally if the sending host belongs to the destination
group and if multicast loopback has not been disabled on the sending socket
(see below).
Multicast datagrams with TTL greater than 1 may be forwarded
to other networks if a multicast router is attached to the local network.
.Pp
For hosts with multiple interfaces, each multicast transmission is
sent from the primary network interface.
The
.Dv IP_MULTICAST_IF
option overrides the default for
subsequent transmissions from a given socket:
.Bd -literal
struct in_addr addr;
setsockopt(s, IPPROTO_IP, IP_MULTICAST_IF, \*[Am]addr, sizeof(addr));
.Ed
.Pp
where "addr" is the local
.Tn IP
address of the desired interface or
.Dv INADDR_ANY
to specify the default interface.
An interface's local IP address and multicast capability can
be obtained via the
.Dv SIOCGIFCONF
and
.Dv SIOCGIFFLAGS
ioctls.
An application may also specify an alternative to the default network interface
by index:
.Bd -literal
uint32_t idx = htonl(ifindex);
setsockopt(s, IPPROTO_IP, IP_MULTICAST_IF, \*[Am]idx, sizeof(idx));
.Ed
.Pp
where "ifindex" is an interface index as returned by
.Xr if_nametoindex 3 .
.Pp
Normal applications should not need to use
.Dv IP_MULTICAST_IF .
.Pp
If a multicast datagram is sent to a group to which the sending host itself
belongs (on the outgoing interface), a copy of the datagram is, by default,
looped back by the IP layer for local delivery.
The
.Dv IP_MULTICAST_LOOP
option gives the sender explicit control
over whether or not subsequent datagrams are looped back:
.Bd -literal
u_char loop;    /* 0 = disable, 1 = enable (default) */
setsockopt(s, IPPROTO_IP, IP_MULTICAST_LOOP, \*[Am]loop, sizeof(loop));
.Ed
.Pp
This option
improves performance for applications that may have no more than one
instance on a single host (such as a router daemon), by eliminating
the overhead of receiving their own transmissions.
It should generally not be used by applications for which there
may be more than one instance on a single host (such as a conferencing
program) or for which the sender does not belong to the destination
group (such as a time querying program).
.Pp
A multicast datagram sent with an initial TTL greater than 1 may be delivered
to the sending host on a different interface from that on which it was sent,
if the host belongs to the destination group on that other interface.
The loopback control option has no effect on such delivery.
.Pp
A host must become a member of a multicast group before it can receive
datagrams sent to the group.
To join a multicast group, use the
.Dv IP_ADD_MEMBERSHIP
option:
.Bd -literal
struct ip_mreq mreq;
setsockopt(s, IPPROTO_IP, IP_ADD_MEMBERSHIP, \*[Am]mreq, sizeof(mreq));
.Ed
.Pp
where
.Fa mreq
is the following structure:
.Bd -literal
struct ip_mreq {
    struct in_addr imr_multiaddr; /* multicast group to join */
    struct in_addr imr_interface; /* interface to join on */
};
.Ed
.Pp
.Dv imr_interface
should be
.Dv INADDR_ANY
to choose the default multicast interface, or the
.Tn IP
address of a particular multicast-capable interface if
the host is multihomed.
Membership is associated with a single interface;
programs running on multihomed hosts may need to
join the same group on more than one interface.
Up to
.Dv IP_MAX_MEMBERSHIPS
(currently 20) memberships may be added on a single socket.
.Pp
To drop a membership, use:
.Bd -literal
struct ip_mreq mreq;
setsockopt(s, IPPROTO_IP, IP_DROP_MEMBERSHIP, \*[Am]mreq, sizeof(mreq));
.Ed
.Pp
where
.Fa mreq
contains the same values as used to add the membership.
Memberships are dropped when the socket is closed or the process exits.
.\"-----------------------
.Ss RAW IP SOCKETS
Raw
.Tn IP
sockets are connectionless, and are normally used with the
.Xr sendto 2
and
.Xr recvfrom 2
calls, though the
.Xr connect 2
call may also be used to fix the destination for future
packets (in which case the
.Xr read 2
or
.Xr recv 2
and
.Xr write 2
or
.Xr send 2
system calls may be used).
.Pp
If
.Fa proto
is 0, the default protocol
.Dv IPPROTO_RAW
is used for outgoing packets, and only incoming packets destined
for that protocol are received.
If
.Fa proto
is non-zero, that protocol number will be used on outgoing packets
and to filter incoming packets.
.Pp
Outgoing packets automatically have an
.Tn IP
header prepended to them (based on the destination address and the
protocol number the socket is created with), unless the
.Dv IP_HDRINCL
option has been set.
Incoming packets are received with
.Tn IP
header and options intact.
.Pp
.Dv IP_HDRINCL
indicates the complete IP header is included with the data and may
be used only with the
.Dv SOCK_RAW
type.
.Bd -literal
#include \*[Lt]netinet/ip.h\*[Gt]

int hincl = 1;                  /* 1 = on, 0 = off */
setsockopt(s, IPPROTO_IP, IP_HDRINCL, \*[Am]hincl, sizeof(hincl));
.Ed
.Pp
Unlike previous
.Bx
releases, the program must set all
the fields of the IP header, including the following:
.Bd -literal
ip-\*[Gt]ip_v = IPVERSION;
ip-\*[Gt]ip_hl = hlen \*[Gt]\*[Gt] 2;
ip-\*[Gt]ip_id = 0;  /* 0 means the kernel sets an appropriate value */
ip-\*[Gt]ip_off = offset;
.Ed
.Pp
If the header source address is set to
.Dv INADDR_ANY ,
the kernel will choose an appropriate address.
.Sh DIAGNOSTICS
A socket operation may fail with one of the following errors returned:
.Bl -tag -width [EADDRNOTAVAIL]
.It Bq Er EISCONN
when trying to establish a connection on a socket which already
has one, or when trying to send a datagram with the destination
address specified and the socket is already connected;
.It Bq Er ENOTCONN
when trying to send a datagram, but no destination address is
specified, and the socket hasn't been connected;
.It Bq Er ENOBUFS
when the system runs out of memory for an internal data structure;
.It Bq Er EADDRNOTAVAIL
when an attempt is made to create a socket with a network address
for which no network interface exists;
.It Bq Er EACCES
when an attempt is made to create a raw IP socket by a non-privileged process.
.El
.Pp
The following errors specific to
.Tn IP
may occur when setting or getting
.Tn IP
options:
.Bl -tag -width EADDRNOTAVAILxx
.It Bq Er EINVAL
An unknown socket option name was given.
.It Bq Er EINVAL
The IP option field was improperly formed; an option field was
shorter than the minimum value or longer than the option buffer provided.
.El
.Sh SEE ALSO
.Xr getsockopt 2 ,
.Xr recv 2 ,
.Xr send 2 ,
.Xr CMSG_DATA 3 ,
.Xr ipsec_set_policy 3 ,
.Xr icmp 4 ,
.Xr inet 4 ,
.Xr intro 4
.Rs
.%R RFC
.%N 791
.%D September 1981
.%T "Internet Protocol"
.Re
.Rs
.%R RFC
.%N 1112
.%D August 1989
.%T "Host Extensions for IP Multicasting"
.Re
.Rs
.%R RFC
.%N 1122
.%D October 1989
.%T "Requirements for Internet Hosts -- Communication Layers"
.Re
.Sh HISTORY
The
.Nm
protocol appeared in
.Bx 4.2 .