.\"	$NetBSD: tcp.4,v 1.31 2015/02/14 13:02:38 wiz Exp $
.\"	$FreeBSD: tcp.4,v 1.11.2.16 2004/02/16 22:21:47 bms Exp $
.\"
.\" Copyright (c) 1983, 1991, 1993
.\"	The Regents of the University of California.  All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. Neither the name of the University nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\"     @(#)tcp.4	8.1 (Berkeley) 6/5/93
.\"
.Dd February 14, 2015
.Dt TCP 4
.Os
.Sh NAME
.Nm tcp
.Nd Internet Transmission Control Protocol
.Sh SYNOPSIS
.In sys/socket.h
.In netinet/in.h
.Ft int
.Fn socket AF_INET SOCK_STREAM 0
.Ft int
.Fn socket AF_INET6 SOCK_STREAM 0
.Sh DESCRIPTION
The
.Tn TCP
protocol provides reliable, flow-controlled, two-way transmission of data.
It is a byte-stream protocol used to support the
.Dv SOCK_STREAM
abstraction.
.Tn TCP
uses the standard Internet address format and, in addition, provides
a per-host collection of
.Dq port addresses .
Thus, each address is composed of an Internet address specifying
the host and network, with a specific
.Tn TCP
port on the host identifying the peer entity.
.Pp
Sockets using
.Tn TCP
are either
.Dq active
or
.Dq passive .
Active sockets initiate connections to passive
sockets.
By default
.Tn TCP
sockets are created active; to create a passive socket the
.Xr listen 2
system call must be used
after binding the socket with the
.Xr bind 2
system call.
Only passive sockets may use the
.Xr accept 2
call to accept incoming connections.
Only active sockets may use the
.Xr connect 2
call to initiate connections.
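.Pp
The distinction between active and passive sockets can be sketched as follows.
This is a minimal illustration under stated assumptions, not a definitive
implementation: the helper names
.Fn make_passive_socket
and
.Fn accept_fails_on_active_socket
are hypothetical, and binding to the loopback address with a
system-assigned port is an assumption made only to keep the example
self-contained.
.Bd -literal
```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a TCP socket and make it passive.
 * Returns the listening descriptor, or -1 on failure.
 */
int
make_passive_socket(void)
{
	struct sockaddr_in sin;
	int s;

	s = socket(AF_INET, SOCK_STREAM, 0);	/* created active */
	if (s == -1)
		return -1;
	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	sin.sin_port = htons(0);		/* let the system pick */
	if (bind(s, (struct sockaddr *)&sin, sizeof(sin)) == -1 ||
	    listen(s, 5) == -1) {
		close(s);
		return -1;
	}
	return s;	/* passive: accept(2) may now be used on it */
}

/*
 * Hypothetical helper: returns 1 if accept(2) is rejected on a socket
 * that never called listen(2), 0 if it unexpectedly succeeds, and -1
 * if the socket could not be created.
 */
int
accept_fails_on_active_socket(void)
{
	int s, r;

	s = socket(AF_INET, SOCK_STREAM, 0);
	if (s == -1)
		return -1;
	r = accept(s, NULL, NULL);
	if (r != -1)
		close(r);
	close(s);
	return r == -1;
}
```
.Ed
.Pp
A server would normally pass the descriptor returned by the first helper to
.Xr accept 2
in a loop; the second helper exists only to show that the call is
refused on a socket that is still active.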
.Pp
Passive sockets may
.Dq underspecify
their location to match incoming connection requests from multiple networks.
This technique, termed
.Dq wildcard addressing ,
allows a single
server to provide service to clients on multiple networks.
To create a socket which listens on all networks, the Internet
address
.Dv INADDR_ANY
must be bound.
The
.Tn TCP
port may still be specified at this time; if the port is not
specified the system will assign one.
Once a connection has been established the socket's address is
fixed by the peer entity's location.
The address assigned to the socket is the address associated with the
network interface through which packets are being transmitted and received.
Normally this address corresponds to the peer entity's network.
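.Pp
Wildcard addressing and the fixing of the local address at connection time
can be sketched as follows.
This is an illustration under stated assumptions rather than a reference
implementation: the helper name
.Fn wildcard_accept_local_addr
is hypothetical, and the example assumes that the loopback interface is
available so that one process can act as both client and server.
.Bd -literal
```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: listen on INADDR_ANY with a system-assigned
 * port, connect to that port over loopback, and return the local
 * address (host byte order) of the accepted socket, or 0 on failure.
 */
unsigned long
wildcard_accept_local_addr(void)
{
	struct sockaddr_in sin, local;
	socklen_t len;
	unsigned long result = 0;
	int ls, cs = -1, as = -1;

	ls = socket(AF_INET, SOCK_STREAM, 0);
	if (ls == -1)
		return 0;
	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_addr.s_addr = htonl(INADDR_ANY);	/* all networks */
	sin.sin_port = htons(0);			/* system assigns */
	if (bind(ls, (struct sockaddr *)&sin, sizeof(sin)) == -1 ||
	    listen(ls, 5) == -1)
		goto out;

	/* Learn which port the system assigned. */
	len = sizeof(sin);
	if (getsockname(ls, (struct sockaddr *)&sin, &len) == -1)
		goto out;

	/* Active side: connect to that port through loopback. */
	cs = socket(AF_INET, SOCK_STREAM, 0);
	if (cs == -1)
		goto out;
	sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	if (connect(cs, (struct sockaddr *)&sin, sizeof(sin)) == -1)
		goto out;

	as = accept(ls, NULL, NULL);
	if (as == -1)
		goto out;

	/* The accepted socket no longer carries the wildcard address. */
	len = sizeof(local);
	if (getsockname(as, (struct sockaddr *)&local, &len) == 0)
		result = ntohl(local.sin_addr.s_addr);
out:
	if (as != -1)
		close(as);
	if (cs != -1)
		close(cs);
	close(ls);
	return result;
}
```
.Ed
.Pp
Because the client connects through the loopback address, the accepted
socket is expected to report
.Dv INADDR_LOOPBACK
as its local address even though the listening socket was bound to
.Dv INADDR_ANY .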
.Pp
.Tn TCP
supports a number of socket options which can be set with
.Xr setsockopt 2
and tested with
.Xr getsockopt 2 :
.Bl -tag -width TCP_KEEPINTVL
.It Dv TCP_NODELAY
Under most circumstances,
.Tn TCP
sends data when it is presented;
when outstanding data has not yet been acknowledged, it gathers
small amounts of output to be sent in a single packet once
an acknowledgment is received.
For a small number of clients, such as window systems
that send a stream of mouse events which receive no replies,
this packetization may cause significant delays.
Therefore,
.Tn TCP
provides a boolean option,
.Dv TCP_NODELAY
(from
.In netinet/tcp.h ) ,
to defeat this algorithm.
.It Dv TCP_MAXSEG
By default, the sending and receiving
.Tn TCP
negotiate between themselves to determine the maximum segment size
to be used for each connection.
The
.Dv TCP_MAXSEG
option allows the user to determine the result of this negotiation,
and to reduce it if desired.
.It Dv TCP_MD5SIG
This option enables the use of MD5 digests (also known as TCP-MD5)
on writes to the specified socket.
In the current release, only outgoing traffic is digested;
digests on incoming traffic are not verified.
The current default behavior for the system is to respond to a system
advertising this option with TCP-MD5; this may change.
.Pp
One common use for this option is to enable routers running
.Nx
to interwork with Cisco equipment at peering points.
Support for this feature conforms to RFC 2385.
Only IPv4
.Pq Dv AF_INET
sessions are supported.
.Pp
In order for this option to function correctly, it is necessary for the
administrator to add a tcp-md5 key entry to the system's security
associations database (SADB) using the
.Xr setkey 8
utility.
This entry must have an SPI of 0x1000 and can therefore only be specified
on a per-host basis at this time.
.Pp
If an SADB entry cannot be found for the destination, the outgoing traffic
will have an invalid digest option prepended, and the following error message
will be visible on the system console:
.Em "tcp_signature_compute: SADB lookup failed for %d.%d.%d.%d" .
.It Dv TCP_KEEPIDLE
.\" XXX: We always do it.
.\" When the
.\" .Dv SO_KEEPALIVE
.\" option is enabled,
TCP probes a connection that
has been idle for some amount of time.
The default value for this idle period is 4 hours.
The
.Dv TCP_KEEPIDLE
option can be used to affect this value for a given socket, and specifies
the number of seconds of idle time between keepalive probes.
This option takes an
.Vt "unsigned int"
value, which must be greater than 0.
.\" range of 1 to N (where N is
.\" the
.\" .Xr sysctl 8
.\" variable
.\" .Dv net.inet.tcp.keepidle ).
.\" divided by
.\" .Dv  PR_SLOWHZ
.\" which is defined in the
.\" .In sys/protosw.h
.\" header file).
.It Dv TCP_KEEPINTVL
When the
.Dv SO_KEEPALIVE
option is enabled, TCP probes a connection that
has been idle for some amount of time.
If the remote system does not
respond to a keepalive probe, TCP retransmits the probe after some
amount of time.
The default value for this retransmit interval is 150 seconds.
The
.Dv TCP_KEEPINTVL
option can be used to affect this value for
a given socket, and specifies the number of seconds to wait before
retransmitting a keepalive probe.
This option takes an
.Vt "unsigned int"
value, which must be greater than 0.
.\" range of 1 to N (where N is the
.\" .Xr sysctl 8
.\" variable
.\" .Dv net.inet.tcp.keepintvl ).
.It Dv TCP_KEEPCNT
When the
.Dv SO_KEEPALIVE
option is enabled, TCP probes a connection that
has been idle for some amount of time.
If the remote system does not
respond to a keepalive probe, TCP retransmits the probe a certain
number of times before a connection is considered to be broken.
The default value for this keepalive probe retransmit limit is 8.
The
.Dv TCP_KEEPCNT
option can be used to affect this value for a given socket,
and specifies the maximum number of keepalive probes to be sent.
This option takes an
.Vt "unsigned int"
value, which must be greater than 0.
.\" range of 0 to N (where N is the
.\" .Xr sysctl 8
.\" variable
.\" .Dv net.inet.tcp.keepcnt ).
.It Dv TCP_KEEPINIT
If a TCP connection cannot be established within some amount of time,
TCP will time out the connect attempt.
The default value for this initial connection establishment timeout
is 150 seconds.
The
.Dv TCP_KEEPINIT
option can be used to affect this initial timeout period for a given
socket, and specifies the number of seconds to wait before the connect
attempt is timed out.
For passive connections, the
.Dv TCP_KEEPINIT
option value is inherited from the listening socket.
This option takes an
.Vt "unsigned int"
value, which must be greater than 0.
.It Dv TCP_INFO
Information about a socket's underlying TCP session may be retrieved
by passing the read-only option
.Dv TCP_INFO
to
.Xr getsockopt 2 .
It accepts a single argument: a pointer to an instance of
.Vt "struct tcp_info" .
.Pp
This API is subject to change; consult the source to determine
which fields are currently filled out by this option.
Additions specific to
.Nx
include
send window size,
receive window size,
and
bandwidth-controlled window space.
.\" range of 0 to N (where N is the
.\" .Xr sysctl 8
.\" variable
.\" .Dv net.inet.tcp.keepinit ).
.El
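.Pp
The keepalive options described above can be combined as in the following
sketch.
It is illustrative only: the helper names
.Fn set_keepalive
and
.Fn get_tcp_uint
are hypothetical, the timer values used by a caller are arbitrary, and
enabling
.Dv SO_KEEPALIVE
first is an assumption made for portability to systems that require it.
.Bd -literal
```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

/*
 * Hypothetical helper: enable keepalives on socket s and set the
 * idle time, probe interval (both in seconds) and probe count.
 * Returns 0 on success, -1 on failure with errno set.
 */
int
set_keepalive(int s, unsigned int idle, unsigned int intvl,
    unsigned int cnt)
{
	int on = 1;

	if (setsockopt(s, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) == -1)
		return -1;
	if (setsockopt(s, IPPROTO_TCP, TCP_KEEPIDLE,
	    &idle, sizeof(idle)) == -1)
		return -1;
	if (setsockopt(s, IPPROTO_TCP, TCP_KEEPINTVL,
	    &intvl, sizeof(intvl)) == -1)
		return -1;
	if (setsockopt(s, IPPROTO_TCP, TCP_KEEPCNT,
	    &cnt, sizeof(cnt)) == -1)
		return -1;
	return 0;
}

/*
 * Hypothetical helper: read back an unsigned int TCP-level option.
 * Returns 0 if the option cannot be read.
 */
unsigned int
get_tcp_uint(int s, int opt)
{
	unsigned int v = 0;
	socklen_t len = sizeof(v);

	if (getsockopt(s, IPPROTO_TCP, opt, &v, &len) == -1)
		return 0;
	return v;
}
```
.Ed
.Pp
For example, calling the first helper with the values 60, 10 and 5 asks for
a first probe after 60 seconds of idle time, a retransmission every
10 seconds, and at most 5 probes before the connection is considered broken.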
.Pp
The option level for the
.Xr setsockopt 2
call is the protocol number for
.Tn TCP ,
available from
.Xr getprotobyname 3 .
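.Pp
Looking up the option level and using it to set
.Dv TCP_NODELAY
might look like the following sketch.
The helper names
.Fn tcp_option_level
and
.Fn set_nodelay
are hypothetical, and falling back to
.Dv IPPROTO_TCP
when the protocols database is unavailable is an assumption of this example,
not documented behavior.
.Bd -literal
```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <netdb.h>
#include <stddef.h>

/*
 * Hypothetical helper: return the protocol number for TCP, which is
 * the option level expected by setsockopt(2) and getsockopt(2).
 */
int
tcp_option_level(void)
{
	struct protoent *pe;

	pe = getprotobyname("tcp");
	/* Assumed fallback if the protocols database has no entry. */
	return pe != NULL ? pe->p_proto : IPPROTO_TCP;
}

/*
 * Hypothetical helper: disable the small-packet gathering described
 * under TCP_NODELAY.  Returns 0 on success, -1 on failure.
 */
int
set_nodelay(int s)
{
	int on = 1;

	return setsockopt(s, tcp_option_level(), TCP_NODELAY,
	    &on, sizeof(on));
}
```
.Ed
.Pp
In practice many programs pass
.Dv IPPROTO_TCP
directly, which names the same protocol number.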
.Pp
In the historical
.Bx
.Tn TCP
implementation, if the
.Dv TCP_NODELAY
option was set on a passive socket, the sockets returned by
.Xr accept 2
erroneously did not have the
.Dv TCP_NODELAY
option set; the behavior was corrected to inherit
.Dv TCP_NODELAY
in
.Nx 1.6 .
.Pp
Options at the
.Tn IP
network level may be used with
.Tn TCP ;
see
.Xr ip 4
or
.Xr ip6 4 .
Incoming connection requests that are source-routed are noted,
and the reverse source route is used in responding.
.Pp
There are many adjustable parameters that control various aspects
of the
.Nx
TCP behavior; these parameters are documented in
.Xr sysctl 7 ,
and they include:
.Bl -bullet -compact
.It
RFC 1323 extensions for high performance
.It
Send/receive buffer sizes
.It
Default maximum segment size (MSS)
.It
SYN cache parameters
.It
Hughes/Touch/Heidemann Congestion Window Monitoring algorithm
.It
Keepalive parameters
.It
NewReno algorithm for congestion control
.It
Logging of connection refusals
.It
RST packet rate limits
.It
SACK (Selective Acknowledgment)
.It
ECN (Explicit Congestion Notification)
.It
Congestion window increase methods: traditional packet counting or
RFC 3465 Appropriate Byte Counting
.It
RFC 3390: Increased initial window size
.El
.Sh DIAGNOSTICS
A socket operation may fail with one of the following errors returned:
.Bl -tag -width [EADDRNOTAVAIL]
.It Bq Er EISCONN
when trying to establish a connection on a socket which
already has one;
.It Bq Er ENOBUFS
when the system runs out of memory for
an internal data structure;
.It Bq Er ETIMEDOUT
when a connection was dropped
due to excessive retransmissions;
.It Bq Er ECONNRESET
when the remote peer
forces the connection to be closed;
.It Bq Er ECONNREFUSED
when the remote
peer actively refuses connection establishment (usually because
no process is listening on the port);
.It Bq Er EADDRINUSE
when an attempt
is made to create a socket with a port which has already been
allocated;
.It Bq Er EADDRNOTAVAIL
when an attempt is made to create a
socket with a network address for which no network interface
exists.
.El
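.Pp
Two of the errors listed above can be provoked deliberately, as the following
sketch shows.
It is a minimal illustration under stated assumptions: all three helper
names are hypothetical, the loopback interface is assumed to be available,
and neither socket sets
.Dv SO_REUSEADDR .
.Bd -literal
```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: bind a TCP socket to the loopback address with
 * a system-assigned port and store the resulting address in *sin.
 * The socket is bound but never put into the listening state.
 */
static int
bound_loopback_socket(struct sockaddr_in *sin)
{
	socklen_t len = sizeof(*sin);
	int s;

	s = socket(AF_INET, SOCK_STREAM, 0);
	if (s == -1)
		return -1;
	memset(sin, 0, sizeof(*sin));
	sin->sin_family = AF_INET;
	sin->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	if (bind(s, (struct sockaddr *)sin, sizeof(*sin)) == -1 ||
	    getsockname(s, (struct sockaddr *)sin, &len) == -1) {
		close(s);
		return -1;
	}
	return s;
}

/* Returns the errno seen when a second socket binds the same port. */
int
second_bind_errno(void)
{
	struct sockaddr_in sin;
	int first, second, err = 0;

	first = bound_loopback_socket(&sin);
	if (first == -1)
		return -1;
	second = socket(AF_INET, SOCK_STREAM, 0);
	if (second == -1) {
		close(first);
		return -1;
	}
	if (bind(second, (struct sockaddr *)&sin, sizeof(sin)) == -1)
		err = errno;
	close(second);
	close(first);
	return err;
}

/* Returns the errno seen when connecting to a port nobody listens on. */
int
refused_connect_errno(void)
{
	struct sockaddr_in sin;
	int bound, c, err = 0;

	bound = bound_loopback_socket(&sin);
	if (bound == -1)
		return -1;
	c = socket(AF_INET, SOCK_STREAM, 0);
	if (c == -1) {
		close(bound);
		return -1;
	}
	if (connect(c, (struct sockaddr *)&sin, sizeof(sin)) == -1)
		err = errno;
	close(c);
	close(bound);
	return err;
}
```
.Ed
.Pp
Under these assumptions the first function is expected to return
.Er EADDRINUSE
and the second
.Er ECONNREFUSED ;
both return \-1 if the sockets could not be set up at all.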
.Sh SEE ALSO
.Xr getsockopt 2 ,
.Xr socket 2 ,
.Xr inet 4 ,
.Xr inet6 4 ,
.Xr intro 4 ,
.Xr ip 4 ,
.Xr ip6 4 ,
.Xr sysctl 7
.Rs
.%R RFC
.%N 793
.%D September 1981
.%T "Transmission Control Protocol"
.Re
.Rs
.%R RFC
.%N 1122
.%D October 1989
.%T "Requirements for Internet Hosts -- Communication Layers"
.Re
.Sh HISTORY
The
.Nm
protocol stack appeared in
.Bx 4.2 .