.\"
.\" This file and its contents are supplied under the terms of the
.\" Common Development and Distribution License ("CDDL"), version 1.0.
.\" You may only use this file in accordance with the terms of version
.\" 1.0 of the CDDL.
.\"
.\" A full copy of the text of the CDDL should have accompanied this
.\" source.  A copy of the CDDL is also available via the Internet at
.\" http://www.illumos.org/license/CDDL.
.\"
.\"
.\" Copyright 2016 Joyent, Inc.
.\"
.Dd August 2, 2018
.Dt BYTEORDER 7
.Os
.Sh NAME
.Nm byteorder ,
.Nm endian
.Nd byte order and endianness
.Sh DESCRIPTION
Integer values which occupy more than one byte in memory can be laid out
in different ways on different platforms.
In particular, there is a major split between platforms that place the least
significant byte of an integer at the lowest address, and those that place the
most significant byte there instead.
As this difference relates to which end of the integer is found in memory first,
the term
.Em endian
is used to refer to a particular byte order.
.Pp
A platform is referred to as using a
.Em big-endian
byte order when it places the most significant byte at the lowest
address, and
.Em little-endian
when it places the least significant byte first.
Some platforms may also switch between big- and little-endian modes and run code
compiled for either.
.Pp
Historically, there have also been some systems that utilized
.Em middle-endian
byte orders for integers larger than 2 bytes.
Such orderings are not in common use today.
.Pp
Endianness is also of particular importance when dealing with values
that are being read into memory from an external source.
For example, network protocols such as IP conventionally define the fields in a
packet as being always stored in big-endian byte order.
This means that a little-endian machine will have to perform transformations on
these fields in order to process them.
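.Pp
As a minimal sketch of such a transformation, the C function below assembles
a 32-bit value from four bytes that arrived in big-endian order.
Because it uses shifts rather than reinterpreting memory, it should give the
same result on a host of either byte order.
The name read_be32 is hypothetical and chosen for illustration; it is not a
system interface.

```c
#include <stdint.h>

/*
 * Hypothetical helper: build a 32-bit value from four bytes that were
 * stored in big-endian order, most significant byte first.
 */
uint32_t
read_be32(const unsigned char *p)
{
    return (((uint32_t)p[0] << 24) |
        ((uint32_t)p[1] << 16) |
        ((uint32_t)p[2] << 8) |
        (uint32_t)p[3]);
}
```

.Pp
Given the bytes 0xAA, 0xBB, 0xCC, 0xDD in that order, the function returns
0xAABBCCDD.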
.Ss Examples
To illustrate endianness in memory, let us consider the decimal integer
2864434397, which is 0xAABBCCDD in hexadecimal.
This number fits in 32 bits of storage (4 bytes).
.Pp
On a big-endian system, this integer would be written into memory as
the bytes 0xAA, 0xBB, 0xCC, 0xDD, in order from lowest memory address to
highest.
.Pp
On a little-endian system, it would be written instead as the bytes
0xDD, 0xCC, 0xBB, 0xAA, in that order.
.Pp
If both the big- and little-endian systems were asked to store this
integer at address 0x100, we would see the following in each system's
memory:
.Bd -literal

                    Big-Endian

        ++------++------++------++------++
        || 0xAA || 0xBB || 0xCC || 0xDD ||
        ++------++------++------++------++
            ^^      ^^      ^^      ^^
          0x100   0x101   0x102   0x103
            vv      vv      vv      vv
        ++------++------++------++------++
        || 0xDD || 0xCC || 0xBB || 0xAA ||
        ++------++------++------++------++

                  Little-Endian
.Ed
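.Pp
One way to observe this layout on a running system is to copy the in-memory
representation of an integer into a byte array and look at each element.
The sketch below does this with memcpy; the helper name store_bytes32 is
hypothetical.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: copy the in-memory representation of a 32-bit
 * integer into out[0..3], from lowest address to highest.
 */
void
store_bytes32(uint32_t value, unsigned char out[4])
{
    memcpy(out, &value, sizeof (value));
}
```

.Pp
For the value 0xAABBCCDD, a big-endian host would be expected to fill the
array with 0xAA, 0xBB, 0xCC, 0xDD, and a little-endian host with
0xDD, 0xCC, 0xBB, 0xAA.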
.Pp
It is particularly important to note that even though the byte order is
different between these two machines, the bit ordering within each byte,
by convention, is still the same.
.Pp
For example, take the decimal integer 4660 (0x1234 in hexadecimal), which
occupies 16 bits (2 bytes).
.Pp
On a big-endian system, this would be written into memory as 0x12, then
0x34.
.Pp
On a little-endian system, it would be written as 0x34, then 0x12.
Note that this is not at all the same as seeing 0x43 then 0x21 in memory --
only the bytes are re-ordered, not any bits (or nybbles) within them.
.Pp
As before, storing this at address 0x100:
.Bd -literal
                    Big-Endian

                ++------++------++
                || 0x12 || 0x34 ||
                ++------++------++
                    ^^      ^^
                  0x100   0x101
                    vv      vv
                ++------++------++
                || 0x34 || 0x12 ||
                ++------++------++

                   Little-Endian
.Ed
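.Pp
The distinction between re-ordering bytes and re-ordering nybbles can be
sketched in C.
The hypothetical helper swap16 below exchanges the two bytes of a 16-bit
value and leaves the contents of each byte untouched.

```c
#include <stdint.h>

/* Hypothetical helper: exchange the two bytes of a 16-bit value. */
uint16_t
swap16(uint16_t v)
{
    return ((uint16_t)((v << 8) | (v >> 8)));
}
```

.Pp
Applied to 0x1234 this yields 0x3412, not 0x4321, and applying it twice
returns the original value.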
.Pp
This example shows how an eight-byte number, 0xBADCAFFEDEADBEEF, is stored
in both big- and little-endian:
.Bd -literal
                        Big-Endian

    +------+------+------+------+------+------+------+------+
    | 0xBA | 0xDC | 0xAF | 0xFE | 0xDE | 0xAD | 0xBE | 0xEF |
    +------+------+------+------+------+------+------+------+
       ^^     ^^     ^^     ^^     ^^     ^^     ^^     ^^
     0x100  0x101  0x102  0x103  0x104  0x105  0x106  0x107
       vv     vv     vv     vv     vv     vv     vv     vv
    +------+------+------+------+------+------+------+------+
    | 0xEF | 0xBE | 0xAD | 0xDE | 0xFE | 0xAF | 0xDC | 0xBA |
    +------+------+------+------+------+------+------+------+

                       Little-Endian

.Ed
.Pp
The treatment of different endian values would not be complete without
discussing
.Em PDP-endian ,
which is also known as
.Em middle-endian .
While the PDP-11 was a 16-bit little-endian system, it laid out 32-bit
values in a different way from current little-endian systems.
First, it would divide a 32-bit number into two 16-bit numbers.
Each 16-bit number would be stored in little-endian; however, the two 16-bit
words would be stored with the more significant 16-bit word appearing first in
memory, followed by the less significant word.
.Pp
The following diagram illustrates PDP-endian and compares it against
little-endian values.
Here, we'll start with the value 0xAABBCCDD and show how the four bytes for it
will be laid out, starting at 0x100.
.Bd -literal
                    PDP-Endian

        ++------++------++------++------++
        || 0xBB || 0xAA || 0xDD || 0xCC ||
        ++------++------++------++------++
            ^^      ^^      ^^      ^^
          0x100   0x101   0x102   0x103
            vv      vv      vv      vv
        ++------++------++------++------++
        || 0xDD || 0xCC || 0xBB || 0xAA ||
        ++------++------++------++------++

                  Little-Endian

.Ed
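.Pp
The two-step PDP-endian layout described above can be sketched as follows.
The function name store_pdp32 is hypothetical; the code only reproduces the
byte order shown in the diagram and is not taken from any PDP-11 software.

```c
#include <stdint.h>

/*
 * Hypothetical helper: write a 32-bit value into out[0..3] in PDP-endian
 * order. The more significant 16-bit word comes first, and each word is
 * stored with its least significant byte first.
 */
void
store_pdp32(uint32_t v, unsigned char out[4])
{
    uint16_t hi = (uint16_t)(v >> 16);
    uint16_t lo = (uint16_t)(v & 0xFFFF);

    out[0] = (unsigned char)(hi & 0xFF);
    out[1] = (unsigned char)(hi >> 8);
    out[2] = (unsigned char)(lo & 0xFF);
    out[3] = (unsigned char)(lo >> 8);
}
```

.Pp
For 0xAABBCCDD the buffer receives 0xBB, 0xAA, 0xDD, 0xCC, matching the upper
row of the diagram.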
.Ss Network Byte Order
The term 'network byte order' refers to big-endian ordering, and
originates from the IEEE.
Early disagreements over which byte ordering to use for network traffic prompted
RFC 1700 to specify that all IETF-specified network protocols use big-endian
ordering unless explicitly noted otherwise.
The Internet protocol family (IP, and thus TCP, UDP, etc.) in particular adheres
to this convention.
.Ss Determining the System's Byte Order
The operating system supports both big-endian and little-endian CPUs.
To make it easier for programs to determine the endianness of the platform they
are being compiled for, functions and macro constants are provided in the system
header files.
.Pp
The endianness of the system can be obtained by including the header
.In sys/types.h
and using the preprocessor macros
.Sy _LITTLE_ENDIAN
and
.Sy _BIG_ENDIAN .
See
.Xr types.h 3HEAD
for more information.
.Pp
Additionally, the header
.In endian.h
defines an alternative means for determining the endianness of the
current system.
See
.Xr endian.h 3HEAD
for more information.
.Pp
illumos runs on both big- and little-endian systems.
When writing software for which the endianness is important, one must always
check the byte order and convert it appropriately.
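.Pp
A minimal sketch of such a check is shown below.
It assumes the convention that exactly one of _LITTLE_ENDIAN and _BIG_ENDIAN
is defined after including sys/types.h; platforms that define both as numeric
constants would need a different test.
The function name host_is_little_endian and the run-time fallback are
illustrative additions, not system interfaces.

```c
#include <sys/types.h>

/*
 * Returns 1 on a little-endian host and 0 on a big-endian host.
 * Assumes exactly one of the two macros is defined; the run-time
 * fallback covers headers that define neither.
 */
int
host_is_little_endian(void)
{
#if defined(_LITTLE_ENDIAN)
    return (1);
#elif defined(_BIG_ENDIAN)
    return (0);
#else
    const unsigned int one = 1;

    /* The lowest-addressed byte holds the 1 only on little-endian. */
    return (*(const unsigned char *)&one == 1);
#endif
}
```

.Pp
Preferring the preprocessor macros lets the compiler discard the unused
branch, while the fallback keeps the sketch usable where the macros are
absent.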
.Ss Converting Between Byte Orders
The system provides two different sets of functions to convert values
between big-endian and little-endian.
They are defined in
.Xr byteorder 3C
and
.Xr endian 3C .
.Pp
The
.Xr byteorder 3C
family of functions converts data between the host's native byte order
and network byte order, which is big-endian.
The functions operate on 16-bit, 32-bit, or 64-bit values.
Functions that convert from network byte order to the host's byte order
start with the string
.Sy ntoh ,
while functions which convert from the host's byte order to network byte
order begin with
.Sy hton .
For example, to convert a 32-bit value (historically a long, hence the l
suffix) from network byte order to the host's, one would use the function
.Xr ntohl 3C .
.Pp
These functions have been standardized by POSIX.
However, the 64-bit variants,
.Xr ntohll 3C
and
.Xr htonll 3C
are not standardized and may not be found on other systems.
For more information on these functions, see
.Xr byteorder 3C .
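.Pp
The following sketch shows one way the 32-bit functions might be used to
place a value into a packet buffer and read it back.
The helper names put_net32 and get_net32 are hypothetical, and the choice of
arpa/inet.h as the header follows POSIX and is an assumption here; see
.Xr byteorder 3C
for the headers the system documents.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: place a host-order 32-bit value into a buffer
 * in network byte order.
 */
void
put_net32(uint32_t host_value, unsigned char out[4])
{
    uint32_t wire = htonl(host_value);

    memcpy(out, &wire, sizeof (wire));
}

/* Hypothetical helper: the inverse of put_net32(). */
uint32_t
get_net32(const unsigned char in[4])
{
    uint32_t wire;

    memcpy(&wire, in, sizeof (wire));
    return (ntohl(wire));
}
```

.Pp
Regardless of the host's byte order, put_net32 with 0xAABBCCDD should leave
0xAA, 0xBB, 0xCC, 0xDD in the buffer, and get_net32 should recover the
original value.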
.Pp
The second family of functions,
.Xr endian 3C ,
provides a means to convert between the host's byte order
and big-endian and little-endian specifically.
While these functions are similar to those in
.Xr byteorder 3C ,
they more explicitly cover different data conversions.
Like them, these functions operate on 16-bit, 32-bit, or 64-bit values.
When converting from big-endian to the host's endianness, the functions
begin with
.Sy betoh .
If, instead, one is converting data from the host's native endianness to
big-endian, then the functions begin with
.Sy htobe .
When working with little-endian data, the prefixes
.Sy letoh
and
.Sy htole
convert little-endian data to the host's endianness and from the host's
to little-endian, respectively.
.Pp
These functions are not standardized and the header they appear in varies
between the BSDs and GNU/Linux.
Applications that wish to be portable should instead use the
.Xr byteorder 3C
functions.
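.Pp
As an illustration of the host-to-fixed-order direction, the sketch below uses
htobe32 and htole32 to fill buffers in each byte order.
It assumes the functions are declared in endian.h, as named earlier in this
page; as noted above, other systems may place them elsewhere.
The helper names put_be32 and put_le32 and the _DEFAULT_SOURCE definition
(intended for glibc) are assumptions of this example.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* assumption: exposes these functions on glibc */
#endif
#include <endian.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: store a host-order value as big-endian bytes. */
void
put_be32(uint32_t host_value, unsigned char out[4])
{
    uint32_t be = htobe32(host_value);

    memcpy(out, &be, sizeof (be));
}

/* Hypothetical helper: store a host-order value as little-endian bytes. */
void
put_le32(uint32_t host_value, unsigned char out[4])
{
    uint32_t le = htole32(host_value);

    memcpy(out, &le, sizeof (le));
}
```

.Pp
For 0xAABBCCDD, put_be32 should produce 0xAA, 0xBB, 0xCC, 0xDD and put_le32
should produce 0xDD, 0xCC, 0xBB, 0xAA on any host.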
.Pp
All of the functions in both families simply return their input when
the host's native byte order is the same as the desired order.
For example, when calling
.Xr htonl 3C
on a big-endian system, the original data is returned with no conversion
or modification.
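.Pp
To make this behavior concrete, the sketch below shows how a host-to-big-endian
conversion could be written by hand.
It is not the system's implementation; the names swap32 and sketch_htobe32 are
hypothetical, and a real implementation would normally decide at compile time
rather than probing at run time.

```c
#include <stdint.h>

/* Hypothetical helper: reverse the four bytes of a 32-bit value. */
static uint32_t
swap32(uint32_t v)
{
    return ((v >> 24) | ((v >> 8) & 0x0000FF00U) |
        ((v << 8) & 0x00FF0000U) | (v << 24));
}

/*
 * Hypothetical stand-in for a host-to-big-endian conversion: the input
 * is returned unchanged on a big-endian host and byte-swapped on a
 * little-endian one.
 */
uint32_t
sketch_htobe32(uint32_t v)
{
    const uint16_t probe = 1;
    int little = (*(const unsigned char *)&probe == 1);

    return (little ? swap32(v) : v);
}
```

.Pp
Because the swap is its own inverse, the same routine also converts a
big-endian value back to the host's order.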
.Sh SEE ALSO
.Xr byteorder 3C ,
.Xr endian 3C ,
.Xr endian.h 3HEAD ,
.Xr inet 3HEAD