Large data sections
===================

When linking very large binaries, lld may report relocation overflows like

::

  relocation R_X86_64_PC32 out of range: 2158227201 is not in [-2147483648, 2147483647]

This happens when the link runs into architectural limitations. For example, in
x86-64 PIC code, a reference to a static global variable is typically made with
an ``R_X86_64_PC32`` relocation, which encodes a signed 32-bit offset from the
PC. If the global variable is laid out more than 2GB (2^31 bytes) away from the
instruction referencing it, the offset does not fit and the relocation
overflows.

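The range check behind that error can be sketched in a few lines of Python.
This is an illustration, not lld's implementation: the function names and the
addresses are hypothetical, and the formula ``S + A - P`` (symbol address plus
addend minus the address of the relocated field) is the psABI definition of
``R_X86_64_PC32``.

.. code-block:: python

  INT32_MIN = -(2**31)
  INT32_MAX = 2**31 - 1

  def pc32_value(sym_addr, addend, place):
      """Value a linker would store for R_X86_64_PC32: S + A - P."""
      return sym_addr + addend - place

  def fits_pc32(value):
      """True if the value is representable as a signed 32-bit integer."""
      return INT32_MIN <= value <= INT32_MAX

  # Hypothetical addresses, chosen only to reproduce the value in the error
  # message above. An addend of -4 is common for a RIP-relative operand.
  place = 0x1000
  sym_addr = place + 2158227201 + 4
  value = pc32_value(sym_addr, -4, place)
  print(value, fits_pc32(value))  # 2158227201 False
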
lld normally lays out sections as follows:

.. image:: section_layout.png

The largest relocation pressure is usually from ``.text`` to the beginning of
``.rodata`` or ``.text`` to the end of ``.bss``.

Some code models offer a tradeoff between relocation pressure and performance.
For example, x86-64's medium code model splits global variables into small and
large globals depending on whether their size exceeds a threshold. Large
globals are placed farther away from text and are accessed through 64-bit
references.

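As a rough sketch of that split, the following Python models how a compiler
might choose a section for a global under the medium code model. The threshold
value, the function names, and the exact comparison are assumptions made for
illustration; real compilers have their own defaults and options. The ``.l``
section names are the ones described in the next paragraph.

.. code-block:: python

  # Hypothetical threshold in bytes, not a documented default.
  LARGE_DATA_THRESHOLD = 65536

  def data_section(kind, size, threshold=LARGE_DATA_THRESHOLD):
      """Pick a section for a global; kind is 'rodata', 'data' or 'bss'."""
      if kind not in ("rodata", "data", "bss"):
          raise ValueError(kind)
      prefix = ".l" if size > threshold else "."
      return prefix + kind

  def reference_width(section):
      """Bits used to reference a global: 64 for large sections, else 32."""
      return 64 if section.startswith(".l") else 32

  for kind, size in [("bss", 4096), ("rodata", 1 << 20)]:
      section = data_section(kind, size)
      print(section, reference_width(section))
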
Large globals are placed in separate sections from small globals, and those
sections have a "large" section flag, e.g. ``SHF_X86_64_LARGE`` for x86-64. The
linker places large sections on the outer edges of the binary, making sure they
do not affect the distance from small globals to text. The large versions of
``.rodata``, ``.bss``, and ``.data`` are ``.lrodata``, ``.lbss``, and
``.ldata``, and they are laid out as follows:

.. image:: large_section_layout_pic.png

We try to keep the number of ``PT_LOAD`` segments to a minimum, so we place
large sections next to the small sections with the same RWX permissions when
possible.

``.lbss`` is right after ``.bss`` so that they are merged together and we
minimize the number of segments with ``p_memsz > p_filesz``.

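The effect of putting large sections on the outer edges can be illustrated
with a toy layout model. The section orders and sizes below are assumptions
made for this sketch (the figures remain the authoritative description), and
the model ignores alignment, other sections, and segment padding. It shows
that the span from the start of ``.rodata`` to the end of ``.bss``, which
bounds the distance from ``.text`` to any small global, is the same with and
without large sections.

.. code-block:: python

  GIB = 2**30

  def lay_out(order, sizes):
      """Assign consecutive start addresses to the sections in `order`."""
      addr, starts = 0, {}
      for name in order:
          starts[name] = addr
          addr += sizes[name]
      return starts

  def small_span(starts, sizes):
      """Distance from the start of .rodata to the end of .bss."""
      return starts[".bss"] + sizes[".bss"] - starts[".rodata"]

  # Assumed orders, not read from lld.
  SMALL_ONLY = [".rodata", ".text", ".data", ".bss"]
  PIC_LARGE = [".lrodata", ".rodata", ".text", ".data", ".bss", ".lbss", ".ldata"]

  # Hypothetical section sizes.
  sizes = {".rodata": GIB // 2, ".text": GIB // 2,
           ".data": GIB // 4, ".bss": GIB // 4,
           ".lrodata": 3 * GIB, ".lbss": 2 * GIB, ".ldata": GIB}

  before = small_span(lay_out(SMALL_ONLY, sizes), sizes)
  after = small_span(lay_out(PIC_LARGE, sizes), sizes)
  print(before == after, after < 2**31)  # True True
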
Note that the above applies to PIC code. In the less common case of non-PIC
code, which uses absolute relocations instead of relative relocations, 32-bit
relocations typically assume that symbols are in the lower 2GB of the address
space. For non-PIC code, large sections should therefore be placed after all
small sections so that ``.lrodata`` does not push small symbols out of the
lower 2GB of the address space. ``-z lrodata-after-bss`` changes the layout to
be:

.. image:: large_section_layout_nopic.png

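
The non-PIC constraint can be checked with a similar toy model. As before, the
section orders, the sizes, and the base address of 0 are assumptions made for
illustration and ignore alignment and other sections. The check is whether
every small section ends below 2^31 so that 32-bit absolute relocations can
reach it.

.. code-block:: python

  GIB = 2**30
  LOW_2GB = 2**31

  def small_end(order, sizes):
      """End address of the last small section when packed in `order`."""
      addr, end = 0, 0
      for name in order:
          addr += sizes[name]
          if not name.startswith(".l"):
              end = addr
      return end

  # Assumed orders, not read from lld.
  PIC_ORDER = [".lrodata", ".rodata", ".text", ".data", ".bss", ".lbss", ".ldata"]
  NOPIC_ORDER = [".rodata", ".text", ".data", ".bss", ".lrodata", ".lbss", ".ldata"]

  # Hypothetical section sizes.
  sizes = {".rodata": GIB // 2, ".text": GIB // 2,
           ".data": GIB // 4, ".bss": GIB // 4,
           ".lrodata": 3 * GIB, ".lbss": 2 * GIB, ".ldata": GIB}

  print(small_end(PIC_ORDER, sizes) <= LOW_2GB)    # False
  print(small_end(NOPIC_ORDER, sizes) <= LOW_2GB)  # True
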