xref: /freebsd-src/contrib/llvm-project/lld/docs/ELF/warn_backrefs.rst (revision fe6060f10f634930ff71b7c50291ddc610da2475)
--warn-backrefs
===============

``--warn-backrefs`` gives a warning when an undefined symbol reference is
resolved by a definition in an archive to the left of it on the command line.

A linker such as GNU ld makes a single pass over the input files from left to
right, maintaining the set of undefined symbol references from the files loaded
so far. When it encounters an archive, or object files surrounded by
``--start-lib`` and ``--end-lib``, it searches that archive for definitions
that resolve those references; this may result in input files being loaded,
updating the set of undefined symbol references. When all resolving definitions
have been loaded from the archive, the linker moves on to the next file and
will not return to it. This means that an input file to the right of an archive
cannot have an undefined symbol resolved by an archive to the left of it. For
example:

    ld def.a ref.o

will result in an ``undefined reference`` error. If there are no cyclic
references, the archives can be ordered in such a way that there are no
backward references. If there are cyclic references then the ``--start-group``
and ``--end-group`` options can be used, or the same archive can be placed on
the command line twice.

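The single-pass behaviour described above can be illustrated with a toy model.
This is only a sketch of the semantics, not GNU ld's implementation: the
function ``link_single_pass`` and the tuple layout used for inputs are
hypothetical, and symbols are reduced to plain strings.

```python
def link_single_pass(inputs):
    """Toy model of a left-to-right, single-pass archive search.

    inputs is a list of ("obj", name, defs, refs) or ("lib", name, members),
    where members is a list of (member_name, defs, refs) tuples.
    Returns the set of symbols that are still undefined at the end.
    """
    defined, undefined = set(), set()

    def load(defs, refs):
        # Loading a file defines its symbols and adds its unresolved references.
        defined.update(defs)
        undefined.difference_update(defs)
        undefined.update(r for r in refs if r not in defined)

    for item in inputs:
        if item[0] == "obj":
            _, _name, defs, refs = item
            load(defs, refs)
            continue
        _, _name, members = item
        pending = list(members)
        progress = True
        while progress:  # rescan this archive until no member resolves anything
            progress = False
            for member in list(pending):
                _member_name, defs, refs = member
                if undefined & set(defs):
                    load(defs, refs)
                    pending.remove(member)
                    progress = True
        # The archive is now forgotten; later references cannot pull from it.
    return undefined


def_a = ("lib", "def.a", [("def.o", {"foo"}, set())])
ref_o = ("obj", "ref.o", {"main"}, {"foo"})
print(link_single_pass([def_a, ref_o]))  # {'foo'}: archive precedes the reference
print(link_single_pass([ref_o, def_a]))  # set(): reference precedes the archive
```

In this model, ``def.a ref.o`` leaves ``foo`` undefined because ``def.a`` is
searched before anything refers to ``foo``.
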
LLD remembers the symbol table of archives that it has previously seen, so if
there is a reference from an input file to the right of an archive, LLD will
still search that archive for resolving any undefined references. This means
that an archive only needs to be included once on the command line and the
``--start-group`` and ``--end-group`` options are redundant.

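A toy model of this "remembered symbol table" behaviour is sketched below. It
is an illustration under simplifying assumptions, not LLD's actual code: the
name ``link_lld_style``, the ``lazy`` table, and the input tuple layout are
hypothetical.

```python
def link_lld_style(inputs):
    """Toy model: archive members stay available as lazy definitions.

    inputs is a list of ("obj", name, defs, refs) or ("lib", name, members),
    where members is a list of (member_name, defs, refs) tuples.
    Returns (undefined symbols, list of loaded files in load order).
    """
    defined, undefined = set(), set()
    lazy = {}    # symbol -> (file name, defs, refs) of a not-yet-loaded member
    loaded = []  # files actually pulled into the link

    def load(name, defs, refs):
        loaded.append(name)
        defined.update(defs)
        undefined.difference_update(defs)
        for sym in sorted(refs):
            if sym in defined:
                continue
            if sym in lazy:
                load(*lazy.pop(sym))  # pull in a member seen earlier
            else:
                undefined.add(sym)

    for item in inputs:
        if item[0] == "obj":
            _, name, defs, refs = item
            load(name, defs, refs)
            continue
        _, archive, members = item
        for member, defs, refs in members:
            name = f"{archive}({member})"
            for sym in sorted(defs):
                if sym in defined:
                    continue
                if sym in undefined:
                    load(name, defs, refs)
                    break
                lazy.setdefault(sym, (name, defs, refs))  # first archive wins
    return undefined, loaded


def_a = ("lib", "def.a", [("def.o", {"foo"}, set())])
ref_o = ("obj", "ref.o", {"main"}, {"foo"})
print(link_lld_style([def_a, ref_o]))  # (set(), ['ref.o', 'def.a(def.o)'])
```

Here ``def.a ref.o`` links successfully because the definition of ``foo`` is
still known when ``ref.o`` is processed.
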
A consequence of the differing archive searching semantics is that the same
linker command line can result in different outcomes. A link may succeed with
LLD but fail with GNU ld, or, even worse, both links may succeed but select
different objects from different archives that both define the same symbols.

The ``--warn-backrefs`` option provides information that helps identify cases
where LLD and GNU ld archive selection may differ.

    | % ld.lld --warn-backrefs ... -lB -lA
    | ld.lld: warning: backward reference detected: system in A.a(a.o) refers to B.a(b.o)

    | % ld.lld --warn-backrefs ... --start-lib B/b.o --end-lib --start-lib A/a.o --end-lib
    | ld.lld: warning: backward reference detected: system in A/a.o refers to B/b.o

    # To suppress the warning, you can specify --warn-backrefs-exclude=<glob> to match B/b.o or B.a(b.o)

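The warning and its exclusion glob can be approximated as follows. This is a
sketch only: ``backref_warnings`` is a hypothetical helper, every entry is
assumed to be an archive member (or a ``--start-lib`` object), and Python's
``fnmatch`` is used as a stand-in for the linker's glob matching, which may
differ in detail.

```python
from fnmatch import fnmatchcase


def backref_warnings(order, resolutions, exclude=()):
    """Report references resolved by a lazy file further left on the command line.

    order:       file names in command-line order, e.g. ["B.a(b.o)", "A.a(a.o)"]
    resolutions: (symbol, referencing_file, defining_file) triples
    exclude:     glob patterns matched against the defining file
    """
    position = {name: index for index, name in enumerate(order)}
    warnings = []
    for symbol, referrer, definer in resolutions:
        if position[definer] >= position[referrer]:
            continue  # forward reference: both linker models agree
        if any(fnmatchcase(definer, pattern) for pattern in exclude):
            continue  # suppressed, as with --warn-backrefs-exclude=<glob>
        warnings.append(
            f"warning: backward reference detected: "
            f"{symbol} in {referrer} refers to {definer}"
        )
    return warnings


order = ["B.a(b.o)", "A.a(a.o)"]
resolutions = [("system", "A.a(a.o)", "B.a(b.o)")]
print(backref_warnings(order, resolutions))
print(backref_warnings(order, resolutions, exclude=["B.a(*)"]))
```
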
The ``--warn-backrefs`` option can also provide a check to enforce a
topological order of archives, which can be useful to detect layering
violations (albeit unable to catch all cases). There are two cases where GNU ld
will result in an ``undefined reference`` error:

* If adding the dependency does not form a cycle: conceptually ``A`` is a
  higher-level library while ``B`` is at a lower level. When you are developing
  an application ``P`` which depends on ``A``, but does not directly depend on
  ``B``, your link may fail surprisingly with ``undefined symbol:
  symbol_defined_in_B`` if the used/linked part of ``A`` happens to need some
  components of ``B``. It is inappropriate for ``P`` to add a dependency on
  ``B`` since ``P`` does not use ``B`` directly.
* If adding the dependency forms a cycle, e.g. ``B->C->A ~> B``. ``A``
  is supposed to be at the lowest level while ``B`` is supposed to be at the
  highest level. When you are developing ``C_test`` testing ``C``, your link may
  fail surprisingly with ``undefined symbol`` if there is somehow a dependency on
  some components of ``B``. You could fix the issue by adding the missing
  dependency (``B``); however, then every test (``A_test``, ``B_test``,
  ``C_test``) will link against every library. This defeats the purpose
  of splitting ``B``, ``C`` and ``A`` into separate libraries and makes binaries
  unnecessarily large. Moreover, the layering violation makes lower-level
  libraries (e.g. ``A``) vulnerable to changes to higher-level libraries (e.g.
  ``B``, ``C``).

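The layering idea in the two cases above can be checked outside the linker with
a topological sort. The sketch below is an illustration, not part of LLD:
``link_order`` is a hypothetical helper, the dependency map is assumed to be
declared by the build system, and ``graphlib`` requires Python 3.9 or later.

```python
from graphlib import CycleError, TopologicalSorter


def link_order(deps):
    """Return a left-to-right library order with no backward references.

    deps maps each library to the set of libraries it references.
    Raises CycleError if the declared dependencies form a cycle.
    """
    # static_order() yields dependencies before their dependents, so reverse
    # it to place each library to the left of everything it references.
    return list(TopologicalSorter(deps).static_order())[::-1]


print(link_order({"P": {"A"}, "A": {"B"}}))  # ['P', 'A', 'B']

try:
    link_order({"B": {"C"}, "C": {"A"}, "A": {"B"}})
except CycleError:
    print("cycle: no backward-reference-free order exists")
```

When the declared dependencies are acyclic, an order like this produces no
backward references; a cycle means one of the resolutions listed in this
document is needed.
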
Resolution:

* Add a dependency from ``A`` to ``B``.
* The reference may be unintended and can be removed.
* The dependency may be intentionally omitted because there are multiple
  libraries like ``B``.  Consider linking ``B`` with object semantics by
  surrounding it with ``--whole-archive`` and ``--no-whole-archive``.
* In the case of a circular dependency, merging the libraries is sometimes the
  best option.

There are two cases like a library sandwich where GNU ld will select a
different object.

* ``A.a B A2.so``: ``A.a`` may be used as an interceptor (e.g. it provides some
  optimized libc functions and ``A2`` is libc).  ``B`` does not need to know
  about ``A.a``, and ``A.a`` may be pulled into the link by another part of the
  program. For linker portability, consider ``--whole-archive`` and
  ``--no-whole-archive``.

* ``A.a B A2.a``: similar to the above case but ``--warn-backrefs`` does not
  flag the problem, because ``A2.a`` may be a duplicate of ``A.a``, which is
  redundant but benign. In some cases ``A.a`` and ``B`` should be surrounded by
  a pair of ``--start-group`` and ``--end-group``. This is especially common
  among system libraries (e.g. ``-lc __isnanl references -lm``, ``-lc
  _IO_funlockfile references -lpthread``, ``-lc __gcc_personality_v0 references
  -lgcc_eh``, and ``-lpthread _Unwind_GetCFA references -lunwind``).

  In C++, this is likely an ODR violation. We probably need a dedicated option
  for ODR detection.