=============================================
Enable std::unique_ptr [[clang::trivial_abi]]
=============================================

Background
==========

Consider the following snippets:


.. code-block:: cpp

    void raw_func(Foo* raw_arg) { ... }
    void smart_func(std::unique_ptr<Foo> smart_arg) { ... }

    Foo* raw_ptr_retval() { ... }
    std::unique_ptr<Foo> smart_ptr_retval() { ... }



The argument ``raw_arg`` can be passed in a register, but ``smart_arg`` cannot, because of how
``std::unique_ptr`` is currently implemented.

Specifically, in the ``smart_arg`` case, the caller implicitly constructs a temporary ``std::unique_ptr``
in its stack frame and passes a pointer to that temporary to the callee as a hidden parameter.
Similarly, the storage for the return value of ``smart_ptr_retval`` is implicitly allocated in the caller and
passed to the callee as a hidden reference.


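The hidden-parameter scheme described above can be modelled by hand. The following is only a sketch of what the compiler does implicitly, not actual generated code: ``smart_func_lowered`` and ``call_smart_func_lowered`` are hypothetical names, and ``Foo`` is given a single ``int`` member purely for illustration. The ``static_assert`` lines record the property that triggers indirect passing: ``std::unique_ptr`` has a non-trivial destructor and move constructor, whereas a raw pointer is trivially copyable.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

struct Foo { int value = 0; };  // illustrative payload, not from the original snippets

// A raw pointer is trivially copyable, so it can travel in a register.
static_assert(std::is_trivially_copyable<Foo*>::value, "raw pointers are trivial");
// std::unique_ptr has non-trivial special members, so without trivial_abi
// it is passed through a hidden pointer to a caller-owned temporary.
static_assert(!std::is_trivially_destructible<std::unique_ptr<Foo>>::value,
              "unique_ptr has a non-trivial destructor");
static_assert(!std::is_trivially_move_constructible<std::unique_ptr<Foo>>::value,
              "unique_ptr has a non-trivial move constructor");

// Hand-written model of the callee: it receives the address of the temporary.
int smart_func_lowered(std::unique_ptr<Foo>* hidden_arg) {
    return (*hidden_arg)->value;
}

// Hand-written model of the caller.
int call_smart_func_lowered(int v) {
    std::unique_ptr<Foo> temp = std::make_unique<Foo>();  // lives in the caller's frame
    temp->value = v;
    int result = smart_func_lowered(&temp);               // hidden pointer parameter
    return result;                                        // the caller destroys temp here
}
```

In this model the callee never owns the ``std::unique_ptr`` object itself, which is why the caller ends up running its destructor.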
Goal
====

``std::unique_ptr`` is passed directly in a register.

Design
======

* Annotate the two definitions of ``std::unique_ptr`` (the primary template and the array specialization) with the ``clang::trivial_abi`` attribute.
* Put the attribute behind a flag, because this change can cause compilation and runtime breakages.


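To illustrate what the annotation looks like on a user-defined type, here is a minimal sketch. ``OwningPtr``, ``Tracked``, ``consume``, ``demo_consume`` and the ``DEMO_TRIVIAL_ABI`` macro are all hypothetical names invented for this example; this is not how libc++ spells its own flag. The macro expands to ``[[clang::trivial_abi]]`` only when the compiler reports support for it, so the code behaves the same on other compilers and only the calling convention differs.

```cpp
#include <cassert>
#include <utility>

// Hypothetical opt-in macro: use the attribute only where it is available.
#if defined(__has_cpp_attribute)
#  if __has_cpp_attribute(clang::trivial_abi)
#    define DEMO_TRIVIAL_ABI [[clang::trivial_abi]]
#  endif
#endif
#ifndef DEMO_TRIVIAL_ABI
#  define DEMO_TRIVIAL_ABI
#endif

// Counts destructor calls so ownership transfer can be observed.
struct Tracked {
    static int destroyed;
    int id;
    ~Tracked() { ++destroyed; }
};
int Tracked::destroyed = 0;

// A stripped-down owning pointer, loosely shaped like std::unique_ptr.
template <class T>
class DEMO_TRIVIAL_ABI OwningPtr {
public:
    explicit OwningPtr(T* p = nullptr) noexcept : ptr_(p) {}
    OwningPtr(OwningPtr&& other) noexcept : ptr_(other.ptr_) { other.ptr_ = nullptr; }
    OwningPtr(const OwningPtr&) = delete;
    OwningPtr& operator=(const OwningPtr&) = delete;
    ~OwningPtr() { delete ptr_; }
    T* get() const noexcept { return ptr_; }
private:
    T* ptr_;
};

// Takes ownership by value; the pointee is gone once the parameter is destroyed.
int consume(OwningPtr<Tracked> p) { return p.get()->id; }

int demo_consume() {
    OwningPtr<Tracked> p(new Tracked{7});
    return consume(std::move(p));
}
```

With or without the attribute, exactly one ``Tracked`` object is destroyed per call to ``demo_consume``; what changes is whether the parameter is destroyed by the callee or by the caller.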
This comes with some side effects:

* ``std::unique_ptr`` parameters will now be destroyed by callees, rather than callers.
  Destruction by the callee is not unique to the ``trivial_abi`` attribute:
  in most of Microsoft's ABIs, arguments are always destroyed by the callee.

  Consequently, this may change the destruction order of function parameters to an order that does not conform to the standard.
  For example:


  .. code-block:: cpp

    struct A { ~A(); };
    struct B { ~B(); };
    struct C { C(A, std::unique_ptr<B>, A) {} };
    C c{{}, std::make_unique<B>(), {}};


  In a conforming implementation, the destruction order for the parameters of ``C::C`` is required to be ``~A(), ~B(), ~A()``, but with this mode enabled we will instead see ``~B(), ~A(), ~A()``.

* Reduced code size.


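The destruction-order example above can be made observable with a small program. This is a minimal sketch: ``destruction_log``, ``count_in_log`` and ``record_order`` are hypothetical helpers, not part of libc++. The order it records depends on the compiler, the ABI, and whether the trivial-ABI mode is enabled, so no particular output is claimed here.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <string>

// Shared log: each destructor appends one character.
std::string& destruction_log() {
    static std::string log;
    return log;
}

struct A { ~A() { destruction_log() += 'A'; } };
struct B { ~B() { destruction_log() += 'B'; } };
struct C { C(A, std::unique_ptr<B>, A) {} };

// Builds a C and returns the order in which parameter destructors ran.
std::string record_order() {
    destruction_log().clear();
    {
        C c{A{}, std::make_unique<B>(), A{}};
        (void)c;
    }
    return destruction_log();
}

// Helper: how many times did a given destructor run?
long count_in_log(const std::string& log, char which) {
    return static_cast<long>(std::count(log.begin(), log.end(), which));
}
```

Printing the result of ``record_order()`` once with and once without the mode enabled is a quick way to see whether a given toolchain reorders the destructors.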
Performance impact
------------------

Google has measured performance improvements of up to 1.6% on some large server macrobenchmarks, and a small reduction in binary sizes.

This also affects null pointer optimization.

Clang's optimizer can now figure out when a ``std::unique_ptr`` is known to be *non*-null.
(Actually, this has been a *missed* optimization all along.)


.. code-block:: cpp

    struct Foo {
      ~Foo();
    };
    std::unique_ptr<Foo> make_foo();
    void do_nothing(const Foo&);

    void bar() {
      auto x = make_foo();
      do_nothing(*x);
    }


With this change, ``~Foo()`` will be called even if ``make_foo`` returns ``unique_ptr<Foo>(nullptr)``.
The compiler can now assume that ``x.get()`` cannot be null by the end of ``bar()``, because
dereferencing ``x`` would be undefined behavior if it were ``nullptr``. (The dereference itself would not have caused
a segfault, because no load is generated when a dereferenced pointer is only bound to a reference. It can be detected with ``-fsanitize=null``.)


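If ``make_foo`` may legitimately return null, the dereference has to be guarded explicitly so that the optimizer has nothing to assume. The following is a sketch of such a guard, not a recommendation from the original design: ``bar_checked``, the ``want_null`` parameter, and the two counters are assumptions added only so that both paths can be exercised.

```cpp
#include <cassert>
#include <memory>

struct Foo {
    static int destroyed;          // illustrative counter
    ~Foo() { ++destroyed; }
};
int Foo::destroyed = 0;

int do_nothing_calls = 0;          // illustrative counter

// Hypothetical factory: the flag stands in for "may return null".
std::unique_ptr<Foo> make_foo(bool want_null) {
    return want_null ? nullptr : std::make_unique<Foo>();
}

void do_nothing(const Foo&) { ++do_nothing_calls; }

// Guarded variant of bar(): x is dereferenced only when it is non-null,
// so no undefined behavior occurs on the null path.
void bar_checked(bool want_null) {
    auto x = make_foo(want_null);
    if (x) {
        do_nothing(*x);
    }
}
```

On the null path neither ``do_nothing`` nor ``~Foo()`` runs; on the non-null path each runs exactly once.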
Potential breakages
-------------------

The following breakages were discovered by enabling this change and fixing the resulting issues in a large code base.

- Compilation failures

 - Function definitions now require ``T`` to be a complete type for parameters of type ``std::unique_ptr<T>``. The following code will no longer compile.

   .. code-block:: cpp

       class Foo;
       void func(std::unique_ptr<Foo> arg) { /* never use `arg` directly */ }

 - Fix: Remove the forward declaration of ``Foo`` and include its proper header.

- Runtime failures

 - The lifetime of ``std::unique_ptr<>`` arguments ends earlier (at the end of the callee's body, rather than at the end of the full expression containing the call).

   .. code-block:: cpp

     util::Status run_worker(std::unique_ptr<Foo>);
     void func() {
        std::unique_ptr<Foo> smart_foo = ...;
        Foo* owned_foo = smart_foo.get();
        // Currently, the following "works" because the argument to run_worker() is
        // deleted at the end of the full expression, after Bar() has returned.
        // With the new calling convention, it is deleted at the end of run_worker(),
        // making this an access to freed memory.
        owned_foo->Bar(run_worker(std::move(smart_foo)));
        //       ^
        //       <<< Crash expected here
     }

 - The lifetime of a local *returned* ``std::unique_ptr<>`` ends earlier.

   Spot the bug:

   .. code-block:: cpp

     std::unique_ptr<Foo> create_and_subscribe(Bar* subscriber) {
       auto foo = std::make_unique<Foo>();
       subscriber->sub([&foo] { foo->do_thing(); });
       return foo;
     }

   One could point out that this is an obvious stack-use-after-return bug.
   However, with the current calling convention, running this code with ASAN enabled does not report any issue.
   So is this a bug in ASAN? (Spoiler: No)

   This currently "works" only because the storage for ``foo`` is in the caller's stack frame.
   In other words, ``&foo`` in the callee and ``&foo`` in the caller are the same address.

ASAN can be used to detect both of these.
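
One way to repair the ``create_and_subscribe`` bug is to capture the pointee instead of the local ``std::unique_ptr`` object. The sketch below assumes that the returned ``std::unique_ptr`` outlives the subscription; the ``Foo`` and ``Bar`` definitions, including ``Bar::notify`` and the ``calls`` counter, are hypothetical stand-ins, since the original snippet does not define them.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>

// Hypothetical stand-ins for the types used in the original snippet.
struct Foo {
    int calls = 0;
    void do_thing() { ++calls; }
};

struct Bar {
    std::function<void()> callback;
    void sub(std::function<void()> f) { callback = std::move(f); }
    void notify() { if (callback) callback(); }
};

std::unique_ptr<Foo> create_and_subscribe(Bar* subscriber) {
    auto foo = std::make_unique<Foo>();
    // Capture the heap object's address by value. It stays valid for as long
    // as the returned unique_ptr owns the object, no matter where the
    // unique_ptr object itself is stored.
    Foo* raw = foo.get();
    subscriber->sub([raw] { raw->do_thing(); });
    return foo;
}
```

The callback no longer refers to the callee's local variable, so it behaves the same under both calling conventions.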
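
The ``run_worker`` failure can be avoided by not using the object after ownership has been handed over. One possible restructuring, shown here only as a sketch, keeps ownership in the caller and lets the worker borrow the object. Changing ``run_worker`` to take ``Foo&`` is an assumption about what the surrounding code allows, and ``Status``, ``worked`` and ``bar_calls`` are hypothetical stand-ins for the types in the original snippet.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-ins for util::Status and Foo.
struct Status { bool ok; };

struct Foo {
    bool worked = false;
    int bar_calls = 0;
    void Bar(Status s) { if (s.ok) ++bar_calls; }
};

// The worker borrows the object instead of taking ownership of it.
Status run_worker(Foo& foo) {
    foo.worked = true;
    return Status{true};
}

int func() {
    std::unique_ptr<Foo> smart_foo = std::make_unique<Foo>();
    // smart_foo still owns the object for the whole expression, so the call
    // to Bar() is valid regardless of the calling convention.
    smart_foo->Bar(run_worker(*smart_foo));
    return smart_foo->bar_calls;
}
```

If the worker really must take ownership, the alternative is to finish every use of the object before the ``std::move``.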