Thread-local storage.

Each thread has a thread control block, or TCB.  The TCB is a
variable-size structure headed by `struct tls_tcb' from <sys/tls.h>,
with:

(a) static thread-local storage for the TLS data of initial objects,
    i.e., those loaded at startup rather than those dynamically loaded
    by dlopen

(b) a pointer to a dynamic thread vector (DTV) for the TLS data
    pointers of objects that use global-dynamic or local-dynamic models
    (typically shared libraries or dlopenable modules)

(c) the pthread_t pointer

The per-thread lwp private pointer, also sometimes called TP (thread
pointer), managed by the _lwp_getprivate and _lwp_setprivate syscalls,
either points at the TCB directly, or, on some architectures, points at

        tp = tcb + sizeof(struct tls_tcb) + TLS_TP_OFFSET.

This bias is chosen for architectures where signed displacements from
TP enable twice the range of static TLS offsets when biased like this.
Architectures with such a tp/tcb offset must provide

        void *__lwp_gettcb_fast(void);

in machine/mcontext.h and must define __HAVE___LWP_GETTCB_FAST in
machine/types.h to reflect this; otherwise they must provide
__lwp_getprivate_fast to return the TCB pointer.

Each architecture has one of two TLS variants, variant I or variant II.
Variant I places the static thread-local storage _after_ the fixed
content of the TCB, at increasing addresses (addresses increase
downward in the diagram):

        +---------------+
        | dtv pointer   |       tcb points here (struct tls_tcb)
        +---------------+
        | pthread_t     |
        +---------------+
        | obj0 tls      |       obj0->tlsoffset = 0
        |               |
        |               |
        +---------------+
        | obj1 tls      |       obj1->tlsoffset = 3
        +---------------+
        | obj2 tls      |       obj2->tlsoffset = 4
        |               |
        .               .
        .               .
        .               .
        |               |
        +---------------+
        | objN tls      |       objN->tlsoffset = k
        +---------------+

Variant II places the static thread-local storage _before_ the fixed
content of the TCB, at decreasing addresses:

        +---------------+
        | objN tls      |       objN->tlsoffset = k
        +---------------+
        | obj(N-1) tls  |       obj(N-1)->tlsoffset = k - 1
        .               .
        .               .
        .               .
        |               |
        +---------------+
        | obj2 tls      |       obj2->tlsoffset = 4
        +---------------+
        | obj1 tls      |       obj1->tlsoffset = 3
        +---------------+
        | obj0 tls      |       obj0->tlsoffset = 0
        |               |
        |               |
        +---------------+
        | tcb pointer   |       tcb points here (struct tls_tcb)
        +---------------+
        | dtv pointer   |
        +---------------+
        | pthread_t     |
        +---------------+

See [ELFTLS] Sec. 3 `Run-Time Handling of TLS', Figs 1 and 2, for
bigger pictures including the DTV and dynamically allocated TLS blocks.

Each architecture also has its own ELF ABI processor supplement with
the architecture-specific relocations and TLS details.

References:

        [ELFTLS] Ulrich Drepper, `ELF Handling For Thread-Local
        Storage', Version 0.21, 2013-08-22.
        https://akkadia.org/drepper/tls.pdf
        https://web.archive.org/web/20240718081934/https://akkadia.org/drepper/tls.pdf

Steps for adding TLS support for a new platform:

(1) Declare TLS variant in machine/types.h by defining either
__HAVE_TLS_VARIANT_I or __HAVE_TLS_VARIANT_II.

(2) _lwp_makecontext has to set the reserved register or kernel
transfer variable in uc_mcontext according to the provided value of
`private'.  Note that _lwp_makecontext takes tcb, not tp, as an
argument, so make sure to adjust it if needed for the tp/tcb offset.
See src/lib/libc/arch/$PLATFORM/gen/_lwp.c.

This is not possible on the VAX as there is no free space in ucontext_t.
This requires either a special version of _lwp_create or versioning
everything using ucontext_t.
Debug support depends on getting the data from
ucontext_t, so the second option is possibly required.

(3) _lwp_setprivate(2) has to update the same register as
_lwp_makecontext uses for the private area pointer.  Normally
cpu_lwp_setprivate is provided by MD code to reflect the kernel view and
enabled by defining __HAVE_CPU_LWP_SETPRIVATE in machine/types.h.
cpu_setmcontext is responsible for keeping the MI l_private field
synchronised by calling lwp_setprivate as needed.

cpu_switchto has to update the mapping.

_lwp_setprivate is used for the initial thread; all other threads
created by libpthread use _lwp_makecontext for this purpose.

(4) Provide __tls_get_addr and possibly other MD functions for dynamic
TLS offset computation.  If such alternative entry points exist (currently
only i386), also add a weak reference to 0 in src/lib/libc/tls/tls.c.

The generic implementation can be found in tls.c and is used with
__HAVE_COMMON___TLS_GET_ADDR.  It depends on __lwp_getprivate_fast
(see below).

(5) Implement the necessary relocation records in mdreloc.c.  There are
typically three relocation types found in dynamic binaries:

(a) R_TYPE(TLS_DTPOFF): Offset inside the module.  The common TLS code
ensures that the DTV points to offset 0 inside the module TLS block.
This is normally def->st_value + rela->r_addend.

(b) R_TYPE(TLS_DTPMOD): Module index.

(c) R_TYPE(TLS_TPOFF): Static TLS offset.  The code has to check whether
the static TLS offset for this module has been allocated
(defobj->tls_static) and otherwise call _rtld_tls_offset_allocate().  This
may fail if no static space is available and the object has been pulled
in via dlopen(3).  It can also fail if the TLS area has already been used
via a global-dynamic allocation.

For TLS Variant I, this is typically:

def->st_value + rela->r_addend + defobj->tlsoffset + sizeof(struct tls_tcb)

i.e., defobj->tlsoffset doesn't include the fixed TCB, so its size is
added.

For TLS Variant II, this is typically:

def->st_value - defobj->tlsoffset + rela->r_addend

i.e., the offset counts down from the TCB.

(6) If there is a tp/tcb offset, implement

        __lwp_gettcb_fast()
        __lwp_settcb()

in machine/mcontext.h and set

        __HAVE___LWP_GETTCB_FAST
        __HAVE___LWP_SETTCB

in machine/types.h.

Otherwise, implement __lwp_getprivate_fast() in machine/mcontext.h and
set __HAVE___LWP_GETPRIVATE_FAST in machine/types.h.

(7) Test using src/tests/lib/libc/tls and src/tests/libexec/ld.elf_so.
Make sure with "objdump -R" that t_tls_dynamic has two TPOFF
relocations and that h_tls_dlopen.so.1 and libh_tls_dynamic.so.1 both
have two DTPMOD and DTPOFF relocations.
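
The offset arithmetic from steps (5)(c) and (6) can be sketched in
portable C as follows.  This is only an illustration, not the rtld or
libc code: `struct demo_tcb', DEMO_TLS_TP_OFFSET (with an arbitrary
example value) and all demo_* functions are hypothetical names made up
for this sketch, and the real `struct tls_tcb' and TLS_TP_OFFSET are
machine-dependent.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a Variant I `struct tls_tcb'; the real
 * definition comes from <sys/tls.h>.
 */
struct demo_tcb {
	void **tcb_dtv;		/* pointer to the DTV */
	void *tcb_pthread;	/* the pthread_t pointer */
};

/* Assumed example bias; the real TLS_TP_OFFSET is machine-dependent. */
#define DEMO_TLS_TP_OFFSET	0x7000

/* tp = tcb + sizeof(struct tls_tcb) + TLS_TP_OFFSET */
static inline uintptr_t
demo_tcb_to_tp(uintptr_t tcb)
{
	return tcb + sizeof(struct demo_tcb) + DEMO_TLS_TP_OFFSET;
}

/* Inverse mapping, the kind of work a __lwp_gettcb_fast would do. */
static inline uintptr_t
demo_tp_to_tcb(uintptr_t tp)
{
	return tp - DEMO_TLS_TP_OFFSET - sizeof(struct demo_tcb);
}

/* Variant I TPOFF: static TLS sits after the fixed TCB content. */
static inline intptr_t
demo_tpoff_variant1(intptr_t st_value, intptr_t r_addend, intptr_t tlsoffset)
{
	assert(tlsoffset >= 0);
	return st_value + r_addend + tlsoffset +
	    (intptr_t)sizeof(struct demo_tcb);
}

/* Variant II TPOFF: static TLS sits before the TCB, counting down. */
static inline intptr_t
demo_tpoff_variant2(intptr_t st_value, intptr_t r_addend, intptr_t tlsoffset)
{
	assert(tlsoffset >= 0);
	return st_value - tlsoffset + r_addend;
}
```

For example, with st_value 8, r_addend 0 and tlsoffset 32,
demo_tpoff_variant2 yields -24, i.e., 24 bytes below the TCB, while
demo_tpoff_variant1 with tlsoffset 16 yields 24 plus the size of the
fixed TCB.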