Following are change highlights associated with official releases.  Important
bug fixes are all mentioned, but some internal enhancements are omitted here for
brevity.  Much more detail can be found in the git revision history:

    https://github.com/jemalloc/jemalloc

* 5.1.0 (May 4th, 2018)

  This release is primarily about fine-tuning, ranging from several new features
  to numerous notable performance and portability enhancements.  The release and
  prior dev versions have been running in multiple large-scale applications for
  months, and the cumulative improvements are substantial in many cases.

  Given the long and successful production runs, this release is likely a good
  upgrade candidate for applications on jemalloc 5.0 or earlier.  For
  performance-critical applications, the newly added TUNING.md provides
  guidelines on jemalloc tuning.

  New features:
  - Implement transparent huge page support for internal metadata.  (@interwq)
  - Add opt.thp to allow enabling / disabling transparent huge pages for all
    mappings.  (@interwq)
  - Add maximum background thread count option.  (@djwatson)
  - Allow prof_active to control opt.lg_prof_interval and prof.gdump.
    (@interwq)
  - Allow arena index lookup based on allocation addresses via mallctl.
    (@lionkov)
  - Allow disabling initial-exec TLS model.  (@davidtgoldblatt, @KenMacD)
  - Add opt.lg_extent_max_active_fit to set the max ratio between the size of
    the active extent selected (to split off from) and the size of the requested
    allocation.  (@interwq, @davidtgoldblatt)
  - Add retain_grow_limit to set the max size when growing virtual address
    space.  (@interwq)
  - Add mallctl interfaces:
    + arena.<i>.retain_grow_limit  (@interwq)
    + arenas.lookup  (@lionkov)
    + max_background_threads  (@djwatson)
    + opt.lg_extent_max_active_fit  (@interwq)
    + opt.max_background_threads  (@djwatson)
    + opt.metadata_thp  (@interwq)
    + opt.thp  (@interwq)
    + stats.metadata_thp  (@interwq)

  Portability improvements:
  - Support GNU/kFreeBSD configuration.  (@paravoid)
  - Support m68k, nios2 and SH3 architectures.  (@paravoid)
  - Fall back to FD_CLOEXEC when O_CLOEXEC is unavailable.  (@zonyitoo)
  - Fix symbol listing for cross-compiling.  (@tamird)
  - Fix high bits computation on ARM.  (@davidtgoldblatt, @paravoid)
  - Disable the CPU_SPINWAIT macro for Power.  (@davidtgoldblatt, @marxin)
  - Fix MSVC 2015 & 2017 builds.  (@rustyx)
  - Improve RISC-V support.  (@EdSchouten)
  - Set name mangling script in strict mode.  (@nicolov)
  - Avoid MADV_HUGEPAGE on ARM.  (@marxin)
  - Modify configure to determine return value of strerror_r.
    (@davidtgoldblatt, @cferris1000)
  - Make sure CXXFLAGS is tested with CPP compiler.  (@nehaljwani)
  - Fix 32-bit build on MSVC.  (@rustyx)
  - Fix external symbol on MSVC.  (@maksqwe)
  - Avoid a printf format specifier warning.  (@jasone)
  - Add configure option --disable-initial-exec-tls which can allow jemalloc to
    be dynamically loaded after program startup.  (@davidtgoldblatt, @KenMacD)
  - AArch64: Add ILP32 support.  (@cmuellner)
  - Add --with-lg-vaddr configure option to support cross compiling.
    (@cmuellner, @davidtgoldblatt)
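
  As a rough illustration of the O_CLOEXEC fallback mentioned in this list (a
  sketch, not the code from the jemalloc tree), a descriptor can be opened with
  the flag when the platform defines it, and patched up with fcntl(2)
  otherwise.  The helper name open_cloexec is hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Open `path` read-only with close-on-exec set.
 * Hypothetical helper; returns the descriptor, or -1 on failure.
 */
static int open_cloexec(const char *path) {
    assert(path != NULL);
#ifdef O_CLOEXEC
    /* Preferred: the flag is applied atomically by open(2). */
    return open(path, O_RDONLY | O_CLOEXEC);
#else
    /* Fallback: open first, then set FD_CLOEXEC on the descriptor. */
    int fd = open(path, O_RDONLY);
    if (fd == -1) {
        return -1;
    }
    int flags = fcntl(fd, F_GETFD);
    if (flags == -1 || fcntl(fd, F_SETFD, flags | FD_CLOEXEC) == -1) {
        close(fd);
        return -1;
    }
    return fd;
#endif
}
```

  The fcntl(2) path leaves a short window between open(2) and F_SETFD in which
  a concurrent fork/exec could inherit the descriptor, which is why the atomic
  flag is preferred whenever it is available.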

  Optimizations and refactors:
  - Improve active extent fit with extent_max_active_fit.  This considerably
    reduces fragmentation over time and improves virtual memory and metadata
    usage.  (@davidtgoldblatt, @interwq)
  - Eagerly coalesce large extents to reduce fragmentation.  (@interwq)
  - sdallocx: only read size info when page aligned (i.e. possibly sampled),
    which speeds up the sized deallocation path significantly.  (@interwq)
  - Avoid attempting new mappings for in place expansion with retain, since
    it rarely succeeds in practice and causes high overhead.  (@interwq)
  - Refactor OOM handling in newImpl.  (@wqfish)
  - Add internal fine-grained logging functionality for debugging use.
    (@davidtgoldblatt)
  - Refactor arena / tcache interactions.  (@davidtgoldblatt)
  - Refactor extent management with dumpable flag.  (@davidtgoldblatt)
  - Add runtime detection of lazy purging.  (@interwq)
  - Use pairing heap instead of red-black tree for extents_avail.  (@djwatson)
  - Use sysctl on startup in FreeBSD.  (@trasz)
  - Use thread local prng state instead of atomic.  (@djwatson)
  - Make decay always purge one more extent than before, because in
    practice large extents are usually the ones that cross the decay threshold.
    Purging the additional extent helps save memory as well as reduce VM
    fragmentation.  (@interwq)
  - Fast division by dynamic values.  (@davidtgoldblatt)
  - Improve the fit for aligned allocation.  (@interwq, @edwinsmith)
  - Refactor extent_t bitpacking.  (@rkmisra)
  - Optimize the generated assembly for ticker operations.  (@davidtgoldblatt)
  - Convert stats printing to use a structured text emitter.  (@davidtgoldblatt)
  - Remove preserve_lru feature for extents management.  (@djwatson)
  - Consolidate two memory loads into one on the fast deallocation path.
    (@davidtgoldblatt, @interwq)
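
  One common way to get "fast division by dynamic values" is to precompute a
  reciprocal once per divisor and replace each division with a multiply and a
  shift.  The sketch below is only an illustration of that technique, with
  hypothetical names (fast_div_t, fast_div_init, fast_div_compute).  It assumes
  32-bit operands, a divisor of at least 2, and a dividend that is an exact
  multiple of the divisor; the result is not guaranteed otherwise.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type: holds ceil(2^32 / d) for one divisor d. */
typedef struct {
    uint32_t magic;
} fast_div_t;

/* Precompute the reciprocal for divisor d (d >= 2 so it fits in 32 bits). */
static void fast_div_init(fast_div_t *fd, uint32_t d) {
    assert(fd != NULL);
    assert(d >= 2);
    uint64_t two_to_32 = (uint64_t)1 << 32;
    uint64_t magic = two_to_32 / d;
    if (two_to_32 % d != 0) {
        magic++; /* round up */
    }
    fd->magic = (uint32_t)magic;
}

/*
 * Compute n / d with one multiply and one shift.
 * Exact only when n is a multiple of the divisor given to fast_div_init().
 */
static uint32_t fast_div_compute(const fast_div_t *fd, uint32_t n) {
    return (uint32_t)(((uint64_t)n * fd->magic) >> 32);
}
```

  The exact-multiple restriction fits cases such as turning a byte offset
  inside a slab into a region index, where the offset is always a whole number
  of regions.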

  Bug fixes (most of the issues are only relevant to jemalloc 5.0):
  - Fix deadlock with multithreaded fork in OS X.  (@davidtgoldblatt)
  - Validate returned file descriptor before use.  (@zonyitoo)
  - Fix a few background thread initialization and shutdown issues.  (@interwq)
  - Fix an extent coalesce + decay race by taking both coalescing extents off
    the LRU list.  (@interwq)
  - Fix potentially unbounded increase during decay, caused by one thread
    repeatedly stashing memory to purge while other threads generate new pages.
    The number of pages to purge is checked to prevent this.  (@interwq)
  - Fix a FreeBSD bootstrap assertion.  (@strejda, @interwq)
  - Handle 32 bit mutex counters.  (@rkmisra)
  - Fix an indexing bug when creating background threads.  (@davidtgoldblatt,
    @binliu19)
  - Fix arguments passed to extent_init.  (@yuleniwo, @interwq)
  - Fix addresses used for ordering mutexes.  (@rkmisra)
  - Fix abort_conf processing during bootstrap.  (@interwq)
  - Fix include path order for out-of-tree builds.  (@cmuellner)

  Incompatible changes:
  - Remove --disable-thp.  (@interwq)
  - Remove mallctl interfaces:
    + config.thp  (@interwq)

  Documentation:
  - Add TUNING.md.  (@interwq, @davidtgoldblatt, @djwatson)

* 5.0.1 (July 1, 2017)

  This bugfix release fixes several issues, most of which are obscure enough
  that typical applications are not impacted.

  Bug fixes:
  - Update decay->nunpurged before purging, in order to avoid potential update
    races and subsequent incorrect purging volume.  (@interwq)
  - Only abort on dlsym(3) error if the failure impacts an enabled feature (lazy
    locking and/or background threads).  This mitigates an initialization
    failure bug for which we still do not have a clear reproduction test case.
    (@interwq)
  - Modify tsd management so that it neither crashes nor leaks if a thread's
    only allocation activity is to call free() after TLS destructors have been
    executed.  This behavior was observed when operating with GNU libc, and is
    unlikely to be an issue with other libc implementations.  (@interwq)
  - Mask signals during background thread creation.  This prevents signals from
    being inadvertently delivered to background threads.  (@jasone,
    @davidtgoldblatt, @interwq)
  - Avoid inactivity checks within background threads, in order to prevent
    recursive mutex acquisition.  (@interwq)
  - Fix extent_grow_retained() to use the specified hooks when the
    arena.<i>.extent_hooks mallctl is used to override the default hooks.
    (@interwq)
  - Add missing reentrancy support for custom extent hooks which allocate.
    (@interwq)
  - Post-fork(2), re-initialize the list of tcaches associated with each arena
    to contain no tcaches except the forking thread's.  (@interwq)
  - Add missing post-fork(2) mutex reinitialization for extent_grow_mtx.  This
    fixes potential deadlocks after fork(2).  (@interwq)
  - Enforce minimum autoconf version (currently 2.68), since 2.63 is known to
    generate corrupt configure scripts.  (@jasone)
  - Ensure that the configured page size (--with-lg-page) is no larger than the
    configured huge page size (--with-lg-hugepage).  (@jasone)

* 5.0.0 (June 13, 2017)

  Unlike all previous jemalloc releases, this release does not use naturally
  aligned "chunks" for virtual memory management, and instead uses page-aligned
  "extents".  This change has few externally visible effects, but the internal
  impacts are... extensive.  Many other internal changes combine to make this
  the most cohesively designed version of jemalloc so far, with ample
  opportunity for further enhancements.

  Continuous integration is now an integral aspect of development thanks to the
  efforts of @davidtgoldblatt, and the dev branch tends to remain reasonably
  stable on the tested platforms (Linux, FreeBSD, macOS, and Windows).  As a
  side effect the official release frequency may decrease over time.

  New features:
  - Implement optional per-CPU arena support; threads choose which arena to use
    based on current CPU rather than on fixed thread-->arena associations.
    (@interwq)
  - Implement two-phase decay of unused dirty pages.  Pages transition from
    dirty-->muzzy-->clean, where the first phase transition relies on
    madvise(... MADV_FREE) semantics, and the second phase transition discards
    pages such that they are replaced with demand-zeroed pages on next access.
    (@jasone)
  - Increase decay time resolution from seconds to milliseconds.  (@jasone)
  - Implement opt-in per CPU background threads, and use them for asynchronous
    decay-driven unused dirty page purging.  (@interwq)
  - Add mutex profiling, which collects a variety of statistics useful for
    diagnosing overhead/contention issues.  (@interwq)
  - Add C++ new/delete operator bindings.  (@djwatson)
  - Support manually created arena destruction, such that all data and metadata
    are discarded.  Add MALLCTL_ARENAS_DESTROYED for accessing merged stats
    associated with destroyed arenas.  (@jasone)
  - Add MALLCTL_ARENAS_ALL as a fixed index for use in accessing
    merged/destroyed arena statistics via mallctl.  (@jasone)
  - Add opt.abort_conf to optionally abort if invalid configuration options are
    detected during initialization.  (@interwq)
  - Add opt.stats_print_opts, so that e.g. JSON output can be selected for the
    stats dumped during exit if opt.stats_print is true.  (@jasone)
  - Add --with-version=VERSION for use when embedding jemalloc into another
    project's git repository.  (@jasone)
  - Add --disable-thp to support cross compiling.  (@jasone)
  - Add --with-lg-hugepage to support cross compiling.  (@jasone)
  - Add mallctl interfaces (various authors):
    + background_thread
    + opt.abort_conf
    + opt.retain
    + opt.percpu_arena
    + opt.background_thread
    + opt.{dirty,muzzy}_decay_ms
    + opt.stats_print_opts
    + arena.<i>.initialized
    + arena.<i>.destroy
    + arena.<i>.{dirty,muzzy}_decay_ms
    + arena.<i>.extent_hooks
    + arenas.{dirty,muzzy}_decay_ms
    + arenas.bin.<i>.slab_size
    + arenas.nlextents
    + arenas.lextent.<i>.size
    + arenas.create
    + stats.background_thread.{num_threads,num_runs,run_interval}
    + stats.mutexes.{ctl,background_thread,prof,reset}.
      {num_ops,num_spin_acq,num_wait,max_wait_time,total_wait_time,max_num_thds,
      num_owner_switch}
    + stats.arenas.<i>.{dirty,muzzy}_decay_ms
    + stats.arenas.<i>.uptime
    + stats.arenas.<i>.{pmuzzy,base,internal,resident}
    + stats.arenas.<i>.{dirty,muzzy}_{npurge,nmadvise,purged}
    + stats.arenas.<i>.bins.<j>.{nslabs,reslabs,curslabs}
    + stats.arenas.<i>.bins.<j>.mutex.
      {num_ops,num_spin_acq,num_wait,max_wait_time,total_wait_time,max_num_thds,
      num_owner_switch}
    + stats.arenas.<i>.lextents.<j>.{nmalloc,ndalloc,nrequests,curlextents}
    + stats.arenas.<i>.mutexes.{large,extent_avail,extents_dirty,extents_muzzy,
      extents_retained,decay_dirty,decay_muzzy,base,tcache_list}.
      {num_ops,num_spin_acq,num_wait,max_wait_time,total_wait_time,max_num_thds,
      num_owner_switch}
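
  The two-phase decay described in this list (dirty-->muzzy-->clean) can be
  sketched with plain madvise(2) calls.  This is an illustration under stated
  assumptions, not jemalloc's purging code: it assumes Linux semantics for
  private anonymous mappings, uses MADV_DONTNEED as the forced second phase,
  and all helper names are hypothetical.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1 /* for MAP_ANONYMOUS / MADV_FREE on glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* System page size in bytes. */
static size_t page_size(void) {
    long ps = sysconf(_SC_PAGESIZE);
    assert(ps > 0);
    return (size_t)ps;
}

/* Map `len` bytes of private anonymous memory and dirty every byte. */
static void *map_dirty_pages(size_t len) {
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) {
        return NULL;
    }
    memset(p, 0x5a, len);
    return p;
}

/*
 * Phase 1 (dirty --> muzzy): tell the kernel the pages may be reclaimed
 * lazily.  Returns 0 on success, -1 if MADV_FREE is missing or fails.
 */
static int purge_lazy(void *addr, size_t len) {
#ifdef MADV_FREE
    return madvise(addr, len, MADV_FREE);
#else
    (void)addr;
    (void)len;
    return -1;
#endif
}

/*
 * Phase 2 (muzzy --> clean): discard the pages so that, on Linux, the next
 * access to a private anonymous mapping sees demand-zeroed memory.
 */
static int purge_forced(void *addr, size_t len) {
    return madvise(addr, len, MADV_DONTNEED);
}
```

  The first phase is cheap because the kernel only reclaims the pages under
  memory pressure; the second phase gives the memory back unconditionally.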

  Portability improvements:
  - Improve reentrant allocation support, such that deadlock is less likely if
    e.g. a system library call in turn allocates memory.  (@davidtgoldblatt,
    @interwq)
  - Support static linking of jemalloc with glibc.  (@djwatson)

  Optimizations and refactors:
  - Organize virtual memory as "extents" of virtual memory pages, rather than as
    naturally aligned "chunks", and store all metadata in arbitrarily distant
    locations.  This reduces virtual memory external fragmentation, and will
    interact better with huge pages (not yet explicitly supported).  (@jasone)
  - Fold large and huge size classes together; only small and large size classes
    remain.  (@jasone)
  - Unify the allocation paths, and merge most fast-path branching decisions.
    (@davidtgoldblatt, @interwq)
  - Embed per thread automatic tcache into thread-specific data, which reduces
    conditional branches and dereferences.  Also reorganize tcache to increase
    fast-path data locality.  (@interwq)
  - Rewrite atomics to closely model the C11 API, convert various
    synchronization from mutex-based to atomic, and use the explicit memory
    ordering control to resolve various hypothetical races without increasing
    synchronization overhead.  (@davidtgoldblatt)
  - Extensively optimize rtree via various methods:
    + Add multiple layers of rtree lookup caching, since rtree lookups are now
      part of fast-path deallocation.  (@interwq)
    + Determine rtree layout at compile time.  (@jasone)
    + Make the tree shallower for common configurations.  (@jasone)
    + Embed the root node in the top-level rtree data structure, thus avoiding
      one level of indirection.  (@jasone)
    + Further specialize leaf elements as compared to internal node elements,
      and directly embed extent metadata needed for fast-path deallocation.
      (@jasone)
    + Ignore leading always-zero address bits (architecture-specific).
      (@jasone)
  - Reorganize headers (ongoing work) to make them hermetic, and disentangle
    various module dependencies.  (@davidtgoldblatt)
  - Convert various internal data structures such as size class metadata from
    boot-time-initialized to compile-time-initialized.  Propagate resulting data
    structure simplifications, such as making arena metadata fixed-size.
    (@jasone)
  - Simplify size class lookups when constrained to size classes that are
    multiples of the page size.  This speeds lookups, but the primary benefit is
    complexity reduction in code that was the source of numerous regressions.
    (@jasone)
  - Lock individual extents when possible for localized extent operations,
    rather than relying on a top-level arena lock.  (@davidtgoldblatt, @jasone)
  - Use first fit layout policy instead of best fit, in order to improve
    packing.  (@jasone)
  - If munmap(2) is not in use, use an exponential series to grow each arena's
    virtual memory, so that the number of disjoint virtual memory mappings
    remains low.  (@jasone)
  - Implement per arena base allocators, so that arenas never share any virtual
    memory pages.  (@jasone)
  - Automatically generate private symbol name mangling macros.  (@jasone)
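
  To make the "explicit memory ordering control" in the atomics item above
  more concrete, here is a small C11 <stdatomic.h> sketch.  It does not use
  jemalloc's internal atomic wrappers; the types and functions (stats_t,
  lazy_ptr_t, and their helpers) are hypothetical and only show the two
  patterns: relaxed ordering for a statistics counter, and acquire/release for
  publishing a lazily created pointer without a mutex.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical statistics counter: no ordering needed, only atomicity. */
typedef struct {
    atomic_size_t nrequests;
} stats_t;

static void stats_inc(stats_t *s) {
    assert(s != NULL);
    atomic_fetch_add_explicit(&s->nrequests, 1, memory_order_relaxed);
}

static size_t stats_read(stats_t *s) {
    assert(s != NULL);
    return atomic_load_explicit(&s->nrequests, memory_order_relaxed);
}

/* Hypothetical lazily initialized slot, published without a mutex. */
typedef struct {
    _Atomic(void *) slot;
} lazy_ptr_t;

/*
 * Install `candidate` if the slot is still NULL.  Returns whichever pointer
 * ends up in the slot.  On success the acq_rel exchange makes the candidate's
 * prior initialization visible to threads that later load the slot with
 * acquire ordering; on failure the caller observes the winner's pointer.
 */
static void *lazy_publish(lazy_ptr_t *lp, void *candidate) {
    void *expected = NULL;
    if (atomic_compare_exchange_strong_explicit(&lp->slot, &expected,
        candidate, memory_order_acq_rel, memory_order_acquire)) {
        return candidate;
    }
    return expected;
}
```

  Choosing the weakest ordering that is still correct for each access is what
  lets such code avoid a lock without paying for full sequential consistency.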

  Incompatible changes:
  - Replace chunk hooks with an expanded/normalized set of extent hooks.
    (@jasone)
  - Remove ratio-based purging.  (@jasone)
  - Remove --disable-tcache.  (@jasone)
  - Remove --disable-tls.  (@jasone)
  - Remove --enable-ivsalloc.  (@jasone)
  - Remove --with-lg-size-class-group.  (@jasone)
  - Remove --with-lg-tiny-min.  (@jasone)
  - Remove --disable-cc-silence.  (@jasone)
  - Remove --enable-code-coverage.  (@jasone)
  - Remove --disable-munmap (replaced by opt.retain).  (@jasone)
  - Remove Valgrind support.  (@jasone)
  - Remove quarantine support.  (@jasone)
  - Remove redzone support.  (@jasone)
  - Remove mallctl interfaces (various authors):
    + config.munmap
    + config.tcache
    + config.tls
    + config.valgrind
    + opt.lg_chunk
    + opt.purge
    + opt.lg_dirty_mult
    + opt.decay_time
    + opt.quarantine
    + opt.redzone
    + opt.thp
    + arena.<i>.lg_dirty_mult
    + arena.<i>.decay_time
    + arena.<i>.chunk_hooks
    + arenas.initialized
    + arenas.lg_dirty_mult
    + arenas.decay_time
    + arenas.bin.<i>.run_size
    + arenas.nlruns
    + arenas.lrun.<i>.size
    + arenas.nhchunks
    + arenas.hchunk.<i>.size
    + arenas.extend
    + stats.cactive
    + stats.arenas.<i>.lg_dirty_mult
    + stats.arenas.<i>.decay_time
    + stats.arenas.<i>.metadata.{mapped,allocated}
    + stats.arenas.<i>.{npurge,nmadvise,purged}
    + stats.arenas.<i>.huge.{allocated,nmalloc,ndalloc,nrequests}
    + stats.arenas.<i>.bins.<j>.{nruns,reruns,curruns}
    + stats.arenas.<i>.lruns.<j>.{nmalloc,ndalloc,nrequests,curruns}
    + stats.arenas.<i>.hchunks.<j>.{nmalloc,ndalloc,nrequests,curhchunks}

  Bug fixes:
  - Improve interval-based profile dump triggering to dump only one profile when
    a single allocation's size exceeds the interval.  (@jasone)
  - Use prefixed function names (as controlled by --with-jemalloc-prefix) when
    pruning backtrace frames in jeprof.  (@jasone)

* 4.5.0 (February 28, 2017)

  This is the first release to benefit from much broader continuous integration
  testing, thanks to @davidtgoldblatt.  Had we had this testing infrastructure
  in place for prior releases, it would have caught all of the most serious
  regressions fixed by this release.

  New features:
  - Add --disable-thp and the opt.thp mallctl to provide opt-out mechanisms for
    transparent huge page integration.  (@jasone)
  - Update zone allocator integration to work with macOS 10.12.  (@glandium)
  - Restructure *CFLAGS configuration, so that CFLAGS behaves typically, and
    EXTRA_CFLAGS provides a way to specify e.g. -Werror during building, but not
    during configuration.  (@jasone, @ronawho)

  Bug fixes:
  - Fix DSS (sbrk(2)-based) allocation.  This regression was first released in
    4.3.0.  (@jasone)
  - Handle race in per size class utilization computation.  This functionality
    was first released in 4.0.0.  (@interwq)
  - Fix lock order reversal during gdump.  (@jasone)
  - Fix/refactor tcache synchronization.  This regression was first released in
    4.0.0.  (@jasone)
  - Fix various JSON-formatted malloc_stats_print() bugs.  This functionality
    was first released in 4.3.0.  (@jasone)
  - Fix huge-aligned allocation.  This regression was first released in 4.4.0.
    (@jasone)
  - When transparent huge page integration is enabled, detect what state pages
    start in according to the kernel's current operating mode, and only convert
    arena chunks to non-huge during purging if that is not their initial state.
    This functionality was first released in 4.4.0.  (@jasone)
  - Fix lg_chunk clamping for the --enable-cache-oblivious --disable-fill case.
    This regression was first released in 4.0.0.  (@jasone, @428desmo)
  - Properly detect sparc64 when building for Linux.  (@glaubitz)

* 4.4.0 (December 3, 2016)

  New features:
  - Add configure support for *-*-linux-android.  (@cferris1000, @jasone)
  - Add the --disable-syscall configure option, for use on systems that place
    security-motivated limitations on syscall(2).  (@jasone)
  - Add support for Debian GNU/kFreeBSD.  (@thesam)

  Optimizations:
  - Add extent serial numbers and use them where appropriate as a sort key that
    is higher priority than address, so that the allocation policy prefers older
    extents.  This tends to improve locality (decrease fragmentation) when
    memory grows downward.  (@jasone)
  - Refactor madvise(2) configuration so that MADV_FREE is detected and utilized
    on Linux 4.5 and newer.  (@jasone)
  - Mark partially purged arena chunks as non-huge-page.  This improves
    interaction with Linux's transparent huge page functionality.  (@jasone)
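
  The serial-number sort key from the first optimization above amounts to a
  two-level comparison: serial number first, address only as a tie-breaker.  A
  minimal sketch follows, assuming a hypothetical extent_key_t record and
  using qsort(3) in place of whatever ordered container the allocator actually
  uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical record: sn is assigned in creation order, so lower is older. */
typedef struct {
    size_t sn;
    uintptr_t addr;
} extent_key_t;

/* Order by serial number first; fall back to address only on equal sn. */
static int extent_key_comp(const void *a, const void *b) {
    const extent_key_t *x = a;
    const extent_key_t *y = b;
    if (x->sn != y->sn) {
        return (x->sn > y->sn) - (x->sn < y->sn);
    }
    return (x->addr > y->addr) - (x->addr < y->addr);
}

/* Sort keys so that the oldest extent (lowest sn) comes first. */
static void extent_keys_sort(extent_key_t *keys, size_t n) {
    assert(keys != NULL || n == 0);
    qsort(keys, n, sizeof(*keys), extent_key_comp);
}
```

  With address as the only key, a newer extent mapped at a lower address would
  sort ahead of older ones; putting the serial number first keeps the
  preference on age regardless of which direction the address space grows.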

  Bug fixes:
  - Fix size class computations for edge conditions involving extremely large
    allocations.  This regression was first released in 4.0.0.  (@jasone,
    @ingvarha)
  - Remove overly restrictive assertions related to the cactive statistic.  This
    regression was first released in 4.1.0.  (@jasone)
  - Implement a more reliable detection scheme for os_unfair_lock on macOS.
    (@jszakmeister)

* 4.3.1 (November 7, 2016)

  Bug fixes:
  - Fix a severe virtual memory leak.  This regression was first released in
    4.3.0.  (@interwq, @jasone)
  - Refactor atomic and prng APIs to restore support for 32-bit platforms that
    use pre-C11 toolchains, e.g. FreeBSD's mips.  (@jasone)

* 4.3.0 (November 4, 2016)

  This is the first release that passes the test suite for multiple Windows
  configurations, thanks in large part to @glandium setting up continuous
  integration via AppVeyor (and Travis CI for Linux and OS X).

  New features:
  - Add "J" (JSON) support to malloc_stats_print().  (@jasone)
  - Add Cray compiler support.  (@ronawho)

  Optimizations:
  - Add/use adaptive spinning for bootstrapping and radix tree node
    initialization.  (@jasone)

  Bug fixes:
  - Fix large allocation to search starting in the optimal size class heap,
    which can substantially reduce virtual memory churn and fragmentation.  This
    regression was first released in 4.0.0.  (@mjp41, @jasone)
  - Fix stats.arenas.<i>.nthreads accounting.  (@interwq)
  - Fix and simplify decay-based purging.  (@jasone)
  - Make DSS (sbrk(2)-related) operations lockless, which resolves potential
    deadlocks during thread exit.  (@jasone)
  - Fix over-sized allocation of radix tree leaf nodes.  (@mjp41, @ogaun,
    @jasone)
  - Fix over-sized allocation of arena_t (plus associated stats) data
    structures.  (@jasone, @interwq)
  - Fix EXTRA_CFLAGS to not affect configuration.  (@jasone)
  - Fix a Valgrind integration bug.  (@ronawho)
  - Disallow 0x5a junk filling when running in Valgrind.  (@jasone)
  - Fix a file descriptor leak on Linux.  This regression was first released in
    4.2.0.  (@vsarunas, @jasone)
  - Fix static linking of jemalloc with glibc.  (@djwatson)
  - Use syscall(2) rather than {open,read,close}(2) during boot on Linux.  This
    works around other libraries' system call wrappers performing reentrant
    allocation.  (@kspinka, @Whissi, @jasone)
  - Fix OS X default zone replacement to work with OS X 10.12.  (@glandium,
    @jasone)
  - Fix cached memory management to avoid needless commit/decommit operations
    during purging, which resolves permanent virtual memory map fragmentation
    issues on Windows.  (@mjp41, @jasone)
  - Fix TSD fetches to avoid (recursive) allocation.  This is relevant to
    non-TLS and Windows configurations.  (@jasone)
  - Fix malloc_conf overriding to work on Windows.  (@jasone)
  - Forcibly disable lazy-lock on Windows (was forcibly *enabled*).  (@jasone)

* 4.2.1 (June 8, 2016)

  Bug fixes:
464*8e33eff8Schristos  - Fix bootstrapping issues for configurations that require allocation during
465*8e33eff8Schristos    tsd initialization (e.g. --disable-tls).  (@cferris1000, @jasone)
466*8e33eff8Schristos  - Fix gettimeofday() version of nstime_update().  (@ronawho)
467*8e33eff8Schristos  - Fix Valgrind regressions in calloc() and chunk_alloc_wrapper().  (@ronawho)
468*8e33eff8Schristos  - Fix potential VM map fragmentation regression.  (@jasone)
469*8e33eff8Schristos  - Fix opt_zero-triggered in-place huge reallocation zeroing.  (@jasone)
470*8e33eff8Schristos  - Fix heap profiling context leaks in reallocation edge cases.  (@jasone)
471*8e33eff8Schristos
472*8e33eff8Schristos* 4.2.0 (May 12, 2016)
473*8e33eff8Schristos
474*8e33eff8Schristos  New features:
475*8e33eff8Schristos  - Add the arena.<i>.reset mallctl, which makes it possible to discard all of
476*8e33eff8Schristos    an arena's allocations in a single operation.  (@jasone)
477*8e33eff8Schristos  - Add the stats.retained and stats.arenas.<i>.retained statistics.  (@jasone)
478*8e33eff8Schristos  - Add the --with-version configure option.  (@jasone)
479*8e33eff8Schristos  - Support --with-lg-page values larger than actual page size.  (@jasone)
480*8e33eff8Schristos
481*8e33eff8Schristos  Optimizations:
482*8e33eff8Schristos  - Use pairing heaps rather than red-black trees for various hot data
483*8e33eff8Schristos    structures.  (@djwatson, @jasone)
484*8e33eff8Schristos  - Streamline fast paths of rtree operations.  (@jasone)
485*8e33eff8Schristos  - Optimize the fast paths of calloc() and [m,d,sd]allocx().  (@jasone)
486*8e33eff8Schristos  - Decommit unused virtual memory if the OS does not overcommit.  (@jasone)
487*8e33eff8Schristos  - Specify MAP_NORESERVE on Linux if [heuristic] overcommit is active, in order
488*8e33eff8Schristos    to avoid unfortunate interactions during fork(2).  (@jasone)
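
  As background for the pairing heap item above, the following is a minimal
  sketch of the data structure, not jemalloc's implementation.  The node layout
  and the ph_* names are assumptions made for illustration.  Insertion is a
  single meld, and removing the minimum uses the classic two-pass pairing of
  the root's children.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node type for this sketch; not jemalloc's internal layout. */
typedef struct ph_node {
    int key;
    struct ph_node *child;   /* leftmost child */
    struct ph_node *sibling; /* next sibling to the right */
} ph_node;

/* Merge two heaps: the larger root becomes the leftmost child of the smaller. */
static ph_node *ph_meld(ph_node *a, ph_node *b) {
    if (a == NULL) return b;
    if (b == NULL) return a;
    if (b->key < a->key) {
        ph_node *t = a;
        a = b;
        b = t;
    }
    b->sibling = a->child;
    a->child = b;
    return a;
}

/* Insert is one meld with a singleton heap. */
static ph_node *ph_insert(ph_node *root, ph_node *n, int key) {
    n->key = key;
    n->child = NULL;
    n->sibling = NULL;
    return ph_meld(root, n);
}

/* Pair up neighbouring siblings, then fold the pairs from right to left. */
static ph_node *ph_merge_pairs(ph_node *first) {
    if (first == NULL || first->sibling == NULL) return first;
    ph_node *second = first->sibling;
    ph_node *rest = second->sibling;
    first->sibling = NULL;
    second->sibling = NULL;
    return ph_meld(ph_meld(first, second), ph_merge_pairs(rest));
}

/* Remove the minimum; returns the new root and stores the removed key. */
static ph_node *ph_remove_min(ph_node *root, int *key_out) {
    assert(root != NULL);
    *key_out = root->key;
    return ph_merge_pairs(root->child);
}
```

  Nodes are supplied by the caller, so the heap itself never allocates, which
  is a convenient property for a structure used inside an allocator.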
489*8e33eff8Schristos
490*8e33eff8Schristos  Bug fixes:
491*8e33eff8Schristos  - Fix chunk accounting related to triggering gdump profiles.  (@jasone)
492*8e33eff8Schristos  - Link against librt for clock_gettime(2) if glibc < 2.17.  (@jasone)
493*8e33eff8Schristos  - Scale leak report summary according to sampling probability.  (@jasone)
494*8e33eff8Schristos
495*8e33eff8Schristos* 4.1.1 (May 3, 2016)
496*8e33eff8Schristos
497*8e33eff8Schristos  This bugfix release resolves a variety of mostly minor issues, though the
498*8e33eff8Schristos  bitmap fix is critical for 64-bit Windows.
499*8e33eff8Schristos
500*8e33eff8Schristos  Bug fixes:
501*8e33eff8Schristos  - Fix the linear scan version of bitmap_sfu() to shift by the proper amount
502*8e33eff8Schristos    even when sizeof(long) is not the same as sizeof(void *), as on 64-bit
503*8e33eff8Schristos    Windows.  (@jasone)
504*8e33eff8Schristos  - Fix hashing functions to avoid unaligned memory accesses (and resulting
505*8e33eff8Schristos    crashes).  This is relevant at least to some ARM-based platforms.
506*8e33eff8Schristos    (@rkmisra)
507*8e33eff8Schristos  - Fix fork()-related lock rank ordering reversals.  These reversals were
508*8e33eff8Schristos    unlikely to cause deadlocks in practice except when heap profiling was
509*8e33eff8Schristos    enabled and active.  (@jasone)
510*8e33eff8Schristos  - Fix various chunk leaks in OOM code paths.  (@jasone)
511*8e33eff8Schristos  - Fix malloc_stats_print() to print opt.narenas correctly.  (@jasone)
512*8e33eff8Schristos  - Fix MSVC-specific build/test issues.  (@rustyx, @yuslepukhin)
513*8e33eff8Schristos  - Fix a variety of test failures that were due to test fragility rather than
514*8e33eff8Schristos    core bugs.  (@jasone)
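
  The bitmap item above concerns index arithmetic on a platform where long and
  pointers differ in width.  One way to stay independent of such differences is
  to derive every shift and multiplier from the group type itself.  The sketch
  below is an assumption-laden illustration: group_t, GROUP_NBITS, and the
  "set bit means in use" polarity are hypothetical and not taken from jemalloc.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical group type; its width differs between LP64 and 64-bit Windows. */
typedef unsigned long group_t;

/* Bits per group, derived from the group type rather than hard-coded. */
#define GROUP_NBITS (sizeof(group_t) * CHAR_BIT)

/*
 * Linear scan "set first unset": find the lowest clear bit, set it, and return
 * its global index.  Returns (size_t)-1 if every bit is already set.
 */
static size_t bitmap_sfu_sketch(group_t *groups, size_t ngroups) {
    assert(groups != NULL);
    for (size_t g = 0; g < ngroups; g++) {
        if (groups[g] == ~(group_t)0) continue; /* group is full */
        for (size_t bit = 0; bit < GROUP_NBITS; bit++) {
            group_t mask = (group_t)1 << bit;
            if ((groups[g] & mask) == 0) {
                groups[g] |= mask;
                return g * GROUP_NBITS + bit;
            }
        }
    }
    return (size_t)-1;
}
```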
515*8e33eff8Schristos
516*8e33eff8Schristos* 4.1.0 (February 28, 2016)
517*8e33eff8Schristos
518*8e33eff8Schristos  This release is primarily about optimizations, but it also incorporates a lot
519*8e33eff8Schristos  of portability-motivated refactoring and enhancements.  Many people worked on
520*8e33eff8Schristos  this release.  Even with minor changes omitted here (see git revision
521*8e33eff8Schristos  history), and without listing the people who reported and diagnosed issues,
522*8e33eff8Schristos  so much of the work was contributed that, starting with this release,
523*8e33eff8Schristos  changes are annotated with author credits to help reflect the collaborative
524*8e33eff8Schristos  effort involved.
525*8e33eff8Schristos
526*8e33eff8Schristos  New features:
527*8e33eff8Schristos  - Implement decay-based unused dirty page purging, a major optimization with
528*8e33eff8Schristos    mallctl API impact.  This is an alternative to the existing ratio-based
529*8e33eff8Schristos    unused dirty page purging, and is intended to eventually become the sole
530*8e33eff8Schristos    purging mechanism.  New mallctls:
531*8e33eff8Schristos    + opt.purge
532*8e33eff8Schristos    + opt.decay_time
533*8e33eff8Schristos    + arena.<i>.decay
534*8e33eff8Schristos    + arena.<i>.decay_time
535*8e33eff8Schristos    + arenas.decay_time
536*8e33eff8Schristos    + stats.arenas.<i>.decay_time
537*8e33eff8Schristos    (@jasone, @cevans87)
538*8e33eff8Schristos  - Add --with-malloc-conf, which makes it possible to embed a default
539*8e33eff8Schristos    options string during configuration.  This was motivated by the desire to
540*8e33eff8Schristos    specify --with-malloc-conf=purge:decay, since the default must remain
541*8e33eff8Schristos    purge:ratio until the 5.0.0 release.  (@jasone)
542*8e33eff8Schristos  - Add MS Visual Studio 2015 support.  (@rustyx, @yuslepukhin)
543*8e33eff8Schristos  - Make *allocx() size class overflow behavior defined.  The maximum
544*8e33eff8Schristos    size class is now less than PTRDIFF_MAX to protect applications against
545*8e33eff8Schristos    numerical overflow, and all allocation functions are guaranteed to indicate
546*8e33eff8Schristos    errors rather than potentially crashing if the request size exceeds the
547*8e33eff8Schristos    maximum size class.  (@jasone)
548*8e33eff8Schristos  - jeprof:
549*8e33eff8Schristos    + Add raw heap profile support.  (@jasone)
550*8e33eff8Schristos    + Add --retain and --exclude for backtrace symbol filtering.  (@jasone)
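
  Besides --with-malloc-conf at configure time, an application can request the
  same settings by defining the malloc_conf global itself.  The fragment below
  is a sketch that assumes an unprefixed jemalloc 4.1.x build (prefixed builds
  rename the symbol) and uses an illustrative decay time of 10 seconds.

```c
/*
 * Assumption: linked against an unprefixed jemalloc 4.1.x.  jemalloc reads
 * this string during initialization; the option names mirror the opt.purge
 * and opt.decay_time mallctls listed above.
 */
const char *malloc_conf = "purge:decay,decay_time:10";
```

  The same string can also be supplied through the MALLOC_CONF environment
  variable, which avoids rebuilding the application.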
551*8e33eff8Schristos
552*8e33eff8Schristos  Optimizations:
553*8e33eff8Schristos  - Optimize the fast path to combine various bootstrapping and configuration
554*8e33eff8Schristos    checks and execute more streamlined code in the common case.  (@interwq)
555*8e33eff8Schristos  - Use linear scan for small bitmaps (used for small object tracking).  In
556*8e33eff8Schristos    addition to speeding up bitmap operations on 64-bit systems, this reduces
557*8e33eff8Schristos    allocator metadata overhead by approximately 0.2%.  (@djwatson)
558*8e33eff8Schristos  - Separate arena_avail trees, which substantially speeds up run tree
559*8e33eff8Schristos    operations.  (@djwatson)
560*8e33eff8Schristos  - Use memoization (boot-time-computed table) for run quantization.  Separate
561*8e33eff8Schristos    arena_avail trees reduced the importance of this optimization.  (@jasone)
562*8e33eff8Schristos  - Attempt mmap-based in-place huge reallocation.  This can dramatically speed
563*8e33eff8Schristos    up incremental huge reallocation.  (@jasone)
564*8e33eff8Schristos
565*8e33eff8Schristos  Incompatible changes:
566*8e33eff8Schristos  - Make opt.narenas unsigned rather than size_t.  (@jasone)
567*8e33eff8Schristos
568*8e33eff8Schristos  Bug fixes:
569*8e33eff8Schristos  - Fix stats.cactive accounting regression.  (@rustyx, @jasone)
570*8e33eff8Schristos  - Handle unaligned keys in hash().  This caused problems for some ARM systems.
571*8e33eff8Schristos    (@jasone, @cferris1000)
572*8e33eff8Schristos  - Refactor arenas array.  In addition to fixing a fork-related deadlock, this
573*8e33eff8Schristos    makes arena lookups faster and simpler.  (@jasone)
574*8e33eff8Schristos  - Move retained memory allocation out of the default chunk allocation
575*8e33eff8Schristos    function, to a location that gets executed even if the application installs
576*8e33eff8Schristos    a custom chunk allocation function.  This resolves a virtual memory leak.
577*8e33eff8Schristos    (@buchgr)
578*8e33eff8Schristos  - Fix a potential tsd cleanup leak.  (@cferris1000, @jasone)
579*8e33eff8Schristos  - Fix run quantization.  In practice this bug had no impact unless
580*8e33eff8Schristos    applications requested memory with alignment exceeding one page.
581*8e33eff8Schristos    (@jasone, @djwatson)
582*8e33eff8Schristos  - Fix LinuxThreads-specific bootstrapping deadlock.  (Cosmin Paraschiv)
583*8e33eff8Schristos  - jeprof:
584*8e33eff8Schristos    + Don't discard curl options if timeout is not defined.  (@djwatson)
585*8e33eff8Schristos    + Detect failed profile fetches.  (@djwatson)
586*8e33eff8Schristos  - Fix stats.arenas.<i>.{dss,lg_dirty_mult,decay_time,pactive,pdirty} for
587*8e33eff8Schristos    --disable-stats case.  (@jasone)
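
  The hash() item above deals with keys that are not aligned to the word size
  being read.  A common portable technique, shown here as a sketch rather than
  as jemalloc's actual fix, is to load each block through memcpy() so that the
  compiler emits whatever access the target supports.  The toy mixing function
  and both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read 32 bits from a possibly unaligned address without a misaligned load. */
static uint32_t read_u32_unaligned(const void *p) {
    uint32_t v;
    memcpy(&v, p, sizeof(v));
    return v;
}

/* Toy block-wise hash (FNV-style constants); for illustration only. */
static uint32_t hash_blocks_sketch(const void *key, size_t len) {
    assert(key != NULL || len == 0);
    const unsigned char *p = key;
    uint32_t h = 2166136261u;
    for (; len >= 4; p += 4, len -= 4) {
        h = (h ^ read_u32_unaligned(p)) * 16777619u;
    }
    for (; len > 0; p++, len--) {
        h = (h ^ *p) * 16777619u; /* tail bytes */
    }
    return h;
}
```

  Because the result depends only on the bytes of the key, the same key hashes
  identically at any address.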
588*8e33eff8Schristos
589*8e33eff8Schristos* 4.0.4 (October 24, 2015)
590*8e33eff8Schristos
591*8e33eff8Schristos  This bugfix release fixes another xallocx() regression.  No other regressions
592*8e33eff8Schristos  have come to light in over a month, so this is likely a good starting point
593*8e33eff8Schristos  for people who prefer to wait for "dot one" releases with all the major issues
594*8e33eff8Schristos  shaken out.
595*8e33eff8Schristos
596*8e33eff8Schristos  Bug fixes:
597*8e33eff8Schristos  - Fix xallocx(..., MALLOCX_ZERO) to zero the last full trailing page of large
598*8e33eff8Schristos    allocations that have been randomly assigned an offset of 0 when the
599*8e33eff8Schristos    --enable-cache-oblivious configure option is enabled.
600*8e33eff8Schristos
601*8e33eff8Schristos* 4.0.3 (September 24, 2015)
602*8e33eff8Schristos
603*8e33eff8Schristos  This bugfix release continues the trend of xallocx() and heap profiling fixes.
604*8e33eff8Schristos
605*8e33eff8Schristos  Bug fixes:
606*8e33eff8Schristos  - Fix xallocx(..., MALLOCX_ZERO) to zero all trailing bytes of large
607*8e33eff8Schristos    allocations when the --enable-cache-oblivious configure option is enabled.
608*8e33eff8Schristos  - Fix xallocx(..., MALLOCX_ZERO) to zero trailing bytes of huge allocations
609*8e33eff8Schristos    when resizing from/to a size class that is not a multiple of the chunk size.
610*8e33eff8Schristos  - Fix prof_tctx_dump_iter() to filter out nodes that were created after heap
611*8e33eff8Schristos    profile dumping started.
612*8e33eff8Schristos  - Work around a potentially bad thread-specific data initialization
613*8e33eff8Schristos    interaction with NPTL (glibc's pthreads implementation).
614*8e33eff8Schristos
615*8e33eff8Schristos* 4.0.2 (September 21, 2015)
616*8e33eff8Schristos
617*8e33eff8Schristos  This bugfix release addresses a few bugs specific to heap profiling.
618*8e33eff8Schristos
619*8e33eff8Schristos  Bug fixes:
620*8e33eff8Schristos  - Fix ixallocx_prof_sample() to never modify nor create sampled small
621*8e33eff8Schristos    allocations.  xallocx() is in general incapable of moving small allocations,
622*8e33eff8Schristos    so this fix removes buggy code without loss of generality.
623*8e33eff8Schristos  - Fix irallocx_prof_sample() to always allocate large regions, even when
624*8e33eff8Schristos    alignment is non-zero.
625*8e33eff8Schristos  - Fix prof_alloc_rollback() to read tdata from thread-specific data rather
626*8e33eff8Schristos    than dereferencing a potentially invalid tctx.
627*8e33eff8Schristos
628*8e33eff8Schristos* 4.0.1 (September 15, 2015)
629*8e33eff8Schristos
630*8e33eff8Schristos  This is a bugfix release that is somewhat high risk due to the amount of
631*8e33eff8Schristos  refactoring required to address deep xallocx() problems.  As a side effect of
632*8e33eff8Schristos  these fixes, xallocx() now tries harder to partially fulfill requests for
633*8e33eff8Schristos  optional extra space.  Note that a couple of minor heap profiling
634*8e33eff8Schristos  optimizations are included, but these are better thought of as performance
635*8e33eff8Schristos  fixes that were integral to discovering most of the other bugs.
636*8e33eff8Schristos
637*8e33eff8Schristos  Optimizations:
638*8e33eff8Schristos  - Avoid a chunk metadata read in arena_prof_tctx_set(), since it is in the
639*8e33eff8Schristos    fast path when heap profiling is enabled.  Additionally, split a special
640*8e33eff8Schristos    case out into arena_prof_tctx_reset(), which also avoids chunk metadata
641*8e33eff8Schristos    reads.
642*8e33eff8Schristos  - Optimize irallocx_prof() to optimistically update the sampler state.  The
643*8e33eff8Schristos    prior implementation appears to have been a holdover from when
644*8e33eff8Schristos    rallocx()/xallocx() functionality was combined as rallocm().
645*8e33eff8Schristos
646*8e33eff8Schristos  Bug fixes:
647*8e33eff8Schristos  - Fix TLS configuration such that it is enabled by default for platforms on
648*8e33eff8Schristos    which it works correctly.
649*8e33eff8Schristos  - Fix arenas_cache_cleanup() and arena_get_hard() to handle
650*8e33eff8Schristos    allocation/deallocation within the application's thread-specific data
651*8e33eff8Schristos    cleanup functions even after arenas_cache is torn down.
652*8e33eff8Schristos  - Fix xallocx() bugs related to size+extra exceeding HUGE_MAXCLASS.
653*8e33eff8Schristos  - Fix chunk purge hook calls for in-place huge shrinking reallocation to
654*8e33eff8Schristos    specify the old chunk size rather than the new chunk size.  This bug caused
655*8e33eff8Schristos    no correctness issues for the default chunk purge function, but was
656*8e33eff8Schristos    visible to custom functions set via the "arena.<i>.chunk_hooks" mallctl.
657*8e33eff8Schristos  - Fix heap profiling bugs:
658*8e33eff8Schristos    + Fix heap profiling to distinguish among otherwise identical sample sites
659*8e33eff8Schristos      with interposed resets (triggered via the "prof.reset" mallctl).  This bug
660*8e33eff8Schristos      could cause data structure corruption that would most likely result in a
661*8e33eff8Schristos      segfault.
662*8e33eff8Schristos    + Fix irealloc_prof() to call prof_alloc_rollback() on OOM.
663*8e33eff8Schristos    + Make one call to prof_active_get_unlocked() per allocation event, and use
664*8e33eff8Schristos      the result throughout the relevant functions that handle an allocation
665*8e33eff8Schristos      event.  Also add a missing check in prof_realloc().  These fixes protect
666*8e33eff8Schristos      allocation events against concurrent prof_active changes.
667*8e33eff8Schristos    + Fix ixallocx_prof() to pass usize_max and zero to ixallocx_prof_sample()
668*8e33eff8Schristos      in the correct order.
669*8e33eff8Schristos    + Fix prof_realloc() to call prof_free_sampled_object() after calling
670*8e33eff8Schristos      prof_malloc_sample_object().  Prior to this fix, if tctx and old_tctx were
671*8e33eff8Schristos      the same, the tctx could have been prematurely destroyed.
672*8e33eff8Schristos  - Fix portability bugs:
673*8e33eff8Schristos    + Don't bitshift by negative amounts when encoding/decoding run sizes in
674*8e33eff8Schristos      chunk header maps.  This affected systems with page sizes greater than 8
675*8e33eff8Schristos      KiB.
676*8e33eff8Schristos    + Rename index_t to szind_t to avoid an existing type on Solaris.
677*8e33eff8Schristos    + Add JEMALLOC_CXX_THROW to the memalign() function prototype, in order to
678*8e33eff8Schristos      match glibc and avoid compilation errors when including both
679*8e33eff8Schristos      jemalloc/jemalloc.h and malloc.h in C++ code.
680*8e33eff8Schristos    + Don't assume that /bin/sh is appropriate when running size_classes.sh
681*8e33eff8Schristos      during configuration.
682*8e33eff8Schristos    + Consider __sparcv9 a synonym for __sparc64__ when defining LG_QUANTUM.
683*8e33eff8Schristos    + Link tests to librt if it contains clock_gettime(2).
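
  The negative bitshift item above is an instance of a general C hazard:
  shifting by a negative amount is undefined behavior.  When a shift distance
  is the difference of two quantities, the direction has to be chosen
  explicitly.  In this sketch MAP_RUNIND_SHIFT and both function names are
  hypothetical; the value 13 is picked only to mirror the 8 KiB page size
  threshold mentioned above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bit position at which a page index is stored in a map word. */
#define MAP_RUNIND_SHIFT 13u

/*
 * Encode a page-aligned size.  The shift distance is
 * MAP_RUNIND_SHIFT - lg_page, which is negative once pages exceed 8 KiB, so
 * shift left or right depending on the sign instead.
 */
static size_t map_size_encode(size_t size, unsigned lg_page) {
    assert((size & (((size_t)1 << lg_page) - 1)) == 0); /* page aligned */
    if (MAP_RUNIND_SHIFT >= lg_page) {
        return size << (MAP_RUNIND_SHIFT - lg_page);
    }
    return size >> (lg_page - MAP_RUNIND_SHIFT);
}

/* Inverse of map_size_encode(). */
static size_t map_size_decode(size_t bits, unsigned lg_page) {
    if (MAP_RUNIND_SHIFT >= lg_page) {
        return bits >> (MAP_RUNIND_SHIFT - lg_page);
    }
    return bits << (lg_page - MAP_RUNIND_SHIFT);
}
```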
684*8e33eff8Schristos
685*8e33eff8Schristos* 4.0.0 (August 17, 2015)
686*8e33eff8Schristos
687*8e33eff8Schristos  This version contains many speed and space optimizations, both minor and
688*8e33eff8Schristos  major.  The major themes are generalization, unification, and simplification.
689*8e33eff8Schristos  Although many of these optimizations cause no visible behavior change, their
690*8e33eff8Schristos  cumulative effect is substantial.
691*8e33eff8Schristos
692*8e33eff8Schristos  New features:
693*8e33eff8Schristos  - Normalize size class spacing to be consistent across the complete size
694*8e33eff8Schristos    range.  By default there are four size classes per size doubling, but this
695*8e33eff8Schristos    is now configurable via the --with-lg-size-class-group option.  Also add the
696*8e33eff8Schristos    --with-lg-page, --with-lg-page-sizes, --with-lg-quantum, and
697*8e33eff8Schristos    --with-lg-tiny-min options, which can be used to tweak page and size class
698*8e33eff8Schristos    settings.  Impacts:
699*8e33eff8Schristos    + Worst case performance for incrementally growing/shrinking reallocation
700*8e33eff8Schristos      is improved because there are far fewer size classes, and therefore
701*8e33eff8Schristos      copying happens less often.
702*8e33eff8Schristos    + Internal fragmentation is limited to 20% for all but the smallest size
703*8e33eff8Schristos      classes (those less than four times the quantum).  (1B + 4 KiB)
704*8e33eff8Schristos      and (1B + 4 MiB) previously suffered nearly 50% internal fragmentation.
705*8e33eff8Schristos    + Chunk fragmentation tends to be lower because there are fewer distinct run
706*8e33eff8Schristos      sizes to pack.
707*8e33eff8Schristos  - Add support for explicit tcaches.  The "tcache.create", "tcache.flush", and
708*8e33eff8Schristos    "tcache.destroy" mallctls control tcache lifetime and flushing, and the
709*8e33eff8Schristos    MALLOCX_TCACHE(tc) and MALLOCX_TCACHE_NONE flags to the *allocx() API
710*8e33eff8Schristos    control which tcache is used for each operation.
711*8e33eff8Schristos  - Implement per thread heap profiling, as well as the ability to
712*8e33eff8Schristos    enable/disable heap profiling on a per thread basis.  Add the "prof.reset",
713*8e33eff8Schristos    "prof.lg_sample", "thread.prof.name", "thread.prof.active",
714*8e33eff8Schristos    "opt.prof_thread_active_init", and "prof.thread_active_init"
715*8e33eff8Schristos    mallctls.
716*8e33eff8Schristos  - Add support for per arena application-specified chunk allocators, configured
717*8e33eff8Schristos    via the "arena.<i>.chunk_hooks" mallctl.
718*8e33eff8Schristos  - Refactor huge allocation to be managed by arenas, so that arenas now
719*8e33eff8Schristos    function as general purpose independent allocators.  This is important in
720*8e33eff8Schristos    the context of user-specified chunk allocators, aside from the scalability
721*8e33eff8Schristos    benefits.  Related new statistics:
722*8e33eff8Schristos    + The "stats.arenas.<i>.huge.allocated", "stats.arenas.<i>.huge.nmalloc",
723*8e33eff8Schristos      "stats.arenas.<i>.huge.ndalloc", and "stats.arenas.<i>.huge.nrequests"
724*8e33eff8Schristos      mallctls provide high level per arena huge allocation statistics.
725*8e33eff8Schristos    + The "arenas.nhchunks", "arenas.hchunk.<i>.size",
726*8e33eff8Schristos      "stats.arenas.<i>.hchunks.<j>.nmalloc",
727*8e33eff8Schristos      "stats.arenas.<i>.hchunks.<j>.ndalloc",
728*8e33eff8Schristos      "stats.arenas.<i>.hchunks.<j>.nrequests", and
729*8e33eff8Schristos      "stats.arenas.<i>.hchunks.<j>.curhchunks" mallctls provide per size class
730*8e33eff8Schristos      statistics.
731*8e33eff8Schristos  - Add the 'util' column to malloc_stats_print() output, which reports the
732*8e33eff8Schristos    proportion of available regions that are currently in use for each small
733*8e33eff8Schristos    size class.
734*8e33eff8Schristos  - Add "alloc" and "free" modes for junk filling (see the "opt.junk"
735*8e33eff8Schristos    mallctl), so that it is possible to separately enable junk filling for
736*8e33eff8Schristos    allocation versus deallocation.
737*8e33eff8Schristos  - Add the jemalloc-config script, which provides information about how
738*8e33eff8Schristos    jemalloc was configured, and how to integrate it into application builds.
739*8e33eff8Schristos  - Add metadata statistics, which are accessible via the "stats.metadata",
740*8e33eff8Schristos    "stats.arenas.<i>.metadata.mapped", and
741*8e33eff8Schristos    "stats.arenas.<i>.metadata.allocated" mallctls.
742*8e33eff8Schristos  - Add the "stats.resident" mallctl, which reports the upper limit of
743*8e33eff8Schristos    physically resident memory mapped by the allocator.
744*8e33eff8Schristos  - Add per arena control over unused dirty page purging, via the
745*8e33eff8Schristos    "arenas.lg_dirty_mult", "arena.<i>.lg_dirty_mult", and
746*8e33eff8Schristos    "stats.arenas.<i>.lg_dirty_mult" mallctls.
747*8e33eff8Schristos  - Add the "prof.gdump" mallctl, which makes it possible to toggle the gdump
748*8e33eff8Schristos    feature on/off during program execution.
749*8e33eff8Schristos  - Add sdallocx(), which implements sized deallocation.  The primary
750*8e33eff8Schristos    optimization over dallocx() is the removal of a metadata read, which often
751*8e33eff8Schristos    suffers an L1 cache miss.
752*8e33eff8Schristos  - Add missing header includes in jemalloc/jemalloc.h, so that applications
753*8e33eff8Schristos    only have to #include <jemalloc/jemalloc.h>.
754*8e33eff8Schristos  - Add support for additional platforms:
755*8e33eff8Schristos    + Bitrig
756*8e33eff8Schristos    + Cygwin
757*8e33eff8Schristos    + DragonFlyBSD
758*8e33eff8Schristos    + iOS
759*8e33eff8Schristos    + OpenBSD
760*8e33eff8Schristos    + OpenRISC/or1k
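
  The size class spacing described in the first item of this list can be
  checked with a little arithmetic.  The sketch below assumes a 16 byte
  quantum and four classes per doubling, and it ignores tiny classes; the
  function name is hypothetical and this is not how jemalloc computes size
  classes internally.  Within a doubling (2^(k-1), 2^k] the spacing is
  2^(k-3), so the worst case waste is just under one part in five.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Round a request up to its size class, assuming a 16 byte quantum and four
 * classes per size doubling.  Requests up to four times the quantum are
 * rounded to a multiple of the quantum.
 */
static size_t size_class_sketch(size_t size) {
    assert(size > 0 && size <= (SIZE_MAX >> 1));
    if (size <= 64) {
        return (size + 15) & ~(size_t)15;
    }
    /* Smallest k such that 2^k >= size. */
    unsigned lg_ceil = 7;
    while (((size_t)1 << lg_ceil) < size) {
        lg_ceil++;
    }
    /* Spacing within (2^(k-1), 2^k] is one quarter of 2^(k-1). */
    size_t delta = (size_t)1 << (lg_ceil - 3);
    return (size + delta - 1) & ~(delta - 1);
}
```

  Under these assumptions a request of (1B + 4 KiB) lands in the 5 KiB class
  rather than being rounded up to 8 KiB, and (1B + 4 MiB) lands in the 5 MiB
  class.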
761*8e33eff8Schristos
762*8e33eff8Schristos  Optimizations:
763*8e33eff8Schristos  - Maintain dirty runs in per arena LRUs rather than in per arena trees of
764*8e33eff8Schristos    dirty-run-containing chunks.  In practice this change significantly reduces
765*8e33eff8Schristos    dirty page purging volume.
766*8e33eff8Schristos  - Integrate whole chunks into the unused dirty page purging machinery.  This
767*8e33eff8Schristos    reduces the cost of repeated huge allocation/deallocation, because it
768*8e33eff8Schristos    effectively introduces a cache of chunks.
769*8e33eff8Schristos  - Split the arena chunk map into two separate arrays, in order to increase
770*8e33eff8Schristos    cache locality for the frequently accessed bits.
771*8e33eff8Schristos  - Move small run metadata out of runs, into arena chunk headers.  This reduces
772*8e33eff8Schristos    run fragmentation, smaller runs reduce external fragmentation for small size
773*8e33eff8Schristos    classes, and packed (less uniformly aligned) metadata layout improves CPU
774*8e33eff8Schristos    cache set distribution.
775*8e33eff8Schristos  - Randomly distribute large allocation base pointer alignment relative to page
776*8e33eff8Schristos    boundaries in order to more uniformly utilize CPU cache sets.  This can be
777*8e33eff8Schristos    disabled via the --disable-cache-oblivious configure option, and queried via
778*8e33eff8Schristos    the "config.cache_oblivious" mallctl.
779*8e33eff8Schristos  - Micro-optimize the fast paths for the public API functions.
780*8e33eff8Schristos  - Refactor thread-specific data to reside in a single structure.  This ensures
781*8e33eff8Schristos    that only a single TLS read is necessary per call into the public API.
782*8e33eff8Schristos  - Implement in-place huge allocation growing and shrinking.
783*8e33eff8Schristos  - Refactor rtree (radix tree for chunk lookups) to be lock-free, and make
784*8e33eff8Schristos    additional optimizations that reduce maximum lookup depth to one or two
785*8e33eff8Schristos    levels.  This resolves what was a concurrency bottleneck for per arena huge
786*8e33eff8Schristos    allocation, because a global data structure is critical for determining
787*8e33eff8Schristos    which arenas own which huge allocations.
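
  The cache-oblivious item above spreads large allocation base addresses
  across CPU cache sets.  As a rough sketch of the idea, one can pick a random
  cache-line-aligned offset within a page.  The page size, cache line size,
  PRNG, and function names below are all assumptions; the ChangeLog does not
  state which generator or granularity jemalloc uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE      4096u /* assumed page size */
#define SKETCH_CACHELINE 64u   /* assumed cache line size */

/* Minimal xorshift32 generator; state must be nonzero. */
static uint32_t prng_next(uint32_t *state) {
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/* Pick a cache-line-aligned offset in [0, SKETCH_PAGE). */
static size_t random_line_offset(uint32_t *state) {
    assert(*state != 0);
    uint32_t nlines = SKETCH_PAGE / SKETCH_CACHELINE;
    return (size_t)(prng_next(state) % nlines) * SKETCH_CACHELINE;
}
```

  An offset of 0 remains a possible outcome, which is the case the 4.0.4
  xallocx() fix above had to handle.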
788*8e33eff8Schristos
789*8e33eff8Schristos  Incompatible changes:
790*8e33eff8Schristos  - Replace --enable-cc-silence with --disable-cc-silence to suppress spurious
791*8e33eff8Schristos    warnings by default.
792*8e33eff8Schristos  - Ensure that the constness of malloc_usable_size()'s return type matches that
793*8e33eff8Schristos    of the system implementation.
794*8e33eff8Schristos  - Change the heap profile dump format to support per thread heap profiling,
795*8e33eff8Schristos    rename pprof to jeprof, and enhance it with the --thread=<n> option.  As a
796*8e33eff8Schristos    result, the bundled jeprof must now be used rather than the upstream
797*8e33eff8Schristos    (gperftools) pprof.
798*8e33eff8Schristos  - Disable "opt.prof_final" by default, in order to avoid atexit(3), which can
799*8e33eff8Schristos    internally deadlock on some platforms.
800*8e33eff8Schristos  - Change the "arenas.nlruns" mallctl type from size_t to unsigned.
801*8e33eff8Schristos  - Replace the "stats.arenas.<i>.bins.<j>.allocated" mallctl with
802*8e33eff8Schristos    "stats.arenas.<i>.bins.<j>.curregs".
803*8e33eff8Schristos  - Ignore MALLOC_CONF in set{uid,gid,cap} binaries.
804*8e33eff8Schristos  - Ignore MALLOCX_ARENA(a) in dallocx(), in favor of using the
805*8e33eff8Schristos    MALLOCX_TCACHE(tc) and MALLOCX_TCACHE_NONE flags to control tcache usage.
806*8e33eff8Schristos
807*8e33eff8Schristos  Removed features:
808*8e33eff8Schristos  - Remove the *allocm() API, which is superseded by the *allocx() API.
809*8e33eff8Schristos  - Remove the --enable-dss option, and make dss non-optional on all platforms
810*8e33eff8Schristos    which support sbrk(2).
811*8e33eff8Schristos  - Remove the "arenas.purge" mallctl, which was obsoleted by the
812*8e33eff8Schristos    "arena.<i>.purge" mallctl in 3.1.0.
813*8e33eff8Schristos  - Remove the unnecessary "opt.valgrind" mallctl; jemalloc automatically
814*8e33eff8Schristos    detects whether it is running inside Valgrind.
815*8e33eff8Schristos  - Remove the "stats.huge.allocated", "stats.huge.nmalloc", and
816*8e33eff8Schristos    "stats.huge.ndalloc" mallctls.
817*8e33eff8Schristos  - Remove the --enable-mremap option.
818*8e33eff8Schristos  - Remove the "stats.chunks.current", "stats.chunks.total", and
819*8e33eff8Schristos    "stats.chunks.high" mallctls.
820*8e33eff8Schristos
821*8e33eff8Schristos  Bug fixes:
822*8e33eff8Schristos  - Fix the cactive statistic to decrease (rather than increase) when active
823*8e33eff8Schristos    memory decreases.  This regression was first released in 3.5.0.
824*8e33eff8Schristos  - Fix OOM handling in memalign() and valloc().  A variant of this bug existed
825*8e33eff8Schristos    in all releases since 2.0.0, which introduced these functions.
826*8e33eff8Schristos  - Fix an OOM-related regression in arena_tcache_fill_small(), which could
827*8e33eff8Schristos    cause cache corruption on OOM.  This regression was present in all releases
828*8e33eff8Schristos    from 2.2.0 through 3.6.0.
829*8e33eff8Schristos  - Fix size class overflow handling for malloc(), posix_memalign(), memalign(),
830*8e33eff8Schristos    calloc(), and realloc() when profiling is enabled.
831*8e33eff8Schristos  - Fix the "arena.<i>.dss" mallctl to return an error if "primary" or
832*8e33eff8Schristos    "secondary" precedence is specified, but sbrk(2) is not supported.
833*8e33eff8Schristos  - Fix fallback lg_floor() implementations to handle extremely large inputs.
834*8e33eff8Schristos  - Ensure the default purgeable zone is after the default zone on OS X.
835*8e33eff8Schristos  - Fix latent bugs in atomic_*().
836*8e33eff8Schristos  - Fix the "arena.<i>.dss" mallctl to handle read-only calls.
837*8e33eff8Schristos  - Fix tls_model configuration to enable the initial-exec model when possible.
838*8e33eff8Schristos  - Mark malloc_conf as a weak symbol so that the application can override it.
839*8e33eff8Schristos  - Correctly detect glibc's adaptive pthread mutexes.
840*8e33eff8Schristos  - Fix the --without-export configure option.
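
  Regarding the lg_floor() item above: the ChangeLog does not say how the
  fallback implementations failed, but a formulation that only ever shifts
  right cannot overflow for any input.  The following portable sketch (the
  name is hypothetical) illustrates that property; it is not jemalloc's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Floor of the base 2 logarithm of x, for x > 0.  Only shifts right. */
static unsigned lg_floor_sketch(size_t x) {
    assert(x != 0);
    unsigned lg = 0;
    while (x >>= 1) {
        lg++;
    }
    return lg;
}
```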
841*8e33eff8Schristos
842*8e33eff8Schristos* 3.6.0 (March 31, 2014)
843*8e33eff8Schristos
844*8e33eff8Schristos  This version contains a critical bug fix for a regression present in 3.5.0 and
845*8e33eff8Schristos  3.5.1.
846*8e33eff8Schristos
847*8e33eff8Schristos  Bug fixes:
848*8e33eff8Schristos  - Fix a regression in arena_chunk_alloc() that caused crashes during
849*8e33eff8Schristos    small/large allocation if chunk allocation failed.  In the absence of this
850*8e33eff8Schristos    bug, chunk allocation failure would result in allocation failure, e.g. NULL
851*8e33eff8Schristos    return from malloc().  This regression was introduced in 3.5.0.
852*8e33eff8Schristos  - Fix gcc intrinsics-based backtracing by specifying
853*8e33eff8Schristos    -fno-omit-frame-pointer to gcc.  Note that the application (and all the
854*8e33eff8Schristos    libraries it links to) must also be compiled with this option for
855*8e33eff8Schristos    backtracing to be reliable.
856*8e33eff8Schristos  - Use dss allocation precedence for huge allocations as well as small/large
857*8e33eff8Schristos    allocations.
858*8e33eff8Schristos  - Fix test assertion failure message formatting.  This bug did not manifest on
859*8e33eff8Schristos    x86_64 systems because of implementation subtleties in va_list.
860*8e33eff8Schristos  - Fix inconsequential test failures for hash and SFMT code.
861*8e33eff8Schristos
862*8e33eff8Schristos  New features:
863*8e33eff8Schristos  - Support heap profiling on FreeBSD.  This feature depends on the proc
864*8e33eff8Schristos    filesystem being mounted during heap profile dumping.
865*8e33eff8Schristos
866*8e33eff8Schristos* 3.5.1 (February 25, 2014)
867*8e33eff8Schristos
868*8e33eff8Schristos  This version primarily addresses minor bugs in test code.
869*8e33eff8Schristos
870*8e33eff8Schristos  Bug fixes:
  - Configure Solaris/Illumos to use MADV_FREE.
  - Fix junk filling for mremap(2)-based huge reallocation.  This is only
    relevant if configuring with the --enable-mremap option specified.
  - Avoid compilation failure if 'restrict' C99 keyword is not supported by the
    compiler.
  - Add a configure test for SSE2 rather than assuming it is usable on i686
    systems.  This fixes test compilation errors, especially on 32-bit Linux
    systems.
  - Fix mallctl argument size mismatches (size_t vs. uint64_t) in the stats unit
    test.
  - Fix/remove flawed alignment-related overflow tests.
  - Prevent compiler optimizations that could change backtraces in the
    prof_accum unit test.

* 3.5.0 (January 22, 2014)

  This version focuses on refactoring and automated testing, though it also
  includes some non-trivial heap profiling optimizations not mentioned below.

  New features:
  - Add the *allocx() API, which is a successor to the experimental *allocm()
    API.  The *allocx() functions are slightly simpler to use because they have
    fewer parameters, they directly return the results of primary interest, and
    mallocx()/rallocx() avoid the strict aliasing pitfall that
    allocm()/rallocm() share with posix_memalign().  Note that *allocm() is
    slated for removal in the next non-bugfix release.
  - Add support for LinuxThreads.
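
  The strict aliasing pitfall mentioned above comes from the out-parameter
  style: posix_memalign() stores its result through a void **, so callers are
  tempted to pass (void **)&typed_pointer, which writes a typed pointer object
  through a void * lvalue.  A function that returns the pointer directly, as
  mallocx() does, needs no such cast.  A minimal libc-only sketch of the idea;
  alloc_aligned() and alloc_doubles() are hypothetical helpers, not jemalloc
  functions:

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not a jemalloc function): return aligned memory
 * directly instead of through a void ** out-parameter.
 */
static void *
alloc_aligned(size_t alignment, size_t size)
{
	void *p = NULL;	/* A genuine void *, so &p really is a void **. */

	/* posix_memalign() needs a power-of-two multiple of sizeof(void *). */
	assert(alignment >= sizeof(void *) &&
	    (alignment & (alignment - 1)) == 0);
	if (posix_memalign(&p, alignment, size) != 0)
		return NULL;
	assert(((uintptr_t)p & (alignment - 1)) == 0);
	return p;
}

/* Allocate n doubles on a 64-byte boundary without casting &d to void **. */
static double *
alloc_doubles(size_t n)
{
	/* Pitfall avoided: posix_memalign((void **)&d, ...) with double *d. */
	return alloc_aligned(64, n * sizeof(double));
}
```

  A caller writes double *d = alloc_doubles(8); and later free(d); the typed
  pointer is only ever assigned from a return value.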

  Bug fixes:
  - Unless heap profiling is enabled, disable floating point code and don't link
    with libm.  This, in combination with e.g. EXTRA_CFLAGS=-mno-sse on x64
    systems, makes it possible to completely disable floating point register
    use.  Some versions of glibc neglect to save/restore caller-saved floating
    point registers during dynamic lazy symbol loading, and the symbol loading
    code uses whatever malloc the application happens to have linked/loaded
    with, the result being potential floating point register corruption.
  - Report ENOMEM rather than EINVAL if an OOM occurs during heap profiling
    backtrace creation in imemalign().  This bug impacted posix_memalign() and
    aligned_alloc().
  - Fix a file descriptor leak in a prof_dump_maps() error path.
  - Fix prof_dump() to close the dump file descriptor for all relevant error
    paths.
  - Fix rallocm() to use the arena specified by the ALLOCM_ARENA(s) flag for
    allocation, not just deallocation.
  - Fix a data race for large allocation stats counters.
  - Fix a potential infinite loop during thread exit.  This bug occurred on
    Solaris, and could affect other platforms with similar pthreads TSD
    implementations.
  - Don't junk-fill reallocations unless usable size changes.  This fixes a
    violation of the *allocx()/*allocm() semantics.
  - Fix growing large reallocation to junk fill new space.
  - Fix huge deallocation to junk fill when munmap is disabled.
  - Change the default private namespace prefix from empty to je_, and change
    --with-private-namespace so that it prepends an additional prefix rather
    than replacing je_.  This reduces the likelihood of applications which
    statically link jemalloc experiencing symbol name collisions.
  - Add missing private namespace mangling (relevant when
    --with-private-namespace is specified).
  - Add and use JEMALLOC_INLINE_C so that static inline functions are marked as
    static even for debug builds.
  - Add a missing mutex unlock in a malloc_init_hard() error path.  In practice
    this error path is never executed.
  - Fix numerous bugs in malloc_strtoumax() error handling/reporting.  These
    bugs had no impact except for malformed inputs.
  - Fix numerous bugs in malloc_snprintf().  These bugs were not exercised by
    existing calls, so they had no impact.
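
  The effect of the private namespace prefix described above can be sketched
  with preprocessor token pasting.  The macro and function names below
  (PRIVATE_N, arena_thing) are hypothetical and only illustrate how prefixing
  library-internal symbols keeps them from colliding with an application's own
  names in a static link:

```c
#include <assert.h>

/* Hypothetical mangling macros: paste a prefix onto an internal name. */
#define PRIVATE_PREFIX		je_
#define PASTE2(a, b)		a##b
#define PASTE(a, b)		PASTE2(a, b)	/* Expand PRIVATE_PREFIX first. */
#define PRIVATE_N(n)		PASTE(PRIVATE_PREFIX, n)

/* Library-internal code refers to the short name... */
#define arena_thing		PRIVATE_N(arena_thing)

/* ...but the symbol actually defined here is je_arena_thing(). */
int
arena_thing(int x)
{
	assert(x >= 0);
	return x + 1;
}

#undef arena_thing

/* An application's own arena_thing() does not collide with the library's. */
static int
arena_thing(int x)
{
	return x - 1;
}
```

  Both definitions coexist in one translation unit because the first one is
  emitted under the mangled name.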

* 3.4.1 (October 20, 2013)

  Bug fixes:
  - Fix a race in the "arenas.extend" mallctl that could cause memory corruption
    of internal data structures and subsequent crashes.
  - Fix Valgrind integration flaws that caused Valgrind warnings about reads of
    uninitialized memory in:
    + arena chunk headers
    + internal zero-initialized data structures (relevant to tcache and prof
      code)
  - Preserve errno during the first allocation.  A readlink(2) call during
    initialization fails unless /etc/malloc.conf exists, so errno was typically
    set during the first allocation prior to this fix.
  - Fix compilation warnings reported by gcc 4.8.1.
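
  Preserving errno across an internal call that is expected to fail, as in the
  readlink(2) case above, can be sketched as follows.  This is a minimal
  illustration assuming a POSIX system; probe_conf_link() is a hypothetical
  stand-in for the initialization step, not jemalloc's code:

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for an initialization step: try to read a symbolic
 * link that usually does not exist, without letting the failure leak into
 * the caller's errno.
 */
static ssize_t
probe_conf_link(const char *path, char *buf, size_t buflen)
{
	int saved_errno = errno;	/* Remember the caller's value. */
	ssize_t n;

	assert(path != NULL && buf != NULL);
	n = readlink(path, buf, buflen);
	if (n < 0)
		errno = saved_errno;	/* Hide the expected failure. */
	return n;
}
```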

* 3.4.0 (June 2, 2013)

  This version is essentially a small bugfix release, but the addition of
  aarch64 support requires that the minor version be incremented.

  Bug fixes:
  - Fix race-triggered deadlocks in chunk_record().  These deadlocks were
    typically triggered by multiple threads concurrently deallocating huge
    objects.

  New features:
  - Add support for the aarch64 architecture.

* 3.3.1 (March 6, 2013)

  This version fixes bugs that are typically encountered only when utilizing
  custom run-time options.

  Bug fixes:
  - Fix a locking order bug that could cause deadlock during fork if heap
    profiling were enabled.
  - Fix a chunk recycling bug that could cause the allocator to lose track of
    whether a chunk was zeroed.  On FreeBSD, NetBSD, and OS X, it could cause
    corruption if allocating via sbrk(2) (unlikely unless running with the
    "dss:primary" option specified).  This was completely harmless on Linux
    unless using mlockall(2) (and unlikely even then, unless the
    --disable-munmap configure option or the "dss:primary" option was
    specified).  This regression was introduced in 3.1.0 by the
    mlockall(2)/madvise(2) interaction fix.
  - Fix TLS-related memory corruption that could occur during thread exit if the
    thread never allocated memory.  Only the quarantine and prof facilities were
    susceptible.
  - Fix two quarantine bugs:
    + Internal reallocation of the quarantined object array leaked the old
      array.
    + Reallocation failure for internal reallocation of the quarantined object
      array (very unlikely) resulted in memory corruption.
  - Fix Valgrind integration to annotate all internally allocated memory in a
    way that keeps Valgrind happy about internal data structure access.
  - Fix building for s390 systems.

* 3.3.0 (January 23, 2013)

  This version includes a few minor performance improvements in addition to the
  listed new features and bug fixes.

  New features:
  - Add clipping support to lg_chunk option processing.
  - Add the --enable-ivsalloc option.
  - Add the --without-export option.
  - Add the --disable-zone-allocator option.

  Bug fixes:
  - Fix "arenas.extend" mallctl to output the number of arenas.
  - Fix chunk_recycle() to unconditionally inform Valgrind that returned memory
    is undefined.
  - Fix build break on FreeBSD related to alloca.h.

* 3.2.0 (November 9, 2012)

  In addition to a couple of bug fixes, this version modifies page run
  allocation and dirty page purging algorithms in order to better control
  page-level virtual memory fragmentation.

  Incompatible changes:
  - Change the "opt.lg_dirty_mult" default from 5 to 3 (32:1 to 8:1).
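
  The ratios in parentheses follow from the option being a base-2 logarithm of
  the active:dirty page ratio.  A small sketch of the arithmetic, assuming the
  dirty page threshold is the active page count shifted right by the option
  value; the helper names are illustrative and not part of jemalloc:

```c
#include <assert.h>
#include <stddef.h>

/* Active:dirty ratio implied by an lg_dirty_mult value, e.g. 3 -> 8. */
static size_t
dirty_ratio(unsigned lg_dirty_mult)
{
	assert(lg_dirty_mult < sizeof(size_t) * 8);
	return (size_t)1 << lg_dirty_mult;
}

/* Dirty page threshold for a given number of active pages. */
static size_t
dirty_page_limit(size_t nactive, unsigned lg_dirty_mult)
{
	assert(lg_dirty_mult < sizeof(size_t) * 8);
	return nactive >> lg_dirty_mult;
}
```

  Under that assumption, an arena with 4096 active pages goes from a threshold
  of 128 dirty pages (32:1) to 512 dirty pages (8:1).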

  Bug fixes:
  - Fix dss/mmap allocation precedence code to use recyclable mmap memory only
    after primary dss allocation fails.
  - Fix deadlock in the "arenas.purge" mallctl.  This regression was introduced
    in 3.1.0 by the addition of the "arena.<i>.purge" mallctl.

* 3.1.0 (October 16, 2012)

  New features:
  - Auto-detect whether running inside Valgrind, thus removing the need to
    manually specify MALLOC_CONF=valgrind:true.
  - Add the "arenas.extend" mallctl, which allows applications to create
    manually managed arenas.
  - Add the ALLOCM_ARENA() flag for {,r,d}allocm().
  - Add the "opt.dss", "arena.<i>.dss", and "stats.arenas.<i>.dss" mallctls,
    which provide control over dss/mmap precedence.
  - Add the "arena.<i>.purge" mallctl, which obsoletes "arenas.purge".
  - Define LG_QUANTUM for hppa.

  Incompatible changes:
  - Disable tcache by default if running inside Valgrind, in order to avoid
    making unallocated objects appear reachable to Valgrind.
  - Drop const from malloc_usable_size() argument on Linux.

  Bug fixes:
  - Fix heap profiling crash if sampled object is freed via realloc(p, 0).
  - Remove const from __*_hook variable declarations, so that glibc can modify
    them during process forking.
  - Fix mlockall(2)/madvise(2) interaction.
  - Fix fork(2)-related deadlocks.
  - Fix error return value for "thread.tcache.enabled" mallctl.

* 3.0.0 (May 11, 2012)

  Although this version adds some major new features, the primary focus is on
  internal code cleanup that facilitates maintainability and portability, most
  of which is not reflected in the ChangeLog.  This is the first release to
  incorporate substantial contributions from numerous other developers, and the
  result is a more broadly useful allocator (see the git revision history for
  contribution details).  Note that the license has been unified, thanks to
  Facebook granting a license under the same terms as the other copyright
  holders (see COPYING).

  New features:
  - Implement Valgrind support, redzones, and quarantine.
  - Add support for additional platforms:
    + FreeBSD
    + Mac OS X Lion
    + MinGW
    + Windows (no support yet for replacing the system malloc)
  - Add support for additional architectures:
    + MIPS
    + SH4
    + Tilera
  - Add support for cross compiling.
  - Add nallocm(), which rounds a request size up to the nearest size class
    without actually allocating.
  - Implement aligned_alloc() (blame C11).
  - Add the "thread.tcache.enabled" mallctl.
  - Add the "opt.prof_final" mallctl.
  - Update pprof (from gperftools 2.0).
  - Add the --with-mangling option.
  - Add the --disable-experimental option.
  - Add the --disable-munmap option, and make it the default on Linux.
  - Add the --enable-mremap option; use of mremap(2) is now disabled unless it
    is specified.
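
  Rounding a request up to a size class without allocating, as nallocm() is
  described above, can be illustrated with quantum-spaced classes.  This is a
  deliberately simplified sketch: round_to_quantum() and the 16-byte quantum
  are assumptions made for illustration, and jemalloc's real size classes are
  not uniformly spaced:

```c
#include <assert.h>
#include <stddef.h>

#define LG_QUANTUM_EXAMPLE	4	/* Assumed: 16-byte quantum. */
#define QUANTUM_EXAMPLE		((size_t)1 << LG_QUANTUM_EXAMPLE)

/* Round size up to the next multiple of the quantum; no memory is touched. */
static size_t
round_to_quantum(size_t size)
{
	assert(size > 0 && size <= (size_t)-1 - (QUANTUM_EXAMPLE - 1));
	return (size + QUANTUM_EXAMPLE - 1) & ~(QUANTUM_EXAMPLE - 1);
}
```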

  Incompatible changes:
  - Enable stats by default.
  - Enable fill by default.
  - Disable lazy locking by default.
  - Rename the "tcache.flush" mallctl to "thread.tcache.flush".
  - Rename the "arenas.pagesize" mallctl to "arenas.page".
  - Change the "opt.lg_prof_sample" default from 0 to 19 (1 B to 512 KiB).
  - Change the "opt.prof_accum" default from true to false.

  Removed features:
  - Remove the swap feature, including the "config.swap", "swap.avail",
    "swap.prezeroed", "swap.nfds", and "swap.fds" mallctls.
  - Remove highruns statistics, including the
    "stats.arenas.<i>.bins.<j>.highruns" and
    "stats.arenas.<i>.lruns.<j>.highruns" mallctls.
  - As part of small size class refactoring, remove the "opt.lg_[qc]space_max",
    "arenas.cacheline", "arenas.subpage", "arenas.[tqcs]space_{min,max}", and
    "arenas.[tqcs]bins" mallctls.
  - Remove the "arenas.chunksize" mallctl.
  - Remove the "opt.lg_prof_tcmax" option.
  - Remove the "opt.lg_prof_bt_max" option.
  - Remove the "opt.lg_tcache_gc_sweep" option.
  - Remove the --disable-tiny option, including the "config.tiny" mallctl.
  - Remove the --enable-dynamic-page-shift configure option.
  - Remove the --enable-sysv configure option.

  Bug fixes:
  - Fix a statistics-related bug in the "thread.arena" mallctl that could cause
    invalid statistics and crashes.
  - Work around TLS deallocation via free() on Linux.  This bug could cause
    write-after-free memory corruption.
  - Fix a potential deadlock that could occur during interval- and
    growth-triggered heap profile dumps.
  - Fix large calloc() zeroing bugs due to dropping chunk map unzeroed flags.
  - Fix chunk_alloc_dss() to stop claiming memory is zeroed.  This bug could
    cause memory corruption and crashes with --enable-dss specified.
  - Fix fork-related bugs that could cause deadlock in children between fork
    and exec.
  - Fix malloc_stats_print() to honor 'b' and 'l' in the opts parameter.
  - Fix realloc(p, 0) to act like free(p).
  - Do not enforce minimum alignment in memalign().
  - Check for NULL pointer in malloc_usable_size().
  - Fix an off-by-one heap profile statistics bug that could be observed in
    interval- and growth-triggered heap profiles.
  - Fix the "epoch" mallctl to update cached stats even if the passed in epoch
    is 0.
  - Fix bin->runcur management to fix a layout policy bug.  This bug did not
    affect correctness.
  - Fix a bug in choose_arena_hard() that potentially caused more arenas to be
    initialized than necessary.
  - Add missing "opt.lg_tcache_max" mallctl implementation.
  - Use glibc allocator hooks to make mixed allocator usage less likely.
  - Fix build issues for --disable-tcache.
  - Don't mangle pthread_create() when --with-private-namespace is specified.

* 2.2.5 (November 14, 2011)

  Bug fixes:
  - Fix huge_ralloc() race when using mremap(2).  This is a serious bug that
    could cause memory corruption and/or crashes.
  - Fix huge_ralloc() to maintain chunk statistics.
  - Fix malloc_stats_print(..., "a") output.

* 2.2.4 (November 5, 2011)

  Bug fixes:
  - Initialize arenas_tsd before using it.  This bug existed for 2.2.[0-3], as
    well as for --disable-tls builds in earlier releases.
  - Do not assume a 4 KiB page size in test/rallocm.c.

* 2.2.3 (August 31, 2011)

  This version fixes numerous bugs related to heap profiling.

  Bug fixes:
  - Fix a prof-related race condition.  This bug could cause memory corruption,
    but only occurred in non-default configurations (prof_accum:false).
  - Fix off-by-one backtracing issues (make sure that prof_alloc_prep() is
    excluded from backtraces).
  - Fix a prof-related bug in realloc() (only triggered by OOM errors).
  - Fix prof-related bugs in allocm() and rallocm().
  - Fix prof_tdata_cleanup() for --disable-tls builds.
  - Fix a relative include path, to fix objdir builds.

* 2.2.2 (July 30, 2011)

  Bug fixes:
  - Fix a build error for --disable-tcache.
  - Fix assertions in arena_purge() (for real this time).
  - Add the --with-private-namespace option.  This is a workaround for symbol
    conflicts that can inadvertently arise when using static libraries.

* 2.2.1 (March 30, 2011)

  Bug fixes:
  - Implement atomic operations for x86/x64.  This fixes compilation failures
    for versions of gcc that are still in wide use.
  - Fix an assertion in arena_purge().

* 2.2.0 (March 22, 2011)

  This version incorporates several improvements to algorithms and data
  structures that tend to reduce fragmentation and increase speed.

  New features:
  - Add the "stats.cactive" mallctl.
  - Update pprof (from google-perftools 1.7).
  - Improve backtracing-related configuration logic, and add the
    --disable-prof-libgcc option.

  Bug fixes:
  - Change default symbol visibility from "internal" to "hidden", which
    decreases the overhead of library-internal function calls.
  - Fix symbol visibility so that it is also set on OS X.
  - Fix a build dependency regression caused by the introduction of the .pic.o
    suffix for PIC object files.
  - Add missing checks for mutex initialization failures.
  - Don't use libgcc-based backtracing except on x64, where it is known to work.
  - Fix deadlocks on OS X that were due to memory allocation in
    pthread_mutex_lock().
  - Heap profiling-specific fixes:
    + Fix memory corruption due to integer overflow in small region index
      computation, when using a small enough sample interval that profiling
      context pointers are stored in small run headers.
    + Fix a bootstrap ordering bug that only occurred with TLS disabled.
    + Fix a rallocm() rsize bug.
    + Fix error detection bugs for aligned memory allocation.

* 2.1.3 (March 14, 2011)

  Bug fixes:
  - Fix a cpp logic regression (due to the "thread.{de,}allocatedp" mallctl fix
    for OS X in 2.1.2).
  - Fix a "thread.arena" mallctl bug.
  - Fix a thread cache stats merging bug.

* 2.1.2 (March 2, 2011)

  Bug fixes:
  - Fix "thread.{de,}allocatedp" mallctl for OS X.
  - Add missing jemalloc.a to build system.

* 2.1.1 (January 31, 2011)

  Bug fixes:
  - Fix aligned huge reallocation (affected allocm()).
  - Fix the ALLOCM_LG_ALIGN macro definition.
  - Fix a heap dumping deadlock.
  - Fix a "thread.arena" mallctl bug.

* 2.1.0 (December 3, 2010)

  This version incorporates some optimizations that can't quite be considered
  bug fixes.

  New features:
  - Use Linux's mremap(2) for huge object reallocation when possible.
  - Avoid locking in mallctl*() when possible.
  - Add the "thread.{de,}allocatedp" mallctls.
  - Convert the manual page source from roff to DocBook, and generate both roff
    and HTML manuals.

  Bug fixes:
  - Fix a crash due to incorrect bootstrap ordering.  This only impacted
    --enable-debug --enable-dss configurations.
  - Fix a minor statistics bug for mallctl("swap.avail", ...).

* 2.0.1 (October 29, 2010)

  Bug fixes:
  - Fix a race condition in heap profiling that could cause undefined behavior
    if "opt.prof_accum" were disabled.
  - Add missing mutex unlocks for some OOM error paths in the heap profiling
    code.
  - Fix a compilation error for non-C99 builds.

* 2.0.0 (October 24, 2010)

  This version focuses on the experimental *allocm() API, and on improved
  run-time configuration/introspection.  Nonetheless, numerous performance
  improvements are also included.

  New features:
  - Implement the experimental {,r,s,d}allocm() API, which provides a superset
    of the functionality available via malloc(), calloc(), posix_memalign(),
    realloc(), malloc_usable_size(), and free().  These functions can be used to
    allocate/reallocate aligned zeroed memory, ask for optional extra memory
    during reallocation, prevent object movement during reallocation, etc.
  - Replace JEMALLOC_OPTIONS/JEMALLOC_PROF_PREFIX with MALLOC_CONF, which is
    more human-readable, and more flexible.  For example:
      JEMALLOC_OPTIONS=AJP
    is now:
      MALLOC_CONF=abort:true,fill:true,stats_print:true
  - Port to Apple OS X.  Sponsored by Mozilla.
  - Make it possible for the application to control thread-->arena mappings via
    the "thread.arena" mallctl.
  - Add compile-time support for all TLS-related functionality via pthreads TSD.
    This is mainly of interest for OS X, which does not support TLS, but has a
    TSD implementation with similar performance.
  - Override memalign() and valloc() if they are provided by the system.
  - Add the "arenas.purge" mallctl, which can be used to synchronously purge all
    dirty unused pages.
  - Make cumulative heap profiling data optional, so that it is possible to
    limit the amount of memory consumed by heap profiling data structures.
  - Add per thread allocation counters that can be accessed via the
    "thread.allocated" and "thread.deallocated" mallctls.
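
  The MALLOC_CONF value shown in the list above is a comma-separated list of
  key:value pairs.  A minimal parsing sketch, assuming only that syntax;
  conf_lookup() is a hypothetical helper and does not reproduce jemalloc's own
  parser or its validation:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: find "key:value" in a comma-separated option string
 * and copy the value into out.  Returns false if the key is absent or the
 * value does not fit.
 */
static bool
conf_lookup(const char *conf, const char *key, char *out, size_t outlen)
{
	size_t klen = strlen(key);
	const char *p = conf;

	assert(out != NULL && outlen > 0);
	while (*p != '\0') {
		const char *end = strchr(p, ',');
		size_t len = (end != NULL) ? (size_t)(end - p) : strlen(p);

		/* Match "key:" at the start of this pair. */
		if (len > klen && strncmp(p, key, klen) == 0 &&
		    p[klen] == ':') {
			size_t vlen = len - klen - 1;

			if (vlen >= outlen)
				return false;
			memcpy(out, p + klen + 1, vlen);
			out[vlen] = '\0';
			return true;
		}
		if (end == NULL)
			break;
		p = end + 1;
	}
	return false;
}
```

  Looking up "stats_print" in the example string yields "true", while a key
  that is not present, such as "junk", is reported as absent.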

  Incompatible changes:
  - Remove JEMALLOC_OPTIONS and malloc_options (see MALLOC_CONF above).
  - Increase default backtrace depth from 4 to 128 for heap profiling.
  - Disable interval-based profile dumps by default.

  Bug fixes:
  - Remove bad assertions in fork handler functions.  These assertions could
    cause aborts for some combinations of configure settings.
  - Fix strerror_r() usage to deal with non-standard semantics in GNU libc.
  - Fix leak context reporting.  This bug tended to cause the number of contexts
    to be underreported (though the reported number of objects and bytes were
    correct).
  - Fix a realloc() bug for large in-place growing reallocation.  This bug could
    cause memory corruption, but it was hard to trigger.
  - Fix an allocation bug for small allocations that could be triggered if
    multiple threads raced to create a new run of backing pages.
  - Enhance the heap profiler to trigger samples based on usable size, rather
    than request size.
  - Fix a heap profiling bug due to sometimes losing track of requested object
    size for sampled objects.

* 1.0.3 (August 12, 2010)

  Bug fixes:
  - Fix the libunwind-based implementation of stack backtracing (used for heap
    profiling).  This bug could cause zero-length backtraces to be reported.
  - Add a missing mutex unlock in library initialization code.  If multiple
    threads raced to initialize malloc, some of them could end up permanently
    blocked.

* 1.0.2 (May 11, 2010)

  Bug fixes:
  - Fix junk filling of large objects, which could cause memory corruption.
  - Add MAP_NORESERVE support for chunk mapping, because otherwise virtual
    memory limits could cause swap file configuration to fail.  Contributed by
    Jordan DeLong.

* 1.0.1 (April 14, 2010)

  Bug fixes:
  - Fix compilation when --enable-fill is specified.
  - Fix threads-related profiling bugs that affected accuracy and caused memory
    to be leaked during thread exit.
  - Fix dirty page purging race conditions that could cause crashes.
  - Fix crash in tcache flushing code during thread destruction.

* 1.0.0 (April 11, 2010)

  This release focuses on speed and run-time introspection.  Numerous
  algorithmic improvements make this release substantially faster than its
  predecessors.

  New features:
  - Implement autoconf-based configuration system.
  - Add mallctl*(), for the purposes of introspection and run-time
    configuration.
  - Make it possible for the application to manually flush a thread's cache, via
    the "tcache.flush" mallctl.
  - Base maximum dirty page count on proportion of active memory.
  - Compute various additional run-time statistics, including per size class
    statistics for large objects.
  - Expose malloc_stats_print(), which can be called repeatedly by the
    application.
  - Simplify the malloc_message() signature to only take one string argument,
    and incorporate an opaque data pointer argument for use by the application
    in combination with malloc_stats_print().
  - Add support for allocation backed by one or more swap files, and allow the
    application to disable over-commit if swap files are in use.
  - Implement allocation profiling and leak checking.
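
  The single-string callback with an opaque pointer, as described for
  malloc_message() above, lets an application redirect output without global
  state.  A minimal sketch of that calling convention follows; emit_report()
  is a hypothetical producer standing in for malloc_stats_print(), and the
  strbuf type is illustrative:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Callback shape: an opaque pointer plus one string per call. */
typedef void (write_cb_t)(void *opaque, const char *s);

/* Illustrative sink: append each string to a fixed-size buffer. */
struct strbuf {
	char	data[128];
	size_t	len;
};

static void
strbuf_write(void *opaque, const char *s)
{
	struct strbuf *sb = opaque;
	size_t n = strlen(s);

	/* Silently truncate once the buffer is full. */
	if (n > sizeof(sb->data) - 1 - sb->len)
		n = sizeof(sb->data) - 1 - sb->len;
	memcpy(sb->data + sb->len, s, n);
	sb->len += n;
	sb->data[sb->len] = '\0';
}

/* Hypothetical producer: hands its output to the callback piece by piece. */
static void
emit_report(write_cb_t *write_cb, void *opaque)
{
	assert(write_cb != NULL);
	write_cb(opaque, "allocated: ");
	write_cb(opaque, "4096\n");
}
```

  Because the producer only sees the callback and the opaque pointer, the same
  code can write to a buffer, a file, or a log without modification.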

  Removed features:
  - Remove the dynamic arena rebalancing code, since thread-specific caching
    reduces its utility.

  Bug fixes:
  - Modify chunk allocation to work when address space layout randomization
    (ASLR) is in use.
  - Fix thread cleanup bugs related to TLS destruction.
  - Handle 0-size allocation requests in posix_memalign().
  - Fix a chunk leak.  The leaked chunks were never touched, so this impacted
    virtual memory usage, but not physical memory usage.

* linux_2008082[78]a (August 27/28, 2008)

  These snapshot releases are the simple result of incorporating Linux-specific
  support into the FreeBSD malloc sources.

--------------------------------------------------------------------------------
vim:filetype=text:textwidth=80