.\" $NetBSD: kmem.9,v 1.7 2010/02/13 07:44:11 wiz Exp $
.\"
.\" Copyright (c)2006 YAMAMOTO Takashi,
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" ------------------------------------------------------------
.Dd February 11, 2010
.Dt KMEM 9
.Os
.\" ------------------------------------------------------------
.Sh NAME
.Nm kmem
.Nd kernel wired memory allocator
.\" ------------------------------------------------------------
.Sh SYNOPSIS
.In sys/kmem.h
.\" - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
.Ft void *
.Fn kmem_alloc \
"size_t size" "km_flag_t kmflags"
.Ft void *
.Fn kmem_zalloc \
"size_t size" "km_flag_t kmflags"
.Ft void
.Fn kmem_free \
"void *p" "size_t size"
.Ft char *
.Fn kmem_asprintf \
"const char *fmt" "..."
.\" ------------------------------------------------------------
.Pp
.Cd "options DEBUG"
.Sh DESCRIPTION
.Fn kmem_alloc
allocates kernel wired memory.
It takes the following arguments.
.Bl -tag -width kmflags
.It Fa size
The size of the allocation in bytes.
.It Fa kmflags
Either of the following:
.Bl -tag -width KM_NOSLEEP
.It KM_SLEEP
If the allocation cannot be satisfied immediately, sleep until enough
memory is available.
.It KM_NOSLEEP
Don't sleep.
Immediately return
.Dv NULL
if there is not enough memory available.
It should only be used when failure to allocate will not have harmful,
user-visible effects.
.Pp
.Bf -symbolic
Use of
.Dv KM_NOSLEEP
is strongly discouraged as it can create transient, hard-to-debug failures
that occur when the system is under memory pressure.
.Ef
.Pp
In situations where it is not possible to sleep, for example because locks
are held by the caller, the code path should be restructured to allow the
allocation to be made in another place.
.El
.El
.Pp
The contents of allocated memory are uninitialized.
.Pp
Unlike in Solaris, calling
.Fn kmem_alloc
with a
.Fa size
of 0 is illegal.
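.Pp
The points above can be sketched as follows.
This is a userland illustration only, not kernel code: the
.Vt km_flag_t
type, the
.Dv KM_SLEEP
value and the
.Fn kmem_alloc
body shown here are hypothetical stand-ins built on
.Xr malloc 3 ,
and
.Vt struct buf_desc
and
.Fn buf_create
are invented for the example.
The sketch rejects a zero length before calling the allocator, records the
size for a later
.Fn kmem_free ,
and clears the uninitialized memory before use.
.Bd -literal
```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userland stand-ins for the kernel definitions. */
typedef int km_flag_t;
#define KM_SLEEP 1

static void *
kmem_alloc(size_t size, km_flag_t kmflags)
{
	void *p;

	assert(size != 0);		/* zero-sized requests are illegal */
	assert(kmflags == KM_SLEEP);
	p = malloc(size);
	assert(p != NULL);		/* this shim never returns NULL */
	return p;
}

/* Hypothetical descriptor that remembers the size for a later kmem_free. */
struct buf_desc {
	void	*bd_data;
	size_t	 bd_size;
};

/* Returns 0 on success, -1 if the caller asked for an empty buffer. */
static int
buf_create(struct buf_desc *bd, size_t len)
{
	if (len == 0)
		return -1;		/* never pass 0 to kmem_alloc */
	bd->bd_data = kmem_alloc(len, KM_SLEEP);
	bd->bd_size = len;
	/* The memory is uninitialized; clear it before anyone reads it. */
	memset(bd->bd_data, 0, len);
	return 0;
}
```
.Ed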
.Pp
.\" ------------------------------------------------------------
.Fn kmem_zalloc
is the equivalent of
.Fn kmem_alloc ,
except that it initializes the memory to zero.
.Pp
.\" ------------------------------------------------------------
.Fn kmem_asprintf
behaves like the well-known
.Fn asprintf
function, but allocates memory using
.Fn kmem_alloc .
This routine can sleep during allocation.
The size of the allocated area is the length of the returned character
string, plus one (for the NUL terminator).
This must be taken into consideration when freeing the returned area with
.Fn kmem_free .
.Pp
.\" ------------------------------------------------------------
.Fn kmem_free
frees kernel wired memory allocated by
.Fn kmem_alloc
or
.Fn kmem_zalloc
so that it can be used for other purposes.
It takes the following arguments.
.Bl -tag -width kmflags
.It Fa p
The pointer to the memory being freed.
It must be the one returned by
.Fn kmem_alloc
or
.Fn kmem_zalloc .
.It Fa size
The size of the memory being freed, in bytes.
It must be the same as the
.Fa size
argument used for
.Fn kmem_alloc
or
.Fn kmem_zalloc
when the memory was allocated.
.El
.Pp
Freeing
.Dv NULL
is illegal.
.\" ------------------------------------------------------------
.Sh NOTES
Making
.Dv KM_SLEEP
allocations while holding mutexes or reader/writer locks is discouraged, as the
caller can sleep for an unbounded amount of time in order to satisfy the
allocation.
This can in turn block other threads that wish to acquire locks held by the
caller.
.Pp
For some locks this is permissible or even unavoidable.
For others, particularly locks that may be taken from soft interrupt context,
it is a serious problem.
As a general rule it is better not to allow this type of situation to develop.
One way to circumvent the problem is to make allocations speculative and part
of a retryable sequence.
For example:
.Bd -literal
  retry:
	/* speculative unlocked check */
	if (need to allocate) {
		new_item = kmem_alloc(sizeof(*new_item), KM_SLEEP);
	} else {
		new_item = NULL;
	}
	mutex_enter(lock);
	/* check while holding lock for true status */
	if (need to allocate) {
		if (new_item == NULL) {
			mutex_exit(lock);
			goto retry;
		}
		consume(new_item);
		new_item = NULL;
	}
	mutex_exit(lock);
	if (new_item != NULL) {
		/* did not use it after all */
		kmem_free(new_item, sizeof(*new_item));
	}
.Ed
.\" ------------------------------------------------------------
.Sh OPTIONS
Kernels compiled with the
.Dv DEBUG
option perform CPU-intensive sanity checks on kmem operations,
and include the
.Dv kmguard
facility which can be enabled at runtime.
.Pp
.Dv kmguard
adds additional, very high overhead runtime verification to kmem operations.
To enable it, boot the system with the
.Fl d
option, which causes the debugger to be entered early during the kernel
boot process.
Issue commands such as the following:
.Bd -literal
db\*[Gt] w kmem_guard_depth 0t30000
db\*[Gt] c
.Ed
.Pp
This instructs
.Dv kmguard
to queue up to 60000 (30000*2) pages of unmapped KVA to catch
use-after-free type errors.
When
.Fn kmem_free
is called, memory backing a freed item is unmapped and the kernel VA
space pushed onto a FIFO.
The VA space will not be reused until another 30k items have been freed.
Until reused the kernel will catch invalid accesses and panic with a page fault.
.Pp
Limitations:
.Bl -bullet
.It
It has a severe impact on performance.
.It
It is best used on a 64-bit machine with lots of RAM.
.It
Allocations larger than PAGE_SIZE bypass the
.Dv kmguard
facility.
.El
.Pp
.Dv kmguard
tries to catch the following types of bugs:
.Bl -bullet
.It
Overflow at time of occurrence, by means of a guard page.
.It
Underflow at
.Fn kmem_free ,
by using a canary value.
.It
Invalid pointer or size passed, at
.Fn kmem_free .
.El
.Sh RETURN VALUES
On success,
.Fn kmem_alloc
and
.Fn kmem_zalloc
return a pointer to allocated memory.
Otherwise,
.Dv NULL
is returned.
.\" ------------------------------------------------------------
.Sh CODE REFERENCES
This section describes places within the
.Nx
source tree where actual code implementing the
.Nm
subsystem
can be found.
All pathnames are relative to
.Pa /usr/src .
.Pp
The
.Nm
subsystem is implemented within the file
.Pa sys/kern/subr_kmem.c .
.\" ------------------------------------------------------------
.Sh SEE ALSO
.Xr intro 9 ,
.Xr memoryallocators 9 ,
.Xr percpu 9 ,
.Xr pool_cache 9
.\" ------------------------------------------------------------
.Sh CAVEATS
.Fn kmem_alloc
cannot be used from interrupt context, from a soft interrupt, or from
a callout.
Use
.Xr pool_cache 9
in these situations.
.\" ------------------------------------------------------------
.Sh SECURITY CONSIDERATIONS
As the memory allocated by
.Fn kmem_alloc
is uninitialized, it can contain security-sensitive data left by its
previous user.
It is the caller's responsibility not to expose it to the world.
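.\" ------------------------------------------------------------
.Sh EXAMPLES
The rule that a string returned by
.Fn kmem_asprintf
must be freed with its length plus one can be sketched as follows.
This is a userland illustration only, not kernel code: the
.Fn kmem_asprintf
and
.Fn kmem_free
bodies shown here are hypothetical stand-ins built on
.Xr malloc 3 ,
the
.Va shim_outstanding
counter exists only to make size mismatches visible, and
.Fn make_name
and
.Fn free_name
are invented for the example.
.Bd -literal
```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userland stand-ins; the real routines live in the kernel. */
static size_t shim_outstanding;	/* bytes allocated and not yet freed */

static char *
kmem_asprintf(const char *fmt, ...)
{
	va_list ap;
	int len;
	char *str;

	va_start(ap, fmt);
	len = vsnprintf(NULL, 0, fmt, ap);
	va_end(ap);
	assert(len >= 0);

	/* The area is the string length plus one for the NUL terminator. */
	str = malloc((size_t)len + 1);
	assert(str != NULL);
	shim_outstanding += (size_t)len + 1;

	va_start(ap, fmt);
	vsnprintf(str, (size_t)len + 1, fmt, ap);
	va_end(ap);
	return str;
}

static void
kmem_free(void *p, size_t size)
{
	assert(p != NULL);		/* freeing NULL is illegal */
	assert(size <= shim_outstanding);
	shim_outstanding -= size;
	free(p);
}

/* Hypothetical caller: build a name, then release it with the right size. */
static char *
make_name(int unit)
{
	return kmem_asprintf("unit%d", unit);
}

static void
free_name(char *name)
{
	kmem_free(name, strlen(name) + 1);
}
```
.Ed
.Pp
Passing only the string length to
.Fn kmem_free
would leave one byte unaccounted for in this sketch; in the kernel the
mismatch is a bug in the caller.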