.\"	$NetBSD: kmem.9,v 1.4 2010/01/23 00:54:43 rmind Exp $
.\"
.\" Copyright (c)2006 YAMAMOTO Takashi,
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" ------------------------------------------------------------
.Dd August 3, 2009
.Dt KMEM 9
.Os
.\" ------------------------------------------------------------
.Sh NAME
.Nm kmem
.Nd kernel wired memory allocator
.\" ------------------------------------------------------------
.Sh SYNOPSIS
.In sys/kmem.h
.\" - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
.Ft void *
.Fn kmem_alloc \
"size_t size" "km_flag_t kmflags"
.Ft void *
.Fn kmem_zalloc \
"size_t size" "km_flag_t kmflags"
.Ft void
.Fn kmem_free \
"void *p" "size_t size"
.\" ------------------------------------------------------------
.Pp
.Cd "options DEBUG"
.Sh DESCRIPTION
.Fn kmem_alloc
allocates kernel wired memory.
It takes the following arguments.
.Bl -tag -width kmflags
.It Fa size
Specify the size of allocation in bytes.
.It Fa kmflags
Either of the following:
.Bl -tag -width KM_NOSLEEP
.It KM_SLEEP
If the allocation cannot be satisfied immediately, sleep until enough
memory is available.
.It KM_NOSLEEP
Don't sleep.
Immediately return
.Dv NULL
if there is not enough memory available.
It should only be used when failure to allocate will not have harmful,
user-visible effects.
.Pp
.Bf -symbolic
Use of
.Dv KM_NOSLEEP
is strongly discouraged as it can create transient, hard to debug failures
that occur when the system is under memory pressure.
.Ef
.Pp
In situations where it is not possible to sleep, for example because locks
are held by the caller, the code path should be restructured to allow the
allocation to be made in another place.
.El
.El
.Pp
The contents of allocated memory are uninitialized.
.Pp
Unlike Solaris, kmem_alloc(0, flags) is illegal.
.Pp
.\" ------------------------------------------------------------
.Fn kmem_zalloc
is the equivalent of
.Fn kmem_alloc ,
except that it initializes the memory to zero.
.Pp
.\" ------------------------------------------------------------
.Fn kmem_free
frees kernel wired memory allocated by
.Fn kmem_alloc
or
.Fn kmem_zalloc
so that it can be used for other purposes.
It takes the following arguments.
.Bl -tag -width kmflags
.It Fa p
The pointer to the memory being freed.
It must be the one returned by
.Fn kmem_alloc
or
.Fn kmem_zalloc .
.It Fa size
The size of the memory being freed, in bytes.
It must be the same as the
.Fa size
argument used for
.Fn kmem_alloc
or
.Fn kmem_zalloc
when the memory was allocated.
.El
.Pp
Freeing
.Dv NULL
is illegal.
.\" ------------------------------------------------------------
.Sh NOTES
Making
.Dv KM_SLEEP
allocations while holding mutexes or reader/writer locks is discouraged, as the
caller can sleep for an unbounded amount of time in order to satisfy the
allocation.
This can in turn block other threads that wish to acquire locks held by the
caller.
.Pp
For some locks this is permissible or even unavoidable.
For others, particularly locks that may be taken from soft interrupt context,
it is a serious problem.
As a general rule it is better not to allow this type of situation to develop.
One way to circumvent the problem is to make allocations speculative and part
of a retryable sequence.
For example:
.Bd -literal
  retry:
	/* speculative unlocked check */
	if (need to allocate) {
		new_item = kmem_alloc(sizeof(*new_item), KM_SLEEP);
	} else {
		new_item = NULL;
	}
	mutex_enter(lock);
	/* check while holding lock for true status */
	if (need to allocate) {
		if (new_item == NULL) {
			mutex_exit(lock);
			goto retry;
		}
		consume(new_item);
		new_item = NULL;
	}
	mutex_exit(lock);
	if (new_item != NULL) {
		/* did not use it after all */
		kmem_free(new_item, sizeof(*new_item));
	}
.Ed
.\" ------------------------------------------------------------
.Sh OPTIONS
Kernels compiled with the
.Dv DEBUG
option perform CPU intensive sanity checks on kmem operations,
and include the
.Dv kmguard
facility which can be enabled at runtime.
.Pp
.Dv kmguard
adds additional, very high overhead runtime verification to kmem operations.
To enable it, boot the system with the
.Fl d
option, which causes the debugger to be entered early during the kernel
boot process.
Issue commands such as the following:
.Bd -literal
db\*[Gt] w kmem_guard_depth 0t30000
db\*[Gt] c
.Ed
.Pp
This instructs
.Dv kmguard
to queue up to 60000 (30000*2) pages of unmapped KVA to catch
use-after-free type errors.
When
.Fn kmem_free
is called, memory backing a freed item is unmapped and the kernel VA
space pushed onto a FIFO.
The VA space will not be reused until another 30k items have been freed.
Until reused the kernel will catch invalid accesses and panic with a page fault.
Limitations:
.Bl -bullet
.It
It has a severe impact on performance.
.It
It is best used on a 64-bit machine with lots of RAM.
.It
Allocations larger than PAGE_SIZE bypass the
.Dv kmguard
facility.
.El
.Pp
kmguard tries to catch the following types of bugs:
.Bl -bullet
.It
Overflow at time of occurrence, by means of a guard page.
.It
Underflow at
.Fn kmem_free ,
by using a canary value.
.It
Invalid pointer or size passed, at
.Fn kmem_free .
.El
.Sh RETURN VALUES
On success,
.Fn kmem_alloc
and
.Fn kmem_zalloc
return a pointer to allocated memory.
Otherwise,
.Dv NULL
is returned.
.\" ------------------------------------------------------------
.Sh CODE REFERENCES
This section describes places within the
.Nx
source tree where actual code implementing the
.Nm
subsystem
can be found.
All pathnames are relative to
.Pa /usr/src .
.Pp
The
.Nm
subsystem is implemented within the file
.Pa sys/kern/subr_kmem.c .
.\" ------------------------------------------------------------
.Sh SEE ALSO
.Xr intro 9 ,
.Xr memoryallocators 9 ,
.Xr percpu 9 ,
.Xr pool_cache 9
.\" ------------------------------------------------------------
.Sh CAVEATS
.Fn kmem_alloc
cannot be used from interrupt context, from a soft interrupt, or from
a callout.
Use
.Xr pool_cache 9
in these situations.
.\" ------------------------------------------------------------
.Sh SECURITY CONSIDERATIONS
As the memory allocated by
.Fn kmem_alloc
is uninitialized, it can contain security-sensitive data left by its
previous user.
It is the caller's responsibility not to expose it to the world.
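.Pp
One way to meet that responsibility, shown here only as a minimal sketch, is
to allocate with
.Fn kmem_zalloc
whenever the whole buffer may later be exposed, so that any bytes the caller
never writes are zero rather than stale.
The helper
.Fn example_make_label
is hypothetical and not part of
.Nm .
The definitions of
.Fn kmem_zalloc
and
.Fn kmem_free
inside the block, the
.Vt km_flag_t
typedef and the value of
.Dv KM_SLEEP
are userland stand-ins built on the C library, assumed only so that the
sketch compiles outside the kernel; kernel code takes them from
.In sys/kmem.h .
.Bd -literal
```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * ASSUMPTION: userland stand-ins so the sketch compiles outside the
 * kernel.  The typedef and the value of KM_SLEEP are placeholders,
 * not the real kernel definitions.
 */
typedef int km_flag_t;
#define KM_SLEEP 1

static void *
kmem_zalloc(size_t size, km_flag_t kmflags)
{
	(void)kmflags;
	assert(size != 0);		/* zero-sized requests are illegal */
	return calloc(1, size);		/* zero-filled memory */
}

static void
kmem_free(void *p, size_t size)
{
	assert(p != NULL);		/* freeing NULL is illegal */
	assert(size != 0);
	free(p);
}

/*
 * Hypothetical helper: build a fixed-size label buffer that will later
 * be exposed in full.  Bytes past the copied name stay zero because the
 * buffer is zero-filled at allocation time.
 */
static char *
example_make_label(const char *name, size_t bufsize)
{
	char *buf;
	size_t len;

	buf = kmem_zalloc(bufsize, KM_SLEEP);
	if (buf == NULL)		/* the calloc stand-in can fail */
		return NULL;
	len = strlen(name);
	if (len > bufsize - 1)
		len = bufsize - 1;	/* keep the final byte zero */
	memcpy(buf, name, len);
	return buf;
}
```
.Ed
.Pp
A caller would release the buffer with
.Fn kmem_free ,
passing the same
.Fa size
that was given to
.Fn example_make_label .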