.\"	$NetBSD: uvm.9,v 1.106 2011/06/01 02:22:18 rmind Exp $
.\"
.\" Copyright (c) 1998 Matthew R. Green
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
.\" IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
.\" OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
.\" IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
.\" INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
.\" BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
.\" LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED
.\" AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
.\" OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.Dd June 1, 2011
.Dt UVM 9
.Os
.Sh NAME
.Nm uvm
.Nd virtual memory system external interface
.Sh SYNOPSIS
.In sys/param.h
.In uvm/uvm.h
.Sh DESCRIPTION
The UVM virtual memory system manages access to the computer's memory
resources.
User processes and the kernel access these resources through
UVM's external interface.
UVM's external interface includes functions that:
.Pp
.Bl -hyphen -compact
.It
initialize UVM sub-systems
.It
manage virtual address spaces
.It
resolve page faults
.It
memory map files and devices
.It
perform uio-based I/O to virtual memory
.It
allocate and free kernel virtual memory
.It
allocate and free physical memory
.El
.Pp
In addition to exporting these services, UVM has two kernel-level processes:
pagedaemon and swapper.
The pagedaemon process sleeps until physical memory becomes scarce.
When that happens, pagedaemon is awoken.
It scans physical memory, paging out and freeing memory that has not
been recently used.
The swapper process swaps in runnable processes that are currently swapped
out, if there is room.
.Pp
There are also several miscellaneous functions.
.Sh INITIALIZATION
.Bl -ohang
.It Ft void
.Fn uvm_init "void" ;
.It Ft void
.Fn uvm_init_limits "struct lwp *l" ;
.It Ft void
.Fn uvm_setpagesize "void" ;
.It Ft void
.Fn uvm_swap_init "void" ;
.El
.Pp
.Fn uvm_init
sets up the UVM system at system boot time, after the
console has been set up.
It initializes global state, the page, map, kernel virtual memory state,
machine-dependent physical map, kernel memory allocator,
pager and anonymous memory sub-systems, and then enables
paging of kernel objects.
.Pp
.Fn uvm_init_limits
initializes process limits for the named process.
This is for use by the system startup for process zero, before any
other processes are created.
.Pp
.Fn uvm_setpagesize
initializes the uvmexp members pagesize (if not already done by
machine-dependent code), pageshift and pagemask.
It should be called by machine-dependent code early in the
.Fn pmap_init
call (see
.Xr pmap 9 ) .
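.Pp
As an illustration only, the relationship between pagesize, pagemask and
pageshift can be sketched in portable C.
The structure
.Li page_geometry
and the function
.Li derive_page_geometry
are hypothetical names invented for this sketch and are not part of UVM;
the sketch assumes that the mask is the page size minus one and that the
shift is its base-two logarithm.
.Bd -literal
```c
#include <stdbool.h>

/* Hypothetical stand-in for the three page-size members of uvmexp. */
struct page_geometry {
	int pagesize;	/* bytes per page; must be a power of two */
	int pagemask;	/* pagesize - 1: selects the offset within a page */
	int pageshift;	/* log2(pagesize) */
};

/*
 * Fill in pagemask and pageshift from pagesize.
 * Returns false if pagesize is not a positive power of two.
 */
static bool
derive_page_geometry(struct page_geometry *pg)
{
	int shift = 0;

	if (pg->pagesize <= 0 || (pg->pagesize & (pg->pagesize - 1)) != 0)
		return false;
	while ((1 << shift) != pg->pagesize)
		shift++;
	pg->pagemask = pg->pagesize - 1;
	pg->pageshift = shift;
	return true;
}
```
.Ed
.Pp
With a page size of 4096 this yields a shift of 12 and a mask of 4095.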
.Pp
.Fn uvm_swap_init
initializes the swap sub-system.
.Sh VIRTUAL ADDRESS SPACE MANAGEMENT
See
.Xr uvm_map 9 .
.Sh PAGE FAULT HANDLING
.Bl -ohang
.It Ft int
.Fn uvm_fault "struct vm_map *orig_map" "vaddr_t vaddr" "vm_prot_t access_type" ;
.El
.Pp
.Fn uvm_fault
is the main entry point for faults.
It takes
.Fa orig_map ,
the map in which the fault originated;
.Fa vaddr ,
the address within that map at which the fault occurred; and
.Fa access_type ,
the type of access requested.
.Fn uvm_fault
returns a standard UVM return value.
.Sh MEMORY MAPPING FILES AND DEVICES
See
.Xr ubc 9 .
.Sh VIRTUAL MEMORY I/O
.Bl -ohang
.It Ft int
.Fn uvm_io "struct vm_map *map" "struct uio *uio" ;
.El
.Pp
.Fn uvm_io
performs the I/O described in
.Fa uio
on the memory described in
.Fa map .
.Sh ALLOCATION OF KERNEL MEMORY
See
.Xr uvm_km 9 .
.Sh ALLOCATION OF PHYSICAL MEMORY
.Bl -ohang
.It Ft struct vm_page *
.Fn uvm_pagealloc "struct uvm_object *uobj" "voff_t off" "struct vm_anon *anon" "int flags" ;
.It Ft void
.Fn uvm_pagerealloc "struct vm_page *pg" "struct uvm_object *newobj" "voff_t newoff" ;
.It Ft void
.Fn uvm_pagefree "struct vm_page *pg" ;
.It Ft int
.Fn uvm_pglistalloc "psize_t size" "paddr_t low" "paddr_t high" "paddr_t alignment" "paddr_t boundary" "struct pglist *rlist" "int nsegs" "int waitok" ;
.It Ft void
.Fn uvm_pglistfree "struct pglist *list" ;
.It Ft void
.Fn uvm_page_physload "paddr_t start" "paddr_t end" "paddr_t avail_start" "paddr_t avail_end" "int free_list" ;
.El
.Pp
.Fn uvm_pagealloc
allocates a page of memory at virtual address
.Fa off
in either the object
.Fa uobj
or the anonymous memory
.Fa anon ,
which must be locked by the caller.
Only one of
.Fa uobj
and
.Fa anon
can be
.Pf non- Dv NULL .
Returns
.Dv NULL
when no page can be found.
The flags can be any of
.Bd -literal
#define UVM_PGA_USERESERVE      0x0001  /* ok to use reserve pages */
#define UVM_PGA_ZERO            0x0002  /* returned page must be zero'd */
.Ed
.Pp
.Dv UVM_PGA_USERESERVE
means to allocate a page even if that will result in the number of free pages
being lower than
.Dv uvmexp.reserve_pagedaemon
(if the current thread is the pagedaemon) or
.Dv uvmexp.reserve_kernel
(if the current thread is not the pagedaemon).
.Dv UVM_PGA_ZERO
causes the returned page to be filled with zeroes, either by allocating it
from a pool of pre-zeroed pages or by zeroing it in-line as necessary.
.Pp
.Fn uvm_pagerealloc
reallocates page
.Fa pg
to a new object
.Fa newobj ,
at a new offset
.Fa newoff .
.Pp
.Fn uvm_pagefree
frees the physical page
.Fa pg .
If the content of the page is known to be zero-filled,
the caller should set
.Dv PG_ZERO
in pg-\*[Gt]flags so that the page allocator will use
the page to serve future
.Dv UVM_PGA_ZERO
requests efficiently.
.Pp
.Fn uvm_pglistalloc
allocates a list of pages totalling
.Fa size
bytes, subject to various constraints.
.Fa low
and
.Fa high
describe the lowest and highest addresses acceptable for the list.
If
.Fa alignment
is non-zero, it describes the required alignment of the list, in
power-of-two notation.
If
.Fa boundary
is non-zero, no segment of the list may cross this power-of-two
boundary, relative to zero.
.Fa nsegs
is the maximum number of physically contiguous segments.
If
.Fa waitok
is non-zero, the function may sleep until enough memory is available.
(It also may give up in some situations, so a non-zero
.Fa waitok
does not imply that
.Fn uvm_pglistalloc
cannot return an error.)
The allocated memory is returned in the
.Fa rlist
list; the caller only has to provide storage, and the list is initialized by
.Fn uvm_pglistalloc .
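.Pp
The alignment and boundary constraints can be illustrated with a small
userland check.
This is a sketch under stated assumptions and not the kernel's code: the
function
.Li segment_fits
is a hypothetical name, and both
.Fa alignment
and
.Fa boundary
are assumed to be byte values that are powers of two, with zero meaning
.Dq no constraint .
.Bd -literal
```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Check whether the physical range [pa, pa + size) is suitably aligned
 * and does not cross a boundary line.  Zero disables a check; non-zero
 * alignment and boundary values are assumed to be powers of two.
 */
static bool
segment_fits(uint64_t pa, uint64_t size, uint64_t alignment,
    uint64_t boundary)
{
	uint64_t last;

	/* Reject empty ranges and ranges that wrap around. */
	if (size == 0 || pa + size - 1 < pa)
		return false;
	last = pa + size - 1;

	/* The start address must be a multiple of the alignment. */
	if (alignment != 0 && (pa & (alignment - 1)) != 0)
		return false;

	/* First and last byte must lie in the same boundary-sized window. */
	if (boundary != 0 &&
	    (pa & ~(boundary - 1)) != (last & ~(boundary - 1)))
		return false;

	return true;
}
```
.Ed
.Pp
For example, an 8 KB segment starting at 0xf000 crosses a 64 KB boundary,
whereas the same segment starting at 0x10000 does not.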
.Pp
.Fn uvm_pglistfree
frees the list of pages pointed to by
.Fa list .
If the content of a page on the list is known to be zero-filled,
the caller should set
.Dv PG_ZERO
in pg-\*[Gt]flags so that the page allocator will use
the page to serve future
.Dv UVM_PGA_ZERO
requests efficiently.
.Pp
.Fn uvm_page_physload
loads physical memory segments into VM space on the specified
.Fa free_list .
It must be called at system boot time to set up physical memory
management pages.
The arguments describe the
.Fa start
and
.Fa end
of the physical addresses of the segment, and the available start and end
addresses of pages not already in use.
If a system has memory banks of
different speeds, the slower memory should be given a higher
.Fa free_list
value.
.\" XXX expand on "system boot time"!
.Sh PROCESSES
.Bl -ohang
.It Ft void
.Fn uvm_pageout "void" ;
.It Ft void
.Fn uvm_scheduler "void" ;
.El
.Pp
.Fn uvm_pageout
is the main loop for the page daemon.
.Pp
.Fn uvm_scheduler
is the process zero main loop, which is to be called after the
system has finished starting other processes.
It handles the swapping in of runnable, swapped out processes in priority
order.
.Sh PAGE LOAN
.Bl -ohang
.It Ft int
.Fn uvm_loan "struct vm_map *map" "vaddr_t start" "vsize_t len" "void *v" "int flags" ;
.It Ft void
.Fn uvm_unloan "void *v" "int npages" "int flags" ;
.El
.Pp
.Fn uvm_loan
loans pages in a map out to anons or to the kernel.
.Fa map
should be unlocked, and
.Fa start
and
.Fa len
should be multiples of
.Dv PAGE_SIZE .
Argument
.Fa flags
should be one of
.Bd -literal
#define UVM_LOAN_TOANON       0x01    /* loan to anons */
#define UVM_LOAN_TOPAGE       0x02    /* loan to kernel */
.Ed
.Pp
.Fa v
should be a pointer to an array of pointers to
.Li struct vm_anon
or
.Li struct vm_page ,
as appropriate.
The caller has to allocate memory for the array and
ensure it is big enough to hold
.Fa len / PAGE_SIZE
pointers.
Returns 0 for success, or an appropriate error number otherwise.
Note that wired pages cannot be loaned out and
.Fn uvm_loan
will fail in that case.
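.Pp
The sizing rule for the array
.Fa v
can be sketched as follows.
The helper
.Li loan_slots
is a hypothetical name used only for illustration, and its
.Li page_size
parameter stands in for
.Dv PAGE_SIZE
so that the sketch can be compiled outside the kernel.
.Bd -literal
```c
#include <stddef.h>
#include <stdint.h>

/*
 * Return the number of pointer slots the array needs for a loan of
 * len bytes starting at start.  Returns 0 if start or len is not a
 * multiple of page_size, or if len is zero.
 */
static size_t
loan_slots(uintptr_t start, size_t len, size_t page_size)
{
	if (page_size == 0 || start % page_size != 0 ||
	    len % page_size != 0)
		return 0;
	return len / page_size;
}
```
.Ed
.Pp
A caller would allocate that many pointers before calling
.Fn uvm_loan ,
and would later pass the same count as
.Fa npages
to
.Fn uvm_unloan .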
.Pp
.Fn uvm_unloan
kills loans on pages or anons.
.Fa v
must point to the array of pointers initialized by a previous call to
.Fn uvm_loan .
.Fa npages
should match the number of pages loaned, which is also the
number of items in the array.
Argument
.Fa flags
should be one of
.Bd -literal
#define UVM_LOAN_TOANON       0x01    /* loan to anons */
#define UVM_LOAN_TOPAGE       0x02    /* loan to kernel */
.Ed
.Pp
and should match what was used for the previous call to
.Fn uvm_loan .
.Sh MISCELLANEOUS FUNCTIONS
.Bl -ohang
.It Ft struct uvm_object *
.Fn uao_create "vsize_t size" "int flags" ;
.It Ft void
.Fn uao_detach "struct uvm_object *uobj" ;
.It Ft void
.Fn uao_reference "struct uvm_object *uobj" ;
.It Ft bool
.Fn uvm_chgkprot "void *addr" "size_t len" "int rw" ;
.It Ft void
.Fn uvm_kernacc "void *addr" "size_t len" "int rw" ;
.It Ft int
.Fn uvm_vslock "struct vmspace *vs" "void *addr" "size_t len" "vm_prot_t prot" ;
.It Ft void
.Fn uvm_vsunlock "struct vmspace *vs" "void *addr" "size_t len" ;
.It Ft void
.Fn uvm_meter "void" ;
.It Ft void
.Fn uvm_proc_fork "struct proc *p1" "struct proc *p2" "bool shared" ;
.It Ft int
.Fn uvm_grow "struct proc *p" "vaddr_t sp" ;
.It Ft void
.Fn uvn_findpages "struct uvm_object *uobj" "voff_t offset" "int *npagesp" "struct vm_page **pps" "int flags" ;
.It Ft void
.Fn uvm_vnp_setsize "struct vnode *vp" "voff_t newsize" ;
.El
.Pp
The
.Fn uao_create ,
.Fn uao_detach ,
and
.Fn uao_reference
functions operate on anonymous memory objects, such as those used to support
System V shared memory.
.Fn uao_create
returns an object of size
.Fa size
with flags:
.Bd -literal
#define UAO_FLAG_KERNOBJ        0x1     /* create kernel object */
#define UAO_FLAG_KERNSWAP       0x2     /* enable kernel swap */
.Ed
.Pp
which can only be used once each at system boot time.
.Fn uao_reference
creates an additional reference to the named anonymous memory object.
.Fn uao_detach
removes a reference from the named anonymous memory object, destroying
it if removing the last reference.
.Pp
.Fn uvm_chgkprot
changes the protection of kernel memory from
.Fa addr
to
.Fa addr + len
to the value of
.Fa rw .
This is primarily useful for debuggers, for setting breakpoints.
This function is only available with options
.Dv KGDB .
.Pp
.Fn uvm_kernacc
checks the access at address
.Fa addr
to
.Fa addr + len
for
.Fa rw
access in the kernel address space.
.Pp
.Fn uvm_vslock
and
.Fn uvm_vsunlock
control the wiring and unwiring of pages in the address space
.Fa vs
from
.Fa addr
to
.Fa addr + len .
These functions are normally used to wire memory for I/O.
.Pp
.Fn uvm_meter
calculates the load average.
.Pp
.Fn uvm_proc_fork
forks a virtual address space for the (old) process
.Fa p1
and the (new) process
.Fa p2 .
If the
.Fa shared
argument is non-zero, p1 shares its address space with p2,
otherwise a new address space is created.
This function currently has no return value, and thus cannot fail.
In the future, this function will be changed to allow it to
fail in low memory conditions.
.Pp
.Fn uvm_grow
increases the stack segment of process
.Fa p
to include
.Fa sp .
.Pp
.Fn uvn_findpages
looks up or creates pages in
.Fa uobj
at offset
.Fa offset ,
marks them busy and returns them in the
.Fa pps
array.
Currently
.Fa uobj
must be a vnode object.
The number of pages requested is pointed to by
.Fa npagesp ,
and this value is updated with the actual number of pages returned.
The flags can be
.Bd -literal
#define UFP_ALL         0x00    /* return all pages requested */
#define UFP_NOWAIT      0x01    /* don't sleep */
#define UFP_NOALLOC     0x02    /* don't allocate new pages */
#define UFP_NOCACHE     0x04    /* don't return pages which already exist */
#define UFP_NORDONLY    0x08    /* don't return PG_READONLY pages */
.Ed
.Pp
.Dv UFP_ALL
is a pseudo-flag meaning all requested pages should be returned.
.Dv UFP_NOWAIT
means that we must not sleep.
.Dv UFP_NOALLOC
causes any pages which do not already exist to be skipped.
.Dv UFP_NOCACHE
causes any pages which do already exist to be skipped.
.Dv UFP_NORDONLY
causes any pages which are marked
.Dv PG_READONLY
to be skipped.
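.Pp
The skipping rules described above can be modelled as a small predicate.
This is a simplified model of the documented behaviour and not the
kernel's implementation: the function
.Li ufp_skips
and its two boolean parameters are hypothetical, the flag values are
copied from the list above, and
.Dv UFP_NOWAIT
is left out because it affects sleeping rather than page selection.
.Bd -literal
```c
#include <stdbool.h>

#define UFP_ALL		0x00	/* return all pages requested */
#define UFP_NOWAIT	0x01	/* don't sleep */
#define UFP_NOALLOC	0x02	/* don't allocate new pages */
#define UFP_NOCACHE	0x04	/* don't return pages which already exist */
#define UFP_NORDONLY	0x08	/* don't return PG_READONLY pages */

/*
 * Decide whether one page slot is skipped under the given flags.
 * exists: the page is already present in the object.
 * rdonly: the existing page is marked PG_READONLY.
 */
static bool
ufp_skips(int flags, bool exists, bool rdonly)
{
	if (!exists)
		return (flags & UFP_NOALLOC) != 0;
	if ((flags & UFP_NOCACHE) != 0)
		return true;
	return (flags & UFP_NORDONLY) != 0 && rdonly;
}
```
.Ed
.Pp
With
.Dv UFP_ALL
no slot is skipped, so every requested page is either found or created.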
.Pp
.Fn uvm_vnp_setsize
sets the size of vnode
.Fa vp
to
.Fa newsize .
The caller must hold a reference to the vnode.
If the vnode shrinks, pages no longer used are discarded.
.Sh SYSCTL
UVM provides support for the
.Dv CTL_VM
domain of the
.Xr sysctl 3
hierarchy.
It handles the
.Dv VM_LOADAVG ,
.Dv VM_METER ,
.Dv VM_UVMEXP ,
and
.Dv VM_UVMEXP2
nodes, which return the current load averages, the current VM
totals, the uvmexp structure, and a kernel version independent
view of the uvmexp structure, respectively.
It also exports a number of tunables that control how much VM space is
allowed to be consumed by various tasks.
The load averages are typically accessed from userland using the
.Xr getloadavg 3
function.
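.Pp
A minimal userland sketch of reading the load averages through
.Xr getloadavg 3
is shown here for illustration.
The wrapper name
.Li read_load_averages
is hypothetical, and the feature-test macro is an assumption made for
portability to C libraries that hide the declaration by default.
.Bd -literal
```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE	/* assumption: exposes getloadavg() on some libcs */
#endif
#include <stdlib.h>

/*
 * Copy the 1-, 5- and 15-minute load averages into avg.
 * Returns the number of samples retrieved, or -1 on failure.
 */
static int
read_load_averages(double avg[3])
{
	return getloadavg(avg, 3);
}
```
.Ed
.Pp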
The uvmexp structure has all global state of the UVM system,
and has the following members:
.Bd -literal
/* vm_page constants */
int pagesize;   /* size of a page (PAGE_SIZE): must be power of 2 */
int pagemask;   /* page mask */
int pageshift;  /* page shift */

/* vm_page counters */
int npages;     /* number of pages we manage */
int free;       /* number of free pages */
int paging;     /* number of pages in the process of being paged out */
int wired;      /* number of wired pages */
int reserve_pagedaemon; /* number of pages reserved for pagedaemon */
int reserve_kernel; /* number of pages reserved for kernel */

/* pageout params */
int freemin;    /* min number of free pages */
int freetarg;   /* target number of free pages */
int inactarg;   /* target number of inactive pages */
int wiredmax;   /* max number of wired pages */

/* swap */
int nswapdev;   /* number of configured swap devices in system */
int swpages;    /* number of PAGE_SIZE'ed swap pages */
int swpginuse;  /* number of swap pages in use */
int nswget;     /* number of times fault calls uvm_swap_get() */
int nanon;      /* number total of anon's in system */
int nfreeanon;  /* number of free anon's */

/* stat counters */
int faults;             /* page fault count */
int traps;              /* trap count */
int intrs;              /* interrupt count */
int swtch;              /* context switch count */
int softs;              /* software interrupt count */
int syscalls;           /* system calls */
int pageins;            /* pagein operation count */
                        /* pageouts are in pdpageouts below */
int pgswapin;           /* pages swapped in */
int pgswapout;          /* pages swapped out */
int forks;              /* forks */
int forks_ppwait;       /* forks where parent waits */
int forks_sharevm;      /* forks where vmspace is shared */

/* fault subcounters */
int fltnoram;   /* number of times fault was out of ram */
int fltnoanon;  /* number of times fault was out of anons */
int fltpgwait;  /* number of times fault had to wait on a page */
int fltpgrele;  /* number of times fault found a released page */
int fltrelck;   /* number of times fault relock called */
int fltrelckok; /* number of times fault relock is a success */
int fltanget;   /* number of times fault gets anon page */
int fltanretry; /* number of times fault retrys an anon get */
int fltamcopy;  /* number of times fault clears "needs copy" */
int fltnamap;   /* number of times fault maps a neighbor anon page */
int fltnomap;   /* number of times fault maps a neighbor obj page */
int fltlget;    /* number of times fault does a locked pgo_get */
int fltget;     /* number of times fault does an unlocked get */
int flt_anon;   /* number of times fault anon (case 1a) */
int flt_acow;   /* number of times fault anon cow (case 1b) */
int flt_obj;    /* number of times fault is on object page (2a) */
int flt_prcopy; /* number of times fault promotes with copy (2b) */
int flt_przero; /* number of times fault promotes with zerofill (2b) */

/* daemon counters */
int pdwoke;     /* number of times daemon woke up */
int pdrevs;     /* number of times daemon rev'd clock hand */
int pdfreed;    /* number of pages daemon freed since boot */
int pdscans;    /* number of pages daemon scanned since boot */
int pdanscan;   /* number of anonymous pages scanned by daemon */
int pdobscan;   /* number of object pages scanned by daemon */
int pdreact;    /* number of pages daemon reactivated since boot */
int pdbusy;     /* number of times daemon found a busy page */
int pdpageouts; /* number of times daemon started a pageout */
int pdpending;  /* number of times daemon got a pending pageout */
int pddeact;    /* number of pages daemon deactivates */
.Ed
.Sh NOTES
.Fn uvm_chgkprot
is only available if the kernel has been compiled with options
.Dv KGDB .
.Pp
All structures and types whose names begin with
.Dq vm_
will be renamed to
.Dq uvm_ .
.Sh SEE ALSO
.Xr swapctl 2 ,
.Xr getloadavg 3 ,
.Xr kvm 3 ,
.Xr sysctl 3 ,
.Xr ddb 4 ,
.Xr options 4 ,
.Xr memoryallocators 9 ,
.Xr pmap 9 ,
.Xr ubc 9 ,
.Xr uvm_km 9 ,
.Xr uvm_map 9
.Rs
.%A Charles D. Cranor
.%A Gurudatta M. Parulkar
.%T "The UVM Virtual Memory System"
.%I USENIX Association
.%B Proceedings of the USENIX Annual Technical Conference
.%P 117-130
.%D June 6-11, 1999
.%U http://www.usenix.org/event/usenix99/full_papers/cranor/cranor.pdf
.Re
.Sh HISTORY
UVM is a new VM system developed at Washington University in St. Louis
(Missouri).
UVM's roots lie partly in the Mach-based
.Bx 4.4
VM system, the
.Fx
VM system, and the SunOS 4 VM system.
UVM's basic structure is based on the
.Bx 4.4
VM system.
UVM's new anonymous memory system is based on the
anonymous memory system found in the SunOS 4 VM (as described in papers
published by Sun Microsystems, Inc.).
UVM also includes a number of features new to
.Bx
including page loanout, map entry passing, simplified
copy-on-write, and clustered anonymous memory pageout.
UVM is also further documented in an August 1998 dissertation by
Charles D. Cranor.
.Pp
UVM appeared in
.Nx 1.4 .
.Sh AUTHORS
Charles D. Cranor
.Aq chuck@ccrc.wustl.edu
designed and implemented UVM.
.Pp
Matthew Green
.Aq mrg@eterna.com.au
wrote the swap-space management code and handled the logistical issues
involved with merging UVM into the
.Nx
source tree.
.Pp
Chuck Silvers
.Aq chuq@chuq.com
implemented the aobj pager, thus allowing UVM to support System V shared
memory and process swapping.
656memory and process swapping.
657