.\"	$NetBSD: uvm.9,v 1.110 2015/03/23 08:19:12 riastradh Exp $
.\"
.\" Copyright (c) 1998 Matthew R. Green
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
.\" IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
.\" OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
.\" IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
.\" INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
.\" BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
.\" LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED
.\" AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
.\" OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.Dd March 23, 2015
.Dt UVM 9
.Os
.Sh NAME
.Nm uvm
.Nd virtual memory system external interface
.Sh SYNOPSIS
.In sys/param.h
.In uvm/uvm.h
.Sh DESCRIPTION
The UVM virtual memory system manages access to the computer's memory
resources.
User processes and the kernel access these resources through
UVM's external interface.
UVM's external interface includes functions that:
.Pp
.Bl -hyphen -compact
.It
initialize UVM sub-systems
.It
manage virtual address spaces
.It
resolve page faults
.It
memory map files and devices
.It
perform uio-based I/O to virtual memory
.It
allocate and free kernel virtual memory
.It
allocate and free physical memory
.El
.Pp
In addition to exporting these services, UVM has two kernel-level processes:
pagedaemon and swapper.
The pagedaemon process sleeps until physical memory becomes scarce.
When that happens, pagedaemon is awoken.
It scans physical memory, paging out and freeing memory that has not
been recently used.
The swapper process swaps in runnable processes that are currently swapped
out, if there is room.
.Pp
There are also several miscellaneous functions.
.Sh INITIALIZATION
.Bl -ohang
.It Ft void
.Fn uvm_init "void" ;
.It Ft void
.Fn uvm_init_limits "struct lwp *l" ;
.It Ft void
.Fn uvm_setpagesize "void" ;
.It Ft void
.Fn uvm_swap_init "void" ;
.El
.Pp
.Fn uvm_init
sets up the UVM system at system boot time, after the
console has been set up.
It initializes global state, the page, map, kernel virtual memory state,
machine-dependent physical map, kernel memory allocator,
pager and anonymous memory sub-systems, and then enables
paging of kernel objects.
.Pp
.Fn uvm_init_limits
initializes process limits for the named process.
This is for use by the system startup for process zero, before any
other processes are created.
.Pp
.Fn uvm_setpagesize
initializes the uvmexp members pagesize (if not already done by
machine-dependent code), pageshift and pagemask.
It should be called by machine-dependent code early in the
.Fn pmap_init
call (see
.Xr pmap 9 ) .
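.Pp
The relationship between these three members can be illustrated with a
small userland sketch.
It is not the kernel's code: the structure
.Li ex_pageparams
and the function
.Li ex_setpagesize
are hypothetical names, and the sketch assumes that pagemask is
pagesize - 1 and that pageshift is the base-2 logarithm of pagesize.
.Bd -literal
```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the three uvmexp members discussed above. */
struct ex_pageparams {
	int pagesize;
	int pagemask;
	int pageshift;
};

/*
 * Derive pagemask and pageshift from p->pagesize.
 * Returns false if pagesize is not a positive power of two.
 */
static bool
ex_setpagesize(struct ex_pageparams *p)
{
	int shift = 0;

	if (p->pagesize <= 0 || (p->pagesize & (p->pagesize - 1)) != 0)
		return false;
	while ((1 << shift) != p->pagesize)
		shift++;
	p->pagemask = p->pagesize - 1;	/* assumed: low-order bits set */
	p->pageshift = shift;		/* assumed: log2(pagesize) */
	return true;
}
```
.Ed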
.Pp
.Fn uvm_swap_init
initializes the swap sub-system.
.Sh VIRTUAL ADDRESS SPACE MANAGEMENT
See
.Xr uvm_map 9 .
.Sh PAGE FAULT HANDLING
.Bl -ohang
.It Ft int
.Fn uvm_fault "struct vm_map *orig_map" "vaddr_t vaddr" "vm_prot_t access_type" ;
.El
.Pp
.Fn uvm_fault
is the main entry point for faults.
It takes
.Fa orig_map
as the map in which the fault originated,
.Fa vaddr
as the offset into that map at which the fault occurred, and
.Fa access_type
describing the type of access requested.
.Fn uvm_fault
returns a standard UVM return value.
.Sh MEMORY MAPPING FILES AND DEVICES
See
.Xr ubc 9 .
.Sh VIRTUAL MEMORY I/O
.Bl -ohang
.It Ft int
.Fn uvm_io "struct vm_map *map" "struct uio *uio" ;
.El
.Pp
.Fn uvm_io
performs the I/O described in
.Fa uio
on the memory described in
.Fa map .
.Sh ALLOCATION OF KERNEL MEMORY
See
.Xr uvm_km 9 .
.Sh ALLOCATION OF PHYSICAL MEMORY
.Bl -ohang
.It Ft struct vm_page *
.Fn uvm_pagealloc "struct uvm_object *uobj" "voff_t off" "struct vm_anon *anon" "int flags" ;
.It Ft void
.Fn uvm_pagerealloc "struct vm_page *pg" "struct uvm_object *newobj" "voff_t newoff" ;
.It Ft void
.Fn uvm_pagefree "struct vm_page *pg" ;
.It Ft int
.Fn uvm_pglistalloc "psize_t size" "paddr_t low" "paddr_t high" "paddr_t alignment" "paddr_t boundary" "struct pglist *rlist" "int nsegs" "int waitok" ;
.It Ft void
.Fn uvm_pglistfree "struct pglist *list" ;
.It Ft void
.Fn uvm_page_physload "paddr_t start" "paddr_t end" "paddr_t avail_start" "paddr_t avail_end" "int free_list" ;
.El
.Pp
.Fn uvm_pagealloc
allocates a page of memory at offset
.Fa off
in either the object
.Fa uobj
or the anonymous memory
.Fa anon ,
which must be locked by the caller.
Only one of
.Fa uobj
and
.Fa anon
can be
.Pf non- Dv NULL .
It returns
.Dv NULL
when no page can be found.
The flags can be any of
.Bd -literal
#define UVM_PGA_USERESERVE      0x0001  /* ok to use reserve pages */
#define UVM_PGA_ZERO            0x0002  /* returned page must be zero'd */
.Ed
.Pp
.Dv UVM_PGA_USERESERVE
means to allocate a page even if that will result in the number of free pages
being lower than
.Dv uvmexp.reserve_pagedaemon
(if the current thread is the pagedaemon) or
.Dv uvmexp.reserve_kernel
(if the current thread is not the pagedaemon).
.Dv UVM_PGA_ZERO
causes the returned page to be filled with zeroes, either by allocating it
from a pool of pre-zeroed pages or by zeroing it in-line as necessary.
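.Pp
The reserve rule described above can be modelled in userland as follows.
This is a simplified sketch of the rule as worded in this page, not the
allocator's actual test, which may differ in detail.
The names
.Li ex_counts ,
.Li ex_may_allocate
and
.Li EX_PGA_USERESERVE
are hypothetical.
.Bd -literal
```c
#include <assert.h>
#include <stdbool.h>

#define EX_PGA_USERESERVE 0x0001	/* mirrors UVM_PGA_USERESERVE */

/* Hypothetical snapshot of the counters the rule depends on. */
struct ex_counts {
	int free;			/* free pages right now */
	int reserve_pagedaemon;		/* reserve for the pagedaemon */
	int reserve_kernel;		/* reserve for everything else */
};

/*
 * Without EX_PGA_USERESERVE an allocation is refused when it would
 * leave fewer free pages than the reserve that applies to the caller.
 */
static bool
ex_may_allocate(const struct ex_counts *c, bool is_pagedaemon, int flags)
{
	int reserve;

	reserve = is_pagedaemon ? c->reserve_pagedaemon : c->reserve_kernel;
	if (c->free <= 0)
		return false;		/* nothing to hand out */
	if (flags & EX_PGA_USERESERVE)
		return true;		/* caller may dip into the reserve */
	return c->free - 1 >= reserve;
}
```
.Ed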
.Pp
.Fn uvm_pagerealloc
reallocates page
.Fa pg
to a new object
.Fa newobj ,
at a new offset
.Fa newoff .
.Pp
.Fn uvm_pagefree
frees the physical page
.Fa pg .
If the content of the page is known to be zero-filled,
the caller should set
.Dv PG_ZERO
in pg-\*[Gt]flags so that the page allocator will use
the page to serve future
.Dv UVM_PGA_ZERO
requests efficiently.
.Pp
.Fn uvm_pglistalloc
allocates a list of pages totalling
.Fa size
bytes under various constraints.
.Fa low
and
.Fa high
describe the lowest and highest addresses acceptable for the list.
If
.Fa alignment
is non-zero, it describes the required alignment of the list, in
power-of-two notation.
If
.Fa boundary
is non-zero, no segment of the list may cross this power-of-two
boundary, relative to zero.
.Fa nsegs
is the maximum number of physically contiguous segments.
If
.Fa waitok
is non-zero, the function may sleep until enough memory is available.
(It also may give up in some situations, so a non-zero
.Fa waitok
does not imply that
.Fn uvm_pglistalloc
cannot return an error.)
The allocated memory is returned in the
.Fa rlist
list; the caller only has to provide storage, as the list is initialized by
.Fn uvm_pglistalloc .
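.Pp
To make the constraints concrete, the following userland sketch checks a
single physically contiguous segment against them.
It is only a model: the names
.Li ex_paddr_t
and
.Li ex_segment_ok
are hypothetical, and it assumes that
.Fa low
and
.Fa high
are both inclusive bounds and that
.Fa alignment
and
.Fa boundary
are either zero or powers of two expressed in bytes.
.Bd -literal
```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t ex_paddr_t;	/* hypothetical stand-in for paddr_t */

/*
 * Check the segment [pa, pa + size) against the constraints.
 * Assumes size > 0.
 */
static bool
ex_segment_ok(ex_paddr_t pa, ex_paddr_t size, ex_paddr_t low,
    ex_paddr_t high, ex_paddr_t alignment, ex_paddr_t boundary)
{
	ex_paddr_t last = pa + size - 1;

	if (pa < low || last > high)
		return false;
	if (alignment != 0 && (pa & (alignment - 1)) != 0)
		return false;
	/* First and last byte must share one boundary-sized window. */
	if (boundary != 0 &&
	    (pa & ~(boundary - 1)) != (last & ~(boundary - 1)))
		return false;
	return true;
}
```
.Ed
.Pp
For example, with a boundary of 0x10000, a two-page segment starting at
0xf000 is rejected because it straddles 0x10000, while the same segment
starting at 0x10000 is accepted.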
.Pp
.Fn uvm_pglistfree
frees the list of pages pointed to by
.Fa list .
If the content of a page is known to be zero-filled,
the caller should set
.Dv PG_ZERO
in pg-\*[Gt]flags so that the page allocator will use
the page to serve future
.Dv UVM_PGA_ZERO
requests efficiently.
.Pp
.Fn uvm_page_physload
loads physical memory segments into VM space on the specified
.Fa free_list .
It must be called at system boot time to set up physical memory
management pages.
The arguments describe the
.Fa start
and
.Fa end
of the physical addresses of the segment, and the available start and end
addresses of pages not already in use.
If a system has memory banks of
different speeds, the slower memory should be given a higher
.Fa free_list
value.
.\" XXX expand on "system boot time"!
.Sh PROCESSES
.Bl -ohang
.It Ft void
.Fn uvm_pageout "void" ;
.It Ft void
.Fn uvm_scheduler "void" ;
.El
.Pp
.Fn uvm_pageout
is the main loop for the page daemon.
.Pp
.Fn uvm_scheduler
is the process zero main loop, which is to be called after the
system has finished starting other processes.
It handles the swapping in of runnable, swapped-out processes in priority
order.
.Sh PAGE LOAN
.Bl -ohang
.It Ft int
.Fn uvm_loan "struct vm_map *map" "vaddr_t start" "vsize_t len" "void *v" "int flags" ;
.It Ft void
.Fn uvm_unloan "void *v" "int npages" "int flags" ;
.El
.Pp
.Fn uvm_loan
loans pages in a map out to anons or to the kernel.
.Fa map
should be unlocked, and
.Fa start
and
.Fa len
should be multiples of
.Dv PAGE_SIZE .
Argument
.Fa flags
should be one of
.Bd -literal
#define UVM_LOAN_TOANON       0x01    /* loan to anons */
#define UVM_LOAN_TOPAGE       0x02    /* loan to kernel */
.Ed
.Pp
.Fa v
should be a pointer to an array of pointers to
.Li struct vm_anon
or
.Li struct vm_page ,
as appropriate.
The caller has to allocate memory for the array and
ensure it's big enough to hold
.Fa len / PAGE_SIZE
pointers.
Returns 0 for success, or an appropriate error number otherwise.
Note that wired pages can't be loaned out and
.Fn uvm_loan
will fail in that case.
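.Pp
The sizing rule for the array can be sketched in userland as shown below.
The page size of 4096 bytes and the names
.Li EX_PAGE_SIZE
and
.Li ex_loan_entries
are assumptions made for this example only.
.Bd -literal
```c
#include <assert.h>
#include <stddef.h>

#define EX_PAGE_SIZE 4096u	/* assumed page size for this example */

/*
 * Number of pointers the array passed as v must hold for a loan of
 * len bytes starting at start.  Returns 0 if either value is not a
 * multiple of the page size.
 */
static size_t
ex_loan_entries(size_t start, size_t len)
{
	if (start % EX_PAGE_SIZE != 0 || len % EX_PAGE_SIZE != 0)
		return 0;
	return len / EX_PAGE_SIZE;
}
```
.Ed
.Pp
A caller would allocate that many pointer slots before calling
.Fn uvm_loan .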
.Pp
.Fn uvm_unloan
kills loans on pages or anons.
.Fa v
must point to the array of pointers initialized by a previous call to
.Fn uvm_loan .
.Fa npages
should match the number of pages allocated for the loan, which is also
the number of items in the array.
Argument
.Fa flags
should be one of
.Bd -literal
#define UVM_LOAN_TOANON       0x01    /* loan to anons */
#define UVM_LOAN_TOPAGE       0x02    /* loan to kernel */
.Ed
.Pp
and should match what was used for the previous call to
.Fn uvm_loan .
.Sh MISCELLANEOUS FUNCTIONS
.Bl -ohang
.It Ft struct uvm_object *
.Fn uao_create "vsize_t size" "int flags" ;
.It Ft void
.Fn uao_detach "struct uvm_object *uobj" ;
.It Ft void
.Fn uao_reference "struct uvm_object *uobj" ;
.It Ft void
.Fn uvm_chgkprot "void *addr" "size_t len" "int rw" ;
.It Ft bool
.Fn uvm_kernacc "void *addr" "size_t len" "int rw" ;
.It Ft int
.Fn uvm_vslock "struct vmspace *vs" "void *addr" "size_t len" "vm_prot_t prot" ;
.It Ft void
.Fn uvm_vsunlock "struct vmspace *vs" "void *addr" "size_t len" ;
.It Ft void
.Fn uvm_meter "void" ;
.It Ft void
.Fn uvm_proc_fork "struct proc *p1" "struct proc *p2" "bool shared" ;
.It Ft int
.Fn uvm_grow "struct proc *p" "vaddr_t sp" ;
.It Ft void
.Fn uvn_findpages "struct uvm_object *uobj" "voff_t offset" "int *npagesp" "struct vm_page **pps" "int flags" ;
.It Ft void
.Fn uvm_vnp_setsize "struct vnode *vp" "voff_t newsize" ;
.El
.Pp
The
.Fn uao_create ,
.Fn uao_detach ,
and
.Fn uao_reference
functions operate on anonymous memory objects, such as those used to support
System V shared memory.
.Fn uao_create
returns an object of size
.Fa size
with flags:
.Bd -literal
#define UAO_FLAG_KERNOBJ        0x1     /* create kernel object */
#define UAO_FLAG_KERNSWAP       0x2     /* enable kernel swap */
.Ed
.Pp
which can only be used once each at system boot time.
.Fn uao_reference
creates an additional reference to the named anonymous memory object.
.Fn uao_detach
removes a reference from the named anonymous memory object, destroying
it if removing the last reference.
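.Pp
The reference-counting behaviour described above can be pictured with a
small userland model.
The structure
.Li ex_aobj
and the three functions are hypothetical, and the model assumes that a
newly created object starts with a single reference; it does not
reproduce the real object layout or locking.
.Bd -literal
```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical object carrying only a reference count. */
struct ex_aobj {
	int refs;
	bool destroyed;
};

/* Model of uao_create: assumed to start with one reference. */
static void
ex_create(struct ex_aobj *o)
{
	o->refs = 1;
	o->destroyed = false;
}

/* Model of uao_reference: take one more reference. */
static void
ex_reference(struct ex_aobj *o)
{
	assert(!o->destroyed);
	o->refs++;
}

/* Model of uao_detach: drop a reference, destroy on the last one. */
static void
ex_detach(struct ex_aobj *o)
{
	assert(o->refs > 0);
	if (--o->refs == 0)
		o->destroyed = true;
}
```
.Ed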
.Pp
.Fn uvm_chgkprot
changes the protection of kernel memory from
.Fa addr
to
.Fa addr + len
to the value of
.Fa rw .
This is primarily useful for debuggers, for setting breakpoints.
This function is only available with options
.Dv KGDB .
.Pp
.Fn uvm_kernacc
checks the access at address
.Fa addr
to
.Fa addr + len
for
.Fa rw
access in the kernel address space.
.Pp
.Fn uvm_vslock
and
.Fn uvm_vsunlock
control the wiring and unwiring of pages for the address space
.Fa vs
from
.Fa addr
to
.Fa addr + len .
These functions are normally used to wire memory for I/O.
.Pp
.Fn uvm_meter
calculates the load average.
.Pp
.Fn uvm_proc_fork
forks a virtual address space for the (old) process
.Fa p1
and the (new) process
.Fa p2 .
If the
.Fa shared
argument is non-zero,
.Fa p1
shares its address space with
.Fa p2 ;
otherwise a new address space is created.
This function currently has no return value, and thus cannot fail.
In the future, this function will be changed to allow it to
fail in low memory conditions.
.Pp
.Fn uvm_grow
increases the stack segment of process
.Fa p
to include
.Fa sp .
.Pp
.Fn uvn_findpages
looks up or creates pages in
.Fa uobj
at offset
.Fa offset ,
marks them busy and returns them in the
.Fa pps
array.
Currently
.Fa uobj
must be a vnode object.
The number of pages requested is pointed to by
.Fa npagesp ,
and this value is updated with the actual number of pages returned.
The flags can be any bitwise inclusive-or of:
.Pp
.Bl -tag -offset abcd -compact -width UVM_ADV_SEQUENTIAL
.It Dv UFP_ALL
Zero pseudo-flag meaning return all pages.
.It Dv UFP_NOWAIT
Don't sleep -- yield
.Dv NULL
for busy pages or for uncached pages for which allocation would sleep.
.It Dv UFP_NOALLOC
Don't allocate -- yield
.Dv NULL
for uncached pages.
.It Dv UFP_NOCACHE
Don't use cached pages -- yield
.Dv NULL
instead.
.It Dv UFP_NORDONLY
Don't yield read-only pages -- yield
.Dv NULL
for pages marked
.Dv PG_READONLY .
.It Dv UFP_DIRTYONLY
Don't yield clean pages -- stop early at the first clean one.
As a side effect, mark yielded dirty pages clean.
The caller must write them to permanent storage before unbusying.
.It Dv UFP_BACKWARD
Traverse pages in reverse order.
If
.Fn uvn_findpages
returns early, it will have filled
.Li * Ns Fa npagesp
entries at the end of
.Fa pps
rather than the beginning.
.El
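.Pp
The effect of
.Dv UFP_BACKWARD
on where results land in
.Fa pps
can be sketched as a simple index calculation.
The helper
.Li ex_first_filled
is a hypothetical name; it only restates the placement rule given above
and does not call into UVM.
.Bd -literal
```c
#include <assert.h>
#include <stdbool.h>

/*
 * Given the number of pages requested and the number actually
 * returned, compute the index of the first filled slot in the array.
 */
static int
ex_first_filled(int requested, int returned, bool backward)
{
	assert(returned >= 0 && returned <= requested);
	return backward ? requested - returned : 0;
}
```
.Ed
.Pp
With eight pages requested and three returned, a forward traversal
fills slots 0 to 2, whereas a backward traversal fills slots 5 to 7.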
.Pp
.Fn uvm_vnp_setsize
sets the size of vnode
.Fa vp
to
.Fa newsize .
The caller must hold a reference to the vnode.
If the vnode shrinks, pages no longer used are discarded.
.Sh MISCELLANEOUS MACROS
.Bl -ohang
.It Ft paddr_t
.Fn atop "paddr_t pa" ;
.It Ft paddr_t
.Fn ptoa "paddr_t pn" ;
.It Ft paddr_t
.Fn round_page "address" ;
.It Ft paddr_t
.Fn trunc_page "address" ;
.El
.Pp
The
.Fn atop
macro converts a physical address
.Fa pa
into a page number.
The
.Fn ptoa
macro does the opposite by converting a page number
.Fa pn
into a physical address.
.Pp
The
.Fn round_page
and
.Fn trunc_page
macros return a page address boundary by rounding
.Fa address
up and down, respectively, to the nearest page boundary.
These macros work for either addresses or byte counts.
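.Pp
The arithmetic behind these macros can be tried out in userland with
the illustrative equivalents below.
They are not the kernel definitions: the
.Li ex_
names are hypothetical, and a page shift of 12 (4096-byte pages) is
assumed.
.Bd -literal
```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t ex_paddr_t;	/* hypothetical stand-in for paddr_t */

#define EX_PAGE_SHIFT	12	/* assumed: 4096-byte pages */
#define EX_PAGE_SIZE	((ex_paddr_t)1 << EX_PAGE_SHIFT)
#define EX_PAGE_MASK	(EX_PAGE_SIZE - 1)

/* Illustrative equivalents of the four macros described above. */
#define ex_atop(pa)	((ex_paddr_t)(pa) >> EX_PAGE_SHIFT)
#define ex_ptoa(pn)	((ex_paddr_t)(pn) << EX_PAGE_SHIFT)
#define ex_trunc_page(x) \
	((ex_paddr_t)(x) & ~EX_PAGE_MASK)
#define ex_round_page(x) \
	(((ex_paddr_t)(x) + EX_PAGE_MASK) & ~EX_PAGE_MASK)
```
.Ed
.Pp
Under these assumptions, address 0x1234 truncates to 0x1000 and rounds
up to 0x2000, and an address that is already page-aligned is returned
unchanged by both.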
.Sh SYSCTL
UVM provides support for the
.Dv CTL_VM
domain of the
.Xr sysctl 3
hierarchy.
It handles the
.Dv VM_LOADAVG ,
.Dv VM_METER ,
.Dv VM_UVMEXP ,
and
.Dv VM_UVMEXP2
nodes, which return the current load averages, the current VM
totals, the uvmexp structure, and a kernel-version-independent
view of the uvmexp structure, respectively.
It also exports a number of tunables that control how much VM space is
allowed to be consumed by various tasks.
The load averages are typically accessed from userland using the
.Xr getloadavg 3
function.
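.Pp
A minimal userland sketch of that access path is shown below.
The wrapper name
.Li ex_read_loadavg
is hypothetical, and the feature-test macro at the top is an assumption
needed only on systems whose C library hides
.Fn getloadavg
by default.
.Bd -literal
```c
#define _DEFAULT_SOURCE		/* lets glibc declare getloadavg() */
#include <assert.h>
#include <stdlib.h>

/*
 * Fetch up to three load averages (1, 5 and 15 minutes).
 * Returns the number of samples stored, or -1 on failure.
 */
static int
ex_read_loadavg(double avg[3])
{
	return getloadavg(avg, 3);
}
```
.Ed
.Pp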
The uvmexp structure has all global state of the UVM system,
and has the following members:
.Bd -literal
/* vm_page constants */
int pagesize;   /* size of a page (PAGE_SIZE): must be power of 2 */
int pagemask;   /* page mask */
int pageshift;  /* page shift */

/* vm_page counters */
int npages;     /* number of pages we manage */
int free;       /* number of free pages */
int paging;     /* number of pages in the process of being paged out */
int wired;      /* number of wired pages */
int reserve_pagedaemon; /* number of pages reserved for pagedaemon */
int reserve_kernel; /* number of pages reserved for kernel */

/* pageout params */
int freemin;    /* min number of free pages */
int freetarg;   /* target number of free pages */
int inactarg;   /* target number of inactive pages */
int wiredmax;   /* max number of wired pages */

/* swap */
int nswapdev;   /* number of configured swap devices in system */
int swpages;    /* number of PAGE_SIZE'ed swap pages */
int swpginuse;  /* number of swap pages in use */
int nswget;     /* number of times fault calls uvm_swap_get() */
int nanon;      /* number total of anon's in system */
int nfreeanon;  /* number of free anon's */

/* stat counters */
int faults;             /* page fault count */
int traps;              /* trap count */
int intrs;              /* interrupt count */
int swtch;              /* context switch count */
int softs;              /* software interrupt count */
int syscalls;           /* system calls */
int pageins;            /* pagein operation count */
                        /* pageouts are in pdpageouts below */
int pgswapin;           /* pages swapped in */
int pgswapout;          /* pages swapped out */
int forks;              /* forks */
int forks_ppwait;       /* forks where parent waits */
int forks_sharevm;      /* forks where vmspace is shared */

/* fault subcounters */
int fltnoram;   /* number of times fault was out of ram */
int fltnoanon;  /* number of times fault was out of anons */
int fltpgwait;  /* number of times fault had to wait on a page */
int fltpgrele;  /* number of times fault found a released page */
int fltrelck;   /* number of times fault relock called */
int fltrelckok; /* number of times fault relock is a success */
int fltanget;   /* number of times fault gets anon page */
int fltanretry; /* number of times fault retrys an anon get */
int fltamcopy;  /* number of times fault clears "needs copy" */
int fltnamap;   /* number of times fault maps a neighbor anon page */
int fltnomap;   /* number of times fault maps a neighbor obj page */
int fltlget;    /* number of times fault does a locked pgo_get */
int fltget;     /* number of times fault does an unlocked get */
int flt_anon;   /* number of times fault anon (case 1a) */
int flt_acow;   /* number of times fault anon cow (case 1b) */
int flt_obj;    /* number of times fault is on object page (2a) */
int flt_prcopy; /* number of times fault promotes with copy (2b) */
int flt_przero; /* number of times fault promotes with zerofill (2b) */

/* daemon counters */
int pdwoke;     /* number of times daemon woke up */
int pdrevs;     /* number of times daemon rev'd clock hand */
int pdfreed;    /* number of pages daemon freed since boot */
int pdscans;    /* number of pages daemon scanned since boot */
int pdanscan;   /* number of anonymous pages scanned by daemon */
int pdobscan;   /* number of object pages scanned by daemon */
int pdreact;    /* number of pages daemon reactivated since boot */
int pdbusy;     /* number of times daemon found a busy page */
int pdpageouts; /* number of times daemon started a pageout */
int pdpending;  /* number of times daemon got a pending pageout */
int pddeact;    /* number of pages daemon deactivates */
.Ed
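.Pp
As an illustration of how the page counters relate to byte quantities,
the userland sketch below works on a hypothetical three-member subset of
the structure.
The names
.Li ex_uvmexp ,
.Li ex_pages_to_bytes
and
.Li ex_free_percent
are invented for this example, which assumes that a page count is
converted to bytes by shifting it left by pageshift.
.Bd -literal
```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical subset of the members listed above. */
struct ex_uvmexp {
	int pageshift;
	int npages;
	int free;
};

/* Convert a page count into bytes using the page shift. */
static uint64_t
ex_pages_to_bytes(const struct ex_uvmexp *u, int pages)
{
	assert(pages >= 0);
	return (uint64_t)pages << u->pageshift;
}

/* Percentage of managed pages that are currently free (0 to 100). */
static int
ex_free_percent(const struct ex_uvmexp *u)
{
	if (u->npages <= 0)
		return 0;
	return (int)((int64_t)u->free * 100 / u->npages);
}
```
.Ed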
.Sh NOTES
.Fn uvm_chgkprot
is only available if the kernel has been compiled with options
.Dv KGDB .
.Pp
All structures and types whose names begin with
.Dq vm_
will be renamed to
.Dq uvm_ .
.Sh SEE ALSO
.Xr swapctl 2 ,
.Xr getloadavg 3 ,
.Xr kvm 3 ,
.Xr sysctl 3 ,
.Xr ddb 4 ,
.Xr options 4 ,
.Xr memoryallocators 9 ,
.Xr pmap 9 ,
.Xr ubc 9 ,
.Xr uvm_km 9 ,
.Xr uvm_map 9
.Rs
.%A Charles D. Cranor
.%A Gurudatta M. Parulkar
.%T "The UVM Virtual Memory System"
.%I USENIX Association
.%B Proceedings of the USENIX Annual Technical Conference
.%P 117-130
.%D June 6-11, 1999
.%U http://www.usenix.org/event/usenix99/full_papers/cranor/cranor.pdf
.Re
.Sh HISTORY
UVM is a new VM system developed at Washington University in St. Louis
(Missouri).
UVM's roots lie partly in the Mach-based
.Bx 4.4
VM system, the
.Fx
VM system, and the SunOS 4 VM system.
UVM's basic structure is based on the
.Bx 4.4
VM system.
UVM's new anonymous memory system is based on the
anonymous memory system found in the SunOS 4 VM (as described in papers
published by Sun Microsystems, Inc.).
UVM also includes a number of features new to
.Bx
including page loanout, map entry passing, simplified
copy-on-write, and clustered anonymous memory pageout.
UVM is also further documented in an August 1998 dissertation by
Charles D. Cranor.
.Pp
UVM appeared in
.Nx 1.4 .
.Sh AUTHORS
.An -nosplit
.An Charles D. Cranor
.Aq Mt chuck@ccrc.wustl.edu
designed and implemented UVM.
.Pp
.An Matthew Green
.Aq Mt mrg@eterna.com.au
wrote the swap-space management code and handled the logistical issues
involved with merging UVM into the
.Nx
source tree.
.Pp
.An Chuck Silvers
.Aq Mt chuq@chuq.com
implemented the aobj pager, thus allowing UVM to support System V shared
memory and process swapping.
705