.\"	$NetBSD: uvm.9,v 1.60 2005/01/11 09:46:49 wiz Exp $
.\"
.\" Copyright (c) 1998 Matthew R. Green
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. The name of the author may not be used to endorse or promote products
.\"    derived from this software without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
.\" IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
.\" OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
.\" IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
.\" INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
.\" BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
.\" LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED
.\" AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
.\" OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.Dd January 9, 2005
.Dt UVM 9
.Os
.Sh NAME
.Nm uvm
.Nd virtual memory system external interface
.Sh SYNOPSIS
.In sys/param.h
.In uvm/uvm.h
.Sh DESCRIPTION
The UVM virtual memory system manages access to the computer's memory
resources.
User processes and the kernel access these resources through
UVM's external interface.
UVM's external interface includes functions that:
.Pp
.Bl -hyphen -compact
.It
initialise UVM sub-systems
.It
manage virtual address spaces
.It
resolve page faults
.It
memory map files and devices
.It
perform uio-based I/O to virtual memory
.It
allocate and free kernel virtual memory
.It
allocate and free physical memory
.El
.Pp
In addition to exporting these services, UVM has two kernel-level processes:
pagedaemon and swapper.
The pagedaemon process sleeps until physical memory becomes scarce.
When that happens, the pagedaemon is awoken.
It scans physical memory, paging out and freeing memory that has not
been recently used.
The swapper process swaps in runnable processes that are currently swapped
out, if there is room.
.Pp
There are also several miscellaneous functions.
.Sh INITIALISATION
.Ft void
.br
.Fn uvm_init "void" ;
.Pp
.Ft void
.br
.Fn uvm_init_limits "struct proc *p" ;
.Pp
.Ft void
.br
.Fn uvm_setpagesize "void" ;
.Pp
.Ft void
.br
.Fn uvm_swap_init "void" ;
.Pp
.Fn uvm_init
sets up the UVM system at system boot time, after the
copyright has been printed.
It initialises global state, the page, map, kernel virtual memory state,
machine-dependent physical map, kernel memory allocator,
pager and anonymous memory sub-systems, and then enables
paging of kernel objects.
.Pp
.Fn uvm_init_limits
initialises process limits for the named process.
This is for use by the system startup for process zero, before any
other processes are created.
.Pp
.Fn uvm_setpagesize
initialises the uvmexp members pagesize (if not already done by
machine-dependent code), pageshift and pagemask.
It should be called by machine-dependent code early in the
.Fn pmap_init
call (see
.Xr pmap 9 ) .
.Pp
.Fn uvm_swap_init
initialises the swap sub-system.
.Sh VIRTUAL ADDRESS SPACE MANAGEMENT
.Ft int
.br
.Fn uvm_map "struct vm_map *map" "vaddr_t *startp" "vsize_t size" "struct uvm_object *uobj" "voff_t uoffset" "vsize_t align" "uvm_flag_t flags" ;
.Pp
.Ft int
.br
.Fn uvm_unmap "struct vm_map *map" "vaddr_t start" "vaddr_t end" ;
.Pp
.Ft int
.br
.Fn uvm_map_pageable "struct vm_map *map" "vaddr_t start" "vaddr_t end" "boolean_t new_pageable" "int lockflags" ;
.Pp
.Ft boolean_t
.br
.Fn uvm_map_checkprot "struct vm_map *map" "vaddr_t start" "vaddr_t end" "vm_prot_t protection" ;
.Pp
.Ft int
.br
.Fn uvm_map_protect "struct vm_map *map" "vaddr_t start" "vaddr_t end" "vm_prot_t new_prot" "boolean_t set_max" ;
.Pp
.Ft int
.br
.Fn uvm_deallocate "struct vm_map *map" "vaddr_t start" "vsize_t size" ;
.Pp
.Ft struct vmspace *
.br
.Fn uvmspace_alloc "vaddr_t min" "vaddr_t max" "int pageable" ;
.Pp
.Ft void
.br
.Fn uvmspace_exec "struct proc *p" "vaddr_t start" "vaddr_t end" ;
.Pp
.Ft struct vmspace *
.br
.Fn uvmspace_fork "struct vmspace *vm" ;
.Pp
.Ft void
.br
.Fn uvmspace_free "struct vmspace *vm1" ;
.Pp
.Ft void
.br
.Fn uvmspace_share "struct proc *p1" "struct proc *p2" ;
.Pp
.Ft void
.br
.Fn uvmspace_unshare "struct proc *p" ;
.Pp
.Ft boolean_t
.br
.Fn uvm_uarea_alloc "vaddr_t *uaddrp" ;
.Pp
.Ft void
.br
.Fn uvm_uarea_free "vaddr_t uaddr" ;
.Pp
.Fn uvm_map
establishes a valid mapping in map
.Fa map ,
which must be unlocked.
The new mapping has size
.Fa size ,
which must be a multiple of
.Dv PAGE_SIZE .
The
.Fa uobj
and
.Fa uoffset
arguments can have four meanings.
When
.Fa uobj
is
.Dv NULL
and
.Fa uoffset
is
.Dv UVM_UNKNOWN_OFFSET ,
.Fn uvm_map
does not use the machine-dependent
.Dv PMAP_PREFER
function.
If
.Fa uoffset
is any other value, it is used as the hint to
.Dv PMAP_PREFER .
When
.Fa uobj
is not
.Dv NULL
and
.Fa uoffset
is
.Dv UVM_UNKNOWN_OFFSET ,
.Fn uvm_map
finds the offset based upon the virtual address, passed as
.Fa startp .
If
.Fa uoffset
is any other value, a regular mapping is established at that offset.
The start address of the map will be returned in
.Fa startp .
.Pp
.Fa align
specifies the alignment of the mapping unless
.Dv UVM_FLAG_FIXED
is specified in
.Fa flags .
.Fa align
must be a power of 2.
.Pp
.Fa flags
passed to
.Fn uvm_map
are typically created using the
.Fn UVM_MAPFLAG "vm_prot_t prot" "vm_prot_t maxprot" "vm_inherit_t inh" "int advice" "int flags"
macro, which uses the following values.
The values that
.Fa prot
and
.Fa maxprot
can take are:
.Bd -literal
#define UVM_PROT_MASK   0x07    /* protection mask */
#define UVM_PROT_NONE   0x00    /* protection none */
#define UVM_PROT_ALL    0x07    /* everything */
#define UVM_PROT_READ   0x01    /* read */
#define UVM_PROT_WRITE  0x02    /* write */
#define UVM_PROT_EXEC   0x04    /* exec */
#define UVM_PROT_R      0x01    /* read */
#define UVM_PROT_W      0x02    /* write */
#define UVM_PROT_RW     0x03    /* read-write */
#define UVM_PROT_X      0x04    /* exec */
#define UVM_PROT_RX     0x05    /* read-exec */
#define UVM_PROT_WX     0x06    /* write-exec */
#define UVM_PROT_RWX    0x07    /* read-write-exec */
.Ed
.Pp
The values that
.Fa inh
can take are:
.Bd -literal
#define UVM_INH_MASK    0x30    /* inherit mask */
#define UVM_INH_SHARE   0x00    /* "share" */
#define UVM_INH_COPY    0x10    /* "copy" */
#define UVM_INH_NONE    0x20    /* "none" */
#define UVM_INH_DONATE  0x30    /* "donate" \*[Lt]\*[Lt] not used */
.Ed
.Pp
The values that
.Fa advice
can take are:
.Bd -literal
#define UVM_ADV_NORMAL     0x0  /* 'normal' */
#define UVM_ADV_RANDOM     0x1  /* 'random' */
#define UVM_ADV_SEQUENTIAL 0x2  /* 'sequential' */
#define UVM_ADV_MASK       0x7  /* mask */
.Ed
.Pp
The values that
.Fa flags
can take are:
.Bd -literal
#define UVM_FLAG_FIXED   0x010000 /* find space */
#define UVM_FLAG_OVERLAY 0x020000 /* establish overlay */
#define UVM_FLAG_NOMERGE 0x040000 /* don't merge map entries */
#define UVM_FLAG_COPYONW 0x080000 /* set copy_on_write flag */
#define UVM_FLAG_AMAPPAD 0x100000 /* for bss: pad amap to reduce malloc() */
#define UVM_FLAG_TRYLOCK 0x200000 /* fail if we can not lock map */
.Ed
.Pp
The
.Dv UVM_MAPFLAG
macro arguments can be combined with a bitwise OR operator.
There are several special purpose macros for checking protection
combinations, e.g., the
.Dv UVM_PROT_WX
macro.
There are also some additional macros to extract bits from the flags.
The
.Dv UVM_PROTECTION ,
.Dv UVM_INHERIT ,
.Dv UVM_MAXPROTECTION
and
.Dv UVM_ADVICE
macros return the protection, inheritance, maximum protection and advice,
respectively.
.Fn uvm_map
returns a standard UVM return value.
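.Pp
As an illustrative sketch only (not quoted from the kernel sources;
the variable names and sizes are assumptions), a typical anonymous,
copy-on-write mapping at a UVM-chosen address might be established
like this:
.Bd -literal
	vaddr_t va = 0;		/* let uvm_map() choose the address */
	int error;

	error = uvm_map(map, &va, round_page(size), NULL,
	    UVM_UNKNOWN_OFFSET, 0,
	    UVM_MAPFLAG(UVM_PROT_RW, UVM_PROT_ALL, UVM_INH_COPY,
		UVM_ADV_NORMAL, UVM_FLAG_COPYONW));
	if (error)
		return error;	/* standard UVM return value */
	/* on success, va holds the start of the new mapping */
.Ed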
.Pp
.Fn uvm_unmap
removes a valid mapping from
.Fa start
to
.Fa end ,
in map
.Fa map ,
which must be unlocked.
.Pp
.Fn uvm_map_pageable
changes the pageability of the pages in the range from
.Fa start
to
.Fa end
in map
.Fa map
to
.Fa new_pageable .
.Fn uvm_map_pageable
returns a standard UVM return value.
.Pp
.Fn uvm_map_checkprot
checks the protection of the range from
.Fa start
to
.Fa end
in map
.Fa map
against
.Fa protection .
This returns either
.Dv TRUE
or
.Dv FALSE .
.Pp
.Fn uvm_map_protect
changes the protection of the range from
.Fa start
to
.Fa end
in map
.Fa map
to
.Fa new_prot ,
also setting the maximum protection of the region to
.Fa new_prot
if
.Fa set_max
is non-zero.
This function returns a standard UVM return value.
.Pp
.Fn uvm_deallocate
deallocates kernel memory in map
.Fa map
from address
.Fa start
to
.Fa start + size .
.Pp
.Fn uvmspace_alloc
allocates and returns a new address space, with ranges from
.Fa min
to
.Fa max ,
setting the pageability of the address space to
.Fa pageable .
.Pp
.Fn uvmspace_exec
either reuses the address space of process
.Fa p
if there are no other references to it, or creates
a new one with
.Fn uvmspace_alloc .
The range of valid addresses in the address space is reset to
.Fa start
through
.Fa end .
.Pp
.Fn uvmspace_fork
creates and returns a new address space based upon the
.Fa vm1
address space, typically used when allocating an address space for a
child process.
.Pp
.Fn uvmspace_free
lowers the reference count on the address space
.Fa vm1 ,
freeing the data structures if there are no other references.
.Pp
.Fn uvmspace_share
causes process
.Fa p2
to share the address space of
.Fa p1 .
.Pp
.Fn uvmspace_unshare
ensures that process
.Fa p
has its own, unshared address space, by creating a new one if
necessary by calling
.Fn uvmspace_fork .
.Pp
.Fn uvm_uarea_alloc
allocates virtual space for a u-area (i.e., a kernel stack) and stores
its virtual address in
.Fa *uaddrp .
The return value is
.Dv TRUE
if the u-area is already backed by wired physical memory, otherwise
.Dv FALSE .
.Pp
.Fn uvm_uarea_free
frees a u-area allocated with
.Fn uvm_uarea_alloc ,
freeing both the virtual space and any physical pages that may later
have been allocated to back that virtual space.
.Sh PAGE FAULT HANDLING
.Ft int
.br
.Fn uvm_fault "struct vm_map *orig_map" "vaddr_t vaddr" "vm_fault_t fault_type" "vm_prot_t access_type" ;
.Pp
.Fn uvm_fault
is the main entry point for faults.
It takes
.Fa orig_map
as the map the fault originated in, the
.Fa vaddr
offset into the map at which the fault occurred, a
.Fa fault_type
describing the type of fault, and an
.Fa access_type
describing the type of access requested.
.Fn uvm_fault
returns a standard UVM return value.
.Sh MEMORY MAPPING FILES AND DEVICES
.Ft struct uvm_object *
.br
.Fn uvn_attach "void *arg" "vm_prot_t accessprot" ;
.Pp
.Ft void
.br
.Fn uvm_vnp_setsize "struct vnode *vp" "voff_t newsize" ;
.Pp
.Ft void *
.br
.Fn ubc_alloc "struct uvm_object *uobj" "voff_t offset" "vsize_t *lenp" "int flags" ;
.Pp
.Ft void
.br
.Fn ubc_release "void *va" "int flags" ;
.Pp
.Fn uvn_attach
attaches a UVM object to vnode
.Fa arg ,
creating the object if necessary.
The object is returned.
.Pp
.Fn uvm_vnp_setsize
sets the size of vnode
.Fa vp
to
.Fa newsize .
The caller must hold a reference to the vnode.
If the vnode shrinks, pages no longer used are discarded.
.Pp
.Fn ubc_alloc
creates a kernel mapping of
.Fa uobj
starting at offset
.Fa offset .
The desired length of the mapping is pointed to by
.Fa lenp ,
but the actual mapping may be smaller than this.
.Fa lenp
is updated to contain the actual length mapped.
The flags must be one of
.Bd -literal
#define UBC_READ        0x01    /* mapping will be accessed for read */
#define UBC_WRITE       0x02    /* mapping will be accessed for write */
.Ed
.Pp
Currently,
.Fa uobj
must actually be a vnode object.
Once the mapping is created, it must be accessed only by methods that can
handle faults, such as
.Fn uiomove
or
.Fn kcopy .
Page faults on the mapping will result in the vnode's
.Fn VOP_GETPAGES
method being called to resolve the fault.
.Pp
.Fn ubc_release
frees the mapping at
.Fa va
for reuse.
The mapping may be cached to speed future accesses to the same region
of the object.
The flags can be any of
.Bd -literal
#define UBC_UNMAP       0x01    /* do not cache mapping */
.Ed
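.Pp
The usual pattern built from these functions can be sketched as follows
(a hedged illustration, assuming a vnode-backed
.Fa uobj
and a uio already prepared by the caller):
.Bd -literal
	while (uio->uio_resid > 0 && error == 0) {
		vsize_t len = uio->uio_resid;
		void *win;

		/* map a window of the object; len may shrink */
		win = ubc_alloc(uobj, uio->uio_offset, &len, UBC_READ);
		error = uiomove(win, len, uio);	/* may fault safely */
		ubc_release(win, 0);		/* 0: allow caching */
	}
.Ed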
.Sh VIRTUAL MEMORY I/O
.Ft int
.br
.Fn uvm_io "struct vm_map *map" "struct uio *uio" ;
.Pp
.Fn uvm_io
performs the I/O described in
.Fa uio
on the memory described in
.Fa map .
.Sh ALLOCATION OF KERNEL MEMORY
.Ft vaddr_t
.br
.Fn uvm_km_alloc "struct vm_map *map" "vsize_t size" ;
.Pp
.Ft vaddr_t
.br
.Fn uvm_km_zalloc "struct vm_map *map" "vsize_t size" ;
.Pp
.Ft vaddr_t
.br
.Fn uvm_km_alloc1 "struct vm_map *map" "vsize_t size" "boolean_t zeroit" ;
.Pp
.Ft vaddr_t
.br
.Fn uvm_km_kmemalloc1 "struct vm_map *map" "struct uvm_object *obj" "vsize_t size" "vsize_t align" "voff_t prefer" "int flags" ;
.Pp
.Ft vaddr_t
.br
.Fn uvm_km_kmemalloc "struct vm_map *map" "struct uvm_object *obj" "vsize_t size" "int flags" ;
.Pp
.Ft vaddr_t
.br
.Fn uvm_km_valloc "struct vm_map *map" "vsize_t size" ;
.Pp
.Ft vaddr_t
.br
.Fn uvm_km_valloc_wait "struct vm_map *map" "vsize_t size" ;
.Pp
.Ft struct vm_map *
.br
.Fn uvm_km_suballoc "struct vm_map *map" "vaddr_t *min" "vaddr_t *max" "vsize_t size" "boolean_t pageable" "boolean_t fixed" "struct vm_map *submap" ;
.Pp
.Ft void
.br
.Fn uvm_km_free "struct vm_map *map" "vaddr_t addr" "vsize_t size" ;
.Pp
.Ft void
.br
.Fn uvm_km_free_wakeup "struct vm_map *map" "vaddr_t addr" "vsize_t size" ;
.Pp
.Fn uvm_km_alloc
and
.Fn uvm_km_zalloc
allocate
.Fa size
bytes of wired kernel memory in map
.Fa map .
In addition to allocation,
.Fn uvm_km_zalloc
zeros the memory.
Both of these functions are defined as macros in terms of
.Fn uvm_km_alloc1 ,
and should almost always be used in preference to
.Fn uvm_km_alloc1 .
.Pp
.Fn uvm_km_alloc1
allocates and returns
.Fa size
bytes of wired memory in the kernel map, zeroing the memory if the
.Fa zeroit
argument is non-zero.
.Pp
.Fn uvm_km_kmemalloc1
allocates and returns
.Fa size
bytes of wired kernel memory into
.Fa obj .
The first address of the allocated memory range will be aligned according to the
.Fa align
argument
.Pq specify 0 if no alignment is necessary .
The flags can be any of:
.Bd -literal
#define UVM_KMF_NOWAIT  0x1                     /* matches M_NOWAIT */
#define UVM_KMF_VALLOC  0x2                     /* allocate VA only */
#define UVM_KMF_CANFAIL 0x4                     /* caller handles failure */
#define UVM_KMF_TRYLOCK UVM_FLAG_TRYLOCK        /* try locking only */
.Ed
.Pp
.Dv UVM_KMF_NOWAIT
causes
.Fn uvm_km_kmemalloc1
to return immediately if no memory is available.
.Dv UVM_KMF_VALLOC
causes no physical pages to be allocated, only virtual space.
.Dv UVM_KMF_TRYLOCK
causes
.Fn uvm_km_kmemalloc1
to use
.Fn simple_lock_try
when locking maps.
.Dv UVM_KMF_CANFAIL
indicates that
.Fn uvm_km_kmemalloc1
can return 0 even if
.Dv UVM_KMF_NOWAIT
is not specified.
(If neither
.Dv UVM_KMF_NOWAIT
nor
.Dv UVM_KMF_CANFAIL
are specified,
.Fn uvm_km_kmemalloc1
will never fail, but rather sleep indefinitely until the allocation succeeds.)
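.Pp
A hedged sketch of an allocation that tolerates failure (the map,
variable names and length are assumptions for illustration):
.Bd -literal
	vaddr_t kva;

	kva = uvm_km_kmemalloc1(kmem_map, NULL, round_page(len),
	    0, UVM_UNKNOWN_OFFSET, UVM_KMF_NOWAIT | UVM_KMF_CANFAIL);
	if (kva == 0)
		return ENOMEM;	/* allocation failed, caller handles it */
	/* ... use the wired memory at kva ... */
	uvm_km_free(kmem_map, kva, round_page(len));
.Ed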
.Pp
.Fn uvm_km_kmemalloc
allocates kernel memory like
.Fn uvm_km_kmemalloc1
but uses the default values
.Dv 0
for the
.Fa align ,
and
.Dv UVM_UNKNOWN_OFFSET
for the
.Fa prefer
arguments.
.Pp
.Fn uvm_km_valloc
and
.Fn uvm_km_valloc_wait
return a newly allocated zero-filled address in the kernel map of size
.Fa size .
.Fn uvm_km_valloc_wait
will also wait for kernel memory to become available, if there is a
memory shortage.
.Pp
.Fn uvm_km_free
and
.Fn uvm_km_free_wakeup
free
.Fa size
bytes of memory in the kernel map, starting at address
.Fa addr .
.Fn uvm_km_free_wakeup
calls
.Fn wakeup
on the map before unlocking the map.
.Pp
.Fn uvm_km_suballoc
allocates a submap of
.Fa map ,
creating a new map if
.Fa submap
is
.Dv NULL .
The addresses of the submap can be specified exactly by setting the
.Fa fixed
argument to non-zero, which causes the
.Fa min
argument to specify the starting address of the submap.
If
.Fa fixed
is zero, any address of size
.Fa size
will be allocated from
.Fa map
and the start and end addresses returned in
.Fa min
and
.Fa max .
If
.Fa pageable
is non-zero, entries in the map may be paged out.
.Sh ALLOCATION OF PHYSICAL MEMORY
.Ft struct vm_page *
.br
.Fn uvm_pagealloc "struct uvm_object *uobj" "voff_t off" "struct vm_anon *anon" "int flags" ;
.Pp
.Ft void
.br
.Fn uvm_pagerealloc "struct vm_page *pg" "struct uvm_object *newobj" "voff_t newoff" ;
.Pp
.Ft void
.br
.Fn uvm_pagefree "struct vm_page *pg" ;
.Pp
.Ft int
.br
.Fn uvm_pglistalloc "psize_t size" "paddr_t low" "paddr_t high" "paddr_t alignment" "paddr_t boundary" "struct pglist *rlist" "int nsegs" "int waitok" ;
.Pp
.Ft void
.br
.Fn uvm_pglistfree "struct pglist *list" ;
.Pp
.Ft void
.br
.Fn uvm_page_physload "vaddr_t start" "vaddr_t end" "vaddr_t avail_start" "vaddr_t avail_end" "int free_list" ;
.Pp
.Fn uvm_pagealloc
allocates a page of memory at virtual address
.Fa off
in either the object
.Fa uobj
or the anonymous memory
.Fa anon ,
which must be locked by the caller.
Only one of
.Fa uobj
and
.Fa anon
can be non
.Dv NULL .
Returns
.Dv NULL
when no page can be found.
The flags can be any of
.Bd -literal
#define UVM_PGA_USERESERVE      0x0001  /* ok to use reserve pages */
#define UVM_PGA_ZERO            0x0002  /* returned page must be zero'd */
.Ed
.Pp
.Dv UVM_PGA_USERESERVE
means to allocate a page even if that will result in the number of free pages
being lower than
.Dv uvmexp.reserve_pagedaemon
(if the current thread is the pagedaemon) or
.Dv uvmexp.reserve_kernel
(if the current thread is not the pagedaemon).
.Dv UVM_PGA_ZERO
causes the returned page to be filled with zeroes, either by allocating it
from a pool of pre-zeroed pages or by zeroing it in-line as necessary.
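.Pp
As a non-authoritative sketch, a caller needing a zero-filled page in an
object commonly retries until a page becomes available (locking around
the allocation and the wait is omitted here for brevity):
.Bd -literal
	struct vm_page *pg;

	for (;;) {
		pg = uvm_pagealloc(uobj, off, NULL, UVM_PGA_ZERO);
		if (pg != NULL)
			break;		/* got a zero-filled page */
		uvm_wait("pgalloc");	/* sleep until pagedaemon frees memory */
	}
.Ed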
.Pp
.Fn uvm_pagerealloc
reallocates page
.Fa pg
to a new object
.Fa newobj ,
at a new offset
.Fa newoff .
.Pp
.Fn uvm_pagefree
frees the physical page
.Fa pg .
If the content of the page is known to be zero-filled,
the caller should set
.Dv PG_ZERO
in pg-\*[Gt]flags so that the page allocator will use
the page to serve future
.Dv UVM_PGA_ZERO
requests efficiently.
.Pp
.Fn uvm_pglistalloc
allocates a list of pages of size
.Fa size
bytes under various constraints.
.Fa low
and
.Fa high
describe the lowest and highest addresses acceptable for the list.
If
.Fa alignment
is non-zero, it describes the required alignment of the list, in
power-of-two notation.
If
.Fa boundary
is non-zero, no segment of the list may cross this power-of-two
boundary, relative to zero.
.Fa nsegs
is the maximum number of physically contiguous segments.
If
.Fa waitok
is non-zero, the function may sleep until enough memory is available.
(It also may give up in some situations, so a non-zero
.Fa waitok
does not imply that
.Fn uvm_pglistalloc
cannot return an error.)
The allocated memory is returned in the
.Fa rlist
list; the caller only has to provide storage, and the list is initialized by
.Fn uvm_pglistalloc .
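.Pp
A sketch of a DMA-style contiguous allocation (the constraint values
chosen below are assumptions purely for illustration):
.Bd -literal
	struct pglist plist;
	int error;

	/* one physically contiguous segment below 16MB, page aligned */
	error = uvm_pglistalloc(size, 0, 0x1000000, PAGE_SIZE, 0,
	    &plist, 1, 1);		/* nsegs = 1, waitok = 1 */
	if (error)
		return error;
	/* use the pages queued on plist, then release them */
	uvm_pglistfree(&plist);
.Ed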
.Pp
.Fn uvm_pglistfree
frees the list of pages pointed to by
.Fa list .
If the content of the page is known to be zero-filled,
the caller should set
.Dv PG_ZERO
in pg-\*[Gt]flags so that the page allocator will use
the page to serve future
.Dv UVM_PGA_ZERO
requests efficiently.
.Pp
.Fn uvm_page_physload
loads physical memory segments into VM space on the specified
.Fa free_list .
It must be called at system boot time to set up physical memory
management pages.
The arguments describe the
.Fa start
and
.Fa end
of the physical addresses of the segment, and the available start and end
addresses of pages not already in use.
.\" XXX expand on "system boot time"!
.Sh PROCESSES
.Ft void
.br
.Fn uvm_pageout "void" ;
.Pp
.Ft void
.br
.Fn uvm_scheduler "void" ;
.Pp
.Ft void
.br
.Fn uvm_swapin "struct proc *p" ;
.Pp
.Fn uvm_pageout
is the main loop for the page daemon.
.Pp
.Fn uvm_scheduler
is the process zero main loop, which is to be called after the
system has finished starting other processes.
It handles the swapping in of runnable, swapped out processes in priority
order.
.Pp
.Fn uvm_swapin
swaps in the named process.
.Sh PAGE LOAN
.Ft int
.br
.Fn uvm_loan "struct vm_map *map" "vaddr_t start" "vsize_t len" "void *v" "int flags" ;
.Pp
.Ft void
.br
.Fn uvm_unloan "void *v" "int npages" "int flags" ;
.Pp
.Fn uvm_loan
loans pages in a map out to anons or to the kernel.
.Fa map
should be unlocked, and
.Fa start
and
.Fa len
should be multiples of
.Dv PAGE_SIZE .
Argument
.Fa flags
should be one of
.Bd -literal
#define UVM_LOAN_TOANON       0x01    /* loan to anons */
#define UVM_LOAN_TOPAGE       0x02    /* loan to kernel */
.Ed
.Pp
.Fa v
should be a pointer to an array of pointers to
.Li struct anon
or
.Li struct vm_page ,
as appropriate.
The caller has to allocate memory for the array and
ensure it's big enough to hold
.Fa len / PAGE_SIZE
pointers.
Returns 0 on success, or an appropriate error number otherwise.
Note that wired pages can't be loaned out and
.Fn uvm_loan
will fail in that case.
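.Pp
A hedged sketch of loaning pages to the kernel, with the array sizing
following the description above (the malloc type and error handling are
illustrative assumptions):
.Bd -literal
	struct vm_page **pgpp;
	int error, npages = len >> PAGE_SHIFT;

	pgpp = malloc(npages * sizeof(*pgpp), M_TEMP, M_WAITOK);
	error = uvm_loan(map, start, len, pgpp, UVM_LOAN_TOPAGE);
	if (error == 0) {
		/* access the loaned pages through pgpp[0..npages-1] */
		uvm_unloan(pgpp, npages, UVM_LOAN_TOPAGE);
	}
	free(pgpp, M_TEMP);
.Ed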
.Pp
.Fn uvm_unloan
kills loans on pages or anons.
The
.Fa v
argument must point to the array of pointers initialized by a previous
call to
.Fn uvm_loan .
.Fa npages
should match the number of pages allocated for the loan, which is also
the number of items in the array.
Argument
.Fa flags
should be one of
.Bd -literal
#define UVM_LOAN_TOANON       0x01    /* loan to anons */
#define UVM_LOAN_TOPAGE       0x02    /* loan to kernel */
.Ed
.Pp
and should match what was used for the previous call to
.Fn uvm_loan .
.Sh MISCELLANEOUS FUNCTIONS
.Ft struct uvm_object *
.br
.Fn uao_create "vsize_t size" "int flags" ;
.Pp
.Ft void
.br
.Fn uao_detach "struct uvm_object *uobj" ;
.Pp
.Ft void
.br
.Fn uao_reference "struct uvm_object *uobj" ;
.Pp
.Ft boolean_t
.br
.Fn uvm_chgkprot "caddr_t addr" "size_t len" "int rw" ;
.Pp
.Ft void
.br
.Fn uvm_kernacc "caddr_t addr" "size_t len" "int rw" ;
.Pp
.Ft int
.br
.Fn uvm_vslock "struct proc *p" "caddr_t addr" "size_t len" "vm_prot_t prot" ;
.Pp
.Ft void
.br
.Fn uvm_vsunlock "struct proc *p" "caddr_t addr" "size_t len" ;
.Pp
.Ft void
.br
.Fn uvm_meter "void" ;
.Pp
.Ft void
.br
.Fn uvm_fork "struct proc *p1" "struct proc *p2" "boolean_t shared" ;
.Pp
.Ft int
.br
.Fn uvm_grow "struct proc *p" "vaddr_t sp" ;
.Pp
.Ft int
.br
.Fn uvm_coredump "struct proc *p" "struct vnode *vp" "struct ucred *cred" "struct core *chdr" ;
.Pp
.Ft void
.br
.Fn uvn_findpages "struct uvm_object *uobj" "voff_t offset" "int *npagesp" "struct vm_page **pps" "int flags" ;
.Pp
.Ft void
.br
.Fn uvm_swap_stats "int cmd" "struct swapent *sep" "int sec" "register_t *retval" ;
.Pp
The
.Fn uao_create ,
.Fn uao_detach ,
and
.Fn uao_reference
functions operate on anonymous memory objects, such as those used to support
System V shared memory.
.Fn uao_create
returns an object of size
.Fa size
with flags:
.Bd -literal
#define UAO_FLAG_KERNOBJ        0x1     /* create kernel object */
#define UAO_FLAG_KERNSWAP       0x2     /* enable kernel swap */
.Ed
.Pp
which can only be used once each at system boot time.
.Fn uao_reference
creates an additional reference to the named anonymous memory object.
.Fn uao_detach
removes a reference from the named anonymous memory object, destroying
it if removing the last reference.
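.Pp
A minimal lifecycle sketch for an anonymous memory object (the size and
the second user are assumptions for illustration):
.Bd -literal
	struct uvm_object *uobj;

	uobj = uao_create(round_page(size), 0);	/* regular aobj */
	uao_reference(uobj);	/* extra reference for a second user */
	/* ... map or do paging I/O through the object ... */
	uao_detach(uobj);	/* drop the extra reference */
	uao_detach(uobj);	/* last reference: object is destroyed */
.Ed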
.Pp
.Fn uvm_chgkprot
changes the protection of kernel memory from
.Fa addr
to
.Fa addr + len
to the value of
.Fa rw .
This is primarily useful for debuggers, for setting breakpoints.
This function is only available with options
.Dv KGDB .
.Pp
.Fn uvm_kernacc
checks the access at address
.Fa addr
to
.Fa addr + len
for
.Fa rw
access in the kernel address space.
.Pp
.Fn uvm_vslock
and
.Fn uvm_vsunlock
control the wiring and unwiring of pages for process
.Fa p
from
.Fa addr
to
.Fa addr + len .
These functions are normally used to wire memory for I/O.
.Pp
.Fn uvm_meter
calculates the load average and wakes up the swapper if necessary.
.Pp
.Fn uvm_fork
forks a virtual address space for the (old) process
.Fa p1
and the (new) process
.Fa p2 .
If the
.Fa shared
argument is non-zero,
.Fa p1
shares its address space with
.Fa p2 ,
otherwise a new address space is created.
This function currently has no return value, and thus cannot fail.
In the future, this function will be changed to allow it to
fail in low memory conditions.
.Pp
.Fn uvm_grow
increases the stack segment of process
.Fa p
to include
.Fa sp .
.Pp
.Fn uvm_coredump
generates a coredump on vnode
.Fa vp
for process
.Fa p
with credentials
.Fa cred
and core header description in
.Fa chdr .
.Pp
.Fn uvn_findpages
looks up or creates pages in
.Fa uobj
at offset
.Fa offset ,
marks them busy and returns them in the
.Fa pps
array.
Currently
.Fa uobj
must be a vnode object.
The number of pages requested is pointed to by
.Fa npagesp ,
and this value is updated with the actual number of pages returned.
The flags can be
.Bd -literal
#define UFP_ALL         0x00    /* return all pages requested */
#define UFP_NOWAIT      0x01    /* don't sleep */
#define UFP_NOALLOC     0x02    /* don't allocate new pages */
#define UFP_NOCACHE     0x04    /* don't return pages which already exist */
#define UFP_NORDONLY    0x08    /* don't return PG_READONLY pages */
.Ed
.Pp
.Dv UFP_ALL
is a pseudo-flag meaning all requested pages should be returned.
.Dv UFP_NOWAIT
means that we must not sleep.
.Dv UFP_NOALLOC
causes any pages which do not already exist to be skipped.
.Dv UFP_NOCACHE
causes any pages which do already exist to be skipped.
.Dv UFP_NORDONLY
causes any pages which are marked PG_READONLY to be skipped.
.Pp
.Fn uvm_swap_stats
implements the
.Dv SWAP_STATS
and
.Dv SWAP_OSTATS
operation of the
.Xr swapctl 2
system call.
.Fa cmd
is the requested command,
.Dv SWAP_STATS
or
.Dv SWAP_OSTATS .
The function will copy no more than
.Fa sec
entries into the array pointed to by
.Fa sep .
On return,
.Fa retval
holds the actual number of entries copied into the array.
1090.Sh SYSCTL
1091UVM provides support for the
1092.Dv CTL_VM
1093domain of the
1094.Xr sysctl 3
1095hierarchy.
1096It handles the
1097.Dv VM_LOADAVG ,
1098.Dv VM_METER ,
1099.Dv VM_UVMEXP ,
1100and
1101.Dv VM_UVMEXP2
1102nodes, which return the current load averages, calculates current VM
1103totals, returns the uvmexp structure, and a kernel version independent
1104view of the uvmexp structure, respectively.
1105It also exports a number of tunables that control how much VM space is
1106allowed to be consumed by various tasks.
1107The load averages are typically accessed from userland using the
1108.Xr getloadavg 3
1109function.
1110The uvmexp structure has all global state of the UVM system,
1111and has the following members:
.Bd -literal
/* vm_page constants */
int pagesize;   /* size of a page (PAGE_SIZE): must be power of 2 */
int pagemask;   /* page mask */
int pageshift;  /* page shift */

/* vm_page counters */
int npages;     /* number of pages we manage */
int free;       /* number of free pages */
int active;     /* number of active pages */
int inactive;   /* number of pages that we freed but may want back */
int paging;     /* number of pages in the process of being paged out */
int wired;      /* number of wired pages */
int reserve_pagedaemon; /* number of pages reserved for pagedaemon */
int reserve_kernel; /* number of pages reserved for kernel */

/* pageout params */
int freemin;    /* min number of free pages */
int freetarg;   /* target number of free pages */
int inactarg;   /* target number of inactive pages */
int wiredmax;   /* max number of wired pages */

/* swap */
int nswapdev;   /* number of configured swap devices in system */
int swpages;    /* number of PAGE_SIZE'ed swap pages */
int swpginuse;  /* number of swap pages in use */
int nswget;     /* number of times fault calls uvm_swap_get() */
int nanon;      /* total number of anons in system */
int nfreeanon;  /* number of free anons */

/* stat counters */
int faults;             /* page fault count */
int traps;              /* trap count */
int intrs;              /* interrupt count */
int swtch;              /* context switch count */
int softs;              /* software interrupt count */
int syscalls;           /* system calls */
int pageins;            /* pagein operation count */
                        /* pageouts are in pdpageouts below */
int swapins;            /* swapins */
int swapouts;           /* swapouts */
int pgswapin;           /* pages swapped in */
int pgswapout;          /* pages swapped out */
int forks;              /* forks */
int forks_ppwait;       /* forks where parent waits */
int forks_sharevm;      /* forks where vmspace is shared */

/* fault subcounters */
int fltnoram;   /* number of times fault was out of ram */
int fltnoanon;  /* number of times fault was out of anons */
int fltpgwait;  /* number of times fault had to wait on a page */
int fltpgrele;  /* number of times fault found a released page */
int fltrelck;   /* number of times fault relock called */
int fltrelckok; /* number of times fault relock is a success */
int fltanget;   /* number of times fault gets anon page */
int fltanretry; /* number of times fault retries an anon get */
int fltamcopy;  /* number of times fault clears "needs copy" */
int fltnamap;   /* number of times fault maps a neighbor anon page */
int fltnomap;   /* number of times fault maps a neighbor obj page */
int fltlget;    /* number of times fault does a locked pgo_get */
int fltget;     /* number of times fault does an unlocked get */
int flt_anon;   /* number of times fault anon (case 1a) */
int flt_acow;   /* number of times fault anon cow (case 1b) */
int flt_obj;    /* number of times fault is on object page (2a) */
int flt_prcopy; /* number of times fault promotes with copy (2b) */
int flt_przero; /* number of times fault promotes with zerofill (2b) */

/* daemon counters */
int pdwoke;     /* number of times daemon woke up */
int pdrevs;     /* number of times daemon rev'd clock hand */
int pdswout;    /* number of times daemon called for swapout */
int pdfreed;    /* number of pages daemon freed since boot */
int pdscans;    /* number of pages daemon scanned since boot */
int pdanscan;   /* number of anonymous pages scanned by daemon */
int pdobscan;   /* number of object pages scanned by daemon */
int pdreact;    /* number of pages daemon reactivated since boot */
int pdbusy;     /* number of times daemon found a busy page */
int pdpageouts; /* number of times daemon started a pageout */
int pdpending;  /* number of times daemon got a pending pageout */
int pddeact;    /* number of pages daemon deactivates */
.Ed
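.Pp
For example, a userland program can take a snapshot of these counters
through the
.Xr sysctl 3
interface using the
.Dv CTL_VM
and
.Dv VM_UVMEXP
MIB names.
The following is a minimal sketch; most error handling is elided:
.Bd -literal
#include <sys/param.h>
#include <sys/sysctl.h>
#include <uvm/uvm_extern.h>
#include <stdio.h>

int
main(void)
{
	struct uvmexp uvmexp;
	int mib[2] = { CTL_VM, VM_UVMEXP };
	size_t len = sizeof(uvmexp);

	/* Copy the current uvmexp structure out of the kernel. */
	if (sysctl(mib, 2, &uvmexp, &len, NULL, 0) == -1)
		return 1;
	printf("pagesize %d, free %d, active %d, inactive %d\en",
	    uvmexp.pagesize, uvmexp.free, uvmexp.active,
	    uvmexp.inactive);
	return 0;
}
.Ed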
.Sh NOTES
.Fn uvm_chgkprot
is only available if the kernel has been compiled with options
.Dv KGDB .
.Pp
All structures and types whose names begin with
.Dq vm_
will be renamed to
.Dq uvm_ .
.Sh SEE ALSO
.Xr swapctl 2 ,
.Xr getloadavg 3 ,
.Xr kvm 3 ,
.Xr sysctl 3 ,
.Xr ddb 4 ,
.Xr options 4 ,
.Xr pmap 9
.Sh HISTORY
UVM is a new VM system developed at Washington University in St. Louis
(Missouri).
UVM's roots lie partly in the Mach-based
.Bx 4.4
VM system, the
.Fx
VM system, and the SunOS 4 VM system.
UVM's basic structure is based on the
.Bx 4.4
VM system.
UVM's new anonymous memory system is based on the
anonymous memory system found in the SunOS 4 VM (as described in papers
published by Sun Microsystems, Inc.).
UVM also includes a number of features new to
.Bx ,
including page loanout, map entry passing, simplified
copy-on-write, and clustered anonymous memory pageout.
UVM is also further documented in an August 1998 dissertation by
Charles D. Cranor.
.Pp
UVM appeared in
.Nx 1.4 .
.Sh AUTHORS
Charles D. Cranor
.Aq chuck@ccrc.wustl.edu
designed and implemented UVM.
.Pp
Matthew Green
.Aq mrg@eterna.com.au
wrote the swap-space management code and handled the logistical issues
involved with merging UVM into the
.Nx
source tree.
.Pp
Chuck Silvers
.Aq chuq@chuq.com
implemented the aobj pager, thus allowing UVM to support System V shared
memory and process swapping.
He also designed and implemented the UBC part of UVM, which uses UVM pages
to cache vnode data rather than the traditional buffer cache buffers.