1.\" $NetBSD: uvm.9,v 1.90 2008/05/29 14:51:25 mrg Exp $ 2.\" 3.\" Copyright (c) 1998 Matthew R. Green 4.\" All rights reserved. 5.\" 6.\" Redistribution and use in source and binary forms, with or without 7.\" modification, are permitted provided that the following conditions 8.\" are met: 9.\" 1. Redistributions of source code must retain the above copyright 10.\" notice, this list of conditions and the following disclaimer. 11.\" 2. Redistributions in binary form must reproduce the above copyright 12.\" notice, this list of conditions and the following disclaimer in the 13.\" documentation and/or other materials provided with the distribution. 14.\" 15.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR 16.\" IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES 17.\" OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 18.\" IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, 19.\" INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, 20.\" BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; 21.\" LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED 22.\" AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 23.\" OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY 24.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF 25.\" SUCH DAMAGE. 26.\" 27.Dd October 15, 2007 28.Dt UVM 9 29.Os 30.Sh NAME 31.Nm uvm 32.Nd virtual memory system external interface 33.Sh SYNOPSIS 34.In sys/param.h 35.In uvm/uvm.h 36.Sh DESCRIPTION 37The UVM virtual memory system manages access to the computer's memory 38resources. 39User processes and the kernel access these resources through 40UVM's external interface. 41UVM's external interface includes functions that: 42.Pp 43.Bl -hyphen -compact 44.It 45initialize UVM sub-systems 46.It 47manage virtual address spaces 48.It 49resolve page faults 50.It 51memory map files and devices 52.It 53perform uio-based I/O to virtual memory 54.It 55allocate and free kernel virtual memory 56.It 57allocate and free physical memory 58.El 59.Pp 60In addition to exporting these services, UVM has two kernel-level processes: 61pagedaemon and swapper. 62The pagedaemon process sleeps until physical memory becomes scarce. 63When that happens, pagedaemon is awoken. 64It scans physical memory, paging out and freeing memory that has not 65been recently used. 66The swapper process swaps in runnable processes that are currently swapped 67out, if there is room. 68.Pp 69There are also several miscellaneous functions. 70.Sh INITIALIZATION 71.Ft void 72.br 73.Fn uvm_init "void" ; 74.Pp 75.Ft void 76.br 77.Fn uvm_init_limits "struct lwp *l" ; 78.Pp 79.Ft void 80.br 81.Fn uvm_setpagesize "void" ; 82.Pp 83.Ft void 84.br 85.Fn uvm_swap_init "void" ; 86.Pp 87.Fn uvm_init 88sets up the UVM system at system boot time, after the 89console has been setup. 90It initializes global state, the page, map, kernel virtual memory state, 91machine-dependent physical map, kernel memory allocator, 92pager and anonymous memory sub-systems, and then enables 93paging of kernel objects. 94.Pp 95.Fn uvm_init_limits 96initializes process limits for the named process. 97This is for use by the system startup for process zero, before any 98other processes are created. 99.Pp 100.Fn uvm_setpagesize 101initializes the uvmexp members pagesize (if not already done by 102machine-dependent code), pageshift and pagemask. 
It should be called by machine-dependent code early in the
.Fn pmap_init
call (see
.Xr pmap 9 ) .
.Pp
.Fn uvm_swap_init
initializes the swap sub-system.
.Sh VIRTUAL ADDRESS SPACE MANAGEMENT
.Ft int
.br
.Fn uvm_map "struct vm_map *map" "vaddr_t *startp" "vsize_t size" "struct uvm_object *uobj" "voff_t uoffset" "vsize_t align" "uvm_flag_t flags" ;
.Pp
.Ft void
.br
.Fn uvm_unmap "struct vm_map *map" "vaddr_t start" "vaddr_t end" ;
.Pp
.Ft int
.br
.Fn uvm_map_pageable "struct vm_map *map" "vaddr_t start" "vaddr_t end" "bool new_pageable" "int lockflags" ;
.Pp
.Ft bool
.br
.Fn uvm_map_checkprot "struct vm_map *map" "vaddr_t start" "vaddr_t end" "vm_prot_t protection" ;
.Pp
.Ft int
.br
.Fn uvm_map_protect "struct vm_map *map" "vaddr_t start" "vaddr_t end" "vm_prot_t new_prot" "bool set_max" ;
.Pp
.Ft int
.br
.Fn uvm_deallocate "struct vm_map *map" "vaddr_t start" "vsize_t size" ;
.Pp
.Ft struct vmspace *
.br
.Fn uvmspace_alloc "vaddr_t min" "vaddr_t max" "int pageable" ;
.Pp
.Ft void
.br
.Fn uvmspace_exec "struct lwp *l" "vaddr_t start" "vaddr_t end" ;
.Pp
.Ft struct vmspace *
.br
.Fn uvmspace_fork "struct vmspace *vm1" ;
.Pp
.Ft void
.br
.Fn uvmspace_free "struct vmspace *vm" ;
.Pp
.Ft void
.br
.Fn uvmspace_share "struct proc *p1" "struct proc *p2" ;
.Pp
.Ft void
.br
.Fn uvmspace_unshare "struct lwp *l" ;
.Pp
.Ft bool
.br
.Fn uvm_uarea_alloc "vaddr_t *uaddrp" ;
.Pp
.Ft void
.br
.Fn uvm_uarea_free "vaddr_t uaddr" ;
.Pp
.Fn uvm_map
establishes a valid mapping in map
.Fa map ,
which must be unlocked.
The new mapping has size
.Fa size ,
which must be a multiple of
.Dv PAGE_SIZE .
The
.Fa uobj
and
.Fa uoffset
arguments can have four meanings.
When
.Fa uobj
is
.Dv NULL
and
.Fa uoffset
is
.Dv UVM_UNKNOWN_OFFSET ,
.Fn uvm_map
does not use the machine-dependent
.Dv PMAP_PREFER
function.
If
.Fa uoffset
is any other value, it is used as the hint to
.Dv PMAP_PREFER .
When
.Fa uobj
is not
.Dv NULL
and
.Fa uoffset
is
.Dv UVM_UNKNOWN_OFFSET ,
.Fn uvm_map
finds the offset based upon the virtual address, passed as
.Fa startp .
If
.Fa uoffset
is any other value, a normal mapping is performed at that offset.
The start address of the map will be returned in
.Fa startp .
.Pp
.Fa align
specifies the alignment of the mapping unless
.Dv UVM_FLAG_FIXED
is specified in
.Fa flags .
.Fa align
must be a power of 2.
.Pp
The
.Fa flags
passed to
.Fn uvm_map
are typically created using the
.Fn UVM_MAPFLAG "vm_prot_t prot" "vm_prot_t maxprot" "vm_inherit_t inh" "int advice" "int flags"
macro, which uses the following values.
The values that
.Fa prot
and
.Fa maxprot
can take are:
.Bd -literal
#define UVM_PROT_MASK   0x07    /* protection mask */
#define UVM_PROT_NONE   0x00    /* protection none */
#define UVM_PROT_ALL    0x07    /* everything */
#define UVM_PROT_READ   0x01    /* read */
#define UVM_PROT_WRITE  0x02    /* write */
#define UVM_PROT_EXEC   0x04    /* exec */
#define UVM_PROT_R      0x01    /* read */
#define UVM_PROT_W      0x02    /* write */
#define UVM_PROT_RW     0x03    /* read-write */
#define UVM_PROT_X      0x04    /* exec */
#define UVM_PROT_RX     0x05    /* read-exec */
#define UVM_PROT_WX     0x06    /* write-exec */
#define UVM_PROT_RWX    0x07    /* read-write-exec */
.Ed
.Pp
The values that
.Fa inh
can take are:
.Bd -literal
#define UVM_INH_MASK    0x30    /* inherit mask */
#define UVM_INH_SHARE   0x00    /* "share" */
#define UVM_INH_COPY    0x10    /* "copy" */
#define UVM_INH_NONE    0x20    /* "none" */
#define UVM_INH_DONATE  0x30    /* "donate" \*[Lt]\*[Lt] not used */
.Ed
.Pp
The values that
.Fa advice
can take are:
.Bd -literal
#define UVM_ADV_NORMAL     0x0  /* 'normal' */
#define UVM_ADV_RANDOM     0x1  /* 'random' */
#define UVM_ADV_SEQUENTIAL 0x2  /* 'sequential' */
#define UVM_ADV_MASK       0x7  /* mask */
.Ed
.Pp
The values that
.Fa flags
can take are:
.Bd -literal
#define UVM_FLAG_FIXED   0x010000 /* find space */
#define UVM_FLAG_OVERLAY 0x020000 /* establish overlay */
#define UVM_FLAG_NOMERGE 0x040000 /* don't merge map entries */
#define UVM_FLAG_COPYONW 0x080000 /* set copy_on_write flag */
#define UVM_FLAG_AMAPPAD 0x100000 /* for bss: pad amap to reduce malloc() */
#define UVM_FLAG_TRYLOCK 0x200000 /* fail if we can not lock map */
.Ed
.Pp
The
.Dv UVM_MAPFLAG
macro arguments can be combined with a bitwise OR operator.
There are several special purpose macros for checking protection
combinations, e.g., the
.Dv UVM_PROT_WX
macro.
There are also some additional macros to extract bits from the flags.
The
.Dv UVM_PROTECTION ,
.Dv UVM_INHERIT ,
.Dv UVM_MAXPROTECTION
and
.Dv UVM_ADVICE
macros return the protection, inheritance, maximum protection and advice,
respectively.
.Fn uvm_map
returns a standard UVM return value.
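.Pp
The following sketch is illustrative only and is not taken from the kernel
sources; it maps one page of anonymous memory into the kernel map, letting
.Fn uvm_map
choose the virtual address.
The backing object is created with
.Fn uao_create ,
which is described in the
.Sx MISCELLANEOUS FUNCTIONS
section below.
.Bd -literal -offset indent
struct uvm_object *uobj;
vaddr_t va;
int error;

/* anonymous UVM object to back the mapping */
uobj = uao_create(PAGE_SIZE, 0);

/* hint only; uvm_map() returns the chosen address in va */
va = vm_map_min(kernel_map);
error = uvm_map(kernel_map, &va, PAGE_SIZE, uobj, 0, 0,
    UVM_MAPFLAG(UVM_PROT_RW, UVM_PROT_RW, UVM_INH_NONE,
    UVM_ADV_RANDOM, 0));
if (error) {
        uao_detach(uobj);
        return error;
}
.Ed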
.Pp
.Fn uvm_unmap
removes a valid mapping,
from
.Fa start
to
.Fa end ,
in map
.Fa map ,
which must be unlocked.
.Pp
.Fn uvm_map_pageable
changes the pageability of the pages in the range from
.Fa start
to
.Fa end
in map
.Fa map
to
.Fa new_pageable .
.Fn uvm_map_pageable
returns a standard UVM return value.
.Pp
.Fn uvm_map_checkprot
checks the protection of the range from
.Fa start
to
.Fa end
in map
.Fa map
against
.Fa protection .
This returns either
.Dv true
or
.Dv false .
.Pp
.Fn uvm_map_protect
changes the protection of the range from
.Fa start
to
.Fa end
in map
.Fa map
to
.Fa new_prot ,
also setting the maximum protection of the region to
.Fa new_prot
if
.Fa set_max
is non-zero.
This function returns a standard UVM return value.
.Pp
.Fn uvm_deallocate
deallocates kernel memory in map
.Fa map
from address
.Fa start
to
.Fa start + size .
.Pp
.Fn uvmspace_alloc
allocates and returns a new address space, with ranges from
.Fa min
to
.Fa max ,
setting the pageability of the address space to
.Fa pageable .
.Pp
.Fn uvmspace_exec
either reuses the address space of lwp
.Fa l
if there are no other references to it, or creates
a new one with
.Fn uvmspace_alloc .
The range of valid addresses in the address space is reset to
.Fa start
through
.Fa end .
.Pp
.Fn uvmspace_fork
creates and returns a new address space based upon the
.Fa vm1
address space, typically used when allocating an address space for a
child process.
.Pp
.Fn uvmspace_free
lowers the reference count on the address space
.Fa vm ,
freeing the data structures if there are no other references.
.Pp
.Fn uvmspace_share
causes process
.Fa p2
to share the address space of
.Fa p1 .
.Pp
.Fn uvmspace_unshare
ensures that lwp
.Fa l
has its own, unshared address space, by creating a new one if
necessary by calling
.Fn uvmspace_fork .
.Pp
.Fn uvm_uarea_alloc
allocates virtual space for a u-area (i.e., a kernel stack) and stores
its virtual address in
.Fa *uaddrp .
The return value is
.Dv true
if the u-area is already backed by wired physical memory, otherwise
.Dv false .
.Pp
.Fn uvm_uarea_free
frees a u-area allocated with
.Fn uvm_uarea_alloc ,
freeing both the virtual space and any physical pages that may later have
been allocated to back that virtual space.
.Sh PAGE FAULT HANDLING
.Ft int
.br
.Fn uvm_fault "struct vm_map *orig_map" "vaddr_t vaddr" "vm_prot_t access_type" ;
.Pp
.Fn uvm_fault
is the main entry point for faults.
It takes
.Fa orig_map
as the map the fault originated in,
.Fa vaddr
as the offset into the map at which the fault occurred, and
.Fa access_type
describing the type of access requested.
.Fn uvm_fault
returns a standard UVM return value.
.Sh MEMORY MAPPING FILES AND DEVICES
.Ft void
.br
.Fn uvm_vnp_setsize "struct vnode *vp" "voff_t newsize" ;
.Pp
.Ft void *
.br
.Fn ubc_alloc "struct uvm_object *uobj" "voff_t offset" "vsize_t *lenp" "int advice" "int flags" ;
.Pp
.Ft void
.br
.Fn ubc_release "void *va" "int flags" ;
.Pp
.Ft int
.br
.Fn ubc_uiomove "struct uvm_object *uobj" "struct uio *uio" "vsize_t todo" "int advice" "int flags" ;
.Pp
.Fn uvm_vnp_setsize
sets the size of vnode
.Fa vp
to
.Fa newsize .
The caller must hold a reference to the vnode.
If the vnode shrinks, pages no longer used are discarded.
.Pp
.Fn ubc_alloc
creates a kernel mapping of
.Fa uobj
starting at offset
.Fa offset .
The desired length of the mapping is pointed to by
.Fa lenp ,
but the actual mapping may be smaller than this.
.Fa lenp
is updated to contain the actual length mapped.
.Fa advice
is the access pattern hint, which must be one of
.Pp
.Bl -tag -offset indent -width "UVM_ADV_SEQUENTIAL" -compact
.It UVM_ADV_NORMAL
No hint
.It UVM_ADV_RANDOM
Random access hint
.It UVM_ADV_SEQUENTIAL
Sequential access hint (from lower offset to higher offset)
.El
.Pp
The possible
.Fa flags
are
.Pp
.Bl -tag -offset indent -width "UVM_ADV_SEQUENTIAL" -compact
.It UBC_READ
Mapping will be accessed for read.
.It UBC_WRITE
Mapping will be accessed for write.
.It UBC_FAULTBUSY
Fault in the window's pages during the mapping operation.
Only makes sense for writes.
.El
.Pp
Currently,
.Fa uobj
must actually be a vnode object.
Once the mapping is created, it must be accessed only by methods that can
handle faults, such as
.Fn uiomove
or
.Fn kcopy .
Page faults on the mapping will result in the vnode's
.Fn VOP_GETPAGES
method being called to resolve the fault.
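.Pp
The following sketch is illustrative only and is not taken from the kernel
sources; it shows how a file system read path might use a window together
with
.Fn ubc_release ,
described below.
The vnode
.Va vp ,
the transfer description
.Va uio ,
and the current file size
.Va filesize
are assumed to have been set up by the caller.
.Bd -literal -offset indent
void *win;
vsize_t bytes;
int error;

/* map a window over the part of the file being read */
bytes = MIN(uio->uio_resid, filesize - uio->uio_offset);
win = ubc_alloc(&vp->v_uobj, uio->uio_offset, &bytes,
    UVM_ADV_SEQUENTIAL, UBC_READ);

/* copy through the window; faults are resolved via VOP_GETPAGES */
error = uiomove(win, bytes, uio);
ubc_release(win, 0);
.Ed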
.Pp
.Fn ubc_release
frees the mapping at
.Fa va
for reuse.
The mapping may be cached to speed future accesses to the same region
of the object.
The flags can be any of
.Pp
.Bl -tag -offset indent -width "UVM_ADV_SEQUENTIAL" -compact
.It UBC_UNMAP
Do not cache mapping.
.El
.Pp
.Fn ubc_uiomove
allocates a UBC memory window, performs I/O on it, and unmaps the window.
The
.Fa advice
parameter takes the same values as the respective parameter in
.Fn ubc_alloc ,
and the
.Fa flags
parameter takes the same arguments as
.Fn ubc_alloc
and
.Fn ubc_release .
Additionally, the flag
.Dv UBC_PARTIALOK
can be provided to indicate that it is acceptable to return if an error
occurs mid-transfer.
.Sh VIRTUAL MEMORY I/O
.Ft int
.br
.Fn uvm_io "struct vm_map *map" "struct uio *uio" ;
.Pp
.Fn uvm_io
performs the I/O described in
.Fa uio
on the memory described in
.Fa map .
.Sh ALLOCATION OF KERNEL MEMORY
.Ft vaddr_t
.br
.Fn uvm_km_alloc "struct vm_map *map" "vsize_t size" "vsize_t align" "uvm_flag_t flags" ;
.Pp
.Ft void
.br
.Fn uvm_km_free "struct vm_map *map" "vaddr_t addr" "vsize_t size" "uvm_flag_t flags" ;
.Pp
.Ft struct vm_map *
.br
.Fn uvm_km_suballoc "struct vm_map *map" "vaddr_t *min" "vaddr_t *max" "vsize_t size" "bool pageable" "bool fixed" "struct vm_map *submap" ;
.Pp
.Fn uvm_km_alloc
allocates
.Fa size
bytes of kernel memory in map
.Fa map .
The first address of the allocated memory range will be aligned according to the
.Fa align
argument
.Pq specify 0 if no alignment is necessary .
The alignment must be a multiple of the page size.
The
.Fa flags
argument is a bitwise inclusive OR of the allocation type and operation flags.
.Pp
The allocation type should be one of:
.Bl -tag -width UVM_KMF_PAGEABLE
.It UVM_KMF_WIRED
Wired memory.
.It UVM_KMF_PAGEABLE
Demand-paged zero-filled memory.
.It UVM_KMF_VAONLY
Virtual address only.
No physical pages are mapped in the allocated region.
If necessary, it is the caller's responsibility to enter page mappings.
It is also the caller's responsibility to clean up the mappings before freeing
the address range.
.El
.Pp
The following operation flags are available:
.Bl -tag -width UVM_KMF_PAGEABLE
.It UVM_KMF_CANFAIL
Can fail even if
.Dv UVM_KMF_NOWAIT
is not specified and
.Dv UVM_KMF_WAITVA
is specified.
.It UVM_KMF_ZERO
Request zero-filled memory.
Only supported for
.Dv UVM_KMF_WIRED .
Should not be used with other types.
.It UVM_KMF_TRYLOCK
Fail if the map cannot be locked.
.It UVM_KMF_NOWAIT
Fail immediately if no memory is available.
.It UVM_KMF_WAITVA
Sleep to wait for the virtual address resources if needed.
.El
.Pp
(If neither
.Dv UVM_KMF_NOWAIT
nor
.Dv UVM_KMF_CANFAIL
are specified and
.Dv UVM_KMF_WAITVA
is specified,
.Fn uvm_km_alloc
will never fail, but rather sleep indefinitely until the allocation succeeds.)
.Pp
Pageability of the pages allocated with
.Dv UVM_KMF_PAGEABLE
can be changed by
.Fn uvm_map_pageable .
In that case, the entire range must be changed atomically;
changing a part of the range is not supported.
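.Pp
A minimal sketch, illustrative only, of allocating and later releasing a
wired, zero-filled buffer with
.Fn uvm_km_alloc
and
.Fn uvm_km_free
(described below):
.Bd -literal -offset indent
vaddr_t va;

/* one page of wired, zero-filled kernel memory; may fail */
va = uvm_km_alloc(kernel_map, PAGE_SIZE, 0,
    UVM_KMF_WIRED | UVM_KMF_ZERO | UVM_KMF_CANFAIL);
if (va == 0)
        return ENOMEM;

/* ... use the buffer at va ... */

/* the flags must repeat the allocation type */
uvm_km_free(kernel_map, va, PAGE_SIZE, UVM_KMF_WIRED);
.Ed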
.Pp
.Fn uvm_km_free
frees the memory range allocated by
.Fn uvm_km_alloc .
.Fa addr
must be an address returned by
.Fn uvm_km_alloc .
.Fa map
and
.Fa size
must be the same as the ones used for the corresponding
.Fn uvm_km_alloc .
.Fa flags
must be the allocation type used for the corresponding
.Fn uvm_km_alloc .
.Pp
.Fn uvm_km_free
is the only way to free memory ranges allocated by
.Fn uvm_km_alloc .
.Fn uvm_unmap
must not be used.
.Pp
.Fn uvm_km_suballoc
allocates a submap of
.Fa map ,
creating a new map if
.Fa submap
is
.Dv NULL .
The addresses of the submap can be specified exactly by setting the
.Fa fixed
argument to non-zero, which causes the
.Fa min
argument to specify the beginning address of the submap.
If
.Fa fixed
is zero, a range of size
.Fa size
will be allocated anywhere in
.Fa map ,
and the start and end addresses are returned in
.Fa min
and
.Fa max .
If
.Fa pageable
is non-zero, entries in the map may be paged out.
.Sh ALLOCATION OF PHYSICAL MEMORY
.Ft struct vm_page *
.br
.Fn uvm_pagealloc "struct uvm_object *uobj" "voff_t off" "struct vm_anon *anon" "int flags" ;
.Pp
.Ft void
.br
.Fn uvm_pagerealloc "struct vm_page *pg" "struct uvm_object *newobj" "voff_t newoff" ;
.Pp
.Ft void
.br
.Fn uvm_pagefree "struct vm_page *pg" ;
.Pp
.Ft int
.br
.Fn uvm_pglistalloc "psize_t size" "paddr_t low" "paddr_t high" "paddr_t alignment" "paddr_t boundary" "struct pglist *rlist" "int nsegs" "int waitok" ;
.Pp
.Ft void
.br
.Fn uvm_pglistfree "struct pglist *list" ;
.Pp
.Ft void
.br
.Fn uvm_page_physload "vaddr_t start" "vaddr_t end" "vaddr_t avail_start" "vaddr_t avail_end" "int free_list" ;
.Pp
.Fn uvm_pagealloc
allocates a page of memory at offset
.Fa off
in either the object
.Fa uobj
or the anonymous memory
.Fa anon ,
which must be locked by the caller.
Only one of
.Fa uobj
and
.Fa anon
can be other than
.Dv NULL .
Returns
.Dv NULL
when no page can be found.
The flags can be any of
.Bd -literal
#define UVM_PGA_USERESERVE      0x0001  /* ok to use reserve pages */
#define UVM_PGA_ZERO            0x0002  /* returned page must be zero'd */
.Ed
.Pp
.Dv UVM_PGA_USERESERVE
means to allocate a page even if that will result in the number of free pages
being lower than
.Dv uvmexp.reserve_pagedaemon
(if the current thread is the pagedaemon) or
.Dv uvmexp.reserve_kernel
(if the current thread is not the pagedaemon).
.Dv UVM_PGA_ZERO
causes the returned page to be filled with zeroes, either by allocating it
from a pool of pre-zeroed pages or by zeroing it in-line as necessary.
.Pp
.Fn uvm_pagerealloc
reallocates page
.Fa pg
to a new object
.Fa newobj ,
at a new offset
.Fa newoff .
.Pp
.Fn uvm_pagefree
frees the physical page
.Fa pg .
If the content of the page is known to be zero-filled,
the caller should set
.Dv PG_ZERO
in pg-\*[Gt]flags so that the page allocator will use
the page to serve future
.Dv UVM_PGA_ZERO
requests efficiently.
.Pp
.Fn uvm_pglistalloc
allocates a list of pages of size
.Fa size
bytes, subject to various constraints.
.Fa low
and
.Fa high
describe the lowest and highest addresses acceptable for the list.
If
.Fa alignment
is non-zero, it describes the required alignment of the list, in
power-of-two notation.
If
.Fa boundary
is non-zero, no segment of the list may cross this power-of-two
boundary, relative to zero.
.Fa nsegs
is the maximum number of physically contiguous segments.
If
.Fa waitok
is non-zero, the function may sleep until enough memory is available.
(It also may give up in some situations, so a non-zero
.Fa waitok
does not imply that
.Fn uvm_pglistalloc
cannot return an error.)
The allocated memory is returned in the
.Fa rlist
list; the caller only has to provide the storage, and the list itself is
initialized by
.Fn uvm_pglistalloc .
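.Pp
A sketch of allocating a physically contiguous buffer, for example for a
DMA engine limited to the first 16 MB of memory; the constraints are
hypothetical and the list is released again with
.Fn uvm_pglistfree ,
described below:
.Bd -literal -offset indent
struct pglist plist;
struct vm_page *pg;
paddr_t pa;
int error;

/* one contiguous, page-aligned segment below 16 MB; may sleep */
error = uvm_pglistalloc(MAXPHYS, 0, 16 * 1024 * 1024,
    PAGE_SIZE, 0, &plist, 1, 1);
if (error)
        return error;

pg = TAILQ_FIRST(&plist);
pa = VM_PAGE_TO_PHYS(pg);       /* start of the physical segment */

/* ... program the device with pa ... */

uvm_pglistfree(&plist);
.Ed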
.Pp
.Fn uvm_pglistfree
frees the list of pages pointed to by
.Fa list .
If the content of a page is known to be zero-filled,
the caller should set
.Dv PG_ZERO
in pg-\*[Gt]flags so that the page allocator will use
the page to serve future
.Dv UVM_PGA_ZERO
requests efficiently.
.Pp
.Fn uvm_page_physload
loads physical memory segments into VM space on the specified
.Fa free_list .
It must be called at system boot time to set up physical memory
management pages.
The arguments describe the
.Fa start
and
.Fa end
of the physical addresses of the segment, and the available start and end
addresses of pages not already in use.
.\" XXX expand on "system boot time"!
.Sh PROCESSES
.Ft void
.br
.Fn uvm_pageout "void" ;
.Pp
.Ft void
.br
.Fn uvm_scheduler "void" ;
.Pp
.Ft void
.br
.Fn uvm_swapin "struct lwp *l" ;
.Pp
.Fn uvm_pageout
is the main loop for the page daemon.
.Pp
.Fn uvm_scheduler
is the main loop of process zero, which is to be called after the
system has finished starting other processes.
It handles the swapping in of runnable, swapped out processes in priority
order.
.Pp
.Fn uvm_swapin
swaps in the named lwp.
.Sh PAGE LOAN
.Ft int
.br
.Fn uvm_loan "struct vm_map *map" "vaddr_t start" "vsize_t len" "void *v" "int flags" ;
.Pp
.Ft void
.br
.Fn uvm_unloan "void *v" "int npages" "int flags" ;
.Pp
.Fn uvm_loan
loans pages in a map out to anons or to the kernel.
.Fa map
should be unlocked, and
.Fa start
and
.Fa len
should be multiples of
.Dv PAGE_SIZE .
The
.Fa flags
argument should be one of
.Bd -literal
#define UVM_LOAN_TOANON       0x01    /* loan to anons */
#define UVM_LOAN_TOPAGE       0x02    /* loan to kernel */
.Ed
.Pp
.Fa v
should be a pointer to an array of pointers to
.Li struct anon
or
.Li struct vm_page ,
as appropriate.
The caller has to allocate memory for the array and
ensure it is big enough to hold
.Fa len / PAGE_SIZE
pointers.
Returns 0 on success, or an appropriate error number otherwise.
Note that wired pages cannot be loaned out and
.Fn uvm_loan
will fail in that case.
.Pp
.Fn uvm_unloan
kills loans on pages or anons.
.Fa v
must point to the array of pointers initialized by a previous call to
.Fn uvm_loan .
.Fa npages
should match the number of pages allocated for the loan, which is also the
number of items in the array.
The
.Fa flags
argument should be one of
.Bd -literal
#define UVM_LOAN_TOANON       0x01    /* loan to anons */
#define UVM_LOAN_TOPAGE       0x02    /* loan to kernel */
.Ed
.Pp
and should match the flags used for the previous call to
.Fn uvm_loan .
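.Pp
A sketch, with assumptions, of loaning the pages backing a user buffer to
the kernel and returning them again; the process
.Va p ,
the page-aligned
.Va start
and
.Va len ,
and the use of
.Xr kmem 9
for the pointer array are assumptions made for the illustration:
.Bd -literal -offset indent
struct vm_page **pgpp;
int error, npages;

npages = len >> PAGE_SHIFT;
pgpp = kmem_alloc(npages * sizeof(*pgpp), KM_SLEEP);

/* loan the pages backing [start, start + len) to the kernel */
error = uvm_loan(&p->p_vmspace->vm_map, start, len,
    pgpp, UVM_LOAN_TOPAGE);
if (error == 0) {
        /* ... access the pages via the pgpp array ... */
        uvm_unloan(pgpp, npages, UVM_LOAN_TOPAGE);
}
kmem_free(pgpp, npages * sizeof(*pgpp));
.Ed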
.Sh MISCELLANEOUS FUNCTIONS
.Ft struct uvm_object *
.br
.Fn uao_create "vsize_t size" "int flags" ;
.Pp
.Ft void
.br
.Fn uao_detach "struct uvm_object *uobj" ;
.Pp
.Ft void
.br
.Fn uao_reference "struct uvm_object *uobj" ;
.Pp
.Ft void
.br
.Fn uvm_chgkprot "void *addr" "size_t len" "int rw" ;
.Pp
.Ft bool
.br
.Fn uvm_kernacc "void *addr" "size_t len" "int rw" ;
.Pp
.Ft int
.br
.Fn uvm_vslock "struct vmspace *vs" "void *addr" "size_t len" "vm_prot_t prot" ;
.Pp
.Ft void
.br
.Fn uvm_vsunlock "struct vmspace *vs" "void *addr" "size_t len" ;
.Pp
.Ft void
.br
.Fn uvm_meter "void" ;
.Pp
.Ft void
.br
.Fn uvm_fork "struct lwp *l1" "struct lwp *l2" "bool shared" ;
.Pp
.Ft int
.br
.Fn uvm_grow "struct proc *p" "vaddr_t sp" ;
.Pp
.Ft void
.br
.Fn uvn_findpages "struct uvm_object *uobj" "voff_t offset" "int *npagesp" "struct vm_page **pps" "int flags" ;
.Pp
.Ft void
.br
.Fn uvm_swap_stats "int cmd" "struct swapent *sep" "int sec" "register_t *retval" ;
.Pp
The
.Fn uao_create ,
.Fn uao_detach ,
and
.Fn uao_reference
functions operate on anonymous memory objects, such as those used to support
System V shared memory.
.Fn uao_create
returns an object of size
.Fa size
with flags:
.Bd -literal
#define UAO_FLAG_KERNOBJ        0x1     /* create kernel object */
#define UAO_FLAG_KERNSWAP       0x2     /* enable kernel swap */
.Ed
.Pp
which can only be used once each at system boot time.
.Fn uao_reference
creates an additional reference to the named anonymous memory object.
.Fn uao_detach
removes a reference from the named anonymous memory object, destroying
it if removing the last reference.
.Pp
.Fn uvm_chgkprot
changes the protection of kernel memory from
.Fa addr
to
.Fa addr + len
to the value of
.Fa rw .
This is primarily useful for debuggers, for setting breakpoints.
This function is only available with options
.Dv KGDB .
.Pp
.Fn uvm_kernacc
checks the access at address
.Fa addr
to
.Fa addr + len
for
.Fa rw
access in the kernel address space.
.Pp
.Fn uvm_vslock
and
.Fn uvm_vsunlock
control the wiring and unwiring of pages in the address space
.Fa vs ,
from
.Fa addr
to
.Fa addr + len .
These functions are normally used to wire memory for I/O.
.Pp
.Fn uvm_meter
calculates the load average and wakes up the swapper if necessary.
.Pp
.Fn uvm_fork
forks a virtual address space for the (old) lwp
.Fa l1
and the (new) lwp
.Fa l2 .
If the
.Fa shared
argument is true, l1 shares its address space with l2;
otherwise a new address space is created.
This function currently has no return value, and thus cannot fail.
In the future, this function will be changed to allow it to
fail in low memory conditions.
.Pp
.Fn uvm_grow
increases the stack segment of process
.Fa p
to include
.Fa sp .
.Pp
.Fn uvn_findpages
looks up or creates pages in
.Fa uobj
at offset
.Fa offset ,
marks them busy and returns them in the
.Fa pps
array.
Currently
.Fa uobj
must be a vnode object.
The number of pages requested is pointed to by
.Fa npagesp ,
and this value is updated with the actual number of pages returned.
The flags can be
.Bd -literal
#define UFP_ALL         0x00    /* return all pages requested */
#define UFP_NOWAIT      0x01    /* don't sleep */
#define UFP_NOALLOC     0x02    /* don't allocate new pages */
#define UFP_NOCACHE     0x04    /* don't return pages which already exist */
#define UFP_NORDONLY    0x08    /* don't return PG_READONLY pages */
.Ed
.Pp
.Dv UFP_ALL
is a pseudo-flag meaning all requested pages should be returned.
.Dv UFP_NOWAIT
means that we must not sleep.
.Dv UFP_NOALLOC
causes any pages which do not already exist to be skipped.
.Dv UFP_NOCACHE
causes any pages which do already exist to be skipped.
.Dv UFP_NORDONLY
causes any pages which are marked PG_READONLY to be skipped.
.Pp
.Fn uvm_swap_stats
implements the
.Dv SWAP_STATS
and
.Dv SWAP_OSTATS
operations of the
.Xr swapctl 2
system call.
.Fa cmd
is the requested command,
.Dv SWAP_STATS
or
.Dv SWAP_OSTATS .
The function will copy no more than
.Fa sec
entries into the array pointed to by
.Fa sep .
On return,
.Fa retval
holds the actual number of entries copied into the array.
.Sh SYSCTL
UVM provides support for the
.Dv CTL_VM
domain of the
.Xr sysctl 3
hierarchy.
It handles the
.Dv VM_LOADAVG ,
.Dv VM_METER ,
.Dv VM_UVMEXP ,
and
.Dv VM_UVMEXP2
nodes, which return the current load averages, the current VM totals,
the uvmexp structure, and a kernel version independent view of the
uvmexp structure, respectively.
It also exports a number of tunables that control how much VM space is
allowed to be consumed by various tasks.
The load averages are typically accessed from userland using the
.Xr getloadavg 3
function.
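.Pp
From userland, the kernel version independent statistics can be read with
.Xr sysctl 3 ;
the following minimal sketch (error handling abbreviated) queries the
.Dv VM_UVMEXP2
node:
.Bd -literal -offset indent
struct uvmexp_sysctl u;
int mib[2] = { CTL_VM, VM_UVMEXP2 };
size_t len = sizeof(u);

if (sysctl(mib, 2, &u, &len, NULL, 0) == -1)
        err(1, "sysctl");
printf("%" PRId64 " pages free\en", u.free);
.Ed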
.Pp
The uvmexp structure has all global state of the UVM system,
and has the following members:
.Bd -literal
/* vm_page constants */
int pagesize;     /* size of a page (PAGE_SIZE): must be power of 2 */
int pagemask;     /* page mask */
int pageshift;    /* page shift */

/* vm_page counters */
int npages;       /* number of pages we manage */
int free;         /* number of free pages */
int active;       /* number of active pages */
int inactive;     /* number of pages that we free'd but may want back */
int paging;       /* number of pages in the process of being paged out */
int wired;        /* number of wired pages */
int reserve_pagedaemon; /* number of pages reserved for pagedaemon */
int reserve_kernel;     /* number of pages reserved for kernel */

/* pageout params */
int freemin;      /* min number of free pages */
int freetarg;     /* target number of free pages */
int inactarg;     /* target number of inactive pages */
int wiredmax;     /* max number of wired pages */

/* swap */
int nswapdev;     /* number of configured swap devices in system */
int swpages;      /* number of PAGE_SIZE'ed swap pages */
int swpginuse;    /* number of swap pages in use */
int nswget;       /* number of times fault calls uvm_swap_get() */
int nanon;        /* number total of anon's in system */
int nfreeanon;    /* number of free anon's */

/* stat counters */
int faults;       /* page fault count */
int traps;        /* trap count */
int intrs;        /* interrupt count */
int swtch;        /* context switch count */
int softs;        /* software interrupt count */
int syscalls;     /* system calls */
int pageins;      /* pagein operation count */
                  /* pageouts are in pdpageouts below */
int swapins;      /* swapins */
int swapouts;     /* swapouts */
int pgswapin;     /* pages swapped in */
int pgswapout;    /* pages swapped out */
int forks;        /* forks */
int forks_ppwait; /* forks where parent waits */
int forks_sharevm; /* forks where vmspace is shared */

/* fault subcounters */
int fltnoram;     /* number of times fault was out of ram */
int fltnoanon;    /* number of times fault was out of anons */
int fltpgwait;    /* number of times fault had to wait on a page */
int fltpgrele;    /* number of times fault found a released page */
int fltrelck;     /* number of times fault relock called */
int fltrelckok;   /* number of times fault relock is a success */
int fltanget;     /* number of times fault gets anon page */
int fltanretry;   /* number of times fault retrys an anon get */
int fltamcopy;    /* number of times fault clears "needs copy" */
int fltnamap;     /* number of times fault maps a neighbor anon page */
int fltnomap;     /* number of times fault maps a neighbor obj page */
int fltlget;      /* number of times fault does a locked pgo_get */
int fltget;       /* number of times fault does an unlocked get */
int flt_anon;     /* number of times fault anon (case 1a) */
int flt_acow;     /* number of times fault anon cow (case 1b) */
int flt_obj;      /* number of times fault is on object page (2a) */
int flt_prcopy;   /* number of times fault promotes with copy (2b) */
int flt_przero;   /* number of times fault promotes with zerofill (2b) */

/* daemon counters */
int pdwoke;       /* number of times daemon woke up */
int pdrevs;       /* number of times daemon rev'd clock hand */
int pdswout;      /* number of times daemon called for swapout */
int pdfreed;      /* number of pages daemon freed since boot */
int pdscans;      /* number of pages daemon scanned since boot */
int pdanscan;     /* number of anonymous pages scanned by daemon */
int pdobscan;     /* number of object pages scanned by daemon */
int pdreact;      /* number of pages daemon reactivated since boot */
int pdbusy;       /* number of times daemon found a busy page */
int pdpageouts;   /* number of times daemon started a pageout */
int pdpending;    /* number of times daemon got a pending pageout */
int pddeact;      /* number of pages daemon deactivates */
.Ed
.Sh NOTES
.Fn uvm_chgkprot
is only available if the kernel has been compiled with options
.Dv KGDB .
.Pp
All structures and types whose names begin with
.Dq vm_
will be renamed to
.Dq uvm_ .
.Sh SEE ALSO
.Xr swapctl 2 ,
.Xr getloadavg 3 ,
.Xr kvm 3 ,
.Xr sysctl 3 ,
.Xr ddb 4 ,
.Xr options 4 ,
.Xr memoryallocators 9 ,
.Xr pmap 9
.Sh HISTORY
UVM is a new VM system developed at Washington University in St. Louis
(Missouri).
UVM's roots lie partly in the Mach-based
.Bx 4.4
VM system, the
.Fx
VM system, and the SunOS 4 VM system.
UVM's basic structure is based on the
.Bx 4.4
VM system.
UVM's new anonymous memory system is based on the
anonymous memory system found in the SunOS 4 VM (as described in papers
published by Sun Microsystems, Inc.).
UVM also includes a number of features new to
.Bx ,
including page loanout, map entry passing, simplified
copy-on-write, and clustered anonymous memory pageout.
UVM is also further documented in an August 1998 dissertation by
Charles D. Cranor.
.Pp
UVM appeared in
.Nx 1.4 .
.Sh AUTHORS
Charles D. Cranor
.Aq chuck@ccrc.wustl.edu
designed and implemented UVM.
.Pp
Matthew Green
.Aq mrg@eterna.com.au
wrote the swap-space management code and handled the logistical issues
involved with merging UVM into the
.Nx
source tree.
.Pp
Chuck Silvers
.Aq chuq@chuq.com
implemented the aobj pager, thus allowing UVM to support System V shared
memory and process swapping.
He also designed and implemented the UBC part of UVM, which uses UVM pages
to cache vnode data rather than the traditional buffer cache buffers.