.\" $NetBSD: uvm.9,v 1.60 2005/01/11 09:46:49 wiz Exp $
.\"
.\" Copyright (c) 1998 Matthew R. Green
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. The name of the author may not be used to endorse or promote products
.\"    derived from this software without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
.\" IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
.\" OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
.\" IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
.\" INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
.\" BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
.\" LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED
.\" AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
.\" OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.Dd January 9, 2005
.Dt UVM 9
.Os
.Sh NAME
.Nm uvm
.Nd virtual memory system external interface
.Sh SYNOPSIS
.In sys/param.h
.In uvm/uvm.h
.Sh DESCRIPTION
The UVM virtual memory system manages access to the computer's memory
resources.
User processes and the kernel access these resources through
UVM's external interface.
UVM's external interface includes functions that:
.Pp
.Bl -hyphen -compact
.It
initialise UVM sub-systems
.It
manage virtual address spaces
.It
resolve page faults
.It
memory map files and devices
.It
perform uio-based I/O to virtual memory
.It
allocate and free kernel virtual memory
.It
allocate and free physical memory
.El
.Pp
In addition to exporting these services, UVM has two kernel-level processes:
pagedaemon and swapper.
The pagedaemon process sleeps until physical memory becomes scarce.
When that happens, pagedaemon is awoken.
It scans physical memory, paging out and freeing memory that has not
been recently used.
The swapper process swaps in runnable processes that are currently swapped
out, if there is room.
.Pp
There are also several miscellaneous functions.
.Sh INITIALISATION
.Ft void
.br
.Fn uvm_init "void" ;
.Pp
.Ft void
.br
.Fn uvm_init_limits "struct proc *p" ;
.Pp
.Ft void
.br
.Fn uvm_setpagesize "void" ;
.Pp
.Ft void
.br
.Fn uvm_swap_init "void" ;
.Pp
.Fn uvm_init
sets up the UVM system at system boot time, after the
copyright has been printed.
It initialises global state, the page, map, kernel virtual memory state,
machine-dependent physical map, kernel memory allocator,
pager and anonymous memory sub-systems, and then enables
paging of kernel objects.
.Pp
.Fn uvm_init_limits
initialises process limits for the named process.
This is for use by the system startup for process zero, before any
other processes are created.
.Pp
.Fn uvm_setpagesize
initialises the uvmexp members pagesize (if not already done by
machine-dependent code), pageshift and pagemask.
It should be called by machine-dependent code early in the
.Fn pmap_init
call (see
.Xr pmap 9 ) .
.Pp
.Fn uvm_swap_init
initialises the swap sub-system.
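The relationship among pagesize, pagemask and pageshift that
uvm_setpagesize() establishes can be sketched in userland C.
This is an illustrative stand-in, not kernel code; the struct and
function names below are invented for the example.

```c
#include <assert.h>

/*
 * Userland sketch, not kernel code: for a power-of-two page size,
 * pagemask selects the byte offset within a page and pageshift
 * converts an address to a page number.
 */
struct pagesize_state {
	int pagesize;	/* size of a page, must be a power of 2 */
	int pagemask;	/* pagesize - 1 */
	int pageshift;	/* log2(pagesize) */
};

static void
setpagesize_sketch(struct pagesize_state *s, int pagesize)
{
	s->pagesize = pagesize;
	s->pagemask = pagesize - 1;
	s->pageshift = 0;
	while ((1 << s->pageshift) < pagesize)
		s->pageshift++;
}
```

For a 4096-byte page this yields pagemask 0xfff and pageshift 12,
so an address splits into (addr >> pageshift) and (addr & pagemask).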
.Sh VIRTUAL ADDRESS SPACE MANAGEMENT
.Ft int
.br
.Fn uvm_map "struct vm_map *map" "vaddr_t *startp" "vsize_t size" "struct uvm_object *uobj" "voff_t uoffset" "vsize_t align" "uvm_flag_t flags" ;
.Pp
.Ft int
.br
.Fn uvm_unmap "struct vm_map *map" "vaddr_t start" "vaddr_t end" ;
.Pp
.Ft int
.br
.Fn uvm_map_pageable "struct vm_map *map" "vaddr_t start" "vaddr_t end" "boolean_t new_pageable" "int lockflags" ;
.Pp
.Ft boolean_t
.br
.Fn uvm_map_checkprot "struct vm_map *map" "vaddr_t start" "vaddr_t end" "vm_prot_t protection" ;
.Pp
.Ft int
.br
.Fn uvm_map_protect "struct vm_map *map" "vaddr_t start" "vaddr_t end" "vm_prot_t new_prot" "boolean_t set_max" ;
.Pp
.Ft int
.br
.Fn uvm_deallocate "struct vm_map *map" "vaddr_t start" "vsize_t size" ;
.Pp
.Ft struct vmspace *
.br
.Fn uvmspace_alloc "vaddr_t min" "vaddr_t max" "int pageable" ;
.Pp
.Ft void
.br
.Fn uvmspace_exec "struct proc *p" "vaddr_t start" "vaddr_t end" ;
.Pp
.Ft struct vmspace *
.br
.Fn uvmspace_fork "struct vmspace *vm" ;
.Pp
.Ft void
.br
.Fn uvmspace_free "struct vmspace *vm1" ;
.Pp
.Ft void
.br
.Fn uvmspace_share "struct proc *p1" "struct proc *p2" ;
.Pp
.Ft void
.br
.Fn uvmspace_unshare "struct proc *p" ;
.Pp
.Ft boolean_t
.br
.Fn uvm_uarea_alloc "vaddr_t *uaddrp" ;
.Pp
.Ft void
.br
.Fn uvm_uarea_free "vaddr_t uaddr" ;
.Pp
.Fn uvm_map
establishes a valid mapping in map
.Fa map ,
which must be unlocked.
The new mapping has size
.Fa size ,
which must be a multiple of
.Dv PAGE_SIZE .
The
.Fa uobj
and
.Fa uoffset
arguments can have four meanings.
When
.Fa uobj
is
.Dv NULL
and
.Fa uoffset
is
.Dv UVM_UNKNOWN_OFFSET ,
.Fn uvm_map
does not use the machine-dependent
.Dv PMAP_PREFER
function.
If
.Fa uoffset
is any other value, it is used as the hint to
.Dv PMAP_PREFER .
When
.Fa uobj
is not
.Dv NULL
and
.Fa uoffset
is
.Dv UVM_UNKNOWN_OFFSET ,
.Fn uvm_map
finds the offset based upon the virtual address, passed as
.Fa startp .
If
.Fa uoffset
is any other value, a normal mapping is done at this offset.
The start address of the map will be returned in
.Fa startp .
.Pp
.Fa align
specifies the alignment of the mapping unless
.Dv UVM_FLAG_FIXED
is specified in
.Fa flags .
.Fa align
must be a power of 2.
.Pp
.Fa flags
passed to
.Fn uvm_map
are typically created using the
.Fn UVM_MAPFLAG "vm_prot_t prot" "vm_prot_t maxprot" "vm_inherit_t inh" "int advice" "int flags"
macro, which uses the following values.
The values that
.Fa prot
and
.Fa maxprot
can take are:
.Bd -literal
#define UVM_PROT_MASK	0x07	/* protection mask */
#define UVM_PROT_NONE	0x00	/* protection none */
#define UVM_PROT_ALL	0x07	/* everything */
#define UVM_PROT_READ	0x01	/* read */
#define UVM_PROT_WRITE	0x02	/* write */
#define UVM_PROT_EXEC	0x04	/* exec */
#define UVM_PROT_R	0x01	/* read */
#define UVM_PROT_W	0x02	/* write */
#define UVM_PROT_RW	0x03	/* read-write */
#define UVM_PROT_X	0x04	/* exec */
#define UVM_PROT_RX	0x05	/* read-exec */
#define UVM_PROT_WX	0x06	/* write-exec */
#define UVM_PROT_RWX	0x07	/* read-write-exec */
.Ed
.Pp
The values that
.Fa inh
can take are:
.Bd -literal
#define UVM_INH_MASK	0x30	/* inherit mask */
#define UVM_INH_SHARE	0x00	/* "share" */
#define UVM_INH_COPY	0x10	/* "copy" */
#define UVM_INH_NONE	0x20	/* "none" */
#define UVM_INH_DONATE	0x30	/* "donate" \*[Lt]\*[Lt] not used */
.Ed
.Pp
The values that
.Fa advice
can take are:
.Bd -literal
#define UVM_ADV_NORMAL	0x0	/* 'normal' */
#define UVM_ADV_RANDOM	0x1	/* 'random' */
#define UVM_ADV_SEQUENTIAL	0x2	/* 'sequential' */
#define UVM_ADV_MASK	0x7	/* mask */
.Ed
.Pp
The values that
.Fa flags
can take are:
.Bd -literal
#define UVM_FLAG_FIXED	0x010000 /* find space */
#define UVM_FLAG_OVERLAY	0x020000 /* establish overlay */
#define UVM_FLAG_NOMERGE	0x040000 /* don't merge map entries */
#define UVM_FLAG_COPYONW	0x080000 /* set copy_on_write flag */
#define UVM_FLAG_AMAPPAD	0x100000 /* for bss: pad amap to reduce malloc() */
#define UVM_FLAG_TRYLOCK	0x200000 /* fail if we can not lock map */
.Ed
.Pp
The
.Dv UVM_MAPFLAG
macro arguments can be combined with an or operator.
There are several special purpose macros for checking protection
combinations, e.g., the
.Dv UVM_PROT_WX
macro.
There are also some additional macros to extract bits from the flags.
The
.Dv UVM_PROTECTION ,
.Dv UVM_INHERIT ,
.Dv UVM_MAXPROTECTION
and
.Dv UVM_ADVICE
macros return the protection, inheritance, maximum protection and advice,
respectively.
.Fn uvm_map
returns a standard UVM return value.
.Pp
.Fn uvm_unmap
removes a valid mapping,
from
.Fa start
to
.Fa end ,
in map
.Fa map ,
which must be unlocked.
.Pp
.Fn uvm_map_pageable
changes the pageability of the pages in the range from
.Fa start
to
.Fa end
in map
.Fa map
to
.Fa new_pageable .
.Fn uvm_map_pageable
returns a standard UVM return value.
.Pp
.Fn uvm_map_checkprot
checks the protection of the range from
.Fa start
to
.Fa end
in map
.Fa map
against
.Fa protection .
This returns either
.Dv TRUE
or
.Dv FALSE .
.Pp
.Fn uvm_map_protect
changes the protection of the range from
.Fa start
to
.Fa end
in map
.Fa map
to
.Fa new_prot ,
also setting the maximum protection of the region to
.Fa new_prot
if
.Fa set_max
is non-zero.
This function returns a standard UVM return value.
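The protection values above are single bits composed with bitwise OR,
with the multi-letter names being pre-composed combinations.
A small userland sketch makes this concrete; the values are copied from
the list above, and prot_allows() is an illustrative helper, not a UVM
function.

```c
#include <assert.h>

/*
 * Sketch: UVM_PROT_* values copied from uvm.9; the two- and
 * three-letter names are just ORed combinations of the single bits.
 */
#define UVM_PROT_NONE	0x00
#define UVM_PROT_READ	0x01
#define UVM_PROT_WRITE	0x02
#define UVM_PROT_EXEC	0x04
#define UVM_PROT_RW	0x03
#define UVM_PROT_RWX	0x07

/* Does 'prot' permit every access bit set in 'wanted'? */
static int
prot_allows(int prot, int wanted)
{
	return (prot & wanted) == wanted;
}
```

This is the same bit algebra a caller uses when building the
protection arguments for the UVM_MAPFLAG macro.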
.Pp
.Fn uvm_deallocate
deallocates kernel memory in map
.Fa map
from address
.Fa start
to
.Fa start + size .
.Pp
.Fn uvmspace_alloc
allocates and returns a new address space, with ranges from
.Fa min
to
.Fa max ,
setting the pageability of the address space to
.Fa pageable .
.Pp
.Fn uvmspace_exec
either reuses the address space of process
.Fa p
if there are no other references to it, or creates
a new one with
.Fn uvmspace_alloc .
The range of valid addresses in the address space is reset to
.Fa start
through
.Fa end .
.Pp
.Fn uvmspace_fork
creates and returns a new address space based upon the
.Fa vm
address space, typically used when allocating an address space for a
child process.
.Pp
.Fn uvmspace_free
lowers the reference count on the address space
.Fa vm1 ,
freeing the data structures if there are no other references.
.Pp
.Fn uvmspace_share
causes process
.Fa p2
to share the address space of
.Fa p1 .
.Pp
.Fn uvmspace_unshare
ensures that process
.Fa p
has its own, unshared address space, by creating a new one if
necessary by calling
.Fn uvmspace_fork .
.Pp
.Fn uvm_uarea_alloc
allocates virtual space for a u-area (i.e., a kernel stack) and stores
its virtual address in
.Fa *uaddrp .
The return value is
.Dv TRUE
if the u-area is already backed by wired physical memory, otherwise
.Dv FALSE .
.Pp
.Fn uvm_uarea_free
frees a u-area allocated with
.Fn uvm_uarea_alloc ,
freeing both the virtual space and any physical pages that may later
have been allocated to back that virtual space.
.Sh PAGE FAULT HANDLING
.Ft int
.br
.Fn uvm_fault "struct vm_map *orig_map" "vaddr_t vaddr" "vm_fault_t fault_type" "vm_prot_t access_type" ;
.Pp
.Fn uvm_fault
is the main entry point for faults.
It takes
.Fa orig_map
as the map the fault originated in, a
.Fa vaddr
offset into the map at which the fault occurred, a
.Fa fault_type
describing the type of fault, and an
.Fa access_type
describing the type of access requested.
.Fn uvm_fault
returns a standard UVM return value.
.Sh MEMORY MAPPING FILES AND DEVICES
.Ft struct uvm_object *
.br
.Fn uvn_attach "void *arg" "vm_prot_t accessprot" ;
.Pp
.Ft void
.br
.Fn uvm_vnp_setsize "struct vnode *vp" "voff_t newsize" ;
.Pp
.Ft void *
.br
.Fn ubc_alloc "struct uvm_object *uobj" "voff_t offset" "vsize_t *lenp" "int flags" ;
.Pp
.Ft void
.br
.Fn ubc_release "void *va" "int flags" ;
.Pp
.Fn uvn_attach
attaches a UVM object to vnode
.Fa arg ,
creating the object if necessary.
The object is returned.
.Pp
.Fn uvm_vnp_setsize
sets the size of vnode
.Fa vp
to
.Fa newsize .
The caller must hold a reference to the vnode.
If the vnode shrinks, pages no longer used are discarded.
.Pp
.Fn ubc_alloc
creates a kernel mapping of
.Fa uobj
starting at offset
.Fa offset .
The desired length of the mapping is pointed to by
.Fa lenp ,
but the actual mapping may be smaller than this.
.Fa lenp
is updated to contain the actual length mapped.
The flags must be one of
.Bd -literal
#define UBC_READ	0x01	/* mapping will be accessed for read */
#define UBC_WRITE	0x02	/* mapping will be accessed for write */
.Ed
.Pp
Currently,
.Fa uobj
must actually be a vnode object.
Once the mapping is created, it must be accessed only by methods that can
handle faults, such as
.Fn uiomove
or
.Fn kcopy .
Page faults on the mapping will result in the vnode's
.Fn VOP_GETPAGES
method being called to resolve the fault.
.Pp
.Fn ubc_release
frees the mapping at
.Fa va
for reuse.
The mapping may be cached to speed future accesses to the same region
of the object.
The flags can be any of
.Bd -literal
#define UBC_UNMAP	0x01	/* do not cache mapping */
.Ed
.Sh VIRTUAL MEMORY I/O
.Ft int
.br
.Fn uvm_io "struct vm_map *map" "struct uio *uio" ;
.Pp
.Fn uvm_io
performs the I/O described in
.Fa uio
on the memory described in
.Fa map .
.Sh ALLOCATION OF KERNEL MEMORY
.Ft vaddr_t
.br
.Fn uvm_km_alloc "struct vm_map *map" "vsize_t size" ;
.Pp
.Ft vaddr_t
.br
.Fn uvm_km_zalloc "struct vm_map *map" "vsize_t size" ;
.Pp
.Ft vaddr_t
.br
.Fn uvm_km_alloc1 "struct vm_map *map" "vsize_t size" "boolean_t zeroit" ;
.Pp
.Ft vaddr_t
.br
.Fn uvm_km_kmemalloc1 "struct vm_map *map" "struct uvm_object *obj" "vsize_t size" "vsize_t align" "voff_t prefer" "int flags" ;
.Pp
.Ft vaddr_t
.br
.Fn uvm_km_kmemalloc "struct vm_map *map" "struct uvm_object *obj" "vsize_t size" "int flags" ;
.Pp
.Ft vaddr_t
.br
.Fn uvm_km_valloc "struct vm_map *map" "vsize_t size" ;
.Pp
.Ft vaddr_t
.br
.Fn uvm_km_valloc_wait "struct vm_map *map" "vsize_t size" ;
.Pp
.Ft struct vm_map *
.br
.Fn uvm_km_suballoc "struct vm_map *map" "vaddr_t *min" "vaddr_t *max" "vsize_t size" "boolean_t pageable" "boolean_t fixed" "struct vm_map *submap" ;
.Pp
.Ft void
.br
.Fn uvm_km_free "struct vm_map *map" "vaddr_t addr" "vsize_t size" ;
.Pp
.Ft void
.br
.Fn uvm_km_free_wakeup "struct vm_map *map" "vaddr_t addr" "vsize_t size" ;
.Pp
.Fn uvm_km_alloc
and
.Fn uvm_km_zalloc
allocate
.Fa size
bytes of wired kernel memory in map
.Fa map .
In addition to allocation,
.Fn uvm_km_zalloc
zeros the memory.
Both of these functions are defined as macros in terms of
.Fn uvm_km_alloc1 ,
and should almost always be used in preference to
.Fn uvm_km_alloc1 .
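The macro relationship described above can be sketched with a toy
stand-in.
This is not the kernel definition; the fake_* names and the recording
variable are invented here purely to show how the zeroit argument
distinguishes the two macros.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Toy stand-in, not kernel code: uvm.9 says uvm_km_alloc() and
 * uvm_km_zalloc() are macros over uvm_km_alloc1(); this sketch
 * records the zeroit value each macro passes down.
 */
typedef unsigned long vaddr_t;
typedef unsigned long vsize_t;
struct vm_map;			/* opaque stand-in */

static int last_zeroit;		/* what the macro passed down */

static vaddr_t
fake_uvm_km_alloc1(struct vm_map *map, vsize_t size, int zeroit)
{
	(void)map; (void)size;
	last_zeroit = zeroit;
	return 0xd0000000UL;	/* placeholder address */
}

/* uvm_km_alloc() leaves the memory uninitialized... */
#define fake_uvm_km_alloc(map, size)  fake_uvm_km_alloc1((map), (size), 0)
/* ...while uvm_km_zalloc() asks uvm_km_alloc1() to zero it. */
#define fake_uvm_km_zalloc(map, size) fake_uvm_km_alloc1((map), (size), 1)
```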
.Pp
.Fn uvm_km_alloc1
allocates and returns
.Fa size
bytes of wired memory in the kernel map, zeroing the memory if the
.Fa zeroit
argument is non-zero.
.Pp
.Fn uvm_km_kmemalloc1
allocates and returns
.Fa size
bytes of wired kernel memory into
.Fa obj .
The first address of the allocated memory range will be aligned according to the
.Fa align
argument
.Pq specify 0 if no alignment is necessary .
The flags can be any of:
.Bd -literal
#define UVM_KMF_NOWAIT	0x1	/* matches M_NOWAIT */
#define UVM_KMF_VALLOC	0x2	/* allocate VA only */
#define UVM_KMF_CANFAIL	0x4	/* caller handles failure */
#define UVM_KMF_TRYLOCK	UVM_FLAG_TRYLOCK	/* try locking only */
.Ed
.Pp
.Dv UVM_KMF_NOWAIT
causes
.Fn uvm_km_kmemalloc1
to return immediately if no memory is available.
.Dv UVM_KMF_VALLOC
causes no physical pages to be allocated, only virtual space.
.Dv UVM_KMF_TRYLOCK
causes
.Fn uvm_km_kmemalloc1
to use
.Fn simple_lock_try
when locking maps.
.Dv UVM_KMF_CANFAIL
indicates that
.Fn uvm_km_kmemalloc1
can return 0 even if
.Dv UVM_KMF_NOWAIT
is not specified.
(If neither
.Dv UVM_KMF_NOWAIT
nor
.Dv UVM_KMF_CANFAIL
is specified,
.Fn uvm_km_kmemalloc1
will never fail, but rather sleep indefinitely until the allocation succeeds.)
.Pp
.Fn uvm_km_kmemalloc
allocates kernel memory like
.Fn uvm_km_kmemalloc1
but uses the default values
.Dv 0
for the
.Fa align
and
.Dv UVM_UNKNOWN_OFFSET
for the
.Fa prefer
arguments.
.Pp
.Fn uvm_km_valloc
and
.Fn uvm_km_valloc_wait
return a newly allocated zero-filled address in the kernel map of size
.Fa size .
.Fn uvm_km_valloc_wait
will also wait for kernel memory to become available, if there is a
memory shortage.
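The failure and sleep rules for the uvm_km_kmemalloc1() flags can be
summarised as two predicates.
This is a userland sketch of the semantics described above, not kernel
code; the flag values are copied from the list above, and the helper
names are invented for illustration.

```c
#include <assert.h>

/*
 * Sketch of the rules stated in uvm.9: the call may return 0 only
 * when UVM_KMF_NOWAIT or UVM_KMF_CANFAIL is set, and it may sleep
 * only when UVM_KMF_NOWAIT is clear.
 */
#define UVM_KMF_NOWAIT	0x1	/* matches M_NOWAIT */
#define UVM_KMF_VALLOC	0x2	/* allocate VA only */
#define UVM_KMF_CANFAIL	0x4	/* caller handles failure */

/* May the allocation return 0 to the caller? */
static int
alloc_can_fail(int flags)
{
	return (flags & (UVM_KMF_NOWAIT | UVM_KMF_CANFAIL)) != 0;
}

/* May the allocation sleep waiting for memory? */
static int
alloc_may_sleep(int flags)
{
	return (flags & UVM_KMF_NOWAIT) == 0;
}
```

With neither flag set, the call neither fails nor returns early: it
sleeps until the allocation succeeds, which is why callers that cannot
sleep must pass UVM_KMF_NOWAIT.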
.Pp
.Fn uvm_km_free
and
.Fn uvm_km_free_wakeup
free
.Fa size
bytes of memory in the kernel map, starting at address
.Fa addr .
.Fn uvm_km_free_wakeup
calls
.Fn wakeup
on the map before unlocking the map.
.Pp
.Fn uvm_km_suballoc
allocates a submap from
.Fa map ,
creating a new map if
.Fa submap
is
.Dv NULL .
The addresses of the submap can be specified exactly by setting the
.Fa fixed
argument to non-zero, which causes the
.Fa min
argument to specify the beginning address of the submap.
If
.Fa fixed
is zero, any address of size
.Fa size
will be allocated from
.Fa map
and the start and end addresses returned in
.Fa min
and
.Fa max .
If
.Fa pageable
is non-zero, entries in the map may be paged out.
.Sh ALLOCATION OF PHYSICAL MEMORY
.Ft struct vm_page *
.br
.Fn uvm_pagealloc "struct uvm_object *uobj" "voff_t off" "struct vm_anon *anon" "int flags" ;
.Pp
.Ft void
.br
.Fn uvm_pagerealloc "struct vm_page *pg" "struct uvm_object *newobj" "voff_t newoff" ;
.Pp
.Ft void
.br
.Fn uvm_pagefree "struct vm_page *pg" ;
.Pp
.Ft int
.br
.Fn uvm_pglistalloc "psize_t size" "paddr_t low" "paddr_t high" "paddr_t alignment" "paddr_t boundary" "struct pglist *rlist" "int nsegs" "int waitok" ;
.Pp
.Ft void
.br
.Fn uvm_pglistfree "struct pglist *list" ;
.Pp
.Ft void
.br
.Fn uvm_page_physload "vaddr_t start" "vaddr_t end" "vaddr_t avail_start" "vaddr_t avail_end" "int free_list" ;
.Pp
.Fn uvm_pagealloc
allocates a page of memory at virtual address
.Fa off
in either the object
.Fa uobj
or the anonymous memory
.Fa anon ,
which must be locked by the caller.
Only one of
.Fa uobj
and
.Fa anon
can be non
.Dv NULL .
Returns
.Dv NULL
when no page can be found.
The flags can be any of
.Bd -literal
#define UVM_PGA_USERESERVE	0x0001	/* ok to use reserve pages */
#define UVM_PGA_ZERO	0x0002	/* returned page must be zero'd */
.Ed
.Pp
.Dv UVM_PGA_USERESERVE
means to allocate a page even if that will result in the number of free pages
being lower than
.Dv uvmexp.reserve_pagedaemon
(if the current thread is the pagedaemon) or
.Dv uvmexp.reserve_kernel
(if the current thread is not the pagedaemon).
.Dv UVM_PGA_ZERO
causes the returned page to be filled with zeroes, either by allocating it
from a pool of pre-zeroed pages or by zeroing it in-line as necessary.
.Pp
.Fn uvm_pagerealloc
reallocates page
.Fa pg
to a new object
.Fa newobj ,
at a new offset
.Fa newoff .
.Pp
.Fn uvm_pagefree
frees the physical page
.Fa pg .
If the content of the page is known to be zero-filled, the
caller should set
.Dv PG_ZERO
in pg-\*[Gt]flags so that the page allocator will use
the page to serve future
.Dv UVM_PGA_ZERO
requests efficiently.
.Pp
.Fn uvm_pglistalloc
allocates a list of pages of size
.Fa size
bytes under various constraints.
.Fa low
and
.Fa high
describe the lowest and highest addresses acceptable for the list.
If
.Fa alignment
is non-zero, it describes the required alignment of the list, in
power-of-two notation.
If
.Fa boundary
is non-zero, no segment of the list may cross this power-of-two
boundary, relative to zero.
.Fa nsegs
is the maximum number of physically contiguous segments.
If
.Fa waitok
is non-zero, the function may sleep until enough memory is available.
(It also may give up in some situations, so a non-zero
.Fa waitok
does not imply that
.Fn uvm_pglistalloc
cannot return an error.)
The allocated memory is returned in the
.Fa rlist
list; the caller has to provide storage only, the list is initialized by
.Fn uvm_pglistalloc .
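The placement constraints that uvm_pglistalloc() enforces on each
segment can be written as a single predicate.
This is a userland sketch for illustration only, assuming alignment
and boundary are given in bytes as powers of two, with 0 meaning no
constraint; segment_ok() is not a UVM function.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative check of the uvm_pglistalloc() constraints described
 * above: range [low, high), power-of-two start alignment, and no
 * crossing of a power-of-two boundary line (relative to zero).
 */
static int
segment_ok(uintptr_t addr, uintptr_t size, uintptr_t low, uintptr_t high,
    uintptr_t alignment, uintptr_t boundary)
{
	if (addr < low || addr + size > high)
		return 0;	/* outside the acceptable address range */
	if (alignment != 0 && (addr & (alignment - 1)) != 0)
		return 0;	/* start address not aligned */
	if (boundary != 0 &&
	    (addr & ~(boundary - 1)) != ((addr + size - 1) & ~(boundary - 1)))
		return 0;	/* first and last byte straddle a boundary */
	return 1;
}
```

For example, with boundary 0x10000, a segment starting at 0xF800 of
length 0x1000 is rejected because it straddles the 0x10000 line.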
.Pp
.Fn uvm_pglistfree
frees the list of pages pointed to by
.Fa list .
If the content of a page is known to be zero-filled, the
caller should set
.Dv PG_ZERO
in pg-\*[Gt]flags so that the page allocator will use
the page to serve future
.Dv UVM_PGA_ZERO
requests efficiently.
.Pp
.Fn uvm_page_physload
loads physical memory segments into VM space on the specified
.Fa free_list .
It must be called at system boot time to set up physical memory
management pages.
The arguments describe the
.Fa start
and
.Fa end
of the physical addresses of the segment, and the available start and end
addresses of pages not already in use.
.\" XXX expand on "system boot time"!
.Sh PROCESSES
.Ft void
.br
.Fn uvm_pageout "void" ;
.Pp
.Ft void
.br
.Fn uvm_scheduler "void" ;
.Pp
.Ft void
.br
.Fn uvm_swapin "struct proc *p" ;
.Pp
.Fn uvm_pageout
is the main loop for the page daemon.
.Pp
.Fn uvm_scheduler
is the process zero main loop, which is to be called after the
system has finished starting other processes.
It handles the swapping in of runnable, swapped out processes in priority
order.
.Pp
.Fn uvm_swapin
swaps in the named process.
.Sh PAGE LOAN
.Ft int
.br
.Fn uvm_loan "struct vm_map *map" "vaddr_t start" "vsize_t len" "void *v" "int flags" ;
.Pp
.Ft void
.br
.Fn uvm_unloan "void *v" "int npages" "int flags" ;
.Pp
.Fn uvm_loan
loans pages in a map out to anons or to the kernel.
.Fa map
should be unlocked;
.Fa start
and
.Fa len
should be multiples of
.Dv PAGE_SIZE .
The argument
.Fa flags
should be one of
.Bd -literal
#define UVM_LOAN_TOANON	0x01	/* loan to anons */
#define UVM_LOAN_TOPAGE	0x02	/* loan to kernel */
.Ed
.Pp
.Fa v
should be a pointer to an array of pointers to
.Li struct anon
or
.Li struct vm_page ,
as appropriate.
The caller has to allocate memory for the array and
ensure it's big enough to hold
.Fa len / PAGE_SIZE
pointers.
Returns 0 for success, or an appropriate error number otherwise.
Note that wired pages can't be loaned out and
.Fn uvm_loan
will fail in that case.
.Pp
.Fn uvm_unloan
kills loans on pages or anons.
.Fa v
must point to the array of pointers initialized by a previous call to
.Fn uvm_loan .
.Fa npages
should match the number of pages allocated for the loan; this also matches
the number of items in the array.
The argument
.Fa flags
should be one of
.Bd -literal
#define UVM_LOAN_TOANON	0x01	/* loan to anons */
#define UVM_LOAN_TOPAGE	0x02	/* loan to kernel */
.Ed
.Pp
and should match what was used for the previous call to
.Fn uvm_loan .
.Sh MISCELLANEOUS FUNCTIONS
.Ft struct uvm_object *
.br
.Fn uao_create "vsize_t size" "int flags" ;
.Pp
.Ft void
.br
.Fn uao_detach "struct uvm_object *uobj" ;
.Pp
.Ft void
.br
.Fn uao_reference "struct uvm_object *uobj" ;
.Pp
.Ft boolean_t
.br
.Fn uvm_chgkprot "caddr_t addr" "size_t len" "int rw" ;
.Pp
.Ft void
.br
.Fn uvm_kernacc "caddr_t addr" "size_t len" "int rw" ;
.Pp
.Ft int
.br
.Fn uvm_vslock "struct proc *p" "caddr_t addr" "size_t len" "vm_prot_t prot" ;
.Pp
.Ft void
.br
.Fn uvm_vsunlock "struct proc *p" "caddr_t addr" "size_t len" ;
.Pp
.Ft void
.br
.Fn uvm_meter "void" ;
.Pp
.Ft void
.br
.Fn uvm_fork "struct proc *p1" "struct proc *p2" "boolean_t shared" ;
.Pp
.Ft int
.br
.Fn uvm_grow "struct proc *p" "vaddr_t sp" ;
.Pp
.Ft int
.br
.Fn uvm_coredump "struct proc *p" "struct vnode *vp" "struct ucred *cred" "struct core *chdr" ;
.Pp
.Ft void
.br
.Fn uvn_findpages "struct uvm_object *uobj" "voff_t offset" "int *npagesp" "struct vm_page **pps" "int flags" ;
.Pp
.Ft void
.br
.Fn uvm_swap_stats "int cmd" "struct swapent *sep" "int sec" "register_t *retval" ;
.Pp
The
.Fn uao_create ,
.Fn uao_detach ,
and
.Fn uao_reference
functions operate on anonymous memory objects, such as those used to support
System V shared memory.
.Fn uao_create
returns an object of size
.Fa size
with flags:
.Bd -literal
#define UAO_FLAG_KERNOBJ	0x1	/* create kernel object */
#define UAO_FLAG_KERNSWAP	0x2	/* enable kernel swap */
.Ed
.Pp
which can only be used once each at system boot time.
.Fn uao_reference
creates an additional reference to the named anonymous memory object.
.Fn uao_detach
removes a reference from the named anonymous memory object, destroying
it if the last reference is removed.
.Pp
.Fn uvm_chgkprot
changes the protection of kernel memory from
.Fa addr
to
.Fa addr + len
to the value of
.Fa rw .
This is primarily useful for debuggers, for setting breakpoints.
This function is only available with options
.Dv KGDB .
.Pp
.Fn uvm_kernacc
checks the access at address
.Fa addr
to
.Fa addr + len
for
.Fa rw
access in the kernel address space.
.Pp
.Fn uvm_vslock
and
.Fn uvm_vsunlock
control the wiring and unwiring of pages for process
.Fa p
from
.Fa addr
to
.Fa addr + len .
These functions are normally used to wire memory for I/O.
.Pp
.Fn uvm_meter
calculates the load average and wakes up the swapper if necessary.
.Pp
.Fn uvm_fork
forks a virtual address space for the (old) process
.Fa p1
and the (new) process
.Fa p2 .
If the
.Fa shared
argument is non-zero, p1 shares its address space with p2;
otherwise a new address space is created.
This function currently has no return value, and thus cannot fail.
In the future, this function will be changed to allow it to
fail in low memory conditions.
.Pp
.Fn uvm_grow
increases the stack segment of process
.Fa p
to include
.Fa sp .
.Pp
.Fn uvm_coredump
generates a coredump on vnode
.Fa vp
for process
.Fa p
with credentials
.Fa cred
and core header description in
.Fa chdr .
.Pp
.Fn uvn_findpages
looks up or creates pages in
.Fa uobj
at offset
.Fa offset ,
marks them busy and returns them in the
.Fa pps
array.
Currently
.Fa uobj
must be a vnode object.
The number of pages requested is pointed to by
.Fa npagesp ,
and this value is updated with the actual number of pages returned.
The flags can be
.Bd -literal
#define UFP_ALL	0x00	/* return all pages requested */
#define UFP_NOWAIT	0x01	/* don't sleep */
#define UFP_NOALLOC	0x02	/* don't allocate new pages */
#define UFP_NOCACHE	0x04	/* don't return pages which already exist */
#define UFP_NORDONLY	0x08	/* don't return PG_READONLY pages */
.Ed
.Pp
.Dv UFP_ALL
is a pseudo-flag meaning all requested pages should be returned.
.Dv UFP_NOWAIT
means that we must not sleep.
.Dv UFP_NOALLOC
causes any pages which do not already exist to be skipped.
.Dv UFP_NOCACHE
causes any pages which do already exist to be skipped.
.Dv UFP_NORDONLY
causes any pages which are marked PG_READONLY to be skipped.
.Pp
.Fn uvm_swap_stats
implements the
.Dv SWAP_STATS
and
.Dv SWAP_OSTATS
operations of the
.Xr swapctl 2
system call.
.Fa cmd
is the requested command,
.Dv SWAP_STATS
or
.Dv SWAP_OSTATS .
The function will copy no more than
.Fa sec
entries into the array pointed to by
.Fa sep .
On return,
.Fa retval
holds the actual number of entries copied into the array.
.Sh SYSCTL
UVM provides support for the
.Dv CTL_VM
domain of the
.Xr sysctl 3
hierarchy.
It handles the
.Dv VM_LOADAVG ,
.Dv VM_METER ,
.Dv VM_UVMEXP ,
and
.Dv VM_UVMEXP2
nodes, which return the current load averages, the current VM totals,
the uvmexp structure, and a kernel-version-independent view of the
uvmexp structure, respectively.
It also exports a number of tunables that control how much VM space is
allowed to be consumed by various tasks.
The load averages are typically accessed from userland using the
.Xr getloadavg 3
function.
The uvmexp structure has all global state of the UVM system,
and has the following members:
.Bd -literal
/* vm_page constants */
int pagesize;	/* size of a page (PAGE_SIZE): must be power of 2 */
int pagemask;	/* page mask */
int pageshift;	/* page shift */

/* vm_page counters */
int npages;	/* number of pages we manage */
int free;	/* number of free pages */
int active;	/* number of active pages */
int inactive;	/* number of pages that we free'd but may want back */
int paging;	/* number of pages in the process of being paged out */
int wired;	/* number of wired pages */
int reserve_pagedaemon; /* number of pages reserved for pagedaemon */
int reserve_kernel; /* number of pages reserved for kernel */

/* pageout params */
int freemin;	/* min number of free pages */
int freetarg;	/* target number of free pages */
int inactarg;	/* target number of inactive pages */
int wiredmax;	/* max number of wired pages */

/* swap */
int nswapdev;	/* number of configured swap devices in system */
int swpages;	/* number of PAGE_SIZE'ed swap pages */
int swpginuse;	/* number of swap pages in use */
int nswget;	/* number of times fault calls uvm_swap_get() */
int nanon;	/* total number of anon's in system */
int nfreeanon;	/* number of free anon's */

/* stat counters */
int faults;	/* page fault count */
int traps;	/* trap count */
int intrs;	/* interrupt count */
int swtch;	/* context switch count */
int softs;	/* software interrupt count */
int syscalls;	/* system calls */
int pageins;	/* pagein operation count */
		/* pageouts are in pdpageouts below */
int swapins;	/* swapins */
int swapouts;	/* swapouts */
int pgswapin;	/* pages swapped in */
int pgswapout;	/* pages swapped out */
int forks;	/* forks */
int forks_ppwait;	/* forks where parent waits */
int forks_sharevm;	/* forks where vmspace is shared */

/* fault subcounters */
int fltnoram;	/* number of times fault was out of ram */
int fltnoanon;	/* number of times fault was out of anons */
int fltpgwait;	/* number of times fault had to wait on a page */
int fltpgrele;	/* number of times fault found a released page */
int fltrelck;	/* number of times fault relock called */
int fltrelckok;	/* number of times fault relock is a success */
int fltanget;	/* number of times fault gets anon page */
int fltanretry;	/* number of times fault retries an anon get */
int fltamcopy;	/* number of times fault clears "needs copy" */
int fltnamap;	/* number of times fault maps a neighbor anon page */
int fltnomap;	/* number of times fault maps a neighbor obj page */
int fltlget;	/* number of times fault does a locked pgo_get */
int fltget;	/* number of times fault does an unlocked get */
int flt_anon;	/* number of times fault anon (case 1a) */
int flt_acow;	/* number of times fault anon cow (case 1b) */
int flt_obj;	/* number of times fault is on object page (2a) */
int flt_prcopy;	/* number of times fault promotes with copy (2b) */
int flt_przero;	/* number of times fault promotes with zerofill (2b) */

/* daemon counters */
int pdwoke;	/* number of times daemon woke up */
int pdrevs;	/* number of times daemon rev'd clock hand */
int pdswout;	/* number of times daemon called for swapout */
int pdfreed;	/* number of pages daemon freed since boot */
int pdscans;	/* number of pages daemon scanned since boot */
int pdanscan;	/* number of anonymous pages scanned by daemon */
int pdobscan;	/* number of object pages scanned by daemon */
int pdreact;	/* number of pages daemon reactivated since boot */
int pdbusy;	/* number of times daemon found a busy page */
int pdpageouts;	/* number of times daemon started a pageout */
int pdpending;	/* number of times daemon got a pending pageout */
int pddeact;	/* number of pages daemon deactivates */
.Ed
.Sh NOTES
.Fn uvm_chgkprot
is only available if the kernel has been compiled with options
.Dv KGDB .
.Pp
All structures and types whose names begin with
.Dq vm_
will be renamed to
.Dq uvm_ .
.Sh SEE ALSO
.Xr swapctl 2 ,
.Xr getloadavg 3 ,
.Xr kvm 3 ,
.Xr sysctl 3 ,
.Xr ddb 4 ,
.Xr options 4 ,
.Xr pmap 9
.Sh HISTORY
UVM is a new VM system developed at Washington University in St. Louis
(Missouri).
UVM's roots lie partly in the Mach-based
.Bx 4.4
VM system, the
.Fx
VM system, and the SunOS 4 VM system.
UVM's basic structure is based on the
.Bx 4.4
VM system.
UVM's new anonymous memory system is based on the
anonymous memory system found in the SunOS 4 VM (as described in papers
published by Sun Microsystems, Inc.).
UVM also includes a number of features new to
.Bx ,
including page loanout, map entry passing, simplified
copy-on-write, and clustered anonymous memory pageout.
UVM is also further documented in an August 1998 dissertation by
Charles D. Cranor.
.Pp
UVM appeared in
.Nx 1.4 .
.Sh AUTHORS
Charles D. Cranor
.Aq chuck@ccrc.wustl.edu
designed and implemented UVM.
.Pp
Matthew Green
.Aq mrg@eterna.com.au
wrote the swap-space management code and handled the logistical issues
involved with merging UVM into the
.Nx
source tree.
.Pp
Chuck Silvers
.Aq chuq@chuq.com
implemented the aobj pager, thus allowing UVM to support System V shared
memory and process swapping.
He also designed and implemented the UBC part of UVM, which uses UVM pages
to cache vnode data rather than the traditional buffer cache buffers.