1.\" $NetBSD: uvm.9,v 1.67 2005/09/10 12:51:13 wiz Exp $ 2.\" 3.\" Copyright (c) 1998 Matthew R. Green 4.\" All rights reserved. 5.\" 6.\" Redistribution and use in source and binary forms, with or without 7.\" modification, are permitted provided that the following conditions 8.\" are met: 9.\" 1. Redistributions of source code must retain the above copyright 10.\" notice, this list of conditions and the following disclaimer. 11.\" 2. Redistributions in binary form must reproduce the above copyright 12.\" notice, this list of conditions and the following disclaimer in the 13.\" documentation and/or other materials provided with the distribution. 14.\" 3. The name of the author may not be used to endorse or promote products 15.\" derived from this software without specific prior written permission. 16.\" 17.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR 18.\" IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES 19.\" OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 20.\" IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, 21.\" INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, 22.\" BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; 23.\" LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED 24.\" AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 25.\" OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY 26.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF 27.\" SUCH DAMAGE. 28.\" 29.Dd April 1, 2005 30.Dt UVM 9 31.Os 32.Sh NAME 33.Nm uvm 34.Nd virtual memory system external interface 35.Sh SYNOPSIS 36.In sys/param.h 37.In uvm/uvm.h 38.Sh DESCRIPTION 39The UVM virtual memory system manages access to the computer's memory 40resources. 41User processes and the kernel access these resources through 42UVM's external interface. 43UVM's external interface includes functions that: 44.Pp 45.Bl -hyphen -compact 46.It 47initialise UVM sub-systems 48.It 49manage virtual address spaces 50.It 51resolve page faults 52.It 53memory map files and devices 54.It 55perform uio-based I/O to virtual memory 56.It 57allocate and free kernel virtual memory 58.It 59allocate and free physical memory 60.El 61.Pp 62In addition to exporting these services, UVM has two kernel-level processes: 63pagedaemon and swapper. 64The pagedaemon process sleeps until physical memory becomes scarce. 65When that happens, pagedaemon is awoken. 66It scans physical memory, paging out and freeing memory that has not 67been recently used. 68The swapper process swaps in runnable processes that are currently swapped 69out, if there is room. 70.Pp 71There are also several miscellaneous functions. 72.Sh INITIALISATION 73.Ft void 74.br 75.Fn uvm_init "void" ; 76.Pp 77.Ft void 78.br 79.Fn uvm_init_limits "struct proc *p" ; 80.Pp 81.Ft void 82.br 83.Fn uvm_setpagesize "void" ; 84.Pp 85.Ft void 86.br 87.Fn uvm_swap_init "void" ; 88.Pp 89.Fn uvm_init 90sets up the UVM system at system boot time, after the 91copyright has been printed. 92It initialises global state, the page, map, kernel virtual memory state, 93machine-dependent physical map, kernel memory allocator, 94pager and anonymous memory sub-systems, and then enables 95paging of kernel objects. 96.Pp 97.Fn uvm_init_limits 98initialises process limits for the named process. 99This is for use by the system startup for process zero, before any 100other processes are created. 
.Pp
.Fn uvm_setpagesize
initialises the uvmexp members pagesize (if not already done by
machine-dependent code), pageshift and pagemask.
It should be called by machine-dependent code early in the
.Fn pmap_init
call (see
.Xr pmap 9 ) .
.Pp
.Fn uvm_swap_init
initialises the swap sub-system.
.Sh VIRTUAL ADDRESS SPACE MANAGEMENT
.Ft int
.br
.Fn uvm_map "struct vm_map *map" "vaddr_t *startp" "vsize_t size" "struct uvm_object *uobj" "voff_t uoffset" "vsize_t align" "uvm_flag_t flags" ;
.Pp
.Ft void
.br
.Fn uvm_unmap "struct vm_map *map" "vaddr_t start" "vaddr_t end" ;
.Pp
.Ft int
.br
.Fn uvm_map_pageable "struct vm_map *map" "vaddr_t start" "vaddr_t end" "boolean_t new_pageable" "int lockflags" ;
.Pp
.Ft boolean_t
.br
.Fn uvm_map_checkprot "struct vm_map *map" "vaddr_t start" "vaddr_t end" "vm_prot_t protection" ;
.Pp
.Ft int
.br
.Fn uvm_map_protect "struct vm_map *map" "vaddr_t start" "vaddr_t end" "vm_prot_t new_prot" "boolean_t set_max" ;
.Pp
.Ft int
.br
.Fn uvm_deallocate "struct vm_map *map" "vaddr_t start" "vsize_t size" ;
.Pp
.Ft struct vmspace *
.br
.Fn uvmspace_alloc "vaddr_t min" "vaddr_t max" "int pageable" ;
.Pp
.Ft void
.br
.Fn uvmspace_exec "struct proc *p" "vaddr_t start" "vaddr_t end" ;
.Pp
.Ft struct vmspace *
.br
.Fn uvmspace_fork "struct vmspace *vm" ;
.Pp
.Ft void
.br
.Fn uvmspace_free "struct vmspace *vm" ;
.Pp
.Ft void
.br
.Fn uvmspace_share "struct proc *p1" "struct proc *p2" ;
.Pp
.Ft void
.br
.Fn uvmspace_unshare "struct proc *p" ;
.Pp
.Ft boolean_t
.br
.Fn uvm_uarea_alloc "vaddr_t *uaddrp" ;
.Pp
.Ft void
.br
.Fn uvm_uarea_free "vaddr_t uaddr" ;
.Pp
.Fn uvm_map
establishes a valid mapping in map
.Fa map ,
which must be unlocked.
The new mapping has size
.Fa size ,
which must be a multiple of
.Dv PAGE_SIZE .
The
.Fa uobj
and
.Fa uoffset
arguments can have four meanings.
When
.Fa uobj
is
.Dv NULL
and
.Fa uoffset
is
.Dv UVM_UNKNOWN_OFFSET ,
.Fn uvm_map
does not use the machine-dependent
.Dv PMAP_PREFER
function.
If
.Fa uoffset
is any other value, it is used as the hint to
.Dv PMAP_PREFER .
When
.Fa uobj
is not
.Dv NULL
and
.Fa uoffset
is
.Dv UVM_UNKNOWN_OFFSET ,
.Fn uvm_map
finds the offset based upon the virtual address, passed as
.Fa startp .
If
.Fa uoffset
is any other value, a normal mapping is established at that offset in the
object.
The start address of the map will be returned in
.Fa startp .
.Pp
.Fa align
specifies the alignment of the mapping unless
.Dv UVM_FLAG_FIXED
is specified in
.Fa flags .
.Fa align
must be a power of 2.
.Pp
.Fa flags
passed to
.Fn uvm_map
are typically created using the
.Fn UVM_MAPFLAG "vm_prot_t prot" "vm_prot_t maxprot" "vm_inherit_t inh" "int advice" "int flags"
macro, which uses the following values.
The values that
.Fa prot
and
.Fa maxprot
can take are:
.Bd -literal
#define UVM_PROT_MASK  0x07  /* protection mask */
#define UVM_PROT_NONE  0x00  /* protection none */
#define UVM_PROT_ALL   0x07  /* everything */
#define UVM_PROT_READ  0x01  /* read */
#define UVM_PROT_WRITE 0x02  /* write */
#define UVM_PROT_EXEC  0x04  /* exec */
#define UVM_PROT_R     0x01  /* read */
#define UVM_PROT_W     0x02  /* write */
#define UVM_PROT_RW    0x03  /* read-write */
#define UVM_PROT_X     0x04  /* exec */
#define UVM_PROT_RX    0x05  /* read-exec */
#define UVM_PROT_WX    0x06  /* write-exec */
#define UVM_PROT_RWX   0x07  /* read-write-exec */
.Ed
.Pp
The values that
.Fa inh
can take are:
.Bd -literal
#define UVM_INH_MASK   0x30  /* inherit mask */
#define UVM_INH_SHARE  0x00  /* "share" */
#define UVM_INH_COPY   0x10  /* "copy" */
#define UVM_INH_NONE   0x20  /* "none" */
#define UVM_INH_DONATE 0x30  /* "donate" \*[Lt]\*[Lt] not used */
.Ed
.Pp
The values that
.Fa advice
can take are:
.Bd -literal
#define UVM_ADV_NORMAL     0x0  /* 'normal' */
#define UVM_ADV_RANDOM     0x1  /* 'random' */
#define UVM_ADV_SEQUENTIAL 0x2  /* 'sequential' */
#define UVM_ADV_MASK       0x7  /* mask */
.Ed
.Pp
The values that
.Fa flags
can take are:
.Bd -literal
#define UVM_FLAG_FIXED   0x010000 /* find space */
#define UVM_FLAG_OVERLAY 0x020000 /* establish overlay */
#define UVM_FLAG_NOMERGE 0x040000 /* don't merge map entries */
#define UVM_FLAG_COPYONW 0x080000 /* set copy_on_write flag */
#define UVM_FLAG_AMAPPAD 0x100000 /* for bss: pad amap to reduce malloc() */
#define UVM_FLAG_TRYLOCK 0x200000 /* fail if we can not lock map */
.Ed
.Pp
The
.Dv UVM_MAPFLAG
macro arguments can be combined with a bitwise OR operator.
There are several special purpose macros for checking protection
combinations, e.g., the
.Dv UVM_PROT_WX
macro.
There are also some additional macros to extract bits from the flags.
The
.Dv UVM_PROTECTION ,
.Dv UVM_INHERIT ,
.Dv UVM_MAXPROTECTION
and
.Dv UVM_ADVICE
macros return the protection, inheritance, maximum protection and advice,
respectively.
.Fn uvm_map
returns a standard UVM return value.
.Pp
.Fn uvm_unmap
removes a valid mapping, from
.Fa start
to
.Fa end ,
in map
.Fa map ,
which must be unlocked.
.Pp
.Fn uvm_map_pageable
changes the pageability of the pages in the range from
.Fa start
to
.Fa end
in map
.Fa map
to
.Fa new_pageable .
.Fn uvm_map_pageable
returns a standard UVM return value.
.Pp
.Fn uvm_map_checkprot
checks the protection of the range from
.Fa start
to
.Fa end
in map
.Fa map
against
.Fa protection .
This returns either
.Dv TRUE
or
.Dv FALSE .
.Pp
.Fn uvm_map_protect
changes the protection of the range from
.Fa start
to
.Fa end
in map
.Fa map
to
.Fa new_prot ,
also setting the maximum protection of the region to
.Fa new_prot
if
.Fa set_max
is non-zero.
This function returns a standard UVM return value.
.Pp
.Fn uvm_deallocate
deallocates kernel memory in map
.Fa map
from address
.Fa start
to
.Fa start + size .
.Pp
.Fn uvmspace_alloc
allocates and returns a new address space, with ranges from
.Fa min
to
.Fa max ,
setting the pageability of the address space to
.Fa pageable .
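.Pp
To illustrate the
.Fn uvm_map
interface and the
.Fn UVM_MAPFLAG
macro described above, the following sketch requests an anonymous mapping;
the map, va, and len variables and the error handling are illustrative only:
.Bd -literal
/* Illustrative sketch, not verbatim kernel code. */
vaddr_t va = 0;		/* address hint; the result is returned here */
int error;

len = round_page(len);	/* size must be a multiple of PAGE_SIZE */
error = uvm_map(map, &va, len, NULL, UVM_UNKNOWN_OFFSET, 0,
    UVM_MAPFLAG(UVM_PROT_RW, UVM_PROT_ALL, UVM_INH_NONE,
    UVM_ADV_NORMAL, 0));
if (error != 0)
	return error;	/* standard UVM return value */
.Ed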
.Pp
.Fn uvmspace_exec
either reuses the address space of process
.Fa p
if there are no other references to it, or creates
a new one with
.Fn uvmspace_alloc .
The range of valid addresses in the address space is reset to
.Fa start
through
.Fa end .
.Pp
.Fn uvmspace_fork
creates and returns a new address space based upon the
.Fa vm
address space, typically used when allocating an address space for a
child process.
.Pp
.Fn uvmspace_free
lowers the reference count on the address space
.Fa vm ,
freeing the data structures if there are no other references.
.Pp
.Fn uvmspace_share
causes process
.Fa p2
to share the address space of
.Fa p1 .
.Pp
.Fn uvmspace_unshare
ensures that process
.Fa p
has its own, unshared address space, by creating a new one if
necessary by calling
.Fn uvmspace_fork .
.Pp
.Fn uvm_uarea_alloc
allocates virtual space for a u-area (i.e., a kernel stack) and stores
its virtual address in
.Fa *uaddrp .
The return value is
.Dv TRUE
if the u-area is already backed by wired physical memory, otherwise
.Dv FALSE .
.Pp
.Fn uvm_uarea_free
frees a u-area allocated with
.Fn uvm_uarea_alloc ,
freeing both the virtual space and any physical pages which may have
later been allocated to back that virtual space.
.Sh PAGE FAULT HANDLING
.Ft int
.br
.Fn uvm_fault "struct vm_map *orig_map" "vaddr_t vaddr" "vm_fault_t fault_type" "vm_prot_t access_type" ;
.Pp
.Fn uvm_fault
is the main entry point for faults.
It takes
.Fa orig_map
as the map the fault originated in,
.Fa vaddr
as the offset into the map at which the fault occurred,
.Fa fault_type
describing the type of fault, and
.Fa access_type
describing the type of access requested.
.Fn uvm_fault
returns a standard UVM return value.
.Sh MEMORY MAPPING FILES AND DEVICES
.Ft struct uvm_object *
.br
.Fn uvn_attach "void *arg" "vm_prot_t accessprot" ;
.Pp
.Ft void
.br
.Fn uvm_vnp_setsize "struct vnode *vp" "voff_t newsize" ;
.Pp
.Ft void *
.br
.Fn ubc_alloc "struct uvm_object *uobj" "voff_t offset" "vsize_t *lenp" "int flags" ;
.Pp
.Ft void
.br
.Fn ubc_release "void *va" "int flags" ;
.Pp
.Fn uvn_attach
attaches a UVM object to vnode
.Fa arg ,
creating the object if necessary.
The object is returned.
.Pp
.Fn uvm_vnp_setsize
sets the size of vnode
.Fa vp
to
.Fa newsize .
The caller must hold a reference to the vnode.
If the vnode shrinks, pages no longer used are discarded.
.Pp
.Fn ubc_alloc
creates a kernel mapping of
.Fa uobj
starting at offset
.Fa offset .
The desired length of the mapping is pointed to by
.Fa lenp ,
but the actual mapping may be smaller than this.
.Fa lenp
is updated to contain the actual length mapped.
The flags must be one of
.Bd -literal
#define UBC_READ   0x01  /* mapping will be accessed for read */
#define UBC_WRITE  0x02  /* mapping will be accessed for write */
.Ed
.Pp
Currently,
.Fa uobj
must actually be a vnode object.
Once the mapping is created, it must be accessed only by methods that can
handle faults, such as
.Fn uiomove
or
.Fn kcopy .
Page faults on the mapping will result in the vnode's
.Fn VOP_GETPAGES
method being called to resolve the fault.
.Pp
.Fn ubc_release
frees the mapping at
.Fa va
for reuse.
The mapping may be cached to speed future accesses to the same region
of the object.
The flags can be any of
.Bd -literal
#define UBC_UNMAP  0x01  /* do not cache mapping */
.Ed
.Sh VIRTUAL MEMORY I/O
.Ft int
.br
.Fn uvm_io "struct vm_map *map" "struct uio *uio" ;
.Pp
.Fn uvm_io
performs the I/O described in
.Fa uio
on the memory described in
.Fa map .
.Sh ALLOCATION OF KERNEL MEMORY
.Ft vaddr_t
.br
.Fn uvm_km_alloc "struct vm_map *map" "vsize_t size" "vsize_t align" "uvm_flag_t flags" ;
.Pp
.Ft void
.br
.Fn uvm_km_free "struct vm_map *map" "vaddr_t addr" "vsize_t size" "uvm_flag_t flags" ;
.Pp
.Ft struct vm_map *
.br
.Fn uvm_km_suballoc "struct vm_map *map" "vaddr_t *min" "vaddr_t *max" "vsize_t size" "boolean_t pageable" "boolean_t fixed" "struct vm_map *submap" ;
.Pp
.Fn uvm_km_alloc
allocates
.Fa size
bytes of kernel memory in map
.Fa map .
The first address of the allocated memory range will be aligned according to the
.Fa align
argument
.Pq specify 0 if no alignment is necessary .
The alignment must be a multiple of the page size.
The
.Fa flags
argument is a bitwise inclusive OR of the allocation type and operation flags.
.Pp
The allocation type should be one of:
.Bl -tag -width UVM_KMF_PAGEABLE
.It UVM_KMF_WIRED
Wired memory.
.It UVM_KMF_PAGEABLE
Demand-paged zero-filled memory.
.It UVM_KMF_VAONLY
Virtual address only.
No physical pages are mapped in the allocated region.
If necessary, it's the caller's responsibility to enter page mappings.
It's also the caller's responsibility to clean up the mappings before freeing
the address range.
.El
.Pp
The following operation flags are available:
.Bl -tag -width UVM_KMF_PAGEABLE
.It UVM_KMF_CANFAIL
Can fail even if
.Dv UVM_KMF_NOWAIT
is not specified and
.Dv UVM_KMF_WAITVA
is specified.
.It UVM_KMF_ZERO
Request zero-filled memory.
Only supported for
.Dv UVM_KMF_WIRED .
Shouldn't be used with other types.
.It UVM_KMF_TRYLOCK
Fail if we can't lock the map.
.It UVM_KMF_NOWAIT
Fail immediately if no memory is available.
.It UVM_KMF_WAITVA
Sleep to wait for the virtual address resources if needed.
.El
.Pp
(If neither
.Dv UVM_KMF_NOWAIT
nor
.Dv UVM_KMF_CANFAIL
is specified and
.Dv UVM_KMF_WAITVA
is specified,
.Fn uvm_km_alloc
will never fail, but rather sleep indefinitely until the allocation succeeds.)
.Pp
Pageability of the pages allocated with
.Dv UVM_KMF_PAGEABLE
can be changed by
.Fn uvm_map_pageable .
In that case, the entire range must be changed atomically.
Changing a part of the range is not supported.
.Pp
.Fn uvm_km_free
frees the memory range allocated by
.Fn uvm_km_alloc .
.Fa addr
must be an address returned by
.Fn uvm_km_alloc .
.Fa map
and
.Fa size
must be the same as the ones used for the corresponding
.Fn uvm_km_alloc .
.Fa flags
must be the allocation type used for the corresponding
.Fn uvm_km_alloc .
.Pp
.Fn uvm_km_free
is the only way to free memory ranges allocated by
.Fn uvm_km_alloc .
.Fn uvm_unmap
must not be used.
.Pp
.Fn uvm_km_suballoc
allocates a submap of
.Fa map ,
creating a new map if
.Fa submap
is
.Dv NULL .
The addresses of the submap can be specified exactly by setting the
.Fa fixed
argument to non-zero, which causes the
.Fa min
argument to specify the beginning address of the submap.
If
.Fa fixed
is zero, a range of
.Fa size
bytes will be allocated from
.Fa map
at any address, and the start and end addresses are returned in
.Fa min
and
.Fa max .
If
.Fa pageable
is non-zero, entries in the map may be paged out.
.Sh ALLOCATION OF PHYSICAL MEMORY
.Ft struct vm_page *
.br
.Fn uvm_pagealloc "struct uvm_object *uobj" "voff_t off" "struct vm_anon *anon" "int flags" ;
.Pp
.Ft void
.br
.Fn uvm_pagerealloc "struct vm_page *pg" "struct uvm_object *newobj" "voff_t newoff" ;
.Pp
.Ft void
.br
.Fn uvm_pagefree "struct vm_page *pg" ;
.Pp
.Ft int
.br
.Fn uvm_pglistalloc "psize_t size" "paddr_t low" "paddr_t high" "paddr_t alignment" "paddr_t boundary" "struct pglist *rlist" "int nsegs" "int waitok" ;
.Pp
.Ft void
.br
.Fn uvm_pglistfree "struct pglist *list" ;
.Pp
.Ft void
.br
.Fn uvm_page_physload "vaddr_t start" "vaddr_t end" "vaddr_t avail_start" "vaddr_t avail_end" "int free_list" ;
.Pp
.Fn uvm_pagealloc
allocates a page of memory at offset
.Fa off
in either the object
.Fa uobj
or the anonymous memory
.Fa anon ,
which must be locked by the caller.
Only one of
.Fa uobj
and
.Fa anon
can be non
.Dv NULL .
Returns
.Dv NULL
when no page can be found.
The flags can be any of
.Bd -literal
#define UVM_PGA_USERESERVE  0x0001  /* ok to use reserve pages */
#define UVM_PGA_ZERO        0x0002  /* returned page must be zero'd */
.Ed
.Pp
.Dv UVM_PGA_USERESERVE
means to allocate a page even if that will result in the number of free pages
being lower than
.Dv uvmexp.reserve_pagedaemon
(if the current thread is the pagedaemon) or
.Dv uvmexp.reserve_kernel
(if the current thread is not the pagedaemon).
.Dv UVM_PGA_ZERO
causes the returned page to be filled with zeroes, either by allocating it
from a pool of pre-zeroed pages or by zeroing it in-line as necessary.
.Pp
.Fn uvm_pagerealloc
reallocates page
.Fa pg
to a new object
.Fa newobj ,
at a new offset
.Fa newoff .
.Pp
.Fn uvm_pagefree
frees the physical page
.Fa pg .
If the content of the page is known to be zero-filled,
the caller should set
.Dv PG_ZERO
in pg-\*[Gt]flags so that the page allocator will use
the page to serve future
.Dv UVM_PGA_ZERO
requests efficiently.
.Pp
.Fn uvm_pglistalloc
allocates a list of pages of size
.Fa size
bytes, subject to various constraints.
.Fa low
and
.Fa high
describe the lowest and highest addresses acceptable for the list.
If
.Fa alignment
is non-zero, it describes the required alignment of the list, in
power-of-two notation.
If
.Fa boundary
is non-zero, no segment of the list may cross this power-of-two
boundary, relative to zero.
.Fa nsegs
is the maximum number of physically contiguous segments.
If
.Fa waitok
is non-zero, the function may sleep until enough memory is available.
(It also may give up in some situations, so a non-zero
.Fa waitok
does not imply that
.Fn uvm_pglistalloc
cannot return an error.)
The allocated memory is returned in the
.Fa rlist
list; the caller only has to provide the storage, as the list is initialized by
.Fn uvm_pglistalloc .
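.Pp
For example, a driver might use
.Fn uvm_pglistalloc
to obtain physically contiguous pages for DMA; the address limit, the
.Fn do_something
routine, and the npages variable below are illustrative only:
.Bd -literal
/* Illustrative sketch, not verbatim kernel code. */
struct pglist plist;
struct vm_page *pg;
int error;

error = uvm_pglistalloc(npages * PAGE_SIZE, 0, 16 * 1024 * 1024,
    PAGE_SIZE, 0, &plist, 1, 1);	/* one segment, may sleep */
if (error != 0)
	return error;

TAILQ_FOREACH(pg, &plist, pageq)	/* walk the returned pages */
	do_something(VM_PAGE_TO_PHYS(pg));

uvm_pglistfree(&plist);
.Ed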
.Pp
.Fn uvm_pglistfree
frees the list of pages pointed to by
.Fa list .
If the content of a page is known to be zero-filled,
the caller should set
.Dv PG_ZERO
in pg-\*[Gt]flags so that the page allocator will use
the page to serve future
.Dv UVM_PGA_ZERO
requests efficiently.
.Pp
.Fn uvm_page_physload
loads physical memory segments into VM space on the specified
.Fa free_list .
It must be called at system boot time to set up physical memory
management pages.
The arguments describe the
.Fa start
and
.Fa end
of the physical addresses of the segment, and the available start and end
addresses of pages not already in use.
.\" XXX expand on "system boot time"!
.Sh PROCESSES
.Ft void
.br
.Fn uvm_pageout "void" ;
.Pp
.Ft void
.br
.Fn uvm_scheduler "void" ;
.Pp
.Ft void
.br
.Fn uvm_swapin "struct proc *p" ;
.Pp
.Fn uvm_pageout
is the main loop for the page daemon.
.Pp
.Fn uvm_scheduler
is the process zero main loop, which is to be called after the
system has finished starting other processes.
It handles the swapping in of runnable, swapped out processes in priority
order.
.Pp
.Fn uvm_swapin
swaps in the named process.
.Sh PAGE LOAN
.Ft int
.br
.Fn uvm_loan "struct vm_map *map" "vaddr_t start" "vsize_t len" "void *v" "int flags" ;
.Pp
.Ft void
.br
.Fn uvm_unloan "void *v" "int npages" "int flags" ;
.Pp
.Fn uvm_loan
loans pages in a map out to anons or to the kernel.
.Fa map
should be unlocked, and
.Fa start
and
.Fa len
should be multiples of
.Dv PAGE_SIZE .
The
.Fa flags
argument should be one of
.Bd -literal
#define UVM_LOAN_TOANON  0x01  /* loan to anons */
#define UVM_LOAN_TOPAGE  0x02  /* loan to kernel */
.Ed
.Pp
.Fa v
should be a pointer to an array of pointers to
.Li struct anon
or
.Li struct vm_page ,
as appropriate.
The caller has to allocate memory for the array and
ensure it's big enough to hold
.Fa len / PAGE_SIZE
pointers.
Returns 0 for success, or an appropriate error number otherwise.
Note that wired pages can't be loaned out and
.Fn uvm_loan
will fail in that case.
.Pp
.Fn uvm_unloan
kills loans on pages or anons.
.Fa v
must point to the array of pointers initialized by a previous call to
.Fn uvm_loan .
.Fa npages
should match the number of pages allocated for the loan, which is also the
number of items in the array.
The
.Fa flags
argument should be one of
.Bd -literal
#define UVM_LOAN_TOANON  0x01  /* loan to anons */
#define UVM_LOAN_TOPAGE  0x02  /* loan to kernel */
.Ed
.Pp
and should match what was used for the previous call to
.Fn uvm_loan .
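.Pp
For example, to loan the pages backing a page-aligned user buffer to the
kernel and release the loan afterwards (the map, uva, and len variables
below are illustrative only):
.Bd -literal
/* Illustrative sketch, not verbatim kernel code. */
struct vm_page **pages;
int npages, error;

npages = len / PAGE_SIZE;
pages = malloc(npages * sizeof(*pages), M_TEMP, M_WAITOK);

error = uvm_loan(map, uva, len, pages, UVM_LOAN_TOPAGE);
if (error == 0) {
	/* ... use the loaned pages, e.g. for I/O ... */
	uvm_unloan(pages, npages, UVM_LOAN_TOPAGE);
}
free(pages, M_TEMP);
.Ed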
.Sh MISCELLANEOUS FUNCTIONS
.Ft struct uvm_object *
.br
.Fn uao_create "vsize_t size" "int flags" ;
.Pp
.Ft void
.br
.Fn uao_detach "struct uvm_object *uobj" ;
.Pp
.Ft void
.br
.Fn uao_reference "struct uvm_object *uobj" ;
.Pp
.Ft boolean_t
.br
.Fn uvm_chgkprot "caddr_t addr" "size_t len" "int rw" ;
.Pp
.Ft void
.br
.Fn uvm_kernacc "caddr_t addr" "size_t len" "int rw" ;
.Pp
.Ft int
.br
.Fn uvm_vslock "struct proc *p" "caddr_t addr" "size_t len" "vm_prot_t prot" ;
.Pp
.Ft void
.br
.Fn uvm_vsunlock "struct proc *p" "caddr_t addr" "size_t len" ;
.Pp
.Ft void
.br
.Fn uvm_meter "void" ;
.Pp
.Ft void
.br
.Fn uvm_fork "struct proc *p1" "struct proc *p2" "boolean_t shared" ;
.Pp
.Ft int
.br
.Fn uvm_grow "struct proc *p" "vaddr_t sp" ;
.Pp
.Ft int
.br
.Fn uvm_coredump "struct proc *p" "struct vnode *vp" "struct ucred *cred" "struct core *chdr" ;
.Pp
.Ft void
.br
.Fn uvn_findpages "struct uvm_object *uobj" "voff_t offset" "int *npagesp" "struct vm_page **pps" "int flags" ;
.Pp
.Ft void
.br
.Fn uvm_swap_stats "int cmd" "struct swapent *sep" "int sec" "register_t *retval" ;
.Pp
The
.Fn uao_create ,
.Fn uao_detach ,
and
.Fn uao_reference
functions operate on anonymous memory objects, such as those used to support
System V shared memory.
.Fn uao_create
returns an object of size
.Fa size
with flags:
.Bd -literal
#define UAO_FLAG_KERNOBJ   0x1  /* create kernel object */
#define UAO_FLAG_KERNSWAP  0x2  /* enable kernel swap */
.Ed
.Pp
which can only be used once each at system boot time.
.Fn uao_reference
creates an additional reference to the named anonymous memory object.
.Fn uao_detach
removes a reference from the named anonymous memory object, destroying
it if the last reference is removed.
.Pp
.Fn uvm_chgkprot
changes the protection of kernel memory from
.Fa addr
to
.Fa addr + len
to the value of
.Fa rw .
This is primarily useful for debuggers, for setting breakpoints.
This function is only available with options
.Dv KGDB .
.Pp
.Fn uvm_kernacc
checks the access at address
.Fa addr
to
.Fa addr + len
for
.Fa rw
access in the kernel address space.
.Pp
.Fn uvm_vslock
and
.Fn uvm_vsunlock
control the wiring and unwiring of pages for process
.Fa p
from
.Fa addr
to
.Fa addr + len .
These functions are normally used to wire memory for I/O.
.Pp
.Fn uvm_meter
calculates the load average and wakes up the swapper if necessary.
.Pp
.Fn uvm_fork
forks a virtual address space for the (old) process
.Fa p1
and the (new) process
.Fa p2 .
If the
.Fa shared
argument is non-zero, p1 shares its address space with p2;
otherwise a new address space is created.
This function currently has no return value, and thus cannot fail.
In the future, this function will be changed to allow it to
fail in low memory conditions.
.Pp
.Fn uvm_grow
increases the stack segment of process
.Fa p
to include
.Fa sp .
.Pp
.Fn uvm_coredump
generates a coredump on vnode
.Fa vp
for process
.Fa p
with credentials
.Fa cred
and core header description in
.Fa chdr .
.Pp
.Fn uvn_findpages
looks up or creates pages in
.Fa uobj
at offset
.Fa offset ,
marks them busy and returns them in the
.Fa pps
array.
Currently
.Fa uobj
must be a vnode object.
The number of pages requested is pointed to by
.Fa npagesp ,
and this value is updated with the actual number of pages returned.
The flags can be
.Bd -literal
#define UFP_ALL       0x00  /* return all pages requested */
#define UFP_NOWAIT    0x01  /* don't sleep */
#define UFP_NOALLOC   0x02  /* don't allocate new pages */
#define UFP_NOCACHE   0x04  /* don't return pages which already exist */
#define UFP_NORDONLY  0x08  /* don't return PG_READONLY pages */
.Ed
.Pp
.Dv UFP_ALL
is a pseudo-flag meaning all requested pages should be returned.
.Dv UFP_NOWAIT
means that the function must not sleep.
.Dv UFP_NOALLOC
causes any pages which do not already exist to be skipped.
.Dv UFP_NOCACHE
causes any pages which do already exist to be skipped.
.Dv UFP_NORDONLY
causes any pages which are marked PG_READONLY to be skipped.
.Pp
.Fn uvm_swap_stats
implements the
.Dv SWAP_STATS
and
.Dv SWAP_OSTATS
operations of the
.Xr swapctl 2
system call.
.Fa cmd
is the requested command,
.Dv SWAP_STATS
or
.Dv SWAP_OSTATS .
The function will copy no more than
.Fa sec
entries into the array pointed to by
.Fa sep .
On return,
.Fa retval
holds the actual number of entries copied into the array.
.Sh SYSCTL
UVM provides support for the
.Dv CTL_VM
domain of the
.Xr sysctl 3
hierarchy.
It handles the
.Dv VM_LOADAVG ,
.Dv VM_METER ,
.Dv VM_UVMEXP ,
and
.Dv VM_UVMEXP2
nodes, which return the current load averages, the current VM totals,
the uvmexp structure, and a kernel version independent view of the
uvmexp structure, respectively.
It also exports a number of tunables that control how much VM space is
allowed to be consumed by various tasks.
The load averages are typically accessed from userland using the
.Xr getloadavg 3
function.
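For example, a minimal userland program can retrieve the load averages
exported through this hierarchy with
.Xr getloadavg 3
(illustrative only):
.Bd -literal
#include <stdio.h>
#include <stdlib.h>

int
main(void)
{
	double loadavg[3];

	if (getloadavg(loadavg, 3) == -1) {
		perror("getloadavg");
		return 1;
	}
	printf("load averages: %.2f %.2f %.2f\en",
	    loadavg[0], loadavg[1], loadavg[2]);
	return 0;
}
.Ed
.Pp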
The uvmexp structure has all global state of the UVM system,
and has the following members:
.Bd -literal
/* vm_page constants */
int pagesize;   /* size of a page (PAGE_SIZE): must be power of 2 */
int pagemask;   /* page mask */
int pageshift;  /* page shift */

/* vm_page counters */
int npages;     /* number of pages we manage */
int free;       /* number of free pages */
int active;     /* number of active pages */
int inactive;   /* number of pages that we free'd but may want back */
int paging;     /* number of pages in the process of being paged out */
int wired;      /* number of wired pages */
int reserve_pagedaemon; /* number of pages reserved for pagedaemon */
int reserve_kernel;     /* number of pages reserved for kernel */

/* pageout params */
int freemin;    /* min number of free pages */
int freetarg;   /* target number of free pages */
int inactarg;   /* target number of inactive pages */
int wiredmax;   /* max number of wired pages */

/* swap */
int nswapdev;   /* number of configured swap devices in system */
int swpages;    /* number of PAGE_SIZE'ed swap pages */
int swpginuse;  /* number of swap pages in use */
int nswget;     /* number of times fault calls uvm_swap_get() */
int nanon;      /* number total of anon's in system */
int nfreeanon;  /* number of free anon's */

/* stat counters */
int faults;     /* page fault count */
int traps;      /* trap count */
int intrs;      /* interrupt count */
int swtch;      /* context switch count */
int softs;      /* software interrupt count */
int syscalls;   /* system calls */
int pageins;    /* pagein operation count */
                /* pageouts are in pdpageouts below */
int swapins;    /* swapins */
int swapouts;   /* swapouts */
int pgswapin;   /* pages swapped in */
int pgswapout;  /* pages swapped out */
int forks;      /* forks */
int forks_ppwait;  /* forks where parent waits */
int forks_sharevm; /* forks where vmspace is shared */

/* fault subcounters */
int fltnoram;   /* number of times fault was out of ram */
int fltnoanon;  /* number of times fault was out of anons */
int fltpgwait;  /* number of times fault had to wait on a page */
int fltpgrele;  /* number of times fault found a released page */
int fltrelck;   /* number of times fault relock called */
int fltrelckok; /* number of times fault relock is a success */
int fltanget;   /* number of times fault gets anon page */
int fltanretry; /* number of times fault retrys an anon get */
int fltamcopy;  /* number of times fault clears "needs copy" */
int fltnamap;   /* number of times fault maps a neighbor anon page */
int fltnomap;   /* number of times fault maps a neighbor obj page */
int fltlget;    /* number of times fault does a locked pgo_get */
int fltget;     /* number of times fault does an unlocked get */
int flt_anon;   /* number of times fault anon (case 1a) */
int flt_acow;   /* number of times fault anon cow (case 1b) */
int flt_obj;    /* number of times fault is on object page (2a) */
int flt_prcopy; /* number of times fault promotes with copy (2b) */
int flt_przero; /* number of times fault promotes with zerofill (2b) */

/* daemon counters */
int pdwoke;     /* number of times daemon woke up */
int pdrevs;     /* number of times daemon rev'd clock hand */
int pdswout;    /* number of times daemon called for swapout */
int pdfreed;    /* number of pages daemon freed since boot */
int pdscans;    /* number of pages daemon scanned since boot */
int pdanscan;   /* number of anonymous pages scanned by daemon */
int pdobscan;   /* number of object pages scanned by daemon */
int pdreact;    /* number of pages daemon reactivated since boot */
int pdbusy;     /* number of times daemon found a busy page */
int pdpageouts; /* number of times daemon started a pageout */
int pdpending;  /* number of times daemon got a pending pageout */
int pddeact;    /* number of pages daemon deactivates */
.Ed
.Sh NOTES
.Fn uvm_chgkprot
is only available if the kernel has been compiled with options
.Dv KGDB .
.Pp
All structures and types whose names begin with
.Dq vm_
will be renamed to
.Dq uvm_ .
.Sh SEE ALSO
.Xr swapctl 2 ,
.Xr getloadavg 3 ,
.Xr kvm 3 ,
.Xr sysctl 3 ,
.Xr ddb 4 ,
.Xr options 4 ,
.Xr pmap 9
.Sh HISTORY
UVM is a new VM system developed at Washington University in St. Louis
(Missouri).
UVM's roots lie partly in the Mach-based
.Bx 4.4
VM system, the
.Fx
VM system, and the SunOS 4 VM system.
UVM's basic structure is based on the
.Bx 4.4
VM system.
UVM's new anonymous memory system is based on the
anonymous memory system found in the SunOS 4 VM (as described in papers
published by Sun Microsystems, Inc.).
UVM also includes a number of features new to
.Bx ,
including page loanout, map entry passing, simplified
copy-on-write, and clustered anonymous memory pageout.
UVM is also further documented in an August 1998 dissertation by
Charles D. Cranor.
.Pp
UVM appeared in
.Nx 1.4 .
.Sh AUTHORS
Charles D. Cranor
.Aq chuck@ccrc.wustl.edu
designed and implemented UVM.
.Pp
Matthew Green
.Aq mrg@eterna.com.au
wrote the swap-space management code and handled the logistical issues
involved with merging UVM into the
.Nx
source tree.
.Pp
Chuck Silvers
.Aq chuq@chuq.com
implemented the aobj pager, thus allowing UVM to support System V shared
memory and process swapping.
He also designed and implemented the UBC part of UVM, which uses UVM pages
to cache vnode data rather than the traditional buffer cache buffers.