.\"	$NetBSD: sysctl.9,v 1.14 2009/04/08 12:50:00 joerg Exp $
.\"
.\" Copyright (c) 2004 The NetBSD Foundation, Inc.
.\" All rights reserved.
.\"
.\" This code is derived from software contributed to The NetBSD Foundation
.\" by Andrew Brown.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE NETBSD FOUNDATION, INC. AND CONTRIBUTORS
.\" ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
.\" TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
.\" PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL THE FOUNDATION OR CONTRIBUTORS
.\" BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
.\" CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
.\" SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
.\" INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
.\" CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
.\" ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
.\" POSSIBILITY OF SUCH DAMAGE.
.\"
.Dd August 15, 2008
.Dt SYSCTL 9
.Os
.Sh NAME
.Nm sysctl
.Nd system variable control interfaces
.Sh SYNOPSIS
.In sys/param.h
.In sys/sysctl.h
.Pp
Primary external interfaces:
.Ft void
.Fn sysctl_init void
.Ft int
.Fn sysctl_lock "struct lwp *l" "void *oldp" "size_t savelen"
.Ft int
.Fn sysctl_dispatch "const int *name" "u_int namelen" "void *oldp" \
"size_t *oldlenp" "const void *newp" "size_t newlen" "const int *oname" \
"struct lwp *l" "const struct sysctlnode *rnode"
.Ft void
.Fn sysctl_unlock "struct lwp *l"
.Ft int
.Fn sysctl_createv "struct sysctllog **log" "int cflags" \
"const struct sysctlnode **rnode" "const struct sysctlnode **cnode" \
"int flags" "int type" "const char *namep" "const char *desc" \
"sysctlfn func" "u_quad_t qv" "void *newp" "size_t newlen" ...
.Ft int
.Fn sysctl_destroyv "struct sysctlnode *rnode" ...
.Ft void
.Fn sysctl_free "struct sysctlnode *rnode"
.Ft void
.Fn sysctl_teardown "struct sysctllog **"
.Ft int
.Fn old_sysctl "int *name" "u_int namelen" "void *oldp" \
"size_t *oldlenp" "void *newp" "size_t newlen" "struct lwp *l"
.Pp
Core internal functions:
.Ft int
.Fn sysctl_locate "struct lwp *l" "const int *name" "u_int namelen" \
"const struct sysctlnode **rnode" "int *nip"
.Ft int
.Fn sysctl_lookup "const int *name" "u_int namelen" "void *oldp" \
"size_t *oldlenp" "const void *newp" "size_t newlen" "const int *oname" \
"struct lwp *l" "const struct sysctlnode *rnode"
.Ft int
.Fn sysctl_create "const int *name" "u_int namelen" "void *oldp" \
"size_t *oldlenp" "const void *newp" "size_t newlen" "const int *oname" \
"struct lwp *l" "const struct sysctlnode *rnode"
.Ft int
.Fn sysctl_destroy "const int *name" "u_int namelen" "void *oldp" \
"size_t *oldlenp" "const void *newp" "size_t newlen" "const int *oname" \
"struct lwp *l" "const struct sysctlnode *rnode"
.Ft int
.Fn sysctl_query "const int *name" "u_int namelen" "void *oldp" \
"size_t *oldlenp" "const void *newp" "size_t newlen" "const int *oname" \
"struct lwp *l" "const struct sysctlnode *rnode"
.Pp
Simple
.Dq helper
functions:
.Ft int
.Fn sysctl_needfunc "const int *name" "u_int namelen" "void *oldp" \
"size_t *oldlenp" "const void *newp" "size_t newlen" "const int *oname" \
"struct lwp *l" "const struct sysctlnode *rnode"
.Ft int
.Fn sysctl_notavail "const int *name" "u_int namelen" "void *oldp" \
"size_t *oldlenp" "const void *newp" "size_t newlen" "const int *oname" \
"struct lwp *l" "const struct sysctlnode *rnode"
.Ft int
.Fn sysctl_null "const int *name" "u_int namelen" "void *oldp" \
"size_t *oldlenp" "const void *newp" "size_t newlen" "const int *oname" \
"struct lwp *l" "const struct sysctlnode *rnode"
.Sh DESCRIPTION
The SYSCTL subsystem instruments a number of kernel tunables and other
data structures via a simple MIB-like interface, primarily for
consumption by userland programs, but also for use internally by the
kernel.
.Sh LOCKING
All operations on the SYSCTL tree must be protected by acquiring the
main SYSCTL lock.
The only functions that can be called when the lock is not held are
.Fn sysctl_lock ,
.Fn sysctl_createv ,
.Fn sysctl_destroyv ,
and
.Fn old_sysctl .
All other functions require the tree to be locked.
This is to prevent other users of the tree from moving nodes around
during an add operation, or from destroying nodes or subtrees that are
actively being used.
The lock is acquired by calling
.Fn sysctl_lock
with a pointer to the process's lwp
.Fa l
.Dv ( NULL
may be passed to all functions as the lwp pointer if no lwp is
appropriate, though any changes made via
.Fn sysctl_create ,
.Fn sysctl_destroy ,
.Fn sysctl_lookup ,
or by any helper function will be done with effective superuser
privileges).
The
.Fa oldp
and
.Fa savelen
arguments are a pointer to and the size of the memory region the
caller will be using to collect data from SYSCTL.
These may also be
.Dv NULL
and 0, respectively.
.Pp
The memory region will be locked via
.Fn uvm_vslock
if it is a region in userspace.
The address and size of the region are recorded so that when the
SYSCTL lock is to be released via
.Fn sysctl_unlock ,
only the lwp pointer
.Fa l
is required.
.Sh LOOKUPS
Once the lock has been acquired, it is typical to call
.Fn sysctl_dispatch
to handle the request.
.Fn sysctl_dispatch
will examine the contents of
.Fa name ,
an array of integers at least
.Fa namelen
long, which is to be located in kernel space, in order to determine
which function to call to handle the specific request.
.Pp
.Fn sysctl_dispatch
uses the following algorithm to determine the function to call:
.Pp
.Bl -bullet
.It
Scan the tree using
.Fn sysctl_locate
.It
If the node returned has a
.Dq helper
function, call it
.It
If the requested node was found but has no function, call
.Fn sysctl_lookup
.It
If the node was not found and
.Fa name
specifies one of
.Fn sysctl_query ,
.Fn sysctl_create ,
or
.Fn sysctl_destroy ,
call the appropriate function
.It
If none of these options applies and no other error was yet recorded,
return
.Er EOPNOTSUPP
.Pp
.El
The
.Fa oldp
and
.Fa oldlenp
arguments to
.Fn sysctl_dispatch ,
as with all the other core functions, describe an area into which the
current or requested value may be copied.
.Fa oldp
may or may not be a pointer into userspace (as dictated by whether
.Fa l
is
.Dv NULL
or not).
.Fa oldlenp
is a
.No non- Ns Dv NULL
pointer to a size_t.
.Fa newp
and
.Fa newlen
describe an area where the new value for the request may be found;
.Fa newp
may also be a pointer into userspace.
The
.Fa oname
argument is a
.No non- Ns Dv NULL
pointer to the base of the request currently
being processed.
By simple arithmetic on
.Fa name ,
.Fa namelen ,
and
.Fa oname ,
one can easily determine the entire original request and
.Fa namelen
values, if needed.
The
.Fa rnode
value, as passed to
.Fn sysctl_dispatch
represents the root of the tree into which the current request is to
be dispatched.
If
.Dv NULL ,
the main tree will be used.
.Pp
.Fn sysctl_locate
scans a tree for the node most specific to a request.
If the pointer referenced by
.Fa rnode
is not
.Dv NULL ,
the tree indicated is searched, otherwise the main tree
will be used.
The address of the most relevant node will be returned via
.Fa rnode
and the number of MIB entries consumed will be returned via
.Fa nip ,
if it is not
.Dv NULL .
.Pp
The
.Fn sysctl_lookup
function takes the same arguments as
.Fn sysctl_dispatch
with the caveat that the value for
.Fa namelen
must be zero in order to indicate that the node referenced by the
.Fa rnode
argument is the one to which the lookup is being applied.
.Sh CREATION AND DESTRUCTION OF NODES
New nodes are created and destroyed by the
.Fn sysctl_create
and
.Fn sysctl_destroy
functions.
These functions take the same arguments as
.Fn sysctl_dispatch
with the additional requirement that the
.Fa namelen
argument must be 1 and the
.Fa name
argument must point to an integer valued either
.Dv CTL_CREATE
or
.Dv CTL_CREATESYM
when creating a new node, or
.Dv CTL_DESTROY
when destroying
a node.
The
.Fa newp
and
.Fa newlen
arguments should point to a copy of the node to be created or
destroyed.
If the create or destroy operation was successful, a copy of the node
created or destroyed will be placed in the space indicated by
.Fa oldp
and
.Fa oldlenp .
If the create operation fails because of a conflict with an existing
node, a copy of that node will be returned instead.
.Pp
In order to facilitate the creation and destruction of nodes from a
given tree by kernel subsystems, the functions
.Fn sysctl_createv
and
.Fn sysctl_destroyv
are provided.
These functions take care of the overhead of filling in the contents
of the create or destroy request, dealing with locking, locating the
appropriate parent node, etc.
.Pp
The arguments to
.Fn sysctl_createv
are used to construct the new node.
If the
.Fa log
argument is not
.Dv NULL ,
a sysctllog structure will be allocated and the pointer referenced
will be changed to address it.
The same log may be used for any number of nodes, provided they are
all inserted into the same tree.
This allows for a series of nodes to be created and later removed from
the tree in a single transaction (via
.Fn sysctl_teardown )
without the need for any record
keeping on the caller's part.
The
.Fa cflags
argument is currently unused and must be zero.
The
.Fa rnode
argument must either be
.Dv NULL
or a valid pointer to a reference to the root of the tree into which
the new node must be placed.
If it is
.Dv NULL ,
the main tree will be used.
It is illegal for
.Fa rnode
to refer to a
.Dv NULL
pointer.
If the
.Fa cnode
argument is not
.Dv NULL ,
on return it will be adjusted to point to the address of the new node.
.Pp
The
.Fa flags
and
.Fa type
arguments are combined into the
.Fa sysctl_flags
field, and the current value for
.Dv SYSCTL_VERSION
is added in.
Note: the
.Dv CTLFLAG_PERMANENT
flag can only be set from SYSCTL setup routines (see
.Sx SETUP FUNCTIONS )
as called by
.Fn sysctl_init .
The
.Fa namep
argument is copied into the
.Fa sysctl_name
field and must be less than
.Dv SYSCTL_NAMELEN
characters in length.
The string indicated by
.Fa desc
will be copied if the
.Dv CTLFLAG_OWNDESC
flag is set, and will be used as the node's description.
Note: if
.Fn sysctl_destroyv
attempts to delete a node that does not own its own description (and
is not marked as permanent), but the deletion fails, the description
will be copied and
.Fn sysctl_destroyv
will set the
.Dv CTLFLAG_OWNDESC
flag.
.Pp
The
.Fa func
argument is the name of a
.Dq helper
function (see
.Sx HELPER FUNCTIONS AND MACROS ) .
If the
.Dv CTLFLAG_IMMEDIATE
flag is set, the
.Fa qv
argument will be interpreted as the initial value for the new
.Dq int
or
.Dq quad
node.
This flag does not apply to any other type of node.
The
.Fa newp
and
.Fa newlen
arguments describe the data external to SYSCTL that is to be
instrumented.
One of
.Fa func ,
.Fa qv
and the
.Dv CTLFLAG_IMMEDIATE
flag, or
.Fa newp
and
.Fa newlen
must be given for nodes that instrument data, otherwise an error is
returned.
.Pp
The remaining arguments are a list of integers specifying the path
through the MIB to the node being created.
The list must be terminated by the
.Dv CTL_EOL
value.
The penultimate value in the list may be
.Dv CTL_CREATE
if a dynamic MIB entry is to be made for this node.
.Fn sysctl_createv
specifically does not support
.Dv CTL_CREATESYM ,
since setup routines are
expected to be able to use the in-kernel
.Xr ksyms 4
interface to discover the location of the data to be instrumented.
If the node to be created matches a node that already exists, a return
code of 0 is given, indicating success.
.Pp
When using
.Fn sysctl_destroyv
to destroy a given node, the
.Fa rnode
argument, if not
.Dv NULL ,
is taken to be the root of the tree from which
the node is to be destroyed, otherwise the main tree is used.
The rest of the arguments are a list of integers specifying the path
through the MIB to the node being destroyed.
If the node being destroyed does not exist, a successful return code
is given.
Nodes marked with the
.Dv CTLFLAG_PERMANENT
flag cannot be destroyed.
.Sh HELPER FUNCTIONS AND MACROS
Helper functions are invoked with the same common argument set as
.Fn sysctl_dispatch
except that the
.Fa rnode
argument will never be
.Dv NULL .
It will be set to point to the node that corresponds most closely to
the current request.
Helpers are forbidden from modifying the node they are passed; they
should instead copy the structure if changes are required in order to
effect access control or other checks.
The prototype and definition of a
.Dq helper
function that ensures that a newly assigned value is within a certain
range (presuming external data) would look like the following:
.Pp
.Bd -literal -offset indent -compact
static int sysctl_helper(SYSCTLFN_PROTO);

static int
sysctl_helper(SYSCTLFN_ARGS)
{
	struct sysctlnode node;
	int t, error;

	node = *rnode;
	/* work on a local copy so that reads see the current value */
	t = *(int *)rnode-\*[Gt]sysctl_data;
	node.sysctl_data = \*[Am]t;
	error = sysctl_lookup(SYSCTLFN_CALL(\*[Am]node));
	if (error || newp == NULL)
		return (error);

	/* reject new values outside the permitted range */
	if (t \*[Lt] 0 || t \*[Gt] 20)
		return (EINVAL);

	/* commit the validated value to the instrumented variable */
	*(int *)rnode-\*[Gt]sysctl_data = t;
	return (0);
}
.Ed
.Pp
The use of the
.Dv SYSCTLFN_PROTO ,
.Dv SYSCTLFN_ARGS ,
and
.Dv SYSCTLFN_CALL
macros ensures that all arguments are passed properly.
The single argument to the
.Dv SYSCTLFN_CALL
macro is the pointer to the node being examined.
.Pp
Three basic helper functions are available for use.
.Fn sysctl_needfunc
will emit a warning to the system console whenever it is invoked and
provides a simplistic read-only interface to the given node.
.Fn sysctl_notavail
will forward
.Dq queries
to
.Fn sysctl_query
so that subtrees can be discovered, but will return
.Er EOPNOTSUPP
for any other condition.
.Fn sysctl_null
specifically ignores any arguments given, sets the value indicated by
.Fa oldlenp
to zero, and returns success.
.Sh SETUP FUNCTIONS
Though nodes can be added to the SYSCTL tree at any time, in order to
add nodes during the kernel bootstrap phase, a proper
.Dq setup
function must be used.
Setup functions are declared using the
.Dv SYSCTL_SETUP
macro, which takes the name of the function and a short string
description of the function as arguments.
.Po
See the
.Dv SYSCTL_DEBUG_SETUP
kernel configuration in
.Xr options 4 .
.Pc
The address of the function is added to a list of functions that
.Fn sysctl_init
traverses during initialization.
.Pp
Setup functions do not have to add nodes to the main tree, but can set
up their own trees for emulation or other purposes.
Emulations that require use of a main tree but with some nodes changed
to suit their own purposes can arrange to overlay a sparse private
tree onto their main tree by making the
.Fa e_sysctlovly
member of their struct emul definition point to the overlaid tree.
.Pp
Setup functions should take care to create all nodes from the root
down to the subtree they are creating, since the order in which setup
functions are called is arbitrary (the order in which setup functions
are called is only determined by the ordering of the object files as
passed to the linker when the kernel is built).
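.Pp
The root-down rule for setup functions can be illustrated with a small
userland model.
This is a sketch only, not kernel code.
The names toy_create, toy_setup_a, toy_setup_b, and TOY_EOL are
hypothetical, and the choice of
.Er ENOENT
for a missing parent is an assumption of the model.
It relies on one documented property of
.Fn sysctl_createv :
creating a node that already exists returns 0.

```c
#include <errno.h>
#include <stddef.h>

#define TOY_MAXNODES	32
#define TOY_EOL		(-1)	/* stands in for CTL_EOL in this model */

struct toy_node {
	int parent;	/* index of the parent node, -1 below the root */
	int num;	/* MIB number of this node within its parent */
};

static struct toy_node toy_tree[TOY_MAXNODES];
static size_t toy_count;

/* Return the index of child `num' under `parent', or -1 if absent. */
static int
toy_find(int parent, int num)
{
	size_t i;

	for (i = 0; i < toy_count; i++)
		if (toy_tree[i].parent == parent && toy_tree[i].num == num)
			return (int)i;
	return -1;
}

/*
 * Create the node named by a TOY_EOL-terminated path.  Every component
 * but the last must already exist (assumed to fail with ENOENT here).
 * Creating a node that already exists succeeds.
 */
static int
toy_create(const int *path)
{
	int parent = -1, idx;

	if (path[0] == TOY_EOL)
		return EINVAL;
	for (; path[1] != TOY_EOL; path++) {
		idx = toy_find(parent, path[0]);
		if (idx == -1)
			return ENOENT;	/* missing intermediate node */
		parent = idx;
	}
	if (toy_find(parent, path[0]) != -1)
		return 0;		/* already present: success */
	if (toy_count == TOY_MAXNODES)
		return ENOMEM;
	toy_tree[toy_count].parent = parent;
	toy_tree[toy_count].num = path[0];
	toy_count++;
	return 0;
}

/* Hypothetical setup routine: creates node 1, then leaf 1.10. */
static int
toy_setup_a(void)
{
	static const int top[] = { 1, TOY_EOL };
	static const int leaf[] = { 1, 10, TOY_EOL };
	int error;

	if ((error = toy_create(top)) != 0)
		return error;
	return toy_create(leaf);
}

/* Hypothetical setup routine: creates node 1, then leaf 1.20. */
static int
toy_setup_b(void)
{
	static const int top[] = { 1, TOY_EOL };
	static const int leaf[] = { 1, 20, TOY_EOL };
	int error;

	if ((error = toy_create(top)) != 0)
		return error;
	return toy_create(leaf);
}
```

Because each routine creates its own parent first and a repeated
creation is harmless, the two routines may run in either order and
still produce the same three-node tree.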
.Sh MISCELLANEOUS FUNCTIONS
.Fn sysctl_init
is called early in the kernel bootstrap process.
It initializes the SYSCTL lock, calls all the registered setup
functions, and marks the tree as permanent.
.Pp
.Fn sysctl_free
will unconditionally delete any and all nodes below the given node.
Its intended use is for the deletion of entire trees, not subtrees.
If a subtree is to be removed,
.Fn sysctl_destroy
or
.Fn sysctl_destroyv
should be used to ensure that nodes not owned by the sub-system being
deactivated are not mistakenly destroyed.
The SYSCTL lock must be held when calling this function.
.Pp
.Fn sysctl_teardown
unwinds a sysctllog and deletes the nodes in the reverse of the order
in which they were created.
.Pp
.Fn old_sysctl
provides an interface similar to the old SYSCTL implementation, with
the exception that access checks on a per-node basis are performed if
the
.Fa l
argument is
.No non- Ns Dv NULL .
If called with a
.Dv NULL
argument, the values for
.Fa newp
and
.Fa oldp
are interpreted as kernel addresses, and access is performed as for
the superuser.
.Sh NOTES
It is expected that nodes will be added to (or removed from) the tree
during the following stages of a machine's lifetime:
.Pp
.Bl -bullet -compact
.It
initialization -- when the kernel is booting
.It
autoconfiguration -- when devices are being probed at boot time
.It
.Dq plug and play
device attachment -- when a PC-Card, USB, or other device is plugged
in or attached
.It
module initialization -- when a module is being loaded
.It
.Dq run-time
-- when a process creates a node via the
.Xr sysctl 3
interface
.El
.Pp
Nodes marked with
.Dv CTLFLAG_PERMANENT
can only be added to a tree during the first or initialization phase,
and can never be removed.
The initialization phase terminates when the main tree's root is
marked with the
.Dv CTLFLAG_PERMANENT
flag.
Once the main tree is marked in this manner, no nodes can be added to
any tree that is marked with
.Dv CTLFLAG_READONLY
at its root, and no nodes can be added at all if the main tree's root
is so marked.
.Pp
Nodes added by device drivers, modules, and at device insertion time can
be added to (and removed from)
.Dq read-only
parent nodes.
.Pp
Nodes created by processes can only be added to
.Dq writable
parent nodes.
See
.Xr sysctl 3
for a description of the flags that are allowed to be used
when creating nodes.
.Sh SEE ALSO
.Xr sysctl 3
.Sh HISTORY
The dynamic SYSCTL implementation first appeared in
.Nx 2.0 .
.Sh AUTHORS
.An Andrew Brown
.Aq atatat@NetBSD.org
designed and implemented the dynamic SYSCTL implementation.