.\" $NetBSD: sysctl.9,v 1.5 2004/03/24 23:51:18 wiz Exp $
.\"
.\" Copyright (c) 2004 The NetBSD Foundation, Inc.
.\" All rights reserved.
.\"
.\" This code is derived from software contributed to The NetBSD Foundation
.\" by Andrew Brown.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. Neither the name of The NetBSD Foundation nor the names of its
.\"    contributors may be used to endorse or promote products derived
.\"    from this software without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE NETBSD FOUNDATION, INC. AND CONTRIBUTORS
.\" ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
.\" TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
.\" PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL THE FOUNDATION OR CONTRIBUTORS
.\" BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
.\" CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
.\" SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
.\" INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
.\" CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
.\" ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
.\" POSSIBILITY OF SUCH DAMAGE.
.\"
.Dd March 24, 2004
.Dt SYSCTL 9
.Os
.Sh NAME
.Nm sysctl
.Nd system variable control interfaces
.Sh SYNOPSIS
.In sys/param.h
.In sys/sysctl.h
.Pp
Primary external interfaces:
.Ft void
.Fn sysctl_init void
.Ft int
.Fn sysctl_lock "struct lwp *l" "void *oldp" "size_t savelen"
.Ft int
.Fn sysctl_dispatch "const int *name" "u_int namelen" "void *oldp" \
"size_t *oldlenp" "const void *newp" "size_t newlen" "const int *oname" \
"struct lwp *l" "const struct sysctlnode *rnode"
.Ft void
.Fn sysctl_unlock "struct lwp *l"
.Ft int
.Fn sysctl_createv "struct sysctllog **log" "int cflags" \
"struct sysctlnode **rnode" "struct sysctlnode **cnode" "int flags" \
"int type" "const char *namep" "const char *desc" \
"sysctlfn func" "u_quad_t qv" "void *newp" "size_t newlen" ...
.Ft int
.Fn sysctl_destroyv "struct sysctlnode *rnode" ...
.Ft void
.Fn sysctl_free "struct sysctlnode *rnode"
.Ft void
.Fn sysctl_teardown "struct sysctllog **"
.Ft int
.Fn old_sysctl "int *name" "u_int namelen" "void *oldp" \
"size_t *oldlenp" "void *newp" "size_t newlen" "struct lwp *l"
.Pp
Core internal functions:
.Ft int
.Fn sysctl_locate "struct lwp *l" "const int *name" "u_int namelen" \
"struct sysctlnode **rnode" "int *nip"
.Ft int
.Fn sysctl_lookup "const int *name" "u_int namelen" "void *oldp" \
"size_t *oldlenp" "const void *newp" "size_t newlen" "const int *oname" \
"struct lwp *l" "struct sysctlnode *rnode"
.Ft int
.Fn sysctl_create "const int *name" "u_int namelen" "void *oldp" \
"size_t *oldlenp" "const void *newp" "size_t newlen" "const int *oname" \
"struct lwp *l" "struct sysctlnode *rnode"
.Ft int
.Fn sysctl_destroy "const int *name" "u_int namelen" "void *oldp" \
"size_t *oldlenp" "const void *newp" "size_t newlen" "const int *oname" \
"struct lwp *l" "struct sysctlnode *rnode"
.Ft int
.Fn sysctl_query "const int *name" "u_int namelen" "void *oldp" \
"size_t *oldlenp" "const void *newp" "size_t newlen" "const int *oname" \
"struct lwp *l" "const struct sysctlnode *rnode"
.Pp
Simple
.Dq helper
functions:
.Ft int
.Fn sysctl_needfunc "const int *name" "u_int namelen" "void *oldp" \
"size_t *oldlenp" "const void *newp" "size_t newlen" "const int *oname" \
"struct lwp *l" "const struct sysctlnode *rnode"
.Ft int
.Fn sysctl_notavail "const int *name" "u_int namelen" "void *oldp" \
"size_t *oldlenp" "const void *newp" "size_t newlen" "const int *oname" \
"struct lwp *l" "const struct sysctlnode *rnode"
.Ft int
.Fn sysctl_null "const int *name" "u_int namelen" "void *oldp" \
"size_t *oldlenp" "const void *newp" "size_t newlen" "const int *oname" \
"struct lwp *l" "const struct sysctlnode *rnode"
.Sh DESCRIPTION
The SYSCTL subsystem instruments a number of kernel tunables and other
data structures via a simple MIB-like interface, primarily for
consumption by userland programs, but also for use internally by the
kernel.
.Sh LOCKING
All operations on the SYSCTL tree must be protected by acquiring the
main SYSCTL lock.
The only functions that can be called when the lock is not held are
.Fn sysctl_lock ,
.Fn sysctl_createv ,
.Fn sysctl_destroyv ,
and
.Fn old_sysctl .
All other functions require the tree to be locked.
This is to prevent other users of the tree from moving nodes around
during an add operation, or from destroying nodes or subtrees that are
actively being used.
The lock is acquired by calling
.Fn sysctl_lock
with a pointer to the process's lwp
.Fa l
.Dv ( NULL
may be passed to all functions as the lwp pointer if no lwp is
appropriate, though any changes made via
.Fn sysctl_create ,
.Fn sysctl_destroy ,
.Fn sysctl_lookup ,
or by any helper function will be done with effective superuser
privileges).
The
.Fa oldp
and
.Fa savelen
arguments are a pointer to and the size of the memory region the
caller will be using to collect data from SYSCTL.
These may also be
.Dv NULL
and 0, respectively.
.Pp
The memory region will be locked via
.Fn uvm_vslock
if it is a region in userspace.
The address and size of the region are recorded so that when the
SYSCTL lock is to be released via
.Fn sysctl_unlock ,
only the lwp pointer
.Fa l
is required.
.Sh LOOKUPS
Once the lock has been acquired, it is typical to call
.Fn sysctl_dispatch
to handle the request.
.Fn sysctl_dispatch
will examine the contents of
.Fa name ,
an array of integers at least
.Fa namelen
long, which is to be located in kernel space, in order to determine
which function to call to handle the specific request.
.Pp
.Fn sysctl_dispatch
uses the following algorithm to determine the function to call:
.Pp
.Bl -bullet
.It
Scan the tree using
.Fn sysctl_locate
.It
If the node returned has a
.Dq helper
function, call it
.It
If the requested node was found but has no function, call
.Fn sysctl_lookup
.It
If the node was not found and
.Fa name
specifies one of
.Fn sysctl_query ,
.Fn sysctl_create ,
or
.Fn sysctl_destroy ,
call the appropriate function
.It
If none of these options applies and no other error was yet recorded,
return
.Er EOPNOTSUPP
.Pp
.El
The
.Fa oldp
and
.Fa oldlenp
arguments to
.Fn sysctl_dispatch ,
as with all the other core functions, describe an area into which the
current or requested value may be copied.
.Fa oldp
may or may not be a pointer into userspace (as dictated by whether
.Fa l
is
.Dv NULL
or not).
.Fa oldlenp
is a
.No non- Ns Dv NULL
pointer to a size_t.
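.Pp
The selection order that
.Fn sysctl_dispatch
follows, as listed in the algorithm above, can be pictured with a small
userland model.
This is only a sketch and not kernel code: every identifier beginning with
model_ or MODEL_ is hypothetical, the numeric stand-ins for the special MIB
values are arbitrary, and the outcome of
.Fn sysctl_locate
is assumed to be supplied by the caller as a closest node plus a
.Dq found
flag.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical userland model; none of these names exist in the kernel. */
enum model_action {
	MODEL_CALL_HELPER,	/* closest node has a helper function */
	MODEL_CALL_LOOKUP,	/* node found, no helper: sysctl_lookup() */
	MODEL_CALL_QUERY,	/* not found, name selects sysctl_query() */
	MODEL_CALL_CREATE,	/* not found, name selects sysctl_create() */
	MODEL_CALL_DESTROY,	/* not found, name selects sysctl_destroy() */
	MODEL_FAIL		/* nothing applies; see *errorp */
};

/* Arbitrary stand-ins for the special MIB numbers (assumed values). */
enum { MODEL_CTL_QUERY = -1, MODEL_CTL_CREATE = -2, MODEL_CTL_DESTROY = -3 };

struct model_node {
	int has_helper;		/* non-zero if a helper is attached */
};

/*
 * closest:     node reported by the tree scan (never NULL here)
 * found:       non-zero if closest is exactly the requested node
 * next:        first name component left unconsumed when !found
 * prior_error: error already recorded by an earlier step, or 0
 */
enum model_action
model_dispatch(const struct model_node *closest, int found, int next,
    int prior_error, int *errorp)
{
	assert(closest != NULL && errorp != NULL);
	*errorp = 0;

	if (closest->has_helper)
		return MODEL_CALL_HELPER;
	if (found)
		return MODEL_CALL_LOOKUP;

	switch (next) {
	case MODEL_CTL_QUERY:
		return MODEL_CALL_QUERY;
	case MODEL_CTL_CREATE:
		return MODEL_CALL_CREATE;
	case MODEL_CTL_DESTROY:
		return MODEL_CALL_DESTROY;
	default:
		break;
	}

	/* Keep an earlier error if there was one, else EOPNOTSUPP. */
	*errorp = (prior_error != 0) ? prior_error : EOPNOTSUPP;
	return MODEL_FAIL;
}
```

.Pp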
.Fa newp
and
.Fa newlen
describe an area where the new value for the request may be found;
.Fa newp
may also be a pointer into userspace.
The
.Fa oname
argument is a
.No non- Ns Dv NULL
pointer to the base of the request currently
being processed.
By simple arithmetic on
.Fa name ,
.Fa namelen ,
and
.Fa oname ,
one can easily determine the entire original request and
.Fa namelen
values, if needed.
The
.Fa rnode
value, as passed to
.Fn sysctl_dispatch ,
represents the root of the tree into which the current request is to
be dispatched.
If
.Dv NULL ,
the main tree will be used.
.Pp
.Fn sysctl_locate
scans a tree for the node most specific to a request.
If the pointer referenced by
.Fa rnode
is not
.Dv NULL ,
the tree indicated is searched, otherwise the main tree
will be used.
The address of the most relevant node will be returned via
.Fa rnode
and the number of MIB entries consumed will be returned via
.Fa nip ,
if it is not
.Dv NULL .
.Pp
The
.Fn sysctl_lookup
function takes the same arguments as
.Fn sysctl_dispatch
with the caveat that the value for
.Fa namelen
must be zero in order to indicate that the node referenced by the
.Fa rnode
argument is the one to which the lookup is being applied.
.Sh CREATION AND DESTRUCTION OF NODES
New nodes are created and destroyed by the
.Fn sysctl_create
and
.Fn sysctl_destroy
functions.
These functions take the same arguments as
.Fn sysctl_dispatch
with the additional requirement that the
.Fa namelen
argument must be 1 and the
.Fa name
argument must point to an integer valued either
.Dv CTL_CREATE
or
.Dv CTL_CREATESYM
when creating a new node, or
.Dv CTL_DESTROY
when destroying
a node.
The
.Fa newp
and
.Fa newlen
arguments should point to a copy of the node to be created or
destroyed.
If the create or destroy operation was successful, a copy of the node
created or destroyed will be placed in the space indicated by
.Fa oldp
and
.Fa oldlenp .
If the create operation fails because of a conflict with an existing
node, a copy of that node will be returned instead.
.Pp
In order to facilitate the creation and destruction of nodes from a
given tree by kernel subsystems, the functions
.Fn sysctl_createv
and
.Fn sysctl_destroyv
are provided.
These functions take care of the overhead of filling in the contents
of the create or destroy request, dealing with locking, locating the
appropriate parent node, etc.
.Pp
The arguments to
.Fn sysctl_createv
are used to construct the new node.
If the
.Fa log
argument is not
.Dv NULL ,
a sysctllog structure will be allocated and the pointer referenced
will be changed to address it.
The same log may be used for any number of nodes, provided they are
all inserted into the same tree.
This allows for a series of nodes to be created and later removed from
the tree in a single transaction (via
.Fn sysctl_teardown )
without the need for any record
keeping on the caller's part.
The
.Fa cflags
argument is currently unused and must be zero.
The
.Fa rnode
argument must either be
.Dv NULL
or a valid pointer to a reference to the root of the tree into which
the new node must be placed.
If it is
.Dv NULL ,
the main tree will be used.
It is illegal for
.Fa rnode
to refer to a
.Dv NULL
pointer.
If the
.Fa cnode
argument is not
.Dv NULL ,
on return it will be adjusted to point to the address of the new node.
.Pp
The
.Fa flags
and
.Fa type
arguments are combined into the
.Fa sysctl_flags
field, and the current value for
.Dv SYSCTL_VERSION
is added in.
Note: the
.Dv CTLFLAG_PERMANENT
flag can only be set from SYSCTL setup routines (see
.Sx SETUP FUNCTIONS )
as called by
.Fn sysctl_init .
The
.Fa namep
argument is copied into the
.Fa sysctl_name
field and must be less than
.Dv SYSCTL_NAMELEN
characters in length.
The string indicated by
.Fa desc
will be copied if the
.Dv CTLFLAG_OWNDESC
flag is set, and will be used as the node's description.
Note: if
.Fn sysctl_destroyv
attempts to delete a node that does not own its own description (and
is not marked as permanent), but the deletion fails, the description
will be copied and
.Fn sysctl_destroyv
will set the
.Dv CTLFLAG_OWNDESC
flag.
.Pp
The
.Fa func
argument is the name of a
.Dq helper
function (see
.Sx HELPER FUNCTIONS AND MACROS ) .
If the
.Dv CTLFLAG_IMMEDIATE
flag is set, the
.Fa qv
argument will be interpreted as the initial value for the new
.Dq int
or
.Dq quad
node.
This flag does not apply to any other type of node.
The
.Fa newp
and
.Fa newlen
arguments describe the data external to SYSCTL that is to be
instrumented.
One of
.Fa func ,
.Fa qv
and the
.Dv CTLFLAG_IMMEDIATE
flag, or
.Fa newp
and
.Fa newlen
must be given for nodes that instrument data, otherwise an error is
returned.
.Pp
The remaining arguments are a list of integers specifying the path
through the MIB to the node being created.
The list must be terminated by the
.Dv CTL_EOL
value.
The penultimate value in the list may be
.Dv CTL_CREATE
if a dynamic MIB entry is to be made for this node.
.Fn sysctl_createv
specifically does not support
.Dv CTL_CREATESYM ,
since setup routines are
expected to be able to use the in-kernel
.Xr ksyms 4
interface to discover the location of the data to be instrumented.
If the node to be created matches a node that already exists, a return
code of 0 is given, indicating success.
.Pp
When using
.Fn sysctl_destroyv
to destroy a given node, the
.Fa rnode
argument, if not
.Dv NULL ,
is taken to be the root of the tree from which
the node is to be destroyed, otherwise the main tree is used.
The rest of the arguments are a list of integers specifying the path
through the MIB to the node being destroyed.
If the node being destroyed does not exist, a successful return code
is given.
Nodes marked with the
.Dv CTLFLAG_PERMANENT
flag cannot be destroyed.
.Sh HELPER FUNCTIONS AND MACROS
Helper functions are invoked with the same common argument set as
.Fn sysctl_dispatch
except that the
.Fa rnode
argument will never be
.Dv NULL .
It will be set to point to the node that corresponds most closely to
the current request.
Helpers are forbidden from modifying the node they are passed; they
should instead copy the structure if changes are required in order to
effect access control or other checks.
The prototype and definition of a
.Dq helper
function that ensures that a newly assigned value is within a certain
range (presuming external data) would look like the following:
.Pp
.Bd -literal -offset indent -compact
static int sysctl_helper(SYSCTLFN_PROTO);
.sp
static int
sysctl_helper(SYSCTLFN_ARGS)
{
	struct sysctlnode node;
	int t, error;
.sp
	/* Work on a private copy of the node that points at t. */
	node = *rnode;
	/* Seed t with the current value so that reads return it. */
	t = *(int *)rnode-\*[Gt]sysctl_data;
	node.sysctl_data = \*[Am]t;
	error = sysctl_lookup(SYSCTLFN_CALL(\*[Am]node));
	if (error || newp == NULL)
		return (error);
.sp
	/* A new value was supplied; reject it if out of range. */
	if (t \*[Lt] 0 || t \*[Gt] 20)
		return (EINVAL);
.sp
	/* Commit the validated value to the external data. */
	*(int *)rnode-\*[Gt]sysctl_data = t;
	return (0);
}
.Ed
.Pp
The use of the
.Dv SYSCTLFN_PROTO ,
.Dv SYSCTLFN_ARGS ,
and
.Dv SYSCTLFN_CALL
macros ensures that all arguments are passed properly.
The
.Dv SYSCTLFN_RWPROTO
and
.Dv SYSCTLFN_RWARGS
macros are only used internally by those core SYSCTL routines that may
have cause to modify the data in the given SYSCTL tree.
The single argument to the
.Dv SYSCTLFN_CALL
macro is the pointer to the node being examined.
.Pp
Three basic helper functions are available for use.
.Fn sysctl_needfunc
will emit a warning to the system console whenever it is invoked and
provides a simplistic read-only interface to the given node.
.Fn sysctl_notavail
will forward
.Dq queries
to
.Fn sysctl_query
so that subtrees can be discovered, but will return
.Er EOPNOTSUPP
for any other condition.
.Fn sysctl_null
specifically ignores any arguments given, sets the value indicated by
.Fa oldlenp
to zero, and returns success.
.Sh SETUP FUNCTIONS
Though nodes can be added to the SYSCTL tree at any time, in order to
add nodes during the kernel bootstrap phase, a proper
.Dq setup
function must be used.
Setup functions are declared using the
.Dv SYSCTL_SETUP
macro, which takes the name of the function and a short string
description of the function as arguments.
The address of the function is added to a list of functions that
.Fn sysctl_init
traverses during initialization.
.Pp
Setup functions do not have to add nodes to the main tree, but can set
up their own trees for emulation or other purposes.
Emulations that require use of a main tree but with some nodes changed
to suit their own purposes can arrange to overlay a sparse private
tree onto their main tree by making the
.Fa e_sysctlovly
member of their struct emul definition point to the overlaid tree.
.Pp
Setup functions should take care to create all nodes from the root
down to the subtree they are creating, since the order in which setup
functions are called is arbitrary (it is determined only by the
ordering of the object files as passed to the linker when the kernel
is built).
.Sh MISCELLANEOUS FUNCTIONS
.Fn sysctl_init
is called early in the kernel bootstrap process.
It initializes the SYSCTL lock, calls all the registered setup
functions, and marks the tree as permanent.
.Pp
.Fn sysctl_free
will unconditionally delete any and all nodes below the given node.
Its intended use is for the deletion of entire trees, not subtrees.
If a subtree is to be removed,
.Fn sysctl_destroy
or
.Fn sysctl_destroyv
should be used to ensure that nodes not owned by the sub-system being
deactivated are not mistakenly destroyed.
The SYSCTL lock must be held when calling this function.
.Pp
.Fn sysctl_teardown
unwinds a sysctllog and deletes the nodes in the reverse of the order
in which they were created.
.Pp
.Fn old_sysctl
provides an interface similar to the old SYSCTL implementation, with
the exception that access checks on a per-node basis are performed if
the
.Fa l
argument is
.No non- Ns Dv NULL .
If called with a
.Dv NULL
argument, the values for
.Fa newp
and
.Fa oldp
are interpreted as kernel addresses, and access is performed as for
the superuser.
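.Pp
The unwinding performed by
.Fn sysctl_teardown
can be pictured as popping a stack of recorded nodes.
The following is a minimal userland sketch of that idea, not the kernel's
implementation: struct model_log, model_log_record, model_log_teardown and
MODEL_LOG_MAX are hypothetical names, node identities are reduced to plain
integers, and the fixed capacity is an arbitrary simplification.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_LOG_MAX 16	/* arbitrary capacity for this sketch */

/* Hypothetical stand-in for a sysctllog: a stack of created node ids. */
struct model_log {
	int ids[MODEL_LOG_MAX];
	size_t used;
};

/* Record one created node; returns 0 on success, -1 if the log is full. */
int
model_log_record(struct model_log *log, int id)
{
	assert(log != NULL);
	if (log->used == MODEL_LOG_MAX)
		return -1;
	log->ids[log->used++] = id;
	return 0;
}

/*
 * Unwind the log, writing ids newest-first into out[], which must have
 * room for log->used entries.  Returns the number of ids written and
 * leaves the log empty.
 */
size_t
model_log_teardown(struct model_log *log, int *out)
{
	size_t n = 0;

	assert(log != NULL && out != NULL);
	while (log->used > 0)
		out[n++] = log->ids[--log->used];
	return n;
}
```

Removing the newest entries first means that, in this model, a node recorded
after its parent is always removed before that parent.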
.Sh NOTES
It is expected that nodes will be added to (or removed from) the tree
during the following stages of a machine's lifetime:
.Pp
.Bl -bullet -compact
.It
initialization -- when the kernel is booting
.It
autoconfiguration -- when devices are being probed at boot time
.It
.Dq plug and play
device attachment -- when a PC-Card, USB, or other device is plugged
in or attached
.It
LKM initialization -- when an LKM is being loaded
.It
.Dq run-time
-- when a process creates a node via the
.Xr sysctl 3
interface
.El
.Pp
Nodes marked with
.Dv CTLFLAG_PERMANENT
can only be added to a tree during the first or initialization phase,
and can never be removed.
The initialization phase terminates when the main tree's root is
marked with the
.Dv CTLFLAG_PERMANENT
flag.
Once the main tree is marked in this manner, no nodes can be added to
any tree that is marked with
.Dv CTLFLAG_READONLY
at its root, and no nodes can be added at all if the main tree's root
is so marked.
.Pp
Nodes added by device drivers, LKMs, and at device insertion time can
be added to (and removed from)
.Dq read-only
parent nodes.
.Pp
Nodes created by processes can only be added to
.Dq writable
parent nodes.
See
.Xr sysctl 3
for a description of the flags that are allowed to be used
when creating nodes.
.Sh SEE ALSO
.Xr sysctl 3
.Sh HISTORY
The dynamic SYSCTL implementation first appeared in
.Nx 2.0 .
.Sh AUTHORS
.An Andrew Brown
.Aq atatat@NetBSD.org
designed and implemented the dynamic SYSCTL implementation.