.\"     $OpenBSD: vnode.9,v 1.10 2002/11/08 08:08:47 mpech Exp $
.\"
.\" Copyright (c) 2001 Constantine Sapuntzakis
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\"
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. The name of the author may not be used to endorse or promote products
.\"    derived from this software without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES,
.\" INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY
.\" AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL
.\" THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
.\" EXEMPLARY, OR CONSEQUENTIAL  DAMAGES (INCLUDING, BUT NOT LIMITED TO,
.\" PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS;
.\" OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
.\" WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
.\" OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
.\" ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
.\"
.Dd February 22, 2001
.Dt vnode 9
.Os OpenBSD 2.9
.Sh NAME
.Nm vnode
.Nd an overview of vnodes
.Sh DESCRIPTION
A vnode is an object that speaks the UNIX file interface (open,
read, write, close, readdir, etc.).
Vnodes can represent files, directories, FIFOs, domain sockets,
block devices, and character devices.
.Pp
Each vnode has a set of methods whose names start with the string 'VOP_'.
These methods include VOP_OPEN, VOP_READ, VOP_WRITE, VOP_RENAME, VOP_CLOSE,
and VOP_MKDIR.
Many of these methods correspond closely to the equivalent
file system call: open, read, write, rename, etc.
Each file system (FFS, NFS, etc.) provides implementations for these methods.
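.Pp
The method dispatch described above can be sketched in user-space C.
This is a toy model, not kernel code: the names toy_vnode, toy_vops,
TOY_VOP_OPEN, TOY_VOP_CLOSE and the toyfs_ functions are invented for
illustration, and the real vnode operations vector is laid out differently.
.Bd -literal -offset indent
```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct toy_vnode;

/* Hypothetical per-file-system method table (not the kernel layout). */
struct toy_vops {
	int (*vop_open)(struct toy_vnode *);
	int (*vop_close)(struct toy_vnode *);
};

struct toy_vnode {
	const struct toy_vops *v_op;	/* methods supplied by the file system */
	int v_opencount;		/* toy state, for illustration only */
};

/* Generic wrappers: callers never name a file system directly. */
static int
TOY_VOP_OPEN(struct toy_vnode *vp)
{
	assert(vp != NULL && vp->v_op != NULL);
	if (vp->v_op->vop_open == NULL)
		return (EOPNOTSUPP);
	return (vp->v_op->vop_open(vp));
}

static int
TOY_VOP_CLOSE(struct toy_vnode *vp)
{
	assert(vp != NULL && vp->v_op != NULL);
	if (vp->v_op->vop_close == NULL)
		return (EOPNOTSUPP);
	return (vp->v_op->vop_close(vp));
}

/* One file system's implementations of the methods. */
static int
toyfs_open(struct toy_vnode *vp)
{
	vp->v_opencount++;
	return (0);
}

static int
toyfs_close(struct toy_vnode *vp)
{
	if (vp->v_opencount == 0)
		return (EBADF);
	vp->v_opencount--;
	return (0);
}

static const struct toy_vops toyfs_vops = { toyfs_open, toyfs_close };
```
.Ed
A caller only ever invokes the TOY_VOP_ wrappers; which file system
code runs is decided by the table the vnode points at.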
.Pp
The Virtual File System (VFS) library maintains a pool of vnodes.
File systems cannot allocate their own vnodes; they must use the functions
provided by the VFS to create and manage vnodes.
.Ss Vnode state
Vnodes have a reference count which corresponds to the number of kernel
objects that hold references to the vnode.
A positive reference count keeps
the vnode off the free list, which prevents the vnode from being recycled
to refer to a different file.
.Pp
Vnodes that refer to a valid file and have a reference count of 1 or
greater are "active".
When a vnode's reference count drops to zero, it
is "inactivated" and becomes "inactive".
Inactive vnodes are placed on the
free list, to be re-used to represent other files.
.Pp
Before a struct vnode can be re-used to refer to another file, it must
be cleaned out of all information pertaining to the old file.
A vnode that doesn't refer to any file is called a "reclaimed" vnode.
.Pp
The VFS may "reclaim" a vnode with a positive reference count.
This is done when the underlying file is revoked, as happens with the
revoke system call or through a forced unmount.
Such a vnode is given
to the dead file system, which returns errors for most operations.
The vnode will not be re-used for another file until its reference count
hits zero.
.Pp
There are three states then for a vnode: active, inactive, and reclaimed.
All transitions are meaningful except reclaimed to inactive.
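.Pp
The three states and the one excluded transition can be written down as a
small table-free check.
This is a user-space sketch only; the enum and the function name
toy_transition_ok are invented here, and the kernel keeps no such explicit
state variable.
.Bd -literal -offset indent
```c
#include <assert.h>
#include <stdbool.h>

/* The three vnode states named in the text (names are illustrative). */
enum toy_vstate { TOY_ACTIVE, TOY_INACTIVE, TOY_RECLAIMED };

/*
 * Return true if moving from one state to another is meaningful.
 *   active    -> inactive   reference count dropped to zero
 *   inactive  -> active     a new reference was taken
 *   active    -> reclaimed  file revoked while still referenced
 *   inactive  -> reclaimed  cleaned out before re-use
 *   reclaimed -> active     handed out again for a new file
 * Only reclaimed -> inactive is excluded; staying in the same state
 * is not counted as a transition in this sketch.
 */
static bool
toy_transition_ok(enum toy_vstate from, enum toy_vstate to)
{
	assert(from <= TOY_RECLAIMED && to <= TOY_RECLAIMED);
	if (from == to)
		return (false);
	return (!(from == TOY_RECLAIMED && to == TOY_INACTIVE));
}
```
.Ed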
.Ss Vnode pool
The
.Xr getnewvnode 9
function returns a fresh active vnode from the vnode
pool assigned to the file system specified in its arguments.
The vnode returned has a reference count (v_usecount) of 1.
.Pp
The
.Xr vref 9
call increments the reference count on the vnode.
It may only be called on a vnode with a reference count of 1 or greater.
The
.Xr vrele 9
and
.Xr vput 9
calls decrement the reference count.
In addition, the
.Xr vput 9
call also releases the vnode lock.
.Pp
The
.Xr vget 9
call, when used on an inactive vnode, will make the vnode "active"
by bumping the reference count to one.
When called on an active vnode, vget increases the reference count by one.
However, if the vnode is being reclaimed concurrently, then vget will fail
and return an error.
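.Pp
The reference counting rules above can be modelled in a few lines of
user-space C.
This is a sketch under stated assumptions, not the kernel implementation:
the rc_ names are invented, locking is omitted entirely, the v_reclaiming
field merely stands in for the reclamation flag, and ENOENT is an arbitrary
choice of error value for the sketch.
.Bd -literal -offset indent
```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

struct rc_vnode {
	int  v_usecount;	/* reference count */
	bool v_reclaiming;	/* stand-in for "reclamation in progress" */
	bool v_onfreelist;	/* true while the vnode is inactive */
};

/* Like vref: only legal on a vnode that is already active. */
static void
rc_vref(struct rc_vnode *vp)
{
	assert(vp->v_usecount >= 1);
	vp->v_usecount++;
}

/* Like vrele: drop a reference; at zero the vnode becomes inactive. */
static void
rc_vrele(struct rc_vnode *vp)
{
	assert(vp->v_usecount >= 1);
	if (--vp->v_usecount == 0)
		vp->v_onfreelist = true;
}

/* Like vget: works on active or inactive vnodes, fails during reclaim. */
static int
rc_vget(struct rc_vnode *vp)
{
	if (vp->v_reclaiming)
		return (ENOENT);	/* error value assumed for the sketch */
	if (vp->v_usecount == 0)
		vp->v_onfreelist = false;	/* inactive -> active */
	vp->v_usecount++;
	return (0);
}
```
.Ed
The asymmetry is the point: rc_vref asserts that a reference already
exists, while rc_vget is the only entry point that may revive an inactive
vnode and the only one that can fail.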
.Pp
The
.Xr vgone 9
and
.Xr vgonel 9
functions orchestrate the reclamation of a vnode.
They can be called on both active and inactive vnodes.
.Pp
While transitioning a vnode to the "reclaimed" state, the VFS will call the
.Xr vop_reclaim 9
method.
File systems use this method to free any file-system specific data
they attached to the vnode.
.Ss Vnode locks
The vnode actually has three different types of lock: the vnode lock,
the vnode interlock, and the vnode reclamation lock (VXLOCK).
.Ss The vnode lock
The most general lock is the vnode lock.
This lock is acquired by calling
.Xr vn_lock 9
and released by calling
.Xr vn_unlock 9 .
The vnode lock is used to serialize operations through the file system for
a given file when there are multiple concurrent requests on the same file.
Many file system functions require that you hold the vnode lock on entry.
The vnode lock may be held when sleeping.
.Pp
A vnode will not be reclaimed as long as the vnode lock is held by some
other process.
.Pp
The vnode lock is a multiple-reader or single-writer lock.
An exclusive vnode lock may be acquired multiple times by the same
process.
.Pp
The vnode lock is somewhat messy because it is used for many purposes.
Some clients of the vnode interface use it to try to bundle a series
of VOP_ method calls into an atomic group.
Many file systems rely on it to prevent race conditions in updating file
system specific data structures (as opposed to having their own locks).
.Pp
The implementation of the vnode lock is the responsibility of the individual
file systems.
Not all file systems implement it.
.Pp
To prevent deadlocks, when acquiring locks on multiple vnodes, the lock
on the parent directory must be acquired before the lock on the child directory.
.Ss Vnode interlock
The vnode interlock (vp->v_interlock) is a spinlock.
It is useful on multi-processor systems for acquiring a quick exclusive
lock on the contents of the vnode.
It MUST NOT be held while sleeping.
(What fields does it cover? What about splbio/interrupt issues?)
.Pp
Operations on this lock are a no-op on uniprocessor systems.
.Ss Other Vnode synchronization
The vnode reclamation lock (VXLOCK) is used to prevent multiple
processes from entering the vnode reclamation code.
It is also used as a flag to indicate that reclamation is in progress.
The VXWANT flag is set by processes that wish to be woken up when reclamation
is finished.
.Pp
The
.Xr vwaitforio 9
call is used to wait for all outstanding write I/Os associated with a
vnode to complete.
.Ss Version number/capability
The vnode capability, v_id, is a 32-bit version number on the vnode.
Every time a vnode is reassigned to a new file, the vnode capability
is changed.
This is used by code that wishes to keep pointers to vnodes but does not want
to hold a reference (e.g., caches).
Such code keeps both a vnode * and a copy of the capability.
It can later compare the vnode's capability to its copy and see
if the vnode still points to the same file.
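.Pp
The capability check can be sketched as a one-entry cache in user-space C.
The cap_ names and the v_fileno field are invented for illustration, and
bumping v_id by one on reassignment is an assumption of this sketch; the
text above only says that the capability changes.
.Bd -literal -offset indent
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct cap_vnode {
	uint32_t v_id;		/* capability: changes on every reassignment */
	int v_fileno;		/* toy stand-in for "which file this is" */
};

/* A cache entry holds a pointer without holding a reference. */
struct cap_entry {
	struct cap_vnode *ce_vp;
	uint32_t ce_id;		/* copy of v_id taken when cached */
};

static void
cap_remember(struct cap_entry *ce, struct cap_vnode *vp)
{
	assert(ce != NULL && vp != NULL);
	ce->ce_vp = vp;
	ce->ce_id = vp->v_id;
}

/* Return the vnode only if it still represents the cached file. */
static struct cap_vnode *
cap_lookup(const struct cap_entry *ce)
{
	if (ce->ce_vp == NULL || ce->ce_vp->v_id != ce->ce_id)
		return (NULL);	/* stale: vnode was reassigned */
	return (ce->ce_vp);
}

/* Model the pool reassigning the vnode to a different file. */
static void
cap_reassign(struct cap_vnode *vp, int fileno)
{
	vp->v_id++;		/* assumed update rule */
	vp->v_fileno = fileno;
}
```
.Ed
Dereferencing ce_vp in cap_lookup is only safe because the memory behind a
struct vnode is never returned to the allocator.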
.Pp
Note: for this to work, memory assigned to hold a struct vnode can
only be used for another purpose when all pointers to it have disappeared.
Since the vnode pool has no way of knowing when all pointers have
disappeared, it never frees memory it has allocated for vnodes.
.Ss Vnode fields
Most of the fields of the vnode structure should be treated as opaque
and only manipulated through the proper APIs.
This section describes the fields that are manipulated directly.
.Pp
The v_flag attribute contains miscellaneous flags related to various functions.
They are summarized in table ...
.Pp
The v_tag attribute indicates what file system the vnode belongs to.
Very little code actually uses this attribute and its use is deprecated.
Programmers should seriously consider using more object-oriented approaches
(e.g. function tables).
There is no safe way of defining new v_tags for loadable file systems.
The v_tag attribute is read-only.
.Pp
The v_type attribute indicates what type of file (e.g. directory,
regular, fifo) this vnode is.
This is used by the generic code to perform various checks.
For example, the
.Xr read 2
system call returns an error when a read is attempted on a directory.
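.Pp
A type check of this kind might look as follows in a user-space sketch.
The ty_ names are invented, only three of the file types are modelled, and
EISDIR is assumed here as the error; the text above does not say which
error is returned.
.Bd -literal -offset indent
```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* A reduced set of file types (names are illustrative). */
enum ty_vtype { TY_VREG, TY_VDIR, TY_VFIFO };

struct ty_vnode {
	enum ty_vtype v_type;
};

/*
 * Generic pre-check made before a read is passed down to the
 * file system: directories are refused, everything else is allowed.
 */
static int
ty_read_check(const struct ty_vnode *vp)
{
	assert(vp != NULL);
	if (vp->v_type == TY_VDIR)
		return (EISDIR);	/* assumed error value */
	return (0);
}
```
.Ed
Because the check depends only on v_type, it can live in generic code and
need not be repeated in every file system.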
.Pp
The v_data attribute allows a file system to attach a piece of file
system specific memory to the vnode.
This contains information about the file that is specific to
the file system.
.Pp
The v_numoutput attribute indicates the number of pending synchronous
and asynchronous writes on the vnode.
It does not track the number of dirty buffers attached to the vnode.
The attribute is used by code like fsync to wait for all writes
to complete before returning to the user.
This attribute must be manipulated at splbio().
.Pp
The v_writecount attribute tracks the number of write calls pending
on the vnode.
.Ss RULES
The vast majority of vnode functions may not be called from interrupt
context.
The exceptions are bgetvp and brelvp.
The following fields of the vnode are manipulated at interrupt level:
v_numoutput, v_holdcnt, v_dirtyblkhd, v_cleanblkhd, v_bioflag, v_freelist,
and v_synclist.
Any access to these fields should be protected by splbio,
unless you are certain that there is no chance an interrupt handler
will modify them.
.Sh HISTORY
This document first appeared in
.Ox 2.9 .