.Dd February 22, 2001
.Dt vnode 9
.Os OpenBSD 2.9
.Sh NAME
.Nm vnode
.Nd an overview of vnodes
.Sh DESCRIPTION
The vnode is the kernel object that corresponds to a file (actually,
a file, a directory, a fifo, a domain socket, a symlink, or a device).
.Pp
Each vnode has a set of methods corresponding to file operations
(vop_open, vop_read, vop_write, vop_rename, vop_mkdir, vop_close).
These methods are implemented by the individual file systems and
are dispatched through function pointers.
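.Pp
As an illustration of dispatch through function pointers, the following
user-space sketch models a vnode that carries a per-file-system method table.
Every name in it (toy_vnode, toy_vops, toyfs_open, toy_vop_open and so on)
is hypothetical and greatly simplified; it is not the kernel's actual
definition of these structures.
.Bd -literal -offset indent
```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the kernel's real definitions. */
struct toy_vnode;

struct toy_vops {
	int (*vop_open)(struct toy_vnode *);
	int (*vop_close)(struct toy_vnode *);
};

struct toy_vnode {
	const struct toy_vops *v_op;	/* method table of the owning file system */
	int open_count;			/* toy per-file state */
};

/* Methods supplied by one imaginary file system. */
int
toyfs_open(struct toy_vnode *vp)
{
	vp->open_count++;
	return 0;
}

int
toyfs_close(struct toy_vnode *vp)
{
	vp->open_count--;
	return 0;
}

const struct toy_vops toyfs_vops = { toyfs_open, toyfs_close };

/* Generic entry points: callers never name the file system directly. */
int
toy_vop_open(struct toy_vnode *vp)
{
	assert(vp != NULL && vp->v_op != NULL);
	return vp->v_op->vop_open(vp);
}

int
toy_vop_close(struct toy_vnode *vp)
{
	assert(vp != NULL && vp->v_op != NULL);
	return vp->v_op->vop_close(vp);
}
```
.Ed
A caller holding a toy_vnode whose v_op points at toyfs_vops reaches
toyfs_open through toy_vop_open without knowing which file system it is
talking to.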
.Pp
In addition, the VFS has functions for maintaining a pool of vnodes,
associating vnodes with mount points, and associating vnodes with buffers.
The individual file systems cannot override these functions. As such,
individual file systems cannot allocate their own vnodes.
.Pp
In general, the contents of a struct vnode should not be examined or
modified by the users of vnode methods. There are some rather common
exceptions detailed later in this document.
.Pp
The vast majority of the vnode functions CANNOT be called from interrupt
context.
.Ss Vnode pool
All the vnodes in the kernel are allocated out of a shared pool.
The
.Xr getnewvnode 9
function returns a fresh vnode from the vnode
pool. The vnode returned has a reference count (v_usecount) of 1.
.Pp
The
.Xr vref 9
call increments the reference count on the vnode. The
.Xr vrele 9
and
.Xr vput 9
calls decrement the reference count.
In addition, the
.Xr vput 9
call releases the vnode lock.
.Pp
When a vnode's reference count becomes zero, the vnode pool places it
in a pool of free vnodes, eligible to be assigned to a different file.
The vnode pool calls the
.Xr vop_inactive 9
method to inform the file system that the reference count has reached zero.
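.Pp
The reference counting rules above can be modelled in a few lines of
user-space C.
This is only a sketch under simplifying assumptions: the toy_ names, the
boolean lock and the inactive_calls counter are hypothetical, and the real
kernel routines do considerably more work (and order the unlock and
inactive steps more carefully).
.Bd -literal -offset indent
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the fields involved; not the real struct vnode. */
struct toy_vnode {
	int v_usecount;		/* reference count */
	bool locked;		/* stand-in for the vnode lock */
	bool on_freelist;	/* true once placed in the free pool */
	int inactive_calls;	/* times the inactive method was invoked */
};

/* Model of getnewvnode: a fresh vnode starts with one reference. */
void
toy_getnewvnode(struct toy_vnode *vp)
{
	assert(vp != NULL);
	vp->v_usecount = 1;
	vp->locked = false;
	vp->on_freelist = false;
	vp->inactive_calls = 0;
}

/* Model of vref: take an additional reference. */
void
toy_vref(struct toy_vnode *vp)
{
	assert(vp != NULL && vp->v_usecount > 0);
	vp->v_usecount++;
}

/* Model of vrele: drop a reference; on zero, notify and free-list. */
void
toy_vrele(struct toy_vnode *vp)
{
	assert(vp != NULL && vp->v_usecount > 0);
	if (--vp->v_usecount == 0) {
		vp->inactive_calls++;	/* stand-in for vop_inactive */
		vp->on_freelist = true;
	}
}

/* Model of vput: release the lock, then drop the reference. */
void
toy_vput(struct toy_vnode *vp)
{
	assert(vp != NULL && vp->locked);
	vp->locked = false;
	toy_vrele(vp);
}
```
.Ed
In this model, a vnode obtained from toy_getnewvnode and then passed once
through toy_vref needs one toy_vrele and one toy_vput (or two toy_vrele
calls) before it lands on the free list.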
.Pp
When placed in the pool of free vnodes, the vnode is not otherwise altered.
In fact, it can often be retrieved before it is reassigned to a different file.
This is useful when the system closes a file and opens it again in rapid
succession. The
.Xr vget 9
call is used to revive the vnode. Note that callers should ensure the vnode
they get back has not been reassigned to a different file.
.Pp
When the vnode pool decides to reclaim the vnode to satisfy a
.Xr getnewvnode 9
request, it calls the
.Xr vop_reclaim 9
method. File systems
often use this method to free any file-system specific data they
attach to the vnode.
.Pp
A file system can force a vnode with a reference count of zero
to be reclaimed earlier by calling
.Xr vrecycle 9 .
The
.Xr vrecycle 9
call is a null operation if the reference count is greater than zero.
.Pp
The
.Xr vgone 9
and
.Xr vgonel 9
calls will force the pool to reclaim
the vnode even if it has a non-zero reference count. If the vnode had
a non-zero reference count, the vnode is then assigned an operations
vector corresponding to the "dead" file system. In this operations
vector, most operations return errors.
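.Pp
The switch to a "dead" operations vector can be pictured with the following
user-space sketch.
All names are hypothetical, only a single method is modelled, and the use of
EBADF as the error is an assumption made for illustration rather than a
statement about what each dead operation returns.
.Bd -literal -offset indent
```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the kernel's real definitions. */
struct toy_vnode;

struct toy_vops {
	int (*vop_read)(struct toy_vnode *);
};

struct toy_vnode {
	const struct toy_vops *v_op;	/* current operations vector */
	void *v_data;			/* file-system private data */
	int v_usecount;			/* reference count */
};

/* Read method of an imaginary live file system: always succeeds. */
int
toyfs_read(struct toy_vnode *vp)
{
	(void)vp;
	return 0;
}

/* Read method of the "dead" file system: always fails. */
int
dead_read(struct toy_vnode *vp)
{
	(void)vp;
	return EBADF;	/* error value assumed for illustration */
}

const struct toy_vops toyfs_vops = { toyfs_read };
const struct toy_vops dead_vops = { dead_read };

/* Model of a forced reclaim of a vnode that may still be referenced. */
void
toy_vgone(struct toy_vnode *vp)
{
	assert(vp != NULL);
	vp->v_data = NULL;		/* stand-in for the reclaim method */
	if (vp->v_usecount > 0)
		vp->v_op = &dead_vops;	/* remaining holders now get errors */
}
```
.Ed
After toy_vgone, a holder that still has a reference can keep calling
through v_op safely, but every call reports an error instead of touching
the reclaimed file-system data.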
.Ss Vnode locks
Note to beginners: locks don't actually prevent memory from being read
or overwritten. Instead, they are an object that, where used, allows
only one piece of code to proceed through the locked section.  If you
do not surround a stretch of code with a lock, it can and probably
will eventually be executed simultaneously with other stretches of code
(including itself). Chances are the results will be unexpected and
disappointing to both the user and you.
.Pp
The vnode actually has three different types of lock: the vnode lock,
the vnode interlock, and the vnode reclamation lock (VXLOCK).
.Ss The vnode lock
The most general lock is the vnode lock. This lock is acquired by
calling
.Xr vn_lock 9
and released by calling
.Xr vn_unlock 9 .
The vnode lock is used to serialize operations through the file system for
a given file when there are multiple concurrent requests on the same file.
Many file system functions require that you hold the vnode lock on entry.
The vnode lock may be held when sleeping.
.Pp
The
.Xr revoke 2
and forcible unmount features in BSD UNIX allow a
user to invalidate files and their associated vnodes at almost any
time, even if there are active open files on them. While in a region of code
protected by the vnode lock, the process is guaranteed that the vnode
will not be reclaimed or invalidated.
.Pp
The vnode lock is a multiple-reader or single-writer lock. An
exclusive vnode lock may be acquired multiple times by the same
process.
.Pp
The vnode lock is somewhat messy because it is used for many purposes.
Some clients of the vnode interface use it to try to bundle a series
of VOP_ method calls into an atomic group. Many file systems rely on
it to prevent race conditions in updating file system specific data
structures (as opposed to having their own locks).
.Pp
The implementation of the vnode lock is the responsibility of the individual
file systems.  Not all file systems implement it.
.Pp
To prevent deadlocks, when acquiring locks on multiple vnodes, the lock
on the parent directory must be acquired before the lock on the child.
.Pp
Interrupt handlers must not acquire vnode locks.
.Ss Vnode interlock
The vnode interlock (vp->v_interlock) is a spinlock. It is useful on
multi-processor systems for acquiring a quick exclusive lock on the
contents of the vnode. It MUST NOT be held while sleeping. (What
fields does it cover? What about splbio/interrupt issues?)
.Pp
Operations on this lock are a no-op on uniprocessor systems.
.Ss Other Vnode synchronization
The vnode reclamation lock (VXLOCK) is used to prevent multiple
processes from entering the vnode reclamation code. It is also used as
a flag to indicate that reclamation is in progress. The VXWANT flag is
set by processes that wish to be woken up when reclamation is finished.
.Pp
The
.Xr vwaitforio 9
call is used to wait for all outstanding write I/Os associated with a
vnode to complete.
.Ss Version number/capability
The vnode capability, v_id, is a 32-bit version number on the vnode.
Every time a vnode is reassigned to a new file, the vnode capability
is changed. This is used by code that wishes to keep pointers to vnodes
but does not want to hold a reference (e.g. caches). The code keeps
both a vnode * and a copy of the capability. The code can later compare
the vnode's capability to its copy and see if the vnode still
points to the same file.
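.Pp
The capability check can be sketched as follows.
This is a user-space model, not kernel code: toy_vnode, toy_cache_entry and
the three helper functions are hypothetical names, and bumping v_id by one
on reassignment is an assumption about how the number changes.
.Bd -literal -offset indent
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model; only the capability field mirrors the text. */
struct toy_vnode {
	uint32_t v_id;	/* capability, changed on every reassignment */
	int file;	/* toy identity of the file currently represented */
};

/* A cache entry holds a pointer and a capability copy, no reference. */
struct toy_cache_entry {
	struct toy_vnode *vp;
	uint32_t vpid;
};

/* Remember the vnode together with its current capability. */
void
toy_cache_enter(struct toy_cache_entry *ce, struct toy_vnode *vp)
{
	assert(ce != NULL && vp != NULL);
	ce->vp = vp;
	ce->vpid = vp->v_id;
}

/* The entry is usable only if the capability has not changed. */
bool
toy_cache_valid(const struct toy_cache_entry *ce)
{
	assert(ce != NULL);
	return ce->vp != NULL && ce->vp->v_id == ce->vpid;
}

/* Model of the pool handing the vnode to a different file. */
void
toy_reassign(struct toy_vnode *vp, int new_file)
{
	assert(vp != NULL);
	vp->v_id++;	/* assumed update rule: any change invalidates copies */
	vp->file = new_file;
}
```
.Ed
A cache built this way must call toy_cache_valid before every use of the
stored pointer, because the pointer itself stays non-NULL after the vnode
has moved on to another file.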
.Pp
Note: for this to work, memory assigned to hold a struct vnode can
only be used for another purpose when all pointers to it have disappeared.
Since the vnode pool has no way of knowing when all pointers have
disappeared, it never frees memory it has allocated for vnodes.
.Ss Vnode fields
Most of the fields of the vnode structure should be treated as opaque
and only manipulated through the proper APIs. This section describes
the fields that are manipulated directly.
.Pp
The v_flag attribute contains miscellaneous flags related to various functions.
They are summarized in table ...
.Pp
The v_tag attribute indicates what file system the vnode belongs to.
Very little code actually uses this attribute and its use is deprecated.
Programmers should seriously consider using more object-oriented approaches
(e.g. function tables). There is no safe way of defining new v_tags
for loadable file systems. The v_tag attribute is read-only.
.Pp
The v_type attribute indicates what type of file (e.g. directory,
regular, fifo) this vnode is. This is used by the generic code to
perform various checks. For example, the
.Xr read 2
system call returns an error when a read is attempted on a directory.
.Pp
The v_data attribute allows a file system to attach a piece of file
system specific memory to the vnode. This contains information about
the file that is specific to the file system.
.Pp
The v_numoutput attribute indicates the number of pending synchronous
and asynchronous writes on the vnode. It does not track the number of
dirty buffers attached to the vnode.  The attribute is used by code
like fsync to wait for all writes to complete before returning to the
user. This attribute must be manipulated at splbio().
.Pp
The v_writecount attribute tracks the number of write calls pending
on the vnode.
.Ss RULES
The vast majority of vnode functions may not be called from interrupt
context. The exceptions are bgetvp and brelvp. The following
fields of the vnode are manipulated at interrupt level: v_numoutput,
v_holdcnt, v_dirtyblkhd, v_cleanblkhd, v_bioflag, v_freelist, and
v_synclist. Any accesses to these fields should be protected by splbio,
unless you are certain that there is no chance an interrupt handler
will modify them.
.Pp
A vnode will only be reassigned to another file when its reference count
reaches zero and the vnode lock is freed.
.Pp
A vnode will not be reclaimed as long as the vnode lock is held.
If the vnode reference count drops to zero while a process is holding
the vnode lock, the vnode MAY be queued for reclamation. Increasing
the reference count from 0 to 1 while holding the lock will most likely
cause intermittent kernel panics.
.Sh SEE ALSO
.Sh HISTORY
This document first appeared in
.Ox 2.9 .