NFS Attribute Caching OS Problems and Amd
Last updated September 18, 2005

* Summary:

Some OSs don't seem to have a way to turn off the NFS attribute cache, which
breaks the Amd automounter so badly that we do not recommend using Amd on
such OSs for heavy use until this is fixed.


* Details:

Amd is a user-level NFSv2 server that manages automounts of all other file
systems.  The kernel contacts Amd via RPCs, and Amd in turn performs the
actual mounts, and then responds back to the kernel's RPCs.  Every kernel
caches file names and attributes, in a cache called the Directory Name
Lookup Cache (DNLC) or the Directory Cache (dcache).

Amd manages its namespace at the user level, but the kernel caches names
itself.  So the two must coordinate to ensure that both namespaces are in
sync.  If the kernel uses a cached entry from the DNLC, without consulting
Amd, users may see corruption of the automounter namespace (symlinks
pointing to the wrong places, ESTALE errors, and more).  For example,
suppose Amd timed out an entry and removed the entry from Amd's namespace.
Amd has to tell the kernel to purge its corresponding DNLC entry too.  The
way Amd often does that is by incrementing the last modification time
(mtime) of the parent directory.
This is the most common method for kernels
to check if their DNLC entries are stale: if the parent directory mtime is
newer, the kernel will discard all cached entries for that directory, and
will re-issue lookups.  Those lookups will result in
NFS_GETATTR/NFS_LOOKUP calls sent from the kernel down to Amd, and Amd can
then properly inform the kernel of the new state of automounted entries.

In order to ensure that Amd is "in charge" of its namespace without
interference from the kernel, Amd will try to turn off the NFS attribute
cache.  It does so by using the NFSMNT_NOAC flag, if it exists, or by
setting various "cache timeout" fields in struct nfs_args to 0 (acregmin,
acregmax, acdirmin, or acdirmax).

We released a major new version of am-utils, version 6.1, in June 2005.
Since then, a lot of people have experimented with Amd, in anticipation of
migrating from the very old am-utils 6.0 to the new 6.1.  For a couple of
months since the release of 6.1, we have received reports of problems with
Amd, especially under heavy use.  Users reported getting ESTALE errors from
time to time, or seeing automounted entries whose symlinks don't point
where they should.  After much debugging, we traced it to a few places in
Amd where it wasn't updating the parent directory mtime as it should have;
in some places where Amd was indeed updating the mtime, it was using a
resolution of only 1 second, which was not fine enough under heavy load.  We
fixed this problem and switched to using a microsecond resolution mtime.
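
To illustrate why the mtime resolution matters, here is a minimal C sketch.
It is not Amd's actual code: struct dir_mtime, mtime_newer() and
bump_mtime() are hypothetical names made up for this example.  With a 1
second resolution, two namespace changes within the same second leave the
directory mtime unchanged, so the kernel has no reason to drop its cached
entries.  The sketch keeps a seconds/microseconds pair and makes sure that
every update produces a strictly newer value, even if the clock did not
advance between two updates.

```c
#include <stddef.h>
#include <sys/time.h>

/* Hypothetical stand-in for an NFSv2-style timestamp; not Amd's own type. */
struct dir_mtime {
    long sec;   /* whole seconds */
    long usec;  /* microseconds, 0..999999 */
};

/* Return nonzero if a is strictly newer than b. */
int mtime_newer(const struct dir_mtime *a, const struct dir_mtime *b)
{
    if (a->sec != b->sec)
        return a->sec > b->sec;
    return a->usec > b->usec;
}

/*
 * Move *mt forward to the current time.  If the clock has not advanced
 * past the stored value (two updates within the same microsecond, or a
 * clock that stepped backwards), step forward by one microsecond instead,
 * so that every call yields a strictly newer mtime.
 */
void bump_mtime(struct dir_mtime *mt)
{
    struct timeval tv;
    struct dir_mtime now;

    gettimeofday(&tv, NULL);
    now.sec = (long)tv.tv_sec;
    now.usec = (long)tv.tv_usec;

    if (mtime_newer(&now, mt)) {
        *mt = now;
        return;
    }
    if (++mt->usec >= 1000000L) {
        mt->usec = 0;
        mt->sec++;
    }
}
```

A caller would invoke bump_mtime() on the parent directory's timestamp each
time an entry is added to or removed from that directory, and return the
result in the attributes sent back to the kernel.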

After fixing this in Amd, we went on to verify that things work for other
OSs.  When we got to test certain BSDs, we found out that they always cache
directory entries, and there is no way to turn this off completely.
Specifically, if we set the ac{reg,dir}{min,max} fields in struct nfs_args
all to zero, the kernel seems to cache the entries for a default number of
seconds (something like 5-30 seconds).  On some OSs, setting these four
fields to 0 turns off the attribute cache, but not on some BSDs.  We were
able to verify this using Amd and a script that exercises the interaction of
the kernel's attrcache and Amd.  (If you're interested, the script can be
made available.)

We then experimented by setting the ac{reg,dir}{min,max} fields in struct
nfs_args all to 1, the smallest non-zero value we could use.  When we ran
the Amd exercising script, we found that the value of 1 reduced the race
between the DNLC and Amd, and the script took a little longer to run before
it detected an incoherency.  That makes sense: the smaller the DNLC cache
interval is, the shorter the window of vulnerability is.  (BTW, the man
pages on some OSs say that the ac{reg,dir}{min,max} fields use a 1 second
resolution, but experimentation indicated it was in 0.1 second units.)

Clearly, setting the ac{reg,dir}{min,max} fields to 0 is worse than setting
them to 1 on those OSs that don't have a way to turn off the attribute cache.
So the current workaround I've implemented in am-utils is to create a
configuration parameter called "broken_attrcache" which, if turned on, will
set these nfs_args fields to 1 instead of 0.  I wish I didn't have to create
such ugly workaround features in Amd, but I've got no choice.

The near-term solution is for every OS to support a true 'noac' flag, which
can be added fairly easily.  This would make Amd work reliably.

The long-term solution is to implement Autofs support for all OSs and to
support it in Amd.  Currently, Amd supports autofs on Solaris and Linux;
FreeBSD is next.  Still, we found that even with autofs support, many
sysadmins still prefer to use the good ol' non-autofs mode.


* Confirmed Status

This is the confirmed status of various OSs' vulnerability to this attribute
cache bug.  We are slowly checking the status of other OSs.  The status of
any OS not listed is unknown as of the date at the top of this file.

** Not Vulnerable (support a proper "noac" flag):

Sun Solaris 8 and 9 (10 probably works fine)
Linux: 2.6.11 kernel (2.4.latest probably works fine)
FreeBSD 5.4 and 6.0-SNAP001 (older versions probably work fine)
OpenBSD 3.7 (older versions probably work fine)

** Vulnerable (don't support a proper "noac" flag natively):

NetBSD 2.0.2 (older versions are also probably affected)

Note: NetBSD has promised to support a noac flag hopefully after 2.1.0 is
released (maybe in 3.0 or 2.2).  In the meantime, you can apply one of
these two kernel patches to support a 'noac' flag in NetBSD 2.x or 3.x:
	ftp://ftp.netbsd.org/pub/NetBSD/misc/christos/2x.nfs.noac.diff
	ftp://ftp.netbsd.org/pub/NetBSD/misc/christos/3x.nfs.noac.diff
After applying the patch and rebuilding your kernel, reboot with the new
kernel.  Then copy the new nfs.h and nfsmount.h from /sys/nfs/ to
/usr/include/nfs/, and finally rebuild am-utils from scratch.

** Testing

When you build am-utils, a script named scripts/test-attrcache is built,
which can be used to test the NFS attribute cache behavior of the current
OS.
You can run this script as root as follows:

# make install
# cd scripts
# sh test-attrcache

If you run this script on an OS whose status is not yet known (that is, not
listed above), please report it to us via Bugzilla or the am-utils mailing
list (see www.am-utils.org), so we can record it in this file.

Sincerely,
Erez.