This file gives a quick overview of all the modifications that have been made
for zone verification.


Configuring the verifier
========================

Configuration (nsd.conf) options were added. In the new "verify:" clause:
	enable:
	port:
	ip-address:
	verify-zones:
	verifier:
	verifier-count:
	verifier-feed-zone:
	verifier-timeout:

And for the "zone:" and "pattern:" clauses:
	verify-zone:
	verifier:
	verifier-feed-zone:
	verifier-timeout:
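
As a rough illustration, a configuration using these options might look like
the fragment below. The option names are the ones listed above; every value
(port, address, verifier command, zone name, timeouts) is an assumption chosen
for the example and should not be read as a documented default.

```
# Illustrative values only.
verify:
	enable: yes
	port: 5347
	ip-address: 127.0.0.1
	verify-zones: yes
	verifier: ldns-verify-zone
	verifier-count: 1
	verifier-feed-zone: yes
	verifier-timeout: 30

zone:
	name: example.org
	verify-zone: yes
	verifier: ldns-verify-zone
	verifier-feed-zone: yes
	verifier-timeout: 60
```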

To parse the syntax for those options, configlexer.lex and configparser.y are
modified. To hold those configuration values, the structs nsd_options and
pattern_options in the file options.h are extended.

The type of pattern_options::verifier, char**, is the argument-vector form
that can be used by the exec family of functions (execve, execvp). The helper
type "struct component" is defined to help parse a command with arguments.
A zone_verifier is a list of STRING tokens. A stack of components is
constructed from those strings and is eventually converted to an argument
vector in configparser.y.
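
The conversion from a stack of components to an argument vector can be
sketched as follows. This is a minimal sketch, not the code in configparser.y:
the layout of struct component and the function name components_to_argv are
assumptions made for the example.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the parser's helper type: one command word
 * per node, with the most recently parsed word on top of the stack. */
struct component {
	struct component *next;
	char *str;
};

/* Convert a stack of components into a NULL-terminated argument vector.
 * The stack holds the words in reverse order, so the vector is filled
 * from the back. The strings are shared with the stack, not copied.
 * Returns NULL on allocation failure. */
char **components_to_argv(const struct component *top)
{
	size_t n = 0;
	const struct component *c;
	char **argv;

	for (c = top; c != NULL; c = c->next)
		n++;
	argv = malloc((n + 1) * sizeof(*argv));
	if (argv == NULL)
		return NULL;
	argv[n] = NULL;
	for (c = top; c != NULL; c = c->next)
		argv[--n] = c->str;
	return argv;
}
```

The resulting vector can be passed directly to execvp, with argv[0] naming the
verifier program.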


Difffile modifications
======================

It is possible that updates for multiple different zones are read during a
reload. If some should be loaded (because they verified or didn't need to be
verified) and some not, we have a problem because the database is updated
with all the updates (also the bad ones) and we cannot easily selectively
undo only the bad updates.

To resolve this situation the committed field of each transfer is used.
Initially it is assigned the value DIFF_NOT_COMMITTED (0). When an update is
verified this is changed to DIFF_COMMITTED (1), DIFF_CORRUPT (2) or
DIFF_INCONSISTENT (4) depending on whether the update was applied and verified
successfully. When a reload results in one or more zones being corrupt or
inconsistent, the newly forked server quits with exit status
NSD_RELOAD_FAILED and the parent server initiates a new reload. It is then
clear which updates should be merged with the database (the updates whose
committed field is neither DIFF_CORRUPT nor DIFF_INCONSISTENT).

	Handling of the NSD_RELOAD_FAILED exit status of a child reload server
	is in server_main (server.c).
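
The selection rule for the retried reload can be sketched as below. The four
constants and their values are taken from the text above; the two helper
functions are hypothetical and only illustrate the rule, they are not NSD's
actual code.

```c
#include <stddef.h>

/* Values as listed in the text above. */
enum diff_status {
	DIFF_NOT_COMMITTED = 0,
	DIFF_COMMITTED = 1,
	DIFF_CORRUPT = 2,
	DIFF_INCONSISTENT = 4
};

/* Hypothetical helper: an update is merged on the next reload unless it
 * was marked corrupt or inconsistent. */
int update_should_be_merged(int committed)
{
	return committed != DIFF_CORRUPT && committed != DIFF_INCONSISTENT;
}

/* Hypothetical helper: count the updates that make a reload fail and
 * therefore cause the parent to start a new reload. */
size_t count_bad_updates(const int *committed, size_t n)
{
	size_t i, bad = 0;

	for (i = 0; i < n; i++)
		if (!update_should_be_merged(committed[i]))
			bad++;
	return bad;
}
```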

To allow updates to be applied again on failure, xfrd has been updated to keep
all updates for each zone around until a reload succeeds. The set of updates
is fixed once a reload has been initiated to avoid a potentially infinite
loop. During the update window, xfrd will accept and transfer updates, but
does not schedule them until the reload finishes. As a result, xfrd manages
the updates stored on disk rather than the server, which previously just
removed each update during the reload process regardless of the result.
A consequence is that the same transfer may be tried multiple times if the
set of updates contained a bad update.


Running verifiers
=================

In server_reload (in server.c) the function server_verify is called just after
all updates are merged into the (in memory) database, but just before the new
database will be served. server_verify sets up a temporary event loop and
calls verify_zone repeatedly to run the verifiers and mark each updated zone.
server_reload then inspects the update status for each zone and communicates
the number of good and bad zones in the update. server_reload then decides how
to continue based on the number of good and bad zones as described above.

verify_zone is defined in verify.c (and .h). The function creates the
necessary pipes, starts the verifier and then sets up the required events and
registers them with the event loop.

The state for each verifier is maintained in an array of struct verifier. The
array has "verifier-count:" entries. Each verifier that runs simultaneously
is assigned a slot. When no free slots are available the server waits until a
running verifier has finished (or timed out) and a slot is free again for the
next verifier to run alongside the verifiers that are already running. The
default setting is to run just one verifier at a time, which will probably be
fine in most situations.
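
The slot bookkeeping can be sketched roughly as follows. This is an
illustration under assumptions: the reduced struct verifier_slot, the use of
pid -1 to mark a free slot, and both function names are invented for the
example and are not taken from verify.c.

```c
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical, reduced verifier state. */
struct verifier_slot {
	pid_t pid;	/* -1 marks a free slot */
};

/* Mark all slots as free; count corresponds to "verifier-count:". */
void slots_init(struct verifier_slot *slots, size_t count)
{
	size_t i;

	for (i = 0; i < count; i++)
		slots[i].pid = -1;
}

/* Return the first free slot, or NULL when every slot is in use and the
 * caller has to wait for a running verifier to exit. */
struct verifier_slot *slot_find_free(struct verifier_slot *slots, size_t count)
{
	size_t i;

	for (i = 0; i < count; i++)
		if (slots[i].pid == -1)
			return &slots[i];
	return NULL;
}
```

A caller would store the child's pid in the returned slot when it starts a
verifier and reset it to -1 once that verifier has exited.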

Once all verifiers are finished (or timed out), the event loop is exited and
server_reload communicates the status for each updated zone.


Environment variables for the verifiers
=======================================

Verifiers are informed on how a zone can be verified through environment
variables. The addresses and ports on which a verifier may query the zone to
be assessed are known at startup. The corresponding variables are set just
after reading the configuration and setting up the sockets in nsd.c, by
calling setup_verifier_environment (also in nsd.c).

Verifiers are spawned (via verify_zone) with popen3. verify_zone sets the zone
specific environment variables (VERIFY_ZONE and VERIFY_ZONE_ON_STDIN) just
before it executes the verifier with execvp. Server sockets are automatically
closed when the verifier is executed.
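
Setting the zone specific variables in the child just before execvp can be
sketched as below. This is not NSD's popen3: it is a plain fork/execvp sketch
without pipes that waits for the child, the function name run_verifier is
hypothetical, and the "yes"/"no" values for VERIFY_ZONE_ON_STDIN are an
assumption.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run a verifier command for one zone and return its exit status,
 * or -1 if it could not be started or did not exit normally. */
int run_verifier(const char *zone, int feed_zone, char *const argv[])
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0) {
		/* Child: export the per-zone variables, then exec. */
		setenv("VERIFY_ZONE", zone, 1);
		setenv("VERIFY_ZONE_ON_STDIN", feed_zone ? "yes" : "no", 1);
		execvp(argv[0], argv);
		_exit(127);	/* only reached when execvp failed */
	}
	if (waitpid(pid, &status, 0) < 0)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because the variables are set after the fork, they only affect that one
verifier and not the server process or other verifiers.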


Logging a verifier's standard output and error streams
======================================================

Everything a verifier outputs to stdout and stderr is logged in the nsd log
file. Handlers with handle_log_from_fd (verify.c) as a callback are set up by
server_verifiers_add. The log_from_fd_t struct is the user_data for the
handler. Besides the priority and the file descriptor, it contains variables
that handle_log_from_fd uses to make sure logged lines never exceed
LOGLINELEN in length and are split into parts if necessary.
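
The splitting rule can be sketched as follows. This is a simplified
illustration that works on a buffer already in memory; the real handler also
has to deal with partial reads from the file descriptor. The function names
are hypothetical, and the maximum length is passed as a parameter because the
value of LOGLINELEN is not given here.

```c
#include <stddef.h>
#include <string.h>

/* Length of the next piece of buf to log: up to (not including) the
 * first newline, but never more than max_len bytes. */
size_t next_log_piece(const char *buf, size_t len, size_t max_len)
{
	const char *nl = memchr(buf, '\n', len);
	size_t piece = nl ? (size_t)(nl - buf) : len;

	return piece > max_len ? max_len : piece;
}

/* Count how many log entries a buffer produces when over-long lines
 * are split into parts of at most max_len bytes. */
size_t count_log_entries(const char *buf, size_t len, size_t max_len)
{
	size_t entries = 0;

	if (max_len == 0)
		return 0;
	while (len > 0) {
		size_t piece = next_log_piece(buf, len, max_len);

		buf += piece;
		len -= piece;
		/* Skip the newline that ended this piece, if any. */
		if (len > 0 && *buf == '\n') {
			buf++;
			len--;
		}
		entries++;
	}
	return entries;
}
```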

Note that in practice error messages are always logged before messages on the
standard output, because stdout is buffered and stderr is not. It may be more
convenient to make stdout unbuffered too.


Feeding a zone to a verifier
============================

The complete zone may be fed to the standard input of a verifier when the
"verifier-feed-zone:" configuration option has value "yes" (the default). For
this purpose a verify_handle_feed (verify.c) handler is called when the
standard input file descriptor of the verifier is writable. The function
uses the zone_rr_iter_next (verify.c) function to get the next rr to
write to the verifier. The verifier_zone_feed struct is used to maintain state
(the file handle, the rr pretty printing state and the zone iterator).


Serving a zone to a verifier
============================

The nsd struct (in nsd.h) is extended with two arrays of nsd_socket structs,
verify_tcp and verify_udp, and a verify_ifs size_t which holds the number of
sockets for verifying. This mirrors the tcp, udp and ifs members that are used
for normal serving. Several parts of the code that operate on the tcp and udp
arrays are simply reused with the verify_tcp and verify_udp arrays.

Furthermore, in the places in server.c where the server_close_all_sockets
(server.c) function was previously called for the normal server sockets, it is
now also called for the verify sockets. Also in server_start_xfrd the
sockets for verifiers are closed in the xfrd child process, because it has no
need for them.


Verifier timeouts
=================

A handler for timeouts (as configured with the "verifier-timeout:" option) is
added by server_verifiers_add at verifier initialization time. The callback is
handle_verifier_timeout (verify.c) and the verifier_state_type for the verifier
is used as user_data.

verify_handle_timeout simply kills the verifier (by sending SIGTERM) and does
not clean up the verifier state for reuse. This is done in verify_handle_exit,
which is triggered once the verifier exits, because it can handle and start
more verifiers simultaneously.
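
The split between the timeout path and the exit path can be sketched as below.
Both function names are hypothetical, and the blocking waitpid stands in for
the event-driven exit handling described above.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Timeout path: only signal the verifier. No reaping and no state
 * cleanup happens here. */
int verifier_timeout(pid_t pid)
{
	return kill(pid, SIGTERM);
}

/* Exit path: reap the child. Returns 1 if it was terminated by SIGTERM,
 * 0 if it ended in another way, -1 on error. State cleanup for reuse of
 * the slot would happen here. */
int verifier_reap(pid_t pid)
{
	int status;

	if (waitpid(pid, &status, 0) < 0)
		return -1;
	return WIFSIGNALED(status) && WTERMSIG(status) == SIGTERM;
}
```

Keeping cleanup in a single place means a verifier that exits on its own and
one that was killed after a timeout are handled by the same code.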


Aborting the reload process (and killing all running verifiers)
===============================================================

A reload might (especially with a verifier) take some time. A parent server
process could be asked to quit during this time. If that happens and it has a
child reload server process, it sends the NSD_QUIT command over the
communication channel. verify_handle_command, which is registered when the
temporary event loop is created, is triggered and sends a SIGTERM signal to
each of the verifiers.


Refreshing and expiring zones
=============================

When the SOA-Refresh timer runs out, xfrd tries to fetch a fresh zone from
the master server. If that fails, it tries again every SOA-Retry interval. To
prevent a bad zone from being verified again and again, xfrd remembers the
last serial number of the zone that didn't verify. It will not try to transfer
a zone with the bad serial number again.

Previously, after reloading, the reload process informed xfrd which SOAs were
merged into the database, so that xfrd knew when zones needed to be refreshed.
This is adapted to inform xfrd about bad zones as well. The function
inform_xfrd_new_soas is called for this in server.c. It communicates either
good or bad SOAs. When bad SOAs are communicated a session starts with
NSD_BAD_SOA_BEGIN. For only good zones it starts with NSD_SOA_BEGIN. Each SOA
is preceded by a NSD_SOA_INFO. When all SOAs are communicated, NSD_SOA_END is
sent. Reception of these messages by xfrd is handled by the function
xfrd_handle_ipc_read in ipc.c. In the xfrd_state struct (in xfrd.h), the
boolean parent_bad_soa_infos is added to help with this control flow in ipc.c.
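
The message sequence of one session can be sketched as follows. The command
names are the ones used above, but their numeric values are placeholders, and
struct ipc_state, soa_session and ipc_handle_cmd are hypothetical names that
only illustrate the control flow.

```c
#include <stddef.h>

/* Command names from the text; the numeric values are placeholders. */
enum soa_cmd {
	NSD_SOA_BEGIN = 1,
	NSD_BAD_SOA_BEGIN,
	NSD_SOA_INFO,
	NSD_SOA_END
};

/* Hypothetical receiver state, reduced to the one flag named above. */
struct ipc_state {
	int parent_bad_soa_infos;
};

/* Sender side: write the command sequence for one session into out
 * (at most cap entries) and return the number of commands required. */
size_t soa_session(int bad, size_t nsoas, enum soa_cmd *out, size_t cap)
{
	size_t i, n = 0;

	if (n < cap)
		out[n] = bad ? NSD_BAD_SOA_BEGIN : NSD_SOA_BEGIN;
	n++;
	for (i = 0; i < nsoas; i++) {
		if (n < cap)
			out[n] = NSD_SOA_INFO;	/* precedes each SOA */
		n++;
	}
	if (n < cap)
		out[n] = NSD_SOA_END;
	n++;
	return n;
}

/* Receiver side: the BEGIN command decides whether the SOAs that
 * follow are treated as bad or as good. */
void ipc_handle_cmd(struct ipc_state *st, enum soa_cmd cmd)
{
	if (cmd == NSD_BAD_SOA_BEGIN)
		st->parent_bad_soa_infos = 1;
	else if (cmd == NSD_SOA_BEGIN)
		st->parent_bad_soa_infos = 0;
}
```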

The SOAs are eventually processed by xfrd, via xfrd_handle_ipc_SOAINFO in
ipc.c, with the xfrd_handle_incoming_soa function in xfrd.c. The function
makes sure that if a bad SOA was received it is remembered in the xfrd_zone
struct. Two new variables are added to this struct for this purpose: soa_bad
and soa_bad_acquired. The values are written to and read from the xfrd.state
file with the functions xfrd_write_state_soa and xfrd_read_state respectively.

In xfrd.c the function xfrd_parse_received_xfr_packet is adapted to make sure
that known bad serials are not transferred again unless the transfer is in
response to a notify. And even then only when the SOA matches the one in the
notify (if it contained one, otherwise any SOA is good).
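
The acceptance rule for a known bad serial can be sketched as below. The
struct and function are hypothetical reductions of the state described above,
not the logic as written in xfrd.c, and serials are compared for plain
equality only.

```c
#include <stdint.h>

/* Hypothetical, reduced view of the state the text describes. */
struct bad_serial_state {
	int have_bad;		/* a bad serial has been recorded */
	uint32_t bad_serial;	/* serial that failed verification */
	int in_notify;		/* transfer is in response to a notify */
	int notify_has_soa;	/* the notify carried a SOA */
	uint32_t notify_serial;	/* serial of that SOA, if present */
};

/* Return 1 if a transfer offering this serial may proceed, else 0. */
int transfer_allowed(const struct bad_serial_state *s, uint32_t serial)
{
	/* Serials not known to be bad are always acceptable. */
	if (!s->have_bad || serial != s->bad_serial)
		return 1;
	/* A known bad serial is only retried in response to a notify. */
	if (!s->in_notify)
		return 0;
	/* With a SOA in the notify, the serial has to match it. */
	return !s->notify_has_soa || serial == s->notify_serial;
}
```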