This file gives a quick overview of all the modifications that have been made
for zone verification.


Configuring the verifier
========================

Configuration (nsd.conf) options were added. In the new "verify:" clause:
    enable:
    port:
    ip-address:
    verify-zones:
    verifier:
    verifier-count:
    verifier-feed-zone:
    verifier-timeout:

And for the "zone:" and "pattern:" clauses:
    verify-zone:
    verifier:
    verifier-feed-zone:
    verifier-timeout:

To parse the syntax for those options, configlexer.lex and configparser.y are
modified. To hold those configuration values, the structs nsd_options and
pattern_options in the file options.h are extended.

The type of pattern_options::verifier, char**, is the argument vector form
that can be used by the execve family of executing functions. The helper
type "struct component" is defined to help parse a command with arguments.
A zone_verifier is a list of STRING tokens. A stack of components is
constructed from those strings, which is eventually converted to an argument
vector in configparser.y.


Difffile modifications
======================

It is possible that during a reload updates for multiple different zones are
read.
If some should be loaded (because they verified or didn't need to be
verified) and some not, we have a problem because the database is updated
with all the updates (also the bad ones) and we cannot easily selectively
undo only the bad updates.

To resolve this situation the committed field of each transfer is
utilized. Initially it will be assigned the value DIFF_NOT_COMMITTED (0).
When an update is verified this will be modified to DIFF_COMMITTED (1),
DIFF_CORRUPT (2) or DIFF_INCONSISTENT (4) depending on whether the update
was applied and verified successfully. When a reload resulted in one or
more zones being corrupt or inconsistent, the newly forked server will quit
with exit status NSD_RELOAD_FAILED and the parent server will initiate a new
reload. Then it is clear which updates should be merged with the database (the
updates whose committed field is neither DIFF_CORRUPT nor DIFF_INCONSISTENT).

  Handling of the NSD_RELOAD_FAILED exit status of a child reload server
  is in server_main (server.c).

To allow updates to be applied again on failure, xfrd has been updated to keep
all updates for each zone around until a reload succeeds. The set of updates
is fixed once a reload has been initiated to avoid a potentially infinite
loop. During the update window, xfrd will accept and transfer updates, but
does not schedule them until the reload finishes.
As a result, xfrd manages
the updates stored on disk rather than the server, which previously just
removed each update during the reload process regardless of the result.
This potentially results in the same transfer being tried multiple times if
the set of updates contained a bad update.


Running verifiers
=================

In server_reload (in server.c) the function server_verify is called just after
all updates are merged into the (in memory) database, but just before the new
database will be served. server_verify sets up a temporary event loop and calls
verify_zone repeatedly to run the verifiers and mark each updated zone.
server_reload then inspects the update status for each zone and communicates
the number of good and bad zones in the update. server_reload then decides how
to continue based on the number of good and bad zones as described above.

verify_zone is defined in verify.c (and .h). The function creates the
necessary pipes, starts the verifier and then sets up the required events and
registers them with the event loop.

The state for each verifier is maintained in an array of struct verifier. The
array has "verifier-count:" entries. Each verifier that runs
simultaneously is assigned a slot. When no free slots are available the server
waits until a running verifier is finished (or timed out) and a free slot is
available for a potential next verifier to run simultaneously with the already
running verifiers.
The default setting is to run just one verifier at once,
which will probably be fine in most situations.

Once all verifiers are finished (or timed out), the event loop is exited and
server_reload communicates the status for each updated zone.


Environment variables for the verifiers
=======================================

Verifiers are informed on how a zone can be verified through environment
variables. The addresses and ports on which a verifier may query the zone to
be assessed are set on startup, just after reading the configuration and
setting up the sockets in nsd.c, by calling setup_verifier_environment (also
in nsd.c).

Verifiers are spawned (via verify_zone) with popen3. verify_zone sets the zone
specific environment variables (VERIFY_ZONE and VERIFY_ZONE_ON_STDIN) just
before it executes the verifier with execvp. Server sockets are automatically
closed when the verifier is executed.


Logging a verifier's standard output and error streams
======================================================

Everything a verifier outputs to stdout and stderr is logged in the nsd log
file. Handlers with handle_log_from_fd (verify.c) as a callback are set up by
server_verifiers_add.
The log_from_fd_t struct is the user_data for the handler
and contains, besides the priority and the file descriptor, variables that are
used by handle_log_from_fd to make sure logged lines will never exceed
LOGLINELEN in length and will be split into parts if necessary.

Note that in practice error messages are always logged before messages on the
standard output, because stdout is buffered and stderr is not. Maybe it is more
convenient to set stdout to unbuffered too.


Feeding a zone to a verifier
============================

The complete zone may be fed to the standard input of a verifier when the
"verifier-feed-zone:" configuration option has value "yes" (the default). For
this purpose a verify_handle_feed (verify.c) handler is called when the
standard input file descriptor of the verifier is writable. The function
utilizes the zone_rr_iter_next (verify.c) function to get the next rr to
write to the verifier. The verifier_zone_feed struct is used to maintain state
(the file handle, the rr pretty printing state and the zone iterator).


Serving a zone to a verifier
============================

The nsd struct (in nsd.h) is extended with two arrays of nsd_socket structs,
verify_tcp and verify_udp, and a verify_ifs size_t which holds the number of
sockets for verifying. This mirrors the tcp, udp and ifs members that are used
for normal serving.
Several parts in the code that operate on the tcp and udp
arrays are simply reused with the verify_tcp and verify_udp arrays.

Furthermore, in places in server.c where the server_close_all_sockets
(server.c) function was previously used with the normal server sockets, the
function is now also called for the verify sockets. Also in server_start_xfrd
the sockets for verifiers are closed in the xfrd child process, because it has
no need for them.


Verifier timeouts
=================

A handler for timeouts (as configured with the "verifier-timeout:" option) is
added by server_verifiers_add at verifier initialization time. The callback is
handle_verifier_timeout (verify.c) and the verifier_state_type for the verifier
is used as user_data.

verify_handle_timeout simply kills the verifier (by sending SIGTERM) and does
not clean up the verifier state for reuse. This is done in verify_handle_exit,
which is triggered once the verifier exits, because it can handle and start
more verifiers simultaneously.


Aborting the reload process (and killing all running verifiers)
===============================================================

A reload might (especially with a verifier) take some time. A parent server
process could in this time be asked to quit.
If that happens and it has a child
reload server process, it sends the NSD_QUIT command over the communication
channel. verify_handle_command, which is registered when the temporary event
loop is created, is triggered and sends a SIGTERM signal to each of the
verifiers.


Refreshing and expiring zones
=============================

When the SOA-Refresh timer runs out, an attempt is made to fetch a fresh zone
from the master server. If that fails, the attempt is repeated every SOA-Retry
interval. To prevent a bad zone from being verified again and again, xfrd
remembers the last serial number of the zone that didn't verify. It will not
try to transfer a zone with the bad serial number again.

Previously, after reloading, the reload process informed xfrd which SOAs were
merged in the database, so that xfrd knew when a zone needed to be refreshed.
This is adapted to also inform xfrd about bad zones. The function
inform_xfrd_new_soas is called for this in server.c. It communicates either
good or bad soas. When bad soas are communicated a session starts with
NSD_BAD_SOA_BEGIN. For only good zones it starts with NSD_SOA_BEGIN. Each soa
is preceded by a NSD_SOA_INFO. When all soas are communicated, NSD_SOA_END is
sent. Reception of these messages by xfrd is handled by function
xfrd_handle_ipc_read in ipc.c. In the xfrd_state struct (in xfrd.h), the
boolean parent_bad_soa_infos is added to help with this control flow in ipc.
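
The session framing described above can be sketched as follows. This is only
an illustration, not the code in server.c or ipc.c: the sketch_ names, the
enum values and the counters are hypothetical, and the actual SOA payload that
follows each NSD_SOA_INFO is left out. Only the message order and the role of
the parent_bad_soa_infos flag are taken from the text.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the NSD_* commands; values are assumptions. */
enum sketch_cmd {
    SKETCH_SOA_BEGIN,
    SKETCH_BAD_SOA_BEGIN,
    SKETCH_SOA_INFO,
    SKETCH_SOA_END
};

/*
 * Sender side: write the command sequence for one session with `nsoas`
 * soas into `out`. Returns the number of commands written, or 0 when
 * `out` (capacity `cap`) is too small.
 */
size_t sketch_soa_session(int bad, size_t nsoas,
                          enum sketch_cmd *out, size_t cap)
{
    size_t n = 0, i;

    assert(out != NULL);
    if (cap < 2 || nsoas > cap - 2)
        return 0;
    out[n++] = bad ? SKETCH_BAD_SOA_BEGIN : SKETCH_SOA_BEGIN;
    for (i = 0; i < nsoas; i++)
        out[n++] = SKETCH_SOA_INFO;   /* each soa is preceded by an INFO */
    out[n++] = SKETCH_SOA_END;
    return n;
}

/* Receiver side: hypothetical, reduced view of the xfrd state. */
struct sketch_xfrd_state {
    int parent_bad_soa_infos;   /* set while inside a "bad" session */
    size_t good, bad;           /* counters, for illustration only */
};

void sketch_receive(struct sketch_xfrd_state *st, enum sketch_cmd cmd)
{
    assert(st != NULL);
    switch (cmd) {
    case SKETCH_SOA_BEGIN:
        st->parent_bad_soa_infos = 0;
        break;
    case SKETCH_BAD_SOA_BEGIN:
        st->parent_bad_soa_infos = 1;
        break;
    case SKETCH_SOA_INFO:
        if (st->parent_bad_soa_infos)
            st->bad++;
        else
            st->good++;
        break;
    case SKETCH_SOA_END:
        st->parent_bad_soa_infos = 0;
        break;
    }
}
```

Keeping a single flag on the receiver is enough here because a session is
either entirely good or entirely bad; the flag is reset when the session ends.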

The soas are eventually processed by xfrd, via xfrd_handle_ipc_SOAINFO in
ipc.c, with the xfrd_handle_incoming_soa function in xfrd.c. The function
makes sure that if a bad soa was received it is remembered in the xfrd_zone
struct. Two new variables are added to this struct for this purpose: soa_bad
and soa_bad_acquired. The values are written to and read from the xfrd.state
file with the functions xfrd_write_state_soa and xfrd_read_state respectively.

In xfrd.c function xfrd_parse_received_xfr_packet is adapted to make sure that
known bad serials are not transferred again unless the transfer is in
response to a notify. And even then only when the SOA matches the one in the
notify (if it contained one, otherwise any SOA is good).
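
The decision rule for known bad serials can be sketched as a small predicate.
This is a minimal sketch under assumptions, not the logic as written in
xfrd_parse_received_xfr_packet: the struct layouts, field names and the
function name are hypothetical, and only the rule itself follows the
description above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced view of what xfrd remembers about a zone. */
struct sketch_zone {
    int      bad_known;    /* nonzero once a bad serial was recorded */
    uint32_t bad_serial;   /* serial of the zone that did not verify */
};

/* Hypothetical description of the notify that triggered a transfer. */
struct sketch_notify {
    int      active;       /* transfer is in response to a notify */
    int      has_soa;      /* the notify carried a SOA */
    uint32_t serial;       /* serial from that SOA, if present */
};

/* Returns 1 when the offered serial may be transferred, 0 otherwise. */
int sketch_accept_serial(const struct sketch_zone *z,
                         const struct sketch_notify *n,
                         uint32_t offered)
{
    assert(z != NULL && n != NULL);
    if (!z->bad_known || offered != z->bad_serial)
        return 1;              /* not a known bad serial */
    if (!n->active)
        return 0;              /* bad serial and no notify: skip */
    if (!n->has_soa)
        return 1;              /* notify without SOA: any SOA is good */
    return n->serial == offered;
}
```

The notify exception gives the master a way to signal that a previously
rejected serial deserves another attempt.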