xref: /plan9/sys/doc/acidpaper.ms (revision 426d2b71458df9b491ba6c167f699b3f1f7b0428)
1.HTML "Acid: A Debugger Built From A Language
2.TL
3Acid: A Debugger Built From A Language
4.AU
5Phil Winterbottom
6philw@plan9.bell-labs.com
7.AB
8.FS
9Originally appeared in
10.I
11Proc. of the Winter 1994 USENIX Conf.,
12.R
13pp. 211-222,
14San Francisco, CA
15.FE
16Acid is an unusual source-level symbolic debugger for Plan 9. It is implemented
17as a language interpreter with specialized primitives that provide
18debugger support.  Programs written in the language manipulate
19one or more target processes; variables in the language represent the
20symbols, state, and resources of those processes.
21This structure allows complex
22interaction between the debugger and the target program and
23provides a convenient method of parameterizing differences between
24machine architectures.
25Although some effort is required to learn
26the debugging language, the richness and flexibility of the
27debugging environment encourages new ways of reasoning about the way
28programs run and the conditions under which they fail.
29.AE
30.NH
31Introduction
32.PP
33The size and complexity
34of programs have increased in proportion to processor speed and memory but
35the interface between debugger and programmer has changed little.
36Graphical user interfaces have eased some of the tedious
37aspects of the interaction. A graphical interface is a convenient
38means for navigating through source and data structures but provides
39little benefit for process control.
40The introduction of a new concurrent language, Alef [Win93], emphasized the
41inadequacies of the existing Plan 9 [Pike90] debugger
42.I db ,
43a distant relative of
44.I adb ,
45and made it clear that a new debugger was required.
46.PP
47Current debuggers like
48.I dbx ,
49.I sdb ,
50and
51.I gdb
52are limited to answering only the questions their authors
53envisage.  As a result, they supply a plethora
54of specialized commands, each attempting to anticipate
55a specific question a user may ask.
56When a debugging situation arises that is beyond the scope
57of the command set, the tool is useless.
58Further,
59it is often tedious or impossible to reproduce an anomalous state
60of the program, especially when
61the state is embedded in the program's data structures.
62.PP
63Acid applies some ideas found in CAD software used for
64hardware test and simulation.
65It is based on the notion that the state and resources of a program
66are best represented and manipulated by a language. The state and resources,
67such as memory, registers, variables, type information and source code
68are represented by variables in the language.
69Expressions provide a computation mechanism and control
70statements allow repetitive or selective interpretation based
71on the result of expression evaluation.
72The heart of the Acid debugger is an interpreter for a small typeless
73language whose operators mirror the operations
74of C and Alef, which in turn correspond well to the basic operations of
75the machine. The interpreter itself knows nothing of the underlying
76hardware; it deals with the program state and resources
77in the abstract.
78Fundamental routines to control
79processes, read files, and interface to the system are implemented
80as builtin functions available to the interpreter.
81The actual debugger functionality is coded
82in Acid; commands are implemented as Acid functions.
83.PP
84This language-based approach has several advantages.
85Most importantly, programs written in Acid, including most of the
86debugger itself, are inherently portable.
87Furthermore, Acid avoids the limitations other debuggers impose when
88debugging parallel programs.  Instead of embedding a fixed
89process model in the debugger, Acid allows the
90programmer to adapt the debugger to handle an
91arbitrary process partitioning or program structure.
92The ability to
93interact dynamically with an executing process provides clear advantages
94over debuggers constrained to probe a static image.
95Finally, the Acid language is a powerful vehicle for expressing
96assertions about logic, process state, and the contents of data structures.
97When combined with dynamic interaction it allows a
98limited form of automated program verification without requiring
99modification or recompilation of the source code.
100The language is also an
101excellent vehicle for preserving a test suite for later regression testing.
102.PP
103The debugger may be customized by its users; standard
104functions may be modified or extended to suit a particular application
105or preference.
106For example, the kernel developers in our group require a
107command set supporting assembler-level debugging while the application
108programmers prefer source-level functionality.
109Although the default library is biased toward assembler-level debugging,
110it is easily modified to provide a convenient source-level interface.
111The debugger itself does not change; the user combines primitives
112and existing Acid functions in different ways to
113implement the desired interface.
114.NH
115Related Work
116.PP
117DUEL [Gol93], an extension to
118.I gdb
119[Stal91], proposes using a high level expression evaluator to solve
120some of these problems. The evaluator provides iterators to loop over data
121structures and conditionals to control evaluation of expressions.
122The author shows that complex state queries can be formulated
123by combining concise expressions but this only addresses part of the problem.
124A program is a dynamic entity; questions asked when the program is in
125a static state are meaningful only after the program has been `caught' in
126that state. The framework for manipulating the program is still as
127primitive as the underlying debugger. While DUEL provides a means to
128probe data structures it entirely neglects the most beneficial aspect
129of debugging languages: the ability to control processes. Acid is structured
130around a thread of control that passes between the interpreter and the
131target program.
132.PP
133The NeD debugger [May92] is a set of extensions to TCL [Ous90] that provide
134debugging primitives. The resulting language, NeDtcl, is used to implement
135a portable interface between a conventional debugger, pdb [May90], and
136a server that executes NeDtcl programs operating on the target program.
137Execution of the NeDtcl programs implements the debugging primitives
138that pdb expects.
139NeD is targeted at multi-process debugging across a network,
140and proves the flexibility of a language as a means of
141communication between debugging tools. Whereas NeD provides an interface
142between a conventional debugger and the process it debugs, Acid is the
143debugger itself. While NeD has some of the ideas
144found in Acid it is targeted toward a different purpose. Acid seeks to
145integrate the manipulation of a program's resources into the debugger
146while NeD provides a flexible interconnect between components of
147the debugging environment. The choice of TCL is appropriate for its use
148in NeD but is not suitable for Acid. Acid relies on the coupling of the type
149system with expression evaluation, which are the root of its design,
150to provide the debugging primitives.
151.PP
152Dalek [Ols90] is an event based language extension to gdb. State transitions
153in the target program cause events to be queued for processing by the
154debugging language.
155.PP
156Acid has many of the advantages of same process or
157.I local
158.I agent
159debuggers, like Parasight [Aral], without the need for dynamic linking or
160shared memory.
161Acid improves on the ideas of these other systems by completely integrating
162all aspects of the debugging process into the language environment. Of
163particular importance is the relationship between Acid variables,
164program symbols, source code, registers and type information. This
165integration is made possible by the design of the Acid language.
166.PP
167Interpreted languages such as Lisp and Smalltalk are able to provide
168richer debugging environments through more complete information than
169their compiled counterparts. Acid is a means to gather and represent
170similar information about compiled programs through cooperation
171with the compilation tools and library implementers.
172.NH
173Acid the Language
174.PP
175Acid is a small interpreted language targeted to its debugging task.
176It focuses on representing program state and addressing data rather than
177expressing complex computations. Program state is
178.I addressable
179from an Acid program.
180In addition to parsing and executing expressions and providing
181an architecture-independent interface to the target process,
182the interpreter supplies a mark-and-scan garbage collector
183to manage storage.
184.PP
185Every Acid session begins with the loading of the Acid libraries.
186These libraries contain functions, written in Acid, that provide
187a standard debugging environment including breakpoint management,
188stepping by instruction or statement, stack tracing, and
189access to variables, memory, and registers.
190The library contains 600 lines of Acid code and provides
191functionality similar to
192.I dbx .
193Following the loading of the system library, Acid loads
194user-specified libraries; this load sequence allows the
195user to augment or override the standard commands
196to customize the debugging environment.  When all libraries
197are loaded, Acid issues an interactive prompt and begins
198evaluating expressions entered by the user.  The Acid `commands'
199are actually invocations of builtin primitives or previously defined
200Acid functions. Acid evaluates each expression as it is entered and
201prints the result.
202.NH
203Types and Variables
204.PP
205Acid variables are of four basic types:
206.I integer ,
207.I string ,
208.I float ,
209and
210.I list .
211The type of a variable is inferred by the type of the right-hand side of
212an assignment expression.
213Many of the operators can be applied to more than
214one type; for these operators the action of the operator is determined
215by the type of its operands.
216For example,
217the
218.CW +
219operator adds
220.I integer
221and
222.I float
223operands, and concatenates
224.I string
225and
226.I list
227operands.
228Lists are the only complex type in Acid; there are no arrays, structures
229or pointers. Operators provide
230.CW head ,
231.CW tail ,
232.CW append
233and
234.CW delete
235operations.
236Lists can also be indexed like arrays.
237.PP
238Acid has two levels of scope: global and local.
239Function parameters and variables declared in a function body
240using the
241.CW local
242keyword are created at entry to the function and
243exist for the lifetime of a function.
244Global variables are created by assignment and need not be declared.
245All variables and functions in the program
246being debugged are entered in the Acid symbol table as global
247variables during Acid initialization.
248Conflicting variable names are resolved by prefixing enough `$' characters
249to make them unique.
250Syntactically, Acid variables and target program
251symbols are referenced identically.
252However, the variables are managed differently in the Acid
253symbol table and the user must be aware of this distinction.
254The value of an Acid variable is stored in the symbol
255table; a reference returns the value.
256The symbol table entry for a variable or function in the target
257program contains the address of that symbol in the image
258of the program.  Thus, the value of a program variable is
259accessed by indirect reference through the Acid
260variable that has the same name; the value of an Acid variable is the
261address of the corresponding program variable.
262.NH
263Control Flow
264.PP
265The
266.CW while
267and
268.CW loop
269statements implement looping.
270The former
271is similar to the same statement in C.
272The latter evaluates starting and ending expressions yielding
273integers and iterates while an incrementing loop index
274is within the bounds of those expressions.
275.P1
276acid: i = 0; loop 1,5 do print(i=i+1)
2770x00000001
2780x00000002
2790x00000003
2800x00000004
2810x00000005
282acid:
283.P2
284The traditional
285.CW if-then-else
286statement implements conditional execution.
287.NH
288Addressing
289.PP
290Two indirection operators allow Acid to access values in
291the program being debugged.
292The
293.CW *
294operator fetches a value from the memory image of an
295executing process;
296the
297.CW @
298operator fetches a value from the text file of the process.
299When either operator appears on the left side of an assignment, the value
300is written rather than read.
301.PP
302The indirection operator must know the size of the object
303referenced by a variable.
304The Plan 9 compilers neglect to include this
305information in the program symbol table, so Acid cannot
306derive this information implicitly.
307Instead Acid variables have formats.
308The format is a code
309letter specifying the printing style and the effect of some of the
310operators on that variable.
311The indirection operators look at the format code to determine the
312number of bytes to read or write.
313The format codes are derived from the format letters used by
314.I db .
315By default, symbol table variables and numeric constants
316are assigned the format code
317.CW 'X'
318which specifies 32-bit hexadecimal.
319Printing such a variable yields output of the form
320.CW 0x00123456 .
321An indirect reference through the variable fetches 32 bits
322of data at the address indicated by the variable.
323Other formats specify various data types, for example
324.CW i
325an instruction,
326.CW D
327a signed 32 bit decimal,
328.CW s
329a null-terminated string.
330The
331.CW fmt
332function
333allows the user to change the format code of a variable
334to control the printing format and
335operator side effects.
336This function evaluates the expression supplied as the first
337argument, attaches the format code supplied as the second
338argument to the result and returns that value.
339If the result is assigned to a variable,
340the new format code applies to
341that variable.  For convenience, Acid provides the
342.CW \e
343operator as a shorthand infix form of
344.CW fmt .
345For example:
346.P1
347acid: x=10
348acid: x				 // print x in hex
3490x0000000a
350acid: x = fmt(x, 'D')		 // make x type decimal
351acid: print(x, fmt(x, 'X'), x\eX) // print x in decimal & hex
35210 0x0000000a 0x0000000a
353acid: x				 // print x in decimal
35410
355acid: x\eo			 // print x in octal
356000000000012
357.P2
358The
359.CW ++
360and
361.CW --
362operators increment or decrement a variable by an amount
363determined by its format code.  Some formats imply a non-fixed size.
364For example, the
365.CW i
366format code disassembles an instruction into a string.
367On a 68020, which has variable length instructions:
368.P1
369acid: p=main\ei                     // p=addr(main), type INST
370acid: loop 1,5 do print(p\eX, @p++) // disassemble 5 instr's
3710x0000222e LEA	0xffffe948(A7),A7
3720x00002232 MOVL	s+0x4(A7),A2
3730x00002236 PEA	0x2f($0)
3740x0000223a MOVL	A2,-(A7)
3750x0000223c BSR	utfrrune
376acid:
377.P2
378Here,
379.CW main
380is the address of the function of the same name in the program under test.
381The loop retrieves the five instructions beginning at that address and
382then prints the address and the assembly language representation of each.
383Notice that the stride of the increment operator varies with the size of
384the instruction: the
385.CW MOVL
386at
387.CW 0x0000223a
388is a two byte instruction while all others are four bytes long.
389.PP
390Registers are treated as normal program variables referenced
391by their symbolic assembler language names.
392When a
393process stops, the register set is saved by the kernel
394at a known virtual address in the process memory map.
395The Acid variables associated with the registers point
396to the saved values and the
397.CW *
398indirection operator can then be used to read and write the register set.
399Since the registers are accessed via Acid variables they may
400be used in arbitrary expressions.
401.P1
402acid: PC                            // addr of saved PC
4030xc0000f60
404acid: *PC
4050x0000623c                          // contents of PC
406acid: *PC\ea
407main
408acid: *R1=10                        // modify R1
409acid: asm(*PC+4)                    // disassemble @ PC+4
410main+0x4 0x00006240 	MOVW	R31,0x0(R29)
411main+0x8 0x00006244 	MOVW	$setR30(SB),R30
412main+0x10 0x0000624c 	MOVW	R1,_clock(SB)
413.P2
414Here, the saved
415.CW PC
416is stored at address
417.CW 0xc0000f60 ;
418its current content is
419.CW 0x0000623c .
420The
421.CW a ' `
422format code converts this value to a string specifying
423the address as an offset beyond the nearest symbol.
424After setting the value of register
425.CW 1 ,
426the example uses the
427.CW asm
428command to disassemble a short section of code beginning
429at four bytes beyond the current value of the
430.CW PC .
431.NH
432Process Interface
433.PP
434A program executing under Acid is monitored through the
435.I proc
436file system interface provided by Plan 9.
437Textual messages written to the
438.CW ctl
439file control the execution of the process.
440For example writing
441.CW waitstop
442to the control file causes the write to block until the target
443process enters the kernel and is stopped. When the process is stopped
444the write completes. The
445.CW startstop
446message starts the target process and then does a
447.CW waitstop
448action.
449Synchronization between the debugger and the target process is determined
450by the actions of the various messages. Some operate asynchronously to the
451target process and always complete immediately, others block until the
452action completes. The asynchronous messages allow Acid to control
453several processes simultaneously.
454.PP
455The interpreter has builtin functions named after each of the control
456messages. The functions take a process id as argument.
457Any time a control message causes the program to execute instructions
458the interpreter performs two actions when the control operation has completed.
459The Acid variables pointing at the register set are fixed up to point
460at the saved registers, and then
461the user defined function
462.CW stopped
463is executed.
464The
465.CW stopped
466function may print the current address,
467line of source or instruction and return to interactive mode. Alternatively
468it may traverse a complex data structure, gather statistics and then set
469the program running again.
470.PP
471Several Acid variables are maintained by the debugger rather than the
472programmer.
473These variables allow generic Acid code to deal with the current process,
474architecture specifics or the symbol table.
475The variable
476.CW pid
477is the process id of the current process Acid is debugging.
478The variable
479.CW symbols
480contains a list of lists where each sublist contains the symbol
481name, its type and the value of the symbol.
482The variable
483.CW registers
484contains a list of the machine-specific register names. Global symbols in the target program
485can be referenced directly by name from Acid. Local variables
486are referenced using the colon operator as \f(CWfunction:variable\fP.
487.NH
488Source Level Debugging
489.PP
490Acid provides several builtin functions to manipulate source code.
491The
492.CW file
493function reads a text file, inserting each line into a list.
494The
495.CW pcfile
496and
497.CW pcline
498functions each take an address as an argument.
499The first
500returns a string containing the name of the source file
501and the second returns an integer containing the line number
502of the source line containing the instruction at the address.
503.P1
504acid: pcfile(main)		// file containing main
505main.c
506acid: pcline(main)		// line # of main in source
50711
508acid: file(pcfile(main))[pcline(main)]	// print that line
509main(int argc, char *argv[])
510acid: src(*PC)			// print statements nearby
511 9
512 10 void
513>11 main(int argc, char *argv[])
514 12 {
515 13	int a;
516.P2
517In this example, the three primitives are combined in an expression to print
518a line of source code associated with an address.
519The
520.CW src
521function prints a few lines of source
522around the address supplied as its argument. A companion routine,
523.CW Bsrc ,
524communicates with the external editor
525.CW sam .
526Given an address, it loads the corresponding source file into the editor
527and highlights the line containing the address.  This simple interface
528is easily extended to more complex functions.
529For example, the
530.CW step
531function can select the current file and line in the editor
532each time the target program stops, giving the user a visual
533trace of the execution path of the program. A more complete interface
534allowing two way communication between Acid and the
535.CW acme
536user interface [Pike93] is under construction. A filter between the debugger
537and the user interface provides interpretation of results from both
538sides of the interface. This allows the programming environment to
539interact with the debugger and vice-versa, a capability missing from the
540.CW sam
541interface.
542The
543.CW src
544and
545.CW Bsrc
546functions are both written in Acid code using the file and line primitives.
547Acid provides library functions to step through source level
548statements and functions. Furthermore, addresses in Acid expressions can be
549specified by source file and line.
550Source code is manipulated in the Acid
551.I list
552data type.
553.NH
554The Acid Library
555.PP
556The following examples define some useful commands and
557illustrate the interaction of the debugger and the interpreter.
558.P1
559defn bpset(addr)                          // set breakpoint
560{
561	if match(addr, bplist) >= 0 then
562		print("bkpoint already set:", addr\ea, "\en");
563	else {
564		*fmt(addr, bpfmt) = bpinst;   // plant it
565		bplist = append bplist, addr; // add to list
566	}
567}
568.P2
569The
570.CW bpset
571function plants a break point in memory. The function starts by
572using the
573.CW match
574builtin to
575search the breakpoint list to determine if a breakpoint is already
576set at the address.
577The indirection operator, controlled by the format code returned
578by the
579.CW fmt
580primitive, is used to plant the breakpoint in memory.
581The variables
582.CW bpfmt
583and
584.CW bpinst
585are Acid global variables containing the format code specifying
586the size of the breakpoint instruction and the breakpoint instruction
587itself.
588These
589variables are set by architecture-dependent library code
590when the debugger first attaches to the executing image.
591Finally the address of the breakpoint is
592appended to the breakpoint list,
593.CW bplist .
594.P1
595defn step()				// single step
596{
597	local lst, lpl, addr, bput;
598
599	bput = 0;			// sitting on bkpoint
600	if match(*PC, bplist) >= 0 then {
601		bput = fmt(*PC, bpfmt);	// save current addr
602		*bput = @bput;		// replace it
603	}
604
605	lst = follow(*PC);		// get follow set
606
607	lpl = lst;
608	while lpl do {			// place breakpoints
609		*(head lpl) = bpinst;
610		lpl = tail lpl;
611	}
612
613	startstop(pid);			// do the step
614
615	while lst do {			// remove breakpoints
616		addr = fmt(head lst, bpfmt);
617		*addr = @addr;		// replace instr.
618		lst = tail lst;
619	}
620	if bput != 0 then
621		*bput = bpinst;		// restore breakpoint
622}
623.P2
624The
625.CW step
626function executes a single assembler instruction.
627If the
628.CW PC
629is sitting
630on a breakpoint, the address and size of
631the breakpoint are saved.
632The breakpoint instruction
633is then removed using the
634.CW @
635operator to fetch
636.CW bpfmt
637bytes from the text file and to place it into the memory
638of the executing process using the
639.CW *
640operator.
641The
642.CW follow
643function is an Acid
644builtin which returns a follow-set: a list of instruction addresses which
645could be executed next.
646If the instruction stored at the
647.CW PC
648is a branch instruction, the
649list contains the addresses of the next instruction and
650the branch destination; otherwise, it contains only the
651address of the next instruction.
652The follow-set is then used to replace each possible following
653instruction with a breakpoint instruction.  The original
654instructions need not be saved; they remain
655in their unaltered state in the text file.
656The
657.CW startstop
658builtin writes the `startstop' message to the
659.I proc
660control file for the process named
661.CW pid .
662The target process executes until some condition causes it to
663enter the kernel, in this case, the execution of a breakpoint.
664When the process blocks, the debugger regains control and invokes the
665Acid library function
666.CW stopped
667which reports the address and cause of the blockage.
668The
669.CW startstop
670function completes and returns to the
671.CW step
672function where
673the follow-set is used to replace the breakpoints placed earlier.
674Finally, if the address of the original
675.CW PC
676contained a breakpoint, it is replaced.
677.PP
678Notice that this approach to process control is inherently portable;
679the Acid code is shared by the debuggers for all architectures.
680Acid variables and builtin functions provide a transparent interface
681to architecture-dependent values and functions.  Here the breakpoint
682value and format are referenced through Acid variables and the
683.CW follow
684primitive masks the differences in the underlying instruction set.
685.PP
686The
687.CW next
688function, similar to the
689.I dbx
690command of the same name,
691is a simpler example.
692This function steps through
693a single source statement but steps over function calls.
694.P1
695defn next()
696{
697	local sp, bound;
698
699	sp = *SP;			// save starting SP
700	bound = fnbound(*PC);		// begin & end of fn.
701	stmnt();			// step 1 statement
702	pc = *PC;
703	if pc >= bound[0] && pc < bound[1] then
704		return {};
705
706	while (pc<bound[0] || pc>bound[1]) && sp>=*SP do {
707		step();
708		pc = *PC;
709	}
710	src(*PC);
711}
712.P2
713The
714.CW next
715function
716starts by saving the current stack pointer in a local variable.
717It then uses the Acid library function
718.CW fnbound
719to return the addresses of the first and last instructions in
720the current function in a list.
721The
722.CW stmnt
723function executes a single source statement and then uses
724.CW src
725to print a few lines of source around the new
726.CW PC .
727If the new value of the
728.CW PC
729remains in the current function,
730.CW next
731returns.
732When the executed statement is a function call or a return
733from a function, the new value of the
734.CW PC
735is outside the bounds calculated by
736.CW fnbound
737and the test of the
738.CW while
739loop is evaluated.
740If the statement was a return, the new value of the stack pointer
741is greater than the original value and the loop completes without
742execution.
743Otherwise, the loop is entered and instructions are continually
744executed until the value of the
745.CW PC
746is between the bounds calculated earlier.  At that point, execution
747ceases and a few lines of source in the vicinity of the
748.CW PC
749are printed.
750.PP
751Acid provides concise and elegant expression for control and
752manipulation of target programs. These examples demonstrate how a
753few well-chosen primitives can be combined to create a rich debugging environment.
754.NH
755Dealing With Multiple Architectures
756.PP
757A single binary of Acid may be used to debug a program running on any
758of the five processor architectures supported by Plan 9.  For example,
759Plan 9 allows a user on a MIPS to import the
760.I proc
761file system from an i486-based PC and remotely debug a program executing
762on that processor.
763.PP
764Two levels of abstraction provide this architecture independence.
765On the lowest level, a Plan 9 library supplies functions to
766decode the file header of the program being debugged and
767select a table of system parameters
768and a jump vector of architecture-dependent
769functions based on the magic number.
770Among these functions are byte-order-independent
771access to memory and text files, stack manipulation, disassembly,
772and floating point number interpretation.
773The second level of abstraction is supplied by Acid.
774It consists of primitives and approximately 200 lines
775of architecture-dependent Acid library code that interface the
776interpreter to the architecture-dependent library.
777This layer performs functions such as mapping register names to
778memory locations, supplying breakpoint values and sizes,
779and converting processor specific data to Acid data types.
780An example of the latter is the stack trace function
781.CW strace ,
782which uses the stack traversal functions in the
783architecture-dependent library to construct a list of lists describing
784the context of a process.  The first level of list selects
785each function in the trace; subordinate lists contain the
786names and values of parameters and local variables of
787the functions.  Acid commands and library functions that
788manipulate and display process state information operate
789on the list representation and are independent of the
790underlying architecture.
791.NH
792Alef Runtime
793.PP
794Alef is a concurrent programming language,
795designed specifically for systems programming, which supports both
796shared variable and message passing paradigms.
797Alef borrows the C expression syntax but implements
798a substantially different type system.
799The language provides a rich set of
800exception handling, process management, and synchronization
801primitives, which rely on a runtime system.
802Alef program bugs are often deadlocks, synchronization failures,
803or non-termination caused by locks being held incorrectly.
804In such cases, a process stalls deep
805in the runtime code and it is clearly
806unreasonable to expect a programmer using the language
807to understand the detailed
808internal semantics of the runtime support functions.
809.PP
810Instead, there is an Alef support library, coded in Acid, that
811allows the programmer to interpret the program state in terms of
812Alef operations.  Consider the example of a multi-process program
813stalling because of improper synchronization.  A stack trace of
814the program indicates that it is waiting for an event in some
815obscure Alef runtime
816synchronization function.
817The function itself is irrelevant to the
818programmer; of greater importance is the identity of the
819unfulfilled event.
820Commands in the Alef support library decode
821the runtime data structures and program state to report the cause
822of the blockage in terms of the high-level operations available to
823the Alef programmer.
824Here, the Acid language acts
825as a communications medium between Alef implementer and Alef user.
826.NH
827Parallel Debugging
828.PP
829The central issue in parallel debugging is how the debugger is
830multiplexed between the processes comprising
831the program.
832Acid has no intrinsic model of process partitioning; it
833only assumes that parallel programs share a symbol table,
834though they need not share memory.
835The
836.CW setproc
837primitive attaches the debugger to a running process
838associated with the process ID supplied as its argument
839and assigns that value to the global variable
840.CW pid ,
841thereby allowing simple rotation among a group of processes.
842Further, the stack trace primitive is driven by parameters
843specifying a unique process context, so it is possible to
844examine the state of cooperating processes without switching
845the debugger focus from the process of interest.
846Since Acid is inherently extensible and capable of
847dynamic interaction with subordinate processes, the
848programmer can define Acid commands to detect and control
849complex interactions between processes.
850In short, the programmer is free to specify how the debugger reacts
851to events generated in specific threads of the program.
852.PP
853The support for parallel debugging in Acid depends on a crucial kernel
854modification: when the text segment of a program is written (usually to
855place a breakpoint), the segment is cloned to prevent other threads
856from encountering the breakpoint.  Although this incurs a slight performance
857penalty, it is of little importance while debugging.
858.NH
859Communication Between Tools
860.PP
861The Plan 9 Alef and C compilers do not
862embed detailed type information in the symbol table of an
863executable file.
864However, they do accept a command line option causing them to
865emit descriptions of complex data types
866(e.g., aggregates and abstract data types)
867to an auxiliary file.
868The vehicle for expressing this information is Acid source code.
869When an Acid debugging session is
870subsequently started, that file is loaded with the other Acid libraries.
871.PP
872For each complex object in the program the compiler generates
873three pieces of Acid code.
874The first is a table describing the size and offset of each
875member of the complex data type.  Following is an Acid function,
876named the same as the object, that formats and prints each member.
877Finally, Acid declarations associate the
878Alef or C program variables of a type with the functions
879to print them.
880The three forms of declaration are shown in the following example:
881.P1
882struct Bitmap {
883	Rectangle    0 r;
884	Rectangle   16 clipr;
885	'D'   32 ldepth;
886	'D'   36 id;
887	'X'   40 cache;
888};
889.P2
890.P1
891defn
892Bitmap(addr) {
893	complex Bitmap addr;
894	print("Rectangle r {\en");
895	Rectangle(addr.r);
896	print("}\en");
897	print("Rectangle clipr {\en");
898	Rectangle(addr.clipr);
899	print("}\en");
900	print("	ldepth	", addr.ldepth, "\en");
901	print("	id	", addr.id, "\en");
902	print("	cache	", addr.cache, "\en");
903};
904
905complex Bitmap darkgrey;
906complex Bitmap Window_settag:b;
907.P2
908The
909.CW struct
910declaration specifies decoding instructions for the complex type named
911.CW Bitmap .
912Although the syntax is superficially similar to a C structure declaration,
913the semantics differ markedly: the C declaration specifies a layout, while
914the Acid declaration tells how to decode it.
915The declaration specifies a type, an offset, and name for each
916member of the complex object. The type is either the name of another
917complex declaration, for example,
918.CW Rectangle ,
919or a format code.
920The offset is the number of bytes from the start
921of the object to the member
922and the name is the member's name in the Alef or C declaration.
923This type description is a close match for C and Alef, but is simple enough
924to be language independent.
925.PP
926The
927.CW Bitmap
928function expects the address of a
929.CW Bitmap
930as its only argument.
931It uses the decoding information contained in the
932.CW Bitmap
933structure declaration to extract, format, and print the
934value of each member of the complex object pointed to by
935the argument.
936The Alef compiler emits code to call other Acid functions
937where a member is another complex type; here,
938.CW Bitmap
939calls
940.CW Rectangle
941to print its contents.
942.PP
943The
944.CW complex
945declarations associate Alef variables with complex types.
946In the example,
947.CW darkgrey
948is the name of a global variable of type
949.CW Bitmap
950in the program being debugged.
951Whenever the name
952.CW darkgrey
953is evaluated by Acid, it automatically calls the
954.CW Bitmap
955function with the address of
956.CW darkgrey
957as the argument.
958The second
959.CW complex
960declaration associates a local variable or parameter named
961.CW b
962in function
963.CW Window_settag
964with the
965.CW Bitmap
966complex data type.
967.PP
968Acid borrows the C operators
969.CW .
970and
971.CW ->
972to access the decoding parameters of a member of a complex type.
973Although this representation is sufficiently general for describing
974the decoding of both C and Alef complex data types, it may
975prove too restrictive for target languages with more complicated
976type systems.
977Further, the assumption that the compiler can select the proper
978Acid format code for each basic type in the language is somewhat
979naive.  For example, when a member of a complex type is a pointer,
980it is assigned a hexadecimal type code; integer members are always
981assigned a decimal type code.
982This heuristic proves inaccurate when an integer field is a
983bit mask or set of bit flags which are more appropriately displayed
984in hexadecimal or octal.
985.NH
986Code Verification
987.PP
988Acid's ability to interact dynamically with
989an executing program allows passive test and
990verification of the target program.  For example,
991a common concern is leak detection in programs using
992.CW malloc .
993Of interest are two items: finding memory that was allocated
994but never freed and detecting bad pointers passed to
995.CW free .
996An auxiliary Acid library contains Acid functions to
997monitor the execution of a program and detect these
998faults, either as they happen or in the automated
999post-mortem analysis of the memory arena.
1000In the following example, the
1001.CW sort
1002command is run under the control of the
1003Acid memory leak library.
1004.P1
1005helix% acid -l malloc /bin/sort
1006/bin/sort: mips plan 9 executable
1007/lib/acid/port
1008/lib/acid/mips
1009/lib/acid/malloc
1010acid: go()
1011now
1012is
1013the
1014time
1015<ctrl-d>
1016is
1017now
1018the
1019time
102027680 : breakpoint	_exits+0x4	MOVW	$0x8,R1
1021acid:
1022.P2
1023The
1024.CW go
1025command creates a process and plants
1026breakpoints at the entry to
1027.CW malloc
1028and
1029.CW free .
1030The program is then started and continues until it
1031exits or stops.  If the reason for stopping is anything
1032other than the breakpoints in
1033.CW malloc
1034and
1035.CW free ,
1036Acid prints the usual status information and returns to the
1037interactive prompt.
1038.PP
1039When the process stops on entering
1040.CW malloc ,
1041the debugger must capture and save the address that
1042.CW malloc
1043will return.
1044After saving a stack
1045trace so the calling routine can be identified, it places
1046a breakpoint at the return address and restarts the program.
1047When
1048.CW malloc
1049returns, the breakpoint stops the program,
1050allowing the debugger
1051to grab the address of the new memory block from the return register.
1052The address and stack trace are added to the list of outstanding
1053memory blocks, the breakpoint is removed from the return point, and
1054the process is restarted.
1055.PP
1056When the process stops at the beginning of
1057.CW free ,
1058the memory address supplied as the argument is compared to the list
1059of outstanding memory blocks.  If it is not found an error message
1060and a stack trace of the call is reported; otherwise, the
1061address is deleted from the list.
1062.PP
1063When the program exits, the list of outstanding memory blocks contains
1064the addresses of all blocks that were allocated but never freed.
1065The
1066.CW leak
1067library function traverses the list producing a report describing
1068the allocated blocks.
1069.P1 1m
1070acid: leak()
1071Lost a total of 524288 bytes from:
1072    malloc() malloc.c:32 called from dofile+0xe8 sort.c:217
1073    dofile() sort.c:190 called from main+0xac sort.c:161
1074    main() sort.c:128 called from _main+0x20 main9.s:10
1075Lost a total of 64 bytes from:
1076    malloc() malloc.c:32 called from newline+0xfc sort.c:280
1077    newline() sort.c:248 called from dofile+0x110 sort.c:222
1078    dofile() sort.c:190 called from main+0xac sort.c:161
1079    main() sort.c:128 called from _main+0x20 main9.s:10
1080Lost a total of 64 bytes from:
1081    malloc() malloc.c:32 called from realloc+0x14 malloc.c:129
1082    realloc() malloc.c:123 called from bldkey+0x358 sort.c:1388
1083    buildkey() sort.c:1345 called from newline+0x150 sort.c:285
1084    newline() sort.c:248 called from dofile+0x110 sort.c:222
1085    dofile() sort.c:190 called from main+0xac sort.c:161
1086    main() sort.c:128 called from _main+0x20 main9.s:10
1087acid: refs()
1088data...bss...stack...
1089acid: leak()
1090acid:
1091.P2
1092The presence of a block in the allocation list does not imply
1093it is there because of a leak; for instance, it may have been
1094in use when the program terminated.
1095The
1096.CW refs()
1097library function scans the
1098.I data ,
1099.I bss ,
1100and
1101.I stack
1102segments of the process looking for pointers
1103into the allocated blocks.  When one is found, the block is deleted from
1104the outstanding block list.
1105The
1106.CW leak
1107function is used again to report the
1108blocks remaining allocated and unreferenced.
1109This strategy proves effective in detecting
1110disconnected (but non-circular) data structures.
1111.PP
1112The leak detection process is entirely passive.
1113The program is not
1114specially compiled and the source code is not required.
1115As with the Acid support functions for the Alef runtime environment,
1116the author of the library routines has encapsulated the
1117functionality of the library interface
1118in Acid code.
1119Any programmer may then check a program's use of the
1120library routines without knowledge of either implementation.
1121The performance impact of running leak detection is great
1122(about 10 times slower),
1123but it has not prevented interactive programs like
1124.CW sam
1125and the
1126.CW 8½
1127window system from being tested.
1128.NH
1129Code Coverage
1130.PP
1131Another common component of software test uses
1132.I coverage
1133analysis.
1134The purpose of the test is to determine which paths through the code have
1135not been executed while running the test suite.
1136This is usually
1137performed by a combination of compiler support and a reporting tool run
1138on the output generated by statements compiled into the program.
1139The compiler emits code that
1140logs the progress of the program as it executes basic blocks and writes the
1141results to a file. The file is then processed by the reporting tool
1142to determine which basic blocks have not been executed.
1143.PP
1144Acid can perform the same function in a language independent manner without
1145modifying the source, object or binary of the program. The following example
1146shows
1147.CW ls
1148being run under the control of the Acid coverage library.
1149.P1
1150philw-helix% acid -l coverage /bin/ls
1151/bin/ls: mips plan 9 executable
1152/lib/acid/port
1153/lib/acid/mips
1154/lib/acid/coverage
1155acid: coverage()
1156acid
1157newstime
1158profile
1159tel
1160wintool
11612: (error) msg: pid=11419 startstop: process exited
1162acid: analyse(ls)
1163ls.c:102,105
1164	102:     return 1;
1165	103: }
1166	104: if(db[0].qid.path&CHDIR && dflag==0){
1167	105:     output();
1168ls.c:122,126
1169	122:     memmove(dirbuf+ndir, db, sizeof(Dir));
1170	123:     dirbuf[ndir].prefix = 0;
1171	124:     p = utfrrune(s, '/');
1172	125:     if(p){
1173	126:         dirbuf[ndir].prefix = s;
1174.P2
1175The
1176.CW coverage
1177function begins by looping through the text segment placing
1178breakpoints at the entry to each basic block. The start of each basic
1179block is found using the Acid builtin function
1180.CW follow .
1181If the list generated by
1182.CW follow
1183contains more than one
1184element, then the addresses mark the start of basic blocks. A breakpoint
1185is placed at each address to detect entry into the block. If the result
1186of
1187.CW follow
1188is a single address then no action is taken, and the next address is
1189considered. Acid maintains a list of
1190breakpoints already in place and avoids placing duplicates (an address may be
1191the destination of several branches).
1192.PP
1193After placing the breakpoints the program is set running.
1194Each time a breakpoint is encountered
1195Acid deletes the address from the breakpoint list, removes the breakpoint
1196from memory and then restarts the program.
1197At any instant the breakpoint list contains the addresses of basic blocks
1198which have not been executed.
1199The
1200.CW analyse
1201function reports the lines of source code bounded by basic blocks
1202whose addresses are have not been deleted from the breakpoint list.
1203These are the basic blocks which have not been executed.
1204Program performance is almost unaffected since each breakpoint is executed
1205only once and then removed.
1206.PP
1207The library contains a total of 128 lines of Acid code.
1208An obvious extension of this algorithm could be used to provide basic block
1209profiling.
1210.NH
1211Conclusion
1212.PP
1213Acid has two areas of weakness. As with
1214other language-based tools like
1215.I awk ,
1216a programmer must learn yet another language to step beyond the normal
1217debugging functions and use the full power of the debugger.
1218Second, the command line interface supplied by the
1219.I yacc
1220parser is inordinately clumsy.
1221Part of the problem relates directly to the use of
1222.I yacc
1223and could be circumvented with a custom parser.
1224However, structural problems would remain: Acid often requires
1225too much typing to execute a simple
1226command.
1227A debugger should prostitute itself to its users, doing whatever
1228is wanted with a minimum of encouragement; commands should be
1229concise and obvious. The language interface is more consistent than
1230an ad hoc command interface but is clumsy to use.
1231Most of these problems are addressed by an Acme interface
1232which is under construction. This should provide the best of
1233both worlds: graphical debugging and access to the underlying acid
1234language when required.
1235.PP
1236The name space clash between Acid variables, keywords, program variables,
1237and functions is unavoidable.
1238Although it rarely affects a debugging session, it is annoying
1239when it happens and is sometimes difficult to circumvent.
1240The current renaming scheme
1241is too crude; the new names are too hard to remember.
1242.PP
1243Acid has proved to be a powerful tool whose applications
1244have exceeded expectations.
1245Of its strengths, portability, extensibility and parallel debugging support
1246were by design and provide the expected utility.
1247In retrospect,
1248its use as a tool for code test and verification and as
1249a medium for communicating type information and encapsulating
1250interfaces has provided unanticipated benefits and altered our
1251view of the debugging process.
1252.NH
1253Acknowledgments
1254.PP
1255Bob Flandrena was the first user and helped prepare the paper.
1256Rob Pike endured three buggy Alef compilers and a new debugger
1257in a single sitting.
1258.NH
1259References
1260.LP
1261[Pike90] R. Pike, D. Presotto, K. Thompson, H. Trickey,
1262``Plan 9 from Bell Labs'',
1263.I
1264UKUUG Proc. of the Summer 1990 Conf.,
1265.R
1266London, England,
12671990,
1268reprinted, in a different form, in this volume.
1269.LP
1270[Gol93] M. Golan, D. Hanson,
1271``DUEL -- A Very High-Level Debugging Language'',
1272.I
1273USENIX Proc. of the Winter 1993 Conf.,
1274.R
1275San Diego, CA,
12761993.
1277.LP
1278[Lin90] M. A. Linton,
1279``The Evolution of DBX'',
1280.I
1281USENIX Proc. of the Summer 1990 Conf.,
1282.R
1283Anaheim, CA,
12841990.
1285.LP
1286[Stal91] R. M. Stallman, R. H. Pesch,
1287``Using GDB: A guide to the GNU source level debugger'',
1288Technical Report, Free Software Foundation,
1289Cambridge, MA,
12901991.
1291.LP
1292[Win93] P. Winterbottom,
1293``Alef reference Manual'',
1294this volume.
1295.LP
1296[Pike93] Rob Pike,
1297``Acme: A User Interface for Programmers'',
1298.I
1299USENIX Proc. of the Winter 1994 Conf.,
1300.R
1301San Francisco, CA,
1302reprinted in this volume.
1303.LP
1304[Ols90] Ronald A. Olsson, Richard H. Crawford, and W. Wilson Ho,
1305``Dalek: A GNU, improved programmable debugger'',
1306.I
1307USENIX Proc. of the Summer 1990 Conf.,
1308.R
1309Anaheim, CA.
1310.LP
1311[May92] Paul Maybee,
1312``NeD: The Network Extensible Debugger''
1313.I
1314USENIX Proc. of the Summer 1992 Conf.,
1315.R
1316San Antonio, TX.
1317.LP
1318[Aral] Ziya Aral, Ilya Gertner, and Greg Schaffer,
1319``Efficient debugging primitives for multiprocessors'',
1320.I
1321Proceedings of the Third International Conference on Architectural
1322Support for Programming Languages and Operating Systems,
1323.R
1324SIGPLAN notices Nr. 22, May 1989.
1325