@c \input texinfo
@c %**start of header
@c @setfilename agentexpr.info
@c @settitle GDB Agent Expressions
@c @setchapternewpage off
@c %**end of header

@c This file is part of the GDB manual.
@c
@c Copyright (C) 2003-2016 Free Software Foundation, Inc.
@c
@c See the file gdb.texinfo for copying conditions.

@node Agent Expressions
@appendix The GDB Agent Expression Mechanism

In some applications, it is not feasible for the debugger to interrupt
the program's execution long enough for the developer to learn anything
helpful about its behavior. If the program's correctness depends on its
real-time behavior, delays introduced by a debugger might cause the
program to fail, even when the code itself is correct. It is useful to
be able to observe the program's behavior without interrupting it.

Using GDB's @code{trace} and @code{collect} commands, the user can
specify locations in the program, and arbitrary expressions to evaluate
when those locations are reached. Later, using the @code{tfind}
command, she can examine the values those expressions had when the
program hit the trace points. The expressions may also denote objects
in memory --- structures or arrays, for example --- whose values GDB
should record; while visiting a particular tracepoint, the user may
inspect those objects as if they were in memory at that moment.
However, because GDB records these values without interacting with the
user, it can do so quickly and unobtrusively, hopefully not disturbing
the program's behavior.

When GDB is debugging a remote target, the GDB @dfn{agent} code running
on the target computes the values of the expressions itself. To avoid
having a full symbolic expression evaluator on the agent, GDB translates
expressions in the source language into a simpler bytecode language, and
then sends the bytecode to the agent; the agent then executes the
bytecode, and records the values for GDB to retrieve later.

The bytecode language is simple; there are forty-odd opcodes, the bulk
of which are the usual vocabulary of C operators (addition, subtraction,
shifts, and so on) and various sizes of literals and memory reference
operations. The bytecode interpreter operates strictly on machine-level
values --- various sizes of integers and floating point numbers --- and
requires no information about types or symbols; thus, the interpreter's
internal data structures are simple, and each bytecode requires only a
few native machine instructions to implement it. The interpreter is
small, and strict limits on the memory and time required to evaluate an
expression are easy to determine, making it suitable for use by the
debugging agent in real-time applications.

@menu
* General Bytecode Design::     Overview of the interpreter.
* Bytecode Descriptions::       What each one does.
* Using Agent Expressions::     How agent expressions fit into the big picture.
* Varying Target Capabilities:: How to discover what the target can do.
* Rationale::                   Why we did it this way.
@end menu


@c @node Rationale
@c @section Rationale


@node General Bytecode Design
@section General Bytecode Design

The agent represents bytecode expressions as an array of bytes. Each
instruction is one byte long (thus the term @dfn{bytecode}). Some
instructions are followed by operand bytes; for example, the @code{goto}
instruction is followed by a destination for the jump.

The bytecode interpreter is a stack-based machine; most instructions pop
their operands off the stack, perform some operation, and push the
result back on the stack for the next instruction to consume. Each
element of the stack may contain either an integer or a floating point
value; these values are as many bits wide as the largest integer that
can be directly manipulated in the source language.
Stack elements
carry no record of their type; bytecode could push a value as an
integer, then pop it as a floating point value. However, GDB will not
generate code which does this. In C, one might define the type of a
stack element as follows:
@example
union agent_val @{
  LONGEST l;
  DOUBLEST d;
@};
@end example
@noindent
where @code{LONGEST} and @code{DOUBLEST} are @code{typedef} names for
the largest integer and floating point types on the machine.

By the time the bytecode interpreter reaches the end of the expression,
the value of the expression should be the only value left on the stack.
For tracing applications, @code{trace} bytecodes in the expression will
have recorded the necessary data, and the value on the stack may be
discarded. For other applications, like conditional breakpoints, the
value may be useful.

Separate from the stack, the interpreter has two registers:
@table @code
@item pc
The address of the next bytecode to execute.

@item start
The address of the start of the bytecode expression, necessary for
interpreting the @code{goto} and @code{if_goto} instructions.

@end table
@noindent
Neither of these registers is directly visible to the bytecode language
itself, but they are useful for defining the meanings of the bytecode
operations.

There are no instructions to perform side effects on the running
program, or call the program's functions; we assume that these
expressions are only used for unobtrusive debugging, not for patching
the running code.

Most bytecode instructions do not distinguish between the various sizes
of values, and operate on full-width values; the upper bits of the
values are simply ignored, since they do not usually make a difference
to the value computed.
The exceptions to this rule are:
@table @asis

@item memory reference instructions (@code{ref}@var{n})
There are distinct instructions to fetch different word sizes from
memory. Once on the stack, however, the values are treated as full-size
integers. They may need to be sign-extended; the @code{ext} instruction
exists for this purpose.

@item the sign-extension instruction (@code{ext} @var{n})
This instruction clearly needs to know which portion of its operand is
to be extended to occupy the full length of the word.

@end table

If the interpreter is unable to evaluate an expression completely for
some reason (a memory location is inaccessible, or a divisor is zero,
for example), we say that interpretation ``terminates with an error''.
This means that the problem is reported back to the interpreter's caller
in some helpful way. In general, code using agent expressions should
assume that they may attempt to divide by zero, fetch arbitrary memory
locations, and misbehave in other ways.

Even complicated C expressions compile to a few bytecode instructions;
for example, the expression @code{x + y * z} would typically produce
code like the following, assuming that @code{x} and @code{y} live in
registers, and @code{z} is a global variable holding a 32-bit
@code{int}:
@example
reg 1
reg 2
const32 @i{address of z}
ref32
ext 32
mul
add
end
@end example

In detail, these mean:
@table @code

@item reg 1
Push the value of register 1 (presumably holding @code{x}) onto the
stack.

@item reg 2
Push the value of register 2 (holding @code{y}).

@item const32 @i{address of z}
Push the address of @code{z} onto the stack.

@item ref32
Fetch a 32-bit word from the address at the top of the stack; replace
the address on the stack with the value. Thus, we replace the address
of @code{z} with @code{z}'s value.

@item ext 32
Sign-extend the value on the top of the stack from 32 bits to full
length. This is necessary because @code{z} is a signed integer.

@item mul
Pop the top two numbers on the stack, multiply them, and push their
product. Now the top of the stack contains the value of the expression
@code{y * z}.

@item add
Pop the top two numbers, add them, and push the sum. Now the top of the
stack contains the value of @code{x + y * z}.

@item end
Stop executing; the value left on the stack top is the value to be
recorded.

@end table


@node Bytecode Descriptions
@section Bytecode Descriptions

Each bytecode description has the following form:

@table @asis

@item @code{add} (0x02): @var{a} @var{b} @result{} @var{a+b}

Pop the top two stack items, @var{a} and @var{b}, as integers; push
their sum, as an integer.

@end table

In this example, @code{add} is the name of the bytecode, and
@code{(0x02)} is the one-byte value used to encode the bytecode, in
hexadecimal. The phrase ``@var{a} @var{b} @result{} @var{a+b}'' shows
the stack before and after the bytecode executes. Beforehand, the stack
must contain at least two values, @var{a} and @var{b}; since the top of
the stack is to the right, @var{b} is on the top of the stack, and
@var{a} is underneath it. After execution, the bytecode will have
popped @var{a} and @var{b} from the stack, and replaced them with a
single value, @var{a+b}. There may be other values on the stack below
those shown, but the bytecode affects only those shown.

Here is another example:

@table @asis

@item @code{const8} (0x22) @var{n}: @result{} @var{n}
Push the 8-bit integer constant @var{n} on the stack, without sign
extension.

@end table

In this example, the bytecode @code{const8} takes an operand @var{n}
directly from the bytecode stream; the operand follows the @code{const8}
bytecode itself. We write any such operands immediately after the name
of the bytecode, before the colon, and describe the exact encoding of
the operand in the bytecode stream in the body of the bytecode
description.

For the @code{const8} bytecode, there are no stack items given before
the @result{}; this simply means that the bytecode consumes no values
from the stack. If a bytecode consumes no values, or produces no
values, the list on either side of the @result{} may be empty.

If a value is written as @var{a}, @var{b}, or @var{n}, then the bytecode
treats it as an integer. If a value is written as @var{addr}, then the
bytecode treats it as an address.

We do not fully describe the floating point operations here; although
this design can be extended in a clean way to handle floating point
values, they are not of immediate interest to the customer, so we avoid
describing them, to save time.


@table @asis

@item @code{float} (0x01): @result{}

Prefix for floating-point bytecodes. Not implemented yet.

@item @code{add} (0x02): @var{a} @var{b} @result{} @var{a+b}
Pop two integers from the stack, and push their sum, as an integer.

@item @code{sub} (0x03): @var{a} @var{b} @result{} @var{a-b}
Pop two integers from the stack, subtract the top value from the
next-to-top value, and push the difference.

@item @code{mul} (0x04): @var{a} @var{b} @result{} @var{a*b}
Pop two integers from the stack, multiply them, and push the product on
the stack. Note that, when one multiplies two @var{n}-bit numbers
yielding another @var{n}-bit number, it is irrelevant whether the
numbers are signed or not; the results are the same.

@item @code{div_signed} (0x05): @var{a} @var{b} @result{} @var{a/b}
Pop two signed integers from the stack; divide the next-to-top value by
the top value, and push the quotient. If the divisor is zero, terminate
with an error.

@item @code{div_unsigned} (0x06): @var{a} @var{b} @result{} @var{a/b}
Pop two unsigned integers from the stack; divide the next-to-top value
by the top value, and push the quotient. If the divisor is zero,
terminate with an error.

@item @code{rem_signed} (0x07): @var{a} @var{b} @result{} @var{a modulo b}
Pop two signed integers from the stack; divide the next-to-top value by
the top value, and push the remainder. If the divisor is zero,
terminate with an error.

@item @code{rem_unsigned} (0x08): @var{a} @var{b} @result{} @var{a modulo b}
Pop two unsigned integers from the stack; divide the next-to-top value
by the top value, and push the remainder. If the divisor is zero,
terminate with an error.

@item @code{lsh} (0x09): @var{a} @var{b} @result{} @var{a<<b}
Pop two integers from the stack; let @var{a} be the next-to-top value,
and @var{b} be the top value. Shift @var{a} left by @var{b} bits, and
push the result.

@item @code{rsh_signed} (0x0a): @var{a} @var{b} @result{} @code{(signed)}@var{a>>b}
Pop two integers from the stack; let @var{a} be the next-to-top value,
and @var{b} be the top value. Shift @var{a} right by @var{b} bits,
inserting copies of the top bit at the high end, and push the result.

@item @code{rsh_unsigned} (0x0b): @var{a} @var{b} @result{} @var{a>>b}
Pop two integers from the stack; let @var{a} be the next-to-top value,
and @var{b} be the top value. Shift @var{a} right by @var{b} bits,
inserting zero bits at the high end, and push the result.

@item @code{log_not} (0x0e): @var{a} @result{} @var{!a}
Pop an integer from the stack; if it is zero, push the value one;
otherwise, push the value zero.

@item @code{bit_and} (0x0f): @var{a} @var{b} @result{} @var{a&b}
Pop two integers from the stack, and push their bitwise @code{and}.

@item @code{bit_or} (0x10): @var{a} @var{b} @result{} @var{a|b}
Pop two integers from the stack, and push their bitwise @code{or}.

@item @code{bit_xor} (0x11): @var{a} @var{b} @result{} @var{a^b}
Pop two integers from the stack, and push their bitwise
exclusive-@code{or}.

@item @code{bit_not} (0x12): @var{a} @result{} @var{~a}
Pop an integer from the stack, and push its bitwise complement.

@item @code{equal} (0x13): @var{a} @var{b} @result{} @var{a=b}
Pop two integers from the stack; if they are equal, push the value one;
otherwise, push the value zero.

@item @code{less_signed} (0x14): @var{a} @var{b} @result{} @var{a<b}
Pop two signed integers from the stack; if the next-to-top value is less
than the top value, push the value one; otherwise, push the value zero.

@item @code{less_unsigned} (0x15): @var{a} @var{b} @result{} @var{a<b}
Pop two unsigned integers from the stack; if the next-to-top value is less
than the top value, push the value one; otherwise, push the value zero.

@item @code{ext} (0x16) @var{n}: @var{a} @result{} @var{a}, sign-extended from @var{n} bits
Pop an unsigned value from the stack; treating it as an @var{n}-bit
twos-complement value, extend it to full length. This means that all
bits to the left of bit @var{n-1} (where the least significant bit is bit
0) are set to the value of bit @var{n-1}. Note that @var{n} may be
larger than or equal to the width of the stack elements of the bytecode
engine; in this case, the bytecode should have no effect.

The number of source bits to preserve, @var{n}, is encoded as a single
byte unsigned integer following the @code{ext} bytecode.

@item @code{zero_ext} (0x2a) @var{n}: @var{a} @result{} @var{a}, zero-extended from @var{n} bits
Pop an unsigned value from the stack; zero all but the bottom @var{n}
bits.

The number of source bits to preserve, @var{n}, is encoded as a single
byte unsigned integer following the @code{zero_ext} bytecode.

@item @code{ref8} (0x17): @var{addr} @result{} @var{a}
@itemx @code{ref16} (0x18): @var{addr} @result{} @var{a}
@itemx @code{ref32} (0x19): @var{addr} @result{} @var{a}
@itemx @code{ref64} (0x1a): @var{addr} @result{} @var{a}
Pop an address @var{addr} from the stack. For bytecode
@code{ref}@var{n}, fetch an @var{n}-bit value from @var{addr}, using the
natural target endianness. Push the fetched value as an unsigned
integer.

Note that @var{addr} may not be aligned in any particular way; the
@code{ref@var{n}} bytecodes should operate correctly for any address.

If attempting to access memory at @var{addr} would cause a processor
exception of some sort, terminate with an error.

@item @code{ref_float} (0x1b): @var{addr} @result{} @var{d}
@itemx @code{ref_double} (0x1c): @var{addr} @result{} @var{d}
@itemx @code{ref_long_double} (0x1d): @var{addr} @result{} @var{d}
@itemx @code{l_to_d} (0x1e): @var{a} @result{} @var{d}
@itemx @code{d_to_l} (0x1f): @var{d} @result{} @var{a}
Not implemented yet.

@item @code{dup} (0x28): @var{a} @result{} @var{a} @var{a}
Push another copy of the stack's top element.

@item @code{swap} (0x2b): @var{a} @var{b} @result{} @var{b} @var{a}
Exchange the top two items on the stack.

@item @code{pop} (0x29): @var{a} @result{}
Discard the top value on the stack.

@item @code{pick} (0x32) @var{n}: @var{a} @dots{} @var{b} @result{} @var{a} @dots{} @var{b} @var{a}
Duplicate an item from the stack and push it on the top of the stack.
@var{n}, a single byte, indicates the stack item to copy.
If @var{n}
is zero, this is the same as @code{dup}; if @var{n} is one, it copies
the item under the top item, etc. If @var{n} exceeds the number of
items on the stack, terminate with an error.

@item @code{rot} (0x33): @var{a} @var{b} @var{c} @result{} @var{c} @var{b} @var{a}
Rotate the top three items on the stack.

@item @code{if_goto} (0x20) @var{offset}: @var{a} @result{}
Pop an integer off the stack; if it is non-zero, branch to the given
offset in the bytecode string. Otherwise, continue to the next
instruction in the bytecode stream. In other words, if @var{a} is
non-zero, set the @code{pc} register to @code{start} + @var{offset}.
Thus, an offset of zero denotes the beginning of the expression.

The @var{offset} is stored as a sixteen-bit unsigned value, stored
immediately following the @code{if_goto} bytecode. It is always stored
most significant byte first, regardless of the target's normal
endianness. The offset is not guaranteed to fall at any particular
alignment within the bytecode stream; thus, on machines where fetching a
16-bit value from an unaligned address raises an exception, you should
fetch the offset one byte at a time.

@item @code{goto} (0x21) @var{offset}: @result{}
Branch unconditionally to @var{offset}; in other words, set the
@code{pc} register to @code{start} + @var{offset}.

The offset is stored in the same way as for the @code{if_goto} bytecode.

@item @code{const8} (0x22) @var{n}: @result{} @var{n}
@itemx @code{const16} (0x23) @var{n}: @result{} @var{n}
@itemx @code{const32} (0x24) @var{n}: @result{} @var{n}
@itemx @code{const64} (0x25) @var{n}: @result{} @var{n}
Push the integer constant @var{n} on the stack, without sign extension.
To produce a small negative value, push a small twos-complement value,
and then sign-extend it using the @code{ext} bytecode.

The constant @var{n} is stored in the appropriate number of bytes
following the @code{const}@var{b} bytecode. The constant @var{n} is
always stored most significant byte first, regardless of the target's
normal endianness. The constant is not guaranteed to fall at any
particular alignment within the bytecode stream; thus, on machines where
fetching a multi-byte value from an unaligned address raises an
exception, you should fetch @var{n} one byte at a time.

@item @code{reg} (0x26) @var{n}: @result{} @var{a}
Push the value of register number @var{n}, without sign extension. The
registers are numbered following GDB's conventions.

The register number @var{n} is encoded as a 16-bit unsigned integer
immediately following the @code{reg} bytecode. It is always stored most
significant byte first, regardless of the target's normal endianness.
The register number is not guaranteed to fall at any particular
alignment within the bytecode stream; thus, on machines where fetching a
16-bit value from an unaligned address raises an exception, you should
fetch the register number one byte at a time.

@item @code{getv} (0x2c) @var{n}: @result{} @var{v}
Push the value of trace state variable number @var{n}, without sign
extension.

The variable number @var{n} is encoded as a 16-bit unsigned integer
immediately following the @code{getv} bytecode. It is always stored most
significant byte first, regardless of the target's normal endianness.
The variable number is not guaranteed to fall at any particular
alignment within the bytecode stream; thus, on machines where fetching a
16-bit value from an unaligned address raises an exception, you should
fetch the variable number one byte at a time.

@item @code{setv} (0x2d) @var{n}: @var{v} @result{} @var{v}
Set trace state variable number @var{n} to the value found on the top
of the stack.
The stack is unchanged, so that the value is readily
available if the assignment is part of a larger expression. The
handling of @var{n} is as described for @code{getv}.

@item @code{trace} (0x0c): @var{addr} @var{size} @result{}
Record the contents of the @var{size} bytes at @var{addr} in a trace
buffer, for later retrieval by GDB.

@item @code{trace_quick} (0x0d) @var{size}: @var{addr} @result{} @var{addr}
Record the contents of the @var{size} bytes at @var{addr} in a trace
buffer, for later retrieval by GDB. @var{size} is a single byte
unsigned integer following the @code{trace_quick} opcode.

This bytecode is equivalent to the sequence @code{dup const8 @var{size}
trace}, but we provide it anyway to save space in bytecode strings.

@item @code{trace16} (0x30) @var{size}: @var{addr} @result{} @var{addr}
Identical to @code{trace_quick}, except that @var{size} is a 16-bit
big-endian unsigned integer, not a single byte. This should probably
have been named @code{trace_quick16}, for consistency.

@item @code{tracev} (0x2e) @var{n}: @result{} @var{a}
Record the value of trace state variable number @var{n} in the trace
buffer. The handling of @var{n} is as described for @code{getv}.

@item @code{tracenz} (0x2f): @var{addr} @var{size} @result{}
Record the bytes at @var{addr} in a trace buffer, for later retrieval
by GDB. Stop at either the first zero byte, or when @var{size} bytes
have been recorded, whichever occurs first.

@item @code{printf} (0x34) @var{numargs} @var{string} @result{}
Do a formatted print, in the style of the C function @code{printf}.
The value of @var{numargs} is the number of arguments to expect on the
stack, while @var{string} is the format string, prefixed with a
two-byte length. The last byte of the string must be zero, and is
included in the length.
The format string includes escape sequences
just as they appear in C source, so for instance the format string
@code{"\t%d\n"} is six characters long, and the output will consist of
a tab character, a decimal number, and a newline. At the top of the
stack, above the values to be printed, this bytecode will pop a
``function'' and ``channel''. If the function is nonzero, then the
target may treat it as a function and call it, passing the channel as
a first argument, as with the C function @code{fprintf}. If the
function is zero, then the target may simply call a standard formatted
print function of its choice. In all, this bytecode pops 2 +
@var{numargs} stack elements, and pushes nothing.

@item @code{end} (0x27): @result{}
Stop executing bytecode; the result should be the top element of the
stack. If the purpose of the expression was to compute an lvalue or a
range of memory, then the next-to-top of the stack is the lvalue's
address, and the top of the stack is the lvalue's size, in bytes.

@end table


@node Using Agent Expressions
@section Using Agent Expressions

Agent expressions can be used in several different ways by @value{GDBN},
and the debugger can generate different bytecode sequences as appropriate.

One possibility is to do expression evaluation on the target rather
than the host, such as for the conditional of a conditional
tracepoint. In such a case, @value{GDBN} compiles the source
expression into a bytecode sequence that simply gets values from
registers or memory, does arithmetic, and returns a result.

Another way to use agent expressions is for tracepoint data
collection. @value{GDBN} generates a different bytecode sequence for
collection; in addition to bytecodes that do the calculation,
@value{GDBN} adds @code{trace} bytecodes to save the pieces of
memory that were used.
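
To make the difference between the two uses concrete, here is a minimal
sketch, in Python, of an interpreter for a handful of the bytecodes
described earlier. It is an illustration only, not the agent's actual
implementation. The names @code{run} and @code{sign_extend}, the 64-bit
stack element width, and the little-endian target memory are all
assumptions made for the example. The bytecode string is the
@code{x + y * z} sequence from the design section, hand-assembled, with
a @code{trace_quick} placed before the memory reference so that the
bytes of @code{z} are recorded; it is not necessarily the exact sequence
@value{GDBN} would emit.

```python
# Opcode values as listed under "Bytecode Descriptions".
ADD, MUL = 0x02, 0x04
TRACE_QUICK = 0x0D
EXT, REF32 = 0x16, 0x19
CONST32, REG, END = 0x24, 0x26, 0x27

WIDTH = 64                      # assumed stack element width, in bits
MASK = (1 << WIDTH) - 1


def sign_extend(value, bits):
    """Model `ext n`: copy bit n-1 into every higher bit."""
    if bits >= WIDTH:
        return value & MASK
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        value -= 1 << bits
    return value & MASK


def run(code, regs, memory, base):
    """Evaluate CODE; return (stack, list of traced (addr, bytes) blocks).

    REGS is a list of register values; MEMORY is a bytes object standing
    in for target memory that starts at address BASE.
    """
    stack, traced, pc = [], [], 0
    while True:
        op = code[pc]
        pc += 1
        if op == END:
            return stack, traced
        elif op == REG:
            # 16-bit register number, most significant byte first.
            stack.append(regs[(code[pc] << 8) | code[pc + 1]] & MASK)
            pc += 2
        elif op == CONST32:
            stack.append(int.from_bytes(code[pc:pc + 4], "big"))
            pc += 4
        elif op == TRACE_QUICK:
            # Record SIZE bytes at the address on top; leave it in place.
            size = code[pc]
            pc += 1
            off = stack[-1] - base
            traced.append((stack[-1], bytes(memory[off:off + size])))
        elif op == REF32:
            off = stack.pop() - base
            if off < 0 or off + 4 > len(memory):
                raise RuntimeError("terminates with an error: bad address")
            # Little-endian target assumed for this sketch.
            stack.append(int.from_bytes(memory[off:off + 4], "little"))
        elif op == EXT:
            stack.append(sign_extend(stack.pop(), code[pc]))
            pc += 1
        elif op == ADD:
            b, a = stack.pop(), stack.pop()
            stack.append((a + b) & MASK)
        elif op == MUL:
            b, a = stack.pop(), stack.pop()
            stack.append((a * b) & MASK)
        else:
            raise RuntimeError("unsupported opcode 0x%02x" % op)


Z_ADDR = 0x1000                                     # made-up address of z
memory = (-3 & 0xFFFFFFFF).to_bytes(4, "little")    # z == -3
code = (bytes([REG, 0, 1, REG, 0, 2, CONST32])
        + Z_ADDR.to_bytes(4, "big")
        + bytes([TRACE_QUICK, 4, REF32, EXT, 32, MUL, ADD, END]))

# Register 1 holds x == 10, register 2 holds y == 5.
stack, traced = run(code, [0, 10, 5], memory, Z_ADDR)
```

With these inputs the single value left on the stack is the 64-bit
twos-complement encoding of 10 + 5 * -3, and @code{traced} holds the
four bytes of @code{z}. Dropping the @code{trace_quick} pair from the
string gives the plain evaluation form that a tracepoint condition would
use.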

@itemize @bullet

@item
The user selects trace points in the program's code at which GDB should
collect data.

@item
The user specifies expressions to evaluate at each trace point. These
expressions may denote objects in memory, in which case those objects'
contents are recorded as the program runs, or computed values, in which
case the values themselves are recorded.

@item
GDB transmits the tracepoints and their associated expressions to the
GDB agent, running on the debugging target.

@item
The agent arranges to be notified when a trace point is hit.

@item
When execution on the target reaches a trace point, the agent evaluates
the expressions associated with that trace point, and records the
resulting values and memory ranges.

@item
Later, when the user selects a given trace event and inspects the
objects and expression values recorded, GDB talks to the agent to
retrieve recorded data as necessary to meet the user's requests. If the
user asks to see an object whose contents have not been recorded, GDB
reports an error.

@end itemize


@node Varying Target Capabilities
@section Varying Target Capabilities

Some targets don't support floating-point, and some would rather not
have to deal with @code{long long} operations. Also, different targets
will have different stack sizes, and different bytecode buffer lengths.

Thus, GDB needs a way to ask the target about itself. We haven't worked
out the details yet, but in general, GDB should be able to send the
target a packet asking it to describe itself. The reply should be a
packet whose length is explicit, so we can add new information to the
packet in future revisions of the agent, without confusing old versions
of GDB, and it should contain a version number.
It should contain at
least the following information:

@itemize @bullet

@item
whether floating point is supported

@item
whether @code{long long} is supported

@item
maximum acceptable size of bytecode stack

@item
maximum acceptable length of bytecode expressions

@item
which registers are actually available for collection

@item
whether the target supports disabled tracepoints

@end itemize

@node Rationale
@section Rationale

Some of the design decisions apparent above are arguable.

@table @b

@item What about stack overflow/underflow?
GDB should be able to query the target to discover its stack size.
Given that information, GDB can determine at translation time whether a
given expression will overflow the stack. But this spec isn't about
what kinds of error-checking GDB ought to do.

@item Why are you doing everything in LONGEST?

Speed isn't important, but agent code size is; using LONGEST brings in a
bunch of support code to do things like division, etc. So this is a
serious concern.

First, note that you don't need different bytecodes for different
operand sizes. You can generate code without @emph{knowing} how big the
stack elements actually are on the target. If the target only supports
32-bit ints, and you don't send any 64-bit bytecodes, everything just
works. The observation here is that the MIPS and the Alpha have only
fixed-size registers, and you can still get C's semantics even though
most instructions only operate on full-sized words. You just need to
make sure everything is properly sign-extended at the right times. So
there is no need for 32- and 64-bit variants of the bytecodes. Just
implement everything using the largest size you support.

GDB should certainly check to see what sizes the target supports, so the
user can get an error earlier, rather than later.
But this information
is not necessary for correctness.


@item Why don't you have @code{>} or @code{<=} operators?
I want to keep the interpreter small, and we don't need them. We can
combine the @code{less_} opcodes with @code{log_not}, and swap the order
of the operands, yielding all four asymmetrical comparison operators.
For example, @code{(x <= y)} is @code{! (x > y)}, which is @code{! (y <
x)}.

@item Why do you have @code{log_not}?
@itemx Why do you have @code{ext}?
@itemx Why do you have @code{zero_ext}?
These are all easily synthesized from other instructions, but I expect
them to be used frequently, and they're simple, so I include them to
keep bytecode strings short.

@code{log_not} is equivalent to @code{const8 0 equal}; it's used in half
the relational operators.

@code{ext @var{n}} is equivalent to @code{const8 @var{s-n} lsh const8
@var{s-n} rsh_signed}, where @var{s} is the size of the stack elements;
it follows @code{ref@var{m}} and @code{reg} bytecodes when the value
should be signed. See the next bulleted item.

@code{zero_ext @var{n}} is equivalent to @code{const@var{m} @var{mask}
bit_and}; it's used whenever we push the value of a register, because we
can't assume the upper bits of the register aren't garbage.

@item Why not have sign-extending variants of the @code{ref} operators?
Because that would double the number of @code{ref} operators, and we
need the @code{ext} bytecode anyway for accessing bitfields.

@item Why not have constant-address variants of the @code{ref} operators?
Because that would double the number of @code{ref} operators again, and
@code{const32 @var{address} ref32} is only one byte longer.

@item Why do the @code{ref@var{n}} operators have to support unaligned fetches?
GDB will generate bytecode that fetches multi-byte values at unaligned
addresses whenever the executable's debugging information tells it to.
Furthermore, GDB does not know the value the pointer will have when GDB
generates the bytecode, so it cannot determine whether a particular
fetch will be aligned or not.

In particular, structure bitfields may be several bytes long, but follow
no alignment rules; members of packed structures are not necessarily
aligned either.

In general, there are many cases where unaligned references occur in
correct C code, either at the programmer's explicit request, or at the
compiler's discretion. Thus, it is simpler to make the GDB agent
bytecodes work correctly in all circumstances than to make GDB guess in
each case whether the compiler did the usual thing.

@item Why are there no side-effecting operators?
Because our current client doesn't want them? That's a cheap answer. I
think the real answer is that I'm afraid of implementing function
calls. We should re-visit this issue after the present contract is
delivered.

@item Why aren't the @code{goto} ops PC-relative?
The interpreter has the base address around anyway for PC bounds
checking, and it seemed simpler.

@item Why is there only one offset size for the @code{goto} ops?
Offsets are currently sixteen bits. I'm not happy with this situation
either:

Suppose we have multiple branch ops with different offset sizes. As I
generate code left-to-right, all my jumps are forward jumps (there are
no loops in expressions), so I never know the target when I emit the
jump opcode. Thus, I have to either always assume the largest offset
size, or do jump relaxation on the code after I generate it, which seems
like a big waste of time.

I can imagine a reasonable expression being longer than 256 bytes. I
can't imagine one being longer than 64k. Thus, we need 16-bit offsets.
This kind of reasoning is so bogus, but relaxation is pathetic.

The other approach would be to generate code right-to-left. Then I'd
always know my offset size. That might be fun.

@item Where is the function call bytecode?

When we add side-effects, we should add this.

@item Why does the @code{reg} bytecode take a 16-bit register number?

Intel's IA-64 architecture has 128 general-purpose registers,
and 128 floating-point registers, and I'm sure it has some random
control registers.

@item Why do we need @code{trace} and @code{trace_quick}?
Because GDB needs to record all the memory contents and registers an
expression touches. If the user wants to evaluate an expression
@code{x->y->z}, the agent must record the values of @code{x} and
@code{x->y} as well as the value of @code{x->y->z}.

@item Don't the @code{trace} bytecodes make the interpreter less general?
They do mean that the interpreter contains special-purpose code, but
that doesn't mean the interpreter can only be used for that purpose. If
an expression doesn't use the @code{trace} bytecodes, they don't get in
its way.

@item Why doesn't @code{trace_quick} consume its arguments the way everything else does?
In general, you do want your operators to consume their arguments; it's
consistent, and generally reduces the amount of stack rearrangement
necessary. However, @code{trace_quick} is a kludge to save space; it
only exists so we needn't write @code{dup const8 @var{SIZE} trace}
before every memory reference. Therefore, it's okay for it not to
consume its arguments; it's meant for a specific context in which we
know exactly what it should do with the stack. If we're going to have a
kludge, it should be an effective kludge.

@item Why does @code{trace16} exist?
That opcode was added by the customer that contracted Cygnus for the
data tracing work.
I personally think it is unnecessary; objects that
large will be quite rare, so it is okay to use @code{dup const16
@var{size} trace} in those cases.

Whatever we decide to do with @code{trace16}, we should at least leave
opcode 0x30 reserved, to remain compatible with the customer who added
it.

@end table
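
Several answers in this section claim that one bytecode can be
synthesized from others. The following Python sketch checks those claims
on a few sample values. It is an illustration under stated assumptions,
not code taken from the agent: the stack element size @var{s} is assumed
to be 64 bits, and every function name (@code{ext_synth},
@code{less_equal_signed}, and so on) is made up for the example. Each
function models one bytecode as an operation on unsigned 64-bit
patterns.

```python
WIDTH = 64                       # assumed stack element size s, in bits
MASK = (1 << WIDTH) - 1


def to_signed(a):
    """Reinterpret a WIDTH-bit pattern as a twos-complement integer."""
    return a - (1 << WIDTH) if a >> (WIDTH - 1) else a


def lsh(a, b):
    return (a << b) & MASK


def rsh_signed(a, b):
    # Python's >> on a negative int copies the sign bit downward.
    return (to_signed(a) >> b) & MASK


def ext(a, n):
    """Direct model of `ext n`: copy bit n-1 into all higher bits."""
    if n >= WIDTH:
        return a
    low = a & ((1 << n) - 1)
    if low >> (n - 1):
        low |= MASK & ~((1 << n) - 1)
    return low


def ext_synth(a, n):
    """The sequence: const8 s-n, lsh, const8 s-n, rsh_signed."""
    return rsh_signed(lsh(a, WIDTH - n), WIDTH - n)


def zero_ext(a, n):
    """Bitwise and with a mask of n one bits."""
    return a & ((1 << n) - 1)


def equal(a, b):
    return 1 if a == b else 0


def log_not(a):
    """The sequence: const8 0, equal."""
    return equal(a, 0)


def less_signed(a, b):
    return 1 if to_signed(a) < to_signed(b) else 0


def less_equal_signed(x, y):
    """x <= y is !(x > y), which is !(y < x)."""
    return log_not(less_signed(y, x))


SAMPLES = [0, 1, 0x7F, 0x80, 0xFF, 0xFFFFFFFD, MASK]
for a in SAMPLES:
    for n in (8, 16, 32, 64):
        assert ext(a, n) == ext_synth(a, n)
for x in (-2, -1, 0, 1, 2):
    for y in (-2, -1, 0, 1, 2):
        assert less_equal_signed(x & MASK, y & MASK) == (1 if x <= y else 0)
```

The loops at the end compare the direct model of @code{ext} against the
shift-based sequence, and the synthesized @code{<=} against Python's own
comparison. A target with a narrower stack element would change only
@code{WIDTH}.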