#       @(#)POSIX       5.1 (Berkeley) 08/20/92

        Comments on the IEEE P1003.2 Draft 11.2 September 1991

                     Part 2: Shell and Utilities
                  Section 4.55: sed - Stream editor

In the following paragraphs, `wrong' means `inconsistent with historic
practice'.  Many of the comments refer to undocumented inconsistencies
between the historical versions of sed and the POSIX standard.  All the
comments are notes taken while implementing a POSIX-compatible version
of sed, and should not be interpreted as official opinions or criticism
of the POSIX committee.  Many are insignificant, pedantic and even
wrong.
                Diomidis Spinellis <dds@doc.ic.ac.uk>

[Some are significant and right, too.  -- Keith Bostic]

1.  For the text argument of the a command it is not specified whether
    lines are stripped of their initial blanks.  There are some hints in
    D2 11335-11337 and in D2 11512-11514, but nothing concrete.
    Historical practice is to strip the blanks, i.e.:

        #!/bin/sed -f
        a\
                foo\
                bar

    produces:

        foo
        bar

2.  In the s command we assume that the w file is the last flag.  This is
    historical practice, but not specified in the standard.

3.  In the s command the standard does not specify that a space must
    follow w.  Also the standard does not specify that any number of
    spaces after the w flag are allowed and removed.

4.  The specification of the a command is wrong.  With the current
    specification both of these scripts should produce the same output:

        #!/bin/sed -f
        d
        a\
        hello

        #!/bin/sed -f
        a\
        hello
        d

5.  The specification of the c command in conjunction with the
    specification of the default operation (D2 11293-11299) is wrong.
    The default operation specifies that a newline is printed after the
    pattern space.  This is not the case when the pattern space has been
    deleted by a c command.

6.  The rule for the l command differs from historic practice.  Table
    2-15 includes the various escape sequences including \\.  Is this
    intended by the standard?  Furthermore some versions of sed print
    two-digit octal numbers.  Why does the standard require a three-digit
    octal number?  Normally the pattern space does not end with a
    newline.  Will an implicit \n be printed?  Finally the standard does
    not specify that a newline must follow the '$' sign (it seems logical
    to me).

7.  The specification for ! does not specify that for a single command
    the command must not contain an address specification, whereas the
    command list can contain address specifications.

8.  The standard does not specify what happens with consecutive !
    commands (e.g. /foo/!!!p).  Current implementations allow any number
    of !'s without changing behaviour.  It seems logical that each one
    should reverse the default behaviour.

9.  The ; command separator is not allowed for the commands a c i w r : b
    t # and at the end of a w flag in the s command.

10. The standard does not specify that if an end of file occurs on the
    execution of the n command the program terminates, e.g.:

        sed -e '
        n
        i\
        hello
        ' </dev/null

    will not produce any output.

11. The standard does not specify that the q command causes all lines
    that have been appended to be output and that the pattern space is
    printed before exiting.

12. Historic implementations ignore comments in the text of the i and a
    commands.

13. The historic implementation does not consider the last line of a file
    to match $ if a null file follows:

        sed -n -e '$p' /usr/dict/words /dev/null

    will not print anything.

14. Historical implementations do not output the change text of a c
    command in the case of an address range whose second line number is
    less than the first (e.g. 3,1).  The standard seems to imply
    otherwise.

15. Historical implementations output the c text on EVERY line not
    included in the two-address range in the case of a negation '!'.

16. The standard does not specify that the p flag of the s command will
    write the pattern space plus a newline on the standard output.

17. The standard does not specify whether address ranges are checked and
    reset if a command is not executed due to a jump.  The following
    program can behave in two different ways depending on whether the
    range operator is reset at line 6 or not.  This is important in the
    case of pattern matches.

        sed -n -e '
        4,8b
        s/^/XXX/p
        1,6 {
                p
        }'

18. Historical implementations allow an output suppressing #n at the
    beginning of -e arguments as well.

19. POSIX does not specify whether more than one numeric flag is
    allowed on the s command.

20. Existing versions of sed have the undocumented feature of allowing
    a semicolon to delimit commands.  It is not specified in the
    standard.

21. The standard does not specify whether a script is mandatory.  The
    sed implementations I tested behave differently with ls | sed (no
    output) and ls | sed -e '' (behaves like cat).

22. The requirement to open all wfiles from the beginning makes sed
    behave nonintuitively when the w commands are preceded by addresses
    or are within conditional blocks.

23. The rule specified in lines 11412-11413 of the standard does not
    seem consistent with existing practice.  The sed implementations I
    tested copied the rfile on standard output every time the r command
    was executed and not before reading a line of input.  The wording
    should be changed to be consistent with the 'a' command i.e.

24. The standard does not specify how escape sequences other than \n
    and \D (where D is the delimiter character) are to be treated.  A
    strict interpretation would be that they should be treated literally.
    In the sed implementations I have tried the \ is simply ignored.

25. The standard specifies in line 11304 that an address can be empty.
    This is wrong since it implies that constructs like ,d or 1,d or ,5d
    are allowed.  The sed implementations I tested do not allow them.

26. The b t and : commands ignore leading white space, but not trailing
    white space.  This is not specified in the standard.

27. Although the standard specifies that reading from files that do not
    exist from within the script must not terminate the script, it does
    not specify what happens if a write command fails.

28. In the sed implementation I tested the \n construct for newlines
    works on both strings of a y command.  This is not specified in the
    standard.

29. The standard does not specify if the "nth occurrence" of a regular
    expression in a substitute command is an overlapping or a
    non-overlapping one.  I.e. what is the result of s/a*/A/2 on the
    pattern "aaaaa aaaaa"?  (It crashes the implementation of sed I
    tested.)

30. Existing implementations of sed ignore the regular expression
    delimiter characters within character classes.  This is not specified
    in the standard.
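
Several of the points above can be checked against whatever sed happens
to be installed.  The following is only a minimal sketch and not part of
the original notes: the probe function is a hypothetical helper
introduced here, and the results it reports depend entirely on the
implementation being tested.

```sh
#!/bin/sh
# probe (hypothetical helper): run sed with the given arguments on the
# text arriving on standard input and report the result on one line.
#   $1     label identifying the item being checked
#   $2...  arguments passed unchanged to sed
probe() {
    label=$1
    shift
    # Join sed's output lines (and any diagnostics) with '|' so that
    # each probe produces exactly one line of report.
    result=$(sed "$@" 2>&1 | tr '\n' '|')
    printf '%s: %s\n' "$label" "$result"
}

# Item 11: is the pattern space printed when q is executed?
printf 'a\nb\n' | probe 'item 11 (q)' -e q

# Item 16: does the p flag of s write the pattern space under -n?
printf 'foo\n' | probe 'item 16 (s///p)' -n -e 's/o/0/p'

# Item 20: is a semicolon accepted as a command separator?
printf 'a\nb\n' | probe 'item 20 (;)' -n -e 'p;p'

# Item 30: is the delimiter ignored inside a character class?
printf 'a/b\n' | probe 'item 30 ([/])' -e 's/[/]/X/'
```

Each report line can be compared across sed versions; an implementation
that rejects a construct shows its diagnostic in place of the output.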