.\"	$NetBSD: sed.1,v 1.24 2004/07/13 12:09:29 wiz Exp $
.\"
.\" Copyright (c) 1992, 1993
.\"	The Regents of the University of California.  All rights reserved.
.\"
.\" This code is derived from software contributed to Berkeley by
.\" the Institute of Electrical and Electronics Engineers, Inc.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. Neither the name of the University nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\"	@(#)sed.1	8.2 (Berkeley) 12/30/93
.\"
.Dd January 4, 2004
.Dt SED 1
.Os
.Sh NAME
.Nm sed
.Nd stream editor
.Sh SYNOPSIS
.Nm
.Op Fl aEn
.Ar command
.Op Ar file ...
.Nm
.Op Fl aEn
.Op Fl e Ar command
.Op Fl f Ar command_file
.Op Ar file ...
.Sh DESCRIPTION
The
.Nm
utility reads the specified files, or the standard input if no files
are specified, modifying the input as specified by a list of commands.
The input is then written to the standard output.
.Pp
A single command may be specified as the first argument to
.Nm .
Multiple commands may be specified by using the
.Fl e
or
.Fl f
options.
All commands are applied to the input in the order they are specified
regardless of their origin.
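.Pp
As a brief illustration of command ordering (the input text is made up for
the example), the two
.Fl e
commands below are applied to each line in the order given, so the second
substitution sees the result of the first:
.Bd -literal -offset indent
# "one" becomes "two", which the second command turns into "three"
printf 'one\en' | sed -e 's/one/two/' -e 's/two/three/'
.Ed
.Pp
This should print
.Dq three .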
.Pp
The following options are available:
.Bl -tag -width indent
.It Fl a
The files listed as parameters for the
.Dq w
functions are created (or truncated) before any processing begins,
by default.
The
.Fl a
option causes
.Nm
to delay opening each file until a command containing the related
.Dq w
function is applied to a line of input.
.It Fl E
Enables the use of extended regular expressions instead of the
usual basic regular expression syntax.
.It Fl e Ar command
Append the editing commands specified by the
.Ar command
argument
to the list of commands.
.It Fl f Ar command_file
Append the editing commands found in the file
.Ar command_file
to the list of commands.
The editing commands should each be listed on a separate line.
.It Fl n
By default, each line of input is echoed to the standard output after
all of the commands have been applied to it.
The
.Fl n
option suppresses this behavior.
.El
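.Pp
A small sketch of the
.Fl n
option, using made-up input: with the default output suppressed, only lines
explicitly written by the
.Dq p
function appear.
.Bd -literal -offset indent
# print only the second input line
printf 'a\enb\enc\en' | sed -n '2p'
.Ed
.Pp
This should print only
.Dq b ;
without
.Fl n ,
the second line would appear twice.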
.Pp
The form of a
.Nm
command is as follows:
.sp
.Dl [address[,address]]function[arguments]
.sp
Whitespace may be inserted before the first address and the function
portions of the command.
.Pp
Normally,
.Nm
cyclically copies a line of input, not including its terminating newline
character, into a
.Em "pattern space" ,
(unless there is something left after a
.Dq D
function),
applies all of the commands with addresses that select that pattern space,
copies the pattern space to the standard output, appending a newline, and
deletes the pattern space.
.Pp
Some of the functions use a
.Em "hold space"
to save all or part of the pattern space for subsequent retrieval.
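.Pp
As a minimal sketch of how the two spaces interact (the three-line input is
invented for illustration; the
.Dq G
and
.Dq h
functions are described under
.Sx SED FUNCTIONS ) ,
the following command accumulates the input in the hold space in reverse
order and prints the result when the last line is reached:
.Bd -literal -offset indent
# on every line but the first, append the hold space to the pattern
# space (G); save the result back to the hold space (h); at the last
# line, print the accumulated text (p)
printf 'a\enb\enc\en' | sed -n '1!G;h;$p'
.Ed
.Pp
This should print the lines
.Dq c ,
.Dq b ,
and
.Dq a ,
in that order.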
.Sh SED ADDRESSES
An address is not required, but if specified must be a number (that counts
input lines
cumulatively across input files), a dollar
.Po
.Dq $
.Pc
character that addresses the last line of input, or a context address
(which consists of a regular expression preceded and followed by a
delimiter).
.Pp
A command line with no addresses selects every pattern space.
.Pp
A command line with one address selects all of the pattern spaces
that match the address.
.Pp
A command line with two addresses selects the inclusive range from
the first pattern space that matches the first address through the next
pattern space that matches the second.
(If the second address is a number less than or equal to the line number
first selected, only that line is selected.)
Starting at the first line following the selected range,
.Nm
starts looking again for the first address.
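.Pp
The range rules above can be sketched with a five-line input invented for
the example:
.Bd -literal -offset indent
# inclusive range: prints lines 2, 3 and 4
printf '1\en2\en3\en4\en5\en' | sed -n '2,4p'
# second address not greater than the first: prints line 4 only
printf '1\en2\en3\en4\en5\en' | sed -n '4,2p'
.Ed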
.Pp
Editing commands can be applied to non-selected pattern spaces by use
of the exclamation character
.Pq Dq \&!
function.
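.Pp
For instance (with made-up input), the following deletes every line except
the second:
.Bd -literal -offset indent
# "2!d" applies d to all lines NOT selected by the address 2
printf 'a\enb\enc\en' | sed '2!d'
.Ed
.Pp
This should print only
.Dq b .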
.Sh SED REGULAR EXPRESSIONS
The
.Nm
regular expressions are basic regular expressions (BREs, see
.Xr re_format 7
for more information).
.Nm
has the following two additions to BREs:
.sp
.Bl -enum -compact
.It
In a context address, any character other than a backslash
.Po
.Dq \e
.Pc
or newline character may be used to delimit the regular expression
by prefixing the first use of that delimiter with a backslash.
Also, putting a backslash character before the delimiting character
causes the character to be treated literally.
For example, in the context address \exabc\exdefx, the RE delimiter
is an
.Dq x
and the second
.Dq x
stands for itself, so that the regular expression is
.Dq abcxdef .
.sp
.It
The escape sequence \en matches a newline character embedded in the
pattern space.
You can't, however, use a literal newline character in an address or
in the substitute command.
.El
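.Pp
As a small sketch of the first addition (the path names are invented for
the example), a comma can serve as the delimiter of a context address so
that slashes in the expression need no escaping:
.Bd -literal -offset indent
# select lines that begin with /usr, using "," as the RE delimiter
printf '/usr/bin\en/etc\en' | sed -n '\e,^/usr,p'
.Ed
.Pp
This should print only
.Dq /usr/bin .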
.Pp
One special feature of
.Nm
regular expressions is that they can default to the last regular
expression used.
If a regular expression is empty, i.e., just the delimiter characters
are specified, the last regular expression encountered is used instead.
The last regular expression is defined as the last regular expression
used as part of an address or substitute command, and at run-time, not
compile-time.
For example, the command
.Dq /abc/s//XXX/
will substitute
.Dq XXX
for the pattern
.Dq abc .
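.Pp
The example from the text can be tried directly (the input line is made up
for illustration); the empty regular expression in the
.Dq s
function reuses
.Dq abc
from the address:
.Bd -literal -offset indent
# the empty RE in s//XXX/ defaults to the address RE, abc
printf 'abc abc\en' | sed '/abc/s//XXX/'
.Ed
.Pp
Only the first occurrence is replaced, so this should print
.Dq XXX abc .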
.Sh SED FUNCTIONS
In the following list of commands, the maximum number of permissible
addresses for each command is indicated by [0addr], [1addr], or [2addr],
representing zero, one, or two addresses.
.Pp
The argument
.Em text
consists of one or more lines.
To embed a newline in the text, precede it with a backslash.
Other backslashes in text are deleted and the following character
taken literally.
.Pp
The
.Dq r
and
.Dq w
functions take an optional file parameter, which should be separated
from the function letter by white space.
Each file given as an argument to a
.Dq w
function is created (or its contents truncated) before any input
processing begins, unless the
.Fl a
option is specified.
.Pp
The
.Dq b ,
.Dq r ,
.Dq s ,
.Dq t ,
.Dq w ,
.Dq y ,
.Dq \&! ,
and
.Dq \&:
functions all accept additional arguments.
The following synopses indicate which arguments have to be separated from
the function letters by white space characters.
.Pp
Two of the functions take a function-list.
This is a list of
.Nm
functions separated by newlines, as follows:
.Bd -literal -offset indent
{ function
  function
  ...
  function
}
.Ed
.Pp
The
.Dq {
can be preceded by white space and can be followed by white space.
The function can be preceded by white space.
The terminating
.Dq }
must be preceded by a newline, and may also be preceded by white space.
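.Pp
A minimal sketch of a function-list, with invented input: the two functions
inside the braces are applied only to lines 2 through 3.
.Bd -literal -offset indent
# for lines 2-3, append "!" to the line and then print it
printf 'a\enb\enc\en' | sed -n '2,3{
s/$/!/
p
}'
.Ed
.Pp
This should print
.Dq b!
and
.Dq c! .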
.sp
.Bl -tag -width "XXXXXX" -compact
.It [2addr] function-list
Execute function-list only when the pattern space is selected.
.sp
.It [1addr]a\e
.It text
.br
Write
.Em text
to standard output immediately before each attempt to read a line of input,
whether by executing the
.Dq N
function or by beginning a new cycle.
.sp
.It [2addr]b[label]
Branch to the
.Dq \&:
function with the specified label.
If the label is not specified, branch to the end of the script.
.sp
.It [2addr]c\e
.It text
.br
Delete the pattern space.
With 0 or 1 address or at the end of a 2-address range,
.Em text
is written to the standard output.
.sp
.It [2addr]d
Delete the pattern space and start the next cycle.
.sp
.It [2addr]D
Delete the initial segment of the pattern space through the first
newline character and start the next cycle.
.sp
.It [2addr]g
Replace the contents of the pattern space with the contents of the
hold space.
.sp
.It [2addr]G
Append a newline character followed by the contents of the hold space
to the pattern space.
.sp
.It [2addr]h
Replace the contents of the hold space with the contents of the
pattern space.
.sp
.It [2addr]H
Append a newline character followed by the contents of the pattern space
to the hold space.
.sp
.It [1addr]i\e
.It text
.br
Write
.Em text
to the standard output.
.sp
.It [2addr]l
(The letter ell.)
Write the pattern space to the standard output in a visually unambiguous
form.
This form is as follows:
.sp
.Bl -tag -width "carriage-returnXX" -offset indent -compact
.It backslash
\e\e
.It alert
\ea
.It form-feed
\ef
.It newline
\en
.It carriage-return
\er
.It tab
\et
.It vertical tab
\ev
.El
.Pp
Nonprintable characters are written as three-digit octal numbers (with a
preceding backslash) for each byte in the character (most significant byte
first).
Long lines are folded, with the point of folding indicated by displaying
a backslash followed by a newline.
The end of each line is marked with a
.Dq $ .
.sp
.It [2addr]n
Write the pattern space to the standard output if the default output has
not been suppressed, and replace the pattern space with the next line of
input.
(Does not begin a new cycle.)
.sp
.It [2addr]N
Append the next line of input to the pattern space, using an embedded
newline character to separate the appended material from the original
contents.
Note that the current line number changes.
.sp
.It [2addr]p
Write the pattern space to standard output.
.sp
.It [2addr]P
Write the pattern space, up to the first newline character, to the
standard output.
.sp
.It [1addr]q
Branch to the end of the script and quit without starting a new cycle.
.sp
.It [1addr]r file
Copy the contents of
.Em file
to the standard output immediately before the next attempt to read a
line of input.
If
.Em file
cannot be read for any reason, it is silently ignored and no error
condition is set.
.sp
.It [2addr]s/regular expression/replacement/flags
Substitute the replacement string for the first instance of the regular
expression in the pattern space.
Any character other than backslash or newline can be used instead of
a slash to delimit the RE and the replacement.
Within the RE and the replacement, the RE delimiter itself can be used as
a literal character if it is preceded by a backslash.
.Pp
An ampersand
.Po
.Dq \*[Am]
.Pc
appearing in the replacement is replaced by the string matching the RE.
The special meaning of
.Dq \*[Am]
in this context can be suppressed by preceding it by a backslash.
The string
.Dq \e# ,
where
.Dq #
is a digit, is replaced by the text matched
by the corresponding backreference expression (see
.Xr re_format 7 ) .
.Pp
A line can be split by substituting a newline character into it.
To specify a newline character in the replacement string, precede it with
a backslash.
.Pp
The value of
.Em flags
in the substitute function is zero or more of the following:
.Bl -tag -width "XXXXXX" -offset indent
.It "0 ... 9"
Make the substitution only for the N'th occurrence of the regular
expression in the pattern space.
.It g
Make the substitution for all non-overlapping matches of the
regular expression, not just the first one.
.It p
Write the pattern space to standard output if a replacement was made.
If the replacement string is identical to that which it replaces, it
is still considered to have been a replacement.
.It w Em file
Append the pattern space to
.Em file
if a replacement was made.
If the replacement string is identical to that which it replaces, it
is still considered to have been a replacement.
.El
.sp
.It [2addr]t [label]
Branch to the
.Dq \&:
function bearing the label if any substitutions have been made since the
most recent reading of an input line or execution of a
.Dq t
function.
If no label is specified, branch to the end of the script.
.sp
.It [2addr]w Em file
Append the pattern space to the
.Em file .
.sp
.It [2addr]x
Swap the contents of the pattern and hold spaces.
.sp
.It [2addr]y/string1/string2/
Replace all occurrences of characters in
.Em string1
in the pattern space with the corresponding characters from
.Em string2 .
Any character other than a backslash or newline can be used instead of
a slash to delimit the strings.
Within
.Em string1
and
.Em string2 ,
a backslash followed by any character other than a newline is that literal
character, and a backslash followed by an ``n'' is replaced by a newline
character.
.sp
.It [2addr]!function
.It [2addr]!function-list
Apply the function or function-list only to the lines that are
.Em not
selected by the address(es).
.sp
.It [0addr]:label
This function does nothing; it bears a label to which the
.Dq b
and
.Dq t
commands may branch.
.sp
.It [0addr]#
The
.Dq #
and the remainder of the line are ignored (treated as a comment), with
the single exception that if the first two characters in the file are
.Dq #n ,
the default output is suppressed.
This is the same as specifying the
.Fl n
option on the command line.
.sp
.It [1addr]=
Write the line number to the standard output followed by a newline
character.
.sp
.It [0addr]
Empty lines are ignored.
.El
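.Pp
As a sketch of how the
.Dq \&: ,
.Dq s ,
and
.Dq t
functions combine into a loop (the label name
.Dq a
and the input number are arbitrary choices for the example), the following
inserts a comma before each group of three trailing digits, repeating the
substitution until it no longer matches:
.Bd -literal -offset indent
# :a   defines the label "a"
# s    inserts one comma before the last ungrouped three digits
# ta   branches back to "a" as long as the substitution succeeded
printf '1234567\en' |
    sed -e ':a' -e 's/\e(.*[0-9]\e)\e([0-9]\e{3\e}\e)/\e1,\e2/' -e 'ta'
.Ed
.Pp
This should print
.Dq 1,234,567 .
Each command is passed with its own
.Fl e
so that the label is terminated by the end of the command.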
.Pp
The
.Nm
utility exits 0 on success and \*[Gt]0 if an error occurs.
.Sh SEE ALSO
.Xr awk 1 ,
.Xr ed 1 ,
.Xr grep 1 ,
.Xr regex 3 ,
.Xr re_format 7
.Sh STANDARDS
The
.Nm
utility is expected to be a superset of the
.St -p1003.2
specification.
.Sh HISTORY
A
.Nm
command appeared in
.At v7 .
514.At v7 .
515