.\"	$NetBSD: sed.1,v 1.43 2021/03/11 17:14:35 wiz Exp $
.\" Copyright (c) 1992, 1993
.\"	The Regents of the University of California.  All rights reserved.
.\"
.\" This code is derived from software contributed to Berkeley by
.\" the Institute of Electrical and Electronics Engineers, Inc.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. Neither the name of the University nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\"	@(#)sed.1	8.2 (Berkeley) 12/30/93
.\" $FreeBSD: head/usr.bin/sed/sed.1 259132 2013-12-09 18:57:20Z eadler $
.\"
.Dd March 11, 2021
.Dt SED 1
.Os
.Sh NAME
.Nm sed
.Nd stream editor
.Sh SYNOPSIS
.Nm
.Op Fl aEGglnru
.Ar command
.Op Ar
.Nm
.Op Fl aEGglnru
.Op Fl e Ar command
.Op Fl f Ar command_file
.Op Fl I Ns Op Ar extension
.Op Fl i Ns Op Ar extension
.Op Ar
.Sh DESCRIPTION
The
.Nm
utility reads the specified files, or the standard input if no files
are specified, modifying the input as specified by a list of commands.
The input is then written to the standard output.
.Pp
A single command may be specified as the first argument to
.Nm .
Multiple commands may be specified by using the
.Fl e
or
.Fl f
options.
All commands are applied to the input in the order they are specified
regardless of their origin.
.Pp
The following options are available:
.Bl -tag -width indent
.It Fl a
The files listed as parameters for the
.Dq w
functions are created (or truncated) before any processing begins,
by default.
The
.Fl a
option causes
.Nm
to delay opening each file until a command containing the related
.Dq w
function is applied to a line of input.
.It Fl E
Interpret regular expressions as extended (modern) regular expressions
rather than basic regular expressions (BREs).
The
.Xr re_format 7
manual page fully describes both formats.
.It Fl e Ar command
Append the editing commands specified by the
.Ar command
argument
to the list of commands.
.It Fl f Ar command_file
Append the editing commands found in the file
.Ar command_file
to the list of commands.
The editing commands should each be listed on a separate line.
.It Fl G
Turn off GNU regex extensions (the default).
.It Fl g
Turn on GNU regex extensions.
See
.Xr regex 3
for details.
.It Fl I Ns Op Ar extension
Edit files in-place, saving backups with the specified
.Ar extension .
If no
.Ar extension
is given, no backup will be saved.
Omitting the
.Ar extension
is not recommended: if disk space is exhausted or a similar failure
occurs during in-place editing, the file may be left corrupted or with
partial content.
.Pp
Note that in-place editing with
.Fl I
still takes place in a single continuous line address space covering
all files, although each file preserves its individuality instead of
forming one output stream.
The line counter is never reset between files, address ranges can span
file boundaries, and the
.Dq $
address matches only the last line of the last file.
(See
.Sx "Sed Addresses" . )
This can lead to unexpected results in many cases of in-place editing;
when each file should be edited independently, use
.Fl i
instead.
.It Fl i Ns Op Ar extension
Edit files in-place similarly to
.Fl I ,
but treat each file independently from other files.
In particular, line numbers in each file start at 1,
the
.Dq $
address matches the last line of the current file,
and address ranges are limited to the current file.
(See
.Sx "Sed Addresses" . )
The net result is as though each file were edited by a separate
.Nm
instance.
.It Fl l
Make output line buffered.
.It Fl n
By default, each line of input is echoed to the standard output after
all of the commands have been applied to it.
The
.Fl n
option suppresses this behavior.
.It Fl r
Same as
.Fl E
for compatibility with GNU sed.
.It Fl u
Make output unbuffered.
.El
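.Pp
As a minimal sketch of in-place editing with a backup extension, the
following shell fragment deletes the last line of two scratch files.
The scratch directory, the file names and the .bak extension are
assumptions chosen for illustration; they are not part of sed itself.

```shell
# Scratch directory and file names are hypothetical, for illustration only.
dir=$(mktemp -d)
printf 'a1\na2\na3\n' > "$dir/a.txt"
printf 'b1\nb2\n' > "$dir/b.txt"

# Delete the last line of each file in place, keeping *.bak backups.
# With -i, "$" is evaluated per file, so each file loses its own last line.
sed -i.bak '$d' "$dir/a.txt" "$dir/b.txt"

a_after=$(cat "$dir/a.txt")
b_after=$(cat "$dir/b.txt")
a_backup=$(cat "$dir/a.txt.bak")
printf '%s\n' "$a_after" "$b_after"

rm -r "$dir"
```

Going by the option descriptions above, the same script run with the
upper-case option would be expected to change only the last file, since
"$" then matches only the last line of the last file.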
.Pp
The form of a
.Nm
command is as follows:
.Pp
.Dl [address[,address]]function[arguments]
.Pp
White space may be inserted before the first address and before the
function portion of the command.
.Pp
Normally,
.Nm
cyclically copies a line of input, not including its terminating newline
character, into a
.Em "pattern space"
(unless there is something left after a
.Dq D
function),
applies all of the commands with addresses that select that pattern space,
copies the pattern space to the standard output, appending a newline, and
deletes the pattern space.
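.Pp
The cycle described above can be observed with the "p" function.
This is a small sketch; the sample input and the shell variable names
are assumptions made for the example.

```shell
# Without -n, the pattern space is written at the end of every cycle,
# so an explicit "p" makes each line appear twice.
doubled=$(printf 'one\ntwo\n' | sed 'p')

# With -n, the automatic write is suppressed and only "p" produces output.
single=$(printf 'one\ntwo\n' | sed -n 'p')

printf '%s\n---\n%s\n' "$doubled" "$single"
```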
.Pp
Some of the functions use a
.Em "hold space"
to save all or part of the pattern space for subsequent retrieval.
.Ss "Sed Addresses"
An address is not required, but if specified must have one of the
following formats:
.Bl -bullet -offset indent
.It
a number that counts
input lines
cumulatively across input files (or in each file independently
if a
.Fl i
option is in effect);
.It
a dollar
.Pq Dq $
character that addresses the last line of input (or the last line
of the current file if a
.Fl i
option was specified);
.It
a context address
that consists of a regular expression preceded and followed by a
delimiter.
The closing delimiter can also optionally be followed by the
.Dq I
character, to indicate that the regular expression is to be matched
in a case-insensitive way.
.El
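.Pp
The three address forms listed above can be sketched as follows.
The sample input and the variable names are assumptions for the example.

```shell
input=$(printf 'alpha\nbeta\ngamma\n')

# Numeric address: select line 2 only.
by_number=$(printf '%s\n' "$input" | sed -n '2p')

# "$" address: select the last line of input.
last=$(printf '%s\n' "$input" | sed -n '$p')

# Context address: select every line matching the regular expression.
by_regex=$(printf '%s\n' "$input" | sed -n '/^al/p')

printf '%s\n' "$by_number" "$last" "$by_regex"
```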
.Pp
A command line with no addresses selects every pattern space.
.Pp
A command line with one address selects all of the pattern spaces
that match the address.
.Pp
A command line with two addresses selects an inclusive range.
This
range starts with the first pattern space that matches the first
address.
The end of the range is the next following pattern space
that matches the second address.
If the second address is a number
less than or equal to the line number first selected, only that
line is selected.
The number in the second address may be prefixed with a
.Pq Dq \&+
to specify the number of lines to match after the first pattern.
In the case when the second address is a context
address,
.Nm
does not re-match the second address against the
pattern space that matched the first address.
Starting at the
first line following the selected range,
.Nm
starts looking again for the first address.
.Pp
Editing commands can be applied to non-selected pattern spaces by use
of the exclamation character
.Pq Dq \&!
function.
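.Pp
A short sketch of two-address ranges and of the "!" function follows.
The sample input and variable names are assumptions for the example.

```shell
lines=$(printf '1 one\n2 start\n3 mid\n4 end\n5 five\n')

# Numeric range, inclusive at both ends.
numeric=$(printf '%s\n' "$lines" | sed -n '2,3p')

# Context range: from the first line matching "start" through the
# next following line that matches "end".
context=$(printf '%s\n' "$lines" | sed -n '/start/,/end/p')

# Second address is a number not greater than the first selected line:
# only that one line is selected.
one_line=$(printf '%s\n' "$lines" | sed -n '4,2p')

# "!" applies the function to the lines the addresses do not select.
inverted=$(printf '%s\n' "$lines" | sed -n '2,4!p')

printf '%s\n' "$numeric" "$context" "$one_line" "$inverted"
```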
.Ss "Sed Regular Expressions"
The regular expressions used in
.Nm ,
by default, are basic regular expressions (BREs, see
.Xr re_format 7
for more information), but extended (modern) regular expressions can be used
instead if the
.Fl E
flag is given.
In addition,
.Nm
has the following two additions to regular expressions:
.Pp
.Bl -enum -compact
.It
In a context address, any character other than a backslash
.Pq Dq \e
or newline character may be used to delimit the regular expression.
The opening delimiter needs to be preceded by a backslash
unless it is a slash.
For example, the context address
.Li \exabcx
is equivalent to
.Li /abc/ .
Also, putting a backslash character before the delimiting character
within the regular expression causes the character to be treated literally.
For example, in the context address
.Li \exabc\exdefx ,
the RE delimiter is an
.Dq x
and the second
.Dq x
stands for itself, so that the regular expression is
.Dq abcxdef .
.Pp
.It
The escape sequence \en matches a newline character embedded in the
pattern space.
You cannot, however, use a literal newline character in an address or
in the substitute command.
.El
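.Pp
Both additions can be tried from the shell.
The following is a sketch only; the sample paths, the comma delimiter
and the variable names are assumptions for the example.

```shell
# "\," opens a context address delimited by commas, so the slashes in
# the path need no escaping.
custom=$(printf '/usr/lib/a\n/opt/b\n' | sed -n '\,/usr/lib,p')

# "N" appends the next input line to the pattern space; "\n" in the
# regular expression then matches the embedded newline.
joined=$(printf 'key\nvalue\n' | sed 'N;s/\n/=/')

printf '%s\n' "$custom" "$joined"
```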
.Pp
One special feature of
.Nm
regular expressions is that they can default to the last regular
expression used.
If a regular expression is empty, i.e., just the delimiter characters
are specified, the last regular expression encountered is used instead.
The last regular expression is defined as the last regular expression
used as part of an address or substitute command, and at run-time, not
compile-time.
For example, the command
.Dq /abc/s//XXX/
will substitute
.Dq XXX
for the pattern
.Dq abc .
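.Pp
The example command above can be run as follows; the sample input and
the variable name are assumptions for the example.

```shell
# The empty RE in "s//XXX/" reuses /abc/, the last RE used at run time.
# Lines that do not match /abc/ are not selected and pass through unchanged.
out=$(printf 'abc def\nxyz\n' | sed '/abc/s//XXX/')

printf '%s\n' "$out"
```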
.Ss "Sed Functions"
In the following list of commands, the maximum number of permissible
addresses for each command is indicated by [0addr], [1addr], or [2addr],
representing zero, one, or two addresses.
.Pp
The argument
.Em text
consists of one or more lines.
To embed a newline in the text, precede it with a backslash.
Other backslashes in text are deleted and the following character
taken literally.
.Pp
The
.Dq r
and
.Dq w
functions take an optional file parameter, which should be separated
from the function letter by white space.
Each file given as an argument to a
.Dq w
function is created (or its contents truncated) before any input
processing begins, unless the
.Fl a
option is given.
.Pp
The
.Dq b ,
.Dq r ,
.Dq s ,
.Dq t ,
.Dq w ,
.Dq y ,
.Dq \&! ,
and
.Dq \&:
functions all accept additional arguments.
The following synopses indicate which arguments have to be separated from
the function letters by white space characters.
.Pp
Two of the functions take a function-list.
This is a list of
.Nm
functions separated by newlines, as follows:
.Bd -literal -offset indent
{ function
  function
  ...
  function
}
.Ed
.Pp
The
.Dq {
can be preceded by white space and can be followed by white space.
The function can be preceded by white space.
The terminating
.Dq }
must be preceded by a newline, and may also be preceded by white space.
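.Pp
A function-list written in the layout described above might look like
the following sketch.
The sample input and the variable name are assumptions for the example.

```shell
# On lines matching "error", rewrite the word in upper case and print
# the result. The closing "}" is preceded by a newline.
out=$(printf 'ok\nerror: disk\nok\n' | sed -n '/error/{
s/error/ERROR/
p
}')

printf '%s\n' "$out"
```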
.Pp
.Bl -tag -width "XXXXXX" -compact
.It [2addr] function-list
Execute function-list only when the pattern space is selected.
.Pp
.It [1addr]a\e
.It text
Write
.Em text
to standard output immediately before each attempt to read a line of input,
whether by executing the
.Dq N
function or by beginning a new cycle.
.Pp
.It [2addr]b[label]
Branch to the
.Dq \&:
function with the specified label.
If the label is not specified, branch to the end of the script.
.Pp
.It [2addr]c\e
.It text
Delete the pattern space.
With 0 or 1 address or at the end of a 2-address range,
.Em text
is written to the standard output.
.Pp
.It [2addr]d
Delete the pattern space and start the next cycle.
.Pp
.It [2addr]D
Delete the initial segment of the pattern space through the first
newline character and start the next cycle.
.Pp
.It [2addr]g
Replace the contents of the pattern space with the contents of the
hold space.
.Pp
.It [2addr]G
Append a newline character followed by the contents of the hold space
to the pattern space.
.Pp
.It [2addr]h
Replace the contents of the hold space with the contents of the
pattern space.
.Pp
.It [2addr]H
Append a newline character followed by the contents of the pattern space
to the hold space.
.Pp
.It [1addr]i\e
.It text
Write
.Em text
to the standard output.
.Pp
.It [2addr]l
(The letter ell.)
Write the pattern space to the standard output in a visually unambiguous
form.
This form is as follows:
.Pp
.Bl -tag -width "carriage-returnXX" -offset indent -compact
.It backslash
\e\e
.It alert
\ea
.It form-feed
\ef
.It carriage-return
\er
.It tab
\et
.It vertical tab
\ev
.El
.Pp
Nonprintable characters are written as three-digit octal numbers (with a
preceding backslash) for each byte in the character (most significant byte
first).
Long lines are folded, with the point of folding indicated by displaying
a backslash followed by a newline.
The end of each line is marked with a
.Dq $ .
.Pp
.It [2addr]n
Write the pattern space to the standard output if the default output has
not been suppressed, and replace the pattern space with the next line of
input.
.Pp
.It [2addr]N
Append the next line of input to the pattern space, using an embedded
newline character to separate the appended material from the original
contents.
Note that the current line number changes.
.Pp
.It [2addr]p
Write the pattern space to standard output.
.Pp
.It [2addr]P
Write the pattern space, up to the first newline character, to the
standard output.
.Pp
.It [1addr]q
Branch to the end of the script and quit without starting a new cycle.
.Pp
.It [1addr]r file
Copy the contents of
.Em file
to the standard output immediately before the next attempt to read a
line of input.
If
.Em file
cannot be read for any reason, it is silently ignored and no error
condition is set.
.Pp
.It [2addr]s/regular expression/replacement/flags
Substitute the replacement string for the first instance of the regular
expression in the pattern space.
Any character other than backslash or newline can be used instead of
a slash to delimit the RE and the replacement.
Within the RE and the replacement, the RE delimiter itself can be used as
a literal character if it is preceded by a backslash.
.Pp
An ampersand
.Pq Dq &
appearing in the replacement is replaced by the string matching the RE.
The special meaning of
.Dq &
in this context can be suppressed by preceding it by a backslash.
The string
.Dq \e# ,
where
.Dq #
is a digit, is replaced by the text matched
by the corresponding backreference expression (see
.Xr re_format 7 ) .
.Pp
A line can be split by substituting a newline character into it.
To specify a newline character in the replacement string, precede it with
a backslash.
.Pp
The value of
.Em flags
in the substitute function is zero or more of the following:
.Bl -tag -width "XXXXXX" -offset indent
.It Ar N
Make the substitution only for the
.Ar N Ns 'th
occurrence of the regular expression in the pattern space.
.It g
Make the substitution for all non-overlapping matches of the
regular expression, not just the first one.
.It p
Write the pattern space to standard output if a replacement was made.
If the replacement string is identical to that which it replaces, it
is still considered to have been a replacement.
.It w Em file
Append the pattern space to
.Em file
if a replacement was made.
If the replacement string is identical to that which it replaces, it
is still considered to have been a replacement.
.It i or I
Match the regular expression in a case-insensitive way.
.El
.Pp
.It [2addr]t [label]
Branch to the
.Dq \&:
function bearing the label if any substitutions have been made since the
most recent reading of an input line or execution of a
.Dq t
function.
If no label is specified, branch to the end of the script.
.Pp
.It [2addr]w Em file
Append the pattern space to the
.Em file .
.Pp
.It [2addr]x
Swap the contents of the pattern and hold spaces.
.Pp
.It [2addr]y/string1/string2/
Replace all occurrences of characters in
.Em string1
in the pattern space with the corresponding characters from
.Em string2 .
Any character other than a backslash or newline can be used instead of
a slash to delimit the strings.
Within
.Em string1
and
.Em string2 ,
a backslash followed by any character other than a newline is that literal
character, and a backslash followed by an ``n'' is replaced by a newline
character.
.Pp
.It [2addr]!function
.It [2addr]!function-list
Apply the function or function-list only to the lines that are
.Em not
selected by the address(es).
.Pp
.It [0addr]:label
This function does nothing; it bears a label to which the
.Dq b
and
.Dq t
commands may branch.
.Pp
.It [1addr]=
Write the line number to the standard output followed by a newline
character.
.Pp
.It [0addr]
Empty lines are ignored.
.Pp
.It [0addr]#
The
.Dq #
and the remainder of the line are ignored (treated as a comment), with
the single exception that if the first two characters in the file are
.Dq #n ,
the default output is suppressed.
This is the same as specifying the
.Fl n
option on the command line.
.El
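.Pp
A few of the functions listed above, combined into short scripts.
These are sketches rather than canonical recipes; the sample input, the
label name "top" and the shell variable names are assumptions for the
example.

```shell
# Reverse the order of lines using the hold space:
#   1!G  on every line but the first, append the hold space
#   h    copy the pattern space to the hold space
#   $p   on the last line, print the accumulated result
reversed=$(printf 'a\nb\nc\n' | sed -n '1!G;h;$p')

# Join all lines with a loop: ":top" is a label, "N" appends the next
# line, and "$!b top" branches back until the last line has been read.
# Each command gets its own -e so the label ends with its argument.
joined=$(printf 'a\nb\nc\n' |
    sed -e ':top' -e 'N' -e '$!b top' -e 's/\n/,/g')

# "y" maps characters one for one.
mapped=$(printf 'abcabc\n' | sed 'y/abc/xyz/')

# Flags of "s": "g" replaces every match, a number N only the N'th.
all=$(printf 'aaa\n' | sed 's/a/b/g')
second=$(printf 'aaa\n' | sed 's/a/b/2')

printf '%s\n' "$reversed" "$joined" "$mapped" "$all" "$second"
```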
.Sh ENVIRONMENT
The
.Ev COLUMNS , LANG , LC_ALL , LC_CTYPE
and
.Ev LC_COLLATE
environment variables affect the execution of
.Nm
as described in
.Xr environ 7 .
.Sh EXIT STATUS
.Ex -std
.Sh SEE ALSO
.Xr awk 1 ,
.Xr ed 1 ,
.Xr grep 1 ,
.Xr regex 3 ,
.Xr re_format 7
.Sh STANDARDS
The
.Nm
utility is expected to be a superset of the
.St -p1003.2
specification.
.Pp
The
.Fl a , E , I ,
and
.Fl i
options, the prefixing
.Dq \&+
in the second member of an address range,
as well as the
.Dq I
flag to the address regular expression and substitution command are
non-standard
.Fx
extensions and may not be available on other operating systems.
.Sh HISTORY
A
.Nm
command, written by
.An L. E. McMahon ,
appeared in
.At v7 .
.Sh AUTHORS
.An "Diomidis D. Spinellis" Aq dds@FreeBSD.org
.Sh BUGS
Multibyte characters containing a byte with value 0x5C
.Tn ( ASCII
.Ql \e )
may be incorrectly treated as line continuation characters in arguments to the
.Dq a ,
.Dq c
and
.Dq i
commands.
Multibyte characters cannot be used as delimiters with the
.Dq s
and
.Dq y
commands.