.\"	$NetBSD: xstr.1,v 1.18 2005/09/11 23:29:44 wiz Exp $
.\"
.\" Copyright (c) 1980, 1993
.\"	The Regents of the University of California.  All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. Neither the name of the University nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\"     @(#)xstr.1	8.2 (Berkeley) 12/30/93
.\"
.Dd July 23, 2004
.Dt XSTR 1
.Os
.Sh NAME
.Nm xstr
.Nd "extract strings from C programs to implement shared strings"
.Sh SYNOPSIS
.Nm
.Op Fl cv
.Op Fl l Ar array
.Op Fl
.Op Ar
.Sh DESCRIPTION
.Nm
maintains a file
.Pa strings
into which strings in component parts of a large program are hashed.
These strings are replaced with references to this common area.
This serves to implement shared constant strings, most useful if they
are also read-only.
.Pp
Available options:
.Bl -tag -width XXlXarrayXX
.It Fl
.Nm
reads from the standard input.
.It Fl c
.Nm
will extract the strings from the C source
.Ar file
or the standard input
.Pq Fl ,
replacing
string references by expressions of the form (\*[Am]xstr[number])
for some number.
An appropriate declaration of
.Nm
is prepended to the file.
The resulting C text is placed in the file
.Pa x.c ,
to then be compiled.
The strings from this file are placed in the
.Pa strings
data base if they are not there already.
Repeated strings and strings which are suffixes of existing strings
do not cause changes to the data base.
.It Fl l Ar array
Specify the named array in program references to abstracted
strings.
The default array name is xstr.
.It Fl v
Be verbose.
.El
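.Pp
The suffix rule described for
.Fl c
can be illustrated with a minimal sketch.
It is not the code used by
.Nm ,
which hashes strings into the
.Pa strings
file; the function
.Fn find_shared
and its linear search are assumptions made purely for illustration.
A string is compared together with its terminating NUL, so it matches only
where it ends a string that is already stored:
.Bd -literal -offset indent
```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of xstr: return the offset in pool at
 * which s is already stored, either as a whole string or as the tail
 * of a longer one, or -1 if it is absent.  pool holds pool_len bytes
 * of NUL-terminated strings laid end to end.
 */
long
find_shared(const char *pool, size_t pool_len, const char *s)
{
	size_t n = strlen(s) + 1;	/* compare the NUL as well */
	size_t off;

	for (off = 0; off + n <= pool_len; off++)
		if (memcmp(pool + off, s, n) == 0)
			return (long)off;
	return -1;
}
```
.Ed
.Pp
With a 12-byte pool holding only
.Dq hello world
and its terminator, a lookup of
.Dq world
yields offset 6, while
.Dq hello
is not found because it is a prefix rather than a suffix.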
.Pp
After all components of a large program have been compiled, a file
.Pa xs.c
declaring the common
.Nm
space can be created by a command of the form:
.Pp
.Dl $ xstr
.Pp
The file
.Pa xs.c
should then be compiled and loaded with the rest
of the program.
If possible, the array can be made read-only (shared), saving
space and swap overhead.
.Pp
.Nm
can also be used on a single file.
The following command creates files
.Pa x.c
and
.Pa xs.c
as before, without using or affecting any
.Pa strings
file in the same directory:
.Pp
.Dl $ xstr name
.Pp
It may be useful to run
.Nm
after the C preprocessor if any macro definitions yield strings
or if there is conditional code which contains strings
which may not, in fact, be needed.
An appropriate command sequence for running
.Nm
after the C preprocessor is:
.Pp
.Bd -literal -offset indent
$ cc \-E name.c | xstr \-c \-
$ cc \-c x.c
$ mv x.o name.o
.Ed
.Pp
.Nm
does not touch the file
.Pa strings
unless new items are added, so
.Xr make 1
can avoid remaking
.Pa xs.o
unless truly necessary.
.Sh FILES
.Bl -tag -width /tmp/xsxx* -compact
.It Pa strings
Data base of strings
.It Pa x.c
Massaged C source
.It Pa xs.c
C source for definition of array `xstr'
.It Pa /tmp/xs*
Temp file when `xstr name' doesn't touch
.Pa strings
.El
.Sh SEE ALSO
.Xr mkstr 1
.Sh HISTORY
The
.Nm
command appeared in
.Bx 3.0 .
.Sh BUGS
If a string is a suffix of another string in the data base,
but the shorter string is seen first by
.Nm ,
both strings will be placed in the data base, when just
placing the longer one there would do.
.Pp
.Nm
does not parse the file properly, so it does not know that it must not turn:
.Bd -literal
	char var[] = "const";
.Ed
into:
.Bd -literal
	char var[] = (\*[Am]xstr[N]);
.Ed
.Pp
Such declarations must be changed manually into an appropriate initialization
for the string, for example with the ugly hack shown below for structures.
.Pp
Also,
.Nm
cannot initialize structures and unions that contain strings.
Those can be fixed by changing from:
.Bd -literal
	struct foo {
		int i;
		char buf[10];
	} bar = {
		1, "foo"
	};
.Ed
to:
.Bd -literal
	struct foo {
		int i;
		char buf[10];
	} bar = {
		1, { 'f', 'o', 'o', '\e0' }
	};
.Ed
.Pp
The real problem in both cases above is that the compiler knows the size
of the literal constant so that it can perform the initialization required,
but when
.Nm
changes the literal string to a pointer reference, the size information is
lost.
It would require a real parser to do this right, so the obvious solution is
to fix the program manually so that it compiles or, even better, to rely on
the compiler and the linker to merge strings appropriately.
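.Pp
The loss of size information can be seen in a small sketch.
The names
.Va demo_pool ,
.Va as_array
and
.Va as_pointer
are hypothetical and only stand in for what a rewritten program and its
.Pa xs.c
might contain:
.Bd -literal -offset indent
```c
#include <stddef.h>

/* Hypothetical stand-in for the shared array that xs.c would define. */
const char demo_pool[] = "const";

/* Array form: the compiler takes the size from the literal. */
char as_array[] = "const";

/* Pointer form, the shape of a rewritten reference: only an address. */
const char *as_pointer = &demo_pool[0];

/* Six bytes: five characters plus the terminating NUL. */
size_t
array_size(void)
{
	return sizeof as_array;
}

/* Size of a pointer, whatever the length of the string it points to. */
size_t
pointer_size(void)
{
	return sizeof as_pointer;
}
```
.Ed
.Pp
Here
.Fn array_size
returns 6, whereas
.Fn pointer_size
returns the size of a pointer on the target machine regardless of the
length of the string, which is why an array cannot be initialized from
the rewritten expression.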
.Pp
Finally,
.Nm
is not very useful these days because most of the string merging is done
automatically by the compiler and the linker, provided that the strings
are identical and read-only.