.\" $NetBSD: xstr.1,v 1.18 2005/09/11 23:29:44 wiz Exp $
.\"
.\" Copyright (c) 1980, 1993
.\" The Regents of the University of California. All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\" notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in the
.\" documentation and/or other materials provided with the distribution.
.\" 3. Neither the name of the University nor the names of its contributors
.\" may be used to endorse or promote products derived from this software
.\" without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" @(#)xstr.1 8.2 (Berkeley) 12/30/93
.\"
.Dd July 23, 2004
.Dt XSTR 1
.Os
.Sh NAME
.Nm xstr
.Nd "extract strings from C programs to implement shared strings"
.Sh SYNOPSIS
.Nm
.Op Fl cv
.Op Fl l Ar array
.Op Fl
.Op Ar
.Sh DESCRIPTION
.Nm
maintains a file
.Pa strings
into which strings in component parts of a large program are hashed.
These strings are replaced with references to this common area.
This serves to implement shared constant strings, most useful if they
are also read-only.
.Pp
Available options:
.Bl -tag -width XXlXarrayXX
.It Fl
.Nm
reads from the standard input.
.It Fl c
.Nm
will extract the strings from the C source
.Ar file
or the standard input
.Pq Fl ,
replacing
string references by expressions of the form (\*[Am]xstr[number])
for some number.
An appropriate declaration of
.Nm
is prepended to the file.
The resulting C text is placed in the file
.Pa x.c ,
to then be compiled.
The strings from this file are placed in the
.Pa strings
data base if they are not there already.
Repeated strings and strings which are suffixes of existing strings
do not cause changes to the data base.
.It Fl l Ar array
Specify the named array in program references to abstracted
strings.
The default array name is xstr.
.It Fl v
Be verbose.
.El
.Pp
After all components of a large program have been compiled, a file
.Pa xs.c
declaring the common
.Nm
space can be created by a command of the form:
.Pp
.Dl $ xstr
.Pp
The file
.Pa xs.c
should then be compiled and loaded with the rest
of the program.
If possible, the array can be made read-only (shared) saving
space and swap overhead.
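.Pp
The effect of the rewriting described above can be pictured with a small
hand-written C sketch.
It is an illustration only, not actual
.Nm
output: the array contents, the helper names xref, greeting and planet,
and the offsets are all assumptions made for the example, and the real
.Pa xs.c
may lay out its data differently.
It shows two string references resolved into one shared array, with the
shorter string stored only as a suffix of the longer one.
.Bd -literal -offset indent
```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the shared array that xs.c would define. */
static const char xstr[] = "hello world";

/* Hypothetical helper: the rewritten form of a string reference. */
static const char *
xref(size_t off)
{
	assert(off < sizeof(xstr));
	return (&xstr[off]);
}

/* Stands in for a reference to the string "hello world". */
static const char *
greeting(void)
{
	return (xref(0));
}

/* "world" is a suffix of "hello world", so it needs no storage of its own. */
static const char *
planet(void)
{
	return (xref(6));
}
```
.Ed
.Pp
In this sketch both references point into the same storage, which is the
saving the suffix rule is meant to provide.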
.Pp
.Nm
can also be used on a single file.
The following command creates files
.Pa x.c
and
.Pa xs.c
as before, without using or affecting any
.Pa strings
file in the same directory:
.Pp
.Dl $ xstr name
.Pp
It may be useful to run
.Nm
after the C preprocessor if any macro definitions yield strings
or if there is conditional code which contains strings
which may not, in fact, be needed.
An appropriate command sequence for running
.Nm
after the C preprocessor is:
.Pp
.Bd -literal -offset indent
$ cc \-E name.c | xstr \-c \-
$ cc \-c x.c
$ mv x.o name.o
.Ed
.Pp
.Nm
does not touch the file
.Pa strings
unless new items are added, thus
.Xr make 1
can avoid remaking
.Pa xs.o
unless truly necessary.
.Sh FILES
.Bl -tag -width /tmp/xsxx* -compact
.It Pa strings
Data base of strings
.It Pa x.c
Massaged C source
.It Pa xs.c
C source for definition of array `xstr'
.It Pa /tmp/xs*
Temp file when `xstr name' doesn't touch
.Pa strings
.El
.Sh SEE ALSO
.Xr mkstr 1
.Sh HISTORY
The
.Nm
command appeared in
.Bx 3.0 .
.Sh BUGS
If a string is a suffix of another string in the data base,
but the shorter string is seen first by
.Nm ,
both strings will be placed in the data base, when placing just
the longer one there would do.
.Pp
.Nm
does not parse the file properly, so it does not know not to process:
.Bd -literal
	char var[] = "const";
.Ed
into:
.Bd -literal
	char var[] = (\*[Am]xstr[N]);
.Ed
.Pp
These must be changed manually into an appropriate initialization for
the string, or worked around with the ugly hack shown below.
.Pp
Also,
.Nm
cannot initialize structures and unions that contain strings.
Those can be fixed by changing from:
.Bd -literal
	struct foo {
		int i;
		char buf[10];
	} f = {
		1, "foo"
	};
.Ed
to:
.Bd -literal
	struct foo {
		int i;
		char buf[10];
	} f = {
		1, { 'f', 'o', 'o', '\e0' }
	};
.Ed
.Pp
The real problem in both cases above is that the compiler knows the size
of the literal constant so that it can perform the initialization required,
but when
.Nm
changes the literal string to a pointer reference, the size information is
lost.
It would require a real parser to do this right, so the obvious solution is
to fix the program manually so that it compiles or, even better, to rely on
the compiler and the linker to merge strings appropriately.
.Pp
Finally,
.Nm
is not very useful these days because most of the string merging is done
automatically by the compiler and the linker, provided that the strings
are identical and read-only.
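.Pp
The loss of size information described earlier in this section can be
seen in a minimal C sketch.
It is not produced by
.Nm ;
the array contents and the names ref and check_sizes are assumptions made
for the example.
An array initialized from a literal takes its size from that literal,
while a pointer into a shared array only ever has the size of a pointer.
.Bd -literal -offset indent
```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the shared array that xs.c would define. */
static const char xstr[] = "const";

/* Array initialized from a literal: the compiler sizes it, NUL included. */
static char var[] = "const";

/* Pointer reference, as a rewritten string yields: only an address is left. */
static const char *ref = (&xstr[0]);

/* The array's size tracks the literal; the pointer's size never does. */
static void
check_sizes(void)
{
	assert(sizeof(var) == 6);
	assert(sizeof(ref) == sizeof(char *));
}
```
.Ed
.Pp
This is why an array or structure member cannot simply be given the
pointer form: the compiler no longer has a length from which to build
the object.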