Copyright (c) 1987 Regents of the University of California.
All rights reserved.  The Berkeley software License Agreement
specifies the terms and conditions for redistribution.

	@(#)README	1.4 (Berkeley) 03/20/87

All files and subdirectories of /usr/dict are recommended for
rdisting except web2 and web2a (because of their size), and some of
the files  hlist*  depending on needs of your machine (details below).
Descriptions of most of these files are given under FILES below.

The new subdirectory "special" contains lists of words in specialized
fields, which may be hashed in with the regular lists on machines having
many users working in these fields.  As of this writing, there are two
such specialized wordlists.

It is advised that system managers also create a directory
/usr/local/dict.  This can be used to maintain files of particular
interest to users of each machine (e.g., surnames of members of the
department on a departmental machine).

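
As a minimal sketch of how such a local list might be prepared, the
function below merges raw name files into one list, drops exact
duplicates, and leaves the result in  sort -df  order, the ordering
this README reports for the lists in /usr/dict.  The function name
build_local_list and the file names in the usage line are hypothetical,
not part of the distribution.

```shell
# build_local_list OUT IN...  (hypothetical helper, not distributed)
# Merge the IN files into OUT: remove exact duplicate lines first,
# then order the result as in "sort -df".
build_local_list() {
    out=$1
    shift
    sort -u "$@" | sort -df > "$out"
}
```

For example:  build_local_list /usr/local/dict/surnames faculty students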
The hashed wordlists hlista and hlistb in this distribution include
the words in the file special/4bsd, comprising current 4bsd
commands, system calls, etc. (from "abs" to "zcat").  Machines
whose primary users are programmers should take these files by
rdist.  For machines with other user populations, a file "hlist" is
provided which contains only the contents of /usr/dict/words.  Managers
of such machines should rdist this file, and use "spellin" to produce
files hlist{a,b} which contain the words from
/usr/dict/{american,british} respectively, plus any other files
appropriate to the needs of the majority of their users.  (Some basic
Unix commands and terms that general users are likely to encounter,
e.g. troff, emacs, tty, have been included in /usr/dict/words.
More may be added as suggestions are received.)  Here, for instance, is a
script that might be used to create the hashlists on a particular
machine, so as to include the words in /usr/dict/special/math, as well
as two local lists which we will assume are called
/usr/local/dict/surnames and /usr/local/dict/acronyms.

	#
	cd /usr/dict
	cat american special/math /usr/local/dict/{surnames,acronyms} | \
		spellin hlist > hlista
	cat british  special/math /usr/local/dict/{surnames,acronyms} | \
		spellin hlist > hlistb

     Hashlists can also be created from scratch using
/usr/src/usr.bin/spell/Makefile.  This is now written so that if "make"
is run with no options it will produce the hashed files as presently
distributed, but so that the extra wordlists used can be controlled with
variables LOCAL and SPECIAL.  For instance, the results given by the
above script can be obtained by doing:

	cd /usr/src/usr.bin/spell
	make LOCAL='/usr/local/dict/surnames /usr/local/dict/acronyms' \
		SPECIAL=special.math
	make install

     Returning to the subject of the wordlists in /usr/dict, these are,
in general, ordered as in  sort -df.  This makes no difference for
spell's hashing process, but makes a difference for other commands,
such as "look", that perform binary searches on the unhashed lists.

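
One way to confirm that a list is still in this order after hand
editing is to let sort check it.  The sketch below assumes a sort that
accepts -c together with -d and -f, as POSIX specifies; the function
name check_order is hypothetical.

```shell
# check_order FILE  (hypothetical helper, not distributed)
# Exit status 0 if FILE is already ordered as in "sort -df",
# nonzero otherwise.  sort's own diagnostic is discarded.
check_order() {
    sort -c -df "$1" 2>/dev/null
}
```

For example:  check_order /usr/local/dict/surnames || echo "needs sort -df"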
Complaints, and any additional suggestions for words or wordlists,
should be sent to me.  I cannot fix bugs involving the code of "spell",
but I am maintaining a list of these bugs, and of other ideas for
improvement.
		George Bergman, gbergman@cartan.Berkeley.Edu
		18 March, 1987

--------------------------------------------------------------------
FILES and subdirectories of /usr/dict:

    words    -- common words, and important technical terms from all
	fields, that are spelled the same in British and American usage.
    american -- spellings preferred in American but not British usage.
    british  -- spellings preferred in British but not American usage.
    stop     -- forms that would otherwise be derivable by "spell" from
	words in one of the above files, but should not be accepted.
    hlist    -- hashed list, formed from the file "words" only.
    hlista   -- hashed list, formed from files {words,american,special/4bsd}.
    hlistb   -- hashed list, formed from files {words,british,special/4bsd}.
    hstop    -- hashed list, formed from file "stop".
    web2     -- words from Webster's 2nd International (see WEB below).
    web2a    -- compounds and phrases from same source.
    README   -- this file
    papers/  -- an (out-of-date specialized) bibliographical database,
	used as the default by the program "refer".
    special/ -- directory of less common terms from specialized fields.
	It presently contains:

	special/4bsd -- commands and system calls (from filenames in
	    /usr/man/man[1238n]), and builtin csh commands (named in
	    /usr/man/man1/csh.1) of the current version of 4bsd Unix.
	    (Supersedes old "/usr/src/usr.bin/spell/local".)
	special/math -- some mathematical terms not in /usr/dict/words.

WEB ---- (introduction provided by jaw@riacs) -------------------------

Welcome to Webster's Second International, all 234,936 words worth.
The 1934 copyright has elapsed, according to the supplier.  The
supplemental 'web2a' list contains hyphenated terms as well as assorted
noun and adverbial phrases.  The wordlist makes a dandy 'grep' victim.

     -- James A. Woods    {ihnp4,hplabs}!ames!jaw    (or jaw@riacs)