diff --git a/contrib/gperf/FREEBSD-Xlist b/contrib/gperf/FREEBSD-Xlist index 120732a9f751..adb468037113 100644 --- a/contrib/gperf/FREEBSD-Xlist +++ b/contrib/gperf/FREEBSD-Xlist @@ -8,7 +8,5 @@ $FreeBSD$ */getopt* */*.pdf */*.vms -*/*.texi -*/*.texinfo */*.msvc */*.woe32 diff --git a/contrib/gperf/doc/gperf.texi b/contrib/gperf/doc/gperf.texi new file mode 100644 index 000000000000..e23f7b5329c0 --- /dev/null +++ b/contrib/gperf/doc/gperf.texi @@ -0,0 +1,1370 @@ +\input texinfo @c -*- texinfo -*- +@c %**start of header +@setfilename gperf.info +@settitle Perfect Hash Function Generator +@c @setchapternewpage odd +@c %**end of header + +@c some day we should @include version.texi instead of defining +@c these values at hand. +@set UPDATED 31 March 2007 +@set EDITION 3.0.3 +@set VERSION 3.0.3 +@c --------------------- + +@c remove the black boxes generated in the GPL appendix. +@finalout + +@c Merge functions into the concept index +@syncodeindex fn cp +@c @synindex pg cp + +@dircategory Programming Tools +@direntry +* Gperf: (gperf). Perfect Hash Function Generator. +@end direntry + +@ifinfo +This file documents the features of the GNU Perfect Hash Function +Generator @value{VERSION}. + +Copyright @copyright{} 1989-2006 Free Software Foundation, Inc. + +Permission is granted to make and distribute verbatim copies of this +manual provided the copyright notice and this permission notice are +preserved on all copies. + +@ignore +Permission is granted to process this file through TeX and print the +results, provided the printed document carries a copying permission +notice identical to this one except for the removal of this paragraph +(this paragraph not being relevant to the printed manual). + +@end ignore + +Permission is granted to copy and distribute modified versions of this +manual under the conditions for verbatim copying, provided also that the +section entitled ``GNU General Public License'' is included exactly as +in the original, and provided that the entire resulting derived work is +distributed under the terms of a permission notice identical to this +one. + +Permission is granted to copy and distribute translations of this manual +into another language, under the above conditions for modified versions, +except that the section entitled ``GNU General Public License'' and this +permission notice may be included in translations approved by the Free +Software Foundation instead of in the original English. + +@end ifinfo + +@titlepage +@title User's Guide to @code{gperf} @value{VERSION} +@subtitle The GNU Perfect Hash Function Generator +@subtitle Edition @value{EDITION}, @value{UPDATED} +@author Douglas C. Schmidt +@author Bruno Haible + +@page +@vskip 0pt plus 1filll +Copyright @copyright{} 1989-2007 Free Software Foundation, Inc. + + +Permission is granted to make and distribute verbatim copies of +this manual provided the copyright notice and this permission notice +are preserved on all copies. + +Permission is granted to copy and distribute modified versions of this +manual under the conditions for verbatim copying, provided also that the +section entitled ``GNU General Public License'' is included +exactly as in the original, and provided that the entire resulting +derived work is distributed under the terms of a permission notice +identical to this one. + +Permission is granted to copy and distribute translations of this manual +into another language, under the above conditions for modified versions, +except that the section entitled ``GNU General Public License'' may be +included in a translation approved by the author instead of in the +original English. +@end titlepage + +@ifinfo +@node Top, Copying, (dir), (dir) +@top Introduction + +This manual documents the GNU @code{gperf} perfect hash function generator +utility, focusing on its features and how to use them, and how to report +bugs. + +@menu +* Copying:: GNU @code{gperf} General Public License says + how you can copy and share @code{gperf}. +* Contributors:: People who have contributed to @code{gperf}. +* Motivation:: The purpose of @code{gperf}. +* Search Structures:: Static search structures and GNU @code{gperf} +* Description:: High-level discussion of how GPERF functions. +* Options:: A description of options to the program. +* Bugs:: Known bugs and limitations with GPERF. +* Projects:: Things still left to do. +* Bibliography:: Material Referenced in this Report. + +* Concept Index:: + +@detailmenu --- The Detailed Node Listing --- + +High-Level Description of GNU @code{gperf} + +* Input Format:: Input Format to @code{gperf} +* Output Format:: Output Format for Generated C Code with @code{gperf} +* Binary Strings:: Use of NUL bytes + +Input Format to @code{gperf} + +* Declarations:: Declarations. +* Keywords:: Format for Keyword Entries. +* Functions:: Including Additional C Functions. +* Controls for GNU indent:: Where to place directives for GNU @code{indent}. + +Declarations + +* User-supplied Struct:: Specifying keywords with attributes. +* Gperf Declarations:: Embedding command line options in the input. +* C Code Inclusion:: Including C declarations and definitions. + +Invoking @code{gperf} + +* Input Details:: Options that affect Interpretation of the Input File +* Output Language:: Specifying the Language for the Output Code +* Output Details:: Fine tuning Details in the Output Code +* Algorithmic Details:: Changing the Algorithms employed by @code{gperf} +* Verbosity:: Informative Output + +@end detailmenu +@end menu + +@end ifinfo + +@node Copying, Contributors, Top, Top +@unnumbered GNU GENERAL PUBLIC LICENSE +@include gpl.texinfo + +@node Contributors, Motivation, Copying, Top +@unnumbered Contributors to GNU @code{gperf} Utility + +@itemize @bullet +@item +@cindex Bugs +The GNU @code{gperf} perfect hash function generator utility was +written in GNU C++ by Douglas C. Schmidt. The general +idea for the perfect hash function generator was inspired by Keith +Bostic's algorithm written in C, and distributed to net.sources around +1984. The current program is a heavily modified, enhanced, and extended +implementation of Keith's basic idea, created at the University of +California, Irvine. Bugs, patches, and suggestions should be reported +to @code{}. + +@item +Special thanks is extended to Michael Tiemann and Doug Lea, for +providing a useful compiler, and for giving me a forum to exhibit my +creation. + +In addition, Adam de Boor and Nels Olson provided many tips and insights +that greatly helped improve the quality and functionality of @code{gperf}. + +@item +Bruno Haible enhanced and optimized the search algorithm. He also rewrote +the input routines and the output routines for better reliability, and +added a testsuite. +@end itemize + +@node Motivation, Search Structures, Contributors, Top +@chapter Introduction + +@code{gperf} is a perfect hash function generator written in C++. It +transforms an @var{n} element user-specified keyword set @var{W} into a +perfect hash function @var{F}. @var{F} uniquely maps keywords in +@var{W} onto the range 0..@var{k}, where @var{k} >= @var{n-1}. If @var{k} += @var{n-1} then @var{F} is a @emph{minimal} perfect hash function. +@code{gperf} generates a 0..@var{k} element static lookup table and a +pair of C functions. These functions determine whether a given +character string @var{s} occurs in @var{W}, using at most one probe into +the lookup table. + +@code{gperf} currently generates the reserved keyword recognizer for +lexical analyzers in several production and research compilers and +language processing tools, including GNU C, GNU C++, GNU Java, GNU Pascal, +GNU Modula 3, and GNU indent. Complete C++ source code for @code{gperf} is +available from @code{http://ftp.gnu.org/pub/gnu/gperf/}. +A paper describing @code{gperf}'s design and implementation in greater +detail is available in the Second USENIX C++ Conference proceedings +or from @code{http://www.cs.wustl.edu/~schmidt/resume.html}. + +@node Search Structures, Description, Motivation, Top +@chapter Static search structures and GNU @code{gperf} +@cindex Static search structure + +A @dfn{static search structure} is an Abstract Data Type with certain +fundamental operations, e.g., @emph{initialize}, @emph{insert}, +and @emph{retrieve}. Conceptually, all insertions occur before any +retrievals. In practice, @code{gperf} generates a @emph{static} array +containing search set keywords and any associated attributes specified +by the user. Thus, there is essentially no execution-time cost for the +insertions. It is a useful data structure for representing @emph{static +search sets}. Static search sets occur frequently in software system +applications. Typical static search sets include compiler reserved +words, assembler instruction opcodes, and built-in shell interpreter +commands. Search set members, called @dfn{keywords}, are inserted into +the structure only once, usually during program initialization, and are +not generally modified at run-time. + +Numerous static search structure implementations exist, e.g., +arrays, linked lists, binary search trees, digital search tries, and +hash tables. Different approaches offer trade-offs between space +utilization and search time efficiency. For example, an @var{n} element +sorted array is space efficient, though the average-case time +complexity for retrieval operations using binary search is +proportional to log @var{n}. Conversely, hash table implementations +often locate a table entry in constant time, but typically impose +additional memory overhead and exhibit poor worst case performance. + +@cindex Minimal perfect hash functions +@emph{Minimal perfect hash functions} provide an optimal solution for a +particular class of static search sets. A minimal perfect hash +function is defined by two properties: + +@itemize @bullet +@item +It allows keyword recognition in a static search set using at most +@emph{one} probe into the hash table. This represents the ``perfect'' +property. +@item +The actual memory allocated to store the keywords is precisely large +enough for the keyword set, and @emph{no larger}. This is the +``minimal'' property. +@end itemize + +For most applications it is far easier to generate @emph{perfect} hash +functions than @emph{minimal perfect} hash functions. Moreover, +non-minimal perfect hash functions frequently execute faster than +minimal ones in practice. This phenomena occurs since searching a +sparse keyword table increases the probability of locating a ``null'' +entry, thereby reducing string comparisons. @code{gperf}'s default +behavior generates @emph{near-minimal} perfect hash functions for +keyword sets. However, @code{gperf} provides many options that permit +user control over the degree of minimality and perfection. + +Static search sets often exhibit relative stability over time. For +example, Ada's 63 reserved words have remained constant for nearly a +decade. It is therefore frequently worthwhile to expend concerted +effort building an optimal search structure @emph{once}, if it +subsequently receives heavy use multiple times. @code{gperf} removes +the drudgery associated with constructing time- and space-efficient +search structures by hand. It has proven a useful and practical tool +for serious programming projects. Output from @code{gperf} is currently +used in several production and research compilers, including GNU C, GNU +C++, GNU Java, GNU Pascal, and GNU Modula 3. The latter two compilers are +not yet part of the official GNU distribution. Each compiler utilizes +@code{gperf} to automatically generate static search structures that +efficiently identify their respective reserved keywords. + +@node Description, Options, Search Structures, Top +@chapter High-Level Description of GNU @code{gperf} + +@menu +* Input Format:: Input Format to @code{gperf} +* Output Format:: Output Format for Generated C Code with @code{gperf} +* Binary Strings:: Use of NUL bytes +@end menu + +The perfect hash function generator @code{gperf} reads a set of +``keywords'' from an input file (or from the standard input by +default). It attempts to derive a perfect hashing function that +recognizes a member of the @dfn{static keyword set} with at most a +single probe into the lookup table. If @code{gperf} succeeds in +generating such a function it produces a pair of C source code routines +that perform hashing and table lookup recognition. All generated C code +is directed to the standard output. Command-line options described +below allow you to modify the input and output format to @code{gperf}. + +By default, @code{gperf} attempts to produce time-efficient code, with +less emphasis on efficient space utilization. However, several options +exist that permit trading-off execution time for storage space and vice +versa. In particular, expanding the generated table size produces a +sparse search structure, generally yielding faster searches. +Conversely, you can direct @code{gperf} to utilize a C @code{switch} +statement scheme that minimizes data space storage size. Furthermore, +using a C @code{switch} may actually speed up the keyword retrieval time +somewhat. Actual results depend on your C compiler, of course. + +In general, @code{gperf} assigns values to the bytes it is using +for hashing until some set of values gives each keyword a unique value. +A helpful heuristic is that the larger the hash value range, the easier +it is for @code{gperf} to find and generate a perfect hash function. +Experimentation is the key to getting the most from @code{gperf}. + +@node Input Format, Output Format, Description, Description +@section Input Format to @code{gperf} +@cindex Format +@cindex Declaration section +@cindex Keywords section +@cindex Functions section +You can control the input file format by varying certain command-line +arguments, in particular the @samp{-t} option. The input's appearance +is similar to GNU utilities @code{flex} and @code{bison} (or UNIX +utilities @code{lex} and @code{yacc}). Here's an outline of the general +format: + +@example +@group +declarations +%% +keywords +%% +functions +@end group +@end example + +@emph{Unlike} @code{flex} or @code{bison}, the declarations section and +the functions section are optional. The following sections describe the +input format for each section. + +@menu +* Declarations:: Declarations. +* Keywords:: Format for Keyword Entries. +* Functions:: Including Additional C Functions. +* Controls for GNU indent:: Where to place directives for GNU @code{indent}. +@end menu + +It is possible to omit the declaration section entirely, if the @samp{-t} +option is not given. In this case the input file begins directly with the +first keyword line, e.g.: + +@example +@group +january +february +march +april +... +@end group +@end example + +@node Declarations, Keywords, Input Format, Input Format +@subsection Declarations + +The keyword input file optionally contains a section for including +arbitrary C declarations and definitions, @code{gperf} declarations that +act like command-line options, as well as for providing a user-supplied +@code{struct}. + +@menu +* User-supplied Struct:: Specifying keywords with attributes. +* Gperf Declarations:: Embedding command line options in the input. +* C Code Inclusion:: Including C declarations and definitions. +@end menu + +@node User-supplied Struct, Gperf Declarations, Declarations, Declarations +@subsubsection User-supplied @code{struct} + +If the @samp{-t} option (or, equivalently, the @samp{%struct-type} declaration) +@emph{is} enabled, you @emph{must} provide a C @code{struct} as the last +component in the declaration section from the input file. The first +field in this struct must be of type @code{char *} or @code{const char *} +if the @samp{-P} option is not given, or of type @code{int} if the option +@samp{-P} (or, equivalently, the @samp{%pic} declaration) is enabled. +This first field must be called @samp{name}, although it is possible to modify +its name with the @samp{-K} option (or, equivalently, the +@samp{%define slot-name} declaration) described below. + +Here is a simple example, using months of the year and their attributes as +input: + +@example +@group +struct month @{ char *name; int number; int days; int leap_days; @}; +%% +january, 1, 31, 31 +february, 2, 28, 29 +march, 3, 31, 31 +april, 4, 30, 30 +may, 5, 31, 31 +june, 6, 30, 30 +july, 7, 31, 31 +august, 8, 31, 31 +september, 9, 30, 30 +october, 10, 31, 31 +november, 11, 30, 30 +december, 12, 31, 31 +@end group +@end example + +@cindex @samp{%%} +Separating the @code{struct} declaration from the list of keywords and +other fields are a pair of consecutive percent signs, @samp{%%}, +appearing left justified in the first column, as in the UNIX utility +@code{lex}. + +If the @code{struct} has already been declared in an include file, it can +be mentioned in an abbreviated form, like this: + +@example +@group +struct month; +%% +january, 1, 31, 31 +... +@end group +@end example + +@node Gperf Declarations, C Code Inclusion, User-supplied Struct, Declarations +@subsubsection Gperf Declarations + +The declaration section can contain @code{gperf} declarations. They +influence the way @code{gperf} works, like command line options do. +In fact, every such declaration is equivalent to a command line option. +There are three forms of declarations: + +@enumerate +@item +Declarations without argument, like @samp{%compare-lengths}. + +@item +Declarations with an argument, like @samp{%switch=@var{count}}. + +@item +Declarations of names of entities in the output file, like +@samp{%define lookup-function-name @var{name}}. +@end enumerate + +When a declaration is given both in the input file and as a command line +option, the command-line option's value prevails. + +The following @code{gperf} declarations are available. + +@table @samp +@item %delimiters=@var{delimiter-list} +@cindex @samp{%delimiters} +Allows you to provide a string containing delimiters used to +separate keywords from their attributes. The default is ",". This +option is essential if you want to use keywords that have embedded +commas or newlines. + +@item %struct-type +@cindex @samp{%struct-type} +Allows you to include a @code{struct} type declaration for generated +code; see above for an example. + +@item %ignore-case +@cindex @samp{%ignore-case} +Consider upper and lower case ASCII characters as equivalent. The string +comparison will use a case insignificant character comparison. Note that +locale dependent case mappings are ignored. + +@item %language=@var{language-name} +@cindex @samp{%language} +Instructs @code{gperf} to generate code in the language specified by the +option's argument. Languages handled are currently: + +@table @samp +@item KR-C +Old-style K&R C. This language is understood by old-style C compilers and +ANSI C compilers, but ANSI C compilers may flag warnings (or even errors) +because of lacking @samp{const}. + +@item C +Common C. This language is understood by ANSI C compilers, and also by +old-style C compilers, provided that you @code{#define const} to empty +for compilers which don't know about this keyword. + +@item ANSI-C +ANSI C. This language is understood by ANSI C compilers and C++ compilers. + +@item C++ +C++. This language is understood by C++ compilers. +@end table + +The default is C. + +@item %define slot-name @var{name} +@cindex @samp{%define slot-name} +This declaration is only useful when option @samp{-t} (or, equivalently, the +@samp{%struct-type} declaration) has been given. +By default, the program assumes the structure component identifier for +the keyword is @samp{name}. This option allows an arbitrary choice of +identifier for this component, although it still must occur as the first +field in your supplied @code{struct}. + +@item %define initializer-suffix @var{initializers} +@cindex @samp{%define initializer-suffix} +This declaration is only useful when option @samp{-t} (or, equivalently, the +@samp{%struct-type} declaration) has been given. +It permits to specify initializers for the structure members following +@var{slot-name} in empty hash table entries. The list of initializers +should start with a comma. By default, the emitted code will +zero-initialize structure members following @var{slot-name}. + +@item %define hash-function-name @var{name} +@cindex @samp{%define hash-function-name} +Allows you to specify the name for the generated hash function. Default +name is @samp{hash}. This option permits the use of two hash tables in +the same file. + +@item %define lookup-function-name @var{name} +@cindex @samp{%define lookup-function-name} +Allows you to specify the name for the generated lookup function. +Default name is @samp{in_word_set}. This option permits multiple +generated hash functions to be used in the same application. + +@item %define class-name @var{name} +@cindex @samp{%define class-name} +This option is only useful when option @samp{-L C++} (or, equivalently, +the @samp{%language=C++} declaration) has been given. It +allows you to specify the name of generated C++ class. Default name is +@code{Perfect_Hash}. + +@item %7bit +@cindex @samp{%7bit} +This option specifies that all strings that will be passed as arguments +to the generated hash function and the generated lookup function will +solely consist of 7-bit ASCII characters (bytes in the range 0..127). +(Note that the ANSI C functions @code{isalnum} and @code{isgraph} do +@emph{not} guarantee that a byte is in this range. Only an explicit +test like @samp{c >= 'A' && c <= 'Z'} guarantees this.) + +@item %compare-lengths +@cindex @samp{%compare-lengths} +Compare keyword lengths before trying a string comparison. This option +is mandatory for binary comparisons (@pxref{Binary Strings}). It also might +cut down on the number of string comparisons made during the lookup, since +keywords with different lengths are never compared via @code{strcmp}. +However, using @samp{%compare-lengths} might greatly increase the size of the +generated C code if the lookup table range is large (which implies that +the switch option @samp{-S} or @samp{%switch} is not enabled), since the length +table contains as many elements as there are entries in the lookup table. + +@item %compare-strncmp +@cindex @samp{%compare-strncmp} +Generates C code that uses the @code{strncmp} function to perform +string comparisons. The default action is to use @code{strcmp}. + +@item %readonly-tables +@cindex @samp{%readonly-tables} +Makes the contents of all generated lookup tables constant, i.e., +``readonly''. Many compilers can generate more efficient code for this +by putting the tables in readonly memory. + +@item %enum +@cindex @samp{%enum} +Define constant values using an enum local to the lookup function rather +than with #defines. This also means that different lookup functions can +reside in the same file. Thanks to James Clark @code{}. + +@item %includes +@cindex @samp{%includes} +Include the necessary system include file, @code{}, at the +beginning of the code. By default, this is not done; the user must +include this header file himself to allow compilation of the code. + +@item %global-table +@cindex @samp{%global-table} +Generate the static table of keywords as a static global variable, +rather than hiding it inside of the lookup function (which is the +default behavior). + +@item %pic +@cindex @samp{%pic} +Optimize the generated table for inclusion in shared libraries. This +reduces the startup time of programs using a shared library containing +the generated code. If the @samp{%struct-type} declaration (or, +equivalently, the option @samp{-t}) is also given, the first field of the +user-defined struct must be of type @samp{int}, not @samp{char *}, because +it will contain offsets into the string pool instead of actual strings. +To convert such an offset to a string, you can use the expression +@samp{stringpool + @var{o}}, where @var{o} is the offset. The string pool +name can be changed through the @samp{%define string-pool-name} declaration. + +@item %define string-pool-name @var{name} +@cindex @samp{%define string-pool-name} +Allows you to specify the name of the generated string pool created by +the declaration @samp{%pic} (or, equivalently, the option @samp{-P}). +The default name is @samp{stringpool}. This declaration permits the use of +two hash tables in the same file, with @samp{%pic} and even when the +@samp{%global-table} declaration (or, equivalently, the option @samp{-G}) +is given. + +@item %null-strings +@cindex @samp{%null-strings} +Use NULL strings instead of empty strings for empty keyword table entries. +This reduces the startup time of programs using a shared library containing +the generated code (but not as much as the declaration @samp{%pic}), at the +expense of one more test-and-branch instruction at run time. + +@item %define word-array-name @var{name} +@cindex @samp{%define word-array-name} +Allows you to specify the name for the generated array containing the +hash table. Default name is @samp{wordlist}. This option permits the +use of two hash tables in the same file, even when the option @samp{-G} +(or, equivalently, the @samp{%global-table} declaration) is given. + +@item %define length-table-name @var{name} +@cindex @samp{%define length-table-name} +Allows you to specify the name for the generated array containing the +length table. Default name is @samp{lengthtable}. This option permits the +use of two length tables in the same file, even when the option @samp{-G} +(or, equivalently, the @samp{%global-table} declaration) is given. + +@item %switch=@var{count} +@cindex @samp{%switch} +Causes the generated C code to use a @code{switch} statement scheme, +rather than an array lookup table. This can lead to a reduction in both +time and space requirements for some input files. The argument to this +option determines how many @code{switch} statements are generated. A +value of 1 generates 1 @code{switch} containing all the elements, a +value of 2 generates 2 tables with 1/2 the elements in each +@code{switch}, etc. This is useful since many C compilers cannot +correctly generate code for large @code{switch} statements. This option +was inspired in part by Keith Bostic's original C program. + +@item %omit-struct-type +@cindex @samp{%omit-struct-type} +Prevents the transfer of the type declaration to the output file. Use +this option if the type is already defined elsewhere. +@end table + +@node C Code Inclusion, , Gperf Declarations, Declarations +@subsubsection C Code Inclusion + +@cindex @samp{%@{} +@cindex @samp{%@}} +Using a syntax similar to GNU utilities @code{flex} and @code{bison}, it +is possible to directly include C source text and comments verbatim into +the generated output file. This is accomplished by enclosing the region +inside left-justified surrounding @samp{%@{}, @samp{%@}} pairs. Here is +an input fragment based on the previous example that illustrates this +feature: + +@example +@group +%@{ +#include +/* This section of code is inserted directly into the output. */ +int return_month_days (struct month *months, int is_leap_year); +%@} +struct month @{ char *name; int number; int days; int leap_days; @}; +%% +january, 1, 31, 31 +february, 2, 28, 29 +march, 3, 31, 31 +... +@end group +@end example + +@node Keywords, Functions, Declarations, Input Format +@subsection Format for Keyword Entries + +The second input file format section contains lines of keywords and any +associated attributes you might supply. A line beginning with @samp{#} +in the first column is considered a comment. Everything following the +@samp{#} is ignored, up to and including the following newline. A line +beginning with @samp{%} in the first column is an option declaration and +must not occur within the keywords section. + +The first field of each non-comment line is always the keyword itself. It +can be given in two ways: as a simple name, i.e., without surrounding +string quotation marks, or as a string enclosed in double-quotes, in +C syntax, possibly with backslash escapes like @code{\"} or @code{\234} +or @code{\xa8}. In either case, it must start right at the beginning +of the line, without leading whitespace. +In this context, a ``field'' is considered to extend up to, but +not include, the first blank, comma, or newline. Here is a simple +example taken from a partial list of C reserved words: + +@example +@group +# These are a few C reserved words, see the c.gperf file +# for a complete list of ANSI C reserved words. +unsigned +sizeof +switch +signed +if +default +for +while +return +@end group +@end example + +Note that unlike @code{flex} or @code{bison} the first @samp{%%} marker +may be elided if the declaration section is empty. + +Additional fields may optionally follow the leading keyword. Fields +should be separated by commas, and terminate at the end of line. What +these fields mean is entirely up to you; they are used to initialize the +elements of the user-defined @code{struct} provided by you in the +declaration section. If the @samp{-t} option (or, equivalently, the +@samp{%struct-type} declaration) is @emph{not} enabled +these fields are simply ignored. All previous examples except the last +one contain keyword attributes. + +@node Functions, Controls for GNU indent, Keywords, Input Format +@subsection Including Additional C Functions + +The optional third section also corresponds closely with conventions +found in @code{flex} and @code{bison}. All text in this section, +starting at the final @samp{%%} and extending to the end of the input +file, is included verbatim into the generated output file. Naturally, +it is your responsibility to ensure that the code contained in this +section is valid C. + +@node Controls for GNU indent, , Functions, Input Format +@subsection Where to place directives for GNU @code{indent}. + +If you want to invoke GNU @code{indent} on a @code{gperf} input file, +you will see that GNU @code{indent} doesn't understand the @samp{%%}, +@samp{%@{} and @samp{%@}} directives that control @code{gperf}'s +interpretation of the input file. Therefore you have to insert some +directives for GNU @code{indent}. More precisely, assuming the most +general input file structure + +@example +@group +declarations part 1 +%@{ +verbatim code +%@} +declarations part 2 +%% +keywords +%% +functions +@end group +@end example + +@noindent +you would insert @samp{*INDENT-OFF*} and @samp{*INDENT-ON*} comments +as follows: + +@example +@group +/* *INDENT-OFF* */ +declarations part 1 +%@{ +/* *INDENT-ON* */ +verbatim code +/* *INDENT-OFF* */ +%@} +declarations part 2 +%% +keywords +%% +/* *INDENT-ON* */ +functions +@end group +@end example + +@node Output Format, Binary Strings, Input Format, Description +@section Output Format for Generated C Code with @code{gperf} +@cindex hash table + +Several options control how the generated C code appears on the standard +output. Two C functions are generated. They are called @code{hash} and +@code{in_word_set}, although you may modify their names with a command-line +option. Both functions require two arguments, a string, @code{char *} +@var{str}, and a length parameter, @code{int} @var{len}. Their default +function prototypes are as follows: + +@deftypefun {unsigned int} hash (const char * @var{str}, unsigned int @var{len}) +By default, the generated @code{hash} function returns an integer value +created by adding @var{len} to several user-specified @var{str} byte +positions indexed into an @dfn{associated values} table stored in a +local static array. The associated values table is constructed +internally by @code{gperf} and later output as a static local C array +called @samp{hash_table}. The relevant selected positions (i.e. indices +into @var{str}) are specified via the @samp{-k} option when running +@code{gperf}, as detailed in the @emph{Options} section below (@pxref{Options}). +@end deftypefun + +@deftypefun {} in_word_set (const char * @var{str}, unsigned int @var{len}) +If @var{str} is in the keyword set, returns a pointer to that +keyword. More exactly, if the option @samp{-t} (or, equivalently, the +@samp{%struct-type} declaration) was given, it returns +a pointer to the matching keyword's structure. Otherwise it returns +@code{NULL}. +@end deftypefun + +If the option @samp{-c} (or, equivalently, the @samp{%compare-strncmp} +declaration) is not used, @var{str} must be a NUL terminated +string of exactly length @var{len}. If @samp{-c} (or, equivalently, the +@samp{%compare-strncmp} declaration) is used, @var{str} must +simply be an array of @var{len} bytes and does not need to be NUL +terminated. + +The code generated for these two functions is affected by the following +options: + +@table @samp +@item -t +@itemx --struct-type +Make use of the user-defined @code{struct}. + +@item -S @var{total-switch-statements} +@itemx --switch=@var{total-switch-statements} +@cindex @code{switch} +Generate 1 or more C @code{switch} statement rather than use a large, +(and potentially sparse) static array. Although the exact time and +space savings of this approach vary according to your C compiler's +degree of optimization, this method often results in smaller and faster +code. +@end table + +If the @samp{-t} and @samp{-S} options (or, equivalently, the +@samp{%struct-type} and @samp{%switch} declarations) are omitted, the default +action +is to generate a @code{char *} array containing the keywords, together with +additional empty strings used for padding the array. By experimenting +with the various input and output options, and timing the resulting C +code, you can determine the best option choices for different keyword +set characteristics. + +@node Binary Strings, , Output Format, Description +@section Use of NUL bytes +@cindex NUL + +By default, the code generated by @code{gperf} operates on zero +terminated strings, the usual representation of strings in C. This means +that the keywords in the input file must not contain NUL bytes, +and the @var{str} argument passed to @code{hash} or @code{in_word_set} +must be NUL terminated and have exactly length @var{len}. + +If option @samp{-c} (or, equivalently, the @samp{%compare-strncmp} +declaration) is used, then the @var{str} argument does not need +to be NUL terminated. The code generated by @code{gperf} will only +access the first @var{len}, not @var{len+1}, bytes starting at @var{str}. +However, the keywords in the input file still must not contain NUL +bytes. + +If option @samp{-l} (or, equivalently, the @samp{%compare-lengths} +declaration) is used, then the hash table performs binary +comparison. The keywords in the input file may contain NUL bytes, +written in string syntax as @code{\000} or @code{\x00}, and the code +generated by @code{gperf} will treat NUL like any other byte. +Also, in this case the @samp{-c} option (or, equivalently, the +@samp{%compare-strncmp} declaration) is ignored. + +@node Options, Bugs, Description, Top +@chapter Invoking @code{gperf} + +There are @emph{many} options to @code{gperf}. They were added to make +the program more convenient for use with real applications. ``On-line'' +help is readily available via the @samp{--help} option. Here is the +complete list of options. + +@menu +* Output File:: Specifying the Location of the Output File +* Input Details:: Options that affect Interpretation of the Input File +* Output Language:: Specifying the Language for the Output Code +* Output Details:: Fine tuning Details in the Output Code +* Algorithmic Details:: Changing the Algorithms employed by @code{gperf} +* Verbosity:: Informative Output +@end menu + +@node Output File, Input Details, Options, Options +@section Specifying the Location of the Output File + +@table @samp +@item --output-file=@var{file} +Allows you to specify the name of the file to which the output is written to. +@end table + +The results are written to standard output if no output file is specified +or if it is @samp{-}. + +@node Input Details, Output Language, Output File, Options +@section Options that affect Interpretation of the Input File + +These options are also available as declarations in the input file +(@pxref{Gperf Declarations}). + +@table @samp +@item -e @var{keyword-delimiter-list} +@itemx --delimiters=@var{keyword-delimiter-list} +@cindex Delimiters +Allows you to provide a string containing delimiters used to +separate keywords from their attributes. The default is ",". This +option is essential if you want to use keywords that have embedded +commas or newlines. One useful trick is to use -e'TAB', where TAB is +the literal tab character. + +@item -t +@itemx --struct-type +Allows you to include a @code{struct} type declaration for generated +code. Any text before a pair of consecutive @samp{%%} is considered +part of the type declaration. Keywords and additional fields may follow +this, one group of fields per line. A set of examples for generating +perfect hash tables and functions for Ada, C, C++, Pascal, Modula 2, +Modula 3 and JavaScript reserved words are distributed with this release. + +@item --ignore-case +Consider upper and lower case ASCII characters as equivalent. The string +comparison will use a case insignificant character comparison. Note that +locale dependent case mappings are ignored. This option is therefore not +suitable if a properly internationalized or locale aware case mapping +should be used. (For example, in a Turkish locale, the upper case equivalent +of the lowercase ASCII letter @samp{i} is the non-ASCII character +@samp{capital i with dot above}.) For this case, it is better to apply +an uppercase or lowercase conversion on the string before passing it to +the @code{gperf} generated function. +@end table + +@node Output Language, Output Details, Input Details, Options +@section Options to specify the Language for the Output Code + +These options are also available as declarations in the input file +(@pxref{Gperf Declarations}). + +@table @samp +@item -L @var{generated-language-name} +@itemx --language=@var{generated-language-name} +Instructs @code{gperf} to generate code in the language specified by the +option's argument. Languages handled are currently: + +@table @samp +@item KR-C +Old-style K&R C. This language is understood by old-style C compilers and +ANSI C compilers, but ANSI C compilers may flag warnings (or even errors) +because of lacking @samp{const}. + +@item C +Common C. This language is understood by ANSI C compilers, and also by +old-style C compilers, provided that you @code{#define const} to empty +for compilers which don't know about this keyword. + +@item ANSI-C +ANSI C. This language is understood by ANSI C compilers and C++ compilers. + +@item C++ +C++. This language is understood by C++ compilers. +@end table + +The default is C. + +@item -a +This option is supported for compatibility with previous releases of +@code{gperf}. It does not do anything. + +@item -g +This option is supported for compatibility with previous releases of +@code{gperf}. It does not do anything. +@end table + +@node Output Details, Algorithmic Details, Output Language, Options +@section Options for fine tuning Details in the Output Code + +Most of these options are also available as declarations in the input file +(@pxref{Gperf Declarations}). + +@table @samp +@item -K @var{slot-name} +@itemx --slot-name=@var{slot-name} +@cindex Slot name +This option is only useful when option @samp{-t} (or, equivalently, the +@samp{%struct-type} declaration) has been given. +By default, the program assumes the structure component identifier for +the keyword is @samp{name}. This option allows an arbitrary choice of +identifier for this component, although it still must occur as the first +field in your supplied @code{struct}. + +@item -F @var{initializers} +@itemx --initializer-suffix=@var{initializers} +@cindex Initializers +This option is only useful when option @samp{-t} (or, equivalently, the +@samp{%struct-type} declaration) has been given. +It permits to specify initializers for the structure members following +@var{slot-name} in empty hash table entries. The list of initializers +should start with a comma. By default, the emitted code will +zero-initialize structure members following @var{slot-name}. + +@item -H @var{hash-function-name} +@itemx --hash-function-name=@var{hash-function-name} +Allows you to specify the name for the generated hash function. Default +name is @samp{hash}. This option permits the use of two hash tables in +the same file. + +@item -N @var{lookup-function-name} +@itemx --lookup-function-name=@var{lookup-function-name} +Allows you to specify the name for the generated lookup function. +Default name is @samp{in_word_set}. This option permits multiple +generated hash functions to be used in the same application. + +@item -Z @var{class-name} +@itemx --class-name=@var{class-name} +@cindex Class name +This option is only useful when option @samp{-L C++} (or, equivalently, +the @samp{%language=C++} declaration) has been given. It +allows you to specify the name of generated C++ class. Default name is +@code{Perfect_Hash}. + +@item -7 +@itemx --seven-bit +This option specifies that all strings that will be passed as arguments +to the generated hash function and the generated lookup function will +solely consist of 7-bit ASCII characters (bytes in the range 0..127). +(Note that the ANSI C functions @code{isalnum} and @code{isgraph} do +@emph{not} guarantee that a byte is in this range. Only an explicit +test like @samp{c >= 'A' && c <= 'Z'} guarantees this.) This was the +default in versions of @code{gperf} earlier than 2.7; now the default is +to support 8-bit and multibyte characters. + +@item -l +@itemx --compare-lengths +Compare keyword lengths before trying a string comparison. This option +is mandatory for binary comparisons (@pxref{Binary Strings}). It also might +cut down on the number of string comparisons made during the lookup, since +keywords with different lengths are never compared via @code{strcmp}. +However, using @samp{-l} might greatly increase the size of the +generated C code if the lookup table range is large (which implies that +the switch option @samp{-S} or @samp{%switch} is not enabled), since the length +table contains as many elements as there are entries in the lookup table. + +@item -c +@itemx --compare-strncmp +Generates C code that uses the @code{strncmp} function to perform +string comparisons. The default action is to use @code{strcmp}. + +@item -C +@itemx --readonly-tables +Makes the contents of all generated lookup tables constant, i.e., +``readonly''. Many compilers can generate more efficient code for this +by putting the tables in readonly memory. + +@item -E +@itemx --enum +Define constant values using an enum local to the lookup function rather +than with #defines. This also means that different lookup functions can +reside in the same file. Thanks to James Clark @code{}. + +@item -I +@itemx --includes +Include the necessary system include file, @code{}, at the +beginning of the code. By default, this is not done; the user must +include this header file himself to allow compilation of the code. + +@item -G +@itemx --global-table +Generate the static table of keywords as a static global variable, +rather than hiding it inside of the lookup function (which is the +default behavior). + +@item -P +@itemx --pic +Optimize the generated table for inclusion in shared libraries. This +reduces the startup time of programs using a shared library containing +the generated code. If the option @samp{-t} (or, equivalently, the +@samp{%struct-type} declaration) is also given, the first field of the +user-defined struct must be of type @samp{int}, not @samp{char *}, because +it will contain offsets into the string pool instead of actual strings. +To convert such an offset to a string, you can use the expression +@samp{stringpool + @var{o}}, where @var{o} is the offset. The string pool +name can be changed through the option @samp{--string-pool-name}. + +@item -Q @var{string-pool-name} +@itemx --string-pool-name=@var{string-pool-name} +Allows you to specify the name of the generated string pool created by +option @samp{-P}. The default name is @samp{stringpool}. This option +permits the use of two hash tables in the same file, with @samp{-P} and +even when the option @samp{-G} (or, equivalently, the @samp{%global-table} +declaration) is given. + +@item --null-strings +Use NULL strings instead of empty strings for empty keyword table entries. +This reduces the startup time of programs using a shared library containing +the generated code (but not as much as option @samp{-P}), at the expense +of one more test-and-branch instruction at run time. + +@item -W @var{hash-table-array-name} +@itemx --word-array-name=@var{hash-table-array-name} +@cindex Array name +Allows you to specify the name for the generated array containing the +hash table. Default name is @samp{wordlist}. This option permits the +use of two hash tables in the same file, even when the option @samp{-G} +(or, equivalently, the @samp{%global-table} declaration) is given. + +@itemx --length-table-name=@var{length-table-array-name} +@cindex Array name +Allows you to specify the name for the generated array containing the +length table. Default name is @samp{lengthtable}. This option permits the +use of two length tables in the same file, even when the option @samp{-G} +(or, equivalently, the @samp{%global-table} declaration) is given. + +@item -S @var{total-switch-statements} +@itemx --switch=@var{total-switch-statements} +@cindex @code{switch} +Causes the generated C code to use a @code{switch} statement scheme, +rather than an array lookup table. This can lead to a reduction in both +time and space requirements for some input files. The argument to this +option determines how many @code{switch} statements are generated. A +value of 1 generates 1 @code{switch} containing all the elements, a +value of 2 generates 2 tables with 1/2 the elements in each +@code{switch}, etc. This is useful since many C compilers cannot +correctly generate code for large @code{switch} statements. This option +was inspired in part by Keith Bostic's original C program. + +@item -T +@itemx --omit-struct-type +Prevents the transfer of the type declaration to the output file. Use +this option if the type is already defined elsewhere. + +@item -p +This option is supported for compatibility with previous releases of +@code{gperf}. It does not do anything. +@end table + +@node Algorithmic Details, Verbosity, Output Details, Options +@section Options for changing the Algorithms employed by @code{gperf} + +@table @samp +@item -k @var{selected-byte-positions} +@itemx --key-positions=@var{selected-byte-positions} +Allows selection of the byte positions used in the keywords' +hash function. The allowable choices range between 1-255, inclusive. +The positions are separated by commas, e.g., @samp{-k 9,4,13,14}; +ranges may be used, e.g., @samp{-k 2-7}; and positions may occur +in any order. Furthermore, the wildcard '*' causes the generated +hash function to consider @strong{all} byte positions in each keyword, +whereas '$' instructs the hash function to use the ``final byte'' +of a keyword (this is the only way to use a byte position greater than +255, incidentally). + +For instance, the option @samp{-k 1,2,4,6-10,'$'} generates a hash +function that considers positions 1,2,4,6,7,8,9,10, plus the last +byte in each keyword (which may be at a different position for each +keyword, obviously). Keywords +with length less than the indicated byte positions work properly, since +selected byte positions exceeding the keyword length are simply not +referenced in the hash function. + +This option is not normally needed since version 2.8 of @code{gperf}; +the default byte positions are computed depending on the keyword set, +through a search that minimizes the number of byte positions. + +@item -D +@itemx --duplicates +@cindex Duplicates +Handle keywords whose selected byte sets hash to duplicate values. +Duplicate hash values can occur if a set of keywords has the same names, but +possesses different attributes, or if the selected byte positions are not well +chosen. With the -D option @code{gperf} treats all these keywords as +part of an equivalence class and generates a perfect hash function with +multiple comparisons for duplicate keywords. It is up to you to completely +disambiguate the keywords by modifying the generated C code. However, +@code{gperf} helps you out by organizing the output. + +Using this option usually means that the generated hash function is no +longer perfect. On the other hand, it permits @code{gperf} to work on +keyword sets that it otherwise could not handle. + +@item -m @var{iterations} +@itemx --multiple-iterations=@var{iterations} +Perform multiple choices of the @samp{-i} and @samp{-j} values, and +choose the best results. This increases the running time by a factor of +@var{iterations} but does a good job minimizing the generated table size. + +@item -i @var{initial-value} +@itemx --initial-asso=@var{initial-value} +Provides an initial @var{value} for the associate values array. Default +is 0. Increasing the initial value helps inflate the final table size, +possibly leading to more time efficient keyword lookups. Note that this +option is not particularly useful when @samp{-S} (or, equivalently, +@samp{%switch}) is used. Also, +@samp{-i} is overridden when the @samp{-r} option is used. + +@item -j @var{jump-value} +@itemx --jump=@var{jump-value} +@cindex Jump value +Affects the ``jump value'', i.e., how far to advance the associated +byte value upon collisions. @var{Jump-value} is rounded up to an +odd number, the default is 5. If the @var{jump-value} is 0 @code{gperf} +jumps by random amounts. + +@item -n +@itemx --no-strlen +Instructs the generator not to include the length of a keyword when +computing its hash value. This may save a few assembly instructions in +the generated lookup table. + +@item -r +@itemx --random +Utilizes randomness to initialize the associated values table. This +frequently generates solutions faster than using deterministic +initialization (which starts all associated values at 0). Furthermore, +using the randomization option generally increases the size of the +table. + +@item -s @var{size-multiple} +@itemx --size-multiple=@var{size-multiple} +Affects the size of the generated hash table. The numeric argument for +this option indicates ``how many times larger or smaller'' the maximum +associated value range should be, in relationship to the number of keywords. +It can be written as an integer, a floating-point number or a fraction. +For example, a value of 3 means ``allow the maximum associated value to be +about 3 times larger than the number of input keywords''. +Conversely, a value of 1/3 means ``allow the maximum associated value to +be about 3 times smaller than the number of input keywords''. Values +smaller than 1 are useful for limiting the overall size of the generated hash +table, though the option @samp{-m} is better at this purpose. + +If `generate switch' option @samp{-S} (or, equivalently, @samp{%switch}) is +@emph{not} enabled, the maximum +associated value influences the static array table size, and a larger +table should decrease the time required for an unsuccessful search, at +the expense of extra table space. + +The default value is 1, thus the default maximum associated value about +the same size as the number of keywords (for efficiency, the maximum +associated value is always rounded up to a power of 2). The actual +table size may vary somewhat, since this technique is essentially a +heuristic. +@end table + +@node Verbosity, , Algorithmic Details, Options +@section Informative Output + +@table @samp +@item -h +@itemx --help +Prints a short summary on the meaning of each program option. Aborts +further program execution. + +@item -v +@itemx --version +Prints out the current version number. + +@item -d +@itemx --debug +Enables the debugging option. This produces verbose diagnostics to +``standard error'' when @code{gperf} is executing. It is useful both for +maintaining the program and for determining whether a given set of +options is actually speeding up the search for a solution. Some useful +information is dumped at the end of the program when the @samp{-d} +option is enabled. +@end table + +@node Bugs, Projects, Options, Top +@chapter Known Bugs and Limitations with @code{gperf} + +The following are some limitations with the current release of +@code{gperf}: + +@itemize @bullet +@item +The @code{gperf} utility is tuned to execute quickly, and works quickly +for small to medium size data sets (around 1000 keywords). It is +extremely useful for maintaining perfect hash functions for compiler +keyword sets. Several recent enhancements now enable @code{gperf} to +work efficiently on much larger keyword sets (over 15,000 keywords). +When processing large keyword sets it helps greatly to have over 8 megs +of RAM. + +@item +The size of the generate static keyword array can get @emph{extremely} +large if the input keyword file is large or if the keywords are quite +similar. This tends to slow down the compilation of the generated C +code, and @emph{greatly} inflates the object code size. If this +situation occurs, consider using the @samp{-S} option to reduce data +size, potentially increasing keyword recognition time a negligible +amount. Since many C compilers cannot correctly generate code for +large switch statements it is important to qualify the @var{-S} option +with an appropriate numerical argument that controls the number of +switch statements generated. + +@item +The maximum number of selected byte positions has an +arbitrary limit of 255. This restriction should be removed, and if +anyone considers this a problem write me and let me know so I can remove +the constraint. +@end itemize + +@node Projects, Bibliography, Bugs, Top +@chapter Things Still Left to Do + +It should be ``relatively'' easy to replace the current perfect hash +function algorithm with a more exhaustive approach; the perfect hash +module is essential independent from other program modules. Additional +worthwhile improvements include: + +@itemize @bullet +@item +Another useful extension involves modifying the program to generate +``minimal'' perfect hash functions (under certain circumstances, the +current version can be rather extravagant in the generated table size). +This is mostly of theoretical interest, since a sparse table +often produces faster lookups, and use of the @samp{-S} @code{switch} +option can minimize the data size, at the expense of slightly longer +lookups (note that the gcc compiler generally produces good code for +@code{switch} statements, reducing the need for more complex schemes). + +@item +In addition to improving the algorithm, it would also be useful to +generate an Ada package as the code output, in addition to the current +C and C++ routines. +@end itemize + +@page + +@node Bibliography, Concept Index, Projects, Top +@chapter Bibliography + +[1] Chang, C.C.: @i{A Scheme for Constructing Ordered Minimal Perfect +Hashing Functions} Information Sciences 39(1986), 187-195. + +[2] Cichelli, Richard J. @i{Author's Response to ``On Cichelli's Minimal Perfect Hash +Functions Method''} Communications of the ACM, 23, 12(December 1980), 729. + +[3] Cichelli, Richard J. @i{Minimal Perfect Hash Functions Made Simple} +Communications of the ACM, 23, 1(January 1980), 17-19. + +[4] Cook, C. R. and Oldehoeft, R.R. @i{A Letter Oriented Minimal +Perfect Hashing Function} SIGPLAN Notices, 17, 9(September 1982), 18-27. + +[5] Cormack, G. V. and Horspool, R. N. S. and Kaiserwerth, M. +@i{Practical Perfect Hashing} Computer Journal, 28, 1(January 1985), 54-58. + +[6] Jaeschke, G. @i{Reciprocal Hashing: A Method for Generating Minimal +Perfect Hashing Functions} Communications of the ACM, 24, 12(December +1981), 829-833. + +[7] Jaeschke, G. and Osterburg, G. @i{On Cichelli's Minimal Perfect +Hash Functions Method} Communications of the ACM, 23, 12(December 1980), +728-729. + +[8] Sager, Thomas J. @i{A Polynomial Time Generator for Minimal Perfect +Hash Functions} Communications of the ACM, 28, 5(December 1985), 523-532 + +[9] Schmidt, Douglas C. @i{GPERF: A Perfect Hash Function Generator} +Second USENIX C++ Conference Proceedings, April 1990. + +[10] Schmidt, Douglas C. @i{GPERF: A Perfect Hash Function Generator} +C++ Report, SIGS 10 10 (November/December 1998). + +[11] Sebesta, R.W. and Taylor, M.A. @i{Minimal Perfect Hash Functions +for Reserved Word Lists} SIGPLAN Notices, 20, 12(September 1985), 47-53. + +[12] Sprugnoli, R. @i{Perfect Hashing Functions: A Single Probe +Retrieving Method for Static Sets} Communications of the ACM, 20 +11(November 1977), 841-850. + +[13] Stallman, Richard M. @i{Using and Porting GNU CC} Free Software Foundation, +1988. + +[14] Stroustrup, Bjarne @i{The C++ Programming Language.} Addison-Wesley, 1986. + +[15] Tiemann, Michael D. @i{User's Guide to GNU C++} Free Software +Foundation, 1989. + +@node Concept Index, , Bibliography, Top +@unnumbered Concept Index + +@printindex cp + +@contents +@bye diff --git a/contrib/gperf/doc/gpl.texinfo b/contrib/gperf/doc/gpl.texinfo new file mode 100644 index 000000000000..be2dc253c470 --- /dev/null +++ b/contrib/gperf/doc/gpl.texinfo @@ -0,0 +1,398 @@ +@c This GPL is meant to be included from other files. +@c To format a standalone GPL, use license.texi. + +@center Version 2, June 1991 + +@display +Copyright @copyright{} 1989, 1991 Free Software Foundation, Inc., +59 Temple Place, Suite 330, Boston, MA 02111-1307, USA. + +Everyone is permitted to copy and distribute verbatim copies +of this license document, but changing it is not allowed. +@end display + +@unnumberedsec Preamble + + The licenses for most software are designed to take away your +freedom to share and change it. By contrast, the GNU General Public +License is intended to guarantee your freedom to share and change free +software---to make sure the software is free for all its users. This +General Public License applies to most of the Free Software +Foundation's software and to any other program whose authors commit to +using it. (Some other Free Software Foundation software is covered by +the GNU Library General Public License instead.) You can apply it to +your programs, too. + + When we speak of free software, we are referring to freedom, not +price. Our General Public Licenses are designed to make sure that you +have the freedom to distribute copies of free software (and charge for +this service if you wish), that you receive source code or can get it +if you want it, that you can change the software or use pieces of it +in new free programs; and that you know you can do these things. + + To protect your rights, we need to make restrictions that forbid +anyone to deny you these rights or to ask you to surrender the rights. +These restrictions translate to certain responsibilities for you if you +distribute copies of the software, or if you modify it. + + For example, if you distribute copies of such a program, whether +gratis or for a fee, you must give the recipients all the rights that +you have. You must make sure that they, too, receive or can get the +source code. And you must show them these terms so they know their +rights. + + We protect your rights with two steps: (1) copyright the software, and +(2) offer you this license which gives you legal permission to copy, +distribute and/or modify the software. + + Also, for each author's protection and ours, we want to make certain +that everyone understands that there is no warranty for this free +software. If the software is modified by someone else and passed on, we +want its recipients to know that what they have is not the original, so +that any problems introduced by others will not reflect on the original +authors' reputations. + + Finally, any free program is threatened constantly by software +patents. We wish to avoid the danger that redistributors of a free +program will individually obtain patent licenses, in effect making the +program proprietary. To prevent this, we have made it clear that any +patent must be licensed for everyone's free use or not licensed at all. + + The precise terms and conditions for copying, distribution and +modification follow. + +@iftex +@unnumberedsec TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION +@end iftex +@ifinfo +@center TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION +@end ifinfo + +@enumerate 0 +@item +This License applies to any program or other work which contains +a notice placed by the copyright holder saying it may be distributed +under the terms of this General Public License. The ``Program'', below, +refers to any such program or work, and a ``work based on the Program'' +means either the Program or any derivative work under copyright law: +that is to say, a work containing the Program or a portion of it, +either verbatim or with modifications and/or translated into another +language. (Hereinafter, translation is included without limitation in +the term ``modification''.) Each licensee is addressed as ``you''. + +Activities other than copying, distribution and modification are not +covered by this License; they are outside its scope. The act of +running the Program is not restricted, and the output from the Program +is covered only if its contents constitute a work based on the +Program (independent of having been made by running the Program). +Whether that is true depends on what the Program does. + +@item +You may copy and distribute verbatim copies of the Program's +source code as you receive it, in any medium, provided that you +conspicuously and appropriately publish on each copy an appropriate +copyright notice and disclaimer of warranty; keep intact all the +notices that refer to this License and to the absence of any warranty; +and give any other recipients of the Program a copy of this License +along with the Program. + +You may charge a fee for the physical act of transferring a copy, and +you may at your option offer warranty protection in exchange for a fee. + +@item +You may modify your copy or copies of the Program or any portion +of it, thus forming a work based on the Program, and copy and +distribute such modifications or work under the terms of Section 1 +above, provided that you also meet all of these conditions: + +@enumerate a +@item +You must cause the modified files to carry prominent notices +stating that you changed the files and the date of any change. + +@item +You must cause any work that you distribute or publish, that in +whole or in part contains or is derived from the Program or any +part thereof, to be licensed as a whole at no charge to all third +parties under the terms of this License. + +@item +If the modified program normally reads commands interactively +when run, you must cause it, when started running for such +interactive use in the most ordinary way, to print or display an +announcement including an appropriate copyright notice and a +notice that there is no warranty (or else, saying that you provide +a warranty) and that users may redistribute the program under +these conditions, and telling the user how to view a copy of this +License. (Exception: if the Program itself is interactive but +does not normally print such an announcement, your work based on +the Program is not required to print an announcement.) +@end enumerate + +These requirements apply to the modified work as a whole. If +identifiable sections of that work are not derived from the Program, +and can be reasonably considered independent and separate works in +themselves, then this License, and its terms, do not apply to those +sections when you distribute them as separate works. But when you +distribute the same sections as part of a whole which is a work based +on the Program, the distribution of the whole must be on the terms of +this License, whose permissions for other licensees extend to the +entire whole, and thus to each and every part regardless of who wrote it. + +Thus, it is not the intent of this section to claim rights or contest +your rights to work written entirely by you; rather, the intent is to +exercise the right to control the distribution of derivative or +collective works based on the Program. + +In addition, mere aggregation of another work not based on the Program +with the Program (or with a work based on the Program) on a volume of +a storage or distribution medium does not bring the other work under +the scope of this License. + +@item +You may copy and distribute the Program (or a work based on it, +under Section 2) in object code or executable form under the terms of +Sections 1 and 2 above provided that you also do one of the following: + +@enumerate a +@item +Accompany it with the complete corresponding machine-readable +source code, which must be distributed under the terms of Sections +1 and 2 above on a medium customarily used for software interchange; or, + +@item +Accompany it with a written offer, valid for at least three +years, to give any third party, for a charge no more than your +cost of physically performing source distribution, a complete +machine-readable copy of the corresponding source code, to be +distributed under the terms of Sections 1 and 2 above on a medium +customarily used for software interchange; or, + +@item +Accompany it with the information you received as to the offer +to distribute corresponding source code. (This alternative is +allowed only for noncommercial distribution and only if you +received the program in object code or executable form with such +an offer, in accord with Subsection b above.) +@end enumerate + +The source code for a work means the preferred form of the work for +making modifications to it. For an executable work, complete source +code means all the source code for all modules it contains, plus any +associated interface definition files, plus the scripts used to +control compilation and installation of the executable. However, as a +special exception, the source code distributed need not include +anything that is normally distributed (in either source or binary +form) with the major components (compiler, kernel, and so on) of the +operating system on which the executable runs, unless that component +itself accompanies the executable. + +If distribution of executable or object code is made by offering +access to copy from a designated place, then offering equivalent +access to copy the source code from the same place counts as +distribution of the source code, even though third parties are not +compelled to copy the source along with the object code. + +@item +You may not copy, modify, sublicense, or distribute the Program +except as expressly provided under this License. Any attempt +otherwise to copy, modify, sublicense or distribute the Program is +void, and will automatically terminate your rights under this License. +However, parties who have received copies, or rights, from you under +this License will not have their licenses terminated so long as such +parties remain in full compliance. + +@item +You are not required to accept this License, since you have not +signed it. However, nothing else grants you permission to modify or +distribute the Program or its derivative works. These actions are +prohibited by law if you do not accept this License. Therefore, by +modifying or distributing the Program (or any work based on the +Program), you indicate your acceptance of this License to do so, and +all its terms and conditions for copying, distributing or modifying +the Program or works based on it. + +@item +Each time you redistribute the Program (or any work based on the +Program), the recipient automatically receives a license from the +original licensor to copy, distribute or modify the Program subject to +these terms and conditions. You may not impose any further +restrictions on the recipients' exercise of the rights granted herein. +You are not responsible for enforcing compliance by third parties to +this License. + +@item +If, as a consequence of a court judgment or allegation of patent +infringement or for any other reason (not limited to patent issues), +conditions are imposed on you (whether by court order, agreement or +otherwise) that contradict the conditions of this License, they do not +excuse you from the conditions of this License. If you cannot +distribute so as to satisfy simultaneously your obligations under this +License and any other pertinent obligations, then as a consequence you +may not distribute the Program at all. For example, if a patent +license would not permit royalty-free redistribution of the Program by +all those who receive copies directly or indirectly through you, then +the only way you could satisfy both it and this License would be to +refrain entirely from distribution of the Program. + +If any portion of this section is held invalid or unenforceable under +any particular circumstance, the balance of the section is intended to +apply and the section as a whole is intended to apply in other +circumstances. + +It is not the purpose of this section to induce you to infringe any +patents or other property right claims or to contest validity of any +such claims; this section has the sole purpose of protecting the +integrity of the free software distribution system, which is +implemented by public license practices. Many people have made +generous contributions to the wide range of software distributed +through that system in reliance on consistent application of that +system; it is up to the author/donor to decide if he or she is willing +to distribute software through any other system and a licensee cannot +impose that choice. + +This section is intended to make thoroughly clear what is believed to +be a consequence of the rest of this License. + +@item +If the distribution and/or use of the Program is restricted in +certain countries either by patents or by copyrighted interfaces, the +original copyright holder who places the Program under this License +may add an explicit geographical distribution limitation excluding +those countries, so that distribution is permitted only in or among +countries not thus excluded. In such case, this License incorporates +the limitation as if written in the body of this License. + +@item +The Free Software Foundation may publish revised and/or new versions +of the General Public License from time to time. Such new versions will +be similar in spirit to the present version, but may differ in detail to +address new problems or concerns. + +Each version is given a distinguishing version number. If the Program +specifies a version number of this License which applies to it and ``any +later version'', you have the option of following the terms and conditions +either of that version or of any later version published by the Free +Software Foundation. If the Program does not specify a version number of +this License, you may choose any version ever published by the Free Software +Foundation. + +@item +If you wish to incorporate parts of the Program into other free +programs whose distribution conditions are different, write to the author +to ask for permission. For software which is copyrighted by the Free +Software Foundation, write to the Free Software Foundation; we sometimes +make exceptions for this. Our decision will be guided by the two goals +of preserving the free status of all derivatives of our free software and +of promoting the sharing and reuse of software generally. + +@iftex +@vskip -@baselineskip +@vskip -@baselineskip +@heading NO WARRANTY +@end iftex +@ifinfo +@center NO WARRANTY +@end ifinfo + +@item +BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY +FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN +OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES +PROVIDE THE PROGRAM ``AS IS'' WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED +OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF +MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS +TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE +PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, +REPAIR OR CORRECTION. + +@item +IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING +WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR +REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, +INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING +OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED +TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY +YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER +PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE +POSSIBILITY OF SUCH DAMAGES. +@end enumerate + +@iftex +@heading END OF TERMS AND CONDITIONS +@end iftex +@ifinfo +@center END OF TERMS AND CONDITIONS +@end ifinfo + +@page +@unnumberedsec How to Apply These Terms to Your New Programs + + If you develop a new program, and you want it to be of the greatest +possible use to the public, the best way to achieve this is to make it +free software which everyone can redistribute and change under these terms. + + To do so, attach the following notices to the program. It is safest +to attach them to the start of each source file to most effectively +convey the exclusion of warranty; and each file should have at least +the ``copyright'' line and a pointer to where the full notice is found. + +@smallexample +@var{one line to give the program's name and an idea of what it does.} +Copyright (C) @var{year} @var{name of author} + +This program is free software; you can redistribute it and/or +modify it under the terms of the GNU General Public License +as published by the Free Software Foundation; either version 2 +of the License, or (at your option) any later version. + +This program is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with this program; if not, write to the Free Software +Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA. +@end smallexample + +Also add information on how to contact you by electronic and paper mail. + +If the program is interactive, make it output a short notice like this +when it starts in an interactive mode: + +@smallexample +Gnomovision version 69, Copyright (C) @var{year} @var{name of author} +Gnomovision comes with ABSOLUTELY NO WARRANTY; for details +type `show w'. This is free software, and you are welcome +to redistribute it under certain conditions; type `show c' +for details. +@end smallexample + +The hypothetical commands @samp{show w} and @samp{show c} should show +the appropriate parts of the General Public License. Of course, the +commands you use may be called something other than @samp{show w} and +@samp{show c}; they could even be mouse-clicks or menu items---whatever +suits your program. + +You should also get your employer (if you work as a programmer) or your +school, if any, to sign a ``copyright disclaimer'' for the program, if +necessary. Here is a sample; alter the names: + +@example +@group +Yoyodyne, Inc., hereby disclaims all copyright +interest in the program `Gnomovision' +(which makes passes at compilers) written +by James Hacker. + +@var{signature of Ty Coon}, 1 April 1989 +Ty Coon, President of Vice +@end group +@end example + +This General Public License does not permit incorporating your program into +proprietary programs. If your program is a subroutine library, you may +consider it more useful to permit linking proprietary applications with the +library. If this is what you want to do, use the GNU Library General +Public License instead of this License.