summaryrefslogtreecommitdiffstats
path: root/man2html/scripts/cgi-aux/man/mansearchhelp.aux
diff options
context:
space:
mode:
Diffstat (limited to 'man2html/scripts/cgi-aux/man/mansearchhelp.aux')
-rw-r--r--man2html/scripts/cgi-aux/man/mansearchhelp.aux295
1 files changed, 295 insertions, 0 deletions
diff --git a/man2html/scripts/cgi-aux/man/mansearchhelp.aux b/man2html/scripts/cgi-aux/man/mansearchhelp.aux
new file mode 100644
index 0000000..200b509
--- /dev/null
+++ b/man2html/scripts/cgi-aux/man/mansearchhelp.aux
@@ -0,0 +1,295 @@
+Content-type: text/html
+
+<HTML>
+<HEAD>
+<TITLE>Manual Pages - Search Help</TITLE>
+<!-- Changed by: Michael Hamilton, 6-Apr-1996 -->
+<!-- Note: this is not html, but requires preprocessing -->
+</HEAD>
+<BODY>
+<H1>Manual Pages - Search Help</H1>
+
+<A HREF="%cg/mansearch">Perform another search</A>
+<BR>
+<A HREF="%cg/man2html">Return to Main Contents</A>
+<P>
+<HR>
+<P>
+The full text index uses the <I>Glimpse</I> text indexing system.
+The
+<A HREF="%cg/man2html?1+glimpse">glimpse(1)</A>
+manual page documents glimpse in full. This summary documents those
+features of glimpse that are valid when searching through the manual pages.
+<P>
+<HR>
+
+<H2>Search Options</H2>
+
+The following search options must be at the start of the search string.
+
+<DL COMPACT>
+
+<DT><B>-</B> <I>#</I>
+<DD>
+<I>#</I> is an integer between 1 and 8
+specifying the maximum number of errors
+permitted in finding the approximate matches (the default is zero).
+Generally, each insertion, deletion, or substitution counts as one error.
+Since the index stores only lower case characters, errors of
+substituting upper case with lower case may be missed.
+
+<DT><B>-B</B>
+<DD>
+Best match mode. (Warning: -B sometimes misses matches. It is safer
+to specify the number of errors explicitly.)
+When -B is specified and no exact matches are found, glimpse
+will continue to search until the closest matches (i.e., the ones
+with minimum number of errors)
+are found.
+In general, -B may be slower than -#, but not by very much.
+Since the index stores only lower case characters, errors of
+substituting upper case with lower case may be missed.
+
+<DT><B>-L <I>x</I> | <I>x</I>:<I>y</I> | <I>x</I>:<I>y</I>:<I>z</I></B>
+<DD>
+A non-zero value of <I>x</I> limits the number of matches
+that will be shown.
+A non-zero value of <I>y</I> limits the number of man pages
+that will be shown.
+A non-zero valye of <I>z</I> will only show pages that have
+less that z matches.
+For example, -L 0:10 will output all matches for the first 10 files that
+contain a match.
+
+<DT><B>-F</B> <I>pattern</I>
+<DD>
+The -F option provides a pattern that restricts the search results to
+those filenames that match the pattern.
+or example, <I>-F 8</I> effectively restricts matches to section 8.
+
+<DT><B>-w</B>
+<DD>
+Search for the pattern as a word - i.e., surrounded by non-alphanumeric
+characters. For example,
+<I>-w -1 car</I> will match cars, but not characters and not
+car10.
+The non-alphanumeric <I>must</I>
+surround the match; they cannot be counted as errors.
+This option does not work with regular expressions.
+
+<DT><B>-W</B>
+<DD>
+The default for Boolean AND queries is that they cover one record
+(the default for a record is one line) at a time.
+For example, glimpse 'good;bad' will output all lines containing
+both 'good' and 'bad'.
+The -W option changes the scope of Booleans to be the whole file.
+Within a file glimpse will output all matches to any of the patterns.
+So, glimpse -W 'good;bad' will output all lines containing 'good'
+<I>or</I> 'bad', but only in files that contain both patterns.
+
+<DT><B>-k</B>
+<DD>
+No symbol in the pattern is treated as a meta character.
+For example, <I>-k a(b|c)*d</I> will find
+the occurrences of a(b|c)*d whereas <I>a(b|c)*d</I>
+will find substrings that match the regular expression 'a(b|c)*d'.
+(The only exception is ^ at the beginning of the pattern and $ at the
+end of the pattern, which are still interpreted in the usual way.
+Use \^ or \$ if you need them verbatim.)
+
+</DL>
+
+<P>
+<HR>
+
+<H2>Patterns</H2>
+
+<I>Glimpse</I>
+supports a large variety of patterns, including simple
+strings, strings with classes of characters, sets of strings,
+wild cards, and regular expressions (see <A HREF="#limitations">LIMITATIONS</A>).
+
+<DL COMPACT>
+<DT><B>Strings </B><DD>
+Strings are any sequence of characters, including the special symbols
+`^' for beginning of line and `$' for end of line.
+The following special characters (
+`<B>$</B>',
+
+`^<B>',</B>
+
+`<B>*</B>',
+
+`<B>[</B>'<B>,</B>
+
+`<B>^</B>',
+
+`<B>|</B>',
+
+`<B>(</B>',
+
+`<B>)</B>',
+
+`<B>!</B>',
+
+and
+`<B>\</B>'
+
+)
+as well as the following meta characters special to glimpse (and agrep):
+`<B>;</B>',
+
+`<B>,</B>',
+
+`<B>#</B>',
+
+`<B>&lt;</B>',
+
+`<B>&gt;</B>',
+
+`<B>-</B>',
+
+and
+`<B>.</B>',
+
+should be preceded by `\' if they are to be matched as regular
+characters. For example, \^abc\\ corresponds to the string ^abc\,
+whereas ^abc corresponds to the string abc at the beginning of a
+line.
+<DT><B>Classes of characters</B><DD>
+a list of characters inside [] (in order) corresponds to any character
+from the list. For example, [a-ho-z] is any character between a and h
+or between o and z. The symbol `^' inside [] complements the list.
+For example, [^i-n] denote any character in the character set except
+character 'i' to 'n'.
+The symbol `^' thus has two meanings, but this is consistent with
+egrep.
+The symbol `.' (don't care) stands for any symbol (except for the
+newline symbol).
+<DT><B>Boolean operations</B><DD>
+<B>Glimpse </B>
+
+supports an `AND' operation denoted by the symbol `;'
+an `OR' operation denoted by the symbol `,',
+or any combination.
+For example,
+<I>glimpse 'pizza;cheeseburger'</I> will output all lines containing
+both patterns.
+<I>glimpse -F 'gnu;\.c$' 'define;DEFAULT'</I>
+will output all lines containing both 'define' and 'DEFAULT'
+(anywhere in the line, not necessarily in order) in
+files whose name contains 'gnu' and ends with .c.
+<I>glimpse '{political,computer};science'</I> will match 'political science'
+or 'science of computers'.
+<DT><B>Wild cards</B><DD>
+The symbol '#' is used to denote a sequence
+of any number (including 0)
+of arbitrary characters (see <A HREF="#limitations">LIMITATIONS</A>).
+The symbol # is equivalent to .* in egrep.
+In fact, .* will work too, because it is a valid regular expression
+(see below), but unless this is part of an actual regular expression,
+# will work faster.
+(Currently glimpse is experiencing some problems with #.)
+<DT><B>Combination of exact and approximate matching</B><DD>
+Any pattern inside angle brackets &lt;&gt; must match the text exactly even
+if the match is with errors. For example, &lt;mathemat&gt;ics matches
+mathematical with one error (replacing the last s with an a), but
+mathe&lt;matics&gt; does not match mathematical no matter how many errors are
+allowed.
+(This option is buggy at the moment.)
+<DT><B>Regular expressions</B><DD>
+Since the index is word based, a regular expression must match
+words that appear in the index for glimpse to find it.
+Glimpse first strips the regular expression from all non-alphabetic
+characters, and searches the index for all remaining words.
+It then applies the regular expression matching algorithm to the
+files found in the index.
+For example, <I>glimpse</I> 'abc.*xyz' will search the index
+for all files that contain both 'abc' and 'xyz', and then
+search directly for 'abc.*xyz' in those files.
+(If you use glimpse -w 'abc.*xyz', then 'abcxyz' will not be found,
+because glimpse
+will think that abc and xyz need to be matches to whole words.)
+The syntax of regular expressions in <B>glimpse</B> is in general the same as
+that for <B>agrep</B>. The union operation `|', Kleene closure `*',
+and parentheses () are all supported.
+Currently '+' is not supported.
+Regular expressions are currently limited to approximately 30
+characters (generally excluding meta characters). Some options
+(-d, -w, -t, -x, -D, -I, -S) do not
+currently work with regular expressions.
+The maximal number of errors for regular expressions that use '*'
+or '|' is 4. (See <A HREF="#limitations">LIMITATIONS</A>.)
+
+</DL>
+
+<HR>
+
+<H2><A NAME="limitations">Limitations</A></H2>
+
+The index of glimpse is word based. A pattern that contains more than
+one word cannot be found in the index. The way glimpse overcomes this
+weakness is by splitting any multi-word pattern into its set of words
+and looking for all of them in the index.
+For example, <B>glimpse 'linear programming'</B> will first consult the index
+to find all files containing both <I>linear</I> and <I>programming</I>,
+and then apply agrep to find the combined pattern.
+This is usually an effective solution, but it can be slow for
+cases where both words are very common, but their combination is not.
+<P>
+
+As was mentioned in the section on PATTERNS above, some characters
+serve as meta characters for glimpse and need to be
+preceded by '\' to search for them. The most common
+examples are the characters '.' (which stands for a wild card),
+and '*' (the Kleene closure).
+So, &quot;glimpse ab.de&quot; will match abcde, but &quot;glimpse ab\.de&quot;
+will not, and &quot;glimpse ab*de&quot; will not match ab*de, but
+&quot;glimpse ab\*de&quot; will.
+The meta character - is translated automatically to a hyphen
+unless it appears between [] (in which case it denotes a range of
+characters).
+<P>
+
+The index of glimpse stores all patterns in lower case.
+When glimpse searches the index it first converts
+all patterns to lower case, finds the appropriate files,
+and then searches the actual files using the original
+patterns.
+So, for example, <I>glimpse ABCXYZ</I> will first find all
+files containing abcxyz in any combination of lower and upper
+cases, and then searches these files directly, so only the
+right cases will be found.
+One problem with this approach is discovering misspellings
+that are caused by wrong cases.
+For example, <I>glimpse -B abcXYZ</I> will first search the
+index for the best match to abcxyz (because the pattern is
+converted to lower case); it will find that there are matches
+with no errors, and will go to those files to search them
+directly, this time with the original upper cases.
+If the closest match is, say AbcXYZ, glimpse may miss it,
+because it doesn't expect an error.
+Another problem is speed. If you search for &quot;ATT&quot;, it will look
+at the index for &quot;att&quot;. Unless you use -w to match the whole word,
+glimpse may have to search all files containing, for example, &quot;Seattle&quot;
+which has &quot;att&quot; in it.
+<P>
+
+There is no size limit for simple patterns and simple patterns
+within Boolean expressions.
+More complicated patterns, such as regular expressions,
+are currently limited to approximately 30 characters.
+Lines are limited to 1024 characters.
+Records are limited to 48K, and may be truncated if they are larger
+than that.
+The limit of record length can be
+changed by modifying the parameter Max_record in agrep.h.
+<P>
+
+Glimpseindex does not index words of size &gt; 64.
+<A NAME="lbAQ">&nbsp;</A>
+
+<HR>
+</BODY>
+</HTML>