This search facility is built upon the
glimpse indexing facility.
The index is rebuilt nightly. Therefore,
articles posted to HyperNews in the past 24 hours may not yet be searchable.
All searches are case insensitive.
Some articles (very few) have their body text set to an external URL. For these
articles, only the headers are indexed.
Results are presented sorted by forum. Because of this sort, all matches must be
gathered before anything is written back to the web page. Therefore, searches
with many matches will take a long time to show any results.
There is no "relevancy" rating as in some commercial web search engines.
Simple
Articles which contain all the words listed in the search specification are
found. Non-alphanumeric characters are ignored except spaces, semi-colons, commas
and curly braces. Spaces and semi-colons implement a boolean AND and commas
implement a boolean OR. The curly braces can be used to group expressions.
For example, the search spec
"hepix solaris" will list all articles that contain both words,
"hepix,solaris" will list all articles that contain either word, and
"hepix;{solaris,aix}" will list all articles that contain the
word hepix along with either of the words solaris or aix.
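The operator rules above can be illustrated with a small Python sketch. It is not how glimpse evaluates a spec internally; it only mimics the described behaviour under several assumptions: ignored characters are simply dropped, operators outside braces are applied left to right with no precedence, words match as case-insensitive substrings, and the spec is well formed (no validation is done). The function names tokenize and matches are hypothetical.

```python
import re

def tokenize(spec):
    """Split a simple search spec into words and the operators ; , { }."""
    # Assumption: every other non-alphanumeric character is dropped.
    spec = re.sub(r"[^0-9A-Za-z;,{} ]", "", spec.lower())
    raw = re.findall(r"[0-9a-z]+|[;,{}]", spec)
    tokens = []
    for tok in raw:
        starts_operand = tok.isalnum() or tok == "{"
        ends_operand = bool(tokens) and (tokens[-1].isalnum() or tokens[-1] == "}")
        # A space between two operands acts as an implicit AND.
        if starts_operand and ends_operand:
            tokens.append(";")
        tokens.append(tok)
    return tokens

def matches(spec, text):
    """Return True if text satisfies the spec (illustrative only)."""
    lowered = text.lower()
    tokens = tokenize(spec)
    pos = 0

    def operand():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "{":
            value = expression()
            pos += 1  # skip the closing brace
            return value
        return tok in lowered  # substring match on a single word

    def expression():
        nonlocal pos
        value = operand()
        # Assumption: operators are applied left to right, no precedence.
        while pos < len(tokens) and tokens[pos] in ";,":
            op = tokens[pos]
            pos += 1
            right = operand()
            value = (value and right) if op == ";" else (value or right)
        return value

    return expression()

print(matches("hepix;{solaris,aix}", "HEPiX talk about AIX tuning"))
```

With these assumptions, an article mentioning only hepix and linux would not satisfy "hepix;{solaris,aix}", while one mentioning hepix and aix would.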
Complex
In a complex search, the index is first searched for candidate files, and then
the actual files are searched to verify matches. Obviously, this search is
somewhat slower than a simple search. Many additional features are available
in the search using special non-alphanumeric characters. A space is treated as
a literal space in a complex search rather than as a boolean AND. Details are
available in the PATTERNS section of the glimpse man page.
Header Only
Searches only article headers. This includes Name (full name of author),
Title, Date and From (nickname of author). It is probably best
to do a Header Only search first when searching for very common patterns.
For the most part, a Header Only search is effectively a search on the title.
However, to truly search only the title for the word example, use a complex
search with the pattern title;example and turn on Same Line Mode.
Header and Body
Searches both the header and body of articles for the given pattern.
Do not match words in the search specification to substrings of larger words.
For instance, hep will not match hepix when this option is turned on.
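The difference between substring and whole-word matching can be shown with a short Python sketch. It uses regular-expression word boundaries as a stand-in; this is an assumption for illustration and not necessarily how glimpse decides where a word begins and ends. The function name contains is hypothetical.

```python
import re

def contains(word, text, whole_word=False):
    """Case-insensitive search for word in text, optionally as a whole word."""
    pattern = re.escape(word)
    if whole_word:
        # \b matches at the edge of a run of letters, digits, or underscores.
        pattern = r"\b" + pattern + r"\b"
    return re.search(pattern, text, re.IGNORECASE) is not None

print(contains("hep", "HEPiX news"))                   # substring match
print(contains("hep", "HEPiX news", whole_word=True))  # whole-word match
```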
Fuzzy Match Mode
Allows up to two errors in the match. Generally, each insertion, deletion,
or substitution counts as one error. Two errors is enough to make HP-UX
match hepix.
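The error count can be illustrated with a standard edit-distance (Levenshtein) calculation in Python. This is a textbook dynamic-programming version, not the approximate-matching algorithm glimpse itself uses, and it compares two whole words rather than searching inside a file. It also assumes the input is lower-cased and that the hyphen in HP-UX is dropped before comparison. The function name edit_distance is hypothetical.

```python
def edit_distance(a, b):
    """Minimum number of insertions, deletions, and substitutions turning a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]

# Assumption: "HP-UX" is compared as "hpux" (lower-cased, hyphen dropped).
print(edit_distance("hpux", "hepix"))
```

Under those assumptions the two edits are inserting an e after the h and substituting i for u.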
Same Line Mode
Makes the boolean AND operation (i.e. ';') work on a line by
line basis rather than file by file. Note that, because most HTML is written
with no line breaks inside a paragraph, this option effectively changes the basis to
paragraph by paragraph. This option works for complex searches only.
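The effect of Same Line Mode on an AND search can be sketched in Python as follows. This is only an illustration of the file-by-file versus line-by-line distinction, assuming plain case-insensitive substring matching; the function name and_match and the sample article text are hypothetical.

```python
def and_match(words, text, same_line=False):
    """Return True if all words occur in the same unit of text.

    The unit is the whole text by default, or a single line when
    same_line is True.
    """
    units = text.splitlines() if same_line else [text]
    return any(all(w.lower() in unit.lower() for w in words) for unit in units)

article = "Title: weekly report\nHere is an example of the output."

print(and_match(["title", "example"], article))                  # whole file
print(and_match(["title", "example"], article, same_line=True))  # per line
```

In this sample, both words occur somewhere in the article but never on the same line, so only the file-by-file search reports a match.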