Diffstat (limited to 'debian/htdig/htdig-3.2.0b6/ChangeLog')
-rw-r--r-- debian/htdig/htdig-3.2.0b6/ChangeLog 8763
1 files changed, 8763 insertions, 0 deletions
diff --git a/debian/htdig/htdig-3.2.0b6/ChangeLog b/debian/htdig/htdig-3.2.0b6/ChangeLog
new file mode 100644
index 00000000..b7615dd4
--- /dev/null
+++ b/debian/htdig/htdig-3.2.0b6/ChangeLog
@@ -0,0 +1,8763 @@
+Mon Jun 14 10:08:01 CEST 2004 Gabriele Bartolini <angusgb@users.sourceforge.net>
+
+ * Tagged release htdig-3-2-0b6
+
+Sun 13 Jun 2004 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * db/os_abs.c, (db/os_abs.c.win32 removed):
+ Re-fix Cygwin bug (#814268, fixed 25 Apr) so that it won't be
+ clobbered by autotools.
+
+Sat 12 Jun 2004 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * htdoc/RELEASE.html: Separated bug fixes from new features
+
+ * htdoc/{htdig,htfuzzy}.html, installdir/{htdig,htfuzzy}.1.in:
+ Added list of database files used
+
+ * htdoc/{htdump,htmerge,htnotify,htpurge,hts_general,htstat,rundig}.html:
+ Hyperlinked COMMON_DIR, BIN_DIR, DATABASE_DIR to attrs.html.
+
+ * htcommon/defaults.cc, htdoc/attrs.html.in:
+ Remove reference to deprecated '-l' option (generate URL log) of htdig.
+
+Fri Jun 11 11:48:40 2004 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htsearch/parser.cc (phrase): Applied Lachlan's patch to prevent endless
+ loop when boolean keywords appear in a phrase in boolean match method.
+
+Fri Jun 11 11:26:56 2004 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * db/hash.c (CDB___ham_open): Applied Red Hat's h_hash patch, to ensure
+ that the hash function is always set to something valid.
+
+Fri Jun 11 10:53:49 2004 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * installdir/HtFileType: Added -f to rm command.
+
+ * htsearch/parser.cc (perform_or): Added missing & in if clause.
+
+ * contrib/htdig-3.2.0.spec: Updated for 3.2.0b6.
+
+ * installdir/Makefile.{am,in}: Don't stick $(DESTDIR) in HtFileType.
+
+Thu Jun 10 16:39:36 CEST 2004 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * htcommon/conf_(lexer.lxx,parser.yxx): applied Gilles' patch (April 22)
+ which features:
+ - improved error handling, gives file name and correct line number,
+ even if using include files
+ - allows space before comment, because otherwise it would just complain
+ about the "#" character and go on to parse the text after it as a
+ definition
+ - allows config file with an unterminated line at end of file, by
+ pushing an extra newline token to the parser at EOF
+ - parser correctly handles extra newline tokens, by moving this
+ handling out of simple_expression, and into simple_expression_list
+ and block, as simple_expression must return a new ConfigDefaults
+ object and a newline token doesn't cut it (caused segfaults when
+ dealing with fix above)
+ * htcommon/conf_lexer.cxx: Regenerate using flex 2.5.31.
+ * htcommon/conf_parser.cxx: Regenerate using bison 1.875a.
+
+Wed Jun 9 12:32:47 2004 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/HTML.cc (do_tag): Fixed meta date handling fix of June 3 to
+ ensure null byte gets put in by get() call.
+
+Wed 9 Jun 2004 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * contrib/doc2html/doc2html.pl, installdir/mime.types:
+ Add support for OpenOffice.org documents (#957305)
+
+Sat 5 Jun 2004 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * test/t_htdig, test/t_factors: fix tests for non-gnu/linux systems.
+
+Sat 5 Jun 2004 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * htdoc/cf_generate.pl: Hyperlink to simplify finding the defaults of
+ attributes defined in terms of others (e.g.,
+ accents_db->database_base->database_dir).
+ * htdoc/attrs.html.in: regenerated using cf_generate.pl
+
+Sat 5 Jun 2004 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * htcommon/defaults.cc: Escaped new-line in "allow_spaces_in_url" entry.
+ Set no_next_page_text to ${next_page_text}; likewise no_prev_page_text.
+
+Fri Jun 4 10:23:53 CEST 2004 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * htcommon/URL.cc: added "allow_space_in_url" (from fileSpace.1 patch)
+ * htcommon/defaults.[cc,xml]: added documentation of allow_space_in_url
+ * htdoc/attrs.html.in: regenerated using cf_generate.pl
+ * htdoc/cf_byname.html: ditto
+ * htdoc/cf_byprog.html: ditto
+ * htdoc/RELEASE.html: updated with info regarding this attribute
+
+Thu Jun 3 16:04:23 2004 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/HTML.cc (do_tag): Fixed meta date handling to avoid inadvertently
+ matching names like DC.Date.Review.
+
+Thu Jun 3 10:01:50 CEST 2004 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * htdoc/RELEASE.html: updated release notes and changes
+ * htdoc/THANKS.html: updated the 'thanks' section
+
+Thu Jun 3 09:32:52 CEST 2004 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * global: updated with 'autoreconf -if' (autoconf 2.59, libtool 1.5.6
+ and automake 1.7.9)
+
+Wed Jun 2 19:03:14 CEST 2004 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * contrib/rtf2html: added the rtf2html.c source as modified by David Lippi
+ and Gabriele Bartolini of the Comune di Prato. The source code is now
+ released under GNU GPL and included in the ht://Dig package.
+
+Tue Jun 1 20:23:40 CEST 2004 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * htcommon/HtSGMLCodec.cc: changed &curren; to &euro;
+
+Fri 28 May 2004 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * Most files: Update copyright to 2004
+
+Sun 23 May 2004 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * htdocs/FAQ.html: Sync with maindocs
+
+Sun 23 May 2004 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * configure, configure.in:
+ Resolve variables (e.g., BINDIR) copied into attrs.html,
+ without introducing "NONE" prefix detected by Gabriele.
+
+Sun 23 May 2004 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * .version, htdoc/RELEASE.html, htdoc/where.html,
+ htdoc/attrs.html.in, htdoc/cf_byname.html, htdoc/cf_byprog.html:
+ Prepare docs for release of 3.2.0b6.
+
+Mon Apr 26 15:12:22 2004 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htfuzzy/Soundex.cc (generateKey): Applied Alex Kiesel's fix to prevent
+ segfaults when word has no letters.
+
+Sun 25 Apr 2004 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * htdig/HTML.cc: Handle empty noindex_start/noindex_end lists.
+ * htlib/StringList.{cc,h}: const-correctness of Add/Insert/Assign(char*)
+
+ * redo mistakenly backed out patch...
+
+Sun 25 Apr 2004 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * htsearch/parser.cc: Address (but not fix) bug #934739
+ If collection->getDocumentRef() on line 889 returns NULL, don't crash.
+ I'm still trying to work out why it does return NULL -- I don't think
+ it ever should.
+
+ * mistakenly back out previous patch :(
+
+Sun 25 Apr 2004 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * htdig/Retriever.{h,cc}, htcommon/defaults.cc, htdoc/FAQ.html:
+ Add store_phrases attribute. If it is false, htdig only stores the
+ first occurrence of each word in a document. This reduces the database
+ size dramatically, and slightly increases digging speed.
+
+Sun 25 Apr 2004 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * db/{aclocal.m4,configure,os_abs.c.win32}, STATUS, htdoc/THANKS.html:
+ Correctly detect paths beginning C: as absolute paths in cygwin/Win32.
+ Fixes bug #814268.
+
+Sun 25 Apr 2004 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * htdig/Retriever.cc:
+ Gilles's patch to avoid regex compile for every URL encountered.
+
+Sun 25 Apr 2004 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * contrib/htdig-3.2.0.spec:
+ Karl Eichwalder's patch to use mktemp to create safe temp file.
+
+Wed Apr 7 17:12:33 2004 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/Retriever.cc (IsValidURL): Fixed bug #931377 so bad_extensions
+ and valid_extensions not thrown off by periods in query strings.
+
+Mon Mar 15 11:56:04 CET 2004 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * htsearch/Display.cc: changed (and fixed) the date factor formula as
+ Lachlan and David Lippi suggested, in order not to give negative results.
+
+Fri Mar 12 09:13:28 CET 2004 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * configure.in: removed 'eval' expressions which caused the 'NONE' prefix
+ path to be instantiated and the make script to hang
+ * acinclude.in: fixed AC_DEFINEs for SSL and ZLIB check macros, which prevented
+ autoheader (and therefore autoreconf) from working correctly
+ * moved manual pages from htdoc to installdir
+ * htdoc/[manpages].in: removed
+ * installdir/*.[1,8]: removed man pages (htdig-pdfparser.1, htdig.1,
+ htdump.1, htfuzzy.1, htload.1, htmerge.1, htnotify.1, htpurge.1,
+ htsearch.1, htstat.1, rundig.1, htdigconfig.8)
+ * installdir/*.[1,8].in: added pre-configure man pages (htdig-pdfparser.1.in,
+ htdig.1.in, htdump.1.in, htfuzzy.1.in, htload.1.in, htmerge.1.in, htnotify.1.in,
+ htpurge.1.in, htsearch.1.in, htstat.1.in, rundig.1.in, htdigconfig.8.in)
+ * regenerated configure scripts with autoreconf
+ * fixes bug #909674
+
+Sat 21 Feb 2004 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * installdir/HtFileType: Use mktemp to create safe temp file (bug #901555)
+
+Wed Feb 25 11:14:45 CET 2004 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * htdocs/THANKS.html: added Robert Ribnitz to the 'thanks' page and fixed
+ Nenciarini's position (it was not in alphabetical order - sorry!).
+
+Wed Feb 25 11:02:37 CET 2004 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * installdir/*.[1,8]: added man pages (htdig-pdfparser.1, htdig.1,
+ htdump.1, htfuzzy.1, htload.1, htmerge.1, htnotify.1, htpurge.1,
+ htsearch.1, htstat.1, rundig.1, htdigconfig.8) provided by
+ Robert Ribnitz <ribnitz at linuxbourg.ch> of the Debian Project
+ * installdir/Makefile.am: prepared the automake script for correctly
+ handling the man pages
+
+Sat 21 Feb 2004 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * htsearch/htsearch.cc:
+ Back out change of 21 December, as it causes problems with characters
+ which *should* be unencoded, like /
+
+Thu 19 Feb 2004 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * aclocal.m4, acinclude.m4, configure.in:
+ Remove duplicate tests for zlib
+ Fix tests for SSL (Fixes bug #829081)
+ Fix configure --help formatting
+
+ * htdoc/*.[18].in, htdoc/Makefile.am, configure.in: Added man pages
+
+ * htdoc/attrs.html.in, htdoc/cf_generate.pl, htdoc/Makefile.am:
+ Fill in #define'd attribs (Fixes bug #692125)
+
+ * test/Makefile.am: Incorporate new tests in make check
+
+ * test/t_htdig, test/t_parsing: suppress unwanted diagnostics
+
+ * STATUS: list Cygwin bug (#814268)
+
+ * htcommon/default.cc:
+ added wordlist_cache_inserts, remove wordlist_cache_dirty_level
+
+ * configure, */Makefile.in, */Makefile, htdoc/cf_by{name,prog}.html:
+ regenerated
+
+Fri 13 Feb 2004 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * db/mp_cmpr.c: Fix bug with --without-zlib
+
+Sun 8 Feb 2004 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * htcommon/URL.cc: Make server_alias case insensitive.
+
+ * htdig/Document.cc: Don't hex-decode twice. (Caused problems with names
+ like file%20name)
+
+ * htdig/Retriever.cc: Test validity of URL value *before* calling
+ signature(), as that implicitly normalises, and confuses
+ limit_normalised vs limit_urls_to
+
+ * htdig/htdig.cc: Remove stale md5_db if -i specified
+
+ * installdir/htdig.conf: Set common_url_parts to contain all strings
+ which *must* be in a valid URL. Probably contains whole domain name,
+ so more compression than using standard strings.
+
+ * htcommon/defaults.cc: Update docs. Remove default "bad_extensions"
+ from common_url_parts, and add .shtml
+
+ * test/t_htdig, test/t_htdig_local: Update self-tests
+
+Tue Feb 3 18:06:38 CET 2004 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * htcommon/HtConfiguration.cc: changed the Find method in order not to
+ ignore empty string results for string attributes whenever they are
+ defined in the configuration file by the user
+ * htdig/Document.cc: fixed bugs in handling the http_proxy,
+ http_proxy_authorization, authorization attributes
+ * htlib/Configuration.[h,cc]: added the Exists method in order to query
+ whether an attribute's definition is present in the configuration
+ dictionary (before it was checked against its string's length which
+ prevented empty attributes from being used correctly)
+ * these changes fix bug #887552
+
+Sun 18 Jan 2004 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * htcommon/URL.cc, test/url.cc:
+ Rename "allow_dbl_slash" to "allow_double_slash", to match defaults.cc
+
+ * htcommon/default.cc, htdoc/{hts_templates,attrs}.html:
+ Explain that keywords_factor applies to meta keywords. Fix old typo.
+
+ * test/t_{factors,templates}, test/htdocs/set1/{title.html,bad_local.htm}
+ * test/conf/entry-template:
+ Expanded test suite.
+
+Sat 17 Jan 2004 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * test/t_{parsing,htdig_local,factors,templates},
+ * test/htdocs/set1/title.html:
+ Expanded test suite.
+
+Sat 17 Jan 2004 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * htcommon/DocumentRef.cc:
+ Fix old-style use of HtConfiguration, so defaults are read correctly.
+ Causes max_descriptions to be treated correctly.
+
+ * htcommon/default.cc, htdoc/{hts_templates,attrs,cf_byname,cf_byprog}.html:
+ Explain that max_description{s,_length} don't affect indexing -- only
+ text used to fill in template variables.
+
+Mon 12 Jan 2004 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * Very many files: Fix bug #873965
+ Replace C++ style comments with C style comments in all C files, and .h
+ files they include.
+ Also, change //_WIN32 to /* _WIN32 */ in .cc files for uniformity.
+
+Mon 12 Jan 2004 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * test/t_parsing, test/test_functions.in: Add new tests
+ * htcommon/default.cc, htdoc/hts_templates.html: Cross-ref documentation.
+
+Mon Dec 29 2003 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * htdig/Retriever.cc:
+ Fix bug in which validity of first URL from each server was not checked.
+
+Mon Dec 29 2003 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * htdig/htdig.cc, htdoc/htdig.html: Fix bug #845054
+ Fix behaviour of -m and additional list of urls at the end of a command.
+ In either case, "-" denotes stdin.
+
+Mon Dec 29 2003 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * installdir/rundig, installdir/Makefile.{in,am}: Address bug #860708
+ Make bin/rundig -a handle multiple database directories
+
+Sun Dec 21 2003 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * htsearch/htsearch.cc:
+ Improve handling of restrict/exclude URLs with spaces or encoded chars
+
+Sun Dec 21 2003 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * htsearch/HtURLSeedScore.cc, htsearch/SplitMatches.cc: Fix bug #863860
+ Split patterns at "|".
+ For SplitMatches, make "*" only match if all other patterns fail.
+
+Sun Dec 14 2003 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * htdig/Server.cc: Fix bug #851303.
+ Allow indexing if robots.txt has an empty "disallow".
+
+ * test/t_htdig, test/t_htsearch, test/htdocs/robots.txt:
+ Tests for the above.
+
+Sun Dec 14 2003 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * htdig/htdig.cc, test/t_factors: Warn if config file has obsolete fields.
+
+Sun Dec 14 2003 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * htsearch/Display.cc: Apply Gilles's patch for ellipses bug #844828.
+
+Sun Dec 14 2003 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * test/{t_validwords,t_templates,t_fuzzy,t_factors}
+ * test/{set_attr,synonym_dict,dummy.stems,dummy.affixes,bad_word_list}
+ * test/conf/main-template test/htdocs/set1/{site2.html,site4.html}:
+ Added four new tests to test suite. Not included in "make check",
+ but can be run explicitly by "make TESTS=t_... check".
+
+Sun Dec 14 2003 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * htcommon/conf_lexer.{lxx,cxx}:
+ Back out changes to try to accept files without EOL :(
+
+Sat Dec 13 2003 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * htcommon/defaults.{cc,xml}, htdoc/{attrs,cf_byprog}.html:
+ Fix "used by" for max_excerpts, and resulting hyperlinks.
+
+Sat Nov 22 2003 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * htcommon/conf_lexer.{lxx,cxx}, htcommon/conf_parser.{yxx,cxx}:
+ Partially address bug #823455.
+ Don't complain if config file doesn't end in EOL.
+ Should the grammar be fixed not to need EOL?
+ Report errors to stderr, not stdout, as they confuse the web server.
+
+Sun Nov 9 14:44:02 EST 2003 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * Tagged release htdig-3-2-0b5
+
+Sat Nov 8 2003 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * htcommon/default.cc, htsearch/parser.cc: Fix bug #825877
+ Reduce backlink_factor to comparable with other factors, and
+ interpret multimatch_factor as the *bonus* given for multiple matches.
+
+Sat Nov 1 2003 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * htsearch/parser.cc: Fix bug #806419. Ignore bad words at start of phrase.
+
+Tue Oct 28 11:58:06 EST 2003 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * htdig/htdig.cc: set the debug level when we are importing a cookie file.
+ Fix bug #831478.
+
+Mon Oct 27 17:13:02 2003 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/Server.cc: Fix bug #831407. Make sure time properly reset after
+ delay completed, so that it doesn't allow 2 connections per delay.
+
+Mon Oct 27 15:57:38 2003 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdoc/THANKS.html: Added Lachlan, Jim and Neal to the active developers
+ list.
+
+Sun Oct 26 2003 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * htdoc/hts_templates.html: Clarify that PREV/NEXTPAGE template variables
+ are empty if there is only one page, ignoring no_{prev,next}_page_text.
+
+Sun Oct 26 2003 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * htcommon/defaults.cc: Fixed documentation to close bug #829767
+ Clarified that noindex_start/end do not get replaced by whitespace.
+ Also removed spurious '>' from start of boolean_syntax_errors, and
+ added missing '#' to many local <a href> tags.
+
+Sun Oct 26 12:42:27 EST 2003 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * htcommon/defaults.cc: Fixed description of 'head_before_get' after
+ Lachlan fixes.
+ * htdoc/attrs.html: rerun cf_generate.pl
+
+Sat Oct 25 2003 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * htsearch/Display.cc: Fix #829761.
+ If last component of the URL is used as a title, URL-decode it.
+
+Sat Oct 25 2003 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * htdig/Server.cc: Fix #829754. Avoid calculations with negative time
+
+Fri Oct 24 17:17:15 2003 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdoc/htdig.html, htdoc/meta.html, htdoc/require.html: Update URL for
+ the Standard for Robot Exclusion.
+
+ * htdoc/htmerge.html: Added two clarifications to -m option description.
+
+ * htdoc/cf_types.html: Make clear distinction between String List and
+ Quoted String List.
+
+Fri Oct 24 15:30:08 2003 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htsearch/Display.cc: Fix bug #829746. Applied Niel Kohl's fix for this,
+ to check if words input given before trying to use it, to avoid NULL
+ argument to syslog().
+
+Fri Oct 24 15:15:53 2003 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htsearch/Display.cc: Fix bug #578570. The enddate handling now works
+ correctly for a large, negative startday value.
+
+Fri Oct 24 12:47:51 2003 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/HTML.cc (ctor): Fix obvious typo in metadatetags.Pattern setting.
+
+Thu Oct 23 10:27:18 2003 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * htcommon/default.cc: Fix bug #828808. Default startyear to empty
+ Document "startyear defaults to 1970 if a start/end date set".
+
+Thu Oct 23 12:14:30 EST 2003 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * htdig/htdig.cc: restored the code before Oct 21 (fixes #828628)
+
+Thu Oct 23 11:41:15 EST 2003 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * htdig/Retriever.[h,cc]: removed 'head_before_get' overriding by
+ restoring the code before Oct 21.
+ * htdig/Document.[h,cc]: ditto, with the exception of detaching the HEAD
+ before GET mechanism from the persistent connections'.
+ * htcommon/defaults.cc: improved documentation (even though it needs
+ corrections by an English-speaking developer).
+ * These changes fix bug #828628
+
+Wed Oct 22 2003 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * htsearch/parser.cc: Applied Neal's patch to fix bug #823403
+ Documents only added to search list if they were successfully dug.
+ Lines 237-238 of htsearch/Display.cc
+ if (!ref || ref->DocState() != Reference_normal)
+ continue;
+ should now be redundant. (Left in to be defensive.)
+
+Tue Oct 21 11:04:56 EST 2003 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * htdig/Retriever.h: added the 'RetrieverType' enum and an object variable
+ for storing the type of dig we are performing (default initial);
+ * htdig/Retriever.cc: changed constructor in order to handle the type,
+ added some debugging explanation regarding the override of the
+ 'head_before_get' attribute, added checks regarding an empty
+ database of URLs to be updated (set the type to initial).
+ * htdig/Document.h: added the attribute 'is_initial' which stores the
+ information regarding the type of indexing (initial or incremental)
+ we are currently performing. Added access methods (get-and-set-like)
+ * htdig/Document.cc: modified the logic of the HeadBeforeGet settings during
+ the retrieval phase, in order to always override user's settings in
+ an incremental dig and automatically set the 'HEAD' call in this case.
+ * htcommon/defaults.cc: modified the default value of 'head_before_get' and a bit
+ of its explanation.
+ * htnet/HtHTTP.cc: detached the HEAD before GET mechanism from the persistent
+ connections one
+ * htdig/Server.cc: added one level of debugging to the display of the
+ server settings in the server constructor
+
+Fri Oct 17 2003 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * htword/WordType.cc, htcommon/defaults.cc: Patched to fix bug #823083
+ Don't assume IsStrictChar returns false for digits.
+ Clarify behaviour of allow_numbers in the documentation.
+
+Fri Oct 17 2003 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * htcommon/defaults.cc: Patched to fix bug #823455
+ Escaped "$" in valid_punctuation, and add warnings about $, \ and `.
+
+Wed Oct 15 11:12:52 2003 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/Server.cc (robotstxt): Patched to fix bug #765726.
+ Don't block paths with subpaths excluded by robots.txt, and make
+ sure any regex meta characters are properly escaped.
+
+Tue Oct 14 11:54:07 EST 2003 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * htnet/HtHTTP.cc: add an empty Accept-Encoding header - this informs the
+ server that htdig is only able to manage documents that are not encoded
+ (if no Accept-Encoding is sent, the server assumes that the client is
+ capable of handling every content encoding - i.e. zipped documents with
+ Apache's mod_gzip module). Partial fix of bug #594790 (which now becomes a
+ feature request)
+
+Mon Oct 13 2003 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * htfuzzy/Regex.cc: Search for regular expression. (Used to ignore it!)
+
+ * htfuzzy/Speling.cc, htword/{WordList.cc,WordList.h,WordKey.cc,WordKey.h}:
+ When looking in word database for misspelt words, don't ask to match
+ trailing numeric fields in database key.
+
+ * htcommon/defaults.cc, htdoc/htfuzzy.cc: Update docs.
+
+Sun Oct 12 2003 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * htsearch/htsearch.cc:
+ Fix bug if fuzzy algorithms produced no search words.
+ Send all debugging output to cerr not cout. More debugging output.
+
+Sun Oct 12 2003 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * htdig/{Retriever,Server}.cc: Back out the previous.
+ Gilles pointed out inconsistency with Retriever::IsValidURL().
+
+Sun Oct 5 2003 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * htdig/{Retriever,Server}.cc: Jim Cole's patch to bug #765726.
+ Don't block paths with subpaths excluded by robots.txt.
+
+Sun Oct 5 2003 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * htsearch/htsearch.cc: Highlight phrases containing stop words
+ * test/t_htsearch, test/conf/htdig.conf.in: Tests for the above
+
+Sat Sep 27 2003 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * test/{test_functions.in,t_htdig,t_htdig_local,t_htnet}:
+ Don't assume shell "." command passes arguments. (Doesn't on FreeBSD.)
+
+Sat Sep 27 2003 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * htlib/HtDateTime.h, htnet/HtCookie.cc:
+ Avoid ambiguous function call on systems (HP-UX) where time_t=int
+
+Fri Aug 29 09:35:46 MDT 2003 Neal Richter <nealr at rightnow.com>
+
+ * removed references to CDB___mp_dirty_level ,CDB_set_mp_diry_level()
+ & CDB_get_mp_diry_level()
+
+ * The config verb 'wordlist_cache_dirty_level' was left for possible use in
+ the future.
+
+Thu Aug 28 15:11:21 MDT 2003 Neal Richter <nealr at rightnow.com>
+
+ * Changed db/LICENSE file to new LGPL compatible license from Sleepycat
+ Software -- Thanks Sleepycat!
+
+ * Reverted to Revision 1.2 of db/mp_alloc.c. The recent change caused
+ large DB growth. Strangely the files contained no 'new' data, they were
+ just much larger. Looks like the pages were being flushed too often????
+
+Thu Aug 28 12:41:22 EST 2003 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * global: updated with 'autoreconf -if' (autoconf 2.57, libtool 1.5.0a and
+ automake 1.7.6)
+ * 'make check' successful on: AMD64 Linux 2.4, Alpha Linux 2.2,
+ RedHat Linux 7.3 (2.4), SPARC Ultra60 Linux 2.4,
+ Sparc R220 Sun Solaris (5.8).
+ * README.developer: added further info
+
+Thu Aug 28 12:00:10 EST 2003 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * db/[config.guess,config.sub,install-sh,ltmain.sh,missing]: added in the
+ database directory (this way 'make dist' goes on); I have not been able to
+ tell the db/configure script to get the 'top_srcdir' ones (which should be
+ the default behaviour). Maybe in the future we'll look for this.
+
+Thu Aug 28 11:53:48 EST 2003 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * db/configure.in: changed AC_PROG_INSTALL() to AC_PROG_INSTALL and removed
+ AC_CONFIG_AUX_DIR; this implies that autotools copies will be made for the
+ db directory as well.
+
+Thu Aug 28 11:36:42 EST 2003 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * [htcommon,htdb,htdig,htfuzzy,htlib,htnet,htsearch,httools,htword,test]/Makefile.am:
+ added the option above to every *_LDFLAGS
+
+Thu Aug 28 11:30:39 EST 2003 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * Makefile.am: removed acconfig.h from the EXTRA_DIST list
+
+Thu Aug 28 11:25:07 EST 2003 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * configure.in: removed portability checks for error, stat and lstat that
+ caused compile errors on Solaris. Added the '-mimpure-text'
+ extra ld flag for GCC on Solaris systems (a linkage error occurs
+ when libstdc++ is not shared)
+
+Thu Aug 28 11:22:57 EST 2003 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * include/Makefile.am: changed htconfig.h.in into config.h.in
+
+Thu Aug 28 11:16:19 EST 2003 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * htlib/error.[h,c]: removed for now, until replacement functions will be
+ correctly performed.
+
+Thu Aug 28 11:11:32 EST 2003 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * htdoc/cf_generate.pl: fixed an error when opening tail and head files
+ * Makefile.am: enabled rebuild from a different directory (it is used
+ by 'make dist')
+
+Thu Aug 28 10:46:35 EST 2003 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * htlib/malloc.c: modified according to autoconf specifications as far
+ as replacement functions are regarded
+ * htlib/[lstat, stat].c: removed for now
+
+Thu Aug 28 10:40:58 EST 2003 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * htdoc/cf_generate.pl: accept an optional parameter (top source directory)
+ * htcommon/defaults.cc: fixed some broken lines which prevented
+ cf_generate.pl from correctly working
+ * htdoc/Makefile.am: modified the automake file for passing the top
+ source directory to cf_generate.pl
+ * htdoc/attrs.html, htdoc/cf_byname.html, htdoc/cf_byprog.html:
+ Regenerated using cf_generate.pl.
+
+Tue Aug 26 12:25:40 EST 2003 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * configure.in: removed AC_FUNC_MKTIME because it may not work properly
+ and added default replacement directory (htlib) for future uses
+ * htlib/Makefile.am: back-step with re-inclusion of mktime.c in the
+ list of files to be always compiled (caused linking errors
+ for the __mktime_internal function)
+ * global: updated with 'autoreconf -if'
+
+Sun Aug 24 12:44:29 EST 2003 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * updated with 'autoreconf -if': autoconf 2.57, automake 1.7.6 and
+ libtool 1.5.0a (autotools that come with Debian SID)
+
+Sun Aug 24 12:39:34 EST 2003 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * configure.in: moved AC_PROG_LEX to AM_PROG_LEX
+ * db/configure.in: enabled AM_MAINTAINER_MODE which prevented users without
+ autotools to configure and compile the program (relatively to the db
+ directory)
+ * include/htconfig.h: previously excluded from the branch (severe error!)
+
+Mon Jul 21 20:54:47 CEST 2003 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * htlib/(malloc|error|lstat|stat|realloc).c: added for cross-compiling
+ reasons (as suggested by automake)
+ * htlib/error.h: ditto
+ * db/acconfig.h: removed as suggested by autotools' new versions
+ * configure.in: removed AC_PROG_RANLIB (overridden by AC_PROG_LIBTOOL)
+ * updated as of rerun 'autoreconf -if'
+
+Mon Jul 21 10:08:24 CEST 2003 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * Patch provided by Marco Nenciarini <mnencia at linux.it> has been
+ completely applied; the patch adds support for detection
+ of standard C++ library
+ * all sources using <iostream.h> <fstream.h> <iomanip.h>: modified
+ to use standard ISO C++ library, if present
+ * db/configure scripts: modified for autoconf 2.57
+
+Mon Jul 21 09:59:16 CEST 2003 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * [.,*]/Makefile.in: regenerated by new automake against new configure.in
+ * Makefile.config: now looking for the global configuration file
+ in the source directory
+
+Mon Jul 21 09:49:22 CEST 2003 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * configure.in: completely rewritten, deprecated directives have
+ been removed and now version 2.57 is a prerequisite.
+ * acinclude.m4: moved all the macros here
+ * aclocal.m4, configure: regenerated by aclocal and autoconf
+ * acconfig.h: removed as now it is deprecated
+ * include/htconfig.h.in: removed, as 'config.h.in' is preferred
+ and auto-generated
+ * config.[guess,sub]: updated with newer versions
+
+Tue Jul 8 16:29:44 2003 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htsearch/parser.cc (checkSyntax): Fixed boolean_syntax_errors
+ handling to work over multiple config files.
+
+Mon Jul 7 00:41:55 CEST 2003 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * Updated to autoconf 2.57, libtool 1.5 and automake 1.7.5
+ * removed acconfig.h files
+ * autoconf include file is now include/config.h (for autoheader)
+ * include/htconfig.h.in renamed to include/htconfig.h: now includes
+ config.h and redefines the bool types
+ * htlib/HtRegexList.cc, htdig/(Document.cc|ExternalParser.cc): removed
+ TRUE and FALSE and converted to C++ standard values
+
+Sat Jul 5 2003 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * test/test_functions.in: Fix bugs starting/killing apache
+
+Sat Jul 5 2003 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * htcommon/defaults.cc: Disable cache flushing to avoid "page leak".
+
+Tue Jun 24 2003 Neal Richter <nearl at rightnow.com>
+
+ * Update Copyright Notices in code & documentation to 2003
+
+ * Changed License Notice GPL -> LGPL (License change decided by HtDig
+ Board & Membership, October 2002)
+
+Mon Jun 23 2003 Neal Richter <nearl at rightnow.com>
+
+ * Raft of changes. Most to do with Native Win32 support
+
+ * TODO: ExternalTransport & ExternalParser are effectively disabled with
+ #ifdefs for Native WIN32
+
+ * remove global CDB___mp_dirty_level variable and substitute functions to set/get variable
+
+ * Added local copies of GNU LGPL regex, POSIX-like dirent routines, getopt
+ library and filecopy routines - mainly for Native WIN32 support
+
+ * improve IsValidURL with return codes (htdig/Retriever.cc)
+
+ * lots of improvements/new-features to libhtdig
+
+Sun Jun 22 2003 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * db/mp_cmpr.c (CDB___memp_cmpr_open):
+ Make weak compression database standalone to avoid recursion
+ This *should* fix all of the recent problems with dirty cache etc.
+
+ * test/search.cc: Don't take sizeof zero sized array
+
+Fri Jun 20 2003 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * configure,aclocal.m4,acinclude.m4: --with-ssl set CPPFLAGS, not CFLAGS
+
+Fri Jun 20 2003 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * db/configure: Hack which should allow select to be detected on HP/UX
+
+ * db/db.c: Replace HAVE_ZLIB with HAVE_LIBZ (as set by configure)
+
+ * htword/wordKey.cc: More descriptive error message
+
+ (Changes to compile with Sun's C++)
+ * htnet/{HtCookie.cc,HtFTP.cc,Transport.cc}:
+ Assign substring of const string to const pointer.
+ * htsearch/ResultMatch.h:
+ Allow use of SortType in ResultMatch::setSortType()
+ * test/search.cc: Don't take sizeof(variable size array)
+ * htdb/htdb_stat.cc: avoid name clash for global var internal
+ * htcommon/URL.h, htlib/HtTime.h, htlib/htString.h, htnet/Connection.h,
+ htword/WordBitCompress.h:
+ Cast default args of type string literal to type (char*)
+
+ * htdocs/require.html: Remove email address.
+
+ * htlib/gregex.h: Avoid warning if __restrict_arr already defined
+
+Sun Jun 14 2003 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * htcommon/defaults.cc:
+ Set wordlist_cache_dirty_level to 1 (its most conservative value).
+ Miscellaneous reformatting.
+ * htdoc/attrs.html, htdoc/cf_byname.html, htdoc/cf_byprog.html:
+ Regenerated using cf_generate.pl.
+
+ * htdoc/{require.html,meta.html,all.html,meta.html}:
+ Update disk usage for phrase searching.
+ Updated list of supported platforms. More hyperlinks.
+
+Fri Jun 13 2003 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * htsearch/Display.cc (setVariables), htdocs/hts_template.html:
+ Set MATCH_MESSAGE from method_names (for internationalisability).
+ Removed all trace of hack for config attribute...
+
+Thu Jun 12 14:16:05 2003 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htsearch/htsearch.cc (main): Fixed boolean_keywords handling to
+ work over multiple config files (must destroy old list before
+ creating new one).
+
+ * htcommon/defaults.cc, htsearch/Display.cc (setVariables): Removed
+ incorrect default value for "config" attribute, and removed hack
+ that attempted to correct it.
+
+ * htdoc/attrs.html: Regenerated using cf_generate.pl.
+
+Thu Jun 12 13:28:01 2003 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htcommon/defaults.cc, htcommon/HtSGMLCodec.cc (ctor): Added
+ translate_latin1 option to allow disabling Latin 1 specific SGML
+ translations.
+
+ * htdoc/attrs.html, htdoc/cf_byname.html, htdoc/cf_byprog.html:
+ Regenerated using cf_generate.pl.
+
+Mon Jun 9 2003 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * htsearch/htsearch.cc: Fixed setupWords loop for junk at end of query
+
+Mon Jun 9 2003 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * htsearch/Display.cc: Set CONFIG template variable to the base name
+ of the config file (no directory or .conf), as expected by htsearch
+
+Mon Jun 9 2003 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * test/test_functions.in: avoid trying to kill apache multiple times
+
+ * configure,configure.in: Reformat --help output
+ * htdoc/FAQ.html: Brought up-to-date with main docs
+ * htdoc/hts_templates.html: added hyperlinks.
+ * installdir/search.html: Display version
+
+Sun Jun 8 2003 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * configure: Hack to set --disable-bigfile for Solaris (with Sun cc)
+ and --disable-shared --enable-static for Mac OS X
+
+ * test/{test_functions.in,t_htdig,t_htdig_local,t_htnet}:
+ Only start Apache for tests which need it, and kill it after the test
+
+ * contrib/parse_doc.pl: Allow file names containing spaces (from .deb)
+
+Mon Jun 2 2003 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * db/mp_cmpr.c: Add default zlib setting to default_cmpr_info
+ * htcommon/defaults.cc, htword/WordDBCompress.cc: Fix docs to say
+ default compression by 8 (not by 3, which I had "fixed" it to...)
+
+ * htcommon/conf_lexer.{cxx,lxx}: Avoid warnings, and document hack.
+
+Thu May 29 2003 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * db/mp_cmpr.c: Fix comparison of -1 and unsigned which broke SunOS cc
+ * htdoc/install.html: Warn SunOS cc users to --disable-bigfile
+
+ * htcommon/conf_lexer.cxx: Suppress warnings of unused identifiers
+ * test/conf/htdig.conf2.in: Disable testing of content_classifier
+ attribute, as it didn't work until after installation
+
+Tue May 27 2003 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * db/configure, db/ac{local,include}.m4:
+ Stop test for zlib from adding -I/default/path (*this* time...)
+
+ * htword/DBPage.h: Fix bug introduced in previous patch
+
+ * test/Makefile.{in,am}: Replace non-portable make -C X by cd X; make
+
+Tue May 27 2003 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * {,db/}configure, {,db/}ac{local,include}.m4:
+ Stop test for zlib from adding -I/default/path (broke SunOS cc)
+ Fix -Wall test if CCC is g++ but CC is not gcc
+
+ * test/dbbench.cc: #include <fcntl.h> later, to avoid #define open
+ causing problems
+
+ * includedir/synonyms: Remove trailing blank line which caused warning
+ * htnet/HtCookieInFileJar.cc,htfuzzy/Synonym.cc: .get() to stop warnings
+ * htlib/mhash_md5.c: char -> unsigned char to stop warnings
+ * test/search.cc, htword/WordDBPage.h:
+ Casts to (int) to stop printf warnings. ALLIGN -> ALIGN
+
+Sat May 24 2003 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * htcommon/defaults.cc: Keep more wordlist cache pages clean
+
+ * {,db/}configure{,.in}, {,db/}ac{local,include}.m4:
+ Patch by Richard Munroe to test if -Wno-deprecated needed.
+ Many bug fixes / extra search paths added.
+
+ * include/htconfig.h.in, db/db_config.h.in:
+ Only '#define const' if not C++ (htword/WordDB.cc uses db_config.h)
+ * test/dbbench.cc: check for alloca even if gcc
+ * test/t_url: used grep -C instead of grep -c (for portability)
+ * db/mp_{alloc,cmpr}.c: Removed/replaced C++ style comments
+
+ * htdoc/require.html: Revised list of supported platforms
+
+Thu May 22 2003 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * htnet/HtFile.cc: Fix previous .get() patch...
+
+Thu May 22 2003 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * htlib/DB_2.cc: Set wordlist_cache_dirty_level before opening
+ database, to avoid database memory allocation problem.
+
+ * db/db_err.c: Make 'fatal' errors actually exit.
+
+ * htdig/Document.cc, htsearch/parser.cc, htdig/htdig.cc,
+ * htnet/Ht{HTTP,File}.cc:
+ Add .get() to use of strings to avoid compiler warnings (FreeBSD).
+
+Thu May 22 2003 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * ltmain.sh, test/Makefile.in: Hack to list library dependencies
+ multiple times in g++ command, to get MacOS X to 'make check'.
+
+ * test/{search,word}.cc: cast sizeof() to (int) to avoid warnings.
+
+ * htdoc/install.html: Documented MacOS X's shared libraries problem.
+
+Sun May 18 2003 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * db/mp_alloc.c: Hopefully the *last* fix for this morning's patch...
+
+ * configure, aclocal.m4, acinclude.m4:
+ Look for httpd modules in .../libexec/httpd for OS X
+ * test/conf/httpd.conf: Disabled mod_auth_db, mod_log{agent,referer}.
+
+Sun May 18 2003 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * db/db.h.in: Declare variable introduced in db/mp_cmpr.c patch
+
+Sun May 18 2003 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * db/mp.h, db/mp_{alloc,bh,cmpr,region}.c,
+ * htword/WordDB.cc, htdig/htdig.cc:
+ Avoid infinite loop if memp_alloc has only dirty,
+ "weakly compressed" (i.e. overflow) pages.
+ * htcommon/defaults.cc: Document the above, plus misc updates.
+
+ * htword/WordDBPage.h:
+ Cast sizeof() to (int) in printf()s to avoid compiler warnings.
+
+Sun Apr 20 2003 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * htdig/htdig.cc: delete db.words.db_weakcmpr if -i specified.
+
+Wed Feb 26 22:10:40 CET 2003 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * htnet/HtHTTP.cc: fixed colon (':') problem with HTTP header parsing,
+ as Frank Passek, Gilles and others suggested, as space is not
+ mandatory between the field declaration and the field value returned
+ by the server
+
+Sun Feb 23 10:20:58 CET 2003 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * htcommon/defaults.[cc,xml]: added the 'cookies_input_file'
+ configuration attribute for pre-loading cookies in memory
+ * htdig/htdig.cc: added the feature above; the code automatically
+ loads the cookies from the input file into the 'jar' that will be
+ used during the crawl.
+
+Sun Feb 23 10:16:08 CET 2003 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * htnet/HtHTTP.h: removed the NULL pointer check before assigning a
+ new jar to the HTTP code
+
+Tue Feb 11 2003 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * htcommon/defaults.cc: Set default compression_level to 6,
+ which enables Neal's wordlist_compression_zlib flag.
+
+Tue Feb 11 2003 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * htcommon/{DocumentRef.h, HtWordReference.h},
+ htsearch/WeightWord.{cc,h},
+ htsearch/parser.{cc,h}, htsearch/htsearch.cc:
+ Added field-restricted searching, by title:word or author:word
+
+ * htdig/ExternalParser.cc, htdig/HTML.{cc,h}, htdig/Parsable.{cc,h},
+ htdig/Retriever.{cc,h}:
+ Parse author from <meta ...> tags. Also moved some common
+ functionality from HTML/ExternalParser into Parsable.
+
+ * test/t_htsearch, htcommon/defaults.cc,
+ htdoc/{TODO.html,hts_general.html,hts_method.html}:
+ Test and document the above
+
+Sun Feb 9 2003 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * htdig/HTML.cc: fix bug in detection of deprecated noindex_start/end
+ * htsearch/Display.cc: try harder to find value for DBL_MAX #680836
+ * htcommon/defaults.cc: fixed typos.
+
+Sat Feb 1 13:57:17 CET 2003 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * htnet/HtCookie.[h,cc]: allowed printDebug to be passed an ostream object
+ * htnet/HtCookieMemJar.cc: removed a debug call
+
+Thu Jan 30 19:28:32 CET 2003 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * configure.in: used AC_LIBOBJ instead of deprecated LTLIBOBJS's workaround
+ * ltconfig: removed as not needed anymore since libtool 1.4
+ * db/configure.in: added AC_CONFIG_AUX_DIR(../) for letting automake know to use
+ the main ltmain.sh file
+ * configure, aclocal.m4, Makefile.in, */Makefile.in, config.guess, config.sub,
+ install-sh, ltmain.sh, missing, mkinstalldirs: re-generated by autotools:
+ aclocal, autoconf 2.57, automake 1.6.3 and libtool 1.4.3
+ * db/aclocal.m4, db/configure, db/mkinstalldirs: ditto
+
+Thu Jan 30 00:16:51 CET 2003 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * htsearch/htsearch.cc: removed a warning due to a not-initialized pointer
+
+Wed Jan 29 22:53:25 CET 2003 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * acinclude.m4: included the function for checking against SSL, as
+ found in the ac-archive.
+
+Tue Jan 28 12:23:16 CET 2003 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * htnet/Makefile.am: added HtCookieInFileJar.[h,cc] files
+ * installdir/cookies.txt: example file for pre-loading HTTP cookies
+ * installdir/Makefile.am: added cookies.txt
+
+Tue Jan 28 12:16:28 CET 2003 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * htnet/HtCookieMemJar.[h,cc]: performed deep copy of the jar in the copy constructor
+
+Tue Jan 28 12:13:44 CET 2003 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * htnet/HtCookie.[h,cc]: added the constructor of a cookie object from a line
+ of a cookie input file (Netscape's way): if an expiration value of '0' is set
+ through the cookies input file, the cookie is managed as a session cookie.
+ Improved copy constructor, solving a bug related to the expires field.
+
+Tue Jan 28 12:11:27 CET 2003 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * htnet/HtCookieInFileJar.[h,cc]: class for importing cookies from a text file
+
+Tue Jan 28 12:08:20 CET 2003 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * htlib/HtDateTime.h: added the constructor HtDateTime(const int)
+
+Sat Jan 25 2003 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * htsearch/Display.cc: Convert "<br>\n" in $(DESCRIPTION) to "<br>"
+ so it can be used in Javascript (feature request #529926).
+
+Tue Jan 21 2003 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * HTML.cc (HTML, parse): Handle noindex_start/end as string lists.
+
+ * test/{t_htsearch,htdocs/set1/script}: Test the above
+
+ * htcommon/defaults.cc:
+ Add "<SCRIPT" to default noindex_start/end (feature request #586359).
+
+
+ * htlib/String.cc (operator>> (istream&,String&) ):
+ Exit loop when getline fails for reasons other than a full buffer.
+
+ * htnet/HtFile.cc (File2Mime), installdir/HtFileType:
+ Allow file names containing spaces.
+
+Sat Jan 11 2003 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * htnet/HtFile.cc (Request), htdig/Document.cc (RetrieveLocal),
+ htcommon/URL.h htcommon/URLTrans.cc:
+ Decode URL paths before use as local filenames (file:/// & local_urls).
+
+ * test/{t_htdig,t_htdig_local,t_htsearch}, test/conf/htdig.conf2.in,
+ test/htdocs/set1/{index.html,site 1,sub%20dir/empty file.html}:
+ Tests for the above.
+
+ * htcommon/HtConfiguration.cc: brackets around assignment in 'if'.
+ * test/search.cc (LocationCompare): Only specify default arg once.
+
+Fri Jan 10 2003 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * htlib/String.cc (operator>> (istream&,String&) ):
+ Check status of stream, not return value of get().
+ Fixes bug (for some C++ libs) where reading stops at a blank line.
+
+Fri Jan 1 2003 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * htnet/HtFile.cc(Ext2Mime,Request), htdig/Document.cc(RetrieveLocal):
+ Determine local files' MIME types from mime.types, not hard-coded.
+ URLs matching attribute "bad_local_extensions" must use their true
+ transport protocol (HTTP for http://, filesystem for file:///).
+
+ * htnet/HtFile.cc (File2Mime, Request): For file:/// URLs only,
+ files without (or with unrecognised) extensions are checked by
+ the program specified by the "content_classifier" attribute.
+
+ * htnet/HtFile.cc (Request): Symbolic links are treated as
+ redirects, to avoid problems with relative references.
+
+ * htcommon/defaults.cc: Documented the above (and added crossrefs).
+
+ * test/t_ht{dig,dig_local,search}, test/htdocs/set1/*,
+ test/conf/htdig.conf2.in: Add tests for bad_local_extensions.
+
+Mon Dec 31 2002 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * configure.in,htfuzzy/EndingsDB.cc,htlib/{HtR,r}egex.h,Makefile.am:
+ Renamed regex.h to gregex.h and allow use of rx instead.
+
+ * htcommon/defaults.cc,htdocs/{attrs,cf_byprog,cf_byname}.html:
+ Fixed typo in cross-references to restrict and limit_urls_to.
+
+ * test/t_htmerge: Re-enabled htmerge command (discarding output).
+
+ * test/Makefile,test/conf/htdig.conf3.in: Added conf3 and fixed db path.
+
+Mon Dec 30 2002 Lachlan Andrew <lha at users.sourceforge.net>
+
+ * contrib/doc2html/*: Incorporated David Adams' latest version, 3.0.1.
+
+Mon Dec 30 2002 Lachlan Andrew <lha at users.sourceforge.net>
+
+ Forward-ported several patches from 3.1.6:
+
+ * htdig/ExternalParser.cc: Added "description_meta_tag_names" attrib.
+ Added "dc.date|dc.date.created|dc.date.modified" synonyms for "date".
+ Allow spaces between "url" and "=" in refresh.
+ Fixed bug in flag positions.
+ Added "use_doc_date" attribute.
+
+ * htdig/HTML.cc: Added "description_meta_tag_names" attribute.
+ Added "dc.date|..." synonyms.
+ Added "ignore_alt_text" attribute.
+
+ * htdig/Retriever.cc: Added "ignore_dead_servers" attribute.
+ Added call to "url.rewrite()" in got_href().
+
+ * htdoc/FAQ.html: Latest version now 3.1.6. Mention old security hole.
+ Describe external converters for PostScript etc.
+ Mention pdf_parser not supported in 3.2.
+
+ * htdoc/{attrs,cf_byname,cf_byprog}.html: New attributes added
+ (automatically from defaults.cc).
+
+ * htdoc/htmerge.html: Update for multiple database support.
+
+ * htdoc/hts_form.html: Describe relative/incomplete dates.
+
+ * htdoc/require.html: Describe phrase searching, external parsers,
+ external transports.
+ Added some new supported systems. (Commented out as testing
+ incomplete.)
+
+ * htfuzzy/Synonym.cc: Protect against "synonym" entries with one word.
+
+ * htlib/String.cc: Protect against negative string lengths.
+
+ * htsearch/Display.{cc,h}: Added "search_result_contenttype" attribute,
+ and corresponding displayHTTPheaders() function.
+ Rewrite URLs.
+ Remove old "ANCHOR" variable.
+ Handle relative dates.
+ Added "max_excerpts" attribute and buildExcerpts() function.
+ Added "anchor_target" attribute.
+
+ * htsearch/DocMatch.h: Added "orMatches"
+
+ * htsearch/htsearch.cc: Added "boolean_keywords" attribute.
+ Rewrite URLs.
+
+ * htsearch/parser.cc: Added "boolean_syntax_errors" attribute.
+ Added wildcard search.
+ Fixed bug in perform_phrase() so it now handles "bad words" and
+ short words properly.
+ Added "multimatch_factor" to give greater weight to documents matching
+ multiple "OR" terms.
+
+ * htsearch/htparser.h: Added boolean_keywords support.
+
+ * htcommon/defaults.{cc,xml}: New attributes added, and enhanced
+ descriptions
+
+
+ Cleaned code to remove some compiler warnings/errors:
+
+ * htcommon/HtConfiguration.cc: Brackets around assignment 'path='
+ inside 'if'
+
+ * htdig/Server.cc, htsearch/Display.cc:
+ Added ".get()" when strings passed as arguments.
+
+ * htlib/StringMatch.h, htword/WordBitCompress.h:
+ Explicit cast of NULL to (char*)NULL for broken C++ compilers.
+
+
+ Also:
+
+ * STATUS: Removed "not all htsearch input parameters handled properly",
+ "Return all URLs", "Turn on URL parser test",
+ "htsearch phrase support tests".
+ Reduced list of things to do for "require.html".
+
+
+ * test/t_htsearch, test/conf/htdig.conf3.in:
+ Added testing of phrases and boolean_keywords / boolean_syntax_errors.
+
+Thu Nov 28 09:02:46 2002 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * installdir/english.0: Removed S flag from birth, because it doesn't
+ do what we want (birthes, not births).
+
+Tue Nov 26 23:16:08 2002 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdoc/hts_form.html: Fixed typo in link & description for restrict.
+
+Tue Nov 26 22:30:06 2002 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * installdir/english.0: Patched with Lachlan Andrew's changes, fixing
+ lots of dubious uses of suffixes to get more appropriate and correct
+ fuzzy endings expansions.
+
+ * installdir/synonyms: Updated with the version contributed by
+ David Adams, with minor changes. Kept old one as synonyms.original.
+
+Mon Nov 4 10:44:35 CET 2002 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * htcommon/URL.[h,cc]: added the assignment operator
+
+Sun Oct 27 09:29:02 2002 Geoffrey Hutchison <ghutchis at localhost>
+
+ Merge in word DB zlib patch from Neal Richter.
+
+ * db/db.h.in, db/mp_cmpr.c, htword/WordList.cc,
+ htword/WordDBCompress.h, htword/WordDBCompress.cc: Add support for
+ using the zlib compression (and compression level) if specified by
+ the new wordlist_compress_zlib, which is "true" by default.
+
+ * htcommon/defaults.cc: Add attribute wordlist_compress_zlib as
+ above.
+
+ * htdoc/attrs.html, htdoc/cf_byname.html, htdoc/cf_byprog.html:
+ Update using cf_generate.pl.
+
+Sat Oct 26 21:59:01 2002 Geoffrey Hutchison <ghutchis at localhost>
+
+ Merge in fixes from Lachlan Andrew
+
+ * test/Makefile.am, test/Makefile.in, test/t_url, test/url.cc,
+ test/url.children, test/url.parents, test/url.output: Add URL
+ tests to the automatic test suite (rather than requiring them to
+ be run manually).
+
+ * */Makefile.in: Regenerate using automake-1.4p6.
+
+ * htcommon/URL.cc, htcommon/URL.h: Add new configuration attribute
+ allow_double_slash to only remove // marks when requested (since
+ some server-side code uses them), handle initial protocols
+ without double slashes, and only remove the default doc string
+ from appropriate protocol URLs (e.g. not file), treat ".//" as a
+ relative path, and collapse /../ *after* // and /./ handling.
+
+ * htcommon/defaults.cc: Add documentation for allow_double_slash,
+ as well as various documentation cleanups.
+
+ * htdig/ExternalTransport.cc: Fix minor bug--recognize service
+ specified as https:// rather than https.
+
+ * htdoc/hts_form.html, htdoc/hts_templates.html: Documentation fixes.
+
+ * htsearch/htsearch.cc: Create valid boolean query if "exact" not
+ specified in search_algorithms by adding the exact word with low
+ weight. Solves PR#405294.
+
+Fri Oct 4 17:05:06 2002 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htcommon/defaults.xml: Added first-draft XML version of defaults
+ file. This will eventually be used to generate defaults.cc and
+ documentation automatically. (As pointed out by Brian White, this
+ will make the binaries smaller.)
+
+Wed Sep 25 13:56:31 2002 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/HTML.cc (parse): Fixed handling of JavaScript skipping so it
+ doesn't get confused by "<" in code.
+
+Thu Sep 19 09:04:50 CEST 2002 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * htnet/HtHTTP.cc: another check for cookie jar's null pointer
+
+Tue Sep 17 17:41:51 2002 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htcommon/defaults.cc (external_protocols): Fixed table formatting
+ as suggested by Lachlan Andrew.
+
+Thu Aug 29 21:21:34 CEST 2002 Soeren Vejrup Carlsen <svc at users.sourceforge.net>
+
+ * htdig/Document.[h,cc]: first steps in FTP handling. HtFTP.h included and
+ we now test for the 'ftp' protocol in the Document::Retrieve function.
+ Has not yet been tested!
+
+ * htnet/HtFTP.[h,cc]: added class to handle the FTP-protocol. Very
+ experimental (has not been tested yet).
+
+Fri Aug 9 13:01:05 2002 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * httools/htnotify.cc (readPreAndPostamble): Check for empty strings
+ in file names, not just NULL, as suggested by Martin Kraemer.
+
+Wed Aug 7 12:11:31 2002 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/ExternalParser.cc (parse): Fixed to impose max_doc_size
+ restriction on external converter output which it reads in.
+
+Tue Aug 6 18:21:11 CEST 2002 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * these changes were suggested by David Reed <DReed1 at citgo.com> (thanks)
+
+ * htdig/Document.cc: manage cookies via SSL
+
+ * htnet/HtCookie.[h,cc]: features both RFC2109 and Netscape version
+
+ * htnet/HtCookieJar.cc: ditto
+
+Tue Aug 6 17:12:22 CEST 2002 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * htcommon/defaults.cc: added the 'http_proxy_authorization' attribute.
+ Needs revision due to my usual *spaghetti* English. :-)
+
+ * htdig/Document.[h,cc]: proxy authorization is now enabled
+
+Tue Aug 6 09:28:39 CEST 2002 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * htnet/Connection.[h,cc]: IP address storing as string (sync with ht://Check)
+
+ * htnet/Transport.[h,cc]: HTTP Proxy and Basic credentials handling moved here (ditto)
+ through the use of a protected static method
+
+ * htnet/HtHTTP.h: SetCredentials declared to be virtual (unnecessary because inherited,
+ but gives better understanding); new method SetProxyCredentials for
+ proxy authorization.
+
+ * htnet/HtHTTP.cc: HTTP header Proxy-Authorization is now handled. The
+ SetCredentials and SetProxyCredentials methods now make use of the
+ Transport::SetHTTPBasicAccessAuthorizationString method, in order to
+ write the string for negotiating the access.
+
+Fri Aug 2 15:40:18 2002 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/Document.cc (Retrieve): Allow redirects from HTTPSConnect.
+
+Tue Jul 30 12:46:56 2002 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htlib/md5.cc: Added missing include of stdlib.h, as Geoff suggested.
+
+Sat Jul 27 11:57:25 2002 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htnet/SSLConnection.cc: Add fix for segfault on SSL connections
+ noticed by several users. Fix contributed by Andy Bach
+ <afbach at users.sourceforge.net>.
+
+Tue Jun 18 10:22:01 2002 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htdig/Retriever.cc (got_word): Check that the word length meets
+ the minimum word length before doing any processing.
+
+Fri Jun 14 17:26:21 2002 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htsearch/Display.cc (buildMatchList), htsearch/HtURLSeedScore.cc
+ (Match), htsearch/SplitMatches.cc (Match): Added Jim Cole's fix to
+ bugs in handling of search_results_order.
+
+Wed May 15 09:45:40 CEST 2002 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * htnet/Retriever.cc: fixed the bug regarding the server_wait_time
+ feature after the maximum number of requests per connection has been
+ reached.
+
+Tue Apr 9 16:41:33 CEST 2002 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * htnet/HtCookie*.[h,cc]: RFC2109 compliant.
+ * htlib/HtDateTime.[h,cc]: Add const-ness to the DiffTime static method
+
+Tue Apr 9 12:52:30 CEST 2002 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * htnet/HtCookie.cc: fixed a bug regarding expiry date recognition
+
+Fri Apr 5 14:08:39 2002 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/ExternalTransport.cc (Request): Fixed to strip CR from
+ header lines, output header lines with -vvv.
+
+Tue Mar 19 08:40:54 CET 2002 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * htnet/HtCookie.cc: enhanced controls regarding the expires setting
+ when no expires is returned. Prevents NULL pointer exceptions from
+ arising.
+
+Mon Mar 18 11:28:02 CET 2002 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * htlib/HtDateTime.h: added the copy constructor
+ * htnet/HtCookie.cc: fixed a NULL pointer bug regarding 'datestring'
+ management and HtDateTime copy constructor is now used
+
+Tue Mar 12 18:19:49 2002 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htlib/HtDateTime.cc (Parse, SetFTime): Added Parse method for
+ more flexible parsing of LOOSE/SHORT formats, use it in SetFTime.
+ Also skip unexpected leading spaces in SetFTime, as these frequently
+ cause problems with some strptime() implementations.
+
+Mon Feb 11 23:28:37 2002 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htdig/Retriever.h (got_redirect): Add referer to properly handle
+ broken links through a redirect as reported by Joe Jah.
+
+ * htdig/Retriever.cc: As above.
+
+ * htdig/Document.cc (Retrieve): Fix bug that prevented external
+ transport methods from reporting redirects as reported by Jamie
+ Anstice <Jamie.Anstice at sli-systems.com>.
+
+ * htlib/Dictionary.cc (hashCode): Trial of hash function suggested
+ by Jamie Anstice.
+
+Sat Feb 9 18:06:29 2002 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htsearch/DocMatch.[h,cc]: Add scoring code for the new htsearch
+ framework.
+
+Thu Feb 7 11:32:14 2002 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * htnet/HtHTTP.cc (ReadChunkedBody): gets control of Read_Line
+ methods (return error when they fail).
+
+Fri Feb 1 17:12:31 2002 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * Merged htdig-3-2-x branch back into CVS mainline.
+
+ * ChangeLog.0: Update with current 3.1.6 ChangeLog.
+
+Thu Jan 24 18:06:04 2002 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * configure.in, aclocal.m4: Use new CHECK_SSL macro from the
+ autoconf archive.
+
+ * configure: Generate via autoconf.
+
+Fri Jan 18 11:15:29 2002 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htnet/Transport.h (class Transport): Add const to SetCredentials
+ method declaration as pointed out by Roman Maeder.
+
+Wed Jan 16 13:35:26 2002 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * db/db.h.in: Add #include <sys/stat.h> which seems to help
+ problems of stat64 conflicts on Solaris as suggested by Gilles.
+
+Sat Jan 12 16:19:55 2002 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htcommon/defaults.cc: A few changes to the wording and formatting
+ of the 'accept_language' attribute description.
+ * htdoc/attrs.html, cf_by{name,prog}.html: Reran cf_generate.pl.
+
+Fri Jan 11 21:18:00 CET 2002 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * htcommon/defaults.cc: added the 'accept_language' attribute
+
+Fri Jan 11 20:53:36 CET 2002 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * htnet/HtHTTP.[h,cc]: management of the accept-language directive added
+ * htcommon/URL.[h,cc]: const-ness in copy constructor and other cosmetic changes
+ * htlib/Server.[h,cc]: management of the 'accept_language' attribute as
+ a server block configuration directive.
+ * htlib/Document.cc: set of the attribute above for the HTTP layer
+
+Fri Jan 11 13:25:49 2002 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/ExternalTransport.cc (Request): Fixed to allocate access_time
+ object before setting it.
+
+Fri Jan 4 12:31:34 2002 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htnet/HtCookie.cc, htword/WordKeyInfo.cc, htword/WordMonitor.cc,
+ test/search.cc: changed all uses of strcasecmp to mystrcasecmp for
+ consistency and portability.
+
+Fri Jan 4 12:17:10 2002 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htnet/HtHTTP.cc (HTTPRequest): make the second comparison of the
+ transfer-encoding header the same as the first, i.e. case insensitive
+ and limited to 7 characters.
+
+Fri Jan 4 15:13:13 CET 2002 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * htnet/HtHTTP.cc: parse the transfer-encoding header case-insensitively
+ [fix htdig-Bugs-499388 by Matthias Emmert <Matthias.Emmert2 at start.de>]
+
+Sun Dec 30 15:47:35 CET 2001 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * HtHTTP.[h,cc]: management of the Content-Language directive for the response
+
+Sat Dec 29 13:07:08 CET 2001 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * htnet/HtCookie.[h,cc]: new fields (srcURL and isDomainValid) and
+ a more robust class with initialization list and copy constructor
+
+ * htnet/HtCookieJar.[h,cc]: method for calculating the minimum number
+ of periods that a domain specification of a cookie must have, as
+ required by the Netscape cookies specification.
+
+ * htnet/HtCookieMemJar.cc: Management of the domain field of the cookie
+
+Mon Dec 17 06:45:02 CET 2001 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * htdig/htdig.cc: fixed bug about cookie jar creation. It is done in
+ here, because there is only one jar for the whole process. However
+ it can be moved anywhere else. :-)
+
+Mon Dec 17 06:40:25 CET 2001 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * htnet/HtHTTP.cc: check for null pointer of cookie jar
+
+Sun Dec 16 19:55:07 CET 2001 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * htnet/Connection.[h,cc]: default constructor is changed and accepts
+ a socket value (by default is -1)
+ * htnet/HtCookieJar.[h,cc]: added a simple iterator
+ * htnet/HtCookieMemJar.[h,cc]: ditto
+ * htnet/HtFile: removed the management of modification_time (constructor)
+ * htnet/HtHTTP.[h,cc]: constructor with initialization list and without
+ a default constructor (the construction is now forced to pass a valid
+ connection object). Removed any memory deletion from the destructor.
+ The class is now abstract (see the pure virtual destructor).
+ * htnet/HtHTTPBasic.cc: creates a Connection object in the initialization
+ and the destructor has no responsibility
+ * htnet/HtHTTPSecure.cc: creates an SSLConnection object in the initialization
+ and the destructor has no responsibility
+ * htnet/HtNNTP.cc: creates a Connection object in the initialization
+ and the destructor has no responsibility
+ * htnet/Transport.[h,cc]: default constructor accepts a pointer to a
+ Connection object and the destructor carries out the deletion of it
+
+Thu Dec 6 13:24:30 2001 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * contrib/examples/rundig.sh: Fixed to make use of DBDIR variable,
+ and to test for and copy db.words.db.work_weakcmpr if it's there.
+
+Fri Oct 19 11:07:33 2001 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/Retriever.cc (IsValidURL): Fixed discrepancies in debug
+ levels for messages giving cause of rejection, inadvertently
+ changed when regex support added.
+
+Wed Oct 17 15:48:23 2001 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/ExternalTransport.h: Added missing class keyword on friend
+ declaration.
+
+Tue Oct 16 14:35:16 2001 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htcommon/defaults.cc (external_parsers): Documented external converter
+ chaining to same content-type, e.g. text/html->text/html-internal.
+ * htdoc/attrs.html, cf_by{name,prog}.html: Reran cf_generate.pl.
+
+Mon Oct 15 22:25:55 2001 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htdig/Document.cc, htdig/htdig.cc, htdig/Retriever.cc: Make sure
+ setEscaped is called with the current value of
+ case_sensitive. Fixes bug pointed out by Phil Glatz.
+
+Fri Oct 12 17:14:08 2001 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdoc/htdump.html, htdoc/htload.html: Fixed 3 little typos.
+
+Fri Oct 12 15:11:45 2001 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htnet/HtHTTP.cc (ParseHeader): Show header lines in debugging
+ output at verbosity level 3, not 4, for consistency with 3.1.x.
+
+ * htcommon/URL.cc (removeIndex): Fixed to make sure the matched
+ file name is at the end of the URL.
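
The removeIndex fix in this entry amounts to stripping an index file name only when it is the last path component. A minimal sketch, assuming a single index name is passed in; the function below is hypothetical and is not the URL::removeIndex implementation.

```cpp
#include <cassert>
#include <string>

// Strip a trailing index file name (e.g. "index.html") from a URL, but only
// when it is the final path component. Illustrative sketch only.
std::string remove_index(const std::string& url, const std::string& index_name)
{
    const std::string suffix = "/" + index_name;
    if (index_name.empty() || url.size() < suffix.size())
        return url;
    // The match must sit at the very end of the URL.
    if (url.compare(url.size() - suffix.size(), suffix.size(), suffix) != 0)
        return url;
    // Drop the file name but keep the trailing '/'.
    return url.substr(0, url.size() - index_name.size());
}
```

A URL such as "http://example.com/index.html/page" is left untouched because the match is not at the end.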
+
+Fri Oct 12 10:39:54 2001 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htlib/HtRegexList.cc (setEscaped): Fixed to set compiled flag to
+ FALSE when there's no pattern, so match() can detect this condition.
+ Fixes handling of empty lists in bad_querystr, exclude_urls, etc.
+
+ * htdig/Retriever.cc (IsValidURL): Fixed bad_querystr matching to
+ look at right part of URL, not whole URL.
+
+Mon Sep 24 11:47:15 2001 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htnet/HtHTTP.cc (SetRequestCommand): Put If-Modified-Since header
+ out in GMT, not local time, and only put it out if existing document
+ time > 0.
+
+ * htsearch/parser.cc (perform_phrase): Optimized phrase search handling
+ to use linear algorithm with Dictionary lookups instead of n**2 alg.,
+ as suggested by Toivo Pedaste.
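
The If-Modified-Since change in this entry can be illustrated with the sketch below. It formats the stored document time in GMT and emits nothing when the time is not positive. The function name is an assumption and the code is not taken from HtHTTP.cc.

```cpp
#include <cassert>
#include <ctime>
#include <string>

// Build an If-Modified-Since header line in GMT, or return an empty string
// when there is no usable document time. Hypothetical helper.
std::string if_modified_since_header(std::time_t doc_time)
{
    if (doc_time <= 0)
        return std::string();
    std::tm* gmt = std::gmtime(&doc_time);  // GMT, not local time
    if (!gmt)
        return std::string();
    char buf[64];
    // RFC 1123 date layout; day and month names are English in the "C" locale.
    if (std::strftime(buf, sizeof buf, "%a, %d %b %Y %H:%M:%S GMT", gmt) == 0)
        return std::string();
    return std::string("If-Modified-Since: ") + buf + "\r\n";
}
```

Note that std::gmtime returns a pointer to shared static storage, so the result should be consumed before any other call to gmtime or localtime.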
+
+Tue Sep 18 10:50:40 2001 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdoc/running.html: New documentation on how to run after configuring.
+ * htdoc/rundig.html: New manual page for rundig script.
+ * htdoc/install.html: Added link to running.html.
+ * htdoc/contents.html: Added link to running.html, rundig.html, related
+ projects. Updated links to contrib and developer site.
+
+Fri Sep 14 22:12:56 2001 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htcommon/URL.h: Moved DefaultPort() from private to public for
+ use in HtHTTP.cc.
+
+Fri Sep 14 09:25:20 2001 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htnet/HtHTTP.cc (SetRequestCommand): Add port to Host: header when
+ port is not default, as per RFC2616(14.23). Fixes bug #459969.
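
A minimal sketch of the Host: header rule from this entry: the port is appended only when it differs from the scheme's default. The function and its parameters are hypothetical; in ht://Dig the default would presumably come from URL::DefaultPort().

```cpp
#include <cassert>
#include <string>

// Build a Host header line, adding ":port" only for non-default ports.
// Illustrative helper, not the SetRequestCommand code.
std::string host_header(const std::string& host, int port, int default_port)
{
    std::string value = "Host: " + host;
    if (port != default_port)
        value += ":" + std::to_string(port);
    return value + "\r\n";
}
```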
+
+Sat Sep 8 22:15:33 2001 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * acconfig.h, include/htconfig.h.in: Add undef for
+ ALLOW_INSECURE_CGI_CONFIG, which if defined does about what you'd
+ expect. (This is for any wrapper authors who don't want to rewrite
+ but are willing to run insecure.)
+
+ * htsearch/htsearch.cc: Only allow the -c flag to work when
+ REQUEST_METHOD is undefined. Fixes PR#458013.
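
The -c restriction in this entry relies on the fact that a web server sets REQUEST_METHOD for CGI programs. A sketch of such a check, with a hypothetical function name:

```cpp
#include <cassert>
#include <cstdlib>

// True when the process does not appear to be running as a CGI program,
// i.e. REQUEST_METHOD is absent from the environment. Only in that case
// would a command-line config file option be honoured in this sketch.
bool config_flag_allowed()
{
    return std::getenv("REQUEST_METHOD") == nullptr;
}
```

A caller would ignore the option's argument whenever this returns false.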
+
+Tue Sep 4 18:58:31 2001 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htsearch/DocMatch.cc: Add scoring for Quim's new parser
+ framework. Only the normal word scoring is currently done, not
+ backlink_factor or other "Document" methods.
+
+Fri Aug 31 15:34:28 2001 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/HTML.h, htdig/HTML.cc (ctor, parse, do_tag): Fixed buggy
+ handling of nested tags that independently turn off indexing, so
+ </script> doesn't cancel <meta name=robots ...> tag. Add handling
+ of <noindex follow> tag. Added <> delim. to tag debugging output.
+ Fixed a few typos.
+
+Wed Aug 29 10:33:01 2001 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htcommon/defaults.cc (url_part_aliases): Added clarification
+ explaining how to use example.
+
+ * htdoc/attrs.html, cf_by{name,prog}.html: Reran cf_generate.pl.
+
+Mon Aug 27 15:05:09 2001 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * installdir/search.html: Add DTD tag for HTML 4 compliance.
+ * installdir/htdig.conf: Added .css to bad_extensions default,
+ added missing closing ">".
+ * htdoc/config.html: Updated with sample of latest htdig.conf and
+ installdir/*.html.
+
+Wed Jul 25 22:16:06 2001 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htcommon/defaults.cc: Put new htnotify_* entries in alphabetical
+ order. Removed superfluous quotes from htnotify_webmaster example
+ (htnotify.cc adds in the quotes).
+ * htdoc/attrs.html, cf_by{name,prog}.html: Reran cf_generate.pl.
+
+Tue Jul 24 16:07:01 2001 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htcommon/defaults.cc: Changed references in (no_)page_number_text
+ entries from maximum_pages to maximum_page_buttons.
+ * htdoc/attrs.html, cf_by{name,prog}.html: Reran cf_generate.pl.
+
+Tue Jul 24 14:38:22 2001 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdoc/hts_templates.html: Document Quim Sanmarti's URL decoding
+ feature for template variables.
+
+Thu Jul 12 14:12:02 2001 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htnet/HtFile.cc (Request): Fixed so it doesn't remove newlines
+ from documents, and so it only tries to open mime.types once even
+ if the open fails.
+
+Thu Jul 12 11:40:07 2001 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * contrib/conv_doc.pl, contrib/parse_doc.pl: Fixed EOF handling in
+ dehyphenation, fixed to handle %xx codes in title made from URL.
+
+ * contrib/doc2html/doc2html.pl, contrib/doc2html/pdf2html.pl,
+ contrib/doc2html/swf2html.pl: Fixed to handle %xx codes in URL title.
+
+Wed Jul 11 15:05:47 2001 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htsearch/Display.cc (readFile): Added missing fclose() call, and
+ debugging message for when file can't be opened.
+
+Wed Jul 11 14:26:28 2001 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htsearch/Display.cc (displayParsedFile): Added debugging message
+ when file can't be opened.
+
+ * htsearch/Display.cc (buildMatchList): Fixed while loop to avoid
+ warning.
+
+ * htsearch/htsearch.cc (main): Fixed handling of syntax error message
+ to use String class instead of strdup().
+
+ * htsearch/parser.cc (setError): Added debugging message when error
+ is set.
+
+ * htsearch/parser.cc (parse): Fixed not to clear error message after
+ it's set.
+
+Sat Jul 7 22:19:18 2001 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * */Makefile.in: Update using current production automake
+ (1.4-p4).
+
+ * htfuzzy/Regexp.[cc,h]: Change class name to Regexp to prevent
+ further namespace clashes.
+
+ * htfuzzy/Fuzzy.c: #include "Regexp.h" now and make sure we create
+ the right class when needed.
+
+ * htlib/mktime.c: Change included mktime declaration to mymktime
+ to avoid conflict on Mac OS X. (For some reason, autoconf's
+ AC_FUNC_MKTIME doesn't work for Mac OS X. So this is a hack in the
+ meantime.)
+
+ * htfuzzy/Makefile.am: Rename Regex files. Oops!
+
+Fri Jul 6 18:38:58 2001 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htfuzzy/Regexp.cc, htfuzzy/Regexp.h: Rename Regex class to
+ prevent problems on case-insensitive systems.
+
+ * htlib/HtRegexReplaceList.cc, htlib/String.cc, htdig/htdig.cc:
+ Change #include of <stream.h> to modern standard of iostream.h.
+
+ * htlib/Configuration.cc (Read): Make sure we never reference a
+ negative position when trimming off whitespace.
+
+ * config.guess, config.sub: Update with new versions from GNU to
+ recognize various flavors of Mac OS X/Rhapsody.
+
+ * htlib/strptime.cc: Make sure len is initialized.
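
The Configuration.cc (Read) item in this entry concerns trimming trailing whitespace without indexing before the start of the buffer. One way to write such a loop safely is sketched below. It is an illustration only and is not the code in htlib/Configuration.cc.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Remove trailing whitespace. The "end > 0" test comes first, so s[end - 1]
// is never evaluated with a negative (or wrapped-around) position, even for
// empty or all-whitespace input.
std::string trim_trailing_whitespace(const std::string& s)
{
    std::size_t end = s.size();
    while (end > 0 && std::isspace(static_cast<unsigned char>(s[end - 1])))
        --end;
    return s.substr(0, end);
}
```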
+
+Fri Jul 6 12:04:52 2001 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htlib/HtRegexList.cc (setEscaped): Fixed a potential problem
+ with list building. When we go back a step, we still have to
+ compile the new pattern in case it's the last one.
+
+Wed Jul 4 23:39:19 2001 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htcommon/URL.cc (parse, ServerAlias): Fixed two problems that
+ caused incorrect signatures to be generated.
+
+Wed Jul 4 13:52:54 2001 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * test/document.cc (dodoc), test/url.cc (dourl),
+ test/testnet.cc (Retrieve): Fixed up handling of config to match
+ David Graff's changes of May 16, and handling of HtHTTPBasic class
+ to match Joshua Gerth's changes of Mar 17.
+
+Tue Jul 3 16:20:56 2001 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/Retriever.cc (GetLocal): Fixed to use URL class on given
+ URL, so that default port numbers are stripped off. This was needed
+ to allow local fetching of robots.txt.
+
+ * htnet/Connection.cc (ctors, dtor, Assign_Server, Get_Peername),
+ htnet/Connection.h: Got rid of strdup stuff, used String class for
+ peer & server_name.
+
+ * htnet/Connection.cc (Get_PeerIP): Used unambiguous name for structure.
+
+ * htnet/HtHTTP.cc (ctor, dtor): Don't allocate a 2nd Connection, as
+ child classes already do this, and set pointer to null when connection
+ is deleted, so we don't try to delete it twice. This was messing up
+ the heap and causing segfaults. Call Transport::CloseConnection before
+ deleting connection.
+
+ * htnet/HtHTTPBasic.cc (dtor), htnet/HtHTTPSecure.cc (dtor),
+ htnet/HtNNTP.cc (dtor): Only delete connection if non-null, & set
+ to null after deleting. Call Transport::CloseConnection before
+ deleting connection.
+
+ * htnet/Transport.cc (CloseConnection): Don't exit if connection
+ pointer is null, as this may be normal when called from destructor.
+
+Fri Jun 29 11:14:36 2001 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htfuzzy/Endings.cc (getWords): Undid change introduced in 3.1.3,
+ in part. It now gets permutations of word whether or not it has
+ a root, but it also gets permutations of one or more roots that
+ the word has, based on a suggestion by Alexander Lebedev.
+ * htfuzzy/EndingsDB.cc (createRoot): Fixed to handle words that have
+ more than one root.
+ * installdir/english.0: Removed P flag from wit, like and high, so
+ they're not treated as roots of witness, likeness and highness, which
+ are already in the dictionary.
+
+Mon Jun 25 12:50:47 2001 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htsearch/htsearch.cc (main): Got rid of last remnants of 'urllist'
+ and used the 'l' StringList as was used in the code before, to make
+ restrict and exclude handling work properly.
+
+Mon Jun 25 15:52:19 CEST 2001 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * htsearch/htsearch.cc: defined 'urllist' in order to remove the
+ compilation error (as Jesse suggested).
+
+Fri Jun 22 16:28:13 2001 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htsearch/Display.cc (buildMatchList): Fix date_factor calculation
+ to avoid 32-bit int overflow after multiplication by 1000, and avoid
+ repetitive time(0) call, as contributed by Marc Pohl. Also move the
+ localtime() call up before gmtime() call, to avoid clobbering gmtime's
+ returned static structure (my thinko).
+
+ * htdig/htdig.cc (main): Use .work file for md5_db, if -a given,
+ as contributed by Marc Pohl.
+
+ * htcommon/URL.cc (constructURL): Ensure that the _host is set if we
+ are constructing non-file urls, as contributed by Marc Pohl.
+
+ * htdoc/THANKS.html: Credit Marc Pohl for patches.
+
+Tue Jun 19 17:14:05 2001 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * README: Bump up to 3.2.0b4, fix note about bug report submissions.
+
+Tue Jun 19 17:01:16 2001 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htsearch/Display.cc (setVariables): Fixed handling of
+ build_select_lists attribute, to deal with new restrict & exclude
+ attributes.
+
+Mon Jun 18 12:16:27 2001 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * configure.in, configure: Fix "hdig" typo in help.
+
+Fri Jun 15 17:57:19 2001 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htcommon/defaults.cc: Noted effect of locale setting on floating
+ point numbers in search_algorithm and locale descriptions.
+ * htdoc/attrs.html, cf_by{name,prog}.html: Reran cf_generate.pl.
+
+Fri Jun 15 15:36:51 2001 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdoc/cf_generate.pl: Fixed to handle new defaults.cc format
+ with trailing backslashes.
+
+ * htdoc/attrs.html, cf_by{name,prog}.html: Reran cf_generate.pl.
+
+Fri Jun 15 14:57:21 2001 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdb/htdb_dump.cc, htdb/htdb_load.cc, htdb/htdb_stat.cc: Added a
+ conditional include of <getopt.h> if HAVE_GETOPT_H is defined.
+
+Fri Jun 15 11:25:24 2001 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htsearch/htsearch.cc (main), htcommon/defaults.cc,
+ htdoc/hts_form.html: two new attributes, used by htsearch, have
+ been added: restrict and exclude. They give more control over
+ template customisation through configuration files, making it
+ possible to restrict or exclude URLs from a search without passing
+ any CGI variables (although a CGI specification, if given, overrides
+ the configuration one).
+
+Fri Jun 15 09:34:23 2001 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htsearch/htsearch.cc (main): Changed ridiculously outdated question
+ "Did you run htmerge?" to "Did you run htdig?".
+
+Fri Jun 8 11:07:04 2001 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htsearch/Display.cc: Add <float.h> header, now needed for RH 7.1.
+
+Thu Jun 7 12:05:09 2001 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * contrib/htdig-3.2.0.spec: Updated to 3.2.0b4.
+
+ * contrib/README: Mention acroconv.pl script.
+
+Thu Jun 7 10:46:19 2001 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htsearch/Display.cc (expandVariables): Use isalnum() instead of
+ isalpha() to allow digits in variable names, allow '-' in variable
+ names too for consistency with attribute name handling.
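
The effect of switching from isalpha() to isalnum() and accepting '-' can be sketched as below. The helper names are hypothetical, and accepting '_' is an assumption about the existing behaviour rather than something stated in this entry.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Characters accepted in a template variable name in this sketch:
// letters, digits, '-' and (by assumption) '_'.
bool is_name_char(char c)
{
    unsigned char uc = static_cast<unsigned char>(c);
    return std::isalnum(uc) || c == '_' || c == '-';
}

// Return the variable name starting at text[pos]; empty if none is found.
std::string read_variable_name(const std::string& text, std::size_t pos)
{
    if (pos >= text.size())
        return std::string();
    std::size_t end = pos;
    while (end < text.size() && is_name_char(text[end]))
        ++end;
    return text.substr(pos, end - pos);
}
```

With isalpha() alone, a name such as PAGE2 would be cut short at the digit.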
+
+Wed Jun 6 16:14:06 2001 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * httools/htpurge.cc (main): Added missing "u:" declaration in
+ getopt() call.
+
+Wed Jun 6 15:24:04 2001 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * contrib/doc2html/DETAILS, contrib/doc2html/README,
+ contrib/doc2html/doc2html.pl, contrib/doc2html/pdf2html.pl,
+ contrib/doc2html/swf2html.pl: Update to version 3.0 of doc2html,
+ contributed by David Adams <D.J.Adams at soton.ac.uk>.
+
+Wed May 16 11:23:04 2001 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ Added a pile of changes contributed by David Graff
+ <phlat at mindspring.com> fixing compilation problems with
+ non-gcc/g++ compilers (i.e. Sun's compiler).
+
+ * Makefile.config, db/Makefile.am: Added no-dependencies to
+ AUTOMAKE_OPTIONS for those not on GNU C/C++
+
+ * configure.in: Changed AM_PROG_YACC to AC_PROG_YACC as autoconf
+ and autoreconf both complain that AM_PROG_YACC is not in the
+ library.
+
+ * htcommon/DocumentDB.cc: Removed default parameters as they are
+ already declared in the header
+
+ * htcommon/HtConfiguration.cc: Changed some of the loop
+ declarations so that Sparc C 4.2 is happy. Removed default
+ parameters as they are already declared in the header. Moved inline
+ ParseString to header where it belongs. Added initialization for
+ HtConfiguration::_config static member variable. Added
+ implementation of HtConfiguration::config() static class member.
+
+ * htcommon/HtConfiguration.h: Added include for ParsedString.h.
+ Added declaration of static member function ::config().
+ Added private static member variable _config;.
+ Added inline ParseString from implementation.
+
+ * htcommon/HtURLCodec.cc, htcommon/HtURLRewriter.cc,
+ htcommon/HtZlibCodec.cc, htcommon/URL.cc, htcommon/conf_lexer.lxx,
+ htdig/Document.cc, htdig/ExternalParser.cc,
+ htdig/ExternalTransport.cc, htdig/HTML.cc, htdig/Parsable.cc,
+ htdig/Plaintext.cc, htdig/Retriever.cc:
+ Changed to use new global configuration semantics.
+
+ * htcommon/conf_parser.yxx: Added a return to yyerror to quiet
+ Sparc C 4.2. Should really return a value here. Is it normal to
+ return a YY_something or just -1, 0, ?
+
+ * htcommon/defaults.cc: Added line continuation characters at the
+ end of all the string lines that were not completed by a quote.
+
+ * htcommon/defaults.h, htdig/htdig.h: Removed extern
+ HtConfiguration config in favor of HtConfiguration::config().
+
+ * htdig/ExternalTransport.h: Changed return type of GetResponse to
+ match superclass.
+
+ * htdig/Server.cc, htdig/htdig.cc, htfuzzy/htfuzzy.cc, htnet/HtFile.cc,
+ htsearch/Display.cc, htsearch/QueryLexer.cc, htsearch/WordSearcher.cc,
+ htsearch/htsearch.cc, htsearch/parser.cc, htsearch/qtest.cc,
+ httools/htdump.cc, httools/htload.cc, httools/htmerge.cc,
+ httools/htnotify.cc, httools/htpurge.cc, httools/htstat.cc,
+ htlib/Configuration.cc, htlib/HtRegex.cc:
+ Changed constructor to use initializers
+
+ * htlib/HtDateTime.cc: Moved inlines to header
+
+ * htlib/HtDateTime.h: Added inlines from implementation
+
+ * htlib/HtHeap.cc, htlib/HtHeap.h, htlib/HtVector.cc, htlib/HtVector.h,
+ htlib/HtVectorGeneric.h, htlib/HtVectorGenericCode.h:
+ Changed Copy member to return same type as superclass
+
+ * htlib/HtRegexReplace.cc, htlib/HtRegexReplaceList.cc: Removed
+ default parameters as they are declared already in the header
+
+ * htlib/myqsort.h: Changed comment in header to use C-style
+ comments as it's compiled using a C compiler.
+
+ * htlib/regex.h: Changed #if __STDC__ to #if defined(__STDC__)
+
+ * htword/WordKey.h: Corrected const'ness
+
+Wed May 9 07:50:19 CEST 2001 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * htnet/HtCookieJar.h: ShowSummary makes the class abstract
+
+Sat May 5 20:51:00 2001 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htdoc/cf_blocks.html: Add colon in example and description of
+ blocks to match code for the moment. The parser can be changed
+ later if we like.
+
+Sat May 5 20:38:44 2001 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htlib/ParsedString.cc (get): Use isalnum() instead of isalpha()
+ for looking up--allows names that contain digits too.
+
+Sat May 5 20:36:29 2001 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htlib/htString.h (class String): Remove now-obsolete and
+ confusing int() casting operator. This was previously used to make
+ a string of a certain length. Use String(int) as a ctor instead.
+
+Sat May 5 20:30:18 2001 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htword/WordContext.[h,cc]: Change Initialize to supply a config
+ that can be modified (i.e. if we don't have ZLIB_H).
+
+Sat May 5 23:30:55 CEST 2001 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * htnet/HtCookieJar.h: ShowSummary, printing cookies (to be derived)
+ * htnet/HtCookieMemJar.[h,cc]: ShowSummary, printing cookies
+
+Thu May 3 23:14:14 CEST 2001 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * htnet/HtHTTP.[h,cc]: connection object is now created and destroyed.
+ NULL pointers converted to C++ standard (0).
+ * htnet/Transport.[h,cc]: NULL pointers converted to C++ standard (0).
+ * htnet/Connection.[h,cc]: ditto
+
+Thu May 3 23:09:33 CEST 2001 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * htlib/HtDateTime.[h,cc]: Timestamp format added (used by ht://Check
+ for MySQL interfacing) - keeping them equal helps me maintain
+ both of them!
+
+Thu May 3 10:28:56 2001 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htsearch/parser.cc (perform_and): Add missing return statement,
+ as suggested by Quim Sanmarti.
+
+Fri Mar 30 15:50:42 2001 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htsearch/ResultMatch.h, htsearch/ResultMatch.cc (setTitle): Changed
+ argument type to char * to fix problem with sort by title not working,
+ as reported by Adam Lewenberg.
+
+Fri Mar 30 14:08:51 2001 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/Document.h, htdig/Retriever.cc (parse_url): Define and use
+ Document::StoredLength() method to get actual length of data
+ retrieved and given to md5(), which may be less than original
+ length. Fixes bug reported by Michael Haggerty.
+
+Wed Mar 21 22:22:55 2001 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htsearch/Display.cc (generateStars): Add NSTARS variable for
+ template output as suggested by Caleb Crome
+ <ccrome at users.sourceforge.net> (except here precision is 0). Fixes
+ feature request #405787.
+
+ * htdoc/hts_templates.html: Add description of NSTARS variable
+ above.
+
+ * htlib/HtRegex.cc (set): Make sure we free memory if we've
+ already compiled a pattern.
+
+ * htdig/Retriever.cc (got_href): Fix bug pointed out by Gilles
+ with hopcounts and don't bother to update the DocURL unless we
+ have a new doc.
+
+Mon Mar 19 18:00:18 2001 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htcommon/URL.cc (URL): Make sure even absolute relative URLs are
+ run through normalizePath() as pointed out by Gilles. Allows
+ backout of previous fix of #408586, which does extra re-parsing of
+ URL.
+
+ * htdig/Retriever.cc (Need2Get): Back out change of Mar. 17 for above.
+
+ * htcommon/conf_lexer.[cxx, lxx]: Apply change suggested by Jesse
+ to remove empty statements.
+
+Mon Mar 19 11:33:25 2001 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htlib/HtRegexList.cc (setEscaped): Fix assorted bugs, including
+ obvious segfault, incorrect creation of limits, and failure to set
+ "compiled" flag before return().
+
+ * htdig/Retriever.cc (IsValidURL): Make sure the tmpList is
+ cleared before attempting to parse the bad_querystr
+ config--otherwise we'll just Add to the end of the list.
+
+Sun Mar 18 14:01:56 CET 2001 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * htnet/Transport.[h,cc], htnet/HtHTTP.cc: In order to modularize
+ the net code the default parser string for the content-type has
+ been added to the Transport class.
+ * htdig/Document.cc: modified for the changes above.
+
+Sat Mar 17 16:38:27 2001 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * configure.in, configure, include/htconfig.h.in: Add tests for
+ libssl, libcrypto, and ssl.h.
+
+ * htnet/SSLConnection.[cc,h], htnet/HtHTTPBasic.[cc,h],
+ htnet/HtHTTPSecure.[cc,h]: New files. Contributed by Joshua Gerth
+ <jgerth at hmsoaps.com>.
+
+ * htnet/Transport.[cc,h], htnet/HtNNTP.cc, htnet/HtHTTP.cc,
+ htnet/Connection.h: Changes needed to support SSLConnection class.
+
+ * htdig/Document.cc, htdig/Document.h: Ditto.
+
+ * htnet/Makefile.am, htnet/Makefile.in: Add above for compilation.
+
+ * htdoc/THANKS.html: Updated with new contributors.
+
+Sat Mar 17 15:28:20 2001 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htword/WordContext.cc (Initialize): If HAVE_LIBZ or HAVE_ZLIB_H
+ are not defined, make sure wordlist_compress is set to false. This
+ semi-hack will not be necessary with new mifluz code which does
+ not necessarily need zlib. Fixes bug #405761.
+
+Sat Mar 17 14:39:17 2001 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htdig/HTML.cc (do_tag): Fixed problems with META descriptions
+ containing newlines, returns or tabs. They are now replaced with
+ spaces. Fixes bug #405771.
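
The META description cleanup in this entry can be sketched as a simple character replacement. The function name is illustrative and is not the code in HTML::do_tag.

```cpp
#include <cassert>
#include <string>

// Replace newlines, carriage returns and tabs with single spaces so that a
// description occupies one logical line. Each character is replaced
// individually; runs of whitespace are not collapsed in this sketch.
std::string flatten_whitespace(std::string text)
{
    for (char& c : text)
        if (c == '\n' || c == '\r' || c == '\t')
            c = ' ';
    return text;
}
```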
+
+Sat Mar 17 14:26:55 2001 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htdig/HTML.cc (do_tag): Improve handling of whitespace in META
+ refresh handling. Fixes bug #406244.
+
+ * htlib/HtRegexList.cc (setEscaped): Make this more efficient by
+ building up larger and larger patterns--when we fail, go back a
+ step and add the pattern in the next loop. This ensures we have a
+ list of the maximum allowable length regexp.
+
+ * htdig/Retriever.cc (Need2Get): Add change suggested by Yariv Tal
+ to run URLs through the URL parser for cleanup before comparing to
+ the visited list. Fixes bug #408586.
+
+Mon Mar 12 13:28:56 2001 Michael Haggerty <mhagger at alum.mit.edu>
+
+ * htdig/Retriever.cc, htdig/Retriever.h:
+ Fixed two off-by-one errors related to Retriever::factor table.
+
+Mon Mar 12 11:25:31 2001 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htlib/Dictionary.cc (Add): Fix comments about add method--it
+ will replace existing keys. Fixes report #407940.
+
+Thu Mar 8 15:31:45 2001 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * htnet/HtHTTP.cc: removed a useless <else>
+
+Tue Mar 6 11:42:10 2001 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htlib/regex.[c,h]: Update with versions from glibc 2.2.2.
+
+Mon Mar 5 13:47:30 2001 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * ltconfig (host_os): Add test to solve problems building C++
+ shared libraries on some platforms. Currently should only make
+ --enable-shared the default on Linux and *BSD* unless specified
+ explicitly by the user.
+
+Mon Mar 5 12:52:57 2001 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htlib/String.cc (operator =): Add fix contributed by Yariv Tal
+ <YarivT at webmap.com>, fixed bug #406075.
+
+Mon Mar 5 12:06:26 2001 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htlib/HtRegexList.cc (match): Ignore rearrangement code for the
+ moment--may or may not be the culprit for bug #405277, but is a
+ start to debugging the problem.
+
+ * htlib/List.[cc,h]: Remove *prev pointer from listnode
+ structure and add a *prev pointer to the cursor structure. Saves
+ one pointer per item in the list, plus overhead.
+
+Mon Mar 5 11:56:16 2001 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htcommon/defaults.cc (bad_extensions): Add .css to ignore CSS docs.
+
+ * htdig/Document.cc (getParsable): Ignore CSS documents -- they
+ aren't very useful to parse. Solves bug report #405772.
+
+Sun Mar 04 11:32:43 2001 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * htnet/HtHTTP.cc: fixed a bug regarding <no header> with persistent
+ connections enabled, but head call before the get one disabled.
+ Sourceforge.net's bug reference: 405275 - fixed.
+
+Sat Mar 3 21:09:55 2001 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * .version: Bump to 3.2.0b4 so snapshots have right versioning.
+
+Thu Mar 1 16:51:09 2001 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * configure.in: Added test for alloca.h, which is needed for the
+ regex.c code.
+
+Wed Feb 28 12:54:43 CEST 2001 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * htcommon/defaults.cc: 'disable_cookies' option has been added, with
+ a 'server' scope. By default it is set to 'false'.
+ * htdig/Server.h, cc: management of the option above has been enhanced.
+ * htnet/HtHTTP.h, cc: now an HTTP connection can disable/enable cookies
+ through the configuration attribute 'disable_cookies'.
+ * htdig/Document.cc: management of cookies enabling/disabling is here.
+ * Cookies classes: now support the expiration time. Need only the
+ subdomain treatment.
+
+Mon Feb 26 16:37:30 2001 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htcommon/conf_lexer.lxx: Don't directly call exit(1) on an error
+ condition! Seems a harsh problem for an unknown character.
+
+ * htcommon/conf_parser.yxx: Ditto. (Running out of memory is a
+ much more fatal condition, of course.)
+
+ * htcommon/conf_lexer.cxx: Regenerate using flex 2.5.4.
+
+ * htcommon/conf_parser.cxx: Regenerate using bison 1.28.
+
+Sun Feb 25 19:46:01 CEST 2001 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * htnet/HtHTTP.h, cc: support for cookies enabled
+ * htnet/Makefile.am: files for cookies have been added to make.
+
+Sun Feb 25 19:27:18 CEST 2001 Gabriele Bartolini <angusgb at users.sourceforge.net>
+
+ * htnet/HtCookie.h,cc: class HTTP cookie
+ * htnet/HtCookieJar.h,cc: abstract class for managing the
+ 'jar' of cookies. In this way, we can use different methods
+ for the storage of them.
+ * htnet/HtCookieMemJar.h,cc: class for managing the 'jar' of
+ cookies in memory, without persistent storage (no db or file).
+ * Many thanks to Robert LaFerla for his coding on this! Yeah,
+ really really thanks Robert! <robertlaferla at mediaone.net>
+
+
+Thu Feb 22 16:43:18 2001 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htdoc/ChangeLog, htdig/RELEASE.html, README: Update to roll the
+ release of 3.2.0b3.
+
+Thu Feb 22 16:22:05 2001 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htsearch/htsearch.cc (main), htsearch/Display.cc (setVariables,
+ createURL, buildMatchList), htdoc/hts_form.html,
+ htdoc/hts_templates.html: Add Mike Grommet's date range search
+ feature.
+
+Mon Feb 19 18:24:42 2001 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htfuzzy/Synonym.cc (createDB): Create database in a temporary
+ directory before we move it into place, much like the endings
+ code. This should prevent problems when we just append to the DB
+ instead of making a new one.
+
+ * htdig/htdig.cc (main): Fix bug discovered by Gilles--htword
+ should be initialized *after* we are finished modifying config
+ attributes based on flags and unlink with -i.
+
+ * installdir/rundig: Fix bug with calling htpurge with -s option.
+
+Thu Feb 15 11:03:42 2001 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htdoc/*.html: Update with 2001 copyrights and various changes
+ with the website move for the pending 3.2.0b3 release.
+
+Thu Feb 15 10:41:47 2001 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htlib/HtRegexList.cc (match): Fix thinko with logic for matching
+ and add code to rearrange matching nodes for hopefully better
+ performance.
+
+Sun Feb 11 16:42:11 2001 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htlib/HtRegexList.h, htlib/HtRegexList.cc (class HtRegexList):
+ Simple List(HtRegex) object with similar calling conventions to
+ HtRegex class. This version is not as sophisticated as it could
+ be, but it's not likely to drop objects when reorganizing.
+
+ * htlib/Makefile.[in,am]: Add HtRegexList files to list for
+ compilation.
+
+ * htdig/htdig.h, htdig/htdig.cc, htdig/Retriever.cc: Use
+ HtRegexList instead of HtRegex for setting escaped values--should
+ never fail (since each String item is short).
+
+ * htlib/HtDateTime.cc: Put back timezone specs into the output
+ formats so we give everything even if we ignore it when reading
+ input.
+
+Mon Feb 5 11:47:07 2001 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htlib/HtDateTime.cc: Remove the timezone specs in the date
+ formats--these are not required in the RFCs because many dates are
+ in GMT anyway.
+
+Wed Jan 17 08:48:30 2001 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/ExternalTransport.cc (Request): Oops, fixed a holdover from
+ code borrowed from ExternalParser.cc's fork handling.
+
+Mon Jan 15 23:09:37 2001 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htnet/Connection.cc: Back out previous change--this should not
+ in any way be needed since the configure script should set
+ FD_SET_T.
+
+ * configure.in, configure: Add more lenient prototyping for
+ select() test--now allows "const struct timeval" for compilation
+ on BSDI.
+
+ * htdoc/RELEASE.html: Update with Gilles's changes.
+
+ * htdoc/cf_blocks.html: New file describing <server ...></server>
+ and <url ...></url> blocks.
+
+ * htdoc/cf_general.html, htdoc/confmenu.html: Refer to the above.
+
+Mon Jan 15 17:46:07 2001 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htsearch/TemplateList.cc (createFromString), htcommon/defaults.cc:
+ Treat template_map as a _quoted_ string list.
+
+ * htdoc/attrs.html, cf_by{name,prog}.html: Reran cf_generate.pl.
+
+Mon Jan 15 17:40:45 2001 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdoc/hts_templates.html: Add METADESCRIPTION variable.
+
+ * htsearch/Display.cc (displayMatch): Add METADESCRIPTION variable.
+
+ * htdig/ExternalParser.cc (parse): Fix up handling of arguments.
+
+ * htdig/ExternalTransport.cc (Request): Fix up handling of fork/exec
+ and command arguments, add wait() call.
+
+Wed Jan 10 19:23:36 2001 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * installdir/rundig: Fix -a handling to move db.words.db.work_weakcmpr
+ into place if it exists
+
+Sat Jan 6 21:50:58 2001 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * configure.in: Add checks for <sys/wait.h> and <wait.h> for
+ ExternalParser.
+
+ * include/htconfig.h.in: Regenerate using autoheader.
+
+ * configure: Regenerate using configure.
+
+ * htnet/Connection.cc: Add definition for FD_SET_T to fix problems
+ compiling on BSDI mentioned by Joe.
+
+ * htdig/ExternalParser.cc: Use <sys/wait.h> or <wait.h> as
+ appropriate. Should fix problems with compilation mentioned by
+ Jesse on HP/UX.
+
+ * README, htdoc/RELEASE.html: Adjust dates for the new year.
+
+ * htdoc/upgrade.html: A few "remaining features" have been implemented.
+
+Sun Dec 06 19:46:15 CEST 2000 Gabriele Bartolini <g.bartol at comune.prato.it>
+
+ * htnet/HtHTTP.cc: Fixed bug for Read_Line function call in
+ ReadChunkedBody method. Many thanks to Robert LaFerla. ;-)
+
+Tue Dec 12 13:24:49 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/ExternalParser.cc (parse): Fixed to properly handle binary
+ output from an external converter. Fixed some compilation errors.
+
+Tue Dec 12 12:52:14 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/ExternalParser.cc (parse): Handle parser command string
+ as a string list again to allow arguments, build up argv and
+ use execv instead of execl.
+
+Tue Dec 12 12:25:04 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/ExternalParser.cc (parse): Add call to wait for child process,
+ to avoid zombie buildup.
+
+Mon Dec 11 23:57:43 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/ExternalParser.cc (parse): Fix up handling of fds in child
+ process, more fault-tolerant handling of pipe or fork errors.
+
+Mon Dec 11 23:30:55 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/ExternalParser.cc (parse): Fix up handling of creation
+ of temporary file, check for proper return code, give error if
+ appropriate.
+
+Mon Dec 11 23:19:28 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/ExternalParser.cc (parse): Lowercase content-types and
+ strip off any trailing semicolons, at one last spot. This reinserts
+ code added Sep 11, which was dropped Oct 9, probably inadvertently
+ during mifluz back-out.
+
+Sun Dec 10 15:28:44 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htdig/ExternalTransport.cc: Use fork/exec instead of calling
+ popen, which bypasses any shell escape problems.
+
+ * htdig/ExternalParser.cc: Ditto, plus use of mkstemp where
+ available to pick the filename.
+
+ * configure, configure.in: Check for mkstemp where available.
+
+ * include/htconfig.h.in: Define it as above.
+
+ * htlib/Makefile.am: Omit regex.c from SOURCES--this is included
+ when necessary by the configure script. Otherwise this produces
+ duplicate declarations, etc.
+
+ * htlib/Makefile.in: Regenerate using automake --foreign.
+
+ * htcommon/URL.cc: Fix bug with ports of 0 showing up in URLs like
+ mailto: or other less-common protocols.
+
+Fri Dec 1 14:45:33 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * contrib/htdig-3.2.0.spec: Updated to 3.2.0b3.
+
+Fri Dec 1 13:59:09 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htlib/Makefile.am: Fix pkginclude_HEADERS to list missing headers
+ ber.h, libdefs.h, myqsort.h, mhash_md5.h, omit unneeded langinfo.h;
+ fix libht_la_SOURCES to list missing sources regex.c, myqsort.c.
+
+ * htlib/Makefile.in: Regenerate using automake --foreign
+
+ * htlib/langinfo.h, htlib/nl_types.h: Removed as they're now unused.
+
+Fri Dec 1 13:22:47 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htlib/strptime.cc (mystrptime): make ptr const and use cast on
+ return value to avoid warnings.
+
+ * htlib/Makefile.am: Fix pkginclude_HEADERS to list HtRegexReplace*.h
+ rather than .cc.
+
+ * htlib/Makefile.in: Regenerate using automake --foreign
+
+Fri Dec 1 11:58:21 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * Makefile.in, [hit]*/Makefile.in: Regenerate using automake --foreign
+ after fixing bug with cp -pr in automake.
+
+Thu Nov 30 14:41:58 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdoc/Makefile.am: Removed howitworks.html from EXTRA_DIST.
+
+ * Makefile.in (distdir): Added missing variable name 'd' to cp -pr.
+
+Thu Nov 30 14:01:48 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htlib/strptime.cc, htlib/lib.h: make first 2 args to strptime
+ const to avoid warnings, use cast in asizeof to avoid warnings.
+
+ * htsearch/qtest.cc: Change include from iostream to iostream.h
+
+ * htsearch/DocMatch.cc: Change include from iostream to iostream.h
+
+ * htsearch/Display.cc (createURL, buildMatchList, excerpt, hilight):
+ Clean up code to get rid of warnings, especially resulting from
+ NULLs in ternary operators.
+
+Thu Nov 30 10:55:09 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htlib/String_fmt.cc (form, vform): Use vsnprintf rather than
+ vsprintf, for buffer overflow prevention if vsnprintf available.
+
+ * htdig/Retriever.cc: Remove unused strptime declaration.
+
+ * htlib/HtDateTime.cc: Use mystrptime if HAVE_STRPTIME not set.
+
+Wed Nov 29 23:31:10 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htdb/htdb_stat.cc, htdb_load.cc, htdb_dump.cc: Make sure we
+ include htconfig.h to include proper declarations.
+
+ * htlib/strptime.cc: Change to strptime.cc from the htdig-3.1 series,
+ hopefully more portable, until I can find a more suitable
+ replacement.
+
+ * htlib/Makefile.am, htlib/Makefile.in: As above.
+
+ * htlib/clib.h, htlib/lib.h: Ditto.
+
+ * htdoc/all.html: Add a first draft of program summaries.
+
+Wed Nov 29 18:00:15 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/Retriever.cc (parse_url): Remove undeclared "dup" variable,
+ add missing calls to words.Skip().
+
+Wed Nov 29 17:44:56 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/htdig.html: Add description of -v output.
+
+Mon Nov 27 12:03:34 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htlib/md5.cc: Added missing include of time.h
+
+Fri Nov 24 00:56:01 2000 Toivo Pedaste <toivo at ucs.uwa.edu.au>
+
+ * htsearch/Display.cc: Some extra debugging for scoring
+
+Sun Nov 19 00:56:01 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htnet/HtFile.cc (Request): Use opendir/readdir instead of
+ scandir for generating directory listings on-the-fly.
+
+ * htdoc/RELEASE.html: Write up release notes for 3.2.0b3.
+
+ * htdoc/THANKS.html: Update list of contributors for 3.2.0b3 as
+ current.
+
+Fri Nov 17 14:52:37 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * contrib/acroconv.pl: Added external converter script to convert
+ PDFs with acroread.
+
+Mon Nov 6 12:13:13 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/Retriever.cc (GetLocal, GetLocalUser): move String definition
+ out of while statement for AIX xlC compiler.
+
+Mon Oct 30 21:50:02 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htdig/Server.h, htdig/Server.cc (push): Add newDoc parameter that
+ will allow redirects (old docs) to be followed and not count
+ against the maxDoc restrictions.
+
+ * htdig/Retriever.cc (got_redirect): Use new parameter so we don't
+ count against a server's max documents since it's a redirect.
+
+ * htlib/nl_types.h: Add for systems missing this header file.
+
+Sun Oct 29 21:36:51 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htcommon/defaults.cc: Updated per-server and per-URL fields to
+ match code. I still have a "wish list" of additional attributes
+ that should work this way eventually.
+
+ * htdoc/attrs.html, cf_by{name,prog}.html: Reran cf_generate.pl.
+
+Sun Oct 22 17:13:08 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htcommon/HtWordList.h: Add missing include for stdlib.h needed for
+ abort().
+
+ * htsearch/BooleanQueryParser.cc (ParseAnd): Fix problems with RH7
+ compiler -- shouldn't use "not" as a variable name!
+
+Thu Oct 19 22:19:16 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * ltmain.sh, ltconfig: Update with versions from libtool
+ 1.3.5, which may fix some problems building libraries.
+
+Mon Oct 9 21:59:11 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * */* [many, many files]: Backed out mifluz merge by going back on
+ modified files to 091000 snapshot.
+
+ * configure: Regenerated from configure.in.
+
+ * */Makefile.in: Regenerated using automake.
+
+Fri Oct 6 11:03:14 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/HTML.cc (do_tag): Parse <object> tags properly, looking
+ for data= attribute rather than src=.
+
+ * htcommon/defaults.cc (server_aliases): Additional clarification
+ to server_aliases description of port numbers.
+
+Wed Oct 4 12:12:31 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htcommon/defaults.cc (limit_normalized, server_aliases,
+ server_max_docs, server_wait_time): Added clarification
+ to server_aliases description. Changed word "directive" to
+ "attribute" where appropriate. Added cross-link to server_aliases
+ from limit_normalized.
+
+ * htdoc/attrs.html, cf_by{name,prog}.html: Reran cf_generate.pl.
+
+Wed Sep 27 00:05:41 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htdb/mifluz[dict, dump, load].cc, htdb/util_sig.h,
+ htdb/util_sig.cc: New files from mifluz merge. (Whoops, missed a
+ directory).
+
+ * htdb/*.cc: Change config.h references to htconfig.h.
+
+ * htlib/myqsort.c: Ditto.
+
+ * htcommon/HtWordReference.h, htcommon/HtWordReference.cc: Ensure
+ we keep the WordContext object around--unfortunately this also
+ requires that callers initialize us with a WordContext (e.g. from
+ the HtWordList class).
+
+ * htlib/StringMatch.h, htlib/StringMatch.cc: Changes to use
+ WordType directly instead of HtWordType.
+
+ * htfuzzy/*: Ditto. Additionally make sure HtWordReference objects
+ are instantiated properly.
+
+ * htcommon/DocumentRef.cc, htcommon/HtWordList.cc: As above.
+
+ * htdig/*: As above.
+
+ * htsearch/*: As above.
+
+ * httools/*: Don't bother initializing WordContext--this is done
+ in the HtWordList class now.
+
+ * htdig/htdig.cc: Ditto.
+
+ * htsearch/htsearch.cc, htsearch/qtest.cc: Ditto.
+
+ * htfuzzy/htfuzzy.cc: Ditto.
+
+ * db/Makefile.am, db/Makefile.in: Update to build libhtdb instead
+ of libdb to prevent conflicts.
+
+Sun Sep 24 22:50:22 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htword/HtWordList.h, htword/HtWordList.cc: Keep a WordContext
+ object private that is associated with this word database and
+ provide accessor.
+
+ * htword/WordType.h, htword/WordType.cc: Add WordToken function,
+ migrated from HtWordType class.
+
+ * htcommon/HtWordType.cc: WordType class no longer has Instance()
+ method, so just pass along the calls.
+
+ * htlib/DB2_db.cc (db_init): Remove unnecessary NULL parameter.
+
+ * htlib/Makefile.am, htlib/Makefile.in: Remove HtVectorGeneric and
+ derived files as well as HtWordType as these are deprecated.
+
+Wed Sep 20 22:47:01 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * aclocal.m4: Add in missing autoconf macros that somehow didn't
+ make the merge before. (No idea why I didn't catch this earlier.)
+
+ * acinclude.m4: Use newer CHECK_ZLIB macro.
+
+ * */Makefile.in: Updated with automake for new build changes.
+
+ * configure, include/htconfig.h.in: Updated using autoconf.
+
+ * test/dbbench.cc, test/word.cc, test/search.cc: Fix #include to
+ point to htconfig.h not nonexistent config.h.
+
+ * htlib/Configuration.h: Fix copy ctor, removing code in header file.
+
+ * htword/*.cc: Ditto.
+
+ * htword/Makefile.am: Update from mifluz version.
+
+ * htlib/myqsort.h, htlib/myqsort.c: Additional system library
+ replacement code.
+
+Sat Sep 16 20:14:32 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * configure.in, configure, acinclude.m4, aclocal.m4, acconfig.h,
+ include/htconfig.h.in: Merged with mifluz versions. Main
+ difference is that top-level configure script now also configures
+ db/ directory as well.
+
+ * Makefile.am, */Makefile.in: Updated with automake for new build
+ environment (with db/ run through top-level configure).
+
+ * db/*.c: Updated to use htconfig.h instead of config.h.
+
+Wed Sep 13 22:05:33 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * Merged in mifluz-0.19 branch. Everything will break
+ temporarily. Loic and I will clean up tomorrow.
+
+ * htdoc/RELEASE.html, htdoc/THANKS.html, htdoc/TODO.html: Get a
+ start on updating these files for the next release.
+
+ * htdoc/cf_generate.pl: Revert change of Sep. 9 to ignore links to
+ all.html in cf_byprog.html file.
+
+ * htdoc/all.html: New file, moved from howitworks.html and not
+ updated yet.
+
+ * htdoc/contents.html: Change link from howitworks.html to all.html
+
+Tue Sep 12 17:00:00 CEST 2000 Quim Sanmarti <qss at gtd.es>
+
+ * htsearch: added AndQuery.cc BooleanLexer.cc BooleanQueryParser.cc
+ ExactWordQuery.cc GParser.cc NearQuery.cc NotQuery.cc
+ OperatorQuery.cc OrFuzzyExpander.cc OrQuery.cc
+ PhraseQuery.cc Query.cc QueryLexer.cc QueryParser.cc
+ SimpleQueryParser.cc VolatileCache.cc WordSearcher.cc
+ qtest.cc WordSearcher.h AndQuery.h AndQueryParser.h
+ BooleanLexer.h BooleanQueryParser.h ExactWordQuery.h
+ FuzzyExpander.h GParser.h NearQuery.h NotQuery.h
+ OperatorQuery.h OrFuzzyExpander.h OrQuery.h OrQueryParser.h
+ PhraseQuery.h Query.h QueryCache.h QueryLexer.h
+ QueryParser.h SimpleLexer.h SimpleQueryParser.h VolatileCache.h.
+ This is the new query parsing/evaluation framework.
+
+ * Modified DocMatch.{cc,h} and ResultList.{cc,h} for compatibility.
+
+ * Removed the previous {And,Or,Exact,}ParseTree.{cc,h} files.
+
+ * Modified Makefile.{am,in} consequently.
+
+Mon Sep 11 11:56:44 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/ExternalParser.cc (parse): Lowercase content-types and
+ strip off any trailing semicolons, at one last spot which Geoff missed.
+
+Sat Sep 9 21:28:29 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htdig/Document.cc (getParsable): Fix a bug with earlier
+ change--if no parser is found and the MIME type is not text/* then
+ return a NULL parser.
+
+ * htdig/Retriever.cc (RetrievedDocument): If a NULL parser is
+ returned, mark the document as noindex and move on.
+
+ * configure.in, configure (enable-tests): Fix bug that would run
+ the 'yes' program inside the configure script if --enable-tests
+ was set.
+
+Sat Sep 9 17:50:11 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htcommon/defaults.cc: Add "all" program listing for common
+ attributes--seems more logical esp. now with many httool programs.
+
+ * htdoc/cf_generate.pl (cf_byprog): Do not output a link when
+ 'prog' is 'all.'
+
+ * htdoc/attrs.html, cf_by{name,prog}.html: Reran cf_generate.pl.
+
+Sat Sep 9 11:44:47 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * aclocal.m4 (AM_CHECK_YACC): New macro to check for bison/yacc
+ and use "missing yacc" if not found.
+
+ * configure.in (enable_tests): Fix buglet where --enable-tests=no
+ or --disable-tests would not work and set the default to enabled
+ tests. Since the tests do not build unless the user does a "make
+ check" this should not be confusing and should help debugging.
+ Also use AM_CHECK_YACC instead of AC_CHECK_YACC.
+
+ * configure: Regenerate using autoconf.
+
+Sat Sep 9 11:01:03 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htdig/ExternalParser.cc (canParse): Lowercase content-types and
+ strip off any trailing semicolons. Should prevent problems with
+ combined content-type; charset values.
+ (ctor): As above.
+
+ * htdig/Document.cc (getParsable): Only assume plain text if MIME
+ code starts with text/. Should prevent problems with retrieving
+ things like image/png or application/postscript as text.
+
+Fri Sep 8 22:59:10 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htcommon/defaults.cc: Add new attributes htnotify_replyto,
+ htnotify_webmaster, htnotify_prefix_file, htnotify_suffix_file.
+
+ * htdoc/attrs.html, cf_by{name,prog}.html: Reran cf_generate.pl.
+
+ * httools/htnotify.cc: Added in code from Richard Beton
+ <richard.beton at roke.co.uk> to collect multiple URLs per e-mail
+ address and allow customization of notification messages by
+ reading in header/footer text as designated by the new attributes
+ above.
+
+Fri Sep 8 15:15:00 2000 Quim Sanmarti <qss at gtd.es>
+
+ * htsearch/Display.cc: Fixed tiny date_format bug;
+ added url-decoding template variable expansion.
+
+Thu Sep 7 23:45:25 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htdig/Retriever.cc (Retriever): Only open up md5 database if
+ check_unique_md5 attribute is set.
+
+Thu Sep 7 22:56:19 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htcommon/URL.cc (DefaultPort): Add file default port of 0.
+
+ * htnet/HtFile.cc (Request): Handle directory listings by using
+ scandir and generating minimal HTML file with appropriate noindex listing.
+
+Wed Sep 06 10:00:50 CEST 2000 Gabriele Bartolini <g.bartol at comune.prato.it>
+
+ * htlib/URL.h, htlib/URL.cc: Restored corrected versions of URL.*
+ * htnet/HtNNTP.h: Removed the error in the NNTP class declaration
+
+Mon Sep 04 13:43:40 CEST 2000 Gabriele Bartolini <g.bartol at comune.prato.it>
+
+ * htnet/HtHTTP.cc: Restored previous version of HtHTTP. I removed
+ an initialization in the constructor (_modification_time). Sorry.
+
+Sun Sep 3 16:51:24 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htdig/Retriever.cc, htdig/Server.cc: Fix compiler warnings about
+ String conversions.
+
+ * configure, configure.in, db/configure, db/configure.in,
+ db/acinclude.m4, db/aclocal.m4: Ensure --enable-bigfile is handled
+ correctly by the configure scripts as pointed out by Jesse.
+
+Fri Sep 01 23:28:43 CEST 2000 Gabriele Bartolini <g.bartol at comune.prato.it>
+
+ * URL.cc: added DefaultPort() method and changed NNTP default port
+ from 523 to 119.
+ * Document.cc: management of NNTP documents retrieval.
+
+Fri Sep 01 19:05:02 CEST 2000 Gabriele Bartolini <g.bartol at comune.prato.it>
+
+ * htnet/HtNNTP.* : just created them ...
+ * htnet/HtHTTP.cc : removed modification_time deletion in the
+ class destructor.
+
+Thu Sep 01 12:00:00 2000 Toivo Pedaste <toivo at ucs.uwa.edu.au>
+
+ * htdig/Retriever.cc: Allow for modify time being set to
+ current time if not available.
+
+Thu Aug 31 13:21:12 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htcommon/defaults.cc (allow_in_form, build_select_lists):
+ Add clearer instructions to allow_in_form description, add
+ cross-links between these two sections.
+
+ * htdoc/attrs.html, cf_by{name,prog}.html: Reran cf_generate.pl.
+
+Wed Aug 30 10:01:59 CEST 2000 Gabriele Bartolini <g.bartol at comune.prato.it>
+
+ * substitution of char * returned types to const String & in URL and
+ Server classes. This change made me do lots of changes in other files:
+ HtFile.cc, HtHTTP.cc, HtConfiguration.*, Document.*, ExternalParser.*,
+ Retriever.*.
+
+Tue Aug 30 12:00:00 2000 Toivo Pedaste <toivo at ucs.uwa.edu.au>
+
+ * htlibs/md5.cc, htlibs/md5.h: Generate md5 hash of
+ a page and also optionally the modify date.
+
+ * htlibs/mhash_md5.h, htlibs/mhash_md5.c, htlibs/libdefs.h:
+ Md5 hash code from libmhash
+
+ * htdig/Retriever.cc: Allow storing md5 hashes of pages
+ in order to reject aliases.
+
+ * htcommon/defaults.cc: Options "check_unique_md5" and
+ "check_unique_date"
+
+Tue Aug 29 08:51:39 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htdoc/upgrade.html: Add description of the difference between
+ htmerge and htpurge. Mention other httools.
+
+ * htsearch/parser.cc, htsearch/parser.h: Merge in patch by Quim
+ Sanmarti <qss at gtd.es> to fix problems with phrase searching and
+ AND searches and improve performance.
+
+Sun Aug 27 22:41:10 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htsearch/AndParseTree.cc, htsearch/OrParseTree.cc (Parse):
+ Rewrote using new WordToken inherited method. Fixes a bug when the
+ user inputs two phrases next to each other.
+
+ * htsearch/ParseTree.cc (Parse): Fix bug where phrases would
+ "absorb" prior query words. Also fix bug where operators were
+ incorrectly popped off the stack. Should (hopefully) solve all
+ parsing problems.
+
+ * htsearch/*ParseTree.cc (GetLogicalWords): Test for empty list of
+ children to prevent potential segfault.
+
+Sat Aug 26 18:40:50 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * installdir/{syntax, header, footer, wrapper, nomatch}.html:
+ Add DTD tags, ALT attributes and remove bogus </select> tags to
+ fix invalid HTML pointed out in PR#901.
+
+Wed Aug 23 23:39:18 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htsearch/ParseTree.cc (Parse): Get rid of compiler warnings, use
+ new private tokenizer to ensure parens and quote aren't
+ removed. Also, when popping an operator off the parens stack, make
+ sure it's adopted by a new ParseTree object so we get the parens
+ back in the tree hierarchy.
+
+Wed Aug 23 23:34:44 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htsearch/AndParseTree.cc (Parse): Fix nasty infinite loop when
+ phrases hit in AND searches.
+
+ * htsearch/OrParseTree.cc (Parse): Ditto.
+
+Wed Aug 23 13:24:31 CEST 2000 Gabriele Bartolini <g.bartol at comune.prato.it>
+
+ * htnet/HtHTTP.*, htnet/Transport.h: all 'char *', when possible,
+ have been changed into 'const String &' types.
+
+Sun Aug 20 23:25:01 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * httools/htpurge.cc (purgeDocs): Add error message when document
+ database is completely empty. Should take care of PR#672 (and others).
+
+Sun Aug 20 20:37:53 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htlib/HtRegex.h, htlib/HtRegex.cc: Made destructor virtual,
+ added lastError() and associated support. Changed return type of
+ set*() to int. They now return the value of |compiled|.
+
+ * htcommon/defaults.cc (url_rewrite_rules): Add new attribute to
+ support patch by Andy Armstrong <andy at tagish.com> for permanent
+ URL rewriting.
+
+ * htdoc/attrs.html, cf_by{name,prog}.html: Reran cf_generate.pl.
+
+ * htlib/HtRegexReplace.cc, htlib/HtRegexReplaceList.cc,
+ htlib/HtRegexReplace.h, htlib/HtRegexReplaceList.h,
+ htcommon/HtURLRewriter.cc, htcommon/HtURLRewriter.h: New classes.
+
+ * htcommon/Makefile.am, htcommon/Makefile.in: Add compilation for
+ HtURLRewriter.
+
+ * htlib/Makefile.am, htcommon/Makefile.in: Ditto for
+ HtRegexReplace*
+
+ * htcommon/URL.h, htcommon/URL.cc (rewrite): New method for
+ transforming URLs based on HtURLRewriter.
+
+ * htdig/Retriever.cc (got_href): Rewrite the URL before we do
+ anything with it.
+
+ * htdig/htdig.cc: Include HtURLRewriter headers and check rewrite
+ rules for errors.
+
+Sat Aug 19 17:01:36 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htcommon/conf_lexer.lxx: Patched to fix the bug with relative
+ filename includes. Keeps a separate stack with the filenames and
+ adjusts accordingly.
+
+ * htcommon/conf_lexer.cxx: Updated using flex 2.5.4.
+
+Thu Aug 17 23:59:26 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htcommon/conf_lexer.lxx: Patched to fix a bug reported by Abel
+ Deuring -- config filename stack was decremented too many times.
+
+ * htcommon/conf_lexer.cxx: Updated using flex 2.5.4.
+
+Thu Aug 17 23:40:08 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htword/WordType.h (WordToken): Add non-destructive version of
+ HtWordToken using a passed int as a pointer into the
+ string. Add virtual destructor so class can be sub-classed.
+
+ * htword/WordType.cc (WordToken): Implement it.
+
+ * httools/htmerge.cc (mergeDB): Back out change of Aug. 9th --
+ WordSearchDescription has disappeared from htword
+ interfaces. Should be restored when Loic comes back and can
+ suggest an alternative.
+
+Thu Aug 17 16:59:05 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htsearch/Display.cc (createURL): Get rid of extra "config="
+ parameter that was inserted before collections stuff.
+
+Thu Aug 17 15:47:58 CEST 2000 Gabriele Bartolini <g.bartol at comune.prato.it>
+
+ * htnet/HtHTTP.cc: ask again for a document after a <NoHeader>
+ response is given by the HTTPRequest() method.
+
+Thu Aug 17 12:25:33 CEST 2000 Gabriele Bartolini <g.bartol at comune.prato.it>
+
+ * htnet/HtHTTP.*, htnet/Transport.* : fixed bug with HTTP/1.1 management.
+ Now the "Connection: close" directive is handled and forces the connection
+ to be closed. So the bug has now been fixed. Fixed other minor bugs and
+ strings initializations.
+
+Tue Aug 15 00:24:33 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * contrib/multidig/Makefile, gen-collect, db.conf, multidig.conf:
+ Add missing trailing newlines as pointed out by Doug Moran
+ <dmoran at dougmoran.com>.
+
+ * contrib/multidig/Makefile (install): Make sure scripts have a+x
+ permissions. Pointed out by Doug Moran.
+
+ * contrib/multidig/new-collect: Fix typo to ensure MULTIDIG_CONF
+ is set correctly.
+
+Sun Aug 13 23:17:30 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htdig/Server.h, htdig/Server.cc (Server): Add support for
+ per-server user_agent configuration.
+
+ * htdig/Document.cc (Retrieve): Ditto.
+
+ * httools/htpurge.cc (purgeDocs): Set remove_* attributes on a
+ per-server basis.
+
+ * htcommon/defaults.cc: Fix remove_bad_urls and
+ remove_unretrieved_urls to point to htpurge and not htmerge.
+
+Sat Aug 12 23:03:32 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htdoc/cf_generate.pl (html_escape): Fix mindless thinko with
+ perl stringwise-equal operator. Documentation is now generated
+ with block: portion appropriate to defaults.cc.
+
+ * htdoc/attrs.html, cf_by{name,prog}.html: Reran cf_generate.pl.
+
+Fri Aug 11 16:03:18 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/HTML.cc (parse): fix problem with &amp; not being translated.
+
+Fri Aug 11 10:48:54 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htsearch/Display.cc (setVariables), htcommon/defaults.cc: Added
+ maximum_page_buttons attribute, to limit buttons to less than
+ maximum_pages. Fixes PR#731 & PR#781.
+ * htdoc/attrs.html, cf_by{name,prog}.html: reran cf_generate.pl
+
+Wed Aug 9 23:04:39 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * httools/htmerge.cc (mergeDB): Add fix, contributed by Lorenzo,
+ to prevent duplicate documents when you merge a database with a
+ copy of itself.
+
+Wed Aug 9 22:58:39 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htsearch/parser.cc (score): Merged in patch contributed by
+ Lorenzo Campedelli <lorenzo.campedelli at libero.it> and Arthur
+ Prokosch <prokosch at aptima.com> to fix problems with AND operators
+ and phrase matches.
+
+Wed Aug 2 11:44:11 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htsearch/Display.cc (setVariables), htcommon/defaults.cc: Enhanced
+ build_select_lists attribute, to generate not only single-choice
+ select lists, but also select multiple lists, radio button lists
+ and checkbox lists. Added explanation and examples in documentation.
+ * htdoc/hts_selectors.html: Added detailed explanation of new feature.
+ * htdoc/attrs.html, cf_by{name,prog}.html: reran cf_generate.pl
+
+Tue Aug 1 21:50:22 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htsearch/ParseTree.cc (Parse): Fix problems with token
+ comparisons and fix thinko with HtWordToken parsing--previously
+ didn't advance the parse step at all.
+
+ * htsearch/*ParseTree.cc (Parse): Fix thinko with HtWordToken as
+ above--here it acted as an infinite loop.
+
+ * htdig/ExternalParser.cc (parse): Add shell quoting around
+ content-type. Hard to exploit, but a server could potentially
+ return a strange value that could then be executed locally.
+
+Thu Jun 29 23:33:51 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htsearch/ParseTree.h, htsearch/ParseTree.cc: New parent class
+ for the new htsearch framework. Still needs work.
+
+ * htsearch/*ParseTree.*: Derived classes appropriate to the method
+ indicated.
+
+ * htsearch/parsetest.cc: New program to allow initial
+ command-line testing of ParseTree classes.
+
+ * htsearch/Makefile.am, htsearch/Makefile.in: Build parsetest in
+ addition to htsearch. Eventually, parsetest is probably best
+ modified slightly and moved into the tests directory.
+
+Tue Jun 20 22:29:57 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * httools/htmerge.cc (mergeDB): Merge in patch contributed by
+ Lorenzo Campedelli <lorenzo.campedelli at libero.it> to greatly
+ reduce memory usage.
+
+Sun Jun 18 13:15:43 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htlib/Object.h (class Object): Fix problems with retrieval order
+ by ensuring the compare() method is declared const.
+
+Tue Jun 13 22:57:10 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htdig/Retriever.cc (GetLocal): Fix bug that would cause a
+ coredump when local_urls was used and local_default_docs was
+ needed. The list of default filenames was freed before it should
+ have been.
+
+Tue Jun 13 19:30:28 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htcommon/HtWordReference.h, htcommon/HtWordReference.cc (Load,
+ LoadHeaders): New methods to check the header of an ASCII
+ representation and read it in.
+
+ * htcommon/HtWordList.h, htcommon/HtWordList.cc (Load): Add load
+ method to read in data. Calls the new methods above.
+
+ * httools/htload.cc: Open word databases read-write and call
+ HtWordList::Load().
+
+Sun Jun 11 14:39:28 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htsearch/Display.cc (generateStars): Fix problem when maxScore
+ == minScore as reported by Rajendra. Fixed problem PR#858.
+ (displayMatch): Ditto.
+
+ * htsearch/htsearch.cc: Fix memory corruption problem in reporting
+ syntax errors pointed out by Rajendra. Fixes PR#860.
+
+Thu Jun 8 09:31:15 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htfuzzy/Accents.h, htfuzzy/Accents.cc: Apply Robert Marchand's
+ patch to his algorithm. Gets rid of writeDB function (falls back
+ on default one in Fuzzy.cc), changes addWord, and adds a new
+ getWords function to override default. These avoid overhead of
+ unaccented forms of words in accents database, but ensure that
+ unaccented form of search word is always searched.
+
+Thu Jun 8 09:00:02 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htcommon/DocumentRef.h(DocScore, docScore),
+ htsearch/ResultMatch.cc(ScoreMatch::compare),
+ htsearch/ResultMatch.h(setScore, getScore, score),
+ htsearch/Display.cc(displayMatch, generateStars, buildMatchList):
+ Apply Terry Luedtke's patch for score calculations, to calculate
+ min & max from log(score).
+
+Thu Jun 8 08:47:03 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * contrib/doc2html/doc2html.pl: Apply David Adams' fix for missing
+ quote.
+
+Wed Jun 07 10:53:53 2000 Loic Dachary <loic at senga.org>
+
+ * db/db.c (CDB___db_dbenv_setup): open mode is 0666 instead
+ of 0; otherwise the weakcmpr file is not opened with the proper
+ mode.
+
+Tue Jun 6 23:48:48 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * httools/htpurge.cc: Fix coredump problems by passing
+ dictionaries as pointers rather than full objects (this is
+ preferred anyway).
+
+Sun Jun 4 22:17:14 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * test/t_htdig_local: Added test for local filesystem support.
+
+ * test/config/htdig.conf2.in: Change to be a config file for
+ local_urls testing.
+
+ * test/Makefile.am: Add t_htdig_local to list.
+
+Tue May 30 23:52:45 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * httools/htmerge.cc: Move to httools directory, remove "cleanup"
+ functionality now in htpurge and merge in htmerge.h and db.cc files.
+
+ * httools/Makefile.am: Add htmerge now moved to this directory.
+
+ * */Makefile.in: Update with automake.
+
+ * Makefile.am (SUBDIRS): Remove htmerge, now found in httools.
+
+ * configure.in: Ditto.
+
+ * configure: Update with autoconf.
+
+ * test/test_functions.in: Add paths for htpurge, htstat, htload,
+ htdump and update path for htmerge.
+
+ * test/t_htdig: Change htmerge to htpurge to clean out incorrect URLs.
+
+ * installdir/rundig: Change htmerge to htpurge. This needs serious
+ additional cleanup for use in 3.2 since many conventions have changed!
+
+Tue May 23 22:21:14 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * README: Fix for 3.2.0b3 and clean up organization a bit for new
+ directory structure.
+
+Wed May 17 23:22:31 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htdig/HTML.cc (do_tag): Add support for TITLE attributes in
+ anchor and related tags.
+
+Fri May 12 17:54:09 2000 Loic Dachary <loic at senga.org>
+
+ * db/acinclude.m4: bigfile support is disabled by default.
+
+ * db/mp_region.c (CDB___memp_close): clear weakcmpr pointer
+ when closing region so that memory pool files are not
+ released twice.
+
+Wed May 10 22:26:21 2000 Loic Dachary <loic at senga.org>
+
+ * */*.cc: all include htconfig.h
+
+ * htlib/HtTime.h: remove htconfig.h inclusion (never in headers)
+
+ * htlib/*.h,*.cc: Fix copyright GNU Public -> GNU General Public
+ and 1999, 2000 instead of 1999.
+
+Tue May 09 16:38:07 2000 Loic Dachary <loic at senga.org>
+
+ * htsearch/Collection.cc (Collection): set searchWords and
+ searchWordsPattern to null in constructor. Delete in destructor.
+ Also delete matches in destructor.
+
+ * test/word.cc (doskip_harness): free cursor after use.
+
+ * test/word.cc (doskip_overflow): free cursor after use.
+
+ * test/dbbench.cc (find): free cursor after use.
+
+ * htsearch/htsearch.cc (main): free searchWords and searchWordsPattern
+ after usage.
+
+ * htdb/htdb_{load,dump,stat}.cc (main): call WordContext::Finish
+ to free global context for inverted index.
+
+ * htdb/htdb_stat.cc (btree_stats): free stat structure.
+
+ * htlib/List.h (class List): Add Shift/Unshift/Push/Pop methods.
+
+ * htlib/List.h (class List): Add Remove(int position) method.
+
+Tue May 09 00:22:33 2000 Loic Dachary <loic at senga.org>
+
+ * htsearch/htsearch.cc (main): kill useless call to
+ StringList::Release
+
+ * htsearch/HtURLSeedScore.cc (ScoreAdjustItem): remove useless
+ call to StringList::Destroy.
+
+ * htlib/HtWordCodec.cc (HtWordCodec): Fix usage of StringList
+ that was inserting pointers to volatile strings instead of
+ permanent copies. I suspect that the tweak on StringList was
+ primarily done to satisfy this piece of code. After reviewing
+ all the usage of StringList, it's the only one to use it in this
+ fashion.
+
+ * htlib/QuotedStringList.h (class QuotedStringList): remove
+ noop destructor to enable Destroy of the underlying StringList
+ when deleted.
+
+Mon May 08 18:17:02 2000 Loic Dachary <loic at senga.org>
+
+ * htlib/StringList.h (class StringList): change methods
+ Add/Insert/Assign that were copying the String* given in argument.
+ This behaviour is confusing since its semantics differ from those
+ of the base class List.
+
+Mon May 08 17:16:00 2000 Loic Dachary <loic at senga.org>
+
+ * htdig/Retriever.cc (GetLocal): fix leaked defaultdocs
+
+Mon May 08 04:27:47 2000 Loic Dachary <loic at senga.org>
+
+ * htlib/StringList.cc (Create): remove SRelease. Deleting
+ the strings is taken care of by the destructor through
+ Destroy. If destruction of the Strings is not desirable,
+ Release should be used. SRelease was added apparently after
+ a virtual constructor doing nothing was added to hide the
+ default call to Destroy therefore leaking memory.
+
+Mon May 08 01:28:25 2000 Loic Dachary <loic at senga.org>
+
+ * test/txt2mifluz.cc,word.cc,search.cc: fix minor memory leaks.
+
+Sun May 07 19:24:12 2000 Loic Dachary <loic at senga.org>
+
+ * Makefile.config (HTLIBS): add libht at end because htdb
+ now depends on htlib.
+
+ * configure.in,htlib/Makefile.am: use LTLIBOBJS as suggested
+ by the libtool documentation.
+
+Sun May 07 17:09:22 2000 Loic Dachary <loic at senga.org>
+
+ * test/Makefile.am (clean-local): clean conf to prevent
+ inconsistencies when re-configuring in a directory that
+ is not the source directory.
+
+Sun May 07 05:07:23 2000 Loic Dachary <loic at senga.org>
+
+ * db/mkinstalldir,test/benchmark: Add for installation purpose
+
+Sun May 07 02:17:03 2000 Loic Dachary <loic at senga.org>
+
+ * Makefile.am (distclean-local): Use Xtest instead of test,
+ which confuses some shells.
+
+Sun May 07 02:02:46 2000 Loic Dachary <loic at senga.org>
+
+ * htword/WordDB.cc: Move Open to WordDB.cc.
+
+Sun May 07 01:32:47 2000 Loic Dachary <loic at senga.org>
+
+ * test/t_*: check/fix scripts. All regression tests pass
+ on RedHat-6.2.
+
+Sun May 07 00:54:30 2000 Loic Dachary <loic at senga.org>
+
+ * */*.cc: fix warnings and large file support inclusion
+ files on Solaris.
+
+Sat May 06 21:55:58 2000 Loic Dachary <loic at senga.org>
+
+ * test/: import regression tests from mifluz
+
+ * htlib/DB2_db.cc (db_init): fix flags used when creating the
+ environment to include a memory pool.
+
+ * htcommon/defaults.cc: change wordkey_description format.
+ update all wordlist_* attributes
+
+Sat May 06 04:46:03 2000 Loic Dachary <loic at senga.org>
+
+ * htmerge/words.cc (mergeWords): WordSearchDescription becomes
+ WordCursor.
+
+ * httools/htpurge.cc (purgeWords): WordSearchDescription becomes
+ WordCursor.
+
+Sat May 06 02:01:40 2000 Loic Dachary <loic at senga.org>
+
+ * htdb/*: upgrade to Berkeley DB 3.0.55. Very different.
+
+ * htlib/getcwd.c,memcmp.c,memcpy.c,memmove.c,raise.c,snprintf.c,
+ strerror.c,vsnprintf.c,clib.h: Add compatibility support
+
+ * htcommon/DocumentDB.cc (LoadDB): remove unused variable
+
+ * htlib/DB2_db.cc: adapt to Berkeley DB 3.0.55 syntax.
+
+ * htlib/Database.h (class Database): remove DB_INFO, does
+ not exist in Berkeley DB 3.0.55
+
+ * htlib/*: run ../db/prefix-symbols.sh
+
+ * Makefile.config (INCLUDES): fix db include dirs
+
+ * acconfig.h: Big file support + replacement functions
+
+ * acinclude.m4,configure.in : db instead of db/dist + bug fixes
+
+Fri May 5 08:33:59 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * db/*: Merge in changes from Loic's mifluz tree. This will break
+ everything, but Loic promises he'll fix it ASAP after I make this
+ change.
+
+Mon Apr 24 21:58:22 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htdig/htdig.cc (main): Make the -l stop & restart mode the
+ default. This will catch signals and quit gracefully. The
+ command-line parser will still accept -l, it will just ignore it.
+ (usage): Remove -l portion.
+ (main): Fix -m option to read in a file as it's
+ supposed to do! Also set max_hops correctly so it really only indexes
+ the URLs in that file.
+
+ * htdoc/htdig.html: Remove -l from documentation since it's now
+ the default.
+
+Mon Apr 24 21:22:53 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htdig/Server.cc (push): Fix bug where changes in the robots.txt
+ would be ignored. If a URL was indexed and later the robots.txt
+ changed to forbid it, the URL would still be updated.
+
+Wed Apr 19 22:13:02 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * Merging in changes from mifluz 0.14 from Loic.
+
+ * htlib/Configuration.cc (Read): Removed dependency on fstream.h,
+ use fopen, fprintf, fgets, fclose instead of iostream.
+
+ * htlib/HtPack.cc, htlib/HtVectorGeneric.h, htlib/Object.h,
+ htlib/ParsedString.cc, htlib/String.cc: Remove use of cerr,
+ instead use fprintf(stderr ...).
+
+ * htlib/Dictionary.cc, htlib/HtVectorGeneric.cc, htlib/List.cc,
+ htlib/Object.cc, htlib/StringList.cc, htlib/htString.h,
+ htlib/strcasecmp.cc: Add #ifdef blocks for htconfig.h
+
+Wed Apr 12 19:09:40 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * .version: Bump to 3.2.0b3.
+
+ * htdoc/htload.html, htdoc/htpurge.html, htdoc/htstat.html: Fix
+ typos in headers.
+
+ * htdoc/main.html: Fix link to download to actually point to 3.2.0b2.
+
+Tue Apr 11 00:21:48 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htsearch/htsearch.cc (setupWords): Do not apply fuzzy
+ algorithms to phrase queries. This helps prevent the infinite
+ loops described on the mailing list.
+
+ * htcommon/conf_parser.yxx (list): Add conditions for lists
+ starting with string-number, number-string, and number-number.
+
+ * htcommon/conf_parser.cxx: Regenerate using bison.
+
+ * htdoc/RELEASE.html: Update release notes for recent bug fixes
+ and likely release date for 3.2.0b2.
+
+ * htdoc/main.html: Add a blurb about the 3.2.0b2 release.
+
+ * htdoc/*.html: Remove author notes in the footer as requested by
+ Andrew. To balance it out, the copyright notice at the top links
+ to THANKS.html.
+
+Sun Apr 9 15:21:12 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htcommon/conf_parser.yxx (list): Fix problem with
+ build_select_lists--parser didn't support lists including numbers.
+
+ * htcommon/conf_parser.cxx: Regenerate using bison.
+
+Sun Apr 9 12:53:02 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htdoc/RELEASE.html: Add a first draft of 3.2.0b2 release notes.
+
+Sun Apr 9 12:31:13 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * httools/Makefile.am, httools/Makefile.in: Add htload to
+ compilation list.
+
+ * htcommon/DocumentDB.h: Add optional verbose options to DumpDB
+ and LoadDB.
+
+ * htcommon/DocumentDB.cc (LoadDB): Implement loading and parsing
+ an ASCII version of the document database. Records on disk will
+ replace any matching records in the db.
+ (DumpDB): Add all fields in the DocumentRef to ensure the entire
+ database is written out.
+
+ * htcommon/DocumentRef.h: Add new method for setting DocStatus
+ from an int type.
+
+ * htcommon/DocumentRef.cc (DocStatus): Set it using a switch
+ statement. (It's not pretty, but it works.)
+
+ * httools/htload.cc: New file. Loads in ASCII versions of the
+ databases, replacing existing records if found.
+
+ * httools/htdump.cc: Pass verbose flags to DumpDB method. Make
+ sure to close the document DB before quitting.
+
+ * httools/htpurge.cc: Add -u option to specify a URL to purge from
+ the command-line.
+
+ * httools/htstat.cc: Add -u option to output the list of URLs in
+ the document DB as well.
+
+Sat Apr 8 16:35:55 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htcommon/defaults.cc: Change all <b>, <i>, and <tt> tags to the
+ HTML-4.0 compliant <strong>, <em>, and <code> tags.
+
+ * installdir/long.html, installdir/header.html,
+ installdir/nomatch.html, installdir/syntax.html,
+ installdir/wrapper.html: Ditto.
+
+ * htdoc/*.html: Ditto. (Don't you just love sed?)
+
+ * htsearch/TemplateList.cc (createFromString): Ditto.
+
+ * htdoc/htpurge.html, htdoc/htdump.html, htdoc/htload.html,
+ htdoc/htstat.html: New files documenting usage of httools
+ programs.
+
+ * htdoc/contents.html: Add links to above.
+
+ * htdoc/htdig.html: Update table with -t format to match htdump.
+
+Fri Apr 7 00:30:01 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * README: Update to mention 3.2.0b2 and use correct copyright. (It
+ is 2000 after all!)
+
+ * htdoc/FAQ.html, htdoc/where.html, htdoc/uses.html,
+ htdoc/isp.html: Update with most recent versions from maindocs.
+
+ * htdoc/RELEASE.html: Add release notes for 3.1.5 to the
+ top. (It's out of version ordering, but it is in correct
+ chronological order.)
+
+Fri Apr 7 00:11:29 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * httools/htpurge.cc (main): Read in URLs from STDIN for purging,
+ one per line. Pass them along to purgeDocs for removal. Also, make
+ discard_list into a local variable and pass it from purgeDocs to
+ purgeWords.
+ (purgeDocs): Accept a hash of URLs to delete (user input) and
+ return the list of doc IDs deleted.
+ (usage): Note the - option to read in URLs to be deleted from STDIN.
+
+Thu Apr 6 00:10:23 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htdig/Retriever.cc (got_redirect): Allow the redirect to accept
+ relative redirects instead of just full URLs.
+
+Wed Apr 5 15:07:52 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htsearch/Display.cc: Added #if test to make sure DBL_MAX is
+ defined on Solaris, as reported by Terry Luedtke.
+
+Tue Apr 4 12:46:37 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * contrib/doc2html/*: Added parser submitted by D.J.Adams at soton.ac.uk
+
+Mon Apr 3 13:48:59 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htcommon/defaults.cc: Fix error in description of new attribute
+ plural_suffix.
+ * htdoc/attrs.html, htdoc/cf_byname.html, htdoc/cf_byprog.html:
+ Regenerate using cf_generate.pl.
+
+Fri Mar 31 21:48:02 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * configure.in, configure: Add test using AC_TRY_RUN to compile
+ against the htlib/regex.c and attempt to compile a regexp. This
+ should allow us to find out if the included regex code causes
+ problems.
+
+ * acconfig.h: Add HAVE_BROKEN_REGEX as a result of the configure
+ script to conditionally include the appropriate regex.h file.
+
+ * include/htconfig.h.in: Regenerate using autoheader.
+
+ * htlib/regex.c: Move #include "htconfig.h" inside HAVE_CONFIG_H
+ tests. This file is only created when this is true anyway. This
+ prevents problems with the configure test.
+
+ * htlib/HtRegex.h, htfuzzy/EndingsDB.cc: Use HAVE_BROKEN_REGEX
+ switch to use the system include instead of the local include
+ where appropriate.
+
+ * htlib/Makefile.am, htlib/Makefile.in: Only compile regex.lo if
+ the configure script added it to LIBOBJS.
+
+Thu Mar 30 22:41:38 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htcommon/URL.cc (normalizePath): Remove Gilles's loop to add
+ back ../ components to a path that would go above the top
+ level. Now we simply discard them. Both are allowed under the RFC,
+ but this should have fewer "surprises."
+
+Tue Mar 28 21:57:49 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htnet/Connection.cc (Read_Partial): Fix bug reported by Valdas
+ where a zero value returned by select would result in an infinite
+ loop.
+
+ * htcommon/defaults.cc: Add new attribute plural_suffix to set the
+ language-dependent suffix for PLURAL_MATCHES contributed by Jesse.
+
+ * htsearch/Display.cc (setVariables): Use it.
+
+ * htdoc/attrs.html, htdoc/cf_byname.html, htdoc/cf_byprog.html:
+ Regenerate using cf_generate.pl.
+
+Mon Mar 27 22:28:20 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htcommon/DocumentRef.cc (Deserialize): Add back stub for
+ DOC_IMAGESIZE to prevent decoding errors. This just throws away
+ that field.
+
+ * htcommon/HtSGMLCodec.h (class HtSGMLCodec): Differentiate
+ between codec used for &foo; and numeric form &#nnn; Make sure
+ encoding goes through both but decoding only goes through the
+ preferred text form.
+
+ * htcommon/HtSGMLCodec.cc (HtSGMLCodec): When constructing the
+ private HtWordCodec objects, create separate lists for the number
+ and text codecs.
+
+Mon Mar 27 21:25:27 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htsearch/HtURLSeedScore.cc (ScoreAdjustItem): Change to use
+ HtRegex for flexibility and to get around const char * -> char *
+ problems.
+
+ * htsearch/SplitMatches.cc (MatchArea): Ditto.
+
+ * htsearch/Makefile.am, htsearch/Makefile.in: Add SplitMatches.cc
+ and HtURLSeedScore.cc to compilation list!
+
+Mon Mar 27 21:03:12 2000 Hans-Peter Nilsson <hp at bitrange.com>
+
+ * htcommon/defaults.cc (defaults): Add default for
+ search_results_order, url_seed_score.
+
+ * htdoc/attrs.html, htdoc/cf_byname.html, htdoc/cf_byprog.html:
+ Regenerated using cf_generate.pl.
+
+ * htlib/List.h (List): New method AppendList.
+ * htlib/List.cc (List::AppendList): Implement it.
+
+ * htsearch/SplitMatches.h, htsearch/SplitMatches.cc: New.
+
+ * htsearch/HtURLSeedScore.cc, HtURLSeedScore.h: New.
+
+ * htsearch/Display.h (class Display): Add member minScore.
+ Change maxScore type to double.
+
+ * htsearch/Display.cc: Include SplitMatches.h and HtURLSeedScore.h
+ (ctor): Initialize minScore, change init value for
+ maxScore to -DBL_MAX.
+ (buildMatchList): Use a SplitMatches to hold search results and
+ iterate over its parts when sorting scores.
+ Ignore Count() of matches when setting minScore and maxScore.
+ Use an URLSeedScore to adjust the score after other calculations.
+ Calculate minScore.
+ Correct maxScore adjustment for change to double.
+ (displayMatch): Use minScore in calculation of score to adjust for
+ negative scores.
+ (sort): Calculation of maxScore moved to buildMatchList.
+
+Mon Mar 27 20:22:24 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htcommon/DocumentRef.h, htcommon/DocumentRef.cc: Remove
+ DocImageSize field since it is not used anywhere and is never updated.
+
+ * htdig/Retriever.h (class Retriever): Remove references to Images class.
+
+ * htcommon/DocumentDB.cc (DumpDB): Ignore DocImageSize field.
+
+ * htdig/Makefile.am, htdig/Makefile.in: Remove Images.cc since
+ this is no longer used.
+
+ * htdig/Plaintext.cc: Do not insert SGML equivalents into the
+ excerpt, these are decoded by HtSGMLCodec automatically.
+
+Sat Mar 25 21:58:36 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htdoc/cf_generate.pl (html_escape): Changed <b></b> and <i></i>
+ tags to HTML 4.0 <strong> and <em> tags.
+
+Sat Mar 25 17:23:46 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htdb/Makefile.am, htdb/Makefile.in: Change the names of the htdb
+ utility programs to escape name conflicts with httool programs.
+
+ * htdb/htdb_load.cc: Rename htload.cc to escape name conflict and
+ more closely match original db_load program name.
+
+ * htdb/htdb_dump.cc, htdb/htdb_stat.cc: Ditto.
+
+ * htfuzzy/Prefix.cc (getWords): Add code to "weed out" duplicates
+ returned from WordList::Prefix. We only want to add unique words
+ to the search list.
+
+Fri Mar 24 22:33:20 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htdig/Document.cc (Document): Fix bug reported by Mentos
+ Hoffman, contributed by Atlee Gordy <agordy at moonlight.net>.
+
+Mon Mar 20 23:14:26 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htcommon/DocumentDB.cc (Delete): Fix bug reported by Valdas
+ where duplicate document records could "sneak in" because the
+ doc_index entry was removed incorrectly.
+
+Mon Mar 20 19:08:14 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htcommon/defaults.cc: Added block field and added appropriate blocks.
+
+ * htlib/Configuration.h (struct ConfigDefaults): Add block field.
+
+ * htdoc/cf_generate.pl: Parse the new block field.
+
+ * htdoc/cf_byname.html, htdoc/cf_byprog.html, htdoc/attrs.html:
+ Regenerate using above.
+
+ * htcommon/DocumentDB.cc (DumpDB): Make sure we decompress the
+ DocHead field before we write it to disk!
+
+ * httools/htdump.cc, httools/htstat.cc: Call
+ WordContext::Initialize() before doing any htword calls.
+
+Mon Mar 20 14:10:30 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * httools/htpurge.cc: Whoops! Left some references to htmerge in
+ the error messages and usage message.
+
+ * httools/htstat.cc: New program. Simply spits out the total number
+ of documents, words and unique words in the databases.
+
+ * httools/htdump.cc: New program. Simply dumps the contents of the
+ document DB and the word DB to doc_list and word_dump files
+ respectively. Also has flags -w and -d to pick one or the other.
+
+ * httools/Makefile.am, httools/Makefile.in: Add htdump and htstat
+ programs to compilation list.
+
+ * htcommon/DocumentDB.cc (DumpDB): Change name of CreateSearchDB
+ and add fields for DocBackLinks, DocSig, DocHopCount, DocEmail,
+ DocNotification, and DocSubject. This should now export every
+ portion of the document DB.
+
+ * htcommon/DocumentDB.h: Change name of CreateSearchDB and add
+ stub for LoadDB, to be written shortly.
+
+ * htdig/htdig.cc: Call DumpDB instead of CreateSearchDB when
+ creating an ASCII version of the DB.
+
+Sat Mar 18 22:57:02 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * httools/Makefile.am, httools/Makefile.in: New directory for
+ useful database utilities.
+
+ * httools/htnotify.cc: Moved htnotify to httools directory.
+
+ * httools/htpurge.cc: New program--currently just purges documents
+ (and corresponding words) in the databases. Will shortly also
+ allow deletion of specified URLs.
+
+ * Makefile.am, configure.in: Remove htnotify directory in favor of
+ httools directory.
+
+ * configure: Regenerate using autoconf.
+
+ * Makefile.in: Regenerate using automake --foreign.
+
+Fri Mar 17 16:47:37 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htsearch/Display.cc (excerpt, hilight): Correctly handle case
+ where there is no pattern to highlight.
+ * htsearch/htsearch.cc (addRequiredWords), htcommon/defaults.cc:
+ Add any_keywords attribute, to OR keywords rather than ANDing,
+ fix addRequiredWords not to mess up expression when there are
+ no search words, but required words are given.
+ * htdoc/hts_form.html: Mention new attribute, add links to all
+ mentioned attributes.
+ * htdoc/attrs.html, htdoc/cf_byname.html, htdoc/cf_byprog.html:
+ Regenerate using cf_generate.pl.
+
+Fri Mar 17 15:48:12 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htfuzzy/Accents.cc (generateKey): Truncate words to
+ maximum_word_length, for consistency with what's found in word DB.
+
+Fri Mar 17 10:56:17 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/HTML.cc (do_tag): Use case insensitive parsing of META
+ robots tag content.
+ * htlib/String.cc (uppercase): Fix misplaced cast for islower().
+
+Mon Mar 6 17:31:37 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htsearch/htsearch.cc (setupWords): Don't allow comma as string
+ list separator, as it can be a decimal point in some locales.
+
+Mon Mar 06 00:58:00 2000 Loic Dachary <loic at ceic.com>
+
+ * db/mp/mp_bh.c (__memp_bhfree): always free the chain, if
+ any. The bh is reset to null after free and we lose the
+ pointer anyway, finally filling the pool with it.
+
+ * db/mp/mp_cmpr.c (__memp_cmpr_write): i < CMPR_MAX - 1 instead of
+ i < CMPR_MAX; otherwise we go beyond array limits. This fixes a
+ major problem when handling large files.
+
+Sat Mar 04 19:41:49 2000 Loic Dachary <loic at ceic.com>
+
+ * db/mp/mp_cmpr.c (__memp_cmpr_free_chain): clear BH_CMPR
+ flag. Was causing core dumps, thanks to
+ Peter Marelas maral at phase-one.com.au for providing
+ a simple case to reproduce the error.
+
+Fri Mar 3 11:32:34 2000 CEST Gabriele Bartolini <g.bartol at comune.prato.it>
+
+ * Fixed bugs regarding yesterday's changes. Even Leonardo da Vinci
+ used to commit errors, so ...
+
+Fri Mar 3 11:25:42 2000 CEST Gabriele Bartolini <g.bartol at comune.prato.it>
+
+ * testnet.cc: added the -r and -w options in order to set how many
+ times it retries to re-connect after a timeout occurs, and how long
+ it should wait after it.
+
+Thu Mar 2 18:45:15 2000 CEST Gabriele Bartolini <g.bartol at comune.prato.it>
+
+ * htnet/Connection.*: management of wait_time and number of retries
+ after a timeout occurs.
+
+ * htnet/Transport.*: Management of connection attributes above.
+
+ * htdig/Server.*: Set members for managing timeout retries taken from
+ the configuration file ("timeout", "tcp_max_retries", "tcp_wait_time").
+
+ * htdig/Document.cc: Added the ability to configure on a per-server basis
+ "persistent_connections", "head_before_get", "timeout",
+ "tcp_max_retries", "tcp_wait_time". Changed the Retrieve method to accept
+ a server object pointer: Retrieve (server*, HtDateTime).
+
+ * htdig/Retriever.cc: Added the ability to configure on a per-server basis
+ "max_connection_requests" attribute.
+
+ * htcommon/defaults.cc: Added "tcp_max_retries", "tcp_wait_time" -- Needs
+ to be gone over by someone who speaks English better than me. Not hard
+ work !!! ;-)
+
+Wed Mar 1 17:01:09 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htsearch/Display.cc (excerpt, hilight): move SGML encoding into
+ hilight() function, because when it's done earlier it breaks
+ highlighting of accented characters.
+
+Wed Mar 1 16:02:49 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htfuzzy/htfuzzy.cc (main): Correctly test return value on Open()
+ of word database, include db name in error message if Open() fails,
+ do a WordContext::Initialize() before we need htword functions.
+ (Obviously I'm the first to test htfuzzy in 3.2!)
+ * htfuzzy/Accents.cc (generateKey): cast characters to unsigned char
+ before using as array subscripts.
+
+Wed Mar 1 13:27:26 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htcommon/defaults.cc: Added accents_db attribute, mentioned accents
+ algorithm in search_algorithms section.
+ * htdoc/attrs.html, htdoc/cf_byname.html, htdoc/cf_byprog.html:
+ Regenerate using cf_generate.pl.
+ * installdir/htdig.conf: Added mentions of accents, speling & substring,
+ fixed a couple typos in comments.
+ * htdoc/htfuzzy.html: Added blurb on accents algorithm.
+ * htdoc/require.html: Added mentions of accents, speling, substring,
+ prefix & regex.
+ * htdoc/config.html: Updated with sample of latest htdig.conf and
+ installdir/*.html, added blurb on wrapper.html.
+
+Wed Mar 1 00:30:19 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * configure.in, configure: Add test for FD_SET_T, the second (also
+ third and fourth) argument in calls to select(). Should solve PR#739.
+
+ * acconfig.h, include/htconfig.h.in: Add declaration for FD_SET_T.
+
+ * htnet/Connection.cc (Read_Partial): Change declaration of fds to
+ use FD_SET_T define set by the configure script.
+
+Tue Feb 29 23:11:49 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htlib/DB2_db.cc (Error): Simply fprint the error message on
+ stderr. This is not a method since the db.h interface expects a C
+ function.
+ (db_init): Don't set db_errfile, instead set errcall to point to
+ the new Error function.
+
+Tue Feb 29 15:09:41 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htfuzzy/Accents.h, htfuzzy/Accents.cc: Adapted writeDB() for 3.2.
+
+Tue Feb 29 14:29:37 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htfuzzy/Accents.h, htfuzzy/Accents.cc: Added these, as contributed
+ by Robert Marchand, to implement accents fuzzy match. Adapted to 3.2.
+ * htfuzzy/Fuzzy.cc, htfuzzy/htfuzzy.cc, htfuzzy/Makefile.am,
+ htfuzzy/Makefile.in: Added in accents algorithm, as for soundex.
+
+Tue Feb 29 11:31:53 2000 Loic Dachary <loic at ceic.com>
+
+ * test/testnet.cc (Listen): Add -b port to listen to a specific
+ port. This is to test connect timeout conditions.
+
+ * htnet/Connection.cc (Connect): Added SIGALRM signal handler,
+ Connect() always allows EINTR to occur.
+
+Mon Feb 28 15:32:46 2000 Loic Dachary <loic at ceic.com>
+
+ * htword/WordKey.h (class WordKey): explicitly add inline keyword
+ for all inline functions.
+
+Mon Feb 28 13:10:34 2000 Loic Dachary <loic at ceic.com>
+
+ * htword/WordKey.h (class WordKey): nfields data member caches
+ result of NFields() method.
+
+ * htword/WordDBPage.h (class WordDBPage): nfields data member caches
+ result of WordKey::NFields() method.
+
+ * acinclude.m4 (APACHE): check in lib/apache for modules
+
+Sat Feb 26 22:05:03 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htsearch/Collection.h, htsearch/Collection.cc: New files
+ contributed by Rajendra Inamdar <inamdar at beasys.com>.
+
+ * htsearch/Makefile.am, htsearch/Makefile.in: Compile them.
+
+ * htcommon/defaults.cc: Add new collection_names attribute as
+ described by Rajendra.
+
+ * htdoc/attrs.html, htdoc/cf_byname.html, htdoc/cf_byprog.html:
+ Regenerate using cf_generate.pl.
+
+ * htsearch/Display.h, htsearch/Display.cc: Loop through
+ collections as we are assembling results.
+ (buildMatchList): Use 1.0 as minimum score and take log(score) as
+ the final score. This requires an increase in magnitude in weight
+ to correspond to a factor of increase in score.
+
+ * htsearch/DocMatch.h, htsearch/DocMatch.cc: Keep track of the
+ collection we're in.
+
+ * htsearch/ResultMatch.h: Ditto.
+
+ * htsearch/htsearch.h, htsearch/htsearch.cc: Wrap results in
+ collections.
+
+ * htsearch/parser.h, htsearch/parser.cc: Set the collection for
+ the results--we use this to get to the appropriate word DB.
+ (score): Divide word weights by word frequency to calibrate for
+ expected Zipf's law. Rare words should count more.
+
+Fri Feb 25 11:19:47 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htcommon/defaults.cc (maximum_pages): Describe new behaviour (as of
+ 3.1.4), where this limits total matches shown.
+ * htdoc/attrs.html, htdoc/cf_byname.html, htdoc/cf_byprog.html:
+ Regenerate using cf_generate.pl.
+
+Thu Feb 24 14:43:06 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htnet/HtFile.cc (Request): Fix silly typo.
+
+ * htlib/DB2_db.cc: Remove include of malloc.h, as it causes problems
+ on some systems (e.g. Mac OS X), and all we need should be in stdlib.h.
+
+Thu Feb 24 13:11:15 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htnet/HtFile.cc (Request): Don't append more than _max_document_size
+ bytes to _contents string, set _content_length to size returned by
+ stat().
+ * htnet/HtHTTP.cc (HTTPRequest): Extra tests in case Content-Length
+ not given for non-chunked input, and not to close persistent
+ connection when chunked input exceeds _max_document_size.
+ (ReadChunkedBody): Don't append more than _max_document_size bytes
+ to _contents string.
+
+Thu Feb 24 11:40:24 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/HTML.cc (do_tag): Fix handling of img alt text to be consistent
+ with body text, rather than keywords.
+ * htdig/Retriever.cc (ctor): Treat alt text as plain text, until it has
+ its own FLAG and factor.
+
+Thu Feb 24 11:16:37 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htcommon/defaults.cc (version): Moved example over to correct field.
+ (defaults[] terminator): Padded zeros to new number of fields.
+ * htdoc/attrs.html, htdoc/cf_byname.html, htdoc/cf_byprog.html:
+ Regenerate using cf_generate.pl.
+
+Thu Feb 24 19:08:41 2000 Loic Dachary <loic at ceic.com>
+
+ * htmerge/words.cc: only display Word in verbose message instead
+ of complete key if verbosity < 3.
+
+Thu Feb 24 10:43:12 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htcommon/defaults.cc (external_protocols, external_parser):
+ Swapped these two entries to put them in alphabetical order.
+ (star_blank): Fixed old typo (incorrect reference to image_star).
+ * htdoc/attrs.html, htdoc/cf_byname.html, htdoc/cf_byprog.html:
+ Regenerate using cf_generate.pl.
+
+Wed Feb 23 16:53:40 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htcommon/defaults.cc (backlink_factor, external_parser,
+ local_default_doc, local_urls, local_urls_only, local_user_urls):
+ Add some updates from 3.1.5's attrs.html.
+ * htdoc/attrs.html, htdoc/cf_byname.html, htdoc/cf_byprog.html:
+ Regenerate using cf_generate.pl.
+
+Wed Feb 23 15:11:51 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ [ Improve htsearch's HTML 4.0 compliance ]
+ * htsearch/TemplateList.cc (createFromString): Use file name rather
+ than internal name to select builtin-* templates, use $&(TITLE) and
+ $&(URL) in templates and quote HTML tag parameters.
+ * installdir/long.html, installdir/short.html: Use $&(TITLE) and
+ $&(URL) in templates and quote HTML tag parameters.
+ * htsearch/Display.cc (setVariables): quote all HTML tag parameters
+ in generated select lists.
+ * installdir/footer.html, installdir/header.html,
+ installdir/nomatch.html, installdir/search.html,
+ installdir/syntax.html, installdir/wrapper.html:
+ Use $&(var) where appropriate, and quote HTML tag parameters.
+ * installdir/htdig.conf: quote all HTML tag parameters.
+
+Wed Feb 23 13:40:27 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htcommon/URL.h (encodeURL): Change list of valid characters to
+ include only unreserved ones.
+ * htcommon/cgi.cc (init): Allow "&" and ";" as input param. separators.
+ * htsearch/Display.cc (createURL): Encode each parameter separately,
+ using new unreserved list, before piecing together query string, to
+ allow characters like "?=&" within parameters to be encoded.
+
+Wed Feb 23 13:22:29 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htlib/URL.cc (ServerAlias): Fix server_aliases processing to prevent
+ infinite loop (as for local_urls in PR#688).
+
+Wed Feb 23 12:49:52 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htlib/HtDateTime.h, htlib/HtDateTime.cc: change Httimegm() method
+ to HtTimeGM(), to avoid conflict with Httimegm() C function, so we
+ don't need "::" override, for Mac OS X.
+ * htlib/htString.h, htlib/String.cc: change write() method to
+ Write(), to avoid conflict with write() function, so we don't need
+ "::" override, for Mac OS X.
+
+Wed Feb 23 12:17:46 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htlib/Configuration.cc(Read): Fixed to allow final line without
+ terminating newline character, rather than ignoring it.
+
+Wed Feb 23 12:01:01 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/Retriever.cc (GetLocal, GetLocalUser): Add URL-decoding
+ enhancements to local_urls, local_default_urls & local_default_doc,
+ to allow hex encoding of special characters.
+
+Wed Feb 23 19:14:29 2000 Loic Dachary <loic at ceic.com>
+
+ * htcommon/conf_parser.cxx: regenerated from conf_parser.yxx
+
+Wed Feb 23 19:04:16 2000 Loic Dachary <loic at ceic.com>
+
+ * test/test_functions.in: unconditionally remove existing test/var
+ directory before running tests to prevent accidents.
+
+ * htcommon/URL.cc (URL): fixed String->char warning
+
+ * htcommon/defaults.cc (wordlist_compress): defaults to true
+
+Tue Feb 22 17:09:10 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/HTML.cc(parse, do_tag): Fix handling of <img alt=...> text
+ and parsing of words in meta tags, to do proper word separation.
+ * htlib/HtWordType.h, htlib/HtWordType.cc: Add HtWordToken() function,
+ to replace strtok() in HTML parser.
+
+Tue Feb 22 16:21:25 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htlib/URL.cc (ctor, normalizePath): Fix PR#779, to handle relative
+ URLs correctly when there's a trailing ".." or leading "//".
+
+Tue Feb 22 14:09:26 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/Document.cc (RetrieveLocal): Handle common extensions for
+ text/plain, application/pdf & application/postscript.
+
+Mon Feb 21 17:25:21 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * contrib/htdig-3.2.0.spec: Fixed %post script to add more
+ descriptive entries in htdig.conf, made cron script a config file,
+ updated to 3.2.0b2.
+
+ * contrib/conv_doc.pl, contrib/parse_doc.pl: Added comments to show
+ Warren Jones's updates in change history.
+
+Mon Feb 21 17:09:13 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htcommon/HtConfiguration.h, htcommon/conf_parser.yxx,
+ htlib/Configuration.h, htlib/Configuration.cc: split Add() method
+ into Add() and AddParsed(), so that only config attributes get parsed.
+ Use AddParsed() only in Read() and Defaults().
+
+Fri Feb 18 22:50:54 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htnet/Connection.h, htnet/Connection.cc: Renamed methods with
+ capitals to remove the need to use ::-escaped library calls.
+
+ * htnet/Transport.h, htnet/Transport.cc, htnet/HtHTTP.cc,
+ htdig/Images.cc: Fix code using Connection to use the newly
+ capitalized methods.
+
+Fri Feb 18 14:40:50 2000 Loic Dachary <loic at ceic.com>
+
+ * test/conf/access.conf.in: removed cookies. Not used and some
+ httpd are not compiled with usertrack.
+
+Wed Feb 16 12:15:08 2000 Vadim Chekan <vadim at etc.lviv.ua>
+
+ * htcommon/Makefile.am: replaced conf.tab.cc.h by conf_parser.h in
+ noinst_HEADERS
+
+ * htcommon/conf_parser.yxx,conf_parser.lxx,HtConfiguration.cc,
+ HtConfiguration.h: added copyright and Id:
+
+ * htcommon/cgi.cc(init): fixed bug: array must be freed with
+ delete [] buf, not just delete buf.
+
+Tue Feb 15 23:16:14 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htnet/HtHTTP.cc (isParsable): Remove application/pdf as a
+ default type--it is now handled through the ExternalParser
+ interface if at all.
+
+ * htcommon/defaults.cc: Remove pdf_parser attribute.
+
+ * htdoc/attrs.html, htdoc/cf_byname.html, htdoc/cf_byprog.html:
+ Regenerate using cf_generate.pl.
+
+ * htdig/Document.cc (getParsable): Remove PDF once and for all
+ (hopefully).
+
+ * htdig/ExternalParser.cc (parse): Ditto.
+
+ * configure.in: Remove check for PDF_PARSER.
+
+ * configure: Regenerate using autoconf
+
+ * htdig/Makefile.am: Remove PDF.cc and PDF.h.
+
+ * Makefile.in, */Makefile.in: Regenerate using automake --foreign
+
+Tue Feb 15 12:02:39 EET 2000 Vadim Chekan <vadim at etc.lviv.ua>
+
+ * htcommon/HtConfiguration.cc,HtConfiguration.h: fixed bug discovered
+ by Gilles. HtConfiguration was able to get info only from "url" and
+ "server" blocks.
+
+ * htcommon/conf_parser.yxx: deleted 1st parameter for new char[],
+ left over when realloc was replaced by new char[]. Removed a few unused
+ variable declarations.
+
+ * htcommon/Makefile.am: added -d flag to bison to generate
+ conf_parser.h template from conf_parser.yxx;
+ conf_lexer.lxx uses #include conf_parser.h;
+ conf.tab.cc.h removed.
+
+Sun Feb 13 21:19:04 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htcommon/defaults.cc: Get rid of uncoded_db_compatible since
+ the current DB format has clearly broken backwards compatibility.
+
+ * htsearch/Display.cc (Display), htnotify/htnotify.cc (main),
+ htmerge/docs.cc (convertDocs), htmerge/db.cc (mergeDB),
+ htdig/htdig.cc (main): Remove call to DocumentDB::setCompatibility().
+
+ * htcommon/DocumentDB.h (class DocumentDB): Remove
+ setCompatibility and related private variable.
+
+ * htcommon/DocumentDB.cc ([], Delete): Don't bother checking for
+ an unencoded URL, at this point all URLs will be encoded using
+ HtURLCodec.
+
+ * htdoc/attrs.html, htdoc/cf_byname.html, htdoc/cf_byprog.html:
+ Regenerate using cf_generate.pl.
+
+Sat Feb 12 21:29:20 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htcommon/HtSGMLCodec.cc (HtSGMLCodec): Always translate &quot;
+ &amp; &lt; and &gt;
+
+ * htcommon/defaults.cc: Remove translate_* and word_list
+ attributes since they're now no longer used.
+
+ * htdig/PDF.cc (parseNonTextLine): Fix bogus escape sequences
+ around Title parsing. Fixes PR#740.
+
+ * htdoc/attrs.html, htdoc/cf_byname.html, htdoc/cf_byprog.html:
+ Regenerate using cf_generate.pl.
+
+Fri Feb 11 11:41:36 2000 Loic Dachary <loic at ceic.com>
+
+ * htlib/Makefile.am: removed CFLAGS=-g (use make CXXFLAGS=-g all
+ instead).
+
+ * htdoc/install.html: specify that the header/lib install directories
+ are now prefix/include/htdig and prefix/lib/htdig.
+
+ * Makefile.am (distclean-local): use TESTDIR instead of deprecated
+ HTDIGDIRS.
+
+ * */Makefile.am: install libraries in prefix/lib/htdig and
+ includes in prefix/include/htdig. Just prepend pkg in front of
+ automake targets.
+
+ * include/Makefile.am: install htconfig.h
+
+Thu Feb 10 23:18:37 2000 Loic Dachary <loic at ceic.com>
+
+ * Connection.cc (Connection): set retry_value to 1 instead of
+ 0 as suggested by Geoff.
+
+Thu Feb 10 17:36:09 2000 Loic Dachary <loic at ceic.com>
+
+ * htdig/Document.cc: fix (String)->(char*) conversion warnings.
+
+ * htword/WordList.cc: kill Collect(WordSearchDescription) which
+ was useless and error prone.
+
+ * htword/WordDB.h (WordDBCursor::Get): small performance improvement
+ by copying values only if key found.
+
+ * htword/WordDB.h,WordList.cc: fix reference counting bug when
+ using Override (+1 even if entry existed). Turn WordDB.h return
+ values to be std Berkeley DB fashion instead of the mixture with
+ OK/NOTOK that was a stupid idea. This allows detecting Put errors
+ and handle them properly to fix the Override bug without performance
+ loss.
+
+ * test/conf/httpd.conf.in: comment out loading of mod_rewrite
+ since not everyone has it.
+
+Thu Feb 10 00:26:02 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htcommon/defaults.cc: Add new attribute "nph" to send out
+ non-parsed headers for servers that do not supply HTTP headers on
+ CGI output (e.g. IIS).
+
+ * htsearch/Display.cc (display): If nph is set, send out HTTP OK
+ header as suggested by Matthew Daniel <mdaniel at scdi.com>
+ (displaySyntaxError): Ditto.
+
+ * htdoc/attrs.html, htdoc/cf_byname.html, htdoc/cf_byprog.html:
+ Regenerate from current defaults.cc file.
+
+Thu Feb 10 00:21:58 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htdig/HTML.cc (do_tag): Treat <script></script> tags as noindex
+ tags, much like <style></style> as suggested by Torsten.
+
+Thu Feb 10 00:02:41 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * .version: Bump for 3.2.0b2.
+
+ * htcommon/defaults.cc: Add category fields for each
+ attribute. Though these are currently unused, they could allow the
+ documentation to be split into multiple files based on logical
+ categories and subcategories.
+
+Wed Feb 9 23:52:55 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htnet/Connection.cc (connect): Add alarm(timeout) ... alarm(0)
+ around ::connect() call to ensure this does timeout as appropriate
+ as suggested by Russ Lentini <rlentini at atl.lmco.com> to resolve
+ PR#762 (and probably others as well).
+ (connect): Add a retry loop as suggested by Wilhelm Schnell
+ <Wilhelm.Schnell at mn.man.de> to resolve PR#754.
+
+ * htnet/HtHTTP.cc (HTTPRequest): Add CloseConnection() when the
+ connection fails on open before returning from the method. Should
+ take care of PR#670 for htdig-3-2-x.
+
+Wed Feb 09 17:20:50 2000 Loic Dachary <loic at ceic.com>
+
+ * db/dist/Makefile.in (libhtdb.so): move dependent libraries
+ *after* the list of objects, otherwise it's useless.
+
+ * htword/WordKey.h (class WordKey): move #if SWIG around to
+ please swig (www.swig.org).
+
+ * htword/WordList.h (class WordList): allow SWIG to see Walk*
+ functions (#if SWIG).
+
+Wed Feb 9 09:21:00 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/Server.cc (robotstxt): apply more rigorous parsing of
+ multiple user-agent fields, and use only the first one.
+
+ * htlib/HtRegex.cc (set): apply the fix from Valdas Andrulis, to
+ properly compile case_sensitive expressions.
+
+Wed Feb 09 09:43:59 2000 CEST Gabriele Bartolini <g.bartol at comune.prato.it>
+
+ * htnet/HtHTTP.cc: changed "<<" to append() for content_length
+ assignment in ReadChunkedBody() function (as Gilles suggested)
+
+Tue Feb 08 10:54:08 2000 Loic Dachary <loic at ceic.com>
+
+ * db/dist/configure.in: Added AC_PREFIX_DEFAULT(/opt/www)
+ so that headers and libraries are installed in the proper
+ directory when no --prefix is given.
+
+Tue Feb 08 10:32:48 2000 Loic Dachary <loic at ceic.com>
+
+ * test/t_wordskip: copy $srcdir/skiptest_db.txt to allow running
+ outside the source tree.
+
+ * configure.in: use '${prefix}/...' instead of "$ac_default_prefix/..."
+ that did not carry the --prefix value.
+
+ * configure.in: run CHECK_USER and AC_PROG_APACHE if --enable-tests
+
+Mon Feb 07 17:40:47 2000 Loic Dachary <loic at ceic.com>
+
+ * htlib/htString.h (last): turn to const
+
+Mon Feb 07 14:05:37 2000 CEST Gabriele Bartolini <g.bartol at comune.prato.it>
+
+ * htnet/HtHTTP.cc: fixed a bug in ReadChunkedBody() function
+ regarding document size assignment (raised by Valdas Andrulis)
+
+Sun Feb 06 19:11:05 2000 Loic Dachary <loic at ceic.com>
+
+ * configure.in: Fix inconsistencies between default values
+ shown by ./configure and actual defaults.
+
+ * htdoc/install.html: change example version 3.1 to 3.2
+ Commented out warning about libguile.
+ Replace CONFIG variables by configure.in options.
+ Specify default value for each of them.
+ Replace (and move) make depend by automake (distributed
+ Makefiles do not include dependency generation)
+ Added section for running tests.
+ Added section on shared libraries.
+
+ * configure.in: use AM_CONDITIONAL for --enable-tests
+
+ * Makefile.am: use automake conditionals for subdir so
+ that make dist knows what to distribute, whether --enable-tests
+ is specified or not.
+
+ * db/Makefile.in: allow make dist to work outside the source
+ tree.
+
+Sat Feb 05 18:31:04 2000 Loic Dachary <loic at ceic.com>
+
+ * test/word.cc (SkipTestEntries): The fix of
+ WordList::SkipUselessSequentialWalking actually saves us
+ a few hops when walking lists of words.
+
+Fri Feb 04 17:28:32 2000 Loic Dachary <loic at ceic.com>
+
+ * htword/WordKey.cc,WordReference.cc,WordRecord.cc (Print): use
+ cerr instead of cout for immediate printing under debugger.
+
+Thu Feb 3 16:06:45 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/Document.cc (RetrieveLocal): fix bug that prevented local
+ filesystem digging, because max_doc_size was initialized to 0.
+ Now sets it to max_doc_size for current url.
+
+Thu Feb 3 12:36:56 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * installdir/Makefile.{am,in}: install mime.types as mime.types,
+ not as htdig.conf.
+
+ * htfuzzy/EndingsDB.cc (createDB): fix code to use MV macro in
+ system() command, not hard-coded "MV" string literal, and use
+ get() on config objects to avoid passing String objects to form().
+
+Wed Feb 2 19:44:33 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htlib/HtDateTime.cc (SetRFC1123): Strip off weekday, if present
+ and use LOOSE format.
+ (SetRFC850): Ditto.
+
+ * configure.in, configure: Add configure check for "mv."
+
+ * htfuzzy/Makefile.am: Use it.
+
+ * */Makefile.in: Regenerate using automake.
+
+ * htfuzzy/EndingsDB.cc (createDB): Use the detected mv, or
+ whatever is in the path to move the endings DB when they're
+ finished.
+
+Wed Feb 2 15:49:14 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/Document.cc (RetrieveLocal), htdig/Retriever.cc (GetLocal):
+ Fix compilation errors. Oops!
+
+Wed Feb 2 13:53:27 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/Retriever.cc (IsValidURL): fix problem with valid_extensions
+ matching failure when URL parameters follow extension.
+
+Wed Feb 2 13:29:48 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htlib/QuotedStringList.cc (Create): fix PR#743, where quoted string
+ lists didn't allow embedded quotes of opposite sort in strings
+ (e.g. "'" or '"'), and fix to avoid overrunning end of string
+ if it ends with backslash.
+
+Wed Feb 2 13:23:16 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/HTML.cc (ctor, parse, do_tag), htcommon/defaults.cc:
+ Add max_keywords attribute to limit meta keyword spamming.
+ * htdoc/attrs.html, htdoc/cf_byname.html, htdoc/cf_byprog.html:
+ Regenerate using cf_generate.pl.
+
+Wed Feb 2 12:57:40 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/Document.cc (RetrieveLocal), htdig/Document.h,
+ htdig/Retriever.cc (Initial, parse_url, GetLocal, GetLocalUser,
+ IsLocalURL, got_href, got_redirect), htdig/Retriever.h,
+ htdig/Server.cc (ctor), htdig/Server.h: Add in Paul Henson's
+ enhancements to local_urls, local_default_urls & local_default_doc.
+ * htcommon/defaults.cc: Document these.
+
+Wed Feb 02 10:14:57 2000 Loic Dachary <loic at ceic.com>
+
+ * htword/WordKeyInfo.h,WordKey.{cc,h}: fix overflow bug when 32
+ bits. For that purpose implement Outbound/Overflow/Underflow
+ methods in WordKey, MaxValue in WordKey/WordKeyInfo.
+ (WordKey::SetToFollowing) was FUBAR: overflow of field1 was tested
+ with the number of bits in the next field, and overflow was not
+ handled. Re-implemented.
+ (WordKey::Set) Change atoi to strtoul.
+ (WordList::SkipUselessSequentialWalking) was much too fucked up
+ to explain. Re-implemented.
+ (WordKey::Diff) Added as a support function of
+ SkipUselessSequentialWalking.
+ Implement consistent verbosity.
+
+ * htword/WordList.cc (operator >>): explicit error message when
+ insert failed, with line number.
+
+Wed Feb 2 00:11:03 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htdoc/RELEASE.html: Finish up with notes on all significant
+ new attributes.
+
+ * htdoc/FAQ.html, htdoc/where.html: Mention new 3.2.0b1 release
+ as a beta.
+
+ * contrib/README: Update to mention new scripts.
+
+ * installdir/mime.types: Add default Apache mime.types file for
+ systems that do not already have one.
+
+ * installdir/Makefile.am: Make sure it is installed by default.
+
+ * installdir/Makefile.in: Regenerate using automake.
+
+ * htcommon/defaults.cc: Add documentation for mime_types
+ attribute, remove currently unused image_alt_factor, and add
+ documentation for external_protocols.
+
+ * htdoc/attrs.html, htdoc/cf_byname.html, htdoc/cf_byprog.html:
+ Regenerate using cf_generate.pl.
+
+Tue Feb 1 10:24:19 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htsearch/parser.cc (score): fix up score calculations for
+ correctness and efficiency.
+
+Mon Jan 31 16:29:20 2000 Marcel Bosc <bosc at ceic.com>
+
+ * htword/WordBitCompress.cc: fixed endian bug in compression
+
+Sat Jan 29 21:14:03 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htsearch/parser.cc (score): Change config.Value (which returns
+ int) to config.Double to preserve accuracy of attributes.
+
+ * htcommon/defaults.cc: Updated documentation for attributes now
+ allowing regex, search_algorithms (for new fuzzy) and added
+ documentation for the overlooked remove_unretrieved_urls.
+
+ * htdoc/*.html: Updated copyright notice for 2000, changed footer
+ to use CVS's magic Date keyword. Regenerated documentation from
+ defaults changes.
+
+Sat Jan 29 16:32:08 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * contrib/htdig-3.1.4.spec, contrib/htdig-3.1.4-conf.patch: Remove
+ these since they don't apply to the 3.2.x releases.
+
+ * htfuzzy/Synonym.cc (openIndex): Change database format from
+ DB_BTREE to DB_HASH--no reason for the synonym database to be a
+ btree. This was probably overlooked when I switched the rest of
+ the fuzzy databases over to DB_HASH.
+
+Sat Jan 29 05:34:26 2000 Loic Dachary <loic at ceic.com>
+
+ * htword/WordKey.h (UnpackNumber): Very nasty bug. Optimization
+ dated Dec 29 broke endianness on Solaris. Restore previous version.
+
+Fri Jan 28 18:17:08 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htlib/Configuration.h (struct ConfigDefaults): Add version and
+ category fields for more accurate documentation.
+
+ * htcommon/defaults.cc: Add blank category fields and start
+ filling in version field. Killed modification_time_is_now attribute.
+
+ * htdig/Document.cc (Document): Kill attribute
+ modification_time_is_now since it can cause more harm than good.
+
+ * htnet/HtHTTP.cc (ParseHeader): Ditto.
+
+ * htdoc/cf_generate.pl: Added support for new version and category
+ fields. Currently category does nothing, but it could split the
+ documentation into categories.
+
+Sat Jan 29 01:37:45 2000 Loic Dachary <loic at ceic.com>
+
+ * .version: remove the trailing -dev
+
+Thu Jan 27 12:22:57 2000 Loic Dachary <loic at ceic.com>
+
+ * htword/WordList.cc: cdebug replaced by cerr. replace lverbose
+ by verbose > 2. Remove shutup.
+ (WordList): monitor = 0
+ (Open): create monitor only if wordlist_monitor = true
+ (Close): delete monitor if set, delete compressor if set
+
+ * htword/WordDBCompress.cc,WordList.cc: only activate monitoring code
+ if monitor is set. No interaction with the monitor is therefore possible
+ if wordlist_monitor is false.
+
+ * htword/WordMonitor.cc: remove useless test of wordlist_monitor (done by
+ WordList now).
+
+ * htword/WordDBCompress.cc (TestCompress): remove redundant debuglevel argument.
+
+ * htword/WordDBCompress.cc (WordDBCompress): init cmprInfo to 0
+
+ * db/include/db_cxx.h: Add get_mp_cmpr_info method
+
+ * htword/WordDBCompress.cc (WordDBCompress): set default debug level to 0
+
+ * htword/WordDB.h: CmprInfo returns the current CmprInfo and is non-static;
+ an overload sets CmprInfo if an argument is given.
+
+ * htword/WordDBCompress.h: new CmprInfo() method returns DB_CMPR_INFO object
+ for Berkeley DB database.
+
+ * htword/WordList.h: add compressor member, kill cmprInfo member.
+
+ * htword/WordList.cc:
+
+Wed Jan 26 20:05:33 2000 Loic Dachary <loic at ceic.com>
+
+ * htword/WordList.cc,htword/WordList.h: get rid of obsolete WordBenchmarking
+
+Wed Jan 26 9:14:32 2000 CEST Gabriele Bartolini <g.bartol at comune.prato.it>
+
+ * htcommon/defaults.cc: added "max_connection_requests".
+
+ * htdig/Retriever.cc: now manages the attribute above.
+
+Tue Jan 25 12:59:01 2000 Loic Dachary <loic at ceic.com>
+
+ * htsearch/Display.cc (setVariables): fixed
+ Display.cc:505: warning: multiline `//' comment
+
+Tue Jan 25 8:37:15 2000 CEST Gabriele Bartolini <g.bartol at comune.prato.it>
+
+ * htdig/Document.h: Added the "HtHTTP *GetHTTPHandler()" method, in
+ order to be able to access an HTTP object outside the Document class.
+ This is useful for the Server class, after the request for robots.txt.
+ We can inspect the response of a server and check if it supports
+ persistent connections.
+
+ * htdig/Server.cc: inside the constructor, persistent_connections var is
+ initialized to the configuration parameter value, instead of <true>.
+ Besides, after the request for robots.txt, it checks and sets
+ the attribute for persistent connections, depending on whether the
+ server supports them or not.
+
+ * htdig/Retriever.cc: modified the Start() method. Now the loop manages
+ HTTP persistent connections on a per-server basis. Indeed, it's a
+ Server object that decides if persistent connections are allowed on
+ that server or not (depending on configuration or capabilities of
+ the remote http server).
+
+Mon Jan 24 12:57:45 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htsearch/Display.cc(setVariables): Added double quotes around
+ default selection value in build_select_lists handling.
+
+Mon Jan 24 12:37:22 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htsearch/Display.cc(setVariables), htcommon/defaults.cc: Added
+ build_select_lists attribute, to generate selector menus in forms.
+ Added relevant explanations and links to selectors documentation.
+ * htdoc/hts_selectors.html: Added this page to explain this new
+ feature, plus other details on select lists in general.
+ * htdoc/hts_templates.html: Added relevant links to related attributes
+ and selectors documentation.
+ * htdoc/attrs.html, cf_by{name,prog}.html: reran cf_generate.pl
+
+Fri Jan 21 18:57:58 EET 2000 Vadim Chekan <vadim at etc.lviv.ua>
+
+ * htcommon/HtConfiguration.cc: added HtConfiguration::ParseString(char*)
+ method to allow the lexer to handle "include: ${var}/file.inc" construction
+
+ * htcommon/conf_lexer.lxx: fixed bug in handling of
+ "include: ${var}file.inc".
+
+Fri Jan 21 17:04:28 2000 Loic Dachary <loic at ceic.com>
+
+ * htword/WordList.cc (WalkFinish,WalkInit,WalkNextStep): fix typos in error messages
+ and misleading comment.
+
+ * htword/WordList.h,WordList.cc: move part of WalkInit in WalkRewind so that
+ we have a function to go back to the beginning of possible matches.
+
+Wed Jan 19 21:49:57 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htdig/HTML.cc (do_tag): Only add words for META descriptions,
+ keywords, and IMG ALT attributes if doindex is set.
+
+ * htcommon/DocumentRef.h: Added Reference_obsolete for documents
+ that should be removed (but haven't).
+
+ * htdig/Retriever.cc (parse_url): Flag documents that have been
+ modified as Reference_obsolete and update the database. Flag all
+ documents with various errors as something other than
+ Reference_normal, as appropriate--these probably should be pruned.
+
+ * htdig/Retriever.h: Get rid of GetRef() method--it's only used once!
+
+ * htsearch/Display.cc (display): Don't show DocumentRefs with
+ states other than Reference_normal--these documents have various
+ errors.
+
+ * htmerge/docs.cc: If a document has a state of Reference_obsolete, ignore it.
+
+ * htcommon/HtWordList.h, htcommon/HtWordList.cc (Skip): Change
+ MarkGone() to Skip() to emphasize that this document should be ignored.
+
+Wed Jan 19 14:11:51 2000 Loic Dachary <loic at ceic.com>
+
+ * htword/WordList.cc (SkipUselessSequentialWalking): return OK if skipping,
+ NOTOK if not skipping.
+
+ * htword/WordReference.h: remove useless Clear in WordReference(key, record)
+ constructor.
+
+ * htword/WordList.h,WordList.cc: Split Walk in three separate functions
+ WalkInit, WalkNext and WalkFinish. Much clearer. Fill the status field
+ of WordSearchDescription to have more information about the error condition.
+ Add found field to WordSearchDescription for WalkNext result. Add cursor_get_flags
+ and searchKeyIsSamePrefix fields to WordSearchDescription as internal state
+ information.
+
+ * htword/WordList.h,WordList.cc: WalkInit to create and prepare cursor,
+ WalkNext to move to next match
+ WalkNextStep to move to next index entry, be it a match or not
+ WalkFinish to release cursor.
+
+ * htword/WordList.h: WordSearchDescription::ModifyKey added to jump
+ while walking.
+
+ * htword/WordList.cc (WalkNext) : it is now legal to step without
+ collection or callback because search contains the last match (found
+ field) and it's therefore not useless.
+
+Mon Jan 17 12:15:45 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * contrib/htdig-3.2.0.spec: added sample RPM spec file for 3.2
+
+Sat Jan 15 11:53:35 2000 Loic Dachary <loic at ceic.com>
+
+ * htdb/htstat.cc,htdb/htdump.cc: remove useless -S option since
+ the page size is found in the header of the file.
+
+ * htdb/htstat.cc,htdump.cc,htload.cc: only call WordContext::Initialize
+ if -W flag specified.
+
+Fri Jan 14 18:39:12 2000 Marcel Bosc <bosc at ceic.com>
+
+ * htword/WordBitCompress.cc: speedup, VlengthCoder::code()
+ finds appropriate coding interval much faster
+
+Fri Jan 14 11:30:41 2000 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/Retriever.cc(IsValidURL): Fix problem with valid_extensions,
+ which got lost in the shuffle yesterday.
+
+Fri Jan 14 15:56:49 2000 Loic Dachary <loic at ceic.com>
+
+ * htword/WordType.cc,WordRecord.cc,WordKeyInfo.cc (Initialize): change
+ inverted test on instance (== instead of !=).
+
+ * htword/WordRecord.cc (WordRecordInfo): change inverted test on compare
+
+Fri Jan 14 14:24:39 2000 Loic Dachary <loic at ceic.com>
+
+ * htdig/htdig.cc,htmerge/htmerge.cc,htsearch/htsearch.cc: Use Initialize(defaults)
+ to load configuration file if provided.
+
+ * htword/WordDBCompress.cc (Compress): initialize monitor to null in
+ constructor and check if null before usage. Core dumped in htdb/htload.
+
+ * htword/WordContext.h (class WordContext): Add
+ Initialize(const ConfigDefaults* config_defaults = 0)
+ that probes configuration files. Useful when htword is used as a standalone library.
+
+Thu Jan 13 19:52:27 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htdig/Retriever.cc: Fix problem with valid_extensions when an
+ "extension" would include part of a directory path or server
+ name, as contributed by Warren Jones.
+
+Thu Jan 13 19:22:25 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htnet/Makefile.am, htnet/Makefile.in: Add HtFile to the build process.
+
+Thu Jan 13 18:58:03 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htnet/HtFile.h, htnet/HtFile.cc: New Transport classes
+ contributed by Alexis Mikhailov to allow file:// access.
+
+ * htdig/Document.h, htdig/Document.cc: Add logic to call HtFile
+ objects for URLs.
+
+ * htcommon/URL.cc: Don't remove a trailing index.html (removeIndex)
+ if the URL is a file://URL.
+
+Thu Jan 13 18:49:41 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * contrib/conv_doc.pl, contrib/parse_doc.pl: Replace "break" by
+ "last" for correct Perl syntax and additional cleanups and
+ simplifications as contributed by Warren Jones.
+
+Thu Jan 13 18:42:29 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htword/WordType.h, htword/WordType.cc: Implementation of new
+ methods IsDigit() and IsCntrl() as contributed by Marc Pohl
+ <marc.pohl at wdr.de>. Fixes some problems with 8-bit characters.
+
+Thu Jan 13 17:17:47 2000 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * ChangeLog.0, configure, configure.in, htfuzzy/Endings.cc,
+ htlib/String.cc, htlib/Configuration.cc,
+ htlib/QuotedStringList.cc, htlib/regex.c, htcommon/defaults.cc,
+ htdig/ExternalParser.cc, htdig/Retriever.h, htsearch/Display.cc,
+ include/htconfig.h.in installdir/htdig.conf: Merge in changes from
+ 3.1.x releases.
+
+ * htdoc/: Merge in documentation changes from 3.1.x releases.
+
+Thu Jan 13 20:12:42 2000 Loic Dachary <loic at ceic.com>
+
+ * htword/WordList.cc (Walk): close the cursor before returning. If
+ not doing that the cursor might be closed after the database is
+ closed, leading to double free of the cursor. Bad bug.
+
+Thu Jan 13 13:23:17 2000 Loic Dachary <loic at ceic.com>
+
+ * htword/WordContext.h (class WordContext): simplified a lot. WordContext is
+ no longer a repository for pointers of class instances. Only a place to call
+ Initialize for classes that have a single instance.
+
+ * htlib/HtWordType.cc: added to include definitions of function shortcuts for
+ WordType.
+
+ * htword/WordRecord.h,WordType.h,WordKeyInfo.h: implement homogeneous scheme to
+ handle unique instance of the class.
+ - constructor takes const Configuration& argument and init object with config
+ values
+ - static member instance
+ - static method Initialize the static member instance
+ - static method Instance returns the pointer in instance data member
+
+ * htword/WordRecord.cc: add constructor for WordRecordInfo, and Instance static
+ function. Add WORD_RECORD_INVALID to denote an uninitialized WordRecordInfo object.
+
+ * htword/WordKeyInfo.h: rename SetKeyDescriptionFromFile and SetKeyDescriptionFromString
+ to InitializeFromFile and InitializeFromString and implement them by calling Initialize.
+ rename SetKeyDescriptionRandom to InitializeRandom
+ rename Initialize(String& line) to GetNFields(String& line)
+ rename Initialize(int nfields) to Alloc(int nfields)
+
+ * htdig/htdig.cc,htmerge/htmerge.cc,htsearch/htsearch.cc,test/word.cc: replace
+ WordList::Initialize with WordContext::Initialize and run immediately after
+ config is read. Otherwise WordType fails to work and configuration value
+ extraction will fail.
+
+ * htmerge/htmerge.cc: move initialization
+
+ * test/conf/htdig.conf2.in: reorder so that it looks as much as possible like conf.in
+
+Thu Jan 13 12:33:46 2000 Loic Dachary <loic at ceic.com>
+
+ * htdb/htstat.cc,htdump.cc,htload.cc: set proper progname
+
+Wed Jan 12 20:02:26 2000 Loic Dachary <loic at ceic.com>
+
+ * htcommon/HtWordList.cc (Dump): Use Walk instead of Collect; otherwise it does not work.
+
+Wed Jan 12 19:38:33 2000 Loic Dachary <loic at ceic.com>
+
+ * htlib/HtDateTime.h (class HtDateTime): killed void SetDateTime(const int t)
+ because it causes problems when time_t is an int and was useless anyway.
+
+Wed Jan 12 13:31:45 2000 Loic Dachary <loic at ceic.com>
+
+ * htword/WordBitCompress.h: remove inline qualifier on check_tag1: it's not inline
+
+ * htword/WordKey.h: #define WORD_KEY_UNKNOWN_POSITION to -1. Remove default
+ argument to SetToFollowing so that it's more explicit when used with
+ WORD_KEY_UNKNOWN_POSITION.
+
+ * htword/WordKey.cc: change name of variable info0 to info
+
+ * htword/WordList.cc: use WordKey::Info instead of WordKeyInfo::Get as done
+ in WordKey.cc for consistency.
+
+ * htword/WordList.{cc,h},htword/WordDB.h: rename WordCursor to WordDBCursor
+ for consistency.
+
+ * htword/WordList.h: Kill the WordSearchDescription::Setup useless function
+
+ * htword/WordList.h: WordSearchDescription constructor now has straightforward
+ semantics.
+
+ * htword/WordList.h: Rename Search into Collect since it already existed, just
+ with a different prototype.
+
+Wed Jan 12 12:36:46 2000 Loic Dachary <loic at ceic.com>
+
+ * htword/WordList.h (class WordSearchDescription): add cursor member
+
+Tue Jan 11 19:33:44 2000 Marcel Bosc <bosc at ceic.com>
+
+ * htlib/HtVectorGeneric,htword: Fixed some warnings found
+ when compiling under FreeBSD
+
+Tue Jan 11 18:22:58 2000 Marcel Bosc <bosc at ceic.com>
+
+ * htlib/HtVectorGeneric.h: inlined functions Add and Allocate which
+ are critical to performance
+
+Tue Jan 11 12:18:47 2000 Marcel Bosc <bosc at ceic.com>
+
+ * htword/WordKey.h: fixed uninitialized memory read
+
+ * htword/WordBitCompress.cc: Fixed big number BUG
+ Fixed memory leak
+
+Tue Jan 11 09:37:36 2000 Loic Dachary <loic at ceic.com>
+
+ * htword/WordList.h: move operator << and operator >> to end of
+ functions declarations instead of data members.
+
+ * htword/WordList.h: added more comments on functions behaviour.
+
+ * htword/WordList.h: added #if SWIG for Perl interface
+
+Mon Jan 10 17:55:05 2000 Marcel Bosc <bosc at ceic.com>
+
+ * htword/WordDBPage: enhanced compression debugging output
+
+Mon Jan 10 09:07:19 2000 Loic Dachary <loic at ceic.com>
+
+ * WordContext.h,WordKey.h,WordList.h: Added #if SWIG for perl
+ interfaces. Remove InSortOrder, useless now that everything
+ is manipulated in sort order as far as the interface is concerned.
+
+ * WordKey.cc,WordList.cc: remove InSortOrder
+
+ * WordKey.h,WordRecord.h,WordReference.h: commented out the ascii
+ Set/Get for SWIG.
+
+ * WordKey.h: turn CopyFrom to public for those who don't want to
+ use operator =.
+
+ * WordKey.h: rename info -> Info and nfields -> NFields
+
+ * WordKey.h: remove int IsFullyDefined() const redundant with Filled
+
+Thu Jan 06 14:41:15 2000 Marcel Bosc <bosc at ceic.com>
+
+ * htword,all: Changed interface to overloaded Walk function that was
+ ambiguous on some compilers...
+
+Thu Jan 06 14:00:01 2000 Loic Dachary <loic at ceic.com>
+
+ * htword/WordList.h (class WordSearchDescription): rename setup to Setup
+
+ * htword/WordList.h (class WordBenchmarking): rename show to Show
+
+ * htword/WordRecord.{h,cc}, htword/WordReference.h, htword/WordList.h:
+ add comments, reorganize member functions for clarity.
+
+Thu Jan 06 12:01:47 2000 Marcel Bosc <bosc at ceic.com>
+
+ * htword/compression: Split WordDBCompress.* to WordDBCompress +
+ WordDBPage.*
+
+ * htword/WordBitCompress: renamed put/get to put_uint/get_uint. added get/put_uint_vl
+
+ * htword/compression: slightly modified the compression: this makes old databases
+ OBSOLETE: headers compress better. Changed Flags compress better and faster.
+
+ * htword/WordKey: added operator [] and Get/Set accessors
+
+ * htword: removed the obsolete --with_key configure option (KEYDESC)
+
+ * htword/WordMonitor: added monitor input
+
+Wed Jan 05 14:32:31 2000 Loic Dachary <loic at ceic.com>
+
+ * htword/WordKeyInfo.h (class WordKeyInfo ): if(encode) was if(sort)
+
+ * htword/WordKeyInfo.h: rename show to Show and nprint to Nprint
+
+ * htword/WordKeyInfo.h: move WORD_ISA from WordKey.h to WordKeyInfo.h,
+ rename WORD_ISA_String to WORD_ISA_STRING.
+
+ * htword/WordKey.h: rename FATAL_ABORT to WORD_FATAL_ABORT and errr to word_errr
+
+ * htword/WordKey.h: move private functions at bottom of class above data members
+ rename show_packed to ShowPacked
+
+ * htword/WordKey.cc: move WordKeyInfo::SetKeyDescriptionRandom from WordKey.cc
+ to WordKeyInfo.cc
+
+ * htword/WordKeyInfo.cc: add include htconfig.h
+
+Wed Jan 05 13:26:16 2000 Loic Dachary <loic at ceic.com>
+
+ * htdig/ExternalParser.cc (parse): use nocase_compare instead of mystrcasecmp to
+ suppress warnings. (char*)String for mystrncasecmp that has no equivalent in
+ the String class.
+
+ * htdig/Retriever.cc (IsValidURL): remove warning by (char*)url
+
+Wed Jan 05 11:54:19 2000 Loic Dachary <loic at ceic.com>
+
+ * htword/WordKey.h: kill obsolete comment and add suffix explanation at
+ the beginning of the file.
+
+ * htword/WordKey.h (class WordKey): rename copy_from and initialize to CopyFrom
+ and Initialize to fit naming conventions. Reorganize the methods to group them
+ in logical sets. Fix indenting. Comment each method.
+
+ * htword/WordKey.h (Clear): add kword.trunc()
+
+ * htword/WordKey.h: protect SetWord(const char *str,int len) because it opens
+ the door to all kinds of specific derivations. Should be
+ SetWord(String(foo, foo_length)) if not performance critical.
+
+Wed Dec 29 18:41:14 1999 Marcel Bosc <bosc at ceic.com>
+
+ * htlib/HtMaxMin: added max/min of arrays, added comments to
+ HtMaxMin. Added HtMaxMin.cc all these are used in htword
+
+ * htlib/HtTime.h: added comments. included portable time.h
+
+ * htlib/HtVectorGeneric.cc: added HtVector_double, HtVector_String
+
+ * htlib/HtVectorGeneric.h: inlined several methods, deactivated CheckBounds
+
+ * htlib/StringMatch.cc: removed #include "WordType.h", this made htlib dependent
+ on htword, which is not acceptable for a library
+
+ * htlib/HtWordType.h: this replaces the macros used in StringMatch.cc
+
+ * htlib/HtRandom.h: added tools for using random number
+ (this is used currently in tests)
+
+ * htword/WordBitCompress.cc: transferred max_v/min_v to htlib
+
+ * htword/WordBitCompress.cc: optimized put/get for better performance
+
+ * htword/WordMonitor: system for detailed monitoring of operation
+ and performance within htword
+
+ * htword/WordDBCompress: fixed compression for case of empty WordRecord
+
+ * htword/WordDBCompress: cleaned up some code added some comments
+
+ * htword/WordKeyInfo: split WordKey files into WordKey and WordKeyInfo files
+
+ * htword/WordContext: centralized global configuration into one class
+
+ * htword/WordKey: inserted randomized key/keydescription into WordKey classes
+ (this was previously used in several tests)
+
+ * htword/WordKey: optimized Compare, UnpackNumber for speed (these are
+ really speed critical)
+
+ * htword/WordRecord: is now configurable, type can be configured to "DATA" (htdig)
+ or "NONE" (for other uses)
+
+ * htword/WordType: changed macros to global functions to make it compatible
+ with cleanup in StringMatch. Integrated WordType to WordContext
+ configuration/Initialization
+
+ * htword/WordKeyInfo: fixed initialization from key description file
+
+Tue Dec 28 18:58:21 EET 1999 Vadim Chekan <vadim at etc.lviv.ua>
+
+ * htlib/String.cc: String::lowercase(), String::uppercase()
+ support for national characters added.
+
+ * htfuzzy/Prefix.cc: method "prefix" works now.
+
+Mon Dec 27 22:17:48 1999 Loic Dachary <loic at ceic.com>
+
+ * htdig/htdig.cc (main): change '\r\n' to "\r\n"
+
+ * Makefile.config,db/dist/Makefile.in: rename libdb to libhtdb to
+ prevent conflicts with installed libdb.
+
+ * db/dist/Makefile.in: do not install documentation nor binary
+ utilities (db_dump & al) since they are replaced by htdb binaries
+ (htdump & al).
+
+ * db/dist/Makefile.in (prefix): prepend $(DESTDIR) to prefix
+ to support make DESTDIR=/staging install for binary distribution
+ packages generation.
+
+ * configure.in: use AC_FUNC_ALLOCA to check for alloca. Used
+ in regex and test/dbbench.cc only but definitely a useful
+ feature to have.
+
+Thu Dec 23 11:10:24 1999 Marcel Bosc <bosc at ceic.com>
+
+ * htcommon/defaults.cc: set wordlist_cache_size default to 10Meg
+
+ * db/mp: removed some debugging messages
+
+ * htword/WordList.cc: added warning if no cache
+
+ * test/word.cc: added cache
+
+ * htlib/HtTime.h: added ifdefs for portable time.h sys/time.h
+
+Tue Dec 21 23:33:06 1999 Loic Dachary <loic at ceic.com>
+
+ * htdoc/attrs.html,cf_by*.html: regenerate to include
+ wordlist_wordkey_description attribute
+
+ * htcommon/Makefile.am: Add AM_LFLAGS = -L and AM_YFLAGS = -l to
+ prevent #line generation because it confuses the dependencies
+ generator of GCC if configure is run outside the source tree.
+
+ * configure.in: remove --with-key option. Not needed since
+ word description is now dynamic. It destroyed WordKey.h if
+ specified.
+
+ * htword/Makefile.am: remove commented lines for WordKey.h
+ generation.
+
+Tue Dec 21 18:18:01 1999 Marcel Bosc <bosc at ceic.com>
+
+ * htword: added code for benchmarking
+
+Mon Dec 20 17:59:15 1999 Marcel Bosc <bosc at ceic.com>
+
+ * WordKey: Made the key structure dynamic: Changing the
+ key structure used to imply recompiling the htword library.
+ This should not change anything in htdig.
+
+ * WordKey: numerical key fields are stored in an array of unsigned
+ ints instead of compile-time defined pools.
+
+ * WordKey.h: WordKey now needs copy operators. Setbits are stored
+ in sort order (used to be in encoding order)
+
+ * htword: word_key_info is now a pointer, had to change all references
+
+ * word.cc: Rewrote wordkey test for new dynamically
+ set key structure. The test randomly creates key structures
+ and tests them.
+
+ * test: adapted test files (simplifies things a lot)
+
+1999-12-21 Toivo Pedaste <toivo at ucs.uwa.edu.au>
+
+ * htlib/Dictionary.cc: Fix memory leak when destroying dictionary
+
+ * htlib/StringList.cc, htdig/Retriever.cc: Fix memory leak, not
+ the most elegant way but I'm not sure about the exact semantics
+ of StringList
+
+Mon Dec 20 21:59:03 1999 Loic Dachary <loic at ceic.com>
+
+ * htdb/{Makefile.am,err.c,getlong.c}: Fix mistake: err.c and
+ getlong.c contain C functions (declared in clib_ext) and
+ must be C compiled otherwise the prototype won't fit. Checking
+ db Makefiles, getlong.c and err.c are added to the list of objects
+ for each utility program. This guarantees that they won't conflict
+ with objects included in libdb.a.
+
+Sun Dec 19 20:04:42 1999 Loic Dachary <loic at ceic.com>
+
+ * htdb/{Makefile.am, err.cc}: add err.cc for portability
+ purposes.
+
+Fri Dec 17 18:04:09 1999 Loic Dachary <loic at ceic.com>
+
+ * Makefile.config: add PROFILING variable and document it. Designed
+ to enable profiling of htdig easily.
+
+ * */Makefile.am: add *_LDFLAGS = $(PROFILING) for every binary to
+ enable profiling, if specified.
+
+Thu Dec 16 17:16:33 1999 Loic Dachary <loic at ceic.com>
+
+ * htdb/*.cc: add -W option to activate htword specific compression.
+ Keep compatibility with zlib compression (-z only).
+
+Thu Dec 16 11:56:02 1999 Loic Dachary <loic at ceic.com>
+
+ * test/dbbench.cc: replace wrong strcpy with memcpy
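The entry does not say why strcpy was wrong in test/dbbench.cc. One common reason, assumed here purely for illustration, is that the data being copied is a fixed-length binary record that may contain zero bytes. The helper name `copy_record` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Copy exactly `len` bytes of a record, including any embedded '\0' bytes.
// strcpy would stop at the first '\0' and silently truncate binary data,
// and it would read past the end of a buffer that has no terminator at all.
inline void copy_record(char* dest, const char* src, std::size_t len) {
    std::memcpy(dest, src, len);
}
```

The caller has to know the record length up front, which is normally the case for database keys and values that carry an explicit size.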
+
+Wed Dec 15 15:04:39 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/htdig.cc(main): Handle list of URLs given on stdin, if
+ optional "-" argument given. (Uses >> operator below.)
+
+ * htlib/htString.h, htlib/String.cc: Added Alexis Mikhailov's String
+ input methods, readLine() and >> operator.
+
+Wed Dec 15 13:59:34 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/Retriever.cc: remove include of sys/stat.h, which is no
+ longer needed after hack removed from Need2Get(), and could pose
+ a problem on systems that need sys/types.h included first.
+
+Wed Dec 15 17:00:04 1999 Loic Dachary <loic at ceic.com>
+
+ * htword/WordDB.h: add inline keyword for portability
+
+ * htword/WordDB.h: add CmprInfo method to get object describing
+ compression scheme for Berkeley DB
+
+ * htdb: Add htdump, htload, htstat equivalent of db_dump
+ db_load and db_stat that know about htword specific compression
+ strategy.
+
+ * htword/WordDBCompress: add static to locally defined functions and
+ variables, remove unnecessary #define and #include from header.
+
+Tue Dec 14 21:56:57 EET 1999 Vadim Chekan <vadim at etc.lviv.ua>
+
+ * htcommon/conf_parser.lxx, htcommon/conf_lexer.cxx:
+ bcopy on Solaris is in strings.h, not in string.h. Added
+ check for #ifdef HAVE_STRINGS_H
+
+Tue Dec 14 19:18:22 1999 Marcel Bosc <bosc at ceic.com>
+
+ * WordBitCompress: code cleaned up and commented
+
+Tue Dec 14 18:32:21 1999 Loic Dachary <loic at ceic.com>
+
+ * htword/Word{Record,Reference,Key}: added a Get method to
+ convert the structure into its ascii string representation.
+ operator << now uses Get.
+
+Tue Dec 14 17:46:33 1999 Loic Dachary <loic at ceic.com>
+
+ * db/dist/Makefile.in (install): fix buggy test for libshared
+
+Tue Dec 14 14:10:28 1999 Loic Dachary <loic at ceic.com>
+
+ * htword/{WordKey,WordReference,WordRecord}: rework
+ the input methods (operator >>). Each class now has a Set function
+ to initialize itself from an ascii description and a Get function
+ to retrieve an ascii description of the object.
+
+ * htword/WordList: operator >> has a better and cleaner input loop
+ using StringList and String instead of char*.
+
+Tue Dec 14 12:06:24 1999 Marcel Bosc <bosc at ceic.com>
+
+ * WordDBCompress.cc : Added compression version checking
+
+Mon Dec 13 21:09:31 EET 1999 Vadim Chekan <vadim at etc.lviv.ua>
+
+ * htcommon/conf_parser.lxx, htcommon/conf_lexer.cxx:
+ Added #include <string.h> Without it failed to compile
+ on Solaris.
+
+Mon Dec 13 16:31:27 1999 Marcel Bosc <bosc at ceic.com>
+
+ * htword/WordBitCompress.cc : fixed bug that made compression
+ fail on big documents or a big number of URLs ...
+
+Mon Dec 13 13:49:35 1999 Loic Dachary <loic at ceic.com>
+
+ * htword/WordKey.h.tmpl: Added *_POSITION macro generation
+
+Mon Dec 13 11:51:50 1999 Marcel Bosc <bosc at ceic.com>
+
+ * htcommon/conf_parser.yxx: fixed several delete that should be delete []
+
+Sun Dec 12 17:14:00 EET 1999 Vadim Chekan <vadim at etc.lviv.ua>
+
+ * htcommon/conf_lexer.lxx, htcommon/conf_lexer.cxx:
+ national symbols are allowed in right part of expressions
+ (noted by Marcel Bosc).
+ Changed default behavior of flex from print unknown chars
+ on stdout to exit with error message.
+
+Sat Dec 11 17:34:03 EET 1999 Vadim Chekan <vadim at etc.lviv.ua>
+
+ * htdig/Retriever.cc,htdig/htdig.cc: "exclude_urls","bad_querystr"
+ "bad_extensions","valid_extensions","local_default_doc"
+ changed for new config.
+
+ * htdig/Server.cc: "server_max_docs","server_wait_time" changed for
+ new config.
+
+ * check for "limit_normalized" moved from Retriever::got_href and
+ Retriever::got_redirect to more appropriate Retriever::IsValidUrl
+
+Fri Dec 10 18:05:48 1999 Marcel Bosc <bosc at ceic.com>
+
+ * htword: checked for failed memory allocations in compression code
+
+Fri Dec 10 18:03:42 1999 Marcel Bosc <bosc at ceic.com>
+
+ * htword/WordList,htcommon/HtWordList.cc,htmerge/words.cc: cleaned up WordList::Walk()
+ function, change two occurrences of WordList::Walk in htdig files
+
+Fri Dec 10 17:40:22 1999 Marcel Bosc <bosc at ceic.com>
+
+ * htword/WordKey.cc (Compare): Fixed bug: compare used to compare chars and not
+ unsigned chars, this failed when non-ascii characters were used
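A minimal sketch of the kind of comparison this fix implies. It is not the actual WordKey::Compare; the function name `compare_bytes` and its signature are assumptions. On platforms where plain char is signed, a byte such as 0xE9 is negative and would sort before ASCII letters unless it is first converted to unsigned char.

```cpp
#include <cassert>
#include <cstddef>

// Compare two byte strings of equal length as unsigned values, so that
// bytes >= 0x80 sort after ASCII regardless of whether plain char is signed.
// Returns -1, 0 or 1, in the style of memcmp.
inline int compare_bytes(const char* a, const char* b, std::size_t len) {
    for (std::size_t i = 0; i < len; i++) {
        unsigned char ca = static_cast<unsigned char>(a[i]);
        unsigned char cb = static_cast<unsigned char>(b[i]);
        if (ca != cb) return ca < cb ? -1 : 1;
    }
    return 0;
}
```

Ordering bytes this way matches what memcmp does, which matters when the in-memory comparison has to agree with the sort order used by the underlying database.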
+
+Fri Dec 10 11:54:36 1999 Marcel Bosc <bosc at ceic.com>
+
+ * htcommon/defaults.cc : doc for wordlist_cache_size
+
+Thu Dec 09 17:07:47 1999 Marcel Bosc <bosc at ceic.com>
+
+ * htcommon/defaults.cc: added defaults for compression and DB configuration
+ parameters
+
+Thu Dec 09 16:47:54 1999 Loic Dachary <loic at ceic.com>
+
+ * db/dist/configure.in,Makefile.in: Added shared lib support
+ for linux only. Not enabled if not on linux.
+
+Thu Dec 09 15:07:11 1999 Loic Dachary <loic at ceic.com>
+
+ * acinclude.m4,db/dist/acinclude.mr: CHECK_ZLIB now fails if either
+ zlib.h or libz is not found.
+
+ * configure.in: do not test zlib.h
+
+ * db/db/db.c,db/mp/mp_fopen.c: added #ifdef HAVE_ZLIB so that
+ compilation works if zlib is not found
+
+ * htlib/.cvsignore: remove wrong *.cxx
+
+ * test/dbbench.cc: added #ifdef HAVE_ZLIB so that
+ compilation works if zlib is not found
+
+Thu Dec 09 13:25:45 1999 Marcel Bosc <bosc at ceic.com>
+
+ * test/Word.cc,t_wordlist,Makefile.am: upgraded tests
+ * htcommon/HtWordList.h: fixed Configuration/HtConfiguration problem
+
+Thu Dec 09 12:10:32 1999 Marcel Bosc <bosc at ceic.com>
+
+ * htword: Added the compression code:
+ * WordDBCompress: Classes for page specific compression code
+ * WordBitCompress: Classes for bitstreams and non-specific compression
+
+Thu Dec 9 12:09:51 EET 1999 Vadim Chekan <vadim at etc.lviv.ua>
+
+ * htcommon/HtConfiguration.cc: bug fix: sometimes
+ htConfiguration::Find(url,char*) returned empty values
+ even if there was something to return.
+
+Thu Dec 09 11:15:30 1999 Marcel Bosc <bosc at ceic.com>
+
+ * htlib/Configuration.cc (Read): Read is now a virtual function: the old one
+ for Configuration the new one (Vadim's ... with the parser) in HtConfiguration
+
+Thu Dec 09 11:01:22 1999 Loic Dachary <loic at ceic.com>
+
+ * acinclude.m4: upgrade AC_PROG_APACHE macro for
+ modules detection.
+
+ * test/conf/httpd.conf,test/test_functions.in,test/conf/Makefile:
+ use @APACHE_MODULES@ to accommodate various apache modules directory
+ flavors.
+
+Tue Dec 07 20:32:34 1999 Marcel Bosc <bosc at ceic.com>
+
+ * htdig: Split the Configuration class into Configuration
+ and HtConfiguration. All the HtConfiguration and the
+ configuration parsing (lex..) was moved to htcommon.
+ Configuration was replaced by HtConfiguration as needed
+
+Tue Dec 07 16:21:13 1999 Loic Dachary <loic at ceic.com>
+
+ * configure.in: added AM_PROG_LEX and AC_PROG_YACC
+
+ * htlib/Makefile.am: simply set conf_lexer.lxx and conf_parser.yxx,
+ automake knows how to handle these. The renaming is needed to avoid
+ conflicts in automake generated rules.
+
+Mon Dec 6 16:23:39 CST 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdoc/cf_generate.pl: added a bit of error checking for when it
+ can't fetch the config info, and made it more flexible for what it
+ allows as terminator.
+ * htcommon/defaults.cc: add default and description for authorization
+ attribute, and clean up external_protocols entry for cf_generate.pl.
+ * htdoc/attrs.html, cf_by{name,prog}.html: reran cf_generate.pl
+ * htdig/htdig.cc(main): set authorization parameter before Retriever
+ constructor is called, as it may initialize a Server. (Should complete
+ fix of PR#490.)
+
+Mon Dec 6 21:34:29 EET 1999 Vadim Chekan <vadim at etc.lviv.ua>
+
+ * htdig/Document.cc htdig/htdig.cc: "authorization" parameter
+ in config is added and is new config compatible.
+ New code hasn't got the PR#490 bug (doesn't authenticate robots.txt)
+
+Mon Dec 06 11:58:56 1999 Marcel Bosc <bosc at ceic.com>
+
+ * HtVectorGeneric.h: generic vectors, stl-free: this was originally a copy of
+ HtVector.h with Object * replaced by GType and some small changes.
+ It has been modified and checked to see if it all works ok.
+ You can build vectors of any type that has an empty constructor.
+ * HtVectorGenericCode.h: generic vectors, stl-free: implementation
+ (modified "copy" of HtVector.cc)
+ * HtVectorGeneric.cc: generic vectors: implementation for common types
+ * HtVector_int.h: generic vectors: declaration for the most common type
+ (and example of how to use)
+
+Sat Dec 4 23:49:18 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htfuzzy/Synonym.cc (createDB): Change declaration to match
+ Fuzzy::createDB(config), allowing the method to be called by
+ htfuzzy.
+
+ * htfuzzy/htfuzzy.cc (main): Add an error message if
+ fuzzy->createDB() comes back with an error.
+
+Sat Dec 4 15:38:34 EET 1999 Vadim Chekan <vadim at etc.lviv.ua>
+
+ * htnet/HtHTTP.cc, htnet/HtHTTP.h, htdig/Document.cc:
+ fixed proxy bug. GET command in HtHTTP included only the
+ path of the URL instead of the full URL when using a proxy.
+ HtHTTP::UseProxy(int) added.
+
+ * htdig/Document.cc: make "http_proxy" parameter
+ URL-dependent for new configuration.
+
+Fri Dec 03 14:57:13 1999 Marcel Bosc <bosc at ceic.com>
+
+ * BerkeleyDB: Compression code: added possibility to use
+ user-defined compression routines (the goal is to enable
+ the mifluz-specific DB page compression that obtains
+ higher compression ratios than generic zlib compression).
+ This involves the following changes in BerkeleyDB:
+ * BerkeleyDB/CompressionEnvironment: Adding a structure db_cmpr_info
+ in db_env that permits the db user to specify the external compression
+ routines and other information related to compression
+ * BerkeleyDB/CompressionEnvironment: Adding a cmpr_context structure
+ to DB_MPOOLFILE that stores information that compression needs
+ (the _weacmpr DB and the db_cmpr_info)
+ * BerkeleyDB/Compression: Needed to modify the compression
+ system (that is implemented in the BerkeleyDB memory pool) to permit
+ higher compression ratios and to use the compression environment
+
+Thu Dec 2 16:47:30 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htdig/Retriever.cc(parse_url): Use a static int to avoid
+ re-fetching local_urls_only from the config object.
+ (Initial, got_href, got_redirect): Try to get the local filename
+ for a server's robots.txt file and pass it along to the newly
+ generated server.
+
+ * htdig/Server.cc(ctor): Retrieve the robots.txt file from the
+ filesystem when possible and respect the local_urls_only option.
+
+ * htdig/Server.h: Change type of local_robots_file to String* to
+ better match Retriever::GetLocal().
+
+Thu Dec 02 16:24:27 1999 Loic Dachary <loic at ceic.com>
+
+ * htword/WordReference.cc,WordKey.cc,WordRecord.cc (Print): Add function
+ to ease printing from Perl.
+
+Thu Dec 02 16:06:29 1999 Loic Dachary <loic at ceic.com>
+
+ * htword/WordReference.h (WORD_FILLED): remove
+ unused WORD_FILLED and WORD_PARTIAL macros
+
+Wed Dec 01 19:18:42 1999 Loic Dachary <loic at ceic.com>
+
+ * htword/WordKey.h.tmpl,WordRecord.h,WordReference.h,
+ WordList.h: Added #ifndef SWIG for
+ www.swig.org sake.
+
+Wed Dec 1 19:47:20 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htlib/HtRegex.cc, htlib/HtRegex.h (set*): Add a case_sensitive
+ flag which defaults to insensitive. This better mirrors the
+ StringMatch class.
+
+ * htcommon/URL.cc(signature): Make the signature a proper URL to
+ the base of the server.
+
+ * htdig/Server.h: Add IsDead() methods to query the status of the
+ server, as well as an IsDisallowed() method to query whether a URL
+ is forbidden by the robots.txt rules. Change _disallow to HtRegex.
+
+ * htdig/Server.cc(ctor): Only retrieve the robots.txt file if this
+ is an http or https server.
+ (robotstxt): Use the proper HtRegex method for setting the pattern.
+ (push): Remove logic checking the _disallow patterns. This is now
+ done by the Retriever object.
+
+ * htcommon/defaults.cc: Add new attribute "local_urls_only" which
+ defaults to false, which dictates whether retrieval should revert
+ to another method if RetrieveLocal() fails.
+
+ * htdig/Retriever.cc(parse_url): Check to see if the server is
+ dead before calling the Retrieve() method. Notify the server
+ object if a connection fails. Also respects the new
+ local_urls_only attribute as described above.
+ (IsValidURL): Check the server's IsDisallowed() method to see if
+ the robots.txt forbids this URL.
+
+ * htdoc/THANKS.html: Updated to reflect current contributions, etc.
+
+ * README: Update to mention version 3.2.0b1.
+
+Wed Dec 1 17:05:48 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/Retriever.cc(GetLocal): Fix error in GetLocalUser() return
+ value check, as suggested by Vadim.
+
+Wed Dec 1 15:57:09 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * contrib/conv_doc.pl: Added a sample external converter script.
+
+Mon Nov 29 23:19:35 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htdig/Retriever.cc, htdig/Retriever.h, htdig/Server.cc,
+ htdig/Server.h: forward-ported patch provided by Alexis Mikhailov
+ <alexis at medinf.chuvashia.su> and Gilles's for cleaning up
+ IsLocal/GetLocal. Makes local digging persistent, even when HTTP
+ server is down.
+
+Mon Nov 29 22:35:06 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * test/url.cc: New test for URL class.
+
+ * test/url.parents: Base URLs for parsing.
+
+ * test/url.children: Derived relative URLs for testing.
+
+ * test/Makefile.am, test/Makefile.in: Add the above for building.
+
+ * htcommon/URL.cc: A variety of bug fixes (some hacks), especially
+ for file:// and user@host URLs.
+
+Sun Nov 28 00:35:59 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * .version: Bump to 3.2.0b1-dev.
+
+Sat Nov 27 20:23:14 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htdig/ExternalTransport.h, htdig/ExternalTransport.cc: New class
+ to allow external scripts to handle transport methods.
+
+ * contrib/handler.pl: Example handler using the program 'curl' to
+ handle HTTP or HTTPS transactions.
+
+ * htcommon/defaults.cc: Add new configuration option
+ 'external_protocols' as a list of protocols and scripts to handle
+ them. Documentation currently needs to be written.
+
+ * htdig/Document.h, htdig/Document.cc(Retrieve): Call
+ ExternalTransport::canHandle to establish which protocols are
+ supported by handler scripts and then create an appropriate
+ transport object.
+
+ * Makefile.in, htdig/Makefile.am, htdig/Makefile.in: Add
+ dependencies for ExternalTransport class.
+
+ * htnet/HtHTTP.h, htnet/HtHTTP.cc, htnet/Transport.h,
+ htnet/Transport.cc: Move _location field from HtHTTP_Response to
+ Transport_Response to allow other subclasses to use it. Similarly,
+ move NewDate and RecognizeDateFormat to Transport.
+
+Fri Nov 26 17:07:52 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/HTML.cc(HTML & do_tag): add code to turn off indexing between
+ <style> and </style> tags.
+
+Fri Nov 26 15:56:47 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htsearch/Display.cc(setVariables): added Alexis Mikhailov's fix
+ to check the number of pages against maximum_pages at the right time.
+ * htlib/String.cc(write): added Alexis Mikhailov's fix to bump up
+ pointer after writing a block.
+
+Wed Nov 24 15:10:05 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * installdir/htdig.conf: Add bad_extensions to make it more obvious to
+ users how to exclude certain document types.
+
+Tue Nov 23 19:29:37 CST 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htnotify/htnotify.cc(send_notification): apply Jason Haar's fix
+ to quote the sender name "ht://Dig Notification Service".
+
+Tue Nov 23 19:46:00 EET 1999 Vadim Chekan <vadim at etc.lviv.ua>
+
+ * conf.tab.cc.h conf.l.cc conf.tab.cc
+ Added files pre-generated from conf.y, conf.l
+
+Sun Nov 21 18:26:21 EET 1999 Vadim Chekan <vadim at etc.lviv.ua>
+
+ * htdig/Document.cc: "max_doc_size" supports new
+ configuration and is URL-dependent now.
+
+Sun Nov 21 17:06:50 EET 1999 Vadim Chekan <vadim at etc.lviv.ua>
+
+ * New config parser committed. htlib/(Makefile.am,Makefile.in),
+ htlib/Configuration.cc, htlib/Configuration.h
+ htlib/(conf.y, conf.l) added.
+
+Fri Nov 12 14:17:37 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htcommon/cgi.cc(init): Fix bug in reading long queries via POST
+ method (PR#668).
+
+Wed Nov 10 15:34:04 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htsearch/Display.cc(setVariables & createURL),
+ htsearch/htsearch.cc(main), htdoc/hts_templates.html: handle keywords
+ input parameter like others, and make it propagate to followups.
+
+Wed Nov 10 15:16:57 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/Retriever.cc: Fix PR#688, where htdig goes into an infinite
+ loop if an entry in local_urls (or local_user_urls) is missing a '='
+ (or a ',').
+
+ * htcommon/defaults.cc: removed vestigial references to MAX_MATCHES
+ template variables in search_results_{header,footer}.
+ * htdoc/attrs.html, cf_by{name,prog}.html: reran cf_generate.pl
+
+ * htdoc/hts_form.html: add disclaimer about keywords parameter not
+ being limited to meta keywords.
+
+ * htdoc/meta.html: add description of "keywords" meta tag property.
+ add links to keywords_factor & meta_description_factor attributes.
+
+1999-11-10 Toivo Pedaste <toivo at ucs.uwa.edu.au>
+
+ * htdig/Retriever.cc : Ignore SIGPIPEs with persistent connections
+
+ * htnet/HtHTTP.cc : Fix buffer overrun reading chunks
+
+ * htdig/Document.cc : Make redirects work
+
+ * htdig/Retriever.cc : Make valid URL checks apply to initial URLs,
+ particularly those from a previous run
+
+ * htlib/Dictionary.cc : Fix memory deallocation error
+
+
+Tue Nov 02 13:44:57 1999 Marcel Bosc <bosc at ceic.com>
+
+ * htsearch/Display.cc (setVariables): parentheses missing around ternary
+ operator : confusion in priority with <<.
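The precedence problem named here can be illustrated with a short sketch. The function `page_label` and its wording are hypothetical and are not taken from Display.cc. Because `<<` binds tighter than `?:`, an expression such as `out << flag ? "a" : "b"` parses as `(out << flag) ? "a" : "b"`, which writes the flag and discards both strings.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Format "<n> page" or "<n> pages". The parentheses around the ternary are
// required: without them the stream insertion would be evaluated first and
// its result used as the condition of the ternary.
inline std::string page_label(int pages) {
    std::ostringstream out;
    out << pages << " " << (pages == 1 ? "page" : "pages");
    return out.str();
}
```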
+
+Tue Nov 02 13:33:50 1999 Marcel Bosc <bosc at ceic.com>
+
+ * htsearch/Display.cc (hilight): changed static char * (!!) to const string;
+ the static char * was evaluated before the configuration was loaded, so config
+ had no effect, and it required an unnecessary conversion
+
+Tue Nov 02 11:45:49 1999 Marcel Bosc <bosc at ceic.com>
+
+ * htword/WordKey.cc : Cleaned up obsolete code now using *InSortOrder fcts
+ and WordKeyInfo.sort[]
+ * htword/WordKey : Added FirstSkipField :
+ find first field that must be checked for skip
+ * htword/WordKey (PrefixOnly): now returns OK/NOTOK, fixed bug which
+ made Walk loop over the whole db if the searchkey just had
+ the "word" field defined
+ * htword/WordKey.cc (Unpack): had forgotten to call SetDefinedWordSuffix
+ * htword/WordKey.cc (operator >>): added check for very very long words
+ (even if this should never happen)
+ * htword/WordKey.cc (operators << >>): added <UNDEF> word suffix handling
+ * htword/WordKey.h : Filled() did not check for WordSuffix
+ * htword/WordKey.h : added WordKey::ExactEqual
+ * htword/WordKey.h (IsDefinedWordSuffix): fixed bad flag check
+ * htword/WordList : Removed all obsolete HTDIG_WORDLIST flags: only
+ two remain : COLLECTOR and WALKER the rest is now specified by the searchKey
+ removed action arg to WordList::Collect()
+ * htcommon/HtWordList.cc,htmerge/words.cc : changed flags in calls to WordList::Walk
+ * htword/WordList.cc : skip now deals with the SuffixUndefined case
+
+Fri Oct 29 17:13:21 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdoc/cf_generate.pl: now updates last modified date in attrs.html
+ * htdoc/attrs.html: reran cf_generate.pl
+
+Fri Oct 29 15:28:22 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htsearch/Display.cc(setVariables & hilight): added Sergey's idea
+ for start_highlight, end_highlight & page_number_separator attributes.
+ * htcommon/defaults.cc: added & documented these.
+ * htdoc/attrs.html, cf_by{name,prog}.html: reran cf_generate.pl
+
+Thu Oct 28 13:06:23 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/ExternalParser.cc: added support for external converters
+ as extension to external_parsers attribute.
+ * htcommon/defaults.cc: Updated external_parsers with new description
+ and examples of external converters.
+
+Thu Oct 28 12:52:28 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htcommon/defaults.cc: Updated programs lists for *_factor, so they
+ all refer to htsearch and not htdig. Added htsearch to programs lists
+ for translate_*. img_alt_factor & url_factor not defined yet because
+ they're still not used in htdig/htsearch.
+
+Wed Oct 27 15:53:36 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htcommon/defaults.cc: added descriptions & examples for
+ doc_excerpt, heading_factor, max_descriptions, minimum_speling_length,
+ regex_max_words, use_doc_date, valid_extensions. Added references
+ to these elsewhere in document as appropriate. Removed -pairs option
+ from pdf_parser default (again). Minor changes to noindex_start & end,
+ and changed example for modification_time_is_now. Corrected references
+ to heading_factor_[1-6].
+ * htdoc/attrs.html, cf_by{name,prog}.html: reran cf_generate.pl
+
+Wed Oct 27 13:32:50 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdoc/cf_generate.pl: changed formatting of output to more closely
+ match format of old attrs.html (to make diff'ing easier),
+ and fixed handling of pdf_parser default to strip quotes.
+ * htcommon/defaults.cc: oops, fixed typo in url_part_aliases example.
+ * htdoc/attrs.html, cf_by{name,prog}.html: reran cf_generate.pl
+
+Wed Oct 27 18:24:36 1999 Loic Dachary <loic at ceic.com>
+
+ * htdoc/cf_generate.pl: fixed wrong target for cf_byprog, escape
+ HTML chars <>&'" for default values.
+
+Wed Oct 27 10:21:18 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htcommon/defaults.cc: restored 2nd example for url_part_aliases
+
+Tue Oct 26 16:28:29 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htcommon/defaults.cc: corrected descriptions for allow_in_form,
+ search_results_header, noindex_start, noindex_end. Also fixed a
+ few small typos & formatting errors here & there in descriptions
+ and examples.
+
+Tue Oct 26 16:01:22 1999 Loic Dachary <loic at ceic.com>
+
+ * htword/Makefile.am: rm WordKey.h instead of chmod to cope with
+ nonexistent WordKey.h
+
+Tue Oct 26 10:54:52 1999 Loic Dachary <loic at ceic.com>
+
+ * htcommon/default.cc: fixed all inconsistencies reported by Gilles.
+
+Mon Oct 25 11:42:13 1999 Marcel Bosc <bosc at ceic.com>
+
+ * htword/ word.cc,t_wordskip,skip_db.txt: Added test for *Skip Speedup*
+ * htword/ WordList: Added tracing of Walk() for debugging purposes
+
+Fri Oct 22 18:22:00 1999 Marcel Bosc <bosc at ceic.com>
+
+ * htword/ WordList.cc,WordKey: Added a defined/undefined flag saying
+ whether a search key's word is a prefix or not: WORD_KEY_WORDSUFFIX_DEFINED.
+ Reduces code size and makes it much easier to understand.
+ * htword/ WordList,WordReference,WordKey: Added input output streams for
+ WordList,WordReference,WordKey
+
+Wed Oct 20 16:47:52 1999 Marcel Bosc <bosc at ceic.com>
+
+ * htword/ WordKey,Makefile.am,WordCaseIsAStatements.h: for readability
+ replaced the switch ... #ifdef ..STATEMENT().... sequence that appeared many times
+ with an include file: WordCaseIsAStatements.h
+
+ * htword/ WordKey: WordKeyInfo: duplicated all of the fields structure into
+ sort structure, for fast access without cross referencing and for simplifying code
+ (required change of perl in template WordKey.h.tmpl)
+
+ * htword/ WordList: *Skip Speedup* added a speedup to avoid wasting time
+ by sequentially walking through useless entries. See function
+ SkipUselessSequentialWalking() for an example and more info.
+
+ * htword/ WordKey.h,WordKey.cc: Changed Set,Unset,IsSet WordKey accessors' names to:
+ SetDefined,Undefined,IsDefined. (easier to read and avoids naming conflicts)
+
+ * htword/ WordKey: added generic numerical accessors for accessing
+ numerical fields in WordKey (in sorted order): GetInSortOrder,SetInSortOrder
+
+ * htword/ WordKey,word_builder.pl: added a MAX_NFIELDS constant, that specifies
+ a maximum number of fields that a WordKey can have. Sanity check in word_builder.pl.
+
+ * htword/ word_builder.pl: enforced word sort order to ascending
+
+ * htword/ WordList: added a verbose flag using config."wordlist_verbose"
+
+Tue Oct 19 18:36:42 1999 Loic Dachary <loic at ceic.com>
+
+ * htword/WordType.h: const accessors to wtype and config
+
+Tue Oct 19 13:10:47 1999 Loic Dachary <loic at ceic.com>
+
+ * acconfig.h: remove unnecessary VERSION (redundant)
+
+Tue Oct 19 11:32:38 1999 Loic Dachary <loic at ceic.com>
+
+ * db/Makefile.in,db/dist/Makefile.in: install db library so
+ that external applications can be linked.
+
+Tue Oct 19 10:57:27 1999 Loic Dachary <loic at ceic.com>
+
+ * configure.in: add --with-key to specify alternate to htword/word.desc
+
+ * configure.in: htword is done before htcommon to prevent unnecessary
+ recompilation because WordKey.h changes.
+
+ * htword/Makefile.am: use @KEYDESC@
+
+Tue Oct 19 10:38:41 1999 Loic Dachary <loic at ceic.com>
+
+ * test/word.cc use TypeA instead of DocID and the like
+
+Mon Oct 18 17:21:34 1999 Loic Dachary <loic at ceic.com>
+
+ * Makefile.config: AUTOMAKE_OPTIONS = foreign
+
+Mon Oct 18 11:40:17 1999 Marcel Bosc <bosc at ceic.com>
+
+ * htword/ WordList.cc (Walk): fixed bug in Walk: if flag HTDIG_WORDLIST was set
+ then data was uninitialized in loop
+
+Fri Oct 15 18:52:03 1999 Marcel Bosc <bosc at ceic.com>
+
+ * htdig/Document.h (class Document): added const to:
+ Transport::DocStatus RetrieveLocal(HtDateTime date, const String filename);
+
+Fri Oct 15 17:46:23 1999 Loic Dachary <loic at ceic.com>
+
+ * acinclude.m4,configure.in: modified AC_APACHE_PROG to detect
+ version number and control it.
+
+ * test/conf/*.in: patch to fit module loading or not, accommodate
+ various installation configurations.
+
+ * test/test_functions.in: More portable call to apache.
+
+Fri Oct 15 12:55:47 1999 CEST Gabriele Bartolini <g.bartol at comune.prato.it>
+
+ * htdig/Document: added the management of 'persistent_connections',
+ 'head_before_get', 'max_retries' configuration attributes.
+
+Fri Oct 15 12:54:11 1999 CEST Gabriele Bartolini <g.bartol at comune.prato.it>
+
+ * test/testnet.cc: added the option '-m' for setting the max size
+ of the document.
+
+Fri Oct 15 12:48:49 1999 CEST Gabriele Bartolini <g.bartol at comune.prato.it>
+
+ * htdig/Server: added a flag for persistent connections.
+ It's set to true if the Server allows persistent connections.
+ It should be used when retrieving a document.
+
+Fri Oct 15 12:45:42 1999 CEST Gabriele Bartolini <g.bartol at comune.prato.it>
+
+ * defaults.cc: added the configuration attributes 'persistent_connections',
+ 'max_retries' and 'head_before_get'. Their default values are
+ respectively true, 3, false.
+
+Fri Oct 15 12:35:51 1999 CEST Gabriele Bartolini <g.bartol at comune.prato.it>
+
+ * HtHTTP.cc: handling of incomplete stream reading with persistent
+ connections (it occurs when max_doc_size is lower than the real
+ content length of the document, or when a document is not parsable
+ and we asked for it with a GET call).
+
+ * Transport: _host variable is treated as a String, as Loic suggested.
+
+Fri Oct 15 12:11:23 1999 Marcel Bosc <bosc at ceic.com>
+
+ * Added README to htword
+
+Thu Oct 14 11:29:35 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htlib/mktime.c, htlib/regex.c, htlib/regex.h, htlib/strptime.c:
+ Updated with latest glibc versions. Merging from glibc sources may
+ have introduced bugs, so this is the last merge before htdig-3.2.0b1.
+
+Thu Oct 14 13:09:32 1999 CEST Gabriele Bartolini <g.bartol at comune.prato.it>
+
+ * htnet/Transport: added statistics for open and close of connections
+ and changes of servers.
+ Fixed a bug in the SetConnection method, regarding the host comparison.
+ Added a method for showing the statistics on a given channel.
+
+ * htnet/HtHTTP: More debug info available.
+ Added a method for showing the statistics on a given channel.
+
+ * test/testnet.cc: now reflects the changes above.
+
+Wed Oct 13 13:35:42 1999 CEST Gabriele Bartolini <g.bartol at comune.prato.it>
+
+ * htdig/Document.h: added an HtHTTP pointer to the class.
+
+ * htdig/Document.cc: Transport and HtHTTP initialization methods
+ inside the Document constructor. The class destructor now calls
+ only the HtHTTP destructor (not the Transport destructor).
+ Modified the Retrieve method.
+
+ * htdig/Server.h: _last_connection is now an HtDateTime object.
+
+ * htdig/Server.cc: modified the constructor and the delay method.
+
+ * htdig/Retriever.cc: modified the parse_url function in order to manage
+ all the Document status messages coming from the Transport class.
+ Also modified the method for not found URLs for managing the no_port
+ status.
+
+Tue Oct 12 10:12:10 1999 Loic Dachary <loic at ceic.com>
+
+ * install headers and libraries so that htdig libraries may be used by external programs
+
+ * htword/WordList.cc,WordType.cc: add comments about config parameters used.
+
+Fri Oct 8 09:35:30 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htlib/HtDateTime.cc (SetFTime): Change buffer argument to const
+ char* to prevent problems passing in const buffers.
+
+ * htnet/HtHTTP.h: Change SetUserAgent to take a const char* to
+ prevent problems passing in const parameters.
+
+ * htdig/Document.h, htdig/Document.cc(): Use Transport class for
+ obtaining documents. Remove duplication of declarations
+ (e.g. DocStatus).
+
+ * htdig/Retriever.cc: Adapt switch statements from
+ Document::DocStatus to Transport::DocStatus.
+
+ * htdig/Server.cc: Use Document::Retrieve instead of RetrieveHTTP.
+
+Fri Oct 08 16:35:16 1999 Loic Dachary <loic at ceic.com>
+
+ * test/t_htnet: succeed if timeout occurs. It was the opposite.
+
+ * configure.in: AC_MSG_CHECKING(how to call getpeername?) add missing
+ comma at end for header spec block.
+
+Fri Oct 08 14:42:47 1999 Loic Dachary <loic at ceic.com>
+
+ * Fix all warnings reported by gcc-2.95.1 related to string
+ cast to char*.
+
+Fri Oct 08 14:04:21 1999 Loic Dachary <loic at yoda.ceic.com>
+
+ * htlib/Configuration,ParsedString,Dictionary: change char* to String
+ where possible.
+
+ * Fix a lot of warnings reported by gcc-2.95.1 related to string
+ cast to char*.
+
+ * Completely disable exception code from db.
+
+Fri Oct 08 13:44:32 1999 CEST Gabriele Bartolini <g.bartol at comune.prato.it>
+
+ * HtHTTP.cc: fixed a little bug in setting the modification time
+ if not returned by the server.
+
+Fri Oct 08 11:30:53 1999 CEST Gabriele Bartolini <g.bartol at comune.prato.it>
+
+ * HtHTTP.cc: better management of connection failures return values.
+ * Transport.h: added Document_no_connection and
+ Document_connection_no_port enum values.
+ * testnet.cc: management of above changes.
+
+Fri Oct 08 11:27:31 1999 CEST Gabriele Bartolini <g.bartol at comune.prato.it>
+
+ * configure.in: modified getpeername() test.
+
+Fri Oct 08 10:28:15 1999 Loic Dachary <loic at ceic.com>
+
+ * htdig/Retriever.cc (IsValidURL): test return value of
+ ext = strrchr(url, '.');
+
+ * htword/WordRecord.h: initialize info member to 0 in constructor and
+ Clear.
+
+ * htlib/Configuration: char* -> String to all functions. Resolve
+ warnings.
+
+Thu Oct 07 16:19:46 1999 Loic Dachary <loic at ceic.com>
+
+ * htnet/HtHTTP.cc (ReadChunkedBody): use append instead of
+ << because buffer is *not* null terminated.
+
+ * htnet/Transport.cc (Transport): initialize _port and _max_document_size
+ otherwise comparison with undefined value occurs.
+
+Thu Oct 07 16:34:21 1999 CEST Gabriele Bartolini <g.bartol at comune.prato.it>
+
+ * HtHTTP.cc: call FinishRequest every time a value is returned
+ in HTTPRequest().
+ * testnet.cc: improved with more statistics and connections timeouts
+ control.
+
+Thu Oct 07 12:53:12 1999 CEST Gabriele Bartolini <g.bartol at comune.prato.it>
+
+ * configure.in: modified getpeername() test function with
+ AC_LANG_CPLUSPLUS instead of AC_LANG_C.
+
+Thu Oct 07 11:56:52 1999 CEST Gabriele Bartolini <g.bartol at comune.prato.it>
+
+ * HtHTTP.cc : fixed bug of double deleting _access_time
+ and _modification_time objects in ~HtHTTP().
+
+Thu Oct 07 10:17:22 1999 Loic Dachary <loic at ceic.com>
+
+ * htword/WordRecord.h: change (const char*) cast to (char*)
+
+ * htword/WordKey.h.tmp: fix constness of accessors, const accessor
+ returns const ref. Prevents unnecessary copies.
+
+Wed Oct 6 23:31:50 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htnet/Connection.h, htnet/Connection.cc: Merge in io
+ class. Connection class was the only subclass of io.
+
+ * Makefile.in, htlib/Makefile.am, htlib/Makefile.in: Update for
+ removed io class.
+
+ * htdig/ExternalParser.cc: Add more verbose flags for errors.
+
+Wed Oct 06 14:56:34 1999 Loic Dachary <loic at ceic.com>
+
+ * htnet/Connection.cc (assign_server): use free, not delete
+ on strdup allocated memory.
+
+ * htcommon/URL.cc (URL): set _port to 0 in constructors.
+
+Wed Oct 06 12:08:38 1999 Loic Dachary <loic at ceic.com>
+
+ * Move htlib/HtSGMLCodec.* to htcommon to prevent
+ crossed interdependencies between htlib and htcommon
+
+Wed Oct 06 12:07:32 1999 Gabriele Bartolini <g.bartol at comune.prato.it>
+
+ * HtHTTP.cc: patch from Michal Hirohama regarding
+ the SetBodyReadingController() method
+
+Wed Oct 06 11:49:15 1999 Loic Dachary <loic at ceic.com>
+
+ * Move htlib/HtZlibCodec.* htlib/cgi.* to htcommon to prevent
+ crossed interdependencies between htlib and htcommon
+
+Wed Oct 06 11:40:48 1999 Gabriele Bartolini <g.bartol at comune.prato.it>
+
+ * HtHTTP: stores the server info correctly and removed some debug info
+ in chunk managing
+
+Wed Oct 06 11:39:12 1999 Loic Dachary <loic at ceic.com>
+
+ * Move htlib/*URL* to htcommon
+
+Wed Oct 06 10:09:19 1999 Loic Dachary <loic at ceic.com>
+
+ * README: add htword
+
+ * test/t_htnet: fix variable set problem & return code problem
+
+Wed Oct 06 08:53:52 1999 Gabriele Bartolini <g.bartol at comune.prato.it>
+
+ * Written t_htnet test
+
+Tue Oct 5 12:24:43 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * db/*: Import of Sleepycat's Berkeley DB 2.7.7.
+
+ * db/db/db.c, db/include/db.h, db/include/db_cxx.h, db/mp/mp_bh.c:
+ Resolve conflicts created in merge.
+
+Tue Oct 05 18:53:13 1999 Loic Dachary <loic at ceic.com>
+
+ * htdig/Display.cc, htword/*.cc: add inclusion of htconfig.h
+
+Tue Oct 05 14:54:17 1999 Loic Dachary <loic at ceic.com>
+
+ * htlib/htString.h (class String): add set(char*)
+
+ * htword/WordKey.cc: define typedefs for key components. Leads to more
+ regular code and no dependency on a predefined set of known types.
+ All types must still be castable to unsigned int.
+ Assume Word of type String always exists.
+ Generic Get/Set/Unset methods made simpler. Added const and ref
+ for Get in both forms.
+
+ * htword/WordList.cc: enable word reference counting only if wordlist_extend
+ configuration parameter is set. This parameter is hidden because
+ no code uses per word statistics at present. It is only activated
+ in the test directory.
+
+ * htword/word_list.pl: add mapping to symbolic type names,
+ force and check to have exactly one String field named Word.
+
+Mon Oct 04 20:05:35 1999 Loic Dachary <loic at ceic.com>
+
+ * test: add thingies to make test work when doing ./configure
+ outside the source directory.
+
+ * htword/WordList: Add Ref and Unref to update statistics.
+ Fix walking to start from the end of statistics. All statistics
+ words start with \001, therefore at the beginning of the file and
+ all clustered together.
+
+ * htword/WordStat: derived from WordReference to implement
+ uniq word statistics.
+
+ * test/word.cc: test statistics updating.
+
+ * htword/WordKey.cc: fix buggy compare (returned length diff
+ if key of different length).
+
+Mon Oct 04 18:43:56 1999 Gabriele Bartolini <g.bartol at comune.prato.it>
+
+ * test/testnet.cc: added the option for HEAD before GET control
+
+Mon Oct 04 17:33:24 1999 Gabriele Bartolini <g.bartol at comune.prato.it>
+
+ * htnet/Transport.h .cc: added the FlushConnection() method
+
+ * htnet/HtHTTP.h .cc: now the Request() method can make a HEAD
+ request precede a GET request. This is made by default, and
+ can be changed by using the methods Enable/DisableHeadBeforeGet().
+ A configuration option can be raised to manage it.
+
+Mon Oct 04 12:43:41 1999 Gabriele Bartolini <g.bartol at comune.prato.it>
+
+ * htlib/io.h .cc: added a flush() method.
+
+ * htnet/HtHTTP.cc: manage the chunk correctly, by calling the flush()
+ method after reading it.
+
+Mon Oct 04 12:02:24 1999 Loic Dachary <loic at ceic.com>
+
+ * htlib/htString.h: move null outside inline operator [] functions.
+
+Fri Oct 01 14:55:56 1999 Loic Dachary <loic at ceic.com>
+
+ * htword/WordRecord: mutable, can also contain uniq word statistics.
+
+ * htword/WordReference: remove all dependencies related to the actual
+ structure of the key.
+
+ * htcommon/HtWordReference: derived from WordReference, explicit
+ accessors.
+
+ * htcommon/HtWordList: derived from WordList, only handles the
+ word cache (Flush, MarkGone).
+
+ * htdig/HTML.cc (do_tag): add wordindex to have location set in
+ tags
+
+ * htcommon/DocumentRef.cc (AddDescription): add Location calculation
+
+ * htword/WordList.cc: add dberror to map Berkeley DB error codes
+
+ * htsearch/Display.cc (display): initialize good_sort to get rid
+ of strange warning.
+
+Fri Oct 01 09:02:11 1999 Loic Dachary <loic at ceic.com>
+
+ * Makefile.config: duplicate library lines to resolve
+ interdependencies.
+
+Thu Sep 30 17:56:55 1999 Loic Dachary <loic at ceic.com>
+
+ * htmerge/words.cc (delete_word): Upgrade to use WordCursor.
+
+ * htword/WordList: Walk now uses a local WordCursor. Many concurrent
+ Walks can happen at the same time.
+
+ * htword/WordList: Walk callback now takes the current WordCursor.
+ Added a Delete method that takes the WordCursor. Allows deleting
+ the current record while walking.
+
+ * db/include/db_cxx.h (DB_ENV): add int return type to operator =
+
+ * db/dist/configure.in (CXXFLAGS): disable adding obsolete
+ g++ option.
+
+ * configure.in: enable C++ support when configuring Berkeley DB
+
+ * htword: create. move Word* from htcommon. move HtWordType
+ from htlib and rename WordType.
+
+ * htword/WordList: use db_cxx interface instead of Database.
+ Less interface overhead. Get access to full capabilities of
+ Berkeley DB. Much more error checking done.
+ Create WordCursor private class to use String instead of Dbt.
+
+Wed Sep 29 20:03:31 1999 Loic Dachary <loic at yoda.ceic.com>
+
+ * htlib/lib.h: AIX xlC is confused by overloaded mystrcasestr
+ that only differ in constness. Only keep const form and use cast
+ where appropriate. *sigh*
+
+ * htlib/htString.h: accommodate new form of Object::compare and
+ Copy. Explicitly convert compare arg to String&, prevent hiding
+ and therefore missing the underlying compare function.
+
+ * htlib/HtVector.cc (Copy): make it const
+
+ * htlib/HtHeap.cc: accommodate new form of Object::compare
+
+ * htcommon/List.h,cc: Add ListCursor to allow many pointers that
+ walk the list to exist in the same program.
+
+ * htlib/Object.h (class Object): kill unused Serialize + Deserialize.
+ Change unused Copy to const and bark on stderr if called because it
+ is clearly not what is wanted. If Copy is called and the derived class
+ does not implement Copy we are in trouble. Alternatives are to make
+ it pure virtual, but that will break things all over the code, or to abort,
+ but this will be considered too violent. Change compare to take a
+ const reference and be const.
+
+Wed Sep 29 16:51:58 1999 Loic Dachary <loic at yoda.ceic.com>
+
+ * acinclude.m4,configure.in,Makefile.config: remove -Wall from
+ Makefile.config, add the AC_COMPILE_WARNINGS macro in acinclude.m4
+ and use it in configure.in.
+
+ * htdoc/default_check.pl: remove, unused
+
+Wed Sep 29 13:07:58 1999 Gabriele Bartolini <g.bartol at comune.prato.it>
+
+ * htnet/Transport: fixed some bugs on construction and destruction
+
+ * htnet/HtHTTP: the most important addition is the decoding of chunked
+ encoded responses, as specified in RFC 2616 (HTTP/1.1). It needs
+ further work, because it times out at the end of the request.
+ Added a function pointer in order to dynamically select the function
+ that reads the body of a response (for now, normal and chunked, but
+ other encodings exist, so ...). Fixed some bugs on construction
+ and added some features like Server and Transfer-encoding headers.
+
+Wed Sep 29 13:54:59 1999 Loic Dachary <loic at yoda.ceic.com>
+
+ * fix all inline method declarations so that they are always declared
+ inline in the class declaration if an inline definition follows.
+
+ * acinclude.m4: also search apache in /usr/local/apache/bin by default.
+
+ * fix various warnings of gcc-2.95, now compiles ok without warnings
+ and with -Wall.
+
+ * htlib/htString.h: removed commented out inline get
+
+ * test/testnet.cc: add includes for optarg
+
+Tue Sep 28 18:56:36 1999 Loic Dachary <loic at ceic.com>
+
+ * Makefile.config (HTLIBS): libhtnet at the beginning of the list. It
+ matters on Solaris-2.6 for instance.
+
+ * test/testnet.cc: change times to timesvar to avoid conflict with
+ function (was warning only on Solaris-2.6).
+
+ * htdig,htsearch,htmerge,test/word are purify clean when running
+ make check.
+
+Tue Sep 28 18:23:49 1999 Loic Dachary <loic at ceic.com>
+
+ * htmerge/words.cc (mergeWords): use WordList::Walk to avoid loading ALL
+ the words into memory.
+
+ * htlib/DB2_db.cc (Open): we don't want duplicates. Big mistake. If DUP is
+ on, every put for update will insert a new entry.
+
+ * htcommon/WordList.cc (Delete): separate Delete (straight Delete and WalkDelete)
+ to avoid accessing dbf from outside WordList.
+
+ * htcommon/WordList.cc (Walk): now promoted to public.
+
+Tue Sep 28 16:34:56 1999 Loic Dachary <loic at ceic.com>
+
+ * test/word.cc (dolist): Add regression tests for Delete.
+
+ * htcommon/WordList.cc (Delete): Reimplement from scratch. Use Walk
+ to find records to delete. This makes it possible to delete all occurrences
+ of a word, delete all words in a document (slow), delete
+ all occurrences of a word in a document etc.
+
+ * htcommon/WordList.cc (Walk): extend so that it handles walk for
+ partially specified keys, remains fully backward compatible. It allows
+ extracting all the words in a specific document (slow) or all occurrences
+ of a word in a specific document etc.
+
+Tue Sep 28 12:56:12 1999 Loic Dachary <loic at ceic.com>
+
+ * htcommon/DocumentDB.cc (Open): report errors on stderr
+
+ * htmerge/docs.cc (convertDocs): rely on error reporting from DocumentDB
+ instead of implementing a custom one.
+
+Tue Sep 28 11:36:28 1999 Gabriele Bartolini <g.bartol at comune.prato.it>
+
+ * htnet/Transport.h: added the status code and the reason phrase
+
+ * htnet/HtHTTP.cc .h: removed the attributes above.
+ Read the body of a response if the code is 2xx. Issues the
+ GetLocation() method.
+
+Tue Sep 28 10:32:47 1999 Loic Dachary <loic at ceic.com>
+
+ * test/htdocs/set3: create and populate with cgi scripts that
+ behave badly (time out, slow connection).
+
+Tue Sep 28 10:20:23 1999 Loic Dachary <loic at ceic.com>
+
+ * test/htdocs: move html files into set1/set2 subdirectories to allow
+ tests that use different sets of files. Change htdig.conf accordingly.
+
+Tue Sep 28 09:31:12 1999 Loic Dachary <loic at ceic.com>
+
+ * test/Makefile.am: comment test options, add LONG_TEST='y' for lengthy
+ tests, by default run quick tests.
+
+ * installdir/bad_words: removed 'it', 'an' and 'of': since the minimum word
+ length is by default 3, these words are ignored anyway.
+
+Mon Sep 27 20:37:38 1999 Loic Dachary <loic at ceic.com>
+
+ * htlib/HtWordType.h,cc: concentrate knowledge about word definition in this
+ class. Rename the class WordType (think WordReference etc...). Change
+ Initialize to use an external default object. A WordType object may be
+ allocated on its own. Drag functionalities from BadWordFile, Replace and
+ IsValid of WordList, and concentrate them in the WordType::Normalize
+ function.
+
+ * htcommon/WordList: use the new WordList semantic. WordType is now a member
+ of WordList, opening the possibility to have many WordList objects with different
+ configurations within the same program since the constructor takes
+
+ * htsearch/htsearch.cc (setupWords): Use HtNormalize to find out if word should
+ be ignored in query. Formerly using IsValid.
+
+ * htlib/String.cc (operator []): fix big mistake, operator [] was indeed last() !
+
+ * htlib/String.cc(uppercase, lowercase): return the number of converted chars.
+
+ * htlib/String.cc(remove): return the number of chars removed.
+
+Mon Sep 27 17:43:23 1999 Gabriele Bartolini <g.bartol at comune.prato.it>
+
+ * Created testnet.cc under test dir for trying the htnet library
+ It's a simple program that retrieves a URL.
+
+ * htnet/HtHTTP.cc, .h: added a 'int (*) (char *)' function pointer.
+ This attribute is static and it is used under the isParsable method
+ in order to determine if a document is parsable. It must be set
+ outside this class by using the SetParsingController static method.
+ The classic use is to set it to 'ExternalParser::canParse' .
+
+Mon Sep 27 10:52:51 1999 Loic Dachary <loic at ceic.com>
+
+ * htmerge/db.cc (mergeDB): delete words instead of words->Destroy()
+ because the words object itself was not freed.
+
+Mon Sep 27 10:38:37 1999 Gabriele Bartolini <g.bartol at comune.prato.it>
+
+ * Created 'htnet' library
+
+Mon Sep 27 12:39:24 1999 Loic Dachary <loic at ceic.com>
+
+ * test/word.cc (dolist): don't deal with upper case at present and prevent warning.
+
+Mon Sep 27 10:38:37 1999 Gabriele Bartolini <g.bartol at comune.prato.it>
+
+ * htlib/String.cc: removed compiler warnings
+
+ * htdig/HtHTTP.h: corrected cvs Id property
+
+Mon Sep 27 10:29:58 1999 Loic Dachary <loic at ceic.com>
+
+ * htlib/String.cc (String): make sure *all* constructors set the Data
+ member to 0.
+
+ * htsearch/parser.cc (score): add missing dm->id = wr->DocID();
+ strange it did not make search fail horribly.
+
+Mon Sep 27 09:46:34 1999 Loic Dachary <loic at ceic.com>
+
+ * test/conf/htdig.conf.in (common_dir): add common_dir so that
+ templates are found in compile directory.
+
+ * htsearch/parser.cc (phrase): free wordList at end and only allocate if
+ needed.
+
+Fri Sep 24 16:35:47 1999 Loic Dachary <loic at ceic.com>
+
+ * htcommon/DocumentDB.cc (Open): change mode to 666 instead of 664,
+ it's the business of umask to remove permission bits.
+
+ * htlib/URL.cc (removeIndex): Memory leak. do not use l.Release
+ since standard Destroy called by destructor is ok.
+
+ * htdig/htdig.cc (main): Memory leak. Use l.Destroy instead of
+ l.Release.
+
+ * htlib/StringList.cc (Join): Memory leak (new String str +
+ return *str). Also change to const fct.
+
+ * htlib/List.cc (Nth): add const version to help StringList::Join save
+ memory.
+
+ * htdig/HTML.cc (parse): delete [] text (was missing [])
+
+ * htlib/HtVector.cc: Most of the boundary tests with element_count
+ (but not all of them) were wrong (> instead of >= for instance).
+
+ * htlib/HtVector.cc (Previous): limit test cut and pasted from Next
+ and obviously completely wrong. Fix.
+
+ * htlib/HtVector.cc (Remove): use RemoveFrom, avoid code duplication.
+
+ * htcommon/DocumentRef.cc (Clear): set all numerical fields to 0,
+ and truncate strings to 0. Some were missing.
+
+ * htlib/Connection.cc (Connection): free(server_name) because allocated
+ by strdup not new.
+
+Fri Sep 24 14:30:21 1999 Loic Dachary <loic at ceic.com>
+
+ * */.cvsignore: update to include .pure, *.la, *.lo, .purify
+
+ * htlib/String.cc (String): add Data = 0
+
+ * htlib/htString.h (class String): add Data = 0
+
+ * htlib/String.cc (String): init set to MinimumAllocationSize at least
+ prevents leaking if init = 0.
+
+ * htlib/String.cc (nocase_compare): use get() instead of direct
+ pointer to Data so that the trailing null will be added.
+
+ * htlib/Dictionary.cc (DictionaryEntry): free(key) instead of
+ delete [] key because obtained with strdup.
+
+ * htlib/DB2_db.cc (Close): free(dbenv) because db_appexit does not
+ free this although it frees everything else.
+
+Thu Sep 23 18:18:40 1999 Loic Dachary <loic at ceic.com>
+
+ * configure.in: add PERL detection & use in Makefile.am
+
+Thu Sep 23 14:29:29 1999 Loic Dachary <loic at ceic.com>
+
+ * configure.in: removed unused alloca.h
+
+ * htcommon/DocumentDB.cc: test isopen in Close instead of before calling Close.
+ Add some const in functions arguments.
+ (Read): change char* args to const String&, changed tests for null pointers to
+ empty().
+ (Add): Delete the temp class member, use function local temp.
+ (operator []): change char* args to const String&
+ (CreateSearchDB): change char* args to const String&
+
+ * htcommon/DocumentRef.cc:(AddDescription): Add some const in functions arguments.
+ Use a WordReference as insertion context instead of merely the docid: it contains
+ the insertion context.
+ (AddAnchor): Add some const in functions arguments.
+
+ * htcommon/DocumentRef.h: Add some const in inline functions arguments.
+
+ * htcommon/Makefile.am: add WordKey + WordKey.h generation
+
+ * htcommon/word_builder.pl, word.desc, WordKey.h.tmpl: generate WordKey.h from WordKey.h.tmpl and
+ word.desc
+
+ * htcommon/WordList.cc: In general remove code that belongs to WordReference rather
+ than WordList and cleanup const + String.
+ (WordList) the constructor takes a Configuration object in argument.
+ (Word -> Replace): Word method replaced by Replace method because more explicit. Now
+ takes a WordReference in argument instead of the list of field values.
+ (valid_word deleted, IsValid only): Add some const in functions arguments.
+ (BadWordFile): change char* args to const String&
+ (Open + Read -> Open): Open and Read merge into Open with mode argument. change char* args
+ to const String&.
+ (Add): use WordReference::Pack and simply do Put.
+ (operator[], Prefix ...) now take WordReference instead of Word. Automatic conversion from
+ Word for compatibility through WordReference(const Word& w).
+ (Dump): change char* args to const String&
+ (Walk): use WordReference member functions instead of hard coded packing
+
+ * htcommon/WordRecord.h: move flag definitions to WordReference.h
+ only keep anchor, the rest moved to key.
+
+ * htdig/Document.cc: change all config[""] manipulations from char* to String
+ or const String
+ (setUsernamePassword): Add some const in functions arguments.
+
+ * htdig/HTML.cc: change all config[""] manipulations from char* to String
+ or const String. Change null pointer tests to empty().
+ (transSGML): change char* args to const String&
+
+ * htdig/HtHTTP.cc: Add error messages for default cases in every switch.
+
+ * htdig/PDF.cc: (parse) change char* to const String& for config[""]
+
+ * htdig/Plaintext.cc: (parse) remove unused variable
+
+ * htdig/Retriever.cc: use WordReference word_context instead of simple docid
+ to hold the insertion context.
+ (Retriever) pass config to WordList initializer.
+ (setUsernamePassword): Add some const in functions arguments.
+ (Initial): change char* args to const String&
+ (parse_url): use WordReference word_context, add debug information.
+ (RetrievedDocument): set anchor in word_context.
+ (got_word): use Replace instead of Word
+ (got_*): Add some const in functions arguments.
+
+ * htdig/htdig.cc: change all config[""] manipulations from char* to String
+
+ * htdoc/cf_generate.pl: compute attrs.html, cf_byprog.html and cf_byname.html from
+ ../htlib/default.cc and attrs_head.html attrs_tail.html cf_byname_head.html cf_byname_tail.html
+ cf_byprog_head.html cf_byprog_tail.html
+ Add rules in Makefile.am
+
+ * htfuzzy: In every program I changed the constructor to take a
+ Configuration argument. The openIndex and writeDB had this
+ argument and sometimes used it, sometimes used the global
+ config. Having it in the constructor is cleaner and safer, there
+ is no more reference to the global config. I also changed some
+ char* to String and const. Most of the programs look the same, I
+ won't go into details here :-}
+
+ * htlib/Configuration.cc: changed separators from String* to String. Simpler.
+ (~Configuration): removed because not needed.
+ (Add): change to String, remove new String + delete for local var.
+ (Find, operator[]): make it const fct, add some const in functions arguments.
+ (Value + Double): killed, replaced by as_integer + as_double from String
+ (Boolean): use String methods + string objects
+ (Defaults): Add some const in functions arguments.
+
+ * htlib/Configuration.h: add
+ char *type; // Type of the value (string, integer, boolean)
+ char *programs; // White separated list of programs/modules using this attribute
+ char *example; // Example usage of the attribute (HTML)
+ char *description; // Long description of the attribute (HTML)
+ to the ConfigDefaults type.
+
+ * htlib/Connection.cc: (assign_server) change char* args to const String&
+
+ * htlib/DB2_db.cc: Merge with DB2_hash.
+ Add compare and prefix functions pointers.
+ Merge OpenRead & OpenReadWrite into Open, keep for compatibility.
+ skey and data are now strings instead of DBT.
+ Remove Get_Next_Seq.
+ Get_Next now returns key and value in arguments.
+ Remove all other Get_Next interfaces.
+
+ * htlib/Database.h:
+ Compatibility functions for Get_Next
+ Put, Get, Exists, Delete take String args and are inline
+ Add SetPrefix and SetCompare
+
+ * htlib/Dictionary.cc:
+ Add copy constructor.
+ Add DictionaryCursor that holds the traversal context.
+ Use DictionaryCursor object for traversal without explicit
+ cursor specified.
+ Add constness where meaningful.
+
+ * htlib/HtPack.cc:
+ (htPack) format is const, change strtol call
+ to use temporary variable to cope with constness.
+ (htUnpack) dataref argument is not a reference anymore. Not used
+ anywhere and kind of hidden argument nobody wants.
+
+ * htlib/HtRegex.cc: set, match, HtRegex have const args.
+
+ * htlib/HtWordCodec.cc: (code) orig is const
+
+ * htlib/HtWordType.cc,h: statics is made of String instead of char*. Remove
+ static String punct_and_extra from Initialize.
+
+ * htlib/HtZlibCodec.cc: len is unsigned int
+
+ * htlib/ParsedString.cc: add constness to function args
+ (get) use String instead of char
+
+ * htlib/QuotedStringList.cc: inline functions argument variations and
+ add constness.
+
+ * htlib/String.cc: add constness wherever possible.
+
+ * htlib/htString.h: Add const get, char* cast, operator [].
+ Add as_double conversion.
+
+ * htlib/StringList.cc: inline functions argument variations and
+ add constness.
+
+ * htlib/StringMatch.cc: add constness to function args.
+
+ * htlib/URL.cc: add constness to function args.
+ (URL): fct arg was used as temp. Change, clearer.
+
+ * htlib/lib.h: add const declaration of string manipulation functions.
+ Two forms for mystrcasestr: const and not const.
+
+ * htlib/strcasecmp.cc: add constness to function args.
+
+ * htlib/timegm.c: add declaration for __mktime_internal
+
+ * htmerge/db.cc: change *doc* vars from char* to const String, use
+ new WordList + WordReference interface.
+
+ * htmerge/docs.cc: change *doc* vars from char* to const String.
+
+ * htmerge/words.cc: use new WordList + WordReference interface.
+
+ * htsearch/Display.cc: use empty method on String where appropriate.
+ use String instead of char* where config[""] used.
+ (includeURL): change char* args to const String&
+
+ * htsearch/ResultMatch.cc: (setTitle, setSortType) change char* args to const String&
+
+ * htsearch/Template.cc: (createFromFile) change char* args to const String&
+
+ * htsearch/Template.h: accessors return const String& or take const char*
+
+ * htsearch/TemplateList.cc: (get) use const String for internalNames.
+
+ * htsearch/htsearch.cc: use String instead of char* where config[""] used.
+
+ * htsearch/parser.cc: Initialize WordList member with config global.
+ (perform_push): free the result list after calling score.
+ (score, phrase): use new WordList + WordReference interface.
+
+Thu Sep 23 14:29:29 1999 Loic Dachary <loic at ceic.com>
+
+ * htcommon/WordKey.h.tmpl, WordKey.cc: new, describe the key of the word
+ database.
+
+ * htcommon/word.desc: new, abstract description of the key structure of the word
+ database.
+
+ * htcommon/word_builder.pl: new, generate WordKey.h from WordKey.h.tmpl
+
+ * htcommon/WordReference.cc: move key manipulation to WordKey.cc
+ Add Unpack/Pack functions. Add accessors for fields and move fields to private.
+ Add constness where possible.
+
+Mon Sep 20 14:50:47 1999 Loic Dachary <loic at ceic.com>
+
+ * Everywhere config["string"] is used, check that it's *not* converted to
+ char* for later use. Keep String object so that there is no chance to
+ use a char* that has been deallocated. Using a String as return for config["string"]
+ is also *much* safer for the great number of calls that did not check for a possible
+ 0 pointer return.
+
+ * htfuzzy/*.{cc,h}: const Configuration& config member. Constructor sets it.
+ Remove config argument from openIndex & writeDB. The idea (as it was initially,
+ I guess) is to be able to have a standalone fuzzy library using a specific
+ configuration file. It is now possible and consistent.
+
+ * htlib/htString.cc: more constness where appropriate. Changed compare
+ to have const String& arg instead of const Object* because useless and
+ potential source of buggy code.
+
+ * htfuzzy/Regex.cc (getWords): fix buggy setting of extra_word_chars
+ configuration value. It is set to change the behaviour of HtStripPunctuation
+ but this function gets the extra_word_chars from a static array initialized
+ at program start by static void Initialize(Configuration & config). Use straight
+ s.remove() instead. Besides, the string was anchored by prepending a ^ that
+ was removed because part of the reserved chars.
+
+Mon Sep 20 11:47:05 1999 Loic Dachary <loic at ceic.com>
+
+ * htlib/Configuration.cc (operator []): changed return type to String
+ to solve memory leak. When char* the string was malloced from ParsedString
+ after substitution and never freed. In fact it was even worse: it was
+ freed before use in some cases.
+
+Sun Sep 19 19:12:44 1999 Loic Dachary <loic at ceic.com>
+
+ * htdoc/cf_generate.pl, htcommon/defaults.cc, htlib/Configuration.h:
+ Change the structure of the configuration defaults. Move
+ description, examples, types, used_by information from attrs.html.
+ Write cf_generate.pl to build attrs.html, cf_byname, cf_byprog
+ from defaults.cc. Makes it easier to maintain an up to date
+ description of existing attributes. About 10 attributes existed
+ in defaults.cc and were not described in the HTML pages.
+ Add rules in htdoc/Makefile.am to generate the pages if a source
+ changes.
+
+Fri Sep 17 19:34:48 1999 Loic Dachary <loic at ceic.com>
+
+ * Makefile.config: add -Wall to all compilation and fix
+ all resulting warnings.
+
+ * htlib/Connection.cc (assign_server): remove redundant test
+ and cast literal value to unsigned
+
+ * htlib/String.cc: add const qualifier where possible. Helps
+ dealing with const objects at an upper level.
+
+Fri Sep 17 18:27:57 1999 Alexander Bergolth <leo at leo.wu-wien.ac.at>
+
+ A few changes so that it compiles with xlC on AIX:
+
+ * configure.in, include/htconfig.h.in: Add check for sys/select.h.
+ Add "long unsigned int" to the possible getpeername_length types.
+
+ * htdig/htdig.cc: Moved variable declaration out of case block.
+
+ * htlib/Connection.cc: Include sys/select.h.
+
+ * htcommon/WordList.cc: just a type cast
+
+ * htlib/regex.c: define true and false only if they aren't already
+
+ * htdig/Transport.{h,cc}: removed inline keywords (inline functions
+ have to be defined and declared simultaneously)
+
+ * htlib/{mktime.c,regex.h,strptime.c,timegm.c}: change // comments
+ to /* ... */
+
+Tue Sep 14 01:15:48 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htmerge/db.cc: Rewrite to use the WordList functions to merge
+ the two word databases. Also make sure to load the document
+ excerpt when adding in DocumentRefs.
+
+ * htmerge/docs.cc: Fix bug where ids were not added to the discard
+ list correctly.
+
+ * htmerge/words.cc: Fix bug where ids were not checked for
+ existence in the discard list correctly.
+
+Sun Sep 12 12:27:16 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htcommon/defaults.cc: Remove word_list since that file is no
+ longer used.
+
+ * htdig/htdig.cc: Ensure -a and -i are followed for the word_db
+ file. Fixes PR #638.
+
+Sat Sep 11 00:11:28 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htlib/StringMatch.h: Add back mistakenly deleted #ifndef/#define.
+
+Fri Sep 10 23:07:43 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htmerge/*, htcommon/*, htdig/*, htlib/*: Add copyright information.
+
+Fri Sep 10 11:33:50 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htnotify/htnotify.cc: Add copyright information.
+
+ * htsearch/* htfuzzy/*: Ditto.
+
+Fri Sep 10 15:24:44 1999 Loic Dachary <loic at ceic.com>
+
+ * htdig/Retriever.cc: change static WordList words to
+ object member. words.Close() at end of Start function
+ to make sure data is flushed by database.
+
+ * htcommon/WordList.cc (Close): test isopen to prevent
+ ugly crash. Remove isopen test in calling functions.
+
+Fri Sep 10 13:45:53 1999 Loic Dachary <loic at ceic.com>
+
+ * htcommon/WordList.h htcommon/WordList.cc: methods Collect
+ and Walk that factorise the behaviour of operator [], Prefix
+ and WordRefs.
+
+ * htcommon/WordList.h htcommon/WordList.cc: method Dump to
+ dump an ascii version of the word database.
+
+ * htcommon/WordReference.h,htcommon/WordReference.cc: method Dump
+ to write an ascii version of a word.
+
+ * htdig/htdig.cc: -t now also dumps the word database in
+ ASCII.
+
+ * htdoc/attrs.html,cf_byprog.html,cf_byname.html: added doc
+ for word_dump
+
+Thu Sep 9 20:30:18 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htfuzzy/Fuzzy.h, htfuzzy/Fuzzy.cc, htfuzzy/Prefix.cc,
+ htfuzzy/Regex.cc, htfuzzy/Speling.cc, htfuzzy/Substring.cc,
+ htfuzzy/htfuzzy.cc, htfuzzy.h: Change to use WordList code instead
+ of direct access to the database.
+
+Thu Sep 9 14:55:59 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * contrib/parse_doc.pl: fix bug in pdf title extraction.
+
+Tue Sep 7 23:49:41 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htdig/ExternalParser.h, htdig/ExternalParser.cc (parse): Change
+ parsing of location to allow phrase searching -- location is *not*
+ just 0-1000.
+
+ * htdig/Plaintext.h, htdig/Plaintext.cc, htdig/PDF.cc: Ditto.
+
+ * htdig/Retriever.h, htdig/Retriever.cc: Don't call
+ HtStripPunctuation. This is now done in the WordList::Word method.
+
+ * htcommon/WordList.h htcommon/WordList.cc (Prefix): New method to
+ do prefix retrievals. Essentially the same as [], except the loop
+ is broken only in the unlikely event that we retrieve something
+ beyond the range set.
+ (Exists): New method for checking the existence of a
+ string--attempt to retrieve it and determine if anything's
+ actually there.
+ (Word): Call HtStripPunctuation as part of the cleanup.
+
+Tue Sep 7 21:37:44 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htcommon/defaults.cc: Add new configuration option
+ remove_unretrieved_urls to remove docs that have not been accessed.
+
+ * htmerge/docs.cc (convertDocs): Use it.
+
+ * htcommon/defaults.h, htcommon/WordRecord.h,
+ htcommon/WordReference.h: Add copyright notice to head of file.
+
+Mon Sep 6 10:32:59 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htlib/HtZlibCodec.h, htlib/HtZlibCodec.cc(instance): New method
+ as used in other codecs.
+ (encode, decode): Fix compilation errors.
+
+ * htlib/Makefile.am: Added HtZlibCodec.cc to the compilation list.
+
+ * htcommon/DocumentDB.cc (ReadExcerpt): Call HtZlibCodec to decompress
+ the excerpt.
+ (Add): Call HtZlibCodec to compress the excerpt before storing.
+ (Open, Read): If the databases are
+ already open, close them first in case we're opening under a
+ different filename.
+ (CreateSearchDB): Remove call to external
+ sort program. Database is already sorted by DocID.
+
+ * configure.in, configure: Remove check for external sort
+ program. No longer necessary.
+
+ * */Makefile.in: Regenerate using automake.
+
+Sun Sep 5 13:50:34 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htmerge/docs.cc: Ensure a document with empty excerpt has
+ actually been retrieved. Otherwise document stubs are always
+ removed.
+
+ * htlib/String.cc: Implement the nocase_compare method.
+
+ * htcommon/WordReference.cc: Implement a compare method for
+ WordRefs to use in sorting. Uses the above.
+
+ * htcommon/DocumentRef.h, htcommon/DocumentRef.cc: Update the
+ headers.
+
+ * htcommon/DocumentDB.h: Ditto.
+
+Sun Sep 5 01:37:27 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htcommon/WordList.cc(Flush): Call Add() instead of storing the
+ data ourselves. Additionally, don't open the database ourself (and
+ then close it), instead call Open() if it's not open already.
+
+ * htcommon/DocumentRef.h, htcommon/DocumentRef.cc(AddDescription):
+ Pass in a WordList to use when adding link text words. Ensures
+ that the word db is never opened twice for writing.
+
+ * htdig/Retriever.cc: Call AddDescription as above.
+
+ * htdig/Server.cc(ctor): If debugging, write out an entry for the
+ robots.txt file.
+
+ * htlib/HtHeap.cc(percolateUp): Fix a bug where the parent was not
+ updated when moving up more than once.
+ (pushDownRoot): Fix a bug where the root was improperly pushed
+ down when it required looping.
+
+Fri Sep 3 16:23:23 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htlib/HtHeap.cc(Remove): Correct bug where after a removal, the
+ structure was not "re-heapified" correctly. The last item should
+ be moved to the top and pushed down.
+ (pushDownRoot): Don't move items past the size of the underlying
+ array.
+
+ * htdig/Server.h, htdig/Server.cc: Change _paths to work on a
+ heap, based on the hopcount. Ensures on a given server that the
+ indexing will be done in level-order by hopcount.
+
+Wed Sep 01 15:40:37 1999 Loic Dachary <loic at ceic.com>
+
+ * test: implement minimal tests for htsearch and htdig
+
+Tue Aug 31 02:17:04 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htcommon/WordRecord.h: Change back to struct to ensure integrity
+ when compressed and stored in the word database.
+
+ * htcommon/WordList.cc (Flush): Use HtPack to compress the
+ WordRecord before storage.
+ ([], WordRefs): Use HtUnpack to decompress the WordRecord after
+ storage.
+
+Sun Aug 29 00:42:07 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htsearch/htsearch.cc (convertToBoolean): Remove debugging
+ strings.
+
+ * htsearch/parser.h: Add new method score(List) to merge scoring
+ for both standard and phrase searching.
+
+ * htsearch/parser.cc(phrase): Keep the current list of successful
+ matched words around to pass to score and perform_phrase.
+ (perform_phrase): Naively (and slowly, but correctly) loop through
+ past words to make sure they match DocID as well as successive locations.
+ Move scoring to score().
+ (perform_push): Move scoring to score().
+ (score): Loop through a list of WordReferences and create a list
+ of scored DocMatches.
+
+Sun Aug 29 00:33:17 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htsearch/htsearch.cc(createLogicalWords): Hack to produce
+ correct output with phrase searching (e.g. anything in quotes is
+ essentially left alone). Ensure the StringMatch pattern includes
+ the phrase with correct spacing as well.
+ (setupWords): Add a " token whenever it occurs in the query.
+ (convertToBoolean): Make sure booleans are not inserted into
+ phrases.
+
+ * htsearch/parser.h: Add new methods phrase and perform_phrase to
+ take care of parsing phrases and performing the actual matching.
+
+ * htsearch/parser.cc(lexan): Return a '"' when present for phrase
+ searching.
+ (factor): Call phrase() before parsing a factor--phrases are the
+ highest priority, so ("RedHat Linux" & Debian) ! Windows makes
+ sense.
+ (phrase): New method--slurps up the rest of a phrase and calls
+ perform_phrase to do the matching.
+ (perform_phrase): New method--currently just calls perform_and to
+ give the simulation of a phrase match.
+
+Sat Aug 28 15:57:53 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htdig/Server.h, htdig/Server.cc: Undo yesterday's change -- still
+ very buggy and shouldn't be used yet.
+
+ * htdig/Retriever.cc (parse_url): Change default index to 1 to
+ more closely match DocIDs shown with verbose output.
+
+ * htsearch/DocMatch.h: Change score to double and clean up
+ headers.
+
+ * htcommon/WordRecord.h: Change unnecessary long ints (id and
+ flags) to plain ints.
+
+ * htdig/HTML.cc (parse): Call got_word with actual word sequence
+ (i.e. 1, 2, 3...) rather than scaling to 1-1000 by character
+ offset.
+
+ * htlib/Database.h, htlib/DB2_db.h, htlib/DB2_hash.h: Change
+ Get_Item to Get_Next(String item) to return the data as a
+ reference. This makes it easier to use in a loop and cuts the
+ database calls in half.
+
+ * htlib/DB2_db.cc, htlib/DB2_hash.cc: Implement it, making sure we
+ keep the possibly useful data around, rather than tossing it!
+
+ * htsearch/htsearch.cc(htsearch): Don't attempt to open the word db
+ ourselves. Instead, pass the filename off to the parser, which
+ will do it through WordList.
+
+ * htsearch/parser.h: Use a WordList instead of a generic Database.
+
+ * htsearch/parser.cc(perform_push): Use the WordList[] operator to
+ return a list of all matching WordRefs and loop through, summing
+ the score.
+
+ * htcommon/WordList.cc (Flush): Don't use HtPack on the
+ data--somehow when unpacking, there's a mismatch of sizes.
+ (Read): Fix thinko where we attempted to open the database as a
+ DB_HASH.
+ ([]): Don't use HtUnpack since we get mismatches. Use the new
+ Get_Next(data) call instead of calling Get_Item separately.
+ (WordRefs): Same as above.
+
+Fri Aug 27 09:44:09 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htdig/Retriever.cc (Need2Get): Remove duplicate detection code for
+ local_urls. The code is somewhat buggy and should be replaced by
+ more general code shortly.
+
+ * htdig/Server.h, htdig/Server.cc (push, pop): Change _paths to a
+ HtHeap sorted on hopcount first (and order placed on heap
+ second). Ensures that on each server, the order indexed is
+ guaranteed to be level-order by hopcount.
+
+ * htdig/URLRef.h, htdig/URLRef.cc (compare): Add comparison method
+ to enable sorting by hopcount.
+
+Fri Aug 27 09:36:35 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htcommon/WordList.h, htcommon/WordList.cc (WordList): Change
+ words to a list instead of a dictionary for minor speed improvement.
+
+Thu Aug 26 11:18:20 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htcommon/defaults.cc, htdoc/attrs.html: increase default
+ maximum_word_length to 32.
+
+Wed Aug 25 16:50:16 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/Retriever.cc(got_word): add code to check for compound words
+ and add their component parts to the word database.
+ * htdig/PDF.cc(parseString), htdig/Plaintext.cc(parse): Don't strip
+ punctuation or lowercase the word before calling got_word. That
+ should be left up to got_word & Word methods.
+
+ * htlib/StringMatch.h, htlib/StringMatch.cc(Pattern, IgnoreCase):
+ Add an IgnorePunct() method, which allows matches to skip over valid
+ punctuation, change Pattern() and IgnoreCase() to accommodate this.
+ * htsearch/htsearch.cc(main, createLogicalWords): use IgnorePunct()
+ to highlight matching words in excerpts regardless of punctuation,
+ toss out old origPattern, and don't add short or bad words to
+ logicalPattern.
+
+ * htlib/HtWordType.h, htlib/HtWordType.cc(Initialize): set up and
+ use a lookup table to speed up HtIsWordChar() and HtIsStrictWordChar().
+
+Mon Aug 23 10:13:05 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/HTML.cc(parse): fix problems with null pointer when attempting
+ SGML entity decoding on bare &, as reported by Vadim Chekan.
+
+Thu Aug 19 11:52:06 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htsearch/htsearch.cc(main): Fix to allow multiple keywords
+ input parameter definitions.
+
+ * contrib/parse_doc.pl: make spaces optional in LANGUAGE = POSTSCRIPT
+ PJL test.
+
+Wed Aug 18 11:27:46 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/PDF.cc(parse): Fixed wrong variable name in new code.
+ Double-Oops! (It was Friday the 13th, after all...)
+
+Tue Aug 17 16:26:46 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htlib/HtHeap.cc(Remove): apply Geoff's patch to fix Remove.
+
+ * htlib/HtVector.h, htlib/HtVector.cc(Index): various bounds overrun
+ bug fixes and checking in Last(), Nth() & Index().
+
+Mon Aug 16 13:55:10 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htsearch/Display.cc(expandVariables): fix up test for &amp;
+
+Mon Aug 16 12:08:57 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * Makefile.am, Makefile.in, installdir/Makefile.am,
+ installdir/Makefile.in: change all remaining INSTALL_ROOT to DESTDIR.
+
+Fri Aug 13 15:44:31 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/PDF.cc(parse): added missing ')' in new code. Oops!
+
+ * htlib/strptime.c, htlib/mktime.c: added #include "htconfig.h"
+ to pick up definitions from configure program. Let's try to
+ remember that config.h != htconfig.h!
+
+Fri Aug 13 14:49:07 1999 Loic Dachary <loic at ceic.com>
+
+ * configure.in: removed unused HTDIG_TOP, replaced AM_WITH_ZLIB
+ with CHECK_ZLIB
+
+Fri Aug 13 14:00:16 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/PDF.cc(parse), htcommon/defaults.cc, htdoc/attrs.html
+ (pdf_parser): Removed -pairs option from default arguments, added
+ special test for acroread to decide whether to use output file or
+ directory as last argument (also adds -toPostScript if missing).
+ Program now tries to test for existence of parser before trying
+ to call it.
+
+Fri Aug 13 10:10:16 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdoc/attrs.html(pdf_parser): updated xpdf version number.
+
+Thu Aug 12 17:09:37 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * contrib/parse_doc.pl: updated for xpdf 0.90, plus other fixes.
+
+Thu Aug 12 11:12:07 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdoc/attrs.html(logging): added Geoff's description of log lines.
+
+Thu Aug 12 11:21:12 1999 Loic Dachary <loic at ceic.com>
+
+ * strptime fixes : AC_FUNC_STRPTIME defined in acinclude.m4 and used in configure.in,
+ conditional compilation of strptime.c (only if HAVE_STRPTIME not defined),
+ removed Htstrptime (strptime.c now defines strptime), changed all calls to Htstrptime
+ to calls to strptime.
+
+Wed Aug 11 16:59:41 1999 Loic Dachary <loic at ceic.com>
+
+ * */Makefile.am: use -release instead of -version-info because nobody
+ wants to bother with published shared lib interfaces version numbers
+ at present.
+
+ * htlib/Makefile.am: added langinfo.h
+
+Wed Aug 11 15:00:07 1999 Loic Dachary <loic at yoda.ceic.com>
+
+ * acconfig.h: removed MAX_WORD_LENGTH
+
+ * re-run auto* to make sure chain is consistent
+
+ * Makefile.am: improve distclean for tests
+
+Wed Aug 11 13:46:22 1999 Loic Dachary <loic at yoda.ceic.com>
+
+ * configure.in: change --enable-test to --enable-tests so
+ that Berkeley DB tests are not activated. Since they depend
+ on tcl this can be a pain.
+
+ * acinclude.m4: AM_PROG_TIME locate time command + find out
+ if verbose output is -l (freebsd) or -v (linux)
+
+Wed Aug 11 13:13:39 1999 Loic Dachary <loic at yoda.ceic.com>
+
+ * acinclude.m4 : AM_WITH_ZLIB autoconf macro for zlib detection that
+ allows --with-zlib=DIR to specify the install root of zlib,
+ --without-zlib to prevent inclusion of zlib. If nothing
+ specified zlib is searched in /usr and /usr/local.
+ --disable-zlib is replaced with --without-zlib.
+
+ * configure.in,configure,aclocal.m4,db/dist/acinclude.m4,
+ db/dist/aclocal.m4,db/dist/configure,db/dist/configure.in:
+ changed to use AM_WITH_ZLIB
+
+Tue Aug 10 21:14:34 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htsearch/Display.cc (outputVariable): Fix compilation error with
+ assignment between char * and char *.
+
+ * htsearch/htsearch.cc (main): Use cleaner trick to sidestep
+ discarding const char * as suggested by Gilles.
+
+Tue Aug 10 17:24:12 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htsearch/Display.cc(expandVariables): clean up, simplify and
+ label lexical analyzer states.
+
+Tue Aug 10 17:04:54 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htsearch/Display.cc(expandVariables, outputVariable): add handling
+ for $%(var) and $&(var) in templates. Still to be documented.
+
+Tue Aug 10 20:13:52 1999 Loic Dachary <loic at yoda.ceic.com>
+
+ * db/mp/mp_bh.c: fixed HAVE_ZLIB -> HAVE_LIBZ
+
+Tue Aug 10 17:58:01 1999 Loic Dachary <loic at yoda.ceic.com>
+
+ * configure,configure.in,db/dist/configure.in,db/dist/configure:
+ added --with-zlib configure flag for htdig to specify zlib
+ installation path. Motivated to have compatible tests between
+ htdig and db as far as zlib is concerned. Otherwise configuration
+ is confused and misses an existing libz.
+
+Tue Aug 10 17:44:49 1999 Loic Dachary <loic at yoda.ceic.com>
+
+ * db/mp/mp_fopen.c: fixed cmpr_open called even if libz not here
+
+Tue Aug 10 17:40:53 1999 Loic Dachary <loic at yoda.ceic.com>
+
+ * htlib/langinfo.h: header missing on FreeBSD-3.2, needed
+ by strptime.c
+
+Tue Aug 10 11:43:14 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/HTML.h, htdig/HTML.cc(parse, do_tag): fix problems with
+ SGML entity decoding, add decoding of entities within tag attributes.
+
+Mon Aug 9 21:13:50 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htdig/HtHTTP.h(SetRequestMethod): Fix declaration to be void.
+
+ * htdig/Transport.h(GetRequestMaxDocumentSize): Fix declaration to
+ return int.
+
+ * htdig/Retriever.cc(got_href): Fix mistake in hopcount
+ calculations. Now returns the correct hopcount even for pages
+ when a faster path is found. (Still need to change indexing to
+ sort on hopcount).
+
+ * htsearch/htsearch.cc(main): Fix compiler error in gcc-2.95 when
+ discarding const by using strcpy. It's a hack, hopefully there's a
+ better way.
+
+Mon Aug 9 17:23:15 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htlib/URL.cc(ServerAlias): fix small memory leak in new default
+ path code (don't need to allocate new from string each time).
+
+ * htlib/cgi.cc(init): Fix PR#572, where htsearch crashed if
+ CONTENT_LENGTH was not set but REQUEST_METHOD was.
+
+ * htfuzzy/Fuzzy.cc(getWords), htfuzzy/Metaphone.cc(vscode):
+ Fix Geoff's change of May 15 to Fuzzy.cc, add test to vscode macro
+ to stay in array bounds, so non-ASCII letters don't cause a segfault.
+ Should fix PR#514.
+
+Mon Aug 9 17:03:45 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * include/htconfig.h.in, htcommon/WordList.cc(Word,Flush&BadWordFile),
+ htcommon/DocumentRef.cc(AddDescription), htcommon/defaults.cc,
+ htsearch/parser.cc(perform_push), htdoc/attrs.html,
+ htdoc/cf_byname.html, htdoc/cf_byprog.html:
+ Convert the MAX_WORD_LENGTH compile-time option into the run-time
+ configuration attribute maximum_word_length. This required reinserting
+ word truncation code that had been taken out of WordList.cc.
+
+Mon Aug 9 16:34:14 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/HtHTTP.cc (isParsable): allow application/pdf as parsable,
+ to use builtin PDF code.
+
+ * htdig/HtHTTP.cc (ParseHeader),
+ htdig/Document.cc (readHeader): clean up header parsing.
+
+ * htdig/Document.cc (getdate): make tm static, so it's initialized
+ to zeros. Should fix PR#81 & PR#472, where strftime() would crash
+ on some systems. Idea submitted by benoit.sibaud at cnet.francetelecom.fr
+
+ * htlib/URL.cc (parse): fix PR#348, to make sure a missing or invalid
+ port number will get set correctly.
+
+Mon Aug 9 15:42:41 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdoc/attrs.html, htdoc/cf_byname.html, htdoc/cf_byprog.html:
+ Added descriptions for attributes that were missing, added a few
+ clarifications, and corrected a few defaults and typos.
+ Covers PR#558, PR#626, and then some.
+
+ * configure.in, configure, include/htconfig.h.in, htlib/regex.c:
+ PR#545 fixed - configure tests for presence of alloca.h for regex.c
+
+Sat Aug 07 13:40:17 1999 Loic Dachary <loic at ceic.com>
+
+ * configure.in: remove test for strptime. Run autoconf + autoheader.
+
+ * htlib/HtDateTime.cc: always use htdig strptime, do not try to use
+ existing function in libc.
+
+ * htlib/HtDateTime.h: move inclusion of htconfig.h on top of file,
+ change #ifdef HAVE_CONFIG to HAVE_CONFIG_H
+
+Fri Aug 6 16:37:33 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/Document.cc (UseProxy): fix call to match() and test of
+ return value to work as documented for http_proxy_exclude (PR#603).
+
+Fri Aug 06 15:06:23 1999 <loic at yoda.ceic.com>
+
+ * db/dist/config.hin, db/mp/mp_cmpr.c db/db/db.c, db/mp/mp_fopen.c:
+ disable compression if zlib not found by configure.
+
+Thu Aug 05 12:27:15 1999 <loic at yoda.ceic.com>
+
+ * test/dbbench.cc: invert -z and -Z for consistency
+
+ * test/Makefile.am: add dbbench call examples
+
+Thu Aug 05 11:38:58 1999 Loic Dachary <loic at ceic.com>
+
+ * test/Makefile.am: all .html go in distribution, compile dbbench
+ that tests Berkeley DB performance.
+
+ * configure.in/Makefile.am: conditional inclusion of the test
+ directory in the list of subdirs (--enable-test). The list
+ of subdirs is now @HTDIGDIRS@ in configure.in & Makefile.am
+
+ * db/*: Transparent I/O compression implementation. Defines the DB_COMPRESS flag.
+ For instance DB_CREATE | DB_COMPRESS.
+
+ * db/db_dump/load: add -C option to specify cache size to db_dump/db_load
+
+Wed Aug 4 22:57:27 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * db/*: Import of Sleepycat's Berkeley DB 2.7.5.
+
+Wed Aug 4 22:40:49 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * contrib/htparsedoc/htparsedoc: Add in contributed bug fixes from
+ Andrew Bishop to work on SunOS 4.x machines.
+
+Wed Aug 4 01:58:52 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * COPYING, htdoc/COPYING, configure.in, Makefile.am, Makefile.in:
+ Update information to use canonical version of the GPL from the
+ FSF. In particular, this version has the correct mailing address
+ of the FSF.
+
+Mon Aug 02 11:28:00 1999 Gabriele Bartolini <g.bartol at comune.prato.it>
+
+ * htlib/htString.h, htlib/String.cc : added the possibility to
+ insert an unsigned int into a string.
+ * htdig.cc : with verbose mode shows start and end time.
+
+Thu Jul 22 18:10:00 1999 Gabriele Bartolini <g.bartol at comune.prato.it>
+
+ * htdig/Transport.cc, htdig/HtHTTP.cc : modified the destructors.
+
+Thu Jul 22 13:10:00 1999 Gabriele Bartolini <g.bartol at comune.prato.it>
+
+ * htdig/Transport.cc, htdig/Transport.h, htdig/HtHTTP.cc,
+ htdig/HtHTTP.h: Re-analyzed inheritance methods and attributes of
+ the 2 classes. This is a first step, not definitive ... cos it
+ still doesn't work as I hope.
+
+Tue Jul 20 11:21:52 1999 <loic at ceic.com>
+
+ * configure.in : added AM_MAINTAINER_MODE to prevent unwanted
+ dependencies check by default.
+
+ * db/Makefile.in : remove Makefile when distclean
+
+Mon Jul 19 13:23:53 1999 <loic at ceic.com>
+
+ * Makefile.config (INCLUDES): added -I$(top_srcdir)/include because
+ automatically -I../include is not good, added -I$(top_builddir)/db/dist
+ because some db headers are configure generated (if building in a
+ directory that is not the source directory).
+
+ * rename db/Makefile db/Makefile.in: otherwise it does not show
+ up if building in a directory that is not the source directory.
+
+Mon Jul 19 13:02:22 1999 <loic at ceic.com>
+
+ * .cvsignore: do not ignore Makefile.config
+
+Sun Jul 18 22:47:49 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htsearch/parser.cc: Eliminated compiler errors. Currently
+ returns no matches until bugs in the WordList code are fixed.
+
+Sun Jul 18 22:42:04 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htmerge/htmerge.h: Cleanup, including WordRecord and
+ WordReference as needed.
+
+ * htmerge/htmerge.cc: Update for files necessary for merge
+ calls.
+ Call convertDocs before mergeWords so that the discardList gets
+ the list of documents deleted.
+
+ * htmerge/docs.cc: Update for difference in calling order.
+
+ * htmerge/words.cc: Update (and significant cleanup) since
+ WordList writes directly to db.words.db. Iterate over the stored
+ words, deleting those from deleted documents.
+
+ * htmerge/db.cc: Update to eliminate compiler errors. Currently
+ disabled until bugs in the words code are fixed.
+
+Sun Jul 18 22:33:49 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htcommon/defaults.cc: Collapse the multiple heading_factors into
+ one. (It's prohibitive to define a flag for each h* tag).
+ Add a new url_factor for the text of URLs (presently unused).
+
+ * htcommon/DocumentRef.cc(AddDescription): Use FLAG_LINK_TEXT as
+ defined in htcommon/WordRecord.h.
+
+ * htdig/Retriever.h: Change factor to accommodate flags instead of
+ weighting factors.
+
+ * htdig/Retriever.cc: Update to use flags, and define the indexed
+ flags in factor as appropriate.
+
+ * htdig/HTML.cc: Update calls to got_word with appropriate new
+ offsets into factor[].
+
+Sun Jul 18 22:18:16 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htcommon/WordReference.h, htcommon/WordRecord.h: Update to use
+ flags instead of weight.
+
+ * htcommon/WordList.h, htcommon/WordList.cc: Add database access
+ routines to match DocumentDB.cc.
+ (Word): Recognize flags instead of weight, simply add the
+ word. (Duplicates expected!)
+ (mark*): Simply delete the list of words.
+ (flush): Rather than dump to a text file, dump directly to the db.
+
+Sun Jul 18 21:50:04 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htlib/Database.h, htlib/DB2_db.h, htlib/DB2_hash.h: Add new
+ method Get_Item to access the data of the current item when using
+ Get_Next() or Get_Next_Seq().
+
+ * htlib/DB2_db.h, htlib/DB2_hash.cc: Implement Get_Item() using
+ cursor access.
+
+Sat Jul 17 12:59:01 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * test/*.html: Added various HTML files as the beginnings of a
+ testing suite.
+
+Fri Jul 16 16:06:27 1999 Loic Dachary <loic at ceic.com>
+
+ * All libraries (except db) use libtool. Shared libraries are
+ generated by default. --disable-shared to get old behaviour.
+ Libraries are installed in all cases.
+
+ * Change structure of default installation directory (match
+ standard).
+ database : var/htdig
+ programs : bin
+ libraries : lib
+
+ Like default apache:
+ conf : conf
+ htdocs : htdocs/htdig
+ cgi-bin : cgi-bin
+
+ * Switch all Makefile.in into Makefile.am
+
+ * CONFIG.in CONFIG : removed. Replaced with --with- arguments in
+ configure.in
+
+ * Makefile.config.in removed, only keep Makefile.config : automake
+ automatically defines variables for each AC_SUBST variable.
+ Makefile.config has HTLIBS + DEFINES
+
+ * db/Makefile : added to forward (clean all distclean) targets to
+ db/dist and implement distdir target.
+
+ * acconfig.h : created to allow autoheader to work (contains GETPEERNAME_LENGTH_T,
+ HAVE_BOOL, HAVE_TRUE, HAVE_FALSE, NEED_PROTO_GETHOSTNAME). Extra definitions
+ added before @TOP@ (TRUE, FALSE, VERSION, MAX_WORD_LENGTH, LOG_LEVEL, LOG_FACILITY).
+
+ * installdir/Makefile.am : installation rules moved from Makefile.am to installdir/Makefile.am
+
+ * include/Makefile.am : distribute htconfig.h.in and stamp-h.in
+
+ * Makefile.am : do not pre-create the directories, creation is done during the installation
+
+ * configure.in: CF_MAKE_INCLUDE not needed anymore : automake handles
+ the include itself.
+
+Fri Jul 16 13:04:27 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/HTML.cc(parse): fix to prevent closing ">" from being passed
+ to do_tag().
+
+Thu Jul 15 21:25:12 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htdig/Document.cc (readHeader, getParsable): Add back
+ application/pdf to use builtin PDF code.
+
+ * htdig/Makefile.in: Remove broken Postscript parser as it never
+ worked.
+
+ * htlib/URL.cc (normalizePath, path): Use config.Boolean as
+ pointed out by Gilles.
+
+Thu Jul 15 15:54:30 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdoc/attrs.html(pdf_parser & external_parsers): add corrections &
+ clarifications, links to relevant FAQ entries.
+
+Thu Jul 15 18:00:00 1999 CEST Gabriele Bartolini <g.bartol at comune.prato.it>
+
+ * htlib/HtDateTime.cc, htlib/HtDateTime.h : added the possibility
+ to initialize and compare HtDateTime with integers. Added the
+ constructor HtDateTime (int) and various operator overloading methods.
+
+Wed Jul 14 22:57:14 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htlib/URL.cc (normalizePath, path): If not case_sensitive,
+ lowercase the URL. Should ensure that all URLs are appropriately
+ lowercased, regardless of where they're generated.
+
+Wed Jul 14 22:37:47 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htlib/DB2_db.cc (OpenReadWrite, OpenRead): Add flag DB_DUP to
+ database to allow storage of duplicate keys (in this case,
+ words).
+
+Tue Jul 13 15:36:40 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/HTML.cc (do_tag): Fix handling of <link> and <area>,
+ to use href= instead of src=.
+
+Mon Jul 12 22:31:48 1999 Hanno Mueller <kontakt at hanno.de>
+
+ * contrib/scriptname/results.shtml: Remove unintentional $(VERSION).
+
+Mon Jul 12 22:20:40 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htdig/HTML.cc (do_tag): Cleanups suggested by Gilles, combining
+ <link> and <area>, <embed>, <object> and <frame> and moving <img>
+ to a separate case.
+
+Sun Jul 11 19:32:38 1999 Hanno Mueller <kontakt at hanno.de>
+
+ * contrib/README: Add scriptname directory.
+
+ * contrib/scriptname/*: An example of using htsearch within
+ dynamic SSI pages
+
+ * htcommon/defaults.cc: Add script_name attribute to override
+ SCRIPT_NAME CGI environment variable.
+
+ * htdoc/FAQ.html: Update question 4.7 based on including htsearch
+ as a CGI in SSI markup.
+
+ * htdoc/attrs.html, htdoc/cf_byname.html, htdoc/cf_byprog.html,
+ htdoc/hts_templates.html: Update based on behavior of script_name
+ attribute.
+
+ * htsearch/Display.cc: Set SCRIPT_NAME variable to attribute
+ script_name if set, and to the CGI environment variable otherwise.
+
+Sat Jul 10 00:22:34 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htfuzzy/Regex.cc (getWords): Anchor the match to the beginning
+ of string, add regex-interpreted characters to extra_word_chars
+ temporarily, and strip remaining punctuation before making a match.
+
+Fri Jul 9 22:35:57 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htsearch/Display.cc: Back out change of June 24.
+
+ * htsearch/htsearch.cc: Ditto.
+
+ * htsearch/htsearch.cc (setupWords): Remove HtStripPunctuation in
+ favor of requiring Fuzzy classes to strip whatever punctuation is
+ necessary.
+
+ * htfuzzy/Fuzzy.h: Add HtWordType.h to #includes and update comments.
+
+ * htfuzzy/Synonym.cc, htfuzzy/Substring.cc, htfuzzy/Speling.cc,
+ htfuzzy/Prefix.cc, htfuzzy/Exact.cc, htfuzzy/Endings.cc,
+ htfuzzy/Fuzzy.cc (getWords): Call HtStripPunctuation on input before
+ performing fuzzy matching.
+
+Thu Jul 8 21:28:44 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htdig/HTML.cc (do_tag): Add support for parsing <LINK> tags.
+
+Mon Jul 5 16:53:23 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htdig/htdig.cc (main): Insert '*' instead of username/password
+ combination to hide credentials in process accounting.
+
+Sat Jul 3 17:35:52 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htdig/Transport.h(ConnectionWrite): Return value from
+ Connection::write call.
+
+ * htdig/URLRef.h, htdig/URLRef.cc: Cleanup and made hopcount
+ default consistent with 7/3 change to DocumentRef.cc
+
+ * htdig/Server.h, htdig/Server.cc, htdig/Retriever.cc: Cleanup and
+ fixes to match URLRef calling interface.
+
+Sat Jul 3 16:37:29 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htdig/HTML.cc (do_tag): Fix <meta> robots parsing to allow
+ multiple directives to work correctly. Fixes PR#578, as provided
+ by Chris Liddiard <c.h.liddiard at qmw.ac.uk>.
+
+Sat Jul 3 00:47:51 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htdig/Makefile.in: Remove old SGMLEntities code.
+
+Sat Jul 3 00:26:55 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htcommon/DocumentRef.cc (Clear): Change default value of
+ docHopCount to 0 to fix several hopcount bugs.
+
+ * htdig/Transport.h, htdig/Transport.cc: Changes to support URL
+ referers as well as authentication credentials.
+
+ * htdig/HtHTTP.h, htdig/HtHTTP.cc(SetCredentials): Implement HTTP
+ Basic Authentication credentials.
+ (SetRequestCommand): Use Referer and Authentication headers if
+ supplied.
+
+Sun Jun 30 11:26:00 1999 Gabriele Bartolini <g.bartol at comune.prato.it>
+
+ * htdig/Transport.h: Inserted the methods declarations regarding
+ the connection management. The code has been moved out from the
+ HtHTTP.h code. Also moved here the static variable 'debug'.
+
+ * htdig/Transport.cc: Definition of the connection management code.
+ The code has been moved out from the HtHTTP.cc code.
+
+ * htdig/HtHTTP.h: Eliminated the connection management code and the
+ static variable 'debug'. Inserted the 'modification_time_is_now' as
+ a static variable, in order to respect the encapsulation principle.
+
+ * htdig/HtHTTP.cc: Eliminated the connection management code and the
+ static variable 'debug' initialization. Inserted the
+ 'modification_time_is_now' initialization.
+
+Sun Jun 27 16:29:49 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htdig/HTML.h: Cleanup.
+
+ * htcommon/defaults.cc: Added default for img_alt_factor for text
+ weighting on <IMG ALT="..."> tags.
+
+ * htdig/Retriever.cc: Add slot for img_alt_factor.
+
+ * htdig/HTML.cc (do_tag): Rewrite using Configuration class to
+ separate tag attributes.
+ (parse): Ignore final '>' in string passed to do_tag.
+ (do_tag): Index IMG ALT text.
+
+Fri Jun 25 17:58:44 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htdig/Transport.h: Fix virtual methods for Transport_Response to
+ have defaults.
+
+ * htdig/HtHTTP.h: Fix class declaration of HtHTTP class to prevent
+ syntax error. Pointed out by Gabriele.
+
+ * htdig/Transport.cc: Add (empty) ctor and dtor functions for
+ Transport_Response.
+
+Thu Jun 24 22:28:44 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htsearch/htsearch.cc (main): Add support for form inputs
+ configdir and commondir as contributed by Herbert Martin Dietze
+ <herbert at fh-wedel.de>.
+
+ * htsearch/Display.cc (createURL): If configdir and commondir are
+ defined, add them to URLs sent for other pages.
+
+Wed Jun 23 23:00:18 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htdig/HtHTTP.h, htdig/HtHTTP.cc: Make a subclass of Transport.
+
+Wed Jun 23 22:08:20 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htlib/Configuration.cc (Add): Handle single-quoted values for
+ attributes.
+
+Tue Jun 22 23:35:39 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htdig/Transport.h, htdig/Transport.cc: Virtual classes to handle
+ transport protocols such as HTTP, FTP, WAIS, gopher, etc.
+
+ * htdig/Makefile.in: Make sure they're compiled (not that there's
+ much!)
+
+ * htdig/HtHTTP.h: Add htdig.h to ensure config is defined.
+
+Mon Jun 21 14:33:10 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/Document.cc(readHeader), htdig/HtHTTP.cc(ParseHeader): fix
+ handling of modification_time_is_now in readHeader, add similar code
+ to ParseHeader.
+
+Sun Jun 20 21:25:15 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htdig/Retriever.h: Add hop parameter to got_href
+ method. Defaults to 1.
+
+ * htdig/Retriever.cc(got_href): Use it instead of constant 1.
+
+ * htdig/HTML.cc (do_tag): Use new hop parameter to keep the same
+ hopcount for frame, embed and object tags.
+
+ * htdig/Makefile.in: Make sure HtHTTP.cc is compiled.
+
+ * htdig/HtHTTP.cc (ctor): Add default value for _server to
+ prevent strange segmentation faults.
+
+Fri Jun 18 09:53:30 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htcommon/DocumentRef.h, htcommon/DocumentRef.cc(Clear, Deserialize):
+ add docHeadIsSet field, code for setting and getting it.
+ * htcommon/DocumentDB.cc(Add): only put out excerpt record if DocHead
+ is really set.
+ * htmerge/docs.cc(convertDocs): add missing else after code to delete
+ documents with no excerpts.
+ (All these changes fix the disappearing excerpts problem in 3.2.)
+
+Wed Jun 16 23:04:38 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htdig/Document.cc (UseProxy): Change http_proxy_exclude to an
+ escaped regex string. Allows for much more complicated rules.
+
+Wed Jun 16 16:04:07 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * Makefile.config.in: fix typo in name IMAGE_URL_PREFIX.
+
+ * htdig/Retriever.cc(IsValidURL): change handling of valids to only
+ reject if list is not empty, give different error message.
+
+Wed Jun 16 14:40:56 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htsearch/htsearch.cc(main): pass StringList args to setEscaped()
+ instead of unprocessed input[] char *'s.
+
+ * htsearch/Display.cc(buildMatchList): cast score to (int) in maxScore
+ calculation, to avoid compiler warnings.
+
+ * htdig/htdig.cc(main): change comparison on minimalFile to avoid
+ compiler warnings.
+
+Wed Jun 16 11:30:23 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htlib/HtRegex.cc(setEscaped): Fix appending of substring to avoid
+ compiler warnings.
+
+ * htlib/HtDateTime.cc(SettoNow): Strip out all the nonsense that
+ doesn't work, set Ht_t directly instead.
+
+Wed Jun 16 09:58:12 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * configure.in, configure, Makefile.config.in: Correct handling of
+ SEARCH_FORM variable, as Gabriele recommended.
+
+Wed Jun 16 09:32:06 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htlib/cgi.h, htlib/cgi.cc(cgi & init), htsearch/htsearch.cc
+ (main & usage): allow a query string to be passed as an argument.
+
+Wed Jun 16 08:43:09 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htcommon/Makefile.in, htdig/Makefile.in, htfuzzy/Makefile.in,
+ htmerge/Makefile.in, htnotify/Makefile.in: Use standard $(bindir)
+ variable instead of $(BIN_DIR). Allows for standard configure flags
+ to set this. (Completes Geoff's change on May 15.)
+
+Tue Jun 15 14:31:50 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/PDF.cc(parseNonTextLine): move line that clears _parsedString,
+ so the title is cleared even if rejected.
+
+ * htsearch/Display.cc(buildMatchList & sort): move maxScore calculation
+ from sort to buildMatchList, so it's done even if there's only 1 match.
+
+Mon Jun 14 15:01:07 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/Document.cc(RetrieveHTTP): Show "Unknown host" message if
+ Connection::assign_server() fails (due to gethostbyname() failure).
+
+Mon Jun 14 13:52:34 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htcommon/defaults.cc, htsearch/Display.h, htsearch/Display.cc,
+ htdoc/attrs.html, htdoc/cf_byname.html, htdoc/cf_byprog.html,
+ htdoc/hts_templates.html: add template_patterns attribute, to select
+ result templates based on URL patterns.
+
+Sun Jun 13 16:29:19 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htdig/Retriever.cc (IsValidURL): Add valid_extension list, as
+ requested numerous times.
+
+ * htcommon/defaults.cc: Add config attribute valid_extensions,
+ with default as empty.
+
+Sat Jun 12 23:10:39 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htcommon/DocumentRef.h: Fix thinkos introduced in change earlier
+ today. Actually compiles correctly now.
+
+Sat Jun 12 22:37:22 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htdig/HtHTTP.cc (ParseHeader): Fix parsing to take empty headers
+ into account. Fixes PR#557.
+
+ * htsearch/Display.h, htsearch/Display.cc (excerpt): Fix
+ declaration to refer to first as reference--ensures ANCHOR is
+ properly set. Fixes PR#541 as suggested by <pmb1 at york.ac.uk>.
+
+ * htfuzzy/Endings.cc (getWords): Fixed PR#560 as suggested by
+ Steve Arlow <yorick at ClarkHill.com>. Solves problems with fuzzy
+ matching on words like -ness: witness, highness, likeness... Tries
+ to interpret words as root words before attempting stemming.
+
+ * installdir/search.html (Match): Add Boolean to default search
+ form, as suggested by PR#561.
+
+ * htlib/URL.cc (URL): Fix PR#566 by setting the correct length of
+ the string being matched. 'http://' is 7 characters...
+
+Sat Jun 12 19:06:36 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htlib/HtZlibCodec.h, htlib/HtZlibCodec.cc: New files. Provide
+ general access to zlib compression routines when available.
+
+ * htcommon/DocumentRef.h, htcommon/DocumentRef.cc: Remove
+ compression access and restore DocHead access through default
+ methods. Compression of excerpts will occur through the
+ HtZlibCodec classes and through the DocumentDB excerpt access.
+
+Sat Jun 12 15:25:08 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htmerge/docs.cc (convertDocs): Load excerpt from external
+ database before considering it empty.
+
+Sat Jun 12 14:41:54 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htsearch/Display.cc (displayMatch): Added patch from Torsten
+ Neuer <tneuer at inwise.de> to fix PR#554.
+
+ * htdig/HTML.cc (do_tag): Add parsing for <embed> and <object>,
+ including suggestions from Gilles as to condensing cases with
+ <img> parsing.
+
+Sat Jun 12 14:00:39 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htdig/ExternalParser.cc (parse): Quote the filename before
+ passing it to the command-line to prevent shell escapes. Fixes PR#542.
+
+Fri Jun 11 15:59:10 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htlib/URL.cc(removeIndex): use CompareWord instead of FindFirstWord,
+ to avoid substring matches.
+
+Wed Jun 2 15:51:00 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htlib/URLTrans.cc(encodeURL): Fix to ensure that non-ASCII letters
+ get URL-encoded.
+
+Mon May 31 22:40:29 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htcommon/DocumentDB.cc(ReadExcerpt): Fix silly typos with methods,
+ thinko with docID.
+ (Add): Add the excerpt *before* the URL index is written.
+
+ * htdig/Retriever.cc(IsValidURL): Remove code restricting URLs to
+ relative and http://.
+
+ * htdig/htdig.cc(main): Unlink the doc_excerpt file when doing an
+ initial dig.
+ (main): Fix silly typo with minimumFile.
+
+ * htmerge/db.cc(mergeDB): Call DocumentDB::Open() with doc_excerpt for
+ consistency--doesn't actually do anything with it.
+
+ * htmerge/docs.cc(convertDocs): Ditto. Also don't delete a
+ document simply because it has an empty DocHead. Excerpts are now
+ stored in a separate database!
+
+ * htmerge/htmerge.h: Call mergeDB and convertDocs with
+ doc_excerpt parameter.
+
+ * htmerge/htmerge.cc(main): Ditto.
+
+ * htsearch/Display.h: Call ctor with all three doc db filenames.
+
+ * htsearch/Display.cc(Display): Call DocumentDB::Open with above.
+ (excerpt): Retrieve the excerpt from the excerpt database.
+
+ * htsearch/htsearch.cc: Call Display::Display with all three doc
+ db filenames.
+
+Mon May 31 15:08:30 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htcommon/DocumentDB.h: Add new method ReadExcerpt to read the
+ excerpt from the separate (new) excerpt database. Change Open()
+ and Read() methods to account for this new database.
+
+ * htcommon/DocumentDB.cc (Open): Open the excerpt database too.
+ (Read): Ditto.
+ (Close): Close it if it exists.
+ (ReadExcerpt): Explicitly read the DocHead of this DocumentRef.
+ (Add): Make sure DocHeads go into the excerpt database.
+ (Delete): Make sure we delete the associated excerpt too.
+ (CreateSearchDB): Make sure we grab the excerpt from the database.
+
+ * htcommon/DocumentRef.cc(Serialize): Don't serialize the DocHead
+ field, this is done in the DocumentDB code.
+
+ * htcommon/defaults.cc(modification_time_is_now): Set to true to
+ avoid problems with not setting dates when no Last-Modified:
+ header appears.
+ (doc_excerpt): Add new attribute for the filename of the excerpt
+ database.
+
+ * htdig/HtHTTP.h: Remove incorrect virtual declarations from
+ Request and EstablishConnection methods. Assign void return value
+ to ResetStatistics since it doesn't return a value.
+
+ * htdig/htdig.cc (main): Add new "minimal" flag '-m' to only index
+ the URLs in the supplied file. Sets hopcount to ignore links.
+
+Sun May 30 19:36:15 1999 Alexander Bergolth <leo at leo.wu-wien.ac.at>
+
+ * htlib/URL.cc (normalizePath): Fix bug that caused endless loops
+ and core dumps when normalizing URLs with more than one of
+ ( "/../" | "/./" | "//" | "%7E" )
+
+ * htlib/HtDateTime.cc (Httimegm): Call Httimegm in timegm.c unless
+ HAVE_TIMEGM.
+
+Wed May 26 23:15:46 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htmerge/db.cc (mergeDB): Add patch contributed by Roman Dimov
+ <roman at twist.mark-itt.ru> to fix problems with confusing docIDs,
+ resulting in documents in main db removed when the corresponding
+ DocID was supposed to be removed from the merged db.
+
+Wed May 26 11:30:22 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htsearch/Display.h, htsearch/Display.cc, htsearch/htsearch.cc:
+ Switch restrict and excludes to use HtRegex instead of StringMatch.
+
+ * htdig/htdig.cc (main): Fix typo clobbering setting of
+ excludes. Obviously fixes problems with badquerystr and excludes!
+
+ * htdig/HtHTTP.cc (ParseHeader): Change parsing to skip extra
+ whitespace, as in 5/19 Document.cc(readHeader) change.
+
+Wed May 19 22:17:49 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htdig/HtHTTP.cc, htdig/HtHTTP.h: Add new files, contributed by
+ Gabriele. A start at an HTTP/1.1 implementation.
+
+ * htdig/Document.cc (readHeader): Fix change of 5/16 to actually
+ work! :-)
+
+ * htsearch/Display.cc (expandVariables): Change end-of-expansion
+ test to include states 2 and 5 to ensure templates ending in } are
+ still properly expanded, as suggested by Gilles.
+
+Mon May 17 14:31:31 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htlib/HtRegex.cc (setEscaped): Use full list of characters to
+ escape as suggested by Gilles.
+
+Sun May 16 17:27:51 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htdig/Document.cc (readHeader): Since multiple whitespace
+ characters are allowed after headers, don't use strtok.
+ (readHeader): We no longer pretend to parse Word, PostScript, or
+ PDF files internally.
+ (getParsable): Don't generate PostScript or PDF objects since we
+ no longer recommend using them.
+
+Sun May 16 17:07:19 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htlib/HtRegex.cc (setEscaped): Ensure escaping does not loop
+ beyond the end of a string.
+
+ * htdig/Retriever.cc (IsValidURL): Fix badquerystr parsing to use
+ HtRegex as expected. (Oops!)
+
+ * htdig/HTML.cc (parse): Use HtSGMLCodec during parsing, rather
+ than encoding the whole document at the beginning. More consistent
+ with previous use of SGMLEntities.
+
+Sat May 15 12:57:40 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htlib/URL.cc (normalizePath): Remove extra (useless) variable
+ declarations.
+
+ * htlib/htString.h, htlib/String.cc: Add new method Nth to solve
+ problems with (String *)->[].
+
+ * htlib/HtRegex.h, htlib/HtRegex.cc: Added new method
+ setEscaped(StringList) to produce a pattern connected with '|' of
+ possibly escaped strings. Strings are not escaped if enclosed in
+ [] and the brackets are removed from unescaped regex.
+
+ * htdig/htdig.h: Use HtRegex instead of StringMatch for limiting
+ by default.
+
+ * htdig/Retriever.cc: As above.
+
+ * htdig/htdig.cc(main): As above. Use setEscaped to set limits
+ correctly (i.e. in a backwards-compatible way).
+
+Sat May 15 11:24:26 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htfuzzy/Speling.h, htfuzzy/Speling.cc: New files for simple
+ spelling correction. Currently limited to transposition and added
+ character errors. Missing character errors to be added soon.
+
+ * htfuzzy/Makefile.in: Compile it.
+
+ * htfuzzy/Fuzzy.cc (getFuzzyByName): Use it.
+
+ * htcommon/defaults.cc: Add new option minimum_speling_length for
+ the shortest query word to receive speling fuzzy
+ modifications. Should prevent problems with valid words generating
+ unrelated "corrections" of words. Default is 5 chars.
+
+Sat May 15 11:18:27 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htfuzzy/Fuzzy.cc (getWords): Ensure word is not an empty or null
+ string.
+
+ * htfuzzy/Metaphone.cc (generateKey): Ditto. Should solve PR#514.
+
+ * htdig/Document.cc (Reset): Do not use modification_time_is_now
+ attribute. Simply reset modtime to 0, time is set elsewhere.
+
+ * Makefile.config.in: Add options from separate CONFIG files.
+
+ * configure.in, configure: Add configure-level switches for
+ --with-image-url-prefix= and --with-search-form=. Do not generate
+ CONFIG file (hopefully to be phased out soon).
+
+ * */Makefile.in: Make linking CONFIG-dependent files depend on
+ Makefile.config, not CONFIG.
+
+ * Makefile.in: Use standard $(bindir) variable instead of
+ $(BIN_DIR). Allows for standard configure flags to set this.
+
+Tue May 11 11:15:08 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htlib/HtDateTime.h, htlib/HtDateTime.cc: Updates from Gabriele,
+ fixing SetToNow() and adding GetDiff to return the difference in
+ time_t between two objects.
+
+ * htdig/Retriever.cc (Need2Get): Add patch from Warren Jones
+ <wjones at tc.fluke.com> to keep track of inodes on local files to
+ eliminate duplicates. Hopefully this will serve for a first-try at
+ a signature method for HTTP as well.
+
+Tue May 4 20:20:40 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htfuzzy/Regex.h, htfuzzy/Regex.cc: Add new regex fuzzy
+ algorithm, based on Substring and Prefix.
+
+ * htfuzzy/Fuzzy.cc (getFuzzyByName): Add it.
+
+ * htfuzzy/Makefile.in: Compile it.
+
+ * htcommon/defaults.cc: Add new attribute regex_max_words, same
+ concept as substring_max_words.
+
+ * htfuzzy/Exact.cc, htfuzzy/Substring.cc, htfuzzy/Prefix.cc:
+ Define names attribute for debugging purposes.
+
+ * installdir/htdig.conf: Fix the comments for search_algorithm to
+ refer to all the current possibilities.
+
+ * htlib/HtRegex.cc (match): Slight cleanup of how to return.
+
+Tue May 4 15:28:38 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htsearch/htsearch.cc (reportError): Add e-mail of maintainer to
+ error message. Should help direct people to the correct place.
+
+ * htdig/Retriever.cc (IsValidURL): Lowercase all extensions from
+ bad_extensions as well as all extensions used in
+ comparisons. Ensures we're using case-insensitive matching.
+
+Mon May 3 23:20:22 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htdig/Retriever.cc (IsValidURL): Fix typo with #else statement
+ for REGEX.
+
+ * htdig/htdig.cc: Add conditionals for REGEX to use HtRegex
+ instead of StringMatch methods when defined.
+
+ * htlib/HtDateTime.h: Update to remove definitions of true and
+ false, established by May 2 change in
+ include/htconfig.h.in as contributed by Gabriele.
+
+ * htlib/HtDateTime.cc: Replace call to mktime internal function with
+ Httimegm in timegm.c, contributed by Leo.
+
+ * htlib/timegm.c: Declare my_mktime_gmtime_r to prevent compiler
+ errors with incompatible gmtime structures, contributed by Leo.
+
+ * configure.in: Rearrange date/time checks for clarity.
+
+ * configure: Regenerate using autoconf.
+
+ * include/htconfig.h.in: Add HAVE_STRFTIME flag.
+
+Sun May 2 18:49:04 1999 Alexander Bergolth <leo at leo.wu-wien.ac.at>
+
+ * configure.in, include/htconfig.h.in: Added a configure test for
+ the availability of the bool type.
+
+Fri Apr 30 20:00:09 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htlib/HtDateTime.h, htlib/HtDateTime.cc: Update with new
+ versions sent by Gabriele.
+
+Fri Apr 30 19:30:42 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htlib/HtRegex.h, htlib/HtRegex.cc: New class, contributed by
+ Peter D. Gray <pdg at draci.its.uow.edu.au> as a small wrapper for
+ system regex calls.
+
+ * htlib/Makefile.in: Build it.
+
+ * htdig/htdig.h: Use it if REGEX is defined.
+
+ * htdig/htdig.cc: Ditto.
+
+ * htdig/Retriever.cc: Ditto.
+
+ * htsearch/Display.cc(generateStars): Remove extra newline after
+ STARSRIGHT and STARSLEFT variables, noted by Torsten Neuer
+ <tneuer at inwise.de>.
+
+Fri Apr 30 18:52:56 1999 Alexander Bergolth <leo at leo.wu-wien.ac.at>
+
+ * htlib/URL.cc(ServerAlias): port for server_aliases entries now
+ defaults to 80 if omitted.
+
+Wed Apr 28 19:57:38 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htlib/HtDateTime.h, htlib/HtDateTime.cc: New class, contributed
+ by Gabriele.
+
+ * htlib/Makefile.in: Compile it.
+
+ * README: Update message from 3.1.0 (oops!) to 3.2.0, remove rx
+ directory.
+
+ * installdir/htdig.conf: Add example of no_excerpt_show_top
+ attribute in line with most users' expectations.
+
+ * contrib/README: Mention contributed section of the website.
+
+ * Makefile.in: Ignore mailarchive directory--now removed from CVS.
+
+Wed Apr 28 10:46:31 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htmerge/db.cc(mergeDB): fix a few errors in how the merge index
+ name is obtained.
+
+Tue Apr 27 23:00:39 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * Makefile.config.in: Remove now-useless LIBDIRS variable.
+
+ * mailarchive/Split.java, mailarchive/htdig: Remove ancient
+ mailarchive stuff.
+
+Tue Apr 27 18:01:52 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htsearch/Display.cc(setupImages): Remove code setting URLimage to
+ a bogus pattern (remnant left over after merge).
+
+Tue Apr 27 16:43:08 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/Document.cc(RetrieveHTTP): Show "Unable to build connection"
+ message at lower debug level.
+
+Tue Apr 27 11:24:19 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htsearch/Display.h: Remove sort, compare functions re-introduced
+ in merge. Moved to ResultMatch by Hans-Peter's April 19th changes.
+
+ * htsearch/Display.cc: Remove bogus call to ResultMatch::setRef,
+ removed by Hans-Peter's April 19th changes.
+
+Sat Apr 24 21:08:35 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * Merge in changes from 3.1.2 (see below).
+
+ * htcommon/WordList.cc: Change valid_word to use iscntrl().
+
+ * htdig/Plaintext.cc: Remove CVS Log.
+
+ * htdig/Retriever.cc: Fix ancient bug with empty excludes list.
+
+ * htlib/List.cc: Remove CVS Log, use more succinct test for
+ out-of-bounds.
+
+ * htsearch/Display.cc: Fix logic with starPatterns, only show top
+ of META description.
+
+ * htsearch/Display.h: Introduce headers needed for sort functionality.
+
+ * installdir/htdig.conf: Add example max_doc_size attribute as
+ well as example for including start_url from a file.
+
+ * htdoc/ChangeLog, htdoc/RELEASE.html, htdoc/FAQ.html,
+ htdoc/where.html, htdoc/cf_byname.html, htdoc/cf_byprog.html,
+ htdoc/uses.html, htdoc/contents.html, htdoc/mailarchive.html:
+ Merge in documentation updates from 3.1.2.
+
+Sat Apr 24 15:18:45 1999 Hans-Peter Nilsson <hp at bitrange.com>
+
+ * htsearch/Display.cc (sort): Return immediately if <= 1 items to
+ sort.
+
+Mon Apr 19 00:53:06 1999 Hans-Peter Nilsson <hp at bitrange.com>
+
+ * htsearch/ResultMatch.h (create): New. All (the only) ctor
+ caller changed to use this.
+ (setRef, getRef): Removed. Callers changed to use nearby data.
+ (incomplete): Removed.
+ (setIncompleteScore): Renamed to...
+ (setScore): ...this. All callers changed.
+ (setSortType): New.
+ (getTitle, getTime, setTitle, setTime, getSortFun): New virtual
+ functions.
+ (enum SortType): Moved from Display, private.
+ (mySortType): New static member.
+
+ * htsearch/ResultMatch.cc (mySortType): Define static member
+ variable.
+ (getScore): Remove handling of "incomplete". Moved to ResultMatch.h
+ (getTitle, getTime, setTitle, setTime): New dummy functions.
+ (class ScoreMatch, class TimeMatch, class IDMatch, class
+ TitleMatch): Derived classes with compare functions (from Display)
+ and extra sort-method-related members, as needed.
+ (setSortType): New, mostly moved from Display.
+ (create): New.
+
+ * htsearch/Display.h: Changed first argument from ResultMatch * to
+ DocumentRef *.
+ (compare, compareTime, compareID, compareTitle, enum SortType,
+ sortType): Removed.
+
+ * htsearch/Display.cc (display): Call ResultMatch::setSortType and
+ output syntax error page for invalid sort methods.
+ (displayMatch): Change first argument from ResultMatch * to
+ DocumentRef *ref. All callers changed.
+ (buildMatchList): Remove call to sortType and typ variable.
+ Always call (ResultMatch::)setTime and setTitle. Remove extra
+ call to setID.
+ (sort): Call (ResultMatch::)getSortFun for qsort compare function.
+ (compare, compareTime, compareID, compareTitle, sortType): Removed.
+
+Wed Apr 14 21:21:35 1999 Alexander Bergolth <leo at leo.wu-wien.ac.at>
+
+ * htlib/regex.c: fixed compile problem with AIX xlc compiler
+
+ * htlib/HtHeap.h: fixed compile problem with AIX xlc compiler (bool)
+
+ * htlib/HtVector.h: ditto
+
+ * htsearch/Display.cc: fixed typo
+
+Wed Apr 14 00:17:06 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htsearch/Display.h: Add compareID for sorting results by DocID.
+
+ * htsearch/Display.cc: As above.
+
+Tue Apr 13 23:50:28 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htcommon/defaults.cc: Add new config option use_doc_date to use
+ document meta information for the DocTime() field.
+
+ * htdig/HTML.cc(do_tag): Call Retriever::got_time if use_doc_date
+ is set and we run across a META date tag.
+
+ * htdig/Retriever.h, htdig/Retriever.cc: Add new got_date
+ function. When called, sets the DocTime field of the DocumentRef
+ after parsing is completed. Currently assumes ISO 8601 format for
+ the date tag.
+
+Sun Apr 11 12:51:39 1999 Hans-Peter Nilsson <hp at bitrange.com>
+
+ * htsearch/Display.cc (buildMatchList): Delete thisRef if excluded
+ by URL. Call setRef(NULL), not setRef(thisRef).
+
+Wed Apr 7 19:35:42 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htsearch/htsearch.cc(usage): Remove bogus -w flag.
+
+Thu Apr 1 12:05:11 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htsearch/htsearch.cc(main): Apply Gabriele's patch to avoid using an
+ invalid matchesperpage CGI input variable.
+
+ * htsearch/Display.cc(display) & (setVariables): Correct any invalid
+ values for matches_per_page attribute to avoid div. by 0 error.
+
+Wed Mar 31 15:19:25 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htfuzzy/Synonym.cc: Fix previous fix of minor memory leak.
+ (db pointer wasn't properly set)
+
+Mon Mar 29 10:31:09 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htsearch/Display.cc(excerpt): Added patch from Gabriele to
+ improve display of excerpts--show top of description always,
+ otherwise try to find the excerpt.
+
+Sun Mar 28 19:45:02 1999 Hans-Peter Nilsson <hp at bitrange.com>
+
+ * htlib/HtWordType.h (HtIsWordChar): Avoid matching 0 when using
+ strchr.
+ (HtIsStrictWordChar): Ditto.
+
+ * htdig/ExternalParser.cc (parse): Before got_href call, set
+ hopcount of URL to that of base plus 1.
+ Add URL to external parser error output.
+
+ * htlib/URL.cc (URL(char *ref, URL &parent) ): Move call to
+ constructURL call inside previous else-clause.
+ (parse): Reset _normal, _signature, _user initially.
+ Commence parsing, even if no "//" is found. Do not set _normal
+ here.
+ (normalizePath): Call removeIndex finally.
+
+ * htcommon/WordRecord.h (WORD_RECORD_COMPRESSED_FORMAT)
+ [!NO_WORD_COUNT]: Change to "cu4".
+
+ * htlib/HtPack.cc (htPack): Correct handling at end of code-string
+ and end of encoding-byte. Add code 'c' for often-1 unsigned ints.
+ (htUnpack): Add handling of code 'c'.
+
+Thu Mar 25 12:18:05 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * installdir/long.html, installdir/short.html: Remove backslashes
+ before quotes in HTML versions of the builtin templates.
+
+ * Makefile.in: Add long.html & short.html to COMMONHTML list, so
+ they get installed in common_dir.
+
+Thu Mar 25 11:56:50 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htsearch/Display.cc(displayMatch), htcommon/defaults.cc,
+ htdoc/attrs.html, htdoc/cf_byname.html, htdoc/cf_byprog.html:
+ Add date_format attribute suggested by Marc Pohl.
+
+Thu Mar 25 09:46:07 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htsearch/Display.cc(displayMatch): Avoid segfault when DocAnchors
+ list has too few entries for current anchor number.
+
+Tue Mar 23 15:08:40 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htsearch/Display.cc(displayMatch): Fix problem when documents
+ did not have descriptions.
+
+Tue Mar 23 14:17:14 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/PDF.cc(parseString): Use minimum_word_length instead of
+ hardcoded constant.
+
+Tue Mar 23 14:02:40 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/HTML.cc: Fix bug where noindex_start was empty, allow case
+ insensitive matching of noindex_start & noindex_end.
+
+ * htdoc/attrs.html, htdoc/cf_byname.html, htdoc/cf_byprog.html:
+ Fix inconsistencies in documentation for noindex_start & noindex_end.
+
+Tue Mar 23 14:01:16 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/HTML.cc: Add check for <a href=...> tag that is missing a
+ closing </a> tag, terminating it at next href.
+
+Tue Mar 23 13:57:35 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/Document.cc: Fix check of Content-type header in readHeader(),
+ correcting bug introduced Jan 10 (for PR#91), and check against
+ allowed external parsers.
+
+Tue Mar 23 13:54:35 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/HTML.cc: More lenient comment parsing, allows extra dashes.
+
+Tue Mar 23 12:22:53 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htlib/Configuration.cc(Add): Fix function to avoid infinite loop
+ on some systems, which don't allow all the letters in isalnum() that
+ isalpha() does, e.g. accented ones.
+
+ * htdig/HTML.cc: Fix three reported bugs about inconsistent
+ handling of space and punctuation in title, href description & head.
+ Now makes distinction between tags that cause word breaks and those
+ that don't, and which of the latter add space.
+
+Tue Mar 23 12:15:48 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/Plaintext.cc(parse): Use minimum_word_length instead of
+ hardcoded constant.
+
+Tue Mar 23 12:11:04 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htmerge/words.cc(mergeWords): Fix to prevent description text
+ words from clobbering anchor number of merged anchor text words.
+
+Tue Mar 23 12:02:00 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htsearch/Display.cc(generateStars): Add in support for use_star_image
+ which was lost when template support was put in way back when.
+
+Tue Mar 23 11:47:52 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * Makefile.in: add missing ';' in for loops, between fi & done
+
+Mon Mar 22 16:06:15 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htdig/HTML.cc: Check for presence of more than one <title> tag.
+
+Mon Mar 22 15:32:15 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * contrib/parse_doc.pl: Fix handling of minimum word length.
+
+Sun Mar 21 15:19:00 1999 Hans-Peter Nilsson <hp at bitrange.com>
+
+ * htlib/HtPack.cc (htPack): New.
+ * htlib/HtPack.h: New.
+ * htsearch/parser.cc (perform_push): Unpack WordRecords using
+ htUnpack.
+ * htsearch/htsearch.h: Add "debug" declaration.
+ * htmerge/words.cc (mergeWords): Pack WordRecords using htPack.
+ * htlib/Makefile.in (OBJS): Add HtPack.o
+ * htcommon/WordRecord.h: Add WORD_RECORD_COMPRESSED_FORMAT
+
+ * htdig/HTML.cc (parse): Keep contents in String variable
+ textified_contents while using its "char *".
+
+ * htsearch/Display.cc (excerpt): Similar for head_string.
+
+Thu Mar 18 20:01:24 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * installdir/long.html, installdir/short.html: Write out HTML
+ versions of the builtin templates.
+
+ * installdir/htdig.conf: Add commented-out template_map and
+ template_name attributes to use the on-disk versions.
+
+Tue Mar 16 03:06:06 1999 Hans-Peter Nilsson <hp at bitrange.com>
+
+ * htcommon/DocumentDB.cc (Delete): Fix bad parameter to Get: use
+ key, not DocID.
+
+Tue Mar 16 01:50:16 1999 Hans-Peter Nilsson <hp at bitrange.com>
+
+ * htlib/HtWordType.h (class HtWordType): New.
+ * htlib/HtWordType.cc: New.
+ * htlib/Makefile.in (OBJS): Add HtWordType.o
+
+ * htdoc/attrs.html: Document attribute extra_word_characters.
+ * htdoc/cf_byprog.html: Ditto.
+ * htdoc/cf_byname.html: Ditto.
+
+ * htcommon/defaults.cc (defaults): Add extra_word_characters.
+
+ * htsearch/htsearch.h: Lose spurious extern declaration of unused
+ variable valid_punctuation.
+ * htsearch/htsearch.cc (main): Call HtWordType::Initialize.
+ (setupWords): Use HtIsWordChar, HtIsStrictWordChar and
+ HtStripPunctuation. Do not read valid_punctuation.
+
+ * htsearch/Display.cc (excerpt): Use HtIsStrictWordChar.
+
+ * htlib/StringMatch.cc (FindFirstWord): Ditto.
+ (CompareWord): Ditto.
+
+ * htdig/htdig.cc (main): Call HtWordType::Initialize.
+
+ * htdig/Retriever.h (class Retriever): Lose member
+ valid_punctuation.
+ * htdig/Retriever.cc (Retriever): Lose its initialization.
+
+ * htdig/Postscript.h (class Postscript): Lose member
+ valid_punctuation.
+ * htdig/Postscript.cc (Postscript): Lose its initialization.
+ (flush_word): Use HtStripPunctuation.
+ (parse_string): Use HtIsWordChar,
+ HtIsStrictWordChar and HtStripPunctuation.
+
+ * htdig/Parsable.h (class Parsable): Lose member
+ valid_punctuation.
+ * htdig/Parsable.cc (Parsable): Lose its initialization.
+
+ * htcommon/WordList.cc (valid_word): Use HtIsStrictWordChar.
+ (BadWordFile): Use HtStripPunctuation. Do not read
+ valid_punctuation.
+
+ * htcommon/DocumentRef.cc (AddDescription): Use HtIsWordChar,
+ HtIsStrictWordChar and HtStripPunctuation. Do not read
+ valid_punctuation.
+
+ * htdig/PDF.cc (parseString): Similar.
+
+ * htdig/HTML.cc (parse): Similar.
+
+ * htdig/Plaintext.cc (parse): Similar.
+
+Sun Mar 14 14:04:31 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htlib/Makefile.in: Add HtSGMLEntites.o to OBJS.
+
+Sat Mar 13 21:29:38 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htcommon/DocumentDB.cc(Open, Read): Switch to DB_HASH for faster
+ access. Most important for very quick URL lookups!
+
+ * htcommon/DocumentRef.cc(AddDescription): Check to see that
+ description isn't a null string or contains only whitespace before
+ doing anything.
+
+ * htlib/HtSGMLCodec.h, htlib/HtSGMLCodec.cc: Add new class to
+ convert between SGML entities and high-bit characters.
+
+ * htdig/HTML.cc(parse): Use it instead of SGMLEntities.
+
+ * htsearch/Display.cc(excerpt): Use HtSGMLCodec to convert *back*
+ to SGML entities before displaying.
+
+ * htlib/HtHeap.cc: Cleaned up comments, use more efficient
+ procedure to build from a vector.
+
+ * htlib/HtWordCodec.cc(HtWordCodec): Fix bug with constructing from
+ uninitialized variables!
+
+ * htlib/URL.h, htlib/URL.cc: Initial support for multiple schemes and
+ user@host URLs.
+
+ * htlib/List.cc(Nth): Check for out-of-bounds requests before
+ doing anything.
+
+Fri Mar 12 00:31:03 1999 Hans-Peter Nilsson <hp at bitrange.com>
+
+ * htlib/mktime.c (__mon_yday): Correct size to number of
+ initializers (2).
+
+ * htsearch/htsearch.cc (main): Remove doc_index handling.
+
+ * htsearch/ResultMatch.h (setURL): Change to setID, use int.
+ All callers changed.
+ (getURL): Change to getID.
+ All callers changed.
+ (String url): Change to "int id".
+
+ * htsearch/Display.h: (Display): Second parameter removed.
+ (docIndex) removed.
+
+ * htsearch/Display.cc (Display, ~Display): Do not handle
+ docIndex.
+ (display): Use DocumentDB::operator [](int), not
+ DocumentDB::operator [] (char *).
+ (buildMatchList): Changed to handle ResultMatch as DocID int,
+ instead of URL string: use DocumentDB::operator [](int), not
+ DocumentDB::operator [] (char *). Get DocumentRef directly, then
+ filter the URL by includeURL().
+
+ * htnotify/htnotify.cc (main): Use DocIDs(), not DocURLs().
+ Handle the change from String * to IntObject *.
+
+ * htmerge/htmerge.cc (main): Do not delete doc_index.
+
+ * htmerge/docs.cc (convertDocs): Test doc_index access as
+ read-only. Pass as parameter for docdb, do not handle separately.
+
+ * htmerge/docs.cc (convertDocs): Add debug messages about cause
+ when deleting documents. If verbose > 1, write id/URL for every URL.
+
+ * htmerge/db.cc (mergeDB): Handle doc_index, test accessibility.
+
+ * htlib/IntObject.h (class IntObject): Add int-constructor.
+
+ * htdoc/attrs.html (doc_index): Say that mapping is from document
+ URLs to numbers.
+ (doc_db): Say that indexing is on document number.
+
+ * htdoc/cf_byprog.html (doc_index): Move from htsearch to htdig
+ entry.
+
+ * htdig/htdig.cc (main): Add .work suffix to doc_index too.
+ Unlink doc_index if initial.
+
+ * htcommon/DocumentDB.h (Open): New second argument.
+ (Read): New second argument, default to 0.
+ (operator [](int)): New.
+ (Exists(char *), Delete(char *)): Change to int parameter.
+ (DocIDs, i_dbf): New.
+
+ * htcommon/DocumentDB.cc (operator [](int)): New.
+ (Exists(char *), Delete(char *)): Changed to DocID int parameter.
+ All callers changed.
+ (URLs): Assume keys are ok without probing for documents
+ with each key.
+ (DocIDs): New.
+ (Open): Take an index database file name as second argument.
+ All callers changed.
+ (Read): Similar, accept 0.
+ (all): Change to index on DocID.
+
+Wed Mar 10 02:25:24 1999 Hans-Peter Nilsson <hp at bitrange.com>
+
+ * htdoc/attrs.html (template_name): Typo; used by htsearch, not
+ htdig.
+
+Mon Mar 8 13:30:44 1999 Hans-Peter Nilsson <hp at bitrange.com>
+
+ * htdig/Retriever.cc (got_href): Check if the ref is for the
+ current document before adding it to the db.
+
+Mon Mar 8 01:36:38 1999 Hans-Peter Nilsson <hp at bitrange.com>
+
+ * htlib/DB2_db.cc: Remove errno.
+ * htlib/DB2_hash.cc: Ditto.
+
+Sun Mar 7 20:50:37 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htfuzzy/EndingsDB.cc(createDB): Use link and unlink to move,
+ rather than a non-portable system call.
+
+ * htcommon/DocumentRef.h, htcommon/DocumentRef.cc: Fix #ifdef
+ problems with zlib.
+
+Sun Mar 7 09:39:37 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htlib/timegm.c: Fix problems compiling on libc5 systems noted by
+ Hans-Peter.
+
+ * htlib/Makefile.in, Makefile.in, Makefile.config.in: Use regex.c
+ instead of rx.
+
+ * htfuzzy/EndingsDB.cc: Ditto.
+
+ * configure.in, configure: Don't bother to config rx directory.
+
+Fri Mar 5 08:09:20 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * contrib/parse_doc.pl: uses pdftotext to handle PDF files,
+ generates a head record with punctuation intact, extra checks
+ for file "wrappers" & check for MS Word signature (no longer
+ defaults to catdoc), strip extra punct. from start & end of words,
+ rehyphenate text from PDFs.
+
+Tue Mar 2 23:18:20 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htdig/htdig.cc: Renamed main.cc for consistency with other programs.
+
+ * htlib/DB2_hash.h, htlib/DB2_hash.cc: Added interface to Berkeley
+ hash database format.
+
+ * htlib/Makefile.in: Use them!
+
+ * htlib/Database.h: Define database types, allowing a choice
+ between different formats.
+
+ * htlib/Database.cc(getDatabaseInstance): Use passed type to pick
+ between subclasses. Currently only uses Hash and B-Tree formats of
+ Berkeley DB.
+
+ * htcommon/DocumentDB.cc, htfuzzy/Endings.cc,
+ htfuzzy/EndingsDB.cc, htfuzzy/Fuzzy.cc, htfuzzy/Prefix.cc,
+ htfuzzy/Substring.cc, htfuzzy/Synonym.cc, htfuzzy/htfuzzy.cc,
+ htmerge/docs.cc, htmerge/words.cc, htsearch/Display.cc,
+ htsearch/htsearch.cc: Use new form of getDatabaseInstance(),
+ currently with DB_BTREE option (for compatibility).
+
+Mon Mar 1 22:53:37 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htlib/regex.c, htlib/strptime.c: Import new versions from
+ glibc.
+
+ * htlib/Makefile.in, htlib/mktime.c, htlib/timegm.c, htlib/lib.h:
+ Changes to use glibc timegm() function instead of buggy mytimegm().
+
+ * htdig/Document.cc(getdate): Use it.
+
+Tue Mar 2 02:35:50 1999 Hans-Peter Nilsson <hp at bitrange.com>
+
+ * attrs.html: Rephrase and clarify entry for url_part_aliases.
+
+Sun Feb 28 23:25:40 1999 Hans-Peter Nilsson <hp at bitrange.com>
+
+ * htlib/HtURLCodec.cc (~HtURLCodec): Add missing deletion of
+ myWordCodec.
+
+Fri Feb 26 19:03:58 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * configure, configure.in: Fix typo on timegm test.
+
+ * htlib/mytimegm.cc: Fix Y2K problems.
+
+Wed Feb 24 21:09:19 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htsearch/htsearch.cc(main): Remember to delete the parser!
+
+ * htlib/String.cc(String(char *s, int len)): Remove redundant copy.
+
+ * htsearch/Display.cc(display): Free DocumentRef memory after
+ displaying them.
+ (displayMatch): Fix memory leak when documents did not have anchors.
+
+Wed Feb 24 15:18:26 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htlib/Configuration.cc(Add): Fix small leak in locale code.
+
+ * htlib/String.cc: Fix up code to be cleaner with memory
+ allocation, inline next_power_of_2.
+
+Mon Feb 22 22:13:49 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htlib/String.cc, htlib/htString.h: Fix some memory leaks.
+
+Mon Feb 22 08:52:19 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htlib/Dictionary.h, htlib/Dictionary.cc(hashCode): Check if key
+ can be converted to an integer using strtol. If so, use the
+ integer as the hash code.
+
+ * htlib/HtVector.h, htlib/HtVector.cc: Implement Release() method
+ and make sure delete calls are done properly.
+
+ * htsearch/ResultList.h, htsearch/ResultList.cc(elements): Use HtVector
+ instead of List.
+
+ * htsearch/parser.cc: Ditto.
+
+Sun Feb 21 16:13:59 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htlib/HtHeap.h, htlib/HtHeap.cc: Add new class.
+
+ * htlib/Makefile.in: Compile it.
+
+ * htlib/HtVector.h, htlib/HtVector.cc: Add Assign() to assign to
+ elements of vectors.
+
+Sun Feb 21 14:45:26 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htsearch/htsearch.cc: Add patch from Jerome Alet <alet at unice.fr>
+ to allow '.' in config field but NOT './' for security reasons.
+
+ * htdig/HTML.cc: Add patch from Gabriele to ensure META
+ descriptions are parsed, even if 'description' is added to the
+ keyword list.
+
+Sun Feb 21 14:43:44 1999 Gilles Detillieux <grdetil at scrc.umanitoba.ca>
+
+ * htsearch/parser.h, htsearch/parser.cc: Clean up patch made for
+ error messages, made on Feb 16.
+
+Thu Feb 18 20:19:30 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * htlib/HtVector.h, htlib/HtVector.cc: Added new Vector class.
+
+ * htlib/Makefile.in: Compile it.
+
+ * htlib/strptime.c: Add new version from glibc-2.1, replacing
+ strptime.cc.
+
+ * htdig/Document.cc: Use it.
+
+ * htlib/regex.h, htlib/regex.c: Add new files from glibc-2.1.
+
+ * htlib/mktime.c: Update from glibc-2.1.
+
+Wed Feb 17 23:44:59 1999 Geoff Hutchison <ghutchis at wso.williams.edu>
+
+ * configure.in, configure, aclocal.m4: Add autoconf macro to
+ detect syntax of makefile includes.
+
+ * Makefile.in, Makefile.config.in, */Makefile.in: Change include
+ syntax to use it.
+
+Wed Feb 17 12:36:42 1999 Hans-Peter Nilsson <hp at bitrange.com>
+
+ * htcommon/defaults.cc (defaults): locale: change to "C".
+
+Local Variables:
+ add-log-time-format: current-time-string
+End: