HTMLArchive

In short, HTMLArchive seeks to give webmasters a simple means of providing archive-oriented web content to users.

You'll notice that in both cases, lists of URLs are managed on the page:

In the change log, the links appear as a list, with text from the top-most link included in the page.

If you go to my home page, you'll see the latest text from the list kept in my archives. That URL list is kept as a table, rather than a list. Thus, the power of HTMLArchive.

I'm too lazy to go through the effort of grabbing the text from the latest page and including it into some other page. I am also too lazy to link the page into my indexing page. This little fact kept me from creating my page for literally years.

But I'm not too lazy to write a program that will cause something else to do all that for me. Ergo: HTMLArchive.

I don't know Perl. I've tried to learn it, but because I have a background in C++, every time I try to work with it, I find myself trying to figure out where the pointers are, and what the damned thing must be thinking at every point along the way. In the end, I've given up; I'll stick to C++.

Consequently, HTMLArchive has been written in C++.

Also, because I'm somewhat obsessive about performance, HTMLArchive works quickly. It caches information to a file in the same directory as your template file, and reads that information back whenever possible. If you update a web page, no worries; HTMLArchive will notice the difference between what's in the cache and your file by checking the modification date, which is one of the items cached. This caching file shares the same name as your template file, but starts with a '.' (making it invisible on most Unix machines). Currently, HTMLArchive describes this as a 'preference' file, but it's actually a caching file (I need to change the description soon).
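The caching scheme described above can be sketched in a few lines of C++17. This is only an illustration, not HTMLArchive's actual source: the helper names cache_path_for and is_stale are made up for this example, and the use of std::filesystem is an assumption about how one might write it today.

```cpp
#include <filesystem>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper (not from HTMLArchive): builds the cache file path for
// a template, i.e. the same directory and the same name with a leading '.'.
fs::path cache_path_for(const fs::path& template_file) {
    return template_file.parent_path() /
           ("." + template_file.filename().string());
}

// Hypothetical helper (not from HTMLArchive): a cached entry is treated as
// stale when the page's current modification time differs from the one
// recorded in the cache, or when the page can no longer be inspected.
bool is_stale(const fs::path& page, fs::file_time_type cached_mtime) {
    std::error_code ec;
    const auto current = fs::last_write_time(page, ec);
    return ec || current != cached_mtime;
}
```

With these helpers, a template at site/index.tmpl would map to a cache file at site/.index.tmpl, and each page would be re-read only when is_stale reports a changed modification date.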

Currently, HTMLArchive does not work with Unicode web pages; I work for a company supporting the Microsoft operating system, and that OS has some interesting differences that may or may not apply here, so I want to check first. Namely, I want to know whether Unicode text files on Unix must start with a byte order mark (FFFE or FEFF, depending on endianness), or if that's not needed.
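For illustration, here is a minimal sketch of how a program could check the first bytes of a file for such a mark. It is not part of HTMLArchive, and it does not settle the question above: it simply assumes the mark may or may not be present and reports what it finds. The names Bom and detect_bom are hypothetical.

```cpp
#include <cstddef>
#include <string>

// Hypothetical type (not from HTMLArchive): which mark, if any, was found.
enum class Bom { None, Utf8, Utf16BigEndian, Utf16LittleEndian };

// Inspects the leading bytes of a buffer for a byte order mark.
// FE FF marks big-endian UTF-16, FF FE marks little-endian UTF-16, and
// EF BB BF is the same mark encoded as UTF-8.
// Limitation: FF FE 00 00 (UTF-32 little-endian) is reported as UTF-16 here.
Bom detect_bom(const std::string& bytes) {
    auto b = [&bytes](std::size_t i) {
        return static_cast<unsigned char>(bytes[i]);
    };
    if (bytes.size() >= 3 && b(0) == 0xEF && b(1) == 0xBB && b(2) == 0xBF) {
        return Bom::Utf8;
    }
    if (bytes.size() >= 2 && b(0) == 0xFE && b(1) == 0xFF) {
        return Bom::Utf16BigEndian;
    }
    if (bytes.size() >= 2 && b(0) == 0xFF && b(1) == 0xFE) {
        return Bom::Utf16LittleEndian;
    }
    return Bom::None;
}
```

A file with no mark at all comes back as Bom::None, so a caller would still need some other rule (or a default) to decide how to read it.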

HTMLArchive also doesn't write its status messages in any language other than English. This isn't because I'm one of those ugly Americans who feel that English needs to rule the world; it's more a matter of wanting to get this working first before trying to provide i18n support through gettext. I also cannot adequately speak any language besides English (although I've heard my French can be amusing, as well as my ASL [but American Sign Language doesn't exactly translate well into .po files]).

I have some horrible documentation for it. I hope you can understand how it works. I'll be happy to help as well as I can.

For more information about HTMLArchive (or, perhaps, clarification on how it works), please contact Trey Van Riper.

My Freshmeat entry may also help you keep up to date.

Fleeb.Com