2007-12-09

Another Site Rewrite

cheesy under construction barricade - so 90s!

Over the years, the website has grown to the point where it was getting very difficult to find things. The index page was a long list of articles in no particular order, and I found it increasingly frustrating to maintain. I can only imagine it was also a challenge to read!

The Changes

So, a bit of a rewrite was started... At first it was just some cosmetic changes and a many-to-many category system. This grew until almost all the code in the site was touched in some way. Making the code PHP5-clean took some time; it is something I've wanted to do for a long while. This site started with PHP3 (and flat HTML before that) and still had deprecated bits and pieces hanging around.

The "wikitext" rendering engine is probably the least changed, but it terrifies me how nasty it remains. This is partly because it is written for performance and the input language it supports is quite non-trivial. I will probably rewrite it: the markup produced could be better, and the code is a single 332-line monster class that was hurriedly hacked into PHP5 compliance.

The pyrotechnic entities (compositions and devices) have been converted to use the wikitext renderer. They used their own earlier markup language, which was actually the prototype for the wikitext but sufficiently different that the automatic conversion utility I wrote was quite non-trivial. A bunch of spelling errors were also corrected in the pyro entities. This needs to happen for the rest of the site too; the spelling is pretty abysmal in places.

The schema changes to support categorisation were quite straightforward. The complexity of the article entity has been reduced significantly. The parent-child relation and grouping construct has been maintained to support sub-articles. This is sometimes useful, and a few largish projects documented here use it, making it difficult to remove without restructuring them.
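As a rough illustration of what a many-to-many category system alongside a parent-child article relation can look like, here is a minimal sketch using Python's sqlite3 module. The table names, column names and sample rows are all hypothetical; this is not the site's actual schema, just one common way of modelling the idea with a junction table.

```python
import sqlite3

def build_demo_db():
    """Create an in-memory database with a hypothetical many-to-many schema."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE article (
            id        INTEGER PRIMARY KEY,
            title     TEXT NOT NULL,
            parent_id INTEGER REFERENCES article(id)  -- NULL for top-level articles
        );
        CREATE TABLE category (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL UNIQUE
        );
        -- Junction table: one row per (article, category) pairing.
        CREATE TABLE article_category (
            article_id  INTEGER NOT NULL REFERENCES article(id),
            category_id INTEGER NOT NULL REFERENCES category(id),
            PRIMARY KEY (article_id, category_id)
        );
    """)
    # Made-up sample data, purely for illustration.
    db.executemany("INSERT INTO article VALUES (?, ?, ?)", [
        (1, "Big project", None),
        (2, "Big project: part two", 1),
        (3, "Small note", None),
    ])
    db.executemany("INSERT INTO category VALUES (?, ?)",
                   [(1, "electronics"), (2, "software")])
    db.executemany("INSERT INTO article_category VALUES (?, ?)",
                   [(1, 1), (1, 2), (2, 1), (3, 2)])
    return db

def titles_in_category(db, name):
    """Titles of all articles filed under the named category."""
    rows = db.execute("""
        SELECT a.title FROM article a
        JOIN article_category ac ON ac.article_id = a.id
        JOIN category c ON c.id = ac.category_id
        WHERE c.name = ?
        ORDER BY a.id
    """, (name,))
    return [row[0] for row in rows]

def sub_articles(db, parent_id):
    """Titles of the direct children of an article."""
    rows = db.execute(
        "SELECT title FROM article WHERE parent_id = ? ORDER BY id", (parent_id,))
    return [row[0] for row in rows]
```

An article can sit in several categories and a category can hold several articles, while the parent_id column keeps the sub-article grouping independent of categorisation.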

The cosmetic changes were the result of playing with GIMP one afternoon. I am no artist, but it looks OK to me, and better than the pretty vanilla 1990s style it had before. I tried to use a bit of colour theory in picking orthogonal colours. I'm not completely convinced about the brown (the colour of the back side of an unetched PCB). I may experiment further with this and the banner, but that is quite easy as it is all CSS based. The CSS is pretty clean: everything is based on one text size, so it scales beautifully. There is some scripting to disable the CSS (in the footer) if you find your eyes are bleeding; it falls back to plain HTML nicely. The markup is all quite semantic, but the original site wasn't too bad in this respect either.

According to the W3C, "Cool URIs don't change". I agree and tried quite hard to maintain backwards compatibility for bookmarked and search engine indexed URLs. The articles still have the same URL form, but the pyro entities changed to the somewhat frustrating foo.php/123 form that I've used for some time now. The foo.php?id=123 form is supported by a redirect. The old 404 handler to support the ancient flat-HTML version of the site is also still around; logs suggest it sees a few hits a month.
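The mapping from the old query-string form to the path form can be sketched as a small function. This is an illustrative Python version only (the site itself is PHP), and the function name and the decision to accept only a single numeric id are assumptions, not a description of the real redirect code.

```python
import re
from urllib.parse import urlsplit, parse_qs

def redirect_target(url):
    """Return the foo.php/123 form for a foo.php?id=123 URL, or None if no redirect applies."""
    parts = urlsplit(url)
    ids = parse_qs(parts.query).get("id", [])
    # Only rewrite script URLs carrying exactly one id parameter.
    if not parts.path.endswith(".php") or len(ids) != 1:
        return None
    # Assumption: ids are plain decimal numbers; anything else is left alone.
    if not re.fullmatch(r"[0-9]+", ids[0]):
        return None
    return f"{parts.path}/{ids[0]}"
```

A server would typically send the returned path in a Location header with a permanent (301) status, so that bookmarks and search engines pick up the new form. Whether the site uses a permanent or temporary redirect is not stated here.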

Yet To Come

Still to come is a search engine. I actually wanted to wait until this was complete and do one big bang, but the query compiler is taking a little longer than I thought. The basic word-matching engine could go up now, but the phrase matching and query optimising compiler is a thing of beauty!
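For readers unfamiliar with the distinction between word matching and phrase matching, here is a toy sketch built on a positional inverted index. It is a generic textbook technique written in Python, not the engine described above: there is no query compiler or optimiser, and every name in it is hypothetical.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lower-case the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(docs):
    """Map each word to {doc_id: [positions at which the word occurs]}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, word in enumerate(tokenize(text)):
            index[word].setdefault(doc_id, []).append(pos)
    return index

def match_words(index, query):
    """Ids of documents containing every query word, in any order."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], {}))
    for word in words[1:]:
        result &= set(index.get(word, {}))
    return result

def match_phrase(index, phrase):
    """Ids of documents containing the words consecutively, in order."""
    words = tokenize(phrase)
    hits = set()
    # Only documents containing all the words can contain the phrase.
    for doc_id in match_words(index, phrase):
        for start in index[words[0]][doc_id]:
            # Each following word must appear at the next position along.
            if all(start + i in index[words[i]][doc_id]
                   for i in range(1, len(words))):
                hits.add(doc_id)
                break
    return hits
```

Word matching only needs to intersect the document sets for each word, whereas phrase matching also has to check the stored positions, which is where most of the extra work comes from.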

Knuth would probably tell me my implementation sucks, and my mates keep telling me to use ht://dig or Google... But this search engine project is something I've been tinkering with since the earliest experimental version I wrote for Fuji Xerox Australia's corporate website (looks like they still use it too). Getting it out the door will be a cathartic exercise for me. It is quite sophisticated and will probably be useful elsewhere as it is pretty modular and could in theory index anything.
