
Programming info for Bitstretcher          Tony Haines
17th April 1999


Initial Picture
The picture is generated using a one-dimensional cellular
automaton for each colour bit, with some noise to make things
interesting. White is replaced with random colours.
About 2/3rds of the way down the algorithm changes to make
it look prettier. I chose a 'random' seed that made a nice picture.

(For each row, the colour of each pixel depends on the colours of
the three pixels above and adjacent to it, plus a small chance of
a random bit being flipped. If the colour comes out as white it is
randomised.)
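
A minimal sketch of this kind of generator, in Python rather than
the original code. The rule number, noise level, 3-bit colour
depth, wrap-around edges and all names here are assumptions made
for illustration; the text above only says that each colour bit
runs its own automaton on the three pixels above, with occasional
bit flips and white replaced by a random colour. The change of
algorithm two-thirds of the way down is not modelled.

```python
import random

WIDTH = 16                # assumed row width in pixels
BITS = 3                  # assumed colour depth; 7 is then "white"
WHITE = (1 << BITS) - 1
NOISE = 0.02              # assumed chance of flipping one bit
RULE = 30                 # assumed automaton rule (Wolfram numbering)


def next_row(row, rng):
    """Build the next row from the three pixels above each position."""
    n = len(row)
    new = []
    for x in range(n):
        left, mid, right = row[(x - 1) % n], row[x], row[(x + 1) % n]
        colour = 0
        # Each colour bit runs its own one-dimensional automaton.
        for b in range(BITS):
            idx = (((left >> b) & 1) << 2
                   | ((mid >> b) & 1) << 1
                   | ((right >> b) & 1))
            colour |= ((RULE >> idx) & 1) << b
        # Noise: a small chance of flipping one random bit.
        if rng.random() < NOISE:
            colour ^= 1 << rng.randrange(BITS)
        # White is replaced with a random (here: non-white) colour.
        if colour == WHITE:
            colour = rng.randrange(WHITE)
        new.append(colour)
    return new


def make_picture(height, seed):
    """Return `height` rows of colour values for a given seed."""
    rng = random.Random(seed)
    row = [rng.randrange(WHITE) for _ in range(WIDTH)]
    rows = [row]
    for _ in range(height - 1):
        row = next_row(row, rng)
        rows.append(row)
    return rows
```

Because the seed fixes the whole picture, trying a few seeds and
keeping the nicest one costs nothing in program size.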


Plotting
The basic effect is to 'stretch' a byte of data and plot it to
more than one pixel. This is done by masking out the unwanted
bits from each drawn byte. A more efficient method is to
simply AND the data byte with the bit(s) you want in each pixel;
however, that method will not allow a smooth zoom.
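
One possible reading of the stretch, sketched in Python. This is
not the plotting routine itself: the function name, the way bits
are divided between pixels and the use of the masked value as the
pixel colour are all assumptions. It only illustrates spreading
one byte over a variable number of pixels, with each pixel keeping
its own share of the bits.

```python
def stretch_byte(value, width):
    """Spread one 8-bit value across `width` pixels (hypothetical helper).

    Each pixel keeps only the bits assigned to it. With width < 8
    a pixel holds more than one bit; with width > 8 a bit is
    repeated over several pixels (less than 1 bit per pixel).
    """
    pixels = []
    for x in range(width):
        lo = x * 8 // width            # first bit shown in this pixel
        hi = (x + 1) * 8 // width      # one past the last bit shown
        if hi == lo:                   # below 1 bit per pixel: reuse a bit
            hi = lo + 1
        mask = ((1 << (hi - lo)) - 1) << lo
        pixels.append(value & mask)    # mask out the unwanted bits
    return pixels
```

For example, stretch_byte(0xFF, 4) gives [0x03, 0x0C, 0x30, 0xC0],
and stretch_byte(0xFF, 8) gives one bit per pixel. Changing width
one pixel at a time moves the mask boundaries gradually.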

There are two core plotting routines, one for less than 1 bit per
pixel and one for more. Both are unrolled to draw a word at a time,
which makes plotting a little faster.


Game code
The game code is fairly standard. Collision detection is from the
screen pixels, and was a right pain to get working properly.

It is possible to change the width of the game world displayed.
The StrongARM can easily cope with full screen, but with the
cache off it crawled a bit, so I narrowed it down for ARM2
users. This has the advantage of making the maze more
mysterious.


Compression
To make this fit in 1k I had to write my own compression routine.
This makes a list of unique program words and stores the offsets
and lengths needed to reconstruct the program as pairs of bytes.
This has the advantages that it compresses the unrolled code well,
and also the decompression routine is fairly small.
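
A minimal sketch of a scheme along these lines, in Python. It is
not the Shrinker source: the function names, the first-appearance
ordering of the word list and the greedy run matching are
assumptions, and it runs forwards rather than backwards. It does
show why the decompressor can be tiny and why unrolled code packs
well: a repeated run of words costs a single (offset, length) pair.

```python
def compress(words):
    """Return (table, pairs) for a list of 32-bit program words.

    table: unique words in order of first appearance (max 256).
    pairs: (offset, length) runs into table; each value fits a byte.
    """
    table = []
    index = {}
    for w in words:
        if w not in index:
            index[w] = len(table)
            table.append(w)
    if len(table) > 256:
        raise ValueError("more than 256 different program words")

    pairs = []
    i = 0
    while i < len(words):
        start = index[words[i]]
        length = 1
        # Extend the run while the program keeps following the table.
        while (i + length < len(words)
               and length < 255
               and start + length < len(table)
               and table[start + length] == words[i + length]):
            length += 1
        pairs.append((start, length))
        i += length
    return table, pairs


def decompress(table, pairs):
    """Rebuild the program by copying each run out of the table."""
    out = []
    for offset, length in pairs:
        out.extend(table[offset:offset + length])
    return out
```

For the word sequence [1, 2, 3, 1, 2, 3, 4, 1, 2, 3] this gives a
table of four words and three pairs: (0, 3), (0, 4), (0, 3).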

Feel free to use the compression program, but bear in mind the
following caveats:
The compressing program is fairly horrible code - I've tinkered
with it a lot. It runs through the code backwards because this gives
better compression - the first iteration of the loops in my program
is generally subtly different from the others.
It won't work without modification on other programs. You *will*
need to set the branch offset in the assembled code to where you
want execution to start. I put some data at the start of my program
(to avoid a long ADR or two). If you do the same you can jump it
'for free'.
The compression structure I use won't work on programs with more
than 256 different program words used. This will obviously not be a
problem in 1k programs.
I removed the XOS_SynchroniseCodeAreas SWI, as this turned out to
be unnecessary (thanks Baah). However, if your code is very small
or repetitive, it might be a problem if the writing position is at
the end of the cache line loaded with the first data word. In that
case either put it back in, move the created program start position
further away, or make your program bigger. :-)

Optimisation technique
In spite of the compression I've had to do some quite nasty
optimisations. Of these, my favourite is the 'tower building'
routine - which
relies on data from the maze drawing routine to set up loop
variables.


Compilation
To compile the code run 'source'. A file 'code' will be made in
the currently selected directory. (I usually set mine to ram:$)
This will run, but will be too large. Run 'Shrinker' to produce a
version that fits in 1k.


This program and all sources are freeware.
