				VGA Hardware Tricks
				Part 3 of 6

Introduction:

Welcome to VGA Hardware Tricks, a six-part series written by
Trixter/Hornet.  In this series, I'll be exploring ways you can push VGA
harder to achieve new effects.  The emphasis of this series is twofold:

	- The techniques discussed will work on any *standard* VGA card.
	(No SVGA or VESA video cards are necessary, but these techniques
	will work on those cards as well.)

	- The techniques discussed require very little calculation, so
	they will work on slower computers.  (Some techniques, however,
	require a lot of CPU *attention*, which means that while the
	effects are happening, they can't be disturbed by other
	calculations, etc.  Good Assembler programmers might be able to
	get around this, however.)

This series is for intermediate to advanced coders, so there are a
couple of prerequisites you should meet:  Example code will be given in
assembler and Pascal, so familiarity with those languages will be
helpful when looking at the example code; also, a familiarity with Mode
X (unchained VGA) is required, as procedures like changing video
resolutions will be discussed.

This series covers six topics:

	- Crossfading 16-color pictures		(published in DemoNews 106)
	- Crossfading 256-color pictures	(published in DemoNews 107)
	- More than 256 colors: 12-bit color	(this article)
	- More than 256 colors: 18-bit color
	- Copper effects in text mode
	- Displaying graphics in text mode

Description:

Displaying more than 256 colors on a standard VGA card isn't technically
possible, and yet it has been achieved in such productions as Xography's
Party 93 report and Orange's X14 demo.  But how?  The answer, oddly
enough, lies in a human weakness:  the limits of our own eyes and brain.

Our eyes and optic nerves are not as fast as a computer monitor.  While
this fact is obvious, think of the implications:  Images happen in our
real world much faster than our eyes can process the information.  If
this is so, why don't we see black areas all the time where our eyes are
failing to see things?  The answer is a phenomenon called "persistence
of vision."

Persistence of vision is our brain's ability to "fill in" missing
information that the eyes aren't providing.  Without it, we'd see black
areas all the time.  :-)  In attempting to display more than 256 colors
at a time, we're going to take advantage of persistence of vision to
fool ourselves into seeing more colors than are actually displayed.

Overview:

This week, we'll discuss "12-bit" color.  (That term is slightly
misleading:  12 bits would give 4096 colors, but the number we'll
actually be displaying is 3840.  Falling 256 colors short isn't much of
a difference, though, so we'll be calling it 12-bit color anyway.)  This
mode is achieved by rapidly alternating every pixel in a picture between
two different colors.  If the colors alternate quickly enough,
persistence of vision "blends" the two into a single color.
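
To make the "blending" idea concrete, here is a minimal sketch in C.  It
assumes the simplest possible model, namely that two colors shown on
alternate frames are perceived as roughly their average; real perception
(and monitor gamma) is messier than that.  The names DacColor and
perceived_blend are made up for this illustration.

```c
#include <stdint.h>

/* One VGA DAC color; each component is 6 bits wide (0..63). */
typedef struct { uint8_t r, g, b; } DacColor;

/* Assumed model: two colors alternating every frame are seen as
   (roughly) the average of the two. */
DacColor perceived_blend(DacColor a, DacColor b)
{
    DacColor out;
    out.r = (uint8_t)((a.r + b.r) / 2);
    out.g = (uint8_t)((a.g + b.g) / 2);
    out.b = (uint8_t)((a.b + b.b) / 2);
    return out;
}
```

Under this model, alternating a pure red/blue color with a pure green
color yields a mix of all three components at about half brightness.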

If you stopped reading right now to try this, you'd quickly find that
only Pentiums are fast enough to change every single pixel in a picture
once per frame.  Fortunately there's an easier way to do this, and it
works on slower 386s as well:

	- Switch into Mode X (we need two video pages)
	- Set up the palette in a special way
	- Draw two slightly different versions of the same picture onto
	  two different video pages
	- Quickly flip between both video pages

Yaka / Xography first used this particular technique in Xography's The
Party '93 report.  Later, he released an explanation of how he achieved
the mode.  Excerpts of his excellent documentation in FAKEMODE.ZIP
illustrate what's involved in making it work:

(Documentation excerpt begins------------------------------)

FakeMode is achieved by combination of several means:

- Use Y-mode (320x400 at 256 colors and 2 pages)
- Flip between the 2 pages at every vertical retrace
- Select the palette colors wisely
- Set pixel data in a special way.

*** 2.1 Y-Mode

Y-Mode (similar to X-mode) is a video mode for register-compatible VGA
cards that pushes the resolution up to 320x400, still with 256 colors and
2 pages! The disadvantage compared to standard mode 13h (320x200, 256col,
1 page) is that memory access is not so easy anymore (the pixels are
split up in bitplanes).  Here's the code I use to set up Y-Mode for
FakeMode (in TASM 3.1) [1]:

********************************************
  _F_initgraph PROC
    push di                     ;//save DI because of BC (I call from BC)
    mov ax,0f00h                ;//Get old videomode...
    int 10h
    mov oldvideomode,al         ;//...and save it (define oldvideomode!)

    mov ax,0013h                ;//initialize normal Mode 13h
    int 10h

    mov dx,3ceh                 ;//select Graphics Controller...
    mov al,5                    ;//...Graphics Mode Register
    out dx,al
    inc dx
    in al,dx
    and al,11101111b            ;//switch off ODD/EVEN mode
    out dx,al
    dec dx

    mov al,6                    ;//...Miscellaneous Register
    out dx,al
    inc dx
    in al,dx
    and al,11111101b            ;//switch off ODD/EVEN mode here, too
    out dx,al

    mov dx,3c4h                 ;//select Sequencer Controller...
    mov al,4                    ;//...Memory Mode Register
    out dx,al
    inc dx
    in al,dx
    and al,11110111b            ;//switch off Chain-4 (planar addressing)
    or al,4                     ;//switch off ODD/EVEN mode here, too
    out dx,al

    mov ax,0a000h               ;//access Video Memory
    mov es,ax
    xor di,di
    mov ax,di
    mov cx,8000h
    rep stosw                   ;//clear Screen

    mov dx,3d4h                 ;//select CRT Controller...
    mov al,9                    ;//...Maximum Scan Line Register
    out dx,al
    inc dx
    in al,dx
    and al,01110000b            ;//select 400 lines
    out dx,al
    dec dx

    mov al,14h                  ;//...Underline Location Register
    out dx,al
    inc dx
    in al,dx
    and al,10111111b            ;//switch off Doubleword-Mode
    out dx,al
    dec dx

    mov al,17h                  ;//...Mode Control Register
    out dx,al
    inc dx
    in al,dx                    ;//select Word-Mode (normally: Bytemode)
    and al,10111111b            ;//normally: or al,01000000b
    out dx,al

    call initpalette            ;//call to palette setup routine (for FakeMode)

    call inittimer              ;//call to timer setup routine (for FakeMode)
    pop di                      ;//restore value of di
    ret
  _F_initgraph ENDP
************************************

That's it. I modified the original routine a bit as I keep WordMode;
it's because it is easier to write FakeMode pixels in WordMode.  You can
return to textmode or other graphics modes with the normal BIOS mode-set
call (int 10h, function 0).  The calls to 'initpalette' and 'inittimer'
are necessary to install FakeMode and are not part of Y-Mode
installation.
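
[Sketch added for illustration; not part of FAKEMODE.ZIP.]  Why does
WordMode make FakeMode pixels easier to write?  The following C fragment
models what WordMode is generally documented to do, assuming the
address-wrap bit is left at its mode 13h setting:  the 16-bit display
address counter is rotated left by one bit before it reaches video
memory.  The function name is hypothetical, and the model is an
assumption rather than something the excerpt states.

```c
#include <stdint.h>

/* Assumed model of CRTC word mode with address wrap selecting bit 15:
   the display address counter is rotated left by one, so counter bit 15
   ends up as memory address bit 0. */
uint16_t word_mode_address(uint16_t counter)
{
    return (uint16_t)((counter << 1) | (counter >> 15));
}
```

If the model holds, a start address of 0 scans only the even bytes of
each plane and a start address of 8000h scans only the odd bytes.  That
would explain why the two pages sit interleaved in memory and why one
16-bit write can set a pixel on both pages at once.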


*** 2.2 Page Flipping

This is best done (I think) by synchronizing the timer interrupt with
the screen. Just before the vertical retrace appears, the interrupt is
called.  The interrupt handler routine should now set the screen offset
address to its new value and wait for the vertical retrace. Then it
should reprogram the timer and return to the main program. When the
vertical retrace occurs, the new offset address is loaded in the
internal registers of the VGA card and invokes the next screen update.
See [1], ([2]), [3].

So we'll just have to:
  - hook the timer interrupt
  - write our own interrupt handler
  - synchronize the timer interrupt with the screen
  - still call system timer routine at 18.2 Hz from interrupt handler
  - program the timer chip to achieve MonoFlop mode.

What could be simpler? :)


*** 2.2.1 Hooking/Dehooking the timer interrupt,
	  Synchronization with the screen

Hooking an interrupt is quite easy; DOS interrupt 21h has got functions
to handle interrupt hooking (see below, routine inittimer).  To
synchronize the timer int with the screen, I first set the interrupt
speed much faster than the screen (256 Hz) and use a handler that counts
up a variable 'counter'. Then I wait for a vertical retrace and let the
timer run.  When 'counter' has changed at the next vertical retrace, the
timer still is too fast. I lower the speed and try again, until
'counter' doesn't change between the start of the timer and the next
vertical retrace. Then I know that with this speed, I'm just below the
minimal speed. I increase it a little and now I know approximately how
long I have to wait between 2 timer int calls. Of course the value isn't
exact, so I have to resynchronize on every interrupt call; that's done
by the interrupt handler discussed below.  The routine 'closetimer'
should be called when you leave FakeMode; it stops the timer int and
puts everything back to normal.
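
[Sketch added for illustration; not part of FAKEMODE.ZIP.]  The search
described above can be modelled in plain C without touching any
hardware.  The sketch assumes the PC timer chip runs at 1193182 Hz, and
it takes the length of one screen refresh in timer ticks as a parameter
(frame_ticks, a hypothetical input; the real routine finds it by
polling the retrace bit).  The start value, step and headroom are the
ones used in the inittimer listing.

```c
#include <stdint.h>

#define PIT_HZ 1193182UL   /* input clock of the PC timer chip, in Hz */

/* Interrupt rate for a given divisor; a divisor of 0 means 65536. */
double pit_rate_hz(uint16_t divisor)
{
    unsigned long d = divisor ? divisor : 65536UL;
    return (double)PIT_HZ / (double)d;
}

/* Model of the calibration loop.  Assumes frame_ticks is well below
   65000, which holds for refresh rates of 60 to 70 Hz. */
uint16_t calibrate_divisor(uint16_t frame_ticks)
{
    uint16_t divisor = 0x1234;        /* start at about 256 Hz */
    while (divisor <= frame_ticks)    /* timer still fires within a frame */
        divisor = (uint16_t)(divisor + 250);  /* lower speed, try again */
    return (uint16_t)(divisor - 800); /* leave time for the handler */
}
```

For a 400-line mode refreshing at roughly 70 Hz, one frame lasts about
17045 ticks, and the loop settles on a divisor a little shorter than
that, so the interrupt arrives shortly before the retrace.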


************************************
synchroint PROC                 ;// This interrupt handler is used for
  push ax                       ;// screen synchronization.
  mov ax,counter
  inc ax                        ;// just count up 'counter'...
  mov counter,ax
  mov al,20h                    ;// send EOI to interrupt controller...
  out 20h,al
  pop ax
  iret                          ;// return from interrupt handler
synchroint ENDP

inittimer PROC                  ;// This routine is called when FakeMode is
  push di                       ;// installed. it initializes & synchronizes
  mov ax,1234h                  ;// the timer
  mov currentfloptime,ax        ;// start with 256 Hz

  mov ax,3508h                  ;//save old Interrupt 08
  int 21h
  mov alterint08,bx
  mov alterint08+2,es

  xor ax,ax                     ;//redirect Int. 08 to Synchronisation Rout.
  mov es,ax
  mov di,08h*4                  ;// this is the other method to access
  cli                           ;// interrupt vectors: via the interrupt table
  cld
  mov ax,offset synchroint
  stosw
  mov ax,cs
  stosw
  sti

  ;//------ synchronize timer with screen

  mov dx,3dah                   ;//Wait for End of Retrace
s1endretjmp:
  in al,dx
  and al,00001000b
  jnz s1endretjmp
s1retjmp:                       ;//Wait for Retrace
  in al,dx
  and al,00001000b
  jz s1retjmp

synchroback:                    ;//now we can start measurement...

    mov al,36h                  ;//start Systemtimer in square-wave mode (mode 3)
    out 43h,al
    mov ax,currentfloptime
    out 40h,al
    mov al,ah
    out 40h,al

    mov ax,0                    ;//reset counter. counter is increased
    mov counter, ax             ;//by interrupt routine

    mov dx,3dah                 ;//Wait for End of Retrace
  s2endretjmp:
      in al,dx
      and al,00001000b
    jnz s2endretjmp
  s2retjmp:                     ;//Wait for Retrace
      in al,dx
      and al,00001000b
    jz s2retjmp

    mov ax,counter              ;//did interrupt still occur?
    cmp ax,0
    je fertig                   ;//no -> ready
    mov ax,currentfloptime
    add ax,250                  ;//yes -> lower speed and try again
    mov currentfloptime,ax
  jmp synchroback

fertig:
  mov al,34h                    ;//set Systemtimer to mode 2 (used as Monoflop)
  out 43h,al
  mov ax, currentfloptime
  sub ax,800                    ;//...we need time for the handler
  mov currentfloptime,ax
  out 40h,al
  mov al,ah
  out 40h,al

  xor ax,ax                     ;//redirect Int. 08 to Screenswitch Routine
  mov es,ax
  mov di,08h*4
  cli
  cld
  mov ax,offset switchpageint   ;//interrupt handler routine see below
  stosw
  mov ax,cs
  stosw
  sti
  pop di
  ret
inittimer ENDP

closetimer PROC         ;// this routine de-installs the timer handler
  push ds
  push di
  push si
  cli
  mov al,36h            ;//Systemtimer back to normal speed
  out 43h,al
  xor al,al
  out 40h,al
  out 40h,al
  push cs               ;//restore Interrupt Vector back to normal
  pop ds
  mov si,offset alterint08
  xor ax,ax
  mov es,ax
  mov di,08h*4
  cld
  movsw
  movsw
  sti
  pop si
  pop di
  pop ds
  ret
closetimer ENDP
************************************


*** 2.2.2 The interrupt Handler routine

This is the main timer interrupt routine which is called after every
screen update and performs the page flipping.  There are three necessary
things when you write a hardware interrupt handler:
1) be sure to preserve ALL registers you use (push them and pop them later)!
2) don't forget to acknowledge the hardware interrupt controller 
    (mov al,20h   out 20h,al)!
3) return from Interrupt with IRET, not with RET!

Read the comments; they should explain everything.
Literature used for this: [2], [3]

************************************
switchpageint PROC
  push ax                       ;//interrupt handlers must push all registers
  push bx                       ;//they use!
  push dx

  inc word ptr systimer         ;//set system timer (this is my own timer;
				;//i use it for timing in the main program)
  mov bx,currentpage
  add bx,32768
  mov currentpage,bx
  mov dx,3d4h
  mov al,0ch                    ;//set Start Adresse High (0Ch) to flip pages
  mov ah,bh
  out dx,ax

  mov dx,3dah                   ;//Wait for Retrace
swretjmp:                       ;//(this is done to keep synchronization)
    in al,dx
    and al,00001000b
  jz swretjmp

  mov al,34h                    ;//restart the Monoflop
  out 43h,al                    ;//(let the timer run again)
  mov ax,currentfloptime
  out 40h,al
  mov al,ah
  out 40h,al

  mov ax,currentfloptime        ;//reload AX ("mov al,ah" changed AL)
  mov bx,currentsystimer        ;//do Systemtimer call at 18.2 Hz
  add ax,bx
  mov currentsystimer,ax
  cmp ax,bx
  ja short nosysroutine         ;//No --> continue
  pop dx
  pop bx
  pop ax
  jmp dword ptr alterint08      ;//call Systemtimer Routine
nosysroutine:
  mov al,20h                    ;//OK to Interrupt Controller
  out 20h,al

  pop dx
  pop bx
  pop ax
  iret                          ;//return from interrupt
switchpageint ENDP
************************************
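
[Sketch added for illustration; not part of FAKEMODE.ZIP.]  The two
pieces of bookkeeping in the handler (which page to show, and when to
pass control to the old 18.2 Hz system timer routine) can be modelled
in C as follows.  FlipState, flip_page and should_chain are hypothetical
names; no ports are touched, the functions only compute the values.

```c
#include <stdint.h>

typedef struct {
    uint16_t current_page;  /* CRTC start address: 0000h or 8000h */
    uint16_t tick_sum;      /* timer ticks accumulated so far (16 bits) */
} FlipState;

/* Toggle the page.  Returns the word the handler writes to port 3D4h:
   low byte = register index 0Ch, high byte = Start Address High. */
uint16_t flip_page(FlipState *s)
{
    s->current_page = (uint16_t)(s->current_page + 32768u);
    return (uint16_t)((s->current_page & 0xFF00u) | 0x0Cu);
}

/* Add one timer period to the running sum.  Returns 1 when the sum
   wrapped past 65536 ticks, i.e. when one 18.2 Hz period has elapsed
   and the old handler should be called. */
int should_chain(FlipState *s, uint16_t divisor)
{
    uint16_t old = s->tick_sum;
    s->tick_sum = (uint16_t)(old + divisor);
    return s->tick_sum <= old;
}
```

With a divisor around 16000, should_chain reports a wrap on roughly
every fourth or fifth call, which keeps the DOS clock ticking at close
to its usual rate.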

There may be timing problems when you use your own hardware interrupt
handler.  INT 13h calls are especially time-sensitive:  if a timer
interrupt routine is called just when the processor is in the INT 13h
handler, and the processing of the timer int takes too long, the
computer may crash.  In such cases it may help to check whether the
computer is in the INT 13h handler when the timer interrupt is called.
(You have to hook INT 13h and set a variable, then continue with the
INT 13h handler. After the handler has finished, reset the variable. The
timer int can then check this variable to see if the INT 13h handler is
active or not.) Well, I never had problems with INT 13h and FakeMode, so
I didn't implement this. :)


*** 2.3 Palette Setup

The palette is static; that means I don't change it when I flip pages.
To achieve the 3840 color mode, I split the colors into green and
red/blue.  The palette contains 16*15 values red/blue and 16 values
green (16*15+16=256).  (16 levels of red * 16 levels of green * 15
levels of blue = 3840 different colors.) To get harmonic greys, I set
blue to the same values as red and green, but just leave out the darkest
non-zero blue value (you can't see that one, anyway).  So when you set
pixels later, you have to decrement the blue value if it isn't zero to
get the right palette entry. (The H_setsmallpixel routine of the example
file included does that, for example.) The palette values are written
directly to the DAC, but are also buffered in 'palette' to make later
changes possible (fadein/out, setluminance).
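
[Sketch added for illustration; not part of FAKEMODE.ZIP.]  The index
arithmetic described above, written out in C.  The function names are
hypothetical; the layout (red/blue entries at 0..239, green entries at
240..255, blue decremented when non-zero) is the one the text describes.

```c
#include <stdint.h>

/* Palette index of the red/blue entry for 4-bit red and blue (0..15).
   Blue levels 0 and 1 land on the same row, because the darkest
   non-zero blue was left out of the palette. */
uint8_t fake_rb_index(uint8_t red, uint8_t blue)
{
    uint8_t row = (blue == 0) ? 0 : (uint8_t)(blue - 1);
    return (uint8_t)(row * 16 + red);
}

/* Palette index of the green entry for 4-bit green (0..15). */
uint8_t fake_g_index(uint8_t green)
{
    return (uint8_t)(240 + green);
}

/* Count the distinct colors: every red/blue entry that can actually be
   reached, combined with each of the 16 green entries. */
unsigned fake_color_count(void)
{
    uint8_t seen[240] = {0};
    unsigned rb_entries = 0, r, b;
    for (b = 0; b < 16; b++) {
        for (r = 0; r < 16; r++) {
            uint8_t i = fake_rb_index((uint8_t)r, (uint8_t)b);
            if (!seen[i]) { seen[i] = 1; rb_entries++; }
        }
    }
    return rb_entries * 16;
}
```

Counting this way gives 240 reachable red/blue entries times 16 green
entries, which is where the figure of 3840 comes from.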


Here comes the palette setup routine:

************************************
palette db 768 dup (?)          ;//buffer for palette

initpalette PROC
  push di
  push cs
  pop es
  mov di, offset palette
  cld
  mov dx,3c8h
  xor ax,ax
  xor bx,bx
  out dx,al
  inc dx                ;//ah=red, bh=green, bl=blue
  mov cx,15
initpal_outer:          ;//setup red/blue part of palette (0..239)
  push cx
  mov cx,16
  initpal_inner:
      mov al,ah
      out dx,al
      stosb
      mov al,bh
      out dx,al
      stosb
      mov al,bl
      out dx,al
      stosb

      add ah,4
    loop initpal_inner
    mov ah,0
    add bl,4
    cmp bl,4
    jne goon
      add bl,4
  goon:
    pop cx
  loop initpal_outer

  mov cx,16
  xor ax,ax
  xor bx,bx
initpal_second:                 ;//setup green part of palette (240..255)
    mov al,ah
    out dx,al
    stosb
    mov al,bh
    out dx,al
    stosb
    mov al,bl
    out dx,al
    stosb
    add bh,4
  loop initpal_second
  pop di
  ret
initpalette ENDP
************************************
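
[Sketch added for illustration; not part of FAKEMODE.ZIP.]  For readers
more at home in C, the same table can be built in a plain memory buffer.
This sketch only fills the 768-byte buffer and leaves the DAC writes
(ports 3C8h/3C9h) out, so it runs anywhere; fake_build_palette is a
hypothetical name.

```c
#include <stdint.h>

/* Fill a 768-byte buffer: 256 entries of red, green, blue, each a
   6-bit DAC value (0..63). */
void fake_build_palette(uint8_t pal[768])
{
    int i = 0, row, red, green;
    uint8_t blue = 0;

    /* Entries 0..239: 15 blue levels times 16 red levels, green = 0. */
    for (row = 0; row < 15; row++) {
        for (red = 0; red < 16; red++) {
            pal[i++] = (uint8_t)(red * 4);
            pal[i++] = 0;
            pal[i++] = blue;
        }
        blue = (uint8_t)(blue + 4);
        if (blue == 4)
            blue = 8;            /* skip the darkest non-zero blue */
    }

    /* Entries 240..255: 16 green levels, red = blue = 0. */
    for (green = 0; green < 16; green++) {
        pal[i++] = 0;
        pal[i++] = (uint8_t)(green * 4);
        pal[i++] = 0;
    }
}
```

The blue levels come out as 0, 8, 12, ... 60, while red and green both
run 0, 4, 8, ... 60.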


*** How to set pixels in FakeMode

The main purpose of the way I set pixels is to minimize the flicker.
One pixel on the screen consists of 2 pixels, one on page 1 and one on
page 2. On one of the pages the green value is displayed, on the other
the red/blue value.  Imagine I set all the pixels' green values on page
1 and all red/blue values on page 2. I would get horrible flicker. To
prevent this, I set the values as follows:

  if xpos+ypos=odd, then set red/blue on page 1 and green on page 2
		    else set green on page 1 and red/blue on page 2.

So I get a 1/1 raster, and each of the 2 pages contain both red/blue and
green values. Look at the following routine to see how it is done
exactly:


************************************
_F_putsmallpixel PROC
;//values:  x=0..319, y=0..399, red=0..15, green=0..15, blue=0..15
  ARG x:word, y:word, red:byte, green:byte, blue:byte
  push bp
  mov bp,sp
  push di
  mov bx,x
  mov cx,bx
  and cl,00000011b      ;//calculate bitplane...
  mov dx,3c4h
  mov ax,0102h
  shl ah,cl
  out dx,ax             ;//...and set it
  mov ax,0a000h         ;//set destination segment
  mov es,ax
  mov ax,160            ;//set destination offset
  mov dx,y
  mul dx
  shr bx,1
  and bl,11111110b
  add bx,ax             ;//bx contains basic offset
  mov di,bx
  mov al,blue           ;//calculate red-blue value
  mov ah,16
  mul ah
  cmp ax,0
  je short smallpixgoon
    sub ax,16           ;// perform blue adjustment
smallpixgoon:
  add al,red
  add cx,y
  and cl,00000001b      ;// select if green value on page 1 or 2
  jz short stypetwo
    mov ah,al
    mov al,green
    add al,240
    mov es:[di],ax      ;// set both pixels (on page 1 & 2)
    jmp short send
stypetwo:
    mov ah,green
    add ah,240
    mov es:[di],ax      ;// set both pixels (on page 1 & 2)
send:
  pop di
  pop bp
  ret
_F_putsmallpixel ENDP
************************************


In FakeMode the video memory is built up like this:

Bitplane 0|  rb 0 | g  0 | rb 4 | g  4 | ...
Bitplane 1|  g  1 | rb 1 | g  5 | rb 5 | ...
Bitplane 2|  rb 2 | g  2 | rb 6 | g  6 | ...
Bitplane 3|  g  3 | rb 3 | g  7 | rb 7 | ...
Offset    |    0  |   1  |   2  |   3  | ...

So one line on the screen uses 160 bytes of data in each Bitplane.
The color values for one pixel are stored beside each other (this is
because I use wordmode).
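
[Sketch added for illustration; not part of FAKEMODE.ZIP.]  The address
arithmetic of _F_putsmallpixel can be mirrored in C to see where a given
pixel ends up.  Nothing is written to video memory here; the function
only computes the plane mask, the offset and the two bytes.  FakePixel
and fake_place_pixel are hypothetical names, and the palette indices are
taken as ready-made inputs.

```c
#include <stdint.h>

typedef struct {
    uint8_t  plane_mask;  /* value for the Sequencer Map Mask register */
    uint16_t offset;      /* offset of the low (even) byte in the plane */
    uint8_t  low_byte;    /* byte stored at offset     */
    uint8_t  high_byte;   /* byte stored at offset + 1 */
} FakePixel;

/* x = 0..319, y = 0..399; rb_index and g_index are palette indices. */
FakePixel fake_place_pixel(unsigned x, unsigned y,
                           uint8_t rb_index, uint8_t g_index)
{
    FakePixel p;
    p.plane_mask = (uint8_t)(1u << (x & 3u));        /* plane = x mod 4 */
    p.offset = (uint16_t)(y * 160u + (x >> 2) * 2u); /* 160 bytes a line */
    if ((x + y) & 1u) {          /* odd: green goes into the low byte */
        p.low_byte = g_index;
        p.high_byte = rb_index;
    } else {                     /* even: red/blue goes into the low byte */
        p.low_byte = rb_index;
        p.high_byte = g_index;
    }
    return p;
}
```

Stepping x or y by one flips which byte holds the green value, which is
what produces the 1/1 raster on each page.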


---------------------
Literature references
---------------------

[1]: Michael Tischer: PC intern 3.0; Data Becker  (a German book)
			Contains useful information about VGA programming
			(Although the Ferraro book might be better)

[2]: DOS international, issue 3/89 p.170 ff; Everts&Hagedorn (German computer
			magazine). This is an article about how sample output
			with PC internal speaker is done. That's where I got
			timer / interrupt programming from

[3]: A huge stack of copied sheets from several books I don't remember. :)

(Documentation excerpt ends------------------------------)

Code:

Code that achieves this effect in both C and Pascal is available on
ftp.cdrom.com in the directory /pub/demos/hornet/demonews/vgahard in the
file vgahard3.zip.  This article is stored there as well.  To compile
the code directly, you'll need Turbo Pascal 7.0 or Borland C 3.1 or
later.  (The code can be compiled with earlier compilers as well, but
some slight modification might be necessary.)

Notes:

Mr. Data was kind enough to point out that Atari ST coders have known
about these kinds of tricks for years, because the Atari ST had more
limited graphics hardware than the Amiga.  Unfortunately, many of the
techniques he described to me that were used on the Atari can't really
be applied to today's VGA hardware, because some of the techniques
oscillated between video pages and colors too slowly.  The older
monitors of the time had slow phosphors, so the *monitor itself*
"blended" the colors, but today's monitors have very fast phosphors,
which results in terrible flickering if trying to use an Atari
technique.

Next time:

12-bit color gives very solid results, but you've probably noticed
its major weakness by now:  It spends so much CPU time making sure that
the video pages keep flipping that it's not really useful for animation.
Next week, we'll tackle a different solution to this--and gain over
250,000 more colors in the process!  Have fun until then!
