Rawhed's Tutorial #4:



			   IIIIIIII DDDD
			      II    DD DDD
			      II    DD   DD
----------------     CPU      II    DD    DD     ----------------------------
			      II    DD   DD
			      II    DD DDD
			   IIIIIIII DDDD


		             [Introduction]
		             [CPUID Intro]
		             [EFLAGS To Detect CPUID]
			     [Returned Data - Standard Level 0]
			     [Returned Data - Standard Level 1]
			     [Returned Data - Standard Level 2&3]
			     [Returned Data - Extended Level 0]
			     [Returned Data - Extended Level 1]
			     [Returned Data - Extended Level 2&3&4]
			     [Returned Data - Extended Level 5]
			     [Returned Data - Extended Level 6]
			     [Quick Example]
			     [Closing Words]

---==[Introduction]==---------------------------------------------------------

Ever needed to know what processor your program/demo was running on?  I did.
After looking at the lengthy hacks people wrote long ago (before the CPUID
instruction) to identify rudimentary processor information, I was very glad
when I stumbled onto CPUID.  I didn't touch the instruction for months as
I didn't know how to use it, and thought that all it did was output 1 for P5,
2 for P6, 3 for PPro etc.  Hehe, luckily that's NOT the case! :)

So here I am, a little wiser, about to tell you about this actually quite cool
instruction.  Some of you might be wondering why a demo would want to
know what processor it's running on?  I'm not sure.  But initially I thought
it would be really cool for the demo to show specs about the machine it was
running on, just before the demo began.  Now that I think about it, that
was a rather silly reason.  Later on (after my cousin got his AMD machine), I
wanted to create a demo which had 3DNow! and MMX support.  So I needed the
CPUID instruction to identify whether the chip could handle MMX/3DNow!.

So far my MMX & 3DNow! enabled demo is going very well, and I'll probably write
a tutorial about it soon.  I'd like to see some more demos which support these
technologies, as they are actually quite good.  MMX didn't receive a HUGE
round of applause when it was released - even I thought it was just another
one of Intel's gimmicks - but (surprisingly) after coding it a bit, I've
found it to be very cool.  So anyways, enough jaw-gyration - here is
how it's all done :)

---==[CPUID Intro]==--------------------------------------------------------

CPUID is an assembler instruction, just like MOV or ADD.  It was implemented
from late 486s and up.  With cool new assemblers like NASM you can just type
the instruction in straight, but with others you'll have to emit the opcode
yourself, which is: 0fh, 0a2h (two bytes).

	Instruction:CPUID
	Description:Used to return information about the processor type, family,
		    model and many other things.  The amount of information
                    that this instruction can return is growing all the time.
                    Chip features can be returned, and even L1/L2/L3 cache
                    sizes.
        Input:      Set EAX to the relevant level number.
        Output:     Varies according to which level number you ask it for.
	Example:    mov   eax, 1
		    CPUID

---==[EFLAGS To Detect CPUID]==-----------------------------------------------

The first step is to check whether the computer actually supports the CPUID 
instruction!  This can be done by looking at the EFLAGS register.  Only 32bit
processors (386 and up) have the EFLAGS register.  EFLAGS is just the 32bit
version of the 16bit FLAGS register :)  The EFLAGS register is a 32bit number,
and each bit is either set/unset (1/0) for certain things.  Let's have a look:

(Remember 16bit machines will only have the first 16 bits.)
(Remember you won't need most of this - it's mostly for interest's sake :)
-----------------------------
Bit 00 : Carry flag
Bit 01 : Always 1
Bit 02 : Parity flag
Bit 03 : Always 0
Bit 04 : Auxiliary Carry flag
Bit 05 : Always 0
Bit 06 : Zero flag
Bit 07 : Sign flag
Bit 08 : Trap flag
Bit 09 : Interrupt flag
Bit 10 : Direction flag
Bit 11 : Overflow flag
Bit 12 : I/O Privilege level
Bit 13 : I/O Privilege level
Bit 14 : Nested Task flag
Bit 15 : Always 0
-----------------------------
Bit 16 : Resume flag
Bit 17 : Virtual Mode flag
Bit 18 : Alignment Check flag
Bit 19 : Virtual Interrupt flag
Bit 20 : Virtual Interrupt Pending flag
Bit 21 : Identification flag
Bit 22->31 : Unspecified, so 0
-----------------------------

All we are interested in is Bit 21, the ID flag.  But how do we get it???
Let's consult our good ol' Intel.doc file we have lying in PCGPE.

PUSHF/PUSHFD - Push Flags onto Stack

        Usage:  PUSHF
                PUSHFD  (386+)
        Modifies flags: None

        Transfers the Flags Register onto the stack.  PUSHF saves a 16 bit
        value while PUSHFD saves a 32 bit value.

                                 Clocks                 Size
        Operands         808x  286   386   486          Bytes

        none            10/14   3     4     4             1
        none  (PM)        -     -     4     3             1


Cool!  There you go.  
One interesting thing is that PUSHF & PUSHFD both have the same opcode (9Ch).
The CPU works out whether to push 16 or 32 bits from the current operand
size (the code segment default, or a 66h operand-size prefix).  So here is how
we would test for Bit 21:

	NOTE!  The ID flag can only be changed on processors that support
	the CPUID instr, so this code tries to flip Bit 21 and if the
	change sticks, then CPUID works.

		;cpuid supported?
		pushfd 			;push the flags onto the stack	
		pop eax 		;pop them back out, into EAX
		mov ebx, eax 		;keep original
		xor eax, 00200000h 	;flip (toggle) bit 21.
		push eax 		;put altered EAX on stack
		popfd 			;pop stack into flags
		pushfd 			;push flags back onto stack
		pop eax 		;put them back into EAX
		cmp eax, ebx 		
		jnz @CPUID_SUPPORTED	;COOL!!
		;blah			;booo  
		@CPUID_SUPPORTED:
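
If you'd rather see the bit logic of that test written out in C, here is a
minimal sketch.  It only does the masking and comparing; actually reading
and writing EFLAGS still needs the PUSHFD/POPFD sequence above.  The names
EFLAGS_ID_MASK, eflags_toggle_id and cpuid_supported are made up for this
example.

```c
#include <stdint.h>

/* Bit 21 of EFLAGS is the ID flag. */
#define EFLAGS_ID_MASK 0x00200000u

/* Value to hand to POPFD: the original flags with bit 21 flipped. */
static uint32_t eflags_toggle_id(uint32_t original)
{
    return original ^ EFLAGS_ID_MASK;
}

/* Nonzero if bit 21 differs between the original flags and the flags
   read back after the write, i.e. the CPU let us change the ID flag. */
static int cpuid_supported(uint32_t original, uint32_t reread)
{
    return ((original ^ reread) & EFLAGS_ID_MASK) != 0;
}
```

Unlike the assembler version, which compares the whole register, this
sketch looks only at bit 21, so other flags changing in between can't
confuse it.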

---==[Returned Data - Standard Level 0]==-------------------------------------

There are many levels that CPUID works at.  You select a level by setting
EAX to the level number you want info about.  New levels are being
added to the CPUID instruction all the time, so I'll just introduce the
most common & useful ones.  There are also 2 types of levels: standard levels
and extended levels.  Standard levels go from 0 --> 7FFFFFFFh and extended
levels go from 80000000h --> FFFFFFFFh.

	EAX=0

	Output:	[EAX] - Maximum supported standard level
		[EBX-EDX-ECX] - CPU vendor ID string

The maximum supported standard level is the highest value you can put in
EAX for the CPUID instruction on the current CPU.
The CPU vendor string is really neat.  Concatenate (join) EBX, EDX and ECX,
in that order, reading each as 4 ASCII characters (lowest byte first).  This
should return one of:

		GenuineIntel    (Intel processor)
		UMC UMC UMC     (UMC processor)
		AuthenticAMD    (AMD processor)
		CyrixInstead    (Cyrix processor)
		NexGenDriven    (NexGen processor)
		CentaurHauls    (Centaur/IDT processor)
		RiseRiseRise    (Rise Technology processor)

So that's how you find out what make the processor is.  Apparently in some of
the earlier implementations of CPUID, you were able to change the value of the
vendor ID string.  This resulted in companies changing a lot of "CyrixInstead"
to "GenuineIntel" :)
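
Here is a small C sketch of joining the three registers into the vendor
string.  It assumes you already have the EBX, EDX and ECX values from
level 0 (from your own assembler routine); the function names unpack_reg
and cpuid_vendor_string are invented for this example.

```c
#include <stdint.h>

/* Copy the 4 bytes of one register into out, lowest byte first. */
static void unpack_reg(uint32_t reg, char *out)
{
    for (int i = 0; i < 4; i++)
        out[i] = (char)((reg >> (8 * i)) & 0xFFu);
}

/* Build the 12 character vendor ID from level 0's EBX, EDX, ECX
   (in that order).  out must hold at least 13 bytes.  Returns out. */
static char *cpuid_vendor_string(uint32_t ebx, uint32_t edx, uint32_t ecx,
                                 char out[13])
{
    unpack_reg(ebx, out);
    unpack_reg(edx, out + 4);
    unpack_reg(ecx, out + 8);
    out[12] = '\0';               /* 13th byte for the NULL char */
    return out;
}
```

For example, EBX=756E6547h, EDX=49656E69h, ECX=6C65746Eh spells out
"GenuineIntel".  Shifting the bytes out (instead of copying memory) keeps
the result the same regardless of the byte order of the machine running
the C code.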

---==[Returned Data - Standard Level 1]==-------------------------------------

Now it's time to get more details on the CPU.  On level 1 we can find out if
the CPU is a primary/secondary processor, what family it is, its model, and 
its stepping values.  We can also find out what CPU features it supports,
such as MMX and a ton more.

	EAX=1

	Output:	[EAX] - Processor type/family/model/stepping
		[EDX] - CPU "Feature flags"

For EAX, only the first 16 bits are used, so it's really AX that you need to
use.  It's structured like this:
		
	EAX	FEDCBA9876543210
		00TTFFFFMMMMSSSS

		T - Processor type(2bits)
		F - Processor family(4bits)
		M - Processor model(4bits)
		S - Processor stepping(4bits)

Processor type:
	11b = reserved 
	10b = secondary processor (for MultiProcessing)
	01b = Overdrive processor 
	00b = primary processor 
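
The bit layout above can be pulled apart with a few shifts and masks.  This
is just a sketch in C; the struct cpu_signature and the function
decode_signature are names I made up, and it only handles the 16-bit layout
shown in this section.

```c
#include <stdint.h>

/* The four fields packed into AX by standard level 1. */
struct cpu_signature {
    unsigned type;      /* bits 13..12 */
    unsigned family;    /* bits 11..8  */
    unsigned model;     /* bits 7..4   */
    unsigned stepping;  /* bits 3..0   */
};

/* Split the EAX value returned by CPUID level 1 into its fields. */
static struct cpu_signature decode_signature(uint32_t eax)
{
    struct cpu_signature s;
    s.stepping = eax & 0xFu;
    s.model    = (eax >> 4) & 0xFu;
    s.family   = (eax >> 8) & 0xFu;
    s.type     = (eax >> 12) & 0x3u;
    return s;
}
```

So an EAX of 0580h decodes to type 0, family 5, model 8, stepping 0.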

Processor family:
	4 = most 80486s, AMD 5x86, Cyrix 5x86
	5 = Intel P5, P54C, P55C, P24T
	    NexGen Nx586, Cyrix M1
	    AMD K5, K6, Centaur/IDT C6, C2
	    Rise mP6
	6 = Intel P6, P2, P3, Cyrix M2 

This family data may seem messy (and it is), but things are made easier
when you combine this information with the VENDOR ID returned from level 0.
And again, vendor and family together are needed to interpret the value
returned in the "processor model" properly.

Processor model:
	Intel 80486 	0 i80486DX-25/33 
			1 i80486DX-50 
			2 i80486SX 
			3 i80486DX2 
			4 i80486SL 
			5 i80486SX2 
			7 i80486DX2WB 
			8 i80486DX4 
			9 i80486DX4WB 

	UMC 80486 	1 U5D 
			2 U5S 

	AMD 80486 	3 80486DX2 
			7 80486DX2WB 
			8 80486DX4 
			9 80486DX4WB 
			E 5x86 
			F 5x86WB 

	Cyrix 5x86 	9 5x86 

	Cyrix MediaGX 	4 GX, GXm 

	Intel P5-core 	0 P5 A-step 
			1 P5 
			2 P54C 
			3 P24T Overdrive 
			4 P55C 
			7 P54C 
			8 P55C (0.25 um) 

	NexGen Nx586 	0 Nx586 or Nx586FPU (only later ones) 

	Cyrix M1 	2 6x86 
	
	Cyrix M2	0 6x86MX 
	
	AMD K5 		0 SSA5 (PR75, PR90, PR100) 
			1 5k86 (PR120, PR133) 
			2 5k86 (PR166) 
			3 5k86 (PR200) 

	AMD K6 		6 K6 (0.30 um) 
			7 K6 (0.25 um) 
			8 K6-2 
			9 K6-III 

	Centaur/IDT 	4 C6 
			8 C2 
	
	Rise 		0 mP6 

	Intel P6-core 	0 P6 A-step 
			1 P6 
			3 P2 (0.28 um) 
			5 P2 (0.25 um) 
			6 P2 with on-die L2 cache 
			7 P3 

This list is NOT complete, for a better one, check INTEL/AMD/CYRIX.

EDX contains 32 bits, each one a flag indicating a feature the chip
supports.  For my program I just used it to see if MMX was
supported, then I just counted up the features and displayed how many 
features the chip had.

EDX bit 00 - FPU on-chip
    bit 01 - Virtual mode extension (V86 Mode Extensions)
    bit 02 - Debugging extension
    bit 03 - Page size extension (4 MB pages)
    bit 04 - Time stamp counter & RDTSC instruction
    bit 05 - RDMSR / WRMSR instructions
    bit 06 - Physical address extension (36-bit address, 2MB pages)
    bit 07 - Machine check exception
    bit 08 - CMPXCHG8B instruction
    bit 09 - On-chip APIC hardware (multiprocessor operation support)
    bit 10 - undefined
    bit 11 - SYSENTER / SYSEXIT instructions
    bit 12 - Memory type range registers
    bit 13 - Page global enable
    bit 14 - Machine check architecture
    bit 15 - Conditional move instruction(CMOV)
    bit 16 - Page attribute table
    bit 17 - 36 bit Page Size Extensions
    bit 18 - Processor serial number (PIII)
    bit 19 - undefined
    bit 20 - undefined    
    bit 21 - undefined
    bit 22 - undefined    
    bit 23 - MMX instructions
    bit 24 - Fast FPU save & restore   (FXSAVE and FXRSTOR)
    bit 25 - SSE, MXCSR, CR4.OSXMMEXCPT, #XF
    bit 26 - undefined
    bit 27 - undefined
    bit 28 - undefined
    bit 29 - undefined
    bit 30 - undefined
    bit 31 - undefined
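
Testing a single flag, or counting them up like my program does, looks
roughly like this in C.  This is only a sketch: the enum names and the
functions has_feature and count_features are invented here, and the EDX
value has to come from your own CPUID call.

```c
#include <stdint.h>

/* A few bit numbers from the table above. */
enum {
    FEATURE_FPU  = 0,
    FEATURE_TSC  = 4,
    FEATURE_CMOV = 15,
    FEATURE_MMX  = 23
};

/* 1 if the given bit (0..31) is set in the level 1 EDX value, else 0. */
static int has_feature(uint32_t edx, unsigned bit)
{
    return (int)((edx >> bit) & 1u);
}

/* Number of bits set in EDX, i.e. how many features are reported. */
static unsigned count_features(uint32_t edx)
{
    unsigned n = 0;
    while (edx) {
        n += edx & 1u;
        edx >>= 1;
    }
    return n;
}
```

With EDX=00800011h, has_feature(edx, FEATURE_MMX) gives 1 and
count_features(edx) gives 3 (FPU, TSC and MMX).  Note that the count also
includes any undefined bits a chip happens to set.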

I'm not sure what most of the above means, although I'd love to know.  Anybody
know?

---==[Returned Data - Standard Level 2&3]==-----------------------------------

I'm not gonna talk much about levels 2&3 because my processor doesn't
support them so I haven't tested them :)

Basically level 2 allows you to get details on the processor CACHE (L1 & L2).
Level 3 allows you to get the processor's SERIAL number (if it has one).
I wanna find out how to disable the serial number on the PIIIs.

Visit www.sandpile.org for more info on these levels.

---==[Returned Data - Extended Level 0]==-------------------------------------

AMD introduced the extended levels for the CPUID instruction. The extended
levels are not supported by all processors, but are definitely supported by:
	AMD K6, K6-2 
	Cyrix (starting with GXm - a successor of MediaGX, but not on MII) 
	IDT C6-2. 

I'm not sure why this extended level was made, but level 0 starts with EAX=
80000000h, and so level 1 would be 80000001h.

Not all CPUs support this extended level, so you need to test if it's supported.
It's easy though.  Just let EAX=80000000h, call CPUID and test the returned EAX
value.  If EAX >= 80000000h (compared as an unsigned number) then the extended
level is supported.

This works because, just like standard level 0, EAX = the maximum supported 
level.  EBX, ECX & EDX return (just like standard level 0) the Vendor ID of the
processor.  There are a lot of fields which are duplicated in the extended
levels from the standard levels.  I think it is because the space for things
such as VendorID, cpu model, family, type etc had been filled up in the standard
level so newer device details are stored in the extended level.  But I'm not
sure about this.

	EAX=80000000h

	Output:	[EAX] - Maximum extended level supported
		[EBX-EDX-ECX] - VendorID
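
The support test can be wrapped in a tiny helper.  A sketch in C, where
ext_level_available and CPUID_EXT_BASE are names invented for this example,
and where I'm assuming a CPU without extended levels hands back some value
below 80000000h in EAX:

```c
#include <stdint.h>

/* First extended level number. */
#define CPUID_EXT_BASE 0x80000000u

/* max_ext_eax is the EAX value returned by CPUID with EAX=80000000h.
   Returns 1 if the extended level 'level' may be queried, else 0.
   All comparisons are unsigned. */
static int ext_level_available(uint32_t max_ext_eax, uint32_t level)
{
    return level >= CPUID_EXT_BASE
        && max_ext_eax >= CPUID_EXT_BASE
        && level <= max_ext_eax;
}
```

So if the chip answers 80000005h, levels 80000000h to 80000005h are fine to
use and 80000006h is not.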

---==[Returned Data - Extended Level 1]==-------------------------------------

Here again, it's VERY similar to the standard level 1, although there are
still differences.  For instance, to detect 3DNow!, you can only use extended
level 1.  

	EAX=80000001h

	Output:	[EAX] - Processor family/model/stepping
		[EDX] - CPU "Feature flags"

	EAX	FEDCBA9876543210
		0000FFFFMMMMSSSS

		F - Processor family(4bits)
		M - Processor model(4bits)
		S - Processor stepping(4bits)

Processor family:
	5 = AMD K5, Centaur/IDT C2
	6 = AMD K6 

Processor model:
	AMD K5  	1 = 5k86 (PR120 or PR133) 
			2 = 5k86 (PR166) 
			3 = 5k86 (PR200) 

	AMD K6 		6 = K6 (0.30 um) 
	        	7 = K6 (0.25 um) 
			8 = K6-2 
			9 = K6-III 

	Centaur/IDT 	8 = C2 

Most of the details in the extended feature flags are the same; here are the
main differences:

EDX bit 15 - Integer conditional move instruction(CMOV)
    bit 16 - FPU conditional move instruction(FCMOV)
    bit 24 - Cyrix Extended MMX
    bit 31 - 3DNow!
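
A cautious 3DNow! check looks at the maximum extended level before it
trusts bit 31.  Here is a minimal C sketch; has_3dnow and the enum names
are made up, and both register values are assumed to come from your own
CPUID calls (EAX from level 80000000h, EDX from level 80000001h).

```c
#include <stdint.h>

/* Bit numbers from the extended feature flags listed above. */
enum {
    EXT_FEATURE_CMOV       = 15,
    EXT_FEATURE_FCMOV      = 16,
    EXT_FEATURE_CYRIX_EMMX = 24,
    EXT_FEATURE_3DNOW      = 31
};

/* max_ext_eax: EAX returned for level 80000000h.
   ext1_edx:    EDX returned for level 80000001h.
   Returns 1 only if extended level 1 exists and its bit 31 is set. */
static int has_3dnow(uint32_t max_ext_eax, uint32_t ext1_edx)
{
    if (max_ext_eax < 0x80000001u)
        return 0;                 /* no extended level 1, don't trust EDX */
    return (int)((ext1_edx >> EXT_FEATURE_3DNOW) & 1u);
}
```

The guard matters because a chip without extended levels may return
anything in EDX for level 80000001h.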


---==[Returned Data - Extended Level 2&3&4]==---------------------------------

Cool, here is something very different from the standard level.  These 3
levels are used in conjunction with each other to return an ASCII string name
of the processor.  This sort of kills all those standard levels before, which
were a pain in the butt to use to find the processor type, family, model etc.
But not all processors support these levels.

	EAX=80000002h

	Output:	[EAX] - Processors name string 01
		[EBX] - Processors name string 02
		[ECX] - Processors name string 03
		[EDX] - Processors name string 04

	EAX=80000003h

	Output:	[EAX] - Processors name string 05
		[EBX] - Processors name string 06
		[ECX] - Processors name string 07
		[EDX] - Processors name string 08

	EAX=80000004h

	Output:	[EAX] - Processors name string 09
		[EBX] - Processors name string 10
		[ECX] - Processors name string 11
		[EDX] - Processors name string 12

Just concatenate all the values to form a 48 character string.  If the string
doesn't use all 48 characters, it just fills the rest of the space with NULL
(zero) values.  Some examples of strings are:
"AMD-K6tm w/ multimedia extensions", "AMD K6-2 AMD-K6(tm) 3D processor",
"AMD-K6(tm)-III Processor", "IDT WinChip 2-3D".
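
Joining the 12 registers can be sketched in C like this.  It assumes you
have stored the registers in the order listed above (EAX, EBX, ECX, EDX of
level 80000002h, then the same for 80000003h and 80000004h); the function
name cpuid_name_string is invented for this example.

```c
#include <stdint.h>

/* regs: the 12 name string registers, in the order 01..12 shown above.
   out:  buffer of at least 49 bytes (48 characters + NULL).
   Returns out. */
static char *cpuid_name_string(const uint32_t regs[12], char out[49])
{
    for (int r = 0; r < 12; r++)
        for (int b = 0; b < 4; b++)   /* lowest byte of each register first */
            out[4 * r + b] = (char)((regs[r] >> (8 * b)) & 0xFFu);
    out[48] = '\0';                   /* in case all 48 characters are used */
    return out;
}
```

For instance, the four hand-packed values 20544449h, 436E6957h, 20706968h
and 44332D32h followed by eight zero registers come out as
"IDT WinChip 2-3D".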

---==[Returned Data - Extended Level 5]==-------------------------------------

This level is cool, it returns data about your level 1 cache.  
Unfortunately Cyrix doesn't adhere to this return value format, so you have
to write a special function for the Cyrix ext level 5.  Or just leave
this level out if a Cyrix is detected :)
Here is how everybody else (except Cyrix) handles things:

	EAX=80000005h

	Output:	[EBX] - TLB data
		[ECX] - L1 DATA Cache data
		[EDX] - L1 CODE Cache data

	EBX bits 31..24 DATA TLB associativity (FFh=full)
		 23..16 DATA TLB entries
		 15..08 CODE TLB associativity (FFh=full)
		 07..00 CODE TLB entries 

	ECX bits 31..24 DATA L1 cache size in KBs 
		 23..16 DATA L1 cache associativity (FFh=full) 
		 15..08 DATA L1 cache lines per tag 
		 07..00 DATA L1 cache line size in bytes 

	EDX bits 31..24 CODE L1 cache size in KBs 
		 23..16 CODE L1 cache associativity (FFh=full) 
		 15..08 CODE L1 cache lines per tag 
		 07..00 CODE L1 cache line size in bytes 
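
ECX and EDX share the same layout, so one decoder handles both.  A sketch
in C, with the struct l1_cache_info and the function decode_l1_cache being
names made up for this example (and remember this layout does not apply to
Cyrix):

```c
#include <stdint.h>

/* One L1 cache description, as packed into ECX (data) or EDX (code). */
struct l1_cache_info {
    unsigned size_kb;        /* bits 31..24 */
    unsigned associativity;  /* bits 23..16, FFh = fully associative */
    unsigned lines_per_tag;  /* bits 15..8  */
    unsigned line_size;      /* bits 7..0, in bytes */
};

/* Decode ECX or EDX as returned by extended level 80000005h. */
static struct l1_cache_info decode_l1_cache(uint32_t reg)
{
    struct l1_cache_info c;
    c.size_kb       = (reg >> 24) & 0xFFu;
    c.associativity = (reg >> 16) & 0xFFu;
    c.lines_per_tag = (reg >> 8) & 0xFFu;
    c.line_size     = reg & 0xFFu;
    return c;
}
```

A register value of 20020120h would mean a 32KB, 2-way cache with 1 line
per tag and 32 byte lines.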

Quite handy don't you think? :)

---==[Returned Data - Extended Level 6]==-------------------------------------

This level is very cool, it returns data about your level 2 cache!

	EAX=80000006h

	Output:	[ECX] - L2 Cache data

	ECX bits 31..16 L2 cache size in KBs 
		 15..12 L2 cache associativity (Fh=full) 
		 11..08 L2 cache lines per tag 
		 07..00 L2 cache line size in bytes 
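
The L2 fields have different widths from the L1 ones, so they need their
own masks.  Again just a C sketch with invented names (struct l2_cache_info,
decode_l2_cache):

```c
#include <stdint.h>

/* L2 cache description, as packed into ECX by extended level 80000006h. */
struct l2_cache_info {
    unsigned size_kb;        /* bits 31..16 */
    unsigned associativity;  /* bits 15..12, Fh = fully associative */
    unsigned lines_per_tag;  /* bits 11..8  */
    unsigned line_size;      /* bits 7..0, in bytes */
};

/* Decode ECX as returned by extended level 80000006h. */
static struct l2_cache_info decode_l2_cache(uint32_t ecx)
{
    struct l2_cache_info c;
    c.size_kb       = (ecx >> 16) & 0xFFFFu;
    c.associativity = (ecx >> 12) & 0xFu;
    c.lines_per_tag = (ecx >> 8) & 0xFu;
    c.line_size     = ecx & 0xFFu;
    return c;
}
```

An ECX of 01004120h would mean a 256KB, 4-way L2 cache with 1 line per tag
and 32 byte lines.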

---==[Quick Example]==--------------------------------------------------------

Some quick very simple code, just to get you going:

struct {
	long max_std_lvl;      //maximum standard level supported
	char vendorID[13];     //13th for NULL char
	char mmx;              //MMX supported?
	char _3dnow;           //3DNow! supported?
	long max_ext_lvl;      //maximum extended level supported
	};

	mov edi,[ptr_to_cputable]

;Offsets below assume 4-byte longs and default C alignment:
;max_std_lvl=0, vendorID=4, mmx=17, _3dnow=18, max_ext_lvl=20

;Get VendorID

	mov eax,0
	CPUID                     ;CPUID!
	mov [edi+0],eax           ;max standard level supported
	mov [edi+4],ebx
	mov [edi+8],edx
	mov [edi+12],ecx
	mov [edi+16],byte 0       ;NULL char ends the vendor ID string

;Check for MMX

	mov eax,1
	CPUID
	mov ebx,edx
	and ebx,0x800000          ;isolate bit 23
	shr ebx,23
	mov [edi+17],bl           ;MMX support? (1/0)

;Check maximum extended level

	mov eax,0x80000000
	CPUID
	mov [edi+20],eax

;Check for 3DNow! (only if extended level 1 exists)

	mov [edi+18],byte 0
	cmp eax,0x80000001
	jb @no_3dnow              ;unsigned compare: no extended level 1
	mov eax,0x80000001        ;extended level 1
	CPUID
	test edx,0x80000000       ;bit 31
	jz @no_3dnow
	mov [edi+18],byte 1       ;3DNow! technology supported
	@no_3dnow:

---==[Closing Words]==--------------------------------------------------------

The CPUID instruction is growing all the time, so the amount of information
you can extract from it is increasing!  You can get seriously low-down-'n-dirty
details about the CPU.  For more information check out: www.sandpile.org

Does anybody know where to find details on the PIII SSE/KNI/MMX2 instructions?


-Rawhed/Sensory Overload
-Mailto:andrew@overload.co.za
-Http://www.overload.co.za
-Andrew Griffiths
-South Africa
-10-07-1999
