NV2A/Vertex Shader
The Xbox implements https://www.opengl.org/registry/specs/NV/vertex_program.txt and https://www.opengl.org/registry/specs/NV/vertex_program1_1.txt This article will mainly focus on actual encoding on hardware as the behaviour is mostly outlined in those GL extensions already.
Contents
Operating modes
- Fixed / Programmable
- Writeable / Read-Only constants
- Low-constants only / all constants
- Vertex processing / State program
Registers
Input registers
There are 16 input registers v[0] to v[15].
They normally [map to the vertex attributes|NV2A/Vertex attributes]. However, in the case of vertex state programs, v[0] is fed from LAUNCH_DATA (PGRAPH Methods 0x1E80, 0x1E84, 0x1E88, 0x1E8C for XYZW respectively) instead.
Output registers
output registers: o[0] to o[?] (initialized to XYZ=0x00000000 W=0x3F800000) The alias names from the GL extension match.
Address register
A0.x exists as documented in the GL extension.
Temporary registers
There are 12 temporary registers: R0 to R11 (initialized to XYZW=0x00000000), as documented in the GL extension. Additionally, o[POS] is mirrored as R12 and can be used as source operand; so effectively you have 13 temporaries
Constant space
There are 192 constant registers in two seperate blocks with 96 constants each. They can be accessed through the PGRAPH RDI: select=0x17. Each constant slot is 4x DWORD, ordered as WZYX. Alternatively they can be uploaded through method [FIXME], with 4x DWORD, ordered XYZW.
In nvidia vertex programs only 96 constants are normally accessible. Microsoft exposed the 96 additional constant registers in D3D shaders through c[-96] to c[-1]. This documentation uses the GL terminology instead and expose the new registers as c[96] to c[191]. This means c[0] to c[191] valid.
Instructions
In total, there are 136 instruction slots.
Each slot consists of 16 bytes, we consider those as 4 seperate little-endian DWORDS describing the operation. Word 0 is inused.
Meaning | Word | Offset (bits) | Size (bits) |
ILU Operation | 1 | 25 | 3 |
MAC Operation | 1 | 21 | 4 |
Constant index | 1 | 13 | 8 |
Input index | 1 | 9 | 4 |
Source 1 negate | 1 | 8 | 1 |
Source 1 swizzle X | 1 | 6 | 2 |
Source 1 swizzle Y | 1 | 4 | 2 |
Source 1 swizzle Z | 1 | 2 | 2 |
Source 1 swizzle W | 1 | 0 | 2 |
Source 1 register | 2 | 28 | 4 |
Source 1 mux | 2 | 26 | 2 |
Source 2 negate | 2 | 25 | 1 |
Source 2 swizzle X | 2 | 23 | 2 |
Source 2 swizzle Y | 2 | 21 | 2 |
Source 2 swizzle Z | 2 | 19 | 2 |
Source 2 swizzle W | 2 | 17 | 2 |
Source 2 register | 2 | 13 | 4 |
Source 2 mux | 2 | 11 | 2 |
Source 3 negate | 2 | 10 | 1 |
Source 3 swizzle X | 2 | 8 | 2 |
Source 3 swizzle Y | 2 | 6 | 2 |
Source 3 swizzle Z | 2 | 4 | 2 |
Source 3 swizzle W | 2 | 2 | 2 |
Source 3 register (Hi) | 2 | 0 | 2 |
Source 3 register (Lo) | 3 | 30 | 2 |
Source 3 mux | 3 | 28 | 2 |
Destination MAC mask | 3 | 24 | 4 |
Destination temporary register | 3 | 20 | 4 |
Destination ILU mask | 3 | 16 | 4 |
Destination overall mask | 3 | 12 | 4 |
Destination select | 3 | 11 | 1 |
Destination output register | 3 | 3 | 8 |
Destination mux | 3 | 2 | 1 |
Relative constant addressing | 3 | 1 | 1 |
Final instruction marker (EOF) | 3 | 0 | 1 |
Value | Meaning |
0 | X |
1 | Y |
2 | Z |
3 | W |