Difference between revisions of "Boot Process"
(→MTRR Setup) |
|||
(47 intermediate revisions by 8 users not shown) | |||
Line 1: | Line 1: | ||
+ | This article describes the boot sequence of the Xbox. | ||
+ | A large portion of it is patented in [https://docs.google.com/viewer?url=patentimages.storage.googleapis.com/pdfs/US6907522.pdf Patent "US 6,907,522 B2"]. | ||
+ | |||
== Overview == | == Overview == | ||
− | == MCPX == | + | The Xbox has a 256 kiB ROM containing the startup animation and sound, as well as the Xbox kernel, which contains a stripped down version of the Windows 2000 (NT 5.0) microkernel, the HAL, filesystems, as well as HDD and DVD drivers. |
+ | |||
+ | When the Xbox is turned on, the software in ROM is decompressed into RAM, and the kernel initializes the hardware. Because there are no audio or video drivers in the kernel, the startup code plays the animation and sound by accessing the registers of the hardware directly. As soon as the Xbox logo is on the display, the kernel unlocks the hard disk and checks whether there is a valid game medium in the DVD drive. If not, the file xboxdash.xbe gets loaded from partition #3 (Typically the [[Dashboard]]). In either case, the Microsoft logo is shown below the Xbox logo and the executable is started. If an error occurs (no/wrong hard disk, wrong signature, ...), the boot loader shows a [[Fatal Error]] screen and halts. | ||
+ | |||
+ | === Chain of trust === | ||
+ | |||
+ | The Xbox uses a chain of trust during the boot process: | ||
+ | |||
+ | '''MCPX X2''' | ||
+ | |||
+ | There's no chain of trust. | ||
+ | |||
+ | It behaves similarly to the MCPX X3 1.0 boot, except that the MCPX ROM is not available, so the code is loaded from untrusted flash. | ||
+ | |||
+ | There are also different X-Code opcodes and keys. | ||
+ | |||
+ | '''MCPX X3 1.0''' | ||
+ | |||
+ | * MCPX ROM: | ||
+ | ** Runs untrusted X-Codes from flash in a limited virtual machine. | ||
+ | ** Contains key + decrypts the 2BL using RC4. | ||
+ | ** In success case: Go to 2BL. | ||
+ | ** In error case: Hides ROM and intends to triple-fault. | ||
+ | * 2BL: | ||
+ | ** The MCPX ROM is hidden. | ||
+ | ** The 2BL decryption key is (overwritten with 0x00-Bytes){{FIXME|reason=Does this actually happen?}}. | ||
+ | ** Contains keys for kernel decryption and execution; decrypts kernel using RC4 and extracts using LZX.{{FIXME|reason=I believe kernel is also hashed}} | ||
+ | * Kernel: | ||
+ | ** The kernel decryption key is overwritten with 0x00-Bytes. | ||
+ | ** The 2BL is overwritten with 0xCC-Bytes. | ||
+ | ** Once the kernel is initialized, the INIT section is discarded. | ||
+ | ** The kernel only runs signed [[XBE]] files from allowed media. | ||
+ | |||
+ | '''MCPX X3 1.1''' | ||
+ | |||
+ | * MCPX ROM: | ||
+ | ** Runs untrusted X-Codes from flash in a limited virtual machine. | ||
+ | ** Hashes the unencrypted FBL using a TEA-based checksum. | ||
+ | ** In success case: Go to FBL. | ||
+ | ** In error case: Hides ROM and intends to triple-fault. | ||
+ | * FBL: | ||
+ | ** Verify 2BL image. | ||
+ | ** Derive key from key stored in MCPX + Flash, and decrypt 2BL. | ||
+ | ** Go to 2BL. | ||
+ | |||
+ | The rest of the boot behaves like MCPX X3 1.0. | ||
+ | |||
+ | '''Assumptions for chain-of-trust''' | ||
+ | |||
+ | * The CPU will start execution in trusted MCPX ROM. | ||
+ | * The MCPX ROM can not be read or modified. | ||
+ | * The decrypted 2BL or kernel can not be read entirely. | ||
+ | * All parts of the software following the MCPX are not-attackable and signed. | ||
+ | |||
+ | See [[Exploits]] for possible options to break the chain of trust. | ||
+ | |||
+ | == MCPX ROM == | ||
Certain things are still missing, for example, getting the CPU to 32 bit protected mode and enabling caching.{{FIXME}} | Certain things are still missing, for example, getting the CPU to 32 bit protected mode and enabling caching.{{FIXME}} | ||
Line 11: | Line 70: | ||
<pre> | <pre> | ||
void xcode_interpreter() { | void xcode_interpreter() { | ||
− | + | ||
− | uint32_t | + | // values are implied as x86 is just starting up |
− | uint32_t result | + | register uint32_t pc = 0; // stored in ESI register |
− | while ( | + | register uint8_t opcode = 0; // stored in AL register |
− | opcode = get_memory_byte( | + | register uint32_t operand_1 = 0; // stored in EBC register |
− | operand_1 = get_memory_dword( | + | register uint32_t operand_2 = 0; // stored in ECX register |
− | operand_2 = get_memory_dword( | + | register uint32_t result = 0; // stored in EDI register |
+ | register uint32_t scratch = 0; // stored in EBP register | ||
+ | |||
+ | // explicitly set startup point | ||
+ | pc = 0xFF000080; | ||
+ | |||
+ | while (1) { | ||
+ | opcode = get_memory_byte(pc); | ||
+ | operand_1 = get_memory_dword(pc+1); | ||
+ | operand_2 = get_memory_dword(pc+5); | ||
if (opcode == 0x07) { | if (opcode == 0x07) { | ||
Line 25: | Line 93: | ||
} | } | ||
− | + | if (opcode == 0x02) { | |
− | + | result = get_memory_dword(operand_1 & 0x0fffffff); | |
− | + | } else if (opcode == 0x03) { | |
− | + | set_memory_dword(operand_1, operand_2); | 
− | + | } else if (opcode == 0x06) { | |
− | + | result = (result & operand_1) | operand_2; | |
− | + | } else if (opcode == 0x04) { | |
− | + | if (operand_1 == 0x80000880) { | |
− | + | operand_2 &= 0xfffffffd; | |
− | + | } | |
− | + | outl(operand_1, 0xcf8); | |
− | + | outl(operand_2, 0xcfc); | |
− | + | } else if (opcode == 0x05) { | |
− | + | outl(operand_1, 0xcf8); | |
− | + | result = inl(0xcfc); | |
− | + | } else if (opcode == 0x08) { | |
− | + | if (result != operand_1) { | |
− | + | pc += operand_2; | |
− | + | } | |
− | + | } else if (opcode == 0x09) { | |
− | + | pc += operand_2; | |
− | + | } else if (opcode == 0x10) { | |
− | + | scratch = (scratch & operand_1) | operand_2; | |
− | + | result = scratch; | |
− | + | } else if (opcode == 0x11) { | |
− | + | outb(operand_2, operand_1); | |
− | + | } else if (opcode == 0x12) { | |
− | + | result = inb(operand_1); | |
− | + | } else if (opcode == 0xee) { | |
− | + | break; | |
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
} | } | ||
− | + | pc += 9; | |
} | } | ||
} | } | ||
</pre> | </pre> | ||
− | === RC4 Decryption of the 2BL === | + | === RC4 Decryption of the 2BL (MCPX 1.0 only) === |
− | + | Version 1.0 of the ROM uses RC4 to decrypt the 2BL. | |
+ | The RC4 algorithm was included as part of MCPX 1.0 and seems to work fine with BIOS versions 3944, 4034, and 4134. | ||
− | ==== Stage 1 ==== | + | ==== Stage 1: Key Scheduling ==== |
− | + | The [https://en.wikipedia.org/wiki/RC4#Key-scheduling_algorithm_.28KSA.29 RC4 Key-Scheduling Algorithm] is used to initialize the RC4 “S” array: it first writes the identity permutation (0, 1, ..., 255) to 0x8F000 through 0x8F0FF, then mixes in the 16-byte key with a swap loop similar to the PRGA. | 
<pre> | <pre> | ||
− | + | uint8_t *s = (uint8_t *)0x8f000; | |
− | + | uint32_t i; | |
− | + | for (i = 0; i <= 255; i++) { | |
− | + | s[i] = i; | |
− | + | } | |
+ | |||
+ | uint8_t *key = (uint8_t *)0xffffffa5; /* ROM offset 0x1a5. */ | ||
+ | uint8_t j, t; | ||
+ | |||
+ | /* It is unclear why values s[0x100..0x101] are being set to 0. They are | ||
+ | * not modified by the code, but later these will be used as the initial | ||
+ | * i, j values in the PRGA. | ||
+ | */ | ||
+ | s[0x100] = 0x00; | ||
+ | s[0x101] = 0x00; | ||
+ | |||
+ | for (i = 0, j = 0; i <= 255; i++) { | ||
+ | j = j + s[i] + key[i%16]; | ||
+ | |||
+ | /* Swap s[i] and s[j] */ | ||
+ | t = s[i]; | ||
+ | s[i] = s[j]; | ||
+ | s[j] = t; | ||
} | } | ||
</pre> | </pre> | ||
− | ==== Stage 2 ==== | + | ==== Stage 2: PRGA ==== |
− | + | The [https://en.wikipedia.org/wiki/RC4#Pseudo-random_generation_algorithm_.28PRGA.29 RC4 Pseudo-random generation algorithm (PRGA)] is then used to decrypt the 2BL from 0xFFFF9E00, storing the decrypted 2BL at 0x00090000. It is 24KiB in size. | |
<pre> | <pre> | ||
− | + | uint8_t *encrypted = (uint8_t*)0xFFFF9E00; /* 2bl */ | |
− | + | uint8_t *decrypted = (uint8_t*)0x90000; /* Decrypted 2bl Destination */ | |
− | + | uint32_t pos; | |
− | |||
− | + | /* As noted above, s[0x100..0x101] were set to 0 earlier, but have not been | |
− | + | * modified since. The RC4 algorithm defines i and j both to be set to 0 | |
− | + | * before PRGA begins. */ | |
− | + | i = s[0x100]; | |
− | + | j = s[0x101]; | |
− | + | ||
+ | for (pos = 0; pos < 0x6000; pos++) { | ||
+ | /* Update i, j. */ | ||
+ | i = (i + 1) & 0xff; | ||
+ | j += s[i]; | ||
+ | |||
+ | /* Swap s[i] and s[j]. */ | ||
+ | t = s[i]; | ||
+ | s[i] = s[j]; | ||
+ | s[j] = t; | ||
+ | |||
+ | /* Decrypt message and write output. */ | ||
+ | decrypted[pos] = encrypted[pos] ^ s[(uint8_t)(s[i] + s[j])]; /* index wraps modulo 256 */ | ||
} | } | ||
</pre> | </pre> | ||
− | ==== Stage 3 ==== | + | ==== Stage 3: Signature Verification ==== |
− | + | Now that the second-stage bootloader (2BL) has been decrypted, a quick sanity check is performed: a “magic” signature is verified. If the signature doesn’t match, control goes to the error handler. If it does match, the code jumps to the 2BL entry point, which is given by the first dword of the decrypted 2BL. | 
<pre> | <pre> | ||
− | + | mov eax, [0x95fe4] | |
− | + | cmp eax, MAGIC_NUMBER | |
− | + | jne 0xffffff94 ; If signature check failed, jump to error handler | |
− | + | mov eax, [0x90000] | |
+ | jmp eax ; Jump to 2BL entry point | ||
+ | </pre> | ||
− | + | === TEA Verification of the FBL (MCPX 1.1 only) === | |
− | + | Version 1.1 of the ROM uses TEA (Tiny Encryption Algorithm) to verify the contents of the FBL, and delegates the task of decrypting 2BL to it. | |
− | + | This is exclusive to version 1.1 of the MCPX ROM and kernels 4817, 5101, 5530, 5713 and 5838. | |
− | + | The FBL itself is not encrypted; TEA is used here only as a checksum to verify it. | 
− | |||
− | + | A temporary buffer for storing the output hash is located at 0x8F000 in RAM, and the FBL is located at 0xFFFFD400 in flash. It is 0x2880 bytes in size, or ~11KiB. | |
− | + | ||
− | + | {{FIXME|reason=Document the algorithm.}} | |
− | + | ||
− | + | == FBL (MCPX 1.1 only) == | |
− | + | ||
− | + | The Flash Boot Loader (FBL) was added in MCPX 1.1. It is stored unencrypted in flash, verified with a TEA checksum by the MCPX ROM, and is designed to be an intermediary loader between the MCPX and the 2BL. | |
− | + | ||
− | + | The FBL's job is to verify flash integrity, then load and decrypt 2BL. It also contains cryptography functions that are later used by newer versions of the 2BL (for example, SHA-1). | |
− | + | ||
− | + | === Setting up stack === | |
− | } | + | |
− | + | The ESP is set to 0x8F000 in RAM, and later used to store variables used in the verification process. | |
+ | |||
+ | === Verifying Integrity === | ||
+ | |||
+ | {{FIXME|reason=What is it doing here? Something RSA-like?}} | ||
− | === | + | === Calculating 2BL Key === |
− | + | The FBL decrypts the 2BL by deriving an RC4 key from a secret key in the MCPX ROM itself and a key stored in the flash. Pseudocode for this operation follows: | 
<pre> | <pre> | ||
− | + | uint8_t *mcpx_key = (uint8_t *)0xFFFFFF9C; | 
− | + | uint8_t *flash_key = (uint8_t *)0xFFFFFDF0; // 5838 | 
− | + | ||
− | + | uint8_t key_buffer[0x14]; | |
− | + | SHA1_Context sha; | |
− | + | RC4_Context rc4; | |
− | + | ||
+ | SHA1_Init(&sha); | ||
+ | SHA1_Update(&sha, mcpx_key, 0x10); // Secret key in the MCPX | ||
+ | SHA1_Update(&sha, flash_key, 0x10); // Key stored in flash | ||
+ | for (int i = 0; i < 0x10; i++) { | ||
+ | key_buffer[i] = mcpx_key[i] ^ 0x5C; | ||
} | } | ||
+ | SHA1_Update(&sha, key_buffer, 0x10); | ||
+ | SHA1_Final(&sha, key_buffer); | ||
+ | |||
+ | RC4_Init(&rc4, 0x14, key_buffer); | ||
</pre> | </pre> | ||
− | === | + | === Loading 2BL === |
− | + | After calculating the key, the 2BL is loaded from flash at 0xFFFF9E00 into memory at 0x3FA000 and decrypted with RC4 using that key. | 
+ | |||
+ | Execution then jumps to the 2BL entry point, located at offset 0x35F8 within the 2BL. | ||
== 2BL == | == 2BL == | ||
− | Certain parts are still missing | + | {{FIXME|reason=Certain parts are still missing.}} |
=== MTRR Setup === | === MTRR Setup === | ||
Line 196: | Line 297: | ||
Once the MTRR have been written, the cache is enabled.{{FIXME}} | Once the MTRR have been written, the cache is enabled.{{FIXME}} | ||
− | === | + | === Register setup === |
+ | |||
+ | Now the 2BL will set up the segment registers{{FIXME|reason=why?!}} and stack: | ||
+ | |||
+ | {| class="wikitable" | ||
+ | ! Register !! Value !! Notes | ||
+ | |- | ||
+ | |ds || 0x0010 || rowspan="3" | Data segment{{citation needed}} | ||
+ | |||
+ | |- | ||
+ | |es || 0x0010 | ||
+ | |||
+ | |- | ||
+ | |ss || 0x0010 | ||
+ | |||
+ | |- | ||
+ | |esp || 0x00400000 || | ||
+ | |||
+ | |- | ||
+ | |fs || 0x0000 || | ||
+ | |||
+ | |- | ||
+ | |gs || 0x0000 || | ||
+ | |} | ||
+ | |||
+ | === Self-copy === | ||
+ | |||
+ | Now the 2BL copies itself (24 kiB) from 0x00090000 to memory address 0x00400000. | ||
=== Paging === | === Paging === | ||
− | === Kernel decryption === | + | Now a PDE is prepared at address 0x0000F000: |
+ | |||
+ | {| class="wikitable" | ||
+ | ! Offset in PDE !! Value !! Notes | ||
+ | |- | ||
+ | |0x800 || 0x000000E3 || rowspan="7" | Identity maps the first 256MiB of RAM: 0x80000000 and 0x00000000 will each map to physical page 0 <br><br> 0xE3: Flags: <br> * 0x80: 4 MiB page <br> * 0x40: Marked as previously written (Dirty) <br> * 0x20: Marked as previously accessed <br> * 0x02: Read/Write <br> * 0x01: Present | ||
+ | |- | ||
+ | |0x000 || 0x000000E3 | ||
+ | |- | ||
+ | |0x804 || 0x004000E3 | ||
+ | |- | ||
+ | |0x004 || 0x004000E3 | ||
+ | |- | ||
+ | | colspan="2" | ... | ||
+ | |- | ||
+ | |0x8FC || 0x0FC000E3 | ||
+ | |- | ||
+ | |0x0FC || 0x0FC000E3 | ||
+ | |- | ||
+ | |0x900 || 0x00000000 || rowspan="5" | Unmapping the rest of the pages | ||
+ | |- | ||
+ | |0x100 || 0x00000000 | ||
+ | |- | ||
+ | | colspan="2" | ... | ||
+ | |- | ||
+ | |0xFFC || 0x00000000 | ||
+ | |- | ||
+ | |0x7FC || 0x00000000 | ||
+ | |- | ||
+ | |0xC00 || 0x0000F063 || Maps the PDE (4 kiB page) to address 0xC0000000 <br><br> 0x63: Flags: <br> * 0x40: Marked as previously written (Dirty) <br> * 0x20: Marked as previously accessed <br> * 0x02: Read/Write <br> * 0x01: Present | ||
+ | |- | ||
+ | |0xFFC || 0xFFC000E3 || Identity maps the upper portion of the Flash (4 MiB page) to address 0xFFC00000 <br><br> 0xE3: Flags: <br> * 0x80: 4 MiB page <br> * 0x40: Marked as previously written (Dirty) <br> * 0x20: Marked as previously accessed <br> * 0x02: Read/Write <br> * 0x01: Present | ||
+ | |- | ||
+ | |0xFD0 || 0xFD0000FB || rowspan="4" | Maps 16 MiB for the GPU control registers <br><br> 0xFB: Flags: <br> * 0x80: 4 MiB page <br> * 0x40: Marked as previously written (Dirty) <br> * 0x20: Marked as previously accessed <br> * 0x10: Cache disabled <br> * 0x08: Write-Through caching <br> * 0x02: Read/Write <br> * 0x01: Present | ||
+ | |- | ||
+ | |0xFD4 || 0xFD4000FB | ||
+ | |- | ||
+ | |0xFD8 || 0xFD8000FB | ||
+ | |- | ||
+ | |0xFDC || 0xFDC000FB | ||
+ | |} | ||
+ | |||
+ | After setting up the PDE, the PAT is set up using <code>wrmsr</code>: {{FIXME}} | ||
+ | |||
+ | CR4 is touched {{FIXME}} | ||
+ | |||
+ | CR3 is touched {{FIXME}} | ||
+ | |||
+ | Now paging is activated by enabling the PG and WP bits in CR0. | ||
+ | Additionally, the same <code>or</code> instruction is used to enable the NE bit in cr0. | ||
+ | |||
+ | === 2BL main === | ||
+ | |||
+ | esp is now also reloaded to point at the relocated address. It will be set to 0x80400000 (absolute value, independent of previous esp value). | ||
+ | The 2BL will now <code>call</code> into the relocated 2BL code somewhere near 0x00400000. | ||
+ | |||
+ | ==== Disabling of the MCPX ROM ==== | ||
+ | |||
+ | <pre> | ||
+ | out32(0xCF8, 0x80000880); | ||
+ | out8(0xCFC, 0x02); | ||
+ | </pre> | ||
+ | |||
+ | ==== SMC handling ==== | ||
+ | |||
+ | The [[SMC]] has a watchdog functionality which must be turned off. | ||
+ | This is done by querying the SMC registers 0x1C - 0x1F. | ||
+ | If all of them are 0x00, the 2BL will shut down the system{{FIXME}}. | ||
+ | If this is not the case, the bootloader calculates the watchdog challenge response and sends it to SMC registers 0x20 and 0x21. | ||
+ | |||
+ | Additionally, the 2BL will set SMC register 0x01 to 0 (which resets the cursor position for reading the SMC revision information). | ||
+ | |||
+ | ==== Enable IDE and NIC ==== | ||
+ | |||
+ | <pre> | ||
+ | out32(0xCF8, 0x8000088C); | ||
+ | out32(0xCFC, 0x40000000); | ||
+ | </pre> | ||
+ | |||
+ | ==== Memory cleanup ==== | ||
+ | |||
+ | The 2BL fills memory with 0xCC from 0x80090000 to 0x80095FFF. These are the 24 kiB where the 2BL was stored previously. | ||
+ | |||
+ | ==== Setup RAM timing ==== | ||
+ | |||
+ | Not described yet, this is complicated{{FIXME}}. | ||
+ | This got a lot more complicated when Microsoft started using different RAM sometime after [[Hardware Revisions#1.6|Hardware Revision 1.6]] was already out. | ||
+ | |||
+ | ==== Configure LDT bus ==== | ||
+ | |||
+ | DWORD flow control is enabled in the MCPX. | ||
+ | |||
+ | <pre> | ||
+ | out32(0xCF8, 0x80000854); | ||
+ | out32(0xCFC, in32(0xCFC) | 0x88000000); | ||
+ | </pre> | ||
+ | |||
+ | DWORD flow control is also enabled in the NV2A core. | ||
+ | |||
+ | <pre> | ||
+ | out32(0xCF8, 0x80000064); | ||
+ | out32(0xCFC, in32(0xCFC) | 0x88000000); | ||
+ | </pre> | ||
+ | |||
+ | The LDT bus is reset. | ||
+ | |||
+ | <pre> | ||
+ | out32(0xCF8, 0x8000006C); | ||
+ | uint32_t tmp = in32(0xCFC); | ||
+ | out32(0xCFC, tmp & 0xFFFFFFFE); | ||
+ | out32(0xCFC, tmp); | ||
+ | </pre> | ||
+ | |||
+ | The rest is unknown{{FIXME}}. | ||
+ | |||
+ | <pre> | ||
+ | out32(0xCF8, 0x80000080); | ||
+ | out32(0xCFC, 0x00000100); | ||
+ | </pre> | ||
+ | |||
+ | ==== Enable USB ASRC ==== | ||
+ | |||
+ | The USB controller's "automatic slew rate compensation" feature is enabled for MCPX revisions D01 and later. | ||
+ | |||
+ | <pre> | ||
+ | out32(0xCF8, 0x80000808); | ||
+ | uint8_t mcpx_revision = in8(0xCFC); | ||
+ | |||
+ | if (mcpx_revision >= 0xD1) { | ||
+ | out32(0xCF8, 0x800008C8); | ||
+ | out32(0xCFC, 0x00008F00); | ||
+ | } | ||
+ | </pre> | ||
+ | |||
+ | ==== Loading the kernel ==== | ||
+ | ===== Kernel-copy ===== | ||
+ | |||
+ | The Kernel is now copied into RAM. | ||
+ | |||
+ | ===== Kernel decryption ===== | ||
+ | |||
+ | The 2BL will copy the kernel decryption key (16 bytes) from offset 32 of an array of 3 keys: | ||
+ | |||
+ | {| class="wikitable" | ||
+ | ! Offset !! Use | ||
+ | |- | ||
+ | | 0 || EEPROM key | ||
+ | |- | ||
+ | | 16 || Certificate key | ||
+ | |- | ||
+ | | 32 || Kernel key | ||
+ | |} | ||
+ | |||
+ | The Kernel is then decrypted in-place using RC4. | ||
+ | |||
+ | ===== Kernel decompression ===== | ||
+ | |||
+ | The Kernel is decompressed directly to 0x80010000 using the LZX compression scheme, where it will reside until a full system shutdown. | ||
+ | |||
+ | ==== Running the kernel ==== | ||
+ | |||
+ | The xboxkrnl.exe header at 0x80010000 is checked to make sure it contains both the "MZ" and "PE" magic values at the correct positions. | ||
+ | If it is invalid, some hardware is touched {{FIXME|reason=which hardware and how?}} and the system is put into an infinite loop. | ||
+ | If it is valid, the kernel entry point is looked up from the PE optional header. The hardcoded image base of 0x80010000 is added to the entry point. | ||
+ | The entry-point is now being called. Arguments are passed on the stack, from right to left. | ||
+ | The first argument is a commandline string loaded from memory address 0x80400004. It is an empty string for retail BIOS{{FIXME|reason=Mention options for debug bioses here}}. | ||
+ | A pointer to the previously mentioned array of 3 keys is passed as the second argument. | ||
== Kernel == | == Kernel == | ||
+ | |||
+ | === Initialization === | ||
+ | |||
+ | ==== Stage 1 (Cold-boot only) ==== | ||
+ | |||
+ | The entry point to the kernel will first parse the arguments.{{FIXME}} | ||
+ | At the end, the kernel will call the initialization routine for what we'll refer to as: Stage 2a. | ||
+ | |||
+ | ==== Stage 2 (Cold-boot only) ==== | ||
+ | |||
+ | The kernel initialization will only happen once on a cold-boot. It will not happen for reboots. | ||
+ | |||
+ | * ebp is set to 0x00000000 | ||
+ | * esp is modified{{FIXME}} | ||
+ | * GDT is prepared{{FIXME}} and loaded{{FIXME}} | ||
+ | * cs and ds are reloaded | ||
+ | * fs is set{{FIXME}} | ||
+ | * TSS is loaded{{FIXME}} | ||
+ | * cr3 is moved to 3 tasks{{FIXME}} | ||
+ | * The CPU microcode is updated | ||
+ | |||
+ | After this comes Stage 3 initialization which will also be repeated on kernel re-initialization. | ||
+ | |||
+ | ==== Stage 3 ==== | ||
+ | |||
+ | This is code which is duplicated in INIT and .text sections. | ||
+ | * In the INIT section it directly follows the Stage 2 initialization. | ||
+ | * In the .text section it follows the Kernel re-initialization code mentioned below. | ||
+ | |||
+ | This code does the following: | ||
+ | |||
+ | * IDT is prepared{{FIXME}} and loaded{{FIXME}} | ||
+ | * {{FIXME|reason=a lot more happens here.. If you want to RE this: look into the HalReturnToFirmware code [last call is to Stage 2b]; the kernel entry point [last call is to Stage 2a]; or just search for lidt instructions}} | ||
+ | |||
+ | === Re-initialization === | ||
+ | |||
+ | On reboots, initialization Stage 1 and 2 are not in memory anymore (as the INIT section has been discarded), and can't be run anymore. | ||
+ | Instead, a separate function replaces their functionality and then jumps directly to Stage 3 initialization. | ||
+ | |||
+ | This code is the partial kernel reinitialization, which is run on reboots via [[Kernel/HalReturnToFirmware]]. | ||
+ | |||
+ | * ebp is set to 0x00000000 | ||
+ | * esp is modified{{FIXME}} | ||
+ | * Some memory stuff in a separate function{{FIXME}} | ||
+ | * The .data section from [[Flash]] is loaded and replaces the running .data | ||
+ | * The byte in front of KeSystemTime is set to 0x01, indicating the system comes from a reboot. | ||
+ | |||
+ | After this has completed, [[#Stage 3]] of the kernel initialization will take over. | ||
+ | |||
+ | ==== Skipped initialization ==== | ||
+ | |||
+ | When rebooting, certain parts are still initialized and assumed to be working: | ||
+ | |||
+ | (This list is currently in no particular order and incomplete) | ||
+ | |||
+ | * Anything already done by Stage 1 and Stage 2 | ||
+ | * PCI device setup | ||
+ | * EEPROM decryption{{FIXME}} | ||
+ | * Check for AV-Pack{{FIXME}} | ||
+ | * Video mode setup (boot animation is not played again) | ||
+ | * Some IDE stuff{{FIXME}} | ||
+ | * Some SMC stuff{{FIXME}} | ||
+ | * Memory allocator initialization{{FIXME}} | ||
+ | * Kernel debugger (Super-I/O) initialization{{FIXME}} (This still seems to be in memory?!){{citation needed}} | ||
=== Startup animation === | === Startup animation === | ||
− | |||
− | |||
== References == | == References == | ||
− | * [http://hackspot.net/XboxBlog/?p=1 Understanding the Xbox boot process] | + | * [https://web.archive.org/web/20141024145145/http://hackspot.net/XboxBlog/?p=1 Understanding the Xbox boot process] |
− | * [https://mborgerson.com/deconstructing-the-xbox-boot-rom Deconstructing the Boot ROM] | + | * [https://web.archive.org/web/20150612173331/https://mborgerson.com/deconstructing-the-xbox-boot-rom Deconstructing the Boot ROM] |
Latest revision as of 19:11, 10 January 2024
This article describes the boot sequence of the Xbox. A large portion of it is patented in Patent "US 6,907,522 B2".
Overview
The Xbox has a 256 kiB ROM containing the startup animation and sound, as well as the Xbox kernel, which contains a stripped down version of the Windows 2000 (NT 5.0) microkernel, the HAL, filesystems, as well as HDD and DVD drivers.
When the Xbox is turned on, the software in ROM is decompressed into RAM, and the kernel initializes the hardware. Because there are no audio or video drivers in the kernel, the startup code plays the animation and sound by accessing the registers of the hardware directly. As soon as the Xbox logo is on the display, the kernel unlocks the hard disk and checks whether there is a valid game medium in the DVD drive. If not, the file xboxdash.xbe gets loaded from partition #3 (Typically the Dashboard). In either case, the Microsoft logo is shown below the Xbox logo and the executable is started. If an error occurs (no/wrong hard disk, wrong signature, ...), the boot loader shows a Fatal Error screen and halts.
Chain of trust
The Xbox uses a chain of trust during the boot process:
MCPX X2
There's no chain of trust.
It behaves similarly to the MCPX X3 1.0 boot, except that the MCPX ROM is not available, so the code is loaded from untrusted flash.
There are also different X-Code opcodes and keys.
MCPX X3 1.0
- MCPX ROM:
- Runs untrusted X-Codes from flash in a limited virtual machine.
- Contains key + decrypts the 2BL using RC4.
- In success case: Go to 2BL.
- In error case: Hides ROM and intends to triple-fault.
- 2BL:
- The MCPX ROM is hidden.
- The 2BL decryption key is (overwritten with 0x00-Bytes)[FIXME].
- Contains keys for kernel decryption and execution; decrypts kernel using RC4 and extracts using LZX.[FIXME]
- Kernel:
- The kernel decryption key is overwritten with 0x00-Bytes.
- The 2BL is overwritten with 0xCC-Bytes.
- Once the kernel is initialized, the INIT section is discarded.
- The kernel only runs signed XBE files from allowed media.
MCPX X3 1.1
- MCPX ROM:
- Runs untrusted X-Codes from flash in a limited virtual machine.
- Hashes the unencrypted FBL using a TEA-based checksum.
- In success case: Go to FBL.
- In error case: Hides ROM and intends to triple-fault.
- FBL:
- Verify 2BL image.
- Derive key from key stored in MCPX + Flash, and decrypt 2BL.
- Go to 2BL.
The rest of the boot behaves like MCPX X3 1.0.
Assumptions for chain-of-trust
- The CPU will start execution in trusted MCPX ROM.
- The MCPX ROM can not be read or modified.
- The decrypted 2BL or kernel can not be read entirely.
- All parts of the software following the MCPX are not-attackable and signed.
See Exploits for possible options to break the chain of trust.
MCPX ROM
Certain things are still missing, for example, getting the CPU to 32 bit protected mode and enabling caching.[FIXME]
Xcodes
The X-Code interpreter is common to both versions of the MCPX ROM. A high-level interpretation of it might look like this:
    void xcode_interpreter() {
        // values are implied as x86 is just starting up
        register uint32_t pc = 0;        // stored in ESI register
        register uint8_t opcode = 0;     // stored in AL register
        register uint32_t operand_1 = 0; // stored in EBX register
        register uint32_t operand_2 = 0; // stored in ECX register
        register uint32_t result = 0;    // stored in EDI register
        register uint32_t scratch = 0;   // stored in EBP register

        // explicitly set startup point
        pc = 0xFF000080;

        while (1) {
            // each X-Code is 9 bytes: opcode byte + two dword operands
            opcode = get_memory_byte(pc);
            operand_1 = get_memory_dword(pc+1);
            operand_2 = get_memory_dword(pc+5);

            // opcode 0x07 chains: the real opcode and operands are shifted in
            if (opcode == 0x07) {
                opcode = operand_1;
                operand_1 = operand_2;
                operand_2 = result;
            }

            if (opcode == 0x02) {
                result = get_memory_dword(operand_1 & 0x0fffffff);
            } else if (opcode == 0x03) {
                set_memory_dword(operand_1, operand_2);
            } else if (opcode == 0x06) {
                result = (result & operand_1) | operand_2;
            } else if (opcode == 0x04) {
                if (operand_1 == 0x80000880) {
                    operand_2 &= 0xfffffffd;
                }
                outl(operand_1, 0xcf8);
                outl(operand_2, 0xcfc);
            } else if (opcode == 0x05) {
                outl(operand_1, 0xcf8);
                result = inl(0xcfc);
            } else if (opcode == 0x08) {
                if (result != operand_1) {
                    pc += operand_2;
                }
            } else if (opcode == 0x09) {
                pc += operand_2;
            } else if (opcode == 0x10) {
                scratch = (scratch & operand_1) | operand_2;
                result = scratch;
            } else if (opcode == 0x11) {
                outb(operand_2, operand_1);
            } else if (opcode == 0x12) {
                result = inb(operand_1);
            } else if (opcode == 0xee) {
                break;
            }

            pc += 9;
        }
    }
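The interpreter reads one opcode byte at `pc`, two dwords at `pc+1` and `pc+5`, and advances by 9, so each X-Code occupies 9 bytes. As a minimal sketch of that layout, the following C decodes one X-Code from an ordinary byte buffer. The `xcode_t` type and the `xcode_decode` helper are hypothetical names used only for illustration, and little-endian operand order is assumed because the interpreter runs on x86.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define XCODE_SIZE 9 /* 1 opcode byte + two 32-bit operands */

/* Hypothetical container for one decoded X-Code. */
typedef struct {
    uint8_t  opcode;
    uint32_t operand_1;
    uint32_t operand_2;
} xcode_t;

/* Read a 32-bit little-endian value from an unaligned byte pointer. */
static uint32_t read_le32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decode the X-Code at position `index` of a table of 9-byte entries. */
static xcode_t xcode_decode(const uint8_t *table, size_t index) {
    const uint8_t *p;
    xcode_t x;

    assert(table != NULL);
    p = table + index * XCODE_SIZE;
    x.opcode = p[0];
    x.operand_1 = read_le32(p + 1);
    x.operand_2 = read_le32(p + 5);
    return x;
}
```

A dump of the X-Code table from a flash image could be walked with `xcode_decode(table, n)` until an entry with opcode 0xEE is reached; jumps (0x08, 0x09) would need the same relative-offset handling as in the interpreter above.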
RC4 Decryption of the 2BL (MCPX 1.0 only)
Version 1.0 of the ROM uses RC4 to decrypt the 2BL. The RC4 algorithm was included as part of MCPX 1.0 and seems to work fine with BIOS versions 3944, 4034, and 4134.
Stage 1: Key Scheduling
The RC4 Key-Scheduling Algorithm is used to initialize the RC4 “S” array: it first writes the identity permutation (0, 1, ..., 255) to 0x8F000 through 0x8F0FF, then mixes in the 16-byte key with a swap loop similar to the PRGA.
    uint8_t *s = (uint8_t *)0x8f000;
    uint32_t i;
    for (i = 0; i <= 255; i++) {
        s[i] = i;
    }

    uint8_t *key = (uint8_t *)0xffffffa5; /* ROM offset 0x1a5. */
    uint8_t j, t;

    /* It is unclear why values s[0x100..0x101] are being set to 0. They are
     * not modified by the code, but later these will be used as the initial
     * i, j values in the PRGA.
     */
    s[0x100] = 0x00;
    s[0x101] = 0x00;

    for (i = 0, j = 0; i <= 255; i++) {
        j = j + s[i] + key[i%16];

        /* Swap s[i] and s[j] */
        t = s[i];
        s[i] = s[j];
        s[j] = t;
    }
Stage 2: PRGA
The RC4 Pseudo-random generation algorithm (PRGA) is then used to decrypt the 2BL from 0xFFFF9E00, storing the decrypted 2BL at 0x00090000. It is 24KiB in size.
    uint8_t *encrypted = (uint8_t*)0xFFFF9E00; /* 2bl */
    uint8_t *decrypted = (uint8_t*)0x90000;    /* Decrypted 2bl Destination */
    uint32_t pos;

    /* As noted above, s[0x100..0x101] were set to 0 earlier, but have not been
     * modified since. The RC4 algorithm defines i and j both to be set to 0
     * before PRGA begins. */
    i = s[0x100];
    j = s[0x101];

    for (pos = 0; pos < 0x6000; pos++) {
        /* Update i, j. */
        i = (i + 1) & 0xff;
        j += s[i];

        /* Swap s[i] and s[j]. */
        t = s[i];
        s[i] = s[j];
        s[j] = t;

        /* Decrypt message and write output (index wraps modulo 256). */
        decrypted[pos] = encrypted[pos] ^ s[(uint8_t)(s[i] + s[j])];
    }
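Taken together, the two stages above are ordinary RC4. The following is a minimal host-side sketch that works on plain buffers instead of the fixed ROM and RAM addresses; `rc4_crypt` is an illustrative name, not a routine from the ROM, and the state lives on the stack rather than at 0x8F000.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* RC4 over ordinary buffers: key scheduling followed by the PRGA.
 * Encryption and decryption are the same operation. */
static void rc4_crypt(const uint8_t *key, size_t key_len,
                      const uint8_t *in, uint8_t *out, size_t len) {
    uint8_t s[256];
    uint8_t i, j, t;
    size_t n;

    assert(key != NULL && key_len > 0);

    /* Stage 1: identity permutation, then mix in the key. */
    for (n = 0; n < 256; n++) {
        s[n] = (uint8_t)n;
    }
    j = 0;
    for (n = 0; n < 256; n++) {
        j = (uint8_t)(j + s[n] + key[n % key_len]);
        t = s[n]; s[n] = s[j]; s[j] = t;
    }

    /* Stage 2: generate the keystream and XOR it with the input. */
    i = 0;
    j = 0;
    for (n = 0; n < len; n++) {
        i = (uint8_t)(i + 1);
        j = (uint8_t)(j + s[i]);
        t = s[i]; s[i] = s[j]; s[j] = t;
        out[n] = in[n] ^ s[(uint8_t)(s[i] + s[j])];
    }
}
```

To mirror the ROM's parameters on a flash dump, the call would use a 16-byte key and a length of 0x6000 bytes; the key itself is not reproduced here.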
Stage 3: Signature Verification
Now that the second-stage bootloader (2BL) has been decrypted, a quick sanity check is performed: a “magic” signature is verified. If the signature doesn’t match, control goes to the error handler. If it does match, the code jumps to the 2BL entry point, which is given by the first dword of the decrypted 2BL.
    mov eax, [0x95fe4]
    cmp eax, MAGIC_NUMBER
    jne 0xffffff94      ; If signature check failed, jump to error handler
    mov eax, [0x90000]
    jmp eax             ; Jump to 2BL entry point
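The same check can be sketched in C against a decrypted 2BL held in a host buffer. The offset 0x5FE4 follows from 0x95FE4 - 0x90000; the function name `bl2_check` is hypothetical, and the magic value is passed in as a parameter because it is not given above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BL2_SIZE      0x6000u /* 24 KiB decrypted image */
#define BL2_MAGIC_OFF 0x5FE4u /* 0x95FE4 - 0x90000 */

/* Read a 32-bit little-endian value from an unaligned byte pointer. */
static uint32_t load_le32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 1 and stores the entry point (first dword of the image) if the
 * magic matches; returns 0 otherwise, which corresponds to the jump to the
 * error handler in the ROM. `image` must hold BL2_SIZE bytes. */
static int bl2_check(const uint8_t *image, uint32_t expected_magic,
                     uint32_t *entry) {
    assert(image != NULL && entry != NULL);
    if (load_le32(image + BL2_MAGIC_OFF) != expected_magic) {
        return 0;
    }
    *entry = load_le32(image);
    return 1;
}
```

Note that this only compares one dword, so it detects a wrong key or a corrupted image by chance rather than authenticating the 2BL.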
TEA Verification of the FBL (MCPX 1.1 only)
Version 1.1 of the ROM uses TEA (Tiny Encryption Algorithm) to verify the contents of the FBL, and delegates the task of decrypting 2BL to it. This is exclusive to version 1.1 of the MCPX ROM and kernels 4817, 5101, 5530, 5713 and 5838.
The FBL itself is not encrypted; TEA is used here only as a checksum to verify it.
A temporary buffer for storing the output hash is located at 0x8F000 in RAM, and the FBL is located at 0xFFFFD400 in flash. It is 0x2880 bytes in size, or ~11KiB.
[FIXME]
FBL (MCPX 1.1 only)
The Flash Boot Loader (FBL) was added in MCPX 1.1. It is stored unencrypted in flash, verified with a TEA checksum by the MCPX ROM, and is designed to be an intermediary loader between the MCPX and the 2BL.
The FBL's job is to verify flash integrity, then load and decrypt 2BL. It also contains cryptography functions that are later used by newer versions of the 2BL (for example, SHA-1).
Setting up stack
The ESP is set to 0x8F000 in RAM, and later used to store variables used in the verification process.
Verifying Integrity
[FIXME]
Calculating 2BL Key
The FBL decrypts the 2BL by deriving an RC4 key from a secret key in the MCPX ROM itself and a key stored in the flash. Pseudocode for this operation follows:
    uint8_t *mcpx_key = (uint8_t *)0xFFFFFF9C;
    uint8_t *flash_key = (uint8_t *)0xFFFFFDF0; // 5838

    uint8_t key_buffer[0x14];
    SHA1_Context sha;
    RC4_Context rc4;

    SHA1_Init(&sha);
    SHA1_Update(&sha, mcpx_key, 0x10);  // Secret key in the MCPX
    SHA1_Update(&sha, flash_key, 0x10); // Key stored in flash
    for (int i = 0; i < 0x10; i++) {
        key_buffer[i] = mcpx_key[i] ^ 0x5C;
    }
    SHA1_Update(&sha, key_buffer, 0x10);
    SHA1_Final(&sha, key_buffer);

    RC4_Init(&rc4, 0x14, key_buffer);
Loading 2BL
After calculating the encryption key, the 2BL is loaded from flash at 0xFFFF9E00 into memory at 0x3FA000. RC4 is used to decrypt the 2BL with the previously calculated key.
Control is then transferred to the 2BL entry point, located at offset 0x35F8 in the 2BL.
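The RC4 step can be illustrated with a small, self-contained implementation of the standard algorithm. This is a sketch, not the FBL's actual code: `rc4_state`, `rc4_init` and `rc4_crypt` are hypothetical names, and the key would be the 20-byte (0x14) SHA-1 output from the derivation above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint8_t s[256]; /* permutation table */
    uint8_t i, j;   /* stream indices */
} rc4_state;

/* Key-scheduling algorithm (KSA). */
static void rc4_init(rc4_state *st, const uint8_t *key, size_t key_len)
{
    assert(st != NULL && key != NULL && key_len > 0 && key_len <= 256);
    for (int n = 0; n < 256; n++)
        st->s[n] = (uint8_t)n;
    uint8_t j = 0;
    for (int n = 0; n < 256; n++) {
        j = (uint8_t)(j + st->s[n] + key[(size_t)n % key_len]);
        uint8_t tmp = st->s[n];
        st->s[n] = st->s[j];
        st->s[j] = tmp;
    }
    st->i = 0;
    st->j = 0;
}

/* XOR the keystream over data in place; the same call encrypts and decrypts. */
static void rc4_crypt(rc4_state *st, uint8_t *data, size_t len)
{
    assert(st != NULL && (data != NULL || len == 0));
    for (size_t n = 0; n < len; n++) {
        st->i = (uint8_t)(st->i + 1);
        st->j = (uint8_t)(st->j + st->s[st->i]);
        uint8_t tmp = st->s[st->i];
        st->s[st->i] = st->s[st->j];
        st->s[st->j] = tmp;
        data[n] ^= st->s[(uint8_t)(st->s[st->i] + st->s[st->j])];
    }
}
```

Under these assumptions, decrypting the 2BL amounts to `rc4_init(&st, key_buffer, 0x14)` followed by one `rc4_crypt` call over the copy of the 2BL in RAM.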
2BL
[FIXME]
MTRR Setup
First, the cache is disabled.[FIXME]
Then, the MTRRs (Memory Type Range Registers) are set up (using wrmsr) in the following way:
MTRR (ecx) | High value (edx) | Low value (eax) | Notes |
---|---|---|---|
0x200 | 0x00000000 | 0x00000006 | |
0x201 | 0x0000000F | 0xFC000800 | (For 64 MiB RAM BIOS) |
0x201 | 0x0000000F | 0xF8000800 | (For 128 MiB RAM BIOS) |
0x202 | 0x00000000 | 0xFFF80005 | |
0x203 | 0x0000000F | 0xFFF80800 | |
0x204 | 0x00000000 | 0x00000000 | Clear all unused MTRR |
... | ... | ... | |
0x20F | 0x00000000 | 0x00000000 | |
0x2FF | 0x00000000 | 0x00000800 | |
Once the MTRRs have been written, the cache is enabled.[FIXME]
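The table values can be reproduced from the x86 variable-range MTRR layout. Read that way, 0x200/0x201 appear to mark RAM as write-back (type 6), 0x202/0x203 mark the top 512 KiB below 4 GiB as write-protected (type 5), and 0x2FF enables MTRRs with an uncacheable default type. The sketch below assumes 36-bit physical addressing; the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define MTRR_TYPE_UC    0u     /* uncacheable */
#define MTRR_TYPE_WP    5u     /* write-protected */
#define MTRR_TYPE_WB    6u     /* write-back */
#define MTRR_MASK_VALID 0x800u /* bit 11 of PHYSMASK: range is valid */
#define PHYS_ADDR_BITS  36     /* assumption: 36-bit physical addresses */

/* Value for an IA32_MTRR_PHYSBASEn register (edx:eax as one 64-bit value). */
static uint64_t mtrr_phys_base(uint64_t base, unsigned type)
{
    assert((base & 0xFFFu) == 0 && type <= 0xFFu);
    return base | type;
}

/* Value for an IA32_MTRR_PHYSMASKn register covering a power-of-two size. */
static uint64_t mtrr_phys_mask(uint64_t size)
{
    assert(size >= 0x1000u && (size & (size - 1)) == 0);
    uint64_t addr_mask = (1ULL << PHYS_ADDR_BITS) - 1;
    return (~(size - 1) & addr_mask) | MTRR_MASK_VALID;
}
```

For example, `mtrr_phys_mask(64ULL << 20)` gives 0xF:0xFC000800 and `mtrr_phys_mask(128ULL << 20)` gives 0xF:0xF8000800, matching the two 0x201 rows.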
Register setup
Now the 2BL will set up the segment registers[FIXME] and stack:
Register | Value | Notes |
---|---|---|
ds | 0x0010 | Data segment[citation needed] |
es | 0x0010 | |
ss | 0x0010 | |
esp | 0x00400000 | |
fs | 0x0000 | |
gs | 0x0000 |
Self-copy
Now the 2BL copies itself (24 kiB) from 0x00090000 to memory address 0x00400000.
Paging
Now a PDE is prepared at address 0x0000F000:
Offset in PDE | Value | Notes |
---|---|---|
0x800 | 0x000000E3 | Identity maps the first 256MiB of RAM: 0x80000000 and 0x00000000 will each map to physical page 0 0xE3: Flags: * 0x80: 4 MiB page * 0x40: Marked as previously written (Dirty) * 0x20: Marked as previously accessed * 0x02: Read/Write * 0x01: Present |
0x000 | 0x000000E3 | |
0x804 | 0x004000E3 | |
0x004 | 0x004000E3 | |
... | ||
0x8FC | 0x0FC000E3 | |
0x0FC | 0x0FC000E3 | |
0x900 | 0x00000000 | Unmapping the rest of the pages |
0x100 | 0x00000000 | |
... | ||
0xFFC | 0x00000000 | |
0x7FC | 0x00000000 | |
0xC00 | 0x0000F063 | Maps the PDE (4 kiB page) to address 0xC0000000 0x63: Flags: * 0x40: Marked as previously written (Dirty) * 0x20: Marked as previously accessed * 0x02: Read/Write * 0x01: Present |
0xFFC | 0xFFC000E3 | Identity maps the upper portion of the Flash (4 MiB page) to address 0xFFC00000 0xE3: Flags: * 0x80: 4 MiB page * 0x40: Marked as previously written (Dirty) * 0x20: Marked as previously accessed * 0x02: Read/Write * 0x01: Present |
0xFD0 | 0xFD0000FB | Maps 16 MiB for the GPU control registers 0xFB: Flags: * 0x80: 4 MiB page * 0x40: Marked as previously written (Dirty) * 0x20: Marked as previously accessed * 0x10: Cache disabled * 0x08: Write-Through caching * 0x02: Read/Write * 0x01: Present |
0xFD4 | 0xFD4000FB | |
0xFD8 | 0xFD8000FB | |
0xFDC | 0xFDC000FB |
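The table can be reproduced by filling a 1024-entry page directory in C. This is a sketch that only mirrors the values listed above; the function and macro names are hypothetical, and the array stands in for the physical page at 0x0000F000.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PDE_PRESENT  0x01u
#define PDE_RW       0x02u
#define PDE_PWT      0x08u /* write-through caching */
#define PDE_PCD      0x10u /* cache disabled */
#define PDE_ACCESSED 0x20u
#define PDE_DIRTY    0x40u
#define PDE_4MB      0x80u /* 4 MiB page */

#define PD_ENTRIES   1024u
#define PD_PHYS_ADDR 0x0000F000u

/* Entry that maps directory slot `index` onto physical address index * 4 MiB. */
static uint32_t pde_4mb(uint32_t index, uint32_t flags)
{
    return (index << 22) | flags;
}

static void build_page_directory(uint32_t pd[PD_ENTRIES])
{
    const uint32_t ram   = PDE_4MB | PDE_DIRTY | PDE_ACCESSED | PDE_RW | PDE_PRESENT; /* 0xE3 */
    const uint32_t mmio  = ram | PDE_PCD | PDE_PWT;                                  /* 0xFB */
    const uint32_t small = PDE_DIRTY | PDE_ACCESSED | PDE_RW | PDE_PRESENT;          /* 0x63 */

    assert(pd != NULL);
    memset(pd, 0, PD_ENTRIES * sizeof pd[0]);   /* everything unmapped by default */

    /* First 256 MiB of RAM, visible at 0x00000000 and at 0x80000000. */
    for (uint32_t i = 0; i < 64; i++) {
        pd[i] = pde_4mb(i, ram);
        pd[0x200 + i] = pde_4mb(i, ram);
    }
    pd[0x300] = PD_PHYS_ADDR | small;           /* directory itself at 0xC0000000 */
    pd[0x3FF] = pde_4mb(0x3FF, ram);            /* flash at 0xFFC00000 */
    for (uint32_t i = 0x3F4; i <= 0x3F7; i++)   /* 16 MiB of GPU registers at 0xFD000000 */
        pd[i] = pde_4mb(i, mmio);
}
```

Each table offset is the directory index times 4, so offset 0x800 is index 0x200 (virtual 0x80000000) and offset 0xFD0 is index 0x3F4 (virtual 0xFD000000).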
After setting up the PDE, the PAT is set up using wrmsr: [FIXME]
CR4 is touched [FIXME]
CR3 is touched [FIXME]
Now paging is activated by enabling the PG and WP bits in CR0.
Additionally, the same OR instruction is used to enable the NE bit in CR0.
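The three CR0 bits have fixed positions on x86, so the combined mask can be written down directly. The sketch below only computes the new CR0 value; the function name is hypothetical, and the actual register write (mov to cr0) is not shown.

```c
#include <assert.h>
#include <stdint.h>

#define CR0_PE (1u << 0)  /* protected mode enable */
#define CR0_NE (1u << 5)  /* native x87 error reporting */
#define CR0_WP (1u << 16) /* write protect applies in supervisor mode */
#define CR0_PG (1u << 31) /* paging */

/* Returns the CR0 value after a single OR that sets PG, WP and NE. */
static uint32_t cr0_enable_paging(uint32_t cr0)
{
    /* Paging can only be enabled while protected mode is already active. */
    assert((cr0 & CR0_PE) != 0);
    return cr0 | CR0_PG | CR0_WP | CR0_NE; /* mask 0x80010020 */
}
```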
2BL main
esp is now also reloaded to point at the relocated address. It will be set to 0x80400000 (absolute value, independent of previous esp value).
The 2BL will now call into the relocated 2BL code somewhere near 0x00400000.
Disabling of the MCPX ROM
    out32(0xCF8, 0x80000880);
    out8(0xCFC, 0x02);
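Writes to port 0xCF8 followed by accesses to port 0xCFC, as used here and in the following sections, are the standard PCI configuration mechanism: 0xCF8 selects bus, device, function and register, and 0xCFC transfers the data. The helpers below (hypothetical names) show how such an address is composed and decoded; which device sits at each location is not implied by the sketch.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    unsigned bus;      /* bits 23..16 */
    unsigned device;   /* bits 15..11 */
    unsigned function; /* bits 10..8  */
    unsigned reg;      /* bits 7..0   */
} pci_cfg_addr;

/* Build the value written to port 0xCF8 (bit 31 = enable). */
static uint32_t pci_cfg_encode(unsigned bus, unsigned device,
                               unsigned function, unsigned reg)
{
    assert(bus < 256 && device < 32 && function < 8 && reg < 256);
    return 0x80000000u | ((uint32_t)bus << 16) | ((uint32_t)device << 11) |
           ((uint32_t)function << 8) | (uint32_t)reg;
}

/* Split a 0xCF8 value back into its fields. */
static pci_cfg_addr pci_cfg_decode(uint32_t value)
{
    pci_cfg_addr a;
    assert((value & 0x80000000u) != 0); /* enable bit must be set */
    a.bus = (value >> 16) & 0xFFu;
    a.device = (value >> 11) & 0x1Fu;
    a.function = (value >> 8) & 0x07u;
    a.reg = value & 0xFFu;
    return a;
}
```

Decoding 0x80000880 this way gives bus 0, device 1, function 0, register 0x80.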
SMC handling
The SMC has a watchdog which must be turned off. To do this, the 2BL queries SMC registers 0x1C to 0x1F. If all of them are 0x00, the 2BL will shut down the system[FIXME]. Otherwise, the bootloader calculates the watchdog challenge response and sends it to SMC registers 0x20 and 0x21.
Additionally, the 2BL will set SMC register 0x01 to 0 (which resets the cursor position for reading the SMC revision information).
Enable IDE and NIC
    out32(0xCF8, 0x8000088C);
    out32(0xCFC, 0x40000000);
Memory cleanup
The 2BL fills memory with 0xCC from 0x80090000 to 0x80095FFF. These are the 24 kiB where the 2BL was stored previously.
Setup RAM timing
Not described yet; this is complicated[FIXME]. It became considerably more complicated when Microsoft started using different RAM some time after Hardware Revision 1.6 was already out.
Configure LDT bus
DWORD flow control is enabled in the MCPX.
    out32(0xCF8, 0x80000854);
    out32(0xCFC, in32(0xCFC) | 0x88000000);
DWORD flow control is also enabled in the NV2A core.
    out32(0xCF8, 0x80000064);
    out32(0xCFC, in32(0xCFC) | 0x88000000);
The LDT bus is reset.
    out32(0xCF8, 0x8000006C);
    uint32_t tmp = in32(0xCFC);
    out32(0xCFC, tmp & 0xFFFFFFFE);
    out32(0xCFC, tmp);
The rest is unknown[FIXME].
    out32(0xCF8, 0x80000080);
    out32(0xCFC, 0x00000100);
Enable USB ASRC
The USB controller's "automatic slew rate compensation" feature is enabled for MCPX revisions D01 and later.
    out32(0xCF8, 0x80000808);
    uint8_t mcpx_revision = in8(0xCFC);
    if (mcpx_revision >= 0xD1) {
        out32(0xCF8, 0x800008C8);
        out32(0xCFC, 0x00008F00);
    }
Loading the kernel
Kernel-copy
The Kernel is now copied into RAM.
Kernel decryption
The 2BL will copy the kernel decryption key (16 bytes) from offset 32 of an array of 3 keys:
Offset | Use |
---|---|
0 | EEPROM key |
16 | Certificate key |
32 | Kernel key |
The Kernel is then decrypted in-place using RC4.
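The key array can be pictured as three consecutive 16-byte fields. The struct and function names below are hypothetical and only illustrate the offsets from the table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BOOT_KEY_SIZE 16

/* Layout of the 3-key array; field names are illustrative only. */
typedef struct {
    uint8_t eeprom_key[BOOT_KEY_SIZE];      /* offset 0  */
    uint8_t certificate_key[BOOT_KEY_SIZE]; /* offset 16 */
    uint8_t kernel_key[BOOT_KEY_SIZE];      /* offset 32 */
} boot_keys;

/* Copy the 16-byte kernel key out of a raw key array. */
static void copy_kernel_key(uint8_t out[BOOT_KEY_SIZE], const uint8_t *key_array)
{
    assert(out != NULL && key_array != NULL);
    memcpy(out, key_array + offsetof(boot_keys, kernel_key), BOOT_KEY_SIZE);
}
```

The copied key would then be used to initialise RC4 before decrypting the kernel image in place.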
Kernel decompression
The Kernel is decompressed directly to 0x80010000 using the LZX compression scheme, where it will reside until a full system shutdown.
Running the kernel
The xboxkrnl.exe header at 0x80010000 is checked to make sure it contains both the "MZ" and "PE" magic values at the correct positions. If it is invalid, some hardware is touched [FIXME] and the system is put into an infinite loop. If it is valid, the kernel entry point is looked up in the PE optional header, and the hardcoded image base of 0x80010000 is added to it. The entry point is then called, with arguments passed on the stack from right to left. The first argument is a command-line string loaded from memory address 0x80400004; it is an empty string for retail BIOS[FIXME]. The second argument is a pointer to the previously mentioned array of 3 keys.
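The header check and entry point lookup follow the ordinary PE layout: the dword at offset 0x3C points to the PE signature, and AddressOfEntryPoint sits 0x28 bytes after that signature. The sketch below works on a buffer rather than live memory; the function name is hypothetical, and checking all four signature bytes ("PE\0\0") is an assumption, since the article only mentions the "PE" magic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define KERNEL_IMAGE_BASE 0x80010000u

/* Read a little-endian 32-bit value (x86 byte order). */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns the absolute kernel entry point, or 0 if the image is invalid. */
static uint32_t kernel_entry_point(const uint8_t *img, size_t size)
{
    assert(img != NULL);
    if (size < 0x40 || img[0] != 'M' || img[1] != 'Z')
        return 0;                                /* no DOS header */
    uint32_t pe_off = read_le32(img + 0x3C);     /* e_lfanew */
    if (pe_off > size || size - pe_off < 0x2C)
        return 0;                                /* headers out of bounds */
    if (memcmp(img + pe_off, "PE\0\0", 4) != 0)  /* assumption: full signature */
        return 0;
    /* signature (4) + COFF header (20) + 16 bytes into the optional header */
    return KERNEL_IMAGE_BASE + read_le32(img + pe_off + 0x28);
}
```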
Kernel
Initialization
Stage 1 (Cold-boot only)
The kernel entry point first parses the arguments.[FIXME] At the end, the kernel calls the initialization routine for what we'll refer to as Stage 2.
Stage 2 (Cold-boot only)
The kernel initialization will only happen once on a cold-boot. It will not happen for reboots.
- ebp is set to 0x00000000
- esp is modified[FIXME]
- GDT is prepared[FIXME] and loaded[FIXME]
- cs and ds are reloaded
- fs is set[FIXME]
- TSS is loaded[FIXME]
- cr3 is moved to 3 tasks[FIXME]
- The CPU microcode is updated
After this comes Stage 3 initialization which will also be repeated on kernel re-initialization.
Stage 3
This code is duplicated in the INIT and .text sections.
- In the INIT section it directly follows the Stage 2 initialization.
- In the .text section it follows the Kernel re-initialization code mentioned below.
This code does the following:
- IDT is prepared[FIXME] and loaded[FIXME]
- [FIXME]
Re-initialization
On reboots, initialization Stages 1 and 2 are no longer in memory (the INIT section has been discarded) and cannot be run again. Instead, a separate function replaces their functionality and then jumps directly to Stage 3 initialization.
This code is the partial kernel re-initialization, which runs on reboots triggered through Kernel/HalReturnToFirmware.
- ebp is set to 0x00000000
- esp is modified[FIXME]
- Some memory stuff in a separate function[FIXME]
- The .data section from Flash is loaded and replaces the running .data
- The byte in front of KeSystemTime is set to 0x01, indicating that the system comes from a reboot.
After this has completed, Stage 3 of the kernel initialization will take over.
Skipped initialization
When rebooting, certain parts are still initialized and assumed to be working:
(This list is currently in no particular order and incomplete)
- Anything already done by Stage 1 and Stage 2
- PCI device setup
- EEPROM decryption[FIXME]
- Check for AV-Pack[FIXME]
- Video mode setup (boot animation is not played again)
- Some IDE stuff[FIXME]
- Some SMC stuff[FIXME]
- Memory allocator initialization[FIXME]
- Kernel debugger (Super-I/O) initialization[FIXME] (This still seems to be in memory?!)[citation needed]