----------------------------
H. Peter Anvin <hpa@zytor.com>
- Last update 2007-03-06
+ Last update 2007-05-23
On the i386 platform, the Linux kernel uses a rather complicated boot
convention. This has evolved partially due to historical aspects, as
expectations in the PC industry caused by the effective demise of
real-mode DOS as a mainstream operating system.
-Currently, four versions of the Linux/i386 boot protocol exist.
+Currently, the following versions of the Linux/i386 boot protocol exist.
Old kernels: zImage/Image support only. Some very early kernels
may not even support a command line.
0A0000 +------------------------+
| Reserved for BIOS | Do not use. Reserved for BIOS EBDA.
09A000 +------------------------+
- | Stack/heap/cmdline | For use by the kernel real-mode code.
+ | Command line |
+ | Stack/heap | For use by the kernel real-mode code.
098000 +------------------------+
| Kernel setup | The kernel real-mode code.
090200 +------------------------+
When using bzImage, the protected-mode kernel was relocated to
0x100000 ("high memory"), and the kernel real-mode block (boot sector,
setup, and stack/heap) was made relocatable to any address between
-0x10000 and end of low memory. Unfortunately, in protocols 2.00 and
-2.01 the command line is still required to live in the 0x9XXXX memory
-range, and that memory range is still overwritten by the early kernel.
-The 2.02 protocol resolves that problem.
+0x10000 and end of low memory. Unfortunately, in protocols 2.00 and
+2.01 the 0x90000+ memory range is still used internally by the kernel;
+the 2.02 protocol resolves that problem.
It is desirable to keep the "memory ceiling" -- the highest point in
low memory touched by the boot loader -- as low as possible, since
0x90000 segment, the boot loader should make sure not to use memory
above the 0x9A000 point; too many BIOSes will break above that point.
+For a modern bzImage kernel with boot protocol version >= 2.02, a
+memory layout like the following is suggested:
+
+ ~ ~
+ | Protected-mode kernel |
+100000 +------------------------+
+ | I/O memory hole |
+0A0000 +------------------------+
+ | Reserved for BIOS | Leave as much as possible unused
+ ~ ~
+ | Command line | (Can also be below the X+10000 mark)
+X+10000 +------------------------+
+ | Stack/heap | For use by the kernel real-mode code.
+X+08000 +------------------------+
+ | Kernel setup | The kernel real-mode code.
+ | Kernel boot sector | The kernel legacy boot sector.
+X +------------------------+
+ | Boot loader | <- Boot sector entry point 0000:7C00
+001000 +------------------------+
+ | Reserved for MBR/BIOS |
+000800 +------------------------+
+ | Typically used by MBR |
+000600 +------------------------+
+ | BIOS use only |
+000000 +------------------------+
+
+... where the address X is as low as the design of the boot loader
+permits.
+
**** THE REAL-MODE KERNEL HEADER
0234/1 2.05+ relocatable_kernel Whether kernel is relocatable or not
0235/3 N/A pad2 Unused
0238/4 2.06+ cmdline_size Maximum size of the kernel command line
+023C/4 2.07+ hardware_subarch Hardware subarchitecture
+0240/8 2.07+ hardware_subarch_data Subarchitecture-specific data
+0248/4 2.08+ compressed_payload_offset
+024C/4 2.08+ compressed_payload_length
(1) For backwards compatibility, if the setup_sects field contains 0, the
real value is 4.
setting fields in the header, you must make sure only to set fields
supported by the protocol version in use.
-The "kernel_version" field, if set to a nonzero value, contains a
-pointer to a null-terminated human-readable kernel version number
-string, less 0x200. This can be used to display the kernel version to
-the user. This value should be less than (0x200*setup_sects). For
-example, if this value is set to 0x1c00, the kernel version number
-string can be found at offset 0x1e00 in the kernel file. This is a
-valid value if and only if the "setup_sects" field contains the value
-14 or higher.
-
-Most boot loaders will simply load the kernel at its target address
-directly. Such boot loaders do not need to worry about filling in
-most of the fields in the header. The following fields should be
-filled out, however:
-
- vid_mode:
- Please see the section on SPECIAL COMMAND LINE OPTIONS.
-
- type_of_loader:
- If your boot loader has an assigned id (see table below), enter
- 0xTV here, where T is an identifier for the boot loader and V is
- a version number. Otherwise, enter 0xFF here.
-
- Assigned boot loader ids:
- 0 LILO
+
+**** DETAILS OF HEADER FIELDS
+
+For each field, some are information from the kernel to the bootloader
+("read"), some are expected to be filled out by the bootloader
+("write"), and some are expected to be read and modified by the
+bootloader ("modify").
+
+All general purpose boot loaders should write the fields marked
+(obligatory). Boot loaders who want to load the kernel at a
+nonstandard address should fill in the fields marked (reloc); other
+boot loaders can ignore those fields.
+
+The byte order of all fields is littleendian (this is x86, after all.)
+
+Field name: setup_sects
+Type: read
+Offset/size: 0x1f1/1
+Protocol: ALL
+
+ The size of the setup code in 512-byte sectors. If this field is
+ 0, the real value is 4. The real-mode code consists of the boot
+ sector (always one 512-byte sector) plus the setup code.
+
+Field name: root_flags
+Type: modify (optional)
+Offset/size: 0x1f2/2
+Protocol: ALL
+
+ If this field is nonzero, the root defaults to readonly. The use of
+ this field is deprecated; use the "ro" or "rw" options on the
+ command line instead.
+
+Field name: syssize
+Type: read
+Offset/size: 0x1f4/4 (protocol 2.04+) 0x1f4/2 (protocol ALL)
+Protocol: 2.04+
+
+ The size of the protected-mode code in units of 16-byte paragraphs.
+ For protocol versions older than 2.04 this field is only two bytes
+ wide, and therefore cannot be trusted for the size of a kernel if
+ the LOAD_HIGH flag is set.
+
+Field name: ram_size
+Type: kernel internal
+Offset/size: 0x1f8/2
+Protocol: ALL
+
+ This field is obsolete.
+
+Field name: vid_mode
+Type: modify (obligatory)
+Offset/size: 0x1fa/2
+
+ Please see the section on SPECIAL COMMAND LINE OPTIONS.
+
+Field name: root_dev
+Type: modify (optional)
+Offset/size: 0x1fc/2
+Protocol: ALL
+
+ The default root device device number. The use of this field is
+ deprecated, use the "root=" option on the command line instead.
+
+Field name: boot_flag
+Type: read
+Offset/size: 0x1fe/2
+Protocol: ALL
+
+ Contains 0xAA55. This is the closest thing old Linux kernels have
+ to a magic number.
+
+Field name: jump
+Type: read
+Offset/size: 0x200/2
+Protocol: 2.00+
+
+ Contains an x86 jump instruction, 0xEB followed by a signed offset
+ relative to byte 0x202. This can be used to determine the size of
+ the header.
+
+Field name: header
+Type: read
+Offset/size: 0x202/4
+Protocol: 2.00+
+
+ Contains the magic number "HdrS" (0x53726448).
+
+Field name: version
+Type: read
+Offset/size: 0x206/2
+Protocol: 2.00+
+
+ Contains the boot protocol version, in (major << 8)+minor format,
+ e.g. 0x0204 for version 2.04, and 0x0a11 for a hypothetical version
+ 10.17.
+
+Field name: readmode_swtch
+Type: modify (optional)
+Offset/size: 0x208/4
+Protocol: 2.00+
+
+ Boot loader hook (see ADVANCED BOOT LOADER HOOKS below.)
+
+Field name: start_sys
+Type: read
+Offset/size: 0x20c/4
+Protocol: 2.00+
+
+ The load low segment (0x1000). Obsolete.
+
+Field name: kernel_version
+Type: read
+Offset/size: 0x20e/2
+Protocol: 2.00+
+
+ If set to a nonzero value, contains a pointer to a NUL-terminated
+ human-readable kernel version number string, less 0x200. This can
+ be used to display the kernel version to the user. This value
+ should be less than (0x200*setup_sects).
+
+ For example, if this value is set to 0x1c00, the kernel version
+ number string can be found at offset 0x1e00 in the kernel file.
+ This is a valid value if and only if the "setup_sects" field
+ contains the value 15 or higher, as:
+
+ 0x1c00 < 15*0x200 (= 0x1e00) but
+ 0x1c00 >= 14*0x200 (= 0x1c00)
+
+ 0x1c00 >> 9 = 14, so the minimum value for setup_secs is 15.
+
+Field name: type_of_loader
+Type: write (obligatory)
+Offset/size: 0x210/1
+Protocol: 2.00+
+
+ If your boot loader has an assigned id (see table below), enter
+ 0xTV here, where T is an identifier for the boot loader and V is
+ a version number. Otherwise, enter 0xFF here.
+
+ Assigned boot loader ids:
+ 0 LILO (0x00 reserved for pre-2.00 bootloader)
1 Loadlin
- 2 bootsect-loader
+ 2 bootsect-loader (0x20, all other values reserved)
3 SYSLINUX
4 EtherBoot
5 ELILO
8 U-BOOT
9 Xen
A Gujin
+ B Qemu
- Please contact <hpa@zytor.com> if you need a bootloader ID
- value assigned.
-
- loadflags, heap_end_ptr:
- If the protocol version is 2.01 or higher, enter the
- offset limit of the setup heap into heap_end_ptr and set the
- 0x80 bit (CAN_USE_HEAP) of loadflags. heap_end_ptr appears to
- be relative to the start of setup (offset 0x0200).
-
- setup_move_size:
- When using protocol 2.00 or 2.01, if the real mode
- kernel is not loaded at 0x90000, it gets moved there later in
- the loading sequence. Fill in this field if you want
- additional data (such as the kernel command line) moved in
- addition to the real-mode kernel itself.
-
- ramdisk_image, ramdisk_size:
- If your boot loader has loaded an initial ramdisk (initrd),
- set ramdisk_image to the 32-bit pointer to the ramdisk data
- and the ramdisk_size to the size of the ramdisk data.
-
- The initrd should typically be located as high in memory as
- possible, as it may otherwise get overwritten by the early
- kernel initialization sequence. However, it must never be
- located above the address specified in the initrd_addr_max
- field. The initrd should be at least 4K page aligned.
-
- cmd_line_ptr:
- If the protocol version is 2.02 or higher, this is a 32-bit
- pointer to the kernel command line. The kernel command line
- can be located anywhere between the end of setup and 0xA0000.
- Fill in this field even if your boot loader does not support a
- command line, in which case you can point this to an empty
- string (or better yet, to the string "auto".) If this field
- is left at zero, the kernel will assume that your boot loader
- does not support the 2.02+ protocol.
-
- ramdisk_max:
- The maximum address that may be occupied by the initrd
- contents. For boot protocols 2.02 or earlier, this field is
- not present, and the maximum address is 0x37FFFFFF. (This
- address is defined as the address of the highest safe byte, so
- if your ramdisk is exactly 131072 bytes long and this field is
- 0x37FFFFFF, you can start your ramdisk at 0x37FE0000.)
-
- cmdline_size:
- The maximum size of the command line without the terminating
- zero. This means that the command line can contain at most
- cmdline_size characters. With protocol version 2.05 and
- earlier, the maximum size was 255.
-
+ Please contact <hpa@zytor.com> if you need a bootloader ID
+ value assigned.
+
+Field name: loadflags
+Type: modify (obligatory)
+Offset/size: 0x211/1
+Protocol: 2.00+
+
+ This field is a bitmask.
+
+ Bit 0 (read): LOADED_HIGH
+ - If 0, the protected-mode code is loaded at 0x10000.
+ - If 1, the protected-mode code is loaded at 0x100000.
+
+ Bit 6 (write): KEEP_SEGMENTS
+ Protocol: 2.07+
+ - if 0, reload the segment registers in the 32bit entry point.
+ - if 1, do not reload the segment registers in the 32bit entry point.
+ Assume that %cs %ds %ss %es are all set to flat segments with
+ a base of 0 (or the equivalent for their environment).
+
+ Bit 7 (write): CAN_USE_HEAP
+ Set this bit to 1 to indicate that the value entered in the
+ heap_end_ptr is valid. If this field is clear, some setup code
+ functionality will be disabled.
+
+Field name: setup_move_size
+Type: modify (obligatory)
+Offset/size: 0x212/2
+Protocol: 2.00-2.01
+
+ When using protocol 2.00 or 2.01, if the real mode kernel is not
+ loaded at 0x90000, it gets moved there later in the loading
+ sequence. Fill in this field if you want additional data (such as
+ the kernel command line) moved in addition to the real-mode kernel
+ itself.
+
+ The unit is bytes starting with the beginning of the boot sector.
+
+ This field is can be ignored when the protocol is 2.02 or higher, or
+ if the real-mode code is loaded at 0x90000.
+
+Field name: code32_start
+Type: modify (optional, reloc)
+Offset/size: 0x214/4
+Protocol: 2.00+
+
+ The address to jump to in protected mode. This defaults to the load
+ address of the kernel, and can be used by the boot loader to
+ determine the proper load address.
+
+ This field can be modified for two purposes:
+
+ 1. as a boot loader hook (see ADVANCED BOOT LOADER HOOKS below.)
+
+ 2. if a bootloader which does not install a hook loads a
+ relocatable kernel at a nonstandard address it will have to modify
+ this field to point to the load address.
+
+Field name: ramdisk_image
+Type: write (obligatory)
+Offset/size: 0x218/4
+Protocol: 2.00+
+
+ The 32-bit linear address of the initial ramdisk or ramfs. Leave at
+ zero if there is no initial ramdisk/ramfs.
+
+Field name: ramdisk_size
+Type: write (obligatory)
+Offset/size: 0x21c/4
+Protocol: 2.00+
+
+ Size of the initial ramdisk or ramfs. Leave at zero if there is no
+ initial ramdisk/ramfs.
+
+Field name: bootsect_kludge
+Type: kernel internal
+Offset/size: 0x220/4
+Protocol: 2.00+
+
+ This field is obsolete.
+
+Field name: heap_end_ptr
+Type: write (obligatory)
+Offset/size: 0x224/2
+Protocol: 2.01+
+
+ Set this field to the offset (from the beginning of the real-mode
+ code) of the end of the setup stack/heap, minus 0x0200.
+
+Field name: cmd_line_ptr
+Type: write (obligatory)
+Offset/size: 0x228/4
+Protocol: 2.02+
+
+ Set this field to the linear address of the kernel command line.
+ The kernel command line can be located anywhere between the end of
+ the setup heap and 0xA0000; it does not have to be located in the
+ same 64K segment as the real-mode code itself.
+
+ Fill in this field even if your boot loader does not support a
+ command line, in which case you can point this to an empty string
+ (or better yet, to the string "auto".) If this field is left at
+ zero, the kernel will assume that your boot loader does not support
+ the 2.02+ protocol.
+
+Field name: initrd_addr_max
+Type: read
+Offset/size: 0x22c/4
+Protocol: 2.03+
+
+ The maximum address that may be occupied by the initial
+ ramdisk/ramfs contents. For boot protocols 2.02 or earlier, this
+ field is not present, and the maximum address is 0x37FFFFFF. (This
+ address is defined as the address of the highest safe byte, so if
+ your ramdisk is exactly 131072 bytes long and this field is
+ 0x37FFFFFF, you can start your ramdisk at 0x37FE0000.)
+
+Field name: kernel_alignment
+Type: read (reloc)
+Offset/size: 0x230/4
+Protocol: 2.05+
+
+ Alignment unit required by the kernel (if relocatable_kernel is true.)
+
+Field name: relocatable_kernel
+Type: read (reloc)
+Offset/size: 0x234/1
+Protocol: 2.05+
+
+ If this field is nonzero, the protected-mode part of the kernel can
+ be loaded at any address that satisfies the kernel_alignment field.
+ After loading, the boot loader must set the code32_start field to
+ point to the loaded code, or to a boot loader hook.
+
+Field name: cmdline_size
+Type: read
+Offset/size: 0x238/4
+Protocol: 2.06+
+
+ The maximum size of the command line without the terminating
+ zero. This means that the command line can contain at most
+ cmdline_size characters. With protocol version 2.05 and earlier, the
+ maximum size was 255.
+
+Field name: hardware_subarch
+Type: write
+Offset/size: 0x23c/4
+Protocol: 2.07+
+
+ In a paravirtualized environment the hardware low level architectural
+ pieces such as interrupt handling, page table handling, and
+ accessing process control registers needs to be done differently.
+
+ This field allows the bootloader to inform the kernel we are in one
+ one of those environments.
+
+ 0x00000000 The default x86/PC environment
+ 0x00000001 lguest
+ 0x00000002 Xen
+
+Field name: hardware_subarch_data
+Type: write
+Offset/size: 0x240/8
+Protocol: 2.07+
+
+ A pointer to data that is specific to hardware subarch
+
+Field name: compressed_payload_offset
+Type: read
+Offset/size: 0x248/4
+Protocol: 2.08+
+
+ If non-zero then this field contains the offset from the end of the
+ real-mode code to the compressed payload. The compression format
+ should be determined using the standard magic number, currently only
+ gzip is used.
+
+Field name: compressed_payload_length
+Type: read
+Offset/size: 0x24c/4
+Protocol: 2.08+
+
+ The length of the compressed payload.
**** THE KERNEL COMMAND LINE
field.
+**** MEMORY LAYOUT OF THE REAL-MODE CODE
+
+The real-mode code requires a stack/heap to be set up, as well as
+memory allocated for the kernel command line. This needs to be done
+in the real-mode accessible memory in bottom megabyte.
+
+It should be noted that modern machines often have a sizable Extended
+BIOS Data Area (EBDA). As a result, it is advisable to use as little
+of the low megabyte as possible.
+
+Unfortunately, under the following circumstances the 0x90000 memory
+segment has to be used:
+
+ - When loading a zImage kernel ((loadflags & 0x01) == 0).
+ - When loading a 2.01 or earlier boot protocol kernel.
+
+ -> For the 2.00 and 2.01 boot protocols, the real-mode code
+ can be loaded at another address, but it is internally
+ relocated to 0x90000. For the "old" protocol, the
+ real-mode code must be loaded at 0x90000.
+
+When loading at 0x90000, avoid using memory above 0x9a000.
+
+For boot protocol 2.02 or higher, the command line does not have to be
+located in the same 64K segment as the real-mode setup code; it is
+thus permitted to give the stack/heap the full 64K segment and locate
+the command line above it.
+
+The kernel command line should not be located below the real-mode
+code, nor should it be located in high memory.
+
+
**** SAMPLE BOOT CONFIGURATION
As a sample configuration, assume the following layout of the real
-mode segment (this is a typical, and recommended layout):
+mode segment:
+
+ When loading below 0x90000, use the entire segment:
- 0x0000-0x7FFF Real mode kernel
- 0x8000-0x8FFF Stack and heap
- 0x9000-0x90FF Kernel command line
+ 0x0000-0x7fff Real mode kernel
+ 0x8000-0xdfff Stack and heap
+ 0xe000-0xffff Kernel command line
+
+ When loading at 0x90000 OR the protocol version is 2.01 or earlier:
+
+ 0x0000-0x7fff Real mode kernel
+ 0x8000-0x97ff Stack and heap
+ 0x9800-0x9fff Kernel command line
Such a boot loader should enter the following fields in the header:
ramdisk_image = <initrd_address>;
ramdisk_size = <initrd_size>;
}
+
+ if ( protocol >= 0x0202 && loadflags & 0x01 )
+ heap_end = 0xe000;
+ else
+ heap_end = 0x9800;
+
if ( protocol >= 0x0201 ) {
- heap_end_ptr = 0x9000 - 0x200;
+ heap_end_ptr = heap_end - 0x200;
loadflags |= 0x80; /* CAN_USE_HEAP */
}
+
if ( protocol >= 0x0202 ) {
- cmd_line_ptr = base_ptr + 0x9000;
+ cmd_line_ptr = base_ptr + heap_end;
+ strcpy(cmd_line_ptr, cmdline);
} else {
cmd_line_magic = 0xA33F;
- cmd_line_offset = 0x9000;
- setup_move_size = 0x9100;
+ cmd_line_offset = heap_end;
+ setup_move_size = heap_end + strlen(cmdline)+1;
+ strcpy(base_ptr+cmd_line_offset, cmdline);
}
} else {
/* Very old kernel */
+ heap_end = 0x9800;
+
cmd_line_magic = 0xA33F;
- cmd_line_offset = 0x9000;
+ cmd_line_offset = heap_end;
/* A very old kernel MUST have its real-mode code
loaded at 0x90000 */
if ( base_ptr != 0x90000 ) {
/* Copy the real-mode kernel */
memcpy(0x90000, base_ptr, (setup_sects+1)*512);
- /* Copy the command line */
- memcpy(0x99000, base_ptr+0x9000, 256);
-
base_ptr = 0x90000; /* Relocated */
}
+ strcpy(0x90000+cmd_line_offset, cmdline);
+
/* It is recommended to clear memory up to the 32K mark */
memset(0x90000 + (setup_sects+1)*512, 0,
(64-(setup_sects+1))*512);
line is parsed.
mem=<size>
- <size> is an integer in C notation optionally followed by K, M
- or G (meaning << 10, << 20 or << 30). This specifies the end
- of memory to the kernel. This affects the possible placement
- of an initrd, since an initrd should be placed near end of
+ <size> is an integer in C notation optionally followed by
+ (case insensitive) K, M, G, T, P or E (meaning << 10, << 20,
+ << 30, << 40, << 50 or << 60). This specifies the end of
+ memory to the kernel. This affects the possible placement of
+ an initrd, since an initrd should be placed near end of
memory. Note that this is an option to *both* the kernel and
the bootloader!
/* Set up the real-mode kernel stack */
_SS = seg;
- _SP = 0x9000; /* Load SP immediately after loading SS! */
+ _SP = heap_end;
_DS = _ES = _FS = _GS = seg;
jmp_far(seg+0x20, 0); /* Run the kernel */
a demand-loaded module!
-**** ADVANCED BOOT TIME HOOKS
+**** ADVANCED BOOT LOADER HOOKS
If the boot loader runs in a particularly hostile environment (such as
LOADLIN, which runs under DOS) it may be impossible to follow the
code32_start:
A 32-bit flat-mode routine *jumped* to immediately after the
transition to protected mode, but before the kernel is
- uncompressed. No segments, except CS, are set up; you should
- set them up to KERNEL_DS (0x18) yourself.
+ uncompressed. No segments, except CS, are guaranteed to be
+ set up (current kernels do, but older ones do not); you should
+ set them up to BOOT_DS (0x18) yourself.
After completing your hook, you should jump to the address
- that was in this field before your boot loader overwrote it.
+ that was in this field before your boot loader overwrote it
+ (relocated, if appropriate.)
+
+
+**** 32-bit BOOT PROTOCOL
+
+For machine with some new BIOS other than legacy BIOS, such as EFI,
+LinuxBIOS, etc, and kexec, the 16-bit real mode setup code in kernel
+based on legacy BIOS can not be used, so a 32-bit boot protocol needs
+to be defined.
+
+In 32-bit boot protocol, the first step in loading a Linux kernel
+should be to setup the boot parameters (struct boot_params,
+traditionally known as "zero page"). The memory for struct boot_params
+should be allocated and initialized to all zero. Then the setup header
+from offset 0x01f1 of kernel image on should be loaded into struct
+boot_params and examined. The end of setup header can be calculated as
+follow:
+
+ 0x0202 + byte value at offset 0x0201
+
+In addition to read/modify/write the setup header of the struct
+boot_params as that of 16-bit boot protocol, the boot loader should
+also fill the additional fields of the struct boot_params as that
+described in zero-page.txt.
+
+After setupping the struct boot_params, the boot loader can load the
+32/64-bit kernel in the same way as that of 16-bit boot protocol.
+
+In 32-bit boot protocol, the kernel is started by jumping to the
+32-bit kernel entry point, which is the start address of loaded
+32/64-bit kernel.
+
+At entry, the CPU must be in 32-bit protected mode with paging
+disabled; a GDT must be loaded with the descriptors for selectors
+__BOOT_CS(0x10) and __BOOT_DS(0x18); both descriptors must be 4G flat
+segment; __BOOS_CS must have execute/read permission, and __BOOT_DS
+must have read/write permission; CS must be __BOOT_CS and DS, ES, SS
+must be __BOOT_DS; interrupt must be disabled; %esi must hold the base
+address of the struct boot_params; %ebp, %edi and %ebx must be zero.