Tuesday, September 26, 2006

Hibernating when you have a large uncached extension

One of the main issues I've faced while developing a software RAID miniport device driver was allocating enough memory during normal runtime (for handling RAID 5 operations) and still being called for hibernation by Windows. If you ever try this out, you'll quickly discover that allocating an uncached extension larger than 900KB will prevent diskdump.sys from even calling your miniport's DriverEntry.

How diskdump decides whether it can start a miniport driver is completely undocumented; in fact, even the Microsoft storage devs are at a loss to explain the criteria diskdump uses to crank up a miniport for hibernation or crashdump. After many, many hours of experimentation, I have found a way to satisfy diskdump's "criteria" to start up a miniport driver for hibernation or crashdump.

In my situation, my miniport driver was allocating 42MB of uncached memory during normal runtime. This of course prevented diskdump from even calling my DriverEntry routine for hibernation; not a good thing if you intend to get WHQL certification.

So here's the trick to hibernate while allocating a huge uncached extension:


Get rid of your device extension.

This seems to be the key factor for diskdump to "grant" hibernation to continue. How do you do this? Store your device extension data in your driver binary.

  • Simply create a global variable in your driver using the structure type of your device extension.
  • When specifying the device extension size during DriverEntry, specify the size of a pointer (remember: the size of a pointer is different for 32-bit and 64-bit drivers), e.g. DeviceExtensionSize = sizeof(void *);
  • In your FindAdapter routine, use the pointer space that was allocated by Windows for your device extension to point to your global variable, e.g. *(devExt_t **)HwDeviceExtension = &global_variable;
  • In all other entry points of your miniport, dereference the pointer to get access to your global variable, e.g. devExt_t *devext = *(devExt_t **)HwDeviceExtension;

Following this method of having a near-zero size device extension, you can have a very large uncached extension and will still be called for hibernation. Please note that the minimum memory requirement for the product I worked on was 1GB of RAM. I did test this implementation down to 512MB of RAM, but got mixed results.
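
To make the pointer trick concrete, here is a minimal user-mode sketch of the steps above. It is not real ScsiPort code: find_adapter, start_io, get_devext, device_extension_size and the fields of devExt_t are stand-in names I made up for illustration, and the "slot" is just a pointer-sized block of memory playing the role of the extension that Windows would hand to the miniport.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical device extension layout; a real driver keeps its own state here. */
typedef struct devExt {
    unsigned long signature;
    unsigned char raidState[4096];
} devExt_t;

/* The full extension lives in the driver image as a global variable. */
static devExt_t g_devExt;

/* The size DriverEntry would report as the device extension: one pointer.
 * sizeof(devExt_t *) is 4 on 32-bit builds and 8 on 64-bit builds. */
static size_t device_extension_size(void)
{
    return sizeof(devExt_t *);
}

/* Stand-in for FindAdapter: store the global's address in the
 * pointer-sized slot that was allocated as the device extension. */
static void find_adapter(void *HwDeviceExtension)
{
    assert(HwDeviceExtension != NULL);
    *(devExt_t **)HwDeviceExtension = &g_devExt;
}

/* Helper for every other entry point: read the pointer back out of the slot. */
static devExt_t *get_devext(void *HwDeviceExtension)
{
    assert(HwDeviceExtension != NULL);
    return *(devExt_t **)HwDeviceExtension;
}

/* Stand-in for any other entry point (StartIo, interrupt handler, ...). */
static void start_io(void *HwDeviceExtension)
{
    devExt_t *devext = get_devext(HwDeviceExtension);
    devext->signature = 0x52414944UL; /* arbitrary marker value */
}
```

In a real miniport the slot comes from the port driver rather than from your own allocation, and the same get_devext-style dereference would sit at the top of each entry point.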

Also note that the miniport driver I worked on had a very small SRB extension size (size of a pointer), so again, this may be a factor to look at when designing your miniport for this type of scenario.

During this implementation, a bug in the Windows Memory Manager was discovered that prevented this implementation from working properly on certain system configurations. Basically what happens is that after each hibernation, there is a memory leak (true for any miniport). This can build up over many subsequent hibernations (without shutting down in-between), but has not been seen yet in the field because most miniport drivers allocate a very small amount of memory (e.g. 64KB). Because the average size of a miniport's uncached extension is extremely small, the corresponding memory leak is very small and does not affect hibernating thousands of times in a row.

In my case, because my uncached extension was 42MB, the memory leak became very noticeable after several hibernations, eventually preventing subsequent hibernations (maybe around 5-10 in a row before hitting the failure). I received a 0xC000009A error (STATUS_INSUFFICIENT_RESOURCES) that hibernation could not be performed; this happened because diskdump.sys couldn't allocate enough resources for itself to get up and running.

This bug in Windows has been fixed by Microsoft and a hotfix is available for download. However, please note that if you follow this implementation, you will then have to require your customers to install this hotfix for hibernation to work. Again, this is a Microsoft bug that is only exaggerated because of the large amount of memory allocated for your uncached extension.

Hibernation / Crashdump details

Having spent numerous hours of trial-and-error discovery of the limits of hibernation and crashdump resources, I thought I'd post them here to save myself time later (and help anyone else searching for these answers). Note that this information was verified on Windows 2000 SP4, Windows XP SP2, and Windows 2003 SP1 & R2 using a ScsiPort miniport driver.

  • MaximumTransferLength: The maximum I/O size you'll get from ScsiPort during hibernation and crashdump is 4KB (PAGE_SIZE). This will change in Windows Vista to 64KB.
  • The maximum device extension size that can be allocated during hibernation and crashdump is 16KB.
  • The maximum uncached extension size that can be allocated during hibernation and crashdump is 63KB (despite the DDK documentation claiming the max is 32KB).
  • The maximum device extension size that can be allocated during normal runtime, while allowing Windows to call your miniport to hibernate or crashdump, is 900KB.
  • The maximum uncached extension size that can be allocated during normal runtime, while allowing Windows to call your miniport to hibernate or crashdump, is 900KB.
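
If you want to keep these numbers next to your code, the limits above can be written down as constants with a couple of sanity checks. This is only a sketch: the macro names, ext_request_t and both functions are my own invention, the values are simply the observed limits from the list (not documented guarantees), and the checks do not model the pointer-sized device extension workaround described earlier.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define KB(n) ((size_t)(n) * 1024u)

/* Limits observed while running in hibernation/crashdump mode. */
#define DUMP_MAX_TRANSFER      KB(4)    /* PAGE_SIZE I/O from ScsiPort */
#define DUMP_MAX_DEVICE_EXT    KB(16)
#define DUMP_MAX_UNCACHED_EXT  KB(63)

/* Limits observed at normal runtime for a miniport that must still be
 * called for hibernation or crashdump. */
#define RUNTIME_MAX_DEVICE_EXT   KB(900)
#define RUNTIME_MAX_UNCACHED_EXT KB(900)

/* Hypothetical description of what a miniport asks for. */
typedef struct {
    size_t device_ext_bytes;
    size_t uncached_ext_bytes;
} ext_request_t;

/* True if the request fits the limits seen in dump mode. */
static bool fits_dump_mode(const ext_request_t *r)
{
    assert(r != NULL);
    return r->device_ext_bytes <= DUMP_MAX_DEVICE_EXT &&
           r->uncached_ext_bytes <= DUMP_MAX_UNCACHED_EXT;
}

/* True if a runtime request stays under the limits listed above. */
static bool runtime_within_listed_limits(const ext_request_t *r)
{
    assert(r != NULL);
    return r->device_ext_bytes <= RUNTIME_MAX_DEVICE_EXT &&
           r->uncached_ext_bytes <= RUNTIME_MAX_UNCACHED_EXT;
}
```

A 42MB uncached extension fails runtime_within_listed_limits, which is exactly the situation the global-variable trick was meant to work around.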

Text Mode Debugging

With Windows Server 2003 you can enable text-mode debugging during setup by pressing the F8 key. This means that you do not have to edit the Txtsetup.sif file to enable text-mode debugging.

Enable text-mode debugging during setup

To enable text-mode debugging during setup, press F8 when the Setup program is loading, and then press F8 again when you are prompted to press the F6 function key to install third-party small computer system interface (SCSI) and host controller drivers.

Debug sections that are added to setup

When you press F8 to enable text-mode debugging during Setup, the following default debug options are added to Setup:

/debug /debugport=com2 /baudrate=19200

The computer will send debugging output to COM2 during the text mode portion of setup. If your system has only COM1, you must modify the Txtsetup.sif file for the computer to send debugging output to COM1.

To use debugging, you must have symbols. You must also have a debugger computer that has a debug cable attached.

Note: To use text-mode debugging, the COM2 communications (COM) port must be available on your computer. If COM2 is not available on your computer, and you have to enable text-mode debugging during setup, copy the setup files to your hard disk, edit the Txtsetup.sif file to address COM1, and then run the Setup program from the hard disk. Make the following modifications to the Txtsetup.sif file:


[SetupData]

OsLoadOptions = "/noguiboot /fastdetect /debug /baudrate=57600 /debugport=com1"

OsLoadOptionsVar = "/fastdetect /debug /baudrate=57600 /debugport=com1"


Note: If your computer only has COM1, you must modify the Txtsetup.sif file to connect the debugger to COM1.

Bugcheck Codes (BSOD)

Bug Check 0x1: APC_INDEX_MISMATCH

Bug Check 0x2: DEVICE_QUEUE_NOT_BUSY

Bug Check 0x3: INVALID_AFFINITY_SET

Bug Check 0x4: INVALID_DATA_ACCESS_TRAP

Bug Check 0x5: INVALID_PROCESS_ATTACH_ATTEMPT

Bug Check 0x6: INVALID_PROCESS_DETACH_ATTEMPT

Bug Check 0x7: INVALID_SOFTWARE_INTERRUPT

Bug Check 0x8: IRQL_NOT_DISPATCH_LEVEL

Bug Check 0x9: IRQL_NOT_GREATER_OR_EQUAL

Bug Check 0xA: IRQL_NOT_LESS_OR_EQUAL

Bug Check 0xB: NO_EXCEPTION_HANDLING_SUPPORT

Bug Check 0xC: MAXIMUM_WAIT_OBJECTS_EXCEEDED

Bug Check 0xD: MUTEX_LEVEL_NUMBER_VIOLATION

Bug Check 0xE: NO_USER_MODE_CONTEXT

Bug Check 0xF: SPIN_LOCK_ALREADY_OWNED

Bug Check 0x10: SPIN_LOCK_NOT_OWNED

Bug Check 0x11: THREAD_NOT_MUTEX_OWNER

Bug Check 0x12: TRAP_CAUSE_UNKNOWN

Bug Check 0x13: EMPTY_THREAD_REAPER_LIST

Bug Check 0x14: CREATE_DELETE_LOCK_NOT_LOCKED

Bug Check 0x15: LAST_CHANCE_CALLED_FROM_KMODE

Bug Check 0x16: CID_HANDLE_CREATION

Bug Check 0x17: CID_HANDLE_DELETION

Bug Check 0x18: REFERENCE_BY_POINTER

Bug Check 0x19: BAD_POOL_HEADER

Bug Check 0x1A: MEMORY_MANAGEMENT

Bug Check 0x1B: PFN_SHARE_COUNT

Bug Check 0x1C: PFN_REFERENCE_COUNT

Bug Check 0x1D: NO_SPIN_LOCK_AVAILABLE

Bug Check 0x1E: KMODE_EXCEPTION_NOT_HANDLED

Bug Check 0x1F: SHARED_RESOURCE_CONV_ERROR

Bug Check 0x20: KERNEL_APC_PENDING_DURING_EXIT

Bug Check 0x21: QUOTA_UNDERFLOW

Bug Check 0x22: FILE_SYSTEM

Bug Check 0x23: FAT_FILE_SYSTEM

Bug Check 0x24: NTFS_FILE_SYSTEM

Bug Check 0x25: NPFS_FILE_SYSTEM

Bug Check 0x26: CDFS_FILE_SYSTEM

Bug Check 0x27: RDR_FILE_SYSTEM

Bug Check 0x28: CORRUPT_ACCESS_TOKEN

Bug Check 0x29: SECURITY_SYSTEM

Bug Check 0x2A: INCONSISTENT_IRP

Bug Check 0x2B: PANIC_STACK_SWITCH

Bug Check 0x2C: PORT_DRIVER_INTERNAL

Bug Check 0x2D: SCSI_DISK_DRIVER_INTERNAL

Bug Check 0x2E: DATA_BUS_ERROR

Bug Check 0x2F: INSTRUCTION_BUS_ERROR

Bug Check 0x30: SET_OF_INVALID_CONTEXT

Bug Check 0x31: PHASE0_INITIALIZATION_FAILED

Bug Check 0x32: PHASE1_INITIALIZATION_FAILED

Bug Check 0x33: UNEXPECTED_INITIALIZATION_CALL

Bug Check 0x34: CACHE_MANAGER

Bug Check 0x35: NO_MORE_IRP_STACK_LOCATIONS

Bug Check 0x36: DEVICE_REFERENCE_COUNT_NOT_ZERO

Bug Check 0x37: FLOPPY_INTERNAL_ERROR

Bug Check 0x38: SERIAL_DRIVER_INTERNAL

Bug Check 0x39: SYSTEM_EXIT_OWNED_MUTEX

Bug Check 0x3A: SYSTEM_UNWIND_PREVIOUS_USER

Bug Check 0x3B: SYSTEM_SERVICE_EXCEPTION

Bug Check 0x3C: INTERRUPT_UNWIND_ATTEMPTED

Bug Check 0x3D: INTERRUPT_EXCEPTION_NOT_HANDLED

Bug Check 0x3E: MULTIPROCESSOR_CONFIGURATION_NOT_SUPPORTED

Bug Check 0x3F: NO_MORE_SYSTEM_PTES

Bug Check 0x40: TARGET_MDL_TOO_SMALL

Bug Check 0x41: MUST_SUCCEED_POOL_EMPTY

Bug Check 0x42: ATDISK_DRIVER_INTERNAL

Bug Check 0x43: NO_SUCH_PARTITION

Bug Check 0x44: MULTIPLE_IRP_COMPLETE_REQUESTS

Bug Check 0x45: INSUFFICIENT_SYSTEM_MAP_REGS

Bug Check 0x46: DEREF_UNKNOWN_LOGON_SESSION

Bug Check 0x47: REF_UNKNOWN_LOGON_SESSION

Bug Check 0x48: CANCEL_STATE_IN_COMPLETED_IRP

Bug Check 0x49: PAGE_FAULT_WITH_INTERRUPTS_OFF

Bug Check 0x4A: IRQL_GT_ZERO_AT_SYSTEM_SERVICE

Bug Check 0x4B: STREAMS_INTERNAL_ERROR

Bug Check 0x4C: FATAL_UNHANDLED_HARD_ERROR

Bug Check 0x4D: NO_PAGES_AVAILABLE

Bug Check 0x4E: PFN_LIST_CORRUPT

Bug Check 0x4F: NDIS_INTERNAL_ERROR

Bug Check 0x50: PAGE_FAULT_IN_NONPAGED_AREA

Bug Check 0x51: REGISTRY_ERROR

Bug Check 0x52: MAILSLOT_FILE_SYSTEM

Bug Check 0x53: NO_BOOT_DEVICE

Bug Check 0x54: LM_SERVER_INTERNAL_ERROR

Bug Check 0x55: DATA_COHERENCY_EXCEPTION

Bug Check 0x56: INSTRUCTION_COHERENCY_EXCEPTION

Bug Check 0x57: XNS_INTERNAL_ERROR

Bug Check 0x58: FTDISK_INTERNAL_ERROR

Bug Check 0x59: PINBALL_FILE_SYSTEM

Bug Check 0x5A: CRITICAL_SERVICE_FAILED

Bug Check 0x5B: SET_ENV_VAR_FAILED

Bug Check 0x5C: HAL_INITIALIZATION_FAILED

Bug Check 0x5D: UNSUPPORTED_PROCESSOR

Bug Check 0x5E: OBJECT_INITIALIZATION_FAILED

Bug Check 0x5F: SECURITY_INITIALIZATION_FAILED

Bug Check 0x60: PROCESS_INITIALIZATION_FAILED

Bug Check 0x61: HAL1_INITIALIZATION_FAILED

Bug Check 0x62: OBJECT1_INITIALIZATION_FAILED

Bug Check 0x63: SECURITY1_INITIALIZATION_FAILED

Bug Check 0x64: SYMBOLIC_INITIALIZATION_FAILED

Bug Check 0x65: MEMORY1_INITIALIZATION_FAILED

Bug Check 0x66: CACHE_INITIALIZATION_FAILED

Bug Check 0x67: CONFIG_INITIALIZATION_FAILED

Bug Check 0x68: FILE_INITIALIZATION_FAILED

Bug Check 0x69: IO1_INITIALIZATION_FAILED

Bug Check 0x6A: LPC_INITIALIZATION_FAILED

Bug Check 0x6B: PROCESS1_INITIALIZATION_FAILED

Bug Check 0x6C: REFMON_INITIALIZATION_FAILED

Bug Check 0x6D: SESSION1_INITIALIZATION_FAILED

Bug Check 0x6E: SESSION2_INITIALIZATION_FAILED

Bug Check 0x6F: SESSION3_INITIALIZATION_FAILED

Bug Check 0x70: SESSION4_INITIALIZATION_FAILED

Bug Check 0x71: SESSION5_INITIALIZATION_FAILED

Bug Check 0x72: ASSIGN_DRIVE_LETTERS_FAILED

Bug Check 0x73: CONFIG_LIST_FAILED

Bug Check 0x74: BAD_SYSTEM_CONFIG_INFO

Bug Check 0x75: CANNOT_WRITE_CONFIGURATION

Bug Check 0x76: PROCESS_HAS_LOCKED_PAGES

Bug Check 0x77: KERNEL_STACK_INPAGE_ERROR

Bug Check 0x78: PHASE0_EXCEPTION

Bug Check 0x79: MISMATCHED_HAL

Bug Check 0x7A: KERNEL_DATA_INPAGE_ERROR

Bug Check 0x7B: INACCESSIBLE_BOOT_DEVICE

Bug Check 0x7C: BUGCODE_NDIS_DRIVER

Bug Check 0x7D: INSTALL_MORE_MEMORY

Bug Check 0x7E: SYSTEM_THREAD_EXCEPTION_NOT_HANDLED

Bug Check 0x7F: UNEXPECTED_KERNEL_MODE_TRAP

Bug Check 0x80: NMI_HARDWARE_FAILURE

Bug Check 0x81: SPIN_LOCK_INIT_FAILURE

Bug Check 0x82: DFS_FILE_SYSTEM

Bug Check 0x85: SETUP_FAILURE

Bug Check 0x8B: MBR_CHECKSUM_MISMATCH

Bug Check 0x8E: KERNEL_MODE_EXCEPTION_NOT_HANDLED

Bug Check 0x8F: PP0_INITIALIZATION_FAILED

Bug Check 0x90: PP1_INITIALIZATION_FAILED

Bug Check 0x92: UP_DRIVER_ON_MP_SYSTEM

Bug Check 0x93: INVALID_KERNEL_HANDLE

Bug Check 0x94: KERNEL_STACK_LOCKED_AT_EXIT

Bug Check 0x96: INVALID_WORK_QUEUE_ITEM

Bug Check 0x97: BOUND_IMAGE_UNSUPPORTED

Bug Check 0x98: END_OF_NT_EVALUATION_PERIOD

Bug Check 0x99: INVALID_REGION_OR_SEGMENT

Bug Check 0x9A: SYSTEM_LICENSE_VIOLATION

Bug Check 0x9B: UDFS_FILE_SYSTEM

Bug Check 0x9C: MACHINE_CHECK_EXCEPTION

Bug Check 0x9F: DRIVER_POWER_STATE_FAILURE

Bug Check 0xA0: INTERNAL_POWER_ERROR

Bug Check 0xA1: PCI_BUS_DRIVER_INTERNAL

Bug Check 0xA2: MEMORY_IMAGE_CORRUPT

Bug Check 0xA3: ACPI_DRIVER_INTERNAL

Bug Check 0xA4: CNSS_FILE_SYSTEM_FILTER

Bug Check 0xA5: ACPI_BIOS_ERROR

Bug Check 0xA7: BAD_EXHANDLE

Bug Check 0xAB: SESSION_HAS_VALID_POOL_ON_EXIT

Bug Check 0xAC: HAL_MEMORY_ALLOCATION

Bug Check 0xB4: VIDEO_DRIVER_INIT_FAILURE

Bug Check 0xB8: ATTEMPTED_SWITCH_FROM_DPC

Bug Check 0xB9: CHIPSET_DETECTED_ERROR

Bug Check 0xBA: SESSION_HAS_VALID_VIEWS_ON_EXIT

Bug Check 0xBB: NETWORK_BOOT_INITIALIZATION_FAILED

Bug Check 0xBC: NETWORK_BOOT_DUPLICATE_ADDRESS

Bug Check 0xBE: ATTEMPTED_WRITE_TO_READONLY_MEMORY

Bug Check 0xBF: MUTEX_ALREADY_OWNED

Bug Check 0xC1: SPECIAL_POOL_DETECTED_MEMORY_CORRUPTION

Bug Check 0xC2: BAD_POOL_CALLER

Bug Check 0xC4: DRIVER_VERIFIER_DETECTED_VIOLATION

Bug Check 0xC5: DRIVER_CORRUPTED_EXPOOL

Bug Check 0xC6: DRIVER_CAUGHT_MODIFYING_FREED_POOL

Bug Check 0xC7: TIMER_OR_DPC_INVALID

Bug Check 0xC8: IRQL_UNEXPECTED_VALUE

Bug Check 0xC9: DRIVER_VERIFIER_IOMANAGER_VIOLATION

Bug Check 0xCA: PNP_DETECTED_FATAL_ERROR

Bug Check 0xCB: DRIVER_LEFT_LOCKED_PAGES_IN_PROCESS

Bug Check 0xCC: PAGE_FAULT_IN_FREED_SPECIAL_POOL

Bug Check 0xCD: PAGE_FAULT_BEYOND_END_OF_ALLOCATION

Bug Check 0xCE: DRIVER_UNLOADED_WITHOUT_CANCELLING_PENDING_OPERATIONS

Bug Check 0xCF: TERMINAL_SERVER_DRIVER_MADE_INCORRECT_MEMORY_REFERENCE

Bug Check 0xD0: DRIVER_CORRUPTED_MMPOOL

Bug Check 0xD1: DRIVER_IRQL_NOT_LESS_OR_EQUAL

Bug Check 0xD2: BUGCODE_ID_DRIVER

Bug Check 0xD3: DRIVER_PORTION_MUST_BE_NONPAGED

Bug Check 0xD4: SYSTEM_SCAN_AT_RAISED_IRQL_CAUGHT_IMPROPER_DRIVER_UNLOAD

Bug Check 0xD5: DRIVER_PAGE_FAULT_IN_FREED_SPECIAL_POOL

Bug Check 0xD6: DRIVER_PAGE_FAULT_BEYOND_END_OF_ALLOCATION

Bug Check 0xD7: DRIVER_UNMAPPING_INVALID_VIEW

Bug Check 0xD8: DRIVER_USED_EXCESSIVE_PTES

Bug Check 0xD9: LOCKED_PAGES_TRACKER_CORRUPTION

Bug Check 0xDA: SYSTEM_PTE_MISUSE

Bug Check 0xDB: DRIVER_CORRUPTED_SYSPTES

Bug Check 0xDC: DRIVER_INVALID_STACK_ACCESS

Bug Check 0xDE: POOL_CORRUPTION_IN_FILE_AREA

Bug Check 0xDF: IMPERSONATING_WORKER_THREAD

Bug Check 0xE0: ACPI_BIOS_FATAL_ERROR

Bug Check 0xE1: WORKER_THREAD_RETURNED_AT_BAD_IRQL

Bug Check 0xE2: MANUALLY_INITIATED_CRASH

Bug Check 0xE3: RESOURCE_NOT_OWNED

Bug Check 0xE4: WORKER_INVALID

Bug Check 0xE6: DRIVER_VERIFIER_DMA_VIOLATION

Bug Check 0xE7: INVALID_FLOATING_POINT_STATE

Bug Check 0xE8: INVALID_CANCEL_OF_FILE_OPEN

Bug Check 0xE9: ACTIVE_EX_WORKER_THREAD_TERMINATION

Bug Check 0xEA: THREAD_STUCK_IN_DEVICE_DRIVER

Bug Check 0xEB: DIRTY_MAPPED_PAGES_CONGESTION

Bug Check 0xEC: SESSION_HAS_VALID_SPECIAL_POOL_ON_EXIT

Bug Check 0xED: UNMOUNTABLE_BOOT_VOLUME

Bug Check 0xEF: CRITICAL_PROCESS_DIED

Bug Check 0xF1: SCSI_VERIFIER_DETECTED_VIOLATION

Bug Check 0xF3: DISORDERLY_SHUTDOWN

Bug Check 0xF4: CRITICAL_OBJECT_TERMINATION

Bug Check 0xF5: FLTMGR_FILE_SYSTEM

Bug Check 0xF6: PCI_VERIFIER_DETECTED_VIOLATION

Bug Check 0xF7: DRIVER_OVERRAN_STACK_BUFFER

Bug Check 0xF8: RAMDISK_BOOT_INITIALIZATION_FAILED

Bug Check 0xF9: DRIVER_RETURNED_STATUS_REPARSE_FOR_VOLUME_OPEN

Bug Check 0xFE: BUGCODE_USB_DRIVER

Bug Check 0x1000007E: SYSTEM_THREAD_EXCEPTION_NOT_HANDLED_M

Bug Check 0x1000007F: UNEXPECTED_KERNEL_MODE_TRAP_M

Bug Check 0x1000008E: KERNEL_MODE_EXCEPTION_NOT_HANDLED_M

Bug Check 0x100000EA: THREAD_STUCK_IN_DEVICE_DRIVER_M

Bug Check 0xC0000218: STATUS_CANNOT_LOAD_REGISTRY_FILE

Bug Check 0xC000021A: STATUS_SYSTEM_PROCESS_TERMINATED

Bug Check 0xC0000221: STATUS_IMAGE_CHECKSUM_MISMATCH

Bug Check 0xDEADDEAD: MANUALLY_INITIATED_CRASH1