Introduction - Linux Kernel Crash Dump
The Linux Kernel Crash Dump (LKCD) project is designed to meet the needs
of customers and system administrators wanting a reliable method of
detecting, saving and examining system crashes. While more mature
operating systems have provided these capabilities by default for years,
Linux has yet to evolve to such a state. LKCD is an attempt to move
Linux towards greater supportability.
Kernel Crash Dump Requires Four Components:
- Kernel Support:
Kernel code for configuring dump parameters, catching error
conditions, and executing a kernel memory dump. Kernel.org
kernels need to be patched with the LKCD dump modules.
- Dump Configuration:
User-level applications that configure and enable crash dumps,
together with the system scripts needed to integrate LKCD into
the operating system.
- Dump Recovery:
User-level commands to retrieve a dump saved by the kernel and
transfer it to a user-accessible location.
- Dump Analysis:
A debugger that can operate on the saved dump
image. The lkcdutils package provides the lcrash command for
dump analysis.
LKCD provides all of the components (kernel and user-level code)
designed to:
- Save the kernel memory image when the system dies due to a
software failure;
- Recover the kernel memory image when the system is rebooted;
- Analyze the memory image to determine what happened when the
failure occurred.
The memory image is written to a dump device, which is one of the
disk partitions on the system. Once the system boots back up, and
before the swap partitions are activated, the dump is recovered
with an application called lcrash (Linux Crash). A report is
generated and saved in /var/log/dump.
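
As a rough illustration of the last step, the following Python sketch
lists whatever has been saved under /var/log/dump, newest first. It is
not part of LKCD: the function name list_dump_entries is hypothetical,
and the sketch makes no assumption about how LKCD names the files or
directories it writes there.

```python
import os
import sys


def list_dump_entries(dump_dir="/var/log/dump"):
    """Return the paths found directly under dump_dir, newest first.

    Hypothetical helper, not an LKCD tool. Returns an empty list if
    the directory does not exist (for example, no dump was ever saved).
    """
    if not os.path.isdir(dump_dir):
        return []
    paths = [os.path.join(dump_dir, name) for name in os.listdir(dump_dir)]
    # Sort by modification time so the most recent entry comes first.
    return sorted(paths, key=os.path.getmtime, reverse=True)


if __name__ == "__main__":
    # An alternative directory may be given as the first argument.
    target = sys.argv[1] if len(sys.argv) > 1 else "/var/log/dump"
    entries = list_dump_entries(target)
    if not entries:
        print("no saved dumps found in", target)
    for path in entries:
        print(path)
```

Run after a reboot that followed a crash, the script gives a quick
check that recovery produced something to examine.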