For some time now, some of my dom0s on lenny/i386 have been rebooting spontaneously, out of the blue (others are not affected). As a result, I started rolling out linux-image-2.6.32-bpo.5-xen-686 from bpo, together with xen and xen-common backported from testing:
# dpkg -l | grep xen | grep bpo
ii libxenstore3.0 4.0.1-1~bpo50+1 Xenstore communications library for Xen
ii linux-image-2.6.32-bpo.5-xen-686 2.6.32-26~bpo50+1 Linux 2.6.32 for modern PCs, Xen dom0 suppor
ii xen-hypervisor-4.0-i386 4.0.1-1~bpo50+1 The Xen Hypervisor on i386
ii xen-linux-system-2.6.32-bpo.5-xen-686 2.6.32-26~bpo50+1 Xen system with Linux 2.6.32 on modern PCs (
ii xen-tools 4.1-1~bpo50+1 Tools to manage Debian XEN virtual servers
ii xen-utils-4.0 4.0.1-1~bpo50+1 XEN administrative tools
ii xen-utils-common 4.0.0-1~bpo50+1 XEN administrative tools - common files
ii xenstore-utils 4.0.1-1~bpo50+1 Xenstore utilities for Xen
This fixed the spontaneous reboots on all of the systems I have migrated... until now! Last week I tried the same solution on my private system, and there the boot now hangs, as described below.
A full dump of the syslog is also available.
It looks like /etc/rcS.d/S03udev is hanging during boot. The relevant part of the syslog looks like this:
Oct 29 20:51:07 mordor kernel: [ 17.728120] Code: 04 00 00 00 00 83 3c 24 10 74 32 77 0c 83 3c 24 08 0f 85 9f 00 00 00 eb 12 83 3c 24 20 74 30 83 3c 24 40 0f 85 8d 00 00 00 eb 35 <0f> b6 01 8b 54 24 20 89 02 c7 42 04 00 00 00 00 eb 79 0f b7 01
Oct 29 20:51:07 mordor kernel: [ 17.731554] EIP: [] acpi_ex_system_memory_space_handler+0x189/0x221 SS:ESP 0069:ec135d74
Oct 29 20:51:07 mordor kernel: [ 17.731791] CR2: 00000000eda86000
Oct 29 20:51:07 mordor kernel: [ 17.731881] ---[ end trace 7b8af581772a1c55 ]---
Oct 29 20:51:07 mordor kernel: [ 18.116622] input: PC Speaker as /devices/platform/pcspkr/input/input6
Oct 29 20:51:07 mordor kernel: [ 18.427607] Error: Driver 'pcspkr' is already registered, aborting...
Oct 29 20:51:07 mordor kernel: [ 232.381814] Adding 10000452k swap on /dev/sda2. Priority:-1 extents:1 across:10000452k
The dom0 hangs at that point. I don't have a reliable way to make the system proceed from there; usually I have to wait a few seconds, and pressing ^C after a while gets the box up.
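The length of the hang can be read off the kernel timestamps in the excerpt above. As a rough sketch (the names `STAMP` and `largest_gap` are made up for illustration, and the regular expression assumes the `kernel: [ seconds.micros]` layout shown in these lines), something like the following would find the largest jump between consecutive kernel messages:

```python
import re

# Matches the "[ seconds.micros]" kernel timestamp in a syslog line.
# Assumption: the line layout is the one shown in the excerpt above.
STAMP = re.compile(r"kernel: \[\s*(\d+\.\d+)\]")


def largest_gap(lines):
    """Return (gap_seconds, line_before, line_after) for the biggest jump."""
    best = (0.0, None, None)
    prev_time, prev_line = None, None
    for line in lines:
        match = STAMP.search(line)
        if not match:
            continue  # skip lines without a kernel timestamp
        now = float(match.group(1))
        if prev_time is not None and now - prev_time > best[0]:
            best = (now - prev_time, prev_line, line)
        prev_time, prev_line = now, line
    return best


# The last four lines of the excerpt above, copied verbatim.
sample = [
    "Oct 29 20:51:07 mordor kernel: [ 17.731881] ---[ end trace 7b8af581772a1c55 ]---",
    "Oct 29 20:51:07 mordor kernel: [ 18.116622] input: PC Speaker as /devices/platform/pcspkr/input/input6",
    "Oct 29 20:51:07 mordor kernel: [ 18.427607] Error: Driver 'pcspkr' is already registered, aborting...",
    "Oct 29 20:51:07 mordor kernel: [ 232.381814] Adding 10000452k swap on /dev/sda2. Priority:-1 extents:1 across:10000452k",
]

gap, before, after = largest_gap(sample)
print(f"largest gap: {gap:.1f}s")
print("before:", before)
print("after: ", after)
```

On the excerpt, the largest jump is between the pcspkr message and the swap activation, which matches where the boot appears to stall.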
Anyway, I remember there are some problems with udev >= 150 and older kernels, which are also documented in the squeeze release notes. Maybe there is also a problem with older udev and newer kernels? But then why does this combination work on other systems? Booting linux-image-2.6.32-bpo.5-686 works without any issue, so it seems to be Xen-related. Any ideas are appreciated!
UPDATE: I've also reported this via #602109.