Bug#629253: linux-image-2.6.32-5-amd64: Squeeze amd64 PV DOMU live migration fails
Package: linux-2.6
Version: 2.6.32-34
Severity: normal
Hi,
First, I must point out:
For me, Squeeze is far from being the best Debian ever.
I'm really disappointed with the quality of Debian Squeeze as a server OS.
I began testing when Squeeze became stable. Too many problems. Far too much time was spent on testing, finding errors, and fixing or working around them (my boss really loves me now for the work that did not get done during the last few months).
I don't know where to raise these problems at a higher level, because they are more general. Is there a contact in the "Debian management" whom I can ask for more stability and completeness of Debian rather than new features (possibly fine for a desktop-only distribution)? An answer to this question would be nice!
Our business must have all security fixes within 3 months after release.
We also run 24/7 web services, with servers mostly clustered, where every minute of downtime really costs money (if the downtime lasts too long, it threatens our existence).
We can't stay on Lenny for very long. EOL is approaching.
About half a year after Squeeze became stable, it is still far from being ready for our production servers.
That's why we have decided, for now, to go with a RHEL clone for our newly installed production servers.
I will keep an eye on Debian. Once the quality comes back... who knows.
A few lines further down I describe the problems I had testing Squeeze as XEN DOM0.
My install systems for Windows (OPSI), our Solaris servers (Jumpstart), and RHEL-based distros (Kickstart) still reside on Debian Squeeze.
Moving them to another platform is a lot of work. That's why it would still be nice to have a live-migratable Debian Squeeze. Here we go:
Sometimes live migration (lm) works, but after migration the machine is no longer reachable over the network.
Somebody mentioned that a cronjob generating traffic on the vnet device would help keep the machine reachable over the network.
Mostly, lm crashes during migration.
I tested all versions from 2.6.32-30 to 2.6.32-34.
Dom0 is: choose one. I tested:
XCP 1.0
OpenSuse 11.3 with Xen 4.0.1
Debian Squeeze (4.0.1)
Squeeze worked worst as DOM0 and should not be advertised as a working enterprise solution for XEN virtualization:
- crashes our Dell R7xx servers when using multipath + iSCSI or only iSCSI
- live migration (with a working OS, e.g. 2008R2, as DOMU) works only 5-6 times, then it crashes.
e.g. scripted live migration of a 2008R2 and a Lenny DOMU (at the same time) between 2 nodes worked for 2 days
(DOMUs reachable all the time) with XCP 1.0 (about 230 live migrations per DOMU), then we stopped testing
- xend freezes
- random reboots of DOM0 for other reasons that could not be identified
- lots of scary error messages as uptime grows
But back to the live migration problem of Debian Squeeze DOMUs:
DomUs are installed with FAI.
Lenny with a similar install works just fine when:
/proc/cmdline: ... clocksource=jiffies
and
/etc/sysctl.conf: ...
xen_independent_wallclock=1
....
Live migration also works fine with:
Win2008R2
WinXP SP3
ncp 3.0.1
RHEL(PUIAS clone) 5.6
RHEL(PUIAS clone) 6.1
I assume the problem is the Squeeze kernel.
-- Package-specific info:
** Version:
Linux version 2.6.32-5-amd64 (Debian 2.6.32-34)
Kernel: Linux 2.6.32-5-amd64 (SMP w/2 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Versions of packages linux-image-2.6.32-5-amd64 depends on:
ii debconf [debconf-2.0] 1.5.36.1 Debian configuration management sy
ii initramfs-tools [linux-initra 0.98.8 tools for generating an initramfs
ii linux-base 2.6.32-31 Linux image base package
ii module-init-tools 3.12-1 tools for managing Linux kernel mo
Versions of packages linux-image-2.6.32-5-amd64 recommends:
ii firmware-linux-free 2.6.32-31 Binary firmware for various driver
Versions of packages linux-image-2.6.32-5-amd64 suggests:
pn grub | lilo <none> (no description available)
pn linux-doc-2.6.32 <none> (no description available)
Versions of packages linux-image-2.6.32-5-amd64 is related to:
ii firmware-bnx2 0.28 Binary firmware for Broadcom NetXt
ii firmware-bnx2x 0.28 Binary firmware for Broadcom NetXt
pn firmware-ipw2x00 <none> (no description available)
pn firmware-ivtv <none> (no description available)
pn firmware-iwlwifi <none> (no description available)
ii firmware-linux 0.28 Binary firmware for various driver
ii firmware-linux-nonfree 0.28 Binary firmware for various driver
ii firmware-qlogic 0.28 Binary firmware for QLogic IBA7220
pn firmware-ralink <none> (no description available)
pn xen-hypervisor <none> (no description available)
--
To UNSUBSCRIBE, email to debian-kernel-REQUEST@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org
Archive: http://lists.debian.org/20110604230209.2330.23569.reportbug@hpedebsv12.hpe.hoonet.org
06-05-2011, 10:54 AM
Bastian Blank
Bug#629253: linux-image-2.6.32-5-amd64: Squeeze amd64 PV DOMU live migration fails
On Sun, Jun 05, 2011 at 01:02:09AM +0200, Holger Fischer wrote:
> For me Squeeze is far away from being the best Debian ever.
A _lot_ of people think differently. Please move this to debian-devel. I
will not answer this here.
> Sometimes lm works but after migration machine is not reachable through network anymore.
More information, please: the output of "xm network-list --long $domain" before and after the migration,
information about the network setup, and the kernel log from the dom0.
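The before/after capture requested above could be scripted roughly as follows. This is only a sketch: the function name capture_netlist, the XM and OUTDIR variables, and the output file naming are assumptions made for illustration and are not part of the original request.

```shell
# Sketch: save the output of "xm network-list --long" for one domain,
# tagged with a phase ("before" or "after" the migration).
# XM and OUTDIR are hypothetical overrides, not standard Xen settings.
XM="${XM:-xm}"
OUTDIR="${OUTDIR:-.}"

capture_netlist() {
    domain="$1"
    phase="$2"
    # Store stdout and stderr so error messages are kept as well.
    "$XM" network-list --long "$domain" > "$OUTDIR/netlist-$domain-$phase.txt" 2>&1
}
```

Calling capture_netlist "$domain" before on the source dom0 and capture_netlist "$domain" after on the target dom0 would give two files that can be compared with diff and attached to the bug together with the dom0 kernel log.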
> Mostly lm crashes during migration.
Messages?
> Squeeze worked most bad as DOM0 and should not be advertised as a working enterprise solution for XEN virtualization:
> - crashes our dell r7xx servers when using multipath + iscsi or only iscsi
iSCSI and enterprise do not match.
> - live migration (with a working os's(2008R2 f.e.) as DOMU ) works only 5-6 times then it crashes.
Windows does not support PV, so no live migration.
Bastian
--
Another dream that failed. There's nothing sadder.
-- Kirk, "This side of Paradise", stardate 3417.3
--
Archive: http://lists.debian.org/20110605105440.GA5849@wavehammer.waldi.eu.org
06-05-2011, 12:06 PM
Holger Fischer
Bug#629253: linux-image-2.6.32-5-amd64: Squeeze amd64 PV DOMU live migration fails
Hi Bastian,
Thank you for your quick answer. I have split this email into 2 parts. Things belonging to the bug I am sending to 629253@bugs.debian.org. The answers to the rest I have sent to you.
.
.
.
>
>> Sometimes lm works but after migration machine is not reachable through network anymore.
>
> More information. "xm network-list --long $domain" before and after.
> Informations about the network setup. Kernel log from the dom0.
I will run these tests, but I have a lot of projects with higher priority at the moment. So please give me some time for that.
>
>> Mostly lm crashes during migration.
>
> Messages?
See above.
.
.
.
Kind regards
Dipl.-Ing.(FH) Holger Fischer
--
Archive: http://lists.debian.org/4DEB715F.9020001@web.de
06-05-2011, 12:44 PM
Holger Fischer
Bug#629253: linux-image-2.6.32-5-amd64: Squeeze amd64 PV DOMU live migration fails
Hi,
I can't provide test results, but other people have the same problem and a bit more info.
--
Archive: http://lists.debian.org/4DEB7A45.9070900@web.de
06-05-2011, 01:22 PM
Holger Fischer
Bug#629253: linux-image-2.6.32-5-amd64: Squeeze amd64 PV DOMU live migration fails
Hi Basti, dear maintainers,
Google "debian squeeze live migration xenserver" or "debian squeeze "live migration" citrix".
The first results are not bad.
Read through them and you will get an idea (a few people have ideas on how to fix it).
I tried googling for "squeeze domu live migration", but that brought me too many results for Squeeze as DOM0.
Possibly "debian squeeze live migration "xen cloud platform"" or "..."xcp"" or "..."oraclevm"" or "..."sles"" will give you some good results too.
E.g. http://forums.citrix.com/thread.jspa?threadID=281439&tstart=0 -> post 13
Using 2.6.38 from backports is not an option. I'm not allowed to use an unmaintained kernel.
I remember that I tested 2.6.38 and recall that syslog-ng was not working/segfaulting.
I believe that if you test live migration yourself on your test HW with a Squeeze DOMU (amd64) on a DOM0 of your choice, you'll get similar results.
Kind regards
Dipl.-Ing.(FH) Holger Fischer
--
Archive: http://lists.debian.org/4DEB82FF.6070401@web.de
06-07-2011, 08:42 AM
Ian Campbell
Bug#629253: linux-image-2.6.32-5-amd64: Squeeze amd64 PV DOMU live migration fails
On Sun, 2011-06-05 at 01:02 +0200, Holger Fischer wrote:
>
> Sometimes lm works but after migration machine is not reachable
> through network anymore.
> Somebody mentioned a cronjob with traffic on vnet-device would help
> keeping machine reachable through network.
You probably need to set the arp_notify sysctl on the net device inside
the guest. This was fixed upstream and backported into the 2.6.32.32
longterm release which is in the Debian 2.6.32-31 package which was
recently added to Squeeze in a stable update.
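As a rough illustration of the setting mentioned above, the sysctl can be set per interface. The interface name eth0 and the helper arp_notify_line are assumptions for this sketch; the key net.ipv4.conf.<iface>.arp_notify is the standard per-device IPv4 sysctl.

```shell
# Hypothetical helper: print the sysctl.conf line that enables arp_notify
# for one network interface inside the guest (default eth0 is an assumption).
arp_notify_line() {
    iface="${1:-eth0}"
    printf 'net.ipv4.conf.%s.arp_notify = 1\n' "$iface"
}

# Print the line for eth0; it could be appended to /etc/sysctl.conf in the guest.
arp_notify_line eth0
```

To apply the setting without a reboot, running "sysctl -w net.ipv4.conf.eth0.arp_notify=1" as root inside the guest should have the same effect until the next restart.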
Ian.
--
Ian Campbell
Current Noise: Godflesh - Sterile Prophet (Version)
Q: How many mathematicians does it take to screw in a light bulb?
A: One. He gives it to six Californians, thereby reducing the problem
to the earlier joke.
--
Archive: http://lists.debian.org/1307436152.775.493.camel@zakaz.uk.xensource.com