06-09-2011, 01:02 AM
Ben Hutchings

Bug#629865: xen-linux-system-2.6.26-2-xen-amd64 causes system crash when using aacraid driver

On Wed, 2011-06-08 at 19:14 -0500, Tim Vaillancourt wrote:
> Package: xen-linux-system-2.6.26-2-xen-amd64
> Version: 2.6.26-26lenny1
> Severity: critical
> Justification: breaks the whole system
>
>
> Similar to: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=596419,
> we have several Debian lenny systems running Xen that crash at the
> aacraid driver ('drivers/scsi/aacraid/aachba.c:2825') when using the
> 2.6.26-26lenny1 version of the kernel. In our situation, this will
> happen every few days, maybe once a week while we run an IO/CPU heavy
> backup using 'duplicity'.
[...]

If this driver is ever unable to set up DMA then it crashes. This means
it's broken in any system with an IOMMU (including Xen's software
IOMMU).

The same bug exists in mainline Linux today.
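
The failure mode described above can be illustrated outside the kernel. What follows is a minimal user-space sketch in C, not the aacraid code: `map_for_dma`, `queue_command`, `complete_command`, the status codes and the slot pool are all hypothetical names invented for illustration. It only shows the general idea of reporting a failed DMA mapping to the caller so the command can be retried, instead of treating it as fatal. A real SCSI driver might return a busy status such as `SCSI_MLQUEUE_HOST_BUSY` from its queuecommand handler; whether that is the right fix for aacraid is not established in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status codes for this sketch; not the kernel's values. */
enum queue_status { QUEUE_OK, QUEUE_RETRY_LATER };

struct command {
    size_t len;         /* bytes to transfer */
    int    mapped_segs; /* segments currently mapped for DMA */
};

/* Toy pool of mapping slots, standing in for limited IOMMU or
 * bounce-buffer space. One slot covers 4096 bytes. */
static int free_map_slots = 2;

/* Hypothetical mapping call: returns the number of segments mapped,
 * or -1 when the pool cannot satisfy the request. */
static int map_for_dma(const struct command *cmd)
{
    int needed = (int)((cmd->len + 4095) / 4096);

    if (needed > free_map_slots)
        return -1;
    free_map_slots -= needed;
    return needed;
}

/* Submit a command. A mapping failure is reported to the caller
 * so the command can be retried, rather than treated as fatal. */
enum queue_status queue_command(struct command *cmd)
{
    int nseg;

    assert(cmd != NULL); /* a caller bug, not a runtime condition */

    nseg = map_for_dma(cmd);
    if (nseg < 0) {
        cmd->mapped_segs = 0;
        return QUEUE_RETRY_LATER;
    }
    cmd->mapped_segs = nseg;
    return QUEUE_OK;
}

/* Completion releases the mapping so later commands can proceed. */
void complete_command(struct command *cmd)
{
    free_map_slots += cmd->mapped_segs;
    cmd->mapped_segs = 0;
}
```

With two slots free, an 8192-byte command maps successfully, a following 4096-byte command gets `QUEUE_RETRY_LATER`, and the same command succeeds once the first one has completed and released its slots.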

Ben.

--
Ben Hutchings
Lowery's Law:
If it jams, force it. If it breaks, it needed replacing anyway.
 
06-09-2011, 02:33 AM
Ben Hutchings
 

On Wed, 2011-06-08 at 18:27 -0700, Tim Vaillancourt wrote:
> After some more thought, I understand the DMA and IOMMU portions of
> this issue, and why DMA cannot be set up. What I don't understand is
> why this would cause a loss of the aacraid driver and a system crash.
> Shouldn't this be expected? If not, what is not doing its job
> correctly?

The aacraid driver needs to handle the error, but it does not.

> I guess what I am looking for is a way to run Xen in a stable way. It
> seems wrong to me that this brings my system down, even if it is a
> mainline bug.

It is wrong. This driver won't be reliable with Xen.

> Another side to this issue is we have this occur only on less than 10%
> of our identical systems. Any idea why that would be?

The recent change in the way you run backups presumably can cause a
sudden increase in memory usage, whereas on other systems this never
happens. Disk drivers (among others) cannot wait for data to be swapped
out when they try to allocate memory, because they may themselves be used for
swapping. Generally the kernel tries to ensure there is some physical
memory available for immediate allocation, by swapping out data early.
But if there is a sudden increase in memory usage then the kernel may
not start swapping soon enough to avoid an allocation failure in the
disk driver.
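
The timing argument above can be made concrete with a toy model. This is only a sketch under simplifying assumptions, not how the kernel's allocator is implemented: the structure, the function `count_driver_failures`, the single watermark and every number used with it are hypothetical. Each tick, ordinary users take pages, the driver then attempts an allocation that cannot wait, and background reclaim frees pages only once the free level has already dropped below the watermark.

```c
#include <assert.h>
#include <stddef.h>

struct mem_model {
    long free_pages;       /* pages free right now */
    long low_watermark;    /* background reclaim runs below this */
    long reclaim_per_tick; /* pages reclaim can free in one tick */
};

/* Count how often the driver's non-waiting allocation fails over a
 * sequence of per-tick demands from ordinary memory users. */
int count_driver_failures(struct mem_model m, const long *demand,
                          int ticks, long driver_pages)
{
    int failures = 0;
    int t;

    assert(demand != NULL && ticks >= 0);

    for (t = 0; t < ticks; t++) {
        /* Ordinary users take what they ask for, up to what is free. */
        long taken = demand[t] < m.free_pages ? demand[t] : m.free_pages;
        m.free_pages -= taken;

        /* The driver cannot wait for reclaim: it succeeds only if the
         * pages are free now. Its buffer is returned immediately to
         * keep the model small. */
        if (driver_pages > m.free_pages)
            failures++;

        /* Reclaim reacts only after the level has dropped. */
        if (m.free_pages < m.low_watermark)
            m.free_pages += m.reclaim_per_tick;
    }
    return failures;
}
```

Starting from 100 free pages with a watermark of 50 and 20 pages reclaimed per tick, a steady demand of 10 pages per tick never makes a 4-page driver allocation fail, because reclaim keeps up. A demand sequence of 10, 10, 95, 10 empties the pool on the third tick before reclaim has run, and the driver's allocation fails once.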

Ben.

 
06-09-2011, 03:26 AM
Ben Hutchings
 

On Wed, 2011-06-08 at 20:16 -0700, Tim Vaillancourt wrote:
> Fantastic. I really appreciate your reply, and that clarifies things
> wonderfully.
>
> We've recently increased the VM count on the Xen servers we have this
> issue on, but we (still) run the duplicity backups from the Dom0 on one
> VM at a time (backup load theoretically the same as before at any one
> time), so I'm thinking now the increased VM count (and thus IO) causes
> this to happen while duplicity is running at the same time in the Dom0.
>
> That is really too bad, because IO-wise we aren't even pushing the disks
> that hard, but I guess we are asking it to do a lot of things at once.
>
> Again, thank you for the reply. By chance, do you have any links to the
> mainstream bug you mentioned that is already open?

There wasn't any bug report; I just compared the code between 2.6.26 and
current development. My reply to you also went to the aacraid
developers and the SCSI mailing list.

Please reply-to-all.

Ben.

 
06-09-2011, 03:28 AM
Tim Vaillancourt
 

Understood. Thanks again for clarifying, Ben.

PS: Whoops on Reply-to-All. Better late than never!

Cheers,

Tim
