02-19-2012, 07:21 PM
Thomas Goirand
 
Bug#660554: Kernel crashes at boot in drivers/dma/ioat/dma_v2.c:163 when running under Xen 4.1

Package: linux-image-3.2.0-1-686-pae
Version: 3.2.6-1
Severity: important

Hi,

When running kernel 3.2 with Xen 4.1 in SID, my development server just
crashes at boot time. When running with Linux 3.1 (which I currently do
so that I can continue the packaging work of XCP), there's no problem,
and the server boots up to the login.

It took me quite some time (it really isn't trivial to set up, btw),
but I now have a serial console dump of the crash (see below).

Since it seems to be a driver issue, let me know if you need a dump of
lspci or something similar. Note that my server is a Supermicro with an
X8-STi-F motherboard, built entirely around an Intel chipset, so this
should be quite common hardware, especially for servers.

Cheers,

Thomas Goirand (zigo)

[ 8.771014] kernel BUG at /build/buildd-linux-2.6_3.2.6-1-i386-Ea61XL/linux-2.6-3.2.6/debian/build/source_i386_none/drivers/dma/ioat/dma_v2.c:163!
[ 8.771023] invalid opcode: 0000 [#1] SMP
[ 8.771031] Modules linked in: ioatdma(+) i2c_i801 serio_raw pcspkr dca i7core_edac i2c_core joydev evdev processor button edac_core iTCO_wdt iTCO_vendor_support thermal_sys ext3 jbd mbcache dm_mod raid1
[ 8.771131]
[ 8.771136] Pid: 0, comm: swapper/0 Not tainted 3.2.0-1-686-pae #1 Supermicro X8STi/X8STi
[ 8.771148] EIP: 0061:[<f74df33d>] EFLAGS: 00010246 CPU: 0
[ 8.771156] EIP is at __cleanup+0xd6/0x113 [ioatdma]
[ 8.771161] EAX: 00000000 EBX: c2d534c0 ECX: c2ea5e04 EDX: c2ea0204
[ 8.771167] ESI: c2f7dccc EDI: 00020002 EBP: 00000002 ESP: d840bfa0
[ 8.771172] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0069
[ 8.771178] Process swapper/0 (pid: 0, ti=d840a000 task=c13d8fe0 task.ti=c13d2000)
[ 8.771183] Stack:
[ 8.771187] 211cc040 00000002 00010002 c2f7dccc c2f7dcfc 00000000 00000100 f74df3a0
[ 8.771206] 211cc040 c2f7dd4c c2f7dd50 c103c7c7 00000006 c13d8a18 c103cd5d c103cdf1
[ 8.771226] 00000001 0000000a 00000000 c13d3f3c c13d2000 c103cd5d c13d0000 c100cfdc
[ 8.771245] Call Trace:
[ 8.771253] [<f74df3a0>] ? ioat2_cleanup_event+0x26/0x3c [ioatdma]
[ 8.771263] [<c103c7c7>] ? tasklet_action+0x62/0xa5
[ 8.771270] [<c103cd5d>] ? local_bh_enable+0x2/0x2
[ 8.771276] [<c103cdf1>] ? __do_softirq+0x94/0x12f
[ 8.771283] [<c103cd5d>] ? local_bh_enable+0x2/0x2
[ 8.771288] <IRQ>
[ 8.771294] [<c103cfe2>] ? irq_exit+0x32/0x80
[ 8.771302] [<c11bcb85>] ? xen_evtchn_do_upcall+0x1d/0x26
[ 8.771311] [<c12be317>] ? xen_do_upcall+0x7/0xc
[ 8.771319] [<c105007b>] ? __hrtimer_start_range_ns+0x106/0x308
[ 8.771327] [<c10023a7>] ? hypercall_page+0x3a7/0x1000
[ 8.771334] [<c100609a>] ? xen_safe_halt+0xf/0x19
[ 8.771342] [<c1010cfc>] ? default_idle+0x52/0x87
[ 8.771348] [<c100b234>] ? cpu_idle+0x95/0xaf
[ 8.771355] [<c1412708>] ? start_kernel+0x32a/0x32f
[ 8.771362] [<c1414014>] ? xen_start_kernel+0x58b/0x592
[ 8.771366] Code: 0b 43 14 0f 94 c0 45 3b 6c 24 04 7d 04 84 c0 74 8f 0f ae f0 89 f6 01 ef 66 83 7c 24 08 00 66 89 be c0 00 00 00 74 06 84 c0 75 02 <0f> 0b 8b 14 24 39 6c 24 04 89 56 2c 75 27 f0 80 66 38 fe b8 d0
[ 8.771485] EIP: [<f74df33d>] __cleanup+0xd6/0x113 [ioatdma] SS:ESP 0069:d840bfa0
[ 8.771500] ---[ end trace e846e774b59f2515 ]---
[ 8.771505] Kernel panic - not syncing: Fatal exception in interrupt
[ 8.771511] Pid: 0, comm: swapper/0 Tainted: G D 3.2.0-1-686-pae #1
[ 8.771516] Call Trace:
[ 8.771523] [<c12b4d0b>] ? panic+0x4d/0x144
[ 8.771529] [<c12ba8ba>] ? oops_end+0x8e/0x99
[ 8.771536] [<c100c231>] ? do_bounds+0x4c/0x4c
[ 8.771542] [<c100c29a>] ? do_invalid_op+0x69/0x72
[ 8.771551] [<f74df33d>] ? __cleanup+0xd6/0x113 [ioatdma]
[ 8.771558] [<c102ac3d>] ? test_tsk_need_resched+0xa/0x13
[ 8.771565] [<c102dc06>] ? resched_task+0x30/0x57
[ 8.771571] [<c10060da>] ? xen_force_evtchn_callback+0xc/0x10
[ 8.771578] [<c1006740>] ? check_events+0x8/0xc
[ 8.771585] [<c1006737>] ? xen_restore_fl_direct_reloc+0x4/0x4
[ 8.771592] [<c105d43a>] ? arch_local_irq_restore+0x6/0x7
[ 8.771599] [<c103222c>] ? try_to_wake_up+0x14b/0x155
[ 8.771605] [<c12ba28f>] ? error_code+0x67/0x6c
[ 8.771614] [<f74df33d>] ? __cleanup+0xd6/0x113 [ioatdma]
[ 8.771623] [<f74df3a0>] ? ioat2_cleanup_event+0x26/0x3c [ioatdma]
[ 8.771630] [<c103c7c7>] ? tasklet_action+0x62/0xa5
[ 8.771637] [<c103cd5d>] ? local_bh_enable+0x2/0x2
[ 8.771643] [<c103cdf1>] ? __do_softirq+0x94/0x12f
[ 8.771650] [<c103cd5d>] ? local_bh_enable+0x2/0x2
[ 8.771654] <IRQ> [<c103cfe2>] ? irq_exit+0x32/0x80
[ 8.771665] [<c11bcb85>] ? xen_evtchn_do_upcall+0x1d/0x26
[ 8.771672] [<c12be317>] ? xen_do_upcall+0x7/0xc
[ 8.771679] [<c105007b>] ? __hrtimer_start_range_ns+0x106/0x308
[ 8.771685] [<c10023a7>] ? hypercall_page+0x3a7/0x1000
[ 8.771692] [<c100609a>] ? xen_safe_halt+0xf/0x19
[ 8.771699] [<c1010cfc>] ? default_idle+0x52/0x87
[ 8.771705] [<c100b234>] ? cpu_idle+0x95/0xaf
[ 8.771711] [<c1412708>] ? start_kernel+0x32a/0x32f
[ 8.771718] [<c1414014>] ? xen_start_kernel+0x58b/0x592
(XEN) Domain 0 crashed: rebooting machine in 5 seconds.

-- System Information:
Debian Release: 6.0.4
APT prefers stable-updates
APT policy: (500, 'stable-updates'), (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 2.6.32-5-amd64 (SMP w/2 CPU cores)
Locale: LANG=en_US.utf8, LC_CTYPE=en_US.utf8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash



--
To UNSUBSCRIBE, email to debian-kernel-REQUEST@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org
Archive: http://lists.debian.org/20120219202133.9694.50330.reportbug@buzig.gplhost.com
 
04-06-2012, 03:32 AM
Jonathan Nieder
 
Bug#660554: Kernel crashes at boot in drivers/dma/ioat/dma_v2.c:163 when running under Xen 4.1

tags 660554 + patch
quit

Hi Ben,

Ben Hutchings wrote:

> Can't find it; please provide a reference.

The patch hit linux-next as 275029353953 (ioat: fix size of
'completion' for Xen, 2012-03-23).

A patch against the packaging repo which applies the fix is attached
for convenience.

Thanks,
Jonathan
Index: debian/changelog
===================================================================
--- debian/changelog (revision 18906)
+++ debian/changelog (working copy)
@@ -1,3 +1,9 @@
+linux-2.6 (3.2.14-2) UNRELEASED; urgency=low
+
+ * [x86] ioat: fix size of 'completion' for Xen (Closes: #660554)
+
+ -- Jonathan Nieder <jrnieder@gmail.com> Thu, 05 Apr 2012 22:25:07 -0500
+
linux-2.6 (3.2.14-1) unstable; urgency=low

* New upstream stable update:
Index: debian/patches/bugfix/x86/ioat-fix-size-of-completion-for-Xen.patch
===================================================================
--- debian/patches/bugfix/x86/ioat-fix-size-of-completion-for-Xen.patch (revision 0)
+++ debian/patches/bugfix/x86/ioat-fix-size-of-completion-for-Xen.patch (working copy)
@@ -0,0 +1,210 @@
+From: Dan Williams <dan.j.williams@intel.com>
+Date: Fri, 23 Mar 2012 13:36:42 -0700
+Subject: ioat: fix size of 'completion' for Xen
+
+commit 275029353953c2117941ade84f02a2303912fad1 upstream.
+
+Starting with v3.2 Jonathan reports that Xen crashes loading the ioatdma
+driver. A debug run shows:
+
+ ioatdma 0000:00:16.4: desc[0]: (0x300cc7000->0x300cc7040) cookie: 0 flags: 0x2 ctl: 0x29 (op: 0 int_en: 1 compl: 1)
+ ...
+ ioatdma 0000:00:16.4: ioat_get_current_completion: phys_complete: 0xcc7000
+
+...which shows that in this environment GFP_KERNEL memory may be backed
+by a 64-bit dma address. This breaks the driver's assumption that an
+unsigned long should be able to contain the physical address for
+descriptor memory. Switch to dma_addr_t which beyond being the right
+size, is the true type for the data i.e. an io-virtual address
+inidicating the engine's last processed descriptor.
+
+[stable: 3.2+]
+Cc: <stable@vger.kernel.org>
+Reported-by: Jonathan Nieder <jrnieder@gmail.com>
+Reported-by: William Dauchy <wdauchy@gmail.com>
+Tested-by: William Dauchy <wdauchy@gmail.com>
+Tested-by: Dave Jiang <dave.jiang@intel.com>
+Signed-off-by: Dan Williams <dan.j.williams@intel.com>
+Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
+---
+ drivers/dma/ioat/dma.c | 16 ++++++++--------
+ drivers/dma/ioat/dma.h | 6 +++---
+ drivers/dma/ioat/dma_v2.c | 8 ++++----
+ drivers/dma/ioat/dma_v3.c | 8 ++++----
+ 4 files changed, 19 insertions(+), 19 deletions(-)
+
+diff --git a/drivers/dma/ioat/dma.c b/drivers/dma/ioat/dma.c
+index a4d6cb0c0343..659518015972 100644
+--- a/drivers/dma/ioat/dma.c
++++ b/drivers/dma/ioat/dma.c
+@@ -548,9 +548,9 @@ void ioat_dma_unmap(struct ioat_chan_common *chan, enum dma_ctrl_flags flags,
+ PCI_DMA_TODEVICE, flags, 0);
+ }
+
+-unsigned long ioat_get_current_completion(struct ioat_chan_common *chan)
++dma_addr_t ioat_get_current_completion(struct ioat_chan_common *chan)
+ {
+- unsigned long phys_complete;
++ dma_addr_t phys_complete;
+ u64 completion;
+
+ completion = *chan->completion;
+@@ -571,7 +571,7 @@ unsigned long ioat_get_current_completion(struct ioat_chan_common *chan)
+ }
+
+ bool ioat_cleanup_preamble(struct ioat_chan_common *chan,
+- unsigned long *phys_complete)
++ dma_addr_t *phys_complete)
+ {
+ *phys_complete = ioat_get_current_completion(chan);
+ if (*phys_complete == chan->last_completion)
+@@ -582,14 +582,14 @@ bool ioat_cleanup_preamble(struct ioat_chan_common *chan,
+ return true;
+ }
+
+-static void __cleanup(struct ioat_dma_chan *ioat, unsigned long phys_complete)
++static void __cleanup(struct ioat_dma_chan *ioat, dma_addr_t phys_complete)
+ {
+ struct ioat_chan_common *chan = &ioat->base;
+ struct list_head *_desc, *n;
+ struct dma_async_tx_descriptor *tx;
+
+- dev_dbg(to_dev(chan), "%s: phys_complete: %lx\n",
+- __func__, phys_complete);
++ dev_dbg(to_dev(chan), "%s: phys_complete: %llx\n",
++ __func__, (unsigned long long) phys_complete);
+ list_for_each_safe(_desc, n, &ioat->used_desc) {
+ struct ioat_desc_sw *desc;
+
+@@ -655,7 +655,7 @@ static void __cleanup(struct ioat_dma_chan *ioat, unsigned long phys_complete)
+ static void ioat1_cleanup(struct ioat_dma_chan *ioat)
+ {
+ struct ioat_chan_common *chan = &ioat->base;
+- unsigned long phys_complete;
++ dma_addr_t phys_complete;
+
+ prefetch(chan->completion);
+
+@@ -701,7 +701,7 @@ static void ioat1_timer_event(unsigned long data)
+ mod_timer(&chan->timer, jiffies + COMPLETION_TIMEOUT);
+ spin_unlock_bh(&ioat->desc_lock);
+ } else if (test_bit(IOAT_COMPLETION_PENDING, &chan->state)) {
+- unsigned long phys_complete;
++ dma_addr_t phys_complete;
+
+ spin_lock_bh(&ioat->desc_lock);
+ /* if we haven't made progress and we have already
+diff --git a/drivers/dma/ioat/dma.h b/drivers/dma/ioat/dma.h
+index 5216c8a92a21..8bebddd189c7 100644
+--- a/drivers/dma/ioat/dma.h
++++ b/drivers/dma/ioat/dma.h
+@@ -88,7 +88,7 @@ struct ioatdma_device {
+ struct ioat_chan_common {
+ struct dma_chan common;
+ void __iomem *reg_base;
+- unsigned long last_completion;
++ dma_addr_t last_completion;
+ spinlock_t cleanup_lock;
+ dma_cookie_t completed_cookie;
+ unsigned long state;
+@@ -333,7 +333,7 @@ int __devinit ioat_dma_self_test(struct ioatdma_device *device);
+ void __devexit ioat_dma_remove(struct ioatdma_device *device);
+ struct dca_provider * __devinit ioat_dca_init(struct pci_dev *pdev,
+ void __iomem *iobase);
+-unsigned long ioat_get_current_completion(struct ioat_chan_common *chan);
++dma_addr_t ioat_get_current_completion(struct ioat_chan_common *chan);
+ void ioat_init_channel(struct ioatdma_device *device,
+ struct ioat_chan_common *chan, int idx);
+ enum dma_status ioat_dma_tx_status(struct dma_chan *c, dma_cookie_t cookie,
+@@ -341,7 +341,7 @@ enum dma_status ioat_dma_tx_status(struct dma_chan *c, dma_cookie_t cookie,
+ void ioat_dma_unmap(struct ioat_chan_common *chan, enum dma_ctrl_flags flags,
+ size_t len, struct ioat_dma_descriptor *hw);
+ bool ioat_cleanup_preamble(struct ioat_chan_common *chan,
+- unsigned long *phys_complete);
++ dma_addr_t *phys_complete);
+ void ioat_kobject_add(struct ioatdma_device *device, struct kobj_type *type);
+ void ioat_kobject_del(struct ioatdma_device *device);
+ extern const struct sysfs_ops ioat_sysfs_ops;
+diff --git a/drivers/dma/ioat/dma_v2.c b/drivers/dma/ioat/dma_v2.c
+index 5d65f8377971..cb8864d45601 100644
+--- a/drivers/dma/ioat/dma_v2.c
++++ b/drivers/dma/ioat/dma_v2.c
+@@ -126,7 +126,7 @@ static void ioat2_start_null_desc(struct ioat2_dma_chan *ioat)
+ spin_unlock_bh(&ioat->prep_lock);
+ }
+
+-static void __cleanup(struct ioat2_dma_chan *ioat, unsigned long phys_complete)
++static void __cleanup(struct ioat2_dma_chan *ioat, dma_addr_t phys_complete)
+ {
+ struct ioat_chan_common *chan = &ioat->base;
+ struct dma_async_tx_descriptor *tx;
+@@ -178,7 +178,7 @@ static void __cleanup(struct ioat2_dma_chan *ioat, unsigned long phys_complete)
+ static void ioat2_cleanup(struct ioat2_dma_chan *ioat)
+ {
+ struct ioat_chan_common *chan = &ioat->base;
+- unsigned long phys_complete;
++ dma_addr_t phys_complete;
+
+ spin_lock_bh(&chan->cleanup_lock);
+ if (ioat_cleanup_preamble(chan, &phys_complete))
+@@ -259,7 +259,7 @@ int ioat2_reset_sync(struct ioat_chan_common *chan, unsigned long tmo)
+ static void ioat2_restart_channel(struct ioat2_dma_chan *ioat)
+ {
+ struct ioat_chan_common *chan = &ioat->base;
+- unsigned long phys_complete;
++ dma_addr_t phys_complete;
+
+ ioat2_quiesce(chan, 0);
+ if (ioat_cleanup_preamble(chan, &phys_complete))
+@@ -274,7 +274,7 @@ void ioat2_timer_event(unsigned long data)
+ struct ioat_chan_common *chan = &ioat->base;
+
+ if (test_bit(IOAT_COMPLETION_PENDING, &chan->state)) {
+- unsigned long phys_complete;
++ dma_addr_t phys_complete;
+ u64 status;
+
+ status = ioat_chansts(chan);
+diff --git a/drivers/dma/ioat/dma_v3.c b/drivers/dma/ioat/dma_v3.c
+index f519c93a61e7..2dbf32b02735 100644
+--- a/drivers/dma/ioat/dma_v3.c
++++ b/drivers/dma/ioat/dma_v3.c
+@@ -256,7 +256,7 @@ static bool desc_has_ext(struct ioat_ring_ent *desc)
+ * The difference from the dma_v2.c __cleanup() is that this routine
+ * handles extended descriptors and dma-unmapping raid operations.
+ */
+-static void __cleanup(struct ioat2_dma_chan *ioat, unsigned long phys_complete)
++static void __cleanup(struct ioat2_dma_chan *ioat, dma_addr_t phys_complete)
+ {
+ struct ioat_chan_common *chan = &ioat->base;
+ struct ioat_ring_ent *desc;
+@@ -314,7 +314,7 @@ static void __cleanup(struct ioat2_dma_chan *ioat, unsigned long phys_complete)
+ static void ioat3_cleanup(struct ioat2_dma_chan *ioat)
+ {
+ struct ioat_chan_common *chan = &ioat->base;
+- unsigned long phys_complete;
++ dma_addr_t phys_complete;
+
+ spin_lock_bh(&chan->cleanup_lock);
+ if (ioat_cleanup_preamble(chan, &phys_complete))
+@@ -333,7 +333,7 @@ static void ioat3_cleanup_event(unsigned long data)
+ static void ioat3_restart_channel(struct ioat2_dma_chan *ioat)
+ {
+ struct ioat_chan_common *chan = &ioat->base;
+- unsigned long phys_complete;
++ dma_addr_t phys_complete;
+
+ ioat2_quiesce(chan, 0);
+ if (ioat_cleanup_preamble(chan, &phys_complete))
+@@ -348,7 +348,7 @@ static void ioat3_timer_event(unsigned long data)
+ struct ioat_chan_common *chan = &ioat->base;
+
+ if (test_bit(IOAT_COMPLETION_PENDING, &chan->state)) {
+- unsigned long phys_complete;
++ dma_addr_t phys_complete;
+ u64 status;
+
+ status = ioat_chansts(chan);
+--
+1.7.10.rc4
+
Index: debian/patches/series/base
===================================================================
--- debian/patches/series/base (revision 18906)
+++ debian/patches/series/base (working copy)
@@ -81,6 +81,7 @@
+ features/all/fs-hardlink-creation-restriction-cleanup.patch
+ bugfix/all/Don-t-limit-non-nested-epoll-paths.patch
+ bugfix/all/kbuild-do-not-check-for-ancient-modutils-tools.patch
++ bugfix/x86/ioat-fix-size-of-completion-for-Xen.patch

# Temporary, until the original change has been tested some more
+ debian/revert-CIFS-Respect-negotiated-MaxMpxCount.patch
 
04-07-2012, 02:43 AM
Ben Hutchings
 
Bug#660554: Kernel crashes at boot in drivers/dma/ioat/dma_v2.c:163 when running under Xen 4.1

On Thu, 2012-04-05 at 22:32 -0500, Jonathan Nieder wrote:
> tags 660554 + patch
> quit
>
> Hi Ben,
>
> Ben Hutchings wrote:
>
> > Can't find it; please provide a reference.
>
> The patch hit linux-next as 275029353953 (ioat: fix size of
> 'completion' for Xen, 2012-03-23).
>
> A patch against the packaging repo which applies the fix is attached
> for convenience.

Applied, thanks.

Ben.

--
Ben Hutchings
Larkinson's Law: All laws are basically false.
 
