07-14-2010, 01:57 PM
Vincent Danjean

Bug#545125: Intel + KSM still corrupt memory after resuming from suspend to disk

reopen 545125
found 545125 2.6.32-17
thanks

Hi,

Even though the problem occurs less often, I still experience it sometimes
with 2.6.32-17.
The last time (yesterday), just after a resume, every new process
segfaulted within the ld.so code... So I rebooted.
And today I discovered, just before submitting a new bug, that the
fact that bash dumped core each time I hit [tab] or [backspace] was due
to on-disk corruption of the bash binary (i.e. the problem was fixed by
reinstalling my current version of bash). I will now start a global fsck
to check whether other on-disk structures have been corrupted.
So, this bug is not fully fixed.

Regards,
Vincent

--
Vincent Danjean GPG key ID 0x9D025E87 vdanjean@debian.org
GPG key fingerprint: FC95 08A6 854D DB48 4B9A 8A94 0BF7 7867 9D02 5E87
Unofficial packages: http://moais.imag.fr/membres/vincent.danjean/deb.html
APT repo: deb http://perso.debian.org/~vdanjean/debian unstable main




--
To UNSUBSCRIBE, email to debian-kernel-REQUEST@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org
Archive: http://lists.debian.org/4C3DC24C.9040204@ens-lyon.org
 
07-14-2010, 06:18 PM
Julien Cristau

Bug#545125: Intel + KSM still corrupt memory after resuming from suspend to disk

On Wed, Jul 14, 2010 at 15:57:32 +0200, Vincent Danjean wrote:

> Even though the problem occurs less often, I still experience it sometimes
> with 2.6.32-17.
> The last time (yesterday), just after a resume, every new process
> segfaulted within the ld.so code... So I rebooted.
> And today I discovered, just before submitting a new bug, that the
> fact that bash dumped core each time I hit [tab] or [backspace] was due
> to on-disk corruption of the bash binary (i.e. the problem was fixed by
> reinstalling my current version of bash). I will now start a global fsck
> to check whether other on-disk structures have been corrupted.
> So, this bug is not fully fixed.
>
I don't suppose it's possible that this was earlier on-disk corruption
still showing up after the reboot on the new kernel?

Cheers,
Julien
 
07-14-2010, 10:19 PM
Vincent Danjean

Bug#545125: Intel + KSM still corrupt memory after resuming from suspend to disk

On 14/07/2010 20:18, Julien Cristau wrote:
> On Wed, Jul 14, 2010 at 15:57:32 +0200, Vincent Danjean wrote:
>
>> Even though the problem occurs less often, I still experience it sometimes
>> with 2.6.32-17.
>> The last time (yesterday), just after a resume, every new process
>> segfaulted within the ld.so code... So I rebooted.
>> And today I discovered, just before submitting a new bug, that the
>> fact that bash dumped core each time I hit [tab] or [backspace] was due
>> to on-disk corruption of the bash binary (i.e. the problem was fixed by
>> reinstalling my current version of bash). I will now start a global fsck
>> to check whether other on-disk structures have been corrupted.
>> So, this bug is not fully fixed.
>>
> I don't suppose it's possible that this was earlier on-disk corruption
> still showing up after the reboot on the new kernel?

It is always possible. But the problem appeared just after the reboot that
followed the bad resume (and there I am sure that memory was corrupted,
because ld.so has worked perfectly since the reboot). So it would be a
strange coincidence.

The forced fscks on all my partitions did not detect any other problems
(of course, fsck would not tell anything about data integrity within
files).
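
Since fsck only validates filesystem structures, the contents of installed files have to be compared against known checksums to catch this kind of corruption. The following is a minimal sketch, assuming the package ships a dpkg md5sums list (for example /var/lib/dpkg/info/bash.md5sums, whose paths are relative to the filesystem root) and that GNU md5sum is available. The helper name verify_md5sums is hypothetical and not part of any Debian tool.

```shell
# Hypothetical helper (not a Debian tool): check the files listed in a
# dpkg-style md5sums file ("<md5>  <path relative to root>") against disk.
# Assumes GNU md5sum; the list path must be absolute because we cd first.
verify_md5sums() {
    list=$1          # e.g. /var/lib/dpkg/info/bash.md5sums
    root=${2:-/}     # directory the listed paths are relative to
    # --quiet suppresses the "OK" lines, so only mismatches are printed;
    # the exit status is non-zero if any file fails verification.
    ( cd "$root" && md5sum -c --quiet "$list" )
}

# Possible usage on an installed system:
#   verify_md5sums /var/lib/dpkg/info/bash.md5sums
```

The debsums package automates the same comparison across all installed packages.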

> Cheers,
> Julien

[
As a side note, I discovered that trying to reinstall bash and its dependencies
with one apt-get command fails:
vdanjean@eyak:~$ sudo apt-get install --reinstall bash base-files debianutils dash libc6 libncurses5 bash-completion
[...]
Do you want to continue [Y/n]?
E: Could not perform immediate configuration on 'bash'.Please see man 5 apt.conf under APT::Immediate-Configure for details. (2)
vdanjean@eyak:~$

Do you know if this is expected, or if this is a bug (in apt, I suppose)?
]
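
The error message itself points at the APT::Immediate-Configure option described in apt.conf(5). One possible workaround, which does not settle whether the behaviour is a bug, is to switch that option off for a single run with apt-get's -o flag. This is only a sketch: the function name reinstall_pkgs and the APT_GET override variable are hypothetical, added here so the command line can be previewed without touching the system.

```shell
# Hypothetical wrapper: reinstall packages with immediate configuration
# disabled for this one invocation (see APT::Immediate-Configure in
# apt.conf(5)). Must be run as root for a real reinstall.
reinstall_pkgs() {
    # APT_GET is an illustrative override, e.g. APT_GET=echo for a dry run
    # that only prints the arguments that would be passed to apt-get.
    ${APT_GET:-apt-get} -o APT::Immediate-Configure=false install --reinstall "$@"
}

# Dry run, printing the arguments instead of executing apt-get:
#   APT_GET=echo reinstall_pkgs bash libc6
```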

Regards,
Vincent

--
Vincent Danjean
Address: Laboratoire d'Informatique de Grenoble, ENSIMAG - antenne de Montbonnot,
         ZIRST 51, avenue Jean Kuntzmann, 38330 Montbonnot Saint Martin
Telephone: +33 4 76 61 20 11
Fax: +33 4 76 61 20 99
Email: Vincent.Danjean@imag.fr



--
Archive: http://lists.debian.org/4C3E37DB.4070308@ens-lyon.org
 
07-14-2010, 10:25 PM
Vincent Danjean

Bug#545125: Intel + KSM still corrupt memory after resuming from suspend to disk

On 14/07/2010 20:18, Julien Cristau wrote:
> On Wed, Jul 14, 2010 at 15:57:32 +0200, Vincent Danjean wrote:
>
>> Even though the problem occurs less often, I still experience it sometimes
>> with 2.6.32-17.
>> The last time (yesterday), just after a resume, every new process
>> segfaulted within the ld.so code... So I rebooted.
>> And today I discovered, just before submitting a new bug, that the
>> fact that bash dumped core each time I hit [tab] or [backspace] was due
>> to on-disk corruption of the bash binary (i.e. the problem was fixed by
>> reinstalling my current version of bash). I will now start a global fsck
>> to check whether other on-disk structures have been corrupted.
>> So, this bug is not fully fixed.
>>
> I don't suppose it's possible that this was earlier on-disk corruption
> still showing up after the reboot on the new kernel?

Note that before the memory corruption, I booted and went through several
suspend/resume cycles with 2.6.32-17 (this is an improvement over the
previous versions).

Regards,
Vincent

> Cheers,
> Julien


--
Archive: http://lists.debian.org/4C3E396D.2060008@ens-lyon.org
 
07-18-2010, 08:40 PM
Julien Cristau

Bug#545125: Intel + KSM still corrupt memory after resuming from suspend to disk

On Wed, Jul 14, 2010 at 15:57:32 +0200, Vincent Danjean wrote:

> reopen 545125
> found 545125 2.6.32-17
> thanks
>
> Hi,
>
> Even though the problem occurs less often, I still experience it sometimes
> with 2.6.32-17.
> The last time (yesterday), just after a resume, every new process
> segfaulted within the ld.so code... So I rebooted.
> And today I discovered, just before submitting a new bug, that the
> fact that bash dumped core each time I hit [tab] or [backspace] was due
> to on-disk corruption of the bash binary (i.e. the problem was fixed by
> reinstalling my current version of bash). I will now start a global fsck
> to check whether other on-disk structures have been corrupted.
> So, this bug is not fully fixed.
>
New patch in mainline:

commit cd9f040df6ce46573760a507cb88192d05d27d86
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date: Sun Jul 18 09:44:37 2010 -0700

drm/i915: add 'reclaimable' to i915 self-reclaimable page allocations

The hibernate issues that got fixed in commit 985b823b9192 ("drm/i915:
fix hibernation since i915 self-reclaim fixes") turn out to have been
incomplete. Vefa Bicakci tested lots of hibernate cycles, and without
the __GFP_RECLAIMABLE flag the system eventually fails to resume.

With the flag added, Vefa can apparently hibernate forever (or until he
gets bored running his automated scripts, whichever comes first).

The reclaimable flag was there originally, and was one of the flags that
were dropped (unintentionally) by commit 4bdadb978569 ("drm/i915:
Selectively enable self-reclaim") that introduced all these problems,
but I didn't want to just blindly add back all the flags in commit
985b823b9192, and it looked like __GFP_RECLAIM wasn't necessary. It
clearly was.

I still suspect that there is some subtle reason we're missing that
causes the problems, but __GFP_RECLAIMABLE is certainly not wrong to use
in this context, and is what the code historically used. And we have no
idea what the causes the corruption without it.

Reported-and-tested-by: M. Vefa Bicakci <bicave@superonline.com>
Cc: Dave Airlie <airlied@gmail.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Maybe this time it'll be fixed for good...

Cheers,
Julien
 
