03-04-2009, 08:44 AM
Klaus Ethgen

Bug#419209: lvm2: Hangs during snapshot creation

Hi,

first of all, I raised the severity of the bug to critical as it makes
the whole system break. I also added debian-devel to Cc as the bug is very
problematic and I wonder how lvm2 was able to get into lenny with such a
big problem!

I am also willing to help solve the bug. My next step will be to
import the whole version history into git and try to bisect the problem.

First, about my use of LVM.

I used lvm1 with kernel 2.4 on etch for a long time, using exactly the
same procedure that triggers the problem now. Then, with the
release of lenny, I started upgrading my stable systems. This involved a
kernel upgrade, as lenny only runs with the very unstable 2.6 kernel tree,
and an upgrade of the lvm metadata.

A short test of the lvm snapshots afterwards ran fine, so I did not
worry about these road works (is that the correct expression for the
German "Baustelle"?). (I use snapshots every night in a script to back up
the system.)

Days later I noticed that apt-get install froze randomly and I had to
reboot the system with a power cycle (reboot didn't work). At first I
suspected one of the many kernel bugs in 2.6 and tried the newest
kernel from kernel.org. But the bug was still there. After several days
I ended up finding the problem in the LVM2 subsystem and two days later,
today, I found this bug report and was very shocked! If I had known of it
before, I would never have upgraded any of my stable systems to lenny! But
now one of them is stuck on lenny as I know no way to revert the LVM back
to LVM1.

So this bug is a complete show stopper for lenny!!!!

On Sat, 14 Apr 2007 at 12:53, Jean-Luc Coulon (f5ibh) wrote:
> New lvm2 version (2.02.24) hangs during snapshot creation.
> The lvcreate process is not killable at this point and the system needs to be
> rebooted.

That is the correct description. Moreover, the system cannot even be
rebooted normally! I have to run /etc/init.d/reboot stop by hand to hard
reboot the system. A normal shutdown will end in a hanging system with
no remote access at all. The only solution at that point is to
power-cycle the machine, which is very problematic with a remote system.
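
For anyone trying to confirm the "not killable" symptom before resorting to a power cycle, the following is a minimal Python sketch. It assumes (the thread does not confirm this) that the hung lvcreate sits in uninterruptible sleep, which Linux reports as state "D" in /proc/<pid>/stat. The function names are hypothetical and not part of any LVM tool.

```python
import os


def parse_stat_state(stat_line):
    """Return (command name, state letter) from one /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or parentheses, so split on the LAST closing parenthesis.
    open_idx = stat_line.index("(")
    close_idx = stat_line.rindex(")")
    comm = stat_line[open_idx + 1:close_idx]
    state = stat_line[close_idx + 1:].split()[0]
    return comm, state


def uninterruptible_processes(proc_root="/proc"):
    """List (pid, command) for every process currently in state 'D'."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                comm, state = parse_stat_state(handle.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited or the line was unreadable
        if state == "D":
            found.append((int(entry), comm))
    return found
```

Calling uninterruptible_processes() from a second shell while the snapshot hangs would list lvcreate if the assumption holds. Processes pass through state "D" briefly during normal I/O, so only an entry that stays there is a sign of a hang.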

On Fri, 2 Nov 2007 at 16:21, Stefan Pfetzing wrote:
> did you try to snapshot your /var? Because to me it seems like the
> current lvm2 configuration tries to use /var/lock/lvm for its locking
> files, and this leads to a deadlock.

This is not really a problem, as it is irrelevant whether that file is
locked or not in the snapshot.

Also, the problem happens with random LVs, not only with /var.

And I wonder why this should be a problem at all, as lvm1 has been
working pretty stably for years now.

On Sun, 30 Mar 2008 at 10:52, Bastian Blank wrote:
> # Automatically generated email from bts, devscripts version 2.9.26
> severity 419209 important

The severity of this bug is absolutely critical and not just important!

Regards
Klaus Ethgen
--
Klaus Ethgen http://www.ethgen.de/
pub 2048R/D1A4EDE5 2000-02-26 Klaus Ethgen <Klaus@Ethgen.de>
Fingerprint: D7 67 71 C4 99 A6 D4 FE EA 40 30 57 3C 88 26 2B


--
To UNSUBSCRIBE, email to debian-devel-REQUEST@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org
 
03-05-2009, 06:46 AM
Steve Langasek

Bug#419209: lvm2: Hangs during snapshot creation

On Wed, Mar 04, 2009 at 10:44:28AM +0100, Klaus Ethgen wrote:
> first of all, I raised the severity of the bug to critical as it makes
> the whole system break. I also added debian-devel to Cc as the bug is very
> problematic and I wonder how lvm2 was able to get into lenny with such a
> big problem!

Possibly because there are many people using LVM snapshotting who haven't
seen this bug? I use schroot with LVM snapshots all the time on a server
running the lenny kernel, and have never seen this bug.

> I used lvm1 with kernel 2.4 on etch for a long time, using exactly the
> same procedure that triggers the problem now.

Which was not a supported configuration. Linux 2.4 was supported in etch
only for purposes of upgrading, not for a running system. Your decision to
run an unsupported system actually makes it harder to track down this bug,
since the only baseline we have for comparison in your case is a system with
none of the relevant components in common.

> Then, with the release of lenny, I started upgrading my stable systems.
> This involved a kernel upgrade, as lenny only runs with the very unstable
> 2.6 kernel tree, and an upgrade of the lvm metadata.

Claiming that the 2.6 kernel is unstable only undermines your credibility.

> Days later I noticed that apt-get install froze randomly and I had to
> reboot the system with a power cycle (reboot didn't work). At first I
> suspected one of the many kernel bugs in 2.6 and tried the newest
> kernel from kernel.org. But the bug was still there. After several days
> I ended up finding the problem in the LVM2 subsystem and two days later,
> today, I found this bug report and was very shocked! If I had known of it
> before, I would never have upgraded any of my stable systems to lenny! But
> now one of them is stuck on lenny as I know no way to revert the LVM back
> to LVM1.

The original bug reporter claims that the hang only occurs with newer
versions of the lvm2 userspace tools. You could try downgrading to the etch
version of lvm2, to see whether this resolves the problem for you.

> On Sat, 14 Apr 2007 at 12:53, Jean-Luc Coulon (f5ibh) wrote:
> > New lvm2 version (2.02.24) hangs during snapshot creation.
> > The lvcreate process is not killable at this point and the system needs to be
> > rebooted.

> That is the correct description. Moreover, the system cannot even be
> rebooted normally! I have to run /etc/init.d/reboot stop by hand to hard
> reboot the system. A normal shutdown will end in a hanging system with
> no remote access at all. The only solution at that point is to
> power-cycle the machine, which is very problematic with a remote system.

What does dmesg show for the system after the snapshot has hung?

--
Steve Langasek Give me a lever long enough and a Free OS
Debian Developer to set it on, and I can move the world.
Ubuntu Developer http://www.debian.org/
slangasek@ubuntu.com vorlon@debian.org


 
03-05-2009, 07:48 AM
Bastian Blank

Bug#419209: lvm2: Hangs during snapshot creation

severity 419209 important
thanks

On Wed, Mar 04, 2009 at 10:44:28AM +0100, Klaus Ethgen wrote:
> first of all, I raised the severity of the bug to critical as it makes
> the whole system break.

lvm2 manages block devices via the device-mapper framework, so it manages
the system. That is the purpose of this tool, and it will not keep you
from doing stupid things.

And for now I consider taking snapshots of / or /var as stupid, because
it is impossible to recover if something goes wrong. And without a
working filesystem at these locations, the system will just block. It may
work, but it may also break horribly, as the kernel interface does not
allow this change to be made atomically.

> I also added debian-devel to Cc as the bug is very
> problematic and I wonder how lvm2 was able to get into lenny with such a
> big problem!

We have a lot of software that works for most people but not for all.

> I am also willing to help solve the bug. My next step will be to
> import the whole version history into git and try to bisect the problem.

Why do you think this would be a problem of the userspace part?

> So this bug is a complete show stopper for lenny!!!!

If you want to help you can provide the following information when it
goes wrong:
- "uname -a"
- "dmesg"
- "dmsetup table"
- "cat /proc/mounts"
- debug log of the "lvcreate -s" call, using -vvvv
- your snapshot creation script
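
The first four items in the list above can be gathered in one go. The following is a minimal Python sketch under stated assumptions: the function names and report layout are hypothetical, dmsetup and (on some systems) dmesg need root, and the "lvcreate -s -vvvv" log and the snapshot script are left out because they depend on the local setup. On a system where device-mapper is already hung, any of these commands may itself block.

```python
import subprocess

# The commands requested above, in the order they are listed.
DIAGNOSTIC_COMMANDS = [
    ["uname", "-a"],
    ["dmesg"],
    ["dmsetup", "table"],
    ["cat", "/proc/mounts"],
]


def run_command(argv):
    """Run one command and return its combined stdout/stderr as text."""
    try:
        result = subprocess.run(
            argv,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
        )
    except OSError as exc:
        # For example: the binary is not installed on this machine.
        return "<failed to run: {}>\n".format(exc)
    return result.stdout


def collect_report(runner=run_command, commands=DIAGNOSTIC_COMMANDS):
    """Build one plain-text report with a '$ command' header per section."""
    sections = []
    for argv in commands:
        sections.append("$ " + " ".join(argv))
        sections.append(runner(argv).rstrip("\n"))
        sections.append("")  # blank line between sections
    return "\n".join(sections)
```

Writing the result of collect_report() to a file on a filesystem that is not being snapshotted keeps the report readable even if the volume under test later blocks. The runner parameter exists only so the formatting can be exercised without root.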

> On Sat, 14 Apr 2007 at 12:53, Jean-Luc Coulon (f5ibh) wrote:
> > New lvm2 version (2.02.24) hangs during snapshot creation.
> > The lvcreate process is not killable at this point and the system needs to be
> > rebooted.
> That is the correct description.

There was a bug in older kernels which blocked on devmapper table
reload, however I've only seen this with the mirror target during a
pvmove call.

> Moreover, the system cannot even be
> rebooted normally! I have to run /etc/init.d/reboot stop by hand to hard
> reboot the system. A normal shutdown will end in a hanging system with
> no remote access at all. The only solution at that point is to
> power-cycle the machine, which is very problematic with a remote system.

This is the normal behaviour if you either lock out a filesystem or have
some parts of the kernel dysfunctional after oopses.

> On Fri, 2 Nov 2007 at 16:21, Stefan Pfetzing wrote:
> > did you try to snapshot your /var? Because to me it seems like the
> > current lvm2 configuration tries to use /var/lock/lvm for its locking
> > files, and this leads to a deadlock.
> This is not really a problem, as it is irrelevant whether that file is
> locked or not in the snapshot.

It is. However, I'm currently not sure if it ever tries to write/read
these files while it has an operation going.

> And I wonder why this should be a problem at all, as lvm1 has been
> working pretty stably for years now.

lvm2 and lvm1 do not have much in common.

> On Sun, 30 Mar 2008 at 10:52, Bastian Blank wrote:
> > # Automatically generated email from bts, devscripts version 2.9.26
> > severity 419209 important
> The severity of this bug is absolutely critical and not just important!

This is up to the maintainer. I use snapshots often and have not seen
such problems recently.

Bastian

--
We do not colonize. We conquer. We rule. There is no other way for us.
-- Rojan, "By Any Other Name", stardate 4657.5


 
03-05-2009, 07:58 AM
Klaus Ethgen

Bug#419209: lvm2: Hangs during snapshot creation

On Thu, 5 Mar 2009 at 8:46, Steve Langasek wrote:
> > the whole system break. I also added debian-devel to Cc as the bug is very
> > problematic and I wonder how lvm2 was able to get into lenny with such a
> > big problem!
>
> Possibly because there are many people using LVM snapshotting who haven't
> seen this bug? I use schroot with LVM snapshots all the time on a server
> running the lenny kernel, and have never seen this bug.

Well, I also did some tests and wasn't able to trigger that bug on my
laptop either. But there I have the lvm directly on the hard disk. On my
server there is an md device underneath. So I suspect the problem only
occurs with lvm on top of an md device (which you do not see that often
on desktop systems). When I am at home next week (so I can restart the
system by button), I will do a bisect to find the broken patch. I
have just imported the complete cvs history into git.

But to use the same argument as you, there seem to be many
people out there who suffer from this bug.

But after I downgraded lvm2 back to the etch version, my backup worked
last night for the first time since I upgraded lvm2. So I can confirm
that the bug was introduced between the two releases mentioned in the
original report.

> > I used lvm1 with kernel 2.4 on etch for a long time, using exactly the
> > same procedure that triggers the problem now.
>
> Which was not a supported configuration. Linux 2.4 was supported in etch
> only for purposes of upgrading, not for a running system. Your decision to
> run an unsupported system actually makes it harder to track down this bug,
> since the only baseline we have for comparison in your case is a system with
> none of the relevant components in common.

Well, kernel 2.6 wasn't stable enough to use on a production system.
(And in my opinion it still isn't, but there is no way to use
kernel 2.4 in lenny.) For my servers I prefer stability over
bleeding-edge features!

> > Then, with the release of lenny, I started upgrading my stable systems.
> > This involved a kernel upgrade, as lenny only runs with the very unstable
> > 2.6 kernel tree, and an upgrade of the lvm metadata.
>
> Claiming that the 2.6 kernel is unstable only undermines your credibility.

That's your opinion. I have tested several versions of the 2.6 kernel now
and it has many instabilities. Also, in 2.6 they dropped nearly all of
the oss drivers, so I have no sound. (The alsa drivers never worked in
any of my environments. It takes just 5 minutes to get a kernel panic if
I try to use alsa. And yes, I tested it with several kinds of hardware
and with many kernel versions.) Altogether I think that the kernel
developers have to start a development tree so that the 2.6 series can
calm down and become stable. Until now there is no 2.6 release that is
as stable as kernel 2.4.

> > Days later I noticed that apt-get install froze randomly and I had to
> > reboot the system with a power cycle (reboot didn't work). At first I
> > suspected one of the many kernel bugs in 2.6 and tried the newest
> > kernel from kernel.org. But the bug was still there. After several days
> > I ended up finding the problem in the LVM2 subsystem and two days later,
> > today, I found this bug report and was very shocked! If I had known of it
> > before, I would never have upgraded any of my stable systems to lenny! But
> > now one of them is stuck on lenny as I know no way to revert the LVM back
> > to LVM1.
>
> The original bug reporter claims that the hang only occurs with newer
> versions of the lvm2 userspace tools. You could try downgrading to the etch
> version of lvm2, to see whether this resolves the problem for you.

As I mentioned above, I can confirm that this solves my problem. So the
buggy patch is somewhere in between.

> > On Sat, 14 Apr 2007 at 12:53, Jean-Luc Coulon (f5ibh) wrote:
> > > New lvm2 version (2.02.24) hangs during snapshot creation.
> > > The lvcreate process is not killable at this point and the system needs to be
> > > rebooted.
>
> > That is the correct description. Moreover, the system cannot even be
> > rebooted normally! I have to run /etc/init.d/reboot stop by hand to hard
> > reboot the system. A normal shutdown will end in a hanging system with
> > no remote access at all. The only solution at that point is to
> > power-cycle the machine, which is very problematic with a remote system.
>
> What does dmesg show for the system after the snapshot has hung?

Nothing. I also raised the log level of lvm to 7 (debug), but there is
nothing special in the log. The last lines are:
libdm-deptree.c:1187 Creating sysvg-lv_usr-real
ioctl/libdm-iface.c:1606 dm create sysvg-lv_usr-real LVM-aVvQkLwPuENl6auLtoUWJBqjM0OKG0NQ00000000000000000000000000000000-real NF [16384]
libdm-common.c:607 sysvg-lv_usr-real: Stacking NODE_ADD (253,9) 0:6 0660
libdm-deptree.c:1463 Loading sysvg-lv_usr-real table
libdm-deptree.c:1413 Adding target: 0 14024704 linear 9:0 97714560
libdm-deptree.c:1413 Adding target: 14024704 2752512 linear 9:0 65920
ioctl/libdm-iface.c:1606 dm table (253:9) OF [16384]
ioctl/libdm-iface.c:1606 dm reload (253:9) NF [16384]
libdm-deptree.c:897 Resuming sysvg-lv_usr-real (253:9)
ioctl/libdm-iface.c:1606 dm resume (253:9) NF [16384]
libdm-common.c:635 sysvg-lv_usr-real: Stacking NODE_READ_AHEAD 1024 (flags=0)
libdm-deptree.c:1463 Loading sysvg-lv_usr table
libdm-deptree.c:1413 Adding target: 0 16777216 snapshot-origin 253:9
ioctl/libdm-iface.c:1606 dm table (253:0) OF [16384]
ioctl/libdm-iface.c:1606 dm reload (253:0) NF [16384]
libdm-deptree.c:1187 Creating sysvg-sv_usr-cow
ioctl/libdm-iface.c:1606 dm create sysvg-sv_usr-cow LVM-aVvQkLwPuENl6auLtoUWJBqjM0OKG0NQy2tplW0aa8Sb2JSSOLdRC0r3JhnJ00NC-cow NF [16384]
libdm-common.c:607 sysvg-sv_usr-cow: Stacking NODE_ADD (253,10) 0:6 0660
libdm-deptree.c:1463 Loading sysvg-sv_usr-cow table
libdm-deptree.c:1413 Adding target: 0 1048576 linear 9:0 434176384
ioctl/libdm-iface.c:1606 dm table (253:10) OF [16384]
ioctl/libdm-iface.c:1606 dm reload (253:10) NF [16384]
libdm-deptree.c:897 Resuming sysvg-sv_usr-cow (253:10)
ioctl/libdm-iface.c:1606 dm resume (253:10) NF [16384]
libdm-common.c:635 sysvg-sv_usr-cow: Stacking NODE_READ_AHEAD 0 (flags=1)
libdm-deptree.c:1463 Loading sysvg-sv_usr table
libdm-deptree.c:1413 Adding target: 0 16777216 snapshot 253:9 253:10 P 8
ioctl/libdm-iface.c:1606 dm table (253:8) OF [16384]
ioctl/libdm-iface.c:1606 dm reload (253:8) NF [16384]
activate/fs.c:167 Removing /dev/sysvg/lv_usr
activate/fs.c:174 Linking /dev/sysvg/lv_usr -> /dev/mapper/sysvg-lv_usr
activate/fs.c:167 Removing /dev/sysvg/sv_usr
activate/fs.c:174 Linking /dev/sysvg/sv_usr -> /dev/mapper/sysvg-sv_usr

Which looks quite normal to me. But well, I do not know what it
SHOULD look like.
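
One way to read a log like the one above is to pair each "dm reload" with a later "dm resume" for the same device. The following Python sketch does that for the ioctl lines copied from the log. The function name is hypothetical, and the interpretation is an assumption rather than something the thread establishes: a device whose new table was loaded but never resumed is only a hint at where lvcreate stopped.

```python
import re

# Matches lines such as "... dm reload (253:0) NF [16384]".
DM_IOCTL = re.compile(r"\bdm (reload|resume) \((\d+):(\d+)\)")

# The reload/resume ioctl lines from the log above, in their original order.
SAMPLE_LOG = [
    "ioctl/libdm-iface.c:1606 dm reload (253:9) NF [16384]",
    "ioctl/libdm-iface.c:1606 dm resume (253:9) NF [16384]",
    "ioctl/libdm-iface.c:1606 dm reload (253:0) NF [16384]",
    "ioctl/libdm-iface.c:1606 dm reload (253:10) NF [16384]",
    "ioctl/libdm-iface.c:1606 dm resume (253:10) NF [16384]",
    "ioctl/libdm-iface.c:1606 dm reload (253:8) NF [16384]",
]


def pending_reloads(log_lines):
    """Return (major, minor) devices reloaded but not resumed afterwards."""
    pending = []
    for line in log_lines:
        match = DM_IOCTL.search(line)
        if not match:
            continue  # not a reload/resume ioctl line
        device = (int(match.group(2)), int(match.group(3)))
        if match.group(1) == "reload":
            if device not in pending:
                pending.append(device)
        elif device in pending:
            pending.remove(device)  # a resume activates the loaded table
    return pending
```

For SAMPLE_LOG this yields the devices 253:0 and 253:8, which the "Loading ... table" lines identify as sysvg-lv_usr (the origin) and sysvg-sv_usr (the snapshot). The -real and -cow devices were both resumed.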

Regards
Klaus
--
Klaus Ethgen http://www.ethgen.de/
pub 2048R/D1A4EDE5 2000-02-26 Klaus Ethgen <Klaus@Ethgen.de>
Fingerprint: D7 67 71 C4 99 A6 D4 FE EA 40 30 57 3C 88 26 2B


 
03-05-2009, 08:45 AM
Klaus Ethgen

Bug#419209: lvm2: Hangs during snapshot creation

On Thu, 5 Mar 2009 at 9:48, Bastian Blank wrote:
> severity 419209 important

It is critical as it breaks the whole system! I do not want to start a
severity war with you, but please do not set the severity to the wrong
level. Just lowering the severity does not fix the bug.

> On Wed, Mar 04, 2009 at 10:44:28AM +0100, Klaus Ethgen wrote:
> > first of all, I raised the severity of the bug to critical as it makes
> > the whole system break.
>
> lvm2 manages block devices via the device-mapper framework, so it manages
> the system. That is the purpose of this tool, and it will not keep you
> from doing stupid things.

Using snapshots is not a stupid thing in my opinion. If you think it is,
then I think the only stupid thing is your opinion.

So, enough bashing. Please stay on an objective level, as this is a
critical bug in a core component which is proven to exist for many
people! Please help fix it and do not just play with the severity!

> And for now I consider taking snapshots of / or /var as stupid, because
> it is impossible to recover if something goes wrong.

The breakage happens with several LVs. Sometimes I had it with /usr,
sometimes with /home and only once with /var.

However, snapshotting /var wasn't a problem in the past and is an
intended use of snapshots too (even if you do not think so, but that is
your opinion!).

And I do not use snapshots for /, but for another reason: I do not have
/ in lvm at all.

Ah, yes, and why do you think /var should be broken when using
snapshots? Only the snapshot might be broken. But as it is necessary to
reboot the system anyway, this can be fixed easily after the boot, the
same way as fixing a broken snapshot of /usr or /home. And by your
logic it would be even more stupid to use snapshots with /home, because
the most vital data is in /home, not in /var. So, following your opinion,
using snapshots at all is stupid. Please consider not calling other
people stupid.

> And without a working filesystem at these locations, the system will
> just block. It may work, but it may also break horribly, as the kernel
> interface does not allow this change to be made atomically.

That's a wrong assumption. It was working well with lvm1 and (as I know
now) also with lvm2 up to the version in etch.

> > I also added debian-devel to Cc as the bug is very
> > problematic and I wonder how lvm2 was able to get into lenny with such a
> > big problem!
>
> We have a lot of software that works for most people but not for all.

So the software is not buggy if it works for most people?

> > I am also willing to help solve the bug. My next step will be to
> > import the whole version history into git and try to bisect the problem.
>
> Why do you think this would be a problem of the userspace part?

Because just downgrading the userspace tools to 2.02.06-4etch1 fixes the
bug!

I also first thought of a kernel bug and, as I wrote in my mail, I
updated the kernel to the latest release to see if the bug still
persisted before searching for other causes.

> > So this bug is a complete show stopper for lenny!!!!
>
> If you want to help you can provide the following information when it
> goes wrong:
> - "uname -a"

I just did that, with the latest available kernel release:
Linux ikki 2.6.28.7 #1 Sun Mar 1 13:03:56 CET 2009 i686 GNU/Linux

> - "dmesg"

Unhelpful, as the system has been freshly rebooted since then.

> - "dmsetup table"
~> dmsetup table
sysvg-lv_usr: 0 14024704 linear 9:0 97714560
sysvg-lv_usr: 14024704 2752512 linear 9:0 65920
sysvg-lv_var: 0 4194304 linear 9:0 2818432
sysvg-lv_mirror: 0 117440512 linear 9:0 126615936
sysvg-lv_local: 0 16777216 linear 9:0 7012736
sysvg-lv_home: 0 73924608 linear 9:0 23789952
sysvg-lv_home: 73924608 14155776 linear 9:0 111739264
sysvg-lv_misc: 0 167772160 linear 9:0 244056448
sysvg-lv_sec: 0 585826304 linear 9:2 65920
sysvg-lv_sec: 585826304 22347776 linear 9:0 411828608
sysvg-lv_hathi: 0 720896 linear 9:0 125895040
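
The table above can be cross-checked against the lvcreate debug log and against the md hypothesis. The following is a minimal Python sketch, assuming every line has the six-field linear form shown here with the backing device given as major:minor. The function names are hypothetical. Device-mapper tables count in 512-byte sectors, and block major 9 is the Linux md (software RAID) driver.

```python
from collections import defaultdict

SECTOR_BYTES = 512  # device-mapper tables are expressed in 512-byte sectors
MD_MAJOR = 9        # block major number of Linux md (software RAID)

# Three volumes copied from the table above.
SAMPLE_TABLE = """\
sysvg-lv_usr: 0 14024704 linear 9:0 97714560
sysvg-lv_usr: 14024704 2752512 linear 9:0 65920
sysvg-lv_var: 0 4194304 linear 9:0 2818432
sysvg-lv_sec: 0 585826304 linear 9:2 65920
sysvg-lv_sec: 585826304 22347776 linear 9:0 411828608
"""


def parse_linear_table(text):
    """Map each volume name to its list of linear segments."""
    volumes = defaultdict(list)
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 6 or fields[3] != "linear":
            continue  # skip blank lines and non-linear targets
        major, minor = (int(part) for part in fields[4].split(":"))
        volumes[fields[0].rstrip(":")].append({
            "start": int(fields[1]),    # first sector inside the volume
            "length": int(fields[2]),   # segment length in sectors
            "device": (major, minor),   # backing block device
            "offset": int(fields[5]),   # start sector on the backing device
        })
    return dict(volumes)


def size_in_sectors(segments):
    """Total volume size: the sum of its segment lengths."""
    return sum(segment["length"] for segment in segments)


def backing_devices(segments):
    """Set of (major, minor) devices a volume is spread over."""
    return {segment["device"] for segment in segments}


def all_on_md(segments):
    """True if every segment of the volume lives on an md device."""
    return all(major == MD_MAJOR for major, _ in backing_devices(segments))
```

The two sysvg-lv_usr segments add up to 16777216 sectors (8 GiB), the same size as the snapshot-origin target in the lvcreate debug log, and every volume in the sample sits on major 9, i.e. on md devices. sysvg-lv_sec spans two different md arrays (9:2 and 9:0).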


> - "cat /proc/mounts"
~> cat /proc/mounts
rootfs / rootfs rw 0 0
/dev/root / xfs rw,noatime,nodiratime,noquota 0 0
tmpfs /lib/init/rw tmpfs rw,nosuid,mode=755 0 0
proc /proc proc rw,nosuid,nodev,noexec 0 0
sysfs /sys sysfs rw,nosuid,nodev,noexec 0 0
tmpfs /dev tmpfs rw,size=10240k,mode=755 0 0
tmpfs /dev/shm tmpfs rw,nosuid,nodev 0 0
devpts /dev/pts devpts rw,nosuid,noexec,gid=5,mode=620 0 0
fusectl /sys/fs/fuse/connections fusectl rw 0 0
usbfs /proc/bus/usb usbfs rw 0 0
tmpfs /tmp tmpfs rw,nosuid,nodev 0 0
/dev/mapper/sysvg-lv_usr /usr xfs rw,noatime,nodiratime,nobarrier,noquota 0 0
/dev/mapper/sysvg-lv_var /var reiserfs rw,noatime,nodiratime 0 0
/dev/mapper/sysvg-lv_local /usr/local xfs rw,noatime,nodiratime,nobarrier,noquota 0 0
/dev/mapper/sysvg-lv_home /home reiserfs rw,nosuid,nodev,noatime,nodiratime 0 0
/dev/mapper/sysvg-lv_misc /misc xfs rw,nosuid,noatime,nodiratime,nobarrier,noquota 0 0
/dev/mapper/sysvg-lv_sec /misc/.sec xfs rw,nosuid,nodev,noatime,nodiratime,nobarrier,noquota 0 0
/dev/mapper/sysvg-lv_mirror /mirror xfs rw,nosuid,nodev,noatime,nodiratime,nobarrier,noquota 0 0
tmpfs /media tmpfs rw,nosuid,nodev,noexec,mode=755 0 0
/dev/mapper/sysvg-lv_hathi /hathi ext2 ro,errors=continue 0 0
rpc_pipefs /var/lib/nfs/rpc_pipefs rpc_pipefs rw 0 0
nfsd /proc/fs/nfsd nfsd rw 0 0
binfmt_misc /proc/sys/fs/binfmt_misc binfmt_misc rw,nosuid,nodev,noexec 0 0

The rest is private information (fuse filesystems).

> - debug log of the "lvcreate -s" call, using -vvvv

See my other mail.

> - your snapshot creation script

I will add some documentation and put it online. For now, just
believe me that it goes over all devices and tries to make a snapshot
of every one.

> > On Sat, 14 Apr 2007 at 12:53, Jean-Luc Coulon (f5ibh) wrote:
> > > New lvm2 version (2.02.24) hangs during snapshot creation.
> > > The lvcreate process is not killable at this point and the system needs to be
> > > rebooted.
> > That is the correct description.
>
> There was a bug in older kernels which blocked on devmapper table
> reload, however I've only seen this with the mirror target during a
> pvmove call.

As you can see in my first mail and in this one, I use the most recent
kernel.

Also the bug IS in userspace, as using the etch version fixes the bug.

> > Moreover, the system cannot even be
> > rebooted normally! I have to run /etc/init.d/reboot stop by hand to hard
> > reboot the system. A normal shutdown will end in a hanging system with
> > no remote access at all. The only solution at that point is to
> > power-cycle the machine, which is very problematic with a remote system.
>
> This is the normal behaviour if you either lock out a filesystem or have
> some parts of the kernel dysfunctional after oopses.

Yes, I know. But I also call this a complete system breakage. (To show
you what the right severity of this bug is.)

> > On Fri, 2 Nov 2007 at 16:21, Stefan Pfetzing wrote:
> > > did you try to snapshot your /var? Because to me it seems like the
> > > current lvm2 configuration tries to use /var/lock/lvm for its locking
> > > files, and this leads to a deadlock.
> > This is not really a problem, as it is irrelevant whether that file is
> > locked or not in the snapshot.
>
> It is. However, I'm currently not sure if it ever tries to write/read
> these files while it has an operation going.

Sorry, but it is not and it was never a problem. And it doesn't matter
anyway, as most of the breakage I had was with /usr, which is snapshotted
first. But sometimes it works with /usr and then /var or /home or /misc
triggers the bug.

> > And I wonder why this should be a problem at all, as lvm1 has been
> > working pretty stably for years now.
>
> lvm2 and lvm1 do not have much in common.

I know. Well, no, they have the same structure. Only the metadata and
the way it works are completely different.

> > On Sun, 30 Mar 2008 at 10:52, Bastian Blank wrote:
> > > # Automatically generated email from bts, devscripts version 2.9.26
> > > severity 419209 important
> > The severity of this bug is absolutely critical and not just important!
>
> This is up to the maintainer. I use snapshots often and have not seen
> such problems recently.

Ah, that's a very loose interpretation of the debian policy. From
reportbug:
critical: makes unrelated software on the system (or the whole system)
break, or causes serious data loss, or introduces a security hole on
systems where you install the package.

And that is the case here. The whole system may break using the lenny
version of lvm2. There is no threshold for how many people this must
apply to.

I will set the severity to critical once more, as I think I have
made it clear why. If you want to start a severity war, just do it. I
will never change the severity of this bug again. But that would be
counterproductive for solving the bug.

And just to ask again: please stay on an objective level. The mail from
Steve Langasek was much more helpful than yours, which includes several
insults and unproven opinions.

Regards
Klaus
--
Klaus Ethgen http://www.ethgen.de/
pub 2048R/D1A4EDE5 2000-02-26 Klaus Ethgen <Klaus@Ethgen.de>
Fingerprint: D7 67 71 C4 99 A6 D4 FE EA 40 30 57 3C 88 26 2B


 
03-05-2009, 09:43 AM
Wouter Verhelst

Bug#419209: lvm2: Hangs during snapshot creation

On Thu, Mar 05, 2009 at 10:45:24AM +0100, Klaus Ethgen wrote:
> On Thu, 5 Mar 2009 at 9:48, Bastian Blank wrote:
> > severity 419209 important
>
> It is critical as it breaks the whole system! I do not want to start a
> severity war with you, but please do not set the severity to the wrong level.

Severity inflation is a very good way to lose your credibility.

'Breaks the whole system' means something like 'it removes files not
related to the package in question that will cause the system to stop
functioning', 'it spins so much at such high priority that no other
process can get CPU time anymore', or something similar.

In other words, 'critical' is reserved for bugs that cause so much
breakage that the person who did the upload should find a brown paper
bag to put over their head, or some such.

It is not meant for bugs that cause a package to stop functioning
correctly _within its own area of functionality_. If libc ceases to
function under certain circumstances, then no single application on the
entire system will run anymore, yet this is no reason to call it a
'critical' bug. If a newly-uploaded kernel will not boot, then the
entire system will become unusable, yet this does not make it a critical
bug. Similarly, if a bug in the LVM subsystem can make the kernel lock
up inside the device-mapper when doing certain operations, then this bug
remains within its own area of functionality, and the fact that the
system is inoperable after the bug triggers does not render it a
'critical' bug.

If you feel that this makes the 'critical' severity rare, then
that is correct -- by design.

The 'important' severity is reserved for 'a bug which has a major effect
on the usability of a package without rendering it completely unusable
to everyone'. This does apply here; 'the system ceases to operate
correctly' certainly means it has a major effect on the usability of the
LVM2 subsystem. However, as has been shown on this mailing list,
certainly not everyone has the problems you describe -- it is not
'rendering it completely unusable to everyone'.

> Just lowering the severity does not fix the bug.

Nobody said anything of the sort. However, inflating the bug severity
to a level that does not make sense will only make people angry, and not
cause them to speed up fixing the bug.

--
<Lo-lan-do> Home is where you have to wash the dishes.
-- #debian-devel, Freenode, 2004-09-22


 
03-06-2009, 12:19 PM
Arthur de Jong

Bug#419209: lvm2: Hangs during snapshot creation

On Thu, 2009-03-05 at 09:58 +0100, Klaus Ethgen wrote:
> Well, I also did some tests and wasn't able to trigger that bug on my
> laptop either. But there I have the lvm directly on the hard disk. On my
> server there is an md device underneath. So I suspect the problem only
> occurs with lvm on top of an md device (which you do not see that often
> on desktop systems).

FWIW, I have a similar setup and cannot reproduce this:

# uname -a
Linux bobo 2.6.26-1-amd64 #1 SMP Sat Jan 10 19:55:48 UTC 2009 x86_64 GNU/Linux

# mdadm --detail /dev/md0
/dev/md0:
        Version : 00.90
  Creation Time : Sun Apr 29 20:01:14 2007
     Raid Level : raid1
     Array Size : 58596992 (55.88 GiB 60.00 GB)
  Used Dev Size : 58596992 (55.88 GiB 60.00 GB)
[...]
    Number   Major   Minor   RaidDevice   State
       0       8       1         0        active sync   /dev/sda1
       1       8      17         1        active sync   /dev/sdb1

# pvs
  PV         VG     Fmt    Attr   PSize    PFree
  /dev/md0   main   lvm2   a-     55.88G   9.88G

# lvs
  LV      VG     Attr     LSize     Origin   Snap%   Move   Log   Copy%   Convert
  home    main   -wi-ao    30.00G
  foo     main   -wi-ao     6.00G
  root    main   -wi-ao     2.00G
  squid   main   -wi-ao     2.00G
  srv     main   -wi-ao     2.00G
  swap    main   -wi-ao     2.00G
  tmp     main   -wi-ao   512.00M
  var     main   -wi-ao     1.50G

# dmsetup table
main-squid: 0 4194304 linear 9:0 75497856
main-swap: 0 4194304 linear 9:0 7340416
main-root: 0 4194304 linear 9:0 384
main-foo: 0 8388608 linear 9:0 79692160
main-foo: 8388608 4194304 linear 9:0 104857984
main-tmp: 0 1048576 linear 9:0 11534720
main-var: 0 3145728 linear 9:0 4194688
main-srv: 0 4194304 linear 9:0 109052288
main-home: 0 62914560 linear 9:0 12583296

# lvcreate -s --size 2G -n foobar /dev/main/srv
Logical volume "foobar" created

--
-- arthur - adejong@debian.org - http://people.debian.org/~adejong --
 
03-06-2009, 12:45 PM
Klaus Ethgen

Bug#419209: lvm2: Hangs during snapshot creation

Hi,

On Fri, 6 Mar 2009 at 14:19, Arthur de Jong wrote:
> FWIW, I have a similar setup and cannot reproduce this:

Strange. It was just a guess. Let's look further to find the bug.

> # uname -a
> Linux bobo 2.6.26-1-amd64 #1 SMP Sat Jan 10 19:55:48 UTC 2009 x86_64 GNU/Linux

That seems to be the only difference: x86_64. What about the other bug
reporter? What kernel and what setup do you have?

Regards
Klaus
--
Klaus Ethgen http://www.ethgen.de/
pub 2048R/D1A4EDE5 2000-02-26 Klaus Ethgen <Klaus@Ethgen.de>
Fingerprint: D7 67 71 C4 99 A6 D4 FE EA 40 30 57 3C 88 26 2B


 
