06-19-2012, 01:52 PM
Spelic

Ext4 and xfs problems in dm-thin on allocation and discard

On 06/19/12 15:30, Mike Snitzer wrote:
> I don't recall Spelic saying anything about EOPNOTSUPP. So what has
> made you zero in on an -EOPNOTSUPP return (which should not be
> happening)?


Exactly: I do not know if EOPNOTSUPP is being returned or not.

If this helps, I have configured dm-thin via lvm2
LVM version: 2.02.95(2) (2012-03-06)
Library version: 1.02.74 (2012-03-06)
Driver version: 4.22.0

from dmsetup table I only see one option : "skip_block_zeroing", if and
only if I configure it with -Zn . I do not see anything regarding
ignore_discard


vg1-pooltry1-tpool: 0 20971520 thin-pool 252:1 252:2 2048 0 1 skip_block_zeroing
vg1-pooltry1_tdata: 0 20971520 linear 9:20 62922752
vg1-pooltry1_tmeta: 0 8192 linear 9:20 83894272
vg1-thinlv1: 0 31457280 thin 252:3 1


and in dmesg:
[ 33.685200] device-mapper: thin: Discard unsupported by data device (dm-2): Disabling discard passdown.
[ 33.709586] device-mapper: thin: Discard unsupported by data device (dm-6): Disabling discard passdown.



I do not know what is the mechanism for which xfs cannot unmap blocks
from dm-thin, but it really can't.
If anyone has dm-thin installed he can try. This is 100% reproducible
for me.
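
For anyone reading the table output in this message, here is a minimal sketch for decoding it. It assumes the thin-pool target line follows the layout described in the kernel's thin-provisioning documentation: metadata device, data device, data block size in sectors, low water mark, then a count of feature arguments followed by the arguments themselves. The function name thin_pool_features is made up for illustration. It prints each pool's block size and feature arguments, which makes it easy to see whether ignore_discard or no_discard_passdown is set.

```shell
#!/bin/sh
# thin_pool_features: read `dmsetup table` output on stdin and, for each
# thin-pool line, print the name, data block size and feature arguments.
# Assumed field layout (after the "name:" prefix):
#   start length thin-pool meta_dev data_dev block_size low_water nr_feat [feat...]
thin_pool_features() {
    awk '$4 == "thin-pool" {
        name = $1; sub(/:$/, "", name)
        feats = ""
        # $9 is the number of feature args; they start at field 10.
        for (i = 10; i < 10 + $9 && i <= NF; i++)
            feats = feats (feats == "" ? "" : ",") $i
        if (feats == "") feats = "(none)"
        # Block size is given in 512-byte sectors.
        printf "%s block_size=%dKiB features=%s\n", name, $7 * 512 / 1024, feats
    }'
}

# Usage (needs root): dmsetup table | thin_pool_features
```

Fed the table shown in this message, it should report a 1024 KiB block size and skip_block_zeroing as the only feature for vg1-pooltry1-tpool.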



--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
 
06-19-2012, 02:05 PM
Eric Sandeen

On 6/19/12 8:52 AM, Spelic wrote:
> On 06/19/12 15:30, Mike Snitzer wrote:
>> I don't recall Spelic saying anything about EOPNOTSUPP. So what has made you zero in on an -EOPNOTSUPP return (which should not be happening)?
>
> Exactly: I do not know if EOPNOTSUPP is being returned or not.
>
> If this helps, I have configured dm-thin via lvm2
> LVM version: 2.02.95(2) (2012-03-06)
> Library version: 1.02.74 (2012-03-06)
> Driver version: 4.22.0
>
> from dmsetup table I only see one option : "skip_block_zeroing", if and only if I configure it with -Zn . I do not see anything regarding ignore_discard
>
> vg1-pooltry1-tpool: 0 20971520 thin-pool 252:1 252:2 2048 0 1 skip_block_zeroing
> vg1-pooltry1_tdata: 0 20971520 linear 9:20 62922752
> vg1-pooltry1_tmeta: 0 8192 linear 9:20 83894272
> vg1-thinlv1: 0 31457280 thin 252:3 1
>
>
> and in dmesg:
> [ 33.685200] device-mapper: thin: Discard unsupported by data device (dm-2): Disabling discard passdown.
> [ 33.709586] device-mapper: thin: Discard unsupported by data device (dm-6): Disabling discard passdown.
>
>
> I do not know what is the mechanism for which xfs cannot unmap blocks from dm-thin, but it really can't.
> If anyone has dm-thin installed he can try. This is 100% reproducible for me.

Might be worth seeing if xfs is ever getting to its discard code? There is a tracepoint...

# mount -t debugfs none /sys/kernel/debug
# echo 1 > /sys/kernel/debug/tracing/tracing_enabled
# echo 1 > /sys/kernel/debug/tracing/events/xfs/xfs_discard_extent/enable

<run test>

# cat /sys/kernel/debug/tracing/trace
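
A small helper for reading the result, as a sketch only: it counts the lines in a saved trace that mention the xfs_discard_extent event and assumes nothing about the rest of the line format. The function name count_xfs_discards and the file name in the usage comment are placeholders.

```shell
#!/bin/sh
# count_xfs_discards FILE: print how many trace lines record the
# xfs_discard_extent event. A count of zero after an fstrim run would
# suggest that XFS never reached its discard code.
count_xfs_discards() {
    # grep -c exits non-zero when the count is 0; "|| true" keeps that
    # from aborting scripts run under "set -e".
    grep -c 'xfs_discard_extent:' "$1" || true
}

# Usage:
#   cat /sys/kernel/debug/tracing/trace > /tmp/trace.txt
#   count_xfs_discards /tmp/trace.txt
```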

-Eric

 
06-19-2012, 02:09 PM
Lukáš Czerner

On Mon, 18 Jun 2012, Spelic wrote:

> Date: Mon, 18 Jun 2012 23:33:50 +0200
> From: Spelic <spelic@shiftmail.org>
> To: xfs@oss.sgi.com, linux-ext4@vger.kernel.org,
> device-mapper development <dm-devel@redhat.com>
> Subject: Ext4 and xfs problems in dm-thin on allocation and discard
>
> Hello all
> I am doing some testing of dm-thin on kernel 3.4.2 and latest lvm from source
> (the rest is Ubuntu Precise 12.04).
> There are a few problems with ext4 and (different ones with) xfs
>
> I am doing this:
> dd if=/dev/zero of=zeroes bs=1M count=1000 conv=fsync
> lvs
> rm zeroes #optional
> dd if=/dev/zero of=zeroes bs=1M count=1000 conv=fsync #again
> lvs
> rm zeroes #optional
> ...
> dd if=/dev/zero of=zeroes bs=1M count=1000 conv=fsync #again
> lvs
> rm zeroes
> fstrim /mnt/mountpoint
> lvs
>
> On ext4 the problem is that it always reallocates blocks at different places,
> so you can see from lvs that space occupation in the pool and thinlv increases
> at each iteration of dd, again and again, until it has allocated the whole
> thin device (really 100% of it). And this is true regardless of me doing rm or
> not between one dd and the other.
> The other problem is that by doing this, ext4 always gets the worst
> performance from thinp, about 140MB/sec on my system, because it is constantly
> allocating blocks, instead of 350MB/sec which should have been with my system
> if it used already allocated regions (see below compared to xfs). I am on an
> MD raid-5 of 5 hdds.
> I could suggest to add a "thinp mode" mount option to ext4 affecting the
> allocator, so that it tries to reallocate recently used and freed areas and
> not constantly new areas. Note that mount -o discard does work and prevents
> allocation bloating, but it still always gets the worst write performances
> from thinp. Alternatively thinp could be improved so that block allocation is
> fast :-P (*)
> However, good news is that fstrim works correctly on ext4, and is able to drop
> all space allocated by all dd's. Also mount -o discard works.

I am happy to hear that discard actually works with ext4. Regarding
the performance problem, part of it has already been explained by
Dave and I agree with him.

With thin provisioning you'll get a totally different file system
layout than on a fully provisioned disk as you push more and more
writes to your drive. This unfortunately has a great impact on
performance, since file systems usually have a lot of optimizations for
where to put data/metadata on the drive and how to read them.
However, in the case of thinly provisioned storage those optimizations
would not help. And yes, you just have to expect lower performance
with dm-thin from the file system on top of it. It is not, and it
never will be, an ideal solution for workloads where you expect the best
performance.

However, optimizations have to be done on both the dm and fs sides.
That work is currently in progress, and now that we have a "cheap"
thinp solution I guess progress will be quite a bit faster in that
regard.

-Lukas

>
> On xfs there is a different problem.
> Xfs apparently correctly re-uses the same blocks so that after the first write
> at 140MB/sec, subsequent overwrites of the same file are at full speed such as
> 350MB/sec (same speed as with non-thin lvm), and also you don't see space
> occupation going up at every iteration of dd, either with or without rm
> in-between the dd's. [ok actually now retrying it needed 3 rewrites to
> stabilize allocation... probably an AG count thing.]
> However the problem with XFS is that discard doesn't appear to work. Fstrim
> doesn't work, and neither does "mount -o discard ... + rm zeroes" . There is
> apparently no way to drop the allocated blocks, as seen from lvs. This is in
> contrast to what it is written here http://xfs.org/index.php/FITRIM/discard
> which declare fstrim and mount -o discard to be working.
> Please note that since I am above MD raid5 (I believe this is the reason), the
> passdown of discards does not work, as my dmesg says:
> [160508.497879] device-mapper: thin: Discard unsupported by data device
> (dm-1): Disabling discard passdown.
> but AFAIU, unless there is a thinp bug, this should not affect the unmapping
> of thin blocks by fstrimming xfs... and in fact ext4 is able to do that.
>
> (*) Strange thing is that write performance appears to be roughly the same for
> default thin chunksize and for 1MB thin chunksize. I would have expected thinp
> allocation to be faster with larger thin chunksizes but instead it is actually
> slower (note that there are no snapshots here and hence no CoW). This is also
> true if I set the thinpool to not zero newly allocated blocks: performances
> are about 240 MB/sec then, but again they don't increase with larger
> chunksizes, they actually decrease slightly with very large chunksizes such as
> 16MB. Why is that?
>
> Thanks for your help
> S.
>

 
06-19-2012, 02:19 PM
Ted Ts'o

On Tue, Jun 19, 2012 at 04:09:48PM +0200, Lukáš Czerner wrote:
>
> With thin provisioning you'll get a totally different file system
> layout than on a fully provisioned disk as you push more and more
> writes to your drive. This unfortunately has a great impact on
> performance, since file systems usually have a lot of optimizations for
> where to put data/metadata on the drive and how to read them.
> However, in the case of thinly provisioned storage those optimizations
> would not help. And yes, you just have to expect lower performance
> with dm-thin from the file system on top of it. It is not, and it
> never will be, an ideal solution for workloads where you expect the best
> performance.

One of the things which would be nice to be able to easily set up is a
configuration where we get the benefits of thin provisioning with
respect to snapshots, but where the underlying block device used by
the file system is contiguous. That is, it would be really useful to
*not* use thin provisioning for the underlying file system, but to use
thin provisioned snapshots. That way we only pay the thinp
performance penalty for the snapshots, and not for normal file system
operations. This is something that would be very useful both for ext4
and xfs.

I talked to Alasdair about this a few months ago at the Collab Summit,
and I think it's doable today, but it was somewhat complicated to set
up. I don't recall the details now, but perhaps someone who's more
familiar with device mapper could outline the details, and perhaps we can
either simplify it or abstract it away in a convenient front-end
script?

- Ted

 
06-19-2012, 02:23 PM
Eric Sandeen

On 6/19/12 9:19 AM, Ted Ts'o wrote:
> On Tue, Jun 19, 2012 at 04:09:48PM +0200, Lukáš Czerner wrote:
>>
>> With thin provisioning you'll get a totally different file system
>> layout than on a fully provisioned disk as you push more and more
>> writes to your drive. This unfortunately has a great impact on
>> performance, since file systems usually have a lot of optimizations for
>> where to put data/metadata on the drive and how to read them.
>> However, in the case of thinly provisioned storage those optimizations
>> would not help. And yes, you just have to expect lower performance
>> with dm-thin from the file system on top of it. It is not, and it
>> never will be, an ideal solution for workloads where you expect the best
>> performance.
>
> One of the things which would be nice to be able to easily set up is a
> configuration where we get the benefits of thin provisioning with
> respect to snapshots, but where the underlying block device used by
> the file system is contiguous. That is, it would be really useful to
> *not* use thin provisioning for the underlying file system, but to use
> thin provisioned snapshots. That way we only pay the thinp
> performance penalty for the snapshots, and not for normal file system
> operations. This is something that would be very useful both for ext4
> and xfs.

I agree, and have asked for exactly the same thing... though I have no
idea how hard it is to disentangle allocation-aware snapshots from thin
provisioned storage.

-Eric

 
06-19-2012, 02:37 PM
Lukáš Czerner

On Tue, 19 Jun 2012, Ted Ts'o wrote:

> Date: Tue, 19 Jun 2012 10:19:33 -0400
> From: Ted Ts'o <tytso@mit.edu>
> To: Lukáš Czerner <lczerner@redhat.com>
> Cc: Spelic <spelic@shiftmail.org>, xfs@oss.sgi.com,
> linux-ext4@vger.kernel.org,
> device-mapper development <dm-devel@redhat.com>
> Subject: Re: Ext4 and xfs problems in dm-thin on allocation and discard
>
> On Tue, Jun 19, 2012 at 04:09:48PM +0200, Lukáš Czerner wrote:
> >
> > With thin provisioning you'll get a totally different file system
> > layout than on a fully provisioned disk as you push more and more
> > writes to your drive. This unfortunately has a great impact on
> > performance, since file systems usually have a lot of optimizations for
> > where to put data/metadata on the drive and how to read them.
> > However, in the case of thinly provisioned storage those optimizations
> > would not help. And yes, you just have to expect lower performance
> > with dm-thin from the file system on top of it. It is not, and it
> > never will be, an ideal solution for workloads where you expect the best
> > performance.
>
> One of the things which would be nice to be able to easily set up is a
> configuration where we get the benefits of thin provisioning with
> respect to snapshots, but where the underlying block device used by
> the file system is contiguous. That is, it would be really useful to
> *not* use thin provisioning for the underlying file system, but to use
> thin provisioned snapshots. That way we only pay the thinp
> performance penalty for the snapshots, and not for normal file system
> operations. This is something that would be very useful both for ext4
> and xfs.
>
> I talked to Alasdair about this a few months ago at the Collab Summit,
> and I think it's doable today, but it was somewhat complicated to set
> up. I don't recall the details now, but perhaps someone who's more
> familiar with device mapper could outline the details, and perhaps we can
> either simplify it or abstract it away in a convenient front-end
> script?

Like ssm, for example?

Yes, this would definitely help, and I think there are actually more
possible optimizations like this.

If we "cripple" dm-thin so that only the snapshot feature is provided
and the actual thinp feature is not used, it would definitely help
performance for those who are only interested in snapshots. You'll
still have your file system layout mixed up once you start using
snapshots, but it will definitely be better. Some kind of fs/dm
interface for optimizing the layout might be helpful as well.

The other thing which could be done is to keep the thinp feature
available, but try to keep file systems on dm-thin relatively
separated and contiguous (although probably not over their entire
size). It would certainly work only up to some thin pool utilization
threshold, but it is something. If we can also add some fs-related
optimization that avoids spanning the entire file system and instead
uses smaller parts first (alter the block allocator so that it does not
allocate blocks from random groups across the entire fs, but starts
with a smaller working set of block groups), this can be even more
useful.

-Lukas

>
> - Ted
 
06-19-2012, 02:44 PM
Mike Snitzer

On Tue, Jun 19 2012 at 9:52am -0400,
Spelic <spelic@shiftmail.org> wrote:

> On 06/19/12 15:30, Mike Snitzer wrote:
> >I don't recall Spelic saying anything about EOPNOTSUPP. So what
> >has made you zero in on an -EOPNOTSUPP return (which should not be
> >happening)?
>
> Exactly: I do not know if EOPNOTSUPP is being returned or not.
>
> If this helps, I have configured dm-thin via lvm2
> LVM version: 2.02.95(2) (2012-03-06)
> Library version: 1.02.74 (2012-03-06)
> Driver version: 4.22.0
>
> from dmsetup table I only see one option : "skip_block_zeroing", if
> and only if I configure it with -Zn . I do not see anything
> regarding ignore_discard
>
> vg1-pooltry1-tpool: 0 20971520 thin-pool 252:1 252:2 2048 0 1
> skip_block_zeroing
> vg1-pooltry1_tdata: 0 20971520 linear 9:20 62922752
> vg1-pooltry1_tmeta: 0 8192 linear 9:20 83894272
> vg1-thinlv1: 0 31457280 thin 252:3 1
>
>
> and in dmesg:
> [ 33.685200] device-mapper: thin: Discard unsupported by data
> device (dm-2): Disabling discard passdown.
> [ 33.709586] device-mapper: thin: Discard unsupported by data
> device (dm-6): Disabling discard passdown.
>
>
> I do not know what is the mechanism for which xfs cannot unmap
> blocks from dm-thin, but it really can't.
> If anyone has dm-thin installed he can try. This is 100%
> reproducible for me.

I was initially surprised by this considering the thinp-test-suite does
test a compilebench workload against xfs and ext4 using online discard
(-o discard).

But I just modified that test to use a thin-pool with 'ignore_discard'
and the test still passed on both ext4 and xfs.

So there is more work needed in the thinp-test-suite to use blktrace
hooks to verify that discards are occurring when the
compilebench-generated files are removed.

I'll work through that and report back.

 
06-19-2012, 06:48 PM
Mike Snitzer

On Tue, Jun 19 2012 at 10:44am -0400,
Mike Snitzer <snitzer@redhat.com> wrote:

> On Tue, Jun 19 2012 at 9:52am -0400,
> Spelic <spelic@shiftmail.org> wrote:
>
> > I do not know what is the mechanism for which xfs cannot unmap
> > blocks from dm-thin, but it really can't.
> > If anyone has dm-thin installed he can try. This is 100%
> > reproducible for me.
>
> I was initially surprised by this considering the thinp-test-suite does
> test a compilebench workload against xfs and ext4 using online discard
> (-o discard).
>
> But I just modified that test to use a thin-pool with 'ignore_discard'
> and the test still passed on both ext4 and xfs.
>
> So there is more work needed in the thinp-test-suite to use blktrace
> hooks to verify that discards are occurring when the
> compilebench-generated files are removed.
>
> I'll work through that and report back.

blktrace shows discards for both xfs and ext4.

But in general xfs is issuing discards with much smaller extents than
ext4 does, e.g.:

to the thin device:
+ 128 vs + 32

to the thin-pool's data device:
+ 120 vs + 16
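
To put those numbers in perspective, here is a quick arithmetic sketch. It assumes the "+ N" values are blktrace's usual counts of 512-byte sectors, and it borrows the 2048-sector (1 MiB) pool block size from Spelic's dmsetup output earlier in the thread; the pool used for this blktrace run may well have a different block size. The helper names are made up.

```shell
#!/bin/sh
# sectors_to_kib N: convert a count of 512-byte sectors to KiB.
sectors_to_kib() {
    echo $(( $1 * 512 / 1024 ))
}

# covers_whole_block LEN BLOCK: succeed only if a discard of LEN sectors is
# at least as large as one pool block of BLOCK sectors. This is a necessary,
# not sufficient, condition for covering a whole block: alignment also matters.
covers_whole_block() {
    [ "$1" -ge "$2" ]
}

for len in 128 32 120 16; do
    if covers_whole_block "$len" 2048; then
        verdict="could cover a block"
    else
        verdict="smaller than one block"
    fi
    echo "$len sectors = $(sectors_to_kib "$len") KiB: $verdict"
done
```

All four sizes (64, 16, 60 and 8 KiB) are far below 1 MiB. If, as I understand dm-thin, only blocks fully covered by a discard can be unmapped, then with a pool block of that size none of these discards could free a block on its own.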

 
06-19-2012, 08:06 PM
Dave Chinner

On Tue, Jun 19, 2012 at 02:48:59PM -0400, Mike Snitzer wrote:
> On Tue, Jun 19 2012 at 10:44am -0400,
> Mike Snitzer <snitzer@redhat.com> wrote:
>
> > On Tue, Jun 19 2012 at 9:52am -0400,
> > Spelic <spelic@shiftmail.org> wrote:
> >
> > > I do not know what is the mechanism for which xfs cannot unmap
> > > blocks from dm-thin, but it really can't.
> > > If anyone has dm-thin installed he can try. This is 100%
> > > reproducible for me.
> >
> > I was initially surprised by this considering the thinp-test-suite does
> > test a compilebench workload against xfs and ext4 using online discard
> > (-o discard).
> >
> > But I just modified that test to use a thin-pool with 'ignore_discard'
> > and the test still passed on both ext4 and xfs.
> >
> > So there is more work needed in the thinp-test-suite to use blktrace
> > hooks to verify that discards are occurring when the
> > compilebench-generated files are removed.
> >
> > I'll work through that and report back.
>
> blktrace shows discards for both xfs and ext4.
>
> But in general xfs is issuing discards with much smaller extents than
> ext4 does, e.g.:

That's normal when you use -o discard - XFS sends extremely
fine-grained discards as they have to be issued during the checkpoint
commit that frees the extent. Hence they can't be aggregated as is
done in ext4.

As it is, no-one really should be using -o discard - it is extremely
inefficient compared to a background fstrim run given that discards
are unqueued, blocking IOs. It's just a bad idea until the lower
layers get fixed to allow asynchronous, vectored discards and SATA
supports queued discards...
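
A minimal sketch of what a background fstrim run could look like, for example from a nightly cron job instead of mounting with -o discard. The function name trim_all and the mount point in the usage comment are placeholders; the FSTRIM variable exists only so that the command can be swapped out for a dry run.

```shell
#!/bin/sh
# trim_all MOUNTPOINT...: run fstrim once on each mounted filesystem given.
# Set FSTRIM="echo fstrim" for a dry run that only prints the commands.
trim_all() {
    for mp in "$@"; do
        # -v makes fstrim report how many bytes it trimmed.
        ${FSTRIM:-fstrim} -v "$mp" || echo "fstrim failed on $mp" >&2
    done
}

# Usage (as root, e.g. from cron): trim_all /mnt/mountpoint
```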

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com

 
06-19-2012, 08:21 PM
Ted Ts'o

On Wed, Jun 20, 2012 at 06:06:31AM +1000, Dave Chinner wrote:
> > But in general xfs is issuing discards with much smaller extents than
> > ext4 does, e.g.:
>
> That's normal when you use -o discard - XFS sends extremely
> fine-grained discards as they have to be issued during the checkpoint
> commit that frees the extent. Hence they can't be aggregated as is
> done in ext4.

Actually, ext4 is also sending the discards during (well, actually,
after) the commit which frees the extent/inode. We do aggregate them
while the commit is open, but once the transaction is committed, we
send out the discards. I suspect the difference is in the granularity
of the transactions between ext4 and xfs.

> As it is, no-one really should be using -o discard - it is extremely
> inefficient compared to a background fstrim run given that discards
> are unqueued, blocking IOs. It's just a bad idea until the lower
> layers get fixed to allow asynchronous, vectored discards and SATA
> supports queued discards...

What Dave said. :-) This is true for both ext4 and xfs.

As a result, I can very easily see there being a distinction made
between when we *do* want to pass the discards all the way down to the
device, and when we only want the thinp layer to process them ---
because for current devices, sending discards down to the physical
device is very heavyweight.

I'm not sure how we could do this without a nasty layering violation,
but some way in which we could label fstrim discards versus "we've
committed the unlink/truncate and so thinp can feel free to reuse
these blocks" discards would be interesting to consider.

- Ted

 
