Linux Archive > Redhat > Device-mapper Development
02-28-2011, 12:10 PM
Alasdair G Kergon

mirrored device with thousands of mapping table entries

On Mon, Feb 28, 2011 at 02:17:52PM +0200, Eli Malul wrote:
> I expect to have thousands of non-contiguous extents (so I created a
> synthetic mapping table to test the device-mapper behavior).
>
> Any suggestions?

> -----Original Message-----
> From: Alasdair G Kergon [mailto:agk@redhat.com]
> Sent: Monday, February 28, 2011 2:12 PM
> To: Eli Malul
> Cc: device-mapper development

> (And if, unlike that example, your extents are not contiguous,
> create two new devices that join them together, and mirror those.
> That's what LVM does.)

That.
dm0 = list 0
dm1 = list 1
dm2 = mirror of dm0 and dm1
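
[Editorial note: in dmsetup terms that layering might look like the sketch below. Every device path, offset, and length is invented for illustration; all numbers are in 512-byte sectors.]

```shell
# dm0: join two non-contiguous extents of the first disk into one
# contiguous device (table columns: start length linear device offset)
echo "0    2048 linear /dev/sda 10240
2048 2048 linear /dev/sda 40960" | dmsetup create dm0

# dm1: the same layout on the second disk
echo "0    2048 linear /dev/sdb 20480
2048 2048 linear /dev/sdb 81920" | dmsetup create dm1

# dm2: a single mirror across the two joined devices
# (core log with 2 args: region size 1024 sectors, nosync)
echo "0 4096 mirror core 2 1024 nosync 2 /dev/mapper/dm0 0 /dev/mapper/dm1 0" \
    | dmsetup create dm2
```

With this stacking there is one mirror no matter how many extents each side joins together, so the per-mirror overhead is paid once rather than per extent.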

Alasdair

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
 
02-28-2011, 12:13 PM
"Eli Malul"

mirrored device with thousands of mapping table entries

OK, I will try that and check its implications, thanks!

But do you know why having 10,000 extents causes this overwhelming
memory usage?

-----Original Message-----
From: Alasdair G Kergon [mailto:agk@redhat.com]
Sent: Monday, February 28, 2011 3:10 PM
To: Eli Malul
Cc: device-mapper development
Subject: Re: [dm-devel] mirrored device with thousands of mapping table entries

On Mon, Feb 28, 2011 at 02:17:52PM +0200, Eli Malul wrote:
> I expect to have thousands of non-contiguous extents (so I created a
> synthetic mapping table to test the device-mapper behavior).
>
> Any suggestions?

> -----Original Message-----
> From: Alasdair G Kergon [mailto:agk@redhat.com]
> Sent: Monday, February 28, 2011 2:12 PM
> To: Eli Malul
> Cc: device-mapper development

> (And if, unlike that example, your extents are not contiguous,
> create two new devices that join them together, and mirror those.
> That's what LVM does.)

That.
dm0 = list 0
dm1 = list 1
dm2 = mirror of dm0 and dm1

Alasdair


 
02-28-2011, 12:29 PM
Alasdair G Kergon

mirrored device with thousands of mapping table entries

On Mon, Feb 28, 2011 at 03:13:42PM +0200, Eli Malul wrote:
> But, do you know why having 10,000 extents is causing this overwhelming
> memory usage?

Nobody considered having a huge number of tiny mirrors reasonable or
necessary, so no attempt has been made to optimise that situation.

Alasdair

 
02-28-2011, 12:38 PM
Zdenek Kabelac

mirrored device with thousands of mapping table entries

On 28.2.2011 14:13, Eli Malul wrote:
>
> OK, I will try that and check the implications of that, thanks!!
>
> But, do you know why having 10,000 extents is causing this overwhelming
> memory usage?
>

If you really need that many devices, also be sure you have disabled
this kernel config option:

CONFIG_BLK_DEV_INTEGRITY

For some reason it consumes a massive amount of memory.
Other than that, you should count on roughly 64KB per device.
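
[Editorial note: to verify whether that option is enabled for the kernel you are running, a sketch - config file locations vary by distro:]

```shell
# Most distros install the build config alongside the kernel image;
# look for "=y" versus "is not set":
grep CONFIG_BLK_DEV_INTEGRITY /boot/config-"$(uname -r)"

# Kernels built with CONFIG_IKCONFIG_PROC expose the config at runtime:
zgrep CONFIG_BLK_DEV_INTEGRITY /proc/config.gz
```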

Zdenek

 
02-28-2011, 12:42 PM
"Eli Malul"

mirrored device with thousands of mapping table entries

According to your suggestion, I need to map the non-contiguous extents
into linear devices and mirror them.

But what if I am required to preserve the original extents' offsets?
I need to mirror an existing device with user data already on
it...

-----Original Message-----
From: Alasdair G Kergon [mailto:agk@redhat.com]
Sent: Monday, February 28, 2011 3:29 PM
To: Eli Malul
Cc: device-mapper development
Subject: Re: [dm-devel] mirrored device with thousands of mapping table entries

On Mon, Feb 28, 2011 at 03:13:42PM +0200, Eli Malul wrote:
> But, do you know why having 10,000 extents is causing this overwhelming
> memory usage?

Nobody considered having a huge number of tiny mirrors reasonable or
necessary, so no attempt has been made to optimise that situation.

Alasdair


 
02-28-2011, 02:01 PM
"Martin K. Petersen"

mirrored device with thousands of mapping table entries

>>>>> "Zdenek" == Zdenek Kabelac <zkabelac@redhat.com> writes:

Zdenek> If you really need that many devices - be also sure you have
Zdenek> disabled this kernel config option:

Zdenek> CONFIG_BLK_DEV_INTEGRITY

Zdenek> For some reason it consumes massive amount of memory. Other
Zdenek> than that - you should count with 64KB per device usually.

Care to qualify that?

Unless your HBA indicates that it supports data integrity the only
penalty should be that each bio grows a pointer.

--
Martin K. Petersen, Oracle Linux Engineering

 
02-28-2011, 03:25 PM
"Eli Malul"

mirrored device with thousands of mapping table entries

One way to achieve that is to create two linear mapped devices and
mirror them, plus another two linear devices to simulate the original
offsets. The latter are mapped to the first two linear devices, so the
client can continue to read and write at the same offsets it is used to.

It seems a somewhat complex configuration, but it should do the trick.
I will test its scalability and report back, of course.
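
[Editorial note: the offset-preserving layer might be sketched as the table below. Every name, offset, and length is invented; it assumes the joined-and-mirrored copies are already exposed as a single hypothetical device, /dev/mapper/mir0. Numbers are 512-byte sectors.]

```shell
# Client-visible device with the original geometry: each extent is mapped
# back to its position within the mirror, and the gaps between extents,
# which hold no mirrored data, are filled with an error target.
echo "0     10240 error
10240  2048 linear /dev/mapper/mir0 0
12288 28672 error
40960  2048 linear /dev/mapper/mir0 2048" | dmsetup create client0
```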

BTW -
CONFIG_BLK_DEV_INTEGRITY is disabled.

-----Original Message-----
From: Eli Malul
Sent: Monday, February 28, 2011 3:43 PM
To: 'Alasdair G Kergon'
Cc: device-mapper development
Subject: RE: [dm-devel] mirrored device with thousands of mapping table entries

According to your suggestion, I need to map the non-contiguous extents
into linear devices and mirror them.

But what if I am required to preserve the original extents' offsets?
I need to mirror an existing device with user data already on
it...

-----Original Message-----
From: Alasdair G Kergon [mailto:agk@redhat.com]
Sent: Monday, February 28, 2011 3:29 PM
To: Eli Malul
Cc: device-mapper development
Subject: Re: [dm-devel] mirrored device with thousands of mapping table entries

On Mon, Feb 28, 2011 at 03:13:42PM +0200, Eli Malul wrote:
> But, do you know why having 10,000 extents is causing this overwhelming
> memory usage?

Nobody considered having a huge number of tiny mirrors reasonable or
necessary, so no attempt has been made to optimise that situation.

Alasdair


 
03-06-2011, 07:39 PM
Zdenek Kabelac

mirrored device with thousands of mapping table entries

On 28.2.2011 16:01, Martin K. Petersen wrote:
>>>>>> "Zdenek" == Zdenek Kabelac <zkabelac@redhat.com> writes:
>
> Zdenek> If you really need that many devices - be also sure you have
> Zdenek> disabled this kernel config option:
>
> Zdenek> CONFIG_BLK_DEV_INTEGRITY
>
> Zdenek> For some reason it consumes massive amount of memory. Other
> Zdenek> than that - you should count with 64KB per device usually.
>
> Care to qualify that?
>
> Unless your HBA indicates that it supports data integrity the only
> penalty should be that each bio grows a pointer.
>

Ok - I've taken some new measurements on my (possibly not the best way
to configure a Linux kernel) 2.6.38-rc7.

As I'm using some kernel memory debugging, it might not apply in the
same way to a non-debug kernel.

My findings seem to show that the BIP-256 slabtop segment grows by
~73KB per device (while dm-io is about ~26KB).

That makes consumption ~730MB when 10,000 devices are in the game.
Of course there are some other big slowdowns, so it's not the biggest
problem. However, if I understand it correctly, when I don't have
hardware with bio integrity support this memory is essentially wasted,
as it will have no use.

Of course, people who plan to use such a massive number of devices
should probably use hardware where such memory loss isn't a big
problem - but it's quite noticeable on my testing laptop with just 4G.

Another minor issue can be seen in delayed device creation time
(i.e. activating ~7000 devices with BIP is delayed by ~30%).

Zdenek
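
[Editorial note: a quick arithmetic check of those figures, using the approximate per-device slab numbers measured above:]

```python
bip_kb = 73        # ~BIP-256 slab growth per device (measured above)
dmio_kb = 26       # ~dm-io slab growth per device
devices = 10_000

bip_total_mb = bip_kb * devices / 1000
print(f"BIP-256 alone for {devices} devices: ~{bip_total_mb:.0f} MB")   # ~730 MB
print(f"BIP-256 + dm-io: ~{(bip_kb + dmio_kb) * devices / 1000:.0f} MB")  # ~990 MB
```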

 
03-07-2011, 01:59 AM
"Martin K. Petersen"

mirrored device with thousands of mapping table entries

>>>>> "Zdenek" == Zdenek Kabelac <zkabelac@redhat.com> writes:

Zdenek> My findings seem to show that the BIP-256 slabtop segment grows
Zdenek> by ~73KB per device (while dm-io is about ~26KB)

Ok, I see it now that I tried with a bunch of DM devices.

DM allocates a bioset per volume. And since each bioset has an integrity
mempool you'll end up with a bunch of memory locked down. It seems like
a lot but it's actually the same amount as we reserve for the data path
(bio-0 + biovec-256).

Since a bioset is not necessarily tied to a single block device we can't
automatically decide whether to allocate the integrity pool or not. In
the DM case, however, we just set up the integrity profile so the
information is available.

Can you please try the following patch? This will change things so we
only attach an integrity pool to the bioset if the logical volume is
integrity-capable.

--
Martin K. Petersen, Oracle Linux Engineering

diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
index 38e4eb1..37a1b77 100644
--- a/drivers/md/dm-table.c
+++ b/drivers/md/dm-table.c
@@ -55,6 +55,7 @@ struct dm_table {
struct dm_target *targets;

unsigned discards_supported:1;
+ unsigned integrity_supported:1;

/*
* Indicates the rw permissions for the new logical
@@ -859,7 +860,11 @@ int dm_table_alloc_md_mempools(struct dm_table *t)
return -EINVAL;
}

- t->mempools = dm_alloc_md_mempools(type);
+ if (t->integrity_supported)
+ t->mempools = dm_alloc_md_mempools(type, 0);
+ else
+ t->mempools = dm_alloc_md_mempools(type, BIOSET_NO_INTEGRITY);
+
if (!t->mempools)
return -ENOMEM;

@@ -935,8 +940,10 @@ static int dm_table_prealloc_integrity(struct dm_table *t, struct mapped_device
struct dm_dev_internal *dd;

list_for_each_entry(dd, devices, list)
- if (bdev_get_integrity(dd->dm_dev.bdev))
+ if (bdev_get_integrity(dd->dm_dev.bdev)) {
+ t->integrity_supported = 1;
return blk_integrity_register(dm_disk(md), NULL);
+ }

return 0;
}
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index eaa3af0..f6146b5 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -2643,7 +2643,7 @@ int dm_noflush_suspending(struct dm_target *ti)
}
EXPORT_SYMBOL_GPL(dm_noflush_suspending);

-struct dm_md_mempools *dm_alloc_md_mempools(unsigned type)
+struct dm_md_mempools *dm_alloc_md_mempools(unsigned type, unsigned int flags)
{
struct dm_md_mempools *pools = kmalloc(sizeof(*pools), GFP_KERNEL);

@@ -2663,7 +2663,8 @@ struct dm_md_mempools *dm_alloc_md_mempools(unsigned type)
goto free_io_pool_and_out;

pools->bs = (type == DM_TYPE_BIO_BASED) ?
- bioset_create(16, 0) : bioset_create(MIN_IOS, 0);
+ bioset_create_flags(16, 0, flags) :
+ bioset_create_flags(MIN_IOS, 0, flags);
if (!pools->bs)
goto free_tio_pool_and_out;

diff --git a/drivers/md/dm.h b/drivers/md/dm.h
index 0c2dd5f..d846ce0 100644
--- a/drivers/md/dm.h
+++ b/drivers/md/dm.h
@@ -149,7 +149,7 @@ void dm_kcopyd_exit(void);
/*
* Mempool operations
*/
-struct dm_md_mempools *dm_alloc_md_mempools(unsigned type);
+struct dm_md_mempools *dm_alloc_md_mempools(unsigned type, unsigned int);
void dm_free_md_mempools(struct dm_md_mempools *pools);

#endif
diff --git a/fs/bio.c b/fs/bio.c
index 4bd454f..6e4a381 100644
--- a/fs/bio.c
+++ b/fs/bio.c
@@ -1603,9 +1603,10 @@ void bioset_free(struct bio_set *bs)
EXPORT_SYMBOL(bioset_free);

/**
- * bioset_create - Create a bio_set
+ * bioset_create_flags - Create a bio_set
* @pool_size: Number of bio and bio_vecs to cache in the mempool
* @front_pad: Number of bytes to allocate in front of the returned bio
+ * @flags: Flags that affect memory allocation
*
* Description:
* Set up a bio_set to be used with @bio_alloc_bioset. Allows the caller
@@ -1615,7 +1616,8 @@ EXPORT_SYMBOL(bioset_free);
* Note that the bio must be embedded at the END of that structure always,
* or things will break badly.
*/
-struct bio_set *bioset_create(unsigned int pool_size, unsigned int front_pad)
+struct bio_set *bioset_create_flags(unsigned int pool_size,
+ unsigned int front_pad, unsigned int flags)
{
unsigned int back_pad = BIO_INLINE_VECS * sizeof(struct bio_vec);
struct bio_set *bs;
@@ -1636,7 +1638,8 @@ struct bio_set *bioset_create(unsigned int pool_size, unsigned int front_pad)
if (!bs->bio_pool)
goto bad;

- if (bioset_integrity_create(bs, pool_size))
+ if ((flags & BIOSET_NO_INTEGRITY) == 0 &&
+ bioset_integrity_create(bs, pool_size))
goto bad;

if (!biovec_create_pools(bs, pool_size))
@@ -1646,6 +1649,12 @@ bad:
bioset_free(bs);
return NULL;
}
+EXPORT_SYMBOL(bioset_create_flags);
+
+struct bio_set *bioset_create(unsigned int pool_size, unsigned int front_pad)
+{
+ return bioset_create_flags(pool_size, front_pad, 0);
+}
EXPORT_SYMBOL(bioset_create);

static void __init biovec_init_slabs(void)
diff --git a/include/linux/bio.h b/include/linux/bio.h
index 35dcdb3..2f758f3 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -208,7 +208,12 @@ struct bio_pair {
extern struct bio_pair *bio_split(struct bio *bi, int first_sectors);
extern void bio_pair_release(struct bio_pair *dbio);

+enum bioset_flags {
+ BIOSET_NO_INTEGRITY = (1 << 0),
+};
+
extern struct bio_set *bioset_create(unsigned int, unsigned int);
+extern struct bio_set *bioset_create_flags(unsigned int, unsigned int, unsigned int);
extern void bioset_free(struct bio_set *);

extern struct bio *bio_alloc(gfp_t, int);

 
03-07-2011, 01:24 PM
Zdenek Kabelac

mirrored device with thousands of mapping table entries

On 7.3.2011 03:59, Martin K. Petersen wrote:
>>>>>> "Zdenek" == Zdenek Kabelac <zkabelac@redhat.com> writes:
>
> Zdenek> My findings seem to show that the BIP-256 slabtop segment grows
> Zdenek> by ~73KB per device (while dm-io is about ~26KB)
>
> Ok, I see it now that I tried with a bunch of DM devices.
>
> DM allocates a bioset per volume. And since each bioset has an integrity
> mempool you'll end up with a bunch of memory locked down. It seems like
> a lot but it's actually the same amount as we reserve for the data path
> (bio-0 + biovec-256).
>
> Since a bioset is not necessarily tied to a single block device we can't
> automatically decide whether to allocate the integrity pool or not. In
> the DM case, however, we just set up the integrity profile so the
> information is available.
>
> Can you please try the following patch? This will change things so we
> only attach an integrity pool to the bioset if the logical volume is
> integrity-capable.
>

Yep - the patch seems to fix the problem with the wasted memory.

Thanks

Tested-by: zkabelac@redhat.com

 
