FAQ Search Today's Posts Mark Forums Read
» Video Reviews

» Linux Archive

Linux-archive is a website aiming to archive linux email lists and to make them easily accessible for linux users/developers.


» Sponsor

» Partners

» Sponsor

Go Back   Linux Archive > Redhat > Device-mapper Development

 
 
LinkBack Thread Tools
 
Old 05-18-2012, 02:59 AM
 
Default block: Only clone bio vecs that are in use

From: Kent Overstreet <koverstreet@google.com>

bcache creates large bios internally, and then splits them according to
the device requirements before it sends them down. If a lower level
device tries to clone the bio, and the original bio had more than
BIO_MAX_PAGES, the clone will fail unecessarily.

We can fix this by only cloning the bio vecs that are actually in use.

Signed-off-by: Kent Overstreet <koverstreet@google.com>
---
fs/bio.c | 13 +++++++------
1 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/fs/bio.c b/fs/bio.c
index e2c0970..de0733e 100644
--- a/fs/bio.c
+++ b/fs/bio.c
@@ -435,8 +435,9 @@ EXPORT_SYMBOL(bio_phys_segments);
*/
void __bio_clone(struct bio *bio, struct bio *bio_src)
{
- memcpy(bio->bi_io_vec, bio_src->bi_io_vec,
- bio_src->bi_max_vecs * sizeof(struct bio_vec));
+ memcpy(bio->bi_io_vec,
+ bio_iovec(bio_src),
+ bio_segments(bio_src) * sizeof(struct bio_vec));

/*
* most users will be overriding ->bi_bdev with a new target,
@@ -445,10 +446,10 @@ void __bio_clone(struct bio *bio, struct bio *bio_src)
bio->bi_sector = bio_src->bi_sector;
bio->bi_bdev = bio_src->bi_bdev;
bio->bi_flags |= 1 << BIO_CLONED;
+ bio->bi_flags &= ~(1 << BIO_SEG_VALID);
bio->bi_rw = bio_src->bi_rw;
- bio->bi_vcnt = bio_src->bi_vcnt;
+ bio->bi_vcnt = bio_segments(bio_src);
bio->bi_size = bio_src->bi_size;
- bio->bi_idx = bio_src->bi_idx;
}
EXPORT_SYMBOL(__bio_clone);

@@ -463,7 +464,7 @@ EXPORT_SYMBOL(__bio_clone);
struct bio *bio_clone_bioset(struct bio *bio, gfp_t gfp_mask,
struct bio_set *bs)
{
- struct bio *b = bio_alloc_bioset(gfp_mask, bio->bi_max_vecs, bs);
+ struct bio *b = bio_alloc_bioset(gfp_mask, bio_segments(bio), bs);

if (!b)
return NULL;
@@ -493,7 +494,7 @@ EXPORT_SYMBOL(bio_clone);

struct bio *bio_clone_kmalloc(struct bio *bio, gfp_t gfp_mask)
{
- struct bio *b = bio_kmalloc(gfp_mask, bio->bi_max_vecs);
+ struct bio *b = bio_kmalloc(gfp_mask, bio_segments(bio));

if (!b)
return NULL;
--
1.7.9.rc2

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
 
Old 08-06-2012, 11:16 PM
Mikulas Patocka
 
Default block: Only clone bio vecs that are in use

Hi Kent

When you change the semantics of an exported function, rename that
function. There may be external modules that use __bio_clone and this
change could silently introduce bugs in them.

Otherwise, the patchset looks fine.

Mikulas


On Mon, 6 Aug 2012, Kent Overstreet wrote:

> bcache creates large bios internally, and then splits them according to
> the device requirements before it sends them down. If a lower level
> device tries to clone the bio, and the original bio had more than
> BIO_MAX_PAGES, the clone will fail unecessarily.
>
> We can fix this by only cloning the bio vecs that are actually in use.
>
> Signed-off-by: Kent Overstreet <koverstreet@google.com>
> ---
> drivers/block/rbd.c | 2 +-
> drivers/md/dm.c | 5 ++---
> fs/bio.c | 13 +++++++------
> 3 files changed, 10 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
> index 692cf05..21edfe5 100644
> --- a/drivers/block/rbd.c
> +++ b/drivers/block/rbd.c
> @@ -714,7 +714,7 @@ static struct bio *bio_chain_clone(struct bio **old, struct bio **next,
> }
>
> while (old_chain && (total < len)) {
> - tmp = bio_kmalloc(gfpmask, old_chain->bi_max_vecs);
> + tmp = bio_kmalloc(gfpmask, bio_segments(old_chain));
> if (!tmp)
> goto err_out;
>
> diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> index 3f3c26e..193fb19 100644
> --- a/drivers/md/dm.c
> +++ b/drivers/md/dm.c
> @@ -1057,11 +1057,10 @@ static struct bio *clone_bio(struct bio *bio, sector_t sector,
> {
> struct bio *clone;
>
> - clone = bio_alloc_bioset(GFP_NOIO, bio->bi_max_vecs, bs);
> + clone = bio_alloc_bioset(GFP_NOIO, bv_count, bs);
> __bio_clone(clone, bio);
> clone->bi_sector = sector;
> - clone->bi_idx = idx;
> - clone->bi_vcnt = idx + bv_count;
> + clone->bi_vcnt = bv_count;
> clone->bi_size = to_bytes(len);
> clone->bi_flags &= ~(1 << BIO_SEG_VALID);
>
> diff --git a/fs/bio.c b/fs/bio.c
> index 38ad026..631c67e 100644
> --- a/fs/bio.c
> +++ b/fs/bio.c
> @@ -449,8 +449,9 @@ EXPORT_SYMBOL(bio_phys_segments);
> */
> void __bio_clone(struct bio *bio, struct bio *bio_src)
> {
> - memcpy(bio->bi_io_vec, bio_src->bi_io_vec,
> - bio_src->bi_max_vecs * sizeof(struct bio_vec));
> + memcpy(bio->bi_io_vec,
> + bio_iovec(bio_src),
> + bio_segments(bio_src) * sizeof(struct bio_vec));
>
> /*
> * most users will be overriding ->bi_bdev with a new target,
> @@ -459,10 +460,10 @@ void __bio_clone(struct bio *bio, struct bio *bio_src)
> bio->bi_sector = bio_src->bi_sector;
> bio->bi_bdev = bio_src->bi_bdev;
> bio->bi_flags |= 1 << BIO_CLONED;
> + bio->bi_flags &= ~(1 << BIO_SEG_VALID);
> bio->bi_rw = bio_src->bi_rw;
> - bio->bi_vcnt = bio_src->bi_vcnt;
> + bio->bi_vcnt = bio_segments(bio_src);
> bio->bi_size = bio_src->bi_size;
> - bio->bi_idx = bio_src->bi_idx;
> }
> EXPORT_SYMBOL(__bio_clone);
>
> @@ -477,7 +478,7 @@ EXPORT_SYMBOL(__bio_clone);
> struct bio *bio_clone_bioset(struct bio *bio, gfp_t gfp_mask,
> struct bio_set *bs)
> {
> - struct bio *b = bio_alloc_bioset(gfp_mask, bio->bi_max_vecs, bs);
> + struct bio *b = bio_alloc_bioset(gfp_mask, bio_segments(bio), bs);
>
> if (!b)
> return NULL;
> @@ -507,7 +508,7 @@ EXPORT_SYMBOL(bio_clone);
>
> struct bio *bio_clone_kmalloc(struct bio *bio, gfp_t gfp_mask)
> {
> - struct bio *b = bio_kmalloc(gfp_mask, bio->bi_max_vecs);
> + struct bio *b = bio_kmalloc(gfp_mask, bio_segments(bio));
>
> if (!b)
> return NULL;
> --
> 1.7.7.3
>
> --
> dm-devel mailing list
> dm-devel@redhat.com
> https://www.redhat.com/mailman/listinfo/dm-devel
>

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
 
Old 08-08-2012, 11:28 PM
Tejun Heo
 
Default block: Only clone bio vecs that are in use

On Mon, Aug 06, 2012 at 07:16:33PM -0400, Mikulas Patocka wrote:
> Hi Kent
>
> When you change the semantics of an exported function, rename that
> function. There may be external modules that use __bio_clone and this
> change could silently introduce bugs in them.
>
> Otherwise, the patchset looks fine.

I don't know. This doesn't change the main functionality and should
be transparent unless the caller is doing something crazy. It *might*
be nice to rename but I don't think that's a must here.

Thanks.

--
tejun

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
 
Old 08-08-2012, 11:47 PM
Muthu Kumar
 
Default block: Only clone bio vecs that are in use

Tejun,

This is changing the semantics of the clone. Sorry, I missed this
thread and replied separately. But anyway, replying it again here:


On Wed, Aug 8, 2012 at 4:28 PM, Tejun Heo <tj@kernel.org> wrote:
> On Mon, Aug 06, 2012 at 07:16:33PM -0400, Mikulas Patocka wrote:
>> Hi Kent
>>
>> When you change the semantics of an exported function, rename that
>> function. There may be external modules that use __bio_clone and this
>> change could silently introduce bugs in them.
>>
>> Otherwise, the patchset looks fine.
>
> I don't know. This doesn't change the main functionality and should
> be transparent unless the caller is doing something crazy. It *might*
> be nice to rename but I don't think that's a must here.
>
> Thanks.

--
You are changing the meaning of __bio_clone() here. In old code, the
number of io_vecs, bi_idx, bi_vcnt are preserved. But in this modified
code, you are mapping bio_src's bi_iovec[bi_idx] to bio_dests
bi_iovec[0] and also restricting the number of allocated io_vecs of
the clone. It may be useful for cases were we would like a identical
copy of the original bio (may not be in current code base, but this
implementation is definitely not what one would expect from the name
"clone").

May be, call this new implementation some thing else (and use it for bcache)?

---

Like Mikulas pointed out, this is an exported function and silently
changing the semantics will break external modules.

Regards,
Muthu


>
> --
> tejun
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
 
Old 08-09-2012, 03:19 AM
Kent Overstreet
 
Default block: Only clone bio vecs that are in use

On Wed, Aug 08, 2012 at 04:47:46PM -0700, Muthu Kumar wrote:
> Tejun,
>
> This is changing the semantics of the clone. Sorry, I missed this
> thread and replied separately. But anyway, replying it again here:
>
>
> On Wed, Aug 8, 2012 at 4:28 PM, Tejun Heo <tj@kernel.org> wrote:
> > On Mon, Aug 06, 2012 at 07:16:33PM -0400, Mikulas Patocka wrote:
> >> Hi Kent
> >>
> >> When you change the semantics of an exported function, rename that
> >> function. There may be external modules that use __bio_clone and this
> >> change could silently introduce bugs in them.
> >>
> >> Otherwise, the patchset looks fine.
> >
> > I don't know. This doesn't change the main functionality and should
> > be transparent unless the caller is doing something crazy. It *might*
> > be nice to rename but I don't think that's a must here.
> >
> > Thanks.
>
> --
> You are changing the meaning of __bio_clone() here. In old code, the
> number of io_vecs, bi_idx, bi_vcnt are preserved. But in this modified
> code, you are mapping bio_src's bi_iovec[bi_idx] to bio_dests
> bi_iovec[0] and also restricting the number of allocated io_vecs of
> the clone. It may be useful for cases were we would like a identical
> copy of the original bio (may not be in current code base, but this
> implementation is definitely not what one would expect from the name
> "clone").

The problem is that bio_clone() is used on bios that were not allocated
or submitted by the cloning module.

If some code somewher submits a bio that points to 500 pages, but by the
time it gets to a driver it only points to 200 pages (say, because it
was split), that clone should succeed; it shouldn't fail simply because
it was trying to clone more than was necessary.

Bios have certain (poorly documented) semantics, and if this breaks
anything it's probably because that code was doing something crazy in
the first place.

In particular, if this change breaks anything then the new bio_split()
_will_ break things.

We need to be clear about our interfaces; in this case bi_idx and
bi_vcnt, in particular. Either this is a safe change, or it's not. If
no one knows... that's a bigger problem, and not just for this patch...

Fortunately this code actually has been tested quite a bit (and the bio
splitting code for even longer), and (somewhat to my surprise) I haven't
run into any bugs caused by it.

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
 
Old 08-09-2012, 03:25 AM
Kent Overstreet
 
Default block: Only clone bio vecs that are in use

On Wed, Aug 8, 2012 at 8:19 PM, Kent Overstreet <koverstreet@google.com> wrote:
> In particular, if this change breaks anything then the new bio_split()
> _will_ break things.
>
> We need to be clear about our interfaces; in this case bi_idx and
> bi_vcnt, in particular. Either this is a safe change, or it's not. If
> no one knows... that's a bigger problem, and not just for this patch...
>
> Fortunately this code actually has been tested quite a bit (and the bio
> splitting code for even longer), and (somewhat to my surprise) I haven't
> run into any bugs caused by it.

Oh, it's worse than that - remember how bio_kmalloc() works for up to
1024 pages, but bio_alloc_bioset() only up to 256?

We can already have situations where bios are allocated that are
impossible to clone (or if we don't, it's only because of
queue_limits. That's not sketchy at all.)

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
 
Old 08-09-2012, 07:01 AM
Tejun Heo
 
Default block: Only clone bio vecs that are in use

Hello,

On Wed, Aug 08, 2012 at 04:47:46PM -0700, Muthu Kumar wrote:
> You are changing the meaning of __bio_clone() here. In old code, the
> number of io_vecs, bi_idx, bi_vcnt are preserved. But in this modified
> code, you are mapping bio_src's bi_iovec[bi_idx] to bio_dests
> bi_iovec[0] and also restricting the number of allocated io_vecs of
> the clone. It may be useful for cases were we would like a identical
> copy of the original bio (may not be in current code base, but this
> implementation is definitely not what one would expect from the name
> "clone").

Implementation details changed somewhat but the high-level semantics
didn't change at all. Any driver not messing with bio internals - and
they shouldn't - shouldn't notice the change. No in-kernel drivers
seem to be broken by the change. If you ask me, this looks more like
a bug fix to me where the bug is a silly behavior restricting
usefulness of the interface.

> May be, call this new implementation some thing else (and use it for bcache)?

This doesn't only change __bio_clone() but all clone interface stacked
on top of it, so, no way. This ain't windows.

Thanks.

--
tejun

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
 
Old 08-10-2012, 01:50 AM
Muthu Kumar
 
Default block: Only clone bio vecs that are in use

Kent,

>> --
>> You are changing the meaning of __bio_clone() here. In old code, the
>> number of io_vecs, bi_idx, bi_vcnt are preserved. But in this modified
>> code, you are mapping bio_src's bi_iovec[bi_idx] to bio_dests
>> bi_iovec[0] and also restricting the number of allocated io_vecs of
>> the clone. It may be useful for cases were we would like a identical
>> copy of the original bio (may not be in current code base, but this
>> implementation is definitely not what one would expect from the name
>> "clone").
>
> The problem is that bio_clone() is used on bios that were not allocated
> or submitted by the cloning module.
>
> If some code somewher submits a bio that points to 500 pages, but by the
> time it gets to a driver it only points to 200 pages (say, because it
> was split), that clone should succeed; it shouldn't fail simply because
> it was trying to clone more than was necessary.
>


I would say, the code that submits bio with 500 pages is broken. It
needs to take care of underlying restrictions before submitting bios.
And that's one of the reason for having bio_add_page(), especially for
stackable modules.


> Bios have certain (poorly documented) semantics, and if this breaks
> anything it's probably because that code was doing something crazy in
> the first place.
>

Agree. But doing the above doesn't help in improving the situation,
just makes it even less clear.

Regards,
Muthu.


> In particular, if this change breaks anything then the new bio_split()
> _will_ break things.
>
> We need to be clear about our interfaces; in this case bi_idx and
> bi_vcnt, in particular. Either this is a safe change, or it's not. If
> no one knows... that's a bigger problem, and not just for this patch...
>
> Fortunately this code actually has been tested quite a bit (and the bio
> splitting code for even longer), and (somewhat to my surprise) I haven't
> run into any bugs caused by it.

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
 
Old 08-10-2012, 02:29 AM
Muthu Kumar
 
Default block: Only clone bio vecs that are in use

Tejun,

On Thu, Aug 9, 2012 at 12:01 AM, Tejun Heo <tj@kernel.org> wrote:
> Hello,
>
> On Wed, Aug 08, 2012 at 04:47:46PM -0700, Muthu Kumar wrote:
>> You are changing the meaning of __bio_clone() here. In old code, the
>> number of io_vecs, bi_idx, bi_vcnt are preserved. But in this modified
>> code, you are mapping bio_src's bi_iovec[bi_idx] to bio_dests
>> bi_iovec[0] and also restricting the number of allocated io_vecs of
>> the clone. It may be useful for cases were we would like a identical
>> copy of the original bio (may not be in current code base, but this
>> implementation is definitely not what one would expect from the name
>> "clone").
>
> Implementation details changed somewhat but the high-level semantics
> didn't change at all. Any driver not messing with bio internals - and
> they shouldn't - shouldn't notice the change.

The reason for doing this change is because the code in question is
messing with bio internals.

No in-kernel drivers
> seem to be broken by the change. If you ask me, this looks more like
> a bug fix to me where the bug is a silly behavior restricting
> usefulness of the interface.
>
>> May be, call this new implementation some thing else (and use it for bcache)?
>
> This doesn't only change __bio_clone() but all clone interface stacked
> on top of it, so, no way.

>This ain't windows.

ah... when you put it this way, it gets a different perspective

Anyway, my point is, we shouldn't make it non-obvious ("clone" should
be just "clone"). But, we can always add more comments i guess.

Regards,
Muthu


>
> Thanks.
>
> --
> tejun

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
 

Thread Tools




All times are GMT. The time now is 10:48 AM.

VBulletin, Copyright ©2000 - 2014, Jelsoft Enterprises Ltd.
Content Relevant URLs by vBSEO ©2007, Crawlability, Inc.
Copyright 2007 - 2008, www.linux-archive.org