FAQ Search Today's Posts Mark Forums Read
» Video Reviews

» Linux Archive

Linux-archive is a website aiming to archive linux email lists and to make them easily accessible for linux users/developers.


» Sponsor

» Partners

» Sponsor

Go Back   Linux Archive > Redhat > Device-mapper Development

 
 
LinkBack Thread Tools
 
Old 01-30-2012, 04:54 AM
Tejun Heo
 
Default block: add missing block_bio_complete() tracepoint

On Mon, Jan 30, 2012 at 02:51:43PM +0900, Namhyung Kim wrote:
> Hi,
>
> 2012-01-30 11:53 AM, Tejun Heo wrote:
> >Hello,
> >
> >On Mon, Jan 30, 2012 at 11:49:43 AM +0900, Namhyung Kim wrote:
> >>Yes, blktrace will see both of completions for rq and bio's it
> >>contains from now on, and I thought it's your intention? For the
> >>bounced bio's, it'll see one more completion per bio.
> >
> >Umm... no, blktrace users shouldn't see confusing outputs. My point
> >was that we should leave raw tracepoints as raw monitor points so that
> >they can provide raw data to different users and filter from
> >blktrace.c so that the existing users aren't disturbed.
> >
>
> That means blktrace should not be changed, and not see duplicated
> completions for rq based drivers at all?

Yeap

--
tejun

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
 
Old 01-30-2012, 05:02 AM
Namhyung Kim
 
Default block: add missing block_bio_complete() tracepoint

2012-01-30 2:54 PM, Tejun Heo wrote:

On Mon, Jan 30, 2012 at 02:51:43PM +0900, Namhyung Kim wrote:

Hi,

2012-01-30 11:53 AM, Tejun Heo wrote:

Hello,

On Mon, Jan 30, 2012 at 11:49:43 AM +0900, Namhyung Kim wrote:

Yes, blktrace will see both of completions for rq and bio's it
contains from now on, and I thought it's your intention? For the
bounced bio's, it'll see one more completion per bio.


Umm... no, blktrace users shouldn't see confusing outputs. My point
was that we should leave raw tracepoints as raw monitor points so that
they can provide raw data to different users and filter from
blktrace.c so that the existing users aren't disturbed.



That means blktrace should not be changed, and not see duplicated
completions for rq based drivers at all?


Yeap



I see. I'll prepare the patch soon.

Thanks,
Namhyung

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
 
Old 01-30-2012, 05:38 AM
Namhyung Kim
 
Default block: add missing block_bio_complete() tracepoint

2012-01-29 6:41 PM, Namhyung Kim wrote:

The block_bio_complete() TP has been missed so long, so that bio-based
drivers haven't been able to trace its IO behavior. Add it.

In some rare cases, such as loop_switch, @bio->bi_bdev can be NULL.
Thus convert it to TRACE_EVENT_CONDITION() as Steven suggested.



Now I see that it seems TRACE_EVENT_CONDITION() can protect event
tracing from such condition, but what about other users of the TP like
blktrace? I think it'll still get NULL pointer dereference on
bdev_get_queue() after the change, right? If so, convert to T_E_C()
looks meaningless IMHO. Do I miss something?


Thanks,
Namhyung



From now on, request-based drivers will also get BLK_TA_COMPLETEs for
all bio's in requests. This needs to be handled in userland properly.

Also remove external use of the TP in DM and unexport it.

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: dm-devel@redhat.com
---
* v2:
- Merge previous patches into one as suggested by Tejun.
- Drop BIO_COMPLETE_MASK. Now I think it should be handled in userland
like other (maybe duplicated, in some sense) bio complete reports.

block/blk-core.c | 1 -
drivers/md/dm.c | 1 -
fs/bio.c | 2 ++
include/trace/events/block.h | 4 +++-
4 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 1b4fd93af2c0..399c128f516c 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -37,7 +37,6 @@

EXPORT_TRACEPOINT_SYMBOL_GPL(block_bio_remap);
EXPORT_TRACEPOINT_SYMBOL_GPL(block_rq_remap);
-EXPORT_TRACEPOINT_SYMBOL_GPL(block_bio_complete);

DEFINE_IDA(blk_queue_ida);

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 4720f68f817e..01185fa0eb74 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -648,7 +648,6 @@ static void dec_pending(struct dm_io *io, int error)
queue_io(md, bio);
} else {
/* done with normal IO or empty flush */
- trace_block_bio_complete(md->queue, bio, io_error);
bio_endio(bio, io_error);
}
}
diff --git a/fs/bio.c b/fs/bio.c
index b1fe82cf88cf..14c03eaf384e 100644
--- a/fs/bio.c
+++ b/fs/bio.c
@@ -1447,6 +1447,8 @@ void bio_endio(struct bio *bio, int error)
else if (!test_bit(BIO_UPTODATE,&bio->bi_flags))
error = -EIO;

+ trace_block_bio_complete(bdev_get_queue(bio->bi_bdev), bio, error);
+
if (bio->bi_end_io)
bio->bi_end_io(bio, error);
}
diff --git a/include/trace/events/block.h b/include/trace/events/block.h
index 05c5e61f0a7c..96955f4828b3 100644
--- a/include/trace/events/block.h
+++ b/include/trace/events/block.h
@@ -213,12 +213,14 @@ TRACE_EVENT(block_bio_bounce,
* This tracepoint indicates there is no further work to do on this
* block IO operation @bio.
*/
-TRACE_EVENT(block_bio_complete,
+TRACE_EVENT_CONDITION(block_bio_complete,

TP_PROTO(struct request_queue *q, struct bio *bio, int error),

TP_ARGS(q, bio, error),

+ TP_CONDITION(bio->bi_bdev != NULL),
+
TP_STRUCT__entry(
__field( dev_t, dev )
__field( sector_t, sector )


--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
 
Old 01-30-2012, 04:05 PM
Tejun Heo
 
Default block: add missing block_bio_complete() tracepoint

On Mon, Jan 30, 2012 at 03:38:57PM +0900, Namhyung Kim wrote:
> 2012-01-29 6:41 PM, Namhyung Kim wrote:
> >The block_bio_complete() TP has been missed so long, so that bio-based
> >drivers haven't been able to trace its IO behavior. Add it.
> >
> >In some rare cases, such as loop_switch, @bio->bi_bdev can be NULL.
> >Thus convert it to TRACE_EVENT_CONDITION() as Steven suggested.
> >
>
> Now I see that it seems TRACE_EVENT_CONDITION() can protect event
> tracing from such condition, but what about other users of the TP
> like blktrace? I think it'll still get NULL pointer dereference on
> bdev_get_queue() after the change, right? If so, convert to T_E_C()
> looks meaningless IMHO. Do I miss something?

Not really following, but the whole point of using
TRACE_EVENT_CONDITION() is avoiding the conditional jump when the TP
is disabled. Whether the TP users need to test again / more isn't too
important.

Thanks.

--
tejun

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
 
Old 01-31-2012, 05:30 AM
Namhyung Kim
 
Default block: add missing block_bio_complete() tracepoint

Hi,

2012-01-31 2:05 AM, Tejun Heo wrote:

On Mon, Jan 30, 2012 at 03:38:57PM +0900, Namhyung Kim wrote:

2012-01-29 6:41 PM, Namhyung Kim wrote:

The block_bio_complete() TP has been missed so long, so that bio-based
drivers haven't been able to trace its IO behavior. Add it.

In some rare cases, such as loop_switch, @bio->bi_bdev can be NULL.
Thus convert it to TRACE_EVENT_CONDITION() as Steven suggested.



Now I see that it seems TRACE_EVENT_CONDITION() can protect event
tracing from such condition, but what about other users of the TP
like blktrace? I think it'll still get NULL pointer dereference on
bdev_get_queue() after the change, right? If so, convert to T_E_C()
looks meaningless IMHO. Do I miss something?


Not really following, but the whole point of using
TRACE_EVENT_CONDITION() is avoiding the conditional jump when the TP
is disabled. Whether the TP users need to test again / more isn't too
important.

Thanks.



Right, but the point is it could make a NULL pointer dereference during
evaluation of the argument of the TP AFAICS. I'm not sure about the TP
implementation though, I think I was wrong - T_E_C() cannot protect us
from it because it happens just before jumping to the TP, right?


So I think we need a conditional jump (with the "likely" annotation) for
this even when the TP is disabled.


Thanks,
Namhyung

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
 
Old 01-31-2012, 09:39 AM
Tejun Heo
 
Default block: add missing block_bio_complete() tracepoint

Hello,

On Mon, Jan 30, 2012 at 10:30 PM, Namhyung Kim <namhyung@gmail.com> wrote:
> Right, but the point is it could make a NULL pointer dereference during
> evaluation of the argument of the TP AFAICS. I'm not sure about the TP
> implementation though, I think I was wrong - T_E_C() cannot protect us from
> it because it happens just before jumping to the TP, right?
>
> So I think we need a conditional jump (with the "likely" annotation) for
> this even when the TP is disabled.

Hmmm... still not following. Where the said NULL dereference happen?
TEC conditional is equivalent to "if (COND) TP;". If you don't use
TEC, it'll be "if (COND) if (TP enabled) TP;". With TEC, it will be
"if (TP enabled) if (COND) TP;". There's no other difference.

Thanks.

--
tejun

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
 
Old 02-01-2012, 01:18 AM
Namhyung Kim
 
Default block: add missing block_bio_complete() tracepoint

Hi,

2012-01-31 7:39 PM, Tejun Heo wrote:

Hello,

On Mon, Jan 30, 2012 at 10:30 PM, Namhyung Kim<namhyung@gmail.com> wrote:

Right, but the point is it could make a NULL pointer dereference during
evaluation of the argument of the TP AFAICS. I'm not sure about the TP
implementation though, I think I was wrong - T_E_C() cannot protect us from
it because it happens just before jumping to the TP, right?

So I think we need a conditional jump (with the "likely" annotation) for
this even when the TP is disabled.


Hmmm... still not following. Where the said NULL dereference happen?
TEC conditional is equivalent to "if (COND) TP;". If you don't use
TEC, it'll be "if (COND) if (TP enabled) TP;". With TEC, it will be
"if (TP enabled) if (COND) TP;". There's no other difference.

Thanks.



I've made a quick investigation on TP implementation, and finally
figured out what I was wrong - I thought the COND would be checkd in a
probe, but it's not. Thanks for pointing it out.


However, for some reason, it seems gcc generated code that evaluates the
arguments - bdev_get_queue() in this case - before checking the COND.
Simple test module below caused a NULL pointer dereference when I used
TRACE_EVENT_CONDITION(), but not for conditional jump:


#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/bio.h>

static int __init init_mod(void)
{
struct bio *bio = bio_alloc(GFP_KERNEL, 0);
bio_endio(bio, 0);
bio_put(bio);
return 0;
}

static void __exit exit_mod(void)
{
}

module_init(init_mod);
module_exit(exit_mod);


Thanks,
Namhyung

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
 

Thread Tools




All times are GMT. The time now is 09:26 PM.

VBulletin, Copyright ©2000 - 2014, Jelsoft Enterprises Ltd.
Content Relevant URLs by vBSEO ©2007, Crawlability, Inc.
Copyright 2007 - 2008, www.linux-archive.org