Old 03-12-2009, 12:40 AM
Kiyoshi Ueda
 
Default dm: request-based dm-multipath

Hi Hannes,

On 2009/03/11 21:28 +0900, Hannes Reinecke wrote:
> Hi Kiyoshi,
>
> Kiyoshi Ueda wrote:
>> Hi Hannes,
>>
> [ .. ]
>>
>> Suspend was broken.
>> dm_suspend() recognized that suspend completed while some requests
>> were still in flight. So we could swap/free the in-use table while
>> there were still in-flight requests.
>> The patch is like the attached one, although it is not finalized and
>> I'm still testing it.
>> I'll post an updated patch-set including the attached patch
>> this week or next week.
>>
>>
>> ---
>> drivers/md/dm.c | 236 ++++++++++++++++++++++++++++++++++----------------------
>> 1 file changed, 144 insertions(+), 92 deletions(-)
>>
>> Index: 2.6.29-rc2/drivers/md/dm.c
>> ===================================================================
>> --- 2.6.29-rc2.orig/drivers/md/dm.c
>> +++ 2.6.29-rc2/drivers/md/dm.c
>> @@ -701,11 +701,17 @@ static void free_bio_clone(struct reques
>> }
>> }
>>
>> -static void dec_rq_pending(struct dm_rq_target_io *tio)
>> +/*
>> + * XXX: Not taking queue lock for efficiency.
>> + * For correctness, waiters will check that again with queue lock held.
>> + * No false negative because this function will be called every time
>> + * in_flight is decremented.
>> + */
>> +static void rq_completed(struct mapped_device *md)
>> {
>> - if (!atomic_dec_return(&tio->md->pending))
>> + if (!md->queue->in_flight)
>> /* nudge anyone waiting on suspend queue */
>> - wake_up(&tio->md->wait);
>> + wake_up(&md->wait);
>> }
>>
> Hmm. Don't think that's a good idea. Either take the spinlock here or
> in_flight should be atomic.

Thank you for the comment.
OK, I'll change it to take the queue_lock here for maintainability now,
although the queue_lock is not logically needed.
Then I'll post a separate patch in the future that drops the queue_lock
for efficiency.

Thanks,
Kiyoshi Ueda

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
 
Old 03-12-2009, 07:58 AM
Kiyoshi Ueda
 
Default dm: request-based dm-multipath

Hi Hannes,

On 2009/03/10 16:17 +0900, Hannes Reinecke wrote:
>>>>> o kernel panic occurs by frequent table swapping during heavy I/Os.
>>>>>
>>>> That's probably fixed by this patch:
>>>>
>>>> --- linux-2.6.27/drivers/md/dm.c.orig 2009-01-23 15:59:22.741461315 +0100
>>>> +++ linux-2.6.27/drivers/md/dm.c 2009-01-26 09:03:02.787605723 +0100
>>>> @@ -714,13 +714,14 @@ static void free_bio_clone(struct reques
>>>> struct dm_rq_target_io *tio = clone->end_io_data;
>>>> struct mapped_device *md = tio->md;
>>>> struct bio *bio;
>>>> - struct dm_clone_bio_info *info;
>>>>
>>>> while ((bio = clone->bio) != NULL) {
>>>> clone->bio = bio->bi_next;
>>>>
>>>> - info = bio->bi_private;
>>>> - free_bio_info(md, info);
>>>> + if (bio->bi_private) {
>>>> + struct dm_clone_bio_info *info = bio->bi_private;
>>>> + free_bio_info(md, info);
>>>> + }
>>>>
>>>> bio->bi_private = md->bs;
>>>> bio_put(bio);
>>>>
>>>> The info field is not necessarily filled here, so we have to check
>>>> for it
>>>> explicitly.
>>>>
>>>> With these two patches request-based multipathing have survived all
>>>> stress-tests
>>>> so far. Except on mainframe (zfcp), but that's more a driver-related
>>>> thing.
>>
>> Do you hit some problem without the patch above?
>> If so, that should be a programming bug and we need to fix it.
>> Otherwise, we would be leaking memory (since all cloned bios should
>> always have the dm_clone_bio_info structure in ->bi_private).
>>
> Yes, I've found that one later on.
> The real problem was in clone_setup_bios(), which might end up calling an
> invalid end_io_data pointer. Patch is attached.

Nice catch! Thank you for the patch.

> -static void free_bio_clone(struct request *clone)
> +static void free_bio_clone(struct request *clone, struct mapped_device *md)

I have changed the argument order to match with other free_* functions:
free_bio_clone(struct mapped_device *md, struct request *clone)

Thanks,
Kiyoshi Ueda

 
Old 03-12-2009, 08:08 AM
Hannes Reinecke
 
Default dm: request-based dm-multipath

Hi Kiyoshi,

Kiyoshi Ueda wrote:
> Hi Hannes,
>
> On 2009/03/10 16:17 +0900, Hannes Reinecke wrote:
> [ .. ]
>> Yes, I've found that one later on.
>> The real problem was in clone_setup_bios(), which might end up calling an
>> invalid end_io_data pointer. Patch is attached.
>
> Nice catch! Thank you for the patch.

Oh, nae bother. Took me only a month to track it down :-(

>> -static void free_bio_clone(struct request *clone)
>> +static void free_bio_clone(struct request *clone, struct mapped_device *md)
>
> I have changed the argument order to match with other free_* functions:
> free_bio_clone(struct mapped_device *md, struct request *clone)

Sure. I wasn't sure myself which way round the arguments should be.

Do you have an updated patch of your suspend fixes? We've run into an issue
here which looks suspiciously close to that one (I/O is completed on a deleted
pgpath), so we would be happy to test these out.

Cheers,

Hannes
--
Dr. Hannes Reinecke zSeries & Storage
hare@suse.de +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)

 
Old 03-13-2009, 12:03 AM
Kiyoshi Ueda
 
Default dm: request-based dm-multipath

Hi Hannes,

On 2009/03/12 18:08 +0900, Hannes Reinecke wrote:
> Do you have an updated patch of your suspend fixes? We've run into an issue
> here which looks suspiciously close to that one (I/O is completed on a
> deleted pgpath), so we would be happy to test these out.

You mean that the issue occurs WITHOUT the suspend fix patch which
I sent, is that right?
If so, you can use it, since I haven't made any big changes to the
suspend fix since then.
The only logic change to the suspend fix since then is in
rq_completed(), following your comment. The updated rq_completed()
is below:

static void rq_completed(struct mapped_device *md)
{
        struct request_queue *q = md->queue;
        unsigned long flags;

        spin_lock_irqsave(q->queue_lock, flags);
        if (q->in_flight) {
                spin_unlock_irqrestore(q->queue_lock, flags);
                return;
        }
        spin_unlock_irqrestore(q->queue_lock, flags);

        /* nudge anyone waiting on suspend queue */
        wake_up(&md->wait);
}


I merged the previous suspend fix patch into the request-based dm
core patch, and I've been changing the core patch since then.
So I don't have a patch containing only the suspend fix update.
Sorry about that.

Thanks,
Kiyoshi Ueda

 
