dm: relax ordering of bio-based flush implementation
Unlike REQ_HARDBARRIER, REQ_FLUSH/FUA doesn't mandate any ordering
against other bio's. This patch relaxes ordering around flushes.
* A flush bio is no longer deferred to workqueue directly. It's
processed like other bio's but __split_and_process_bio() uses
md->flush_bio as the clone source. md->flush_bio is initialized to
empty flush during md initialization and shared for all flushes.
* When dec_pending() detects that a flush has completed, it checks
whether the original bio has data. If so, the bio is queued to the
deferred list w/ REQ_FLUSH cleared; otherwise, it's completed.
* As flush sequencing is handled in the usual issue/completion path,
dm_wq_work() no longer needs to handle flushes differently. Now its
only responsibility is re-issuing deferred bio's the same way as
_dm_request() would. REQ_FLUSH handling logic including
process_flush() is dropped.
* There's no reason for queue_io() and dm_wq_work() write lock
dm->io_lock. queue_io() now only uses md->deferred_lock and
dm_wq_work() read locks dm->io_lock.
* bio's no longer need to be queued on the deferred list while a flush
is in progress making DMF_QUEUE_IO_TO_THREAD unncessary. Drop it.
This avoids stalling the device during flushes and simplifies the
implementation.
start_io_acct(ci.io);
while (ci.sector_count && !error) {
- if (!(bio->bi_rw & REQ_FLUSH))
+ if (!is_flush)
error = __clone_and_map(&ci);
else
error = __clone_and_map_flush(&ci);
@@ -1490,22 +1469,14 @@ static int _dm_request(struct request_queue *q, struct bio *bio)
part_stat_add(cpu, &dm_disk(md)->part0, sectors[rw], bio_sectors(bio));
part_stat_unlock();
- /*
- * If we're suspended or the thread is processing flushes
- * we have to queue this io for later.
- */
- if (unlikely(test_bit(DMF_QUEUE_IO_TO_THREAD, &md->flags)) ||
- (bio->bi_rw & REQ_FLUSH)) {
+ /* if we're suspended, we have to queue this io for later */
+ if (unlikely(test_bit(DMF_BLOCK_IO_FOR_SUSPEND, &md->flags))) {
up_read(&md->io_lock);
while (!test_bit(DMF_BLOCK_IO_FOR_SUSPEND, &md->flags)) {
spin_lock_irq(&md->deferred_lock);
c = bio_list_pop(&md->deferred);
spin_unlock_irq(&md->deferred_lock);
- if (!c) {
- clear_bit(DMF_QUEUE_IO_TO_THREAD, &md->flags);
+ if (!c)
break;
- }
static void dm_queue_flush(struct mapped_device *md)
@@ -2672,17 +2610,12 @@ int dm_suspend(struct mapped_device *md, unsigned suspend_flags)
*
* To get all processes out of __split_and_process_bio in dm_request,
* we take the write lock. To prevent any process from reentering
- * __split_and_process_bio from dm_request, we set
- * DMF_QUEUE_IO_TO_THREAD.
- *
- * To quiesce the thread (dm_wq_work), we set DMF_BLOCK_IO_FOR_SUSPEND
- * and call flush_workqueue(md->wq). flush_workqueue will wait until
- * dm_wq_work exits and DMF_BLOCK_IO_FOR_SUSPEND will prevent any
- * further calls to __split_and_process_bio from dm_wq_work.
+ * __split_and_process_bio from dm_request and quiesce the thread
+ * (dm_wq_work), we set BMF_BLOCK_IO_FOR_SUSPEND and call
+ * flush_workqueue(md->wq).
*/
down_write(&md->io_lock);
set_bit(DMF_BLOCK_IO_FOR_SUSPEND, &md->flags);
- set_bit(DMF_QUEUE_IO_TO_THREAD, &md->flags);
up_write(&md->io_lock);
/*
--
1.7.1
--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
09-01-2010, 01:51 PM
Mike Snitzer
dm: relax ordering of bio-based flush implementation
On Mon, Aug 30 2010 at 5:58am -0400,
Tejun Heo <tj@kernel.org> wrote:
> Unlike REQ_HARDBARRIER, REQ_FLUSH/FUA doesn't mandate any ordering
> against other bio's. This patch relaxes ordering around flushes.
>
> * A flush bio is no longer deferred to workqueue directly. It's
> processed like other bio's but __split_and_process_bio() uses
> md->flush_bio as the clone source. md->flush_bio is initialized to
> empty flush during md initialization and shared for all flushes.
>
> * When dec_pending() detects that a flush has completed, it checks
> whether the original bio has data. If so, the bio is queued to the
> deferred list w/ REQ_FLUSH cleared; otherwise, it's completed.
>
> * As flush sequencing is handled in the usual issue/completion path,
> dm_wq_work() no longer needs to handle flushes differently. Now its
> only responsibility is re-issuing deferred bio's the same way as
> _dm_request() would. REQ_FLUSH handling logic including
> process_flush() is dropped.
>
> * There's no reason for queue_io() and dm_wq_work() write lock
> dm->io_lock. queue_io() now only uses md->deferred_lock and
> dm_wq_work() read locks dm->io_lock.
>
> * bio's no longer need to be queued on the deferred list while a flush
> is in progress making DMF_QUEUE_IO_TO_THREAD unncessary. Drop it.
>
> This avoids stalling the device during flushes and simplifies the
> implementation.
>
> Signed-off-by: Tejun Heo <tj@kernel.org>
Looks good overall.
> @@ -144,11 +143,6 @@ struct mapped_device {
> spinlock_t deferred_lock;
>
> /*
> - * An error from the flush request currently being processed.
> - */
> - int flush_error;
> -
> - /*
> * Protect barrier_error from concurrent endio processing
> * in request-based dm.
> */
Could you please document why it is OK to remove 'flush_error' in the
patch header? The -EOPNOTSUPP handling removal (done in patch 2)
obviously helps enable this but it is not clear how the
'num_flush_requests' flushes that __clone_and_map_flush() generates do
not need explicit DM error handling.
Other than that.
Acked-by: Mike Snitzer <snitzer@redhat.com>
--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
09-01-2010, 01:56 PM
Tejun Heo
dm: relax ordering of bio-based flush implementation
On 09/01/2010 03:51 PM, Mike Snitzer wrote:
> Could you please document why it is OK to remove 'flush_error' in the
> patch header? The -EOPNOTSUPP handling removal (done in patch 2)
> obviously helps enable this but it is not clear how the
> 'num_flush_requests' flushes that __clone_and_map_flush() generates do
> not need explicit DM error handling.
Sure, I'll. It's because it now uses the same error handling path in
dec_pending() all other bio's use. The flush_error thing was there
because flushes got executed/completed in a separate code path to
begin with. With the special path gone, there's no need for
flush_error path either.
Thanks.
--
tejun
--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
09-03-2010, 10:29 AM
Tejun Heo
dm: relax ordering of bio-based flush implementation
Unlike REQ_HARDBARRIER, REQ_FLUSH/FUA doesn't mandate any ordering
against other bio's. This patch relaxes ordering around flushes.
* A flush bio is no longer deferred to workqueue directly. It's
processed like other bio's but __split_and_process_bio() uses
md->flush_bio as the clone source. md->flush_bio is initialized to
empty flush during md initialization and shared for all flushes.
* As a flush bio now travels through the same execution path as other
bio's, there's no need for dedicated error handling path either. It
can use the same error handling path in dec_pending(). Dedicated
error handling removed along with md->flush_error.
* When dec_pending() detects that a flush has completed, it checks
whether the original bio has data. If so, the bio is queued to the
deferred list w/ REQ_FLUSH cleared; otherwise, it's completed.
* As flush sequencing is handled in the usual issue/completion path,
dm_wq_work() no longer needs to handle flushes differently. Now its
only responsibility is re-issuing deferred bio's the same way as
_dm_request() would. REQ_FLUSH handling logic including
process_flush() is dropped.
* There's no reason for queue_io() and dm_wq_work() write lock
dm->io_lock. queue_io() now only uses md->deferred_lock and
dm_wq_work() read locks dm->io_lock.
* bio's no longer need to be queued on the deferred list while a flush
is in progress making DMF_QUEUE_IO_TO_THREAD unncessary. Drop it.
This avoids stalling the device during flushes and simplifies the
implementation.
start_io_acct(ci.io);
while (ci.sector_count && !error) {
- if (!(bio->bi_rw & REQ_FLUSH))
+ if (!is_flush)
error = __clone_and_map(&ci);
else
error = __clone_and_map_flush(&ci);
@@ -1419,22 +1398,14 @@ static int _dm_request(struct request_queue *q, struct bio *bio)
part_stat_add(cpu, &dm_disk(md)->part0, sectors[rw], bio_sectors(bio));
part_stat_unlock();
- /*
- * If we're suspended or the thread is processing flushes
- * we have to queue this io for later.
- */
- if (unlikely(test_bit(DMF_QUEUE_IO_TO_THREAD, &md->flags)) ||
- (bio->bi_rw & REQ_FLUSH)) {
+ /* if we're suspended, we have to queue this io for later */
+ if (unlikely(test_bit(DMF_BLOCK_IO_FOR_SUSPEND, &md->flags))) {
up_read(&md->io_lock);
while (!test_bit(DMF_BLOCK_IO_FOR_SUSPEND, &md->flags)) {
spin_lock_irq(&md->deferred_lock);
c = bio_list_pop(&md->deferred);
spin_unlock_irq(&md->deferred_lock);
- if (!c) {
- clear_bit(DMF_QUEUE_IO_TO_THREAD, &md->flags);
+ if (!c)
break;
- }
static void dm_queue_flush(struct mapped_device *md)
@@ -2511,17 +2449,12 @@ int dm_suspend(struct mapped_device *md, unsigned suspend_flags)
*
* To get all processes out of __split_and_process_bio in dm_request,
* we take the write lock. To prevent any process from reentering
- * __split_and_process_bio from dm_request, we set
- * DMF_QUEUE_IO_TO_THREAD.
- *
- * To quiesce the thread (dm_wq_work), we set DMF_BLOCK_IO_FOR_SUSPEND
- * and call flush_workqueue(md->wq). flush_workqueue will wait until
- * dm_wq_work exits and DMF_BLOCK_IO_FOR_SUSPEND will prevent any
- * further calls to __split_and_process_bio from dm_wq_work.
+ * __split_and_process_bio from dm_request and quiesce the thread
+ * (dm_wq_work), we set BMF_BLOCK_IO_FOR_SUSPEND and call
+ * flush_workqueue(md->wq).
*/
down_write(&md->io_lock);
set_bit(DMF_BLOCK_IO_FOR_SUSPEND, &md->flags);
- set_bit(DMF_QUEUE_IO_TO_THREAD, &md->flags);
up_write(&md->io_lock);
/*
--
1.7.1
--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel