drbd: don't block forever in disconnect during resync if fencing=r-a-stonith
author Lars Ellenberg <lars.ellenberg@linbit.com>
Thu, 16 Apr 2015 14:51:34 +0000 (16:51 +0200)
committer Jens Axboe <axboe@fb.com>
Wed, 25 Nov 2015 16:22:02 +0000 (09:22 -0700)
Disconnect should wait for pending bitmap IO.
But if that bitmap IO is not happening, because it is waiting for
pending application IO, and there is no progress, because the fencing
policy suspended application IO because of the disconnect,
then we deadlock.

The bitmap writeout in this case does not conflict with concurrent
application IO, so there is no point in waiting for it.
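
As a rough illustration of the condition change described above, the
following standalone sketch compares the old and new decision on whether
bitmap IO may be queued right away. It is not the driver code: the
sketch_ names and the enum values are hypothetical stand-ins, and the
plain int counter stands in for the atomic ap_bio_cnt.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the bitmap lock flags. The real enum lives in
 * the DRBD headers; the values here are placeholders, not the kernel's. */
enum sketch_bm_flag {
	SKETCH_BM_LOCKED_OTHER = 1,
	SKETCH_BM_LOCKED_CHANGE_ALLOWED = 2,
};

/* Old behaviour: queue bitmap IO only once no application IO is pending. */
static bool sketch_old_can_queue(int ap_bio_cnt)
{
	return ap_bio_cnt == 0;
}

/* New behaviour: also queue immediately if the caller indicates that
 * application IO does not conflict with this bitmap writeout. */
static bool sketch_new_can_queue(enum sketch_bm_flag flags, int ap_bio_cnt)
{
	return flags == SKETCH_BM_LOCKED_CHANGE_ALLOWED || ap_bio_cnt == 0;
}
```

With application IO suspended by the fencing policy, the pending count
never drops to zero, so the old check never passes; the new check passes
as soon as the caller's flag allows concurrent changes.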

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
drivers/block/drbd/drbd_main.c

index 136fa733a15e5efdefedecaab691bf491942592c..5b43dfb798191d0c90cc3b5115956575e56410f9 100644
@@ -3563,7 +3563,9 @@ void drbd_queue_bitmap_io(struct drbd_device *device,
 
        spin_lock_irq(&device->resource->req_lock);
        set_bit(BITMAP_IO, &device->flags);
-       if (atomic_read(&device->ap_bio_cnt) == 0) {
+       /* don't wait for pending application IO if the caller indicates that
+        * application IO does not conflict anyways. */
+       if (flags == BM_LOCKED_CHANGE_ALLOWED || atomic_read(&device->ap_bio_cnt) == 0) {
                if (!test_and_set_bit(BITMAP_IO_QUEUED, &device->flags))
                        drbd_queue_work(&first_peer_device(device)->connection->sender_work,
                                        &device->bm_io_work.w);