btrfs: dev-replace: go back to suspend state if another EXCL_OP is running
author Anand Jain <anand.jain@oracle.com>
Sun, 11 Nov 2018 14:22:18 +0000 (22:22 +0800)
committer David Sterba <dsterba@suse.com>
Mon, 17 Dec 2018 13:51:34 +0000 (14:51 +0100)
In a scenario where balance and replace coexist, as below,

  - start balance
  - pause balance
  - start replace
  - reboot

then when the system restarts, balance resumes first. The replace then
attempts to resume but fails, because the EXCL_OP lock is already held
by the balance. In that case, put the replace state back to
BTRFS_IOCTL_DEV_REPLACE_STATE_SUSPENDED.
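
The intended behaviour can be sketched in userspace as follows. This is
only an illustrative model, not kernel code: struct fs_model, enum
replace_state, claim_excl_op() and resume_replace() are hypothetical
names, a C11 atomic_flag stands in for the BTRFS_FS_EXCL_OP bit, and the
dev-replace write lock taken by the real patch is omitted. It also
assumes the state has already been switched to "started" before the
exclusive-operation check is reached.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel state values. */
enum replace_state { REPLACE_SUSPENDED, REPLACE_STARTED };

struct fs_model {
	atomic_flag excl_op;		/* models the BTRFS_FS_EXCL_OP bit */
	enum replace_state state;	/* models dev_replace->replace_state */
};

static void fs_model_init(struct fs_model *fs)
{
	atomic_flag_clear(&fs->excl_op);
	fs->state = REPLACE_SUSPENDED;
}

/* Models another exclusive operation (e.g. balance) taking the bit first. */
static bool claim_excl_op(struct fs_model *fs)
{
	/* Returns the previous value, much like test_and_set_bit(). */
	return !atomic_flag_test_and_set(&fs->excl_op);
}

/* Returns true if replace resumed, false if it was left suspended. */
static bool resume_replace(struct fs_model *fs)
{
	/* Assumption: the state is marked started before the check. */
	fs->state = REPLACE_STARTED;

	if (!claim_excl_op(fs)) {
		/* Someone else holds the exclusive op: go back to suspended. */
		fs->state = REPLACE_SUSPENDED;
		return false;
	}
	return true;
}
```

Without the step that restores the suspended state, the model would
report "started" even though no replace is actually running, which is
the inconsistency this change addresses.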

Fixes: 010a47bde9420 ("btrfs: add proper safety check before resuming dev-replace")
CC: stable@vger.kernel.org # 4.18+
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
fs/btrfs/dev-replace.c

index 11df8f778b637478701dbc02c31ed70f6f11505a..33d07c426c59ee949bf2aecbb13a4c44a080a928 100644
@@ -903,6 +903,10 @@ int btrfs_resume_dev_replace_async(struct btrfs_fs_info *fs_info)
         * dev-replace to start anyway.
         */
        if (test_and_set_bit(BTRFS_FS_EXCL_OP, &fs_info->flags)) {
+               btrfs_dev_replace_write_lock(dev_replace);
+               dev_replace->replace_state =
+                                       BTRFS_IOCTL_DEV_REPLACE_STATE_SUSPENDED;
+               btrfs_dev_replace_write_unlock(dev_replace);
                btrfs_info(fs_info,
                "cannot resume dev-replace, other exclusive operation running");
                return 0;