btrfs: Fix a deadlock in btrfs_dev_replace_finishing()
authorQu Wenruo <quwenruo@cn.fujitsu.com>
Wed, 20 Aug 2014 08:10:15 +0000 (16:10 +0800)
committerChris Mason <clm@fb.com>
Wed, 17 Sep 2014 20:38:22 +0000 (13:38 -0700)
commit12b894cb288d57292b01cf158177b6d5c89a6272
tree42716fe99beca6af385745eb575a1c506b5acf5f
parenta583c02664eea8796e80dd192a3bcc1d521939e5
btrfs: Fix a deadlock in btrfs_dev_replace_finishing()

btrfs-transacion:5657
[stack snip]
btrfs_bio_map()
    btrfs_bio_counter_inc_blocked()
        percpu_counter_inc(&fs_info->bio_counter)  ###bio_counter > 0(A)
        __btrfs_bio_map()
            btrfs_dev_replace_lock()
                mutex_lock(dev_replace->lock)    ###wait mutex(B)

btrfs:32612
[stack snip]
btrfs_dev_replace_start()
    btrfs_dev_replace_lock()
mutex_lock(dev_replace->lock)    ###hold mutex(B)
    btrfs_dev_replace_finishing()
        btrfs_rm_dev_replace_blocked()
            wait until percpu_counter_sum == 0    ###wait on bio_counter(A)

This bug can be triggered quite easily by the following test script:
http://pastebin.com/MQmb37Cy

This patch will fix the ABBA problem by calling
btrfs_dev_replace_unlock() before btrfs_rm_dev_replace_blocked().

The consistency of btrfs devices list and their superblocks is protected
by device_list_mutex, not btrfs_dev_replace_lock/unlock().
So it is safe the move btrfs_dev_replace_unlock() before
btrfs_rm_dev_replace_blocked().

Reported-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Cc: Stefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: Chris Mason <clm@fb.com>
fs/btrfs/dev-replace.c