nvme-fc: correct hang in nvme_ns_remove()
author James Smart <jsmart2021@gmail.com>
Thu, 11 Jan 2018 23:21:38 +0000 (15:21 -0800)
committer Christoph Hellwig <hch@lst.de>
Wed, 17 Jan 2018 16:55:02 +0000 (17:55 +0100)
When connectivity is lost to a device, the association is terminated
and the blk-mq queues are quiesced/stopped. When connectivity is
re-established, they are resumed.

If connectivity is lost long enough that the controller is deleted,
the delete path starts tearing down queues and eventually calls
nvme_ns_remove(). It appears that pending commands may cause
blk_cleanup_queue() to never complete, so the teardown stalls.

Correct this by starting the ns queues after transitioning to the
DELETING state, so that pending commands are flushed with io failures.
The delete path is therefore clear by the time it is reached.
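
The reasoning above can be illustrated with a small user-space toy
model. This is only a sketch under stated assumptions: every name in it
(toy_queue, toy_start_queue, toy_cleanup_can_finish,
toy_delete_path_demo) is hypothetical and is not a kernel API, and it
does not reproduce how blk-mq or nvme-fc actually track requests. It
only shows why a stopped queue with parked commands blocks cleanup, and
why restarting it in a deleting state lets the commands fail out.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy user-space model; none of these names are kernel APIs. */
enum toy_ctrl_state { TOY_LIVE, TOY_DELETING };

struct toy_queue {
	bool stopped;   /* models a quiesced/stopped queue */
	int  pending;   /* commands parked while the queue is stopped */
	int  failed;    /* commands completed with an error */
	int  completed; /* commands completed successfully */
};

/*
 * Restart the queue and drain everything that was parked. In the
 * DELETING state the model completes each command with an error
 * instead of sending it anywhere.
 */
static void toy_start_queue(struct toy_queue *q, enum toy_ctrl_state state)
{
	q->stopped = false;
	while (q->pending > 0) {
		q->pending--;
		if (state == TOY_DELETING)
			q->failed++;
		else
			q->completed++;
	}
}

/* Models the cleanup step: it can finish only once nothing is pending. */
static bool toy_cleanup_can_finish(const struct toy_queue *q)
{
	return q->pending == 0;
}

/* Walks the delete path of the model and returns the failed count. */
static int toy_delete_path_demo(void)
{
	/* Connectivity was lost: queue stopped with three parked commands. */
	struct toy_queue q = { .stopped = true, .pending = 3 };

	/* Without a restart, cleanup would wait on the parked commands. */
	assert(!toy_cleanup_can_finish(&q));

	/* Restart after moving to DELETING: parked commands fail fast. */
	toy_start_queue(&q, TOY_DELETING);
	assert(toy_cleanup_can_finish(&q));

	return q.failed;
}
```

Calling toy_delete_path_demo() in this model drains the three parked
commands as failures, after which the modelled cleanup is free to
finish.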

Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index a10c77139f764e444b58a8f789e88ae64fc66729..b76ba4629e02a41b811fe9344d3596a7084427f1 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -2938,6 +2938,9 @@ nvme_fc_delete_ctrl(struct nvme_ctrl *nctrl)
         * waiting for io to terminate
         */
        nvme_fc_delete_association(ctrl);
+
+       /* resume the io queues so that things will fast fail */
+       nvme_start_queues(nctrl);
 }
 
 static void