scsi: lpfc: Fix driver not recovering NVME rports during target link faults
author James Smart <jsmart2021@gmail.com>
Mon, 9 Apr 2018 21:24:29 +0000 (14:24 -0700)
committer Martin K. Petersen <martin.petersen@oracle.com>
Wed, 18 Apr 2018 23:34:05 +0000 (19:34 -0400)
During target-side port faults, the driver would not recover all target
port logins. This resulted in a loss of nvme device discovery.

The driver waits for all outstanding GID_FT requests to complete before
restarting discovery. One fault left the outstanding GID_FT count
improperly decremented, so discovery would never start. A second fault
was found in the clearing of the gidft_inp counter, which would be
skipped in this condition. A third fault was found in
lpfc_nvme_register_port, which would remove a reference on the ndlp; a
node swap on a port address change could then prematurely remove the
reference and release the ndlp.
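
The counter imbalance can be illustrated with a small user-space model.
This is only a sketch, not driver code: toy_vport, toy_issue_gidft,
toy_cmpl_gidft and toy_run are hypothetical names, and the model assumes
that every issued GID_FT increments gidft_inp and that the RSCN-deferred
completion path re-issues the request.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the vport; only the counter is modeled. */
struct toy_vport {
	int gidft_inp;	/* GID_FT requests currently outstanding */
};

/* Assumed behavior: the counter is bumped when a GID_FT is issued. */
static void toy_issue_gidft(struct toy_vport *v)
{
	v->gidft_inp++;
}

/*
 * Model of a GID_FT completion. With 'fixed' false, the RSCN-deferred
 * path re-issues the request without accounting for the completion.
 */
static void toy_cmpl_gidft(struct toy_vport *v, bool rscn_deferred, bool fixed)
{
	assert(v->gidft_inp > 0);	/* a completion implies an earlier issue */

	if (rscn_deferred) {
		if (fixed)
			v->gidft_inp--;	/* this GID_FT has completed */
		toy_issue_gidft(v);	/* skip the response, re-issue */
		return;
	}
	v->gidft_inp--;			/* normal completion */
}

/* Issue one GID_FT, defer it once, then let the retry complete. */
static int toy_run(bool fixed)
{
	struct toy_vport v = { .gidft_inp = 0 };

	toy_issue_gidft(&v);
	toy_cmpl_gidft(&v, true, fixed);
	toy_cmpl_gidft(&v, false, fixed);
	return v.gidft_inp;	/* discovery restarts only when this is 0 */
}
```

In this model, toy_run(false) leaves one request counted as outstanding
even though nothing is in flight, so a "wait until zero" check would
never pass; toy_run(true) returns to zero.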

The following changes are made:

 - Correct the decrementing of the outstanding GID_FT counters.

 - In RSCN handling, no longer zero the counter before calling to issue
   another GID_FT.

 - No longer remove the reference on the ndlp when the ndlp->nrport value
   is not yet null.
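
The reference-drop rule in the last item can be read as a predicate. The
sketch below is a user-space illustration under stated assumptions:
toy_ndlp, its fields and toy_should_put_prev are hypothetical, with
'active' standing in for NLP_CHK_NODE_ACT() and 'nrport' for
ndlp->nrport.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced view of a node; not the driver's structure. */
struct toy_ndlp {
	bool active;	/* stands in for NLP_CHK_NODE_ACT(ndlp) */
	void *nrport;	/* stands in for ndlp->nrport */
};

/*
 * Return true when the reference held on the previous node may be
 * dropped: it must be a different node, and it must either be inactive
 * or no longer have an nrport bound to it.
 */
static bool toy_should_put_prev(const struct toy_ndlp *prev,
				const struct toy_ndlp *cur)
{
	if (!prev || prev == cur)
		return false;	/* nothing to drop, or same node */
	return !prev->active || prev->nrport == NULL;
}

/* Small self-check covering the swap case and the stale-node cases. */
static void toy_self_check(void)
{
	int dummy_rport;
	struct toy_ndlp cur = { .active = true, .nrport = &dummy_rport };
	struct toy_ndlp swapped = { .active = true, .nrport = &dummy_rport };
	struct toy_ndlp stale = { .active = false, .nrport = &dummy_rport };
	struct toy_ndlp unbound = { .active = true, .nrport = NULL };

	assert(!toy_should_put_prev(NULL, &cur));
	assert(!toy_should_put_prev(&cur, &cur));
	assert(!toy_should_put_prev(&swapped, &cur));	/* swap: keep the ref */
	assert(toy_should_put_prev(&stale, &cur));
	assert(toy_should_put_prev(&unbound, &cur));
}
```

The "swapped" case is the one that matters for this fix: the previous
node is still active and still bound, so its reference is kept.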

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
drivers/scsi/lpfc/lpfc_ct.c
drivers/scsi/lpfc/lpfc_els.c
drivers/scsi/lpfc/lpfc_nvme.c

index 0617c8ea88c6fd582a56e14bbbb62b8df76ed4e6..1e7889e451602f2a2651252042f25b4f2641de78 100644
@@ -691,6 +691,11 @@ lpfc_cmpl_ct_cmd_gid_ft(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb,
                vport->fc_flag &= ~FC_RSCN_DEFERRED;
                spin_unlock_irq(shost->host_lock);
 
+               /* This is a GID_FT completing so the gidft_inp counter was
+                * incremented before the GID_FT was issued to the wire.
+                */
+               vport->gidft_inp--;
+
                /*
                 * Skip processing the NS response
                 * Re-issue the NS cmd
index 74895e62aaeaab6ae30d4d0b4cf206911a9c2739..6d84a10fef0791b8d70750192ffd63c8e9442de4 100644
@@ -6268,7 +6268,6 @@ lpfc_els_handle_rscn(struct lpfc_vport *vport)
                 * flush the RSCN.  Otherwise, the outstanding requests
                 * need to complete.
                 */
-               vport->gidft_inp = 0;
                if (lpfc_issue_gidft(vport) > 0)
                        return 1;
        } else {
index 1cb2c634e9f7160d36cd8708ec41c516d18b19ff..22962b08c275121dc220f64ac777e35c9ef03859 100644
@@ -2721,8 +2721,16 @@ lpfc_nvme_register_port(struct lpfc_vport *vport, struct lpfc_nodelist *ndlp)
                        spin_unlock_irq(&vport->phba->hbalock);
                        rport->ndlp = NULL;
                        rport->remoteport = NULL;
-                       if (prev_ndlp)
-                               lpfc_nlp_put(ndlp);
+
+                       /* Reference only removed if previous NDLP is no longer
+                        * active. It might be just a swap and removing the
+                        * reference would cause a premature cleanup.
+                        */
+                       if (prev_ndlp && prev_ndlp != ndlp) {
+                               if ((!NLP_CHK_NODE_ACT(prev_ndlp)) ||
+                                   (!prev_ndlp->nrport))
+                                       lpfc_nlp_put(prev_ndlp);
+                       }
                }
 
                /* Clean bind the rport to the ndlp. */