Commit
0e9973ed3382 ("scsi: aacraid: Add periodic checks to see IOP reset
status") changed the way driver checks if a reset succeeded. Now, after an
IOP reset, aacraid immediately start polling a register to verify the reset
is complete.
This behavior cause regressions on the reset path in PowerPC (at least).
Since the delay after the IOP reset was removed by the aforementioned patch,
the fact driver just starts to read a register instantly after the reset
was issued (by writing in another register) "corrupts" the reset procedure,
which ends up failing all the time.
The issue highly impacted kdump on PowerPC, since on kdump path we
proactively issue a reset in adapter (through the reset_devices kernel
parameter).
This patch (re-)adds a delay right after IOP reset is issued. Empirically
we measured that 3 seconds is enough, but for safety reasons we delay
for 5s (and since it was 30s before, 5s is still a small amount).
For reference, without this patch we observe the following messages
on kdump kernel boot process:
[ 76.294] aacraid 0003:01:00.0: IOP reset failed
[ 76.294] aacraid 0003:01:00.0: ARC Reset attempt failed
[ 86.524] aacraid 0003:01:00.0: adapter kernel panic'd ff.
[ 86.524] aacraid 0003:01:00.0: Controller reset type is 3
[ 86.524] aacraid 0003:01:00.0: Issuing IOP reset
[146.534] aacraid 0003:01:00.0: IOP reset failed
[146.534] aacraid 0003:01:00.0: ARC Reset attempt failed
Fixes: 0e9973ed3382 ("scsi: aacraid: Add periodic checks to see IOP reset status")
Cc: stable@vger.kernel.org # v4.13+
Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Acked-by: Dave Carroll <david.carroll@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>