drm/amdgpu: Fix skipping hung job reset during GPU recovery
author Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Wed, 31 Oct 2018 14:23:05 +0000 (10:23 -0400)
committer Alex Deucher <alexander.deucher@amd.com>
Thu, 1 Nov 2018 14:51:33 +0000 (09:51 -0500)
Problem:
During GPU recovery, DAL would hang in
amdgpu_pm_compute_clocks->amdgpu_fence_wait_empty.

Fix:
A typo introduced by commit 3320b8d ("drm/amdgpu: remove job->ring")
inverted the ring check in the recovery loop, so
amdgpu_fence_driver_force_completion was skipped for the ring that owns
the hung job. The hung job was therefore never force-signaled, which
caused the later hang in DAL.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c

index d11489e8b38882058dd4802ce5d0ffcfb4b0881a..f06d068b8eb51e9728d0b8bce689b86abf79ae9d 100644
@@ -3341,7 +3341,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
 
                kthread_park(ring->sched.thread);
 
-               if (job && job->base.sched == &ring->sched)
+               if (job && job->base.sched != &ring->sched)
                        continue;
 
                drm_sched_hw_job_reset(&ring->sched, job ? &job->base : NULL);
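
The effect of the one-character change can be illustrated with a small standalone model of the loop. This is only a sketch, not the driver code: struct model_sched, struct model_job, struct model_ring, and model_recover() are hypothetical stand-ins, and the force_completed flag merely marks where the real loop would go on to reset the job and force-complete the ring's fences.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NUM_RINGS 3

/* Hypothetical stand-ins for the scheduler, job, and ring structures. */
struct model_sched { int id; };
struct model_job   { struct model_sched *sched; };
struct model_ring  {
	struct model_sched sched;
	bool force_completed; /* set where the real loop would force-complete fences */
};

/*
 * Walk all rings the way the recovery loop does. With a non-NULL job, only
 * the ring whose scheduler owns the job should be processed; with a NULL
 * job, every ring is processed. `buggy` selects the pre-fix comparison (==).
 * Returns the number of rings that were processed.
 */
static int model_recover(struct model_ring *rings, size_t n,
			 const struct model_job *job, bool buggy)
{
	int completed = 0;

	assert(rings != NULL);

	for (size_t i = 0; i < n; i++) {
		struct model_ring *ring = &rings[i];
		bool owns_job = job && job->sched == &ring->sched;
		/* Pre-fix: skip the owning ring. Post-fix: skip all other rings. */
		bool skip = buggy ? owns_job : (job && !owns_job);

		ring->force_completed = false;
		if (skip)
			continue;

		ring->force_completed = true;
		completed++;
	}
	return completed;
}
```

In this model, with a job queued on the second of three rings, the pre-fix comparison processes the two unrelated rings and leaves the owning ring untouched, while the post-fix comparison processes only the owning ring. Passing a NULL job processes every ring under either comparison.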