drm/amdgpu: no job timeout setting on compute queues
author Evan Quan <evan.quan@amd.com>
Thu, 15 Mar 2018 01:49:01 +0000 (09:49 +0800)
committer Alex Deucher <alexander.deucher@amd.com>
Wed, 21 Mar 2018 19:36:57 +0000 (14:36 -0500)
Under heavy compute workloads (e.g. the dgemm test), the
ASIC can take more than 10 seconds to finish a dispatched
job, which triggers the job timeout.

This is quite confusing, although it does not seem to cause
any real problems. As a quick workaround, do not enforce the
timeout setting on compute queues.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
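
The timeout selection described above can be sketched in isolation as
follows. This is a minimal userspace illustration, not the driver code:
`ring_type`, `msecs_to_jiffies_sketch()`, `ring_job_timeout()` and the
`HZ` value of 250 are hypothetical stand-ins, and only
`MAX_SCHEDULE_TIMEOUT` (defined as `LONG_MAX` in the kernel) mirrors a
real kernel definition.

```c
#include <limits.h>

/* Stand-ins for kernel definitions, assumed for illustration only. */
#define HZ 250                         /* assumed tick rate */
#define MAX_SCHEDULE_TIMEOUT LONG_MAX  /* same value the kernel uses */

/* Hypothetical, reduced set of ring types. */
enum ring_type { RING_TYPE_GFX, RING_TYPE_COMPUTE, RING_TYPE_SDMA };

/* Simplified millisecond-to-tick conversion, rounding up. */
static long msecs_to_jiffies_sketch(unsigned int ms)
{
	return ((long)ms * HZ + 999) / 1000;
}

/*
 * Compute rings get an effectively infinite timeout, so long-running
 * jobs are never reported as hung; every other ring type keeps the
 * configured lockup timeout.
 */
static long ring_job_timeout(enum ring_type type, unsigned int lockup_timeout_ms)
{
	if (type == RING_TYPE_COMPUTE)
		return MAX_SCHEDULE_TIMEOUT;
	return msecs_to_jiffies_sketch(lockup_timeout_ms);
}
```

With an example lockup timeout of 10000 ms, a compute ring would get
`MAX_SCHEDULE_TIMEOUT`, while a gfx ring would still get a finite
timeout in ticks.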
drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c

index 008e1984b7e39b9d51a9021c0df5762d889d6506..455a81e4c246a3488bd7c8462acbe3a9b9ac5eb8 100644
@@ -435,7 +435,9 @@ int amdgpu_fence_driver_init_ring(struct amdgpu_ring *ring,
        if (ring->funcs->type != AMDGPU_RING_TYPE_KIQ) {
                r = drm_sched_init(&ring->sched, &amdgpu_sched_ops,
                                   num_hw_submission, amdgpu_job_hang_limit,
-                                  msecs_to_jiffies(amdgpu_lockup_timeout), ring->name);
+                                  (ring->funcs->type == AMDGPU_RING_TYPE_COMPUTE) ?
+                                  MAX_SCHEDULE_TIMEOUT : msecs_to_jiffies(amdgpu_lockup_timeout),
+                                  ring->name);
                if (r) {
                        DRM_ERROR("Failed to create scheduler on ring %s.\n",
                                  ring->name);