Chunming Zhou [Thu, 27 Apr 2017 07:13:52 +0000 (15:13 +0800)]
drm/amdgpu: fix deadlock of reservation between cs and gpu reset v2
the case could happen when gpu reset:
1. when gpu reset, cs can be continue until sw queue is full, then push job will wait with holding pd reservation.
2. gpu_reset routine will also need pd reservation to restore page table from their shadow.
3. cs is waiting for gpu_reset complete, but gpu reset is waiting for cs releases reservation.
v2: handle amdgpu_cs_submit error path.
Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Reviewed-by: Monk Liu <monk.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Junwei Zhang [Thu, 27 Apr 2017 08:27:43 +0000 (16:27 +0800)]
drm/amdgpu: bump version for exporting gpu info for gfx9
Signed-off-by: Junwei Zhang <Jerry.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Junwei Zhang [Thu, 27 Apr 2017 03:12:07 +0000 (11:12 +0800)]
drm/amdgpu: export more gpu info for gfx9
v2: 64-bit aligned for gpu info
v3: squash in wave_front_fix
Signed-off-by: Ken Wang <Qingqing.Wang@amd.com>
Signed-off-by: Junwei Zhang <Jerry.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Qiang Yu <Qiang.Yu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Christian König [Thu, 27 Apr 2017 15:13:39 +0000 (17:13 +0200)]
drm/amdgpu: remove unused and mostly unimplemented CGS functions v2
Those functions are all unused and some not even implemented.
v2: keep cgs_get_pci_resource, it is used by the ACP driver.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Rex Zhu [Fri, 21 Apr 2017 10:52:12 +0000 (18:52 +0800)]
drm/amd/powerplay: refine set pcie dpm default table on vega10.
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Rex Zhu [Fri, 21 Apr 2017 10:33:05 +0000 (18:33 +0800)]
drm/amd/powerplay: disable cks by default on vega10.
run gpu test auto reboot when enable cks right now.
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Rex Zhu [Fri, 21 Apr 2017 09:26:07 +0000 (17:26 +0800)]
drm/amd/powerplay: correct UlvOffsetVid on Vega10.
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Xie [Tue, 25 Apr 2017 21:09:24 +0000 (17:09 -0400)]
drm/amdgpu: Fix use of interruptible waiting
There is no good mechanism to handle the corresponding error.
When signal interrupt happens, unpin is not called.
As a result, inside AMDGPU, the statistic of pin size will be wrong.
Signed-off-by: Alex Xie <AlexBin.Xie@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Xie [Wed, 26 Apr 2017 17:31:01 +0000 (13:31 -0400)]
drm/amdgpu: Fix use of interruptible waiting
Either in cgs functions or for callers of cgs functions:
1. The signal interrupt can affect the expected behaviour
2. There is no good mechanism to handle the corresponding error
3. There is no chance of deadlock in these single BO waiting
4. There is no clear benefit for interruptible waiting
5. Future caller of these functions might have same issue.
Signed-off-by: Alex Xie <AlexBin.Xie@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Chunming Zhou [Mon, 24 Apr 2017 09:39:00 +0000 (17:39 +0800)]
drm/amdgpu: fix NULL pointer error
[ 141.420491] BUG: unable to handle kernel NULL pointer dereference at
0000000000000030
[ 141.420532] IP: [<
ffffffff81579ee1>] fence_remove_callback+0x11/0x60
[ 141.420563] PGD
20a030067
[ 141.420575] PUD
2088ca067
[ 141.420587] PMD 0
[ 141.420599] Oops: 0000 [#1] SMP
[ 141.420612] Modules linked in: amdgpu(OE) ttm(OE) drm_kms_helper(E) drm(E) i2c_algo_bit(E) fb_sys_fops(E) syscopyarea(E) sysfillrect(E) sysimgblt(E) rpcsec_gss_krb5(E) nfsv4(E) nfs(E) fscache(E) eeepc_wmi(E) asus_wmi(E) sparse_keymap(E) snd_hda_codec_realtek(E) video(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) snd_hda_intel(E) joydev(E) snd_hda_codec(E) snd_seq_midi(E) snd_seq_midi_event(E) snd_hda_core(E) snd_hwdep(E) snd_rawmidi(E) snd_pcm(E) kvm(E) irqbypass(E) crct10dif_pclmul(E) snd_seq(E) crc32_pclmul(E) ghash_clmulni_intel(E) snd_seq_device(E) snd_timer(E) aesni_intel(E) aes_x86_64(E) lrw(E) gf128mul(E) glue_helper(E) ablk_helper(E) cryptd(E) snd(E) soundcore(E) serio_raw(E) shpchp(E) i2c_piix4(E) i2c_designware_platform(E) 8250_dw(E) i2c_designware_core(E) mac_hid(E) binfmt_misc(E)
[ 141.420948] nfsd(E) auth_rpcgss(E) nfs_acl(E) lockd(E) grace(E) sunrpc(E) parport_pc(E) ppdev(E) lp(E) parport(E) autofs4(E) hid_generic(E) usbhid(E) hid(E) psmouse(E) r8169(E) ahci(E) mii(E) libahci(E) wmi(E)
[ 141.421042] CPU: 14 PID: 223 Comm: kworker/14:2 Tainted: G OE 4.9.0-custom #4
[ 141.421074] Hardware name: System manufacturer System Product Name/PRIME B350-PLUS, BIOS 0606 04/06/2017
[ 141.421146] Workqueue: events amd_sched_job_timedout [amdgpu]
[ 141.421169] task:
ffff88020b03ba80 task.stack:
ffffc900016f4000
[ 141.421193] RIP: 0010:[<
ffffffff81579ee1>] [<
ffffffff81579ee1>] fence_remove_callback+0x11/0x60
[ 141.421229] RSP: 0018:
ffffc900016f7d30 EFLAGS:
00010202
[ 141.421250] RAX:
ffff8801c049fc00 RBX:
ffff8801d4d8dc00 RCX:
0000000000000000
[ 141.421278] RDX:
0000000000000001 RSI:
ffff8801c049fcc0 RDI:
0000000000000000
[ 141.421307] RBP:
ffffc900016f7d48 R08:
0000000000000000 R09:
0000000000000000
[ 141.421334] R10:
00000020ed512a30 R11:
0000000000000001 R12:
0000000000000000
[ 141.421362] R13:
ffff880209ba4ba0 R14:
ffff880209ba4c58 R15:
ffff8801c055cc60
[ 141.421390] FS:
0000000000000000(0000) GS:
ffff88021ef80000(0000) knlGS:
0000000000000000
[ 141.421421] CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
[ 141.421443] CR2:
0000000000000030 CR3:
000000020b554000 CR4:
00000000003406e0
[ 141.421471] Stack:
[ 141.421480]
ffff8801d4d8dc00 ffff880209ba4c48 ffff880209ba4ba0 ffffc900016f7d78
[ 141.421513]
ffffffffa0697920 ffff880209ba0000 0000000000000000 ffff880209ba2770
[ 141.421549]
ffff880209ba4b08 ffffc900016f7df0 ffffffffa05ce2ae ffffffffa0509eb7
[ 141.421583] Call Trace:
[ 141.421628] [<
ffffffffa0697920>] amd_sched_hw_job_reset+0x50/0xb0 [amdgpu]
[ 141.421676] [<
ffffffffa05ce2ae>] amdgpu_gpu_reset+0x8e/0x690 [amdgpu]
[ 141.421712] [<
ffffffffa0509eb7>] ? drm_printk+0x97/0xa0 [drm]
[ 141.421770] [<
ffffffffa0698156>] amdgpu_job_timedout+0x46/0x50 [amdgpu]
[ 141.421829] [<
ffffffffa0696a07>] amd_sched_job_timedout+0x17/0x20 [amdgpu]
[ 141.421859] [<
ffffffff81095493>] process_one_work+0x153/0x3f0
[ 141.421884] [<
ffffffff81095c5b>] worker_thread+0x12b/0x4b0
[ 141.421907] [<
ffffffff81095b30>] ? rescuer_thread+0x350/0x350
[ 141.421931] [<
ffffffff8109b423>] kthread+0xd3/0xf0
[ 141.421951] [<
ffffffff8109b350>] ? kthread_park+0x60/0x60
[ 141.421975] [<
ffffffff817e1ee5>] ret_from_fork+0x25/0x30
[ 141.421996] Code: ac 81 e8 a3 1f b0 ff 48 c7 c0 ea ff ff ff e9 48 ff ff ff 0f 1f 80 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 55 41 54 49 89 fc 53 <48> 8b 7f 30 48 89 f3 e8 73 7c 26 00 48 8b 13 48 39 d3 41 0f 95
[ 141.422156] RIP [<
ffffffff81579ee1>] fence_remove_callback+0x11/0x60
[ 141.422183] RSP <
ffffc900016f7d30>
[ 141.422197] CR2:
0000000000000030
[ 141.433483] ---[ end trace
bc0949bf7ddd6d4b ]---
if the job is reset twice, then the parent could be NULL.
Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Roger.He [Fri, 21 Apr 2017 05:08:43 +0000 (13:08 +0800)]
drm/amdgpu: validate shadow before restoring from it
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Roger.He <Hongbo.He@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Xie [Mon, 24 Apr 2017 19:33:16 +0000 (15:33 -0400)]
drm/amdgpu: Fix use of interruptible waiting
1. The signal interrupt can affect the expected behaviour.
2. There is no good mechanism to handle the corresponding error.
When signal interrupt happens, unpin is not called.
As a result, inside AMDGPU, the statistic of pin size will be wrong.
Signed-off-by: Alex Xie <AlexBin.Xie@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Xie [Mon, 24 Apr 2017 19:26:57 +0000 (15:26 -0400)]
drm/amdgpu: Real return value can be over-written when clean up
Signed-off-by: Alex Xie <AlexBin.Xie@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Xie [Mon, 24 Apr 2017 18:27:00 +0000 (14:27 -0400)]
drm/amdgpu: Fix use of interruptible waiting
1. The signal interrupt can affect the expected behaviour.
2. There is no good mechanism to handle the corresponding error.
Signed-off-by: Alex Xie <AlexBin.Xie@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Xie [Mon, 24 Apr 2017 17:53:04 +0000 (13:53 -0400)]
drm/amdgpu: Fix use of interruptible waiting
1. The signal interrupt can affect the expected behaviour.
2. There is no good mechanism to handle the corresponding error.
Signed-off-by: Alex Xie <AlexBin.Xie@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Xie [Mon, 24 Apr 2017 17:52:41 +0000 (13:52 -0400)]
drm/amdgpu: Fix use of interruptible waiting
1. The signal interrupt can affect the expected behaviour.
2. There is no mechanism to handle the corresponding error.
Signed-off-by: Alex Xie <AlexBin.Xie@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Xie [Mon, 24 Apr 2017 17:30:43 +0000 (13:30 -0400)]
drm/amdgpu: Fix use of interruptible waiting
If amdgpu_bo_reserve function is interrupted by signal,
amdgpu_bo_kunmap function is not called.
Signed-off-by: Alex Xie <AlexBin.Xie@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Mario Kleiner [Sun, 23 Apr 2017 23:33:09 +0000 (01:33 +0200)]
drm/radeon: Make display watermark calculations more accurate
Avoid big roundoff errors in scanline/hactive durations for
high pixel clocks, especially for >= 500 Mhz, and thereby
program more accurate display fifo watermarks.
This is a port of the corresponding amdgpu patch.
Implemented for DCE 4,6,8.
Tested on Evergreen/DCE-4 with Radeon HD-5770.
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Mario Kleiner [Sun, 23 Apr 2017 23:33:08 +0000 (01:33 +0200)]
drm/radeon: Avoid overflows/divide-by-zero in latency_watermark calculations.
At dot clocks > approx. 250 Mhz, some of these calcs will overflow and
cause miscalculation of latency watermarks, and for some overflows also
divide-by-zero driver crash. Make calcs more overflow resistant.
This is a direct port of the corresponding patch from amdgpu-kms,
copy-paste for cik from dce-8 and si from dce-6, with a slightly
simpler variant for evergreen dce-4/5.
Only tested on DCE-4 evergreen with a Radeon HD-5770.
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Mario Kleiner [Sun, 23 Apr 2017 23:02:46 +0000 (01:02 +0200)]
drm/amdgpu: Add missing lb_vblank_lead_lines setup to DCE-6 path.
This apparently got lost when implementing the new DCE-6 support
and would cause failures in pageflip scheduling and timestamping.
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Pan Bian [Mon, 24 Apr 2017 08:45:51 +0000 (16:45 +0800)]
drm/radeon: check return value of radeon_fence_emit
Function radeon_fence_emit() returns -ENOMEM if there is no enough
memory. And in this case, function radeon_ring_unlock_undo() rather than
function radeon_ring_unlock_commit() should be called. However, in
function radeon_test_create_and_emit_fence(), the return value of
radeon_fence_emit() is ignored. This patch adds the check.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Pan Bian <bianpan2016@163.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Pan Bian [Mon, 24 Apr 2017 08:38:05 +0000 (16:38 +0800)]
drm/radeon: check return value of radeon_ring_lock
Function radeon_ring_lock() returns an errno on failure, and its return
value should be validated. However, in functions r420_cp_errata_init()
and r420_cp_errata_fini(), its return value is not checked. This patch
adds the checks.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Pan Bian <bianpan2016@163.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Frank Min [Mon, 17 Apr 2017 03:19:45 +0000 (11:19 +0800)]
drm/amdgpu/soc15: enable UVD code path for sriov
Enable UVD block for SRIOV.
Signed-off-by: Frank Min <Frank.Min@amd.com>
Signed-off-by: Xiangliang Yu <Xiangliang.Yu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Frank Min [Mon, 17 Apr 2017 03:51:44 +0000 (11:51 +0800)]
drm/amdgpu/uvd7: add UVD hw init sequences for sriov
Add UVD hw init.
Signed-off-by: Frank Min <Frank.Min@amd.com>
Signed-off-by: Xiangliang Yu <Xiangliang.Yu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Frank Min [Mon, 17 Apr 2017 03:45:35 +0000 (11:45 +0800)]
drm/amdgpu/uvd7: add uvd doorbell initialization for sriov
Add UVD doorbell for SRIOV.
Signed-off-by: Frank Min <Frank.Min@amd.com>
Signed-off-by: Xiangliang Yu <Xiangliang.Yu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Frank Min [Mon, 17 Apr 2017 03:28:12 +0000 (11:28 +0800)]
drm/amdgpu/uvd7: add sriov uvd initialization sequences
Add UVD initialization for SRIOV.
Signed-off-by: Frank Min <Frank.Min@amd.com>
Signed-off-by: Xiangliang Yu <Xiangliang.Yu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Xiangliang Yu [Fri, 21 Apr 2017 08:21:41 +0000 (16:21 +0800)]
drm/amdgpu/vce4: replaced with virt_alloc_mm_table
Used virt_alloc_mm_table function to allocate MM table memory.
Signed-off-by: Xiangliang Yu <Xiangliang.Yu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Xiangliang Yu [Fri, 21 Apr 2017 07:40:25 +0000 (15:40 +0800)]
drm/amdgpu/virt: add two functions for MM table
Add two functions to allocate & free MM table memory.
Signed-off-by: Xiangliang Yu <Xiangliang.Yu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Frank Min [Sun, 16 Apr 2017 05:37:07 +0000 (13:37 +0800)]
drm/amdgpu/vce4: move mm table constructions functions into mmsch header file
Move mm table construction functions into mmsch header file so that
UVD can reuse it.
Signed-off-by: Frank Min <Frank.Min@amd.com>
Signed-off-by: Xiangliang Yu <Xiangliang.Yu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Daniel Wang [Wed, 19 Apr 2017 08:09:08 +0000 (16:09 +0800)]
drm/amdgpu/vce4: fix a PSP loading VCE issue
Fixed PSP loading issue for sriov.
Signed-off-by: Daniel Wang <Daniel.Wang2@amd.com>
Signed-off-by: Xiangliang Yu <Xiangliang.Yu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Daniel Wang [Thu, 20 Apr 2017 03:45:09 +0000 (11:45 +0800)]
drm/amdgpu/psp: skip loading SDMA/RLCG under SRIOV VF
Now GPU hypervisor will load SDMA and RLCG ucode, so skip it
in guest.
Signed-off-by: Daniel Wang <Daniel.Wang2@amd.com>
Signed-off-by: Xiangliang Yu <Xiangliang.Yu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Chunming Zhou [Mon, 24 Apr 2017 09:09:15 +0000 (17:09 +0800)]
drm/amdgpu: fix gpu reset crash
[ 413.687439] BUG: unable to handle kernel NULL pointer dereference at
0000000000000548
[ 413.687479] IP: [<
ffffffff8109b175>] to_live_kthread+0x5/0x60
[ 413.687507] PGD
1efd12067
[ 413.687519] PUD
1efd11067
[ 413.687531] PMD 0
[ 413.687543] Oops: 0000 [#1] SMP
[ 413.687557] Modules linked in: amdgpu(OE) ttm(OE) drm_kms_helper(E) drm(E) i2c_algo_bit(E) fb_sys_fops(E) syscopyarea(E) sysfillrect(E) sysimgblt(E) rpcsec_gss_krb5(E) nfsv4(E) nfs(E) fscache(E) snd_hda_codec_realtek(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) snd_hda_intel(E) eeepc_wmi(E) snd_hda_codec(E) asus_wmi(E) snd_hda_core(E) sparse_keymap(E) snd_hwdep(E) video(E) snd_pcm(E) snd_seq_midi(E) joydev(E) snd_seq_midi_event(E) snd_rawmidi(E) snd_seq(E) snd_seq_device(E) snd_timer(E) kvm(E) irqbypass(E) crct10dif_pclmul(E) snd(E) crc32_pclmul(E) ghash_clmulni_intel(E) soundcore(E) aesni_intel(E) aes_x86_64(E) lrw(E) gf128mul(E) glue_helper(E) ablk_helper(E) cryptd(E) shpchp(E) serio_raw(E) i2c_piix4(E) 8250_dw(E) i2c_designware_platform(E) i2c_designware_core(E) mac_hid(E) binfmt_misc(E)
[ 413.687894] parport_pc(E) ppdev(E) lp(E) parport(E) nfsd(E) auth_rpcgss(E) nfs_acl(E) lockd(E) grace(E) sunrpc(E) autofs4(E) hid_generic(E) usbhid(E) hid(E) psmouse(E) ahci(E) r8169(E) mii(E) libahci(E) wmi(E)
[ 413.687989] CPU: 13 PID: 1134 Comm: kworker/13:2 Tainted: G OE 4.9.0-custom #4
[ 413.688019] Hardware name: System manufacturer System Product Name/PRIME B350-PLUS, BIOS 0606 04/06/2017
[ 413.688089] Workqueue: events amd_sched_job_timedout [amdgpu]
[ 413.688116] task:
ffff88020f9657c0 task.stack:
ffffc90001a88000
[ 413.688139] RIP: 0010:[<
ffffffff8109b175>] [<
ffffffff8109b175>] to_live_kthread+0x5/0x60
[ 413.688171] RSP: 0018:
ffffc90001a8bd60 EFLAGS:
00010282
[ 413.688191] RAX:
ffff88020f0073f8 RBX:
ffff88020f000000 RCX:
0000000000000000
[ 413.688217] RDX:
0000000000000001 RSI:
ffff88020f9670c0 RDI:
0000000000000000
[ 413.688243] RBP:
ffffc90001a8bd78 R08:
0000000000000000 R09:
0000000000001000
[ 413.688269] R10:
0000006051b11a82 R11:
0000000000000001 R12:
0000000000000000
[ 413.688295] R13:
ffff88020f002770 R14:
ffff88020f004838 R15:
ffff8801b23c2c60
[ 413.688321] FS:
0000000000000000(0000) GS:
ffff88021ef40000(0000) knlGS:
0000000000000000
[ 413.688352] CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
[ 413.688373] CR2:
0000000000000548 CR3:
00000001efd0f000 CR4:
00000000003406e0
[ 413.688399] Stack:
[ 413.688407]
ffffffff8109b304 ffff88020f000000 0000000000000070 ffffc90001a8bdf0
[ 413.688439]
ffffffffa05ce29d ffffffffa052feb7 ffffffffa07b5820 ffffc90001a8bda0
[ 413.688470]
ffffffff00000018 ffff8801bb88f060 0000000001a8bdb8 ffff88021ef59280
[ 413.688502] Call Trace:
[ 413.688514] [<
ffffffff8109b304>] ? kthread_park+0x14/0x60
[ 413.688555] [<
ffffffffa05ce29d>] amdgpu_gpu_reset+0x7d/0x670 [amdgpu]
[ 413.688589] [<
ffffffffa052feb7>] ? drm_printk+0x97/0xa0 [drm]
[ 413.688643] [<
ffffffffa0698136>] amdgpu_job_timedout+0x46/0x50 [amdgpu]
[ 413.688700] [<
ffffffffa06969e7>] amd_sched_job_timedout+0x17/0x20 [amdgpu]
[ 413.688727] [<
ffffffff81095493>] process_one_work+0x153/0x3f0
[ 413.688751] [<
ffffffff81095c5b>] worker_thread+0x12b/0x4b0
[ 413.688773] [<
ffffffff8100392e>] ? do_syscall_64+0x6e/0x180
[ 413.688795] [<
ffffffff81095b30>] ? rescuer_thread+0x350/0x350
[ 413.688818] [<
ffffffff8100392e>] ? do_syscall_64+0x6e/0x180
[ 413.688839] [<
ffffffff8109b423>] kthread+0xd3/0xf0
[ 413.688858] [<
ffffffff8109b350>] ? kthread_park+0x60/0x60
[ 413.688881] [<
ffffffff817e1ee5>] ret_from_fork+0x25/0x30
[ 413.688901] Code: 25 40 d3 00 00 48 8b 80 48 05 00 00 48 89 e5 5d 48 8b 40 c8 48 c1 e8 02 83 e0 01 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 <48> 8b b7 48 05 00 00 55 48 89 e5 48 85 f6 74 31 8b 97 f8 18 00
[ 413.689045] RIP [<
ffffffff8109b175>] to_live_kthread+0x5/0x60
[ 413.689064] RSP <
ffffc90001a8bd60>
[ 413.689076] CR2:
0000000000000548
[ 413.697985] ---[ end trace
0a314a64821f84e9 ]---
The root cause is some ring doesn't have scheduler, like KIQ ring
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Chunming Zhou [Fri, 21 Apr 2017 09:58:42 +0000 (17:58 +0800)]
drm/amdgpu: fix no-vmid job
[ 132.036658] amdgpu 0000:22:00.0: VM IB without ID
[ 132.036709] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling IBs (-22)
[ 132.036755] [drm:amd_sched_main [amdgpu]] *ERROR* Failed to run job!
root cause is fence is signaled during sync transfer.
Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Roger.He [Fri, 21 Apr 2017 06:24:26 +0000 (14:24 +0800)]
drm/amdgpu: fix indent
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Roger.He <Hongbo.He@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Chunming Zhou [Fri, 21 Apr 2017 08:40:00 +0000 (16:40 +0800)]
drm/amdgpu: increase gtt size to 3GB by default v2
v2: address Alex's comment, add AMDGPU_DEFAULT_GTT_SIZE_MB.
Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Christian König [Fri, 21 Apr 2017 08:05:56 +0000 (10:05 +0200)]
drm/amdgpu: fix VM clearing in amdgpu_gem_object_close
We need to check if the VM is swapped out before trying to update it.
Fixes: 23e0563e48f7 ("drm/amdgpu: clear freed mappings immediately when BO may be freed")
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Chunming Zhou [Thu, 13 Apr 2017 08:16:51 +0000 (16:16 +0800)]
drm/amdgpu: add gtt print like vram when dump mm table V2
Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Christian König [Thu, 20 Apr 2017 10:11:47 +0000 (12:11 +0200)]
drm/amdgpu: fix amdgpu_ttm_bo_eviction_valuable
BOs not mapped into the GART are always valuable for an eviction. Otherwise we
don't correctly swap them out on VRAM evictions during memory pressure.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Rex Zhu [Thu, 20 Apr 2017 08:33:23 +0000 (16:33 +0800)]
drm/amd/powerplay: Fix AVFS param.
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Rex Zhu [Wed, 19 Apr 2017 08:00:21 +0000 (16:00 +0800)]
drm/amd/powerplay: enable clock stretch feature on Vega10.
Correctly calculate CKSVidOffset
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Rex Zhu [Thu, 20 Apr 2017 08:38:36 +0000 (16:38 +0800)]
drm/amd/powerplay: enable pcie dpm on Vega10.
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Rex Zhu [Wed, 12 Apr 2017 09:52:07 +0000 (17:52 +0800)]
drm/amd/powerplay: allocate fb for avfs fuse table on vega10.
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Rex Zhu [Wed, 12 Apr 2017 09:32:35 +0000 (17:32 +0800)]
drm/amd/powerplay: enable AGM logging while dpm disabled.
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Huang Rui [Thu, 20 Apr 2017 07:25:39 +0000 (15:25 +0800)]
drm/amd/powerplay: add error message to remind user updating firmware
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Mon, 24 Apr 2017 17:51:52 +0000 (13:51 -0400)]
Revert "drm/amd/amdgpu: Set VCE/UVD off during late init"
This leads to hangs on init.
This reverts commit
d1aff8ec49c3ece05cee9b6e63d44e96a420b068.
Zhang, Jerry [Wed, 19 Apr 2017 01:53:29 +0000 (09:53 +0800)]
drm/amdgpu: PRT support for gfx9 (v3)
Fix PRT handling on gfx9
v2: unify PRT bit for all ASICs
v3: move PRT flag checking in amdgpu_vm_bo_split_mapping()
Signed-off-by: Junwei Zhang <Jerry.Zhang@amd.com>
Acked-by: David Zhou <david1.zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Christian König [Wed, 19 Apr 2017 12:41:19 +0000 (14:41 +0200)]
drm/amdgpu: fix amdgpu_vm_clear_freed v2
Use amdgpu_vm_bo_update_mapping() instead of amdgpu_vm_bo_split_mapping() here.
We don't want any flags set in the cleared areas and splitting
shouldn't be necessary.
v2: fix typo in commit message
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Trigger Huang [Mon, 17 Apr 2017 12:50:18 +0000 (08:50 -0400)]
drm/amdgpu: Destroy psp ring in hw_fini
Fix issue that PSP initialization will fail if reload amdgpu module.
That's because the PSP ring must be destroyed to be ready for the
next time PSP initialization.
Changes in v2:
- Move psp_ring_destroy before all BOs free (suggested by
Ray Huang).
Changes in v3:
- Check firmware load type, if it is not PSP, we should do
nothing in fw_fini(), and of course will not destroy
PSP ring too (suggested by Ray Huang).
Signed-off-by: Trigger Huang <trigger.huang@amd.com>
Reviewed-by: Xiangliang Yu <Xiangliang.Yu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Evan Quan [Wed, 12 Apr 2017 09:34:26 +0000 (17:34 +0800)]
drm/amdgpu: update smu9 driver interface
Updated interface between the driver and the SMU controller.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Tom St Denis [Wed, 19 Apr 2017 15:03:04 +0000 (11:03 -0400)]
drm/amd/amdgpu: Print out ring name in dev_info
So it's more obvious which rings are using which INV engines.
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Tom St Denis [Wed, 19 Apr 2017 13:02:41 +0000 (09:02 -0400)]
drm/amd/amdgpu: Change comp GFXv9 ring name to remove space
umr expects the ring name to be a complete word. This also
makes it consistent with GFXv7/8.
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Tom St Denis [Wed, 19 Apr 2017 13:01:42 +0000 (09:01 -0400)]
drm/amd/amdgpu: Change comp GFXv6 ring name to remove space
umr expects the ring name to be a complete word. This also
makes it consistent with GFXv7/8.
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Trigger Huang [Mon, 17 Apr 2017 14:56:02 +0000 (10:56 -0400)]
drm/amdgpu: Fix module unload hang by KIQ on Vega10
Apply commit
4e683cb2644f ("drm/amdgpu: Fix module unload hang by
KIQ IRQ set")to vega10
V2:
delete reduant kiq irq funcs type check (suggested by Rex.Zhu)
Signed-off-by: Trigger Huang <trigger.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Rex Zhu [Tue, 18 Apr 2017 11:21:44 +0000 (19:21 +0800)]
drm/amdgpu: fix memory clock can't switch on CI.
if we set only lowest mclk level enabled,
when we enable uvd dpm during boot time,
mclk will be fixed in the lowest level.
the mclk switch will fail if try to enable
other level of mclk at this time.
so set all mclk levels enabled.
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Xiangliang Yu [Fri, 14 Apr 2017 09:43:02 +0000 (17:43 +0800)]
drm/amdgpu/gfx9: bypass clockgating setting
For SRIOV doesn't need clockgating, bypass it.
Signed-off-by: Xiangliang Yu <Xiangliang.Yu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Xiangliang Yu [Fri, 14 Apr 2017 09:40:57 +0000 (17:40 +0800)]
drm/amdgpu/mmhub_v1: bypass clockgating setting
For SRIOV doesn't need CG, so bypass it.
Signed-off-by: Xiangliang Yu <Xiangliang.Yu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Christian König [Mon, 6 Mar 2017 12:34:57 +0000 (13:34 +0100)]
drm/amdgpu: fix coding style and printing in amdgpu_doorbell_init
Based on commit "drm/radeon: remove useless and potentially wrong message".
The size of the info printing is incorrect and the PCI subsystems prints
the same info on boot anyway.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Pixel Ding [Thu, 23 Feb 2017 03:10:33 +0000 (11:10 +0800)]
drm/amdgpu/virt: don't check VALID bit for FLR completion message
The interrupt after FLR is missed sometimes due to hardware reason, so
guest driver get the notification of FLR completion via polling
message. Then host doesn't write VALID bit to avoid sending interrupt,
otherwise the completion will be handled twice.
So there's a valid message without VALID bit for FLR completion,
driver should handle it without checking.
Signed-off-by: Pixel Ding <Pixel.Ding@amd.com>
Reviewed-by: Xiangliang Yu <Xiangliang.Yu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Junwei Zhang [Thu, 23 Feb 2017 03:01:40 +0000 (11:01 +0800)]
drm/amdgpu: fix double_offchip_lds_buf for gfx v6
Was incorrect for SI.
Signed-off-by: Junwei Zhang <Jerry.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Rex Zhu [Mon, 17 Apr 2017 12:46:29 +0000 (20:46 +0800)]
drm/amd/powerplay: delete dead functions in vega10.
Vega10 does not support AVFS BTC, remove function.
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Rex Zhu [Mon, 17 Apr 2017 11:44:23 +0000 (19:44 +0800)]
drm/amd/amdgpu: coding style refine in sdma_v4_0.c
Replace 8 spaces with tabs.
correct {} braces, etc.
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Rex Zhu [Mon, 17 Apr 2017 10:46:57 +0000 (18:46 +0800)]
drm/amdgpu: Remove redundant itermediate return val in sdma_v4_0.c
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Huang Rui [Mon, 17 Apr 2017 07:32:52 +0000 (15:32 +0800)]
drm/ttm: cleanup unuse ret value
The ret must be 0 here, otherwise, the function will return after init_mem_type.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Huang Rui [Fri, 14 Apr 2017 02:50:03 +0000 (10:50 +0800)]
drm/amdgpu: fix to print incorrect wptr address
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Huang Rui [Thu, 13 Apr 2017 08:12:26 +0000 (16:12 +0800)]
drm/amdgpu: fix dead lock if any ip block resume failed in s3
Driver must free the console lock whether driver resuming successful
or not. Otherwise, fb_console will be always waiting for the lock and
then cause system stuck.
[ 244.405541] INFO: task kworker/0:0:4 blocked for more than 120 seconds.
[ 244.405543] Tainted: G OE 4.9.0-custom #1
[ 244.405544] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 244.405541] INFO: task kworker/0:0:4 blocked for more than 120 seconds.
[ 244.405543] Tainted: G OE 4.9.0-custom #1
[ 244.405544] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 244.405550] kworker/0:0 D 0 4 2 0x00080000
[ 244.405559] Workqueue: events console_callback
[ 244.405564]
ffff88045a2cfc00 0000000000000000 ffff880462b75940 ffffffff81c0e500
[ 244.405568]
ffff880476419280 ffffc900018f7c90 ffffffff817dcf62 000000000000003c
[ 244.405572]
0000000100000000 0000000000000002 ffff880462b75940 ffff880462b75940
[ 244.405573] Call Trace:
[ 244.405580] [<
ffffffff817dcf62>] ? __schedule+0x222/0x6a0
[ 244.405584] [<
ffffffff817dd416>] schedule+0x36/0x80
[ 244.405588] [<
ffffffff817e041c>] schedule_timeout+0x1fc/0x390
[ 244.405592] [<
ffffffff817df1b4>] __down_common+0xa5/0xf8
[ 244.405598] [<
ffffffff810b2ca8>] ? put_prev_entity+0x48/0x710
[ 244.405601] [<
ffffffff817df224>] __down+0x1d/0x1f
[ 244.405606] [<
ffffffff810c71a1>] down+0x41/0x50
[ 244.405611] [<
ffffffff810d380a>] console_lock+0x1a/0x40
[ 244.405614] [<
ffffffff814e3c03>] console_callback+0x13/0x160
[ 244.405617] [<
ffffffff817dcf6a>] ? __schedule+0x22a/0x6a0
[ 244.405623] [<
ffffffff810954e3>] process_one_work+0x153/0x3f0
[ 244.405628] [<
ffffffff81095cab>] worker_thread+0x12b/0x4b0
[ 244.405633] [<
ffffffff81095b80>] ? rescuer_thread+0x350/0x350
[ 244.405637] [<
ffffffff8109b473>] kthread+0xd3/0xf0
[ 244.405641] [<
ffffffff8109b3a0>] ? kthread_park+0x60/0x60
[ 244.405645] [<
ffffffff8109b3a0>] ? kthread_park+0x60/0x60
[ 244.405649] [<
ffffffff817e1ee5>] ret_from_fork+0x25/0x30
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Christian König [Tue, 11 Apr 2017 17:20:20 +0000 (19:20 +0200)]
drm/radeon: force the UVD DPB into VRAM as well
Seems to be mandatory for WMV playback.
Bugs: https://bugs.freedesktop.org/show_bug.cgi?id=100510
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Mon, 10 Apr 2017 19:36:32 +0000 (15:36 -0400)]
drm/amdgpu: bump version number to note race fix and new fence functionality
fixed in: "drm/amdgpu:fix race condition"
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Mon, 10 Apr 2017 19:32:43 +0000 (15:32 -0400)]
drm/amdgpu: fix spelling in header comment
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Christian König [Fri, 7 Apr 2017 15:43:19 +0000 (17:43 +0200)]
drm/amdgpu: trace vm hub during flush as well v2
Trace on which hub we are doing the flush.
v2: fix typo in commit message
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Andres Rodriguez <andresx7@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Christian König [Fri, 7 Apr 2017 13:31:13 +0000 (15:31 +0200)]
drm/amdgpu: trace the vmhub in grab_id as well
Trace on which VMHUB we assigned an VMID.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Christian König [Thu, 30 Mar 2017 14:56:20 +0000 (16:56 +0200)]
drm/amdgpu: allow concurrent VM flushes
Enable concurrent VM flushes for Vega10.
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Andres Rodriguez <andresx7@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Christian König [Fri, 31 Mar 2017 09:03:50 +0000 (11:03 +0200)]
drm/amdgpu: assign VM invalidation engine manually v2
For Vega10 we have 18 VM invalidation engines for each VMHUB.
Start to assign them manually to the rings.
v2: add a BUG_ON if we use to many engines
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Andres Rodriguez <andresx7@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Christian König [Thu, 30 Mar 2017 14:50:47 +0000 (16:50 +0200)]
drm/amdgpu: invalidate only the currently needed VMHUB v2
Drop invalidating both hubs from each engine.
v2: don't use hardcoded values
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Andres Rodriguez <andresx7@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Christian König [Thu, 6 Apr 2017 15:52:39 +0000 (17:52 +0200)]
drm/amdgpu: split VMID management by VMHUB
This way GFX and MM won't fight for VMIDs any more.
Initially disabled since we need to stop flushing all HUBS
at the same time as well.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Andres Rodriguez <andresx7@gmail.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Christian König [Thu, 6 Apr 2017 13:18:21 +0000 (15:18 +0200)]
drm/amdgpu: drop VMID per ring tracking
David suggested this a long time ago, instead of checking
each ring just walk over all the VMIDs in reverse LRU order.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Christian König [Thu, 30 Mar 2017 12:49:50 +0000 (14:49 +0200)]
drm/amdgpu: add VMHUB to ring association
Add the info which ring belonging to which VMHUB.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Andres Rodriguez <andresx7@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Frank Min [Thu, 6 Apr 2017 06:46:50 +0000 (14:46 +0800)]
drm/amdgpu/vce4: enable ring & ib test for sriov
Now VCE block can work for SRIOV, enable ring & ib test.
Signed-off-by: Frank Min <Frank.Min@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Xiangliang Yu [Thu, 6 Apr 2017 06:43:48 +0000 (14:43 +0800)]
drm/amdgpu/vce4: workaround VCE ring test slow issue
Add VCE ring test slow workaround for SRIOV.
Signed-off-by: Frank Min <Frank.Min@amd.com>
Signed-off-by: Xiangliang Yu <Xiangliang.Yu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Frank Min [Fri, 7 Apr 2017 02:38:52 +0000 (10:38 +0800)]
drm/amdgpu/vce4: update VCE initialization sequence for SRIOV
Update the initialization sequence of VCE to make VCE work.
Signed-off-by: Frank Min <Frank.Min@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Trigger Huang [Tue, 11 Apr 2017 05:56:36 +0000 (01:56 -0400)]
drm/amdgpu: Fix firmware UCODE_ID_STORAGE issue (v2)
In Tonga's virtualization environment, for firmware UCODE_ID_STORAGE,
there is no actual firmware data, but we still need alloc a BO and
tell the BO's mc address to HW, or world switch will hang on VFs.
v2: fix coding style (Alex)
Signed-off-by: Trigger Huang <trigger.huang@amd.com>
Reviewed-by: Xiangliang Yu <Xiangliang.Yu@amd.com>
Reviewed-by: Monk Liu <monk.liu@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Huang Rui [Tue, 11 Apr 2017 01:24:56 +0000 (09:24 +0800)]
drm/amdgpu: fix to add buffer funcs check
This patch fixes the case when buffer funcs is empty and bo evict is
executing. It must double check buffer funcs, otherwise, a NULL
pointer dereference kernel panic will be encountered.
BUG: unable to handle kernel NULL pointer dereference at
00000000000001a4
IP: [<
ffffffffa067b6cd>] amdgpu_evict_flags+0x3d/0xf0 [amdgpu]
PGD 0
Oops: 0000 [#1] SMP
Modules linked in: amdgpu(OE) ttm drm_kms_helper drm i2c_algo_bit fb_sys_fops syscopyarea sysfillrect sysimgblt fmem(OE) physmem_drv(OE) rpcsec_gss_krb5 nfsv4 nfs fscache intel_rapl x86_pkg_temp_thermal intel_powerclamp snd_hda_codec_realtek snd_hda_codec_hdmi snd_hda_codec_generic kvm_intel snd_hda_intel snd_hda_codec kvm snd_hda_core joydev eeepc_wmi asus_wmi sparse_keymap snd_hwdep snd_pcm irqbypass crct10dif_pclmul snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq crc32_pclmul snd_seq_device ghash_clmulni_intel aesni_intel aes_x86_64 snd_timer lrw gf128mul mei_me snd glue_helper ablk_helper cryptd tpm_infineon mei lpc_ich serio_raw soundcore shpchp mac_hid nfsd auth_rpcgss nfs_acl lockd grace coretemp sunrpc parport_pc ppdev lp parport autofs4 hid_generic mxm_wmi r8169 usbhid ahci
psmouse libahci nvme mii hid nvme_core wmi video
CPU: 3 PID: 1627 Comm: kworker/u8:17 Tainted: G OE 4.9.0-custom #1
Hardware name: ASUS All Series/Z87-A, BIOS 1802 01/28/2014
Workqueue: events_unbound async_run_entry_fn
task:
ffff88021e7057c0 task.stack:
ffffc9000262c000
RIP: 0010:[<
ffffffffa067b6cd>] [<
ffffffffa067b6cd>] amdgpu_evict_flags+0x3d/0xf0 [amdgpu]
RSP: 0018:
ffffc9000262fb30 EFLAGS:
00010246
RAX:
0000000000000000 RBX:
ffff88021e8a5858 RCX:
0000000000000000
RDX:
0000000000000001 RSI:
ffffc9000262fb58 RDI:
ffff88021e8a5800
RBP:
ffffc9000262fb48 R08:
0000000000000000 R09:
ffff88021e8a5814
R10:
000000001def8f01 R11:
ffff88021def8c80 R12:
ffffc9000262fb58
R13:
ffff88021d2b1990 R14:
0000000000000000 R15:
ffff88021e8a5858
FS:
0000000000000000(0000) GS:
ffff88022ed80000(0000) knlGS:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
CR2:
00000000000001a4 CR3:
0000000001c07000 CR4:
00000000001406e0
DR0:
0000000000000000 DR1:
0000000000000000 DR2:
0000000000000000
DR3:
0000000000000000 DR6:
00000000fffe0ff0 DR7:
0000000000000400
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Ken Wang <Qingqing.Wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Huang Rui [Mon, 10 Apr 2017 01:29:13 +0000 (09:29 +0800)]
drm/amdgpu: fix to clear ASIC INIT COMPLETE bit on resuming phase
ASIC_INIT_COMPLETE bit must be cleared during S3 resuming phase,
because VBIOS will check the bit to decide if execute ASIC_Init
posting via kernel driver.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Huang Rui [Mon, 10 Apr 2017 07:29:42 +0000 (15:29 +0800)]
drm/amdgpu: do not free fence buf when driver probes.
Fence buf needs to be used on suspend/resume phase.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Huang Rui [Mon, 10 Apr 2017 06:40:51 +0000 (14:40 +0800)]
drm/amd/powerplay: fix suspend error on DPM disabled
Don't fail if DPM is disabled.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Monk Liu [Fri, 7 Apr 2017 10:39:07 +0000 (18:39 +0800)]
drm/amdgpu:fix race condition
sequence is protected by spinlock so don't access sequence
in paramter seq when invoking this function.
~0 means to get the latest sequence number and 0 means none to
get.
Change-Id: Ib7a03f3cf5594deeb4ad333cc59b47a6bddfd1ad
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Tom St Denis [Fri, 7 Apr 2017 11:53:53 +0000 (07:53 -0400)]
drm/amd/amdgpu: Port gfx9 driver over to new read/write macros
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Tom St Denis [Fri, 7 Apr 2017 11:53:31 +0000 (07:53 -0400)]
drm/amd/amdgpu: Introduce new read/write macros for SOC15
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Huang Rui [Thu, 23 Mar 2017 03:20:25 +0000 (11:20 +0800)]
drm/amdgpu: add hw_start and non-psp firmware loading into resume
Rework in order to properly support suspend.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Huang Rui [Tue, 21 Mar 2017 10:36:57 +0000 (18:36 +0800)]
drm/amdgpu: split psp ring init function
Rework in order to properly support suspend.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Huang Rui [Tue, 21 Mar 2017 10:02:04 +0000 (18:02 +0800)]
drm/amdgpu: split psp asd function
Rework in order to properly support suspend.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Huang Rui [Wed, 22 Mar 2017 02:16:05 +0000 (10:16 +0800)]
drm/amdgpu: use private memory to store psp firmware data
Rework in order to properly support suspend.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Huang Rui [Tue, 21 Mar 2017 08:51:00 +0000 (16:51 +0800)]
drm/amdgpu: add psp firmware private memory
Needed for proper suspend support.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Huang Rui [Tue, 21 Mar 2017 08:18:11 +0000 (16:18 +0800)]
drm/amdgpu: split psp tmr init function
Rework in order to properly support suspend.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Rex Zhu [Fri, 7 Apr 2017 06:56:58 +0000 (14:56 +0800)]
drm/amd/powerplay: align with VBIOS to support new AVFS structure
Align the driver with the latest vbios structures.
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Dave Airlie [Fri, 28 Apr 2017 19:50:27 +0000 (05:50 +1000)]
Merge tag 'drm-intel-next-fixes-2017-04-27' of git://anongit.freedesktop.org/git/drm-intel into drm-next
drm/i915 and gvt fixes for drm-next/v4.12
* tag 'drm-intel-next-fixes-2017-04-27' of git://anongit.freedesktop.org/git/drm-intel:
drm/i915: Confirm the request is still active before adding it to the await
drm/i915: Avoid busy-spinning on VLV_GLTC_PW_STATUS mmio
drm/i915/selftests: Allocate inode/file dynamically
drm/i915: Fix system hang with EI UP masked on Haswell
drm/i915: checking for NULL instead of IS_ERR() in mock selftests
drm/i915: Perform link quality check unconditionally during long pulse
drm/i915: Fix use after free in lpe_audio_platdev_destroy()
drm/i915: Use the right mapping_gfp_mask for final shmem allocation
drm/i915: Make legacy cursor updates more unsynced
drm/i915: Apply a cond_resched() to the saturated signaler
drm/i915: Park the signaler before sleeping
drm/i915/gvt: fix a bounds check in ring_id_to_context_switch_event()
drm/i915/gvt: Fix PTE write flush for taking runtime pm properly
drm/i915/gvt: remove some debug messages in scheduler timer handler
drm/i915/gvt: add mmio init for virtual display
drm/i915/gvt: use directly assignment for structure copying
drm/i915/gvt: remove redundant ring id check which cause significant CPU misprediction
drm/i915/gvt: remove redundant platform check for mocs load/restore
drm/i915/gvt: Align render mmio list to cacheline
drm/i915/gvt: cleanup some too chatty scheduler message
Dave Airlie [Fri, 28 Apr 2017 19:49:54 +0000 (05:49 +1000)]
Merge branch 'drm-vmwgfx-next' of git://people.freedesktop.org/~syeh/repos_linux into drm-next
trivial patch.
* 'drm-vmwgfx-next' of git://people.freedesktop.org/~syeh/repos_linux:
drm/vmwgfx: Convert macro to octal representation
Dave Airlie [Fri, 28 Apr 2017 19:48:45 +0000 (05:48 +1000)]
Merge branch 'for-upstream/mali-dp' of git://linux-arm.org/linux-ld into drm-next
Latest updates on Mali DP, adding support for colour management,
plane scaling and power management.
(these have been in -next for a while).
* 'for-upstream/mali-dp' of git://linux-arm.org/linux-ld:
drm: mali-dp: use div_u64 for expensive 64-bit divisions
drm: mali-dp: Check the mclk rate and allow up/down scaling
drm: mali-dp: Enable image enhancement when scaling
drm: mali-dp: Add plane upscaling support
drm/mali-dp: Add core_id file to the sysfs interface
drm: mali-dp: Add CTM support
drm: mali-dp: enable gamma support
drm: mali-dp: add malidp_crtc_state struct
drm: mali-dp: add custom reset hook for planes
drm: mali-dp: remove unused variable
drm: mali-dp: add atomic_print_state for planes
drm: mali-dp: Enable power management for the device.
drm: mali-dp: Update the state of all planes before re-enabling active CRTCs.
Arnd Bergmann [Tue, 25 Apr 2017 19:56:53 +0000 (21:56 +0200)]
drm: mali-dp: use div_u64 for expensive 64-bit divisions
On 32-bit machines, we can't divide 64-bit integers:
drivers/gpu/drm/arm/malidp_crtc.o: In function `malidp_crtc_atomic_check':
malidp_crtc.c:(.text.malidp_crtc_atomic_check+0x3c0): undefined reference to `__aeabi_uldivmod'
malidp_crtc.c:(.text.malidp_crtc_atomic_check+0x3dc): undefined reference to `__aeabi_uldivmod'
This calls the div_u64 function explicitly instead.
Fixes: 4cea4e9f6690 ("drm: mali-dp: Add plane upscaling support")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Chris Wilson [Sat, 22 Apr 2017 08:15:37 +0000 (09:15 +0100)]
drm/i915: Confirm the request is still active before adding it to the await
Although we do check the completion-status of the request before
actually adding a wait on it (either to its submit fence or its
completion dma-fence), we currently do not check before adding it to the
dependency lists.
In fact, without checking for a completed request we may try to use the
signaler after it has been retired and its dependency tree freed:
[ 60.044057] BUG: KASAN: use-after-free in __list_add_valid+0x1d/0xd0 at addr
ffff880348c9e6a0
[ 60.044118] Read of size 8 by task gem_exec_fence/530
[ 60.044164] CPU: 1 PID: 530 Comm: gem_exec_fence Tainted: G E 4.11.0-rc7+ #46
[ 60.044226] Hardware name: ��������������������������������� ���������������������������������/���������������������������������, BIOS RYBDWi35.86A.0246.2
[ 60.044290] Call Trace:
[ 60.044337] dump_stack+0x4d/0x6a
[ 60.044383] kasan_object_err+0x21/0x70
[ 60.044435] kasan_report+0x225/0x4e0
[ 60.044488] ? __list_add_valid+0x1d/0xd0
[ 60.044534] ? kasan_kmalloc+0xad/0xe0
[ 60.044587] __asan_load8+0x5e/0x70
[ 60.044639] __list_add_valid+0x1d/0xd0
[ 60.044788] __i915_priotree_add_dependency+0x67/0x130 [i915]
[ 60.044895] i915_gem_request_await_request+0xa8/0x370 [i915]
[ 60.044974] i915_gem_request_await_dma_fence+0x129/0x140 [i915]
[ 60.045049] i915_gem_do_execbuffer.isra.37+0xb0a/0x26b0 [i915]
[ 60.045077] ? save_stack+0xb1/0xd0
[ 60.045105] ? save_stack_trace+0x1b/0x20
[ 60.045132] ? save_stack+0x46/0xd0
[ 60.045158] ? kasan_kmalloc+0xad/0xe0
[ 60.045184] ? __kmalloc+0xd8/0x670
[ 60.045229] ? drm_ioctl+0x359/0x640 [drm]
[ 60.045256] ? SyS_ioctl+0x41/0x70
[ 60.045330] ? i915_vma_move_to_active+0x540/0x540 [i915]
[ 60.045360] ? tty_insert_flip_string_flags+0xa1/0xf0
[ 60.045387] ? tty_flip_buffer_push+0x63/0x70
[ 60.045414] ? remove_wait_queue+0xa9/0xc0
[ 60.045441] ? kasan_unpoison_shadow+0x35/0x50
[ 60.045467] ? kasan_kmalloc+0xad/0xe0
[ 60.045494] ? kasan_check_write+0x14/0x20
[ 60.045568] i915_gem_execbuffer2+0xdb/0x2a0 [i915]
[ 60.045616] drm_ioctl+0x359/0x640 [drm]
[ 60.045705] ? i915_gem_execbuffer+0x5a0/0x5a0 [i915]
[ 60.045751] ? drm_version+0x150/0x150 [drm]
[ 60.045778] ? compat_start_thread+0x60/0x60
[ 60.045805] ? plist_del+0xda/0x1a0
[ 60.045833] do_vfs_ioctl+0x12e/0x910
[ 60.045860] ? ioctl_preallocate+0x130/0x130
[ 60.045886] ? pci_mmcfg_check_reserved+0xc0/0xc0
[ 60.045913] ? vfs_write+0x196/0x240
[ 60.045939] ? __fget_light+0xa7/0xc0
[ 60.045965] SyS_ioctl+0x41/0x70
[ 60.045991] entry_SYSCALL_64_fastpath+0x17/0x98
[ 60.046017] RIP: 0033:0x7feb2baefc47
[ 60.046042] RSP: 002b:
00007fff56d28e58 EFLAGS:
00000246 ORIG_RAX:
0000000000000010
[ 60.046075] RAX:
ffffffffffffffda RBX:
00007fff56d290a8 RCX:
00007feb2baefc47
[ 60.046102] RDX:
00007fff56d29050 RSI:
00000000c0406469 RDI:
0000000000000003
[ 60.046129] RBP:
00007fff56d29050 R08:
000055ecc4cd27d0 R09:
00007feb2bda8600
[ 60.046154] R10:
0000000000000073 R11:
0000000000000246 R12:
00000000c0406469
[ 60.046177] R13:
0000000000000003 R14:
000000000000000f R15:
0000000000000099
[ 60.046203] Object at
ffff880348c9e680, in cache i915_dependency size: 64
[ 60.046225] Allocated:
[ 60.046246] PID = 530
[ 60.046269] save_stack_trace+0x1b/0x20
[ 60.046292] save_stack+0x46/0xd0
[ 60.046318] kasan_kmalloc+0xad/0xe0
[ 60.046343] kasan_slab_alloc+0x12/0x20
[ 60.046368] kmem_cache_alloc+0xab/0x650
[ 60.046445] i915_gem_request_await_request+0x88/0x370 [i915]
[ 60.046559] i915_gem_request_await_dma_fence+0x129/0x140 [i915]
[ 60.046705] i915_gem_do_execbuffer.isra.37+0xb0a/0x26b0 [i915]
[ 60.046849] i915_gem_execbuffer2+0xdb/0x2a0 [i915]
[ 60.046936] drm_ioctl+0x359/0x640 [drm]
[ 60.046987] do_vfs_ioctl+0x12e/0x910
[ 60.047038] SyS_ioctl+0x41/0x70
[ 60.047090] entry_SYSCALL_64_fastpath+0x17/0x98
[ 60.047139] Freed:
[ 60.047179] PID = 530
[ 60.047223] save_stack_trace+0x1b/0x20
[ 60.047269] save_stack+0x46/0xd0
[ 60.047317] kasan_slab_free+0x72/0xc0
[ 60.047366] kmem_cache_free+0x39/0x160
[ 60.047512] i915_gem_request_retire+0x83f/0x930 [i915]
[ 60.047657] i915_gem_request_alloc+0x166/0x600 [i915]
[ 60.047799] i915_gem_do_execbuffer.isra.37+0xad8/0x26b0 [i915]
[ 60.047897] i915_gem_execbuffer2+0xdb/0x2a0 [i915]
[ 60.047942] drm_ioctl+0x359/0x640 [drm]
[ 60.047968] do_vfs_ioctl+0x12e/0x910
[ 60.047993] SyS_ioctl+0x41/0x70
[ 60.048019] entry_SYSCALL_64_fastpath+0x17/0x98
[ 60.048044] Memory state around the buggy address:
[ 60.048066]
ffff880348c9e580: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc
[ 60.048105]
ffff880348c9e600: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc
[ 60.048138] >
ffff880348c9e680: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
[ 60.048170] ^
[ 60.048191]
ffff880348c9e700: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc
[ 60.048225]
ffff880348c9e780: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc
Note to hit the use-after-free requires us to be passed back a request
via a fence-array, that is from explicit fencing accumulated into a
sync-file fence-array.
Fixes: 52e542090701 ("drm/i915/scheduler: Record all dependencies upon request construction")
Testcase: igt/gem_exec_fence/expired-history
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170422081537.6468-1-chris@chris-wilson.co.uk
(cherry picked from commit
ade0b0c965f59176daddbef9c4717354034f9bce)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Chris Wilson [Fri, 21 Apr 2017 13:58:15 +0000 (14:58 +0100)]
drm/i915: Avoid busy-spinning on VLV_GLTC_PW_STATUS mmio
The busy-spin, as the first stage of intel_wait_for_register(), is
currently under suspicion for causing:
[ 62.034926] NMI watchdog: Watchdog detected hard LOCKUP on cpu 1
[ 62.034928] Modules linked in: i2c_dev i915 intel_gtt drm_kms_helper prime_numbers
[ 62.034932] CPU: 1 PID: 183 Comm: kworker/1:2 Not tainted 4.11.0-rc7+ #471
[ 62.034933] Hardware name: / , BIOS PYBSWCEL.86A.0027.2015.0507.1758 05/07/2015
[ 62.034934] Workqueue: pm pm_runtime_work
[ 62.034936] task:
ffff880275a04ec0 task.stack:
ffffc900002d8000
[ 62.034936] RIP: 0010:__intel_wait_for_register_fw+0x77/0x1a0 [i915]
[ 62.034937] RSP: 0018:
ffffc900002dbc38 EFLAGS:
00000082
[ 62.034939] RAX:
ffffc90003530094 RBX:
0000000000130094 RCX:
0000000000000001
[ 62.034940] RDX:
00000000000000a1 RSI:
ffff88027fd15e58 RDI:
0000000000000000
[ 62.034941] RBP:
ffffc900002dbc78 R08:
0000000000000002 R09:
0000000000000000
[ 62.034942] R10:
ffffc900002dbc18 R11:
ffff880276429dd0 R12:
ffff8802707c0000
[ 62.034943] R13:
00000000000000a0 R14:
0000000000000000 R15:
00000000fffefc10
[ 62.034945] FS:
0000000000000000(0000) GS:
ffff88027fd00000(0000) knlGS:
0000000000000000
[ 62.034945] CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
[ 62.034947] CR2:
00007ffd3cd98ff8 CR3:
0000000274c19000 CR4:
00000000001006e0
[ 62.034947] Call Trace:
[ 62.034948] intel_wait_for_register+0x77/0x140 [i915]
[ 62.034949] vlv_suspend_complete+0x23/0x5b0 [i915]
[ 62.034950] intel_runtime_suspend+0x16c/0x2a0 [i915]
[ 62.034950] pci_pm_runtime_suspend+0x50/0x180
[ 62.034951] ? pci_pm_runtime_resume+0xa0/0xa0
[ 62.034952] __rpm_callback+0xc5/0x210
[ 62.034953] rpm_callback+0x1f/0x80
[ 62.034953] ? pci_pm_runtime_resume+0xa0/0xa0
[ 62.034954] rpm_suspend+0x118/0x580
[ 62.034955] pm_runtime_work+0x64/0x90
[ 62.034956] process_one_work+0x1bb/0x3e0
[ 62.034956] worker_thread+0x46/0x4f0
[ 62.034957] ? __schedule+0x18b/0x610
[ 62.034958] kthread+0xff/0x140
[ 62.034958] ? process_one_work+0x3e0/0x3e0
[ 62.034959] ? kthread_create_on_node+
and related hard lockups in CI for byt and bsw.
Note this effectively reverts commits
41ce405e6894 and
b27366958869
("drm/i915: Convert wait_for(I915_READ(reg)) to intel_wait_for_register()")
v2: Convert bool allow into a u32 mask for clarity and repeat the
comment on vlv rc6 timing to justify the 3ms timeout used for the wait (Ville)
Fixes: 41ce405e6894 ("drm/i915: Convert wait_for(I915_READ(reg)) to intel_wait_for_register()")
Fixes: b27366958869 ("drm/i915: Convert wait_for(I915_READ(reg)) to intel_wait_for_register()")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100718
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Tomi Sarvela <tomi.p.sarvela@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170421135815.11897-1-chris@chris-wilson.co.uk
Tested-by: Tomi Sarvela <tomi.p.sarvela@intel.com>
(cherry picked from commit
3dd14c04d77d7d702de5aa7157df4cc9417329f3)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>