git.openwrt.org Git - openwrt/staging/blogic.git/log

drm/i915: Make some RPS functions static

They are not used anywhere else. Also, fix a small typo in a comment.

No functional changes.

Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Reviewed-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1503532705-3692-1-git-send-email-oscar.mateo@intel.com
Link: https://patchwork.freedesktop.org/patch/msgid/1503532705-3692-1-git-send-email-oscar.mateo@intel.com

drm/i915: Ignore duplicate VMA stored within the per-object handle LUT

By using drm_gem_flink/drm_gem_open on an object using the same fd, it
is possible for a client to create multiple handles pointing to the same
object (tied to the same contexts and VMA), as exemplified by
igt::gem_handle_to_libdrm_bo(). Since this duplication has been possible
since forever, we cannot assume that the handle:(fpriv, object) is
unique and so must handle the multiple users of a single VMA.

v2: Added commentary noise.

Testcase: igt/gem_close
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102355
Fixes: d1b48c1e7184 ("drm/i915: Replace execbuf vma ht with an idr")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170822110517.22277-3-chris@chris-wilson.co.uk
Tested-by: Marta Lofstedt <marta.lofstedt@intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>

drm/i915: Assert that the handle->vma lut is empty on object close

Make sure that we are not leaking an entry in the ctx->handles_lut by
asserting that the object was removed prior to being freed. This should
be enforced by all such handles being removed by i915_gem_close_object.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20170822110517.22277-2-chris@chris-wilson.co.uk
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>

drm/i915: Assert the context is not closed on object-close

During the context-close, we should be decoupling all the vma from the
object so that upon object-closing we shouldn't see any vma from the
already closed contexts. So include a check upon closing the object that
the context is still open.

v2: Eek, the fpriv check is required for shared objects. Double eek, BAT
passed?

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20170822110517.22277-1-chris@chris-wilson.co.uk
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>

drm/i915/cnl: WaForceContextSaveRestoreNonCoherent

To avoid a potential hang condition with TLB invalidation
we need to enable masked bit 5 of MMIO 0xE5F0 at boot.

Same workaround was in place for previous platforms,
but the register offset has changed for CNL.
But also BSpec doesn't mention the bit 15 as set on gen9
platforms and mark bit as reserved on CNL.

v2: Improve commit message accepting Oscar's suggestion.

Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Oscar Mateo <oscar.mateo@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Oscar Mateo <oscar.mateo@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170823203504.10012-1-rodrigo.vivi@intel.com

drm/i915/cnl: WaPushConstantDereferenceHoldDisable

CS sometimes hangs on 3D Push Constant dispatches with the new
deref enhancement logic in CNL.

v2: Improve the commit message (Rodrigo)

Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1503518191-19116-1-git-send-email-oscar.mateo@intel.com

drm/i915: Keep a small stash of preallocated WC pages

We use WC pages for coherent writes into the ppGTT on !llc
architectures. However, to create a WC page requires a stop_machine(),
i.e. is very slow. To compensate we currently keep a per-vm cache of
recently freed pages, but we still see the slow startup of new contexts.
We can amoritize that cost slightly by allocating WC pages in small
batches (PAGEVEC_SIZE == 14) and since creating a WC page implies a
stop_machine() there is no penalty for keeping that stash global.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170822173828.5932-1-chris@chris-wilson.co.uk

drm/i915/cfl: Coffee Lake works on Kaby Lake PCH.

Coffee Lake CPU on Kaby Lake PCH is possible.
It does exist, and it does work.

The only missed case was this warning here noticed
by Wendy who could get one system with this configuration
and reported the issue for us:

Hardware Configuration
Board ID KBL S DDR4 UDIMM EV CRB
Processor Intel® Processor code named Coffee Lake S, (6+2), 6 cores 12 threads, GT2, A0 (Internal) (QNJ4)

[ 3.220585] WARNING: CPU: 10 PID: 206 at drivers/gpu/drm/i915/i915_drv.c:340 i915_driver_load+0x1210/0x1660 [i915]
[ 3.221312] Modules linked in: hid_generic usbhid i915 i2c_algo_bit drm_kms_helper e1000e syscopyarea sysfillrect sysimgblt nvme fb_sys_fops ptp ahci i2c_hid drm pps_core nvme_core libahci wmi hid video
[ 3.222050] CPU: 10 PID: 206 Comm: systemd-udevd Not tainted 4.13.0-rc5-intel-next+ #1
[ 3.222706] Hardware name: Intel Corporation Kabylake Client platform/KBL S DDR4 UDIMM EV CRB, BIOS KBLSE2R1.R00.X089.P00.1705051000 05/05/2017

Cc: Wendy Wang <wendy.wang@intel.com>
Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170821235056.9015-1-rodrigo.vivi@intel.com

drm/i915/cnl: extract cnl_set_procmon_ref_values

Move the part that reads the table and sets registers based on the
table to its own function.

v2: Rebase.

Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170822000356.17330-2-rodrigo.vivi@intel.com

drm/i915/cnl: simplify cnl_procmon_values handling

Make it a little less magical and a little simpler and more hardcoded
so we don't end up with an array that's composed mostly of empty
entries.

v2: Add an enum for the voltage+register values (Ville).

Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170822000356.17330-1-rodrigo.vivi@intel.com

drm/i915: Boost GPU clocks if we miss the pageflip's vblank

If we miss the current vblank because the gpu was busy, that may cause a
jitter as the frame rate temporarily drops. We try to limit the impact
of this by then boosting the GPU clock to deliver the frame as quickly
as possible. Originally done in commit 6ad790c0f5ac ("drm/i915: Boost GPU
frequency if we detect outstanding pageflips") but was never forward
ported to atomic and finally dropped in commit fd3a40242e87 ("drm/i915:
Rip out legacy page_flip completion/irq handling").

One of the most typical use-cases for this is a mostly idle desktop.
Rendering one frame of the desktop's frontbuffer can easily be
accomplished by the GPU running at low frequency, but often exceeds
the time budget of the desktop compositor. The result is that animations
such as opening the menu, doing a fullscreen switch, or even just trying
to move a window around are slow and jerky. We need to respond within a
frame to give the best impression of a smooth UX, as a compromise we
instead respond if that first frame misses its goal. The result should
be a near-imperceivable initial delay and a smooth animation even
starting from idle. The cost, as ever, is that we spend more power than
is strictly necessary as we overestimate the required GPU frequency and
then try to ramp down.

This of course is reactionary, too little, too late; nevertheless it is
surprisingly effective.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102199
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170817123706.6777-1-chris@chris-wilson.co.uk
Tested-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Radoslaw Szwichtenberg <radoslaw.szwichtenberg@intel.com>

drm/i915: Constify states passed to enable/disable/etc. encoder hooks

The enable/disable/etc. encoder hooks aren't supposed to alter the
state(s), so pass them as const. Unfortunately C lacks any kind of deep
const thingy, so this can't catch all abuses. But at least it acts as a
hint to the reader telling them not to mess about with the state(s).

v2: Update intel_tv_mode_find() and ironlake_edp_pll_on() as well
v3: Deal with intel_sdvo_connector_state

Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170818134958.15502-9-ville.syrjala@linux.intel.com

drm/i915: Plumb crtc_state to PSR enable/disable

The PSR enable/disable need to know things about the crtc state, so
plumb it through. This will become even more important when we start
to reuse the generic infoframe code for the VSC DIP programming as the
infoframe code wants the crtc state as well.

v2: Fix kernel docs

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170818134958.15502-7-ville.syrjala@linux.intel.com

drm/i915: Init infoframe vfuncs for DP encoders as well

DP ports may want to use the video DIP for SDP transmission, so let's
initialize the vfuncs for DP encoders as well. The only exception is
port A eDP prior to HSW as that one doesn't have a video DIP instance.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Shashank Sharma <shashank.sharma@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170818134958.15502-6-ville.syrjala@linux.intel.com

drm/i915: Move infoframe vfuncs into intel_digital_port

DP ports will also want to utilize the video DIP for SDP transmission.
So let's move the vfuncs into the dig_port.

v2: Rebase due to DDI changes

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Shashank Sharma <shashank.sharma@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170818134958.15502-5-ville.syrjala@linux.intel.com

drm/i915: Disable infoframes when shutting down DDI HDMI

Disabling the video DIP when shutting the port down seems like a good
idea.

Bspec says:
"When disabling both the DIP port and DIP transmission,
first disable the port and then disable DIP."
and
"Restriction : GCP is only supported with HDMI when the bits per color is
not equal to 8. GCP must be enabled prior to enabling TRANS_DDI_FUNC_CTL
for HDMI with bits per color not equal to 8 and disabled after disabling
TRANS_DDI_FUNC_CTL"

So let's do it in the .post_disable() hook.

v2: Remove double "dpms off" caused by rebase fail

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Shashank Sharma <shashank.sharma@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170822140914.24413-1-ville.syrjala@linux.intel.com

drm/i915: Check has_infoframes when enabling infoframes

has_infoframe is what tells us whether infoframes should be enabled, so
let's pass that instead of has_hdmi_sink to .set_infoframes().

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170818134958.15502-3-ville.syrjala@linux.intel.com

drm/i915: Re-enable per-engine reset for Broxton

The corruption in CSB mmio reads we were seeing has been tracked down to
incorrectly touching forcewake of all domains, following an engine reset.
It is still a mistery why we only catched this in Broxton, since it
could happen in any platform.

With that fix already merged, commit 4055dc75d6b5 ("drm/i915: Stop
touching forcewake following a gen6+ engine reset"), lets try to enable
per-engine resets in Broxton one more time.

This reverts commit f188258bde0f ("drm/i915: Disable per-engine reset for
Broxton").

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170818172342.7282-1-michel.thierry@intel.com
Tested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/dp: make is_edp non-static and rename to intel_dp_is_edp

Expose across driver for future work. No functional changes.

Cc: Manasi Navare <manasi.d.navare@intel.com>
Cc: Jim Bride <jim.bride@linux.intel.com>
Reviewed-by: Jim Bride <jim.bride@linux.intel.com>
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170818093020.19160-2-jani.nikula@intel.com

drm/i915/dp: rename intel_dp_is_edp to intel_dp_is_port_edp

Emphasize that this is based on the port, not intel_dp. This is also in
line with the underlying intel_bios_is_port_edp() function. No
functional changes.

Cc: Manasi Navare <manasi.d.navare@intel.com>
Cc: Jim Bride <jim.bride@linux.intel.com>
Reviewed-by: Jim Bride <jim.bride@linux.intel.com>
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170818093020.19160-1-jani.nikula@intel.com

drm/i915/cnl: Apply large line width optimization

This bit enables hardware that will change the approximation used for distances
calculations for AA wide lines so that they are rendered more accurately.

The default value for this bit leaves the legacy behavior. There is no good
reason to not enable the new approximation except if comparing to previous GEN
rendered images.

v2: Rebase
v3: Fix author.
    Rebased by Rodrigo who also added  a comment as suggested by Oscar.
    Since it is surrounded by Workarounds let's just add a comment to
    make clear it is not an Wa.

Cc: Oscar Mateo <oscar.mateo@intel.com>
Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Oscar Mateo <oscar.mateo@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170815231651.975-4-rodrigo.vivi@intel.com

drm/i915/cnl: WaDisableEnhancedSBEVertexCaching

WA forTDS handle reallocation getting dropped by SDE,
which may result in PS attribute corruption.

Disable enhanced SBE vertex caching in COMMON_SLICE_CHICKEN2 offset.

v2: Make it until B0 as spec tells. (by Mika).

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Oscar Mateo <oscar.mateo@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170815231651.975-3-rodrigo.vivi@intel.com

drm/i915/cnl: Add WaDisableReplayBufferBankArbitrationOptimization

WA to disable replay buffer destination buffer arbitration optimization.

Same Wa on previous platforms has a different name: WaToEnableHwFixForPushConstHWBug

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170815231651.975-2-rodrigo.vivi@intel.com

drm/i915/cnl: Introduce initial Cannonlake Workarounds.

Let's inherit workarounds from previous platforms that
according to wa_database and BSpec are still valid for
Cannonlake.

v2: Add missed workarounds.
v3: Rebase
v4: Remove bad chunk that was added to rc6 disable. (Ander)
    Also remove A0 W/a that are not needed anymore.
v5: Rebase on top of CFL.
v6: Remove empty gen9_init_perctx_bb and gen9_init_indirectctx_bb
    since they don't carry any gen10 related W/a. (by Oscar).
    Also Remove A0 exclusive workaround.
v7: Remove more A0 exclusive workarounds. As pointed out by Oscar
    many workarounds were changed to be A0 only so let's remove
    them.

Cc: Oscar Mateo <oscar.mateo@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Oscar Mateo <oscar.mateo@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170815231651.975-1-rodrigo.vivi@intel.com

drm/i915: Clear lost context-switch interrupts across reset

During a global reset, we disable the irq. As we disable the irq, the
hardware may be raising a GT interrupt that we then ignore, leaving it
pending in the GTIIR. After the reset, we then re-enable the irq,
triggering the pending interrupt. However, that interrupt was for the
stale state from before the reset, and the contents of the CSB buffer
are now invalid.

v2: Add a comment to make it clear that the double clear is purely my
paranoia.

Reported-by: "Dong, Chuanxiao" <chuanxiao.dong@intel.com>
Fixes: 821ed7df6e2a ("drm/i915: Update reset path to fix incomplete requests")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: "Dong, Chuanxiao" <chuanxiao.dong@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Michal Winiarski <michal.winiarski@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170807121919.30165-1-chris@chris-wilson.co.uk
Link: https://patchwork.freedesktop.org/patch/msgid/20170818090509.5363-1-chris@chris-wilson.co.uk
Reviewed-by: Michel Thierry <michel.thierry@intel.com>

drm/i915: Update DRIVER_DATE to 20170818

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/bxt: use NULL for GPIO connection ID

The commit 213e08ad60ba
("drm/i915/bxt: add bxt dsi gpio element support")
enables GPIO support for Broxton based platforms.

While using that API we might get into troubles in the future, because
we can't rely on label name in the driver since vendor firmware might
provide any GPIO pin there, e.g. "reset", and even mark it in _DSD (in
which case the request will fail).

To avoid inconsistency and potential issues we have two options:
a) generate GPIO ACPI mapping table and supply it via
acpi_dev_add_driver_gpios(), or
b) just pass NULL as connection ID.

The b) approach is much simpler and would work since the driver relies
on GPIO indices only. Moreover, the _CRS fallback mechanism, when
requesting GPIO, has been made stricter, and supplying non-NULL
connection ID when neither _DSD, nor GPIO ACPI mapping is present, is
making request fail.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101921
Fixes: f10e4bf6632b ("gpio: acpi: Even more tighten up ACPI GPIO lookups")
Cc: Mika Kahola <mika.kahola@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Tested-by: Mika Kahola <mika.kahola@intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170817105541.63914-1-andriy.shevchenko@linux.intel.com

drm/i915: Mark the GT as busy before idling the previous request

In a synchronous setup, we may retire the last request before we
complete allocating the next request. As the last request is retired, we
queue a timer to mark the device as idle, and promptly have to execute
ad cancel that timer once we complete allocating the request and need to
keep the device awake. If we rearrange the mark_busy() to occur before
we retire the previous request, we can skip this ping-pong.

v2: Joonas pointed out that unreserve_seqno() was now doing more than
doing seqno handling and should be renamed to reflect its wider purpose.
That also highlighted the new asymmetry with reserve_seqno(), so fixup
that and rename both to [un]reserve_engine().

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170817144719.10968-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

drm/i915: Trivial grammar fix s/opt of/opt out of/ in comment

The word out was dropped from the sentence across the line break, put it
back.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20170816085210.4199-6-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

drm/i915: Replace execbuf vma ht with an idr

This was the competing idea long ago, but it was only with the rewrite
of the idr as an radixtree and using the radixtree directly ourselves,
along with the realisation that we can store the vma directly in the
radixtree and only need a list for the reverse mapping, that made the
patch performant enough to displace using a hashtable. Though the vma ht
is fast and doesn't require any extra allocation (as we can embed the node
inside the vma), it does require a thread for resizing and serialization
and will have the occasional slow lookup. That is hairy enough to
investigate alternatives and favour them if equivalent in peak performance.
One advantage of allocating an indirection entry is that we can support a
single shared bo between many clients, something that was done on a
first-come first-serve basis for shared GGTT vma previously. To offset
the extra allocations, we create yet another kmem_cache for them.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170816085210.4199-5-chris@chris-wilson.co.uk

drm/i915: Simplify eb_lookup_vmas()

Since the introduction of being able to perform a lockless lookup of an
object (i915_gem_object_get_rcu() in fbbd37b36fa5 ("drm/i915: Move object
release to a freelist + worker") we no longer need to split the
object/vma lookup into 3 phases and so combine them into a much simpler
single loop.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170816085210.4199-4-chris@chris-wilson.co.uk

drm/i915: Convert execbuf to use struct-of-array packing for critical fields

When userspace is doing most of the work, avoiding relocs (using
NO_RELOC) and opting out of implicit synchronisation (using ASYNC), we
still spend a lot of time processing the arrays in execbuf, even though
we now should have nothing to do most of the time. One issue that
becomes readily apparent in profiling anv is that iterating over the
large execobj[] is unfriendly to the loop prefetchers of the CPU and it
much prefers iterating over a pair of arrays rather than one big array.

v2: Clear vma[] on construction to handle errors during vma lookup

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170816085210.4199-3-chris@chris-wilson.co.uk

drm/i915: Check context status before looking up our obj/vma

Since we keep the context around across the slow lookup where we may
drop the struct_mutex, we should double check that the context is still
valid upon reacquisition.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170816085210.4199-2-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>

drm/i915: Don't use MI_STORE_DWORD_IMM on Sandybridge/vcs

MI_STORE_DWORD_IMM just doesn't work on the video decode engine under
Sandybridge, so refrain from using it. Then switch the selftests over to
using the now common test prior to using MI_STORE_DWORD_IMM.

Fixes: 7dd4f6729f92 ("drm/i915: Async GPU relocation processing")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: <drm-intel-fixes@lists.freedesktop.org> # v4.13-rc1+
Link: https://patchwork.freedesktop.org/patch/msgid/20170816085210.4199-1-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>

drm/i915: Stop touching forcewake following a gen6+ engine reset

Forcewake is not affected by the engine reset on gen6+. Indeed the
reason why we added intel_uncore_forcewake_reset() to
gen6_reset_engines() was to keep the bookkeeping intact because the
reset did not touch the forcewake bit (yet we cancelled the forcewake
consumers)!  This was done in commit 521198a2e7095:
    Author: Mika Kuoppala <mika.kuoppala@linux.intel.com>
    Date:   Fri Aug 23 16:52:30 2013 +0300

drm/i915: sanitize forcewake registers on reset

In reset we try to restore the forcewake state to
pre reset state, using forcewake_count. The reset
doesn't seem to clear the forcewake bits so we
get warn on forcewake ack register not clearing.

That futzing of the forcewake bookkeeping was dropped in commit
0294ae7b44bb ("drm/i915: Consolidate forcewake resetting to a single
function"), but it did not make the realisation that the remaining
intel_uncore_forcewake_reset() was redundant.

The new danger with using intel_uncore_forcewake_reset() with per-engine
resets is that the driver and hw are still in an active state as we
perform the reset. We may be using the forcewake to read protected
registers elsewhere and those results may be clobbered by the concurrent
dropping of forcewake.

Reported-by: Michel Thierry <michel.thierry@intel.com>
Fixes: 142bc7d99bcf ("drm/i915: Modify error handler for per engine hang recovery")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170817173229.20324-1-chris@chris-wilson.co.uk
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>

MAINTAINERS: drm/i915 has a new maintainer team

For a bunch of reasons[1] I've decided to step down as maintainer and
let some other folks enjoy the reputation and hang out in the
spotlight.

Jani is going to stick around with his expertise in kms and having
done the fixes flow for a long time now. Joonas will join and bring in
his knowledge on all things GEM. Rodrigo has been less visible because
he's been doing tons of work taking care of the internal branch, and
it'd be good to have more continuity between these two worlds also on
the maintainer side.

1: They all boil down to: This is going to happen sooner or later
anyway, we have a great team, with the process improvements over the
last few years things work rather well, now is as good as any time to
do this. With that change I'll have more time for other aspects of the
stack development than maintainership.

Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Dave Airlie <airlied@gmail.com>
Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Acked-by: Jani Nikula <jani.nikula@linux.intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170815160101.1683-1-daniel.vetter@ffwll.ch

drm/i915: Split pin mapping into per platform functions

Cleanup the code. Map the pins in accordance to
individual platforms rather than according to ports.
Create separate functions for platforms.

v2:
- Add missing condition for CoffeeLake. Make platform
specific functions static. Add function
i915_ddc_pin_mapping().

v3:
- Rename functions to x_port_to_ddc_pin() which directly
   indicates the purpose. Correct default return values on CNP
   and BXT. Rename i915_port_to_ to g4x_port_to since that was
   the first platform to run this. Correct code style. (Paulo)

Sugested-by Ville Syrjala <ville.syrjala@linux.intel.com>
Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Clinton Taylor <clinton.a.taylor@intel.com>
Signed-off-by: Anusha Srivatsa <anusha.srivatsa@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1502927114-24012-1-git-send-email-anusha.srivatsa@intel.com

drm/i915/opregion: let user specify override VBT via firmware load

Sometimes it would be most enlightening to debug systems by replacing
the VBT to be used. For example, in the referenced bug the BIOS provides
different VBT depending on the boot mode (UEFI vs. legacy). It would be
interesting to try the failing boot mode with the VBT from the working
boot, and see if that makes a difference.

Add a module parameter to load the VBT using the firmware loader, not
unlike the EDID firmware mechanism.

As a starting point for experimenting, one can pick up the BIOS provided
VBT from /sys/kernel/debug/dri/0/i915_opregion/i915_vbt.

v2: clarify firmware load return value check (Bob)

v3: kfree the loaded firmware blob

References: https://bugs.freedesktop.org/show_bug.cgi?id=97822#c83
Reviewed-by: Bob Paauwe <bob.j.paauwe@intel.com>
Acked-by: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170817115209.25912-1-jani.nikula@intel.com

drm/i915/cnl: Reuse skl_wm_get_hw_state on Cannonlake.

Otherwise it reuses the ilk that has a completely different
wm.

Cc: Mahesh Kumar <mahesh1.kumar@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170809205248.11917-6-rodrigo.vivi@intel.com

drm/i915/gen10: implement gen 10 watermarks calculations

They're slightly different than the gen 9 calculations.

v2: Remove TODO comment. Code matches recent spec.
v3: Rebase on top of latest skl code using new fp16.16 and
    fixing a logic issue. Auto rebase bot has apparently
    made some bad decisions that changed the logic of the
    code. (Noticed by Manesh, updated by Rodrigo).

Cc: Mahesh Kumar <mahesh1.kumar@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Mahesh Kumar <mahesh1.kumar@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170811233825.32083-1-rodrigo.vivi@intel.com

drm/i915/cnl: Fix LSPCON support.

When LSPCON support was extended to CNL
one part was missed on lspcon_init.

So, instead of adding check per platform on lspcon_init
let's use HAS_LSPCON that is already there for that
purpose.

Fixes: ff15947e0f02 ("drm/i915/cnl: LSPCON support is gen9+")
Cc: Shashank Sharma <shashank.sharma@intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Shashank Sharma <shashank.sharma@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170816030403.11368-1-rodrigo.vivi@intel.com

drm/i915/vbt: ignore extraneous child devices for a port

Ever since we've parsed VBT child devices, starting from 6acab15a7b0d
("drm/i915: use the HDMI DDI buffer translations from VBT"), we've
ignored the child device information if more than one child device
references the same port. The rationale for this seems lost in time.

Since commit 311a20949f04 ("drm/i915: don't init DP or HDMI when not
supported by DDI port") we started using this information more to skip
HDMI/DP init if the port wasn't there per VBT child devices. However, at
the same time it added port defaults without further explanation.

Thus, if the child device info was skipped due to multiple child devices
referencing the same port, the device info would be retrieved from the
somewhat arbitrary defaults.

Finally, when commit bb1d132935c2 ("drm/i915/vbt: split out defaults
that are set when there is no VBT") stopped initializing the defaults
whenever VBT is present, thus trusting the VBT more, we stopped
initializing ports which were referenced by more than one child device.

Apparently at least Asus UX305UA, UX305U, and UX306U laptops have VBT
child device blocks which cause this behaviour. Arguably they were
shipped with a broken VBT.

Relax the rules for multiple references to the same port, and use the
first child device info to reference a port. Retain the logic to debug
log about this, though.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101745
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=196233
Fixes: bb1d132935c2 ("drm/i915/vbt: split out defaults that are set when there is no VBT")
Tested-by: Oliver Weißbarth <mail@oweissbarth.de>
Reported-by: Oliver Weißbarth <mail@oweissbarth.de>
Reported-by: Didier G <didierg-divers@orange.fr>
Reported-by: Giles Anderson <agander@gmail.com>
Cc: Manasi Navare <manasi.d.navare@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: <stable@vger.kernel.org> # v4.12+
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170811113907.6716-1-jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>

drm/i915/cnl: Setup PAT Index.

Different from previous platforms, on CNL+ there's separated
registers for separated indexes.

v2: Remove comments regarding uncertainty around the table.
v3: Remove extra line (by Ben)

Cc: Clint Taylor <clinton.a.taylor@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170815232539.3562-1-rodrigo.vivi@intel.com

drm/i915/edp: Allow alternate fixed mode for eDP if available.

Some fixed resolution panels actually support more than one mode,
with the only thing different being the refresh rate.  Having this
alternate mode available to us is desirable, because it allows us to
test PSR on panels whose setup time at the preferred mode is too long.
With this patch we allow the use of the alternate mode if it's
available and it was specifically requested.

v2 and v3: Rebase
v4: * Fix up some leaky mode stuff (Chris)
    * Rebase
v5: * Fix a NULL pointer derefrence (David Weinehall)
v6: * Whitespace / spelling / checkpatch clean-up; no functional
      change. (David)
    * Rebase

Cc: David Weinehall <david.weinehall@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: David Weinehall <david.weinehall@linux.intel.com>
Signed-off-by: Jim Bride <jim.bride@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1502308133-26892-1-git-send-email-jim.bride@linux.intel.com

drm/i915: Add support for drm syncobjs

This commit adds support for waiting on or signaling DRM syncobjs as
part of execbuf. It does so by hijacking the currently unused cliprects
pointer to instead point to an array of i915_gem_exec_fence structs
which containe a DRM syncobj and a flags parameter which specifies
whether to wait on it or to signal it. This implementation
theoretically allows for both flags to be set in which case it waits on
the dma_fence that was in the syncobj and then immediately replaces it
with the dma_fence from the current execbuf.

v2:
- Rebase on new syncobj API
v3:
- Pull everything out into helpers
- Do all allocation in gem_execbuffer2
- Pack the flags in the bottom 2 bits of the drm_syncobj*
v4:
- Prevent a potential race on syncobj->fence

Testcase: igt/gem_exec_fence/syncobj*
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Link: https://patchwork.freedesktop.org/patch/msgid/1499289202-25441-1-git-send-email-jason.ekstrand@intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20170815145733.4562-1-chris@chris-wilson.co.uk

drm/i915: Handle full s64 precision for wait-ioctl

The wait-ioctl is optionally supplied a timeout with nanosecond
precision in a s64 field. We use nsecs_to_jiffies64() to convert that
into the jiffies consumed by the scheduler, but internally
nsecs_to_jiffies64() does not guard against overflow (as it's purpose is
for use by the scheduler and not drivers!). So we must guard against the
overflow ourselves, and in the process note that we may then return
much earlier than the timeout selected by the user, so don't report
ETIME unless we do hit the timeout. (Woe betold us though if the user
waits for a year (32bit) and the request is still not complete!)

v2: Refine overflow detection (to not include an overffow itself)

Reported-by: Jason Ekstrand <jason.ekstrand@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20170811105731.9482-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

drm/i915: Split obj->cache_coherent to track r/w

Another month, another story in the cache coherency saga. This time, we
come to the realisation that i915_gem_object_is_coherent() has been
reporting whether we can read from the target without requiring a cache
invalidate; but we were using it in places for testing whether we could
write into the object without requiring a cache flush. So split the
tracking into two, one to decide before reads, one after writes.

See commit e27ab73d17ef ("drm/i915: Mark CPU cache as dirty on every
transition for CPU writes") for the previous entry in this saga.

v2: Be verbose
v3: Remove unused function (i915_gem_object_is_coherent)
v4: Fix inverted coherency check prior to execbuf (from v2)
v5: Add comment for nasty code where we are optimising on gcc's behalf.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101109
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101555
Testcase: igt/kms_mmap_write_crc
Testcase: igt/kms_pwrite_crc
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Dongwon Kim <dongwon.kim@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Tested-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170811111116.10373-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

drm/i915/hsw+: Add support for multiple power well regs

Future platforms increase the number of power wells which require
additional control registers. A convenient way to select the correct
register is to use the high bits of the power well ID as index. This
patch only prepares for this, while upcoming platform enabling patches
will add the actual new power well IDs and corresponding power well
control registers.

Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Animesh Manna <animesh.manna@intel.com>
Cc: Rakshmi Bhatia <rakshmi.bhatia@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Animesh Manna <animesh.manna@intel.com>
Reviewed-by: Rakshmi Bhatia <rakshmi.bhatia@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170814151530.24154-2-imre.deak@intel.com

drm/i915: Work around GCC anonymous union initialization bug

GCC 4.4 can't cope with anonymous union initializers which seems to be a
bug in that version (see the Reference) and is fixed since GCC version
4.6. A workaround which is also used elsewhere in the kernel for the
same purpose is to wrap the initialization in curly braces, so do the
same here.

Fixes: b5565a2efc12 ("drm/i915/bxt, glk: Give a proper name to the power well struct phy field")
Reference: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=10676
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20170814151530.24154-1-imre.deak@intel.com

Merge tag 'gvt-next-2017-08-15' of https://github.com/01org/gvt-linux into drm-intel-next-queued

gvt-next-2017-08-15

gvt update for 4.14
- MMIO save/restore optimization (Changbin)
- Split workload scan vs. dispatch for more parallel exec (Ping)
- vGPU full 48bit ppgtt support (Joonas, Tina)
- vGPU hw id expose for perf (Zhenyu)
- other misc cleanup and fixes

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20170815023940.skhjfcsyrao7axqi@zhen-hp.sh.intel.com

drm/i915: Initialize 'data' in intel_dsi_dcs_backlight.c

variable 'data' may be used uninitialized in this function. thus,
'function dcs_get_backlight' will return unwanted value/fail.

Thus, adding NULL initialized to 'data' variable will solve the return
failure happening.

v2: Change commit message to reflect upstream with proper message

Fixes: 90198355b83c ("drm/i915/dsi: Add DCS control for Panel PWM")
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Yetunde Adebisi <yetundex.adebisi@intel.com>
Cc: Deepak M <m.deepak@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Signed-off-by: Balasubramaniam, Hari Chand <hari.chand.balasubramaniam@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1502762746-191826-1-git-send-email-hari.chand.balasubramaniam@intel.com

drm/i915/dp: Validate the compliance test link parameters

Validate the compliance test link parameters when the compliance
test dpcd registers are read. Also validate them in compute_config
before using them since the max values might have been reduced
due to link training fallback.

If either the link rate or lane count is invalid, we still bail
from using the test parameters since the combination would not work
and instead use the fallback values.

v2:
* Added commit message to explain why we still bail when either of
of the params is invalid (Ville Syrjala)
* Add reason for validating in the comment (Jani Nikula)
* Also check if index >= 0 after validating (Jani Nikula)

Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Tested-by: Nathan Ciobanu <nathan.d.ciobanu@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1496954463-18038-2-git-send-email-manasi.d.navare@intel.com

drm/i915/dp: Generalize intel_dp_link_params function to accept arguments to be validated

This function now takes the link rate and lane ocunt to be validated
as an argument so that this can be used for validating even the
compliance test link parameters.

Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Reviewed-by: Jani Nikula <jani.nikula@linux.intel.com>
Tested-by: Nathan Ciobanu <nathan.d.ciobanu@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1496954463-18038-1-git-send-email-manasi.d.navare@intel.com

drm/i915/gvt: Fix guest i915 full ppgtt blocking issue

Guest i915 full ppgtt functionality was blocking by an issue, which would
lead to gpu hardware hang. Guest i915 driver may update the ppgtt table
just before this workload is going to be submitted to the hardware by
device model. This case wasn't handled well by device model before, due
to the small time window between removing old ppgtt entry and adding the
new one. Errors occur when the workload is executed by hardware during
that small time window. This patch is to remove this time window by adding
the new ppgtt entry first and then remove the old one.

Changes in v2:
- Move VGT_CAPS_FULL_PPGTT introduction to patch 2/4. (Joonas)

Changes since v2:
- Divide the whole patch set into two separate patch series, with one
  patch in i915 side to check guest i915 full ppgtt capability and enable
  it when this capability is supported by the device model, and the other
  one in gvt side which fixs the blocking issue and enables the device
  model to provide the capability to guest. And this patch focuses on gvt
  side. (Joonas)
- Change the title from "reorder the shadow ppgtt update process by adding
  entry first" to "Fix guest i915 full ppgtt blocking issue". (Tina)

Changes since v3:
- Rebase to the latest branch.

Changes since v4:
- Tested by Tina Zhang.

Changes since v5:
- Rebase to the latest branch.

v6:
- Update full 48bit ppgtt definition

Cc: Tina Zhang <tina.zhang@intel.com>
Signed-off-by: Tina Zhang <tina.zhang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915: Enable guest i915 full ppgtt functionality

Enable the guest i915 full ppgtt functionality when host can provide this
capability. vgt_caps is introduced to guest i915 driver to get the vgpu
capabilities from the device model. VGT_CPAS_FULL_PPGTT is one of the
capabilities type to let guest i915 dirver know that the guest i915 full
ppgtt is supported by device model.

Notice that the minor version of pvinfo isn't bumped because of this
vgt_caps introduction, due to older guest would be broken by simply
increasing the pvinfo version. Although the pvinfo minor version doesn't
increase, the compatibility won't be blocked. The compatibility is ensured
by checking the value of caps field in pvinfo. Zero means no full ppgtt
support and BIT(2) means this feature is provided.

Changes since v1:
- Use u32 instead of uint32_t (Joonas)
- Move VGT_CAPS_FULL_PPGTT introduction to this patch and use #define
  instead of enum (Joonas)
- Rewrite the vgpu full ppgtt capability checking logic. (Joonas)
- Some coding style refine. (Joonas)

Changes since v2:
- Divide the whole patch set into two separate patch series, with one
  patch in i915 side to check guest i915 full ppgtt capability and enable
  it when this capability is supported by the device model, and the other
  one in gvt side which fixs the blocking issue and enables the device
  model to provide the capability to guest. And this patch focuses on guest
  i915 side. (Joonas)
- Change the title from "introduce vgt_caps to pvinfo" to
  "Enable guest i915 full ppgtt functionality". (Tina)

Change since v3:
- Add some comments about pvinfo caps and version. (Joonas)

Change since v4:
- Tested by Tina Zhang.

Change since v5:
- Add limitation about supporting 32bit full ppgtt.

Change since v6:
- Change the fallback to 48bit full ppgtt if i915.ppgtt_enable=2. (Zhenyu)

Change in v9:
- Remove the fixme comment due to no plan for 32bit full ppgtt
  support. (Zhenyu)
- Reorder the patch-set to fix compiling issue with git-bisect. (Zhenyu)
- Add print log when forcing guest 48bit full ppgtt. (Zhenyu)

v10:
- Update against Joonas's has_full_ppgtt and has_full_48bit_ppgtt disconnect
  change. (Zhenyu)

Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> # in v2
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tina Zhang <tina.zhang@intel.com>
Signed-off-by: Tina Zhang <tina.zhang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915: Disconnect 32 and 48 bit ppGTT support

Configurations like virtualized environments may support only 48 bit
ppGTT without supporting 32 bit ppGTT. Support this by disconnecting
the relationship of the two feature bits.

Cc: Tina Zhang <tina.zhang@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Zhi Wang <zhi.a.wang@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915: More surgically unbreak the modeset vs reset deadlock

There's no reason to entirely wedge the gpu, for the minimal deadlock
bugfix we only need to unbreak/decouple the atomic commit from the gpu
reset. The simplest way to fix that is by replacing the
unconditional fence wait a the top of commit_tail by a wait which
completes either when the fences are done (normal case, or when a
reset doesn't need to touch the display state). Or when the gpu reset
needs to force-unblock all pending modeset states.

The lesser source of deadlocks is when we try to pin a new framebuffer
and run into a stall. There's a bunch of places this can happen, like
eviction, changing the caching mode, acquiring a fence on older
platforms. And we can't just break the depency loop and keep going,
the only way would be to break out and restart. But the problem with
that approach is that we must stall for the reset to complete before
we grab any locks, and with the atomic infrastructure that's a bit
tricky. The only place is the ioctl code, and we don't want to insert
code into e.g. the BUSY ioctl. Hence for that problem just create a
critical section, and if any code is in there, wedge the GPU. For the
steady-state this should never be a problem.

Note that in both cases TDR itself keeps working, so from a userspace
pov this trickery isn't observable. Users themselvs might spot a short
glitch while the rendering is catching up again, but that's still
better than pre-TDR where we've thrown away all the rendering,
including innocent batches. Also, this fixes the regression TDR
introduced of making gpu resets deadlock-prone when we do need to
touch the display.

One thing I noticed is that gpu_error.flags seems to use both our own
wait-queue in gpu_error.wait_queue, and the generic wait_on_bit
facilities. Not entirely sure why this inconsistency exists, I just
picked one style.

A possible future avenue could be to insert the gpu reset in-between
ongoing modeset changes, which would avoid the momentary glitch. But
that's a lot more work to implement in the atomic commit machinery,
and given that we only need this for pre-g4x hw, of questionable
utility just for the sake of polishing gpu reset even more on those
old boxes. It might be useful for other features though.

v2: Rebase onto 4.13 with a s/wait_queue_t/struct wait_queue_entry/.

v3: Really emabarrassing fixup, I checked the wrong bit and broke the
unbreak/wakeup logic.

v4: Also handle deadlocks in pin_to_display.

v5: Review from Michel:
- Fixup the BUILD_BUG_ON
- Don't forget about the overlay

Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> (v2)
Cc: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170808080828.23650-3-daniel.vetter@ffwll.ch
Reviewed-by: Michel Thierry <michel.thierry@intel.com>

drm/i915: Push i915_sw_fence_wait into the nonblocking atomic commit

Blocking in a worker is ok, that's what the unbound_wq is for. And it
unifies the paths between the blocking and nonblocking commit, giving
me just one path where I have to implement the deadlock avoidance
trickery in the next patch.

I first tried to implement the following patch without this rework, but
force-completing i915_sw_fence creates some serious challenges around
properly cleaning things up. So wasn't a feasible short-term approach.
Another approach would be to simple keep track of all pending atomic
commit work items and manually queue them from the reset code. With the
caveat that double-queue in case we race with the i915_sw_fence must be
avoided. Given all that, taking the cost of a double schedule in atomic
for the short-term fix is the best approach, but can be changed in the future of course.

v2: Amend commit message (Chris).

v3: Add comment explaining why we do nothing in the sw_fence complete
callback (Michel).

Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> (v2)
Cc: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170808080828.23650-2-daniel.vetter@ffwll.ch

drm/i915: Avoid the gpu reset vs. modeset deadlock

... using the biggest hammer we have. This is essentially a weaponized
version of the timeout-based wedging Chris added in

commit 36703e79a982c8ce5a8e43833291f2719e92d0d1
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Thu Jun 22 11:56:25 2017 +0100

drm/i915: Break modeset deadlocks on reset

Because defense-in-depth is good it's good to still have both. Also
note that with the locking change we can now restrict this a lot (old
gpus and special testing only), so this doesn't kill the TDR benefits
on at least anything remotely modern.

And futuremore with a few tricks it should be possible to make a much
more educated guess about whether an atomic commit is stuck waiting on
the gpu (atomic_t counting the pending i915_sw_fence used by the
atomic modeset code should do it), so we can improve this.

But for now just start with something that is guaranteed to recover
faster, for much better CI througput.

This defacto reverts TDR on these platforms, but there's not really a
single commit to specify as the sole offender.

v2: Add a debug message to explain what's going on. We can't DRM_ERROR
because that spams CI. And the timeout based fallback still prints a
DRM_ERROR, in case something goes wrong.

v3: Fix comment layout (Michel)

Fixes: 4680816be336 ("drm/i915: Wait first for submission, before waiting for request completion")
Fixes: 221fe7994554 ("drm/i915: Perform a direct reset of the GPU from the waiter")
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> (v2)
Cc: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> (v2)
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170808080828.23650-1-daniel.vetter@ffwll.ch

drm/i915/gen9: Send all components in VF state

Update gen9 renderstate to account the, long overdue, changes for
igt commit 5c07135b7bd2 ("tools/null_state/gen9: Send all
components in VF state").

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20170810110451.31635-1-mika.kuoppala@intel.com

drm/i915/guc: Rename GuC irq trigger function

We should emphasize that irq raising function depends on Gen.

v2: use yet another better name (Chris)

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20170809212603.28780-1-michal.wajdeczko@intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Suppress switch_mm emission between the same aliasing_ppgtt

When switching between contexts using the aliasing_ppgtt, the VM is
shared. We don't need to reload the PD registers unless they are dirty.

Martin Peres reported an issue that looks like corruption between
Haswell context switches, bisecting to commit f9326be5f1d3 ("drm/i915:
Rearrange switch_context to load the aliasing ppgtt on first use").
Switching between the same mm (the aliasing_ppgtt is used for all
contexts in this case) should be a nop, but appears to trigger some
side-effects in the context switch. However, as we know the switch
is redundant in this case, we can skip it and continue to ignore the
issue until somebody feels strong enough to investigate full-ppgtt on
gen7 again!

Except.. Martin was using full-ppgtt which is not supported as it
doesn't work correctly yet. So whilst the bisect did yield valuable
information about the failures, the fix should not have any user impact
under default settings, with the exception of a slightly lower
throughput on xcs as the VM would always be reloaded.

v2: Also remember to set the legacy_active_context following the switch
on xcs (commit e8a9c58fcd9a ("drm/i915: Unify active context tracking
between legacy/execlists/guc"))

Fixes: f9326be5f1d3 ("drm/i915: Rearrange switch_context to load the aliasing ppgtt on first use")
Fixes: e8a9c58fcd9a ("drm/i915: Unify active context tracking between legacy/execlists/guc")
Reported-by: Martin Peres <martin.peres@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Martin Peres <martin.peres@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170812152724.6883-1-chris@chris-wilson.co.uk

drm/i915: Add SW_SYNC to our recommend testing Kconfig

Since we do use the SW_SYNC in igt for validating dma-fence and
sync_file, and wish to expand usage to cover driver independent portions
of syncobj interaction, ensure SW_SYNC is included in our testing
Kconfig.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jason Ekstrand <jason.ekstrand@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170810094036.4307-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

drm/i915: Introduce intel_hpd_pin function.

The idea is to have an unique place to decide the pin-port
per platform.

So let's create this function now without any functional
change. Just adding together code from hdmi and dp together.

v2: Add missing pin for port A.
v3: Fix typo on subject.
Avoid behaviour change so add WARN_ON and return
if port A on HDMI. (by DK).

Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170811182650.14327-2-rodrigo.vivi@intel.com

drm/i915: Simplify hpd pin to port

We will soon need to make that pin port association per
platform, so let's try to simplify it beforehand.

Also we are moving the backwards port to pin
here as well so let's use a standardized way.

One extra possibility here would be to add a
MISSING_CASE along with PORT_NONE, but I don't want
to change this behaviour for now.

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170811182650.14327-1-rodrigo.vivi@intel.com

drm/i915/cnl: Dump the right pll registers when dumping pipe config.

Different from SKL we don't need ctrl1 and cfgcr2, but
we need to dump cfgcr0 and cfgcr1 instead.

v2: rebase and commit message

Cc: Clint Taylor <clinton.a.taylor@intel.com>
Cc: Mika Kahola <mika.kahola@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Mika Kahola <mika.kahola@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170810224525.18278-1-rodrigo.vivi@intel.com

drm/i915/cnl: Add allowed DP rates for Cannonlake.

"Frequencies over 5.4 GHz only supported on certain
DDI ports and SKUs, and requires Vccio >= 0.95V."

More specifically, for current CNL SKUs available
(CNL-U and CNL-Y) we have:

DDI A - 5.4G eDP
DDI B - 8.1G DP
DDI C - 8.1G DP
DDI D - 5.4G DP

v2: Rebase on top of source_rates changes.
v3: Address the max 5.4 x 8.1 per DDI and also consider vccio.

Cc: Mika Kahola <mika.kahola@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170810224008.15571-1-rodrigo.vivi@intel.com

drm/i915: Return correct EDP voltage swing table for 0.85V

For 0.85V cnl_get_buf_trans_edp() returns the DP table, instead of EDP.
Use the correct table.

The error was pointed out by this clang warning:

drivers/gpu/drm/i915/intel_ddi.c:392:39: warning: variable
  'cnl_ddi_translations_edp_0_85V' is not needed and will not be emitted
  [-Wunneeded-internal-declaration]
    static const struct cnl_ddi_buf_trans cnl_ddi_translations_edp_0_85V[] = {

Fixes: cf54ca8bc567 ("drm/i915/cnl: Implement voltage swing sequence.")
Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170717195854.192139-1-mka@chromium.org

drm/i915: make structure intel_sprite_plane_funcs static

The structure intel_sprite_plane_funcs is local to the source
and does not need to be in global scope, so make it static.

Cleans up sparse warning:
symbol 'intel_sprite_plane_funcs' was not declared. Should it be static?

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20170811134938.4183-1-colin.king@canonical.com

drm/i915/fbc: only update no_fbc_reason when active

In our snb farm in CI we have plenty of underruns, but not enough
stolen memory to enable fbc. Which means every time there's an
underrun the no_fbc_reason swichtes to something that makes
kms_frontbuffer_tracking fail instead of skip, adding massive amounts
of additional noise to igt test runs.

Make sure we don't try to disable fbc when it's off already.

v2: Squash in additional WARN_ON suggestion from Chris.

Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170811072327.4335-1-daniel.vetter@ffwll.ch

drm/i915/cnl: Add slice and subslice information to debugfs.

A missing part to EU slice power gating is the
debugfs interface. This patch actually should have been
squashed to the initial EU slice power gating one.

v2: Initial patch was merged without this part.

Fixes: c7ae7e9ab207 ("drm/i915/cnl: Configure EU slice power gating.")
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170809200702.11236-1-rodrigo.vivi@intel.com

drm/i915/gen10: fix WM latency printing

Gen 10 is just like Gen 9, so let's consider that all the future
platforms are going to be like gen 9 instead of being like gen8-.

Cc: Mahesh Kumar <mahesh1.kumar@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170809205248.11917-4-rodrigo.vivi@intel.com

drm/i915/gen10: fix the gen 10 SAGV block time

A previous commit added CNL to intel_has_sagv(), but forgot to adjust
the SAGV block time to gen 10 platforms.

Cc: Mahesh Kumar <mahesh1.kumar@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170809205248.11917-3-rodrigo.vivi@intel.com

drm/i915/cnl: Enable SAGV for Cannonlake.

For now inherit from previous platforms.

v2: Rebase on top of CFL.

Cc: Mahesh Kumar <mahesh1.kumar@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170809205248.11917-2-rodrigo.vivi@intel.com

drm/i915/gen10+: use the SKL code for reading WM latencies

Gen 10 should use the exact same code as Gen 9, so change the check to
take this into consideration, and also assume that future platforms
will run this code.

Also add a MISSING_CASE(), just in case we do something wrong, instead
of silently failing.

Cc: Mahesh Kumar <mahesh1.kumar@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170809205248.11917-1-rodrigo.vivi@intel.com

drm/i915: Avoid null dereference if mst_port is unset.

I'm not sure if this is really the case and I don't believe
this is the real fix for the bug mentioned here, but since
I don't see a reliable path when mst_port is set and when
mode_valid is requested I believe it is worth to have this
protection here.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102022
Cc: Elizabeth <elizabethx.de.la.torre.mena@intel.com>
Cc: Stefan Assmann <sassmann@redhat.com>
Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20170810145043.24047-1-rodrigo.vivi@intel.com

drm/i915/perf: Drop redundant check for perf.initialised on reset

As we cannot have an exclusive stream set if the perf has not been
initialized, we only need to check for that exclusive stream.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170810175743.25401-3-chris@chris-wilson.co.uk
Reviewed-by: Matthew Auld <matthew.auld@intel.com>

drm/i915/perf: Drop lockdep assert for i915_oa_init_reg_state()

This is called from execlist context init which we need to be unlocked.
Commit f89823c21224 ("drm/i915/perf: Implement
I915_PERF_ADD/REMOVE_CONFIG interface") added a lockdep assert to this
path for unclear reasons, remove it again!

Fixes: f89823c21224 ("drm/i915/perf: Implement I915_PERF_ADD/REMOVE_CONFIG interface")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170810175743.25401-2-chris@chris-wilson.co.uk
Reviewed-by: Matthew Auld <matthew.auld@intel.com>

drm/i915/perf: Initialise dynamic sysfs group before creation

Another case where we need to call sysfs_attr_init() to setup the
internal lockdep class prior to use:

[    9.325229] BUG: key ffff880168bc7bb0 not in .data!
[    9.325240] DEBUG_LOCKS_WARN_ON(1)
[    9.325250] ------------[ cut here ]------------
[    9.325280] WARNING: CPU: 1 PID: 275 at kernel/locking/lockdep.c:3156 lockdep_init_map+0x1b2/0x1c0
[    9.325301] Modules linked in: intel_powerclamp(+) coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel i915(+) snd_hda_intel snd_hda_codec snd_hwdep r8169 mii snd_hda_core snd_pcm prime_numbers i2c_hid pinctrl_geminilake pinctrl_intel
[    9.325375] CPU: 1 PID: 275 Comm: modprobe Not tainted 4.13.0-rc4-CI-Trybot_1040+ #1
[    9.325395] Hardware name: Intel Corp. Geminilake/GLK RVP2 LP4SD (07), BIOS GELKRVPA.X64.0045.B51.1704281422 04/28/2017
[    9.325422] task: ffff8801721a4ec0 task.stack: ffffc900001dc000
[    9.325440] RIP: 0010:lockdep_init_map+0x1b2/0x1c0
[    9.325456] RSP: 0018:ffffc900001dfa10 EFLAGS: 00010282
[    9.325473] RAX: 0000000000000016 RBX: ffff880168d54b80 RCX: 0000000000000000
[    9.325488] RDX: 0000000080000001 RSI: 0000000000000001 RDI: ffffffff810f0800
[    9.325505] RBP: ffffc900001dfa30 R08: 0000000000000001 R09: 0000000000000000
[    9.325521] R10: 0000000000000000 R11: 0000000000000000 R12: ffff880168bc7bb0
[    9.325537] R13: 0000000000000000 R14: ffff880168bc7b98 R15: ffffffff81a263a0
[    9.325554] FS:  00007fb60c3fd700(0000) GS:ffff88017fc80000(0000) knlGS:0000000000000000
[    9.325574] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    9.325588] CR2: 0000006582777d80 CR3: 000000016d818000 CR4: 00000000003406e0
[    9.325604] Call Trace:
[    9.325618]  __kernfs_create_file+0x76/0xe0
[    9.325632]  sysfs_add_file_mode_ns+0x8a/0x1a0
[    9.325646]  internal_create_group+0xea/0x2c0
[    9.325660]  sysfs_create_group+0x13/0x20
[    9.325737]  i915_perf_register+0xde/0x220 [i915]
[    9.325800]  i915_driver_load+0xa77/0x16c0 [i915]
[    9.325863]  i915_pci_probe+0x37/0x90 [i915]
[    9.325880]  pci_device_probe+0xa8/0x130
[    9.325894]  driver_probe_device+0x29c/0x450
[    9.325908]  __driver_attach+0xe3/0xf0
[    9.325922]  ? driver_probe_device+0x450/0x450
[    9.325935]  bus_for_each_dev+0x62/0xa0
[    9.325948]  driver_attach+0x1e/0x20
[    9.325960]  bus_add_driver+0x173/0x270
[    9.325974]  driver_register+0x60/0xe0
[    9.325986]  __pci_register_driver+0x60/0x70
[    9.326044]  i915_init+0x6f/0x78 [i915]
[    9.326066]  ? 0xffffffffa024e000
[    9.326079]  do_one_initcall+0x43/0x170
[    9.326094]  ? rcu_read_lock_sched_held+0x7a/0x90
[    9.326109]  ? kmem_cache_alloc_trace+0x261/0x2d0
[    9.326124]  do_init_module+0x5f/0x206
[    9.326137]  load_module+0x2561/0x2da0
[    9.326150]  ? show_coresize+0x30/0x30
[    9.326165]  ? kernel_read_file+0x105/0x190
[    9.326180]  SyS_finit_module+0xc1/0x100
[    9.326192]  ? SyS_finit_module+0xc1/0x100
[    9.326210]  entry_SYSCALL_64_fastpath+0x1c/0xb1
[    9.326223] RIP: 0033:0x7fb60bf359f9
[    9.326234] RSP: 002b:00007fff92b47c48 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[    9.326255] RAX: ffffffffffffffda RBX: ffffffff814898a3 RCX: 00007fb60bf359f9
[    9.326271] RDX: 0000000000000000 RSI: 00000028a9ceef8b RDI: 0000000000000000
[    9.326287] RBP: ffffc900001dff88 R08: 0000000000000000 R09: 0000000000000000
[    9.326303] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000040000
[    9.326319] R13: 00000028aaef2a70 R14: 0000000000000000 R15: 00000028aaeee5d0
[    9.326339]  ? __this_cpu_preempt_check+0x13/0x20
[    9.326353] Code: f1 39 00 85 c0 0f 84 38 ff ff ff 83 3d 9f 44 ce 01 00 0f 85 2b ff ff ff 48 c7 c6 b2 a2 c7 81 48 c7 c7 53 40 c5 81 e8 3f 82 01 00 <0f> ff e9 11 ff ff ff 0f 1f 80 00 00 00 00 55 31 c9 31 d2 31 f6

Fixes: 701f8231a2fe ("drm/i915/perf: prune OA configs")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170810175743.25401-1-chris@chris-wilson.co.uk
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>

drm/i915: add register macro definition style guide

This is not to try to force a new style; this is my interpretation of
what the most common existing style is.

With hopes I don't need to answer so many questions about style going
forward.

Start a new style section in the i915 document to bolt the register
style guide into.

v2: vertical alignment, incorporate to kernel-doc, and more

Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/9de4a5b1bea4e76461c70a1dd66751581de0124f.1502368010.git.jani.nikula@intel.com

drm/i915: enum i915_power_well_id is not proper kernel-doc

Revert to a normal comment, as the enum isn't properly documented
anyway.

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/f2c44a0aa00ea7d9b71e7a3183a7507f98811146.1502368010.git.jani.nikula@intel.com

Documentation/i915: remove sphinx conversion artefact

Remove old warning about docproc directive that's not supported in the
Sphinx toolchain.

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/7fc8a110b78a9dc9a585dce643b68b4200b7e793.1502368010.git.jani.nikula@intel.com

drm/i915: Add format modifiers for Intel

This was based on a patch originally by Kristian. It has been modified
pretty heavily to use the new callbacks from the previous patch.

v2:
  - Add LINEAR and Yf modifiers to list (Ville)
  - Combine i8xx and i965 into one list of formats (Ville)
  - Allow 1010102 formats for Y/Yf tiled (Ville)

v3:
  - Handle cursor formats (Ville)
  - Put handling for LINEAR in the mod_support functions (Ville)

v4:
  - List each modifier explicitly in supported modifiers (Ville)
  - Handle the CURSOR plane (Ville)

v5:
  - Split out cursor and sprite handling (Ville)

v6:
  - Actually use the sprite funcs (Emil)
  - Use unreachable (Emil)

v7:
  - Only allow Intel modifiers and LINEAR (Ben)

v8
  - Fix spite assert introduced in v6 (Daniel)

v9
  - Change vendor check logic to avoid magic 56 (Emil)
  - Reorder skl_mod_support (Ville)
  - make intel_plane_funcs static, could be done as of v5 (Ville)
  - rename local variable intel_format_modifiers to modifiers (Ville)
    - actually use sprite modifiers
  - split out modifier/formats by platform (Ville)

v10:
  - Undo vendor check from v9

v11:
  - Squash CCS advertisement into this patch (daniels)
  - Don't advertise CCS on higher sprite planes (daniels)

v12:
  - Don't advertise Y-tiled or CCS on any sprite planes, since we don't
    allocate enough DDB space for it to work. (daniels)

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> (v8)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Stone <daniels@collabora.com>

drm/i915: Add render decompression support

SKL+ display engine can scan out certain kinds of compressed surfaces
produced by the render engine. This involved telling the display engine
the location of the color control surfae (CCS) which describes
which parts of the main surface are compressed and which are not. The
location of CCS is provided by userspace as just another plane with its
own offset.

Add the required stuff to validate the user provided AUX plane metadata
and convert the user provided linear offset into something the hardware
can consume.

Due to hardware limitations we require that the main surface and
the AUX surface (CCS) be part of the same bo. The hardware also
makes life hard by not allowing you to provide separate x/y offsets
for the main and AUX surfaces (excpet with NV12), so finding suitable
offsets for both requires a bit of work. Assuming we still want keep
playing tricks with the offsets. I've just gone with a dumb "search
backward for suitable offsets" approach, which is far from optimal,
but it works.

Also not all planes will be capable of scanning out compressed surfaces,
and eg. 90/270 degree rotation is not supported in combination with
decompression either.

This patch may contain work from at least the following people:
* Vandana Kannan <vandana.kannan@intel.com>
* Daniel Vetter <daniel@ffwll.ch>
* Ben Widawsky <ben@bwidawsk.net>

v2: Deal with display workarounds 0390, 0531, 1125 (Paulo)
v3: Pretend CCS tiles are regular 128 byte wide Y tiles (Jason)
    Put the AUX register defines to the correct place
    Fix up the slightly bogus rotation check
v4: Use I915_WRITE_FW() due to plane update locking changes
    s/return -EINVAL/goto err/ in intel_framebuffer_init()
    Eliminate a bunch hardcoded numbers in CCS code

v5: (By Ben)
conflict resolution +
-               res_blocks += fixed_16_16_to_u32_round_up(y_tile_minimum);
+               res_blocks += fixed16_to_u32_round_up(y_tile_minimum);

v6: (daniels) Fix botched commit message.

Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Ben Widawsky <ben@bwidawsk.net>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Ville Syrjä <ville.syrjala@linux.intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net> (v1)
Reviewed-by: Daniel Stone <daniels@collabora.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Stone <daniels@collabora.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170801165817.7063-1-ben@bwidawsk.net

drm/i915: Implement .get_format_info() hook for CCS

SKL+ display engine can scan out certain kinds of compressed surfaces
produced by the render engine. This involved telling the display engine
the location of the color control surfae (CCS) which describes which
parts of the main surface are compressed and which are not. The location
of CCS is provided by userspace as just another plane with its own offset.

By providing our own format information for the CCS formats, we should
be able to make framebuffer_check() do the right thing for the CCS
surface as well.

Note that we'll return the same format info for both Y and Yf tiled
format as that's what happens with the non-CCS Y vs. Yf as well. If
desired, we could potentially return a unique pointer for each
pixel_format+tiling+ccs combination, in which case we immediately be
able to tell if any of that stuff changed by just comparing the
pointers. But that does sound a bit wasteful space wise.

v2: Drop the 'dev' argument from the hook
v3: Include the description of the CCS surface layout
v4: Pretend CCS tiles are regular 128 byte wide Y tiles (Jason)
v5: Re-drop 'dev', fix commit message, add missing drm_fourcc.h
description of CCS layout. (daniels)

Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Ben Widawsky <ben@bwidawsk.net>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net> (v3)
Reviewed-by: Daniel Stone <daniels@collabora.com>
Signed-off-by: Ville Syrjä <ville.syrjala@linux.intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Stone <daniels@collabora.com>

Merge airlied/drm-next into drm-intel-next-queued

Ben Widawsky/Daniel Stone need the extended modifier support from
drm-misc to be able to merge CCS support for i915.ko

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Supply the engine-id for our mock_engine()

In the original selftest, we didn't care what the engine->id was, just
that it could uniquely identify it. Later though, we started tracking
the mock engines in the fixed size arrays around the drm_i915_private and
so we now require their indices to be correct. This becomes an issue when
using the standalone harness which runs all available tests at module load,
and so we quickly assign an out-of-bounds index to an engine as we
reallocate the mock GEM device between tests. It doesn't show up in
igt/drv_selftest as that runs each subtest individually.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102045
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20170809163930.26470-1-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>

drm/i915/gvt: Add shadow context descriptor updating

The current context logic only updates the descriptor of context when
it's being pinned to graphics memory space. But this cannot satisfy the
requirement of shadow context. The addressing mode of the pinned shadow
context descriptor may be changed according to the guest addressing mode.
And this won't be updated, as the already pinned shadow context has no
chance to update its descriptor. And this will lead to GPU hang issue,
as shadow context is used with wrong descriptor. This patch fixes this
issue by letting the pinned shadow context descriptor update its
addressing mode on demand.

This patch fixes GPU HANG issue which happends after changing the
grub parameter i915.enable_ppgtt form 0x01 to 0x03 or vice versa and
then rebooting the guest.

Signed-off-by: Tina Zhang <tina.zhang@intel.com>
Signed-off-by: Kechen Lu <kechen.lu@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: expose vGPU context hw id

This exposes vGPU context hw id in mdev sysfs which is used to
do vGPU based profiling. Retrieved vGPU context hw id can be set
through i915 perf ioctl to set profiling for target vGPU.

Cc: Jiao Pengyuan <pengyuan.jiao@intel.com>
Cc: Niu Bing <bing.niu@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Refine the intel_vgpu_reset_gtt reset function

When doing the VGPU reset, we don't need to do the gtt/ppgtt reset.
This will make the GVT to do the ppgtt shadow every time for
a workload and caused really bad performance after a VGPU reset.
This patch will make sure ppgtt clean only happen at device module
level reset to fix this.

Signed-off-by: Chuanxiao Dong <chuanxiao.dong@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Add carefully checking in GTT walker paths

When debugging the gtt code, found the intel_vgpu_gma_to_gpa() can
translate any given GMA though the GMA is not valid. This because
the GTT ops suppress the possible errors, which may result in an
invalid PT entry is retrieved by upper caller.

This patch changed the prototype of pte ops to propagate status to
callers. Then we make sure the GTT walker stop as early as when
a error is detected to prevent undefined behavior.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Remove duplicated MMIO entries

Remove duplicated MMIO entries in the tracked MMIO list. -EEXIST
is returned if duplicated MMIO entries are found when new MMIO
entry is added.

v2:
- Use WARN(1, ...) for more verbose message. (Zhenyu)

Signed-off-by: Jian Jun Chen <jian.jun.chen@intel.com>
Cc: Zhi Wang <zhi.a.wang@intel.com>
Cc: Changbin Du <changbin.du@intel.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Reviewed-by: Yulei Zhang <yulei.zhang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: take runtime pm when do early scan and shadow

Need to take runtime pm when do early scan/shadow of workload
for request operations.

Fixes: 7fa56bd159bc ("drm/i915/gvt: Audit and shadow workload during ELSP writing")
Cc: Ping Gao <ping.a.gao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Replace duplicated code with exist function

Use the exist function intel_gvt_ggtt_validate_range to replace
these duplicated code that do the same thing.

Signed-off-by: Ping Gao <ping.a.gao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: To check whether workload scan and shadow has mutex hold

The function workload scan and shadow have to hold the drm.struct_mutex
before called. To avoid misusing of this function, add a lockdep assert
in it.

Signed-off-by: Ping Gao <ping.a.gao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Audit and shadow workload during ELSP writing

Let the workload audit and shadow ahead of vGPU scheduling, that
will eliminate GPU idle time and improve performance for multi-VM.

The performance of Heaven running simultaneously in 3VMs has
improved 20% after this patch.

v2:Remove condition current->vgpu==vgpu when shadow during ELSP
writing.

Signed-off-by: Ping Gao <ping.a.gao@intel.com>
Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Factor out scan and shadow from workload dispatch

To perform the workload scan and shadow in ELSP writing stage for
performance consideration, the workload scan and shadow stuffs
should be factored out from dispatch_workload().

v2:Put context pin before i915_add_request;
Refine the comments;
Rename some APIs;

v3:workload->status should set only when error happens.
v4:i915_add_request is must to have after i915_gem_request_alloc.

Signed-off-by: Ping Gao <ping.a.gao@intel.com>
Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Optimize ring siwtch 2x faster again by light weight mmio access wrapper

The I915_READ/WRITE is not only a mmio read/write, it also contains
debug checking and Forcewake domain lookup. This is too heavy for
GVT ring switch case which access batch of mmio registers on ring
switch. We can handle Forcewake manually and use the raw
i915_read/write instead. The benefit from this is 2x faster mmio
switch performance.
Before After
cycles ~550000 ~250000

v2: Use existing I915_READ_FW/I915_WRITE_FW macro. (zhenyu)

Signed-off-by: Changbin Du <changbin.du@intel.com>
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Optimize ring siwtch 2x faster by removing unnecessary POSTING_READ

There are lots of POSTING_READ alongside each mmio write Op. While
actually this is not necessary. It just bring too much latency since
PCIe read Op is very slow which is of non-posted transaction.

For PCIe device, the mem transaction for strong ordering rules are:
  o PCIe mmio write sequence is FIFO. Posted request cannot
    pass previous posted request.
  o PCIe mmio read will not go ahead of previous write.

Intel graphics doesn't support RO, so we can apply above rules. In
our case, we only need one POSTING_READ at last. This can remove
half of mmio read Op and then the average ring switch performance
is nearly doubled.
         Before       After
cycles  ~970000      ~550000

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Use gvt_err to print the resource not enough error

It is better to use gvt_err when the gvt resource is not enough so
the user can be notified from the kernel dmesg. And this kind of
error message is gvt related.

Suggested-by: Bing Niu <bing.niu@intel.com>
Signed-off-by: Chuanxiao Dong <chuanxiao.dong@intel.com>
Cc: Bing Niu <bing.niu@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>