git.openwrt.org Git - openwrt/staging/blogic.git/log

drm/i915: add render state initialization

HW guys say that it is not a cool idea to let device
go into rc6 without proper 3d pipeline state.

For each new uninitialized context, generate a
valid null render state to be run on context
creation.

This patch introduces a skeleton with empty states.

v2: - No need to vmap (Chris Wilson)
    - use .c files for state (Daniel Vetter)
    - no need to flush as i915_add_request does it
    - remove parameter for batch alloc size
    - don't wait for the init (Ben Widawsky)

v3: - move to cpu/gpu (Chris Wilson)

Tested-by: Kristen Carlson Accardi <kristen@linux.intel.com> (v1)
Tested-by: Oscar Mateo <oscar.mateo@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Only do gtt cleanup in vma_unbind for the global vma

Otherwise we end up tearing down fences when e.g. the client quits
way too early. Might or might not fix a fence pin_count BUG Ville has
reported.

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Don't drop pinned fences

Userspace can currently provoke this when e.g. trying to use a pinned
scanout as a cursor or overlay target. Later on that might lead to
some fun fence pin count mayhem.

Spurred by Ville's report that something goes wrong here and
originally I've thought that this might slip through the pwrite gtt
fastpath. But that one checks of obj tiling, so should be ok.

But one thing that _does_ blow up is the vma unbinding with more than
one address space. The next patch will fix this.

v2: Use a WARN_ON - Chris pointed out that we already catch all cases
so userspace can't provoke this like I've originally feared.

While reviewing relevant code I've noticed a pile of DRM_ERROR in the
overlay&cursor code which are all triggerable by userspace. Tune them
down while at it.

v3: Split out the DRM_ERROR->DRM_DEBUG_KMS change into a separate patch,
as requested by Chris.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Tested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: use dev_priv directly in i915_driver_unload

Noticed while playing with coccinelle.

Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Use the connector name in fbdev debug messages

During initial probing of the modes to assign to the fbdev console, we
use the CRTC and connector ids. These are much harder for us to
understand than if we used their actual names (or pipe in the CRTC
case). Similarly, we want to manually print the mode size rather than
rely on mode->name being set.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Use for_each_crtc() when iterating through the CRTCs

Patch done using the following semantic patch (thanks Daniel for the
help!)

  @@
  iterator name list_for_each_entry;
  iterator name for_each_crtc;
  struct drm_crtc * crtc;
  struct drm_device * dev;
  @@
  -list_for_each_entry(crtc,&dev->mode_config.crtc_list, head) {
  +for_each_crtc(dev,crtc) {
   ...
  }

Followed by a couple of fixups by hand (that spatch doesn't match the
cases where list_for_each_entry() is not followed by a set of '{', '}',
but I couldn't figure out a way to leave the '{' out of the iterator
match).

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Introduce a for_each_crtc() macro

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Use for_each_intel_crtc() when iterating through intel_crtcs

Generated using the semantic patch:

  @@
  iterator name list_for_each_entry;
  iterator name for_each_intel_crtc;
  struct intel_crtc * crtc;
  struct drm_device * dev;
  @@
  -list_for_each_entry(crtc,&dev->mode_config.crtc_list,...) {
  +for_each_intel_crtc(dev,crtc) {
...
  }

Followed by a couple of fixups by hand (that spatch doesn't match the
cases where list_for_each_entry() is not followed by a set of '{', '}',
but I couldn't figure out a way to leave the '{' out of the iterator
match).

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Introduce a for_each_intel_crtc() macro

Fed up with having that long list_for_each_entry() invocation?

Use for_each_intel_crtc()!

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Use ilk_wm_max_level() in latency debugfs files

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
[danvet: Squash in patch that exported ilk_wm_max_level.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Don't cast void* pointers

That's not necessary and makes the code not as neat as it could be.

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Work-around garbage DR4 from UXA

Somehow UXA submits a completely bogus DR4 value since essentially
forever. It was originally introduced in

commit bade7d7d2505a10a8a7d24b084aff9742e2d6d64
Author: Eric Anholt <eric@anholt.net>
Date:   Fri Jun 6 14:03:25 2008 -0700

    Use the DRM for submitting batchbuffers when available.

and dutifully copied around ever since. Since we want to keep the
general dirt catching around just special case the UXA value.

This regression was introduced in

commit 9cb346648d9c529eccc5c7f30093e82d37004e37
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Thu Apr 24 08:09:11 2014 +0200

    drm/i915: Catch dirt in unused execbuffer fields

Comment from Chris' review:

"To be fair, it is a sensible value if one supposes a Region style API to
cliprects. Under that API, DR[14] define the extents of the clip region,
and ((0,0), (0,0)) [DR1==DR4==0] would mean all clipped, do not draw
anything."

v2: Pimp commit message a bit and remove the double space.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78494
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jörg Otte <jrg.otte@gmail.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: WARN_ON fence pin leaks

The fence pin count should always be <= the bo pin count. If that's
not the case then we have a funny problem and are leaking references
somewhere.

Which means we can catch fence pin leaks by checking for the same
upper limit as we do for the bo pin count. Inspired by a discussion
with Ville about a fence leak igt testcase.

v2: Also check for fence->pin_count <= ggtt_vma->pin_count, since that
might catch a leak even quicker. Also de-inline them, they're getting
too big.

v3: Don't separately check for MAX_PIN_COUNT since the > vma->pin_count
check will catch that already (Chris).

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Kill vblank waits after pipe enable on gmch platforms

The pipe might not start to actually run until the port has been enabled
(depends on the platform and port type). So don't try to wait for vblank
after we enabled the pipe but haven't yet enabled the port.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
References: https://bugs.freedesktop.org/show_bug.cgi?id=77297
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Disable/enable planes as the first/last thing during modeset on gmch platforms

We already moved the plane disable/enable to happen as the first/last
thing on every other platforms. Follow suit with gmch platforms.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Frob drm_vblank_on conflict, as usual.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Ringbuffer signal func for the second BSD ring

This is missing in:

commit 78325f2d270897c9ee0887125b7abb963eb8efea
Author: Ben Widawsky <benjamin.widawsky@intel.com>
Date: Tue Apr 29 14:52:29 2014 -0700

drm/i915: Virtualize the ringbuffer signal func

Looks to me like a rebase side-effect...

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

x86/gpu: Sprinkle const, __init and __initconst to stolen memory quirks

gen8_stolen_size() is missing __init, so add it.

Also all the intel_stolen_funcs structures can be marked
__initconst.

intel_stolen_ids[] can also be made const if we replace the
__initdata with __initconst.

Cc: Ingo Molnar <mingo@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

x86/gpu: Implement stolen memory size early quirk for CHV

CHV uses the same bits as SNB/VLV to code the Graphics Mode Select field
(GFX stolen memory size) with the addition of finer granularity modes:
4MB increments from 0x11 (8MB) to 0x1d.

Values strictly above 0x1d are either reserved or not supported.

v2: 4MB increments, not 8MB. 32MB has been omitted from the list of new
    values (Ville Syrjälä)

v3: Also correctly interpret GGMS (GTT Graphics Memory Size) (Ville
    Syrjälä)

v4: Don't assign a value that needs 20bits or more to a u16 (Rafael
    Barbalho)

[vsyrjala: v5: Split from i915 changes and add chv_stolen_funcs]

Cc: Ingo Molnar <mingo@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Rafael Barbalho <rafael.barbalho@intel.com>
Tested-by: Rafael Barbalho <rafael.barbalho@intel.com>
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/chv: Implement stolen memory size detection

CHV uses the same bits as SNB/VLV to code the Graphics Mode Select field
(GFX stolen memory size) with the addition of finer granularity modes:
4MB increments from 0x11 (8MB) to 0x1d.

Values strictly above 0x1d are either reserved or not supported.

v2: 4MB increments, not 8MB. 32MB has been omitted from the list of new
    values (Ville Syrjälä)

v3: Also correctly interpret GGMS (GTT Graphics Memory Size) (Ville
    Syrjälä)

v4: Don't assign a value that needs 20bits or more to a u16 (Rafael
    Barbalho)

[vsyrjala: v5: Split the early quirks to another patch]

Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Rafael Barbalho <rafael.barbalho@intel.com>
Tested-by: Rafael Barbalho <rafael.barbalho@intel.com>
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/chv: CHV doesn't have CRT output

No CRT output on CHV, so don't call intel_crt_init().

v2: Don't disable CRT on HAS.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> (v1)
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/chv: Add DPLL state readout support

Add chv_crtc_clock_get() to read out the DPLL settings.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
[danvet: Fix compile due to bikeshedded headers in an earlier patch.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/chv: Pipe select change for DP and HDMI

With additional of pipe C, current 1 bit registers for pipe select
for HDMI and DP are no longer able to gather for 3 pipes. As a result,
new bits location in the same registers are added.

For HDMI, VLV uses bit 30, CHV uses bit 24-25.

For DP, VLV uses bit 30, CHV uses bit 16-17.

Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Chon Ming Lee <chon.ming.lee@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/chv: Add phy supports for Cherryview

Added programming phy layer for CHV based on "Application note for 1273
CHV Display phy".

v2: Rebase the code and do some cleanup.
v3: Rework based on Ville review.
    -Fix the macro where the ch info need to swap, and add parens to ?
operator.
-Fix wrong bit define for DPIO_PCS_SWING_CALC_0 and
DPIO_PCS_SWING_CALC_1 and rename for meaningful.
    -Add some comments for CHV specific DPIO registers.
    -Change the dp margin registery value to decimal to align with the
doc.
-Fix the not clearing some value in vlv_dpio_read before write again.
    -Create new hdmi/dp encoder function for chv instead of share with
valleyview.
v4: Rebase the code after rename the DPIO registers define and upstream
change.
    Based on Ville review.
    -For unique transition scale selection, after Ville point out, look
like the doc might wrong for the bit 26.  Use bit 27 for ch0 and
ch1.
-Break up some dpio write value into two/three steps for readability.
-Remove unrelated change.
    -Add some shift define for some registers instead just give the hex
value.
    -Fix a bug where write to wrong VLV_TX_DW3.
v5: Based on Ville review.
- Move tx lane latency optimal setting from chv_dp_pre_pll_enable to
  chv_pre_enable_dp, and chv_hdmi_pre_pll_enable to
  chv_hdmi_pre_enable respectively.
- Fix typo in one margin_reg_value for DP_TRAIN_VOLTAGE_SWING_400.
- Clear DPIO_TX_UNIQ_TRANS_SCALE_EN for DP and HDMI.
- Mask the old deemph and swing bits for hdmi.
v6: Remove stub for pre_pll_enable for dp and hdmi.

Signed-off-by: Chon Ming Lee <chon.ming.lee@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
[vsyrjala: Don't touch panel power sequencing on DP]
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/chv: Add update and enable pll for Cherryview

Added programming PLL for CHV based on "Application note for 1273 CHV
Display phy".

v2:  -Break the common lane reset into another patch.
     -Break the clock calculation into another patch.

    -The changes are based on Ville review.
    -Rework based on DPIO register define naming convention change.
    -Break the dpio write into few lines to improve readability.
    -Correct the udelay during chv_enable_pll.
    -clean up some magic numbers with some new define.
    -program the afc recal bit which was missed.

v3: Based on Ville review
-  minor correction of the bit defination
    - add deassert/propagate data lane reset

v4: Corrected the udelay between dclkp enable and pll enable.
Minor comment and better way to clear the TX lane reset.

v5: Squash in fixup from Rafael Barbalho.

[vsyrjala: v6: Polish the defines (Imre)]

Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Chon Ming Lee <chon.ming.lee@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/chv: find the best divisor for the target clock v4

Based on the chv clock limit, find the best divisor.

The divisor data has been verified with this spreadsheet.
P1273_DPLL_Programming Spreadsheet.

v2: Rebase the code and change the chv_find_best_dpll based on new
standard way to use intel_PLL_is_valid. Besides, clean up some extra
variables.

v3: Ville suggest better fixed point for m2 calculation.

v4: -Add comment for the limit is compute using fast clock. (Ville)
-Don't pass the request clock to chv_clock, as the same function will
be use clock readout, which doens't have request clock. (Ville)
-Add and use DIV_ROUND_CLOSEST_ULL to consistent with other clock
calculation. (Ville)
-Fix the dp m2 after m2 has stored fixed point. (Ville)

Signed-off-by: Chon Ming Lee <chon.ming.lee@intel.com>
[vsyrjala: Avoid div-by-zero in chv_clock()]
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/chv: Trigger phy common lane reset

During cold boot, the display controller needs to deassert the common
lane reset. Only do it once during intel_init_dpio for both PHYx2 and
PHYx1.

Besides, assert the common lane reset when disable pll. This still
to be determined whether need to do it by driver.

Signed-off-by: Chon Ming Lee <chon.ming.lee@intel.com>
[vsyrjala: Don't disable DPIO PLL when using DSI]
[vsyrjala: Don't call vlv_disable_pll() by accident on CHV]
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
[danvet: Move part of a moved comment back as suggested by Imre since
it's valid for both byt and chv.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/chv: Add vlv_pipe_to_channel

Cherryview has 3 pipes. Some of the pll dpio offset calculation is
based on pipe number. Need to use vlv_pipe_to_channel to calculate the
correct phy channel to use for the pipe.

Signed-off-by: Chon Ming Lee <chon.ming.lee@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/chv: Update Cherryview DPLL changes to support Port D. v2

The additional DPLL registers added to support Port D.  Besides, add
some new PHY control and status registers based on B-spec.

v2: Based on Ville review
- Corrected DPIO_PHY_STATUS offset and name.
    - Rebase based on upstream change after introduce enum dpio_phy and
      enum dpio_channel.

v3: Rebased on top of Antti's 3-pipe prep patch. Note that the new offsets for
the DPLL registers aren't in place yet, so this introduces a slight regression.
But since 3 pipe support isn't fully enabled yet anyaway in -internal this
shouldn't matter too much.

Signed-off-by: Chon Ming Lee <chon.ming.lee@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/chv: Add DPIO offset for Cherryview. v3

CHV has 2 display phys. First phy (IOSF offset 0x1A) has two channels,
and second phy (IOSF offset 0x12) has single channel. The first phy is
used for port B and port C, while second phy is only for port D.

v2: Move the pipe to determine which phy to select for
vlv_dpio_read/vlv_dpio_write to another patch. (Daniel)
v3: Rebase the code based on rework on how to calculate DPIO offset.

Signed-off-by: Chon Ming Lee <chon.ming.lee@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/chv: Add DDL register defines for Cherryview

Fill in the sprite bits for DDL1/DDL2 registers, and add DDL3.

Still need to write the code to use these...

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

srm/i915/chv: Add Cherryview PCI IDs

v2: Update to also fill in the new num_pipes field.

v3: Rebase on top of the pciid extraction.

v4: Switch from info->has*ring to info->ring mask. Also add VEBOX support whiel
at it.

v5: s/CHV_PCI_IDS/CHV_IDS/, and drop the trailing '\'

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/chv: Initial clock gating support for Cherryview

CHV clock gating isn't identical to VLV, so add a new function
for it. This is only a start, and further changes are needed as
the details become available.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/chv: Add Cherryview interrupt registers into debugfs

Make i915_gem_interrupt debugfs file functional on CHV.

FIXME: Extract helpers for gt/display blocks to shrink the function a
bit and avoid duplication between bdw/chv (and other similar cases for
upstream).

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Drop unecessary casts in i915_irq.c

Inspired by a review bikeshed from Jani.

Cc: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/chv: Preliminary interrupt support for Cherryview

CHV has the Gen8 master interrupt register, as well as Gen8
GT/PCU interrupt registers.

The display block is based on VLV, with the main difference
of adding pipe C.

v2: Rewrite the order of operations to make more sense
Don't bail out if MASTER_CTL register doesn't show an interrupt,
as display interrupts aren't reported there.

v3: Rebase on top of Egbert Eich's hpd irq handling rework by using
the relevant port hotplug logic like for vlv.

v4: Rebase on top of Ben's gt irq #define refactoring.

v5: Squash in gen8_gt_irq_handler refactoring from Zhao Yakui
<yakui.zhao@intel.com>

v6: Adapt to upstream changes, dev_priv->irq_received is gone.

v7: Enable 3 the commented-out 3 pipe support.

v8: Rebase on top of Paulo's irq setup rework, use the renamed macros from
upstream.

v9: Grab irq_lock around i915_enable_pipestat()

FIXME: There's probably some potential for more shared code between bdw and chv.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> (v2)
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
[danvet: Drop the unnecessary cast Jani spotted.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Use hash tables for the command parser

For clients that submit large batch buffers the command parser has
a substantial impact on performance. On my HSW ULT system performance
drops as much as ~20% on some tests. Most of the time is spent in the
command lookup code. Converting that from the current naive search to
a hash table lookup reduces the performance drop to ~10%.

The choice of value for I915_CMD_HASH_ORDER allows all commands
currently used in the parser tables to hash to their own bucket (except
for one collision on the render ring). The tradeoff is that it wastes
memory. Because the opcodes for the commands in the tables are not
particularly well distributed, reducing the order still leaves many
buckets empty. The increased collisions don't seem to have a huge
impact on the performance gain, but for now anyhow, the parser trades
memory for performance.

NB: Ville noticed that the error paths through the ring init code
will leak memory. I've not addressed that here. We can do a follow
up pass to handle all of the leaks.

v2: improved comment describing selection of hash key mask (Damien)
replace a BUG_ON() with an error return (Tvrtko, Ville)
commit message improvements

Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Convert gmch platforms over to ilk_crtc_{enable, disable}_planes()

Use the same code for enabling/disabling planes on all platforms. Rename
the functions to reflect that they're no longer specific to any
platform.

For now we leave the plane enable/disable to ccur at the same old
position in the modeset sequence.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Frob drm_vblank_on conflict.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Flush request queue when waiting for ring space

During the review of

commit 1f70999f9052f5a1b0ce1a55aff3808f2ec9fe42
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Mon Jan 27 22:43:07 2014 +0000

drm/i915: Prevent recursion by retiring requests when the ring is full

Ville raised the point that our interaction with request->tail was
likely to foul up other uses elsewhere (such as hang check comparing
ACTHD against requests).

However, we also need to restore the implicit retire requests that certain
test cases depend upon (e.g. igt/gem_exec_lut_handle), this raises the
spectre that the ppgtt will randomly call i915_gpu_idle() and recurse
back into intel_ring_begin().

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78023
Reviewed-by: Brad Volkin <bradley.d.volkin@intel.com>
[danvet: Remove now unused 'tail' variable as spotted by Brad.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Improve fallback ring waiting

A few improvements to the fallback method for waiting upon ring space:

1. Fix the start/end wait tracepoints to always be paired.
2. Increase responsiveness of checking
3. Mark the process as waiting upon io
4. Check for signal interruptions

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Brad Volkin <bradley.d.volkin@intel.com>
[danvet: Drop the s/msleep/io_schedule_timeout/ change again since the
latter isn't exported.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Make aliasing a 2nd class VM

There is a good debate to be had about how best to fit the aliasing
PPGTT into the code. However, as it stands right now, getting aliasing
PPGTT bindings is a hack, and done through implicit arguments. To make
this absolutely clear, WARN and return an error if a driver writer tries
to do something they shouldn't.

I have no issue with an eventual revert of this patch. It makes sense
for what we have today.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Use topdown allocation for PPGTT PDEs on gen6/7

It was always the intention to do the topdown allocation for context
objects (Chris' idea originally). Unfortunately, I never managed to land
the patch, but someone else did, so now we can use it.

As a reminder, hardware contexts never need to be in the precious GTT
aperture space - which is what is what happens with the normal bottom up
allocation we do today. Doing a top down allocation increases the odds
that the HW contexts can get out of the way, especially with per FD
contexts as is done in full PPGTT

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: vlv: enable runtime PM

Signed-off-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: vlv: add runtime PM support

Add runtime PM support for VLV, but leave it disabled. The next patch
enables it.

The suspend/resume sequence used is based on [1] and [2]. In practice we
depend on the GT RC6 mechanism to save the HW context depending on the
render and media power wells. By the time we run the runtime suspend
callback the display side is also off and the HW context for that is
managed by the display power domain framework.

Besides the above there are Gunit registers that depend on a system-wide
power well. This power well goes off once the device enters any of the
S0i[R123] states. To handle this scenario, save/restore these Gunit
registers. Note that this is not the complete register set dictated by
[2], to remove some overhead, registers that are known not to be used are
ignored. Also some registers are fully setup by initialization functions
called during resume, these are not saved either. The list of registers
can be further reduced, see the TODO note in the code.

[1] VLV_gfx_clocking_PM_reset_y12w21d3 / "Driver D3 entry/exit"
[2] VLV2_S0IXRegs

v2:
- unchanged
v3:
- fix s/GEN6_PMIIR/GEN6_PMIMR/ typo when saving/restoring registers
(Ville)
v4:
- rebased on the previous patch fixing GEN register prefixes

Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
[ rebased (according to v4) ]
Signed-off-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: propagate the error code from runtime PM callbacks

Atm, none of the RPM callbacks can fail, but the next patch adding
RPM support for VLV changes this, so prepare for it.

In case one of these callbacks return error RPM will get permanently
disabled until the error is explicitly cleared. In the future we could
add support for re-enabling it, for example after resetting the HW, but
for now - hopefully - we can live with the simpler solution.

v2:
- propagate the error from the resume callbacks too (Paulo)
v3:
- fix rebase fail typo around IS_GEN6() check in intel_runtime_suspend()

Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: add various missing GTI/Gunit register definitions

Needed by the VLV S0ix context save/restore helpers.

v2:
- unchanged
v3:
- use proper GEN register prefixes (Ville)

Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/chv: Add DPINVGTT registers defines for Cherryview

Due to Pipe C DPINVGTT has more bits on CHV.

v2: Fix comment to say VLV/CHV (Rafael)

Reviewed-by: Rafael Barbalho <rafael.barbalho@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/chv: Add display interrupt registers bits for Cherryview

v2: Rebase on top of Ben's GT interrupt shuffling.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Rafael Barbalho <rafael.barbalho@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/chv: Add DPFLIPSTAT register bits for Cherryview

CHV has pipe C and PSR which cause changes to DPFLIPSTAT.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Rafael Barbalho <rafael.barbalho@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/chv: Add PIPESTAT register bits for Cherryview

FIXME: We probably want to sprinkle _CHV suffixes over these.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Rafael Barbalho <rafael.barbalho@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/chv: Enable aliasing PPGTT for CHV

Enable aliasing PPGTT for CHV, but keep full PPGTT still disabled until
it gets enabled for BDW.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Rafael Barbalho <rafael.barbalho@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/chv: Flush caches when programming page tables

Page table updates were getting stuck in the CPU cache on chv causing
spurious page faults and strange behaviour.

Signed-off-by: Rafael Barbalho <rafael.barbalho@intel.com>
[vsyrjala: Add !HAS_LLC checks]
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/chv: PPAT setup for Cherryview

Ignore the cache bits in PPAT and just set the snoop bit where
appropriate. BDW WB is mapped to snooped access, while all other
modes are mapped to non-snooped access.

The hardware supposedly ignores everything except the snoop bit
in the PPAT entries.

Additionally the hardware actually enforces snooping for all
page table accesses, and thus the snoop bit is ignored for PDEs.

v2: Rebased on top of the bdw resume fix to reload the ppat entries.

v3: Rebase on top of the i915_gem_gtt.h header extraction.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> (v1)
Acked-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Rafael Barbalho <rafael.barbalho@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Remove useless checks from primary enable/disable

We won't be calling intel_enable_primary_plane() or
intel_disable_primary_plane() with the primary plane in the
wrong state. So remove the useless DISPLAY_PLANE_ENABLE checks.

v2: Convert the checks to WARNs instead (Daniel,Paulo)

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Merge LP1+ watermarks in safer way

On ILK when we disable a particular watermark level, we must
maintain the actual watermark values for that level for some time
(until the next vblank possibly). Otherwise we risk underruns.

In order to achieve that result we must merge the LP1+ watermarks a
bit differently since we must also merge levels that are to be
disabled. We must also make sure we don't overflow the fields in the
watermark registers in case the calculated watermarks come out too
big to fit.

As early as possbile we mark all computed watermark levels as
disabled if they would exceed the register maximums. We make sure
to leave the actual watermarks for such levels zeroed out. Then during
merging, we take the maxium values for every level, regardless if
they're disabled or not. That may seem a bit pointless since at the
moment all the watermark levels we merge should have their values
zeroed if the level is already disabled. However soon we will be
dealing with intermediate watermarks that, in addition to the new
watermark values, also contain the previous watermark values, and so
levels that are disabled may no longer be zeroed out.

v2: Split the patch in two (Paulo)
Use if() instead of & when merging ->enable (Paulo)

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
[danvet: Fix commit message as noted by Paulo.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Make sure computed watermarks never overflow the registers

When we calculate the watermarks for a pipe make sure we leave any
level fully zeroed out if it would exceed any of the maximum values
that fit in the registers.

This will be important later when we start to use also disabled
watermark levels during LP1+ merging.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Add pipe update trace points

Add trace points for observing the atomic pipe update mechanism.

v2: Rebased due to earlier changes
v3: Pass intel_crtc instead of drm_crtc (Daniel)
v4: Pass frame counter from the caller to evaded/end since
the caller now always has that ready

Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Sourab Gupta <sourabgupta@gmail.com>
Reviewed-by: Akash Goel <akash.goels@gmail.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Perform primary enable/disable atomically with sprite updates

Move the primary plane enable/disable to occur atomically with the
sprite update that caused the primary plane visibility to change.

FBC and IPS enable/disable is left to happen well before or after
the primary plane change.

v2: Pass intel_crtc instead of drm_crtc (Daniel)

Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Sourab Gupta <sourabgupta@gmail.com>
Reviewed-by: Akash Goel <akash.goels@gmail.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Make sprite updates atomic

Add a mechanism by which we can evade the leading edge of vblank. This
guarantees that no two sprite register writes will straddle on either
side of the vblank start, and that means all the writes will be latched
together in one atomic operation.

We do the vblank evade by checking the scanline counter, and if it's too
close to the start of vblank (too close has been hardcoded to 100usec
for now), we will wait for the vblank start to pass. In order to
eliminate random delayes from the rest of the system, we operate with
interrupts disabled, except when waiting for the vblank obviously.

Note that we now go digging through pipe_to_crtc_mapping[] in the
vblank interrupt handler, which is a bit dangerous since we set up
interrupts before the crtcs. However in this case since it's the vblank
interrupt, we don't actually unmask it until some piece of code
requests it.

v2: preempt_check_resched() calls after local_irq_enable() (Jesse)
    Hook up the vblank irq stuff on BDW as well
v3: Pass intel_crtc instead of drm_crtc (Daniel)
    Warn if crtc.mutex isn't locked (Daniel)
    Add an explicit compiler barrier and document the barriers (Daniel)
    Note the irq vs. modeset setup madness in the commit message (Daniel)
v4: Use prepare_to_wait() & co. directly and eliminate vbl_received
v5: Refactor intel_pipe_handle_vblank() vs. drm_handle_vblank() (Chris)
    Check for min/max scanline <= 0 (Chris)
    Don't call intel_pipe_update_end() if start failed totally (Chris)
    Check that the vblank counters match on both sides of the critical
    section (Chris)
v6: Fix atomic update for interlaced modes
v7: Reorder code for better readability (Chris)
v8: Drop preempt_check_resched(). It's not available to modules
    anymore and isn't even needed unless we ourselves cause
    a wakeup needing reschedule while interrupts are off

Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Sourab Gupta <sourabgupta@gmail.com>
Reviewed-by: Akash Goel <akash.goels@gmail.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Support 64b relocations

All the rest of the code to enable this is in my branch. Without my
branch, hitting > 32b offsets is impossible. The code has always
"supported" 64b, but it's never actually been run of tested. This change
doesn't actually fix anything. [1] I am not sure why X won't work yet. I
do not get hangs or obvious errors.

There are 3 fixes grouped together here. First is to remove the
hardcoded 0 for the upper dword of the relocation. The next fix is to
use a 64b value for target_offset. The final fix is to not directly
apply target_offset to reloc->delta. reloc->delta is part of ABI, and so
we cannot change it. As it stands, 32b is enough to represent everything
we're interested in representing anyway. The main problem is, we cannot
add greater than 32b values to it directly.

[1] Almost all of intel-gpu-tools is not yet ready to test 64b
relocations. There are a few places that expect 32b values for offsets
and these all won't work.

Cc: Rafael Barbalho <rafael.barbalho@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Support 64b execbuf

Previously, our code only had a 32b offset value for where the
batchbuffer starts. With full PPGTT, and 64b canonical GPU address
space, that is an insufficient value. The code to expand is pretty
straight forward, and only one platform needs to do anything with the
extra bits.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Rafael Barbalho <rafael.barbalho@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/sdvo: Remove ->mode_set callback

SDVO is used by both crtcs using the i9xx_ and the ironlake_
functions. For both cases there is nothing between the
encoder->mode_set and the encoder->pre_enable calls that touches the
hardware.

The vlv_ functions are different since they enable the pll before the
->pre_enable hook. But SDVO isn't supported on vlv platforms, so this
doesn't matter.

We've also already clean up all the sdvo state computation logic, all
relevant parts are already in the ->compute_config hook. So we can
just get rid of the ->mode_set hook by converting it to a ->pre_enable
hook.

Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/crt: Remove ->mode_set callback

We only set a few bits in the ADPA register, which we then read back
in the enable/disable hooks. So we can just move that bit of state
computation code to the place where we need it since setting these
bits without enabling the CRT encoder has no effects.

The only exceptions are the hotplug bits since they affect the hotplug
detection logic, but we already set those in the ->reset function and
then never touch them.

Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/tv: Remove ->mode_set callback

Currently for the i9xx crtc hooks there's nothing between the call to
encoder->mode_set and encoder->pre_enable which touches the hardware.

Therefore, since tv is only used on gen3/4, we can just move the hook.
Yay for easy cases!

The only other important thing to check is that the new
->pre_enable hook is idempotent wrt the sw state since now it can
be called multiple times (due to DPMS). After a the bit of refactoring
this is now easy to check: It only reads crtc->config and computes
derived state but otherwise leaves it as-is, so we're good.

Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/tv: Rip out pipe-disabling nonsense from ->mode_set

The pipe and plane _are_ disabled when we call this. So replace it
all with the corresponding assert (as self-documenting code) and
rip out all the lore.

Checking for a disabled plane would require us to export those macros
from intel_display.c, but if the pipe is off the plane isn't working
either. So this single check is good enough.

Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/tv: De-magic device check

We only support TV-out on gen3/4 mobile platforms, and i915gm is the
only one that matches.

Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/tv: extract set_color_conversion

intel_tv_mode_set is still too bug.

Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/tv: extract set_tv_mode_timings

intel_tv_mode_set is just too big.

Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/dvo: Remove ->mode_set callback

Currently for the i9xx crtc hooks there's nothing between the call to
encoder->mode_set and encoder->pre_enable which touches the hardware.

Therefore, since dvo is only used on gen2, we can just move the hook.
Yay for easy cases!

The only other important thing to check is that the new
->pre_enable hook is idempotent wrt the sw state since now it can be
called multiple times (due to DPMS). It only reads crtc->config but
otherwise leaves it as-is, so we're good.

Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Make encoder->mode_set callbacks optional

For a bunch of reasons we want to move away from the ->mode_set
callbacks: All hw state setup needs to move into ->enable hooks (so
that DOMS can do runtime pm) and all the configuration setup needs to
move into the compute_config functions.

To start with this make the enocer->mode_set callback optional.

Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Make primary_enabled match the actual hardware state

The BIOS can enable a pipe but leave the primary plane disabled. This
coflicts with out current idea of primary_enabled. Read the actual
hardware plane state and set primary_enabled appropriately.

We currently assume that primary_enabled is always true when we're about
to disable a crtc. That needs to change now as the plane may not be
enabled. So replace the relevant WARNs with early returns in
intel_{enable,disable}_primary_hw_plane().

Fixes the following warning
[    3.831602] WARNING: CPU: 0 PID: 1112 at linux/drivers/gpu/drm/i915/intel_display.c:1918 intel_disable_primary_hw_plane+0xe4/0xf0 [i915]()

which got introduced here by me:
commit e9e39655c0c30cddc3f8c09a757678a24dd36737
Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
Date:   Mon Apr 28 15:53:25 2014 +0300

    drm/i915: Remove useless checks from primary enable/disable

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Tested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Move ring_begin to signal()

Add_request has always contained both the semaphore mailbox updates as
well as the breadcrumb writes. Since the semaphore signal is the one
which actually knows about the number of dwords it needs to emit to the
ring, we move the ring_begin to that function. This allows us to remove
the hideously shared #define

On a related not, gen8 will use a different number of dwords for
semaphores, but not for add request.

v2: Make number of dwords an explicit part of signalling (via function
argument). (Chris)

v3: very slight comment change

Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Virtualize the ringbuffer signal func

This abstraction again is in preparation for gen8. Gen8 will bring new
semantics for doing this operation.

While here, make the writes of MI_NOOPs explicit for non-existent rings.
This should have been implicit before.

NOTE: This is going to be removed in a few patches.

Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Move semaphore specific ring members to struct

This will be helpful in abstracting some of the code in preparation for
gen8 semaphores.

v2: Move mbox stuff to a separate struct

v3: Rebased over VCS2 work

Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> (v1)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: vlv: init only needed state during early power well enabling

During the initial power well enabling on the driver init/resume path
we can avoid initialzing part of the HW/SW state that will be
initialized anyway by the subsequent init/resume code. For some steps
like HPD initialization this redundancy is not only an overhead but an
actual problem, since they can't be run this early in the overall init
sequence.

Add a flag marking the init phase and skip reinitialzing state that is
not strictly necessary based on that.

This is also needed by the upcoming HPD init restructuring by Thierry
and Daniel.

Signed-off-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Avoid NULL ctx->obj dereference in debugfs/i915_context_info

In commit 691e6415c891b8b2b082a120b896b443531c4d45
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Wed Apr 9 09:07:36 2014 +0100

drm/i915: Always use kref tracking for all contexts.

we populated fake contexts on all platforms. These were identical to the
full hardware context tracking structs, except for the ctx->obj used to
store the hardware state. However, there remained one place where we
assumed that if a context existed, it would have an object associated
with it.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77717
Testcase: igt/drv_suspend/debugfs-reader
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Add intel_get_crtc_scanline()

Add a new function intel_get_crtc_scanline() that returns the current
scanline counter for the crtc.

v2: Rebase after vblank timestamp changes.
    Use intel_ prefix instead of i915_ as is more customary for
    display related functions.
    Include DRM_SCANOUTPOS_INVBL in the return value even w/o
    adjustments, for a bit of extra consistency.
v3: Change the implementation to be based on DSL on all gens,
    since that's enough for the needs of atomic updates, and
    it will avoid complicating the scanout position calculations
    for the vblank timestamps
v4: Don't break scanline wraparound for interlaced modes

Reviewed-by: Sourab Gupta <sourabgupta@gmail.com>
Reviewed-by: Akash Goel <akash.goels@gmail.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Fix scanout position for real

Seems I've been a bit dense with regards to the start of vblank
vs. the scanline counter / pixel counter.

After staring at the pixel counter on gen4 I came to the conclusion
that the start of vblank interrupt and scanline counter increment
happen at the same time. The scanline counter increment is documented
to occur at start of hsync, which means that the start of vblank
interrupt must also trigger there. Looking at the pixel counter value
when the scanline wraps from vtotal-1 to 0 confirms that, as the pixel
counter at that point reads hsync_start. This also clarifies why we see
need the +1 adjustment to the scaline counter. The counter actually
starts counting from vtotal-1 on the first active line.

I also confirmed that the frame start interrupt happens ~1 line after
the start of vblank, but the frame start occurs at hblank_start instead.
We only use the frame start interrupt on gen2 where the start of vblank
interrupt isn't available. The only important thing to note here is that
frame start occurs after vblank start, so we don't have to play any
additional tricks to fix up the scanline counter.

The other thing to note is the fact that the pixel counter on gen3-4
starts counting from the start of horizontal active on the first active
line. That means that when we get the start of vblank interrupt, the
pixel counter reads (htotal*(vblank_start-1)+hsync_start). Since we
consider vblank to start at (htotal*vblank_start) we need to add a
constant (htotal-hsync_start) offset to the pixel counter, or else we
risk misdetecting whether we're in vblank or not.

I talked a bit with Art Runyan about these topics, and he confirmed my
findings. And that the same rules should hold for platforms which don't
have the pixel counter. That's good since without the pixel counter it's
rather difficult to verify the timings to this accuracy.

So the conclusion is that we can throw away all the ISR tricks I added,
and just increment the scanline counter by one always.

Reviewed-by: Sourab Gupta <sourabgupta@gmail.com>
Reviewed-by: Akash Goel <akash.goels@gmail.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/bdw: Disable idle DOP clock gating

It seems we need this at least for the current platforms we have, but
probably not later. In any event, it should cause too much harm as we do
the same thing on several other platforms.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Reviewed-by: Brad Volkin <bradley.d.volkin@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/bdw: enable eDRAM.

The same register exists for querying and programming eDRAM AKA eLLC. So
we can simply use it. For now, use all the same defaults as we had
for Haswell, since like Haswell, I have no further details.

I do not actually have a part with eDRAM, so I cannot test this.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Reviewed-by: Brad Volkin <bradley.d.volkin@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/bdw: Add WT caching ability

I don't have any insight on what parts can do what. The docs do seem to
suggest WT caching works in at least the same manner as it does on
Haswell.

The addr = 0 is to shut up GCC:
drivers/gpu/drm/i915/i915_gem_gtt.c:80:7: warning: 'addr' may be used
uninitialized in this function [-Wmaybe-uninitialized]

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Reviewed-by: Brad Volkin <bradley.d.volkin@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: bdw: fix RC6 enabled status reporting and disable runtime PM

On BDW we don't enable RC6 at the moment, but this isn't reflected in
the (sanitized) i915.enable_rc6 option. So make enable_rc6 report
correctly that RC6 is disabled, which will also effectively disable RPM
on BDW (since RPM depends on RC6).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77565

Signed-off-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Fix assert_plane warning during FDI link train

assert_plane_enabled() is now triggering during FDI link train because
we no longer enable planes that early.

This problem got introduced in:
commit a5c4d7bc187bd13bc11ac06bb4ea3a0d4001aa4d
Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
Date: Fri Mar 7 18:32:13 2014 +0200

drm/i915: Disable/enable planes as the first/last thing during modeset on ILK+

Just drop the assert since we shouldn't need planes for link training.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
[danvet: Squash in fixup for now unused plane local variable, reported
by 0-day tester.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

Revert "drm/i915: fix build warning on 32-bit (v2)"

This reverts commit 60f2b4af1258c05e6b037af866be81abc24438f7.

The same warning has been fixed in e5081a538a565284fec5f30a937d98e460d5e780 and
these two commits got merged in 74e99a84de2d0980320612db8015ba606af42114 which
caused another warning. Simply, the reverted commit casted the pointer
difference to unsigned long and the other commit changed the output type from
long to ptrdiff_t.

The other commit fixes the original warning the better way so I'm reverting
this commit now.

Signed-off-by: Jan Moskyto Matejka <mq@suse.cz>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Fix deadlock during driver init on ILK

We have a struct_mutex deadlock during driver init on ILK

[   54.320273] =============================================
[   54.320371] [ INFO: possible recursive locking detected ]
[   54.320471] 3.15.0-rc2-flip_race+ #2 Not tainted
[   54.320567] ---------------------------------------------
[   54.320665] modprobe/2178 is trying to acquire lock:
[   54.320762]  (&dev->struct_mutex){+.+.+.}, at: [<ffffffffa0568b05>] intel_enable_gt_powersave+0xa5/0x9d0 [i915]
[   54.321111]
[   54.321111] but task is already holding lock:
[   54.321250]  (&dev->struct_mutex){+.+.+.}, at: [<ffffffffa05b4c2e>] intel_modeset_init_hw+0x3e/0x60 [i915]
[   54.321583]
[   54.321583] other info that might help us debug this:
[   54.321724]  Possible unsafe locking scenario:
[   54.321724]
[   54.321863]        CPU0
[   54.321954]        ----
[   54.322046]   lock(&dev->struct_mutex);
[   54.322221]   lock(&dev->struct_mutex);
[   54.322397]
[   54.322397]  *** DEADLOCK ***
[   54.322397]
[   54.322638]  May be due to missing lock nesting notation
[   54.322638]
[   54.322781] 4 locks held by modprobe/2178:
[   54.322875]  #0:  (&dev->mutex){......}, at: [<ffffffff813592eb>] __driver_attach+0x5b/0xb0
[   54.323230]  #1:  (&dev->mutex){......}, at: [<ffffffff813592f9>] __driver_attach+0x69/0xb0
[   54.323582]  #2:  (drm_global_mutex){+.+.+.}, at: [<ffffffffa04e1e0d>] drm_dev_register+0x2d/0x120 [drm]
[   54.323945]  #3:  (&dev->struct_mutex){+.+.+.}, at: [<ffffffffa05b4c2e>] intel_modeset_init_hw+0x3e/0x60 [i915]

This regression got introduced in:
commit 586d5270b60dc1f35cc3ca982d403765bad77965
Author: Imre Deak <imre.deak@intel.com>
Date:   Mon Apr 14 20:24:28 2014 +0300

    drm/i915: move getting struct_mutex lower in the callstack during GPU reset

Fix the problem by not taking struct_mutex around intel_enable_gt_powersave()
in intel_modeset_init_hw() since intel_enable_gt_powersave() now grabs the
mutex itself.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: remove extraneous VGA power domain put calls

In recent dmesg logs reported for unrelated issues I noticed some power
domain WARNs caused by the following.

The workaround

commit ce352550327b394f3072a07c9cd9d27af9276f15
Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
Date:   Fri Sep 20 10:14:23 2013 +0300

    drm/i915: Fix unclaimed register access due to delayed VGA memory disable

and following fixup of it

commit a14853206517b0c8102accbc77401805a0dbdb9e
Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
Date:   Mon Sep 16 17:38:34 2013 +0300

    drm/i915: Move power well init earlier during driver load

was partially reverted by

commit 7f16e5c1416070dc590dd333a2d677700046a4ab
Merge: 9d1cb91 5e01dc7
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Mon Nov 4 16:28:47 2013 +0100

    Merge tag 'v3.12' into drm-intel-next

but kept the power domain put calls on the error path.

I think for now we can keep things as-is (not reintroduce the w/a) and just fix
the error path, since
- nobody complained seeing this issue
- according to Ville someone is reworking the VGA arbitration scheme at the
  moment and when that's ready we have to rethink this part anyway

So fix this by just removing the put calls from the error path as well.

Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Integrate cmd parser kerneldoc

Ville noticed that we have this nice kerneldoc but it's not integrated
anywhere. Fix this asap!

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Brad Volkin <bradley.d.volkin@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Do not call retire_requests from wait_for_rendering

A common issue we have is that retiring requests causes recursion
through GTT manipulation or page table manipulation which we can only
handle at very specific points. However, to maintain internal
consistency (enforced through our sanity checks on write_domain at
various points in the GEM object lifecycle) we do need to retire the
object prior to marking it with a new write_domain, and also clear the
write_domain for the implicit flush following a batch.

Note that this then allows the unbound objects to still be on the active
lists, and so care must be taken when removing objects from unbound lists
(similar to the caveats we face processing the bound lists).

v2: Fix i915_gem_shrink_all() to handle updated object lifetime rules,
by refactoring it to call into __i915_gem_shrink().

v3: Missed an object-retire prior to changing cache domains in
i915_gem_object_set_cache_leve()

v4: Rebase

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Brad Volkin <bradley.d.volkin@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

lib: Export interval_tree

lib/interval_tree.c provides a simple interface for an interval-tree
(an augmented red-black tree) but is only built when testing the generic
macros for building interval-trees. For drivers with modest needs,
export the simple interval-tree library as is.

v2: Lots of help from Michel Lespinasse to only compile the code
    as required:
    - make INTERVAL_TREE a config option
    - make INTERVAL_TREE_TEST select the library functions
      and sanitize the filenames & Makefile
    - prepare interval_tree for being built as a module if required

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michel Lespinasse <walken@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Michel Lespinasse <walken@google.com>
[Acked for inclusion via drm/i915 by Andrew Morton.]
[danvet: switch to _GPL as per the mailing list discussion.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: vlv: increase timeout when forcing on the GFX clock

I've seen latencies up to 15msec, so increase the timeout to 20msec.

Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: vlv: factor out vlv_force_gfx_clock and check for pending force-off

This will be needed by the VLV runtime PM helpers too, so factor it out.

Also add a safety check for the case where the previous force-off is
still pending, since I'm not sure if Punit can handle a new setting
while the previous one hasn't settled yet.

v2:
- unchanged
v3:
- add a note to the commit message about the safety check (Ville)

Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: vlv: setup RPS min/max frequencies once during init time

When enabling runtime PM on VLV, GT power save enabling becomes relatively
frequent, so optimize it a bit.

Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: reinit GT power save during resume

During runtime suspend there can be a last pending rps.work, so make
sure it's canceled. Note that in the runtime suspend callback we can't
get any RPS interrupts since it's called only after the GPU goes idle
and we set the minimum RPS frequency. The next possibility for an RPS
interrupt is only after getting an RPM ref (for example because of a new
GPU command) and calling the RPM resume callback.

v2:
- patch introduced in v2 of the patchset
v3:
- Change the order of canceling the rps.work and disabling interrupts to
avoid the race between interrupt disabling and the the rps.work. Race
spotted by Ville.

Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: make runtime PM swizzling/ring_freq init platform independent

We need to re-init sizzling on all platforms so move it to the
platform independent runtime resume callback. The ring frequency reinit
is also needed everywhere except on VLV, but gen6_update_ring_freq()
will be a noop on VLV, so we can move this function too to platform
independent code.

Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: factor out gen6_update_ring_freq

This is needed by the next patch moving the call out from platform
specific RPM callbacks to platform independent code.

No functional change.

v2:
- patch introduce in v2 of the patchset
v3:
- simplify platform check condition (Ville)

Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: make runtime PM interrupt enable/disable platform independent

We need to disable the interrupts for all platforms, so make the helpers
for this platform independent and call them from them platform
independent runtime suspend/resume callbacks.

On HSW/BDW this will move interrupt disabling/re-enabling at the
beginning/end of runtime suspend/resume respectively, but I don't see
any reason why this would cause a problem there. In any case this seems
to be the correct thing to do even on those platforms.

Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: disable runtime PM if RC6 is disabled

On VLV we depend on RC6 to save the GT render and media HW context
before going to the D3 state via RPM, so as a preparation for the
VLV RPM support (added in an upcoming patch) disable RPM if RC6 is
disabled.

There is probably a similar dependency on other platforms too, so for
safety require RC6 for those too. For these platforms (SNB, HSW, BDW)
this is then a possible fix.

v2:
- require RC6 for all RPM platforms, not just for VLV (Paulo, Daniel)

Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: sanitize enable_rc6 option

Atm, an invalid enable_rc6 module option will be silently ignored, so
emit an info message about it. Doing an early sanitization we can also
reuse intel_enable_rc6() in a follow-up patch to see if RC6 is actually
enabled. Currently the caller would have to filter a non-zero return
value based on the platform we are running on. For example on VLV with
i915.enable_rc6 set to 2, RC6 won't be enabled but atm
intel_enable_rc6() would still return 2 in this case.

v2:
- simplify the platform check condition (Ville)

Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: fix unbalanced GT powersave enable / disable calls

Atm, we call intel_gt_powersave_enable() for GEN6 and GEN7 but disable
it for everything starting from GEN6. This is a problem in case of BDW.
Since I don't have a BDW to test if RC6 works properly, just keep it
disabled for now and fix only the disable function.

Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: vlv: check port power domain instead of only D0 for eDP VDD on

Some platforms need additional power domains to be on in addition to the
device D0 state to access the panel registers.

Suggested by Daniel.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76987
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: add missing error capturing of the PIPESTAT reg

While checking the error capture path I noticed that we lacked the
power domain-on check for PIPESTAT so fix this by moving that to where
the rest of pipe registers are captured.

The move also revealed that we actually don't include this register in
the error report, so fix that too.

v2:
- patch introduced in v2 of the patchset
v3:
- add back !HAS_PCH_SPLIT check (Ville)
[ Ignore my previous comment about the gen<=5 || vlv check, I realized
that it's the same as !HAS_PCH_SPLIT. ]

Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>