git.openwrt.org Git - openwrt/staging/blogic.git/log

drm/i915/bxt: Implement enable/disable for Display C9 state

v2: Modified as per review comments from Imre
- Mention enabling instead of allowing in the debug trace and
  remove unnecessary comments.

v3:
- Rebase to latest.
- Move DC9-related functions from intel_display.c to intel_runtime_pm.c.

v4: (imre)
- remove DC5 disabling, it's a nop at this point
- squashed in Suketu's "Assert the requirements to enter or exit DC9"
  patch
- remove check for RUNTIME_PM from assert_can_enable_dc9, it's not a
  dependency

Signed-off-by: A.Sunil Kamath <sunil.kamath@intel.com> (v3)
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Sagar Kamble <sagar.a.kamble@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/bxt: add description about the BXT PHYs

Extend the VLV/CHV DPIO (PHY) documentation with the BXT specifics.

v2:
- add more detail about the mapping between ports and transcoders (ville)

Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/bxt: add display initialize/uninitialize sequence (PHY)

Add PHY specific display initialization sequence as per BSpec.

Note that the PHY initialization/uninitialization are done
at their current place only for simplicity, in a future patch - when more
of the runtime PM features will be enabled - these will be moved to
power well#1 and modeset encoder enabling/disabling hooks respectively.

The call to uninitialize the PHY during system/runtime suspend will be
added later in this patchset.

v1: Added function definitions in header files
v2: Imre's review comments addressed
- Moved CDCLK related definitions to i915_reg.h
- Removed defintions for CDCLK frequency
- Split uninit_cdclk() by adding a phy_uninit function
- Calculate freq and decimal based on input frequency
- Program SSA precharge based on input frequency
- Use wait_for 1ms instead 200us udelay for DE PLL locking
- Removed initial value for divider, freq, decimal, ratio.
- Replaced polling loops with wait_for
- Parameterized latency optim setting
- Fix the parts where DE PLL has to be disabled.
- Call CDCLK selection from mode set

v3: (imre)
- add note about the plan to move the cdclk/phy init to a better place
- take rps.hw_lock around pcode access
- fix DDI PHY timeout value
- squash in Vandana's "PORT_CL2CM_DW6_A BUN fix",
  "DDI PHY programming register defn", "Do ddi_phy_init always",
- move PHY register macros next to the corresponding CHV/VLV macros
- move DE PLL register macros here from another patch since they are
  used here first
- add BXT_ prefix to CDCLK flags
- s/COMMON_RESET/COMMON_RESET_DIS/ and clarify related code comments
- fix incorrect read value for the RMW of BXT_PHY_CTL_FAMILY_DDI
- fix using GT_DISPLAY_EDP_POWER_ON vs. GT_DISPLAY_DDI_POWER_ON
  when powering on DDI ports
- fix incorrect port when setting BXT_PORT_TX_DW14_LN for DDI ports
- add missing masking when programming CDCLK_FREQ_DECIMAL
- add missing powering on for DDI-C port, rename OCL2_LDOFUSE_PWR_EN
  to OCL2_LDOFUSE_PWR_DIS to reduce confusion
- add note about mismatch with bspec in the PORT_REF_DW6 fields
- factor out PHY init code to a new function, so we can call it for
  PHY1 and PHY0, instead of open-coding the same

v4: (ville)
- split the CDCLK/PHY parts into two patches, update commit message
  accordingly
- use the existing dpio_phy enum instead of adding a new one for the
  same purpose
- flip the meaning of PHYs so that PHY_A is PHY1 and PHY_BC is PHY0 to
  better match CHV
- s/BXT_PHY/_BXT_PHY/
- use _PIPE for _BXT_PHY instead of open-coding it
- drop _0_2_0_GTTMMADR suffix from BXT_P_CR_GT_DISP_PWRON
- define GT_DISPLAY_POWER_ON in a more standard way
- make a note that the CHV ConfigDB also disagrees about GRC_CODE field
  definitions
- fix lane optimization refactoring fumble from v3
- add per PHY uninit functions to match the init counterparts

Signed-off-by: Vandana Kannan <vandana.kannan@intel.com> (v2)
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/bxt: add display initialize/uninitialize sequence (CDCLK)

Add CDCLK specific display clock initialization sequence as per BSpec.

Note that the CDCLK initialization/uninitialization are done at their
current place only for simplicity, in a future patch - when more of the
runtime PM features will be enabled - these will be moved to power
well#1 and modeset encoder enabling/disabling hooks respectively. This
also means that atm dynamic power gating power well #1 is effectively
disabled.

The call to uninitialize CDCLK during system/runtime suspend will be
added later in this patchset.

v1: Added function definitions in header files
v2: Imre's review comments addressed
- Moved CDCLK related definitions to i915_reg.h
- Removed defintions for CDCLK frequency
- Split uninit_cdclk() by adding a phy_uninit function
- Calculate freq and decimal based on input frequency
- Program SSA precharge based on input frequency
- Use wait_for 1ms instead 200us udelay for DE PLL locking
- Removed initial value for divider, freq, decimal, ratio.
- Replaced polling loops with wait_for
- Parameterized latency optim setting
- Fix the parts where DE PLL has to be disabled.
- Call CDCLK selection from mode set

v3: (imre)
- add note about the plan to move the cdclk/phy init to a better place
- take rps.hw_lock around pcode access
- move DE PLL register macros here from another patch since they are
  used here first
- add BXT_ prefix to CDCLK flags
- add missing masking when programming CDCLK_FREQ_DECIMAL

v4: (ville)
- split the CDCLK/PHY parts into two patches, update commit message
  accordingly
- s/DISPLAY_PCU_CONTROL/HSW_PCODE_DE_WRITE_FREQ_REQ/
- simplify BXT_DE_PLL_RATIO macros
- fix BXT_DE_PLL_RATIO_MASK
- s/bxt_select_cdclk_freq/broxton_set_cdclk_freq/
- move cdclk init/uninit/set code from intel_ddi.c to intel_display.c
- remove redundant code comments for broxton_set_cdclk_freq()
- sanitize fixed point<->integer frequency value conversion
- use DRM_ERROR instead of WARN
- do RMW when programming BXT_DE_PLL_CTL for safety
- add note about PLL lock timeout being exactly 200us
- make PCU error messages more descriptive
- instead of using 0 freq to mean PLL off/bypass freq use 19200
  for clarity, as the latter one is the actual rate
- simplify pcode programming, removing duplicated
  sandybridge_pcode_write() call
- sanitize code flow, remove unnecessary scratch vars in
  broxton_set_cdclk() (imre)
- Remove bound check for maxmimum freq to match current code.
  This check will be added later at a more proper platform
  independent place once atomic support lands.
- add note to remove freq guard band which isn't needed on BXT
- add note to reduce freq to minimum if no pipe is enabled
- combine broxton_modeset_global_pipes() with
  valleyview_modeset_global_pipes()

Signed-off-by: Vandana Kannan <vandana.kannan@intel.com> (v2)
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Rename vlv_cdclk_freq to cdclk_freq

Rename vlv_cdclk_freq to cdclk_freq so that it can be used for all
platforms as required. Needed by the next patch.

Signed-off-by: Vandana Kannan <vandana.kannan@intel.com>
Signed-off-by: A.Sunil Kamath <sunil.kamath@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: PSR VLV: Add single frame update.

According to spec: "In PSR HW or SW mode, SW set this bit before writing
registers for a flip. It will be self-clear when it gets to the PSR
active state."

Some versions of spec mention that this is needed when in
"Persistent mode" but define it as same as "SW mode". Since this
fix the page flip case let's assume this is exactly what we need.

Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: PSR: deprecate link_standby support for core platforms.

On Haswell and Broadwell with link in standby when exit event happens
between vblank and VSC packet, PSR exit on panel but DPA transmitter
still sends black pixel. When this condition hits, panel will intermittently
display black frame.

The known W/A for this case involve the of single_frame update
that isn't supported on Haswell and to be supported on Broadwell
3 other workarounds would be required. So it is better and safe to
just deprecate link_standby for now.

Also, link fully off saves more power than link_standby and afwk
no OEM is requesting link standby on VBT. There is no reason for that.

For Skylake let's just consider it behaves like Broadwell until
we prove otherwise.

v2: Fix commit message (Durga).

v3: Fix conflict with PSR2.

Reference: HSD: bdwgfx/1912559
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Durgadoss R <durgadoss.r@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: PSR: Fix DP_PSR_NO_TRAIN_ON_EXIT logic

Since the beginning there is a missunderstanding on the meaning of this
dpcd bit.
This bit shouldn't indicate whether to use link standby or not, but just
be used to configure TP1, TP2 and TP3 times and tell hw aux should be skiped
since HW is the responsible one.

Even with help of frontbuffer tracking, HW is still fully responsible for
PSR exit logic with/without DP training.

DP_PSR_NO_TRAIN_ON_EXIT means the source doesn't need to do the training, but
it doesn't tell to avoid TP patterns, so we will send minimal TP1 and avoid
TP2. It also means that sink itself can take up to 5 idle frames for training.
6 in our case since we might be off by 1. So we also increment idle_frames by 4
here.

v2: Fix and improve commit message (Durga).
v3: Use minimal TP1 time avoiding TP2 and increase idle frame.

Cc: Durgadoss R <durgadoss.r@intel.com>
Cc: Arthur Runyan <arthur.j.runyan@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Durgadoss R <durgadoss.r@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: PSR: Remove wrong LINK_DISABLE.

This wrong logic and useless define came from first versions and
came along with all rework. Just now I notice how ugly, wrong and
useless this is.

val is already defined as 0 anyway and logic is completelly wrong
and useless. So let's starting the link_standby fix with this
cleaning.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Durgadoss R <durgadoss.r@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/bxt: Define BXT power domains

Add BXT power domains

v2: Use DOMAIN_PLLS instead of a new CDCLK one, whitespace fixes
(Damien)
v3: add VGA, TRANSCODER_A power domains (imre)

Signed-off-by: Satheeshakrishna M <satheeshakrishna.m@intel.com> (v1)
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/bxt: Enable GMBUS IRQ

GMBUS interrupt has been moved to CPU side in BXT.
What this patch does is:
1. Enable GMBUS IRQ in de_post_install function
2. Handle this interrupt as a port interrupt in display irq
   handler

v2: Rebase on top of the for_each_pipe() change adding dev_priv as
    first argument (Damien).
v3: read BXT_DE_PORT_GMBUS IIR flag only on BXT on other platforms
    it's reserved (imre)
v4: (jani)
- remove redundant 'BXT GMBUS' comment
- fix formatting of BXT_DE_PORT_GMBUS definition

Reviewed-by: Satheeshakrishna M <satheeshakrishna.m@intel.com>
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Shashank Sharma <shashank.sharma@intel.com> (v1)
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/bxt: Add BXT support in gen8_irq functions

This patch adds conditional checks in gen8_irq functions
to support BXT. Most of the checks just look for PCH split
availability, and block the call to PCH interrupt functions if
not available.

v2: (jani)
- drop redundant TODO comment about PCH IRQ flags on BXT
- check HAS_PCH_SPLIT instead of IS_BROXTON when handling PCH specific
  IRQ events in gen8_irq_handler()
- check HAS_PCH_SPLIT before calling the function instead of a
  corresponding early return within the called function for
  ibx_irq_reset(), ibx_irq_pre_postinstall(), ibx_irq_postinstall()
v3: (jani)
- in ironlake_irq_postinstall() and ironlake_irq_reset() HAS_PCH_SPLIT
  is always true, so drop the check for it

Reviewed-by: Satheeshakrishna M <satheeshakrishna.m@intel.com>
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Shashank Sharma <ppashank.sharma@intel.com> (v1)
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/bxt: Add DDI hpd handler

This patch adds a hot plug interrupt handler function for BXT.
What this function typically does is:
1. Check if hot plug is enabled from hot plug control register.
2. Call hpd_irq_handler with appropriate trigger to detect a
plug storm and schedule a bottom half.
3. Clear sticky status bits in hot plug control register..

v2: (jani)
- drop redundant unlikely()
- s/Todo/FIXME:/ in code comment
- declare 'found' var in the scope where it's used
- check for IS_BROXTON before handling BXT_DE_PORT_HOTPLUG_MASK

Reviewed-by: Satheeshakrishna M <satheeshakrishna.m@intel.com>
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Shashank Sharma <shashank.sharma@intel.com> (v1)
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/bxt: support for HPD long/short status decoding

All non-GMCH platforms have the same register layout for HPD long/short
status, so let's use this condition instead of HAS_PCH_SPLIT, as the
latter doesn't apply for BXT.

Noticed by Daniel.

Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/bxt: DDI Hotplug interrupt setup

In BXT, DDI hotplug control has been moved to CPU from PCH.
This patch adds a new IRQ setup function for BXT which:
1. Checks which HPD ports are requested to be enabled by encoders.
2. Enables those ports in the hot plug control register.
3. Un-masks these port interrupts in the IMR register.
4. Enables these port interrupts in the IER register.

V3: Kept the default HPD filter count to default (500 us) as per
satheesh's comment
v4: Remove unused HPD filter defines (Damien)
v5: warn if trying to setup HPD on port A (imre)
v6: fix order of definitions for register bitfields (Daniel)
v7: (jani)
- define the size of the hpd_bxt array explicitly for bound checking
- use for_each_intel_encoder instead of open coding it
- fix format/order of definitions for BXT_HOTPLUG_CTL reg bitfields

Reviewed-by: Satheeshakrishna M <satheeshakrishna.m@intel.com>
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Shashank Sharma <shashank.sharma@intel.com> (v4)
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: add bxt gmbus support

For BXT gmbus is pulled from PCH to CPU. From implementation point of
view only pin pair configuration will change. The existing
implementation supports all platforms previous to GEN8 and also SKL. But
for BXT pin pair configuration is completely different than SKL or other
previous GEN's. This patch introduces the new pin pair configuration
structure specific to BXT and also ensures every real gmbus port has a
gpio pin.

v3 by Jani: with the platform independent prep work in place, the bxt
enabling reduces to a fairly trivial patch. Credits are due Sunil for
giving me the ideas (with his patches) what the platform independent
parts should look like.

v4: Fix intel_hdmi_init_connector() for bxt. Abstract gmbus_pin access
more. s/GPU/PCH/ in commit message.

v5: Rebase.

Issue: VIZ-3574
Signed-off-by: A.Sunil Kamath <sunil.kamath@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

Merge branch 'topic/bxt-stage1' into drm-intel-next-queued

Separate topic branch for bxt didn't work out since we needed to
refactor the gmbus code a bit to make it look decent. So backmerge.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>

drm/i915/bxt: don't use unsupported port detection

The port detection register flags in SFUSE_STRAP and DDI_BUF_CTL_A are
not defined for BXT, so don't use them.

Suggested by Satheesh.

v2:
- DDI_BUF_CTL_A bit 0 is not useful on BXT. Making changes to use this
bit when simulator or BXT is not applicable. Code re-arranged as per
Damien's suggestion.

v3:
- clarify commit message, add code comment (imre)

Signed-off-by: Vandana Kannan <vandana.kannan@intel.com> (v2)
Cc: M, Satheeshakrishna <satheeshakrishna.m@intel.com>
Cc: Lespiau, Damien <damien.lespiau@intel.com>
Cc: Shankar, Uma <uma.shankar@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/bxt: add workaround to avoid PTE corruption

Set TLBPF in TILECTL. This fixes an issue with BXT HW seeing
corrupted pte entries.

v2:
- move the workaround to bxt_init_clock_gating (imre)

Signed-off-by: Robert Beckett <robert.beckett@intel.com> (v1)
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Nick Hoath <nicholas.hoath@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/skl: add WaDisableMaskBasedCammingInRCC workaround

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Nick Hoath <nicholas.hoath@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/bxt: add WaDisableMaskBasedCammingInRCC workaround

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Nick Hoath <nicholas.hoath@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/bxt: add GEN8_HDCUNIT_CLOCK_GATE_DISABLE_HDCREQ workaround

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Nick Hoath <nicholas.hoath@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/bxt: add GEN8_SDEUNIT_CLOCK_GATE_DISABLE workaround

Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Nick Hoath <nicholas.hoath@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/bxt: add bxt_init_clock_gating

v2:
- Make the condition to select between SKL and BXT consistent with the
corresponding condition in init_workarounds_ring (Nick)

Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Nick Hoath <nicholas.hoath@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/gen9: fix PIPE_CONTROL flush for VS_INVALIDATE

On GEN9+ per specification a NULL PIPE_CONTROL needs to be emitted
before any PIPE_CONTROL command with the VS_INVALIDATE flag set.

Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Nick Hoath <nicholas.hoath@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Remove obj->pin_mappable

The obj->pin_mappable flag only exists for debug purposes and is a
hindrance that is mistreated with rotated GGTT views. For debug
purposes, it suffices to mark objects with pin_display as being of note.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Optimistically spin for the request completion

This provides a nice boost to mesa in swap bound scenarios (as mesa
throttles itself to the previous frame and given the scenario that will
complete shortly). It will also provide a good boost to systems running
with semaphores disabled and so frequently waiting on the GPU as it
switches rings. In the most favourable of microbenchmarks, this can
increase performance by around 15% - though in practice improvements
will be marginal and rarely noticeable.

v2: Account for user timeouts
v3: Limit the spinning to a single jiffie (~1us) at most. On an
otherwise idle system, there is no scheduler contention and so without a
limit we would spin until the GPU is ready.
v4: Drop forcewake - the lazy coherent access doesn't require it, and we
have no reason to believe that the forcewake itself improves seqno
coherency - it only adds delay.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Eero Tamminen <eero.t.tamminen@intel.com>
Cc: "Rantala, Valtteri" <valtteri.rantala@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: skylake panel fitting using shared scalers

Enabling skylake panel fitting feature using shared scalers

v2:
-added force detach parameter for pfit disable purpose (me)
-read crtc scaler state from hw state (Daniel)
-replaced both skylake_pfit_enable and disable with skylake_pfit_update (me)
-added scaler id check to intel_pipe_config_compare (Daniel)

v3:
-updated function header to kerneldoc format (Matt)
-dropped need_scaling checks (Matt)

v4:
-move clearing of scaler id from commit path to check path (Matt)
-updated colorkey checks based on recent updates (me)
-squashed scaler check while enabling colorkey to here (me)
-use values in plane_state->src as regular integers (me)
-changes made not to modify state in commit path (Matt)

v5:
-squashed helper function to update scaler users to here (Matt)
-squashed helper function to detach scaler to here (Matt, me)
-changes to align with updated scaler structures (Matt, me)

Signed-off-by: Chandra Konduru <chandra.konduru@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: copy staged scaler state from drm state to crtc->config.

This is required for commit to perform as per staged assignment
of scalers until atomic crtc commit function is available.

As a place holder doing this copy from intel_atomic_commit for
scaling to operate correctly.

Signed-off-by: Chandra Konduru <chandra.konduru@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Ensure setting up scalers into staged crtc_state

From intel_atomic_check, call intel_atomic_setup_scalers() to
assign scalers based on staged scaling requests. Fail the
transaction if setup returns error.

Setting up of scalers should be moved to atomic crtc check once
atomic crtc is ready.

v2:
-updated parameter passing to setup_scalers (me)

Signed-off-by: Chandra Konduru <chandra.konduru@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: setup scalers for crtc_compute_config

Added intel_atomic_setup_scalers to setup scalers based on
staged scaling requests from a crtc and its planes. If staged
requests are supportable, this function assigns scalers to
requested planes and crtc. Note that the scaler assignement
itself is staged into crtc_state and respective plane_states
for later commit after all checks have been done.

overall high level flow:
- scaler requests are staged into crtc_state by planes/crtc
- check whether staged scaling requests can be supported
- add planes using scalers that aren't in current transaction
- assign scalers to requested users
- as part of plane commit, scalers will be committed
(i.e., either attached or detached) to respective planes in hw
- as part of crtc_commit, scaler will be either attached or detached
to crtc in hw

crtc_compute_config calls intel_atomic_setup_scalers() to start
scaler assignments as per scaler state in crtc config. This call
should be moved to atomic crtc once it is available.

v2:
-removed a log message (me)
-changed input parameter to crtc_state (me)

v3:
-remove assigning plane_state returned by drm_atomic_get_plane_state (Matt)
-fail if there is an error from drm_atomic_get_plane_state (Matt)

v4:
-changes to align with updated scaler structure (Matt, me)

v5:
-added addtional checks before enabling HQ mode (me)
-added comments to enable HQ mode (Matt)

Signed-off-by: Chandra Konduru <chandra.konduru@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Preserve scaler state when clearing crtc_state

crtc_state is cleared during mode set which wipes out complete
scaler state too. This is causing issues. To fix, ensure scaler
state is preserved because it contains not only crtc
scaler usage, but also planes using scalers on this crtc.

Signed-off-by: Chandra Konduru <chandra.konduru@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Dump scaler_state too as part of dumping crtc_state

Dumps scaler state as part of dumping crtc_state.

v2:
-use regular ints from plane_state->src (me)

v3:
-changes to align with updated scaler structures (Matt)
-interpret plane_state->src as 16.16 format (Matt, Daniel)

Signed-off-by: Chandra Konduru <chandra.konduru@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Keep sprite plane src rect in 16.16 format

This patch keeps intel_plane_state->src rect back
into 16.16 format.

v2:
-sprite src rect to match primary format (Matt, Daniel)

v3:
-moved a hunk from #14 to keep src rect in check & commit in tandom (Matt)

Signed-off-by: Chandra Konduru <chandra.konduru@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Initialize skylake scalers

Initializing scalers with supported values during crtc init.

v2:
-initialize single copy of min/max values (Matt)

v3:
-moved gen check to callsite (Matt)

v4:
-squashed planes begin with no scaler to here (me)

v5:
-updated init function with updated scaler state structure (Matt)

Signed-off-by: Chandra Konduru <chandra.konduru@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Initialize plane colorkey to NONE

This patch initializes plane colorkey to NONE.

Signed-off-by: Chandra Konduru <chandra.konduru@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: skylake scaler structure definitions

skylake scaler structure definitions. scalers live in crtc_state as
they are pipe resources. They can be used either as plane scaler or
panel fitter.

scaler assigned to either plane (for plane scaling) or crtc (for panel
fitting) is saved in scaler_id in plane_state or crtc_state respectively.

scaler_id is used instead of scaler pointer in plane or crtc state
to avoid updating scaler pointer everytime a new crtc_state is created.

v2:
-made single copy of min/max values for scalers (Matt)

v3:
-updated commentary for scaler_id (me)

v4:
-converted src/dst ranges to #defines, dropped ratios (Matt)

Signed-off-by: Chandra Konduru <chandra.konduru@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Register definitions for skylake scalers

Adding register definitions for skylake scalers.
v2:
-add #define for plane selection mask (me)

Signed-off-by: Chandra Konduru <chandra.konduru@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Disable Render power gating

When RC6 along with Render power gating is enabled, GPU hang
happens due to lack of synchronization between GTI and Render
power gating.

v2: Updated commit message and WA name (Damien)

Change-Id: If1614206341eb52a21eadae8c5ebb2655029b50c
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Sagar Kamble <sagar.a.kamble@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Allocate connector state together with the connectors

Connector states were being allocated in intel_setup_outputs() in loop
over all connectors. That meant hot-added connectors would have a NULL
state. Since the change to use a struct drm_atomic_state for the legacy
modeset, connector states are necessary for the i915 driver to function
properly, so that would lead to oopses.

v2: Fix test for intel_connector_init() success in lvds and sdvo (PRTS)

Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Reported-and-tested-by: Nicolas Kalkhof <nkalkhof@web.de>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Remove unused variable from execlists_context_queue

After commit d7b9ca2f7a41cd36f5ca6c220df48ca9294ed37a
("drm/i915: Remove request->uniq")

dev_priv is no longer needed.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: fix build for DEBUG_FS=n

Fix DEBUG_FS=n build broken by

commit aa7471d228eb6dfddd0d201ea9746d6a2020972a
Author: Jani Nikula <jani.nikula@intel.com>
Date: Wed Apr 1 11:15:21 2015 +0300

drm/i915: add i915 specific connector debugfs file for DPCD

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Move vm page allocation in proper place

Move to i915_vma_bind as it is part of the binding.

Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/skl: Support for 90/270 rotation

v2: Moving creation of property in a function, checking for 90/270
rotation simultaneously (Chris)
Letting primary plane to be positioned
v3: Adding if/else for 90/270 and rest params programming, adding check for
pixel_format, some cleanup (review comments)
v4: Adding right pixel_formats, using src_* params instead of crtc_* for offset
and size programming (Ville)
v5: Rebased on -nightly and Tvrtko's series for gtt remapping.
v6: Rebased on -nightly (Tvrtko's series merged)
v7: Moving pixel_format check to intel_atomic_plane_check (Matt)

Signed-off-by: Sonika Jindal <sonika.jindal@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/skl: Allow universal planes to position

Signed-off-by: Sonika Jindal <sonika.jindal@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Naming constants to be written to GEN9_PG_ENABLE

Change-Id: I4253459c075c50d9b6f034b4ed4ad2f54cd7d1d7
Signed-off-by: Sagar Kamble <sagar.a.kamble@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Remove stale comment from __intel_set_mode()

Since the following commit, the PLL calculations are done earlier, so
the code following the comment doesn't do anything PLL or encoder
related. It only updates the primary plane now.

commit f3019a4d92f08b2dd92443a4b567a066a51c6ec0
Author: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Date: Wed Oct 29 11:32:37 2014 +0200

drm/i915: Remove crtc_mode_set() hook

Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Simplify object is-pinned checking for shrinker

When looking for viable candidates to shrink, we only want objects that
are not pinned. However to do so we performed a double iteration over
the vma in the objects, first looking for the pin-count, then looking
for allocations. We can do both at once and be slightly more explicit in
our validity test.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Allocate context objects from stolen

As we never expose context objects directly to userspace, we can forgo
allocating a first-class GEM object for them and prefer to use the
limited resource of reserved/stolen memory for them. Note this means
that their initial contents are undefined.

However, a downside of using stolen objects for execlists is that we
cannot access the physical address directly (thanks MCH!) which prevents
their use.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Remove request->uniq

We already assign a unique identifier to every request: seqno. That
someone felt like adding a second one without even mentioning why and
tweaking ABI smells very fishy.

Fixes regression from
commit b3a38998f042b862f5ba4d7f2268f3a8dfb4883a
Author: Nick Hoath <nicholas.hoath@intel.com>
Date: Thu Feb 19 16:30:47 2015 +0000

drm/i915: Fix a use after free, and unbalanced refcounting

v2: Rebase

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Nick Hoath <nicholas.hoath@intel.com>
Cc: Thomas Daniel <thomas.daniel@intel.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Jani Nikula <jani.nikula@intel.com>
[danvet: Fixup because different merge order.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Prefer to check for idleness in worker rather than sync-flush

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Tidy gen8 IRQ handler

Remove some needless variables and parameter passing.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Reduce locking in gen8 IRQ handler

Similar in vain in reducing the number of unrequired spinlocks used for
execlist command submission (where the forcewake is required but
manually controlled), we know that the IRQ registers are outside of the
powerwell and so we can access them directly. Since we now have direct
access exported via I915_READ_FW/I915_WRITE_FW, lets put those to use in
the irq handlers as well.

In the process, reorder the execlist submission to happen as early as
possible.

v2: Restrict the untraced register mmio to just the GT path (i.e. the
hotpath for execlists)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Reduce locking in execlist command submission

This eliminates six needless spin lock/unlock pairs when writing out
ELSP.

v2: Respin with my preferred colour.
v3: Mostly back to the original colour

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> [v1]
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Remove unused variable in intel_lrc.c

Already tagged this one and 0-day builder is failing me.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>

drm/i915: Use a separate slab for vmas

vma are more frequently allocated than objects and so should equally
benefit from having a dedicated slab.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Use a separate slab for requests

requests are even more frequently allocated than objects and equally
benefit from having a dedicated slab.

v2: Rebase

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Clear crtc atomic flags at beginning of transaction

Once we have full atomic modeset, these kind of flags should be in a
real intel_crtc_state that's tracked properly.  In the meantime, make
sure we clear out any old flags at the beginning of a transaction so
that we don't wind up seeing leftover flags from old transactions that
were checked, but never went to the commit step.  At the moment, a
failed check or prepare could leave stale flags behind that interfere
with the next atomic transaction.

v2: Just do a memset; the series this patch was originally part of
    placed additional fields into the structure that shouldn't be
    cleared, but that's no longer the case.

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Switch to full atomic helpers for plane updates/disable, take two

Switch from our plane update/disable entrypoints to use the full atomic
helpers (which generate a top-level atomic transaction) rather than the
transitional helpers (which only create/manipulate orphaned plane states
independent of a top-level transaction).  Various upcoming work (SKL
scalers, atomic watermarks, etc.) requires a full atomic transaction to
behave properly/cleanly.

Last time we tried this, we had to back out the change because we still
call the drm_plane vfuncs directly from within our legacy modesetting
code.  This potentially results in nested atomic transactions, locking
collisions, and other failures.  To avoid that problem again, we
sidestep the issue by calling the transitional helpers directly (rather
than through a vfunc) when we're nested inside of other legacy
modesetting code.  However this does allow legacy SetPlane() ioctl's to
process an entire drm_atomic_state transaction, which is important for
upcoming patches.

Cc: Chandra Konduru <chandra.konduru@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Update DRIVER_DATE to 20150410

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm: Adding drm helper function drm_plane_from_index().

Adding drm helper function to return plane pointer from index where
index is a returned by drm_plane_index.

v2:
-avoided nested loop by adding loop count (Daniel)

v3:
-updated patch header prefix to 'drm' (Matt)

v4:
-fixed a kerneldoc issue (kbuild-internal)

Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Chandra Konduru <chandra.konduru@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Acked-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Remove vestigal DRI1 ring quiescing code

After the removal of DRI1, all access to the rings are through requests
and so we can always be sure that there is a request to wait upon to
free up available space. The fallback code only existed so that we could
quiesce the GPU following unmediated access by DRI1.

v2: Rebase

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Use the global runtime-pm wakelock for a busy GPU for execlists

When we submit a request to the GPU, we first take the rpm wakelock, and
only release it once the GPU has been idle for a small period of time
after all requests have been complete. This means that we are sure no
new interrupt can arrive whilst we do not hold the rpm wakelock and so
can drop the individual get/put around every single request inside
execlists.

Note: to close one potential issue we should mark the GPU as busy
earlier in __i915_add_request.

To elaborate: The issue is that we emit the irq signalling sequence
before we grab the rpm reference, which means we could miss the
resulting interrupt (since that's not set up when suspended). The only
bad side effect is a missed interrupt, gt mmio writes automatically
wake up the hw itself. But otoh we have an umbrella rpm reference for
the entirety of execbuf, as long as that's there we're covered.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Explain a bit more about the add_request issue, which after
some irc chatting with Chris turns out to not be an issue really.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Use simpler form of spin_lock_irq(execlist_lock)

We can use the simpler spinlock form to disable interrupts as we are
always outside of an irq/softirq handler.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Fix the VBT child device parsing for BSW

Recent BSW VBT has a VBT child device size 37 bytes instead of the 33
bytes our code assumes. This means we fail to parse the VBT and thus
fail to detect eDP ports properly and just register them as DP ports
instead.

Fix it up by using the reported child device size from the VBT instead
of assuming it matches out struct defintions.

The latest spec I have shows that the child device size should be 36
bytes for rev >= 195, however on my BSW the size is actually 37 bytes.
And our current struct definition is 33 bytes.

Feels like the entire VBT parses would need to be rewritten to handle
changes in the layout better, but for now I've decided to do just the
bare minimum to get my eDP port back.

Cc: Vijay Purushothaman <vijay.a.purushothaman@linux.intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Use complete address space in true PPGTT

True PPGTT is capable of having a full address space, even if the system
has less allocated memory.

Note that aliasing PPGTT always aliases the GGTT and thus should remain
of the same size.

Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/gen8: Dynamic page table allocations

This finishes off the dynamic page tables allocations, in the legacy 3
level style that already exists. Most everything has already been setup
to this point, the patch finishes off the enabling by setting the
appropriate function pointers.

In LRC mode, contexts need to know the PDPs when they are populated. With
dynamic page table allocations, these PDPs may not exist yet. Check if
PDPs have been allocated and use the scratch page if they do not exist yet.

Before submission, update the PDPs in the logic ring context as PDPs
have been allocated.

v2: Update aliasing/true ppgtt allocate/teardown/clear functions for
gen 6 & 7.

v3: Rebase.

v4: Remove BUG() from ppgtt_unbind_vma, but keep checking that either
teardown_va_range or clear_range functions exist (Daniel).

v5: Similar to gen6, in init, gen8_ppgtt_clear_range call is only needed
for aliasing ppgtt. Zombie tracking was originally added for teardown
function and is no longer required.

v6: Update err_out case in gen8_alloc_va_range (missed from lastest
rebase).

v7: Rebase after s/page_tables/page_table/.

v8: Updated scratch_pt check after scratch flag was removed in previous
patch.

v9: Note that lrc mode needs to be updated to support init state without
any PDP.

v10: Unmap correct page_table in gen8_alloc_va_range's error case, clean-up
gen8_aliasing_ppgtt_init (remove duplicated map), and initialize PTs
during page table allocation.

v11: Squashed LRC enabling commit, otherwise LRC mode would be left broken
until it was updated to handle the init case without any PDP.

v12: Do not overallocate new_pts bitmap, make alloc_gen8_temp_bitmaps
static and don't abuse of inline functions. (Mika)

Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/gen8: begin bitmap tracking

Like with gen6/7, we can enable bitmap tracking with all the
preallocations to make sure things actually don't blow up.

v2: Rebased to match changes from previous patches.
v3: Without teardown logic, rely on used_pdpes and used_pdes when
freeing page tables.
v4: Rebased after s/page_tables/page_table/.
v5: Rebased after page table generalizations.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/gen8: Split out mappings

When we do dynamic page table allocations for gen8, we'll need to have
more control over how and when we map page tables, similar to gen6.
In particular, DMA mappings for page directories/tables occur at allocation
time.

This patch adds the functionality and calls it at init, which should
have no functional change.

The PDPEs are still a special case for now. We'll need a function for
that in the future as well.

v2: Handle renamed unmap_and_free_page functions.
v3: Updated after teardown_va logic was removed.
v4: Rebase after s/page_tables/page_table/.
v5: No longer allocate all PDPs in GEN8+ systems with less than 4GB of
memory, and update populate_lr_context to handle this new case (proper
tracking will be added later in the patch series).
v6: Assign lrc page directory pointer addresses using a macro. (Mika)

Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Extract PPGTT param from page_directory alloc

This will be useful for when we move to 48b addressing, and the PDP isn't
the root of the page table structure.

v2: Rebase after changes for Gen8+ systems with less than 4GB of memory.
v3: Rebase after Mika's code review.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2)
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: num_pd_pages/num_pd_entries isn't useful

These values are never quite useful for dynamic allocations of the page
tables. Getting rid of them will help prevent later confusion.

v2: Updated to use unmap_and_free_pd functions.
v3: Updated gen8_ppgtt_free after teardown logic was removed.
v4: Rebase after s/page_tables/page_table/.
v5: Keep allocating all page directories in GEN8+ systems with less
than 4GB of memory. Updated gen6_for_all_pdes.
v6: Prevent (harmless) out of range access in gen6_for_all_pdes.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/gen8: Update pdp switch and point unused PDPs to scratch page

One important part of this patch is we now write a scratch page
directory into any unused PDP descriptors. This matters for 2 reasons,
first, we're not allowed to just use 0, or an invalid pointer, and second,
we must wipe out any previous contents from the last context.

The latter point only matters with full PPGTT. The former point only
effect platforms with less than 4GB memory.

v2: Updated commit message to point that we must set unused PDPs to the
scratch page.

v3: Unmap scratch_pd in gen8_ppgtt_free.

v4: Initialize scratch_pd. (Mika)

Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/gen8: pagetable allocation rework

Start using gen8_for_each_pde macro to allocate page tables.

v2: teardown_va_range references removed.
v3: Rebase after s/page_tables/page_table/.
v4: Keep setting up page tables for all page directories in systems with
less than 4GB of memory.
v5: Also initialize the page tables. (Mika)
v6: Initialize all page tables, including the extra ones from systems
with less than 4GB of memory. (Mika)

Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/gen8: page directories rework allocation

Start using gen8_for_each_pdpe macro to allocate the page directories.

Similar to PTs, while setting up a page directory, make all entries of
the pd point to the scratch pd before mapping (and make all its entries
point to the scratch page); this is to be safe in case of out of bound
access or proactive prefetch. Systems without LLC require an explicit
flush.

v2: Rebased after s/free_pt_*/unmap_and_free_pt/ change.
v3: Rebased after teardown va range logic was removed.
v4: Keep setting up all page directories for systems with less than 4GB
of memory.
v5: Initialize PDs. (Mika)
v6: Initialize also the extra PDs from systems with less than 4GB of
memory. (Mika)

Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/gen8: Add dynamic allocation macros and helper functions

Similar to gen6, we will use for_each_pde/for_each_pdpe
and pte/pde/pdpe_index to iterate over these new structures.

v2: Match trace_i915_va_teardown params
v3: Multiple rebases.
v4: Updated to use unmap_and_free_pt.
v5: teardown_va_range logic no longer needed.
v6: Rebase after s/page_tables/page_table/.
v7: Renamed commit to match what it does now (it was "Use dynamic
allocation idioms on free").
v8: Prevent (harmless) out of range access in gen8_for_each_pde and
gen8_for_each_pdpe_e.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
[danvet: s/BUG/WARN/]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/gen8: Initialize page tables

Similar to gen6, while setting up a page table, make all entries of the
pt point to the scratch page before mapping; this is to be safe in case
of out of bound access or proactive prefetch.

Systems without LLC require an explicit flush.

v2: Expanded commit text and fixed indentation (Mika)

Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Remove unnecessary gen8_ppgtt_unmap_pages

We are already unmapping them in gen8_ppgtt_free. This function became
redundant since commit 06fda602dbca9c59d87db7da71192e4b54c9f5ff
("drm/i915: Create page table allocators").

Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Remove _entry from PPGTT page structures

Lets try to keep this consistent:

Page Directory Pointer (PDP).
Page Directory (PD), also known as page directory pointer entries.
Page Table (PT), also known as page directory entries.

s/struct i915_page_table_entry/struct i915_page_table/
s/struct i915_page_directory_entry/struct i915_page_directory/
s/struct i915_page_directory_pointer_entry/struct
i915_page_directory_pointer/

Suggested-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Record ring->start address in error state

This is mostly useful for execlists where the rings switch between
contexts (and so checking that the ring's start register matches the
context is important).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Suppress empty lines from debugfs/i915_gem_objects

This is just so that I don't have to read about the batch pool on
systems that are not using it! Rather than using a newline between the
kernel clients and userspace clients, just distinguish the internal
allocations with a '[k]'

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Include active flag when describing objects in debugfs

Since we use obj->active as a hint in many places throughout the code,
knowing its state in debugfs is extremely useful.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Split batch pool into size buckets

Now with the trimmed memcpy before the command parser, we try to
allocate many different sizes of batches, predominantly one or two
pages. We can therefore speed up searching for a good sized batch by
keeping the objects of buckets of roughly the same size.

v2: Add a comment about bucket sizes

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Free batch pool when idle

At runtime, this helps ensure that the batch pools are kept trim and
fast. Then at suspend, this releases memory that we do not need to
restore. It also ties into the oom-notifier to ensure that we recover as
much kernel memory as possible during OOM.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Split the batch pool by engine

I woke up one morning and found 50k objects sitting in the batch pool
and every search seemed to iterate the entire list... Painting the
screen in oils would provide a more fluid display.

One issue with the current design is that we only check for retirements
on the current ring when preparing to submit a new batch. This means
that we can have thousands of "active" batches on another ring that we
have to walk over. The simplest way to avoid that is to split the pools
per ring and then our LRU execution ordering will also ensure that the
inactive buffers remain at the front.

v2: execlists still requires duplicate code.
v3: execlists requires more duplicate code

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Tidy batch pool logic

Move the madvise logic out of the execbuffer main path into the
relatively rare allocation path, making the execbuffer manipulation less
fragile.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Split i915_gem_batch_pool into its own header

In the next patch, I want to use the structure elsewhere and so require
it defined earlier. Rather than move the definition to an earlier location
where it feels very odd, place it in its own header file.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Re-enable RPS wait-boosting for all engines

This reverts commit ec5cc0f9b019af95e4571a9fa162d94294c8d90b
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Thu Jun 12 10:28:55 2014 +0100

drm/i915: Restrict GPU boost to the RCS engine

The premise that media/blitter workloads are not affected by boosting is
patently false with a trip through igt. The question that remains is
what exactly is going wrong with the media workload that prompted this?
Hopefully that would be fixed by the missing agressive downclocking, in
addition to the extra restrictions imposed on how frequent a process is
allowed to boost.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Deepak S <deepak.s@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll>
Acked-by: Deepak S <deepak.s@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Deminish contribution of wait-boosting from clients

With boosting for missed pageflips, we have a much stronger indication
of when we need to (temporarily) boost GPU frequency to ensure smooth
delivery of frames. So now only allow each client to perform one RPS boost
in each period of GPU activity due to stalling on results.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Deepak S <deepak.s@linux.intel.com>
Reviewed-by: Deepak S <deepak.s@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Boost GPU frequency if we detect outstanding pageflips

If we hit a vblank and see that have a pageflip queue but not yet
processed, ensure that the GPU is running at maximum in order to clear
the backlog. Pageflips are only queued for the following vblank, if we
miss it, there will be a visible stutter. Boosting the GPU frequency
doesn't prevent us from missing the target vblank, but it should help
the subsequent frames hitting theirs.

v2: Reorder vblank vs flip-complete so that we only check for a missed
flip after processing the completion events, and avoid spurious boosts.

v3: Rename missed_vblank
v4: Rebase
v5: Cancel the outstanding work in runtime suspend
v6: Rebase
v7: Rebase required fixing

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Deepak S<deepak.s@linux.intel.com>
Reviewed-by: Deepak S<deepak.s@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Fix computation of last_adjustment for RPS autotuning

The issue is that by computing the last_adj value after applying the
clamping, we can end up with a bogus value for feeding into the next RPS
autotuning step.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Deepak S <deepak.s@linux.intel.com>
Reviewed-by: Deepak S <deepak.s@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Agressive downclocking on Baytrail

Reuse the same reclocking strategy for Baytail as on its bigger brethren,
Sandybridge and Ivybridge. In particular, this makes the device quicker
to reclock (both up and down) though the tendency now is to downclock
more aggressively to compensate for the RPS boosts.

v2: Rebase
v3: Exclude Cherrytrail as Deepak was concerned that the increased
number of register writes would wake the common powerwell too often.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Deepak S <deepak.s@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Deepak S <deepak.s@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Fix the flip synchronisation to consider mmioflips

Currently we emit semaphore synchronisation as if we were going to flip
using the target CS engine, but we then change our minds and do the flip
using the CPU. Consequently we write instructions to the ring but never
use them - even to the point of filling that ring up entirely and never
submitting a request.

The wrinkle in the ointment is that we have to tell a white lie to
pin-to-display for it to skip the synchronisation for mmioflips as we
will create a task specifically for that slow synchronisation. An oddity
of note is the discrepancy in requests that we tell to pin-display to
serialise to and that we then eventually wait upon. This is due to a
limitation in the i915_gem_object_sync() routine that will be lifted
later.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Cache last obj->pages location for i915_gem_object_get_page()

The biggest user of i915_gem_object_get_page() is the relocation
processing during execbuffer. Typically userspace passes in a set of
relocations in sorted order. Sadly, we alternate between relocations
increasing from the start of the buffers, and relocations decreasing
from the end. However the majority of consecutive lookups will still be
in the same page. We could cache the start of the last sg chain, however
for most callers, the entire sgl is inside a single chain and so we see
no improve from the extra layer of caching.

v2: Avoid the double increment inside unlikely()

References: https://bugs.freedesktop.org/show_bug.cgi?id=88308
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/skl: Implement WaDisableVFUnitClockGating

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/skl: Fix stepping check for a couple of W/As

Both WaDisableSDEUnitClockGating and WaSetGAPSunitClckGateDisable are
needed on B0 as well.

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Do not set L3-LLC Coherency bit in ctx descriptor

According to Spec this is a reserved bit for Gen9+ and should not be set.

Change-Id: I0215fb7057b94139b7a2f90ecc7a0201c0c93ad4
Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: use kref_put_mutex in i915_gem_request_unreference__unlocked

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Don't use staged config in intel_mst_pre_enable_dp()

For the conversion to atomic. The pre_enable() hooks are called as part
of the crtc enable sequence, at which point the staged config was
already made effective. Furthermore, the function actually changes
hardware state, so it should anyway deal with current and not staged
config.

Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Don't use staged config in check_encoder_cloning()

Reduce dependency on the staged config by using the atomic state
instead.

Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Don't use staged config in check_digital_port_conflicts()

Reduce dependency on the staged config by using the atomic state
instead.

Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>