git.openwrt.org Git - openwrt/staging/blogic.git/log

drm/i915/bxt: don't allow cached GEM mappings on A stepping

Due to a coherency issue on BXT A steppings we can't guarantee a
coherent view of cached (CPU snooped) GPU mappings, so fail such
requests. User space is supposed to fall back to uncached mappings in
this case.

v2:
- limit the WA to A steppings, on later stepping this HW issue is fixed
v3:
- return error instead of trying to work around the issue in kernel,
since that could confuse user space (Chris)

Testcast: igt/gem_store_dword_batches_loop/cached-mapping
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/bxt: work around HW coherency issue when accessing GPU seqno

By running igt/store_dword_loop_render on BXT we can hit a coherency
problem where the seqno written at GPU command completion time is not
seen by the CPU. This results in __i915_wait_request seeing the stale
seqno and not completing the request (not considering the lost
interrupt/GPU reset mechanism). I also verified that this isn't a case
of a lost interrupt, or that the command didn't complete somehow: when
the coherency issue occured I read the seqno via an uncached GTT mapping
too. While the cached version of the seqno still showed the stale value
the one read via the uncached mapping was the correct one.

Work around this issue by clflushing the corresponding CPU cacheline
following any store of the seqno and preceding any reading of it. When
reading it do this only when the caller expects a coherent view.

v2:
- fix using the proper logical && instead of a bitwise & (Jani, Mika)
- limit the workaround to A stepping, on later steppings this HW issue
is fixed
v3:
- use a separate get_seqno/set_seqno vfunc (Chris)

Testcase: igt/store_dword_loop_render
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Also call frontbuffer flip when disabling planes.

We also need to call the frontbuffer flip to trigger proper
invalidations when disabling planes. Otherwise we will miss
screen updates when disabling sprites or cursor.

On core platforms where HW tracking also works, this issue
is totally masked because HW tracking triggers PSR exit
however on VLV/CHV that has only SW tracking we miss screen
updates when disabling planes.

It was caught with kms_psr_sink_crc sprite_plane_onoff
and cursor_plane_onoff subtests running on VLV/CHV.

This is probably a regression since I can also get this
with the manual test case, but with so many changes on atomic
modeset I couldn't track exactly when this was introduced.

Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Change SRM, LRM instructions to use correct length

MI_STORE_REGISTER_MEM, MI_LOAD_REGISTER_MEM instructions are not really
variable length instructions unlike MI_LOAD_REGISTER_IMM where it expects
(reg, addr) pairs so use fixed length for these instructions.

v2: rebase

Cc: Dave Gordon <david.s.gordon@intel.com>
Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
[danvet: Appease checkpatch as Mika spotted in i915_reg.h - it seems
terminally unhappy about i915_cmd_parser.c so that would be a separate
patch.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

doc: drm: Fix mis-spelling of i915_guc_submission includes

In commit
d1675198e: drm/i915: Integrate GuC-based command submission

the drm.tmpl include lines reference the intel_guc_submission.c but the
patch adds the file i915_guc_submission.c. drm.tmpl fails to build with:
docproc: .//drivers/gpu/drm/i915/intel_guc_submission.c:
No such file or directory

Change the file reference to the actual file.

Signed-off-by: Graham Whaley <graham.whaley@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: remove excessive scaler debugging messages

There's so much scaler debugging messages that it makes other debugging
hard. Remove them.

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Debugfs interface for GuC submission statistics

This provides a means of reading status and counts relating
to GuC actions and submissions.

v2:
    Remove surplus blank line in output [Chris Wilson]

v5:
    Added GuC per-engine submission & seqno statistics

v6:
    Add per-ring statistics to client, refactor client-dumper.

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Signed-off-by: Alex Dai <yu.dai@intel.com>
Reviewed-by: Tom O'Rourke <Tom.O'Rourke@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Integrate GuC-based command submission

GuC-based submission is mostly the same as execlist mode, up to
intel_logical_ring_advance_and_submit(), where the context being
dispatched would be added to the execlist queue; at this point
we submit the context to the GuC backend instead.

There are, however, a few other changes also required, notably:
1.  Contexts must be pinned at GGTT addresses accessible by the GuC
    i.e. NOT in the range [0..WOPCM_SIZE), so we have to add the
    PIN_OFFSET_BIAS flag to the relevant GGTT-pinning calls.

2.  The GuC's TLB must be invalidated after a context is pinned at
    a new GGTT address.

3.  GuC firmware uses the one page before Ring Context as shared data.
    Therefore, whenever driver wants to get base address of LRC, we
    will offset one page for it. LRC_PPHWSP_PN is defined as the page
    number of LRCA.

4.  In the work queue used to pass requests to the GuC, the GuC
    firmware requires the ring-tail-offset to be represented as an
    11-bit value, expressed in QWords. Therefore, the ringbuffer
    size must be reduced to the representable range (4 pages).

v2:
    Defer adding #defines until needed [Chris Wilson]
    Rationalise type declarations [Chris Wilson]

v4:
    Squashed kerneldoc patch into here [Daniel Vetter]

v5:
    Update request->tail in code common to both GuC and execlist modes.
    Add a private version of lr_context_update(), as sharing the
        execlist version leads to race conditions when the CPU and
        the GuC both update TAIL in the context image.
    Conversion of error-captured HWS page to string must account
        for offset from start of object to actual HWS (LRC_PPHWSP_PN).

Issue: VIZ-4884
Signed-off-by: Alex Dai <yu.dai@intel.com>
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Tom O'Rourke <Tom.O'Rourke@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Interrupt routing for GuC submission

Turn on interrupt steering to route necessary interrupts to GuC.

v6:
Rebased

Issue: VIZ-4884
Signed-off-by: Alex Dai <yu.dai@intel.com>
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Tom O'Rourke <Tom.O'Rourke@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Implementation of GuC submission client

A GuC client has its own doorbell and workqueue. It maintains the
doorbell cache line, process description object and work queue item.

A default guc_client is created for the i915 driver to use for
normal-priority in-order submission.

Note that the created client is not yet ready for use; doorbell
allocation will fail as we haven't yet linked the GuC's context
descriptor to the default contexts for each ring (see later patch).

v2:
    Defer adding structure members until needed [Chris Wilson]
    Rationalise type declarations [Chris Wilson]

v5:
    Add GuC per-engine submission & seqno statistics.
    Move wq locking to encompass both get_space() and add_item().
    Take forcewake lock in host2guc_action() [Tom O'Rourke]

v6:
    Fix GuC doorbell cacheline selection code (the
        cacheline-within-page calculation was wrong).
    Rename GuC priorities to make them closer to the names used in
        the GuC firmware source, matching what the autogenerated
        versions will (probably) be.
    Add per-ring statistics to client.

Issue: VIZ-4884
Signed-off-by: Alex Dai <yu.dai@intel.com>
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Tom O'Rourke <Tom.O'Rourke@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Enable GuC firmware log

Allocate a GEM object to hold GuC log data. A debugfs interface
(i915_guc_log_dump) is provided to print out the log content.

v2:
Add struct members at point of use [Chris Wilson]

v6:
Rebased

Issue: VIZ-4884
Signed-off-by: Alex Dai <yu.dai@intel.com>
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Tom O'Rourke <Tom.O'Rourke@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Prepare for GuC-based command submission

This adds the first of the data structures used to communicate with the
GuC (the pool of guc_context structures).

We create a GuC-specific wrapper round the GEM object allocator as all
GEM objects shared with the GuC must be pinned into GGTT space at an
address that is NOT in the range [0..WOPCM_TOP), as that range of GGTT
addresses is not accessible to the GuC (from the GuC's point of view,
it's permanently reserved for other objects such as the BootROM & SRAM).

Later, we will need to allocate additional GuC-sharable objects for the
submission client(s) and the GuC's debug log.

v2:
    Remove redundant initialisation [Chris Wilson]
    Defer adding struct members until needed [Chris Wilson]
    Local functions should pass dev_priv rather than dev [Chris Wilson]

v5:
    Invalidate GuC TLB after allocating and pinning a new object

v6:
    Rebased

Issue: VIZ-4884
Signed-off-by: Alex Dai <yu.dai@intel.com>
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Tom O'Rourke <Tom.O'Rourke@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Expose one LRC function for GuC submission mode

GuC submission is basically execlist submission, but with the GuC
handling the actual writes to the ELSP and the resulting context
switch interrupts.  So to describe a context for submission via
the GuC, we need one of the same functions used in execlist mode.
This commit exposes one such function, changing its name to better
describe what it does (it's related to logical ring contexts rather
than to execlists per se).

v2:
    Replaces previous "drm/i915: Move execlists defines from .c to .h"

v3:
    Incorporates a change to one of the functions exposed here that was
        previously part of an internal patch, but which was omitted from
        the version recently committed to drm-intel-nightly:
    7a01a0a drm/i915/lrc: Update PDPx registers with lri commands
        So we reinstate this change here.

v4:
    Drop v3 change, update function parameters due to collision with
        8ee3615 drm/i915: Convert execlists_ctx_descriptor() for requests

v5:
    Don't expose execlists_update_context() after all. The current
        version is no longer compatible with GuC submission; trying to
        share the execlist version of this function results in both GuC
        and CPU updating TAIL in the context image, with bad results when
        they get out of step. The GuC submission path now has its own
        private version that just updates the ringbuffer start address,
        and not TAIL or PDPx.

v6:
    Rebased

Issue: VIZ-4884
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Tom O'Rourke <Tom.O'Rourke@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Debugfs interface to read GuC load status

The new node provides access to the status of the GuC-specific loader;
also the scratch registers used for communication between the i915
driver and the GuC firmware.

v2:
Changes to output formats per Chris Wilson's suggestions

v6:
Rebased

Issue: VIZ-4884
Signed-off-by: Alex Dai <yu.dai@intel.com>
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Tom O'Rourke <Tom.O'Rourke@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: GuC-specific firmware loader

This fetches the required firmware image from the filesystem,
then loads it into the GuC's memory via a dedicated DMA engine.

This patch is derived from GuC loading work originally done by
Vinit Azad and Ben Widawsky.

v2:
    Various improvements per review comments by Chris Wilson

v3:
    Removed 'wait' parameter to intel_guc_ucode_load() as firmware
        prefetch is no longer supported in the common firmware loader,
per Daniel Vetter's request.
    Firmware checker callback fn now returns errno rather than bool.

v4:
    Squash uC-independent code into GuC-specifc loader [Daniel Vetter]
    Don't keep the driver working (by falling back to execlist mode)
        if GuC firmware loading fails [Daniel Vetter]

v5:
    Clarify WOPCM-related #defines [Tom O'Rourke]
    Delete obsolete code no longer required with current h/w & f/w
        [Tom O'Rourke]
    Move the call to intel_guc_ucode_init() later, so that it can
        allocate GEM objects, and have it fetch the firmware; then
intel_guc_ucode_load() doesn't need to fetch it later.
        [Daniel Vetter].

v6:
    Update comment describing intel_guc_ucode_load() [Tom O'Rourke]

Issue: VIZ-4884
Signed-off-by: Alex Dai <yu.dai@intel.com>
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Tom O'Rourke <Tom.O'Rourke@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Kill intel_dp->{link_bw, rate_select}

We only need the link_bw/rate_select parameters when starting link
training, and they should be computed based on the currently active
config, so throw them out from intel_dp and just compute on demand.

Toss in an extra debug print to see rate_select in addition to link_bw,
as the latter may be 0 for eDP 1.4.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Don't use link_bw to select between TP1 and TP3

intel_dp->link_bw is going away, so consul the port_clock instead when
choosing between TP1 and TP3.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Move intel_dp->lane_count into pipe_config

Currently we clobber intel_dp->lane_count in compute config, which means
after a rejected modeset we may no longer be able to retrain the current
link. Move lane_count into pipe_config to avoid that.

v2: Add missing ':' to the pipe config debug dump

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Avoid confusion between DP and TRANS_DP_CTL in DP .get_config()

Use a separate variable for the TRANS_DP_CTL value instead of reusing
'tmp' that otherwise contains the DP port register value.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Don't pass clock to DDI PLL select functions

All the *_ddi_pll_select() functions get passed the port_clock and pipe
config as parameters. We only need to pass the pipe config, and the
functions can dig up the port_clock themselves.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Don't use link_bw for PLL setup

Use port_clock instead of link_bw when picking the PLL parameters for
DP. link_bw may be zero with an eDP 1.4 sink that supports
DP_LINK_RATE_SET so we shouldn't use it for anything other than feed it
to the sink appropriately.

v2: Fix typo in commit message (Sivakumar)

Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Clean up DP/HDMI limited color range handling

Currently we treat intel_{dp,hdmi}->color_range as partly user
controller value (via the property) but we also change it during
.compute_config() when using the "Automatic" mode. That is a bit
confusing, so let's just change things so that we store the user
property values in intel_dp, and only change what's stored in
pipe_config during .compute_config().

There should be no functional change.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Do not check or a stalled pageflip prior to it being queued

When we queue the command or operation to change the scanout address, we
mark the flip as in progress. We can use this flag to prevent us from
checking for a stalled flip prior to its existence!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: clflush on pin_to_display after pwrite to UC bo in LLC

Currently we don't clflush on pin_to_display if the bo is already
UC/WT and is not in the CPU write domain. This causes problems with
pwrite since pwrite doesn't change the write domain, and it avoids
clflushing on UC/WT buffers on LLC platforms unless the buffer is
currently being scanned out.

Fix the problem by marking the cache dirty and adjusting
i915_gem_object_set_cache_level() to clflush when the cache is dirty
even if the cache_level doesn't change.

My last attempt [1] at fixing this via write domain frobbing was shot
down, but now with the cache_dirty flag we can do things in a nicer way.

[1] http://lists.freedesktop.org/archives/intel-gfx/2014-November/055390.html

v2: Drop the I915_CACHE_NONE/WT checks from pwrite

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86422
Testcase: igt/kms_pwrite_crc
Testcase: igt/gem_pwrite_snooped
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/bxt: WA for swapped HPD pins in A stepping

WA for BXT A0/A1, where DDIB's HPD pin is swapped to DDIA, so enabling
DDIA HPD pin in place of DDIB.

v2: For DP, irq_port is used to determine the encoder instead of
hpd_pin and removing the edp HPD logic because port A HPD is not
present(Imre)
v3: Rebased on top of Imre's patchset for enabling HPD on PORT A.
Added hpd_pin swapping for intel_dp_init_connector, setting encoder
for PORT_A as per the WA in irq_port (Imre)
v4: Dont enable interrupt for edp, also reframe the description (Siva)
v5: Don’t check for PORT_A in intel_ddi_init to update dig_port,
instead avoid setting hpd_pin itself (Imre)

Signed-off-by: Sonika Jindal <sonika.jindal@intel.com>
Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/bxt: Add HPD support for DDIA

Also remove redundant comments.

Signed-off-by: Sonika Jindal <sonika.jindal@intel.com>
Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Always pass dev pointer in pdp_init

And fix 0-DAY kernel test infrastructure warning.

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Use complete virtual address range on 32-bit platforms

With the offset length being taken care of in ("drm/i915/gtt: Allow >=
4GB offsets in X86_32"), the code should be finally safe in 32-bit
kernels.

This reverts commit 501fd70fcaebc911b6b96a7b331e6960e5af67e7
Author: Michel Thierry <michel.thierry@intel.com>
Date: Fri May 29 14:15:05 2015 +0100

drm/i915: limit PPGTT size to 2GB in 32-bit platforms

Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/gtt: Allow >= 4GB offsets in X86_32

Similar to commit c44ef60e437019b8ca1dab8b4d2e8761fd4ce1e9 ("drm/i915/gtt:
Allow >= 4GB sizes for vm"), i915_gem_obj_offset and i915_gem_obj_ggtt_offset
return an unsigned long, which in only 4-bytes long in 32-bit kernels.

Change return type (and other related offset variables) to u64.

Since Global GTT is always limited to 4GB, this change would not be required
in i915_gem_obj_ggtt_offset, but this is done for consistency.

v2: Remove unnecessary offset variable in do_pin, as we already have
vma->node.start (Chris).
Update GGTT offset too (Tvrtko).

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Dont -ETIMEDOUT on identical new and previous (count, crc).

By Vesa DP 1.2 spec TEST_CRC_COUNT is a "4 bit wrap counter which
increments each time the TEST_CRC_x_x are updated."

However if we are trying to verify the screen hasn't changed we get
same (count, crc) pair twice. Without this patch we would return
-ETIMEOUT in this case.

So, if in 6 vblanks the pair (count, crc) hasn't changed we
return it anyway instead of returning error and let test case decide
if it was right or not.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Save latest known sink CRC to compensate delayed counter reset.

By Vesa DP 1.2 Spec TEST_CRC_COUNT should be
"reset to 0 when TEST_SINK bit 0 = 0."

However for some strange reason when PSR is enabled in
certain platforms this is not true. At least not immediatelly.

So we face cases like this:

first get_sink_crc operation:
     count: 0, crc: 000000000000
     count: 1, crc: c101c101c101
returned expected crc: c101c101c101

secont get_sink_crc operation:
     count: 1, crc: c101c101c101
     count: 0, crc: 000000000000
     count: 1, crc: 0000c1010000
should return expected crc: 0000c1010000

But also the reset to 0 should be faster resulting into:

get_sink_crc operation:
     count: 1, crc: c101c101c101
     count: 1, crc: 0000c1010000
should return expected crc: 0000c1010000

So in order to know that the second one is valid one
we need to compare the pair (count, crc) with latest (count, crc).

If the pair changed you have your valid CRC.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Force sink crc stop before start.

By Vesa DP spec, test counter at DP_TEST_SINK_MISC just reset to 0
when unsetting DP_TEST_SINK_START, so let's force this stop here.

But let's minimize the aux transactions and just do it when we know
it hasn't been properly stoped.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/userptr: Kill user_size limit check

GTT was only 32b and its max value is 4GB. In order to allow objects
bigger than 4GB in 48b PPGTT, i915_gem_userptr_ioctl we could check
against max 48b range (1ULL << 48).

But since the check no longer applies, just kill the limit.

v2: Use the default ctx to infer the ppgtt max size (Akash).
v3: Just kill the limit, it was only there for early detection of an
error when used for execbuffer (Chris).

Cc: Akash Goel <akash.goel@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: batch_obj vm offset must be u64

Otherwise it can overflow in 48-bit mode, and cause an incorrect
exec_start.

Before commit 5f19e2bffa63a91cd4ac1adcec648e14a44277ce ("drm/i915: Merged
the many do_execbuf() parameters into a structure"), it was already an u64.

Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: object size needs to be u64

In a 48b world, users can try to allocate buffers bigger than 4GB; in
these cases it is important that size is a 64b variable.

v2: Drop the warning about bind with size 0, it shouldn't happen anyway.
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/gen8: Add ppgtt info and debug_dump

v2: Clean up patch after rebases.
v3: gen8_dump_ppgtt for 32b and 48b PPGTT.
v4: Use used_pml4es/pdpes (Akash).
v5: Rebase after Mika's ppgtt cleanup / scratch merge patch series.
v6: Rely on used_px bits instead of null checking (Akash)

Cc: Akash Goel <akash.goel@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
Reviewed-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Expand error state's address width to 64b

v2: For semaphore errors, object is mapped to GGTT and offset will not
be > 4GB, print only lower 32-bits (Akash)
v3: Print gtt_offset in groups of 32-bit (Chris)

Cc: Akash Goel <akash.goel@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/gen8: Initialize PDPs and PML4

Similar to PDs, while setting up a page directory pointer, make all entries
of the pdp point to the scratch pd before mapping (and make all its entries
point to the scratch page); this is to be safe in case of out of bound
access or proactive prefetch.

Also add a scratch pdp, which the PML4 entries point to.

v2: Handle scratch_pdp allocation failure correctly, and keep
initialize_px functions together (Akash)
v3: Rebase after Mika's ppgtt cleanup / scratch merge patch series. Rely on
the added macros to initialize the pdps.
v4: Rebase after final merged version of Mika's ppgtt/scratch patches
(and removed commit message part related to v3).
v5: Update commit message to also mention PML4 table initialization and
the new scratch pdp (Akash).

Suggested-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/gen8: Add 4 level support in insert_entries and clear_range

When 48b is enabled, gen8_ppgtt_insert_entries needs to read the Page Map
Level 4 (PML4), before it selects which Page Directory Pointer (PDP)
it will write to.

Similarly, gen8_ppgtt_clear_range needs to get the correct PDP/PD range.

This patch was inspired by Ben's "Depend exclusively on map and
unmap_vma".

v2: Rebase after s/page_tables/page_table/.
v3: Remove unnecessary pdpe loop in gen8_ppgtt_clear_range_4lvl and use
clamp_pdp in gen8_ppgtt_insert_entries (Akash).
v4: Merge gen8_ppgtt_clear_range_4lvl into gen8_ppgtt_clear_range to
maintain symmetry with gen8_ppgtt_insert_entries (Akash).
v5: Do not mix pages and bytes in insert_entries (Akash).
v6: Prevent overflow in sg_nents << PAGE_SHIFT, when inserting 4GB at
once.
v7: Rebase after Mika's ppgtt cleanup / scratch merge patch series.
Use gen8_px_index functions, and remove unnecessary number of pages
parameter in insert_pte_entries.
v8: Change gen8_ppgtt_clear_pte_range to stop at PDP boundary, instead of
adding and extra clamp function; remove unnecessary pdp_start/pdp_len
variables (Akash).
v9: pages->orig_nents instead of sg_nents(pages->sgl) to get the
length (Akash).
v10: Remove pdp warning check ingen8_ppgtt_insert_pte_entries until this
commit (Akash).

Reviewed-by: Akash Goel <akash.goel@intel.com> (v9)
Cc: Akash Goel <akash.goel@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/gen8: Pass sg_iter through pte inserts

As a step towards implementing 4 levels, while not discarding the
existing pte insert functions, we need to pass the sg_iter through.
The current function understands to the page directory granularity.
An object's pages may span the page directory, and so using the iter
directly as we write the PTEs allows the iterator to stay coherent
through a VMA insert operation spanning multiple page table levels.

v2: Rebase after s/page_tables/page_table/.
v3: Rebase after Mika's ppgtt cleanup / scratch merge patch series;
updated commit message (s/map/insert).
v4: Rebase.

Reviewed-by: Akash Goel <akash.goel@intel.com> (v3)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/gen8: Add 4 level switching infrastructure and lrc support

In 64b (48bit canonical) PPGTT addressing, the PDP0 register contains
the base address to PML4, while the other PDP registers are ignored.

In LRC, the addressing mode must be specified in every context
descriptor, and the base address to PML4 is stored in the reg state.

v2: PML4 update in legacy context switch is left for historic reasons,
the preferred mode of operation is with lrc context based submission.
v3: s/gen8_map_page_directory/gen8_setup_page_directory and
s/gen8_map_page_directory_pointer/gen8_setup_page_directory_pointer.
Also, clflush will be needed for bxt. (Akash)
v4: Squashed lrc-specific code and use a macro to set PML4 register.
v5: Rebase after Mika's ppgtt cleanup / scratch merge patch series.
PDP update in bb_start is only for legacy 32b mode.
v6: Rebase after final merged version of Mika's ppgtt/scratch
patches.
v7: There is no need to update the pml4 register value in
execlists_update_context. (Akash)
v8: Move pd and pdp setup functions to a previous patch, they do not
belong here. (Akash)
v9: Check USES_FULL_48BIT_PPGTT instead of GEN8_CTX_ADDRESSING_MODE in
gen8_emit_bb_start to check if emit pdps is needed. (Akash)

Cc: Akash Goel <akash.goel@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
Reviewed-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/gen8: implement alloc/free for 4lvl

PML4 has no special attributes, and there will always be a PML4.
So simply initialize it at creation, and destroy it at the end.

The code for 4lvl is able to call into the existing 3lvl page table code
to handle all of the lower levels.

v2: Return something at the end of gen8_alloc_va_range_4lvl to keep the
compiler happy. And define ret only in one place.
Updated gen8_ppgtt_unmap_pages and gen8_ppgtt_free to handle 4lvl.
v3: Use i915_dma_unmap_single instead of pci API. Fix a
couple of incorrect checks when unmapping pdp and pd pages (Akash).
v4: Call __pdp_fini also for 32b PPGTT. Clean up alloc_pdp param list.
v5: Prevent (harmless) out of range access in gen8_for_each_pml4e.
v6: Simplify alloc_vma_range_4lvl and gen8_ppgtt_init_common error
paths. (Akash)
v7: Rebase, s/gen8_ppgtt_free_*/gen8_ppgtt_cleanup_*/.
v8: Change location of pml4_init/fini. It will make next patches
cleaner.
v9: Rebase after Mika's ppgtt cleanup / scratch merge patch series, while
trying to reuse as much as possible for pdp alloc. pml4_init/fini
replaced by setup/cleanup_px macros.
v10: Rebase after Mika's merged ppgtt cleanup patch series.
v11: Rebase after final merged version of Mika's ppgtt/scratch
patches.
v12: Fix pdpe start value in trace (Akash)
v13: Define all 4lvl functions in this patch directly, instead of
previous patches, add i915_page_directory_pointer_entry_alloc here,
use test_bit to detect when pdp is already allocated (Akash).
v14: Move pdp allocation into a new gen8_ppgtt_alloc_page_dirpointers
funtion, as we do for pds and pts; move pd and pdp setup functions to
this patch (Akash).
v15: Added kfree(pdp) from previous patch to this (Akash).

Cc: Akash Goel <akash.goel@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
Reviewed-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/gen8: Add PML4 structure

Introduces the Page Map Level 4 (PML4), ie. the new top level structure
of the page tables.

To facilitate testing, 48b mode will be available on Broadwell and
GEN9+, when i915.enable_ppgtt = 3.

v2: Remove unnecessary CONFIG_X86_64 checks, ppgtt code is already
32/64-bit safe (Chris).
v3: Add goto free_scratch in temp 48-bit mode init code (Akash).
v4: kfree the pdp until the 4lvl alloc/free patch (Akash).
v5: Postpone 48-bit code in sanitize_enable_ppgtt (Akash).
v6: Keep _insert_pte_entries changes outside this patch (Akash).

Cc: Akash Goel <akash.goel@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/gen8: Add dynamic page trace events

The dynamic page allocation patch series added it for GEN6, this patch
adds them for GEN8.

v2: Consolidate pagetable/page_directory events
v3: Multiple rebases.
v4: Rebase after s/page_tables/page_table/.
v5: Rebase after Mika's ppgtt cleanup / scratch merge patch series.
v6: Rebase after gen8_map_pagetable_range removal.
v7: Use generic page name (px) in DECLARE_EVENT_CLASS (Akash)
v8: Defer define of i915_page_directory_pointer_entry_alloc (Akash)

Cc: Akash Goel <akash.goel@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v3+)
Reviewed-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/gen8: Generalize PTE writing for GEN8 PPGTT

The insert_entries function was the function used to write PTEs. For the
PPGTT it was "hardcoded" to only understand two level page tables, which
was the case for GEN7. We can reuse this for 4 level page tables, and
remove the concept of insert_entries, which was never viable past 2
level page tables anyway, but it requires a bit of rework to make the
function a bit more generic.

v2: Rebase after Mika's ppgtt cleanup / scratch merge patch series.
v3: Rebase after final merged version of Mika's ppgtt/scratch patches.
v4: Check and warn for NULL value of pdp pointer (Akash).

Cc: Akash Goel <akash.goel@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2)
Reviewed-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/gen8: Abstract PDP usage

Up until now, ppgtt->pdp has always been the root of our page tables.
Legacy 32b addresses acted like it had 1 PDP with 4 PDPEs.

In preparation for 4 level page tables, we need to stop using ppgtt->pdp
directly unless we know it's what we want. The future structure will use
ppgtt->pml4 for the top level, and the pdp is just one of the entries
being pointed to by a pml4e. The temporal pdp local variable will be
removed once the rest of the 4-level code lands.

Also, start passing the vm pointer to the alloc functions, instead of
ppgtt.

v2: Updated after dynamic page allocation changes.
v3: Rebase after s/page_tables/page_table/.
v4: Rebase after changes in "Dynamic page table allocations" patch.
v5: Rebase after Mika's ppgtt cleanup / scratch merge patch series.
v6: Rebase after final merged version of Mika's ppgtt/scratch patches.
v7: Keep pagetable map in-line (and avoid unnecessary for_each_pde
loops), remove redundant ppgtt pointer in _alloc_pagetabs (Akash)
v8: Fix text indentation in _alloc_pagetabs/page_directories (Chris)
v9: Defer gen8_alloc_va_range_4lvl definition until 4lvl is implemented,
clean-up gen8_ppgtt_cleanup [pun intended] (Akash).
v10: Clean-up commit message (Akash).

Cc: Akash Goel <akash.goel@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
Reviewed-by: "Akash Goel" <akash.goel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/gen8: Make pdp allocation more dynamic

This transitional patch doesn't do much for the existing code. However,
it should make upcoming patches to use the full 48b address space a bit
easier.

32-bit ppgtt uses just 4 PDPs, while 48-bit ppgtt will have up-to 512;
this patch prepares the existing functions to query the right number of pdps
at run-time. This also means that used_pdpes should also be allocated during
ppgtt_init, as the bitmap size will depend on the ppgtt address range
selected.

v2: Renamed pdp_free to be similar to pd/pt (unmap_and_free_pdp).
v3: To facilitate testing, 48b mode will be available on Broadwell and
GEN9+, when i915.enable_ppgtt = 3.
v4: Rebase after s/page_tables/page_table/, added extra information
about 4-level page table formats and use IS_ENABLED macro.
v5: Check CONFIG_X86_64 instead of CONFIG_64BIT.
v6: Rebase after Mika's ppgtt cleanup / scratch merge patch series, and
follow
his nomenclature in pdp functions (there is no alloc_pdp yet).
v7: Rebase after merged version of Mika's ppgtt cleanup patch series.
v8: Rebase after final merged version of Mika's ppgtt/scratch patches.
v9: Introduce PML4 (and 48-bit checks) until next patch (Akash).
v10: Also use test_bit to detect when pd/pt are already allocated (Akash)

Cc: Akash Goel <akash.goel@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
Reviewed-by: Akash Goel <akash.goel@intel.com>
[danvet: Amend commit message as suggested by Michel.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Remove unnecessary gen8_clamp_pd

gen8_clamp_pd clamps to the next page directory boundary, but the macro
gen8_for_each_pde already has a check to stop at the page directory
boundary.

Furthermore, i915_pte_count also restricts to the next page table
boundary.

v2: Rebase after Mika's ppgtt cleanup / scratch merge patch series.

Suggested-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: "Akash Goel" <akash.goel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Per-DDI I_boost override

An OEM may request increased I_boost beyond the recommended values
by specifying an I_boost value to be applied to all swing entries for
a port. These override values are specified in VBT.

v2: rebase and remove unused iboost_bit variable

Issue: VIZ-5676
Signed-off-by: Antti Koskipaa <antti.koskipaa@linux.intel.com>
Reviewed-by: David Weinehall <david.weinehall@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

Merge tag 'drm-intel-fixes-2015-08-14' into drm-intel-next-fixes

Backmerge drm-intel-fixes because a bunch of atomic patch backporting
we had to do lead to horrible conflicts.

Conflicts:
drivers/gpu/drm/drm_crtc.c
Just a bit of context conflict between -next and -fixes.
drivers/gpu/drm/i915/intel_atomic.c
drivers/gpu/drm/i915/intel_display.c
Atomic conflicts, always pick the code from -next.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>

drm/i915/skl: WaIgnoreDDIAStrap is forever, always init DDI A

There is currently conflicting documentation on which steppings the
workaround is needed, up to C vs. forever. However there is post-C
stepping hardware that doesn't report port presence on DDI A, leading to
black screen on eDP. Assume the strap isn't connected, and try to enable
DDI A on these machines. (We'll still check the VBT for the info in DDI
init.)

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Mika Westerberg <mika.westerberg@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: fix checksum write for automated test reply

DP spec requires the checksum of the last block read to be written
when replying to TEST_EDID_READ. This patch fixes the current code
to do the same.

v2: removed loop for jumping blocks and performed direct addition
as recommended by Daniel

Signed-off-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com>
Reviewed-by: Sonika Jindal <sonika.jindal@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Contain the WA_REG macro

Prevent leaking the if scoping by containing the WA_REG
macro inside its own scope.

Reported-by: Arun Siluvery <arun.siluvery@linux.intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Dave Gordon <david.s.gordon@intel.com>
[danvet: Appease checkpatch.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Remove the failed context from the fpriv->context_idr

If we encounter an allocation failure during ppggt creation (trivial
even with 16Gib+ RAM!), we need to remove the dead context from the
fpriv->context_idr along with the references.

gem_exec_ctx: page allocation failure: order:0, mode:0x8004
CPU: 3 PID: 27272 Comm: gem_exec_ctx Tainted: G W 4.2.0-rc5+ #37
0000000000000000 ffff880086ff7a78 ffffffff816b947a ffff88041ed90038
0000000000008004 ffff880086ff7b08 ffffffff8114b1a5 ffff880086ff7ac8
ffffffff8108d848 0000000000000000 ffffffff81ce84b8 0000000000000000
Call Trace:
[<ffffffff816b947a>] dump_stack+0x45/0x57
[<ffffffff8114b1a5>] warn_alloc_failed+0xd5/0x120
[<ffffffff8108d848>] ? __wake_up+0x48/0x60
[<ffffffff8114e0ed>] __alloc_pages_nodemask+0x73d/0x8e0
[<ffffffffc0472238>] ? i915_gem_execbuffer2+0x148/0x240 [i915]
[<ffffffffc0474240>] __setup_page_dma+0x30/0x110 [i915]
[<ffffffffc0477f61>] gen8_ppgtt_init+0x31/0x2f0 [i915]
[<ffffffffc04785e0>] i915_ppgtt_init+0x30/0x80 [i915]
[<ffffffffc0478928>] i915_ppgtt_create+0x48/0xc0 [i915]
[<ffffffffc046c9c2>] i915_gem_create_context+0x1c2/0x390 [i915]
[<ffffffffc046d9cb>] i915_gem_context_create_ioctl+0x5b/0xa0 [i915]

leading to an oops in i915_gem_context_close. Also note that this
benchmark should not be running out of memory in the first place...

Testcase: igt/benchmark/gem_exec_ctx -b create # ppgtt >= 2
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Report IOMMU enabled status for GPU hangs

The IOMMU for Intel graphics has historically had many issues resulting
in random GPU hangs. Lets include its status when capturing the GPU hang
error state for post-mortem analysis.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Check idle to active before processing CSQ

If idle to active bit is set, the rest of the fields
in CSQ are not valid.

Bail out early if this is the case in order to prevent
rest of the loop inspecting stale values.

This was found by Bspec/code inspection. Doesn't seem to fix any of
the known issues.

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Arun Siluvery <arun.siluvery@linux.intel.com>
[danvet: Add note about how this was found.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Set alternate aux for DDI-E

There is no correspondent Aux channel for DDI-E.

So we need to rely on VBT to let us know witch one
is being used instead.

v2: Removing some trailing spaces and giving proper
credit to Xiong that added a nice way to avoid port
conflicts by setting supports_dp = 0 when using
equivalent aux for DDI-E.

Credits-to: Xiong Zhang <xiong.y.zhang@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Xiong Zhang <xiong.y.zhang@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Set power domain for DDI-E

DDI-E and DDI-A share 4 the same DDI-A lanes.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Xiong Zhang <xiong.y.zhang@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: fix stolen bios_reserved checks

I started digging this when I noticed that the BDW code was just
reserving 1mb by coincidence since it was reading reserved fields.
Then I noticed we didn't have any values set for SNB and earlier, and
that the HSW sizes were wrong. After that, I noticed that the reserved
area has a specific start, and may not exactly end where the stolen
memory ends. I also noticed the base pointer can be zero. So I decided
to just write a single patch fixing everything instead of 20 patches
that would be much harder to review.

This patch may solve random stolen memory corruption/problems on
almost all platforms. Notice that since this is always dealing with
the top of the stolen memory, the problems are not so easy to
reproduce - especially since FBC is still disabled by default.

One of the major differences of this patch is that we now look at both
the size and base address. By only looking at the size we were
assuming that the reserved area was always at the very top of
stolen, which is not always true.

After we merge the patch series that allows user space to allocate
stolen memory we'll be able to write IGT tests that maybe catch the
bugs fixed by this patch.

v2:
  - s/BIOS reserved/stolen reserved/g (Chris)
  - Don't DRM_ERROR if we can't do anything about it (Chris)
  - Improve debug messages (Chris).
  - Use the gen7 version instead of gen6 on HSW. Tom found some
    documentation problems, so I think with gen7 we're on the safer
    side (Tom).

Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Use masked write for Context Status Buffer Pointer

This register needs to be updated with masked writes.

This was found by code inspection and comparison with Bspec and
doesn't seem to fix any known issue.

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
[danvet: Add note about impact.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/skl WaDisableSbeCacheDispatchPortSharing

Add WaDisableSbeCacheDispatchPortSharing:skl

Cc: Arun Siluvery <arun.siluvery@linux.intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Arun Siluvery <arun.siluvery@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Spam less on dp aux send/receive problems

If we encounter frequent problems with dp aux channel
communications, we end up spamming the dmesg with the
exact similar trace and status.

Inject a new backtrace only if we have new information
to share as otherwise we flush out all other important
stuff.

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Handle return value in intel_pin_and_fence_fb_obj, v2.

-EDEADLK has special meaning in atomic, but get_fence may call
i915_find_fence_reg which can return -EDEADLK.

This has special meaning in the atomic world, so convert the error
to -EBUSY for this case.

Changes since v1:
- Add comment in the code.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Only update mode related state if a modeset happened.

The rest will be a noop anyway, since without modeset there will be
no updated dplls and no modeset state to update.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Remove connectors_active.

There are no more users, byebye!

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Remove connectors_active from intel_dp.c, v2.

Now that everything's atomic, checking encoder->base.crtc is enough.
This function doesn't have the locks to dereference crtc->state, but
stealing an encoder bound to any crtc is probably enough reason to warn.

Changes since v1:
- Commit message.

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Remove connectors_active from sanitization, v2.

connectors_active will be removed, so just calculate this instead.

Changes since v1:
- Look for the right pointer in intel_sanitize_encoder.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Get rid of dpms handling.

This is now done completely atomically.
Keep connectors_active for now, but make it mirror crtc_state->active.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Make crtc checking use the atomic state, v2.

Instead of allocating pipe_config on the stack use the old
crtc_state, it's only going to freed from this point on.

All crtc' are now only checked once during modeset,
because false positives can happen with encoders after
dpms changes and to limit the amount of errors for 1 failure.

Changes since v1:
- crtc_state -> old_crtc_state
- state -> old_state

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Remove connectors_active from state checking.

Connectors are updated atomically now, so the only interaction
with the encoder is through base.crtc.

If it's NULL the encoder's not part of any crtc, and if it's
not NULL then active should be equal to crtc_state->active.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Remove some unneeded checks from check_crtc_state.

This is handled by the atomic core now, no need to check this for ourself.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Convert connector checking to atomic, v3.

Right now dpms callbacks can still fiddle with the connector state,
but it can only turn connectors off.

This is remediated by only checking crtc->state->active when the
connector is active, and ignore crtc->state->active when the
connector is off.

connectors_active is no longer checked, and will be removed later
in this series together with dpms.

Another check for !encoder->crtc is performed by check_encoder_state
too, so it can be removed.

Changes since v1:
- Add commit message.
- rename state to old_state.
- Move deletion of mst_port check to mst patch.
Changes since v2:
- Fix a null pointer dereference on MST now hw readout is fixed.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Update atomic state when removing mst connector, v3.

Fully remove the MST connector from the atomic state, and remove the
early returns in check_*_state for MST connectors.

With atomic the state can be made consistent all the time.

Thanks to Sivakumar Thulasimani for the idea of using
drm_atomic_helper_set_config.

Changes since v1:
- Remove the MST check in intel_connector_check_state too.
Changes since v2:
- Use drm_atomic_helper_set_config.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com>
Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Validate the state after an atomic modeset only, and pass the state.

First step in removing dpms and validating atomic state.

There can still be a mismatch in the connector state because the dpms
callbacks are still used, but this can not happen immediately after a modeset.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Make the force_thru workaround atomic, v2.

Set connectors_changed to force a modeset if the panel fitter's force
enabled on eDP.

Changes since v1:
- Use connectors_changed instead of active_changed because it's a
routing update.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Commit planes on each crtc separately.

This patch is based on the upstream commit 5ac1c4bcf073ad and amended
for v4.2 to make sure it works as intended.

Repeated calls to begin_crtc_commit can cause warnings like this:
[  169.127746] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:616
[  169.127835] in_atomic(): 0, irqs_disabled(): 1, pid: 1947, name: kms_flip
[  169.127840] 3 locks held by kms_flip/1947:
[  169.127843]  #0:  (&dev->mode_config.mutex){+.+.+.}, at: [<ffffffff814774bc>] __drm_modeset_lock_all+0x9c/0x130
[  169.127860]  #1:  (crtc_ww_class_acquire){+.+.+.}, at: [<ffffffff814774cd>] __drm_modeset_lock_all+0xad/0x130
[  169.127870]  #2:  (crtc_ww_class_mutex){+.+.+.}, at: [<ffffffff81477178>] drm_modeset_lock+0x38/0x110
[  169.127879] irq event stamp: 665690
[  169.127882] hardirqs last  enabled at (665689): [<ffffffff817ffdb5>] _raw_spin_unlock_irqrestore+0x55/0x70
[  169.127889] hardirqs last disabled at (665690): [<ffffffffc0197a23>] intel_pipe_update_start+0x113/0x5c0 [i915]
[  169.127936] softirqs last  enabled at (665470): [<ffffffff8108a766>] __do_softirq+0x236/0x650
[  169.127942] softirqs last disabled at (665465): [<ffffffff8108ae75>] irq_exit+0xc5/0xd0
[  169.127951] CPU: 1 PID: 1947 Comm: kms_flip Not tainted 4.1.0-rc4-patser+ #4039
[  169.127954] Hardware name: LENOVO 2349AV8/2349AV8, BIOS G1ETA5WW (2.65 ) 04/15/2014
[  169.127957]  ffff8800c49036f0 ffff8800cde5fa28 ffffffff817f6907 0000000080000001
[  169.127964]  0000000000000000 ffff8800cde5fa58 ffffffff810aebed 0000000000000046
[  169.127970]  ffffffff81c5d518 0000000000000268 0000000000000000 ffff8800cde5fa88
[  169.127981] Call Trace:
[  169.127992]  [<ffffffff817f6907>] dump_stack+0x4f/0x7b
[  169.128001]  [<ffffffff810aebed>] ___might_sleep+0x16d/0x270
[  169.128008]  [<ffffffff810aed38>] __might_sleep+0x48/0x90
[  169.128017]  [<ffffffff817fc359>] mutex_lock_nested+0x29/0x410
[  169.128073]  [<ffffffffc01635f0>] ? vgpu_write64+0x220/0x220 [i915]
[  169.128138]  [<ffffffffc017fddf>] ? ironlake_update_primary_plane+0x2ff/0x410 [i915]
[  169.128198]  [<ffffffffc0190e75>] intel_frontbuffer_flush+0x25/0x70 [i915]
[  169.128253]  [<ffffffffc01831ac>] intel_finish_crtc_commit+0x4c/0x180 [i915]
[  169.128279]  [<ffffffffc00784ac>] drm_atomic_helper_commit_planes+0x12c/0x240 [drm_kms_helper]
[  169.128338]  [<ffffffffc0184264>] __intel_set_mode+0x684/0x830 [i915]
[  169.128378]  [<ffffffffc018a84a>] intel_crtc_set_config+0x49a/0x620 [i915]
[  169.128385]  [<ffffffff817fdd39>] ? mutex_unlock+0x9/0x10
[  169.128391]  [<ffffffff81467b69>] drm_mode_set_config_internal+0x69/0x120
[  169.128398]  [<ffffffff8119b547>] ? might_fault+0x57/0xb0
[  169.128403]  [<ffffffff8146bf93>] drm_mode_setcrtc+0x253/0x620
[  169.128409]  [<ffffffff8145c600>] drm_ioctl+0x1a0/0x6a0
[  169.128415]  [<ffffffff810b3b41>] ? get_parent_ip+0x11/0x50
[  169.128424]  [<ffffffff811e9ab8>] do_vfs_ioctl+0x2f8/0x530
[  169.128429]  [<ffffffff810d0fcd>] ? trace_hardirqs_on+0xd/0x10
[  169.128435]  [<ffffffff812e7676>] ? selinux_file_ioctl+0x56/0x100
[  169.128439]  [<ffffffff811e9d71>] SyS_ioctl+0x81/0xa0
[  169.128445]  [<ffffffff81800697>] system_call_fastpath+0x12/0x6f

Solve it by using the newly introduced drm_atomic_helper_commit_planes_on_crtc.

The problem here was that the drm_atomic_helper_commit_planes() helper
we were using was basically designed to do

    begin_crtc_commit(crtc #1)
    begin_crtc_commit(crtc #2)
    ...
    commit all planes
    finish_crtc_commit(crtc #1)
    finish_crtc_commit(crtc #2)

The problem here is that since our hardware relies on vblank evasion,
our CRTC 'begin' function waits until we're out of the danger zone in
which register writes might wind up straddling the vblank, then disables
interrupts; our 'finish' function re-enables interrupts after the
registers have been written.  The expectation is that the operations between
'begin' and 'end' must be performed without sleeping (since interrupts
are disabled) and should happen as quickly as possible.  By clumping all
of the 'begin' calls together, we introducing a couple problems:
* Subsequent 'begin' invocations might sleep (which is illegal)
* The first 'begin' ensured that we were far enough from the vblank that
   we could write our registers safely and ensure they all fell within
   the same frame.  Adding extra delay waiting for subsequent CRTC's
   wasn't accounted for and could put us back into the 'danger zone' for
   CRTC #1.

This commit solves the problem by using a new helper that allows an
order of operations like:

   for each crtc {
        begin_crtc_commit(crtc)  // sleep (maybe), then disable interrupts
        commit planes for this specific CRTC
        end_crtc_commit(crtc)    // reenable interrupts
   }

so that sleeps will only be performed while interrupts are enabled and
we can be sure that registers for a CRTC will be written immediately
once we know we're in the safe zone.

The crtc->config->base.crtc update may seem unrelated, but the helper
will use it to obtain the crtc for the state. Without the update it
will dereference NULL and crash.

Changes since v1:
- Use Matt Roper's commit message.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
References: https://bugs.freedesktop.org/show_bug.cgi?id=90398
Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>

drm/i915: calculate primary visibility changes instead of calling from set_config

This should be much cleaner, with the same effects.

(cherry picked for v4.2 from commit fb9d6cf8c29bfcb0b3c602f7ded87f128d730382)
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
References: https://bugs.freedesktop.org/show_bug.cgi?id=90398
Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>

drm/i915: Only dither on 6bpc panels

In

commit d328c9d78d64ca11e744fe227096990430a88477
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Fri Apr 10 16:22:37 2015 +0200

drm/i915: Select starting pipe bpp irrespective or the primary plane

we started to select the pipe bpp from sink capabilities and not from
the primary framebuffer - that one might change (and we don't want to
incur a modeset) and sprites might contain higher bpp content too.

We also selected dithering on a 8 bpc screen displaying a 24bpp rgb
primary, because pipe_bpp is 24 for such a typical 8 bpc sink, but since
the commit mentioned above, base_bpp is always the absolute maximum
supported by the hardware, e.g., 36 bpp on my Ironlake chip. Iow. the
only way to not get dithering would have been to connect a deep color 12
bpc display, so pipe_bpp == 36 == base_bpp.

Hence only enable dithering on 6bpc screens where we difinitely and
always want it.

Cc: Mario Kleiner <mario.kleiner.de@gmail.com>
Reported-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Reviewed-and-tested-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>

Linux 4.2-rc6

Merge branch 'for-linus' of git://git./linux/kernel/git/dtor/input

Pull input subsystem fixes from Dmitry Torokhov:
"Just small ALPS and Elan touchpads, and other driver fixups"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: elantech - add special check for fw_version 0x470f01 touchpad
  Input: twl4030-vibra - fix ERROR: Bad of_node_put() warning
  Input: alps - only Dell laptops have separate button bits for v2 dualpoint sticks
  Input: axp20x-pek - add module alias
  Input: turbografx - fix potential out of bound access

Merge branch 'upstream' of git://git.linux-mips.org/ralf/upstream-linus

Pull MIPS fixes from Ralf Baechle:
"Another round of MIPS fixes for 4.2.  No area does particularly stand
  out but we have a two unpleasant ones:

   - Kernel ptes are marked with a global bit which allows the kernel to
     share kernel TLB entries between all processes.  For this to work
     both entries of an adjacent even/odd pte pair need to have the
     global bit set.  There has been a subtle race in setting the other
     entry's global bit since ~ 2000 but it take particularly
     pathological workloads that essentially do mostly vmalloc/vfree to
     trigger this.

     This pull request fixes the 64-bit case but leaves the case of 32
     bit CPUs with 64 bit ptes unsolved for now.  The unfixed cases
     affect hardware that is not available in the field yet.

   - Instruction emulation requires loading instructions from user space
     but the current fast but simplistic approach will fail on pages
     that are PROT_EXEC but !PROT_READ.  For this reason we temporarily
     do not permit this permission and will map pages with PROT_EXEC |
     PROT_READ.

  The remainder of this pull request is more or less across the field
  and the short log explains them well"

* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
  MIPS: Make set_pte() SMP safe.
  MIPS: Replace add and sub instructions in relocate_kernel.S with addiu
  MIPS: Flush RPS on kernel entry with EVA
  Revert "MIPS: BCM63xx: Provide a plat_post_dma_flush hook"
  MIPS: BMIPS: Delete unused Kconfig symbol
  MIPS: Export get_c0_perfcount_int()
  MIPS: show_stack: Fix stack trace with EVA
  MIPS: do_mcheck: Fix kernel code dump with EVA
  MIPS: SMP: Don't increment irq_count multiple times for call function IPIs
  MIPS: Partially disable RIXI support.
  MIPS: Handle page faults of executable but unreadable pages correctly.
  MIPS: Malta: Don't reinitialise RTC
  MIPS: unaligned: Fix build error on big endian R6 kernels
  MIPS: Fix sched_getaffinity with MT FPAFF enabled
  MIPS: Fix build with CONFIG_OF=y for non OF-enabled targets
  CPUFREQ: Loongson2: Fix broken build due to incorrect include.

Merge branch 'for-linus-4.2' of git://git./linux/kernel/git/mason/linux-btrfs

Pull btrfs fix from Chris Mason:
"We have a btrfs quota regression fix.

  I merged this one on Thursday and have run it through tests against
  current master.

  Normally I wouldn't have sent this while you were finalizing rc6, but
  I'm feeding mosquitoes in the adirondacks next week, so I wanted to
  get this one out before leaving.  I'll leave longer tests running and
  check on things during the week, but I don't expect any problems"

* 'for-linus-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
  btrfs: qgroup: Fix a regression in qgroup reserved space.

Merge branch 'for-rc' of git://git./linux/kernel/git/rzhang/linux

Pull thermal management fixes from Zhang Rui:
"Specifics:

   - fix an error that "weight_attr" sysfs attribute is not removed
     while unbinding.  From: Viresh Kumar.

   - fix power allocator governor tracing to return the real request.
     From Javi Merino.

   - remove redundant owner assignment of hisi platform thermal driver.
     From Krzysztof Kozlowski.

   - a couple of small fixes of Exynos thermal driver.  From Krzysztof
     Kozlowski and Chanwoo Choi"

* 'for-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux:
  thermal: Drop owner assignment from platform_driver
  thermal: exynos: Remove unused code related to platform_data on probe()
  thermal: exynos: Add the dependency of CONFIG_THERMAL_OF instead of CONFIG_OF
  thermal: exynos: Disable the regulator on probe failure
  thermal: power_allocator: trace the real requested power
  thermal: remove dangling 'weight_attr' device file

Merge tag 'arc-v4.2-rc6-fixes' of git://git./linux/kernel/git/vgupta/arc

Pull ARC fixes from Vineet Gupta:
"Here's a late pull request for accumulated ARC fixes which came out of
  extended testing of the new ARCv2 port with LTP etc.  llock/scond
  livelock workaround has been reviewed by PeterZ.  The changes look a
  lot but I've crafted them into finer grained patches for better
  tracking later.

  I have some more fixes (ARC Futex backend) ready to go but those will
  have to wait for tglx to return from vacation.

  Summary:
   - Enable a reduced config of HS38 (w/o div-rem, ll64...)
   - Add software workaround for LLOCK/SCOND livelock
   - Fallout of a recent pt_regs update"

* tag 'arc-v4.2-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
  ARCv2: spinlock/rwlock/atomics: reduce 1 instruction in exponential backoff
  ARC: Make pt_regs regs unsigned
  ARCv2: spinlock/rwlock: Reset retry delay when starting a new spin-wait cycle
  ARCv2: spinlock/rwlock/atomics: Delayed retry of failed SCOND with exponential backoff
  ARC: LLOCK/SCOND based rwlock
  ARC: LLOCK/SCOND based spin_lock
  ARC: refactor atomic inline asm operands with symbolic names
  Revert "ARCv2: STAR 9000837815 workaround hardware exclusive transactions livelock"
  ARCv2: [axs103_smp] Reduce clk for Quad FPGA configs
  ARCv2: Fix the peripheral address space detection
  ARCv2: allow selection of page size for MMUv4
  ARCv2: lib: memset: Don't assume 64-bit load/stores
  ARCv2: lib: memcpy: Missing PREFETCHW
  ARCv2: add knob for DIV_REV in Kconfig
  ARC/time: Migrate to new 'set-state' interface

Merge tag 'for_linus' of git://git./linux/kernel/git/mst/vhost

Pull virtio fix from Michael Tsirkin:
"A last minute fix for the new virtio input driver.  It seems pretty
   obvious, and the problem it's fixing would be quite hard to debug"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  virtio-input: reset device and detach unused during remove

Merge tag 'dm-4.2-fixes-4' of git://git./linux/kernel/git/device-mapper/linux-dm

Pull device mapper fixes from Mike Snitzer:

- stable fix for a dm_merge_bvec() regression on 32 bit Fedora systems.

- fix for a 4.2 DM thinp discard regression due to inability to
   properly delete a range of blocks in a data mapping btree.

* tag 'dm-4.2-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm btree remove: fix bug in remove_one()
  dm: fix dm_merge_bvec regression on 32 bit systems

Merge tag 'sound-4.2-rc6' of git://git./linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
"The only bulk changes in this request is ABI updates for ASoC topology
  API.  It's a new API that was introduced in 4.2, and we'd like to
  avoid ABI change after the release, so it's taken now.  As there is no
  real in-tree user for this API, it should be fairly safe.

  Other than that, the usual small fixes are found in various drivers:
  ASoC cs4265, rt5645, intel-sst, firewire, oxygen and HD-audio"

* tag 'sound-4.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ASoC: topology: Add private data type and bump ABI version to 3
  ASoC: topology: Add ops support to byte controls UAPI
  ASoC: topology: Update TLV support so we can support more TLV types
  ASoC: topology: add private data to manifest
  ASoC: topology: Add subsequence in topology
  ALSA: hda - one Dell machine needs the headphone white noise fixup
  ALSA: fireworks/firewire-lib: add support for recent firmware quirk
  Revert "ALSA: fireworks: add support for AudioFire2 quirk"
  ASoC: topology: fix typo in soc_tplg_kcontrol_bind_io()
  ALSA: HDA: Dont check return for snd_hdac_chip_readl
  ALSA: HDA: Fix stream assignment for host in decoupled mode
  ASoC: rt5645: Fix lost pin setting for DMIC1
  ALSA: oxygen: Fix logical-not-parentheses warning
  ASoC: Intel: sst_byt: fix initialize 'NULL device *' issue
  ASoC: Intel: haswell: fix initialize 'NULL device *' issue
  ASoC: cs4265: Fix setting dai format for Left/Right Justified

Merge tag 'hwmon-for-linus-v4.2-rc6' of git://git./linux/kernel/git/groeck/linux-staging

Pull hwmon fixes from Guenter Roeck:

- Export module alias information in g762 and nct7904 to support
   auto-loading.

- Blacklist Dell Studio XPS 8100 in dell-smm to fix fan control
   problems.

* tag 'hwmon-for-linus-v4.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  hwmon: (g762) Export OF module alias information
  hwmon: (nct7904) Export I2C module alias information
  hwmon: (dell-smm) Blacklist Dell Studio XPS 8100

Merge tag 'usb-4.2-rc6' of git://git./linux/kernel/git/gregkh/usb

Pull USB fixes from Greg KH:
"Here are some USB and PHY fixes for 4.2-rc6 that resolve some reported
  issues.

  All of these have been in the linux-next tree for a while, full
  details on the patches are in the shortlog below"

* tag 'usb-4.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  ARM: dts: dra7: Add syscon-pllreset syscon to SATA PHY
  drivers/usb: Delete XHCI command timer if necessary
  xhci: fix off by one error in TRB DMA address boundary check
  usb: udc: core: add device_del() call to error pathway
  phy: ti-pipe3: i783 workaround for SATA lockup after dpll unlock/relock
  phy-sun4i-usb: Add missing EXPORT_SYMBOL_GPL for sun4i_usb_phy_set_squelch_detect
  USB: sierra: add 1199:68AB device ID
  usb: gadget: f_printer: actually limit the number of instances
  usb: gadget: f_hid: actually limit the number of instances
  usb: gadget: f_uac2: fix calculation of uac2->p_interval
  usb: gadget: bdc: fix a driver crash on disconnect
  usb: chipidea: ehci_init_driver is intended to call one time
  USB: qcserial: Add support for Dell Wireless 5809e 4G Modem
  USB: qcserial/option: make AT URCs work for Sierra Wireless MC7305/MC7355

Merge tag 'staging-4.2-rc6' of git://git./linux/kernel/git/gregkh/staging

Pull staging driver fixes from Greg KH:
"Here are three bugfixes for some staging driver issues that have been
  reported.  All have been in the linux-next tree for a while"

* tag 'staging-4.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
  staging: lustre: Include unaligned.h instead of access_ok.h
  staging: vt6655: vnt_bss_info_changed check conf->beacon_rate is not NULL
  staging: comedi: das1800: add missing break in switch

Merge tag 'char-misc-4.2-rc6' of git://git./linux/kernel/git/gregkh/char-misc

Pull char/misc fixes from Greg KH:
"Here are some extcon fixes for 4.2-rc6 that resolve some reported
  problems.

  All have been in linux-next for a while"

* tag 'char-misc-4.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  extcon: Fix extcon_cable_get_state() from getting old state after notification
  extcon: Fix hang and extcon_get/set_cable_state().
  extcon: palmas: Fix NULL pointer error

Merge tag 'drm-intel-fixes-2015-08-07' of git://anongit.freedesktop.org/drm-intel

Pull drm fixes from Daniel Vetter:
"One i915 regression fix and a drm core one since Dave's not around,
  both introduced in 4.2 so not cc: stable.

  The fix for the warning Ted reported isn't in here yet since he didn't
  yet supply a tested-by and I can't repro this one myself (it's in
  fixup code that needs firmware doing something i915 wouldn't do)"

* tag 'drm-intel-fixes-2015-08-07' of git://anongit.freedesktop.org/drm-intel:
  drm/vblank: Use u32 consistently for vblank counters
  drm/i915: Allow parsing of variable size child device entries from VBT

Input: elantech - add special check for fw_version 0x470f01 touchpad

It is no need to check the packet[0] for sanity check when doing
elantech_packet_check_v4() function for fw_version = 0x470f01 touchpad.

Signed-off by: Duson Lin <dusonlin@emc.com.tw>
Reviewed-by: Ulrik De Bie <ulrik.debie-os@e2big.org>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>

dm btree remove: fix bug in remove_one()

remove_one() was not incrementing the key for the beginning of the
range, so not all entries were being removed. This resulted in
discards that were not unmapping all blocks.

Fixes: 4ec331c3ea ("dm btree: add dm_btree_remove_leaves()")
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

drm/vblank: Use u32 consistently for vblank counters

In

commit 99264a61dfcda41d86d0960cf2d4c0fc2758a773
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Wed Apr 15 19:34:43 2015 +0200

drm/vblank: Fixup and document timestamp update/read barriers

I've switched vblank->count from atomic_t to unsigned long and
accidentally created an integer comparison bug in
drm_vblank_count_and_time since vblanke->count might overflow the u32
local copy and hence the retry loop never succeed.

Fix this by consistently using u32.

Cc: Michel Dänzer <michel@daenzer.net>
Reported-by: Michel Dänzer <michel@daenzer.net>
Reviewed-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>

Merge tag 'asoc-fix-v4.2-rc5' of git://git./linux/kernel/git/broonie/sound into for-linus

ASoC: Fixes for v4.2

There are a couple of small driver specific fixes here but the
overwhelming bulk of these changes are fixes to the topology ABI that
has been newly introduced in v4.2. Once this makes it into a release we
will have to firm this up but for now getting enhancements in before
they've made it into a release is the most expedient thing.

ARCv2: spinlock/rwlock/atomics: reduce 1 instruction in exponential backoff

The increment of delay counter was 2 instructions:
Arithmatic Shfit Left (ASL) + set to 1 on overflow

This can be done in 1 using ROtate Left (ROL)

Suggested-by: Nigel Topham <ntopham@synopsys.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>

Merge git://git./linux/kernel/git/davem/sparc

Pull sparc fix from David Miller:
"FPU register corruption bug fix"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
sparc64: Fix userspace FPU register corruptions.

Merge branch 'akpm' (patches from Andrew)

Merge fixes from Andrew Morton:
"21 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (21 commits)
  writeback: fix initial dirty limit
  mm/memory-failure: set PageHWPoison before migrate_pages()
  mm: check __PG_HWPOISON separately from PAGE_FLAGS_CHECK_AT_*
  mm/memory-failure: give up error handling for non-tail-refcounted thp
  mm/memory-failure: fix race in counting num_poisoned_pages
  mm/memory-failure: unlock_page before put_page
  ipc: use private shmem or hugetlbfs inodes for shm segments.
  mm: initialize hotplugged pages as reserved
  ocfs2: fix shift left overflow
  kthread: export kthread functions
  fsnotify: fix oops in fsnotify_clear_marks_by_group_flags()
  lib/iommu-common.c: do not use 0xffffffffffffffffl for computing align_mask
  mm/slub: allow merging when SLAB_DEBUG_FREE is set
  signalfd: fix information leak in signalfd_copyinfo
  signal: fix information leak in copy_siginfo_to_user
  signal: fix information leak in copy_siginfo_from_user32
  ocfs2: fix BUG in ocfs2_downconvert_thread_do_work()
  fs, file table: reinit files_stat.max_files after deferred memory initialisation
  mm, meminit: replace rwsem with completion
  mm, meminit: allow early_pfn_to_nid to be used during runtime
  ...

sparc64: Fix userspace FPU register corruptions.

If we have a series of events from userpsace, with %fprs=FPRS_FEF,
like follows:

ETRAP
ETRAP
VIS_ENTRY(fprs=0x4)
VIS_EXIT
RTRAP (kernel FPU restore with fpu_saved=0x4)
RTRAP

We will not restore the user registers that were clobbered by the FPU
using kernel code in the inner-most trap.

Traps allocate FPU save slots in the thread struct, and FPU using
sequences save the "dirty" FPU registers only.

This works at the initial trap level because all of the registers
get recorded into the top-level FPU save area, and we'll return
to userspace with the FPU disabled so that any FPU use by the user
will take an FPU disabled trap wherein we'll load the registers
back up properly.

But this is not how trap returns from kernel to kernel operate.

The simplest fix for this bug is to always save all FPU register state
for anything other than the top-most FPU save area.

Getting rid of the optimized inner-slot FPU saving code ends up
making VISEntryHalf degenerate into plain VISEntry.

Longer term we need to do something smarter to reinstate the partial
save optimizations. Perhaps the fundament error is having trap entry
and exit allocate FPU save slots and restore register state. Instead,
the VISEntry et al. calls should be doing that work.

This bug is about two decades old.

Reported-by: James Y Knight <jyknight@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>