Dave Airlie [Mon, 7 Dec 2015 08:17:09 +0000 (18:17 +1000)]
Merge tag 'topic/drm-misc-2015-12-04' of git://anongit.freedesktop.org/drm-intel into drm-next
New -misc pull. Big thing is Thierry's atomic helpers for system suspend
resume, which I'd like to use in i915 too. Hence the pull.
* tag 'topic/drm-misc-2015-12-04' of git://anongit.freedesktop.org/drm-intel:
drm: keep connector status change logging human readable
drm/atomic-helper: Reject attempts at re-stealing encoders
drm/atomic-helper: Implement subsystem-level suspend/resume
drm: Implement drm_modeset_lock_all_ctx()
drm/gma500: Add driver private mutex for the fault handler
drm/gma500: Drop dev->struct_mutex from mmap offset function
drm/gma500: Drop dev->struct_mutex from fbdev init/teardown code
drm/gma500: Drop dev->struct_mutex from modeset code
drm/gma500: Use correct unref in the gem bo create function
drm/edid: Make the detailed timing CEA/HDMI mode fixup accept up to 5kHz clock difference
drm/atomic_helper: Add drm_atomic_helper_disable_planes_on_crtc()
drm: Serialise multiple event readers
drm: Drop dev->event_lock spinlock around faulting copy_to_user()
Jani Nikula [Thu, 3 Dec 2015 12:00:03 +0000 (14:00 +0200)]
drm: keep connector status change logging human readable
We've had human readable connector status change debug logging since
commit
ed7951dc13aad4a14695ec8122e9f0e2ef25d39e
Author: Lespiau, Damien <damien.lespiau@intel.com>
Date: Fri May 10 12:36:42 2013 +0000
drm: Make the HPD status updates debug logs more readable
but
commit
162b6a57ac50eec236530a16c071ffa50e87362a
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Wed Jan 21 08:45:21 2015 +0100
drm/probe-helper: don't lose hotplug event
added a new one with just the numbers. Fix it.
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1449144003-2877-1-git-send-email-jani.nikula@intel.com
Daniel Vetter [Thu, 3 Dec 2015 09:49:14 +0000 (10:49 +0100)]
drm/atomic-helper: Reject attempts at re-stealing encoders
This can happen when we run out of encoders for a multi-crtc modeset,
or also when userspace is silly and tries to clone multiple connectors
that need the same encoder on the same crtc.
Reported-and-Tested-and-Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1449136154-11588-1-git-send-email-daniel.vetter@ffwll.ch
Thierry Reding [Wed, 2 Dec 2015 16:50:04 +0000 (17:50 +0100)]
drm/atomic-helper: Implement subsystem-level suspend/resume
Provide subsystem-level suspend and resume helpers that can be used to
implement suspend/resume on atomic mode-setting enabled drivers.
v2: simplify locking, enhance kerneldoc comments
v3: pass lock acquisition context by parameter, improve kerneldoc
v4: - remove redundant code (already provided by atomic helpers)
(Maarten Lankhorst)
- move backoff dance from drm_modeset_lock_all_ctx() into suspend
helper (Daniel Vetter)
v5: handle potential EDEADLK from drm_atomic_helper_duplicate_state()
and drm_atomic_helper_disable_all() (Daniel Vetter)
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1449075005-13937-2-git-send-email-thierry.reding@gmail.com
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Thierry Reding [Wed, 2 Dec 2015 16:50:03 +0000 (17:50 +0100)]
drm: Implement drm_modeset_lock_all_ctx()
This function is like drm_modeset_lock_all(), but it takes the lock
acquisition context as a parameter rather than storing it in the DRM
device's mode_config structure.
Implement drm_modeset_{,un}lock_all() in terms of the new function for
better code reuse, and add a note to the kerneldoc that new code should
use the new functions.
v2: improve kerneldoc
v4: rename drm_modeset_lock_all_crtcs() to drm_modeset_lock_all_ctx()
and take mode_config's .connection_mutex instead of .mutex lock to
avoid lock inversion (Daniel Vetter), use drm_modeset_drop_locks()
which is now the equivalent of drm_modeset_unlock_all_ctx()
v5: do not take the dev->mode_config.connection_mutex in
drm_atomic_legacy_backoff() since drm_modeset_lock_all_ctx()
already keeps it, enhance kerneldoc for drm_modeset_lock_all_ctx()
(Daniel Vetter)
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1449075005-13937-1-git-send-email-thierry.reding@gmail.com
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Daniel Vetter [Mon, 23 Nov 2015 09:32:53 +0000 (10:32 +0100)]
drm/gma500: Add driver private mutex for the fault handler
There's currently two places where the gma500 fault handler relies
upon dev->struct_mutex:
- To protect r->mappping
- To make sure vm_insert_pfn isn't called concurrently (in which case
the 2nd thread would get an error code).
Everything else (specifically psb_gtt_pin) is already protected by
some other locks. Hence just create a new driver-private mmap_mutex
just for this function.
With this gma500 is complete dev->struct_mutex free!
Cc: Patrik Jakobsson <patrik.r.jakobsson@gmail.com>
Acked-by: Patrik Jakobsson <patrik.r.jakobsson@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1448271183-20523-21-git-send-email-daniel.vetter@ffwll.ch
Daniel Vetter [Mon, 23 Nov 2015 09:32:52 +0000 (10:32 +0100)]
drm/gma500: Drop dev->struct_mutex from mmap offset function
Simply forgotten about this when I was doing my general cleansing of
simple gem mmap offset functions. There's nothing but core functions
called here, and they all have their own protection already.
Cc: Patrik Jakobsson <patrik.r.jakobsson@gmail.com>
Acked-by: Patrik Jakobsson <patrik.r.jakobsson@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1448271183-20523-20-git-send-email-daniel.vetter@ffwll.ch
Daniel Vetter [Mon, 23 Nov 2015 09:32:51 +0000 (10:32 +0100)]
drm/gma500: Drop dev->struct_mutex from fbdev init/teardown code
This is init/teardown code, locking is just to appease locking checks.
And since gem create/free doesn't need this any more there's really no
reason for grabbing dev->struct_mutex.
Again important to switch obj_unref to _unlocked variants.
Cc: Patrik Jakobsson <patrik.r.jakobsson@gmail.com>
Acked-by: Patrik Jakobsson <patrik.r.jakobsson@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1448271183-20523-19-git-send-email-daniel.vetter@ffwll.ch
Daniel Vetter [Mon, 23 Nov 2015 09:32:50 +0000 (10:32 +0100)]
drm/gma500: Drop dev->struct_mutex from modeset code
It's either init code or already protected by other means. Note
that psb_gtt_pin/unpin has it's own lock, and that's really the
only piece of driver private state - all the modeset state is
protected with modeset locks already.
The only important bit is to switch all gem_obj_unref calls to the
_unlocked variant.
Cc: Patrik Jakobsson <patrik.r.jakobsson@gmail.com>
Acked-by: Patrik Jakobsson <patrik.r.jakobsson@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1448271183-20523-18-git-send-email-daniel.vetter@ffwll.ch
Daniel Vetter [Mon, 23 Nov 2015 09:32:49 +0000 (10:32 +0100)]
drm/gma500: Use correct unref in the gem bo create function
This is called without dev->struct_mutex held, we need to use the
_unlocked variant.
Never caught in the wild since you'd need an evil userspace which
races a gem_close ioctl call with the in-progress open.
Cc: Patrik Jakobsson <patrik.r.jakobsson@gmail.com>
Acked-by: Patrik Jakobsson <patrik.r.jakobsson@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1448271183-20523-17-git-send-email-daniel.vetter@ffwll.ch
Ville Syrjälä [Mon, 16 Nov 2015 19:05:12 +0000 (21:05 +0200)]
drm/edid: Make the detailed timing CEA/HDMI mode fixup accept up to 5kHz clock difference
Rather than using drm_match_cea_mode() to see if the EDID detailed
timings are supposed to represent one of the CEA/HDMI modes, add a
special version of that function that takes in an explicit clock
tolerance value (in kHz). When looking at the detailed timings specify
the tolerance as 5kHz due to the 10kHz clock resolution limit inherent
in detailed timings.
drm_match_cea_mode() uses the normal KHZ2PICOS() matching of clocks,
which only allows smaller errors for lower clocks (eg. for 25200 it
won't allow any error) and a bigger error for higher clocks (eg. for
297000 it actually matches 296913-297000). So it doesn't really match
what we want for the fixup. Using the explicit +-5kHz is much better
for this use case.
Not sure if we should change the normal mode matching to also use
something else besides KHZ2PICOS() since it allows a different
proportion of error depending on the clock. I believe VESA CVT
allows a maximum deviation of .5%, so using that for normal mode
matching might be a good idea?
Cc: Adam Jackson <ajax@redhat.com>
Tested-by: nathan.d.ciobanu@linux.intel.com
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92217
Fixes: fa3a7340eaa1 ("drm/edid: Fix up clock for CEA/HDMI modes specified via detailed timings")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Dave Airlie [Mon, 30 Nov 2015 22:01:53 +0000 (08:01 +1000)]
Merge tag 'drm-intel-next-2015-11-20-merged' of git://anongit.freedesktop.org/drm-intel into drm-next
drm-intel-next-2015-11-20-rebased:
4 weeks because of my vacation, so a bit more:
- final bits of the typesafe register mmio functions (Ville)
- power domain fix for hdmi detection (Imre)
- tons of fixes and improvements to the psr code (Rodrigo)
- refactoring of the dp detection code (Ander)
- complete rework of the dmc loader and dc5/dc6 handling (Imre, Patrik and
others)
- dp compliance improvements from Shubhangi Shrivastava
- stop_machine hack from Chris to fix corruptions when updating GTT ptes on bsw
- lots of fifo underrun fixes from Ville
- big pile of fbc fixes and improvements from Paulo
- fix fbdev failures paths (Tvrtko and Lukas Wunner)
- dp link training refactoring (Ander)
- interruptible prepare_plane for atomic (Maarten)
- basic kabylake support (Deepak&Rodrigo)
- don't leak ringspace on resets (Chris)
drm-intel-next-2015-10-23:
- 2nd attempt at atomic watermarks from Matt, but just prep for now
- fixes all over
* tag 'drm-intel-next-2015-11-20-merged' of git://anongit.freedesktop.org/drm-intel: (209 commits)
drm/i915: Update DRIVER_DATE to
20151120
drm/i915: take a power domain reference while checking the HDMI live status
drm/i915: take a power domain ref only when needed during HDMI detect
drm/i915: Tear down fbdev if initialization fails
async: export current_is_async()
Revert "drm/i915: Initialize HWS page address after GPU reset"
drm/i915: Fix oops caused by fbdev initialization failure
drm/i915: Fix i915_ggtt_view_equal to handle rotation correctly
drm/i915: Stuff rotation params into view union
drm/i915: Drop return value from intel_fill_fb_ggtt_view
drm/i915 : Fix to remove unnecsessary checks in postclose function.
drm/i915: add MISSING_CASE to a few port/aux power domain helpers
drm/i915/ddi: fix intel_display_port_aux_power_domain() after HDMI detect
drm/i915: Remove platform specific *_dp_detect() functions
drm/i915: Don't do edp panel detection in g4x_dp_detect()
drm/i915: Send TP1 TP2/3 even when panel claims no NO_TRAIN_ON_EXIT.
drm/i915: PSR: Don't Skip aux handshake on DP_PSR_NO_TRAIN_ON_EXIT.
drm/i915: Reduce PSR re-activation time for VLV/CHV.
drm/i915: Delay first PSR activation.
drm/i915: Type safe register read/write
...
Dave Airlie [Mon, 30 Nov 2015 22:01:18 +0000 (08:01 +1000)]
Merge tag 'topic/drm-misc-2015-11-26' of git://anongit.freedesktop.org/drm-intel into drm-next
Here's the first drm-misc pull, with really mostly misc stuff all over.
Somewhat invasive is only Ville's change to mark the arg struct for
fb_create const - that might conflict with a new driver pull. So better to
get in fast.
* tag 'topic/drm-misc-2015-11-26' of git://anongit.freedesktop.org/drm-intel:
drm/mm: use list_next_entry
drm/i915: fix potential dangling else problems in for_each_ macros
drm: fix potential dangling else problems in for_each_ macros
drm/sysfs: Send out uevent when connector->force changes
drm/atomic: Small documentation fix.
drm/mm: rewrite drm_mm_for_each_hole
drm/sysfs: Grab lock for edid/modes_show
drm: Print the src/dst/clip rectangles in error in drm_plane_helper
drm: Add "prefix" parameter to drm_rect_debug_print()
drm: Keep coordinates in the typical x, y, w, h order instead of x, y, h, w
drm: Pass the user drm_mode_fb_cmd2 as const to .fb_create()
drm: modes: replace simple_strtoul by kstrtouint
drm: Describe the Rotation property bits.
drm: Remove unused fbdev_list members
GPU-DRM: Delete unnecessary checks before drm_property_unreference_blob()
drm/dp: add eDP DPCD backlight control bit definitions
drm/tegra: Remove local fbdev emulation Kconfig option
drm/imx: Remove local fbdev emulation Kconfig option
drm/gem: Update/Polish docs
drm: Update GEM refcounting docs
Jyri Sarha [Fri, 27 Nov 2015 14:14:01 +0000 (16:14 +0200)]
drm/atomic_helper: Add drm_atomic_helper_disable_planes_on_crtc()
Add drm_atomic_helper_disable_planes_on_crtc() for disabling all
planes associated with the given CRTC. This can be used for instance
in the CRTC helper disable callback to disable all planes before
shutting down the display pipeline.
v2:
- Address Daniels review comments [1]
- Do atomic_begin() and atomic_flush() always if they are defined and
atomic knob is set
- update kerneldoc
- Put drm_atomic_helper_disable_planes_on_crtc() after
drm_atomic_helper_commit_planes_on_crtc() in drm_atomic_helper.c
to have functions in the same order as in drm_atomic_helper.h
Signed-off-by: Jyri Sarha <jsarha@ti.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1448633641-6486-1-git-send-email-jsarha@ti.com
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Linus Torvalds [Mon, 30 Nov 2015 02:58:26 +0000 (18:58 -0800)]
Linux 4.4-rc3
Linus Torvalds [Mon, 30 Nov 2015 01:38:08 +0000 (17:38 -0800)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull nouveau and radeon fixes from Dave Airlie:
"Just some nouveau and radeon/amdgpu fixes.
The nouveau fixes look large as the firmware context files are
regenerated, but the actual change is quite small"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
drm/radeon: make some dpm errors debug only
drm/nouveau/volt/pwm/gk104: fix an off-by-one resulting in the voltage not being set
drm/nouveau/nvif: allow userspace access to its own client object
drm/nouveau/gr/gf100-: fix oops when calling zbc methods
drm/nouveau/gr/gf117-: assume no PPC if NV_PGRAPH_GPC_GPM_PD_PES_TPC_ID_MASK is zero
drm/nouveau/gr/gf117-: read NV_PGRAPH_GPC_GPM_PD_PES_TPC_ID_MASK from correct GPC
drm/nouveau/gr/gf100-: split out per-gpc address calculation macro
drm/nouveau/bios: return actual size of the buffer retrieved via _ROM
drm/nouveau/instmem: protect instobj list with a spinlock
drm/nouveau/pci: enable c800 magic for some unknown Samsung laptop
drm/nouveau/pci: enable c800 magic for Clevo P157SM
drm/radeon: make rv770_set_sw_state failures non-fatal
drm/amdgpu: move dependency handling out of atomic section v2
drm/amdgpu: optimize scheduler fence handling
drm/amdgpu: remove vm->mutex
drm/amdgpu: add mutex for ba_va->valids/invalids
drm/amdgpu: adapt vce session create interface changes
drm/amdgpu: vce use multiple cache surface starting from stoney
drm/amdgpu: reset vce trap interrupt flag
Linus Torvalds [Mon, 30 Nov 2015 01:30:41 +0000 (17:30 -0800)]
Merge tag 'rtc-4.4-2' of git://git./linux/kernel/git/abelloni/linux
Pull RTC fixes from Alexandre Belloni:
"Two fixes for the ds1307 alarm and wakeup"
* tag 'rtc-4.4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux:
rtc: ds1307: fix alarm reading at probe time
rtc: ds1307: fix kernel splat due to wakeup irq handling
Linus Torvalds [Mon, 30 Nov 2015 01:24:35 +0000 (17:24 -0800)]
Merge branch 'upstream' of git://git.linux-mips.org/ralf/upstream-linus
Pull MIPS fix from Ralf Baechle:
"Just a fix for empty loops that may be removed by non-antique GCC"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
MIPS: Fix delay loops which may be removed by GCC.
Linus Torvalds [Mon, 30 Nov 2015 01:18:41 +0000 (17:18 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/geert/linux-m68k
Pull m68k fixes from Geert Uytterhoeven:
"Summary:
- Add missing initialization of max_pfn, which is needed to make
selftests/vm/mlock2-tests succeed,
- Wire up new mlock2 syscall"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
m68k: Wire up mlock2
m68knommu: Add missing initialization of max_pfn and {min,max}_low_pfn
m68k/mm: sun3 - Add missing initialization of max_pfn and {min,max}_low_pfn
m68k/mm: m54xx - Add missing initialization of max_pfn
m68k/mm: motorola - Add missing initialization of max_pfn
Linus Torvalds [Mon, 30 Nov 2015 01:13:07 +0000 (17:13 -0800)]
Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm
Pull ARM fixes from Russell King:
"Just two changes this time around:
- wire up the new mlock2 syscall added during the last merge window
- fix a build problem with certain configurations provoked by making
CONFIG_OF user selectable"
* 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
ARM: 8454/1: OF implies OF_FLATTREE
ARM: wire up mlock2 syscall
Linus Torvalds [Sun, 29 Nov 2015 17:03:57 +0000 (09:03 -0800)]
Merge git://git./linux/kernel/git/nab/target-pending
Pull SCSI target fixes from Nicholas Bellinger:
- fix tcm-user backend driver expired cmd time processing (agrover)
- eliminate kref_put_spinlock_irqsave() for I/O completion (bart)
- fix iscsi login kthread failure case hung task regression (nab)
- fix COMPARE_AND_WRITE completion use-after-free race (nab)
- fix COMPARE_AND_WRITE with SCF_PASSTHROUGH_SG_TO_MEM_NOALLOC non zero
SGL offset data corruption. (Jan + Doug)
- fix >= v4.4-rc1 regression for tcm_qla2xxx enable configfs attribute
(Himanshu + HCH)
* git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
target/stat: print full t10_wwn.model buffer
target: fix COMPARE_AND_WRITE non zero SGL offset data corruption
qla2xxx: Fix regression introduced by target configFS changes
kref: Remove kref_put_spinlock_irqsave()
target: Invoke release_cmd() callback without holding a spinlock
target: Fix race for SCF_COMPARE_AND_WRITE_POST checking
iscsi-target: Fix rx_login_comp hang after login failure
iscsi-target: return -ENOMEM instead of -1 in case of failed kmalloc()
target/user: Do not set unused fields in tcmu_ops
target/user: Fix time calc in expired cmd processing
Linus Torvalds [Sun, 29 Nov 2015 16:58:48 +0000 (08:58 -0800)]
Merge branch 'next' of git://git./linux/kernel/git/rzhang/linux
Pull thermal management fixes from Zhang Rui:
"Specifics:
- several fixes and cleanups on Rockchip thermal drivers.
- add the missing support of RK3368 SoCs in Rockchip driver.
- small fixes on of-thermal, power_allocator, rcar driver, IMX, and
QCOM drivers, and also compilation fixes, on thermal.h, when thermal
is not selected"
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux:
imx: thermal: use CPU temperature grade info for thresholds
thermal: fix thermal_zone_bind_cooling_device prototype
Revert "thermal: qcom_spmi: allow compile test"
thermal: rcar_thermal: remove redundant operation
thermal: of-thermal: Reduce log level for message when can't fine thermal zone
thermal: power_allocator: Use temperature reading from tz
thermal: rockchip: Support the RK3368 SoCs in thermal driver
thermal: rockchip: consistently use int for temperatures
thermal: rockchip: Add the sort mode for adc value increment or decrement
thermal: rockchip: improve the conversion function
thermal: rockchip: trivial: fix typo in commit
thermal: rockchip: better to compatible the driver for different SoCs
dt-bindings: rockchip-thermal: Support the RK3368 SoCs compatible
David Disseldorp [Fri, 27 Nov 2015 17:37:47 +0000 (18:37 +0100)]
target/stat: print full t10_wwn.model buffer
Cut 'n paste error saw it only process sizeof(t10_wwn.vendor) characters.
Signed-off-by: David Disseldorp <ddiss@suse.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Jan Engelhardt [Mon, 23 Nov 2015 16:46:32 +0000 (17:46 +0100)]
target: fix COMPARE_AND_WRITE non zero SGL offset data corruption
target_core_sbc's compare_and_write functionality suffers from taking
data at the wrong memory location when writing a CAW request to disk
when a SGL offset is non-zero.
This can happen with loopback and vhost-scsi fabric drivers when
SCF_PASSTHROUGH_SG_TO_MEM_NOALLOC is used to map existing user-space
SGL memory into COMPARE_AND_WRITE READ/WRITE payload buffers.
Given the following sample LIO subtopology,
% targetcli ls /loopback/
o- loopback ................................. [1 Target]
o- naa.
6001405ebb8df14a ....... [naa.
60014059143ed2b3]
o- luns ................................... [2 LUNs]
o- lun0 ................ [iblock/ram0 (/dev/ram0)]
o- lun1 ................ [iblock/ram1 (/dev/ram1)]
% lsscsi -g
[3:0:1:0] disk LIO-ORG IBLOCK 4.0 /dev/sdc /dev/sg3
[3:0:1:1] disk LIO-ORG IBLOCK 4.0 /dev/sdd /dev/sg4
the following bug can be observed in Linux 4.3 and 4.4~rc1:
% perl -e 'print chr$_ for 0..255,reverse 0..255' >rand
% perl -e 'print "\0" x 512' >zero
% cat rand >/dev/sdd
% sg_compare_and_write -i rand -D zero --lba 0 /dev/sdd
% sg_compare_and_write -i zero -D rand --lba 0 /dev/sdd
Miscompare reported
% hexdump -Cn 512 /dev/sdd
00000000 0f 0e 0d 0c 0b 0a 09 08 07 06 05 04 03 02 01 00
00000010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
*
00000200
Rather than writing all-zeroes as instructed with the -D file, it
corrupts the data in the sector by splicing some of the original
bytes in. The page of the first entry of cmd->t_data_sg includes the
CDB, and sg->offset is set to a position past the CDB. I presume that
sg->offset is also the right choice to use for subsequent sglist
members.
Signed-off-by: Jan Engelhardt <jengelh@netitwork.de>
Tested-by: Douglas Gilbert <dgilbert@interlog.com>
Cc: <stable@vger.kernel.org> # v3.12+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Himanshu Madhani [Tue, 24 Nov 2015 17:20:15 +0000 (12:20 -0500)]
qla2xxx: Fix regression introduced by target configFS changes
this patch fixes following regression
# targetcli
[Errno 13] Permission denied: '/sys/kernel/config/target/qla2xxx/21:00:00:0e:1e:08:c7:20/tpgt_1/enable'
Fixes: 2eafd72939fd ("target: use per-attribute show and store methods")
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Bart Van Assche [Thu, 22 Oct 2015 23:02:14 +0000 (16:02 -0700)]
kref: Remove kref_put_spinlock_irqsave()
The last user is gone. Hence remove this function.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Joern Engel <joern@logfs.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Bart Van Assche [Thu, 22 Oct 2015 22:57:04 +0000 (15:57 -0700)]
target: Invoke release_cmd() callback without holding a spinlock
This patch fixes the following kernel warning because it avoids that
IRQs are disabled while ft_release_cmd() is invoked (fc_seq_set_resp()
invokes spin_unlock_bh()):
WARNING: CPU: 3 PID: 117 at kernel/softirq.c:150 __local_bh_enable_ip+0xaa/0x110()
Call Trace:
[<
ffffffff814f71eb>] dump_stack+0x4f/0x7b
[<
ffffffff8105e56a>] warn_slowpath_common+0x8a/0xc0
[<
ffffffff8105e65a>] warn_slowpath_null+0x1a/0x20
[<
ffffffff81062b2a>] __local_bh_enable_ip+0xaa/0x110
[<
ffffffff814ff229>] _raw_spin_unlock_bh+0x39/0x40
[<
ffffffffa03a7f94>] fc_seq_set_resp+0xe4/0x100 [libfc]
[<
ffffffffa02e604a>] ft_free_cmd+0x4a/0x90 [tcm_fc]
[<
ffffffffa02e6972>] ft_release_cmd+0x12/0x20 [tcm_fc]
[<
ffffffffa042bd66>] target_release_cmd_kref+0x56/0x90 [target_core_mod]
[<
ffffffffa042caf0>] target_put_sess_cmd+0xc0/0x110 [target_core_mod]
[<
ffffffffa042cb81>] transport_release_cmd+0x41/0x70 [target_core_mod]
[<
ffffffffa042d975>] transport_generic_free_cmd+0x35/0x420 [target_core_mod]
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Acked-by: Joern Engel <joern@logfs.org>
Reviewed-by: Andy Grover <agrover@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Nicholas Bellinger [Fri, 6 Nov 2015 07:37:59 +0000 (23:37 -0800)]
target: Fix race for SCF_COMPARE_AND_WRITE_POST checking
This patch addresses a race + use after free where the first
stage of COMPARE_AND_WRITE in compare_and_write_callback()
is rescheduled after the backend sends the secondary WRITE,
resulting in second stage compare_and_write_post() callback
completing in target_complete_ok_work() before the first
can return.
Because current code depends on checking se_cmd->se_cmd_flags
after return from se_cmd->transport_complete_callback(),
this results in first stage having SCF_COMPARE_AND_WRITE_POST
set, which incorrectly falls through into second stage CAW
processing code, eventually triggering a NULL pointer
dereference due to use after free.
To address this bug, pass in a new *post_ret parameter into
se_cmd->transport_complete_callback(), and depend upon this
value instead of ->se_cmd_flags to determine when to return
or fall through into ->queue_status() code for CAW.
Cc: Sagi Grimberg <sagig@mellanox.com>
Cc: <stable@vger.kernel.org> # v3.12+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Nicholas Bellinger [Thu, 5 Nov 2015 22:11:59 +0000 (14:11 -0800)]
iscsi-target: Fix rx_login_comp hang after login failure
This patch addresses a case where iscsi_target_do_tx_login_io()
fails sending the last login response PDU, after the RX/TX
threads have already been started.
The case centers around iscsi_target_rx_thread() not invoking
allow_signal(SIGINT) before the send_sig(SIGINT, ...) occurs
from the failure path, resulting in RX thread hanging
indefinately on iscsi_conn->rx_login_comp.
Note this bug is a regression introduced by:
commit
e54198657b65625085834847ab6271087323ffea
Author: Nicholas Bellinger <nab@linux-iscsi.org>
Date: Wed Jul 22 23:14:19 2015 -0700
iscsi-target: Fix iscsit_start_kthreads failure OOPs
To address this bug, complete ->rx_login_complete for good
measure in the failure path, and immediately return from
RX thread context if connection state did not actually reach
full feature phase (TARG_CONN_STATE_LOGGED_IN).
Cc: Sagi Grimberg <sagig@mellanox.com>
Cc: <stable@vger.kernel.org> # v3.10+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Luis de Bethencourt [Mon, 19 Oct 2015 20:18:24 +0000 (21:18 +0100)]
iscsi-target: return -ENOMEM instead of -1 in case of failed kmalloc()
Smatch complains about returning hard coded error codes, silence this
warning.
drivers/target/iscsi/iscsi_target_parameters.c:211
iscsi_create_default_params() warn: returning -1 instead of -ENOMEM is sloppy
Signed-off-by: Luis de Bethencourt <luisbg@osg.samsung.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Andy Grover [Fri, 13 Nov 2015 18:42:20 +0000 (10:42 -0800)]
target/user: Do not set unused fields in tcmu_ops
TCMU sets TRANSPORT_FLAG_PASSTHROUGH, so INQUIRY commands will not be
emulated by LIO but passed up to userspace. Therefore TCMU should not
set these, just like pscsi doesn't.
Signed-off-by: Andy Grover <agrover@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Andy Grover [Fri, 13 Nov 2015 18:42:19 +0000 (10:42 -0800)]
target/user: Fix time calc in expired cmd processing
Reversed arguments meant that we were doing nothing for cmds whose deadline
had passed.
Signed-off-by: Andy Grover <agrover@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Arnd Bergmann [Thu, 19 Nov 2015 12:20:54 +0000 (13:20 +0100)]
ARM: 8454/1: OF implies OF_FLATTREE
On the ARM architecture, individual platforms select CONFIG_USE_OF if they
need it, but all device tree code is keyed off CONFIG_OF. When building
a platform without DT support and manually enabling CONFIG_OF, we now
get a number of build errors, e.g.
arch/arm/kernel/devtree.c: In function 'setup_machine_fdt':
arch/arm/kernel/devtree.c:215:19: error: implicit declaration of function 'early_init_dt_verify' [-Werror=implicit-function-declaration]
We could now try to separate the use case of booting from DT vs. the
case of using the dynamic implementation, but that seems more complicated
than it can gain us.
This simply changes the ARM Kconfig file to always enable OF_RESERVED_MEM
and OF_EARLY_FLATTREE when CONFIG_OF is enabled. These options add a little
extra code when we just want the dynamic OF implementation, but that seems
like a rather obscure case, and this version solves all CONFIG_OF related
randconfig regressions.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 0166dc11be91 ("of: make CONFIG_OF user selectable")
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Linus Torvalds [Sat, 28 Nov 2015 21:07:41 +0000 (13:07 -0800)]
Merge tag 'pci-v4.4-fixes-1' of git://git./linux/kernel/git/helgaas/pci
Pull PCI fixes from Bjorn Helgaas:
"Here are a few fixes I'd like to have in v4.4: a generic one for sysfs
and three for HiSilicon and DesignWare host controllers.
Summary:
NUMA:
- Prevent out of bounds access in numa_node override (Mathias Krause)
HiSilicon host bridge driver:
- Fix deferred probing (Arnd Bergmann)
Synopsys DesignWare host bridge driver:
- Remove incorrect io_base assignment (Stanimir Varbanov)
- Move align_resource function pointer to pci_host_bridge structure
(Gabriele Paoloni)"
* tag 'pci-v4.4-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
ARM/PCI: Move align_resource function pointer to pci_host_bridge structure
PCI: hisi: Fix deferred probing
PCI: designware: Remove incorrect io_base assignment
PCI: Prevent out of bounds access in numa_node override
Linus Torvalds [Sat, 28 Nov 2015 01:22:47 +0000 (17:22 -0800)]
Merge tag 'nfs-for-4.4-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client bugfixes from Trond Myklebust:
"Highlights include:
Stable patches:
- Fix a NFSv4 callback identifier leak that was also causing client
crashes
- Fix NFSv4 callback decoding issues when incoming requests are
truncated
- Don't declare the attribute cache valid when we call
nfs_update_inode with an empty attribute structure.
- Resend LAYOUTGET when there is a race that changes the seqid
Bugfixes:
- Fix a number of issues with the NFSv4.2 CLONE ioctl()
- Properly set NFS v4.2 NFSDBG_FACILITY
- NFSv4 referrals are broken; Cleanup FATTR4_WORD0_FS_LOCATIONS after
decoding success
- Use sliding delay when LAYOUTGET gets NFS4ERR_DELAY
- Ensure that attrcache is revalidated after a SETATTR"
* tag 'nfs-for-4.4-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
nfs4: resend LAYOUTGET when there is a race that changes the seqid
nfs: if we have no valid attrs, then don't declare the attribute cache valid
nfs: ensure that attrcache is revalidated after a SETATTR
nfs4: limit callback decoding to received bytes
nfs4: start callback_ident at idr 1
nfs: use sliding delay when LAYOUTGET gets NFS4ERR_DELAY
NFS4: Cleanup FATTR4_WORD0_FS_LOCATIONS after decoding success
NFS: Properly set NFS v4.2 NFSDBG_FACILITY
nfs: reduce the amount of ifdefs for v4.2 in nfs4file.c
nfs: use btrfs ioctl defintions for clone
nfs: allow intra-file CLONE
nfs: offer native ioctls even if CONFIG_COMPAT is set
nfs: pass on count for CLONE operations
Linus Torvalds [Fri, 27 Nov 2015 23:53:23 +0000 (15:53 -0800)]
Merge git://www.linux-watchdog.org/linux-watchdog
Pull watchdog fixes from Wim Van Sebroeck:
- a null pointer dereference fix for omap_wdt
- some clock related fixes for pnx4008
- an underflow fix in wdt_set_timeout() for w83977f_wdt
- restart fix for tegra wdt
- Kconfig change to support Freescale Layerscape platforms
- fix for stopping the mtk_wdt watchdog
* git://www.linux-watchdog.org/linux-watchdog:
watchdog: mtk_wdt: Use MODE_KEY when stopping the watchdog
watchdog: Add support for Freescale Layerscape platforms
watchdog: tegra: Stop watchdog first if restarting
watchdog: w83977f_wdt: underflow in wdt_set_timeout()
watchdog: pnx4008: make global wdt_clk static
watchdog: pnx4008: fix warnings caused by enabling unprepared clock
watchdog: omap_wdt: fix null pointer dereference
Linus Torvalds [Fri, 27 Nov 2015 23:45:45 +0000 (15:45 -0800)]
Merge branch 'for-linus-4.4' of git://git./linux/kernel/git/mason/linux-btrfs
Pull btrfs fixes from Chris Mason:
"This has Mark Fasheh's patches to fix quota accounting during subvol
deletion, which we've been working on for a while now. The patch is
pretty small but it's a key fix.
Otherwise it's a random assortment"
* 'for-linus-4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
btrfs: fix balance range usage filters in 4.4-rc
btrfs: qgroup: account shared subtree during snapshot delete
Btrfs: use btrfs_get_fs_root in resolve_indirect_ref
btrfs: qgroup: fix quota disable during rescan
Btrfs: fix race between cleaner kthread and space cache writeout
Btrfs: fix scrub preventing unused block groups from being deleted
Btrfs: fix race between scrub and block group deletion
btrfs: fix rcu warning during device replace
btrfs: Continue replace when set_block_ro failed
btrfs: fix clashing number of the enhanced balance usage filter
Btrfs: fix the number of transaction units needed to remove a block group
Btrfs: use global reserve when deleting unused block group after ENOSPC
Btrfs: tests: checking for NULL instead of IS_ERR()
btrfs: fix signed overflows in btrfs_sync_file
Linus Torvalds [Fri, 27 Nov 2015 23:27:52 +0000 (15:27 -0800)]
Merge branch 'for-linus2' of git://git./linux/kernel/git/jmorris/linux-security
Pull security layer fixes from James Morris:
"A fix for SELinux policy processing (regression introduced by
commit
fa1aa143ac4a: "selinux: extended permissions for ioctls"), as
well as a fix for the user-triggerable oops in the Keys code"
* 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
KEYS: Fix handling of stored error in a negatively instantiated user key
selinux: fix bug in conditional rules handling
Linus Torvalds [Fri, 27 Nov 2015 22:22:03 +0000 (14:22 -0800)]
Merge tag 'fixes-for-linus' of git://git./linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Arnd Bergmann:
"There is a small backlog of at91 patches here, the most significant is
the addition of some sama5d2 Xplained nodes that were waiting on an
MFD include file to get merged through another tree.
We normally try to sort those out before the merge window opens, but
the maintainer wasn't aware of that here and I decided to merge the
changes this time as an exception.
On OMAP a series of audio changes for dra7 missed the merge window but
turned out to be necessary to fix a boot time imprecise external abort
error and to get audio working.
The other changes are the usual simple changes, here is a list sorted
by platform:
at91:
removal of a useless defconfig option
removal of some legacy DT pieces
use of the proper watchdog compatible string
update of the MAINTAINERS entries for some Atmel drivers
drivers/scpi:
hide get_scpi_ops in module from built-in code
imx:
add missing .irq_set_type for i.MX GPC irq_chip.
fix the wrong spi-num-chipselects settings for Vybrid DSPI devices.
fix a merge error in Vybrid dts regarding to ADC device property
keystone:
fix the optional PDSP firmware loading
fix linking RAM setup for QMs
fix crash with clk_ignore_unused
mediatek:
Enable SCPSYS power domain driver by default
mvebu:
fix QNAP TS219 power-off in dts
fix legacy get_irqnr_and_base for dove and orion5x
omap:
fix l4 related boot time errors for dm81xx
use lockless cldm/pwrdm api in omap4_boot_secondary
remove t410 abort handler to avoid hiding other critical errors
mark cpuidle tracepoints as _rcuidle
fix module alias for omap-ocp2scp
pxa:
palm: Fix typos in PWM lookup table code
renesas:
missing __initconst annotation for r8a7793_boards_compat_dt
rockchip:
disable mmc-tuning on the veyron-minnie board
adding the init state for the over-temperature-protection
zx:
only build power domain code when CONFIG_PM=y"
* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (31 commits)
ARM: OMAP4+: SMP: use lockless clkdm/pwrdm api in omap4_boot_secondary
arm: omap2+: add missing HWMOD_NO_IDLEST in 81xx hwmod data
ARM: orion5x: Fix legacy get_irqnr_and_base
ARM: dove: Fix legacy get_irqnr_and_base
soc: Mediatek: Enable SCPSYS power domain driver by default
ARM: dts: vfxxx: Fix dspi[01] spi-num-chipselects.
ARM: dts: keystone: k2l: fix kernel crash when clk_ignore_unused is not in bootargs
soc: ti: knav_qmss_queue: Fix linking RAM setup for queue managers
soc: ti: use request_firmware_direct() as acc firmware is optional
ARM: imx: add platform irq type setting in gpc
ARM: dts: vfxxx: Fix erroneous property in esdhc0 node
ARM: shmobile: r8a7793: proper constness with __initconst
scpi: hide get_scpi_ops in module from built-in code
ARM: zx: only build power domain code when CONFIG_PM=y
ARM: pxa: palm: Fix typos in PWM lookup table code
ARM: dts: Kirkwood: Fix QNAP TS219 power-off
ARM: dts: rockchip: Add OTP gpio pinctrl to rk3288 tsadc node
ARM: dts: rockchip: temporarily remove emmc hs200 speed from rk3288 minnie
MAINTAINERS: Atmel drivers: change NAND and ISI entries
ARM: at91/dt: sama5d2 Xplained: add several devices
...
Linus Torvalds [Fri, 27 Nov 2015 21:12:42 +0000 (13:12 -0800)]
Merge tag 'pm+acpi-4.4-rc3' of git://git./linux/kernel/git/rafael/linux-pm
Pull more power management and ACPI fixes from Rafael Wysocki:
"These fix one recent regression (cpufreq core), fix up two features
added recently (ACPI CPPC support, SCPI support in the arm_big_little
cpufreq driver) and fix three older bugs in the intel_pstate driver.
Specifics:
- Fix a recent regression in the cpufreq core causing it to fail to
clean up sysfs directories properly on cpufreq driver removal
(Viresh Kumar).
- Fix a build problem in the SCPI support code recently added to the
arm_big_little cpufreq driver (Punit Agrawal).
- Fix up the recently added CPPC cpufreq frontend to process the CPU
coordination information provided by the platform firmware
correctly (Ashwin Chaugule).
- Fix the intel_pstate driver to behave as intended when switched
over to the "performance" mode via sysfs if hardware-driven P-state
selection (HWP) is enabled (Alexandra Yates).
- Fix two rounding errors in the intel_pstate driver that sometimes
cause it to use lower P-states than requested (Prarit Bhargava)"
* tag 'pm+acpi-4.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
intel_pstate: Fix "performance" mode behavior with HWP enabled
cpufreq: SCPI: Depend on SCPI clk driver
cpufreq: intel_pstate: Fix limits->max_perf rounding error
cpufreq: intel_pstate: Fix limits->max_policy_pct rounding error
cpufreq: Always remove sysfs cpuX/cpufreq link on ->remove_dev()
cpufreq: CPPC: Initialize and check CPUFreq CPU co-ord type correctly
Dave Airlie [Fri, 27 Nov 2015 20:50:34 +0000 (06:50 +1000)]
Merge branch 'linux-4.4' of git://anongit.freedesktop.org/git/nouveau/linux-2.6 into drm-fixes
Ben Skeggs wrote:
A couple of regression fixes, some more boards whitelisted for a hw bug
workaround, gr/ucode fixes for hangs a user is seeing.
The changes look larger than they actually are due to the ucode binaries
(*.fucN.h) being regenerated.
* 'linux-4.4' of git://anongit.freedesktop.org/git/nouveau/linux-2.6:
drm/nouveau/volt/pwm/gk104: fix an off-by-one resulting in the voltage not being set
drm/nouveau/nvif: allow userspace access to its own client object
drm/nouveau/gr/gf100-: fix oops when calling zbc methods
drm/nouveau/gr/gf117-: assume no PPC if NV_PGRAPH_GPC_GPM_PD_PES_TPC_ID_MASK is zero
drm/nouveau/gr/gf117-: read NV_PGRAPH_GPC_GPM_PD_PES_TPC_ID_MASK from correct GPC
drm/nouveau/gr/gf100-: split out per-gpc address calculation macro
drm/nouveau/bios: return actual size of the buffer retrieved via _ROM
drm/nouveau/instmem: protect instobj list with a spinlock
drm/nouveau/pci: enable c800 magic for some unknown Samsung laptop
drm/nouveau/pci: enable c800 magic for Clevo P157SM
Linus Torvalds [Fri, 27 Nov 2015 19:59:02 +0000 (11:59 -0800)]
Merge tag 'sound-4.4-rc3' of git://git./linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"Here are no big surprises but just all small fixes, mostly
device-specific quirks for HD-audio and USB-audio:
- Fix for detection of FireWire DICE Loud devices
- Intel Broxton HDMI/DP PCI IDs and relevant quirks
- Noise fixes: Dell XPS13 2015 model, Dell Latitude E6440, Gigabyte
Z170X mobo
- Fix the headphone mixer assignment on HP laptops for PulseAudio
- USB-MIDI fixes for Medeli DD305 and CH345
- Apply fixup for Acer Aspire One Cloudbook 14"
* tag 'sound-4.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda - Fix noise on Gigabyte Z170X mobo
ALSA: hda - Fix headphone noise after Dell XPS 13 resume back from S3
ALSA: hda - Apply HP headphone fixups more generically
ALSA: hda - Add fixup for Acer Aspire One Cloudbook 14
ALSA: hda - apply SKL display power request/release patch to BXT
ALSA: hda - add PCI IDs for Intel Broxton
ALSA: usb-audio: work around CH345 input SysEx corruption
ALSA: usb-audio: prevent CH345 multiport output SysEx corruption
ALSA: usb-audio: add packet size quirk for the Medeli DD305
ALSA: dice: fix detection of Loud devices
ALSA: hda - Fix noise on Dell Latitude E6440
Linus Torvalds [Fri, 27 Nov 2015 19:09:59 +0000 (11:09 -0800)]
Merge tag 'arm64-fixes' of git://git./linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:
- Build fix when !CONFIG_UID16 (the patch is touching generic files but
it only affects arm64 builds; submitted by Arnd Bergmann)
- EFI fixes to deal with early_memremap() returning NULL and correctly
mapping run-time regions
- Fix CPUID register extraction of unsigned fields (not to be
sign-extended)
- ASID allocator fix to deal with long-running tasks over multiple
generation roll-overs
- Revert support for marking page ranges as contiguous PTEs (it leads
to TLB conflicts and requires additional non-trivial kernel changes)
- Proper early_alloc() failure check
- Disable KASan for 48-bit VA and 16KB page configuration (the pgd is
larger than the KASan shadow memory)
- Update the fault_info table (original descriptions based on early
engineering spec)
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: efi: fix initcall return values
arm64: efi: deal with NULL return value of early_memremap()
arm64: debug: Treat the BRPs/WRPs as unsigned
arm64: cpufeature: Track unsigned fields
arm64: cpufeature: Add helpers for extracting unsigned values
Revert "arm64: Mark kernel page ranges contiguous"
arm64: mm: keep reserved ASIDs in sync with mm after multiple rollovers
arm64: KASAN depends on !(ARM64_16K_PAGES && ARM64_VA_BITS_48)
arm64: efi: correctly map runtime regions
arm64: mm: fix fault_info table xFSC decoding
arm64: fix building without CONFIG_UID16
arm64: early_alloc: Fix check for allocation failure
Linus Torvalds [Fri, 27 Nov 2015 19:05:50 +0000 (11:05 -0800)]
Merge tag 'nios2-v4.4-rc3' of git://git./linux/kernel/git/lftan/nios2
Pull nios2 fix from Ley Foon Tan:
"nios2: fix cache coherency"
* tag 'nios2-v4.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/lftan/nios2:
nios2: fix cache coherency
Ralf Baechle [Fri, 27 Nov 2015 18:17:01 +0000 (19:17 +0100)]
MIPS: Fix delay loops which may be removed by GCC.
GCC 4.1 and newer remove empty loops. This becomes a problem when delay
loops get removed. Fixed by rewriting to user the proper Linux interface
for such delays.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Reported-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Acked-by: John Crispin <blogic@openwrt.org>
Linus Torvalds [Fri, 27 Nov 2015 18:08:31 +0000 (10:08 -0800)]
Merge tag 'arc-4.4-rc3-fixes' of git://git./linux/kernel/git/vgupta/arc
Pull ARC fixes from Vineet Gupta:
- Fix for perf callgraph unwinding causing RCU stalls
- Fix to enable Linux to run on non-default Interrupt priority 0
- Removal of pointless SYNC from __switch_to()
* tag 'arc-4.4-rc3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
ARC: dw2 unwind: Remove falllback linear search thru FDE entries
ARC: remove SYNC from __switch_to()
ARCv2: Use the default irq priority for idle sleep
ARC: Abstract out ISA specific SLEEP args
ARC: comments update
ARC: switch to arc-linux- CROSS_COMPILE prefix across all configs
Arnd Bergmann [Fri, 27 Nov 2015 16:41:48 +0000 (17:41 +0100)]
Merge tag 'v4.4-rockchip-dts32-fixes1' of git://git./linux/kernel/git/mmind/linux-rockchip into fixes
Merge "ARM: rockchip: devicetree fixes for 4.4" from Heiko Stuebner:
Two fixes to Rockchip devicetree files, disabling the mmc-tuning
on the veyron-minnie board for now and adding the init state for
the over-temperature-protection to prevent glitches making the
system reboot sometimes.
* tag 'v4.4-rockchip-dts32-fixes1' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip:
ARM: dts: rockchip: Add OTP gpio pinctrl to rk3288 tsadc node
ARM: dts: rockchip: temporarily remove emmc hs200 speed from rk3288 minnie
Arnd Bergmann [Fri, 27 Nov 2015 16:28:41 +0000 (17:28 +0100)]
Merge tag 'mvebu-fixes-4.4-1' of git://git.infradead.org/linux-mvebu into fixes
Merge "mvebu fixes for 4.4 (part 1)" from Jason Cooper:
- Fix QNAP TS219 power-off in dts
- Fix legacy get_irqnr_and_base for dove and orion5x
* tag 'mvebu-fixes-4.4-1' of git://git.infradead.org/linux-mvebu:
ARM: orion5x: Fix legacy get_irqnr_and_base
ARM: dove: Fix legacy get_irqnr_and_base
ARM: dts: Kirkwood: Fix QNAP TS219 power-off
Arnd Bergmann [Fri, 27 Nov 2015 16:28:10 +0000 (17:28 +0100)]
Merge tag 'renesas-fixes-for-v4.4' of git://git./linux/kernel/git/horms/renesas into fixes
Merge "Renesas ARM Based SoC Fixes for v4.4" from Simon Horman:
* r8a7793 SoC: Annotate r8a7793_boards_compat_dt with __initconst
Aside from being correct this builds that otherwise
fail with section mismatch errors.
* tag 'renesas-fixes-for-v4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
ARM: shmobile: r8a7793: proper constness with __initconst
Rafael J. Wysocki [Fri, 27 Nov 2015 15:23:59 +0000 (16:23 +0100)]
Merge branches 'pm-cpufreq' and 'acpi-cppc'
* pm-cpufreq:
intel_pstate: Fix "performance" mode behavior with HWP enabled
cpufreq: SCPI: Depend on SCPI clk driver
cpufreq: intel_pstate: Fix limits->max_perf rounding error
cpufreq: intel_pstate: Fix limits->max_policy_pct rounding error
cpufreq: Always remove sysfs cpuX/cpufreq link on ->remove_dev()
* acpi-cppc:
cpufreq: CPPC: Initialize and check CPUFreq CPU co-ord type correctly
Linus Torvalds [Thu, 26 Nov 2015 19:42:25 +0000 (11:42 -0800)]
Merge tag 'for-linus-4.4-rc2-tag' of git://git./linux/kernel/git/xen/tip
Pull xen bug fixes from David Vrabel:
- Fix gntdev and numa balancing.
- Fix x86 boot crash due to unallocated legacy irq descs.
- Fix overflow in evtchn device when > 1024 event channels.
* tag 'for-linus-4.4-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/evtchn: dynamically grow pending event channel ring
xen/events: Always allocate legacy interrupts on PV guests
xen/gntdev: Grant maps should not be subject to NUMA balancing
Linus Torvalds [Thu, 26 Nov 2015 19:19:59 +0000 (11:19 -0800)]
Merge tag 'powerpc-4.4-3' of git://git./linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
- tm: Block signal return from setting invalid MSR state from Michael
Neuling
- tm: Check for already reclaimed tasks from Michael Neuling
* tag 'powerpc-4.4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/tm: Check for already reclaimed tasks
powerpc/tm: Block signal return setting invalid MSR state
David Vrabel [Thu, 26 Nov 2015 16:14:35 +0000 (16:14 +0000)]
xen/evtchn: dynamically grow pending event channel ring
If more than 1024 event channels are bound to a evtchn device then it
possible (even with well behaved applications) for the ring to
overflow and events to be lost (reported as an -EFBIG error).
Dynamically increase the size of the ring so there is always enough
space for all bound events. Well behaved applicables that only unmask
events after draining them from the ring can thus no longer lose
events.
However, an application could unmask an event before draining it,
allowing multiple entries per port to accumulate in the ring, and a
overflow could still occur. So the overflow detection and reporting
is retained.
The ring size is initially only 64 entries so the common use case of
an application only binding a few events will use less memory than
before. The ring size may grow to 512 KiB (enough for all 2^17
possible channels). This order 7 kmalloc() may fail due to memory
fragmentation, so we fall back to trying vmalloc().
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Ard Biesheuvel [Mon, 23 Nov 2015 07:43:24 +0000 (08:43 +0100)]
arm64: efi: fix initcall return values
Even though initcall return values are typically ignored, the
prototype is to return 0 on success or a negative errno value on
error. So fix the arm_enable_runtime_services() implementation to
return 0 on conditions that are not in fact errors, and return a
meaningful error code otherwise.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Ard Biesheuvel [Mon, 23 Nov 2015 07:43:23 +0000 (08:43 +0100)]
arm64: efi: deal with NULL return value of early_memremap()
Add NULL return value checks to two invocations of early_memremap()
in the UEFI init code. For the UEFI configuration tables, we just
warn since we have a better chance of being able to report the issue
in a way that can actually be noticed by a human operator if we don't
abort right away. For the UEFI memory map, however, all we can do is
panic() since we cannot proceed without a description of memory.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Suzuki K. Poulose [Wed, 18 Nov 2015 17:08:58 +0000 (17:08 +0000)]
arm64: debug: Treat the BRPs/WRPs as unsigned
IDAA64DFR0_EL1: BRPs and WRPs are unsigned values. Use
the appropriate helpers to extract those fields.
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reported-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Suzuki K. Poulose [Wed, 18 Nov 2015 17:08:57 +0000 (17:08 +0000)]
arm64: cpufeature: Track unsigned fields
Some of the feature bits have unsigned values and need
to be treated accordingly to avoid errors. Adds the property
to the feature bits and use the appropriate field extract helpers.
Reported-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Boris Ostrovsky [Fri, 20 Nov 2015 16:25:04 +0000 (11:25 -0500)]
xen/events: Always allocate legacy interrupts on PV guests
After commit
8c058b0b9c34 ("x86/irq: Probe for PIC presence before
allocating descs for legacy IRQs") early_irq_init() will no longer
preallocate descriptors for legacy interrupts if PIC does not
exist, which is the case for Xen PV guests.
Therefore we may need to allocate those descriptors ourselves.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Suzuki K. Poulose [Wed, 18 Nov 2015 17:08:56 +0000 (17:08 +0000)]
arm64: cpufeature: Add helpers for extracting unsigned values
The cpuid_feature_extract_field() extracts the feature value
as a signed integer. This could be problematic for features
whose values are unsigned. e.g, ID_AA64DFR0_EL1:BRPs. Add
an unsigned variant for the unsigned fields.
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reported-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Boris Ostrovsky [Tue, 10 Nov 2015 20:10:33 +0000 (15:10 -0500)]
xen/gntdev: Grant maps should not be subject to NUMA balancing
Doing so will cause the grant to be unmapped and then, during
fault handling, the fault to be mistakenly treated as NUMA hint
fault.
In addition, even if those maps could partcipate in NUMA
balancing, it wouldn't provide any benefit since we are unable
to determine physical page's node (even if/when VNUMA is
implemented).
Marking grant maps' VMAs as VM_IO will exclude them from being
part of NUMA balancing.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: stable@vger.kernel.org
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Simon Guinot [Thu, 26 Nov 2015 14:37:13 +0000 (15:37 +0100)]
rtc: ds1307: fix alarm reading at probe time
With the actual code, read_alarm() always returns -EINVAL when called
during the RTC device registration. This prevents from retrieving an
already configured alarm in hardware.
This patch fixes the issue by moving the HAS_ALARM bit configuration
(if supported by the hardware) above the rtc_device_register() call.
Signed-off-by: Simon Guinot <simon.guinot@sequanux.org>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Catalin Marinas [Thu, 26 Nov 2015 15:42:41 +0000 (15:42 +0000)]
Revert "arm64: Mark kernel page ranges contiguous"
This reverts commit
348a65cdcbbf243073ee39d1f7d4413081ad7eab.
Incorrect page table manipulation that does not respect the ARM ARM
recommended break-before-make sequence may lead to TLB conflicts. The
contiguous PTE patch makes the system even more susceptible to such
errors by changing the mapping from a single page to a contiguous range
of pages. An additional TLB invalidation would reduce the risk window,
however, the correct fix is to switch to a temporary swapper_pg_dir.
Once the correct workaround is done, the reverted commit will be
re-applied.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Jeremy Linton <jeremy.linton@arm.com>
Will Deacon [Thu, 26 Nov 2015 13:49:39 +0000 (13:49 +0000)]
arm64: mm: keep reserved ASIDs in sync with mm after multiple rollovers
Under some unusual context-switching patterns, it is possible to end up
with multiple threads from the same mm running concurrently with
different ASIDs:
1. CPU x schedules task t with mm p containing ASID a and generation g
This task doesn't block and the CPU doesn't context switch.
So:
* per_cpu(active_asid, x) = {g,a}
* p->context.id = {g,a}
2. Some other CPU generates an ASID rollover. The global generation is
now (g + 1). CPU x is still running t, with no context switch and
so per_cpu(reserved_asid, x) = {g,a}
3. CPU y schedules task t', which shares mm p with t. The generation
mismatches, so we take the slowpath and hit the reserved ASID from
CPU x. p is then updated so that p->context.id = {g + 1,a}
4. CPU y schedules some other task u, which has an mm != p.
5. Some other CPU generates *another* CPU rollover. The global
generation is now (g + 2). CPU x is still running t, with no context
switch and so per_cpu(reserved_asid, x) = {g,a}.
6. CPU y once again schedules task t', but now *fails* to hit the
reserved ASID from CPU x because of the generation mismatch. This
results in a new ASID being allocated, despite the fact that t is
still running on CPU x with the same mm.
Consequently, TLBIs (e.g. as a result of CoW) will not be synchronised
between the two threads.
This patch fixes the problem by updating all of the matching reserved
ASIDs when we hit on the slowpath (i.e. in step 3 above). This keeps
the reserved ASIDs in-sync with the mm and avoids the problem.
Reported-by: Tony Thompson <anthony.thompson@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Andrey Ryabinin [Tue, 17 Nov 2015 15:47:08 +0000 (18:47 +0300)]
arm64: KASAN depends on !(ARM64_16K_PAGES && ARM64_VA_BITS_48)
On KASAN + 16K_PAGES + 48BIT_VA
arch/arm64/mm/kasan_init.c: In function ‘kasan_early_init’:
include/linux/compiler.h:484:38: error: call to ‘__compiletime_assert_95’ declared with attribute error: BUILD_BUG_ON failed: !IS_ALIGNED(KASAN_SHADOW_END, PGDIR_SIZE)
_compiletime_assert(condition, msg, __compiletime_assert_, __LINE__)
Currently KASAN will not work on 16K_PAGES and 48BIT_VA, so
forbid such configuration to avoid above build failure.
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Reported-by: Suzuki K. Poulose <Suzuki.Poulose@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Ley Foon Tan [Thu, 26 Nov 2015 14:25:58 +0000 (22:25 +0800)]
nios2: fix cache coherency
There is intermittent cache coherency issue caught in toolchian tests.
Revert to use flushd.
Signed-off-by: Ley Foon Tan <lftan@altera.com>
Chris Wilson [Wed, 25 Nov 2015 14:39:03 +0000 (14:39 +0000)]
drm: Serialise multiple event readers
The previous patch reintroduced a race condition whereby a failure in
one reader may allow a second reader to see out-of-order events.
Introduce a mutex to serialise readers so that an event is completed in
its entirety before another reader may process an event. The two readers
may race against each other, but the events each retrieves are in the
correct order.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1448462343-2072-2-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Chris Wilson [Wed, 25 Nov 2015 14:39:02 +0000 (14:39 +0000)]
drm: Drop dev->event_lock spinlock around faulting copy_to_user()
In
commit
cdd1cf799bd24ac0a4184549601ae302267301c5
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Thu Dec 4 21:03:25 2014 +0000
drm: Make drm_read() more robust against multithreaded races
I fixed the races by serialising the use of the event by extending the
dev->event_lock. However, as Thomas pointed out, the copy_to_user() may
fault (even the __copy_to_user_inatomic() variant used here) and calling
into the driver backend with the spinlock held is bad news. Therefore we
have to drop the spinlock before the copy, but that exposes us to the
old race whereby a second reader could see an out-of-order event (as the
first reader may claim the first request but fail to copy it back to
userspace and so on returning it to the event list it will be behind the
current event being copied by the second reader).
Reported-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1448462343-2072-1-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
James Morris [Thu, 26 Nov 2015 04:04:19 +0000 (15:04 +1100)]
Merge branch 'upstream' of git://git.infradead.org/users/pcmoore/selinux into for-linus2
Dave Airlie [Thu, 26 Nov 2015 02:42:15 +0000 (12:42 +1000)]
Merge branch 'drm-fixes-4.4' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
Radeon and amdgpu fixes for 4.4:
- DPM fixes for r7xx devices
- VCE fixes for Stoney
- GPUVM fixes
- Scheduler fixes
* 'drm-fixes-4.4' of git://people.freedesktop.org/~agd5f/linux:
drm/radeon: make some dpm errors debug only
drm/radeon: make rv770_set_sw_state failures non-fatal
drm/amdgpu: move dependency handling out of atomic section v2
drm/amdgpu: optimize scheduler fence handling
drm/amdgpu: remove vm->mutex
drm/amdgpu: add mutex for ba_va->valids/invalids
drm/amdgpu: adapt vce session create interface changes
drm/amdgpu: vce use multiple cache surface starting from stoney
drm/amdgpu: reset vce trap interrupt flag
Linus Torvalds [Wed, 25 Nov 2015 23:11:08 +0000 (15:11 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/viro/vfs
Pull vfs fixes from Al Viro:
"A couple of fixes for sendfile lockups caught by Dmitry + a fix for
ancient sysvfs symlink breakage"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
vfs: Avoid softlockups with sendfile(2)
vfs: Make sendfile(2) killable even better
fix sysvfs symlinks
Arnd Bergmann [Wed, 25 Nov 2015 22:48:57 +0000 (23:48 +0100)]
Merge tag 'omap-for-v4.4/fixes-rc2' of git://git./linux/kernel/git/tmlind/linux-omap into fixes
Merge "Fixes for omaps for v4.4-rc cycle" from Tony Lindgren:
- A series of audio changes for dra7 that missed the merge window but turned
out to be necessary to fix a boot time imprecise external abort error and to
getaudio working
- Fix l4 related boot time errors for dm81xx
- Use lockless cldm/pwrdm api in omap4_boot_secondary
- Remove t410 custom abort handler that is no longer needed and may
hide other critical errors
- Mark cpuidle tracepoints as _rcuidle
- Fix module alias for omap-ocp2scp
* tag 'omap-for-v4.4/fixes-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: OMAP4+: SMP: use lockless clkdm/pwrdm api in omap4_boot_secondary
arm: omap2+: add missing HWMOD_NO_IDLEST in 81xx hwmod data
ARM: OMAP2+: remove custom abort handler for t410
ARM: OMAP: DRA7: hwmod: Add data for McASP3
ARM: OMAP2+: hwmod: Add hwmod flag for HWMOD_OPT_CLKS_NEEDED
ARM: dts: dra7: Fix McASP3 node regarding to clocks
bus: omap-ocp2scp: Fix module alias
ARM: OMAP2+: PM: Denote the cpuidle tracepoints as _rcuidle()
Arnd Bergmann [Wed, 25 Nov 2015 22:48:12 +0000 (23:48 +0100)]
Merge tag 'keystone-fixes-for-4.4' of git://git./linux/kernel/git/ssantosh/linux-keystone into fixes
Merge "Few Keystone fixes for 4.4-rcx" from Santosh Shilimkar:
- Fix the optional PDSP firmware loading
- Fix linking RAM setup for QMs
- Fix crash with clk_ignore_unused
* tag 'keystone-fixes-for-4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/ssantosh/linux-keystone:
ARM: dts: keystone: k2l: fix kernel crash when clk_ignore_unused is not in bootargs
soc: ti: knav_qmss_queue: Fix linking RAM setup for queue managers
soc: ti: use request_firmware_direct() as acc firmware is optional
Arnd Bergmann [Wed, 25 Nov 2015 22:47:38 +0000 (23:47 +0100)]
Merge tag 'v4.4-rc2' into fixes
Linux 4.4-rc2 is backmerged from the keystone fixes.
Arnd Bergmann [Wed, 25 Nov 2015 22:45:53 +0000 (23:45 +0100)]
Merge tag 'imx-fixes-4.4' of git://git./linux/kernel/git/shawnguo/linux into fixes
Merge "The i.MX fixes for 4.4" from Shawn Guo:
- Add missing .irq_set_type for i.MX GPC irq_chip. It fixes an issue
that device IRQ type setting doesn't match the one specified in device
tree, since stacked IRQ domain is adopted in GPC driver.
- Fix the wrong spi-num-chipselects settings for Vybrid DSPI devices.
- Fix a merge error in Vybrid dts regarding to ADC device property
fsl,adck-max-frequency
* tag 'imx-fixes-4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
ARM: dts: vfxxx: Fix dspi[01] spi-num-chipselects.
ARM: imx: add platform irq type setting in gpc
ARM: dts: vfxxx: Fix erroneous property in esdhc0 node
Alexandra Yates [Wed, 18 Nov 2015 22:58:40 +0000 (14:58 -0800)]
intel_pstate: Fix "performance" mode behavior with HWP enabled
If hardware-driven P-state selection (HWP) is enabled, the
"performance" mode of intel_pstate should only allow the processor
to use the highest-performance P-state available. That is not
the case currently, so make it actually happen.
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Alexandra Yates <alexandra.yates@linux.intel.com>
[ rjw: Subject and changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Jeff Layton [Wed, 25 Nov 2015 18:43:14 +0000 (13:43 -0500)]
nfs4: resend LAYOUTGET when there is a race that changes the seqid
pnfs_layout_process will check the returned layout stateid against what
the kernel has in-core. If it turns out that the stateid we received is
older, then we should resend the LAYOUTGET instead of falling back to
MDS I/O.
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Cc: stable@vger.kernel.org # 3.18+
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Jeff Layton [Wed, 25 Nov 2015 18:50:11 +0000 (13:50 -0500)]
nfs: if we have no valid attrs, then don't declare the attribute cache valid
If we pass in an empty nfs_fattr struct to nfs_update_inode, it will
(correctly) not update any of the attributes, but it then clears the
NFS_INO_INVALID_ATTR flag, which indicates that the attributes are
up to date. Don't clear the flag if the fattr struct has no valid
attrs to apply.
Reviewed-by: Steve French <steve.french@primarydata.com>
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Jeff Layton [Wed, 25 Nov 2015 18:50:45 +0000 (13:50 -0500)]
nfs: ensure that attrcache is revalidated after a SETATTR
If we get no post-op attributes back from a SETATTR operation, then no
attributes will of course be updated during the call to
nfs_update_inode.
We know however that the attributes are invalid at that point, since we
just changed some of them. At the very least, the ctime will be bogus.
If we get no post-op attributes back on the call, mark the attrcache
invalid to reflect that fact.
Reviewed-by: Steve French <steve.french@primarydata.com>
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Gabriele Paoloni [Wed, 11 Nov 2015 01:12:25 +0000 (09:12 +0800)]
ARM/PCI: Move align_resource function pointer to pci_host_bridge structure
Commit
b3a72384fe29 ("ARM/PCI: Replace pci_sys_data->align_resource with
global function pointer") introduced an ARM-specific align_resource()
function pointer. This is not portable to other arches and doesn't work
for platforms with two different PCIe host bridge controllers.
Move the function pointer to the pci_host_bridge structure so each host
bridge driver can specify its own align_resource() function.
Signed-off-by: Gabriele Paoloni <gabriele.paoloni@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Linus Torvalds [Wed, 25 Nov 2015 19:08:35 +0000 (11:08 -0800)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull more block layer fixes from Jens Axboe:
"I wasn't going to send off a new pull before next week, but the blk
flush fix from Jan from the other day introduced a regression. It's
rare enough not to have hit during testing, since it requires both a
device that rejects the first flush, and bad timing while it does
that. But since someone did hit it, let's get the revert into 4.4-rc3
so we don't have a released rc with that known issue.
Apart from that revert, three other fixes:
- From Christoph, a fix for a missing unmap in NVMe request
preparation.
- An NVMe fix from Nishanth that fixes data corruption on powerpc.
- Also from Christoph, fix a list_del() attempt on blk-mq that didn't
have a matching list_add() at timer start"
* 'for-linus' of git://git.kernel.dk/linux-block:
Revert "blk-flush: Queue through IO scheduler when flush not required"
block: fix blk_abort_request for blk-mq drivers
nvme: add missing unmaps in nvme_queue_rq
NVMe: default to 4k device page size
Grygorii Strashko [Mon, 16 Nov 2015 17:38:53 +0000 (19:38 +0200)]
ARM: OMAP4+: SMP: use lockless clkdm/pwrdm api in omap4_boot_secondary
OMAP CPU hotplug uses cpu1's clocks and power domains for CPU1 wake up
from low power states (or turn on CPU1). This part of code is also
part of system suspend (disable_nonboot_cpus()).
>From other side, cpu1's clocks and power domains are used by CPUIdle. All above
functionality is mutually exclusive and, therefore, lockless clkdm/pwrdm api
can be used in omap4_boot_secondary().
This fixes below back-trace on -RT which is triggered by
pwrdm_lock/unlock():
BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:917
in_atomic(): 1, irqs_disabled(): 0, pid: 118, name: sh
9 locks held by sh/118:
#0: (sb_writers#4){.+.+.+}, at: [<
c0144a6c>] vfs_write+0x13c/0x164
#1: (&of->mutex){+.+.+.}, at: [<
c01b4c70>] kernfs_fop_write+0x48/0x19c
#2: (s_active#24){.+.+.+}, at: [<
c01b4c78>] kernfs_fop_write+0x50/0x19c
#3: (device_hotplug_lock){+.+.+.}, at: [<
c03cbff0>] lock_device_hotplug_sysfs+0xc/0x4c
#4: (&dev->mutex){......}, at: [<
c03cd284>] device_online+0x14/0x88
#5: (cpu_add_remove_lock){+.+.+.}, at: [<
c003af90>] cpu_up+0x50/0x1a0
#6: (cpu_hotplug.lock){++++++}, at: [<
c003ae48>] cpu_hotplug_begin+0x0/0xc4
#7: (cpu_hotplug.lock#2){+.+.+.}, at: [<
c003aec0>] cpu_hotplug_begin+0x78/0xc4
#8: (boot_lock){+.+...}, at: [<
c002b254>] omap4_boot_secondary+0x1c/0x178
Preemption disabled at:[< (null)>] (null)
CPU: 0 PID: 118 Comm: sh Not tainted
4.1.12-rt11-01998-gb4a62c3-dirty #137
Hardware name: Generic DRA74X (Flattened Device Tree)
[<
c0017574>] (unwind_backtrace) from [<
c0013be8>] (show_stack+0x10/0x14)
[<
c0013be8>] (show_stack) from [<
c05a8670>] (dump_stack+0x80/0x94)
[<
c05a8670>] (dump_stack) from [<
c05ad158>] (rt_spin_lock+0x24/0x54)
[<
c05ad158>] (rt_spin_lock) from [<
c0030dac>] (clkdm_wakeup+0x10/0x2c)
[<
c0030dac>] (clkdm_wakeup) from [<
c002b2c0>] (omap4_boot_secondary+0x88/0x178)
[<
c002b2c0>] (omap4_boot_secondary) from [<
c0015d00>] (__cpu_up+0xc4/0x164)
[<
c0015d00>] (__cpu_up) from [<
c003b09c>] (cpu_up+0x15c/0x1a0)
[<
c003b09c>] (cpu_up) from [<
c03cd2d4>] (device_online+0x64/0x88)
[<
c03cd2d4>] (device_online) from [<
c03cd360>] (online_store+0x68/0x74)
[<
c03cd360>] (online_store) from [<
c01b4ce0>] (kernfs_fop_write+0xb8/0x19c)
[<
c01b4ce0>] (kernfs_fop_write) from [<
c0144124>] (__vfs_write+0x20/0xd8)
[<
c0144124>] (__vfs_write) from [<
c01449c0>] (vfs_write+0x90/0x164)
[<
c01449c0>] (vfs_write) from [<
c01451e4>] (SyS_write+0x44/0x9c)
[<
c01451e4>] (SyS_write) from [<
c0010240>] (ret_fast_syscall+0x0/0x54)
CPU1: smp_ops.cpu_die() returned, trying to resuscitate
Cc: Tero Kristo <t-kristo@ti.com>
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Tony Lindgren [Wed, 25 Nov 2015 18:56:40 +0000 (10:56 -0800)]
Merge branch '81xx' into omap-for-v4.4/fixes
Neil Armstrong [Fri, 13 Nov 2015 16:29:53 +0000 (17:29 +0100)]
arm: omap2+: add missing HWMOD_NO_IDLEST in 81xx hwmod data
Add missing HWMOD_NO_IDLEST hwmod flag for entries not
having omap4 clkctrl values.
The emac0 hwmod flag fixes the davinci_emac driver probe
since the return of pm_resume() call is now checked.
This solves the following boot errors :
[ 0.121429] omap_hwmod: l4_ls: _wait_target_ready failed: -16
[ 0.121441] omap_hwmod: l4_ls: cannot be enabled for reset (3)
[ 0.124342] omap_hwmod: l4_hs: _wait_target_ready failed: -16
[ 0.124352] omap_hwmod: l4_hs: cannot be enabled for reset (3)
[ 1.967228] omap_hwmod: emac0: _wait_target_ready failed: -16
Cc: Brian Hutchinson <b.hutchman@gmail.com>
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Jens Axboe [Wed, 25 Nov 2015 17:12:54 +0000 (10:12 -0700)]
Revert "blk-flush: Queue through IO scheduler when flush not required"
This reverts commit
1b2ff19e6a957b1ef0f365ad331b608af80e932e.
Jan writes:
--
Thanks for report! After some investigation I found out we allocate
elevator specific data in __get_request() only for non-flush requests. And
this is actually required since the flush machinery uses the space in
struct request for something else. Doh. So my patch is just wrong and not
easy to fix since at the time __get_request() is called we are not sure
whether the flush machinery will be used in the end. Jens, please revert
1b2ff19e6a957b1ef0f365ad331b608af80e932e. Thanks!
I'm somewhat surprised that you can reliably hit the race where flushing
gets disabled for the device just while the request is in flight. But I
guess during boot it makes some sense.
--
So let's just revert it, we can fix the queue run manually after the
fact. This race is rare enough that it didn't trigger in testing, it
requires the specific disable-while-in-flight scenario to trigger.
Linus Torvalds [Wed, 25 Nov 2015 17:01:49 +0000 (09:01 -0800)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm
Pull KVM fixes from Paolo Bonzini:
"Bug fixes for all architectures. Nothing really stands out"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (21 commits)
KVM: nVMX: remove incorrect vpid check in nested invvpid emulation
arm64: kvm: report original PAR_EL1 upon panic
arm64: kvm: avoid %p in __kvm_hyp_panic
KVM: arm/arm64: vgic: Trust the LR state for HW IRQs
KVM: arm/arm64: arch_timer: Preserve physical dist. active state on LR.active
KVM: arm/arm64: Fix preemptible timer active state crazyness
arm64: KVM: Add workaround for Cortex-A57 erratum 834220
arm64: KVM: Fix AArch32 to AArch64 register mapping
ARM/arm64: KVM: test properly for a PTE's uncachedness
KVM: s390: fix wrong lookup of VCPUs by array index
KVM: s390: avoid memory overwrites on emergency signal injection
KVM: Provide function for VCPU lookup by id
KVM: s390: fix pfmf intercept handler
KVM: s390: enable SIMD only when no VCPUs were created
KVM: x86: request interrupt window when IRQ chip is split
KVM: x86: set KVM_REQ_EVENT on local interrupt request from user space
KVM: x86: split kvm_vcpu_ready_for_interrupt_injection out of dm_request_for_irq_injection
KVM: x86: fix interrupt window handling in split IRQ chip case
MIPS: KVM: Uninit VCPU in vcpu_create error path
MIPS: KVM: Fix CACHE immediate offset sign extension
...
Alex Deucher [Mon, 23 Nov 2015 21:38:12 +0000 (16:38 -0500)]
drm/radeon: make some dpm errors debug only
"Could not force DPM to low", etc. is usually harmless and
just confuses users.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Mark Rutland [Mon, 23 Nov 2015 11:09:11 +0000 (11:09 +0000)]
arm64: efi: correctly map runtime regions
The kernel may use a page granularity of 4K, 16K, or 64K depending on
configuration.
When mapping EFI runtime regions, we use memrange_efi_to_native to round
the physical base address of a region down to a kernel page boundary,
and round the size up to a kernel page boundary, adding the residue left
over from rounding down the physical base address. We do not round down
the virtual base address.
In __create_mapping we account for the offset of the virtual base from a
granule boundary, adding the residue to the size before rounding the
base down to said granule boundary.
Thus we account for the residue twice, and when the residue is non-zero
will cause __create_mapping to map an additional page at the end of the
region. Depending on the memory map, this page may be in a region we are
not intended/permitted to map, or may clash with a different region that
we wish to map. In typical cases, mapping the next item in the memory
map will overwrite the erroneously created entry, as we sort the memory
map in the stub.
As __create_mapping can cope with base addresses which are not page
aligned, we can instead rely on it to map the region appropriately, and
simplify efi_virtmap_init by removing the unnecessary code.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Leif Lindholm <leif.lindholm@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Mark Rutland [Mon, 23 Nov 2015 15:09:36 +0000 (15:09 +0000)]
arm64: mm: fix fault_info table xFSC decoding
We are missing descriptions for some valid xFSC values in the fault info
table (e.g. "TLB conflict abort"), and have erroneous descriptions for
reserved values (e.g. "asynchronous external abort", "debug event").
This patch adds the missing xFSC values, and removes erroneous decoding
of values reserved by the architecture, as described in ARM DDI 0487A.h.
At the same time, fixed the unbalanced brackets for the synchronous
parity error strings in the table.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Arnd Bergmann [Fri, 20 Nov 2015 11:12:21 +0000 (12:12 +0100)]
arm64: fix building without CONFIG_UID16
As reported by Michal Simek, building an ARM64 kernel with CONFIG_UID16
disabled currently fails because the system call table still needs to
reference the individual function entry points that are provided by
kernel/sys_ni.c in this case, and the declarations are hidden inside
of #ifdef CONFIG_UID16:
arch/arm64/include/asm/unistd32.h:57:8: error: 'sys_lchown16' undeclared here (not in a function)
__SYSCALL(__NR_lchown, sys_lchown16)
I believe this problem only exists on ARM64, because older architectures
tend to not need declarations when their system call table is built
in assembly code, while newer architectures tend to not need UID16
support. ARM64 only uses these system calls for compatibility with
32-bit ARM binaries.
This changes the CONFIG_UID16 check into CONFIG_HAVE_UID16, which is
set unconditionally on ARM64 with CONFIG_COMPAT, so we see the
declarations whenever we need them, but otherwise the behavior is
unchanged.
Fixes: af1839eb4bd4 ("Kconfig: clean up the long arch list for the UID16 config option")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Geliang Tang [Wed, 25 Nov 2015 13:23:07 +0000 (21:23 +0800)]
drm/mm: use list_next_entry
To make the intention clearer, use list_next_entry instead of list_entry.
Signed-off-by: Geliang Tang <geliangtang@163.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Nicolas Pitre [Mon, 23 Nov 2015 03:44:19 +0000 (22:44 -0500)]
ARM: orion5x: Fix legacy get_irqnr_and_base
Commit
5be9fc23cd ("ARM: orion5x: fix legacy orion5x IRQ numbers") shifted
IRQ numbers by one but didn't update the get_irqnr_and_base macro
accordingly. This macro is involved when CONFIG_MULTI_IRQ_HANDLER
is not defined.
[jac:
5d6bed2a9c went in to v4.2, but was backported to v3.18]
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Fixes: 5be9fc23cd ("ARM: orion5x: fix legacy orion5x IRQ numbers")
Cc: <stable@vger.kernel.org> # v3.18+
Signed-off-by: Jason Cooper <jason@lakedaemon.net>
Nicolas Pitre [Mon, 23 Nov 2015 03:40:03 +0000 (22:40 -0500)]
ARM: dove: Fix legacy get_irqnr_and_base
Commit
5d6bed2a9c ("ARM: dove: fix legacy dove IRQ numbers") shifted
IRQ numbers by one but didn't update the get_irqnr_and_base macro
accordingly. This macro is involved when CONFIG_MULTI_IRQ_HANDLER
is not defined.
[jac:
5d6bed2a9c went in to v4.2, but was backported to v3.18]
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Fixes: 5d6bed2a9c ("ARM: dove: fix legacy dove IRQ numbers")
Cc: <stable@vger.kernel.org> # v3.18+
Signed-off-by: Jason Cooper <jason@lakedaemon.net>
Haozhong Zhang [Wed, 25 Nov 2015 09:21:39 +0000 (17:21 +0800)]
KVM: nVMX: remove incorrect vpid check in nested invvpid emulation
This patch removes the vpid check when emulating nested invvpid
instruction of type all-contexts invalidation. The existing code is
incorrect because:
(1) According to Intel SDM Vol 3, Section "INVVPID - Invalidate
Translations Based on VPID", invvpid instruction does not check
vpid in the invvpid descriptor when its type is all-contexts
invalidation.
(2) According to the same document, invvpid of type all-contexts
invalidation does not require there is an active VMCS, so/and
get_vmcs12() in the existing code may result in a NULL-pointer
dereference. In practice, it can crash both KVM itself and L1
hypervisors that use invvpid (e.g. Xen).
Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Holger Hoffstätte [Tue, 17 Nov 2015 11:29:32 +0000 (12:29 +0100)]
btrfs: fix balance range usage filters in 4.4-rc
There's a regression in 4.4-rc since commit
bc3094673f22
(btrfs: extend balance filter usage to take minimum and maximum) in that
existing (non-ranged) balance with -dusage=x no longer works; all chunks
are skipped.
After staring at the code for a while and wondering why a non-ranged
balance would even need min and max thresholds (..which then were not
set correctly, leading to the bug) I realized that the only problem
was the fact that the filter functions were named wrong, thanks to
patching copypasta. Simply renaming both functions lets the existing
btrfs-progs call balance with -dusage=x and now the non-ranged filter
function is invoked, properly using only a single chunk limit.
Signed-off-by: Holger Hoffstätte <holger.hoffstaette@googlemail.com>
Fixes: bc3094673f22 ("btrfs: extend balance filter usage to take minimum and maximum")
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
Mark Fasheh [Thu, 5 Nov 2015 22:38:00 +0000 (14:38 -0800)]
btrfs: qgroup: account shared subtree during snapshot delete
Commit
0ed4792 ('btrfs: qgroup: Switch to new extent-oriented qgroup
mechanism.') removed our qgroup accounting during
btrfs_drop_snapshot(). Predictably, this results in qgroup numbers
going bad shortly after a snapshot is removed.
Fix this by adding a dirty extent record when we encounter extents during
our shared subtree walk. This effectively restores the functionality we had
with the original shared subtree walking code in
1152651 (btrfs: qgroup:
account shared subtrees during snapshot delete).
The idea with the original patch (and this one) is that shared subtrees can
get skipped during drop_snapshot. The shared subtree walk then allows us a
chance to visit those extents and add them to the qgroup work for later
processing. This ultimately makes the accounting for drop snapshot work.
The new qgroup code nicely handles all the other extents during the tree
walk via the ref dec/inc functions so we don't have to add actions beyond
what we had originally.
Signed-off-by: Mark Fasheh <mfasheh@suse.de>
Signed-off-by: Chris Mason <clm@fb.com>
Josef Bacik [Thu, 5 Nov 2015 22:37:58 +0000 (14:37 -0800)]
Btrfs: use btrfs_get_fs_root in resolve_indirect_ref
The backref code will look up the fs_root we're trying to resolve our indirect
refs for, unfortunately we use btrfs_read_fs_root_no_name, which returns -ENOENT
if the ref is 0. This isn't helpful for the qgroup stuff with snapshot delete
as it won't be able to search down the snapshot we are deleting, which will
cause us to miss roots. So use btrfs_get_fs_root and send false for check_ref
so we can always get the root we're looking for. Thanks,
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.de>
Signed-off-by: Chris Mason <clm@fb.com>
Justin Maggard [Fri, 6 Nov 2015 18:36:42 +0000 (10:36 -0800)]
btrfs: qgroup: fix quota disable during rescan
There's a race condition that leads to a NULL pointer dereference if you
disable quotas while a quota rescan is running. To fix this, we just need
to wait for the quota rescan worker to actually exit before tearing down
the quota structures.
Signed-off-by: Justin Maggard <jmaggard@netgear.com>
Signed-off-by: Chris Mason <clm@fb.com>
Filipe Manana [Mon, 23 Nov 2015 15:25:16 +0000 (15:25 +0000)]
Btrfs: fix race between cleaner kthread and space cache writeout
When a block group becomes unused and the cleaner kthread is currently
running, we can end up getting the current transaction aborted with error
-ENOENT when we try to commit the transaction, leading to the following
trace:
[59779.258768] WARNING: CPU: 3 PID: 5990 at fs/btrfs/extent-tree.c:3740 btrfs_write_dirty_block_groups+0x17c/0x214 [btrfs]()
[59779.272594] BTRFS: Transaction aborted (error -2)
(...)
[59779.291137] Call Trace:
[59779.291621] [<
ffffffff812566f4>] dump_stack+0x4e/0x79
[59779.292543] [<
ffffffff8104d0a6>] warn_slowpath_common+0x9f/0xb8
[59779.293435] [<
ffffffffa04cb81f>] ? btrfs_write_dirty_block_groups+0x17c/0x214 [btrfs]
[59779.295000] [<
ffffffff8104d107>] warn_slowpath_fmt+0x48/0x50
[59779.296138] [<
ffffffffa04c2721>] ? write_one_cache_group.isra.32+0x77/0x82 [btrfs]
[59779.297663] [<
ffffffffa04cb81f>] btrfs_write_dirty_block_groups+0x17c/0x214 [btrfs]
[59779.299141] [<
ffffffffa0549b0d>] commit_cowonly_roots+0x1de/0x261 [btrfs]
[59779.300359] [<
ffffffffa04dd5b6>] btrfs_commit_transaction+0x4c4/0x99c [btrfs]
[59779.301805] [<
ffffffffa04b5df4>] btrfs_sync_fs+0x145/0x1ad [btrfs]
[59779.302893] [<
ffffffff81196634>] sync_filesystem+0x7f/0x93
(...)
[59779.318186] ---[ end trace
577e2daff90da33a ]---
The following diagram illustrates a sequence of steps leading to this
problem:
CPU 1 CPU 2
<at transaction N>
adds bg A to list
fs_info->unused_bgs
adds bg B to list
fs_info->unused_bgs
<transaction kthread
commits transaction N
and wakes up the
cleaner kthread>
cleaner kthread
delete_unused_bgs()
sees bg A in list
fs_info->unused_bgs
btrfs_start_transaction()
<transaction N + 1 starts>
deletes bg A
update_block_group(bg C)
--> adds bg C to list
fs_info->unused_bgs
deletes bg B
sees bg C in the list
fs_info->unused_bgs
btrfs_remove_chunk(bg C)
btrfs_remove_block_group(bg C)
--> checks if the block group
is in a dirty list, and
because it isn't now, it
does nothing
--> the block group item
is deleted from the
extent tree
--> adds bg C to list
transaction->dirty_bgs
some task calls
btrfs_commit_transaction(t N + 1)
commit_cowonly_roots()
btrfs_write_dirty_block_groups()
--> sees bg C in cur_trans->dirty_bgs
--> calls write_one_cache_group()
which returns -ENOENT because
it did not find the block group
item in the extent tree
--> transaction aborte with -ENOENT
because write_one_cache_group()
returned that error
So fix this by adding a block group to the list of dirty block groups
before adding it to the list of unused block groups.
This happened on a stress test using fsstress plus concurrent calls to
fallocate 20G and truncate (releasing part of the space allocated with
fallocate).
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
Filipe Manana [Thu, 19 Nov 2015 11:45:48 +0000 (11:45 +0000)]
Btrfs: fix scrub preventing unused block groups from being deleted
Currently scrub can race with the cleaner kthread when the later attempts
to delete an unused block group, and the result is preventing the cleaner
kthread from ever deleting later the block group - unless the block group
becomes used and unused again. The following diagram illustrates that
race:
CPU 1 CPU 2
cleaner kthread
btrfs_delete_unused_bgs()
gets block group X from
fs_info->unused_bgs and
removes it from that list
scrub_enumerate_chunks()
searches device tree using
its commit root
finds device extent for
block group X
gets block group X from the tree
fs_info->block_group_cache_tree
(via btrfs_lookup_block_group())
sets bg X to RO
sees the block group is
already RO and therefore
doesn't delete it nor adds
it back to unused list
So fix this by making scrub add the block group again to the list of
unused block groups if the block group is still unused when it finished
scrubbing it and it hasn't been removed already.
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
Filipe Manana [Thu, 19 Nov 2015 10:57:20 +0000 (10:57 +0000)]
Btrfs: fix race between scrub and block group deletion
Scrub can race with the cleaner kthread deleting block groups that are
unused (and with relocation too) leading to a failure with error -EINVAL
that gets returned to user space.
The following diagram illustrates how it happens:
CPU 1 CPU 2
cleaner kthread
btrfs_delete_unused_bgs()
gets block group X from
fs_info->unused_bgs
sets block group to RO
btrfs_remove_chunk(bg X)
deletes device extents
scrub_enumerate_chunks()
searches device tree using
its commit root
finds device extent for
block group X
gets block group X from the tree
fs_info->block_group_cache_tree
(via btrfs_lookup_block_group())
sets bg X to RO (again)
btrfs_remove_block_group(bg X)
deletes block group from
fs_info->block_group_cache_tree
removes extent map from
fs_info->mapping_tree
scrub_chunk(offset X)
searches fs_info->mapping_tree
for extent map starting at
offset X
--> doesn't find any such
extent map
--> returns -EINVAL and scrub
errors out to userspace
with -EINVAL
Fix this by dealing with an extent map lookup failure as an indicator of
block group deletion.
Issue reproduced with fstest btrfs/071.
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>