git.openwrt.org Git - openwrt/staging/blogic.git/log

of/irq: move of_msi_map_rid declaration to the correct ifdef section

In checking fixes for of_irq_find_parent declaration location, I found
that of_msi_map_rid is also wrong. of_msi_map_rid is not implemented for
Sparc, so it should not be in the Sparc specific section of the header.
Move it to just depend on OF_IRQ.

Cc: Frank Rowand <frowand.list@gmail.com>
Signed-off-by: Rob Herring <robh@kernel.org>

of/irq: Export of_irq_find_parent again

of_irq_find_parent was made static since it had no users outside of
of_irq.c. Export it again since we are going to use it again.

Signed-off-by: Carlo Caione <carlo@endlessm.com>
[robh: move of_irq_find_parent to correct ifdef section]
Signed-off-by: Rob Herring <robh@kernel.org>

of/fdt: Add mutex protection for calls to __unflatten_device_tree()

__unflatten_device_tree() calls unflatten_dt_node(), which declares
a static variable. It is therefore not reentrant.

One of the callers of __unflatten_device_tree(), unflatten_device_tree(),
is only called once during early initialization and does not need to be
protected. The other caller, of_fdt_unflatten_tree(), can be called at
any time, possibly multiple times in parallel. This can happen, for
example, if multiple devicetree overlays have to be loaded and installed.

Without this protection, errors such as the following may be seen.

kernel: End of tree marker overwritten: e6a3a458
kernel: find_target_node:
Failed to find target-indirect node at /fragment@0
kernel: __of_overlay_create: of_build_overlay_info() failed for tree@/

Add a mutex to of_fdt_unflatten_tree() to make the call reentrant.

Cc: Pantelis Antoniou <pantelis.antoniou@konsulko.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Cc: stable@vger.kernel.org # v4.1+
Signed-off-by: Rob Herring <robh@kernel.org>

of/address: fix typo in comment block of of_translate_one()

Remove the "not" before "cannot".

I am fixing the comment block style while I am here.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Rob Herring <robh@kernel.org>

of: do not use 0x in front of %pa

Signed-off-by: Dmitry V. Krivenok <krivenok.dmitry@gmail.com>
Signed-off-by: Rob Herring <robh@kernel.org>

of: Fix comparison of reserved memory regions

In order to check for overlapping reserved memory regions, we first need
to sort the array of memory regions. This is implemented using sort(),
and a custom comparison function __rmem_cmp().

Unfortunatley __rmem_cmp() doesn't work in all cases. Because the two
base values are phys_addr_t, they may be u64 on some platforms, in which
case subtracting one from the other and then (implicitly) casting to int
does not give us the -ve/0/+ve value we need.

This leads to incorrect reports about overlaps, eg:

ibm,slw-image@1ffe600000 (0x0000001ffe600000--0x0000001ffe700000) overlaps with
ibm,firmware-allocs-memory@1000000000 (0x0000001000000000--0x0000001000dc0200)

Fix it by just doing the standard double if and return 0 logic.

Fixes: ae1add247bf8 ("of: Check for overlap in reserved memory regions")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Rob Herring <robh@kernel.org>

Linux 4.4-rc3

Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux

Pull nouveau and radeon fixes from Dave Airlie:
"Just some nouveau and radeon/amdgpu fixes.

  The nouveau fixes look large as the firmware context files are
  regenerated, but the actual change is quite small"

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  drm/radeon: make some dpm errors debug only
  drm/nouveau/volt/pwm/gk104: fix an off-by-one resulting in the voltage not being set
  drm/nouveau/nvif: allow userspace access to its own client object
  drm/nouveau/gr/gf100-: fix oops when calling zbc methods
  drm/nouveau/gr/gf117-: assume no PPC if NV_PGRAPH_GPC_GPM_PD_PES_TPC_ID_MASK is zero
  drm/nouveau/gr/gf117-: read NV_PGRAPH_GPC_GPM_PD_PES_TPC_ID_MASK from correct GPC
  drm/nouveau/gr/gf100-: split out per-gpc address calculation macro
  drm/nouveau/bios: return actual size of the buffer retrieved via _ROM
  drm/nouveau/instmem: protect instobj list with a spinlock
  drm/nouveau/pci: enable c800 magic for some unknown Samsung laptop
  drm/nouveau/pci: enable c800 magic for Clevo P157SM
  drm/radeon: make rv770_set_sw_state failures non-fatal
  drm/amdgpu: move dependency handling out of atomic section v2
  drm/amdgpu: optimize scheduler fence handling
  drm/amdgpu: remove vm->mutex
  drm/amdgpu: add mutex for ba_va->valids/invalids
  drm/amdgpu: adapt vce session create interface changes
  drm/amdgpu: vce use multiple cache surface starting from stoney
  drm/amdgpu: reset vce trap interrupt flag

Merge tag 'rtc-4.4-2' of git://git./linux/kernel/git/abelloni/linux

Pull RTC fixes from Alexandre Belloni:
"Two fixes for the ds1307 alarm and wakeup"

* tag 'rtc-4.4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux:
rtc: ds1307: fix alarm reading at probe time
rtc: ds1307: fix kernel splat due to wakeup irq handling

Merge branch 'upstream' of git://git.linux-mips.org/ralf/upstream-linus

Pull MIPS fix from Ralf Baechle:
"Just a fix for empty loops that may be removed by non-antique GCC"

* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
MIPS: Fix delay loops which may be removed by GCC.

Merge branch 'for-linus' of git://git./linux/kernel/git/geert/linux-m68k

Pull m68k fixes from Geert Uytterhoeven:
"Summary:

   - Add missing initialization of max_pfn, which is needed to make
     selftests/vm/mlock2-tests succeed,

   - Wire up new mlock2 syscall"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
  m68k: Wire up mlock2
  m68knommu: Add missing initialization of max_pfn and {min,max}_low_pfn
  m68k/mm: sun3 - Add missing initialization of max_pfn and {min,max}_low_pfn
  m68k/mm: m54xx - Add missing initialization of max_pfn
  m68k/mm: motorola - Add missing initialization of max_pfn

Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm

Pull ARM fixes from Russell King:
"Just two changes this time around:

   - wire up the new mlock2 syscall added during the last merge window

   - fix a build problem with certain configurations provoked by making
     CONFIG_OF user selectable"

* 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
  ARM: 8454/1: OF implies OF_FLATTREE
  ARM: wire up mlock2 syscall

Merge git://git./linux/kernel/git/nab/target-pending

Pull SCSI target fixes from Nicholas Bellinger:
- fix tcm-user backend driver expired cmd time processing (agrover)
- eliminate kref_put_spinlock_irqsave() for I/O completion (bart)
- fix iscsi login kthread failure case hung task regression (nab)
- fix COMPARE_AND_WRITE completion use-after-free race (nab)
- fix COMPARE_AND_WRITE with SCF_PASSTHROUGH_SG_TO_MEM_NOALLOC non zero
   SGL offset data corruption.  (Jan + Doug)
- fix >= v4.4-rc1 regression for tcm_qla2xxx enable configfs attribute
   (Himanshu + HCH)

* git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
  target/stat: print full t10_wwn.model buffer
  target: fix COMPARE_AND_WRITE non zero SGL offset data corruption
  qla2xxx: Fix regression introduced by target configFS changes
  kref: Remove kref_put_spinlock_irqsave()
  target: Invoke release_cmd() callback without holding a spinlock
  target: Fix race for SCF_COMPARE_AND_WRITE_POST checking
  iscsi-target: Fix rx_login_comp hang after login failure
  iscsi-target: return -ENOMEM instead of -1 in case of failed kmalloc()
  target/user: Do not set unused fields in tcmu_ops
  target/user: Fix time calc in expired cmd processing

Merge branch 'next' of git://git./linux/kernel/git/rzhang/linux

Pull thermal management fixes from Zhang Rui:
"Specifics:

- several fixes and cleanups on Rockchip thermal drivers.

- add the missing support of RK3368 SoCs in Rockchip driver.

- small fixes on of-thermal, power_allocator, rcar driver, IMX, and
   QCOM drivers, and also compilation fixes, on thermal.h, when thermal
   is not selected"

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux:
  imx: thermal: use CPU temperature grade info for thresholds
  thermal: fix thermal_zone_bind_cooling_device prototype
  Revert "thermal: qcom_spmi: allow compile test"
  thermal: rcar_thermal: remove redundant operation
  thermal: of-thermal: Reduce log level for message when can't fine thermal zone
  thermal: power_allocator: Use temperature reading from tz
  thermal: rockchip: Support the RK3368 SoCs in thermal driver
  thermal: rockchip: consistently use int for temperatures
  thermal: rockchip: Add the sort mode for adc value increment or decrement
  thermal: rockchip: improve the conversion function
  thermal: rockchip: trivial: fix typo in commit
  thermal: rockchip: better to compatible the driver for different SoCs
  dt-bindings: rockchip-thermal: Support the RK3368 SoCs compatible

target/stat: print full t10_wwn.model buffer

Cut 'n paste error saw it only process sizeof(t10_wwn.vendor) characters.

Signed-off-by: David Disseldorp <ddiss@suse.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>

target: fix COMPARE_AND_WRITE non zero SGL offset data corruption

target_core_sbc's compare_and_write functionality suffers from taking
data at the wrong memory location when writing a CAW request to disk
when a SGL offset is non-zero.

This can happen with loopback and vhost-scsi fabric drivers when
SCF_PASSTHROUGH_SG_TO_MEM_NOALLOC is used to map existing user-space
SGL memory into COMPARE_AND_WRITE READ/WRITE payload buffers.

Given the following sample LIO subtopology,

% targetcli ls /loopback/
o- loopback ................................. [1 Target]
  o- naa.6001405ebb8df14a ....... [naa.60014059143ed2b3]
    o- luns ................................... [2 LUNs]
      o- lun0 ................ [iblock/ram0 (/dev/ram0)]
      o- lun1 ................ [iblock/ram1 (/dev/ram1)]
% lsscsi -g
[3:0:1:0]    disk    LIO-ORG  IBLOCK           4.0   /dev/sdc   /dev/sg3
[3:0:1:1]    disk    LIO-ORG  IBLOCK           4.0   /dev/sdd   /dev/sg4

the following bug can be observed in Linux 4.3 and 4.4~rc1:

% perl -e 'print chr$_ for 0..255,reverse 0..255' >rand
% perl -e 'print "\0" x 512' >zero
% cat rand >/dev/sdd
% sg_compare_and_write -i rand -D zero --lba 0 /dev/sdd
% sg_compare_and_write -i zero -D rand --lba 0 /dev/sdd
Miscompare reported
% hexdump -Cn 512 /dev/sdd
00000000  0f 0e 0d 0c 0b 0a 09 08  07 06 05 04 03 02 01 00
00000010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
*
00000200

Rather than writing all-zeroes as instructed with the -D file, it
corrupts the data in the sector by splicing some of the original
bytes in. The page of the first entry of cmd->t_data_sg includes the
CDB, and sg->offset is set to a position past the CDB. I presume that
sg->offset is also the right choice to use for subsequent sglist
members.

Signed-off-by: Jan Engelhardt <jengelh@netitwork.de>
Tested-by: Douglas Gilbert <dgilbert@interlog.com>
Cc: <stable@vger.kernel.org> # v3.12+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>

qla2xxx: Fix regression introduced by target configFS changes

this patch fixes following regression

# targetcli
[Errno 13] Permission denied: '/sys/kernel/config/target/qla2xxx/21:00:00:0e:1e:08:c7:20/tpgt_1/enable'

Fixes: 2eafd72939fd ("target: use per-attribute show and store methods")
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>

kref: Remove kref_put_spinlock_irqsave()

The last user is gone. Hence remove this function.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Joern Engel <joern@logfs.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>

target: Invoke release_cmd() callback without holding a spinlock

This patch fixes the following kernel warning because it avoids that
IRQs are disabled while ft_release_cmd() is invoked (fc_seq_set_resp()
invokes spin_unlock_bh()):

WARNING: CPU: 3 PID: 117 at kernel/softirq.c:150 __local_bh_enable_ip+0xaa/0x110()
Call Trace:
[<ffffffff814f71eb>] dump_stack+0x4f/0x7b
[<ffffffff8105e56a>] warn_slowpath_common+0x8a/0xc0
[<ffffffff8105e65a>] warn_slowpath_null+0x1a/0x20
[<ffffffff81062b2a>] __local_bh_enable_ip+0xaa/0x110
[<ffffffff814ff229>] _raw_spin_unlock_bh+0x39/0x40
[<ffffffffa03a7f94>] fc_seq_set_resp+0xe4/0x100 [libfc]
[<ffffffffa02e604a>] ft_free_cmd+0x4a/0x90 [tcm_fc]
[<ffffffffa02e6972>] ft_release_cmd+0x12/0x20 [tcm_fc]
[<ffffffffa042bd66>] target_release_cmd_kref+0x56/0x90 [target_core_mod]
[<ffffffffa042caf0>] target_put_sess_cmd+0xc0/0x110 [target_core_mod]
[<ffffffffa042cb81>] transport_release_cmd+0x41/0x70 [target_core_mod]
[<ffffffffa042d975>] transport_generic_free_cmd+0x35/0x420 [target_core_mod]

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Acked-by: Joern Engel <joern@logfs.org>
Reviewed-by: Andy Grover <agrover@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>

target: Fix race for SCF_COMPARE_AND_WRITE_POST checking

This patch addresses a race + use after free where the first
stage of COMPARE_AND_WRITE in compare_and_write_callback()
is rescheduled after the backend sends the secondary WRITE,
resulting in second stage compare_and_write_post() callback
completing in target_complete_ok_work() before the first
can return.

Because current code depends on checking se_cmd->se_cmd_flags
after return from se_cmd->transport_complete_callback(),
this results in first stage having SCF_COMPARE_AND_WRITE_POST
set, which incorrectly falls through into second stage CAW
processing code, eventually triggering a NULL pointer
dereference due to use after free.

To address this bug, pass in a new *post_ret parameter into
se_cmd->transport_complete_callback(), and depend upon this
value instead of ->se_cmd_flags to determine when to return
or fall through into ->queue_status() code for CAW.

Cc: Sagi Grimberg <sagig@mellanox.com>
Cc: <stable@vger.kernel.org> # v3.12+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>

iscsi-target: Fix rx_login_comp hang after login failure

This patch addresses a case where iscsi_target_do_tx_login_io()
fails sending the last login response PDU, after the RX/TX
threads have already been started.

The case centers around iscsi_target_rx_thread() not invoking
allow_signal(SIGINT) before the send_sig(SIGINT, ...) occurs
from the failure path, resulting in RX thread hanging
indefinately on iscsi_conn->rx_login_comp.

Note this bug is a regression introduced by:

  commit e54198657b65625085834847ab6271087323ffea
  Author: Nicholas Bellinger <nab@linux-iscsi.org>
  Date:   Wed Jul 22 23:14:19 2015 -0700

      iscsi-target: Fix iscsit_start_kthreads failure OOPs

To address this bug, complete ->rx_login_complete for good
measure in the failure path, and immediately return from
RX thread context if connection state did not actually reach
full feature phase (TARG_CONN_STATE_LOGGED_IN).

Cc: Sagi Grimberg <sagig@mellanox.com>
Cc: <stable@vger.kernel.org> # v3.10+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>

iscsi-target: return -ENOMEM instead of -1 in case of failed kmalloc()

Smatch complains about returning hard coded error codes, silence this
warning.

drivers/target/iscsi/iscsi_target_parameters.c:211
iscsi_create_default_params() warn: returning -1 instead of -ENOMEM is sloppy

Signed-off-by: Luis de Bethencourt <luisbg@osg.samsung.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>

target/user: Do not set unused fields in tcmu_ops

TCMU sets TRANSPORT_FLAG_PASSTHROUGH, so INQUIRY commands will not be
emulated by LIO but passed up to userspace. Therefore TCMU should not
set these, just like pscsi doesn't.

Signed-off-by: Andy Grover <agrover@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>

target/user: Fix time calc in expired cmd processing

Reversed arguments meant that we were doing nothing for cmds whose deadline
had passed.

Signed-off-by: Andy Grover <agrover@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>

ARM: 8454/1: OF implies OF_FLATTREE

On the ARM architecture, individual platforms select CONFIG_USE_OF if they
need it, but all device tree code is keyed off CONFIG_OF. When building
a platform without DT support and manually enabling CONFIG_OF, we now
get a number of build errors, e.g.

arch/arm/kernel/devtree.c: In function 'setup_machine_fdt':
arch/arm/kernel/devtree.c:215:19: error: implicit declaration of function 'early_init_dt_verify' [-Werror=implicit-function-declaration]

We could now try to separate the use case of booting from DT vs. the
case of using the dynamic implementation, but that seems more complicated
than it can gain us.

This simply changes the ARM Kconfig file to always enable OF_RESERVED_MEM
and OF_EARLY_FLATTREE when CONFIG_OF is enabled. These options add a little
extra code when we just want the dynamic OF implementation, but that seems
like a rather obscure case, and this version solves all CONFIG_OF related
randconfig regressions.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 0166dc11be91 ("of: make CONFIG_OF user selectable")
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

Merge tag 'pci-v4.4-fixes-1' of git://git./linux/kernel/git/helgaas/pci

Pull PCI fixes from Bjorn Helgaas:
"Here are a few fixes I'd like to have in v4.4: a generic one for sysfs
  and three for HiSilicon and DesignWare host controllers.

  Summary:

  NUMA:
   - Prevent out of bounds access in numa_node override (Mathias Krause)

  HiSilicon host bridge driver:
   - Fix deferred probing (Arnd Bergmann)

  Synopsys DesignWare host bridge driver:
   - Remove incorrect io_base assignment (Stanimir Varbanov)
   - Move align_resource function pointer to pci_host_bridge structure
     (Gabriele Paoloni)"

* tag 'pci-v4.4-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  ARM/PCI: Move align_resource function pointer to pci_host_bridge structure
  PCI: hisi: Fix deferred probing
  PCI: designware: Remove incorrect io_base assignment
  PCI: Prevent out of bounds access in numa_node override

Merge tag 'nfs-for-4.4-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client bugfixes from Trond Myklebust:
"Highlights include:

  Stable patches:
   - Fix a NFSv4 callback identifier leak that was also causing client
     crashes
   - Fix NFSv4 callback decoding issues when incoming requests are
     truncated
   - Don't declare the attribute cache valid when we call
     nfs_update_inode with an empty attribute structure.
   - Resend LAYOUTGET when there is a race that changes the seqid

  Bugfixes:
   - Fix a number of issues with the NFSv4.2 CLONE ioctl()
   - Properly set NFS v4.2 NFSDBG_FACILITY
   - NFSv4 referrals are broken; Cleanup FATTR4_WORD0_FS_LOCATIONS after
     decoding success
   - Use sliding delay when LAYOUTGET gets NFS4ERR_DELAY
   - Ensure that attrcache is revalidated after a SETATTR"

* tag 'nfs-for-4.4-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  nfs4: resend LAYOUTGET when there is a race that changes the seqid
  nfs: if we have no valid attrs, then don't declare the attribute cache valid
  nfs: ensure that attrcache is revalidated after a SETATTR
  nfs4: limit callback decoding to received bytes
  nfs4: start callback_ident at idr 1
  nfs: use sliding delay when LAYOUTGET gets NFS4ERR_DELAY
  NFS4: Cleanup FATTR4_WORD0_FS_LOCATIONS after decoding success
  NFS: Properly set NFS v4.2 NFSDBG_FACILITY
  nfs: reduce the amount of ifdefs for v4.2 in nfs4file.c
  nfs: use btrfs ioctl defintions for clone
  nfs: allow intra-file CLONE
  nfs: offer native ioctls even if CONFIG_COMPAT is set
  nfs: pass on count for CLONE operations

Merge git://www.linux-watchdog.org/linux-watchdog

Pull watchdog fixes from Wim Van Sebroeck:
- a null pointer dereference fix for omap_wdt
- some clock related fixes for pnx4008
- an underflow fix in wdt_set_timeout() for w83977f_wdt
- restart fix for tegra wdt
- Kconfig change to support Freescale Layerscape platforms
- fix for stopping the mtk_wdt watchdog

* git://www.linux-watchdog.org/linux-watchdog:
  watchdog: mtk_wdt: Use MODE_KEY when stopping the watchdog
  watchdog: Add support for Freescale Layerscape platforms
  watchdog: tegra: Stop watchdog first if restarting
  watchdog: w83977f_wdt: underflow in wdt_set_timeout()
  watchdog: pnx4008: make global wdt_clk static
  watchdog: pnx4008: fix warnings caused by enabling unprepared clock
  watchdog: omap_wdt: fix null pointer dereference

Merge branch 'for-linus-4.4' of git://git./linux/kernel/git/mason/linux-btrfs

Pull btrfs fixes from Chris Mason:
"This has Mark Fasheh's patches to fix quota accounting during subvol
  deletion, which we've been working on for a while now.  The patch is
  pretty small but it's a key fix.

  Otherwise it's a random assortment"

* 'for-linus-4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
  btrfs: fix balance range usage filters in 4.4-rc
  btrfs: qgroup: account shared subtree during snapshot delete
  Btrfs: use btrfs_get_fs_root in resolve_indirect_ref
  btrfs: qgroup: fix quota disable during rescan
  Btrfs: fix race between cleaner kthread and space cache writeout
  Btrfs: fix scrub preventing unused block groups from being deleted
  Btrfs: fix race between scrub and block group deletion
  btrfs: fix rcu warning during device replace
  btrfs: Continue replace when set_block_ro failed
  btrfs: fix clashing number of the enhanced balance usage filter
  Btrfs: fix the number of transaction units needed to remove a block group
  Btrfs: use global reserve when deleting unused block group after ENOSPC
  Btrfs: tests: checking for NULL instead of IS_ERR()
  btrfs: fix signed overflows in btrfs_sync_file

Merge branch 'for-linus2' of git://git./linux/kernel/git/jmorris/linux-security

Pull security layer fixes from James Morris:
"A fix for SELinux policy processing (regression introduced by
  commit fa1aa143ac4a: "selinux: extended permissions for ioctls"), as
  well as a fix for the user-triggerable oops in the Keys code"

* 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
  KEYS: Fix handling of stored error in a negatively instantiated user key
  selinux: fix bug in conditional rules handling

Merge tag 'fixes-for-linus' of git://git./linux/kernel/git/arm/arm-soc

Pull ARM SoC fixes from Arnd Bergmann:
"There is a small backlog of at91 patches here, the most significant is
  the addition of some sama5d2 Xplained nodes that were waiting on an
  MFD include file to get merged through another tree.

  We normally try to sort those out before the merge window opens, but
  the maintainer wasn't aware of that here and I decided to merge the
  changes this time as an exception.

  On OMAP a series of audio changes for dra7 missed the merge window but
  turned out to be necessary to fix a boot time imprecise external abort
  error and to get audio working.

  The other changes are the usual simple changes, here is a list sorted
  by platform:

  at91:
removal of a useless defconfig option
removal of some legacy DT pieces
use of the proper watchdog compatible string
update of the MAINTAINERS entries for some Atmel drivers

  drivers/scpi:
hide get_scpi_ops in module from built-in code

  imx:
add missing .irq_set_type for i.MX GPC irq_chip.
fix the wrong spi-num-chipselects settings for Vybrid DSPI devices.
fix a merge error in Vybrid dts regarding to ADC device property

  keystone:
        fix the optional PDSP firmware loading
        fix linking RAM setup for QMs
        fix crash with clk_ignore_unused

  mediatek:
Enable SCPSYS power domain driver by default

  mvebu:
fix QNAP TS219 power-off in dts
fix legacy get_irqnr_and_base for dove and orion5x

  omap:
fix l4 related boot time errors for dm81xx
use lockless cldm/pwrdm api in omap4_boot_secondary
remove t410 abort handler to avoid hiding other critical errors
mark cpuidle tracepoints as _rcuidle
fix module alias for omap-ocp2scp

  pxa:
palm: Fix typos in PWM lookup table code

  renesas:
missing __initconst annotation for r8a7793_boards_compat_dt

  rockchip:
disable mmc-tuning on the veyron-minnie board
adding the init state for the over-temperature-protection

  zx:
only build power domain code when CONFIG_PM=y"

* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (31 commits)
  ARM: OMAP4+: SMP: use lockless clkdm/pwrdm api in omap4_boot_secondary
  arm: omap2+: add missing HWMOD_NO_IDLEST in 81xx hwmod data
  ARM: orion5x: Fix legacy get_irqnr_and_base
  ARM: dove: Fix legacy get_irqnr_and_base
  soc: Mediatek: Enable SCPSYS power domain driver by default
  ARM: dts: vfxxx: Fix dspi[01] spi-num-chipselects.
  ARM: dts: keystone: k2l: fix kernel crash when clk_ignore_unused is not in bootargs
  soc: ti: knav_qmss_queue: Fix linking RAM setup for queue managers
  soc: ti: use request_firmware_direct() as acc firmware is optional
  ARM: imx: add platform irq type setting in gpc
  ARM: dts: vfxxx: Fix erroneous property in esdhc0 node
  ARM: shmobile: r8a7793: proper constness with __initconst
  scpi: hide get_scpi_ops in module from built-in code
  ARM: zx: only build power domain code when CONFIG_PM=y
  ARM: pxa: palm: Fix typos in PWM lookup table code
  ARM: dts: Kirkwood: Fix QNAP TS219 power-off
  ARM: dts: rockchip: Add OTP gpio pinctrl to rk3288 tsadc node
  ARM: dts: rockchip: temporarily remove emmc hs200 speed from rk3288 minnie
  MAINTAINERS: Atmel drivers: change NAND and ISI entries
  ARM: at91/dt: sama5d2 Xplained: add several devices
  ...

Merge tag 'pm+acpi-4.4-rc3' of git://git./linux/kernel/git/rafael/linux-pm

Pull more power management and ACPI fixes from Rafael Wysocki:
"These fix one recent regression (cpufreq core), fix up two features
  added recently (ACPI CPPC support, SCPI support in the arm_big_little
  cpufreq driver) and fix three older bugs in the intel_pstate driver.

  Specifics:

   - Fix a recent regression in the cpufreq core causing it to fail to
     clean up sysfs directories properly on cpufreq driver removal
     (Viresh Kumar).

   - Fix a build problem in the SCPI support code recently added to the
     arm_big_little cpufreq driver (Punit Agrawal).

   - Fix up the recently added CPPC cpufreq frontend to process the CPU
     coordination information provided by the platform firmware
     correctly (Ashwin Chaugule).

   - Fix the intel_pstate driver to behave as intended when switched
     over to the "performance" mode via sysfs if hardware-driven P-state
     selection (HWP) is enabled (Alexandra Yates).

   - Fix two rounding errors in the intel_pstate driver that sometimes
     cause it to use lower P-states than requested (Prarit Bhargava)"

* tag 'pm+acpi-4.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  intel_pstate: Fix "performance" mode behavior with HWP enabled
  cpufreq: SCPI: Depend on SCPI clk driver
  cpufreq: intel_pstate: Fix limits->max_perf rounding error
  cpufreq: intel_pstate: Fix limits->max_policy_pct rounding error
  cpufreq: Always remove sysfs cpuX/cpufreq link on ->remove_dev()
  cpufreq: CPPC: Initialize and check CPUFreq CPU co-ord type correctly

Merge branch 'linux-4.4' of git://anongit.freedesktop.org/git/nouveau/linux-2.6 into drm-fixes

Ben Skeggs wrote:
A couple of regression fixes, some more boards whitelisted for a hw bug
workaround, gr/ucode fixes for hangs a user is seeing.

The changes look larger than they actually are due to the ucode binaries
(*.fucN.h) being regenerated.

* 'linux-4.4' of git://anongit.freedesktop.org/git/nouveau/linux-2.6:
  drm/nouveau/volt/pwm/gk104: fix an off-by-one resulting in the voltage not being set
  drm/nouveau/nvif: allow userspace access to its own client object
  drm/nouveau/gr/gf100-: fix oops when calling zbc methods
  drm/nouveau/gr/gf117-: assume no PPC if NV_PGRAPH_GPC_GPM_PD_PES_TPC_ID_MASK is zero
  drm/nouveau/gr/gf117-: read NV_PGRAPH_GPC_GPM_PD_PES_TPC_ID_MASK from correct GPC
  drm/nouveau/gr/gf100-: split out per-gpc address calculation macro
  drm/nouveau/bios: return actual size of the buffer retrieved via _ROM
  drm/nouveau/instmem: protect instobj list with a spinlock
  drm/nouveau/pci: enable c800 magic for some unknown Samsung laptop
  drm/nouveau/pci: enable c800 magic for Clevo P157SM

Merge tag 'sound-4.4-rc3' of git://git./linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
"Here are no big surprises but just all small fixes, mostly
  device-specific quirks for HD-audio and USB-audio:

   - Fix for detection of FireWire DICE Loud devices
   - Intel Broxton HDMI/DP PCI IDs and relevant quirks
   - Noise fixes: Dell XPS13 2015 model, Dell Latitude E6440, Gigabyte
     Z170X mobo
   - Fix the headphone mixer assignment on HP laptops for PulseAudio
   - USB-MIDI fixes for Medeli DD305 and CH345
   - Apply fixup for Acer Aspire One Cloudbook 14"

* tag 'sound-4.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: hda - Fix noise on Gigabyte Z170X mobo
  ALSA: hda - Fix headphone noise after Dell XPS 13 resume back from S3
  ALSA: hda - Apply HP headphone fixups more generically
  ALSA: hda - Add fixup for Acer Aspire One Cloudbook 14
  ALSA: hda - apply SKL display power request/release patch to BXT
  ALSA: hda - add PCI IDs for Intel Broxton
  ALSA: usb-audio: work around CH345 input SysEx corruption
  ALSA: usb-audio: prevent CH345 multiport output SysEx corruption
  ALSA: usb-audio: add packet size quirk for the Medeli DD305
  ALSA: dice: fix detection of Loud devices
  ALSA: hda - Fix noise on Dell Latitude E6440

Merge tag 'arm64-fixes' of git://git./linux/kernel/git/arm64/linux

Pull arm64 fixes from Catalin Marinas:

- Build fix when !CONFIG_UID16 (the patch is touching generic files but
   it only affects arm64 builds; submitted by Arnd Bergmann)

- EFI fixes to deal with early_memremap() returning NULL and correctly
   mapping run-time regions

- Fix CPUID register extraction of unsigned fields (not to be
   sign-extended)

- ASID allocator fix to deal with long-running tasks over multiple
   generation roll-overs

- Revert support for marking page ranges as contiguous PTEs (it leads
   to TLB conflicts and requires additional non-trivial kernel changes)

- Proper early_alloc() failure check

- Disable KASan for 48-bit VA and 16KB page configuration (the pgd is
   larger than the KASan shadow memory)

- Update the fault_info table (original descriptions based on early
   engineering spec)

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: efi: fix initcall return values
  arm64: efi: deal with NULL return value of early_memremap()
  arm64: debug: Treat the BRPs/WRPs as unsigned
  arm64: cpufeature: Track unsigned fields
  arm64: cpufeature: Add helpers for extracting unsigned values
  Revert "arm64: Mark kernel page ranges contiguous"
  arm64: mm: keep reserved ASIDs in sync with mm after multiple rollovers
  arm64: KASAN depends on !(ARM64_16K_PAGES && ARM64_VA_BITS_48)
  arm64: efi: correctly map runtime regions
  arm64: mm: fix fault_info table xFSC decoding
  arm64: fix building without CONFIG_UID16
  arm64: early_alloc: Fix check for allocation failure

Merge tag 'nios2-v4.4-rc3' of git://git./linux/kernel/git/lftan/nios2

Pull nios2 fix from Ley Foon Tan:
"nios2: fix cache coherency"

* tag 'nios2-v4.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/lftan/nios2:
nios2: fix cache coherency

MIPS: Fix delay loops which may be removed by GCC.

GCC 4.1 and newer remove empty loops. This becomes a problem when delay
loops get removed. Fixed by rewriting to user the proper Linux interface
for such delays.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Reported-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Acked-by: John Crispin <blogic@openwrt.org>

Merge tag 'arc-4.4-rc3-fixes' of git://git./linux/kernel/git/vgupta/arc

Pull ARC fixes from Vineet Gupta:
- Fix for perf callgraph unwinding causing RCU stalls
- Fix to enable Linux to run on non-default Interrupt priority 0
- Removal of pointless SYNC from __switch_to()

* tag 'arc-4.4-rc3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
  ARC: dw2 unwind: Remove falllback linear search thru FDE entries
  ARC: remove SYNC from __switch_to()
  ARCv2: Use the default irq priority for idle sleep
  ARC: Abstract out ISA specific SLEEP args
  ARC: comments update
  ARC: switch to arc-linux- CROSS_COMPILE prefix across all configs

Merge tag 'v4.4-rockchip-dts32-fixes1' of git://git./linux/kernel/git/mmind/linux-rockchip into fixes

Merge "ARM: rockchip: devicetree fixes for 4.4" from Heiko Stuebner:

Two fixes to Rockchip devicetree files, disabling the mmc-tuning
on the veyron-minnie board for now and adding the init state for
the over-temperature-protection to prevent glitches making the
system reboot sometimes.

* tag 'v4.4-rockchip-dts32-fixes1' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip:
ARM: dts: rockchip: Add OTP gpio pinctrl to rk3288 tsadc node
ARM: dts: rockchip: temporarily remove emmc hs200 speed from rk3288 minnie

Merge tag 'mvebu-fixes-4.4-1' of git://git.infradead.org/linux-mvebu into fixes

Merge "mvebu fixes for 4.4 (part 1)" from Jason Cooper:

- Fix QNAP TS219 power-off in dts
- Fix legacy get_irqnr_and_base for dove and orion5x

* tag 'mvebu-fixes-4.4-1' of git://git.infradead.org/linux-mvebu:
  ARM: orion5x: Fix legacy get_irqnr_and_base
  ARM: dove: Fix legacy get_irqnr_and_base
  ARM: dts: Kirkwood: Fix QNAP TS219 power-off

Merge tag 'renesas-fixes-for-v4.4' of git://git./linux/kernel/git/horms/renesas into fixes

Merge "Renesas ARM Based SoC Fixes for v4.4" from Simon Horman:

* r8a7793 SoC: Annotate r8a7793_boards_compat_dt with __initconst
  Aside from being correct this builds that otherwise
  fail with section mismatch errors.

* tag 'renesas-fixes-for-v4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
  ARM: shmobile: r8a7793: proper constness with __initconst

Merge branches 'pm-cpufreq' and 'acpi-cppc'

* pm-cpufreq:
  intel_pstate: Fix "performance" mode behavior with HWP enabled
  cpufreq: SCPI: Depend on SCPI clk driver
  cpufreq: intel_pstate: Fix limits->max_perf rounding error
  cpufreq: intel_pstate: Fix limits->max_policy_pct rounding error
  cpufreq: Always remove sysfs cpuX/cpufreq link on ->remove_dev()

* acpi-cppc:
  cpufreq: CPPC: Initialize and check CPUFreq CPU co-ord type correctly

Merge tag 'for-linus-4.4-rc2-tag' of git://git./linux/kernel/git/xen/tip

Pull xen bug fixes from David Vrabel:

- Fix gntdev and numa balancing.

- Fix x86 boot crash due to unallocated legacy irq descs.

- Fix overflow in evtchn device when > 1024 event channels.

* tag 'for-linus-4.4-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen/evtchn: dynamically grow pending event channel ring
  xen/events: Always allocate legacy interrupts on PV guests
  xen/gntdev: Grant maps should not be subject to NUMA balancing

Merge tag 'powerpc-4.4-3' of git://git./linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:

- tm: Block signal return from setting invalid MSR state from Michael
   Neuling

- tm: Check for already reclaimed tasks from Michael Neuling

* tag 'powerpc-4.4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/tm: Check for already reclaimed tasks
  powerpc/tm: Block signal return setting invalid MSR state

xen/evtchn: dynamically grow pending event channel ring

If more than 1024 event channels are bound to a evtchn device then it
possible (even with well behaved applications) for the ring to
overflow and events to be lost (reported as an -EFBIG error).

Dynamically increase the size of the ring so there is always enough
space for all bound events.  Well behaved applicables that only unmask
events after draining them from the ring can thus no longer lose
events.

However, an application could unmask an event before draining it,
allowing multiple entries per port to accumulate in the ring, and a
overflow could still occur.  So the overflow detection and reporting
is retained.

The ring size is initially only 64 entries so the common use case of
an application only binding a few events will use less memory than
before.  The ring size may grow to 512 KiB (enough for all 2^17
possible channels).  This order 7 kmalloc() may fail due to memory
fragmentation, so we fall back to trying vmalloc().

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

arm64: efi: fix initcall return values

Even though initcall return values are typically ignored, the
prototype is to return 0 on success or a negative errno value on
error. So fix the arm_enable_runtime_services() implementation to
return 0 on conditions that are not in fact errors, and return a
meaningful error code otherwise.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

arm64: efi: deal with NULL return value of early_memremap()

Add NULL return value checks to two invocations of early_memremap()
in the UEFI init code. For the UEFI configuration tables, we just
warn since we have a better chance of being able to report the issue
in a way that can actually be noticed by a human operator if we don't
abort right away. For the UEFI memory map, however, all we can do is
panic() since we cannot proceed without a description of memory.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

arm64: debug: Treat the BRPs/WRPs as unsigned

IDAA64DFR0_EL1: BRPs and WRPs are unsigned values. Use
the appropriate helpers to extract those fields.

Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reported-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

arm64: cpufeature: Track unsigned fields

Some of the feature bits have unsigned values and need
to be treated accordingly to avoid errors. Adds the property
to the feature bits and use the appropriate field extract helpers.

Reported-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

xen/events: Always allocate legacy interrupts on PV guests

After commit 8c058b0b9c34 ("x86/irq: Probe for PIC presence before
allocating descs for legacy IRQs") early_irq_init() will no longer
preallocate descriptors for legacy interrupts if PIC does not
exist, which is the case for Xen PV guests.

Therefore we may need to allocate those descriptors ourselves.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>

arm64: cpufeature: Add helpers for extracting unsigned values

The cpuid_feature_extract_field() extracts the feature value
as a signed integer. This could be problematic for features
whose values are unsigned. e.g, ID_AA64DFR0_EL1:BRPs. Add
an unsigned variant for the unsigned fields.

Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reported-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

xen/gntdev: Grant maps should not be subject to NUMA balancing

Doing so will cause the grant to be unmapped and then, during
fault handling, the fault to be mistakenly treated as NUMA hint
fault.

In addition, even if those maps could partcipate in NUMA
balancing, it wouldn't provide any benefit since we are unable
to determine physical page's node (even if/when VNUMA is
implemented).

Marking grant maps' VMAs as VM_IO will exclude them from being
part of NUMA balancing.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: stable@vger.kernel.org
Signed-off-by: David Vrabel <david.vrabel@citrix.com>

rtc: ds1307: fix alarm reading at probe time

With the actual code, read_alarm() always returns -EINVAL when called
during the RTC device registration. This prevents from retrieving an
already configured alarm in hardware.

This patch fixes the issue by moving the HAS_ALARM bit configuration
(if supported by the hardware) above the rtc_device_register() call.

Signed-off-by: Simon Guinot <simon.guinot@sequanux.org>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>

Revert "arm64: Mark kernel page ranges contiguous"

This reverts commit 348a65cdcbbf243073ee39d1f7d4413081ad7eab.

Incorrect page table manipulation that does not respect the ARM ARM
recommended break-before-make sequence may lead to TLB conflicts. The
contiguous PTE patch makes the system even more susceptible to such
errors by changing the mapping from a single page to a contiguous range
of pages. An additional TLB invalidation would reduce the risk window,
however, the correct fix is to switch to a temporary swapper_pg_dir.
Once the correct workaround is done, the reverted commit will be
re-applied.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Jeremy Linton <jeremy.linton@arm.com>

arm64: mm: keep reserved ASIDs in sync with mm after multiple rollovers

Under some unusual context-switching patterns, it is possible to end up
with multiple threads from the same mm running concurrently with
different ASIDs:

1. CPU x schedules task t with mm p containing ASID a and generation g
   This task doesn't block and the CPU doesn't context switch.
   So:
     * per_cpu(active_asid, x) = {g,a}
     * p->context.id = {g,a}

2. Some other CPU generates an ASID rollover. The global generation is
   now (g + 1). CPU x is still running t, with no context switch and
   so per_cpu(reserved_asid, x) = {g,a}

3. CPU y schedules task t', which shares mm p with t. The generation
   mismatches, so we take the slowpath and hit the reserved ASID from
   CPU x. p is then updated so that p->context.id = {g + 1,a}

4. CPU y schedules some other task u, which has an mm != p.

5. Some other CPU generates *another* CPU rollover. The global
   generation is now (g + 2). CPU x is still running t, with no context
   switch and so per_cpu(reserved_asid, x) = {g,a}.

6. CPU y once again schedules task t', but now *fails* to hit the
   reserved ASID from CPU x because of the generation mismatch. This
   results in a new ASID being allocated, despite the fact that t is
   still running on CPU x with the same mm.

Consequently, TLBIs (e.g. as a result of CoW) will not be synchronised
between the two threads.

This patch fixes the problem by updating all of the matching reserved
ASIDs when we hit on the slowpath (i.e. in step 3 above). This keeps
the reserved ASIDs in-sync with the mm and avoids the problem.

Reported-by: Tony Thompson <anthony.thompson@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

arm64: KASAN depends on !(ARM64_16K_PAGES && ARM64_VA_BITS_48)

On KASAN + 16K_PAGES + 48BIT_VA
arch/arm64/mm/kasan_init.c: In function ‘kasan_early_init’:
include/linux/compiler.h:484:38: error: call to ‘__compiletime_assert_95’ declared with attribute error: BUILD_BUG_ON failed: !IS_ALIGNED(KASAN_SHADOW_END, PGDIR_SIZE)
_compiletime_assert(condition, msg, __compiletime_assert_, __LINE__)

Currently KASAN will not work on 16K_PAGES and 48BIT_VA, so
forbid such configuration to avoid above build failure.

Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Reported-by: Suzuki K. Poulose <Suzuki.Poulose@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

nios2: fix cache coherency

There is intermittent cache coherency issue caught in toolchian tests.
Revert to use flushd.

Signed-off-by: Ley Foon Tan <lftan@altera.com>

Merge branch 'upstream' of git://git.infradead.org/users/pcmoore/selinux into for-linus2

Merge branch 'drm-fixes-4.4' of git://people.freedesktop.org/~agd5f/linux into drm-fixes

Radeon and amdgpu fixes for 4.4:
- DPM fixes for r7xx devices
- VCE fixes for Stoney
- GPUVM fixes
- Scheduler fixes

* 'drm-fixes-4.4' of git://people.freedesktop.org/~agd5f/linux:
  drm/radeon: make some dpm errors debug only
  drm/radeon: make rv770_set_sw_state failures non-fatal
  drm/amdgpu: move dependency handling out of atomic section v2
  drm/amdgpu: optimize scheduler fence handling
  drm/amdgpu: remove vm->mutex
  drm/amdgpu: add mutex for ba_va->valids/invalids
  drm/amdgpu: adapt vce session create interface changes
  drm/amdgpu: vce use multiple cache surface starting from stoney
  drm/amdgpu: reset vce trap interrupt flag

Merge branch 'for-linus' of git://git./linux/kernel/git/viro/vfs

Pull vfs fixes from Al Viro:
"A couple of fixes for sendfile lockups caught by Dmitry + a fix for
  ancient sysvfs symlink breakage"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  vfs: Avoid softlockups with sendfile(2)
  vfs: Make sendfile(2) killable even better
  fix sysvfs symlinks

Merge tag 'omap-for-v4.4/fixes-rc2' of git://git./linux/kernel/git/tmlind/linux-omap into fixes

Merge "Fixes for omaps for v4.4-rc cycle" from Tony Lindgren:

- A series of audio changes for dra7 that missed the merge window but turned
  out to be necessary to fix a boot time imprecise external abort error and to
  getaudio working

- Fix l4 related boot time errors for dm81xx

- Use lockless cldm/pwrdm api in omap4_boot_secondary

- Remove t410 custom abort handler that is no longer needed and may
  hide other critical errors

- Mark cpuidle tracepoints as _rcuidle

- Fix module alias for omap-ocp2scp

* tag 'omap-for-v4.4/fixes-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
  ARM: OMAP4+: SMP: use lockless clkdm/pwrdm api in omap4_boot_secondary
  arm: omap2+: add missing HWMOD_NO_IDLEST in 81xx hwmod data
  ARM: OMAP2+: remove custom abort handler for t410
  ARM: OMAP: DRA7: hwmod: Add data for McASP3
  ARM: OMAP2+: hwmod: Add hwmod flag for HWMOD_OPT_CLKS_NEEDED
  ARM: dts: dra7: Fix McASP3 node regarding to clocks
  bus: omap-ocp2scp: Fix module alias
  ARM: OMAP2+: PM: Denote the cpuidle tracepoints as _rcuidle()

Merge tag 'keystone-fixes-for-4.4' of git://git./linux/kernel/git/ssantosh/linux-keystone into fixes

Merge "Few Keystone fixes for 4.4-rcx" from Santosh Shilimkar:
- Fix the optional PDSP firmware loading
- Fix linking RAM setup for QMs
- Fix crash with clk_ignore_unused

* tag 'keystone-fixes-for-4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/ssantosh/linux-keystone:
  ARM: dts: keystone: k2l: fix kernel crash when clk_ignore_unused is not in bootargs
  soc: ti: knav_qmss_queue: Fix linking RAM setup for queue managers
  soc: ti: use request_firmware_direct() as acc firmware is optional

Merge tag 'v4.4-rc2' into fixes

Linux 4.4-rc2 is backmerged from the keystone fixes.

Merge tag 'imx-fixes-4.4' of git://git./linux/kernel/git/shawnguo/linux into fixes

Merge "The i.MX fixes for 4.4" from Shawn Guo:

- Add missing .irq_set_type for i.MX GPC irq_chip.  It fixes an issue
  that device IRQ type setting doesn't match the one specified in device
  tree, since stacked IRQ domain is adopted in GPC driver.
- Fix the wrong spi-num-chipselects settings for Vybrid DSPI devices.
- Fix a merge error in Vybrid dts regarding to ADC device property
  fsl,adck-max-frequency

* tag 'imx-fixes-4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
  ARM: dts: vfxxx: Fix dspi[01] spi-num-chipselects.
  ARM: imx: add platform irq type setting in gpc
  ARM: dts: vfxxx: Fix erroneous property in esdhc0 node

intel_pstate: Fix "performance" mode behavior with HWP enabled

If hardware-driven P-state selection (HWP) is enabled, the
"performance" mode of intel_pstate should only allow the processor
to use the highest-performance P-state available. That is not
the case currently, so make it actually happen.

Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Alexandra Yates <alexandra.yates@linux.intel.com>
[ rjw: Subject and changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

nfs4: resend LAYOUTGET when there is a race that changes the seqid

pnfs_layout_process will check the returned layout stateid against what
the kernel has in-core. If it turns out that the stateid we received is
older, then we should resend the LAYOUTGET instead of falling back to
MDS I/O.

Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Cc: stable@vger.kernel.org # 3.18+
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

nfs: if we have no valid attrs, then don't declare the attribute cache valid

If we pass in an empty nfs_fattr struct to nfs_update_inode, it will
(correctly) not update any of the attributes, but it then clears the
NFS_INO_INVALID_ATTR flag, which indicates that the attributes are
up to date. Don't clear the flag if the fattr struct has no valid
attrs to apply.

Reviewed-by: Steve French <steve.french@primarydata.com>
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

nfs: ensure that attrcache is revalidated after a SETATTR

If we get no post-op attributes back from a SETATTR operation, then no
attributes will of course be updated during the call to
nfs_update_inode.

We know however that the attributes are invalid at that point, since we
just changed some of them. At the very least, the ctime will be bogus.
If we get no post-op attributes back on the call, mark the attrcache
invalid to reflect that fact.

Reviewed-by: Steve French <steve.french@primarydata.com>
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

ARM/PCI: Move align_resource function pointer to pci_host_bridge structure

Commit b3a72384fe29 ("ARM/PCI: Replace pci_sys_data->align_resource with
global function pointer") introduced an ARM-specific align_resource()
function pointer. This is not portable to other arches and doesn't work
for platforms with two different PCIe host bridge controllers.

Move the function pointer to the pci_host_bridge structure so each host
bridge driver can specify its own align_resource() function.

Signed-off-by: Gabriele Paoloni <gabriele.paoloni@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>

Merge branch 'for-linus' of git://git.kernel.dk/linux-block

Pull more block layer fixes from Jens Axboe:
"I wasn't going to send off a new pull before next week, but the blk
  flush fix from Jan from the other day introduced a regression.  It's
  rare enough not to have hit during testing, since it requires both a
  device that rejects the first flush, and bad timing while it does
  that.  But since someone did hit it, let's get the revert into 4.4-rc3
  so we don't have a released rc with that known issue.

  Apart from that revert, three other fixes:

   - From Christoph, a fix for a missing unmap in NVMe request
     preparation.

   - An NVMe fix from Nishanth that fixes data corruption on powerpc.

   - Also from Christoph, fix a list_del() attempt on blk-mq that didn't
     have a matching list_add() at timer start"

* 'for-linus' of git://git.kernel.dk/linux-block:
  Revert "blk-flush: Queue through IO scheduler when flush not required"
  block: fix blk_abort_request for blk-mq drivers
  nvme: add missing unmaps in nvme_queue_rq
  NVMe: default to 4k device page size

ARM: OMAP4+: SMP: use lockless clkdm/pwrdm api in omap4_boot_secondary

OMAP CPU hotplug uses cpu1's clocks and power domains for CPU1 wake up
from low power states (or turn on CPU1). This part of code is also
part of system suspend (disable_nonboot_cpus()).
>From other side, cpu1's clocks and power domains are used by CPUIdle. All above
functionality is mutually exclusive and, therefore, lockless clkdm/pwrdm api
can be used in omap4_boot_secondary().

This fixes below back-trace on -RT which is triggered by
pwrdm_lock/unlock():

BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:917
in_atomic(): 1, irqs_disabled(): 0, pid: 118, name: sh
9 locks held by sh/118:
  #0:  (sb_writers#4){.+.+.+}, at: [<c0144a6c>] vfs_write+0x13c/0x164
  #1:  (&of->mutex){+.+.+.}, at: [<c01b4c70>] kernfs_fop_write+0x48/0x19c
  #2:  (s_active#24){.+.+.+}, at: [<c01b4c78>] kernfs_fop_write+0x50/0x19c
  #3:  (device_hotplug_lock){+.+.+.}, at: [<c03cbff0>] lock_device_hotplug_sysfs+0xc/0x4c
  #4:  (&dev->mutex){......}, at: [<c03cd284>] device_online+0x14/0x88
  #5:  (cpu_add_remove_lock){+.+.+.}, at: [<c003af90>] cpu_up+0x50/0x1a0
  #6:  (cpu_hotplug.lock){++++++}, at: [<c003ae48>] cpu_hotplug_begin+0x0/0xc4
  #7:  (cpu_hotplug.lock#2){+.+.+.}, at: [<c003aec0>] cpu_hotplug_begin+0x78/0xc4
  #8:  (boot_lock){+.+...}, at: [<c002b254>] omap4_boot_secondary+0x1c/0x178
Preemption disabled at:[<  (null)>]   (null)

CPU: 0 PID: 118 Comm: sh Not tainted 4.1.12-rt11-01998-gb4a62c3-dirty #137
Hardware name: Generic DRA74X (Flattened Device Tree)
[<c0017574>] (unwind_backtrace) from [<c0013be8>] (show_stack+0x10/0x14)
[<c0013be8>] (show_stack) from [<c05a8670>] (dump_stack+0x80/0x94)
[<c05a8670>] (dump_stack) from [<c05ad158>] (rt_spin_lock+0x24/0x54)
[<c05ad158>] (rt_spin_lock) from [<c0030dac>] (clkdm_wakeup+0x10/0x2c)
[<c0030dac>] (clkdm_wakeup) from [<c002b2c0>] (omap4_boot_secondary+0x88/0x178)
[<c002b2c0>] (omap4_boot_secondary) from [<c0015d00>] (__cpu_up+0xc4/0x164)
[<c0015d00>] (__cpu_up) from [<c003b09c>] (cpu_up+0x15c/0x1a0)
[<c003b09c>] (cpu_up) from [<c03cd2d4>] (device_online+0x64/0x88)
[<c03cd2d4>] (device_online) from [<c03cd360>] (online_store+0x68/0x74)
[<c03cd360>] (online_store) from [<c01b4ce0>] (kernfs_fop_write+0xb8/0x19c)
[<c01b4ce0>] (kernfs_fop_write) from [<c0144124>] (__vfs_write+0x20/0xd8)
[<c0144124>] (__vfs_write) from [<c01449c0>] (vfs_write+0x90/0x164)
[<c01449c0>] (vfs_write) from [<c01451e4>] (SyS_write+0x44/0x9c)
[<c01451e4>] (SyS_write) from [<c0010240>] (ret_fast_syscall+0x0/0x54)
CPU1: smp_ops.cpu_die() returned, trying to resuscitate

Cc: Tero Kristo <t-kristo@ti.com>
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>

Merge branch '81xx' into omap-for-v4.4/fixes

arm: omap2+: add missing HWMOD_NO_IDLEST in 81xx hwmod data

Add missing HWMOD_NO_IDLEST hwmod flag for entries not
having omap4 clkctrl values.
The emac0 hwmod flag fixes the davinci_emac driver probe
since the return of pm_resume() call is now checked.

This solves the following boot errors :
[    0.121429] omap_hwmod: l4_ls: _wait_target_ready failed: -16
[    0.121441] omap_hwmod: l4_ls: cannot be enabled for reset (3)
[    0.124342] omap_hwmod: l4_hs: _wait_target_ready failed: -16
[    0.124352] omap_hwmod: l4_hs: cannot be enabled for reset (3)
[    1.967228] omap_hwmod: emac0: _wait_target_ready failed: -16

Cc: Brian Hutchinson <b.hutchman@gmail.com>
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>

Revert "blk-flush: Queue through IO scheduler when flush not required"

This reverts commit 1b2ff19e6a957b1ef0f365ad331b608af80e932e.

Jan writes:

--

Thanks for report! After some investigation I found out we allocate
elevator specific data in __get_request() only for non-flush requests. And
this is actually required since the flush machinery uses the space in
struct request for something else. Doh. So my patch is just wrong and not
easy to fix since at the time __get_request() is called we are not sure
whether the flush machinery will be used in the end. Jens, please revert
1b2ff19e6a957b1ef0f365ad331b608af80e932e. Thanks!

I'm somewhat surprised that you can reliably hit the race where flushing
gets disabled for the device just while the request is in flight. But I
guess during boot it makes some sense.

--

So let's just revert it, we can fix the queue run manually after the
fact. This race is rare enough that it didn't trigger in testing, it
requires the specific disable-while-in-flight scenario to trigger.

Merge tag 'for-linus' of git://git./virt/kvm/kvm

Pull KVM fixes from Paolo Bonzini:
"Bug fixes for all architectures.  Nothing really stands out"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (21 commits)
  KVM: nVMX: remove incorrect vpid check in nested invvpid emulation
  arm64: kvm: report original PAR_EL1 upon panic
  arm64: kvm: avoid %p in __kvm_hyp_panic
  KVM: arm/arm64: vgic: Trust the LR state for HW IRQs
  KVM: arm/arm64: arch_timer: Preserve physical dist. active state on LR.active
  KVM: arm/arm64: Fix preemptible timer active state crazyness
  arm64: KVM: Add workaround for Cortex-A57 erratum 834220
  arm64: KVM: Fix AArch32 to AArch64 register mapping
  ARM/arm64: KVM: test properly for a PTE's uncachedness
  KVM: s390: fix wrong lookup of VCPUs by array index
  KVM: s390: avoid memory overwrites on emergency signal injection
  KVM: Provide function for VCPU lookup by id
  KVM: s390: fix pfmf intercept handler
  KVM: s390: enable SIMD only when no VCPUs were created
  KVM: x86: request interrupt window when IRQ chip is split
  KVM: x86: set KVM_REQ_EVENT on local interrupt request from user space
  KVM: x86: split kvm_vcpu_ready_for_interrupt_injection out of dm_request_for_irq_injection
  KVM: x86: fix interrupt window handling in split IRQ chip case
  MIPS: KVM: Uninit VCPU in vcpu_create error path
  MIPS: KVM: Fix CACHE immediate offset sign extension
  ...

drm/radeon: make some dpm errors debug only

"Could not force DPM to low", etc. is usually harmless and
just confuses users.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

arm64: efi: correctly map runtime regions

The kernel may use a page granularity of 4K, 16K, or 64K depending on
configuration.

When mapping EFI runtime regions, we use memrange_efi_to_native to round
the physical base address of a region down to a kernel page boundary,
and round the size up to a kernel page boundary, adding the residue left
over from rounding down the physical base address. We do not round down
the virtual base address.

In __create_mapping we account for the offset of the virtual base from a
granule boundary, adding the residue to the size before rounding the
base down to said granule boundary.

Thus we account for the residue twice, and when the residue is non-zero
will cause __create_mapping to map an additional page at the end of the
region. Depending on the memory map, this page may be in a region we are
not intended/permitted to map, or may clash with a different region that
we wish to map. In typical cases, mapping the next item in the memory
map will overwrite the erroneously created entry, as we sort the memory
map in the stub.

As __create_mapping can cope with base addresses which are not page
aligned, we can instead rely on it to map the region appropriately, and
simplify efi_virtmap_init by removing the unnecessary code.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Leif Lindholm <leif.lindholm@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

arm64: mm: fix fault_info table xFSC decoding

We are missing descriptions for some valid xFSC values in the fault info
table (e.g. "TLB conflict abort"), and have erroneous descriptions for
reserved values (e.g. "asynchronous external abort", "debug event").

This patch adds the missing xFSC values, and removes erroneous decoding
of values reserved by the architecture, as described in ARM DDI 0487A.h.

At the same time, fixed the unbalanced brackets for the synchronous
parity error strings in the table.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

arm64: fix building without CONFIG_UID16

As reported by Michal Simek, building an ARM64 kernel with CONFIG_UID16
disabled currently fails because the system call table still needs to
reference the individual function entry points that are provided by
kernel/sys_ni.c in this case, and the declarations are hidden inside
of #ifdef CONFIG_UID16:

arch/arm64/include/asm/unistd32.h:57:8: error: 'sys_lchown16' undeclared here (not in a function)
__SYSCALL(__NR_lchown, sys_lchown16)

I believe this problem only exists on ARM64, because older architectures
tend to not need declarations when their system call table is built
in assembly code, while newer architectures tend to not need UID16
support. ARM64 only uses these system calls for compatibility with
32-bit ARM binaries.

This changes the CONFIG_UID16 check into CONFIG_HAVE_UID16, which is
set unconditionally on ARM64 with CONFIG_COMPAT, so we see the
declarations whenever we need them, but otherwise the behavior is
unchanged.

Fixes: af1839eb4bd4 ("Kconfig: clean up the long arch list for the UID16 config option")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

ARM: orion5x: Fix legacy get_irqnr_and_base

Commit 5be9fc23cd ("ARM: orion5x: fix legacy orion5x IRQ numbers") shifted
IRQ numbers by one but didn't update the get_irqnr_and_base macro
accordingly. This macro is involved when CONFIG_MULTI_IRQ_HANDLER
is not defined.

[jac: 5d6bed2a9c went in to v4.2, but was backported to v3.18]

Signed-off-by: Nicolas Pitre <nico@linaro.org>
Fixes: 5be9fc23cd ("ARM: orion5x: fix legacy orion5x IRQ numbers")
Cc: <stable@vger.kernel.org> # v3.18+
Signed-off-by: Jason Cooper <jason@lakedaemon.net>

ARM: dove: Fix legacy get_irqnr_and_base

Commit 5d6bed2a9c ("ARM: dove: fix legacy dove IRQ numbers") shifted
IRQ numbers by one but didn't update the get_irqnr_and_base macro
accordingly. This macro is involved when CONFIG_MULTI_IRQ_HANDLER
is not defined.

[jac: 5d6bed2a9c went in to v4.2, but was backported to v3.18]

Signed-off-by: Nicolas Pitre <nico@linaro.org>
Fixes: 5d6bed2a9c ("ARM: dove: fix legacy dove IRQ numbers")
Cc: <stable@vger.kernel.org> # v3.18+
Signed-off-by: Jason Cooper <jason@lakedaemon.net>

KVM: nVMX: remove incorrect vpid check in nested invvpid emulation

This patch removes the vpid check when emulating nested invvpid
instruction of type all-contexts invalidation. The existing code is
incorrect because:
(1) According to Intel SDM Vol 3, Section "INVVPID - Invalidate
     Translations Based on VPID", invvpid instruction does not check
     vpid in the invvpid descriptor when its type is all-contexts
     invalidation.
(2) According to the same document, invvpid of type all-contexts
     invalidation does not require there is an active VMCS, so/and
     get_vmcs12() in the existing code may result in a NULL-pointer
     dereference. In practice, it can crash both KVM itself and L1
     hypervisors that use invvpid (e.g. Xen).

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

btrfs: fix balance range usage filters in 4.4-rc

There's a regression in 4.4-rc since commit bc3094673f22
(btrfs: extend balance filter usage to take minimum and maximum) in that
existing (non-ranged) balance with -dusage=x no longer works; all chunks
are skipped.

After staring at the code for a while and wondering why a non-ranged
balance would even need min and max thresholds (..which then were not
set correctly, leading to the bug) I realized that the only problem
was the fact that the filter functions were named wrong, thanks to
patching copypasta. Simply renaming both functions lets the existing
btrfs-progs call balance with -dusage=x and now the non-ranged filter
function is invoked, properly using only a single chunk limit.

Signed-off-by: Holger Hoffstätte <holger.hoffstaette@googlemail.com>
Fixes: bc3094673f22 ("btrfs: extend balance filter usage to take minimum and maximum")
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>

btrfs: qgroup: account shared subtree during snapshot delete

Commit 0ed4792 ('btrfs: qgroup: Switch to new extent-oriented qgroup
mechanism.') removed our qgroup accounting during
btrfs_drop_snapshot(). Predictably, this results in qgroup numbers
going bad shortly after a snapshot is removed.

Fix this by adding a dirty extent record when we encounter extents during
our shared subtree walk. This effectively restores the functionality we had
with the original shared subtree walking code in 1152651 (btrfs: qgroup:
account shared subtrees during snapshot delete).

The idea with the original patch (and this one) is that shared subtrees can
get skipped during drop_snapshot. The shared subtree walk then allows us a
chance to visit those extents and add them to the qgroup work for later
processing. This ultimately makes the accounting for drop snapshot work.

The new qgroup code nicely handles all the other extents during the tree
walk via the ref dec/inc functions so we don't have to add actions beyond
what we had originally.

Signed-off-by: Mark Fasheh <mfasheh@suse.de>
Signed-off-by: Chris Mason <clm@fb.com>

Btrfs: use btrfs_get_fs_root in resolve_indirect_ref

The backref code will look up the fs_root we're trying to resolve our indirect
refs for, unfortunately we use btrfs_read_fs_root_no_name, which returns -ENOENT
if the ref is 0.  This isn't helpful for the qgroup stuff with snapshot delete
as it won't be able to search down the snapshot we are deleting, which will
cause us to miss roots.  So use btrfs_get_fs_root and send false for check_ref
so we can always get the root we're looking for.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.de>
Signed-off-by: Chris Mason <clm@fb.com>

btrfs: qgroup: fix quota disable during rescan

There's a race condition that leads to a NULL pointer dereference if you
disable quotas while a quota rescan is running. To fix this, we just need
to wait for the quota rescan worker to actually exit before tearing down
the quota structures.

Signed-off-by: Justin Maggard <jmaggard@netgear.com>
Signed-off-by: Chris Mason <clm@fb.com>

Btrfs: fix race between cleaner kthread and space cache writeout

When a block group becomes unused and the cleaner kthread is currently
running, we can end up getting the current transaction aborted with error
-ENOENT when we try to commit the transaction, leading to the following
trace:

  [59779.258768] WARNING: CPU: 3 PID: 5990 at fs/btrfs/extent-tree.c:3740 btrfs_write_dirty_block_groups+0x17c/0x214 [btrfs]()
  [59779.272594] BTRFS: Transaction aborted (error -2)
  (...)
  [59779.291137] Call Trace:
  [59779.291621]  [<ffffffff812566f4>] dump_stack+0x4e/0x79
  [59779.292543]  [<ffffffff8104d0a6>] warn_slowpath_common+0x9f/0xb8
  [59779.293435]  [<ffffffffa04cb81f>] ? btrfs_write_dirty_block_groups+0x17c/0x214 [btrfs]
  [59779.295000]  [<ffffffff8104d107>] warn_slowpath_fmt+0x48/0x50
  [59779.296138]  [<ffffffffa04c2721>] ? write_one_cache_group.isra.32+0x77/0x82 [btrfs]
  [59779.297663]  [<ffffffffa04cb81f>] btrfs_write_dirty_block_groups+0x17c/0x214 [btrfs]
  [59779.299141]  [<ffffffffa0549b0d>] commit_cowonly_roots+0x1de/0x261 [btrfs]
  [59779.300359]  [<ffffffffa04dd5b6>] btrfs_commit_transaction+0x4c4/0x99c [btrfs]
  [59779.301805]  [<ffffffffa04b5df4>] btrfs_sync_fs+0x145/0x1ad [btrfs]
  [59779.302893]  [<ffffffff81196634>] sync_filesystem+0x7f/0x93
  (...)
  [59779.318186] ---[ end trace 577e2daff90da33a ]---

The following diagram illustrates a sequence of steps leading to this
problem:

       CPU 1                                             CPU 2

                           <at transaction N>

                                                        adds bg A to list
                                                        fs_info->unused_bgs

                                                        adds bg B to list
                                                        fs_info->unused_bgs

                           <transaction kthread
                            commits transaction N
                            and wakes up the
                            cleaner kthread>

  cleaner kthread
    delete_unused_bgs()

      sees bg A in list
      fs_info->unused_bgs

      btrfs_start_transaction()

                           <transaction N + 1 starts>

      deletes bg A

                                                        update_block_group(bg C)

                                                          --> adds bg C to list
                                                              fs_info->unused_bgs

      deletes bg B

      sees bg C in the list
      fs_info->unused_bgs

      btrfs_remove_chunk(bg C)
        btrfs_remove_block_group(bg C)

          --> checks if the block group
              is in a dirty list, and
              because it isn't now, it
              does nothing

          --> the block group item
              is deleted from the
              extent tree

                                                          --> adds bg C to list
                                                              transaction->dirty_bgs

                                                         some task calls
                                                         btrfs_commit_transaction(t N + 1)
                                                           commit_cowonly_roots()
                                                             btrfs_write_dirty_block_groups()
                                                               --> sees bg C in cur_trans->dirty_bgs
                                                               --> calls write_one_cache_group()
                                                                   which returns -ENOENT because
                                                                   it did not find the block group
                                                                   item in the extent tree
                                                               --> transaction aborte with -ENOENT
                                                                   because write_one_cache_group()
                                                                   returned that error

So fix this by adding a block group to the list of dirty block groups
before adding it to the list of unused block groups.

This happened on a stress test using fsstress plus concurrent calls to
fallocate 20G and truncate (releasing part of the space allocated with
fallocate).

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>

Btrfs: fix scrub preventing unused block groups from being deleted

Currently scrub can race with the cleaner kthread when the later attempts
to delete an unused block group, and the result is preventing the cleaner
kthread from ever deleting later the block group - unless the block group
becomes used and unused again. The following diagram illustrates that
race:

              CPU 1                                 CPU 2

cleaner kthread
   btrfs_delete_unused_bgs()

     gets block group X from
     fs_info->unused_bgs and
     removes it from that list

                                             scrub_enumerate_chunks()

                                               searches device tree using
                                               its commit root

                                               finds device extent for
                                               block group X

                                               gets block group X from the tree
                                               fs_info->block_group_cache_tree
                                               (via btrfs_lookup_block_group())

                                               sets bg X to RO

     sees the block group is
     already RO and therefore
     doesn't delete it nor adds
     it back to unused list

So fix this by making scrub add the block group again to the list of
unused block groups if the block group is still unused when it finished
scrubbing it and it hasn't been removed already.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>

Btrfs: fix race between scrub and block group deletion

Scrub can race with the cleaner kthread deleting block groups that are
unused (and with relocation too) leading to a failure with error -EINVAL
that gets returned to user space.

The following diagram illustrates how it happens:

              CPU 1                                 CPU 2

cleaner kthread
   btrfs_delete_unused_bgs()

     gets block group X from
     fs_info->unused_bgs

     sets block group to RO

       btrfs_remove_chunk(bg X)

         deletes device extents

                                         scrub_enumerate_chunks()

                                           searches device tree using
                                           its commit root

                                           finds device extent for
                                           block group X

                                           gets block group X from the tree
                                           fs_info->block_group_cache_tree
                                           (via btrfs_lookup_block_group())

                                           sets bg X to RO (again)

          btrfs_remove_block_group(bg X)

            deletes block group from
            fs_info->block_group_cache_tree

            removes extent map from
            fs_info->mapping_tree

                                               scrub_chunk(offset X)

                                                 searches fs_info->mapping_tree
                                                 for extent map starting at
                                                 offset X

                                                    --> doesn't find any such
                                                        extent map
                                                    --> returns -EINVAL and scrub
                                                        errors out to userspace
                                                        with -EINVAL

Fix this by dealing with an extent map lookup failure as an indicator of
block group deletion.
Issue reproduced with fstest btrfs/071.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>

btrfs: fix rcu warning during device replace

The test btrfs/011 triggers a rcu warning
Reviewed-by: Anand Jain <anand.jain@oracle.com>
===============================
[ INFO: suspicious RCU usage. ]
4.4.0-rc1-default+ #286 Tainted: G        W
-------------------------------
fs/btrfs/volumes.c:1977 suspicious rcu_dereference_check() usage!

other info that might help us debug this:

rcu_scheduler_active = 1, debug_locks = 0
4 locks held by btrfs/28786:

0:  (&fs_info->dev_replace.lock_finishing_cancel_unmount){+.+...}, at: [<ffffffffa00bc785>] btrfs_dev_replace_finishing+0x45/0xa00 [btrfs]
1:  (uuid_mutex){+.+.+.}, at: [<ffffffffa00bc84f>] btrfs_dev_replace_finishing+0x10f/0xa00 [btrfs]
2:  (&fs_devs->device_list_mutex){+.+.+.}, at: [<ffffffffa00bc868>] btrfs_dev_replace_finishing+0x128/0xa00 [btrfs]
3:  (&fs_info->chunk_mutex){+.+...}, at: [<ffffffffa00bc87d>] btrfs_dev_replace_finishing+0x13d/0xa00 [btrfs]

stack backtrace:
CPU: 0 PID: 28786 Comm: btrfs Tainted: G        W       4.4.0-rc1-default+ #286
Hardware name: Intel Corporation SandyBridge Platform/To be filled by O.E.M., BIOS ASNBCPT1.86C.0031.B00.1006301607 06/30/2010
0000000000000001 ffff8800a07dfb48 ffffffff8141d47b 0000000000000001
0000000000000001 0000000000000000 ffff8801464a4f00 ffff8800a07dfb78
ffffffff810cd883 ffff880146eb9400 ffff8800a3698600 ffff8800a33fe220
Call Trace:
[<ffffffff8141d47b>] dump_stack+0x4f/0x74
[<ffffffff810cd883>] lockdep_rcu_suspicious+0x103/0x140
[<ffffffffa0071261>] btrfs_rm_dev_replace_remove_srcdev+0x111/0x130 [btrfs]
[<ffffffff810d354d>] ? trace_hardirqs_on+0xd/0x10
[<ffffffff81449536>] ? __percpu_counter_sum+0x66/0x80
[<ffffffffa00bcc15>] btrfs_dev_replace_finishing+0x4d5/0xa00 [btrfs]
[<ffffffffa00bc96e>] ? btrfs_dev_replace_finishing+0x22e/0xa00 [btrfs]
[<ffffffffa00a8795>] ? btrfs_scrub_dev+0x415/0x6d0 [btrfs]
[<ffffffffa003ea69>] ? btrfs_start_transaction+0x9/0x20 [btrfs]
[<ffffffffa00bda79>] btrfs_dev_replace_start+0x339/0x590 [btrfs]
[<ffffffff81196aa5>] ? __might_fault+0x95/0xa0
[<ffffffffa0078638>] btrfs_ioctl_dev_replace+0x118/0x160 [btrfs]
[<ffffffff811409c6>] ? stack_trace_call+0x46/0x70
[<ffffffffa007c914>] ? btrfs_ioctl+0x24/0x1770 [btrfs]
[<ffffffffa007ce43>] btrfs_ioctl+0x553/0x1770 [btrfs]
[<ffffffff811409c6>] ? stack_trace_call+0x46/0x70
[<ffffffff811d6eb1>] ? do_vfs_ioctl+0x21/0x5a0
[<ffffffff811d6f1c>] do_vfs_ioctl+0x8c/0x5a0
[<ffffffff811e3336>] ? __fget_light+0x86/0xb0
[<ffffffff811e3369>] ? __fdget+0x9/0x20
[<ffffffff811d7451>] ? SyS_ioctl+0x21/0x80
[<ffffffff811d7483>] SyS_ioctl+0x53/0x80
[<ffffffff81b1efd7>] entry_SYSCALL_64_fastpath+0x12/0x6f

This is because of unprotected use of rcu_dereference in
btrfs_scratch_superblocks. We can't add rcu locks around the whole
function because we read the superblock.

The fix will use the rcu string buffer directly without the rcu locking.
Thi is safe as the device will not go away in the meantime. We're
holding the device list mutexes.

Restructuring the code to narrow down the rcu section turned out to be
impossible, we need to call filp_open (through update_dev_time) on the
buffer and this could call kmalloc/__might_sleep. We could call kstrdup
with GFP_ATOMIC but it's not absolutely necessary.

Fixes: 12b1c2637b6e (Btrfs: enhance btrfs_scratch_superblock to scratch all superblocks)
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>

btrfs: Continue replace when set_block_ro failed

xfstests/011 failed in node with small_size filesystem.
Can be reproduced by following script:
  DEV_LIST="/dev/vdd /dev/vde"
  DEV_REPLACE="/dev/vdf"

  do_test()
  {
      local mkfs_opt="$1"
      local size="$2"

      dmesg -c >/dev/null
      umount $SCRATCH_MNT &>/dev/null

      echo  mkfs.btrfs -f $mkfs_opt "${DEV_LIST[*]}"
      mkfs.btrfs -f $mkfs_opt "${DEV_LIST[@]}" || return 1
      mount "${DEV_LIST[0]}" $SCRATCH_MNT

      echo -n "Writing big files"
      dd if=/dev/urandom of=$SCRATCH_MNT/t0 bs=1M count=1 >/dev/null 2>&1
      for ((i = 1; i <= size; i++)); do
          echo -n .
          /bin/cp $SCRATCH_MNT/t0 $SCRATCH_MNT/t$i || return 1
      done
      echo

      echo Start replace
      btrfs replace start -Bf "${DEV_LIST[0]}" "$DEV_REPLACE" $SCRATCH_MNT || {
          dmesg
          return 1
      }
      return 0
  }

  # Set size to value near fs size
  # for example, 1897 can trigger this bug in 2.6G device.
  #
  ./do_test "-d raid1 -m raid1" 1897

System will report replace fail with following warning in dmesg:
[  134.710853] BTRFS: dev_replace from /dev/vdd (devid 1) to /dev/vdf started
[  135.542390] BTRFS: btrfs_scrub_dev(/dev/vdd, 1, /dev/vdf) failed -28
[  135.543505] ------------[ cut here ]------------
[  135.544127] WARNING: CPU: 0 PID: 4080 at fs/btrfs/dev-replace.c:428 btrfs_dev_replace_start+0x398/0x440()
[  135.545276] Modules linked in:
[  135.545681] CPU: 0 PID: 4080 Comm: btrfs Not tainted 4.3.0 #256
[  135.546439] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
[  135.547798]  ffffffff81c5bfcf ffff88003cbb3d28 ffffffff817fe7b5 0000000000000000
[  135.548774]  ffff88003cbb3d60 ffffffff810a88f1 ffff88002b030000 00000000ffffffe4
[  135.549774]  ffff88003c080000 ffff88003c082588 ffff88003c28ab60 ffff88003cbb3d70
[  135.550758] Call Trace:
[  135.551086]  [<ffffffff817fe7b5>] dump_stack+0x44/0x55
[  135.551737]  [<ffffffff810a88f1>] warn_slowpath_common+0x81/0xc0
[  135.552487]  [<ffffffff810a89e5>] warn_slowpath_null+0x15/0x20
[  135.553211]  [<ffffffff81448c88>] btrfs_dev_replace_start+0x398/0x440
[  135.554051]  [<ffffffff81412c3e>] btrfs_ioctl+0x1d2e/0x25c0
[  135.554722]  [<ffffffff8114c7ba>] ? __audit_syscall_entry+0xaa/0xf0
[  135.555506]  [<ffffffff8111ab36>] ? current_kernel_time64+0x56/0xa0
[  135.556304]  [<ffffffff81201e3d>] do_vfs_ioctl+0x30d/0x580
[  135.557009]  [<ffffffff8114c7ba>] ? __audit_syscall_entry+0xaa/0xf0
[  135.557855]  [<ffffffff810011d1>] ? do_audit_syscall_entry+0x61/0x70
[  135.558669]  [<ffffffff8120d1c1>] ? __fget_light+0x61/0x90
[  135.559374]  [<ffffffff81202124>] SyS_ioctl+0x74/0x80
[  135.559987]  [<ffffffff81809857>] entry_SYSCALL_64_fastpath+0x12/0x6f
[  135.560842] ---[ end trace 2a5c1fc3205abbdd ]---

Reason:
When big data writen to fs, the whole free space will be allocated
for data chunk.
And operation as scrub need to set_block_ro(), and when there is
only one metadata chunk in system(or other metadata chunks
are all full), the function will try to allocate a new chunk,
and failed because no space in device.

Fix:
When set_block_ro failed for metadata chunk, it is not a problem
because scrub_lock paused commit_trancaction in same time, and
metadata are always cowed, so the on-the-fly writepages will not
write data into same place with scrub/replace.
Let replace continue in this case is no problem.

Tested by above script, and xfstests/011, plus 100 times xfstests/070.

Changelog v1->v2:
1: Add detail comments in source and commit-message.
2: Add dmesg detail into commit-message.
3: Limit return value of -ENOSPC to be passed.
All suggested by: Filipe Manana <fdmanana@gmail.com>

Suggested-by: Filipe Manana <fdmanana@gmail.com>
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>

btrfs: fix clashing number of the enhanced balance usage filter

I've accidentally picked an already used number for the enhanced usage
filter represented by BTRFS_BALANCE_ARGS_USAGE_RANGE, clashing with
BTRFS_BALANCE_ARGS_CONVERT. Introduced during the development phase,
no backward compatibility issues.

Reported-by: Holger Hoffstätte <holger.hoffstaette@googlemail.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: bc3094673f22 ("btrfs: extend balance filter usage to take minimum and maximum")
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>

Btrfs: fix the number of transaction units needed to remove a block group

We were using only 1 transaction unit when attempting to delete an unused
block group but in reality we need 3 + N units, where N corresponds to the
number of stripes. We were accounting only for the addition of the orphan
item (for the block group's free space cache inode) but we were not
accounting that we need to delete one block group item from the extent
tree, one free space item from the tree of tree roots and N device extent
items from the device tree.

While one unit is not enough, it worked most of the time because for each
single unit we are too pessimistic and assume an entire tree path, with
the highest possible heigth (8), needs to be COWed with eventual node
splits at every possible level in the tree, so there was usually enough
reserved space for removing all the items and adding the orphan item.

However after adding the orphan item, writepages() can by called by the VM
subsystem against the btree inode when we are under memory pressure, which
causes writeback to start for the nodes we COWed before, this forces the
operation to remove the free space item to COW again some (or all of) the
same nodes (in the tree of tree roots). Even without writepages() being
called, we could fail with ENOSPC because these items are located in
multiple trees and one of them might have a higher heigth and require
node/leaf splits at many levels, exhausting all the reserved space before
removing all the items and adding the orphan.

In the kernel 4.0 release, commit 3d84be799194 ("Btrfs: fix BUG_ON in
btrfs_orphan_add() when delete unused block group"), we attempted to fix
a BUG_ON due to ENOSPC when trying to add the orphan item by making the
cleaner kthread reserve one transaction unit before attempting to remove
the block group, but this was not enough. We had a couple user reports
still hitting the same BUG_ON after 4.0, like Stefan Priebe's report on
a 4.2-rc6 kernel for example:

http://www.spinics.net/lists/linux-btrfs/msg46070.html

So fix this by reserving all the necessary units of metadata.

Reported-by: Stefan Priebe <s.priebe@profihost.ag>
Fixes: 3d84be799194 ("Btrfs: fix BUG_ON in btrfs_orphan_add() when delete unused block group")
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>

Btrfs: use global reserve when deleting unused block group after ENOSPC

It's possible to reach a state where the cleaner kthread isn't able to
start a transaction to delete an unused block group due to lack of enough
free metadata space and due to lack of unallocated device space to allocate
a new metadata block group as well. If this happens try to use space from
the global block group reserve just like we do for unlink operations, so
that we don't reach a permanent state where starting a transaction for
filesystem operations (file creation, renames, etc) keeps failing with
-ENOSPC. Such an unfortunate state was observed on a machine where over
a dozen unused data block groups existed and the cleaner kthread was
failing to delete them due to ENOSPC error when attempting to start a
transaction, and even running balance with a -dusage=0 filter failed with
ENOSPC as well. Also unmounting and mounting again the filesystem didn't
help. Allowing the cleaner kthread to use the global block reserve to
delete the unused data block groups fixed the problem.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>

Btrfs: tests: checking for NULL instead of IS_ERR()

btrfs_alloc_dummy_root() return an error pointer on failure, it never
returns NULL.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>

btrfs: fix signed overflows in btrfs_sync_file

The calculation of range length in btrfs_sync_file leads to signed
overflow. This was caught by PaX gcc SIZE_OVERFLOW plugin.

https://forums.grsecurity.net/viewtopic.php?f=1&t=4284

The fsync call passes 0 and LLONG_MAX, the range length does not fit to
loff_t and overflows, but the value is converted to u64 so it silently
works as expected.

The minimal fix is a typecast to u64, switching functions to take
(start, end) instead of (start, len) would be more intrusive.

Coccinelle script found that there's one more opencoded calculation of
the length.

<smpl>
@@
loff_t start, end;
@@
* end - start
</smpl>

CC: stable@vger.kernel.org
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>

arm64: early_alloc: Fix check for allocation failure

In early_alloc we check if the memblock_alloc failed by checking
the virtual address of the result, which will never fail. This patch
fixes it to check the actual result for failure.

Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

rtc: ds1307: fix kernel splat due to wakeup irq handling

Since commit 3fffd1283927 ("i2c: allow specifying
separate wakeup interrupt in device tree") we have
automatic wakeup irq support for i2c devices. That
commit missed the fact that rtc-1307 had its own
wakeup irq handling and ended up introducing a
kernel splat for at least Beagle x15 boards.

Fix that by reverting original commit _and_ passing
correct interrupt names on DTS so i2c-core can
choose correct IRQ as wakeup.

Now that we have automatic wakeirq support, we can
revert the original commit which did it manually.

Fixes the following warning:

[ 10.346582] WARNING: CPU: 1 PID: 263 at linux/drivers/base/power/wakeirq.c:43 dev_pm_attach_wake_irq+0xbc/0xd4()
[ 10.359244] rtc-ds1307 2-006f: wake irq already initialized

Cc: Tony Lindgren <tony@atomide.com>
Cc: Nishanth Menon <nm@ti.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
Acked-by: Tony Lindgren <tony@atomide.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>

drm/nouveau/volt/pwm/gk104: fix an off-by-one resulting in the voltage not being set

Reported-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Martin Peres <martin.peres@free.fr>

drm/nouveau/nvif: allow userspace access to its own client object

Regression from "abi16: implement limited interoperability with
usif/nvif".

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>