openwrt/staging/blogic.git
8 years agodevicetree: Add Creative Technology vendor id
Marek Vasut [Sun, 8 May 2016 20:33:22 +0000 (22:33 +0200)]
devicetree: Add Creative Technology vendor id

Add vendor ID for Creative technology.

Signed-off-by: Marek Vasut <marek.vasut@gmail.com>
Cc: Antony Pavlov <antonynpavlov@gmail.com>
Cc: Rob Herring <robh@kernel.org>
Cc: devicetree@vger.kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
8 years agogpio: dt-bindings: add ibm,ppc4xx-gpio binding
Christian Lamparter [Wed, 11 May 2016 22:07:48 +0000 (00:07 +0200)]
gpio: dt-bindings: add ibm,ppc4xx-gpio binding

This patch adds binding information for IBM/AMCC/APM GPIO
Controllers of the PowerPC 4XX series and compatible SoCs.

The "PowerPC 405EP Embedded Processor Data Sheet" has the
following to say about the GPIO controllers: "

 - Controller functions and GPIO registers are programmed
   and accessed via memory-mapped OPB bus master accesses

 - All GPIOs are pin-shared with other functions. DCRs control
   whether a particular pin that has GPIO capabilities acts
   as a GPIO or is used for another purpose.

 - Each GPIO outputs is separately programmable to emulate
   an open-drain driver (i.e. drives to zero, threestated if
   output bit is 1)

"
The ppc4xx_gpio.c driver is part of the platform/sysdev drivers
in arch/powerpc/sysdev.

Signed-off-by: Christian Lamparter <chunkeey@googlemail.com>
Signed-off-by: Rob Herring <robh@kernel.org>
8 years agoof/unittest: Remove unnecessary module.h header inclusion
Javier Martinez Canillas [Wed, 11 May 2016 20:26:18 +0000 (16:26 -0400)]
of/unittest: Remove unnecessary module.h header inclusion

The OF_UNITTEST Kconfig symbol is bool so this unittest can only be
built-in and not build as a module. Also, nothing defined in this
header file used so is not necessary to include it.

Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
Signed-off-by: Rob Herring <robh@kernel.org>
8 years agodrivers/of: Fix build warning in populate_node()
Gavin Shan [Fri, 13 May 2016 11:31:39 +0000 (21:31 +1000)]
drivers/of: Fix build warning in populate_node()

Function populate_node() is used to unflatten FDT blob to device
tree. It supports maximal 64 level of device nodes. There is one
array @fpsizes[64] tracking the full name length of last unflattened
device node in the corresponding level (index of element in the
array - 1). Build warning is seen with CONFIG_FRAME_WARN=1024 like
below on ARM64 as Geert reported. The issue can be reproduced on
PPC64 as well.

  $ make drivers/of/fdt.o
  drivers/of/fdt.c:443:1: warning: the frame size of 1136 bytes is \
  larger than 1024 bytes [-Wframe-larger-than=]

This changes the data type of @fpsizes[i] from "unsigned long" to
"unsigned int" to avoid the build warning. The return value type
of populate_node() and its @fpsize argument is adjusted accordingly.
With this applied, 256 bytes saved from the stack frame on ARM64 and
PPC64 platforms and the above warning isn't seen.

Fixes: 50800082f176 ("drivers/of: Avoid recursively calling unflatten_dt_node()")
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Rob Herring <robh@kernel.org>
8 years agodrivers/of: Fix depth when unflattening devicetree
Rhyland Klein [Wed, 11 May 2016 17:36:57 +0000 (13:36 -0400)]
drivers/of: Fix depth when unflattening devicetree

When the implementation for unflatten_dt_node() changed from being
recursive to being non-recursive, it had a side effect of increasing the
depth passed to fdt_next_node() by 1. This is fine most of the time, but
it seems that when the end of the dtb is being parsed, it will cause the
FDT_END condition in fdt_next_node() to return a different value
(returning nextoffset instead of -FDT_ERR_NOTFOUND). This ends up passing
an FDT_ERR_TRUNCATED error back to the unflatten_dt_node() which then
sees that and complains "Error -8 processing FDT" causing boot to fail.

This patch simply avoids incrementing depth and uses modified accesses
for local array indices so that the depth is the same as it was before
the change as far as fdt_next_node() is concerned.

This problem was discovered trying to boot Tegra210-Smaug platforms.

Fixes: 50800082f176 ("drivers/of: Avoid recursively calling unflatten_dt_node()")
Signed-off-by: Rhyland Klein <rklein@nvidia.com>
[robh: squashed in KASAN fix from Rhyland]
Signed-off-by: Rob Herring <robh@kernel.org>
8 years agoof: dynamic: changeset prop-update revert fix
Pantelis Antoniou [Mon, 9 May 2016 13:20:42 +0000 (16:20 +0300)]
of: dynamic: changeset prop-update revert fix

When reverting an update property changeset entry that created a
property the reverse operation is a remove property and not an update.

Signed-off-by: Pantelis Antoniou <pantelis.antoniou@konsulko.com>
Signed-off-by: Rob Herring <robh@kernel.org>
8 years agodrivers/of: Export of_detach_node()
Gavin Shan [Tue, 3 May 2016 13:22:52 +0000 (23:22 +1000)]
drivers/of: Export of_detach_node()

This exports of_detach_node() for PowerPC PowerNV PCI hotplug
driver. No functional changes introduced.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Rob Herring <robh@kernel.org>
8 years agodrivers/of: Return allocated memory from of_fdt_unflatten_tree()
Gavin Shan [Tue, 3 May 2016 13:22:51 +0000 (23:22 +1000)]
drivers/of: Return allocated memory from of_fdt_unflatten_tree()

This returns the allocate memory chunk, storing the unflattened device
tree, from of_fdt_unflatten_tree() so that memory chunk can be released
on demand in PowerNV PCI hotplug driver.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Rob Herring <robh@kernel.org>
8 years agodrivers/of: Specify parent node in of_fdt_unflatten_tree()
Gavin Shan [Tue, 3 May 2016 13:22:50 +0000 (23:22 +1000)]
drivers/of: Specify parent node in of_fdt_unflatten_tree()

This adds one more argument to of_fdt_unflatten_tree() to specify
the parent node of the FDT blob that is going to be unflattened.
In the result, the function can be used to unflatten FDT blob that
represents device sub-tree in PowerNV PCI hotplug driver.

Cc: Jyri Sarha <jsarha@ti.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Acked-by: Rob Herring <robh@kernel.org>
Acked-by: Jyri Sarha <jsarha@ti.com>
Signed-off-by: Rob Herring <robh@kernel.org>
8 years agodrivers/of: Rename unflatten_dt_node()
Gavin Shan [Tue, 3 May 2016 13:22:49 +0000 (23:22 +1000)]
drivers/of: Rename unflatten_dt_node()

This renames unflatten_dt_node() to unflatten_dt_nodes() as it
populates multiple device nodes from FDT blob. No logical changes
introduced.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Rob Herring <robh@kernel.org>
8 years agodrivers/of: Avoid recursively calling unflatten_dt_node()
Gavin Shan [Tue, 3 May 2016 13:22:48 +0000 (23:22 +1000)]
drivers/of: Avoid recursively calling unflatten_dt_node()

In current implementation, unflatten_dt_node() is called recursively
to unflatten device nodes in FDT blob. It's stress to limited stack
capacity, especially to adopt the function to unflatten device sub-tree
that possibly has multiple root nodes. In that case, we runs out of
stack and the system can't boot up successfully.

In order to reuse the function to unflatten device sub-tree, this avoids
calling the function recursively, meaning the device nodes are unflattened
in one call on unflatten_dt_node(): two arrays are introduced to track the
parent path size and the device node of current level of depth, which will
be used by the device node on next level of depth to be unflattened. All
device nodes in more than 64 level of depth are dropped and hopefully,
the system can boot up successfully with the partial device-tree.

Also, the parameter "poffset" and "fpsize" are unused and dropped and the
parameter "dryrun" is figured out from "mem == NULL". Besides, the return
value of the function is changed to indicate the size of memory consumed by
the unflatten device tree or error code.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Rob Herring <robh@kernel.org>
8 years agodrivers/of: Split unflatten_dt_node()
Gavin Shan [Tue, 3 May 2016 13:22:47 +0000 (23:22 +1000)]
drivers/of: Split unflatten_dt_node()

The function unflatten_dt_node() is called recursively to unflatten
device nodes and properties in the FDT blob. It looks complicated
and hard to be understood.

This splits the function into 3 functions: populate_properties(),
populate_node() and unflatten_dt_node(). populate_properties(),
which is called by populate_node(), creates properties for the
indicated device node. The later one creates the device nodes
from FDT blob. populate_node() gets the offset in FDT blob for
next device nodes and then calls populate_node(). No logical
changes introduced.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Rob Herring <robh@kernel.org>
8 years agoof: include errno.h in of_graph.h
Arnd Bergmann [Mon, 2 May 2016 10:59:19 +0000 (12:59 +0200)]
of: include errno.h in of_graph.h

When CONFIG_OF is disabled, we have to include linux/errno.h before
including of_graph.h, or get build errors like in the newly added
sun4i drm driver:

In file included from ../drivers/gpu/drm/sun4i/sun4i_drv.c:14:0:
include/linux/of_graph.h: In function 'of_graph_parse_endpoint':
include/linux/of_graph.h:58:10: error: 'ENOSYS' undeclared (first use in this function)

A better solution is to ensure that the header can be included
by itself, so let's include linux/errno.h here to fix the error
we just got, and any similar future error.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 9026e0d122ac ("drm: Add Allwinner A10 Display Engine support")
Signed-off-by: Rob Herring <robh@kernel.org>
8 years agoof: document refcount incrementation of of_get_cpu_node()
Masahiro Yamada [Wed, 20 Apr 2016 01:18:46 +0000 (10:18 +0900)]
of: document refcount incrementation of of_get_cpu_node()

This function increments refcount.  This is worth noting.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Frank Rowand <frank.rowand@am.sony.com>
Signed-off-by: Rob Herring <robh@kernel.org>
8 years agoDocumentation: dt: soc: fix spelling mistakes
Eric Engestrom [Mon, 25 Apr 2016 00:24:19 +0000 (01:24 +0100)]
Documentation: dt: soc: fix spelling mistakes

Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Signed-off-by: Rob Herring <robh@kernel.org>
8 years agoDocumentation: dt: power: fix spelling mistake
Eric Engestrom [Mon, 25 Apr 2016 00:24:18 +0000 (01:24 +0100)]
Documentation: dt: power: fix spelling mistake

Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Signed-off-by: Rob Herring <robh@kernel.org>
8 years agoDocumentation: dt: pinctrl: fix spelling mistake
Eric Engestrom [Mon, 25 Apr 2016 00:24:17 +0000 (01:24 +0100)]
Documentation: dt: pinctrl: fix spelling mistake

Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Signed-off-by: Rob Herring <robh@kernel.org>
8 years agoDocumentation: dt: opp: fix spelling mistake
Eric Engestrom [Mon, 25 Apr 2016 00:24:16 +0000 (01:24 +0100)]
Documentation: dt: opp: fix spelling mistake

Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rob Herring <robh@kernel.org>
8 years agoDocumentation: dt: net: fix spelling mistakes
Eric Engestrom [Mon, 25 Apr 2016 00:24:15 +0000 (01:24 +0100)]
Documentation: dt: net: fix spelling mistakes

Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Signed-off-by: Rob Herring <robh@kernel.org>
8 years agoDocumentation: dt: mtd: fix spelling mistake
Eric Engestrom [Mon, 25 Apr 2016 00:24:14 +0000 (01:24 +0100)]
Documentation: dt: mtd: fix spelling mistake

Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Signed-off-by: Rob Herring <robh@kernel.org>
8 years agoDocumentation: dt: mmc: fix spelling mistake
Eric Engestrom [Mon, 25 Apr 2016 00:24:13 +0000 (01:24 +0100)]
Documentation: dt: mmc: fix spelling mistake

Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Signed-off-by: Rob Herring <robh@kernel.org>
8 years agoDocumentation: dt: mfd: fix spelling mistakes
Eric Engestrom [Mon, 25 Apr 2016 00:24:12 +0000 (01:24 +0100)]
Documentation: dt: mfd: fix spelling mistakes

Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Signed-off-by: Rob Herring <robh@kernel.org>
8 years agoDocumentation: dt: media: fix spelling mistake
Eric Engestrom [Mon, 25 Apr 2016 00:24:11 +0000 (01:24 +0100)]
Documentation: dt: media: fix spelling mistake

Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Signed-off-by: Rob Herring <robh@kernel.org>
8 years agoDocumentation: dt: interrupt-controller: fix spelling mistakes
Eric Engestrom [Mon, 25 Apr 2016 00:24:10 +0000 (01:24 +0100)]
Documentation: dt: interrupt-controller: fix spelling mistakes

Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Signed-off-by: Rob Herring <robh@kernel.org>
8 years agoDocumentation: dt: input: fix spelling mistakes
Eric Engestrom [Mon, 25 Apr 2016 00:24:09 +0000 (01:24 +0100)]
Documentation: dt: input: fix spelling mistakes

Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Signed-off-by: Rob Herring <robh@kernel.org>
8 years agoDocumentation: dt: dma: fix spelling mistake
Eric Engestrom [Mon, 25 Apr 2016 00:24:08 +0000 (01:24 +0100)]
Documentation: dt: dma: fix spelling mistake

Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Signed-off-by: Rob Herring <robh@kernel.org>
8 years agoDocumentation: dt: display: fix spelling mistake
Eric Engestrom [Mon, 25 Apr 2016 00:24:07 +0000 (01:24 +0100)]
Documentation: dt: display: fix spelling mistake

Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Signed-off-by: Rob Herring <robh@kernel.org>
8 years agoDocumentation: dt: clock: fix spelling mistakes
Eric Engestrom [Mon, 25 Apr 2016 00:24:06 +0000 (01:24 +0100)]
Documentation: dt: clock: fix spelling mistakes

Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Heiko Stuebner <heiko@sntech.de>
[robh: s/describe/described/]
Signed-off-by: Rob Herring <robh@kernel.org>
8 years agoDocumentation: dt: arm: fix spelling mistakes
Eric Engestrom [Mon, 25 Apr 2016 00:24:05 +0000 (01:24 +0100)]
Documentation: dt: arm: fix spelling mistakes

Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Signed-off-by: Rob Herring <robh@kernel.org>
8 years agoserial: Move Marvell UART DT bindings to correct location
Geert Uytterhoeven [Fri, 22 Apr 2016 15:29:16 +0000 (17:29 +0200)]
serial: Move Marvell UART DT bindings to correct location

All other UART DT binding documentation is under
Documentation/devicetree/bindings/serial/.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Rob Herring <robh@kernel.org>
8 years agoDocumentation: bindings: fix palmas-rtc documentation
Dr. H. Nikolaus Schaller [Thu, 21 Apr 2016 16:03:15 +0000 (18:03 +0200)]
Documentation: bindings: fix palmas-rtc documentation

change 100mA -> 100uA

Signed-off-by: H. Nikolaus Schaller <hns@goldelico.com>
Signed-off-by: Rob Herring <robh@kernel.org>
8 years agophy: phy-stih41x-usb: DT spelling s/#phy-cell/#phy-cells/
Geert Uytterhoeven [Wed, 20 Apr 2016 15:32:17 +0000 (17:32 +0200)]
phy: phy-stih41x-usb: DT spelling s/#phy-cell/#phy-cells/

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Rob Herring <robh@kernel.org>
8 years agoPCI: hisi: DT spelling s/interrupts-*/interrupt-*/
Geert Uytterhoeven [Wed, 20 Apr 2016 15:32:16 +0000 (17:32 +0200)]
PCI: hisi: DT spelling s/interrupts-*/interrupt-*/

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Rob Herring <robh@kernel.org>
8 years agomisc: sram: DT spelling s/#adress-cells/#address-cells/
Geert Uytterhoeven [Wed, 20 Apr 2016 15:32:15 +0000 (17:32 +0200)]
misc: sram: DT spelling s/#adress-cells/#address-cells/

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Rob Herring <robh@kernel.org>
8 years agodevicetree: bindings: designware-pcie: Fix unit address
Stephen Boyd [Tue, 12 Apr 2016 20:01:34 +0000 (13:01 -0700)]
devicetree: bindings: designware-pcie: Fix unit address

Remove the 0x in the unit address because it shouldn't be there.

Cc: Joao Pinto <jpinto@synopsys.com>
Signed-off-by: Stephen Boyd <stephen.boyd@linaro.org>
Signed-off-by: Rob Herring <robh@kernel.org>
8 years agodt-bindings: tegra: Rename some bindings for consistency
Thierry Reding [Tue, 12 Apr 2016 15:07:36 +0000 (17:07 +0200)]
dt-bindings: tegra: Rename some bindings for consistency

Device tree binding for NVIDIA Tegra have traditionally carried the
"nvidia," vendor prefix in the filename. A couple of odd ones don't, so
fix them up for consistency.

Also rename existing bindings to reflect the first compatible value that
they document. This wasn't done consistently either.

Acked-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Rob Herring <robh@kernel.org>
8 years agodt-bindings: tegra: Remove 0, prefix from unit-addresses
Thierry Reding [Tue, 12 Apr 2016 15:07:35 +0000 (17:07 +0200)]
dt-bindings: tegra: Remove 0, prefix from unit-addresses

When Tegra124 support was first merged the unit-addresses of all devices
were listed with a "0," prefix to encode the reg property's second cell.
It turns out that this notation is not correct, and the "," separator is
only used to separate fields in the unit address (such as the device and
function number in PCI devices), not individual cells for addresses
with more than one cell.

Acked-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Rob Herring <robh@kernel.org>
8 years agoof: Add Inforce Computing to vendor prefix list
Srinivas Kandagatla [Tue, 12 Apr 2016 09:29:57 +0000 (10:29 +0100)]
of: Add Inforce Computing to vendor prefix list

This patch adds Inforce Computing to vendor prefix list.
This vendor makes boards like IFC6410, IFC6540 based on Qualcomm SOCs.

Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Signed-off-by: Rob Herring <robh@kernel.org>
8 years agoof: Add Arrow Electronics to vendor prefix list
Srinivas Kandagatla [Tue, 12 Apr 2016 09:29:56 +0000 (10:29 +0100)]
of: Add Arrow Electronics to vendor prefix list

This patch adds Arrow Electronics to vendor perfix list, as this vendor
makes some of the Qualcomm SOC based 96boards like DB600c and DB410c.

Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Signed-off-by: Rob Herring <robh@kernel.org>
8 years agoof: Add vendor prefix for Shenzhen Embest Technology
Sergio Prado [Sat, 9 Apr 2016 18:14:25 +0000 (15:14 -0300)]
of: Add vendor prefix for Shenzhen Embest Technology

Signed-off-by: Sergio Prado <sergio.prado@e-labworks.com>
Signed-off-by: Rob Herring <robh@kernel.org>
8 years agoDocumentation: devicetree: bindings: regulator: palmas-pmic.txt
Schuyler Patton [Tue, 29 Mar 2016 12:42:43 +0000 (18:12 +0530)]
Documentation: devicetree: bindings: regulator: palmas-pmic.txt

Adding support for the tps659038 pmic so it doesn't generate a warning
when running the patch check script to
Documentation/devicetree/bindings/regulator/palmas-pmic.txt

Adding a note that the tps659037 device is a OTP spin of the
tps659038 pmic and device compatible.

Signed-off-by: Schuyler Patton <spatton@ti.com>
Signed-off-by: Keerthy <j-keerthy@ti.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Rob Herring <robh@kernel.org>
8 years agodt-bindings: Correct path for ARM GIC documentation
Jon Hunter [Thu, 17 Mar 2016 14:47:57 +0000 (14:47 +0000)]
dt-bindings: Correct path for ARM GIC documentation

Commit eb3fcf007fff ("dt-bindings: consolidate interrupt controller
bindings") moved the binding documentation for the ARM GIC from
arm/gic.txt to interrupt-controller/arm,gic.txt. However, there are
still some binding documents referring to the old path. Update these
binding documents to use the correct location.

Fixes: eb3fcf007fff ("dt-bindings: consolidate interrupt controller bindings")
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Matthias Brugger <matthias.bgg@gmail.com>
Signed-off-by: Rob Herring <robh@kernel.org>
8 years agoDocumentation: devicetree: Clean up gpio-keys example
Andreas Färber [Wed, 16 Mar 2016 11:53:29 +0000 (12:53 +0100)]
Documentation: devicetree: Clean up gpio-keys example

Drop #address-cells and #size-cells, which are not required by the
gpio-keys binding documentation, as button sub-nodes are not devices.

Rename sub-nodes to avoid new dtc unit address warnings when copied.

While at it, adopt the dashes convention for the node name.

Reported-by: Julien Chauveau <chauveau.julien@gmail.com>
Cc: Julien Chauveau <chauveau.julien@gmail.com>
Cc: Javier Martinez Canillas <javier@dowhile0.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Javier Martinez Canillas <javier@osg.samsung.com>
Reviewed-by: Julien Chauveau <chauveau.julien@gmail.com>
Signed-off-by: Rob Herring <robh@kernel.org>
8 years agoiommu/arm-smmu: Make use of phandle iterators in device-tree parsing
Joerg Roedel [Mon, 4 Apr 2016 15:49:22 +0000 (17:49 +0200)]
iommu/arm-smmu: Make use of phandle iterators in device-tree parsing

Remove the usage of of_parse_phandle_with_args() and replace
it by the phandle-iterator implementation so that we can
parse out all of the potentially present 128 stream-ids.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Rob Herring <robh@kernel.org>
8 years agoof: Introduce of_phandle_iterator_args()
Joerg Roedel [Mon, 4 Apr 2016 15:49:21 +0000 (17:49 +0200)]
of: Introduce of_phandle_iterator_args()

This helper function can be used to copy the arguments of a
phandle to an array.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Rob Herring <robh@kernel.org>
8 years agoof: Introduce of_for_each_phandle() helper macro
Joerg Roedel [Mon, 4 Apr 2016 15:49:20 +0000 (17:49 +0200)]
of: Introduce of_for_each_phandle() helper macro

With this macro any user can easily iterate over a list of
phandles. The patch also converts __of_parse_phandle_with_args()
to make use of the macro.

The of_count_phandle_with_args() function is not converted,
because the macro hides the return value of of_phandle_iterator_init(),
which is needed in there.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Rob Herring <robh@kernel.org>
8 years agoof: Remove counting special case from __of_parse_phandle_with_args()
Joerg Roedel [Mon, 4 Apr 2016 15:49:19 +0000 (17:49 +0200)]
of: Remove counting special case from __of_parse_phandle_with_args()

The index = -1 case in __of_parse_phandle_with_args() is
used to just return the number of phandles. That special
case needs extra handling, so move it to the place where it
is needed: of_count_phandle_with_args().

This allows to further simplify __of_parse_phandle_with_args()
later on.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Rob Herring <robh@kernel.org>
8 years agoof: Move phandle walking to of_phandle_iterator_next()
Joerg Roedel [Mon, 4 Apr 2016 15:49:18 +0000 (17:49 +0200)]
of: Move phandle walking to of_phandle_iterator_next()

Move the code to walk over the phandles out of the loop in
__of_parse_phandle_with_args() to a separate function that
just works with the iterator handle: of_phandle_iterator_next().

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Rob Herring <robh@kernel.org>
8 years agoof: Introduce struct of_phandle_iterator
Joerg Roedel [Mon, 4 Apr 2016 15:49:17 +0000 (17:49 +0200)]
of: Introduce struct of_phandle_iterator

This struct carrys all necessary information to iterate over
a list of phandles and extract the arguments. Add an
init-function for the iterator and make use of it in
__of_parse_phandle_with_args().

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Rob Herring <robh@kernel.org>
8 years agofdt: fix extend of cmd line
Max Uvarov [Wed, 13 Apr 2016 09:52:16 +0000 (12:52 +0300)]
fdt: fix extend of cmd line

On arm CONFIG_CMDLINE_EXTEND does not append build-in
cmdline in kernel to U-boot parameters. Fix it here.
Theoretically this patch should repair kdump work where
it adds elfcorehdr= and memmap additional parameters
to second kernel.

Signed-off-by: Max Uvarov <muvarov@gmail.com>
Signed-off-by: Rob Herring <robh@kernel.org>
8 years agodtc: turn off dtc unit address warnings by default
Rob Herring [Thu, 24 Mar 2016 15:52:42 +0000 (10:52 -0500)]
dtc: turn off dtc unit address warnings by default

The newly added dtc warning to check DT unit-address without reg
property and vice-versa generates lots of warnings. Turn off the check
unless building with W=1 or W=2.

Signed-off-by: Rob Herring <robh@kernel.org>
Cc: Michal Marek <mmarek@suse.com>
Cc: linux-kbuild@vger.kernel.org
8 years agoscripts/dtc: Update to upstream version 53bf130b1cdd
Rob Herring [Fri, 4 Mar 2016 14:56:58 +0000 (08:56 -0600)]
scripts/dtc: Update to upstream version 53bf130b1cdd

Sync to upstream dtc commit 53bf130b1cdd ("libfdt: simplify
fdt_node_check_compatible()"). This adds the following commits from
upstream:

53bf130 libfdt: simplify fdt_node_check_compatible()
c9d9121 Warn on node name unit-address presence/absence mismatch
2e53f9d Catch unsigned 32bit overflow when parsing flattened device tree offsets

Signed-off-by: Rob Herring <robh@kernel.org>
8 years agoLinux 4.6-rc1
Linus Torvalds [Sat, 26 Mar 2016 23:03:24 +0000 (16:03 -0700)]
Linux 4.6-rc1

8 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph...
Linus Torvalds [Sat, 26 Mar 2016 22:53:16 +0000 (15:53 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/sage/ceph-client

Pull Ceph updates from Sage Weil:
 "There is quite a bit here, including some overdue refactoring and
  cleanup on the mon_client and osd_client code from Ilya, scattered
  writeback support for CephFS and a pile of bug fixes from Zheng, and a
  few random cleanups and fixes from others"

[ I already decided not to pull this because of it having been rebased
  recently, but ended up changing my mind after all.  Next time I'll
  really hold people to it.  Oh well.   - Linus ]

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (34 commits)
  libceph: use KMEM_CACHE macro
  ceph: use kmem_cache_zalloc
  rbd: use KMEM_CACHE macro
  ceph: use lookup request to revalidate dentry
  ceph: kill ceph_get_dentry_parent_inode()
  ceph: fix security xattr deadlock
  ceph: don't request vxattrs from MDS
  ceph: fix mounting same fs multiple times
  ceph: remove unnecessary NULL check
  ceph: avoid updating directory inode's i_size accidentally
  ceph: fix race during filling readdir cache
  libceph: use sizeof_footer() more
  ceph: kill ceph_empty_snapc
  ceph: fix a wrong comparison
  ceph: replace CURRENT_TIME by current_fs_time()
  ceph: scattered page writeback
  libceph: add helper that duplicates last extent operation
  libceph: enable large, variable-sized OSD requests
  libceph: osdc->req_mempool should be backed by a slab pool
  libceph: make r_request msg_size calculation clearer
  ...

8 years agoMerge tag 'ofs-pull-tag-1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap...
Linus Torvalds [Sat, 26 Mar 2016 19:59:04 +0000 (12:59 -0700)]
Merge tag 'ofs-pull-tag-1' of git://git./linux/kernel/git/hubcap/linux

Pull orangefs filesystem from Mike Marshall.

This finally merges the long-pending orangefs filesystem, which has been
much cleaned up with input from Al Viro over the last six months.  From
the documentation file:

 "OrangeFS is an LGPL userspace scale-out parallel storage system.  It
  is ideal for large storage problems faced by HPC, BigData, Streaming
  Video, Genomics, Bioinformatics.

  Orangefs, originally called PVFS, was first developed in 1993 by Walt
  Ligon and Eric Blumer as a parallel file system for Parallel Virtual
  Machine (PVM) as part of a NASA grant to study the I/O patterns of
  parallel programs.

  Orangefs features include:

    - Distributes file data among multiple file servers
    - Supports simultaneous access by multiple clients
    - Stores file data and metadata on servers using local file system
      and access methods
    - Userspace implementation is easy to install and maintain
    - Direct MPI support
    - Stateless"

see Documentation/filesystems/orangefs.txt for more in-depth details.

* tag 'ofs-pull-tag-1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux: (174 commits)
  orangefs: fix orangefs_superblock locking
  orangefs: fix do_readv_writev() handling of error halfway through
  orangefs: have ->kill_sb() evict the VFS side of things first
  orangefs: sanitize ->llseek()
  orangefs-bufmap.h: trim unused junk
  orangefs: saner calling conventions for getting a slot
  orangefs_copy_{to,from}_bufmap(): don't pass bufmap pointer
  orangefs: get rid of readdir_handle_s
  ornagefs: ensure that truncate has an up to date inode size
  orangefs: move code which sets i_link to orangefs_inode_getattr
  orangefs: remove needless wrapper around GFP_KERNEL
  orangefs: remove wrapper around mutex_lock(&inode->i_mutex)
  orangefs: refactor inode type or link_target change detection
  orangefs: use new getattr for revalidate and remove old getattr
  orangefs: use new getattr in inode getattr and permission
  orangefs: use new orangefs_inode_getattr to get size in write and llseek
  orangefs: use new orangefs_inode_getattr to create new inodes
  orangefs: rename orangefs_inode_getattr to orangefs_inode_old_getattr
  orangefs: remove inode->i_lock wrapper
  orangefs: put register_chrdev immediately before register_filesystem
  ...

8 years agoMerge tag 'ntb-4.6' of git://github.com/jonmason/ntb
Linus Torvalds [Sat, 26 Mar 2016 18:37:42 +0000 (11:37 -0700)]
Merge tag 'ntb-4.6' of git://github.com/jonmason/ntb

Pull NTB bug fixes from Jon Mason:
 "NTB bug fixes for tasklet from spinning forever, link errors,
  translation window setup, NULL ptr dereference, and ntb-perf errors.

  Also, a modification to the driver API that makes _addr functions
  optional"

* tag 'ntb-4.6' of git://github.com/jonmason/ntb:
  NTB: Remove _addr functions from ntb_hw_amd
  NTB: Make _addr functions optional in the API
  NTB: Fix incorrect clean up routine in ntb_perf
  NTB: Fix incorrect return check in ntb_perf
  ntb: fix possible NULL dereference
  ntb: add missing setup of translation window
  ntb: stop link work when we do not have memory
  ntb: stop tasklet from spinning forever during shutdown.
  ntb: perf test: fix address space confusion

8 years agoMerge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Linus Torvalds [Sat, 26 Mar 2016 18:31:01 +0000 (11:31 -0700)]
Merge tag 'scsi-misc' of git://git./linux/kernel/git/jejb/scsi

Pull more SCSI updates from James Bottomley:
 "The only new stuff which missed the first pull request is an update to
  the UFS driver.

  The rest is an assortment of bug fixes and minor tweaks which appeared
  recently (some are fixes for recent code and some are stuff spotted
  recently by the checkers or the new gcc-6 compiler [most of Arnd's
  stuff])"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (32 commits)
  scsi_common: do not clobber fixed sense information
  scsi: ufs: select CONFIG_NLS
  scsi: fc: use get/put_unaligned64 for wwn access
  fnic: move printk()s outside of the critical code section.
  qla2xxx: avoid maybe_uninitialized warning
  megaraid_sas: add missing curly braces in ioctl handler
  lpfc: fix misleading indentation
  scsi_transport_sas: add 'scsi_target_id' sysfs attribute
  scsi_dh_alua: uninitialized variable in alua_check_vpd()
  scsi: ufs-qcom: add printouts of testbus debug registers
  scsi: ufs-qcom: enable/disable the device ref clock
  scsi: ufs-qcom: set PA_Local_TX_LCC_Enable before link startup
  scsi: ufs: add device quirk delay before putting UFS rails in LPM
  scsi: ufs: fix leakage during link off state
  scsi: ufs: tune UniPro parameters to optimize hibern8 exit time
  scsi: ufs: handle non spec compliant bkops behaviour by device
  scsi: ufs: add retry for query descriptors
  scsi: ufs: add error recovery after DL NAC error
  scsi: ufs: make error handling bit faster
  scsi: ufs: disable vccq if it's not needed by UFS device
  ...

8 years agof2fs/crypto: fix xts_tweak initialization
Linus Torvalds [Sat, 26 Mar 2016 17:13:05 +0000 (10:13 -0700)]
f2fs/crypto: fix xts_tweak initialization

Commit 0b81d07790726 ("fs crypto: move per-file encryption from f2fs
tree to fs/crypto") moved the f2fs crypto files to fs/crypto/ and
renamed the symbol prefixes from "f2fs_" to "fscrypt_" (and from "F2FS_"
to just "FS" for preprocessor symbols).

Because of the symbol renaming, it's a bit hard to see it as a file
move: use

    git show -M30 0b81d07790726

to lower the rename detection to just 30% similarity and make git show
the files as renamed (the header file won't be shown as a rename even
then - since all it contains is symbol definitions, it looks almost
completely different).

Even with the renames showing as renames, the diffs are not all that
easy to read, since so much is just the renames.  But Eric Biggers
noticed that it's not just all renames: the initialization of the
xts_tweak had been broken too, using the inode number rather than the
page offset.

That's not right - it makes the xfs_tweak the same for all pages of each
inode.  It _might_ make sense to make the xfs_tweak contain both the
offset _and_ the inode number, but not just the inode number.

Reported-by: Eric Biggers <ebiggers3@gmail.com>
Cc: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoNTB: Remove _addr functions from ntb_hw_amd
Allen Hubbe [Mon, 21 Mar 2016 08:53:14 +0000 (04:53 -0400)]
NTB: Remove _addr functions from ntb_hw_amd

Kernel zero day testing warned about address space confusion.  A virtual
iomem address was used where a physical address is expected.  The
offending functions implement an optional part of the api, so they are
removed.  They can be added later, after testing.

Fixes: a1b3695820aa490e58915d720a1438069813008b
Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com>
Acked-by: Xiangliang Yu <Xiangliang.Yu@amd.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
8 years agoorangefs: fix orangefs_superblock locking
Al Viro [Fri, 25 Mar 2016 23:56:34 +0000 (19:56 -0400)]
orangefs: fix orangefs_superblock locking

* switch orangefs_remount() to taking ORANGEFS_SB(sb) instead of sb
* remove from the list _before_ orangefs_unmount() - request_mutex
in the latter will make sure that nothing observed in the loop in
ORANGEFS_DEV_REMOUNT_ALL handling will get freed until the end
of loop
* on removal, keep the forward pointer and zero the back one.  That
way we can drop and regain the spinlock in the loop body (again,
ORANGEFS_DEV_REMOUNT_ALL one) and still be able to get to the
rest of the list.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
8 years agoorangefs: fix do_readv_writev() handling of error halfway through
Al Viro [Wed, 17 Feb 2016 01:15:43 +0000 (20:15 -0500)]
orangefs: fix do_readv_writev() handling of error halfway through

Error should only be returned if nothing had been read/written.
Otherwise we need to report a short read/write instead.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
8 years agoorangefs: have ->kill_sb() evict the VFS side of things first
Al Viro [Wed, 17 Feb 2016 02:08:29 +0000 (21:08 -0500)]
orangefs: have ->kill_sb() evict the VFS side of things first

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
8 years agoorangefs: sanitize ->llseek()
Al Viro [Wed, 17 Feb 2016 01:25:19 +0000 (20:25 -0500)]
orangefs: sanitize ->llseek()

a) open files can't have NULL inodes
b) it's SEEK_END, not ORANGEFS_SEEK_END; no need to get cute.
c) make_bad_inode() on lseek()?

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
8 years agoorangefs-bufmap.h: trim unused junk
Al Viro [Wed, 17 Feb 2016 01:12:04 +0000 (20:12 -0500)]
orangefs-bufmap.h: trim unused junk

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
8 years agoorangefs: saner calling conventions for getting a slot
Al Viro [Wed, 17 Feb 2016 01:10:26 +0000 (20:10 -0500)]
orangefs: saner calling conventions for getting a slot

just have it return the slot number or -E... - the caller checks
the sign anyway

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
8 years agoorangefs_copy_{to,from}_bufmap(): don't pass bufmap pointer
Al Viro [Wed, 17 Feb 2016 01:06:19 +0000 (20:06 -0500)]
orangefs_copy_{to,from}_bufmap(): don't pass bufmap pointer

it's always __orangefs_bufmap

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
8 years agoorangefs: get rid of readdir_handle_s
Al Viro [Wed, 17 Feb 2016 00:54:13 +0000 (19:54 -0500)]
orangefs: get rid of readdir_handle_s

no point, really - we couldn't keep those across the calls of
getdents(); it would be too easy to DoS, having all slots exhausted.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
8 years agoMerge branch 'akpm' (patches from Andrew)
Linus Torvalds [Fri, 25 Mar 2016 23:59:11 +0000 (16:59 -0700)]
Merge branch 'akpm' (patches from Andrew)

Merge fourth patch-bomb from Andrew Morton:
 "A lot more stuff than expected, sorry.  A bunch of ocfs2 reviewing was
  finished off.

   - mhocko's oom-reaper out-of-memory-handler changes

   - ocfs2 fixes and features

   - KASAN feature work

   - various fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (42 commits)
  thp: fix typo in khugepaged_scan_pmd()
  MAINTAINERS: fill entries for KASAN
  mm/filemap: generic_file_read_iter(): check for zero reads unconditionally
  kasan: test fix: warn if the UAF could not be detected in kmalloc_uaf2
  mm, kasan: stackdepot implementation. Enable stackdepot for SLAB
  arch, ftrace: for KASAN put hard/soft IRQ entries into separate sections
  mm, kasan: add GFP flags to KASAN API
  mm, kasan: SLAB support
  kasan: modify kmalloc_large_oob_right(), add kmalloc_pagealloc_oob_right()
  include/linux/oom.h: remove undefined oom_kills_count()/note_oom_kill()
  mm/page_alloc: prevent merging between isolated and other pageblocks
  drivers/memstick/host/r592.c: avoid gcc-6 warning
  ocfs2: extend enough credits for freeing one truncate record while replaying truncate records
  ocfs2: extend transaction for ocfs2_remove_rightmost_path() and ocfs2_update_edge_lengths() before to avoid inconsistency between inode and et
  ocfs2/dlm: move lock to the tail of grant queue while doing in-place convert
  ocfs2: solve a problem of crossing the boundary in updating backups
  ocfs2: fix occurring deadlock by changing ocfs2_wq from global to local
  ocfs2/dlm: fix BUG in dlm_move_lockres_to_recovery_list
  ocfs2/dlm: fix race between convert and recovery
  ocfs2: fix a deadlock issue in ocfs2_dio_end_io_write()
  ...

8 years agoMerge tag 'pm+acpi-4.6-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael...
Linus Torvalds [Fri, 25 Mar 2016 23:55:37 +0000 (16:55 -0700)]
Merge tag 'pm+acpi-4.6-rc1-3' of git://git./linux/kernel/git/rafael/linux-pm

Pull power management fixlet from Rafael Wysocki:
 "One of commits in my previous pull request changed the permissions of
  drivers/power/avs/rockchip-io-domain.c to executable by mistake"

* tag 'pm+acpi-4.6-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  Fix permissions of drivers/power/avs/rockchip-io-domain.c

8 years agoMerge tag 'please-pull-preadv2' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 25 Mar 2016 23:48:45 +0000 (16:48 -0700)]
Merge tag 'please-pull-preadv2' of git://git./linux/kernel/git/aegl/linux

Pull ia64 update from Tony Luck:
 "Wire up new system calls p{read,write}v2 for ia64"

* tag 'please-pull-preadv2' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux:
  [IA64] Enable preadv2 and pwritev2 syscalls for ia64

8 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Linus Torvalds [Fri, 25 Mar 2016 23:39:05 +0000 (16:39 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/dtor/input

Pull more input updates from Dmitry Torokhov:
 "Second round of updates for the input subsystem.

  The BYD PS/2 protocol driver now uses absolute reporting mode and
  should behave more like other touchpads; Synaptics driver needed to
  extend one of its quirks to a newer firmware version, and a few USB
  drivers got tightened up checks for the contents of their descriptors"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: sur40 - fix DMA on stack
  Input: ati_remote2 - fix crashes on detecting device with invalid descriptor
  Input: synaptics - handle spurious release of trackstick buttons, again
  Input: synaptics-rmi4 - remove check of Non-NULL array
  Input: byd - enable absolute mode
  Input: ims-pcu - sanity check against missing interfaces
  Input: melfas_mip4 - add hw_version sysfs attribute

8 years agothp: fix typo in khugepaged_scan_pmd()
Kirill A. Shutemov [Fri, 25 Mar 2016 21:22:20 +0000 (14:22 -0700)]
thp: fix typo in khugepaged_scan_pmd()

!PageLRU should lead to SCAN_PAGE_LRU, not SCAN_SCAN_ABORT result.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Ebru Akagunduz <ebru.akagunduz@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoMAINTAINERS: fill entries for KASAN
Andrey Ryabinin [Fri, 25 Mar 2016 21:22:17 +0000 (14:22 -0700)]
MAINTAINERS: fill entries for KASAN

Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Acked-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agomm/filemap: generic_file_read_iter(): check for zero reads unconditionally
Nicolai Stange [Fri, 25 Mar 2016 21:22:14 +0000 (14:22 -0700)]
mm/filemap: generic_file_read_iter(): check for zero reads unconditionally

If
 - generic_file_read_iter() gets called with a zero read length,
 - the read offset is at a page boundary,
 - IOCB_DIRECT is not set
-  and the page in question hasn't made it into the page cache yet,
then do_generic_file_read() will trigger a readahead with a req_size hint
of zero.

Since roundup_pow_of_two(0) is undefined, UBSAN reports

  UBSAN: Undefined behaviour in include/linux/log2.h:63:13
  shift exponent 64 is too large for 64-bit type 'long unsigned int'
  CPU: 3 PID: 1017 Comm: sa1 Tainted: G L 4.5.0-next-20160318+ #14
  [...]
  Call Trace:
   [...]
   [<ffffffff813ef61a>] ondemand_readahead+0x3aa/0x3d0
   [<ffffffff813ef61a>] ? ondemand_readahead+0x3aa/0x3d0
   [<ffffffff813c73bd>] ? find_get_entry+0x2d/0x210
   [<ffffffff813ef9c3>] page_cache_sync_readahead+0x63/0xa0
   [<ffffffff813cc04d>] do_generic_file_read+0x80d/0xf90
   [<ffffffff813cc955>] generic_file_read_iter+0x185/0x420
   [...]
   [<ffffffff81510b06>] __vfs_read+0x256/0x3d0
   [...]

when get_init_ra_size() gets called from ondemand_readahead().

The net effect is that the initial readahead size is arch dependent for
requested read lengths of zero: for example, since

  1UL << (sizeof(unsigned long) * 8)

evaluates to 1 on x86 while its result is 0 on ARMv7, the initial readahead
size becomes 4 on the former and 0 on the latter.

What's more, whether or not the file access timestamp is updated for zero
length reads is decided differently for the two cases of IOCB_DIRECT
being set or cleared: in the first case, generic_file_read_iter()
explicitly skips updating that timestamp while in the latter case, it is
always updated through the call to do_generic_file_read().

According to POSIX, zero length reads "do not modify the last data access
timestamp" and thus, the IOCB_DIRECT behaviour is POSIXly correct.

Let generic_file_read_iter() unconditionally check the requested read
length at its entry and return immediately with success if it is zero.

Signed-off-by: Nicolai Stange <nicstange@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agokasan: test fix: warn if the UAF could not be detected in kmalloc_uaf2
Alexander Potapenko [Fri, 25 Mar 2016 21:22:11 +0000 (14:22 -0700)]
kasan: test fix: warn if the UAF could not be detected in kmalloc_uaf2

Signed-off-by: Alexander Potapenko <glider@google.com>
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Andrey Konovalov <adech.fo@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Konstantin Serebryany <kcc@google.com>
Cc: Dmitry Chernenkov <dmitryc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agomm, kasan: stackdepot implementation. Enable stackdepot for SLAB
Alexander Potapenko [Fri, 25 Mar 2016 21:22:08 +0000 (14:22 -0700)]
mm, kasan: stackdepot implementation. Enable stackdepot for SLAB

Implement the stack depot and provide CONFIG_STACKDEPOT.  Stack depot
will allow KASAN store allocation/deallocation stack traces for memory
chunks.  The stack traces are stored in a hash table and referenced by
handles which reside in the kasan_alloc_meta and kasan_free_meta
structures in the allocated memory chunks.

IRQ stack traces are cut below the IRQ entry point to avoid unnecessary
duplication.

Right now stackdepot support is only enabled in SLAB allocator.  Once
KASAN features in SLAB are on par with those in SLUB we can switch SLUB
to stackdepot as well, thus removing the dependency on SLUB stack
bookkeeping, which wastes a lot of memory.

This patch is based on the "mm: kasan: stack depots" patch originally
prepared by Dmitry Chernenkov.

Joonsoo has said that he plans to reuse the stackdepot code for the
mm/page_owner.c debugging facility.

[akpm@linux-foundation.org: s/depot_stack_handle/depot_stack_handle_t]
[aryabinin@virtuozzo.com: comment style fixes]
Signed-off-by: Alexander Potapenko <glider@google.com>
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Andrey Konovalov <adech.fo@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Konstantin Serebryany <kcc@google.com>
Cc: Dmitry Chernenkov <dmitryc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoarch, ftrace: for KASAN put hard/soft IRQ entries into separate sections
Alexander Potapenko [Fri, 25 Mar 2016 21:22:05 +0000 (14:22 -0700)]
arch, ftrace: for KASAN put hard/soft IRQ entries into separate sections

KASAN needs to know whether the allocation happens in an IRQ handler.
This lets us strip everything below the IRQ entry point to reduce the
number of unique stack traces needed to be stored.

Move the definition of __irq_entry to <linux/interrupt.h> so that the
users don't need to pull in <linux/ftrace.h>.  Also introduce the
__softirq_entry macro which is similar to __irq_entry, but puts the
corresponding functions to the .softirqentry.text section.

Signed-off-by: Alexander Potapenko <glider@google.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Andrey Konovalov <adech.fo@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Konstantin Serebryany <kcc@google.com>
Cc: Dmitry Chernenkov <dmitryc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agomm, kasan: add GFP flags to KASAN API
Alexander Potapenko [Fri, 25 Mar 2016 21:22:02 +0000 (14:22 -0700)]
mm, kasan: add GFP flags to KASAN API

Add GFP flags to KASAN hooks for future patches to use.

This patch is based on the "mm: kasan: unified support for SLUB and SLAB
allocators" patch originally prepared by Dmitry Chernenkov.

Signed-off-by: Alexander Potapenko <glider@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Andrey Konovalov <adech.fo@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Konstantin Serebryany <kcc@google.com>
Cc: Dmitry Chernenkov <dmitryc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agomm, kasan: SLAB support
Alexander Potapenko [Fri, 25 Mar 2016 21:21:59 +0000 (14:21 -0700)]
mm, kasan: SLAB support

Add KASAN hooks to SLAB allocator.

This patch is based on the "mm: kasan: unified support for SLUB and SLAB
allocators" patch originally prepared by Dmitry Chernenkov.

Signed-off-by: Alexander Potapenko <glider@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Andrey Konovalov <adech.fo@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Konstantin Serebryany <kcc@google.com>
Cc: Dmitry Chernenkov <dmitryc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agokasan: modify kmalloc_large_oob_right(), add kmalloc_pagealloc_oob_right()
Alexander Potapenko [Fri, 25 Mar 2016 21:21:56 +0000 (14:21 -0700)]
kasan: modify kmalloc_large_oob_right(), add kmalloc_pagealloc_oob_right()

This patchset implements SLAB support for KASAN

Unlike SLUB, SLAB doesn't store allocation/deallocation stacks for heap
objects, therefore we reimplement this feature in mm/kasan/stackdepot.c.
The intention is to ultimately switch SLUB to use this implementation as
well, which will save a lot of memory (right now SLUB bloats each object
by 256 bytes to store the allocation/deallocation stacks).

Also neither SLUB nor SLAB delay the reuse of freed memory chunks, which
is necessary for better detection of use-after-free errors.  We
introduce memory quarantine (mm/kasan/quarantine.c), which allows
delayed reuse of deallocated memory.

This patch (of 7):

Rename kmalloc_large_oob_right() to kmalloc_pagealloc_oob_right(), as
the test only checks the page allocator functionality.  Also reimplement
kmalloc_large_oob_right() so that the test allocates a large enough
chunk of memory that still does not trigger the page allocator fallback.

Signed-off-by: Alexander Potapenko <glider@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Andrey Konovalov <adech.fo@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Konstantin Serebryany <kcc@google.com>
Cc: Dmitry Chernenkov <dmitryc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoinclude/linux/oom.h: remove undefined oom_kills_count()/note_oom_kill()
Tetsuo Handa [Fri, 25 Mar 2016 21:21:53 +0000 (14:21 -0700)]
include/linux/oom.h: remove undefined oom_kills_count()/note_oom_kill()

A leftover from commit c32b3cbe0d06 ("oom, PM: make OOM detection in the
freezer path raceless").

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agomm/page_alloc: prevent merging between isolated and other pageblocks
Vlastimil Babka [Fri, 25 Mar 2016 21:21:50 +0000 (14:21 -0700)]
mm/page_alloc: prevent merging between isolated and other pageblocks

Hanjun Guo has reported that a CMA stress test causes broken accounting of
CMA and free pages:

> Before the test, I got:
> -bash-4.3# cat /proc/meminfo | grep Cma
> CmaTotal:         204800 kB
> CmaFree:          195044 kB
>
>
> After running the test:
> -bash-4.3# cat /proc/meminfo | grep Cma
> CmaTotal:         204800 kB
> CmaFree:         6602584 kB
>
> So the freed CMA memory is more than total..
>
> Also the the MemFree is more than mem total:
>
> -bash-4.3# cat /proc/meminfo
> MemTotal:       16342016 kB
> MemFree:        22367268 kB
> MemAvailable:   22370528 kB

Laura Abbott has confirmed the issue and suspected the freepage accounting
rewrite around 3.18/4.0 by Joonsoo Kim.  Joonsoo had a theory that this is
caused by unexpected merging between MIGRATE_ISOLATE and MIGRATE_CMA
pageblocks:

> CMA isolates MAX_ORDER aligned blocks, but, during the process,
> partialy isolated block exists. If MAX_ORDER is 11 and
> pageblock_order is 9, two pageblocks make up MAX_ORDER
> aligned block and I can think following scenario because pageblock
> (un)isolation would be done one by one.
>
> (each character means one pageblock. 'C', 'I' means MIGRATE_CMA,
> MIGRATE_ISOLATE, respectively.
>
> CC -> IC -> II (Isolation)
> II -> CI -> CC (Un-isolation)
>
> If some pages are freed at this intermediate state such as IC or CI,
> that page could be merged to the other page that is resident on
> different type of pageblock and it will cause wrong freepage count.

This was supposed to be prevented by CMA operating on MAX_ORDER blocks,
but since it doesn't hold the zone->lock between pageblocks, a race
window does exist.

It's also likely that unexpected merging can occur between
MIGRATE_ISOLATE and non-CMA pageblocks.  This should be prevented in
__free_one_page() since commit 3c605096d315 ("mm/page_alloc: restrict
max order of merging on isolated pageblock").  However, we only check
the migratetype of the pageblock where buddy merging has been initiated,
not the migratetype of the buddy pageblock (or group of pageblocks)
which can be MIGRATE_ISOLATE.

Joonsoo has suggested checking for buddy migratetype as part of
page_is_buddy(), but that would add extra checks in allocator hotpath
and bloat-o-meter has shown significant code bloat (the function is
inline).

This patch reduces the bloat at some expense of more complicated code.
The buddy-merging while-loop in __free_one_page() is initially bounded
to pageblock_border and without any migratetype checks.  The checks are
placed outside, bumping the max_order if merging is allowed, and
returning to the while-loop with a statement which can't be possibly
considered harmful.

This fixes the accounting bug and also removes the arguably weird state
in the original commit 3c605096d315 where buddies could be left
unmerged.

Fixes: 3c605096d315 ("mm/page_alloc: restrict max order of merging on isolated pageblock")
Link: https://lkml.org/lkml/2016/3/2/280
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reported-by: Hanjun Guo <guohanjun@huawei.com>
Tested-by: Hanjun Guo <guohanjun@huawei.com>
Acked-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Debugged-by: Laura Abbott <labbott@redhat.com>
Debugged-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: <stable@vger.kernel.org> [3.18+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agodrivers/memstick/host/r592.c: avoid gcc-6 warning
Arnd Bergmann [Fri, 25 Mar 2016 21:21:47 +0000 (14:21 -0700)]
drivers/memstick/host/r592.c: avoid gcc-6 warning

The r592 driver relies on behavior of the DMA mapping API that is
normally observed but not guaranteed by the API.  Instead it uses a
runtime check to fail transfers if the API ever behaves

When CONFIG_NEED_SG_DMA_LENGTH is not set, one of the checks turns into a
comparison of a variable with itself, which gcc-6.0 now warns about:

drivers/memstick/host/r592.c: In function 'r592_transfer_fifo_dma':
drivers/memstick/host/r592.c:302:31: error: self-comparison always evaluates to false [-Werror=tautological-compare]
    (sg_dma_len(&dev->req->sg) < dev->req->sg.length)) {
                               ^

The check itself is not a problem, so this patch just rephrases the
condition in a way that gcc does not consider an indication of a mistake.
We already know that dev->req->sg.length was initially R592_LFIFO_SIZE, so
we can compare it to that constant again.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Maxim Levitsky <maximlevitsky@gmail.com>
Cc: Quentin Lambert <lambert.quentin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoocfs2: extend enough credits for freeing one truncate record while replaying truncate...
Xue jiufei [Fri, 25 Mar 2016 21:21:44 +0000 (14:21 -0700)]
ocfs2: extend enough credits for freeing one truncate record while replaying truncate records

Now function ocfs2_replay_truncate_records() first modifies tl_used,
then calls ocfs2_extend_trans() to extend transactions for gd and alloc
inode used for freeing clusters.  jbd2_journal_restart() may be called
and it may happen that tl_used in truncate log is decreased but the
clusters are not freed, which means these clusters are lost.  So we
should avoid extending transactions in these two operations.

Signed-off-by: joyce.xue <xuejiufei@huawei.com>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Acked-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoocfs2: extend transaction for ocfs2_remove_rightmost_path() and ocfs2_update_edge_len...
Xue jiufei [Fri, 25 Mar 2016 21:21:41 +0000 (14:21 -0700)]
ocfs2: extend transaction for ocfs2_remove_rightmost_path() and ocfs2_update_edge_lengths() before to avoid inconsistency between inode and et

I found that jbd2_journal_restart() is called in some places without
keeping things consistently before.  However, jbd2_journal_restart() may
commit the handle's transaction and restart another one.  If the first
transaction is committed successfully while another not, it may cause
filesystem inconsistency or read only.  This is an effort to fix this
kind of problems.

This patch (of 3):

The following functions will be called while truncating an extent:
ocfs2_remove_btree_range
  -> ocfs2_start_trans
  -> ocfs2_remove_extent
     -> ocfs2_truncate_rec
       -> ocfs2_extend_rotate_transaction
         -> jbd2_journal_restart if jbd2_journal_extend fail
       -> ocfs2_rotate_tree_left
         -> ocfs2_remove_rightmost_path
             -> ocfs2_extend_rotate_transaction
               -> ocfs2_unlink_subtree
                -> ocfs2_update_edge_lengths
                  -> ocfs2_extend_trans
                    -> jbd2_journal_restart if jbd2_journal_extend fail
  -> ocfs2_et_update_clusters
  -> ocfs2_commit_trans

jbd2_journal_restart() may be called and it may happened that the buffers
dirtied in ocfs2_truncate_rec() are committed while buffers dirtied in
ocfs2_et_update_clusters() are not, the total clusters on extent tree and
i_clusters in ocfs2_dinode is inconsistency.  So the clusters got from
ocfs2_dinode is incorrect, and it also cause read-only problem when call
ocfs2_commit_truncate() with the error message: "Inode %llu has empty
extent block at %llu".

We should extend enough credits for function ocfs2_remove_rightmost_path
and ocfs2_update_edge_lengths to avoid this inconsistency.

Signed-off-by: joyce.xue <xuejiufei@huawei.com>
Acked-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoocfs2/dlm: move lock to the tail of grant queue while doing in-place convert
xuejiufei [Fri, 25 Mar 2016 21:21:38 +0000 (14:21 -0700)]
ocfs2/dlm: move lock to the tail of grant queue while doing in-place convert

We have found a bug when two nodes doing umount one after another.

1) Node 1 migrate a lockres that has 3 locks in grant queue such as
   N2(PR)<->N3(NL)<->N4(PR) to N2.  After migration, lvb of the lock
   N3(NL) and N4(PR) are empty on node 2 because migration target do not
   copy lvb to these two lock.

2) Node 3 want to convert to PR, it can be granted in
   __dlmconvert_master(), and the order of these locks is unchanged.  The
   lvb of the lock N3(PR) on node 2 is copyed from lockres in function
   dlm_update_lvb() while the lvb of lock N4(PR) is still empty.

3) Node 2 want to leave domain, it will migrate this lockres to node 3.
   Then node 2 will trigger the BUG in dlm_prepare_lvb_for_migration()
   when adding the lock N4(PR) to mres with the following message because
   the lvb of mres is already copied from lock N3(PR), but the lvb of lock
   N4(PR) is empty.

"Mismatched lvb in lock cookie=%u:%llu, name=%.*s, node=%u"

[akpm@linux-foundation.org: tweak comment]
Signed-off-by: xuejiufei <xuejiufei@huawei.com>
Acked-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoocfs2: solve a problem of crossing the boundary in updating backups
jiangyiwen [Fri, 25 Mar 2016 21:21:35 +0000 (14:21 -0700)]
ocfs2: solve a problem of crossing the boundary in updating backups

In update_backups() there exists a problem of crossing the boundary as
follows:

we assume that lun will be resized to 1TB(cluster_size is 32kb), it will
include 0~33554431 cluster, in update_backups func, it will backup super
block in location of 1TB which is the 33554432th cluster, so the
phenomenon of crossing the boundary happens.

Signed-off-by: Yiwen Jiang <jiangyiwen@huawei.com>
Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Xue jiufei <xuejiufei@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoocfs2: fix occurring deadlock by changing ocfs2_wq from global to local
jiangyiwen [Fri, 25 Mar 2016 21:21:32 +0000 (14:21 -0700)]
ocfs2: fix occurring deadlock by changing ocfs2_wq from global to local

This patch fixes a deadlock, as follows:

  Node 1                Node 2                  Node 3
1)volume a and b are    only mount vol a        only mount vol b
  mounted

2)                      start to mount b        start to mount a

3)                      check hb of Node 3      check hb of Node 2
                        in vol a, qs_holds++    in vol b, qs_holds++

4) -------------------- all nodes' network down --------------------

5)                      progress of mount b     the same situation as
                        failed, and then call   Node 2
                        ocfs2_dismount_volume.
                        but the process is hung,
                        since there is a work
                        in ocfs2_wq cannot beo
                        completed. This work is
                        about vol a, because
                        ocfs2_wq is global wq.
                        BTW, this work which is
                        scheduled in ocfs2_wq is
                        ocfs2_orphan_scan_work,
                        and the context in this work
                        needs to take inode lock
                        of orphan_dir, because
                        lockres owner are Node 1 and
                        all nodes' nework has been down
                        at the same time, so it can't
                        get the inode lock.

6)                      Why can't this node be fenced
                        when network disconnected?
                        Because the process of
                        mount is hung what caused qs_holds
                        is not equal 0.

Because all works in the ocfs2_wq are relative to the super block.

The solution is to change the ocfs2_wq from global to local.  In other
words, move it into struct ocfs2_super.

Signed-off-by: Yiwen Jiang <jiangyiwen@huawei.com>
Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Xue jiufei <xuejiufei@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Cc: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoocfs2/dlm: fix BUG in dlm_move_lockres_to_recovery_list
Joseph Qi [Fri, 25 Mar 2016 21:21:29 +0000 (14:21 -0700)]
ocfs2/dlm: fix BUG in dlm_move_lockres_to_recovery_list

When master handles convert request, it queues ast first and then
returns status.  This may happen that the ast is sent before the request
status because the above two messages are sent by two threads.  And
right after the ast is sent, if master down, it may trigger BUG in
dlm_move_lockres_to_recovery_list in the requested node because ast
handler moves it to grant list without clear lock->convert_pending.  So
remove BUG_ON statement and check if the ast is processed in
dlmconvert_remote.

Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Reported-by: Yiwen Jiang <jiangyiwen@huawei.com>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Tariq Saeed <tariq.x.saeed@oracle.com>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoocfs2/dlm: fix race between convert and recovery
Joseph Qi [Fri, 25 Mar 2016 21:21:26 +0000 (14:21 -0700)]
ocfs2/dlm: fix race between convert and recovery

There is a race window between dlmconvert_remote and
dlm_move_lockres_to_recovery_list, which will cause a lock with
OCFS2_LOCK_BUSY in grant list, thus system hangs.

dlmconvert_remote
{
        spin_lock(&res->spinlock);
        list_move_tail(&lock->list, &res->converting);
        lock->convert_pending = 1;
        spin_unlock(&res->spinlock);

        status = dlm_send_remote_convert_request();
        >>>>>> race window, master has queued ast and return DLM_NORMAL,
               and then down before sending ast.
               this node detects master down and calls
               dlm_move_lockres_to_recovery_list, which will revert the
               lock to grant list.
               Then OCFS2_LOCK_BUSY won't be cleared as new master won't
               send ast any more because it thinks already be authorized.

        spin_lock(&res->spinlock);
        lock->convert_pending = 0;
        if (status != DLM_NORMAL)
                dlm_revert_pending_convert(res, lock);
        spin_unlock(&res->spinlock);
}

In this case, check if res->state has DLM_LOCK_RES_RECOVERING bit set
(res is still in recovering) or res master changed (new master has
finished recovery), reset the status to DLM_RECOVERING, then it will
retry convert.

Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Reported-by: Yiwen Jiang <jiangyiwen@huawei.com>
Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Tariq Saeed <tariq.x.saeed@oracle.com>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoocfs2: fix a deadlock issue in ocfs2_dio_end_io_write()
Ryan Ding [Fri, 25 Mar 2016 21:21:23 +0000 (14:21 -0700)]
ocfs2: fix a deadlock issue in ocfs2_dio_end_io_write()

The code should call ocfs2_free_alloc_context() to free meta_ac &
data_ac before calling ocfs2_run_deallocs().  Because
ocfs2_run_deallocs() will acquire the system inode's i_mutex hold by
meta_ac.  So try to release the lock before ocfs2_run_deallocs().

Fixes: af1310367f41 ("ocfs2: fix sparse file & data ordering issue in direct io.")
Signed-off-by: Ryan Ding <ryan.ding@oracle.com>
Acked-by: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoocfs2: fix disk file size and memory file size mismatch
Ryan Ding [Fri, 25 Mar 2016 21:21:20 +0000 (14:21 -0700)]
ocfs2: fix disk file size and memory file size mismatch

When doing append direct write in an already allocated cluster, and fast
path in ocfs2_dio_get_block() is triggered, function
ocfs2_dio_end_io_write() will be skipped as there is no context
allocated.

As a result, the disk file size will not be changed as it should be.
The solution is to skip fast path when we are about to change file size.

Fixes: af1310367f41 ("ocfs2: fix sparse file & data ordering issue in direct io.")
Signed-off-by: Ryan Ding <ryan.ding@oracle.com>
Acked-by: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoocfs2: take ip_alloc_sem in ocfs2_dio_get_block & ocfs2_dio_end_io_write
Ryan Ding [Fri, 25 Mar 2016 21:21:18 +0000 (14:21 -0700)]
ocfs2: take ip_alloc_sem in ocfs2_dio_get_block & ocfs2_dio_end_io_write

Take ip_alloc_sem to prevent concurrent access to extent tree, which may
cause the extent tree in an unstable state.

Signed-off-by: Ryan Ding <ryan.ding@oracle.com>
Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoocfs2: fix ip_unaligned_aio deadlock with dio work queue
Ryan Ding [Fri, 25 Mar 2016 21:21:15 +0000 (14:21 -0700)]
ocfs2: fix ip_unaligned_aio deadlock with dio work queue

In the current implementation of unaligned aio+dio, lock order behave as
follow:

in user process context:
  -> call io_submit()
    -> get i_mutex
<== window1
      -> get ip_unaligned_aio
        -> submit direct io to block device
    -> release i_mutex
  -> io_submit() return

in dio work queue context(the work queue is created in __blockdev_direct_IO):
  -> release ip_unaligned_aio
<== window2
    -> get i_mutex
      -> clear unwritten flag & change i_size
    -> release i_mutex

There is a limitation to the thread number of dio work queue.  256 at
default.  If all 256 thread are in the above 'window2' stage, and there
is a user process in the 'window1' stage, the system will became
deadlock.  Since the user process hold i_mutex to wait ip_unaligned_aio
lock, while there is a direct bio hold ip_unaligned_aio mutex who is
waiting for a dio work queue thread to be schedule.  But all the dio
work queue thread is waiting for i_mutex lock in 'window2'.

This case only happened in a test which send a large number(more than
256) of aio at one io_submit() call.

My design is to remove ip_unaligned_aio lock.  Change it to a sync io
instead.  Just like ip_unaligned_aio lock, serialize the unaligned aio
dio.

[akpm@linux-foundation.org: remove OCFS2_IOCB_UNALIGNED_IO, per Junxiao Bi]
Signed-off-by: Ryan Ding <ryan.ding@oracle.com>
Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoocfs2: code clean up for direct io
Ryan Ding [Fri, 25 Mar 2016 21:21:12 +0000 (14:21 -0700)]
ocfs2: code clean up for direct io

Clean up ocfs2_file_write_iter & ocfs2_prepare_inode_for_write:
 * remove append dio check: it will be checked in ocfs2_direct_IO()
 * remove file hole check: file hole is supported for now
 * remove inline data check: it will be checked in ocfs2_direct_IO()
 * remove the full_coherence check when append dio: we will get the
   inode_lock in ocfs2_dio_get_block, there is no need to fall back to
   buffer io to ensure the coherence semantics.

Now the drop dio procedure is gone.  :)

[akpm@linux-foundation.org: remove unused label]
Signed-off-by: Ryan Ding <ryan.ding@oracle.com>
Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoocfs2: fix sparse file & data ordering issue in direct io
Ryan Ding [Fri, 25 Mar 2016 21:21:09 +0000 (14:21 -0700)]
ocfs2: fix sparse file & data ordering issue in direct io

There are mainly three issues in the direct io code path after commit
24c40b329e03 ("ocfs2: implement ocfs2_direct_IO_write"):

  * Does not support sparse file.
  * Does not support data ordering.  eg: when write to a file hole, it
    will alloc extent first.  If system crashed before io finished, data
    will corrupt.
  * Potential risk when doing aio+dio.  The -EIOCBQUEUED return value is
    likely to be ignored by ocfs2_direct_IO_write().

To resolve above problems, re-design direct io code with following ideas:
  * Use buffer io to fill in holes.  And this will make better
    performance also.
  * Clear unwritten after direct write finished.  So we can make sure
    meta data changes after data write to disk.  (Unwritten extent is
    invisible to user, from user's view, meta data is not changed when
    allocate an unwritten extent.)
  * Clear ocfs2_direct_IO_write().  Do all ending work in end_io.

This patch has passed fs,dio,ltp-aiodio.part1,ltp-aiodio.part2,ltp-aiodio.part4
test cases of ltp.

For performance improvement, see following test result:
ocfs2 cluster size 1MB, ocfs2 volume is mounted on /mnt/.
The original way:
  + rm /mnt/test.img -f
  + dd if=/dev/zero of=/mnt/test.img bs=4K count=1048576 oflag=direct
  1048576+0 records in
  1048576+0 records out
  4294967296 bytes (4.3 GB) copied, 1707.83 s, 2.5 MB/s
  + rm /mnt/test.img -f
  + dd if=/dev/zero of=/mnt/test.img bs=256K count=16384 oflag=direct
  16384+0 records in
  16384+0 records out
  4294967296 bytes (4.3 GB) copied, 582.705 s, 7.4 MB/s

After this patch:
  + rm /mnt/test.img -f
  + dd if=/dev/zero of=/mnt/test.img bs=4K count=1048576 oflag=direct
  1048576+0 records in
  1048576+0 records out
  4294967296 bytes (4.3 GB) copied, 64.6412 s, 66.4 MB/s
  + rm /mnt/test.img -f
  + dd if=/dev/zero of=/mnt/test.img bs=256K count=16384 oflag=direct
  16384+0 records in
  16384+0 records out
  4294967296 bytes (4.3 GB) copied, 34.7611 s, 124 MB/s

Signed-off-by: Ryan Ding <ryan.ding@oracle.com>
Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoocfs2: record UNWRITTEN extents when populate write desc
Ryan Ding [Fri, 25 Mar 2016 21:21:06 +0000 (14:21 -0700)]
ocfs2: record UNWRITTEN extents when populate write desc

To support direct io in ocfs2_write_begin_nolock & ocfs2_write_end_nolock.

There is still one issue in the direct write procedure.

phase 1: alloc extent with UNWRITTEN flag
phase 2: submit direct data to disk, add zero page to page cache
phase 3: clear UNWRITTEN flag when data has been written to disk

When there are 2 direct write A(0~3KB),B(4~7KB) writing to the same
cluster 0~7KB (cluster size 8KB).  Write request A arrive phase 2 first,
it will zero the region (4~7KB).  Before request A enter to phase 3,
request B arrive phase 2, it will zero region (0~3KB).  This is just like
request B steps request A.

To resolve this issue, we should let request B knows this cluster is already
under zero, to prevent it from steps the previous write request.

This patch will add function ocfs2_unwritten_check() to do this job.  It
will record all clusters that are under direct write(it will be recorded
in the 'ip_unwritten_list' member of inode info), and prevent the later
direct write writing to the same cluster to do the zero work again.

Signed-off-by: Ryan Ding <ryan.ding@oracle.com>
Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoocfs2: return the physical address in ocfs2_write_cluster
Ryan Ding [Fri, 25 Mar 2016 21:21:03 +0000 (14:21 -0700)]
ocfs2: return the physical address in ocfs2_write_cluster

To support direct io in ocfs2_write_begin_nolock & ocfs2_write_end_nolock.

Direct io needs to get the physical address from write_begin, to map the
user page.  This patch is to change the arg 'phys' of
ocfs2_write_cluster to a pointer, so it can be retrieved to write_begin.
And we can retrieve it to the direct io procedure.

Signed-off-by: Ryan Ding <ryan.ding@oracle.com>
Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoocfs2: do not change i_size in write_end for direct io
Ryan Ding [Fri, 25 Mar 2016 21:21:01 +0000 (14:21 -0700)]
ocfs2: do not change i_size in write_end for direct io

To support direct io in ocfs2_write_begin_nolock & ocfs2_write_end_nolock.

Append direct io do not change i_size in get block phase.  It only move
to orphan when starting write.  After data is written to disk, it will
delete itself from orphan and update i_size.  So skip i_size change
section in write_begin for direct io.

And when there is no extents alloc, no meta data changes needed for
direct io (since write_begin start trans for 2 reason: alloc extents &
change i_size.  Now none of them needed).  So we can skip start trans
procedure.

Signed-off-by: Ryan Ding <ryan.ding@oracle.com>
Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoocfs2: test target page before change it
Ryan Ding [Fri, 25 Mar 2016 21:20:58 +0000 (14:20 -0700)]
ocfs2: test target page before change it

To support direct io in ocfs2_write_begin_nolock & ocfs2_write_end_nolock.

Direct io data will not appear in buffer.  The w_target_page member will
not be filled by direct io.  So avoid to use it when it's NULL.  Unlinke
buffer io and mmap, direct io will call write_begin with more than 1
page a time.  So the target_index is not sufficient to describe the
actual data.  change it to a range start at target_index, end in
end_index.

Signed-off-by: Ryan Ding <ryan.ding@oracle.com>
Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>