git.openwrt.org Git - openwrt/staging/blogic.git/log

mmc: sdhci-esdhc-imx: change pinctrl state according to uhs mode

Without proper pinctrl state, the card may not be able to work
on high speed stablely. e.g. SDR104.

This patch add pinctrl state switch code according to different
uhs mode include 100mhz sate, 200mhz sate and normal state
(50Mhz and below).

Signed-off-by: Dong Aisheng <b29396@freescale.com>
Acked-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Chris Ball <cjb@laptop.org>

mmc: sdhci-esdhc-imx: add sd3.0 SDR clock tuning support

Freescale i.MX6Q/DL uSDHC clock tuning progress is a little different from
the standard tuning process defined in host controller spec v3.0.
Thus we use platform_execute_tuning instead of standard sdhci tuning.

The main difference are:
1) not only generate Buffer Read Ready interrupt when tuning is performing.
   It generates all other DATA interrupts like the normal data command.
2) SDHCI_CTRL_EXEC_TUNING is not automatically cleared by HW,
   instead it's controlled by SW.
3) SDHCI_CTRL_TUNED_CLK is not automatically set by HW,
   it's controlled by SW.
4) the clock delay for every tuning is set by SW.

Signed-off-by: Dong Aisheng <b29396@freescale.com>
Acked-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Chris Ball <cjb@laptop.org>

mmc: sdhci-esdhc-imx: support real clock on and off for imx6q

The signal voltage switch flow requires to shutdown and output
clock in a specific sequence according to standard host controller
v3.0 spec. In that timing, the card must really receive clock or not.

However, for i.MX6Q, the uSDHC will not output clock even the clock
is enabled until there is command or data in transfer on the bus,
which will then cause singal voltage switch always to fail.

For i.MX6Q, we clear ESDHC_VENDOR_SPEC_FRC_SDCLK_ON bit to let
controller to gate off clock automatically and set that bit
to force clock output if clock is on.

This is required by SD3.0 support.

Signed-off-by: Dong Aisheng <b29396@freescale.com>
Acked-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Chris Ball <cjb@laptop.org>

mmc: sdhci-esdhc: move common esdhc_set_clock to platform driver

We need a lot of imx6 specific things into common esdhc_set_clock
for support SD3.0 and eMMC DDR mode which is not needed for power pc
platforms, so esdhc_set_clock seems not so common anymore.

Instead of keeping add platform specfics things into this common API,
we choose to move that code into platform driver itself to handle.
This can also exclude the dependency between imx and power pc on this
headfile and is easy for maintain in the future.

Signed-off-by: Dong Aisheng <b29396@freescale.com>
Acked-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Chris Ball <cjb@laptop.org>

mmc: sdhci: allow platform access of sdhci_send_command

It helps for platform code to use it send tuning commands.

Signed-off-by: Dong Aisheng <b29396@freescale.com>
Acked-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Chris Ball <cjb@laptop.org>

mmc: sdhci: add hooks for platform specific tuning

The tuning of some platforms may not follow the standard host control
spec v3.0, e.g. Freescale uSDHC on i.MX6Q/DL.
Add a hook here to allow execute platform specific tuning instead of
standard host controller tuning.

The hook only replaces the tuning process, so it's placed after tuning
checking and before the real tuning process.

Some notes for the tuning hook:
1) it needs handle lock itself if it wants to access host controller
according platform specific implementation.
2) do not need to handle runtime pm since it executes with runtime pm
get already.

Signed-off-by: Dong Aisheng <b29396@freescale.com>
Acked-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Chris Ball <cjb@laptop.org>

mmc: sdhci-bcm2835: Use sdhci_pltfm_unregister instead of open coded

This avoid duplicated implementation.

Signed-off-by: Axel Lin <axel.lin@ingics.com>
Acked-by: Stephen Warren <swarren@wwwdotorg.org>
Signed-off-by: Chris Ball <cjb@laptop.org>

mmc: sdhci-bcm-kona: Use sdhci_pltfm_unregister instead of open coded

This avoid duplicated implementation and also fixes missing iounmap() and
release_mem_region() calls in sdhci_bcm_kona_remove(). sdhci_pltfm_init()
calls request_mem_region() and ioremap(), thus we need to call the
corresponding iounmap() and release_mem_region() calls in
sdhci_bcm_kona_remove().

Signed-off-by: Axel Lin <axel.lin@ingics.com>
Signed-off-by: Chris Ball <cjb@laptop.org>

mmc: dw_mmc-socfpga: Staticize dw_mci_socfpga_probe

'dw_mci_socfpga_probe' is used only in this file. Make it static.

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Acked-by: Jaehoon Chung <jh80.chung@samsung.com>
Signed-off-by: Chris Ball <cjb@laptop.org>

mmc: dw_mmc-socfpga: Remove redundant of_match_ptr

'dw_mci_socfpga_match' is always compiled in.
Hence of_match_ptr is not necessary.

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Acked-by: Jaehoon Chung <jh80.chung@samsung.com>
Signed-off-by: Chris Ball <cjb@laptop.org>

mmc: mvsdio: Convert to devm_ioremap_resource

devm_request_and_ioremap() is deprecated. Use devm_ioremap_resource()
instead.

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Signed-off-by: Chris Ball <cjb@laptop.org>

mmc: core: clean up duplicate macros

Clean up the duplicate macros:
mmc_sd_card_uhs -> mmc_card_uhs
mmc_sd_card_set_uhs -> mmc_card_set_uhs

Signed-off-by: Jackey Shen <jackey.shen@amd.com>
Signed-off-by: Chris Ball <cjb@laptop.org>

mmc: atmel-mci: fix oops in atmci_tasklet_func

In some cases, a NULL pointer dereference happens because data is NULL when
STATE_END_REQUEST case is reached in atmci_tasklet_func.

Cc: <stable@vger.kernel.org> # 3.9+
Signed-off-by: Rodolfo Giometti <giometti@enneenne.com>
Acked-by: Ludovic Desroches <ludovic.desroches@atmel.com>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Chris Ball <cjb@laptop.org>

mmc: atmel-mci: abort transfer on timeout error

When a software timeout occurs, the transfer is not stopped. In DMA case,
it causes DMA channel to be stuck because the transfer is still active
causing following transfers to be queued but not computed.

Cc: <stable@vger.kernel.org> # 3.9+
Signed-off-by: Ludovic Desroches <ludovic.desroches@atmel.com>
Reported-by: Alexander Morozov <etesial@gmail.com>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Chris Ball <cjb@laptop.org>

mmc: add ignorance case for CMD13 CRC error

While speed mode is changed, CMD13 cannot be guaranteed.
According to the spec., it is not recommended to use CMD13
to check the busy completion of the timing change.
If CMD13 is used in this case, CRC error must be ignored.

Signed-off-by: Seungwon Jeon <tgih.jun@samsung.com>
Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Chris Ball <cjb@laptop.org>

mmc: dw_mmc: fix the transfer termination in IDMAC mode

In IDMAC mode EVENT_XFER_COMPLETE is set when RI/TI of last descriptor
is done. So if errors are happened in the middle of data transfers,
'dw_mci_stop_dma' during error handing can be called and eventually
prevents this flag to be set. This results in permanent wait for
EVENT_XFER_COMPLETE in 'dw_mci_tasklet_func'. Therefore, if dma
running is stopped forcibly, EVENT_XFER_COMPLETE should be set.

Reported-by: Bing Zhao <bzhao@marvell.com>
Signed-off-by: Doug Anderson <dianders@chromium.org>
Signed-off-by: Seungwon Jeon <tgih.jun@samsung.com>
Tested-by: Alim Akhtar <alim.akhtar@samsung.com>
Acked-by: Jaehoon Chung <jh80.chung@samsung.com>
Signed-off-by: Chris Ball <cjb@laptop.org>

mmc: dw_mmc: amend use of idmac sw reset

First, compiling warning along with previous change is removed.
[drivers/mmc/host/dw_mmc.c:1890:7: warning: unused variable 'ctrl']
And with the recommendation in manual, IDMAC software reset is followed
by dma-reset of the CTRL register in order to terminate the transfer.

Signed-off-by: Seungwon Jeon <tgih.jun@samsung.com>
Tested-by: Alim Akhtar <alim.akhtar@samsung.com>
Signed-off-by: Chris Ball <cjb@laptop.org>

mmc: dw_mmc: gather each reset code into functions

There are three resets in CTRL register. FIFO reset is especially used
in several points with the same routine. It could be replaced with one
function and the others may be applied similarly if needed. So,
mci_wait_reset() is modified to allow various bit field of reset.

Signed-off-by: Seungwon Jeon <tgih.jun@samsung.com>
Tested-by: Alim Akhtar <alim.akhtar@samsung.com>
Signed-off-by: Chris Ball <cjb@laptop.org>

mmc: dw_mmc: rework the code related to cmd/data completion

Main change corresponds to dw_mci_command_complete(). And EBE is
divided into read and write. Some minor changes for code readability.

Signed-off-by: Seungwon Jeon <tgih.jun@samsung.com>
Tested-by: Alim Akhtar <alim.akhtar@samsung.com>
Signed-off-by: Chris Ball <cjb@laptop.org>

mmc: dw_mmc: guarantee stop-abort cmd in data errors

In error cases, DTO interrupt may or may not be generated depending
on remained data. Stop/Abort command ensures DTO generation for that
situation. Currently if 'stop' field of data is empty, there is no
stop/abort command. So, it could hang waiting DTO. This change
reinforces these cases.

Signed-off-by: Seungwon Jeon <tgih.jun@samsung.com>
Tested-by: Alim Akhtar <alim.akhtar@samsung.com>
Signed-off-by: Chris Ball <cjb@laptop.org>

mmc: dw_mmc: fix error handling on response error

Even if response error is detected in case data command, data transfer
is continued. It means that data can live in FIFO. Current handling
just breaks out the request when seeing the command error. This causes
kernel panic in dw_mci_read_data_pio() [host->data = NULL]. And also,
FIFO should be guaranteed to be empty.

Unable to handle kernel NULL pointer dereference at virtual address 00000018
<...>
[<c02af814>] (dw_mci_read_data_pio+0x68/0x198) from [<c02b04b4>] (dw_mci_interrupt+0x374/0x3a0)
[<c02b04b4>] (dw_mci_interrupt+0x374/0x3a0) from [<c006b094>] (handle_irq_event_percpu+0x50/0x194)
[<c006b094>] (handle_irq_event_percpu+0x50/0x194) from [<c006b214>] (handle_irq_event+0x3c/0x5c)
[<c006b214>] (handle_irq_event+0x3c/0x5c) from [<c006de1c>] (handle_fasteoi_irq+0xa4/0x148)
[<c006de1c>] (handle_fasteoi_irq+0xa4/0x148) from [<c006aa88>] (generic_handle_irq+0x20/0x30)
[<c006aa88>] (generic_handle_irq+0x20/0x30) from [<c000f154>] (handle_IRQ+0x38/0x90)
[<c000f154>] (handle_IRQ+0x38/0x90) from [<c00085bc>] (gic_handle_irq+0x34/0x68)
[<c00085bc>] (gic_handle_irq+0x34/0x68) from [<c0011f40>] (__irq_svc+0x40/0x70)
Exception stack(0xef0b1c00 to 0xef0b1c48)
1c00: 000eb0cf ffffffff 00001300 c01a7738 ef295e10 0000000a c04df298 ef0b1dc0
1c20: ef295ec0 00000000 00000000 00000006 00000000 ef0b1c48 c02b1274 c01a7764
1c40: 20000113 ffffffff
[<c0011f40>] (__irq_svc+0x40/0x70) from [<c01a7764>] (__loop_delay+0x0/0xc)
Code: e1a00005 e0891006 e0662004 e12fff33 (e59a3018)
---[ end trace a7043b9ba9aed1db ]---
Kernel panic - not syncing: Fatal exception in interrupt

Signed-off-by: Seungwon Jeon <tgih.jun@samsung.com>
Tested-by: Alim Akhtar <alim.akhtar@samsung.com>
Signed-off-by: Chris Ball <cjb@laptop.org>

mmc: dw_mmc: control card read threshold

Card Read Threshold should be ensured that the card clock does not stop
in the middle of a block of data being transferred from the card to the
Host. Specially, clock stop is allowed in fast transfer such as HS200
or SDR104 mode. And so, it should be enabled.

Signed-off-by: Seungwon Jeon <tgih.jun@samsung.com>
Tested-by: Alim Akhtar <alim.akhtar@samsung.com>
Signed-off-by: Chris Ball <cjb@laptop.org>

mmc: dw_mmc: adjust the fifoth with block size

This change helps to choose msize, rx_watermark and tx_watermark
depending on block size for IDMAC mode. For SDIO block size can be
variable, so if these values are set incorrectly, card clock may stop.

Signed-off-by: Seungwon Jeon <tgih.jun@samsung.com>
Tested-by: Alim Akhtar <alim.akhtar@samsung.com>
Signed-off-by: Chris Ball <cjb@laptop.org>

mmc: dw_mmc: set the supported max/min frequency

Both f_max and f_min will be informed for core layer to request
valid clock rate. But current setting from 'host->bus_hz' may
not represent the max/min frequency properly. Even if host can
actually support high speed than bus_hz, core layer will not
request clock rate over bus_hz. Basically, f_max/f_min can be set
with the values according to spec. And then host will make its best
effort to meet the rate.

Signed-off-by: Seungwon Jeon <tgih.jun@samsung.com>
Tested-by: Alim Akhtar <alim.akhtar@samsung.com>
Signed-off-by: Chris Ball <cjb@laptop.org>

mmc: dw_mmc: move supports-highspeed of quirks to caps

'supports-highspeed' is not one of the quirks but is a capability.
So, it's removed from quirks.

Signed-off-by: Seungwon Jeon <tgih.jun@samsung.com>
Tested-by: Alim Akhtar <alim.akhtar@samsung.com>
Signed-off-by: Chris Ball <cjb@laptop.org>

mmc: dw_mmc: add the capability to support hs200 mode

As host controller can support eMMC's HS200 mode at 1.8V or 1.2V,
these capability will be added.

Signed-off-by: Seungwon Jeon <tgih.jun@samsung.com>
Tested-by: Alim Akhtar <alim.akhtar@samsung.com>
Signed-off-by: Chris Ball <cjb@laptop.org>

mmc: dw_mmc: exynos: adjust the clock rate with speed mode

Exynos's host has divider logic before 'cclk_in' to controller core.
It means that actual clock rate of ciu clock comes from this divider
value. So, source clock should be adjusted along with 'ciu_div' which
indicates the host's divider ratio. Setting clock rate basically fits
the required speed. Specially, 'cclk_in' should have double rate of
target speed in case of DDR 8-bit mode.

Signed-off-by: Seungwon Jeon <tgih.jun@samsung.com>
Tested-by: Alim Akhtar <alim.akhtar@samsung.com>
Signed-off-by: Chris Ball <cjb@laptop.org>

mmc: dw_mmc: exynos: add variable delay tuning sequence

Implements variable delay tuning. In this change, exynos host can
determine the correct sampling point for the HS200 and SDR104 speed mode.

Signed-off-by: Seungwon Jeon <tgih.jun@samsung.com>
Tested-by: Alim Akhtar <alim.akhtar@samsung.com>
Signed-off-by: Chris Ball <cjb@laptop.org>

mmc: dw_mmc: add support tuning scheme

For the speed modes HS200 and SDR104, tuning is needed to determine the
correct sampling point. Actual tuning procedure is provided by specific
host controller driver. This patch defines the tuning command and
tuning data. Additionally, 'struct dw_mci_slot' is moved to header
file to consider the extensive usages in driver.

Signed-off-by: Seungwon Jeon <tgih.jun@samsung.com>
Tested-by: Alim Akhtar <alim.akhtar@samsung.com>
Signed-off-by: Chris Ball <cjb@laptop.org>

mmc: dw_mmc: exynos: configure SMU in exynos5420

Exynos5420 Mobile Storage Host controller has Security Management
Unit (SMU) for channel 0 and channel 1 (mainly for eMMC).
This time, SMU configuration is set for non-encryption mode.

Signed-off-by: Yuvaraj Kumar C D <yuvaraj.cd@samsung.com>
Signed-off-by: Alim Akhtar <alim.akhtar@samsung.com>
Tested-by: Jaehoon Chung <jh80.chung@samsung.com>
Acked-by: Seungwon Jeon <tgih.jun@samsung.com>
Signed-off-by: Chris Ball <cjb@laptop.org>

mmc: dw_mmc: move the platform specific init call

Current platform specific private data initialization call
dw_mci_exynos_priv_init() can be used to do platform specific
initialization of SMU and others in future. So the drv_data->init
call has moved to dw_mci_probe().

Signed-off-by: Yuvaraj Kumar C D <yuvaraj.cd@samsung.com>
Tested-by: Alim Akhtar <alim.akhtar@samsung.com>
Tested-by: Jaehoon Chung <jh80.chung@samsung.com>
Acked-by: Seungwon Jeon <tgih.jun@samsung.com>
Signed-off-by: Chris Ball <cjb@laptop.org>

mmc: dw_mmc: socfpga: move socfpga private init

Currently platform specific private data initialization is done by
dw_mci_socfpga_priv_init and dw_mci_socfpga_parse_dt. As we already have
separate platform specific device tree parser dw_mci_socfpga_parse_dt,
move the dw_mci_socfpga_priv_init code to dw_mci_socfpga_parse_dt.
We can use the dw_mci_socfpga_priv_init to do some actual platform
specific initialization.

Signed-off-by: Yuvaraj Kumar C D <yuvaraj.cd@samsung.com>
Tested-by: Alim Akhtar <alim.akhtar@samsung.com>
Tested-by: Jaehoon Chung <jh80.chung@samsung.com>
Acked-by: Seungwon Jeon <tgih.jun@samsung.com>
Signed-off-by: Chris Ball <cjb@laptop.org>

mmc: dw_mmc: exynos: move the exynos private init

Currently platform specific private data initialization is done by
dw_mci_exynos_priv_init and dw_mci_exynos_parse_dt. As we already have
separate platform specific device tree parser dw_mci_exynos_parse_dt,
move the dw_mci_exynos_priv_init code to dw_mci_exynos_parse_dt.
We can use the dw_mci_exynos_priv_init to do some actual platform
specific initialization of SMU and etc.

Signed-off-by: Yuvaraj Kumar C D <yuvaraj.cd@samsung.com>
Tested-by: Alim Akhtar <alim.akhtar@samsung.com>
Tested-by: Jaehoon Chung <jh80.chung@samsung.com>
Acked-by: Seungwon Jeon <tgih.jun@samsung.com>
Signed-off-by: Chris Ball <cjb@laptop.org>

mmc: dw_mmc: Set timeout to max upon resume

The TMOUT register is set to 0xffffffff at probe time but isn't
set after suspend/resume. Add an init of this value.

No problems were observed without this (it will also be set in
__dw_mci_start_request if there is data to send), but it makes the
register dump before and after suspend cleaner.

Signed-off-by: Doug Anderson <dianders@chromium.org>
Acked-by: Seungwon Jeon <tgih.jun@samsung.com>
Reviewed-by: Tomasz Figa <t.figa@samsung.com>
Signed-off-by: Chris Ball <cjb@laptop.org>

mmc: dw_mmc: Honor requests to set the clock to 0

Previously the dw_mmc driver would ignore any requests to disable the
card's clock. This doesn't seem like a good thing in general, but had
one extra bad side effect in the following situation:
* mmc core would set clk to 400kHz at boot time while scanning
* mmc core would set clk to 0 since no card, but it would be ignored.
* suspend to ram and resume; clocks in the dw_mmc IP block are now 0
but dw_mmc thinks that they're 400kHz (it ignored the set to 0).
* insert card
* mmc core would set clk to 400kHz which would be considered a no-op.

Note that if there is no card in the slot and we do a suspend/resume
cycle, we _do_ still end up with differences in a dw_mmc register
dump, but the differences are clock related and we've got the clock
disabled both before and after, so this should be OK.

Signed-off-by: Doug Anderson <dianders@chromium.org>
Signed-off-by: Seungwon Jeon <tgih.jun@samsung.com>
Signed-off-by: Chris Ball <cjb@laptop.org>

mmc: dw_mmc: Add exynos resume_noirq callback to clear WAKEUP_INT

If the WAKEUP_INT is asserted at wakeup and not cleared, we'll end up
looping around forever. This has been seen to happen on exynos5420
silicon despite the fact that we haven't enabled any wakeup events due
to a silicon errata. It is safe to do on all exynos variants.

Signed-off-by: Doug Anderson <dianders@chromium.org>
Acked-by: Seungwon Jeon <tgih.jun@samsung.com>
Signed-off-by: Chris Ball <cjb@laptop.org>

mmc: dw_mmc: don't queue up a card detect at slot startup

The MMC subsystem handles looking for a card at probe time. Queuing up our
own can race with the rest of the MMC subsystem and cause problems if we
get unlucky with timing. Just remove driver own detection triggering.  While
progressing the request from 'mmc_rescan', if 'dw_mci_work_routine_card'
routine is activated, it will cancel the current request. The problem case
is that 'mmc_rescan' is prior to 'dw_mci_work_routine_card' from host own.
Specifically, the following message shows the detection problem in driver's
probing. It would get an err -123 (-ENOMEDIUM) during probe.

[    4.216595] dwmmc_exynos 12210000.dwmmc1: Using internal DMA controller.
[    4.395935] dwmmc_exynos 12210000.dwmmc1: Version ID is 250a
[    4.401948] dwmmc_exynos 12210000.dwmmc1: DW MMC controller at irq 108, 64 bit host data width, 64 deep fifo
[    4.424430] dwmmc_exynos 12210000.dwmmc1: sdr0 mode (irq=108, width=0)
[    4.453975] dwmmc_exynos 12210000.dwmmc1: sdr0 mode (irq=108, width=0)
[    4.459592] mmc_host mmc1: Bus speed (slot 0) = 100000000Hz (slot req 400000Hz, actual 400000HZ div = 125)
[    4.484258] dwmmc_exynos 12210000.dwmmc1: 1 slots initialized
[    4.485406] dwmmc_exynos 12210000.dwmmc1: sdr0 mode (irq=108, width=0)
[    4.487606] dwmmc_exynos 12210000.dwmmc1: sdr0 mode (irq=108, width=0)
[    4.489794] dwmmc_exynos 12210000.dwmmc1: sdr0 mode (irq=108, width=0)
[    4.509757] mmc1: error -123 whilst initialising SDIO card

Signed-off-by: Doug Anderson <dianders@chromium.org>
Acked-by: Seungwon Jeon <tgih.jun@samsung.com>
Signed-off-by: Chris Ball <cjb@laptop.org>

sh: ecovec: fixup compile error on sdhi

afa2c9407f8908 ("sh: ecovec24: Use MMC/SDHI CD and RO GPIO") added
.tmio_flags = TMIO_MMC_USE_GPIO_CD on sh_mobile_sdhi_info, but it needs
<linux/mfd/tmio.h> header. This patch adds it.

Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Reviewed-by: Yusuke Goda <yusuke.goda.sx@renesas.com>
Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Chris Ball <cjb@laptop.org>

mmc: core: remove dead function mmc_try_claim_host

cscope says there are no callers for mmc_try_claim_host in the kernel.
No reason to keep it.

Signed-off-by: Grant Grundler <grundler@chromium.org>
Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Chris Ball <cjb@laptop.org>

Merge tag 'stable/for-linus-3.12-rc2-tag' of git://git./linux/kernel/git/xen/tip

Pull Xen fixes from Konrad Rzeszutek Wilk:
"Bug-fixes and one update to the kernel-paramters.txt documentation.

   - Fix PV spinlocks triggering jump_label code bug
   - Remove extraneous code in the tpm front driver
   - Fix ballooning out of pages when non-preemptible
   - Fix deadlock when using a 32-bit initial domain with large amount
     of memory
   - Add xen_nopvpsin parameter to the documentation"

* tag 'stable/for-linus-3.12-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen/spinlock: Document the xen_nopvspin parameter.
  xen/p2m: check MFN is in range before using the m2p table
  xen/balloon: don't alloc page while non-preemptible
  xen: Do not enable spinlocks before jump_label_init() has executed
  tpm: xen-tpmfront: Remove the locality sysfs attribute
  tpm: xen-tpmfront: Fix default durations

Merge tag 'dm-3.12-fixes' of git://git./linux/kernel/git/device-mapper/linux-dm

Pull device-mapper fixes from Mike Snitzer:
"A few fixes for dm-snapshot, a 32 bit fix for dm-stats, a couple error
  handling fixes for dm-multipath.  A fix for the thin provisioning
  target to not expose non-zero discard limits if discards are disabled.

  Lastly, add two DM module parameters which allow users to tune the
  emergency memory reserves that DM mainatins per device -- this helps
  fix a long-standing issue for dm-multipath.  The conservative default
  reserve for request-based dm-multipath devices (256) has proven
  problematic for users with many multipathed SCSI devices but
  relatively little memory.  To responsibly select a smaller value users
  should use the new nr_bios tracepoint info (via commit 75afb352
  "block: Add nr_bios to block_rq_remap tracepoint") to determine the
  peak number of bios their workloads create"

* tag 'dm-3.12-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm: add reserved_bio_based_ios module parameter
  dm: add reserved_rq_based_ios module parameter
  dm: lower bio-based mempool reservation
  dm thin: do not expose non-zero discard limits if discards disabled
  dm mpath: disable WRITE SAME if it fails
  dm-snapshot: fix performance degradation due to small hash size
  dm snapshot: workaround for a false positive lockdep warning
  dm stats: fix possible counter corruption on 32-bit systems
  dm mpath: do not fail path on -ENOSPC

Merge git://git.linux-mips.org/ralf/upstream-linus

Pull MIPS fixes from Ralf Baechle:

- Fix a comment

- A small cleanup the main purpose of which is to work around an
   internal compiler error bug in certain Codesource toolchains.

* git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
  MIPS: mm: Move some checks out of 'for' loop in DMA operations
  MIPS: cpu-features.h: s/MIPS53/MIPS64/

Merge branch 'merge' of git://git./linux/kernel/git/benh/powerpc

Pull powerpc fixes from Ben Herrenschmidt:
"Here are a few things for -rc2, this time it's all written by me so it
  can only be perfect .... right ? :)

  So we have the fix to call irq_enter/exit on the irq stack we've been
  discussing, plus a cleanup on top to remove an unused (and broken)
  stack limit tracking feature (well, make it 32-bit only in fact where
  it is used and works properly).

  Then we have two things that I wrote over the last couple of days and
  made the executive decision to include just because I can (and I'm
  sure you won't object .... right ?).

  They fix a couple of annoying and long standing "issues":

   - We had separate zImages for when booting via Open Firmware vs.
     booting via a flat device-tree, while it's trivial to make one that
     deals with both

   - We wasted a ton of cycles spinning secondary CPUs uselessly at boot
     instead of starting them when needed on pseries, thus contributing
     significantly to global warming"

* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
  powerpc/pseries: Do not start secondaries in Open Firmware
  powerpc/zImage: make the "OF" wrapper support ePAPR boot
  powerpc: Remove ksp_limit on ppc64
  powerpc/irq: Run softirqs off the top of the irq stack

Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull x86 fixes from Ingo Molnar:
"An EFI fix and two reboot-quirk fixes"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/reboot: Fix apparent cut-n-paste mistake in Dell reboot workaround
  x86/reboot: Add quirk to make Dell C6100 use reboot=pci automatically
  x86, efi: Don't map Boot Services on i386

Merge branch 'sched-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull scheduler fixes from Ingo Molnar:
"Three small fixes"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/balancing: Fix cfs_rq->task_h_load calculation
  sched/balancing: Fix 'local->avg_load > busiest->avg_load' case in fix_small_imbalance()
  sched/balancing: Fix 'local->avg_load > sds->avg_load' case in calculate_imbalance()

Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull perf fixes from Ingo Molnar:
"Assorted standalone fixes"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/intel: Add model number for Avoton Silvermont
  perf: Fix capabilities bitfield compatibility in 'struct perf_event_mmap_page'
  perf/x86/intel/uncore: Don't use smp_processor_id() in validate_group()
  perf: Update ABI comment
  tools lib lk: Uninclude linux/magic.h in debugfs.c
  perf tools: Fix old GCC build error in trace-event-parse.c:parse_proc_kallsyms()
  perf probe: Fix finder to find lines of given function
  perf session: Check for SIGINT in more loops
  perf tools: Fix compile with libelf without get_phdrnum
  perf tools: Fix buildid cache handling of kallsyms with kcore
  perf annotate: Fix objdump line parsing offset validation
  perf tools: Fill in new definitions for madvise()/mmap() flags
  perf tools: Sharpen the libaudit dependencies test

update contact information for Mikael Pettersson

My old @it.uu.se email address is going away, so update relevant
files to point to my @gmail.com address instead. In sata_promise.c
just delete the address, people can get it from MAINTAINERS.

Signed-off-by: Mikael Pettersson <mikpe@it.uu.se>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

MIPS: mm: Move some checks out of 'for' loop in DMA operations

The check cpu_needs_post_dma_flush() in mips_dma_sync_sg_for_cpu() and
the check !plat_device_is_coherent() in mips_dma_sync_sg_for_device()
can be moved outside the for loop.

As a side effect, this also avoids a GCC bug that caused kernel compile
to fail with the error:

arch/mips/mm/dma-default.c: In function 'mips_dma_sync_sg_for_cpu':
arch/mips/mm/dma-default.c:316:1: internal compiler error: in add_insn_before, at emit-rtl.c:3852

This gcc failure is seen in Code Sourcery toolchains [e.g. gcc version
4.7.2 (Sourcery CodeBench Lite 2012.09-99)] after commit "MIPS: Optimize
current_cpu_type() for better code."

Signed-off-by: Jayachandran C <jchandra@broadcom.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/5907/
Reviewed-by: Markos Chandras <markos.chandras@imgtec.com>
Tested-by: Markos Chandras <markos.chandras@imgtec.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

xen/spinlock: Document the xen_nopvspin parameter.

Which disables in the ticketlock slowpath the Xen PV optimization's.
Useful for diagnosing issues and comparing benchmarks in
over-commit CPU scenarios.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

xen/p2m: check MFN is in range before using the m2p table

On hosts with more than 168 GB of memory, a 32-bit guest may attempt
to grant map an MFN that is error cannot lookup in its mapping of the
m2p table.  There is an m2p lookup as part of m2p_add_override() and
m2p_remove_override().  The lookup falls off the end of the mapped
portion of the m2p and (because the mapping is at the highest virtual
address) wraps around and the lookup causes a fault on what appears to
be a user space address.

do_page_fault() (thinking it's a fault to a userspace address), tries
to lock mm->mmap_sem.  If the gntdev device is used for the grant map,
m2p_add_override() is called from from gnttab_mmap() with mm->mmap_sem
already locked.  do_page_fault() then deadlocks.

The deadlock would most commonly occur when a 64-bit guest is started
and xenconsoled attempts to grant map its console ring.

Introduce mfn_to_pfn_no_overrides() which checks the MFN is within the
mapped portion of the m2p table before accessing the table and use
this in m2p_add_override(), m2p_remove_override(), and mfn_to_pfn()
(which already had the correct range check).

All faults caused by accessing the non-existant parts of the m2p are
thus within the kernel address space and exception_fixup() is called
without trying to lock mm->mmap_sem.

This means that for MFNs that are outside the mapped range of the m2p
then mfn_to_pfn() will always look in the m2p overrides.  This is
correct because it must be a foreign MFN (and the PFN in the m2p in
this case is only relevant for the other domain).

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@citrix.com>
Cc: Jan Beulich <JBeulich@suse.com>
--
v3: check for auto_translated_physmap in mfn_to_pfn_no_overrides()
v2: in mfn_to_pfn() look in m2p_overrides if the MFN is out of
    range as it's probably foreign.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>

x86/reboot: Fix apparent cut-n-paste mistake in Dell reboot workaround

This seems to have been copied from the Optiplex 990 entry
above, but somoene forgot to change the ident text.

Signed-off-by: Dave Jones <davej@fedoraproject.org>
Link: http://lkml.kernel.org/r/20130925001344.GA13554@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

powerpc/pseries: Do not start secondaries in Open Firmware

Starting secondary CPUs early on from Open Firmware and placing them
in a holding spin loop slows down the boot process significantly under
some hypervisors such as KVM.

This is also unnecessary when RTAS supports querying the CPU state

So let's not do it.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/zImage: make the "OF" wrapper support ePAPR boot

This makes the "OF" zImage wrapper (zImage.pseries, zImage.pmac,
zImage.maple) work if booted via a flat device-tree (ePAPR boot
mode), and thus potentially usable with kexec.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc: Remove ksp_limit on ppc64

We've been keeping that field in thread_struct for a while, it contains
the "limit" of the current stack pointer and is meant to be used for
detecting stack overflows.

It has a few problems however:

- First, it was never actually *used* on 64-bit. Set and updated but
not actually exploited

- When switching stack to/from irq and softirq stacks, it's update
is racy unless we hard disable interrupts, which is costly. This
is fine on 32-bit as we don't soft-disable there but not on 64-bit.

Thus rather than fixing 2 in order to implement 1 in some hypothetical
future, let's remove the code completely from 64-bit. In order to avoid
a clutter of ifdef's, we remove the updates from C code completely
during interrupt stack switching, and instead maintain it from the
asm helper that is used to do the stack switching in the first place.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/irq: Run softirqs off the top of the irq stack

Nowadays, irq_exit() calls __do_softirq() pretty much directly
instead of calling do_softirq() which switches to the decicated
softirq stack.

This has lead to observed stack overflows on powerpc since we call
irq_enter() and irq_exit() outside of the scope that switches to
the irq stack.

This fixes it by moving the stack switching up a level, making
irq_enter() and irq_exit() run off the irq stack.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

mm: Place preemption point in do_mlockall() loop

There is a loop in do_mlockall() that lacks a preemption point, which
means that the following can happen on non-preemptible builds of the
kernel. Dave Jones reports:

"My fuzz tester keeps hitting this.  Every instance shows the non-irq
  stack came in from mlockall.  I'm only seeing this on one box, but
  that has more ram (8gb) than my other machines, which might explain
  it.

    INFO: rcu_preempt self-detected stall on CPU { 3}  (t=6500 jiffies g=470344 c=470343 q=0)
    sending NMI to all CPUs:
    NMI backtrace for cpu 3
    CPU: 3 PID: 29664 Comm: trinity-child2 Not tainted 3.11.0-rc1+ #32
    Call Trace:
      lru_add_drain_all+0x15/0x20
      SyS_mlockall+0xa5/0x1a0
      tracesys+0xdd/0xe2"

This commit addresses this problem by inserting the required preemption
point.

Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge branch 'akpm' (patches from Andrew Morton)

Merge fixes from Andrew Morton:
"Bunch of fixes.

  And a reversion of mhocko's "Soft limit rework" patch series.  This is
  actually your fault for opening the merge window when I was off racing ;)

  I didn't read the email thread before sending everything off.
  Johannes Weiner raised significant issues:

    http://www.spinics.net/lists/cgroups/msg08813.html

  and we agreed to back it all out"

I clearly need to be more aware of Andrew's racing schedule.

* akpm:
  MAINTAINERS: update mach-bcm related email address
  checkpatch: make extern in .h prototypes quieter
  cciss: fix info leak in cciss_ioctl32_passthru()
  cpqarray: fix info leak in ida_locked_ioctl()
  kernel/reboot.c: re-enable the function of variable reboot_default
  audit: fix endless wait in audit_log_start()
  revert "memcg, vmscan: integrate soft reclaim tighter with zone shrinking code"
  revert "memcg: get rid of soft-limit tree infrastructure"
  revert "vmscan, memcg: do softlimit reclaim also for targeted reclaim"
  revert "memcg: enhance memcg iterator to support predicates"
  revert "memcg: track children in soft limit excess to improve soft limit"
  revert "memcg, vmscan: do not attempt soft limit reclaim if it would not scan anything"
  revert "memcg: track all children over limit in the root"
  revert "memcg, vmscan: do not fall into reclaim-all pass too quickly"
  fs/ocfs2/super.c: use a bigger nodestr in ocfs2_dismount_volume
  watchdog: update watchdog_thresh properly
  watchdog: update watchdog attributes atomically

MAINTAINERS: update mach-bcm related email address

Update email address on mach-bcm + drivers for Broadcom mobile SoCs.

Signed-off-by: Christian Daudt <csd@broadcom.com>
Cc: Olof Johansson <olof@lixom.net>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Stephen Warren <swarren@wwwdotorg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

checkpatch: make extern in .h prototypes quieter

The use of extern in .h files is a bit contentious.

Make the warning be emitted only when --strict is used on the command
line.

Signed-off-by: Joe Perches <joe@perches.com>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

cciss: fix info leak in cciss_ioctl32_passthru()

The arg64 struct has a hole after ->buf_size which isn't cleared. Or if
any of the calls to copy_from_user() fail then that would cause an
information leak as well.

This was assigned CVE-2013-2147.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

cpqarray: fix info leak in ida_locked_ioctl()

The pciinfo struct has a two byte hole after ->dev_fn so stack
information could be leaked to the user.

This was assigned CVE-2013-2147.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

kernel/reboot.c: re-enable the function of variable reboot_default

Commit 1b3a5d02ee07 ("reboot: move arch/x86 reboot= handling to generic
kernel") did some cleanup for reboot= command line, but it made the
reboot_default inoperative.

The default value of variable reboot_default should be 1, and if command
line reboot= is not set, system will use the default reboot mode.

[akpm@linux-foundation.org: fix comment layout]
Signed-off-by: Li Fei <fei.li@intel.com>
Signed-off-by: liu chuansheng <chuansheng.liu@intel.com>
Acked-by: Robin Holt <robinmholt@linux.com>
Cc: <stable@vger.kernel.org> [3.11.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

audit: fix endless wait in audit_log_start()

After commit 829199197a43 ("kernel/audit.c: avoid negative sleep
durations") audit emitters will block forever if userspace daemon cannot
handle backlog.

After the timeout the waiting loop turns into busy loop and runs until
daemon dies or returns back to work. This is a minimal patch for that
bug.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Luiz Capitulino <lcapitulino@redhat.com>
Cc: Richard Guy Briggs <rgb@redhat.com>
Cc: Eric Paris <eparis@redhat.com>
Cc: Chuck Anderson <chuck.anderson@oracle.com>
Cc: Dan Duval <dan.duval@oracle.com>
Cc: Dave Kleikamp <dave.kleikamp@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

revert "memcg, vmscan: integrate soft reclaim tighter with zone shrinking code"

Revert commit 3b38722efd9f ("memcg, vmscan: integrate soft reclaim
tighter with zone shrinking code")

I merged this prematurely - Michal and Johannes still disagree about the
overall design direction and the future remains unclear.

Cc: Michal Hocko <mhocko@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

revert "memcg: get rid of soft-limit tree infrastructure"

Revert commit e883110aad71 ("memcg: get rid of soft-limit tree
infrastructure")

I merged this prematurely - Michal and Johannes still disagree about the
overall design direction and the future remains unclear.

Cc: Michal Hocko <mhocko@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

revert "vmscan, memcg: do softlimit reclaim also for targeted reclaim"

Revert commit a5b7c87f9207 ("vmscan, memcg: do softlimit reclaim also
for targeted reclaim")

I merged this prematurely - Michal and Johannes still disagree about the
overall design direction and the future remains unclear.

Cc: Michal Hocko <mhocko@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

revert "memcg: enhance memcg iterator to support predicates"

Revert commit de57780dc659 ("memcg: enhance memcg iterator to support
predicates")

I merged this prematurely - Michal and Johannes still disagree about the
overall design direction and the future remains unclear.

Cc: Michal Hocko <mhocko@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

revert "memcg: track children in soft limit excess to improve soft limit"

Revert commit 7d910c054be4 ("memcg: track children in soft limit excess
to improve soft limit")

I merged this prematurely - Michal and Johannes still disagree about the
overall design direction and the future remains unclear.

Cc: Michal Hocko <mhocko@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

revert "memcg, vmscan: do not attempt soft limit reclaim if it would not scan anything"

Revert commit e839b6a1c8d0 ("memcg, vmscan: do not attempt soft limit
reclaim if it would not scan anything")

I merged this prematurely - Michal and Johannes still disagree about the
overall design direction and the future remains unclear.

Cc: Michal Hocko <mhocko@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

revert "memcg: track all children over limit in the root"

Revert commit 1be171d60bdd ("memcg: track all children over limit in the
root")

I merged this prematurely - Michal and Johannes still disagree about the
overall design direction and the future remains unclear.

Cc: Michal Hocko <mhocko@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

revert "memcg, vmscan: do not fall into reclaim-all pass too quickly"

Revert commit e975de998b96 ("memcg, vmscan: do not fall into reclaim-all
pass too quickly")

I merged this prematurely - Michal and Johannes still disagree about the
overall design direction and the future remains unclear.

Cc: Michal Hocko <mhocko@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fs/ocfs2/super.c: use a bigger nodestr in ocfs2_dismount_volume

While printing 32-bit node numbers, an 8-byte string is not enough.
Increase the size of the string to 12 chars.

This got left out in commit 49fa8140e487 ("fs/ocfs2/super.c: Use bigger
nodestr to accomodate 32-bit node numbers").

Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

watchdog: update watchdog_thresh properly

watchdog_tresh controls how often nmi perf event counter checks per-cpu
hrtimer_interrupts counter and blows up if the counter hasn't changed
since the last check.  The counter is updated by per-cpu
watchdog_hrtimer hrtimer which is scheduled with 2/5 watchdog_thresh
period which guarantees that hrtimer is scheduled 2 times per the main
period.  Both hrtimer and perf event are started together when the
watchdog is enabled.

So far so good.  But...

But what happens when watchdog_thresh is updated from sysctl handler?

proc_dowatchdog will set a new sampling period and hrtimer callback
(watchdog_timer_fn) will use the new value in the next round.  The
problem, however, is that nobody tells the perf event that the sampling
period has changed so it is ticking with the period configured when it
has been set up.

This might result in an ear ripping dissonance between perf and hrtimer
parts if the watchdog_thresh is increased.  And even worse it might lead
to KABOOM if the watchdog is configured to panic on such a spurious
lockup.

This patch fixes the issue by updating both nmi perf even counter and
hrtimers if the threshold value has changed.

The nmi one is disabled and then reinitialized from scratch.  This has
an unpleasant side effect that the allocation of the new event might
fail theoretically so the hard lockup detector would be disabled for
such cpus.  On the other hand such a memory allocation failure is very
unlikely because the original event is deallocated right before.

It would be much nicer if we just changed perf event period but there
doesn't seem to be any API to do that right now.  It is also unfortunate
that perf_event_alloc uses GFP_KERNEL allocation unconditionally so we
cannot use on_each_cpu() and do the same thing from the per-cpu context.
The update from the current CPU should be safe because
perf_event_disable removes the event atomically before it clears the
per-cpu watchdog_ev so it cannot change anything under running handler
feet.

The hrtimer is simply restarted (thanks to Don Zickus who has pointed
this out) if it is queued because we cannot rely it will fire&adopt to
the new sampling period before a new nmi event triggers (when the
treshold is decreased).

[akpm@linux-foundation.org: the UP version of __smp_call_function_single ended up in the wrong place]
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Fabio Estevam <festevam@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

watchdog: update watchdog attributes atomically

proc_dowatchdog doesn't synchronize multiple callers which might lead to
confusion when two parallel callers might confuse watchdog_enable_all_cpus
resp watchdog_disable_all_cpus (eg watchdog gets enabled even if
watchdog_thresh was set to 0 already).

This patch adds a local mutex which synchronizes callers to the sysctl
handler.

Signed-off-by: Michal Hocko <mhocko@suse.cz>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Don Zickus <dzickus@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge branch 'bcache' (bcache fixes from Kent Overstreet)

Merge bcache fixes from Kent Overstreet:
"There's fixes for _three_ different data corruption bugs, all of which
  were found by users hitting them in the wild.

  The first one isn't bcache specific - in 3.11 bcache was switched to
  the bio_copy_data in fs/bio.c, and that's when the bug in that code
  was discovered, but it's also used by raid1 and pktcdvd.  (That was my
  code too, so the bug's doubly embarassing given that it was or
  should've been just a cut and paste from bcache code.  Dunno what
  happened there).

  Most of these (all the non data corruption bugs, actually) were ready
  before the merge window and have been sitting in Jens' tree, but I
  don't know what's been up with him lately..."

* emailed patches from Kent Overstreet <kmo@daterainc.com>:
  bcache: Fix flushes in writeback mode
  bcache: Fix for handling overlapping extents when reading in a btree node
  bcache: Fix a shrinker deadlock
  bcache: Fix a dumb CPU spinning bug in writeback
  bcache: Fix a flush/fua performance bug
  bcache: Fix a writeback performance regression
  bcache: Correct printf()-style format length modifier
  bcache: Fix for when no journal entries are found
  bcache: Strip endline when writing the label through sysfs
  bcache: Fix a dumb journal discard bug
  block: Fix bio_copy_data()

bcache: Fix flushes in writeback mode

In writeback mode, when we get a cache flush we need to make sure we
issue a flush to the backing device.

The code for sending down an extra flush was wrong - by cloning the bio
we were probably getting flags that didn't make sense for a bare flush,
and also the old code was firing for FUA bios, for which we don't need
to send a flush to the backing device.

This was causing data corruption somehow - the mechanism was never
determined, but this patch fixes it for the users that were seeing it.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: linux-stable <stable@vger.kernel.org> # >= v3.10
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

bcache: Fix for handling overlapping extents when reading in a btree node

btree_sort_fixup() was overly clever, because it was trying to avoid
pulling a key off the btree iterator in more than one place.

This led to a really obscure bug where we'd break early from the loop in
btree_sort_fixup() if the current key overlapped with keys in more than
one older set, and the next key it overlapped with was zero size.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: linux-stable <stable@vger.kernel.org> # >= v3.10
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

bcache: Fix a shrinker deadlock

GFP_NOIO means we could be getting called recursively - mca_alloc() ->
mca_data_alloc() - definitely can't use mutex_lock(bucket_lock) then.
Whoops.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: linux-stable <stable@vger.kernel.org> # >= v3.10
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

bcache: Fix a dumb CPU spinning bug in writeback

schedule_timeout() != schedule_timeout_uninterruptible()

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: linux-stable <stable@vger.kernel.org> # >= v3.10
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

bcache: Fix a flush/fua performance bug

bch_journal_meta() was missing the flush to make the journal write
actually go down (instead of waiting up to journal_delay_ms)...

Whoops

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: linux-stable <stable@vger.kernel.org> # >= v3.10
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

bcache: Fix a writeback performance regression

Background writeback works by scanning the btree for dirty data and
adding those keys into a fixed size buffer, then for each dirty key in
the keybuf writing it to the backing device.

When read_dirty() finishes and it's time to scan for more dirty data, we
need to wait for the outstanding writeback IO to finish - they still
take up slots in the keybuf (so that foreground writes can check for
them to avoid races) - without that wait, we'll continually rescan when
we'll be able to add at most a key or two to the keybuf, and that takes
locks that starves foreground IO. Doh.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: linux-stable <stable@vger.kernel.org> # >= v3.10
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

bcache: Correct printf()-style format length modifier

Fix

drivers/md/bcache/btree.c: In function ‘bch_btree_node_read’:
drivers/md/bcache/btree.c:259: warning: format ‘%lu’ expects type ‘long unsigned int’, but argument 3 has type ‘size_t’

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

bcache: Fix for when no journal entries are found

The journal replay code didn't handle this case, causing it to go into
an infinite loop...

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: linux-stable <stable@vger.kernel.org> # >= v3.10
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

bcache: Strip endline when writing the label through sysfs

sysfs attributes with unusual characters have crappy failure modes
in Squeeze (udev 164); later versions of udev are unaffected.

This should make these characters more unusual.

Signed-off-by: Gabriel de Perthuis <g2p.code@gmail.com>
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: linux-stable <stable@vger.kernel.org> # >= v3.10
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

bcache: Fix a dumb journal discard bug

That switch statement was obviously wrong, leading to some sort of weird
spinning on rare occasion with discards enabled...

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: linux-stable <stable@vger.kernel.org> # >= v3.10
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

block: Fix bio_copy_data()

The memcpy() in bio_copy_data() was using the wrong offset vars, leading
to data corruption in weird unusual setups.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: linux-stable <stable@vger.kernel.org> # >= v3.9
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

xen/balloon: don't alloc page while non-preemptible

get_balloon_scratch_page() disables preemption so we cannot call
alloc_page() in between get/put_balloon_scratch_page(). Shuffle bits
around in decrease_reservation() to avoid this.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

xen: Do not enable spinlocks before jump_label_init() has executed

xen_init_spinlocks() currently calls static_key_slow_inc() before
jump_label_init() is invoked. When CONFIG_JUMP_LABEL is set (which usually is
the case) the effect of this static_key_slow_inc() is deferred until after
jump_label_init(). This is different from when CONFIG_JUMP_LABEL is not set, in
which case the key is set immediately. Thus, depending on the value of config
option, we may observe different behavior.

In addition, when we come to __jump_label_transform() from jump_label_init(),
the key (paravirt_ticketlocks_enabled) is already enabled. On processors where
ideal_nop is not the same as default_nop this will cause a BUG() since it is
expected that before a key is enabled the latter is replaced by the former
during initialization.

To address this problem we need to move
static_key_slow_inc(&paravirt_ticketlocks_enabled) so that it is called
after jump_label_init(). We also need to make sure that this is done before
other cpus start to boot. early_initcall appears to be a good place to do so.
(Note that we cannot move whole xen_init_spinlocks() there since pv_lock_ops
need to be set before alternative_instructions() runs.)

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
[v2: Added extra comments in the code]
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>

tpm: xen-tpmfront: Remove the locality sysfs attribute

Upon deeper review it was agreed to remove the driver-unique
'locality' sysfs attribute before it is present in a released
kernel.

The attribute was introduced in e2683957fb268c6b29316fd9e7191e13239a30a5
during the 3.12 merge window, so this patch needs to go in before
3.12 is released.

The hope is to have a well defined locality API that all the other
locality aware drivers can use, perhaps in 3.13.

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>

tpm: xen-tpmfront: Fix default durations

All the default durations were being set to 10 minutes which is
way too long for the timeouts. Normal values for the longest
duration are around 5 mins, and short duration ar around .5s.

Further, these are just the default, tpm_get_timeouts will set
them to values from the TPM (or throw an error).

Just remove them.

Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

drm/i2c: tda998x: fix audio muting

Fix a bug that was introduced in commit c4c11dd160a8 ("drm/i2c: tda998x:
add video and audio input configuration") when Sebastian cleaned up my
original patch. Without this being fixed, audio is muted when the
display is turned off, never to be re-enabled.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Cc: Darren Etheridge <detheridge@ti.com>
Cc: Dave Airlie <airlied@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ipc: fix race with LSMs

Currently, IPC mechanisms do security and auditing related checks under
RCU.  However, since security modules can free the security structure,
for example, through selinux_[sem,msg_queue,shm]_free_security(), we can
race if the structure is freed before other tasks are done with it,
creating a use-after-free condition.  Manfred illustrates this nicely,
for instance with shared mem and selinux:

-> do_shmat calls rcu_read_lock()
-> do_shmat calls shm_object_check().
     Checks that the object is still valid - but doesn't acquire any locks.
     Then it returns.
-> do_shmat calls security_shm_shmat (e.g. selinux_shm_shmat)
-> selinux_shm_shmat calls ipc_has_perm()
-> ipc_has_perm accesses ipc_perms->security

shm_close()
-> shm_close acquires rw_mutex & shm_lock
-> shm_close calls shm_destroy
-> shm_destroy calls security_shm_free (e.g. selinux_shm_free_security)
-> selinux_shm_free_security calls ipc_free_security(&shp->shm_perm)
-> ipc_free_security calls kfree(ipc_perms->security)

This patch delays the freeing of the security structures after all RCU
readers are done.  Furthermore it aligns the security life cycle with
that of the rest of IPC - freeing them based on the reference counter.
For situations where we need not free security, the current behavior is
kept.  Linus states:

"... the old behavior was suspect for another reason too: having the
  security blob go away from under a user sounds like it could cause
  various other problems anyway, so I think the old code was at least
  _prone_ to bugs even if it didn't have catastrophic behavior."

I have tested this patch with IPC testcases from LTP on both my
quad-core laptop and on a 64 core NUMA server.  In both cases selinux is
enabled, and tests pass for both voluntary and forced preemption models.
While the mentioned races are theoretical (at least no one as reported
them), I wanted to make sure that this new logic doesn't break anything
we weren't aware of.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Davidlohr Bueso <davidlohr@hp.com>
Acked-by: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

MIPS: cpu-features.h: s/MIPS53/MIPS64/

No support for MIPS53 processors yet.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/5876/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

Linux 3.12-rc2

Merge tag 'staging-3.12-rc2' of git://git./linux/kernel/git/gregkh/staging

Pull staging fixes from Greg KH:
"Here are a number of small staging tree and iio driver fixes.  Nothing
  major, just lots of little things"

* tag 'staging-3.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (34 commits)
  iio:buffer_cb: Add missing iio_buffer_init()
  iio: Prevent race between IIO chardev opening and IIO device free
  iio: fix: Keep a reference to the IIO device for open file descriptors
  iio: Stop sampling when the device is removed
  iio: Fix crash when scan_bytes is computed with active_scan_mask == NULL
  iio: Fix mcp4725 dev-to-indio_dev conversion in suspend/resume
  iio: Fix bma180 dev-to-indio_dev conversion in suspend/resume
  iio: Fix tmp006 dev-to-indio_dev conversion in suspend/resume
  iio: iio_device_add_event_sysfs() bugfix
  staging: iio: ade7854-spi: Fix return value
  staging:iio:hmc5843: Fix measurement conversion
  iio: isl29018: Fix uninitialized value
  staging:iio:dummy fix kfifo_buf kconfig dependency issue if kfifo modular and buffer enabled for built in dummy driver.
  iio: at91: fix adc_clk overflow
  staging: line6: add bounds check in snd_toneport_source_put()
  Staging: comedi: Fix dependencies for drivers misclassified as PCI
  staging: r8188eu: Adjust RX gain
  staging: r8188eu: Fix smatch warning in core/rtw_ieee80211.
  staging: r8188eu: Fix smatch error in core/rtw_mlme_ext.c
  staging: r8188eu: Fix Smatch off-by-one warning in hal/rtl8188e_hal_init.c
  ...

Merge tag 'usb-3.12-rc2' of git://git./linux/kernel/git/gregkh/usb

Pull USB fixes from Greg KH:
"Here are a number of small USB fixes for 3.12-rc2.

  One is a revert of a EHCI change that isn't quite ready for 3.12.
  Others are minor things, gadget fixes, Kconfig fixes, and some quirks
  and documentation updates.

  All have been in linux-next for a bit"

* tag 'usb-3.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  USB: pl2303: distinguish between original and cloned HX chips
  USB: Faraday fotg210: fix email addresses
  USB: fix typo in usb serial simple driver Kconfig
  Revert "USB: EHCI: support running URB giveback in tasklet context"
  usb: s3c-hsotg: do not disconnect gadget when receiving ErlySusp intr
  usb: s3c-hsotg: fix unregistration function
  usb: gadget: f_mass_storage: reset endpoint driver data when disabled
  usb: host: fsl-mph-dr-of: Staticize local symbols
  usb: gadget: f_eem: Staticize eem_alloc
  usb: gadget: f_ecm: Staticize ecm_alloc
  usb: phy: omap-usb3: Fix return value
  usb: dwc3: gadget: avoid memory leak when failing to allocate all eps
  usb: dwc3: remove extcon dependency
  usb: gadget: add '__ref' for rndis_config_register() and cdc_config_register()
  usb: dwc3: pci: add support for BayTrail
  usb: gadget: cdc2: fix conversion to new interface of f_ecm
  usb: gadget: fix a bug and a WARN_ON in dummy-hcd
  usb: gadget: mv_u3d_core: fix violation of locking discipline in mv_u3d_ep_disable()

dm: add reserved_bio_based_ios module parameter

Allow user to change the number of IOs that are reserved by
bio-based DM's mempools by writing to this file:
/sys/module/dm_mod/parameters/reserved_bio_based_ios

The default value is RESERVED_BIO_BASED_IOS (16). The maximum allowed
value is RESERVED_MAX_IOS (1024).

Export dm_get_reserved_bio_based_ios() for use by DM targets and core
code. Switch to sizing dm-io's mempool and bioset using DM core's
configurable 'reserved_bio_based_ios'.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Frank Mayhar <fmayhar@google.com>

dm: add reserved_rq_based_ios module parameter

Allow user to change the number of IOs that are reserved by
request-based DM's mempools by writing to this file:
/sys/module/dm_mod/parameters/reserved_rq_based_ios

The default value is RESERVED_REQUEST_BASED_IOS (256). The maximum
allowed value is RESERVED_MAX_IOS (1024).

Export dm_get_reserved_rq_based_ios() for use by DM targets and core
code. Switch to sizing dm-mpath's mempool using DM core's configurable
'reserved_rq_based_ios'.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Frank Mayhar <fmayhar@google.com>
Acked-by: Mikulas Patocka <mpatocka@redhat.com>

dm: lower bio-based mempool reservation

Bio-based device mapper processing doesn't need larger mempools (like
request-based DM does), so lower the number of reserved entries for
bio-based operation. 16 was already used for bio-based DM's bioset
but mistakenly wasn't used for it's _io_cache.

Formalize difference between bio-based and request-based defaults by
introducing RESERVED_BIO_BASED_IOS and RESERVED_REQUEST_BASED_IOS.

(based on older code from Mikulas Patocka)

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Frank Mayhar <fmayhar@google.com>
Acked-by: Mikulas Patocka <mpatocka@redhat.com>

dm thin: do not expose non-zero discard limits if discards disabled

Fix issue where the block layer would stack the discard limits of the
pool's data device even if the "ignore_discard" pool feature was
specified.

The pool and thin device(s) still had discards disabled because the
QUEUE_FLAG_DISCARD request_queue flag wasn't set. But to avoid user
confusion when "ignore_discard" is used: both the pool device and the
thin device(s) have zeroes for all discard limits.

Also, always set discard_zeroes_data_unsupported in targets because they
should never advertise the 'discard_zeroes_data' capability (even if the
pool's data device supports it).

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Acked-by: Joe Thornber <ejt@redhat.com>