git.openwrt.org Git - openwrt/staging/blogic.git/log

blk-mq: fix discard merge with scheduler attached

I ran into an issue on my laptop that triggered a bug on the
discard path:

WARNING: CPU: 2 PID: 207 at drivers/nvme/host/core.c:527 nvme_setup_cmd+0x3d3/0x430
Modules linked in: rfcomm fuse ctr ccm bnep arc4 binfmt_misc snd_hda_codec_hdmi nls_iso8859_1 nls_cp437 vfat snd_hda_codec_conexant fat snd_hda_codec_generic iwlmvm snd_hda_intel snd_hda_codec snd_hwdep mac80211 snd_hda_core snd_pcm snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq x86_pkg_temp_thermal intel_powerclamp kvm_intel uvcvideo iwlwifi btusb snd_seq_device videobuf2_vmalloc btintel videobuf2_memops kvm snd_timer videobuf2_v4l2 bluetooth irqbypass videobuf2_core aesni_intel aes_x86_64 crypto_simd cryptd snd glue_helper videodev cfg80211 ecdh_generic soundcore hid_generic usbhid hid i915 psmouse e1000e ptp pps_core xhci_pci xhci_hcd intel_gtt
CPU: 2 PID: 207 Comm: jbd2/nvme0n1p7- Tainted: G     U           4.15.0+ #176
Hardware name: LENOVO 20FBCTO1WW/20FBCTO1WW, BIOS N1FET59W (1.33 ) 12/19/2017
RIP: 0010:nvme_setup_cmd+0x3d3/0x430
RSP: 0018:ffff880423e9f838 EFLAGS: 00010217
RAX: 0000000000000000 RBX: ffff880423e9f8c8 RCX: 0000000000010000
RDX: ffff88022b200010 RSI: 0000000000000002 RDI: 00000000327f0000
RBP: ffff880421251400 R08: ffff88022b200000 R09: 0000000000000009
R10: 0000000000000000 R11: 0000000000000000 R12: 000000000000ffff
R13: ffff88042341e280 R14: 000000000000ffff R15: ffff880421251440
FS:  0000000000000000(0000) GS:ffff880441500000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055b684795030 CR3: 0000000002e09006 CR4: 00000000001606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
  nvme_queue_rq+0x40/0xa00
  ? __sbitmap_queue_get+0x24/0x90
  ? blk_mq_get_tag+0xa3/0x250
  ? wait_woken+0x80/0x80
  ? blk_mq_get_driver_tag+0x97/0xf0
  blk_mq_dispatch_rq_list+0x7b/0x4a0
  ? deadline_remove_request+0x49/0xb0
  blk_mq_do_dispatch_sched+0x4f/0xc0
  blk_mq_sched_dispatch_requests+0x106/0x170
  __blk_mq_run_hw_queue+0x53/0xa0
  __blk_mq_delay_run_hw_queue+0x83/0xa0
  blk_mq_run_hw_queue+0x6c/0xd0
  blk_mq_sched_insert_request+0x96/0x140
  __blk_mq_try_issue_directly+0x3d/0x190
  blk_mq_try_issue_directly+0x30/0x70
  blk_mq_make_request+0x1a4/0x6a0
  generic_make_request+0xfd/0x2f0
  ? submit_bio+0x5c/0x110
  submit_bio+0x5c/0x110
  ? __blkdev_issue_discard+0x152/0x200
  submit_bio_wait+0x43/0x60
  ext4_process_freed_data+0x1cd/0x440
  ? account_page_dirtied+0xe2/0x1a0
  ext4_journal_commit_callback+0x4a/0xc0
  jbd2_journal_commit_transaction+0x17e2/0x19e0
  ? kjournald2+0xb0/0x250
  kjournald2+0xb0/0x250
  ? wait_woken+0x80/0x80
  ? commit_timeout+0x10/0x10
  kthread+0x111/0x130
  ? kthread_create_worker_on_cpu+0x50/0x50
  ? do_group_exit+0x3a/0xa0
  ret_from_fork+0x1f/0x30
Code: 73 89 c1 83 ce 10 c1 e1 10 09 ca 83 f8 04 0f 87 0f ff ff ff 8b 4d 20 48 8b 7d 00 c1 e9 09 48 01 8c c7 00 08 00 00 e9 f8 fe ff ff <0f> ff 4c 89 c7 41 bc 0a 00 00 00 e8 0d 78 d6 ff e9 a1 fc ff ff
---[ end trace 50d361cc444506c8 ]---
print_req_error: I/O error, dev nvme0n1, sector 847167488

Decoding the assembly, the request claims to have 0xffff segments,
while nvme counts two. This turns out to be because we don't check
for a data carrying request on the mq scheduler path, and since
blk_phys_contig_segment() returns true for a non-data request,
we decrement the initial segment count of 0 and end up with
0xffff in the unsigned short.

There are a few issues here:

1) We should initialize the segment count for a discard to 1.
2) The discard merging is currently using the data limits for
   segments and sectors.

Fix this up by having attempt_merge() correctly identify the
request, and by initializing the segment count correctly
for discards.

This can only be triggered with mq-deadline on discard capable
devices right now, which isn't a common configuration.

Signed-off-by: Jens Axboe <axboe@kernel.dk>

blk-mq: introduce BLK_STS_DEV_RESOURCE

This status is returned from driver to block layer if device related
resource is unavailable, but driver can guarantee that IO dispatch
will be triggered in future when the resource is available.

Convert some drivers to return BLK_STS_DEV_RESOURCE.  Also, if driver
returns BLK_STS_RESOURCE and SCHED_RESTART is set, rerun queue after
a delay (BLK_MQ_DELAY_QUEUE) to avoid IO stalls.  BLK_MQ_DELAY_QUEUE is
3 ms because both scsi-mq and nvmefc are using that magic value.

If a driver can make sure there is in-flight IO, it is safe to return
BLK_STS_DEV_RESOURCE because:

1) If all in-flight IOs complete before examining SCHED_RESTART in
blk_mq_dispatch_rq_list(), SCHED_RESTART must be cleared, so queue
is run immediately in this case by blk_mq_dispatch_rq_list();

2) if there is any in-flight IO after/when examining SCHED_RESTART
in blk_mq_dispatch_rq_list():
- if SCHED_RESTART isn't set, queue is run immediately as handled in 1)
- otherwise, this request will be dispatched after any in-flight IO is
  completed via blk_mq_sched_restart()

3) if SCHED_RESTART is set concurently in context because of
BLK_STS_RESOURCE, blk_mq_delay_run_hw_queue() will cover the above two
cases and make sure IO hang can be avoided.

One invariant is that queue will be rerun if SCHED_RESTART is set.

Suggested-by: Jens Axboe <axboe@kernel.dk>
Tested-by: Laurence Oberman <loberman@redhat.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

Merge branch 'for-4.16/block' of git://git.kernel.dk/linux-block

Pull block updates from Jens Axboe:
"This is the main pull request for block IO related changes for the
  4.16 kernel. Nothing major in this pull request, but a good amount of
  improvements and fixes all over the map. This contains:

   - BFQ improvements, fixes, and cleanups from Angelo, Chiara, and
     Paolo.

   - Support for SMR zones for deadline and mq-deadline from Damien and
     Christoph.

   - Set of fixes for bcache by way of Michael Lyle, including fixes
     from himself, Kent, Rui, Tang, and Coly.

   - Series from Matias for lightnvm with fixes from Hans Holmberg,
     Javier, and Matias. Mostly centered around pblk, and the removing
     rrpc 1.2 in preparation for supporting 2.0.

   - A couple of NVMe pull requests from Christoph. Nothing major in
     here, just fixes and cleanups, and support for command tracing from
     Johannes.

   - Support for blk-throttle for tracking reads and writes separately.
     From Joseph Qi. A few cleanups/fixes also for blk-throttle from
     Weiping.

   - Series from Mike Snitzer that enables dm to register its queue more
     logically, something that's alwways been problematic on dm since
     it's a stacked device.

   - Series from Ming cleaning up some of the bio accessor use, in
     preparation for supporting multipage bvecs.

   - Various fixes from Ming closing up holes around queue mapping and
     quiescing.

   - BSD partition fix from Richard Narron, fixing a problem where we
     can't mount newer (10/11) FreeBSD partitions.

   - Series from Tejun reworking blk-mq timeout handling. The previous
     scheme relied on atomic bits, but it had races where we would think
     a request had timed out if it to reused at the wrong time.

   - null_blk now supports faking timeouts, to enable us to better
     exercise and test that functionality separately. From me.

   - Kill the separate atomic poll bit in the request struct. After
     this, we don't use the atomic bits on blk-mq anymore at all. From
     me.

   - sgl_alloc/free helpers from Bart.

   - Heavily contended tag case scalability improvement from me.

   - Various little fixes and cleanups from Arnd, Bart, Corentin,
     Douglas, Eryu, Goldwyn, and myself"

* 'for-4.16/block' of git://git.kernel.dk/linux-block: (186 commits)
  block: remove smart1,2.h
  nvme: add tracepoint for nvme_complete_rq
  nvme: add tracepoint for nvme_setup_cmd
  nvme-pci: introduce RECONNECTING state to mark initializing procedure
  nvme-rdma: remove redundant boolean for inline_data
  nvme: don't free uuid pointer before printing it
  nvme-pci: Suspend queues after deleting them
  bsg: use pr_debug instead of hand crafted macros
  blk-mq-debugfs: don't allow write on attributes with seq_operations set
  nvme-pci: Fix queue double allocations
  block: Set BIO_TRACE_COMPLETION on new bio during split
  blk-throttle: use queue_is_rq_based
  block: Remove kblockd_schedule_delayed_work{,_on}()
  blk-mq: Avoid that blk_mq_delay_run_hw_queue() introduces unintended delays
  blk-mq: Rename blk_mq_request_direct_issue() into blk_mq_request_issue_directly()
  lib/scatterlist: Fix chaining support in sgl_alloc_order()
  blk-throttle: track read and write request individually
  block: add bdev_read_only() checks to common helpers
  block: fail op_is_write() requests to read-only partitions
  blk-throttle: export io_serviced_recursive, io_service_bytes_recursive
  ...

Merge tag 'edac_for_4.16' of git://git./linux/kernel/git/bp/bp

Pull EDAC updates from Borislav Petkov:

- new EDAC driver for some TI SOCs (Tero Kristo)

- small cleanups

* tag 'edac_for_4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp:
  EDAC, mv64x60: Fix an error handling path
  EDAC, ti: Add support for TI keystone and DRA7xx EDAC
  EDAC, octeon: Fix an uninitialized variable warning

Merge tag 'regmap-v4.16' of git://git./linux/kernel/git/broonie/regmap

Pull regmap updates from Mark Brown:
"A very busy release for regmap, all fairly specialist stuff but
  useful:

   - Support for disabling locking from Bartosz Golaszewski, allowing
     users that handle their own locking to save some overhead.

   - Support for hwspinlocks in syscons in MFD from Baolin Wang, this is
     going through the regmap tree since the first users turned up some
     some cases that needed interface tweaks with 0 being used as a
     syscon identifier.

   - Support for devices with no read or write flag from Andrew F.
     Davis.

   - Basic support for devices on SoundWire buses from Vinod Koul"

* tag 'regmap-v4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
  mfd: syscon: Add hardware spinlock support
  regmap: Allow empty read/write_flag_mask
  regcache: flat: Un-inline index lookup from cache access
  regmap: Add SoundWire bus support
  regmap: Add one flag to indicate if a hwlock should be used
  regmap: debugfs: document why we don't create the debugfs entries
  regmap: debugfs: emit a debug message when locking is disabled
  regmap: use proper part of work_buf for storing val
  regmap: potentially duplicate the name string stored in regmap
  regmap: Disable debugfs when locking is disabled
  regmap: rename regmap_lock_unlock_empty() to regmap_lock_unlock_none()
  regmap: allow to disable all locking mechanisms
  regmap: Remove the redundant config to select hwspinlock

Merge tag 'regulator-v4.16' of git://git./linux/kernel/git/broonie/regulator

Pull regulator updates from Mark Brown:
"This is a quiet release in terms of code volume but a fairly big one
  in terms of framework changes - we've got one long awaited feature in
  the form of runtime configuration of suspend and the start of coupled
  regulator support too:

   - Support for modifying the voltage and enable configuration devices
     will have in suspend, contributed by Chunyan Zhang.

   - Support for the Spreadtrum SC2731, contributed by Erick Chen.

   - The start of changes to support coupled regulators from Maciej
     Purski, the rest of the series should arrive for v4.17"

* tag 'regulator-v4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
  regulator: Fix build error
  regulator: core: Refactor regulator_list_voltage()
  regulator: core: Move of_find_regulator_by_node() to of_regulator.c
  regulator: add PM suspend and resume hooks
  regulator: empty the old suspend functions
  regulator: leave one item to record whether regulator is enabled
  regulator: make regulator voltage be an array to support more states
  regulator: added support for suspend states
  regulator: qcom_spmi: Use regmap helpers for enable/disable/is_enabled callback
  regulator: sc2731: Fix defines for SC2731_WR_UNLOCK and SC2731_PWR_WR_PROT_VALUE
  regulator: fix incorrect indentation of two assignment statements
  regulator: sc2731: Add regulator driver to support Spreadtrum SC2731 PMIC
  regulator: Add Spreadtrum SC2731 regulator documentation
  regulator: Update code examples in documentation
  MAINTAINERS: regulator: Add Documentation/power/regulator/
  regulator: tps65218: Add NULL test for devm_kzalloc call
  regulator: tps65218: Remove unused enum tps65218_regulators

Merge tag 'spi-v4.16' of git://git./linux/kernel/git/broonie/spi

Pull spi updates from Mark Brown:
"Quite a quiet release for SPI, there are no changes at all to the core
  and not that many changes to drivers. Highlights of those driver
  changes include:

   - SH MSIOF support for GPIO chip selects contributed by Geert
     Uytterhoeven.

   - Full duplex support for a3700 contributed by Maxime Chevallier.

   - Support for DMA transfers on Atmel devices that require a bounce
     buffer contributed by Radu Pirea"

* tag 'spi-v4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: (31 commits)
  spi: dw: Remove unused members from struct chip_data
  spi: orion: Fix a resource leak if the optional "axi" clk is deferred
  spi: a3700: Remove endianness swapping for full-duplex transfers
  spi: a3700: Remove endianness swapping functions when accessing FIFOs
  spi: a3700: Add full-duplex support
  spi: a3700: Allow to enable or disable FIFO mode
  spi: a3700: Set frequency limits at startup
  spi: a3700: Clear DATA_OUT when performing a read
  spi: orion: Fix clock resource by adding an optional bus clock
  spi: s3c64xx: add SPDX identifier
  spi: imx: do not access registers while clocks disabled
  spi: atmel: Implements transfers with bounce buffer
  spi: sh-msiof: Fix timeout failures for TX-only DMA transfers
  spi: spi-fsl-dspi: account for const type of of_device_id.data
  spi: bcm53xx: simplify reading SPI data
  spi: sirf: account for const type of of_device_id.data
  spi: pxa2xx: Use gpiod_put() not gpiod_free()
  spi: pxa2xx: avoid redundant gpio_to_desc(desc_to_gpio()) round-trip
  spi: sh-msiof: Document hardware limitations related to chip selects
  spi: sh-msiof: Implement cs-gpios configuration
  ...

Merge tag 'mmc-v4.16' of git://git./linux/kernel/git/ulfh/mmc

Pull MMC updates from Ulf Hansson:
"There are two major achievements for MMC in this release, which
  deserves to be specially highlighted.

  First, we have converted the MMC block device from using the legacy
  blk interface into using the modern blkmq interface. Not only do we
  get all the nice effects from using blkmq, but it also means that new
  fresh nice code replaces old rusty code. Great news to everybody that
  cares about MMC/SD!

  It should also be noted that converting to blkmq has not been trivial,
  mostly because of that we have been carrying too much of MMC specific
  optimizations for the I/O request path, rather than striving to move
  these to the generic blk layer. Hopefully we won't be doing that
  mistake, ever again.

  Special thanks to Adrian Hunter (Intel) and to Linus Walleij (Linaro),
  who both have been working on this for quite some time!

  Second, on top of the blkmq deployment, we have enabled full support
  the eMMC command queuing feature, introduced in the eMMC v.5.1 spec.
  This also includes an implementation of a host driver library,
  supporting the corresponding CQHCI HW. Ideally, those controllers that
  supports CQHCI should only need some minor adaptations to make this
  play.

  So far the sdhci-pci driver for the Intel GLKs and the sdhci-of-arasan
  driver used on Rockchip RK3399, have enabled support for eMMC command
  queueing.

  Worth to highlight is also that, implementing the eMMC command queuing
  support has been a collaborative effort, as several people from
  Codeaurora, Rockchip, Intel and Linaro have been involved. However,
  the work has been driven by Adrian Hunter (Intel).

  In some shadow of the above, here are the rest of the highlights:

  MMC core:
   - Don't remove non-removable cards during system suspend
   - Add a slot-gpio helper to check capability of GPIO WP detection

  MMC host:
   - sdhci: Cleanups and improvements of some wakeup related code
   - sdhci-pci-arasan: New variant to support Arasan PCI HW with integrated phy
   - sdhci-acpi: Avoid broken UHS transfer modes on Intel CHT
   - sdhci-acpi: Add support for ACPI HID of AMD Controller with HS400
   - sdhci_f_sdh30: Add ACPI support
   - sdhci-esdhc-imx: Enable/disable clock at runtime suspend/resume
   - sdhci-of-esdhc: A few minor fixes
   - mmci: Add support for new STM32 variant
   - renesas_sdhi: enable R-Car D3 (r8a77995) support
   - tmio/renesas_sdhi: Re-structuring, cleanups and modernizations"

* tag 'mmc-v4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: (96 commits)
  mmc: mmci: fix error return code in mmci_probe()
  mmc: davinci: suppress error message on EPROBE_DEFER
  mmc: davinci: dont' use module_platform_driver_probe()
  mmc: tmio: hide unused tmio_mmc_clk_disable/tmio_mmc_clk_enable functions
  mmc: mmci: Add STM32 variant
  mmc: mmci: Add support for setting pad type via pinctrl
  mmc: mmci: Don't pretend all variants to have OPENDRAIN bit
  mmc: mmci: Don't pretend all variants to have MCI_STARBITERR flag
  mmc: mmci: Don't pretend all variants to have MMCIMASK1 register
  mmc: tmio: refactor .get_ro hook
  mmc: slot-gpio: add a helper to check capability of GPIO WP detection
  mmc: tmio: remove dma_ops from tmio_mmc_host_probe() argument
  mmc: tmio: move {tmio_}mmc_of_parse() to tmio_mmc_host_alloc()
  mmc: tmio: move clk_enable/disable out of tmio_mmc_host_probe()
  mmc: tmio: ioremap memory resource in tmio_mmc_host_alloc()
  mmc: sh_mmcif: remove redundant initialization of 'opc'
  mmc: sdhci: Rework sdhci_enable_irq_wakeups()
  mmc: sdhci: Handle failure of enable_irq_wake()
  mmc: sdhci: Stop exporting sdhci_enable_irq_wakeups()
  mmc: sdhci-pci: Use device wakeup capability to determine MMC_PM_WAKE_SDIO_IRQ capability
  ...

Merge tag 'hwmon-for-linus-v4.16' of git://git./linux/kernel/git/groeck/linux-staging

Pull hwmon updates from Guenter Roeck:

- New driver for W83773G

- Fan control support for PMBus drivers

- Improvements and minor fixes in several drivers

* tag 'hwmon-for-linus-v4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: (32 commits)
  hwmon: (dell-smm) Disable fan support for Dell Vostro 3360
  hwmon: (dell-smm) Disable fan support for Dell Inspiron 7720
  hwmon: (dell-smm) Enable broken functionality via "force" module param
  hwmon: (k10temp) Add temperature offset for Ryzen 1900X
  hwmon: (lm75) Fix trailing semicolon
  hwmon: (ina2xx) Fix access to uninitialized mutex
  hwmon: (pmbus/ir35221) Remove unnecessary scaling
  hwmon: (sht3x) wait predefined limits loading complete before access
  hwmon: (pmbus/ibm-cffps) Add dependency on LEDS_CLASS
  hwmon: (pmbus/cffps) Add led class device for power supply fault led
  hwmon: (pmbus) cffps: Add PMBUS_SKIP_STATUS_CHECK
  hwmon: (aspeed-pwm-tacho) Deassert reset in probe
  dt-bindings: hwmon: aspeed-pwm-tacho: Add reset node
  hwmon: (pmbus) cffps: Add debugfs entries
  hwmon: (pmbus) Export pmbus device debugfs directory entry
  hwmon: (w83773g) Fix fault detection and reporting
  hwmon: (hih6130) Fix documentation of struct hih6130
  hwmon: (iio_hwmon) Fix documentation of struct iio_hwmon_state
  hwmon: (sht15) Fix parameter documentation of sht15_crc8()
  hwmon: (sht21) Fix documentation of struct sht21
  ...

Merge tag 'mtd/for-4.16' of git://git.infradead.org/linux-mtd

Pull MTD updates from Boris Brezillon:
"MTD core changes:
   - Rework core functions to avoid duplicating generic checks in
     NAND/OneNAND sub-layers
   - Update the MAINTAINERS entry to reflect the fact that MTD
     maintainers now use a single git tree

  MTD driver changes:
   - CFI: use macros instead of inline functions to limit stack usage
     and make KASAN happy

  NAND core changes:
   - Fix NAND_CMD_NONE handling in nand_command[_lp]() hooks
   - Introduce the ->exec_op() infrastructure
   - Rework NAND buffers handling
   - Fix ECC requirements for K9F4G08U0D
   - Fix nand_do_read_oob() to return the number of bitflips
   - Mark K9F1G08U0E as not supporting subpage writes

  NAND driver changes:
   - MTK: Rework the driver to support new IP versions
   - OMAP OneNAND: Full rework to use new APIs (libgpio, dmaengine) and
     fix DT support
   - Marvell: Add a new driver to replace the pxa3xx one

  SPI NOR core changes:
   - Add support to new ISSI and Cypress/Spansion memory parts.
   - Fix support of Micron memories by checking error bits in the FSR.
   - Fix update of block-protection bits by reading back the SR.
   - Restore the internal state of the SPI flash memory when removing
     the device.

  SPI NOR driver changes:
   - Maintenance for Freescale, Intel and Metiatek drivers.
   - Add support of the direct access mode for the Cadence QSPI
     controller"

* tag 'mtd/for-4.16' of git://git.infradead.org/linux-mtd: (93 commits)
  mtd: nand: sunxi: Fix ECC strength choice
  mtd: nand: gpmi: Fix subpage reads
  mtd: nand: Fix build issues due to an anonymous union
  mtd: nand: marvell: Fix missing memory allocation modifier
  mtd: nand: marvell: remove redundant variable 'oob_len'
  mtd: nand: marvell: fix spelling mistake: "suceed"-> "succeed"
  mtd: onenand: omap2: Remove redundant dev_err call in omap2_onenand_probe()
  mtd: Remove duplicate checks on mtd_oob_ops parameter
  mtd: Fallback to ->_read/write_oob() when ->_read/write() is missing
  mtd: mtdpart: Make ECC stat handling consistent
  mtd: onenand: omap2: print resource using %pR format string
  mtd: mtk-nor: modify functions' name more generally
  mtd: onenand: samsung: remove incorrect __iomem annotation
  MAINTAINERS: Add entry for Marvell NAND controller driver
  ARM: OMAP2+: Remove gpmc-onenand
  mtd: onenand: omap2: Configure driver from DT
  mtd: onenand: omap2: Decouple DMA enabling from INT pin availability
  mtd: onenand: omap2: Do not make delay for GPIO OMAP3 specific
  mtd: onenand: omap2: Convert to use dmaengine for memcpy
  mtd: onenand: omap2: Unify OMAP2 and OMAP3 DMA implementation
  ...

Merge tag 'for-backlight-next-4.16' of git://git./linux/kernel/git/lee/backlight

Pull backlight updates from Lee Jones:
"Fix-ups:
   - Deprecate pci_get_bus_and_slot() in apple_bl

  Bug Fixes:
   - Enable Chip Select when conducting SPI transfers in corgi_lcd,
     tdo24m, tosa_lcd"

* tag 'for-backlight-next-4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight:
  backlight: tdo24m: Fix the SPI CS between transfers
  backlight: apple_bl: Deprecate pci_get_bus_and_slot()

Merge tag 'mfd-next-4.16' of git://git./linux/kernel/git/lee/mfd

Pull MFD updates from Lee Jones:
"New Drivers:
   - Add support for RAVE Supervisory Processor

  Moved drivers:
   - Move Realtek Card Reader Driver to Misc

  New Device Support:
   - Add support for Pinctrl to axp20x

  New Functionality:
   - Add resume support to atmel-flexcom

  Fix-ups:
   - Split MFD (mfd) and userspace handlers (platform) in cros_ec
   - Fix trivial (whitespace, spelling) issue(s) in pcf50633-core
   - Clean-up error handling in ab8500-debugfs
   - General tidying up in tmio_core
   - Kconfig fix-ups for qcom-pm8xxx
   - Licensing changes (SPDX) to stm32-lptimer, stm32-timers
   - Device Tree fixups in mc13xxx
   - Simplify/remove unused code in cros_ec_spi, axp20x, ti_am335x_tscadc,
     kempld-core, intel_soc_pmic_core.c, ab8500-debugfs"

* tag 'mfd-next-4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd: (32 commits)
  mfd: lpc_ich: Do not touch SPI-NOR write protection bit on Apollo Lake
  mfd: axp20x: Mark axp288 CHRG_BAK_CTRL register volatile
  mfd: ab8500: Introduce DEFINE_SHOW_ATTRIBUTE() macro
  atmel_flexcom: Support resuming after a chip reset
  mfd: Remove duplicate includes
  dt-bindings: mfd: mc13xxx: Add the unit address to sysled
  mfd: stm32: Adopt SPDX identifier
  mfd: axp20x: Add pinctrl cell for AXP813
  mfd: pm8xxx: Make elegible for COMPILE_TEST
  mfd: kempld-core: Use resource_size function on resource object
  mfd: tmio: Move register macros to tmio_core.c
  mfd: cros ec: spi: Simplify delay handling between SPI messages
  mfd: palmas: Assign the right powerhold mask for tps65917
  mfd: ab8500-debugfs: Use common error handling code in ab8500_print_modem_registers()
  mfd: ti_am335x_tscadc: Remove redundant assignment to node
  mfd: pcf50633: Fix spelling mistake: 'Falied' -> 'Failed'
  dt-bindings: watchdog: Add bindings for RAVE SP watchdog driver
  watchdog: Add RAVE SP watchdog driver
  mfd: Add driver for RAVE Supervisory Processor
  serdev: Introduce devm_serdev_device_open()
  ...

Merge tag 'pnp-4.16-rc1' of git://git./linux/kernel/git/rafael/linux-pm

Pull PNP updates from Rafael Wysocki:
"These make pnpbios_thread_init() use PTR_ERR_OR_ZERO() and remove an
  unnecessary kallsyms include from drivers/pnp/quirks.c (Vasyl
  Gomonovych, Sergey Senozhatsky)"

* tag 'pnp-4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PNP: pnpbios: Use PTR_ERR_OR_ZERO()
  PNP: remove unneeded kallsyms include

Merge tag 'acpi-4.16-rc1' of git://git./linux/kernel/git/rafael/linux-pm

Pull ACPI updates from Rafael Wysocki:
"The majority of this is an update of the ACPICA kernel code to
  upstream revision 20171215 with a cosmetic change and a maintainers
  information update on top of it.

  The rest is mostly some minor fixes and cleanups in the ACPI drivers
  and cleanups to initialization on x86.

  Specifics:

   - Update the ACPICA kernel code to upstream revision 20171215 including:
      * Support for ACPI 6.0A changes in the NFIT table (Bob Moore)
      * Local 64-bit divide in string conversions (Bob Moore)
      * Fix for a regression in acpi_evaluate_object_type() (Bob Moore)
      * Fixes for memory leaks during package object resolution (Bob
        Moore)
      * Deployment of safe version of strncpy() (Bob Moore)
      * Debug and messaging updates (Bob Moore)
      * Support for PDTT, SDEV, TPM2 tables in iASL and tools (Bob
        Moore)
      * Null pointer dereference avoidance in Op and cleanups (Colin Ian
        King)
      * Fix for memory leak from building prefixed pathname (Erik
        Schmauss)
      * Coding style fixes, disassembler and compiler updates (Hanjun
        Guo, Erik Schmauss)
      * Additional PPTT flags from ACPI 6.2 (Jeremy Linton)
      * Fix for an off-by-one error in acpi_get_timer_duration()
        (Jung-uk Kim)
      * Infinite loop detection timeout and utilities cleanups (Lv
        Zheng)
      * Windows 10 version 1607 and 1703 OSI strings (Mario
        Limonciello)

   - Update ACPICA information in MAINTAINERS to reflect the current
     status of ACPICA maintenance and rename a local variable in one
     function to match the corresponding upstream code (Rafael Wysocki)

   - Clean up ACPI-related initialization on x86 (Andy Shevchenko)

   - Add support for Intel Merrifield to the ACPI GPIO code (Andy
     Shevchenko)

   - Clean up ACPI PMIC drivers (Andy Shevchenko, Arvind Yadav)

   - Fix the ACPI Generic Event Device (GED) driver to free IRQs on
     shutdown and clean up the PCI IRQ Link driver (Sinan Kaya)

   - Make the GHES code call into the AER driver on all errors and clean
     up the ACPI APEI code (Colin Ian King, Tyler Baicar)

   - Make the IA64 ACPI NUMA code parse all SRAT entries (Ganapatrao
     Kulkarni)

   - Add a lid switch blacklist to the ACPI button driver and make it
     print extra debug messages on lid events (Hans de Goede)

   - Add quirks for Asus GL502VSK and UX305LA to the ACPI battery driver
     and clean it up somewhat (Bjørn Mork, Kai-Heng Feng)

   - Add device link for CHT SD card dependency on I2C to the ACPI LPSS
     (Intel SoCs) driver and make it avoid creating platform device
     objects for devices without MMIO resources (Adrian Hunter, Hans de
     Goede)

   - Fix the ACPI GPE mask kernel command line parameter handling
     (Prarit Bhargava)

   - Fix the handling of (incorrectly exposed) backlight interfaces
     without LCD (Hans de Goede)

   - Fix the usage of debugfs_create_*() in the ACPI EC driver (Geert
     Uytterhoeven)"

* tag 'acpi-4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (62 commits)
  ACPI/PCI: pci_link: reduce verbosity when IRQ is enabled
  ACPI / LPSS: Do not instiate platform_dev for devs without MMIO resources
  ACPI / PMIC: Convert to use builtin_platform_driver() macro
  ACPI / x86: boot: Propagate error code in acpi_gsi_to_irq()
  ACPICA: Update version to 20171215
  ACPICA: trivial style fix, no functional change
  ACPICA: Fix a couple memory leaks during package object resolution
  ACPICA: Recognize the Windows 10 version 1607 and 1703 OSI strings
  ACPICA: DT compiler: prevent error if optional field at the end of table is not present
  ACPICA: Rename a global variable, no functional change
  ACPICA: Create and deploy safe version of strncpy
  ACPICA: Cleanup the global variables and update comments
  ACPICA: Debugger: fix slight indentation issue
  ACPICA: Fix a regression in the acpi_evaluate_object_type() interface
  ACPICA: Update for a few debug output statements
  ACPICA: Debug output, no functional change
  ACPI: EC: Fix debugfs_create_*() usage
  ACPI / video: Default lcd_only to true on Win8-ready and newer machines
  ACPI / x86: boot: Don't setup SCI on HW-reduced platforms
  ACPI / x86: boot: Use INVALID_ACPI_IRQ instead of 0 for acpi_sci_override_gsi
  ...

Merge tag 'pm-4.16-rc1' of git://git./linux/kernel/git/rafael/linux-pm

Pull power management updates from Rafael Wysocki:
"This includes some infrastructure changes in the PM core, mostly
  related to integration between runtime PM and system-wide suspend and
  hibernation, plus some driver changes depending on them and fixes for
  issues in that area which have become quite apparent recently.

  Also included are changes making more x86-based systems use the Low
  Power Sleep S0 _DSM interface by default, which turned out to be
  necessary to handle power button wakeups from suspend-to-idle on
  Surface Pro3.

  On the cpufreq front we have fixes and cleanups in the core, some new
  hardware support, driver updates and the removal of some unused code
  from the CPU cooling thermal driver.

  Apart from this, the Operating Performance Points (OPP) framework is
  prepared to be used with power domains in the future and there is a
  usual bunch of assorted fixes and cleanups.

  Specifics:

   - Define a PM driver flag allowing drivers to request that their
     devices be left in suspend after system-wide transitions to the
     working state if possible and add support for it to the PCI bus
     type and the ACPI PM domain (Rafael Wysocki).

   - Make the PM core carry out optimizations for devices with driver PM
     flags set in some cases and make a few drivers set those flags
     (Rafael Wysocki).

   - Fix and clean up wrapper routines allowing runtime PM device
     callbacks to be re-used for system-wide PM, change the generic
     power domains (genpd) framework to stop using those routines
     incorrectly and fix up a driver depending on that behavior of genpd
     (Rafael Wysocki, Ulf Hansson, Geert Uytterhoeven).

   - Fix and clean up the PM core's device wakeup framework and
     re-factor system-wide PM core code related to device wakeup
     (Rafael Wysocki, Ulf Hansson, Brian Norris).

   - Make more x86-based systems use the Low Power Sleep S0 _DSM
     interface by default (to fix power button wakeup from
     suspend-to-idle on Surface Pro3) and add a kernel command line
     switch to tell it to ignore the system sleep blacklist in the ACPI
     core (Rafael Wysocki).

   - Fix a race condition related to cpufreq governor module removal and
     clean up the governor management code in the cpufreq core (Rafael
     Wysocki).

   - Drop the unused generic code related to the handling of the static
     power energy usage model in the CPU cooling thermal driver along
     with the corresponding documentation (Viresh Kumar).

   - Add mt2712 support to the Mediatek cpufreq driver (Andrew-sh
     Cheng).

   - Add a new operating point to the imx6ul and imx6q cpufreq drivers
     and switch the latter to using clk_bulk_get() (Anson Huang, Dong
     Aisheng).

   - Add support for multiple regulators to the TI cpufreq driver along
     with a new DT binding related to that and clean up that driver
     somewhat (Dave Gerlach).

   - Fix a powernv cpufreq driver regression leading to incorrect CPU
     frequency reporting, fix that driver to deal with non-continguous
     P-states correctly and clean it up (Gautham Shenoy, Shilpasri
     Bhat).

   - Add support for frequency scaling on Armada 37xx SoCs through the
     generic DT cpufreq driver (Gregory CLEMENT).

   - Fix error code paths in the mvebu cpufreq driver (Gregory CLEMENT).

   - Fix a transition delay setting regression in the longhaul cpufreq
     driver (Viresh Kumar).

   - Add Skylake X (server) support to the intel_pstate cpufreq driver
     and clean up that driver somewhat (Srinivas Pandruvada).

   - Clean up the cpufreq statistics collection code (Viresh Kumar).

   - Drop cluster terminology and dependency on physical_package_id from
     the PSCI driver and drop dependency on arm_big_little from the SCPI
     cpufreq driver (Sudeep Holla).

   - Add support for system-wide suspend and resume to the RAPL power
     capping driver and drop a redundant semicolon from it (Zhen Han,
     Luis de Bethencourt).

   - Make SPI domain validation (in the SCSI SPI transport driver) and
     system-wide suspend mutually exclusive as they rely on the same
     underlying mechanism and cannot be carried out at the same time
     (Bart Van Assche).

   - Fix the computation of the amount of memory to preallocate in the
     hibernation core and clean up one function in there (Rainer Fiebig,
     Kyungsik Lee).

   - Prepare the Operating Performance Points (OPP) framework for being
     used with power domains and clean up one function in it (Viresh
     Kumar, Wei Yongjun).

   - Clean up the generic sysfs interface for device PM (Andy
     Shevchenko).

   - Fix several minor issues in power management frameworks and clean
     them up a bit (Arvind Yadav, Bjorn Andersson, Geert Uytterhoeven,
     Gustavo Silva, Julia Lawall, Luis de Bethencourt, Paul Gortmaker,
     Sergey Senozhatsky, gaurav jindal).

   - Make it easier to disable PM via Kconfig (Mark Brown).

   - Clean up the cpupower and intel_pstate_tracer utilities (Doug
     Smythies, Laura Abbott)"

* tag 'pm-4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (89 commits)
  PCI / PM: Remove spurious semicolon
  cpufreq: scpi: remove arm_big_little dependency
  drivers: psci: remove cluster terminology and dependency on physical_package_id
  powercap: intel_rapl: Fix trailing semicolon
  dmaengine: rcar-dmac: Make DMAC reinit during system resume explicit
  PM / runtime: Allow no callbacks in pm_runtime_force_suspend|resume()
  PM / hibernate: Drop unused parameter of enough_swap
  PM / runtime: Check ignore_children in pm_runtime_need_not_resume()
  PM / runtime: Rework pm_runtime_force_suspend/resume()
  PM / genpd: Stop/start devices without pm_runtime_force_suspend/resume()
  cpufreq: powernv: Dont assume distinct pstate values for nominal and pmin
  cpufreq: intel_pstate: Add Skylake servers support
  cpufreq: intel_pstate: Replace bxt_funcs with core_funcs
  platform/x86: surfacepro3: Support for wakeup from suspend-to-idle
  ACPI / PM: Use Low Power S0 Idle on more systems
  PM / wakeup: Print warn if device gets enabled as wakeup source during sleep
  PM / domains: Don't skip driver's ->suspend|resume_noirq() callbacks
  PM / core: Propagate wakeup_path status flag in __device_suspend_late()
  PM / core: Re-structure code for clearing the direct_complete flag
  powercap: add suspend and resume mechanism for SOC power limit
  ...

Merge tag 'sound-4.16-rc1' of git://git./linux/kernel/git/tiwai/sound

Pull sound updates from Takashi Iwai:
"The major changes in the core API side in this cycle are the still
  on-going ASoC componentization works. Other than that, only few small
  changes such as 20bit PCM format support are found.

  Meanwhile the rest majority of changes are for ASoC drivers:

   - Large cleanups of some of the TI CODEC drivers

   - Continued work on Intel ASoC stuff for new quirks, ACPI GPIO
     handling, Kconfigs and lots of cleanups

   - Refactoring of the Freescale SSI driver, as preliminary work for
     the upcoming changes

   - Work on ST DFSDM driver, including the required IIO patches

   - New drivers for Allwinner A83T, Maxim MAX89373, SocioNext UiniPhier
     EVEA Tempo Semiconductor TSCS42xx and TI PCM816x, TAS5722 and
     TAS6424 devices

   - Removal of dead codes for SN95031 and board drivers

  Last but not least, a few HD-audio and USB-audio quirks are included
  as usual, too"

* tag 'sound-4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (303 commits)
  ALSA: hda - Reduce the suspend time consumption for ALC256
  ASoC: use seq_file to dump the contents of dai_list,platform_list and codec_list
  ASoC: soc-core: add missing EXPORT_SYMBOL_GPL() for snd_soc_rtdcom_lookup
  IIO: ADC: stm32-dfsdm: remove unused variable again
  ASoC: bcm2835: fix hw_params error when device is in prepared state
  ASoC: mxs-sgtl5000: Do not print error on probe deferral
  ASoC: sgtl5000: Do not print error on probe deferral
  ASoC: Intel: remove select on non-existing SND_SOC_INTEL_COMMON
  ALSA: usb-audio: Support changing input on Sound Blaster E1
  ASoC: Intel: remove second duplicated assignment to pointer 'res'
  ALSA: hda/realtek - update ALC215 depop optimize
  ALSA: hda/realtek - Support headset mode for ALC215/ALC285/ALC289
  ALSA: pcm: Fix trailing semicolon
  ASoC: add Component level .read/.write
  ASoC: cx20442: fix regression by adding back .read/.write
  ASoC: uda1380: fix regression by adding back .read/.write
  ASoC: tlv320dac33: fix regression by adding back .read/.write
  ALSA: hda - Use IS_REACHABLE() for dependency on input
  IIO: ADC: stm32-dfsdm: fix static check warning
  IIO: ADC: stm32-dfsdm: code optimization
  ...

Merge tag 'init_task-20180117' of git://git./linux/kernel/git/dhowells/linux-fs

Pull init_task initializer cleanups from David Howells:
"It doesn't seem useful to have the init_task in a header file rather
  than in a normal source file. We could consolidate init_task handling
  instead and expand out various macros.

  Here's a series of patches that consolidate init_task handling:

   (1) Make THREAD_SIZE available to vmlinux.lds for cris, hexagon and
       openrisc.

   (2) Alter the INIT_TASK_DATA linker script macro to set
       init_thread_union and init_stack rather than defining these in C.

       Insert init_task and init_thread_into into the init_stack area in
       the linker script as appropriate to the configuration, with
       different section markers so that they end up correctly ordered.

       We can then get merge ia64's init_task.c into the main one.

       We then have a bunch of single-use INIT_*() macros that seem only
       to be macros because they used to be used per-arch. We can then
       expand these in place of the user and get rid of a few lines and
       a lot of backslashes.

   (3) Expand INIT_TASK() in place.

   (4) Expand in place various small INIT_*() macros that are defined
       conditionally. Expand them and surround them by #if[n]def/#endif
       in the .c file as it takes fewer lines.

   (5) Expand INIT_SIGNALS() and INIT_SIGHAND() in place.

   (6) Expand INIT_STRUCT_PID in place.

  These macros can then be discarded"

* tag 'init_task-20180117' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
  Expand INIT_STRUCT_PID and remove
  Expand the INIT_SIGNALS and INIT_SIGHAND macros and remove
  Expand various INIT_* macros and remove
  Expand INIT_TASK() in init/init_task.c and remove
  Construct init thread stack in the linker script rather than by union
  openrisc: Make THREAD_SIZE available to vmlinux.lds
  hexagon: Make THREAD_SIZE available to vmlinux.lds
  cris: Make THREAD_SIZE available to vmlinux.lds

Merge tag 'nand/for-4.16' of git://git.infradead.org/linux-mtd into mtd/next

Pull NAND changes from Boris Brezillon:

"
  Core changes:
  * Fix NAND_CMD_NONE handling in nand_command[_lp]() hooks
  * Introduce the ->exec_op() infrastructure
  * Rework NAND buffers handling
  * Fix ECC requirements for K9F4G08U0D
  * Fix nand_do_read_oob() to return the number of bitflips
  * Mark K9F1G08U0E as not supporting subpage writes

  Driver changes:
  * MTK: Rework the driver to support new IP versions
  * OMAP OneNAND: Full rework to use new APIs (libgpio, dmaengine) and fix
    DT support
  * Marvell: Add a new driver to replace the pxa3xx one
"

Merge tag 'spi-nor/for-4.16' of git://git.infradead.org/linux-mtd into mtd/next

Pull spi-nor changes from Cyrille Pitchen:

"
  This pull-request contains the following notable changes:

  Core changes:
  * Add support to new ISSI and Cypress/Spansion memory parts.
  * Fix support of Micron memories by checking error bits in the FSR.
  * Fix update of block-protection bits by reading back the SR.
  * Restore the internal state of the SPI flash memory when removing the
    device.

  Driver changes:
  * Maintenance for Freescale, Intel and Metiatek drivers.
  * Add support of the direct access mode for the Cadence QSPI controller.
"

Linux 4.15

Merge branch 'x86-pti-for-linus' of git://git./linux/kernel/git/tip/tip

Pull x86 retpoline fixlet from Thomas Gleixner:
"Remove the ESP/RSP thunks for retpoline as they cannot ever work.

Get rid of them before they show up in a release"

* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/retpoline: Remove the esp/rsp thunk

Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull x86 fixes from Thomas Gleixner:
"A set of small fixes for 4.15:

   - Fix vmapped stack synchronization on systems with 4-level paging
     and a large amount of memory caused by a missing 5-level folding
     which made the pgd synchronization logic to fail and causing double
     faults.

   - Add a missing sanity check in the vmalloc_fault() logic on 5-level
     paging systems.

   - Bring back protection against accessing a freed initrd in the
     microcode loader which was lost by a wrong merge conflict
     resolution.

   - Extend the Broadwell micro code loading sanity check.

   - Add a missing ENDPROC annotation in ftrace assembly code which
     makes ORC unhappy.

   - Prevent loading the AMD power module on !AMD platforms. The load
     itself is uncritical, but an unload attempt results in a kernel
     crash.

   - Update Peter Anvins role in the MAINTAINERS file"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/ftrace: Add one more ENDPROC annotation
  x86: Mark hpa as a "Designated Reviewer" for the time being
  x86/mm/64: Tighten up vmalloc_fault() sanity checks on 5-level kernels
  x86/mm/64: Fix vmapped stack syncing on very-large-memory 4-level systems
  x86/microcode: Fix again accessing initrd after having been freed
  x86/microcode/intel: Extend BDW late-loading further with LLC size check
  perf/x86/amd/power: Do not load AMD power module on !AMD platforms

Merge branch 'timers-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull timer fix from Thomas Gleixner:
"A single fix for a ~10 years old problem which causes high resolution
  timers to stop after a CPU unplug/plug cycle due to a stale flag in
  the per CPU hrtimer base struct.

  Paul McKenney was hunting this for about a year, but the heisenbug
  nature made it resistant against debug attempts for quite some time"

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  hrtimer: Reset hrtimer cpu base proper on CPU hotplug

Merge branch 'sched-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull scheduler fix from Thomas Gleixner:
"A single bug fix to prevent a subtle deadlock in the scheduler core
code vs cpu hotplug"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/core: Fix cpu.max vs. cpuhotplug deadlock

Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull perf fixes from Thomas Gleixner:
"Four patches which all address lock inversions and deadlocks in the
  perf core code and the Intel debug store"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86: Fix perf,x86,cpuhp deadlock
  perf/core: Fix ctx::mutex deadlock
  perf/core: Fix another perf,trace,cpuhp lock inversion
  perf/core: Fix lock inversion between perf,trace,cpuhp

Merge branch 'locking-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull locking fixes from Thomas Gleixner:
"Two final locking fixes for 4.15:

   - Repair the OWNER_DIED logic in the futex code which got wreckaged
     with the recent fix for a subtle race condition.

   - Prevent the hard lockup detector from triggering when dumping all
     held locks in the system"

* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/lockdep: Avoid triggering hardlockup from debug_show_all_locks()
  futex: Fix OWNER_DEAD fixup

x86/ftrace: Add one more ENDPROC annotation

When ORC support was added for the ftrace_64.S code, an ENDPROC
for function_hook() was missed. This results in the following warning:

arch/x86/kernel/ftrace_64.o: warning: objtool: .entry.text+0x0: unreachable instruction

Fixes: e2ac83d74a4d ("x86/ftrace: Fix ORC unwinding from ftrace handlers")
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lkml.kernel.org/r/20180128022150.dqierscqmt3uwwsr@treble

hwmon: (dell-smm) Disable fan support for Dell Vostro 3360

Calling fan related SMM functions implemented by Dell BIOS firmware on Dell
Vostro 3360 freeze kernel for about 500ms.

Unfortunately, it is unlikely for Dell to fix this since the machine
is pretty old, so this commit just disables fan support to make the
system usable again.

Via "force" module param fan support can be enabled.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=195751
Link: http://lkml.iu.edu/hypermail/linux/kernel/1711.2/06083.html
Reviewed-by: Pali Rohár <pali.rohar@gmail.com>
Signed-off-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>

hwmon: (dell-smm) Disable fan support for Dell Inspiron 7720

Calling fan related SMM functions implemented by Dell BIOS firmware on Dell
Inspiron 7720 freeze kernel for about 500ms. Until Dell fixes it we need to
disable fan support for Dell Inspiron 7720 as it makes system unusable.

Via "force" module param fan support can be enabled.

Reported-by: vova7890@mail.ru
Signed-off-by: Pali Rohár <pali.rohar@gmail.com>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=195751
Cc: stable@vger.kernel.org # v4.0+, will need backport
Signed-off-by: Guenter Roeck <linux@roeck-us.net>

hwmon: (dell-smm) Enable broken functionality via "force" module param

Some Dell machines are broken and some functionality is disabled. Show
warning into dmesg about this fact and allow user via "force" module param
to override brokenness and enable broken functionality.

Signed-off-by: Pali Rohár <pali.rohar@gmail.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>

hrtimer: Reset hrtimer cpu base proper on CPU hotplug

The hrtimer interrupt code contains a hang detection and mitigation
mechanism, which prevents that a long delayed hrtimer interrupt causes a
continous retriggering of interrupts which prevent the system from making
progress. If a hang is detected then the timer hardware is programmed with
a certain delay into the future and a flag is set in the hrtimer cpu base
which prevents newly enqueued timers from reprogramming the timer hardware
prior to the chosen delay. The subsequent hrtimer interrupt after the delay
clears the flag and resumes normal operation.

If such a hang happens in the last hrtimer interrupt before a CPU is
unplugged then the hang_detected flag is set and stays that way when the
CPU is plugged in again. At that point the timer hardware is not armed and
it cannot be armed because the hang_detected flag is still active, so
nothing clears that flag. As a consequence the CPU does not receive hrtimer
interrupts and no timers expire on that CPU which results in RCU stalls and
other malfunctions.

Clear the flag along with some other less critical members of the hrtimer
cpu base to ensure starting from a clean state when a CPU is plugged in.

Thanks to Paul, Sebastian and Anna-Maria for their help to get down to the
root cause of that hard to reproduce heisenbug. Once understood it's
trivial and certainly justifies a brown paperbag.

Fixes: 41d2e4949377 ("hrtimer: Tune hrtimer_interrupt hang logic")
Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Sewior <bigeasy@linutronix.de>
Cc: Anna-Maria Gleixner <anna-maria@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801261447590.2067@nanos

x86: Mark hpa as a "Designated Reviewer" for the time being

Due to some unfortunate events, I have not been directly involved in
the x86 kernel patch flow for a while now.  I have also not been able
to ramp back up by now like I had hoped to, and after reviewing what I
will need to work on both internally at Intel and elsewhere in the near
term, it is clear that I am not going to be able to ramp back up until
late 2018 at the very earliest.

It is not acceptable to not recognize that this load is currently
taken by Ingo and Thomas without my direct participation, so I mark
myself as R: (designated reviewer) rather than M: (maintainer) until
further notice.  This is in fact recognizing the de facto situation
for the past few years.

I have obviously no intention of going away, and I will do everything
within my power to improve Linux on x86 and x86 for Linux.  This,
however, puts credit where it is due and reflects a change of focus.

This patch also removes stale entries for portions of the x86
architecture which have not been maintained separately from arch/x86
for a long time.  If there is a reason to re-introduce them then that
can happen later.

Signed-off-by: H. Peter Anvin <h.peter.anvin@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Bruce Schlobohm <bruce.schlobohm@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180125195934.5253-1-hpa@zytor.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

Merge tag 'riscv-for-linus-4.15-maintainers' of git://git./linux/kernel/git/palmer/riscv-linux

Pull RISC-V update from Palmer Dabbelt:
"RISC-V: We have a new mailing list and git repo!

  Sorry to send something essentially as late as possible (Friday after
  an rc9), but we managed to get a mailing list for the RISC-V Linux
  port. We've been using patches@groups.riscv.org for a while, but that
  list has some problems (it's Google Groups and it's shared over all
  RISC-V software projects). The new infaread.org list is much better.
  We just got it on Wednesday but I used it a bit on Thursday to shake
  out all the configuration problems and it appears to be in working
  order.

  When I updated the mailing list I noticed that the MAINTAINERS file
  was pointing to our github repo, but now that we have a kernel.org
  repo I'd like to point to that instead so I changed that as well.
  We'll be centralizing all RISC-V Linux related development here as
  that seems to be the saner way to go about it.

  I can understand if it's too late to get this into 4.15, but given
  that it's not a code change I was hoping it'd still be OK. It would be
  nice to have the new mailing list and git repo in the release tarballs
  so when people start to find bugs they'll get to the right place"

* tag 'riscv-for-linus-4.15-maintainers' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux:
  Update the RISC-V MAINTAINERS file

block: remove smart1,2.h

smart1,2.h is unused since commit d436641439e0 ("cpqarray: remove it from the kernel")
Remove it from tree.

Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

Merge branch 'nvme-4.16' of git://git.infradead.org/nvme into for-4.16/block

Pull NVMe fixes from Christoph:

"The additional week before the 4.15 release gave us time for a few more
nvme fixes, as well as the nifty trace points from Johannes"

* 'nvme-4.16' of git://git.infradead.org/nvme:
  nvme: add tracepoint for nvme_complete_rq
  nvme: add tracepoint for nvme_setup_cmd
  nvme-pci: introduce RECONNECTING state to mark initializing procedure
  nvme-rdma: remove redundant boolean for inline_data
  nvme: don't free uuid pointer before printing it
  nvme-pci: Suspend queues after deleting them
  nvme-pci: Fix queue double allocations

Merge remote-tracking branch 'spi/topic/xilinx' into spi-next

Merge remote-tracking branches 'spi/topic/pxa2xx', 'spi/topic/s3c64xx', 'spi/topic/sh-msiof', 'spi/topic/sirf' and 'spi/topic/sun6i' into spi-next

Merge remote-tracking branches 'spi/topic/fsl-dspi', 'spi/topic/imx', 'spi/topic/jcore', 'spi/topic/meson' and 'spi/topic/orion' into spi-next

Merge remote-tracking branches 'spi/topic/a3700', 'spi/topic/atmel', 'spi/topic/bcm53xx', 'spi/topic/davinci' and 'spi/topic/dw' into spi-next

Merge remote-tracking branches 'spi/fix/imx' and 'spi/fix/sh-msiof' into spi-linus

Merge remote-tracking branch 'regulator/topic/tps65218' into regulator-next

Merge remote-tracking branches 'regulator/topic/doc' and 'regulator/topic/sc2731' into regulator-next

Merge remote-tracking branch 'regulator/topic/qcom_spmi' into regulator-next

Merge remote-tracking branch 'regulator/topic/core' into regulator-next

regulator: Fix build error

3d67fe950707 (regulator: core: Refactor regulator_list_voltage()) missed
one user of regulator_list_voltage(), update for that.

Signed-off-by: Mark Brown <broonie@kernel.org>

Merge branch 'topic/suspend' of https://git./linux/kernel/git/broonie/regulator into regulator-core

Merge git://git./linux/kernel/git/davem/net

Pull networking fixes from David Miller:

1) The per-network-namespace loopback device, and thus its namespace,
    can have its teardown deferred for a long time if a kernel created
    TCP socket closes and the namespace is exiting meanwhile. The kernel
    keeps trying to finish the close sequence until it times out (which
    takes quite some time).

    Fix this by forcing the socket closed in this situation, from Dan
    Streetman.

2) Fix regression where we're trying to invoke the update_pmtu method
    on route types (in this case metadata tunnel routes) that don't
    implement the dst_ops method. Fix from Nicolas Dichtel.

3) Fix long standing memory corruption issues in r8169 driver by
    performing the chip statistics DMA programming more correctly. From
    Francois Romieu.

4) Handle local broadcast sends over VRF routes properly, from David
    Ahern.

5) Don't refire the DCCP CCID2 timer endlessly, otherwise the socket
    can never be released. From Alexey Kodanev.

6) Set poll flags properly in VSOCK protocol layer, from Stefan
    Hajnoczi.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  VSOCK: set POLLOUT | POLLWRNORM for TCP_CLOSING
  dccp: don't restart ccid2_hc_tx_rto_expire() if sk in closed state
  net: vrf: Add support for sends to local broadcast address
  r8169: fix memory corruption on retrieval of hardware statistics.
  net: don't call update_pmtu unconditionally
  net: tcp: close sock if net namespace is exiting

Merge tag 'drm-fixes-for-v4.15-rc10-2' of git://people.freedesktop.org/~airlied/linux

Pull drm fixes from Dave Airlie:
"A fairly urgent nouveau regression fix for broken irqs across
  suspend/resume came in. This was broken before but a patch in 4.15 has
  made it much more obviously broken and now s/r fails a lot more often.

  The fix removes freeing the irq across s/r which never should have
  been done anyways.

  Also two vc4 fixes for a NULL deference and some misrendering /
  flickering on screen"

* tag 'drm-fixes-for-v4.15-rc10-2' of git://people.freedesktop.org/~airlied/linux:
  drm/nouveau: Move irq setup/teardown to pci ctor/dtor
  drm/vc4: Fix NULL pointer dereference in vc4_save_hang_state()
  drm/vc4: Flush the caches before the bin jobs, as well.

VSOCK: set POLLOUT | POLLWRNORM for TCP_CLOSING

select(2) with wfds but no rfds must return when the socket is shut down
by the peer. This way userspace notices socket activity and gets -EPIPE
from the next write(2).

Currently select(2) does not return for virtio-vsock when a SEND+RCV
shutdown packet is received. This is because vsock_poll() only sets
POLLOUT | POLLWRNORM for TCP_CLOSE, not the TCP_CLOSING state that the
socket is in when the shutdown is received.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dccp: don't restart ccid2_hc_tx_rto_expire() if sk in closed state

ccid2_hc_tx_rto_expire() timer callback always restarts the timer
again and can run indefinitely (unless it is stopped outside), and after
commit 120e9dabaf55 ("dccp: defer ccid_hc_tx_delete() at dismantle time"),
which moved ccid_hc_tx_delete() (also includes sk_stop_timer()) from
dccp_destroy_sock() to sk_destruct(), this started to happen quite often.
The timer prevents releasing the socket, as a result, sk_destruct() won't
be called.

Found with LTP/dccp_ipsec tests running on the bonding device,
which later couldn't be unloaded after the tests were completed:

unregister_netdevice: waiting for bond0 to become free. Usage count = 148

Fixes: 2a91aa396739 ("[DCCP] CCID2: Initial CCID2 (TCP-Like) implementation")
Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Update the RISC-V MAINTAINERS file

Now that we're upstream in Linux we've been able to make some
infrastructure changes so our port works a bit more like other ports.
Specifically:

* We now have a mailing list specific to the RISC-V Linux port, hosted
at lists.infreadead.org.
* We now have a kernel.org git tree where work on our port is
coordinated.

This patch changes the RISC-V maintainers entry to reflect these new
bits of infrastructure.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>

regulator: core: Refactor regulator_list_voltage()

Change _regulator_list_voltage() argument from regulator to
regulator_dev in order to provide better separation of core layers.
Allow calling _regulator_list_voltage() from functions, with
regulator_dev argument. This refactoring is needed in order to
implement setting voltage of coupled regulators.

Signed-off-by: Maciej Purski <m.purski@samsung.com>
Signed-off-by: Mark Brown <broonie@kernel.org>

regulator: core: Move of_find_regulator_by_node() to of_regulator.c

As of_find_regulator_by_node() is an of function it should be moved from
core.c to of_regulator.c. It provides better separation of device tree
functions from the core and allows other of_functions in of_regulator.c
to resolve device_node to regulator_dev. This will be useful for
implementation of parsing coupled regulators properties.

Declare of_find_regulator_by_node() function in internal.h as well as
regulator_class and dev_to_rdev(), as they are needed by
of_find_regulator_by_node().

Signed-off-by: Maciej Purski <m.purski@samsung.com>
Signed-off-by: Mark Brown <broonie@kernel.org>

x86/mm/64: Tighten up vmalloc_fault() sanity checks on 5-level kernels

On a 5-level kernel, if a non-init mm has a top-level entry, it needs to
match init_mm's, but the vmalloc_fault() code skipped over the BUG_ON()
that would have checked it.

While we're at it, get rid of the rather confusing 4-level folded "pgd"
logic.

Cleans-up: b50858ce3e2a ("x86/mm/vmalloc: Add 5-level paging support")
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Neil Berrington <neil.berrington@datacore.com>
Link: https://lkml.kernel.org/r/2ae598f8c279b0a29baf75df207e6f2fdddc0a1b.1516914529.git.luto@kernel.org

x86/mm/64: Fix vmapped stack syncing on very-large-memory 4-level systems

Neil Berrington reported a double-fault on a VM with 768GB of RAM that uses
large amounts of vmalloc space with PTI enabled.

The cause is that load_new_mm_cr3() was never fixed to take the 5-level pgd
folding code into account, so, on a 4-level kernel, the pgd synchronization
logic compiles away to exactly nothing.

Interestingly, the problem doesn't trigger with nopti.  I assume this is
because the kernel is mapped with global pages if we boot with nopti.  The
sequence of operations when we create a new task is that we first load its
mm while still running on the old stack (which crashes if the old stack is
unmapped in the new mm unless the TLB saves us), then we call
prepare_switch_to(), and then we switch to the new stack.
prepare_switch_to() pokes the new stack directly, which will populate the
mapping through vmalloc_fault().  I assume that we're getting lucky on
non-PTI systems -- the old stack's TLB entry stays alive long enough to
make it all the way through prepare_switch_to() and switch_to() so that we
make it to a valid stack.

Fixes: b50858ce3e2a ("x86/mm/vmalloc: Add 5-level paging support")
Reported-and-tested-by: Neil Berrington <neil.berrington@datacore.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: stable@vger.kernel.org
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Link: https://lkml.kernel.org/r/346541c56caed61abbe693d7d2742b4a380c5001.1516914529.git.luto@kernel.org

spi: dw: Remove unused members from struct chip_data

Local struct chip_data has two members that are not used:

- cs. Looks like was never used
- enable_dma. Became unused by the commit f89a6d8f43eb ("spi: dw-mid: move
to use core SPI DMA mappings").

Signed-off-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Acked-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Mark Brown <broonie@kernel.org>

regulator: add PM suspend and resume hooks

In this patch, consumers are allowed to set suspend voltage, and this
actually just set the "uV" in constraint::regulator_state, when the
regulator_suspend_late() was called by PM core through callback when
the system is entering into suspend, the regulator device would act
suspend activity then.

And it assumes that if any consumer set suspend voltage, the regulator
device should be enabled in the suspend state. And if the suspend
voltage of a regulator device for all consumers was set zero, the
regulator device would be off in the suspend state.

This patch also provides a new function hook to regulator devices for
resuming from suspend states.

Signed-off-by: Chunyan Zhang <zhang.chunyan@linaro.org>
Signed-off-by: Mark Brown <broonie@kernel.org>

regulator: empty the old suspend functions

Regualtor suspend/resume functions should only be called by PM suspend
core via registering dev_pm_ops, and regulator devices should implement
the callback functions. Thus, any regulator consumer shouldn't call
the regulator suspend/resume functions directly.

In order to avoid compile errors, two empty functions with the same name
still be left for the time being.

Signed-off-by: Chunyan Zhang <zhang.chunyan@linaro.org>
Signed-off-by: Mark Brown <broonie@kernel.org>

regulator: leave one item to record whether regulator is enabled

The items "disabled" and "enabled" are a little redundant, since only one
of them would be set to record if the regulator device should keep on
or be switched to off in suspend states.

So in this patch, the "disabled" was removed, only leave the "enabled":
  - enabled == 1 for regulator-on-in-suspend
  - enabled == 0 for regulator-off-in-suspend
  - enabled == -1 means do nothing when entering suspend mode.

Signed-off-by: Chunyan Zhang <zhang.chunyan@linaro.org>
Signed-off-by: Mark Brown <broonie@kernel.org>

regulator: make regulator voltage be an array to support more states

Some regulator consumers would like to make the regulator device
keeping a voltage range output when the system entering into
suspend states.

Making regulator voltage be an array can allow consumers to set voltage
for normal state as well as for suspend states through the same code.

Signed-off-by: Chunyan Zhang <zhang.chunyan@linaro.org>
Signed-off-by: Mark Brown <broonie@kernel.org>

regulator: added support for suspend states

Some systems need to set regulators to specific states when they enter
low power modes, especially around CPUs. There are many of these modes
depending on the particular runtime state.

Currently the regulator consumers are not granted permission to change
suspend state of regulator devices, the constraints are configured at
startup. In order to allow changes in a vlotage range, we need to add
new properties for voltage range and a flag to give permission to
change the suspend voltage and suspend on/off in suspend mode.

Signed-off-by: Chunyan Zhang <zhang.chunyan@linaro.org>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Mark Brown <broonie@kernel.org>

spi: orion: Fix a resource leak if the optional "axi" clk is deferred

If the optional "axi" clk is deferred, we still need to undo some
initialisation. Especially 'master' must be released. It will be
reallocated the next time 'orion_spi_probe()' is called.

Add a new label to clean what needs to be cleaned and rename another
label to improve the names used.

Fixes: 92ae112e477a ("spi: orion: Fix clock resource by adding an optional bus clock")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: Mark Brown <broonie@kernel.org>

nvme: add tracepoint for nvme_complete_rq

Add a tracepoint in nvme_complete_rq() for completions of NVMe commands. An
expmale output of the trace-point is as follows:

<idle>-0 [001] d.h. 3.505266: nvme_complete_rq: cmdid=989, qid=1, res=0, retries=0, flags=0x0, status=0

Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>

nvme: add tracepoint for nvme_setup_cmd

Add tracepoints for nvme_setup_cmd() for tracing admin and/or nvm commands.

Examples of the two tracepoints are as follows for trace_nvme_setup_admin_cmd():
kworker/u8:0-5 [003] .... 2.998792: nvme_setup_admin_cmd: cmdid=14, flags=0x0, meta=0x0, cmd=(nvme_admin_create_cq cqid=1, qsize=1023, cq_flags=0x3, irq_vector=0)

and trace_nvme_setup_nvm_cmd():
dd-205 [001] .... 3.503929: nvme_setup_nvm_cmd: qid=1, nsid=1, cmdid=989, flags=0x0, meta=0x0, cmd=(nvme_cmd_read slba=4096, len=2047, ctrl=0x0, dsmgmt=0, reftag=0)

Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>

nvme-pci: introduce RECONNECTING state to mark initializing procedure

After Sagi's commit (nvme-rdma: fix concurrent reset and reconnect),
both nvme-fc/rdma have following pattern:
RESETTING    - quiesce blk-mq queues, teardown and delete queues/
               connections, clear out outstanding IO requests...
RECONNECTING - establish new queues/connections and some other
               initializing things.
Introduce RECONNECTING to nvme-pci transport to do the same mark.
Then we get a coherent state definition among nvme pci/rdma/fc
transports.

Suggested-by: James Smart <james.smart@broadcom.com>
Reviewed-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Reviewed-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

Merge branch 'linux-4.15' of git://github.com/skeggsb/linux into drm-fixes

Single irq regression fix
* 'linux-4.15' of git://github.com/skeggsb/linux:
drm/nouveau: Move irq setup/teardown to pci ctor/dtor

net: vrf: Add support for sends to local broadcast address

Sukumar reported that sends to the local broadcast address
(255.255.255.255) are broken. Check for the address in vrf driver
and do not redirect to the VRF device - similar to multicast
packets.

With this change sockets can use SO_BINDTODEVICE to specify an
egress interface and receive responses. Note: the egress interface
can not be a VRF device but needs to be the enslaved device.

https://bugzilla.kernel.org/show_bug.cgi?id=198521

Reported-by: Sukumar Gopalakrishnan <sukumarg1973@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

r8169: fix memory corruption on retrieval of hardware statistics.

Hardware statistics retrieval hurts in tight invocation loops.

Avoid extraneous write and enforce strict ordering of writes targeted to
the tally counters dump area address registers.

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Tested-by: Oliver Freyermuth <o.freyermuth@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'for-linus' of git://git./linux/kernel/git/dtor/input

Pull input fixes from Dmitry Torokhov:
"The main item is that we try to better handle the newer trackpoints on
  Lenovo devices that are now being produced by Elan/ALPS/NXP and only
  implement a small subset of the original IBM trackpoint controls"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Revert "Input: synaptics_rmi4 - use devm_device_add_group() for attributes in F01"
  Input: trackpoint - only expose supported controls for Elan, ALPS and NXP
  Input: trackpoint - force 3 buttons if 0 button is reported
  Input: xpad - add support for PDP Xbox One controllers
  Input: stmfts,s6sy671 - add SPDX identifier

orangefs: fix deadlock; do not write i_size in read_iter

After do_readv_writev, the inode cache is invalidated anyway, so i_size
will never be read. It will be fetched from the server which will also
know about updates from other machines.

Fixes deadlock on 32-bit SMP.

See https://marc.info/?l=linux-fsdevel&m=151268557427760&w=2

Signed-off-by: Martin Brandenburg <martin@omnibond.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Mike Marshall <hubcap@omnibond.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

drm/nouveau: Move irq setup/teardown to pci ctor/dtor

For a while we've been having issues with seemingly random interrupts
coming from nvidia cards when resuming them. Originally the fix for this
was thought to be just re-arming the MSI interrupt registers right after
re-allocating our IRQs, however it seems a lot of what we do is both
wrong and not even nessecary.

This was made apparent by what appeared to be a regression in the
mainline kernel that started introducing suspend/resume issues for
nouveau:

a0c9259dc4e1 (irq/matrix: Spread interrupts on allocation)

After this commit was introduced, we started getting interrupts from the
GPU before we actually re-allocated our own IRQ (see references below)
and assigned the IRQ handler. Investigating this turned out that the
problem was not with the commit, but the fact that nouveau even
free/allocates it's irqs before and after suspend/resume.

For starters: drivers in the linux kernel haven't had to handle
freeing/re-allocating their IRQs during suspend/resume cycles for quite
a while now. Nouveau seems to be one of the few drivers left that still
does this, despite the fact there's no reason we actually need to since
disabling interrupts from the device side should be enough, as the
kernel is already smart enough to know to disable host-side interrupts
for us before going into suspend. Since we were tearing down our IRQs by
hand however, that means there was a short period during resume where
interrupts could be received before we re-allocated our IRQ which would
lead to us getting an unhandled IRQ. Since we never handle said IRQ and
re-arm the interrupt registers, this would cause us to miss all of the
interrupts from the GPU and cause our init process to start timing out
on anything requiring interrupts.

So, since this whole setup/teardown every suspend/resume cycle is
useless anyway, move irq setup/teardown into the pci subdev's ctor/dtor
functions instead so they're only called at driver load and driver
unload. This should fix most of the issues with pending interrupts on
resume, along with getting suspend/resume for nouveau to work again.

As well, this probably means we can also just remove the msi rearm call
inside nvkm_pci_init(). But since our main focus here is to fix
suspend/resume before 4.15, we'll save that for a later patch.

Signed-off-by: Lyude Paul <lyude@redhat.com>
Cc: Karol Herbst <kherbst@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Mike Galbraith <efault@gmx.de>
Cc: stable@vger.kernel.org
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

net: don't call update_pmtu unconditionally

Some dst_ops (e.g. md_dst_ops)) doesn't set this handler. It may result to:
"BUG: unable to handle kernel NULL pointer dereference at (null)"

Let's add a helper to check if update_pmtu is available before calling it.

Fixes: 52a589d51f10 ("geneve: update skb dst pmtu on tx path")
Fixes: a93bf0ff4490 ("vxlan: update skb dst pmtu on tx path")
CC: Roman Kapl <code@rkapl.cz>
CC: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

nvme-rdma: remove redundant boolean for inline_data

Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>

nvme: don't free uuid pointer before printing it

Commit df351ef73789 ("nvme-fabrics: fix memory leak when parsing host ID
option") fixed the leak of 'p' but in case uuid_parse() fails the memory
is freed before the error print that is using it.

Free it after printing eventual errors.

Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Fixes: df351ef73789 ("nvme-fabrics: fix memory leak when parsing host ID option")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

Merge tag 'for-linus' of git://git./virt/kvm/kvm

Pull KVM fixes from Radim Krčmář:
"Fix races and a potential use after free in the s390 cmma migration
code"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: s390: add proper locking for CMMA migration bitmap

Merge tag 'for-4.15-tag' of git://git./linux/kernel/git/kdave/linux

Pull btrfs fix from David Sterba:
"It's been reported recently that readdir can list stale entries under
some conditions. Fix it."

* tag 'for-4.15-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
Btrfs: fix stale entries in readdir

net: tcp: close sock if net namespace is exiting

When a tcp socket is closed, if it detects that its net namespace is
exiting, close immediately and do not wait for FIN sequence.

For normal sockets, a reference is taken to their net namespace, so it will
never exit while the socket is open.  However, kernel sockets do not take a
reference to their net namespace, so it may begin exiting while the kernel
socket is still open.  In this case if the kernel socket is a tcp socket,
it will stay open trying to complete its close sequence.  The sock's dst(s)
hold a reference to their interface, which are all transferred to the
namespace's loopback interface when the real interfaces are taken down.
When the namespace tries to take down its loopback interface, it hangs
waiting for all references to the loopback interface to release, which
results in messages like:

unregister_netdevice: waiting for lo to become free. Usage count = 1

These messages continue until the socket finally times out and closes.
Since the net namespace cleanup holds the net_mutex while calling its
registered pernet callbacks, any new net namespace initialization is
blocked until the current net namespace finishes exiting.

After this change, the tcp socket notices the exiting net namespace, and
closes immediately, releasing its dst(s) and their reference to the
loopback interface, which lets the net namespace continue exiting.

Link: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1711407
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=97811
Signed-off-by: Dan Streetman <ddstreet@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

nvme-pci: Suspend queues after deleting them

The driver had been abusing the cq_vector state to know if new submissions
were safe, but that was before we could quiesce blk-mq. If the controller
happens to get an interrupt through while we're suspending those queues,
'no irq handler' warnings may occur.

This patch will disable the interrupts only after the queues are deleted.

Reported-by: Jianchao Wang <jianchao.w.wang@oracle.com>
Tested-by: Jianchao Wang <jianchao.w.wang@oracle.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

perf/x86: Fix perf,x86,cpuhp deadlock

More lockdep gifts, a 5-way lockup race:

perf_event_create_kernel_counter()
  perf_event_alloc()
    perf_try_init_event()
      x86_pmu_event_init()
__x86_pmu_event_init()
  x86_reserve_hardware()
#0     mutex_lock(&pmc_reserve_mutex);
    reserve_ds_buffer()
#1       get_online_cpus()

perf_event_release_kernel()
  _free_event()
    hw_perf_event_destroy()
      x86_release_hardware()
#0 mutex_lock(&pmc_reserve_mutex)
release_ds_buffer()
#1   get_online_cpus()

#1 do_cpu_up()
  perf_event_init_cpu()
#2     mutex_lock(&pmus_lock)
#3     mutex_lock(&ctx->mutex)

sys_perf_event_open()
  mutex_lock_double()
#3     mutex_lock(ctx->mutex)
#4     mutex_lock_nested(ctx->mutex, 1);

perf_try_init_event()
#4   mutex_lock_nested(ctx->mutex, 1)
  x86_pmu_event_init()
    intel_pmu_hw_config()
      x86_add_exclusive()
#0 mutex_lock(&pmc_reserve_mutex)

Fix it by using ordering constructs instead of locking.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>

perf/core: Fix ctx::mutex deadlock

Lockdep noticed the following 3-way lockup scenario:

sys_perf_event_open()
  perf_event_alloc()
    perf_try_init_event()
#0       ctx = perf_event_ctx_lock_nested(1)
      perf_swevent_init()
swevent_hlist_get()
#1   mutex_lock(&pmus_lock)

perf_event_init_cpu()
#1   mutex_lock(&pmus_lock)
#2   mutex_lock(&ctx->mutex)

sys_perf_event_open()
  mutex_lock_double()
#2    mutex_lock()
#0    mutex_lock_nested()

And while we need that perf_event_ctx_lock_nested() for HW PMUs such
that they can iterate the sibling list, trying to match it to the
available counters, the software PMUs need do no such thing. Exclude
them.

In particular the swevent triggers the above invertion, while the
tpevent PMU triggers a more elaborate one through their event_mutex.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>

perf/core: Fix another perf,trace,cpuhp lock inversion

Lockdep noticed the following 3-way lockup race:

        perf_trace_init()
#0       mutex_lock(&event_mutex)
          perf_trace_event_init()
            perf_trace_event_reg()
              tp_event->class->reg() := tracepoint_probe_register
#1              mutex_lock(&tracepoints_mutex)
                  trace_point_add_func()
#2                  static_key_enable()

#2 do_cpu_up()
  perf_event_init_cpu()
#3     mutex_lock(&pmus_lock)
#4     mutex_lock(&ctx->mutex)

perf_ioctl()
#4   ctx = perf_event_ctx_lock()
  _perf_iotcl()
    ftrace_profile_set_filter()
#0       mutex_lock(&event_mutex)

Fudge it for now by noting that the tracepoint state does not depend
on the event <-> context relation. Ugly though :/

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>

perf/core: Fix lock inversion between perf,trace,cpuhp

Lockdep gifted us with noticing the following 4-way lockup scenario:

        perf_trace_init()
#0       mutex_lock(&event_mutex)
          perf_trace_event_init()
            perf_trace_event_reg()
              tp_event->class->reg() := tracepoint_probe_register
#1             mutex_lock(&tracepoints_mutex)
                  trace_point_add_func()
#2                 static_key_enable()

#2     do_cpu_up()
          perf_event_init_cpu()
#3         mutex_lock(&pmus_lock)
#4         mutex_lock(&ctx->mutex)

        perf_event_task_disable()
          mutex_lock(&current->perf_event_mutex)
#4       ctx = perf_event_ctx_lock()
#5       perf_event_for_each_child()

        do_exit()
          task_work_run()
            __fput()
              perf_release()
                perf_event_release_kernel()
#4               mutex_lock(&ctx->mutex)
#5               mutex_lock(&event->child_mutex)
                  free_event()
                    _free_event()
                      event->destroy() := perf_trace_destroy
#0                     mutex_lock(&event_mutex);

Fix that by moving the free_event() out from under the locks.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>

mtd: nand: sunxi: Fix ECC strength choice

When the requested ECC strength does not exactly match the strengths
supported by the ECC engine, the driver is selecting the closest
strength meeting the 'selected_strength > requested_strength'
constraint. Fix the fact that, in this particular case, ecc->strength
value was not updated to match the 'selected_strength'.

For instance, one can encounter this issue when no ECC requirement is
filled in the device tree while the NAND chip minimum requirement is not
a strength/step_size combo natively supported by the ECC engine.

Fixes: 1fef62c1423b ("mtd: nand: add sunxi NAND flash controller support")
Cc: <stable@vger.kernel.org>
Suggested-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Miquel Raynal <miquel.raynal@free-electrons.com>
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>

mtd: nand: gpmi: Fix subpage reads

Commit 25f815f66a14 ("mtd: nand: force drivers to explicitly send
READ/PROG commands") added a call to nand_read_page_op() in
gpmi_ecc_read_page(), which means this function now sends a READ0
command and place the data pointer at the beginning of the page. This
logic is breaking gpmi_ecc_read_subpage() which was calling
gpmi_ecc_read_page() and expected it to only retrieve the data without
sending the READ0 command.

Create a gpmi_ecc_read_page_data() helper which only does the data
retrieval and ECC correction steps and implement gpmi_ecc_read_page()
as a wrapper that calls nand_read_page_op()+gpmi_ecc_read_page_data().

This way, gpmi_ecc_read_subpage() can call gpmi_ecc_read_page_data()
which restores the logic we had before commit 25f815f66a14 ("mtd: nand:
force drivers to explicitly send READ/PROG commands").

Fixes: 25f815f66a14 ("mtd: nand: force drivers to explicitly send READ/PROG commands")
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Reviewed-by: Miquel Raynal <miquel.raynal@free-electrons.com>
Acked-by: Han Xu <han.xu@nxp.com>

Merge tag 'drm-misc-fixes-2018-01-24' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes

Two vc4 fixes that were applied in the last day.
One fixes a NULL dereference, and the other fixes
a flickering bug.

Cc: Eric Anholt <eric@anholt.net>
Cc: Boris Brezillon <boris.brezillon@free-electrons.com>
* tag 'drm-misc-fixes-2018-01-24' of git://anongit.freedesktop.org/drm/drm-misc:
drm/vc4: Fix NULL pointer dereference in vc4_save_hang_state()
drm/vc4: Flush the caches before the bin jobs, as well.

Merge git://git./linux/kernel/git/davem/net

Pull networking fixes from David Miller:

1) Avoid negative netdev refcount in error flow of xfrm state add, from
    Aviad Yehezkel.

2) Fix tcpdump decoding of IPSEC decap'd frames by filling in the
    ethernet header protocol field in xfrm{4,6}_mode_tunnel_input().
    From Yossi Kuperman.

3) Fix a syzbot triggered skb_under_panic in pppoe having to do with
    failing to allocate an appropriate amount of headroom. From
    Guillaume Nault.

4) Fix memory leak in vmxnet3 driver, from Neil Horman.

5) Cure out-of-bounds packet memory access in em_nbyte EMATCH module,
    from Wolfgang Bumiller.

6) Restrict what kinds of sockets can be bound to the KCM multiplexer
    and also disallow when another layer has attached to the socket and
    made use of sk_user_data. From Tom Herbert.

7) Fix use before init of IOTLB in vhost code, from Jason Wang.

8) Correct STACR register write bit definition in IBM emac driver, from
    Ivan Mikhaylov.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  net/ibm/emac: wrong bit is used for STA control register write
  net/ibm/emac: add 8192 rx/tx fifo size
  vhost: do not try to access device IOTLB when not initialized
  vhost: use mutex_lock_nested() in vhost_dev_lock_vqs()
  i40e: flower: check if TC offload is enabled on a netdev
  qed: Free reserved MR tid
  qed: Remove reserveration of dpi for kernel
  kcm: Check if sk_user_data already set in kcm_attach
  kcm: Only allow TCP sockets to be attached to a KCM mux
  net: sched: fix TCF_LAYER_LINK case in tcf_get_base_ptr
  net: sched: em_nbyte: don't add the data offset twice
  mlxsw: spectrum_router: Don't log an error on missing neighbor
  vmxnet3: repair memory leak
  ipv6: Fix getsockopt() for sockets with default IPV6_AUTOFLOWLABEL
  pppoe: take ->needed_headroom of lower device into account on xmit
  xfrm: fix boolean assignment in xfrm_get_type_offload
  xfrm: Fix eth_hdr(skb)->h_proto to reflect inner IP version
  xfrm: fix error flow in case of add state fails
  xfrm: Add SA to hardware at the end of xfrm_state_construct()

Merge git://git./linux/kernel/git/davem/sparc

Pull sparc bugfix from David Miller:
"Sparc Makefile typo fix"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
sparc64: fix typo in CONFIG_CRYPTO_DES_SPARC64 => CONFIG_CRYPTO_CAMELLIA_SPARC64

net/ibm/emac: wrong bit is used for STA control register write

STA control register has areas of mode and opcodes for opeations. 18 bit is
using for mode selection, where 0 is old MIO/MDIO access method and 1 is
indirect access mode. 19-20 bits are using for setting up read/write
operation(STA opcodes). In current state 'read' is set into old MIO/MDIO mode
with 19 bit and write operation is set into 18 bit which is mode selection,
not a write operation. To correlate write with read we set it into 20 bit.
All those bit operations are MSB 0 based.

Signed-off-by: Ivan Mikhaylov <ivan@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/ibm/emac: add 8192 rx/tx fifo size

emac4syn chips has availability to use 8192 rx/tx fifo buffer sizes,
in current state if we set it up in dts 8192 as example, we will get
only 2048 which may impact on network speed.

Signed-off-by: Ivan Mikhaylov <ivan@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Revert "Input: synaptics_rmi4 - use devm_device_add_group() for attributes in F01"

Since the sysfs attribute hangs off the RMI bus, which doesn't go away during
firmware flash, it needs to be explicitly removed, otherwise we would try and
register the same attribute twice.

This reverts commit 36a44af5c176d619552d99697433261141dd1296.

Signed-off-by: Nick Dyer <nick@shmanahar.org>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>

vhost: do not try to access device IOTLB when not initialized

The code will try to access dev->iotlb when processing
VHOST_IOTLB_INVALIDATE even if it was not initialized which may lead
to NULL pointer dereference. Fixes this by check dev->iotlb before.

Fixes: 6b1e6cc7855b0 ("vhost: new device IOTLB API")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

vhost: use mutex_lock_nested() in vhost_dev_lock_vqs()

We used to call mutex_lock() in vhost_dev_lock_vqs() which tries to
hold mutexes of all virtqueues. This may confuse lockdep to report a
possible deadlock because of trying to hold locks belong to same
class. Switch to use mutex_lock_nested() to avoid false positive.

Fixes: 6b1e6cc7855b0 ("vhost: new device IOTLB API")
Reported-by: syzbot+dbb7c1161485e61b0241@syzkaller.appspotmail.com
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

i40e: flower: check if TC offload is enabled on a netdev

Since TC block changes drivers are required to check if
the TC hw offload flag is set on the interface themselves.

Fixes: 2f4b411a3d67 ("i40e: Enable cloud filters via tc-flower")
Fixes: 44ae12a768b7 ("net: sched: move the can_offload check from binding phase to rule insertion phase")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Amritha Nambiar <amritha.nambiar@intel.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sparc64: fix typo in CONFIG_CRYPTO_DES_SPARC64 => CONFIG_CRYPTO_CAMELLIA_SPARC64

This patch fixes the typo CONFIG_CRYPTO_DES_SPARC64 => CONFIG_CRYPTO_CAMELLIA_SPARC64

Fixes: 81658ad0d923 ("sparc64: Add CAMELLIA driver making use of the new camellia opcodes.")
Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'qed-rdma-bug-fixes'

Michal Kalderon says:

====================
qed: rdma bug fixes

This patch contains two small bug fixes related to RDMA.
Both related to resource reservations.
====================

Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qed: Free reserved MR tid

A tid was allocated for reserved MR during initialization but
not freed. This lead to an annoying output message during
rdma unload flow.

Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qed: Remove reserveration of dpi for kernel

Double reservation for kernel dedicated dpi was performed.
Once in the core module and once in qedr.
Remove the reservation from core.

Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'kcm-fix-two-syzcaller-issues'

Tom Herbert says:

====================
kcm: fix two syzcaller issues

In this patch set:

- Don't allow attaching non-TCP or listener sockets to a KCM mux.
- In kcm_attach Check if sk_user_data is already set. This is
  under lock to avoid race conditions. More work is need to make
  all of the users of sk_user_data to use the same locking.

- v2
  Remove unncessary check for not PF_KCM in kcm_attach (suggested by
  Guillaume Nault)
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

kcm: Check if sk_user_data already set in kcm_attach

This is needed to prevent sk_user_data being overwritten.
The check is done under the callback lock. This should prevent
a socket from being attached twice to a KCM mux. It also prevents
a socket from being attached for other use cases of sk_user_data
as long as the other cases set sk_user_data under the lock.
Followup work is needed to unify all the use cases of sk_user_data
to use the same locking.

Reported-by: syzbot+114b15f2be420a8886c3@syzkaller.appspotmail.com
Fixes: ab7ac4eb9832 ("kcm: Kernel Connection Multiplexor module")
Signed-off-by: Tom Herbert <tom@quantonium.net>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

kcm: Only allow TCP sockets to be attached to a KCM mux

TCP sockets for IPv4 and IPv6 that are not listeners or in closed
stated are allowed to be attached to a KCM mux.

Fixes: ab7ac4eb9832 ("kcm: Kernel Connection Multiplexor module")
Reported-by: syzbot+8865eaff7f9acd593945@syzkaller.appspotmail.com
Signed-off-by: Tom Herbert <tom@quantonium.net>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>