Bart Van Assche [Thu, 8 Mar 2018 01:10:03 +0000 (17:10 -0800)]
block: Use the queue_flag_*() functions instead of open-coding these
Except for changing the atomic queue flag manipulations that are
protected by the queue lock into non-atomic manipulations, this
patch does not change any functionality.
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Bart Van Assche [Thu, 8 Mar 2018 01:10:02 +0000 (17:10 -0800)]
block: Reorder the queue flag manipulation function definitions
Move the definition of queue_flag_clear_unlocked() up and move the
definition of queue_in_flight() down such that all queue flag
manipulation function definitions become contiguous.
This patch does not change any functionality.
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jonas Rabenstein [Thu, 1 Mar 2018 13:26:37 +0000 (14:26 +0100)]
block: sed-opal: fix response string extraction
Tokens are prefixed by a variable number of bytes. If a bytestring is
not stored in a tiny or short atom, we have to skip more than one byte
to get to the actual bytes, since the prefix also contains the bytes
describing the length of the string.
Acked-by: Jonathan Derrick <jonathan.derrick@intel.com>
Signed-off-by: Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
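The prefix skipping described in the sed-opal fix above can be sketched as follows. This is a minimal illustration, not the kernel's code: atom_header_len() and atom_payload() are hypothetical helpers, and the bit patterns are the tiny/short/medium/long atom classes as understood from the TCG token encoding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Illustrative only: number of header bytes in front of an atom's payload,
 * derived from the first byte of the token. Hypothetical helper, not the
 * kernel's implementation.
 */
static size_t atom_header_len(uint8_t first)
{
	if ((first & 0x80) == 0x00)
		return 1;	/* tiny atom:   0xxxxxxx */
	if ((first & 0xC0) == 0x80)
		return 1;	/* short atom:  10xxxxxx */
	if ((first & 0xE0) == 0xC0)
		return 2;	/* medium atom: 110xxxxx */
	if ((first & 0xFC) == 0xE0)
		return 4;	/* long atom:   111000xx */
	return 0;		/* anything else is not a data atom */
}

/* Return a pointer to the payload of a token and store its length. */
static const uint8_t *atom_payload(const uint8_t *tok, size_t tok_len,
				   size_t *payload_len)
{
	size_t skip;

	assert(tok != NULL && payload_len != NULL);
	skip = tok_len ? atom_header_len(tok[0]) : 0;
	if (skip == 0 || skip > tok_len)
		return NULL;
	*payload_len = tok_len - skip;
	return tok + skip;
}
```

Skipping a fixed single byte would be right for the first two classes only; for medium and long atoms it would leave length bytes at the start of the extracted string.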
Ming Lei [Tue, 6 Mar 2018 04:07:13 +0000 (12:07 +0800)]
block: null_blk: fix 'Invalid parameters' when loading module
On ARM64, the default page size is 64K on some distributions, and
we should allow ARM64 people to play with null_blk.
This patch fixes the issue by extending the page bitmap size to support
PAGE_SIZE values other than 4KB.
Cc: Bart Van Assche <Bart.VanAssche@wdc.com>
Cc: Shaohua Li <shli@kernel.org>
Cc: Kyungchan Koh <kkc6196@fb.com>
Cc: weiping zhang <zhangweiping@didichuxing.com>
Cc: Yi Zhang <yi.zhang@redhat.com>
Reported-by: Yi Zhang <yi.zhang@redhat.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
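A rough sketch of why a fixed-size bitmap stops working with 64K pages, assuming one bit per 512-byte sector of a page. The helper names and TOY_SECTOR_SHIFT are made up for this illustration and are not taken from null_blk.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define TOY_SECTOR_SHIFT 9	/* assumption: 512-byte sectors */

/* Number of sectors, and therefore bitmap bits, in one page. */
static size_t sectors_per_page(size_t page_size)
{
	return page_size >> TOY_SECTOR_SHIFT;
}

/* Words of type unsigned long needed to hold nbits bits. */
static size_t bitmap_words(size_t nbits)
{
	const size_t bits_per_word = sizeof(unsigned long) * CHAR_BIT;

	assert(nbits > 0);
	return (nbits + bits_per_word - 1) / bits_per_word;
}

/* True if a single unsigned long can track every sector of a page. */
static int fits_in_one_word(size_t page_size)
{
	return bitmap_words(sectors_per_page(page_size)) == 1;
}
```

A 4K page has 8 sectors and fits in one word, while a 64K page has 128 sectors and does not, so the bitmap has to be sized from PAGE_SIZE rather than assumed to be one word.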
Arnd Bergmann [Thu, 1 Mar 2018 10:31:29 +0000 (11:31 +0100)]
staging: rts5208: rename SG_END macro
A change to the generic scatterlist code caused a conflict with
the rtsx card reader driver:
In file included from drivers/staging/rts5208/rtsx.h:180,
from drivers/staging/rts5208/rtsx.c:28:
drivers/staging/rts5208/rtsx_chip.h:343: error: "SG_END" redefined [-Werror]
This changes one instance of the driver to prefix SG_END and
related constants.
Fixes: 723fbf563a6a ("lib/scatterlist: Add SG_CHAIN and SG_END macros for LSB encodings")
Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Arnd Bergmann [Thu, 1 Mar 2018 10:31:28 +0000 (11:31 +0100)]
misc: rtsx: rename SG_END macro
A change to the generic scatterlist code caused a conflict with
the rtsx card reader driver:
In file included from drivers/misc/cardreader/rtsx_pcr.c:32:
include/linux/rtsx_pci.h:40: error: "SG_END" redefined [-Werror]
This changes one instance of the driver to prefix SG_END and
related constants.
Fixes: 723fbf563a6a ("lib/scatterlist: Add SG_CHAIN and SG_END macros for LSB encodings")
Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Bart Van Assche [Wed, 28 Feb 2018 18:15:33 +0000 (10:15 -0800)]
block: Fix a race between request queue removal and the block cgroup controller
Avoid that the following race can occur:
blk_cleanup_queue()                      blkcg_print_blkgs()
  spin_lock_irq(lock) (1)                spin_lock_irq(blkg->q->queue_lock) (2,5)
    q->queue_lock = &q->__queue_lock (3)
  spin_unlock_irq(lock) (4)
                                         spin_unlock_irq(blkg->q->queue_lock) (6)
(1) take driver lock;
(2) busy loop for driver lock;
(3) override driver lock with internal lock;
(4) unlock driver lock;
(5) can take driver lock now;
(6) but unlock internal lock.
This change is safe because only the SCSI core and the NVME core keep
a reference on a request queue after having called blk_cleanup_queue().
Neither driver accesses any of the removed data structures between its
blk_cleanup_queue() and blk_put_queue() calls.
Reported-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Jan Kara <jack@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
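Steps (1) to (6) of the race above can be replayed in a single thread to show the resulting imbalance. This is only a toy model: struct toy_lock counts acquire/release calls instead of spinning, and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct toy_lock {
	int held;			/* acquire count minus release count */
};

struct toy_queue {
	struct toy_lock internal;	/* stands in for q->__queue_lock */
	struct toy_lock *queue_lock;	/* stands in for q->queue_lock */
};

static void toy_acquire(struct toy_lock *l) { l->held++; }
static void toy_release(struct toy_lock *l) { l->held--; }

/* Replays the interleaving from the commit message in program order. */
static void replay_cleanup_race(struct toy_queue *q, struct toy_lock *driver)
{
	struct toy_lock *seen;

	assert(q->queue_lock == driver);
	toy_acquire(driver);		/* (1) cleanup takes the driver lock */
	seen = q->queue_lock;		/* (2) reader resolves the pointer and waits */
	q->queue_lock = &q->internal;	/* (3) pointer switched to internal lock */
	toy_release(driver);		/* (4) cleanup drops the driver lock */
	toy_acquire(seen);		/* (5) reader now gets the driver lock */
	toy_release(q->queue_lock);	/* (6) reader re-reads the pointer */
}
```

After the replay the driver lock is still held and the internal lock has been released without ever being taken, which is the unbalanced unlock the patch avoids.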
Bart Van Assche [Wed, 28 Feb 2018 18:15:32 +0000 (10:15 -0800)]
block: Fix a race between the cgroup code and request queue initialization
Initialize the request queue lock earlier such that the following
race can no longer occur:
blk_init_queue_node()                    blkcg_print_blkgs()
  blk_alloc_queue_node (1)
  q->queue_lock = &q->__queue_lock (2)
  blkcg_init_queue(q) (3)
                                         spin_lock_irq(blkg->q->queue_lock) (4)
  q->queue_lock = lock (5)
                                         spin_unlock_irq(blkg->q->queue_lock) (6)
(1) allocate an uninitialized queue;
(2) initialize queue_lock to its default internal lock;
(3) initialize blkcg part of request queue, which will create blkg and
then insert it to blkg_list;
(4) traverse blkg_list and find the created blkg, and then take its
queue lock, here it is the default *internal lock*;
(5) *race window*, now queue_lock is overridden with *driver specified
lock*;
(6) now unlock *driver specified lock*, not the locked *internal lock*,
unlock balance breaks.
The changes in this patch are as follows:
- Move the .queue_lock initialization from blk_init_queue_node() into
blk_alloc_queue_node().
- Only override the .queue_lock pointer for legacy queues because it
is not useful for blk-mq queues to override this pointer.
- For all block drivers that initialize .queue_lock explicitly,
change the blk_alloc_queue() call in the driver into a
blk_alloc_queue_node() call and remove the explicit .queue_lock
initialization. Additionally, initialize the spin lock that will
be used as queue lock earlier if necessary.
Reported-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Philipp Reisner <philipp.reisner@linbit.com>
Cc: Ulf Hansson <ulf.hansson@linaro.org>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
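The idea behind the first change in the list above, choosing the final lock pointer before the queue becomes visible to anyone else, can be sketched like this. It is a simplified model under stated assumptions: toy_alloc_queue() and the visible flag are hypothetical stand-ins, and the distinction between legacy and blk-mq queues is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_lock {
	int unused;
};

struct toy_queue {
	struct toy_lock internal;	/* default internal lock */
	struct toy_lock *queue_lock;	/* lock that users of the queue take */
	bool visible;			/* stand-in for "reachable via blkg_list" */
};

/*
 * Pick the lock at allocation time. A NULL driver_lock selects the
 * internal lock. The pointer is never changed after the queue has been
 * made visible, so a lock/unlock pair always sees the same lock.
 */
static void toy_alloc_queue(struct toy_queue *q, struct toy_lock *driver_lock)
{
	assert(q != NULL);
	q->queue_lock = driver_lock ? driver_lock : &q->internal;
	q->visible = true;
}
```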
Bart Van Assche [Wed, 28 Feb 2018 18:15:31 +0000 (10:15 -0800)]
block: Add 'lock' as third argument to blk_alloc_queue_node()
This patch does not change any functionality.
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Philipp Reisner <philipp.reisner@linbit.com>
Cc: Ulf Hansson <ulf.hansson@linaro.org>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Bart Van Assche [Wed, 28 Feb 2018 18:15:30 +0000 (10:15 -0800)]
zram: Delete gendisk before cleaning up the request queue
Remove the disk, partition and bdi sysfs attributes before cleaning up
the request queue associated with the disk.
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Bart Van Assche [Wed, 28 Feb 2018 18:15:29 +0000 (10:15 -0800)]
md: Delete gendisk before cleaning up the request queue
Remove the disk, partition and bdi sysfs attributes before cleaning up
the request queue associated with the disk.
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Cc: Shaohua Li <shli@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Bart Van Assche [Wed, 28 Feb 2018 18:15:28 +0000 (10:15 -0800)]
block/loop: Delete gendisk before cleaning up the request queue
Remove the disk, partition and bdi sysfs attributes before cleaning up
the request queue associated with the disk.
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Cc: Josef Bacik <jbacik@fb.com>
Cc: Shaohua Li <shli@fb.com>
Cc: Omar Sandoval <osandov@fb.com>
Cc: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Wed, 28 Feb 2018 16:18:57 +0000 (09:18 -0700)]
null_blk: add 'requeue' fault attribute
Similarly to the support we have for testing/faking timeouts for
null_blk, this adds support for triggering a requeue condition.
Considering the issues around restart we've been seeing, this should be
a useful addition to the testing arsenal to ensure that we are handling
requeue conditions correctly.
This works for queue mode 1 (legacy request_fn based path) and 2 (blk-mq
path), as there's no good way to do requeue with a bio based driver.
This is similar to the timeout path. For the blk-mq path, we alternate
between passing back BLK_STS_RESOURCE and manually calling
blk_mq_requeue_request() in the driver. The former will hit the core
requeue path, while the latter exercises the IO scheduler requeue
path.
Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Omar Sandoval [Wed, 28 Feb 2018 00:56:43 +0000 (16:56 -0800)]
sbitmap: use test_and_set_bit_lock()/clear_bit_unlock()
sbitmap_queue_get()/sbitmap_queue_clear() are used for
allocating/freeing a resource, so they should provide acquire/release
barrier semantics, respectively. sbitmap_get() currently contains a full
barrier, which is unnecessary, so use test_and_set_bit_lock() instead of
test_and_set_bit() (these are equivalent on x86_64). sbitmap_clear_bit()
does not imply any barriers, which is incorrect, as accesses of the
resource (e.g., request) could potentially get reordered to after the
clear_bit(). Introduce sbitmap_clear_bit_unlock() and use it for
sbitmap_queue_clear() (this only adds a compiler barrier on x86_64). The
other existing user of sbitmap_clear_bit() (the blk-mq software queue
pending map) is serialized through a spinlock and does not need this.
Reported-by: Tejun Heo <tj@kernel.org>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
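The acquire/release pairing described above can be illustrated in userspace with C11 atomics. This is a sketch of the idea only: the toy_* functions are hypothetical and use <stdatomic.h>, not the kernel's bitops, and a single word stands in for the whole sbitmap.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Set a bit with acquire ordering; returns true if it was already set. */
static bool toy_test_and_set_bit_lock(atomic_ulong *word, unsigned int bit)
{
	unsigned long mask = 1UL << bit;

	return (atomic_fetch_or_explicit(word, mask,
					 memory_order_acquire) & mask) != 0;
}

/* Clear a bit with release ordering, so earlier accesses to the
 * resource cannot be reordered after the clear. */
static void toy_put(atomic_ulong *word, unsigned int bit)
{
	atomic_fetch_and_explicit(word, ~(1UL << bit), memory_order_release);
}

/* Allocate the first free bit below depth, or return -1 if none is free. */
static int toy_get(atomic_ulong *word, unsigned int depth)
{
	unsigned int bit;

	assert(depth <= sizeof(unsigned long) * CHAR_BIT);
	for (bit = 0; bit < depth; bit++)
		if (!toy_test_and_set_bit_lock(word, bit))
			return (int)bit;
	return -1;
}
```

Allocation only needs acquire ordering and freeing only needs release ordering, which is the same split the patch makes between test_and_set_bit_lock() and clear_bit_unlock().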
Omar Sandoval [Wed, 28 Feb 2018 00:56:42 +0000 (16:56 -0800)]
block: clear ctx pending bit under ctx lock
When we insert a request, we set the software queue pending bit while
holding the software queue lock. However, we clear it outside of the
lock, so it's possible that a concurrent insert could reset the bit
after we clear it but before we empty the request list. Afterwards, the
bit would still be set but the software queue wouldn't have any requests
in it, leading us to do a spurious run in the future. This is mostly a
benign/theoretical issue, but it makes the following change easier to
justify.
Signed-off-by: Omar Sandoval <osandov@fb.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
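The stale pending bit described above can be reproduced with a single-threaded replay of the two orderings. This is a toy model with hypothetical names; the "concurrent" insert is simply placed at the point where the lock would allow it to run.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_ctx {
	bool pending;		/* software queue pending bit */
	int nr_requests;	/* length of the software queue's request list */
};

/* Insert path: in the kernel this runs under the ctx lock. */
static void toy_insert(struct toy_ctx *c)
{
	c->nr_requests++;
	c->pending = true;
}

/* Take every request off the list and return how many there were. */
static int toy_splice(struct toy_ctx *c)
{
	int n = c->nr_requests;

	c->nr_requests = 0;
	return n;
}

/* Old ordering: the bit is cleared outside the lock, so an insert can
 * land between the clear and the splice. Returns true if the bit ends
 * up set while the list is empty. */
static bool replay_old(struct toy_ctx *c)
{
	c->pending = false;
	toy_insert(c);
	toy_splice(c);
	return c->pending && c->nr_requests == 0;
}

/* New ordering: clear and splice form one critical section, so the
 * insert runs entirely before or entirely after it (after, here). */
static bool replay_new(struct toy_ctx *c)
{
	assert(c != 0);
	c->pending = false;
	toy_splice(c);
	toy_insert(c);
	return c->pending && c->nr_requests == 0;
}
```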
Bart Van Assche [Wed, 28 Feb 2018 00:32:14 +0000 (16:32 -0800)]
blk-mq-debugfs: Show zone locking information
When debugging the ZBC code in the mq-deadline scheduler it is very
important to know which zones are locked and which zones are not
locked. Hence this patch exports the zone locking information
through debugfs.
Cc: Omar Sandoval <osandov@fb.com>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Tested-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Bart Van Assche [Wed, 28 Feb 2018 00:32:13 +0000 (16:32 -0800)]
blk-mq-debugfs: Reorder queue show and store methods
Make sure that the queue show and store methods are contiguous and
also that these appear in alphabetical order.
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Omar Sandoval <osandov@fb.com>
Cc: Damien Le Moal <damien.lemoal@wdc.com>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jiufei Xue [Wed, 28 Feb 2018 05:44:18 +0000 (13:44 +0800)]
writeback: remove dead code in wb_blkcg/memcg_offline
Signed-off-by: Jiufei Xue <jiufei.xue@linux.alibaba.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Anshuman Khandual [Thu, 15 Feb 2018 03:33:56 +0000 (09:03 +0530)]
lib/scatterlist: Add SG_CHAIN and SG_END macros for LSB encodings
This replaces scatterlist->page_link LSB encodings with SG_CHAIN and
SG_END definitions without any functional change.
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Wed, 28 Feb 2018 19:18:58 +0000 (12:18 -0700)]
Merge branch 'for-jens' of git://git.infradead.org/nvme into for-linus
Pull NVMe fixes from Keith for 4.16-rc.
* 'for-jens' of git://git.infradead.org/nvme:
nvmet: fix PSDT field check in command format
nvme-multipath: fix sysfs dangerously created links
nvme-pci: Fix nvme queue cleanup if IRQ setup fails
nvmet-loop: use blk_rq_payload_bytes for sgl selection
nvme-rdma: use blk_rq_payload_bytes instead of blk_rq_bytes
nvme-fabrics: don't check for non-NULL module in nvmf_register_transport
Max Gurtovoy [Wed, 24 Jan 2018 15:31:45 +0000 (17:31 +0200)]
nvmet: fix PSDT field check in command format
PSDT field section according to NVM_Express-1.3:
"This field specifies whether PRPs or SGLs are used for any data
transfer associated with the command. PRPs shall be used for all
Admin commands for NVMe over PCIe. SGLs shall be used for all Admin
and I/O commands for NVMe over Fabrics. This field shall be set to
01b for NVMe over Fabrics 1.0 implementations."
Suggested-by: Idan Burstein <idanb@mellanox.com>
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Baegjae Sung [Wed, 28 Feb 2018 07:06:04 +0000 (16:06 +0900)]
nvme-multipath: fix sysfs dangerously created links
If multipathing is enabled, each NVMe subsystem creates a head
namespace (e.g., nvme0n1) and multiple private namespaces
(e.g., nvme0c0n1 and nvme0c1n1) in sysfs. When creating links for
private namespaces, links of head namespace are used, so the
namespace creation order must be followed (e.g., nvme0n1 ->
nvme0c1n1). If the order is not followed, the sysfs links will be
incomplete or a kernel panic will occur.
The kernel panic was:
kernel BUG at fs/sysfs/symlink.c:27!
Call Trace:
nvme_mpath_add_disk_links+0x5d/0x80 [nvme_core]
nvme_validate_ns+0x5c2/0x850 [nvme_core]
nvme_scan_work+0x1af/0x2d0 [nvme_core]
Correct order
Context A     Context B
nvme0n1
nvme0c0n1     nvme0c1n1
Incorrect order
Context A     Context B
              nvme0c1n1
nvme0n1
nvme0c0n1
The nvme_mpath_add_disk (for creating head namespace) is called
just before the nvme_mpath_add_disk_links (for creating private
namespaces). In nvme_mpath_add_disk, the first context acquires
the lock of subsystem and creates a head namespace, and other
contexts do nothing by checking GENHD_FL_UP of a head namespace
after waiting to acquire the lock. We verified the code with or
without multipathing using three vendors of dual-port NVMe SSDs.
Signed-off-by: Baegjae Sung <baegjae@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Gustavo A. R. Silva [Mon, 12 Feb 2018 17:14:55 +0000 (11:14 -0600)]
nbd: fix return value in error handling path
It seems that the proper value to return in this particular case is the
one contained in variable new_index instead of ret.
Addresses-Coverity-ID: 1465148 ("Copy-paste error")
Fixes: e46c7287b1c2 ("nbd: add a basic netlink interface")
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Tang Junhui [Tue, 27 Feb 2018 17:49:30 +0000 (09:49 -0800)]
bcache: fix kcrashes with fio in RAID5 backend dev
The kernel crashed when running fio on a bcache device with a RAID5
backend. The call trace is below:
[ 440.012034] kernel BUG at block/blk-ioc.c:146!
[ 440.012696] invalid opcode: 0000 [#1] SMP NOPTI
[ 440.026537] CPU: 2 PID: 2205 Comm: md127_raid5 Not tainted 4.15.0 #8
[ 440.027441] Hardware name: HP ProLiant MicroServer Gen8, BIOS J06 07/16/2015
[ 440.028615] RIP: 0010:put_io_context+0x8b/0x90
[ 440.029246] RSP: 0018:ffffa8c882b43af8 EFLAGS: 00010246
[ 440.029990] RAX: 0000000000000000 RBX: ffffa8c88294fca0 RCX: 00000000000f4240
[ 440.031006] RDX: 0000000000000004 RSI: 0000000000000286 RDI: ffffa8c88294fca0
[ 440.032030] RBP: ffffa8c882b43b10 R08: 0000000000000003 R09: ffff949cb80c1700
[ 440.033206] R10: 0000000000000104 R11: 000000000000b71c R12: 0000000000001000
[ 440.034222] R13: 0000000000000000 R14: ffff949cad84db70 R15: ffff949cb11bd1e0
[ 440.035239] FS: 0000000000000000(0000) GS:ffff949cba280000(0000) knlGS:0000000000000000
[ 440.060190] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 440.084967] CR2: 00007ff0493ef000 CR3: 00000002f1e0a002 CR4: 00000000001606e0
[ 440.110498] Call Trace:
[ 440.135443] bio_disassociate_task+0x1b/0x60
[ 440.160355] bio_free+0x1b/0x60
[ 440.184666] bio_put+0x23/0x30
[ 440.208272] search_free+0x23/0x40 [bcache]
[ 440.231448] cached_dev_write_complete+0x31/0x70 [bcache]
[ 440.254468] closure_put+0xb6/0xd0 [bcache]
[ 440.277087] request_endio+0x30/0x40 [bcache]
[ 440.298703] bio_endio+0xa1/0x120
[ 440.319644] handle_stripe+0x418/0x2270 [raid456]
[ 440.340614] ? load_balance+0x17b/0x9c0
[ 440.360506] handle_active_stripes.isra.58+0x387/0x5a0 [raid456]
[ 440.380675] ? __release_stripe+0x15/0x20 [raid456]
[ 440.400132] raid5d+0x3ed/0x5d0 [raid456]
[ 440.419193] ? schedule+0x36/0x80
[ 440.437932] ? schedule_timeout+0x1d2/0x2f0
[ 440.456136] md_thread+0x122/0x150
[ 440.473687] ? wait_woken+0x80/0x80
[ 440.491411] kthread+0x102/0x140
[ 440.508636] ? find_pers+0x70/0x70
[ 440.524927] ? kthread_associate_blkcg+0xa0/0xa0
[ 440.541791] ret_from_fork+0x35/0x40
[ 440.558020] Code: c2 48 00 5b 41 5c 41 5d 5d c3 48 89 c6 4c 89 e7 e8 bb c2 48 00 48 8b 3d bc 36 4b 01 48 89 de e8 7c f7 e0 ff 5b 41 5c 41 5d 5d c3 <0f> 0b 0f 1f 00 0f 1f 44 00 00 55 48 8d 47 b8 48 89 e5 41 57 41
[ 440.610020] RIP: put_io_context+0x8b/0x90 RSP: ffffa8c882b43af8
[ 440.628575] ---[ end trace a1fd79d85643a73e ]--
All of the crashes happened when a bypass IO came in. In that scenario
s->iop.bio points to s->orig_bio. search_free() finishes s->orig_bio by
calling bio_complete(), after which s->iop.bio becomes invalid, so the
kernel crashes when bio_put() is called. The fault may lie in the upper
layer, since a bio should not be freed before we call bio_put(), but it
is safer to call bio_put() first and only then call bio_complete() to
notify the upper layer that the bio has ended.
This patch moves bio_complete() below bio_put() to avoid the kernel crash.
[mlyle: fixed commit subject for character limits]
Reported-by: Matthias Ferdinand <bcache@mfedv.net>
Tested-by: Matthias Ferdinand <bcache@mfedv.net>
Signed-off-by: Tang Junhui <tang.junhui@zte.com.cn>
Reviewed-by: Michael Lyle <mlyle@lyle.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Coly Li [Tue, 27 Feb 2018 17:49:29 +0000 (09:49 -0800)]
bcache: correct flash only vols (check all uuids)
Commit 2831231d4c3f ("bcache: reduce cache_set devices iteration by
devices_max_used") adds c->devices_max_used to reduce iteration of
c->uuids elements. This value is updated in bcache_device_attach().
But for a flash only volume, when flash_devs_run() is called,
bcache_device_attach() has not been called yet and c->devices_max_used
is not updated. The unexpected result is that the flash only volume
won't be run by flash_devs_run().
This patch fixes the issue by iterating over all c->uuids elements in
flash_devs_run(). c->devices_max_used will be updated properly when
bcache_device_attach() gets called.
[mlyle: commit subject edited for character limit]
Fixes: 2831231d4c3f ("bcache: reduce cache_set devices iteration by devices_max_used")
Reported-by: Tang Junhui <tang.junhui@zte.com.cn>
Signed-off-by: Coly Li <colyli@suse.de>
Reviewed-by: Michael Lyle <mlyle@lyle.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Eric Biggers [Sat, 27 Jan 2018 00:58:06 +0000 (16:58 -0800)]
blktrace_api.h: fix comment for struct blk_user_trace_setup
'struct blk_user_trace_setup' is passed to BLKTRACESETUP, not
BLKTRACESTART.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jan Kara [Mon, 26 Feb 2018 12:01:42 +0000 (13:01 +0100)]
blockdev: Avoid two active bdev inodes for one device
When blkdev_open() races with device removal and creation it can happen
that unhashed bdev inode gets associated with newly created gendisk
like:
CPU0                            CPU1
blkdev_open()
  bdev = bd_acquire()
                                del_gendisk()
                                  bdev_unhash_inode(bdev);
                                remove device
                                create new device with the same number
  __blkdev_get()
    disk = get_gendisk()
      - gets reference to gendisk of the new device
Now another blkdev_open() will not find original 'bdev' as it got
unhashed, create a new one and associate it with the same 'disk' at
which point problems start as we have two independent page caches for
one device.
Fix the problem by verifying that the bdev inode didn't get unhashed
before we acquired gendisk reference. That way we make sure gendisk can
get associated only with visible bdev inodes.
Tested-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jan Kara [Mon, 26 Feb 2018 12:01:41 +0000 (13:01 +0100)]
genhd: Fix BUG in blkdev_open()
When two blkdev_open() calls for a partition race with device removal
and recreation, we can hit BUG_ON(!bd_may_claim(bdev, whole, holder)) in
blkdev_open(). The race can happen as follows:
CPU0                            CPU1                            CPU2
                                                                del_gendisk()
                                                                  bdev_unhash_inode(part1);
blkdev_open(part1, O_EXCL)      blkdev_open(part1, O_EXCL)
  bdev = bd_acquire()             bdev = bd_acquire()
  blkdev_get(bdev)
    bd_start_claiming(bdev)
      - finds old inode 'whole'
      bd_prepare_to_claim() -> 0
                                                                  bdev_unhash_inode(whole);
                                                                <device removed>
                                                                <new device under same
                                                                 number created>
                                  blkdev_get(bdev);
                                    bd_start_claiming(bdev)
                                      - finds new inode 'whole'
                                      bd_prepare_to_claim()
                                        - this also succeeds as we have
                                          different 'whole' here...
                                        - bad things happen now as we
                                          have two exclusive openers of
                                          the same bdev
The problem here is that block device opens can see various intermediate
states while gendisk is shutting down and then being recreated.
We fix the problem by introducing new lookup_sem in gendisk that
synchronizes gendisk deletion with get_gendisk() and furthermore by
making sure that get_gendisk() does not return gendisk that is being (or
has been) deleted. This makes sure that once we ever manage to look up
newly created bdev inode, we are also guaranteed that following
get_gendisk() will either return failure (and we fail open) or it
returns gendisk for the new device and following bdget_disk() will
return new bdev inode (i.e., blkdev_open() follows the path as if it is
completely run after new device is created).
Reported-and-analyzed-by: Hou Tao <houtao1@huawei.com>
Tested-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jan Kara [Mon, 26 Feb 2018 12:01:40 +0000 (13:01 +0100)]
genhd: Fix use after free in __blkdev_get()
When two blkdev_open() calls race with device removal and recreation,
__blkdev_get() can use looked up gendisk after it is freed:
CPU0                            CPU1                            CPU2
                                                                del_gendisk(disk);
                                                                  bdev_unhash_inode(inode);
blkdev_open()                   blkdev_open()
  bdev = bd_acquire(inode);
    - creates and returns new inode
                                  bdev = bd_acquire(inode);
                                    - returns the same inode
  __blkdev_get(devt)              __blkdev_get(devt)
    disk = get_gendisk(devt);
      - got structure of device going away
                                                                <finish device removal>
                                                                <new device gets
                                                                 created under the same
                                                                 device number>
                                  disk = get_gendisk(devt);
                                    - got new device structure
                                  if (!bdev->bd_openers) {
                                    does the first open
                                  }
    if (!bdev->bd_openers)
      - false
    } else {
      put_disk_and_module(disk)
        - remember this was old device - this was last ref and disk is
          now freed
    }
    disk_unblock_events(disk); -> oops
Fix the problem by making sure we drop reference to disk in
__blkdev_get() only after we are really done with it.
Reported-by: Hou Tao <houtao1@huawei.com>
Tested-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
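The ordering rule behind the fix above, use the object first and drop the reference last, can be shown with a toy reference count. All names are hypothetical, and a freed flag is used in place of an actual free() so that a wrong ordering trips an assertion rather than touching freed memory.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_disk {
	int refs;		/* reference count */
	bool freed;		/* set when the last reference is dropped */
	int events_blocked;	/* stand-in for the event blocking depth */
};

/* Drop one reference; the disk counts as freed when none remain. */
static void toy_put(struct toy_disk *d)
{
	assert(!d->freed);
	if (--d->refs == 0)
		d->freed = true;
}

/* Any use of the disk must happen while a reference is still held. */
static void toy_unblock_events(struct toy_disk *d)
{
	assert(!d->freed);
	d->events_blocked--;
}

/* Tail of the open path: finish every use of the disk, then put it. */
static void toy_open_tail(struct toy_disk *d)
{
	toy_unblock_events(d);
	toy_put(d);
}
```

Swapping the two calls in toy_open_tail() reproduces the bug in miniature: the put can drop the last reference, and the following use then hits the !d->freed assertion.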
Jan Kara [Mon, 26 Feb 2018 12:01:39 +0000 (13:01 +0100)]
genhd: Add helper put_disk_and_module()
Add a proper counterpart to get_disk_and_module() -
put_disk_and_module(). Currently it is opencoded in several places.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jan Kara [Mon, 26 Feb 2018 12:01:38 +0000 (13:01 +0100)]
genhd: Rename get_disk() to get_disk_and_module()
Rename get_disk() to get_disk_and_module() to make it clear what the
function does. It's not a great name but at least it is now clear that
put_disk() is not its counterpart.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jan Kara [Mon, 26 Feb 2018 12:01:37 +0000 (13:01 +0100)]
genhd: Fix leaked module reference for NVME devices
Commit 8ddcd653257c "block: introduce GENHD_FL_HIDDEN" added handling of
hidden devices to get_gendisk() but forgot to drop module reference
which is also acquired by get_disk(). Drop the reference as necessary.
Arguably the function naming here is misleading as put_disk() is *not*
the counterpart of get_disk() but let's fix that in the follow up
commit since that will be more intrusive.
Fixes: 8ddcd653257c18a669fcb75ee42c37054908e0d6
CC: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jan Kara [Mon, 26 Feb 2018 11:51:43 +0000 (12:51 +0100)]
direct-io: Fix sleep in atomic due to sync AIO
Commit e864f39569f4 "fs: add RWF_DSYNC aand RWF_SYNC" added an additional
way for direct IO to become synchronous and thus trigger fsync from the
IO completion handler. Then commit 9830f4be159b "fs: Use RWF_* flags for
AIO operations" allowed these flags to be set for AIO as well. However,
that commit forgot to update the condition checking whether the IO
completion handling should be deferred to a workqueue, and thus AIO DIO
with RWF_[D]SYNC set will call fsync() from IRQ context, resulting in
sleep in atomic.
Fix the problem by checking directly iocb flags (the same way as it is
done in dio_complete()) instead of checking all conditions that could
lead to IO being synchronous.
CC: Christoph Hellwig <hch@lst.de>
CC: Goldwyn Rodrigues <rgoldwyn@suse.com>
CC: stable@vger.kernel.org
Reported-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Fixes: 9830f4be159b29399d107bffb99e0132bc5aedd4
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jianchao Wang [Thu, 15 Feb 2018 11:13:41 +0000 (19:13 +0800)]
nvme-pci: Fix nvme queue cleanup if IRQ setup fails
This patch fixes nvme queue cleanup if requesting an IRQ handler for
the queue's vector fails. It does this by resetting the cq_vector to
the uninitialized value of -1 so it is ignored for a controller reset.
Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
[changelog updates, removed misc whitespace changes]
Signed-off-by: Keith Busch <keith.busch@intel.com>
Ming Lei [Fri, 23 Feb 2018 15:36:57 +0000 (23:36 +0800)]
block: kyber: fix domain token leak during requeue
When requeuing a request, the domain token should be freed before
re-inserting the request into the io scheduler. Otherwise, the
assigned domain token will be leaked, and an IO hang can result.
Cc: Paolo Valente <paolo.valente@linaro.org>
Cc: Omar Sandoval <osandov@fb.com>
Cc: stable@vger.kernel.org
Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
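The token leak described above can be modeled with a small token pool. This is a sketch under stated assumptions, not kyber's code: toy_pool, toy_rq and the three functions are hypothetical, and a token is assumed to be taken at dispatch time.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_pool {
	int free_tokens;	/* tokens still available in this domain */
};

struct toy_rq {
	bool has_token;		/* request currently owns a domain token */
};

/* Dispatch needs a token; fails if the domain has none left. */
static bool toy_dispatch(struct toy_pool *p, struct toy_rq *rq)
{
	assert(!rq->has_token);
	if (p->free_tokens == 0)
		return false;
	p->free_tokens--;
	rq->has_token = true;
	return true;
}

/* Requeue as the fix describes it: give the token back first. */
static void toy_requeue(struct toy_pool *p, struct toy_rq *rq)
{
	if (rq->has_token) {
		p->free_tokens++;
		rq->has_token = false;
	}
	/* the request would be re-inserted into the scheduler here */
}

/* Requeue without returning the token, to show the leak. */
static void toy_requeue_leaky(struct toy_rq *rq)
{
	rq->has_token = false;
}
```

With a pool of one token, a leaky requeue leaves the pool empty forever and every later dispatch fails, which corresponds to the IO hang mentioned in the commit message.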
Ming Lei [Fri, 23 Feb 2018 15:36:56 +0000 (23:36 +0800)]
blk-mq: don't call io sched's .requeue_request when requeueing rq to ->dispatch
__blk_mq_requeue_request() covers two cases:
- one is that the requeued request is added to hctx->dispatch, such as
blk_mq_dispatch_rq_list()
- another case is that the request is requeued to io scheduler, such as
blk_mq_requeue_request().
We should call io sched's .requeue_request callback only for the 2nd
case.
Cc: Paolo Valente <paolo.valente@linaro.org>
Cc: Omar Sandoval <osandov@fb.com>
Fixes: bd166ef183c2 ("blk-mq-sched: add framework for MQ capable IO schedulers")
Cc: stable@vger.kernel.org
Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
Acked-by: Paolo Valente <paolo.valente@linaro.org>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Ming Lei [Sat, 10 Feb 2018 00:46:17 +0000 (08:46 +0800)]
block: pass inclusive 'lend' parameter to truncate_inode_pages_range
The 'lend' parameter of truncate_inode_pages_range is required to be
inclusive, so follow the rule.
This patch fixes one memory corruption triggered by discard.
Cc: <stable@vger.kernel.org>
Cc: Dmitry Monakhov <dmonakhov@openvz.org>
Fixes: 351499a172c0 ("block: Invalidate cache on discard v2")
Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
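The off-by-one behind the fix above can be worked through with a few lines of arithmetic. This is an illustration only: inclusive_end() and last_page_touched() are hypothetical helpers and TOY_PAGE_SIZE is an assumed 4K page size.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096u	/* assumption: 4K pages */

/* Last byte of the range [start, start + len), as an inclusive end. */
static uint64_t inclusive_end(uint64_t start, uint64_t len)
{
	assert(len > 0);
	return start + len - 1;
}

/* Index of the last page touched when lend is treated as inclusive. */
static uint64_t last_page_touched(uint64_t lend)
{
	return lend / TOY_PAGE_SIZE;
}
```

For a discard of one page at offset 0, the inclusive end is 4095 and only page 0 is touched. Passing start + len (4096) instead makes the range reach into page 1, which lies outside the discarded region.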
Kees Cook [Fri, 23 Feb 2018 00:59:26 +0000 (16:59 -0800)]
MIPS: boot: Define __ASSEMBLY__ for its.S build
The MIPS %.its.S compiler command did not define __ASSEMBLY__, which meant
when compiler_types.h was added to kconfig.h, unexpected things appeared
(e.g. struct declarations) which should not have been present. As done in
the general %.S compiler command, __ASSEMBLY__ is now included here too.
The failure was:
Error: arch/mips/boot/vmlinux.gz.its:201.1-2 syntax error
FATAL ERROR: Unable to parse input tree
/usr/bin/mkimage: Can't read arch/mips/boot/vmlinux.gz.itb.tmp: Invalid argument
/usr/bin/mkimage Can't add hashes to FIT blob
Reported-by: kbuild test robot <lkp@intel.com>
Fixes: 28128c61e08e ("kconfig.h: Include compiler types to avoid missed struct attributes")
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Fri, 23 Feb 2018 01:04:06 +0000 (17:04 -0800)]
Merge branch 'siginfo-linus' of git://git./linux/kernel/git/ebiederm/user-namespace
Pull siginfo fix from Eric Biederman:
"This fixes a build error that only shows up on blackfin"
* 'siginfo-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
fs/signalfd: fix build error for BUS_MCEERR_AR
Linus Torvalds [Fri, 23 Feb 2018 00:38:10 +0000 (16:38 -0800)]
Merge branch 'linus' of git://git./linux/kernel/git/herbert/crypto-2.6
Pull crypto fix from Herbert Xu:
"Fix an oops in the s5p-sss driver when used with ecb(aes)"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: s5p-sss - Fix kernel Oops in AES-ECB mode
Randy Dunlap [Mon, 12 Feb 2018 21:18:38 +0000 (13:18 -0800)]
fs/signalfd: fix build error for BUS_MCEERR_AR
Fix build error in fs/signalfd.c by using same method that is used in
kernel/signal.c: separate blocks for different signal si_code values.
./fs/signalfd.c: error: 'BUS_MCEERR_AR' undeclared (first use in this function)
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Linus Torvalds [Thu, 22 Feb 2018 20:13:01 +0000 (12:13 -0800)]
Merge tag 'usb-4.16-rc3' of git://git./linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are a number of USB fixes for 4.16-rc3
Nothing major, but a number of different fixes all over the place in
the USB stack for reported issues. Mostly gadget driver fixes,
although the typical set of xhci bugfixes are there, along with some
new quirks additions as well.
All of these have been in linux-next for a while with no reported
issues"
* tag 'usb-4.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (39 commits)
Revert "usb: musb: host: don't start next rx urb if current one failed"
usb: musb: fix enumeration after resume
usb: cdc_acm: prevent race at write to acm while system resumes
Add delay-init quirk for Corsair K70 RGB keyboards
usb: ohci: Proper handling of ed_rm_list to handle race condition between usb_kill_urb() and finish_unlinks()
usb: host: ehci: always enable interrupt for qtd completion at test mode
usb: ldusb: add PIDs for new CASSY devices supported by this driver
usb: renesas_usbhs: missed the "running" flag in usb_dmac with rx path
usb: host: ehci: use correct device pointer for dma ops
usbip: keep usbip_device sockfd state in sync with tcp_socket
ohci-hcd: Fix race condition caused by ohci_urb_enqueue() and io_watchdog_func()
USB: serial: option: Add support for Quectel EP06
xhci: fix xhci debugfs errors in xhci_stop
xhci: xhci debugfs device nodes weren't removed after device plugged out
xhci: Fix xhci debugfs devices node disappearance after hibernation
xhci: Fix NULL pointer in xhci debugfs
xhci: Don't print a warning when setting link state for disabled ports
xhci: workaround for AMD Promontory disabled ports wakeup
usb: dwc3: core: Fix ULPI PHYs and prevent phy_get/ulpi_init during suspend/resume
USB: gadget: udc: Add missing platform_device_put() on error in bdc_pci_probe()
...
Linus Torvalds [Thu, 22 Feb 2018 20:05:43 +0000 (12:05 -0800)]
Merge tag 'staging-4.16-rc2' of git://git./linux/kernel/git/gregkh/staging
Pull staging/IIO fixes from Greg KH:
"Here are a small number of staging and iio driver fixes for 4.16-rc2.
The IIO fixes are all for reported things, and the android driver
fixes also resolve some reported problems. The remaining fsl-mc
Kconfig change resolves a build testing error that Arnd reported.
All of these have been in linux-next with no reported issues"
* tag 'staging-4.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
iio: buffer: check if a buffer has been set up when poll is called
iio: adis_lib: Initialize trigger before requesting interrupt
staging: android: ion: Zero CMA allocated memory
staging: android: ashmem: Fix a race condition in pin ioctls
staging: fsl-mc: fix build testing on x86
iio: srf08: fix link error "devm_iio_triggered_buffer_setup" undefined
staging: iio: ad5933: switch buffer mode to software
iio: adc: stm32: fix stm32h7_adc_enable error handling
staging: iio: adc: ad7192: fix external frequency setting
iio: adc: aspeed: Fix error handling path
Linus Torvalds [Thu, 22 Feb 2018 20:04:05 +0000 (12:04 -0800)]
Merge tag 'char-misc-4.16-rc3' of git://git./linux/kernel/git/gregkh/char-misc
Pull char/misc driver fixes from Greg KH:
"Here are a handful of char/misc driver fixes for 4.16-rc3.
There are some binder driver fixes to resolve reported issues in
stress testing the recent binder changes, some extcon driver fixes,
and a few mei driver fixes and new device ids.
All of these, with the exception of the mei driver id additions, have
been in linux-next for a while. I forgot to push out the mei driver id
additions to kernel.org until today, but all build tests pass with
them enabled"
* tag 'char-misc-4.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
mei: me: add cannon point device ids for 4th device
mei: me: add cannon point device ids
mei: set device client to the disconnected state upon suspend.
ANDROID: binder: synchronize_rcu() when using POLLFREE.
binder: replace "%p" with "%pK"
ANDROID: binder: remove WARN() for redundant txn error
binder: check for binder_thread allocation failure in binder_poll()
extcon: int3496: process id-pin first so that we start with the right status
Revert "extcon: axp288: Redo charger type detection a couple of seconds after probe()"
extcon: axp288: Constify the axp288_pwr_up_down_info array
Linus Torvalds [Thu, 22 Feb 2018 19:57:39 +0000 (11:57 -0800)]
Merge tag 'for-linus' of git://git./linux/kernel/git/rdma/rdma
Pull rdma fixes from Doug Ledford:
"Nothing in this is overly interesting, it's mostly your garden variety
fixes.
There was some work in this merge cycle around the new ioctl kABI, so
there are fixes in here related to that (probably with more to come).
We've also recently added new netlink support with a goal of moving
the primary means of configuring the entire subsystem to netlink
(eventually, this is a long term project), so there are fixes for
that.
Then a few bnxt_re driver fixes, and a few minor WARN_ON removals, and
that covers this pull request. There are already a few more fixes on
the list as of this morning, so there will certainly be more to come
in this rc cycle ;-)
Summary:
- Lots of fixes for the new IOCTL interface and general uverbs flow.
Found through testing and syzkaller
- Bugfixes for the new resource track netlink reporting
- Remove some unneeded WARN_ONs that were triggering for some users
in IPoIB
- Various fixes for the bnxt_re driver"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (27 commits)
RDMA/uverbs: Fix kernel panic while using XRC_TGT QP type
RDMA/bnxt_re: Avoid system hang during device un-reg
RDMA/bnxt_re: Fix system crash during load/unload
RDMA/bnxt_re: Synchronize destroy_qp with poll_cq
RDMA/bnxt_re: Unpin SQ and RQ memory if QP create fails
RDMA/bnxt_re: Disable atomic capability on bnxt_re adapters
RDMA/restrack: don't use uaccess_kernel()
RDMA/verbs: Check existence of function prior to accessing it
RDMA/vmw_pvrdma: Fix usage of user response structures in ABI file
RDMA/uverbs: Sanitize user entered port numbers prior to access it
RDMA/uverbs: Fix circular locking dependency
RDMA/uverbs: Fix bad unlock balance in ib_uverbs_close_xrcd
RDMA/restrack: Increment CQ restrack object before committing
RDMA/uverbs: Protect from command mask overflow
IB/uverbs: Fix unbalanced unlock on error path for rdma_explicit_destroy
IB/uverbs: Improve lockdep_check
RDMA/uverbs: Protect from races between lookup and destroy of uobjects
IB/uverbs: Hold the uobj write lock after allocate
IB/uverbs: Fix possible oops with duplicate ioctl attributes
IB/uverbs: Add ioctl support for 32bit processes
...
Linus Torvalds [Thu, 22 Feb 2018 19:53:17 +0000 (11:53 -0800)]
Merge tag 'riscv-for-linus-4.16-rc3-riscv_cleanups' of git://git./linux/kernel/git/palmer/riscv-linux
Pull RISC-V cleanups from Palmer Dabbelt:
"This contains a handful of small cleanups.
The only functional change is that IRQs are now enabled during
exception handling, which was found when some warnings triggered with
`CONFIG_DEBUG_ATOMIC_SLEEP=y`.
The remaining fixes should have no functional change: `sbi_save()` has
been renamed to `parse_dtb()` to reflect what it actually does, and a
handful of unused Kconfig entries have been removed"
* tag 'riscv-for-linus-4.16-rc3-riscv_cleanups' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux:
Rename sbi_save to parse_dtb to improve code readability
RISC-V: Enable IRQ during exception handling
riscv: Remove ARCH_HAS_ATOMIC64_DEC_IF_POSITIVE select
riscv: kconfig: Remove RISCV_IRQ_INTC select
riscv: Remove ARCH_WANT_OPTIONAL_GPIOLIB select
Linus Torvalds [Thu, 22 Feb 2018 18:45:46 +0000 (10:45 -0800)]
Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
"16 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
mm: don't defer struct page initialization for Xen pv guests
lib/Kconfig.debug: enable RUNTIME_TESTING_MENU
vmalloc: fix __GFP_HIGHMEM usage for vmalloc_32 on 32b systems
selftests/memfd: add run_fuse_test.sh to TEST_FILES
bug.h: work around GCC PR82365 in BUG()
mm/swap.c: make functions and their kernel-doc agree (again)
mm/zpool.c: zpool_evictable: fix mismatch in parameter name and kernel-doc
ida: do zeroing in ida_pre_get()
mm, swap, frontswap: fix THP swap if frontswap enabled
certs/blacklist_nohashes.c: fix const confusion in certs blacklist
kernel/relay.c: limit kmalloc size to KMALLOC_MAX_SIZE
mm, mlock, vmscan: no more skipping pagevecs
mm: memcontrol: fix NR_WRITEBACK leak in memcg and system stats
Kbuild: always define endianess in kconfig.h
include/linux/sched/mm.h: re-inline mmdrop()
tools: fix cross-compile var clobbering
Luck, Tony [Thu, 22 Feb 2018 17:15:06 +0000 (09:15 -0800)]
efivarfs: Limit the rate for non-root to read files
Each read from a file in efivarfs results in two calls to EFI
(one to get the file size, another to get the actual data).
On X86 these EFI calls result in broadcast system management
interrupts (SMI) which affect performance of the whole system.
A malicious user can loop performing reads from efivarfs bringing
the system to its knees.
Linus suggested per-user rate limit to solve this.
So we add a ratelimit structure to "user_struct" and initialize
it for the root user for no limit. When allocating user_struct for
other users we set the limit to 100 per second. This could be used
for other places that want to limit the rate of some detrimental
user action.
In efivarfs if the limit is exceeded when reading, we take an
interruptible nap for 50ms and check the rate limit again.
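A minimal userspace model of the behaviour described above, assuming a simple fixed-window limiter. It is not the kernel's ratelimit API: the demo_* names, the explicit millisecond clock, and the "burst == 0 means unlimited" convention are all assumptions made for this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-user limiter: at most `burst` events per
 * `interval_ms` window. burst == 0 models "no limit" (root). */
struct demo_ratelimit {
	uint64_t window_start_ms;
	unsigned int interval_ms;
	unsigned int burst;
	unsigned int used;
};

static void demo_ratelimit_init(struct demo_ratelimit *rl,
				unsigned int interval_ms, unsigned int burst)
{
	rl->window_start_ms = 0;
	rl->interval_ms = interval_ms;
	rl->burst = burst;
	rl->used = 0;
}

/* Returns true if the caller may proceed at time now_ms. */
static bool demo_ratelimit_ok(struct demo_ratelimit *rl, uint64_t now_ms)
{
	if (rl->burst == 0)
		return true;
	if (now_ms - rl->window_start_ms >= rl->interval_ms) {
		rl->window_start_ms = now_ms;	/* start a new window */
		rl->used = 0;
	}
	if (rl->used >= rl->burst)
		return false;
	rl->used++;
	return true;
}

/* Model of the read path: nap in 50 ms steps until the limiter allows
 * the read. Returns the time at which the read proceeds. */
static uint64_t demo_wait_for_read(struct demo_ratelimit *rl, uint64_t now_ms)
{
	while (!demo_ratelimit_ok(rl, now_ms))
		now_ms += 50;	/* stands in for an interruptible 50 ms sleep */
	return now_ms;
}
```

With interval_ms = 1000 and burst = 100, the 101st read inside one window is delayed until the next window opens, while a limiter initialised with burst = 0 never delays.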
Signed-off-by: Tony Luck <tony.luck@intel.com>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kees Cook [Thu, 22 Feb 2018 17:41:40 +0000 (09:41 -0800)]
kconfig.h: Include compiler types to avoid missed struct attributes
The header files for some structures could get included in such a way
that struct attributes (specifically __randomize_layout from path.h) would
be parsed as variable names instead of attributes. This could lead to
some instances of a structure being unrandomized, causing nasty GPFs, etc.
This patch makes sure the compiler_types.h header is included in
kconfig.h so that we've always got types and struct attributes defined,
since kconfig.h is included from the compiler command line.
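The parsing hazard described above can be reproduced in a few lines. This sketch assumes GCC or Clang; demo_layout_attr is a hypothetical stand-in for __randomize_layout, and aligned(16) is used only because it is an attribute whose effect is easy to observe.

```c
/* Case 1: the attribute macro is defined before the struct is parsed,
 * which is what including compiler_types.h from kconfig.h guarantees. */
#define demo_layout_attr __attribute__((aligned(16)))

struct with_attr {
	int a;
} demo_layout_attr;		/* parsed as an attribute on the type */

#undef demo_layout_attr

/* Case 2: the macro is not defined, so the same token is an ordinary
 * identifier and this line silently declares a global variable. */
struct without_attr {
	int a;
} demo_layout_attr;		/* a variable named demo_layout_attr */
```

Both forms compile without a diagnostic, which is why a missing definition can go unnoticed: the second struct simply loses the attribute.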
Reported-by: Patrick McLean <chutzpah@gentoo.org>
Root-caused-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Tested-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
Fixes: 3859a271a003 ("randstruct: Mark various structs for randomization")
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
H.J. Lu [Wed, 7 Feb 2018 22:20:09 +0000 (14:20 -0800)]
x86: Treat R_X86_64_PLT32 as R_X86_64_PC32
On i386, there are 2 types of PLTs, PIC and non-PIC. PIE and shared
objects must use PIC PLT. To use PIC PLT, you need to load
_GLOBAL_OFFSET_TABLE_ into EBX first. There is no need for that on
x86-64 since x86-64 uses PC-relative PLT.
On x86-64, for 32-bit PC-relative branches, we can generate PLT32
relocation, instead of PC32 relocation, which can also be used as
a marker for 32-bit PC-relative branches. Linker can always reduce
PLT32 relocation to PC32 if function is defined locally. Local
functions should use PC32 relocation. As far as Linux kernel is
concerned, R_X86_64_PLT32 can be treated the same as R_X86_64_PC32
since Linux kernel doesn't use PLT.
R_X86_64_PLT32 for 32-bit PC-relative branches has been enabled in
binutils master branch which will become binutils 2.31.
[ hjl is working on having better documentation on this all, but a few
more notes from him:
"PLT32 relocation is used as marker for PC-relative branches. Because
of EBX, it looks odd to generate PLT32 relocation on i386 when EBX
doesn't have GOT.
As for symbol resolution, PLT32 and PC32 relocations are almost
interchangeable. But when linker sees PLT32 relocation against a
protected symbol, it can resolved locally at link-time since it is
used on a branch instruction. Linker can't do that for PC32
relocation"
but for the kernel use, the two are basically the same, and this
commit gets things building and working with the current binutils
master - Linus ]
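As a rough sketch of what "treat PLT32 the same as PC32" means for code that applies relocations: both types share one case and compute S + A - P. The function below is a userspace illustration, not the kernel's relocation code; the DEMO_ constants are local copies of the psABI type numbers and demo_apply_reloc() is a hypothetical helper.

```c
#include <stdint.h>

/* Relocation type numbers from the x86-64 psABI, copied locally. */
#define DEMO_R_X86_64_PC32	2
#define DEMO_R_X86_64_PLT32	4

/*
 * Compute the 32-bit field for a PC-relative branch: S + A - P.
 * Returns 0 on success, -1 for a type this sketch does not handle or
 * when the displacement does not fit in 32 bits.
 */
static int demo_apply_reloc(unsigned int type, uint64_t sym, int64_t addend,
			    uint64_t place, int32_t *out)
{
	int64_t val;

	switch (type) {
	case DEMO_R_X86_64_PC32:
	case DEMO_R_X86_64_PLT32:	/* no PLT in use: same as PC32 */
		val = (int64_t)(sym + (uint64_t)addend - place);
		if (val != (int32_t)val)
			return -1;
		*out = (int32_t)val;
		return 0;
	default:
		return -1;
	}
}
```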
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Christoph Hellwig [Thu, 22 Feb 2018 15:24:10 +0000 (07:24 -0800)]
nvmet-loop: use blk_rq_payload_bytes for sgl selection
blk_rq_bytes does the wrong thing for special payloads like discards and
might cause the driver to not set up a SGL.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Christoph Hellwig [Thu, 22 Feb 2018 15:24:09 +0000 (07:24 -0800)]
nvme-rdma: use blk_rq_payload_bytes instead of blk_rq_bytes
blk_rq_bytes does the wrong thing for special payloads like discards and
might cause the driver to not set up a SGL.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Christoph Hellwig [Thu, 22 Feb 2018 15:24:08 +0000 (07:24 -0800)]
nvme-fabrics: don't check for non-NULL module in nvmf_register_transport
THIS_MODULE evaluates to NULL when used from code built into the kernel,
thus breaking built-in transport modules. Remove the bogus check.
Fixes: 0de5cd36 ("nvme-fabrics: protect against module unload during create_ctrl")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Juergen Gross [Wed, 21 Feb 2018 22:46:09 +0000 (14:46 -0800)]
mm: don't defer struct page initialization for Xen pv guests
Commit f7f99100d8d9 ("mm: stop zeroing memory during allocation in
vmemmap") broke Xen pv domains in some configurations, as the "Pinned"
information in struct page of early page tables could get lost.
This will lead to the kernel trying to write directly into the page
tables instead of asking the hypervisor to do so. The result is a crash
like the following:
BUG: unable to handle kernel paging request at ffff8801ead19008
IP: xen_set_pud+0x4e/0xd0
PGD 1c0a067 P4D 1c0a067 PUD 23a0067 PMD 1e9de0067 PTE 80100001ead19065
Oops: 0003 [#1] PREEMPT SMP
Modules linked in:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.14.0-default+ #271
Hardware name: Dell Inc. Latitude E6440/0159N7, BIOS A07 06/26/2014
task: ffffffff81c10480 task.stack: ffffffff81c00000
RIP: e030:xen_set_pud+0x4e/0xd0
Call Trace:
__pmd_alloc+0x128/0x140
ioremap_page_range+0x3f4/0x410
__ioremap_caller+0x1c3/0x2e0
acpi_os_map_iomem+0x175/0x1b0
acpi_tb_acquire_table+0x39/0x66
acpi_tb_validate_table+0x44/0x7c
acpi_tb_verify_temp_table+0x45/0x304
acpi_reallocate_root_table+0x12d/0x141
acpi_early_init+0x4d/0x10a
start_kernel+0x3eb/0x4a1
xen_start_kernel+0x528/0x532
Code: 48 01 e8 48 0f 42 15 a2 fd be 00 48 01 d0 48 ba 00 00 00 00 00 ea ff ff 48 c1 e8 0c 48 c1 e0 06 48 01 d0 48 8b 00 f6 c4 02 75 5d <4c> 89 65 00 5b 5d 41 5c c3 65 8b 05 52 9f fe 7e 89 c0 48 0f a3
RIP: xen_set_pud+0x4e/0xd0 RSP: ffffffff81c03cd8
CR2: ffff8801ead19008
---[ end trace 38eca2e56f1b642e ]---
Avoid this problem by not deferring struct page initialization when
running as Xen pv guest.
Pavel said:
: This is unique for Xen, so this particular issue won't affect other
: configurations. I am going to investigate if there is a way to
: re-enable deferred page initialization on xen guests.
[akpm@linux-foundation.org: explicitly include xen.h]
Link: http://lkml.kernel.org/r/20180216154101.22865-1-jgross@suse.com
Fixes: f7f99100d8d95d ("mm: stop zeroing memory during allocation in vmemmap")
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Cc: Steven Sistare <steven.sistare@oracle.com>
Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Bob Picco <bob.picco@oracle.com>
Cc: <stable@vger.kernel.org> [4.15.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Anders Roxell [Wed, 21 Feb 2018 22:46:05 +0000 (14:46 -0800)]
lib/Kconfig.debug: enable RUNTIME_TESTING_MENU
Commit d3deafaa8b5c ("lib/: make RUNTIME_TESTS a menuconfig to ease
disabling it all") causes a regression when using runtime tests because
it leaves RUNTIME_TESTING_MENU unset by default.
Link: http://lkml.kernel.org/r/20180214133015.10090-1-anders.roxell@linaro.org
Fixes: d3deafaa8b5c ("lib/: make RUNTIME_TESTS a menuconfig to ease disabling it all")
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Cc: Vincent Legoll <vincent.legoll@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Byungchul Park <byungchul.park@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Michal Hocko [Wed, 21 Feb 2018 22:46:01 +0000 (14:46 -0800)]
vmalloc: fix __GFP_HIGHMEM usage for vmalloc_32 on 32b systems
Kai Heng Feng has noticed that BUG_ON(PageHighMem(pg)) triggers in
drivers/media/common/saa7146/saa7146_core.c since 19809c2da28a ("mm,
vmalloc: use __GFP_HIGHMEM implicitly").
saa7146_vmalloc_build_pgtable uses vmalloc_32 and it is reasonable to
expect that the resulting page is not in highmem. The above commit
aimed to add __GFP_HIGHMEM only for those requests which do not specify
any zone modifier gfp flag. vmalloc_32 relies on GFP_VMALLOC32 which
should do the right thing. Except it has been missed that GFP_VMALLOC32
is an alias for GFP_KERNEL on 32b architectures. Thanks to Matthew for
noticing this.
Fix the problem by unconditionally setting GFP_DMA32 in GFP_VMALLOC32
for !64b arches (as a bailout). This should do the right thing and use
ZONE_NORMAL which should be always below 4G on 32b systems.
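The interaction can be modelled in a few lines. Everything here is an assumption made for illustration: the DEMO_ flag bits are not the kernel's values, and demo_vmalloc_gfp() only mimics the "add highmem when no zone modifier is given" rule described above.

```c
/* Hypothetical flag bits for this sketch; not the kernel's values. */
#define DEMO_GFP_DMA		0x01u
#define DEMO_GFP_HIGHMEM	0x02u
#define DEMO_GFP_DMA32		0x04u
#define DEMO_GFP_KERNEL		0x10u
#define DEMO_ZONE_MASK	(DEMO_GFP_DMA | DEMO_GFP_HIGHMEM | DEMO_GFP_DMA32)

/* Before the fix on 32b: a plain alias for GFP_KERNEL, no zone modifier */
#define DEMO_VMALLOC32_OLD	DEMO_GFP_KERNEL
/* After the fix: always carries a zone modifier */
#define DEMO_VMALLOC32_NEW	(DEMO_GFP_DMA32 | DEMO_GFP_KERNEL)

/* Model of the implicit-highmem rule: add HIGHMEM only when the caller
 * named no zone at all. */
static unsigned int demo_vmalloc_gfp(unsigned int gfp)
{
	if (!(gfp & DEMO_ZONE_MASK))
		gfp |= DEMO_GFP_HIGHMEM;
	return gfp;
}
```

With the old alias the request carries no zone bit, so it picks up the highmem flag; with the new definition the zone bit is present and the request is left alone.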
Debugged by Matthew Wilcox.
[akpm@linux-foundation.org: coding-style fixes]
Link: http://lkml.kernel.org/r/20180212095019.GX21609@dhcp22.suse.cz
Fixes: 19809c2da28a ("mm, vmalloc: use __GFP_HIGHMEM implicitly")
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: Kai Heng Feng <kai.heng.feng@canonical.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Laura Abbott <labbott@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Anders Roxell [Wed, 21 Feb 2018 22:45:58 +0000 (14:45 -0800)]
selftests/memfd: add run_fuse_test.sh to TEST_FILES
While testing memfd tests, there is a missing script, as reported by
kselftest:
./run_tests.sh: line 7: ./run_fuse_test.sh: No such file or directory
Link: http://lkml.kernel.org/r/1517955779-11386-1-git-send-email-daniel.diaz@linaro.org
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Signed-off-by: Daniel Díaz <daniel.diaz@linaro.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Arnd Bergmann [Wed, 21 Feb 2018 22:45:54 +0000 (14:45 -0800)]
bug.h: work around GCC PR82365 in BUG()
Looking at functions with large stack frames across all architectures
led me to discover that BUG() suffers from the same problem as
fortify_panic(), which I've added a workaround for already.
In short, variables that go out of scope by calling a noreturn function
or __builtin_unreachable() keep using stack space in functions
afterwards.
A workaround that was identified is to insert an empty assembler
statement just before calling the function that doesn't return. I'm
adding a macro "barrier_before_unreachable()" to document this, and
insert calls to that in all instances of BUG() that currently suffer
from this problem.
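The workaround can be sketched in userspace as follows, assuming GCC or Clang inline asm. The demo_ names are hypothetical, abort() stands in for the architecture's non-returning trap, and the macro body is an assumption based on the "empty assembler statement" wording above rather than a copy of the patch.

```c
#include <stdlib.h>

/* An empty asm statement placed just before the non-returning call. */
#define demo_barrier_before_unreachable() __asm__ volatile("")

#define DEMO_BUG() do {					\
	demo_barrier_before_unreachable();		\
	abort();		/* does not return */	\
} while (0)

static int demo_checked_div(int a, int b)
{
	if (b == 0)
		DEMO_BUG();
	return a / b;
}
```

The barrier generates no instructions; its only purpose is to change how the compiler treats the code in front of the non-returning call.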
The files that saw the largest change from this had these frame sizes
before, and much less with my patch:
fs/ext4/inode.c:82:1: warning: the frame size of 1672 bytes is larger than 800 bytes [-Wframe-larger-than=]
fs/ext4/namei.c:434:1: warning: the frame size of 904 bytes is larger than 800 bytes [-Wframe-larger-than=]
fs/ext4/super.c:2279:1: warning: the frame size of 1160 bytes is larger than 800 bytes [-Wframe-larger-than=]
fs/ext4/xattr.c:146:1: warning: the frame size of 1168 bytes is larger than 800 bytes [-Wframe-larger-than=]
fs/f2fs/inode.c:152:1: warning: the frame size of 1424 bytes is larger than 800 bytes [-Wframe-larger-than=]
net/netfilter/ipvs/ip_vs_core.c:1195:1: warning: the frame size of 1068 bytes is larger than 800 bytes [-Wframe-larger-than=]
net/netfilter/ipvs/ip_vs_core.c:395:1: warning: the frame size of 1084 bytes is larger than 800 bytes [-Wframe-larger-than=]
net/netfilter/ipvs/ip_vs_ftp.c:298:1: warning: the frame size of 928 bytes is larger than 800 bytes [-Wframe-larger-than=]
net/netfilter/ipvs/ip_vs_ftp.c:418:1: warning: the frame size of 908 bytes is larger than 800 bytes [-Wframe-larger-than=]
net/netfilter/ipvs/ip_vs_lblcr.c:718:1: warning: the frame size of 960 bytes is larger than 800 bytes [-Wframe-larger-than=]
drivers/net/xen-netback/netback.c:1500:1: warning: the frame size of 1088 bytes is larger than 800 bytes [-Wframe-larger-than=]
In case of ARC and CRIS, it turns out that the BUG() implementation
actually does return (or at least the compiler thinks it does),
resulting in lots of warnings about uninitialized variable use and
leaving noreturn functions, such as:
block/cfq-iosched.c: In function 'cfq_async_queue_prio':
block/cfq-iosched.c:3804:1: error: control reaches end of non-void function [-Werror=return-type]
include/linux/dmaengine.h: In function 'dma_maxpq':
include/linux/dmaengine.h:1123:1: error: control reaches end of non-void function [-Werror=return-type]
This makes them call __builtin_trap() instead, which should normally
dump the stack and kill the current process, like some of the other
architectures already do.
I tried adding barrier_before_unreachable() to panic() and
fortify_panic() as well, but that had very little effect, so I'm not
submitting that patch.
Vineet said:
: For ARC, it is double win.
:
: 1. Fixes 3 -Wreturn-type warnings
:
: | ../net/core/ethtool.c:311:1: warning: control reaches end of non-void function
: [-Wreturn-type]
: | ../kernel/sched/core.c:3246:1: warning: control reaches end of non-void function
: [-Wreturn-type]
: | ../include/linux/sunrpc/svc_xprt.h:180:1: warning: control reaches end of
: non-void function [-Wreturn-type]
:
: 2. bloat-o-meter reports code size improvements as gcc elides the
: generated code for stack return.
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82365
Link: http://lkml.kernel.org/r/20171219114112.939391-1-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Vineet Gupta <vgupta@synopsys.com> [arch/arc]
Tested-by: Vineet Gupta <vgupta@synopsys.com> [arch/arc]
Cc: Mikael Starvik <starvik@axis.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Christopher Li <sparse@chrisli.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mike Rapoport [Wed, 21 Feb 2018 22:45:50 +0000 (14:45 -0800)]
mm/swap.c: make functions and their kernel-doc agree (again)
There was a conflict between the commit e02a9f048ef7 ("mm/swap.c: make
functions and their kernel-doc agree") and the commit f144c390f905 ("mm:
docs: fix parameter names mismatch") that both tried to fix the mismatch
between pagevec_lookup_entries() parameter names and their description.
Since nr_entries is a better name for the parameter, fix the description
again.
Link: http://lkml.kernel.org/r/1518116946-20947-1-git-send-email-rppt@linux.vnet.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mike Rapoport [Wed, 21 Feb 2018 22:45:46 +0000 (14:45 -0800)]
mm/zpool.c: zpool_evictable: fix mismatch in parameter name and kernel-doc
[akpm@linux-foundation.org: add colon, per Randy]
Link: http://lkml.kernel.org/r/1518116984-21141-1-git-send-email-rppt@linux.vnet.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rasmus Villemoes [Wed, 21 Feb 2018 22:45:43 +0000 (14:45 -0800)]
ida: do zeroing in ida_pre_get()
As far as I can tell, the only place the per-cpu ida_bitmap is populated
is in ida_pre_get. The pre-allocated element is stolen in two places in
ida_get_new_above, in both cases immediately followed by a memset(0).
Since ida_get_new_above is called with locks held, do the zeroing in
ida_pre_get, or rather let kmalloc() do it. Also, apparently gcc
generates ~44 bytes of code to do a memset(, 0, 128):
$ scripts/bloat-o-meter vmlinux.{0,1}
add/remove: 0/0 grow/shrink: 2/1 up/down: 5/-88 (-83)
Function                                     old     new   delta
ida_pre_get                                  115     119      +4
vermagic                                      27      28      +1
ida_get_new_above                            715     627     -88
Link: http://lkml.kernel.org/r/20180108225634.15340-1-linux@rasmusvillemoes.dk
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Acked-by: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Eric Biggers <ebiggers@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Huang Ying [Wed, 21 Feb 2018 22:45:39 +0000 (14:45 -0800)]
mm, swap, frontswap: fix THP swap if frontswap enabled
It was reported by Sergey Senozhatsky that if THP (Transparent Huge
Page) and frontswap (via zswap) are both enabled, when memory goes low
so that swap is triggered, segfault and memory corruption will occur in
random user space applications as follows:
kernel: urxvt[338]: segfault at 20 ip 00007fc08889ae0d sp 00007ffc73a7fc40 error 6 in libc-2.26.so[7fc08881a000+1ae000]
#0 0x00007fc08889ae0d _int_malloc (libc.so.6)
#1 0x00007fc08889c2f3 malloc (libc.so.6)
#2 0x0000560e6004bff7 _Z14rxvt_wcstoutf8PKwi (urxvt)
#3 0x0000560e6005e75c n/a (urxvt)
#4 0x0000560e6007d9f1 _ZN16rxvt_perl_interp6invokeEP9rxvt_term9hook_typez (urxvt)
#5 0x0000560e6003d988 _ZN9rxvt_term9cmd_parseEv (urxvt)
#6 0x0000560e60042804 _ZN9rxvt_term6pty_cbERN2ev2ioEi (urxvt)
#7 0x0000560e6005c10f _Z17ev_invoke_pendingv (urxvt)
#8 0x0000560e6005cb55 ev_run (urxvt)
#9 0x0000560e6003b9b9 main (urxvt)
#10 0x00007fc08883af4a __libc_start_main (libc.so.6)
#11 0x0000560e6003f9da _start (urxvt)
After bisection, it was found the first bad commit is bd4c82c22c36 ("mm,
THP, swap: delay splitting THP after swapped out").
The root cause is as follows:
When the pages are written to swap device during swapping out in
swap_writepage(), zswap (frontswap) tries to compress the pages to
improve performance. But zswap (frontswap) will treat THP as a normal
page, so only the head page is saved. After swapping in, tail pages
will not be restored to their original contents, causing memory
corruption in the applications.
This is fixed by refusing to save the page in the frontswap store functions
if the page is a THP, so that the THP will be swapped out to the swap
device.
Another choice is to split THP if frontswap is enabled. But it is found
that the frontswap enabling isn't flexible. For example, if
CONFIG_ZSWAP=y (cannot be module), frontswap will be enabled even if
zswap itself isn't enabled.
Frontswap has multiple backends, to make it easy for one backend to
enable THP support, the THP checking is put in backend frontswap store
functions instead of the general interfaces.
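The shape of such a backend check might look like the sketch below. It is a simplified userspace model: struct demo_swap_page and demo_backend_store() are hypothetical, and the real store functions take different arguments.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct page. */
struct demo_swap_page {
	bool is_thp;
};

/* Returns 0 if the backend stored the page, -1 to make the caller fall
 * back to writing the page to the swap device. */
static int demo_backend_store(const struct demo_swap_page *page)
{
	if (page->is_thp)
		return -1;	/* this backend only handles single pages */
	/* ... compress and keep the single page here ... */
	return 0;
}
```

Keeping the check inside the backend means a backend that later learns to handle THP can drop its own check without touching the general interface.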
Link: http://lkml.kernel.org/r/20180209084947.22749-1-ying.huang@intel.com
Fixes: bd4c82c22c367e068 ("mm, THP, swap: delay splitting THP after swapped out")
Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Suggested-by: Minchan Kim <minchan@kernel.org> [put THP checking in backend]
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Dan Streetman <ddstreet@ieee.org>
Cc: Seth Jennings <sjenning@redhat.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Shaohua Li <shli@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: <stable@vger.kernel.org> [4.14]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andi Kleen [Wed, 21 Feb 2018 22:45:35 +0000 (14:45 -0800)]
certs/blacklist_nohashes.c: fix const confusion in certs blacklist
const must be marked __initconst, not __initdata.
Link: http://lkml.kernel.org/r/20171222001335.1987-1-andi@firstfloor.org
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Rientjes [Wed, 21 Feb 2018 22:45:32 +0000 (14:45 -0800)]
kernel/relay.c: limit kmalloc size to KMALLOC_MAX_SIZE
chan->n_subbufs is set by the user and relay_create_buf() does a kmalloc()
of chan->n_subbufs * sizeof(size_t *).
kmalloc_slab() will generate a warning when this fails if
chan->n_subbufs * sizeof(size_t *) > KMALLOC_MAX_SIZE.
Limit chan->n_subbufs to the maximum allowed kmalloc() size.
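A bound of this kind can be sketched as below. DEMO_KMALLOC_MAX_SIZE is a made-up cap standing in for KMALLOC_MAX_SIZE, and demo_n_subbufs_ok() is a hypothetical helper, not the code added by the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cap standing in for KMALLOC_MAX_SIZE. */
#define DEMO_KMALLOC_MAX_SIZE ((size_t)4 << 20)

/* Returns 1 if an array of n_subbufs pointers fits in one allocation. */
static int demo_n_subbufs_ok(size_t n_subbufs)
{
	/* Divide instead of multiplying so the check cannot overflow. */
	return n_subbufs <= DEMO_KMALLOC_MAX_SIZE / sizeof(size_t *);
}
```

Comparing against the quotient rejects oversized requests before any multiplication takes place, so a user-supplied count cannot wrap around into a small allocation size.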
Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1802061216100.122576@chino.kir.corp.google.com
Fixes: f6302f1bcd75 ("relay: prevent integer overflow in relay_open()")
Signed-off-by: David Rientjes <rientjes@google.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Shakeel Butt [Wed, 21 Feb 2018 22:45:28 +0000 (14:45 -0800)]
mm, mlock, vmscan: no more skipping pagevecs
When a thread mlocks an address space backed either by file pages which
are currently not present in memory or swapped out anon pages (not in
swapcache), a new page is allocated and added to the local pagevec
(lru_add_pvec), I/O is triggered and the thread then sleeps on the page.
On I/O completion, the thread can wake on a different CPU, the mlock
syscall will then set the PageMlocked() bit of the page but will not be
able to put that page in unevictable LRU as the page is on the pagevec
of a different CPU. Even on drain, that page will go to evictable LRU
because the PageMlocked() bit is not checked on pagevec drain.
The page will eventually go to right LRU on reclaim but the LRU stats
will remain skewed for a long time.
This patch puts all the pages, even unevictable, to the pagevecs and on
the drain, the pages will be added on their LRUs correctly by checking
their evictability. This resolves the mlocked pages on pagevec of other
CPUs issue because when those pagevecs will be drained, the mlocked file
pages will go to unevictable LRU. Also this makes the race with munlock
easier to resolve because the pagevec drains happen in LRU lock.
However there is still one place which makes a page evictable and does
PageLRU check on that page without LRU lock and needs special attention.
TestClearPageMlocked() and isolate_lru_page() in clear_page_mlock().
#0: __pagevec_lru_add_fn        #1: clear_page_mlock
SetPageLRU()                    if (!TestClearPageMlocked())
                                  return
smp_mb() // <--required
                                // inside does PageLRU
if (!PageMlocked())             if (isolate_lru_page())
  move to evictable LRU           putback_lru_page()
else
  move to unevictable LRU
In '#1', TestClearPageMlocked() provides full memory barrier semantics
and thus the PageLRU check (inside isolate_lru_page) can not be
reordered before it.
In '#0', without explicit memory barrier, the PageMlocked() check can be
reordered before SetPageLRU(). If that happens, '#0' can put a page in
unevictable LRU and '#1' might have just cleared the Mlocked bit of that
page but fails to isolate as PageLRU fails as '#0' still hasn't set
PageLRU bit of that page. That page will be stranded on the unevictable
LRU.
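The ordering argument above can be checked with a small model that enumerates every interleaving of the two CPUs. This is a simplified Python sketch, not kernel code: each CPU is reduced to two steps, locking is ignored, and the names run(), any_stranded(), WITH_BARRIER and REORDERED are hypothetical. The page starts with Mlocked set and LRU clear. A page counts as stranded when '#0' files it on the unevictable LRU, '#1' has cleared Mlocked, and '#1' could not isolate it.

```python
from itertools import combinations

WITH_BARRIER = ["set_lru", "read_mlocked"]   # program order kept by smp_mb()
REORDERED = ["read_mlocked", "set_lru"]      # the read moved before SetPageLRU()
CPU1_OPS = ["test_clear_mlocked", "check_lru"]


def run(schedule, cpu0_ops):
    """Execute one interleaving; return True if the page ends up stranded."""
    page = {"mlocked": True, "lru": False}
    seen_mlocked = False   # what CPU #0 observed in PageMlocked()
    isolated = False       # whether CPU #1 saw PageLRU and could isolate
    ops = {0: iter(cpu0_ops), 1: iter(CPU1_OPS)}
    for cpu in schedule:
        op = next(ops[cpu])
        if op == "set_lru":
            page["lru"] = True
        elif op == "read_mlocked":
            seen_mlocked = page["mlocked"]
        elif op == "test_clear_mlocked":
            page["mlocked"] = False
        elif op == "check_lru":
            isolated = page["lru"]
    # CPU #0 files the page by what it read; CPU #1 refiles it only if it
    # managed to isolate the page.
    on_unevictable = seen_mlocked and not isolated
    return on_unevictable and not page["mlocked"]


def any_stranded(cpu0_ops):
    """Try all interleavings of two CPU #0 steps with two CPU #1 steps."""
    for slots in combinations(range(4), 2):
        schedule = [0 if i in slots else 1 for i in range(4)]
        if run(schedule, cpu0_ops):
            return True
    return False
```

In this model no interleaving strands the page when '#0' keeps program order, while the reordered variant does for the schedule in which both '#1' steps run between the two '#0' steps.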
There is one (good) side effect though. Without this patch, the pages
allocated for System V shared memory segment are added to evictable LRUs
even after shmctl(SHM_LOCK) on that segment. This patch will correctly
put such pages to unevictable LRU.
Link: http://lkml.kernel.org/r/20171121211241.18877-1-shakeelb@google.com
Signed-off-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Greg Thelen <gthelen@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Shaohua Li <shli@fb.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Johannes Weiner [Wed, 21 Feb 2018 22:45:24 +0000 (14:45 -0800)]
mm: memcontrol: fix NR_WRITEBACK leak in memcg and system stats
After commit a983b5ebee57 ("mm: memcontrol: fix excessive complexity in
memory.stat reporting"), we observed slowly upward creeping NR_WRITEBACK
counts over the course of several days, both the per-memcg stats as well
as the system counter in e.g. /proc/meminfo.
The conversion from full per-cpu stat counts to per-cpu cached atomic
stat counts introduced an irq-unsafe RMW operation into the updates.
Most stat updates come from process context, but one notable exception
is the NR_WRITEBACK counter. While writebacks are issued from process
context, they are retired from (soft)irq context.
When writeback completions interrupt the RMW counter updates of new
writebacks being issued, the decs from the completions are lost.
Since the global updates are routed through the joint lruvec API, both
the memcg counters as well as the system counters are affected.
This patch makes the joint stat and event API irq safe.
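The lost decrement can be reproduced with a toy model of a non-atomic read-modify-write. This is an illustrative Python sketch under stated assumptions: the dictionary stands in for a per-cpu cached counter, the irq_handler argument is a hypothetical way to inject an interrupt between the read and the write, and none of the names below are kernel APIs.

```python
def issue_writeback(stats, irq_handler=None):
    """Non-atomic increment: read, (possible interrupt), write."""
    value = stats["nr_writeback"]          # read
    if irq_handler is not None:
        irq_handler(stats)                 # completion interrupt fires mid-update
    stats["nr_writeback"] = value + 1      # write back, based on the stale read


def complete_writeback(stats):
    """Decrement issued from (soft)irq context when a writeback retires."""
    stats["nr_writeback"] -= 1


def racy():
    stats = {"nr_writeback": 1}            # one writeback already in flight
    issue_writeback(stats, irq_handler=complete_writeback)
    return stats["nr_writeback"]


def irq_safe():
    stats = {"nr_writeback": 1}
    issue_writeback(stats)                 # interrupts masked during the update
    complete_writeback(stats)              # the pending completion runs afterwards
    return stats["nr_writeback"]
```

One issue plus one completion should leave the counter at 1. The racy path ends at 2 because the decrement is overwritten, which matches the slow upward creep described above.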
Link: http://lkml.kernel.org/r/20180203082353.17284-1-hannes@cmpxchg.org
Fixes: a983b5ebee57 ("mm: memcontrol: fix excessive complexity in memory.stat reporting")
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Debugged-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Rik van Riel <riel@surriel.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Arnd Bergmann [Wed, 21 Feb 2018 22:45:20 +0000 (14:45 -0800)]
Kbuild: always define endianess in kconfig.h
Build testing with LTO found a couple of files that get compiled
differently depending on whether asm/byteorder.h gets included early
enough or not. In particular, include/asm-generic/qrwlock_types.h is
affected by this, but there are probably others as well.
The symptom is a series of LTO link time warnings, including these:
net/netlabel/netlabel_unlabeled.h:223: error: type of 'netlbl_unlhsh_add' does not match original declaration [-Werror=lto-type-mismatch]
int netlbl_unlhsh_add(struct net *net,
net/netlabel/netlabel_unlabeled.c:377: note: 'netlbl_unlhsh_add' was previously declared here
include/net/ipv6.h:360: error: type of 'ipv6_renew_options_kern' does not match original declaration [-Werror=lto-type-mismatch]
ipv6_renew_options_kern(struct sock *sk,
net/ipv6/exthdrs.c:1162: note: 'ipv6_renew_options_kern' was previously declared here
net/core/dev.c:761: note: 'dev_get_by_name_rcu' was previously declared here
struct net_device *dev_get_by_name_rcu(struct net *net, const char *name)
net/core/dev.c:761: note: code may be misoptimized unless -fno-strict-aliasing is used
drivers/gpu/drm/i915/i915_drv.h:3377: error: type of 'i915_gem_object_set_to_wc_domain' does not match original declaration [-Werror=lto-type-mismatch]
i915_gem_object_set_to_wc_domain(struct drm_i915_gem_object *obj, bool write);
drivers/gpu/drm/i915/i915_gem.c:3639: note: 'i915_gem_object_set_to_wc_domain' was previously declared here
include/linux/debugfs.h:92:9: error: type of 'debugfs_attr_read' does not match original declaration [-Werror=lto-type-mismatch]
ssize_t debugfs_attr_read(struct file *file, char __user *buf,
fs/debugfs/file.c:318: note: 'debugfs_attr_read' was previously declared here
include/linux/rwlock_api_smp.h:30: error: type of '_raw_read_unlock' does not match original declaration [-Werror=lto-type-mismatch]
void __lockfunc _raw_read_unlock(rwlock_t *lock) __releases(lock);
kernel/locking/spinlock.c:246:26: note: '_raw_read_unlock' was previously declared here
include/linux/fs.h:3308:5: error: type of 'simple_attr_open' does not match original declaration [-Werror=lto-type-mismatch]
int simple_attr_open(struct inode *inode, struct file *file,
fs/libfs.c:795: note: 'simple_attr_open' was previously declared here
All of the above are caused by include/asm-generic/qrwlock_types.h
failing to include asm/byteorder.h after commit e0d02285f16e
("locking/qrwlock: Use 'struct qrwlock' instead of 'struct __qrwlock'")
in linux-4.15.
Similar bugs may or may not exist in older kernels as well, but there is
no easy way to test those with link-time optimizations, and kernels
before 4.14 are harder to fix because they don't have Babu's patch
series.
We had similar issues with CONFIG_ symbols in the past and ended up
always including the configuration headers through linux/kconfig.h. This
works around the issue through that same file, defining either
__BIG_ENDIAN or __LITTLE_ENDIAN depending on CONFIG_CPU_BIG_ENDIAN,
which is now always set on all architectures since commit 4c97a0c8fee3
("arch: define CPU_BIG_ENDIAN for all fixed big endian archs").
Link: http://lkml.kernel.org/r/20180202154104.1522809-2-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Babu Moger <babu.moger@amd.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Nicolas Pitre <nico@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrew Morton [Wed, 21 Feb 2018 22:45:17 +0000 (14:45 -0800)]
include/linux/sched/mm.h: re-inline mmdrop()
As Peter points out, doing a CALL+RET for just the decrement is a bit silly.
Fixes: d70f2a14b72a4bc ("include/linux/sched/mm.h: uninline mmdrop_async(), etc")
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Martin Kelly [Wed, 21 Feb 2018 22:45:12 +0000 (14:45 -0800)]
tools: fix cross-compile var clobbering
Currently a number of Makefiles break when used with toolchains that
pass extra flags in CC and other cross-compile related variables (such
as --sysroot).
Thus we get this error when we use a toolchain that puts --sysroot in
the CC var:
~/src/linux/tools$ make iio
[snip]
iio_event_monitor.c:18:10: fatal error: unistd.h: No such file or directory
#include <unistd.h>
^~~~~~~~~~
This occurs because we clobber several env vars related to
cross-compiling with lines like this:
CC = $(CROSS_COMPILE)gcc
Although this will point to a valid cross-compiler, we lose any extra
flags that might exist in the CC variable, which can break toolchains
that rely on them (for example, those that use --sysroot).
This easily shows up using a Yocto SDK:
$ . [snip]/sdk/environment-setup-cortexa8hf-neon-poky-linux-gnueabi
$ echo $CC
arm-poky-linux-gnueabi-gcc -march=armv7-a -mfpu=neon -mfloat-abi=hard
-mcpu=cortex-a8
--sysroot=[snip]/sdk/sysroots/cortexa8hf-neon-poky-linux-gnueabi
$ echo $CROSS_COMPILE
arm-poky-linux-gnueabi-
$ echo ${CROSS_COMPILE}gcc
arm-poky-linux-gnueabi-gcc
Although arm-poky-linux-gnueabi-gcc is a cross-compiler, we've lost the
--sysroot and other flags that enable us to find the right libraries to
link against, so we can't find unistd.h and other libraries and headers.
Normally with the --sysroot flag we would find unistd.h in the sdk
directory in the sysroot:
$ find [snip]/sdk/sysroots -path '*/usr/include/unistd.h'
[snip]/sdk/sysroots/cortexa8hf-neon-poky-linux-gnueabi/usr/include/unistd.h
The perf Makefile adds CC = $(CROSS_COMPILE)gcc if and only if CC is not
already set, and it compiles correctly with the above toolchain.
So, generalize the logic that perf uses in the common Makefile and
remove the manual CC = $(CROSS_COMPILE)gcc lines from each Makefile.
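The selection rule described above can be sketched as a small function. This is an illustrative Python model only, not the Makefile logic itself: resolve_cc() is a hypothetical name, the environment is passed in as a plain dictionary, and the sysroot path in the usage note is made up for the example.

```python
def resolve_cc(env):
    """Pick the compiler command the way the text describes.

    A CC that is already set wins, so flags such as --sysroot embedded in
    it survive. Only when CC is unset do we fall back to
    $(CROSS_COMPILE)gcc.
    """
    cc = env.get("CC")
    if cc:
        return cc
    return env.get("CROSS_COMPILE", "") + "gcc"
```

For example, resolve_cc({"CC": "arm-poky-linux-gnueabi-gcc --sysroot=/opt/sdk", "CROSS_COMPILE": "arm-poky-linux-gnueabi-"}) returns the CC string unchanged, while an environment that only sets CROSS_COMPILE yields the prefixed gcc.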
Note that this patch does not fix cross-compile for all the tools (some
have other bugs), but it does fix it for all except usb and acpi, which
still have other unrelated issues.
I tested both with and without the patch on native and cross-build and
there appear to be no regressions.
Link: http://lkml.kernel.org/r/20180107214028.23771-1-martin@martingkelly.com
Signed-off-by: Martin Kelly <martin@martingkelly.com>
Acked-by: Mark Brown <broonie@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Jonathan Cameron <jic23@kernel.org>
Cc: Pali Rohar <pali.rohar@gmail.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Cc: Jacek Anaszewski <jacek.anaszewski@gmail.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Robert Moore <robert.moore@intel.com>
Cc: Lv Zheng <lv.zheng@intel.com>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Valentina Manea <valentina.manea.m@gmail.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Mario Limonciello <mario.limonciello@dell.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Leon Romanovsky [Wed, 21 Feb 2018 08:25:01 +0000 (10:25 +0200)]
RDMA/uverbs: Fix kernel panic while using XRC_TGT QP type
An attempt to modify an XRC_TGT QP from user space (for example by
invoking ibv_xsrq_pingpong) triggers the following kernel panic. It is
caused by the fact that such QPs missed uobject initialization.
[ 17.408845] BUG: unable to handle kernel NULL pointer dereference at 0000000000000048
[ 17.412645] IP: rdma_lookup_put_uobject+0x9/0x50
[ 17.416567] PGD 0 P4D 0
[ 17.419262] Oops: 0000 [#1] SMP PTI
[ 17.422915] CPU: 0 PID: 455 Comm: ibv_xsrq_pingpo Not tainted 4.16.0-rc1+ #86
[ 17.424765] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
[ 17.427399] RIP: 0010:rdma_lookup_put_uobject+0x9/0x50
[ 17.428445] RSP: 0018:ffffb8c7401e7c90 EFLAGS: 00010246
[ 17.429543] RAX: 0000000000000000 RBX: ffffb8c7401e7cf8 RCX: 0000000000000000
[ 17.432426] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000000000000000
[ 17.437448] RBP: 0000000000000000 R08: 00000000000218f0 R09: ffffffff8ebc4cac
[ 17.440223] R10: fffff6038052cd80 R11: ffff967694b36400 R12: ffff96769391f800
[ 17.442184] R13: ffffb8c7401e7cd8 R14: 0000000000000000 R15: ffff967699f60000
[ 17.443971] FS: 00007fc29207d700(0000) GS:ffff96769fc00000(0000) knlGS:0000000000000000
[ 17.446623] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 17.448059] CR2: 0000000000000048 CR3: 000000001397a000 CR4: 00000000000006b0
[ 17.449677] Call Trace:
[ 17.450247] modify_qp.isra.20+0x219/0x2f0
[ 17.451151] ib_uverbs_modify_qp+0x90/0xe0
[ 17.452126] ib_uverbs_write+0x1d2/0x3c0
[ 17.453897] ? __handle_mm_fault+0x93c/0xe40
[ 17.454938] __vfs_write+0x36/0x180
[ 17.455875] vfs_write+0xad/0x1e0
[ 17.456766] SyS_write+0x52/0xc0
[ 17.457632] do_syscall_64+0x75/0x180
[ 17.458631] entry_SYSCALL_64_after_hwframe+0x21/0x86
[ 17.460004] RIP: 0033:0x7fc29198f5a0
[ 17.460982] RSP: 002b:00007ffccc71f018 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 17.463043] RAX: ffffffffffffffda RBX: 0000000000000078 RCX: 00007fc29198f5a0
[ 17.464581] RDX: 0000000000000078 RSI: 00007ffccc71f050 RDI: 0000000000000003
[ 17.466148] RBP: 0000000000000000 R08: 0000000000000078 R09: 00007ffccc71f050
[ 17.467750] R10: 000055b6cf87c248 R11: 0000000000000246 R12: 00007ffccc71f300
[ 17.469541] R13: 000055b6cf8733a0 R14: 0000000000000000 R15: 0000000000000000
[ 17.471151] Code: 00 00 0f 1f 44 00 00 48 8b 47 48 48 8b 00 48 8b 40 10 e9 0b 8b 68 00 90 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 53 89 f5 <48> 8b 47 48 48 89 fb 40 0f b6 f6 48 8b 00 48 8b 40 20 e8 e0 8a
[ 17.475185] RIP: rdma_lookup_put_uobject+0x9/0x50 RSP: ffffb8c7401e7c90
[ 17.476841] CR2: 0000000000000048
[ 17.477764] ---[ end trace 1dbcc5354071a712 ]---
[ 17.478880] Kernel panic - not syncing: Fatal exception
[ 17.480277] Kernel Offset: 0xd000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
Fixes: 2f08ee363fe0 ("RDMA/restrack: don't use uaccess_kernel()")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Michael Clark [Thu, 15 Feb 2018 20:30:29 +0000 (09:30 +1300)]
Rename sbi_save to parse_dtb to improve code readability
The sbi_ prefix would seem to indicate an SBI interface, and save is not
very specific. After applying this patch, reading head.S makes more sense.
Signed-off-by: Michael Clark <michaeljclark@mac.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
zongbox@gmail.com [Tue, 30 Jan 2018 07:51:45 +0000 (23:51 -0800)]
RISC-V: Enable IRQ during exception handling
Interrupts are allowed during exception handling.
The kernel prints the following warning if it is built with
'CONFIG_DEBUG_ATOMIC_SLEEP=y'.
BUG: sleeping function called from invalid context at kernel/locking/rwsem.c:23
in_atomic(): 0, irqs_disabled(): 1, pid: 43, name: ash
CPU: 0 PID: 43 Comm: ash Tainted: G W 4.15.0-rc8-00089-g89ffdae-dirty #17
Call Trace:
[<000000009abb1587>] walk_stackframe+0x0/0x7a
[<00000000d4f3d088>] ___might_sleep+0x102/0x11a
[<00000000b1fd792a>] down_read+0x18/0x28
[<000000000289ec01>] do_page_fault+0x86/0x2f6
[<00000000012441f6>] _do_fork+0x1b4/0x1e0
[<00000000f46c3e3b>] ret_from_syscall+0xa/0xe
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Zong Li <zong@andestech.com>
Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Palmer Dabbelt [Tue, 20 Feb 2018 18:51:19 +0000 (10:51 -0800)]
RISC-V: kconfig cleanups
These three kconfig cleanups were found by ulfalizer. They're all
things we were selecting that were undefined, either because they'd been
removed upstream or are part of a future RISC-V submission.
* ARCH_HAS_ATOMIC64_DEC_IF_POSITIVE is obsolete.
* RISCV_IRQ_INTC is the old name for our interrupt controller driver,
it'll be changed for the final submission and doesn't exist now.
* ARCH_WANT_OPTIONAL_GPIOLIB is obsolete.
Ulf Magnusson [Mon, 5 Feb 2018 01:21:18 +0000 (02:21 +0100)]
riscv: Remove ARCH_HAS_ATOMIC64_DEC_IF_POSITIVE select
The ARCH_HAS_ATOMIC64_DEC_IF_POSITIVE symbol was removed in commit
51a021244b9d ("atomic64: no need for
CONFIG_ARCH_HAS_ATOMIC64_DEC_IF_POSITIVE").
Remove the ARCH_HAS_ATOMIC64_DEC_IF_POSITIVE select from RISCV.
Discovered with the
https://github.com/ulfalizer/Kconfiglib/blob/master/examples/list_undefined.py
script.
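The kind of check that finds such a dangling select can be sketched in a few lines. This is a toy Python model, not the list_undefined.py script and not the Kconfiglib API: the symbol tables are plain Python containers, undefined_selects() is a hypothetical helper, and the symbol lists in the usage note are illustrative rather than taken from the RISC-V Kconfig.

```python
def undefined_selects(defined, selects):
    """Return sorted (selector, selected) pairs whose target is not defined.

    defined: set of symbol names that have a config definition somewhere.
    selects: dict mapping a symbol to the list of symbols it selects.
    """
    return sorted(
        (selector, target)
        for selector, targets in selects.items()
        for target in targets
        if target not in defined
    )
```

For instance, with defined = {"RISCV", "GENERIC_IRQ_SHOW"} and selects = {"RISCV": ["GENERIC_IRQ_SHOW", "RISCV_IRQ_INTC"]}, the function reports only ("RISCV", "RISCV_IRQ_INTC").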
Signed-off-by: Ulf Magnusson <ulfalizer@gmail.com>
Reviewed-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Ulf Magnusson [Thu, 8 Feb 2018 22:54:46 +0000 (23:54 +0100)]
riscv: kconfig: Remove RISCV_IRQ_INTC select
The RISCV_IRQ_INTC configuration symbol is undefined, but RISCV selects
it. Quoting Palmer Dabbelt:
It looks like this slipped through, the symbol has been renamed
RISCV_INTC.
No RISCV_INTC configuration symbol has been merged either. Just remove
the RISCV_IRQ_INTC select for now.
Signed-off-by: Ulf Magnusson <ulfalizer@gmail.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Ulf Magnusson [Mon, 5 Feb 2018 01:21:19 +0000 (02:21 +0100)]
riscv: Remove ARCH_WANT_OPTIONAL_GPIOLIB select
The ARCH_WANT_OPTIONAL_GPIOLIB symbol was removed in commit 65053e1a7743
("gpio: delete ARCH_[WANTS_OPTIONAL|REQUIRE]_GPIOLIB"). GPIOLIB should
just be selected explicitly if needed.
Remove the ARCH_WANT_OPTIONAL_GPIOLIB select from RISCV.
See commit 0145071b3314 ("x86: Do away with
ARCH_[WANT_OPTIONAL|REQUIRE]_GPIOLIB") and commit da9a1c6767 ("arm64: do
away with ARCH_[WANT_OPTIONAL|REQUIRE]_GPIOLIB") as well.
Discovered with the
https://github.com/ulfalizer/Kconfiglib/blob/master/examples/list_undefined.py
script.
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Ulf Magnusson <ulfalizer@gmail.com>
Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
Linus Torvalds [Tue, 20 Feb 2018 18:05:02 +0000 (10:05 -0800)]
Merge tag 'leds_for-4.16-rc3' of git://git./linux/kernel/git/j.anaszewski/linux-leds
Pull LED maintainer update:
"LED update to MAINTAINERS, to admit the reality.
Message from Richard:
"I've been looking at some of the emails but not needed to be
involved for a while now, you're doing fine without me!" [0]
Many thanks to Richard for his work as a founder of the LED
subsystem!"
[0] https://lkml.org/lkml/2018/2/18/145
* tag 'leds_for-4.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds:
MAINTAINERS: Remove Richard Purdie from LED maintainers
Selvin Xavier [Fri, 16 Feb 2018 05:20:13 +0000 (21:20 -0800)]
RDMA/bnxt_re: Avoid system hang during device un-reg
BNXT_RE_FLAG_TASK_IN_PROG doesn't handle multiple work
requests posted together. Track schedule of multiple
workqueue items by maintaining a per device counter
and proceed with IB dereg only if this counter is zero.
flush_workqueue is no longer required from
NETDEV_UNREGISTER path.
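The difference between a single in-progress flag and a per-device counter can be sketched as follows. This is an illustrative Python model under stated assumptions: DeviceWorkTracker and its methods are hypothetical names, and a lock-protected integer stands in for whatever counter type the driver actually uses. A single flag would be cleared by the first work item to finish, while the counter only reaches zero after the last one.

```python
import threading


class DeviceWorkTracker:
    """Counts work items scheduled for one device and not yet finished."""

    def __init__(self):
        self._lock = threading.Lock()
        self._pending = 0

    def work_scheduled(self):
        with self._lock:
            self._pending += 1

    def work_done(self):
        with self._lock:
            self._pending -= 1

    def can_unregister(self):
        # Deregistration may proceed only when no work item is outstanding.
        with self._lock:
            return self._pending == 0
```

With two items posted together, can_unregister() stays false after the first work_done() and turns true only after the second.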
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Selvin Xavier [Fri, 16 Feb 2018 05:20:12 +0000 (21:20 -0800)]
RDMA/bnxt_re: Fix system crash during load/unload
During driver unload, the driver proceeds with cleanup
without waiting for the scheduled events. So the device
pointers get freed up and driver crashes when the events
are scheduled later.
Flush the bnxt_re_task work queue before starting
device removal.
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Selvin Xavier [Fri, 16 Feb 2018 05:20:11 +0000 (21:20 -0800)]
RDMA/bnxt_re: Synchronize destroy_qp with poll_cq
Avoid system crash when destroy_qp is invoked while
the driver is processing the poll_cq. Synchronize these
functions using the cq_lock.
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Devesh Sharma [Fri, 16 Feb 2018 05:20:10 +0000 (21:20 -0800)]
RDMA/bnxt_re: Unpin SQ and RQ memory if QP create fails
The driver leaves the QP memory pinned if the QP create command
fails from the FW. Avoid this scenario by adding a proper
exit path if the FW command fails.
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Devesh Sharma [Fri, 16 Feb 2018 05:20:08 +0000 (21:20 -0800)]
RDMA/bnxt_re: Disable atomic capability on bnxt_re adapters
More testing needs to be done before enabling this feature.
Disable the feature temporarily.
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Bin Liu [Tue, 20 Feb 2018 13:31:35 +0000 (07:31 -0600)]
Revert "usb: musb: host: don't start next rx urb if current one failed"
This reverts commit dbac5d07d13e330e6706813c9fde477140fb5d80.
commit dbac5d07d13e ("usb: musb: host: don't start next rx urb if current one failed")
along with commit b5801212229f ("usb: musb: host: clear rxcsr error bit if set")
try to solve the issue described in [1], but the latter alone is
sufficient, and the former causes the issue as in [2], so now revert it.
[1] https://marc.info/?l=linux-usb&m=146173995117456&w=2
[2] https://marc.info/?l=linux-usb&m=151689238420622&w=2
Cc: stable@vger.kernel.org # v4.7+
Signed-off-by: Bin Liu <b-liu@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Andreas Kemnade [Tue, 20 Feb 2018 13:30:10 +0000 (07:30 -0600)]
usb: musb: fix enumeration after resume
On dm3730 there are enumeration problems after resume.
Investigation led to the cause that the MUSB_POWER_SOFTCONN
bit is not set. If it was set before suspend (because it
was enabled via musb_pullup()), it is set in
musb_restore_context() so the pullup is enabled. But then
musb_start() is called which overwrites MUSB_POWER and
therefore disables MUSB_POWER_SOFTCONN, so no pullup is
enabled and the device is not enumerated.
So let's do a subset of what musb_start() does
in the same way as musb_suspend() does it. Platform-specific
stuff is still called as there might be some phy-related stuff
which needs to be enabled.
Also interrupts are enabled, as it was the original idea
of calling musb_start() in musb_resume() according to
commit 6fc6f4b87cb3 ("usb: musb: Disable interrupts on suspend,
enable them on resume").
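The register overwrite described above can be illustrated with plain bit arithmetic. This is a Python sketch under stated assumptions, not driver code: the bit values are assumed for illustration, the value written by the full start path is an assumption, and both function names are hypothetical.

```python
# Assumed bit positions within the MUSB_POWER register (illustration only).
MUSB_POWER_ISOUPDATE = 0x80
MUSB_POWER_SOFTCONN = 0x40
MUSB_POWER_HSENAB = 0x20


def resume_with_full_start(saved_power):
    """Restore the saved register, then let a full start overwrite it."""
    power = saved_power                                  # context restore
    power = MUSB_POWER_ISOUPDATE | MUSB_POWER_HSENAB     # plain write, SOFTCONN lost
    return power


def resume_with_subset(saved_power):
    """Restore the saved register and leave MUSB_POWER alone afterwards."""
    return saved_power
```

If SOFTCONN was set before suspend, only the second path still has the pullup bit set after resume.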
Signed-off-by: Andreas Kemnade <andreas@kemnade.info>
Tested-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Bin Liu <b-liu@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Greg Kroah-Hartman [Tue, 20 Feb 2018 09:03:22 +0000 (10:03 +0100)]
Merge tag 'iio-fixes-for-4.16a' of git://git./linux/kernel/git/jic23/iio into staging-linus
Jonathan writes:
First round of IIO fixes for the 4.16 cycle.
One nasty very old crash around polling for buffers that aren't there
- though that can only cause effects on drivers that support events
but not buffers.
* buffer / kfifo handling in the core.
- Check there is a buffer and return 0 from poll directly if there
isn't. Poll doesn't make sense in these circumstances, but best to close
the hole.
* ad5933
- Change the marked buffer mode to a software buffer as the meaning of
the hardware buffer label has long since changed and this uses a front
end software buffer anyway.
* ad7192
- Fix the fact the external clock frequency was only set when using the
internal clock which was less than helpful.
* adis_lib
- Initialize the trigger before requesting the interrupt. Some newer
parts can power up with interrupt generation enabled so ordering now
matters.
* aspeed-adc
- Fix an error handling path as labels and general ordering were wrong.
* srf08
- Fix a link error due to undefined devm_iio_triggered_buffer_setup.
* stm32-adc
- Fix error handling unwind sequence in stm32h7_adc_enable.
Tomas Winkler [Sun, 18 Feb 2018 09:05:16 +0000 (11:05 +0200)]
mei: me: add cannon point device ids for 4th device
Add cannon point device ids for 4th (itouch) device.
Cc: <stable@vger.kernel.org> 4.14+
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Alexander Usyskin [Sun, 18 Feb 2018 09:05:15 +0000 (11:05 +0200)]
mei: me: add cannon point device ids
Add CNP LP and CNP H device ids for cannon lake
and coffee lake platforms.
Cc: <stable@vger.kernel.org> 4.14+
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Greg Kroah-Hartman [Tue, 20 Feb 2018 07:57:23 +0000 (08:57 +0100)]
Merge tag 'extcon-fixes-for-4.16-rc3' of git://git./linux/kernel/git/chanwoo/extcon into char-misc-linus
Chanwoo writes:
Update extcon for v4.16-rc3
This update fixes issues in the X-power extcon-axp288 and Intel extcon-int3496 drivers.
- For extcon-int3496 driver,
Process id-pin first so that we start with the right status in order to fix
a race where the initial work might still be running while other drivers
were already calling extcon_get_state().
- For extcon-axp288 driver,
Revert the patch[1] which was applied to v4.16-rc1 because there are better
ways with usb-role-switch and constify the axp288_pwr_up_down_info array.
[1] 60ed99961469a3 ("extcon: axp288: Redo charger type detection a couple of seconds after probe()")
Linus Torvalds [Mon, 19 Feb 2018 19:58:19 +0000 (11:58 -0800)]
Merge git://git./linux/kernel/git/davem/net
Pull networking fixes from David Miller:
1) Prevent index integer overflow in ptr_ring, from Jason Wang.
2) Program mvpp2 multicast filter properly, from Mikulas Patocka.
3) The bridge brport attribute file is write only and doesn't have a
->show() method, don't blindly invoke it. From Xin Long.
4) Inverted mask used in genphy_setup_forced(), from Ingo van Lil.
5) Fix multiple definition issue with if_ether.h UAPI header, from
Hauke Mehrtens.
6) Fix GFP_KERNEL usage in atomic in RDS protocol code, from Sowmini
Varadhan.
7) Revert XDP redirect support from thunderx driver, it is not
implemented properly. From Jesper Dangaard Brouer.
8) Fix missing RTNL protection across some tipc operations, from Ying
Xue.
9) Return the correct IV bytes in the TLS getsockopt code, from Boris
Pismenny.
10) Take tclassid into consideration properly when doing FIB rule
matching. From Stefano Brivio.
11) cxgb4 device needs more PCI VPD quirks, from Casey Leedom.
12) TUN driver doesn't align frags properly, and we can end up doing
unaligned atomics on misaligned metadata. From Eric Dumazet.
13) Fix various crashes found using DEBUG_PREEMPT in rmnet driver, from
Subash Abhinov Kasiviswanathan.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (56 commits)
tg3: APE heartbeat changes
mlxsw: spectrum_router: Do not unconditionally clear route offload indication
net: qualcomm: rmnet: Fix possible null dereference in command processing
net: qualcomm: rmnet: Fix warning seen with 64 bit stats
net: qualcomm: rmnet: Fix crash on real dev unregistration
sctp: remove the left unnecessary check for chunk in sctp_renege_events
rxrpc: Work around usercopy check
tun: fix tun_napi_alloc_frags() frag allocator
udplite: fix partial checksum initialization
skbuff: Fix comment mis-spelling.
dn_getsockoptdecnet: move nf_{get/set}sockopt outside sock lock
PCI/cxgb4: Extend T3 PCI quirk to T4+ devices
cxgb4: fix trailing zero in CIM LA dump
cxgb4: free up resources of pf 0-3
fib_semantics: Don't match route with mismatching tclassid
NFC: llcp: Limit size of SDP URI
tls: getsockopt return record sequence number
tls: reset the crypto info if copy_from_user fails
tls: retrun the correct IV in getsockopt
docs: segmentation-offloads.txt: add SCTP info
...
Jacek Anaszewski [Sun, 18 Feb 2018 20:11:25 +0000 (21:11 +0100)]
MAINTAINERS: Remove Richard Purdie from LED maintainers
Richard has been inactive on the linux-leds list for a long time.
After email discussion we agreed on removing him from
the LED maintainers, which will better reflect the actual status.
Acked-by: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Jacek Anaszewski <jacek.anaszewski@gmail.com>
Prashant Sreedharan [Mon, 19 Feb 2018 06:57:04 +0000 (12:27 +0530)]
tg3: APE heartbeat changes
In the case of an ungraceful host shutdown or driver crash, BMC
connectivity is lost. The APE firmware lacks the driver state it needs
in this case to keep the BMC connectivity alive.
This patch makes the following change to address this issue:
a heartbeat mechanism with the APE firmware. This heartbeat mechanism
is needed to notify the APE firmware about the driver state.
This patch also changes the wait time for an APE event from
1ms to 20ms, as there can be some delay in getting a response.
v2: Drop inline keyword as per David's suggestion.
Signed-off-by: Prashant Sreedharan <prashant.sreedharan@broadcom.com>
Signed-off-by: Satish Baddipadige <satish.baddipadige@broadcom.com>
Signed-off-by: Siva Reddy Kallam <siva.kallam@broadcom.com>
Acked-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Fri, 16 Feb 2018 23:30:44 +0000 (00:30 +0100)]
mlxsw: spectrum_router: Do not unconditionally clear route offload indication
When mlxsw replaces (or deletes) a route it removes the offload
indication from the replaced route. This is problematic for IPv4 routes,
as the offload indication is stored in the fib_info which is usually
shared between multiple routes.
Instead of unconditionally clearing the offload indication, only clear
it if no other route is using the fib_info.
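The rule above can be sketched with a shared object and a user count. This is an illustrative Python model, not the driver's implementation: FibInfo here is a stand-in class, offload_users is a hypothetical counter introduced for the example, and the actual patch may determine whether other routes use the fib_info by different means.

```python
class FibInfo:
    """Stand-in for a next-hop info object shared by several routes."""

    def __init__(self):
        self.offloaded = False
        self.offload_users = 0   # hypothetical: offloaded routes sharing this object


def route_offload_set(fi):
    """An offloaded route starts using the shared object."""
    fi.offload_users += 1
    fi.offloaded = True


def route_offload_unset(fi):
    """A route is replaced or deleted; clear the flag only for the last user."""
    fi.offload_users -= 1
    if fi.offload_users == 0:
        fi.offloaded = False
```

With two routes sharing one object, removing the first leaves the indication in place and removing the second clears it.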
Fixes: 3984d1a89fe7 ("mlxsw: spectrum_router: Provide offload indication using nexthop flags")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reported-by: Alexander Petrovskiy <alexpe@mellanox.com>
Tested-by: Alexander Petrovskiy <alexpe@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 19 Feb 2018 16:17:34 +0000 (11:17 -0500)]
Merge branch 'qualcomm-rmnet-Fix-issues-with-CONFIG_DEBUG_PREEMPT-enabled'
Subash Abhinov Kasiviswanathan says:
====================
net: qualcomm: rmnet: Fix issues with CONFIG_DEBUG_PREEMPT enabled
Patches 1 and 2 fix issues identified when CONFIG_DEBUG_PREEMPT was
enabled. These involve APIs which were called in invalid contexts.
Patch 3 is a null dereference fix identified by code inspection.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Subash Abhinov Kasiviswanathan [Fri, 16 Feb 2018 22:56:39 +0000 (15:56 -0700)]
net: qualcomm: rmnet: Fix possible null dereference in command processing
If a command packet with invalid mux id is received, the packet would
not have a valid endpoint. This invalid endpoint may be dereferenced
leading to a crash. Identified by manual code inspection.
Fixes: 3352e6c45760 ("net: qualcomm: rmnet: Convert the muxed endpoint to hlist")
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Subash Abhinov Kasiviswanathan [Fri, 16 Feb 2018 22:56:38 +0000 (15:56 -0700)]
net: qualcomm: rmnet: Fix warning seen with 64 bit stats
With CONFIG_DEBUG_PREEMPT enabled, a warning was seen on device
creation. This occurs due to the incorrect cpu API usage in
ndo_get_stats64 handler.
BUG: using smp_processor_id() in preemptible [00000000] code: rmnetcli/5743
caller is debug_smp_processor_id+0x1c/0x24
Call trace:
[<ffffff9d48c8967c>] dump_backtrace+0x0/0x2a8
[<ffffff9d48c89bbc>] show_stack+0x20/0x28
[<ffffff9d4901fff8>] dump_stack+0xa8/0xe0
[<ffffff9d490421e0>] check_preemption_disabled+0x104/0x108
[<ffffff9d49042200>] debug_smp_processor_id+0x1c/0x24
[<ffffff9d494a36b0>] rmnet_get_stats64+0x64/0x13c
[<ffffff9d49b014e0>] dev_get_stats+0x68/0xd8
[<ffffff9d49d58df8>] rtnl_fill_stats+0x54/0x140
[<ffffff9d49b1f0b8>] rtnl_fill_ifinfo+0x428/0x9cc
[<ffffff9d49b23834>] rtmsg_ifinfo_build_skb+0x80/0xf4
[<ffffff9d49b23930>] rtnetlink_event+0x88/0xb4
[<ffffff9d48cd21b4>] raw_notifier_call_chain+0x58/0x78
[<ffffff9d49b028a4>] call_netdevice_notifiers_info+0x48/0x78
[<ffffff9d49b08bf8>] __netdev_upper_dev_link+0x290/0x5e8
[<ffffff9d49b08fcc>] netdev_master_upper_dev_link+0x3c/0x48
[<ffffff9d494a2e74>] rmnet_newlink+0xf0/0x1c8
[<ffffff9d49b23360>] rtnl_newlink+0x57c/0x6c8
[<ffffff9d49b2355c>] rtnetlink_rcv_msg+0xb0/0x244
[<ffffff9d49b5230c>] netlink_rcv_skb+0xb4/0xdc
[<ffffff9d49b204f4>] rtnetlink_rcv+0x34/0x44
[<ffffff9d49b51af0>] netlink_unicast+0x1ec/0x294
[<ffffff9d49b51fdc>] netlink_sendmsg+0x320/0x390
[<ffffff9d49ae6858>] sock_sendmsg+0x54/0x60
[<ffffff9d49ae91bc>] SyS_sendto+0x1a0/0x1e4
[<ffffff9d48c83770>] el0_svc_naked+0x24/0x28
Fixes: 192c4b5d48f2 ("net: qualcomm: rmnet: Add support for 64 bit stats")
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
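The warning above fires because smp_processor_id() is only meaningful while the task cannot migrate to another CPU. A statistics reader that wants totals does not need the current CPU at all: it can visit every CPU's slot by explicit index, which in the kernel is typically written with for_each_possible_cpu() and per_cpu_ptr(). The user-space sketch below models that idea with a plain array. Names such as `NR_CPUS_SKETCH`, `struct pcpu_stats` and `stats_get_total()` are hypothetical, and the log does not show whether the actual patch takes exactly this form.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS_SKETCH 4	/* hypothetical CPU count for this sketch */

struct pcpu_stats {
	uint64_t rx_pkts;
	uint64_t tx_pkts;
};

struct total_stats {
	uint64_t rx_pkts;
	uint64_t tx_pkts;
};

/* One slot per CPU, standing in for a kernel per-CPU allocation. */
static struct pcpu_stats per_cpu_stats[NR_CPUS_SKETCH];

/* Writer side: the caller states which CPU's slot it is updating. */
static void stats_account_rx(unsigned int cpu, uint64_t n)
{
	assert(cpu < NR_CPUS_SKETCH);
	per_cpu_stats[cpu].rx_pkts += n;
}

/*
 * Reader side: sum every CPU's slot by index. No "current CPU" lookup
 * is involved, so it does not matter where the reader is running.
 */
static void stats_get_total(struct total_stats *out)
{
	out->rx_pkts = 0;
	out->tx_pkts = 0;
	for (unsigned int cpu = 0; cpu < NR_CPUS_SKETCH; cpu++) {
		out->rx_pkts += per_cpu_stats[cpu].rx_pkts;
		out->tx_pkts += per_cpu_stats[cpu].tx_pkts;
	}
}
```

Real kernel code would additionally need to read 64-bit counters consistently on 32-bit machines; that detail is left out of this sketch.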
Subash Abhinov Kasiviswanathan [Fri, 16 Feb 2018 22:56:37 +0000 (15:56 -0700)]
net: qualcomm: rmnet: Fix crash on real dev unregistration
With CONFIG_DEBUG_PREEMPT enabled, a crash with the following call
stack was observed when removing a real dev which had rmnet devices
attached to it.
To fix this, remove the netdev_upper link APIs and instead use the
existing information in rmnet_port and rmnet_priv to get the
association between real and rmnet devs.
BUG: sleeping function called from invalid context
in_atomic(): 0, irqs_disabled(): 0, pid: 5762, name: ip
Preemption disabled at:
[<ffffff9d49043564>] debug_object_active_state+0xa4/0x16c
Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
Modules linked in:
PC is at ___might_sleep+0x13c/0x180
LR is at ___might_sleep+0x17c/0x180
[<ffffff9d48ce0924>] ___might_sleep+0x13c/0x180
[<ffffff9d48ce09c0>] __might_sleep+0x58/0x8c
[<ffffff9d49d6253c>] mutex_lock+0x2c/0x48
[<ffffff9d48ed4840>] kernfs_remove_by_name_ns+0x48/0xa8
[<ffffff9d48ed6ec8>] sysfs_remove_link+0x30/0x58
[<ffffff9d49b05840>] __netdev_adjacent_dev_remove+0x14c/0x1e0
[<ffffff9d49b05914>] __netdev_adjacent_dev_unlink_lists+0x40/0x68
[<ffffff9d49b08820>] netdev_upper_dev_unlink+0xb4/0x1fc
[<ffffff9d494a29f0>] rmnet_dev_walk_unreg+0x6c/0xc8
[<ffffff9d49b00b40>] netdev_walk_all_lower_dev_rcu+0x58/0xb4
[<ffffff9d494a30fc>] rmnet_config_notify_cb+0xf4/0x134
[<ffffff9d48cd21b4>] raw_notifier_call_chain+0x58/0x78
[<ffffff9d49b028a4>] call_netdevice_notifiers_info+0x48/0x78
[<ffffff9d49b0b568>] rollback_registered_many+0x230/0x3c8
[<ffffff9d49b0b738>] unregister_netdevice_many+0x38/0x94
[<ffffff9d49b1e110>] rtnl_delete_link+0x58/0x88
[<ffffff9d49b201dc>] rtnl_dellink+0xbc/0x1cc
[<ffffff9d49b2355c>] rtnetlink_rcv_msg+0xb0/0x244
[<ffffff9d49b5230c>] netlink_rcv_skb+0xb4/0xdc
[<ffffff9d49b204f4>] rtnetlink_rcv+0x34/0x44
[<ffffff9d49b51af0>] netlink_unicast+0x1ec/0x294
[<ffffff9d49b51fdc>] netlink_sendmsg+0x320/0x390
[<ffffff9d49ae6858>] sock_sendmsg+0x54/0x60
[<ffffff9d49ae6f94>] ___sys_sendmsg+0x298/0x2b0
[<ffffff9d49ae98f8>] SyS_sendmsg+0xb4/0xf0
[<ffffff9d48c83770>] el0_svc_naked+0x24/0x28
Fixes: ceed73a2cf4a ("drivers: net: ethernet: qualcomm: rmnet: Initial implementation")
Fixes: 60d58f971c10 ("net: qualcomm: rmnet: Implement bridge mode")
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Mon, 19 Feb 2018 01:29:42 +0000 (17:29 -0800)]
Linux 4.16-rc2
Linus Torvalds [Sun, 18 Feb 2018 20:56:41 +0000 (12:56 -0800)]
Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull x86 Kconfig fixes from Thomas Gleixner:
"Three patchlets to correct HIGHMEM64G and CMPXCHG64 dependencies in
Kconfig when CPU selections are explicitly set to M586 or M686"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/Kconfig: Explicitly enumerate i686-class CPUs in Kconfig
x86/Kconfig: Exclude i586-class CPUs lacking PAE support from the HIGHMEM64G Kconfig group
x86/Kconfig: Add missing i586-class CPUs to the X86_CMPXCHG64 Kconfig group
Linus Torvalds [Sun, 18 Feb 2018 20:38:40 +0000 (12:38 -0800)]
Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull perf updates from Thomas Gleixner:
"Perf tool updates and kprobe fixes:
- perf_mmap overwrite mode fixes/overhaul, prep work to get 'perf
top' using it, making it bearable to use it in large core count
systems such as Knights Landing/Mill Intel systems (Kan Liang)
- s/390 now uses syscall.tbl, just like x86-64 to generate the
syscall table id -> string tables used by 'perf trace' (Hendrik
Brueckner)
- Use strtoull() instead of home grown function (Andy Shevchenko)
- Synchronize kernel ABI headers, v4.16-rc1 (Ingo Molnar)
- Document missing 'perf data --force' option (Sangwon Hong)
- Add perf vendor JSON metrics for ARM Cortex-A53 Processor (William
Cohen)
- Improve error handling and error propagation of ftrace based
kprobes so failures when installing kprobes are not silently
ignored and create dysfunctional tracepoints"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits)
kprobes: Propagate error from disarm_kprobe_ftrace()
kprobes: Propagate error from arm_kprobe_ftrace()
Revert "tools include s390: Grab a copy of arch/s390/include/uapi/asm/unistd.h"
perf s390: Rework system call table creation by using syscall.tbl
perf s390: Grab a copy of arch/s390/kernel/syscall/syscall.tbl
tools/headers: Synchronize kernel ABI headers, v4.16-rc1
perf test: Fix test trace+probe_libc_inet_pton.sh for s390x
perf data: Document missing --force option
perf tools: Substitute yet another strtoull()
perf top: Check the latency of perf_top__mmap_read()
perf top: Switch default mode to overwrite mode
perf top: Remove lost events checking
perf hists browser: Add parameter to disable lost event warning
perf top: Add overwrite fall back
perf evsel: Expose the perf_missing_features struct
perf top: Check per-event overwrite term
perf mmap: Discard legacy interface for mmap read
perf test: Update mmap read functions for backward-ring-buffer test
perf mmap: Introduce perf_mmap__read_event()
perf mmap: Introduce perf_mmap__read_done()
...
Linus Torvalds [Sun, 18 Feb 2018 20:22:04 +0000 (12:22 -0800)]
Merge branch 'irq-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull irq updates from Thomas Gleixner:
"A small set of updates mostly for irq chip drivers:
- MIPS GIC fix for spurious, masked interrupts
- fix for a subtle IPI bug in GICv3
- do not probe GICv3 ITSs that are marked as disabled
- multi-MSI support for GICv2m
- various small cleanups"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqdomain: Re-use DEFINE_SHOW_ATTRIBUTE() macro
irqchip/bcm: Remove hashed address printing
irqchip/gic-v2m: Add PCI Multi-MSI support
irqchip/gic-v3: Ignore disabled ITS nodes
irqchip/gic-v3: Use wmb() instead of smb_wmb() in gic_raise_softirq()
irqchip/gic-v3: Change pr_debug message to pr_devel
irqchip/mips-gic: Avoid spuriously handling masked interrupts