Jens Axboe [Tue, 12 Feb 2019 22:21:48 +0000 (15:21 -0700)]
Merge branch 'md-fixes' of https://github.com/liu-song-6/linux into for-linus
Pull MD fix from Song
* 'md-fixes' of https://github.com/liu-song-6/linux:
md/raid1: don't clear bitmap bits on interrupted recovery.
Nate Dailey [Thu, 7 Feb 2019 19:19:01 +0000 (14:19 -0500)]
md/raid1: don't clear bitmap bits on interrupted recovery.
sync_request_write no longer submits writes to a Faulty device. This has
the unfortunate side effect that bitmap bits can be incorrectly cleared
if a recovery is interrupted (previously, end_sync_write would have
prevented this). This means the next recovery may not copy everything
it should, potentially corrupting data.
Add a function for doing the proper md_bitmap_end_sync, called from
end_sync_write and the Faulty case in sync_request_write.
backport note to 4.14: s/md_bitmap_end_sync/bitmap_end_sync
Cc: stable@vger.kernel.org 4.14+
Fixes: 0c9d5b127f69 ("md/raid1: avoid reusing a resync bio after error handling.")
Reviewed-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Tested-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Nate Dailey <nate.dailey@stratus.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
Yufen Yu [Tue, 29 Jan 2019 08:34:04 +0000 (16:34 +0800)]
floppy: check_events callback should not return a negative number
floppy_check_events() is supposed to return bit flags to say which
events occured. We should return zero to say that no event flags are
set. Only BIT(0) and BIT(1) are used in the caller. And .check_events
interface also expect to return an unsigned int value.
However, after commit
a0c80efe5956, it may return -EINTR (-4u).
Here, both BIT(0) and BIT(1) are cleared. So this patch shouldn't
affect runtime, but it obviously is still worth fixing.
Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: a0c80efe5956 ("floppy: fix lock_fdc() signal handling")
Signed-off-by: Yufen Yu <yuyufen@huawei.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jianchao Wang [Tue, 12 Feb 2019 01:56:25 +0000 (09:56 +0800)]
blk-mq: insert rq with DONTPREP to hctx dispatch list when requeue
When requeue, if RQF_DONTPREP, rq has contained some driver
specific data, so insert it to hctx dispatch list to avoid any
merge. Take scsi as example, here is the trace event log (no
io scheduler, because RQF_STARTED would prevent merging),
kworker/0:1H-339 [000] ...1 2037.209289: block_rq_insert: 8,0 R 4096 () 32768 + 8 [kworker/0:1H]
scsi_inert_test-1987 [000] .... 2037.220465: block_bio_queue: 8,0 R 32776 + 8 [scsi_inert_test]
scsi_inert_test-1987 [000] ...2 2037.220466: block_bio_backmerge: 8,0 R 32776 + 8 [scsi_inert_test]
kworker/0:1H-339 [000] .... 2047.220913: block_rq_issue: 8,0 R 8192 () 32768 + 16 [kworker/0:1H]
scsi_inert_test-1996 [000] ..s1 2047.221007: block_rq_complete: 8,0 R () 32768 + 8 [0]
scsi_inert_test-1996 [000] .Ns1 2047.221045: block_rq_requeue: 8,0 R () 32776 + 8 [0]
kworker/0:1H-339 [000] ...1 2047.221054: block_rq_insert: 8,0 R 4096 () 32776 + 8 [kworker/0:1H]
kworker/0:1H-339 [000] ...1 2047.221056: block_rq_issue: 8,0 R 4096 () 32776 + 8 [kworker/0:1H]
scsi_inert_test-1986 [000] ..s1 2047.221119: block_rq_complete: 8,0 R () 32776 + 8 [0]
(32768 + 8) was requeued by scsi_queue_insert and had RQF_DONTPREP.
Then it was merged with (32776 + 8) and issued. Due to RQF_DONTPREP,
the sdb only contained the part of (32768 + 8), then only that part
was completed. The lucky thing was that scsi_io_completion detected
it and requeued the remaining part. So we didn't get corrupted data.
However, the requeue of (32776 + 8) is not expected.
Suggested-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Liu Bo [Fri, 25 Jan 2019 00:12:49 +0000 (08:12 +0800)]
blk-mq: remove duplicated definition of blk_mq_freeze_queue
As the prototype has been defined in "include/linux/blk-mq.h", the one
in "block/blk-mq.h" can be removed then.
Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Liu Bo [Fri, 25 Jan 2019 00:12:48 +0000 (08:12 +0800)]
Blk-iolatency: warn on negative inflight IO counter
This is to catch any unexpected negative value of inflight IO counter.
Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Liu Bo [Fri, 25 Jan 2019 00:12:47 +0000 (08:12 +0800)]
blk-iolatency: fix IO hang due to negative inflight counter
Our test reported the following stack, and vmcore showed that
->inflight counter is -1.
[
ffffc9003fcc38d0] __schedule at
ffffffff8173d95d
[
ffffc9003fcc3958] schedule at
ffffffff8173de26
[
ffffc9003fcc3970] io_schedule at
ffffffff810bb6b6
[
ffffc9003fcc3988] blkcg_iolatency_throttle at
ffffffff813911cb
[
ffffc9003fcc3a20] rq_qos_throttle at
ffffffff813847f3
[
ffffc9003fcc3a48] blk_mq_make_request at
ffffffff8137468a
[
ffffc9003fcc3b08] generic_make_request at
ffffffff81368b49
[
ffffc9003fcc3b68] submit_bio at
ffffffff81368d7d
[
ffffc9003fcc3bb8] ext4_io_submit at
ffffffffa031be00 [ext4]
[
ffffc9003fcc3c00] ext4_writepages at
ffffffffa03163de [ext4]
[
ffffc9003fcc3d68] do_writepages at
ffffffff811c49ae
[
ffffc9003fcc3d78] __filemap_fdatawrite_range at
ffffffff811b6188
[
ffffc9003fcc3e30] filemap_write_and_wait_range at
ffffffff811b6301
[
ffffc9003fcc3e60] ext4_sync_file at
ffffffffa030cee8 [ext4]
[
ffffc9003fcc3ea8] vfs_fsync_range at
ffffffff8128594b
[
ffffc9003fcc3ee8] do_fsync at
ffffffff81285abd
[
ffffc9003fcc3f18] sys_fsync at
ffffffff81285d50
[
ffffc9003fcc3f28] do_syscall_64 at
ffffffff81003c04
[
ffffc9003fcc3f50] entry_SYSCALL_64_after_swapgs at
ffffffff81742b8e
The ->inflight counter may be negative (-1) if
1) blk-iolatency was disabled when the IO was issued,
2) blk-iolatency was enabled before this IO reached its endio,
3) the ->inflight counter is decreased from 0 to -1 in endio()
In fact the hang can be easily reproduced by the below script,
H=/sys/fs/cgroup/unified/
P=/sys/fs/cgroup/unified/test
echo "+io" > $H/cgroup.subtree_control
mkdir -p $P
echo $$ > $P/cgroup.procs
xfs_io -f -d -c "pwrite 0 4k" /dev/sdg
echo "`cat /sys/block/sdg/dev` target=
1000000" > $P/io.latency
xfs_io -f -d -c "pwrite 0 4k" /dev/sdg
This fixes the problem by freezing the queue so that while
enabling/disabling iolatency, there is no inflight rq running.
Note that quiesce_queue is not needed as this only updating iolatency
configuration about which dispatching request_queue doesn't care.
Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jan Kara [Thu, 7 Feb 2019 10:55:39 +0000 (11:55 +0100)]
blktrace: Show requests without sector
Currently, blktrace will not show requests that don't have any data as
rq->__sector is initialized to -1 which is out of device range and thus
discarded by act_log_check(). This is most notably the case for cache
flush requests sent to the device. Fix the problem by making
blk_rq_trace_sector() return 0 for requests without initialized sector.
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Tetsuo Handa [Mon, 21 Jan 2019 13:49:37 +0000 (22:49 +0900)]
fs: ratelimit __find_get_block_slow() failure message.
When something let __find_get_block_slow() hit all_mapped path, it calls
printk() for 100+ times per a second. But there is no need to print same
message with such high frequency; it is just asking for stall warning, or
at least bloating log files.
[ 399.866302][T15342] __find_get_block_slow() failed. block=1, b_blocknr=8
[ 399.873324][T15342] b_state=0x00000029, b_size=512
[ 399.878403][T15342] device loop0 blocksize: 4096
[ 399.883296][T15342] __find_get_block_slow() failed. block=1, b_blocknr=8
[ 399.890400][T15342] b_state=0x00000029, b_size=512
[ 399.895595][T15342] device loop0 blocksize: 4096
[ 399.900556][T15342] __find_get_block_slow() failed. block=1, b_blocknr=8
[ 399.907471][T15342] b_state=0x00000029, b_size=512
[ 399.912506][T15342] device loop0 blocksize: 4096
This patch reduces frequency to up to once per a second, in addition to
concatenating three lines into one.
[ 399.866302][T15342] __find_get_block_slow() failed. block=1, b_blocknr=8, b_state=0x00000029, b_size=512, device loop0 blocksize: 4096
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Chengguang Xu [Fri, 1 Feb 2019 03:26:02 +0000 (11:26 +0800)]
m68k: set proper major_num when specifying module param major_num
When calling register_blkdev() with specified major
device number, the return code is 0 on success.
So it seems not correct direct assign return code to
variable major_num in this case.
Tested-by: Michael Schmitz <schmitzmic@gmail.com>
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Chengguang Xu <cgxu519@gmx.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Hans de Goede [Sun, 3 Feb 2019 09:02:07 +0000 (10:02 +0100)]
libata: Add NOLPM quirk for SAMSUNG MZ7TE512HMHP-000L1 SSD
We've received a bugreport that using LPM with a SAMSUNG
MZ7TE512HMHP-000L1 SSD leads to system instability, we already have
a quirk for the MZ7TD256HAFV-000L9, which is also a Samsun EVO 840 /
PM851 OEM model, so it seems some of these models have a LPM issue.
This commits adds a NOLPM quirk for the model string from the new
bugeport, to avoid the reported stability issues.
Cc: stable@vger.kernel.org
BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1571330
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Wed, 6 Feb 2019 15:43:07 +0000 (08:43 -0700)]
Merge branch 'nvme-5.0' of git://git.infradead.org/nvme into for-linus
Pull NVMe fixes from Christoph for this release.
* 'nvme-5.0' of git://git.infradead.org/nvme:
nvme-pci: fix rapid add remove sequence
nvme: lock NS list changes while handling command effects
Keith Busch [Thu, 24 Jan 2019 01:46:11 +0000 (18:46 -0700)]
nvme-pci: fix rapid add remove sequence
A surprise removal may fail to tear down request queues if it is racing
with the initial asynchronous probe. If that happens, the remove path
won't see the queue resources to tear down, and the controller reset
path may create a new request queue on a removed device, but will not
be able to make forward progress, deadlocking the pci removal.
Protect setting up non-blocking resources from a shutdown by holding the
same mutex, and transition to the CONNECTING state after these resources
are initialized so the probe path may see the dead controller state
before dispatching new IO.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=202081
Reported-by: Alex Gagniuc <Alex_Gagniuc@Dellteam.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Tested-by: Alex Gagniuc <mr.nuke.me@gmail.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Keith Busch [Mon, 28 Jan 2019 16:46:07 +0000 (09:46 -0700)]
nvme: lock NS list changes while handling command effects
If a controller supports the NS Change Notification, the namespace
scan_work is automatically triggered after attaching a new namespace.
Occasionally the namespace scan_work may append the new namespace to the
list before the admin command effects handling is completed. The effects
handling unfreezes namespaces, but if it unfreezes the newly attached
namespace, its request_queue freeze depth will be off and we'll hit the
warning in blk_mq_unfreeze_queue().
On the next namespace add, we will fail to freeze that queue due to the
previous bad accounting and deadlock waiting for frozen.
Fix that by preventing scan work from altering the namespace list while
command effects handling needs to pair freeze with unfreeze.
Reported-by: Wen Xiong <wenxiong@us.ibm.com>
Tested-by: Wen Xiong <wenxiong@us.ibm.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Mike Marshall [Tue, 5 Feb 2019 19:13:35 +0000 (14:13 -0500)]
aio: initialize kiocb private in case any filesystems expect it.
A recent optimization had left private uninitialized.
Fixes: 2bc4ca9bb600 ("aio: don't zero entire aio_kiocb aio_get_req()")
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Wed, 30 Jan 2019 15:41:40 +0000 (08:41 -0700)]
ide: ensure atapi sense request aren't preempted
There's an issue with how sense requests are handled in IDE. If ide-cd
encounters an error, it queues a sense request. With how IDE request
handling is done, this is the next request we need to handle. But it's
impossible to guarantee this, as another request could come in between
the sense being queued, and ->queue_rq() being run and handling it. If
that request ALSO fails, then we attempt to doubly queue the single
sense request we have.
Since we only support one active request at the time, defer request
processing when a sense request is queued.
Fixes: 600335205b8d "ide: convert to blk-mq"
Reported-by: He Zhe <zhe.he@windriver.com>
Tested-by: He Zhe <zhe.he@windriver.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jianchao Wang [Wed, 30 Jan 2019 09:01:56 +0000 (17:01 +0800)]
blk-mq: fix a hung issue when fsync
Florian reported a io hung issue when fsync(). It should be
triggered by following race condition.
data + post flush a flush
blk_flush_complete_seq
case REQ_FSEQ_DATA
blk_flush_queue_rq
issued to driver blk_mq_dispatch_rq_list
try to issue a flush req
failed due to NON-NCQ command
.queue_rq return BLK_STS_DEV_RESOURCE
request completion
req->end_io // doesn't check RESTART
mq_flush_data_end_io
case REQ_FSEQ_POSTFLUSH
blk_kick_flush
do nothing because previous flush
has not been completed
blk_mq_run_hw_queue
insert rq to hctx->dispatch
due to RESTART is still set, do nothing
To fix this, replace the blk_mq_run_hw_queue in mq_flush_data_end_io
with blk_mq_sched_restart to check and clear the RESTART flag.
Fixes: bd166ef1 (blk-mq-sched: add framework for MQ capable IO schedulers)
Reported-by: Florian Stecker <m19@florianstecker.de>
Tested-by: Florian Stecker <m19@florianstecker.de>
Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Tetsuo Handa [Wed, 30 Jan 2019 13:21:45 +0000 (22:21 +0900)]
block: pass no-op callback to INIT_WORK().
syzbot is hitting flush_work() warning caused by commit
4d43d395fed12463
("workqueue: Try to catch flush_work() without INIT_WORK().") [1].
Although that commit did not expect INIT_WORK(NULL) case, calling
flush_work() without setting a valid callback should be avoided anyway.
Fix this problem by setting a no-op callback instead of NULL.
[1] https://syzkaller.appspot.com/bug?id=
e390366bc48bc82a7c668326e0663be3b91cbd29
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reported-and-tested-by: syzbot <syzbot+ba2a929dcf8e704c180e@syzkaller.appspotmail.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Tue, 29 Jan 2019 13:41:44 +0000 (06:41 -0700)]
Merge branch 'md-fixes' of https://github.com/liu-song-6/linux into for-linus
Pull MD fix from Song.
* 'md-fixes' of https://github.com/liu-song-6/linux:
md/raid5: fix 'out of memory' during raid cache recovery
Alexei Naberezhnov [Tue, 27 Mar 2018 23:54:16 +0000 (16:54 -0700)]
md/raid5: fix 'out of memory' during raid cache recovery
This fixes the case when md array assembly fails because of raid cache recovery
unable to allocate a stripe, despite attempts to replay stripes and increase
cache size. This happens because stripes released by r5c_recovery_replay_stripes
and raid5_set_cache_size don't become available for allocation immediately.
Released stripes first are placed on conf->released_stripes list and require
md thread to merge them on conf->inactive_list before they can be allocated.
Patch allows final allocation attempt during cache recovery to wait for
new stripes to become availabe for allocation.
Cc: linux-raid@vger.kernel.org
Cc: Shaohua Li <shli@kernel.org>
Cc: linux-stable <stable@vger.kernel.org> # 4.10+
Fixes: b4c625c67362 ("md/r5cache: r5cache recovery: part 1")
Signed-off-by: Alexei Naberezhnov <anaberezhnov@fb.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
Michal Hocko [Fri, 25 Jan 2019 18:08:58 +0000 (19:08 +0100)]
Revert "mm, memory_hotplug: initialize struct pages for the full memory section"
This reverts commit
2830bf6f05fb3e05bc4743274b806c821807a684.
The underlying assumption that one sparse section belongs into a single
numa node doesn't hold really. Robert Shteynfeld has reported a boot
failure. The boot log was not captured but his memory layout is as
follows:
Early memory node ranges
node 1: [mem 0x0000000000001000-0x0000000000090fff]
node 1: [mem 0x0000000000100000-0x00000000dbdf8fff]
node 1: [mem 0x0000000100000000-0x0000001423ffffff]
node 0: [mem 0x0000001424000000-0x0000002023ffffff]
This means that node0 starts in the middle of a memory section which is
also in node1. memmap_init_zone tries to initialize padding of a
section even when it is outside of the given pfn range because there are
code paths (e.g. memory hotplug) which assume that the full worth of
memory section is always initialized.
In this particular case, though, such a range is already intialized and
most likely already managed by the page allocator. Scribbling over
those pages corrupts the internal state and likely blows up when any of
those pages gets used.
Reported-by: Robert Shteynfeld <robert.shteynfeld@gmail.com>
Fixes: 2830bf6f05fb ("mm, memory_hotplug: initialize struct pages for the full memory section")
Cc: stable@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sun, 27 Jan 2019 23:18:05 +0000 (15:18 -0800)]
Linux 5.0-rc4
Linus Torvalds [Sun, 27 Jan 2019 20:02:00 +0000 (12:02 -0800)]
Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
"A set of fixes for x86:
- Fix the swapped outb() parameters in the KASLR code
- Fix the PKEY handling at fork which missed to preserve the pkey
state for the child. Comes with a test case to validate that.
- Fix the entry stack handling for XEN PV to respect that XEN PV
systems enter the function already on the current thread stack and
not on the trampoline.
- Fix kexec load failure caused by using a stale value when the
kexec_buf structure is reused for subsequent allocations.
- Fix a bogus sizeof() in the memory encryption code
- Enforce PCI dependency for the Intel Low Power Subsystem
- Enforce PCI_LOCKLESS_CONFIG when PCI is enabled"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/Kconfig: Select PCI_LOCKLESS_CONFIG if PCI is enabled
x86/entry/64/compat: Fix stack switching for XEN PV
x86/kexec: Fix a kexec_file_load() failure
x86/mm/mem_encrypt: Fix erroneous sizeof()
x86/selftests/pkeys: Fork() to check for state being preserved
x86/pkeys: Properly copy pkey state at fork()
x86/kaslr: Fix incorrect i8254 outb() parameters
x86/intel/lpss: Make PCI dependency explicit
Linus Torvalds [Sun, 27 Jan 2019 19:57:46 +0000 (11:57 -0800)]
Merge branch 'x86-timers-for-linus' of git://git./linux/kernel/git/tip/tip
Pull x86 timer fixes from Thomas Gleixner:
"Two commits which were missed to be sent during the merge window.
- The TSC calibration fix turns out to be more urgent as recent
Skylake-X systems seem to have massive trouble with calibration
disturbance. This should go back into stable for that reason and it
the risk of breakage is rather low.
- Drop an unused define"
* 'x86-timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/hpet: Remove unused FSEC_PER_NSEC define
x86/tsc: Make calibration refinement more robust
Linus Torvalds [Sun, 27 Jan 2019 19:55:06 +0000 (11:55 -0800)]
Merge branch 'timers-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull timer fix from Thomas Glexiner:
"A single regression fix to address the unintended breakage of posix
cpu timers.
This is caused by a new sanity check in the common code, which fails
for posix cpu timers under certain conditions because the posix cpu
timer code never updates the variable which is checked"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
posix-cpu-timers: Unbreak timer rearming
Linus Torvalds [Sun, 27 Jan 2019 19:52:50 +0000 (11:52 -0800)]
Merge branch 'locking-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull locking fixes from Thomas Gleixner:
"A small series of fixes which all address possible missed wakeups:
- Document and fix the wakeup ordering of wake_q
- Add the missing barrier in rcuwait_wake_up(), which was documented
in the comment but missing in the code
- Fix the possible missed wakeups in the rwsem and futex code"
* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/rwsem: Fix (possible) missed wakeup
futex: Fix (possible) missed wakeup
sched/wake_q: Fix wakeup ordering for wake_q
sched/wake_q: Document wake_q_add()
sched/wait: Fix rcuwait_wake_up() ordering
Linus Torvalds [Sun, 27 Jan 2019 19:25:38 +0000 (11:25 -0800)]
Merge branch 'irq-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull irq fixes from Thomas Gleixner:
"A small set of fixes for the interrupt subsystem:
- Fix a double increment in the irq descriptor allocator which
resulted in a sanity check only being done for every second
affinity mask
- Add a missing device tree translation in the stm32-exti driver.
Without that the interrupt association is completely wrong.
- Initialize the mutex in the GIC-V3 MBI driver
- Fix the alignment for aliasing devices in the GIC-V3-ITS driver so
multi MSI allocations work correctly
- Ensure that the initial affinity of a interrupt is not empty at
startup time.
- Drop bogus include in the madera irq chip driver
- Fix KernelDoc regression"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/gic-v3-its: Align PCI Multi-MSI allocation on their size
genirq/irqdesc: Fix double increment in alloc_descs()
genirq: Fix the kerneldoc comment for struct irq_affinity_desc
irqchip/madera: Drop GPIO includes
irqchip/gic-v3-mbi: Fix uninitialized mbi_lock
irqchip/stm32-exti: Add domain translate function
genirq: Make sure the initial affinity is not empty
Linus Torvalds [Sun, 27 Jan 2019 19:00:37 +0000 (11:00 -0800)]
Merge tag 'edac_fix_for_5.0' of git://git./linux/kernel/git/bp/bp
Pull EDAC fix from Borislav Petkov:
"Fix persistent register offsets of altera_edac, from Thor Thayer"
* tag 'edac_fix_for_5.0' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp:
EDAC, altera: Fix S10 persistent register offset
Linus Torvalds [Sun, 27 Jan 2019 18:58:20 +0000 (10:58 -0800)]
Merge tag 'for-linus-
20190127' of git://git.kernel.dk/linux-block
Pull block revert from Jens Axboe:
"Silly error snuck into a patch from the last series, let's do a revert
to avoid a potential use-after-free"
* tag 'for-linus-
20190127' of git://git.kernel.dk/linux-block:
Revert "block: cover another queue enter recursion via BIO_QUEUE_ENTERED"
Linus Torvalds [Sun, 27 Jan 2019 17:21:00 +0000 (09:21 -0800)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm
Pull KVM fixes from Paolo Bonzini:
"Quite a few fixes for x86: nested virtualization save/restore, AMD
nested virtualization and virtual APIC, 32-bit fixes, an important fix
to restore operation on older processors, and a bunch of hyper-v
bugfixes. Several are marked stable.
There are also fixes for GCC warnings and for a GCC/objtool interaction"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86: Mark expected switch fall-throughs
KVM: x86: fix TRACE_INCLUDE_PATH and remove -I. header search paths
KVM: selftests: check returned evmcs version range
x86/kvm/hyper-v: nested_enable_evmcs() sets vmcs_version incorrectly
KVM: VMX: Move vmx_vcpu_run()'s VM-Enter asm blob to a helper function
kvm: selftests: Fix region overlap check in kvm_util
kvm: vmx: fix some -Wmissing-prototypes warnings
KVM: nSVM: clear events pending from svm_complete_interrupts() when exiting to L1
svm: Fix AVIC incomplete IPI emulation
svm: Add warning message for AVIC IPI invalid target
KVM: x86: WARN_ONCE if sending a PV IPI returns a fatal error
KVM: x86: Fix PV IPIs for 32-bit KVM host
x86/kvm/hyper-v: recommend using eVMCS only when it is enabled
x86/kvm/hyper-v: don't recommend doing reset via synthetic MSR
kvm: x86/vmx: Use kzalloc for cached_vmcs12
KVM: VMX: Use the correct field var when clearing VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL
KVM: x86: Fix single-step debugging
x86/kvm/hyper-v: don't announce GUEST IDLE MSR support
Linus Torvalds [Sun, 27 Jan 2019 17:18:05 +0000 (09:18 -0800)]
Merge tag 'dma-mapping-5.0-2' of git://git.infradead.org/users/hch/dma-mapping
Pull dma-mapping fix from Christoph Hellwig:
"Fix a xen-swiotlb regression on arm64"
* tag 'dma-mapping-5.0-2' of git://git.infradead.org/users/hch/dma-mapping:
arm64/xen: fix xen-swiotlb cache flushing
Linus Torvalds [Sun, 27 Jan 2019 17:11:51 +0000 (09:11 -0800)]
Merge tag 'libnvdimm-fixes-5.0-rc4' of git://git./linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm fixes from Dan Williams:
"A fix for namespace label support for non-Intel NVDIMMs that implement
the ACPI standard label method.
This has apparently never worked and could wait for v5.1. However it
has enough visibility with hardware vendors [1] and distro bug
trackers [2], and low enough risk that I decided it should go in for
-rc4. The other fixups target the new, for v5.0, nvdimm security
functionality. The larger init path fixup closes a memory leak and a
potential userspace lockup due to missed notifications.
[1] https://github.com/pmem/ndctl/issues/78
[2] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/
1811785
These have all soaked in -next for a week with no reported issues.
Summary:
- Fix support for NVDIMMs that implement the ACPI standard label
methods.
- Fix error handling for security overwrite (memory leak / userspace
hang condition), and another one-line security cleanup"
* tag 'libnvdimm-fixes-5.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
acpi/nfit: Fix command-supported detection
acpi/nfit: Block function zero DSMs
libnvdimm/security: Require nvdimm_security_setup_events() to succeed
nfit_test: fix security state pull for nvdimm security nfit_test
Linus Torvalds [Sun, 27 Jan 2019 17:07:03 +0000 (09:07 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/dtor/input
Pull input fixes from Dmitry Torokhov:
"A fixup for the input_event fix for y2038 Sparc64, and couple other
minor fixes"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: input_event - fix the CONFIG_SPARC64 mixup
Input: olpc_apsp - assign priv->dev earlier
Input: uinput - fix undefined behavior in uinput_validate_absinfo()
Input: raspberrypi-ts - fix link error
Input: xpad - add support for SteelSeries Stratus Duo
Input: input_event - provide override for sparc64
Linus Torvalds [Sun, 27 Jan 2019 16:59:12 +0000 (08:59 -0800)]
Merge git://git./linux/kernel/git/davem/net
Pull networking fixes from David Miller:
1) Count ttl-dropped frames properly in mac80211, from Bob Copeland.
2) Integer overflow in ktime handling of bcm can code, from Oliver
Hartkopp.
3) Fix RX desc handling wrt. hw checksumming in ravb, from Simon
Horman.
4) Various hash key fixes in hv_netvsc, from Haiyang Zhang.
5) Use after free in ax25, from Eric Dumazet.
6) Several fixes to the SSN support in SCTP, from Xin Long.
7) Do not process frames after a NAPI reschedule in ibmveth, from
Thomas Falcon.
8) Fix NLA_POLICY_NESTED arguments, from Johannes Berg.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (42 commits)
qed: Revert error handling changes.
cfg80211: extend range deviation for DMG
cfg80211: reg: remove warn_on for a normal case
mac80211: Add attribute aligned(2) to struct 'action'
mac80211: don't initiate TDLS connection if station is not associated to AP
nl80211: fix NLA_POLICY_NESTED() arguments
ibmveth: Do not process frames after calling napi_reschedule
net: dev_is_mac_header_xmit() true for ARPHRD_RAWIP
net: usb: asix: ax88772_bind return error when hw_reset fail
MAINTAINERS: Update cavium networking drivers
net/mlx4_core: Fix error handling when initializing CQ bufs in the driver
net/mlx4_core: Add masking for a few queries on HCA caps
sctp: set flow sport from saddr only when it's 0
sctp: set chunk transport correctly when it's a new asoc
sctp: improve the events for sctp stream adding
sctp: improve the events for sctp stream reset
ip_tunnel: Make none-tunnel-dst tunnel port work with lwtunnel
ax25: fix possible use-after-free
sfc: suppress duplicate nvmem partition types in efx_ef10_mtd_probe
hv_netvsc: fix typos in code comments
...
Jens Axboe [Sun, 27 Jan 2019 13:35:28 +0000 (06:35 -0700)]
Revert "block: cover another queue enter recursion via BIO_QUEUE_ENTERED"
We can't touch a bio after ->make_request_fn(), for all we know it could
already have been completed by the time this function returns.
This reverts commit
698cef173983b086977e633e46476e0f925ca01e.
Reported-by: syzbot+4df6ca820108fd248943@syzkaller.appspotmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Linus Torvalds [Sat, 26 Jan 2019 23:38:22 +0000 (15:38 -0800)]
Merge tag '5.0-rc3-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull smb3 fixes from Steve French:
"A set of small smb3 fixes, some fixing various crediting issues
discovered during xfstest runs, five for stable"
* tag '5.0-rc3-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: print CIFSMaxBufSize as part of /proc/fs/cifs/DebugData
smb3: add credits we receive from oplock/break PDUs
CIFS: Fix mounts if the client is low on credits
CIFS: Do not assume one credit for async responses
CIFS: Fix credit calculations in compound mid callback
CIFS: Fix credit calculation for encrypted reads with errors
CIFS: Fix credits calculations for reads with errors
CIFS: Do not reconnect TCP session in add_credits()
smb3: Cleanup license mess
CIFS: Fix possible hang during async MTU reads and writes
cifs: fix memory leak of an allocated cifs_ntsd structure
Linus Torvalds [Sat, 26 Jan 2019 23:27:04 +0000 (15:27 -0800)]
Merge tag 'vfio-v5.0-rc4' of git://github.com/awilliam/linux-vfio
Pull VFIO fixes from Alex Williamson:
- cleanup licenses in new files (Thomas Gleixner)
- cleanup new compiler warnings (Alexey Kardashevskiy)
* tag 'vfio-v5.0-rc4' of git://github.com/awilliam/linux-vfio:
vfio-pci/nvlink2: Fix ancient gcc warnings
vfio/pci: Cleanup license mess
Linus Torvalds [Sat, 26 Jan 2019 23:03:43 +0000 (15:03 -0800)]
Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Six fixes, all of which appear to have user visible consequences.
The DMA one is a regression fix from the merge window and of the
others, four are driver specific and one specific to the target code"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: ufs: Use explicit access size in ufshcd_dump_regs
scsi: tcmu: fix use after free
scsi: csiostor: fix NULL pointer dereference in csio_vport_set_state()
scsi: lpfc: nvmet: avoid hang / use-after-free when destroying targetport
scsi: lpfc: nvme: avoid hang / use-after-free when destroying localport
scsi: communicate max segment size to the DMA mapping code
Linus Torvalds [Sat, 26 Jan 2019 20:42:41 +0000 (12:42 -0800)]
Merge tag 'for-linus-
20190125' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
"A collection of fixes for this release. This contains:
- Silence sparse rightfully complaining about non-static wbt
functions (Bart)
- Fixes for the zoned comments/ioctl documentation (Damien)
- direct-io fix that's been lingering for a while (Ernesto)
- cgroup writeback fix (Tejun)
- Set of NVMe patches for nvme-rdma/tcp (Sagi, Hannes, Raju)
- Block recursion tracking fix (Ming)
- Fix debugfs command flag naming for a few flags (Jianchao)"
* tag 'for-linus-
20190125' of git://git.kernel.dk/linux-block:
block: Fix comment typo
uapi: fix ioctl documentation
blk-wbt: Declare local functions static
blk-mq: fix the cmd_flag_name array
nvme-multipath: drop optimization for static ANA group IDs
nvmet-rdma: fix null dereference under heavy load
nvme-rdma: rework queue maps handling
nvme-tcp: fix timeout handler
nvme-rdma: fix timeout handler
writeback: synchronize sync(2) against cgroup writeback membership switches
block: cover another queue enter recursion via BIO_QUEUE_ENTERED
direct-io: allow direct writes to empty inodes
David S. Miller [Fri, 25 Jan 2019 23:32:28 +0000 (15:32 -0800)]
qed: Revert error handling changes.
This is new code and not bug fixes.
This reverts all changes added by merge commit
8fb18be93efd7292d6ee403b9f61af1008239639
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Fri, 25 Jan 2019 23:07:03 +0000 (13:07 -1000)]
Merge tag 'mmc-v5.0-rc2' of git://git./linux/kernel/git/ulfh/mmc
Pull MMC fixes from Ulf Hansson:
- sdhci-acpi: Fixup build dependency for PCI
- sdhci-omap: Resolve Kconfig warnings on keystone
- sdhci-iproc: Propagate errors from DT parsing
- meson-gx: Fixup IRQ handling in release callback
- meson-gx: Use signal re-sampling to fixup tuning
- dw_mmc-bluefield: Fix the license information
* tag 'mmc-v5.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: dw_mmc-bluefield: : Fix the license information
mmc: meson-gx: enable signal re-sampling together with tuning
mmc: sdhci-iproc: handle mmc_of_parse() errors during probe
mmc: meson-gx: Free irq in release() callback
mmc: host: Fix Kconfig warnings on keystone_defconfig
mmc: sdhci-acpi: Make PCI dependency explicit
Linus Torvalds [Fri, 25 Jan 2019 23:03:34 +0000 (13:03 -1000)]
Merge tag 'char-misc-5.0-rc4' of git://git./linux/kernel/git/gregkh/char-misc
Pull char/misc driver fixes from Greg KH:
"Here are some small char and misc driver fixes to resolve some
reported issues, as well as a number of binderfs fixups that were
found after auditing the filesystem code by Al Viro. As binderfs
hasn't been in a previous release yet, it's good to get these in now
before the first users show up.
All of these have been in linux-next for a bit with no reported
issues"
* tag 'char-misc-5.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (26 commits)
i3c: master: Fix an error checking typo in 'cdns_i3c_master_probe()'
binderfs: switch from d_add() to d_instantiate()
binderfs: drop lock in binderfs_binder_ctl_create
binderfs: kill_litter_super() before cleanup
binderfs: rework binderfs_binder_device_create()
binderfs: rework binderfs_fill_super()
binderfs: prevent renaming the control dentry
binderfs: remove outdated comment
binderfs: use __u32 for device numbers
binderfs: use correct include guards in header
misc: pvpanic: fix warning implicit declaration
char/mwave: fix potential Spectre v1 vulnerability
misc: ibmvsm: Fix potential NULL pointer dereference
binderfs: fix error return code in binderfs_fill_super()
mei: me: add denverton innovation engine device IDs
mei: me: mark LBG devices as having dma support
mei: dma: silent the reject message
binderfs: handle !CONFIG_IPC_NS builds
binderfs: reserve devices for initial mount
binderfs: rename header to binderfs.h
...
Linus Torvalds [Fri, 25 Jan 2019 23:02:12 +0000 (13:02 -1000)]
Merge tag 'staging-5.0-rc4' of git://git./linux/kernel/git/gregkh/staging
Pull staging driver fixes from Greg KH:
"Here are some small staging driver fixes for 5.0-rc4.
They resolve some reported bugs and add a new device id for one
driver. Nothing major at all, but all good to have.
All of these have been in linux-next for a while with no reported
issues"
* tag 'staging-5.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
staging: android: ion: Support cpu access during dma_buf_detach
staging: rtl8723bs: Fix build error with Clang when inlining is disabled
staging: rtl8188eu: Add device code for D-Link DWA-121 rev B1
staging: vchiq: Fix local event signalling
Staging: wilc1000: unlock on error in init_chip()
staging: wilc1000: fix memory leak in wilc_add_rx_gtk
staging: wilc1000: fix registration frame size
Linus Torvalds [Fri, 25 Jan 2019 22:58:40 +0000 (12:58 -1000)]
Merge tag 'tty-5.0-rc4' of git://git./linux/kernel/git/gregkh/tty
Pull tty/serial driver fixes from Greg KH:
"Here are a number of small tty core and serial driver fixes for
5.0-rc4 to resolve some reported issues.
Nothing major, the small serial driver fixes, a tty core fixup for a
crash that was reported, and some good vt fixes from Nicolas Pitre as
he seems to be auditing that chunk of code a lot lately.
All of these have been in linux-next for a while with no reported
issues"
* tag 'tty-5.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
serial: fsl_lpuart: fix maximum acceptable baud rate with over-sampling
tty: serial: qcom_geni_serial: Allow mctrl when flow control is disabled
tty: Handle problem if line discipline does not have receive_buf
vgacon: unconfuse vc_origin when using soft scrollback
vt: invoke notifier on screen size change
vt: always call notifier with the console lock held
vt: make vt_console_print() compatible with the unicode screen buffer
tty/n_hdlc: fix __might_sleep warning
serial: 8250: Fix serial8250 initialization crash
uart: Fix crash in uart_write and uart_put_char
Linus Torvalds [Fri, 25 Jan 2019 22:57:09 +0000 (12:57 -1000)]
Merge tag 'usb-5.0-rc4' of git://git./linux/kernel/git/gregkh/usb
Pull USB/PHY fixes from Greg KH:
"Here are a number of small USB and PHY driver fixes for 5.0-rc4.
Nothing major at all, just the usual selection of USB gadget bugfixes,
some new USB serial driver ids, some SPDX fixes, and some PHY driver
fixes for reported issues.
All of these have been in linux-next for a while with no reported
issues"
* tag 'usb-5.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
USB: serial: keyspan_usa: add proper SPDX lines for .h files
USB: EHCI: ehci-mv: add MODULE_DEVICE_TABLE
USB: leds: fix regression in usbport led trigger
usb: chipidea: fix static checker warning for NULL pointer
MAINTAINERS: email address update in MAINTAINERS entries
USB: usbip: delete README file
USB: serial: pl2303: add new PID to support PL2303TB
usb: dwc2: gadget: Fix Remote Wakeup interrupt bit clearing
phy: ath79-usb: Fix the main reset name to match the DT binding
phy: ath79-usb: Fix the power on error path
phy: fix build breakage: add PHY_MODE_SATA
phy: ti: ensure priv is not null before dereferencing it
USB: serial: ftdi_sio: fix GPIO not working in autosuspend
usb: gadget: Potential NULL dereference on allocation error
usb: dwc3: gadget: Fix the uninitialized link_state when udc starts
usb: dwc3: gadget: Clear req->needs_extra_trb flag on cleanup
usb: dwc3: gadget: synchronize_irq dwc irq in suspend
USB: serial: simple: add Motorola Tetra TPG2200 device id
David S. Miller [Fri, 25 Jan 2019 18:59:36 +0000 (10:59 -0800)]
Merge tag 'mac80211-for-davem-2019-01-25' of git://git./linux/kernel/git/jberg/mac80211
Johannes Berg says:
====================
Just a few small fixes:
* avoid trying to operate TDLS when not connection,
this is not valid and led to issues
* count TTL-dropped frames in mesh better
* deal with new WiGig channels in regulatory code
* remove a WARN_ON() that can trigger due to benign
races during device/driver registration
* fix nested netlink policy maxattrs (syzkaller)
* fix hwsim n_limits (syzkaller)
* propagate __aligned(2) to a surrounding struct
* return proper error in virt_wifi error path
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Gustavo A. R. Silva [Fri, 25 Jan 2019 18:23:17 +0000 (12:23 -0600)]
KVM: x86: Mark expected switch fall-throughs
In preparation to enabling -Wimplicit-fallthrough, mark switch
cases where we are expecting to fall through.
This patch fixes the following warnings:
arch/x86/kvm/lapic.c:1037:27: warning: this statement may fall through [-Wimplicit-fallthrough=]
arch/x86/kvm/lapic.c:1876:3: warning: this statement may fall through [-Wimplicit-fallthrough=]
arch/x86/kvm/hyperv.c:1637:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
arch/x86/kvm/svm.c:4396:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
arch/x86/kvm/mmu.c:4372:36: warning: this statement may fall through [-Wimplicit-fallthrough=]
arch/x86/kvm/x86.c:3835:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
arch/x86/kvm/x86.c:7938:23: warning: this statement may fall through [-Wimplicit-fallthrough=]
arch/x86/kvm/vmx/vmx.c:2015:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
arch/x86/kvm/vmx/vmx.c:1773:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
Warning level 3 was used: -Wimplicit-fallthrough=3
This patch is part of the ongoing efforts to enabling -Wimplicit-fallthrough.
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Masahiro Yamada [Fri, 25 Jan 2019 07:32:46 +0000 (16:32 +0900)]
KVM: x86: fix TRACE_INCLUDE_PATH and remove -I. header search paths
The header search path -I. in kernel Makefiles is very suspicious;
it allows the compiler to search for headers in the top of $(srctree),
where obviously no header file exists.
The reason of having -I. here is to make the incorrectly set
TRACE_INCLUDE_PATH working.
As the comment block in include/trace/define_trace.h says,
TRACE_INCLUDE_PATH should be a relative path to the define_trace.h
Fix the TRACE_INCLUDE_PATH, and remove the iffy include paths.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Vitaly Kuznetsov [Thu, 17 Jan 2019 17:12:10 +0000 (18:12 +0100)]
KVM: selftests: check returned evmcs version range
Check that KVM_CAP_HYPERV_ENLIGHTENED_VMCS returns correct version range.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Vitaly Kuznetsov [Thu, 17 Jan 2019 17:12:09 +0000 (18:12 +0100)]
x86/kvm/hyper-v: nested_enable_evmcs() sets vmcs_version incorrectly
Commit
e2e871ab2f02 ("x86/kvm/hyper-v: Introduce nested_get_evmcs_version()
helper") broke EVMCS enablement: to set vmcs_version we now call
nested_get_evmcs_version() but this function checks
enlightened_vmcs_enabled flag which is not yet set so we end up returning
zero.
Fix the issue by re-arranging things in nested_enable_evmcs().
Fixes: e2e871ab2f02 ("x86/kvm/hyper-v: Introduce nested_get_evmcs_version() helper")
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Sean Christopherson [Wed, 16 Jan 2019 01:10:53 +0000 (17:10 -0800)]
KVM: VMX: Move vmx_vcpu_run()'s VM-Enter asm blob to a helper function
...along with the function's STACK_FRAME_NON_STANDARD tag. Moving the
asm blob results in a significantly smaller amount of code that is
marked with STACK_FRAME_NON_STANDARD, which makes it far less likely
that gcc will split the function and trigger a spurious objtool warning.
As a bonus, removing STACK_FRAME_NON_STANDARD from vmx_vcpu_run() allows
the bulk of code to be properly checked by objtool.
Because %rbp is not loaded via VMCS fields, vmx_vcpu_run() must manually
save/restore the host's RBP and load the guest's RBP prior to calling
vmx_vmenter(). Modifying %rbp triggers objtool's stack validation code,
and so vmx_vcpu_run() is tagged with STACK_FRAME_NON_STANDARD since it's
impossible to avoid modifying %rbp.
Unfortunately, vmx_vcpu_run() is also a gigantic function that gcc will
split into separate functions, e.g. so that pieces of the function can
be inlined. Splitting the function means that the compiled Elf file
will contain one or more vmx_vcpu_run.part.* functions in addition to
a vmx_vcpu_run function. Depending on where the function is split,
objtool may warn about a "call without frame pointer save/setup" in
vmx_vcpu_run.part.* since objtool's stack validation looks for exact
names when whitelisting functions tagged with STACK_FRAME_NON_STANDARD.
Up until recently, the undesirable function splitting was effectively
blocked because vmx_vcpu_run() was tagged with __noclone. At the time,
__noclone had an unintended side effect that put vmx_vcpu_run() into a
separate optimization unit, which in turn prevented gcc from inlining
the function (or any of its own function calls) and thus eliminated gcc's
motivation to split the function. Removing the __noclone attribute
allowed gcc to optimize vmx_vcpu_run(), exposing the objtool warning.
Kudos to Qian Cai for root causing that the fnsplit optimization is what
caused objtool to complain.
Fixes: 453eafbe65f7 ("KVM: VMX: Move VM-Enter + VM-Exit handling to non-inline sub-routines")
Tested-by: Qian Cai <cai@lca.pw>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Ben Gardon [Wed, 16 Jan 2019 17:41:15 +0000 (09:41 -0800)]
kvm: selftests: Fix region overlap check in kvm_util
Fix a call to userspace_mem_region_find to conform to its spec of
taking an inclusive, inclusive range. It was previously being called
with an inclusive, exclusive range. Also remove a redundant region bounds
check in vm_userspace_mem_region_add. Region overlap checking is already
performed by the call to userspace_mem_region_find.
Tested: Compiled tools/testing/selftests/kvm with -static
Ran all resulting test binaries on an Intel Haswell test machine
All tests passed
Signed-off-by: Ben Gardon <bgardon@google.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Yi Wang [Mon, 21 Jan 2019 07:27:05 +0000 (15:27 +0800)]
kvm: vmx: fix some -Wmissing-prototypes warnings
We get some warnings when building kernel with W=1:
arch/x86/kvm/vmx/vmx.c:426:5: warning: no previous prototype for ‘kvm_fill_hv_flush_list_func’ [-Wmissing-prototypes]
arch/x86/kvm/vmx/nested.c:58:6: warning: no previous prototype for ‘init_vmcs_shadow_fields’ [-Wmissing-prototypes]
Make them static to fix this.
Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Vitaly Kuznetsov [Mon, 7 Jan 2019 18:44:51 +0000 (19:44 +0100)]
KVM: nSVM: clear events pending from svm_complete_interrupts() when exiting to L1
kvm-unit-tests' eventinj "NMI failing on IDT" test results in NMI being
delivered to the host (L1) when it's running nested. The problem seems to
be: svm_complete_interrupts() raises 'nmi_injected' flag but later we
decide to reflect EXIT_NPF to L1. The flag remains pending and we do NMI
injection upon entry so it got delivered to L1 instead of L2.
It seems that VMX code solves the same issue in prepare_vmcs12(), this was
introduced with code refactoring in commit
5f3d5799974b ("KVM: nVMX: Rework
event injection and recovery").
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Suravee Suthikulpanit [Tue, 22 Jan 2019 10:25:13 +0000 (10:25 +0000)]
svm: Fix AVIC incomplete IPI emulation
In case of incomplete IPI with invalid interrupt type, the current
SVM driver does not properly emulate the IPI, and fails to boot
FreeBSD guests with multiple vcpus when enabling AVIC.
Fix this by update APIC ICR high/low registers, which also
emulate sending the IPI.
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Suravee Suthikulpanit [Tue, 22 Jan 2019 10:24:19 +0000 (10:24 +0000)]
svm: Add warning message for AVIC IPI invalid target
Print warning message when IPI target ID is invalid due to one of
the following reasons:
* In logical mode: cluster > max_cluster (64)
* In physical mode: target > max_physical (512)
* Address is not present in the physical or logical ID tables
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Sean Christopherson [Wed, 23 Jan 2019 17:22:40 +0000 (09:22 -0800)]
KVM: x86: WARN_ONCE if sending a PV IPI returns a fatal error
KVM hypercalls return a negative value error code in case of a fatal
error, e.g. when the hypercall isn't supported or was made with invalid
parameters. WARN_ONCE on fatal errors when sending PV IPIs as any such
error all but guarantees an SMP system will hang due to a missing IPI.
Fixes: aaffcfd1e82d ("KVM: X86: Implement PV IPIs in linux guest")
Cc: stable@vger.kernel.org
Cc: Wanpeng Li <wanpengli@tencent.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Sean Christopherson [Wed, 23 Jan 2019 17:22:39 +0000 (09:22 -0800)]
KVM: x86: Fix PV IPIs for 32-bit KVM host
The recognition of the KVM_HC_SEND_IPI hypercall was unintentionally
wrapped in "#ifdef CONFIG_X86_64", causing 32-bit KVM hosts to reject
any and all PV IPI requests despite advertising the feature. This
results in all KVM paravirtualized guests hanging during SMP boot due
to IPIs never being delivered.
Fixes: 4180bf1b655a ("KVM: X86: Implement "send IPI" hypercall")
Cc: stable@vger.kernel.org
Cc: Wanpeng Li <wanpengli@tencent.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Vitaly Kuznetsov [Fri, 25 Jan 2019 11:19:34 +0000 (12:19 +0100)]
x86/kvm/hyper-v: recommend using eVMCS only when it is enabled
We shouldn't probably be suggesting using Enlightened VMCS when it's not
enabled (not supported from guest's point of view). Hyper-V on KVM seems
to be fine either way but let's be consistent.
Fixes: 2bc39970e932 ("x86/kvm/hyper-v: Introduce KVM_GET_SUPPORTED_HV_CPUID")
Reviewed-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Vitaly Kuznetsov [Fri, 25 Jan 2019 11:19:33 +0000 (12:19 +0100)]
x86/kvm/hyper-v: don't recommend doing reset via synthetic MSR
System reset through synthetic MSR is not recommended neither by genuine
Hyper-V nor my QEMU.
Fixes: 2bc39970e932 ("x86/kvm/hyper-v: Introduce KVM_GET_SUPPORTED_HV_CPUID")
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Tom Roeder [Thu, 24 Jan 2019 21:48:20 +0000 (13:48 -0800)]
kvm: x86/vmx: Use kzalloc for cached_vmcs12
This changes the allocation of cached_vmcs12 to use kzalloc instead of
kmalloc. This removes the information leak found by Syzkaller (see
Reported-by) in this case and prevents similar leaks from happening
based on cached_vmcs12.
It also changes vmx_get_nested_state to copy out the full 4k VMCS12_SIZE
in copy_to_user rather than only the size of the struct.
Tested: rebuilt against head, booted, and ran the syszkaller repro
https://syzkaller.appspot.com/text?tag=ReproC&x=
174efca3400000 without
observing any problems.
Reported-by: syzbot+ded1696f6b50b615b630@syzkaller.appspotmail.com
Fixes: 8fcc4b5923af5de58b80b53a069453b135693304
Cc: stable@vger.kernel.org
Signed-off-by: Tom Roeder <tmroeder@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Sean Christopherson [Mon, 14 Jan 2019 20:12:02 +0000 (12:12 -0800)]
KVM: VMX: Use the correct field var when clearing VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL
Fix a recently introduced bug that results in the wrong VMCS control
field being updated when applying a IA32_PERF_GLOBAL_CTRL errata.
Fixes: c73da3fcab43 ("KVM: VMX: Properly handle dynamic VM Entry/Exit controls")
Reported-by: Harald Arnesen <harald@skogtun.org>
Tested-by: Harald Arnesen <harald@skogtun.org>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Alexander Popov [Mon, 21 Jan 2019 12:48:40 +0000 (15:48 +0300)]
KVM: x86: Fix single-step debugging
The single-step debugging of KVM guests on x86 is broken: if we run
gdb 'stepi' command at the breakpoint when the guest interrupts are
enabled, RIP always jumps to native_apic_mem_write(). Then other
nasty effects follow.
Long investigation showed that on Jun 7, 2017 the
commit
c8401dda2f0a00cd25c0 ("KVM: x86: fix singlestepping over syscall")
introduced the kvm_run.debug corruption: kvm_vcpu_do_singlestep() can
be called without X86_EFLAGS_TF set.
Let's fix it. Please consider that for -stable.
Signed-off-by: Alexander Popov <alex.popov@linux.com>
Cc: stable@vger.kernel.org
Fixes: c8401dda2f0a00cd25c0 ("KVM: x86: fix singlestepping over syscall")
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Vitaly Kuznetsov [Thu, 24 Jan 2019 14:27:09 +0000 (15:27 +0100)]
x86/kvm/hyper-v: don't announce GUEST IDLE MSR support
HV_X64_MSR_GUEST_IDLE_AVAILABLE appeared in kvm_vcpu_ioctl_get_hv_cpuid()
by mistake: it announces support for HV_X64_MSR_GUEST_IDLE (0x400000F0)
which we don't support in KVM (yet).
Fixes: 2bc39970e932 ("x86/kvm/hyper-v: Introduce KVM_GET_SUPPORTED_HV_CPUID")
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Chaitanya Tata [Fri, 18 Jan 2019 21:47:47 +0000 (03:17 +0530)]
cfg80211: extend range deviation for DMG
Recently, DMG frequency bands have been extended till 71GHz, so extend
the range check till 20GHz (45-71GHZ), else some channels will be marked
as disabled.
Signed-off-by: Chaitanya Tata <Chaitanya.Tata@bluwireless.co.uk>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Chaitanya Tata [Thu, 24 Jan 2019 10:43:02 +0000 (16:13 +0530)]
cfg80211: reg: remove warn_on for a normal case
If there are simulatenous queries of regdb, then there might be a case
where multiple queries can trigger request_firmware_no_wait and can have
parallel callbacks being executed asynchronously. In this scenario we
might hit the WARN_ON.
So remove the warn_on, as the code already handles multiple callbacks
gracefully.
Signed-off-by: Chaitanya Tata <chaitanya.tata@bluwireless.co.uk>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Mathieu Malaterre [Thu, 24 Jan 2019 18:19:57 +0000 (19:19 +0100)]
mac80211: Add attribute aligned(2) to struct 'action'
During refactor in commit
9e478066eae4 ("mac80211: fix MU-MIMO
follow-MAC mode") a new struct 'action' was declared with packed
attribute as:
struct {
struct ieee80211_hdr_3addr hdr;
u8 category;
u8 action_code;
} __packed action;
But since struct 'ieee80211_hdr_3addr' is declared with an aligned
keyword as:
struct ieee80211_hdr {
__le16 frame_control;
__le16 duration_id;
u8 addr1[ETH_ALEN];
u8 addr2[ETH_ALEN];
u8 addr3[ETH_ALEN];
__le16 seq_ctrl;
u8 addr4[ETH_ALEN];
} __packed __aligned(2);
Solve the ambiguity of placing aligned structure in a packed one by
adding the aligned(2) attribute to struct 'action'.
This removes the following warning (W=1):
net/mac80211/rx.c:234:2: warning: alignment 1 of 'struct <anonymous>' is less than 2 [-Wpacked-not-aligned]
Cc: Johannes Berg <johannes.berg@intel.com>
Suggested-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Balaji Pothunoori [Mon, 21 Jan 2019 07:00:43 +0000 (12:30 +0530)]
mac80211: don't initiate TDLS connection if station is not associated to AP
Following call trace is observed while adding TDLS peer entry in driver
during TDLS setup.
Call Trace:
[<
c1301476>] dump_stack+0x47/0x61
[<
c10537d2>] __warn+0xe2/0x100
[<
fa22415f>] ? sta_apply_parameters+0x49f/0x550 [mac80211]
[<
c1053895>] warn_slowpath_null+0x25/0x30
[<
fa22415f>] sta_apply_parameters+0x49f/0x550 [mac80211]
[<
fa20ad42>] ? sta_info_alloc+0x1c2/0x450 [mac80211]
[<
fa224623>] ieee80211_add_station+0xe3/0x160 [mac80211]
[<
c1876fe3>] nl80211_new_station+0x273/0x420
[<
c170f6d9>] genl_rcv_msg+0x219/0x3c0
[<
c170f4c0>] ? genl_rcv+0x30/0x30
[<
c170ee7e>] netlink_rcv_skb+0x8e/0xb0
[<
c170f4ac>] genl_rcv+0x1c/0x30
[<
c170e8aa>] netlink_unicast+0x13a/0x1d0
[<
c170ec18>] netlink_sendmsg+0x2d8/0x390
[<
c16c5acd>] sock_sendmsg+0x2d/0x40
[<
c16c6369>] ___sys_sendmsg+0x1d9/0x1e0
Fixing this by allowing TDLS setup request only when we have completed
association.
Signed-off-by: Balaji Pothunoori <bpothuno@codeaurora.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Johannes Berg [Fri, 25 Jan 2019 08:26:32 +0000 (09:26 +0100)]
nl80211: fix NLA_POLICY_NESTED() arguments
syzbot reported an out-of-bounds read when passing certain
malformed messages into nl80211. The specific place where
this happened isn't interesting, the problem is that nested
policy parsing was referring to the wrong maximum attribute
and thus the policy wasn't long enough.
Fix this by referring to the correct attribute. Since this
is really not necessary, I'll come up with a separate patch
to just pass the policy instead of both, in the common case
we can infer the maxattr from the size of the policy array.
Reported-by: syzbot+4157b036c5f4713b1f2f@syzkaller.appspotmail.com
Cc: stable@vger.kernel.org
Fixes: 9bb7e0f24e7e ("cfg80211: add peer measurement with FTM initiator API")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Thomas Falcon [Thu, 24 Jan 2019 17:17:01 +0000 (11:17 -0600)]
ibmveth: Do not process frames after calling napi_reschedule
The IBM virtual ethernet driver's polling function continues
to process frames after rescheduling NAPI, resulting in a warning
if it exhausted its budget. Do not restart polling after calling
napi_reschedule. Instead let frames be processed in the following
instance.
Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Maciej Żenczykowski [Thu, 24 Jan 2019 11:07:02 +0000 (03:07 -0800)]
net: dev_is_mac_header_xmit() true for ARPHRD_RAWIP
__bpf_redirect() and act_mirred checks this boolean
to determine whether to prefix an ethernet header.
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Zhang Run [Thu, 24 Jan 2019 05:48:49 +0000 (13:48 +0800)]
net: usb: asix: ax88772_bind return error when hw_reset fail
The ax88772_bind() should return error code immediately when the PHY
was not reset properly through ax88772a_hw_reset().
Otherwise, The asix_get_phyid() will block when get the PHY
Identifier from the PHYSID1 MII registers through asix_mdio_read()
due to the PHY isn't ready. Furthermore, it will produce a lot of
error message cause system crash.As follows:
asix 1-1:1.0 (unnamed net_device) (uninitialized): Failed to write
reg index 0x0000: -71
asix 1-1:1.0 (unnamed net_device) (uninitialized): Failed to send
software reset:
ffffffb9
asix 1-1:1.0 (unnamed net_device) (uninitialized): Failed to write
reg index 0x0000: -71
asix 1-1:1.0 (unnamed net_device) (uninitialized): Failed to enable
software MII access
asix 1-1:1.0 (unnamed net_device) (uninitialized): Failed to read
reg index 0x0000: -71
asix 1-1:1.0 (unnamed net_device) (uninitialized): Failed to write
reg index 0x0000: -71
asix 1-1:1.0 (unnamed net_device) (uninitialized): Failed to enable
software MII access
asix 1-1:1.0 (unnamed net_device) (uninitialized): Failed to read
reg index 0x0000: -71
...
Signed-off-by: Zhang Run <zhang.run@zte.com.cn>
Reviewed-by: Yang Wei <yang.wei9@zte.com.cn>
Tested-by: Marcel Ziswiler <marcel.ziswiler@toradex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sudarsana Reddy Kalluru [Thu, 24 Jan 2019 02:03:20 +0000 (18:03 -0800)]
MAINTAINERS: Update cavium networking drivers
Following Marvell's acquisition of Cavium, we need to update all the
Cavium drivers maintainer's entries to point to our new e-mail addresses.
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Ameen Rahman <Ameen.Rahman@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 25 Jan 2019 06:22:17 +0000 (22:22 -0800)]
Merge tag 'hyperv-fixes-signed' of git://git./linux/kernel/git/hyperv/linux
Sasha Levin says:
====================
Hyper-V hv_netvsc commits for 5.0
Three patches from Haiyang Zhang to fix settings hash key using ethtool,
and Adrian Vladu's first patch fixing a few spelling mistakes.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 25 Jan 2019 05:52:37 +0000 (21:52 -0800)]
Merge tag 'linux-can-fixes-for-5.0-
20190122' of git://git./linux/kernel/git/mkl/linux-can
Marc Kleine-Budde says:
====================
pull-request: can 2019-01-22
this is a pull request of 4 patches for net/master.
The first patch by is by Manfred Schlaegl and reverts a patch that caused wrong
warning messages in certain use cases. The next patch is by Oliver Hartkopp for
the bcm that adds sanity checks for the timer value before using it to detect
potential interger overflows. The last two patches are for the flexcan driver,
YueHaibing's patch fixes the the return value in the error path of the
flexcan_setup_stop_mode() function. The second patch is by Uwe Kleine-König and
fixes a NULL pointer deref on older flexcan cores in flexcan_chip_start().
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 25 Jan 2019 05:48:26 +0000 (21:48 -0800)]
Merge branch 'mlx4_core-fixes'
Tariq Toukan says:
====================
mlx4_core fixes for 5.0-rc
This patchset includes two fixes for the mlx4_core driver.
First patch by Aya fixes inaccurate parsing of some FW fields, mistakenly
including additional (mostly reserved) bits.
Second patch by Jack fixes a wrong (yet harmless) error handling of
calls to copy_to_user() during the CQs init stage.
Series generated against net commit:
49a57857aeea Linux 5.0-rc3
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jack Morgenstein [Tue, 22 Jan 2019 13:19:45 +0000 (15:19 +0200)]
net/mlx4_core: Fix error handling when initializing CQ bufs in the driver
Procedure mlx4_init_user_cqes() handles returns by copy_to_user
incorrectly. copy_to_user() returns the number of bytes not copied.
Thus, a non-zero return should be treated as a -EFAULT error
(as is done elsewhere in the kernel). However, mlx4_init_user_cqes()
error handling simply returns the number of bytes not copied
(instead of -EFAULT).
Note, though, that this is a harmless bug: procedure mlx4_alloc_cq()
(which is the only caller of mlx4_init_user_cqes()) treats any
non-zero return as an error, but that returned error value is processed
internally, and not passed further up the call stack.
In addition, fixes the following sparse warning:
warning: incorrect type in argument 1 (different address spaces)
expected void [noderef] <asn:1>*to
got void *buf
Fixes: e45678973dcb ("{net, IB}/mlx4: Initialize CQ buffers in the driver when possible")
Reported by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Aya Levin [Tue, 22 Jan 2019 13:19:44 +0000 (15:19 +0200)]
net/mlx4_core: Add masking for a few queries on HCA caps
Driver reads the query HCA capabilities without the corresponding masks.
Without the correct masks, the base addresses of the queues are
unaligned. In addition some reserved bits were wrongly read. Using the
correct masks, ensures alignment of the base addresses and allows future
firmware versions safe use of the reserved bits.
Fixes: ab9c17a009ee ("mlx4_core: Modify driver initialization flow to accommodate SRIOV for Ethernet")
Fixes: 0ff1fb654bec ("{NET, IB}/mlx4: Add device managed flow steering firmware API")
Signed-off-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Mon, 21 Jan 2019 18:42:41 +0000 (02:42 +0800)]
sctp: set flow sport from saddr only when it's 0
Now sctp_transport_pmtu() passes transport->saddr into .get_dst() to set
flow sport from 'saddr'. However, transport->saddr is set only when
transport->dst exists in sctp_transport_route().
If sctp_transport_pmtu() is called without transport->saddr set, like
when transport->dst doesn't exists, the flow sport will be set to 0
from transport->saddr, which will cause a wrong route to be got.
Commit
6e91b578bf3f ("sctp: re-use sctp_transport_pmtu in
sctp_transport_route") made the issue be triggered more easily
since sctp_transport_pmtu() would be called in sctp_transport_route()
after that.
In gerneral, fl4->fl4_sport should always be set to
htons(asoc->base.bind_addr.port), unless transport->asoc doesn't exist
in sctp_v4/6_get_dst(), which is the case:
sctp_ootb_pkt_new() ->
sctp_transport_route()
For that, we can simply handle it by setting flow sport from saddr only
when it's 0 in sctp_v4/6_get_dst().
Fixes: 6e91b578bf3f ("sctp: re-use sctp_transport_pmtu in sctp_transport_route")
Reported-by: Ying Xu <yinxu@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Mon, 21 Jan 2019 18:42:09 +0000 (02:42 +0800)]
sctp: set chunk transport correctly when it's a new asoc
In the paths:
sctp_sf_do_unexpected_init() ->
sctp_make_init_ack()
sctp_sf_do_dupcook_a/b()() ->
sctp_sf_do_5_1D_ce()
The new chunk 'retval' transport is set from the incoming chunk 'chunk'
transport. However, 'retval' transport belong to the new asoc, which
is a different one from 'chunk' transport's asoc.
It will cause that the 'retval' chunk gets set with a wrong transport.
Later when sending it and because of Commit
b9fd683982c9 ("sctp: add
sctp_packet_singleton"), sctp_packet_singleton() will set some fields,
like vtag to 'retval' chunk from that wrong transport's asoc.
This patch is to fix it by setting 'retval' transport correctly which
belongs to the right asoc in sctp_make_init_ack() and
sctp_sf_do_5_1D_ce().
Fixes: b9fd683982c9 ("sctp: add sctp_packet_singleton")
Reported-by: Ying Xu <yinxu@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Mon, 21 Jan 2019 18:40:12 +0000 (02:40 +0800)]
sctp: improve the events for sctp stream adding
This patch is to improve sctp stream adding events in 2 places:
1. In sctp_process_strreset_addstrm_out(), move up SCTP_MAX_STREAM
and in stream allocation failure checks, as the adding has to
succeed after reconf_timer stops for the in stream adding
request retransmission.
3. In sctp_process_strreset_addstrm_in(), no event should be sent,
as no in or out stream is added here.
Fixes: 50a41591f110 ("sctp: implement receiver-side procedures for the Add Outgoing Streams Request Parameter")
Fixes: c5c4ebb3ab87 ("sctp: implement receiver-side procedures for the Add Incoming Streams Request Parameter")
Reported-by: Ying Xu <yinxu@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Mon, 21 Jan 2019 18:39:34 +0000 (02:39 +0800)]
sctp: improve the events for sctp stream reset
This patch is to improve sctp stream reset events in 4 places:
1. In sctp_process_strreset_outreq(), the flag should always be set with
SCTP_STREAM_RESET_INCOMING_SSN instead of OUTGOING, as receiver's in
stream is reset here.
2. In sctp_process_strreset_outreq(), move up SCTP_STRRESET_ERR_WRONG_SSN
check, as the reset has to succeed after reconf_timer stops for the
in stream reset request retransmission.
3. In sctp_process_strreset_inreq(), no event should be sent, as no in
or out stream is reset here.
4. In sctp_process_strreset_resp(), SCTP_STREAM_RESET_INCOMING_SSN or
OUTGOING event should always be sent for stream reset requests, no
matter it fails or succeeds to process the request.
Fixes: 810544764536 ("sctp: implement receiver-side procedures for the Outgoing SSN Reset Request Parameter")
Fixes: 16e1a91965b0 ("sctp: implement receiver-side procedures for the Incoming SSN Reset Request Parameter")
Fixes: 11ae76e67a17 ("sctp: implement receiver-side procedures for the Reconf Response Parameter")
Reported-by: Ying Xu <yinxu@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
wenxu [Sat, 19 Jan 2019 05:11:25 +0000 (13:11 +0800)]
ip_tunnel: Make none-tunnel-dst tunnel port work with lwtunnel
ip l add dev tun type gretap key 1000
ip a a dev tun 10.0.0.1/24
Packets with tun-id 1000 can be recived by tun dev. But packet can't
be sent through dev tun for non-tunnel-dst
With this patch: tunnel-dst can be get through lwtunnel like beflow:
ip r a 10.0.0.7 encap ip dst 172.168.0.11 dev tun
Signed-off-by: wenxu <wenxu@ucloud.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Thu, 24 Jan 2019 23:19:10 +0000 (12:19 +1300)]
Merge tag 'drm-fixes-2019-01-25-1' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"Live from LCA pull, some fixes all over the place,
i915:
- GVT workload destruction fix
msm:
- A6XX opp-level fix
- build fixes
- hard-coded irq removal
amdgpu:
- overclocking fix
- hybrid gfx fix
sun4i:
- fix TMDS clock usage"
* tag 'drm-fixes-2019-01-25-1' of git://anongit.freedesktop.org/drm/drm:
drm/msm: avoid unused function warning
drm/msm: Add __printf verification
drm/msm: Fix A6XX support for opp-level
drm/msm: honor GPU_READONLY flag
drm/msm: drop interrupt-names
drm/msm/gpu: Remove hardcoded interrupt name
drm/msm/gpu: fix building without debugfs
drm/i915/execlists: Mark up priority boost on preemption
drm/i915/gvt: release shadow batch buffer and wa_ctx before destroy one workload
drm/sun4i: hdmi: Fix usage of TMDS clock
drm/amd/powerplay: OD setting fix on Vega10
drm/amdgpu: Add APTX quirk for Lenovo laptop
drm/msm: Unblock writer if reader closes file
Dave Airlie [Thu, 24 Jan 2019 21:44:53 +0000 (07:44 +1000)]
Merge tag 'drm-msm-fixes-2019-01-24' of git://people.freedesktop.org/~robclark/linux into drm-fixes
A few fixes for v5.0.. the opp-level fix and removal of hard-coded irq
name is partially to make things smoother in v5.1 merge window to
avoid dependency on drm vs dt trees, but are otherwise sane changes.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rob Clark <robdclark@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGsAEHd2tGRQxRTs+A-8y_tthPs2iUgCCCEwR5vDMXab4A@mail.gmail.com
Dave Airlie [Thu, 24 Jan 2019 20:57:45 +0000 (06:57 +1000)]
Merge tag 'drm-intel-fixes-2019-01-24' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
drm/i915 fixes for v5.0-rc4:
- fix priority boost
- gvt: fix destroy of shadow batch and indirect ctx
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/87k1iu1a2e.fsf@intel.com
Dave Airlie [Thu, 24 Jan 2019 20:53:16 +0000 (06:53 +1000)]
Merge branch 'drm-fixes-5.0' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
- Overclock fix for vega10
- Hybrid gfx laptop fix
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190123231004.3111-1-alexander.deucher@amd.com
Ronnie Sahlberg [Thu, 24 Jan 2019 06:19:31 +0000 (16:19 +1000)]
cifs: print CIFSMaxBufSize as part of /proc/fs/cifs/DebugData
Was helpful in debug for some recent problems.
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Ronnie Sahlberg [Wed, 23 Jan 2019 06:20:38 +0000 (16:20 +1000)]
smb3: add credits we receive from oplock/break PDUs
Otherwise we gradually leak credits leading to potential
hung session.
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
CC: Stable <stable@vger.kernel.org>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Pavel Shilovsky [Wed, 16 Jan 2019 19:48:42 +0000 (11:48 -0800)]
CIFS: Fix mounts if the client is low on credits
If the server doesn't grant us at least 3 credits during the mount
we won't be able to complete it because query path info operation
requires 3 credits. Use the cached file handle if possible to allow
the mount to succeed.
Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Pavel Shilovsky [Tue, 15 Jan 2019 23:08:48 +0000 (15:08 -0800)]
CIFS: Do not assume one credit for async responses
If we don't receive a response we can't assume that the server
granted one credit. Assume zero credits in such cases.
Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Pavel Shilovsky [Wed, 23 Jan 2019 00:50:21 +0000 (16:50 -0800)]
CIFS: Fix credit calculations in compound mid callback
The current code doesn't do proper accounting for credits
in SMB1 case: it adds one credit per response only if we get
a complete response while it needs to return it unconditionally.
Fix this and also include malformed responses for SMB2+ into
accounting for credits because such responses have Credit
Granted field, thus nothing prevents to get a proper credit
value from them.
Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Pavel Shilovsky [Fri, 18 Jan 2019 23:38:11 +0000 (15:38 -0800)]
CIFS: Fix credit calculation for encrypted reads with errors
We do need to account for credits received in error responses
to read requests on encrypted sessions.
Cc: <stable@vger.kernel.org>
Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Pavel Shilovsky [Thu, 17 Jan 2019 23:29:26 +0000 (15:29 -0800)]
CIFS: Fix credits calculations for reads with errors
Currently we mark MID as malformed if we get an error from server
in a read response. This leads to not properly processing credits
in the readv callback. Fix this by marking such a response as
normal received response and process it appropriately.
Cc: <stable@vger.kernel.org>
Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Pavel Shilovsky [Sat, 19 Jan 2019 01:25:36 +0000 (17:25 -0800)]
CIFS: Do not reconnect TCP session in add_credits()
When executing add_credits() we currently call cifs_reconnect()
if the number of credits is zero and there are no requests in
flight. In this case we may call cifs_reconnect() recursively
twice and cause memory corruption given the following sequence
of functions:
mid1.callback() -> add_credits() -> cifs_reconnect() ->
-> mid2.callback() -> add_credits() -> cifs_reconnect().
Fix this by avoiding to call cifs_reconnect() in add_credits()
and checking for zero credits in the demultiplex thread.
Cc: <stable@vger.kernel.org>
Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Dave Airlie [Thu, 24 Jan 2019 20:46:51 +0000 (06:46 +1000)]
Merge tag 'drm-misc-fixes-2019-01-24' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
drm-misc-fixes for v5.0-rc4:
- Small refcounting fix to sun4i's HDMI support.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/588e9ecb-d80d-2cc6-254e-e5311f04224f@linux.intel.com
Arnd Bergmann [Thu, 10 Jan 2019 14:14:03 +0000 (15:14 +0100)]
drm/msm: avoid unused function warning
drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c:368:13: error: 'dpu_plane_danger_signal_ctrl' defined but not used [-Werror=unused-function]
Fixes: 7b2e7adea732 ("drm/msm/dpu: Make dpu_plane_danger_signal_ctrl void")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Joe Perches [Thu, 17 Jan 2019 22:17:36 +0000 (14:17 -0800)]
drm/msm: Add __printf verification
Add a few __printf attribute specifiers to routines that
could use them.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Douglas Anderson [Wed, 16 Jan 2019 18:46:21 +0000 (10:46 -0800)]
drm/msm: Fix A6XX support for opp-level
The bindings for Qualcomm opp levels changed after being Acked but
before landing. Thus the code in the GPU driver that was relying on
the old bindings is now broken.
Let's change the code to match the new bindings by adjusting the old
string 'qcom,level' to the new string 'opp-level'. See the patch
("dt-bindings: opp: Introduce opp-level bindings").
NOTE: we will do additional cleanup to totally remove the string from
the code and use the new dev_pm_opp_get_level() but we'll do it in a
future patch. This will facilitate getting the important code fix in
sooner without having to deal with cross-maintainer dependencies.
This patch needs to land before the patch ("arm64: dts: sdm845: Add
gpu and gmu device nodes") since if a tree contains the device tree
patch but not this one you'll get a crash at bootup.
Fixes: 4b565ca5a2cb ("drm/msm: Add A6XX device support")
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Wed, 9 Jan 2019 19:25:05 +0000 (14:25 -0500)]
drm/msm: honor GPU_READONLY flag
Signed-off-by: Rob Clark <robdclark@gmail.com>