Trond Myklebust [Tue, 7 Nov 2017 15:51:37 +0000 (10:51 -0500)]
NFSv4: Convert CLOSE to use nfs4_async_handle_exception()
Convert CLOSE so that it specifies the correct stateid, state and
inode for the error handling.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Trond Myklebust [Mon, 18 Dec 2017 19:39:13 +0000 (14:39 -0500)]
NFS: Add a cond_resched() to nfs_commit_release_pages()
The commit list can get very large, and so we need a cond_resched()
in nfs_commit_release_pages() in order to ensure we don't hog the CPU
for excessive periods of time.
Reported-by: Mike Galbraith <efault@gmx.de>
Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Linus Torvalds [Sun, 14 Jan 2018 23:32:30 +0000 (15:32 -0800)]
Linux 4.15-rc8
Linus Torvalds [Sun, 14 Jan 2018 23:30:02 +0000 (15:30 -0800)]
Merge branch 'x86-pti-for-linus' of git://git./linux/kernel/git/tip/tip
Pull x86 fixlet from Thomas Gleixner.
Remove a warning about lack of compiler support for retpoline that most
people can't do anything about, so it just annoys them needlessly.
* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/retpoline: Remove compile time warning
Linus Torvalds [Sun, 14 Jan 2018 23:03:17 +0000 (15:03 -0800)]
Merge tag 'powerpc-4.15-7' of git://git./linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"One fix for an oops at boot if we take a hotplug interrupt before we
are ready to handle it.
The bulk is patches to implement mitigation for Meltdown, see the
change logs for more details.
Thanks to: Nicholas Piggin, Michael Neuling, Oliver O'Halloran, Jon
Masters, Jose Ricardo Ziviani, David Gibson"
* tag 'powerpc-4.15-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/powernv: Check device-tree for RFI flush settings
powerpc/pseries: Query hypervisor for RFI flush settings
powerpc/64s: Support disabling RFI flush with no_rfi_flush and nopti
powerpc/64s: Add support for RFI flush of L1-D cache
powerpc/64s: Convert slb_miss_common to use RFI_TO_USER/KERNEL
powerpc/64: Convert fast_exception_return to use RFI_TO_USER/KERNEL
powerpc/64: Convert the syscall exit path to use RFI_TO_USER/KERNEL
powerpc/64s: Simple RFI macro conversions
powerpc/64: Add macros for annotating the destination of rfid/hrfid
powerpc/pseries: Add H_GET_CPU_CHARACTERISTICS flags & wrapper
powerpc/pseries: Make RAS IRQ explicitly dependent on DLPAR WQ
Thomas Gleixner [Sun, 14 Jan 2018 21:13:29 +0000 (22:13 +0100)]
x86/retpoline: Remove compile time warning
Remove the compile time warning when CONFIG_RETPOLINE=y and the compiler
does not have retpoline support. Linus rationale for this is:
It's wrong because it will just make people turn off RETPOLINE, and the
asm updates - and return stack clearing - that are independent of the
compiler are likely the most important parts because they are likely the
ones easiest to target.
And it's annoying because most people won't be able to do anything about
it. The number of people building their own compiler? Very small. So if
their distro hasn't got a compiler yet (and pretty much nobody does), the
warning is just annoying crap.
It is already properly reported as part of the sysfs interface. The
compile-time warning only encourages bad things.
Fixes: 76b043848fd2 ("x86/retpoline: Add initial retpoline support")
Requested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: thomas.lendacky@amd.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Kees Cook <keescook@google.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Link: https://lkml.kernel.org/r/CA+55aFzWgquv4i6Mab6bASqYXg3ErV3XDFEYf=GEcCDQg5uAtw@mail.gmail.com
Linus Torvalds [Sun, 14 Jan 2018 18:22:45 +0000 (10:22 -0800)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull NVMe fix from Jens Axboe:
"Just a single fix for nvme over fabrics that should go into 4.15"
* 'for-linus' of git://git.kernel.dk/linux-block:
nvme-fabrics: initialize default host->id in nvmf_host_default()
Linus Torvalds [Sun, 14 Jan 2018 17:51:25 +0000 (09:51 -0800)]
Merge branch 'x86-pti-for-linus' of git://git./linux/kernel/git/tip/tip
Pull x86 pti updates from Thomas Gleixner:
"This contains:
- a PTI bugfix to avoid setting reserved CR3 bits when PCID is
disabled. This seems to cause issues on a virtual machine at least
and is incorrect according to the AMD manual.
- a PTI bugfix which disables the perf BTS facility if PTI is
enabled. The BTS AUX buffer is not globally visible and causes the
CPU to fault when the mapping disappears on switching CR3 to user
space. A full fix which restores BTS on PTI is non trivial and will
be worked on.
- PTI bugfixes for EFI and trusted boot which make sure that the user
space visible page table entries have the NX bit cleared
- removal of dead code in the PTI pagetable setup functions
- add PTI documentation
- add a selftest for vsyscall to verify that the kernel actually
implements what it advertises.
- a sysfs interface to expose vulnerability and mitigation
information so there is a coherent way for users to retrieve the
status.
- the initial spectre_v2 mitigations, aka retpoline:
+ The necessary ASM thunk and compiler support
+ The ASM variants of retpoline and the conversion of affected ASM
code
+ Make LFENCE serializing on AMD so it can be used as speculation
trap
+ The RSB fill after vmexit
- initial objtool support for retpoline
As I said in the status mail this is the most of the set of patches
which should go into 4.15 except two straight forward patches still on
hold:
- the retpoline add on of LFENCE which waits for ACKs
- the RSB fill after context switch
Both should be ready to go early next week and with that we'll have
covered the major holes of spectre_v2 and go back to normality"
* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (28 commits)
x86,perf: Disable intel_bts when PTI
security/Kconfig: Correct the Documentation reference for PTI
x86/pti: Fix !PCID and sanitize defines
selftests/x86: Add test_vsyscall
x86/retpoline: Fill return stack buffer on vmexit
x86/retpoline/irq32: Convert assembler indirect jumps
x86/retpoline/checksum32: Convert assembler indirect jumps
x86/retpoline/xen: Convert Xen hypercall indirect jumps
x86/retpoline/hyperv: Convert assembler indirect jumps
x86/retpoline/ftrace: Convert ftrace assembler indirect jumps
x86/retpoline/entry: Convert entry assembler indirect jumps
x86/retpoline/crypto: Convert crypto assembler indirect jumps
x86/spectre: Add boot time option to select Spectre v2 mitigation
x86/retpoline: Add initial retpoline support
objtool: Allow alternatives to be ignored
objtool: Detect jumps to retpoline thunks
x86/pti: Make unpoison of pgd for trusted boot work for real
x86/alternatives: Fix optimize_nops() checking
sysfs/cpu: Fix typos in vulnerability documentation
x86/cpu/AMD: Use LFENCE_RDTSC in preference to MFENCE_RDTSC
...
Peter Zijlstra [Sun, 14 Jan 2018 10:27:13 +0000 (11:27 +0100)]
x86,perf: Disable intel_bts when PTI
The intel_bts driver does not use the 'normal' BTS buffer which is exposed
through the cpu_entry_area but instead uses the memory allocated for the
perf AUX buffer.
This obviously comes apart when using PTI because then the kernel mapping;
which includes that AUX buffer memory; disappears. Fixing this requires to
expose a mapping which is visible in all context and that's not trivial.
As a quick fix disable this driver when PTI is enabled to prevent
malfunction.
Fixes: 385ce0ea4c07 ("x86/mm/pti: Add Kconfig")
Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Reported-by: Robert Święcki <robert@swiecki.net>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: greg@kroah.com
Cc: hughd@google.com
Cc: luto@amacapital.net
Cc: Vince Weaver <vince@deater.net>
Cc: torvalds@linux-foundation.org
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180114102713.GB6166@worktop.programming.kicks-ass.net
W. Trevor King [Fri, 12 Jan 2018 23:24:59 +0000 (15:24 -0800)]
security/Kconfig: Correct the Documentation reference for PTI
When the config option for PTI was added a reference to documentation was
added as well. But the documentation did not exist at that point. The final
documentation has a different file name.
Fix it up to point to the proper file.
Fixes: 385ce0ea ("x86/mm/pti: Add Kconfig")
Signed-off-by: W. Trevor King <wking@tremily.us>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: linux-mm@kvack.org
Cc: linux-security-module@vger.kernel.org
Cc: James Morris <james.l.morris@oracle.com>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/3009cc8ccbddcd897ec1e0cb6dda524929de0d14.1515799398.git.wking@tremily.us
Thomas Gleixner [Sat, 13 Jan 2018 23:23:57 +0000 (00:23 +0100)]
x86/pti: Fix !PCID and sanitize defines
The switch to the user space page tables in the low level ASM code sets
unconditionally bit 12 and bit 11 of CR3. Bit 12 is switching the base
address of the page directory to the user part, bit 11 is switching the
PCID to the PCID associated with the user page tables.
This fails on a machine which lacks PCID support because bit 11 is set in
CR3. Bit 11 is reserved when PCID is inactive.
While the Intel SDM claims that the reserved bits are ignored when PCID is
disabled, the AMD APM states that they should be cleared.
This went unnoticed as the AMD APM was not checked when the code was
developed and reviewed and test systems with Intel CPUs never failed to
boot. The report is against a Centos 6 host where the guest fails to boot,
so it's not yet clear whether this is a virt issue or can happen on real
hardware too, but thats irrelevant as the AMD APM clearly ask for clearing
the reserved bits.
Make sure that on non PCID machines bit 11 is not set by the page table
switching code.
Andy suggested to rename the related bits and masks so they are clearly
describing what they should be used for, which is done as well for clarity.
That split could have been done with alternatives but the macro hell is
horrible and ugly. This can be done on top if someone cares to remove the
extra orq. For now it's a straight forward fix.
Fixes: 6fd166aae78c ("x86/mm: Use/Fix PCID to optimize user/kernel switches")
Reported-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable <stable@vger.kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Willy Tarreau <w@1wt.eu>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801140009150.2371@nanos
Linus Torvalds [Sat, 13 Jan 2018 22:10:32 +0000 (14:10 -0800)]
Merge tag 'usb-4.15-rc8' of git://git./linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are some small USB fixes and device ids for 4.15-rc8
Nothing major, small fixes for various devices, some resolutions for
bugs found by fuzzers, and the usual handful of new device ids.
All of these have been in linux-next with no reported issues"
* tag 'usb-4.15-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
Documentation: usb: fix typo in UVC gadgetfs config command
usb: misc: usb3503: make sure reset is low for at least 100us
uas: ignore UAS for Norelsys NS1068(X) chips
USB: UDC core: fix double-free in usb_add_gadget_udc_release
USB: fix usbmon BUG trigger
usbip: vudc_tx: fix v_send_ret_submit() vulnerability to null xfer buffer
usbip: remove kernel addresses from usb device and urb debug msgs
usbip: fix vudc_rx: harden CMD_SUBMIT path to handle malicious input
USB: serial: cp210x: add new device ID ELV ALC 8xxx
USB: serial: cp210x: add IDs for LifeScan OneTouch Verio IQ
Linus Torvalds [Sat, 13 Jan 2018 22:04:06 +0000 (14:04 -0800)]
Merge tag 'staging-4.15-rc8' of git://git./linux/kernel/git/gregkh/staging
Pull staging driver fix from Greg KH:
"Here is a single android ashmem bugfix that resolves a reported issue
in that interface. It's been in linux-next this week with no reported
issues"
* tag 'staging-4.15-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
staging: android: ashmem: fix a race condition in ASHMEM_SET_SIZE ioctl
Linus Torvalds [Sat, 13 Jan 2018 22:01:59 +0000 (14:01 -0800)]
Merge tag 'char-misc-4.15-rc8' of git://git./linux/kernel/git/gregkh/char-misc
Pull char/misc fixes from Greg KH:
"Here are two bugfixes for some driver bugs for 4.15-rc8
The first is a bluetooth security bug that has been ignored by the
Bluetooth developers for months for no obvious reason at all, so I've
taken it through my tree.
The second is a simple double-free bug in the mux subsystem.
Both have been in linux-next for a while with no reported issues"
* tag 'char-misc-4.15-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
mux: core: fix double get_device()
Bluetooth: Prevent stack info leak from the EFS element.
Linus Torvalds [Sat, 13 Jan 2018 21:24:56 +0000 (13:24 -0800)]
Merge tag 'kbuild-fixes-v4.15' of git://git./linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild fixes from Masahiro Yamada:
- fix cross-compilation for architectures that setup CROSS_COMPILE in
their arch Makefile
- fix Kconfig rational operators for bool / tristate
- drop a gperf-generated file from .gitignore
* tag 'kbuild-fixes-v4.15' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
genksyms: drop *.hash.c from .gitignore
kconfig: fix relational operators for bool and tristate symbols
kbuild: move cc-option and cc-disable-warning after incl. arch Makefile
Linus Torvalds [Sat, 13 Jan 2018 21:18:15 +0000 (13:18 -0800)]
Merge tag 'apparmor-pr-2018-01-12' of git://git./linux/kernel/git/jj/linux-apparmor
Pull apparmor regression fixes from John Johansen:
"This fixes a couple bugs I have been working with Matthew Garrett on
this week. Specifically a regression in the handling of a conflicting
profile attachment and label match restrictions for ptrace when
profiles are stacked.
Summary:
- fix ptrace label match when matching stacked labels
- fix regression in profile conflict logic"
* tag 'apparmor-pr-2018-01-12' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor:
apparmor: Fix regression in profile conflict logic
apparmor: fix ptrace label match when matching stacked labels
Linus Torvalds [Sat, 13 Jan 2018 21:14:54 +0000 (13:14 -0800)]
Merge tag 'pci-v4.15-fixes-2' of git://git./linux/kernel/git/helgaas/pci
Pull PCI fixes from Bjorn Helgaas:
"Fix AMD boot regression due to 64-bit window conflicting with system
memory (Christian König)"
* tag 'pci-v4.15-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
x86/PCI: Move and shrink AMD 64-bit window to avoid conflict
x86/PCI: Add "pci=big_root_window" option for AMD 64-bit windows
Linus Torvalds [Sat, 13 Jan 2018 19:07:55 +0000 (11:07 -0800)]
Merge branch 'akpm' (patches from Andrew)
Merge misc fixlets from Andrew Morton:
"4 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
tools/objtool/Makefile: don't assume sync-check.sh is executable
kdump: write correct address of mem_section into vmcoreinfo
kmemleak: allow to coexist with fault injection
MAINTAINERS, nilfs2: change project home URLs
Andrew Morton [Sat, 13 Jan 2018 00:53:17 +0000 (16:53 -0800)]
tools/objtool/Makefile: don't assume sync-check.sh is executable
patch(1) loses the x bit. So if a user follows our patching
instructions in Documentation/admin-guide/README.rst, their kernel will
not compile.
Fixes: 3bd51c5a371de ("objtool: Move kernel headers/code sync check to a script")
Reported-by: Nicolas Bock <nicolasbock@gentoo.org>
Reported-by Joakim Tjernlund <Joakim.Tjernlund@infinera.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kirill A. Shutemov [Sat, 13 Jan 2018 00:53:14 +0000 (16:53 -0800)]
kdump: write correct address of mem_section into vmcoreinfo
Depending on configuration mem_section can now be an array or a pointer
to an array allocated dynamically. In most cases, we can continue to
refer to it as 'mem_section' regardless of what it is.
But there's one exception: '&mem_section' means "address of the array"
if mem_section is an array, but if mem_section is a pointer, it would
mean "address of the pointer".
We've stepped onto this in kdump code. VMCOREINFO_SYMBOL(mem_section)
writes down address of pointer into vmcoreinfo, not array as we wanted.
Let's introduce VMCOREINFO_SYMBOL_ARRAY() that would handle the
situation correctly for both cases.
Link: http://lkml.kernel.org/r/20180112162532.35896-1-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Fixes: 83e3c48729d9 ("mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y")
Acked-by: Baoquan He <bhe@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Dave Young <dyoung@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dmitry Vyukov [Sat, 13 Jan 2018 00:53:10 +0000 (16:53 -0800)]
kmemleak: allow to coexist with fault injection
kmemleak does one slab allocation per user allocation. So if slab fault
injection is enabled to any degree, kmemleak instantly fails to allocate
and turns itself off. However, it's useful to use kmemleak with fault
injection to find leaks on error paths. On the other hand, checking
kmemleak itself is not so useful because (1) it's a debugging tool and
(2) it has a very regular allocation pattern (basically a single
allocation site, so it either works or not).
Turn off fault injection for kmemleak allocations.
Link: http://lkml.kernel.org/r/20180109192243.19316-1-dvyukov@google.com
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Sat, 13 Jan 2018 00:53:07 +0000 (16:53 -0800)]
MAINTAINERS, nilfs2: change project home URLs
The domain of NILFS project home was changed to "nilfs.sourceforge.io"
to enable https access (the previous domain "nilfs.sourceforge.net" is
redirected to the new one). Modify URLs of the project home to reflect
this change and to replace their protocol from http to https.
Link: http://lkml.kernel.org/r/1515416141-5614-1-git-send-email-konishi.ryusuke@lab.ntt.co.jp
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Masahiro Yamada [Thu, 11 Jan 2018 09:28:08 +0000 (18:28 +0900)]
genksyms: drop *.hash.c from .gitignore
This is a left-over of commit
bb3290d91695 ("Remove gperf usage from
toolchain").
We do not generate a hash function any more.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Andy Lutomirski [Fri, 12 Jan 2018 01:16:51 +0000 (17:16 -0800)]
selftests/x86: Add test_vsyscall
This tests that the vsyscall entries do what they're expected to do.
It also confirms that attempts to read the vsyscall page behave as
expected.
If changes are made to the vsyscall code or its memory map handling,
running this test in all three of vsyscall=none, vsyscall=emulate,
and vsyscall=native are helpful.
(Because it's easy, this also compares the vsyscall results to their
vDSO equivalents.)
Note to KAISER backporters: please test this under all three
vsyscall modes. Also, in the emulate and native modes, make sure
that test_vsyscall_64 agrees with the command line or config
option as to which mode you're in. It's quite easy to mess up
the kernel such that native mode accidentally emulates
or vice versa.
Greg, etc: please backport this to all your Meltdown-patched
kernels. It'll help make sure the patches didn't regress
vsyscalls.
CSigned-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/2b9c5a174c1d60fd7774461d518aa75598b1d8fd.1515719552.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Matthew Garrett [Thu, 11 Jan 2018 21:07:54 +0000 (13:07 -0800)]
apparmor: Fix regression in profile conflict logic
The intended behaviour in apparmor profile matching is to flag a
conflict if two profiles match equally well. However, right now a
conflict is generated if another profile has the same match length even
if that profile doesn't actually match. Fix the logic so we only
generate a conflict if the profiles match.
Fixes: 844b8292b631 ("apparmor: ensure that undecidable profile attachments fail")
Cc: Stable <stable@vger.kernel.org>
Signed-off-by: Matthew Garrett <mjg59@google.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
John Johansen [Sat, 9 Dec 2017 01:43:18 +0000 (17:43 -0800)]
apparmor: fix ptrace label match when matching stacked labels
Given a label with a profile stack of
A//&B or A//&C ...
A ptrace rule should be able to specify a generic trace pattern with
a rule like
ptrace trace A//&**,
however this is failing because while the correct label match routine
is called, it is being done post label decomposition so it is always
being done against a profile instead of the stacked label.
To fix this refactor the cross check to pass the full peer label in to
the label_match.
Fixes: 290f458a4f16 ("apparmor: allow ptrace checks to be finer grained than just capability")
Cc: Stable <stable@vger.kernel.org>
Reported-by: Matthew Garrett <mjg59@google.com>
Tested-by: Matthew Garrett <mjg59@google.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Linus Torvalds [Fri, 12 Jan 2018 18:32:11 +0000 (10:32 -0800)]
Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
"Two pending (non-PTI) x86 fixes:
- an Intel-MID crash fix
- and an Intel microcode loader blacklist quirk to avoid a
problematic revision"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/platform/intel-mid: Revert "Make 'bt_sfi_data' const"
x86/microcode/intel: Extend BDW late-loading with a revision check
Linus Torvalds [Fri, 12 Jan 2018 18:23:59 +0000 (10:23 -0800)]
Merge branch 'sched-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull scheduler fixes from Ingo Molnar:
"A Kconfig fix, a build fix and a membarrier bug fix"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
membarrier: Disable preemption when calling smp_call_function_many()
sched/isolation: Make CONFIG_CPU_ISOLATION=y depend on SMP or COMPILE_TEST
ia64, sched/cputime: Fix build error if CONFIG_VIRT_CPU_ACCOUNTING_NATIVE=y
Linus Torvalds [Fri, 12 Jan 2018 18:14:09 +0000 (10:14 -0800)]
Merge branch 'locking-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull locking fixes from Ingo Molnar:
"No functional effects intended: removes leftovers from recent lockdep
and refcounts work"
* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/refcounts: Remove stale comment from the ARCH_HAS_REFCOUNT Kconfig entry
locking/lockdep: Remove cross-release leftovers
locking/Documentation: Remove stale crossrelease_fullstack parameter
Linus Torvalds [Fri, 12 Jan 2018 18:00:15 +0000 (10:00 -0800)]
Merge tag 'for-linus-4.15-rc8-tag' of git://git./linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:
"This contains two build fixes for clang and two fixes for rather
unlikely situations in the Xen gntdev driver"
* tag 'for-linus-4.15-rc8-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/gntdev: Fix partial gntdev_mmap() cleanup
xen/gntdev: Fix off-by-one error when unmapping with holes
x86: xen: remove the use of VLAIS
x86/xen/time: fix section mismatch for xen_init_time_ops()
Linus Torvalds [Fri, 12 Jan 2018 17:56:52 +0000 (09:56 -0800)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm
Pull KVM fixes from Paolo Bonzini:
"PPC:
- user-triggerable use-after-free in HPT resizing
- stale TLB entries in the guest
- trap-and-emulate (PR) KVM guests failing to start under pHyp
x86:
- Another "Spectre" fix.
- async pagefault fix
- Revert an old fix for x86 nested virtualization, which turned out
to do more harm than good
- Check shrinker registration return code, to avoid warnings from
upcoming 4.16 -mm patches"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86: Add memory barrier on vmcs field lookup
KVM: x86: emulate #UD while in guest mode
x86: kvm: propagate register_shrinker return code
KVM MMU: check pending exception before injecting APF
KVM: PPC: Book3S HV: Always flush TLB in kvmppc_alloc_reset_hpt()
KVM: PPC: Book3S PR: Fix WIMG handling under pHyp
KVM: PPC: Book3S HV: Fix use after free in case of multiple resize requests
KVM: PPC: Book3S HV: Drop prepare_done from struct kvm_resize_hpt
Linus Torvalds [Fri, 12 Jan 2018 17:47:58 +0000 (09:47 -0800)]
Merge branch 'linus' of git://git./linux/kernel/git/herbert/crypto-2.6
Pull crypto fix from Herbert Xu:
"This fixes a NULL pointer dereference in crypto_remove_spawns that can
be triggered through af_alg"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: algapi - fix NULL dereference in crypto_remove_spawns()
Jens Axboe [Fri, 12 Jan 2018 17:42:36 +0000 (10:42 -0700)]
Merge branch 'nvme-4.15' of git://git.infradead.org/nvme into for-linus
Pull a single NVMe fix from Christoph for 4.15.
Linus Torvalds [Fri, 12 Jan 2018 17:34:20 +0000 (09:34 -0800)]
Merge tag 'mmc-v4.15-rc2-2' of git://git./linux/kernel/git/ulfh/mmc
Pull MMC host fixes from Ulf Hansson:
- s3mci: mark debug_regs[] as static
- renesas_sdhi: Add MODULE_LICENSE
* tag 'mmc-v4.15-rc2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: s3mci: mark debug_regs[] as static
mmc: renesas_sdhi: Add MODULE_LICENSE
Linus Torvalds [Fri, 12 Jan 2018 17:28:28 +0000 (09:28 -0800)]
Merge tag 'drm-fixes-for-v4.15-rc8' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
- Nouveau: regression fix
- Tegra: regression fix
- vmwgfx: crasher + freed data leak
- i915: KASAN use after free fix, whitelist register to avoid hang fix,
GVT fixes
- vc4: irq/pm fix
* tag 'drm-fixes-for-v4.15-rc8' of git://people.freedesktop.org/~airlied/linux:
drm/i915: Don't adjust priority on an already signaled fence
drm/i915: Whitelist SLICE_COMMON_ECO_CHICKEN1 on Geminilake.
drm/vmwgfx: Potential off by one in vmw_view_add()
drm/tegra: sor: Fix hang on Tegra124 eDP
drm/vmwgfx: Don't cache framebuffer maps
drm/nouveau/disp/gf119: add missing drive vfunc ptr
drm/i915/gvt: Fix stack-out-of-bounds bug in cmd parser
drm/i915/gvt: Clear the shadow page table entry after post-sync
drm/vc4: Move IRQ enable to PM path
David Woodhouse [Fri, 12 Jan 2018 11:11:27 +0000 (11:11 +0000)]
x86/retpoline: Fill return stack buffer on vmexit
In accordance with the Intel and AMD documentation, we need to overwrite
all entries in the RSB on exiting a guest, to prevent malicious branch
target predictions from affecting the host kernel. This is needed both
for retpoline and for IBRS.
[ak: numbers again for the RSB stuffing labels]
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: thomas.lendacky@amd.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Kees Cook <keescook@google.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Cc: Paul Turner <pjt@google.com>
Link: https://lkml.kernel.org/r/1515755487-8524-1-git-send-email-dwmw@amazon.co.uk
Dave Airlie [Fri, 12 Jan 2018 01:48:06 +0000 (11:48 +1000)]
Merge tag 'drm-intel-fixes-2018-01-11-1' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
Hopefully final drm/i915 fixes for v4.15:
- Fix a KASAN reported use after free
- Whitelist a register to avoid hangs
- GVT fixes
* tag 'drm-intel-fixes-2018-01-11-1' of git://anongit.freedesktop.org/drm/drm-intel:
drm/i915: Don't adjust priority on an already signaled fence
drm/i915: Whitelist SLICE_COMMON_ECO_CHICKEN1 on Geminilake.
drm/i915/gvt: Fix stack-out-of-bounds bug in cmd parser
drm/i915/gvt: Clear the shadow page table entry after post-sync
Dave Airlie [Fri, 12 Jan 2018 01:47:40 +0000 (11:47 +1000)]
Merge branch 'vmwgfx-fixes-4.15' of git://people.freedesktop.org/~thomash/linux into drm-fixes
Two important fixes for vmwgfx.
The off-by-one fix could cause a malicious user to potentially crash the
kernel.
The framebuffer map cache fix can under some circumstances enable a user to
read from or write to freed pages.
* 'vmwgfx-fixes-4.15' of git://people.freedesktop.org/~thomash/linux:
drm/vmwgfx: Potential off by one in vmw_view_add()
drm/vmwgfx: Don't cache framebuffer maps
Dave Airlie [Fri, 12 Jan 2018 01:47:11 +0000 (11:47 +1000)]
Merge tag 'drm/tegra/for-4.15-rc8' of git://anongit.freedesktop.org/tegra/linux into drm-fixes
drm/tegra: Fixes for v4.15-rc8
A single fix for a Tegra124 eDP regression introduced by the SOR changes
in v4.15-rc1.
* tag 'drm/tegra/for-4.15-rc8' of git://anongit.freedesktop.org/tegra/linux:
drm/tegra: sor: Fix hang on Tegra124 eDP
Linus Torvalds [Fri, 12 Jan 2018 00:57:32 +0000 (16:57 -0800)]
Merge tag 'ceph-for-4.15-rc8' of git://github.com/ceph/ceph-client
Pull ceph fixes from Ilya Dryomov:
"Two rbd fixes for 4.12 and 4.2 issues respectively, marked for
stable"
* tag 'ceph-for-4.15-rc8' of git://github.com/ceph/ceph-client:
rbd: set max_segments to USHRT_MAX
rbd: reacquire lock should update lock owner client id
Linus Torvalds [Fri, 12 Jan 2018 00:54:35 +0000 (16:54 -0800)]
Merge tag 'gpio-v4.15-4' of git://git./linux/kernel/git/linusw/linux-gpio
Pull GPIO fix from Linus Walleij:
"Fix a raw vs elaborate GPIO descriptor bug introduced by yours truly"
* tag 'gpio-v4.15-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
gpio: Add missing open drain/source handling to gpiod_set_value_cansleep()
Andi Kleen [Thu, 11 Jan 2018 21:46:33 +0000 (21:46 +0000)]
x86/retpoline/irq32: Convert assembler indirect jumps
Convert all indirect jumps in 32bit irq inline asm code to use non
speculative sequences.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: Rik van Riel <riel@redhat.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: thomas.lendacky@amd.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Kees Cook <keescook@google.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Cc: Paul Turner <pjt@google.com>
Link: https://lkml.kernel.org/r/1515707194-20531-12-git-send-email-dwmw@amazon.co.uk
David Woodhouse [Thu, 11 Jan 2018 21:46:32 +0000 (21:46 +0000)]
x86/retpoline/checksum32: Convert assembler indirect jumps
Convert all indirect jumps in 32bit checksum assembler code to use
non-speculative sequences when CONFIG_RETPOLINE is enabled.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: thomas.lendacky@amd.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Kees Cook <keescook@google.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Cc: Paul Turner <pjt@google.com>
Link: https://lkml.kernel.org/r/1515707194-20531-11-git-send-email-dwmw@amazon.co.uk
David Woodhouse [Thu, 11 Jan 2018 21:46:31 +0000 (21:46 +0000)]
x86/retpoline/xen: Convert Xen hypercall indirect jumps
Convert indirect call in Xen hypercall to use non-speculative sequence,
when CONFIG_RETPOLINE is enabled.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Juergen Gross <jgross@suse.com>
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: thomas.lendacky@amd.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Kees Cook <keescook@google.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Cc: Paul Turner <pjt@google.com>
Link: https://lkml.kernel.org/r/1515707194-20531-10-git-send-email-dwmw@amazon.co.uk
David Woodhouse [Thu, 11 Jan 2018 21:46:30 +0000 (21:46 +0000)]
x86/retpoline/hyperv: Convert assembler indirect jumps
Convert all indirect jumps in hyperv inline asm code to use non-speculative
sequences when CONFIG_RETPOLINE is enabled.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: thomas.lendacky@amd.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Kees Cook <keescook@google.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Cc: Paul Turner <pjt@google.com>
Link: https://lkml.kernel.org/r/1515707194-20531-9-git-send-email-dwmw@amazon.co.uk
David Woodhouse [Thu, 11 Jan 2018 21:46:29 +0000 (21:46 +0000)]
x86/retpoline/ftrace: Convert ftrace assembler indirect jumps
Convert all indirect jumps in ftrace assembler code to use non-speculative
sequences when CONFIG_RETPOLINE is enabled.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: thomas.lendacky@amd.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Kees Cook <keescook@google.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Cc: Paul Turner <pjt@google.com>
Link: https://lkml.kernel.org/r/1515707194-20531-8-git-send-email-dwmw@amazon.co.uk
David Woodhouse [Thu, 11 Jan 2018 21:46:28 +0000 (21:46 +0000)]
x86/retpoline/entry: Convert entry assembler indirect jumps
Convert indirect jumps in core 32/64bit entry assembler code to use
non-speculative sequences when CONFIG_RETPOLINE is enabled.
Don't use CALL_NOSPEC in entry_SYSCALL_64_fastpath because the return
address after the 'call' instruction must be *precisely* at the
.Lentry_SYSCALL_64_after_fastpath label for stub_ptregs_64 to work,
and the use of alternatives will mess that up unless we play horrid
games to prepend with NOPs and make the variants the same length. It's
not worth it; in the case where we ALTERNATIVE out the retpoline, the
first instruction at __x86.indirect_thunk.rax is going to be a bare
jmp *%rax anyway.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: thomas.lendacky@amd.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Kees Cook <keescook@google.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Cc: Paul Turner <pjt@google.com>
Link: https://lkml.kernel.org/r/1515707194-20531-7-git-send-email-dwmw@amazon.co.uk
David Woodhouse [Thu, 11 Jan 2018 21:46:27 +0000 (21:46 +0000)]
x86/retpoline/crypto: Convert crypto assembler indirect jumps
Convert all indirect jumps in crypto assembler code to use non-speculative
sequences when CONFIG_RETPOLINE is enabled.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: thomas.lendacky@amd.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Kees Cook <keescook@google.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Cc: Paul Turner <pjt@google.com>
Link: https://lkml.kernel.org/r/1515707194-20531-6-git-send-email-dwmw@amazon.co.uk
David Woodhouse [Thu, 11 Jan 2018 21:46:26 +0000 (21:46 +0000)]
x86/spectre: Add boot time option to select Spectre v2 mitigation
Add a spectre_v2= option to select the mitigation used for the indirect
branch speculation vulnerability.
Currently, the only option available is retpoline, in its various forms.
This will be expanded to cover the new IBRS/IBPB microcode features.
The RETPOLINE_AMD feature relies on a serializing LFENCE for speculation
control. For AMD hardware, only set RETPOLINE_AMD if LFENCE is a
serializing instruction, which is indicated by the LFENCE_RDTSC feature.
[ tglx: Folded back the LFENCE/AMD fixes and reworked it so IBRS
integration becomes simple ]
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: thomas.lendacky@amd.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Kees Cook <keescook@google.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Cc: Paul Turner <pjt@google.com>
Link: https://lkml.kernel.org/r/1515707194-20531-5-git-send-email-dwmw@amazon.co.uk
David Woodhouse [Thu, 11 Jan 2018 21:46:25 +0000 (21:46 +0000)]
x86/retpoline: Add initial retpoline support
Enable the use of -mindirect-branch=thunk-extern in newer GCC, and provide
the corresponding thunks. Provide assembler macros for invoking the thunks
in the same way that GCC does, from native and inline assembler.
This adds X86_FEATURE_RETPOLINE and sets it by default on all CPUs. In
some circumstances, IBRS microcode features may be used instead, and the
retpoline can be disabled.
On AMD CPUs if lfence is serialising, the retpoline can be dramatically
simplified to a simple "lfence; jmp *\reg". A future patch, after it has
been verified that lfence really is serialising in all circumstances, can
enable this by setting the X86_FEATURE_RETPOLINE_AMD feature bit in addition
to X86_FEATURE_RETPOLINE.
Do not align the retpoline in the altinstr section, because there is no
guarantee that it stays aligned when it's copied over the oldinstr during
alternative patching.
[ Andi Kleen: Rename the macros, add CONFIG_RETPOLINE option, export thunks]
[ tglx: Put actual function CALL/JMP in front of the macros, convert to
symbolic labels ]
[ dwmw2: Convert back to numeric labels, merge objtool fixes ]
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: thomas.lendacky@amd.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Kees Cook <keescook@google.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Cc: Paul Turner <pjt@google.com>
Link: https://lkml.kernel.org/r/1515707194-20531-4-git-send-email-dwmw@amazon.co.uk
Josh Poimboeuf [Thu, 11 Jan 2018 21:46:24 +0000 (21:46 +0000)]
objtool: Allow alternatives to be ignored
Getting objtool to understand retpolines is going to be a bit of a
challenge. For now, take advantage of the fact that retpolines are
patched in with alternatives. Just read the original (sane)
non-alternative instruction, and ignore the patched-in retpoline.
This allows objtool to understand the control flow *around* the
retpoline, even if it can't yet follow what's inside. This means the
ORC unwinder will fail to unwind from inside a retpoline, but will work
fine otherwise.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: thomas.lendacky@amd.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Kees Cook <keescook@google.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Cc: Paul Turner <pjt@google.com>
Link: https://lkml.kernel.org/r/1515707194-20531-3-git-send-email-dwmw@amazon.co.uk
Josh Poimboeuf [Thu, 11 Jan 2018 21:46:23 +0000 (21:46 +0000)]
objtool: Detect jumps to retpoline thunks
A direct jump to a retpoline thunk is really an indirect jump in
disguise. Change the objtool instruction type accordingly.
Objtool needs to know where indirect branches are so it can detect
switch statement jump tables.
This fixes a bunch of warnings with CONFIG_RETPOLINE like:
arch/x86/events/intel/uncore_nhmex.o: warning: objtool: nhmex_rbox_msr_enable_event()+0x44: sibling call from callable instruction with modified stack frame
kernel/signal.o: warning: objtool: copy_siginfo_to_user()+0x91: sibling call from callable instruction with modified stack frame
...
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: thomas.lendacky@amd.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Kees Cook <keescook@google.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Cc: Paul Turner <pjt@google.com>
Link: https://lkml.kernel.org/r/1515707194-20531-2-git-send-email-dwmw@amazon.co.uk
Dave Hansen [Wed, 10 Jan 2018 22:49:39 +0000 (14:49 -0800)]
x86/pti: Make unpoison of pgd for trusted boot work for real
The inital fix for trusted boot and PTI potentially misses the pgd clearing
if pud_alloc() sets a PGD. It probably works in *practice* because for two
adjacent calls to map_tboot_page() that share a PGD entry, the first will
clear NX, *then* allocate and set the PGD (without NX clear). The second
call will *not* allocate but will clear the NX bit.
Defer the NX clearing to a point after it is known that all top-level
allocations have occurred. Add a comment to clarify why.
[ tglx: Massaged changelog ]
Fixes: 262b6b30087 ("x86/tboot: Unbreak tboot with PTI enabled")
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Jon Masters <jcm@redhat.com>
Cc: "Tim Chen" <tim.c.chen@linux.intel.com>
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: peterz@infradead.org
Cc: ning.sun@intel.com
Cc: tboot-devel@lists.sourceforge.net
Cc: andi@firstfloor.org
Cc: luto@kernel.org
Cc: law@redhat.com
Cc: pbonzini@redhat.com
Cc: torvalds@linux-foundation.org
Cc: gregkh@linux-foundation.org
Cc: dwmw@amazon.co.uk
Cc: nickc@redhat.com
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180110224939.2695CD47@viggo.jf.intel.com
=?UTF-8?q?Christian=20K=C3=B6nig?= [Thu, 11 Jan 2018 13:23:30 +0000 (14:23 +0100)]
x86/PCI: Move and shrink AMD 64-bit window to avoid conflict
Avoid problems with BIOS implementations which don't report all used
resources to the OS by only allocating a 256GB window directly below the
hardware limit (from the BKDG, sec 2.4.6).
Fixes a silent reboot loop reported by Aaro Koskinen <aaro.koskinen@iki.fi>
on an AMD-based MSI MS-7699/760GA-P43(FX) system. This was apparently
caused by RAM or other unreported hardware that conflicted with the new
window.
Link: https://support.amd.com/TechDocs/49125_15h_Models_30h-3Fh_BKDG.pdf
Link: https://lkml.kernel.org/r/20180105220412.fzpwqe4zljdawr36@darkstar.musicnaut.iki.fi
Fixes: fa564ad96366 ("x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 00-1f, 30-3f, 60-7f)")
Reported-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Signed-off-by: Christian König <christian.koenig@amd.com>
[bhelgaas: changelog, comment, Fixes:]
Signed-off-by: Bjorn Helgaas <helgaas@kernel.org>
Bin Liu [Tue, 9 Jan 2018 19:27:17 +0000 (13:27 -0600)]
Documentation: usb: fix typo in UVC gadgetfs config command
This seems to be a copy&paste error. With the fix the uvc gadget now can
be created by following the instrucitons.
Signed-off-by: Bin Liu <b-liu@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Stefan Agner [Thu, 11 Jan 2018 13:47:40 +0000 (14:47 +0100)]
usb: misc: usb3503: make sure reset is low for at least 100us
When using a GPIO which is high by default, and initialize the
driver in USB Hub mode, initialization fails with:
[ 111.757794] usb3503 0-0008: SP_ILOCK failed (-5)
The reason seems to be that the chip is not properly reset.
Probe does initialize reset low, however some lines later the
code already set it back high, which is not long enouth.
Make sure reset is asserted for at least 100us by inserting a
delay after initializing the reset pin during probe.
Signed-off-by: Stefan Agner <stefan@agner.ch>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
=?UTF-8?q?Christian=20K=C3=B6nig?= [Thu, 11 Jan 2018 13:23:29 +0000 (14:23 +0100)]
x86/PCI: Add "pci=big_root_window" option for AMD 64-bit windows
Only try to enable a 64-bit window on AMD CPUs when "pci=big_root_window"
is specified.
This taints the kernel because the new 64-bit window uses address space we
don't know anything about, and it may contain unreported devices or memory
that would conflict with the window.
The pci_amd_enable_64bit_bar() quirk that enables the window is specific to
AMD CPUs. The generic solution would be to have the firmware enable the
window and describe it in the host bridge's _CRS method, or at least
describe it in the _PRS method so the OS would have the option of enabling
it.
Signed-off-by: Christian König <christian.koenig@amd.com>
[bhelgaas: changelog, extend doc, mention taint in dmesg]
Signed-off-by: Bjorn Helgaas <helgaas@kernel.org>
Paolo Bonzini [Thu, 11 Jan 2018 17:20:48 +0000 (18:20 +0100)]
Merge branch 'kvm-insert-lfence' into kvm-master
Topic branch for CVE-2017-5753, avoiding conflicts in the next merge window.
Andrew Honig [Wed, 10 Jan 2018 18:12:03 +0000 (10:12 -0800)]
KVM: x86: Add memory barrier on vmcs field lookup
This adds a memory barrier when performing a lookup into
the vmcs_field_to_offset_table. This is related to
CVE-2017-5753.
Signed-off-by: Andrew Honig <ahonig@google.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Greg Kroah-Hartman [Thu, 11 Jan 2018 16:40:16 +0000 (17:40 +0100)]
Merge tag 'usb-serial-4.15-rc8' of https://git./linux/kernel/git/johan/usb-serial into usb-linus
Johan writes:
USB-serial fixes for v4.15-rc8
Here are a couple of new device ids for cp210x.
Both have been in linux-next with no reported issues.
Signed-off-by: Johan Hovold <johan@kernel.org>
Paolo Bonzini [Thu, 11 Jan 2018 15:55:24 +0000 (16:55 +0100)]
KVM: x86: emulate #UD while in guest mode
This reverts commits
ae1f57670703656cc9f293722c3b8b6782f8ab3f
and
ac9b305caa0df6f5b75d294e4b86c1027648991e.
If the hardware doesn't support MOVBE, but L0 sets CPUID.01H:ECX.MOVBE
in L1's emulated CPUID information, then L1 is likely to pass that
CPUID bit through to L2. L2 will expect MOVBE to work, but if L1
doesn't intercept #UD, then any MOVBE instruction executed in L2 will
raise #UD, and the exception will be delivered in L2.
Commit
ac9b305caa0df6f5b75d294e4b86c1027648991e is a better and more
complete version of
ae1f57670703 ("KVM: nVMX: Do not emulate #UD while
in guest mode"); however, neither considers the above case.
Suggested-by: Jim Mattson <jmattson@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Arnd Bergmann [Wed, 10 Jan 2018 16:26:59 +0000 (17:26 +0100)]
x86: kvm: propagate register_shrinker return code
Patch "mm,vmscan: mark register_shrinker() as __must_check" is
queued for 4.16 in linux-mm and adds a warning about the unchecked
call to register_shrinker:
arch/x86/kvm/mmu.c:5485:2: warning: ignoring return value of 'register_shrinker', declared with attribute warn_unused_result [-Wunused-result]
This changes the kvm_mmu_module_init() function to fail itself
when the call to register_shrinker fails.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Thu, 11 Jan 2018 13:07:27 +0000 (14:07 +0100)]
Merge tag 'kvm-ppc-fixes-4.15-3' of git://git./linux/kernel/git/paulus/powerpc into kvm-master
PPC KVM fixes for 4.15
Four commits here, including two that were tagged but never merged.
Three of them are for the HPT resizing code; two of those fix a
user-triggerable use-after-free in the host, and one that fixes
stale TLB entries in the guest. The remaining commit fixes a bug
causing PR KVM guests under PowerVM to fail to start.
Haozhong Zhang [Wed, 10 Jan 2018 13:44:42 +0000 (21:44 +0800)]
KVM MMU: check pending exception before injecting APF
For example, when two APF's for page ready happen after one exit and
the first one becomes pending, the second one will result in #DF.
Instead, just handle the second page fault synchronously.
Reported-by: Ross Zwisler <zwisler@gmail.com>
Message-ID: <CAOxpaSUBf8QoOZQ1p4KfUp0jq76OKfGY4Uxs-Gg8ngReD99xww@mail.gmail.com>
Reported-by: Alec Blayne <ab@tevsa.net>
Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Chris Wilson [Sat, 6 Jan 2018 10:56:18 +0000 (10:56 +0000)]
drm/i915: Don't adjust priority on an already signaled fence
When we retire a signaled fence, we free the dependency tree. However,
we skip clearing the list so that if we then try to adjust the priority
of the signaled fence, we may walk the list of freed dependencies.
[ 3083.156757] ==================================================================
[ 3083.156806] BUG: KASAN: use-after-free in execlists_schedule+0x199/0x660 [i915]
[ 3083.156810] Read of size 8 at addr
ffff8806bf20f400 by task Xorg/831
[ 3083.156815] CPU: 0 PID: 831 Comm: Xorg Not tainted 4.15.0-rc6-no-psn+ #1
[ 3083.156817] Hardware name: Notebook N24_25BU/N24_25BU, BIOS 5.12 02/17/2017
[ 3083.156818] Call Trace:
[ 3083.156823] dump_stack+0x5c/0x7a
[ 3083.156827] print_address_description+0x6b/0x290
[ 3083.156830] kasan_report+0x28f/0x380
[ 3083.156872] ? execlists_schedule+0x199/0x660 [i915]
[ 3083.156914] execlists_schedule+0x199/0x660 [i915]
[ 3083.156956] ? intel_crtc_atomic_check+0x146/0x4e0 [i915]
[ 3083.156997] ? execlists_submit_request+0xe0/0xe0 [i915]
[ 3083.157038] ? i915_vma_misplaced.part.4+0x25/0xb0 [i915]
[ 3083.157079] ? __i915_vma_do_pin+0x7c8/0xc80 [i915]
[ 3083.157121] ? intel_atomic_state_alloc+0x44/0x60 [i915]
[ 3083.157130] ? drm_atomic_helper_page_flip+0x3e/0xb0 [drm_kms_helper]
[ 3083.157145] ? drm_mode_page_flip_ioctl+0x7d2/0x850 [drm]
[ 3083.157159] ? drm_ioctl_kernel+0xa7/0xf0 [drm]
[ 3083.157172] ? drm_ioctl+0x45b/0x560 [drm]
[ 3083.157211] i915_gem_object_wait_priority+0x14c/0x2c0 [i915]
[ 3083.157251] ? i915_gem_get_aperture_ioctl+0x150/0x150 [i915]
[ 3083.157290] ? i915_vma_pin_fence+0x1d8/0x320 [i915]
[ 3083.157331] ? intel_pin_and_fence_fb_obj+0x175/0x250 [i915]
[ 3083.157372] ? intel_rotation_info_size+0x60/0x60 [i915]
[ 3083.157413] ? intel_link_compute_m_n+0x80/0x80 [i915]
[ 3083.157428] ? drm_dev_printk+0x1b0/0x1b0 [drm]
[ 3083.157443] ? drm_dev_printk+0x1b0/0x1b0 [drm]
[ 3083.157485] intel_prepare_plane_fb+0x2f8/0x5a0 [i915]
[ 3083.157527] ? intel_crtc_get_vblank_counter+0x80/0x80 [i915]
[ 3083.157536] drm_atomic_helper_prepare_planes+0xa0/0x1c0 [drm_kms_helper]
[ 3083.157587] intel_atomic_commit+0x12e/0x4e0 [i915]
[ 3083.157605] drm_atomic_helper_page_flip+0xa2/0xb0 [drm_kms_helper]
[ 3083.157621] drm_mode_page_flip_ioctl+0x7d2/0x850 [drm]
[ 3083.157638] ? drm_mode_cursor2_ioctl+0x10/0x10 [drm]
[ 3083.157652] ? drm_lease_owner+0x1a/0x30 [drm]
[ 3083.157668] ? drm_mode_cursor2_ioctl+0x10/0x10 [drm]
[ 3083.157681] drm_ioctl_kernel+0xa7/0xf0 [drm]
[ 3083.157696] drm_ioctl+0x45b/0x560 [drm]
[ 3083.157711] ? drm_mode_cursor2_ioctl+0x10/0x10 [drm]
[ 3083.157725] ? drm_getstats+0x20/0x20 [drm]
[ 3083.157729] ? timerqueue_del+0x49/0x80
[ 3083.157732] ? __remove_hrtimer+0x62/0xb0
[ 3083.157735] ? hrtimer_try_to_cancel+0x173/0x210
[ 3083.157738] do_vfs_ioctl+0x13b/0x880
[ 3083.157741] ? ioctl_preallocate+0x140/0x140
[ 3083.157744] ? _raw_spin_unlock_irq+0xe/0x30
[ 3083.157746] ? do_setitimer+0x234/0x370
[ 3083.157750] ? SyS_setitimer+0x19e/0x1b0
[ 3083.157752] ? SyS_alarm+0x140/0x140
[ 3083.157755] ? __rcu_read_unlock+0x66/0x80
[ 3083.157757] ? __fget+0xc4/0x100
[ 3083.157760] SyS_ioctl+0x74/0x80
[ 3083.157763] entry_SYSCALL_64_fastpath+0x1a/0x7d
[ 3083.157765] RIP: 0033:0x7f6135d0c6a7
[ 3083.157767] RSP: 002b:
00007fff01451888 EFLAGS:
00003246 ORIG_RAX:
0000000000000010
[ 3083.157769] RAX:
ffffffffffffffda RBX:
0000000000000004 RCX:
00007f6135d0c6a7
[ 3083.157771] RDX:
00007fff01451950 RSI:
00000000c01864b0 RDI:
000000000000000c
[ 3083.157772] RBP:
00007f613076f600 R08:
0000000000000001 R09:
0000000000000000
[ 3083.157773] R10:
0000000000000060 R11:
0000000000003246 R12:
0000000000000000
[ 3083.157774] R13:
0000000000000060 R14:
000000000000001b R15:
0000000000000060
[ 3083.157779] Allocated by task 831:
[ 3083.157783] kmem_cache_alloc+0xc0/0x200
[ 3083.157822] i915_gem_request_await_dma_fence+0x2c4/0x5d0 [i915]
[ 3083.157861] i915_gem_request_await_object+0x321/0x370 [i915]
[ 3083.157900] i915_gem_do_execbuffer+0x1165/0x19c0 [i915]
[ 3083.157937] i915_gem_execbuffer2+0x1ad/0x550 [i915]
[ 3083.157950] drm_ioctl_kernel+0xa7/0xf0 [drm]
[ 3083.157962] drm_ioctl+0x45b/0x560 [drm]
[ 3083.157964] do_vfs_ioctl+0x13b/0x880
[ 3083.157966] SyS_ioctl+0x74/0x80
[ 3083.157968] entry_SYSCALL_64_fastpath+0x1a/0x7d
[ 3083.157971] Freed by task 831:
[ 3083.157973] kmem_cache_free+0x77/0x220
[ 3083.158012] i915_gem_request_retire+0x72c/0xa70 [i915]
[ 3083.158051] i915_gem_request_alloc+0x1e9/0x8b0 [i915]
[ 3083.158089] i915_gem_do_execbuffer+0xa96/0x19c0 [i915]
[ 3083.158127] i915_gem_execbuffer2+0x1ad/0x550 [i915]
[ 3083.158140] drm_ioctl_kernel+0xa7/0xf0 [drm]
[ 3083.158153] drm_ioctl+0x45b/0x560 [drm]
[ 3083.158155] do_vfs_ioctl+0x13b/0x880
[ 3083.158156] SyS_ioctl+0x74/0x80
[ 3083.158158] entry_SYSCALL_64_fastpath+0x1a/0x7d
[ 3083.158162] The buggy address belongs to the object at
ffff8806bf20f400
which belongs to the cache i915_dependency of size 64
[ 3083.158166] The buggy address is located 0 bytes inside of
64-byte region [
ffff8806bf20f400,
ffff8806bf20f440)
[ 3083.158168] The buggy address belongs to the page:
[ 3083.158171] page:
00000000d43decc4 count:1 mapcount:0 mapping: (null) index:0x0
[ 3083.158174] flags: 0x17ffe0000000100(slab)
[ 3083.158179] raw:
017ffe0000000100 0000000000000000 0000000000000000 0000000180200020
[ 3083.158182] raw:
ffffea001afc16c0 0000000500000005 ffff880731b881c0 0000000000000000
[ 3083.158184] page dumped because: kasan: bad access detected
[ 3083.158187] Memory state around the buggy address:
[ 3083.158190]
ffff8806bf20f300: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
[ 3083.158192]
ffff8806bf20f380: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
[ 3083.158195] >
ffff8806bf20f400: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
[ 3083.158196] ^
[ 3083.158199]
ffff8806bf20f480: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
[ 3083.158201]
ffff8806bf20f500: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
[ 3083.158203] ==================================================================
Reported-by: Alexandru Chirvasitu <achirvasub@gmail.com>
Reported-by: Mike Keehan <mike@keehan.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104436
Fixes: 1f181225f8ec ("drm/i915/execlists: Keep request->priority for its lifetime")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Alexandru Chirvasitu <achirvasub@gmail.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Tested-by: Alexandru Chirvasitu <achirvasub@gmail.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180106105618.13532-1-chris@chris-wilson.co.uk
(cherry picked from commit
c218ee03b9315073ce43992792554dafa0626eb8)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Kenneth Graunke [Fri, 5 Jan 2018 08:59:05 +0000 (00:59 -0800)]
drm/i915: Whitelist SLICE_COMMON_ECO_CHICKEN1 on Geminilake.
Geminilake requires the 3D driver to select whether barriers are
intended for compute shaders, or tessellation control shaders, by
whacking a "Barrier Mode" bit in SLICE_COMMON_ECO_CHICKEN1 when
switching pipelines. Failure to do this properly can result in GPU
hangs.
Unfortunately, this means it needs to switch mid-batch, so only
userspace can properly set it. To facilitate this, the kernel needs
to whitelist the register.
The workarounds page currently tags this as applying to Broxton only,
but that doesn't make sense. The documentation for the register it
references says the bit userspace is supposed to toggle only exists on
Geminilake. Empirically, the Mesa patch to toggle this bit appears to
fix intermittent GPU hangs in tessellation control shader barrier tests
on Geminilake; we haven't seen those hangs on Broxton.
v2: Mention WA #0862 in the comment (it doesn't have a name).
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180105085905.9298-1-kenneth@whitecape.org
(cherry picked from commit
ab062639edb0412daf6de540725276b9a5d217f9)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Linus Torvalds [Thu, 11 Jan 2018 01:55:42 +0000 (17:55 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/viro/vfs
Pull vfs regression fix from Al Viro/
Fix a leak in socket() introduced by commit
8e1611e23579 ("make
sock_alloc_file() do sock_release() on failures").
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
Fix a leak in socket(2) when we fail to allocate a file descriptor.
Linus Torvalds [Thu, 11 Jan 2018 01:53:18 +0000 (17:53 -0800)]
Merge git://git./linux/kernel/git/davem/net
Pull networking fixes from David Miller:
1) BPF speculation prevention and BPF_JIT_ALWAYS_ON, from Alexei
Starovoitov.
2) Revert dev_get_random_name() changes as adjust the error code
returns seen by userspace definitely breaks stuff.
3) Fix TX DMA map/unmap on older iwlwifi devices, from Emmanuel
Grumbach.
4) From wrong AF family when requesting sock diag modules, from Andrii
Vladyka.
5) Don't add new ipv6 routes attached to the null_entry, from Wei Wang.
6) Some SCTP sockopt length fixes from Marcelo Ricardo Leitner.
7) Don't leak when removing VLAN ID 0, from Cong Wang.
8) Hey there's a potential leak in ipv6_make_skb() too, from Eric
Dumazet.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (27 commits)
ipv6: sr: fix TLVs not being copied using setsockopt
ipv6: fix possible mem leaks in ipv6_make_skb()
mlxsw: spectrum_qdisc: Don't use variable array in mlxsw_sp_tclass_congestion_enable
mlxsw: pci: Wait after reset before accessing HW
nfp: always unmask aux interrupts at init
8021q: fix a memory leak for VLAN 0 device
of_mdio: avoid MDIO bus removal when a PHY is missing
caif_usb: use strlcpy() instead of strncpy()
doc: clarification about setting SO_ZEROCOPY
net: gianfar_ptp: move set_fipers() to spinlock protecting area
sctp: make use of pre-calculated len
sctp: add a ceiling to optlen in some sockopts
sctp: GFP_ATOMIC is not needed in sctp_setsockopt_events
bpf: introduce BPF_JIT_ALWAYS_ON config
bpf: avoid false sharing of map refcount with max_entries
ipv6: remove null_entry before adding default route
SolutionEngine771x: add Ether TSU resource
SolutionEngine771x: fix Ether platform data
docs-rst: networking: wire up msg_zerocopy
net: ipv4: emulate READ_ONCE() on ->hdrincl bit-field in raw_sendmsg()
...
Al Viro [Wed, 10 Jan 2018 23:47:05 +0000 (18:47 -0500)]
Fix a leak in socket(2) when we fail to allocate a file descriptor.
Got broken by "make sock_alloc_file() do sock_release() on failures" -
cleanup after sock_map_fd() failure got pulled all the way into
sock_alloc_file(), but it used to serve the case when sock_map_fd()
failed *before* getting to sock_alloc_file() as well, and that got
lost. Trivial to fix, fortunately.
Fixes: 8e1611e23579 (make sock_alloc_file() do sock_release() on failures)
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Mathieu Xhonneux [Wed, 10 Jan 2018 13:35:49 +0000 (13:35 +0000)]
ipv6: sr: fix TLVs not being copied using setsockopt
Function ipv6_push_rthdr4 allows to add an IPv6 Segment Routing Header
to a socket through setsockopt, but the current implementation doesn't
copy possible TLVs at the end of the SRH received from userspace.
Therefore, the execution of the following branch if (sr_has_hmac(sr_phdr))
{ ... } will never complete since the len and type fields of a possible
HMAC TLV are not copied, hence seg6_get_tlv_hmac will return an error,
and the HMAC will not be computed.
This commit adds a memcpy in case TLVs have been appended to the SRH.
Fixes: a149e7c7ce81 ("ipv6: sr: add support for SRH injection through setsockopt")
Acked-by: David Lebrun <dlebrun@google.com>
Signed-off-by: Mathieu Xhonneux <m.xhonneux@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Wed, 10 Jan 2018 11:45:49 +0000 (03:45 -0800)]
ipv6: fix possible mem leaks in ipv6_make_skb()
ip6_setup_cork() might return an error, while memory allocations have
been done and must be rolled back.
Fixes: 6422398c2ab0 ("ipv6: introduce ipv6_make_skb")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Reported-by: Mike Maloney <maloney@google.com>
Acked-by: Mike Maloney <maloney@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 10 Jan 2018 20:58:23 +0000 (15:58 -0500)]
Merge branch 'mlxsw-couple-of-fixes'
Jiri Pirko says:
====================
mlxsw: couple of fixes
Couple of small fixes for mlxsw driver.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Wed, 10 Jan 2018 10:42:44 +0000 (11:42 +0100)]
mlxsw: spectrum_qdisc: Don't use variable array in mlxsw_sp_tclass_congestion_enable
Resolve the sparse warning:
"sparse: Variable length array is used."
Use 2 arrays for 2 PRM register accesses.
Fixes: 96f17e0776c2 ("mlxsw: spectrum: Support RED qdisc offload")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz [Wed, 10 Jan 2018 10:42:43 +0000 (11:42 +0100)]
mlxsw: pci: Wait after reset before accessing HW
After performing reset driver polls on HW indication until learning
that the reset is done, but immediately after reset the device becomes
unresponsive which might lead to completion timeout on the first read.
Wait for 100ms before starting the polling.
Fixes: 233fa44bd67a ("mlxsw: pci: Implement reset done check")
Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Wed, 10 Jan 2018 02:14:28 +0000 (18:14 -0800)]
nfp: always unmask aux interrupts at init
The link state and exception interrupts may be masked when we probe.
The firmware should in theory prevent sending (and automasking) those
interrupts if the device is disabled, but if my reading of the FW code
is correct there are firmwares out there with race conditions in this
area. The interrupt may also be masked if previous driver which used
the device was malfunctioning and we didn't load the FW (there is no
other good way to comprehensively reset the PF).
Note that FW unmasks the data interrupts by itself when vNIC is
enabled, such helpful operation is not performed for LSC/EXN interrupts.
Always unmask the auxiliary interrupts after request_irq(). On the
remove path add missing PCI write flush before free_irq().
Fixes: 4c3523623dc0 ("net: add driver for Netronome NFP4000/NFP6000 NIC VFs")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cong Wang [Tue, 9 Jan 2018 21:40:41 +0000 (13:40 -0800)]
8021q: fix a memory leak for VLAN 0 device
A vlan device with vid 0 is allow to creat by not able to be fully
cleaned up by unregister_vlan_dev() which checks for vlan_id!=0.
Also, VLAN 0 is probably not a valid number and it is kinda
"reserved" for HW accelerating devices, but it is probably too
late to reject it from creation even if makes sense. Instead,
just remove the check in unregister_vlan_dev().
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Fixes: ad1afb003939 ("vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)")
Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 10 Jan 2018 20:08:46 +0000 (15:08 -0500)]
Merge tag 'wireless-drivers-for-davem-2018-01-09' of git://git./linux/kernel/git/kvalo/wireless-drivers
Kalle Valo says:
====================
wireless-drivers fixes for 4.15
Hopefully the last set of fixes for 4.15.
iwlwifi
* fix DMA mapping regression since v4.14
wcn36xx
* fix dynamic power save which has been broken since the driver was commited
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Madalin Bucur [Tue, 9 Jan 2018 12:43:34 +0000 (14:43 +0200)]
of_mdio: avoid MDIO bus removal when a PHY is missing
If one of the child devices is missing the of_mdiobus_register_phy()
call will return -ENODEV. When a missing device is encountered the
registration of the remaining PHYs is stopped and the MDIO bus will
fail to register. Propagate all errors except ENODEV to avoid it.
Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xiongfeng Wang [Tue, 9 Jan 2018 11:58:18 +0000 (19:58 +0800)]
caif_usb: use strlcpy() instead of strncpy()
gcc-8 reports
net/caif/caif_usb.c: In function 'cfusbl_device_notify':
./include/linux/string.h:245:9: warning: '__builtin_strncpy' output may
be truncated copying 15 bytes from a string of length 15
[-Wstringop-truncation]
The compiler require that the input param 'len' of strncpy() should be
greater than the length of the src string, so that '\0' is copied as
well. We can just use strlcpy() to avoid this warning.
Signed-off-by: Xiongfeng Wang <xiongfeng.wang@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kornilios Kourtis [Tue, 9 Jan 2018 08:52:22 +0000 (09:52 +0100)]
doc: clarification about setting SO_ZEROCOPY
Signed-off-by: Kornilios Kourtis <kou@zurich.ibm.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yangbo Lu [Tue, 9 Jan 2018 03:02:33 +0000 (11:02 +0800)]
net: gianfar_ptp: move set_fipers() to spinlock protecting area
set_fipers() calling should be protected by spinlock in
case that any interrupt breaks related registers setting
and the function we expect. This patch is to move set_fipers()
to spinlock protecting area in ptp_gianfar_adjtime().
Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Reviewed-by: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 10 Jan 2018 19:53:23 +0000 (14:53 -0500)]
Merge branch 'sctp-Some-sockopt-optlen-fixes'
Marcelo Ricardo Leitner says:
====================
sctp: Some sockopt optlen fixes
Hangbin Liu reported that some SCTP sockopt are allowing the user to get
the kernel to allocate really large buffers by not having a ceiling on
optlen.
This patchset address this issue (in patch 2), replace an GFP_ATOMIC
that isn't needed and avoid calculating the option size multiple times
in some setsockopt.
====================
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Marcelo Ricardo Leitner [Mon, 8 Jan 2018 21:02:29 +0000 (19:02 -0200)]
sctp: make use of pre-calculated len
Some sockopt handling functions were calculating the length of the
buffer to be written to userspace and then calculating it again when
actually writing the buffer, which could lead to some write not using
an up-to-date length.
This patch updates such places to just make use of the len variable.
Also, replace some sizeof(type) to sizeof(var).
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Marcelo Ricardo Leitner [Mon, 8 Jan 2018 21:02:28 +0000 (19:02 -0200)]
sctp: add a ceiling to optlen in some sockopts
Hangbin Liu reported that some sockopt calls could cause the kernel to log
a warning on memory allocation failure if the user supplied a large optlen
value. That is because some of them called memdup_user() without a ceiling
on optlen, allowing it to try to allocate really large buffers.
This patch adds a ceiling by limiting optlen to the maximum allowed that
would still make sense for these sockopt.
Reported-by: Hangbin Liu <haliu@redhat.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Marcelo Ricardo Leitner [Mon, 8 Jan 2018 21:02:27 +0000 (19:02 -0200)]
sctp: GFP_ATOMIC is not needed in sctp_setsockopt_events
So replace it with GFP_USER and also add __GFP_NOWARN.
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Wed, 10 Jan 2018 19:18:31 +0000 (11:18 -0800)]
Merge tag 'sound-4.15-rc8' of git://git./linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"A collection of the last-minute small PCM fixes:
- A workaround for the recent regression wrt PulseAudio
- Removal of spurious WARN_ON() that is triggered by syzkaller
- Fixes for aloop, hardening racy accesses
- Fixes in PCM OSS emulation wrt the unabortable loops that may cause
RCU stall"
* tag 'sound-4.15-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: pcm: Allow aborting mutex lock at OSS read/write loops
ALSA: pcm: Abort properly at pending signal in OSS read/write loops
ALSA: aloop: Fix racy hw constraints adjustment
ALSA: aloop: Fix inconsistent format due to incomplete rule
ALSA: aloop: Release cable upon open error path
ALSA: pcm: Workaround for weird PulseAudio behavior on rewind error
ALSA: pcm: Add missing error checks in OSS emulation plugin builder
ALSA: pcm: Remove incorrect snd_BUG_ON() usages
Borislav Petkov [Wed, 10 Jan 2018 11:28:16 +0000 (12:28 +0100)]
x86/alternatives: Fix optimize_nops() checking
The alternatives code checks only the first byte whether it is a NOP, but
with NOPs in front of the payload and having actual instructions after it
breaks the "optimized' test.
Make sure to scan all bytes before deciding to optimize the NOPs in there.
Reported-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Andrew Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Cc: Paul Turner <pjt@google.com>
Link: https://lkml.kernel.org/r/20180110112815.mgciyf5acwacphkq@pd.tnic
David S. Miller [Wed, 10 Jan 2018 16:17:21 +0000 (11:17 -0500)]
Merge git://git./pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:
====================
pull-request: bpf 2018-01-09
The following pull-request contains BPF updates for your *net* tree.
The main changes are:
1) Prevent out-of-bounds speculation in BPF maps by masking the
index after bounds checks in order to fix spectre v1, and
add an option BPF_JIT_ALWAYS_ON into Kconfig that allows for
removing the BPF interpreter from the kernel in favor of
JIT-only mode to make spectre v2 harder, from Alexei.
2) Remove false sharing of map refcount with max_entries which
was used in spectre v1, from Daniel.
3) Add a missing NULL psock check in sockmap in order to fix
a race, from John.
4) Fix test_align BPF selftest case since a recent change in
verifier rejects the bit-wise arithmetic on pointers
earlier but test_align update was missing, from Alexei.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Wed, 10 Jan 2018 09:40:04 +0000 (12:40 +0300)]
drm/vmwgfx: Potential off by one in vmw_view_add()
The vmw_view_cmd_to_type() function returns vmw_view_max (3) on error.
It's one element beyond the end of the vmw_view_cotables[] table.
My read on this is that it's possible to hit this failure. header->id
comes from vmw_cmd_check() and it's a user controlled number between
1040 and 1225 so we can hit that error. But I don't have the hardware
to test this code.
Fixes: d80efd5cb3de ("drm/vmwgfx: Initial DX support")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Cc: <stable@vger.kernel.org>
Ross Lagerwall [Tue, 9 Jan 2018 12:10:22 +0000 (12:10 +0000)]
xen/gntdev: Fix partial gntdev_mmap() cleanup
When cleaning up after a partially successful gntdev_mmap(), unmap the
successfully mapped grant pages otherwise Xen will kill the domain if
in debug mode (Attempt to implicitly unmap a granted PTE) or Linux will
kill the process and emit "BUG: Bad page map in process" if Xen is in
release mode.
This is only needed when use_ptemod is true because gntdev_put_map()
will unmap grant pages itself when use_ptemod is false.
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Ross Lagerwall [Tue, 9 Jan 2018 12:10:21 +0000 (12:10 +0000)]
xen/gntdev: Fix off-by-one error when unmapping with holes
If the requested range has a hole, the calculation of the number of
pages to unmap is off by one. Fix it.
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Geert Uytterhoeven [Tue, 9 Jan 2018 18:08:21 +0000 (19:08 +0100)]
gpio: Add missing open drain/source handling to gpiod_set_value_cansleep()
Since commit
f11a04464ae57e8d ("i2c: gpio: Enable working over slow
can_sleep GPIOs"), probing the i2c RTC connected to an i2c-gpio bus on
r8a7740/armadillo fails with:
rtc-s35390a 0-0030: error resetting chip
rtc-s35390a: probe of 0-0030 failed with error -5
More debug code reveals:
i2c i2c-0: master_xfer[0] R, addr=0x30, len=1
i2c i2c-0: NAK from device addr 0x30 msg #0
s35390a_get_reg: ret = -6
Commit
02e479808b5d62f8 ("gpio: Alter semantics of *raw* operations to
actually be raw") moved open drain/source handling from
gpiod_set_raw_value_commit() to gpiod_set_value(), but forgot to take
into account that gpiod_set_value_cansleep() also needs this handling.
The i2c protocol mandates that i2c signals are open drain, hence i2c
communication fails.
Fix this by adding the missing handling to gpiod_set_value_cansleep(),
using a new common helper gpiod_set_value_nocheck().
Fixes: 02e479808b5d62f8 ("gpio: Alter semantics of *raw* operations to actually be raw")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
[removed underscore syntax, added kerneldoc]
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Thierry Reding [Wed, 10 Jan 2018 12:04:58 +0000 (13:04 +0100)]
drm/tegra: sor: Fix hang on Tegra124 eDP
The SOR0 found on Tegra124 and Tegra210 only supports eDP and LVDS and
therefore has a slightly different clock tree than the SOR1 which does
not support eDP, but HDMI and DP instead.
Commit
e1335e2f0cfc ("drm/tegra: sor: Reimplement pad clock") breaks
setups with eDP because the sor->clk_out clock is uninitialized and
therefore setting the parent clock (either the safe clock or either of
the display PLLs) fails, which can cause hangs later on since there is
no clock driving the module.
Fix this by falling back to the module clock for sor->clk_out on those
setups. This guarantees that the module will always be clocked by an
enabled clock and hence prevents those hangs.
Fixes: e1335e2f0cfc ("drm/tegra: sor: Reimplement pad clock")
Reported-by: Guillaume Tucker <guillaume.tucker@collabora.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Oliver O'Halloran [Tue, 9 Jan 2018 16:07:15 +0000 (03:07 +1100)]
powerpc/powernv: Check device-tree for RFI flush settings
New device-tree properties are available which tell the hypervisor
settings related to the RFI flush. Use them to determine the
appropriate flush instruction to use, and whether the flush is
required.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Michael Neuling [Tue, 9 Jan 2018 16:07:15 +0000 (03:07 +1100)]
powerpc/pseries: Query hypervisor for RFI flush settings
A new hypervisor call is available which tells the guest settings
related to the RFI flush. Use it to query the appropriate flush
instruction(s), and whether the flush is required.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Michael Ellerman [Tue, 9 Jan 2018 16:07:15 +0000 (03:07 +1100)]
powerpc/64s: Support disabling RFI flush with no_rfi_flush and nopti
Because there may be some performance overhead of the RFI flush, add
kernel command line options to disable it.
We add a sensibly named 'no_rfi_flush' option, but we also hijack the
x86 option 'nopti'. The RFI flush is not the same as KPTI, but if we
see 'nopti' we can guess that the user is trying to avoid any overhead
of Meltdown mitigations, and it means we don't have to educate every
one about a different command line option.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Michael Ellerman [Tue, 9 Jan 2018 16:07:15 +0000 (03:07 +1100)]
powerpc/64s: Add support for RFI flush of L1-D cache
On some CPUs we can prevent the Meltdown vulnerability by flushing the
L1-D cache on exit from kernel to user mode, and from hypervisor to
guest.
This is known to be the case on at least Power7, Power8 and Power9. At
this time we do not know the status of the vulnerability on other CPUs
such as the 970 (Apple G5), pasemi CPUs (AmigaOne X1000) or Freescale
CPUs. As more information comes to light we can enable this, or other
mechanisms on those CPUs.
The vulnerability occurs when the load of an architecturally
inaccessible memory region (eg. userspace load of kernel memory) is
speculatively executed to the point where its result can influence the
address of a subsequent speculatively executed load.
In order for that to happen, the first load must hit in the L1,
because before the load is sent to the L2 the permission check is
performed. Therefore if no kernel addresses hit in the L1 the
vulnerability can not occur. We can ensure that is the case by
flushing the L1 whenever we return to userspace. Similarly for
hypervisor vs guest.
In order to flush the L1-D cache on exit, we add a section of nops at
each (h)rfi location that returns to a lower privileged context, and
patch that with some sequence. Newer firmwares are able to advertise
to us that there is a special nop instruction that flushes the L1-D.
If we do not see that advertised, we fall back to doing a displacement
flush in software.
For guest kernels we support migration between some CPU versions, and
different CPUs may use different flush instructions. So that we are
prepared to migrate to a machine with a different flush instruction
activated, we may have to patch more than one flush instruction at
boot if the hypervisor tells us to.
In the end this patch is mostly the work of Nicholas Piggin and
Michael Ellerman. However a cast of thousands contributed to analysis
of the issue, earlier versions of the patch, back ports testing etc.
Many thanks to all of them.
Tested-by: Jon Masters <jcm@redhat.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
David Gibson [Wed, 10 Jan 2018 06:04:39 +0000 (17:04 +1100)]
KVM: PPC: Book3S HV: Always flush TLB in kvmppc_alloc_reset_hpt()
The KVM_PPC_ALLOCATE_HTAB ioctl(), implemented by kvmppc_alloc_reset_hpt()
is supposed to completely clear and reset a guest's Hashed Page Table (HPT)
allocating or re-allocating it if necessary.
In the case where an HPT of the right size already exists and it just
zeroes it, it forces a TLB flush on all guest CPUs, to remove any stale TLB
entries loaded from the old HPT.
However, that situation can arise when the HPT is resizing as well - or
even when switching from an RPT to HPT - so those cases need a TLB flush as
well.
So, move the TLB flush to trigger in all cases except for errors.
Cc: stable@vger.kernel.org # v4.10+
Fixes: f98a8bf9ee20 ("KVM: PPC: Book3S HV: Allow KVM_PPC_ALLOCATE_HTAB ioctl() to change HPT size")
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Alexey Kardashevskiy [Wed, 22 Nov 2017 03:42:21 +0000 (14:42 +1100)]
KVM: PPC: Book3S PR: Fix WIMG handling under pHyp
Commit
96df226 ("KVM: PPC: Book3S PR: Preserve storage control bits")
added code to preserve WIMG bits but it missed 2 special cases:
- a magic page in kvmppc_mmu_book3s_64_xlate() and
- guest real mode in kvmppc_handle_pagefault().
For these ptes, WIMG was 0 and pHyp failed on these causing a guest to
stop in the very beginning at NIP=0x100 (due to
bd9166ffe "KVM: PPC:
Book3S PR: Exit KVM on failed mapping").
According to LoPAPR v1.1 14.5.4.1.2 H_ENTER:
The hypervisor checks that the WIMG bits within the PTE are appropriate
for the physical page number else H_Parameter return. (For System Memory
pages WIMG=0010, or, 1110 if the SAO option is enabled, and for IO pages
WIMG=01**.)
This hence initializes WIMG to non-zero value HPTE_R_M (0x10), as expected
by pHyp.
[paulus@ozlabs.org - fix compile for 32-bit]
Cc: stable@vger.kernel.org # v4.11+
Fixes: 96df226 "KVM: PPC: Book3S PR: Preserve storage control bits"
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Tested-by: Ruediger Oertel <ro@suse.de>
Reviewed-by: Greg Kurz <groug@kaod.org>
Tested-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Mathieu Desnoyers [Fri, 15 Dec 2017 19:23:10 +0000 (14:23 -0500)]
membarrier: Disable preemption when calling smp_call_function_many()
smp_call_function_many() requires disabling preemption around the call.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: <stable@vger.kernel.org> # v4.14+
Cc: Andrea Parri <parri.andrea@gmail.com>
Cc: Andrew Hunter <ahh@google.com>
Cc: Avi Kivity <avi@scylladb.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Dave Watson <davejwatson@fb.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Maged Michael <maged.michael@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul E . McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20171215192310.25293-1-mathieu.desnoyers@efficios.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>