Sam Ravnborg [Mon, 21 Apr 2014 19:39:18 +0000 (21:39 +0200)]
sparc32: fix sparse warning in init_32.c
Fix following warning:
init_32.c:112:22: warning: symbol 'bootmem_init' was not declared. Should it be static?
Fix by adding a proper prototype in pgtable_32.h and drop
the local prototype in srmmu.c
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sam Ravnborg [Mon, 21 Apr 2014 19:39:17 +0000 (21:39 +0200)]
sparc32: fix sparse warning in fault_32.c
Fix following warning:
fault_32.c:38:24: error: symbol 'unhandled_fault' redeclared with different type - different modifiers
When this warning was fixed several new warnings popped up - fix them too.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sam Ravnborg [Mon, 21 Apr 2014 19:39:16 +0000 (21:39 +0200)]
sparc32: rename mm/srmmu.h to mm/mm_32.h
This file will be used for more than just srmmu stuff, so the old name was misleading.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Mon, 28 Apr 2014 23:57:51 +0000 (16:57 -0700)]
Merge tag 'trace-fixes-v3.15-rc2' of git://git./linux/kernel/git/rostedt/linux-trace
Pull ftrace bugfix from Steven Rostedt:
"Takao Indoh reported that he was able to cause a ftrace bug while
loading a module and enabling function tracing at the same time.
He uncovered a race where the module when loaded will convert the
calls to mcount into nops, and expects the module's text to be RW.
But when function tracing is enabled, it will convert all kernel text
(core and module) from RO to RW to convert the nops to calls to ftrace
to record the function. After the convertion, it will convert all the
text back from RW to RO.
The issue is, it will also convert the module's text that is loading.
If it converts it to RO before ftrace does its conversion, it will
cause ftrace to fail and require a reboot to fix it again.
This patch moves the ftrace module update that converts calls to
mcount into nops to be done when the module state is still
MODULE_STATE_UNFORMED. This will ignore the module when the text is
being converted from RW back to RO"
* tag 'trace-fixes-v3.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
ftrace/module: Hardcode ftrace_module_init() call into load_module()
Linus Torvalds [Mon, 28 Apr 2014 22:19:06 +0000 (15:19 -0700)]
Merge tag 'dt-for-linus' of git://git.secretlab.ca/git/linux
Pull devicetree bug fixes from Grant Likely:
"These are some important bug fixes that need to get into v3.15.
This branch contains a pair of important bug fixes for the DT code:
- Fix some incorrect binding property names before they enter common
usage
- Fix bug where some platform devices will be unable to get their
interrupt number when they depend on an interrupt controller that
is not available at device creation time. This is a problem
causing mainline to fail on a number of ARM platforms"
* tag 'dt-for-linus' of git://git.secretlab.ca/git/linux:
of/irq: do irq resolution in platform_get_irq
of: selftest: add deferred probe interrupt test
dt: Fix binding typos in clock-names and interrupt-names
Linus Torvalds [Mon, 28 Apr 2014 21:58:15 +0000 (14:58 -0700)]
Merge branch 'merge' of git://git./linux/kernel/git/benh/powerpc
Pull powerpc fixes from Ben Herrenschmidt:
"Here is a bunch of post-merge window fixes that have been accumulating
in patchwork while I was on vacation or buried under other stuff last
week.
We have the now usual batch of LE fixes from Anton (sadly some new
stuff that went into this merge window had endian issues, we'll try to
make sure we do better next time)
Some fixes and cleanups to the new 24x7 performance monitoring stuff
(mostly typos and cleaning up printk's)
A series of fixes for an issue with our runlatch bit, which wasn't set
properly for offlined threads/cores and under KVM, causing potentially
some counters to misbehave along with possible power management
issues.
A fix for kexec nasty race where the new kernel wouldn't "see" the
secondary processors having reached back into firmware in time.
And finally a few other misc (and pretty simple) bug fixes"
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (33 commits)
powerpc/4xx: Fix section mismatch in ppc4xx_pci.c
ppc/kvm: Clear the runlatch bit of a vcpu before napping
ppc/kvm: Set the runlatch bit of a CPU just before starting guest
ppc/powernv: Set the runlatch bits correctly for offline cpus
powerpc/pseries: Protect remove_memory() with device hotplug lock
powerpc: Fix error return in rtas_flash module init
powerpc: Bump BOOT_COMMAND_LINE_SIZE to 2048
powerpc: Bump COMMAND_LINE_SIZE to 2048
powerpc: Rename duplicate COMMAND_LINE_SIZE define
powerpc/perf/hv-24x7: Catalog version number is be64, not be32
powerpc/perf/hv-24x7: Remove [static 4096], sparse chokes on it
powerpc/perf/hv-24x7: Use (unsigned long) not (u32) values when calling plpar_hcall_norets()
powerpc/perf/hv-gpci: Make device attr static
powerpc/perf/hv_gpci: Probe failures use pr_debug(), and padding reduced
powerpc/perf/hv_24x7: Probe errors changed to pr_debug(), padding fixed
powerpc/mm: Fix tlbie to add AVAL fields for 64K pages
powerpc/powernv: Fix little endian issues in OPAL dump code
powerpc/powernv: Create OPAL sglist helper functions and fix endian issues
powerpc/powernv: Fix little endian issues in OPAL error log code
powerpc/powernv: Fix little endian issues with opal_do_notifier calls
...
Linus Torvalds [Mon, 28 Apr 2014 21:24:09 +0000 (14:24 -0700)]
mm: don't pointlessly use BUG_ON() for sanity check
BUG_ON() is a big hammer, and should be used _only_ if there is some
major corruption that you cannot possibly recover from, making it
imperative that the current process (and possibly the whole machine) be
terminated with extreme prejudice.
The trivial sanity check in the vmacache code is *not* such a fatal
error. Recovering from it is absolutely trivial, and using BUG_ON()
just makes it harder to debug for no actual advantage.
To make matters worse, the placement of the BUG_ON() (only if the range
check matched) actually makes it harder to hit the sanity check to begin
with, so _if_ there is a bug (and we just got a report from Srivatsa
Bhat that this can indeed trigger), it is harder to debug not just
because the machine is possibly dead, but because we don't have better
coverage.
BUG_ON() must *die*. Maybe we should add a checkpatch warning for it,
because it is simply just about the worst thing you can ever do if you
hit some "this cannot happen" situation.
Reported-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Davidlohr Bueso <davidlohr@hp.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Steven Rostedt (Red Hat) [Thu, 24 Apr 2014 14:40:12 +0000 (10:40 -0400)]
ftrace/module: Hardcode ftrace_module_init() call into load_module()
A race exists between module loading and enabling of function tracer.
CPU 1 CPU 2
----- -----
load_module()
module->state = MODULE_STATE_COMING
register_ftrace_function()
mutex_lock(&ftrace_lock);
ftrace_startup()
update_ftrace_function();
ftrace_arch_code_modify_prepare()
set_all_module_text_rw();
<enables-ftrace>
ftrace_arch_code_modify_post_process()
set_all_module_text_ro();
[ here all module text is set to RO,
including the module that is
loading!! ]
blocking_notifier_call_chain(MODULE_STATE_COMING);
ftrace_init_module()
[ tries to modify code, but it's RO, and fails!
ftrace_bug() is called]
When this race happens, ftrace_bug() will produces a nasty warning and
all of the function tracing features will be disabled until reboot.
The simple solution is to treate module load the same way the core
kernel is treated at boot. To hardcode the ftrace function modification
of converting calls to mcount into nops. This is done in init/main.c
there's no reason it could not be done in load_module(). This gives
a better control of the changes and doesn't tie the state of the
module to its notifiers as much. Ftrace is special, it needs to be
treated as such.
The reason this would work, is that the ftrace_module_init() would be
called while the module is in MODULE_STATE_UNFORMED, which is ignored
by the set_all_module_text_ro() call.
Link: http://lkml.kernel.org/r/1395637826-3312-1-git-send-email-indou.takao@jp.fujitsu.com
Reported-by: Takao Indoh <indou.takao@jp.fujitsu.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: stable@vger.kernel.org # 2.6.38+
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Alistair Popple [Tue, 8 Apr 2014 04:20:19 +0000 (14:20 +1000)]
powerpc/4xx: Fix section mismatch in ppc4xx_pci.c
This patch fixes this section mismatch:
WARNING: vmlinux.o(.text+0x1efc4): Section mismatch in reference from
the function apm821xx_pciex_init_port_hw() to the function
.init.text:ppc4xx_pciex_wait_on_sdr.isra.9()
The function apm821xx_pciex_init_port_hw() references the function
__init ppc4xx_pciex_wait_on_sdr.isra.9(). This is often because
apm821xx_pciex_init_port_hw lacks a __init annotation or the
annotation of ppc4xx_pciex_wait_on_sdr.isra.9 is wrong.
apm821xx_pciex_init_port_hw is only referenced by a struct in
__initdata, so it should be safe to add __init to
apm821xx_pciex_init_port_hw.
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Preeti U Murthy [Fri, 11 Apr 2014 10:32:08 +0000 (16:02 +0530)]
ppc/kvm: Clear the runlatch bit of a vcpu before napping
When the guest cedes the vcpu or the vcpu has no guest to
run it naps. Clear the runlatch bit of the vcpu before
napping to indicate an idle cpu.
Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Preeti U Murthy [Fri, 11 Apr 2014 10:31:58 +0000 (16:01 +0530)]
ppc/kvm: Set the runlatch bit of a CPU just before starting guest
The secondary threads in the core are kept offline before launching guests
in kvm on powerpc: "
371fefd6f2dc4666:KVM: PPC: Allow book3s_hv guests to use
SMT processor modes."
Hence their runlatch bits are cleared. When the secondary threads are called
in to start a guest, their runlatch bits need to be set to indicate that they
are busy. The primary thread has its runlatch bit set though, but there is no
harm in setting this bit once again. Hence set the runlatch bit for all
threads before they start guest.
Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Preeti U Murthy [Fri, 11 Apr 2014 10:31:48 +0000 (16:01 +0530)]
ppc/powernv: Set the runlatch bits correctly for offline cpus
Up until now we have been setting the runlatch bits for a busy CPU and
clearing it when a CPU enters idle state. The runlatch bit has thus
been consistent with the utilization of a CPU as long as the CPU is online.
However when a CPU is hotplugged out the runlatch bit is not cleared. It
needs to be cleared to indicate an unused CPU. Hence this patch has the
runlatch bit cleared for an offline CPU just before entering an idle state
and sets it immediately after it exits the idle state.
Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Li Zhong [Thu, 10 Apr 2014 08:25:31 +0000 (16:25 +0800)]
powerpc/pseries: Protect remove_memory() with device hotplug lock
While testing memory hot-remove, I found following dead lock:
Process #1141 is drmgr, trying to remove some memory, i.e. memory499.
It holds the memory_hotplug_mutex, and blocks when trying to remove file
"online" under dir memory499, in kernfs_drain(), at
wait_event(root->deactivate_waitq,
atomic_read(&kn->active) == KN_DEACTIVATED_BIAS);
Process #1120 is trying to online memory499 by
echo 1 > memory499/online
In .kernfs_fop_write, it uses kernfs_get_active() to increase
&kn->active, thus blocking process #1141. While itself is blocked later
when trying to acquire memory_hotplug_mutex, which is held by process
The backtrace of both processes are shown below:
[<
c000000001b18600>] 0xc000000001b18600
[<
c000000000015044>] .__switch_to+0x144/0x200
[<
c000000000263ca4>] .online_pages+0x74/0x7b0
[<
c00000000055b40c>] .memory_subsys_online+0x9c/0x150
[<
c00000000053cbe8>] .device_online+0xb8/0x120
[<
c00000000053cd04>] .online_store+0xb4/0xc0
[<
c000000000538ce4>] .dev_attr_store+0x64/0xa0
[<
c00000000030f4ec>] .sysfs_kf_write+0x7c/0xb0
[<
c00000000030e574>] .kernfs_fop_write+0x154/0x1e0
[<
c000000000268450>] .vfs_write+0xe0/0x260
[<
c000000000269144>] .SyS_write+0x64/0x110
[<
c000000000009ffc>] syscall_exit+0x0/0x7c
[<
c000000001b18600>] 0xc000000001b18600
[<
c000000000015044>] .__switch_to+0x144/0x200
[<
c00000000030be14>] .__kernfs_remove+0x204/0x300
[<
c00000000030d428>] .kernfs_remove_by_name_ns+0x68/0xf0
[<
c00000000030fb38>] .sysfs_remove_file_ns+0x38/0x60
[<
c000000000539354>] .device_remove_attrs+0x54/0xc0
[<
c000000000539fd8>] .device_del+0x158/0x250
[<
c00000000053a104>] .device_unregister+0x34/0xa0
[<
c00000000055bc14>] .unregister_memory_section+0x164/0x170
[<
c00000000024ee18>] .__remove_pages+0x108/0x4c0
[<
c00000000004b590>] .arch_remove_memory+0x60/0xc0
[<
c00000000026446c>] .remove_memory+0x8c/0xe0
[<
c00000000007f9f4>] .pseries_remove_memblock+0xd4/0x160
[<
c00000000007fcfc>] .pseries_memory_notifier+0x27c/0x290
[<
c0000000008ae6cc>] .notifier_call_chain+0x8c/0x100
[<
c0000000000d858c>] .__blocking_notifier_call_chain+0x6c/0xe0
[<
c00000000071ddec>] .of_property_notify+0x7c/0xc0
[<
c00000000071ed3c>] .of_update_property+0x3c/0x1b0
[<
c0000000000756cc>] .ofdt_write+0x3dc/0x740
[<
c0000000002f60fc>] .proc_reg_write+0xac/0x110
[<
c000000000268450>] .vfs_write+0xe0/0x260
[<
c000000000269144>] .SyS_write+0x64/0x110
[<
c000000000009ffc>] syscall_exit+0x0/0x7c
This patch uses lock_device_hotplug() to protect remove_memory() called
in pseries_remove_memblock(), which is also stated before function
remove_memory():
* NOTE: The caller must call lock_device_hotplug() to serialize hotplug
* and online/offline operations before this call, as required by
* try_offline_node().
*/
void __ref remove_memory(int nid, u64 start, u64 size)
With this lock held, the other process(#1120 above) trying to online the
memory block will retry the system call when calling
lock_device_hotplug_sysfs(), and finally find No such device error.
Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Mon, 14 Apr 2014 11:23:32 +0000 (21:23 +1000)]
powerpc: Fix error return in rtas_flash module init
module_init should return 0 or a negative errno.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Mon, 14 Apr 2014 11:55:25 +0000 (21:55 +1000)]
powerpc: Bump BOOT_COMMAND_LINE_SIZE to 2048
Bump the boot wrapper BOOT_COMMAND_LINE_SIZE to match the
kernel.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Mon, 14 Apr 2014 11:54:52 +0000 (21:54 +1000)]
powerpc: Bump COMMAND_LINE_SIZE to 2048
I've had a report that the current limit is too small for
an automated network based installer. Bump it.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Mon, 14 Apr 2014 11:54:05 +0000 (21:54 +1000)]
powerpc: Rename duplicate COMMAND_LINE_SIZE define
We have two definitions of COMMAND_LINE_SIZE, one for the kernel
and one for the boot wrapper. I assume this is so the boot
wrapper can be self sufficient and not rely on kernel headers.
Having two defines with the same name is confusing, I just
updated the wrong one when trying to bump it.
Make the boot wrapper define unique by calling it
BOOT_COMMAND_LINE_SIZE.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cody P Schafer [Tue, 15 Apr 2014 17:10:55 +0000 (10:10 -0700)]
powerpc/perf/hv-24x7: Catalog version number is be64, not be32
The catalog version number was changed from a be32 (with proceeding
32bits of padding) to a be64, update the code to treat it as a be64
Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cody P Schafer [Tue, 15 Apr 2014 17:10:54 +0000 (10:10 -0700)]
powerpc/perf/hv-24x7: Remove [static 4096], sparse chokes on it
Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cody P Schafer [Tue, 15 Apr 2014 17:10:53 +0000 (10:10 -0700)]
powerpc/perf/hv-24x7: Use (unsigned long) not (u32) values when calling plpar_hcall_norets()
Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cody P Schafer [Tue, 15 Apr 2014 17:10:52 +0000 (10:10 -0700)]
powerpc/perf/hv-gpci: Make device attr static
Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cody P Schafer [Tue, 15 Apr 2014 17:10:51 +0000 (10:10 -0700)]
powerpc/perf/hv_gpci: Probe failures use pr_debug(), and padding reduced
fixup for "powerpc/perf: Add support for the hv gpci (get performance
counter info) interface".
Makes the "not enabled" message less awful (and hidden unless
debugging).
Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cody P Schafer [Tue, 15 Apr 2014 17:10:50 +0000 (10:10 -0700)]
powerpc/perf/hv_24x7: Probe errors changed to pr_debug(), padding fixed
fixup for "powerpc/perf: Add support for the hv 24x7 interface"
Makes the "not enabled" message less awful (and hides it in most cases).
Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Aneesh Kumar K.V [Mon, 21 Apr 2014 05:07:36 +0000 (10:37 +0530)]
powerpc/mm: Fix tlbie to add AVAL fields for 64K pages
The if condition check was based on a draft ISA doc. Remove the same.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Tue, 22 Apr 2014 05:01:27 +0000 (15:01 +1000)]
powerpc/powernv: Fix little endian issues in OPAL dump code
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Tue, 22 Apr 2014 05:01:26 +0000 (15:01 +1000)]
powerpc/powernv: Create OPAL sglist helper functions and fix endian issues
We have two copies of code that creates an OPAL sg list. Consolidate
these into a common set of helpers and fix the endian issues.
The flash interface embedded a version number in the num_entries
field, whereas the dump interface did did not. Since versioning
wasn't added to the flash interface and it is impossible to add
this in a backwards compatible way, just remove it.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Tue, 22 Apr 2014 05:01:25 +0000 (15:01 +1000)]
powerpc/powernv: Fix little endian issues in OPAL error log code
Fix little endian issues with the OPAL error log code.
Signed-off-by: Anton Blanchard <anton@samba.org>
Reviewed-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Tue, 22 Apr 2014 05:01:24 +0000 (15:01 +1000)]
powerpc/powernv: Fix little endian issues with opal_do_notifier calls
The bitmap in opal_poll_events and opal_handle_interrupt is
big endian, so we need to byteswap it on little endian builds.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Tue, 22 Apr 2014 05:01:23 +0000 (15:01 +1000)]
powerpc/powernv: Remove some OPAL function declaration duplication
We had some duplication of the internal OPAL functions.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Tue, 22 Apr 2014 05:01:22 +0000 (15:01 +1000)]
powerpc/powernv: Use uint64_t instead of size_t in OPAL APIs
Using size_t in our APIs is asking for trouble, especially
when some OPAL calls use size_t pointers.
Signed-off-by: Anton Blanchard <anton@samba.org>
Reviewed-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Wei Yang [Wed, 23 Apr 2014 02:26:33 +0000 (10:26 +0800)]
powerpc/powernv: Release the refcount for pci_dev
On PowerNV platform, we are holding an unnecessary refcount on a pci_dev, which
leads to the pci_dev is not destroyed when hotplugging a pci device.
This patch release the unnecessary refcount.
Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Wei Yang [Wed, 23 Apr 2014 02:26:32 +0000 (10:26 +0800)]
powerpc/powernv: Reduce multi-hit of iommu_add_device()
During the EEH hotplug event, iommu_add_device() will be invoked three times
and two of them will trigger warning or error.
The three times to invoke the iommu_add_device() are:
pci_device_add
...
set_iommu_table_base_and_group <- 1st time, fail
device_add
...
tce_iommu_bus_notifier <- 2nd time, succees
pcibios_add_pci_devices
...
pcibios_setup_bus_devices <- 3rd time, re-attach
The first time fails, since the dev->kobj->sd is not initialized. The
dev->kobj->sd is initialized in device_add().
The third time's warning is triggered by the re-attach of the iommu_group.
After applying this patch, the error
iommu_tce: 0003:05:00.0 has not been added, ret=-14
and the warning
[ 204.123609] ------------[ cut here ]------------
[ 204.123645] WARNING: at arch/powerpc/kernel/iommu.c:1125
[ 204.123680] Modules linked in: xt_CHECKSUM nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6t_REJECT bnep bluetooth 6lowpan_iphc rfkill xt_conntrack ebtable_nat ebtable_broute bridge stp llc mlx4_ib ib_sa ib_mad ib_core ib_addr ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw bnx2x tg3 mlx4_core nfsd ptp mdio ses libcrc32c nfs_acl enclosure be2net pps_core shpchp lockd kvm uinput sunrpc binfmt_misc lpfc scsi_transport_fc ipr scsi_tgt
[ 204.124356] CPU: 18 PID: 650 Comm: eehd Not tainted 3.14.0-rc5yw+ #102
[ 204.124400] task:
c0000027ed485670 ti:
c0000027ed50c000 task.ti:
c0000027ed50c000
[ 204.124453] NIP:
c00000000003cf80 LR:
c00000000006c648 CTR:
c00000000006c5c0
[ 204.124506] REGS:
c0000027ed50f440 TRAP: 0700 Not tainted (3.14.0-rc5yw+)
[ 204.124558] MSR:
9000000000029032 <SF,HV,EE,ME,IR,DR,RI> CR:
88008084 XER:
20000000
[ 204.124682] CFAR:
c00000000006c644 SOFTE: 1
GPR00:
c00000000006c648 c0000027ed50f6c0 c000000001398380 c0000027ec260300
GPR04:
c0000027ea92c000 c00000000006ad00 c0000000016e41b0 0000000000000110
GPR08:
c0000000012cd4c0 0000000000000001 c0000027ec2602ff 0000000000000062
GPR12:
0000000028008084 c00000000fdca200 c0000000000d1d90 c0000027ec281a80
GPR16:
0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20:
0000000000000000 0000000000000000 0000000000000000 0000000000000001
GPR24:
000000005342697b 0000000000002906 c000001fe6ac9800 c000001fe6ac9800
GPR28:
0000000000000000 c0000000016e3a80 c0000027ea92c090 c0000027ea92c000
[ 204.125353] NIP [
c00000000003cf80] .iommu_add_device+0x30/0x1f0
[ 204.125399] LR [
c00000000006c648] .pnv_pci_ioda_dma_dev_setup+0x88/0xb0
[ 204.125443] Call Trace:
[ 204.125464] [
c0000027ed50f6c0] [
c0000027ed50f750] 0xc0000027ed50f750 (unreliable)
[ 204.125526] [
c0000027ed50f750] [
c00000000006c648] .pnv_pci_ioda_dma_dev_setup+0x88/0xb0
[ 204.125588] [
c0000027ed50f7d0] [
c000000000069cc8] .pnv_pci_dma_dev_setup+0x78/0x340
[ 204.125650] [
c0000027ed50f870] [
c000000000044408] .pcibios_setup_device+0x88/0x2f0
[ 204.125712] [
c0000027ed50f940] [
c000000000046040] .pcibios_setup_bus_devices+0x60/0xd0
[ 204.125774] [
c0000027ed50f9c0] [
c000000000043acc] .pcibios_add_pci_devices+0xdc/0x1c0
[ 204.125837] [
c0000027ed50fa50] [
c00000000086f970] .eeh_reset_device+0x36c/0x4f0
[ 204.125939] [
c0000027ed50fb20] [
c00000000003a2d8] .eeh_handle_normal_event+0x448/0x480
[ 204.126068] [
c0000027ed50fbc0] [
c00000000003a35c] .eeh_handle_event+0x4c/0x340
[ 204.126192] [
c0000027ed50fc80] [
c00000000003a74c] .eeh_event_handler+0xfc/0x1b0
[ 204.126319] [
c0000027ed50fd30] [
c0000000000d1ea0] .kthread+0x110/0x130
[ 204.126430] [
c0000027ed50fe30] [
c00000000000a460] .ret_from_kernel_thread+0x5c/0x7c
[ 204.126556] Instruction dump:
[ 204.126610]
7c0802a6 fba1ffe8 fbc1fff0 fbe1fff8 f8010010 f821ff71 7c7e1b78 60000000
[ 204.126787]
60000000 e87e0298 3143ffff 7d2a1910 <
0b090000>
2fa90000 40de00c8 ebfe0218
[ 204.126966] ---[ end trace
6e7aefd80add2973 ]---
are cleared.
This patch removes iommu_add_device() in pnv_pci_ioda_dma_dev_setup(), which
revert part of the change in commit
d905c5df(PPC: POWERNV: move
iommu_add_device earlier).
Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Wed, 23 Apr 2014 21:25:34 +0000 (07:25 +1000)]
powerpc/powernv: Fix little endian issues in OPAL flash code
With this patch I was able to update firmware on an LE kernel.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Benjamin Herrenschmidt [Thu, 24 Apr 2014 06:14:25 +0000 (16:14 +1000)]
powerpc/powernv: Fix kexec races going back to OPAL
We have a subtle race when sending CPUs back to OPAL on kexec.
We mark them as "in real mode" right before we send them down. Once
we've booted the new kernel, it might try to call opal_reinit_cpus()
to change endianness, and that requires all CPUs to be spinning inside
OPAL.
However there is no synchronization here and we've observed cases
where the returning CPUs hadn't established their new state inside
OPAL before opal_reinit_cpus() is called, causing it to fail.
The proper fix is to actually wait for them to go down all the way
from the kexec'ing kernel.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Joel Stanley [Thu, 24 Apr 2014 07:25:37 +0000 (16:55 +0930)]
powerpc/powernv: Check sysparam size before creation
The size of the sysparam sysfs files is determined from the device tree
at boot. However the buffer is hard coded to 64 bytes. If we encounter a
parameter that is larger than 64, or miss-parse the device tree, the
buffer will overflow when reading or writing to the parameter.
Check it at discovery time, and if the parameter is too large, do not
create a sysfs entry for it.
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Joel Stanley [Thu, 24 Apr 2014 07:25:36 +0000 (16:55 +0930)]
powerpc/powernv: Fix typos in sysparam code
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Joel Stanley [Thu, 24 Apr 2014 07:25:35 +0000 (16:55 +0930)]
powerpc/powernv: Check sysfs size before copying
The sysparam code currently uses the userspace supplied number of
bytes when memcpy()ing in to a local 64-byte buffer.
Limit the maximum number of bytes by the size of the buffer.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Joel Stanley [Thu, 24 Apr 2014 07:25:34 +0000 (16:55 +0930)]
powerpc/powernv: Use ssize_t for sysparam return values
The OPAL calls are returning int64_t values, which the sysparam code
stores in an int, and the sysfs callback returns ssize_t. Make code a
easier to read by consistently using ssize_t.
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Joel Stanley [Thu, 24 Apr 2014 07:25:33 +0000 (16:55 +0930)]
powerpc/powernv: Fix sysparam sysfs error handling
When a sysparam query in OPAL returned a negative value (error code),
sysfs would spew out a decent chunk of memory; almost 64K more than
expected. This was traced to a sign/unsigned mix up in the OPAL sysparam
sysfs code at sys_param_show.
The return value of sys_param_show is a ssize_t, calculated using
return ret ? ret : attr->param_size;
Alan Modra explains:
"attr->param_size" is an unsigned int, "ret" an int, so the overall
expression has type unsigned int. Result is that ret is cast to
unsigned int before being cast to ssize_t.
Instead of using the ternary operator, set ret to the param_size if an
error is not detected. The same bug exists in the sysfs write callback;
this patch fixes it in the same way.
A note on debugging this next time: on my system gcc will warn about
this if compiled with -Wsign-compare, which is not enabled by -Wall,
only -Wextra.
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Li Zhong [Mon, 28 Apr 2014 00:29:51 +0000 (08:29 +0800)]
powerpc: Fix Oops in rtas_stop_self()
commit
41dd03a9 may cause Oops in rtas_stop_self().
The reason is that the rtas_args was moved into stack space. For a box
with more that 4GB RAM, the stack could easily be outside 32bit range,
but RTAS is 32bit.
So the patch moves rtas_args away from stack by adding static before
it.
Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: stable@vger.kernel.org # 3.14+
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Jeff Mahoney [Sun, 27 Apr 2014 22:10:43 +0000 (18:10 -0400)]
powerpc: Export flush_icache_range
Commit
aac416fc38c (lkdtm: flush icache and report actions) calls
flush_icache_range from a module. It's exported on most architectures
that implement it, but not on powerpc. This patch exports it to fix
the module link failure.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Linus Torvalds [Mon, 28 Apr 2014 02:29:27 +0000 (19:29 -0700)]
Linux 3.15-rc3
Will Deacon [Wed, 23 Apr 2014 16:52:52 +0000 (17:52 +0100)]
word-at-a-time: avoid undefined behaviour in zero_bytemask macro
The asm-generic, big-endian version of zero_bytemask creates a mask of
bytes preceding the first zero-byte by left shifting ~0ul based on the
position of the first zero byte.
Unfortunately, if the first (top) byte is zero, the output of
prep_zero_mask has only the top bit set, resulting in undefined C
behaviour as we shift left by an amount equal to the width of the type.
As it happens, GCC doesn't manage to spot this through the call to fls(),
but the issue remains if architectures choose to implement their shift
instructions differently.
An example would be arch/arm/ (AArch32), where LSL Rd, Rn, #32 results
in Rd == 0x0, whilst on arch/arm64 (AArch64) LSL Xd, Xn, #64 results in
Xd == Xn.
Rather than check explicitly for the problematic shift, this patch adds
an extra shift by 1, replacing fls with __fls. Since zero_bytemask is
never called with a zero argument (has_zero() is used to check the data
first), we don't need to worry about calling __fls(0), which is
undefined.
Cc: <stable@vger.kernel.org>
Cc: Victor Kamensky <victor.kamensky@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sun, 27 Apr 2014 22:08:12 +0000 (15:08 -0700)]
Merge branch 'safe-dirty-tlb-flush'
This merges the patch to fix possible loss of dirty bit on munmap() or
madvice(DONTNEED). If there are concurrent writers on other CPU's that
have the unmapped/unneeded page in their TLBs, their writes to the page
could possibly get lost if a third CPU raced with the TLB flush and did
a page_mkclean() before the page was fully written.
Admittedly, if you unmap() or madvice(DONTNEED) an area _while_ another
thread is still busy writing to it, you deserve all the lost writes you
could get. But we kernel people hold ourselves to higher quality
standards than "crazy people deserve to lose", because, well, we've seen
people do all kinds of crazy things.
So let's get it right, just because we can, and we don't have to worry
about it.
* safe-dirty-tlb-flush:
mm: split 'tlb_flush_mmu()' into tlb flushing and memory freeing parts
Linus Torvalds [Sun, 27 Apr 2014 20:26:28 +0000 (13:26 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/mason/linux-btrfs
Pull btrfs fixes from Chris Mason.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
Btrfs: limit the path size in send to PATH_MAX
Btrfs: correctly set profile flags on seqlock retry
Btrfs: use correct key when repeating search for extent item
Btrfs: fix inode caching vs tree log
Btrfs: fix possible memory leaks in open_ctree()
Btrfs: avoid triggering bug_on() when we fail to start inode caching task
Btrfs: move btrfs_{set,clear}_and_info() to ctree.h
btrfs: replace error code from btrfs_drop_extents
btrfs: Change the hole range to a more accurate value.
btrfs: fix use-after-free in mount_subvol()
Linus Torvalds [Sun, 27 Apr 2014 19:55:04 +0000 (12:55 -0700)]
Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm
Pull arm fixes from Russell King:
"A number of fixes for the PJ4/iwmmxt changes which arm-soc forced me
to take during the merge window. This stuff should have been better
tested and sorted out *before* the merge window"
* 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
ARM: 8042/1: iwmmxt: allow to build iWMMXt on Marvell PJ4B
ARM: 8041/1: pj4: fix cpu_is_pj4 check
ARM: 8040/1: pj4: properly detect existence of iWMMXt coprocessor
ARM: 8039/1: pj4: enable iWMMXt only if CONFIG_IWMMXT is set
ARM: 8038/1: iwmmxt: explicitly check for supported architectures
Linus Torvalds [Sun, 27 Apr 2014 19:54:05 +0000 (12:54 -0700)]
Merge tag 'arm64-fixes' of git://git./linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:
- compat renameat2 syscall wiring and __NR_compat_syscalls fix
- TLB fix for transparent huge pages following switch to generic
mmu_gather
- spinlock initialisation for init_mm's context
- move of_clk_init() earlier
- Kconfig duplicate entry fix
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: init: Move of_clk_init to time_init
arm64: initialize spinlock for init_mm's context
arm64: debug: remove noisy, pointless warning
arm64: mm: Add THP TLB entries to general mmu_gather
arm64: add renameat2 compat syscall
ARM64: Remove duplicated Kconfig entry for "kernel/power/Kconfig"
arm64: __NR_compat_syscalls fix
Linus Torvalds [Sun, 27 Apr 2014 18:21:03 +0000 (11:21 -0700)]
Merge branch 'irq-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull irq fixes from Thomas Gleixner:
"A slighlty large fix for a subtle issue in the CPU hotplug code of
certain ARM SoCs, where the not yet online cpu needs to setup the cpu
local timer and needs to set the interrupt affinity to itself.
Setting interrupt affinity to a not online cpu is prohibited and
therefor the timer interrupt ends up on the wrong cpu, which leads to
nasty complications.
The SoC folks tried to hack around that in the SoC code in some more
than nasty ways. The proper solution is to have a way to enforce the
affinity setting to a not online cpu. The core patch to the genirq
code provides that facility and the follow up patches make use of it
in the GIC interrupt controller and the exynos timer driver.
The change to the core code has no implications to existing users,
except for the rename of the locked function and therefor the
necessary fixup in mips/cavium. Aside of that, no runtime impact is
possible, as none of the existing interrupt chips implements anything
which depends on the force argument of the irq_set_affinity()
callback"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
clocksource: Exynos_mct: Register clock event after request_irq()
clocksource: Exynos_mct: Use irq_force_affinity() in cpu bringup
irqchip: Gic: Support forced affinity setting
genirq: Allow forcing cpu affinity of interrupts
Linus Torvalds [Sun, 27 Apr 2014 17:39:09 +0000 (10:39 -0700)]
Merge tag 'tty-3.15-rc3' of git://git./linux/kernel/git/gregkh/tty
Pull tty/serial fixes from Greg KH:
"Here are a few tty/serial fixes for 3.15-rc3 that resolve a number of
reported issues in the 8250 and samsung serial drivers, as well as a
character loss fix for the tty core that was caused by the lock
removal patches a release ago"
* tag 'tty-3.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
serial_core: fix uart PORT_UNKNOWN handling
serial: samsung: Change barrier() to cpu_relax() in console output
serial: samsung: don't check config for every character
serial: samsung: Use the passed in "port", fixing kgdb w/ no console
serial: 8250: Fix thread unsafe __dma_tx_complete function
8250_core: Fix unwanted TX chars write
tty: Fix race condition between __tty_buffer_request_room and flush_to_ldisc
Linus Torvalds [Sun, 27 Apr 2014 17:34:29 +0000 (10:34 -0700)]
Merge tag 'staging-3.15-rc3' of git://git./linux/kernel/git/gregkh/staging
Pull staging / IIO driver fixes from Greg KH:
"Here are some small staging and IIO driver fixes for 3.15-rc3.
Nothing major at all, just some assorted issues that people have
reported"
* tag 'staging-3.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
staging: comedi: usbdux: bug fix for accessing 'ao_chanlist' in private data
iio: adc: mxs-lradc: fix warning when buidling on avr32
iio: cm36651: Fix i2c client leak and possible NULL pointer dereference
iio: querying buffer scan_mask should return 0/1
staging:iio:ad2s1200 fix a missing break
iio: adc: at91_adc: correct default shtim value
ARM: at91: at91sam9260: change at91_adc name
ARM: at91: at91sam9g45: change at91_adc name
iio: cm32181: Fix read integration time function
iio: adc: at91_adc: Repair broken platform_data support
Linus Torvalds [Sun, 27 Apr 2014 17:28:34 +0000 (10:28 -0700)]
Merge tag 'driver-core-3.15-rc3' of git://git./linux/kernel/git/gregkh/driver-core
Pull driver core fixes from Greg KH:
"Here are some kernfs fixes for 3.15-rc3 that resolve some reported
problems. Nothing huge, but all needed"
* tag 'driver-core-3.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
s390/ccwgroup: Fix memory corruption
kernfs: add back missing error check in kernfs_fop_mmap()
kernfs: fix a subdir count leak
Linus Torvalds [Sun, 27 Apr 2014 17:24:17 +0000 (10:24 -0700)]
Merge tag 'usb-3.15-rc3' of git://git./linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are a number of USB fixes for 3.15-rc3. The majority are gadget
fixes, as we didn't get any of those in for 3.15-rc2. The others are
all over the place, and there's a number of new device id addtions as
well."
* tag 'usb-3.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (35 commits)
usb: option: add and update a number of CMOTech devices
usb: option: add Alcatel L800MA
usb: option: add Olivetti Olicard 500
usb: qcserial: add Sierra Wireless MC7305/MC7355
usb: qcserial: add Sierra Wireless MC73xx
usb: qcserial: add Sierra Wireless EM7355
USB: io_ti: fix firmware download on big-endian machines
usb/xhci: fix compilation warning when !CONFIG_PCI && !CONFIG_PM
xhci: extend quirk for Renesas cards
xhci: Switch Intel Lynx Point ports to EHCI on shutdown.
usb: xhci: Prefer endpoint context dequeue pointer over stopped_trb
phy: core: make NULL a valid phy reference if !CONFIG_GENERIC_PHY
phy: fix kernel oops in phy_lookup()
phy: restore OMAP_CONTROL_PHY dependencies
phy: exynos: fix building as a module
USB: serial: fix sysfs-attribute removal deadlock
usb: wusbcore: fix panic in wusbhc_chid_set
usb: wusbcore: convert nested lock to use spin_lock instead of spin_lock_irq
uwb: don't call spin_unlock_irq in a USB completion handler
usb: chipidea: coordinate usb phy initialization for different phy type
...
Linus Torvalds [Sun, 27 Apr 2014 17:19:06 +0000 (10:19 -0700)]
Merge tag 'pm+acpi-3.15-rc3' of git://git./linux/kernel/git/rafael/linux-pm
Pull ACPI and power management fixes from Rafael Wysocki:
"These include a fix for a recent ACPI regression related to device
notifications, intel_idle fix related to IvyTown support, fix for a
buffer size issue in ACPICA, PM core fix related to the "freeze" sleep
state, four fixes for various types of breakage in cpufreq drivers, a
PNP workaround for a wrong memory region size in ACPI tables, and a
fix and cleanup for the ACPI tools Makefile.
Specifics:
- Fix for broken ACPI notifications on some systems caused by a
recent ACPI hotplug commit that blocked the propagation of unknown
type notifications to device drivers inadvertently.
- intel_idle fix to make the IvyTown C-states handling (added
recently) work as intended which now is broken due to missing
braces. From Christoph Jaeger.
- ACPICA fix to make it allocate buffers of the right sizes for the
Generic Serial Bus operation region access. From Lv Zheng.
- PM core fix unblocking cpuidle before entering the "freeze" sleep
state which causes that state to be able to actually save more
energy than runtime idle.
- Configuration and build fixes for the highbank and powernv cpufreq
drivers from Kefeng Wang and Srivatsa S Bhat.
- Coccinelle warning fix related to error pointers for the unicore32
cpufreq driver from Duan Jiong.
- Integer overflow fix for the ppc-corenet cpufreq driver from Geert
Uytterhoeven.
- Workaround for BIOSes that don't report the entire Intel MCH area
in their ACPI tables from Bjorn Helgaas.
- ACPI tools Makefile fix and cleanup from Thomas Renninger"
* tag 'pm+acpi-3.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI / notify: Do not block unknown type notifications in root handler
PNP: Work around BIOS defects in Intel MCH area reporting
cpufreq: highbank: fix ARM_HIGHBANK_CPUFREQ dependency warning
cpufreq: ppc: Fix integer overflow in expression
cpufreq, powernv: Fix build failure on UP
cpufreq: unicore32: replace IS_ERR and PTR_ERR with PTR_ERR_OR_ZERO
PM / suspend: Make cpuidle work in the "freeze" state
intel_idle: fix IVT idle state table setting
ACPICA: Fix buffer allocation issue for generic_serial_bus region accesses.
tools/power/acpi: Minor bugfixes
Chris Mason [Sat, 26 Apr 2014 12:02:03 +0000 (05:02 -0700)]
Btrfs: limit the path size in send to PATH_MAX
fs_path_ensure_buf is used to make sure our path buffers for
send are big enough for the path names as we construct them.
The buffer size is limited to 32K by the length field in
the struct.
But bugs in the path construction can end up trying to build
a huge buffer, and we'll do invalid memmmoves when the
buffer length field wraps.
This patch is step one, preventing the overflows.
Signed-off-by: Chris Mason <clm@fb.com>
Linus Torvalds [Fri, 25 Apr 2014 23:05:40 +0000 (16:05 -0700)]
mm: split 'tlb_flush_mmu()' into tlb flushing and memory freeing parts
The mmu-gather operation 'tlb_flush_mmu()' has done two things: the
actual tlb flush operation, and the batched freeing of the pages that
the TLB entries pointed at.
This splits the operation into separate phases, so that the forced
batched flushing done by zap_pte_range() can now do the actual TLB flush
while still holding the page table lock, but delay the batched freeing
of all the pages to after the lock has been dropped.
This in turn allows us to avoid a race condition between
set_page_dirty() (as called by zap_pte_range() when it finds a dirty
shared memory pte) and page_mkclean(): because we now flush all the
dirty page data from the TLB's while holding the pte lock,
page_mkclean() will be held up walking the (recently cleaned) page
tables until after the TLB entries have been flushed from all CPU's.
Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Tested-by: Dave Hansen <dave.hansen@intel.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King - ARM Linux <linux@arm.linux.org.uk>
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rafael J. Wysocki [Fri, 25 Apr 2014 22:40:25 +0000 (00:40 +0200)]
Merge branches 'pnp' and 'acpi-hotplug'
* pnp:
PNP: Work around BIOS defects in Intel MCH area reporting
* acpi-hotplug:
ACPI / notify: Do not block unknown type notifications in root handler
Linus Torvalds [Fri, 25 Apr 2014 20:12:25 +0000 (13:12 -0700)]
Merge tag 'hwmon-for-linus' of git://git./linux/kernel/git/groeck/linux-staging
Pull hwmon fixes from Guenter Roeck:
- ltc2945: Don't unecessarily crash kernel on implementation error
- vexpress: Fix 'name' and 'label' attributes
* tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (ltc2945) Don't crash the kernel unnecessarily
hwmon: (vexpress) Avoid creating non-existing attributes
hwmon: (vexpress) Use legal hwmon device names
Linus Torvalds [Fri, 25 Apr 2014 20:07:24 +0000 (13:07 -0700)]
Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"This is a set of seven fixes, three (hpsa) and free'd command
references correcting bugs in the last round of updates and the
remaining four correcting problems within the SCSI error handler that
was causing a deadlock within USB"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
[SCSI] More USB deadlock fixes
[SCSI] Fix USB deadlock caused by SCSI error handling
[SCSI] Fix command result state propagation
[SCSI] Fix spurious request sense in error handling
[SCSI] don't reference freed command in scsi_prep_return
[SCSI] don't reference freed command in scsi_init_sgtable
[SCSI] hpsa: fix NULL dereference in hpsa_put_ctlr_into_performant_mode()
Linus Torvalds [Fri, 25 Apr 2014 20:02:02 +0000 (13:02 -0700)]
Merge tag 'fixes-3.15-rc3' of git://git./linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Arnd Bergmann:
"Since we didn't get around to collect fixes in time for -rc2 over the
easter vacation, this one is unfortunately a bit larger than we'd like
for an -rc3 merge.
A large set of the changes is in the device tree sources, so I'm
splitting out the description between code changes and DT changes.
Aside from omap and versatile express, the actual code bugs are and
trivial. Here is an overview:
imx:
- fix video clock settings
- fix one clock refcounting bug
omap:
- update defconfig for renamed USB PHY driver
- fix error handling in gpmc
- fix N900 video initialization regression
- fix reression in hwmod code from missing braces
- fix am43xx and omap3 clocks
- remove bogus write to voltage control register
pxa:
- fix build regression from 3.13 header cleanup
rockchip:
- fix a misleading printk string
shmobile:
- fix incorrect sound setting on multiple machines
spear:
- remove incorrect __init section annotation
tegra:
- remove a stale Kconfig entry
u300:
- update defconfig
ux500:
- enable common wireless and sensor drivers in defconfig
- more defconfig updates
vexpress:
- fix voltage calculation for opp
- fix reboot hang and warning
- fix out-of-bounds array access
- improve error handling in clock driver
overall:
- always select CLKSRC_OF in multiplatform builds
And these are the devicetree related changes:
imx:
- add missing #clock-cell properties
- fix pinctrl setting in imx6sl-evk
- fix video endpoint on imx53
- remove obsolete lvds-channel nodes (multiple patches)
- add missing second stmpe node
- fix usb host mode on dmo-edmqmx6 (multiple patches)
- fix gic node #address-cells to match usage
- add missing legacy IRQ map for PCIe
- fix microsom pincontrol setting for rgmii
- fix fatal typo in touchscreen DT usage for mx5
- list all RAM present on m53evk and mx53qsb
omap:
- fix bug in DT handling of gpmc external bus
- add DT for older revision of beagleboard
- fix regression after DT node name fixes
- remove obsolete properties for gpmc
- fix pinmux comment to match DT it refers to
- fix newly added dra7xx clock node data
- add missing clock for USB PHY
mvebu:
- add missing clock for mdio node
- fix nonstandard vendor prefixes on i2c nodes
rockchip:
- fix pin control setting for uart
shmobile:
- fix typo in DT data for pin control (multiple patches)
- fix gic node #address-cells to match usage
tegra:
- fix clock and uart DT representation to match hardware
zynq:
- add DT nodes for newly added driver
- add DT properties required for cpufreq-ondemand
overall:
- restore alphabetic order in Makefile
- grammar fixes in bindings"
* tag 'fixes-3.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (66 commits)
ARM: vexpress/TC2: Convert OPP voltage to uV before storing
power/reset: vexpress: Fix restart/power off operation
dt: tegra: remove non-existent clock IDs
clk: tegra: remove non-existent clocks
ARM: tegra: remove UART5/UARTE from tegra124.dtsi
ARM: tegra: remove TEGRA_EMC_SCALING_ENABLE
ARM: Tidy up DTB Makefile entries
ARM: fix missing CLKSRC_OF on multi-platform
ARM: spear: add __init to spear_clocksource_init()
ARM: pxa: hx4700.h: include "irqs.h" for PXA_NR_BUILTIN_GPIO
arm/mach-vexpress: array accessed out of bounds
clk: vexpress: NULL dereference on error path
ARM: OMAP2+: Fix GPMC remap for devices using an offset
ARM: zynq: dt: Add I2C nodes to Zynq device tree
ARM: zynq: DT: Add 'clock-latency' property
ARM: OMAP2+: Fix oops for GPMC free
ARM: dts: Add support for the BeagleBoard xM A/B
ARM: dts: Grammar /that will/it will/
ARM: dts: Grammar /is uses/ is used/
ARM: OMAP2+: Fix config name for USB3 PHY
...
Linus Torvalds [Fri, 25 Apr 2014 19:40:32 +0000 (12:40 -0700)]
Merge tag 'locks-v3.15-2' of git://git.samba.org/jlayton/linux
Pull file locking fixes from Jeff Layton:
"File locking related bugfixes for v3.15 (pile #2)
- fix for a long-standing bug in __break_lease that can cause soft
lockups
- renaming of file-private locks to "open file description" locks,
and the command macros to more visually distinct names
The fix for __break_lease is also in the pile of patches for which
Bruce sent a pull request, but I assume that your merge procedure will
handle that correctly.
For the other patches, I don't like the fact that we need to rename
this stuff at this late stage, but it should be settled now
(hopefully)"
* tag 'locks-v3.15-2' of git://git.samba.org/jlayton/linux:
locks: rename FL_FILE_PVT and IS_FILE_PVT to use "*_OFDLCK" instead
locks: rename file-private locks to "open file description locks"
locks: allow __break_lease to sleep even when break_time is 0
Linus Torvalds [Fri, 25 Apr 2014 19:39:05 +0000 (12:39 -0700)]
Merge branch 'for-3.15' of git://linux-nfs.org/~bfields/linux
Pull nfsd bugfixes from Bruce Fields:
"Three small nfsd bugfixes (including one locks.c fix for a bug
triggered only from nfsd).
Jeff's patches are for long-existing problems that became easier to
trigger since the addition of vfs delegation support"
* 'for-3.15' of git://linux-nfs.org/~bfields/linux:
Revert "nfsd4: fix nfs4err_resource in 4.1 case"
nfsd: set timeparms.to_maxval in setup_callback_client
locks: allow __break_lease to sleep even when break_time is 0
Christian Borntraeger [Wed, 23 Apr 2014 18:58:45 +0000 (20:58 +0200)]
s390/ccwgroup: Fix memory corruption
commit
0b60f9ead5d4816e7e3d6e28f4a0d22d4a1b2513 (s390: use
device_remove_file_self() instead of device_schedule_callback())
caused random memory corruption on my s390 box. Turns out that the
last element of the ccwgroup structure is of dynamic size, so we
must move the newly introduced work structure _before_ the zero
length array.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
CC: Tejun Heo <tj@kernel.org>
CC: Martin Schwidefsky <schwidefsky@de.ibm.com>
CC: Heiko Carstens <heiko.carstens@de.ibm.com>
CC: Sebastian Ott <sebott@linux.vnet.ibm.com>
CC: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Tejun Heo [Sun, 20 Apr 2014 12:29:21 +0000 (08:29 -0400)]
kernfs: add back missing error check in kernfs_fop_mmap()
While updating how mmap enabled kernfs files are handled by lockdep,
9b2db6e18945 ("sysfs: bail early from kernfs_file_mmap() to avoid
spurious lockdep warning") inadvertently dropped error return check
from kernfs_file_mmap(). The intention was just dropping "if
(ops->mmap)" check as the control won't reach the point if the mmap
callback isn't implemented, but I mistakenly removed the error return
check together with it.
This led to Xorg crash on i810 which was reported and bisected to the
commit and then to the specific change by Tobias.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-and-bisected-by: Tobias Powalowski <tobias.powalowski@googlemail.com>
Tested-by: Tobias Powalowski <tobias.powalowski@googlemail.com>
References: http://lkml.kernel.org/g/
533D01BD.
1010200@googlemail.com
Cc: stable <stable@vger.kernel.org> # 3.14
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jianyu Zhan [Thu, 17 Apr 2014 09:52:10 +0000 (17:52 +0800)]
kernfs: fix a subdir count leak
Currently kernfs_link_sibling() increates parent->dir.subdirs before
adding the node into parent's chidren rb tree.
Because it is possible that kernfs_link_sibling() couldn't find
a suitable slot and bail out, this leads to a mismatch between
elevated subdir count with actual children node numbers.
This patches fix this problem, by moving the subdir accouting
after the actual addtion happening.
Signed-off-by: Jianyu Zhan <nasa4836@gmail.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Bjørn Mork [Fri, 25 Apr 2014 16:49:20 +0000 (18:49 +0200)]
usb: option: add and update a number of CMOTech devices
A number of older CMOTech modems are based on Qualcomm
chips. The blacklisted interfaces are QMI/wwan.
Reported-by: Lars Melin <larsm17@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Bjørn Mork [Fri, 25 Apr 2014 16:49:19 +0000 (18:49 +0200)]
usb: option: add Alcatel L800MA
Device interface layout:
0: ff/ff/ff - serial
1: ff/00/00 - serial AT+PPP
2: ff/ff/ff - QMI/wwan
3: 08/06/50 - storage
Cc: <stable@vger.kernel.org>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Bjørn Mork [Fri, 25 Apr 2014 16:49:18 +0000 (18:49 +0200)]
usb: option: add Olivetti Olicard 500
Device interface layout:
0: ff/ff/ff - serial
1: ff/ff/ff - serial AT+PPP
2: 08/06/50 - storage
3: ff/ff/ff - serial
4: ff/ff/ff - QMI/wwan
Cc: <stable@vger.kernel.org>
Reported-by: Julio Araujo <julio.araujo@wllctel.com.br>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Bjørn Mork [Fri, 25 Apr 2014 16:49:17 +0000 (18:49 +0200)]
usb: qcserial: add Sierra Wireless MC7305/MC7355
Cc: <stable@vger.kernel.org>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Bjørn Mork [Fri, 25 Apr 2014 16:49:16 +0000 (18:49 +0200)]
usb: qcserial: add Sierra Wireless MC73xx
Cc: <stable@vger.kernel.org>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Bjørn Mork [Fri, 25 Apr 2014 16:49:15 +0000 (18:49 +0200)]
usb: qcserial: add Sierra Wireless EM7355
Cc: <stable@vger.kernel.org>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Chanho Min [Mon, 14 Apr 2014 07:38:53 +0000 (08:38 +0100)]
arm64: init: Move of_clk_init to time_init
Clock providers should be initialized before clocksource_of_init.
If not, Clock source initialization can be fail to get the clock.
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Chanho Min <chanho.min@lge.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Johan Hovold [Fri, 25 Apr 2014 13:23:03 +0000 (15:23 +0200)]
USB: io_ti: fix firmware download on big-endian machines
During firmware download the device expects memory addresses in
big-endian byte order. As the wIndex parameter which hold the address is
sent in little-endian byte order regardless of host byte order, we need
to use swab16 rather than cpu_to_be16.
Also make sure to handle the struct ti_i2c_desc size parameter which is
returned in little-endian byte order.
Reported-by: Ludovic Drolez <ldrolez@debian.org>
Tested-by: Ludovic Drolez <ldrolez@debian.org>
Cc: stable@vger.kernel.org
Signed-off-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
David Cohen [Fri, 25 Apr 2014 16:20:16 +0000 (19:20 +0300)]
usb/xhci: fix compilation warning when !CONFIG_PCI && !CONFIG_PM
When CONFIG_PCI and CONFIG_PM are not selected, xhci.c gets this
warning:
drivers/usb/host/xhci.c:409:13: warning: ‘xhci_msix_sync_irqs’ defined
but not used [-Wunused-function]
Instead of creating nested #ifdefs, this patch fixes it by defining the
xHCI PCI stubs as inline.
This warning has been in since 3.2 kernel and was
caused by commit
421aa841a134f6a743111cf44d0c6d3b45e3cf8c
"usb/xhci: hide MSI code behind PCI bars", but wasn't noticed
until 3.13 when a configuration with these options was tried
Signed-off-by: David Cohen <david.a.cohen@linux.intel.com>
Cc: stable@vger.kernel.org # 3.2
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Igor Gnatenko [Fri, 25 Apr 2014 16:20:15 +0000 (19:20 +0300)]
xhci: extend quirk for Renesas cards
After suspend another Renesas PCI-X USB 3.0 card doesn't work.
[root@fedora-20 ~]# lspci -vmnnd 1912:
Device: 03:00.0
Class: USB controller [0c03]
Vendor: Renesas Technology Corp. [1912]
Device: uPD720202 USB 3.0 Host Controller [0015]
SVendor: Renesas Technology Corp. [1912]
SDevice: uPD720202 USB 3.0 Host Controller [0015]
Rev: 02
ProgIf: 30
This patch should be applied to stable kernel 3.14 that contain
the commit
1aa9578c1a9450fb21501c4f549f5b1edb557e6d
"xhci: Fix resume issues on Renesas chips in Samsung laptops"
Reported-and-tested-by: Anatoly Kharchenko <rfr-bugs@yandex.ru>
Reference: http://redmine.russianfedora.pro/issues/1315
Signed-off-by: Igor Gnatenko <i.gnatenko.brain@gmail.com>
Cc: stable@vger.kernel.org # 3.14
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Denis Turischev [Fri, 25 Apr 2014 16:20:14 +0000 (19:20 +0300)]
xhci: Switch Intel Lynx Point ports to EHCI on shutdown.
The same issue like with Panther Point chipsets. If the USB ports are
switched to xHCI on shutdown, the xHCI host will send a spurious interrupt,
which will wake the system. Some BIOS have work around for this, but not all.
One example is Compulab's mini-desktop, the Intense-PC2.
The bug can be avoided if the USB ports are switched back to EHCI on
shutdown.
This patch should be backported to stable kernels as old as 3.12,
that contain the commit
638298dc66ea36623dbc2757a24fc2c4ab41b016
"xhci: Fix spurious wakeups after S5 on Haswell"
Signed-off-by: Denis Turischev <denis@compulab.co.il>
Cc: stable@vger.kernel.org
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Julius Werner [Fri, 25 Apr 2014 16:20:13 +0000 (19:20 +0300)]
usb: xhci: Prefer endpoint context dequeue pointer over stopped_trb
We have observed a rare cycle state desync bug after Set TR Dequeue
Pointer commands on Intel LynxPoint xHCs (resulting in an endpoint that
doesn't fetch new TRBs and thus an unresponsive USB device). It always
triggers when a previous Set TR Dequeue Pointer command has set the
pointer to the final Link TRB of a segment, and then another URB gets
enqueued and cancelled again before it can be completed. Further
investigation showed that the xHC had returned the Link TRB in the TRB
Pointer field of the Transfer Event (CC == Stopped -- Length Invalid),
but when xhci_find_new_dequeue_state() later accesses the Endpoint
Context's TR Dequeue Pointer field it is set to the first TRB of the
next segment.
The driver expects those two values to be the same in this situation,
and uses the cycle state of the latter together with the address of the
former. This should be fine according to the XHCI specification, since
the endpoint ring should be stopped when returning the Transfer Event
and thus should not advance over the Link TRB before it gets restarted.
However, real-world XHCI implementations apparently don't really care
that much about these details, so the driver should follow a more
defensive approach to try to work around HC spec violations.
This patch removes the stopped_trb variable that had been used to store
the TRB Pointer from the last Transfer Event of a stopped TRB. Instead,
xhci_find_new_dequeue_state() now relies only on the Endpoint Context,
requiring a small amount of additional processing to find the virtual
address corresponding to the TR Dequeue Pointer. Some other parts of the
function were slightly rearranged to better fit into this model.
This patch should be backported to kernels as old as 2.6.31 that contain
the commit
ae636747146ea97efa18e04576acd3416e2514f5 "USB: xhci: URB
cancellation support."
Signed-off-by: Julius Werner <jwerner@chromium.org>
Cc: stable@vger.kernel.org
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Leo Yan [Wed, 16 Apr 2014 12:26:35 +0000 (13:26 +0100)]
arm64: initialize spinlock for init_mm's context
ARM64 has defined the spinlock for init_mm's context, so need initialize
the spinlock structure; otherwise during the suspend flow it will dump
the info for spinlock's bad magic warning as below:
[ 39.084394] Disabling non-boot CPUs ...
[ 39.092871] BUG: spinlock bad magic on CPU#1, swapper/1/0
[ 39.092896] lock: init_mm+0x338/0x3e0, .magic:
00000000, .owner: <none>/-1, .owner_cpu: 0
[ 39.092907] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G O 3.10.33 #125
[ 39.092912] Call trace:
[ 39.092927] [<
ffffffc000087e64>] dump_backtrace+0x0/0x16c
[ 39.092934] [<
ffffffc000087fe0>] show_stack+0x10/0x1c
[ 39.092947] [<
ffffffc000765334>] dump_stack+0x1c/0x28
[ 39.092953] [<
ffffffc0007653b8>] spin_dump+0x78/0x88
[ 39.092960] [<
ffffffc0007653ec>] spin_bug+0x24/0x34
[ 39.092971] [<
ffffffc000300a28>] do_raw_spin_lock+0x98/0x17c
[ 39.092979] [<
ffffffc00076cf08>] _raw_spin_lock_irqsave+0x4c/0x60
[ 39.092990] [<
ffffffc000094044>] set_mm_context+0x1c/0x6c
[ 39.092996] [<
ffffffc0000941c8>] __new_context+0x94/0x10c
[ 39.093007] [<
ffffffc0000d63d4>] idle_task_exit+0x104/0x1b0
[ 39.093014] [<
ffffffc00008d91c>] cpu_die+0x14/0x74
[ 39.093021] [<
ffffffc000084f74>] arch_cpu_idle_dead+0x8/0x14
[ 39.093030] [<
ffffffc0000e7f18>] cpu_startup_entry+0x1ec/0x258
[ 39.093036] [<
ffffffc00008d810>] secondary_start_kernel+0x114/0x124
Signed-off-by: Leo Yan <leoy@marvell.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Will Deacon [Thu, 17 Apr 2014 11:37:14 +0000 (12:37 +0100)]
arm64: debug: remove noisy, pointless warning
Sending a SIGTRAP to a user task after execution of a BRK instruction at
EL0 is fundamental to the way in which software breakpoints work and
doesn't deserve a warning to be logged in dmesg. Whilst the warning can
be justified from EL1, do_debug_exception will already do the right thing,
so simply remove the code altogether.
Cc: Sandeepa Prabhu <sandeepa.prabhu@linaro.org>
Reported-by: Kyrylo Tkachov <kyrylo.tkachov@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Steve Capper [Thu, 24 Apr 2014 14:33:21 +0000 (15:33 +0100)]
arm64: mm: Add THP TLB entries to general mmu_gather
When arm64 moved over to the core mmu_gather, it lost the logic to
flush THP TLB entries (tlb_remove_pmd_tlb_entry was removed and the
core implementation only signals that the mmu_gather needs a flush).
This patch ensures that tlb_add_flush is called for THP TLB entries.
Signed-off-by: Steve Capper <steve.capper@linaro.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Sebastian Hesselbarth [Thu, 24 Apr 2014 21:58:30 +0000 (22:58 +0100)]
ARM: 8042/1: iwmmxt: allow to build iWMMXt on Marvell PJ4B
Some Marvell PJ4B CPUs also implement iWMMXt extensions. With a
proper check for iWMMXt coprocessors now in place, enable it by
default on PJ4B. While at it, also allow to manually select
the corresponding Kconfig option.
Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Tested-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Tested-by: Kevin Hilman <khilman@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Sebastian Hesselbarth [Thu, 24 Apr 2014 21:58:00 +0000 (22:58 +0100)]
ARM: 8041/1: pj4: fix cpu_is_pj4 check
Commit
fdb487f5c961b94486a78fa61fa28b8eff1954ab
("ARM: 8015/1: Add cpu_is_pj4 to distinguish PJ4 because it
has some differences with V7")
introduced a cpuid check for Marvell PJ4 processors to fix a
regression caused by adding PJ4 based Marvell Dove into
multi_v7.
Unfortunately, this check is too narrow to catch PJ4 used on
Dove itself and breaks iWMMXt support.
This patch therefore relaxes the cpuid mask to match both PJ4
and PJ4B. Also, rework the given comment about PJ4/PJ4B
modifications to be a little bit more specific about the
differences.
Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Tested-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Tested-by: Kevin Hilman <khilman@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Sebastian Hesselbarth [Thu, 24 Apr 2014 21:57:25 +0000 (22:57 +0100)]
ARM: 8040/1: pj4: properly detect existence of iWMMXt coprocessor
commit
fdb487f5c961b94486a78fa61fa28b8eff1954ab
("ARM: 8015/1: Add cpu_is_pj4 to distinguish PJ4 because it
has some differences with V7")
introduced a fix for checking PJ4 cpuid to not use PJ4 specific
coprocessor access on non-PJ4 platforms.
Unfortunately, this in turn broke Marvell Armada 370/XP, both
comprising Marvell PJ4B CPUs without iWMMXt extension. Instead
of only checking for cpuid, which may not be sufficient to
determine iWMMXt support, the presence of iWMMXt coprocessors
can be checked by enabling and reading the Coprocessor ID
register (wCID, register 0 of CP1).
Therefore this adds an explicit check for the presence and correct
wCID value, before enabling iWMMXt capabilities. As a bonus, also
print the iWMMXt version of a detected coprocessor.
This has been tested to properly detect iWMMXt presence/absence on:
- PJ4, CPUID 0x560f5815, wCID 0x56052001: Marvell Dove, iWMMXt v2
- PJ4B, CPUID 0x561f5811: Marvell Armada 370, no iWMMXt
- PJ4B, CPUID 0x562f5841, wCID 0x56052001: Marvell Armada 1500, iWMMXt v2
- PJ4B, CPUID 0x562f5842: Marvell Armada XP, no iWMMXt
Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Tested-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Tested-by: Kevin Hilman <khilman@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Sebastian Hesselbarth [Thu, 24 Apr 2014 21:56:43 +0000 (22:56 +0100)]
ARM: 8039/1: pj4: enable iWMMXt only if CONFIG_IWMMXT is set
This fixes PJ4 coprocessor init to only expose iWMMXt capabilities,
if the corresponding kernel support for iWMMXt is enabled.
Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Tested-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Tested-by: Kevin Hilman <khilman@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Sebastian Hesselbarth [Thu, 24 Apr 2014 21:54:58 +0000 (22:54 +0100)]
ARM: 8038/1: iwmmxt: explicitly check for supported architectures
iwmmxt.S requires special treatment of coprocessor access registers
for PJ4 and XScale-based CPUs. It only checks for CPU_PJ4 and drops
down to XScale-based treatment on all other architectures.
As some PJ4B also come with iWMMXt and also need PJ4 treatment,
rework the corresponding preprocessor directives to explicitly
check for supported architectures and fail on unsupported ones.
Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Tested-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Tested-by: Kevin Hilman <khilman@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Arnd Bergmann [Fri, 25 Apr 2014 09:22:20 +0000 (11:22 +0200)]
Merge tag 'zynq-dt-fixes-for-3.15' of git://git.xilinx.com/linux-xlnx into fixes
arm: Xilinx Zynq DT fixes for v3.15
- Enable Zynq I2c
- Fix cpufreq DT binding
* tag 'zynq-dt-fixes-for-3.15' of git://git.xilinx.com/linux-xlnx:
ARM: zynq: dt: Add I2C nodes to Zynq device tree
ARM: zynq: DT: Add 'clock-latency' property
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Filipe Manana [Thu, 24 Apr 2014 14:15:29 +0000 (15:15 +0100)]
Btrfs: correctly set profile flags on seqlock retry
If we had to retry on the profiles seqlock (due to a concurrent write), we
would set bits on the input flags that corresponded both to the current
profile and to previous values of the profile.
Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
Signed-off-by: Chris Mason <clm@fb.com>
Filipe Manana [Thu, 24 Apr 2014 14:15:28 +0000 (15:15 +0100)]
Btrfs: use correct key when repeating search for extent item
If skinny metadata is enabled and our first tree search fails to find a
skinny extent item, we may repeat a tree search for a "fat" extent item
(if the previous item in the leaf is not the "fat" extent we're looking
for). However we were not setting the new key's objectid to the right
value, as we previously used the same key variable to peek at the previous
item in the leaf, which has a different objectid. So just set the right
objectid to avoid modifying/deleting a wrong item if we repeat the tree
search.
Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
Signed-off-by: Chris Mason <clm@fb.com>
Miao Xie [Wed, 23 Apr 2014 11:33:36 +0000 (19:33 +0800)]
Btrfs: fix inode caching vs tree log
Currently, with inode cache enabled, we will reuse its inode id immediately
after unlinking file, we may hit something like following:
|->iput inode
|->return inode id into inode cache
|->create dir,fsync
|->power off
An easy way to reproduce this problem is:
mkfs.btrfs -f /dev/sdb
mount /dev/sdb /mnt -o inode_cache,commit=100
dd if=/dev/zero of=/mnt/data bs=1M count=10 oflag=sync
inode_id=`ls -i /mnt/data | awk '{print $1}'`
rm -f /mnt/data
i=1
while [ 1 ]
do
mkdir /mnt/dir_$i
test1=`stat /mnt/dir_$i | grep Inode: | awk '{print $4}'`
if [ $test1 -eq $inode_id ]
then
dd if=/dev/zero of=/mnt/dir_$i/data bs=1M count=1 oflag=sync
echo b > /proc/sysrq-trigger
fi
sleep 1
i=$(($i+1))
done
mount /dev/sdb /mnt
umount /dev/sdb
btrfs check /dev/sdb
We fix this problem by adding unlinked inode's id into pinned tree,
and we can not reuse them until committing transaction.
Cc: stable@vger.kernel.org
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>
Wang Shilong [Wed, 23 Apr 2014 11:33:35 +0000 (19:33 +0800)]
Btrfs: fix possible memory leaks in open_ctree()
Fix possible memory leaks in the following error handling paths:
read_tree_block()
btrfs_recover_log_trees
btrfs_commit_super()
btrfs_find_orphan_roots()
btrfs_cleanup_fs_roots()
Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>
Wang Shilong [Wed, 23 Apr 2014 11:33:34 +0000 (19:33 +0800)]
Btrfs: avoid triggering bug_on() when we fail to start inode caching task
When running stress test(including snapshots,balance,fstress), we trigger
the following BUG_ON() which is because we fail to start inode caching task.
[ 181.131945] kernel BUG at fs/btrfs/inode-map.c:179!
[ 181.137963] invalid opcode: 0000 [#1] SMP
[ 181.217096] CPU: 11 PID: 2532 Comm: btrfs Not tainted 3.14.0 #1
[ 181.240521] task:
ffff88013b621b30 ti:
ffff8800b6ada000 task.ti:
ffff8800b6ada000
[ 181.367506] Call Trace:
[ 181.371107] [<
ffffffffa036c1be>] btrfs_return_ino+0x9e/0x110 [btrfs]
[ 181.379191] [<
ffffffffa038082b>] btrfs_evict_inode+0x46b/0x4c0 [btrfs]
[ 181.387464] [<
ffffffff810b5a70>] ? autoremove_wake_function+0x40/0x40
[ 181.395642] [<
ffffffff811dc5fe>] evict+0x9e/0x190
[ 181.401882] [<
ffffffff811dcde3>] iput+0xf3/0x180
[ 181.408025] [<
ffffffffa03812de>] btrfs_orphan_cleanup+0x1ee/0x430 [btrfs]
[ 181.416614] [<
ffffffffa03a6abd>] btrfs_mksubvol.isra.29+0x3bd/0x450 [btrfs]
[ 181.425399] [<
ffffffffa03a6cd6>] btrfs_ioctl_snap_create_transid+0x186/0x190 [btrfs]
[ 181.435059] [<
ffffffffa03a6e3b>] btrfs_ioctl_snap_create_v2+0xeb/0x130 [btrfs]
[ 181.444148] [<
ffffffffa03a9656>] btrfs_ioctl+0xf76/0x2b90 [btrfs]
[ 181.451971] [<
ffffffff8117e565>] ? handle_mm_fault+0x475/0xe80
[ 181.459509] [<
ffffffff8167ba0c>] ? __do_page_fault+0x1ec/0x520
[ 181.467046] [<
ffffffff81185b35>] ? do_mmap_pgoff+0x2f5/0x3c0
[ 181.474393] [<
ffffffff811d4da8>] do_vfs_ioctl+0x2d8/0x4b0
[ 181.481450] [<
ffffffff811d5001>] SyS_ioctl+0x81/0xa0
[ 181.488021] [<
ffffffff81680b69>] system_call_fastpath+0x16/0x1b
We should avoid triggering BUG_ON() here, instead, we output warning messages
and clear inode_cache option.
Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>
Wang Shilong [Wed, 23 Apr 2014 11:33:33 +0000 (19:33 +0800)]
Btrfs: move btrfs_{set,clear}_and_info() to ctree.h
Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>
David Sterba [Tue, 15 Apr 2014 16:50:17 +0000 (18:50 +0200)]
btrfs: replace error code from btrfs_drop_extents
There's a case which clone does not handle and used to BUG_ON instead,
(testcase xfstests/btrfs/035), now returns EINVAL. This error code is
confusing to the ioctl caller, as it normally signifies errorneous
arguments.
Change it to ENOPNOTSUPP which allows a fall back to copy instead of
clone. This does not affect the common reflink operation.
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
Qu Wenruo [Tue, 15 Apr 2014 02:41:00 +0000 (10:41 +0800)]
btrfs: Change the hole range to a more accurate value.
Commit
3ac0d7b96a268a98bd474cab8bce3a9f125aaccf fixed the btrfs expanding
write problem but the hole punched is sometimes too large for some
iovec, which has unmapped data ranges.
This patch will change to hole range to a more accurate value using the
counts checked by the write check routines.
Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>
Thomas Pfaff [Wed, 23 Apr 2014 10:33:22 +0000 (12:33 +0200)]
serial_core: fix uart PORT_UNKNOWN handling
While porting a RS485 driver from 2.6.29 to 3.14, i noticed that the serial tty
driver could break it by using uart ports that it does not own :
1. uart_change_pm ist called during uart_open and calls the uart pm function
without checking for PORT_UNKNOWN.
The fix is to move uart_change_pm from uart_open to uart_port_startup.
2. The return code from the uart request_port call in uart_set_info is not
handled properly, leading to the situation that the serial driver also
thinks it owns the uart ports.
This can triggered by doing following actions :
setserial /dev/ttyS0 uart none # release the uart ports
modprobe lirc-serial # or any other device that uses the uart
setserial /dev/ttyS0 uart 16550 # gives no error and the uart tty driver
# can use the ports as well
Signed-off-by: Thomas Pfaff <tpfaff@pcs.com>
Reviewed-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Doug Anderson [Mon, 21 Apr 2014 16:40:36 +0000 (09:40 -0700)]
serial: samsung: Change barrier() to cpu_relax() in console output
The two functions to write out to the console (one used in normal
console mode and one in polling console mode) were slightly different.
One used a barrier() in its loop and the other a cpu_relax(). The
barrier() really doesn't do anything since we're using rd_regl() to
read the port anyway. Switch it to cpu_relax() to make things
consistent.
No known bugs / issues are fixed by this change--it just makes things
more consistent.
Signed-off-by: Doug Anderson <dianders@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Doug Anderson [Mon, 21 Apr 2014 16:40:35 +0000 (09:40 -0700)]
serial: samsung: don't check config for every character
The s3c24xx_serial_console_putchar() is _only_ ever used by
s3c24xx_serial_console_write() and is called in a loop (indirectly
through uart_console_write()). There's no reason to call
s3c24xx_port_configured() for every iteration through the loop. Move
it outside the loop.
Signed-off-by: Doug Anderson <dianders@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Doug Anderson [Mon, 21 Apr 2014 16:40:34 +0000 (09:40 -0700)]
serial: samsung: Use the passed in "port", fixing kgdb w/ no console
The two functions in the samsung serial driver used for writing
characters out to the port were inconsistent about whether they used
the passed in "port" or the global "cons_uart". There was no reason
to use the global and the use of the global in
s3c24xx_serial_put_poll_char() caused a crash in the case where you
used the serial port for kgdboc but not for console.
Fix it so we used the passed in variable.
Note that this doesn't fix all problems with the samsung serial
driver. Specifically:
* s3c24xx_serial_console_putchar() is still 99% identical to
s3c24xx_serial_put_poll_char() (the function signature is different,
but that's about it). A future patch will make them slightly less
identical and judging by other serial drivers we may need yet more
differences eventually.
* The samsung serial driver still doesn't allow you to have more than
one console port since it still uses the global cons_uart in
s3c24xx_serial_console_write().
Signed-off-by: Doug Anderson <dianders@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Loic Poulain [Thu, 24 Apr 2014 09:34:48 +0000 (11:34 +0200)]
serial: 8250: Fix thread unsafe __dma_tx_complete function
__dma_tx_complete is not protected against concurrent
call of serial8250_tx_dma. it can lead to circular tail
index corruption or parallel call of serial_tx_dma on the
same data portion.
This patch fixes this issue by holding the port lock.
Signed-off-by: Loic Poulain <loic.poulain@intel.com>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Loic Poulain [Thu, 24 Apr 2014 09:38:56 +0000 (11:38 +0200)]
8250_core: Fix unwanted TX chars write
On transmit-hold-register empty, serial8250_tx_chars
should be called only if we don't use DMA.
DMA has its own tx cycle.
Signed-off-by: Loic Poulain <loic.poulain@intel.com>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Manfred Schlaegl [Tue, 8 Apr 2014 12:42:04 +0000 (14:42 +0200)]
tty: Fix race condition between __tty_buffer_request_room and flush_to_ldisc
The race was introduced while development of linux-3.11 by
e8437d7ecbc50198705331449367d401ebb3181f and
e9975fdec0138f1b2a85b9624e41660abd9865d4.
Originally it was found and reproduced on linux-3.12.15 and
linux-3.12.15-rt25, by sending 500 byte blocks with 115kbaud to the
target uart in a loop with 100 milliseconds delay.
In short:
1. The consumer flush_to_ldisc is on to remove the head tty_buffer.
2. The producer adds a number of bytes, so that a new tty_buffer must
be allocated and added by __tty_buffer_request_room.
3. The consumer removes the head tty_buffer element, without handling
newly committed data.
Detailed example:
* Initial buffer:
* Head, Tail -> 0: used=250; commit=250; read=240; next=NULL
* Consumer: ''flush_to_ldisc''
* consumed 10 Byte
* buffer:
* Head, Tail -> 0: used=250; commit=250; read=250; next=NULL
{{{
count = head->commit - head->read; // count = 0
if (!count) { // enter
// INTERRUPTED BY PRODUCER ->
if (head->next == NULL)
break;
buf->head = head->next;
tty_buffer_free(port, head);
continue;
}
}}}
* Producer: tty_insert_flip_... 10 bytes + tty_flip_buffer_push
* buffer:
* Head, Tail -> 0: used=250; commit=250; read=250; next=NULL
* added 6 bytes: head-element filled to maximum.
* buffer:
* Head, Tail -> 0: used=256; commit=250; read=250; next=NULL
* added 4 bytes: __tty_buffer_request_room is called
* buffer:
* Head -> 0: used=256; commit=256; read=250; next=1
* Tail -> 1: used=4; commit=0; read=250 next=NULL
* push (tty_flip_buffer_push)
* buffer:
* Head -> 0: used=256; commit=256; read=250; next=1
* Tail -> 1: used=4; commit=4; read=250 next=NULL
* Consumer
{{{
count = head->commit - head->read;
if (!count) {
// INTERRUPTED BY PRODUCER <-
if (head->next == NULL) // -> no break
break;
buf->head = head->next;
tty_buffer_free(port, head);
// ERROR: tty_buffer head freed -> 6 bytes lost
continue;
}
}}}
This patch reintroduces a spin_lock to protect this case. Perhaps later
a lock-less solution could be found.
Signed-off-by: Manfred Schlaegl <manfred.schlaegl@gmx.at>
Cc: stable <stable@vger.kernel.org> # 3.11
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>