Bob Peterson [Thu, 27 Jun 2013 16:47:51 +0000 (12:47 -0400)]
GFS2: Reserve journal space for quota change in do_grow
If a GFS2 file system is mounted with quotas and a file is grown
in such a way that its free blocks for the allocation are represented
in a secondary bitmap, GFS2 ran out of blocks in the transaction.
That resulted in "fatal: assertion "tr->tr_num_buf <= tr->tr_blocks".
This patch reserves extra blocks for the quota change so the
transaction has enough space.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Abhijith Das [Wed, 19 Jun 2013 21:03:29 +0000 (17:03 -0400)]
GFS2: Fix fstrim boundary conditions
This patch correctly distinguishes two boundary conditions:
1. When the given range is entire within the unaccounted space between
two rgrps, and
2. The range begins beyond the end of the filesystem
Also fix the unit of the returned value r.len (total trimming) to be in bytes
instead of the (incorrect) 512 byte blocks
With this patch, GFS2 passes multiple iterations of all the relevant xfstests
(251, 260, 288) with different fs block sizes.
Signed-off-by: Abhi Das <adas@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Benjamin Marzinski [Wed, 19 Jun 2013 18:18:59 +0000 (13:18 -0500)]
GFS2: fix warning message
This patch fixes a warning message introduced in the recent
"GFS2: aggressively issue revokes in gfs2_log_flush" patch.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Benjamin Marzinski [Fri, 14 Jun 2013 16:38:29 +0000 (11:38 -0500)]
GFS2: aggressively issue revokes in gfs2_log_flush
This patch looks at all the outstanding blocks in all the transactions
on the log, and moves the completed ones to the ail2 list. Then it
issues revokes for these blocks. This will hopefully speed things up
in situations where there is a lot of contention for glocks, especially
if they are acquired serially.
revoke_lo_before_commit will issue at most one log block's full of these
preemptive revokes. The amount of reserved log space that
gfs2_log_reserve() ignores has been incremented to allow for this extra
block.
This patch also consolidates the common revoke instructions into one
function, gfs2_add_revoke().
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Bob Peterson [Fri, 14 Jun 2013 12:39:18 +0000 (08:39 -0400)]
GFS2: fix regression in dir_double_exhash
Recent commit
e8830d8 introduced a bug in function dir_double_exhash;
it was failing to set h in the fall-back case. This patch corrects it.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Steven Whitehouse [Fri, 14 Jun 2013 10:17:15 +0000 (11:17 +0100)]
GFS2: Add atomic_open support
I've restricted atomic_open to only operate on regular files, although
I still don't understand why atomic_open should not be possible also for
directories on GFS2. That can always be added in later though, if it
makes sense.
The ->atomic_open function can be passed negative dentries, which
in most cases means either ENOENT (->lookup) or a call to d_instantiate
(->create). In the GFS2 case though, we need to actually perform the
look up, since we do not know whether there has been a new inode created
on another node. The look up calls d_splice_alias which then tries to
rehash the dentry - so the solution here is to simply check for that
in d_splice_alias. The same issue is likely to affect any other cluster
filesystem implementing ->atomic_open
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: "J. Bruce Fields" <bfields fieldses org>
Cc: Jeff Layton <jlayton@redhat.com>
Steven Whitehouse [Tue, 11 Jun 2013 12:45:29 +0000 (13:45 +0100)]
GFS2: Only do one directory search on create
Creation of a new inode requires a directory search in order to ensure
that we are not trying to create an inode with the same name as an
existing one. This was hidden away inside the create_ok() function.
In the case that there was an existing inode, and a lookup can be
substituted for a create (which is the case with regular files
when the O_EXCL flag is not in use) then we were doing a second
lookup in order to return the inode.
This patch merges these two lookups into one. This can be done by
passing a flag to gfs2_dir_search() to tell it to just return -EEXIST
in the cases where we don't actually want to look up the inode.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Alexey Khoroshilov [Wed, 5 Jun 2013 21:29:04 +0000 (01:29 +0400)]
GFS2: fix error propagation in init_threads()
If kthread_run() fails, init_threads() returns
IS_ERR(p) instead of PTR_ERR(p).
Found by Linux Driver Verification project (linuxtesting.org).
Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Steven Whitehouse [Mon, 3 Jun 2013 10:12:59 +0000 (11:12 +0100)]
GFS2: Remove no-op wrapper function
This wrapper function is no longer required, so get rid of it.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Thomas Meyer [Sat, 1 Jun 2013 09:52:43 +0000 (11:52 +0200)]
GFS2: Cocci spatch "ptr_ret.spatch"
Use PTR_RET in place of open coding this function.
Signed-off-by: Thomas Meyer <thomas@m3y3r.de>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Bob Peterson [Tue, 21 May 2013 14:54:04 +0000 (10:54 -0400)]
GFS2: Eliminate gfs2_rg_lops
With recent changes to the transactions, it appears that we
are no longer using the "log ops" for resource groups. Since the
log commit code processes the array of log ops, eliminating this
should be marginally better for performance. Therefore this patch
eliminates it.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Benjamin Marzinski [Tue, 7 May 2013 14:58:49 +0000 (09:58 -0500)]
GFS2: Sort buffer lists by inplace block number
This patch simply sort the data and metadata buffer lists by their
inplace block number. This makes gfs2_log_flush issue the inplace IO
in sequential order, which will hopefully speed up writing the IO
out to disk.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Linus Torvalds [Wed, 5 Jun 2013 00:13:06 +0000 (09:13 +0900)]
Merge tag 'mmc-fixes-for-3.10-rc5' of git://git./linux/kernel/git/cjb/mmc
Pull MMC fixes from Chris Ball:
- sdhci-acpi: Fix initial runtime PM status, add more ACPI IDs
- atmel-mci, omap_hsmmc: DT handling fixes
- esdhc-imx: Fix SDIO IRQs, fix multiblock reads (both h/w errata)
* tag 'mmc-fixes-for-3.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc:
mmc: omap_hsmmc: Skip platform_get_resource_byname() for dt case
mmc: omap_hsmmc: convert to dma_request_slave_channel_compat
mmc: omap_hsmmc: Fix the DT pbias workaround for MMC controllers 2 to 5
mmc: sdhci-pci: add more device ids
mmc: sdhci-acpi: add more device ids
mmc: sdhci-acpi: fix initial runtime pm status
mmc: atmel-mci: convert to dma_request_slave_channel_compat()
mmc: sdhci-esdhc-imx: fix multiblock reads on i.MX53
mmc: sdhci-esdhc-imx: Fix SDIO interrupts
Linus Torvalds [Wed, 5 Jun 2013 00:11:06 +0000 (09:11 +0900)]
Merge branch 'for-linus' of git://git./linux/kernel/git/dtor/input
Pull input fixes from Dmitry Torokhov:
"Just a 2 small driver fixups here"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: wacom - fix a typo for Cintiq 22HDT
Input: synaptics - fix sync lost after resume on some laptops
Linus Torvalds [Wed, 5 Jun 2013 00:09:35 +0000 (09:09 +0900)]
Merge branch 'fixes' of git://git./virt/kvm/kvm
Pull kvm bugfixes from Gleb Natapov:
"The bulk of the fixes is in MIPS KVM kernel<->userspace ABI. MIPS KVM
is new for 3.10 and some problems were found with current ABI. It is
better to fix them now and do not have a kernel with broken one"
* 'fixes' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: Fix race in apic->pending_events processing
KVM: fix sil/dil/bpl/spl in the mod/rm fields
KVM: Emulate multibyte NOP
ARM: KVM: be more thorough when invalidating TLBs
ARM: KVM: prevent NULL pointer dereferences with KVM VCPU ioctl
mips/kvm: Use ENOIOCTLCMD to indicate unimplemented ioctls.
mips/kvm: Fix ABI by moving manipulation of CP0 registers to KVM_{G,S}ET_ONE_REG
mips/kvm: Use ARRAY_SIZE() instead of hardcoded constants in kvm_arch_vcpu_ioctl_{s,g}et_regs
mips/kvm: Fix name of gpr field in struct kvm_regs.
mips/kvm: Fix ABI for use of 64-bit registers.
mips/kvm: Fix ABI for use of FPU.
Linus Torvalds [Wed, 5 Jun 2013 00:06:28 +0000 (09:06 +0900)]
Merge git://git./linux/kernel/git/steve/gfs2-3.0-fixes
Pull gfs2 fixes from Steven Whitehouse:
"There are four patches this time.
The first fixes a problem where the wrong descriptor type was being
written into the log for journaled data blocks.
The second fixes a race relating to the deallocation of allocator
data.
The third provides a fallback if kmalloc is unable to satisfy a
request to allocate a directory hash table.
The fourth fixes the iopen glock caching so that inodes are deleted in
a more timely manner after rmdir/unlink"
* git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-fixes:
GFS2: Don't cache iopen glocks
GFS2: Fall back to vmalloc if kmalloc fails for dir hash tables
GFS2: Increase i_writecount during gfs2_setattr_size
GFS2: Set log descriptor type for jdata blocks
Linus Torvalds [Wed, 5 Jun 2013 00:03:31 +0000 (09:03 +0900)]
Merge branch 'for-linus' of git://git./linux/kernel/git/mszeredi/fuse
Pull fuse fixes from Miklos Szeredi:
"One patch fixes an Oops introduced in 3.9 with the readdirplus
feature. The rest are fixes for async-dio in 3.10"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
fuse: fix alignment in short read optimization for async_dio
fuse: return -EIOCBQUEUED from fuse_direct_IO() for all async requests
fuse: fix readdirplus Oops in fuse_dentry_revalidate
fuse: update inode size and invalidate attributes on fallocate
fuse: truncate pagecache range on hole punch
fuse: allocate for_background dio requests based on io->async state
Linus Torvalds [Wed, 5 Jun 2013 00:02:09 +0000 (09:02 +0900)]
Merge branch 'next' of git://git.monstr.eu/linux-2.6-microblaze
Pull microblaze fixes from Michal Simek:
"One is fixing warning reported by sparse and the second warning was
reported by Geert in his build regressions/improvements status update
for -rc4."
* 'next' of git://git.monstr.eu/linux-2.6-microblaze:
microblaze: Use static inline functions in cacheflush.h
microblaze: Fix sparse warnings
Ping Cheng [Thu, 23 May 2013 17:22:33 +0000 (10:22 -0700)]
Input: wacom - fix a typo for Cintiq 22HDT
And make the lines easier to read.
Signed-off-by: Ping Cheng <pingc@wacom.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Eric Miao [Tue, 4 Jun 2013 16:30:55 +0000 (09:30 -0700)]
Input: synaptics - fix sync lost after resume on some laptops
In summary, the symptom is intermittent key events lost after resume
on some machines with synaptics touchpad (seems this is synaptics _only_),
and key events loss is due to serio port reconnect after psmouse sync lost.
Removing psmouse and inserting it back during the suspend/resume process
is able to work around the issue, so the difference between psmouse_connect()
and psmouse_reconnect() is the key to the root cause of this problem.
After comparing the two different paths, synaptics driver has its own
implementation of synaptics_reconnect(), and the missing psmouse_probe()
seems significant, the patch below added psmouse_probe() to the reconnect
process, and has been verified many times that the issue could not be reliably
reproduced.
There are two PS/2 commands in psmouse_probe():
1. PSMOUSE_CMD_GETID
2. PSMOUSE_CMD_RESET_DIS
Only the PSMOUSE_CMD_GETID seems to be significant. The
PSMOUSE_CMD_RESET_DIS is irrelevant to this issue after trying
several times. So we have only implemented this patch to issue
the PSMOUSE_CMD_GETID so far.
Tested-by: Daniel Manrique <daniel.manrique@canonical.com>
Signed-off-by: James M Leddy <james.leddy@canonical.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Linus Torvalds [Mon, 3 Jun 2013 21:34:51 +0000 (06:34 +0900)]
Merge tag 'regulator-v3.10-rc4' of git://git./linux/kernel/git/broonie/regulator
Pull regulator fixes from Mark Brown:
"A few small fixes for v3.10, documentation things in the core and a
few driver bugs."
* tag 'regulator-v3.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
regulator: palmas: Fix "enable_reg" to point to the correct reg for SMPS10
regulator: palmas: Fix incorrect condition
regulator: core: Correct spelling mistake in comment
regulator: dbx500: Make local symbol static
regulator: Fix kernel-doc generation warnings.
Linus Torvalds [Mon, 3 Jun 2013 21:33:44 +0000 (06:33 +0900)]
Merge tag 'jfs-3.10-rc5' of git://github.com/kleikamp/linux-shaggy
Pull jfs bugfixes from David Kleikamp:
"A couple jfs bug fixes for 3.10-rc5"
* tag 'jfs-3.10-rc5' of git://github.com/kleikamp/linux-shaggy:
fs/jfs: Add check if journaling to disk has been disabled in lbmRead()
jfs: Several bugs in jfs_freeze() and jfs_unfreeze()
Bob Peterson [Wed, 29 May 2013 15:51:52 +0000 (11:51 -0400)]
GFS2: Don't cache iopen glocks
This patch makes GFS2 immediately reclaim/delete all iopen glocks
as soon as they're dequeued. This allows deleters to get an
EXclusive lock on iopen so files are deleted properly instead of
being set as unlinked.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Bob Peterson [Thu, 30 May 2013 13:48:56 +0000 (09:48 -0400)]
GFS2: Fall back to vmalloc if kmalloc fails for dir hash tables
This version has one more correction: the vmalloc calls are replaced
by __vmalloc calls to preserve the GFP_NOFS flag.
When GFS2's directory management code allocates buffers for a
directory hash table, if it can't get the memory it needs, it
currently gives a bad return code. Rather than giving an error,
this patch allows it to use virtual memory rather than kernel
memory for the hash table. This should make it possible for
directories to function properly, even when kernel memory becomes
very fragmented.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Bob Peterson [Tue, 28 May 2013 14:04:44 +0000 (10:04 -0400)]
GFS2: Increase i_writecount during gfs2_setattr_size
This patch calls get_write_access in a few functions. This
merely increases inode->i_writecount for the duration of the function.
That will ensure that any file closes won't delete the inode's
multi-block reservation while the function is running.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Bob Peterson [Fri, 24 May 2013 19:02:49 +0000 (15:02 -0400)]
GFS2: Set log descriptor type for jdata blocks
This patch sets the log descriptor type according to whether the
journal commit is for (journaled) data or metadata. This was
recently broken when the functions to process data and metadata
log ops were combined.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Maxim Patlasov [Thu, 30 May 2013 12:41:34 +0000 (16:41 +0400)]
fuse: fix alignment in short read optimization for async_dio
The bug was introduced with async_dio feature: trying to optimize short reads,
we cut number-of-bytes-to-read to i_size boundary. Hence the following example:
truncate --size=300 /mnt/file
dd if=/mnt/file of=/dev/null iflag=direct
led to FUSE_READ request of 300 bytes size. This turned out to be problem
for userspace fuse implementations who rely on assumption that kernel fuse
does not change alignment of request from client FS.
The patch turns off the optimization if async_dio is disabled. And, if it's
enabled, the patch fixes adjustment of number-of-bytes-to-read to preserve
alignment.
Note, that we cannot throw out short read optimization entirely because
otherwise a direct read of a huge size issued on a tiny file would generate
a huge amount of fuse requests and most of them would be ACKed by userspace
with zero bytes read.
Signed-off-by: Maxim Patlasov <MPatlasov@parallels.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Brian Foster [Thu, 30 May 2013 19:35:50 +0000 (15:35 -0400)]
fuse: return -EIOCBQUEUED from fuse_direct_IO() for all async requests
If request submission fails for an async request (i.e.,
get_user_pages() returns -ERESTARTSYS), we currently skip the
-EIOCBQUEUED return and drop into wait_for_sync_kiocb() forever.
Avoid this by always returning -EIOCBQUEUED for async requests. If
an error occurs, the error is passed into fuse_aio_complete(),
returned via aio_complete() and thus propagated to userspace via
io_getevents().
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Maxim Patlasov <MPatlasov@parallels.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Miklos Szeredi [Mon, 3 Jun 2013 12:40:22 +0000 (14:40 +0200)]
fuse: fix readdirplus Oops in fuse_dentry_revalidate
Fix bug introduced by commit
4582a4ab2a "FUSE: Adapt readdirplus to application
usage patterns".
We need to check for a positive dentry; negative dentries are not added by
readdirplus. Secondly we need to advise the use of readdirplus on the *parent*,
otherwise the whole thing is useless. Thirdly all this is only relevant if
"readdirplus_auto" mode is selected by the filesystem.
We advise the use of readdirplus only if the dentry was still valid. If we had
to redo the lookup then there was no use in doing the -plus version.
Reported-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
CC: Feng Shuo <steve.shuo.feng@gmail.com>
CC: stable@vger.kernel.org
Michal Simek [Mon, 3 Jun 2013 09:30:04 +0000 (11:30 +0200)]
microblaze: Use static inline functions in cacheflush.h
Using static inline functions ensure proper type checking
which also remove compilation warning for no MMU
Compilation warning:
arch/microblaze/include/asm/cacheflush.h: warning: 'addr'
may be used uninitialized in this function [-Wmaybe-uninitialized]
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Linus Torvalds [Mon, 3 Jun 2013 09:09:42 +0000 (18:09 +0900)]
Merge branch 'for-linus' of git://git./linux/kernel/git/geert/linux-m68k
Pull m68k fix from Geert Uytterhoeven:
"A boot lock-up on Mac, also destined for stable"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
m68k/mac: Fix unexpected interrupt with CONFIG_EARLY_PRINTK
Linus Torvalds [Mon, 3 Jun 2013 09:04:07 +0000 (18:04 +0900)]
Merge branch 'for-linus' of git://git./linux/kernel/git/s390/linux
Pull s390 fixes from Martin Schwidefsky:
"Recent bug fixes, one of them touches a common code file.
It adds two #ifndef/#endif pairs to asm-generic/io.h to be able to
override xlate_dev_kmem_ptr and xlate_dev_mem_ptr."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/pgtable: Fix gmap notifier address
s390/dasd: fix handling of gone paths
s390/pgtable: Fix check for pgste/storage key handling
arch: s390: appldata: using strncpy() and strnlen() instead of sprintf()
s390/smp: lost IPIs on cpu hotplug
kernel: Fix s390 absolute memory access for /dev/mem
s390/dma: do not call debug_dma after free
Linus Torvalds [Mon, 3 Jun 2013 08:57:16 +0000 (17:57 +0900)]
Merge branch 'for-3.10-fixes' of git://git./linux/kernel/git/tj/cgroup
Pull cgroup fixes from Tejun Heo:
- Fix for yet another xattr bug which may lead to NULL deref.
- A subtle bug in for_each_descendant_pre(). This bug requires quite
specific conditions to trigger and isn't too likely to actually
happen in the wild, but maybe that just makes it that much more
nastier.
- A warning message added for silly cgroup re-mount (not -o remount,
but unmount followed by mount) behavior.
* 'for-3.10-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup: warn about mismatching options of a new mount of an existing hierarchy
cgroup: fix a subtle bug in descendant pre-order walk
cgroup: initialize xattr before calling d_instantiate()
Linus Torvalds [Mon, 3 Jun 2013 08:55:09 +0000 (17:55 +0900)]
Merge branch 'for-3.10-fixes' of git://git./linux/kernel/git/tj/libata
Pull libata changes from Tejun Heo:
"Nothing too interesting. PCI ID additions, some sata_rcar fixes and a
fringe bug fix for DMADIR handling which shouldn't affect any device
remotely modern."
* 'for-3.10-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
sata_rcar: fix interrupt handling
ahci: add an observed PCI ID for Marvell 88se9172 SATA controller
sata_rcar: clear STOP bit in bmdma_start() method
libata: make ata_exec_internal_sg honor DMADIR
ata_piix: add PCI IDs for Intel BayTail
libata: update "Maintained by:" tags
Michal Simek [Thu, 30 May 2013 13:10:52 +0000 (15:10 +0200)]
microblaze: Fix sparse warnings
arch/microblaze/include/asm/uaccess.h:101:3:
warning: cast removes address space of expression
arch/microblaze/include/asm/uaccess.h:107:2:
warning: cast removes address space of expression
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Gleb Natapov [Mon, 3 Jun 2013 08:30:02 +0000 (11:30 +0300)]
KVM: Fix race in apic->pending_events processing
apic->pending_events processing has a race that may cause INIT and
SIPI
processing to be reordered:
vpu0: vcpu1:
set INIT
test_and_clear_bit(KVM_APIC_INIT)
process INIT
set INIT
set SIPI
test_and_clear_bit(KVM_APIC_SIPI)
process SIPI
At the end INIT is left pending in pending_events. The following patch
fixes this by latching pending event before processing them.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Paolo Bonzini [Thu, 30 May 2013 14:35:55 +0000 (16:35 +0200)]
KVM: fix sil/dil/bpl/spl in the mod/rm fields
The x86-64 extended low-byte registers were fetched correctly from reg,
but not from mod/rm.
This fixes another bug in the boot of RHEL5.9 64-bit, but it is still
not enough.
Cc: <stable@vger.kernel.org> # 3.9
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Paolo Bonzini [Thu, 30 May 2013 11:22:39 +0000 (13:22 +0200)]
KVM: Emulate multibyte NOP
This is encountered when booting RHEL5.9 64-bit. There is another bug
after this one that is not a simple emulation failure, but this one lets
the boot proceed a bit.
Cc: <stable@vger.kernel.org> # 3.9
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Marc Zyngier [Tue, 14 May 2013 11:11:34 +0000 (12:11 +0100)]
ARM: KVM: be more thorough when invalidating TLBs
The KVM/ARM MMU code doesn't take care of invalidating TLBs before
freeing a {pte,pmd} table. This could cause problems if the page
is reallocated and then speculated into by another CPU.
Reported-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
Andre Przywara [Wed, 8 May 2013 22:28:06 +0000 (00:28 +0200)]
ARM: KVM: prevent NULL pointer dereferences with KVM VCPU ioctl
Some ARM KVM VCPU ioctls require the vCPU to be properly initialized
with the KVM_ARM_VCPU_INIT ioctl before being used with further
requests. KVM_RUN checks whether this initialization has been
done, but other ioctls do not.
Namely KVM_GET_REG_LIST will dereference an array with index -1
without initialization and thus leads to a kernel oops.
Fix this by adding checks before executing the ioctl handlers.
[ Removed superflous comment from static function - Christoffer ]
Changes from v1:
* moved check into a static function with a meaningful name
Signed-off-by: Andre Przywara <andre.przywara@linaro.org>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
David Daney [Thu, 23 May 2013 16:49:10 +0000 (09:49 -0700)]
mips/kvm: Use ENOIOCTLCMD to indicate unimplemented ioctls.
The Linux Way is to return -ENOIOCTLCMD to the vfs when an
unimplemented ioctl is requested. Do this in kvm_mips instead of a
random mixture of -ENOTSUPP and -EINVAL.
Signed-off-by: David Daney <david.daney@cavium.com>
Acked-by: Sanjay Lal <sanjayl@kymasys.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
David Daney [Thu, 23 May 2013 16:49:09 +0000 (09:49 -0700)]
mips/kvm: Fix ABI by moving manipulation of CP0 registers to KVM_{G,S}ET_ONE_REG
Because not all 256 CP0 registers are ever implemented, we need a
different method of manipulating them. Use the
KVM_SET_ONE_REG/KVM_GET_ONE_REG mechanism.
Now unused code and definitions are removed.
Signed-off-by: David Daney <david.daney@cavium.com>
Acked-by: Sanjay Lal <sanjayl@kymasys.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
David Daney [Thu, 23 May 2013 16:49:08 +0000 (09:49 -0700)]
mips/kvm: Use ARRAY_SIZE() instead of hardcoded constants in kvm_arch_vcpu_ioctl_{s,g}et_regs
Also we cannot set special zero register, so force it to zero.
Signed-off-by: David Daney <david.daney@cavium.com>
Acked-by: Sanjay Lal <sanjayl@kymasys.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
David Daney [Thu, 23 May 2013 16:49:07 +0000 (09:49 -0700)]
mips/kvm: Fix name of gpr field in struct kvm_regs.
Signed-off-by: David Daney <david.daney@cavium.com>
Acked-by: Sanjay Lal <sanjayl@kymasys.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
David Daney [Thu, 23 May 2013 16:49:06 +0000 (09:49 -0700)]
mips/kvm: Fix ABI for use of 64-bit registers.
All registers are 64-bits wide, 32-bit guests use the least
significant portion of the register storage fields.
Signed-off-by: David Daney <david.daney@cavium.com>
Acked-by: Sanjay Lal <sanjayl@kymasys.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
David Daney [Thu, 23 May 2013 16:49:05 +0000 (09:49 -0700)]
mips/kvm: Fix ABI for use of FPU.
Define a non-empty struct kvm_fpu.
Signed-off-by: David Daney <david.daney@cavium.com>
Acked-by: Sanjay Lal <sanjayl@kymasys.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Linus Torvalds [Sun, 2 Jun 2013 08:11:17 +0000 (17:11 +0900)]
Linux 3.10-rc4
Sergei Shtylyov [Fri, 31 May 2013 22:38:35 +0000 (02:38 +0400)]
sata_rcar: fix interrupt handling
The driver's interrupt handling code is too picky in deciding whether it should
handle an interrupt or not which causes completely unneeded spurious interrupts.
Thus make sata_rcar_{ata|serr}_interrupt() *void*; add ATA status register read
to sata_rcar_ata_interrupt() to clear an unexpected ATA interrupt -- it doesn't
get cleared by writing to the SATAINTSTAT register in the interrupt mode we use.
Also, in sata_rcar_ata_interrupt() we should check SATAINTSTAT register only for
enabled interrupts and we should clear only those interrupts that we have read
as active first time around, because else we have a race and risk clearing an
interrupt that can occur between read and write of the SATAINTSTAT register
and never registering it...
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: stable@vger.kernel.org
Linus Torvalds [Sat, 1 Jun 2013 21:24:54 +0000 (06:24 +0900)]
Merge branch 'for-3.10' of git://git./linux/kernel/git/deller/parisc-linux
Pull parisc fixes from Helge Deller:
"This patcheset includes fixes for:
- the PCI/LBA which brings back the stifb graphics framebuffer
console
- possible memory overflows in parisc kernel init code
- parport support on older GSC machines
- avoids that users by mistake enable PARPORT_PC_SUPERIO on parisc
- MAINTAINERS file list updates for parisc."
* 'for-3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: parport0: fix this legacy no-device port driver!
parport_pc: disable PARPORT_PC_SUPERIO on parisc architecture
parisc/PCI: lba: fix: convert to pci_create_root_bus() for correct root bus resources (v2)
parisc/PCI: Set type for LBA bus_num resource
MAINTAINERS: update parisc architecture file list
parisc: kernel: using strlcpy() instead of strcpy()
parisc: rename "CONFIG_PA7100" to "CONFIG_PA7000"
parisc: fix kernel BUG at arch/parisc/include/asm/mmzone.h:50
parisc: memory overflow, 'name' length is too short for using
Helge Deller [Thu, 30 May 2013 21:06:39 +0000 (21:06 +0000)]
parisc: parport0: fix this legacy no-device port driver!
Fix the above kernel error from parport_announce_port() on 32bit GSC
machines (e.g. B160L). The parport driver requires now a pointer to the
device struct.
Signed-off-by: Helge Deller <deller@gmx.de>
Helge Deller [Thu, 30 May 2013 16:24:46 +0000 (16:24 +0000)]
parport_pc: disable PARPORT_PC_SUPERIO on parisc architecture
If enabled, CONFIG_PARPORT_PC_SUPERIO scans on PC-like hardware for
various super-io chips by accessing i/o ports in a range which will
crash any parisc hardware at once.
In addition, parisc has it's own incompatible superio chip
(CONFIG_SUPERIO), so if we disable PARPORT_PC_SUPERIO completely for
parisc we can avoid that people by accident enable the parport_pc
superio option too.
Signed-off-by: Helge Deller <deller@gmx.de>
Helge Deller [Fri, 31 May 2013 22:24:58 +0000 (22:24 +0000)]
parisc/PCI: lba: fix: convert to pci_create_root_bus() for correct root bus resources (v2)
commit
dc7dce280a
Author: Bjorn Helgaas <bhelgaas@google.com>
Date: Fri Oct 28 16:27:27 2011 -0600
parisc/PCI: lba: convert to pci_create_root_bus() for correct root bus
resources
Supply root bus resources to pci_create_root_bus() so they're correct
immediately. This fixes the problem of "early" and "header" quirks seeing
incorrect root bus resources.
added tests for elmmio_space.start while it should use
elmmio_space.flags. This for example led to incorrect resource
assignments and a non-working stifb framebuffer on most parisc machines.
LBA 10:1: PCI host bridge to bus 0000:01
pci_bus 0000:01: root bus resource [io 0x12000-0x13fff] (bus address [0x2000-0x3fff])
pci_bus 0000:01: root bus resource [mem 0xfffffffffa000000-0xfffffffffbffffff] (bus address [0xfa000000-0xfbffffff])
pci_bus 0000:01: root bus resource [mem 0xfffffffff4800000-0xfffffffff4ffffff] (bus address [0xf4800000-0xf4ffffff])
pci_bus 0000:01: root bus resource [??? 0x00000001 flags 0x0]
Signed-off-by: Helge Deller <deller@gmx.de>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Bjorn Helgaas [Thu, 30 May 2013 17:45:39 +0000 (11:45 -0600)]
parisc/PCI: Set type for LBA bus_num resource
The non-PAT resource probing code failed to set the type of the LBA bus_num
resource (
30aa80da43 "parisc/PCI: register busn_res for root buses" did
the corresponding thing for the PAT case).
This causes incorrect resource assignments and a non-working stifb
framebuffer on most parisc machines.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Helge Deller <deller@gmx.de>
Helge Deller [Thu, 30 May 2013 13:48:07 +0000 (13:48 +0000)]
MAINTAINERS: update parisc architecture file list
Signed-off-by: Helge Deller <deller@gmx.de>
Chen Gang [Thu, 30 May 2013 01:18:43 +0000 (01:18 +0000)]
parisc: kernel: using strlcpy() instead of strcpy()
'boot_args' is an input args, and 'boot_command_line' has a fix length.
So use strlcpy() instead of strcpy() to avoid memory overflow.
Signed-off-by: Chen Gang <gang.chen@asianux.com>
Acked-by: Kyle McMartin <kyle@mcmartin.ca>
Signed-off-by: Helge Deller <deller@gmx.de>
Paul Bolle [Wed, 29 May 2013 09:56:58 +0000 (09:56 +0000)]
parisc: rename "CONFIG_PA7100" to "CONFIG_PA7000"
There's a Makefile line setting cflags for CONFIG_PA7100. But that
Kconfig macro doesn't exist. There is a Kconfig symbol PA7000, which
covers both PA7000 and PA7100 processors. So let's use the corresponding
Kconfig macro.
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Helge Deller <deller@gmx.de>
Helge Deller [Tue, 28 May 2013 20:35:54 +0000 (20:35 +0000)]
parisc: fix kernel BUG at arch/parisc/include/asm/mmzone.h:50
With CONFIG_DISCONTIGMEM=y and multiple physical memory areas,
cat /proc/kpageflags triggers this kernel bug:
kernel BUG at arch/parisc/include/asm/mmzone.h:50!
CPU: 2 PID: 7848 Comm: cat Tainted: G D W 3.10.0-rc3-64bit #44
IAOQ[0]: kpageflags_read0x128/0x238
IAOQ[1]: kpageflags_read0x12c/0x238
RP(r2): proc_reg_read0xbc/0x130
Backtrace:
[<
00000000402ca2d4>] proc_reg_read0xbc/0x130
[<
0000000040235bcc>] vfs_read0xc4/0x1d0
[<
0000000040235f0c>] SyS_read0x94/0xf0
[<
0000000040105fc0>] syscall_exit0x0/0x14
kpageflags_read() walks through the whole memory, even if some memory
areas are physically not available. So, we should better not BUG on an
unavailable pfn in pfn_to_nid() but just return the expected value -1 or
0.
Signed-off-by: Helge Deller <deller@gmx.de>
Chen Gang [Mon, 27 May 2013 04:57:09 +0000 (04:57 +0000)]
parisc: memory overflow, 'name' length is too short for using
'path.bc[i]' can be asigned by PCI_SLOT() which can '> 10', so sizeof(6
* "%u:" + "%u" + '\0') may be 21.
Since 'name' length is 20, it may be memory overflow.
And 'path.bc[i]' is 'unsigned char' for printing, we can be sure the
max length of 'name' must be less than 28.
So simplify thinking, we can use 28 instead of 20 directly, and do not
think of whether 'patchc.bc[i]' can '> 100'.
Signed-off-by: Chen Gang <gang.chen@asianux.com>
Signed-off-by: Helge Deller <deller@gmx.de>
Linus Torvalds [Sat, 1 Jun 2013 11:13:16 +0000 (20:13 +0900)]
Merge branch 'merge' of git://git./linux/kernel/git/benh/powerpc
Pull powerpc fixes from Ben Herrenschmidt:
"Here are a few more fixes for powerpc 3.10. It's a bit more than I
would have liked this late in the game but I suppose that's what
happens with a brand new chip generation coming out.
A few regression fixes, some last minute fixes for new P8 features
such as transactional memory,...
There's also one powerpc KVM patch that I requested that adds two
missing functions to our in-kernel interrupt controller support which
is itself a new 3.10 feature. These are defined by the base
hypervisor specification. We didn't implement them originally because
Linux doesn't use them but they are simple and I'm not comfortable
having a half-implemented interface in 3.10 and having to deal with
versionning etc... later when something starts needing those calls.
They cannot be emulated in qemu when using in-kernel interrupt
controller (not enough shared state).
Just added a last minute patch to fix a typo introducing a breakage in
our cputable for Power7+ processors, sorry about that, but the
regression it fixes just hurt me :-)"
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
powerpc/cputable: Fix typo on P7+ cputable entry
powerpc/perf: Add missing SIER support
powerpc/perf: Revert to original NO_SIPR logic
powerpc/pci: Remove the unused variables in pci_process_bridge_OF_ranges
powerpc/pci: Remove the stale comments of pci_process_bridge_OF_ranges
powerpc/pseries: Always enable CONFIG_HOTPLUG_CPU on PSERIES SMP
powerpc/kvm/book3s: Add support for H_IPOLL and H_XIRR_X in XICS emulation
powerpc/32bit:Store temporary result in r0 instead of r8
powerpc/mm: Always invalidate tlb on hpte invalidate and update
powerpc/pseries: Improve stream generation comments in copypage/user
powerpc/pseries: Kill all prefetch streams on context switch
powerpc/cputable: Fix oprofile_cpu_type on power8
powerpc/mpic: Fix irq distribution problem when MPIC_SINGLE_DEST_CPU
powerpc/tm: Fix userspace stack corruption on signal delivery for active transactions
powerpc/tm: Move TM abort cause codes to uapi
powerpc/tm: Abort on emulation and alignment faults
powerpc/tm: Update cause codes documentation
powerpc/tm: Make room for hypervisor in abort cause codes
Linus Torvalds [Sat, 1 Jun 2013 11:05:20 +0000 (20:05 +0900)]
Merge git://git./linux/kernel/git/nab/target-pending
Pull scsi target fixes from Nicholas Bellinger:
"The highlights include:
- Re-instate sess->wait_list in target_wait_for_sess_cmds() for
active I/O shutdown handling in fabrics using se_cmd->cmd_kref
- Make ib_srpt call target_sess_cmd_list_set_waiting() during session
shutdown
- Fix FILEIO off-by-one READ_CAPACITY bug for !S_ISBLK export
- Fix iscsi-target login error heap buffer overflow (Kees)
- Fix iscsi-target active I/O shutdown handling regression in
v3.10-rc1
A big thanks to Kees Cook for fixing a long standing login error
buffer overflow bug.
All patches are CC'ed to stable with the exception of the v3.10-rc1
specific regression + other minor target cleanup."
* git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
iscsi-target: Fix iscsit_free_cmd() se_cmd->cmd_kref shutdown handling
target: Propigate up ->cmd_kref put return via transport_generic_free_cmd
iscsi-target: fix heap buffer overflow on error
target/file: Fix off-by-one READ_CAPACITY bug for !S_ISBLK export
ib_srpt: Call target_sess_cmd_list_set_waiting during shutdown_session
target: Re-instate sess_wait_list for target_wait_for_sess_cmds
target: Remove unused wait_for_tasks bit in target_wait_for_sess_cmds
Linus Torvalds [Sat, 1 Jun 2013 10:55:26 +0000 (19:55 +0900)]
Merge tag 'clk-fixes-for-linus' of git://git.linaro.org/people/mturquette/linux
Pull clock subsystem fixes from Mike Turquette:
"A mix of small fixes affecting mostly ARM platforms as well as a
discrete clock expander chip. Most fixes are corrections to lousy
clock data of one form or another."
* tag 'clk-fixes-for-linus' of git://git.linaro.org/people/mturquette/linux:
clk: mxs: Include clk mxs header file
clk: vt8500: Fix unbalanced spinlock in vt8500_dclk_set_rate()
clk: si5351: Set initial clkout rate when defined in platform data.
clk: si5351: Fix clkout rate computation.
clk: samsung: Add CLK_IGNORE_UNUSED flag for the sysreg clocks
clk: ux500: clk-sysctrl: handle clocks with no parents
clk: ux500: Provide device enumeration number suffix for SMSC911x
Linus Torvalds [Sat, 1 Jun 2013 10:53:41 +0000 (19:53 +0900)]
Merge tag 'fbdev-for-3.10-rc4' of git://git./linux/kernel/git/plagnioj/linux-fbdev
Pull fbdev fixes from Jean-Christophe PLAGNIOL-VILLARD:
"This contains some small fixes
- Atmel LCDC: fix blank the backlight on remove
- ps3fb: fix compile warning
- OMAPDSS: Fix crash with DT boot"
* tag 'fbdev-for-3.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/plagnioj/linux-fbdev:
atmel_lcdfb: blank the backlight on remove
trivial: atmel_lcdfb: add missing error message
OMAPDSS: Fix crash with DT boot
fbdev/ps3fb: fix compile warning
Linus Torvalds [Sat, 1 Jun 2013 10:51:52 +0000 (19:51 +0900)]
Merge branch 'for-linus' of git://git./linux/kernel/git/viro/vfs
Pull assorted fixes from Al Viro:
"There'll be more - I'm trying to dig out from under the pile of mail
(a couple of weeks of something flu-like ;-/) and there's several more
things waiting for review; this is just the obvious stuff."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
zoran: racy refcount handling in vm_ops ->open()/->close()
befs_readdir(): do not increment ->f_pos if filldir tells us to stop
hpfs: deadlock and race in directory lseek()
qnx6: qnx6_readdir() has a braino in pos calculation
fix buffer leak after "scsi: saner replacements for ->proc_info()"
vfs: Fix invalid ida_remove() call
Linus Torvalds [Sat, 1 Jun 2013 10:48:59 +0000 (19:48 +0900)]
Merge tag 'nfs-for-3.10-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull two NFS client fixes from Trond Myklebust:
- Fix a regression that broke NFS mounting using klibc and busybox
- Stable fix to check access modes correctly on NFSv4 delegated open()
* tag 'nfs-for-3.10-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
NFS: Fix security flavor negotiation with legacy binary mounts
NFSv4: Fix a thinko in nfs4_try_open_cached
Will Schmidt [Mon, 20 May 2013 05:04:18 +0000 (05:04 +0000)]
powerpc/cputable: Fix typo on P7+ cputable entry
Fix a typo in setting COMMON_USER2_POWER7 bits to .cpu_user_features2
cpu specs table.
Signed-off-by: Will Schmidt <will_schmidt@vnet.ibm.com>
Acked-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Michael Ellerman [Wed, 15 May 2013 20:19:31 +0000 (20:19 +0000)]
powerpc/perf: Add missing SIER support
Commit
8f61aa3 "Add support for SIER" missed updates to siar_valid()
and perf_get_data_addr().
In both cases we need to check the SIER instead of mmcra.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Michael Ellerman [Wed, 15 May 2013 20:19:30 +0000 (20:19 +0000)]
powerpc/perf: Revert to original NO_SIPR logic
This is a revert and then some of commit
860aad7 "Add regs_no_sipr()".
This workaround was only needed on early chip versions.
As before NO_SIPR becomes a static flag of the PMU struct.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Kevin Hao [Thu, 16 May 2013 20:58:42 +0000 (20:58 +0000)]
powerpc/pci: Remove the unused variables in pci_process_bridge_OF_ranges
The codes which ever used these two variables have gone. Throw away
them too.
Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Kevin Hao [Thu, 16 May 2013 20:58:41 +0000 (20:58 +0000)]
powerpc/pci: Remove the stale comments of pci_process_bridge_OF_ranges
These comments already don't apply to the current code. So just remove
them.
Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Srivatsa S. Bhat [Tue, 21 May 2013 09:32:48 +0000 (09:32 +0000)]
powerpc/pseries: Always enable CONFIG_HOTPLUG_CPU on PSERIES SMP
Adam Lackorzynski reported the following build failure on
!CONFIG_HOTPLUG_CPU configuration:
CC arch/powerpc/kernel/rtas.o
arch/powerpc/kernel/rtas.c: In function ‘rtas_cpu_state_change_mask’:
arch/powerpc/kernel/rtas.c:843:4: error: implicit declaration of function ‘cpu_down’ [-Werror=implicit-function-declaration]
cc1: all warnings being treated as errors
make[1]: *** [arch/powerpc/kernel/rtas.o] Error 1
make: *** [arch/powerpc/kernel] Error 2
The build fails because cpu_down() is defined only under CONFIG_HOTPLUG_CPU.
Looking further, the mobility code in pseries is one of the call-sites which
uses rtas_ibm_suspend_me(), which in turn calls rtas_cpu_state_change_mask().
And the mobility code is unconditionally compiled-in (it does not fall under
any Kconfig option). And commit
120496ac (powerpc: Bring all threads online
prior to migration/hibernation) which introduced this build regression is
critical for the proper functioning of the migration code. So it appears
that the only solution to this problem is to enable CONFIG_HOTPLUG_CPU if
SMP is enabled on PPC_PSERIES platforms. So make that change in the Kconfig.
Reported-by: Adam Lackorzynski <adam@os.inf.tu-dresden.de>
Cc: stable@vger.kernel.org
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Paul Mackerras [Thu, 23 May 2013 15:42:21 +0000 (15:42 +0000)]
powerpc/kvm/book3s: Add support for H_IPOLL and H_XIRR_X in XICS emulation
This adds the remaining two hypercalls defined by PAPR for manipulating
the XICS interrupt controller, H_IPOLL and H_XIRR_X. H_IPOLL returns
information about the priority and pending interrupts for a virtual
cpu, without changing any state. H_XIRR_X is like H_XIRR in that it
reads and acknowledges the highest-priority pending interrupt, but it
also returns the timestamp (timebase register value) from when the
interrupt was first received by the hypervisor. Currently we just
return the current time, since we don't do any software queueing of
virtual interrupts inside the XICS emulation code.
These hcalls are not currently used by Linux guests, but may be in
future.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Priyanka Jain [Fri, 31 May 2013 01:20:02 +0000 (01:20 +0000)]
powerpc/32bit:Store temporary result in r0 instead of r8
Commit
a9c4e541ea9b22944da356f2a9258b4eddcc953b
"powerpc/kprobe: Complete kprobe and migrate exception frame"
introduced a regression:
While returning from exception handling in case of PREEMPT enabled,
_TIF_NEED_RESCHED bit is checked in TI_FLAGS (thread_info flag) of current
task. Only if this bit is set, it should continue with the process of
calling preempt_schedule_irq() to schedule highest priority task if
available.
Current code assumes that r8 contains TI_FLAGS and check this for
_TIF_NEED_RESCHED, but as r8 is modified in the code which executes before
this check, r8 no longer contains the expected TI_FLAGS information.
As a result check for comparison with _TIF_NEED_RESCHED was failing even if
NEED_RESCHED bit is set in the current thread_info flag. Due to this,
preempt_schedule_irq() and in turn scheduler was not getting called even if
highest priority task is ready for execution.
So, store temporary results in r0 instead of r8 to prevent r8 from getting
modified as subsequent code is dependent on its value.
Signed-off-by: Priyanka Jain <Priyanka.Jain@freescale.com>
CC: <stable@vger.kernel.org> [v3.7+]
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Aneesh Kumar K.V [Fri, 31 May 2013 01:03:24 +0000 (01:03 +0000)]
powerpc/mm: Always invalidate tlb on hpte invalidate and update
If a hash bucket gets full, we "evict" a more/less random entry from it.
When we do that we don't invalidate the TLB (hpte_remove) because we assume
the old translation is still technically "valid". This implies that when
we are invalidating or updating pte, even if HPTE entry is not valid
we should do a tlb invalidate.
This was a regression introduced by
b1022fbd293564de91596b8775340cf41ad5214c
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Michael Neuling [Wed, 29 May 2013 19:34:29 +0000 (19:34 +0000)]
powerpc/pseries: Improve stream generation comments in copypage/user
No code changes, just documenting what's happening a little better.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Michael Neuling [Wed, 29 May 2013 19:34:27 +0000 (19:34 +0000)]
powerpc/pseries: Kill all prefetch streams on context switch
On context switch, we should have no prefetch streams leak from one
userspace process to another. This frees up prefetch resources for the
next process.
Based on patch from Milton Miller.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Nishanth Aravamudan [Tue, 28 May 2013 10:39:50 +0000 (10:39 +0000)]
powerpc/cputable: Fix oprofile_cpu_type on power8
Maynard informed me that neither the oprofile kernel module nor oprofile
userspace has been updated to support that "legacy" oprofile module
interface for power8, which is indicated by "ppc64/power8." This results
in no samples. The solution is to default to the "timer" type, instead.
The raw entry also should be updated, as "ppc64/ibm-compat-v1" indicates
to oprofile userspace to use "compatibility events" which are obsolete
in ISA 2.07.
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
chenhui zhao [Mon, 27 May 2013 21:59:43 +0000 (21:59 +0000)]
powerpc/mpic: Fix irq distribution problem when MPIC_SINGLE_DEST_CPU
For the mpic with a flag MPIC_SINGLE_DEST_CPU, only one bit should be
set in interrupt destination registers.
The code is applicable to 64-bit platforms as well as 32-bit.
Signed-off-by: Zhao Chenhui <chenhui.zhao@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Michael Neuling [Sun, 26 May 2013 18:09:41 +0000 (18:09 +0000)]
powerpc/tm: Fix userspace stack corruption on signal delivery for active transactions
When in an active transaction that takes a signal, we need to be careful with
the stack. It's possible that the stack has moved back up after the tbegin.
The obvious case here is when the tbegin is called inside a function that
returns before a tend. In this case, the stack is part of the checkpointed
transactional memory state. If we write over this non transactionally or in
suspend, we are in trouble because if we get a tm abort, the program counter
and stack pointer will be back at the tbegin but our in memory stack won't be
valid anymore.
To avoid this, when taking a signal in an active transaction, we need to use
the stack pointer from the checkpointed state, rather than the speculated
state. This ensures that the signal context (written tm suspended) will be
written below the stack required for the rollback. The transaction is aborted
becuase of the treclaim, so any memory written between the tbegin and the
signal will be rolled back anyway.
For signals taken in non-TM or suspended mode, we use the
normal/non-checkpointed stack pointer.
Tested with 64 and 32 bit signals
Signed-off-by: Michael Neuling <mikey@neuling.org>
Cc: <stable@vger.kernel.org> # v3.9
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Michael Neuling [Sun, 26 May 2013 18:30:56 +0000 (18:30 +0000)]
powerpc/tm: Move TM abort cause codes to uapi
These cause codes are usable by userspace, so let's export to uapi.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Cc: <stable@vger.kernel.org> # v3.9
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Michael Neuling [Sun, 26 May 2013 18:09:39 +0000 (18:09 +0000)]
powerpc/tm: Abort on emulation and alignment faults
If we are emulating an instruction inside an active user transaction that
touches memory, the kernel can't emulate it as it operates in transactional
suspend context. We need to abort these transactions and send them back to
userspace for the hardware to rollback.
We can service these if the user transaction is in suspend mode, since the
kernel will operate in the same suspend context.
This adds a check to all alignment faults and to specific instruction
emulations (only string instructions for now). If the user process is in an
active (non-suspended) transaction, we abort the transaction go back to
userspace allowing the HW to roll back the transaction and tell the user of the
failure. This also adds new tm abort cause codes to report the reason of the
persistent error to the user.
Crappy test case here http://neuling.org/devel/junkcode/aligntm.c
Signed-off-by: Michael Neuling <mikey@neuling.org>
Cc: <stable@vger.kernel.org> # v3.9
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Michael Neuling [Sun, 26 May 2013 18:09:38 +0000 (18:09 +0000)]
powerpc/tm: Update cause codes documentation
Signed-off-by: Michael Neuling <mikey@neuling.org>
Cc: <stable@vger.kernel.org> # 3.9 only
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Michael Neuling [Sun, 26 May 2013 18:09:37 +0000 (18:09 +0000)]
powerpc/tm: Make room for hypervisor in abort cause codes
PAPR carves out 0xff-0xe0 for hypervisor use of transactional memory software
abort cause codes. Unfortunately we don't respect this currently.
Below fixes this to move our cause codes to below this region.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Cc: <stable@vger.kernel.org> # 3.9 only
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Linus Torvalds [Fri, 31 May 2013 21:59:14 +0000 (06:59 +0900)]
Merge branch 'for_linus' of git://git./linux/kernel/git/jack/linux-fs
Pull reiserfs fixes from Jan Kara:
"Three reiserfs fixes. They fix real problems spotted by users so I
hope they are ok even at this stage."
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
reiserfs: fix deadlock with nfs racing on create/lookup
reiserfs: fix problems with chowning setuid file w/ xattrs
reiserfs: fix spurious multiple-fill in reiserfs_readdir_dentry
Linus Torvalds [Fri, 31 May 2013 21:56:21 +0000 (06:56 +0900)]
Merge tag 'for-linus-v3.10-rc4-crc-xattr-fixes' of git://oss.sgi.com/xfs/xfs
Pull xfs extended attribute fixes for CRCs from Ben Myers:
"Here are several fixes that are relevant on CRC enabled XFS
filesystems. They are followed by a rework of the remote attribute
code so that each block of the attribute contains a header with a CRC.
Previously there was a CRC header per extent in the remote attribute
code, but this was untenable because it was not possible to know how
many extents would be allocated for the attribute until after the
allocation has completed, due to the fragmentation of free space.
This became complicated because the size of the headers needs to be
added to the length of the payload to get the overall length required
for the allocation. With a header per block, things are less
complicated at the cost of a little space.
I would have preferred to defer this and the rest of the CRC queue to
3.11 to mitigate risk for existing non-crc users in 3.10. Doing so
would require setting a feature bit for the on-disk changes, and so I
have been pressured into sending this pull request by Eric Sandeen and
David Chinner from Red Hat. I'll send another pull request or two
with the rest of the CRC queue next week.
- Remove assert on count of remote attribute CRC headers
- Fix the number of blocks read in for remote attributes
- Zero remote attribute tails properly
- Fix mapping of remote attribute buffers to have correct length
- initialize temp leaf properly in xfs_attr3_leaf_unbalance, and
xfs_attr3_leaf_compact
- Rework remote atttributes to have a header per block"
* tag 'for-linus-v3.10-rc4-crc-xattr-fixes' of git://oss.sgi.com/xfs/xfs:
xfs: rework remote attr CRCs
xfs: fully initialise temp leaf in xfs_attr3_leaf_compact
xfs: fully initialise temp leaf in xfs_attr3_leaf_unbalance
xfs: correctly map remote attr buffers during removal
xfs: remote attribute tail zeroing does too much
xfs: remote attribute read too short
xfs: remote attribute allocation may be contiguous
Linus Torvalds [Fri, 31 May 2013 21:50:16 +0000 (06:50 +0900)]
Merge tag 'for-linus-v3.10-rc4' of git://oss.sgi.com/xfs/xfs
Pull xfs fixes from Ben Myers:
- Fix nested transactions in xfs_qm_scall_setqlim
- Clear suid/sgid bits when we truncate with size update
- Fix recovery for split buffers
- Fix block count on remote symlinks
- Add fsgeom flag for v5 superblock support
- Disable XFS_IOC_SWAPEXT for CRC enabled filesystems
- Fix dirv3 freespace block corruption
* tag 'for-linus-v3.10-rc4' of git://oss.sgi.com/xfs/xfs:
xfs: fix dir3 freespace block corruption
xfs: disable swap extents ioctl on CRC enabled filesystems
xfs: add fsgeom flag for v5 superblock support.
xfs: fix incorrect remote symlink block count
xfs: fix split buffer vector log recovery support
xfs: kill suid/sgid through the truncate path.
xfs: avoid nesting transactions in xfs_qm_scall_setqlim()
Linus Torvalds [Fri, 31 May 2013 21:47:04 +0000 (06:47 +0900)]
Merge tag 'please-pull-aertracefix' of git://git./linux/kernel/git/ras/ras
Pull aer error logging fix from Tony Luck:
"Can't call pci_get_domain_bus_and_slot() from interupt context"
* tag 'please-pull-aertracefix' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
aerdrv: Move cper_print_aer() call out of interrupt context
Linus Torvalds [Fri, 31 May 2013 21:45:10 +0000 (06:45 +0900)]
Merge tag 'arm64-stable' of git://git./linux/kernel/git/cmarinas/linux-aarch64
Pull arm64 fixes from Catalin Marinas:
- Module compilation issues (symbol not exported).
- Plug a hole where user space can bring the kernel down.
* tag 'arm64-stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64:
arm64: don't kill the kernel on a bad esr from el0
arm64: treat unhandled compat el0 traps as undef
arm64: Do not report user faults for handled signals
arm64: kernel: compiling issue, need 'EXPORT_SYMBOL(clear_page)'
Jeff Mahoney [Fri, 31 May 2013 19:51:17 +0000 (15:51 -0400)]
reiserfs: fix deadlock with nfs racing on create/lookup
Reiserfs is currently able to be deadlocked by having two NFS clients
where one has removed and recreated a file and another is accessing the
file with an open file handle.
If one client deletes and recreates a file with timing such that the
recreated file obtains the same [dirid, objectid] pair as the original
file while another client accesses the file via file handle, the create
and lookup can race and deadlock if the lookup manages to create the
in-memory inode first.
The create thread, in insert_inode_locked4, will hold the write lock
while waiting on the other inode to be unlocked. The lookup thread,
anywhere in the iget path, will release and reacquire the write lock while
it schedules. If it needs to reacquire the lock while the create thread
has it, it will never be able to make forward progress because it needs
to reacquire the lock before ultimately unlocking the inode.
This patch drops the write lock across the insert_inode_locked4 call so
that the ordering of inode_wait -> write lock is retained. Since this
would have been the case before the BKL push-down, this is safe.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Jeff Mahoney [Fri, 31 May 2013 19:54:17 +0000 (15:54 -0400)]
reiserfs: fix problems with chowning setuid file w/ xattrs
reiserfs_chown_xattrs() takes the iattr struct passed into ->setattr
and uses it to iterate over all the attrs associated with a file to change
ownership of xattrs (and transfer quota associated with the xattr files).
When the setuid bit is cleared during chown, ATTR_MODE and iattr->ia_mode
are passed to all the xattrs as well. This means that the xattr directory
will have S_IFREG added to its mode bits.
This has been prevented in practice by a missing IS_PRIVATE check
in reiserfs_acl_chmod, which caused a double-lock to occur while holding
the write lock. Since the file system was completely locked up, the
writeout of the corrupted mode never happened.
This patch temporarily clears everything but ATTR_UID|ATTR_GID for the
calls to reiserfs_setattr and adds the missing IS_PRIVATE check.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Jeff Mahoney [Fri, 31 May 2013 19:07:52 +0000 (15:07 -0400)]
reiserfs: fix spurious multiple-fill in reiserfs_readdir_dentry
After sleeping for filldir(), we check to see if the file system has
changed and research. The next_pos pointer is updated but its value
isn't pushed into the key used for the search itself. As a result,
the search returns the same item that the last cycle of the loop did
and filldir() is called multiple times with the same data.
The end result is that the buffer can contain the same name multiple
times. This can be returned to userspace or used internally in the
xattr code where it can manifest with the following warning:
jdm-20004 reiserfs_delete_xattrs: Couldn't delete all xattrs (-2)
reiserfs_for_each_xattr uses reiserfs_readdir_dentry to iterate over
the xattr names and ends up trying to unlink the same name twice. The
second attempt fails with -ENOENT and the error is returned. At some
point I'll need to add support into reiserfsck to remove the orphaned
directories left behind when this occurs.
The fix is to push the value into the key before researching.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Al Viro [Thu, 23 May 2013 08:38:22 +0000 (04:38 -0400)]
zoran: racy refcount handling in vm_ops ->open()/->close()
worse, we lock ->resource_lock too late when we are destroying the
final clonal VMA; the check for lack of other mappings of the same
opened file can race with mmap().
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Richard Genoud [Fri, 31 May 2013 15:49:35 +0000 (15:49 +0000)]
atmel_lcdfb: blank the backlight on remove
When removing atmel_lcdfb module, the backlight is unregistered but not
blanked. (only for CONFIG_BACKLIGHT_ATMEL_LCDC case).
This can result in the screen going full white depending on how the PWM
is wired.
Signed-off-by: Richard Genoud <richard.genoud@gmail.com>
Signed-off-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
Richard Genoud [Fri, 31 May 2013 14:28:38 +0000 (14:28 +0000)]
trivial: atmel_lcdfb: add missing error message
When a too small framebuffer is given, the atmel_lcdfb_check_var
silently fails.
Adding an error message will save some head scratching.
Signed-off-by: Richard Genoud <richard.genoud@gmail.com>
Signed-off-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
Al Viro [Wed, 22 May 2013 17:41:26 +0000 (13:41 -0400)]
befs_readdir(): do not increment ->f_pos if filldir tells us to stop
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Sat, 18 May 2013 06:38:52 +0000 (02:38 -0400)]
hpfs: deadlock and race in directory lseek()
For one thing, there's an ABBA deadlock on hpfs fs-wide lock and i_mutex
in hpfs_dir_lseek() - there's a lot of methods that grab the former with
the caller already holding the latter, so it must take i_mutex first.
For another, locking the damn thing, carefully validating the offset,
then dropping locks and assigning the offset is obviously racy.
Moreover, we _must_ do hpfs_add_pos(), or the machinery in dnode.c
won't modify the sucker on B-tree surgeries.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Fri, 17 May 2013 19:21:56 +0000 (15:21 -0400)]
qnx6: qnx6_readdir() has a braino in pos calculation
We want to mask lower 5 bits out, not leave only those and clear the
rest... As it is, we end up always starting to read from the beginning
of directory, no matter what the current position had been.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Jan Beulich [Wed, 29 May 2013 12:26:53 +0000 (13:26 +0100)]
fix buffer leak after "scsi: saner replacements for ->proc_info()"
That patch failed to set proc_scsi_fops' .release method.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Takashi Iwai [Fri, 10 May 2013 12:04:11 +0000 (14:04 +0200)]
vfs: Fix invalid ida_remove() call
When the group id of a shared mount is not allocated, the umount still
tries to call mnt_release_group_id(), which eventually hits a kernel
warning at ida_remove() spewing a message like:
ida_remove called for id=0 which is not allocated.
This patch fixes the bug simply checking the group id in the caller.
Reported-by: Cristian RodrÃguez <crrodriguez@opensuse.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Christian Borntraeger [Wed, 29 May 2013 11:08:39 +0000 (13:08 +0200)]
s390/pgtable: Fix gmap notifier address
The address of the gmap notifier was broken, resulting in
unhandled validity intercepts in KVM. Fix the rmap->vmaddr
to be on a segment boundary.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Stefan Weinhuber [Tue, 28 May 2013 13:26:06 +0000 (15:26 +0200)]
s390/dasd: fix handling of gone paths
When a path is gone and dasd_generic_path_event is called with a
PE_PATH_GONE event, we must assume that any I/O request on that
subchannel is still running. This is unlike the dasd_generic_notify
handler and the CIO_NO_PATH event, which implies that the subchannel
has been cleared.
If dasd_generic_path_event finds that the path has been the last
usable path, it must not call dasd_generic_last_path_gone (which would
reset the state of running requests), but just set the
DASD_STOPPED_DC_WAIT bit.
Signed-off-by: Stefan Weinhuber <wein@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>