Thomas Gleixner [Mon, 31 Jan 2011 14:08:43 +0000 (15:08 +0100)]
Merge branch 'tip/rtmutex' of git://git./linux/kernel/git/rostedt/linux-2.6-trace into core/locking
*git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace tip/rtmutex:
rtmutex: Simplify PI algorithm and make highest prio task get lock
Lai Jiangshan [Fri, 14 Jan 2011 09:09:41 +0000 (17:09 +0800)]
rtmutex: Simplify PI algorithm and make highest prio task get lock
In current rtmutex, the pending owner may be boosted by the tasks
in the rtmutex's waitlist when the pending owner is deboosted
or a task in the waitlist is boosted. This boosting is unrelated,
because the pending owner does not really take the rtmutex.
It is not reasonable.
Example.
time1:
A(high prio) onwers the rtmutex.
B(mid prio) and C (low prio) in the waitlist.
time2
A release the lock, B becomes the pending owner
A(or other high prio task) continues to run. B's prio is lower
than A, so B is just queued at the runqueue.
time3
A or other high prio task sleeps, but we have passed some time
The B and C's prio are changed in the period (time2 ~ time3)
due to boosting or deboosting. Now C has the priority higher
than B. ***Is it reasonable that C has to boost B and help B to
get the rtmutex?
NO!! I think, it is unrelated/unneed boosting before B really
owns the rtmutex. We should give C a chance to beat B and
win the rtmutex.
This is the motivation of this patch. This patch *ensures*
only the top waiter or higher priority task can take the lock.
How?
1) we don't dequeue the top waiter when unlock, if the top waiter
is changed, the old top waiter will fail and go to sleep again.
2) when requiring lock, it will get the lock when the lock is not taken and:
there is no waiter OR higher priority than waiters OR it is top waiter.
3) In any time, the top waiter is changed, the top waiter will be woken up.
The algorithm is much simpler than before, no pending owner, no
boosting for pending owner.
Other advantage of this patch:
1) The states of a rtmutex are reduced a half, easier to read the code.
2) the codes become shorter.
3) top waiter is not dequeued until it really take the lock:
they will retain FIFO when it is stolen.
Not advantage nor disadvantage
1) Even we may wakeup multiple waiters(any time when top waiter changed),
we hardly cause "thundering herd",
the number of wokenup task is likely 1 or very little.
2) two APIs are changed.
rt_mutex_owner() will not return pending owner, it will return NULL when
the top waiter is going to take the lock.
rt_mutex_next_owner() always return the top waiter.
will not return NULL if we have waiters
because the top waiter is not dequeued.
I have fixed the code that use these APIs.
need updated after this patch is accepted
1) Document/*
2) the testcase scripts/rt-tester/t4-l2-pi-deboost.tst
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
LKML-Reference: <
4D3012D5.
4060709@cn.fujitsu.com>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Thomas Gleixner [Wed, 26 Jan 2011 20:32:01 +0000 (21:32 +0100)]
rwsem: Remove redundant asmregparm annotation
Peter Zijlstra pointed out, that the only user of asmregparm (x86) is
compiling the kernel already with -mregparm=3. So the annotation of
the rwsem functions is redundant. Remove it.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: David Miller <davem@davemloft.net>
Cc: Chris Zankel <chris@zankel.net>
LKML-Reference: <alpine.LFD.2.00.
1101262130450.31804@localhost6.localdomain6>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Thomas Gleixner [Wed, 26 Jan 2011 20:06:06 +0000 (20:06 +0000)]
rwsem: Move duplicate function prototypes to linux/rwsem.h
All architecture specific rwsem headers carry the same function
prototypes. Just x86 adds asmregparm, which is an empty define on all
other architectures. S390 has a stale rwsem_downgrade_write()
prototype.
Remove the duplicates and add the prototypes to linux/rwsem.h
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Richard Henderson <rth@twiddle.net>
Acked-by: Tony Luck <tony.luck@intel.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Acked-by: David Miller <davem@davemloft.net>
Cc: Chris Zankel <chris@zankel.net>
LKML-Reference: <
20110126195833.
970840140@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Thomas Gleixner [Wed, 26 Jan 2011 20:06:03 +0000 (20:06 +0000)]
rwsem: Unify the duplicate rwsem_is_locked() inlines
Instead of having the same implementation in each architecture, move
it to linux/rwsem.h and remove the duplicates. It's unlikely that an
arch will ever implement something different, but we can deal with
that when it happens.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Matt Turner <mattst88@gmail.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Acked-by: David Miller <davem@davemloft.net>
Cc: Chris Zankel <chris@zankel.net>
LKML-Reference: <
20110126195833.
876773757@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Thomas Gleixner [Wed, 26 Jan 2011 20:06:00 +0000 (20:06 +0000)]
rwsem: Move duplicate init macros and functions to linux/rwsem.h
The rwsem initializers and related macros and functions are mostly the
same. Some of them lack the lockdep initializer, but having it in
place does not matter for architectures which do not support lockdep.
powerpc, sparc, x86: No functional change
sh, s390: Removes the duplicate init_rwsem (inline and #define)
alpha, ia64, xtensa: Use the lockdep capable init function in
lib/rwsem.c which is just uninlining the init
function for the LOCKDEP=n case
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Matt Turner <mattst88@gmail.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Acked-by: David Miller <davem@davemloft.net>
Cc: Chris Zankel <chris@zankel.net>
LKML-Reference: <
20110126195833.
771812729@linutronix.de>
Thomas Gleixner [Wed, 26 Jan 2011 20:05:56 +0000 (20:05 +0000)]
rwsem: Move duplicate struct rwsem declaration to linux/rwsem.h
The difference between these declarations is the data type of the
count member and the lack of lockdep in some architectures/
long is equivivalent to signed long and the #ifdef guarded dep_map
member does not hurt anyone.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Matt Turner <mattst88@gmail.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Acked-by: David Miller <davem@davemloft.net>
Cc: Chris Zankel <chris@zankel.net>
LKML-Reference: <
20110126195833.
679641914@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Thomas Gleixner [Wed, 26 Jan 2011 20:05:53 +0000 (20:05 +0000)]
x86: Cleanup rwsem_count_t typedef
Remove the typedef which has no real reason to be there.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: David Miller <davem@davemloft.net>
Cc: Chris Zankel <chris@zankel.net>
LKML-Reference: <
20110126195833.
580335506@linutronix.de>
Thomas Gleixner [Wed, 26 Jan 2011 20:05:50 +0000 (20:05 +0000)]
rwsem: Cleanup includes
All rwsem implementations include the same headers. Include them from
include/linux/rwsem.h
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Matt Turner <mattst88@gmail.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Acked-by: David Miller <davem@davemloft.net>
Cc: Chris Zankel <chris@zankel.net>
LKML-Reference: <
20110126195833.
483520950@linutronix.de>
Thomas Gleixner [Sun, 23 Jan 2011 14:30:09 +0000 (15:30 +0100)]
locking: Remove deprecated lock initializers
Last users are gone. Remove the left overs.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Thomas Gleixner [Sun, 23 Jan 2011 14:25:56 +0000 (15:25 +0100)]
cred: Replace deprecated spinlock initialization
SPIN_LOCK_UNLOCK is deprecated. Use the lockdep capable variant
instead.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Thomas Gleixner [Sun, 23 Jan 2011 14:24:55 +0000 (15:24 +0100)]
kthread: Replace deprecated spinlock initialization
SPIN_LOCK_UNLOCK is deprecated. Use the lockdep capable variant
instead.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Thomas Gleixner [Sun, 23 Jan 2011 14:23:14 +0000 (15:23 +0100)]
xtensa: Replace deprecated spinlock initialization
SPIN_LOCK_UNLOCK is deprecated. Use the lockdep capable variant
instead.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Chris Zankel <chris@zankel.net>
Thomas Gleixner [Sun, 23 Jan 2011 14:21:25 +0000 (15:21 +0100)]
um: Replace deprecated spinlock initialization
SPIN_LOCK_UNLOCK is deprecated. Use the lockdep capable variant
instead.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jeff Dike <jdike@addtoit.com>
Thomas Gleixner [Sun, 23 Jan 2011 14:19:12 +0000 (15:19 +0100)]
sparc: Replace deprecated spinlock initialization
SPIN_LOCK_UNLOCK is deprecated. Use the lockdep capable variant
instead.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: David S. Miller <davem@davemloft.net>
Thomas Gleixner [Sun, 23 Jan 2011 14:16:35 +0000 (15:16 +0100)]
mips: Replace deprecated spinlock initialization
SPIN_LOCK_UNLOCK is deprecated. Use the lockdep capable variant
instead.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Ralf Baechle <ralf@linux-mips.org>
Thomas Gleixner [Sun, 23 Jan 2011 14:14:15 +0000 (15:14 +0100)]
cris: Replace deprecated spinlock initialization
SPIN_LOCK_UNLOCK is deprecated. Use the lockdep capable variant
instead.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Thomas Gleixner [Sun, 23 Jan 2011 14:10:51 +0000 (15:10 +0100)]
alpha: Replace deprecated spinlock initialization
SPIN_LOCK_UNLOCK is deprecated. Use the lockdep capable variant
instead.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Matt Turner <mattst88@gmail.com>
Thomas Gleixner [Thu, 27 Jan 2011 11:29:13 +0000 (12:29 +0100)]
Merge commit 'v2.6.38-rc2' into core/locking
Reason: Update to mainline before adding the locking cleanup
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Arnd Bergmann [Tue, 25 Jan 2011 22:17:32 +0000 (23:17 +0100)]
rtmutex-tester: Remove BKL tests
The BKL is going away, no need to test it any more.
I left the definitions of the test case numbers
in, so that the other tests do not get renumbered.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <
1295993854-4971-19-git-send-email-arnd@arndb.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Linus Torvalds [Wed, 26 Jan 2011 06:31:44 +0000 (16:31 +1000)]
Merge branch 'for-linus' of git://git./linux/kernel/git/dtor/input
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: wacom - pass touch resolution to clients through input_absinfo
Input: wacom - add 2 Bamboo Pen and touch models
Input: sysrq - ensure sysrq_enabled and __sysrq_enabled are consistent
Input: sparse-keymap - fix KEY_VSW handling in sparse_keymap_setup
Input: tegra-kbc - add tegra keyboard driver
Input: gpio_keys - switch to using request_any_context_irq
Input: serio - allow registered drivers to get status flag
Input: ct82710c - return proper error code for ct82c710_open
Input: bu21013_ts - added regulator support
Input: bu21013_ts - remove duplicate resolution parameters
Input: tnetv107x-ts - don't treat NULL clk as an error
Input: tnetv107x-keypad - don't treat NULL clk as an error
Fix up trivial conflicts in drivers/input/keyboard/Makefile due to
additions of tc3589x/Tegra drivers
Ping Cheng [Wed, 26 Jan 2011 02:03:13 +0000 (18:03 -0800)]
Input: wacom - pass touch resolution to clients through input_absinfo
Also remove fake ABS_RX/ABS_RY "axes" that were used to report physical
dimensions now that we have better way.
Signed-off-by: Ping Cheng <pingc@wacom.com>
Reviewed-by: Henrik Rydberg <rydberg@euromail.se>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Torben Hohn [Tue, 25 Jan 2011 23:07:35 +0000 (15:07 -0800)]
console: rename acquire/release_console_sem() to console_lock/unlock()
The -rt patches change the console_semaphore to console_mutex. As a
result, a quite large chunk of the patches changes all
acquire/release_console_sem() to acquire/release_console_mutex()
This commit makes things use more neutral function names which dont make
implications about the underlying lock.
The only real change is the return value of console_trylock which is
inverted from try_acquire_console_sem()
This patch also paves the way to switching console_sem from a semaphore to
a mutex.
[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: make console_trylock return 1 on success, per Geert]
Signed-off-by: Torben Hohn <torbenh@gmx.de>
Cc: Thomas Gleixner <tglx@tglx.de>
Cc: Greg KH <gregkh@suse.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Phillip Lougher [Tue, 25 Jan 2011 23:07:34 +0000 (15:07 -0800)]
squashfs: fix use of uninitialised variable in zlib & xz decompressors
Fix potential use of uninitialised variable caused by recent
decompressor code optimisations.
In zlib_uncompress (zlib_wrapper.c) we have
int zlib_err, zlib_init = 0;
...
do {
...
if (avail == 0) {
offset = 0;
put_bh(bh[k++]);
continue;
}
...
zlib_err = zlib_inflate(stream, Z_SYNC_FLUSH);
...
} while (zlib_err == Z_OK);
If continue is executed (avail == 0) then the while condition will be
evaluated testing zlib_err, which is uninitialised first time around the
loop.
Fix this by getting rid of the 'if (avail == 0)' condition test, this
edge condition should not be being handled in the decompressor code, and
instead handle it generically in the caller code.
Similarly for xz_wrapper.c.
Incidentally, on most architectures (bar Mips and Parisc), no
uninitialised variable warning is generated by gcc, this is because the
while condition test on continue is optimised out and not performed
(when executing continue zlib_err has not been changed since entering
the loop, and logically if the while condition was true previously, then
it's still true).
Signed-off-by: Phillip Lougher <phillip@lougher.demon.co.uk>
Reported-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Toshiyuki Okajima [Tue, 25 Jan 2011 23:07:32 +0000 (15:07 -0800)]
radix_tree: radix_tree_gang_lookup_tag_slot() may never return
Executed command: fsstress -d /mnt -n 600 -p 850
crash> bt
PID: 7947 TASK:
ffff880160546a70 CPU: 0 COMMAND: "fsstress"
#0 [
ffff8800dfc07d00] machine_kexec at
ffffffff81030db9
#1 [
ffff8800dfc07d70] crash_kexec at
ffffffff810a7952
#2 [
ffff8800dfc07e40] oops_end at
ffffffff814aa7c8
#3 [
ffff8800dfc07e70] die_nmi at
ffffffff814aa969
#4 [
ffff8800dfc07ea0] do_nmi_callback at
ffffffff8102b07b
#5 [
ffff8800dfc07f10] do_nmi at
ffffffff814aa514
#6 [
ffff8800dfc07f50] nmi at
ffffffff814a9d60
[exception RIP: __lookup_tag+100]
RIP:
ffffffff812274b4 RSP:
ffff88016056b998 RFLAGS:
00000287
RAX:
0000000000000000 RBX:
0000000000000002 RCX:
0000000000000006
RDX:
000000000000001d RSI:
ffff88016056bb18 RDI:
ffff8800c85366e0
RBP:
ffff88016056b9c8 R8:
ffff88016056b9e8 R9:
0000000000000000
R10:
000000000000000e R11:
ffff8800c8536908 R12:
0000000000000010
R13:
0000000000000040 R14:
ffffffffffffffc0 R15:
ffff8800c85366e0
ORIG_RAX:
ffffffffffffffff CS: 0010 SS: 0018
<NMI exception stack>
#7 [
ffff88016056b998] __lookup_tag at
ffffffff812274b4
#8 [
ffff88016056b9d0] radix_tree_gang_lookup_tag_slot at
ffffffff81227605
#9 [
ffff88016056ba20] find_get_pages_tag at
ffffffff810fc110
#10 [
ffff88016056ba80] pagevec_lookup_tag at
ffffffff81105e85
#11 [
ffff88016056baa0] write_cache_pages at
ffffffff81104c47
#12 [
ffff88016056bbd0] generic_writepages at
ffffffff81105014
#13 [
ffff88016056bbe0] do_writepages at
ffffffff81105055
#14 [
ffff88016056bbf0] __filemap_fdatawrite_range at
ffffffff810fb2cb
#15 [
ffff88016056bc40] filemap_write_and_wait_range at
ffffffff810fb32a
#16 [
ffff88016056bc70] generic_file_direct_write at
ffffffff810fb3dc
#17 [
ffff88016056bce0] __generic_file_aio_write at
ffffffff810fcee5
#18 [
ffff88016056bda0] generic_file_aio_write at
ffffffff810fd085
#19 [
ffff88016056bdf0] do_sync_write at
ffffffff8114f9ea
#20 [
ffff88016056bf00] vfs_write at
ffffffff8114fcf8
#21 [
ffff88016056bf30] sys_write at
ffffffff81150691
#22 [
ffff88016056bf80] system_call_fastpath at
ffffffff8100c0b2
I think this root cause is the following:
radix_tree_range_tag_if_tagged() always tags the root tag with settag
if the root tag is set with iftag even if there are no iftag tags
in the specified range (Of course, there are some iftag tags
outside the specified range).
===============================================================================
[[[Detailed description]]]
(1) Why cannot radix_tree_gang_lookup_tag_slot() return forever?
__lookup_tag():
- Return with 0.
- Return with the index which is not bigger than the old one as the
input parameter.
Therefore the following "while" repeats forever because the above
conditions cause "ret" not to be updated and the cur_index cannot be
changed into the bigger one.
(So, radix_tree_gang_lookup_tag_slot() cannot return forever.)
radix_tree_gang_lookup_tag_slot():
1178 while (ret < max_items) {
1179 unsigned int slots_found;
1180 unsigned long next_index; /* Index of next search */
1181
1182 if (cur_index > max_index)
1183 break;
1184 slots_found = __lookup_tag(node, results + ret,
1185 cur_index, max_items - ret, &next_index,
tag);
1186 ret += slots_found;
// cannot update ret because slots_found == 0.
// so, this while loops forever.
1187 if (next_index == 0)
1188 break;
1189 cur_index = next_index;
1190 }
(2) Why does __lookup_tag() return with 0 and doesn't update the index?
Assuming the following:
- the one of the slot in radix_tree_node is NULL.
- the one of the tag which corresponds to the slot sets with
PAGECACHE_TAG_TOWRITE or other.
- In a certain height(!=0), the corresponding index is 0.
a) __lookup_tag() notices that the tag is set.
1005 static unsigned int
1006 __lookup_tag(struct radix_tree_node *slot, void ***results, unsigned long index,
1007 unsigned int max_items, unsigned long *next_index, unsigned int tag)
1008 {
1009 unsigned int nr_found = 0;
1010 unsigned int shift, height;
1011
1012 height = slot->height;
1013 if (height == 0)
1014 goto out;
1015 shift = (height-1) * RADIX_TREE_MAP_SHIFT;
1016
1017 while (height > 0) {
1018 unsigned long i = (index >> shift) & RADIX_TREE_MAP_MASK ;
1019
1020 for (;;) {
1021 if (tag_get(slot, tag, i))
1022 break;
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
* the index is not updated yet.
b) __lookup_tag() notices that the slot is NULL.
1023 index &= ~((1UL << shift) - 1);
1024 index += 1UL << shift;
1025 if (index == 0)
1026 goto out; /* 32-bit wraparound */
1027 i++;
1028 if (i == RADIX_TREE_MAP_SIZE)
1029 goto out;
1030 }
1031 height--;
1032 if (height == 0) { /* Bottom level: grab some items */
...
1055 }
1056 shift -= RADIX_TREE_MAP_SHIFT;
1057 slot = rcu_dereference_raw(slot->slots[i]);
1058 if (slot == NULL)
1059 break;
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
c) __lookup_tag() doesn't update the index and return with 0.
1060 }
1061 out:
1062 *next_index = index;
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1063 return nr_found;
1064 }
(3) Why is the slot NULL even if the tag is set?
Because radix_tree_range_tag_if_tagged() always sets the root tag with
PAGECACHE_TAG_TOWRITE if the root tag is set with PAGECACHE_TAG_DIRTY,
even if there is no tag which can be set with PAGECACHE_TAG_TOWRITE
in the specified range (from *first_indexp to last_index). Of course,
some PAGECACHE_TAG_DIRTY nodes must exist outside the specified range.
(radix_tree_range_tag_if_tagged() is called only from tag_pages_for_writeback())
640 unsigned long radix_tree_range_tag_if_tagged(struct radix_tree_root
*root,
641 unsigned long *first_indexp, unsigned long last_index,
642 unsigned long nr_to_tag,
643 unsigned int iftag, unsigned int settag)
644 {
645 unsigned int height = root->height;
646 struct radix_tree_path path[height];
647 struct radix_tree_path *pathp = path;
648 struct radix_tree_node *slot;
649 unsigned int shift;
650 unsigned long tagged = 0;
651 unsigned long index = *first_indexp;
652
653 last_index = min(last_index, radix_tree_maxindex(height));
654 if (index > last_index)
655 return 0;
656 if (!nr_to_tag)
657 return 0;
658 if (!root_tag_get(root, iftag)) {
659 *first_indexp = last_index + 1;
660 return 0;
661 }
662 if (height == 0) {
663 *first_indexp = last_index + 1;
664 root_tag_set(root, settag);
665 return 1;
666 }
...
733 root_tag_set(root, settag);
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
734 *first_indexp = index;
735
736 return tagged;
737 }
As the result, there is no radix_tree_node which is set with
PAGECACHE_TAG_TOWRITE but the root tag(radix_tree_root) is set with
PAGECACHE_TAG_TOWRITE.
[figure: inside radix_tree]
(Please see the figure with typewriter font)
===========================================
[roottag = DIRTY]
| tag=0:NOTHING
tag[0 0 0 1] 1:DIRTY
[x x x +] 2:WRITEBACK
| 3:DIRTY,WRITEBACK
p 4:TOWRITE
<---> 5:DIRTY,TOWRITE ...
specified range (index: 0 to 2)
* There is no DIRTY tag within the specified range.
(But there is a DIRTY tag outside that range.)
| | | | | | | | |
after calling tag_pages_for_writeback()
| | | | | | | | |
v v v v v v v v v
[roottag = DIRTY,TOWRITE]
| p is "page".
tag[0 0 0 1] x is NULL.
[x x x +] +- is a pointer to "page".
|
p
* But TOWRITE tag is set on the root tag.
============================================
After that, radix_tree_extend() via radix_tree_insert() is called
when the page is added.
This function sets the new radix_tree_node with PAGECACHE_TAG_TOWRITE
to succeed the status of the root tag.
246 static int radix_tree_extend(struct radix_tree_root *root, unsigned long
index)
247 {
248 struct radix_tree_node *node;
249 unsigned int height;
250 int tag;
251
252 /* Figure out what the height should be. */
253 height = root->height + 1;
254 while (index > radix_tree_maxindex(height))
255 height++;
256
257 if (root->rnode == NULL) {
258 root->height = height;
259 goto out;
260 }
261
262 do {
263 unsigned int newheight;
264 if (!(node = radix_tree_node_alloc(root)))
265 return -ENOMEM;
266
267 /* Increase the height. */
268 node->slots[0] = radix_tree_indirect_to_ptr(root->rnode);
269
270 /* Propagate the aggregated tag info into the new root */
271 for (tag = 0; tag < RADIX_TREE_MAX_TAGS; tag++) {
272 if (root_tag_get(root, tag))
273 tag_set(node, tag, 0);
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
274 }
===========================================
[roottag = DIRTY,TOWRITE]
| :
tag[0 0 0 1] [0 0 0 0]
[x x x +] [+ x x x]
| |
p p (new page)
| | | | | | | | |
after calling radix_tree_insert
| | | | | | | | |
v v v v v v v v v
[roottag = DIRTY,TOWRITE]
|
tag [5 0 0 0] * DIRTY and TOWRITE tags are
[+ + x x] succeeded to the new node.
| |
tag [0 0 0 1] [0 0 0 0]
[x x x +] [+ x x x]
| |
p p
============================================
After that, the index 3 page is released by remove_from_page_cache().
Then we can make the situation that the tag is set with PAGECACHE_TAG_TOWRITE
and that the slot which corresponds to the tag is NULL.
===========================================
[roottag = DIRTY,TOWRITE]
|
tag [5 0 0 0]
[+ + x x]
| |
tag [0 0 0 1] [0 0 0 0]
[x x x +] [+ x x x]
| |
p p
(remove)
| | | | | | | | |
after calling remove_page_cache
| | | | | | | | |
v v v v v v v v v
[roottag = DIRTY,TOWRITE]
|
tag [4 0 0 0] * Only DIRTY tag is cleared
[x + x x] because no TOWRITE tag is existed
| in the bottom node.
[0 0 0 0]
[+ x x x]
|
p
============================================
To solve this problem
Change to that radix_tree_tag_if_tagged() doesn't tag the root tag
if it doesn't set any tags within the specified range.
Like this.
============================================
640 unsigned long radix_tree_range_tag_if_tagged(struct radix_tree_root
*root,
641 unsigned long *first_indexp, unsigned long last_index,
642 unsigned long nr_to_tag,
643 unsigned int iftag, unsigned int settag)
644 {
650 unsigned long tagged = 0;
...
733 if (tagged)
^^^^^^^^^^^^^^^^^^^^^^^^
734 root_tag_set(root, settag);
735 *first_indexp = index;
736
737 return tagged;
738 }
============================================
Signed-off-by: Toshiyuki Okajima <toshi.okajima@jp.fujitsu.com>
Acked-by: Jan Kara <jack@suse.cz>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Voss, Nikolaus [Tue, 25 Jan 2011 23:07:29 +0000 (15:07 -0800)]
drivers/clocksource/tcb_clksrc.c: fix init sequence
setup_irq() was called before clockevents_register_device() which is
needed by the irq handler. Bug was reproducible by restarting the
kernel using kexec (reliable crash).
Signed-off-by: Nikolaus Voss <n.voss@weinmann.de>
Cc: David Brownell <dbrownell@users.sourceforge.net>
Cc: Haavard Skinnemoen <hskinnemoen@atmel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
KAMEZAWA Hiroyuki [Tue, 25 Jan 2011 23:07:29 +0000 (15:07 -0800)]
memcg: fix race at move_parent around compound_order()
A fix up mem_cgroup_move_parent() which use compound_order() in
asynchronous manner. This compound_order() may return unknown value
because we don't take lock. Use PageTransHuge() and HPAGE_SIZE instead
of it.
Also clean up for mem_cgroup_move_parent().
- remove unnecessary initialization of local variable.
- rename charge_size -> page_size
- remove unnecessary (wrong) comment.
- added a comment about THP.
Note:
Current design take compound_page_lock() in caller of move_account().
This should be revisited when we implement direct move_task of hugepage
without splitting.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Balbir Singh <balbir@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
KAMEZAWA Hiroyuki [Tue, 25 Jan 2011 23:07:28 +0000 (15:07 -0800)]
memcg: bugfix check mem_cgroup_disabled() at split fixup
mem_cgroup_disabled() should be checked at splitting. If disabled, no
heavy work is necesary.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Reviewed-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Balbir Singh <balbir@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
KAMEZAWA Hiroyuki [Tue, 25 Jan 2011 23:07:27 +0000 (15:07 -0800)]
memcg: fix account leak at failure of memsw acconting
Commit
4b53433468 ("memcg: clean up try_charge main loop") removes a
cancel of charge at case: memory charge-> success. mem+swap charge->
failure.
This leaks usage of memory. Fix it.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: <stable@kernel.org> [2.6.36+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Minchan Kim [Tue, 25 Jan 2011 23:07:26 +0000 (15:07 -0800)]
mm: migration: clarify migrate_pages() comment
Callers of migrate_pages should putback_lru_pages to return pages
isolated to LRU or free list. Now comment is rather confusing. It says
caller always have to call it.
It is more clear to point out that the caller has to call it if
migrate_pages's return value isn't zero.
Signed-off-by: Minchan Kim <minchan.kim@gmail.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrea Arcangeli [Tue, 25 Jan 2011 23:07:25 +0000 (15:07 -0800)]
mm: compaction: don't depend on HUGETLB_PAGE
Commit
5d6892407 ("thp: select CONFIG_COMPACTION if TRANSPARENT_HUGEPAGE
enabled") causes this warning during the configuration process:
warning: (TRANSPARENT_HUGEPAGE) selects COMPACTION which has unmet
direct dependencies (EXPERIMENTAL && HUGETLB_PAGE && MMU)
COMPACTION doesn't depend on HUGETLB_PAGE, it doesn't depend on THP
either, it is also useful for regular alloc_pages(order > 0) including
the very kernel stack during fork (THREAD_ORDER = 1). It's always
better to enable COMPACTION.
The warning should be an error because we would end up with MIGRATION
not selected, and COMPACTION wouldn't work without migration (despite it
seems to build with an inline migrate_pages returning -ENOSYS).
I'd also like to remove EXPERIMENTAL: compaction has been in the kernel
for some releases (for full safety the default remains disabled which I
think is enough).
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Reported-by: Luca Tettamanti <kronos.it@gmail.com>
Tested-by: Luca Tettamanti <kronos.it@gmail.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jesper Juhl [Tue, 25 Jan 2011 23:07:24 +0000 (15:07 -0800)]
mm/memcontrol.c: fix uninitialized variable use in mem_cgroup_move_parent()
In mm/memcontrol.c::mem_cgroup_move_parent() there's a path that jumps
to the 'put_back' label
ret = __mem_cgroup_try_charge(NULL, gfp_mask, &parent, false, charge);
if (ret || !parent)
goto put_back;
where we'll
if (charge > PAGE_SIZE)
compound_unlock_irqrestore(page, flags);
but, we have not assigned anything to 'flags' at this point, nor have we
called 'compound_lock_irqsave()' (which is what sets 'flags'). The
'put_back' label should be moved below the call to
compound_unlock_irqrestore() as per this patch.
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Pavel Emelianov <xemul@openvz.org>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Rientjes [Tue, 25 Jan 2011 23:07:23 +0000 (15:07 -0800)]
mm: clear pages_scanned only if draining a pcp adds pages to the buddy allocator
Commit
0e093d99763e ("writeback: do not sleep on the congestion queue if
there are no congested BDIs or if significant congestion is not being
encountered in the current zone") uncovered a livelock in the page
allocator that resulted in tasks infinitely looping trying to find
memory and kswapd running at 100% cpu.
The issue occurs because drain_all_pages() is called immediately
following direct reclaim when no memory is freed and try_to_free_pages()
returns non-zero because all zones in the zonelist do not have their
all_unreclaimable flag set.
When draining the per-cpu pagesets back to the buddy allocator for each
zone, the zone->pages_scanned counter is cleared to avoid erroneously
setting zone->all_unreclaimable later. The problem is that no pages may
actually be drained and, thus, the unreclaimable logic never fails
direct reclaim so the oom killer may be invoked.
This apparently only manifested after wait_iff_congested() was
introduced and the zone was full of anonymous memory that would not
congest the backing store. The page allocator would infinitely loop if
there were no other tasks waiting to be scheduled and clear
zone->pages_scanned because of drain_all_pages() as the result of this
change before kswapd could scan enough pages to trigger the reclaim
logic. Additionally, with every loop of the page allocator and in the
reclaim path, kswapd would be kicked and would end up running at 100%
cpu. In this scenario, current and kswapd are all running continuously
with kswapd incrementing zone->pages_scanned and current clearing it.
The problem is even more pronounced when current swaps some of its
memory to swap cache and the reclaimable logic then considers all active
anonymous memory in the all_unreclaimable logic, which requires a much
higher zone->pages_scanned value for try_to_free_pages() to return zero
that is never attainable in this scenario.
Before wait_iff_congested(), the page allocator would incur an
unconditional timeout and allow kswapd to elevate zone->pages_scanned to
a level that the oom killer would be called the next time it loops.
The fix is to only attempt to drain pcp pages if there is actually a
quantity to be drained. The unconditional clearing of
zone->pages_scanned in free_pcppages_bulk() need not be changed since
other callers already ensure that draining will occur. This patch
ensures that free_pcppages_bulk() will actually free memory before
calling into it from drain_all_pages() so zone->pages_scanned is only
cleared if appropriate.
Signed-off-by: David Rientjes <rientjes@google.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Reviewed-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Rientjes [Tue, 25 Jan 2011 23:07:20 +0000 (15:07 -0800)]
mm: fix deferred congestion timeout if preferred zone is not allowed
Before
0e093d99763e ("writeback: do not sleep on the congestion queue if
there are no congested BDIs or if significant congestion is not being
encountered in the current zone"), preferred_zone was only used for NUMA
statistics, to determine the zoneidx from which to allocate from given
the type requested, and whether to utilize memory compaction.
wait_iff_congested(), though, uses preferred_zone to determine if the
congestion wait should be deferred because its dirty pages are backed by
a congested bdi. This incorrectly defers the timeout and busy loops in
the page allocator with various cond_resched() calls if preferred_zone
is not allowed in the current context, usually consuming 100% of a cpu.
This patch ensures preferred_zone is an allowed zone in the fastpath
depending on whether current is constrained by its cpuset or nodes in
its mempolicy (when the nodemask passed is non-NULL). This is correct
since the fastpath allocation always passes ALLOC_CPUSET when trying to
allocate memory. In the slowpath, this patch resets preferred_zone to
the first zone of the allowed type when the allocation is not
constrained by current's cpuset, i.e. it does not pass ALLOC_CPUSET.
This patch also ensures preferred_zone is from the set of allowed nodes
when called from within direct reclaim since allocations are always
constrained by cpusets in this context (it is blockable).
Both of these uses of cpuset_current_mems_allowed are protected by
get_mems_allowed().
Signed-off-by: David Rientjes <rientjes@google.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alexander Gordeev [Tue, 25 Jan 2011 23:07:19 +0000 (15:07 -0800)]
pps: claim parallel port exclusively
Both pps_parport and pps_gen_parport are written in a way that they
can't share a port with any other driver. This can result in locking up
the process that loads modules or even the whole kernel if the modules
are compiled in. Use PARPORT_FLAG_EXCL to indicate this.
Signed-off-by: Alexander Gordeev <lasaine@lvk.cs.msu.su>
Cc: Alexander Gordeev <lasaine@lvk.cs.msu.su>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rodolfo Giometti [Tue, 25 Jan 2011 23:07:17 +0000 (15:07 -0800)]
pps ktimer: remove noisy message
Signed-off-by: Rodolfo Giometti <giometti@linux.it>
Reported-by: Ingo Molnar <mingo@elte.hu>
Cc: Alexander Gordeev <lasaine@lvk.cs.msu.su>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alexander Gordeev [Tue, 25 Jan 2011 23:07:16 +0000 (15:07 -0800)]
parport: make lockdep happy with waitlist_lock
parport_unregister_device() should never be used when interrupts are
enabled in hardware and irq handler is registered so there is no need to
disable interrupts when using waitlist_lock. But there is no way to
explain this subtle semantics to lockdep analyzer.
So disable interrupts here too to simplify things. The price is
negligible.
Signed-off-by: Alexander Gordeev <lasaine@lvk.cs.msu.su>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Feng Tang [Tue, 25 Jan 2011 23:07:15 +0000 (15:07 -0800)]
langwell_gpio: modify EOI handling following change of kernel irq subsystem
Latest kernel has many changes in IRQ subsystem and its interfaces, like
adding "irq_eoi" for struct irq_chip, this patch is a follow up change
for that.
Also remove the unnecessary cast for a "void *".
Signed-off-by: Feng Tang <feng.tang@intel.com>
Cc: Alek Du <alek.du@intel.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Axel Lin [Tue, 25 Jan 2011 23:07:14 +0000 (15:07 -0800)]
leds: leds-pwm: return proper error if pwm_request failed
Return PTR_ERR(led_dat->pwm) instead of 0 if pwm_request failed
Signed-off-by: Axel Lin <axel.lin@gmail.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Cc: Luotao Fu <l.fu@pengutronix.de>
Cc: Reviewed-by: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrew Morton [Tue, 25 Jan 2011 23:07:11 +0000 (15:07 -0800)]
mm/pgtable-generic.c: fix CONFIG_SWAP=n build
mips (and sparc32):
In file included from arch/mips/include/asm/tlb.h:21,
from mm/pgtable-generic.c:9:
include/asm-generic/tlb.h: In function `tlb_flush_mmu':
include/asm-generic/tlb.h:76: error: implicit declaration of function `release_pages'
include/asm-generic/tlb.h: In function `tlb_remove_page':
include/asm-generic/tlb.h:105: error: implicit declaration of function `page_cache_release'
free_pages_and_swap_cache() and free_page_and_swap_cache() are macros
which call release_pages() and page_cache_release(). The obvious fix is
to include pagemap.h in swap.h, where those macros are defined. But that
breaks sparc for weird reasons.
So fix it within mm/pgtable-generic.c instead.
Reported-by: Yoichi Yuasa <yuasa@linux-mips.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Cc: Sergei Shtylyov <sshtylyov@mvista.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrea Arcangeli [Tue, 25 Jan 2011 23:07:09 +0000 (15:07 -0800)]
thp: fix PARAVIRT x86 32bit noPAE
This fixes TRANSPARENT_HUGEPAGE=y with PARAVIRT=y and HIGHMEM64=n.
The #ifdef that this patch removes was erratically introduced to fix a
build error for noPAE (where pmd.pmd doesn't exist). So then the kernel
built but it failed at runtime because set_pmd_at was a noop. This will
correct it by enabling set_pmd_at for noPAE mode too.
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Reported-by: werner <w.landgraf@ru.ru>
Reported-by: Minchan Kim <minchan.kim@gmail.com>
Tested-by: Minchan Kim <minchan.kim@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Tue, 25 Jan 2011 23:04:18 +0000 (09:04 +1000)]
Merge branch 'fixes' of /home/rmk/linux-2.6-arm
* 'fixes' of master.kernel.org:/home/rmk/linux-2.6-arm:
ALSA: AACI: fix timeout duration
ALSA: AACI: fix timeout condition checking
ARM: 6636/1: ep93xx: default multiplexed gpio ports to gpio mode
ARM: 6637/1: Make the argument to virt_to_phys() "const volatile"
ARM: twd: ensure timer reload is reprogrammed on entry to periodic mode
ARM: 6635/2: Configure reference clock for Versatile Express timers
ARM: versatile: name configuration options after actual board names
ARM: realview: name configuration options after actual board names
ARM: realview,vexpress: fix section mismatch warning for pen_release
ARM: 6632/3: mmci: stop using the blockend interrupts
Linus Torvalds [Tue, 25 Jan 2011 23:03:36 +0000 (09:03 +1000)]
Merge branch 'for-linus' of git://git./linux/kernel/git/ryusuke/nilfs2
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2:
nilfs2: fix crash after one superblock became unavailable
Linus Torvalds [Tue, 25 Jan 2011 23:02:55 +0000 (09:02 +1000)]
Merge branch 'for-linus' of git://git./linux/kernel/git/geert/linux-m68k
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
m68k/amiga: Fix "debug=mem"
m68k/atari: Rename "scc" to "atari_scc"
m68k: Uninline strchr()
Linus Torvalds [Tue, 25 Jan 2011 23:02:14 +0000 (09:02 +1000)]
Merge branch 'rmobile-fixes-for-linus' of git://git./linux/kernel/git/lethal/sh-2.6
* 'rmobile-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6:
ARM: mach-shmobile: AG5EVM LCDC / MIPI-DSI platform data
ARM: mach-shmobile: sh73a0 CPGA fix for PLL CFG bit
ARM: mach-shmobile: mackerel: clarify shdi/mmcif switch settings
ARM: mach-shmobile: sh73a0 CPGA fix for IrDA MSTP
ARM: mach-shmobile: sh73a0 CPGA fix for FRQCRA M3
ARM: mach-shmobile: remove sh7367 on-chip set_irq_type()
ARM: mach-shmobile: sh7372 INTCS MFIS2 interrupt update
ARM: mach-shmobile: ag5evm: Add IrDA support
ARM: mach-shmobile: clock-sh7372: fixup pllc2 set_rate
mmc: sh_mmcif: Convert to __raw_xxx() I/O accessors.
ARM: mach-shmobile: ag5evm requires GPIOLIB
ARM: mach-shmobile: fix cpu_base of gic_init() on sh73a0
Linus Torvalds [Tue, 25 Jan 2011 23:01:22 +0000 (09:01 +1000)]
Merge branch 'fbdev-fixes-for-linus' of git://git./linux/kernel/git/lethal/fbdev-2.6
* 'fbdev-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/fbdev-2.6:
mailmap: Add an entry for Axel Lin.
video: fix some comments in drivers/video/console/vgacon.c
drivers/video/bf537-lq035.c: Add missing IS_ERR test
video: pxa168fb: remove a redundant pxa168fb_check_var call
video: da8xx-fb: fix fb_probe error path
video: pxa3xx-gcu: Return -EFAULT when copy_from_user() fails
video: nuc900fb: properly free resources in nuc900fb_remove
video: nuc900fb: fix compile error
Linus Torvalds [Tue, 25 Jan 2011 23:00:17 +0000 (09:00 +1000)]
Merge branch 'sh-fixes-for-linus' of git://git./linux/kernel/git/lethal/sh-2.6
* 'sh-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6:
sh: Fix build of sh7750 base boards
sh: update INTC to clear IRQ sense valid flag
sh: Fix sh build failure when CONFIG_SFC=m
sh: fix MSIOF0 SPI on ecovec: it conflicts with VOU
sh: support XZ-compressed kernel.
sh: Fix up breakage from asm-generic/pgtable.h changes.
David Howells [Tue, 25 Jan 2011 16:34:28 +0000 (16:34 +0000)]
KEYS: Fix __key_link_end() quota fixup on error
Fix __key_link_end()'s attempt to fix up the quota if an error occurs.
There are two erroneous cases: Firstly, we always decrease the quota if
the preallocated replacement keyring needs cleaning up, irrespective of
whether or not we should (we may have replaced a pointer rather than
adding another pointer).
Secondly, we never clean up the quota if we added a pointer without the
keyring storage being extended (we allocate multiple pointers at a time,
even if we're not going to use them all immediately).
We handle this by setting the bottom bit of the preallocation pointer in
__key_link_begin() to indicate that the quota needs fixing up, which is
then passed to __key_link() (which clears the whole thing) and
__key_link_end().
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alan Cox [Tue, 25 Jan 2011 14:33:36 +0000 (14:33 +0000)]
intel_scu_ipcutils: Fix the license tag
GPL V2 should be GPL v2
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alan Cox [Tue, 25 Jan 2011 14:18:38 +0000 (14:18 +0000)]
Documentation: Fix kernel parameter ordering
A B C D E ...
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Axel Lin [Tue, 25 Jan 2011 14:12:12 +0000 (14:12 +0000)]
intel_scu_ipc: fix signedness bug
busy_loop() returns negative error code, thus change err variable
from u32 to int to properly propagate correct error code.
Also remove unneeded initialization for err and i variables.
Signed-off-by: Axel Lin <axel.lin@gmail.com>
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Russell King [Tue, 25 Jan 2011 21:18:11 +0000 (21:18 +0000)]
Merge branch 'mmci' into fixes
Russell King [Wed, 12 Jan 2011 23:42:57 +0000 (23:42 +0000)]
ALSA: AACI: fix timeout duration
Relying on the access time of peripherals is unreliable - it depends
on the speed of the CPU and the bus. On Versatile Express, these
timeouts were expiring, causing the driver to fail.
Add udelay(1) to ensure that they don't expire early, and adjust
timeouts to give a reasonable margin over the response times.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Russell King [Wed, 12 Jan 2011 23:17:24 +0000 (23:17 +0000)]
ALSA: AACI: fix timeout condition checking
Ensure that a timeout coincident with the condition being waited for
results in success rather than failure. This helps avoid timeout
conditions being inappropriately flagged.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Hartley Sweeten [Tue, 25 Jan 2011 00:05:35 +0000 (01:05 +0100)]
ARM: 6636/1: ep93xx: default multiplexed gpio ports to gpio mode
The EP93xx C and D GPIO ports are multiplexed with the Keypad Interface
peripheral. Â At power-up they default into non-GPIO mode with the Key
Matrix controller enabled so these ports are unusable for GPIO. Â Note
that the Keypad Interface peripheral is only available in the EP9307,
EP9312, and EP9315 processor variants.
The keypad support will clear the DeviceConfig bits appropriately to
enable the Keypad Interface when the driver is loaded. Â And, when the
driver is unloaded it will set the bits to return the ports to GPIO mode.
To make these ports available for GPIO after power-up on all EP93xx
processor variants, set the KEYS and GONK bits in the DeviceConfig
register.
Similarly, the E, G, and H ports are multiplexed with the IDE Interface
peripheral. Â At power-up these also default into non-GPIO mode. Â Note
that the IDE peripheral is only available in the EP9312 and EP9315
processor variants.
Since an IDE driver is not even available in mainline, set the EONIDE,
GONIDE, and HONIDE bits in the DeviceConfig register so that these
ports will be available for GPIO use after power-up.
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Acked-by: Ryan Mallon <ryan@bluewatersys.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Catalin Marinas [Tue, 25 Jan 2011 10:18:25 +0000 (11:18 +0100)]
ARM: 6637/1: Make the argument to virt_to_phys() "const volatile"
Changing the virt_to_phys() argument to "const volatile void *" avoids
compiler warnings in some situations where this function is used.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Stephen Boyd <sboyd@codeaurora.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Russell King [Tue, 25 Jan 2011 10:35:36 +0000 (10:35 +0000)]
ARM: twd: ensure timer reload is reprogrammed on entry to periodic mode
Ensure that the twd timer reload value is reprogrammed each time we
enter periodic mode. This ensures that the reload value is always
reset correctly.
Tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Acked-by: Colin Cross <ccross@android.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Pawel Moll [Tue, 25 Jan 2011 14:53:03 +0000 (15:53 +0100)]
ARM: 6635/2: Configure reference clock for Versatile Express timers
Timers on Versatile Express mainboard are used as system clock/event
sources. Driver assumes that they are clocked with 1MHz signal.
Old V2M firmware apparently configured it by default, but on newer
boards one can observe that "sleep 1" command takes over 30 seconds
to finish, as the timers are fed with 32kHz instead...
This patch performs required magic and also removes code clearing
timer's control registers, as exactly the same operations are
performed by the timer driver few jiffies later.
Signed-off-by: Pawel Moll <pawel.moll@arm.com>
Tested-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Russell King [Mon, 24 Jan 2011 12:00:01 +0000 (12:00 +0000)]
ARM: versatile: name configuration options after actual board names
Update the option text to those which appear on the front of the
appropriate board user guides. This gives consistent board naming, and
makes it obvious which option is for which platform.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Russell King [Mon, 24 Jan 2011 10:58:24 +0000 (10:58 +0000)]
ARM: realview: name configuration options after actual board names
As no one seems to really know which configuration options tie up with
which boards, I thought I'd do some investigation and try to work it
out. After discussion with some folk in linaro, I think I have this
nailed.
The names are updated to use the name on the front of the appropriate
board user guide for the various baseboards, which I've taken to be
the official name for each board.
I haven't significantly updated the descriptions for the tiles as that
is even less clear - as far as I can see on ARMs website, there is no
Cortex-A9 tile for Realview EB - only ARM11MPCore, ARM1156T2F-S,
ARM1176TZF-S and Cortex-R4F. So exactly what this 'Multicore Cortex-A9
Tile' is...
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Russell King [Sat, 22 Jan 2011 17:22:34 +0000 (17:22 +0000)]
ARM: realview,vexpress: fix section mismatch warning for pen_release
Fix two section mismatch warnings in the platform SMP bringup code for
Realview and Versatile Express:
WARNING: arch/arm/mach-realview/built-in.o(.text+0x8ac): Section mismatch in reference from the function write_pen_release() to the variable .cpuinit.data:pen_release
The function write_pen_release() references
the variable __cpuinitdata pen_release.
This is often because write_pen_release lacks a __cpuinitdata
annotation or the annotation of pen_release is wrong.
WARNING: arch/arm/mach-vexpress/built-in.o(.text+0x7b4): Section mismatch in reference from the function write_pen_release() to the variable .cpuinit.data:pen_release
The function write_pen_release() references
the variable __cpuinitdata pen_release.
This is often because write_pen_release lacks a __cpuinitdata
annotation or the annotation of pen_release is wrong.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Paul Mundt [Tue, 25 Jan 2011 06:30:55 +0000 (15:30 +0900)]
mailmap: Add an entry for Axel Lin.
Not all of Axel's patches have used a consistent casing, so fix it up
here.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Amerigo Wang [Wed, 19 Jan 2011 06:24:02 +0000 (06:24 +0000)]
video: fix some comments in drivers/video/console/vgacon.c
Now vgacon_scrollback_startup() uses slab, not bootmem,
the comment above it is obsolete, so does __init_refok.
Signed-off-by: WANG Cong <amwang@redhat.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Julia Lawall [Mon, 24 Jan 2011 19:55:21 +0000 (19:55 +0000)]
drivers/video/bf537-lq035.c: Add missing IS_ERR test
lcd_device_register may return ERR_PTR, so a check is added for this value
before the dereference. All of the other changes reorganize the error
handling code in this function to avoid duplicating all of it in the added
case.
In the original code, in one case, the global variable fb_buffer was set to
NULL in error code that appears after this variable is initialized. This
is done now in all error handling code that has this property.
The semantic match that finds this problem is as follows:
(http://coccinelle.lip6.fr/)
// <smpl>
@r@
identifier f;
@@
f(...) { ... return ERR_PTR(...); }
@@
identifier r.f, fld;
expression x;
statement S1,S2;
@@
x = f(...)
... when != IS_ERR(x)
(
if (IS_ERR(x) ||...) S1 else S2
|
*x->fld
)
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
axel lin [Fri, 21 Jan 2011 11:18:06 +0000 (11:18 +0000)]
video: pxa168fb: remove a redundant pxa168fb_check_var call
Current implementation calls pxa168fb_check_var twice in pxa168fb_probe.
Signed-off-by: Axel Lin <axel.lin@gmail.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
axel lin [Thu, 20 Jan 2011 03:50:51 +0000 (03:50 +0000)]
video: da8xx-fb: fix fb_probe error path
Current implementation puts CONFIG_CPU_FREQ at wrong place, CONFIG_CPU_FREQ
is for lcd_da8xx_cpufreq_deregister not for unregister_framebuffer.
Signed-off-by: Axel Lin <axel.lin@gmail.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Nobuhiro Iwamatsu [Mon, 24 Jan 2011 01:40:17 +0000 (01:40 +0000)]
sh: Fix build of sh7750 base boards
Renamed platform_register_device to platform_device_register.
Signed-off-by: Nobuhiro Iwamatsu <nobuhiro.iwamatsu.yj@renesas.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Linus Torvalds [Tue, 25 Jan 2011 04:23:54 +0000 (14:23 +1000)]
Merge git://git./linux/kernel/git/sfrench/cifs-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
Make CIFS mount work in a container.
CIFS: Remove pointless variable assignment in cifs_dfs_do_automount()
Linus Torvalds [Tue, 25 Jan 2011 01:01:33 +0000 (11:01 +1000)]
Merge branch 'for-38-rc3' of git://codeaurora.org/quic/kernel/davidb/linux-msm
* 'for-38-rc3' of git://codeaurora.org/quic/kernel/davidb/linux-msm:
drivers: mmc: msm: remove clock disable in probe
mmc: msm: fix dma usage not to use internal APIs
Linus Torvalds [Tue, 25 Jan 2011 00:46:14 +0000 (10:46 +1000)]
Merge branch 'drm-fixes' of git://git./linux/kernel/git/airlied/drm-2.6
* 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
drm/radeon/kms: add new radeon_info ioctl query for clock crystal freq
drm/i915: Prevent uninitialised reads during error state capture
drm/i915: Use consistent mappings for OpRegion between ACPI and i915
drm/i915: Handle the no-interrupts case for UMS by polling
drm/i915: Disable high-precision vblank timestamping for UMS
drm/i915: Increase the amount of defense before computing vblank timestamps
drm/i915,agp/intel: Do not clear stolen entries
drm/radeon/kms: simplify atom adjust pll setup
drm/radeon/kms: match r6xx/r7xx/evergreen asic_reset with previous asics
drm/radeon/kms: make the mac rv630 quirk generic
drm/radeon/kms: fix a spelling error in an error message
drm/radeon/kms: Initialize pageflip spinlocks.
drm/i915: Recognise non-VGA display devices
drm/i915: Fix use of invalid array size for ring->sync_seqno
drm/i915/ringbuffer: Fix use of stale HEAD position whilst polling for space
drm/i915: Don't kick-off hangcheck after a DRI interrupt
drm/i915: Add dependency on CONFIG_TMPFS
drm/i915: Initialise ring vfuncs for old DRI paths
drm/i915: make the blitter report buffer modifications to the FBC unit
drm/i915: set more FBC chicken bits
Dave Airlie [Mon, 24 Jan 2011 22:41:58 +0000 (08:41 +1000)]
Merge branch 'drm-intel-fixes-2' of ssh:///linux/kernel/git/ickle/drm-intel into drm-fixes
* 'drm-intel-fixes-2' of ssh://master.kernel.org/pub/scm/linux/kernel/git/ickle/drm-intel: (30 commits)
drm/i915: Prevent uninitialised reads during error state capture
drm/i915: Use consistent mappings for OpRegion between ACPI and i915
drm/i915: Handle the no-interrupts case for UMS by polling
drm/i915: Disable high-precision vblank timestamping for UMS
drm/i915: Increase the amount of defense before computing vblank timestamps
drm/i915,agp/intel: Do not clear stolen entries
Remove MAYBE_BUILD_BUG_ON
BUILD_BUG_ON: make it handle more cases
module: fix missing semicolons in MODULE macro usage
param: add null statement to compiled-in module params
module: fix linker error for MODULE_VERSION when !MODULE and CONFIG_SYSFS=n
module: show version information for built-in modules in sysfs
selinux: return -ENOMEM when memory allocation fails
tpm: fix panic caused by "tpm: Autodetect itpm devices"
TPM: Long default timeout fix
trusted keys: Fix a memory leak in trusted_update().
keys: add trusted and encrypted maintainers
encrypted-keys: rename encrypted_defined files to encrypted
trusted-keys: rename trusted_defined files to trusted
drm/i915: Recognise non-VGA display devices
...
Alex Deucher [Mon, 24 Jan 2011 22:14:26 +0000 (17:14 -0500)]
drm/radeon/kms: add new radeon_info ioctl query for clock crystal freq
Needed for timer queries in the 3D driver.
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@gmail.com>
Shaohua Li [Mon, 24 Jan 2011 08:00:01 +0000 (08:00 +0000)]
fix a shutdown regression in intel_idle
Fix a shutdown regression caused by
2a2d31c8dc6f ("intel_idle: open
broadcast clock event"). The clockevent framework can automatically
shutdown broadcast timers for hotremove CPUs. And we get a shutdown
regression when we shutdown broadcast timer for hot remove CPU, so just
delete some code.
Also fix some section mismatch.
Reported-by: Ari Savolainen <ari.m.savolainen@gmail.com>
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Tested-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Mon, 24 Jan 2011 19:29:49 +0000 (05:29 +1000)]
Merge branch 'omap-fixes-for-linus' of git://git./linux/kernel/git/tmlind/linux-omap-2.6
* 'omap-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap-2.6:
omap: DMA: clear interrupt status correctly
OMAP3: Devkit8000: Fix tps65930 pullup/pulldown configuration
arm: omap3: cm-t3517: minor comment fix
arm: omap3: cm-t3517: rtc fix
omap1: Fix sched_clock implementation when both MPU timer and 32K timer are used
omap1: Fix booting for 15xx and 730 with omap1_defconfig
omap1: Fix sched_clock for the MPU timer
OMAP: PRCM: remove duplicated headers
OMAP4: clockdomain: bypass unimplemented wake-up dependency functions on OMAP4
OMAP: counter_32k: init clocksource as part of machine timer init
Linus Torvalds [Mon, 24 Jan 2011 19:26:47 +0000 (05:26 +1000)]
Merge branch 'perf-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
perf tools: Fix time function double declaration with glibc
perf tools: Fix build by checking if extra warnings are supported
perf tools: Fix build when using gcc 3.4.6
perf tools: Add missing header, fixes build
perf tools: Fix 64 bit integer format strings
perf test: Fix build on older glibcs
perf: perf_event_exit_task_context: s/rcu_dereference/rcu_dereference_raw/
perf test: Use cpu_map->[cpu] when setting affinity
perf symbols: Fix annotation of thumb code
perf: Annotate cpuctx->ctx.mutex to avoid a lockdep splat
powerpc, perf: Fix frequency calculation for overflowing counters (FSL version)
perf: Fix perf_event_init_task()/perf_event_free_task() interaction
perf: Fix find_get_context() vs perf_event_exit_task() race
Linus Torvalds [Mon, 24 Jan 2011 19:25:55 +0000 (05:25 +1000)]
Merge branch 'timers-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
RTC: Remove Kconfig symbol for UIE emulation
RTC: Properly handle rtc_read_alarm error propagation and fix bug
RTC: Propagate error handling via rtc_timer_enqueue properly
acpi_pm: Clear pmtmr_ioport if acpi_pm initialization fails
rtc: Cleanup removed UIE emulation declaration
hrtimers: Notify hrtimer users of switches to NOHZ mode
Linus Torvalds [Mon, 24 Jan 2011 19:25:13 +0000 (05:25 +1000)]
Merge branch 'sched-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched: Fix poor interactivity on UP systems due to group scheduler nice tune bug
Linus Torvalds [Mon, 24 Jan 2011 19:24:12 +0000 (05:24 +1000)]
Merge branch 'x86-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: Fix jump label with RO/NX module protection crash
x86, hotplug: Fix powersavings with offlined cores on AMD
x86, mcheck, therm_throt.c: Export symbol platform_thermal_notify to allow coretemp to handler intr
x86: Use asm-generic/cacheflush.h
x86: Update CPU cache attributes table descriptors
Chris Wilson [Mon, 24 Jan 2011 12:34:00 +0000 (12:34 +0000)]
drm/i915: Prevent uninitialised reads during error state capture
error_bo and pinned_bo could be used uninitialised if there were no
active buffers.
Caught by kmemcheck.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Michael Karcher [Sun, 23 Jan 2011 18:17:17 +0000 (18:17 +0000)]
drm/i915: Use consistent mappings for OpRegion between ACPI and i915
The opregion is a shared memory region between ACPI and the graphics
driver. As the ACPI mapping has been changed to cachable in commit
6d5bbf00d251cc73223a71422d69e069dc2e0b8d, mapping the intel opregion
non-cachable now fails. As no bus-master hardware is involved in the
opregion, cachable map should do no harm.
Tested on a Fujitsu Lifebook P8010.
Signed-off-by: Michael Karcher <kernel@mkarcher.dialup.fu-berlin.de>
[ickle: convert to acpi_os_ioremap for consistency]
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Chris Wilson [Sun, 23 Jan 2011 17:22:16 +0000 (17:22 +0000)]
Merge remote branch 'linus/master' into drm-intel-fixes
Merge with Linus to resolve conflicting fixes for the reusing the stale
HEAD value during intel_ring_wait().
Conflicts:
drivers/gpu/drm/i915/intel_ringbuffer.c
Chris Wilson [Sun, 23 Jan 2011 13:03:24 +0000 (13:03 +0000)]
drm/i915: Handle the no-interrupts case for UMS by polling
If the driver calls into the kernel to wait for a breadcrumb to pass,
but hasn't enabled interrupts, fallback to polling the breadcrumb value.
Reported-by: Chris Clayton <chris2553@googlemail.com>
Tested-by: Chris Clayton <chris2553@googlemail.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Chris Wilson [Sun, 23 Jan 2011 10:45:14 +0000 (10:45 +0000)]
drm/i915: Disable high-precision vblank timestamping for UMS
We only have sufficient information for accurate (sub-frame) timestamping
when the modesetting is under our control.
Reported-by: Chris Clayton <chris2553@googlemail.com>
Tested-by: Chris Clayton <chris2553@googlemail.com>
Reviewed-by: Mario Kleiner <mario.kleiner@tuebingen.mpg.de>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Chris Wilson [Sat, 22 Jan 2011 10:07:56 +0000 (10:07 +0000)]
drm/i915: Increase the amount of defense before computing vblank timestamps
Reported-by: Chris Clayton <chris2553@googlemail.com>
Tested-by: Chris Clayton <chris2553@googlemail.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Chris Wilson [Fri, 21 Jan 2011 10:54:32 +0000 (10:54 +0000)]
drm/i915,agp/intel: Do not clear stolen entries
We can only utilize the stolen portion of the GTT if we are in sole
charge of the hardware. This is only true if using GEM and KMS,
otherwise VESA continues to access stolen memory.
Reported-by: Arnd Bergmann <arnd@arndb.de>
Reported-by: Frederic Weisbecker <fweisbec@gmail.com>
Tested-by: Jiri Olsa <jolsa@redhat.com>
Tested-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Ping Cheng [Mon, 24 Jan 2011 17:32:50 +0000 (09:32 -0800)]
Input: wacom - add 2 Bamboo Pen and touch models
Reported-by: David Foley <favux.is@gmail.com>
Signed-off-by: Ping Cheng <pingc@wacom.com>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Andy Whitcroft [Mon, 24 Jan 2011 17:31:38 +0000 (09:31 -0800)]
Input: sysrq - ensure sysrq_enabled and __sysrq_enabled are consistent
Currently sysrq_enabled and __sysrq_enabled are initialised separately
and inconsistently, leading to sysrq being actually enabled by reported
as not enabled in sysfs. The first change to the sysfs configurable
synchronises these two:
static int __read_mostly sysrq_enabled = 1;
static int __sysrq_enabled;
Add a common define to carry the default for these preventing them becoming
out of sync again. Default this to 1 to mirror previous behaviour.
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Cc: stable@kernel.org
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Linus Walleij [Mon, 24 Jan 2011 14:22:13 +0000 (15:22 +0100)]
ARM: 6632/3: mmci: stop using the blockend interrupts
Implement a suggestion from Russell to drop the use of blockend
interrupts altogether and instead rely on the data counter.
Tested with error-free cards on U300, U8500 and RealView PB1176.
Signed-off-by: Ulf Hansson <ulf.hansson@stericsson.com>
Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Yong Zhang [Mon, 24 Jan 2011 07:33:52 +0000 (15:33 +0800)]
sched: Fix poor interactivity on UP systems due to group scheduler nice tune bug
Michael Witten and Christian Kujau reported that the autogroup
scheduling feature hurts interactivity on their UP systems.
It turns out that this is an older bug in the group scheduling code,
and the wider appeal provided by the autogroup feature exposed it
more prominently.
When on UP with FAIR_GROUP_SCHED enabled, tune shares
only affect tg->shares, but is not reflected in
tg->se->load. The reason is that update_cfs_shares()
does nothing on UP.
So introduce update_cfs_shares() for UP && FAIR_GROUP_SCHED.
This issue was found when enable autogroup scheduling was enabled,
but it is an older bug that also exists on cgroup.cpu on UP.
Reported-and-Tested-by: Michael Witten <mfwitten@gmail.com>
Reported-and-Tested-by: Christian Kujau <christian@nerdbynature.de>
Signed-off-by: Yong Zhang <yong.zhang0@gmail.com>
Acked-by: Pekka Enberg <penberg@kernel.org>
Acked-by: Mike Galbraith <efault@gmx.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <
20110124073352.GA24186@windriver.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Linus Torvalds [Mon, 24 Jan 2011 09:58:39 +0000 (19:58 +1000)]
Merge branch 'BUG_ON' of git://git./linux/kernel/git/rusty/linux-2.6-for-linus
* 'BUG_ON' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
Remove MAYBE_BUILD_BUG_ON
BUILD_BUG_ON: make it handle more cases
Linus Torvalds [Mon, 24 Jan 2011 09:57:43 +0000 (19:57 +1000)]
Merge git://git./linux/kernel/git/rusty/linux-2.6-for-linus
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
module: fix missing semicolons in MODULE macro usage
param: add null statement to compiled-in module params
module: fix linker error for MODULE_VERSION when !MODULE and CONFIG_SYSFS=n
module: show version information for built-in modules in sysfs
Linus Torvalds [Mon, 24 Jan 2011 09:56:47 +0000 (19:56 +1000)]
Merge branch 'for-linus' of git://git./linux/kernel/git/jmorris/security-testing-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6:
selinux: return -ENOMEM when memory allocation fails
tpm: fix panic caused by "tpm: Autodetect itpm devices"
TPM: Long default timeout fix
trusted keys: Fix a memory leak in trusted_update().
keys: add trusted and encrypted maintainers
encrypted-keys: rename encrypted_defined files to encrypted
trusted-keys: rename trusted_defined files to trusted
Rob Landley [Sat, 22 Jan 2011 21:44:05 +0000 (15:44 -0600)]
Make CIFS mount work in a container.
Teach cifs about network namespaces, so mounting uses adresses/routing
visible from the container rather than from init context.
A container is a chroot on steroids that changes more than just the root
filesystem the new processes see. One thing containers can isolate is
"network namespaces", meaning each container can have its own set of
ethernet interfaces, each with its own own IP address and routing to the
outside world. And if you open a socket in _userspace_ from processes
within such a container, this works fine.
But sockets opened from within the kernel still use a single global
networking context in a lot of places, meaning the new socket's address
and routing are correct for PID 1 on the host, but are _not_ what
userspace processes in the container get to use.
So when you mount a network filesystem from within in a container, the
mount code in the CIFS driver uses the host's networking context and not
the container's networking context, so it gets the wrong address, uses
the wrong routing, and may even try to go out an interface that the
container can't even access... Bad stuff.
This patch copies the mount process's network context into the CIFS
structure that stores the rest of the server information for that mount
point, and changes the socket open code to use the saved network context
instead of the global network context. I.E. "when you attempt to use
these addresses, do so relative to THIS set of network interfaces and
routing rules, not the old global context from back before we supported
containers".
The big long HOWTO sets up a test environment on the assumption you've
never used ocntainers before. It basically says:
1) configure and build a new kernel that has container support
2) build a new root filesystem that includes the userspace container
control package (LXC)
3) package/run them under KVM (so you don't have to mess up your host
system in order to play with containers).
4) set up some containers under the KVM system
5) set up contradictory routing in the KVM system and the container so
that the host and the container see different things for the same address
6) try to mount a CIFS share from both contexts so you can both force it
to work and force it to fail.
For a long drawn out test reproduction sequence, see:
http://landley.livejournal.com/47024.html
http://landley.livejournal.com/47205.html
http://landley.livejournal.com/47476.html
Signed-off-by: Rob Landley <rlandley@parallels.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Rusty Russell [Mon, 24 Jan 2011 20:45:10 +0000 (14:45 -0600)]
Remove MAYBE_BUILD_BUG_ON
Now BUILD_BUG_ON() can handle optimizable constants, we don't need
MAYBE_BUILD_BUG_ON any more.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Mon, 24 Jan 2011 20:45:10 +0000 (14:45 -0600)]
BUILD_BUG_ON: make it handle more cases
BUILD_BUG_ON used to use the optimizer to do code elimination or fail
at link time; it was changed to first the size of a negative array (a
nicer compile time error), then (in
8c87df457cb58fe75b9b893007917cf8095660a0) to a bitfield.
This forced us to change some non-constant cases to MAYBE_BUILD_BUG_ON();
as Jan points out in that commit, it didn't work as intended anyway.
bitfields: needs a literal constant at parse time, and can't be put under
"if (__builtin_constant_p(x))" for example.
negative array: can handle anything, but if the compiler can't tell it's
a constant, silently has no effect.
link time: breaks link if the compiler can't determine the value, but the
linker output is not usually as informative as a compiler error.
If we use the negative-array-size method *and* the link time trick,
we get the ability to use BUILD_BUG_ON() under __builtin_constant_p()
branches, and maximal ability for the compiler to detect errors at
build time.
We also document it thoroughly.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jan Beulich <JBeulich@novell.com>
Acked-by: Hollis Blanchard <hollisb@us.ibm.com>
Rusty Russell [Mon, 24 Jan 2011 20:32:52 +0000 (14:32 -0600)]
module: fix missing semicolons in MODULE macro usage
You always needed them when you were a module, but the builtin versions
of the macros used to be more lenient.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Linus Walleij [Wed, 5 Jan 2011 12:27:04 +0000 (13:27 +0100)]
param: add null statement to compiled-in module params
Add an unused struct declaration statement requiring a
terminating semicolon to the compile-in case to provoke an
error if __MODULE_INFO() is used without the terminating
semicolon. Previously MODULE_ALIAS("foo") (no semicolon)
compiled fine if MODULE was not selected.
Cc: Dan Carpenter <error27@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Mon, 24 Jan 2011 20:32:51 +0000 (14:32 -0600)]
module: fix linker error for MODULE_VERSION when !MODULE and CONFIG_SYSFS=n
lib/built-in.o:(__modver+0x8): undefined reference to `__modver_version_show'
lib/built-in.o:(__modver+0x2c): undefined reference to `__modver_version_show'
Simplest to just not emit anything: if they've disabled SYSFS they probably
want the smallest kernel possible.
Reported-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Dmitry Torokhov [Wed, 15 Dec 2010 22:00:19 +0000 (14:00 -0800)]
module: show version information for built-in modules in sysfs
Currently only drivers that are built as modules have their versions
shown in /sys/module/<module_name>/version, but this information might
also be useful for built-in drivers as well. This especially important
for drivers that do not define any parameters - such drivers, if
built-in, are completely invisible from userspace.
This patch changes MODULE_VERSION() macro so that in case when we are
compiling built-in module, version information is stored in a separate
section. Kernel then uses this data to create 'version' sysfs attribute
in the same fashion it creates attributes for module parameters.
Signed-off-by: Dmitry Torokhov <dtor@vmware.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Jesper Juhl [Sat, 22 Jan 2011 20:07:16 +0000 (21:07 +0100)]
CIFS: Remove pointless variable assignment in cifs_dfs_do_automount()
In fs/cifs/cifs_dfs_ref.c::cifs_dfs_do_automount() we have this code:
...
mnt = ERR_PTR(-EINVAL);
if (IS_ERR(tlink)) {
mnt = ERR_CAST(tlink);
goto free_full_path;
}
ses = tlink_tcon(tlink)->ses;
rc = get_dfs_path(xid, ses, full_path + 1, cifs_sb->local_nls,
&num_referrals, &referrals,
cifs_sb->mnt_cifs_flags & CIFS_MOUNT_MAP_SPECIAL_CHR);
cifs_put_tlink(tlink);
mnt = ERR_PTR(-ENOENT);
...
The assignment of 'mnt = ERR_PTR(-EINVAL);' is completely pointless. If we
take the 'if (IS_ERR(tlink))' branch we'll set 'mnt' again and we'll also
do so if we do not take the branch. There is no way we'll ever use 'mnt'
with the assigned 'ERR_PTR(-EINVAL)' value, so we may as well just remove
the pointless assignment.
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Steve French <sfrench@us.ibm.com>