openwrt/staging/blogic.git
13 years agoGFS2: Remove potential race in flock code
Steven Whitehouse [Wed, 9 Mar 2011 11:14:32 +0000 (11:14 +0000)]
GFS2: Remove potential race in flock code

This patch ensures that we always wait for glock demotion when
dropping flocks on a file in order to prevent any race
conditions associated with further flock calls or closing
the file.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
13 years agoGFS2: Fix glock deallocation race
Steven Whitehouse [Wed, 9 Mar 2011 10:58:04 +0000 (10:58 +0000)]
GFS2: Fix glock deallocation race

This patch fixes a race in deallocating glocks which was introduced
in the RCU glock patch. We need to ensure that the glock count is
kept correct even in the case that there is a race to add a new
glock into the hash table. Also, to avoid having to wait for an
RCU grace period, the glock counter can be decremented before
call_rcu() is called.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
13 years agoGFS2: quota allows exceeding hard limit
Abhijith Das [Tue, 8 Mar 2011 15:40:42 +0000 (10:40 -0500)]
GFS2: quota allows exceeding hard limit

Immediately after being synced to disk, cached quotas are zeroed out and a
subsequent access of the cached quotas results in incorrect zero values. This
meant that gfs2 assumed the actual usage to be the zero (or near-zero) usage
values it found in the cached quotas and comparison against warn/limits never
triggered a quota violation.

This patch adds a new flag QDF_REFRESH that is set after a sync so that the
cached quotas are forcefully refreshed from disk on a subsequent access on
seeing this flag set.

Resolves: rhbz#675944
Signed-off-by: Abhi Das <adas@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
13 years agoGFS2: deallocation performance patch
Bob Peterson [Wed, 23 Feb 2011 21:11:33 +0000 (16:11 -0500)]
GFS2: deallocation performance patch

This patch is a performance improvement to GFS2's dealloc code.
Rather than update the quota file and statfs file for every
single block that's stripped off in unlink function do_strip,
this patch keeps track and updates them once for every layer
that's stripped.  This is done entirely inside the existing
transaction, so there should be no risk of corruption.
The other functions that deallocate blocks will be unaffected
because they are using wrapper functions that do the same
thing that they do today.

I tested this code on my roth cluster by creating 200
files in a directory, each of which is 100MB, then on
four nodes, I simultaneously deleted the files, thus competing
for GFS2 resources (but different files).  The commands
I used were:

[root@roth-01]# time for i in `seq 1 4 200` ; do rm /mnt/gfs2/bigdir/gfs2.$i; done
[root@roth-02]# time for i in `seq 2 4 200` ; do rm /mnt/gfs2/bigdir/gfs2.$i; done
[root@roth-03]# time for i in `seq 3 4 200` ; do rm /mnt/gfs2/bigdir/gfs2.$i; done
[root@roth-05]# time for i in `seq 4 4 200` ; do rm /mnt/gfs2/bigdir/gfs2.$i; done

The performance increase was significant:

             roth-01     roth-02     roth-03     roth-05
             ---------   ---------   ---------   ---------
old: real    0m34.027    0m25.021s   0m23.906s   0m35.646s
new: real    0m22.379s   0m24.362s   0m24.133s   0m18.562s

Total time spent deleting:
old: 118.6s
new:  89.4

For this particular case, this showed a 25% performance increase for
GFS2 unlinks.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
13 years agoGFS2: panics on quotacheck update
Abhijith Das [Mon, 7 Feb 2011 16:22:41 +0000 (11:22 -0500)]
GFS2: panics on quotacheck update

Handle block allocation for forceful unstuffing of quota dinode during quota
update using quotactl(). Also fix block reservation for special cases when
quotas cross over block boundaries and update 2 blocks instead of 1.

Signed-off-by: Abhi Das <adas@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
13 years agoGFS2: Improve cluster mmap scalability
Steven Whitehouse [Wed, 2 Feb 2011 14:48:10 +0000 (14:48 +0000)]
GFS2: Improve cluster mmap scalability

The mmap system call grabs a glock when an update to atime maybe
required. It does this in order to ensure that the flags on the
inode are uptodate, but since it will only mark atime for a future
update, an exclusive lock is not required here (one will be taken
later when the actual update is performed).

Also, the lock can be skipped when the mount is marked noatime in
addition to the original check which only looked at the noatime
flag for the inode itself.

This should increase the scalability of the mmap call when multiple
nodes are all mmaping the same file.

Reported-by: Scooter Morris <scooter@cgl.ucsf.edu>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
13 years agoGFS2: Fix glock queue trace point
Steven Whitehouse [Mon, 31 Jan 2011 09:38:12 +0000 (09:38 +0000)]
GFS2: Fix glock queue trace point

Somehow this tracepoint landed up in the wrong place. This moves it
to where it should be.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
14 years agoGFS2: Post-VFS scale update for RCU path walk
Steven Whitehouse [Wed, 19 Jan 2011 09:42:40 +0000 (09:42 +0000)]
GFS2: Post-VFS scale update for RCU path walk

We can allow a few more cases to use RCU path walking than
originally allowed. It should be possible to also enable
RCU path walking when the glock is already cached. Thats
a bit more complicated though, so left for a future patch.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: Nick Piggin <npiggin@gmail.com>
14 years agoGFS2: Use RCU for glock hash table
Steven Whitehouse [Wed, 19 Jan 2011 09:30:01 +0000 (09:30 +0000)]
GFS2: Use RCU for glock hash table

This has a number of advantages:

 - Reduces contention on the hash table lock
 - Makes the code smaller and simpler
 - Should speed up glock dumps when under load
 - Removes ref count changing in examine_bucket
 - No longer need hash chain lock in glock_put() in common case

There are some further changes which this enables and which
we may do in the future. One is to look at using SLAB_RCU,
and another is to look at using a per-cpu counter for the
per-sb glock counter, since that is touched twice in the
lifetime of each glock (but only used at umount time).

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
14 years agoMerge branch 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 21 Jan 2011 02:30:37 +0000 (18:30 -0800)]
Merge branch 'core-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip

* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  smp: Allow on_each_cpu() to be called while early_boot_irqs_disabled status to init/main.c
  lockdep: Move early boot local IRQ enable/disable status to init/main.c

14 years agoACPI / PM: Call suspend_nvs_free() earlier during resume
Rafael J. Wysocki [Wed, 19 Jan 2011 21:27:55 +0000 (22:27 +0100)]
ACPI / PM: Call suspend_nvs_free() earlier during resume

It turns out that some device drivers map pages from the ACPI NVS region
during resume using ioremap(), which conflicts with ioremap_cache() used
for mapping those pages by the NVS save/restore code in nvs.c.

Make the NVS pages mapped by the code in nvs.c be unmapped before device
drivers' resume routines run.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agoACPI: Introduce acpi_os_ioremap()
Rafael J. Wysocki [Wed, 19 Jan 2011 21:27:14 +0000 (22:27 +0100)]
ACPI: Introduce acpi_os_ioremap()

Commit ca9b600be38c ("ACPI / PM: Make suspend_nvs_save() use
acpi_os_map_memory()") attempted to prevent the code in osl.c and nvs.c
from using different ioremap() variants by making the latter use
acpi_os_map_memory() for mapping the NVS pages.  However, that also
requires acpi_os_unmap_memory() to be used for unmapping them, which
causes synchronize_rcu() to be executed many times in a row
unnecessarily and introduces substantial delays during resume on some
systems.

Instead of using acpi_os_map_memory() for mapping the NVS pages in nvs.c
introduce acpi_os_ioremap() calling ioremap_cache() and make the code in
both osl.c and nvs.c use it.

Reported-by: Jeff Chua <jeff.chua.linux@gmail.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agoMerge branch 'akpm'
Linus Torvalds [Fri, 21 Jan 2011 01:02:14 +0000 (17:02 -0800)]
Merge branch 'akpm'

* akpm:
  kernel/smp.c: consolidate writes in smp_call_function_interrupt()
  kernel/smp.c: fix smp_call_function_many() SMP race
  memcg: correctly order reading PCG_USED and pc->mem_cgroup
  backlight: fix 88pm860x_bl macro collision
  drivers/leds/ledtrig-gpio.c: make output match input, tighten input checking
  MAINTAINERS: update Atmel AT91 entry
  mm: fix truncate_setsize() comment
  memcg: fix rmdir, force_empty with THP
  memcg: fix LRU accounting with THP
  memcg: fix USED bit handling at uncharge in THP
  memcg: modify accounting function for supporting THP better
  fs/direct-io.c: don't try to allocate more than BIO_MAX_PAGES in a bio
  mm: compaction: prevent division-by-zero during user-requested compaction
  mm/vmscan.c: remove duplicate include of compaction.h
  memblock: fix memblock_is_region_memory()
  thp: keep highpte mapped until it is no longer needed
  kconfig: rename CONFIG_EMBEDDED to CONFIG_EXPERT

14 years agokernel/smp.c: consolidate writes in smp_call_function_interrupt()
Milton Miller [Thu, 20 Jan 2011 22:44:34 +0000 (14:44 -0800)]
kernel/smp.c: consolidate writes in smp_call_function_interrupt()

We have to test the cpu mask in the interrupt handler before checking the
refs, otherwise we can start to follow an entry before its deleted and
find it partially initailzed for the next trip.  Presently we also clear
the cpumask bit before executing the called function, which implies
getting write access to the line.  After the function is called we then
decrement refs, and if they go to zero we then unlock the structure.

However, this implies getting write access to the call function data
before and after another the function is called.  If we can assert that no
smp_call_function execution function is allowed to enable interrupts, then
we can move both writes to after the function is called, hopfully allowing
both writes with one cache line bounce.

On a 256 thread system with a kernel compiled for 1024 threads, the time
to execute testcase in the "smp_call_function_many race" changelog was
reduced by about 30-40ms out of about 545 ms.

I decided to keep this as WARN because its now a buggy function, even
though the stack trace is of no value -- a simple printk would give us the
information needed.

Raw data:

Without patch:
  ipi_test startup took 1219366ns complete 539819014ns total 541038380ns
  ipi_test startup took 1695754ns complete 543439872ns total 545135626ns
  ipi_test startup took 7513568ns complete 539606362ns total 547119930ns
  ipi_test startup took 13304064ns complete 533898562ns total 547202626ns
  ipi_test startup took 8668192ns complete 544264074ns total 552932266ns
  ipi_test startup took 4977626ns complete 548862684ns total 553840310ns
  ipi_test startup took 2144486ns complete 541292318ns total 543436804ns
  ipi_test startup took 21245824ns complete 530280180ns total 551526004ns

With patch:
  ipi_test startup took 5961748ns complete 500859628ns total 506821376ns
  ipi_test startup took 8975996ns complete 495098924ns total 504074920ns
  ipi_test startup took 19797750ns complete 492204740ns total 512002490ns
  ipi_test startup took 14824796ns complete 487495878ns total 502320674ns
  ipi_test startup took 11514882ns complete 494439372ns total 505954254ns
  ipi_test startup took 8288084ns complete 502570774ns total 510858858ns
  ipi_test startup took 6789954ns complete 493388112ns total 500178066ns

#include <linux/module.h>
#include <linux/init.h>
#include <linux/sched.h> /* sched clock */

#define ITERATIONS 100

static void do_nothing_ipi(void *dummy)
{
}

static void do_ipis(struct work_struct *dummy)
{
int i;

for (i = 0; i < ITERATIONS; i++)
smp_call_function(do_nothing_ipi, NULL, 1);

printk(KERN_DEBUG "cpu %d finished\n", smp_processor_id());
}

static struct work_struct work[NR_CPUS];

static int __init testcase_init(void)
{
int cpu;
u64 start, started, done;

start = local_clock();
for_each_online_cpu(cpu) {
INIT_WORK(&work[cpu], do_ipis);
schedule_work_on(cpu, &work[cpu]);
}
started = local_clock();
for_each_online_cpu(cpu)
flush_work(&work[cpu]);
done = local_clock();
pr_info("ipi_test startup took %lldns complete %lldns total %lldns\n",
started-start, done-started, done-start);

return 0;
}

static void __exit testcase_exit(void)
{
}

module_init(testcase_init)
module_exit(testcase_exit)
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Anton Blanchard");

Signed-off-by: Milton Miller <miltonm@bga.com>
Cc: Anton Blanchard <anton@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agokernel/smp.c: fix smp_call_function_many() SMP race
Anton Blanchard [Thu, 20 Jan 2011 22:44:33 +0000 (14:44 -0800)]
kernel/smp.c: fix smp_call_function_many() SMP race

I noticed a failure where we hit the following WARN_ON in
generic_smp_call_function_interrupt:

                if (!cpumask_test_and_clear_cpu(cpu, data->cpumask))
                        continue;

                data->csd.func(data->csd.info);

                refs = atomic_dec_return(&data->refs);
                WARN_ON(refs < 0);      <-------------------------

We atomically tested and cleared our bit in the cpumask, and yet the
number of cpus left (ie refs) was 0.  How can this be?

It turns out commit 54fdade1c3332391948ec43530c02c4794a38172
("generic-ipi: make struct call_function_data lockless") is at fault.  It
removes locking from smp_call_function_many and in doing so creates a
rather complicated race.

The problem comes about because:

 - The smp_call_function_many interrupt handler walks call_function.queue
   without any locking.
 - We reuse a percpu data structure in smp_call_function_many.
 - We do not wait for any RCU grace period before starting the next
   smp_call_function_many.

Imagine a scenario where CPU A does two smp_call_functions back to back,
and CPU B does an smp_call_function in between.  We concentrate on how CPU
C handles the calls:

CPU A            CPU B                  CPU C              CPU D

smp_call_function
                                        smp_call_function_interrupt
                                            walks
call_function.queue sees
data from CPU A on list

                 smp_call_function

                                        smp_call_function_interrupt
                                            walks

                                        call_function.queue sees
                                          (stale) CPU A on list
   smp_call_function int
   clears last ref on A
   list_del_rcu, unlock
smp_call_function reuses
percpu *data A
                                         data->cpumask sees and
                                         clears bit in cpumask
                                         might be using old or new fn!
                                         decrements refs below 0

set data->refs (too late!)

The important thing to note is since the interrupt handler walks a
potentially stale call_function.queue without any locking, then another
cpu can view the percpu *data structure at any time, even when the owner
is in the process of initialising it.

The following test case hits the WARN_ON 100% of the time on my PowerPC
box (having 128 threads does help :)

#include <linux/module.h>
#include <linux/init.h>

#define ITERATIONS 100

static void do_nothing_ipi(void *dummy)
{
}

static void do_ipis(struct work_struct *dummy)
{
int i;

for (i = 0; i < ITERATIONS; i++)
smp_call_function(do_nothing_ipi, NULL, 1);

printk(KERN_DEBUG "cpu %d finished\n", smp_processor_id());
}

static struct work_struct work[NR_CPUS];

static int __init testcase_init(void)
{
int cpu;

for_each_online_cpu(cpu) {
INIT_WORK(&work[cpu], do_ipis);
schedule_work_on(cpu, &work[cpu]);
}

return 0;
}

static void __exit testcase_exit(void)
{
}

module_init(testcase_init)
module_exit(testcase_exit)
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Anton Blanchard");

I tried to fix it by ordering the read and the write of ->cpumask and
->refs.  In doing so I missed a critical case but Paul McKenney was able
to spot my bug thankfully :) To ensure we arent viewing previous
iterations the interrupt handler needs to read ->refs then ->cpumask then
->refs _again_.

Thanks to Milton Miller and Paul McKenney for helping to debug this issue.

[miltonm@bga.com: add WARN_ON and BUG_ON, remove extra read of refs before initial read of mask that doesn't help (also noted by Peter Zijlstra), adjust comments, hopefully clarify scenario ]
[miltonm@bga.com: remove excess tests]
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Milton Miller <miltonm@bga.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: <stable@kernel.org> [2.6.32+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agomemcg: correctly order reading PCG_USED and pc->mem_cgroup
Johannes Weiner [Thu, 20 Jan 2011 22:44:31 +0000 (14:44 -0800)]
memcg: correctly order reading PCG_USED and pc->mem_cgroup

The placement of the read-side barrier is confused: the writer first
sets pc->mem_cgroup, then PCG_USED.  The read-side barrier has to be
between testing PCG_USED and reading pc->mem_cgroup.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agobacklight: fix 88pm860x_bl macro collision
Randy Dunlap [Thu, 20 Jan 2011 22:44:31 +0000 (14:44 -0800)]
backlight: fix 88pm860x_bl macro collision

Fix collision with kernel-supplied #define:

  drivers/video/backlight/88pm860x_bl.c:24:1: warning: "CURRENT_MASK" redefined
  arch/x86/include/asm/page_64_types.h:6:1: warning: this is the location of the previous definition

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Haojian Zhuang <haojian.zhuang@marvell.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agodrivers/leds/ledtrig-gpio.c: make output match input, tighten input checking
Janusz Krzysztofik [Thu, 20 Jan 2011 22:44:29 +0000 (14:44 -0800)]
drivers/leds/ledtrig-gpio.c: make output match input, tighten input checking

Replicate changes made to drivers/leds/ledtrig-backlight.c.

Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Richard Purdie <richard.purdie@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agoMAINTAINERS: update Atmel AT91 entry
Nicolas Ferre [Thu, 20 Jan 2011 22:44:27 +0000 (14:44 -0800)]
MAINTAINERS: update Atmel AT91 entry

Add two co-maintainers and update the entry with new information.

Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Acked-by: Andrew Victor <linux@maxim.org.za>
Acked-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agomm: fix truncate_setsize() comment
Jan Kara [Thu, 20 Jan 2011 22:44:26 +0000 (14:44 -0800)]
mm: fix truncate_setsize() comment

Contrary to what the comment says, truncate_setsize() should be called
*before* filesystem truncated blocks.

Signed-off-by: Jan Kara <jack@suse.cz>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agomemcg: fix rmdir, force_empty with THP
KAMEZAWA Hiroyuki [Thu, 20 Jan 2011 22:44:25 +0000 (14:44 -0800)]
memcg: fix rmdir, force_empty with THP

Now, when THP is enabled, memcg's rmdir() function is broken because
move_account() for THP page is not supported.

This will cause account leak or -EBUSY issue at rmdir().
This patch fixes the issue by supporting move_account() THP pages.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agomemcg: fix LRU accounting with THP
KAMEZAWA Hiroyuki [Thu, 20 Jan 2011 22:44:24 +0000 (14:44 -0800)]
memcg: fix LRU accounting with THP

memory cgroup's LRU stat should take care of size of pages because
Transparent Hugepage inserts hugepage into LRU.  If this value is the
number wrong, memory reclaim will not work well.

Note: only head page of THP's huge page is linked into LRU.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agomemcg: fix USED bit handling at uncharge in THP
KAMEZAWA Hiroyuki [Thu, 20 Jan 2011 22:44:24 +0000 (14:44 -0800)]
memcg: fix USED bit handling at uncharge in THP

Now, under THP:

at charge:
  - PageCgroupUsed bit is set to all page_cgroup on a hugepage.
    ....set to 512 pages.
at uncharge
  - PageCgroupUsed bit is unset on the head page.

So, some pages will remain with "Used" bit.

This patch fixes that Used bit is set only to the head page.
Used bits for tail pages will be set at splitting if necessary.

This patch adds this lock order:
   compound_lock() -> page_cgroup_move_lock().

[akpm@linux-foundation.org: fix warning]
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agomemcg: modify accounting function for supporting THP better
KAMEZAWA Hiroyuki [Thu, 20 Jan 2011 22:44:23 +0000 (14:44 -0800)]
memcg: modify accounting function for supporting THP better

mem_cgroup_charge_statisics() was designed for charging a page but now, we
have transparent hugepage.  To fix problems (in following patch) it's
required to change the function to get the number of pages as its
arguments.

The new function gets following as argument.
  - type of page rather than 'pc'
  - size of page which is accounted.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agofs/direct-io.c: don't try to allocate more than BIO_MAX_PAGES in a bio
David Dillow [Thu, 20 Jan 2011 22:44:22 +0000 (14:44 -0800)]
fs/direct-io.c: don't try to allocate more than BIO_MAX_PAGES in a bio

When using devices that support max_segments > BIO_MAX_PAGES (256), direct
IO tries to allocate a bio with more pages than allowed, which leads to an
oops in dio_bio_alloc().  Clamp the request to the supported maximum, and
change dio_bio_alloc() to reflect that bio_alloc() will always return a
bio when called with __GFP_WAIT and a valid number of vectors.

[akpm@linux-foundation.org: remove redundant BUG_ON()]
Signed-off-by: David Dillow <dillowda@ornl.gov>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agomm: compaction: prevent division-by-zero during user-requested compaction
Johannes Weiner [Thu, 20 Jan 2011 22:44:21 +0000 (14:44 -0800)]
mm: compaction: prevent division-by-zero during user-requested compaction

Up until 3e7d344 ("mm: vmscan: reclaim order-0 and use compaction instead
of lumpy reclaim"), compaction skipped calculating the fragmentation index
of a zone when compaction was explicitely requested through the procfs
knob.

However, when compaction_suitable was introduced, it did not come with an
extra check for order == -1, set on explicit compaction requests, and
passed this order on to the fragmentation index calculation, where it
overshifts the number of requested pages, leading to a division by zero.

This patch makes sure that order == -1 is recognized as the flag it is
rather than passing it along as valid order parameter.

[akpm@linux-foundation.org: add comment, per Mel]
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agomm/vmscan.c: remove duplicate include of compaction.h
Jesper Juhl [Thu, 20 Jan 2011 22:44:20 +0000 (14:44 -0800)]
mm/vmscan.c: remove duplicate include of compaction.h

Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agomemblock: fix memblock_is_region_memory()
Tomi Valkeinen [Thu, 20 Jan 2011 22:44:20 +0000 (14:44 -0800)]
memblock: fix memblock_is_region_memory()

memblock_is_region_memory() uses reserved memblocks to search for the
given region, while it should use the memory memblocks.

I encountered the problem with OMAP's framebuffer ram allocation.
Normally the ram is allocated dynamically, and this function is not
called.  However, if we want to pass the framebuffer from the bootloader
to the kernel (to retain the boot image), this function is used to check
the validity of the kernel parameters for the framebuffer ram area.

Signed-off-by: Tomi Valkeinen <tomi.valkeinen@nokia.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agothp: keep highpte mapped until it is no longer needed
Johannes Weiner [Thu, 20 Jan 2011 22:44:18 +0000 (14:44 -0800)]
thp: keep highpte mapped until it is no longer needed

Two users reported THP-related crashes on 32-bit x86 machines.  Their oops
reports indicated an invalid pte, and subsequent code inspection showed
that the highpte is actually used after unmap.

The fix is to unmap the pte only after all operations against it are
finished.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reported-by: Ilya Dryomov <idryomov@gmail.com>
Reported-by: werner <w.landgraf@ru.ru>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Tested-by: Ilya Dryomov <idryomov@gmail.com>
Tested-by: Steven Rostedt <rostedt@goodmis.org
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agokconfig: rename CONFIG_EMBEDDED to CONFIG_EXPERT
David Rientjes [Thu, 20 Jan 2011 22:44:16 +0000 (14:44 -0800)]
kconfig: rename CONFIG_EMBEDDED to CONFIG_EXPERT

The meaning of CONFIG_EMBEDDED has long since been obsoleted; the option
is used to configure any non-standard kernel with a much larger scope than
only small devices.

This patch renames the option to CONFIG_EXPERT in init/Kconfig and fixes
references to the option throughout the kernel.  A new CONFIG_EMBEDDED
option is added that automatically selects CONFIG_EXPERT when enabled and
can be used in the future to isolate options that should only be
considered for embedded systems (RISC architectures, SLOB, etc).

Calling the option "EXPERT" more accurately represents its intention: only
expert users who understand the impact of the configuration changes they
are making should enable it.

Reviewed-by: Ingo Molnar <mingo@elte.hu>
Acked-by: David Woodhouse <david.woodhouse@intel.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Cc: Greg KH <gregkh@suse.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Robin Holt <holt@sgi.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agoMerge branch 'tty-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh...
Linus Torvalds [Fri, 21 Jan 2011 00:39:23 +0000 (16:39 -0800)]
Merge branch 'tty-linus' of git://git./linux/kernel/git/gregkh/tty-2.6

* 'tty-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6:
  tty: update MAINTAINERS file due to driver movement
  tty: move drivers/serial/ to drivers/tty/serial/
  tty: move hvc drivers to drivers/tty/hvc/

14 years agoMerge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 21 Jan 2011 00:37:55 +0000 (16:37 -0800)]
Merge branch 'sched-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip

* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  sched, cgroup: Use exit hook to avoid use-after-free crash
  sched: Fix signed unsigned comparison in check_preempt_tick()
  sched: Replace rq->bkl_count with rq->rq_sched_info.bkl_count
  sched, autogroup: Fix CONFIG_RT_GROUP_SCHED sched_setscheduler() failure
  sched: Display autogroup names in /proc/sched_debug
  sched: Reinstate group names in /proc/sched_debug
  sched: Update effective_load() to use global share weights

14 years agoMerge branch 'xen/xenbus' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen
Linus Torvalds [Fri, 21 Jan 2011 00:37:28 +0000 (16:37 -0800)]
Merge branch 'xen/xenbus' of git://git./linux/kernel/git/jeremy/xen

* 'xen/xenbus' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen:
  xenbus: Fix memory leak on release
  xenbus: avoid zero returns from read()
  xenbus: add missing wakeup in concurrent read/write
  xenbus: allow any xenbus command over /proc/xen/xenbus
  xenfs/xenbus: report partial reads/writes correctly

14 years agoMerge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6
Linus Torvalds [Fri, 21 Jan 2011 00:36:38 +0000 (16:36 -0800)]
Merge branch 'master' of git://git./linux/kernel/git/sfrench/cifs-2.6

* 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
  cifs: mangle existing header for SMB_COM_NT_CANCEL
  cifs: remove code for setting timeouts on requests
  [CIFS] cifs: reconnect unresponsive servers
  cifs: set up recurring workqueue job to do SMB echo requests
  cifs: add ability to send an echo request
  cifs: add cifs_call_async
  cifs: allow for different handling of received response
  cifs: clean up sync_mid_result
  cifs: don't reconnect server when we don't get a response
  cifs: wait indefinitely for responses
  cifs: Use mask of ACEs for SID Everyone to calculate all three permissions user, group, and other
  cifs: Fix regression during share-level security mounts (Repost)
  [CIFS] Update cifs version number
  cifs: move mid result processing into common function
  cifs: move locked sections out of DeleteMidQEntry and AllocMidQEntry
  cifs: clean up accesses to midCount
  cifs: make wait_for_free_request take a TCP_Server_Info pointer
  cifs: no need to mark smb_ses_list as cifs_demultiplex_thread is exiting
  cifs: don't fail writepages on -EAGAIN errors
  CIFS: Fix oplock break handling (try #2)

14 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus
Linus Torvalds [Fri, 21 Jan 2011 00:31:20 +0000 (16:31 -0800)]
Merge git://git./linux/kernel/git/rusty/linux-2.6-for-linus

* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
  virtio: remove virtio-pci root device
  LGUEST_GUEST: fix unmet direct dependencies (VIRTUALIZATION && VIRTIO)
  lguest: compile fixes
  lguest: Use this_cpu_ops
  lguest: document --rng in example Launcher
  lguest: example launcher to use guard pages, drop PROT_EXEC, fix limit logic
  lguest: --username and --chroot options

14 years agoMerge branch 'for-38-rc2' of git://codeaurora.org/quic/kernel/davidb/linux-msm
Linus Torvalds [Fri, 21 Jan 2011 00:30:22 +0000 (16:30 -0800)]
Merge branch 'for-38-rc2' of git://codeaurora.org/quic/kernel/davidb/linux-msm

* 'for-38-rc2' of git://codeaurora.org/quic/kernel/davidb/linux-msm:
  msm: qsd8x50: Platform data isn't init data

14 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris...
Linus Torvalds [Fri, 21 Jan 2011 00:29:43 +0000 (16:29 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/jmorris/security-testing-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6:
  trusted-keys: avoid scattring va_end()
  trusted-keys: check for NULL before using it
  trusted-keys: another free memory bugfix
  trusted-keys: free memory bugfix

14 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-fixes
Linus Torvalds [Fri, 21 Jan 2011 00:29:04 +0000 (16:29 -0800)]
Merge git://git./linux/kernel/git/steve/gfs2-2.6-fixes

* git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-fixes:
  GFS2: Fix error path in gfs2_lookup_by_inum()
  GFS2: remove iopen glocks from cache on failed deletes

14 years agoMerge branch 'acpica' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux...
Linus Torvalds [Fri, 21 Jan 2011 00:28:34 +0000 (16:28 -0800)]
Merge branch 'acpica' of git://git./linux/kernel/git/lenb/linux-acpi-2.6

* 'acpica' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6:
  ACPICA: Update version to 20110112
  ACPICA: Update all ACPICA copyrights and signons to 2011
  ACPICA: Fix issues/fault with automatic "serialized" method support
  ACPICA: Debugger: Lock namespace for duration of a namespace dump
  ACPICA: Fix namespace race condition
  ACPICA: Fix memory leak in acpi_ev_asynch_execute_gpe_method().

14 years agoFix broken "pipe: use event aware wakeups" optimization
Linus Torvalds [Fri, 21 Jan 2011 00:21:59 +0000 (16:21 -0800)]
Fix broken "pipe: use event aware wakeups" optimization

Commit e462c448fdc8 ("pipe: use event aware wakeups") optimized the pipe
event wakeup calls to avoid wakeups if the events do not match the
requested set.

However, the optimization was buggy, in that it didn't actually use the
correct sets for the events: when we make room for more data to be
written, the pipe poll() routine will return both the POLLOUT _and_
POLLWRNORM bits.  Similarly for read.

And most critically, when a pipe is released, that will potentially
result in POLLHUP|POLLERR (depending on whether it was the last reader
or writer), not just the regular POLLIN|POLLOUT.

This bug showed itself as a hung gnome-screensaver-dialog process, stuck
forever (or at least until it was poked by a signal or by being traced)
in a poll() system call.

Cc: Davide Libenzi <davidel@xmailserver.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agoi915: Fix i915 suspend delay
Linus Torvalds [Thu, 20 Jan 2011 21:19:55 +0000 (13:19 -0800)]
i915: Fix i915 suspend delay

During system suspend, the "wait for ring buffer to empty" loop would
always time out after three seconds, because the faster cached ring
buffer head read would always return zero.  Force the slow-and-careful
PIO read on all but the first iterations of the loop to fix it.

This also removes the unused (and useless) 'actual_head' variable that
tried to approximate doing this, but did it incorrectly.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Dave Airlie <airlied@linux.ie>
Cc: DRI mailing list <dri-devel@lists.freedesktop.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agoACPI / Battery: remove battery refresh on resume
Linus Torvalds [Thu, 20 Jan 2011 21:14:10 +0000 (13:14 -0800)]
ACPI / Battery: remove battery refresh on resume

This partially reverts commit da8aeb92d4853f37e281f11fddf61f9c7d84c3cd
("ACPI / Battery: Update information on info notification and resume"),
which causes a hang on resume on at least some machines.

This bug was bisected on an ASUS EeePC 901, which hangs at resume time
if we do that "acpi_battery_refresh(battery)" in the battery resume
function.

Rafael suspects we'll still need to refresh the sysfs files upon resume,
but that that can be done from a PM notifier (that will run after
thawing user space).

Bisected-and-tested-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matthew Garrett <mjg@redhat.com>
Cc: Len Brown <len.brown@intel.com>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agocifs: mangle existing header for SMB_COM_NT_CANCEL
Jeff Layton [Tue, 11 Jan 2011 12:24:24 +0000 (07:24 -0500)]
cifs: mangle existing header for SMB_COM_NT_CANCEL

The NT_CANCEL command looks just like the original command, except for a
few small differences. The send_nt_cancel function however currently takes
a tcon, which we don't have in SendReceive and SendReceive2.

Instead of "respinning" the entire header for an NT_CANCEL, just mangle
the existing header by replacing just the fields we need. This means we
don't need a tcon and allows us to call it from other places.

Reviewed-by: Pavel Shilovsky <piastryyy@gmail.com>
Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
14 years agocifs: remove code for setting timeouts on requests
Jeff Layton [Tue, 11 Jan 2011 12:24:23 +0000 (07:24 -0500)]
cifs: remove code for setting timeouts on requests

Since we don't time out individual requests anymore, remove the code
that we used to use for setting timeouts on different requests.

Reviewed-by: Pavel Shilovsky <piastryyy@gmail.com>
Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
14 years ago[CIFS] cifs: reconnect unresponsive servers
Steve French [Thu, 20 Jan 2011 18:06:34 +0000 (18:06 +0000)]
[CIFS] cifs: reconnect unresponsive servers

If the server isn't responding to echoes, we don't want to leave tasks
hung waiting for it to reply. At that point, we'll want to reconnect
so that soft mounts can return an error to userspace quickly.

If the client hasn't received a reply after a specified number of echo
intervals, assume that the transport is down and attempt to reconnect
the socket.

The number of echo_intervals to wait before attempting to reconnect is
tunable via a module parameter. Setting it to 0, means that the client
will never attempt to reconnect. The default is 5.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
14 years agocifs: set up recurring workqueue job to do SMB echo requests
Jeff Layton [Tue, 11 Jan 2011 12:24:23 +0000 (07:24 -0500)]
cifs: set up recurring workqueue job to do SMB echo requests

Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
14 years agocifs: add ability to send an echo request
Jeff Layton [Tue, 11 Jan 2011 12:24:21 +0000 (07:24 -0500)]
cifs: add ability to send an echo request

Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
14 years agocifs: add cifs_call_async
Jeff Layton [Tue, 11 Jan 2011 12:24:21 +0000 (07:24 -0500)]
cifs: add cifs_call_async

Add a function that will send a request, and set up the mid for an
async reply.

Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
14 years agocifs: allow for different handling of received response
Jeff Layton [Tue, 11 Jan 2011 12:24:21 +0000 (07:24 -0500)]
cifs: allow for different handling of received response

In order to incorporate async requests, we need to allow for a more
general way to do things on receive, rather than just waking up a
process.

Turn the task pointer in the mid_q_entry into a callback function and a
generic data pointer. When a response comes in, or the socket is
reconnected, cifsd can call the callback function in order to wake up
the process.

The default is to just wake up the current process which should mean no
change in behavior for existing code.

Also, clean up the locking in cifs_reconnect. There doesn't seem to be
any need to hold both the srv_mutex and GlobalMid_Lock when walking the
list of mids.

Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
14 years agocifs: clean up sync_mid_result
Jeff Layton [Tue, 11 Jan 2011 12:24:02 +0000 (07:24 -0500)]
cifs: clean up sync_mid_result

Make it use a switch statement based on the value of the midStatus. If
the resp_buf is set, then MID_RESPONSE_RECEIVED is too.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
14 years agocifs: don't reconnect server when we don't get a response
Jeff Layton [Tue, 11 Jan 2011 12:24:02 +0000 (07:24 -0500)]
cifs: don't reconnect server when we don't get a response

We only want to force a reconnect to the server under very limited and
specific circumstances. Now that we have processes waiting indefinitely
for responses, we shouldn't reach this point unless a reconnect is
already in process. Thus, there's no reason to re-mark the server for
reconnect here.

Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de>
Reviewed-by: Pavel Shilovsky <piastryyy@gmail.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
14 years agocifs: wait indefinitely for responses
Jeff Layton [Tue, 11 Jan 2011 12:24:02 +0000 (07:24 -0500)]
cifs: wait indefinitely for responses

The client should not be timing out on individual SMB requests. Too much
of the state between client and server is tied to the state of the
socket. If we time out requests and issue spurious disconnects then that
comprimises data integrity.

Instead of doing this complicated dance where we try to decide how long
to wait for a response for particular requests, have the client instead
wait indefinitely for a response. Also, use a TASK_KILLABLE sleep here
so that fatal signals will break out of this waiting.

Later patches will add support for detecting dead peers and forcing
reconnects based on that.

Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de>
Reviewed-by: Pavel Shilovsky <piastryyy@gmail.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
14 years agosmp: Allow on_each_cpu() to be called while early_boot_irqs_disabled status to init...
Tejun Heo [Thu, 20 Jan 2011 11:07:13 +0000 (12:07 +0100)]
smp: Allow on_each_cpu() to be called while early_boot_irqs_disabled status to init/main.c

percpu may end up calling vfree() during early boot which in
turn may call on_each_cpu() for TLB flushes.  The function of
on_each_cpu() can be done safely while IRQ is disabled during
early boot but it assumed that the function is always called
with local IRQ enabled which ended up enabling local IRQ
prematurely during boot and triggering a couple of warnings.

This patch updates on_each_cpu() and smp_call_function_many()
such on_each_cpu() can be used safely while
early_boot_irqs_disabled is set.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <20110120110713.GC6036@htj.dyndns.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Reported-by: Ingo Molnar <mingo@elte.hu>
14 years agolockdep: Move early boot local IRQ enable/disable status to init/main.c
Tejun Heo [Thu, 20 Jan 2011 11:06:35 +0000 (12:06 +0100)]
lockdep: Move early boot local IRQ enable/disable status to init/main.c

During early boot, local IRQ is disabled until IRQ subsystem is
properly initialized.  During this time, no one should enable
local IRQ and some operations which usually are not allowed with
IRQ disabled, e.g. operations which might sleep or require
communications with other processors, are allowed.

lockdep tracked this with early_boot_irqs_off/on() callbacks.
As other subsystems need this information too, move it to
init/main.c and make it generally available.  While at it,
toggle the boolean to early_boot_irqs_disabled instead of
enabled so that it can be initialized with %false and %true
indicates the exceptional condition.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <20110120110635.GB6036@htj.dyndns.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agovirtio: remove virtio-pci root device
Milton Miller [Fri, 7 Jan 2011 08:55:06 +0000 (02:55 -0600)]
virtio: remove virtio-pci root device

We sometimes need to map between the virtio device and
the given pci device. One such use is OS installer that
gets the boot pci device from BIOS and needs to
find the relevant block device. Since it can't,
installation fails.

Instead of creating a top-level devices/virtio-pci
directory, create each device under the corresponding
pci device node.  Symlinks to all virtio-pci
devices can be found under the pci driver link in
bus/pci/drivers/virtio-pci/devices, and all virtio
devices under drivers/bus/virtio/devices.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Gleb Natapov <gleb@redhat.com>
Tested-by: "Daniel P. Berrange" <berrange@redhat.com>
Cc: stable@kernel.org
14 years agoLGUEST_GUEST: fix unmet direct dependencies (VIRTUALIZATION && VIRTIO)
Randy Dunlap [Sat, 1 Jan 2011 19:08:46 +0000 (11:08 -0800)]
LGUEST_GUEST: fix unmet direct dependencies (VIRTUALIZATION && VIRTIO)

Honor the kconfig menu hierarchy to remove kconfig dependency warnings:
VIRTIO and VIRTIO_RING are subordinate to VIRTUALIZATION.

warning: (LGUEST_GUEST) selects VIRTIO which has unmet direct dependencies (VIRTUALIZATION)
warning: (LGUEST_GUEST && VIRTIO_PCI && VIRTIO_BALLOON) selects VIRTIO_RING which has unmet direct dependencies (VIRTUALIZATION && VIRTIO)

Reported-by: Toralf F_rster <toralf.foerster@gmx.de>
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
14 years agolguest: compile fixes
Rusty Russell [Fri, 21 Jan 2011 03:37:29 +0000 (21:37 -0600)]
lguest: compile fixes

arch/x86/lguest/boot.c: In function â€˜lguest_init_IRQ’:
arch/x86/lguest/boot.c:824: error: macro "__this_cpu_write" requires 2 arguments, but only 1 given
arch/x86/lguest/boot.c:824: error: â€˜__this_cpu_write’ undeclared (first use in this function)
arch/x86/lguest/boot.c:824: error: (Each undeclared identifier is reported only once
arch/x86/lguest/boot.c:824: error: for each function it appears in.)

drivers/lguest/x86/core.c: In function â€˜copy_in_guest_info’:
drivers/lguest/x86/core.c:94: error: lvalue required as left operand of assignment

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
14 years agolguest: Use this_cpu_ops
Christoph Lameter [Tue, 30 Nov 2010 19:07:21 +0000 (13:07 -0600)]
lguest: Use this_cpu_ops

Use this_cpu_ops in a couple of places in lguest.

Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
14 years agolguest: document --rng in example Launcher
Philip Sanderson [Fri, 21 Jan 2011 03:37:29 +0000 (21:37 -0600)]
lguest: document --rng in example Launcher

Rusty Russell wrote:
> Ah, it will appear as /dev/hwrng.  It's a weirdness of Linux that our actual
> hardware number generators are not wired up to /dev/random...

Reflected this in the documentation, thanks :-)

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
14 years agolguest: example launcher to use guard pages, drop PROT_EXEC, fix limit logic
Philip Sanderson [Fri, 21 Jan 2011 03:37:28 +0000 (21:37 -0600)]
lguest: example launcher to use guard pages, drop PROT_EXEC, fix limit logic

PROT_EXEC seems to be completely unnecessary (as the lguest binary
never executes there), and will allow it to work with SELinux (and
more importantly, PaX :-) as they can/do forbid writable and
executable mappings.

Also, map PROT_NONE guard pages at start and end of guest memory for extra
paranoia.

I changed the length check to addr + size > guest_limit because >= is wrong
(addr of 0, size of getpagesize() with a guest_limit of getpagesize() would
false positive).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
14 years agolguest: --username and --chroot options
Philip Sanderson [Fri, 21 Jan 2011 03:37:28 +0000 (21:37 -0600)]
lguest: --username and --chroot options

I've attached a patch which implements dropping to privileges
and chrooting to a directory.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
14 years agoMerge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Thu, 20 Jan 2011 04:27:25 +0000 (20:27 -0800)]
Merge branch 'x86-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip

* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: Unify "numa=" command line option handling
  Revert "x86: Make relocatable kernel work with new binutils"

14 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
Linus Torvalds [Thu, 20 Jan 2011 04:25:45 +0000 (20:25 -0800)]
Merge git://git./linux/kernel/git/davem/net-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (41 commits)
  sctp: user perfect name for Delayed SACK Timer option
  net: fix can_checksum_protocol() arguments swap
  Revert "netlink: test for all flags of the NLM_F_DUMP composite"
  gianfar: Fix misleading indentation in startup_gfar()
  net/irda/sh_irda: return to RX mode when TX error
  net offloading: Do not mask out NETIF_F_HW_VLAN_TX for vlan.
  USB CDC NCM: tx_fixup() race condition fix
  ns83820: Avoid bad pointer deref in ns83820_init_one().
  ipv6: Silence privacy extensions initialization
  bnx2x: Update bnx2x version to 1.62.00-4
  bnx2x: Fix AER setting for BCM57712
  bnx2x: Fix BCM84823 LED behavior
  bnx2x: Mark full duplex on some external PHYs
  bnx2x: Fix BCM8073/BCM8727 microcode loading
  bnx2x: LED fix for BCM8727 over BCM57712
  bnx2x: Common init will be executed only once after POR
  bnx2x: Swap BCM8073 PHY polarity if required
  iwlwifi: fix valid chain reading from EEPROM
  ath5k: fix locking in tx_complete_poll_work
  ath9k_hw: do PA offset calibration only on longcal interval
  ...

14 years agosctp: user perfect name for Delayed SACK Timer option
Shan Wei [Tue, 18 Jan 2011 22:39:00 +0000 (22:39 +0000)]
sctp: user perfect name for Delayed SACK Timer option

The option name of Delayed SACK Timer should be SCTP_DELAYED_SACK,
not SCTP_DELAYED_ACK.

Left SCTP_DELAYED_ACK be concomitant with SCTP_DELAYED_SACK,
for making compatibility with existing applications.

Reference:
8.1.19.  Get or Set Delayed SACK Timer (SCTP_DELAYED_SACK)
(http://tools.ietf.org/html/draft-ietf-tsvwg-sctpsocket-25)

Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com>
Acked-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agonet: fix can_checksum_protocol() arguments swap
Eric Dumazet [Wed, 19 Jan 2011 00:51:36 +0000 (00:51 +0000)]
net: fix can_checksum_protocol() arguments swap

commit 0363466866d901fbc (net offloading: Convert checksums to use
centrally computed features.) mistakenly swapped can_checksum_protocol()
arguments.

This broke IPv6 on bnx2 for instance, on NIC without TCPv6 checksum
offloads.

Reported-by: Hans de Bruin <jmdebruin@xmsnet.nl>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoRevert "netlink: test for all flags of the NLM_F_DUMP composite"
David S. Miller [Tue, 18 Jan 2011 20:40:38 +0000 (12:40 -0800)]
Revert "netlink: test for all flags of the NLM_F_DUMP composite"

This reverts commit 0ab03c2b1478f2438d2c80204f7fef65b1bca9cf.

It breaks several things including the avahi daemon.

Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agocifs: Use mask of ACEs for SID Everyone to calculate all three permissions user,...
Shirish Pargaonkar [Mon, 6 Dec 2010 20:56:46 +0000 (14:56 -0600)]
cifs: Use mask of ACEs for SID Everyone to calculate all three permissions user, group, and other

If a DACL has entries for ACEs for SID Everyone and Authenticated Users,
factor in mask in respective entries during calculation of permissions
for all three, user, group, and other.

http://technet.microsoft.com/en-us/library/bb463216.aspx

Signed-off-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
14 years agocifs: Fix regression during share-level security mounts (Repost)
Shirish Pargaonkar [Wed, 19 Jan 2011 04:33:54 +0000 (22:33 -0600)]
cifs: Fix regression during share-level security mounts (Repost)

NTLM response length was changed to 16 bytes instead of 24 bytes
that are sent in Tree Connection Request during share-level security
share mounts.  Revert it back to 24 bytes.

Reported-and-Tested-by: Grzegorz Ozanski <grzegorz.ozanski@intel.com>
Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Acked-by: Suresh Jayaraman <sjayaraman@suse.de>
Cc: stable@kernel.org
Signed-off-by: Steve French <sfrench@us.ibm.com>
14 years ago[CIFS] Update cifs version number
Steve French [Wed, 19 Jan 2011 17:53:44 +0000 (17:53 +0000)]
[CIFS] Update cifs version number

Signed-off-by: Steve French <sfrench@us.ibm.com>
14 years agocifs: move mid result processing into common function
Jeff Layton [Tue, 11 Jan 2011 12:24:02 +0000 (07:24 -0500)]
cifs: move mid result processing into common function

Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de>
Reviewed-by: Pavel Shilovsky <piastryyy@gmail.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
14 years agocifs: move locked sections out of DeleteMidQEntry and AllocMidQEntry
Jeff Layton [Tue, 11 Jan 2011 12:24:02 +0000 (07:24 -0500)]
cifs: move locked sections out of DeleteMidQEntry and AllocMidQEntry

In later patches, we're going to need to have finer-grained control
over the addition and removal of these structs from the pending_mid_q
and we'll need to be able to call the destructor while holding the
spinlock. Move the locked sections out of both routines and into
the callers. Fix up current callers of DeleteMidQEntry to call a new
routine that dequeues the entry and then destroys it.

Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de>
Reviewed-by: Pavel Shilovsky <piastryyy@gmail.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
14 years agocifs: clean up accesses to midCount
Jeff Layton [Tue, 11 Jan 2011 12:24:02 +0000 (07:24 -0500)]
cifs: clean up accesses to midCount

It's an atomic_t and the code accesses the "counter" field in it directly
instead of using atomic_read(). It also is sometimes accessed under a
spinlock and sometimes not. Move it out of the spinlock since we don't need
belt-and-suspenders for something that's just informational.

Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de>
Reviewed-by: Pavel Shilovsky <piastryyy@gmail.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
14 years agocifs: make wait_for_free_request take a TCP_Server_Info pointer
Jeff Layton [Tue, 11 Jan 2011 12:24:01 +0000 (07:24 -0500)]
cifs: make wait_for_free_request take a TCP_Server_Info pointer

The cifsSesInfo pointer is only used to get at the server.

Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de>
Reviewed-by: Pavel Shilovsky <piastryyy@gmail.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
14 years agocifs: no need to mark smb_ses_list as cifs_demultiplex_thread is exiting
Jeff Layton [Tue, 11 Jan 2011 12:24:01 +0000 (07:24 -0500)]
cifs: no need to mark smb_ses_list as cifs_demultiplex_thread is exiting

The TCP_Server_Info is refcounted and every SMB session holds a
reference to it. Thus, smb_ses_list is always going to be empty when
cifsd is coming down. This is dead code.

Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de>
Reviewed-by: Pavel Shilovsky <piastryyy@gmail.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
14 years agocifs: don't fail writepages on -EAGAIN errors
Jeff Layton [Tue, 11 Jan 2011 12:24:01 +0000 (07:24 -0500)]
cifs: don't fail writepages on -EAGAIN errors

If CIFSSMBWrite2 returns -EAGAIN, then the error should be considered
temporary. CIFS should retry the write instead of setting an error on
the mapping and returning.

For WB_SYNC_ALL, just retry the write immediately. In the WB_SYNC_NONE
case, call redirty_page_for_writeback on all of the pages that didn't
get written out and then move on.

Also, fix up the handling of a short write with a successful return
code. MS-CIFS says that 0 bytes_written means ENOSPC or EFBIG. It
doesn't mention what a short, but non-zero write means, so for now
treat it as we would an -EAGAIN return.

Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de>
Reviewed-by: Pavel Shilovsky <piastryyy@gmail.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
14 years agoCIFS: Fix oplock break handling (try #2)
Pavel Shilovsky [Mon, 17 Jan 2011 17:15:44 +0000 (20:15 +0300)]
CIFS: Fix oplock break handling (try #2)

When we get oplock break notification we should set the appropriate
value of OplockLevel field in oplock break acknowledge according to
the oplock level held by the client in this time. As we only can have
level II oplock or no oplock in the case of oplock break, we should be
aware only about clientCanCacheRead field in cifsInodeInfo structure.

Also fix bug connected with wrong interpretation of OplockLevel field
during oplock break notification processing.

Signed-off-by: Pavel Shilovsky <piastryyy@gmail.com>
Cc: <stable@kernel.org>
Signed-off-by: Steve French <sfrench@us.ibm.com>
14 years agosched, cgroup: Use exit hook to avoid use-after-free crash
Peter Zijlstra [Wed, 19 Jan 2011 11:26:11 +0000 (12:26 +0100)]
sched, cgroup: Use exit hook to avoid use-after-free crash

By not notifying the controller of the on-exit move back to
init_css_set, we fail to move the task out of the previous
cgroup's cfs_rq. This leads to an opportunity for a
cgroup-destroy to come in and free the cgroup (there are no
active tasks left in it after all) to which the not-quite dead
task is still enqueued.

Reported-by: Miklos Vajna <vmiklos@frugalware.org>
Fixed-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: <stable@kernel.org>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
LKML-Reference: <1293206353.29444.205.camel@laptop>

14 years agox86: Unify "numa=" command line option handling
Jan Beulich [Wed, 19 Jan 2011 08:57:21 +0000 (08:57 +0000)]
x86: Unify "numa=" command line option handling

In order to be able to suppress the use of SRAT tables that
32-bit Linux can't deal with (in one case known to lead to a
non-bootable system, unless disabling ACPI altogether), move the
"numa=" option handling to common code.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Reviewed-by: Thomas Renninger <trenn@suse.de>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Renninger <trenn@suse.de>
LKML-Reference: <4D36B581020000780002D0FF@vpn.id2.novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoRevert "x86: Make relocatable kernel work with new binutils"
Ingo Molnar [Wed, 19 Jan 2011 09:09:42 +0000 (10:09 +0100)]
Revert "x86: Make relocatable kernel work with new binutils"

This reverts commit 86b1e8dd83cb ("x86: Make relocatable kernel work with
new binutils").

Markus Trippelsdorf reported a boot failure caused by this patch.

The real solution to the original patch will likely involve an
arch-generic solution to define an overlaid jiffies_64 and jiffies
variables.

Until that's done and tested on all architectures revert this commit to
solve the regression.

Reported-and-bisected-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Acked-by: "H. Peter Anvin" <hpa@zytor.com>
Cc: Shaohua Li <shaohua.li@intel.com>
Cc: "Lu, Hongjiu" <hongjiu.lu@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
Cc: Sam Ravnborg <sam@ravnborg.org>
LKML-Reference: <4D36A759.60704@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoACPICA: Update version to 20110112
Bob Moore [Mon, 17 Jan 2011 02:31:08 +0000 (10:31 +0800)]
ACPICA: Update version to 20110112

Version 20110112.

Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
14 years agoACPICA: Update all ACPICA copyrights and signons to 2011
Bob Moore [Mon, 17 Jan 2011 03:05:40 +0000 (11:05 +0800)]
ACPICA: Update all ACPICA copyrights and signons to 2011

Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
14 years agoACPICA: Fix issues/fault with automatic "serialized" method support
Lin Ming [Wed, 12 Jan 2011 01:19:43 +0000 (09:19 +0800)]
ACPICA: Fix issues/fault with automatic "serialized" method support

History: This support changes a method to "serialized" on the fly if the
method generates an AE_ALREADY_EXISTS error, indicating the possibility
that it cannot handle reentrancy.

This fix repairs a couple of issues seen in the field, especially on
machines with many cores.

1) Delete method children only upon the exit of the last thread, so
as to not delete objects out from under running threads.

2) Set the "serialized" bit for the method only upon the exit of the
last thread, so as to not cause deadlock when running threads attempt
to exit.

3) Cleanup the use of the AML "MethodFlags" and internal method flags
so that there is no longer any confustion between the two.

Reported-by: Dana Myers <dana.myers@oracle.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
14 years agoACPICA: Debugger: Lock namespace for duration of a namespace dump
Bob Moore [Wed, 12 Jan 2011 01:13:31 +0000 (09:13 +0800)]
ACPICA: Debugger: Lock namespace for duration of a namespace dump

Prevents issues if the namespace is changing underneath the
debugger.  Especially temporary nodes, since the debugger displays
these also.

Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Reviewed-by: Rafael Wysocki <rjw@sisk.pl>
Signed-off-by: Len Brown <len.brown@intel.com>
14 years agoACPICA: Fix namespace race condition
Dana Myers [Wed, 12 Jan 2011 01:09:31 +0000 (09:09 +0800)]
ACPICA: Fix namespace race condition

Fixes a race condition between method execution and namespace
walks that can possibly fault. Problem was apparently introduced
in version 20100528 as a result of a performance optimization
that reduces the number of namespace walks upon method exit
by using the delete_namespace_subtree function instead of the
delete_namespace_by_owner function used previously. Bug is in
the delete_namespace_subtree function.

Signed-off-by: Dana Myers <dana.myers@oracle.com>
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Reviewed-by: Rafael Wysocki <rjw@sisk.pl>
Signed-off-by: Len Brown <len.brown@intel.com>
14 years agoACPICA: Fix memory leak in acpi_ev_asynch_execute_gpe_method().
Jesper Juhl [Sun, 16 Jan 2011 19:37:52 +0000 (20:37 +0100)]
ACPICA: Fix memory leak in acpi_ev_asynch_execute_gpe_method().

We will leak the memory allocated to 'local_gpe_event_info' if
'acpi_ut_acquire_mutex()' fails or if 'acpi_ev_valid_gpe_event()' fails in
drivers/acpi/acpica/evgpe.c::acpi_ev_asynch_execute_gpe_method().

Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Reviewed-by: Rafael Wysocki <rjw@sisk.pl>
Signed-off-by: Len Brown <len.brown@intel.com>
14 years agogianfar: Fix misleading indentation in startup_gfar()
Anton Vorontsov [Tue, 18 Jan 2011 02:36:02 +0000 (02:36 +0000)]
gianfar: Fix misleading indentation in startup_gfar()

Just stumbled upon the issue while looking for another bug.

The code looks correct, the indentation is not.

Signed-off-by: Anton Vorontsov <cbouatmailru@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agonet/irda/sh_irda: return to RX mode when TX error
Kuninori Morimoto [Thu, 13 Jan 2011 21:47:42 +0000 (21:47 +0000)]
net/irda/sh_irda: return to RX mode when TX error

sh_irda can not use RX/TX in same time,
but this driver didn't return to RX mode when TX error occurred.
This patch care xmit error case to solve this issue.

Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agonet offloading: Do not mask out NETIF_F_HW_VLAN_TX for vlan.
Jesse Gross [Mon, 17 Jan 2011 20:46:00 +0000 (20:46 +0000)]
net offloading: Do not mask out NETIF_F_HW_VLAN_TX for vlan.

In netif_skb_features() we return only the features that are valid for vlans
if we have a vlan packet.  However, we should not mask out NETIF_F_HW_VLAN_TX
since it enables transmission of vlan tags and is obviously valid.

Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoUSB CDC NCM: tx_fixup() race condition fix
Alexey Orishko [Mon, 17 Jan 2011 07:07:25 +0000 (07:07 +0000)]
USB CDC NCM: tx_fixup() race condition fix

- tx_fixup() can be called from either timer callback or from xmit()
  in usbnet, so spinlock is added to avoid concurrency-related problem.
- minor correction due to checkpatch warning for some line over 80
  chars after previous patch was applied.

Signed-off-by: Alexey Orishko <alexey.orishko@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agons83820: Avoid bad pointer deref in ns83820_init_one().
Jesper Juhl [Mon, 17 Jan 2011 10:24:57 +0000 (10:24 +0000)]
ns83820: Avoid bad pointer deref in ns83820_init_one().

In drivers/net/ns83820.c::ns83820_init_one() we dynamically allocate
memory via alloc_etherdev(). We then call PRIV() on the returned storage
which is 'return netdev_priv()'. netdev_priv() takes the pointer it is
passed and adds 'ALIGN(sizeof(struct net_device), NETDEV_ALIGN)' to it and
returns it. Then we test the resulting pointer for NULL, which it is
unlikely to be at this point, and later dereference it. This will go bad
if alloc_etherdev() actually returned NULL.

This patch reworks the code slightly so that we test for a NULL pointer
(and return -ENOMEM) directly after calling alloc_etherdev().

Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoipv6: Silence privacy extensions initialization
Romain Francoise [Mon, 17 Jan 2011 07:59:18 +0000 (07:59 +0000)]
ipv6: Silence privacy extensions initialization

When a network namespace is created (via CLONE_NEWNET), the loopback
interface is automatically added to the new namespace, triggering a
printk in ipv6_add_dev() if CONFIG_IPV6_PRIVACY is set.

This is problematic for applications which use CLONE_NEWNET as
part of a sandbox, like Chromium's suid sandbox or recent versions of
vsftpd. On a busy machine, it can lead to thousands of useless
"lo: Disabled Privacy Extensions" messages appearing in dmesg.

It's easy enough to check the status of privacy extensions via the
use_tempaddr sysctl, so just removing the printk seems like the most
sensible solution.

Signed-off-by: Romain Francoise <romain@orebokech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agobnx2x: Update bnx2x version to 1.62.00-4
Yaniv Rosner [Tue, 18 Jan 2011 04:33:55 +0000 (04:33 +0000)]
bnx2x: Update bnx2x version to 1.62.00-4

Update bnx2x version to 1.62.00-4

Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agobnx2x: Fix AER setting for BCM57712
Yaniv Rosner [Tue, 18 Jan 2011 04:33:52 +0000 (04:33 +0000)]
bnx2x: Fix AER setting for BCM57712

Fix AER settings for BCM57712 to allow accessing all device addresses range in CL45 MDC/MDIO

Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agobnx2x: Fix BCM84823 LED behavior
Yaniv Rosner [Tue, 18 Jan 2011 04:33:47 +0000 (04:33 +0000)]
bnx2x: Fix BCM84823 LED behavior

Fix BCM84823 LED behavior which may show on some systems

Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agobnx2x: Mark full duplex on some external PHYs
Yaniv Rosner [Tue, 18 Jan 2011 04:33:42 +0000 (04:33 +0000)]
bnx2x: Mark full duplex on some external PHYs

Device may show incorrect duplex mode for devices with external PHY

Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agobnx2x: Fix BCM8073/BCM8727 microcode loading
Yaniv Rosner [Tue, 18 Jan 2011 04:33:36 +0000 (04:33 +0000)]
bnx2x: Fix BCM8073/BCM8727 microcode loading

Improve microcode loading verification before proceeding to next stage

Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agobnx2x: LED fix for BCM8727 over BCM57712
Yaniv Rosner [Tue, 18 Jan 2011 04:33:31 +0000 (04:33 +0000)]
bnx2x: LED fix for BCM8727 over BCM57712

LED on BCM57712+BCM8727 systems requires different settings

Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agobnx2x: Common init will be executed only once after POR
Yaniv Rosner [Tue, 18 Jan 2011 04:33:24 +0000 (04:33 +0000)]
bnx2x: Common init will be executed only once after POR

Common init used to be called by the driver when the first port comes up, mainly to reset and reload external PHY microcode.
However, in case management driver is active on the other port, traffic would halted. So limit the common init to be done only once after POR.

Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agobnx2x: Swap BCM8073 PHY polarity if required
Yaniv Rosner [Tue, 18 Jan 2011 04:33:18 +0000 (04:33 +0000)]
bnx2x: Swap BCM8073 PHY polarity if required

Enable controlling BCM8073 PN polarity swap through nvm configuration, which is required in certain systems

Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoLinux 2.6.38-rc1
Linus Torvalds [Tue, 18 Jan 2011 23:14:02 +0000 (15:14 -0800)]
Linux 2.6.38-rc1