openwrt/staging/blogic.git
15 years agoxen: setup percpu data pointers
Jeremy Fitzhardinge [Mon, 2 Feb 2009 21:55:31 +0000 (13:55 -0800)]
xen: setup percpu data pointers

We need to access percpu data fairly early, so set up the percpu
registers as soon as possible.  We only need to load the appropriate
segment register.  We already have a GDT, but its hard to change it
early because we need to manipulate the pagetable to do so, and that
hasn't been set up yet.

Also, set the kernel stack when bringing up secondary CPUs.  If we
don't they all end up sharing the same stack...

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
15 years agoMerge branch 'core/percpu' into x86/paravirt
H. Peter Anvin [Thu, 5 Feb 2009 00:58:26 +0000 (16:58 -0800)]
Merge branch 'core/percpu' into x86/paravirt

15 years agoxen: fix 32-bit build resulting from mmu move
Jeremy Fitzhardinge [Mon, 2 Feb 2009 21:58:06 +0000 (13:58 -0800)]
xen: fix 32-bit build resulting from mmu move

Moving the mmu code from enlighten.c to mmu.c inadvertently broke the
32-bit build.  Fix it.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
15 years agox86/paravirt: return full 64-bit result
Jeremy Fitzhardinge [Wed, 4 Feb 2009 00:00:38 +0000 (16:00 -0800)]
x86/paravirt: return full 64-bit result

Impact: Bug fix

A hunk went missing in the original patch, and callee-save callsites were
not marked as returning the upper 32-bit of result, causing Badness.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
15 years agox86, percpu: fix kexec with vmlinux
Yinghai Lu [Tue, 3 Feb 2009 02:16:19 +0000 (18:16 -0800)]
x86, percpu: fix kexec with vmlinux

Impact: fix regression with kexec with vmlinux

Split data.init into data.init, percpu, data.init2 sections
instead of let data.init wrap percpu secion.

Thus kexec loading will be happy, because sections will not
overlap.

Before the patch we have:

Elf file type is EXEC (Executable file)
Entry point 0x200000
There are 6 program headers, starting at offset 64

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  LOAD           0x0000000000200000 0xffffffff80200000 0x0000000000200000
                 0x0000000000ca6000 0x0000000000ca6000  R E    200000
  LOAD           0x0000000000ea6000 0xffffffff80ea6000 0x0000000000ea6000
                 0x000000000014dfe0 0x000000000014dfe0  RWE    200000
  LOAD           0x0000000001000000 0xffffffffff600000 0x0000000000ff4000
                 0x0000000000000888 0x0000000000000888  RWE    200000
  LOAD           0x00000000011f6000 0xffffffff80ff6000 0x0000000000ff6000
                 0x0000000000073086 0x0000000000a2d938  RWE    200000
  LOAD           0x0000000001400000 0x0000000000000000 0x000000000106a000
                 0x00000000001d2ce0 0x00000000001d2ce0  RWE    200000
  NOTE           0x00000000009e2c1c 0xffffffff809e2c1c 0x00000000009e2c1c
                 0x0000000000000024 0x0000000000000024         4

 Section to Segment mapping:
  Segment Sections...
   00     .text .notes __ex_table .rodata __bug_table .pci_fixup .builtin_fw __ksymtab __ksymtab_gpl __ksymtab_strings __init_rodata __param
   01     .data .init.rodata .data.cacheline_aligned .data.read_mostly
   02     .vsyscall_0 .vsyscall_fn .vsyscall_gtod_data .vsyscall_1 .vsyscall_2 .vgetcpu_mode .jiffies
   03     .data.init_task .smp_locks .init.text .init.data .init.setup .initcall.init .con_initcall.init .x86_cpu_dev.init .altinstructions .altinstr_replacement .exit.text .init.ramfs .bss
   04     .data.percpu
   05     .notes

After patch we've got:

Elf file type is EXEC (Executable file)
Entry point 0x200000
There are 7 program headers, starting at offset 64

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  LOAD           0x0000000000200000 0xffffffff80200000 0x0000000000200000
                 0x0000000000ca6000 0x0000000000ca6000  R E    200000
  LOAD           0x0000000000ea6000 0xffffffff80ea6000 0x0000000000ea6000
                 0x000000000014dfe0 0x000000000014dfe0  RWE    200000
  LOAD           0x0000000001000000 0xffffffffff600000 0x0000000000ff4000
                 0x0000000000000888 0x0000000000000888  RWE    200000
  LOAD           0x00000000011f6000 0xffffffff80ff6000 0x0000000000ff6000
                 0x0000000000073086 0x0000000000073086  RWE    200000
  LOAD           0x0000000001400000 0x0000000000000000 0x000000000106a000
                 0x00000000001d2ce0 0x00000000001d2ce0  RWE    200000
  LOAD           0x000000000163d000 0xffffffff8123d000 0x000000000123d000
                 0x0000000000000000 0x00000000007e6938  RWE    200000
  NOTE           0x00000000009e2c1c 0xffffffff809e2c1c 0x00000000009e2c1c
                 0x0000000000000024 0x0000000000000024         4

 Section to Segment mapping:
  Segment Sections...
   00     .text .notes __ex_table .rodata __bug_table .pci_fixup .builtin_fw __ksymtab __ksymtab_gpl __ksymtab_strings __init_rodata __param
   01     .data .init.rodata .data.cacheline_aligned .data.read_mostly
   02     .vsyscall_0 .vsyscall_fn .vsyscall_gtod_data .vsyscall_1 .vsyscall_2 .vgetcpu_mode .jiffies
   03     .data.init_task .smp_locks .init.text .init.data .init.setup .initcall.init .con_initcall.init .x86_cpu_dev.init .altinstructions .altinstr_replacement .exit.text .init.ramfs
   04     .data.percpu
   05     .bss
   06     .notes

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agox86/vmi: fix interrupt enable/disable/save/restore calling convention.
Jeremy Fitzhardinge [Sat, 31 Jan 2009 07:18:41 +0000 (23:18 -0800)]
x86/vmi: fix interrupt enable/disable/save/restore calling convention.

Zach says:
> Enable/Disable have no clobbers at all.
> Save clobbers only return value, %eax
> Restore also clobbers nothing.

This is precisely compatible with the calling convention, so we can
just call them directly without wrapping.

(Compile tested only.)

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
15 years agox86/paravirt: don't restore second return reg
Jeremy Fitzhardinge [Sat, 31 Jan 2009 07:17:23 +0000 (23:17 -0800)]
x86/paravirt: don't restore second return reg

Impact: bugfix

In the 32-bit calling convention, %eax:%edx is used to return 64-bit
values.  Don't save and restore %edx around wrapped functions, or they
can't return a full 64-bit result.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
15 years agoMerge branch 'tj-percpu' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc...
Ingo Molnar [Sat, 31 Jan 2009 13:27:28 +0000 (14:27 +0100)]
Merge branch 'tj-percpu' of git://git./linux/kernel/git/tj/misc into core/percpu

15 years agoMerge branch 'master' into tj-percpu
Tejun Heo [Sat, 31 Jan 2009 05:36:00 +0000 (14:36 +0900)]
Merge branch 'master' into tj-percpu

15 years agoxen: setup percpu data pointers
Jeremy Fitzhardinge [Fri, 30 Jan 2009 08:47:54 +0000 (17:47 +0900)]
xen: setup percpu data pointers

Impact: fix xen booting

We need to access percpu data fairly early, so set up the percpu
registers as soon as possible.  We only need to load the appropriate
segment register.  We already have a GDT, but its hard to change it
early because we need to manipulate the pagetable to do so, and that
hasn't been set up yet.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
15 years agox86: split loading percpu segments from loading gdt
Jeremy Fitzhardinge [Fri, 30 Jan 2009 08:47:54 +0000 (17:47 +0900)]
x86: split loading percpu segments from loading gdt

Impact: split out a function, no functional change

Xen needs to be able to access percpu data from very early on.  For
various reasons, it cannot also load the gdt at that time.   It does,
however, have a pefectly functional gdt at that point, so there's no
pressing need to reload the gdt.

Split the function to load the segment registers off, so Xen can call
it directly.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
15 years agox86: pass in cpu number to switch_to_new_gdt()
Brian Gerst [Fri, 30 Jan 2009 08:47:53 +0000 (17:47 +0900)]
x86: pass in cpu number to switch_to_new_gdt()

Impact: cleanup, prepare for xen boot fix.

Xen needs to call this function very early to setup the GDT and
per-cpu segments.  Remove the call to smp_processor_id() and just
pass in the cpu number.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
15 years agox86: UV fix uv_flush_send_and_wait()
Cliff Wickman [Thu, 29 Jan 2009 21:35:26 +0000 (15:35 -0600)]
x86: UV fix uv_flush_send_and_wait()

Impact: fix possible tlb mis-flushing on UV

uv_flush_send_and_wait() should return a pointer if the broadcast
remote tlb shootdown requests fail. That causes the conventional IPI
method of shootdown to be used.

Signed-off-by: Cliff Wickman <cpw@sgi.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
15 years agox86/paravirt: fix missing callee-save call on pud_val
Jeremy Fitzhardinge [Thu, 29 Jan 2009 09:51:34 +0000 (01:51 -0800)]
x86/paravirt: fix missing callee-save call on pud_val

Impact: Fix build when CONFIG_PARAVIRT_DEBUG is enabled

Fix missed convertion to using callee-saved calls for pud_val, which
causes a compile error when CONFIG_PARAVIRT_DEBUG is enabled.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
15 years agox86/paravirt: use callee-saved convention for pte_val/make_pte/etc
Jeremy Fitzhardinge [Wed, 28 Jan 2009 22:35:07 +0000 (14:35 -0800)]
x86/paravirt: use callee-saved convention for pte_val/make_pte/etc

Impact: Optimization

In the native case, pte_val, make_pte, etc are all just identity
functions, so there's no need to clobber a lot of registers over them.

(This changes the 32-bit callee-save calling convention to return both
EAX and EDX so functions can return 64-bit values.)

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
15 years agox86/paravirt: implement PVOP_CALL macros for callee-save functions
Jeremy Fitzhardinge [Wed, 28 Jan 2009 22:35:06 +0000 (14:35 -0800)]
x86/paravirt: implement PVOP_CALL macros for callee-save functions

Impact: Optimization

Functions with the callee save calling convention clobber many fewer
registers than the normal C calling convention.  Implement variants of
PVOP_V?CALL* accordingly.  This only bothers with functions up to 3
args, since functions with more args may as well use the normal
calling convention.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
15 years agox86/paravirt: add register-saving thunks to reduce caller register pressure
Jeremy Fitzhardinge [Wed, 28 Jan 2009 22:35:05 +0000 (14:35 -0800)]
x86/paravirt: add register-saving thunks to reduce caller register pressure

Impact: Optimization

One of the problems with inserting a pile of C calls where previously
there were none is that the register pressure is greatly increased.
The C calling convention says that the caller must expect a certain
set of registers may be trashed by the callee, and that the callee can
use those registers without restriction.  This includes the function
argument registers, and several others.

This patch seeks to alleviate this pressure by introducing wrapper
thunks that will do the register saving/restoring, so that the
callsite doesn't need to worry about it, but the callee function can
be conventional compiler-generated code.  In many cases (particularly
performance-sensitive cases) the callee will be in assembler anyway,
and need not use the compiler's calling convention.

Standard calling convention is:
 arguments     return scratch
x86-32  eax edx ecx     eax ?
x86-64  rdi rsi rdx rcx    rax r8 r9 r10 r11

The thunk preserves all argument and scratch registers.  The return
register is not preserved, and is available as a scratch register for
unwrapped callee code (and of course the return value).

Wrapped function pointers are themselves wrapped in a struct
paravirt_callee_save structure, in order to get some warning from the
compiler when functions with mismatched calling conventions are used.

The most common paravirt ops, both statically and dynamically, are
interrupt enable/disable/save/restore, so handle them first.  This is
particularly easy since their calls are handled specially anyway.

XXX Deal with VMI.  What's their calling convention?

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
15 years agox86/paravirt: selectively save/restore regs around pvops calls
Jeremy Fitzhardinge [Wed, 28 Jan 2009 22:35:04 +0000 (14:35 -0800)]
x86/paravirt: selectively save/restore regs around pvops calls

Impact: Optimization

Each asm paravirt-ops call says what registers are available for
clobbering.  This patch makes use of this to selectively save/restore
registers around each pvops call.  In many cases this significantly
shrinks code size.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
15 years agox86: fix paravirt clobber in entry_64.S
Jeremy Fitzhardinge [Wed, 28 Jan 2009 22:35:03 +0000 (14:35 -0800)]
x86: fix paravirt clobber in entry_64.S

Impact: Fix latent bug

The clobber is trying to say that anything except RDI is available for
clobbering, but actually clobbers everything.  This hasn't mattered
because the clobbers were basically ignored, but subsequent patches
will rely on them.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
15 years agox86/pvops: add a paravirt_ident functions to allow special patching
Jeremy Fitzhardinge [Wed, 28 Jan 2009 22:35:02 +0000 (14:35 -0800)]
x86/pvops: add a paravirt_ident functions to allow special patching

Impact: Optimization

Several paravirt ops implementations simply return their arguments,
the most obvious being the make_pte/pte_val class of operations on
native.

On 32-bit, the identity function is literally a no-op, as the calling
convention uses the same registers for the first argument and return.
On 64-bit, it can be implemented with a single "mov".

This patch adds special identity functions for 32 and 64 bit argument,
and machinery to recognize them and replace them with either nops or a
mov as appropriate.

At the moment, the only users for the identity functions are the
pagetable entry conversion functions.

The result is a measureable improvement on pagetable-heavy benchmarks
(2-3%, reducing the pvops overhead from 5 to 2%).

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
15 years agoxen: move remaining mmu-related stuff into mmu.c
Jeremy Fitzhardinge [Wed, 28 Jan 2009 22:35:01 +0000 (14:35 -0800)]
xen: move remaining mmu-related stuff into mmu.c

Impact: Cleanup

Move remaining mmu-related stuff into mmu.c.
A general cleanup, and lay the groundwork for later patches.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
15 years agoMerge branch 'core/percpu' into x86/paravirt
H. Peter Anvin [Fri, 30 Jan 2009 22:50:57 +0000 (14:50 -0800)]
Merge branch 'core/percpu' into x86/paravirt

15 years agolinker script: use separate simpler definition for PERCPU()
Tejun Heo [Fri, 30 Jan 2009 07:32:22 +0000 (16:32 +0900)]
linker script: use separate simpler definition for PERCPU()

Impact: fix linker screwup on x86_32

Recent x86_64 zerobased patches introduced PERCPU_VADDR() to put
.data.percpu to a predefined address and re-defined PERCPU() in terms
of it.  The new macro defined one extra symbol, __per_cpu_load, for
LMA of the section so that the init data could be accessed.  This new
symbol introduced the following problems to x86_32.

1. If __per_cpu_load is defined outside of .data.percpu as an absolute
   symbol, relocation generation for relocatable kernel fails due to
   absolute relocation.

2. If __per_cpu_load is put inside .data.percpu with absolute address
   assignment to work around #1, linker gets confused and under
   certain configurations ends up relocating the symbol against
   .data.percpu such that the load address gets added on top of
   already set load address.

As x86_32 doesn't use predefined address for .data.percpu, there's no
need for it to care about the possibility of __per_cpu_load being
different from __per_cpu_start.

This patch defines PERCPU() separately so that __per_cpu_load is
defined inside .data.percpu so that everything is ordinary
linking-wise.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoAllow opportunistic merging of VM_CAN_NONLINEAR areas
Linus Torvalds [Fri, 30 Jan 2009 19:37:22 +0000 (11:37 -0800)]
Allow opportunistic merging of VM_CAN_NONLINEAR areas

Commit de33c8db5910cda599899dd431cc30d7c1018cbf ("Fix OOPS in
mmap_region() when merging adjacent VM_LOCKED file segments") unified
the vma merging of anonymous and file maps to just one place, which
simplified the code and fixed a use-after-free bug that could cause an
oops.

But by doing the merge opportunistically before even having called
->mmap() on the file method, it now compares two different 'vm_flags'
values: the pre-mmap() value of the new not-yet-formed vma, and previous
mappings of the same file around it.

And in doing so, it refused to merge the common file case, which adds a
marker to say "I can be made non-linear".

This fixes it by just adding a set of flags that don't have to match,
because we know they are ok to merge.  Currently it's only that single
VM_CAN_NONLINEAR flag, but at least conceptually there could be others
in the future.

Reported-and-acked-by: Hugh Dickins <hugh@veritas.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Greg KH <gregkh@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoMerge branch 'linus' into core/percpu
Ingo Molnar [Fri, 30 Jan 2009 17:23:30 +0000 (18:23 +0100)]
Merge branch 'linus' into core/percpu

Conflicts:
kernel/irq/handle.c

15 years agoMerge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Linus Torvalds [Fri, 30 Jan 2009 16:54:29 +0000 (08:54 -0800)]
Merge branch 'for_linus' of git://git./linux/kernel/git/tytso/ext4

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: Remove bogus BUG() check in ext4_bmap()
  ext4: Fix building with EXT4FS_DEBUG
  ext4: Initialize the new group descriptor when resizing the filesystem
  ext4: Fix ext4_free_blocks() w/o a journal when files have indirect blocks
  jbd2: On a __journal_expect() assertion failure printk "JBD2", not "EXT3-fs"
  ext3: Add sanity check to make_indexed_dir
  ext4: Add sanity check to make_indexed_dir
  ext4: only use i_size_high for regular files
  ext4: fix wrong use of do_div

15 years agoMerge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
Linus Torvalds [Fri, 30 Jan 2009 16:46:42 +0000 (08:46 -0800)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block

* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
  cfq-iosched: Allow RT requests to pre-empt ongoing BE timeslice
  block: add sysfs file for controlling io stats accounting
  Mark mandatory elevator functions in the biodoc.txt
  include/linux: Add bsg.h to the Kernel exported headers
  block: silently error an unsupported barrier bio
  block: Fix documentation for blkdev_issue_flush()
  block: add bio_rw_flagged() for testing bio->bi_rw
  block: seperate bio/request unplug and sync bits
  block: export SSD/non-rotational queue flag through sysfs
  Fix small typo in bio.h's documentation
  block: get rid of the manual directory counting in blktrace
  block: Allow empty integrity profile
  block: Remove obsolete BUG_ON
  block: Don't verify integrity metadata on read error

15 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
Linus Torvalds [Fri, 30 Jan 2009 16:41:36 +0000 (08:41 -0800)]
Merge git://git./linux/kernel/git/davem/net-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (29 commits)
  tulip: fix 21142 with 10Mbps without negotiation
  drivers/net/skfp: if !capable(CAP_NET_ADMIN): inverted logic
  gianfar: Fix Wake-on-LAN support
  smsc911x: timeout reaches -1
  smsc9420: fix interrupt signalling test failures
  ucc_geth: Change uec phy id to the same format as gianfar's
  wimax: fix build issue when debugfs is disabled
  netxen: fix memory leak in drivers/net/netxen_nic_init.c
  tun: Add some missing TUN compat ioctl translations.
  ipv4: fix infinite retry loop in IP-Config
  net: update documentation ip aliases
  net: Fix OOPS in skb_seq_read().
  net: Fix frag_list handling in skb_seq_read
  netxen: revert jumbo ringsize
  ath5k: fix locking in ath5k_config
  cfg80211: print correct intersected regulatory domain
  cfg80211: Fix sanity check on 5 GHz when processing country IE
  iwlwifi: fix kernel oops when ucode DMA memory allocation failure
  rtl8187: Fix error in setting OFDM power settings for RTL8187L
  mac80211: remove Michael Wu as maintainer
  ...

15 years agoAdd enable_ms to jsm driver
Paul Larson [Fri, 30 Jan 2009 16:21:49 +0000 (10:21 -0600)]
Add enable_ms to jsm driver

This fixes a crash observed when non-existant enable_ms function is
called for jsm driver.

Signed-off-by: Scott Kilau <Scott.Kilau@digi.com>
Signed-off-by: Paul Larson <pl@linux.vnet.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agocfq-iosched: Allow RT requests to pre-empt ongoing BE timeslice
Divyesh Shah [Fri, 30 Jan 2009 11:46:41 +0000 (12:46 +0100)]
cfq-iosched: Allow RT requests to pre-empt ongoing BE timeslice

This patch adds the ability to pre-empt an ongoing BE timeslice when a RT
request is waiting for the current timeslice to complete. This reduces the
wait time to disk for RT requests from an upper bound of 4 (current value
of cfq_quantum) to 1 disk request.

Applied Jens' suggeested changes to avoid the rb lookup and use !cfq_class_rt()
and retested.

Latency(secs) for the RT task when doing sequential reads from 10G file.
                       | only RT | RT + BE | RT + BE + this patch
small (512 byte) reads | 143     | 163     | 145
large (1Mb) reads      | 142     | 158     | 146

Signed-off-by: Divyesh Shah <dpshah@google.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
15 years agoblock: add sysfs file for controlling io stats accounting
Jens Axboe [Fri, 23 Jan 2009 09:54:44 +0000 (10:54 +0100)]
block: add sysfs file for controlling io stats accounting

This allows us to turn off disk stat accounting completely, for the cases
where the 0.5-1% reduction in system time is important.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
15 years agoMark mandatory elevator functions in the biodoc.txt
Nikanth Karthikesan [Tue, 27 Jan 2009 08:29:24 +0000 (09:29 +0100)]
Mark mandatory elevator functions in the biodoc.txt

biodoc.txt mentions that elevator functions marked with * are mandatory, but
no function is marked with *. Mark the 3 functions which should be
implemented by any io scheduler.

Signed-off-by: Nikanth Karthikesan <knikanth@suse.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
15 years agoinclude/linux: Add bsg.h to the Kernel exported headers
Boaz Harrosh [Mon, 19 Jan 2009 09:37:38 +0000 (10:37 +0100)]
include/linux: Add bsg.h to the Kernel exported headers

bsg.h in current form is perfectly suitable for user-mode
consumption. It is needed together with scsi/sg.h for applications
that want to interface with the bsg driver.

Currently the few projects that use it would copy it over into
the projects. But that is not acceptable for projects that need
to provide source and devel packages for distros.

This should also be submitted to stable 2.6.28 and 2.6.27 since bsg had
a stable API since these Kernels and distro users will need the header
for these kernels a swell

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Acked-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
CC: stable@kernel.org
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
15 years agoblock: silently error an unsupported barrier bio
Jens Axboe [Tue, 13 Jan 2009 14:28:32 +0000 (15:28 +0100)]
block: silently error an unsupported barrier bio

This fixes a "regression" from 2.6.28, where the barrier probes that file
systems may do would trigger additional end request warnings in dmesg.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
15 years agoblock: Fix documentation for blkdev_issue_flush()
Theodore Ts'o [Tue, 13 Jan 2009 14:27:32 +0000 (15:27 +0100)]
block: Fix documentation for blkdev_issue_flush()

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
15 years agoblock: add bio_rw_flagged() for testing bio->bi_rw
Jens Axboe [Tue, 6 Jan 2009 08:21:49 +0000 (09:21 +0100)]
block: add bio_rw_flagged() for testing bio->bi_rw

The existing functions for checking bio->bi_rw are badly named. So lets
mirror what we do for bio->bi_flags testing, use a properly named
function so that it's immediately obvious what is being tested.

Maintain compatability names for the old macros, eventually we'll get
rid of these.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
15 years agoblock: seperate bio/request unplug and sync bits
Jens Axboe [Tue, 6 Jan 2009 08:16:05 +0000 (09:16 +0100)]
block: seperate bio/request unplug and sync bits

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
15 years agoblock: export SSD/non-rotational queue flag through sysfs
Bartlomiej Zolnierkiewicz [Wed, 7 Jan 2009 11:22:39 +0000 (12:22 +0100)]
block: export SSD/non-rotational queue flag through sysfs

For some devices (i.e. CFA ATA) we can't reliably detect whether
the device is of rotational or non-rotational type so we need to
leave the final decision about this setting to the user-space.

As a bonus do a minor CodingStyle fixup in queue_nomerges_store().

Suggested-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
15 years agoFix small typo in bio.h's documentation
Alberto Bertogli [Mon, 5 Jan 2009 09:18:53 +0000 (10:18 +0100)]
Fix small typo in bio.h's documentation

Signed-off-by: Alberto Bertogli <albertito@blitiri.com.ar>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
15 years agoblock: get rid of the manual directory counting in blktrace
Jens Axboe [Mon, 5 Jan 2009 09:17:25 +0000 (10:17 +0100)]
block: get rid of the manual directory counting in blktrace

It can result in a stuck blktrace system, if --kill is used.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
15 years agoblock: Allow empty integrity profile
Martin K. Petersen [Sun, 4 Jan 2009 07:43:40 +0000 (02:43 -0500)]
block: Allow empty integrity profile

Allow a block device to allocate and register an integrity profile
without providing a template.  This allows DM to preallocate a profile
to avoid deadlocks during table reconfiguration.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
15 years agoblock: Remove obsolete BUG_ON
Martin K. Petersen [Sun, 4 Jan 2009 07:43:39 +0000 (02:43 -0500)]
block: Remove obsolete BUG_ON

Now that bio_vecs are no longer cleared in bvec_alloc_bs() the following
BUG_ON must go.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
15 years agoblock: Don't verify integrity metadata on read error
Martin K. Petersen [Sun, 4 Jan 2009 07:43:38 +0000 (02:43 -0500)]
block: Don't verify integrity metadata on read error

If we get an I/O error on a read request there is no point in doing a
verify pass on the integrity buffer.  Adjust the completion path
accordingly.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
15 years agoext4: Remove bogus BUG() check in ext4_bmap()
Theodore Ts'o [Fri, 30 Jan 2009 05:00:24 +0000 (00:00 -0500)]
ext4: Remove bogus BUG() check in ext4_bmap()

The code to support journal-less ext4 operation added a BUG to
ext4_bmap() which fired if there was no journal and the
EXT4_STATE_JDATA bit was set in the i_state field.  This caused
running the filefrag program (which uses the FIMBAP ioctl) to trigger
a BUG().

The EXT4_STATE_JDATA bit is only used for ext4_bmap(), and it's
harmless for the bit to be set.  We could add a check in
__ext4_journalled_writepage() and ext4_journalled_write_end() to only
set the EXT4_STATE_JDATA bit if the journal is present, but that adds
an extra test and jump instruction.  It's easier to simply remove the
BUG check.

http://bugzilla.kernel.org/show_bug.cgi?id=12568

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@kernel.org
15 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6
Linus Torvalds [Fri, 30 Jan 2009 02:21:14 +0000 (18:21 -0800)]
Merge git://git./linux/kernel/git/sfrench/cifs-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
  cifs: make sure we allocate enough storage for socket address
  [CIFS] Make socket retry timeouts consistent between blocking and nonblocking cases
  [CIFS] some cleanup to dir.c prior to addition of posix_open
  [CIFS] revalidate parent inode when rmdir done within that directory
  [CIFS] Rename md5 functions to avoid collision with new rt modules
  cifs: turn smb_send into a wrapper around smb_sendv

15 years agosata_sil: Fix build breakage
Alexander Beregalov [Wed, 28 Jan 2009 23:30:56 +0000 (02:30 +0300)]
sata_sil: Fix build breakage

Commit e57db7b (SATA Sil: Blacklist system that spins off disks during ACPI power off)
 breaks build like the following, in both cases when CONFIG_DMI set or not.

        drivers/ata/sata_sil.c: In function 'sil_broken_system_poweroff':
        drivers/ata/sata_sil.c:713: error: implicit declaration of function 'dmi_first_match'
        drivers/ata/sata_sil.c:713: warning: initialization makes pointer from integer without a cast

  sata_sil.c should include dmi.h

Signed-off-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoDocumentation/Changes: add required versions for new filesystems
Bill Nottingham [Fri, 30 Jan 2009 00:28:40 +0000 (16:28 -0800)]
Documentation/Changes: add required versions for new filesystems

btrfs requires version 0.18 of its tools, and squashfs requires 4.0.
ext3 should use and ext4 requires v1.41.4 of e2fsprogs.

Signed-off-by: Bill Nottingham <notting@redhat.com>
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
cc: Ted Tso <tytso@mit.edu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agofix emacs indenting howto filename expansion
Dan Carpenter [Fri, 30 Jan 2009 00:28:28 +0000 (16:28 -0800)]
fix emacs indenting howto filename expansion

I don't think emacs understands tilde expansion, so use
"expand-file-name" to do that.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoDocumentation: update CodingStyle tips for Emacs users
Teemu Likonen [Fri, 30 Jan 2009 00:28:16 +0000 (16:28 -0800)]
Documentation: update CodingStyle tips for Emacs users

With the previous Emacs tips example the kernel style was made available
for files in the kernel-tree only. This patch updates the tip to add a
separate cc-mode indent style ("linux-tabs-only"). This makes it easy to
switch between different indent styles and also makes the kernel style
easily available for any filetype mode (c++, awk, ...) that is managed
by the Emacs cc-mode.

Signed-off-by: Teemu Likonen <tlikonen@iki.fi>
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoDocumentation: move DMA-mapping.txt to Doc/PCI/
Randy Dunlap [Fri, 30 Jan 2009 00:28:02 +0000 (16:28 -0800)]
Documentation: move DMA-mapping.txt to Doc/PCI/

Move DMA-mapping.txt to Documentation/PCI/.

DMA-mapping.txt was supposed to be moved from Documentation/ to
Documentation/PCI/.  The 00-INDEX files in those two directories
were updated, along with a few other text files, but the file
itself somehow escaped being moved, so move it and update more
text files and source files with its new location.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging-2.6
Linus Torvalds [Fri, 30 Jan 2009 02:14:20 +0000 (18:14 -0800)]
Merge git://git./linux/kernel/git/gregkh/staging-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging-2.6:
  Staging: poch: fix verification of memory area
  Staging: usbip: usbip_start_threads(): handle kernel_thread failure
  staging: agnx: drivers/staging/agnx/agnx.h needs <linux/io.h>
  Staging: android: task_get_unused_fd_flags: fix the wrong usage of tsk->signal
  Staging: android: Add lowmemorykiller documentation.
  Staging: android: fix build error on 64bit boxes
  Staging: android: timed_gpio: Fix build to build on kernels after 2.6.25.
  Staging: android: binder: fix arm build errors
  Staging: meilhaus: fix Kbuild
  Staging: comedi: fix Kbuild

15 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6
Linus Torvalds [Fri, 30 Jan 2009 02:14:05 +0000 (18:14 -0800)]
Merge git://git./linux/kernel/git/gregkh/driver-core-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6:
  driver-core: fix kernel-doc parameter name
  UIO: Add missing documentation of features added recently
  Sync patch for jp_JP/stable_kernel_rules.txt

15 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6
Linus Torvalds [Fri, 30 Jan 2009 02:13:22 +0000 (18:13 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/tiwai/sound-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
  ASoC: OMAP: Initialize XCCR and RCCR registers in McBSP DAI driver
  ASoC: Fix null string usage with WM8753 DAIs
  ALSA: hda - add another MacBook Pro 4, 1 subsystem ID
  ALSA: hda - Fix compile warning with CONFIG_SND_JACK=n
  ALSA: hda - Add quirk for HP DV6700 laptop
  ALSA: hda - Fix PCM reference NID for STAC/IDT analog outputs

15 years agoMerge branch 'linux-next' of git://git.infradead.org/ubi-2.6
Linus Torvalds [Fri, 30 Jan 2009 02:12:58 +0000 (18:12 -0800)]
Merge branch 'linux-next' of git://git.infradead.org/ubi-2.6

* 'linux-next' of git://git.infradead.org/ubi-2.6:
  UBI: allow direct user-space I/O
  UBI: fix resource de-allocation
  UBI: remove unused variable
  UBI: use nicer 64-bit math
  UBI: add ioctl compatibility
  UBI: constify file operations
  UBI: allow all ioctls
  UBI: remove unnecessry header inclusion
  UBI: improve ioctl commentaries
  UBI: add ioctl for is_mapped operation
  UBI: add ioctl for unmap operation
  UBI: add ioctl for map operation

15 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid
Linus Torvalds [Fri, 30 Jan 2009 02:11:02 +0000 (18:11 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/jikos/hid

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
  HID: document difference between hid_blacklist and hid_ignore_list
  HID: add antec-branded soundgraph imon devices to blacklist
  HID: fix reversed logic in disconnect testing of hiddev
  HID: adjust report descriptor fixup for MS 1028 receiver

15 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus
Linus Torvalds [Fri, 30 Jan 2009 02:10:36 +0000 (18:10 -0800)]
Merge git://git./linux/kernel/git/rusty/linux-2.6-for-linus

* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
  lguest: Fix a memory leak with the lg object during launcher close
  lguest: disable the FORTIFY for lguest.
  lguest: typos fix

15 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394...
Linus Torvalds [Fri, 30 Jan 2009 02:09:41 +0000 (18:09 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/ieee1394/linux1394-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6:
  ieee1394: sbp2: add workarounds for 2nd and 3rd generation iPods
  firewire: sbp2: add workarounds for 2nd and 3rd generation iPods
  firewire: sbp2: fix DMA mapping leak on the failure path
  firewire: sbp2: define some magic numbers as macros
  firewire: sbp2: fix payload limit at S1600 and S3200
  ieee1394: sbp2: don't assume zero model_id or firmware_revision if there is none
  ieee1394: sbp2: fix payload limit at S1600 and S3200
  ieee1394: sbp2: update a help string
  ieee1394: support for speeds greater than S800
  firewire: core: optimize card shutdown
  ieee1394: ohci1394: increase AT req. retries, fix ack_busy_X from Panasonic camcorders and others
  firewire: ohci: increase AT req. retries, fix ack_busy_X from Panasonic camcorders and others
  firewire: ohci: change "context_stop: still active" log message
  firewire: keep highlevel drivers attached during brief connection loss
  firewire: unnecessary BM delay after generation rollover
  firewire: insist on successive self ID complete events

15 years agodrivers/gpu/drm/i915/intel_lvds.c: fix locking snafu
Andrew Morton [Thu, 29 Jan 2009 22:25:27 +0000 (14:25 -0800)]
drivers/gpu/drm/i915/intel_lvds.c: fix locking snafu

s/unlock/lock/

Addresses http://bugzilla.kernel.org/show_bug.cgi?id=12575

Reported-by: Daniel Vetter <daniel@ffwll.ch>
Cc: Dave Airlie <airlied@linux.ie>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Eric Anholt <eric@anholt.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoepoll: drop max_user_instances and rely only on max_user_watches
Davide Libenzi [Thu, 29 Jan 2009 22:25:26 +0000 (14:25 -0800)]
epoll: drop max_user_instances and rely only on max_user_watches

Linus suggested to put limits where the money is, and max_user_watches
already does that w/out the need of max_user_instances.  That has the
advantage to mitigate the potential DoS while allowing pretty generous
default behavior.

Allowing top 4% of low memory (per user) to be allocated in epoll watches,
we have:

LOMEM    MAX_WATCHES (per user)
512MB    ~178000
1GB      ~356000
2GB      ~712000

A box with 512MB of lomem, will meet some challenge in hitting 180K
watches, socket buffers math teaches us.  No more max_user_instances
limits then.

Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Willy Tarreau <w@1wt.eu>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Cc: Bron Gondwana <brong@fastmail.fm>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agohp-wmi: set initial docking state
Frans Pop [Thu, 29 Jan 2009 22:25:25 +0000 (14:25 -0800)]
hp-wmi: set initial docking state

If the initial state is not set when the input device is set up, the first
docking event after the module is loaded will be lost.

Signed-off-by: Frans Pop <elendil@planet.nl>
Acked-by: Matthew Garrett <mjg59@srcf.ucam.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agohwmon: applesmc: add support for MacPro 3 temperature sensors
Bharath Ramesh [Thu, 29 Jan 2009 22:25:24 +0000 (14:25 -0800)]
hwmon: applesmc: add support for MacPro 3 temperature sensors

MacPro 3 have more temperature sensors than the previous MacPro's also the
sensor THTG has been removed.  This patch add supports for the newer
temperature sensors in the MacPro3.

Signed-off-by: Bharath Ramesh <bramesh@vt.edu>
Signed-off-by: Henrik Rydberg <rydberg@euromail.se>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agohpilo: increment version
David Altobelli [Thu, 29 Jan 2009 22:25:23 +0000 (14:25 -0800)]
hpilo: increment version

Bump hpilo module version to indicate that the open/close bug is fixed.

Signed-off-by: David Altobelli <david.altobelli@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agocgroup: fix root_count when mount fails due to busy subsystem
Paul Menage [Thu, 29 Jan 2009 22:25:22 +0000 (14:25 -0800)]
cgroup: fix root_count when mount fails due to busy subsystem

root_count was being incremented in cgroup_get_sb() after all error
checking was complete, but decremented in cgroup_kill_sb(), which can be
called on a superblock that we gave up on due to an error.  This patch
changes cgroup_kill_sb() to only decrement root_count if the root was
previously linked into the list of roots.

Signed-off-by: Paul Menage <menage@google.com>
Tested-by: Serge Hallyn <serue@us.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agocgroups: add cpu_relax() calls in css_tryget() and cgroup_clear_css_refs()
Paul Menage [Thu, 29 Jan 2009 22:25:21 +0000 (14:25 -0800)]
cgroups: add cpu_relax() calls in css_tryget() and cgroup_clear_css_refs()

css_tryget() and cgroup_clear_css_refs() contain polling loops; these
loops should have cpu_relax calls in them to reduce cross-cache traffic.

Signed-off-by: Paul Menage <menage@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agocgroups: fix lock inconsistency in cgroup_clone()
Li Zefan [Thu, 29 Jan 2009 22:25:21 +0000 (14:25 -0800)]
cgroups: fix lock inconsistency in cgroup_clone()

I fixed a bug in cgroup_clone() in Linus' tree in commit 7b574b7
("cgroups: fix a race between cgroup_clone and umount") without noticing
there was a cleanup patch in -mm tree that should be rebased (now commit
104cbd5, "cgroups: use task_lock() for access tsk->cgroups safe in
cgroup_clone()"), thus resulted in lock inconsistency.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Paul Menage <menage@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoalpha: fix the BUG() macro
Ivan Kokshaysky [Thu, 29 Jan 2009 22:25:20 +0000 (14:25 -0800)]
alpha: fix the BUG() macro

The commit "alpha: teach the compiler that BUG doesn't return"
(ed6b9b97f42c091630335bfb71a2931e6f86388b) moved the asm code into inline
function which takes __FILE__ and __LINE__ as arguments.  This violates
asm constrains there ("i" - an immediate operand with constant value), so
that compile may result in warning or error, depending on compiler
version.

Just adding an infinite loop to the BUG() is sufficient.

Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Richard Henderson <rth@twiddle.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoalpha: compile fixes
Ivan Kokshaysky [Thu, 29 Jan 2009 22:25:19 +0000 (14:25 -0800)]
alpha: compile fixes

- jensen build: fix conflicting declarations for pci_alloc_consistent()
  and undefined virt_to_phys();

- SMP: arch/alpha/kernel/smp.c:124: warning: passing argument 2
       of '__cpu_test_and_set' discards qualifiers from pointer target type
  Interestingly, this only happens with gcc-4.2; gcc <= 4.1 and gcc-4.3
  are OK. Fixed with extra assignment.

Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Richard Henderson <rth@twiddle.net>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoalpha: use syscall wrappers
Ivan Kokshaysky [Thu, 29 Jan 2009 22:25:18 +0000 (14:25 -0800)]
alpha: use syscall wrappers

Convert OSF syscalls and add alpha specific SYSCALL_ALIAS() macro.

Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agomemcg: NULL pointer dereference at rmdir on some NUMA systems
KAMEZAWA Hiroyuki [Thu, 29 Jan 2009 22:25:17 +0000 (14:25 -0800)]
memcg: NULL pointer dereference at rmdir on some NUMA systems

N_POSSIBLE doesn't means there is memory...and force_empty can
visit invalid node which have no pgdat.

To visit all valid nodes, N_HIGH_MEMORY should be used.

Reported-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Tested-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agofbdev: incorrect URL given in drivers/video/Kconfig
Alex Buell [Thu, 29 Jan 2009 22:25:16 +0000 (14:25 -0800)]
fbdev: incorrect URL given in drivers/video/Kconfig

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agohp-wmi: fix regressions caused by missing if statement
Frans Pop [Thu, 29 Jan 2009 22:25:14 +0000 (14:25 -0800)]
hp-wmi: fix regressions caused by missing if statement

Error was introduced in commit fe8e4e039dc3 ("hp-wmi: handle
rfkill_register() failure").

Signed-off-by: Frans Pop <elendil@planet.nl>
Acked-by: Larry Finger <Larry.Finger@lwfinger.net>
Acked-by: Matthew Garrett <mjg@redhat.com>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Len Brown <lenb@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agomemcg: update document to mention that swapoff should be tested
KAMEZAWA Hiroyuki [Thu, 29 Jan 2009 22:25:14 +0000 (14:25 -0800)]
memcg: update document to mention that swapoff should be tested

Considering the recently found problem "memcg: fix refcnt handling at
swapoff", it's better to mention swapoff behavior in the memcg_test
document.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agomemcg: fix refcnt handling at swapoff
KAMEZAWA Hiroyuki [Thu, 29 Jan 2009 22:25:13 +0000 (14:25 -0800)]
memcg: fix refcnt handling at swapoff

Now, at swapoff, even while try_charge() fails, commit is executed.  This
is a bug which turns the refcnt of cgroup_subsys_state negative.

Reported-by: Li Zefan <lizf@cn.fujitsu.com>
Tested-by: Li Zefan <lizf@cn.fujitsu.com>
Tested-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Balbir Singh <balbir@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agogpiolib: fix request related issue
Magnus Damm [Thu, 29 Jan 2009 22:25:12 +0000 (14:25 -0800)]
gpiolib: fix request related issue

Fix request-already-requested handling in gpio_request().

Signed-off-by: Magnus Damm <damm@igel.co.jp>
Acked-by: David Brownell <dbrownell@users.sourceforge.net>
Cc: <stable@kernel.org> [2.6.28.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agomemcg: get/put parents at create/free
Daisuke Nishimura [Thu, 29 Jan 2009 22:25:11 +0000 (14:25 -0800)]
memcg: get/put parents at create/free

The lifetime of struct cgroup and struct mem_cgroup is different and
mem_cgroup has its own reference count for handling references from
swap_cgroup.

This causes strange problem that the parent mem_cgroup dies while child
mem_cgroup alive, and this problem causes a bug in case of
use_hierarchy==1 because res_counter_uncharge climbs up the tree.

This patch is for avoiding it by getting the parent at create, and putting
it at freeing.

Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Reviewed-by; KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Paul Menage <menage@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agocgroups: use hierarchy mutex in creation failure path
KAMEZAWA Hiroyuki [Thu, 29 Jan 2009 22:25:10 +0000 (14:25 -0800)]
cgroups: use hierarchy mutex in creation failure path

Now, cgrp->sibling is handled under hierarchy mutex.
error route should do so, too.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Acked-by Paul Menage <menage@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agomm: OOM documentation update
Evgeniy Polyakov [Thu, 29 Jan 2009 22:25:09 +0000 (14:25 -0800)]
mm: OOM documentation update

Signed-off-by: Evgeniy Polyakov <zbr@ioremap.net>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agokprobes: fix module compilation error with CONFIG_KPROBES=n
Masami Hiramatsu [Thu, 29 Jan 2009 22:25:08 +0000 (14:25 -0800)]
kprobes: fix module compilation error with CONFIG_KPROBES=n

Define kprobes related data structures even if CONFIG_KPROBES is not set.
This fixes compilation errors which occur if CONFIG_KPROBES is not set, in
kprobe using modules.

[akpm@linux-foundation.org: fix build for non-kprobes-supporting architectures]
Reviewed-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agosgi-xpc: fix up stale DBUG_ON statements
Robin Holt [Thu, 29 Jan 2009 22:25:07 +0000 (14:25 -0800)]
sgi-xpc: fix up stale DBUG_ON statements

Clean up the stale DBUG_ON checks and add a couple new ones.

Signed-off-by: Robin Holt <holt@sgi.com>
Signed-off-by: Dean Nelson <dcn@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agosgi-xpc: Remove NULL pointer dereference.
Robin Holt [Thu, 29 Jan 2009 22:25:07 +0000 (14:25 -0800)]
sgi-xpc: Remove NULL pointer dereference.

If the bte copy fails, the attempt to retrieve payloads merely returns a
null pointer deref and not NULL as was expected.

Signed-off-by: Robin Holt <holt@sgi.com>
Signed-off-by: Dean Nelson <dcn@sgi.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agosgi-xpc: ensure flags are updated before bte_copy
Robin Holt [Thu, 29 Jan 2009 22:25:06 +0000 (14:25 -0800)]
sgi-xpc: ensure flags are updated before bte_copy

The clearing of the msg->flags needs a barrier between it and the notify
of the channel threads that the messages are cleaned and ready for use.

Signed-off-by: Robin Holt <holt@sgi.com>
Signed-off-by: Dean Nelson <dcn@sgi.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoFix OOPS in mmap_region() when merging adjacent VM_LOCKED file segments
Linus Torvalds [Fri, 30 Jan 2009 01:46:42 +0000 (17:46 -0800)]
Fix OOPS in mmap_region() when merging adjacent VM_LOCKED file segments

As of commit ba470de43188cdbff795b5da43a1474523c6c2fb ("map: handle
mlocked pages during map, remap, unmap") we now use the 'vma' variable
at the end of mmap_region() to handle the page-in of newly mapped
mlocked pages.

However, if we merged adjacent vma's together, the vma we're using may
be stale.  We historically consciously avoided using it after the merge
operation, but that got overlooked when redoing the locked page
handling.

This commit simplifies mmap_region() by doing any vma merges early,
avoiding the issue entirely, and 'vma' will always be valid.  As pointed
out by Hugh Dickins, this depends on any drivers that change the page
offset of flags to have set one of the VM_SPECIAL bits (so that they
cannot trigger the early merge logic), but that's true in general.

Reported-and-tested-by: Maksim Yevmenkin <maksim.yevmenkin@gmail.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agotulip: fix 21142 with 10Mbps without negotiation
Philippe De Muyter [Fri, 30 Jan 2009 01:35:04 +0000 (17:35 -0800)]
tulip: fix 21142 with 10Mbps without negotiation

with current kernels, tulip 21142 ethernet controllers fail to connect
to a 10Mbps only (i.e. without negotiation-partner) network.  It used
to work in 2.4 kernels.  Fix that.  Tested on a 21142 Rev 0x11.

Signed-off-by: Philippe De Muyter <phdm@macqel.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agodrivers/net/skfp: if !capable(CAP_NET_ADMIN): inverted logic
Roel Kluin [Fri, 30 Jan 2009 01:32:20 +0000 (17:32 -0800)]
drivers/net/skfp: if !capable(CAP_NET_ADMIN): inverted logic

Fix inverted logic

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agogianfar: Fix Wake-on-LAN support
Anton Vorontsov [Fri, 30 Jan 2009 01:31:13 +0000 (17:31 -0800)]
gianfar: Fix Wake-on-LAN support

commit 0f0ca340e57bd7446855fefd07a64249acf81223 ("phy: power
management support") caused a regression in the gianfar driver.

Now phylib turns off PHY power during suspend, and thus WOL
doesn't work anymore.

This patch workarounds the issue by enabling wakeup in the MDIO
device, i.e. just restores the old behaviour for the gianfar
driver. Note that this way all PHYs on a given MDIO bus won't
be turned off during suspend, which isn't good from the power
saving point of view.

A proper, per netdevice wakeup management support will need
a bit reworked phylib suspend/resume logic.

Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agosmsc911x: timeout reaches -1
Roel Kluin [Fri, 30 Jan 2009 01:30:00 +0000 (17:30 -0800)]
smsc911x: timeout reaches -1

With a postfix decrement the timeout will reach -1 rather than 0,
so the warning will not be issued.

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Acked-by: Steve Glendinning <steve.glendinning@smsc.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agosmsc9420: fix interrupt signalling test failures
Steve Glendinning [Fri, 30 Jan 2009 01:29:15 +0000 (17:29 -0800)]
smsc9420: fix interrupt signalling test failures

smsc9420 performs an interrupt signalling test when the interface is
brought up.  The current code mistakenly sets its test flag to false
AFTER enabling the software interrupt source, making failure quite
likely.

This patch changes the code to set the test flag BEFORE enabling
interrupts. I've also removed an smp_wmb because the following spinlock
provides an implicit memory barrier.

Signed-off-by: Steve Glendinning <steve.glendinning@smsc.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agoucc_geth: Change uec phy id to the same format as gianfar's
Haiying Wang [Fri, 30 Jan 2009 01:28:04 +0000 (17:28 -0800)]
ucc_geth: Change uec phy id to the same format as gianfar's

The commit b31a1d8b41513b96e9c7ec2f68c5734cef0b26a4 ("gianfar: Convert
gianfar to an of_platform_driver") changes the gianfar's phy id to the
format like "mdio@xxxx:xx", but uec still uses the old format like
"xxxxxxxx:xx".  For the board whose UEC uses gianfar-mdio like
MPC8568MDS, the phy can not be attached because of the incompatible
phy id format. This patch changes uec's phy id to the same format as
gianfar's.

Signed-off-by: Haiying Wang <Haiying.Wang@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agowimax: fix build issue when debugfs is disabled
Inaky Perez-Gonzalez [Fri, 30 Jan 2009 01:18:31 +0000 (17:18 -0800)]
wimax: fix build issue when debugfs is disabled

As reported by Toralf Förster and Randy Dunlap.

- http://linuxwimax.org/pipermail/wimax/2009-January/000460.html

- http://lkml.org/lkml/2009/1/29/279

The definitions needed for the wimax stack and i2400m driver debug
infrastructure was, by mistake, compiled depending on CONFIG_DEBUG_FS
(by them being placed in the debugfs.c files); thus the build broke in
2.6.29-rc3 when debugging was enabled (CONFIG_WIMAX_DEBUG) and
DEBUG_FS was disabled.

These definitions are always needed if debug is enabled at compile
time (independently of DEBUG_FS being or not enabled), so moving them
to a file that is always compiled fixes the issue.

Signed-off-by: Inaky Perez-Gonzalez <inaky@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agolguest: Fix a memory leak with the lg object during launcher close
Mark Wallis [Mon, 26 Jan 2009 06:32:35 +0000 (17:32 +1100)]
lguest: Fix a memory leak with the lg object during launcher close

Fix a memory leak identified by Rusty Russell during LCA09 by
kfree'ing the lg object instead of just clearing it when the
launcher closes.

Signed-off-by: Mark Wallis <mwallis@serialmonkey.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
15 years agolguest: disable the FORTIFY for lguest.
Tim 'mithro' Ansell [Thu, 22 Jan 2009 04:06:41 +0000 (15:06 +1100)]
lguest: disable the FORTIFY for lguest.

Makes all the warnings go away when compiling lguest on Ubuntu on
Intrepid or greater.

Signed-off-by: Timothy R Ansell <mithro@mithis.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
15 years agolguest: typos fix
Atsushi SAKAI [Fri, 16 Jan 2009 11:39:14 +0000 (20:39 +0900)]
lguest: typos fix

3 points

lguest_asm.S => i386_head.S
LHCALL_BREAK => LHREQ_BREAK
perferred    => preferred

Signed-off-by: Atsushi SAKAI <sakaia@jp.fujitsu.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
15 years agonetxen: fix memory leak in drivers/net/netxen_nic_init.c
Daniel Marjamäki [Thu, 29 Jan 2009 08:55:56 +0000 (08:55 +0000)]
netxen: fix memory leak in drivers/net/netxen_nic_init.c

For kernel bugzilla #12537:
http://bugzilla.kernel.org/show_bug.cgi?id=12537

Free memory.

Signed-off-by: Daniel Marjamäki <danielm77@spray.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agotun: Add some missing TUN compat ioctl translations.
David S. Miller [Fri, 30 Jan 2009 00:53:35 +0000 (16:53 -0800)]
tun: Add some missing TUN compat ioctl translations.

Based upon a report from Michael Tokarev <mjt@tls.msk.ru>:

Just saw in dmesg:

ioctl32(kvm:4408): Unknown cmd fd(9) cmd(800454cf){t:'T';sz:4} arg(ffc668e4) on /dev/net/tun

Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agoipv4: fix infinite retry loop in IP-Config
Benjamin Zores [Fri, 30 Jan 2009 00:19:13 +0000 (16:19 -0800)]
ipv4: fix infinite retry loop in IP-Config

Signed-off-by: Benjamin Zores <benjamin.zores@alcatel-lucent.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agonet: update documentation ip aliases
Stephen Hemminger [Fri, 30 Jan 2009 00:16:31 +0000 (16:16 -0800)]
net: update documentation ip aliases

This documentation is old.  Add a short note to describe why aliases
are no long necessary, and remove the old contact/edit info.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agonet: Fix OOPS in skb_seq_read().
Shyam Iyer [Fri, 30 Jan 2009 00:12:42 +0000 (16:12 -0800)]
net: Fix OOPS in skb_seq_read().

It oopsd for me in skb_seq_read. addr2line said it was
linux-2.6/net/core/skbuff.c:2228, which is this line:

while (st->frag_idx < skb_shinfo(st->cur_skb)->nr_frags) {

I added some printks in there and it looks like we hit this:

        } else if (st->root_skb == st->cur_skb &&
                   skb_shinfo(st->root_skb)->frag_list) {
                 st->cur_skb = skb_shinfo(st->root_skb)->frag_list;
                 st->frag_idx = 0;
                 goto next_skb;
        }

Actually I did some testing and added a few printks and found that the
st->cur_skb->data was 0 and hence the ptr used by iscsi_tcp was null.
This caused the kernel panic.

  if (abs_offset < block_limit) {
- *data = st->cur_skb->data + abs_offset;
+ *data = st->cur_skb->data + (abs_offset - st->stepped_offset);

I enabled the debug_tcp and with a few printks found that the code did
not go to the next_skb label and could find that the sequence being
followed was this -

It hit this if condition -

        if (st->cur_skb->next) {
                st->cur_skb = st->cur_skb->next;
                st->frag_idx = 0;
                goto next_skb;

And so, now the st pointer is shifted to the next skb whereas actually
it should have hit the second else if first since the data is in the
frag_list.

        else if (st->root_skb == st->cur_skb &&
                 skb_shinfo(st->root_skb)->frag_list) {
                st->cur_skb = skb_shinfo(st->root_skb)->frag_list;
                goto next_skb;
        }

Reversing the two conditions the attached patch fixes the issue for me
on top of Herbert's patches.

Signed-off-by: Shyam Iyer <shyam_iyer@dell.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agonet: Fix frag_list handling in skb_seq_read
Herbert Xu [Fri, 30 Jan 2009 00:07:52 +0000 (16:07 -0800)]
net: Fix frag_list handling in skb_seq_read

The frag_list handling was broken in skb_seq_read:

1) We didn't add the stepped offset when looking at the head
are of fragments other than the first.

2) We didn't take the stepped offset away when setting the data
pointer in the head area.

3) The frag index wasn't reset.

This patch fixes both issues.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agonetxen: revert jumbo ringsize
Dhananjay Phadke [Fri, 30 Jan 2009 00:05:19 +0000 (16:05 -0800)]
netxen: revert jumbo ringsize

Reducing jumbo ring size below 1024 reduces throughput for old
firmwares (3.4.216 and older) running on older (NX2031) chip,
so restore it back to 1024.

This was reduced in commit 32ec803348b4d5f1353e1d7feae30880b8b3e342
("netxen: reduce memory footprint").

Raising jumbo ring size from 512 to 1024, adds ~4MB per port, but
there's still big saving because of original patch (~20MB per port).

Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agoMerge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wirel...
David S. Miller [Thu, 29 Jan 2009 23:27:47 +0000 (15:27 -0800)]
Merge branch 'master' of git://git./linux/kernel/git/linville/wireless-2.6