openwrt/staging/blogic.git
16 years agox86: remove duplicated force_mwait
Yinghai Lu [Mon, 8 Sep 2008 00:58:51 +0000 (17:58 -0700)]
x86: remove duplicated force_mwait

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: cpu make amd.c more like amd_64.c v2
Yinghai Lu [Mon, 8 Sep 2008 00:58:50 +0000 (17:58 -0700)]
x86: cpu make amd.c more like amd_64.c v2

1. make 32bit have early_init_amd_mc and amd_detect_cmp
2. separate init_amd_k5/k6/k7 ...

v2: fix compiling for !CONFIG_SMP

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86-64: add two __cpuinit annotations
Jan Beulich [Fri, 29 Aug 2008 12:15:04 +0000 (13:15 +0100)]
x86-64: add two __cpuinit annotations

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86, cpu init: call early_init_xxx in init_xxx
Yinghai Lu [Sat, 6 Sep 2008 08:52:28 +0000 (01:52 -0700)]
x86, cpu init: call early_init_xxx in init_xxx

so we:

 1. can set some capability bits on the APs
 2. can restore some capability bits after the memset in identify_cpu for the boot cpu

This matters especially for CONSTANT_TSC:

before this patch:
 flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow rep_good nopl pni monitor cx16 lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs

after this patch:
 flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nopl pni monitor cx16 lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs

so constant_tsc is back...
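
A minimal sketch of how the two flag lines above could be compared programmatically. The helper name has_cpu_flag is hypothetical and not part of the patch; it only assumes that flags are separated by spaces, as in the /proc/cpuinfo output quoted above.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return 1 if `name` appears as a whole, whitespace-separated word in
 * `flags`, 0 otherwise. A plain strstr() is not enough, because "tsc"
 * would also match inside "constant_tsc".
 * (Hypothetical helper, for illustration only.)
 */
int has_cpu_flag(const char *flags, const char *name)
{
	size_t len = strlen(name);
	const char *p = flags;

	if (len == 0)
		return 0;

	while ((p = strstr(p, name)) != NULL) {
		int starts_word = (p == flags) || p[-1] == ' ' || p[-1] == '\t';
		int ends_word = p[len] == '\0' || p[len] == ' ' || p[len] == '\n';

		if (starts_word && ends_word)
			return 1;
		p += 1;	/* keep searching after this partial match */
	}
	return 0;
}
```

Calling has_cpu_flag() with the "before" line and "constant_tsc" would report the flag as missing, and with the "after" line as present.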

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: remove duplicated get_model_name() calling
Yinghai Lu [Sat, 6 Sep 2008 08:52:27 +0000 (01:52 -0700)]
x86: remove duplicated get_model_name() calling

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86, init_64.c: cleanup
Ingo Molnar [Fri, 5 Sep 2008 08:23:26 +0000 (10:23 +0200)]
x86, init_64.c: cleanup

Clean up comments.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: move nonx_setup etc from common.c to init_64.c
Yinghai Lu [Fri, 5 Sep 2008 07:58:28 +0000 (00:58 -0700)]
x86: move nonx_setup etc from common.c to init_64.c

like 32-bit, which puts it in init_32.c

Signed-off-by: Yinghai <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: use cpu/common.c on 64 bit
Yinghai Lu [Fri, 5 Sep 2008 03:09:14 +0000 (20:09 -0700)]
x86: use cpu/common.c on 64 bit

Use cpu/common.c on both 64-bit and 32-bit and remove cpu/common_64.c.

We started out with this linecount:

  816  arch/x86/kernel/cpu/common_64.c
  805  arch/x86/kernel/cpu/common.c

and the resulting common.c is 1197 lines long, so there's already
424 lines of code eliminated in this phase of the unification.

Signed-off-by: Yinghai <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: cpu/common*.c, merge whitespaces
Ingo Molnar [Fri, 5 Sep 2008 07:37:15 +0000 (09:37 +0200)]
x86: cpu/common*.c, merge whitespaces

Merge leftover whitespaces, to make arch/x86/kernel/cpu/common_64.c
exactly identical to arch/x86/kernel/cpu/common.c.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: cpu/common*.c, merge identify_cpu()
Yinghai Lu [Fri, 5 Sep 2008 03:09:13 +0000 (20:09 -0700)]
x86: cpu/common*.c, merge identify_cpu()

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: cpu/common*.c, merge generic_identify()
Yinghai Lu [Fri, 5 Sep 2008 03:09:12 +0000 (20:09 -0700)]
x86: cpu/common*.c, merge generic_identify()

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: cpu/common*.c: merge print_cpu_info()
Yinghai Lu [Fri, 5 Sep 2008 03:09:11 +0000 (20:09 -0700)]
x86: cpu/common*.c: merge print_cpu_info()

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: cpu/common*.c, merge early_identify_cpu()
Yinghai Lu [Fri, 5 Sep 2008 03:09:10 +0000 (20:09 -0700)]
x86: cpu/common*.c, merge early_identify_cpu()

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: cpu/common.c: merge get_cpu_cap()
Yinghai Lu [Fri, 5 Sep 2008 03:09:09 +0000 (20:09 -0700)]
x86: cpu/common.c: merge get_cpu_cap()

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: cpu/common*.c, merge detect_ht()
Yinghai Lu [Fri, 5 Sep 2008 03:09:08 +0000 (20:09 -0700)]
x86: cpu/common*.c, merge detect_ht()

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: cpu/common*.c, merge display_cacheinfo()
Yinghai Lu [Fri, 5 Sep 2008 03:09:07 +0000 (20:09 -0700)]
x86: cpu/common*.c, merge display_cacheinfo()

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: cpu/common.c, merge default_init()
Yinghai Lu [Fri, 5 Sep 2008 03:09:06 +0000 (20:09 -0700)]
x86: cpu/common.c, merge default_init()

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: cpu/common*.c, merge switch_to_new_gdt()
Yinghai Lu [Fri, 5 Sep 2008 03:09:05 +0000 (20:09 -0700)]
x86: cpu/common*.c, merge switch_to_new_gdt()

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: cpu/common*.c have same cpu_init(), with copying and #ifdef
Yinghai Lu [Fri, 5 Sep 2008 03:09:04 +0000 (20:09 -0700)]
x86: cpu/common*.c have same cpu_init(), with copying and #ifdef

hard to merge by lines... (as here we have material differences between
32-bit and 64-bit mode) - will try to do it later.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: cpu/common*.c, make 32-bit have 64-bit only functions
Yinghai Lu [Fri, 5 Sep 2008 03:09:03 +0000 (20:09 -0700)]
x86: cpu/common*.c, make 32-bit have 64-bit only functions

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: cpu/common.c, let 64-bit code have 32-bit only functions
Yinghai Lu [Fri, 5 Sep 2008 03:09:02 +0000 (20:09 -0700)]
x86: cpu/common.c, let 64-bit code have 32-bit only functions

No effect on 64-bit.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: same gdt_page with macro
Yinghai Lu [Fri, 5 Sep 2008 03:09:01 +0000 (20:09 -0700)]
x86: same gdt_page with macro

Move the 32-bit and 64-bit gdt_page definitions next to each
other, separated with an #ifdef.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: make header file the same in arch/x86/kernel/cpu/common_xx.c
Yinghai Lu [Fri, 5 Sep 2008 03:09:00 +0000 (20:09 -0700)]
x86: make header file the same in arch/x86/kernel/cpu/common_xx.c

Make the files more similar in preparation to unification, no
code changed.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: make detect_ht depend on CONFIG_X86_HT
Yinghai Lu [Fri, 5 Sep 2008 03:08:59 +0000 (20:08 -0700)]
x86: make detect_ht depend on CONFIG_X86_HT

64-bit has X86_HT set too, so use that instead of SMP.

This also removes a include/asm-x86/processor.h ifdef.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoMerge branch 'x86/core' into x86/unify-cpu-detect
Ingo Molnar [Fri, 5 Sep 2008 07:27:23 +0000 (09:27 +0200)]
Merge branch 'x86/core' into x86/unify-cpu-detect

16 years agoMerge commit '63cc8c75156462d4b42cbdd76c293b7eee7ddbfe':
Ingo Molnar [Fri, 5 Sep 2008 07:24:30 +0000 (09:24 +0200)]
Merge commit '63cc8c75156462d4b42cbdd76c293b7eee7ddbfe':

  "percpu: introduce DEFINE_PER_CPU_PAGE_ALIGNED() macro"

into x86/core

Conflicts:
arch/x86/kernel/cpu/common.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoMerge branch 'x86/x2apic' into x86/core
Ingo Molnar [Fri, 5 Sep 2008 07:21:21 +0000 (09:21 +0200)]
Merge branch 'x86/x2apic' into x86/core

Conflicts:
arch/x86/kernel/cpu/common_64.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoMerge branch 'x86/cpu' into x86/core
Ingo Molnar [Fri, 5 Sep 2008 07:19:50 +0000 (09:19 +0200)]
Merge branch 'x86/cpu' into x86/core

16 years agoMerge branch 'x86/xsave' into x86/core
Ingo Molnar [Fri, 5 Sep 2008 07:18:39 +0000 (09:18 +0200)]
Merge branch 'x86/xsave' into x86/core

16 years agox86: move 32bit related functions together
Yinghai Lu [Thu, 4 Sep 2008 19:09:47 +0000 (21:09 +0200)]
x86: move 32bit related functions together

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: make get_model_name of 64bit the same as 32bit
Yinghai Lu [Thu, 4 Sep 2008 19:09:46 +0000 (21:09 +0200)]
x86: make get_model_name of 64bit the same as 32bit

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: make 32bit support show_msr like 64 bit
Yinghai Lu [Thu, 4 Sep 2008 19:09:46 +0000 (21:09 +0200)]
x86: make 32bit support show_msr like 64 bit

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: remove cpu_vendor_dev
Yinghai Lu [Thu, 4 Sep 2008 19:09:45 +0000 (21:09 +0200)]
x86: remove cpu_vendor_dev

1. add c_x86_vendor into cpu_dev
2. change cpu_devs to static
3. check c_x86_vendor before put that cpu_dev into array
4. remove alignment for 64bit
5. order the sequence in cpu_devs according to link sequence...
   so could put intel at first, then amd...

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: order functions in cpu/common.c and cpu/common_64.c v2
Yinghai Lu [Thu, 4 Sep 2008 19:09:44 +0000 (21:09 +0200)]
x86: order functions in cpu/common.c and cpu/common_64.c v2

v2: make 64 bit get c->x86_cache_alignment = c->x86_clflush_size

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: make (early)_identify_cpu more the same between 32bit and 64 bit
Yinghai Lu [Thu, 4 Sep 2008 19:09:44 +0000 (21:09 +0200)]
x86: make (early)_identify_cpu more the same between 32bit and 64 bit

 1. add extended_cpuid_level for 32bit
 2. add generic_identify for 64bit
 3. add early_identify_cpu for 32bit
 4. early_identify_cpu is no longer called by identify_cpu
 5. remove early in get_cpu_vendor for 32bit
 6. add get_cpu_cap
 7. add cpu_detect for 64bit

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: delay early cpu initialization until cpuid is done
Krzysztof Helt [Thu, 4 Sep 2008 19:09:43 +0000 (21:09 +0200)]
x86: delay early cpu initialization until cpuid is done

Move early cpu initialization after the early cpu capability detection
so that the early cpu initialization can fix up cpu caps.

Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: move mtrr cpu cap setting early in early_init_xxxx
Yinghai Lu [Thu, 4 Sep 2008 19:09:43 +0000 (21:09 +0200)]
x86: move mtrr cpu cap setting early in early_init_xxxx

Krzysztof Helt found that MTRR is not detected on k6-2.

root cause:
we moved mtrr_bp_init() early for mtrr trimming,
and in early_detect we only read the CPU capability from cpuid,
but some cpus don't have that bit in cpuid.

So we need to add early_init_xxxx to preset those bits before mtrr_bp_init
for those earlier cpus.

this patch is for v2.6.27

Reported-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoMerge branch 'x86/debug' into x86/cpu
Ingo Molnar [Thu, 4 Sep 2008 19:08:09 +0000 (21:08 +0200)]
Merge branch 'x86/debug' into x86/cpu

16 years agox86: unify using pci_mmcfg_insert_resource
Yinghai Lu [Thu, 4 Sep 2008 19:04:32 +0000 (21:04 +0200)]
x86: unify using pci_mmcfg_insert_resource

even with known_bridge insert them late too.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: split e820 reserved entries record to late, v7
Yinghai Lu [Thu, 4 Sep 2008 18:59:22 +0000 (20:59 +0200)]
x86: split e820 reserved entries record to late, v7

try to insert_resource second time, by expanding the resource...

for case: e820 reserved entry is partially overlapped with bar res...

hope it will never happen

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoMerge branch 'core/resources' into x86/core
Ingo Molnar [Thu, 4 Sep 2008 19:04:04 +0000 (21:04 +0200)]
Merge branch 'core/resources' into x86/core

16 years agoIO resources: add reserve_region_with_split()
Yinghai Lu [Thu, 4 Sep 2008 19:02:44 +0000 (21:02 +0200)]
IO resources: add reserve_region_with_split()

add reserve_region_with_split() to not lose e820 reserved entries if
they overlap with existing IO regions:

with a test case that extends 0xe0000000 - 0xefffffff to 0xdd800000 -
0xefffffff, we get:
e0000000-efffffff : PCI MMCONFIG 0
 e0000000-efffffff : reserved
in /proc/iomem, and:
found conflict for reserved [dd800000efffffff], try to reserve with split
    __reserve_region_with_split: (PCI Bus #80) [dd000000ddffffff], res: (reserved) [dd800000efffffff]
    __reserve_region_with_split: (PCI Bus #00) [de000000dfffffff], res: (reserved) [de000000efffffff]
initcall pci_subsys_init+0x0/0x121 returned 0 after 381 msecs
in dmesg

various fixes and improvements suggested by Linus.
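
The splitting idea can be sketched in user space as follows. This is a simplified illustration, not the kernel's reserve_region_with_split(): struct range, first_conflict() and reserve_with_split() are hypothetical names, the busy regions are a flat array rather than a resource tree, and nesting (such as the reserved entry ending up under PCI MMCONFIG 0) is ignored.

```c
#include <stddef.h>
#include <stdint.h>

struct range {
	uint64_t start;
	uint64_t end;	/* inclusive */
};

/* Return the overlapping busy range with the lowest start, or NULL. */
const struct range *first_conflict(const struct range *busy, size_t n,
				   uint64_t start, uint64_t end)
{
	const struct range *best = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		if (busy[i].start <= end && busy[i].end >= start) {
			if (!best || busy[i].start < best->start)
				best = &busy[i];
		}
	}
	return best;
}

/*
 * Reserve every part of [start, end] that does not collide with a busy
 * range. The surviving pieces are written to out[]; at most `max` pieces
 * are stored and any further pieces are dropped. Returns the piece count.
 */
size_t reserve_with_split(const struct range *busy, size_t n,
			  uint64_t start, uint64_t end,
			  struct range *out, size_t max)
{
	size_t count = 0;

	while (start <= end) {
		const struct range *c = first_conflict(busy, n, start, end);

		if (!c) {
			/* nothing in the way: keep the whole remainder */
			if (count < max)
				out[count++] = (struct range){ start, end };
			break;
		}
		/* keep the part in front of the conflict, if any */
		if (c->start > start && count < max)
			out[count++] = (struct range){ start, c->start - 1 };
		/* conflict covers the rest of the request */
		if (c->end >= end)
			break;
		/* retry with whatever lies behind the conflict */
		start = c->end + 1;
	}
	return count;
}
```

With the two PCI bus ranges from the dmesg output as busy regions and dd800000-efffffff as the request, the sketch trims the request twice and keeps e0000000-efffffff, which matches the reserved entry shown for /proc/iomem.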

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoforgotten refcount on sysctl root table
Al Viro [Thu, 4 Sep 2008 16:05:57 +0000 (17:05 +0100)]
forgotten refcount on sysctl root table

We should've set refcount on the root sysctl table; otherwise we'll blow
up the first time we get down to zero dynamically registered sysctl
tables.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoMerge branch 'x86/cpu' into x86/x2apic
H. Peter Anvin [Thu, 4 Sep 2008 16:21:21 +0000 (09:21 -0700)]
Merge branch 'x86/cpu' into x86/x2apic

Conflicts:

arch/x86/kernel/cpu/feature_names.c
include/asm-x86/cpufeature.h

16 years agoMerge branch 'x86/cpu' into x86/xsave
H. Peter Anvin [Thu, 4 Sep 2008 16:04:45 +0000 (09:04 -0700)]
Merge branch 'x86/cpu' into x86/xsave

Conflicts:

arch/x86/kernel/cpu/feature_names.c
include/asm-x86/cpufeature.h

16 years agox86: drop -funroll-loops for csum_partial_64.c
Andi Kleen [Thu, 4 Sep 2008 11:46:11 +0000 (13:46 +0200)]
x86: drop -funroll-loops for csum_partial_64.c

Impact: performance optimization

I did some rebenchmarking with modern compilers and dropping
-funroll-loops makes the function consistently go faster by a few
percent.  So drop that flag.

Thanks to Richard Guenther for a hint.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
16 years agox86: split e820 reserved entries record to late v4
Ingo Molnar [Fri, 29 Aug 2008 06:09:23 +0000 (08:09 +0200)]
x86: split e820 reserved entries record to late v4

this one replaces:

| commit a2bd7274b47124d2fc4dfdb8c0591f545ba749dd
| Author: Yinghai Lu <yhlu.kernel@gmail.com>
| Date:   Mon Aug 25 00:56:08 2008 -0700
|
|    x86: fix HPET regression in 2.6.26 versus 2.6.25, check hpet against BAR, v3

v2: insert e820 reserve resources before pnp_system_init
v3: fix merging problem in tip/x86/core
v4: address Linus's review about comments and condition in _late()

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: split e820 reserved entries record to late v2
Yinghai Lu [Thu, 28 Aug 2008 20:52:25 +0000 (13:52 -0700)]
x86: split e820 reserved entries record to late v2

so we can let BAR resources register first, or even pnp.

v2: insert e820 reserve resources before pnp_system_init

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoMerge branch 'linus' into x86/core
H. Peter Anvin [Thu, 4 Sep 2008 15:09:09 +0000 (08:09 -0700)]
Merge branch 'linus' into x86/core

16 years agox86: move dir es7000 to es7000_32.c
Yinghai Lu [Thu, 28 Aug 2008 06:01:16 +0000 (23:01 -0700)]
x86: move dir es7000 to es7000_32.c

to be aligned with numaq, summit.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoMerge branch 'x86/cpu' into x86/core
H. Peter Anvin [Thu, 4 Sep 2008 15:08:42 +0000 (08:08 -0700)]
Merge branch 'x86/cpu' into x86/core

Conflicts:

arch/x86/kernel/cpu/feature_names.c
include/asm-x86/cpufeature.h

16 years agoMerge branch 'linus' into x86/x2apic
Ingo Molnar [Thu, 4 Sep 2008 11:02:35 +0000 (13:02 +0200)]
Merge branch 'linus' into x86/x2apic

Conflicts:
arch/x86/kernel/cpu/cyrix.c
include/asm-x86/cpufeature.h

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoPCI: fix pbus_size_mem() resource alignment for CardBus controllers
Linus Torvalds [Thu, 4 Sep 2008 08:33:59 +0000 (01:33 -0700)]
PCI: fix pbus_size_mem() resource alignment for CardBus controllers

Commit 884525655d07fdee9245716b998ecdc45cdd8007 ("PCI: clean up resource
alignment management") changed the resource handling to mark how a
resource was aligned on a per-resource basis.

Thus, instead of looking at the resource number to determine whether it
was a bridge resource or a regular resource (they have different
alignment rules), we should just ask the resource for its alignment
directly.

The reason this broke only cardbus resources was that for the other
types of resources, the old way of deciding alignment actually still
happened to work.  But CardBus bridge resources had been changed by
commit 934b7024f0ed29003c95cef447d92737ab86dc4f ("Fix cardbus resource
allocation") to look more like regular resources than PCI bridge
resources from an alignment handling standpoint.
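
A toy model of "asking the resource for its alignment directly". All names here (struct toy_resource, the TOY_* flags, toy_resource_alignment()) are hypothetical stand-ins and not the kernel's definitions; the sketch only assumes that each resource records whether its alignment is its size or is stored in its start field.

```c
#include <stdint.h>

#define TOY_SIZEALIGN	0x1u	/* alignment equals the resource size */
#define TOY_STARTALIGN	0x2u	/* alignment is stored in the start field */

struct toy_resource {
	uint64_t start;
	uint64_t end;		/* inclusive */
	unsigned int flags;
};

/*
 * Derive the alignment from the resource's own flags instead of from
 * its position in the device's resource table.
 */
uint64_t toy_resource_alignment(const struct toy_resource *res)
{
	switch (res->flags & (TOY_SIZEALIGN | TOY_STARTALIGN)) {
	case TOY_SIZEALIGN:
		return res->end - res->start + 1;
	case TOY_STARTALIGN:
		return res->start;
	default:
		return 0;	/* no alignment information recorded */
	}
}
```

The point of the design is that a caller no longer needs to know which kind of resource it is looking at; the per-resource marking carries that information.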

Reported-and-tested-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agox86: Change warning message in TSC calibration.
Alok N Kataria [Thu, 4 Sep 2008 01:18:01 +0000 (18:18 -0700)]
x86: Change warning message in TSC calibration.

When calibration against PIT fails, the warning that we print is misleading.
In a virtualized environment the VM may get descheduled during calibration,
or the check in PIT calibration may fail due to other virtualization
overheads.

The warning message explicitly assumes that calibration failed due to SMIs,
which may not be the case. Change that to something proper.

Signed-off-by: Alok N Kataria <akataria@vmware.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agommap: fix petty bug in anonymous shared mmap offset handling
Tejun Heo [Wed, 3 Sep 2008 14:09:47 +0000 (16:09 +0200)]
mmap: fix petty bug in anonymous shared mmap offset handling

Anonymous mappings should ignore the offset, but the shared anonymous
mapping path forgot to clear it, which makes the following legit test
program trigger SIGBUS.

 #include <sys/mman.h>
 #include <stdio.h>
 #include <errno.h>

 #define PAGE_SIZE 4096

 int main(void)
 {
     char *p;
     int i;

     p = mmap(NULL, 2 * PAGE_SIZE, PROT_READ|PROT_WRITE,
              MAP_SHARED|MAP_ANONYMOUS, -1, PAGE_SIZE);
     if (p == MAP_FAILED) {
         perror("mmap");
         return 1;
     }

     for (i = 0; i < 2; i++) {
         printf("page %d\n", i);
         p[i * 4096] = i;
     }
     return 0;
 }

Fix it.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Hugh Dickins <hugh@veritas.com>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris...
Linus Torvalds [Thu, 4 Sep 2008 00:57:55 +0000 (17:57 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6:
  SELinux: memory leak in security_context_to_sid_core

16 years agoMerge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc
Linus Torvalds [Thu, 4 Sep 2008 00:36:37 +0000 (17:36 -0700)]
Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc

* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc:
  powerpc: Fix for getting CPU number in power_save_ppc32_restore()
  powerpc: Fix build error with 64K pages and !hugetlbfs
  powerpc: Work around gcc's -fno-omit-frame-pointer bug
  powerpc: Make sure _etext is after all kernel text
  powerpc: Only make kernel text pages of linear mapping executable
  powerpc: Fix uninitialised variable in VSX alignment code

16 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
Linus Torvalds [Wed, 3 Sep 2008 23:21:02 +0000 (16:21 -0700)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  bnx2x: Accessing un-mapped page
  ath9k: Fix TX control flag use for no ACK and RTS/CTS
  ath9k: Fix TX status reporting
  iwlwifi: fix STATUS_EXIT_PENDING is not set on pci_remove
  iwlwifi: call apm stop on exit
  iwlwifi: fix Tx cmd memory allocation failure handling
  iwlwifi: fix rx_chain computation
  iwlwifi: fix station mimo power save values
  iwlwifi: remove false rxon if rx chain changes
  iwlwifi: fix hidden ssid discovery in passive channels
  iwlwifi: W/A for the TSF correction in IBSS
  netxen: Remove workaround for chipset quirk
  pcnet-cs, axnet_cs: add new IDs, remove dup ID with less info
  ixgbe: initialize interrupt throttle rate
  net/usb/pegasus: avoid hundreds of diagnostics
  tipc: Don't use structure names which easily globally conflict.

16 years agoSELinux: memory leak in security_context_to_sid_core
Eric Paris [Wed, 3 Sep 2008 15:49:47 +0000 (11:49 -0400)]
SELinux: memory leak in security_context_to_sid_core

Fix a bug and a philosophical decision about who handles errors.

security_context_to_sid_core() was leaking a context in the common case.
This was causing problems on fedora systems which recently have started
making extensive use of this function.

In discussion it was decided that if string_to_context_struct() had an
error it was its own responsibility to clean up any mess it created
along the way.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
16 years agoMerge branch 'davem-fixes' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik...
David S. Miller [Wed, 3 Sep 2008 21:43:30 +0000 (14:43 -0700)]
Merge branch 'davem-fixes' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6

16 years agobnx2x: Accessing un-mapped page
Eilon Greenstein [Wed, 3 Sep 2008 21:38:00 +0000 (14:38 -0700)]
bnx2x: Accessing un-mapped page

The allocated RX buffer size was 64 bytes bigger than the PCI mapped
size for no good reason. If the packet was actually using the buffer up
to its limit and if the last 64 bytes of the buffer crossed a 4KB boundary
then an unmapped PCI page was accessed. The fix is to use only one
parameter for the buffer size. There is no need to differentiate
between the buffer size and the PCI mapping size since the extra 64
bytes can actually be used by the FW to align the Ethernet payload to
64 bytes.
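
The failure condition can be sketched as plain address arithmetic. The function name and parameters below are hypothetical and are not taken from the bnx2x driver; the sketch assumes the mapping covers whole 4KB pages touched by the mapped length.

```c
#include <stdint.h>

#define PAGE_4K 4096u

/*
 * Return 1 if a buffer of buf_len bytes at addr reaches into a 4KB page
 * that the first mapped_len bytes do not touch, 0 otherwise.
 * (Illustrative helper; names are assumptions.)
 */
int tail_leaves_mapped_pages(uint64_t addr, uint32_t mapped_len,
			     uint32_t buf_len)
{
	uint64_t last_mapped_page, last_buf_page;

	if (mapped_len == 0 || buf_len <= mapped_len)
		return 0;

	last_mapped_page = (addr + mapped_len - 1) / PAGE_4K;
	last_buf_page = (addr + buf_len - 1) / PAGE_4K;

	return last_buf_page > last_mapped_page;
}
```

For example, with a 4096 byte buffer of which only 4032 bytes are mapped, a buffer starting at 0x1000 stays within one page, while one starting at 0x1040 has its last 64 bytes in the next, unmapped page.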

Also updating the driver version and date

Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agoMerge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wirel...
David S. Miller [Wed, 3 Sep 2008 21:32:33 +0000 (14:32 -0700)]
Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6

16 years agoath9k: Fix TX control flag use for no ACK and RTS/CTS
Jouni Malinen [Mon, 11 Aug 2008 11:01:51 +0000 (14:01 +0300)]
ath9k: Fix TX control flag use for no ACK and RTS/CTS

Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
16 years agoath9k: Fix TX status reporting
Jouni Malinen [Mon, 11 Aug 2008 11:01:49 +0000 (14:01 +0300)]
ath9k: Fix TX status reporting

Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
16 years agoiwlwifi: fix STATUS_EXIT_PENDING is not set on pci_remove
Gregory Greenman [Wed, 3 Sep 2008 03:18:50 +0000 (11:18 +0800)]
iwlwifi: fix STATUS_EXIT_PENDING is not set on pci_remove

This patch sets STATUS_EXIT_PENDING on pci_remove. Otherwise
iwl4965_down may fail to uninitialize the driver.

Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Signed-off-by: Mohamed Abbas <mohamed.abbas@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
16 years agoiwlwifi: call apm stop on exit
Gregory Greenman [Wed, 3 Sep 2008 03:18:49 +0000 (11:18 +0800)]
iwlwifi: call apm stop on exit

This patch calls apm stop on exit and suspend. Without this patch
hardware consumes power even after driver is removed or suspended.

Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Signed-off-by: Mohamed Abbas <mohamed.abbas@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
16 years agoiwlwifi: fix Tx cmd memory allocation failure handling
Tomas Winkler [Wed, 3 Sep 2008 03:18:48 +0000 (11:18 +0800)]
iwlwifi: fix Tx cmd memory allocation failure handling

The patch "iwlwifi: do not use GFP_DMA in iwl_tx_queue_init" removed
GFP_DMA from the allocation of tx command buffers. GFP_DMA allows
allocation only from memory under 16M, which causes allocation problems
in suspend/resume flows.

Using kmalloc is a temporary solution; some consistent/coherent
allocation scheme would be more correct. Since iwlwifi hardware
supports 64bit addresses this solution should work on x86 (32 and
64bit) for now.

This patch fixes a memory freeing problem in the previous patch.

Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Ian Schram <ischram@telenet.be>
Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
16 years agoiwlwifi: fix rx_chain computation
Tomas Winkler [Wed, 3 Sep 2008 03:18:47 +0000 (11:18 +0800)]
iwlwifi: fix rx_chain computation

This patch fixes rx_chain computation. The code that adjusts the number
of rx chains to the number supported by HW was missing. Misconfiguration
causes a firmware error.  Note: iwlwifi supports HW with up to 3 RX
chains (2x2, 2x3, 1x2, and 3x3 MIMO). This patch also simplifies the
whole RX chain computation.

Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Mohamed Abbas <mohamed.abbas@intel.com>
Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
16 years agoiwlwifi: fix station mimo power save values
Ron Rindjunsky [Wed, 3 Sep 2008 03:18:46 +0000 (11:18 +0800)]
iwlwifi: fix station mimo power save values

This patch fixes the wrong use of MIMO power save values. Our TX was
configured with our own MIMO power save values instead of the peer's MIMO
power save values, which may affect connectivity. The peer STA/AP may not
sense our traffic at all as it doesn't have all RX chains opened.

Signed-off-by: Ron Rindjunsky <ron.rindjunsky@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
16 years agoiwlwifi: remove false rxon if rx chain changes
Mohamed Abbas [Wed, 3 Sep 2008 03:18:44 +0000 (11:18 +0800)]
iwlwifi: remove false rxon if rx chain changes

Rx chain might change during power save transitions but it doesn't
require sending a Full-RXON command to the firmware. Full-RXON requires
reconnection to an AP and thus affects user experience. The patch
avoids the Full-RXON by removing the rx_chain modification check in
the iwl_full_rxon_required function.

Signed-off-by: Mohamed Abbas <mohamed.abbas@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
16 years agoiwlwifi: fix hidden ssid discovery in passive channels
Ron Rindjunsky [Wed, 3 Sep 2008 03:18:43 +0000 (11:18 +0800)]
iwlwifi: fix hidden ssid discovery in passive channels

This enables sending of direct probes on passive channels, as long as
traffic was detected on that channel. This enables connectivity to APs
with hidden/non-broadcasting SSIDs on passive channels. Note that 5000 HW
declares the whole 5.2 spectrum as passive.

Signed-off-by: Cahill Ben <ben.m.cahill@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Ron Rindjunsky <ron.rindjunsky@intel.com>
Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
16 years agoiwlwifi: W/A for the TSF correction in IBSS
Assaf Krauss [Wed, 3 Sep 2008 03:18:42 +0000 (11:18 +0800)]
iwlwifi: W/A for the TSF correction in IBSS

This patch is a W/A for the TSF sync issue in IBSS merging. HW is not
capable of syncing TSF (it's constantly a little behind). This creates
constant IBSS merging upon reception of each beacon, adding and removing
the station, which in turn creates above 50% packet loss and thus
dramatically degrades the throughput. The W/A simply stops the driver from
declaring it has a reliable TSF value and thus eliminates IBSS merging.

Signed-off-by: Assaf Krauss <assaf.krauss@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
16 years agoSplit up PIT part of TSC calibration from native_calibrate_tsc
Linus Torvalds [Wed, 3 Sep 2008 14:30:13 +0000 (07:30 -0700)]
Split up PIT part of TSC calibration from native_calibrate_tsc

The TSC calibration function is still very complicated, but this makes
it at least a little bit less so by moving the PIT part out into a
helper function of its own.

Tested-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agonetxen: Remove workaround for chipset quirk
Dhananjay Phadke [Thu, 28 Aug 2008 04:57:30 +0000 (21:57 -0700)]
netxen: Remove workaround for chipset quirk

Remove chipset-specific quirk workaround; the workaround caused
unrecoverable DMA lockups when the driver was loaded following a
PXE boot.

Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com>
Signed-off-by: Michael Brown <mbrown@fensystems.co.uk>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
16 years agopcnet-cs, axnet_cs: add new IDs, remove dup ID with less info
Komuro [Sat, 30 Aug 2008 03:13:33 +0000 (12:13 +0900)]
pcnet-cs, axnet_cs: add new IDs, remove dup ID with less info

pcnet_cs:
    add new ID: "corega Ether PCC-TD".
    remove duplicate ID: "IC-CARD".

axnet_cs:
    add new ID: "IO DATA ETXPCM".

Signed-off-by: Komuro <komurojun-mbn@nifty.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
16 years agoixgbe: initialize interrupt throttle rate
Andy Gospodarek [Thu, 28 Aug 2008 01:04:32 +0000 (18:04 -0700)]
ixgbe: initialize interrupt throttle rate

The following commit dropped the setting of the default interrupt throttle rate:

commit 021230d40ae0e6508d6c717b6e0d6d81cd77ac25
Author: Ayyappan Veeraiyan <ayyappan.veeraiyan@intel.com>
Date:   Mon Mar 3 15:03:45 2008 -0800

    ixgbe: Introduce MSI-X queue vector code

The following patch adds it back.  Without this the default value of 0
causes the performance of this card to be awful.  Restoring these to the
default values yields much better performance.

This regression has been around since 2.6.25.

Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
CC: stable@kernel.org [2.6.25 and later]
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
16 years agonet/usb/pegasus: avoid hundreds of diagnostics
David Brownell [Tue, 2 Sep 2008 18:34:24 +0000 (11:34 -0700)]
net/usb/pegasus: avoid hundreds of diagnostics

Make the "pegasus" driver scream less loudly in the face of
problems as it initializes, avoiding hundreds of messages:

 - ratelimit some key error messages
 - avoid some spurious diagnostics caused by strange codeflow

And fix one instance of goofy indentation.

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
16 years agopowerpc: Fix for getting CPU number in power_save_ppc32_restore()
Kumar Gala [Tue, 26 Aug 2008 02:08:56 +0000 (12:08 +1000)]
powerpc: Fix for getting CPU number in power_save_ppc32_restore()

The calculation to get TI_CPU based off of SPRG3 was just plain wrong,
meaning that we were getting garbage for the CPU number on 6xx/G3/G4
based SMP boxes in this code.

Just offset off the stack pointer (to get to thread_info) like all the
other references to TI_CPU do.

This was pointed out by Chen Gong <G.Chen@freescale.com>

[paulus@samba.org - use rlwinm r12,r11,... instead of
 rlwinm r12,r1,...; tophys()]

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
16 years agopowerpc: Fix build error with 64K pages and !hugetlbfs
Benjamin Herrenschmidt [Wed, 3 Sep 2008 03:12:05 +0000 (13:12 +1000)]
powerpc: Fix build error with 64K pages and !hugetlbfs

HAVE_ARCH_UNMAPPED_AREA and HAVE_ARCH_UNMAPPED_AREA_TOPDOWN must
be defined whenever CONFIG_PPC_MM_SLICES is enabled, not just when
CONFIG_HUGETLB_PAGE is.  They used to be always defined together but
this is no longer the case since 3a8247cc2c856930f34eafce33f6a039227ee175
("powerpc: Only demote individual slices rather than whole process").

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
16 years agopowerpc: Work around gcc's -fno-omit-frame-pointer bug
Tony Breeds [Tue, 2 Sep 2008 06:50:38 +0000 (16:50 +1000)]
powerpc: Work around gcc's -fno-omit-frame-pointer bug

This bug is causing random crashes
(http://bugzilla.kernel.org/show_bug.cgi?id=11414).

-fno-omit-frame-pointer is only needed on powerpc when -pg is also
supplied, and there is a gcc bug that causes incorrect code generation
on 32-bit powerpc when -fno-omit-frame-pointer is used---it uses stack
locations below the stack pointer, which is not allowed by the ABI
because those locations can and sometimes do get corrupted by an
interrupt.

This ensures that CONFIG_FRAME_POINTER is only selected by ftrace.
When CONFIG_FTRACE is enabled we also pass -mno-sched-epilog to work
around the gcc codegen bug.

Patch based on work by:
Andreas Schwab <schwab@suse.de>
Segher Boessenkool <segher@kernel.crashing.org>

Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
16 years agopowerpc: Make sure _etext is after all kernel text
Stephen Rothwell [Tue, 2 Sep 2008 05:04:09 +0000 (15:04 +1000)]
powerpc: Make sure _etext is after all kernel text

This makes core_kernel_text() (and therefore kernel_text_address())
return the correct result.  Currently all the __devinit routines (at
least) will not be considered to be kernel text.
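
A minimal sketch of why the position of _etext matters, assuming a
simplified check that only tests one address range.  The struct, the
addresses used with it and the helper name in_core_text() are
hypothetical; the real core_kernel_text() also covers init text.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical section bounds; the real ones come from the linker script. */
struct text_bounds {
	uintptr_t stext;	/* start of kernel text */
	uintptr_t etext;	/* end of kernel text (exclusive in this sketch) */
};

/* Simplified stand-in for core_kernel_text(): a pure range check. */
static bool in_core_text(const struct text_bounds *b, uintptr_t addr)
{
	return addr >= b->stext && addr < b->etext;
}
```

Any routine the linker places after the end marker fails this check, which
is why the marker has to come after all text sections.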

This is just a quick fix for 2.6.27 - hopefully we will be able to fix
this better in 2.6.28.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
16 years agopowerpc: Only make kernel text pages of linear mapping executable
Paul Mackerras [Sat, 30 Aug 2008 01:26:27 +0000 (11:26 +1000)]
powerpc: Only make kernel text pages of linear mapping executable

Commit bc033b63bbfeb6c4b4eb0a1d083c650e4a0d2af8 ("powerpc/mm: Fix
attribute confusion with htab_bolt_mapping()") moved the check for
whether we should make pages of the linear mapping executable from
htab_bolt_mapping into its callers, including htab_initialize.
A side-effect of this is that the decision is now made once for
each contiguous section in the LMB array rather than for each page
individually.  This can often mean that the whole of the linear
mapping ends up being executable.

This reverts to the previous behaviour, where individual pages are
checked for being part of the kernel text or not, by moving the check
back down into htab_bolt_mapping.
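
The difference between the two policies can be sketched as follows.  This
is an illustration only: the function names and the flat address ranges
are assumptions, not the htab_bolt_mapping() code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if [start, end) intersects the kernel text range [stext, etext). */
static bool overlaps_text(uintptr_t start, uintptr_t end,
			  uintptr_t stext, uintptr_t etext)
{
	return start < etext && end > stext;
}

/* Per-page policy: count the pages that would be mapped executable. */
static size_t exec_pages_per_page(uintptr_t base, size_t len, size_t page,
				  uintptr_t stext, uintptr_t etext)
{
	size_t n = 0;

	for (uintptr_t va = base; va < base + len; va += page)
		if (overlaps_text(va, va + page, stext, etext))
			n++;
	return n;
}

/* Per-section policy: one decision covers every page of the section. */
static size_t exec_pages_per_section(uintptr_t base, size_t len, size_t page,
				     uintptr_t stext, uintptr_t etext)
{
	return overlaps_text(base, base + len, stext, etext) ? len / page : 0;
}
```

For a 16-page section whose text occupies two pages, the per-page policy
marks two pages executable while the per-section policy marks all 16.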

Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
16 years agopowerpc: Fix uninitialised variable in VSX alignment code
Michael Neuling [Thu, 28 Aug 2008 04:57:39 +0000 (14:57 +1000)]
powerpc: Fix uninitialised variable in VSX alignment code

This fixes an uninitialised variable in the VSX alignment code.  It can
cause warnings from GCC (noticed with gcc-4.1.1).  Gcc is actually
correct in this instance, and this bug could cause the alignment
interrupt handler to send a SIGSEGV to the process on a legitimate
access.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
16 years agotipc: Don't use structure names which easily globally conflict.
David S. Miller [Wed, 3 Sep 2008 06:38:32 +0000 (23:38 -0700)]
tipc: Don't use structure names which easily globally conflict.

Andrew Morton reported a build failure on sparc32, because TIPC
uses names like "struct node" and there is a like named data
structure defined in linux/node.h

This just regexp replaces "struct node*" to "struct tipc_node*"
to avoid this and any future similar problems.

Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
Linus Torvalds [Wed, 3 Sep 2008 04:02:14 +0000 (21:02 -0700)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  ipsec: Fix deadlock in xfrm_state management.
  ipv: Re-enable IP when MTU > 68
  net/xfrm: Use an IS_ERR test rather than a NULL test
  ath9: Fix ath_rx_flush_tid() for IRQs disabled kernel warning message.
  ath9k: Incorrect key used when group and pairwise ciphers are different.
  rt2x00: Compiler warning unmasked by fix of BUILD_BUG_ON
  mac80211: Fix debugfs union misuse and pointer corruption
  wireless/libertas/if_cs.c: fix memory leaks
  orinoco: Multicast to the specified addresses
  iwlwifi: fix 64bit platform firmware loading
  iwlwifi: fix apm_stop (wrong bit polarity for FLAG_INIT_DONE)
  iwlwifi: workaround interrupt handling no some platforms
  iwlwifi: do not use GFP_DMA in iwl_tx_queue_init
  net/wireless/Kconfig: clarify the description for CONFIG_WIRELESS_EXT_SYSFS
  net: Unbreak userspace usage of linux/mroute.h
  pkt_sched: Fix locking of qdisc_root with qdisc_root_sleeping_lock()
  ipv6: When we droped a packet, we should return NET_RX_DROP instead of 0

16 years ago[x86] Fix TSC calibration issues
Thomas Gleixner [Tue, 2 Sep 2008 22:54:47 +0000 (00:54 +0200)]
[x86] Fix TSC calibration issues

Larry Finger reported at http://lkml.org/lkml/2008/9/1/90:
An ancient laptop of mine started throwing errors from b43legacy when
I started using 2.6.27 on it. This has been bisected to commit bfc0f59
"x86: merge tsc calibration".

The unification of the TSC code adopted mostly the 64bit code, which
prefers PMTIMER/HPET over the PIT calibration.

Larry's system has an AMD K6 CPU. Such systems are known to have
PMTIMER incarnations which run at double speed. This results in a
miscalibration of the TSC by factor 0.5. So the resulting calibrated
CPU/TSC speed is half of the real CPU speed, which means that the TSC
based delay loop will run half the time it should run. That might
explain why the b43legacy driver went berserk.

On the other hand we know about systems, where the PIT based
calibration results in random crap due to heavy SMI/SMM
disturbance. On those systems the PMTIMER/HPET based calibration logic
with SMI detection shows better results.

According to Alok also virtualized systems suffer from the PIT
calibration method.

The solution is to use a more wreckage aware approach than the current
either/or decision.

1) reimplement the retry loop which was dropped from the 32bit code
during the merge. It repeats the calibration and selects the lowest
frequency value as this is probably the closest estimate to the real
frequency

2) Monitor the delta of the TSC values in the delay loop which waits
for the PIT counter to reach zero. If the maximum value is
significantly different from the minimum, then we have a pretty safe
indicator that the loop was disturbed by an SMI.

3) keep the pmtimer/hpet reference as a backup solution for systems
where the SMI disturbance is a permanent point of failure for PIT
based calibration

4) do the loop iteration for both methods, record the lowest value and
decide after all iterations finished.

5) Set a clear preference to PIT based calibration when the result
makes sense.

The implementation does the reference calibration based on
HPET/PMTIMER around the delay, which is necessary for the PIT anyway,
but keeps separate TSC values to ensure the "independency" of the
resulting calibration values.
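
The selection logic of steps 1) to 5) can be sketched roughly as below.
This is not the code from the patch: the helper names, the CAL_FAILED
marker and the factor 10 in the disturbance check are assumptions made for
illustration, and "makes sense" is reduced to "did not fail".

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define CAL_FAILED ULONG_MAX	/* marker for a sample that must be ignored */

/* Step 2: a PIT wait loop whose slowest iteration dwarfs its fastest one
 * was probably interrupted by an SMI.  The factor 10 is an assumption. */
static bool pit_loop_disturbed(unsigned long delta_min, unsigned long delta_max)
{
	return delta_max > 10 * delta_min;
}

/* Steps 1 and 4: lowest usable value across all iterations. */
static unsigned long lowest_khz(const unsigned long *khz, size_t n)
{
	unsigned long min = CAL_FAILED;

	for (size_t i = 0; i < n; i++)
		if (khz[i] < min)
			min = khz[i];
	return min;
}

/* Steps 3 and 5: prefer PIT, fall back to the HPET/PMTIMER reference. */
static unsigned long choose_tsc_khz(const unsigned long *pit,
				    const unsigned long *ref, size_t n)
{
	unsigned long pit_min = lowest_khz(pit, n);
	unsigned long ref_min = lowest_khz(ref, n);

	if (pit_min != CAL_FAILED)
		return pit_min;
	return ref_min;	/* may itself be CAL_FAILED if both methods failed */
}
```

With a double speed PMTIMER the reference samples come out at about half
the PIT samples, and the PIT preference picks the correct value.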

Tested on various 32bit/64bit machines including Geode 266Mhz, AMD K6
(affected machine with a double speed pmtimer which I grabbed out of
the dump), Pentium class machines and AMD/Intel 64 bit boxen.

Bisected-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoipsec: Fix deadlock in xfrm_state management.
David S. Miller [Wed, 3 Sep 2008 03:14:15 +0000 (20:14 -0700)]
ipsec: Fix deadlock in xfrm_state management.

Ever since commit 4c563f7669c10a12354b72b518c2287ffc6ebfb3
("[XFRM]: Speed up xfrm_policy and xfrm_state walking") it is
illegal to call __xfrm_state_destroy (and thus xfrm_state_put())
with xfrm_state_lock held.  If we do, we'll deadlock since we
have the lock already and __xfrm_state_destroy() tries to take
it again.

Fix this by pushing the xfrm_state_put() calls after the lock
is dropped.
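
The pattern is roughly the following.  This is a user-space toy, not the
xfrm code: the lock is a plain flag that asserts on double acquisition,
and every name in it is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy non-recursive lock: taking it twice is the deadlock being modelled. */
static bool state_lock_held;

static void state_lock(void)   { assert(!state_lock_held); state_lock_held = true; }
static void state_unlock(void) { state_lock_held = false; }

struct state {
	int refcnt;
	bool destroyed;
	struct state *next;
};

/* Dropping the last reference destroys the state, which needs the lock. */
static void state_put(struct state *x)
{
	if (--x->refcnt == 0) {
		state_lock();		/* would deadlock if the caller held it */
		x->destroyed = true;
		state_unlock();
	}
}

/* Unlink everything under the lock, drop the references after unlocking. */
static void flush_states(struct state **list)
{
	struct state *doomed;

	state_lock();
	doomed = *list;			/* detach the whole list while locked */
	*list = NULL;
	state_unlock();

	while (doomed) {
		struct state *next = doomed->next;

		state_put(doomed);	/* safe: the lock is no longer held */
		doomed = next;
	}
}
```

Calling state_put() between state_lock() and state_unlock() would trip the
assertion in this toy, which stands in for the real deadlock.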

Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agodrivers/char/random.c: fix a race which can lead to a bogus BUG()
Andrew Morton [Tue, 2 Sep 2008 21:36:14 +0000 (14:36 -0700)]
drivers/char/random.c: fix a race which can lead to a bogus BUG()

Fix a bug reported by and diagnosed by Aaron Straus.

This is a regression introduced into 2.6.26 by

    commit adc782dae6c4c0f6fb679a48a544cfbcd79ae3dc
    Author: Matt Mackall <mpm@selenic.com>
    Date:   Tue Apr 29 01:03:07 2008 -0700

        random: simplify and rename credit_entropy_store

credit_entropy_bits() does:

	spin_lock_irqsave(&r->lock, flags);
	...
	if (r->entropy_count > r->poolinfo->POOLBITS)
		r->entropy_count = r->poolinfo->POOLBITS;

so there is a time window in which this BUG_ON():

static size_t account(struct entropy_store *r, size_t nbytes, int min,
      int reserved)
{
	unsigned long flags;

	BUG_ON(r->entropy_count > r->poolinfo->POOLBITS);

	/* Hold lock while accounting */
	spin_lock_irqsave(&r->lock, flags);

can trigger.

We could fix this by moving the assertion inside the lock, but it seems
safer and saner to revert to the old behaviour wherein
entropy_store.entropy_count at no time exceeds
entropy_store.poolinfo->POOLBITS.
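
A minimal sketch of that behaviour, assuming a trimmed-down pool
structure.  The struct and function names are hypothetical and the locking
is left out; the point is only that the shared field never holds an
out-of-range value, not even briefly.

```c
/* Hypothetical, trimmed-down pool; the real struct entropy_store has more fields. */
struct pool {
	int entropy_count;
	int poolbits;
};

/* Called with the pool lock held in the real code; locking is omitted here. */
static void credit_bits(struct pool *r, int nbits)
{
	int count = r->entropy_count + nbits;	/* work on a local copy */

	if (count < 0)
		count = 0;
	else if (count > r->poolbits)
		count = r->poolbits;

	r->entropy_count = count;	/* only a clamped value is ever stored */
}
```

A reader that checks the field without taking the lock can then never
observe a count above the pool size.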

Reported-by: Aaron Straus <aaron@merfinllc.com>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: <stable@kernel.org> [2.6.26.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agopm_qos_requirement might sleep
John Kacur [Tue, 2 Sep 2008 21:36:13 +0000 (14:36 -0700)]
pm_qos_requirement might sleep

Make PM_QOS and CPU_IDLE play nicer when run with the RT-Preempt kernel.

The purpose of the patch is to remove the spin_lock around the read in the
function pm_qos_requirement - since spinlocks can sleep in -rt and this
function is called from idle.

CPU_IDLE polls the target_value's of some of the pm_qos parameters from
the idle loop causing sleeping locking warnings.  Changing the
target_value to an atomic avoids this issue.

Remove the spinlock in pm_qos_requirement by making target_value an atomic
type.
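
A user-space analogue of the idea, using C11 atomics instead of the
kernel's atomic_t.  The struct and function names are hypothetical.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for one pm_qos parameter. */
struct qos_param {
	atomic_int target_value;
};

/* Writer side: publish a freshly computed target. */
static void qos_set_target(struct qos_param *p, int value)
{
	atomic_store(&p->target_value, value);
}

/* Reader side: takes no lock, so it cannot sleep. */
static int qos_requirement(struct qos_param *p)
{
	return atomic_load(&p->target_value);
}
```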

Signed-off-by: mark gross <mgross@linux.intel.com>
Signed-off-by: John Kacur <jkacur@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agortc-cmos: wake again from S5
Rafael J. Wysocki [Tue, 2 Sep 2008 21:36:11 +0000 (14:36 -0700)]
rtc-cmos: wake again from S5

Update rtc-cmos shutdown handling to leave RTC alarms active, resolving
http://bugzilla.kernel.org/show_bug.cgi?id=11411 on several boards.  There
are still some systems where the ACPI event handling doesn't cooperate.
(Possibly related to bugid 11312, reporting the spontaneous disabling of
RTC events.)

Bug 11411 reported that changes to work around some ACPI event issues
broke wake-from-S5 handling, as used for DVR applications.  (They like to
power off, then wake later to record programs.)

[yakui.zhao@intel.com: add shutdown for PNP devices]
[dbrownell@users.sourceforge.net: update comments]
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Cc: Stefan Bauer <stefan.bauer@cs.tu-chemnitz.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agosysfs: document files in /sys/firmware/sgi_uv/
Russ Anderson [Tue, 2 Sep 2008 21:36:09 +0000 (14:36 -0700)]
sysfs: document files in /sys/firmware/sgi_uv/

Document files in /sys/firmware/sgi_uv/.

Signed-off-by: Russ Anderson <rja@sgi.com>
Cc: Jack Steiner <steiner@sgi.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoibft: fix target info parsing in ibft module
Mike Christie [Tue, 2 Sep 2008 21:36:07 +0000 (14:36 -0700)]
ibft: fix target info parsing in ibft module

I got this patch through Red Hat's bugzilla from the bug submitter and
patch creator.  I have just fixed it up so it applies without fuzz to
upstream kernels.

Original patch and description from Shyam kumar Iyer:

The issue [ibft module not displaying targets with short names] is because
of an offset calculation error in the iscsi_ibft.c code.  Due to this
error directory structure for the target in /sys/firmware/ibft does not
get created and so the initiator is unable to connect to the target.

Note that this bug surfaced only with a name that had a short section at
the end.  eg: "iqn.1984-05.com.dell:dell".  It did not surface when the
iqn's had a longer section at the end.  eg:
"iqn.2001-04.com.example:storage.disk2.sys1.xyz"

So, the eot_offset was calculated such that an extra 48 bytes i.e.  the
size of the ibft_header which has already been accounted was subtracted
twice.

This was not evident with longer iqn names because they would overshoot
the total ibft length more than 48 bytes and thus would escape the bug.

Signed-off-by: Shyam Kumar Iyer <shyam_iyer@dell.com>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Cc: Konrad Rzeszutek <konrad@virtualiron.com>
Cc: Peter Jones <pjones@redhat.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agortc_time_to_tm: fix signed/unsigned arithmetic
Jan Altenberg [Tue, 2 Sep 2008 21:36:05 +0000 (14:36 -0700)]
rtc_time_to_tm: fix signed/unsigned arithmetic

commit 945185a69daa457c4c5e46e47f4afad7dcea734f ("rtc: rtc_time_to_tm: use
unsigned arithmetic") changed some types in rtc_time_to_tm() to
unsigned:

 void rtc_time_to_tm(unsigned long time, struct rtc_time *tm)
 {
-       register int days, month, year;
+       unsigned int days, month, year;

This doesn't work for all cases, because days is checked for < 0 later
on:

	if (days < 0) {
		year -= 1;
		days += 365 + LEAP_YEAR(year);
	}

I think the correct fix would be to keep days signed and do an appropriate
cast later on.
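
A stand-alone sketch of the year/day split with a signed day counter.  It
follows the shape of the snippet above, but the function names are
hypothetical and it is not the kernel function.

```c
/* Leap years from year 1 through y inclusive (Gregorian rules). */
static int leaps_thru_end_of(int y)
{
	return y / 4 - y / 100 + y / 400;
}

static int is_leap_year(int y)
{
	return (y % 4 == 0 && y % 100 != 0) || y % 400 == 0;
}

/* Split days since 1970-01-01 into a year and a zero-based day of year. */
static void split_days(unsigned long total_days, int *year_out, int *yday_out)
{
	int days = (int)total_days;	/* signed on purpose */
	int year = 1970 + days / 365;	/* first guess, can overshoot */

	days -= (year - 1970) * 365
		+ leaps_thru_end_of(year - 1)
		- leaps_thru_end_of(1970 - 1);
	if (days < 0) {			/* guess overshot: step back a year */
		year -= 1;
		days += 365 + is_leap_year(year);
	}
	*year_out = year;
	*yday_out = days;
}
```

Day 1095 (1972-12-31) exercises the correction branch: the first guess is
1973 and days becomes -1.  With an unsigned days the comparison would
never be true and the result would be garbage.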

Signed-off-by: Jan Altenberg <jan.altenberg@linutronix.de>
Cc: Maciej W. Rozycki <macro@linux-mips.org>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: David Brownell <david-b@pacbell.net>
Cc: Dmitri Vorobiev <dmitri.vorobiev@gmail.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agotdfxfb: fix frame buffer name overrun
Krzysztof Helt [Tue, 2 Sep 2008 21:36:04 +0000 (14:36 -0700)]
tdfxfb: fix frame buffer name overrun

If there is more than one graphics card handled by the tdfxfb driver, the
name of the frame buffer overruns the reserved size.

Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agotdfxfb: fix SDRAM memory size detection
Krzysztof Helt [Tue, 2 Sep 2008 21:36:03 +0000 (14:36 -0700)]
tdfxfb: fix SDRAM memory size detection

Fix memory detection on Voodoo3 cards with SDRAM memory.

Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agohp-wmi: add proper hotkey support
Matthew Garrett [Tue, 2 Sep 2008 21:36:03 +0000 (14:36 -0700)]
hp-wmi: add proper hotkey support

It turns out that event 0x4 merely indicates that a hotkey has been
pressed, not which one.  A further query is required in order to determine
the actual keypress.  The following patch adds support for that along with
the known keycodes.

Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agohp-wmi: update to match current rfkill semantics
Matthew Garrett [Tue, 2 Sep 2008 21:36:00 +0000 (14:36 -0700)]
hp-wmi: update to match current rfkill semantics

hp-wmi currently changes the RFKill state by altering the struct members
rather than using the dedicated interface, meaning that update events
won't be pushed to userspace.  This patch fixes that, along with fixing
the declared type of the WWAN kill switch.  It also ensures that rfkill
interfaces are only registered for hardware that exists.

Signed-off-by: Matthew Garrett <mjg@redhat.com>
Acked-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Cc: Ivo van Doorn <ivdoorn@gmail.com>
Cc: Dave Young <hidave.darkstar@gmail.com>
Cc: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoipc: document the new auto_msgmni proc file
Nadia Derbey [Tue, 2 Sep 2008 21:35:59 +0000 (14:35 -0700)]
ipc: document the new auto_msgmni proc file

Update Documentation/filesystems/proc.txt: it describes the file
auto_msgmni introduced to enable/disable msgmni automatic recomputing upon
memory add/remove (see thread http://lkml.org/lkml/2008/7/4/27).  Also
added a description for msgmni (this file is only listed in
Documentation/sysctl/kernel.txt).

Signed-off-by: Nadia Derbey <Nadia.Derbey@bull.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agomm: size of quicklists shouldn't be proportional to the number of CPUs
KOSAKI Motohiro [Tue, 2 Sep 2008 21:35:58 +0000 (14:35 -0700)]
mm: size of quicklists shouldn't be proportional to the number of CPUs

Quicklists store pages for each CPU as caches.  (Each CPU can cache
node_free_pages/16 pages)

It is used for page table cache.  exit() will increase the cache size,
while fork() consumes it.

So for example if an apache-style application runs (one-parent,
many-children model), one CPU will fork() while other CPUs process
the middleware work and exit().

At that time, the CPU on which the parent runs doesn't have any page table
cache at all, while the others (on which the children run) have maximum caches.

QList_max = (#ofCPUs - 1) x Free / 16
=> QList_max / (Free + QList_max) = (#ofCPUs - 1) / (16 + #ofCPUs - 1)

So, how much quicklist memory is used in the maximum case?

This is proportional to the number of CPUs because the limit of the per-CPU
quicklist cache doesn't take the number of CPUs into account.

The above calculation means

 Number of CPUs per node            2    4    8   16
 ==============================  ====================
 QList_max / (Free + QList_max)   5.8%  16%  30%  48%

Wow! Quicklists can consume about 50% of memory in the worst case.
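
The ratio follows by substituting QList_max = (n - 1) x Free / 16 into
QList_max / (Free + QList_max) and cancelling Free, where n is the number
of CPUs per node.  A small helper (hypothetical name) reproduces the
percentages in the table:

```c
/* Worst-case share of (Free + QList_max) held in quicklists, in percent,
 * for a node with ncpus CPUs: (ncpus - 1) / (16 + ncpus - 1). */
static double quicklist_worst_case_percent(int ncpus)
{
	return 100.0 * (ncpus - 1) / (16 + ncpus - 1);
}
```

For 2, 4, 8 and 16 CPUs per node this gives 1/17, 3/19, 7/23 and 15/31.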

My demonstration program is here
--------------------------------------------------------------------------------
#define _GNU_SOURCE

#include <stdio.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sched.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/wait.h>

#define BUFFSIZE 512

int max_cpu(void) /* get max number of logical cpus from /proc/cpuinfo */
{
  FILE *fd;
  char *ret, buffer[BUFFSIZE];
  int cpu = 1;

  fd = fopen("/proc/cpuinfo", "r");
  if (fd == NULL) {
    perror("fopen(/proc/cpuinfo)");
    exit(EXIT_FAILURE);
  }
  while (1) {
    ret = fgets(buffer, BUFFSIZE, fd);
    if (ret == NULL)
      break;
    if (!strncmp(buffer, "processor", 9))
      cpu = atoi(strchr(buffer, ':') + 2);
  }
  fclose(fd);
  return cpu;
}

void cpu_bind(int cpu) /* bind current process to one cpu */
{
  cpu_set_t mask;
  int ret;

  CPU_ZERO(&mask);
  CPU_SET(cpu, &mask);
  ret = sched_setaffinity(0, sizeof(mask), &mask);
  if (ret == -1) {
    perror("sched_setaffinity()");
    exit(EXIT_FAILURE);
  }
  sched_yield(); /* not necessary */
}

#define MMAP_SIZE (10 * 1024 * 1024) /* 10 MB */
#define FORK_INTERVAL 1 /* 1 second */

int main(int argc, char *argv[])
{
  int cpu_max, nextcpu;
  long pagesize;
  pid_t pid;

  /* set max number of logical cpu */
  if (argc > 1)
    cpu_max = atoi(argv[1]) - 1;
  else
    cpu_max = max_cpu();

  /* get the page size */
  pagesize = sysconf(_SC_PAGESIZE);
  if (pagesize == -1) {
    perror("sysconf(_SC_PAGESIZE)");
    exit(EXIT_FAILURE);
  }

  /* prepare parent process */
  cpu_bind(0);
  nextcpu = cpu_max;

loop:

  /* select destination cpu for child process by round-robin rule */
  if (++nextcpu > cpu_max)
    nextcpu = 1;

  pid = fork();

  if (pid == 0) { /* child action */

    char *p;
    int i;

    /* consume page tables */
    p = mmap(0, MMAP_SIZE, PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, 0, 0);
    i = MMAP_SIZE / pagesize;
    while (i-- > 0) {
      *p = 1;
      p += pagesize;
    }

    /* move to other cpu */
    cpu_bind(nextcpu);
/*
    printf("a child moved to cpu%d after mmap().\n", nextcpu);
    fflush(stdout);
 */

    /* return page tables to pgtable_quicklist */
    exit(0);

  } else if (pid > 0) { /* parent action */

    sleep(FORK_INTERVAL);
    waitpid(pid, NULL, WNOHANG);

  }

  goto loop;
}
----------------------------------------

When the above program, which does task migration, runs, my 8GB box spends
800MB of memory on quicklists.  This is not a memory leak, but it doesn't
seem good.

% cat /proc/meminfo

MemTotal:        7701568 kB
MemFree:         4724672 kB
(snip)
Quicklists:       844800 kB

because

- My machine spec is
        number of numa node: 2
        number of cpus:      8 (4CPU x2 node)
        total mem:           8GB (4GB x2 node)
        free mem:            about 5GB

- Then, 4.7GB x 16% ~= 880MB.
  So, Quicklist can use 800MB.

So, if a machine with the following spec runs that program

   CPUs: 64 (8cpu x 8node)
   Mem:  1TB (128GB x8node)

Then, quicklist can waste 300GB (= 1TB x 30%).  It is too large.

So, I don't like cache policies which are proportional to # of cpus.

My patch changes the number of caches
from:
   per-cpu-cache-amount = memory_on_node / 16
to
   per-cpu-cache-amount = memory_on_node / 16 / number_of_cpus_on_node.
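
The two limits can be written as a pair of helpers.  This is only a sketch
of the arithmetic; the function and parameter names are hypothetical and
do not come from the patch.

```c
/* Old limit: every CPU may cache 1/16 of the node's memory. */
static unsigned long limit_old(unsigned long memory_on_node)
{
	return memory_on_node / 16;
}

/* New limit: the 1/16 share is split across the CPUs on the node. */
static unsigned long limit_new(unsigned long memory_on_node,
			       unsigned int cpus_on_node)
{
	return memory_on_node / 16 / cpus_on_node;
}
```

With the new limit, the sum over all CPUs on a node stays at or below 1/16
of the node's memory no matter how many CPUs the node has.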

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Keiichiro Tokunaga <tokunaga.keiich@jp.fujitsu.com>
Acked-by: Christoph Lameter <cl@linux-foundation.org>
Tested-by: David Miller <davem@davemloft.net>
Acked-by: Mike Travis <travis@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agomm: show quicklist usage in /proc/meminfo
KOSAKI Motohiro [Tue, 2 Sep 2008 21:35:53 +0000 (14:35 -0700)]
mm: show quicklist usage in /proc/meminfo

Quicklists can consume several GB of memory.  We should provide a means of
monitoring this.

After this patch is applied, /proc/meminfo will output the following:

% cat /proc/meminfo

MemTotal:      7715392 kB
MemFree:       5401600 kB
Buffers:         80384 kB
Cached:         300800 kB
SwapCached:          0 kB
Active:         235584 kB
Inactive:       262656 kB
SwapTotal:     2031488 kB
SwapFree:      2031488 kB
Dirty:            3520 kB
Writeback:           0 kB
AnonPages:      117696 kB
Mapped:          38528 kB
Slab:          1589952 kB
SReclaimable:    23104 kB
SUnreclaim:    1566848 kB
PageTables:      14656 kB
NFS_Unstable:        0 kB
Bounce:              0 kB
WritebackTmp:        0 kB
CommitLimit:   5889152 kB
Committed_AS:   393152 kB
VmallocTotal: 17592177655808 kB
VmallocUsed:     29056 kB
VmallocChunk: 17592177626432 kB
Quicklists:     130944 kB
HugePages_Total:     0
HugePages_Free:      0
HugePages_Rsvd:      0
HugePages_Surp:      0
Hugepagesize:    262144 kB
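
A monitoring tool could pick the new field out of that output along these
lines.  A minimal sketch, assuming the text has already been read into a
string; the helper name is hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Look up "key:" in meminfo-style text and store its value (in kB).
 * Returns 1 on success, 0 if the key is missing or malformed. */
static int meminfo_lookup_kb(const char *text, const char *key,
			     unsigned long *kb)
{
	size_t klen = strlen(key);
	const char *line = text;

	while (line && *line) {
		if (!strncmp(line, key, klen) && line[klen] == ':')
			return sscanf(line + klen + 1, "%lu", kb) == 1;
		line = strchr(line, '\n');
		if (line)
			line++;		/* skip past the newline */
	}
	return 0;
}
```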

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Keiichiro Tokunaga <tokunaga.keiich@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>