openwrt/staging/blogic.git
Thomas Gleixner [Sat, 7 Nov 2009 22:04:15 +0000 (23:04 +0100)]
locking: Split rwlock from spinlock headers

Move the rwlock defines and inlines into separate header files. This
makes the selection for -rt easier.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
Thomas Gleixner [Mon, 9 Nov 2009 20:01:59 +0000 (21:01 +0100)]
locking: Reorder functions in spinlock.c

Separate the spin_lock and rw_lock functions. Preempt-RT needs to exclude
the rw_lock functions from being compiled. The reordering allows that to
be done with a single #ifdef.
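A minimal sketch of the layout this enables, assuming toy lock types and a hypothetical DEMO_PREEMPT_RT macro (the real kernel symbols and config option differ). Because the rw_lock functions sit together at the end, one conditional covers all of them:

```c
#include <assert.h>

/* Toy lock types; illustrative only, not the kernel's definitions. */
struct demo_spinlock { int locked; };
struct demo_rwlock { int readers; };

/* --- spin_lock functions: always compiled --- */
void demo_spin_lock(struct demo_spinlock *l)
{
	assert(!l->locked);	/* single-threaded toy: no contention expected */
	l->locked = 1;
}

void demo_spin_unlock(struct demo_spinlock *l)
{
	l->locked = 0;
}

/*
 * --- rw_lock functions: grouped at the end of the file ---
 * Because they are contiguous, one #ifndef excludes all of them.
 * DEMO_PREEMPT_RT is a hypothetical macro standing in for the -rt switch.
 */
#ifndef DEMO_PREEMPT_RT
void demo_read_lock(struct demo_rwlock *l)
{
	l->readers++;
}

void demo_read_unlock(struct demo_rwlock *l)
{
	l->readers--;
}
#endif /* !DEMO_PREEMPT_RT */
```

Building with -DDEMO_PREEMPT_RT would drop both rw_lock functions without touching the spin_lock part.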

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
Linus Torvalds [Mon, 14 Dec 2009 20:50:25 +0000 (12:50 -0800)]
Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-udf-2.6

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-udf-2.6:
  udf: Avoid IO in udf_clear_inode
  udf: Try harder when looking for VAT inode
  udf: Fix compilation with UDFFS_DEBUG enabled

Jan Kara [Thu, 3 Dec 2009 12:39:28 +0000 (13:39 +0100)]
udf: Avoid IO in udf_clear_inode

It is not very good to do IO in udf_clear_inode. First, VFS does not really
expect the inode to become dirty there and thus we have to write it ourselves.
Second, memory reclaim gets blocked waiting for IO when it does not really
expect it. Third, the IO pattern (e.g. on umount) resulting from writes in
udf_clear_inode is bad and it slows down writing a lot.

The reason why UDF needed to do IO in udf_clear_inode is that the UDF standard
mandates that extent length exactly match inode size. But when we allocate
extents to a file or directory, we don't really know what exactly the final
file size will be and thus temporarily set it to block boundary and later
truncate it to exact length in udf_clear_inode. Now, this is changed to
truncate to final file size in udf_release_file for regular files. For
directories and symlinks, we do the truncation at the moment when we learn
what the final file size will be.
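The arithmetic behind the mismatch can be sketched as follows. The helper names are hypothetical and this is not the UDF driver code; it only shows how an extent length rounded up to a block boundary differs from the exact inode size that the truncation has to restore:

```c
#include <assert.h>
#include <stdint.h>

/* Round a length up to the next multiple of block_size (the temporary extent length). */
uint64_t demo_round_up_to_block(uint64_t len, uint32_t block_size)
{
	assert(block_size > 0);
	return (len + block_size - 1) / block_size * block_size;
}

/* Bytes that must be trimmed so the extent length matches the inode size exactly. */
uint64_t demo_excess_bytes(uint64_t inode_size, uint32_t block_size)
{
	return demo_round_up_to_block(inode_size, block_size) - inode_size;
}
```

For example, a 5000-byte file on 2048-byte blocks temporarily gets a 6144-byte extent, so 1144 bytes have to be trimmed once the final size is known.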

Signed-off-by: Jan Kara <jack@suse.cz>
Jan Kara [Mon, 30 Nov 2009 18:47:55 +0000 (19:47 +0100)]
udf: Try harder when looking for VAT inode

Some disks do not contain the VAT inode in the last recorded block, as required
by the standard, but a few blocks earlier (or the number of recorded blocks
is wrong). So look for the VAT inode a bit before the end of the media.
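A minimal sketch of such a backward search, assuming a hypothetical window of three extra blocks and a caller-supplied predicate (the actual window size and the check used by the driver are not stated here):

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_VAT_SEARCH_WINDOW 3	/* hypothetical: how many earlier blocks to try */
#define DEMO_NOT_FOUND UINT32_MAX

/* Callback type: returns true if the given block holds a VAT inode. */
typedef bool (*demo_is_vat_fn)(uint32_t block);

/* Try the last recorded block first, then a few blocks before it. */
uint32_t demo_find_vat(uint32_t last_recorded, demo_is_vat_fn is_vat)
{
	uint32_t tried;

	for (tried = 0; tried <= DEMO_VAT_SEARCH_WINDOW; tried++) {
		if (tried > last_recorded)
			break;	/* do not wrap below block 0 */
		if (is_vat(last_recorded - tried))
			return last_recorded - tried;
	}
	return DEMO_NOT_FOUND;
}

/* Toy medium for the usage example: the VAT sits at block 98. */
bool demo_vat_at_98(uint32_t block)
{
	return block == 98;
}
```

With the toy predicate, demo_find_vat(100, demo_vat_at_98) finds block 98 even though the last recorded block is 100.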

Signed-off-by: Jan Kara <jack@suse.cz>
Jan Kara [Mon, 30 Nov 2009 18:47:10 +0000 (19:47 +0100)]
udf: Fix compilation with UDFFS_DEBUG enabled

Signed-off-by: Jan Kara <jack@suse.cz>
Linus Torvalds [Mon, 14 Dec 2009 20:36:46 +0000 (12:36 -0800)]
Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, mce: Clean up thermal init by introducing intel_thermal_supported()
  x86, mce: Thermal monitoring depends on APIC being enabled
  x86: Gart: fix breakage due to IOMMU initialization cleanup
  x86: Move swiotlb initialization before dma32_free_bootmem
  x86: Fix build warning in arch/x86/mm/mmio-mod.c
  x86: Remove usedac in feature-removal-schedule.txt
  x86: Fix duplicated UV BAU interrupt vector
  nvram: Fix write beyond end condition; prove to gcc copy is safe
  mm: Adjust do_pages_stat() so gcc can see copy_from_user() is safe
  x86: Limit the number of processor bootup messages
  x86: Remove enabling x2apic message for every CPU
  doc: Add documentation for bootloader_{type,version}
  x86, msr: Add support for non-contiguous cpumasks
  x86: Use find_e820() instead of hard coded trampoline address
  x86, AMD: Fix stale cpuid4_info shared_map data in shared_cpu_map cpumasks

Trivial percpu-naming-introduced conflicts in arch/x86/kernel/cpu/intel_cacheinfo.c

Linus Torvalds [Mon, 14 Dec 2009 20:33:02 +0000 (12:33 -0800)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/brodo/pcmcia-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/brodo/pcmcia-2.6:
  pcmcia: CodingStyle fixes
  pcmcia: remove unused IRQ_FIRST_SHARED

Linus Torvalds [Mon, 14 Dec 2009 18:22:11 +0000 (10:22 -0800)]
Merge branch 'next-spi' of git://git.secretlab.ca/git/linux-2.6

* 'next-spi' of git://git.secretlab.ca/git/linux-2.6: (23 commits)
  spi: fix probe/remove section markings
  Add OMAP spi100k driver
  spi-imx: don't access struct device directly but use dev_get_platdata
  spi-imx: Add mx25 support
  spi-imx: use positive logic to distinguish cpu variants
  spi-imx: correct check for platform_get_irq failing
  ARM: NUC900: Add spi driver support for nuc900
  spi: SuperH MSIOF SPI Master driver V2
  spi: fix spidev compilation failure when VERBOSE is defined
  spi/au1550_spi: fix setupxfer not to override cfg with zeros
  spi/mpc8xxx: don't use __exit_p to wrap plat_mpc8xxx_spi_remove
  spi/i.MX: fix broken error handling for gpio_request
  spi/i.mx: drain MXC SPI transfer buffer when probing device
  MAINTAINERS: add SPI co-maintainer.
  spi/xilinx_spi: fix incorrect casting
  spi/mpc52xx-spi: minor cleanups
  xilinx_spi: add a platform driver using the xilinx_spi common module.
  xilinx_spi: add support for the DS570 IP.
  xilinx_spi: Switch to iomem functions and support little endian.
  xilinx_spi: Split into of driver and generic part.
  ...

Linus Torvalds [Mon, 14 Dec 2009 18:13:22 +0000 (10:13 -0800)]
Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  perf sched: Fix build failure on sparc
  perf bench: Add "all" pseudo subsystem and "all" pseudo suite
  perf tools: Introduce perf_session class
  perf symbols: Ditch dso->find_symbol
  perf symbols: Allow lookups by symbol name too
  perf symbols: Add missing "Variables" entry to map_type__name
  perf symbols: Add support for 'variable' symtabs
  perf symbols: Introduce ELF counterparts to symbol_type__is_a
  perf symbols: Introduce symbol_type__is_a
  perf symbols: Rename kthreads to kmaps, using another abstraction for it
  perf tools: Allow building for ARM
  hw-breakpoints: Handle bad modify_user_hw_breakpoint off-case return value
  perf tools: Allow cross compiling
  tracing, slab: Fix no callsite ifndef CONFIG_KMEMTRACE
  tracing, slab: Define kmem_cache_alloc_notrace ifdef CONFIG_TRACING

Trivial conflict due to different fixes to modify_user_hw_breakpoint()
in include/linux/hw_breakpoint.h

David Howells [Mon, 14 Dec 2009 14:13:44 +0000 (14:13 +0000)]
PCI: Global variable decls must match the defs in section attributes

Global variable declarations must match the definitions in section attributes
as the compiler is at liberty to vary the method it uses to access a variable,
depending on the section it is in.

When building the FRV arch, I now see:

  drivers/built-in.o: In function `pci_apply_final_quirks':
  drivers/pci/quirks.c:2606: relocation truncated to fit: R_FRV_GPREL12 against symbol `pci_dfl_cache_line_size' defined in .devinit.data section in drivers/built-in.o
  drivers/pci/quirks.c:2623: relocation truncated to fit: R_FRV_GPREL12 against symbol `pci_dfl_cache_line_size' defined in .devinit.data section in drivers/built-in.o
  drivers/pci/quirks.c:2630: relocation truncated to fit: R_FRV_GPREL12 against symbol `pci_dfl_cache_line_size' defined in .devinit.data section in drivers/built-in.o

because the declaration of pci_dfl_cache_line_size in linux/pci.h does not
match the definition in drivers/pci/pci.c.
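A minimal sketch of a matching declaration and definition, assuming GCC or Clang on an ELF target. The section name ".demo.data" and the variable name are illustrative only, not the kernel's:

```c
/*
 * Header part (normally in a shared .h file). The declaration carries the
 * same section attribute as the definition, so the compiler can pick an
 * access method that is valid for that section.
 */
#define DEMO_SECTION __attribute__((section(".demo.data")))

extern unsigned char demo_cache_line_size DEMO_SECTION;

/* Definition part (normally in exactly one .c file). */
unsigned char demo_cache_line_size DEMO_SECTION = 16;

unsigned char demo_read_cache_line_size(void)
{
	return demo_cache_line_size;
}
```

Sharing one macro between the declaration and the definition makes it harder for the two to drift apart.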

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Howells [Mon, 14 Dec 2009 14:03:27 +0000 (14:03 +0000)]
FRV: Fix no-hardware-breakpoint case

If there is no hardware breakpoint support, modify_user_hw_breakpoint()
tries to return a NULL pointer as an 'int' return value:

  In file included from kernel/exit.c:53:
  include/linux/hw_breakpoint.h: In function 'modify_user_hw_breakpoint':
  include/linux/hw_breakpoint.h:96: warning: return makes integer from pointer without a cast

Return 0 instead.
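The shape of the fix can be sketched with a hypothetical stub (demo_modify_breakpoint and struct demo_event are made-up names, not the kernel's signature):

```c
struct demo_event;	/* opaque, hypothetical type */

/*
 * Stub used when the feature is compiled out. The return type is int, so
 * the "nothing to do" result is 0. Returning NULL here is what triggers
 * "return makes integer from pointer without a cast".
 */
static inline int demo_modify_breakpoint(struct demo_event *ev)
{
	(void)ev;	/* unused in the stub */
	return 0;
}
```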

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Mon, 14 Dec 2009 18:04:04 +0000 (10:04 -0800)]
Merge branch 'for-linus' of git://git.monstr.eu/linux-2.6-microblaze

* 'for-linus' of git://git.monstr.eu/linux-2.6-microblaze: (46 commits)
  microblaze: Remove rt_sigsuspend wrapper
  microblaze: nommu: Don't clobber R11 on syscalls
  microblaze: Remove show_tmem function
  microblaze: Support for WB cache
  microblaze: Add PVR for Microblaze v7.30.a
  microblaze: Remove ancient and fake microblaze version from cpu_ver table
  microblaze: Remove panic_timeout init value
  microblaze: Do not count system calls in default
  microblaze: Enable DTC compilation
  microblaze: Core oprofile configs and hooks
  microblaze: Fix level interrupt ACKing
  microblaze: Enable futimesat syscall
  microblaze: Checking DTS against PVR for write-back cache
  microblaze: Remove duplicity from pgalloc.h
  microblaze: Futex support
  microblaze: Adding dev_arch_data functions
  microblaze: Fix the heartbeat gpio to be more robust
  microblaze: Simple __copy_tofrom_user for noMMU
  microblaze: Export memory_start for modules
  microblaze: Use lowest-common-denominator default CPU settings
  ...

Linus Torvalds [Mon, 14 Dec 2009 18:03:36 +0000 (10:03 -0800)]
Merge branch 'for-linus' of git://neil.brown.name/md

* 'for-linus' of git://neil.brown.name/md: (27 commits)
  md: add 'recovery_start' per-device sysfs attribute
  md: rcu_read_lock() walk of mddev->disks in md_do_sync()
  md: integrate spares into array at earliest opportunity.
  md: move compat_ioctl handling into md.c
  md: revise Kconfig help for MD_MULTIPATH
  md: add MODULE_DESCRIPTION for all md related modules.
  raid: improve MD/raid10 handling of correctable read errors.
  md/raid10: print more useful messages on device failure.
  md/bitmap: update dirty flag when bitmap bits are explicitly set.
  md: Support write-intent bitmaps with externally managed metadata.
  md/bitmap: move setting of daemon_lastrun out of bitmap_read_sb
  md: support updating bitmap parameters via sysfs.
  md: factor out parsing of fixed-point numbers
  md: support bitmap offset appropriate for external-metadata arrays.
  md: remove needless setting of thread->timeout in raid10_quiesce
  md: change daemon_sleep to be in 'jiffies' rather than 'seconds'.
  md: move offset, daemon_sleep and chunksize out of bitmap structure
  md: collect bitmap-specific fields into one structure.
  md/raid1: add takeover support for raid5->raid1
  md: add honouring of suspend_{lo,hi} to raid1.
  ...

Linus Torvalds [Mon, 14 Dec 2009 18:02:35 +0000 (10:02 -0800)]
Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6

* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6: (58 commits)
  mfd: Add twl6030 regulator subdevices
  regulator: Add support for twl6030 regulators
  rtc: Add twl6030 RTC support
  mfd: Add support for twl6030 irq framework
  mfd: Rename twl4030_ routines in twl-regulator.c
  mfd: Rename twl4030_ routines in rtc-twl.c
  mfd: Rename all twl4030_i2c*
  mfd: Rename twl4030* driver files to enable re-use
  mfd: Clarify twl4030 return value for read and write
  mfd: Add all twl4030 regulators to the twl4030 mfd driver
  mfd: Don't set mc13783 ADREFMODE for touch conversions
  mfd: Remove ezx-pcap defines for custom led gpio encoding
  mfd: Near complete mc13783 rewrite
  mfd: Remove build time warning for WM835x register default tables
  mfd: Force I2C to be built in when building WM831x
  mfd: Don't allow wm831x to be built as a module
  mfd: Fix incorrect error check for wm8350-core
  mfd: Fix twl4030 warning
  gpiolib: Implement gpio_to_irq() for wm831x
  mfd: Remove default selection of AB4500
  ...

Linus Torvalds [Mon, 14 Dec 2009 18:01:15 +0000 (10:01 -0800)]
Merge branch 'devel' of master.kernel.org:/home/rmk/linux-2.6-arm

* 'devel' of master.kernel.org:/home/rmk/linux-2.6-arm:
  ARM: fix lh7a40x build
  ARM: fix sa1100 build
  ARM: fix clps711x, footbridge, integrator, ixp2000, ixp2300 and s3c build bug
  ARM: VFP: fix vfp thread init bug and document vfp notifier entry conditions
  ARM: pxa: fix now incorrect reference of skt->irq by using skt->socket.pci_irq
  [ARM] pxa/zeus: default configuration for Arcom Zeus SBC.
  [ARM] pxa/zeus: make Viper pcmcia support more generic to support Zeus
  [ARM] pxa/zeus: basic support for Arcom Zeus SBC
  [ARM] pxa/em-x270: fix usb hub power up/reset sequence
  PCMCIA: fix pxa2xx_lubbock modular build error
  ARM: RealView: Fix typo in the RealView/PBX Kconfig entry
  ARM: Do not allow the probing of the local timer
  ARM: Add an earlyprintk debug console

Linus Torvalds [Mon, 14 Dec 2009 18:00:24 +0000 (10:00 -0800)]
Merge git://git.linux-nfs.org/projects/trondmy/nfs-2.6

* git://git.linux-nfs.org/projects/trondmy/nfs-2.6: (75 commits)
  NFS: Fix nfs_migrate_page()
  rpc: remove unneeded function parameter in gss_add_msg()
  nfs41: Invoke RECLAIM_COMPLETE on all new client ids
  SUNRPC: IS_ERR/PTR_ERR confusion
  NFSv41: Fix a potential state leakage when restarting nfs4_close_prepare
  nfs41: Handle NFSv4.1 session errors in the delegation recall code
  nfs41: Retry delegation return if it failed with session error
  nfs41: Handle session errors during delegation return
  nfs41: Mark stateids in need of reclaim if state manager gets stale clientid
  NFS: Fix up the declaration of nfs4_restart_rpc when NFSv4 not configured
  nfs41: Don't clear DRAINING flag on NFS4ERR_STALE_CLIENTID
  nfs41: nfs41_setup_state_renewal
  NFSv41: More cleanups
  NFSv41: Fix up some bugs in the NFS4CLNT_SESSION_DRAINING code
  NFSv41: Clean up slot table management
  NFSv41: Fix nfs4_proc_create_session
  nfs41: Invoke RECLAIM_COMPLETE
  nfs41: RECLAIM_COMPLETE functionality
  nfs41: RECLAIM_COMPLETE XDR functionality
  Cleanup some NFSv4 XDR decode comments
  ...

Linus Torvalds [Mon, 14 Dec 2009 17:58:24 +0000 (09:58 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (34 commits)
  m68k: rename global variable vmalloc_end to m68k_vmalloc_end
  percpu: add missing per_cpu_ptr_to_phys() definition for UP
  percpu: Fix kdump failure if booted with percpu_alloc=page
  percpu: make misc percpu symbols unique
  percpu: make percpu symbols in ia64 unique
  percpu: make percpu symbols in powerpc unique
  percpu: make percpu symbols in x86 unique
  percpu: make percpu symbols in xen unique
  percpu: make percpu symbols in cpufreq unique
  percpu: make percpu symbols in oprofile unique
  percpu: make percpu symbols in tracer unique
  percpu: make percpu symbols under kernel/ and mm/ unique
  percpu: remove some sparse warnings
  percpu: make alloc_percpu() handle array types
  vmalloc: fix use of non-existent percpu variable in put_cpu_var()
  this_cpu: Use this_cpu_xx in trace_functions_graph.c
  this_cpu: Use this_cpu_xx for ftrace
  this_cpu: Use this_cpu_xx in nmi handling
  this_cpu: Use this_cpu operations in RCU
  this_cpu: Use this_cpu ops for VM statistics
  ...

Fix up trivial (famous last words) global per-cpu naming conflicts in
arch/x86/kvm/svm.c
mm/slab.c

William Allen Simpson [Sun, 13 Dec 2009 20:12:46 +0000 (15:12 -0500)]
Documentation: rw_lock lessons learned

In recent months, two different network projects erroneously
strayed down the rw_lock path.  Update the Documentation
based upon comments by Eric Dumazet and Paul E. McKenney in
those threads.

Further updates await somebody else with more expertise.

Changes:
  - Merged with extensive content by Stephen Hemminger.
  - Fix one of the comments by Linus Torvalds.

Signed-off-by: William.Allen.Simpson@gmail.com
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Hidetoshi Seto [Mon, 14 Dec 2009 08:57:00 +0000 (17:57 +0900)]
x86, mce: Clean up thermal init by introducing intel_thermal_supported()

It looks better to have a common function. No change in functionality.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
LKML-Reference: <4B25FDDC.407@jp.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cyrill Gorcunov [Mon, 14 Dec 2009 08:56:34 +0000 (17:56 +0900)]
x86, mce: Thermal monitoring depends on APIC being enabled

Add a check that the APIC is not disabled, since thermal
monitoring depends on it. Once the APIC is disabled we should
not try to install the "thermal monitor" vector, print that
thermal monitoring is enabled, and so on.

Note that "Intel Corrected Machine Check Interrupts" already
has such a check.

Also, I decided not to add a cpu_has_apic check to
mcheck_intel_therm_init: even if it calls apic_read on a
disabled APIC, that is safe here and saves a few code
bytes.

Reported-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
LKML-Reference: <4B25FDC2.3020401@jp.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
David Miller [Mon, 14 Dec 2009 07:56:22 +0000 (23:56 -0800)]
perf sched: Fix build failure on sparc

Here, tvec->tv_usec is "unsigned int" not "unsigned long".

Since the type is different on every platform, it's probably
best to just use long printf formats and cast.
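The approach can be sketched as below. demo_format_timeval is a hypothetical helper, not the perf code; it only shows the cast-to-long pattern with matching %ld formats:

```c
#include <stdio.h>
#include <sys/time.h>

/*
 * Format a timeval as "sec.usec". tv_sec and tv_usec have platform-dependent
 * types, so both are cast to long and printed with %ld.
 */
int demo_format_timeval(char *buf, size_t len, const struct timeval *tv)
{
	return snprintf(buf, len, "%ld.%06ld", (long)tv->tv_sec, (long)tv->tv_usec);
}
```

Casting at the call site keeps the format string identical on every platform, whatever the underlying field types are.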

Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20091213.235622.53363059.davem@davemloft.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Yinghai Lu [Mon, 14 Dec 2009 02:52:15 +0000 (11:52 +0900)]
x86: Gart: fix breakage due to IOMMU initialization cleanup

This fixes the following breakage of the commit
75f1cdf1dda92cae037ec848ae63690d91913eac:

- GART systems that don't have AGP, with a broken BIOS and more
  than 4GB of memory, are forced to use swiotlb. They can allocate
  the aperture by hand and use GART.

- GART systems without AGP must disable GART on shutdown.

- When swiotlb usage is forced by the boot option,
  gart_iommu_hole_init() is not called, so we disable GART in
  early_gart_iommu_check().

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
LKML-Reference: <1260759135-6450-3-git-send-email-fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
FUJITA Tomonori [Mon, 14 Dec 2009 02:52:14 +0000 (11:52 +0900)]
x86: Move swiotlb initialization before dma32_free_bootmem

Commit 75f1cdf1dda92cae037ec848ae63690d91913eac introduced a
bug: we initialize SWIOTLB right after dma32_free_bootmem, so
we wrongly steal the memory area allocated earlier for GART on
systems with a broken BIOS.

This moves swiotlb initialization before dma32_free_bootmem().

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: yinghai@kernel.org
LKML-Reference: <1260759135-6450-2-git-send-email-fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Joe Perches [Mon, 14 Dec 2009 07:24:03 +0000 (23:24 -0800)]
x86: Fix build warning in arch/x86/mm/mmio-mod.c

Stephen Rothwell reported these warnings:

 arch/x86/mm/mmio-mod.c: In function 'print_pte':
 arch/x86/mm/mmio-mod.c:100: warning: too many arguments for format
 arch/x86/mm/mmio-mod.c:106: warning: too many arguments for format

The 'fmt' was left out accidentally.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus <torvalds@linux-foundation.org>
LKML-Reference: <1260775443.18538.16.camel@Joe-Laptop.home>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
FUJITA Tomonori [Mon, 14 Dec 2009 02:06:15 +0000 (11:06 +0900)]
x86: Remove usedac in feature-removal-schedule.txt

The reason given for removal, "replaced by allowdac and no dac
combination", is incorrect. There is no way to do the same thing
with a combination of "allowdac" and "nodac".

The usedac option enables us to stop via_no_dac() from setting
forbid_dac to 1. That is, someone who uses VIA bridges can use
DAC with this option even if some VIA bridges seem to be
broken with respect to DAC.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: WANG Cong <amwang@redhat.com>
Cc: gcosta@redhat.com
LKML-Reference: <20091214104423X.fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Hitoshi Mitake [Sun, 13 Dec 2009 08:01:59 +0000 (17:01 +0900)]
perf bench: Add "all" pseudo subsystem and "all" pseudo suite

This patch adds a new "all" pseudo subsystem and an "all" pseudo
suite. These are for testing all suites of all subsystems, or
all suites of one subsystem.

(This patch also contains a few trivial comment fixes for
bench/* and output style fixes. I judged that there was no
need to split them into individual patches.)

Example of use:

| % ./perf bench sched all                      # Test all suites of sched subsystem
| # Running sched/messaging benchmark...
| # 20 sender and receiver processes per group
| # 10 groups == 400 processes run
|
|      Total time: 0.414 [sec]
|
| # Running sched/pipe benchmark...
| # Extecuted 1000000 pipe operations between two tasks
|
|      Total time: 10.999 [sec]
|
|       10.999317 usecs/op
|           90914 ops/sec
|
| % ./perf bench all                            # Test all suites of all subsystems
| # Running sched/messaging benchmark...
| # 20 sender and receiver processes per group
| # 10 groups == 400 processes run
|
|      Total time: 0.420 [sec]
|
| # Running sched/pipe benchmark...
| # Extecuted 1000000 pipe operations between two tasks
|
|      Total time: 11.741 [sec]
|
|       11.741346 usecs/op
|           85169 ops/sec
|
| # Running mem/memcpy benchmark...
| # Copying 1MB Bytes from 0x7ff33e920010 to 0x7ff3401ae010 ...
|
|      808.407437 MB/Sec

Signed-off-by: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <1260691319-4683-1-git-send-email-mitake@dcl.info.waseda.ac.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Michal Simek [Fri, 11 Dec 2009 11:54:04 +0000 (12:54 +0100)]
microblaze: Remove rt_sigsuspend wrapper

The generic rt_sigsuspend syscall doesn't need any asm wrapper.

Signed-off-by: Michal Simek <monstr@monstr.eu>
steve@digidescorp.com [Wed, 9 Dec 2009 23:13:42 +0000 (17:13 -0600)]
microblaze: nommu: Don't clobber R11 on syscalls

The noMMU syscall trap has a bug that causes R11 to be zero on return to
userland. Remove the extra "save" of R11 responsible for the bug.

Remove reloading of mode indicator because r11 already contains it.

Signed-off-by: Steven J. Magnani <steve@digidescorp.com>
Signed-off-by: Michal Simek <monstr@monstr.eu>
Michal Simek [Thu, 10 Dec 2009 11:06:03 +0000 (12:06 +0100)]
microblaze: Remove show_tmem function

The show_tmem function does nothing, which is why I removed it.
Commented-out ancient code is cleaned up as well.

Signed-off-by: Michal Simek <monstr@monstr.eu>
Michal Simek [Thu, 10 Dec 2009 10:43:57 +0000 (11:43 +0100)]
microblaze: Support for WB cache

Microblaze version 7.20.d is the first MB version which can be run
on MMU Linux. Please do not use previous versions because they contain
a HW bug.
WB support made it necessary to redesign the whole cache design.
Microblaze versions from 7.20.a don't need to disable IRQ and cache
before working with them, which is why there are special structures for it.

Signed-off-by: Michal Simek <monstr@monstr.eu>
Michal Simek [Tue, 8 Dec 2009 16:54:07 +0000 (17:54 +0100)]
microblaze: Add PVR for Microblaze v7.30.a

Microblaze v7.30.a will have 0x10 version string.

Signed-off-by: Michal Simek <monstr@monstr.eu>
Michal Simek [Tue, 8 Dec 2009 16:51:06 +0000 (17:51 +0100)]
microblaze: Remove ancient and fake microblaze version from cpu_ver table

We need to continue with the next Microblaze PVR version, which is why
I have to remove those ancient versions. These version strings do not match
any versions. From Microblaze v5.00.a on it is possible to use this style.
I believe that no one uses the ancient versions. If someone does, they will
just be labeled as unknown version.

Signed-off-by: Michal Simek <monstr@monstr.eu>
Michal Simek [Tue, 8 Dec 2009 16:49:21 +0000 (17:49 +0100)]
microblaze: Remove panic_timeout init value

panic_timeout is in the BSS section and is cleared along with it.
This means that its value is already set to 0.
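The language rule behind this can be shown with a stand-alone sketch (demo_panic_timeout is a made-up name). C guarantees that objects with static storage duration and no initializer start at zero, and toolchains typically place them in .bss:

```c
/* No initializer: static storage duration, so the value starts at 0. */
int demo_panic_timeout;

int demo_get_panic_timeout(void)
{
	return demo_panic_timeout;
}
```

An explicit "= 0" would therefore add nothing to the initial value.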

Signed-off-by: Michal Simek <monstr@monstr.eu>
Michal Simek [Mon, 7 Dec 2009 07:21:34 +0000 (08:21 +0100)]
microblaze: Do not count system calls in default

It is not necessary to count system calls, which is why
I added a DEBUG macro.
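A minimal sketch of the pattern, using a hypothetical counter array and dispatcher rather than the Microblaze entry code. The counting statement compiles to nothing unless DEBUG is defined:

```c
/* Hypothetical table size; illustrative only. */
#define DEMO_NR_SYSCALLS 4

#ifdef DEBUG
unsigned long demo_syscall_count[DEMO_NR_SYSCALLS];
#define demo_count_syscall(nr) (demo_syscall_count[(nr)]++)
#else
#define demo_count_syscall(nr) ((void)0)
#endif

int demo_dispatch(unsigned int nr)
{
	if (nr >= DEMO_NR_SYSCALLS)
		return -1;	/* unknown syscall number */
	demo_count_syscall(nr);	/* no code is generated unless DEBUG is defined */
	return 0;
}
```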

Signed-off-by: Michal Simek <monstr@monstr.eu>
Michal Simek [Mon, 30 Nov 2009 08:26:09 +0000 (09:26 +0100)]
microblaze: Enable DTC compilation

For the simpleImage format we need to compile DTC. It is still possible
to compile only the Linux kernel without a compiled-in DTB.

Signed-off-by: Michal Simek <monstr@monstr.eu>
John Williams [Tue, 24 Nov 2009 10:27:54 +0000 (20:27 +1000)]
microblaze: Core oprofile configs and hooks

Microblaze uses timer interrupt mode. Microblaze doesn't have
any performance counters, which is why we use just a simple implementation.

Signed-off-by: John Williams <john.williams@petalogix.com>
Signed-off-by: Michal Simek <monstr@monstr.eu>
steve@digidescorp.com [Tue, 17 Nov 2009 14:43:39 +0000 (08:43 -0600)]
microblaze: Fix level interrupt ACKing

Level interrupts need to be ack'd in the unmask handler, as in powerpc.
Among other issues, this bug causes the system clock to appear to run at
double-speed.
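Why the ordering matters can be illustrated with a toy model. It assumes a controller whose pending bit re-latches for as long as the level-triggered line stays asserted; this is a simplification for illustration, not the Microblaze interrupt controller code:

```c
#include <stdbool.h>

/* Simplified model of one level-triggered interrupt line (an assumption). */
struct demo_irq {
	bool line;	/* device is currently asserting the line */
	bool latched;	/* controller's pending bit */
	int handled;	/* how many times the handler ran */
};

/* Ack clears the pending bit, but it re-latches while the line is asserted. */
static void demo_ack(struct demo_irq *irq)
{
	irq->latched = irq->line;
}

/* The handler services the device, which deasserts the line. */
static void demo_handler(struct demo_irq *irq)
{
	irq->handled++;
	irq->line = false;
}

/* Run one device interrupt to completion; returns the handler invocation count. */
int demo_run(bool ack_in_unmask)
{
	struct demo_irq irq = { .line = true, .latched = true, .handled = 0 };

	while (irq.latched) {
		/* buggy ordering: ack before the handler has quieted the device */
		if (!ack_in_unmask)
			demo_ack(&irq);
		demo_handler(&irq);
		/* fixed ordering: ack at unmask time, after the source is quiet */
		if (ack_in_unmask)
			demo_ack(&irq);
	}
	return irq.handled;
}
```

In this model demo_run(false) runs the handler twice for a single device interrupt, which would match a timer tick being counted twice, while demo_run(true) runs it once.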

Signed-off-by: Steven J. Magnani <steve@digidescorp.com>
Signed-off-by: Michal Simek <monstr@monstr.eu>
Michal Simek [Mon, 19 Oct 2009 11:50:02 +0000 (13:50 +0200)]
microblaze: Enable futimesat syscall

Futimesat was disabled. LTP testing shows that MB has no
problem with this syscall.

Signed-off-by: Michal Simek <monstr@monstr.eu>
Michal Simek [Wed, 21 Oct 2009 10:29:46 +0000 (12:29 +0200)]
microblaze: Checking DTS against PVR for write-back cache

The WB cache has a special flag in the PVR. A mechanism is added
that checks the DTS against the PVR.

Signed-off-by: Michal Simek <monstr@monstr.eu>
Michal Simek [Mon, 23 Nov 2009 09:15:00 +0000 (10:15 +0100)]
microblaze: Remove duplicity from pgalloc.h

just file cleanup

Signed-off-by: Michal Simek <monstr@monstr.eu>
Michal Simek [Mon, 19 Oct 2009 09:58:44 +0000 (11:58 +0200)]
microblaze: Futex support

Microblaze v7.20 provides new lwx and swx instructions, which make it
possible to implement lock routines.

There are some tests in the open posix thread part of LTP, but the current
toolchain does not support it.
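As a portable stand-in, the kind of lock routine such instructions enable can be sketched with C11 atomics. This assumes lwx/swx act as an exclusive load/store pair; the sketch does not use them directly and is not the Microblaze futex code:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Toy lock: 0 means free, 1 means held. */
struct demo_lock { atomic_int held; };

/* Take the lock only if it is currently free; returns true on success. */
bool demo_trylock(struct demo_lock *l)
{
	int expected = 0;

	/* Succeeds only if held was 0 and is atomically replaced by 1. */
	return atomic_compare_exchange_strong(&l->held, &expected, 1);
}

void demo_unlock(struct demo_lock *l)
{
	atomic_store(&l->held, 0);
}
```

On architectures with exclusive load/store instructions, a compiler would usually implement the compare-and-exchange as a retry loop around such a pair.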

Signed-off-by: Michal Simek <monstr@monstr.eu>
Michal Simek [Mon, 23 Nov 2009 09:07:51 +0000 (10:07 +0100)]
microblaze: Adding dev_arch_data functions

The functions, dev_arch_data_set_node and get_node are missing
and are needed by some device drivers such as I2C.

Signed-off-by: John Linn <john.linn@xilinx.com>
Signed-off-by: Michal Simek <monstr@monstr.eu>
John Linn [Fri, 5 Jun 2009 17:36:31 +0000 (11:36 -0600)]
microblaze: Fix the heartbeat gpio to be more robust

The device tree handling for the gpio in the heartbeat did not handle
systems with no gpio, and it did not work with a newer version
of the gpio core which does not have the is-bidir property.

Signed-off-by: John Linn <john.linn@xilinx.com>
Signed-off-by: Michal Simek <monstr@monstr.eu>
John Williams [Fri, 14 Aug 2009 02:06:46 +0000 (12:06 +1000)]
microblaze: Simple __copy_tofrom_user for noMMU

This is the first patch that cleans up part of uaccess.h.
The rest of uaccess.h will be cleaned up later.

Signed-off-by: John Williams <john.williams@petalogix.com>
Signed-off-by: Michal Simek <monstr@monstr.eu>
Michal Simek [Thu, 23 Jul 2009 06:23:53 +0000 (08:23 +0200)]
microblaze: Export memory_start for modules

memory_start symbol is needed by kernel modules.

Signed-off-by: Michal Simek <monstr@monstr.eu>
John Williams [Mon, 24 Aug 2009 03:52:33 +0000 (13:52 +1000)]
microblaze: Use lowest-common-denominator default CPU settings

This will ensure that kernels built with no custom CPU settings will still boot
OK on hardware that has additional CPU hardware instructions etc.

Signed-off-by: John Williams <john.williams@petalogix.com>
Signed-off-by: Michal Simek <monstr@monstr.eu>
Michal Simek [Fri, 21 Aug 2009 11:47:09 +0000 (13:47 +0200)]
microblaze: Update default generic DTS

It is generated with longer compatible list

Signed-off-by: Michal Simek <monstr@monstr.eu>
15 years agomicroblaze: Enable asm optimization only for HW with barrel-shifter
Michal Simek [Mon, 26 Oct 2009 08:56:48 +0000 (09:56 +0100)]
microblaze: Enable asm optimization only for HW with barrel-shifter

The asm code uses a barrel-shifter instruction, so we have to
guard against hardware that doesn't have it.

Reported-by: John Linn <john.linn@xilinx.com>
Signed-off-by: Michal Simek <monstr@monstr.eu>
15 years agomicroblaze: Remove the buggy ALLOW_EDIT_AUTO config option
John Williams [Mon, 24 Aug 2009 03:52:32 +0000 (13:52 +1000)]
microblaze: Remove the buggy ALLOW_EDIT_AUTO config option

This was intended to allow manual override of CPU settings copied automatically
to Kconfig.auto, however it's problematic for several reasons, but mostly:

  * If the defconfig doesn't have ALLOW_EDIT_AUTO=y, then it's impossible for
    that defconfig to override the values in the kernel source tree.  This leads
    to very strange errors where the kernel is compiled with the wrong CPUFLAGS.

Next patch in the series will back out the default in Kconfig.auto to baseline
settings, so a kernel built with no default values will at least boot on any
hardware, just not make use of additional CPU features.

Signed-off-by: John Williams <john.williams@petalogix.com>
Signed-off-by: Michal Simek <monstr@monstr.eu>
15 years agomicroblaze: Move cache macro from cache.h to cacheflush.h
Michal Simek [Thu, 15 Oct 2009 13:18:13 +0000 (15:18 +0200)]
microblaze: Move cache macro from cache.h to cacheflush.h

Signed-off-by: Michal Simek <monstr@monstr.eu>
15 years agomicroblaze: support U-BOOT image format
Michal Simek [Wed, 14 Oct 2009 15:38:26 +0000 (17:38 +0200)]
microblaze: support U-BOOT image format

Two versions are generated:
linux.bin.ub, which is created from the linux.bin file,
and
simpleImage.<dts>.ub, which is created from the stripped simpleImage.<dts> file.

The load address and entry point are the first microblaze instruction,
which is given by the CONFIG_KERNEL_BASE_ADDR variable.

For the simpleImage format it is possible to parse the _start symbol too.

simpleImage.<dts> is still a stripped ELF file.

I dropped the simpleImage.<dts>.unstrip files because they are so big.

Signed-off-by: Michal Simek <monstr@monstr.eu>
15 years agomicroblaze: Ptrace notifying from signal code
Michal Simek [Thu, 15 Oct 2009 09:32:25 +0000 (11:32 +0200)]
microblaze: Ptrace notifying from signal code

After the signal frame is set up on the userspace stack, ptrace() should
be given an opportunity to single-step into the signal handler.

See FRV, Blackfin, mn10300 and UM; those patches are worth a look.

Signed-off-by: Michal Simek <monstr@monstr.eu>
15 years agomicroblaze: Extend cpuinfo for support write-back caches
Michal Simek [Wed, 14 Oct 2009 09:12:50 +0000 (11:12 +0200)]
microblaze: Extend cpuinfo for support write-back caches

A check against PVR is still missing, but this is not important
for now. Some other checks are missing too.

Signed-off-by: Michal Simek <monstr@monstr.eu>
15 years agomicroblaze: Fix cache_line_lenght
Michal Simek [Thu, 8 Oct 2009 11:06:42 +0000 (13:06 +0200)]
microblaze: Fix cache_line_lenght

We used cache_line as cache_line_lenght. For this reason
we did cache flushing 4 times longer than was necessary.

Signed-off-by: Michal Simek <monstr@monstr.eu>
15 years agomicroblaze: Detect new 7.20.d version
Michal Simek [Thu, 15 Oct 2009 11:34:31 +0000 (13:34 +0200)]
microblaze: Detect new 7.20.d version

Signed-off-by: Michal Simek <monstr@monstr.eu>
15 years agomicroblaze: Support both levels for reset
Michal Simek [Thu, 29 Oct 2009 09:12:59 +0000 (10:12 +0100)]
microblaze: Support both levels for reset

Until this patch, reset was always performed by writing 1.
Now we can use negative logic and perform reset by writing 0.

The value written is the opposite of the level currently read from that pin.

Signed-off-by: Michal Simek <monstr@monstr.eu>
15 years agomicroblaze: Fix announce message for reset gpio
Michal Simek [Thu, 29 Oct 2009 07:58:15 +0000 (08:58 +0100)]
microblaze: Fix announce message for reset gpio

I had to change the message for gpio-reset because I always
failed to see it. The RESET prefix is big and visible.

Signed-off-by: Michal Simek <monstr@monstr.eu>
15 years agomicroblaze: Remove saving and restoring before calling signal code
Michal Simek [Fri, 13 Nov 2009 07:26:49 +0000 (08:26 +0100)]
microblaze: Remove saving and restoring before calling signal code

Saving is done in the SAVE_STATE macros, so another save discards
the previously saved value.

This change has no effect on normal programs, because they end in an exception
and are killed. It does, however, have an effect on debugging.

Signed-off-by: Michal Simek <monstr@monstr.eu>
15 years agomicroblaze: Fix pfn_valid() for noMMU
steve@digidescorp.com [Fri, 13 Nov 2009 22:08:29 +0000 (16:08 -0600)]
microblaze: Fix pfn_valid() for noMMU

Configuring DEBUG_SLAB causes a noMMU kernel to die during initialization
with an invalid virtual address panic in kfree_debugcheck().
The panic is due to an improper definition of pfn_valid().

Signed-off-by: Steven J. Magnani <steve@digidescorp.com>
Signed-off-by: Michal Simek <monstr@monstr.eu>
15 years agomicroblaze: ftrace: Add dynamic function graph tracer
Michal Simek [Mon, 16 Nov 2009 09:34:15 +0000 (10:34 +0100)]
microblaze: ftrace: Add dynamic function graph tracer

This patch adds support for the dynamic function graph tracer.

My one expectation is that I can do flush_icache after
all code modifications. On microblaze this is safer than doing a
flush for every entry. The name flush is used for the icache, but
invalidation would be correct; this will be fixed in the upcoming
new cache implementation and WB support.

Signed-off-by: Michal Simek <monstr@monstr.eu>
15 years agomicroblaze: ftrace: add function graph support
Michal Simek [Mon, 16 Nov 2009 09:32:10 +0000 (10:32 +0100)]
microblaze: ftrace: add function graph support

For more information look at Documentation/trace folder.

Signed-off-by: Michal Simek <monstr@monstr.eu>
15 years agomicroblaze: ftrace: Add dynamic trace support
Michal Simek [Thu, 10 Dec 2009 13:15:44 +0000 (14:15 +0100)]
microblaze: ftrace: Add dynamic trace support

With dynamic function tracer, by default, _mcount is defined as an
"empty" function, it returns directly without any more action. When
enabling it in user-space, it will jump to a real tracing
function(ftrace_caller), and do the real job for us.

Unlike the static function tracer, the dynamic function tracer provides
two functions ftrace_make_call()/ftrace_make_nop() to enable/disable the
tracing of some indicated kernel functions(set_ftrace_filter).

In the kernel version, there is only one "_mcount" string for every
kernel function, so, we just need to match this one in mcount_regex of
scripts/recordmcount.pl.

For more information please look at code and Documentation/trace folder.

Steven ACKed the scripts/recordmcount.pl part.

Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Michal Simek <monstr@monstr.eu>
15 years agomicroblaze: ftrace: enable HAVE_FUNCTION_TRACE_MCOUNT_TEST
Michal Simek [Mon, 16 Nov 2009 08:55:08 +0000 (09:55 +0100)]
microblaze: ftrace: enable HAVE_FUNCTION_TRACE_MCOUNT_TEST

Implement MCOUNT_TEST in asm code; it is faster than using the
generic code.

Signed-off-by: Michal Simek <monstr@monstr.eu>
15 years agomicroblaze: ftrace: add static function tracer
Michal Simek [Mon, 16 Nov 2009 08:40:14 +0000 (09:40 +0100)]
microblaze: ftrace: add static function tracer

If gcc's -pg is enabled with CONFIG_FUNCTION_TRACER=y, a call to
_mcount will be inserted into each kernel function, so it is
possible to trace the kernel functions in _mcount.

This patch adds the specific _mcount support for static function
tracing. By default, ftrace_trace_function is initialized to
ftrace_stub (an empty function), so the default _mcount introduces
very little overhead. After ftrace is enabled from user-space, it jumps
to a real tracing function and does static function tracing for us.
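
As a rough illustration of the dispatch described above, here is a
user-space Python model, not the Microblaze assembly. The Tracer class
and record_call are hypothetical names; only ftrace_stub,
ftrace_trace_function and _mcount come from the message.

```python
def ftrace_stub(ip, parent_ip):
    """Empty default hook: tracing is off, so do nothing."""


class Tracer:
    """Hypothetical stand-in for the kernel's ftrace_trace_function pointer."""

    def __init__(self):
        self.trace_function = ftrace_stub  # default: the empty stub
        self.log = []

    def record_call(self, ip, parent_ip):
        # A "real" tracing function: remember callee and caller.
        self.log.append((ip, parent_ip))

    def mcount(self, ip, parent_ip):
        # Models _mcount: one cheap comparison when tracing is disabled.
        if self.trace_function is not ftrace_stub:
            self.trace_function(ip, parent_ip)
```

With the stub installed, mcount() returns after a single comparison;
assigning record_call to trace_function models enabling ftrace.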

Commit message from Wu Zhangjin <wuzhangjin@gmail.com>

Signed-off-by: Michal Simek <monstr@monstr.eu>
15 years agomicroblaze: Add TRACE_IRQFLAGS_SUPPORT
Michal Simek [Fri, 30 Oct 2009 11:26:53 +0000 (12:26 +0100)]
microblaze: Add TRACE_IRQFLAGS_SUPPORT

There are just two major changes:
Renamed local_irq functions to raw_local_irq in irq.c.
Added TRACE_IRQFLAGS_SUPPORT to Kconfig.debug.

Look at Documentation/irqflags-tracing.txt

Signed-off-by: Michal Simek <monstr@monstr.eu>
15 years agomicroblaze: preliminary enabling for LATENCYTOP support in Kconfig
Michal Simek [Mon, 16 Nov 2009 08:09:47 +0000 (09:09 +0100)]
microblaze: preliminary enabling for LATENCYTOP support in Kconfig

Signed-off-by: Michal Simek <monstr@monstr.eu>
15 years agomicroblaze: Lockdep support
Michal Simek [Thu, 10 Dec 2009 11:07:02 +0000 (12:07 +0100)]
microblaze: Lockdep support

Microblaze needs to do lock_init very soon because MMU init calls lock functions.

Here is the explanation from Peter Zijlstra why we have to enable
__ARCH_WANTS_INTERRUPTS_ON_CTSW.

"So we schedule while holding rq->lock (for obvious reasons), but since
lockdep tracks held locks per tasks, we need to transfer the held state
from the prev to the next task. We do this by explicitly calling
spin_release(&rq->lock) in context_switch() right before switch_to(),
and calling spin_acquire(&rq->lock) in
finish_task_switch()->finish_lock_switch().

Now, for some reason lockdep thinks that interrupts got enabled over the
context switch (git grep __ARCH_WANTS_INTERRUPTS_ON_CTSW arch/microblaze
doesn't seem to turn up anything).

Clearly trying to acquire the rq->lock with interrupts enabled is a bad
idea and lockdep warns you about this."

Signed-off-by: Michal Simek <monstr@monstr.eu>
15 years agomicroblaze: Register timecounter/cyclecounter
Michal Simek [Fri, 6 Nov 2009 11:31:00 +0000 (12:31 +0100)]
microblaze: Register timecounter/cyclecounter

It is the same counter as we use as free running one.
I would like to use it for ftrace.

Signed-off-by: Michal Simek <monstr@monstr.eu>
15 years agomicroblaze: Stack trace support
Michal Simek [Tue, 10 Nov 2009 14:57:01 +0000 (15:57 +0100)]
microblaze: Stack trace support

This is a working implementation, but the problem is that
Microblaze lacks a frame pointer, which is why there is a
big loop which traces and shows all addresses that are in text.
It also shows addresses which are in registers, etc.

This is a problem, and it is the reason why all Microblaze
traces are wrong. There is an option to do hacks and trace
the kernel code, but this is too complicated.

Signed-off-by: Michal Simek <monstr@monstr.eu>
15 years agomicroblaze: Add IRQENTRY_TEXT to lds
Michal Simek [Fri, 6 Nov 2009 11:27:25 +0000 (12:27 +0100)]
microblaze: Add IRQENTRY_TEXT to lds

It is important for ftrace irqsoff support

Signed-off-by: Michal Simek <monstr@monstr.eu>
15 years agomicroblaze: __init_begin symbol must be aligned
Michal Simek [Fri, 30 Oct 2009 13:41:52 +0000 (14:41 +0100)]
microblaze: __init_begin symbol must be aligned

The problem was that free_initmem passed a badly aligned
__init_begin symbol to free_initrd_mem, and free_initrd_mem doesn't
care about __init_end but takes PAGE_SIZE instead.

Here is the behavior in the kernel bootlog:
ramdisk_execute_command (from init/main.c) was overwritten.

Freeing unused kernel memory: 6224k freed
Failed to execute ��������������{���
Failed to execute ��������������{����.  Attempting defaults...
Mounting proc:
Mounting var:

Signed-off-by: Michal Simek <monstr@monstr.eu>
15 years agomicroblaze: GPIO reset support
Michal Simek [Fri, 2 Oct 2009 10:48:47 +0000 (12:48 +0200)]
microblaze: GPIO reset support

Signed-off-by: Michal Simek <monstr@monstr.eu>
15 years agomd: add 'recovery_start' per-device sysfs attribute
Dan Williams [Sun, 13 Dec 2009 04:17:12 +0000 (21:17 -0700)]
md: add 'recovery_start' per-device sysfs attribute

Enable external metadata arrays to manage rebuild checkpointing via a
md/dev-XXX/recovery_start attribute which reflects rdev->recovery_offset

Also update resync_start_store to allow 'none' to be written, for
consistency.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
15 years agomd: rcu_read_lock() walk of mddev->disks in md_do_sync()
Dan Williams [Sun, 13 Dec 2009 04:17:06 +0000 (21:17 -0700)]
md: rcu_read_lock() walk of mddev->disks in md_do_sync()

Other walks of this list are either under rcu_read_lock() or the list
mutation lock (mddev_lock()).  This protects against the improbable case of a
disk being removed from the array at the start of md_do_sync().

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
15 years agomd: integrate spares into array at earliest opportunity.
NeilBrown [Mon, 14 Dec 2009 01:50:06 +0000 (12:50 +1100)]
md: integrate spares into array at earliest opportunity.

As v1.x metadata can record that a member of the array is
not completely recovered, it makes sense to record that a
spare has become a regular member of the array at the earliest
opportunity.
So remove the tests on "recovery_offset > 0" in super_1_sync
as they really aren't needed, and schedule a metadata update
immediately after adding spares to a degraded array.

This means that if a crash happens immediately after a recovery
starts, the new device will be included in the array and recovery will
continue from wherever it was up to.  Previously this didn't happen
unless recovery was at least 1/16 of the way through.

Signed-off-by: NeilBrown <neilb@suse.de>
15 years agomd: move compat_ioctl handling into md.c
Arnd Bergmann [Mon, 14 Dec 2009 01:50:05 +0000 (12:50 +1100)]
md: move compat_ioctl handling into md.c

The RAID ioctls are only implemented in md.c, so the
handling for them should also be moved there from
fs/compat_ioctl.c.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Neil Brown <neilb@suse.de>
Cc: Andre Noll <maan@systemlinux.org>
Cc: linux-raid@vger.kernel.org
Signed-off-by: NeilBrown <neilb@suse.de>
15 years agomd: revise Kconfig help for MD_MULTIPATH
NeilBrown [Mon, 14 Dec 2009 01:49:59 +0000 (12:49 +1100)]
md: revise Kconfig help for MD_MULTIPATH

Make it clear in the config message that MD_MULTIPATH is not under
active development.

Cc: Oren Held <orenhe@il.ibm.com>
Signed-off-by: NeilBrown <neilb@suse.de>
15 years agomd: add MODULE_DESCRIPTION for all md related modules.
NeilBrown [Mon, 14 Dec 2009 01:49:58 +0000 (12:49 +1100)]
md: add MODULE_DESCRIPTION for all md related modules.

Suggested by  Oren Held <orenhe@il.ibm.com>

Signed-off-by: NeilBrown <neilb@suse.de>
15 years agoraid: improve MD/raid10 handling of correctable read errors.
Robert Becker [Mon, 14 Dec 2009 01:49:58 +0000 (12:49 +1100)]
raid: improve MD/raid10 handling of correctable read errors.

We've noticed severe lasting performance degradation of our raid
arrays when we have drives that yield large amounts of media errors.
The raid10 module will queue each failed read for retry, and also
will attempt to call fix_read_error() to perform the read recovery.
Read recovery is performed while the array is frozen, so repeated
recovery attempts can degrade the performance of the array for
extended periods of time.

With this patch I propose adding a per md device max number of
corrected read attempts.  Each rdev will maintain a count of
read correction attempts in the rdev->read_errors field (not
used currently for raid10). When we enter fix_read_error()
we'll check to see when the last read error occurred, and
divide the read error count by 2 for every hour since the
last read error. If at that point our read error count
exceeds the read error threshold, we'll fail the raid device.
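
The decay rule in the paragraph above can be modelled in a few lines
of Python. This is only a sketch of the arithmetic as described, not
the raid10 code; the function names and the use of whole hours are
assumptions made for illustration.

```python
def decayed_read_errors(read_errors: int, hours_since_last_error: int) -> int:
    """Halve the error count once for every full hour since the last error."""
    # A right shift by n halves n times; long gaps simply decay to zero.
    if hours_since_last_error >= read_errors.bit_length():
        return 0
    return read_errors >> hours_since_last_error


def should_fail_device(read_errors: int, hours_since_last_error: int,
                       max_read_errors: int) -> bool:
    """True when the decayed count still exceeds the configured threshold."""
    decayed = decayed_read_errors(read_errors, hours_since_last_error)
    return decayed > max_read_errors
```

For example, 40 recorded errors decay to 5 after three quiet hours, so
a device with a threshold of 20 would no longer be failed.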

In addition in this patch I add sysfs nodes (get/set) for
the per md max_read_errors attribute, the rdev->read_errors
attribute, and added some printk's to indicate when
fix_read_error fails to repair an rdev.

For testing I used debugfs->fail_make_request to inject
IO errors to the rdev while doing IO to the raid array.

Signed-off-by: Robert Becker <Rob.Becker@riverbed.com>
Signed-off-by: NeilBrown <neilb@suse.de>
15 years agomd/raid10: print more useful messages on device failure.
Robert Becker [Mon, 14 Dec 2009 01:49:57 +0000 (12:49 +1100)]
md/raid10: print more useful messages on device failure.

When we get a read error on a device in a RAID10, and attempting to
repair the error fails, print more useful messages about why it
failed.

Signed-off-by: Robert Becker <Rob.Becker@riverbed.com>
Signed-off-by: NeilBrown <neilb@suse.de>
15 years agomd/bitmap: update dirty flag when bitmap bits are explicitly set.
NeilBrown [Mon, 14 Dec 2009 01:49:56 +0000 (12:49 +1100)]
md/bitmap: update dirty flag when bitmap bits are explicitly set.

There is a sysfs file which allows bits in the write-intent
bitmap to be explicitly set - indicating that the block is thought
to be 'dirty'.
When this happens we should really set recovery_cp backwards
to include the block to reflect this dirtiness.

In particular, a 'resync' process will refuse to start if
recovery_cp is beyond the end of the array, so this is needed
to allow a resync to be triggered.

Signed-off-by: NeilBrown <neilb@suse.de>
15 years agomd: Support write-intent bitmaps with externally managed metadata.
NeilBrown [Mon, 14 Dec 2009 01:49:56 +0000 (12:49 +1100)]
md: Support write-intent bitmaps with externally managed metadata.

In this case, the metadata needs to not be in the same
sector as the bitmap.
md will not read/write any bitmap metadata.  Config must be
done via sysfs and when a recovery makes the array non-degraded
again, writing 'true' to 'bitmap/can_clear' will allow bits in
the bitmap to be cleared again.

Signed-off-by: NeilBrown <neilb@suse.de>
15 years agomd/bitmap: move setting of daemon_lastrun out of bitmap_read_sb
NeilBrown [Mon, 14 Dec 2009 01:49:56 +0000 (12:49 +1100)]
md/bitmap: move setting of daemon_lastrun out of bitmap_read_sb

Setting daemon_lastrun really has nothing to do with reading
the bitmap superblock, it just happens to be needed at the same time.
bitmap_read_sb is about to become optional, so move that code out
to after the call to bitmap_read_sb.

Signed-off-by: NeilBrown <neilb@suse.de>
15 years agomd: support updating bitmap parameters via sysfs.
NeilBrown [Mon, 14 Dec 2009 01:49:55 +0000 (12:49 +1100)]
md: support updating bitmap parameters via sysfs.

A new attribute directory 'bitmap' in 'md' is created which
contains files for configuring the bitmap.
'location' identifies where the bitmap is, either 'none',
or 'file' or 'sector offset from metadata'.
Writing 'location' can create or remove a bitmap.
Adding a 'file' bitmap this way is not yet supported.
'chunksize' and 'time_base' must be set before 'location'
can be set.

'chunksize' can be set before creating a bitmap, but is
currently always over-ridden by the bitmap superblock.

'time_base' and 'backlog' can be updated at any time.

Signed-off-by: NeilBrown <neilb@suse.de>
Reviewed-by: Andre Noll <maan@systemlinux.org>
15 years agomd: factor out parsing of fixed-point numbers
NeilBrown [Mon, 14 Dec 2009 01:49:55 +0000 (12:49 +1100)]
md: factor out parsing of fixed-point numbers

safe_delay_store can parse fixed point numbers (for fractions
of a second).  We will want to do that for another sysfs
file soon, so factor out the code.
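
To make the idea concrete, here is a small Python sketch of parsing a
decimal string into a scaled integer, e.g. seconds into milliseconds.
It is not the kernel helper: the name parse_fixed_point, the scale
argument, and the choice to reject extra decimal places are all
assumptions of this example.

```python
def parse_fixed_point(text: str, scale: int) -> int:
    """Parse a non-negative decimal string into an integer scaled by 10**scale."""
    whole, _, frac = text.strip().partition(".")
    # Require digits only, and no more decimals than the scale can hold.
    if not (whole + frac).isdigit() or len(frac) > scale:
        raise ValueError(f"not a fixed-point number with <= {scale} decimals: {text!r}")
    # Pad the fraction with zeros so "1.5" with scale 3 becomes 1500.
    return int((whole or "0") + frac.ljust(scale, "0"))
```

Keeping the value as a scaled integer avoids floating point entirely,
which matters in kernel code.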

Signed-off-by: NeilBrown <neilb@suse.de>
15 years agomd: support bitmap offset appropriate for external-metadata arrays.
NeilBrown [Mon, 14 Dec 2009 01:49:54 +0000 (12:49 +1100)]
md: support bitmap offset appropriate for external-metadata arrays.

For md arrays where metadata is managed externally, the kernel does not
know about a superblock so the superblock offset is 0.
If we want to have a write-intent-bitmap near the end of the
devices of such an array, we should support sector_t sized offset.
We need the offset to be possibly negative for when the bitmap is before
the metadata, so use loff_t instead.

Also add sanity check that bitmap does not overlap with data.

Signed-off-by: NeilBrown <neilb@suse.de>
15 years agomd: remove needless setting of thread->timeout in raid10_quiesce
NeilBrown [Mon, 14 Dec 2009 01:49:54 +0000 (12:49 +1100)]
md: remove needless setting of thread->timeout in raid10_quiesce

As bitmap_create and bitmap_destroy already set thread->timeout
as appropriate, there is no need to do it in raid10_quiesce.
There is a possible need to wake the thread after the timeout
has been set low, but it is better to do that where the timeout
is actually set low, in bitmap_create.

Signed-off-by: NeilBrown <neilb@suse.de>
15 years agomd: change daemon_sleep to be in 'jiffies' rather than 'seconds'.
NeilBrown [Mon, 14 Dec 2009 01:49:53 +0000 (12:49 +1100)]
md: change daemon_sleep to be in 'jiffies' rather than 'seconds'.

This removes a lot of multiplications by HZ.

Signed-off-by: NeilBrown <neilb@suse.de>
15 years agomd: move offset, daemon_sleep and chunksize out of bitmap structure
NeilBrown [Mon, 14 Dec 2009 01:49:53 +0000 (12:49 +1100)]
md: move offset, daemon_sleep and chunksize out of bitmap structure

... and into bitmap_info.  These are all configuration parameters
that need to be set before the bitmap is created.

Signed-off-by: NeilBrown <neilb@suse.de>
15 years agomd: collect bitmap-specific fields into one structure.
NeilBrown [Mon, 14 Dec 2009 01:49:52 +0000 (12:49 +1100)]
md: collect bitmap-specific fields into one structure.

In preparation for making bitmap fields configurable via sysfs,
start tidying up by making a single structure to contain the
configuration fields.

Signed-off-by: NeilBrown <neilb@suse.de>
15 years agomd/raid1: add takeover support for raid5->raid1
NeilBrown [Mon, 14 Dec 2009 01:49:51 +0000 (12:49 +1100)]
md/raid1: add takeover support for raid5->raid1

A 2-device raid5 array can now be converted to raid1.

Signed-off-by: NeilBrown <neilb@suse.de>
15 years agomd: add honouring of suspend_{lo,hi} to raid1.
NeilBrown [Mon, 14 Dec 2009 01:49:51 +0000 (12:49 +1100)]
md: add honouring of suspend_{lo,hi} to raid1.

This will allow us to stop writeout to portions of the array
while they are resynced by someone else - e.g. another node in
a cluster.

Signed-off-by: NeilBrown <neilb@suse.de>
15 years agomd/raid5: don't complete make_request on barrier until writes are scheduled
NeilBrown [Mon, 14 Dec 2009 01:49:50 +0000 (12:49 +1100)]
md/raid5: don't complete make_request on barrier until writes are scheduled

The post-barrier-flush is sent by md as soon as make_request on the
barrier write completes.  For raid5, the data might not be in the
per-device queues yet.  So for barrier requests, wait for any
pre-reading to be done so that the request will be in the per-device
queues.

We use the 'preread_active' count to check that nothing is still in
the preread phase, and delay the decrement of this count until after
write requests have been submitted to the underlying devices.

Signed-off-by: NeilBrown <neilb@suse.de>
15 years agomd: support barrier requests on all personalities.
NeilBrown [Mon, 14 Dec 2009 01:49:49 +0000 (12:49 +1100)]
md: support barrier requests on all personalities.

Previously barriers were only supported on RAID1.  This is because
other levels require synchronisation across all devices and so needed
a different approach.
Here is that approach.

When a barrier arrives, we send a zero-length barrier to every active
device.  When that completes - and if the original request was not
empty -  we submit the barrier request itself (with the barrier flag
cleared) and then submit a fresh load of zero length barriers.

The barrier request itself is asynchronous, but any subsequent
request will block until the barrier completes.
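
The ordering described above can be written down as a short Python
sketch. It only lists the submissions in order; it is not the md
implementation, and barrier_steps and the step labels are hypothetical
names chosen for this example.

```python
def barrier_steps(devices, request_is_empty):
    """Return the ordered submissions for one incoming barrier request."""
    # First run: a zero-length barrier to every active device.
    steps = [("zero-length barrier", dev) for dev in devices]
    if not request_is_empty:
        # The payload is submitted with the barrier flag cleared ...
        steps.append(("write without barrier flag", "array"))
        # ... followed by a fresh run of zero-length barriers.
        steps += [("zero-length barrier", dev) for dev in devices]
    return steps
```

An empty barrier on a two-device array yields two steps; a non-empty
one yields five, with the payload in the middle.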

The reason for clearing the barrier flag is that a barrier request is
allowed to fail.  If we pass a non-empty barrier through a striping
raid level it is conceivable that part of it could succeed and part
could fail.  That would be way too hard to deal with.
So if the first run of zero length barriers succeeds, we assume all is
sufficiently well that we send the request and ignore errors in the
second run of barriers.

RAID5 needs extra care as write requests may not have been submitted
to the underlying devices yet.  So we flush the stripe cache before
proceeding with the barrier.

Note that the second set of zero-length barriers are submitted
immediately after the original request is submitted.  Thus when
a personality finds mddev->barrier to be set during make_request,
it should not return from make_request until the corresponding
per-device request(s) have been queued.

That will be done in later patches.

Signed-off-by: NeilBrown <neilb@suse.de>
Reviewed-by: Andre Noll <maan@systemlinux.org>
15 years agomd: don't reset curr_resync_completed after an interrupted resync
NeilBrown [Mon, 14 Dec 2009 01:49:49 +0000 (12:49 +1100)]
md: don't reset curr_resync_completed after an interrupted resync

If a resync/recovery/check/repair is interrupted for some reason, it
can be useful to know exactly where it got up to.
So in that case, do not clear curr_resync_completed.
Initialise it when starting a resync/recovery/... instead.

Signed-off-by: NeilBrown <neilb@suse.de>
15 years agomd: adjust resync_min usefully when resync aborts.
NeilBrown [Mon, 14 Dec 2009 01:49:48 +0000 (12:49 +1100)]
md: adjust resync_min usefully when resync aborts.

When a 'check' or 'repair' finishes we should clear resync_min
so that a future check/repair will cover the whole array (by default).
However if it is interrupted, we should update resync_min to
where we got up to, so that when the check/repair continues it
just does the remainder of the array.

Signed-off-by: NeilBrown <neilb@suse.de>
15 years agomd: remove sparse warning:symbol XXX was not declared.
NeilBrown [Mon, 14 Dec 2009 01:49:47 +0000 (12:49 +1100)]
md: remove sparse warning:symbol XXX was not declared.

Signed-off-by: NeilBrown <neilb@suse.de>
15 years agomd/raid5: remove some sparse warnings.
NeilBrown [Mon, 14 Dec 2009 01:49:47 +0000 (12:49 +1100)]
md/raid5: remove some sparse warnings.

qd_idx is previously declared and given exactly the same value!

Signed-off-by: NeilBrown <neilb@suse.de>
15 years agomd/bitmap: protect against bitmap removal while being updated.
NeilBrown [Mon, 14 Dec 2009 01:49:46 +0000 (12:49 +1100)]
md/bitmap: protect against bitmap removal while being updated.

A write intent bitmap can be removed from an array while the
array is active.
When this happens, all IO is suspended and flushed before the
bitmap is removed.
However it is possible that bitmap_daemon_work is still running to
clear old bits from the bitmap.  If it is, it can dereference the
bitmap after it has been freed.

So introduce a new mutex to protect bitmap_daemon_work and get it
before destroying a bitmap.
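
The locking pattern can be sketched in Python with threading.Lock. This
is a user-space model under stated assumptions, not the md code: the
Array class, its fields, and the check for a missing bitmap under the
lock are illustrative choices.

```python
import threading


class Array:
    """Hypothetical array whose bitmap may be removed while it is active."""

    def __init__(self, bits):
        self.bitmap = list(bits)
        self.mutex = threading.Lock()  # protects self.bitmap

    def daemon_work(self):
        # Models bitmap_daemon_work: only touch the bitmap under the mutex.
        with self.mutex:
            if self.bitmap is None:
                return False  # bitmap already removed, nothing to clear
            self.bitmap = [0] * len(self.bitmap)  # clear old bits
            return True

    def destroy_bitmap(self):
        # Taking the mutex first waits for any running daemon_work to finish.
        with self.mutex:
            self.bitmap = None
```

Because both paths take the same mutex, the daemon can never observe a
bitmap that is halfway through being torn down.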

This is suitable for any current -stable kernel.

Signed-off-by: NeilBrown <neilb@suse.de>
Cc: stable@kernel.org