openwrt/staging/blogic.git
14 years agoConvert simple loops over superblocks to list_for_each_entry_safe
Al Viro [Tue, 23 Mar 2010 00:09:33 +0000 (20:09 -0400)]
Convert simple loops over superblocks to list_for_each_entry_safe

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
14 years agoget rid of restarts in sync_filesystems()
Al Viro [Mon, 22 Mar 2010 23:56:42 +0000 (19:56 -0400)]
get rid of restarts in sync_filesystems()

At the same time we can kill s_need_restart and local mutex in there.
__put_super() made public for a while; will be gone later.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
14 years agoLeave superblocks on s_list until the end
Al Viro [Mon, 22 Mar 2010 23:36:35 +0000 (19:36 -0400)]
Leave superblocks on s_list until the end

We used to remove from s_list and s_instances at the same
time.  So let's *not* do the former and skip superblocks
that have empty s_instances in the loops over s_list.

The next step, of course, will be to get rid of the rescan logic
in those loops.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
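
The "leave it on s_list, skip it if s_instances is empty" idea from the commit above can be illustrated in plain userspace C. This is only a sketch: `struct sb`, `on_instances` and `count_live()` are hypothetical stand-ins, with a boolean modelling what the kernel would test via the emptiness of `s_instances`.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct super_block. */
struct sb {
    struct sb *next;     /* models the link on the global s_list */
    bool on_instances;   /* false models an empty s_instances list */
    int id;
};

/* Count the superblocks a walker should act on. Dying superblocks stay
 * on the list and are skipped instead of being unlinked early. */
int count_live(const struct sb *head)
{
    int live = 0;
    for (const struct sb *p = head; p != NULL; p = p->next) {
        if (!p->on_instances)
            continue;    /* being shut down: ignore it */
        live++;
    }
    return live;
}
```

Because nothing is unlinked while a walker may be looking at it, the walker never has to restart from the head of the list, which is what the follow-up commits rely on.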
14 years agoSaner locking around deactivate_super()
Al Viro [Mon, 22 Mar 2010 19:22:31 +0000 (15:22 -0400)]
Saner locking around deactivate_super()

Make sure that s_umount is acquired *before* we drop the final
active reference; we still have the fast path (atomic_dec_unless)
and we have gotten rid of the window between the moment when
s_active hits zero and s_umount is acquired.  Which simplifies
the living hell out of grab_super() and inotify pin_to_kill()
stuff.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
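
A rough userspace model of the ordering described in the commit above, using C11 atomics. Everything here is an assumption made for illustration: `dec_unless()` mimics a "decrement unless the value equals N" primitive, and the `umount_held` flag merely stands in for holding s_umount; it is not the kernel code.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Decrement *v unless it currently equals `unless`.
 * Returns true if the decrement happened. */
bool dec_unless(atomic_int *v, int unless)
{
    int old = atomic_load(v);
    while (old != unless) {
        if (atomic_compare_exchange_weak(v, &old, old - 1))
            return true;
        /* A failed CAS reloaded `old`; the loop re-checks it. */
    }
    return false;
}

/* Hypothetical, simplified superblock. */
struct sb {
    atomic_int s_active;
    bool umount_held;    /* models holding s_umount for writing */
    bool shut_down;
};

void deactivate(struct sb *s)
{
    if (dec_unless(&s->s_active, 1))
        return;                      /* fast path: not the last reference */
    s->umount_held = true;           /* take the lock first ... */
    if (atomic_fetch_sub(&s->s_active, 1) == 1)
        s->shut_down = true;         /* ... then drop the final reference */
    s->umount_held = false;
}
```

The point of the ordering is that no observer can see `s_active == 0` without the lock already being held by the thread doing the teardown.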
14 years agoget rid of S_BIAS
Al Viro [Mon, 22 Mar 2010 12:53:19 +0000 (08:53 -0400)]
get rid of S_BIAS

use atomic_inc_not_zero(&sb->s_active) instead of playing games with
checking ->s_count > S_BIAS

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
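
For readers unfamiliar with the "increment unless zero" pattern named in the commit above, here is a minimal userspace sketch built on C11 atomics. The function name `inc_not_zero()` is chosen for this example and is not the kernel implementation.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Take a reference only if the count is still non-zero.
 * Returns true if the reference was taken. */
bool inc_not_zero(atomic_int *v)
{
    int old = atomic_load(v);
    while (old != 0) {
        if (atomic_compare_exchange_weak(v, &old, old + 1))
            return true;
        /* A failed CAS reloaded `old`; retry unless it dropped to 0. */
    }
    return false;
}
```

A caller that gets `false` knows the object is already on its way out and must not be used, which removes the need for a bias constant to tell "active" from "merely referenced".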
14 years agoget rid of open-coded grab_super() in get_active_super()
Al Viro [Mon, 22 Mar 2010 02:34:11 +0000 (22:34 -0400)]
get rid of open-coded grab_super() in get_active_super()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
14 years agosb_entry() has been killed a couple of years ago and resurrected on mismerge
Al Viro [Sun, 21 Mar 2010 23:24:23 +0000 (19:24 -0400)]
sb_entry() has been killed a couple of years ago and resurrected on mismerge

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
14 years agoceph: should use deactivate_locked_super() on failure exits
Al Viro [Sun, 21 Mar 2010 23:22:29 +0000 (19:22 -0400)]
ceph: should use deactivate_locked_super() on failure exits

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
14 years agoClean ecryptfs ->get_sb() up
Al Viro [Sun, 21 Mar 2010 16:24:29 +0000 (12:24 -0400)]
Clean ecryptfs ->get_sb() up

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
14 years agofix a couple of ecryptfs leaks
Al Viro [Sun, 21 Mar 2010 02:32:26 +0000 (22:32 -0400)]
fix a couple of ecryptfs leaks

First of all, get_sb_nodev() grabs anon dev minor and we
never free it in ecryptfs ->kill_sb().  Moreover, on one
of the failure exits in ecryptfs_get_sb() we leak things -
it happens before we set ->s_root and ->put_super() won't
be called in that case.  Solution: kill ->put_super(), do
all that stuff in ->kill_sb().  And use kill_anon_sb() instead
of generic_shutdown_super() to deal with anon dev leak.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
14 years agoSimplify devpts_get_sb() failure exits
Al Viro [Sun, 21 Mar 2010 01:57:43 +0000 (21:57 -0400)]
Simplify devpts_get_sb() failure exits

postpone simple_set_mnt() until we know we won't fail.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
14 years agoremove incorrect comment in do_emergency_remount
Christoph Hellwig [Mon, 1 Feb 2010 20:55:52 +0000 (21:55 +0100)]
remove incorrect comment in do_emergency_remount

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
14 years agoclean DCACHE_CANT_MOUNT in d_delete()
Al Viro [Fri, 21 May 2010 20:11:04 +0000 (16:11 -0400)]
clean DCACHE_CANT_MOUNT in d_delete()

We set the "it's dead, don't mount on it" flag _and_ do not remove it if
we turn the damn thing negative and leave it around.  And if it goes
positive afterwards, well...

Fortunately, there's only one place where that needs to be caught:
only d_delete() can turn the sucker negative without immediately freeing
it; all other places that can lead to ->d_iput() call are followed by
unconditionally freeing struct dentry in question.  So the fix is obvious:

Addresses https://bugzilla.kernel.org/show_bug.cgi?id=16014
Reported-by: Adam Tkac <vonsch@gmail.com>
Tested-by: Adam Tkac <vonsch@gmail.com>
Cc: <stable@kernel.org> [2.6.34.x]
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
14 years agoMerge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux...
Linus Torvalds [Fri, 21 May 2010 18:17:43 +0000 (11:17 -0700)]
Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-udf-2.6

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-udf-2.6:
  udf: BKL ioctl pushdown

14 years agoMerge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
Linus Torvalds [Fri, 21 May 2010 18:17:05 +0000 (11:17 -0700)]
Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (92 commits)
  powerpc: Remove unused 'protect4gb' boot parameter
  powerpc: Build-in e1000e for pseries & ppc64_defconfig
  powerpc/pseries: Make request_ras_irqs() available to other pseries code
  powerpc/numa: Use ibm,architecture-vec-5 to detect form 1 affinity
  powerpc/numa: Set a smaller value for RECLAIM_DISTANCE to enable zone reclaim
  powerpc: Use smt_snooze_delay=-1 to always busy loop
  powerpc: Remove check of ibm,smt-snooze-delay OF property
  powerpc/kdump: Fix race in kdump shutdown
  powerpc/kexec: Fix race in kexec shutdown
  powerpc/kexec: Speedup kexec hash PTE tear down
  powerpc/pseries: Add hcall to read 4 ptes at a time in real mode
  powerpc: Use more accurate limit for first segment memory allocations
  powerpc/kdump: Use chip->shutdown to disable IRQs
  powerpc/kdump: CPUs assume the context of the oopsing CPU
  powerpc/crashdump: Do not fail on NULL pointer dereferencing
  powerpc/eeh: Fix oops when probing in early boot
  powerpc/pci: Check devices status property when scanning OF tree
  powerpc/vio: Switch VIO Bus PM to use generic helpers
  powerpc: Avoid bad relocations in iSeries code
  powerpc: Use common cpu_die (fixes SMP+SUSPEND build)
  ...

14 years agoMerge branch 'drm-for-2.6.35' of git://git.kernel.org/pub/scm/linux/kernel/git/airlie...
Linus Torvalds [Fri, 21 May 2010 18:14:52 +0000 (11:14 -0700)]
Merge branch 'drm-for-2.6.35' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6

* 'drm-for-2.6.35' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: (207 commits)
  drm/radeon/kms/pm/r600: select the mid clock mode for single head low profile
  drm/radeon: fix power supply kconfig interaction.
  drm/radeon/kms: record object that have been list reserved
  drm/radeon: AGP memory is only I/O if the aperture can be mapped by the CPU.
  drm/radeon/kms: don't default display priority to high on rs4xx
  drm/edid: fix typo in 1600x1200@75 mode
  drm/nouveau: fix i2c-related init table handlers
  drm/nouveau: support init table i2c device identifier 0x81
  drm/nouveau: ensure we've parsed i2c table entry for INIT_*I2C* handlers
  drm/nouveau: display error message for any failed init table opcode
  drm/nouveau: fix init table handlers to return proper error codes
  drm/nv50: support fractional feedback divider on newer chips
  drm/nv50: fix monitor detection on certain chipsets
  drm/nv50: store full dcb i2c entry from vbios
  drm/nv50: fix suspend/resume with DP outputs
  drm/nv50: output calculated crtc pll when debugging on
  drm/nouveau: dump pll limits entries when debugging is on
  drm/nouveau: bios parser fixes for eDP boards
  drm/nouveau: fix a nouveau_bo dereference after it's been destroyed
  drm/nv40: remove some completed ctxprog TODOs
  ...

14 years agoMerge branch 'dbg-early-merge' of git://git.kernel.org/pub/scm/linux/kernel/git/jwess...
Linus Torvalds [Fri, 21 May 2010 18:10:41 +0000 (11:10 -0700)]
Merge branch 'dbg-early-merge' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb

* 'dbg-early-merge' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb:
  echi-dbgp: Add kernel debugger support for the usb debug port
  earlyprintk,vga,kdb: Fix \b and \r for earlyprintk=vga with kdb
  kgdboc: Add ekgdboc for early use of the kernel debugger
  x86,early dr regs,kgdb: Allow kernel debugger early dr register access
  x86,kgdb: Implement early hardware breakpoint debugging
  x86, kgdb, init: Add early and late debug states
  x86, kgdb: early trap init for early debug

14 years agoMerge branch 'kdb-merge' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel...
Linus Torvalds [Fri, 21 May 2010 18:08:05 +0000 (11:08 -0700)]
Merge branch 'kdb-merge' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb

* 'kdb-merge' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb: (25 commits)
  kdb,debug_core: Allow the debug core to receive a panic notification
  MAINTAINERS: update kgdb, kdb, and debug_core info
  debug_core,kdb: Allow the debug core to process a recursive debug entry
  printk,kdb: capture printk() when in kdb shell
  kgdboc,kdb: Allow kdb to work on a non open console port
  kgdb: Add the ability to schedule a breakpoint via a tasklet
  mips,kgdb: kdb low level trap catch and stack trace
  powerpc,kgdb: Introduce low level trap catching
  x86,kgdb: Add low level debug hook
  kgdb: remove post_primary_code references
  kgdb,docs: Update the kgdb docs to include kdb
  kgdboc,keyboard: Keyboard driver for kdb with kgdb
  kgdb: gdb "monitor" -> kdb passthrough
  sparc,sunzilog: Add console polling support for sunzilog serial driver
  sh,sh-sci: Use NO_POLL_CHAR in the SCIF polled console code
  kgdb,8250,pl011: Return immediately from console poll
  kgdb: core changes to support kdb
  kdb: core for kgdb back end (2 of 2)
  kdb: core for kgdb back end (1 of 2)
  kgdb,blackfin: Add in kgdb_arch_set_pc for blackfin
  ...

14 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid
Linus Torvalds [Fri, 21 May 2010 17:51:03 +0000 (10:51 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: (59 commits)
  HID: fix up 'EMBEDDED' mess in Kconfig
  HID: roccat: cleanup preprocessor macros
  HID: roccat: refactor special event handling
  HID: roccat: fix special button support
  HID: roccat: Correctly mark init and exit functions
  HID: hidraw: Use Interrupt Endpoint for OUT Transfers if Available
  HID: hid-samsung: remove redundant key mappings
  HID: add omitted hid-zydacron.c file
  HID: hid-samsung: add support for Creative Desktop Wireless 6000
  HID: picolcd: Eliminate use after free
  HID: Zydacron Remote Control driver
  HID: Use kmemdup
  HID: magicmouse: fix input registration
  HID: make Prodikeys driver standalone config option
  HID: Prodikeys PC-MIDI HID Driver
  HID: hidraw: fix indentation
  HID: ntrig: add filtering module parameters
  HID: ntrig: add sysfs access to filter parameters
  HID: ntrig: add sensitivity and responsiveness support
  HID: add multi-input quirk for eGalax Touchcontroller
  ...

14 years agoMerge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux...
Linus Torvalds [Fri, 21 May 2010 17:50:28 +0000 (10:50 -0700)]
Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6: (31 commits)
  dquot: Detect partial write error to quota file in write_blk() and add printk_ratelimit for quota error messages
  ocfs2: Fix lock inversion in quotas during umount
  ocfs2: Use __dquot_transfer to avoid lock inversion
  ocfs2: Fix NULL pointer deref when writing local dquot
  ocfs2: Fix estimate of credits needed for quota allocation
  ocfs2: Fix quota locking
  ocfs2: Avoid unnecessary block mapping when refreshing quota info
  ocfs2: Do not map blocks from local quota file on each write
  quota: Refactor dquot_transfer code so that OCFS2 can pass in its references
  quota: unify quota init condition in setattr
  quota: remove sb_has_quota_active in get/set_info
  quota: unify ->set_dqblk
  quota: unify ->get_dqblk
  ext3: make barrier options consistent with ext4
  quota: Make quota stat accounting lockless.
  suppress warning: "quotatypes" defined but not used
  ext3: Fix waiting on transaction during fsync
  jbd: Provide function to check whether transaction will issue data barrier
  ufs: add ufs speciffic ->setattr call
  BKL: Remove BKL from ext2 filesystem
  ...

14 years agoMerge branch 'omap-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind...
Linus Torvalds [Fri, 21 May 2010 17:50:00 +0000 (10:50 -0700)]
Merge branch 'omap-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap-2.6

* 'omap-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap-2.6: (113 commits)
  omap4: Add support for i2c init
  omap: Fix i2c platform init code for omap4
  OMAP2 clock: fix recursive spinlock attempt when CONFIG_CPU_FREQ=y
  OMAP powerdomain, hwmod, omap_device: add some credits
  OMAP4 powerdomain: Support LOWPOWERSTATECHANGE for powerdomains
  OMAP3 clock: add support for setting the divider for sys_clkout2 using clk_set_rate
  OMAP4 powerdomain: Fix pwrsts flags for ALWAYS ON domains
  OMAP: timers: Fix clock source names for OMAP4
  OMAP4 clock: Support clk_set_parent
  OMAP4: PRCM: Add offset defines for all CM registers
  OMAP4: PRCM: Add offset defines for all PRM registers
  OMAP4: PRCM: Remove duplicate definition of base addresses
  OMAP4: PRM: Remove MPU internal code name and apply PRCM naming convention
  OMAP4: CM: Remove non-functional registers in ES1.0
  OMAP: hwmod: Replace WARN by pr_warning for clockdomain check
  OMAP: hwmod: Rename hwmod name for the MPU
  OMAP: hwmod: Do not exit the iteration if one clock init failed
  OMAP: hwmod: Replace WARN by pr_warning if clock lookup failed
  OMAP: hwmod: Remove IS_ERR check with omap_clk_get_by_name return value
  OMAP: hwmod: Fix wrong pointer iteration in oh->slaves
  ...

14 years agoMerge branch 'i2c-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvar...
Linus Torvalds [Fri, 21 May 2010 17:49:43 +0000 (10:49 -0700)]
Merge branch 'i2c-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging

* 'i2c-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
  i2c-nforce2: Remove redundant error messages on ACPI conflict
  i2c: Use <linux/io.h> instead of <asm/io.h>
  i2c-algo-pca: Fix coding style issues
  i2c-dev: Fix all coding style issues
  i2c-core: Fix some coding style issues
  i2c-gpio: Move initialization code to subsys_initcall()
  i2c-parport: Make template structure const
  i2c-dev: Remove unnecessary casts
  at24: Fall back to byte or word reads if needed
  i2c-stub: Expose the default functionality flags
  i2c/scx200_acb: Make PCI device ids constant
  i2c-i801: Fix all checkpatch warnings
  i2c-i801: All newer devices have all the optional features
  i2c-i801: Let the user disable selected driver features

14 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6
Linus Torvalds [Fri, 21 May 2010 17:48:48 +0000 (10:48 -0700)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6: (25 commits)
  serial: Tidy REMOTE_DEBUG
  serial: isicomm: handle running out of slots
  serial: bfin_sport_uart: Use resource size to fix off-by-one error
  tty: fix obsolete comment on tty_insert_flip_string_fixed_flag
  serial: Add driver for the Altera UART
  serial: Add driver for the Altera JTAG UART
  serial: timbuart: make sure last byte is sent when port is closed
  serial: two branches the same in timbuart_set_mctrl()
  serial: uartlite: move from byte accesses to word accesses
  tty: n_gsm: depends on NET
  tty: n_gsm line discipline
  serial: TTY: new ldiscs for staging
  serial: bfin_sport_uart: drop redundant cpu depends
  serial: bfin_sport_uart: drop the experimental markings
  serial: bfin_sport_uart: pull in bfin_sport.h for SPORT defines
  serial: bfin_sport_uart: only enable SPORT TX if data is to be sent
  serial: bfin_sport_uart: drop useless status masks
  serial: bfin_sport_uart: zero sport_uart_port if allocated dynamically
  serial: bfin_sport_uart: protect changes to uart_port
  serial: bfin_sport_uart: add support for CTS/RTS via GPIOs
  ...

14 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6
Linus Torvalds [Fri, 21 May 2010 17:48:32 +0000 (10:48 -0700)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6: (38 commits)
  net: Expose all network devices in a namespaces in sysfs
  hotplug: netns aware uevent_helper
  kobj: Send hotplug events in the proper namespace.
  netlink: Implment netlink_broadcast_filtered
  net/sysfs: Fix the bitrot in network device kobject namespace support
  netns: Teach network device kobjects which namespace they are in.
  kobject: Send hotplug events in all network namespaces
  driver-core: fix Typo in drivers/base/core.c for CONFIG_MODULE
  pci: check caps from sysfs file open to read device dependent config space
  sysfs: add struct file* to bin_attr callbacks
  sysfs: Remove usage of S_BIAS to avoid merge conflict with the vfs tree
  sysfs: Don't use enums in inline function declaration.
  sysfs-namespaces: add a high-level Documentation file
  sysfs: Comment sysfs directory tagging logic
  driver core: Implement ns directory support for device classes.
  sysfs: Implement sysfs_delete_link
  sysfs: Add support for tagged directories with untagged members.
  sysfs: Implement sysfs tagged directory support.
  kobj: Add basic infrastructure for dealing with namespaces.
  sysfs: Remove double free sysfs_get_sb
  ...

14 years agointerrupt.h: fix fatal kernel-doc error
Randy Dunlap [Fri, 21 May 2010 16:03:01 +0000 (09:03 -0700)]
interrupt.h: fix fatal kernel-doc error

Fix kernel-doc fatal error:
/** beginning a non-kernel-doc comment block:
(That alone does not kill kernel-doc, but the 'enum' was
totally confusing to it.)

Error(/lnx/src/TMP/linux-2.6.34-git6//include/linux/interrupt.h:88): cannot understand prototype: 'enum '
make[2]: *** [Documentation/DocBook/genericirq.xml] Error 1

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agodquot: Detect partial write error to quota file in write_blk() and add printk_ratelim...
Jiaying Zhang [Mon, 17 May 2010 16:36:03 +0000 (18:36 +0200)]
dquot: Detect partial write error to quota file in write_blk() and add printk_ratelimit for quota error messages

This patch changes quota_tree.c:write_blk() to detect errors caused by a
partial write to the quota file, and adds a macro that rate-limits printed
quota error messages so that a corrupted quota file won't fill up dmesg.

Signed-off-by: Jiaying Zhang <jiayingz@google.com>
Signed-off-by: Jan Kara <jack@suse.cz>
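
The partial-write check described in the commit above boils down to comparing the returned byte count against the requested length. The sketch below is a userspace illustration only: `quota_write_fn`, `write_blk_checked()` and the two demo writers are hypothetical names, and the rate-limited logging part is left out.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical writer callback: returns bytes written or a negative errno. */
typedef ssize_t (*quota_write_fn)(const char *buf, size_t len);

/* Returns 0 on a full write and a negative errno otherwise. A short
 * write is reported as -EIO instead of being mistaken for success. */
int write_blk_checked(quota_write_fn write_fn, const char *buf, size_t len)
{
    ssize_t ret = write_fn(buf, len);
    if (ret < 0)
        return (int)ret;     /* pass the underlying error through */
    if ((size_t)ret != len)
        return -EIO;         /* partial write */
    return 0;
}

/* Demo writers used to exercise both outcomes. */
ssize_t full_writer(const char *buf, size_t len)
{
    (void)buf;
    return (ssize_t)len;
}

ssize_t short_writer(const char *buf, size_t len)
{
    (void)buf;
    return (ssize_t)(len / 2);
}
```

With these stubs, `write_blk_checked(short_writer, "abcd", 4)` yields `-EIO`, whereas a check that only tested for a negative return value would have reported success.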
14 years agoocfs2: Fix lock inversion in quotas during umount
Jan Kara [Thu, 13 May 2010 20:14:53 +0000 (22:14 +0200)]
ocfs2: Fix lock inversion in quotas during umount

We cannot cancel delayed work from ocfs2_local_free_info because that is called
with dqonoff_mutex held and the work it cancels requires dqonoff_mutex to
finish. Cancel the work before acquiring dqonoff_mutex.

Acked-by: Joel Becker <Joel.Becker@oracle.com>
Signed-off-by: Jan Kara <jack@suse.cz>
14 years agoocfs2: Use __dquot_transfer to avoid lock inversion
Jan Kara [Thu, 13 May 2010 18:18:45 +0000 (20:18 +0200)]
ocfs2: Use __dquot_transfer to avoid lock inversion

dquot_transfer() acquires own references to dquots via dqget(). Thus it waits
for dq_lock which creates a lock inversion because dq_lock ranks above
transaction start but transaction is already started in ocfs2_setattr(). Fix
the problem by passing own references directly to __dquot_transfer.

Acked-by: Joel Becker <Joel.Becker@oracle.com>
Signed-off-by: Jan Kara <jack@suse.cz>
14 years agoocfs2: Fix NULL pointer deref when writing local dquot
Jan Kara [Thu, 13 May 2010 16:05:15 +0000 (18:05 +0200)]
ocfs2: Fix NULL pointer deref when writing local dquot

commit_dqblk() can write quota info to global file. That is actually a bad
thing to do because if we are just modifying local quota file, we are not
prepared (do not hold proper locks, do not have transaction credits) to do
a modification of the global quota file. So do not use commit_dqblk() and
instead call our writing function directly.

Acked-by: Joel Becker <Joel.Becker@oracle.com>
Signed-off-by: Jan Kara <jack@suse.cz>
14 years agoocfs2: Fix estimate of credits needed for quota allocation
Jan Kara [Tue, 11 May 2010 15:04:14 +0000 (17:04 +0200)]
ocfs2: Fix estimate of credits needed for quota allocation

We were missing reservation of a journal credit for modification of quota
file inode when creating new dquot structure in the global quota file.

Acked-by: Joel Becker <Joel.Becker@oracle.com>
Signed-off-by: Jan Kara <jack@suse.cz>
14 years agoocfs2: Fix quota locking
Jan Kara [Wed, 31 Mar 2010 14:25:37 +0000 (16:25 +0200)]
ocfs2: Fix quota locking

OCFS2 had three issues with quota locking:
a) When reading dquot from global quota file, we started a transaction while
   holding dqio_mutex which is prone to deadlocks because other paths do it
   the other way around
b) During ocfs2_sync_dquot we were not protected against concurrent writers
   on the same node. Because we first copy data to local buffer, a race
   could happen resulting in old data being written to global quota file and
   thus causing quota inconsistency after a crash.
c) ip_alloc_sem of quota files was acquired while a transaction is started
   in ocfs2_quota_write which can deadlock because we first get ip_alloc_sem
   and then start a transaction when extending quota files.

We fix the problem a) by pulling all necessary code to ocfs2_acquire_dquot
and ocfs2_release_dquot. Thus we no longer depend on generic dquot_acquire
to do the locking and can force proper lock ordering.

Problems b) and c) are fixed by locking i_mutex and ip_alloc_sem of
global quota file in ocfs2_lock_global_qf and removing ip_alloc_sem from
ocfs2_quota_read and ocfs2_quota_write.

Acked-by: Joel Becker <Joel.Becker@oracle.com>
Signed-off-by: Jan Kara <jack@suse.cz>
14 years agoocfs2: Avoid unnecessary block mapping when refreshing quota info
Jan Kara [Wed, 28 Apr 2010 17:04:29 +0000 (19:04 +0200)]
ocfs2: Avoid unnecessary block mapping when refreshing quota info

The position of global quota file info does not change. So we do not have
to do logical -> physical block translation every time we reread it from
disk. Thus we can also avoid taking ip_alloc_sem.

Acked-by: Joel Becker <Joel.Becker@oracle.com>
Signed-off-by: Jan Kara <jack@suse.cz>
14 years agoocfs2: Do not map blocks from local quota file on each write
Jan Kara [Tue, 27 Apr 2010 22:22:30 +0000 (00:22 +0200)]
ocfs2: Do not map blocks from local quota file on each write

There is no need to map offset of local dquot structure to on disk block
in each quota write. It is enough to map it just once and store the physical
block number in quota structure in memory. Moreover this simplifies locking
as we do not have to take ip_alloc_sem from quota write path.

Acked-by: Joel Becker <Joel.Becker@oracle.com>
Signed-off-by: Jan Kara <jack@suse.cz>
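
One way to picture "map once, remember the physical block" from the commit above is a cached field filled in on first use. This is a simplified sketch under assumptions: `struct local_dquot`, `map_block()` and the lazy lookup are invented for illustration, and the actual patch may populate the cached value at a different point.

```c
#include <stdint.h>

#define BLOCK_UNMAPPED UINT64_MAX

/* Hypothetical in-memory quota structure. */
struct local_dquot {
    uint64_t logical_blk;
    uint64_t phys_blk;    /* cached; BLOCK_UNMAPPED until first lookup */
};

int lookups;              /* counts calls to the expensive mapping */

/* Hypothetical stand-in for the logical -> physical block lookup. */
uint64_t map_block(uint64_t logical)
{
    lookups++;
    return logical + 1000;
}

/* Return the physical block, doing the mapping at most once. */
uint64_t dquot_phys_blk(struct local_dquot *dq)
{
    if (dq->phys_blk == BLOCK_UNMAPPED)
        dq->phys_blk = map_block(dq->logical_blk);
    return dq->phys_blk;
}
```

Since later writes only read the cached field, the write path no longer needs whatever lock protects the mapping itself, which is the locking simplification the commit message mentions.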
14 years agoquota: Refactor dquot_transfer code so that OCFS2 can pass in its references
Jan Kara [Thu, 13 May 2010 17:58:50 +0000 (19:58 +0200)]
quota: Refactor dquot_transfer code so that OCFS2 can pass in its references

Currently, __dquot_transfer() acquires its own references of dquot structures
that will be put into inode. But for OCFS2, this creates a lock inversion
between dq_lock (waited on in dqget) and transaction start (started in
ocfs2_setattr). Currently, deadlock is impossible because dq_lock is acquired
only during dquot_acquire and dquot_release and we already hold a reference to
dquot structures in ocfs2_setattr so neither of these functions can be called
while we call dquot_transfer. But this is rather subtle and it is hard to teach
lockdep about it. So provide __dquot_transfer function that can be passed dquot
references directly. OCFS2 can then pass acquired dquot references directly to
__dquot_transfer with proper locking.

Signed-off-by: Jan Kara <jack@suse.cz>
14 years agoquota: unify quota init condition in setattr
Dmitry Monakhov [Thu, 8 Apr 2010 18:04:20 +0000 (22:04 +0400)]
quota: unify quota init condition in setattr

Quota must be initialized if a size or uid/gid change is requested.
But initialization is performed in two different places:
in the case of i_size, the file system is responsible for dquot init,
but in the case of uid/gid, init is called internally in
dquot_transfer().
This ambiguity makes the code harder to understand.
Let's move this logic to one common helper function.

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Jan Kara <jack@suse.cz>
14 years agoquota: remove sb_has_quota_active in get/set_info
Christoph Hellwig [Fri, 7 May 2010 16:35:40 +0000 (12:35 -0400)]
quota: remove sb_has_quota_active in get/set_info

The methods already do these checks, so remove them in the quotactl
implementation to allow non-VFS quota implementations to also support
these calls.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
14 years agoquota: unify ->set_dqblk
Christoph Hellwig [Thu, 6 May 2010 21:05:17 +0000 (17:05 -0400)]
quota: unify ->set_dqblk

Pass the larger struct fs_disk_quota to the ->set_dqblk operation so
that the Q_SETQUOTA and Q_XSETQUOTA operations can be implemented
with a single filesystem operation and we can retire the ->set_xquota
operation.  The additional information (RT-subvolume accounting and
warn counts) is left zero for the VFS quota implementation.

Add new fieldmask values for setting the number of blocks and inodes,
which is required for the VFS quota but wasn't for XFS.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
14 years agoquota: unify ->get_dqblk
Christoph Hellwig [Thu, 6 May 2010 21:04:58 +0000 (17:04 -0400)]
quota: unify ->get_dqblk

Pass the larger struct fs_disk_quota to the ->get_dqblk operation so
that the Q_GETQUOTA and Q_XGETQUOTA operations can be implemented
with a single filesystem operation and we can retire the ->get_xquota
operation.  The additional information (RT-subvolume accounting and
warn counts) is left zero for the VFS quota implementation.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
14 years agoext3: make barrier options consistent with ext4
Eric Sandeen [Fri, 30 Apr 2010 16:09:34 +0000 (11:09 -0500)]
ext3: make barrier options consistent with ext4

ext4 was updated to accept barrier/nobarrier mount options
in addition to the older barrier=0/1.  The barrier story
is complex enough, we should help people by making the options
the same at least, even if the defaults are different.

This patch allows the barrier/nobarrier mount options for ext3,
while keeping nobarrier the default.

It also unconditionally displays barrier status in show_options,
and prints a message at mount time if barriers are not enabled,
just as ext4 does.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
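
The four option spellings mentioned in the commit above can be handled by one small parser. The sketch below is a userspace illustration with a hypothetical `parse_barrier_opt()`; it accepts only the exact strings shown, which is an assumption and may be stricter than the real mount option parser.

```c
#include <string.h>

/* Returns 0 and stores 0 or 1 in *barrier when opt is a barrier option,
 * or -1 when opt is not one of the recognised spellings. */
int parse_barrier_opt(const char *opt, int *barrier)
{
    if (strcmp(opt, "barrier") == 0 || strcmp(opt, "barrier=1") == 0) {
        *barrier = 1;
        return 0;
    }
    if (strcmp(opt, "nobarrier") == 0 || strcmp(opt, "barrier=0") == 0) {
        *barrier = 0;
        return 0;
    }
    return -1;
}
```

Accepting both the bare and the `=0/1` forms keeps older mount command lines working while letting the same options be used on ext3 and ext4.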
14 years agoquota: Make quota stat accounting lockless.
Dmitry Monakhov [Mon, 26 Apr 2010 16:03:33 +0000 (20:03 +0400)]
quota: Make quota stat accounting lockless.

Quota stats are a mostly-written data structure. Let's allocate a percpu
bucket for each value.

NOTE: the dqstats_read() function is racy against dqstats_{inc,dec}
and may return an inconsistent value. But this is OK since absolute
accuracy is not required.

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Jan Kara <jack@suse.cz>
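
The per-bucket counting scheme from the commit above can be modelled with a plain two-dimensional array. This sketch is single-threaded and all names (`NR_BUCKETS`, `stat_inc()`, `stat_dec()`, `stat_read()`) are made up for illustration; in a real per-CPU setup each CPU would touch only its own row.

```c
#define NR_BUCKETS 4    /* stands in for the number of CPUs */
#define NR_STATS   2

enum { ST_LOOKUPS, ST_DROPS };

long buckets[NR_BUCKETS][NR_STATS];

/* Writers touch only their own bucket, so no shared lock is needed. */
void stat_inc(int cpu, int stat) { buckets[cpu][stat]++; }
void stat_dec(int cpu, int stat) { buckets[cpu][stat]--; }

/* Readers sum all buckets. With concurrent writers the sum is only
 * approximate, which matches the NOTE in the commit message. */
long stat_read(int stat)
{
    long sum = 0;
    for (int cpu = 0; cpu < NR_BUCKETS; cpu++)
        sum += buckets[cpu][stat];
    return sum;
}
```

An individual bucket may go negative (an increment on one CPU, the matching decrement on another); only the sum is meaningful.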
14 years agosuppress warning: "quotatypes" defined but not used
Sergey Senozhatsky [Mon, 26 Apr 2010 10:09:26 +0000 (12:09 +0200)]
suppress warning: "quotatypes" defined but not used

Suppress compilation warning: "quotatypes" defined but not used.
quotatypes is used only when CONFIG_QUOTA_DEBUG or CONFIG_PRINT_QUOTA_WARNING
is/are defined.

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
14 years agoext3: Fix waiting on transaction during fsync
Jan Kara [Thu, 15 Apr 2010 20:24:26 +0000 (22:24 +0200)]
ext3: Fix waiting on transaction during fsync

log_start_commit() returns 1 only when it started a transaction
commit. Thus in case transaction commit is already running, we
fail to wait for the commit to finish. Fix the issue by always
waiting for the commit regardless of the log_start_commit return
value.

Signed-off-by: Jan Kara <jack@suse.cz>
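
The bug fixed by the commit above is easy to reproduce in a toy model. The code below is a sketch, not jbd code: `struct journal`, `start_commit()` and `wait_commit()` are hypothetical stand-ins that keep only the "returns 1 only if this call started the commit" behaviour described in the message.

```c
#include <stdbool.h>

/* Hypothetical, heavily simplified journal state. */
struct journal {
    bool commit_running;  /* a commit is already in flight */
    int waits;            /* how many times a caller waited for it */
};

/* Returns 1 only if this call started the commit. */
int start_commit(struct journal *j)
{
    if (j->commit_running)
        return 0;
    j->commit_running = true;
    return 1;
}

/* Models blocking until the commit has finished. */
void wait_commit(struct journal *j)
{
    j->waits++;
    j->commit_running = false;
}

/* Old pattern: skips the wait if someone else started the commit. */
void fsync_buggy(struct journal *j)
{
    if (start_commit(j))
        wait_commit(j);
}

/* Fixed pattern: always wait, whatever start_commit() returned. */
void fsync_fixed(struct journal *j)
{
    start_commit(j);
    wait_commit(j);
}
```

Starting from a journal whose commit is already running, `fsync_buggy()` returns while the commit is still in flight, whereas `fsync_fixed()` waits for it.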
14 years agojbd: Provide function to check whether transaction will issue data barrier
Jan Kara [Thu, 15 Apr 2010 20:16:24 +0000 (22:16 +0200)]
jbd: Provide function to check whether transaction will issue data barrier

Provide a function which returns whether a transaction with given tid
will send a barrier to the filesystem device. The function will be used
by ext3 to detect whether fsync needs to send a separate barrier or not.

Signed-off-by: Jan Kara <jack@suse.cz>
14 years agoufs: add ufs speciffic ->setattr call
Dmitry Monakhov [Wed, 14 Apr 2010 22:56:58 +0000 (00:56 +0200)]
ufs: add ufs speciffic ->setattr call

Generic setattr is no longer responsible for quota transfer.
Use ufs_setattr for all ufs inodes.

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Jan Kara <jack@suse.cz>
14 years agoBKL: Remove BKL from ext2 filesystem
Jan Blunck [Wed, 14 Apr 2010 12:38:39 +0000 (14:38 +0200)]
BKL: Remove BKL from ext2 filesystem

The BKL is still used in ext2_put_super(), ext2_fill_super(), ext2_sync_fs(),
ext2_remount() and ext2_write_inode(). Of these calls, ext2_put_super(),
ext2_fill_super() and ext2_remount() are protected against each other by
the struct super_block s_umount rw semaphore. The call in ext2_write_inode()
could only protect the modification of the ext2_sb_info through
ext2_update_dynamic_rev() against concurrent ext2_sync_fs() or ext2_remount().
ext2_fill_super() and ext2_put_super() can be left out because you need a
valid filesystem reference in all three cases, which you do not have when
you are in one of these functions.

If the BKL is only protecting the modification of the ext2_sb_info it can
safely be removed since this is protected by the struct ext2_sb_info s_lock.

Signed-off-by: Jan Blunck <jblunck@suse.de>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Jan Kara <jack@suse.cz>
14 years agoext2: Add ext2_sb_info s_lock spinlock
Jan Blunck [Wed, 14 Apr 2010 12:38:38 +0000 (14:38 +0200)]
ext2: Add ext2_sb_info s_lock spinlock

Add a spinlock that protects against concurrent modifications of
s_mount_state, s_blocks_last, s_overhead_last and the content of the
superblock's buffer pointed to by sbi->s_es. The spinlock is now used in
ext2_xattr_update_super_block() which was setting the
EXT2_FEATURE_COMPAT_EXT_ATTR flag on the superblock without protection
before. Likewise the spinlock is used in ext2_show_options() to have a
consistent view of the mount options.

This is a preparation patch for removing the BKL from ext2 in the next
patch.

Signed-off-by: Jan Blunck <jblunck@suse.de>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Jan Kara <jack@suse.cz>
Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Jan Kara <jack@suse.cz>
14 years agoext2: Move ext2_write_super() out of ext2_setup_super()
Jan Blunck [Wed, 14 Apr 2010 12:38:37 +0000 (14:38 +0200)]
ext2: Move ext2_write_super() out of ext2_setup_super()

Move ext2_write_super() out of ext2_setup_super() as a preparation for the
next patch that adds a new lock for superblock fields.

Signed-off-by: Jan Blunck <jblunck@suse.de>
Signed-off-by: Jan Kara <jack@suse.cz>
14 years agoext2: Fold ext2_commit_super() into ext2_sync_super()
Jan Blunck [Wed, 14 Apr 2010 12:38:36 +0000 (14:38 +0200)]
ext2: Fold ext2_commit_super() into ext2_sync_super()

Both functions originally did similar things except that ext2_sync_super()
is returning after the call to sync_dirty_buffer(sbh). Therefore this
patch adds a wait flag to tell ext2_sync_super() if it has to call
sync_dirty_buffer() to wait for in-progress I/O to finish.

Signed-off-by: Jan Blunck <jblunck@suse.de>
Signed-off-by: Jan Kara <jack@suse.cz>
14 years agoext2: Remove duplicate code from ext2_sync_fs()
Jan Blunck [Wed, 14 Apr 2010 12:38:35 +0000 (14:38 +0200)]
ext2: Remove duplicate code from ext2_sync_fs()

Depending on the state (valid or unchecked) of the filesystem either
ext2_sync_super() or ext2_commit_super() is called. If the filesystem is
currently valid (it is checked), we first mark it unchecked and afterwards
duplicate the work that ext2_sync_super() is doing later. Therefore this
patch removes the duplicate code and calls ext2_sync_super() directly after
marking the filesystem unchecked.

Signed-off-by: Jan Blunck <jblunck@suse.de>
Signed-off-by: Jan Kara <jack@suse.cz>
14 years agoext2: Set the write time in ext2_sync_fs()
Jan Blunck [Wed, 14 Apr 2010 12:38:34 +0000 (14:38 +0200)]
ext2: Set the write time in ext2_sync_fs()

This is probably a typo since the write time should actually be updated by
ext2_sync_fs() instead of the mount time.

Signed-off-by: Jan Blunck <jblunck@suse.de>
Signed-off-by: Jan Kara <jack@suse.cz>
14 years agoext2: Use ext2_clear_super_error() in ext2_sync_fs()
Jan Blunck [Wed, 14 Apr 2010 12:38:33 +0000 (14:38 +0200)]
ext2: Use ext2_clear_super_error() in ext2_sync_fs()

ext2_sync_fs() used to duplicate the code from ext2_clear_super_error().

Signed-off-by: Jan Blunck <jblunck@suse.de>
Signed-off-by: Jan Kara <jack@suse.cz>
14 years agoext3: init statistics after journal recovery v2
Dmitry Monakhov [Mon, 12 Apr 2010 19:46:00 +0000 (23:46 +0400)]
ext3: init statistics after journal recovery v2

Currently block/inode/dir counters are initialized before journal was
recovered. In fact after journal recovery this info will probably
change which results in incorrect numbers returned from statfs(2).
BUG:#15768

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Jan Kara <jack@suse.cz>
14 years agoext2: remove useless call to brelse() in ext2_free_inode()
Francis Moreau [Thu, 8 Apr 2010 09:35:17 +0000 (11:35 +0200)]
ext2: remove useless call to brelse() in ext2_free_inode()

This patch removes a useless call to brelse(bitmap_bh) since at that
point bitmap_bh is NULL and slightly cleans up bitmap_bh handling.

Signed-off-by: Francis Moreau <francis.moro@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
14 years agoquota: optimize mark_dirty logic
Dmitry Monakhov [Sat, 27 Mar 2010 12:15:38 +0000 (15:15 +0300)]
quota: optimize mark_dirty logic

- Skip locking if quota is dirty already.
- Return old quota state to help fs-specific implementations to optimize
  case where quota was dirty already.
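
The two points above can be sketched in user-space C. This is only an
illustration under stated assumptions, not the kernel code: struct toy_dquot,
toy_mark_dirty() and the lock_acquisitions counter are hypothetical stand-ins
for the quota structure, the mark-dirty helper and the list lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a quota structure. */
struct toy_dquot {
    atomic_bool dirty;       /* set once the quota has unsaved changes */
    int lock_acquisitions;   /* counts how often the slow path "locked" */
};

/*
 * Mark the quota dirty and return its previous state:
 * 1 if it was dirty already, 0 if this call made it dirty.
 */
int toy_mark_dirty(struct toy_dquot *dq)
{
    assert(dq);

    /* Fast path: already dirty, so skip the lock entirely. */
    if (atomic_load(&dq->dirty))
        return 1;

    dq->lock_acquisitions++;   /* stand-in for taking the list lock */
    bool was_dirty = atomic_exchange(&dq->dirty, true);
    /* Real code would queue the quota on a dirty list here when
     * was_dirty is false, and then drop the lock. */
    return was_dirty ? 1 : 0;
}
```

A caller can use the return value to skip its own bookkeeping when the quota
was dirty before the call.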

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Jan Kara <jack@suse.cz>
14 years agoext2: Avoid loading bitmaps for full groups during block allocation
Jan Kara [Mon, 29 Mar 2010 11:55:39 +0000 (13:55 +0200)]
ext2: Avoid loading bitmaps for full groups during block allocation

There is no point in loading bitmap for groups which are completely full.
This causes noticeable performance problems (and memory pressure) on small
systems with large full filesystem
(http://marc.info/?l=linux-ext4&m=126843108314310&w=2).

Port of the same ext3 patch.

Signed-off-by: Jan Kara <jack@suse.cz>
14 years agoext3: Avoid loading bitmaps for full groups during block allocation
Frans van de Wiel [Mon, 15 Mar 2010 18:29:34 +0000 (19:29 +0100)]
ext3: Avoid loading bitmaps for full groups during block allocation

There is no point in loading bitmap for groups which are completely full.
This causes noticeable performance problems (and memory pressure) on small
systems with large full filesystem
(http://marc.info/?l=linux-ext4&m=126843108314310&w=2).

Jan Kara: Added a comment and changed check to use cpu-endian value.
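
A minimal user-space sketch of the idea, assuming a simplified group
descriptor: struct toy_group_desc, toy_le16_to_host() and
toy_find_group_with_space() are hypothetical names, and the bitmap_loads
counter stands in for reading a block bitmap from disk. The free count is
converted from its little-endian on-disk form before it is tested.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified group descriptor. */
struct toy_group_desc {
    uint8_t free_blocks_le[2];   /* free block count, little-endian on disk */
    int bitmap_loads;            /* how often the bitmap was "loaded" */
};

/* Convert a little-endian 16-bit on-disk value to host byte order. */
static uint16_t toy_le16_to_host(const uint8_t b[2])
{
    return (uint16_t)(b[0] | (b[1] << 8));
}

/* Return the index of the first group with free blocks, or -1 if all are full. */
int toy_find_group_with_space(struct toy_group_desc *g, size_t n)
{
    assert(g || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (toy_le16_to_host(g[i].free_blocks_le) == 0)
            continue;            /* full group: never touch its bitmap */
        g[i].bitmap_loads++;     /* stand-in for loading the block bitmap */
        return (int)i;
    }
    return -1;
}
```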

Signed-off-by: "Frans van de Wiel" <fvdw@fvdw.eu>
Signed-off-by: Jan Kara <jack@suse.cz>
14 years agoFix networking tree iscsi_tcp.c mis-merge
Linus Torvalds [Fri, 21 May 2010 16:48:36 +0000 (09:48 -0700)]
Fix networking tree iscsi_tcp.c mis-merge

The removal of the 'waitqueue_active()' test in commit d7d05548a6
("[SCSI] iscsi_tcp: fix relogin/shutdown hang") got incorrectly resolved
by David when he back-merged the main git tree into the networking tree
in commit 278554bd65 ("Merge branch 'master' of master.kernel.org:...").

There was a content conflict due to 'sock->sk->sk_sleep' being changed
into 'sk_sleep(sock->sk)' in the networking tree, but David didn't pick
up the iscsi change from the main tree.

Reported-by: James Bottomley <James.Bottomley@suse.de>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agoi2c-nforce2: Remove redundant error messages on ACPI conflict
Chase Douglas [Fri, 21 May 2010 16:41:01 +0000 (18:41 +0200)]
i2c-nforce2: Remove redundant error messages on ACPI conflict

The ACPI subsystem strictly checks for resource conflicts. When there's
a conflict, it outputs a warning message with all the details needed to
properly diagnose the underlying issue. However, the i2c-nforce2 driver
also prints its own message. Not only is the message redundant, it is at
the KERN_ERR level, which overrides some bootsplash screens for no good
reason. This change removes the two lines that print out the error
messages.

Signed-off-by: Chase Douglas <chase.douglas@canonical.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
14 years agoi2c: Use <linux/io.h> instead of <asm/io.h>
H Hartley Sweeten [Fri, 21 May 2010 16:41:01 +0000 (18:41 +0200)]
i2c: Use <linux/io.h> instead of <asm/io.h>

As warned by checkpatch.pl, <linux/io.h> should be used instead of
<asm/io.h>.

Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
14 years agoi2c-algo-pca: Fix coding style issues
Farid Hammane [Fri, 21 May 2010 16:41:00 +0000 (18:41 +0200)]
i2c-algo-pca: Fix coding style issues

Fix up some coding style issues. i2c-algo-pca.c has been built
successfully after applying this patch and the binary object is
still exactly the same. Other issues found by checkpatch.pl were
voluntarily not fixed, either to keep readability, or because of
false positive errors.

Signed-off-by: Farid Hammane <farid.hammane@gmail.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
14 years agoi2c-dev: Fix all coding style issues
Farid Hammane [Fri, 21 May 2010 16:40:59 +0000 (18:40 +0200)]
i2c-dev: Fix all coding style issues

Fix all coding style issues found by checkpatch.pl.

Signed-off-by: Farid Hammane <farid.hammane@gmail.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
14 years agoi2c-core: Fix some coding style issues
Farid Hammane [Fri, 21 May 2010 16:40:58 +0000 (18:40 +0200)]
i2c-core: Fix some coding style issues

Fix up coding style issues found by the checkpatch.pl tool.

Signed-off-by: Farid Hammane <farid.hammane@gmail.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
14 years agoi2c-gpio: Move initialization code to subsys_initcall()
Marek Szyprowski [Fri, 21 May 2010 16:40:58 +0000 (18:40 +0200)]
i2c-gpio: Move initialization code to subsys_initcall()

A GPIO-driven I2C bus can be used for controlling the PMIC chip. One
example of such a configuration is the Samsung Aquila board.

This patch moves initialization code to subsys_initcall() to ensure
that the i2c bus is available early so the regulators can be quickly
probed and available for other devices on their probe() call.

Such solution has been proposed by Mark Brown to fix the problem of
the regulators not being available on the peripheral device probe():
http://lists.infradead.org/pipermail/linux-arm-kernel/2010-March/011971.html

Cc: Mark Brown <broonie@opensource.wolfsonmicro.com>
Reviewed-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Acked-by: Wolfram Sang <w.sang@pengutronix.de>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
14 years agoi2c-parport: Make template structure const
Jean Delvare [Fri, 21 May 2010 16:40:57 +0000 (18:40 +0200)]
i2c-parport: Make template structure const

parport_algo_data is a template so it can be marked const.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
14 years agoi2c-dev: Remove unnecessary casts
H Hartley Sweeten [Fri, 21 May 2010 16:40:57 +0000 (18:40 +0200)]
i2c-dev: Remove unnecessary casts

The private_data member of struct file is a void *, there is no need
to cast it.

Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
14 years agoat24: Fall back to byte or word reads if needed
Jean Delvare [Fri, 21 May 2010 16:40:57 +0000 (18:40 +0200)]
at24: Fall back to byte or word reads if needed

Increase the portability of the at24 driver by letting it read from
EEPROM chips connected to cheap SMBus controllers that support neither
raw I2C messages nor even I2C block reads. All SMBus controllers
should support either word reads or byte reads, so read support
becomes universal, much like with the legacy "eeprom" driver.

Obviously, this only works with EEPROM chips up to AT24C16, that use
8-bit offset addressing. 16-bit offset addressing is almost impossible
to support on SMBus controllers.

I did not add universal support for writes, as I had no immediate need
for this, but it could be added later if needed (with the same
performance issue as byte and word reads have, of course.)
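
The read fallback can be sketched as follows. This is a user-space
illustration, not the driver code: the TOY_CAP_* bits, enum toy_read_mode and
both functions are hypothetical, whereas a real driver would query the adapter
with i2c_check_functionality(). The sketch also assumes that a raw I2C read
needs a single transfer and that a block read moves at most 32 bytes.

```c
#include <stddef.h>

/* Hypothetical capability bits of a bus controller. */
enum {
    TOY_CAP_RAW_I2C    = 1u << 0,
    TOY_CAP_BLOCK_READ = 1u << 1,
    TOY_CAP_WORD_READ  = 1u << 2,
    TOY_CAP_BYTE_READ  = 1u << 3,
};

enum toy_read_mode { TOY_READ_NONE, TOY_READ_BYTE, TOY_READ_WORD,
                     TOY_READ_BLOCK, TOY_READ_RAW };

/* Pick the most capable read method the controller offers. */
enum toy_read_mode toy_pick_read_mode(unsigned int caps)
{
    if (caps & TOY_CAP_RAW_I2C)
        return TOY_READ_RAW;
    if (caps & TOY_CAP_BLOCK_READ)
        return TOY_READ_BLOCK;
    if (caps & TOY_CAP_WORD_READ)
        return TOY_READ_WORD;
    if (caps & TOY_CAP_BYTE_READ)
        return TOY_READ_BYTE;
    return TOY_READ_NONE;
}

/* Number of bus transfers needed to read len bytes in the given mode. */
size_t toy_transfers_needed(enum toy_read_mode mode, size_t len)
{
    size_t chunk;

    switch (mode) {
    case TOY_READ_RAW:   return len ? 1 : 0;   /* assumption: one message */
    case TOY_READ_BLOCK: chunk = 32; break;
    case TOY_READ_WORD:  chunk = 2;  break;
    case TOY_READ_BYTE:  chunk = 1;  break;
    default:             return 0;             /* no usable read method */
    }
    return (len + chunk - 1) / chunk;          /* round up */
}
```

The transfer counts show where the performance cost of the word and byte
fallbacks comes from.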

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Reviewed-by: Wolfram Sang <w.sang@pengutronix.de>
Cc: Konstantin Lazarev <klazarev@sbcglobal.net>
14 years agoi2c-stub: Expose the default functionality flags
Jean Delvare [Fri, 21 May 2010 16:40:56 +0000 (18:40 +0200)]
i2c-stub: Expose the default functionality flags

It is easier to adjust the flags when you know their default value.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Mark M. Hoffman <mhoffman@lightlink.com>
14 years agoi2c/scx200_acb: Make PCI device ids constant
Jean Delvare [Fri, 21 May 2010 16:40:56 +0000 (18:40 +0200)]
i2c/scx200_acb: Make PCI device ids constant

Make PCI device ids constant as we just did for many other i2c bus
drivers already.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Márton Németh <nm127@freemail.hu>
14 years agoi2c-i801: Fix all checkpatch warnings
Ivo Manca [Fri, 21 May 2010 16:40:55 +0000 (18:40 +0200)]
i2c-i801: Fix all checkpatch warnings

Fix all checkpatch warnings. No functional changes are made.

Signed-off-by: Ivo Manca <pinkel@gmail.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
14 years agoi2c-i801: All newer devices have all the optional features
Jean Delvare [Fri, 21 May 2010 16:40:55 +0000 (18:40 +0200)]
i2c-i801: All newer devices have all the optional features

Only the oldest devices lack some of the features supported by this
driver. List them explicitly, and default to all features enabled for
all other chips, including the ones added through sysfs. This will
make future driver maintenance easier.

In the unlikely event of a not yet supported device not implementing
all the features, one can always use the disable_features module
parameter to prevent the driver from attempting to use them.
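
A minimal sketch of the resulting policy, in user-space C. The feature bits,
the device ids in toy_old_chips[] and the function name are all hypothetical
and do not describe any real chip; only the shape of the logic (explicit list
of old devices, everything enabled otherwise, then the user's mask applied)
follows the description above.

```c
#include <stddef.h>

/* Hypothetical optional feature bits. */
enum {
    TOY_FEAT_PEC          = 1u << 0,
    TOY_FEAT_BLOCK_BUFFER = 1u << 1,
    TOY_FEAT_BLOCK_PROC   = 1u << 2,
    TOY_FEAT_I2C_BLOCK    = 1u << 3,
    TOY_FEAT_ALL          = 0xF,
};

/* Only the oldest devices are listed, with the features they lack. */
static const struct {
    unsigned int device_id;   /* made-up ids, for illustration only */
    unsigned int missing;
} toy_old_chips[] = {
    { 0x1001, TOY_FEAT_ALL },
    { 0x1002, TOY_FEAT_BLOCK_PROC | TOY_FEAT_I2C_BLOCK },
};

unsigned int toy_effective_features(unsigned int device_id,
                                    unsigned int disable_features)
{
    unsigned int features = TOY_FEAT_ALL;   /* default for unlisted chips */

    for (size_t i = 0; i < sizeof(toy_old_chips) / sizeof(toy_old_chips[0]); i++)
        if (toy_old_chips[i].device_id == device_id)
            features &= ~toy_old_chips[i].missing;

    /* The user-supplied mask always wins. */
    return features & ~disable_features;
}
```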

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Seth Heasley <seth.heasley@intel.com>
14 years agoi2c-i801: Let the user disable selected driver features
Jean Delvare [Fri, 21 May 2010 16:40:54 +0000 (18:40 +0200)]
i2c-i801: Let the user disable selected driver features

Let the user disable selected features normally supported by the
device. This makes it possible to work around possible driver or
hardware bugs if the feature in question doesn't work as intended
for whatever reason.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Felix Rubinstein <felixru@gmail.com>
14 years agonet: Expose all network devices in a namespaces in sysfs
Eric W. Biederman [Wed, 5 May 2010 00:36:49 +0000 (17:36 -0700)]
net: Expose all network devices in a namespaces in sysfs

This reverts commit aaf8cdc34ddba08122f02217d9d684e2f9f5d575.

Drivers like the ipw2100 call device_create_group when they
are initialized and device_remove_group when they are shutdown.
Moving them between namespaces deletes their sysfs groups early.

In particular the following call chain results.
netdev_unregister_kobject -> device_del -> kobject_del -> sysfs_remove_dir
With sysfs_remove_dir recursively deleting all of its subdirectories,
and nothing adding them back.

Ouch!

Therefore we need to call something that ultimately calls sysfs_mv_dir
as that sysfs function can move sysfs directories between namespaces
without deleting their subdirectories or their contents.   Allowing
us to avoid placing extra boiler plate into every driver that does
something interesting with sysfs.

Currently the function that provides that capability is device_rename.
That is, the code works without nasty side effects as originally written.

So remove the misguided fix for moving devices between namespaces.  The
bug in the kobject layer that inspired it has now been recognized and
fixed.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agohotplug: netns aware uevent_helper
Eric W. Biederman [Wed, 5 May 2010 00:36:48 +0000 (17:36 -0700)]
hotplug: netns aware uevent_helper

It only makes sense for uevent_helper to get events
in the initial namespaces.  Its invocation is not
per namespace and it is not clear how we could make
its invocation namespace aware.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agokobj: Send hotplug events in the proper namespace.
Eric W. Biederman [Wed, 5 May 2010 00:36:47 +0000 (17:36 -0700)]
kobj: Send hotplug events in the proper namespace.

Utilize netlink_broadcast_filtered to allow sending hotplug events
in the proper namespace.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agonetlink: Implment netlink_broadcast_filtered
Eric W. Biederman [Wed, 5 May 2010 00:36:46 +0000 (17:36 -0700)]
netlink: Implment netlink_broadcast_filtered

When netlink sockets are used to convey data that is in a namespace
we need a way to select a subset of the listening sockets to deliver
the packet to.  For the network namespace we have been doing this
by only transmitting packets in the correct network namespace.

For data belonging to other namespaces netlink_broadcast_filtered
provides a mechanism that allows us to examine the destination
socket and to decide if we should transmit the specified packet
to it.
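
The filtering mechanism can be sketched in user-space C. Everything here is a
toy model: struct toy_sock, toy_broadcast_filtered() and toy_skip_other_ns()
are hypothetical names, and the convention that a nonzero return from the
filter means "do not deliver" is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Toy listening socket: which namespace it lives in, and what it received. */
struct toy_sock {
    int ns_id;
    int received;
};

/* Filter callback: return nonzero to skip this destination socket. */
typedef int (*toy_filter_fn)(const struct toy_sock *dst, void *data);

/* Deliver one message to every listener the filter lets through. */
size_t toy_broadcast_filtered(struct toy_sock *socks, size_t n,
                              toy_filter_fn filter, void *data)
{
    size_t delivered = 0;

    assert(socks || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (filter && filter(&socks[i], data))
            continue;              /* filtered out */
        socks[i].received++;
        delivered++;
    }
    return delivered;
}

/* Example filter: skip sockets outside the namespace id that data points to. */
int toy_skip_other_ns(const struct toy_sock *dst, void *data)
{
    const int *wanted = data;

    return dst->ns_id != *wanted;
}
```

Passing a NULL filter delivers to every listener, which mirrors an unfiltered
broadcast.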

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agonet/sysfs: Fix the bitrot in network device kobject namespace support
Eric W. Biederman [Mon, 17 May 2010 04:59:45 +0000 (21:59 -0700)]
net/sysfs: Fix the bitrot in network device kobject namespace support

I had a couple of stupid bugs in:
netns: Teach network device kobjects which namespace they are in.

- I duplicated the Kconfig for the NET_NS
- The build was broken when sysfs was not compiled in

The sysfs breakage is because after I moved the operations
for the sysfs to the kobject layer, to make things cleaner
I forgot to move the ifdefs.  Oops.

I'm not quite certain how I introduced a second NET_NS Kconfig,
but it was probably a 3 way merge somewhere along the way that
did not notice that the NET_NS Kconfig option had moved and thought
that was a bug.  It probably slipped in because it used to be the
sysfs patches were the first patches in my network namespace patches.
Some things just don't go like you would expect.

Neither of these bugs actually affects anything in the common case
but they should be fixed.

Thanks to Serge for noticing they were present.

Reported-by: Serge E. Hallyn <serue@us.ibm.com>
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Acked-by: David S. Miller <davem@davemloft.net>
14 years agonetns: Teach network device kobjects which namespace they are in.
Eric W. Biederman [Wed, 5 May 2010 00:36:45 +0000 (17:36 -0700)]
netns: Teach network device kobjects which namespace they are in.

The problem.  Network devices show up in sysfs and with the network
namespace active multiple devices with the same name can show up in
the same directory, ouch!

To avoid that problem and allow existing applications in network namespaces
to see the same interface that is currently presented in sysfs, this
patch enables the tagging directory support in sysfs.

By using the network namespace pointers as tags to separate out the
sysfs directory entries we ensure that we don't have conflicts
in the directories and applications only see a limited set of
the network devices.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agokobject: Send hotplug events in all network namespaces
Eric W. Biederman [Wed, 5 May 2010 00:36:44 +0000 (17:36 -0700)]
kobject: Send hotplug events in all network namespaces

Open a copy of the uevent kernel socket in each network
namespace so we can send uevents in all network namespaces.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agodriver-core: fix Typo in drivers/base/core.c for CONFIG_MODULE
Christoph Egger [Mon, 17 May 2010 14:57:58 +0000 (16:57 +0200)]
driver-core: fix Typo in drivers/base/core.c for CONFIG_MODULE

In this code section the final S of CONFIG_MODULES was missed making
the whole check useless
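
Why such a typo goes unnoticed can be shown with a small stand-alone C
example. The #define below merely stands in for a kernel configuration with
module support enabled, and both function names are hypothetical.

```c
/* Stand-in for a configuration that has module support enabled. */
#define CONFIG_MODULES 1

int toy_modules_enabled_with_typo(void)
{
#ifdef CONFIG_MODULE    /* misspelled: this macro is never defined */
    return 1;
#else
    return 0;           /* always taken, and the compiler stays silent */
#endif
}

int toy_modules_enabled(void)
{
#ifdef CONFIG_MODULES   /* correct spelling */
    return 1;
#else
    return 0;
#endif
}
```

Testing an undefined macro with #ifdef is valid C, so the guarded code is
dropped without any warning.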

Signed-off-by: Christoph Egger <siccegge@cs.fau.de>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agopci: check caps from sysfs file open to read device dependent config space
Chris Wright [Thu, 13 May 2010 17:43:07 +0000 (10:43 -0700)]
pci: check caps from sysfs file open to read device dependent config space

The PCI config space bin_attr read handler has a hardcoded CAP_SYS_ADMIN
check to verify privileges before allowing a user to read device
dependent config space.  This is meant to protect from an unprivileged
user potentially locking up the box.

When assigning a PCI device directly to a guest with libvirt and KVM,
the sysfs config space file is chown'd to the unprivileged user that
the KVM guest will run as.  The guest needs to have full access to the
device's config space since it's responsible for driving the device.
However, despite being the owner of the sysfs file, the CAP_SYS_ADMIN
check will not allow read access beyond the config header.

With this patch we check privileges against the capabilities used when
opening the sysfs file.  This allows a privileged process to open the
file and hand it to an unprivileged process, and the unprivileged process
can still read all of the config space.
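
A minimal user-space sketch of the difference, assuming a toy file object that
remembers the opener's privilege: struct toy_file, toy_open() and
toy_readable_bytes() are hypothetical names. The 64-byte limit is the size of
the standard PCI configuration header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_CFG_HEADER_SIZE 64   /* standard PCI configuration header */

/* Toy open file: records the privilege of whoever opened it. */
struct toy_file {
    bool opener_had_sys_admin;
};

struct toy_file toy_open(bool opener_is_privileged)
{
    struct toy_file f = { opener_is_privileged };

    return f;
}

/*
 * How many bytes a read may return. The decision depends on who
 * opened the file, not on who is calling read.
 */
size_t toy_readable_bytes(const struct toy_file *f, size_t cfg_size)
{
    assert(f);

    if (f->opener_had_sys_admin)
        return cfg_size;
    return cfg_size < TOY_CFG_HEADER_SIZE ? cfg_size : TOY_CFG_HEADER_SIZE;
}
```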

Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agosysfs: add struct file* to bin_attr callbacks
Chris Wright [Thu, 13 May 2010 01:28:57 +0000 (18:28 -0700)]
sysfs: add struct file* to bin_attr callbacks

This allows bin_attr->read,write,mmap callbacks to check file specific data
(such as inode owner) as part of any privilege validation.

Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agosysfs: Remove usage of S_BIAS to avoid merge conflict with the vfs tree
Eric W. Biederman [Tue, 18 May 2010 19:58:33 +0000 (12:58 -0700)]
sysfs: Remove usage of S_BIAS to avoid merge conflict with the vfs tree

In Al's latest vfs tree the code is reworked and S_BIAS has been removed.

It turns out that checking to see if a super block is in the
middle of an unmount in sysfs_exit_ns is unnecessary because we
remove the super_block from the s_supers/s_instances list before
struct sysfs_super_info pointed to by sb->s_fs_info is freed.

For now just delete the unnecessary check to see if a superblock is in the
middle of an unmount, it isn't necessary with or without Al's changes
and it just causes a needless conflict.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agosysfs: Don't use enums in inline function declaration.
Eric W. Biederman [Wed, 5 May 2010 21:54:00 +0000 (14:54 -0700)]
sysfs: Don't use enums in inline function declaration.

It appears gcc can't cope with using an enum that is only declared in
an inline function declaration, that doesn't even use the variable
that is so declared.

Avoid the silliness and replace the enum with an int, and make gcc
happy.

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agosysfs-namespaces: add a high-level Documentation file
Serge E. Hallyn [Wed, 5 May 2010 02:45:38 +0000 (21:45 -0500)]
sysfs-namespaces: add a high-level Documentation file

The first three paragraphs are almost verbatim taken from Eric's
commit message on the patch introducing network ns tags.  The next
two paragraphs I wrote to be a brief high level overview.  The last
section is taken from the commit message on "Implement sysfs tagged
directory support", but updated.  Hopefully correctly.

Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agosysfs: Comment sysfs directory tagging logic
Serge E. Hallyn [Mon, 3 May 2010 21:23:15 +0000 (16:23 -0500)]
sysfs: Comment sysfs directory tagging logic

Add some in-line comments to explain the new infrastructure, which
was introduced to support sysfs directory tagging with namespaces.
I think an overall description someplace might be good too, but it
didn't really seem to fit into Documentation/filesystems/sysfs.txt,
which appears more geared toward users, rather than maintainers, of
sysfs.

(Tejun, please let me know if I can make anything clearer or failed
altogether to comment something that should be commented.)

Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agodriver core: Implement ns directory support for device classes.
Eric W. Biederman [Tue, 30 Mar 2010 18:31:29 +0000 (11:31 -0700)]
driver core: Implement ns directory support for device classes.

device_del and device_rename were modified to use
sysfs_delete_link and sysfs_rename_link respectively to ensure
when these operations happen on devices whose classes
are in namespace directories they work properly.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agosysfs: Implement sysfs_delete_link
Eric W. Biederman [Tue, 30 Mar 2010 18:31:28 +0000 (11:31 -0700)]
sysfs: Implement sysfs_delete_link

When removing a symlink sysfs_remove_link does not provide
enough information to figure out which tagged directory the symlink
falls in.  So I need sysfs_delete_link which is passed the target
of the symlink to delete.

sysfs_rename_link is updated to call sysfs_delete_link instead
of sysfs_remove_link as we have all of the information necessary
and the callers are interesting.

Both of these functions now have enough information to find a symlink
in a tagged directory.  The only restriction is that they must be called
before the target kobject is renamed or deleted.  If they are called
later I lose track of which tag the target kobject was marked with
and can no longer find the old symlink to remove it.

This patch was split from an earlier patch.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agosysfs: Add support for tagged directories with untagged members.
Eric W. Biederman [Tue, 30 Mar 2010 18:31:27 +0000 (11:31 -0700)]
sysfs: Add support for tagged directories with untagged members.

I had hoped to avoid this but the bonding driver adds a file
to /sys/class/net/  and the easiest way to handle that file is
to make it untagged and to register it only once.

So relax the rules on tagged directories, and make bonding work.

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agosysfs: Implement sysfs tagged directory support.
Eric W. Biederman [Tue, 30 Mar 2010 18:31:26 +0000 (11:31 -0700)]
sysfs: Implement sysfs tagged directory support.

The problem.  When implementing a network namespace I need to be able
to have multiple network devices with the same name.  Currently this
is a problem for /sys/class/net/*, /sys/devices/virtual/net/*, and
potentially a few other directories of the form /sys/ ... /net/*.

What this patch does is to add an additional tag field to the
sysfs dirent structure.  For directories that should show different
contents depending on the context such as /sys/class/net/, and
/sys/devices/virtual/net/ this tag field is used to specify the
context in which those directories should be visible.  Effectively
this is the same as creating multiple distinct directories with
the same name but internally to sysfs the result is nicer.

I am calling the concept of a single directory that looks like multiple
directories all at the same path in the filesystem tagged directories.

For the networking namespace the set of directories whose contents I need
to filter with tags can depend on the presence or absence of hotplug
hardware or which modules are currently loaded.  Which means I need
a simple race free way to setup those directories as tagged.

To achieve a race free design all tagged directories are created
and managed by sysfs itself.

Users of this interface:
- define a type in the sysfs_tag_type enumeration.
- call sysfs_register_ns_types with the type and its operations
- call sysfs_exit_ns when an individual tag is no longer valid

- Implement mount_ns() which returns the ns of the calling process
  so we can attach it to a sysfs superblock.
- Implement ktype.namespace() which returns the ns of a sysfs kobject.

Everything else is left up to sysfs and the driver layer.

For the network namespace mount_ns and namespace() are essentially
one line functions, and look to remain that.

Tags are currently represented as const void * pointers as that is
both generic, provides enough information for equality comparisons,
and is trivial to create for current users, as it is just the
existing namespace pointer.

The work needed in sysfs is more extensive.  At each directory
or symlink creation I need to check if the directory it is being
created in is a tagged directory and if so generate the appropriate
tag to place on the sysfs_dirent.  Likewise at each symlink or
directory removal I need to check if the sysfs directory it is
being removed from is a tagged directory and if so figure out
which tag goes along with the name I am deleting.

Currently only directories which hold kobjects, and
symlinks are supported.  There is not enough information
in the current file attribute interfaces to give us anything
to discriminate on which makes it useless, and there are
no potential users which makes it an uninteresting problem
to solve.
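
The core of the tag idea can be sketched in user-space C. This is a simplified
model, not the sysfs implementation: struct toy_dirent and toy_find() are
hypothetical, and the sketch requires an exact tag match, so it ignores
untagged entries.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy directory entry: a name plus an opaque tag (e.g. a namespace pointer). */
struct toy_dirent {
    const char *name;
    const void *tag;
};

/*
 * Look up an entry by name within one context. Two entries may share
 * a name as long as their tags differ; tags are only compared for equality.
 */
const struct toy_dirent *toy_find(const struct toy_dirent *dir, size_t n,
                                  const char *name, const void *tag)
{
    assert(name);

    for (size_t i = 0; i < n; i++)
        if (dir[i].tag == tag && strcmp(dir[i].name, name) == 0)
            return &dir[i];
    return NULL;
}
```

With two entries named "eth0" carrying different tags, each context sees only
its own entry, as if there were two distinct directories at the same path.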

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agokobj: Add basic infrastructure for dealing with namespaces.
Eric W. Biederman [Tue, 30 Mar 2010 18:31:25 +0000 (11:31 -0700)]
kobj: Add basic infrastructure for dealing with namespaces.

Move complete knowledge of namespaces into the kobject layer
so we can use that information when reporting kobjects to
userspace.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agosysfs: Remove double free sysfs_get_sb
Eric W. Biederman [Tue, 30 Mar 2010 23:50:26 +0000 (16:50 -0700)]
sysfs: Remove double free sysfs_get_sb

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agosysfs: Basic support for multiple super blocks
Eric W. Biederman [Tue, 30 Mar 2010 18:31:24 +0000 (11:31 -0700)]
sysfs: Basic support for multiple super blocks

Add all of the necessary boiler plate to support
multiple superblocks in sysfs.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agogenerate "change" uevent for loop device
David Zeuthen [Mon, 3 May 2010 12:08:59 +0000 (14:08 +0200)]
generate "change" uevent for loop device

Recent udev versions probe loop devices for filesystems meaning that
the /dev/disk hierarchy may contain useful entries such as

 $ ls -l /dev/disk/by-label/Fedora-12-x86_64-Live
 lrwxrwxrwx 1 root root 11 Mar 11 13:41 /dev/disk/by-label/Fedora-12-x86_64-Live -> ../../loop0

Unfortunately, no "change" uevent is generated when the loop device is
detached so the symlink persists. Additionally, no "change" uevent is
guaranteed to be generated when attaching an fd or changing capacity.
For example, user space could open the loop device O_RDONLY (in fact,
recent util-linux-ng does this) so udev's OPTIONS+="watch" machinery may
not trigger the "change" uevent.

This patch ensures that the "change" uevent is generated in all of
these cases. As a result, the /dev/disk hierarchy works as expected
for loop devices.

Signed-off-by: David Zeuthen <davidz@redhat.com>
Acked-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoDriver core: Protect device shutdown from hot unplug events.
Hugh Daschbach [Mon, 22 Mar 2010 17:36:37 +0000 (10:36 -0700)]
Driver core: Protect device shutdown from hot unplug events.

While device_shutdown() walks through devices_kset to shutdown all
devices, device unplug events may race to shutdown individual devices.
Specifically, sd_shutdown(), on behalf of fc_starget_delete(), has
been observed deleting devices during device_shutdown()'s list
traversal.  So we factor out list_for_each_entry_safe_reverse(...) in
favor of while (!list_empty(...)).
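
The loop shape can be sketched with a toy circular list in user-space C. All
names (struct toy_dev, the toy_list_* helpers, unplug_victim) are hypothetical
and no locking is shown. The point is that the tail is re-read on every
iteration, so an entry removed behind the loop's back is never dereferenced
through a stale saved pointer.

```c
#include <stddef.h>

struct toy_dev {
    struct toy_dev *prev, *next;
    int shut_down;
};

static void toy_list_init(struct toy_dev *head)
{
    head->prev = head->next = head;
}

static int toy_list_empty(const struct toy_dev *head)
{
    return head->next == head;
}

static void toy_list_add_tail(struct toy_dev *d, struct toy_dev *head)
{
    d->prev = head->prev;
    d->next = head;
    head->prev->next = d;
    head->prev = d;
}

static void toy_list_del_init(struct toy_dev *d)
{
    d->prev->next = d->next;
    d->next->prev = d->prev;
    d->prev = d->next = d;
}

/* Device that a simulated hot unplug removes while another one shuts down. */
static struct toy_dev *unplug_victim;

static void toy_shutdown_one(struct toy_dev *d)
{
    d->shut_down = 1;
    if (unplug_victim) {                  /* simulated racing unplug event */
        toy_list_del_init(unplug_victim);
        unplug_victim = NULL;
    }
}

/* Shut down devices from the tail, re-reading the tail every iteration. */
int toy_shutdown_all(struct toy_dev *head)
{
    int count = 0;

    while (!toy_list_empty(head)) {
        struct toy_dev *d = head->prev;
        toy_list_del_init(d);
        toy_shutdown_one(d);
        count++;
    }
    return count;
}

/* Returns 1 if the traversal survived b being unplugged during c's shutdown. */
int toy_demo(void)
{
    struct toy_dev head, a = {0}, b = {0}, c = {0};

    toy_list_init(&head);
    toy_list_add_tail(&a, &head);
    toy_list_add_tail(&b, &head);
    toy_list_add_tail(&c, &head);
    unplug_victim = &b;
    return toy_shutdown_all(&head) == 2 &&
           a.shut_down && !b.shut_down && c.shut_down;
}
```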

Signed-off-by: Hugh Daschbach <hdasch@broadcom.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agofirmware loader: do not allocate firmare id separately
Dmitry Torokhov [Sun, 14 Mar 2010 07:49:23 +0000 (23:49 -0800)]
firmware loader: do not allocate firmare id separately

fw_id has the same lifetime as firmware_priv, so it makes sense to move
it into the firmware_priv structure instead of allocating it separately.
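
One common way to give a name the same lifetime as its owning structure is a C99 flexible array member, sketched below in user space. Whether the patch uses exactly this layout is an assumption; struct fw_priv_sketch and both helper functions are hypothetical names, and calloc()/free() stand in for the kernel allocator.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for firmware_priv: the id lives inside the object. */
struct fw_priv_sketch {
    int loading;     /* placeholder for the structure's other state */
    char fw_id[];    /* flexible array member, sized at allocation time */
};

/* One allocation covers both the structure and the name. */
static struct fw_priv_sketch *fw_priv_create(const char *name)
{
    size_t len = strlen(name) + 1;   /* include the terminating NUL */
    struct fw_priv_sketch *p = calloc(1, sizeof(*p) + len);

    if (!p)
        return NULL;
    memcpy(p->fw_id, name, len);
    return p;
}

/* One free releases both; there is no separate id pointer to leak. */
static void fw_priv_destroy(struct fw_priv_sketch *p)
{
    free(p);
}
```

The design benefit is that the error paths shrink: there is no state in which the structure exists but its id allocation failed or was already released.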

Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agofirmware loader: split out builtin firmware handling
Dmitry Torokhov [Sun, 14 Mar 2010 07:49:18 +0000 (23:49 -0800)]
firmware loader: split out builtin firmware handling

Split builtin firmware handling into separate functions to clean up the
main body of code.

Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agofirmware loader: rely on driver core to create class attribute
Dmitry Torokhov [Sun, 14 Mar 2010 07:49:13 +0000 (23:49 -0800)]
firmware loader: rely on driver core to create class attribute

Do not create the 'timeout' attribute manually; let the driver core do it
for us. This also ensures that the attribute is cleaned up properly.

Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agofirmware class: export nowait to userspace
Johannes Berg [Mon, 29 Mar 2010 15:57:20 +0000 (17:57 +0200)]
firmware class: export nowait to userspace

When we use request_firmware_nowait(), userspace may
not want to answer negatively right away, for example
when it is answering from an initrd only. With
request_firmware(), however, it has to, so that the
kernel boot is not delayed until the request times out.

This allows userspace to differentiate between the
two in order to be able to reply negatively to async
requests only when all filesystems have been mounted
and have been checked for the requested firmware file.
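
The policy this enables in a userspace firmware helper might look like the sketch below. It is illustrative only: how the sync/async flag actually reaches userspace is not shown here, and the enum, the function name, and its three boolean inputs are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcomes for a userspace firmware helper. */
enum fw_reply {
    FW_REPLY_LOAD,    /* firmware found: hand it to the kernel */
    FW_REPLY_FAIL,    /* answer negatively now */
    FW_REPLY_DEFER    /* keep the request pending and retry later */
};

/*
 * is_async:       request came from request_firmware_nowait() (assumed input)
 * found:          the file exists on a filesystem mounted so far
 * all_fs_mounted: every filesystem that could hold firmware is mounted
 */
static enum fw_reply fw_decide(bool is_async, bool found, bool all_fs_mounted)
{
    if (found)
        return FW_REPLY_LOAD;
    if (!is_async)
        return FW_REPLY_FAIL;    /* sync caller blocks boot: fail fast */
    return all_fs_mounted ? FW_REPLY_FAIL : FW_REPLY_DEFER;
}
```

For instance, a helper running from an initrd would call fw_decide(true, false, false) for a missing file and get FW_REPLY_DEFER, whereas the same miss on a synchronous request yields FW_REPLY_FAIL.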

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agolockdep: Add novalidate class for dev->mutex conversion
Peter Zijlstra [Fri, 19 Mar 2010 00:37:42 +0000 (01:37 +0100)]
lockdep: Add novalidate class for dev->mutex conversion

The conversion of device->sem to device->mutex resulted in lockdep
warnings. Create a novalidate class for now until the driver folks
come up with separate classes. That way we have at least the basic
mutex debugging coverage.

Add a checkpatch error so the usage is reserved for device->mutex.

[ tglx: checkpatch and compile fix for LOCKDEP=n ]

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agodrivers/base: Convert dev->sem to mutex
Thomas Gleixner [Fri, 29 Jan 2010 20:39:02 +0000 (20:39 +0000)]
drivers/base: Convert dev->sem to mutex

The semaphore is semantically a mutex. Convert it to a real mutex and
fix up a few places where code was relying on semaphore.h to be included
by device.h, as well as the users of the trylock function, as its return
value is now reversed.
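
The trylock reversal is easy to get wrong: down_trylock() returns 0 when the semaphore was acquired, while mutex_trylock() returns 1 when the mutex was acquired. The user-space sketch below models only the two return conventions; struct toy_lock and every toy_* and try_probe_* function are hypothetical and provide no real mutual exclusion.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy lock: models return conventions only, not real concurrency. */
struct toy_lock { bool held; };

/* Semaphore convention (as in down_trylock): 0 = acquired, 1 = busy. */
static int toy_down_trylock(struct toy_lock *l)
{
    if (l->held)
        return 1;
    l->held = true;
    return 0;
}

/* Mutex convention (as in mutex_trylock): 1 = acquired, 0 = busy. */
static int toy_mutex_trylock(struct toy_lock *l)
{
    if (l->held)
        return 0;
    l->held = true;
    return 1;
}

static void toy_unlock(struct toy_lock *l) { l->held = false; }

/* Caller written against the semaphore API: nonzero means "give up". */
static bool try_probe_old(struct toy_lock *l)
{
    if (toy_down_trylock(l))
        return false;
    toy_unlock(l);
    return true;
}

/* Same caller after conversion: the test must be negated. */
static bool try_probe_new(struct toy_lock *l)
{
    if (!toy_mutex_trylock(l))
        return false;
    toy_unlock(l);
    return true;
}
```

A caller that swaps the function name but keeps the old `if (trylock(...))` test would bail out precisely when it holds the lock, and would never release it.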

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>