Steve Wise [Thu, 9 Apr 2009 14:09:39 +0000 (14:09 +0000)]
RDS/IW+IB: Allow max credit advertise window.
Fix hack that restricts the credit advertisement to 127.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Steve Wise [Thu, 9 Apr 2009 14:09:38 +0000 (14:09 +0000)]
RDS/IW+IB: Set the RDS_LL_SEND_FULL bit when we're throttled.
The RDS_LL_SEND_FULL bit should be set when we stop transmitting due to
flow control. Otherwise the send worker will keep trying as opposed to
sleeping until we unthrottle. Saves CPU.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andy Grover [Thu, 9 Apr 2009 14:09:37 +0000 (14:09 +0000)]
RDS: Correct some iw references in rdma_transport.c
Had some lingering instances of _iw_ variable names from when
the listen code was centralized into rdma_transport.c
Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Steve Wise [Thu, 9 Apr 2009 14:09:36 +0000 (14:09 +0000)]
RDS/IW+IB: Set recv ring low water mark to 1/2 full.
Currently the recv ring low water mark is 1/4 the depth. Performance
measurements show that this limits iWARP throughput by flow controlling
the rds-stress senders. Setting it to 1/2 seems to max the T3
performance. I tried even higher levels but that didn't help and it
started to increase the rds thread cpu utilization.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
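As an illustration of the low water mark change above, here is a minimal sketch; it is not the RDS code itself. The structure recv_ring_sketch, its fields and both helper names are hypothetical, and the sketch assumes the ring asks for a refill once the number of posted receive buffers falls to the mark.

```c
#include <assert.h>

/* Hypothetical receive ring: 'depth' slots, 'posted' of them hold buffers. */
struct recv_ring_sketch {
	unsigned int depth;
	unsigned int posted;
};

/* Low water mark expressed as a fraction of the depth: depth / divisor. */
static unsigned int ring_low_water(const struct recv_ring_sketch *r,
				   unsigned int divisor)
{
	assert(divisor != 0);
	return r->depth / divisor;
}

/* Nonzero when the ring has drained to the mark and should be refilled. */
static int ring_is_low(const struct recv_ring_sketch *r, unsigned int divisor)
{
	return r->posted <= ring_low_water(r, divisor);
}
```

With a depth of 256 and 100 buffers posted, a divisor of 4 (mark at 64) does not report the ring as low, while a divisor of 2 (mark at 128) does, so under this assumption the receiver would replenish buffers earlier.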
Tilman Schmidt [Wed, 8 Apr 2009 23:01:16 +0000 (16:01 -0700)]
ISDN: update Documentation/isdn/00-INDEX
After the merging of mISDN, state which files refer only to the
old isdn4linux subsystem. Also add a few missing files and sort
alphabetically.
Signed-off-by: Tilman Schmidt <tilman@imap.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jean Delvare [Wed, 8 Apr 2009 22:59:53 +0000 (15:59 -0700)]
sfc: Don't specify unexistent IRQ
Neither the lm90 driver nor the lm87 driver supports interrupts, so
there is no point in specifying one when declaring the devices.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dhananjay Phadke [Tue, 7 Apr 2009 22:50:48 +0000 (22:50 +0000)]
netxen: cache align register map table
Aligning register offset translation table improves performance
on rx side.
Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Amit Kumar Salecha [Tue, 7 Apr 2009 22:50:47 +0000 (22:50 +0000)]
netxen: enable GRO support
Signed-off-by: Amit Kumar Salecha <amit@netxen.com>
Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dhananjay Phadke [Tue, 7 Apr 2009 22:50:46 +0000 (22:50 +0000)]
netxen: enable rss for NX2031
Enable multiple rx rings for the older NX2031 chip; firmware 3.4.336
or newer supports this feature.
Signed-off-by: Amit Kumar Salecha <amit@netxen.com>
Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dhananjay Phadke [Tue, 7 Apr 2009 22:50:45 +0000 (22:50 +0000)]
netxen: sanitize function names
Replace superfluous wrapper functions with two macros:
NXWR32 replaces netxen_nic_reg_write, netxen_nic_write_w0,
netxen_nic_read_w1, netxen_crb_writelit_adapter.
NXRD32 replaces netxen_nic_reg_read, netxen_nic_read_w0,
netxen_nic_read_w1.
Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dhananjay Phadke [Tue, 7 Apr 2009 22:50:44 +0000 (22:50 +0000)]
netxen: annotate register access functions
o remove unnecessary length parameter since register access
width is fixed 4 byte.
o remove superfluous pci_read_normalize and pci_write_normalize
functions.
Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dhananjay Phadke [Tue, 7 Apr 2009 22:50:43 +0000 (22:50 +0000)]
netxen: allocate status rings dynamically
This reduces netxen_adapter footprint when rss (msi-x) is disabled.
Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dhananjay Phadke [Tue, 7 Apr 2009 22:50:42 +0000 (22:50 +0000)]
netxen: async link event handling
Add support for asynchronous events from firmware,
received over one of the rx rings.
Add support for event based phy interrupts, enhanced links
status reporting from firmware.
Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dhananjay Phadke [Tue, 7 Apr 2009 22:50:41 +0000 (22:50 +0000)]
netxen: defer firmware handshake
Removed duplicate firmware handshake, defer it until first
port (interface) is brought up.
Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dhananjay Phadke [Tue, 7 Apr 2009 22:50:40 +0000 (22:50 +0000)]
netxen: refactor transmit code
o move tx stuff into nx_host_tx_ring structure, this will
help managing multiple tx rings in future.
o sanitize some variable names
Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dhananjay Phadke [Tue, 7 Apr 2009 22:50:39 +0000 (22:50 +0000)]
netxen: refactor netxen_adapter
Rearrange members to align them at right offset.
Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dhananjay Phadke [Tue, 7 Apr 2009 22:50:38 +0000 (22:50 +0000)]
netxen: code cleanup
o remove unused structure defs.
o remove unnecessary includes.
o replace enums with specific #defines.
o reduce footprint of stats structure.
Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Tue, 7 Apr 2009 21:25:01 +0000 (14:25 -0700)]
Linux 2.6.30-rc1
Linus Torvalds [Tue, 7 Apr 2009 21:11:07 +0000 (14:11 -0700)]
Merge branch 'core/softlockup' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'core/softlockup' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
softlockup: make DETECT_HUNG_TASK default depend on DETECT_SOFTLOCKUP
softlockup: move 'one' to the softlockup section in sysctl.c
softlockup: ensure the task has been switched out once
softlockup: remove timestamp checking from hung_task
softlockup: convert read_lock in hung_task to rcu_read_lock
softlockup: check all tasks in hung_task
softlockup: remove unused definition for spawn_softlockup_task
softlockup: fix potential race in hung_task when resetting timeout
softlockup: fix to allow compiling with !DETECT_HUNG_TASK
softlockup: decouple hung tasks check from softlockup detection
Linus Torvalds [Tue, 7 Apr 2009 21:10:10 +0000 (14:10 -0700)]
Merge branch 'tracing-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
branch tracer, intel-iommu: fix build with CONFIG_BRANCH_TRACER=y
branch tracer: Fix for enabling branch profiling makes sparse unusable
ftrace: Correct a text align for event format output
Update /debug/tracing/README
tracing/ftrace: alloc the started cpumask for the trace file
tracing, x86: remove duplicated #include
ftrace: Add check of sched_stopped for probe_sched_wakeup
function-graph: add proper initialization for init task
tracing/ftrace: fix missing include string.h
tracing: fix incorrect return type of ns2usecs()
tracing: remove CALLER_ADDR2 from wakeup tracer
blktrace: fix pdu_len when tracing packet command requests
blktrace: small cleanup in blk_msg_write()
blktrace: NUL-terminate user space messages
tracing: move scripts/trace/power.pl to scripts/tracing/power.pl
Linus Torvalds [Tue, 7 Apr 2009 21:07:52 +0000 (14:07 -0700)]
Merge branch 'irq/threaded' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'irq/threaded' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
genirq: fix devres.o build for GENERIC_HARDIRQS=n
genirq: provide old request_irq() for CONFIG_GENERIC_HARDIRQ=n
genirq: threaded irq handlers review fixups
genirq: add support for threaded interrupts to devres
genirq: add threaded interrupt handler support
Trond Myklebust [Tue, 7 Apr 2009 21:02:53 +0000 (14:02 -0700)]
NFS: Fix the return value in nfs_page_mkwrite()
Commit c2ec175c39f62949438354f603f4aa170846aabb ("mm: page_mkwrite
change prototype to match fault") exposed a bug in the NFS
implementation of page_mkwrite. We should be returning 0 on success...
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Tue, 7 Apr 2009 18:24:19 +0000 (11:24 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/jbarnes/pci-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
PCI: pci_slot: grab refcount on slot's bus
PCI Hotplug: acpiphp: grab refcount on p2p subordinate bus
PCI: allow PCI core hotplug to remove PCI root bus
PCI: Fix oops in pci_vpd_truncate
PCI: don't corrupt enable_cnt when doing manual resource alignment
PCI: annotate pci_rescan_bus as __ref, not __devinit
PCI-IOV: fix missing kernel-doc
PCI: Setup disabled bridges even if buses are added
PCI: SR-IOV quirk for Intel 82576 NIC
Linus Torvalds [Tue, 7 Apr 2009 18:06:41 +0000 (11:06 -0700)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
loop: mutex already unlocked in loop_clr_fd()
cfq-iosched: don't let idling interfere with plugging
block: remove unused REQ_UNPLUG
cfq-iosched: kill two unused cfqq flags
cfq-iosched: change dispatch logic to deal with single requests at the time
mflash: initial support
cciss: change to discover first memory BAR
cciss: kernel scan thread for MSA2012
cciss: fix residual count for block pc requests
block: fix inconsistency in I/O stat accounting code
block: elevator quiescing helpers
Linus Torvalds [Tue, 7 Apr 2009 14:59:41 +0000 (07:59 -0700)]
Fix build errors due to CONFIG_BRANCH_TRACER=y
The code that enables branch tracing for all (non-constant) branches
plays games with the preprocessor and #define's the C 'if ()' construct
to do tracing.
That's all fine, but it fails for some unusual but valid C code that is
sometimes used in macros, notably by the intel-iommu code:
if (i=drhd->iommu, drhd->ignored) ..
because now the preprocessor complains about multiple arguments to the
'if' macro.
So make the macro expansion of this particularly horrid trick use
varargs, and handle the case of comma-expressions in if-statements. Use
another macro to do it cleanly in just one place.
This replaces a patch by David (and acked by Steven) that did this all
inside that one already-too-horrid macro.
Tested-by: Ingo Molnar <mingo@elte.hu>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
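The varargs trick described in "Fix build errors due to CONFIG_BRANCH_TRACER=y" can be sketched in user space as follows. This is a simplified illustration, not the kernel's macro: the counter traced_branches, the helper trace_if_eval(), the TRACE_IF name and the struct unit example are all hypothetical, and the sketch uses plain C99 __VA_ARGS__.

```c
#include <assert.h>

/* Hypothetical counter standing in for real branch profiling data. */
static unsigned long traced_branches;

/* Count one evaluated branch and hand back its truth value. */
static inline int trace_if_eval(int cond)
{
	traced_branches++;
	return cond;
}

/* The one place that does the work; it takes exactly one argument. */
#define TRACE_IF(cond) if (trace_if_eval(!!(cond)))

/*
 * Redefine 'if' as a variadic macro. The extra parentheses turn
 * "a, b" back into a single comma expression before TRACE_IF sees it.
 * The 'if' produced by TRACE_IF is not expanded again, because a macro
 * is never re-expanded inside its own expansion.
 */
#define if(...) TRACE_IF((__VA_ARGS__))

struct unit {
	int ignored;
	int id;
};

/* Same shape as "if (i=drhd->iommu, drhd->ignored)" in the text above. */
static int first_active(const struct unit *units, int n)
{
	int i, id;

	for (i = 0; i < n; i++) {
		if (id = units[i].id, !units[i].ignored)
			return id;
	}
	return -1;
}
```

Without the variadic parameter, the comma inside the condition would be parsed as a separator between two macro arguments, which is the preprocessor complaint the commit message mentions. Standard headers should be included before the 'if' macro is defined.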
Linus Torvalds [Tue, 7 Apr 2009 15:54:43 +0000 (08:54 -0700)]
Merge branch 'for-2.6.30' of git://git./linux/kernel/git/broonie/sound-2.6
* 'for-2.6.30' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound-2.6:
ASoC: TWL4030: Compillation error fix
Linus Torvalds [Tue, 7 Apr 2009 15:53:38 +0000 (08:53 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/tiwai/sound-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6: (36 commits)
ALSA: hda - Add VREF powerdown sequence for another board
ALSA: oss - volume control for CSWITCH and CROUTE
ALSA: hda - add missing comma in ad1884_slave_vols
sound: usb-audio: allow period sizes less than 1 ms
sound: usb-audio: save data packet interval in audioformat structure
sound: usb-audio: remove check_hw_params_convention()
sound: usb-audio: show sample format width in proc file
ASoC: fsl_dma: Pass the proper device for dma mapping routines
ASoC: Fix null dereference in ak4535_remove()
ALSA: hda - enable SPDIF output for Intel DX58SO board
ALSA: snd-atmel-abdac: increase periods_min to 6 instead of 4
ALSA: snd-atmel-abdac: replace bus_id with dev_name()
ALSA: snd-atmel-ac97c: replace bus_id with dev_name()
ALSA: snd-atmel-ac97c: cleanup registers when removing driver
ALSA: snd-atmel-ac97c: do a proper reset of the external codec
ALSA: snd-atmel-ac97c: enable interrupts to catch events for error reporting
ALSA: snd-atmel-ac97c: set correct size for buffer hardware parameter
ALSA: snd-atmel-ac97c: do not overwrite OCA and ICA when assigning channels
ALSA: snd-atmel-ac97c: remove dead break statements after return in switch case
ALSA: snd-atmel-ac97c: cleanup register definitions
...
Linus Torvalds [Tue, 7 Apr 2009 15:53:02 +0000 (08:53 -0700)]
Merge branch 'upstream-linus' of git://git./linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
sata_mv: shorten register names
sata_mv: workaround errata SATA#13
sata_mv: cosmetic renames
sata_mv: workaround errata SATA#26
sata_mv: workaround errata PCI#7
sata_mv: replace 0x1f with ATA_PIO4 (v2)
sata_mv: fix irq mask races
sata_mv: revert SoC irq breakage
libata: ahci enclosure management bios workaround
ata: Add TRIM infrastructure
ata_piix: VGN-BX297XP wants the controller power up on suspend
libata: Remove some redundant casts from pata_octeon_cf.c
pata_artop: typo
Linus Torvalds [Tue, 7 Apr 2009 15:45:12 +0000 (08:45 -0700)]
Merge branch 'i2c-for-2630-v2' of git://aeryn.fluff.org.uk/bjdooks/linux
* 'i2c-for-2630-v2' of git://aeryn.fluff.org.uk/bjdooks/linux:
i2c: imx: Make disable_delay a per-device variable
i2c: xtensa s6000 i2c driver
powerpc/85xx: i2c-mpc: use new I2C bindings for the Socates board
i2c: i2c-mpc: make I2C bus speed configurable
i2c: i2c-mpc: use dev based printout function
i2c: i2c-mpc: various coding style fixes
i2c: imx: Add missing request_mem_region in probe()
i2c: i2c-s3c2410: Initialise Samsung I2C controller early
i2c-s3c2410: Simplify bus frequency calculation
i2c-s3c2410: sda_delay should be in ns, not clock ticks
i2c: iMX/MXC support
Linus Torvalds [Tue, 7 Apr 2009 15:44:43 +0000 (08:44 -0700)]
Merge branch 'hwmon-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6
* 'hwmon-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6:
hwmon: Add Asus ATK0110 support
hwmon: (lm95241) Convert to new-style i2c driver
Alan Cox [Tue, 7 Apr 2009 14:30:57 +0000 (15:30 +0100)]
parport: Use the PCI IRQ if offered
PCI parallel port devices can IRQ share so we should stop them hogging
the line and making a mess on modern PC systems. We know the sharing
side works as the PCMCIA driver has shared the parallel port IRQ for
some time.
Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Breno Leitao [Tue, 7 Apr 2009 15:53:48 +0000 (16:53 +0100)]
tty: jsm cleanups
Here are some cleanups, mainly removing unused variables and silly
declarations.
Signed-off-by: Breno Leitao <leitao@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mike Frysinger [Tue, 7 Apr 2009 15:53:11 +0000 (16:53 +0100)]
Adjust path to gpio headers
Signed-off-by: Mike Frysinger <vapier.adi@gmail.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mike Frysinger [Tue, 7 Apr 2009 15:52:49 +0000 (16:52 +0100)]
KGDB_SERIAL_CONSOLE check for module
Depend on KGDB_SERIAL_CONSOLE being set to N rather than !Y, since it can
be built as a module.
Signed-off-by: Mike Frysinger <vapier.adi@gmail.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mike Frysinger [Tue, 7 Apr 2009 15:52:39 +0000 (16:52 +0100)]
Change KCONFIG name
Signed-off-by: Mike Frysinger <vapier.adi@gmail.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Sonic Zhang [Tue, 7 Apr 2009 15:52:26 +0000 (16:52 +0100)]
tty: Blackin CTS/RTS
Both software emulated and hardware based CTS and RTS are enabled in
serial driver.
The CTS/RTS pin connection on the BF548 UART port is defined as a modem
device, not as a host device. In order to test it under Linux, please
make a crossover UART cable that exchanges the CTS and RTS signals.
Signed-off-by: Sonic Zhang <sonic.zhang@analog.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Sonic Zhang [Tue, 7 Apr 2009 15:51:15 +0000 (16:51 +0100)]
Change hardware flow control from poll to interrupt driven
Only the CTS bit is affected.
Signed-off-by: Sonic Zhang <sonic.zhang@analog.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Christian Pellegrin [Tue, 7 Apr 2009 15:48:51 +0000 (16:48 +0100)]
Add support for the MAX3100 SPI UART.
(akpm: queued pending confirmation of the new major number)
[randy.dunlap@oracle.com: select SERIAL_CORE]
Signed-off-by: Christian Pellegrin <chripell@fsfe.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alan Cox [Tue, 7 Apr 2009 15:48:35 +0000 (16:48 +0100)]
lanana: assign a device name and numbering for MAX3100
This is a low density serial port so needs a real major/minor
Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alan Cox [Tue, 7 Apr 2009 15:48:27 +0000 (16:48 +0100)]
serqt: initial clean up pass for tty side
Avoid using port->tty where possible (makes refcount fixing easier
later).
Remove unused code (the ioctl path is not used if the device has
mget/mset functions)
Remove various un-needed typecasts and long names so I could read it to
do the changes.
Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Claudio Scordino [Tue, 7 Apr 2009 15:48:19 +0000 (16:48 +0100)]
tty: Use the generic RS485 ioctl on CRIS
Use the new general RS485 Linux data structure (introduced by Alan with
commit c26c56c0f40e200e61d1390629c806f6adaffbcc) in the Cris
architecture too (currently, Cris still uses the old private data
structure instead of the new one).
Signed-off-by: Claudio Scordino <claudio@evidence.eu.com>
Tested-by: Hinko Kocevar <hinko.kocevar@cetrtapot.si>
Tested-by: Janez Cufer <janez.cufer@cetrtapot.si>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Adrian Bunk [Tue, 7 Apr 2009 15:48:07 +0000 (16:48 +0100)]
tty: Correct inline types for tty_driver_kref_get()
tty_driver_kref_get() should be static inline and not extern inline
(the latter even changed its semantics in gcc >= 4.3).
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Miklos Szeredi [Mon, 6 Apr 2009 15:41:00 +0000 (17:41 +0200)]
splice: fix deadlock in splicing to file
There's a possible deadlock in generic_file_splice_write(),
splice_from_pipe() and ocfs2_file_splice_write():
- task A calls generic_file_splice_write()
- this calls inode_double_lock(), which locks i_mutex on both
pipe->inode and target inode
- ordering depends on inode pointers, can happen that pipe->inode is
locked first
- __splice_from_pipe() needs more data, calls pipe_wait()
- this releases lock on pipe->inode, goes to interruptible sleep
- task B calls generic_file_splice_write(), similarly to the first
- this locks pipe->inode, then tries to lock inode, but that is
already held by task A
- task A is interrupted, it tries to lock pipe->inode, but fails, as
it is already held by task B
- ABBA deadlock
Fix this by explicitly ordering locks: the outer lock must be on
target inode and the inner lock (which is later unlocked and relocked)
must be on pipe->inode. This is OK, pipe inodes and target inodes
form two nonoverlapping sets, generic_file_splice_write() and friends
are not called with a target which is a pipe.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Acked-by: Mark Fasheh <mfasheh@suse.com>
Acked-by: Jens Axboe <jens.axboe@oracle.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
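The lock ordering chosen in "splice: fix deadlock in splicing to file" can be illustrated with a toy model. This is a single-threaded sketch under stated assumptions, not the kernel code: toy_lock, toy_inode, toy_acquire(), toy_release() and splice_write_sketch() are hypothetical names, and the "lock" only records whether it is held.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy lock: records only whether it is held. Not a real mutex. */
struct toy_lock {
	bool held;
};

struct toy_inode {
	struct toy_lock i_mutex;
};

static void toy_acquire(struct toy_lock *l)
{
	assert(!l->held);
	l->held = true;
}

static void toy_release(struct toy_lock *l)
{
	assert(l->held);
	l->held = false;
}

/*
 * Fixed ordering: the target inode is always the outer lock and the
 * pipe inode is always the inner one. Waiting for data drops and
 * retakes only the inner lock, so "target, then pipe" is never
 * inverted. Returns how many times the inner lock was retaken.
 */
static int splice_write_sketch(struct toy_inode *pipe,
			       struct toy_inode *target, int waits)
{
	int relocks = 0;

	toy_acquire(&target->i_mutex);		/* outer lock */
	toy_acquire(&pipe->i_mutex);		/* inner lock */
	while (waits-- > 0) {
		toy_release(&pipe->i_mutex);	/* stand-in for pipe_wait() */
		assert(target->i_mutex.held);	/* outer lock stays held */
		toy_acquire(&pipe->i_mutex);
		relocks++;
	}
	toy_release(&pipe->i_mutex);
	toy_release(&target->i_mutex);
	return relocks;
}
```

Because every caller takes the target lock before the pipe lock, two tasks can no longer hold one lock each while waiting for the other, which is the ABBA pattern listed above.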
Ryusuke Konishi [Tue, 7 Apr 2009 02:02:00 +0000 (19:02 -0700)]
nilfs2: support nanosecond timestamp
After a review of user's feedback for finding out other compatibility
issues, I found nilfs improperly initializes timestamps in inode;
CURRENT_TIME was used there instead of CURRENT_TIME_SEC even though nilfs
didn't have nanosecond timestamps on disk. A few users gave us the report
that the tar program sometimes failed to expand symbolic links on nilfs,
and it turned out to be the cause.
Instead of making the above replacement, I've decided to support
nanosecond timestamps on this occasion. Fortunately, a needless 64-bit
field was in the nilfs_inode struct, and I found it's available for this
purpose without impact on the users.
So, this will do the enhancement and resolve the tar problem.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
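The mismatch behind "nilfs2: support nanosecond timestamp" can be sketched as follows: an in-memory timestamp with nanoseconds no longer compares equal once it has made a round trip through a seconds-only on-disk field. This is an illustration only; the struct and function names, including the mtime_nsec field, are assumptions and not the nilfs2 on-disk format.

```c
#include <assert.h>
#include <stdint.h>

/* In-memory timestamp with nanosecond resolution. */
struct ts_sketch {
	int64_t sec;
	int32_t nsec;
};

/* Hypothetical on-disk layouts; the field names are assumptions. */
struct disk_sec_only {
	uint64_t mtime;
};

struct disk_with_nsec {
	uint64_t mtime;
	uint64_t mtime_nsec;	/* spare 64-bit field reused for nanoseconds */
};

static void store_sec_only(struct disk_sec_only *d, struct ts_sketch t)
{
	d->mtime = (uint64_t)t.sec;	/* nanoseconds are dropped here */
}

static struct ts_sketch load_sec_only(const struct disk_sec_only *d)
{
	struct ts_sketch t = { (int64_t)d->mtime, 0 };
	return t;
}

static void store_with_nsec(struct disk_with_nsec *d, struct ts_sketch t)
{
	assert(t.nsec >= 0 && t.nsec < 1000000000);
	d->mtime = (uint64_t)t.sec;
	d->mtime_nsec = (uint64_t)t.nsec;
}

static struct ts_sketch load_with_nsec(const struct disk_with_nsec *d)
{
	struct ts_sketch t = { (int64_t)d->mtime, (int32_t)d->mtime_nsec };
	return t;
}

static int ts_equal(struct ts_sketch a, struct ts_sketch b)
{
	return a.sec == b.sec && a.nsec == b.nsec;
}
```

Either truncating in memory as well (the CURRENT_TIME_SEC approach) or storing the nanoseconds (the approach taken here) makes the in-memory and on-disk values agree again.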
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:59 +0000 (19:01 -0700)]
nilfs2: introduce secondary super block
The former versions didn't have extra super blocks. This improves the
weak point by introducing another super block at unused region in tail of
the partition.
This doesn't break disk format compatibility; older versions just ignore
the secondary super block, and new versions just recover it if it doesn't
exist. The partition created by an old mkfs may not have unused region,
but in that case, the secondary super block will not be added.
This doesn't make more redundant copies of the super block; that is left
for future work.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:58 +0000 (19:01 -0700)]
nilfs2: simplify handling of active state of segments
This will reduce some lines of the segment constructor. Previously, the
state was complexly controlled through a list of segments in order to keep
consistency in meta data of usage state of segments. Instead, this
presents ``calculated'' active flags to the userland cleaner program and
stops maintaining the real flag on disk.
Only by this fake flag, the cleaner cannot exactly know if each segment is
reclaimable or not. However, the recent extension of nilfs_sustat ioctl
struct (nilfs2-extend-nilfs_sustat-ioctl-struct.patch) can prevent the
cleaner from reclaiming in-use segment wrongly.
So, now I can apply this for simplification.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:57 +0000 (19:01 -0700)]
nilfs2: mark minor flag for checkpoint created by internal operation
Nilfs creates checkpoints even for garbage collection or metadata updates
such as checkpoint mode change. So, user often sees checkpoints created
only by such internal operations.
This is inconvenient in some situations. For example, application that
monitors checkpoints and changes them to snapshots, will fall into an
infinite loop because it cannot distinguish internally created
checkpoints.
This patch solves this sort of problem by adding a flag to checkpoint for
identification.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:56 +0000 (19:01 -0700)]
nilfs2: clean up sketch file
The sketch file is a file to mark checkpoints with user data. It was
experimentally introduced in the original implementation, and is now
obsolete. The file was handled differently from regular files; the file
size got truncated when a checkpoint was created.
This stops the special treatment and will treat it as a regular file.
Most users are not affected because mkfs.nilfs2 no longer makes this file.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:55 +0000 (19:01 -0700)]
nilfs2: super block operations fix endian bug
This adds a missing endian conversion of checksum field in the super
block. This fixes compatibility issue on big endian machines which will
come to surface after supporting recovery of super block.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
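The kind of conversion that "nilfs2: super block operations fix endian bug" adds can be sketched portably. In the kernel this job is done by helpers such as cpu_to_le32() and le32_to_cpu(); the functions put_le32() and get_le32() below are hypothetical user-space stand-ins, and the sketch assumes the on-disk field is little endian.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store a 32-bit value in little-endian byte order, whatever the host is. */
static void put_le32(uint8_t *out, uint32_t v)
{
	assert(out != NULL);
	out[0] = (uint8_t)(v & 0xff);
	out[1] = (uint8_t)((v >> 8) & 0xff);
	out[2] = (uint8_t)((v >> 16) & 0xff);
	out[3] = (uint8_t)((v >> 24) & 0xff);
}

/* Read a little-endian 32-bit value back into host byte order. */
static uint32_t get_le32(const uint8_t *in)
{
	assert(in != NULL);
	return (uint32_t)in[0] |
	       ((uint32_t)in[1] << 8) |
	       ((uint32_t)in[2] << 16) |
	       ((uint32_t)in[3] << 24);
}
```

A checksum written with a raw host-order store reads back correctly only on little-endian machines; going through an explicit conversion on both the write and the read path gives the same bytes on every architecture.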
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:55 +0000 (19:01 -0700)]
nilfs2: replace BUG_ON and BUG calls triggerable from ioctl
Pekka Enberg advised me:
> It would be nice if BUG(), BUG_ON(), and panic() calls would be
> converted to proper error handling using WARN_ON() calls. The BUG()
> call in nilfs_cpfile_delete_checkpoints(), for example, looks to be
> triggerable from user-space via the ioctl() system call.
This will follow the comment and keep them to a minimum.
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:54 +0000 (19:01 -0700)]
nilfs2: extend nilfs_sustat ioctl struct
This adds a new argument to the nilfs_sustat structure.
The extended field makes it possible to delete the volatile active state
of segments, which was needed to protect freshly-created segments from
garbage collection but has confused code dealing with segments. This
extension alleviates the mess and gives room for further
simplifications.
The volatile active flag is not persistent, so it's eliminable on this
occasion without affecting compatibility other than the ioctl change.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:53 +0000 (19:01 -0700)]
nilfs2: use unlocked_ioctl
Pekka Enberg suggested converting ->ioctl operations to use
->unlocked_ioctl to avoid BKL.
The conversion was verified to be safe, so I will take it on this
occasion.
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:53 +0000 (19:01 -0700)]
nilfs2: remove compat ioctl code
This removes compat code from the nilfs ioctls and applies the same
function for both .ioctl and .compat_ioctl file operations.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:52 +0000 (19:01 -0700)]
nilfs2: use fixed sized types for ioctl structures
Nilfs ioctl had structures not having fixed sized types such as:
  struct nilfs_argv {
          void *v_base;
          size_t v_nmembs;
          size_t v_size;
          int v_index;
          int v_flags;
  };
Further, some of them are wrongly aligned:
e.g.
  struct nilfs_cpmode {
          __u64 cm_cno;
          int cm_mode;
  };
The size of wrongly aligned structures varies depending on
architectures, and it breaks the identity of ioctl commands, which
leads to arch dependent errors.
Previously, these were compensated for by using compat_ioctl.
This fixes these problems and allows removal of compat ioctl.
Since this will change sizes of those structures, binary compatibility
with past utilities will break once; new utilities have to be used
instead. However, it will help avoid platform dependent
problems in the long term.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
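The alignment problem described in "nilfs2: use fixed sized types for ioctl structures" can be made concrete with a small sketch. The struct names below are hypothetical, uint64_t and uint32_t stand in for the kernel's __u64 and __u32, and the cm_pad field is an assumption about how such a layout might be padded rather than a copy of the nilfs2 header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Layout from the commit message. Its size depends on how the ABI
 * aligns 64-bit members: typically 12 bytes on i386 and 16 bytes on
 * x86-64, so the ioctl command number derived from sizeof() differs.
 */
struct cpmode_unpadded {
	uint64_t cm_cno;
	int cm_mode;
};

/* Sketch of a fixed layout: explicit widths and explicit padding. */
struct cpmode_fixed {
	uint64_t cm_cno;
	uint32_t cm_mode;
	uint32_t cm_pad;	/* hypothetical padding member */
};

/* With no implicit padding left, the size is the same on every ABI. */
static_assert(sizeof(struct cpmode_fixed) == 16,
	      "fixed layout must be 16 bytes");
static_assert(offsetof(struct cpmode_fixed, cm_mode) == 8,
	      "cm_mode must follow the 64-bit member directly");
```

Once 32-bit and 64-bit user space agree on the structure size, the same handler can serve both, which is what allows the compat code to be dropped.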
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:51 +0000 (19:01 -0700)]
nilfs2: remove timedwait ioctl command
This removes NILFS_IOCTL_TIMEDWAIT command from ioctl interface along
with the related flags and wait queue.
The command is terrible because it just sleeps in the ioctl. I prefer
to avoid this by devising means of event polling in userland program.
By reconsidering the userland GC daemon, I found this is possible
without changing behaviour of the daemon and sacrificing efficiency.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:50 +0000 (19:01 -0700)]
nilfs2: fix buggy behavior seen in enumerating checkpoints
This will fix the weird behavior of lscp command in listing continuously
created checkpoints; the output of lscp is rewound regularly for the
recent nilfs. As a result of debugging, a defect was found in
nilfs_cpfile_do_get_cpinfo() function.
Though the function can be repeatedly called to enumerate checkpoints and
it can skip invalid checkpoint entries, the index value was not carried
between successive calls.
The bug has long been present, and came to surface after applying a bugfix
nilfs2-fix-problems-of-memory-allocation-in-ioctl.patch, which increased
frequency of calling the function. The similar bugfix was already applied
for ``snapshots'' by
nilfs2-fix-gc-failure-on-volumes-keeping-numerous-snapshots.patch.
This fixes the problem by making the index argument bidirectional on the
function.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
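The "bidirectional index argument" from "nilfs2: fix buggy behavior seen in enumerating checkpoints" can be sketched as an in/out position parameter. This is a minimal illustration, not nilfs_cpfile_do_get_cpinfo() itself: struct entry_sketch and get_entries() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical checkpoint table entry. */
struct entry_sketch {
	int valid;
	unsigned long cno;
};

/*
 * Copy up to 'max' valid entries starting at index *pos. On return,
 * *pos is the index where the next call should resume, so entries that
 * were skipped as invalid are not visited (or returned) again.
 */
static size_t get_entries(const struct entry_sketch *tbl, size_t n,
			  size_t *pos, unsigned long *out, size_t max)
{
	size_t got = 0;
	size_t i;

	assert(pos != NULL);
	for (i = *pos; i < n && got < max; i++) {
		if (tbl[i].valid)
			out[got++] = tbl[i].cno;
	}
	*pos = i;	/* carry the position to the next call */
	return got;
}
```

If the caller instead advanced its own index by the number of entries returned, the skipped invalid entries would make the next call start too early and repeat results, which matches the rewinding output described above.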
Pekka Enberg [Tue, 7 Apr 2009 02:01:49 +0000 (19:01 -0700)]
nilfs2: clean up indirect function calling conventions
This cleans up the strange indirect function calling convention used in
nilfs to follow the normal kernel coding style.
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Acked-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:48 +0000 (19:01 -0700)]
nilfs2: fix improper return values of nilfs_get_cpinfo ioctl
A few tool developers asked me to fix the inconvenient return value of
the nilfs_get_cpinfo() ioctl: if the requested mode is NILFS_SNAPSHOT
and the specified start entry is not a snapshot, the ioctl unnaturally
returns one as the number of acquired snapshot items.
In addition, the ioctl function returns an ENOENT error for checkpoints
within blocks deleted by garbage collection.
Programs which enumerate snapshots have to work around these behaviors.
This resolves the inconvenience by changing the return value to zero in
both cases.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:47 +0000 (19:01 -0700)]
nilfs2: fix gc failure on volumes keeping numerous snapshots
This resolves the following failure of nilfs2 cleaner daemon:
nilfs_cleanerd[20670]: cannot clean segments: No such file or directory
nilfs_cleanerd[20670]: shutdown
When creating thousands of snapshots, the cleaner daemon occasionally died
as above due to an error returned from the kernel code.
After applying the recent patch which fixed memory allocation problems in
ioctl (Message-Id: <20081215.155840.105124170.ryusuke@osrg.net>), the
problem became more frequent.
It turned out to be a bug in the nilfs_ioctl_wrap_copy function and one of
its callback routines that reads out snapshot information: if
nilfs_ioctl_wrap_copy divided a large read request into multiple
requests, the second and later requests failed because the restart
position in the snapshot meta data was not properly advanced.
The underlying deficiency is that the callback interface cannot pass the
restart position among multiple requests. This patch fixes the issue by
allowing nilfs_ioctl_wrap_copy and the snapshot read functions to exchange
a position argument.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
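
The restart-position fix described in this commit can be sketched in
user-space C. This is only an illustration under stated assumptions:
wrap_copy, read_nonneg, read_fn and CHUNK are hypothetical names, not the
kernel's nilfs_ioctl_wrap_copy interface, and plain ints stand in for
snapshot entries. The point it shows is that the callback receives the
position by pointer and advances it, so each sub-request resumes where the
previous one stopped even when entries are skipped.

```c
#include <stddef.h>

/* Hypothetical callback type: fill buf with up to max items starting at
 * *pos, advance *pos past every slot examined, return the item count. */
typedef size_t (*read_fn)(const int *src, size_t n, size_t *pos,
                          int *buf, size_t max);

/* Example callback: only non-negative values count as "snapshots", so
 * the number of items returned differs from the number of slots scanned. */
size_t read_nonneg(const int *src, size_t n, size_t *pos,
                   int *buf, size_t max)
{
    size_t got = 0;

    while (*pos < n && got < max) {
        if (src[*pos] >= 0)
            buf[got++] = src[*pos];
        (*pos)++;
    }
    return got;
}

#define CHUNK ((size_t)2)   /* items per sub-request; arbitrary here */

/* Split a request for want items into CHUNK-sized sub-requests. */
size_t wrap_copy(const int *src, size_t n, int *dst, size_t want,
                 read_fn fn)
{
    size_t pos = 0, total = 0;

    while (total < want) {
        size_t max = want - total < CHUNK ? want - total : CHUNK;
        size_t got = fn(src, n, &pos, dst + total, max);

        if (got == 0)
            break;              /* nothing left to read */
        total += got;
    }
    return total;
}
```

If pos were passed by value, or recomputed by the wrapper as "start plus
items returned so far", the second sub-request would restart inside the
skipped region, which is the kind of failure the commit describes.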
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:46 +0000 (19:01 -0700)]
nilfs2: add maintainer
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:45 +0000 (19:01 -0700)]
nilfs2: insert explanations in gcinode file
The file gcinode.c gives buffer cache functions for on-disk blocks
moved in garbage collection. Joern Engel has suggested inserting its
explanations in the source file (Message-ID:
<20080917144146.GD8750@logfs.org> and
<20080917224953.GB14644@logfs.org>).
This follows the comment.
Cc: Joern Engel <joern@logfs.org>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:45 +0000 (19:01 -0700)]
nilfs2: avoid double error caused by nilfs_transaction_end
Pekka Enberg pointed out that the double error handling found after
nilfs_transaction_end() can be avoided by separating the abort operation:
OK, I don't understand this. The only way nilfs_transaction_end() can
fail is if we have NILFS_TI_SYNC set and we fail to construct the
segment. But why do we want to construct a segment if we don't commit?
I guess what I'm asking is why don't we have a separate
nilfs_transaction_abort() function that can't fail for the erroneous
case to avoid this double error value tracking thing?
This does the separation and renames nilfs_transaction_end() to
nilfs_transaction_commit() for clarity.
Since some calls of these functions were used just for exclusion control
against the segment constructor, they are replaced with semaphore
operations.
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
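
A minimal sketch of the commit/abort split discussed in this commit,
written as user-space C. Everything here is hypothetical (struct txn,
txn_begin, txn_commit, txn_abort, do_op and the -5 error value); it is not
the nilfs2 code. It only illustrates why a separate abort that cannot fail
leaves the caller with a single error value to track.

```c
#include <stdbool.h>

/* Hypothetical transaction state for the sketch. */
struct txn {
    int depth;       /* nesting level of begin calls */
    bool sync;       /* stands in for NILFS_TI_SYNC */
    bool media_ok;   /* whether writing the log would succeed */
};

void txn_begin(struct txn *t)
{
    t->depth++;
}

/* Commit may fail: a synchronous transaction has to write its log. */
int txn_commit(struct txn *t)
{
    t->depth--;
    if (t->depth == 0 && t->sync && !t->media_ok)
        return -5;   /* an EIO-like error from log construction */
    return 0;
}

/* Abort cannot fail: nothing is written, the nesting is just undone. */
void txn_abort(struct txn *t)
{
    t->depth--;
}

/* Caller: exactly one error value on every path. */
int do_op(struct txn *t, int op_result)
{
    txn_begin(t);
    if (op_result < 0) {
        txn_abort(t);        /* keep the original error */
        return op_result;
    }
    return txn_commit(t);    /* the only other source of failure */
}
```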
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:44 +0000 (19:01 -0700)]
nilfs2: cleanup nilfs_clear_inode
This will remove the following unnecessary locks and cleanup code in
nilfs_clear_inode():
- unnecessary protection using nilfs_transaction_begin() and
nilfs_transaction_end().
- cleanup code of i_dirty list field which is never chained
when this function is called.
- spinlock used when releasing i_bh field.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:43 +0000 (19:01 -0700)]
nilfs2: fix problems of memory allocation in ioctl
This is another patch for fixing the following problems of a memory
copy function in nilfs2 ioctl:
(1) It tries to allocate 128KB of memory even for small objects.
(2) Though the function repeatedly tries large memory allocations
while reducing the size, GFP_NOWAIT flag is not specified.
This increases the possibility of system memory shortage.
(3) During the retries of (2), verbose warnings are printed
because the __GFP_NOWARN flag is not used for the kmalloc calls.
The first patch was still doing large allocations by kmalloc which are
repeatedly tried while reducing the size.
Andi Kleen told me that using copy_from_user for large memory is not
good from the viewpoint of preempt latency:
On Fri, 12 Dec 2008 21:24:11 +0100, Andi Kleen <andi@firstfloor.org> wrote:
> > In the current interface, each data item is copied twice: one is to
> > the allocated memory from user space (via copy_from_user), and another
>
> For such large copies it is better to use multiple smaller (e.g. 4K)
> copy user, that gives better real time preempt latencies. Each cfu has a
> cond_resched(), but only one, not multiple times in the inner loop.
He also advised me that:
On Sun, 14 Dec 2008 16:13:27 +0100, Andi Kleen <andi@firstfloor.org> wrote:
> Better would be if you could go to PAGE_SIZE. order 0 allocations
> are typically the fastest / least likely to stall.
>
> Also in this case it's a good idea to use __get_free_pages()
> directly, kmalloc tends to become less efficient at larger
> sizes.
For the function in question, the size of buffer memory can be reduced
since the buffer is repeatedly used for a number of small objects. On
the other hand, a larger buffer may incur large preempt latencies
because copy_from_user (and copy_to_user) is applied only once
per cycle.
With that, this revision uses order 0 allocations with
__get_free_pages() to fix the original problems.
Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
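
The page-sized buffer approach described in this commit can be sketched in
user-space C. The sketch makes several assumptions: copy_in_pages and
BUF_SIZE are hypothetical names, BUF_SIZE is taken to be 4 KiB, malloc
stands in for the order 0 allocation, and memcpy stands in for
copy_from_user and copy_to_user. It shows one small buffer being reused
over many cycles instead of one large allocation.

```c
#include <stdlib.h>
#include <string.h>

#define BUF_SIZE 4096   /* stands in for PAGE_SIZE; assumed 4 KiB */

/*
 * Copy nmemb objects of size bytes each from src to dst through one
 * fixed BUF_SIZE bounce buffer. Returns the number of copy cycles used,
 * or -1 if one object does not fit in the buffer or allocation fails.
 */
long copy_in_pages(void *dst, const void *src, size_t nmemb, size_t size)
{
    size_t per_cycle, done = 0;
    long cycles = 0;
    char *buf;

    if (size == 0 || size > BUF_SIZE)
        return -1;
    per_cycle = BUF_SIZE / size;     /* whole objects per cycle */

    buf = malloc(BUF_SIZE);          /* kernel: an order 0 page */
    if (!buf)
        return -1;

    while (done < nmemb) {
        size_t n = nmemb - done < per_cycle ? nmemb - done : per_cycle;

        /* stands in for copy_from_user() */
        memcpy(buf, (const char *)src + done * size, n * size);
        /* stands in for copy_to_user() */
        memcpy((char *)dst + done * size, buf, n * size);
        done += n;
        cycles++;
    }
    free(buf);
    return cycles;
}
```

Each cycle moves at most one buffer's worth of data, so the amount of work
between two possible reschedule points stays bounded regardless of the
total request size.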
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:41 +0000 (19:01 -0700)]
nilfs2: update makefile and Kconfig
This adds a Makefile for the nilfs2 file system, and updates the
makefile and Kconfig file in the file system directory.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Koji Sato [Tue, 7 Apr 2009 02:01:41 +0000 (19:01 -0700)]
nilfs2: ioctl operations
This adds the userland interface, implemented with ioctl.
Signed-off-by: Koji Sato <sato.koji@lab.ntt.co.jp>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:40 +0000 (19:01 -0700)]
nilfs2: block cache for garbage collection
This adds the cache of on-disk blocks to be moved in garbage
collection. The disk blocks are held with dummy inodes (called
gcinodes), and this file provides a lookup function for the dummy
inodes and their buffer read function.
Signed-off-by: Seiji Kihara <kihara.seiji@lab.ntt.co.jp>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Yoshiji Amagai <amagai.yoshiji@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:39 +0000 (19:01 -0700)]
nilfs2: another dat for garbage collection
NILFS2 uses another DAT inode during garbage collection to ensure
atomicity and consistency of the DAT in the transient state. This
twin inode is called GCDAT.
This adds functions to initialize the GCDAT and to switch page caches
and B-tree node caches between these two inodes.
Signed-off-by: Seiji Kihara <kihara.seiji@lab.ntt.co.jp>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Yoshiji Amagai <amagai.yoshiji@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:38 +0000 (19:01 -0700)]
nilfs2: recovery functions
This adds the recovery function run on mount.
Usually the recovery is achieved by just finding the latest super
root. When logs without checkpoints were appended for data sync
operations after the latest super root, the recovery function will
perform a roll-forward and reconstruct new log(s) with a super root.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:38 +0000 (19:01 -0700)]
nilfs2: fix missed-sync issue for do_sync_mapping_range()
Chris Mason pointed out that there is a missed sync issue in
nilfs_writepages():
On Wed, 17 Dec 2008 21:52:55 -0500, Chris Mason wrote:
> It looks like nilfs_writepage ignores WB_SYNC_NONE, which is used by
> do_sync_mapping_range().
where WB_SYNC_NONE in do_sync_mapping_range() was replaced with
WB_SYNC_ALL by Nick's patch (commit:
ee53a891f47444c53318b98dac947ede963db400).
This fixes the problem by letting nilfs_writepages() write out the log of
file data within the range if sync_mode is WB_SYNC_ALL.
This involves removal of nilfs_file_aio_write() which was previously
needed to ensure O_SYNC sync writes.
Cc: Chris Mason <chris.mason@oracle.com>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:37 +0000 (19:01 -0700)]
nilfs2: segment constructor
This adds the segment constructor (also called log writer).
The segment constructor collects dirty buffers for every dirty inode,
makes summaries of the buffers, assigns disk block addresses to the
buffers, and then submits BIOs for the buffers.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:36 +0000 (19:01 -0700)]
nilfs2: segment buffer
This adds the segment buffer which is used to construct logs.
[akpm@linux-foundation.org: BIO_RW_SYNC got removed]
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:35 +0000 (19:01 -0700)]
nilfs2: super block operations
This adds super block operations for the nilfs2 file system.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:35 +0000 (19:01 -0700)]
nilfs2: operations for the_nilfs core object
This adds functions on the_nilfs object, which keeps resources and state
shared among a read/write mount and independently mounted snapshots.
the_nilfs is allocated per block device; it is created when the user first
mounts a snapshot or a read/write mount on the device, then it is reused
for successive mounts. It is freed when all mount instances on the
device are detached.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
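
The lifetime rule stated in this commit (create on first mount, reuse for
later mounts, free when the last mount goes away) is ordinary reference
counting. A minimal user-space sketch follows; struct shared_obj, obj_get,
obj_put and obj_live are hypothetical names, an int stands in for the
block device, and no locking is shown.

```c
#include <stdlib.h>

/* Hypothetical per-device shared object. */
struct shared_obj {
    int dev;                  /* stands in for the block device */
    int count;                /* number of mounts using this object */
    struct shared_obj *next;
};

static struct shared_obj *objects;   /* list of live objects */

/* Find the object for dev, or create it on first use. */
struct shared_obj *obj_get(int dev)
{
    struct shared_obj *o;

    for (o = objects; o; o = o->next) {
        if (o->dev == dev) {
            o->count++;       /* reuse for a successive mount */
            return o;
        }
    }
    o = malloc(sizeof(*o));
    if (!o)
        return NULL;
    o->dev = dev;
    o->count = 1;
    o->next = objects;
    objects = o;
    return o;
}

/* Drop one reference; free the object when the last mount detaches. */
void obj_put(struct shared_obj *o)
{
    struct shared_obj **pp;

    if (--o->count > 0)
        return;
    for (pp = &objects; *pp; pp = &(*pp)->next) {
        if (*pp == o) {
            *pp = o->next;
            break;
        }
    }
    free(o);
}

/* Number of objects currently alive (for inspection only). */
int obj_live(void)
{
    int n = 0;
    struct shared_obj *o;

    for (o = objects; o; o = o->next)
        n++;
    return n;
}
```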
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:34 +0000 (19:01 -0700)]
nilfs2: pathname operations
This adds pathname operations, most of which come from the ext2 file
system.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Yoshiji Amagai [Tue, 7 Apr 2009 02:01:34 +0000 (19:01 -0700)]
nilfs2: directory entry operations
This adds directory handling functions, most of which come from the ext2
file system.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Yoshiji Amagai <amagai.yoshiji@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:33 +0000 (19:01 -0700)]
nilfs2: file operations
This adds primitives for regular file handling.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:32 +0000 (19:01 -0700)]
nilfs2: inode operations
This adds inode level operations of the nilfs2 file system.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Koji Sato [Tue, 7 Apr 2009 02:01:32 +0000 (19:01 -0700)]
nilfs2: segment usage file
This adds a meta data file which stores the allocation state of segments.
[konishi.ryusuke@lab.ntt.co.jp: fix wrong counting of checkpoints and dirty segments]
Signed-off-by: Koji Sato <sato.koji@lab.ntt.co.jp>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Koji Sato [Tue, 7 Apr 2009 02:01:31 +0000 (19:01 -0700)]
nilfs2: checkpoint file
This adds a meta data file which holds checkpoint entries in its data
blocks.
Signed-off-by: Koji Sato <sato.koji@lab.ntt.co.jp>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:30 +0000 (19:01 -0700)]
nilfs2: inode map file
This adds a meta data file which stores on-disk inodes in its data blocks.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Yoshiji Amagai <amagai.yoshiji@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Koji Sato [Tue, 7 Apr 2009 02:01:30 +0000 (19:01 -0700)]
nilfs2: disk address translator
This adds the disk address translation file (DAT) whose primary function
is to convert virtual disk block numbers to actual disk block numbers.
The virtual block numbers of NILFS are associated with checkpoint
generation numbers, and this file also provides functions to manage the
lifetime information of each virtual block number.
Signed-off-by: Koji Sato <sato.koji@lab.ntt.co.jp>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
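
A minimal sketch of the translation idea in this commit, in user-space C.
The entry layout and the names dat_entry, dat_translate, dat_move,
dat_live_at and DAT_END_OPEN are assumptions made for illustration; the
real DAT is a meta data file, not an in-memory array. The sketch shows a
virtual block number indexing an entry that holds the actual block number
plus a checkpoint range describing the block's lifetime.

```c
#include <stdint.h>

#define DAT_END_OPEN UINT64_MAX   /* block has not died yet */

/* Hypothetical DAT entry, one per virtual block number. */
struct dat_entry {
    uint64_t blocknr;   /* actual disk block, 0 if unassigned */
    uint64_t start;     /* checkpoint at which the block became live */
    uint64_t end;       /* checkpoint at which it died, or DAT_END_OPEN */
};

/* Translate virtual block vblk to a disk block; 0 means no mapping. */
uint64_t dat_translate(const struct dat_entry *dat, uint64_t nr,
                       uint64_t vblk)
{
    return vblk < nr ? dat[vblk].blocknr : 0;
}

/* The block moved on disk: only the DAT entry has to change, while
 * every holder of the virtual block number stays valid. */
void dat_move(struct dat_entry *dat, uint64_t vblk, uint64_t new_blocknr)
{
    dat[vblk].blocknr = new_blocknr;
}

/* Is the virtual block visible from checkpoint cno? */
int dat_live_at(const struct dat_entry *dat, uint64_t vblk, uint64_t cno)
{
    return dat[vblk].start <= cno && cno < dat[vblk].end;
}
```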
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:29 +0000 (19:01 -0700)]
nilfs2: persistent object allocator
This adds common functions to allocate or deallocate entries with bitmaps
on a meta data file. This feature is used by the DAT and ifile.
Signed-off-by: Koji Sato <sato.koji@lab.ntt.co.jp>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Yoshiji Amagai <amagai.yoshiji@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
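
Bitmap-based entry allocation, as named in this commit, can be sketched in
user-space C as follows. This is an in-memory illustration only:
struct bitmap_alloc, entry_alloc, entry_free and NR_ENTRIES are
hypothetical, and the on-disk grouping of bitmaps and entries used by the
real allocator is not modelled.

```c
#include <limits.h>
#include <stddef.h>

#define NR_ENTRIES 64
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define NR_WORDS ((NR_ENTRIES + BITS_PER_WORD - 1) / BITS_PER_WORD)

struct bitmap_alloc {
    unsigned long bits[NR_WORDS];   /* bit set = entry in use */
};

/* Claim the lowest free entry; return its index, or -1 if all are used. */
long entry_alloc(struct bitmap_alloc *a)
{
    size_t i;

    for (i = 0; i < NR_ENTRIES; i++) {
        unsigned long mask = 1UL << (i % BITS_PER_WORD);

        if (!(a->bits[i / BITS_PER_WORD] & mask)) {
            a->bits[i / BITS_PER_WORD] |= mask;
            return (long)i;
        }
    }
    return -1;
}

/* Release entry nr; return -1 if it is out of range or not allocated. */
int entry_free(struct bitmap_alloc *a, size_t nr)
{
    unsigned long mask;

    if (nr >= NR_ENTRIES)
        return -1;
    mask = 1UL << (nr % BITS_PER_WORD);
    if (!(a->bits[nr / BITS_PER_WORD] & mask))
        return -1;
    a->bits[nr / BITS_PER_WORD] &= ~mask;
    return 0;
}
```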
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:28 +0000 (19:01 -0700)]
nilfs2: meta data file
This adds the meta data file, which provides common buffer functions to the
DAT, sufile, cpfile, ifile, and so forth.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:27 +0000 (19:01 -0700)]
nilfs2: buffer and page operations
This adds common routines for buffer/page operations used in B-tree
node caches, meta data files, or segment constructor (log writer).
NILFS uses copy functions for buffers and pages for the following
reasons:
1) Relocation required for COW
Since NILFS changes address of on-disk blocks, moving buffers
in page cache is needed for the buffers which are not addressed
by a file offset. If buffer size is smaller than page size,
this involves partial copy of pages.
2) Freezing mmapped pages
NILFS calculates checksums for each log to ensure its validity.
If page data changes after the checksum calculation, this validity
check will not work correctly. To avoid this failure for mmapped
pages, NILFS freezes their data by copying.
3) Copy-on-write for DAT pages
NILFS makes clones of DAT page caches in a copy-on-write manner
during GC processes, and this ensures atomicity and consistency
of the DAT in the transient state.
In addition, NILFS uses two obsolete functions: nilfs_mark_buffer_dirty()
and nilfs_clear_page_dirty().
* nilfs_mark_buffer_dirty() was required to avoid NULL pointer
dereference faults:
Since the page cache of B-tree node pages or data page cache of pseudo
inodes does not have a valid mapping->host, calling mark_buffer_dirty()
for their buffers causes the fault; it calls __mark_inode_dirty(NULL)
through __set_page_dirty().
* nilfs_clear_page_dirty() was needed in two cases:
1) For B-tree node pages and data pages of the dat/gcdat, NILFS2 clears
page dirty flags when it copies back pages from the cloned cache
(gcdat->{i_mapping,i_btnode_cache}) to its original cache
(dat->{i_mapping,i_btnode_cache}).
2) Some B-tree operations like insertion or deletion may dispose buffers
in dirty state, and this needs to cancel the dirty state of their
pages. clear_page_dirty_for_io() caused faults because it does not
clear the dirty tag on the page cache.
Signed-off-by: Seiji Kihara <kihara.seiji@lab.ntt.co.jp>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
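
Reason 2) in this commit, freezing mmapped pages by copying, can be
sketched in user-space C. The names frozen_page, freeze_page,
frozen_page_valid, checksum and PAGE_SZ are hypothetical, and the additive
checksum is a placeholder rather than the algorithm NILFS actually uses.
The sketch shows that the checksum is computed over a private copy, so a
later write to the live page cannot invalidate it.

```c
#include <stdint.h>
#include <string.h>

#define PAGE_SZ 4096   /* assumed page size for the sketch */

/* Placeholder checksum: a plain byte sum. */
uint32_t checksum(const unsigned char *p, size_t n)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i < n; i++)
        sum += p[i];
    return sum;
}

/* A private copy of a page together with the checksum taken over it. */
struct frozen_page {
    unsigned char data[PAGE_SZ];
    uint32_t sum;
};

/* Copy first, then checksum the copy, never the live page. */
void freeze_page(struct frozen_page *f, const unsigned char *live)
{
    memcpy(f->data, live, PAGE_SZ);
    f->sum = checksum(f->data, PAGE_SZ);
}

/* Validity check performed on the data that is actually written. */
int frozen_page_valid(const struct frozen_page *f)
{
    return checksum(f->data, PAGE_SZ) == f->sum;
}
```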
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:25 +0000 (19:01 -0700)]
nilfs2: B-tree node cache
This adds routines for B-tree node buffers.
Signed-off-by: Seiji Kihara <kihara.seiji@lab.ntt.co.jp>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Koji Sato [Tue, 7 Apr 2009 02:01:25 +0000 (19:01 -0700)]
nilfs2: direct block mapping
This adds block mappings using direct pointers which are stored in the
i_bmap array of inode.
Signed-off-by: Koji Sato <sato.koji@lab.ntt.co.jp>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Koji Sato [Tue, 7 Apr 2009 02:01:24 +0000 (19:01 -0700)]
nilfs2: B-tree based block mapping
This adds declarations and functions of NILFS2 B-tree.
Two variants are integrated in the NILFS2 B-tree. The B-tree for most
files points to the child nodes or data blocks with virtual block
addresses, whereas the B-tree of the DAT uses actual block addresses.
Signed-off-by: Koji Sato <sato.koji@lab.ntt.co.jp>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Koji Sato [Tue, 7 Apr 2009 02:01:23 +0000 (19:01 -0700)]
nilfs2: integrated block mapping
This adds structures and operations for the block mapping (bmap for
short). NILFS2 uses direct mappings for short files or B-tree based
mappings for longer files.
Every on-disk data block is held with inodes and managed through this
block mapping. The nilfs_bmap structure and a set of functions here
provide this capability to the NILFS2 inode.
[penberg@cs.helsinki.fi: remove a bunch of bmap wrapper macros]
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Koji Sato <sato.koji@lab.ntt.co.jp>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:23 +0000 (19:01 -0700)]
nilfs2: add inode and other major structures
This adds the following common structures of the NILFS2 file system.
* nilfs_inode_info structure:
gives on-memory inode.
* nilfs_sb_info structure:
keeps per-mount state and a special inode for the ifile.
This structure is attached to the super_block structure.
* the_nilfs structure:
keeps shared state and locks among a read/write mount and snapshot
mounts. This keeps special inodes for the sufile, cpfile, dat, and
another dat inode used during GC (gcdat). This also has a hash table
of dummy inodes to cache disk blocks during GC (gcinodes).
* nilfs_transaction_info structure:
keeps per task state while nilfs is writing logs or doing indivisible
inode or namespace operations. This structure is used to identify
context during log making and store nest level of the lock which
ensures atomicity of file system operations.
Signed-off-by: Koji Sato <sato.koji@lab.ntt.co.jp>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Koji Sato [Tue, 7 Apr 2009 02:01:21 +0000 (19:01 -0700)]
nilfs2: disk format and userland interface
This adds a header file which specifies the on-disk format and ioctl
interface of the nilfs2 file system.
Signed-off-by: Koji Sato <sato.koji@lab.ntt.co.jp>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:20 +0000 (19:01 -0700)]
nilfs2: add document
This adds a document describing the features, mount options, userland
tools, usage, disk format, and related URLs for the nilfs2 file system.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Yang Hongyang [Tue, 7 Apr 2009 02:01:19 +0000 (19:01 -0700)]
dma-mapping: update the old macro DMA_nBIT_MASK related documentations
Update the documentation that refers to the old DMA_nBIT_MASK macros.
Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Yang Hongyang [Tue, 7 Apr 2009 02:01:18 +0000 (19:01 -0700)]
dma-mapping: replace all DMA_24BIT_MASK macro with DMA_BIT_MASK(24)
Replace all uses of the DMA_24BIT_MASK macro with DMA_BIT_MASK(24).
Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Yang Hongyang [Tue, 7 Apr 2009 02:01:17 +0000 (19:01 -0700)]
dma-mapping: replace all DMA_28BIT_MASK macro with DMA_BIT_MASK(28)
Replace all uses of the DMA_28BIT_MASK macro with DMA_BIT_MASK(28).
Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Yang Hongyang [Tue, 7 Apr 2009 02:01:17 +0000 (19:01 -0700)]
dma-mapping: replace all DMA_30BIT_MASK macro with DMA_BIT_MASK(30)
Replace all uses of the DMA_30BIT_MASK macro with DMA_BIT_MASK(30).
Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Yang Hongyang [Tue, 7 Apr 2009 02:01:16 +0000 (19:01 -0700)]
dma-mapping: replace all DMA_31BIT_MASK macro with DMA_BIT_MASK(31)
Replace all uses of the DMA_31BIT_MASK macro with DMA_BIT_MASK(31).
Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Yang Hongyang [Tue, 7 Apr 2009 02:01:15 +0000 (19:01 -0700)]
dma-mapping: replace all DMA_32BIT_MASK macro with DMA_BIT_MASK(32)
Replace all uses of the DMA_32BIT_MASK macro with DMA_BIT_MASK(32).
Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Yang Hongyang [Tue, 7 Apr 2009 02:01:15 +0000 (19:01 -0700)]
dma-mapping: replace all DMA_39BIT_MASK macro with DMA_BIT_MASK(39)
Replace all uses of the DMA_39BIT_MASK macro with DMA_BIT_MASK(39).
Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Yang Hongyang [Tue, 7 Apr 2009 02:01:14 +0000 (19:01 -0700)]
dma-mapping: replace all DMA_40BIT_MASK macro with DMA_BIT_MASK(40)
Replace all uses of the DMA_40BIT_MASK macro with DMA_BIT_MASK(40).
Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
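
For the DMA_nBIT_MASK replacements in this series, the following user-space
C sketch shows what the generic macro evaluates to. The macro body is
written from memory of the definition in <linux/dma-mapping.h> and should
be treated as an assumption; dma_mask and dma_addr_fits are hypothetical
helpers added only for illustration.

```c
#include <stdint.h>

/* Assumed to match the kernel definition: n low bits set, with n == 64
 * special-cased because shifting a 64-bit value by 64 is undefined. */
#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* Function form of the macro, convenient for runtime values of n. */
uint64_t dma_mask(unsigned int n)
{
    return DMA_BIT_MASK(n);
}

/* Does addr fit within a device limited to bits address bits? */
int dma_addr_fits(uint64_t addr, unsigned int bits)
{
    return (addr & ~dma_mask(bits)) == 0;
}
```

Under that assumption DMA_BIT_MASK(24) equals 0x00ffffff and
DMA_BIT_MASK(32) equals 0xffffffff, the values the fixed-width names stood
for, so each replacement in the series is a spelling change only.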