openwrt/staging/blogic.git
14 years ago3c59x: Use fine-grained locks for MII and windowed register access
Ben Hutchings [Tue, 29 Jun 2010 15:26:56 +0000 (15:26 +0000)]
3c59x: Use fine-grained locks for MII and windowed register access

This avoids scheduling in atomic context and also means that IRQs
will only be deferred for relatively short periods of time.

Previously discussed in:
http://article.gmane.org/gmane.linux.network/155024

Reported-by: Arne Nordmark <nordmark@mech.kth.se>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoe1000e: disable EEE support by default
Bruce Allan [Tue, 29 Jun 2010 18:13:13 +0000 (18:13 +0000)]
e1000e: disable EEE support by default

Based on community feedback, EEE should be disabled by default until the
IEEE802.3az specification has been finalized.

Cc: bhutchings@solarflare.com
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Emil Tantilov <emil.s.tantilov@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoe1000e: remove EEE module parameter
Bruce Allan [Tue, 29 Jun 2010 18:12:52 +0000 (18:12 +0000)]
e1000e: remove EEE module parameter

As requested by Dave Miller.  A follow-on set of patches will allow for
ethtool to enable/disable the feature instead.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Emil Tantilov <emil.s.tantilov@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoe1000e: suppress compile warnings on certain archs
Bruce Allan [Tue, 29 Jun 2010 18:12:30 +0000 (18:12 +0000)]
e1000e: suppress compile warnings on certain archs

Commit 84f4ee902ad3ee964b7b3a13d5b7cf9c086e9916 causes compile warnings on
architectures that have unsigned long long's that are not 64-bit, e.g.
ia64.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoe1000e: don't inadvertently re-set INTX_DISABLE
Dean Nelson [Tue, 29 Jun 2010 18:12:05 +0000 (18:12 +0000)]
e1000e: don't inadvertently re-set INTX_DISABLE

Should e1000_test_msi() fail to see an MSI interrupt, it attempts to
fall back to legacy INTx interrupts. But an error in the code may prevent
this from happening correctly.

Before calling e1000_test_msi_interrupt(), e1000_test_msi() disables SERR
by clearing the SERR bit from the just read PCI_COMMAND bits as it writes
them back out.

Upon return from calling e1000_test_msi_interrupt(), it re-enables SERR
by writing out the version of PCI_COMMAND it had previously read.

The problem with this is that e1000_test_msi_interrupt() calls
pci_disable_msi(), which eventually ends up in pci_intx(). And because
pci_intx() was called with enable set to 1, the INTX_DISABLE bit gets
cleared from PCI_COMMAND, which is what we want. But when we get back to
e1000_test_msi(), the INTX_DISABLE bit gets inadvertently re-set because
of the attempt by e1000_test_msi() to re-enable SERR.

The solution is to have e1000_test_msi() re-read the PCI_COMMAND bits as
part of its attempt to re-enable SERR.

During debugging/testing of this issue I found that not all the systems
I ran on had the SERR bit set to begin with. And on some of the systems
the same could be said for the INTX_DISABLE bit. Needless to say these
latter systems didn't have a problem falling back to legacy INTx
interrupts with the code as is.
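
The stale read-modify-write described above can be sketched as a small
userspace model. This is not driver code: FakePciDevice,
fake_test_msi_interrupt() and the two run_msi_test_*() functions are
hypothetical names, and only the two PCI_COMMAND bit values are taken from
the PCI specification.

```python
PCI_COMMAND_SERR = 0x100          # SERR# enable bit
PCI_COMMAND_INTX_DISABLE = 0x400  # INTx disable bit


class FakePciDevice:
    """Hypothetical device holding a single 16-bit PCI_COMMAND register."""

    def __init__(self, command):
        self.command = command

    def read_command(self):
        return self.command

    def write_command(self, value):
        self.command = value & 0xFFFF


def fake_test_msi_interrupt(dev):
    # Stands in for the failed MSI test: disabling MSI ends up clearing
    # INTX_DISABLE so that legacy interrupts can be used.
    dev.write_command(dev.read_command() & ~PCI_COMMAND_INTX_DISABLE)


def run_msi_test_buggy(dev):
    saved = dev.read_command()
    dev.write_command(saved & ~PCI_COMMAND_SERR)  # disable SERR
    fake_test_msi_interrupt(dev)
    dev.write_command(saved)  # stale value: INTX_DISABLE is set again


def run_msi_test_fixed(dev):
    saved = dev.read_command()
    dev.write_command(saved & ~PCI_COMMAND_SERR)  # disable SERR
    fake_test_msi_interrupt(dev)
    if saved & PCI_COMMAND_SERR:
        # Re-read the register so intervening changes are preserved.
        dev.write_command(dev.read_command() | PCI_COMMAND_SERR)
```

Starting both variants from a register with SERR and INTX_DISABLE set shows
the difference: only the variant that re-reads PCI_COMMAND leaves
INTX_DISABLE cleared.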

Signed-off-by: Dean Nelson <dnelson@redhat.com>
CC: stable@kernel.org
Tested-by: Emil Tantilov <emil.s.tantilov@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agodrivers/net/Makefile: conditionally descend to wireless
Nicolas Kaiser [Sun, 27 Jun 2010 11:44:52 +0000 (11:44 +0000)]
drivers/net/Makefile: conditionally descend to wireless

Don't descend to wireless unless it is actually used.

Signed-off-by: Nicolas Kaiser <nikai@nikai.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agonet/Makefile: conditionally descend to wireless and ieee802154
Nicolas Kaiser [Sun, 27 Jun 2010 00:00:25 +0000 (00:00 +0000)]
net/Makefile: conditionally descend to wireless and ieee802154

Don't descend to wireless and ieee802154 unless they are actually used.

Signed-off-by: Nicolas Kaiser <nikai@nikai.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoqlcnic: Add support for configuring eswitch and npars
Rajesh K Borundia [Tue, 29 Jun 2010 08:01:20 +0000 (08:01 +0000)]
qlcnic: Add support for configuring eswitch and npars

Following changes are made:
1. Obtain capabilities of NIC partition.
2. Configure tx bandwidth of a particular NIC partition.
3. Configure the eswitch for setting port mirroring, enabling MAC
learning, and promiscuous mode.

Signed-off-by: Rajesh K Borundia <rajesh.borundia@qlogic.com>
Signed-off-by: Anirban Chakraborty <anirban.chakraborty@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoqlcnic: Remove obsolete code
Anirban Chakraborty [Tue, 29 Jun 2010 07:52:12 +0000 (07:52 +0000)]
qlcnic: Remove obsolete code

Current driver uses FW API version 2 and thus code corresponding to FW API
version 1 has become obsolete. Clean up this from the driver.

Signed-off-by: Anirban Chakraborty <anirban.chakraborty@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agomicrel phy driver - updated(1)
Choi, David [Mon, 28 Jun 2010 15:23:41 +0000 (15:23 +0000)]
micrel phy driver - updated(1)

Hello all:

This patch fixes what Ben mentioned, namely duplicated ids.

From: David J. Choi <david.choi@micrel.com>

Body of the explanation: This patch makes the following changes:
 -support interrupts from Micrel Inc. phy devices.
 -support more Micrel phy devices: ks8737, ks8721, ks8041, ks8051.
 -remove vsc8201 because this device was used only for internal testing at Micrel.

Signed-off-by: David J. Choi <david.choi@micrel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoqlcnic: fail when try to setup unsupported features
Stanislaw Gruszka [Sun, 27 Jun 2010 23:31:34 +0000 (23:31 +0000)]
qlcnic: fail when try to setup unsupported features

Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agonetxen: fail when try to setup unsupported features
Stanislaw Gruszka [Sun, 27 Jun 2010 23:33:29 +0000 (23:33 +0000)]
netxen: fail when try to setup unsupported features

Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agobnx2x: fail when try to setup unsupported features
Stanislaw Gruszka [Sun, 27 Jun 2010 23:28:11 +0000 (23:28 +0000)]
bnx2x: fail when try to setup unsupported features

Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agovmxnet3: fail when try to setup unsupported features
Stanislaw Gruszka [Sun, 27 Jun 2010 23:29:42 +0000 (23:29 +0000)]
vmxnet3: fail when try to setup unsupported features

Return EOPNOTSUPP in ethtool_ops->set_flags.

Fix coding style while at it.
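
A minimal userspace sketch of the check implied by "fail when trying to set
unsupported features" follows. The flag bits and the set_flags() helper are
hypothetical stand-ins, not the driver's actual ethtool_ops callback; only
the EOPNOTSUPP error code comes from the text above.

```python
import errno

# Hypothetical flag bits for illustration only.
FLAG_LRO = 1 << 0
FLAG_NTUPLE = 1 << 1


def set_flags(current, requested, supported):
    """Return (status, flags); reject any requested bit outside `supported`."""
    if requested & ~supported:
        # Fail loudly and leave the current state untouched.
        return -errno.EOPNOTSUPP, current
    return 0, requested
```

The point of the design is that a request containing even one unsupported
bit is rejected as a whole instead of being silently accepted.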

Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoe1000e: fail when try to setup unsupported features
Stanislaw Gruszka [Sun, 27 Jun 2010 23:26:23 +0000 (23:26 +0000)]
e1000e: fail when try to setup unsupported features

Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agocaif-driver: Add CAIF-SPI Protocol driver.
Sjur Braendeland [Tue, 29 Jun 2010 07:08:21 +0000 (00:08 -0700)]
caif-driver: Add CAIF-SPI Protocol driver.

This patch introduces the CAIF SPI Protocol Driver for
CAIF Link Layer.

This driver implements a platform driver to accommodate for a
platform specific SPI device. A general platform driver is not
possible as there are no SPI Slave side Kernel API defined.
A sample CAIF SPI Platform device can be found in
.../Documentation/networking/caif/spi_porting.txt

Signed-off-by: Sjur Braendeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agocaif: Kconfig and Makefile fixes
Sjur Braendeland [Sat, 26 Jun 2010 11:31:28 +0000 (11:31 +0000)]
caif: Kconfig and Makefile fixes

Use "depends on" instead of "if" in Kconfig files.
Fixed CAIF debug flag, and removed unnecessary clean-* options.

Signed-off-by: Sjur Braendeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agocxgb4vf: Stitch new T4 PCI-E SR-IOV Virtual Function driver into the build
Casey Leedom [Fri, 25 Jun 2010 12:15:33 +0000 (12:15 +0000)]
cxgb4vf: Stitch new T4 PCI-E SR-IOV Virtual Function driver into the build

Stitch new T4 PCI-E SR-IOV Virtual Function driver into the build.

Signed-off-by: Casey Leedom
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agocxgb4vf: Add new Makefile for T4 PCI-E SR-IOV Virtual Function driver cxgb4vf
Casey Leedom [Fri, 25 Jun 2010 12:14:57 +0000 (12:14 +0000)]
cxgb4vf: Add new Makefile for T4 PCI-E SR-IOV Virtual Function driver cxgb4vf

Add new Makefile for T4 PCI-E SR-IOV Virtual Function driver "cxgb4vf".

Signed-off-by: Casey Leedom
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agocxgb4vf: Add main T4 PCI-E SR-IOV Virtual Function driver for cxgb4vf
Casey Leedom [Fri, 25 Jun 2010 12:14:15 +0000 (12:14 +0000)]
cxgb4vf: Add main T4 PCI-E SR-IOV Virtual Function driver for cxgb4vf

Add main T4 PCI-E SR-IOV Virtual Function driver for "cxgb4vf".

Signed-off-by: Casey Leedom
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agocxgb4vf: Add T4 Virtual Function Scatter-Gather Engine DMA code
Casey Leedom [Fri, 25 Jun 2010 12:13:28 +0000 (12:13 +0000)]
cxgb4vf: Add T4 Virtual Function Scatter-Gather Engine DMA code

Add T4 Virtual Function Scatter-Gather Engine DMA code.

Signed-off-by: Casey Leedom
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agocxgb4vf: Add core T4 PCI-E SR-IOV Virtual Function hardware definitions and device...
Casey Leedom [Fri, 25 Jun 2010 12:12:54 +0000 (12:12 +0000)]
cxgb4vf: Add core T4 PCI-E SR-IOV Virtual Function hardware definitions and device communication code

Add core T4 PCI-E SR-IOV Virtual Function hardware definitions and device
communication code.

Signed-off-by: Casey Leedom
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agocxgb4vf: Add code to provision T4 PCI-E SR-IOV Virtual Functions with hardware resources
Casey Leedom [Fri, 25 Jun 2010 12:11:46 +0000 (12:11 +0000)]
cxgb4vf: Add code to provision T4 PCI-E SR-IOV Virtual Functions with hardware resources

Add code to provision T4 PCI-E SR-IOV Virtual Functions with hardware
resources.

Signed-off-by: Casey Leedom
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agocxgb4vf: Add new macros and definitions for hardware constants
Casey Leedom [Fri, 25 Jun 2010 12:11:05 +0000 (12:11 +0000)]
cxgb4vf: Add new macros and definitions for hardware constants

Add new macros and definitions for hardware constants.

Signed-off-by: Casey Leedom
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agocxgb4vf: update to latest T4 firmware API file
Casey Leedom [Fri, 25 Jun 2010 12:10:32 +0000 (12:10 +0000)]
cxgb4vf: update to latest T4 firmware API file

Update to latest T4 firmware API file.

Signed-off-by: Casey Leedom
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agocxgb4vf: small changes to message processing structures/macros
Casey Leedom [Fri, 25 Jun 2010 12:09:38 +0000 (12:09 +0000)]
cxgb4vf: small changes to message processing structures/macros

Split cpl_tx_pkt_lso into core message structure and encapsulated message,
make RSPD_LEN macro match other response descriptor macros.

Signed-off-by: Casey Leedom
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agonetdev: mdio-octeon: Fix section mismatch errors.
David Daney [Thu, 24 Jun 2010 09:14:48 +0000 (09:14 +0000)]
netdev: mdio-octeon: Fix section mismatch errors.

We started getting:

WARNING: vmlinux.o(.data+0x20bd0): Section mismatch in reference from
the variable octeon_mdiobus_driver to the function
.init.text:octeon_mdiobus_probe()

This fixes it.

Signed-off-by: David Daney <ddaney@caviumnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agonetdev: octeon_mgmt: Fix section mismatch errors.
David Daney [Thu, 24 Jun 2010 09:14:47 +0000 (09:14 +0000)]
netdev: octeon_mgmt: Fix section mismatch errors.

We started getting:

WARNING: drivers/net/built-in.o(.data+0x10f0): Section mismatch in
reference from the variable octeon_mgmt_driver to the function
.init.text:octeon_mgmt_probe()

This fixes it.

Signed-off-by: David Daney <ddaney@caviumnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoact_mirred: don't clone skb when skb isn't shared
Changli Gao [Thu, 24 Jun 2010 16:25:12 +0000 (16:25 +0000)]
act_mirred: don't clone skb when skb isn't shared

don't clone skb when skb isn't shared

When the tcf_action is TC_ACT_STOLEN, and the skb isn't shared, we don't need
to clone a new skb. As the skb will be freed after this function returns, we
can use it freely once we get a reference to it.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
----
 include/net/sch_generic.h |   11 +++++++++--
 net/sched/act_mirred.c    |    6 +++---
 2 files changed, 12 insertions(+), 5 deletions(-)
Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agotcp: tso_fragment() might avoid GFP_ATOMIC
Eric Dumazet [Thu, 24 Jun 2010 01:00:22 +0000 (01:00 +0000)]
tcp: tso_fragment() might avoid GFP_ATOMIC

We can pass a gfp argument to tso_fragment() and avoid GFP_ATOMIC
allocations sometimes.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agovlan: 64 bit rx counters
Eric Dumazet [Thu, 24 Jun 2010 00:55:06 +0000 (00:55 +0000)]
vlan: 64 bit rx counters

Use u64_stats_sync infrastructure to implement 64bit rx stats.

(tx stats are addressed later)

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agomacvlan: 64 bit rx counters
Eric Dumazet [Thu, 24 Jun 2010 00:54:21 +0000 (00:54 +0000)]
macvlan: 64 bit rx counters

Use u64_stats_sync infrastructure to implement 64bit stats.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agonet: u64_stats_fetch_begin_bh() and u64_stats_fetch_retry_bh()
Eric Dumazet [Thu, 24 Jun 2010 00:54:06 +0000 (00:54 +0000)]
net: u64_stats_fetch_begin_bh() and u64_stats_fetch_retry_bh()

- Must disable preemption in case of 32bit UP in u64_stats_fetch_begin()
and u64_stats_fetch_retry()

- Add new u64_stats_fetch_begin_bh() and u64_stats_fetch_retry_bh() for
network usage, disabling BH on 32bit UP only.
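
The begin/retry pairing named above follows the usual sequence-counter
pattern, which can be modelled in userspace as below. This is only a sketch
of the protocol: U64StatsModel and read_stats() are hypothetical, the
between_reads hook exists purely to simulate a writer interrupting the
reader, and nothing here disables preemption or BH.

```python
class U64StatsModel:
    """Hypothetical pair of counters guarded by a sequence counter."""

    def __init__(self):
        self.seq = 0
        self.packets = 0
        self.nbytes = 0

    def update_begin(self):
        self.seq += 1  # odd value: an update is in progress

    def update_end(self):
        self.seq += 1  # even value: the update is complete

    def fetch_begin(self):
        return self.seq

    def fetch_retry(self, start):
        # Retry if a writer was active at the start or has run since.
        return (start & 1) != 0 or self.seq != start


def read_stats(stats, between_reads=None):
    """Return a consistent (packets, nbytes) snapshot."""
    while True:
        start = stats.fetch_begin()
        packets = stats.packets
        if between_reads is not None:
            hook, between_reads = between_reads, None
            hook()  # simulate one writer running in the middle of the read
        nbytes = stats.nbytes
        if not stats.fetch_retry(start):
            return packets, nbytes
```

Without the retry, a reader interrupted between the two loads would pair an
old packet count with a new byte count.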

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agonet: use this_cpu_ptr()
Eric Dumazet [Thu, 24 Jun 2010 00:52:37 +0000 (00:52 +0000)]
net: use this_cpu_ptr()

use this_cpu_ptr(p) instead of per_cpu_ptr(p, smp_processor_id())

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agonet: u64_stats_sync improvements
Eric Dumazet [Thu, 24 Jun 2010 00:04:38 +0000 (00:04 +0000)]
net: u64_stats_sync improvements

- Add a comment about interrupts:

6) If counter might be written by an interrupt, readers should block
interrupts.

- Fix a typo in sample of use.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years ago3c59x: Specify window explicitly for access to windowed registers
Ben Hutchings [Wed, 23 Jun 2010 13:54:31 +0000 (13:54 +0000)]
3c59x: Specify window explicitly for access to windowed registers

Currently much of the code assumes that a specific window has been
selected, while a few functions save and restore the window.  This
makes it impossible to introduce fine-grained locking.

Make those assumptions explicit by introducing wrapper functions
to set the window and read/write a register.  Use these everywhere
except vortex_interrupt(), vortex_start_xmit() and vortex_rx().
These set the window just once, or not at all in the case of
vortex_rx() as it should always be called from vortex_interrupt().

Cache the current window in struct vortex_private to avoid
unnecessary hardware writes.
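
The wrapper-plus-cache idea can be sketched as a userspace model. The
WindowedNic class, its method names and the select_writes counter are all
hypothetical; they only illustrate that every access names its window
explicitly and that a window-select command is issued only when the cached
window differs.

```python
class WindowedNic:
    """Hypothetical model of a NIC whose registers sit behind windows."""

    def __init__(self):
        self.hw_window = 0         # window currently selected in "hardware"
        self.cached_window = None  # driver's view; None means unknown
        self.select_writes = 0     # window-select commands issued so far
        self.regs = {}             # (window, offset) -> value

    def _set_window(self, window):
        # Skip the hardware write when the window is already selected.
        if self.cached_window != window:
            self.hw_window = window
            self.cached_window = window
            self.select_writes += 1

    def window_write(self, value, window, offset):
        self._set_window(window)
        self.regs[(self.hw_window, offset)] = value

    def window_read(self, window, offset):
        self._set_window(window)
        return self.regs.get((self.hw_window, offset), 0)
```

Because no caller relies on an implicitly selected window any more, each
access could later be wrapped in its own short critical section.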

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Tested-by: Arne Nordmark <nordmark@mech.kth.se> [against 2.6.32]
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agomlx4: add dynamic LRO disable support
Amerigo Wang [Mon, 21 Jun 2010 22:50:17 +0000 (22:50 +0000)]
mlx4: add dynamic LRO disable support

This patch adds dynamic LRO disable support for the mlx4 net driver.
It also fixes a bug in mlx4, which checks the NETIF_F_LRO flag in the rx
path without the rtnl lock.

(I don't have an mlx4 card, so I only compile-tested this. Anyone who wants
to test this is more than welcome.)

This is based on Neil's initial work too, and heavily modified based
on Stanislaw's suggestions.

Signed-off-by: WANG Cong <amwang@redhat.com>
Signed-off-by: Neil Horman <nhorman@redhat.com>
Acked-by: Neil Horman <nhorman@redhat.com>
Reviewed-by: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agos2io: add dynamic LRO disable support
Jon Mason [Thu, 24 Jun 2010 18:45:10 +0000 (18:45 +0000)]
s2io: add dynamic LRO disable support

This patch adds dynamic LRO disable support for s2io net driver,
enables LRO by default, increases the driver version number, and
corrects the name of the LRO modparm.

This is mostly Wang's patch based on Neil's initial work, heavily
modified based on Ramkrishna's suggestions.  This has been tested on
a Neterion Xframe adapter and verified via adapter LRO statistics.

Signed-off-by: Jon Mason <jon.mason@exar.com>
Signed-off-by: WANG Cong <amwang@redhat.com>
Signed-off-by: Neil Horman <nhorman@redhat.com>
Acked-by: Neil Horman <nhorman@redhat.com>
Reviewed-by: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Ramkrishna Vepa <Ramkrishna.Vepa@exar.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agosyncookies: add support for ECN
Florian Westphal [Mon, 21 Jun 2010 11:48:45 +0000 (11:48 +0000)]
syncookies: add support for ECN

Allows use of ECN when syncookies are in effect by encoding ecn_ok
into the syn-ack tcp timestamp.

While at it, remove an unneeded #ifdef CONFIG_SYN_COOKIES.
With CONFIG_SYN_COOKIES=n, want_cookie is ifdef'd to 0 and gcc
removes the "if (0)".

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agosyncookies: do not store rcv_wscale in tcp timestamp
Florian Westphal [Mon, 21 Jun 2010 11:48:44 +0000 (11:48 +0000)]
syncookies: do not store rcv_wscale in tcp timestamp

As pointed out by Fernando Gont there is no need to encode rcv_wscale
into the cookie.

We did not use the restored rcv_wscale anyway; it is recomputed
via tcp_select_initial_window().

Thus we can save 4 bits in the ts option space by removing rcv_wscale.
In case window scaling was not supported, we set the (invalid) wscale
value 0xf.
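
Encoding option state such as ecn_ok and the wscale value into the low bits
of the timestamp can be sketched as follows. The bit layout, the constant
names and both helper functions are assumptions made for illustration; the
only details taken from the text are that 0xf marks "window scaling not
supported" and that rcv_wscale is not stored.

```python
# Assumed layout of the low timestamp bits (illustrative only).
TS_OPT_WSCALE_MASK = 0xF  # bits 0-3: peer's window scale, 0xf = unsupported
TS_OPT_SACK = 1 << 4      # bit 4: SACK permitted
TS_OPT_ECN = 1 << 5       # bit 5: ECN negotiated
TSBITS = 6
TSMASK = (1 << TSBITS) - 1


def encode_ts(now, snd_wscale, sack_ok, ecn_ok):
    """Fold option bits into the low TSBITS bits of a 32-bit timestamp."""
    if snd_wscale is None:
        options = TS_OPT_WSCALE_MASK  # invalid value marks "no wscale"
    else:
        options = snd_wscale & TS_OPT_WSCALE_MASK
    if sack_ok:
        options |= TS_OPT_SACK
    if ecn_ok:
        options |= TS_OPT_ECN
    return ((now & ~TSMASK) | options) & 0xFFFFFFFF


def decode_ts(tsecr):
    """Recover the option bits from an echoed timestamp."""
    wscale = tsecr & TS_OPT_WSCALE_MASK
    return {
        "snd_wscale": None if wscale == TS_OPT_WSCALE_MASK else wscale,
        "sack_ok": bool(tsecr & TS_OPT_SACK),
        "ecn_ok": bool(tsecr & TS_OPT_ECN),
    }
```

The upper bits of the timestamp are left untouched, so the value still
advances with the clock while the low bits carry the option state.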

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoipv6: remove ipv6_statistics
Eric Dumazet [Wed, 23 Jun 2010 02:51:25 +0000 (02:51 +0000)]
ipv6: remove ipv6_statistics

commit 9261e5370112 (ipv6: making ip and icmp statistics per/namespace)
forgot to remove ipv6_statistics variable.

commit bc417d99bf27 (ipv6: remove stale MIB definitions) took care of
icmpv6_statistics & icmpv6msg_statistics

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Denis V. Lunev <den@openvz.org>
CC: Alexey Dobriyan <adobriyan@gmail.com>
CC: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agosnmp: add align parameter to snmp_mib_init()
Eric Dumazet [Tue, 22 Jun 2010 20:58:41 +0000 (20:58 +0000)]
snmp: add align parameter to snmp_mib_init()

In preparation for 64bit snmp counters for some mibs,
add an 'align' parameter to snmp_mib_init(), instead
of assuming mibs only contain 'unsigned long' fields.

Callers can use __alignof__(type) to provide correct
alignment.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Herbert Xu <herbert@gondor.apana.org.au>
CC: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
CC: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
CC: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoloopback: use u64_stats_sync infrastructure
Eric Dumazet [Tue, 22 Jun 2010 12:44:11 +0000 (12:44 +0000)]
loopback: use u64_stats_sync infrastructure

Commit 6b10de38f0ef (loopback: Implement 64bit stats on 32bit arches)
introduced 64bit stats in loopback driver, using a private seqcount and
private helpers.

David suggested to introduce a generic infrastructure, added in (net:
Introduce u64_stats_sync infrastructure)

This patch reimplements loopback 64bit stats using the u64_stats_sync
infrastructure.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoarp: RCU change in arp_solicit()
Eric Dumazet [Tue, 22 Jun 2010 07:43:15 +0000 (07:43 +0000)]
arp: RCU change in arp_solicit()

Avoid two atomic ops in arp_solicit()

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agodccp: make implementation of Syn-RTT symmetric
Gerrit Renker [Tue, 22 Jun 2010 01:14:35 +0000 (01:14 +0000)]
dccp: make implementation of Syn-RTT symmetric

This patch is thanks to Andre Noll who reported the issue and helped testing.

The Syn-RTT sampled during the initial handshake currently only works for
the client sending the DCCP-Request. TFRC penalizes the absence of an RTT
sample with a very slow initial speed (1 packet per second), which delays
slow-start significantly, resulting in sluggish performance.

This patch mirrors the "Syn RTT" principle by adding a timestamp also onto
the DCCP-Response, producing an RTT sample when the (Data)Ack completing
the handshake arrives.

Also changed the documentation to 'TFRC' since Syn RTTs are also used by CCID-4.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agodccp: remove unused function argument
Gerrit Renker [Tue, 22 Jun 2010 01:14:34 +0000 (01:14 +0000)]
dccp: remove unused function argument

This removes an unused 'sk' argument from several option-inserting functions.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agonet/fec: clean suspend/resume
Eric Benard [Fri, 18 Jun 2010 04:19:54 +0000 (04:19 +0000)]
net/fec: clean suspend/resume

Commit 59d4289b83b11379d867e2f7146904b19cc96404 converted fec to dev_pm_ops but
didn't update the suspend/resume functions, leading to the following warning
when CONFIG_PM is set: "initialization from incompatible pointer type".

This patch also fixes some indentation and style issues around the CONFIG_PM area.

Signed-off-by: Eric Bénard <eric@eukrea.com>
Cc: netdev@vger.kernel.org
Cc: davem@davemloft.net
Cc: amit.kucheria@canonical.com
Cc: s.hauer@pengutronix.de
Cc: linux-arm-kernel@lists.infradead.org
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agocxgb3: request 7.10 firmware
Divy Le Ray [Mon, 21 Jun 2010 15:54:53 +0000 (15:54 +0000)]
cxgb3: request 7.10 firmware

The driver requests FW 7.10
Bump up driver version.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agocxgb3: update FW to 7.10
Divy Le Ray [Mon, 21 Jun 2010 15:54:48 +0000 (15:54 +0000)]
cxgb3: update FW to 7.10

Update FW to 7.10

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agonet/core/pktgen.c: Use pr_<level>
Joe Perches [Mon, 21 Jun 2010 12:29:14 +0000 (12:29 +0000)]
net/core/pktgen.c: Use pr_<level>

Add pr_fmt(fmt) KBUILD_MODNAME ": " fmt
Remove "pktgen: " from formats
Convert printks to pr_<level>
Added func_enter() for debugging
Moved version to end of string at module_init
Coalesced long formats

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agonet: optimize Berkeley Packet Filter (BPF) processing
Hagen Paul Pfeifer [Sat, 19 Jun 2010 17:05:36 +0000 (17:05 +0000)]
net: optimize Berkeley Packet Filter (BPF) processing

Gcc is currently unable to optimize the switch statement in
sk_run_filter() because the case labels are not dense. This patch replaces
the OR'd labels with ordered, sequential case labels. The sk_chk_filter()
function is modified to patch/replace the original OPCODES with an
ordered but equivalent form. gcc is now able to transform the
switch statement in sk_run_filter() into a jump table of complexity O(1).

Before this patch, gcc generated a sequence of conditional branches (O(n))
taking 567 bytes of .text segment size (arch x86_64):

7ff: 8b 06                 mov    (%rsi),%eax
801: 66 83 f8 35           cmp    $0x35,%ax
805: 0f 84 d0 02 00 00     je     adb <sk_run_filter+0x31d>
80b: 0f 87 07 01 00 00     ja     918 <sk_run_filter+0x15a>
811: 66 83 f8 15           cmp    $0x15,%ax
815: 0f 84 c5 02 00 00     je     ae0 <sk_run_filter+0x322>
81b: 77 73                 ja     890 <sk_run_filter+0xd2>
81d: 66 83 f8 04           cmp    $0x4,%ax
821: 0f 84 17 02 00 00     je     a3e <sk_run_filter+0x280>
827: 77 29                 ja     852 <sk_run_filter+0x94>
829: 66 83 f8 01           cmp    $0x1,%ax
[...]

With the modification, the compiler translates the switch statement into
the following jump table fragment:

7ff: 66 83 3e 2c           cmpw   $0x2c,(%rsi)
803: 0f 87 1f 02 00 00     ja     a28 <sk_run_filter+0x26a>
809: 0f b7 06              movzwl (%rsi),%eax
80c: ff 24 c5 00 00 00 00  jmpq   *0x0(,%rax,8)
813: 44 89 e3              mov    %r12d,%ebx
816: e9 43 03 00 00        jmpq   b5e <sk_run_filter+0x3a0>
81b: 41 89 dc              mov    %ebx,%r12d
81e: e9 3b 03 00 00        jmpq   b5e <sk_run_filter+0x3a0>

Furthermore, I reordered the instructions to reduce cache line misses by
placing the most common instructions first.
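
The check-time rewrite plus table dispatch can be sketched in userspace as
below. This is a toy interpreter, not the kernel filter: the three opcodes,
the SPARSE_TO_DENSE mapping and the handler functions are hypothetical and
only show how sparse opcode values are translated once into dense indices
so that dispatch becomes a direct table lookup.

```python
# Hypothetical sparse opcode values standing in for OR'd class/mode bits.
OP_LOAD_IMM = 0x00
OP_ADD_IMM = 0x04
OP_RET_ACC = 0x16

# Dense, sequential codes assigned at check time.
SPARSE_TO_DENSE = {OP_LOAD_IMM: 0, OP_ADD_IMM: 1, OP_RET_ACC: 2}


def check_filter(program):
    """Validate once and rewrite each sparse opcode to its dense index."""
    checked = []
    for opcode, operand in program:
        if opcode not in SPARSE_TO_DENSE:
            raise ValueError(f"unknown opcode {opcode:#x}")
        checked.append((SPARSE_TO_DENSE[opcode], operand))
    return checked


def _load_imm(acc, k):
    return k, False


def _add_imm(acc, k):
    return (acc + k) & 0xFFFFFFFF, False


def _ret_acc(acc, k):
    return acc, True


# Indexed directly by dense opcode: one lookup instead of a compare chain.
JUMP_TABLE = [_load_imm, _add_imm, _ret_acc]


def run_filter(checked):
    acc = 0
    for code, k in checked:
        acc, done = JUMP_TABLE[code](acc, k)
        if done:
            return acc
    return 0
```

The translation cost is paid once when the filter is checked, while the
per-packet run path only indexes the table.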

Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agosfc: Log clearer error messages for hardware monitor
Ben Hutchings [Fri, 25 Jun 2010 07:06:29 +0000 (07:06 +0000)]
sfc: Log clearer error messages for hardware monitor

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agosfc: Use Toeplitz IPv4 hash for RSS and hash insertion
Ben Hutchings [Fri, 25 Jun 2010 07:05:56 +0000 (07:05 +0000)]
sfc: Use Toeplitz IPv4 hash for RSS and hash insertion

Insertion of the Falcon hash is unreliable.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agosfc: Move siena_nic_data::ipv6_rss_key to efx_nic::rx_hash_key
Ben Hutchings [Fri, 25 Jun 2010 07:05:43 +0000 (07:05 +0000)]
sfc: Move siena_nic_data::ipv6_rss_key to efx_nic::rx_hash_key

We will use this hash key for Toeplitz IPv4 hashing too.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agosfc: Fix reading of inserted hash
Ben Hutchings [Fri, 25 Jun 2010 07:05:33 +0000 (07:05 +0000)]
sfc: Fix reading of inserted hash

The hash appears immediately before the packet data, not at the
beginning of the buffer. This means we can easily use negative offsets
from the start of packet data, so adjust the data and length at the
top of __efx_rx_packet() instead of wherever we consume the hash.
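
Reading a value that sits immediately before the packet data amounts to a
negative offset from the data start, as in the sketch below. The 4-byte
length, the big-endian byte order and the split_rx_buffer() helper are
assumptions for illustration, not details taken from the driver.

```python
import struct

HASH_LEN = 4  # assumed size of the inserted hash, in bytes


def split_rx_buffer(buf, data_offset):
    """Return (hash, packet); the hash ends exactly where the packet starts."""
    if data_offset < HASH_LEN:
        raise ValueError("no room for a hash before the packet data")
    # Negative offset relative to the start of packet data.
    (rx_hash,) = struct.unpack_from(">I", buf, data_offset - HASH_LEN)
    return rx_hash, buf[data_offset:]
```

Any padding at the beginning of the buffer is ignored, because the hash is
located relative to the packet data and not to the buffer start.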

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoenic: Clean ups
Vasanthy Kolluri [Thu, 24 Jun 2010 10:52:26 +0000 (10:52 +0000)]
enic: Clean ups

1) Update copyright
2) Fix hardware queue descriptor field size CQ_ENET_RQ_DESC_FCOE_SOF_BITS
3) Include rtnetlink.h instead of if_link.h
4) Selectively flush writes to interrupt mask register
5) Use pci_enable_device_mem
6) Remove unused variables and header files
7) Fix size mismatch between memory alloc and free operations of a variable
8) Check for non null arguments to vic_provinfo_alloc

Signed-off-by: Scott Feldman <scofeldm@cisco.com>
Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com>
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoenic: Bug Fix: Handle surprise hardware removals
Vasanthy Kolluri [Thu, 24 Jun 2010 10:52:08 +0000 (10:52 +0000)]
enic: Bug Fix: Handle surprise hardware removals

Handle surprise hardware removals gracefully during devcmd issue and init,
cleanup of queues.

Signed-off-by: Scott Feldman <scofeldm@cisco.com>
Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com>
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoenic: Feature Add: Add loopback capability to enic devices
Vasanthy Kolluri [Thu, 24 Jun 2010 10:51:59 +0000 (10:51 +0000)]
enic: Feature Add: Add loopback capability to enic devices

Hardware has the loopback capability to queue the packets transmitted from
a device to the receive queue of the same device. enic now supports the
loopback capability.

Signed-off-by: Scott Feldman <scofeldm@cisco.com>
Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com>
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoenic: Use receive queue buffer blocks of 32/64 entries
Vasanthy Kolluri [Thu, 24 Jun 2010 10:51:51 +0000 (10:51 +0000)]
enic: Use receive queue buffer blocks of 32/64 entries

Change the receive queue buffer allocations into blocks of 32 entries when
ring size is less than 64, otherwise use 64 entries per block.
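
The sizing rule reads as a one-line function. rq_block_entries() mirrors the
rule stated above; rq_block_count() and its round-up behaviour are an added
assumption, and both names are hypothetical.

```python
def rq_block_entries(ring_size):
    """Entries per buffer block: 32 for rings below 64 entries, else 64."""
    return 32 if ring_size < 64 else 64


def rq_block_count(ring_size):
    """Blocks needed to cover the ring, rounding up (assumed behaviour)."""
    entries = rq_block_entries(ring_size)
    return (ring_size + entries - 1) // entries
```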

Signed-off-by: Scott Feldman <scofeldm@cisco.com>
Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com>
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoenic: Add new firmware devcmds
Vasanthy Kolluri [Thu, 24 Jun 2010 10:51:43 +0000 (10:51 +0000)]
enic: Add new firmware devcmds

Add new firmware devcmds - CMD_PROXY_BY_BDF, CMD_PACKET_FILTER_ALL,
CMD_ENABLE_WAIT.

Signed-off-by: Scott Feldman <scofeldm@cisco.com>
Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com>
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoenic: Use (netdev|dev|pr)_<level> macro helpers for logging
Vasanthy Kolluri [Thu, 24 Jun 2010 10:50:56 +0000 (10:50 +0000)]
enic: Use (netdev|dev|pr)_<level> macro helpers for logging

Replace all printk routines with the (netdev|dev|pr)_<level> macros that
provide verbose logs.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Scott Feldman <scofeldm@cisco.com>
Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com>
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoenic: Clean up: Add wrapper routines for firmware devcmd calls
Vasanthy Kolluri [Thu, 24 Jun 2010 10:50:12 +0000 (10:50 +0000)]
enic: Clean up: Add wrapper routines for firmware devcmd calls

Add wrapper routines that issue devcmds to firmware and ensure that a
devcmd lock is held for each devcmd call.
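
The wrapper pattern can be sketched in userspace with an ordinary mutex.
FakeVnic, its methods and the command names are hypothetical; the sketch
only shows callers going through a wrapper that takes the devcmd lock, with
the raw call refusing to run without it.

```python
import threading


class FakeVnic:
    """Hypothetical device whose firmware commands need a lock."""

    def __init__(self):
        self.devcmd_lock = threading.Lock()
        self.log = []

    def _raw_devcmd(self, cmd, arg):
        # The raw call must only ever run with the lock held.
        assert self.devcmd_lock.locked(), "devcmd issued without lock"
        self.log.append((cmd, arg))
        return 0

    def devcmd(self, cmd, arg=0):
        # Wrapper: acquire the lock around every firmware command.
        with self.devcmd_lock:
            return self._raw_devcmd(cmd, arg)

    def set_packet_filter(self, flags):
        return self.devcmd("PACKET_FILTER", flags)
```

Keeping the locking inside one wrapper means individual call sites cannot
forget it.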

Signed-off-by: Scott Feldman <scofeldm@cisco.com>
Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com>
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoenic: Use a lighter reset operation for enic devices
Vasanthy Kolluri [Thu, 24 Jun 2010 10:50:00 +0000 (10:50 +0000)]
enic: Use a lighter reset operation for enic devices

The port profile information for a dynamic enic device is set by the upper
layers, which are oblivious to the device reset operation. We do not want a
reset operation to erase the network state of a dynamic enic device, as there
is no way to set up the port profile information again. Hence a lighter
reset operation called hang reset is used. Hang reset, unlike soft reset,
does not reset the network state; it resets the host-side state only.

Signed-off-by: Scott Feldman <scofeldm@cisco.com>
Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com>
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoenic: Bug Fix: Change hardware ingress vlan rewrite mode
Vasanthy Kolluri [Thu, 24 Jun 2010 10:49:51 +0000 (10:49 +0000)]
enic: Bug Fix: Change hardware ingress vlan rewrite mode

The current ingress vlan rewrite mode setting lets the hardware strip off
the tag control information of a packet received on the native vlan. As a
result, the priority bits are also lost. The fix is to change the ingress
vlan rewrite mode setting such that the complete tag control information is
retained for packets that belong to the native vlan.
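
To show why stripping the tag control information loses the priority, here is a small model in C. The TCI layout (3 priority bits, 1 drop-eligible bit, 12 VLAN ID bits) follows IEEE 802.1Q; the two rewrite functions are hypothetical models of the old and new behaviour, not the hardware's actual modes.

```c
#include <stdint.h>

/* 802.1Q tag control information (TCI): PCP(3) | DEI(1) | VID(12). */
#define TCI_VID_MASK	0x0fffu
#define TCI_PCP_SHIFT	13

unsigned int tci_priority(uint16_t tci)
{
	return (unsigned int)tci >> TCI_PCP_SHIFT;
}

unsigned int tci_vlan_id(uint16_t tci)
{
	return tci & TCI_VID_MASK;
}

/* Models the old behaviour: the whole TCI is dropped on the native vlan. */
uint16_t rewrite_strip_native(uint16_t tci, uint16_t native_vid)
{
	return tci_vlan_id(tci) == native_vid ? 0 : tci;
}

/* Models the fixed behaviour: the TCI is passed through unchanged. */
uint16_t rewrite_keep_tci(uint16_t tci, uint16_t native_vid)
{
	(void)native_vid;	/* the native vlan no longer matters */
	return tci;
}
```

For a packet with priority 5 on native vlan 1 (TCI 0xa001), the first function yields priority 0 and the second keeps priority 5.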

Signed-off-by: Scott Feldman <scofeldm@cisco.com>
Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com>
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoenic: Feature Add: Replace LRO with GRO
Vasanthy Kolluri [Thu, 24 Jun 2010 10:49:25 +0000 (10:49 +0000)]
enic: Feature Add: Replace LRO with GRO

enic now uses the GRO mechanism instead of LRO to pass skbs to upper
layers.

Signed-off-by: Scott Feldman <scofeldm@cisco.com>
Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com>
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agocnic: Update version to 2.1.3.
Michael Chan [Thu, 24 Jun 2010 14:58:42 +0000 (14:58 +0000)]
cnic: Update version to 2.1.3.

Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agocnic: Further unify kcq handling code.
Michael Chan [Thu, 24 Jun 2010 14:58:41 +0000 (14:58 +0000)]
cnic: Further unify kcq handling code.

This eliminates some of the duplicate code for the various devices
that require the same basic kcq handling.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agocnic: Restructure kcq processing.
Michael Chan [Thu, 24 Jun 2010 14:58:40 +0000 (14:58 +0000)]
cnic: Restructure kcq processing.

By doing more work in the common function cnic_get_kcqes(), and
making full use of the kcq_info structure.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agocnic: Unify kcq allocation for all devices.
Michael Chan [Thu, 24 Jun 2010 14:58:39 +0000 (14:58 +0000)]
cnic: Unify kcq allocation for all devices.

By creating a common data structure kcq_info for all devices, the kcq
(kernel completion queue) for all devices can be allocated by common
code.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agocnic: Unify IRQ code for all hardware types.
Michael Chan [Thu, 24 Jun 2010 14:58:38 +0000 (14:58 +0000)]
cnic: Unify IRQ code for all hardware types.

By creating a common cnic_doirq().

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agocnic: Fine-tune CID memory space calculation.
Michael Chan [Thu, 24 Jun 2010 14:58:37 +0000 (14:58 +0000)]
cnic: Fine-tune CID memory space calculation.

The current code makes assumptions about the CID (context ID) memory
space and starting CID that may not be always correct when firmware
changes.  In particular, BNX2_ISCSI_START_CID may not always be fixed.
We now calculate cp->max_cid_space and cp->iscsi_start_cid dynamically
instead of using fixed constants.  The unused cp->max_iscsi_conn is also
eliminated.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agosfc: Record hardware RX hash on each skb where possible
Ben Hutchings [Wed, 23 Jun 2010 11:31:28 +0000 (11:31 +0000)]
sfc: Record hardware RX hash on each skb where possible

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agosfc: Disable setting feature flags that are not implemented
Ben Hutchings [Wed, 23 Jun 2010 11:30:35 +0000 (11:30 +0000)]
sfc: Disable setting feature flags that are not implemented

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agosfc: Replace EFX_DRIVER_NAME with KBUILD_MODNAME
Ben Hutchings [Wed, 23 Jun 2010 11:30:26 +0000 (11:30 +0000)]
sfc: Replace EFX_DRIVER_NAME with KBUILD_MODNAME

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agosfc: Implement message level control
Ben Hutchings [Wed, 23 Jun 2010 11:30:07 +0000 (11:30 +0000)]
sfc: Implement message level control

Replace EFX_ERR() with netif_err(), EFX_INFO() with netif_info(),
EFX_LOG() with netif_dbg() and EFX_TRACE() and EFX_REGDUMP() with
netif_vdbg().

Replace EFX_ERR_RL(), EFX_INFO_RL() and EFX_LOG_RL() using explicit
calls to net_ratelimit().

Implement the ethtool operations to get and set message level flags,
and add a 'debug' module parameter for the initial value.
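
As a rough userspace illustration of message level control, the sketch below filters messages through a bitmap, in the spirit of the netif_<level>() helpers. The struct and function names are hypothetical; the two class bits are assumed to mirror the kernel's NETIF_MSG_DRV and NETIF_MSG_LINK values.

```c
#include <stdint.h>
#include <stdio.h>

/* Two message classes; values assumed to match NETIF_MSG_DRV/_LINK. */
#define MSG_DRV		0x0001u
#define MSG_LINK	0x0004u

struct fake_nic {
	uint32_t msg_enable;	/* bitmap of enabled message classes */
	unsigned int printed;	/* counts messages that passed the filter */
};

/* Emit a message only if its class bit is set in msg_enable. */
void nic_log(struct fake_nic *nic, uint32_t msg_class, const char *text)
{
	if (!(nic->msg_enable & msg_class))
		return;
	nic->printed++;
	fprintf(stderr, "nic: %s\n", text);
}

/* Counterparts of the ethtool get/set message level operations. */
uint32_t nic_get_msglevel(const struct fake_nic *nic)
{
	return nic->msg_enable;
}

void nic_set_msglevel(struct fake_nic *nic, uint32_t level)
{
	nic->msg_enable = level;
}
```

In this model the initial msg_enable value plays the role of the 'debug' module parameter, and the get/set pair is what a tool would call at run time.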

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agosfc: Log MTD errors using partition name, not just net device name
Ben Hutchings [Wed, 23 Jun 2010 11:29:24 +0000 (11:29 +0000)]
sfc: Log MTD errors using partition name, not just net device name

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agosfc: Implement ethtool register dump operation
Ben Hutchings [Mon, 21 Jun 2010 03:06:53 +0000 (03:06 +0000)]
sfc: Implement ethtool register dump operation

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Acked-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agotcp: do not send reset to already closed sockets
Konstantin Khorenko [Fri, 25 Jun 2010 04:54:58 +0000 (21:54 -0700)]
tcp: do not send reset to already closed sockets

I've found that tcp_close() can be called for an already closed
socket, but still sends a reset in this case (tcp_send_active_reset()),
which seems to be incorrect.  Moreover, the reset packet is sent
with a different source port, as the original port number has already
been cleared on the socket.  Besides that, incrementing the stat counter
for LINUX_MIB_TCPABORTONCLOSE also does not look correct in this case.

Initially this issue was found on a 2.6.18-x RHEL5 kernel, but the same
seems to be true for the current mainstream kernel (checked on
2.6.35-rc3).  Please correct me if I missed something.

How that happens:

1) the server receives a packet for socket in TCP_CLOSE_WAIT state
   that triggers a tcp_reset():

Call Trace:
 <IRQ>  [<ffffffff8025b9b9>] tcp_reset+0x12f/0x1e8
 [<ffffffff80046125>] tcp_rcv_state_process+0x1c0/0xa08
 [<ffffffff8003eb22>] tcp_v4_do_rcv+0x310/0x37a
 [<ffffffff80028bea>] tcp_v4_rcv+0x74d/0xb43
 [<ffffffff8024ef4c>] ip_local_deliver_finish+0x0/0x259
 [<ffffffff80037131>] ip_local_deliver+0x200/0x2f4
 [<ffffffff8003843c>] ip_rcv+0x64c/0x69f
 [<ffffffff80021d89>] netif_receive_skb+0x4c4/0x4fa
 [<ffffffff80032eca>] process_backlog+0x90/0xec
 [<ffffffff8000cc50>] net_rx_action+0xbb/0x1f1
 [<ffffffff80012d3a>] __do_softirq+0xf5/0x1ce
 [<ffffffff8001147a>] handle_IRQ_event+0x56/0xb0
 [<ffffffff8006334c>] call_softirq+0x1c/0x28
 [<ffffffff80070476>] do_softirq+0x2c/0x85
 [<ffffffff80070441>] do_IRQ+0x149/0x152
 [<ffffffff80062665>] ret_from_intr+0x0/0xa
 <EOI>  [<ffffffff80008a2e>] __handle_mm_fault+0x6cd/0x1303
 [<ffffffff80008903>] __handle_mm_fault+0x5a2/0x1303
 [<ffffffff80033a9d>] cache_free_debugcheck+0x21f/0x22e
 [<ffffffff8006a263>] do_page_fault+0x49a/0x7dc
 [<ffffffff80066487>] thread_return+0x89/0x174
 [<ffffffff800c5aee>] audit_syscall_exit+0x341/0x35c
 [<ffffffff80062e39>] error_exit+0x0/0x84

tcp_rcv_state_process()
...  // (sk_state == TCP_CLOSE_WAIT here)
...
        /* step 2: check RST bit */
        if(th->rst) {
                tcp_reset(sk);
                goto discard;
        }
...
---------------------------------
tcp_rcv_state_process
 tcp_reset
  tcp_done
   tcp_set_state(sk, TCP_CLOSE);
     inet_put_port
      __inet_put_port
       inet_sk(sk)->num = 0;

   sk->sk_shutdown = SHUTDOWN_MASK;

2) After that the process (socket owner) tries to write something to
   that socket and "inet_autobind" sets a _new_ (which differs from
   the original!) port number for the socket:

 Call Trace:
  [<ffffffff80255a12>] inet_bind_hash+0x33/0x5f
  [<ffffffff80257180>] inet_csk_get_port+0x216/0x268
  [<ffffffff8026bcc9>] inet_autobind+0x22/0x8f
  [<ffffffff80049140>] inet_sendmsg+0x27/0x57
  [<ffffffff8003a9d9>] do_sock_write+0xae/0xea
  [<ffffffff80226ac7>] sock_writev+0xdc/0xf6
  [<ffffffff800680c7>] _spin_lock_irqsave+0x9/0xe
  [<ffffffff8001fb49>] __pollwait+0x0/0xdd
  [<ffffffff8008d533>] default_wake_function+0x0/0xe
  [<ffffffff800a4f10>] autoremove_wake_function+0x0/0x2e
  [<ffffffff800f0b49>] do_readv_writev+0x163/0x274
  [<ffffffff80066538>] thread_return+0x13a/0x174
  [<ffffffff800145d8>] tcp_poll+0x0/0x1c9
  [<ffffffff800c56d3>] audit_syscall_entry+0x180/0x1b3
  [<ffffffff800f0dd0>] sys_writev+0x49/0xe4
  [<ffffffff800622dd>] tracesys+0xd5/0xe0

3) sendmsg fails at last with -EPIPE (=> 'write' returns -EPIPE in userspace):

F: tcp_sendmsg1 -EPIPE: sk=ffff81000bda00d0, sport=49847, old_state=7, new_state=7, sk_err=0, sk_shutdown=3

Call Trace:
 [<ffffffff80027557>] tcp_sendmsg+0xcb/0xe87
 [<ffffffff80033300>] release_sock+0x10/0xae
 [<ffffffff8016f20f>] vgacon_cursor+0x0/0x1a7
 [<ffffffff8026bd32>] inet_autobind+0x8b/0x8f
 [<ffffffff8003a9d9>] do_sock_write+0xae/0xea
 [<ffffffff80226ac7>] sock_writev+0xdc/0xf6
 [<ffffffff800680c7>] _spin_lock_irqsave+0x9/0xe
 [<ffffffff8001fb49>] __pollwait+0x0/0xdd
 [<ffffffff8008d533>] default_wake_function+0x0/0xe
 [<ffffffff800a4f10>] autoremove_wake_function+0x0/0x2e
 [<ffffffff800f0b49>] do_readv_writev+0x163/0x274
 [<ffffffff80066538>] thread_return+0x13a/0x174
 [<ffffffff800145d8>] tcp_poll+0x0/0x1c9
 [<ffffffff800c56d3>] audit_syscall_entry+0x180/0x1b3
 [<ffffffff800f0dd0>] sys_writev+0x49/0xe4
 [<ffffffff800622dd>] tracesys+0xd5/0xe0

tcp_sendmsg()
...
        /* Wait for a connection to finish. */
        if ((1 << sk->sk_state) & ~(TCPF_ESTABLISHED | TCPF_CLOSE_WAIT)) {
                int old_state = sk->sk_state;
                if ((err = sk_stream_wait_connect(sk, &timeo)) != 0) {
if (f_d && (err == -EPIPE)) {
        printk("F: tcp_sendmsg1 -EPIPE: sk=%p, sport=%u, old_state=%d, new_state=%d, "
                "sk_err=%d, sk_shutdown=%d\n",
                sk, ntohs(inet_sk(sk)->sport), old_state, sk->sk_state,
                sk->sk_err, sk->sk_shutdown);
        dump_stack();
}
                        goto out_err;
                }
        }
...

4) Then the process (socket owner) decides that it is time to close
   that socket and does so (thus triggering the reset packet):

Call Trace:
...
 [<ffffffff80032077>] dev_queue_xmit+0x343/0x3d6
 [<ffffffff80034698>] ip_output+0x351/0x384
 [<ffffffff80251ae9>] dst_output+0x0/0xe
 [<ffffffff80036ec6>] ip_queue_xmit+0x567/0x5d2
 [<ffffffff80095700>] vprintk+0x21/0x33
 [<ffffffff800070f0>] check_poison_obj+0x2e/0x206
 [<ffffffff80013587>] poison_obj+0x36/0x45
 [<ffffffff8025dea6>] tcp_send_active_reset+0x15/0x14d
 [<ffffffff80023481>] dbg_redzone1+0x1c/0x25
 [<ffffffff8025dea6>] tcp_send_active_reset+0x15/0x14d
 [<ffffffff8000ca94>] cache_alloc_debugcheck_after+0x189/0x1c8
 [<ffffffff80023405>] tcp_transmit_skb+0x764/0x786
 [<ffffffff8025df8a>] tcp_send_active_reset+0xf9/0x14d
 [<ffffffff80258ff1>] tcp_close+0x39a/0x960
 [<ffffffff8026be12>] inet_release+0x69/0x80
 [<ffffffff80059b31>] sock_release+0x4f/0xcf
 [<ffffffff80059d4c>] sock_close+0x2c/0x30
 [<ffffffff800133c9>] __fput+0xac/0x197
 [<ffffffff800252bc>] filp_close+0x59/0x61
 [<ffffffff8001eff6>] sys_close+0x85/0xc7
 [<ffffffff800622dd>] tracesys+0xd5/0xe0

So, in brief:

* a received packet for a socket in TCP_CLOSE_WAIT state triggers
  tcp_reset(), which clears inet_sk(sk)->num and puts the socket into
  TCP_CLOSE state

* an attempt to write to that socket forces inet_autobind() to get a
  new port (but the write itself fails with -EPIPE)

* tcp_close() called for a socket in TCP_CLOSE state sends an active
  reset via the socket with the newly allocated port

This adds an additional check in tcp_close() for already closed
sockets. We do not want to send anything to closed sockets.
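
The sequence can be replayed with a reduced userspace model. This is only a sketch: the struct, the state names and the unread_data shortcut are hypothetical simplifications, and the check_closed flag merely switches the described check on or off for comparison.

```c
#include <stdbool.h>

/* Reduced model of a TCP socket; only the fields the scenario needs. */
enum model_state { MODEL_ESTABLISHED, MODEL_CLOSE_WAIT, MODEL_CLOSE };

struct model_sock {
	enum model_state state;
	unsigned int src_port;		/* 0 means the port was released */
	bool unread_data;		/* one reason close() sends a reset */
	unsigned int resets_sent;
};

/* An incoming RST moves the socket to CLOSE and drops its port. */
void model_rcv_reset(struct model_sock *sk)
{
	sk->state = MODEL_CLOSE;
	sk->src_port = 0;
}

/* A write autobinds a fresh port, then fails (-EPIPE in the kernel). */
int model_write(struct model_sock *sk, unsigned int new_port)
{
	if (sk->src_port == 0)
		sk->src_port = new_port;
	return sk->state == MODEL_CLOSE ? -1 : 0;
}

/* close(), optionally with the check for already closed sockets. */
void model_close(struct model_sock *sk, bool check_closed)
{
	if (check_closed && sk->state == MODEL_CLOSE)
		return;			/* send nothing on a closed socket */
	if (sk->unread_data)
		sk->resets_sent++;	/* models tcp_send_active_reset() */
	sk->state = MODEL_CLOSE;
}
```

Without the check the model sends one reset from the newly bound port; with the check it sends none.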

Signed-off-by: Konstantin Khorenko <khorenko@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agobroadcom: Add 5241 support
Dmitry Baryshkov [Wed, 16 Jun 2010 23:02:24 +0000 (23:02 +0000)]
broadcom: Add 5241 support

This patch adds the 5241 PHY ID to the broadcom module.

Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agobroadcom: move all PHY_ID's to header
Dmitry Baryshkov [Wed, 16 Jun 2010 23:02:23 +0000 (23:02 +0000)]
broadcom: move all PHY_ID's to header

Move all PHY IDs to brcmphy.h header for completeness and unification of code.

Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agonet: fix "netpoll: Allow netpoll_setup/cleanup recursion"
Andrew Morton [Fri, 25 Jun 2010 03:33:04 +0000 (20:33 -0700)]
net: fix "netpoll: Allow netpoll_setup/cleanup recursion"

Remove rtnl_unlock() which had no corresponding rtnl_lock().

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoMerge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
David S. Miller [Thu, 24 Jun 2010 01:26:27 +0000 (18:26 -0700)]
Merge branch 'master' of /linux/kernel/git/davem/net-2.6

Conflicts:
net/ipv4/ip_output.c

14 years agosky2: enable rx/tx in sky2_phy_reinit()
Brandon Philips [Wed, 16 Jun 2010 16:21:58 +0000 (16:21 +0000)]
sky2: enable rx/tx in sky2_phy_reinit()

sky2_phy_reinit is called by the ethtool helpers sky2_set_settings,
sky2_nway_reset and sky2_set_pauseparam when netif_running.

However, at the end of sky2_phy_init, GM_GP_CTRL has GM_GPCR_RX_ENA and
GM_GPCR_TX_ENA cleared. So running these commands causes the device to
stop working:

$ ethtool -r eth0
$ ethtool -A eth0 autoneg off

Fix this issue by enabling Rx/Tx after running sky2_phy_init in
sky2_phy_reinit.
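
A toy model of the register handling, for illustration only. The bit positions are assumptions made for this example, and the functions are hypothetical stand-ins for the driver routines.

```c
#include <stdint.h>

/* Assumed bit positions; consult the driver header for the real ones. */
#define GPCR_RX_ENA	(1u << 11)
#define GPCR_TX_ENA	(1u << 12)

struct model_port {
	uint16_t gp_ctrl;	/* models the GM_GP_CTRL register */
};

/* Models PHY init: it finishes with Rx and Tx disabled. */
void model_phy_init(struct model_port *p)
{
	p->gp_ctrl &= (uint16_t)~(GPCR_RX_ENA | GPCR_TX_ENA);
}

/* Models the fixed reinit: re-enable Rx/Tx after PHY init. */
void model_phy_reinit(struct model_port *p)
{
	model_phy_init(p);
	p->gp_ctrl |= (uint16_t)(GPCR_RX_ENA | GPCR_TX_ENA);
}
```

After model_phy_init() alone both enable bits are clear, which corresponds to the dead interface; model_phy_reinit() leaves both set.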

Signed-off-by: Brandon Philips <bphilips@suse.de>
Tested-by: Brandon Philips <bphilips@suse.de>
Cc: stable@kernel.org
Tested-by: Mike McCormack <mikem@ring3k.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agonet - IP_NODEFRAG option for IPv4 socket
Jiri Olsa [Tue, 15 Jun 2010 01:07:31 +0000 (01:07 +0000)]
net - IP_NODEFRAG option for IPv4 socket

This patch implements the IP_NODEFRAG option for IPv4 sockets.
The reason is that there is no other way to send out a packet with a
user-customized header of the reassembly part.

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoqeth: handle missing z/VM authorization of OSX
Ursula Braun [Mon, 21 Jun 2010 22:57:12 +0000 (22:57 +0000)]
qeth: handle missing z/VM authorization of OSX

For z/VM guest operating systems, OSX CHPIDs can only be used if the
LPAR and z/VM userID are explicitly authorized through the Service
Element. Issue a message if this SE authorization is missing.

Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoqeth: specify correct function level for OSN devices
Ursula Braun [Mon, 21 Jun 2010 22:57:11 +0000 (22:57 +0000)]
qeth: specify correct function level for OSN devices

OSN devices use the same function level as OSD devices. This patch
adds OSN-devices to the initialization function for func_level.

Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoqeth: fix page breaks in hw headers
Frank Blaschka [Mon, 21 Jun 2010 22:57:10 +0000 (22:57 +0000)]
qeth: fix page breaks in hw headers

Turning on memory debugging showed there could be page breaks in
hardware headers. OSA does not allow this so we had to add code
to bounce the header in case there is a page break. This patch also
fixes a problem in case the skb->data part of a fragmented skb
spans multiple pages.
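
The page-break condition can be expressed as a small check. This is a sketch under assumptions: a 4096-byte page size and a hypothetical helper name; the driver's actual test may look different.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096u	/* assumed page size for the example */

/* True if the byte range [start, start + len) touches more than one page. */
bool crosses_page(uintptr_t start, size_t len)
{
	if (len == 0)
		return false;
	/* Compare the page index of the first and of the last byte. */
	return (start / MODEL_PAGE_SIZE) !=
	       ((start + len - 1) / MODEL_PAGE_SIZE);
}
```

A 10-byte header starting at offset 4090 crosses a page and would have to be bounced into a buffer that does not; a 32-byte header starting at offset 4096 does not cross.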

Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoqeth: fix use after free for qeths debug area
Carsten Otte [Mon, 21 Jun 2010 22:57:09 +0000 (22:57 +0000)]
qeth: fix use after free for qeths debug area

The function qeth_free_buffer_pool is called _after_ the per-card
debug area has been released. This debug message is not all that
useful anyway, and thus gets removed.

Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoqeth: Fold qeth_qerr debug area
Carsten Otte [Mon, 21 Jun 2010 22:57:08 +0000 (22:57 +0000)]
qeth: Fold qeth_qerr debug area

This patch removes the qerr debug area. Most info that goes in here is already
logged to the card's local debug area; those duplicates are removed. All other
elements are moved to the card's local debug area.

Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoqeth: Fold qeth_misc debug area
Carsten Otte [Mon, 21 Jun 2010 22:57:07 +0000 (22:57 +0000)]
qeth: Fold qeth_misc debug area

This patch removes the misc debug area. Instead of logging the entire skb
we just log a pointer to it into the card's local debug area in
qeth_core_get_next_skb. Other than that, this debug area is not used anywhere.

Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoqeth: Fold qeth_sense debug area
Carsten Otte [Mon, 21 Jun 2010 22:57:06 +0000 (22:57 +0000)]
qeth: Fold qeth_sense debug area

This patch removes the sense debug area completely. Despite the name this
debug area makes no sense at all because it's unused completely. Ouch.

Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoqeth: Fold qeth_trace debug area
Carsten Otte [Mon, 21 Jun 2010 22:57:05 +0000 (22:57 +0000)]
qeth: Fold qeth_trace debug area

This patch removes the qeth_trace debug area. All relevant data is logged into
either qeth_setup or into each card's own debug area. Superfluous information
(such as the card number when logging into the card's own debug area) is
removed without replacement.

Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoqeth: Add new s390 debug feature for each qeth card
Carsten Otte [Mon, 21 Jun 2010 22:57:04 +0000 (22:57 +0000)]
qeth: Add new s390 debug feature for each qeth card

This patch adds a debug area for each qeth card. This debug area will replace
various other debug areas that are global for all cards handled by the device
driver. On crash dump analysis this makes life easier when trying to find out
what's going on with an interface. Also, the forest of debug areas for this
device driver is significantly cleared up.

Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoqeth: Rework qeth_dbf_longtext
Carsten Otte [Mon, 21 Jun 2010 22:57:03 +0000 (22:57 +0000)]
qeth: Rework qeth_dbf_longtext

This patch decouples qeth_dbf_longtext from qeth's static debug array. The
function only uses one member anyway.

Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agosmsgiucv: guarantee single iucv connect in thaw
Ursula Braun [Mon, 21 Jun 2010 22:57:02 +0000 (22:57 +0000)]
smsgiucv: guarantee single iucv connect in thaw

If another smsgiucv_app device exists, suspend / resume fails with
iucv path list corruption, because the same iucv_path_connect is
called twice.
The patch introduces a flag to save connect status of the smsgiucv
path to make sure iucv_path_connect in smsg_pm_restore_thaw is
called only once.
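
The guard can be pictured with a small model. The struct and function names are hypothetical, and the counter merely stands in for the real connect call.

```c
#include <stdbool.h>

struct model_path {
	bool connected;		/* the connect status flag */
	unsigned int connects;	/* how often the real connect ran */
};

/* Models suspend: the path is severed, so the flag is cleared. */
void model_freeze(struct model_path *p)
{
	p->connected = false;
}

/* Models thaw/restore: connect only if not connected already. */
int model_thaw(struct model_path *p)
{
	if (p->connected)
		return 0;	/* second caller: nothing to do */
	p->connects++;		/* stands in for iucv_path_connect() */
	p->connected = true;
	return 0;
}
```

Calling model_thaw() twice after one model_freeze() performs a single connect, which is the property the flag is meant to guarantee.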

Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoqlcnic: update version to 5.0.6
Amit Kumar Salecha [Tue, 22 Jun 2010 03:19:05 +0000 (03:19 +0000)]
qlcnic: update version to 5.0.6

Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoqlcnic: mark context state freed after destroy
Amit Kumar Salecha [Tue, 22 Jun 2010 03:19:04 +0000 (03:19 +0000)]
qlcnic: mark context state freed after destroy

After destroying the recv ctx, the context state remains unchanged.
Fix it by marking it as FREED.

Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoqlcnic: offload tx timeout recovery
Amit Kumar Salecha [Tue, 22 Jun 2010 03:19:03 +0000 (03:19 +0000)]
qlcnic: offload tx timeout recovery

Offload tx timeout recovery to the fw recovery function (check_health).
In check_health, first check the health of the device; if it is ok,
do tx timeout recovery, otherwise do device recovery.
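
The decision described here reduces to a small function. The enum and function names are hypothetical and only model the order of the checks.

```c
#include <stdbool.h>

enum model_action { ACT_NONE, ACT_TX_RECOVERY, ACT_DEVICE_RECOVERY };

/* Device health is checked first; tx recovery only on a healthy device. */
enum model_action model_check_health(bool device_ok, bool tx_timed_out)
{
	if (!device_ok)
		return ACT_DEVICE_RECOVERY;
	return tx_timed_out ? ACT_TX_RECOVERY : ACT_NONE;
}
```

A tx timeout on an unhealthy device therefore leads to full device recovery rather than tx timeout recovery.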

Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoqlcnic: dont free host resources during fw recovery
Amit Kumar Salecha [Tue, 22 Jun 2010 03:19:02 +0000 (03:19 +0000)]
qlcnic: dont free host resources during fw recovery

There is no need to free/alloc host resources during firmware
recovery.

Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoqlcnic: release device resources during interface down
Amit Kumar Salecha [Tue, 22 Jun 2010 03:19:01 +0000 (03:19 +0000)]
qlcnic: release device resources during interface down

Previously we allocated device resources during probe and
released them during remove.
Now allocate them during interface up and release them during interface down.
This helps device performance, as the device doesn't need to keep
track of inactive resources.

Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>