openwrt/staging/blogic.git
10 years ago doc: update driver TX algorithm in timestamping.txt
Jakub Kicinski [Sun, 16 Mar 2014 19:32:48 +0000 (20:32 +0100)]
doc: update driver TX algorithm in timestamping.txt

Since cd4d8fdad1f1 ("net: kernel panic in dev_hard_start_xmit:
remove faulty software TX time stamping") dev_hard_start_xmit()
will not provide software timestamps. It is the responsibility of
the drivers to call skb_tx_timestamp() at the right time.

Cc: linux-doc@vger.kernel.org
Signed-off-by: Jakub Kicinski <kubakici@wp.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years ago bonding: ratelimit pr_warn()s in 802.3ad mode
Veaceslav Falico [Sun, 16 Mar 2014 16:55:03 +0000 (17:55 +0100)]
bonding: ratelimit pr_warn()s in 802.3ad mode

Only ratelimit the ones that might spam, omitting the ones from
enslave/deslave.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years ago net: sched: use no more than one page in struct fw_head
Eric Dumazet [Tue, 18 Mar 2014 03:20:49 +0000 (20:20 -0700)]
net: sched: use no more than one page in struct fw_head

In commit b4e9b520ca5d ("[NET_SCHED]: Add mask support to fwmark
classifier") Patrick added a u32 field in fw_head, making it slightly
bigger than one page.

Let's use 256 slots to make fw_hash() more straightforward, and move
@mask to the beginning of the structure as we often use a small number
of skb->mark. @mask and first hash buckets share the same cache line.

This brings back the memory usage to less than 4000 bytes, and permits
John to add a rcu_head at the end of the structure later without any
worry.
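
The layout described above can be sketched roughly as follows. This is a
userspace illustration and not the patch itself: struct fw_filter is left
opaque, u32 is modelled with uint32_t, and the folding done in fw_hash() is
an assumption made for the example.

```c
#include <stddef.h>
#include <stdint.h>

#define HTSIZE 256                 /* number of hash slots, as in the text above */

struct fw_filter;                  /* opaque in this sketch; only pointers are stored */

struct fw_head {
	uint32_t mask;                 /* placed first so it sits next to the first buckets */
	struct fw_filter *ht[HTSIZE];  /* one chain head per slot */
};

/* Assumed folding of a 32-bit handle into the range 0..HTSIZE-1. */
static uint32_t fw_hash(uint32_t handle)
{
	handle ^= handle >> 16;
	handle ^= handle >> 8;
	return handle % HTSIZE;
}

/* Compile-time checks of the two properties the commit message relies on. */
_Static_assert(sizeof(struct fw_head) < 4000, "fw_head must stay below 4000 bytes");
_Static_assert(offsetof(struct fw_head, mask) == 0, "mask must be the first member");
```

With 8-byte pointers this structure occupies 8 + 256 * 8 = 2056 bytes, which
leaves room for an additional member at the end.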

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Thomas Graf <tgraf@suug.ch>
Cc: John Fastabend <john.fastabend@gmail.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years ago Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec...
David S. Miller [Tue, 18 Mar 2014 18:09:07 +0000 (14:09 -0400)]
Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next

Steffen Klassert says:

====================
One patch to rename a newly introduced struct. The rest is
the rework of the IPsec virtual tunnel interface for ipv6 to
support inter address family tunneling and namespace crossing.

1) Rename the newly introduced struct xfrm_filter to avoid a
   conflict with iproute2. From Nicolas Dichtel.

2) Introduce xfrm_input_afinfo to access the address family
   dependent tunnel callback functions properly.

3) Add and use an IPsec protocol multiplexer for ipv6.

4) Remove dst_entry caching. vti can look up multiple different
   dst entries, depending on the configured xfrm states. Therefore
   it does not make sense to cache a dst_entry.

5) Remove caching of flow information. vti6 does not use the
   tunnel endpoint addresses to do route and xfrm lookups.

6) Update the vti6 to use its own receive hook.

7) Remove the now unused xfrm_tunnel_notifier. This was used from vti
   and is replaced by the IPsec protocol multiplexer hooks.

8) Support inter address family tunneling for vti6.

9) Check if the tunnel endpoints of the xfrm state and the vti interface
   are matching and return an error otherwise.

10) Enable namespace crossing for vti devices.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
10 years ago net/i40e: Avoid double setting of NETIF_F_SG for the HW encapsulation feature mask
Or Gerlitz [Tue, 18 Mar 2014 08:36:45 +0000 (10:36 +0200)]
net/i40e: Avoid double setting of NETIF_F_SG for the HW encapsulation feature mask

The networking core does it for the driver during registration time.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years ago i40evf: Rename i40e_ptype_lookup i40evf_ptype_lookup
Eric W Biederman [Tue, 18 Mar 2014 07:26:50 +0000 (00:26 -0700)]
i40evf: Rename i40e_ptype_lookup i40evf_ptype_lookup

When compiling the i40e and the i40evf driver into the same kernel I get:
LD      drivers/net/ethernet/intel/built-in.o
drivers/net/ethernet/intel/i40evf/built-in.o:(.data+0x300): multiple definition of `i40e_ptype_lookup'
drivers/net/ethernet/intel/i40e/built-in.o:(.data+0x780): first defined here
make[3]: *** [drivers/net/ethernet/intel/built-in.o] Error 1
make[2]: *** [drivers/net/ethernet/intel] Error 2
make[1]: *** [drivers/net/ethernet/] Error 2
make: *** [sub-make] Error 2

Fix this by renaming the i40evf version of this structure from
i40e_ptype_lookup to i40evf_ptype_lookup.

This build failure was introduced in:
  commit 206812b5fccb808d1194344eaa942f68f59b2630
  Author: Jesse Brandeburg <jesse.brandeburg@intel.com>
  i40e/i40evf: i40e implementation for skb_set_hash

Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Cc: Catherine Sullivan <catherine.sullivan@intel.com>
Signed-off-by: Eric W Biederman <ebiederm@xmission.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years ago e1000e: fix the build error when PM is disabled
Kevin Hao [Tue, 18 Mar 2014 07:26:49 +0000 (00:26 -0700)]
e1000e: fix the build error when PM is disabled

The commit 2800209994f8 (e1000e: Refactor PM flows) changed the
SET_SYSTEM_SLEEP_PM_OPS to open-coded assignments, but forgot to
protect them with CONFIG_PM_SLEEP. This causes the following build
error when PM is disabled:
drivers/net/ethernet/intel/e1000e/netdev.c:7079:13:
error: 'e1000e_pm_suspend' undeclared here (not in a function)
  .suspend = e1000e_pm_suspend,
             ^
drivers/net/ethernet/intel/e1000e/netdev.c:7080:13:
error: 'e1000e_pm_resume' undeclared here (not in a function)
  .resume  = e1000e_pm_resume,
             ^
drivers/net/ethernet/intel/e1000e/netdev.c:7082:11:
error: 'e1000e_pm_thaw' undeclared here (not in a function)
  .thaw  = e1000e_pm_thaw,
           ^
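
The kind of guard described above can be sketched in userspace as follows.
All names here (struct demo_pm_ops, the demo_pm_* callbacks) are hypothetical
stand-ins, not the driver's code; the point is only that callbacks which are
compiled conditionally must be referenced under the same condition.

```c
#include <stddef.h>

/* Stand-in for a PM callback table; only three callbacks are modelled. */
struct demo_pm_ops {
	int (*suspend)(void *dev);
	int (*resume)(void *dev);
	int (*thaw)(void *dev);
};

/* Build with -DCONFIG_PM_SLEEP to model a kernel with sleep support enabled. */
#ifdef CONFIG_PM_SLEEP
static int demo_pm_suspend(void *dev) { (void)dev; return 0; }
static int demo_pm_resume(void *dev)  { (void)dev; return 0; }
static int demo_pm_thaw(void *dev)    { (void)dev; return 0; }
#endif

static const struct demo_pm_ops demo_ops = {
#ifdef CONFIG_PM_SLEEP
	/* The callbacks only exist in this configuration, so guard the references too. */
	.suspend = demo_pm_suspend,
	.resume  = demo_pm_resume,
	.thaw    = demo_pm_thaw,
#else
	.suspend = NULL,   /* keeps the initializer non-empty for strict ISO C */
#endif
};

/* Reports whether the sleep callbacks were compiled in. */
static int demo_has_sleep_ops(void)
{
	return demo_ops.suspend != NULL;
}
```

Dropping the #ifdef around the three assignments while leaving the one around
the function definitions reproduces the "undeclared here" class of error shown
above when CONFIG_PM_SLEEP is not defined.
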
Signed-off-by: Kevin Hao <haokexin@gmail.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years ago igb: remove references to long gone command line parameters
Fernando Luis Vazquez Cao [Tue, 18 Mar 2014 07:26:48 +0000 (00:26 -0700)]
igb: remove references to long gone command line parameters

Command line parameters QueuePairs, Node, EEE, DMAC and InterruptThrottleRate
do not exist these days. Remove all references to them in the Documentation
folder and update code comments.

Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years ago Merge branch 'altera_tse'
David S. Miller [Tue, 18 Mar 2014 01:37:25 +0000 (21:37 -0400)]
Merge branch 'altera_tse'

Vince Bridgers says:

====================
Altera Triple Speed Ethernet (TSE) Driver

This is the version 6 submission for the Altera Triple Speed Ethernet (TSE)
driver. All comments received during the version 2, 3, 4, and 5 submissions
have been accepted. Please find the change log and a description of the
submission below.

If you find the submission acceptable, please consider this patch set for
inclusion into the Linux kernel.

V6: Address comments from V5 review
    - add call to skb_tx_timestamp in the driver's transmit path
    - correct use of unsigned int where it was cast to pointer. Use types
      appropriate for intended and correct use to let the compiler warn us
      when type usage is incorrect.
    - use correct semantics for pointer arithmetic in same code path

V5: Address comments from V4 review
    - Add descriptions of statistics to driver documentation. The statistics
      supported by the driver/controller map to IEEE and RFC statistics, and
      the names and mappings are described in the user documentation.
    - Change "unsigned int" to u32 in device structure definitions
    - Change use of netdev_warn to netif_warn in altera_sgdma.c
    - Change stat name rx_fifo_drops to ether_drops to match the event
      actually counted by the hardware.

V4: Address comments from V3 review
    - Change statistics names in ethtool module to follow common use in
      other ethernet drivers.
    - remove an unnecessary case in ethtool module
    - change logging to use netdev_* where possible instead of dev_*
    - remove logging for OOM errors since those are already logged

V3: Address comments from V2 review
    - Reorder patch submission so that net/ethernet Makefile and Kconfig
      are committed last, thus not breaking bisect
    - Use of_get_mac_address instead of of_get_property
    - Change supplemental and hash configuration bindings to boolean/empty,
      and more meaningful names
    - Add check for failure from calls to of_phy_connect and
      connect_local_phy
    - Correct code to find mdio child node
    - Update bindings document
    - Remove cast to u64 when not necessary
    - add use of const for statistics strings

V2: Address comments from initial RFC review.
    - The driver files were broken up by major sections of functionality.
      These include MSGDMA, SGDMA, Misc, and Main.
    - Add patch for MAINTAINERS file, add the maintainer for this submission
    - Use 32-bit lower/upper physical address accessor functions so the driver
      is 64-bit ready.
    - Use standard bindings where applicable. Especially phy-addr, and change
      "altr,rx-fifo-depth" to "rx-fifo-depth" and "altr,tx-fifo-depth" to
      "tx-fifo-depth".
    - Add use of max-frame-size property
    - Update bindings documents accordingly
    - Correct interrupt handler to use budget parameter in the conventional way
    - Use macros consistently to define bit fields across files
    - Correct include exclusion macro in altera_msgdmahw.h (typo)
    - Remove use of barriers, these were not necessary since the DMA APIs
      ensure memory & buffer consistency
    - Remove use of netif_carrier_off in driver
    - move probing of phy from the open function to the probe function
    - use of_get_phy_mode instead of custom function
    - Use the .data field in the device structure to obtain a pointer
      to SGDMA or MSGDMA device specific properties and functions.
    - remove custom function to access devicetree since Altera specific
      bindings requiring its use have been deprecated in favor of
      standard bindings.

The Altera TSE is a 10/100/1000 Mbps Ethernet soft IP component that can be
configured and synthesized using Quartus, and programmed into Altera FPGAs.
Two types of soft DMA IP components are supported by this driver - the Altera
SGDMA and the MSGDMA. The MSGDMA DMA component is preferred over the SGDMA,
since the SGDMA will be deprecated in favor of the MSGDMA. Software supporting
both is provided for customers still using the SGDMA and to demonstrate how
multiple types of DMA engines may be supported by the TSE driver in the event
customers wish to develop their own custom soft DMA engine for particular
applications.

The design has been tested on Altera's Cyclone 4, 5, and Cyclone 5 SOC
development kits using an ARM A9 processor and an Altera NIOS2 processor.
Differences in CPU/DMA coherency management and address alignment are
addressed by proper use of driver APIs and semantics.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
10 years ago net: ethernet: Change Ethernet Makefile and Kconfig for Altera TSE driver
Vince Bridgers [Mon, 17 Mar 2014 22:52:41 +0000 (17:52 -0500)]
net: ethernet: Change Ethernet Makefile and Kconfig for Altera TSE driver

This patch changes the Ethernet Makefile and Kconfig files to add the Altera
Ethernet driver component.

Signed-off-by: Vince Bridgers <vbridgers2013@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years ago MAINTAINERS: Add entry for Altera Triple Speed Ethernet Driver
Vince Bridgers [Mon, 17 Mar 2014 22:52:40 +0000 (17:52 -0500)]
MAINTAINERS: Add entry for Altera Triple Speed Ethernet Driver

Add a MAINTAINERS entry covering the Altera Triple Speed
Ethernet Driver, with support for the MSGDMA and SGDMA
soft DMA IP components.

Signed-off-by: Vince Bridgers <vbridgers2013@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years ago Altera TSE: Add Altera Ethernet Driver Makefile and Kconfig
Vince Bridgers [Mon, 17 Mar 2014 22:52:39 +0000 (17:52 -0500)]
Altera TSE: Add Altera Ethernet Driver Makefile and Kconfig

This patch adds the Altera Triple Speed Ethernet Makefile and
Kconfig file.

Signed-off-by: Vince Bridgers <vbridgers2013@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years ago Altera TSE: Add main and header file for Altera Ethernet Driver
Vince Bridgers [Mon, 17 Mar 2014 22:52:38 +0000 (17:52 -0500)]
Altera TSE: Add main and header file for Altera Ethernet Driver

This patch adds the main driver and header file for the Altera Triple
Speed Ethernet driver.

Signed-off-by: Vince Bridgers <vbridgers2013@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years ago Altera TSE: Add Miscellaneous Files for Altera Ethernet Driver
Vince Bridgers [Mon, 17 Mar 2014 22:52:37 +0000 (17:52 -0500)]
Altera TSE: Add Miscellaneous Files for Altera Ethernet Driver

This patch adds miscellaneous files for the Altera Ethernet Driver,
including ethtool support.

Signed-off-by: Vince Bridgers <vbridgers2013@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years ago Altera TSE: Add Altera Ethernet Driver SGDMA file components
Vince Bridgers [Mon, 17 Mar 2014 22:52:36 +0000 (17:52 -0500)]
Altera TSE: Add Altera Ethernet Driver SGDMA file components

This patch adds the SGDMA soft IP support for the Altera Triple
Speed Ethernet driver.

Signed-off-by: Vince Bridgers <vbridgers2013@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years ago Altera TSE: Add Altera Ethernet Driver MSGDMA File Components
Vince Bridgers [Mon, 17 Mar 2014 22:52:35 +0000 (17:52 -0500)]
Altera TSE: Add Altera Ethernet Driver MSGDMA File Components

This patch adds the MSGDMA soft IP support for the Altera Triple
Speed Ethernet driver.

Signed-off-by: Vince Bridgers <vbridgers2013@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years ago Documentation: networking: Add Altera Ethernet (TSE) Documentation
Vince Bridgers [Mon, 17 Mar 2014 22:52:34 +0000 (17:52 -0500)]
Documentation: networking: Add Altera Ethernet (TSE) Documentation

This patch adds a bindings description for the Altera Triple Speed Ethernet
(TSE) driver. The bindings support the legacy SGDMA soft IP as well as the
preferred MSGDMA soft IP. The TSE can be configured and synthesized in soft
logic using Altera's Quartus toolchain. Please consult the bindings document
for supported options.

Signed-off-by: Vince Bridgers <vbridgers2013@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years ago dts: Add bindings for the Altera Triple Speed Ethernet driver
Vince Bridgers [Mon, 17 Mar 2014 22:52:33 +0000 (17:52 -0500)]
dts: Add bindings for the Altera Triple Speed Ethernet driver

This patch adds a bindings description for the Altera Triple Speed Ethernet
(TSE) driver. The bindings support the legacy SGDMA soft IP as well as the
preferred MSGDMA soft IP. The TSE can be configured and synthesized in soft
logic using Altera's Quartus toolchain. Please consult the bindings document
for supported options.

Signed-off-by: Vince Bridgers <vbridgers2013@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years ago netfilter: conntrack: Fix UP builds
Eric Dumazet [Mon, 17 Mar 2014 20:37:53 +0000 (13:37 -0700)]
netfilter: conntrack: Fix UP builds

ARRAY_SIZE(nf_conntrack_locks) is undefined if spinlock_t is an
empty structure. Replace it by CONNTRACK_LOCKS.
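
To see why the named constant is the safer choice, recall that ARRAY_SIZE()
is conventionally sizeof(arr) / sizeof((arr)[0]); if the element type has
size zero, the divisor is zero. A minimal userspace sketch follows. The
value 1024, the demo_* names and the non-empty lock type are assumptions
made so the example compiles with any C compiler.

```c
#include <stddef.h>

#define CONNTRACK_LOCKS 1024   /* value assumed for this sketch */

/* Conventional definition: divides by sizeof((arr)[0]), which must not be 0. */
#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))

/* Non-empty stand-in for a lock type; an empty struct would have size 0 in GNU C. */
typedef struct { int locked; } demo_lock_t;

static demo_lock_t demo_locks[CONNTRACK_LOCKS];

/* Works here only because demo_lock_t has a non-zero size. */
static size_t demo_array_size(void)
{
	return ARRAY_SIZE(demo_locks);
}

/* Uses the constant directly, so it does not depend on the element size. */
static size_t demo_lock_index(unsigned int hash)
{
	return hash % CONNTRACK_LOCKS;
}
```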

Fixes: 93bb0ceb75be ("netfilter: conntrack: remove central spinlock nf_conntrack_lock")
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years ago Merge branch 'at86rf230'
David S. Miller [Mon, 17 Mar 2014 20:10:43 +0000 (16:10 -0400)]
Merge branch 'at86rf230'

Alexander Aring says:

====================
at86rf230: various fixes and devicetree support

This patch series fixes some bugs with the at86rf231 chip and cleans up some
code. It also adds devicetree support for the at86rf230 driver.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
10 years ago at86rf230: add support for devicetree
Alexander Aring [Sat, 15 Mar 2014 08:29:07 +0000 (09:29 +0100)]
at86rf230: add support for devicetree

This patch adds devicetree support for the at86rf230 driver.

Possible gpios to configure are "reset-gpio" and "sleep-gpio".
Also add support to configure the "irq-type" for the irq polarity
register.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years ago at86rf230: make reset pin optionally
Alexander Aring [Sat, 15 Mar 2014 08:29:06 +0000 (09:29 +0100)]
at86rf230: make reset pin optionally

This patch makes the reset pin optional. Some devices like the atben
from qi-hardware don't have an external reset pin. The usual way is
to turn power off/on for the atben device to initiate a device reset.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years ago at86rf230: change reset timings
Alexander Aring [Sat, 15 Mar 2014 08:29:05 +0000 (09:29 +0100)]
at86rf230: change reset timings

While running checkpatch on another patch I got:

"WARNING: msleep < 20ms can sleep for up to 20ms"

The datasheets of the at86rf231 and at86rf212 specify a minimum of 625
nanoseconds for the reset pulse width and for the SPI access latency
after reset.

This patch removes the 1 millisecond sleep and replaces it with a 1
microsecond udelay, which should also be sufficient for the reset pulse
width.

To change the state from RESET -> TRX_OFF the at86rf230 device needs 120
microseconds; this is the worst case across all at86rf* chips.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years ago at86rf230: move locking state in xmit
Alexander Aring [Sat, 15 Mar 2014 08:29:04 +0000 (09:29 +0100)]
at86rf230: move locking state in xmit

There is no need to lock the clearing of IRQ_TRX_END in status.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years ago at86rf230: fix unexpected state change
Alexander Aring [Sat, 15 Mar 2014 08:29:03 +0000 (09:29 +0100)]
at86rf230: fix unexpected state change

This patch fixes an unexpected state change for the at86rf231 chip.
We can't change into STATE_FORCE_TX_ON while the chip is in one of
SLEEP, P_ON, RESET, TRX_OFF, and all *_NOCLK states.

In this case we are in the TRX_OFF state. See datasheet [1] page 71 for
more information.

Without this patch you will get the following message on an at86rf231 device:

[   20.065218] unexpected state change: 8, asked for 4
[   20.070527] ------------[ cut here ]------------
[   20.075414] WARNING: CPU: 0 PID: 160 at net/mac802154/ieee802154_dev.c:43 mac802154_slave_open+0x70/0xb8()
[   20.085594] Modules linked in: autofs4
[   20.089667] CPU: 0 PID: 160 Comm: ifconfig Not tainted 3.14.0-20140108-1-00993-g905c192 #162
[   20.098612] [<c00127b8>] (unwind_backtrace) from [<c0010b1c>] (show_stack+0x10/0x14)
[   20.106819] [<c0010b1c>] (show_stack) from [<c0033838>] (warn_slowpath_common+0x60/0x80)
[   20.115311] [<c0033838>] (warn_slowpath_common) from [<c00338e8>] (warn_slowpath_null+0x18/0x20)
[   20.124590] [<c00338e8>] (warn_slowpath_null) from [<c057b7e8>] (mac802154_slave_open+0x70/0xb8)
[   20.133880] [<c057b7e8>] (mac802154_slave_open) from [<c0488a58>] (__dev_open+0xa8/0x108)
[   20.142553] [<c0488a58>] (__dev_open) from [<c0488cb0>] (__dev_change_flags+0x8c/0x148)
[   20.151051] [<c0488cb0>] (__dev_change_flags) from [<c0488d84>] (dev_change_flags+0x18/0x48)
[   20.159968] [<c0488d84>] (dev_change_flags) from [<c04e2e9c>] (devinet_ioctl+0x2b0/0x63c)
[   20.168623] [<c04e2e9c>] (devinet_ioctl) from [<c04712e4>] (sock_ioctl+0x23c/0x29c)
[   20.176727] [<c04712e4>] (sock_ioctl) from [<c00e3cb8>] (do_vfs_ioctl+0x4a8/0x578)
[   20.184671] [<c00e3cb8>] (do_vfs_ioctl) from [<c00e3dd4>] (SyS_ioctl+0x4c/0x78)
[   20.192402] [<c00e3dd4>] (SyS_ioctl) from [<c000da00>] (ret_fast_syscall+0x0/0x48)
[   20.200392] ---[ end trace 9a34542f4ea08e47 ]---

This patch was tested on at86rf231 and at86rf212.

[1] http://www.atmel.com/images/doc8111.pdf

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years ago Merge branch 'sh_eth'
David S. Miller [Mon, 17 Mar 2014 20:06:48 +0000 (16:06 -0400)]
Merge branch 'sh_eth'

Sergei Shtylyov says:

====================
Beautify 'sh_eth' driver's messages

This patchset converts the driver to using netdev_*() and netif_*() to print out
its messages whenever possible.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
10 years ago sh_eth: fold netif_msg_*() and netdev_*() calls into netif_*() invocations
Sergei Shtylyov [Sat, 15 Mar 2014 00:30:59 +0000 (03:30 +0300)]
sh_eth: fold netif_msg_*() and netdev_*() calls into netif_*() invocations

Now that we call netdev_*() under netif_msg_*() checks, we can fold these into
netif_*() macro invocations.
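
The shape of this folding can be illustrated with a small userspace mock.
Every demo_* name below is hypothetical and only mimics the pattern; the
mock is not the kernel's implementation of these macros.

```c
#include <stdarg.h>
#include <stdio.h>

enum { DEMO_MSG_LINK = 0x1 };          /* hypothetical message class bit */

struct demo_priv { unsigned int msg_enable; };

static int demo_messages_printed;      /* counts messages, for demonstration only */

/* Mock of a per-device print helper: prefixes the interface name. */
static void demo_netdev_info(const char *ifname, const char *fmt, ...)
{
	va_list ap;

	printf("%s: ", ifname);
	va_start(ap, fmt);
	vprintf(fmt, ap);
	va_end(ap);
	demo_messages_printed++;
}

/* Mock of a message-class check. */
#define demo_netif_msg_link(p) ((p)->msg_enable & DEMO_MSG_LINK)

/* Folded form: the class check and the print in a single invocation. */
#define demo_netif_info(p, type, ifname, ...)                 \
	do {                                                      \
		if (demo_netif_msg_##type(p))                         \
			demo_netdev_info(ifname, __VA_ARGS__);            \
	} while (0)

static void demo_report_link(struct demo_priv *p, const char *ifname, int up)
{
	/* Unfolded equivalent: if (demo_netif_msg_link(p)) demo_netdev_info(ifname, ...); */
	demo_netif_info(p, link, ifname, "link %s\n", up ? "up" : "down");
}
```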

Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years ago sh_eth: convert dev_*() to netdev_*() calls
Sergei Shtylyov [Sat, 15 Mar 2014 00:29:14 +0000 (03:29 +0300)]
sh_eth: convert dev_*() to netdev_*() calls

Convert dev_*(&ndev->dev, ...) to netdev_*(ndev, ...) calls since they are a bit
shorter and at the same time give more information on a device.

Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years ago sh_eth: convert pr_*() to netdev_*() calls
Sergei Shtylyov [Sat, 15 Mar 2014 00:27:54 +0000 (03:27 +0300)]
sh_eth: convert pr_*() to netdev_*() calls

Convert pr_*() to netdev_*() calls as the latter provide info on a device.

Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years ago sh_eth: exit probe with unknown register layout
Sergei Shtylyov [Sat, 15 Mar 2014 00:11:24 +0000 (03:11 +0300)]
sh_eth: exit probe with unknown register layout

Exit the driver's probe() method when the register layout is unknown as the
driver would cause a kernel oops in this case anyway.

While at it, move the corresponding error message printout and convert it from
pr_err() to dev_err().

Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years ago Merge branch 'netpoll-next'
David S. Miller [Mon, 17 Mar 2014 19:48:53 +0000 (15:48 -0400)]
Merge branch 'netpoll-next'

Eric W. Biederman says:

====================
netpoll: Cleanup received packet processing

This is the long-winded, careful, and polite version of removing the netpoll
receive packet processing.

First I untangle the code in small steps.  Then I modify the code to not
force reception and dropping of packets when we are transmitting a packet
with netpoll.  Finally I move all of the packet reception under
CONFIG_NETPOLL_TRAP and delete CONFIG_NETPOLL_TRAP.

If someone wants to do a stable backport of these patches, it would
require backporting the first 18 patches that handle the budget == 0 in
the networking drivers, and the first 6 of these patches.

If anyone wants to resurrect netpoll packet reception someday it should
just be a matter of reverting the last patch.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
10 years ago netpoll: Remove dead packet receive code (CONFIG_NETPOLL_TRAP)
Eric W. Biederman [Sat, 15 Mar 2014 03:51:52 +0000 (20:51 -0700)]
netpoll: Remove dead packet receive code (CONFIG_NETPOLL_TRAP)

The netpoll packet receive code only becomes active if the netpoll
rx_skb_hook is implemented, and there is not a single implementation
of the netpoll rx_skb_hook in the kernel.

All of the out of tree implementations I have found call
netpoll_poll which was removed from the kernel in 2011, so this
change should not add any additional breakage.

There are problems with the netpoll packet receive code.  __netpoll_rx
does not call dev_kfree_skb_irq or dev_kfree_skb_any in hard irq
context.  netpoll_neigh_reply leaks every skb it receives.  Reception
of packets does not work successfully on stacked devices (aka bonding,
team, bridge, and vlans).

Given that the netpoll packet receive code is buggy, there are no
out of tree users that will be merged soon, and the code has
not been used in tree for a decade, let's just remove it.

Reverting this commit can serve as a starting point for anyone
who wants to resurrect netpoll packet reception support.

Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years ago netpoll: Move all receive processing under CONFIG_NETPOLL_TRAP
Eric W. Biederman [Sat, 15 Mar 2014 03:50:58 +0000 (20:50 -0700)]
netpoll: Move all receive processing under CONFIG_NETPOLL_TRAP

Make rx_skb_hook and rx in struct netpoll depend on
CONFIG_NETPOLL_TRAP. Make rx_lock, rx_np, and neigh_tx in struct
netpoll_info depend on CONFIG_NETPOLL_TRAP.

Make the functions netpoll_rx_on, netpoll_rx, and netpoll_receive_skb
no-ops when CONFIG_NETPOLL_TRAP is not set.

Only build netpoll_neigh_reply, checksum_udp, service_neigh_queue,
pkt_is_ns, and __netpoll_rx when CONFIG_NETPOLL_TRAP is defined.

Add helper functions netpoll_trap_setup, netpoll_trap_setup_info,
netpoll_trap_cleanup, and netpoll_trap_cleanup_info that initialize
and cleanup the struct netpoll and struct netpoll_info receive
specific fields when CONFIG_NETPOLL_TRAP is enabled and do nothing
otherwise.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years ago netpoll: Consolidate neigh_tx processing in service_neigh_queue
Eric W. Biederman [Sat, 15 Mar 2014 03:50:25 +0000 (20:50 -0700)]
netpoll: Consolidate neigh_tx processing in service_neigh_queue

Move the bond slave device neigh_tx handling into service_neigh_queue.

In connection with neigh_tx processing remove unnecessary tests for
a NULL netpoll_info, as netpoll_poll_dev has already used, and thus
verified the existence of, the netpoll_info.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years ago netpoll: Move netpoll_trap under CONFIG_NETPOLL_TRAP
Eric W. Biederman [Sat, 15 Mar 2014 03:49:43 +0000 (20:49 -0700)]
netpoll: Move netpoll_trap under CONFIG_NETPOLL_TRAP

Now that we no longer need to receive packets to safely drain the
network driver's receive queue, move netpoll_trap and netpoll_set_trap
under CONFIG_NETPOLL_TRAP.

Make netpoll_trap and netpoll_set_trap no-op inline functions
when CONFIG_NETPOLL_TRAP is not set.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years ago netpoll: Don't drop all received packets.
Eric W. Biederman [Sat, 15 Mar 2014 03:48:28 +0000 (20:48 -0700)]
netpoll: Don't drop all received packets.

Change the strategy of netpoll from dropping all packets received
during netpoll_poll_dev to calling napi poll with a budget of 0
(to avoid processing the driver's rx queue), and to ignore packets received
with netif_rx (those will safely be placed on the backlog queue).

All of the netpoll supporting drivers have been reviewed to ensure
either they use netif_rx or that a budget of 0 is supported by their
napi poll routine and that a budget of 0 will not process the driver's
rx queues.
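
What "supporting a budget of 0" means for a poll routine can be sketched
with a toy model. struct demo_nic and demo_poll() are hypothetical and do
not correspond to any particular driver; the sketch only shows tx cleanup
running regardless of the budget while rx work is bounded by it.

```c
/* Toy model of a device with pending work in both directions. */
struct demo_nic {
	int rx_pending;   /* packets waiting in the rx ring */
	int tx_done;      /* completed tx descriptors awaiting cleanup */
};

/* NAPI-style poll sketch: returns the number of rx packets processed. */
static int demo_poll(struct demo_nic *nic, int budget)
{
	int work = 0;

	/* tx completion cleanup is not limited by the budget */
	nic->tx_done = 0;

	/* rx processing stops at the budget, so budget == 0 leaves the ring alone */
	while (work < budget && nic->rx_pending > 0) {
		nic->rx_pending--;
		work++;
	}

	return work;
}
```

A routine written as a do { ... } while (work < budget) loop would process
one packet even with a budget of 0, which is the kind of problem such a
review has to look for.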

Not dropping packets makes NETPOLL_RX_DROP unnecessary so it is removed.

npinfo->rx_flags is removed as rx_flags with just the NETPOLL_RX_ENABLED
flag becomes just a redundant mirror of list_empty(&npinfo->rx_np).

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years ago netpoll: Add netpoll_rx_processing
Eric W. Biederman [Sat, 15 Mar 2014 03:47:49 +0000 (20:47 -0700)]
netpoll: Add netpoll_rx_processing

Add a helper netpoll_rx_processing that reports when netpoll has
receive side processing to perform.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years ago netpoll: Warn if more packets are processed than are budgeted
Eric W. Biederman [Sat, 15 Mar 2014 03:47:15 +0000 (20:47 -0700)]
netpoll: Warn if more packets are processed than are budgeted

There is already a warning for this case in the normal netpoll path,
but put a copy here in case the way netpoll calls the poll functions
causes a different result.

netpoll will shortly call the napi poll routine with a budget 0 to
avoid any rx packets being processed.  As nothing does that today
we may encounter drivers that have problems so a netpoll specific
warning seems desirable.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years ago netpoll: Visit all napi handlers in poll_napi
Eric W. Biederman [Sat, 15 Mar 2014 03:45:51 +0000 (20:45 -0700)]
netpoll: Visit all napi handlers in poll_napi

In poll_napi loop through all of the napi handlers even when the
budget falls to 0 to ensure that we process all of the tx_queues, and
so that we continue to call into drivers when our initial budget is 0.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years ago netpoll: Pass budget into poll_napi
Eric W. Biederman [Sat, 15 Mar 2014 03:45:17 +0000 (20:45 -0700)]
netpoll: Pass budget into poll_napi

This moves the control logic to the top level in netpoll_poll_dev
instead of having it dispersed throughout netpoll_poll_dev,
poll_napi and poll_one_napi.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years ago netpoll: move setting of NETPOLL_RX_DROP into netpoll_poll_dev
Eric W. Biederman [Sat, 15 Mar 2014 03:44:37 +0000 (20:44 -0700)]
netpoll: move setting of NETPOLL_RX_DROP into netpoll_poll_dev

Today netpoll depends on setting NETPOLL_RX_DROP before networking
drivers receive packets in interrupt context so that the packets can
be dropped.  Move this setting into netpoll_poll_dev from
poll_one_napi so that if ndo_poll_controller happens to receive
packets we will drop the packets on the floor instead of letting the
packets bounce through the networking stack and potentially cause problems.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years ago Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
David S. Miller [Mon, 17 Mar 2014 19:06:24 +0000 (15:06 -0400)]
Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next

Pablo Neira Ayuso says:

====================
Netfilter/IPVS updates for net-next

The following patchset contains Netfilter/IPVS updates for net-next,
most relevantly they are:

* cleanup to remove double semicolon from Stephen Hemminger.

* calm down sparse warning in xt_ipcomp, from Fan Du.

* nf_ct_labels support for nf_tables, from Florian Westphal.

* new macros to simplify rcu dereferences in the scope of nfnetlink
  and nf_tables, from Patrick McHardy.

* Accept queue and drop (including reason for drop) to verdict
  parsing in nf_tables, also from Patrick.

* Remove unused random seed initialization in nfnetlink_log, from
  Florian Westphal.

* Allow to attach user-specific information to nf_tables rules, useful
  to attach user comments to rule, from me.

* Return errors in ipset according to the manpage documentation, from
  Jozsef Kadlecsik.

* Fix coccinelle warnings related to incorrect bool type usage for ipset,
  from Fengguang Wu.

* Add hash:ip,mark set type to ipset, from Vytas Dauksa.

* Fix message for each spotted by ipset for each netns that is created,
  from Ilia Mirkin.

* Add forceadd option to ipset, which evicts a random entry from the set
  if it becomes full, from Josh Hunt.

* Minor IPVS cleanups and fixes from Andi Kleen and Tingwei Liu.

* Improve conntrack scalability by removing a central spinlock, original
  work from Eric Dumazet. Jesper Dangaard Brouer took them over to address
  remaining issues. Several patches to prepare this change come in first
  place.

* Rework nft_hash to resolve bugs (leaking chain, missing rcu synchronization
  on element removal, etc.), from Patrick McHardy.

* Restore context in the rule deletion path, as we now release rule objects
  synchronously, from Patrick McHardy. This gets back event notification for
  anonymous sets.

* Fix NAT family validation in nft_nat, also from Patrick.

* Improve scalability of xt_connlimit by using an array of spinlocks and
  by introducing a rb-tree of hashtables for faster lookup of accounted
  objects per network. This patch was preceded by several patches and
  refactorizations to accommodate this change including the use of kmem_cache,
  from Florian Westphal.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agonetfilter: connlimit: use rbtree for per-host conntrack obj storage
Florian Westphal [Wed, 12 Mar 2014 22:49:51 +0000 (23:49 +0100)]
netfilter: connlimit: use rbtree for per-host conntrack obj storage

With current match design every invocation of the connlimit_match
function means we have to perform (number_of_conntracks % 256) lookups
in the conntrack table [ to perform GC/delete stale entries ].
This is also the reason why ____nf_conntrack_find() in perf top has
> 20% cpu time per core.

This patch changes the storage to rbtree which cuts down the number of
ct objects that need testing.

When looking up a new tuple, we only test the connections of the host
objects we visit while searching for the wanted host/network (or
the leaf we need to insert at).

The slot count is reduced to 32.  Increasing slot count doesn't
speed up things much because of rbtree nature.

before patch (50kpps rx, 10kpps tx):
+  20.95%  ksoftirqd/0  [nf_conntrack] [k] ____nf_conntrack_find
+  20.50%  ksoftirqd/1  [nf_conntrack] [k] ____nf_conntrack_find
+  20.27%  ksoftirqd/2  [nf_conntrack] [k] ____nf_conntrack_find
+   5.76%  ksoftirqd/1  [nf_conntrack] [k] hash_conntrack_raw
+   5.39%  ksoftirqd/2  [nf_conntrack] [k] hash_conntrack_raw
+   5.35%  ksoftirqd/0  [nf_conntrack] [k] hash_conntrack_raw

after patch (90kpps rx, 51kpps tx):
+  17.24%       swapper  [nf_conntrack]    [k] ____nf_conntrack_find
+   6.60%   ksoftirqd/2  [nf_conntrack]    [k] ____nf_conntrack_find
+   2.73%       swapper  [nf_conntrack]    [k] hash_conntrack_raw
+   2.36%       swapper  [xt_connlimit]    [k] count_tree

Obvious disadvantages to previous version are the increase in code
complexity and the increased memory cost.

Partially based on Eric Dumazet's fq scheduler.

Reviewed-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
10 years agonetfilter: connlimit: make same_source_net signed
Florian Westphal [Wed, 12 Mar 2014 22:49:50 +0000 (23:49 +0100)]
netfilter: connlimit: make same_source_net signed

same_source_net currently returns 1 if the two addresses are in the same
network.  Make it work like memcmp/strcmp so it can be used as an rbtree
search function.
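
A minimal sketch of such a three-way comparison, assuming IPv4 addresses
in host byte order. The name cmp_masked_net is hypothetical and is not
the function from the patch; it only illustrates the memcmp-style
contract (negative, zero, positive) that a tree search needs.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not the kernel function: compare two IPv4
 * addresses (host byte order) after applying a netmask.
 * Returns <0, 0 or >0 in the style of memcmp().
 */
int cmp_masked_net(uint32_t a, uint32_t b, uint32_t mask)
{
	uint32_t na = a & mask;
	uint32_t nb = b & mask;

	if (na < nb)
		return -1;
	if (na > nb)
		return 1;
	return 0;
}
```

A boolean "same or not" result cannot tell a tree walk whether to
descend left or right; the sign of the result can.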

Reviewed-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
10 years agonetfilter: connlimit: use keyed locks
Florian Westphal [Wed, 12 Mar 2014 22:49:49 +0000 (23:49 +0100)]
netfilter: connlimit: use keyed locks

connlimit currently suffers from spinlock contention, example for
4-core system with rps enabled:

+  20.84%   ksoftirqd/2  [kernel.kallsyms] [k] _raw_spin_lock_bh
+  20.76%   ksoftirqd/1  [kernel.kallsyms] [k] _raw_spin_lock_bh
+  20.42%   ksoftirqd/0  [kernel.kallsyms] [k] _raw_spin_lock_bh
+   6.07%   ksoftirqd/2  [nf_conntrack]    [k] ____nf_conntrack_find
+   6.07%   ksoftirqd/1  [nf_conntrack]    [k] ____nf_conntrack_find
+   5.97%   ksoftirqd/0  [nf_conntrack]    [k] ____nf_conntrack_find
+   2.47%   ksoftirqd/2  [nf_conntrack]    [k] hash_conntrack_raw
+   2.45%   ksoftirqd/0  [nf_conntrack]    [k] hash_conntrack_raw
+   2.44%   ksoftirqd/1  [nf_conntrack]    [k] hash_conntrack_raw

May allow parallel lookup/insert/delete if the entry is hashed to
another slot.  With patch:

+  20.95%  ksoftirqd/0  [nf_conntrack] [k] ____nf_conntrack_find
+  20.50%  ksoftirqd/1  [nf_conntrack] [k] ____nf_conntrack_find
+  20.27%  ksoftirqd/2  [nf_conntrack] [k] ____nf_conntrack_find
+   5.76%  ksoftirqd/1  [nf_conntrack] [k] hash_conntrack_raw
+   5.39%  ksoftirqd/2  [nf_conntrack] [k] hash_conntrack_raw
+   5.35%  ksoftirqd/0  [nf_conntrack] [k] hash_conntrack_raw
+   2.00%  ksoftirqd/1  [kernel.kallsyms] [k] __rcu_read_unlock

Improved rx processing rate from ~35kpps to ~50 kpps.
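
A rough sketch of the keyed-lock idea, assuming a table of N_SLOTS hash
slots guarded by a smaller array of N_LOCKS locks. Both constants and
both function names are assumptions made for illustration; the patch
itself may use different sizes and naming.

```c
#include <stdint.h>

#define N_SLOTS 256	/* assumed number of hash slots */
#define N_LOCKS 32	/* assumed number of locks guarding them */

/* Map a hash slot to the index of the lock that guards it. */
unsigned int lock_for_slot(uint32_t slot)
{
	return slot % N_LOCKS;
}

/*
 * Two operations contend only if their slots map to the same lock;
 * otherwise lookup/insert/delete can proceed in parallel.
 */
int can_run_in_parallel(uint32_t slot_a, uint32_t slot_b)
{
	return lock_for_slot(slot_a) != lock_for_slot(slot_b);
}
```

With a single global lock every pair of operations contends; with keyed
locks only entries whose slots share a lock index do.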

Reviewed-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
10 years agoMerge branch 'napi_budget_zero'
David S. Miller [Sat, 15 Mar 2014 02:53:35 +0000 (22:53 -0400)]
Merge branch 'napi_budget_zero'

Eric W. Biederman says:

====================
Don't receive packets when the napi budget == 0

After reading through all 120 drivers supporting netpoll I have found 16
more that process at least one received packet when the napi budget == 0.

Processing more packets than your budget has always been a bug but
we haven't cared before so it looks like these drivers slipped through,
and need fixes.

As netpoll will shortly be using a budget of 0 to get the tx queue
processing without the rx queue processing we now care.
====================
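
A minimal sketch of the behavior these fixes aim for, assuming a made-up
queue structure. struct fake_queue, clean_tx() and fake_poll() are
hypothetical and are not taken from any of the drivers in this series:
tx completions are always reclaimed, and rx work is skipped entirely when
the budget is 0.

```c
/* Hypothetical per-queue state; not from any real driver. */
struct fake_queue {
	int rx_pending;	/* packets waiting in the rx ring */
	int tx_done;	/* completed tx descriptors to reclaim */
};

/* Reclaim completed tx descriptors; this does not consume budget. */
static void clean_tx(struct fake_queue *q)
{
	q->tx_done = 0;
}

/*
 * Sketch of a poll routine.  Returns the number of rx packets
 * processed, which is never more than budget.
 */
int fake_poll(struct fake_queue *q, int budget)
{
	int done = 0;

	clean_tx(q);

	/* With a budget of 0, do not touch the rx ring at all. */
	if (budget <= 0)
		return 0;

	while (done < budget && q->rx_pending > 0) {
		q->rx_pending--;
		done++;
	}
	return done;
}
```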

Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agosfc: Don't receive packets when the napi budget == 0
Eric W. Biederman [Sat, 15 Mar 2014 01:11:22 +0000 (18:11 -0700)]
sfc: Don't receive packets when the napi budget == 0

Processing any incoming packets with a napi budget of 0
is incorrect driver behavior.

This matters as netpoll will shortly call drivers with a budget of 0
to avoid receive packet processing happening in hard irq context.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agovxge: Don't receive packets when the napi budget == 0
Eric W. Biederman [Sat, 15 Mar 2014 01:10:50 +0000 (18:10 -0700)]
vxge: Don't receive packets when the napi budget == 0

Processing any incoming packets with a napi budget of 0
is incorrect driver behavior.

This matters as netpoll will shortly call drivers with a budget of 0
to avoid receive packet processing happening in hard irq context.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agotc35815: Don't receive packets when the napi budget == 0
Eric W. Biederman [Sat, 15 Mar 2014 01:10:14 +0000 (18:10 -0700)]
tc35815: Don't receive packets when the napi budget == 0

Processing any incoming packets with a napi budget of 0
is incorrect driver behavior.

This matters as netpoll will shortly call drivers with a budget of 0
to avoid receive packet processing happening in hard irq context.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agotilepro: Don't receive packets when the napi budget == 0
Eric W. Biederman [Sat, 15 Mar 2014 01:09:01 +0000 (18:09 -0700)]
tilepro: Don't receive packets when the napi budget == 0

Processing any incoming packets with a napi budget of 0
is incorrect driver behavior.

This matters as netpoll will shortly call drivers with a budget of 0
to avoid receive packet processing happening in hard irq context.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agotilegx: Don't receive packets when the napi budget == 0
Eric W. Biederman [Sat, 15 Mar 2014 01:08:21 +0000 (18:08 -0700)]
tilegx: Don't receive packets when the napi budget == 0

Processing any incoming packets with a napi budget of 0
is incorrect driver behavior.

This matters as netpoll will shortly call drivers with a budget of 0
to avoid receive packet processing happening in hard irq context.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agos2io: Don't receive packets when the napi budget == 0
Eric W. Biederman [Sat, 15 Mar 2014 01:06:26 +0000 (18:06 -0700)]
s2io: Don't receive packets when the napi budget == 0

Processing any incoming packets with a napi budget of 0
is incorrect driver behavior.

This matters as netpoll will shortly call drivers with a budget of 0
to avoid receive packet processing happening in hard irq context.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agomlx4: Don't receive packets when the napi budget == 0
Eric W. Biederman [Sat, 15 Mar 2014 01:05:58 +0000 (18:05 -0700)]
mlx4: Don't receive packets when the napi budget == 0

Processing any incoming packets with a napi budget of 0
is incorrect driver behavior.

This matters as netpoll will shortly call drivers with a budget of 0
to avoid receive packet processing happening in hard irq context.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agosky2: Don't receive packets when the napi budget == 0
Eric W. Biederman [Sat, 15 Mar 2014 01:05:26 +0000 (18:05 -0700)]
sky2: Don't receive packets when the napi budget == 0

Processing any incoming packets with a napi budget of 0
is incorrect driver behavior.

This matters as netpoll will shortly call drivers with a budget of 0
to avoid receive packet processing happening in hard irq context.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoibmveth: Don't receive packets when the napi budget == 0
Eric W. Biederman [Sat, 15 Mar 2014 01:03:50 +0000 (18:03 -0700)]
ibmveth: Don't receive packets when the napi budget == 0

Processing any incoming packets with a napi budget of 0
is incorrect driver behavior.

This matters as netpoll will shortly call drivers with a budget of 0
to avoid receive packet processing happening in hard irq context.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agofs_enet: Don't receive packets when the napi budget == 0
Eric W. Biederman [Sat, 15 Mar 2014 01:03:23 +0000 (18:03 -0700)]
fs_enet: Don't receive packets when the napi budget == 0

Processing any incoming packets with a napi budget of 0
is incorrect driver behavior.

This matters as netpoll will shortly call drivers with a budget of 0
to avoid receive packet processing happening in hard irq context.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoenic: Don't receive packets when the napi budget == 0
Eric W. Biederman [Sat, 15 Mar 2014 01:02:08 +0000 (18:02 -0700)]
enic: Don't receive packets when the napi budget == 0

Processing any incoming packets with a napi budget of 0
is incorrect driver behavior.

This matters as netpoll will shortly call drivers with a budget of 0
to avoid receive packet processing happening in hard irq context.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoamd8111e: Don't receive packets when the napi budget == 0
Eric W. Biederman [Sat, 15 Mar 2014 01:01:11 +0000 (18:01 -0700)]
amd8111e: Don't receive packets when the napi budget == 0

Processing any incoming packets with a napi budget of 0
is incorrect driver behavior.

This matters as netpoll will shortly call drivers with a budget of 0
to avoid receive packet processing happening in hard irq context.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoixgbe: Don't receive packets when the napi budget == 0
Eric W. Biederman [Sat, 15 Mar 2014 01:00:41 +0000 (18:00 -0700)]
ixgbe: Don't receive packets when the napi budget == 0

Processing any incoming packets with a napi budget of 0
is incorrect driver behavior.

This matters as netpoll will shortly call drivers with a budget of 0
to avoid receive packet processing happening in hard irq context.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoigb: Don't receive packets when the napi budget == 0
Eric W. Biederman [Sat, 15 Mar 2014 01:00:06 +0000 (18:00 -0700)]
igb: Don't receive packets when the napi budget == 0

Processing any incoming packets with a napi budget of 0
is incorrect driver behavior.

This matters as netpoll will shortly call drivers with a budget of 0
to avoid receive packet processing happening in hard irq context.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoi40e: Don't receive packets when the napi budget == 0
Eric W. Biederman [Sat, 15 Mar 2014 00:59:10 +0000 (17:59 -0700)]
i40e: Don't receive packets when the napi budget == 0

Processing any incoming packets with a napi budget of 0
is incorrect driver behavior.

This matters as netpoll will shortly call drivers with a budget of 0
to avoid receive packet processing happening in hard irq context.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agobnx2x: Don't receive packets when the napi budget == 0
Eric W. Biederman [Sat, 15 Mar 2014 00:57:59 +0000 (17:57 -0700)]
bnx2x: Don't receive packets when the napi budget == 0

Processing any incoming packets with a napi budget of 0
is incorrect driver behavior.

This matters as netpoll will shortly call drivers with a budget of 0
to avoid receive packet processing happening in hard irq context.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoMerge branch 'cxgb4-next'
David S. Miller [Sat, 15 Mar 2014 02:44:19 +0000 (22:44 -0400)]
Merge branch 'cxgb4-next'

Hariprasad Shenai says:

====================
Doorbell drop Avoidance Bug fix for iw_cxgb4

This patch series provides fixes for Chelsio T4/T5 adapters
related to DB Drop avoidance and another small fix related to keepalive on
iw_cxgb4.

The patch series is created against David Miller's 'net-next' tree
and includes patches for the cxgb4 and iw_cxgb4 drivers.

We would like to request this patch series to get merged via David Miller's
'net-next' tree.

We have included all the maintainers of respective drivers. Kindly review the
change and let us know in case of any review comments.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agocxgb4/iw_cxgb4: Doorbell Drop Avoidance Bug Fixes
Steve Wise [Fri, 14 Mar 2014 16:22:08 +0000 (21:52 +0530)]
cxgb4/iw_cxgb4: Doorbell Drop Avoidance Bug Fixes

The current logic suffers from a slow response time to disable user DB
usage, and also fails to avoid DB FIFO drops under heavy load. This commit
fixes these deficiencies and makes the avoidance logic more optimal.
This is done by more efficiently notifying the ULDs of potential DB
problems, and by implementing a smoother flow control algorithm in iw_cxgb4,
which is the ULD that puts the most load on the DB fifo.

Design:

cxgb4:

Direct ULD callback from the DB FULL/DROP interrupt handler.  This allows
the ULD to stop doing user DB writes as quickly as possible.

While user DB usage is disabled, the LLD will accumulate DB write events
for its queues.  Then once DB usage is reenabled, a single DB write is
done for each queue with its accumulated write count.  This reduces the
load put on the DB fifo when reenabling.
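
The accumulate-then-flush behavior described above can be sketched as
follows. struct fake_txq and the three functions are hypothetical names
for illustration only; the counters hw_writes and hw_total stand in for
the real doorbell register so the effect can be observed.

```c
/* Hypothetical queue state; not the cxgb4 data structures. */
struct fake_txq {
	unsigned int db_pending;	/* increments accumulated while disabled */
	unsigned int hw_writes;		/* doorbell writes actually issued */
	unsigned int hw_total;		/* sum of increments seen by hardware */
	int db_disabled;
};

/* Ring the doorbell for n new descriptors, or record it for later. */
void ring_db(struct fake_txq *q, unsigned int n)
{
	if (q->db_disabled) {
		q->db_pending += n;
		return;
	}
	q->hw_writes++;
	q->hw_total += n;
}

void disable_db(struct fake_txq *q)
{
	q->db_disabled = 1;
}

/* Re-enable and flush everything accumulated with a single write. */
void enable_db(struct fake_txq *q)
{
	q->db_disabled = 0;
	if (q->db_pending) {
		ring_db(q, q->db_pending);
		q->db_pending = 0;
	}
}
```

However many events arrive while the doorbell is disabled, re-enabling
costs one write per queue.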

iw_cxgb4:

Instead of marking each qp to indicate DB writes are disabled, we create
a device-global status page that each user process maps.  This allows
iw_cxgb4 to only set this single bit to disable all DB writes for all
user QPs vs traversing the idr of all the active QPs.  If the libcxgb4
doesn't support this, then we fall back to the old approach of marking
each QP.  Thus we allow the new driver to work with an older libcxgb4.

When the LLD upcalls iw_cxgb4 indicating DB FULL, we disable all DB writes
via the status page and transition the DB state to STOPPED.  As user
processes see that DB writes are disabled, they call into iw_cxgb4
to submit their DB write events.  Since the DB state is in STOPPED,
the QP trying to write gets enqueued on a new DB "flow control" list.
As subsequent DB writes are submitted for this flow controlled QP, the
number of writes is accumulated for each QP on the flow control list.
So all the user QPs that are actively ringing the DB get put on this
list and the number of writes they request are accumulated.

When the LLD upcalls iw_cxgb4 indicating DB EMPTY, which is in a workq
context, we change the DB state to FLOW_CONTROL, and begin resuming all
the QPs that are on the flow control list.  This logic runs on until
the flow control list is empty or we exit FLOW_CONTROL mode (due to
a DB DROP upcall, for example).  QPs are removed from this list, and
their accumulated DB write counts written to the DB FIFO.  Sets of QPs,
called chunks in the code, are removed at one time. The chunk size is 64.
So 64 QPs are resumed at a time, and before the next chunk is resumed, the
logic waits (blocks) for the DB FIFO to drain.  This prevents resuming too
quickly and overflowing the FIFO.  Once the flow control list is empty,
the db state transitions back to NORMAL and user QPs are again allowed
to write directly to the user DB register.
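
A simplified sketch of the chunked resume loop, assuming the flow control
list is a plain array. struct fake_qp and resume_queues() are
hypothetical; the sketch leaves out the DB state machine and only marks
the point where a real driver would block for the FIFO to drain.

```c
#include <stddef.h>

#define CHUNK_SIZE 64	/* chunk size given in the description above */

/* Hypothetical QP state; not the iw_cxgb4 structure. */
struct fake_qp {
	unsigned int db_pending;	/* accumulated DB write count */
};

/*
 * Flush the accumulated counts of n QPs in chunks of CHUNK_SIZE.
 * Returns the number of chunks processed; *written receives the total
 * count flushed to the (simulated) DB FIFO.
 */
size_t resume_queues(struct fake_qp *qps, size_t n, unsigned int *written)
{
	size_t chunks = 0;
	size_t i = 0;

	*written = 0;
	while (i < n) {
		size_t end = (n - i > CHUNK_SIZE) ? i + CHUNK_SIZE : n;

		for (; i < end; i++) {
			*written += qps[i].db_pending;
			qps[i].db_pending = 0;
		}
		chunks++;
		/*
		 * A real driver would block here until the DB FIFO has
		 * drained before resuming the next chunk.
		 */
	}
	return chunks;
}
```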

The algorithm is designed such that if the DB write load is high enough,
then all the DB writes get submitted by the kernel using this flow
controlled approach to avoid DB drops.  As the load lightens though, we
resume to normal DB writes directly by user applications.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agocxgb4/iw_cxgb4: Treat CPL_ERR_KEEPALV_NEG_ADVICE as negative advice
Steve Wise [Fri, 14 Mar 2014 16:22:07 +0000 (21:52 +0530)]
cxgb4/iw_cxgb4: Treat CPL_ERR_KEEPALV_NEG_ADVICE as negative advice

Based on original work by Anand Priyadarshee <anandp@chelsio.com>.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agonet: Replace u64_stats_fetch_begin_bh to u64_stats_fetch_begin_irq
Eric W. Biederman [Fri, 14 Mar 2014 04:26:42 +0000 (21:26 -0700)]
net: Replace u64_stats_fetch_begin_bh to u64_stats_fetch_begin_irq

Replace the bh safe variant with the hard irq safe variant.

We need a hard irq safe variant to deal with netpoll transmitting
packets from hard irq context, and we need it in most if not all of
the places using the bh safe variant.

Except on 32bit uni-processor the code is exactly the same so don't
bother with a bh variant, just have a hard irq safe variant that
everyone can use.
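
For context, readers of such statistics typically use a
sequence-counter retry loop. The following is a single-threaded
illustration only, with hypothetical names (struct fake_stats,
fetch_begin, fetch_retry); it omits the memory barriers and the irq
handling that the real helpers provide.

```c
#include <stdint.h>

/* Hypothetical stats block guarded by a sequence counter. */
struct fake_stats {
	unsigned int seq;	/* even: stable, odd: update in progress */
	uint64_t packets;
};

/* Writer side: bump the sequence before and after the update. */
void stats_update(struct fake_stats *s, uint64_t n)
{
	s->seq++;		/* sequence becomes odd */
	s->packets += n;
	s->seq++;		/* sequence becomes even again */
}

unsigned int fetch_begin(const struct fake_stats *s)
{
	return s->seq;
}

/* Retry if an update was in progress or completed since fetch_begin(). */
int fetch_retry(const struct fake_stats *s, unsigned int start)
{
	return (start & 1) || s->seq != start;
}

/* Reader side: repeat the read until a consistent snapshot is seen. */
uint64_t stats_read(const struct fake_stats *s)
{
	unsigned int start;
	uint64_t v;

	do {
		start = fetch_begin(s);
		v = s->packets;
	} while (fetch_retry(s, start));
	return v;
}
```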

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
David S. Miller [Sat, 15 Mar 2014 02:31:55 +0000 (22:31 -0400)]
Merge git://git./linux/kernel/git/davem/net

Conflicts:
drivers/net/usb/r8152.c
drivers/net/xen-netback/netback.c

Both the r8152 and netback conflicts were simple overlapping
changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoMerge branch 'alb_learning'
David S. Miller [Sat, 15 Mar 2014 02:21:04 +0000 (22:21 -0400)]
Merge branch 'alb_learning'

Veaceslav Falico says:

====================
bonding: use correct ether type for alb

There have been reports that, while using the ETH_P_LOOP ether type
(0x0060), the ether type is treated as its packet length.

To avoid that and to not break already existing apps - add new ether type
ETH_P_LOOPBACK that contains the correct id - 0x9000.
====================

Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
10 years agobonding: use the correct ether type for alb
Veaceslav Falico [Thu, 13 Mar 2014 11:41:58 +0000 (12:41 +0100)]
bonding: use the correct ether type for alb

Currently it's using the wrong ETH_P_LOOP type, which is sometimes treated
as packet length instead of ether type (because it's 0x0060).

Use the new ETH_P_LOOPBACK type.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoether: add loopback type ETH_P_LOOPBACK
Veaceslav Falico [Thu, 13 Mar 2014 11:41:57 +0000 (12:41 +0100)]
ether: add loopback type ETH_P_LOOPBACK

Per IEEE 802.3*, the correct packet type for loopback is 0x9000. There's
already one ETH_P_LOOP 0x0060, which has been there for ages, however it's
plainly wrong as anything that small is considered a length field.

We can't remove it because legacy, so add a new type which corresponds to
the correct id.

http://www.iana.org/assignments/ieee-802-numbers/ieee-802-numbers.xhtml
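
To illustrate why 0x0060 is ambiguous: in an Ethernet header, values of
the type/length field up to 1500 are read as a payload length, while
values from 0x0600 upward are read as an EtherType. The helper names
below are hypothetical; the two constants are the values given in this
commit message.

```c
#include <stdint.h>

#define ETH_P_LOOP	0x0060	/* legacy value, falls in the length range */
#define ETH_P_LOOPBACK	0x9000	/* value added by this patch */

/* Values up to 1500 are interpreted as an 802.3 payload length. */
int is_length_field(uint16_t v)
{
	return v <= 1500;
}

/* Values from 0x0600 (1536) upward are interpreted as an EtherType. */
int is_ethertype(uint16_t v)
{
	return v >= 0x0600;
}
```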

CC: "David S. Miller" <davem@davemloft.net>
CC: Stefan Richter <stefanr@s5r6.in-berlin.de>
CC: Simon Wunderlich <sw@simonwunderlich.de>
CC: Neil Jerram <Neil.Jerram@metaswitch.com>
CC: Simon Horman <horms@verge.net.au>
CC: Arvid Brodin <Arvid.Brodin@xdin.com>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoMerge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net...
David S. Miller [Sat, 15 Mar 2014 02:18:48 +0000 (22:18 -0400)]
Merge branch 'master' of git://git./linux/kernel/git/jkirsher/net-next

Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates

This series contains updates to igb, i40e and i40evf.

I provide a code comment fix which David Miller noticed in the last
series of patches I submitted.

Shannon provides a patch to cleanup the NAPI structs when deleting the
netdev.

Anjali provides several patches for i40e, first fixes a bug in the update
filter logic which was causing a kernel panic.  Then provides a fix to
rename an error bit to correctly indicate the error.  Adds a definition
for a new state variable to keep track of features automatically disabled
due to hardware resource limitations versus user enforced feature disabled.
Anjali provides a patch to add code to handle when there is a filter
programming error due to a full table, which also resolves a previous
compile warning about an unused "*pf" variable introduced in the last i40e
series patch submission.

Jesse provides three i40e patches to cleanup strings to make more
consistent and to align with other Intel drivers.

Akeem cleans up a misleading function header comment for i40e.

Mitch provides a fix for i40e/i40evf to use the correctly reported number
of MSI-X vectors in the PF and VF.  Then provides a patch to use
dma_set_mask_and_coherent() which was introduced in v3.13 and simplifies
the DMA mapping code a bit.

v2:
- dropped the 2 ixgbe patches from Emil based on feedback from David Miller,
  where the 2 fixes should be handled in the net core to fix all drivers
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoMerge branch 'ieee802154-next'
David S. Miller [Sat, 15 Mar 2014 02:15:35 +0000 (22:15 -0400)]
Merge branch 'ieee802154-next'

Phoebe Buckheister says:

====================
ieee802154: fix endianness and header handling

This patch set enforces network byte order on all internal operations and
fields of the 802.15.4 stack and adds a general representation of 802.15.4
headers with operations to create and parse those headers. This reduces code
duplication in the current stack and also allows for upper layers to read
headers of packets they have just received; it is also necessary for 802.15.4
link layer security, which requires header mangling.

Changes since v1:
 * fixed lowpan packet rx after reassembly. Control blocks were used to
   retrieve source/dest addresses, but the CB is clobbered by reassembly.
   Instead, parse the header anew in lowpan.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoieee802154: add proper length checks to header creations
Phoebe Buckheister [Fri, 14 Mar 2014 20:24:04 +0000 (21:24 +0100)]
ieee802154: add proper length checks to header creations

Have mac802154 header_ops.create fail with -EMSGSIZE if the length
passed will be too large to fit a frame. Since 6lowpan will ensure that
no packet payload will be too large, pass a length of 0 there. 802.15.4
dgram sockets will also return -EMSGSIZE on payloads larger than the
device MTU instead of -EINVAL.

Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years ago6lowpan: move lowpan frag_info out of 802.15.4 headers
Phoebe Buckheister [Fri, 14 Mar 2014 20:24:03 +0000 (21:24 +0100)]
6lowpan: move lowpan frag_info out of 802.15.4 headers

Fragmentation and reassembly information for 6lowpan is independent from
the 802.15.4 stack and used only by the 6lowpan reassembly process. Move
the ieee802154_frag_info struct to a private area, it needn't be in the
802.15.4 skb control block.

Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoieee802154: use ieee802154_addr instead of *_sa variants
Phoebe Buckheister [Fri, 14 Mar 2014 20:24:02 +0000 (21:24 +0100)]
ieee802154: use ieee802154_addr instead of *_sa variants

Change all internal uses of ieee802154_addr_sa to ieee802154_addr,
except for those instances that communicate directly with userspace.

Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agomac802154: use header operations to create/parse headers
Phoebe Buckheister [Fri, 14 Mar 2014 20:24:01 +0000 (21:24 +0100)]
mac802154: use header operations to create/parse headers

Use the operations on 802.15.4 header structs introduced in a previous
patch to create and parse all headers in the mac802154 stack. This patch
reduces code duplication between different parts of the mac802154 stack
that needed information from headers, and also fixes a few bugs that
seem to have gone unnoticed until now:

 * 802.15.4 dgram sockets would return a slightly incorrect value for
   the SIOCINQ ioctl
 * mac802154 would not drop frames with the "security enabled" bit set,
   even though it does not support security, in violation of the
   standard

Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoieee802154: add header structs with endiannes and operations
Phoebe Buckheister [Fri, 14 Mar 2014 20:24:00 +0000 (21:24 +0100)]
ieee802154: add header structs with endiannes and operations

This patch provides a set of structures to represent 802.15.4 MAC
headers, and a set of operations to push/pull/peek these structs from
skbs. We cannot simply pointer-cast the skb MAC header pointer to these
structs, because 802.15.4 headers are wildly variable - depending on the
first three bytes, virtually all other fields of the header may be
present or not, and be present with different lengths.

The new header creation/parsing routines also support 802.15.4 security
headers, which are currently not supported by the mac802154
implementation of the protocol.
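
A rough sketch of why a fixed struct cannot be cast onto the header:
the length of the addressing fields depends on bits in the frame control
field. The macro and function names here are hypothetical and are not
the ones introduced by the patch; the bit positions follow the usual
802.15.4 frame control layout, and the security header is ignored.

```c
#include <stdint.h>

/* Assumed frame control layout (names are illustrative only). */
#define FC_PANID_COMP	(1u << 6)		/* PAN ID compression */
#define FC_DAMODE(fc)	(((fc) >> 10) & 0x3)	/* dest addressing mode */
#define FC_SAMODE(fc)	(((fc) >> 14) & 0x3)	/* source addressing mode */

#define ADDR_SHORT	2	/* 16-bit short address */
#define ADDR_LONG	3	/* 64-bit extended address */

/* Address length in bytes for a mode; 0 means no address present. */
static int addr_len(unsigned int mode)
{
	switch (mode) {
	case ADDR_SHORT:
		return 2;
	case ADDR_LONG:
		return 8;
	default:
		return 0;
	}
}

/* MAC header length in bytes, without any security header. */
int hdr_len(uint16_t fc)
{
	int dlen = addr_len(FC_DAMODE(fc));
	int slen = addr_len(FC_SAMODE(fc));
	int len = 2 + 1;	/* frame control + sequence number */

	if (dlen)
		len += 2 + dlen;	/* dest PAN ID + dest address */
	if (slen) {
		/* Source PAN ID is elided when compressed against dest. */
		if (!(dlen && (fc & FC_PANID_COMP)))
			len += 2;
		len += slen;
	}
	return len;
}
```

Even in this reduced form the header ranges from 3 to 23 bytes, which is
why the patch uses push/pull/peek operations instead of a pointer cast.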

Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoieee802154: enforce consistent endianness in the 802.15.4 stack
Phoebe Buckheister [Fri, 14 Mar 2014 20:23:59 +0000 (21:23 +0100)]
ieee802154: enforce consistent endianness in the 802.15.4 stack

Enable sparse warnings about endianness, replace the remaining fields
regarding network operations without explicit endianness annotations
with such that are annotated, and propagate this through the entire
stack.

Uses of ieee802154_addr_sa are not changed yet, this patch is only
concerned with all other fields (such as address filters, operation
parameters and the likes).

Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoieee802154: add address struct with proper endiannes and some operations
Phoebe Buckheister [Fri, 14 Mar 2014 20:23:58 +0000 (21:23 +0100)]
ieee802154: add address struct with proper endiannes and some operations

Add a replacement ieee802154_addr struct with proper endianness on
fields. Short address fields are stored as __le16 as on the network,
extended (EUI64) addresses are __le64 as opposed to the u8[8] format
used previously. This disconnect with the netdev address, which is
stored as big-endian u8[8], is intentional.

Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoieee802154: rename struct ieee802154_addr to *_sa
Phoebe Buckheister [Fri, 14 Mar 2014 20:23:57 +0000 (21:23 +0100)]
ieee802154: rename struct ieee802154_addr to *_sa

The struct as currently defined uses host byte order for some fields,
and mostly big endian/EUI display byte order for other fields. Inside the
stack, endianness should ideally match network byte order where possible
to minimize the number of byteswaps done in critical paths, but this
patch does not address this; it is only preparatory.

Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoMerge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sat, 15 Mar 2014 01:07:51 +0000 (18:07 -0700)]
Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull x86 fixes from Peter Anvin:
 "Two x86 fixes: Suresh's eager FPU fix, and a fix to the NUMA quirk for
  AMD northbridges.

  This only includes Suresh's fix patch, not the "mostly a cleanup"
  patch which had __init issues"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/amd/numa: Fix northbridge quirk to assign correct NUMA node
  x86, fpu: Check tsk_used_math() in kernel_fpu_end() for eager FPU

10 years agoMerge tag 'pm+acpi-3.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael...
Linus Torvalds [Sat, 15 Mar 2014 01:02:02 +0000 (18:02 -0700)]
Merge tag 'pm+acpi-3.14-rc7' of git://git./linux/kernel/git/rafael/linux-pm

Pull ACPI and power management fixes from Rafael Wysocki:
 "Three of these are regression fixes, for two recent regressions and
  one introduced during the 3.13 cycle, and the fourth one is a working
  version of the fix that had to be reverted last time.

  Specifics:

   - A recent ACPI resources handling fix overlooked the fact that it
     had to update the ACPI PNP subsystem's resources parsing too and
     caused confusing warning messages to be printed during system
     initialization on some systems (with arguably buggy ACPI tables).
     Fix from Zhang Rui.

   - Moving the early ACPI initialization before timekeeping_init()
     earlier in this cycle broke fast TSC calibration on at least one
     system, so it needs to be done later, but still before
     efi_enter_virtual_mode() to allow the EFI initialization to refer
     to ACPI.

   - A change related to code duplication reduction in the cpufreq core
     inadvertently caused cpufreq initialization to fail for some CPUs
     handled by intel_pstate by adding checks that may fail for that
     driver, but aren't even necessary when it is used.  The issue is
     addressed by preventing those checks from running in the configurations
     in which they aren't needed.

   - If the Hardware Reduced ACPI flag is set in the ACPI tables, system
     suspend, hibernation and ACPI power off will only work when special
     sleep control and sleep status registers are provided (their
     addresses in the ACPI tables are not zero).  If those registers are
     not available, the features in question have no chance of working, so
     they shouldn't even be regarded as supported.  That helps with
     power off in particular, because alternative power off methods may
     be used then and they may actually work"

* tag 'pm+acpi-3.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI / sleep: Add extra checks for HW Reduced ACPI mode sleep states
  ACPI / init: Invoke early ACPI initialization later
  cpufreq: Skip current frequency initialization for ->setpolicy drivers
  PNP / ACPI: proper handling of ACPI IO/Memory resource parsing failures

10 years agoMerge tag 'dm-3.14-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/device...
Linus Torvalds [Sat, 15 Mar 2014 01:01:23 +0000 (18:01 -0700)]
Merge tag 'dm-3.14-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm

Pull device-mapper fixes from Mike Snitzer:
 "Two small fixes for the DM cache target:

   - fix corruption with >2TB fast device due to truncation bug
   - fix access beyond end of origin device due to a partial block"

* tag 'dm-3.14-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm cache: fix access beyond end of origin device
  dm cache: fix truncation bug when copying a block to/from >2TB fast device
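
The >2TB boundary in the truncation fix corresponds to 2^32 sectors of
512 bytes, the point at which a sector offset no longer fits in 32 bits.
The following is a generic illustration of that class of bug, not the
actual dm-cache code; the function names and the block/sector parameters
are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helpers: convert a block index to a starting sector.
 * Both operands are 32 bits wide, as an assumption for this sketch.
 */

/* Buggy form: the multiply is done in 32 bits and wraps before widening. */
static uint64_t block_to_sector_truncated(uint32_t block,
                                          uint32_t sectors_per_block)
{
    assert(sectors_per_block != 0);
    return block * sectors_per_block;
}

/* Fixed form: widen one operand first so the multiply is done in 64 bits. */
static uint64_t block_to_sector(uint32_t block, uint32_t sectors_per_block)
{
    assert(sectors_per_block != 0);
    return (uint64_t)block * sectors_per_block;
}
```

With 8192-sector blocks, block 2^20 starts at sector 2^33; the truncated
form wraps to sector 0, so a copy would silently land at the wrong place.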

10 years agoi40e/i40evf: Use dma_set_mask_and_coherent
Mitch Williams [Tue, 11 Feb 2014 08:26:33 +0000 (08:26 +0000)]
i40e/i40evf: Use dma_set_mask_and_coherent

In Linux 3.13, dma_set_mask_and_coherent was introduced, and we have
been encouraged to use it. It simplifies the DMA mapping code a bit as
well.
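
As a rough userspace model of why the combined helper simplifies things:
the mock_* names below are stand-ins invented for this sketch, not the
kernel API, and the 64-bit-then-32-bit fallback is an assumed probe-time
pattern rather than a copy of the driver code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace model only; the real helpers live in the kernel's DMA API. */
#define MOCK_DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

struct mock_device {
    uint64_t supported_mask;    /* widest mask the pretend platform accepts */
    uint64_t dma_mask;          /* streaming DMA mask */
    uint64_t coherent_dma_mask; /* coherent DMA mask */
};

static int mock_dma_set_mask(struct mock_device *dev, uint64_t mask)
{
    if (mask & ~dev->supported_mask)
        return -5;              /* stand-in for a negative errno */
    dev->dma_mask = mask;
    return 0;
}

static int mock_dma_set_coherent_mask(struct mock_device *dev, uint64_t mask)
{
    if (mask & ~dev->supported_mask)
        return -5;
    dev->coherent_dma_mask = mask;
    return 0;
}

/* One call sets both masks, so callers need a single error check. */
static int mock_dma_set_mask_and_coherent(struct mock_device *dev,
                                          uint64_t mask)
{
    int rc = mock_dma_set_mask(dev, mask);

    if (rc == 0)
        rc = mock_dma_set_coherent_mask(dev, mask);
    return rc;
}

/* Assumed probe-time pattern: try 64-bit DMA, fall back to 32-bit. */
static int mock_setup_dma(struct mock_device *dev)
{
    int err;

    assert(dev != NULL);
    err = mock_dma_set_mask_and_coherent(dev, MOCK_DMA_BIT_MASK(64));
    if (err)
        err = mock_dma_set_mask_and_coherent(dev, MOCK_DMA_BIT_MASK(32));
    return err;
}
```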

Change-ID: I66e340245af7d0dedfa8b40fec1f5e352754432e
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agoi40e/i40evf: Use correct number of VF vectors
Mitch Williams [Tue, 11 Feb 2014 08:26:32 +0000 (08:26 +0000)]
i40e/i40evf: Use correct number of VF vectors

Now that the 2.4 firmware reports the correct number of MSI-X vectors,
use this value correctly when communicating with the VF, and when
setting up the interrupt linked list.

The PF has always reported the correct number of MSI-X vectors, so we
should never increment the value in the vf driver.

Change-ID: Ifeefc631c321390192219ce2af9ada6180c1492f
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agoi40e: Let MDD events be handled by MDD handler
Anjali Singhai Jain [Wed, 12 Feb 2014 01:45:34 +0000 (01:45 +0000)]
i40e: Let MDD events be handled by MDD handler

We have a separate handler for MDD events, so a generic reset is not required.

Change-ID: I77858e2d479e4e65c52aede67109464649ea0253
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agoi40e: Bug fix for FDIR replay logic
Anjali Singhai Jain [Tue, 11 Feb 2014 08:26:30 +0000 (08:26 +0000)]
i40e: Bug fix for FDIR replay logic

The FDIR replay logic was being run a little too soon (before the
queues were enabled) and hence the tail bump was not effective till
a later transaction happened on the queue.

Change-ID: Icfd7cd2e79fc3cae3cbd3f703a2b3a148b4e7bf6
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agoi40e: Add code to handle FD table full condition
Anjali Singhai Jain [Wed, 12 Feb 2014 06:33:25 +0000 (06:33 +0000)]
i40e: Add code to handle FD table full condition

Add code to enforce the following policy:
- If the HW reports filter programming error, we check if it's due to a
  full table.
- If so, we go ahead and turn off new rule addition for ATR and then SB
  in that order.
- We monitor the programmed filter count; if enough room is created due
  to filter deletion/reset, we then re-enable SB and ATR new rule addition.
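
The policy above can be sketched as a small state machine. This is a
standalone illustration, not the driver code: the struct, the function
names, and the table size and thresholds are all hypothetical, and the
sketch ignores features the user has turned off explicitly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FD_TABLE_SIZE  128u /* hypothetical filter table capacity */
#define FD_FULL_MARGIN   8u /* hypothetical: "full" when this close to it */
#define FD_HEADROOM     32u /* hypothetical: free room needed to re-enable */

struct fd_state {
    uint32_t programmed;    /* filters currently programmed in hardware */
    bool atr_enabled;       /* ATR new rule addition allowed */
    bool sb_enabled;        /* SB new rule addition allowed */
};

static bool fd_table_full(const struct fd_state *s)
{
    return s->programmed + FD_FULL_MARGIN >= FD_TABLE_SIZE;
}

/* Called when the hardware reports a filter programming error. */
static void fd_handle_prog_error(struct fd_state *s)
{
    assert(s != NULL);
    if (!fd_table_full(s))
        return;                  /* the error has some other cause */
    if (s->atr_enabled)
        s->atr_enabled = false;  /* first stage: stop ATR additions */
    else if (s->sb_enabled)
        s->sb_enabled = false;   /* second stage: stop SB additions */
}

/* Called periodically to see whether deletions or a reset freed room. */
static void fd_check_and_reenable(struct fd_state *s)
{
    assert(s != NULL);
    if (s->programmed + FD_HEADROOM <= FD_TABLE_SIZE) {
        s->sb_enabled = true;
        s->atr_enabled = true;
    }
}
```

Using a larger headroom for re-enabling than for the full check is one way
to avoid flapping between the two states; whether the driver does this is
not stated in the message.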

Change-ID: I69d24b29e5c45bc4fa861258e11c2fa7b8868748
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agoi40e: Define a new state variable to keep track of feature auto disable
Anjali Singhai Jain [Tue, 11 Feb 2014 08:26:28 +0000 (08:26 +0000)]
i40e: Define a new state variable to keep track of feature auto disable

This variable is a bit mask. It is needed to differentiate between
user enforced feature disables and auto disable of features due to
HW resource limitations.
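
A minimal sketch of the idea, assuming two separate bit masks: one for
what the user asked for and one for what was switched off automatically.
The flag and field names are hypothetical and do not match the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FEAT_ATR (1u << 0) /* hypothetical feature bits */
#define FEAT_SB  (1u << 1)

struct feat_state {
    uint32_t user_enabled;  /* features the user has asked for */
    uint32_t auto_disabled; /* features turned off due to HW limits */
};

/* A feature runs only if the user wants it and it is not auto-disabled. */
static bool feat_active(const struct feat_state *s, uint32_t feat)
{
    assert(feat != 0);
    return (s->user_enabled & feat) && !(s->auto_disabled & feat);
}

static void feat_auto_disable(struct feat_state *s, uint32_t feat)
{
    s->auto_disabled |= feat;
}

/* Clearing the auto bit never turns on a feature the user disabled. */
static void feat_auto_reenable(struct feat_state *s, uint32_t feat)
{
    s->auto_disabled &= ~feat;
}
```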

Change-ID: Ib4b4f6ae1bb2668c12e482d2555100bc8ad713d5
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agoi40e: Fix function comments
Akeem G Abodunrin [Tue, 11 Feb 2014 08:24:15 +0000 (08:24 +0000)]
i40e: Fix function comments

Correct misleading function comment.

Change-ID: I3f66cff5cc00250a285756b6500a58fad8eba4b5
Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agoi40e: simplified init string
Jesse Brandeburg [Tue, 11 Feb 2014 08:24:14 +0000 (08:24 +0000)]
i40e: simplified init string

In a similar way to how ixgbe works, print a short one-line string
showing what features and number of queues the driver and hardware have
enabled at probe time.

Example (wrapped for the commit message):
i40e 0000:06:00.1: Features: PF-id[1] VFs: 64 VSIs: 66 QP: 32 FDir RSS
ATR NTUPLE DCB

Change-ID: I177bf7f93d1c4c921529c92fdf66e614f6b4f755
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agoi40e: cleanup strings
Jesse Brandeburg [Tue, 11 Feb 2014 08:24:13 +0000 (08:24 +0000)]
i40e: cleanup strings

This patch cleans up the strings that the driver prints during normal
operation and moves many strings into dev_dbg.  It also cleans up
strings printed during reset.

Change-ID: I1835cc4e3c3b22596182b683284e6bb87eac61b2
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agoi40e: make string references to q be queue
Jesse Brandeburg [Tue, 11 Feb 2014 08:24:12 +0000 (08:24 +0000)]
i40e: make string references to q be queue

This cleans up strings for consistency, q is replaced with queue.

Change-ID: Ia5f9dfae9af261f4c24485854264e02363729cf3
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agoi40e/i40evf: Some flow director HW definition fixes
Anjali Singhai Jain [Tue, 11 Feb 2014 08:24:11 +0000 (08:24 +0000)]
i40e/i40evf: Some flow director HW definition fixes

1) Fix a name of the error bit to correctly indicate the error.
2) Add an fd_id field in the 32-byte desc at the place (qw0) where it gets
reported in the programming error desc WB. In a normal data desc
the fd_id field is reported in qw3.

Change-ID: Ide9a24bff7273da5889c36635d629bc3b5212010
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Acked-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Kevin Scott <kevin.c.scott@intel.com>
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agoi40e: Fix a bug in the update logic for FDIR SB filter.
Anjali Singhai Jain [Tue, 11 Feb 2014 08:24:09 +0000 (08:24 +0000)]
i40e: Fix a bug in the update logic for FDIR SB filter.

The update filter logic was causing a kernel panic in the original code.
We need to compare the input set to decide whether or not to delete a
filter since we do not have a hash stored. This new design helps fix the issue.

Change-ID: I2462b108e58ca4833312804cda730b4660cc18c9
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agoi40e: delete netdev after deleting napi and vectors
Shannon Nelson [Tue, 11 Feb 2014 08:24:07 +0000 (08:24 +0000)]
i40e: delete netdev after deleting napi and vectors

We've been deleting the netdev before getting around to deleting the napi
structs.  Unfortunately, we then didn't delete the napi structs because we
have a check for netdev, thus we were leaving garbage around in the system.

Change-ID: Ife540176f6c9f801147495b3f2d2ac2e61ddcc58
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agoigb: Fix code comment
Jeff Kirsher [Thu, 13 Mar 2014 23:07:14 +0000 (16:07 -0700)]
igb: Fix code comment

Recently added code comment was missing a space that is needed.

Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agox86/amd/numa: Fix northbridge quirk to assign correct NUMA node
Daniel J Blueman [Thu, 13 Mar 2014 11:43:01 +0000 (19:43 +0800)]
x86/amd/numa: Fix northbridge quirk to assign correct NUMA node

For systems with multiple servers and routed fabric, all
northbridges get assigned to the first server. Fix this by also
using the node reported from the PCI bus. For single-fabric
systems, the northbridges are on PCI bus 0 by definition, which
is on NUMA node 0 by definition, so this is invariant on most
systems.

Tested on fam10h and fam15h single and multi-fabric systems and
candidate for stable.

Signed-off-by: Daniel J Blueman <daniel@numascale.com>
Acked-by: Steffen Persvold <sp@numascale.com>
Acked-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/1394710981-3596-1-git-send-email-daniel@numascale.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
10 years agovti6: Enable namespace changing
Steffen Klassert [Fri, 14 Mar 2014 06:28:09 +0000 (07:28 +0100)]
vti6: Enable namespace changing

vti6 is now fully namespace aware, so allow namespace changing
for vti devices.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
10 years agovti6: Check the tunnel endpoints of the xfrm state and the vti interface
Steffen Klassert [Fri, 14 Mar 2014 06:28:09 +0000 (07:28 +0100)]
vti6: Check the tunnel endpoints of the xfrm state and the vti interface

The tunnel endpoints of the xfrm_state we got from the xfrm_lookup
must match the tunnel endpoints of the vti interface. This patch
ensures this matching.
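
The kind of check described can be modelled in userspace as a comparison
of two IPv6 address pairs. The struct and function names here are
hypothetical, and the sketch does not reproduce the kernel's xfrm or vti6
data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>

/* Hypothetical pair of tunnel endpoints. */
struct tunnel_endpoints {
    struct in6_addr local;
    struct in6_addr remote;
};

/* Fill in a pair from textual addresses; returns false on a parse error. */
static bool parse_endpoints(const char *local, const char *remote,
                            struct tunnel_endpoints *out)
{
    return inet_pton(AF_INET6, local, &out->local) == 1 &&
           inet_pton(AF_INET6, remote, &out->remote) == 1;
}

/* True only if both endpoints of the state match those of the interface. */
static bool endpoints_match(const struct tunnel_endpoints *state,
                            const struct tunnel_endpoints *tunnel)
{
    assert(state != NULL && tunnel != NULL);
    return memcmp(&state->local, &tunnel->local,
                  sizeof(struct in6_addr)) == 0 &&
           memcmp(&state->remote, &tunnel->remote,
                  sizeof(struct in6_addr)) == 0;
}
```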

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>