openwrt/staging/blogic.git
10 years agonet: resort some Kbuild files to hopefully help avoid some conflicts
Stephen Rothwell [Tue, 14 Jan 2014 05:37:45 +0000 (16:37 +1100)]
net: resort some Kbuild files to hopefully help avoid some conflicts

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoMerge branch 'qlcnic'
David S. Miller [Mon, 13 Jan 2014 23:31:42 +0000 (15:31 -0800)]
Merge branch 'qlcnic'

Shahed Shaikh says:

====================
This series includes following changes:
o SRIOV and VLAN filtering related enhancements which includes
   - Do MAC learning for PF
   - Restrict VF from configuring any VLAN mode
   - Enable flooding on PF
   - Turn on promiscuous mode for PF

o Bug fix in qlcnic_sriov_cleanup() introduced by commit
  154d0c81("qlcnic: VLAN enhancement for 84XX adapters")

o Beaconing support for 83xx and 84xx series adapters

o Allow 82xx adapter to perform IPv6 LRO even if destination IP address is not
  programmed.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoqlcnic: Update version to 5.3.54
Shahed Shaikh [Fri, 10 Jan 2014 16:48:59 +0000 (11:48 -0500)]
qlcnic: Update version to 5.3.54

Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoqlcnic: Enable IPv6 LRO even if IP address is not programmed
Shahed Shaikh [Fri, 10 Jan 2014 16:48:58 +0000 (11:48 -0500)]
qlcnic: Enable IPv6 LRO even if IP address is not programmed

o Enabling BIT_9 while configuring hardware LRO allows adapter to
  perform LRO even if destination IP address is not programmed in adapter.

Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoqlcnic: Fix SR-IOV cleanup code path
Manish Chopra [Fri, 10 Jan 2014 16:48:57 +0000 (11:48 -0500)]
qlcnic: Fix SR-IOV cleanup code path

o Add __QLCNIC_SRIOV_ENABLE bit check before doing SRIOV cleanup

Signed-off-by: Manish Chopra <manish.chopra@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoqlcnic: Enable beaconing for 83xx/84xx Series adapter.
Himanshu Madhani [Fri, 10 Jan 2014 16:48:56 +0000 (11:48 -0500)]
qlcnic: Enable beaconing for 83xx/84xx Series adapter.

o Refactored code to handle beaconing test for all adapters.
o Use GET_LED_CONFIG mailbox command for 83xx/84xx series adapter
  to detect current beaconing state of the adapter.

Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoqlcnic: Do MAC learning for SRIOV PF.
Sucheta Chakraborty [Fri, 10 Jan 2014 16:48:55 +0000 (11:48 -0500)]
qlcnic: Do MAC learning for SRIOV PF.

o MAC learning will be done for SRIOV PF to help program VLAN filters
  onto adapter. This will help VNIC traffic to flow through without
  flooding traffic.

Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoqlcnic: Turn on promiscous mode for SRIOV PF.
Sucheta Chakraborty [Fri, 10 Jan 2014 16:48:54 +0000 (11:48 -0500)]
qlcnic: Turn on promiscous mode for SRIOV PF.

o By default, SRIOV PF will have promiscuous mode on.

Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoqlcnic: Enable VF flood bit on PF.
Sucheta Chakraborty [Fri, 10 Jan 2014 16:48:53 +0000 (11:48 -0500)]
qlcnic: Enable VF flood bit on PF.

o On enabling VF flood bit, PF driver will be able to receive traffic
  from all its VFs.

Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoqlcnic: Restrict VF from configuring any VLAN mode.
Sucheta Chakraborty [Fri, 10 Jan 2014 16:48:52 +0000 (11:48 -0500)]
qlcnic: Restrict VF from configuring any VLAN mode.

o Adapter should allow vlan traffic only for vlans configured on a VF.
  On configuring any vlan mode from VF, adapter will allow any vlan
  traffic to pass for that VF. Do not allow VF to configure this mode.

Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agonet: make dev_set_mtu() honor notification return code
Veaceslav Falico [Fri, 10 Jan 2014 15:56:25 +0000 (16:56 +0100)]
net: make dev_set_mtu() honor notification return code

Currently, after changing the MTU for a device, dev_set_mtu() sends the
NETDEV_CHANGEMTU notification but doesn't verify its return code, which
can be NOTIFY_BAD (i.e. some of the net notifier blocks refused this
change), and continues nevertheless.

To fix this, verify the return code, and if it's an error - then revert the
MTU to the original one, notify again and pass the error code.
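
The revert-on-refusal flow described above can be sketched in user-space C.
This is only an illustration under stated assumptions: struct sketch_dev,
the SKETCH_NOTIFY_* values and the listener callback are hypothetical
stand-ins, not the kernel's notifier API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the notifier return values. */
enum { SKETCH_NOTIFY_OK = 0, SKETCH_NOTIFY_BAD = 1 };

struct sketch_dev {
	int mtu;
	/* Hypothetical listener; returns SKETCH_NOTIFY_BAD to veto a change. */
	int (*notify)(const struct sketch_dev *dev);
};

/* Set the MTU and roll it back if a listener refuses the change. */
int sketch_set_mtu(struct sketch_dev *dev, int new_mtu)
{
	int orig_mtu = dev->mtu;

	assert(dev->notify != NULL);
	if (new_mtu == orig_mtu)
		return 0;

	dev->mtu = new_mtu;
	if (dev->notify(dev) == SKETCH_NOTIFY_BAD) {
		/* Revert, then notify again so listeners see the old value. */
		dev->mtu = orig_mtu;
		dev->notify(dev);
		return -1;
	}
	return 0;
}

/* Example listener: refuses any MTU above 1500. */
int sketch_max_1500(const struct sketch_dev *dev)
{
	return dev->mtu > 1500 ? SKETCH_NOTIFY_BAD : SKETCH_NOTIFY_OK;
}
```

With the example listener, a request for 9000 fails and leaves the MTU
unchanged, while a request for 1400 succeeds.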

CC: Jiri Pirko <jiri@resnulli.us>
CC: "David S. Miller" <davem@davemloft.net>
CC: Eric Dumazet <edumazet@google.com>
CC: Alexander Duyck <alexander.h.duyck@intel.com>
CC: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Reviewed-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agopacket: doc: describe PACKET_MMAP with one packet socket for rx and tx
Norbert van Bolhuis [Fri, 10 Jan 2014 09:22:37 +0000 (10:22 +0100)]
packet: doc: describe PACKET_MMAP with one packet socket for rx and tx

Document how to use one AF_PACKET mmap socket for RX and TX.

Signed-off-by: Norbert van Bolhuis <nvbolhuis@aimvalley.nl>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agosctp: make sctp_addto_chunk_fixed local
stephen hemminger [Fri, 10 Jan 2014 06:31:11 +0000 (22:31 -0800)]
sctp: make sctp_addto_chunk_fixed local

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agophylib: Add of_phy_attach
Andy Fleming [Fri, 10 Jan 2014 06:28:11 +0000 (14:28 +0800)]
phylib: Add of_phy_attach

10G PHYs don't currently support running the state machine, which
is implicitly setup via of_phy_connect(). Therefore, it is necessary
to implement an OF version of phy_attach(), which does everything
except start the state machine.

Signed-off-by: Andy Fleming <afleming@gmail.com>
Signed-off-by: Shaohui Xie <Shaohui.Xie@freescale.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agophylib: Support attaching to generic 10g driver
Andy Fleming [Fri, 10 Jan 2014 06:27:54 +0000 (14:27 +0800)]
phylib: Support attaching to generic 10g driver

phy_attach_direct() may now attach to a generic 10G driver. It can
also be used exactly as phy_connect_direct(), which will be useful
when using of_mdio, as phy_connect (and therefore of_phy_connect)
start the PHY state machine, which is currently irrelevant for 10G
PHYs.

Signed-off-by: Andy Fleming <afleming@gmail.com>
Signed-off-by: Shaohui Xie <Shaohui.Xie@freescale.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agophylib: Add generic 10G driver
Andy Fleming [Fri, 10 Jan 2014 06:27:37 +0000 (14:27 +0800)]
phylib: Add generic 10G driver

Very incomplete, but will allow for binding an ethernet controller
to it.

Signed-off-by: Andy Fleming <afleming@gmail.com>
Signed-off-by: Shaohui Xie <Shaohui.Xie@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agophylib: turn genphy_driver to an array
Shaohui Xie [Fri, 10 Jan 2014 06:27:22 +0000 (14:27 +0800)]
phylib: turn genphy_driver to an array

This allows other generic PHY drivers, such as the generic 10G PHY driver,
to join it.

Signed-off-by: Shaohui Xie <Shaohui.Xie@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agophylib: introduce PHY_INTERFACE_MODE_XGMII for 10G PHY
Andy Fleming [Fri, 10 Jan 2014 06:26:46 +0000 (14:26 +0800)]
phylib: introduce PHY_INTERFACE_MODE_XGMII for 10G PHY

Signed-off-by: Andy Fleming <afleming@gmail.com>
Signed-off-by: Shaohui Xie <Shaohui.Xie@freescale.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agophylib: Add Clause 45 read/write functions
Andy Fleming [Fri, 10 Jan 2014 06:25:09 +0000 (14:25 +0800)]
phylib: Add Clause 45 read/write functions

Clause 45 PHYs need an extra parameter for reads and writes, so
a different API that takes the extra parameter is needed.
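
For illustration: Clause 45 addresses a register by a 5-bit device address
plus a 16-bit register number, so one way to carry the extra parameter is to
pack both into a single 32-bit address with a flag bit. The sketch below
assumes that packing; the SKETCH_* names and the flag bit position are
assumptions of this example, not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_ADDR_C45   (UINT32_C(1) << 30) /* assumed "Clause 45" flag bit */
#define SKETCH_DEVAD_MASK 0x1fu               /* 5-bit device address */
#define SKETCH_REG_MASK   0xffffu             /* 16-bit register number */

/* Pack a Clause 45 (device address, register) pair into one value. */
uint32_t sketch_c45_addr(unsigned int devad, unsigned int regnum)
{
	assert(devad <= SKETCH_DEVAD_MASK && regnum <= SKETCH_REG_MASK);
	return SKETCH_ADDR_C45 | ((uint32_t)devad << 16) | (uint32_t)regnum;
}

/* Recover the device address from a packed value. */
unsigned int sketch_c45_devad(uint32_t addr)
{
	return (addr >> 16) & SKETCH_DEVAD_MASK;
}

/* Recover the register number from a packed value. */
unsigned int sketch_c45_regnum(uint32_t addr)
{
	return addr & SKETCH_REG_MASK;
}
```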

Signed-off-by: Andy Fleming <afleming@gmail.com>
Signed-off-by: Shaohui Xie <Shaohui.Xie@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agol2tp: make local functions static
stephen hemminger [Fri, 10 Jan 2014 06:22:27 +0000 (22:22 -0800)]
l2tp: make local functions static

Avoid needless export of local functions

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agobnx2x: namespace and dead code cleanups
stephen hemminger [Fri, 10 Jan 2014 06:20:11 +0000 (22:20 -0800)]
bnx2x: namespace and dead code cleanups

Fix a whole lot of namespace issues with the Broadcom bnx2x driver
found by running 'make namespacecheck'

 * global variables must be prefixed with bnx2x_
    naming a variable int_mode or num_queue is an invitation to disaster

 * make local functions static

 * move some inline's used in one file out of header
   (this driver has a bad case of inline-itis)

 * remove resulting dead code fallout
   bnx2x_pfc_statistic,
   bnx2x_emac_get_pfc_stat,
   bnx2x_init_vlan_mac_obj
   Looks like vlan mac support in this driver was a botch from day one:
   it either never worked, was not implemented, or is missing support functions

Compile tested only.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agodrivers: net: silence compiler warning in smc91x.c
Pankaj Dubey [Fri, 10 Jan 2014 03:04:06 +0000 (12:04 +0900)]
drivers: net: silence compiler warning in smc91x.c

When a 64-bit compiler is used, GCC warns that:

drivers/net/ethernet/smsc/smc91x.c:1897:7:
warning: cast from pointer to integer of different
size [-Wpointer-to-int-cast]

This patch fixes this by changing typecast from "unsigned int" to "unsigned long"
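
A minimal user-space sketch of the underlying issue: on 64-bit targets
"unsigned int" is narrower than a pointer, so the cast truncates and GCC
warns. An integer type as wide as a pointer avoids that. The helper names
below are hypothetical, and uintptr_t is used here as the portable
user-space counterpart of the kernel's "unsigned long" convention.

```c
#include <assert.h>
#include <stdint.h>

/* Round-trip a pointer through an integer wide enough to hold it. */
int sketch_ptr_round_trip(void *p)
{
	uintptr_t bits = (uintptr_t)p; /* no truncation, no warning */

	return (void *)bits == p;
}

/* Extract the low address bits, e.g. for an alignment check. */
unsigned long sketch_low_bits(const void *p, unsigned long mask)
{
	assert((mask & (mask + 1)) == 0); /* mask must be 2^n - 1 */
	return (unsigned long)(uintptr_t)p & mask;
}
```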

CC: "David S. Miller" <davem@davemloft.net>
CC: Jingoo Han <jg1.han@samsung.com>
CC: netdev@vger.kernel.org
Signed-off-by: Pankaj Dubey <pankaj.dubey@samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agogre_offload: simplify GRE header length calculation in gre_gso_segment()
Neal Cardwell [Fri, 10 Jan 2014 01:47:17 +0000 (20:47 -0500)]
gre_offload: simplify GRE header length calculation in gre_gso_segment()

Simplify the GRE header length calculation in gre_gso_segment().
Switch to an approach that is simpler, faster, and more general. The
new approach will continue to be correct even if we add support for
the optional variable-length routing info that may be present in a GRE
header.
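
The message does not spell out the new calculation, so the following is only
a sketch of two ways such a length could be obtained. The first adds 4 bytes
per optional field flagged in the GRE header; the second, assumed here to be
the "more general" style, subtracts offsets and therefore does not care which
optional fields are present. All names are hypothetical; the flag values
follow the GRE header layout with flags in host byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_GRE_CSUM     0x8000u /* checksum field present */
#define SKETCH_GRE_KEY      0x2000u /* key field present */
#define SKETCH_GRE_SEQ      0x1000u /* sequence number present */
#define SKETCH_GRE_BASE_LEN 4u      /* flags/version + protocol type */

/* Flag by flag: base header plus 4 bytes per optional field present. */
size_t sketch_gre_len_from_flags(uint16_t flags)
{
	size_t len = SKETCH_GRE_BASE_LEN;

	if (flags & SKETCH_GRE_CSUM)
		len += 4;
	if (flags & SKETCH_GRE_KEY)
		len += 4;
	if (flags & SKETCH_GRE_SEQ)
		len += 4;
	return len;
}

/* Offset based: everything between the GRE header and the inner packet. */
size_t sketch_gre_len_from_offsets(size_t gre_off, size_t inner_off)
{
	assert(inner_off >= gre_off);
	return inner_off - gre_off;
}
```

The offset-based variant keeps working if further variable-length fields are
added, because it never inspects the flags.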

Signed-off-by: Neal Cardwell <ncardwell@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: H.K. Jerry Chu <hkchu@google.com>
Cc: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agonet_sched: act: remove struct tcf_act_hdr
WANG Cong [Fri, 10 Jan 2014 00:14:05 +0000 (16:14 -0800)]
net_sched: act: remove struct tcf_act_hdr

It is not necessary at all.

Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agonet_sched: avoid casting void pointer
WANG Cong [Fri, 10 Jan 2014 00:14:03 +0000 (16:14 -0800)]
net_sched: avoid casting void pointer

tp->root is a void* pointer, no need to cast it.

Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agonet_sched: optimize tcf_match_indev()
WANG Cong [Fri, 10 Jan 2014 00:14:02 +0000 (16:14 -0800)]
net_sched: optimize tcf_match_indev()

tcf_match_indev() is called in fast path, it is not wise to
search for a netdev by ifindex and then compare by its name,
just compare the ifindex.

Also, dev->name can be changed by user-space, in which case
the match would always fail, whereas dev->ifindex stays
consistent.
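
The difference between the two matching styles can be sketched as follows.
This is a user-space illustration with hypothetical types and names; treating
a filter ifindex of 0 as "match anything" is an assumption of the sketch.

```c
#include <assert.h>
#include <string.h>

#define SKETCH_IFNAMSIZ 16

struct sketch_netdev {
	int ifindex;
	char name[SKETCH_IFNAMSIZ];
};

/* Old style: look the device up by ifindex, then compare its name. */
int sketch_match_by_name(const struct sketch_netdev *devs, int ndevs,
			 int in_ifindex, const char *want_name)
{
	int i;

	assert(devs != NULL || ndevs == 0);
	for (i = 0; i < ndevs; i++) {
		if (devs[i].ifindex == in_ifindex)
			return strcmp(devs[i].name, want_name) == 0;
	}
	return 0;
}

/* New style: a single integer comparison, immune to renames. */
int sketch_match_by_ifindex(int filter_ifindex, int in_ifindex)
{
	if (!filter_ifindex)
		return 1; /* no filter configured (sketch assumption) */
	return filter_ifindex == in_ifindex;
}
```

After a rename the name-based match stops matching, while the ifindex-based
match is unaffected.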

BTW, this will also save some bytes from the core struct of u32.

Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agonet_sched: add struct net pointer to tcf_proto_ops->dump
WANG Cong [Fri, 10 Jan 2014 00:14:01 +0000 (16:14 -0800)]
net_sched: add struct net pointer to tcf_proto_ops->dump

It will be needed by the next patch.

Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agonet_sched: act: clean up notification functions
WANG Cong [Fri, 10 Jan 2014 00:14:00 +0000 (16:14 -0800)]
net_sched: act: clean up notification functions

Refactor tcf_add_notify() and factor out tcf_del_notify().

Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agonet_sched: act: move idx_gen into struct tcf_hashinfo
WANG Cong [Fri, 10 Jan 2014 00:13:59 +0000 (16:13 -0800)]
net_sched: act: move idx_gen into struct tcf_hashinfo

There is no need to store the index separately
since tcf_hashinfo is allocated statically too.

Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agonet: gro: change GRO overflow strategy
Eric Dumazet [Thu, 9 Jan 2014 22:12:19 +0000 (14:12 -0800)]
net: gro: change GRO overflow strategy

The GRO layer has a limit of 8 flows being held in the GRO list,
for performance reasons.

When a packet comes for a flow not yet in the list,
and list is full, we immediately give it to upper
stacks, lowering aggregation performance.

With TSO auto sizing and FQ packet scheduler, this situation
happens more often.

This patch changes strategy to simply evict the oldest flow of
the list. This works better because of the nature of packet
trains for which GRO is efficient. This also has the effect
of lowering the GRO latency if many flows are competing.
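
The eviction strategy can be sketched in user-space C as below. This is an
illustration only: the fixed array ordered oldest-first and the names are
assumptions of the sketch, not the data structure used by the GRO code.

```c
#include <assert.h>
#include <string.h>

#define SKETCH_MAX_FLOWS 8

struct sketch_gro_list {
	int flow[SKETCH_MAX_FLOWS]; /* flow ids, oldest first */
	int count;
};

/*
 * Add a new flow. If the list is full, evict the oldest flow (which would
 * be flushed to the upper stack) and return its id; otherwise return -1.
 */
int sketch_gro_add(struct sketch_gro_list *l, int flow_id)
{
	int evicted = -1;

	assert(l->count >= 0 && l->count <= SKETCH_MAX_FLOWS);
	if (l->count == SKETCH_MAX_FLOWS) {
		evicted = l->flow[0];
		memmove(&l->flow[0], &l->flow[1],
			(SKETCH_MAX_FLOWS - 1) * sizeof(l->flow[0]));
		l->count--;
	}
	l->flow[l->count++] = flow_id;
	return evicted;
}
```

The new flow always gets a slot, so it can start aggregating immediately
instead of being passed up one packet at a time.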

Tested :

Used a 40Gbps NIC, with 4 RX queues, and 200 concurrent TCP_STREAM
netperf.

Before patch, aggregate rate is 11Gbps (while a single flow can reach
30Gbps)

After patch, line rate is reached.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jerry Chu <hkchu@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agonet/mlx4_en: call gro handler for encapsulated frames
Eric Dumazet [Thu, 9 Jan 2014 18:30:13 +0000 (10:30 -0800)]
net/mlx4_en: call gro handler for encapsulated frames

In order to use the native GRO handling of encapsulated protocols on
mlx4, we need to call napi_gro_receive() instead of netif_receive_skb()
unless busy polling is in action.

While we are at it, rename mlx4_en_cq_ll_polling() to
mlx4_en_cq_busy_polling()

Tested with GRE tunnel : GRO aggregation is now performed on the
ethernet device instead of being done later on gre device.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Amir Vadai <amirv@mellanox.com>
Cc: Jerry Chu <hkchu@google.com>
Cc: Or Gerlitz <ogerlitz@mellanox.com>
Acked-By: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agogre_offload: fix sparse non static symbol warning
Wei Yongjun [Thu, 9 Jan 2014 14:22:05 +0000 (22:22 +0800)]
gre_offload: fix sparse non static symbol warning

Fixes the following sparse warning:

net/ipv4/gre_offload.c:253:5: warning:
 symbol 'gre_gro_complete' was not declared. Should it be static?

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoMerge branch 'ip_forward_pmtu'
David S. Miller [Mon, 13 Jan 2014 19:23:02 +0000 (11:23 -0800)]
Merge branch 'ip_forward_pmtu'

Hannes Frederic Sowa says:

====================
path mtu hardening patches

After a lot of back and forth I want to propose these changes regarding
path mtu hardening and give an outline of why I think this is the best way
to proceed:

This set contains the following patches:
* ipv4: introduce ip_dst_mtu_maybe_forward and protect forwarding path against pmtu spoofing
* ipv6: introduce ip6_dst_mtu_forward and protect forwarding path with it
* ipv4: introduce hardened ip_no_pmtu_disc mode

The first one switches the forwarding path of IPv4 to use the interface
mtu by default and ignore a possible discovered path mtu. It provides
a sysctl to switch back to the original behavior (see discussion below).

The second patch does the same thing unconditionally for IPv6. I don't
provide a knob for IPv6 to switch to original behavior (please see
below).

The third patch introduces a hardened pmtu mode, where pmtu
information is accepted only where the protocol is able to do more stringent
checks on the icmp piggyback payload (please see the patch commit msg
for further details).

Why is this change necessary?

First of all, RFC 1191 4. Router specification says:
"When a router is unable to forward a datagram because it exceeds the
 MTU of the next-hop network and its Don't Fragment bit is set, the
 router is required to return an ICMP Destination Unreachable message
 to the source of the datagram, with the Code indicating
 "fragmentation needed and DF set". ..."

For some time now fragmentation has been considered problematic, e.g.:
* http://www.hpl.hp.com/techreports/Compaq-DEC/WRL-87-3.pdf
* http://tools.ietf.org/search/rfc4963

Most of them seem to agree that fragmentation should be avoided because
of efficiency, data corruption or security concerns.

Recently it was shown possible that correctly guessing IP ids could lead
to data injection on DNS packets:
<https://sites.google.com/site/hayashulman/files/fragmentation-poisoning.pdf>

While we can try to completely stop fragmentation on the end host
(this is e.g. implemented via IP_PMTUDISC_INTERFACE), we cannot stop
fragmentation completely on the forwarding path. On the end host the
application has to deal with MTUs and has to choose fallback methods
if fragmentation could be an attack vector. This is already the case for
most DNS software, where a maximum UDP packet size can be configured. But
until recently they had no control over local fragmentation and could
thus emit fragmented packets.

On the forwarding path we can just try to delay the fragmentation to
the last hop where this is really necessary. Current kernel already does
that but only because routers don't receive feedback of path mtus, these are
only sent back to the end host system. But it is possible to maliciously
insert path mtu information via ICMP packets which have an icmp echo_reply
payload, because we cannot validate those notifications against local
sockets. DHCP clients which establish an any-bound RAW-socket could also
start processing unwanted fragmentation-needed packets.

Why does IPv4 have a knob to revert to old behavior while IPv6 doesn't?

IPv4 does fragmentation on the path while IPv6 does always respond with
packet-too-big errors. The interface MTU will always be greater than
the path MTU information. So we would discard packets we could actually
forward because of malicious information. After this change we would
let the hop, which really could not forward the packet, notify the host
of this problem.

IPv4 allows fragmentation mid-path. Someone might use software
which tries to discover such paths and assumes that the kernel is handling
the discovered pmtu information automatically. This should be an extremely
rare case, but because I could not exclude the possibility this knob is
provided. Also this software could insert non-locked mtu information
into the kernel. We cannot distinguish that from path mtu information
currently. Premature fragmentation could solve some problems in wrongly
configured networks, thus this switch is provided.

One frag-needed packet could reduce the path mtu down to 552 bytes
(route/min_pmtu).

Misc:

IPv6 neighbor discovery could advertise mtu information for an
interface. This information updates the ipv6-specific interface mtu and
thus gets used by the forwarding path.

Tunnel and xfrm output path will still honour path mtu and also respond
with Packet-too-Big or fragmentation-needed errors if needed.

Changelog for all patches:
v2)
* enabled ip_forward_use_pmtu by default
* reworded
v3)
* disabled ip_forward_use_pmtu by default
* reworded
v4)
* renamed ip_dst_mtu_secure to ip_dst_mtu_maybe_forward
* updated changelog accordingly
* removed unneeded !!(... & ...) double negations

v2)
* by default we honour pmtu information
v3)
* only honor interface mtu
* rewritten and simplified
* no knob to fall back to old mode any more

v2)
* reworded Documentation
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoipv4: introduce hardened ip_no_pmtu_disc mode
Hannes Frederic Sowa [Thu, 9 Jan 2014 09:01:17 +0000 (10:01 +0100)]
ipv4: introduce hardened ip_no_pmtu_disc mode

This new ip_no_pmtu_disc mode only allows fragmentation-needed errors
to be honored by protocols which do more stringent validation on the
ICMP's packet payload. This knob is useful for people who e.g. want to
run an unmodified DNS server in a namespace where they need to use pmtu
for TCP connections (as they are used for zone transfers or fallback
for requests) but don't want to use possibly spoofed UDP pmtu information.

Currently the whitelisted protocols are TCP, SCTP and DCCP as they check
if the returned packet is in the window or if the association is valid.
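
The whitelist check described above can be sketched as a small predicate.
This is a user-space illustration: the "hardened" flag and the function name
are hypothetical, and the constants are the IANA protocol numbers.

```c
#include <assert.h>

/* IANA-assigned IP protocol numbers. */
enum {
	SKETCH_PROTO_TCP  = 6,
	SKETCH_PROTO_UDP  = 17,
	SKETCH_PROTO_DCCP = 33,
	SKETCH_PROTO_SCTP = 132
};

/*
 * Decide whether a fragmentation-needed error for the given protocol may
 * update the path MTU. In hardened mode only protocols that validate the
 * ICMP payload against their own state are trusted.
 */
int sketch_accept_frag_needed(int hardened, int proto)
{
	assert(proto >= 0 && proto <= 255);
	if (!hardened)
		return 1;

	switch (proto) {
	case SKETCH_PROTO_TCP:
	case SKETCH_PROTO_SCTP:
	case SKETCH_PROTO_DCCP:
		return 1;
	default:
		return 0;
	}
}
```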

Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David Miller <davem@davemloft.net>
Cc: John Heffner <johnwheffner@gmail.com>
Suggested-by: Florian Weimer <fweimer@redhat.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoipv6: introduce ip6_dst_mtu_forward and protect forwarding path with it
Hannes Frederic Sowa [Thu, 9 Jan 2014 09:01:16 +0000 (10:01 +0100)]
ipv6: introduce ip6_dst_mtu_forward and protect forwarding path with it

In the IPv6 forwarding path we are only concerned about the outgoing
interface MTU, but also respect locked MTUs on routes. Tunnel provider
or IPSEC already have to recheck and if needed send PtB notifications
to the sending host in case the data does not fit into the packet with
added headers (we only know the final header sizes there, while also
using path MTU information).

The reason for this change is, that path MTU information can be injected
into the kernel via e.g. icmp_err protocol handler without verification
of local sockets. As such, this could cause the IPv6 forwarding path to
wrongfully emit Packet-too-Big errors and drop IPv6 packets.

Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David Miller <davem@davemloft.net>
Cc: John Heffner <johnwheffner@gmail.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoipv4: introduce ip_dst_mtu_maybe_forward and protect forwarding path against pmtu...
Hannes Frederic Sowa [Thu, 9 Jan 2014 09:01:15 +0000 (10:01 +0100)]
ipv4: introduce ip_dst_mtu_maybe_forward and protect forwarding path against pmtu spoofing

While forwarding we should not use the protocol path mtu to calculate
the mtu for a forwarded packet but instead use the interface mtu.

We mark forwarded skbs in ip_forward with IPSKB_FORWARDED, which was
introduced for multicast forwarding. But as it does not conflict with
our usage in unicast code path it is perfect for reuse.

I moved the functions ip_sk_accept_pmtu, ip_sk_use_pmtu and ip_skb_dst_mtu
along with the new ip_dst_mtu_maybe_forward to net/ip.h to fix circular
dependencies because of IPSKB_FORWARDED.

Because someone might have written software which probes
destinations manually and expects the kernel to honour those path mtus
I introduced a new per-namespace "ip_forward_use_pmtu" knob so someone
can disable this new behaviour. We also still use mtus which are locked on a
route for forwarding.
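
The resulting MTU selection can be summarised with a small sketch. It models
the description above in user-space C: struct sketch_route and the function
are hypothetical, and forward_use_pmtu stands for the "ip_forward_use_pmtu"
knob.

```c
#include <assert.h>

struct sketch_route {
	unsigned int dev_mtu;  /* MTU of the outgoing interface */
	unsigned int path_mtu; /* learned or configured path MTU */
	int mtu_locked;        /* MTU locked on the route by the admin */
};

/*
 * Pick the MTU for a packet. Locally generated traffic keeps using the
 * path MTU; forwarded traffic uses the interface MTU unless the knob is
 * set or the route MTU is locked.
 */
unsigned int sketch_mtu_maybe_forward(const struct sketch_route *rt,
				      int forwarding, int forward_use_pmtu)
{
	assert(rt->dev_mtu > 0);
	if (!forwarding || forward_use_pmtu || rt->mtu_locked)
		return rt->path_mtu;
	return rt->dev_mtu;
}
```

A spoofed path MTU therefore no longer affects forwarded packets in the
default configuration.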

The reason for this change is that path mtu information can be injected
into the kernel via e.g. icmp_err protocol handler without verification
of local sockets. As such, this could cause the IPv4 forwarding path to
wrongfully emit fragmentation needed notifications or start to fragment
packets along a path.

Tunnel and ipsec output paths clear IPCB again, thus IPSKB_FORWARDED
won't be set and further fragmentation logic will use the path mtu to
determine the fragmentation size. They also recheck packet size with
help of path mtu discovery and report appropriate errors.

Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David Miller <davem@davemloft.net>
Cc: John Heffner <johnwheffner@gmail.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoHHF qdisc: fix jiffies-time conversion.
Terry Lam [Thu, 9 Jan 2014 08:40:00 +0000 (00:40 -0800)]
HHF qdisc: fix jiffies-time conversion.

This is to be compatible with the use of "get_time" (i.e. default
time unit in us) in iproute2 patch for HHF as requested by Stephen.
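
The message does not show the conversion itself. As a rough illustration of
what converting between microseconds and timer ticks involves, here is a
user-space sketch; the helper names and the explicit hz parameter are
assumptions of the example, and overflow for very large inputs is ignored.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_USEC_PER_SEC 1000000u

/* Microseconds to ticks, rounding up so a short interval is never 0. */
uint64_t sketch_usecs_to_ticks(uint64_t usecs, unsigned int hz)
{
	assert(hz > 0 && hz <= SKETCH_USEC_PER_SEC);
	return (usecs * hz + SKETCH_USEC_PER_SEC - 1) / SKETCH_USEC_PER_SEC;
}

/* Ticks back to microseconds. */
uint64_t sketch_ticks_to_usecs(uint64_t ticks, unsigned int hz)
{
	assert(hz > 0);
	return ticks * SKETCH_USEC_PER_SEC / hz;
}
```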

Signed-off-by: Terry Lam <vtlam@google.com>
Acked-by: Nandita Dukkipati <nanditad@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoqlcnic: Convert vmalloc/memset to kcalloc
Joe Perches [Thu, 9 Jan 2014 06:42:25 +0000 (22:42 -0800)]
qlcnic: Convert vmalloc/memset to kcalloc

vmalloc is a limited resource.  Don't use it unnecessarily.

It seems this allocation should work with kcalloc.

Remove unnecessary memset(,0,) of buf as it's completely
overwritten as the previously only unset field in
struct qlcnic_pci_func_cfg is now set to 0.

Use kfree instead of vfree.
Use ETH_ALEN instead of 6.
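
A user-space analogue of the pattern, for illustration only: calloc() plays
the role of kcalloc() here, returning zeroed memory for an array so that no
separate memset is needed. The struct and names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

#define SKETCH_ETH_ALEN 6 /* named constant instead of a bare 6 */

struct sketch_func_cfg {
	unsigned int pci_func;
	unsigned char mac[SKETCH_ETH_ALEN];
};

/* Allocate a zeroed array of n configs; the caller releases it with free(). */
struct sketch_func_cfg *sketch_alloc_cfgs(size_t n)
{
	assert(n > 0);
	/* calloc() zeroes the memory and takes count and size separately. */
	return calloc(n, sizeof(struct sketch_func_cfg));
}
```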

Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agobonding: remove dead code from 3ad
Veaceslav Falico [Wed, 8 Jan 2014 15:46:48 +0000 (16:46 +0100)]
bonding: remove dead code from 3ad

That code has been around for ages without being used.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agobonding: convert 3ad to use pr_warn instead of pr_warning
Veaceslav Falico [Wed, 8 Jan 2014 15:46:47 +0000 (16:46 +0100)]
bonding: convert 3ad to use pr_warn instead of pr_warning

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agobonding: clean up style for bond_3ad.c
Veaceslav Falico [Wed, 8 Jan 2014 15:46:46 +0000 (16:46 +0100)]
bonding: clean up style for bond_3ad.c

It's a huge mess currently, that is really hard to read. This cleanup
doesn't touch the logic at all, it only breaks easy-to-fix long lines and
updates comment styles.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoMerge branch 'alx_stats'
David S. Miller [Sun, 12 Jan 2014 04:53:03 +0000 (20:53 -0800)]
Merge branch 'alx_stats'

Sabrina Dubroca says:

====================
alx: add statistics

Currently, the alx driver doesn't support statistics [1,2]. The
original alx driver [3] that Johannes Berg modified provided
statistics. This patch is an adaptation of the statistics code from
the original driver to the alx driver included in the kernel.

v4:
 - modified the assignments of hw stats to netstats (Ben Hutchings)
 - added comments to describe the stats fields (copied from atlx)

v3:
 - renamed __alx_update_hw_stats to alx_update_hw_stats (Stephen Hemminger)

v2:
 - use u64 instead of unsigned long  (Ben Hutchings)
 - implement ndo_get_stats64 instead of ndo_get_stats (Ben Hutchings)
 - use EINVAL instead of ENOTSUPP  (Ben Hutchings)
 - add BUILD_BUG_ON to check the size of the stats (Johannes Berg, Ben
   Hutchings)
 - add a comment regarding persistence of the stats (Stephen Hemminger)
 - align assignments in __alx_update_hw_stats
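
As an illustration of the size check mentioned in v2, the sketch below uses
C11 _Static_assert as a user-space stand-in for BUILD_BUG_ON. The struct,
the name table and the helper are hypothetical and much smaller than a real
stats layout.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct sketch_hw_stats {
	uint64_t rx_ok;
	uint64_t rx_bytes;
	uint64_t tx_ok;
	uint64_t tx_bytes;
};

const char *const sketch_stat_names[] = {
	"rx_ok", "rx_bytes", "tx_ok", "tx_bytes",
};

#define SKETCH_N_STATS \
	(sizeof(sketch_stat_names) / sizeof(sketch_stat_names[0]))

/* Fail the build if the struct and the name table drift apart. */
_Static_assert(sizeof(struct sketch_hw_stats) ==
	       SKETCH_N_STATS * sizeof(uint64_t),
	       "stats struct and name table are out of sync");

/* Copy the stats out as a flat u64 array, one slot per name. */
void sketch_get_stats(const struct sketch_hw_stats *s, uint64_t *out)
{
	assert(s != NULL && out != NULL);
	memcpy(out, s, sizeof(*s));
}
```

Adding a field to the struct without adding its name now breaks the build
instead of silently misaligning the dump.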

[1] https://bugzilla.kernel.org/show_bug.cgi?id=63401
[2] http://www.spinics.net/lists/netdev/msg245544.html
[3] https://github.com/mcgrof/alx
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoalx: add stats to ethtool
Sabrina Dubroca [Thu, 9 Jan 2014 09:09:31 +0000 (10:09 +0100)]
alx: add stats to ethtool

Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoalx: add alx_get_stats64 operation
Sabrina Dubroca [Thu, 9 Jan 2014 09:09:30 +0000 (10:09 +0100)]
alx: add alx_get_stats64 operation

Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoalx: add stats update function
Sabrina Dubroca [Thu, 9 Jan 2014 09:09:29 +0000 (10:09 +0100)]
alx: add stats update function

Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoalx: add constants for the stats fields
Sabrina Dubroca [Thu, 9 Jan 2014 09:09:28 +0000 (10:09 +0100)]
alx: add constants for the stats fields

Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoalx: add a hardware stats structure
Sabrina Dubroca [Thu, 9 Jan 2014 09:09:27 +0000 (10:09 +0100)]
alx: add a hardware stats structure

Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoMerge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net...
David S. Miller [Sun, 12 Jan 2014 04:51:10 +0000 (20:51 -0800)]
Merge branch 'master' of git://git./linux/kernel/git/jkirsher/net-next

Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates

This series contains updates to i40e and now i40evf.

Most notable is Jacob's patch to add PTP support to i40e.

Mitch cleans up additional memcpy's and uses struct assignment instead.
Then fixes long lines to appease checkpatch.pl.  Mitch then provides
a fix to keep us from spamming the log with confusing errors.  If you
use ip to change the MAC address of a VF while the VF driver is loaded,
closing the VF interface or unloading the VF driver will cause the VF
driver to remove the MAC filter for its original (now invalid) MAC
address.
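
The memcpy cleanup mentioned above amounts to the following pattern, shown
here as a generic sketch with a hypothetical struct rather than code from
the driver.

```c
#include <assert.h>
#include <string.h>

struct sketch_vsi_info {
	unsigned short id;
	unsigned char mac[6];
	unsigned int flags;
};

/* Before: byte copy with a size argument that must be kept correct. */
void sketch_copy_memcpy(struct sketch_vsi_info *dst,
			const struct sketch_vsi_info *src)
{
	memcpy(dst, src, sizeof(*dst));
}

/* After: struct assignment, checked by the compiler for matching types. */
void sketch_copy_assign(struct sketch_vsi_info *dst,
			const struct sketch_vsi_info *src)
{
	assert(dst != NULL && src != NULL);
	*dst = *src;
}
```

Both copies produce the same member values; the assignment form simply
removes the chance of passing the wrong size or mismatched types.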

Jesse cleans up macros which are no longer needed or used.

I (Jeff) clean up function header comments to ensure Doxygen/kdoc works
correctly to generate documentation without warnings.

Anjali fixes a bug where ethtool set-channels would return failure when
configuring only one Rx queue.  Then fixes a bug where the driver was
erroneously exiting the driver unload path if one part of the unload
failed.

Shannon fixes a checksum problem: if the IPV6EXADD bit is set in the Rx descriptor status,
there was an optional extension header with an alternate IP address
detected and the hardware checksum was not handling the alternate IP
address correctly.  Then adjusts the ITR max and min values to match
the hardware max value and recommended min value.  Shannon makes sure
to clear the PXE mode after the adminq is initialized.

v2:
 - fix patch 14 "i40e: enable PTP" to address Richard Cochran's spelling
   catch and Ben Hutchings' Kconfig, SIOCGHWTSTAMP and sizeof() suggestions
 - added Paul Gortmaker's i40evf fix patch
v3:
 - fix patch 14 "i40e: enable PTP" to address Ben Hutchings' concerns about
   a race with PTP init and cleanup and i40e_get_ts_info().
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoi40evf: fix s390 build failure due to implicit prefetch.h
Paul Gortmaker [Sat, 11 Jan 2014 04:00:31 +0000 (04:00 +0000)]
i40evf: fix s390 build failure due to implicit prefetch.h

As of commit 7f12ad741a4870b8b6e3aafbcd868d0191770802 ("i40evf: transmit
and receive functionality") the s390 builds (allyesconfig) fail with:

drivers/net/ethernet/intel/i40evf/i40e_txrx.c: In function 'i40e_clean_rx_irq':
drivers/net/ethernet/intel/i40evf/i40e_txrx.c:818:3: error: implicit declaration of function 'prefetch'
make[5]: *** [drivers/net/ethernet/intel/i40evf/i40e_txrx.o] Error 1

due to an implicit assumption that the prototype from linux/prefetch.h
will be present.

Cc: Mitch Williams <mitch.a.williams@intel.com>
Cc: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Acked-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agoi40e: Bump version
Catherine Sullivan [Sat, 21 Dec 2013 05:44:52 +0000 (05:44 +0000)]
i40e: Bump version

Update the driver version to 0.3.28-k.

Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agoi40e: fix log message wording
Shannon Nelson [Sat, 21 Dec 2013 05:44:51 +0000 (05:44 +0000)]
i40e: fix log message wording

Change the redundant "vsi VSI" to VSI.

Change-ID: Ic16ea5820a99abc7831713cde39e7d032a7ba4d3
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agoi40e: enable PTP
Jacob Keller [Sat, 11 Jan 2014 05:43:19 +0000 (05:43 +0000)]
i40e: enable PTP

New feature: Enable PTP support in the i40e driver.

Change-ID: I6a8e799f582705191f9583afb1b9231a8db96cc8
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Matthew Vick <matthew.vick@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agoi40e: call clear_pxe after adminq is initialized
Shannon Nelson [Sat, 21 Dec 2013 05:44:49 +0000 (05:44 +0000)]
i40e: call clear_pxe after adminq is initialized

In the latest firmware the clear_pxe_mode function will use the
AdminQ request, so call this after AdminQ is set up rather than
relying on i40e_pf_reset() to clear the PXE mode.

Change-ID: Ice8cba2e9cbc3c7bde0a0bcf8eaf5009abef040b
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agoi40e: clear qtx_head before enabling Tx queue
Shannon Nelson [Sat, 21 Dec 2013 05:44:48 +0000 (05:44 +0000)]
i40e: clear qtx_head before enabling Tx queue

Make sure the "new" qtx_head[q] register is cleared before
enabling the Tx queue.

Change-ID: I0c7a12815e343a5ae68807af172a35d6c6857935
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agoi40e: adjust ITR max and min values
Shannon Nelson [Sat, 21 Dec 2013 05:44:47 +0000 (05:44 +0000)]
i40e: adjust ITR max and min values

Set the ITR max and min values to match the hardware max value
and the recommended min value.  These values are shifted right
one bit because the register counts in 2 usec units, so leave
a comment to explain.
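
As a rough sketch of the unit conversion described above (the macro names
and limit values below are hypothetical and are not taken from the driver),
the shift by one bit simply converts microseconds into the register's
2 usec units:

```c
/* Hypothetical limits in microseconds; the real values live in the driver. */
#define EXAMPLE_ITR_MIN_USECS 8U
#define EXAMPLE_ITR_MAX_USECS 8160U

/* The register counts in 2 usec units, so shift right by one bit. */
#define EXAMPLE_USECS_TO_ITR_REG(usecs) ((usecs) >> 1)

/* Clamp a requested interval to the limits, then convert to register units. */
unsigned int example_clamp_itr_reg(unsigned int usecs)
{
	if (usecs < EXAMPLE_ITR_MIN_USECS)
		usecs = EXAMPLE_ITR_MIN_USECS;
	if (usecs > EXAMPLE_ITR_MAX_USECS)
		usecs = EXAMPLE_ITR_MAX_USECS;
	return EXAMPLE_USECS_TO_ITR_REG(usecs);
}
```

With these assumed limits, a request of 100 usecs is written to the
register as 50.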

Change-ID: I289c27955cf6c566a6d21b95c3110b88cbb15dad
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agoi40e: check for possible incorrect ipv6 checksum
Shannon Nelson [Sat, 21 Dec 2013 05:44:46 +0000 (05:44 +0000)]
i40e: check for possible incorrect ipv6 checksum

If the IPV6EXADD bit is set in the Rx descriptor status, there
was an optional extension header with an alternate IP address
detected.  The HW checksum offload doesn't handle the alternate
IP address correctly so likely comes up with the wrong answer.
Thus, if the bit is set we ignore the checksum offload value.
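
The decision described above can be sketched roughly as follows.  The bit
positions and names are hypothetical (the real layout is defined by the
hardware descriptor format), and the "checksum processed" bit is an
assumption added only to make the example complete:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit positions; the real layout is defined by the hardware. */
#define EXAMPLE_RX_STATUS_CSUM_DONE (1U << 3)  /* checksum was processed */
#define EXAMPLE_RX_STATUS_IPV6EXADD (1U << 15) /* alternate address seen */

/* Return true only when the hardware checksum result may be trusted. */
bool example_rx_csum_usable(uint32_t rx_status)
{
	if (!(rx_status & EXAMPLE_RX_STATUS_CSUM_DONE))
		return false;
	/*
	 * An extension header carried an alternate IPv6 address, so the
	 * offload value is likely wrong: ignore it and let the stack
	 * verify the checksum in software.
	 */
	if (rx_status & EXAMPLE_RX_STATUS_IPV6EXADD)
		return false;
	return true;
}
```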

Change-ID: I70ff8d38cdcddccf44107691cae13d0c07c284c8
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agoi40e: allow VF to remove any MAC filter
Mitch Williams [Sat, 21 Dec 2013 05:44:45 +0000 (05:44 +0000)]
i40e: allow VF to remove any MAC filter

If you use ip to change the MAC address of a VF while the VF
driver is loaded, closing the VF interface or unloading the VF
driver will cause the VF driver to remove the MAC filter for its
original (now invalid) MAC address. This would cause the PF
driver to kick an error message to the log, and back to the VF
driver.

Since the VF driver has not really done anything naughty, let's
not punish it. Don't check for MAC address overrides on the
delete operation, just make sure it's a valid address. This keeps
us from spamming the log with confusing errors.
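
A minimal userspace sketch of the relaxed delete check, assuming that
"valid" means a unicast address that is not all zeros (the function names
are hypothetical and this is not the PF driver code):

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in validity check: not all zeros and not a multicast address. */
bool example_mac_is_valid(const uint8_t mac[6])
{
	uint8_t any = 0;
	int i;

	for (i = 0; i < 6; i++)
		any |= mac[i];
	return any != 0 && !(mac[0] & 0x01);
}

/*
 * On delete, only the address validity is checked.  No comparison with
 * the currently assigned MAC is made, so a VF can remove the filter for
 * its original address after the MAC was changed.
 */
bool example_vf_may_delete_filter(const uint8_t mac[6])
{
	return example_mac_is_valid(mac);
}
```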

Change-ID: I1f051bd4014e50855457d928c9ee8b0766981b2f
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agoi40e: do not bail when disabling if Tx queue disable fails
Anjali Singhai Jain [Sat, 21 Dec 2013 05:44:44 +0000 (05:44 +0000)]
i40e: do not bail when disabling if Tx queue disable fails

Fix a bug where the driver was erroneously exiting the driver unload
path if one part of the unload failed.  Instead, the driver should
always continue when disabling and be sure to disable all queues.
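
The "keep going" pattern can be sketched as below.  The structure and
function names are hypothetical and the hardware is simulated; the point is
only that the first error is remembered while the loop still visits every
queue:

```c
/* Hypothetical queue state, with the hardware response simulated. */
struct example_queue {
	int enabled;     /* 1 while the queue is running */
	int disable_err; /* error the simulated hardware reports, 0 if none */
};

int example_disable_queue(struct example_queue *q)
{
	if (q->disable_err)
		return q->disable_err;
	q->enabled = 0;
	return 0;
}

/* Try every queue; remember the first failure but never stop early. */
int example_disable_all(struct example_queue *queues, int count)
{
	int ret = 0;
	int i;

	for (i = 0; i < count; i++) {
		int err = example_disable_queue(&queues[i]);

		if (err && !ret)
			ret = err;
	}
	return ret;
}
```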

Change-ID: Ib8c81c596bc87c31d8e9ca97ebf871168475279d
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agoi40e: Setting queue count to 1 using ethtool is valid
Anjali Singhai Jain [Sat, 21 Dec 2013 05:44:43 +0000 (05:44 +0000)]
i40e: Setting queue count to 1 using ethtool is valid

Fix a bug where ethtool set-channels would return failure when configuring
only one Rx queue.

Change-ID: Id833c48c17d71e352b30f3249f6acf9e7aaec57e
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agoi40e: Cleanup Doxygen warnings
Jeff Kirsher [Sat, 21 Dec 2013 05:44:42 +0000 (05:44 +0000)]
i40e: Cleanup Doxygen warnings

These changes make Doxygen/kdoc work correctly without warnings.

Change-ID: I2941f38860be805ff7548d84dae35754c83f1d62
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
10 years agoi40e: fix long lines
Mitch Williams [Sat, 21 Dec 2013 05:44:41 +0000 (05:44 +0000)]
i40e: fix long lines

Avoid over-length lines in order to appease checkpatch.

Change-ID: I63820a710acf798f49d2f85c610228711af84f72
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agoi40e: Bump version
Catherine Sullivan [Wed, 18 Dec 2013 13:46:07 +0000 (13:46 +0000)]
i40e: Bump version

Update driver version to 0.3.27-k

Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agoi40e: Update the Current NVM version Low value
Anjali Singhai Jain [Wed, 18 Dec 2013 13:46:06 +0000 (13:46 +0000)]
i40e: Update the Current NVM version Low value

The current driver will warn the user if the NVM version
is out of date; this raises the bar to a newer version.

Change-ID: I5ec21d8efa4e7c3fdacb56f85d310bb2229b1483
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agoi40e: drop unused macros
Jesse Brandeburg [Wed, 18 Dec 2013 13:46:05 +0000 (13:46 +0000)]
i40e: drop unused macros

A previous commit removed any need for these macros, so remove
them too.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agoi40e: use assignment instead of memcpy
Mitch Williams [Wed, 18 Dec 2013 13:46:04 +0000 (13:46 +0000)]
i40e: use assignment instead of memcpy

These instances were found by coccinelle/spatch, and can
use struct assignment instead of memcpy.
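
For illustration only (the structure below is hypothetical and not from
the i40e driver), the two forms copy the same data, but the assignment is
type-checked by the compiler:

```c
#include <string.h>

/* Hypothetical structure, for illustration only. */
struct example_ring_stats {
	unsigned long packets;
	unsigned long bytes;
};

/* Before: copy via memcpy, with no type checking on the arguments. */
void example_copy_memcpy(struct example_ring_stats *dst,
			 const struct example_ring_stats *src)
{
	memcpy(dst, src, sizeof(*dst));
}

/* After: plain struct assignment, which the compiler type-checks. */
void example_copy_assign(struct example_ring_stats *dst,
			 const struct example_ring_stats *src)
{
	*dst = *src;
}
```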

Change-ID: Idc23c3599241bf8a658bda18c80417af3fbfee66
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agoMerge tag 'batman-adv-for-davem' of git://git.open-mesh.org/linux-merge
David S. Miller [Fri, 10 Jan 2014 22:59:34 +0000 (17:59 -0500)]
Merge tag 'batman-adv-for-davem' of git://git.open-mesh.org/linux-merge

Included changes:
- substitute FSF address with URL
- deselect current bat-GW when GW-client mode gets deactivated
- send every DHCP packet using bat-unicast messages when GW-client mode is
  enabled
- implement the Extended Isolation mechanism (it is an enhancement of the
  already existing batman-AP-isolation). This mechanism allows the user to drop
  packets exchanged by selected clients by using netfilter marks.
- fix typ0 in header guard
- minor code cleanups

Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoMerge branch 'tcp_metrics_saddr'
David S. Miller [Fri, 10 Jan 2014 22:38:33 +0000 (17:38 -0500)]
Merge branch 'tcp_metrics_saddr'

Christoph Paasch says:

====================
Make tcp-metrics source-address aware

Currently tcp-metrics are stored per destination address only. This causes
problems when a host has multiple interfaces (e.g., a smartphone having
WiFi/3G):

For example, a host contacting a server over WiFi will store the tcp-metrics
per destination IP. If then the host contacts the same server over 3G, the
same tcp-metrics will be used, although the path-characteristics are completely
different (e.g., the ssthresh is probably not the same).

In case of TFO this is not a problem, as the server will provide us a new cookie
once it has seen our SYN+DATA with an incorrect cookie.
It may be (in case of carrier-grade NAT), that we keep the same public IP but
have a different private IP. Thus, we better reuse the old cookie even if our
source-IP has changed. However, this scenario is probably very uncommon, as
carriers try to provide the same src-IP to the clients behind their CGN.

Patches 1 + 2 add the source-IP to the tcp metrics.

Patches 3 to 5 modify the netlink-api to support the source-IP. From now on,
when using the command "ip tcp_metrics delete address ADDRESS" all entries
which match this destination IP will be deleted.

Today's iproute2 will complain when doing "ip tcp_metrics flush PREFIX" if
several entries are present for the same destination-IP but with different
source-IPs:

root@client:~/test# ip tcp_metrics
10.2.1.2 age 3.640sec rtt 16250us rttvar 15000us cwnd 10
10.2.1.2 age 4.030sec rtt 18750us rttvar 15000us cwnd 10
root@client:~/test# ip tcp_metrics flush 10.2.1.2/16
Failed to send flush request
: No such process

Follow-up patches will modify iproute2 to handle this correctly and allow
specifying the source-IP in the get/del commands.

v2: Added the patch that allows selective get/del of tcp-metrics based
    on src-IP and moved the patch that adds the new netlink attribute before
    the other patches.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agotcp: metrics: Allow selective get/del of tcp-metrics based on src IP
Christoph Paasch [Wed, 8 Jan 2014 15:05:59 +0000 (16:05 +0100)]
tcp: metrics: Allow selective get/del of tcp-metrics based on src IP

We want to be able to get/del tcp-metrics based on the src IP. This
patch adds the necessary parsing of the netlink attribute and if the
source address is set, it will match on this one too.

Signed-off-by: Christoph Paasch <christoph.paasch@uclouvain.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agotcp: metrics: Delete all entries matching a certain destination
Christoph Paasch [Wed, 8 Jan 2014 15:05:58 +0000 (16:05 +0100)]
tcp: metrics: Delete all entries matching a certain destination

As we now can have multiple entries per destination-IP, the "ip
tcp_metrics delete address ADDRESS" command deletes all of them.

Signed-off-by: Christoph Paasch <christoph.paasch@uclouvain.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agotcp: metrics: New netlink attribute for src IP and dumped in netlink reply
Christoph Paasch [Wed, 8 Jan 2014 15:05:57 +0000 (16:05 +0100)]
tcp: metrics: New netlink attribute for src IP and dumped in netlink reply

This patch adds a new netlink attribute for the source-IP and appends it
to the netlink reply. Now, iproute2 can have access to the source-IP.

Signed-off-by: Christoph Paasch <christoph.paasch@uclouvain.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agotcp: metrics: Add source-address to tcp-metrics
Christoph Paasch [Wed, 8 Jan 2014 15:05:56 +0000 (16:05 +0100)]
tcp: metrics: Add source-address to tcp-metrics

We add the source-address to the tcp-metrics, so that different metrics
will be used per source/destination-pair. We use the destination-hash to
store the metric inside the hash-table. That way, deleting and dumping
via "ip tcp_metrics" is easy.
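
A simplified userspace sketch of this layout follows.  All names, the
bucket count and the hash function are assumptions made for the example
and do not reflect the kernel structures.  Entries are matched on the
source/destination pair, but the bucket is picked from the destination
alone, so every entry for one destination sits in a single chain:

```c
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_BUCKETS 16

/* Hypothetical entry; addresses are IPv4 values in host byte order. */
struct example_metric {
	uint32_t saddr;
	uint32_t daddr;
	uint32_t rtt_us;
	struct example_metric *next;
};

static struct example_metric *example_table[EXAMPLE_BUCKETS];

/* The bucket is chosen from the destination only (top 4 bits of a hash). */
unsigned int example_hash(uint32_t daddr)
{
	return (uint32_t)(daddr * 2654435761U) >> 28;
}

void example_insert(struct example_metric *m)
{
	unsigned int b = example_hash(m->daddr);

	m->next = example_table[b];
	example_table[b] = m;
}

/* Lookup matches on both addresses within the destination's bucket. */
struct example_metric *example_lookup(uint32_t saddr, uint32_t daddr)
{
	struct example_metric *m;

	for (m = example_table[example_hash(daddr)]; m; m = m->next)
		if (m->saddr == saddr && m->daddr == daddr)
			return m;
	return NULL;
}

/* Per-destination operations need to walk one bucket only. */
unsigned int example_count_for_dest(uint32_t daddr)
{
	const struct example_metric *m;
	unsigned int n = 0;

	for (m = example_table[example_hash(daddr)]; m; m = m->next)
		if (m->daddr == daddr)
			n++;
	return n;
}
```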

Signed-off-by: Christoph Paasch <christoph.paasch@uclouvain.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agotcp: metrics: rename tcpm_addr to tcpm_daddr
Christoph Paasch [Wed, 8 Jan 2014 15:05:55 +0000 (16:05 +0100)]
tcp: metrics: rename tcpm_addr to tcpm_daddr

As we will add also the source-address, we rename all accesses to the
tcp-metrics address to use "daddr".

Signed-off-by: Christoph Paasch <christoph.paasch@uclouvain.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoMerge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville...
David S. Miller [Fri, 10 Jan 2014 19:53:33 +0000 (14:53 -0500)]
Merge branch 'for-davem' of git://git./linux/kernel/git/linville/wireless-next

John W. Linville says:

====================
Please pull these updates for the 3.14 stream!

For the mac80211 bits, Johannes says:

"Felix adds some helper functions for P2P NoA software tracking, Joe
fixes alignment (but as this apparently never caused issues I didn't
send it to 3.13), Kyeyoon/Jouni add QoS-mapping support (a Hotspot 2.0
feature), Weilong fixed a bunch of checkpatch errors and I get to play
fire-fighter or so and clean up other people's locking issues. I also
added nl80211 vendor-specific events, as we'd discussed at the wireless
summit."

For the iwlwifi bits, Emmanuel says:

"I have here a rework of the interrupt handling to meet RT kernel
requirements - basically we don't take any lock in the primary interrupt
handler. This gave me a good reason to clean things up a bit on the way.
There is also a fix of the QoS mapping along with a few workarounds for
hardware / firmware issues that are hard to hit.
Three fixes suggested by static analyzers, and other various stuff.
Most importantly, I update the Copyright note to include the new year."

For the bluetooth bits, Gustavo says:

"More patches to 3.14. The bulk of changes here is the 6LoWPAN support for
Bluetooth LE Devices. The commits that touch net/ieee802154/ are already
acked by David Miller. Other than that we have some RFCOMM fixes and
improvements plus fixes and clean ups all over the tree."

Beyond that, ath9k, brcmfmac, mwifiex, and wil6210 get their usual
level of attention.  The wl1251 driver gets a number of updates,
and there are a handful of other bits here and there.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoMerge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
David S. Miller [Fri, 10 Jan 2014 19:50:02 +0000 (14:50 -0500)]
Merge branch 'master' of git://git./linux/kernel/git/pablo/nf-next

Pablo Neira Ayuso says:

====================
This batch contains one single patch with the l2tp match
for xtables, from James Chapman.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoMerge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wirel...
John W. Linville [Fri, 10 Jan 2014 15:59:40 +0000 (10:59 -0500)]
Merge branch 'master' of git://git./linux/kernel/git/linville/wireless-next into for-davem

Conflicts:
net/ieee802154/6lowpan.c

10 years agoxen-netback: stop vif thread spinning if frontend is unresponsive
Paul Durrant [Wed, 8 Jan 2014 12:41:58 +0000 (12:41 +0000)]
xen-netback: stop vif thread spinning if frontend is unresponsive

The recent patch to improve guest receive side flow control (ca2f09f2) had a
slight flaw in the wait condition for the vif thread in that any remaining
skbs in the guest receive side netback internal queue would prevent the
thread from sleeping. An unresponsive frontend can lead to a permanently
non-empty internal queue and thus the thread will spin. In this case the
thread should really sleep until the frontend becomes responsive again.

This patch adds an extra flag to the vif which is set if the shared ring
is full and cleared when skbs are drained into the shared ring. Thus,
if the thread runs, finds the shared ring full and can make no progress the
flag remains set. If the flag remains set then the thread will sleep,
regardless of a non-empty queue, until the next event from the frontend.
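
The wait condition can be modelled roughly as below.  This is a simplified
userspace model with hypothetical names, not the netback code; in
particular, treating a frontend event as an unconditional wake-up is an
assumption of the sketch:

```c
#include <stdbool.h>

/* Hypothetical, simplified view of the per-vif state. */
struct example_vif {
	int queued_skbs; /* skbs waiting in the internal queue */
	bool ring_full;  /* set when the shared ring had no room */
};

/* Queued skbs only count as work while the ring can accept them. */
bool example_thread_should_run(const struct example_vif *vif,
			       bool frontend_event)
{
	if (frontend_event)
		return true;
	return vif->queued_skbs > 0 && !vif->ring_full;
}

/* Move skbs into the shared ring, given the number of free slots. */
void example_drain(struct example_vif *vif, int ring_slots_free)
{
	if (vif->queued_skbs > 0 && ring_slots_free == 0) {
		vif->ring_full = true; /* no progress possible */
		return;
	}
	while (vif->queued_skbs > 0 && ring_slots_free > 0) {
		vif->queued_skbs--;
		ring_slots_free--;
	}
	vif->ring_full = false; /* skbs were drained into the ring */
}
```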

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agocxgb4: Changed FW check version to match FW binary version
Hariprasad Shenai [Wed, 8 Jan 2014 11:24:47 +0000 (16:54 +0530)]
cxgb4: Changed FW check version to match FW binary version

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoirda: sh_sir: use devm_request_irq()
Kuninori Morimoto [Wed, 8 Jan 2014 04:55:05 +0000 (20:55 -0800)]
irda: sh_sir: use devm_request_irq()

Huqiu reported that the current sh_sir driver doesn't
call free_irq() in spite of using request_irq().
This patch replaces request_irq() with devm_request_irq()
to solve this issue.

Reported-by: Huqiu Liu <huqiuliu@gmail.com>
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoirda: sh_irda: use devm_request_irq()
Kuninori Morimoto [Wed, 8 Jan 2014 04:54:55 +0000 (20:54 -0800)]
irda: sh_irda: use devm_request_irq()

Huqiu reported that the current sh_irda driver doesn't
call free_irq() in spite of using request_irq().
This patch replaces request_irq() with devm_request_irq()
to solve this issue.

Reported-by: Huqiu Liu <huqiuliu@gmail.com>
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoirda: fixup SH_SIR position on Kconfig
Kuninori Morimoto [Wed, 8 Jan 2014 04:54:46 +0000 (20:54 -0800)]
irda: fixup SH_SIR position on Kconfig

SH_SIR is not Dongle

Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoMerge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nftables
David S. Miller [Fri, 10 Jan 2014 02:36:01 +0000 (21:36 -0500)]
Merge branch 'master' of git://git./linux/kernel/git/pablo/nftables

Pablo Neira Ayuso says:

====================
nf_tables updates for net-next

The following patchset contains the following nf_tables updates,
mostly updates from Patrick McHardy, they are:

* Add the "inet" table and filter chain type for this new netfilter
  family: NFPROTO_INET. This special table/chain allows IPv4 and IPv6
  rules, this should help to simplify the burden in the administration
  of dual stack firewalls. This also includes several patches to prepare
  the infrastructure for this new table and a new meta extension to
  match the layer 3 and 4 protocol numbers, from Patrick McHardy.

* Load both IPv4 and IPv6 conntrack modules in nft_ct if the rule is used
  in NFPROTO_INET, as we don't know for certain which one would be used,
  also from Patrick McHardy.

* Do not allow deleting a table that contains sets; otherwise these
  sets become orphaned, from Patrick McHardy.

* Hold a reference to the corresponding nf_tables family module when
  creating a table of that family type, to avoid the module deletion
  when in use, from Patrick McHardy.

* Update chain counters before setting the chain policy to ensure that
  we don't leave the chain in inconsistent state in case of errors (aka.
  restore chain atomicity). This also fixes a possible leak if it fails
  to allocate the chain counters if no counters are passed to be restored,
  from Patrick McHardy.

* Don't check for overflows in the table counter if we are just renaming
  a chain, from Patrick McHardy.

* Replay the netlink request after dropping the nfnl lock to load the
  module that provides a chain type, from Patrick.

* Fix chain type module references, from Patrick.

* Several cleanups, function renames, constification and code
  refactorizations also from Patrick McHardy.

* Add support to set the connmark, this can be used to set it based on
  the meta mark (similar feature to -j CONNMARK --restore), from
  Kristian Evensen.

* A couple of fixes to the recently added meta/set support and nft_reject,
  and fix missing chain type unregistration if we fail to register
  the family table/filter chain type, from myself.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agonetfilter: nf_tables: fix error path in the init functions
Pablo Neira Ayuso [Thu, 9 Jan 2014 19:32:19 +0000 (20:32 +0100)]
netfilter: nf_tables: fix error path in the init functions

We have to unregister the chain type if netns registration fails.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
10 years agonetfilter: introduce l2tp match extension
James Chapman [Mon, 6 Jan 2014 10:17:08 +0000 (10:17 +0000)]
netfilter: introduce l2tp match extension

Introduce an xtables add-on for matching L2TP packets. Supports L2TPv2
and L2TPv3 over IPv4 and IPv6. As well as filtering on L2TP tunnel-id
and session-id, the filtering decision can also include the L2TP
packet type (control or data), protocol version (2 or 3) and
encapsulation type (UDP or IP).

The most common use for this will likely be to filter L2TP data
packets of individual L2TP tunnels or sessions. While a u32 match can
be used, the L2TP protocol headers are such that field offsets differ
depending on bits set in the header, making rules for matching generic
L2TP connections cumbersome. This match extension takes care of all
that.
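
To illustrate why fixed offsets are awkward, here is a small sketch that
reads the tunnel and session IDs from an L2TPv2 header as laid out in
RFC 2661.  The function and macro names are made up for the example and
this is not the xtables match code.  The IDs start at byte 2 or byte 4
depending on whether the optional length field (L bit) is present:

```c
#include <stddef.h>
#include <stdint.h>

/* Flag bits of the first 16-bit word of an L2TPv2 header (RFC 2661). */
#define EXAMPLE_L2TP_T_BIT    0x8000 /* control message */
#define EXAMPLE_L2TP_L_BIT    0x4000 /* length field present */
#define EXAMPLE_L2TP_VER_MASK 0x000f

struct example_l2tp_ids {
	uint16_t tunnel_id;
	uint16_t session_id;
	int is_control;
};

static uint16_t example_be16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

/* Parse the start of an L2TPv2 header; returns 0 on success, -1 on error. */
int example_l2tpv2_parse(const uint8_t *buf, size_t len,
			 struct example_l2tp_ids *out)
{
	size_t off = 2; /* tunnel ID follows the flags word ... */
	uint16_t flags;

	if (len < 2)
		return -1;
	flags = example_be16(buf);
	if ((flags & EXAMPLE_L2TP_VER_MASK) != 2)
		return -1;
	if (flags & EXAMPLE_L2TP_L_BIT)
		off += 2; /* ... unless a length field comes first */
	if (len < off + 4)
		return -1;
	out->tunnel_id = example_be16(buf + off);
	out->session_id = example_be16(buf + off + 2);
	out->is_control = !!(flags & EXAMPLE_L2TP_T_BIT);
	return 0;
}
```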

Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
10 years agoMerge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net...
David S. Miller [Thu, 9 Jan 2014 20:13:12 +0000 (15:13 -0500)]
Merge branch 'master' of git://git./linux/kernel/git/jkirsher/net-next

Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates

This series contains updates to i40e only.

Anjali provides a fix where interrupts were not being re-enabled on ICR0
even though they were auto masked by hardware.  Then provides a fix to
cleanup RSS initialization because it was doing some extra work, so
remove the extra work and any bugs it created when managing number of
queues.  Since hardware requires a full packet template to be pointed to
when adding hardware flow filters, add the template and use it for
programming filters.

Jesse provides a fix to replace the use of driver specific defines with
kernel ETH_ALEN defines.  Then disables packet split because with the
use of GRO, we do not need the extra bus overhead.  Fixes spelling
error in code comment.

Kamil provides a fix for the driver where the hardware expects the MAC
address in a very specific format and the driver was filling the data
incorrectly.

Mitch provides a fix to resolve a panic on reset by adding checks to
VSI->rx_rings.  Then shortens alloc_rx_buff_failed and
alloc_rx_page_failed variables since both are part of an RX specific
structure so just remove the _rx part of the name.  Then fixes
badly formatted lines, long lines and mis-formatted lines.

Shannon provides a fix to call AQ to release any reservation held by this
PF on the NVM resource lock on startup, in order to clear anything that
might have been left over from a previous run.  Then removes interrupt on
AQ error since nearly everything we do is synchronous, using the
interrupt-on-error bit is unnecessary and causing unneeded interrupts.
Adds code to handle the ability to send messages among the physical
function interfaces by the admin queue.

Catherine sets the MFP flag earlier in software init and uses that flag
to decide if other hardware work-arounds are required which turns
off flow director in MFP mode.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoip_tunnel: fix sparse non static symbol warning
Wei Yongjun [Wed, 8 Jan 2014 13:59:30 +0000 (21:59 +0800)]
ip_tunnel: fix sparse non static symbol warning

Fixes the following sparse warning:

net/ipv4/ip_tunnel.c:116:18: warning:
 symbol 'tunnel_dst_check' was not declared. Should it be static?

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoopenvswitch: Use kmem_cache_free() instead of kfree()
Wei Yongjun [Wed, 8 Jan 2014 10:13:14 +0000 (18:13 +0800)]
openvswitch: Use kmem_cache_free() instead of kfree()

Memory allocated by kmem_cache_alloc() should be freed using
kmem_cache_free(), not kfree().

Fixes: e298e5057006 ('openvswitch: Per cpu flow stats.')
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Acked-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agonetfilter: nf_tables: rename nft_do_chain_pktinfo() to nft_do_chain()
Patrick McHardy [Thu, 9 Jan 2014 18:42:43 +0000 (18:42 +0000)]
netfilter: nf_tables: rename nft_do_chain_pktinfo() to nft_do_chain()

We don't encode argument types into function names and since besides
nft_do_chain() there are only AF-specific versions, there is no risk
of confusion.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
10 years agonetfilter: nf_tables: prohibit deletion of a table with existing sets
Patrick McHardy [Thu, 9 Jan 2014 18:42:41 +0000 (18:42 +0000)]
netfilter: nf_tables: prohibit deletion of a table with existing sets

We currently leak the set memory when deleting a table that still has
sets in it. Return EBUSY when attempting to delete a table with sets.
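
In outline, the check looks something like the following userspace sketch
(structure and function names are hypothetical, not the nf_tables code):

```c
#include <errno.h>

/* Hypothetical table state. */
struct example_table {
	unsigned int num_sets; /* sets still attached to the table */
	int deleted;           /* 1 once the table has been removed */
};

/* Refuse to delete a table that still owns sets instead of leaking them. */
int example_table_delete(struct example_table *table)
{
	if (table->num_sets > 0)
		return -EBUSY;
	table->deleted = 1;
	return 0;
}
```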

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
10 years agonetfilter: nf_tables: take AF module reference when creating a table
Patrick McHardy [Thu, 9 Jan 2014 18:42:40 +0000 (18:42 +0000)]
netfilter: nf_tables: take AF module reference when creating a table

The table refers to data of the AF module, so we need to make sure the
module isn't unloaded while the table exists.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
10 years agonetfilter: nf_tables: perform flags validation before table allocation
Patrick McHardy [Thu, 9 Jan 2014 18:42:39 +0000 (18:42 +0000)]
netfilter: nf_tables: perform flags validation before table allocation

Simplifies error handling. Additionally use the correct type u32 for the
host byte order flags value.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
10 years agonetfilter: nf_tables: minor nf_chain_type cleanups
Patrick McHardy [Thu, 9 Jan 2014 18:42:38 +0000 (18:42 +0000)]
netfilter: nf_tables: minor nf_chain_type cleanups

Minor nf_chain_type cleanups:

- reorder struct to plug a hole
- rename struct module member to "owner" for consistency
- rename nf_hookfn array to "hooks" for consistency
- reorder initializers for better readability

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
10 years agonetfilter: nf_tables: constify chain type definitions and pointers
Patrick McHardy [Thu, 9 Jan 2014 18:42:37 +0000 (18:42 +0000)]
netfilter: nf_tables: constify chain type definitions and pointers

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
10 years agonetfilter: nf_tables: replay request after dropping locks to load chain type
Patrick McHardy [Thu, 9 Jan 2014 18:42:36 +0000 (18:42 +0000)]
netfilter: nf_tables: replay request after dropping locks to load chain type

To avoid races, we need to replay the request after dropping the nfnl_mutex
to auto-load the chain type module.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
10 years agonetfilter: nf_tables: add missing module references to chain types
Patrick McHardy [Thu, 9 Jan 2014 18:42:35 +0000 (18:42 +0000)]
netfilter: nf_tables: add missing module references to chain types

In some cases we neither take a reference to the AF info nor to the
chain type, allowing the module to be unloaded while in use.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
10 years agonetfilter: nf_tables: fix chain type module reference handling
Patrick McHardy [Thu, 9 Jan 2014 18:42:34 +0000 (18:42 +0000)]
netfilter: nf_tables: fix chain type module reference handling

The chain type module reference handling makes no sense at all: we take
a reference immediately when the module is registered, preventing the
module from ever being unloaded.

Fix by taking a reference when we're actually creating a chain of the
chain type and releasing the reference when destroying the chain.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
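
The reference handling described in the commit above can be sketched in plain user-space C. This is an illustration under stated assumptions, not the nf_tables implementation: all types and functions carry a _sketch suffix and are hypothetical, and module_get_sketch() is only loosely modelled on the kernel's try_module_get(). Registration takes no reference; creating a chain takes one and destroying the chain drops it, so the module can be unloaded whenever no chain of its type exists.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a loadable module with a use count. */
struct module_sketch {
    int refcnt;       /* number of chains currently using the module */
    bool unloading;   /* set once unload has started */
};

/* Hypothetical chain type, registered once and owned by a module. */
struct chain_type_sketch {
    struct module_sketch *owner;
};

struct chain_sketch {
    const struct chain_type_sketch *type;
};

/* Fails if the module is already going away. */
static bool module_get_sketch(struct module_sketch *m)
{
    if (m->unloading)
        return false;
    m->refcnt++;
    return true;
}

static void module_put_sketch(struct module_sketch *m)
{
    assert(m->refcnt > 0);
    m->refcnt--;
}

/* Registration deliberately takes no reference. */
static void register_chain_type_sketch(struct chain_type_sketch *t,
                                       struct module_sketch *owner)
{
    t->owner = owner;
}

/* The reference is taken only when a chain of this type is created. */
static struct chain_sketch *chain_create_sketch(const struct chain_type_sketch *t)
{
    struct chain_sketch *c;

    if (!module_get_sketch(t->owner))
        return NULL;

    c = malloc(sizeof(*c));
    if (c == NULL) {
        module_put_sketch(t->owner);   /* undo on failure */
        return NULL;
    }
    c->type = t;
    return c;
}

/* Destroying the chain releases the reference taken at creation. */
static void chain_destroy_sketch(struct chain_sketch *c)
{
    module_put_sketch(c->type->owner);
    free(c);
}

static bool module_can_unload_sketch(const struct module_sketch *m)
{
    return m->refcnt == 0;
}
```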
10 years agonetfilter: nf_tables: fix check for table overflow
Patrick McHardy [Thu, 9 Jan 2014 18:42:33 +0000 (18:42 +0000)]
netfilter: nf_tables: fix check for table overflow

The table use counter is only increased for new chains, so move the check
to the correct position.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
10 years agonetfilter: nf_tables: restore chain change atomicity
Patrick McHardy [Thu, 9 Jan 2014 18:42:32 +0000 (18:42 +0000)]
netfilter: nf_tables: restore chain change atomicity

Chain counter validation is performed after the chain policy has
potentially been changed. Move counter validation/setting before
changing of the chain policy to fix this.

Additionally fix a memory leak if chain counter allocation fails
for new chains, remove an unnecessary free_percpu() and move
counter allocation for new chains

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
10 years agonetfilter: nf_tables: split chain policy validation from actually setting it
Patrick McHardy [Thu, 9 Jan 2014 18:42:31 +0000 (18:42 +0000)]
netfilter: nf_tables: split chain policy validation from actually setting it

Currently nf_tables_newchain() atomicity is broken because some netlink
attributes are validated after attributes of the chain have already been
changed. The chain policy is (currently) fine, but split it up as
preparation for the following fixes and to avoid future mistakes.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
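
The split between validating and setting described in the commit above is an instance of a general validate-then-commit pattern, sketched below in user-space C. The structures, the policy constants and the rule that negative counters are invalid are assumptions made for illustration only; they are not taken from nf_tables. Every attribute of the update is checked before the first field of the chain is written, so a rejected request leaves the chain exactly as it was.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical policy values for this sketch. */
enum { POLICY_DROP_SKETCH = 0, POLICY_ACCEPT_SKETCH = 1 };

struct chain_state_sketch {
    uint32_t policy;
    uint64_t packets;
    uint64_t bytes;
};

/* Hypothetical parsed request: each attribute is optional. */
struct chain_update_sketch {
    bool has_policy;
    uint32_t policy;
    bool has_counters;
    int64_t packets;   /* negative values are treated as invalid here */
    int64_t bytes;
};

/* Phase 1: check every attribute; touches nothing. */
static int chain_update_validate_sketch(const struct chain_update_sketch *u)
{
    if (u->has_policy &&
        u->policy != POLICY_DROP_SKETCH && u->policy != POLICY_ACCEPT_SKETCH)
        return -1;
    if (u->has_counters && (u->packets < 0 || u->bytes < 0))
        return -1;
    return 0;
}

/* Phase 2: apply only after the whole request has been validated. */
static int chain_update_apply_sketch(struct chain_state_sketch *c,
                                     const struct chain_update_sketch *u)
{
    int err = chain_update_validate_sketch(u);

    if (err)
        return err;   /* nothing has been modified yet */

    if (u->has_policy)
        c->policy = u->policy;
    if (u->has_counters) {
        c->packets = (uint64_t)u->packets;
        c->bytes = (uint64_t)u->bytes;
    }
    return 0;
}
```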
10 years agonetfilter: nft_meta: fix lack of validation of the input register
Pablo Neira Ayuso [Thu, 9 Jan 2014 19:03:55 +0000 (20:03 +0100)]
netfilter: nft_meta: fix lack of validation of the input register

We have to validate that the input register is in the range of
allowed registers, otherwise we can take an incorrect register
value as input that may lead us to a crash.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
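
A minimal sketch of the range check described in the commit above, in user-space C. The register count, the structure and the function names are hypothetical and do not reproduce nft_meta. The register index supplied by the caller is checked once, when the expression is set up, so that the evaluation path can index the register array without reading out of bounds.

```c
#include <assert.h>
#include <stdint.h>

#define REG_COUNT_SKETCH 5u   /* hypothetical number of registers */

/* Hypothetical expression that reads its input from one register. */
struct meta_expr_sketch {
    uint32_t sreg;   /* source register index, validated at init time */
};

/* Rejects out-of-range indices before they are ever stored. */
static int meta_init_sketch(struct meta_expr_sketch *e, uint32_t sreg)
{
    if (sreg >= REG_COUNT_SKETCH)
        return -1;
    e->sreg = sreg;
    return 0;
}

/* Evaluation relies on the check done in meta_init_sketch(). */
static uint32_t meta_eval_sketch(const struct meta_expr_sketch *e,
                                 const uint32_t regs[REG_COUNT_SKETCH])
{
    assert(e->sreg < REG_COUNT_SKETCH);
    return regs[e->sreg];
}
```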
10 years agonetfilter: nft_ct: Add support to set the connmark
Kristian Evensen [Tue, 7 Jan 2014 15:43:54 +0000 (16:43 +0100)]
netfilter: nft_ct: Add support to set the connmark

This patch adds kernel support for setting properties of tracked
connections. Currently, only connmark is supported. One use-case
for this feature is to provide the same functionality as
-j CONNMARK --save-mark in iptables.

Some restructuring was needed to implement the set op. The new
structure follows that of nft_meta.

Signed-off-by: Kristian Evensen <kristian.evensen@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>