openwrt/staging/blogic.git
Gustavo A. R. Silva [Tue, 7 Aug 2018 23:24:38 +0000 (18:24 -0500)]
vxge: Mark expected switch fall-throughs

In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.

Addresses-Coverity-ID: 114796 ("Missing break in switch")
Addresses-Coverity-ID: 114804 ("Missing break in switch")
Addresses-Coverity-ID: 114806 ("Missing break in switch")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
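
For readers unfamiliar with the annotation, a minimal sketch of what "marking" a fall-through looks like follows. It is not taken from vxge; the enum and function names are hypothetical. With -Wimplicit-fallthrough, GCC flags a case that runs into the next one unless a marker such as the comment below precedes the next label.

```c
/* Hypothetical link-speed codes, for illustration only. */
enum speed { SPEED_10, SPEED_100, SPEED_1000 };

/* Returns a capability bitmask; a faster link implies the slower modes. */
unsigned int speed_caps(enum speed s)
{
	unsigned int caps = 0;

	switch (s) {
	case SPEED_1000:
		caps |= 0x4;
		/* fall through */
	case SPEED_100:
		caps |= 0x2;
		/* fall through */
	case SPEED_10:
		caps |= 0x1;
		break;
	}
	return caps;
}
```

Here speed_caps(SPEED_1000) accumulates all three bits because each case deliberately continues into the next; the comments document that intent for both the compiler and the reader.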
Gustavo A. R. Silva [Tue, 7 Aug 2018 23:24:04 +0000 (18:24 -0500)]
igbvf: netdev: Mark expected switch fall-through

In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.

Addresses-Coverity-ID: 114801 ("Missing break in switch")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gustavo A. R. Silva [Tue, 7 Aug 2018 23:23:31 +0000 (18:23 -0500)]
igb: e1000_phy: Mark expected switch fall-through

In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.

Addresses-Coverity-ID: 114800 ("Missing break in switch")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gustavo A. R. Silva [Tue, 7 Aug 2018 23:22:57 +0000 (18:22 -0500)]
igb: e1000_82575: Mark expected switch fall-through

In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.

Addresses-Coverity-ID: 114799 ("Missing break in switch")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gustavo A. R. Silva [Tue, 7 Aug 2018 23:22:22 +0000 (18:22 -0500)]
igb_main: Mark expected switch fall-throughs

In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.

Addresses-Coverity-ID: 200521 ("Missing break in switch")
Addresses-Coverity-ID: 114797 ("Missing break in switch")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gustavo A. R. Silva [Tue, 7 Aug 2018 23:21:40 +0000 (18:21 -0500)]
net/mlx4/en_rx: Mark expected switch fall-throughs

In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.

Addresses-Coverity-ID: 114794 ("Missing break in switch")
Addresses-Coverity-ID: 114795 ("Missing break in switch")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gustavo A. R. Silva [Tue, 7 Aug 2018 23:21:05 +0000 (18:21 -0500)]
net/mlx4/mcg: Mark expected switch fall-throughs

In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.

Addresses-Coverity-ID: 114792 ("Missing break in switch")
Addresses-Coverity-ID: 114793 ("Missing break in switch")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gustavo A. R. Silva [Tue, 7 Aug 2018 23:20:27 +0000 (18:20 -0500)]
i40e_txrx: mark expected switch fall-through

In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.

Addresses-Coverity-ID: 114791 ("Missing break in switch")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gustavo A. R. Silva [Tue, 7 Aug 2018 23:19:42 +0000 (18:19 -0500)]
i40e_main: mark expected switch fall-through

In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.

Addresses-Coverity-ID: 114790 ("Missing break in switch")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gustavo A. R. Silva [Tue, 7 Aug 2018 23:18:30 +0000 (18:18 -0500)]
net: hns3: Mark expected switch fall-through

In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.

Addresses-Coverity-ID: 114789 ("Missing break in switch")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gustavo A. R. Silva [Tue, 7 Aug 2018 23:17:50 +0000 (18:17 -0500)]
net: hns: Mark expected switch fall-through

In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.

Addresses-Coverity-ID: 114788 ("Missing break in switch")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gustavo A. R. Silva [Tue, 7 Aug 2018 23:17:08 +0000 (18:17 -0500)]
be2net: Mark expected switch fall-through

In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.

Addresses-Coverity-ID: 114787 ("Missing break in switch")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gustavo A. R. Silva [Tue, 7 Aug 2018 23:16:07 +0000 (18:16 -0500)]
net: tulip: de4x5: mark expected switch fall-throughs

In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.

Addresses-Coverity-ID: 114784 ("Missing break in switch")
Addresses-Coverity-ID: 114785 ("Missing break in switch")
Addresses-Coverity-ID: 114786 ("Missing break in switch")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gustavo A. R. Silva [Tue, 7 Aug 2018 23:15:35 +0000 (18:15 -0500)]
net: tulip_core: mark expected switch fall-through

In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.

Addresses-Coverity-ID: 114782 ("Missing break in switch")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gustavo A. R. Silva [Tue, 7 Aug 2018 23:15:32 +0000 (18:15 -0500)]
net: thunderx: mark expected switch fall-through

In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.

Addresses-Coverity-ID: 114781 ("Missing break in switch")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gustavo A. R. Silva [Tue, 7 Aug 2018 23:14:18 +0000 (18:14 -0500)]
cxgb3/l2t: Mark expected switch fall-through

In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.

Addresses-Coverity-ID: 114780 ("Missing break in switch")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gustavo A. R. Silva [Tue, 7 Aug 2018 23:13:44 +0000 (18:13 -0500)]
cxgb4/t4_hw: mark expected switch fall-throughs

In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.

Addresses-Coverity-ID: 114777 ("Missing break in switch")
Addresses-Coverity-ID: 114778 ("Missing break in switch")
Addresses-Coverity-ID: 114779 ("Missing break in switch")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gustavo A. R. Silva [Tue, 7 Aug 2018 23:13:12 +0000 (18:13 -0500)]
cxgb4/l2t: Mark expected switch fall-through

In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.

Addresses-Coverity-ID: 114910 ("Missing break in switch")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gustavo A. R. Silva [Tue, 7 Aug 2018 23:13:05 +0000 (18:13 -0500)]
liquidio: mark expected switch fall-through

In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.

Addresses-Coverity-ID: 143135 ("Missing break in switch")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gustavo A. R. Silva [Tue, 7 Aug 2018 23:11:26 +0000 (18:11 -0500)]
net: macb: Mark expected switch fall-through

In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gustavo A. R. Silva [Tue, 7 Aug 2018 23:11:14 +0000 (18:11 -0500)]
bnx2x: Mark expected switch fall-throughs

In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.

Addresses-Coverity-ID: 114878 ("Missing break in switch")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gustavo A. R. Silva [Tue, 7 Aug 2018 23:09:19 +0000 (18:09 -0500)]
alteon: acenic: mark expected switch fall-through

In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.

Addresses-Coverity-ID: 114891 ("Missing break in switch")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gustavo A. R. Silva [Tue, 7 Aug 2018 23:09:09 +0000 (18:09 -0500)]
8390: axnet_cs: Mark expected switch fall-through

In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.

Addresses-Coverity-ID: 114889 ("Missing break in switch")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nir Dotan [Tue, 7 Aug 2018 16:41:55 +0000 (19:41 +0300)]
selftests: forwarding: gre_multipath: Update next-hop statistics match criteria

The gre_multipath test was using egress vlan_id matching on flows to
collect next-hop statistics, which are later compared against the
configured weights.
As matching on vlan_id in the egress direction is not supported on all HW
devices, change the match criteria to use the destination IP.

Signed-off-by: Nir Dotan <nird@mellanox.com>
Acked-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Keara Leibovitz [Tue, 7 Aug 2018 19:18:43 +0000 (15:18 -0400)]
tc-tests: initial version of nat action unit tests

Initial set of nat action unit tests.

Signed-off-by: Keara Leibovitz <kleib@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 7 Aug 2018 22:48:38 +0000 (15:48 -0700)]
Merge branch 'brcm-omega'

Arun Parameswaran says:

====================
Add Broadcom Omega SoC internal switch and phy

The patchset is based on David Miller's "net-next" repo.

The patches add support for the Broadcom Omega SoC's internal ethernet
switch and the internal gphy.

The internal ethernet switch in the Omega is a b53 srab based switch.
The support for the switch is added to the b53 driver in the dsa
framework.

The gphy support is added to the bcm7xxx driver.
====================

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arun Parameswaran [Tue, 7 Aug 2018 17:02:44 +0000 (10:02 -0700)]
net: phy: Add support for Broadcom Omega internal Combo GPHY

Add support for the Broadcom Omega SoC internal Combo Ethernet
GPHY to the bcm7xxx phy driver.

Signed-off-by: Arun Parameswaran <arun.parameswaran@broadcom.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arun Parameswaran [Tue, 7 Aug 2018 17:02:43 +0000 (10:02 -0700)]
net: dsa: b53: Add support for Broadcom Omega SoC internal switch

Add support for the Broadcom Omega SoC internal ethernet switch
to the b53 srab driver in the DSA framework.

Signed-off-by: Arun Parameswaran <arun.parameswaran@broadcom.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arun Parameswaran [Tue, 7 Aug 2018 17:02:42 +0000 (10:02 -0700)]
dt-bindings: net: dsa: Add compatibility strings for Broadcom Omega

Add compatibility strings for the internal switch in the Broadcom
Omega SoC family (BCM5831X/BCM1140X) to B53.

Signed-off-by: Arun Parameswaran <arun.parameswaran@broadcom.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 7 Aug 2018 22:43:12 +0000 (15:43 -0700)]
Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue

Jeff Kirsher says:

====================
40GbE Intel Wired LAN Driver Updates 2018-08-07

This series contains updates to i40e and i40evf only.

Sergey cleans up a duplicate call to i40e_prep_for_reset() during
shutdown.

YueHaibing cleans up i40evf by removing code that was never being used
or called within the driver.

Jake updates the ethtool statistics to use a helper function since many
of the statistics use the same basic logic for copying strings into the
supplied buffer.  Cleaned up the use of a local variable that is no
longer needed or used.  Fixed additional stats issues, including the
failure to update the data pointer which was causing stats to be
reported incorrectly.

Mariusz fixes a bug where there was an oversight in configuring FEC when
link settings were forced which was causing 25G link to be configured
incorrectly.

Piotr adds a missing return code for when the firmware returns a busy
state.  Also added a command that tells firmware to start
rearrangement when switching from the old NVM structure to the new flat
NVM.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 7 Aug 2018 20:22:11 +0000 (13:22 -0700)]
Merge branch 'qed-Add-Multi-TC-RoCE-support'

Denis Bolotin says:

====================
qed: Add Multi-TC RoCE support

This patch series adds support for multiple concurrent traffic classes for RoCE.
The first three patches enable the required parts of the driver to learn the TC
configuration, and the last one makes use of it to enable the feature.
Please consider applying this to net-next.

V1->V2:
-------
Avoid allocation in qed_dcbx_get_priority_tc().
Move qed_dcbx_get_priority_tc() out of CONFIG_DCB section since it doesn't call
qed_dcbx_query_params() anymore.

V2->V3:
-------
patch 1/3:
qed_dcbx_get_priority_tc() always returns a valid TC by value. In error cases,
it returns QED_DCBX_DEFAULT_TC (currently defined 0).
patch 3/3:
Cosmetic changes in qed_dev.c.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
Denis Bolotin [Tue, 7 Aug 2018 12:48:10 +0000 (15:48 +0300)]
qed: Add Multi-TC RoCE support

RoCE qps use a pair of physical queues (pq) received from the Queue Manager
(QM) - an offload queue (OFLD) and a low latency queue (LLT). The QM block
creates a pq for each TC, and allows RoCE qps to ask for a pq with a
specific TC. As a result, qps with different VLAN priorities can be mapped
to different TCs, and employ features such as PFC and ETS.

Signed-off-by: Michal Kalderon <michal.kalderon@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: Denis Bolotin <denis.bolotin@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Denis Bolotin [Tue, 7 Aug 2018 12:48:09 +0000 (15:48 +0300)]
qed: Add a flag which indicates if offload TC is set

Distinguish not set offload_tc from offload_tc 0 and add getters and
setters.

Signed-off-by: Michal Kalderon <michal.kalderon@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: Denis Bolotin <denis.bolotin@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
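
The idea of telling "not set" apart from a legitimate value of 0 can be sketched as follows. This is only an illustration under assumed names: the struct and the accessor functions are hypothetical and are not the qed driver's actual definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container; the real qed structures are not shown in this log. */
struct offload_cfg {
	uint8_t offload_tc;      /* traffic class used for offload queues */
	bool    offload_tc_set;  /* true once offload_tc was explicitly assigned */
};

/* Setter: records the value and marks it as explicitly configured. */
void cfg_set_offload_tc(struct offload_cfg *cfg, uint8_t tc)
{
	cfg->offload_tc = tc;
	cfg->offload_tc_set = true;
}

/* Getter for the stored traffic class (meaningful only when set). */
uint8_t cfg_get_offload_tc(const struct offload_cfg *cfg)
{
	return cfg->offload_tc;
}

/* Lets callers distinguish "TC 0 was chosen" from "nothing was chosen". */
bool cfg_is_offload_tc_set(const struct offload_cfg *cfg)
{
	return cfg->offload_tc_set;
}
```

A zero-initialized struct reports "not set"; after cfg_set_offload_tc(&cfg, 0) the value is still 0 but the flag says it was chosen on purpose.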
Denis Bolotin [Tue, 7 Aug 2018 12:48:08 +0000 (15:48 +0300)]
qed: Add DCBX API - qed_dcbx_get_priority_tc()

The API receives a priority and looks for the TC it is mapped to in the
operational DCBX configuration. The API returns QED_DCBX_DEFAULT_TC (0)
when DCBX is disabled.

Signed-off-by: Michal Kalderon <michal.kalderon@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: Denis Bolotin <denis.bolotin@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
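
A rough sketch of the lookup described above, returning a TC by value and falling back to the default. The struct layout, the eight-priority table, and the treatment of an out-of-range priority as an error case are assumptions made for illustration; only the default value of 0 comes from the commit text.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEFAULT_TC      0  /* mirrors QED_DCBX_DEFAULT_TC (0) from the text */
#define MAX_PRIORITIES  8  /* assumption: one entry per 802.1p priority */

/* Hypothetical stand-in for the operational DCBX configuration. */
struct dcbx_oper {
	bool    enabled;
	uint8_t pri_to_tc[MAX_PRIORITIES];
};

/* Always returns a usable TC; errors and "DCBX disabled" map to the default. */
uint8_t get_priority_tc(const struct dcbx_oper *cfg, uint8_t pri)
{
	if (!cfg->enabled || pri >= MAX_PRIORITIES)
		return DEFAULT_TC;
	return cfg->pri_to_tc[pri];
}
```

Returning the TC by value means callers never have to check an error code before using the result.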
YueHaibing [Tue, 7 Aug 2018 11:34:16 +0000 (19:34 +0800)]
RDS: IB: fix 'passing zero to ERR_PTR()' warning

Fix a static code checker warning:
 net/rds/ib_frmr.c:82 rds_ib_alloc_frmr() warn: passing zero to 'ERR_PTR'

The error path for ib_alloc_mr failure should set err to PTR_ERR.

Fixes: 1659185fb4d0 ("RDS: IB: Support Fastreg MR (FRMR) memory registration mode")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
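
The class of bug the checker reports can be reproduced in user space. The sketch below re-implements simplified versions of ERR_PTR(), PTR_ERR() and IS_ERR() and uses a made-up allocator, fake_alloc_mr(); it is not the code from net/rds/ib_frmr.c, only the general pattern under those assumptions.

```c
#include <errno.h>
#include <stddef.h>

#define MAX_ERRNO 4095

/* Simplified user-space versions of the kernel's error-pointer helpers. */
static void *ERR_PTR(long error) { return (void *)error; }
static long PTR_ERR(const void *ptr) { return (long)ptr; }
static int IS_ERR(const void *ptr)
{
	return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
}

/* Hypothetical allocator that reports failure as an encoded pointer. */
static void *fake_alloc_mr(int fail)
{
	static int mr;
	return fail ? ERR_PTR(-ENOMEM) : (void *)&mr;
}

/* Buggy pattern: err is still 0 on failure, so ERR_PTR(0), i.e. NULL, leaks out. */
void *alloc_buggy(int fail)
{
	long err = 0;
	void *mr = fake_alloc_mr(fail);

	if (IS_ERR(mr))
		return ERR_PTR(err);
	return mr;
}

/* Fixed pattern: take the error code from the failed pointer first. */
void *alloc_fixed(int fail)
{
	long err = 0;
	void *mr = fake_alloc_mr(fail);

	if (IS_ERR(mr)) {
		err = PTR_ERR(mr);
		return ERR_PTR(err);
	}
	return mr;
}
```

A caller that tests the result with IS_ERR() would treat the NULL returned by alloc_buggy(1) as success, which is why passing zero to ERR_PTR() is worth a warning.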
David S. Miller [Tue, 7 Aug 2018 20:18:50 +0000 (13:18 -0700)]
Merge branch 'macb-add-pad-and-fcs-support'

Claudiu Beznea says:

====================
net: macb: add pad and fcs support

In [1] it was reported that the UDP checksum is offloaded to hardware no
matter whether it was previously computed in software or not. The proposal
in [1] was to disable TX checksum offload.

This series (mostly patch 3/3) addresses the issue described in [1] by
setting the NOCRC bit in the TX buffer descriptor for SKBs that arrive from
the networking stack with the checksum already computed. For these packets,
padding and FCS need to be added in software (hardware doesn't compute them
if the NOCRC bit is set). The minimum packet size that hardware expects is
64 bytes (including FCS). This feature cannot be used with GSO, so it is
used only for non-GSO SKBs.

For SKBs which require padding and FCS computation, macb_pad_and_fcs()
checks whether there is enough headroom and tailroom in the SKB to avoid
copying the SKB structure. Since macb_pad_and_fcs() may change the SKB, it
is placed in macb_start_xmit() between the macb_csum_clear() and
skb_headlen() calls.

This series was tested with the in-kernel pktgen tool, using a script like
this (pktgen_sample01_simple.sh is at [2]):

minSize=1
maxSize=1500

for i in `seq $minSize $maxSize` ; do
    copy="$(shuf -i 1-2000 -n 1)"
    ./pktgen_sample01_simple.sh -i eth0 \
        -m <dst-mac-addr> -d <dst-ip-addr> -x -s $i -c $copy
done

minStep=1
maxStep=200
for i in `seq $minStep $maxStep` ; do
    copy="$(shuf -i 1-2000 -n 1)"
    size="$(shuf -i 1-1500 -n 1)"
    ./pktgen_sample01_simple.sh -i eth0 \
        -m <dst-mac-addr> -d <dst-ip-addr> -x -s $size -c $copy
done

Changes since RFC:
- in patch 3/3 order local variables by their length (reverse christmas tree
  format)

[1] https://www.spinics.net/lists/netdev/msg505065.html
[2] https://github.com/netoptimizer/network-testing/blob/master/pktgen/pktgen_sample01_simple.sh
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
Claudiu Beznea [Tue, 7 Aug 2018 09:25:14 +0000 (12:25 +0300)]
net: macb: add support for padding and fcs computation

For packets with an already computed IP/TCP/UDP checksum there is no need
to tell hardware to recompute it. For such packets, hardware expects the
packet to be at least 64 bytes long and the FCS to be already computed.

Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
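
What "padding and FCS computation" amounts to can be sketched in user space. The code below works on a plain byte buffer rather than an SKB, and the names eth_crc32() and pad_and_fcs() are hypothetical; it is not the driver's macb_pad_and_fcs(). It assumes the standard Ethernet CRC-32 and an FCS stored least significant byte first.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_MIN_FRAME 64  /* minimum frame size including FCS, per the text */
#define ETH_FCS_LEN    4

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320), the Ethernet FCS CRC. */
uint32_t eth_crc32(const uint8_t *p, size_t len)
{
	uint32_t crc = 0xFFFFFFFFu;

	while (len--) {
		crc ^= *p++;
		for (int i = 0; i < 8; i++)
			crc = (crc >> 1) ^ (0xEDB88320u & -(crc & 1u));
	}
	return ~crc;
}

/*
 * Zero-pads a frame of len bytes up to 60 bytes if it is shorter, then
 * appends the 4-byte FCS. cap is the buffer capacity (the "tailroom" check).
 * Returns the new length, or 0 if the buffer is too small.
 */
size_t pad_and_fcs(uint8_t *frame, size_t len, size_t cap)
{
	size_t min_payload = ETH_MIN_FRAME - ETH_FCS_LEN;
	size_t padded = len < min_payload ? min_payload : len;

	if (padded + ETH_FCS_LEN > cap)
		return 0;

	memset(frame + len, 0, padded - len);

	uint32_t fcs = eth_crc32(frame, padded);
	for (int i = 0; i < ETH_FCS_LEN; i++)
		frame[padded + i] = (uint8_t)(fcs >> (8 * i));

	return padded + ETH_FCS_LEN;
}
```

A 3-byte payload in a 64-byte buffer therefore grows to exactly 64 bytes: 57 bytes of zero padding plus the 4-byte FCS.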
Claudiu Beznea [Tue, 7 Aug 2018 09:25:13 +0000 (12:25 +0300)]
net: macb: move checksum clearing outside of spinlock

Move checksum clearing outside of spinlock. The SKB is protected by
networking lock (HARD_TX_LOCK()).

Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Claudiu Beznea [Tue, 7 Aug 2018 09:25:12 +0000 (12:25 +0300)]
net: macb: use netdev_tx_t return type for ndo_start_xmit functions

Use netdev_tx_t return type for ndo_start_xmit function of macb driver.

Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 7 Aug 2018 19:46:28 +0000 (12:46 -0700)]
Merge branch 'ibmvnic-next'

Thomas Falcon says:

====================
ibmvnic: Update firmware error reporting

This patch set cleans out a lot of dead code from the ibmvnic driver
and adds some more. The error ID field of the descriptor is not filled
in by firmware, so do not print it and do not use it to query for
more detailed information. Remove the unused code written for this.
Finally, update the message to print a string explaining the error
cause instead of just the error code.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Falcon [Tue, 7 Aug 2018 02:39:59 +0000 (21:39 -0500)]
ibmvnic: Update firmware error reporting with cause string

Print a string instead of the error code. Since there is a
possibility that the driver can recover, classify it as a
warning instead of an error.

Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
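
A small sketch of the "string instead of code" approach follows. The cause values and their wording are invented for illustration and do not reflect the ibmvnic firmware interface; only the general technique of mapping a numeric cause to text with a safe default is shown.

```c
/* Hypothetical cause codes; the real firmware values are not listed here. */
enum fw_err_cause {
	FW_CAUSE_ADAPTER  = 0,
	FW_CAUSE_BUS      = 1,
	FW_CAUSE_FIRMWARE = 2,
};

/* Maps a cause code to a human-readable string; unknown codes get a default. */
const char *fw_err_cause_str(unsigned int cause)
{
	switch (cause) {
	case FW_CAUSE_ADAPTER:
		return "adapter problem";
	case FW_CAUSE_BUS:
		return "bus problem";
	case FW_CAUSE_FIRMWARE:
		return "firmware problem";
	default:
		return "unknown";
	}
}
```

A log line built with this helper stays readable even when firmware reports a code the driver does not know about.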
Thomas Falcon [Tue, 7 Aug 2018 02:39:58 +0000 (21:39 -0500)]
ibmvnic: Remove code to request error information

When backing device firmware reports an error, it provides an
error ID, which is meant to be queried for more detailed error
information. Currently, however, an error ID is not provided by
the Virtual I/O server and there are not any plans to do so. For
now, it is always unfilled or zero, so request_error_information
will never be called.  Remove it.

Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Intiyaz Basha [Mon, 6 Aug 2018 20:09:40 +0000 (13:09 -0700)]
liquidio: avoided acquiring post_lock for data only queues

All control commands (soft commands) go through Queue 0 only
(the control and data queue). So only queue 0 needs post_lock;
the other queues are data-only queues and do not need post_lock.

Added a flag to indicate the queue can be used for soft commands.

If this flag is set, post_lock must be acquired before posting
a command to the queue.
If this flag is clear, post_lock is invalid for the queue.

Signed-off-by: Intiyaz Basha <intiyaz.basha@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
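
The flag-controlled locking described above might look roughly like this in a user-space model. The struct, its field names, and the C11 atomic_flag spinlock standing in for the driver's lock are all assumptions made for the sketch.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical queue model; not the liquidio instruction-queue structure. */
struct cmd_queue {
	bool         allow_soft_cmds; /* true only for the control+data queue */
	atomic_flag  post_lock;       /* simple spinlock, used only if flag set */
	unsigned int posted;          /* number of commands posted so far */
};

/* Posts one command, taking post_lock only on queues that carry soft commands. */
void queue_post(struct cmd_queue *q)
{
	if (q->allow_soft_cmds)
		while (atomic_flag_test_and_set_explicit(&q->post_lock,
							 memory_order_acquire))
			; /* spin until the lock is free */

	q->posted++;

	if (q->allow_soft_cmds)
		atomic_flag_clear_explicit(&q->post_lock, memory_order_release);
}
```

Data-only queues skip both the acquire and the release, which is the saving the commit is after.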
Shmulik Ladkani [Mon, 6 Aug 2018 12:00:59 +0000 (15:00 +0300)]
ip6_tunnel: collect_md xmit: Use ip_tunnel_key's provided src address

When using an ip6tnl device in collect_md mode, the xmit methods ignore
the ipv6.src field present in skb_tunnel_info's key, both for route
calculation purposes (flowi6 construction) and for assigning the
packet's final ipv6h->saddr.

This makes it impossible to specify a desired ipv6 local address in the
encapsulating header (for example, when using tc action tunnel_key).

This is also not aligned with behavior of ipip (ipv4) in collect_md
mode, where the key->u.ipv4.src gets used.

Fix by assigning the given key->u.ipv6.src to fl6.saddr.
In case ipv6.src is not specified, ip6_tnl_xmit uses the existing saddr
selection code.

Fixes: 8d79266bc48c ("ip6_tunnel: add collect_md mode to IPv6 tunnels")
Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
Reviewed-by: Eyal Birger <eyal.birger@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
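
The selection rule can be illustrated in user space with the standard in6_addr type. The function name pick_saddr() is hypothetical, and the fallback argument stands in for whatever the existing source-address selection would produce; treating the all-zero address as "not specified" is an assumption of this sketch.

```c
#include <netinet/in.h>

/*
 * Returns the source address to use for the outer header: the address from
 * the tunnel key when one was given, otherwise the fallback.
 */
struct in6_addr pick_saddr(const struct in6_addr *key_src,
			   const struct in6_addr *fallback)
{
	return IN6_IS_ADDR_UNSPECIFIED(key_src) ? *fallback : *key_src;
}
```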
Vlad Buslov [Mon, 6 Aug 2018 08:27:10 +0000 (11:27 +0300)]
net: sched: cls_flower: set correct offload data in fl_reoffload

The fl_reoffload implementation sets the following members of struct
tc_cls_flower_offload incorrectly:
 - masked key instead of mask
 - key instead of masked key

Fix fl_reoffload to provide correct data to offload callback.

Fixes: 31533cba4327 ("net: sched: cls_flower: implement offload tcf_proto_op")
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
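
To make the mix-up concrete, here is a reduced model of the three pieces of data involved: the key as supplied, the mask, and the masked key (key AND mask). All types and names are hypothetical and much smaller than the real flower structures; the sketch only shows which pointer belongs in which slot of the offload request.

```c
#include <stdint.h>

/* Tiny stand-in for a flow key. */
struct flow_key {
	uint32_t dst_ip;
	uint16_t dst_port;
};

/* Hypothetical filter state: user key, mask, and precomputed masked key. */
struct filter {
	struct flow_key key;
	struct flow_key mask;
	struct flow_key mkey;
};

/* Hypothetical offload request: the driver expects the mask and the masked key. */
struct offload_req {
	const struct flow_key *mask;
	const struct flow_key *key;
};

/* Computes the masked key field by field. */
void filter_compute_mkey(struct filter *f)
{
	f->mkey.dst_ip = f->key.dst_ip & f->mask.dst_ip;
	f->mkey.dst_port = f->key.dst_port & f->mask.dst_port;
}

/* Correct wiring: mask goes to .mask, masked key goes to .key. */
void fill_offload(const struct filter *f, struct offload_req *req)
{
	req->mask = &f->mask;
	req->key = &f->mkey;
}
```

Passing the masked key where the mask is expected, and the raw key where the masked key is expected, gives the driver a rule that matches different traffic than the software path does.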
David S. Miller [Tue, 7 Aug 2018 19:22:15 +0000 (12:22 -0700)]
Merge branch 'nfp-ttl-tos-geneve'

Simon Horman says:

====================
nfp: flower: tunnel TTL & TOS, and Geneve options set & match support

This series contains updates for the TC Flower classifier
and the offload facility for it in the NFP driver.

* Patches 1 & 2: update the NFP driver to allow offload
  of matching and setting tunnel ToS/TTL of flows using the TC Flower
  classifier and tun_key action

* Patches 3 & 4: enhance the flow dissector and TC Flower classifier
  to allow match on Geneve options

* Patch 5 & 6: update the NFP driver to allow offload of
  matching and setting Geneve options of flows using the TC Flower
  classifier and tun_key action
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
Pieter Jansen van Vuuren [Tue, 7 Aug 2018 15:36:03 +0000 (17:36 +0200)]
nfp: flower: add geneve option match offload

Introduce a new layer for matching on geneve options. This allows
offloading filters configured to match geneve with options.

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pieter Jansen van Vuuren [Tue, 7 Aug 2018 15:36:02 +0000 (17:36 +0200)]
nfp: flower: add geneve option push action offload

Introduce a new push geneve option action. This allows offloading
filters configured to entunnel geneve with options.

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pieter Jansen van Vuuren [Tue, 7 Aug 2018 15:36:01 +0000 (17:36 +0200)]
net/sched: allow flower to match tunnel options

Allow matching on options in Geneve tunnel headers.
This makes use of existing tunnel metadata support.

The options can be described in the form
CLASS:TYPE:DATA/CLASS_MASK:TYPE_MASK:DATA_MASK, where CLASS is
represented as a 16bit hexadecimal value, TYPE as an 8bit
hexadecimal value and DATA as a variable length hexadecimal value.

e.g.
 # ip link add name geneve0 type geneve dstport 0 external
 # tc qdisc add dev geneve0 ingress
 # tc filter add dev geneve0 protocol ip parent ffff: \
     flower \
       enc_src_ip 10.0.99.192 \
       enc_dst_ip 10.0.99.193 \
       enc_key_id 11 \
       geneve_opts 0102:80:1122334421314151/ffff:ff:ffffffffffffffff \
       ip_proto udp \
       action mirred egress redirect dev eth1

This patch adds support for matching Geneve options in the order
supplied by the user. This leads to an efficient implementation in
the software datapath (and in our opinion hardware datapaths that
offload this feature). It is also compatible with Geneve options
matching provided by the Open vSwitch kernel datapath which is
relevant here as the Flower classifier may be used as a mechanism
to program flows into hardware as a form of Open vSwitch datapath
offload (sometimes referred to as OVS-TC). The netlink
Kernel/Userspace API may be extended, for example by adding a flag,
if other matching options are desired, for example matching given
options in any order. This would require an implementation in the
TC software datapath, and should be done in a way that lets drivers
that facilitate offload of the Flower classifier reject or accept
such flows based on hardware datapath capabilities.

This approach was discussed and agreed on at Netconf 2017 in Seoul.

Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
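
The CLASS:TYPE:DATA form can be parsed with a few lines of C. The sketch below is not the tc or kernel parser; the struct, the function names, the 128-byte data limit, and the requirement of exactly four hex digits for CLASS and two for TYPE (as in the example above) are assumptions. It handles only the value part before the "/" and does not check Geneve's length rules for DATA.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define OPT_DATA_MAX 128  /* arbitrary limit chosen for this sketch */

struct geneve_opt {
	uint16_t opt_class;
	uint8_t  type;
	uint8_t  data[OPT_DATA_MAX];
	size_t   data_len;
};

/* Returns the value of one hex digit, or -1 if c is not a hex digit. */
static int hex_val(char c)
{
	if (c >= '0' && c <= '9') return c - '0';
	if (c >= 'a' && c <= 'f') return c - 'a' + 10;
	if (c >= 'A' && c <= 'F') return c - 'A' + 10;
	return -1;
}

/* Parses exactly `digits` hex characters at s; returns 0 on success. */
static int parse_hex(const char *s, size_t digits, unsigned int *out)
{
	unsigned int v = 0;

	for (size_t i = 0; i < digits; i++) {
		int h = hex_val(s[i]);
		if (h < 0)
			return -1;  /* also stops at the terminating NUL */
		v = (v << 4) | (unsigned int)h;
	}
	*out = v;
	return 0;
}

/* Parses "CCCC:TT:DATA" into opt; returns 0 on success, -1 on bad input. */
int parse_geneve_opt(const char *s, struct geneve_opt *opt)
{
	unsigned int v;

	if (parse_hex(s, 4, &v) || s[4] != ':')
		return -1;
	opt->opt_class = (uint16_t)v;

	if (parse_hex(s + 5, 2, &v) || s[7] != ':')
		return -1;
	opt->type = (uint8_t)v;

	s += 8;
	size_t n = strlen(s);
	if (n == 0 || n % 2 || n / 2 > OPT_DATA_MAX)
		return -1;

	for (size_t i = 0; i < n / 2; i++) {
		if (parse_hex(s + 2 * i, 2, &v))
			return -1;
		opt->data[i] = (uint8_t)v;
	}
	opt->data_len = n / 2;
	return 0;
}
```

For the value used in the example, "0102:80:1122334421314151", this yields class 0x0102, type 0x80 and eight data bytes.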
Simon Horman [Tue, 7 Aug 2018 15:36:00 +0000 (17:36 +0200)]
flow_dissector: allow dissection of tunnel options from metadata

Allow the existing 'dissection' of tunnel metadata to 'dissect'
options already present in tunnel metadata. This dissection is
controlled by a new dissector key, FLOW_DISSECTOR_KEY_ENC_OPTS.

This dissection only occurs when skb_flow_dissect_tunnel_info()
is called; currently only the Flower classifier makes that call,
so there should be no impact on other users of the flow dissector.

This is in preparation for allowing the flower classifier to
match on Geneve options.

Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
John Hurley [Tue, 7 Aug 2018 15:35:59 +0000 (17:35 +0200)]
nfp: flower: allow matching on ipv4 UDP tunnel tos and ttl

The addition of FLOW_DISSECTOR_KEY_ENC_IP to TC flower means that the ToS
and TTL of the tunnel header can now be matched on.

Extend the NFP tunnel match function to include these new fields.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
John Hurley [Tue, 7 Aug 2018 15:35:58 +0000 (17:35 +0200)]
nfp: flower: set ip tunnel ttl from encap action

The TTL for encapsulating headers in IPv4 UDP tunnels is taken from a
route lookup. Modify this to first check if a user has specified a TTL to
be used in the TC action.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoi40e: fix i40e_add_queue_stats data pointer update
Jacob Keller [Tue, 31 Jul 2018 10:41:48 +0000 (03:41 -0700)]
i40e: fix i40e_add_queue_stats data pointer update

This function accidentally failed to update the data pointer, which
caused the reported stats to be incorrect. Additionally, statistics
which follow queue stats in the output would potentially read non-zeroed
garbage data from the ethtool buffer.

This occurred because the data double pointer was not dereferenced
before incrementing the size.
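
The difference between advancing the caller's cursor and advancing a local copy can be sketched as below. This is a minimal illustration, not the driver code: add_stats_fixed(), add_stats_buggy() and their parameters are hypothetical names, and only the double-pointer handling mirrors the description above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stats writer: copies 'count' values into the buffer that
 * *data points at, then advances the caller's cursor past them. */
static void add_stats_fixed(uint64_t **data, const uint64_t *src, size_t count)
{
	uint64_t *p = *data;
	size_t i;

	for (i = 0; i < count; i++)
		p[i] = src[i];

	/* Dereference first, so the caller's pointer is what moves. */
	*data += count;
}

/* Buggy variant: only the local copy of the double pointer is
 * incremented, so the caller's cursor never moves and the next writer
 * overwrites the same slots. */
static void add_stats_buggy(uint64_t **data, const uint64_t *src, size_t count)
{
	uint64_t *p = *data;
	size_t i;

	for (i = 0; i < count; i++)
		p[i] = src[i];

	data += count;	/* no effect outside this function */
}
```

With the buggy variant, every later block of statistics lands on top of the queue stats, and the slots that should have held them are left with whatever the buffer contained before.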

Additionally, make sure this issue is more visible by adding a WARN_ONCE
to the i40e_get_ethtool_stats function. This warning will trigger
whenever the data pointer is not at the expected address, similar to the
check that we make in the i40e_get_stat_strings() function.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years agoi40e: Add AQ command for rearrange NVM structure
Piotr Azarewicz [Tue, 31 Jul 2018 10:41:47 +0000 (03:41 -0700)]
i40e: Add AQ command for rearrange NVM structure

When switching from the old NVM structure (called structured NVM) to
the new one (called flat NVM), or back, the flash needs to be
rearranged into the required NVM structure. This is part of the
transition from one NVM structure to another. Add a function that
commands the firmware to start the rearrangement process.

Signed-off-by: Piotr Azarewicz <piotr.azarewicz@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years agoi40e: Add additional return code to i40e_asq_send_command
Piotr Azarewicz [Tue, 31 Jul 2018 10:41:46 +0000 (03:41 -0700)]
i40e: Add additional return code to i40e_asq_send_command

Firmware can return a busy state, in which case the function returns
I40E_ERR_NOT_READY.

Signed-off-by: Piotr Azarewicz <piotr.azarewicz@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years agoi40e: fix warning about shadowed ring parameter
Jacob Keller [Tue, 31 Jul 2018 10:41:45 +0000 (03:41 -0700)]
i40e: fix warning about shadowed ring parameter

In commit 147e81ec7568 ("i40e: Test memory before ethtool alloc succeeds")
code was added to handle ring allocation on systems with low memory.

It shadowed the ring parameter pointer by introducing a local ring
pointer inside the for loop. Most of the code in the loop already just
accessed the ring via &rx_rings[i]. Since most of the code already does
this, just remove the local variable.
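
A minimal sketch of the shape of this fix, assuming hypothetical names (demo_ringparam, demo_ring, demo_resize) that only mimic the situation described above: a parameter called 'ring' and a per-iteration local that used to reuse the same name.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the ethtool ring parameters and a ring. */
struct demo_ringparam {
	unsigned int rx_pending;
};

struct demo_ring {
	unsigned int count;
};

static void demo_resize(struct demo_ring *rx_rings, size_t n,
			const struct demo_ringparam *ring)
{
	size_t i;

	for (i = 0; i < n; i++) {
		/* Declaring "struct demo_ring *ring = &rx_rings[i];" here
		 * would shadow the 'ring' parameter and trigger -Wshadow.
		 * Indexing the array directly needs no extra local. */
		rx_rings[i].count = ring->rx_pending;
	}
}
```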

If someone considers it worth keeping a local around, they should use it
for the whole section instead of just a couple of accesses.

This fixes a warning when -Wshadow is enabled.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years agoi40e: remove unnecessary i variable causing -Wshadow warning
Jacob Keller [Tue, 31 Jul 2018 10:41:44 +0000 (03:41 -0700)]
i40e: remove unnecessary i variable causing -Wshadow warning

Commit c61c8fe1d592 ("i40e: Implement an ethtool private flag to stop
LLDP in FW") added an extra for-loop which added a shadowing 'i'
variable as the index.

However, the local variable i already exists, and we already use it as
a loop index. Additionally, at this point, there is no further use of
the variable, so it's safe to simply overwrite the variable contents.

This fixes a -Wshadow warning which has started being enabled on some
distributions.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Patryk Malek <patryk.malek@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years agoMerge branch 'WoL-filters'
David S. Miller [Tue, 7 Aug 2018 19:15:03 +0000 (12:15 -0700)]
Merge branch 'WoL-filters'

Florian Fainelli says:

====================
net: Support Wake-on-LAN using filters

This is technically a v2, but this patch series builds on your feedback
and defines the following:

- a new WAKE_* bit: WAKE_FILTER which can be enabled alongside other types
  of Wake-on-LAN to support waking up on a programmed filter (match + action)
- a new RX_CLS_FLOW_WAKE flow action which can be specified by a user when
  inserting a flow using ethtool::rxnfc, similar to the existing RX_CLS_FLOW_DISC

The bcm_sf2 and bcmsysport drivers are updated accordingly to work in concert to
allow matching packets at the switch level, identified by their filter location
to be used as a match by the SYSTEM PORT (CPU/management controller) during
Wake-on-LAN.

Let me know if this looks better than the previous incarnation of the patch
series.

Attached is also the ethtool patch that I would be submitting once the uapi
changes are committed.

Thank you!

Changes in v2:

- bail out earlier in bcm_sf2_cfp's get_rxnfc if an error is
  encountered (Andrew)
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: systemport: Add support for WAKE_FILTER
Florian Fainelli [Tue, 7 Aug 2018 17:50:23 +0000 (10:50 -0700)]
net: systemport: Add support for WAKE_FILTER

The SYSTEMPORT MAC allows up to 8 filters to be programmed to wake-up
from LAN. Verify that we have up to 8 filters and program them to the
appropriate RXCHK entries to be matched (along with their masks).

We need to update the entry and exit to Wake-on-LAN mode to keep the
RXCHK engine running to match during suspend, but this is otherwise
fairly similar to Magic Packet detection.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: dsa: bcm_sf2: Propagate ethtool::rxnfc to CPU port
Florian Fainelli [Tue, 7 Aug 2018 17:50:22 +0000 (10:50 -0700)]
net: dsa: bcm_sf2: Propagate ethtool::rxnfc to CPU port

Allow propagating ethtool::rxnfc programming to the CPU/management port
such that it is possible for such a CPU to perform e.g. Wake-on-LAN
using filters configured by the switch. We need a tiny bit of
cooperation between the switch driver, which is able to do the full flow
matching, and the CPU/management port driver, which might not. The
CPU/management driver needs to return -EOPNOTSUPP to indicate a
non-critical error, and any other error code otherwise.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoethtool: Add WAKE_FILTER and RX_CLS_FLOW_WAKE
Florian Fainelli [Tue, 7 Aug 2018 17:50:20 +0000 (10:50 -0700)]
ethtool: Add WAKE_FILTER and RX_CLS_FLOW_WAKE

Add the ability to specify through ethtool::rxnfc that a rule location is
special and will be used to participate in Wake-on-LAN, e.g. by having a
specific pattern be matched. When this is the case, fs->ring_cookie must
be set to the special value RX_CLS_FLOW_WAKE.

We also define an additional ethtool::wolinfo flag: WAKE_FILTER which
can be used to configure an Ethernet adapter to allow Wake-on-LAN using
previously programmed filters.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
David S. Miller [Tue, 7 Aug 2018 18:02:05 +0000 (11:02 -0700)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Daniel Borkmann says:

====================
pull-request: bpf-next 2018-08-07

The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) Add cgroup local storage for BPF programs, which provides a fast
   accessible memory for storing various per-cgroup data like number
   of transmitted packets, etc, from Roman.

2) Support bpf_get_socket_cookie() BPF helper in several more program
   types that have a full socket available, from Andrey.

3) Significantly improve the performance of perf events which are
   reported from BPF offload. Also convert a couple of BPF AF_XDP
   samples over to use libbpf, both from Jakub.

4) seg6local LWT provides the End.DT6 action, which allows to
   decapsulate an outer IPv6 header containing a Segment Routing Header.
   Adds this action now to the seg6local BPF interface, from Mathieu.

5) Do not mark dst register as unbounded in MOV64 instruction when
   both src and dst register are the same, from Arthur.

6) Define u_smp_rmb() and u_smp_wmb() to their respective barrier
   instructions on arm64 for the AF_XDP sample code, from Brian.

7) Convert the tcp_client.py and tcp_server.py BPF selftest scripts
   over from Python 2 to Python 3, from Jeremy.

8) Enable BTF build flags to the BPF sample code Makefile, from Taeung.

9) Remove an unnecessary rcu_read_lock() in run_lwt_bpf(), from Taehee.

10) Several improvements to the README.rst from the BPF documentation
    to make it more consistent with RST format, from Tobin.

11) Replace all occurrences of strerror() by calls to strerror_r()
    in libbpf and fix a FORTIFY_SOURCE build error along with it,
    from Thomas.

12) Fix a bug in bpftool's get_btf() function to correctly propagate
    an error via PTR_ERR(), from Yue.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoieee802154: hwsim: fix rcu address annotation
Alexander Aring [Tue, 7 Aug 2018 14:34:44 +0000 (10:34 -0400)]
ieee802154: hwsim: fix rcu address annotation

This patch fixes the following sparse warnings about a mismatched rcu
attribute for the address space annotation:

...
error: incompatible types in comparison expression (different modifiers)
error: incompatible types in comparison expression (different address spaces)
...

Some __rcu annotations were placed on non-pointer list head structures,
and one was missing on the edge information which is used by
rcu_assign_pointer() to update edge setting information.

Cc: Stefan Schmidt <stefan@datenfreihafen.org>
Fixes: f25da51fdc38 ("ieee802154: hwsim: add replacement for fakelb")
Signed-off-by: Alexander Aring <aring@mojatatu.com>
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoi40e: convert priority flow control stats to use helpers
Jacob Keller [Tue, 31 Jul 2018 10:41:42 +0000 (03:41 -0700)]
i40e: convert priority flow control stats to use helpers

The priority flow control statistics are laid out in the stats structure
using arrays. This made it unwieldy to use as part of an i40e_stats
array.

Add a new structure type, i40e_pfc_stats, and a helper function
i40e_get_pfc_stats which can return the stats for a given priority
value as an i40e_pfc_stats structure.

Use this to create an i40e_stats array, which we'll use to format and
copy the strings and stats into the supplied buffers.

This removes even more boilerplate code from i40e_get_ethtool_stats and
i40e_get_stat_strings.

An alternative would be to modify the structure definition for the pfc
stats, but this is more invasive to the rest of the code base.

Note that a macro was used to setup the copy of stats from the
pf->stats, as this reduces the chance of typos in the code names. It
will produce a checkpatch.pl warning due to re-use of a macro argument.
In this case, it should be safe, as the macro will fail to compile in
cases where the argument is not a simple structure member name, and thus
arguments with side effects should not be an issue.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years agoi40e: convert VEB TC stats to use an i40e_stats array
Jacob Keller [Tue, 31 Jul 2018 10:41:41 +0000 (03:41 -0700)]
i40e: convert VEB TC stats to use an i40e_stats array

The VEB TC stats are currently implemented with separate parsing,
instead of using the i40e_stats array and associated helper functions.
This is likely because the stats rely on embedding the TC number into
the stat name.

Update i40e_add_stat_strings to take variadic arguments, and use these
to vsnprintf the i40e_stats string as a string containing format
specifiers.
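
Treating the stored stat name as a format string can be sketched as follows. The helper name demo_format_stat_name(), the buffer length, and the example format string are assumptions for illustration and are not taken from the driver; only the use of vsnprintf() with variadic arguments follows the description above.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define DEMO_STRING_LEN 32	/* assumed fixed slot size for one stat name */

/* Expands a stat-name template such as "veb.tc_%u_tx_packets" into a
 * fixed-size slot. The slot is zeroed first so unused bytes are clean. */
static void demo_format_stat_name(char *out, const char *fmt, ...)
{
	va_list args;

	memset(out, 0, DEMO_STRING_LEN);

	va_start(args, fmt);
	vsnprintf(out, DEMO_STRING_LEN, fmt, args);
	va_end(args);
}
```

A caller would loop over the traffic classes and pass the TC number as the variadic argument, so one template entry in the array serves every TC.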

Create a stats array for the VEB TC related stats,
i40e_gstrings_veb_tc_stats, and use this along with the helper functions
to remove the specialized boiler plate code.

Always call i40e_add_ethtool_stats for both this array and the general
VEB stats array. This ensures that we zero out any memory in case it was
not zero-allocated for us.

This ultimately results in less boiler plate code for the
i40e_get_stat_strings and i40e_get_ethtool_stats.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years agoi40e: Set fec_config when forcing link state
Mariusz Stachura [Tue, 31 Jul 2018 10:41:40 +0000 (03:41 -0700)]
i40e: Set fec_config when forcing link state

This patch configures FEC setting in i40e_force_link_state().
For some reason setting this field was overlooked thus causing
25G link to be configured incorrectly.

Signed-off-by: Mariusz Stachura <mariusz.stachura@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years agoi40e: add helper to copy statistic values into ethtool buffer
Jacob Keller [Tue, 31 Jul 2018 10:41:39 +0000 (03:41 -0700)]
i40e: add helper to copy statistic values into ethtool buffer

Similar to the helper function to copy the ethtool stats strings, add
and use a helper function for copying the ethtool stats into the
supplied buffer.

Just like before, we use a macro to avoid having to pass ARRAY_SIZE
manually, so as to reduce chance of bugs.

Some of the stats, especially queue stats, are a bit trickier, and will
be handled in future patches.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years agoi40e: add helper function for copying strings from stat arrays
Jacob Keller [Tue, 31 Jul 2018 10:41:38 +0000 (03:41 -0700)]
i40e: add helper function for copying strings from stat arrays

Many of the ethtool statistics use the same basic logic for copying
strings into the supplied buffer. A set of stats are stored in a const
array of i40e_stats structures, and we apply these all together.

Simplify the stats code by introducing a helper function which can take
a stats array and copy the strings into the buffer, updating the buffer
pointer as we go.

We use a macro to implement i40e_add_stat_strings so that ARRAY_SIZE can
be used on the array passed in. This ensures that we always use the
matching size in __i40e_add_stat_strings.
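
The macro-plus-helper pattern described above can be sketched like this. All names here (demo_stat, demo_add_stat_strings, DEMO_STRING_LEN) are hypothetical stand-ins; the point is only that ARRAY_SIZE is evaluated on the array itself, so the size passed to the helper always matches.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_STRING_LEN 32	/* assumed fixed slot size per stat name */
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

struct demo_stat {
	char name[DEMO_STRING_LEN];
};

/* Copies each name into the buffer and advances the caller's cursor. */
static void demo_add_stat_strings_n(char **p, const struct demo_stat *stats,
				    size_t size)
{
	size_t i;

	for (i = 0; i < size; i++) {
		memcpy(*p, stats[i].name, DEMO_STRING_LEN);
		*p += DEMO_STRING_LEN;
	}
}

/* The macro supplies the element count, so callers cannot pass a size
 * that disagrees with the array. It only works on true arrays, not on
 * pointers. */
#define demo_add_stat_strings(p, stats) \
	demo_add_stat_strings_n(p, stats, ARRAY_SIZE(stats))

static const struct demo_stat demo_stats[] = {
	{ "rx_packets" },
	{ "tx_packets" },
	{ "rx_errors" },
};
```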

More complex stats currently do not use i40e_stats arrays, usually due
to custom formatted strings, or because the stats are not laid out in
the expected way. These stats will be updated to use the helper function
in separate future patches.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years agoi40e/i40evf: remove redundant functions i40evf_aq_{set/get}_phy_register
YueHaibing [Thu, 26 Jul 2018 06:37:36 +0000 (14:37 +0800)]
i40e/i40evf: remove redundant functions i40evf_aq_{set/get}_phy_register

There are no in-tree callers.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years agoi40e: Remove duplicated prepare call in i40e_shutdown
Sergey Nemov [Thu, 19 Jul 2018 11:25:22 +0000 (13:25 +0200)]
i40e: Remove duplicated prepare call in i40e_shutdown

Function call to i40e_prep_for_reset() is duplicated in
i40e_shutdown routine and gets called before
i40e_enable_mc_magic_wake() which blocks it from being executed
correctly on system reboot or shutdown because adminq is already
disabled by first i40e_prep_for_reset() call.

Two register write calls are also duplicated.

Signed-off-by: Sergey Nemov <sergey.nemov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years agobpf: introduce update_effective_progs()
Roman Gushchin [Mon, 6 Aug 2018 21:27:28 +0000 (14:27 -0700)]
bpf: introduce update_effective_progs()

__cgroup_bpf_attach() and __cgroup_bpf_detach() functions have
a good amount of duplicated code, which is possible to eliminate
by introducing the update_effective_progs() helper function.

The update_effective_progs() calls compute_effective_progs()
and then in case of success it calls activate_effective_progs()
for each descendant cgroup. In case of failure (OOM), it releases
allocated prog arrays and returns the error code.

Signed-off-by: Roman Gushchin <guro@fb.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
6 years agoMerge branch 'ieee802154-for-davem-2018-08-06' of git://git.kernel.org/pub/scm/linux...
David S. Miller [Mon, 6 Aug 2018 20:17:48 +0000 (13:17 -0700)]
Merge branch 'ieee802154-for-davem-2018-08-06' of git://git.kernel.org/pub/scm/linux/kernel/git/sschmidt/wpan-next

Stefan Schmidt says:

====================
pull-request: ieee802154-next 2018-08-06

An update from ieee802154 for *net-next*

Romuald added a socket option to get the LQI value of the received datagram.
Alexander added a new hardware simulation driver modelled after hwsim of the
wireless people. It allows runtime configuration for new nodes and edges over a
netlink interface (a config utility is making its way into wpan-tools).
We also have three fixes in here. One from Colin which is more of a cleanup and
two from Alex fixing tailroom and single frame space problems.
I would normally put the last two into my fixes tree, but given we are already
in -rc8 I simply put them here and added a cc: stable to them.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoipv4: frags: precedence bug in ip_expire()
Dan Carpenter [Mon, 6 Aug 2018 19:17:35 +0000 (22:17 +0300)]
ipv4: frags: precedence bug in ip_expire()

We accidentally removed the parentheses here, but they are required
because '!' has higher precedence than '&'.
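
The effect of the missing parentheses can be seen in a small sketch. The flag value and function names are hypothetical; the buggy variant spells out with explicit parentheses how the compiler parses the unparenthesized form.

```c
#include <stdbool.h>

#define DEMO_FLAG_FIRST_IN 0x2u	/* hypothetical flag bit */

/* Intended test: true when the flag bit is NOT set. */
static bool demo_flag_clear(unsigned int flags)
{
	return !(flags & DEMO_FLAG_FIRST_IN);
}

/* How "!flags & DEMO_FLAG_FIRST_IN" is parsed: '!' binds first, giving
 * 0 or 1, and masking that with 0x2 always yields 0. */
static bool demo_flag_clear_buggy(unsigned int flags)
{
	return (!flags) & DEMO_FLAG_FIRST_IN;
}
```

With this particular bit value the buggy test can never be true, so the guarded branch is silently dead.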

Fixes: fa0f527358bd ("ip: use rb trees for IP frag queue.")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoptp_qoriq: use div_u64/div_u64_rem for 64-bit division
Yangbo Lu [Mon, 6 Aug 2018 04:39:11 +0000 (12:39 +0800)]
ptp_qoriq: use div_u64/div_u64_rem for 64-bit division

This is a fix-up patch for below build issue with multi_v7_defconfig.

drivers/ptp/ptp_qoriq.o: In function `qoriq_ptp_probe':
ptp_qoriq.c:(.text+0xd0c): undefined reference to `__aeabi_uldivmod'

Fixes: 91305f281262 ("ptp_qoriq: support automatic configuration for ptp timer")
Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: avoid unnecessary sock_flag() check when enable timestamp
Yafang Shao [Mon, 6 Aug 2018 03:57:02 +0000 (11:57 +0800)]
net: avoid unnecessary sock_flag() check when enable timestamp

The sock_flag() check is already inside sock_enable_timestamp(), so
checking it in the caller is unnecessary.

    void sock_enable_timestamp(struct sock *sk, int flag)
    {
        if (!sock_flag(sk, flag)) {
            ...
        }
    }

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agovhost: switch to use new message format
Jason Wang [Mon, 6 Aug 2018 03:17:47 +0000 (11:17 +0800)]
vhost: switch to use new message format

We used to have a message like:

    struct vhost_msg {
        int type;
        union {
            struct vhost_iotlb_msg iotlb;
            __u8 padding[64];
        };
    };

Unfortunately, there will be a 32-bit hole on 64-bit machines because
of the alignment. This leads to different formats between the 32-bit
API and the 64-bit API. What's more, it will break 32-bit programs
running on 64-bit machines.

So fix this by introducing a new message type with an explicit
32-bit reserved field after type, like:

    struct vhost_msg_v2 {
        __u32 type;
        __u32 reserved;
        union {
            struct vhost_iotlb_msg iotlb;
            __u8 padding[64];
        };
    };

We will have a consistent ABI after switching to this. To enable
this capability, introduce a new ioctl (VHOST_SET_BACKEND_FEATURES) for
userspace to enable this feature (VHOST_BACKEND_F_IOTLB_V2).
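
The layout difference can be checked with offsetof() in a userspace sketch. The structures below are simplified stand-ins with a demo_ prefix, not the uapi definitions, and the union is given a name ('u') only so the sketch does not rely on anonymous unions. What matters is that the payload begins with a 64-bit member.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the IOTLB payload: starts with 64-bit fields. */
struct demo_iotlb_msg {
	uint64_t iova;
	uint64_t size;
	uint64_t uaddr;
	uint8_t perm;
	uint8_t type;
};

/* Old layout: the payload offset depends on the ABI's alignment of
 * 64-bit members (typically 8 on 64-bit, 4 on 32-bit x86). */
struct demo_msg_v1 {
	int type;
	union {
		struct demo_iotlb_msg iotlb;
		uint8_t padding[64];
	} u;
};

/* New layout: the explicit reserved word fills the hole, so the payload
 * sits at offset 8 regardless of the ABI. */
struct demo_msg_v2 {
	uint32_t type;
	uint32_t reserved;
	union {
		struct demo_iotlb_msg iotlb;
		uint8_t padding[64];
	} u;
};

static size_t demo_v1_payload_offset(void)
{
	return offsetof(struct demo_msg_v1, u);
}

static size_t demo_v2_payload_offset(void)
{
	return offsetof(struct demo_msg_v2, u);
}
```

Building the sketch once as a 32-bit and once as a 64-bit program is a quick way to see the first offset change while the second stays put.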

Fixes: 6b1e6cc7855b ("vhost: new device IOTLB API")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet/bridge/br_multicast: remove redundant variable "err"
zhong jiang [Mon, 6 Aug 2018 03:07:23 +0000 (11:07 +0800)]
net/bridge/br_multicast: remove redundant variable "err"

The variable err is not modified after initialization, so remove it and
make the function return void.

Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agomellanox: fix the dport endianness in call of __inet6_lookup_established()
Al Viro [Sat, 4 Aug 2018 20:41:27 +0000 (21:41 +0100)]
mellanox: fix the dport endianness in call of __inet6_lookup_established()

__inet6_lookup_established() expects th->dport passed in host-endian,
not net-endian.  The reason is a micro-optimization in __inet6_lookup(),
but if you use the lower-level helpers, you have to play by their
rules...
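
A minimal userspace sketch of the convention, assuming hypothetical helpers: demo_lookup_established() stands in for a lower-level lookup that wants the port in host byte order, and the wrapper converts a port taken from a packet header with ntohs() before calling it.

```c
#include <arpa/inet.h>
#include <stdint.h>

/* Hypothetical lower-level lookup: expects the destination port in
 * host byte order and matches a single listening port. */
static int demo_lookup_established(uint16_t hnum)
{
	return hnum == 443;
}

/* dport_net is the port as it appears in the TCP header, that is, in
 * network byte order. Convert before using the lower-level helper. */
static int demo_lookup_from_header(uint16_t dport_net)
{
	return demo_lookup_established(ntohs(dport_net));
}
```

Passing the header value straight through would still compile, which is why this kind of mistake tends to be caught by sparse endianness annotations rather than by the compiler.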

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoieee802154: fakelb: add deprecated msg while probe
Alexander Aring [Sat, 14 Jul 2018 16:33:06 +0000 (12:33 -0400)]
ieee802154: fakelb: add deprecated msg while probe

Now that mac802154_hwsim exists, the fakelb driver will be deprecated.
This patch notifies all users of fakelb to switch to the new
mac802154_hwsim driver.

Signed-off-by: Alexander Aring <aring@mojatatu.com>
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
6 years agoieee802154: hwsim: add replacement for fakelb
Alexander Aring [Sat, 14 Jul 2018 16:33:05 +0000 (12:33 -0400)]
ieee802154: hwsim: add replacement for fakelb

This patch adds a new virtual driver mac802154_hwsim which is based on
the fakelb driver.
The fakelb driver will get deprecated and hopefully removed someday.
The main reason for doing this step is to rename the driver to
mac802154_hwsim to have a similar naming scheme as mac80211_hwsim,
which is more popular in the 802.11 wireless world, and the idea behind
this driver is the same.

The new features of this driver are to have knowledge about connected
edges, which can be changed during runtime. This offers a testing
environment for routing protocols e.g. RPL.
The default behaviour is still the same as fakelb: two radios connected
to each other. Radios newly added during runtime will not be connected
to other wpan_hwsim instances.

The netlink API is not namespace aware on purpose; only the registered
wpan_phy's can be moved to namespaces. The physical layer, corresponding
to wireless "air" communication, can be handled across namespaces.

Furthermore, the edges can be weighted with the LQI value according to
IEEE 802.15.4, which offers additional handling to mark bad or good
connection indicators to other connected virtual phys.

Signed-off-by: Alexander Aring <aring@mojatatu.com>
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
6 years agonet: ieee802154: 6lowpan: remove redundant pointers 'fq' and 'net'
Colin Ian King [Tue, 31 Jul 2018 15:45:07 +0000 (16:45 +0100)]
net: ieee802154: 6lowpan: remove redundant pointers 'fq' and 'net'

Pointers fq and net are being assigned but are never used hence they
are redundant and can be removed.

Cleans up clang warnings:
warning: variable 'fq' set but not used [-Wunused-but-set-variable]
warning: variable 'net' set but not used [-Wunused-but-set-variable]

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
6 years agonet: mac802154: tx: expand tailroom if necessary
Alexander Aring [Mon, 2 Jul 2018 20:32:03 +0000 (16:32 -0400)]
net: mac802154: tx: expand tailroom if necessary

This patch is necessary in the case of AF_PACKET, or any other socket
interface I am aware of, that did not allocate the necessary room.

Reported-by: David Palma <david.palma@ntnu.no>
Reported-by: Rabi Narayan Sahoo <rabinarayans0828@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Alexander Aring <aring@mojatatu.com>
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
6 years agonet: 6lowpan: fix reserved space for single frames
Alexander Aring [Sat, 14 Jul 2018 16:52:10 +0000 (12:52 -0400)]
net: 6lowpan: fix reserved space for single frames

This patch adds handling to take care of tailroom and headroom for
single 6lowpan frames. We need to be sure we have an skb with the right
headroom and tailroom for single frames. This patch does so by using
skb_copy_expand() if the upper layer did not allocate enough headroom
and tailroom.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=195059
Reported-by: David Palma <david.palma@ntnu.no>
Reported-by: Rabi Narayan Sahoo <rabinarayans0828@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Alexander Aring <aring@mojatatu.com>
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
6 years agoMerge remote-tracking branch 'net-next/master'
Stefan Schmidt [Mon, 6 Aug 2018 07:04:48 +0000 (09:04 +0200)]
Merge remote-tracking branch 'net-next/master'

6 years agotc-testing: remove duplicate spaces in skbedit match patterns
Vlad Buslov [Sun, 5 Aug 2018 19:37:09 +0000 (22:37 +0300)]
tc-testing: remove duplicate spaces in skbedit match patterns

Match patterns for some skbedit tests contain duplicate whitespace that is
not present in actual tc output. This causes tests to fail because they
can't match required action, even when it was successfully created.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agotc-testing: remove duplicate spaces in connmark match patterns
Vlad Buslov [Sun, 5 Aug 2018 19:36:44 +0000 (22:36 +0300)]
tc-testing: remove duplicate spaces in connmark match patterns

Match patterns for some connmark tests contain duplicate whitespace that is
not present in actual tc output. This causes tests to fail because they
can't match required action, even when it was successfully created.

Fixes: 1dad0f9ffff7 ("tc-testing: add connmark action tests")
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agotc-testing: flush gact actions on test teardown
Vlad Buslov [Sun, 5 Aug 2018 19:36:25 +0000 (22:36 +0300)]
tc-testing: flush gact actions on test teardown

Test 6fb4 creates one mirred and one pipe action, but only flushes mirred
on teardown. Leaking pipe action causes failures in other tests.

Add additional teardown command to also flush gact actions.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agotc-testing: fix ip address in u32 test
Vlad Buslov [Sun, 5 Aug 2018 19:35:56 +0000 (22:35 +0300)]
tc-testing: fix ip address in u32 test

Fix expected ip address to actually match configured ip address.
Fix test to expect single matched filter.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge tag 'wireless-drivers-next-for-davem-2018-08-05' of git://git.kernel.org/pub...
David S. Miller [Mon, 6 Aug 2018 00:36:01 +0000 (17:36 -0700)]
Merge tag 'wireless-drivers-next-for-davem-2018-08-05' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next

Kalle Valo says:

====================
wireless-drivers-next patches for 4.19

This time a bigger pull request as we have two new Mediatek drivers
MT76x2u (CONFIG_MT76x2U) and MT76x0U (CONFIG_MT76x0U). Also iwlwifi got
support for the new IEEE 802.11ax standard, the successor for
802.11ac. And naturally smaller new features and bugfixes all over.

Major changes:

wcn36xx

* fix WEP in client mode

wil6210

* add support for Talyn-MB (Talyn ver 2.0) device

* add support for enhanced DMA firmware feature

iwlwifi

* implement 802.11ax D2.0

* support for the new 22560 device family

* new PCI IDs for 22000 and 22560

qtnfmac

* implement cfg80211 power management callback

* enable multiple SSIDs scan support

* qtnfmac: implement basic WoWLAN support

mt7601u

* fall back to software encryption for hw unsupported ciphers

* enable 802.11 Management Frame Protection (MFP)

mt76

* support setting RTS threshold

* add USB support

* add support for MT76x2u devices

* add support for MT76x0U devices

mwifiex

* allow user space to set all other IEs except WMM IE

rsi

* add firmware support for AP+BT dual mode
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetoot...
David S. Miller [Mon, 6 Aug 2018 00:29:27 +0000 (17:29 -0700)]
Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next

Johan Hedberg says:

====================
pull request: bluetooth-next 2018-08-05

Here's the main bluetooth-next pull request for the 4.19 kernel.

 - Added support for Bluetooth Advertising Extensions
 - Added vendor driver support to hci_h5 HCI driver
 - Added serdev support to hci_h5 driver
 - Added support for Qualcomm wcn3990 controller
 - Added support for RTL8723BS and RTL8723DS controllers
 - btusb: Added new ID for Realtek 8723DE
 - Several other smaller fixes & cleanups

Please let me know if there are any issues pulling. Thanks.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge branch 'mlxsw-Enable-MC-aware-mode-for-mlxsw-ports'
David S. Miller [Mon, 6 Aug 2018 00:28:22 +0000 (17:28 -0700)]
Merge branch 'mlxsw-Enable-MC-aware-mode-for-mlxsw-ports'

Ido Schimmel says:

====================
mlxsw: Enable MC-aware mode for mlxsw ports

Petr says:

Due to an issue in Spectrum chips, when unicast traffic shares the same
queue as BUM traffic and there is congestion, the BUM traffic is
admitted to the queue anyway, thus pushing out all UC traffic. In order
to give unicast traffic precedence over BUM traffic, configure
multicast-aware mode on all ports.

Under multicast-aware regime, when assigning traffic class to a packet,
the switch doesn't merely take the value prescribed by the QTCT
register. For BUM traffic, it instead assigns that value plus 8. That
limits the number of available TCs, but since mlxsw currently only uses
the lower eight anyway, it is no real loss.

The two TCs (UC and MC one) are then mapped to the same subgroup and
strictly prioritized so that UC traffic is preferred in case of
congestion.

In patch #1, introduce a new register, QTCTM, which enables the
multicast-aware mode.

In patch #2, fix a typo in related code.

In patch #3, set up TCs and QTCTM to enable multicast-aware mode.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agomlxsw: spectrum: Configure MC-aware mode on mlxsw ports
Petr Machata [Sun, 5 Aug 2018 06:03:08 +0000 (09:03 +0300)]
mlxsw: spectrum: Configure MC-aware mode on mlxsw ports

In order to give unicast traffic precedence over BUM traffic, configure
multicast-aware mode on all ports.

Under multicast-aware regime, when assigning traffic class to a packet,
the switch doesn't merely take the value prescribed by the QTCT
register. For BUM traffic, it instead assigns that value plus 8.

ETS elements for TCs 8..15 thus need to be configured as well. Extend
mlxsw_sp_port_ets_init() so that it maps each of them to the same
subgroup as their corresponding TC from the range 0..7, such that TCs X
and X+8 map to the same subgroup.

The existing code configures TCs with strict priority. So far this was
immaterial, because each TC had its own subgroup. Now that two TCs share
a subgroup it becomes important. TCs are prioritized in order of 7, 6,
..., 0, 15, 14, ..., 8: the higher TCs used for BUM traffic end up being
deprioritized. Since that's what's needed, keep that configuration as it
is, and configure the new TCs likewise.

Finally in mlxsw_sp_port_create(), invoke configuration of QTCTM to
enable MC-aware mode on each port.
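
A minimal sketch of the mapping described above, not the driver code.
All names here (UC_TC_COUNT, mc_aware_tclass, tclass_subgroup,
tclass_rank) are hypothetical; only the "plus 8" offset, the shared
subgroup for TCs X and X+8, and the order 7, 6, ..., 0, 15, 14, ..., 8
come from the text.

```c
#include <assert.h>
#include <stdbool.h>

#define UC_TC_COUNT 8	/* TCs 0..7 carry unicast, TCs 8..15 carry BUM */

/* TC actually assigned: the QTCT value, plus 8 for BUM traffic. */
static int mc_aware_tclass(int qtct_tc, bool is_bum)
{
	assert(qtct_tc >= 0 && qtct_tc < UC_TC_COUNT);
	return is_bum ? qtct_tc + UC_TC_COUNT : qtct_tc;
}

/* TCs X and X+8 map to the same subgroup, here numbered X. */
static int tclass_subgroup(int tc)
{
	assert(tc >= 0 && tc < 2 * UC_TC_COUNT);
	return tc % UC_TC_COUNT;
}

/*
 * Position in the strict-priority order 7, 6, ..., 0, 15, 14, ..., 8.
 * Rank 0 is served first.
 */
static int tclass_rank(int tc)
{
	assert(tc >= 0 && tc < 2 * UC_TC_COUNT);
	if (tc < UC_TC_COUNT)
		return UC_TC_COUNT - 1 - tc;	/* 7 -> 0, ..., 0 -> 7 */
	return 3 * UC_TC_COUNT - 1 - tc;	/* 15 -> 8, ..., 8 -> 15 */
}
```

Within any subgroup X, tclass_rank(X) is lower than tclass_rank(X + 8),
so the unicast TC is served before the BUM TC that shares its subgroup.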

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years ago mlxsw: spectrum: Fix a typo
Petr Machata [Sun, 5 Aug 2018 06:03:07 +0000 (09:03 +0300)]
mlxsw: spectrum: Fix a typo

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years ago mlxsw: reg: Add QoS Switch Traffic Class Table is Multicast-Aware Register
Petr Machata [Sun, 5 Aug 2018 06:03:06 +0000 (09:03 +0300)]
mlxsw: reg: Add QoS Switch Traffic Class Table is Multicast-Aware Register

This register configures whether the Switch Priority to Traffic Class
mapping is based on the Multicast packet indication. If so, multicast
packets get a Traffic Class equal to the value configured by QTCT plus
(cap_max_tclass_data/2).
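
As a worked illustration of that rule (a sketch, not the register
definition): qtctm_tclass() and its parameters are hypothetical names,
and the value 16 for cap_max_tclass_data in the usage note is an
assumption chosen to give an offset of 8.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Effective traffic class for a packet. With MC-aware mode enabled,
 * multicast packets get the QTCT value plus cap_max_tclass_data / 2;
 * everything else keeps the QTCT value unchanged.
 */
static unsigned int qtctm_tclass(unsigned int qtct_tclass, bool mc_aware,
				 bool is_mc, unsigned int cap_max_tclass_data)
{
	unsigned int half = cap_max_tclass_data / 2;

	/* QTCT values must stay in the lower half of the TC range. */
	assert(qtct_tclass < half);
	if (mc_aware && is_mc)
		return qtct_tclass + half;
	return qtct_tclass;
}
```

With cap_max_tclass_data assumed to be 16, a multicast packet whose
QTCT value is 3 ends up in TC 11 when MC-aware mode is on, and in TC 3
when it is off.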

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years ago virtio-net: mark expected switch fall-throughs
Gustavo A. R. Silva [Sun, 5 Aug 2018 02:42:05 +0000 (21:42 -0500)]
virtio-net: mark expected switch fall-throughs

In preparation for enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.
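
For illustration only, a marked fall-through can look like the sketch
below. The enum and handle_event() are hypothetical and are not taken
from virtio-net; the point is the comment placed directly before the
next case label, which is one of the forms GCC recognizes at its
default -Wimplicit-fallthrough level.

```c
#include <assert.h>

enum link_event { EV_RESET, EV_UP, EV_DOWN };	/* hypothetical events */

/* Returns how many setup steps ran, or -1 for EV_DOWN. */
static int handle_event(enum link_event ev)
{
	int steps = 0;

	switch (ev) {
	case EV_RESET:
		steps++;	/* work done only on reset */
		/* fall through */
	case EV_UP:
		steps++;	/* work shared by EV_RESET and EV_UP */
		break;
	case EV_DOWN:
		steps = -1;
		break;
	default:
		assert(!"unknown event");
	}
	return steps;
}
```

Without the comment, the missing break after the EV_RESET body is what
a checker would report as "Missing break in switch".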

Addresses-Coverity-ID: 1402059 ("Missing break in switch")
Addresses-Coverity-ID: 1402060 ("Missing break in switch")
Addresses-Coverity-ID: 1402061 ("Missing break in switch")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years ago net: sched: cls_flower: Fix an error code in fl_tmplt_create()
Dan Carpenter [Fri, 3 Aug 2018 19:27:55 +0000 (22:27 +0300)]
net: sched: cls_flower: Fix an error code in fl_tmplt_create()

We forgot to set the error code on this path, so we return NULL instead
of an error pointer.  In the current code kzalloc() won't fail for small
allocations so this doesn't really affect runtime.
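
A userspace sketch of why NULL is the wrong return value on such a
path. The helpers err_ptr(), ptr_err() and is_err() are simplified
stand-ins written for this example (the kernel has its own ERR_PTR
family), and struct tmplt and tmplt_create() are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ERRNO 4095

/* Encode a negative errno in the top 4095 values of the address space. */
static void *err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

static bool is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

static long ptr_err(const void *p)
{
	assert(is_err(p));
	return (long)(intptr_t)p;
}

struct tmplt { int dummy; };	/* hypothetical object */

static struct tmplt *tmplt_create(bool simulate_alloc_failure)
{
	struct tmplt *t;

	t = simulate_alloc_failure ? NULL : calloc(1, sizeof(*t));
	if (!t)
		return err_ptr(-ENOMEM);  /* returning NULL here is the bug */
	return t;
}
```

A caller that tests only is_err() would treat a NULL return as success,
because is_err(NULL) is false, and would then dereference it.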

Fixes: b95ec7eb3b4d ("net: sched: cls_flower: implement chain templates")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years ago net: check extack._msg before print
Li RongQing [Fri, 3 Aug 2018 07:45:21 +0000 (15:45 +0800)]
net: check extack._msg before print

dev_set_mtu_ext() can fail even with a valid MTU value. In that case
extack._msg is not set, and because extack lives on the stack its
contents are random, so the kernel crashes when it tries to print it.
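
The failure mode can be sketched in userspace as follows. This is an
illustration under assumptions, not the patch itself: struct extack,
set_mtu() and failure_text() are hypothetical stand-ins, and the magic
value 9999 only simulates a failure path that sets no message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct extack { const char *msg; };	/* stand-in for the real struct */

/* Hypothetical setter: not every failure path fills in a message. */
static int set_mtu(int mtu, struct extack *ea)
{
	assert(ea != NULL);
	if (mtu < 68) {
		ea->msg = "mtu less than device minimum";
		return -1;
	}
	if (mtu == 9999)
		return -2;	/* simulated failure without a message */
	return 0;
}

static const char *failure_text(int mtu)
{
	struct extack ea;

	/* Without this, ea.msg would hold whatever was on the stack. */
	memset(&ea, 0, sizeof(ea));
	if (set_mtu(mtu, &ea) == 0)
		return "ok";
	/* Only use the message if the failing path actually set one. */
	return ea.msg ? ea.msg : "(no message)";
}
```

Both steps matter in this sketch: zeroing makes "no message" detectable,
and the check keeps the unset pointer from being printed.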

Fixes: 7a4c53bee3324a ("net: report invalid mtu value via netlink extack")
Signed-off-by: Zhang Yu <zhangyu31@baidu.com>
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years ago ipv6: defrag: drop non-last frags smaller than min mtu
Florian Westphal [Fri, 3 Aug 2018 00:22:20 +0000 (02:22 +0200)]
ipv6: defrag: drop non-last frags smaller than min mtu

Don't bother with pathological cases; they only waste cycles.
IPv6 requires a minimum MTU of 1280 so we should never see fragments
smaller than this (except last frag).
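
The rule above can be sketched as a single predicate. The names are
hypothetical, and taking the length as the size of the whole packet
starting at the IPv6 header is an assumption of this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define IPV6_MIN_MTU_BYTES 1280	/* minimum MTU required by IPv6 */

/*
 * A fragment is dropped when more fragments follow it (it is not the
 * last one) and the packet carrying it is smaller than the minimum MTU.
 */
static bool drop_small_frag(size_t packet_len, bool more_fragments)
{
	assert(packet_len > 0);
	return more_fragments && packet_len < IPV6_MIN_MTU_BYTES;
}
```

A 1000-byte packet is therefore dropped only if it is not the last
fragment; the last fragment may be arbitrarily short.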

v3: don't use awkward "-offset + len"
v2: drop IPv4 part, which added same check w. IPV4_MIN_MTU (68).
    There were concerns that there could be even smaller frags
    generated by intermediate nodes, e.g. on radio networks.

Cc: Peter Oskolkov <posk@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years ago Merge branch 'ip-Use-rb-trees-for-IP-frag-queue'
David S. Miller [Mon, 6 Aug 2018 00:16:46 +0000 (17:16 -0700)]
Merge branch 'ip-Use-rb-trees-for-IP-frag-queue'

Peter Oskolkov says:

====================
ip: Use rb trees for IP frag queue.

This patchset
 * changes IPv4 defrag behavior to match that of IPv6: overlapping
   fragments now cause the whole IP datagram to be discarded (suggested
   by David Miller): there are no legitimate use cases for overlapping
   fragments;
 * changes IPv4 defrag queue from a list to an rb tree (suggested
   by Eric Dumazet): this change removes a potential attack vector.

Upcoming patches will contain similar changes for IPv6 frag queue,
as well as a comprehensive IP defrag self-test (temporarily delayed).
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years ago ip: use rb trees for IP frag queue.
Peter Oskolkov [Thu, 2 Aug 2018 23:34:39 +0000 (23:34 +0000)]
ip: use rb trees for IP frag queue.

Similar to the TCP OOO RX queue, it makes sense to use rb trees to
store IP fragments, so that OOO fragments are inserted faster.
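
To illustrate the idea, here is a sketch of an offset-ordered tree of
fragments. It is a plain binary search tree: the red-black rebalancing
that keeps a real rb tree at logarithmic depth is omitted for brevity.
All names are hypothetical, and treating an exact duplicate as an
overlap is a simplification of this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct frag {
	unsigned int offset, len;	/* covers [offset, offset + len) */
	struct frag *left, *right;
};

enum insert_result { FRAG_INSERTED, FRAG_OVERLAP, FRAG_NOMEM };

/*
 * Walk down by offset. Stored fragments never overlap each other, so
 * any fragment overlapping the new one must lie on the search path.
 */
static enum insert_result frag_insert(struct frag **root,
				      unsigned int offset, unsigned int len)
{
	struct frag **link = root;
	struct frag *node;

	assert(len > 0);
	while (*link) {
		struct frag *cur = *link;

		if (offset + len <= cur->offset)
			link = &cur->left;	/* entirely before cur */
		else if (offset >= cur->offset + cur->len)
			link = &cur->right;	/* entirely after cur */
		else
			return FRAG_OVERLAP;	/* caller drops the datagram */
	}
	node = calloc(1, sizeof(*node));
	if (!node)
		return FRAG_NOMEM;
	node->offset = offset;
	node->len = len;
	*link = node;
	return FRAG_INSERTED;
}

static void frag_free_all(struct frag *node)
{
	if (!node)
		return;
	frag_free_all(node->left);
	frag_free_all(node->right);
	free(node);
}
```

An out-of-order fragment is placed by descending from the root instead
of scanning a list from one end, which is where the faster insertion
comes from once the tree is kept balanced.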

Tested:

- a follow-up patch contains a rather comprehensive ip defrag
  self-test (functional)
- ran neper `udp_stream -c -H <host> -F 100 -l 300 -T 20`:
    netstat --statistics
    Ip:
        282078937 total packets received
        0 forwarded
        0 incoming packets discarded
        946760 incoming packets delivered
        18743456 requests sent out
        101 fragments dropped after timeout
        282077129 reassemblies required
        944952 packets reassembled ok
        262734239 packet reassembles failed
   (The numbers/stats above are somewhat better re:
    reassemblies vs a kernel without this patchset. More
    comprehensive performance testing TBD).

Reported-by: Jann Horn <jannh@google.com>
Reported-by: Juha-Matti Tilli <juha-matti.tilli@iki.fi>
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Peter Oskolkov <posk@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>