Moshe Shemesh [Sun, 7 Jan 2018 14:45:27 +0000 (16:45 +0200)]
net/mlx5: Add support for QUERY_VNIC_ENV command
Add support for new FW command QUERY_VNIC_ENV.
The command is used by the driver to query vnic diagnostic statistics
from FW.
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Inbar Karmy [Mon, 20 Nov 2017 16:06:20 +0000 (18:06 +0200)]
net/mlx5e: PFC stall prevention support
Implement set/get functions to configure PFC stall prevention
timeout by tunables api through ethtool.
By default the stall prevention timeout is configured to 8 sec.
Timeout range is: 80-8000 msec.
Enabling stall prevention with the auto timeout will set
the timeout to 100 msec.
Signed-off-by: Inbar Karmy <inbark@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Inbar Karmy [Mon, 20 Nov 2017 14:14:30 +0000 (16:14 +0200)]
ethtool: Add support for configuring PFC stall prevention in ethtool
In the event where the device unexpectedly becomes unresponsive
for a long period of time, flow control mechanism may propagate
pause frames which will cause congestion spreading to the entire
network.
To prevent this scenario, when the device is stalled for a period
longer than a pre-configured timeout, flow control mechanisms are
automatically disabled.
This patch adds support for the ETHTOOL_PFC_STALL_PREVENTION
as a tunable.
This API provides support for configuring flow control storm prevention
timeout (msec).
Signed-off-by: Inbar Karmy <inbark@mellanox.com>
Cc: Michal Kubecek <mkubecek@suse.cz>
Cc: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Inbar Karmy [Thu, 17 Aug 2017 13:39:47 +0000 (16:39 +0300)]
net/mlx5e: Expose PFC stall prevention counters
Add the needed capability bit and counters to device spec description.
Expose the following two counters in ethtool:
tx_pause_storm_warning_events: when the device is stalled for a period
longer than a pre-configured watermark, the counter increase, allowing
the debug utility an insight into current device status.
tx_pause_storm_error_events: when the device is stalled for a period
longer than a pre-configured timeout, the pause transmission is disabled,
and the counter increase.
Signed-off-by: Inbar Karmy <inbark@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
David S. Miller [Mon, 26 Mar 2018 17:14:45 +0000 (13:14 -0400)]
Merge branch 'mlxsw-Offload-IPv6-multicast-routes'
Ido Schimmel says:
====================
mlxsw: Offload IPv6 multicast routes
Yuval says:
The series is intended to allow offloading IPv6 multicast routes
and is split into two parts:
- First half of the patches continue extending ip6mr [& refactor ipmr]
with missing bits necessary for the offloading - fib-notifications,
mfc refcounting and default rule identification.
- Second half of the patches extend functionality inside mlxsw,
beginning with extending lower-parts to support IPv6 mroutes
to host and later extending the router/mr internal APIs within
the driver to accommodate support in ipv6 configurations.
Lastly it adds support in the RTNL_FAMILY_IP6MR notifications,
allowing driver to react and offload related routes.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz [Mon, 26 Mar 2018 12:01:45 +0000 (15:01 +0300)]
mlxsw: spectrum: Add multicast router trap for PIMv6
Add a new trap for PIMv6 packets. As PIM already has a designated trap
group [ & rate limiter], simply use the same for PIMv6 as well.
Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz [Mon, 26 Mar 2018 12:01:44 +0000 (15:01 +0300)]
mlxsw: spectrum_router: Process IP6MR fib notification
Following previous patches driver is ready to handle notifications
arriving from ip6mr - start processing those when they arrive following
the same manner ipmr currently goes through.
This should enable driver to start offloading ipv6 multicast routes.
Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz [Mon, 26 Mar 2018 12:01:43 +0000 (15:01 +0300)]
mlxsw: spectrum_mr: Add ipv6 specific operations
Populate the various operation structures meant for IPv6 with logic
unique to that protocol suite.
Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz [Mon, 26 Mar 2018 12:01:42 +0000 (15:01 +0300)]
mlxsw: spectrum_router: Make IPMR-related APIs family agnostic
spectrum_router and spectrum_mr have several APIs that are used to
manipulate configurations originating from ipmr fib notifications.
Following previous patches all the protocol-specifics that are necessary
for the configuration are hidden within spectrum_mr. This allows us to
clean the API and make sure that other than choosing the mr_table based
on the fib notification family, spectrum_router wouldn't care about the
source of the notification when passing it onward to spectrum_mr.
This would later allow us to leverage the same code for fib
notifications originating from ip6mr.
Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz [Mon, 26 Mar 2018 12:01:41 +0000 (15:01 +0300)]
mlxsw: spectrum_mr: Convert into using mr_mfc
Current multicast routing logic in driver assumes it's always meant to
deal with IPv4 multicast routes, leaving several placeholders for
later IPv6 support [currently usually WARN()].
This patch changes the driver's internal multicast route struct into
holding a common mr_mfc instead of the IPv4 mfc_cache.
The various placeholders are grouped into 2:
- Functions that require only the common bits; These remain and the
restriction for IPv4-only is lifted.
- Function that require IPv4-specifics - for handling these functions
we add sets of operations that encapsulate the protocol differences
Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz [Mon, 26 Mar 2018 12:01:40 +0000 (15:01 +0300)]
mlxsw: spectrum_router: Support IPv6 multicast to host CPU
A step toward offloading IPv6 routing, this adds an additional
multicast routing table meant for IPv6 [with its underlying TCAM
region] and populates the default rule for IPv6 multicast packets.
Following this, ingress IPv6 multicast packets would be trapped and
delivered to the host CPU.
Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz [Mon, 26 Mar 2018 12:01:39 +0000 (15:01 +0300)]
mlxsw: spectrum_mr: Pass protocol as part of catchall route params
Since commit
c011ec1bbfd6 ("mlxsw: spectrum: Add the multicast routing
offloading logic") spectrum_mr did not populate the protocol portion of
the catcahall_route_params; mr-tcam logic worked correctly for ipv4
since the enum value for MLXSW_SP_L3_PROTO_IPV4 is '0'.
Explicitly fill the protocol as we'll soon need to differentiate between
ipv4 and ipv6.
Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz [Mon, 26 Mar 2018 12:01:38 +0000 (15:01 +0300)]
mlxsw: reg: Add register settings for IPv6 multicast routing
Add new fields for the rmft register necessary for setting the IPv6
multicast FIB table. Add a matching wrapper function for filling
the register in the IPv6 scenario.
Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz [Mon, 26 Mar 2018 12:01:37 +0000 (15:01 +0300)]
mlxsw: reg: Configure RIF to forward IPv6 multicast packets
Similarly to what was done in commit
4af5964e5888 ("mlxsw: reg:
Configure RIF to forward IPv4 multicast packets by default") we now set
two additional bits to allow IPv6 multicast forwarding.
Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz [Mon, 26 Mar 2018 12:01:36 +0000 (15:01 +0300)]
ip6mr: Add refcounting to mfc
Since ipmr and ip6mr are using the same mr_mfc struct at their core, we
can now refactor the ipmr_cache_{hold,put} logic and apply refcounting
to both ipmr and ip6mr.
Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz [Mon, 26 Mar 2018 12:01:35 +0000 (15:01 +0300)]
ip6mr: Add API for default_rule fib
Add the ability to discern whether a given FIB rule notification relates
to the default rule inserted when registering ip6mr or a different one.
Would later be used by drivers wishing to offload ipv6 multicast routes
but unable to offload rules other than the default one.
Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz [Mon, 26 Mar 2018 12:01:34 +0000 (15:01 +0300)]
ip6mr: Support fib notifications
In similar fashion to ipmr, support fib notifications for ip6mr mfc and
vif related events. This would later allow drivers to react to said
notifications and offload the IPv6 mroutes.
Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz [Mon, 26 Mar 2018 12:01:33 +0000 (15:01 +0300)]
ipmr: Make ipmr_dump() common
Since all the primitive elements used for the notification done by ipmr
are now common [mr_table, mr_mfc, vif_device] we can refactor the logic
for dumping them to a common file.
Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz [Mon, 26 Mar 2018 12:01:32 +0000 (15:01 +0300)]
ipmr: Make MFC fib notifiers common
Like vif notifications, move the notifier struct for MFC as well as its
helpers into a common file; Currently they're only used by ipmr.
Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz [Mon, 26 Mar 2018 12:01:31 +0000 (15:01 +0300)]
ipmr: Make vif fib notifiers common
The fib-notifiers are tightly coupled with the vif_device which is
already common. Move the notifier struct definition and helpers to the
common file; Currently they're only used by ipmr.
Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 26 Mar 2018 17:03:27 +0000 (13:03 -0400)]
Merge branch 'pernet-convert-part7.1'
Kirill Tkhai says:
====================
Converting pernet_operations (part #7.1)
this is a resending of the 4 patches from path #7.
Anna kindly reviewed them and suggested to take the patches
through net tree, since there is pernet_operations::async only
in net-next.git.
There is Anna's acks on every header, the rest of patch
has no changes.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Kirill Tkhai [Mon, 26 Mar 2018 09:29:13 +0000 (12:29 +0300)]
net: Convert nfs4blocklayout_net_ops
These pernet_operations create and destroy per-net pipe
and dentry, and they seem safe to be marked as async.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Acked-by: Anna Schumaker <Anna.Schumaker@netapp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kirill Tkhai [Mon, 26 Mar 2018 09:29:04 +0000 (12:29 +0300)]
net: Convert nfs4_dns_resolver_ops
These pernet_operations look similar to rpcsec_gss_net_ops,
they just create and destroy another cache. Also they create
and destroy directory. So, they also look safe to be async.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Acked-by: Anna Schumaker <Anna.Schumaker@netapp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kirill Tkhai [Mon, 26 Mar 2018 09:28:55 +0000 (12:28 +0300)]
net: Convert sunrpc_net_ops
These pernet_operations look similar to rpcsec_gss_net_ops,
they just create and destroy another caches. So, they also
can be async.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Acked-by: Anna Schumaker <Anna.Schumaker@netapp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kirill Tkhai [Mon, 26 Mar 2018 09:28:47 +0000 (12:28 +0300)]
net: Convert rpcsec_gss_net_ops
These pernet_operations initialize and destroy sunrpc_net_id
refered per-net items. Only used global list is cache_list,
and accesses already serialized.
sunrpc_destroy_cache_detail() check for list_empty() without
cache_list_lock, but when it's called from unregister_pernet_subsys(),
there can't be callers in parallel, so we won't miss list_empty()
in this case.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Acked-by: Anna Schumaker <Anna.Schumaker@netapp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 26 Mar 2018 17:01:10 +0000 (13:01 -0400)]
Merge branch 'nfp-flower-add-ip-fragmentation-offloading-support'
Pieter Jansen van Vuuren says:
====================
nfp: flower: add ip fragmentation offloading support
This set allows offloading IP fragmentation classification. It Implements
ip fragmentation match offloading for both IPv4 and IPv6 and offloads
frag, nofrag, first and nofirstfrag classification.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Pieter Jansen van Vuuren [Mon, 26 Mar 2018 08:16:38 +0000 (10:16 +0200)]
nfp: flower: implement ip fragmentation match offload
Implement ip fragmentation match offloading for both IPv4 and IPv6. Allows
offloading frag, nofrag, first and nofirstfrag classification.
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pieter Jansen van Vuuren [Mon, 26 Mar 2018 08:16:37 +0000 (10:16 +0200)]
nfp: flower: refactor shared ip header in match offload
Refactored shared ip header code for IPv4 and IPv6 in match offload.
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 26 Mar 2018 16:47:57 +0000 (12:47 -0400)]
Merge branch 'net-driver-barriers'
Sinan Kaya says:
====================
netdev: Eliminate duplicate barriers on weakly-ordered archs
Code includes wmb() followed by writel() in multiple places. writel()
already has a barrier on some architectures like arm64.
This ends up CPU observing two barriers back to back before executing the
register write.
Since code already has an explicit barrier call, changing writel() to
writel_relaxed().
I did a regex search for wmb() followed by writel() in each drivers
directory.
I scrubbed the ones I care about in this series.
I considered "ease of change", "popular usage" and "performance critical
path" as the determining criteria for my filtering.
We used relaxed API heavily on ARM for a long time but
it did not exist on other architectures. For this reason, relaxed
architectures have been paying double penalty in order to use the common
drivers.
Now that relaxed API is present on all architectures, we can go and scrub
all drivers to see what needs to change and what can remain.
We start with mostly used ones and hope to increase the coverage over time.
It will take a while to cover all drivers.
Feel free to apply patches individually.
Changes since v6:
- bring back amazon ena and add mmiowb, remove
ena_com_write_sq_doorbell_rel().
- remove extra mmiowb in bnx2x
- correct spelling mistake in bnx2x: Replace doorbell barrier() with wmb()
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Sinan Kaya [Sun, 25 Mar 2018 14:39:21 +0000 (10:39 -0400)]
net: ena: Eliminate duplicate barriers on weakly-ordered archs
Code includes barrier() followed by writel(). writel() already has a
barrier on some architectures like arm64.
This ends up CPU observing two barriers back to back before executing the
register write.
Create a new wrapper function with relaxed write operator. Use the new
wrapper when a write is following a barrier().
Since code already has an explicit barrier call, changing writel() to
writel_relaxed() and adding mmiowb() for ordering protection.
Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sinan Kaya [Sun, 25 Mar 2018 14:39:20 +0000 (10:39 -0400)]
bnxt_en: Eliminate duplicate barriers on weakly-ordered archs
Code includes wmb() followed by writel(). writel() already has a barrier on
some architectures like arm64.
This ends up CPU observing two barriers back to back before executing the
register write.
Create a new wrapper function with relaxed write operator. Use the new
wrapper when a write is following a wmb().
Since code already has an explicit barrier call, changing writel() to
writel_relaxed().
Also add mmiowb() so that write code doesn't move outside of scope.
Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
Acked-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sinan Kaya [Sun, 25 Mar 2018 14:39:19 +0000 (10:39 -0400)]
net: qlge: Eliminate duplicate barriers on weakly-ordered archs
Code includes wmb() followed by writel(). writel() already has a barrier on
some architectures like arm64.
This ends up CPU observing two barriers back to back before executing the
register write.
Create a new wrapper function with relaxed write operator. Use the new
wrapper when a write is following a wmb().
Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sinan Kaya [Sun, 25 Mar 2018 14:39:18 +0000 (10:39 -0400)]
bnx2x: Eliminate duplicate barriers on weakly-ordered archs
Code includes wmb() followed by writel(). writel() already has a
barrier on some architectures like arm64.
This ends up CPU observing two barriers back to back before executing
the register write.
Since code already has an explicit barrier call, changing writel() to
writel_relaxed().
Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sinan Kaya [Sun, 25 Mar 2018 14:39:17 +0000 (10:39 -0400)]
bnx2x: Replace doorbell barrier() with wmb()
barrier() doesn't guarantee memory writes to be observed by the hardware on
all architectures. barrier() only tells compiler not to move this code
with respect to other read/writes.
If memory write needs to be observed by the HW, wmb() is the right choice.
Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sinan Kaya [Sun, 25 Mar 2018 14:39:16 +0000 (10:39 -0400)]
qlcnic: Eliminate duplicate barriers on weakly-ordered archs
Code includes wmb() followed by writel(). writel() already has a
barrier on some architectures like arm64.
This ends up CPU observing two barriers back to back before executing
the register write.
Since code already has an explicit barrier call, changing writel() to
writel_relaxed().
Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
Acked-by: Manish Chopra <manish.chopra@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sinan Kaya [Sun, 25 Mar 2018 14:39:15 +0000 (10:39 -0400)]
net: qla3xxx: Eliminate duplicate barriers on weakly-ordered archs
Code includes wmb() followed by writel(). writel() already has a
barrier on some architectures like arm64.
This ends up CPU observing two barriers back to back before executing
the register write.
Since code already has an explicit barrier call, changing code to
wmb()
writel_relaxed()
mmiowb()
for multi-arch support.
Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 26 Mar 2018 16:34:20 +0000 (12:34 -0400)]
Merge branch 'sh_eth-unify-the-SoC-feature-checks'
Sergei Shtylyov says:
====================
sh_eth: unify the SoC feature checks
Here's a set of 5 patches against DaveM's 'net-next.git' repo.
The Ether driver sometimes uses the bit fields in 'struct sh_eth_cpu_data'
to check which Ether registers exist in a certain SoC and sometimes it uses
sh_eth_is_{gether|rz_fast_ether}() which basically compares 2 pointers (1 of
them being constant) -- the latter is definitely not a strongest feature of
the RISC CPUs (be it SH or ARM), so I decided to get rid of this type of
the feature checks in favour of the bit fields (I've also made use of a
32-bit value and method pointer where appropriate)...
[1/5] sh_eth: add sh_eth_cpu_data::soft_reset() method
[2/5] sh_eth: add sh_eth_cpu_data::edtrr_trns value
[3/5] sh_eth: add sh_eth_cpu_data::xdfar_rw flag
[4/5] sh_eth: add sh_eth_cpu_data::no_tx_cntr flag
[5/5] sh_eth: add sh_eth_cpu_data::cexcr flag
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Sergei Shtylyov [Sat, 24 Mar 2018 20:12:54 +0000 (23:12 +0300)]
sh_eth: add sh_eth_cpu_data::cexcr flag
GEther controllers have CERCR/CEECR instead of CNDCR on the others.
Currently we are calling sh_eth_is_gether() in order to check for this,
however it would be simpler to check the new 'cexcr' bitfield in the
'struct sh_eth_cpu_data'; then we'd be able to remove sh_eth_is_gether()
as there would be no callers left...
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sergei Shtylyov [Sat, 24 Mar 2018 20:11:19 +0000 (23:11 +0300)]
sh_eth: add sh_eth_cpu_data::no_tx_cntrs flag
RZ/A1H (R7S72100) Ether controller doesn't seem to have the TX counter
registers like TROCR/CDCR/LCCR (or at least they are still undocumented
like some TSU registers), so we bail out of sh_eth_get_stats() early in
this case. Currently we are calling sh_eth_is_rz_fast_ether() in order
to check for this, but it would be simpler to check the new 'no_tx_cntrs'
bitfield in the 'struct sh_eth_cpu_data'; then we'd be able to remove
sh_eth_is_rz_fast_ether() as there would be no callers left...
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sergei Shtylyov [Sat, 24 Mar 2018 20:09:55 +0000 (23:09 +0300)]
sh_eth: add sh_eth_cpu_data::xdfar_rw flag
The GEther-like controllers have writeable RDFAR/TDFAR, on the others
they are read-only or just absent (on R-Car). Currently we are calling
sh_eth_is_{gether|rz_fast_ether}() in order to check if these registers
can be written to, however it would be simpler to check the new 'xdfar_rw'
bitfield in the 'struct sh_eth_cpu_data'...
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sergei Shtylyov [Sat, 24 Mar 2018 20:08:42 +0000 (23:08 +0300)]
sh_eth: add sh_eth_cpu_data::edtrr_trns value
sh_eth_get_edtrr_trns() returns the value to be written to EDTRR in order
to start TX DMA -- this value is different between the GEther-like and
the other controllers. We can replace this function (and thus get rid of
the calls to sh_eth_is_{gether|rz_fast_ether}() by it) with a new field
'edtrr_trns' in the 'struct sh_eth_cpu_data'.
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sergei Shtylyov [Sat, 24 Mar 2018 20:07:41 +0000 (23:07 +0300)]
sh_eth: add sh_eth_cpu_data::soft_reset() method
sh_eth_reset() performs a software reset which is implemented in a
completely different way for the GEther-like controllers vs the other
controllers due to a different layout of EDMR (and other factors) --
it therefore makes sense to convert this function to a mandatory
sh_eth_cpu_data::soft_reset() method and thus get rid of the runtime
controller type check via sh_eth_is_{gether|rz_fast_ether}().
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Richard Cochran [Sat, 24 Mar 2018 04:24:02 +0000 (21:24 -0700)]
ptp: Fix documentation to match code.
Ever since commit
3a06c7ac24f9 ("posix-clocks: Remove interval timer
facility and mmap/fasync callbacks") the possibility of PHC based
posix timers has been removed. In addition it will probably never
make sense to implement this functionality.
This patch removes the misleading text which seems to suggest that
posix timers for PHC devices will ever be a thing.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 26 Mar 2018 16:12:27 +0000 (12:12 -0400)]
Merge branch 'hns3-fixes-next'
Peng Li says:
====================
fix some bugs for HNS3
This patchset fixes some bugs for HNS3 driver:
[Patch 1/5 - 2/5] fix 2 return vlaue issues.
[Patch 3/5 - 4/5] fix 2 comments reported by code review.
[Ptach 5/5] avoid sending message to IMP because IMP will not
handle any message when it is resetting.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Peng Li [Sat, 24 Mar 2018 03:32:47 +0000 (11:32 +0800)]
net: hns3: never send command queue message to IMP when reset
IMP will not handle and command queue message any more when it is
in core/global, driver should not send command queue message to
IMP until reinitialize the NIC HW.
This patch checks the status and avoid the message sent to IMP when
reset.
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fuyun Liang [Sat, 24 Mar 2018 03:32:46 +0000 (11:32 +0800)]
net: hns3: fix for not initializing VF rss_hash_key problem
Default rss_hash_key value should be given to all vports. But just the
PF rss_hash_key has the default value here. This patch adds rss_hash_key
Initialization for all vports.
Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fuyun Liang [Sat, 24 Mar 2018 03:32:45 +0000 (11:32 +0800)]
net: hns3: fix for the wrong shift problem in hns3_set_txbd_baseinfo
Third parameter of hnae_set_field is shift, But a mask is given. This
patch fixes it by replacing HNS3_TXD_BDTYPE_M with HNS3_TXD_BDTYPE_S.
Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fuyun Liang [Sat, 24 Mar 2018 03:32:44 +0000 (11:32 +0800)]
net: hns3: fix for returning wrong value problem in hns3_get_rss_indir_size
The return type of hns3_get_rss_indir_size is u32. But a negative value is
returned. This patch fixes it by replacing the negative value with zero.
Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fuyun Liang [Sat, 24 Mar 2018 03:32:43 +0000 (11:32 +0800)]
net: hns3: fix for returning wrong value problem in hns3_get_rss_key_size
The return type of hns3_get_rss_key_size is u32. But a negative value is
returned. This patch fixes it by replacing the negative value with zero.
Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Fri, 23 Mar 2018 23:51:57 +0000 (23:51 +0000)]
net: qualcomm: rmnet: check for null ep to avoid null pointer dereference
The call to rmnet_get_endpoint can potentially return NULL so check
for this to avoid any subsequent null pointer dereferences on a NULL
ep.
Detected by CoverityScan, CID#
1465385 ("Dereference null return value")
Fixes: 23790ef12082 ("net: qualcomm: rmnet: Allow to configure flags for existing devices")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Joe Perches [Fri, 23 Mar 2018 23:34:44 +0000 (16:34 -0700)]
ethernet: Use octal not symbolic permissions
Prefer the direct use of octal for permissions.
Done with checkpatch -f --types=SYMBOLIC_PERMS --fix-inplace
and some typing.
Miscellanea:
o Whitespace neatening around these conversions.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Joe Perches [Fri, 23 Mar 2018 22:54:39 +0000 (15:54 -0700)]
drivers/net: Use octal not symbolic permissions
Prefer the direct use of octal for permissions.
Done with checkpatch -f --types=SYMBOLIC_PERMS --fix-inplace
and some typing.
Miscellanea:
o Whitespace neatening around these conversions.
Signed-off-by: Joe Perches <joe@perches.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Joe Perches [Fri, 23 Mar 2018 22:54:38 +0000 (15:54 -0700)]
net: Use octal not symbolic permissions
Prefer the direct use of octal for permissions.
Done with checkpatch -f --types=SYMBOLIC_PERMS --fix-inplace
and some typing.
Miscellanea:
o Whitespace neatening around these conversions.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 26 Mar 2018 15:34:07 +0000 (11:34 -0400)]
Merge branch 'Drop-NETDEV_UNREGISTER_FINAL'
Kirill Tkhai says:
====================
Drop NETDEV_UNREGISTER_FINAL (was unnamed)
This series drops unused NETDEV_UNREGISTER_FINAL
after some preparations.
v2: New patch [2/3]. Use switch() in [1/3].
The first version was acked by Jason Gunthorpe,
and [1/3] was acked by David Ahern.
Since there are differences to v1, I haven't added
Acked-by tags of people. It would be nice, if you
fill OK to tag v2 too.
====================
Acked-by: Jason Gunthorpe <jgg@mellanox>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kirill Tkhai [Fri, 23 Mar 2018 16:47:39 +0000 (19:47 +0300)]
net: Drop NETDEV_UNREGISTER_FINAL
Last user is gone after
bdf5bd7f2132 "rds: tcp: remove
register_netdevice_notifier infrastructure.", so we can
remove this netdevice command. This allows to delete
rtnl_lock() in netdev_run_todo(), which is hot path for
net namespace unregistration.
dev_change_net_namespace() and netdev_wait_allrefs()
have rcu_barrier() before NETDEV_UNREGISTER_FINAL call,
and the source commits say they were introduced to
delemit the call with NETDEV_UNREGISTER, but this patch
leaves them on the places, since they require additional
analysis, whether we need in them for something else.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kirill Tkhai [Fri, 23 Mar 2018 16:47:29 +0000 (19:47 +0300)]
infiniband: Replace usnic_ib_netdev_event_to_string() with netdev_cmd_to_name()
This function just calls netdev_cmd_to_name().
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kirill Tkhai [Fri, 23 Mar 2018 16:47:19 +0000 (19:47 +0300)]
net: Make NETDEV_XXX commands enum { }
This patch is preparation to drop NETDEV_UNREGISTER_FINAL.
Since the cmd is used in usnic_ib_netdev_event_to_string()
to get cmd name, after plain removing NETDEV_UNREGISTER_FINAL
from everywhere, we'd have holes in event2str[] in this
function.
Instead of that, let's make NETDEV_XXX commands names
available for everyone, and to define netdev_cmd_to_name()
in the way we won't have to shaffle names after their
numbers are changed.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Madalin Bucur [Fri, 23 Mar 2018 14:52:25 +0000 (09:52 -0500)]
fsl/fman: remove unnecessary set_dma_ops() call and HAS_DMA dependency
The platform device is no longer used for DMA mapping so the
(questionable) setting of the DMA ops done here is no longer
needed. Removing it together with the HAS_DMA dependency that
it required.
Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 26 Mar 2018 15:29:11 +0000 (11:29 -0400)]
Merge branch 'ethernet-ave-add-UniPhier-PXs3-support'
Kunihiko Hayashi says:
====================
net: ethernet: ave: add UniPhier PXs3 support
Add ethernet controller support on UniPhier PXs3 SoC.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Kunihiko Hayashi [Fri, 23 Mar 2018 12:30:37 +0000 (21:30 +0900)]
net: ethernet: ave: add UniPhier PXs3 support
Add a compatible string and SoC data for ethernet controller on
UniPhier PXs3 SoC.
Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kunihiko Hayashi [Fri, 23 Mar 2018 12:30:36 +0000 (21:30 +0900)]
dt-bindings: net: ave: add PXs3 support
Add a compatible string for ethernet controller on UniPhier PXs3 SoC.
Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 26 Mar 2018 01:27:38 +0000 (21:27 -0400)]
Merge tag 'wireless-drivers-next-for-davem-2018-03-24' of git://git./linux/kernel/git/kvalo/wireless-drivers-next
Kalle Valo says:
====================
wireless-drivers-next patches for 4.17
The biggest changes are the bluetooth related patches to the rsi
driver. It adds a new bluetooth driver which communicates directly
with the wireless driver and the interface is defined in
include/net/rsi_91x.h.
Major changes:
wl1251
* read the MAC address from the NVS file
rtlwifi
* enable mac80211 fast-tx support
mt76
* add capability to select tx/rx antennas
mt7601
* let mac80211 validate rx CCMP Packet Number (PN)
rsi
* bluetooth: add new btrsi driver
* btcoex support with the new btrsi driver
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
kbuild test robot [Fri, 23 Mar 2018 19:47:42 +0000 (03:47 +0800)]
tipc: tipc_disc_addr_trial_msg() can be static
Fixes: 25b0b9c4e835 ("tipc: handle collisions of 32-bit node address hash values")
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Acked-by: Jon Maloy jon.maloy@ericsson.com
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Fri, 23 Mar 2018 11:36:15 +0000 (14:36 +0300)]
ibmvnic: Potential NULL dereference in clean_one_tx_pool()
There is an && vs || typo here, which potentially leads to a NULL
dereference.
Fixes: e9e1e97884b7 ("ibmvnic: Update TX pool cleaning routine")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ganesh Goudar [Fri, 23 Mar 2018 11:35:49 +0000 (17:05 +0530)]
cxgb4: support new ISSI flash parts
Add support for new 32MB and 64MB ISSI (Integrated Silicon
Solution, Inc.) FLASH parts.
Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ganesh Goudar [Fri, 23 Mar 2018 11:33:10 +0000 (17:03 +0530)]
cxgb4: depend on firmware event for link status
Depend on the firmware sending us link status changes,
rather than assuming that the link goes down upon L1
configuration.
Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arjun Vynipadath [Fri, 23 Mar 2018 10:18:46 +0000 (15:48 +0530)]
cxgb4: copy vlan_id in ndo_get_vf_config
Copy vlan_id to get it displayed in vf info.
Signed-off-by: Arjun Vynipadath <arjun@chelsio.com>
Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Ganesh Goudhar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arjun Vynipadath [Fri, 23 Mar 2018 09:55:10 +0000 (15:25 +0530)]
cxgb4: Setup FW queues before registering netdev
When NetworkManager is enabled, there are chances that interface up
is called even before probe completes. This means we have not yet
allocated the FW sge queues, hence rest of ingress queue allocation
wont be proper. Fix this by calling setup_fw_sge_queues() before
register_netdev().
Fixes: 0fbc81b3ad51 ('chcr/cxgb4i/cxgbit/RDMA/cxgb4: Allocate resources dynamically for all cxgb4 ULD's')
Signed-off-by: Arjun Vynipadath <arjun@chelsio.com>
Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 26 Mar 2018 00:48:26 +0000 (20:48 -0400)]
Merge branch 'broadcom-Adaptive-interrupt-coalescing'
Florian Fainelli says:
====================
net: broadcom: Adaptive interrupt coalescing
This patch series adds adaptive interrupt coalescing for the Gigabit Ethernet
drivers SYSTEMPORT and GENET.
This really helps lower the interrupt count and system load, as measured by
vmstat for a Gigabit TCP RX session:
SYSTEMPORT:
without:
1 0 0 192188 0 25472 0 0 0 0 122100 38870 1 42 57 0 0
[ ID] Interval Transfer Bandwidth
[ 4] 0.0-10.0 sec 1.03 GBytes 884 Mbits/sec
with:
1 0 0 192288 0 25468 0 0 0 0 58806 44401 0 100 0 0 0
[ 5] 0.0-10.0 sec 1.04 GBytes 888 Mbits/sec
GENET:
without:
1 0 0
1170404 0 25420 0 0 0 0 130785 63402 2 85 12 0 0
[ ID] Interval Transfer Bandwidth
[ 4] 0.0-10.0 sec 1.04 GBytes 888 Mbits/sec
with:
1 0 0
1170560 0 25420 0 0 0 0 50610 48477 0 100 0 0 0
[ 5] 0.0-10.0 sec 1.05 GBytes 899 Mbits/sec
Please look at the implementation and let me know if you see any problems, this
was largely inspired by bnxt_en.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Fri, 23 Mar 2018 01:19:33 +0000 (18:19 -0700)]
net: bcmgenet: Add support for adaptive RX coalescing
Unlike the moder modern SYSTEMPORT hardware, we do not have a
configurable TDMA timeout, which limits us to implement adaptive RX
interrupt coalescing only. We have each of our RX rings implement a
bcmgenet_net_dim structure which holds an interrupt counter, number of
packets, bytes, and a container for a net_dim instance.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Fri, 23 Mar 2018 01:19:32 +0000 (18:19 -0700)]
net: systemport: Implement adaptive interrupt coalescing
Implement support for adaptive RX and TX interrupt coalescing using
net_dim. We have each of our TX ring and our single RX ring implement a
bcm_sysport_net_dim structure which holds an interrupt counter, number
of packets, bytes, and a container for a net_dim instance.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 26 Mar 2018 00:43:42 +0000 (20:43 -0400)]
Merge branch 'mv88e6xxx-module-reloading'
Andrew Lunn says:
====================
Fixes to allow mv88e6xxx module to be reloaded
As reported by Uwe Kleine-König, the interrupt trigger is first
configured by DT and then reconfigured to edge. This results in a
failure on EPROBE_DEFER, or if the module is unloaded and reloaded.
A second crash happens on module reload due to a missing call to the
common IRQ free code when using polled interrupts.
With these fixes in place, it becomes possible to load and unload the
kernel modules a few times without it crashing.
v2: Fix the ü in Künig a couple of times
v3: But the ü should be an ö!
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Sun, 25 Mar 2018 21:43:15 +0000 (23:43 +0200)]
net: dsa: mv88e6xxx: Call the common IRQ free code
When free'ing the polled IRQs, call the common irq free code.
Otherwise the interrupts are left registered, and when we come to load
the driver a second time, we get an Opps.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Sun, 25 Mar 2018 21:43:14 +0000 (23:43 +0200)]
net: dsa: mv88e6xxx: Use the DT IRQ trigger mode
By calling request_threaded_irq() with the flag IRQF_TRIGGER_FALLING
we override the trigger mode provided in device tree. And the
interrupt is actually active low, which is what all the current device
tree descriptions use.
Suggested-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Tested-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Roman Mashak [Sun, 25 Mar 2018 21:20:06 +0000 (17:20 -0400)]
tc-testing: updated police, mirred, skbedit and skbmod with more tests
Added extra test cases for control actions (reclassify, pipe etc.),
cookies, max index value and police args sanity check.
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sun, 25 Mar 2018 21:07:41 +0000 (17:07 -0400)]
Merge branch 'hv_netvsc-Fix-improve-RX-path-error-handling'
Haiyang Zhang says:
====================
hv_netvsc: Fix/improve RX path error handling
Fix the status code returned to the host. Also add range
check for rx packet offset and length.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Haiyang Zhang [Thu, 22 Mar 2018 19:01:14 +0000 (12:01 -0700)]
hv_netvsc: Add range checking for rx packet offset and length
This patch adds range checking for rx packet offset and length.
It may only happen if there is a host side bug.
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Haiyang Zhang [Thu, 22 Mar 2018 19:01:13 +0000 (12:01 -0700)]
hv_netvsc: Fix the return status in RX path
As defined in hyperv_net.h, the NVSP_STAT_SUCCESS is one not zero.
Some functions returns 0 when it actually means NVSP_STAT_SUCCESS.
This patch fixes them.
In netvsc_receive(), it puts the last RNDIS packet's receive status
for all packets in a vmxferpage which may contain multiple RNDIS
packets.
This patch puts NVSP_STAT_FAIL in the receive completion if one of
the packets in a vmxferpage fails.
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sun, 25 Mar 2018 20:46:05 +0000 (16:46 -0400)]
Merge branch 'net-permit-skb_segment-on-head_frag-frag_list-skb'
Yonghong Song says:
====================
net: permit skb_segment on head_frag frag_list skb
One of our in-house projects, bpf-based NAT, hits a kernel BUG_ON at
function skb_segment(), line 3667. The bpf program attaches to
clsact ingress, calls bpf_skb_change_proto to change protocol
from ipv4 to ipv6 or from ipv6 to ipv4, and then calls bpf_redirect
to send the changed packet out.
...
3665 while (pos < offset + len) {
3666 if (i >= nfrags) {
3667 BUG_ON(skb_headlen(list_skb));
...
The triggering input skb has the following properties:
list_skb = skb->frag_list;
skb->nfrags != NULL && skb_headlen(list_skb) != 0
and skb_segment() is not able to handle a frag_list skb
if its headlen (list_skb->len - list_skb->data_len) is not 0.
Patch #1 provides a simple solution to avoid BUG_ON. If
list_skb->head_frag is true, its page-backed frag will
be processed before the list_skb->frags.
Patch #2 provides a test case in test_bpf module which
constructs a skb and calls skb_segment() directly. The test
case is able to trigger the BUG_ON without Patch #1.
The patch has been tested in the following setup:
ipv6_host <-> nat_server <-> ipv4_host
where nat_server has a bpf program doing ipv4<->ipv6
translation and forwarding through clsact hook
bpf_skb_change_proto.
Changelog:
v5 -> v6:
. Added back missed BUG_ON(!nfrags) for zero
skb_headlen(skb) case, plus a couple of
cosmetic changes, from Alexander.
v4 -> v5:
. Replace local variable head_frag with
a static inline function skb_head_frag_to_page_desc
which gets the head_frag on-demand. This makes
code more readable and also does not increase
the stack size, from Alexander.
. Remove the "if(nfrags)" guard for skb_orphan_frags
and skb_zerocopy_clone as I found that they can
handle zero-frag skb (with non-zero skb_headlen(skb))
properly.
. Properly release segment list from skb_segment()
in the test, from Eric.
v3 -> v4:
. Remove dynamic memory allocation and use rewinding
for both index and frag to remove one branch in fast path,
from Alexander.
. Fix a bunch of issues in test_bpf skb_segment() test,
including proper way to allocate skb, proper function
argument for skb_add_rx_frag and not freeint skb, etc.,
from Eric.
v2 -> v3:
. Use starting frag index -1 (instead of 0) to
special process head_frag before other frags in the skb,
from Alexander Duyck.
v1 -> v2:
. Removed never-hit BUG_ON, spotted by Linyu Yuan.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Yonghong Song [Wed, 21 Mar 2018 23:31:04 +0000 (16:31 -0700)]
net: bpf: add a test for skb_segment in test_bpf module
Without the previous commit,
"modprobe test_bpf" will have the following errors:
...
[ 98.149165] ------------[ cut here ]------------
[ 98.159362] kernel BUG at net/core/skbuff.c:3667!
[ 98.169756] invalid opcode: 0000 [#1] SMP PTI
[ 98.179370] Modules linked in:
[ 98.179371] test_bpf(+)
...
which triggers the bug the previous commit intends to fix.
The skbs are constructed to mimic what mlx5 may generate.
The packet size/header may not mimic real cases in production. But
the processing flow is similar.
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yonghong Song [Wed, 21 Mar 2018 23:31:03 +0000 (16:31 -0700)]
net: permit skb_segment on head_frag frag_list skb
One of our in-house projects, bpf-based NAT, hits a kernel BUG_ON at
function skb_segment(), line 3667. The bpf program attaches to
clsact ingress, calls bpf_skb_change_proto to change protocol
from ipv4 to ipv6 or from ipv6 to ipv4, and then calls bpf_redirect
to send the changed packet out.
3472 struct sk_buff *skb_segment(struct sk_buff *head_skb,
3473 netdev_features_t features)
3474 {
3475 struct sk_buff *segs = NULL;
3476 struct sk_buff *tail = NULL;
...
3665 while (pos < offset + len) {
3666 if (i >= nfrags) {
3667 BUG_ON(skb_headlen(list_skb));
3668
3669 i = 0;
3670 nfrags = skb_shinfo(list_skb)->nr_frags;
3671 frag = skb_shinfo(list_skb)->frags;
3672 frag_skb = list_skb;
...
call stack:
...
#1 [
ffff883ffef03558] __crash_kexec at
ffffffff8110c525
#2 [
ffff883ffef03620] crash_kexec at
ffffffff8110d5cc
#3 [
ffff883ffef03640] oops_end at
ffffffff8101d7e7
#4 [
ffff883ffef03668] die at
ffffffff8101deb2
#5 [
ffff883ffef03698] do_trap at
ffffffff8101a700
#6 [
ffff883ffef036e8] do_error_trap at
ffffffff8101abfe
#7 [
ffff883ffef037a0] do_invalid_op at
ffffffff8101acd0
#8 [
ffff883ffef037b0] invalid_op at
ffffffff81a00bab
[exception RIP: skb_segment+3044]
RIP:
ffffffff817e4dd4 RSP:
ffff883ffef03860 RFLAGS:
00010216
RAX:
0000000000002bf6 RBX:
ffff883feb7aaa00 RCX:
0000000000000011
RDX:
ffff883fb87910c0 RSI:
0000000000000011 RDI:
ffff883feb7ab500
RBP:
ffff883ffef03928 R8:
0000000000002ce2 R9:
00000000000027da
R10:
000001ea00000000 R11:
0000000000002d82 R12:
ffff883f90a1ee80
R13:
ffff883fb8791120 R14:
ffff883feb7abc00 R15:
0000000000002ce2
ORIG_RAX:
ffffffffffffffff CS: 0010 SS: 0018
#9 [
ffff883ffef03930] tcp_gso_segment at
ffffffff818713e7
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sun, 25 Mar 2018 20:24:34 +0000 (16:24 -0400)]
Merge branch '10GbE' of git://git./linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:
====================
10GbE Intel Wired LAN Driver Updates 2018-03-23
This series contains updates to ixgbe and ixgbevf only.
Paul adds status register reads to reduce a potential race condition
where registers can read 0xFFFFFFFF during a PCI reset, which in turn
causes the driver to remove the adapter. Then fixes an assignment
operation with an "OR" operation.
Shannon Nelson provides several IPsec offload cleanups to ixgbe, as well as a
patch to enable TSO with IPsec offload.
Tony provides the much anticipated XDP support for ixgbevf. Currently,
pass, drop and XDP_TX actions are supported, as well as meta data and
stats reporting.
Björn Töpel tweaks the page counting for XDP_REDIRECT, since a page can
have its reference count decreased via the xdp_do_redirect() call.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sun, 25 Mar 2018 20:18:55 +0000 (16:18 -0400)]
Merge branch 'liquidio-Tx-queue-cleanup'
Intiyaz Basha says:
====================
liquidio: Tx queue cleanup
Moved some common function to octeon_network.h
Removed some unwanted functions and checks.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Intiyaz Basha [Sat, 24 Mar 2018 00:37:44 +0000 (17:37 -0700)]
liquidio: Renamed txqs_start to start_txqs
For consistency renaming txqs_start to start_txqs
Signed-off-by: Intiyaz Basha <intiyaz.basha@cavium.com>
Acked-by: Derek Chickles <derek.chickles@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Intiyaz Basha [Sat, 24 Mar 2018 00:37:41 +0000 (17:37 -0700)]
liquidio: Renamed txqs_stop to stop_txqs
For consistency renaming txqs_stop to stop_txqs
Signed-off-by: Intiyaz Basha <intiyaz.basha@cavium.com>
Acked-by: Derek Chickles <derek.chickles@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Intiyaz Basha [Sat, 24 Mar 2018 00:37:39 +0000 (17:37 -0700)]
liquidio: Renamed txqs_wake to wake_txqs
For consistency renaming txqs_wake to wake_txqs
Signed-off-by: Intiyaz Basha <intiyaz.basha@cavium.com>
Acked-by: Derek Chickles <derek.chickles@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Intiyaz Basha [Sat, 24 Mar 2018 00:37:36 +0000 (17:37 -0700)]
liquidio: Function call skb_iq for deriving queue from skb
Using skb_iq function for deriving queue from skb
Signed-off-by: Intiyaz Basha <intiyaz.basha@cavium.com>
Acked-by: Derek Chickles <derek.chickles@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Intiyaz Basha [Sat, 24 Mar 2018 00:37:33 +0000 (17:37 -0700)]
liquidio: Removed one line function wake_q
Removing one line function wake_q
Signed-off-by: Intiyaz Basha <intiyaz.basha@cavium.com>
Acked-by: Derek Chickles <derek.chickles@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Intiyaz Basha [Sat, 24 Mar 2018 00:37:30 +0000 (17:37 -0700)]
liquidio: Removed one line function stop_q
Removing one line function stop_q
Signed-off-by: Intiyaz Basha <intiyaz.basha@cavium.com>
Acked-by: Derek Chickles <derek.chickles@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Intiyaz Basha [Sat, 24 Mar 2018 00:37:28 +0000 (17:37 -0700)]
liquidio: Removed netif_is_multiqueue check
Removing checks for netif_is_multiqueue.
Configuring single queue will be a multiqueue netdev with one queues.
Signed-off-by: Intiyaz Basha <intiyaz.basha@cavium.com>
Acked-by: Derek Chickles <derek.chickles@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Intiyaz Basha [Sat, 24 Mar 2018 00:37:25 +0000 (17:37 -0700)]
liquidio: Removed start_txq function
Removing start_txq function from VF and PF files
Signed-off-by: Intiyaz Basha <intiyaz.basha@cavium.com>
Acked-by: Derek Chickles <derek.chickles@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Intiyaz Basha [Sat, 24 Mar 2018 00:37:20 +0000 (17:37 -0700)]
liquidio: Removed one line function stop_txq
Removing one line function stop_txq
Signed-off-by: Intiyaz Basha <intiyaz.basha@cavium.com>
Acked-by: Derek Chickles <derek.chickles@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Intiyaz Basha [Sat, 24 Mar 2018 00:37:17 +0000 (17:37 -0700)]
liquidio: Moved common function skb_iq to to octeon_network.h
Moving common function skb_iq to to octeon_network.h
Signed-off-by: Intiyaz Basha <intiyaz.basha@cavium.com>
Acked-by: Derek Chickles <derek.chickles@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Intiyaz Basha [Sat, 24 Mar 2018 00:37:07 +0000 (17:37 -0700)]
liquidio: Moved common function txqs_start to octeon_network.h
Moving common function txqs_start to octeon_network.h
Signed-off-by: Intiyaz Basha <intiyaz.basha@cavium.com>
Acked-by: Derek Chickles <derek.chickles@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Intiyaz Basha [Sat, 24 Mar 2018 00:36:58 +0000 (17:36 -0700)]
liquidio: Moved common function txqs_wake to octeon_network.h
Moving common function txqs_wake to octeon_network.h
Signed-off-by: Intiyaz Basha <intiyaz.basha@cavium.com>
Acked-by: Derek Chickles <derek.chickles@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Intiyaz Basha [Sat, 24 Mar 2018 00:36:56 +0000 (17:36 -0700)]
liquidio: Moved common function txqs_stop to octeon_network.h
Moving common function txqs_stop to octeon_network.h
Signed-off-by: Intiyaz Basha <intiyaz.basha@cavium.com>
Acked-by: Derek Chickles <derek.chickles@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Davide Caratti [Fri, 23 Mar 2018 18:31:30 +0000 (19:31 +0100)]
net/sched: act_vlan: declare push_vid with host byte order
use u16 in place of __be16 to suppress the following sparse warnings:
net/sched/act_vlan.c:150:26: warning: incorrect type in assignment (different base types)
net/sched/act_vlan.c:150:26: expected restricted __be16 [usertype] push_vid
net/sched/act_vlan.c:150:26: got unsigned short
net/sched/act_vlan.c:151:21: warning: restricted __be16 degrades to integer
net/sched/act_vlan.c:208:26: warning: incorrect type in assignment (different base types)
net/sched/act_vlan.c:208:26: expected unsigned short [unsigned] [usertype] tcfv_push_vid
net/sched/act_vlan.c:208:26: got restricted __be16 [usertype] push_vid
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Davide Caratti [Fri, 23 Mar 2018 18:09:39 +0000 (19:09 +0100)]
net/sched: remove tcf_idr_cleanup()
tcf_idr_cleanup() is no more used, so remove it.
Suggested-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Fri, 23 Mar 2018 18:03:58 +0000 (21:03 +0300)]
mlxsw: spectrum_span: Prevent duplicate mirrors
In net commit
8175f7c4736f ("mlxsw: spectrum: Prevent duplicate
mirrors") we prevented the user from mirroring more than once from a
single binding point (port-direction pair).
The fix was essentially reverted in a merge conflict resolution when net
was merged into net-next. Restore it.
Fixes: 03fe2debbb27 ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Björn Töpel [Thu, 22 Mar 2018 09:02:36 +0000 (10:02 +0100)]
ixgbe: tweak page counting for XDP_REDIRECT
The current page counting scheme assumes that the reference count
cannot decrease until the received frame is sent to the upper layers
of the networking stack. This assumption does not hold for the
XDP_REDIRECT action, since a page (pointed out by xdp_buff) can have
its reference count decreased via the xdp_do_redirect call.
To work around that, we now start off by a large page count and then
don't allow a refcount less than two.
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>