Ilan Tayari [Mon, 27 Mar 2017 11:52:09 +0000 (14:52 +0300)]
net/mlx5: FPGA, Add FW commands for FPGA QPs
The FPGA QP is a high-bandwidth communication channel between the host
CPU and the FPGA device. It allows performing DMA operations between
host memory and the FPGA logic via the ConnectX chip.
Add ConnectX FW commands which create and manipulate FPGA QPs.
Signed-off-by: Ilan Tayari <ilant@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Ilan Tayari [Wed, 14 Jun 2017 07:19:54 +0000 (10:19 +0300)]
net/mlx5: FPGA, Move FPGA init/cleanup to init_once
The FPGA init and cleanup routines should be called just once per
device.
Move them to the init_once and cleanup_once routines.
Signed-off-by: Ilan Tayari <ilant@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Ilan Tayari [Sun, 26 Mar 2017 14:46:03 +0000 (17:46 +0300)]
net/mlx5: Add QP WQ support
A QP in ConnectX is a concatenation of RQ and SQ which share a QP-number
and work together.
Add support for allocating and managing the work-queue buffer for a QP, in
a similar way to how SQs and RQs are already supported.
Signed-off-by: Ilan Tayari <ilant@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Ilan Tayari [Mon, 19 Jun 2017 09:53:25 +0000 (12:53 +0300)]
net/mlx5: Make get_cqe routine not ethernet-specific
Move mlx5e_get_cqe routine to wq.h and rename it to
mlx5_cqwq_get_cqe.
This allows it to be used by other CQ users outside of the
ethernet driver code.
A later patch in this patchset will make use of it from
FPGA code for the FPGA high-speed connection.
Signed-off-by: Ilan Tayari <ilant@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Ilan Tayari [Sun, 14 May 2017 13:04:30 +0000 (16:04 +0300)]
IB/mlx5: Respect mlx5_core reserved GIDs
Reserved GIDs are taken by mlx5_core, so report a smaller GID table
size to IB core.
Set mlx5_query_roce_port's return value back to int. In case of
error, return an indication. This rolls back some of the change
in commit
50f22fd8ecf9 ("IB/mlx5: Set mlx5_query_roce_port's return value to void")
Change set_roce_addr to use gid_set function, instead of directly
sending the command.
Signed-off-by: Ilan Tayari <ilant@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Ilan Tayari [Sun, 26 Mar 2017 14:23:42 +0000 (17:23 +0300)]
net/mlx5: Add support for multiple RoCE enable
Previously, only mlx5_ib enabled RoCE on the port, but FPGA needs it as
well.
Add support for counting number of enables, so that FPGA and IB can work
in parallel and independently.
Program the HW to enable RoCE on the first enable call, and program to
disable RoCE on the last disable call.
Signed-off-by: Ilan Tayari <ilant@mellanox.com>
Reviewed-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
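The enable counting described in the commit above can be sketched as follows. This is a minimal userspace illustration, not the mlx5 code: the struct, the function names and the hw_writes counter are hypothetical, and a real driver would also need locking around the counter.

```c
#include <stdbool.h>

/* Hypothetical stand-in for per-device state; not the mlx5 structures. */
struct roce_state {
	unsigned int enable_count; /* number of outstanding enable calls */
	bool hw_enabled;           /* models the HW RoCE setting */
	unsigned int hw_writes;    /* how many times the "HW" was programmed */
};

/* Models programming the hardware; a real driver would issue a FW command. */
static int roce_program_hw(struct roce_state *st, bool enable)
{
	st->hw_enabled = enable;
	st->hw_writes++;
	return 0;
}

/* Enable RoCE for one user; only the first caller touches the hardware. */
int roce_enable(struct roce_state *st)
{
	int err = 0;

	if (st->enable_count == 0)
		err = roce_program_hw(st, true);
	if (!err)
		st->enable_count++;
	return err;
}

/* Drop one user; only the last caller disables RoCE in the hardware. */
int roce_disable(struct roce_state *st)
{
	int err = 0;

	if (st->enable_count == 0)
		return -1; /* unbalanced disable */
	st->enable_count--;
	if (st->enable_count == 0)
		err = roce_program_hw(st, false);
	return err;
}
```

With two users (say FPGA and IB) the hardware is programmed once on the first enable and once on the last disable, regardless of the order in which the users come and go.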
Ilan Tayari [Sun, 26 Mar 2017 14:01:57 +0000 (17:01 +0300)]
net/mlx5: Add reserved-gids support
Reserved GIDs are entries in the GID table in use by the mlx5_core
and its submodules (e.g. FPGA, SRIOV, E-Switch, netdev).
The entries are reserved at the high indexes of the GID table.
A mlx5 submodule may reserve a certain amount of GIDs for its own use
during the load sequence by calling mlx5_core_reserve_gids, and must
also take care to un-reserve these GIDs when it closes.
Reservation is only allowed during the load sequence and before any
interfaces (e.g. mlx5_ib or mlx5_en) are up.
After reservation, a submodule may call mlx5_core_reserved_gid_alloc/
free to allocate entries from the reserved GIDs pool.
Reserve a GID table entry for every supported FPGA QP.
A later patch in the patchset will remove them from being reported to
IB core.
Another such patch will make use of these for FPGA QPs in Innova NIC.
Added lib/mlx5.h to serve as a library for mlx5 submodules, and to
expose only public mlx5 API. More mlx5 library files will be added in
future submissions.
Signed-off-by: Ilan Tayari <ilant@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
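The reservation scheme described in the commit above (reserve at the high indexes during load, then allocate and free from that pool) could look roughly like this. It is a sketch under assumptions: the table size, struct layout and all function names are hypothetical and do not mirror mlx5_core_reserve_gids or mlx5_core_reserved_gid_alloc/free.

```c
#include <stdbool.h>

#define GID_TABLE_SIZE 16 /* hypothetical table size */

struct gid_table {
	unsigned int reserved;       /* entries reserved at the top of the table */
	bool interfaces_up;          /* reservation is refused once this is set */
	bool in_use[GID_TABLE_SIZE]; /* allocation state per entry */
};

/* Reserve 'count' entries; only allowed during the load sequence. */
int gid_reserve(struct gid_table *t, unsigned int count)
{
	if (t->interfaces_up)
		return -1;
	if (count > GID_TABLE_SIZE - t->reserved)
		return -1;
	t->reserved += count;
	return 0;
}

/* Number of entries that remain visible to other users of the table. */
unsigned int gid_public_size(const struct gid_table *t)
{
	return GID_TABLE_SIZE - t->reserved;
}

/* Allocate one entry from the reserved pool; returns its index or -1. */
int gid_reserved_alloc(struct gid_table *t)
{
	unsigned int i;

	for (i = GID_TABLE_SIZE - t->reserved; i < GID_TABLE_SIZE; i++) {
		if (!t->in_use[i]) {
			t->in_use[i] = true;
			return (int)i;
		}
	}
	return -1;
}

/* Return an entry to the reserved pool; other indexes are ignored. */
void gid_reserved_free(struct gid_table *t, int index)
{
	if (index >= (int)(GID_TABLE_SIZE - t->reserved) && index < GID_TABLE_SIZE)
		t->in_use[index] = false;
}
```

Keeping the reserved entries at the top of the table means the size reported to other users is simply the table size minus the reserved count.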
Ilan Tayari [Thu, 25 May 2017 05:42:07 +0000 (08:42 +0300)]
net/mlx5: Set interface flags before cleanup in unload_one
In load_one, the interface flags are changed from down to up,
only after initializing the interfaces.
In unload_one, the flags are changed from up to down before the
interface cleanup.
Change the cleanup order to be opposite to initialization order.
This fixes flag consistency between init and cleanup.
Signed-off-by: Ilan Tayari <ilant@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Colin Ian King [Mon, 26 Jun 2017 12:53:46 +0000 (13:53 +0100)]
net/mlx4: fix spelling mistake: "coalesing" -> "coalescing"
Trivial fix to spelling mistake in en_dbg debug message
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 27 Jun 2017 03:13:23 +0000 (23:13 -0400)]
Merge branch 'net-add-netlink_ext_ack-support-to-rtnl_link_ops'
Matthias Schiffer says:
====================
net: add netlink_ext_ack support to rtnl_link_ops
Same changes as http://patchwork.ozlabs.org/patch/780351/ , split into
separate patches for each rtnl_link_ops field as requested.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Matthias Schiffer [Sun, 25 Jun 2017 21:56:03 +0000 (23:56 +0200)]
net: add netlink_ext_ack argument to rtnl_link_ops.slave_validate
Add support for extended error reporting.
Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Matthias Schiffer [Sun, 25 Jun 2017 21:56:02 +0000 (23:56 +0200)]
net: add netlink_ext_ack argument to rtnl_link_ops.slave_changelink
Add support for extended error reporting.
Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Matthias Schiffer [Sun, 25 Jun 2017 21:56:01 +0000 (23:56 +0200)]
net: add netlink_ext_ack argument to rtnl_link_ops.validate
Add support for extended error reporting.
Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Matthias Schiffer [Sun, 25 Jun 2017 21:56:00 +0000 (23:56 +0200)]
net: add netlink_ext_ack argument to rtnl_link_ops.changelink
Add support for extended error reporting.
Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Matthias Schiffer [Sun, 25 Jun 2017 21:55:59 +0000 (23:55 +0200)]
net: add netlink_ext_ack argument to rtnl_link_ops.newlink
Add support for extended error reporting.
Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Grzeschik [Fri, 23 Jun 2017 14:54:10 +0000 (16:54 +0200)]
net: macb: add fixed-link node support
In case the MACB is directly connected to a
non-mdio PHY/device, it should be possible to provide
a fixed link configuration in the DT.
Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sun, 25 Jun 2017 18:45:34 +0000 (14:45 -0400)]
Merge tag 'wireless-drivers-next-for-davem-2017-06-25' of git://git./linux/kernel/git/kvalo/wireless-drivers-next
Kalle Valo says:
====================
wireless-drivers-next patches for 4.13
New features and bug fixes to quite a few different drivers, but
nothing really special standing out.
What makes me happy is that we have now more vendors actively
contributing to upstream drivers. In this pull request we have patches
from Broadcom, Intel, Qualcomm, Realtek and Redpine Signals, and I
still have patches from Marvell and Quantenna pending in patchwork. Now
that's something compared to how things looked 11 years ago in Jeff
Garzik's "State of the Union: Wireless" email:
https://lkml.org/lkml/2006/1/5/671
Major changes:
wil6210
* add low level RF sector interface via nl80211 vendor commands
* add module parameter ftm_mode to load separate firmware for factory
testing
* support devices with different PCIe bar size
* add support for PCIe D3hot in system suspend
* remove ioctl interface which should not be in a wireless driver
ath10k
* go back to using dma_alloc_coherent() for firmware scratch memory
* add per chain RSSI reporting
brcmfmac
* add support for multi-scheduled scan
* add scheduled scan support for specified BSSIDs
* add support for brcm43430 revision 0
wlcore
* add wl1285 compatible
rsi
* add RS9113 USB support
iwlwifi
* FW API documentation improvements (for tools and htmldoc)
* continuing work for the new A000 family
* bump the maximum supported FW API to 31
* improve the differentiation between 8000, 9000 and A000 families
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sun, 25 Jun 2017 18:43:53 +0000 (14:43 -0400)]
Merge branch 'sctp-RFC-4960-Errata-fixes'
Marcelo Ricardo Leitner says:
====================
sctp: RFC 4960 Errata fixes
This patchset contains fixes for 4 Errata topics from
https://tools.ietf.org/html/draft-ietf-tsvwg-rfc4960-errata-01
Namely, sections:
3.12. Order of Adjustments of partial_bytes_acked and cwnd
3.22. Increase of partial_bytes_acked in Congestion Avoidance
3.26. CWND Increase in Congestion Avoidance Phase
3.27. Refresh of cwnd and ssthresh after Idle Period
Tests performed with netperf using net namespaces, with drop rates at
0%, 0.5% and 1% by netem, IPv4 and IPv6, 10 runs for each combination.
I couldn't spot differences on the stats. With and without these patches
the results vary in a similar way in terms of throughput and
retransmissions.
Tests with 20ms delay and 20ms delay + drops at 0.5% and 1% also gave
similar results, with no noticeable difference.
Looking at cwnd, it was possible to notice slightly lower values being
used while still sustaining same throughput profile.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Marcelo Ricardo Leitner [Fri, 23 Jun 2017 22:59:36 +0000 (19:59 -0300)]
sctp: adjust ssthresh when transport is idle
RFC 4960 Errata 3.27 identifies that ssthresh should be adjusted to cwnd
because otherwise it could cause the transport to lock into congestion
avoidance phase, especially if ssthresh was previously reduced by some
packet drop, leading to poor performance.
The Errata says to adjust ssthresh to cwnd only once, though the same
goal is achieved by updating it every time we update cwnd too. The
caveat is that we could take longer to get back up to speed but that
should be compensated by the fact that we don't adjust on RTO basis (as
RFC says) but based on Heartbeats, which are usually way longer.
See-also: https://tools.ietf.org/html/draft-ietf-tsvwg-rfc4960-errata-01#section-3.27
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Marcelo Ricardo Leitner [Fri, 23 Jun 2017 22:59:35 +0000 (19:59 -0300)]
sctp: adjust cwnd increase in Congestion Avoidance phase
RFC4960 Errata 3.26 identified that, although RFC4960 states that cwnd
should never grow by more than 1*MTU per RTT, Section 7.2.2 was
underspecified and, as described, could allow cwnd to increase by more
than that.
This patch updates it so that partial_bytes_acked is capped at cwnd if
flight_size doesn't reach cwnd, protecting against that case.
See-also: https://tools.ietf.org/html/draft-ietf-tsvwg-rfc4960-errata-01#section-3.26
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
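The limit on partial_bytes_acked described in the commit above can be illustrated with a small sketch. The struct and function are hypothetical and only model the arithmetic; they are not the kernel's sctp transport code.

```c
#include <stdint.h>

/* Hypothetical per-transport congestion state; field names follow the text. */
struct ca_state {
	uint32_t cwnd;
	uint32_t partial_bytes_acked;
};

/*
 * Account newly acked bytes in congestion avoidance. If the sender was not
 * filling the window (flight_size < cwnd), limit partial_bytes_acked to cwnd
 * so that credit cannot pile up and later let cwnd grow by more than one
 * MTU per RTT.
 */
void ca_account_acked(struct ca_state *s, uint32_t bytes_acked,
		      uint32_t flight_size)
{
	s->partial_bytes_acked += bytes_acked;
	if (flight_size < s->cwnd && s->partial_bytes_acked > s->cwnd)
		s->partial_bytes_acked = s->cwnd;
}
```

For example, with cwnd = 3000 and partial_bytes_acked = 2500, acking 1000 bytes while only 1500 bytes were in flight leaves partial_bytes_acked at 3000 rather than 3500.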
Marcelo Ricardo Leitner [Fri, 23 Jun 2017 22:59:34 +0000 (19:59 -0300)]
sctp: allow increasing cwnd regardless of ctsn moving or not
As per RFC4960 Errata 3.22, this condition is not needed anymore as it
could cause the partial_bytes_acked to not consider the TSNs acked in
the Gap Ack Blocks although they were received by the peer successfully.
This patch thus drops the check for new Cumulative TSN Ack Point,
leaving just the flight_size < cwnd one.
See-also: https://tools.ietf.org/html/draft-ietf-tsvwg-rfc4960-errata-01#section-3.22
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Marcelo Ricardo Leitner [Fri, 23 Jun 2017 22:59:33 +0000 (19:59 -0300)]
sctp: update order of adjustments of partial_bytes_acked and cwnd
RFC4960 Errata 3.12 says RFC4960 is unclear about the order of
adjustments applied to partial_bytes_acked and cwnd in the congestion
avoidance phase, and that the actual order should be:
partial_bytes_acked is reset to (partial_bytes_acked - cwnd). Next, cwnd
is increased by MTU.
We were first increasing cwnd, and then subtracting the new value from pba,
which leads to a different result as pba ends up smaller than it should be
and could cause cwnd to not grow as much.
See-also: https://tools.ietf.org/html/draft-ietf-tsvwg-rfc4960-errata-01#section-3.12
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
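The effect of the ordering discussed in the commit above can be shown with a short sketch. Both functions are hypothetical and only model the arithmetic; they assume partial_bytes_acked >= cwnd on entry, and the older ordering is given a clamp at zero here so the unsigned subtraction cannot wrap.

```c
#include <stdint.h>

struct cwnd_state {
	uint32_t cwnd;
	uint32_t partial_bytes_acked;
};

/* Order given by the errata: subtract the old cwnd, then grow cwnd. */
void grow_errata_order(struct cwnd_state *s, uint32_t mtu)
{
	s->partial_bytes_acked -= s->cwnd;
	s->cwnd += mtu;
}

/* Previous order: grow cwnd first, then subtract the already grown value. */
void grow_old_order(struct cwnd_state *s, uint32_t mtu)
{
	s->cwnd += mtu;
	if (s->partial_bytes_acked > s->cwnd)
		s->partial_bytes_acked -= s->cwnd;
	else
		s->partial_bytes_acked = 0; /* clamp instead of wrapping */
}
```

Starting from cwnd = 3000, partial_bytes_acked = 3200 and an MTU of 1500, the errata order leaves 200 bytes of credit for the next increase, while the older order leaves none.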
Mintz, Yuval [Sun, 25 Jun 2017 08:09:12 +0000 (11:09 +0300)]
net: Remove ndo_dfwd_start_xmit
Looks like commit
f663dd9aaf9e ("net: core: explicitly select a txq before doing l2 forwarding")
has removed the need for this dedicated xmit function [it even explicitly
states so in its commit log message] but it hasn't removed the definition
of the ndo.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
CC: Jason Wang <jasowang@redhat.com>
CC: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sun, 25 Jun 2017 15:44:29 +0000 (11:44 -0400)]
Merge branch 'qcom-emac-various-minor-improvements'
Timur Tabi says:
====================
net: qcom/emac: various minor improvements
A collection of minor fixes and features to the Qualcomm Technologies
EMAC network driver.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Timur Tabi [Fri, 23 Jun 2017 19:33:30 +0000 (14:33 -0500)]
net: qcom/emac: add support for emulation systems
On emulation systems, the EMAC's internal PHY ("SGMII") is not present,
but is not needed for network functionality. So just display a warning
message and ignore the SGMII.
Tested-by: Philip Elcan <pelcan@codeaurora.org>
Tested-by: Adam Wallis <awallis@codeaurora.org>
Signed-off-by: Timur Tabi <timur@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Timur Tabi [Fri, 23 Jun 2017 19:33:29 +0000 (14:33 -0500)]
net: qcom/emac: do not reset the EMAC during initialization
On ACPI systems, the driver depends on firmware pre-initializing the
EMAC because we don't have access to the clocks, and the EMAC has specific
clock programming requirements. Therefore, we don't want to reset the
EMAC while we are completing the initialization.
Tested-by: Richard Ruigrok <rruigrok@codeaurora.org>
Signed-off-by: Timur Tabi <timur@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Timur Tabi [Fri, 23 Jun 2017 19:33:28 +0000 (14:33 -0500)]
net: qcom/emac: add shutdown function
The shutdown function halts all DMA and interrupts, so that all
operations are discontinued when the system shuts down, e.g. via
kexec or a forced reboot.
Tested-by: Tyler Baicar <tbaicar@codeaurora.org>
Signed-off-by: Timur Tabi <timur@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mateusz Jurczyk [Fri, 23 Jun 2017 17:32:28 +0000 (19:32 +0200)]
af_iucv: Move sockaddr length checks to before accessing sa_family in bind and connect handlers
Verify that the caller-provided sockaddr structure is large enough to
contain the sa_family field, before accessing it in bind() and connect()
handlers of the AF_IUCV socket. Since neither syscall enforces a minimum
size of the corresponding memory region, very short sockaddrs (zero or
one byte long) result in operating on uninitialized memory while
referencing .sa_family.
Fixes: 52a82e23b9f2 ("af_iucv: Validate socket address length in iucv_sock_bind()")
Signed-off-by: Mateusz Jurczyk <mjurczyk@google.com>
[jwi: removed unneeded null-check for addr]
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
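The check described in the commit above, validating the length before reading sa_family, can be sketched in userspace as below. The helper name and its signature are hypothetical; the af_iucv handlers themselves are not reproduced here.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>

/*
 * Hypothetical validation helper: reject buffers too short to hold
 * sa_family before reading it, then check the family itself.
 */
int check_sockaddr_family(const struct sockaddr *addr, size_t addr_len,
			  sa_family_t expected)
{
	/* sizeof does not evaluate its operand, so addr is not read here. */
	if (addr_len < offsetof(struct sockaddr, sa_family) +
		       sizeof(addr->sa_family))
		return -EINVAL;
	if (addr->sa_family != expected)
		return -EINVAL;
	return 0;
}
```

The order of the two tests is the point: a zero or one byte sockaddr is rejected by the length test, so the sa_family read never touches memory the caller did not provide.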
Hans Wippel [Fri, 23 Jun 2017 17:32:27 +0000 (19:32 +0200)]
net/iucv: improve endianness handling
Use proper endianness conversion for an skb protocol assignment. Given
that IUCV is only available on big endian systems (s390), this simply
avoids an endianness warning reported by sparse.
Signed-off-by: Hans Wippel <hwippel@linux.vnet.ibm.com>
Reviewed-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Reviewed-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Fri, 23 Jun 2017 15:17:04 +0000 (18:17 +0300)]
net: dsa: mv88e6xxx: fix error code in mv88e6390_serdes_power()
We're accidentally returning the wrong variable. "cmode" is
uninitialized at this point so it causes a static checker warning.
Fixes: 6335e9f2446b ("net: dsa: mv88e6xxx: mv88e6390X SERDES support")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sun, 25 Jun 2017 15:42:09 +0000 (11:42 -0400)]
Merge branch 'nfp-add-flower-app-with-representors'
Simon Horman says:
====================
nfp: add flower app with representors
this series adds a flower app to the NFP driver.
It initialises four types of netdevs:
* PF netdev - lower-device for communication of packets to device
* PF representor netdev
* VF representor netdevs
* Phys port representor netdevs
The PF netdev acts as a lower-device which sends and receives packets to
and from the firmware. The representors act as upper-devices. For TX,
representors attach a metadata dst to the skb which is used by the PF
netdev to prepend metadata to the packet before forwarding to the firmware. On
RX the PF netdev looks up the representor based on the prepended metadata
received from the firmware and forwards the skb to the representor after
removing the metadata.
Control queues are used to send and receive control messages which are
used to communicate configuration information with the firmware. These
are in a separate vNIC from the queues belonging to the PF netdev. The control
queues are not exposed to user-space via a netdev or any other means.
The first 9 patches of this series provide app-independent infrastructure
to instantiate representors and the remaining 3 patches provide an app
which uses this infrastructure.
As the name implies this app is targeted at providing offload of TC flower.
Flower offload - allowing classifiers to be attached to representor netdevs
- is intended to be provided by follow-up patches at which point it will
become the dominant feature of the app.
Minor changes since v2 noted in changelogs of individual patches.
Review of v1 and v2 of this patchset have been addressed either
through discussion on-list or changes in this patchset.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Simon Horman [Fri, 23 Jun 2017 20:12:09 +0000 (22:12 +0200)]
nfp: add VF and PF representors to flower app
Initialise VF and PF representors in flower app.
Based in part on work by Benjamin LaHaise, Bert van Leeuwen and
Jakub Kicinski.
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Simon Horman [Fri, 23 Jun 2017 20:12:08 +0000 (22:12 +0200)]
nfp: add flower app
Add app for flower offload. At this point the PF netdev and phys port
representor netdevs are initialised. Follow-up work will add support for
VF and PF representors and beyond that offloading the flower classifier.
Based in part on work by Benjamin LaHaise and Bert van Leeuwen.
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Simon Horman [Fri, 23 Jun 2017 20:12:07 +0000 (22:12 +0200)]
nfp: add support for control messages for flower app
In preparation for adding a new flower app - targeted at offloading
the flower classifier - provide support for control message that it will
use to communicate with the NFP.
Based in part on work by Bert van Leeuwen.
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Simon Horman [Fri, 23 Jun 2017 20:12:06 +0000 (22:12 +0200)]
nfp: add support for tx/rx with metadata portid
Allow tx/rx with metadata port id. This will be used for tx/rx of
representor netdevs acting as upper-devices while a pf netdev acts
as a lower-device.
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Simon Horman [Fri, 23 Jun 2017 20:12:05 +0000 (22:12 +0200)]
nfp: provide nfp_port to of nfp_net_get_mac_addr()
Provide port rather than vNIC as parameter of nfp_net_get_mac_addr.
This is to allow this function to be used by representor netdevs where
a vNIC may have more than one physical port none of which are associated
with the vNIC.
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Simon Horman [Fri, 23 Jun 2017 20:12:04 +0000 (22:12 +0200)]
nfp: app callbacks for SRIOV
Add app-callbacks for app-specific initialisation of SRIOV.
Disabling SRIOV is brought forward in nfp_pci_remove()
so that nfp_app_sriov_disable is called while the app still exists.
This is intended to be used to implement representor netdevs for virtual
ports.
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Simon Horman [Fri, 23 Jun 2017 20:12:03 +0000 (22:12 +0200)]
nfp: add stats and xmit helpers for representors
Provide helpers for stats and xmit on representor netdevs.
Parts based on work by Bert van Leeuwen, Benjamin LaHaise and
Jakub Kicinski.
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Simon Horman [Fri, 23 Jun 2017 20:12:02 +0000 (22:12 +0200)]
nfp: general representor implementation
Provide infrastructure to create and destroy representors of a given type.
Parts based on work by Bert van Leeuwen, Benjamin LaHaise,
and Jakub Kicinski.
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Simon Horman [Fri, 23 Jun 2017 20:12:01 +0000 (22:12 +0200)]
nfp: map mac_stats and vf_cfg BARs
If present map mac_stats and vf_cfg BARs. These will be used by
representor netdevs to read statistics for phys port and vf representors.
Also provide defines describing the layout of the mac_stats area.
Similar defines are already present for the vf_cfg area.
Based in part on work by Jakub Kicinski.
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 23 Jun 2017 20:12:00 +0000 (22:12 +0200)]
nfp: move physical port init into a helper
Move MAC/PHY port init into a helper to make it easier to reuse
it in the representor code.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 23 Jun 2017 20:11:59 +0000 (22:11 +0200)]
nfp: devlink add support for getting eswitch mode
Add app callback for reporting eswitch mode. Non-SRIOV apps
should not implement this callback, nfp_app code will then
respond with -EOPNOTSUPP.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
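The optional-callback pattern described in the commit above can be sketched as follows. All type and function names are hypothetical stand-ins rather than the nfp_app definitions; the demo callback and ops tables exist only to make the sketch self-contained.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical app ops table; only the shape matters for this sketch. */
struct app_ops {
	int (*eswitch_mode_get)(void *priv, unsigned short *mode);
};

struct app {
	const struct app_ops *ops;
	void *priv;
};

/* Dispatch wrapper: apps that leave the callback NULL get -EOPNOTSUPP. */
int app_eswitch_mode_get(struct app *app, unsigned short *mode)
{
	if (!app || !app->ops->eswitch_mode_get)
		return -EOPNOTSUPP;
	return app->ops->eswitch_mode_get(app->priv, mode);
}

/* Example callback for an SR-IOV capable app; always reports mode 1. */
static int demo_mode_get(void *priv, unsigned short *mode)
{
	(void)priv;
	*mode = 1;
	return 0;
}

/* One app implements the callback, the other leaves it unset. */
const struct app_ops demo_sriov_ops = { .eswitch_mode_get = demo_mode_get };
const struct app_ops demo_plain_ops = { .eswitch_mode_get = NULL };
```

Putting the NULL test in the common wrapper keeps every non-SRIOV app free of boilerplate: not implementing the callback is enough to report that the operation is unsupported.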
Jakub Kicinski [Fri, 23 Jun 2017 20:11:58 +0000 (22:11 +0200)]
net: store port/representator id in metadata_dst
Switches and modern SR-IOV enabled NICs may multiplex traffic from Port
representators and control messages over a single set of hardware queues.
Control messages and muxed traffic may need ordered delivery.
Those requirements make it hard to comfortably use TC infrastructure today
unless we have a way of attaching metadata to skbs at the upper device.
Because a single set of queues is used for many netdevs, stopping TC/sched
queues of all of them reliably is impossible and the lower device has to
retreat to returning NETDEV_TX_BUSY and usually has to take extra locks on
the fastpath.
This patch attempts to enable port/representative devs to attach metadata
to skbs which carry port id. This way representatives can be queueless and
all queuing can be performed at the lower netdev in the usual way.
Traffic arriving on the port/representative interfaces will have
metadata attached and will subsequently be queued to the lower device for
transmission. The lower device should recognize the metadata and translate
it to HW specific format which is most likely either a special header
inserted before the network headers or descriptor/metadata fields.
Metadata is associated with the lower device by storing the netdev pointer
along with port id so that if TC decides to redirect or mirror the new
netdev will not try to interpret it.
This is mostly for SR-IOV devices since switches don't have lower netdevs
today.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
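The idea of storing the lower netdev pointer next to the port id, as described in the commit above, can be sketched like this. The types are hypothetical simplifications and do not reproduce struct metadata_dst or sk_buff.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct demo_netdev { const char *name; };

/* Hypothetical metadata attached to a packet by a representor. */
struct port_metadata {
	uint32_t port_id;                    /* which port the packet is for */
	const struct demo_netdev *lower_dev; /* device allowed to interpret it */
};

struct demo_packet {
	const struct port_metadata *md; /* may be NULL */
};

/*
 * Called by a lower device on transmit. Returns true and fills *port_id only
 * if metadata is present and was addressed to this device; a packet that was
 * redirected to another netdev is treated as carrying no usable metadata.
 */
bool lower_dev_get_port_id(const struct demo_netdev *self,
			   const struct demo_packet *pkt, uint32_t *port_id)
{
	if (!pkt->md || pkt->md->lower_dev != self)
		return false;
	*port_id = pkt->md->port_id;
	return true;
}
```

The pointer comparison is what keeps a redirect or mirror safe: a device that was never the intended lower device sees a mismatch and ignores the port id.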
David S. Miller [Fri, 23 Jun 2017 19:06:44 +0000 (15:06 -0400)]
Merge branch 'phy-internal'
Florian Fainelli says:
====================
net: phy: Support "internal" PHY interface
This makes the "internal" phy-mode property generally available and
documented and this allows us to remove some custom parsing code
we had for bcmgenet and bcm_sf2 which both used that specific value.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Fri, 23 Jun 2017 17:33:16 +0000 (10:33 -0700)]
net: dsa: bcm_sf2: Remove special handling of "internal" phy-mode
The PHY library now supports an "internal" phy-mode, thus making our
custom parsing code now unnecessary.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Fri, 23 Jun 2017 17:33:15 +0000 (10:33 -0700)]
net: bcmgenet: Remove special handling of "internal" phy-mode
The PHY library now supports an "internal" phy-mode, thus making our
custom parsing code now unnecessary.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Fri, 23 Jun 2017 17:33:14 +0000 (10:33 -0700)]
net: phy: Support "internal" PHY interface
Now that the Device Tree binding has been updated, update the PHY
library phy_interface_t and phy_modes to support the "internal" PHY
interface type.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Fri, 23 Jun 2017 17:33:13 +0000 (10:33 -0700)]
dt-bindings: Add "internal" as a valid 'phy-mode' property
A number of Ethernet MACs have internal Ethernet PHYs and the internal
wiring makes it so that this knowledge needs to be available using the
standard 'phy-mode' property.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 23 Jun 2017 18:24:28 +0000 (14:24 -0400)]
Merge tag 'mlx5-updates-2017-06-23' of git://git./linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5-updates-2017-06-23
This series provides some updates to the mlx5 core and netdevice drivers.
Three patches from Tariq introduce a page reuse mechanism in the non-Striding
RQ RX datapath: the RX descriptor is allowed to reuse its allocated page
as much as it can, until the page is fully consumed. RX page reuse
reduces the stress on the page allocator and improves RX performance, especially
at high speeds (100Gb/s).
The next four patches of the series, from Or, allow offloading tc flower matching
on ttl/hoplimit and header re-write of hoplimit.
The rest of the series, from Yotam and Or, enhances mlx5 to support FW flashing
through the mlxfw module, in a manner similar to the mlxsw driver.
Currently, only ethtool based flashing is implemented, where both Eth and IB ports
are supported.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Arjun Vynipadath [Fri, 23 Jun 2017 13:44:37 +0000 (19:14 +0530)]
cxgb4: Use Firmware params to get buffer-group map
Buffer group mappings can be obtained using FW_PARAMs cmd for newer FW.
Since some of the bg_maps are obtained in atomic context, add another
function, t4_query_params_ns(), that won't sleep when awaiting mbox cmd completion.
Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Arjun Vynipadath <arjun@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arjun Vynipadath [Fri, 23 Jun 2017 13:44:36 +0000 (19:14 +0530)]
cxgb4: Update T6 Buffer Group and Channel Mappings
We were using t4_get_mps_bg_map() for both t4_get_port_stats()
to determine which MPS Buffer Groups to report statistics on for a given
Port, and also for t4_sge_alloc_rxq() to provide a TP Ingress Channel
Congestion Map. For T4/T5 these are actually the same values (because they
are ~somewhat~ related), but for T6 they should return different values
(T6 has Port 0 associated with MPS Buffer Group 0 (with MPS Buffer Group 1
silently cascading off) and Port 1 is associated with MPS Buffer Group 2
(with 3 cascading off)).
Based on the original work by Casey Leedom <leedom@chelsio.com>
Signed-off-by: Arjun Vynipadath <arjun@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Fri, 23 Jun 2017 10:15:44 +0000 (13:15 +0300)]
tls: return -EFAULT if copy_to_user() fails
The copy_to_user() function returns the number of bytes remaining but we
want to return -EFAULT here.
Fixes: 3c4d7559159b ("tls: kernel TLS support")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
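The return-value pitfall described in the commit above can be reproduced in userspace with a stand-in for copy_to_user(). Everything here is hypothetical: fake_copy_to_user() only imitates the "bytes not copied" convention, with a NULL destination modelling a fault, and the two getopt helpers are illustrative names.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Userspace stand-in for copy_to_user(): returns the number of bytes that
 * could NOT be copied (0 on success). A NULL destination models a fault.
 */
static unsigned long fake_copy_to_user(void *to, const void *from,
				       unsigned long n)
{
	if (!to)
		return n;
	memcpy(to, from, n);
	return 0;
}

/* Buggy pattern: leaks the "bytes remaining" count as a positive return. */
long getopt_buggy(void *optval, const void *src, unsigned long len)
{
	return (long)fake_copy_to_user(optval, src, len);
}

/* Fixed pattern: any non-zero remainder is turned into -EFAULT. */
long getopt_fixed(void *optval, const void *src, unsigned long len)
{
	if (fake_copy_to_user(optval, src, len))
		return -EFAULT;
	return 0;
}
```

A caller that tests for a negative value would treat the positive count from the buggy variant as something other than an error, which is why the remainder has to be converted explicitly.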
David S. Miller [Fri, 23 Jun 2017 18:17:31 +0000 (14:17 -0400)]
Merge branch 'master' of git://git./linux/kernel/git/klassert/ipsec-next
Steffen Klassert says:
====================
pull request (net-next): ipsec-next 2017-06-23
1) Use memdup_user to simplify xfrm_user_policy.
From Geliang Tang.
2) Make xfrm_dev_register static to silence a sparse warning.
From Wei Yongjun.
3) Use crypto_memneq to check the ICV in the AH protocol.
From Sabrina Dubroca.
4) Remove some unused variables in esp6.
From Stephen Hemminger.
5) Extend XFRM MIGRATE to allow to change the UDP encapsulation port.
From Antony Antony.
6) Include the UDP encapsulation port to km_migrate announcements.
From Antony Antony.
Please pull or let me know if there are problems.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 23 Jun 2017 18:15:12 +0000 (14:15 -0400)]
Merge branch 'ena-new-features-and-improvements'
Netanel Belgazal says:
====================
net: update ena ethernet driver to version 1.2.0
This patchset contains some new features/improvements that were added
to the ENA driver to increase its robustness and are based on
experience of wide ENA deployment.
Change log:
V2:
* Remove the patch that adds inline to a C-file static function (contradicts coding style).
* Remove the patch that validates the MTU parameter in ena_change_mtu() instead of
using the network stack.
* Use upper_32_bits()/lower_32_bits() instead of casting.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Netanel Belgazal [Fri, 23 Jun 2017 08:22:00 +0000 (11:22 +0300)]
net: ena: update ena driver to version 1.2.0
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Netanel Belgazal [Fri, 23 Jun 2017 08:21:59 +0000 (11:21 +0300)]
net: ena: update driver's rx drop statistics
The rx drop counter is reported by the device in the keep-alive
event.
Update the driver's counter with the device counter.
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Netanel Belgazal [Fri, 23 Jun 2017 08:21:58 +0000 (11:21 +0300)]
net: ena: use lower_32_bits()/upper_32_bits() to split dma address
In ena_com_mem_addr_set(), use the above functions to split dma address
to the lower 32 bits and the higher 16 bits.
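A rough userspace sketch of the split described above. The helpers lower_32() and upper_32() are local re-implementations written for this example (they are not the kernel macros themselves), and struct mem_addr with its field names is a hypothetical layout chosen to match the "lower 32 bits / higher 16 bits" wording.

```c
#include <stdint.h>

/* Hypothetical descriptor field: a 48-bit address stored as 32 + 16 bits. */
struct mem_addr {
    uint32_t addr_low;
    uint16_t addr_high;
};

/* Local equivalents of the idea behind lower_32_bits()/upper_32_bits(). */
static uint32_t lower_32(uint64_t n)
{
    return (uint32_t)(n & 0xffffffffu);
}

static uint32_t upper_32(uint64_t n)
{
    return (uint32_t)(n >> 32);
}

/* Returns 0 on success, -1 if the address does not fit in 48 bits. */
static int mem_addr_set(struct mem_addr *out, uint64_t dma)
{
    if (dma >> 48)
        return -1;

    out->addr_low = lower_32(dma);
    out->addr_high = (uint16_t)upper_32(dma);
    return 0;
}
```

Named helpers make the intent explicit and avoid the shift-by-32 pitfalls that open-coded casts can hit when the address type is only 32 bits wide.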
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Netanel Belgazal [Fri, 23 Jun 2017 08:21:57 +0000 (11:21 +0300)]
net: ena: separate skb allocation to dedicated function
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Netanel Belgazal [Fri, 23 Jun 2017 08:21:56 +0000 (11:21 +0300)]
net: ena: use napi_schedule_irqoff when possible
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Netanel Belgazal [Fri, 23 Jun 2017 08:21:55 +0000 (11:21 +0300)]
net: ena: allow the driver to work with small number of msix vectors
Current driver tries to allocate msix vectors as the number of the
negotiated io queues. (with another msix vector for management).
If pci_alloc_irq_vectors() fails, the driver aborts the probe
and the ENA network device is never brought up.
With this patch, the driver's logic will reduce the number of IO
queues to the number of allocated msix vectors (minus one for management)
instead of failing probe().
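The fallback can be sketched as a small pure function. This is only an illustration of the arithmetic in the message; the function name and the rule that at least one IO vector must remain are assumptions, not taken from the driver.

```c
/* Given the number of IO queues the driver negotiated and the number of
 * MSI-X vectors actually allocated, return how many IO queues to use.
 * One vector is always reserved for management.
 * Returns -1 when no vector is left for IO (assumed failure case). */
static int io_queues_for_vectors(int wanted_io_queues, int allocated_vectors)
{
    int io_vectors = allocated_vectors - 1; /* minus one for management */

    if (io_vectors < 1)
        return -1;

    return wanted_io_queues < io_vectors ? wanted_io_queues : io_vectors;
}
```

For instance, a device that negotiated 8 IO queues but only received 5 vectors would come up with 4 IO queues rather than failing probe().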
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Netanel Belgazal [Fri, 23 Jun 2017 08:21:54 +0000 (11:21 +0300)]
net: ena: add support for out of order rx buffers refill
The ENA driver posts Rx buffers through the Rx submission queue
for the ENA device to fill them with received packets.
Each Rx buffer is marked with a req_id in the Rx descriptor.
Newer ENA devices could consume the posted Rx buffers out of order,
and as a result the corresponding Rx completion queue will have Rx
completion descriptors with non-contiguous req_id(s).
In this change the driver holds two rings.
The first ring (called free_rx_ids) is a mapping ring.
It holds all the unused request ids.
The values in this ring are from 0 to ring_size - 1.
When the driver wants to allocate a new Rx buffer it takes the head of
free_rx_ids and uses its value as the index into the rx_buffer_info ring.
The req_id is also written to the Rx descriptor.
Upon Rx completion,
the driver takes the req_id from the completion descriptor and uses it
as the index into rx_buffer_info.
The req_id is then returned to the free_rx_ids ring.
This patch also adds statistics to report when the driver receives an out
of range or unused req_id.
Note:
free_rx_ids is only accessible from the napi handler, so no locking is
required.
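The id-recycling scheme above can be sketched in userspace as follows. Only free_rx_ids and the notion of a per-buffer info array come from the message; the struct layout, the in_use[] array (standing in for rx_buffer_info occupancy), the counters and the tiny ring size are assumptions made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 4u

struct rx_ring {
    uint16_t     free_rx_ids[RING_SIZE]; /* mapping ring of unused ids */
    bool         in_use[RING_SIZE];      /* stand-in for rx_buffer_info[] */
    unsigned int next_to_use;            /* head: next free id to hand out */
    unsigned int next_to_clean;          /* where returned ids are stored */
    unsigned int outstanding;            /* buffers currently posted */
    unsigned int bad_req_id;             /* out of range or unused req_id */
};

static void rx_ring_init(struct rx_ring *r)
{
    for (unsigned int i = 0; i < RING_SIZE; i++) {
        r->free_rx_ids[i] = (uint16_t)i; /* values 0 .. ring_size - 1 */
        r->in_use[i] = false;
    }
    r->next_to_use = 0;
    r->next_to_clean = 0;
    r->outstanding = 0;
    r->bad_req_id = 0;
}

/* Post a buffer: take the id at the head of free_rx_ids.
 * Returns the req_id, or -1 when every id is already posted. */
static int rx_ring_alloc(struct rx_ring *r)
{
    uint16_t req_id;

    if (r->outstanding == RING_SIZE)
        return -1;

    req_id = r->free_rx_ids[r->next_to_use % RING_SIZE];
    r->next_to_use++;
    r->in_use[req_id] = true;
    r->outstanding++;
    return req_id;
}

/* Completion: validate the req_id, then return it to free_rx_ids.
 * Returns 0 on success, -1 (and bumps the statistic) on a bad id. */
static int rx_ring_complete(struct rx_ring *r, unsigned int req_id)
{
    if (req_id >= RING_SIZE || !r->in_use[req_id]) {
        r->bad_req_id++;
        return -1;
    }

    r->in_use[req_id] = false;
    r->free_rx_ids[r->next_to_clean % RING_SIZE] = (uint16_t)req_id;
    r->next_to_clean++;
    r->outstanding--;
    return 0;
}
```

Because completions can never outnumber allocations, returned ids are always written into slots the head has already consumed, so the two cursors cannot collide. The sketch has no locking, mirroring the note that the ring is only touched from one context.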
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Netanel Belgazal [Fri, 23 Jun 2017 08:21:53 +0000 (11:21 +0300)]
net: ena: add reset reason for each device FLR
For each device reset, log to the device the cause of
the reset.
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Netanel Belgazal [Fri, 23 Jun 2017 08:21:52 +0000 (11:21 +0300)]
net: ena: change sizeof() argument to be the type pointer
Instead of using:
memset(ptr, 0x0, sizeof(struct ...))
use:
memset(ptr, 0x0, sizeof(*ptr))
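A small illustration of the preferred form. struct admin_cmd and clear_cmd() are made-up names for this sketch and do not come from the driver.

```c
#include <string.h>

/* Hypothetical command structure, for illustration only. */
struct admin_cmd {
    int  opcode;
    char payload[60];
};

/* sizeof(*cmd) follows the pointer's type, so the call stays correct
 * even if the type of `cmd` is changed later. */
static void clear_cmd(struct admin_cmd *cmd)
{
    memset(cmd, 0x0, sizeof(*cmd));
}
```

With sizeof(struct ...), the type name has to be kept in sync with the pointer by hand; with sizeof(*ptr) the compiler does it.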
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Netanel Belgazal [Fri, 23 Jun 2017 08:21:51 +0000 (11:21 +0300)]
net: ena: add hardware hints capability to the driver
With this patch, the ENA device can update the ena driver about
the desired timeout values.
These values are part of the "hardware hints" which are transmitted
to the driver as an asynchronous event through the ENA async
event notification queue.
In case the ENA device does not support this capability,
the driver will use its own default values.
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Netanel Belgazal [Fri, 23 Jun 2017 08:21:50 +0000 (11:21 +0300)]
net: ena: change return value for unsupported features unsupported return value
return -EOPNOTSUPP instead of -EPERM.
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 23 Jun 2017 01:57:55 +0000 (18:57 -0700)]
tcp: fix out-of-bounds access in ULP sysctl
KASAN reports out-of-bound access in proc_dostring() coming from
proc_tcp_available_ulp() because in case TCP ULP list is empty
the buffer allocated for the response will not have anything
printed into it. Set the first byte to zero to avoid strlen()
going out-of-bounds.
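A userspace sketch of the same class of bug and fix. list_names() is a hypothetical helper written for this example, not the kernel function; it assumes maxlen is at least 1.

```c
#include <stddef.h>
#include <stdio.h>

/* Write a space-separated list of names into buf (maxlen >= 1 assumed).
 * When count is 0 nothing is printed, so buf must be terminated up front;
 * otherwise a later strlen(buf) would walk over uninitialized memory. */
static void list_names(char *buf, size_t maxlen,
                       const char *const *names, size_t count)
{
    size_t offs = 0;

    buf[0] = '\0'; /* the fix: an empty list yields an empty string */

    for (size_t i = 0; i < count; i++) {
        int n = snprintf(buf + offs, maxlen - offs, "%s%s",
                         offs == 0 ? "" : " ", names[i]);

        if (n < 0 || (size_t)n >= maxlen - offs)
            break; /* error or truncated: stop, buf stays terminated */
        offs += (size_t)n;
    }
}
```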
Fixes: 734942cc4ea6 ("tcp: ULP infrastructure")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yonghong Song [Thu, 22 Jun 2017 22:07:39 +0000 (15:07 -0700)]
bpf: possibly avoid extra masking for narrower load in verifier
Commit 31fd85816dbe ("bpf: permits narrower load from bpf program
context fields") permits narrower load for certain ctx fields.
The commit however will already generate a masking even if
the prog-specific ctx conversion produces the result with
narrower size.
For example, for __sk_buff->protocol, the ctx conversion
loads the data into register with 2-byte load.
A narrower 2-byte load should not generate masking.
For __sk_buff->vlan_present, the conversion function
sets the result as either 0 or 1, essentially a byte.
The narrower 2-byte or 1-byte load should not generate masking.
To avoid unnecessary masking, prog-specific *_is_valid_access
now passes converted_op_size back to verifier, which indicates
the valid data width after perceived future conversion.
Based on this information, verifier is able to avoid
unnecessary masking.
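The decision can be sketched as follows. This is an interpretation of the message, not the verifier code: the function names are invented, and the rule "mask only when the requested load is narrower than converted_op_size" is inferred from the two examples given above.

```c
#include <stdint.h>

/* Mask keeping only the low `size` bytes (size is 1, 2 or 4 here). */
static uint64_t narrow_load_mask(unsigned int size)
{
    return (1ULL << (size * 8)) - 1;
}

/* Assumed rule: masking is needed only when the program asks for fewer
 * bytes than the converted access is known to leave valid. */
static int needs_mask(unsigned int load_size, unsigned int converted_op_size)
{
    return load_size < converted_op_size;
}

/* Apply the mask to an already converted value only when required. */
static uint64_t finish_narrow_load(uint64_t converted,
                                   unsigned int load_size,
                                   unsigned int converted_op_size)
{
    if (needs_mask(load_size, converted_op_size))
        converted &= narrow_load_mask(load_size);
    return converted;
}
```

Under this reading, a 2-byte load of a field converted with a 2-byte load, or a 1-byte or 2-byte load of a field whose conversion yields a single byte, passes through without an extra AND.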
Since we want more information back from prog-specific
*_is_valid_access checking, all of them are packed into
one data structure for more clarity.
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Thu, 22 Jun 2017 16:17:29 +0000 (17:17 +0100)]
net: stmmac: make some functions static
The functions dwmac4_dma_init_rx_chan, dwmac4_dma_init_tx_chan and
dwmac4_dma_init_channel do not need to be in global scope, so make them
static.
Cleans up sparse warnings:
"symbol 'dwmac4_dma_init_rx_chan' was not declared. Should it be static?"
"symbol 'dwmac4_dma_init_tx_chan' was not declared. Should it be static?"
"symbol 'dwmac4_dma_init_channel' was not declared. Should it be static?"
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 23 Jun 2017 17:42:21 +0000 (13:42 -0400)]
Merge branch 'xdp-offload-mode'
Jakub Kicinski says:
====================
xdp: offload mode
While we discuss the representors.. :)
This set adds an XDP flag for forcing offload and an attachment mode
for reporting to user space that program has been offloaded. The
nfp driver is modified to make use of the new flags, but also to
adhere to the DRV_MODE flag which should disable the HW offload.
The intended driver behaviour is:
            DRV mode    offload
 no flags   yes         attempted
 DRV_MODE   yes         no
 HW_MODE    no          yes
Where 'yes' means required, and an error will be returned if setup fails.
'Attempted' means the offload will only happen automatically if HW is
capable and offloading the program will cause no change in system
behaviour (e.g. maps don't have to be bound).
Thanks to loading the program both to the driver and HW by default we
can fallback to the driver mode without disruption in case user replaces
the program with one which cannot be offloaded later.
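The intended behaviour table can be expressed as a small decision function. This is a sketch only: the SKETCH_* flag values, the enum, and the choice to reject both mode flags at once are assumptions made for the example and are not taken from the patches.

```c
/* Illustrative stand-ins for the DRV_MODE and HW_MODE flags. */
#define SKETCH_XDP_FLAGS_DRV_MODE (1u << 2)
#define SKETCH_XDP_FLAGS_HW_MODE  (1u << 3)

enum requirement {
    REQ_NO,        /* must not be set up */
    REQ_ATTEMPTED, /* best effort, failure is not an error */
    REQ_YES        /* required, failure is an error */
};

struct xdp_plan {
    enum requirement drv;     /* run the program in the driver */
    enum requirement offload; /* run the program on the HW */
};

/* Fill `plan` according to the table; returns -1 if both mode flags
 * are set (assumed to be invalid in this sketch), 0 otherwise. */
static int xdp_plan_for_flags(unsigned int flags, struct xdp_plan *plan)
{
    int drv = !!(flags & SKETCH_XDP_FLAGS_DRV_MODE);
    int hw = !!(flags & SKETCH_XDP_FLAGS_HW_MODE);

    if (drv && hw)
        return -1;

    if (hw) {
        plan->drv = REQ_NO;
        plan->offload = REQ_YES;
    } else if (drv) {
        plan->drv = REQ_YES;
        plan->offload = REQ_NO;
    } else {
        plan->drv = REQ_YES;
        plan->offload = REQ_ATTEMPTED;
    }
    return 0;
}
```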
Note that the NFP driver currently claims XDP offload support but
lacks most basic features like direct packet access.
Only change compared to the RFC is fixing the double bpf_prog_put()
which Daniel has spotted (patch 5).
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 22 Jun 2017 01:25:10 +0000 (18:25 -0700)]
nfp: xdp: report if program is offloaded
Make use of just added XDP_ATTACHED_HW.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 22 Jun 2017 01:25:09 +0000 (18:25 -0700)]
xdp: add reporting of offload mode
Extend the XDP_ATTACHED_* values to include offloaded mode.
Let drivers report whether program is installed in the driver
or the HW by changing the prog_attached field from bool to
u8 (type of the netlink attribute).
Exploit the fact that the value of XDP_ATTACHED_DRV is 1,
therefore since all drivers currently assign the mode with
double negation:
mode = !!xdp_prog;
no drivers have to be modified.
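Why the existing drivers keep working can be shown with a tiny sketch. The SKETCH_* names and struct prog_stub are placeholders for this example; the only value taken from the message is that the driver-attached mode equals 1.

```c
#include <stddef.h>

/* Placeholder values: "none" is 0 and, per the message, "driver" is 1. */
enum {
    SKETCH_XDP_ATTACHED_NONE = 0,
    SKETCH_XDP_ATTACHED_DRV  = 1
};

/* Opaque stand-in for a loaded program. */
struct prog_stub {
    int id;
};

/* What drivers already do: double negation turns any non-NULL pointer
 * into exactly 1 and NULL into 0. Once the field is a u8 instead of a
 * bool, those two results read as the "driver" and "none" modes. */
static unsigned char legacy_report_mode(const struct prog_stub *xdp_prog)
{
    return !!xdp_prog;
}
```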
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 22 Jun 2017 01:25:08 +0000 (18:25 -0700)]
nfp: bpf: add support for XDP_FLAGS_HW_MODE
Respect the XDP_FLAGS_HW_MODE. When it's set install the program
on the NIC and skip enabling XDP in the driver.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 22 Jun 2017 01:25:07 +0000 (18:25 -0700)]
nfp: bpf: release the reference on offloaded programs
The xdp_prog member of the adapter's data path structure is used
for XDP in driver mode. In case an XDP program is loaded in
HW-only mode, we need to store it somewhere else. Add a new XDP
prog pointer in the main structure and use that when we need to
know whether any XDP program is loaded, not only a driver mode
one. Only release our reference on adapter free instead of
immediately after netdev unregister to allow offload to be disabled
first.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 22 Jun 2017 01:25:06 +0000 (18:25 -0700)]
nfp: bpf: don't offload XDP programs in DRV_MODE
DRV_MODE means that user space wants the program to be run in
the driver. Do not try to offload. Only offload if no mode
flags have been specified.
Remember what the mode is when the program is installed and refuse
new setup requests if there is already a program loaded in a
different mode. This should leave it open for us to implement
simultaneous loading of two programs - one in the drv path and
another to the NIC later.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 22 Jun 2017 01:25:05 +0000 (18:25 -0700)]
nfp: xdp: move driver XDP setup into a separate function
In preparation of XDP offload flags move the driver setup into
a function. Otherwise the number of conditions in one function
would make it slightly hard to follow. The offload handler may
now be called with NULL prog, even if no offload is currently
active, but that's fine, offload code can handle that.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 22 Jun 2017 01:25:04 +0000 (18:25 -0700)]
xdp: add HW offload mode flag for installing programs
Add an installation-time flag for requesting that the program
be installed only if it can be offloaded to HW.
Internally new command for ndo_xdp is added, this way we avoid
putting checks into drivers since they all return -EINVAL on
an unknown command.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 22 Jun 2017 01:25:03 +0000 (18:25 -0700)]
xdp: pass XDP flags into install handlers
Pass XDP flags to the xdp ndo. This will allow drivers to look
at the mode flags and make decisions about offload.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Paolo Abeni [Fri, 23 Jun 2017 12:19:51 +0000 (14:19 +0200)]
udp: fix poll()
Michael reported a UDP breakage caused by commit b65ac44674dd
("udp: try to avoid 2 cache miss on dequeue").
The function __first_packet_length() can update the checksum bits
of the pending skb, making the scratched area out-of-sync, and
setting skb->csum, if the skb was previously in need of checksum
validation.
On later recvmsg() for such skb, checksum validation will be
invoked again - due to the wrong udp_skb_csum_unnecessary()
value - and will fail, causing the valid skb to be dropped.
This change addresses the issue refreshing the scratch area in
__first_packet_length() after the possible checksum update.
Fixes: b65ac44674dd ("udp: try to avoid 2 cache miss on dequeue")
Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Paolo Abeni [Thu, 22 Jun 2017 13:01:22 +0000 (15:01 +0200)]
udp/v6: prefetch rmem_alloc in udp6_queue_rcv_skb()
Very similar to commit dd99e425be23 ("udp: prefetch
rmem_alloc in udp_queue_rcv_skb()"), this allows saving a cache
miss when the BH is the bottleneck for UDP over ipv6 packet
processing, e.g. for small packets when a single RX NIC ingress
queue is in use.
Performance under flood when multiple NIC RX queues are used is
unaffected, but when a single NIC rx queue is in use, this
gives ~8% performance improvement.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 22 Jun 2017 17:42:57 +0000 (13:42 -0400)]
Merge branch 'net-mvpp2-misc-improvements'
Thomas Petazzoni says:
====================
net: mvpp2: misc improvements
Here are a few patches making various small improvements/refactoring
in the mvpp2 driver. They are based on today's net-next.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Petazzoni [Thu, 22 Jun 2017 12:23:20 +0000 (14:23 +0200)]
net: mvpp2: remove mvpp2_pool_refill()
When all a function does is call another function with the exact same
arguments, in the exact same order, you know it's time to remove said
function. Which is exactly what this commit does.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Petazzoni [Thu, 22 Jun 2017 12:23:19 +0000 (14:23 +0200)]
net: mvpp2: remove unused mvpp2_bm_cookie_pool_set() function
This function is not used in the driver, remove it.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Petazzoni [Thu, 22 Jun 2017 12:23:18 +0000 (14:23 +0200)]
net: mvpp2: add comments about smp_processor_id() usage
A previous commit modified a number of smp_processor_id() used in
migration-enabled contexts into get_cpu/put_cpu sections. However, a few
smp_processor_id() calls remain in the driver, and this commit adds
comments explaining why they can be kept.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 22 Jun 2017 17:39:58 +0000 (13:39 -0400)]
Merge branch 'stmmac-pci-Refactor-DMI-probing'
Jan Kiszka says:
====================
stmmac: pci: Refactor DMI probing
Some cleanups of the way we probe DMI platforms in the driver. Reduces
a bit of open-coding and makes the logic easier to reuse for any
potential DMI platform != Quark.
Tested on IOT2000 and Galileo Gen2.
Changes in v5:
- fixed a remaining issue in patch 5
- dropped patch 6 for now
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jan Kiszka [Thu, 22 Jun 2017 06:18:01 +0000 (08:18 +0200)]
stmmac: pci: Use dmi_system_id table for retrieving PHY addresses
Avoids reimplementation of DMI matching in stmmac_pci_find_phy_addr.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jan Kiszka [Thu, 22 Jun 2017 06:18:00 +0000 (08:18 +0200)]
stmmac: pci: Select quark_pci_dmi_data from quark_default_data
No need to carry this reference in stmmac_pci_info - the Quark-specific
setup handler knows that it needs to use the Quark-specific DMI table.
This also allows dropping the stmmac_pci_info reference from the setup
handler parameter list.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jan Kiszka [Thu, 22 Jun 2017 06:17:59 +0000 (08:17 +0200)]
stmmac: pci: Make stmmac_pci_find_phy_addr truly generic
Move the special case for the early Galileo firmware into
quark_default_setup. This allows using stmmac_pci_find_phy_addr for
non-quark cases.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jan Kiszka [Thu, 22 Jun 2017 06:17:58 +0000 (08:17 +0200)]
stmmac: pci: Use stmmac_pci_info for all devices
Make stmmac_default_data compatible with stmmac_pci_info.setup and use
an info structure for all devices. This makes the probing more
regular.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jan Kiszka [Thu, 22 Jun 2017 06:17:57 +0000 (08:17 +0200)]
stmmac: pci: Make stmmac_pci_info structure constant
By removing the PCI device reference from the structure and passing it
as parameters to the interested functions, we can make quark_pci_info
const.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Haiyang Zhang [Wed, 21 Jun 2017 23:40:47 +0000 (16:40 -0700)]
hv_netvsc: Fix the carrier state error when data path is off
When the VF NIC is opened, the synthetic NIC's carrier state is set to
off. This tells the host to transition the data path to the VF device. But
if a startup script or user manipulates the admin state of the netvsc
device directly, for example:
# ifconfig eth0 down
# ifconfig eth0 up
Then the carrier state of the synthetic NIC would be on, even though the
data path was still over the VF NIC. This patch sets the carrier state
of synthetic NIC with consideration of the related VF state.
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Haiyang Zhang [Wed, 21 Jun 2017 23:40:46 +0000 (16:40 -0700)]
hv_netvsc: Remove unnecessary var link_state from struct netvsc_device_info
We simply use rndis_device->link_state in the netdev_dbg. The variable,
link_state from struct netvsc_device_info, is not used anywhere else.
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yonghong Song [Wed, 21 Jun 2017 20:48:27 +0000 (13:48 -0700)]
samples/bpf: fix a build problem
tracex5_kern.c build failed with the following error message:
../samples/bpf/tracex5_kern.c:12:10: fatal error: 'syscall_nrs.h' file not found
#include "syscall_nrs.h"
The generated file syscall_nrs.h is put in build/samples/bpf directory,
but this directory is not in include path, hence build failed.
The fix is to add $(obj) into the clang compilation path.
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 22 Jun 2017 15:34:05 +0000 (11:34 -0400)]
Merge branch 'rds-tcp-fixes'
Sowmini Varadhan says:
====================
rds: tcp: fixes
Patch 1 is a bug fix for correct reconnect when a connection
is restarted. Patch 2 accelerates cleanup by setting linger
to 1 and sending a RST to the peer.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Sowmini Varadhan [Wed, 21 Jun 2017 20:40:13 +0000 (13:40 -0700)]
rds: tcp: set linger to 1 when unloading a rds-tcp
If we are unloading the rds_tcp module, we can set linger to 1
and drop pending packets to accelerate reconnect. The peer will
end up resetting the connection based on new generation numbers
of the new incarnation, so hanging on to unsent TCP packets via
linger is mostly pointless in this case.
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Tested-by: Jenny Xu <jenny.x.xu@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sowmini Varadhan [Wed, 21 Jun 2017 20:40:12 +0000 (13:40 -0700)]
rds: tcp: send handshake ping-probe from passive endpoint
The RDS handshake ping probe added by commit 5916e2c1554f
("RDS: TCP: Enable multipath RDS for TCP") is sent from rds_sendmsg()
before the first data packet is sent to a peer. If the conversation
is not bidirectional (i.e., one side is always passive and never
invokes rds_sendmsg()) and the passive side restarts its rds_tcp
module, a new HS ping probe needs to be sent, so that the number
of paths can be re-established.
This patch achieves that by sending a HS ping probe from
rds_tcp_accept_one() when c_npaths is 0 (i.e., we have not done
a handshake probe with this peer yet).
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Tested-by: Jenny Xu <jenny.x.xu@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nathan Fontenot [Wed, 21 Jun 2017 20:41:02 +0000 (15:41 -0500)]
ibmvnic: Correct return code checking for ibmvnic_init during probe
The update to ibmvnic_init to allow an EAGAIN return code broke
the calling of ibmvnic_init from ibmvnic_probe. The code now
will return from this point in the probe routine if anything
other than EAGAIN is returned. The check should be to see if rc
is non-zero and not equal to EAGAIN.
Without this fix, the vNIC driver can return 0 (success) from
its probe routine due to ibmvnic_init returning zero, but before
completing the probe process and registering with the netdev layer.
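The difference between the broken and the corrected check can be sketched as two predicates. The function names are invented for this example, and the sketch compares against a positive EAGAIN only because that is how the message phrases it.

```c
#include <errno.h>

/* Broken check as described: bails out for anything other than EAGAIN,
 * which includes rc == 0, so a successful init ends the probe early. */
static int probe_bails_out_buggy(int rc)
{
    return rc != EAGAIN;
}

/* Corrected check: bail out only on a real error, that is when rc is
 * non-zero and not equal to EAGAIN. */
static int probe_bails_out_fixed(int rc)
{
    return rc && rc != EAGAIN;
}
```

With the corrected predicate, both a zero return and EAGAIN let the probe routine continue to netdev registration.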
Fixes: 6a2fb0e99f9c (ibmvnic: driver initialization for kdump/kexec)
Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 22 Jun 2017 15:31:35 +0000 (11:31 -0400)]
Merge branch 'ibmvnic-Correct-long-term-mapped-buffer-error-handling'
Thomas Falcon says:
====================
ibmvnic: Correct long-term-mapped buffer error handling
This patch set fixes the error-handling of long-term-mapped buffers
during adapter initialization and reset. The first patch fixes a bug
in an incorrectly defined descriptor that was keeping the return
codes from the VIO server from being properly checked. The second patch
fixes and cleans up the error-handling implementation.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Falcon [Wed, 21 Jun 2017 19:53:01 +0000 (14:53 -0500)]
ibmvnic: Fix error handling when registering long-term-mapped buffers
The patch stores the return code of the REQUEST_MAP_RSP sub-CRQ command
in the private data structure, where it can be later checked during
device open or a reset.
In the case of a reset, the mapping request to the vNIC Server may fail,
especially in the case of a partition migration. The driver attempts to
handle this by re-allocating the buffer and re-sending the mapping request.
The original error handling implementation was removed. The separate
function handling the REQUEST_MAP response message was also removed,
since it is now simple enough to be handled in the ibmvnic_handle_crq
function.
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Falcon [Wed, 21 Jun 2017 19:53:00 +0000 (14:53 -0500)]
ibmvnic: Fix incorrectly defined ibmvnic_request_map_rsp structure
This reserved area should be eight bytes in length instead of four.
As a result, the return codes in the REQUEST_MAP_RSP descriptors
were not being properly handled.
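How a too-short reserved area hides the return code can be shown with offsetof(). Both structures below are hypothetical layouts for illustration: only the four-versus-eight byte reserved area comes from the message, while the other field names and the 16-byte total size are assumptions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout with the reserved area declared too short. */
struct map_rsp_buggy {
    uint8_t first;
    uint8_t cmd;
    uint8_t reserved1;
    uint8_t map_id;
    uint8_t reserved2[4]; /* wrong: four bytes */
    uint8_t rc_code;
    uint8_t rc_detail[3];
};

/* Same layout with the reserved area at its intended length. */
struct map_rsp_fixed {
    uint8_t first;
    uint8_t cmd;
    uint8_t reserved1;
    uint8_t map_id;
    uint8_t reserved2[8]; /* right: eight bytes */
    uint8_t rc_code;
    uint8_t rc_detail[3];
};

/* Read the return code from a raw descriptor through each layout. */
static uint8_t rc_via_buggy(const uint8_t *raw)
{
    return raw[offsetof(struct map_rsp_buggy, rc_code)];
}

static uint8_t rc_via_fixed(const uint8_t *raw)
{
    return raw[offsetof(struct map_rsp_fixed, rc_code)];
}
```

With the short reserved area the driver looks four bytes too early, inside what is really reserved space, so a failing response can look like success.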
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Chenbo Feng [Wed, 21 Jun 2017 02:06:40 +0000 (19:06 -0700)]
tcp: Add a tcp_filter hook before handle ack packet
Currently in both the ipv4 and ipv6 code paths, the ack packet received when
sk is in TCP_NEW_SYN_RECV state is not filtered by socket filter or cgroup
filter since it is handled from tcp_child_process and never reaches the
tcp_filter inside tcp_v4_rcv or tcp_v6_rcv. Adding a tcp_filter hook
here makes sure all ingress tcp packets can be correctly filtered.
Signed-off-by: Chenbo Feng <fengc@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>