Arnd Bergmann [Wed, 11 Oct 2017 13:55:31 +0000 (15:55 +0200)]
ip_tunnel: fix building with NET_IP_TUNNEL=m
When af_mpls is built-in but the tunnel support is a module,
we get a link failure:
net/mpls/af_mpls.o: In function `mpls_init':
af_mpls.c:(.init.text+0xdc): undefined reference to `ip_tunnel_encap_add_ops'
This adds a Kconfig statement to prevent the broken
configuration and force mpls to be a module as well in
this case.
Fixes: bdc476413dcd ("ip_tunnel: add mpls over gre support")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Amine Kherbouche <amine.kherbouche@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 12 Oct 2017 19:20:27 +0000 (12:20 -0700)]
Merge branch 'smc-ib_query_gid'
Ursula Braun says:
====================
net/smc: ib_query_gid() patches
Triggered by Parav Pandit, here are 2 cleanup patches for usage of
ib_query_gid() in the smc-code.
Thanks, Ursula
v2 changes advised by Parav Pandit:
extra check is_vlan_dev() in patch 2/2
"RoCE" spelling
added "Reported-by"
added "Reviewed-by"
added "Fixes"
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Ursula Braun [Wed, 11 Oct 2017 11:47:23 +0000 (13:47 +0200)]
net/smc: dev_put for netdev after usage of ib_query_gid()
For RoCEs ib_query_gid() takes a reference count on the net_device.
This reference count must be decreased by the caller.
Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Reported-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Fixes: 0cfdd8f92cac ("smc: connection and link group creation")
Signed-off-by: David S. Miller <davem@davemloft.net>
Ursula Braun [Wed, 11 Oct 2017 11:47:22 +0000 (13:47 +0200)]
net/smc: replace function pointer get_netdev()
SMC should not open code the function pointer get_netdev of the
IB device. Replacing ib_query_gid(..., NULL) with
ib_query_gid(..., gid_attr) allows access to the netdev.
Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Suggested-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 12 Oct 2017 19:10:02 +0000 (12:10 -0700)]
Merge branch 'dsa-ACB-for-bcm_sf2-and-bcmsysport'
Florian Fainelli says:
====================
Enable ACB for bcm_sf2 and bcmsysport
This patch series enables Broadcom's Advanced Congestion Buffering mechanism
which requires cooperation between the CPU/Management Ethernet MAC controller
and the switch.
I took the notifier approach because ultimately the information we need to
carry to the master network device is DSA specific and I saw little room for
generalizing beyond what DSA requires. Chances are that this is highly specific
to the Broadcom HW as I don't know of any HW out there that supports something
nearly similar for similar or identical needs.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Wed, 11 Oct 2017 17:57:52 +0000 (10:57 -0700)]
net: systemport: Turn on ACB at the SYSTEMPORT level
Now that we have established the queue mapping between the switch port
egress queues and the SYSTEMPORT egress queues, we can turn on Advanced
Congestion Buffering (ACB) at the SYSTEMPORT level. This enables the
Ethernet MAC controller to get out of band flow control information
directly from the switch port and queue that it monitors such that its
internal TDMA can be appropriately backpressured.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Wed, 11 Oct 2017 17:57:51 +0000 (10:57 -0700)]
net: dsa: bcm_sf2: Turn on ACB at the switch level
Turn on the out of band Advanced Congestion Buffering (ACB) mechanism at
the switch level now that we have properly established the queue mapping
between the switch egress queues and the SYSTEMPORT egress queues. This
allows the switch to correctly backpressure the host system when one of
its queues drops below the configured thresholds.
This is also helping achieve so called "lossless" behavior by adapting
the TX interrupt pacing to the actual speed and capacity of the switch
port.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Wed, 11 Oct 2017 17:57:50 +0000 (10:57 -0700)]
net: systemport: Establish lower/upper queue mapping
Establish a queue mapping between the DSA slave network device queues
created that correspond to switch port queues, and the transmit queue
that SYSTEMPORT manages.
We need to configure the SYSTEMPORT transmit queue with the switch port number
and switch port queue number in order for the switch and SYSTEMPORT hardware to
utilize the out of band congestion notification. This hardware mechanism works
by looking at the switch port egress queue and determining whether there are
enough buffers for this queue, with that class of service, for a successful
transmission and, if not, backpressuring the SYSTEMPORT queue that is being used.
For this to work, we implement a notifier which looks at the
DSA_PORT_REGISTER event. When DSA network devices are registered, the
framework calls the DSA notifiers; our notifier extracts the number
of queues for these devices and their associated port number, remembers
that in the driver private structure and linearly maps those queues to
TX rings/queues that we manage.
This scheme works because DSA slave network devices always transmit
through SYSTEMPORT so when DSA slave network devices are
destroyed/brought down, the corresponding SYSTEMPORT queues are no
longer used. Also, by design of the DSA framework, the master network
device (SYSTEMPORT) is registered first.
For faster lookups we use an array of up to DSA_MAX_PORTS * number of
queues per port, and then map pointers to bcm_sysport_tx_ring such that
our ndo_select_queue() implementation can just index into that array to
locate the corresponding ring index.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
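The flat lookup table described in the commit above can be sketched in plain userspace C. This is only an illustration of the indexing idea (port * queues-per-port + queue); the names sketch_priv, sketch_tx_ring, SKETCH_MAX_PORTS and SKETCH_QUEUES_PER_PORT are hypothetical and are not taken from the driver, and the sizes are assumed values.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sizes for illustration; the real driver's limits may differ. */
#define SKETCH_MAX_PORTS        12
#define SKETCH_QUEUES_PER_PORT   8

struct sketch_tx_ring {
	unsigned int index;        /* ring index handed back to the caller */
	unsigned int switch_port;  /* switch port this ring is mapped to */
	unsigned int switch_queue; /* switch egress queue on that port */
};

struct sketch_priv {
	/* Flat table: one slot per (port, queue) pair, NULL when unmapped. */
	struct sketch_tx_ring *ring_map[SKETCH_MAX_PORTS * SKETCH_QUEUES_PER_PORT];
};

/* Record that (port, queue) transmits through the given ring. */
static void sketch_map_queue(struct sketch_priv *priv, unsigned int port,
			     unsigned int queue, struct sketch_tx_ring *ring)
{
	assert(port < SKETCH_MAX_PORTS && queue < SKETCH_QUEUES_PER_PORT);
	ring->switch_port = port;
	ring->switch_queue = queue;
	priv->ring_map[port * SKETCH_QUEUES_PER_PORT + queue] = ring;
}

/* Constant-time lookup; returns -1 when no ring is mapped. */
static int sketch_select_ring(const struct sketch_priv *priv,
			      unsigned int port, unsigned int queue)
{
	const struct sketch_tx_ring *ring;

	if (port >= SKETCH_MAX_PORTS || queue >= SKETCH_QUEUES_PER_PORT)
		return -1;
	ring = priv->ring_map[port * SKETCH_QUEUES_PER_PORT + queue];
	return ring ? (int)ring->index : -1;
}
```

A caller would fill the table once when a port is registered and then call sketch_select_ring() on every transmit, which is why a plain array is attractive here: the hot path is a bounds check plus one indexed load.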
Florian Fainelli [Wed, 11 Oct 2017 17:57:49 +0000 (10:57 -0700)]
net: dsa: tag_brcm: Indicate to master netdevice port + queue
We need to tell the DSA master network device doing the actual
transmission what the desired switch port and queue number is for it to
resolve that to the internal transmit queue it is mapped to.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Wed, 11 Oct 2017 17:57:48 +0000 (10:57 -0700)]
net: dsa: Add support for DSA specific notifiers
In preparation for communicating a given DSA network device's port
number and switch index, create a specialized DSA notifier and two
events: DSA_PORT_REGISTER and DSA_PORT_UNREGISTER that communicate: the
slave network device (slave_dev), port number and switch number in the
tree.
This will be used later by network device drivers like bcmsysport which
need to cooperate with their DSA network devices to set up queue mapping
and scheduling.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Timur Tabi [Thu, 12 Oct 2017 17:42:04 +0000 (12:42 -0500)]
Revert "net: qcom/emac: enforce DMA address restrictions"
This reverts commit
df1ec1b9d0df57e96011f175418dc95b1af46821.
It turns out that memory allocated via dma_alloc_coherent is always
aligned to the size of the buffer, so there's no way the RRD and RFD
can ever be in separate 32-bit regions.
Signed-off-by: Timur Tabi <timur@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
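The alignment argument in the revert above can be checked with a little arithmetic. The sketch below assumes the buffer size is a power of two no larger than 4 GiB and that both rings live inside one such naturally aligned buffer; those assumptions and all sketch_* names are illustrative and are not taken from the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when size is a power of two no larger than 4 GiB. */
static bool sketch_size_ok(uint64_t size)
{
	return size != 0 && (size & (size - 1)) == 0 && size <= (1ULL << 32);
}

/*
 * True when the first and last byte of [addr, addr + size) share the same
 * upper 32 address bits, i.e. the buffer does not straddle a 4 GiB boundary.
 */
static bool sketch_same_upper32(uint64_t addr, uint64_t size)
{
	assert(size != 0);
	return (addr >> 32) == ((addr + size - 1) >> 32);
}

/* True when addr is a multiple of size (natural alignment). */
static bool sketch_naturally_aligned(uint64_t addr, uint64_t size)
{
	assert(sketch_size_ok(size));
	return (addr & (size - 1)) == 0;
}
```

With a 16 KiB buffer, the aligned address 0x1FFFFC000 ends at 0x1FFFFFFFF and stays below the 8 GiB boundary, while the unaligned address 0x1FFFFE000 would spill across it. Natural alignment rules out the second case, so under these assumptions the extra checks are unnecessary.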
Eric Dumazet [Thu, 12 Oct 2017 03:45:40 +0000 (20:45 -0700)]
tcp: remove obsolete helpers
Remove three inline helpers that are no longer needed.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Wed, 11 Oct 2017 10:56:23 +0000 (11:56 +0100)]
bpf: remove redundant variable old_flags
Variable old_flags is being assigned but is never read; it is redundant
and can be removed.
Cleans up clang warning: Value stored to 'old_flags' is never read
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 12 Oct 2017 03:21:23 +0000 (20:21 -0700)]
Merge branch 'mlx4-XDP-TX-improvements'
Tariq Toukan says:
====================
mlx4_en XDP TX improvements
This patchset contains performance improvements
to the XDP_TX use case in the mlx4 Eth driver.
Patch 1 is a simple change in a function parameter type.
Patch 2 replaces a call to a generic function with the
relevant parts inlined.
Patch 3 moves the write of descriptors' constant values
from data path to control path.
Series generated against net-next commit:
833e0e2f24fd net: dst: move cpu inside ifdef to avoid compilation warning
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Tariq Toukan [Wed, 11 Oct 2017 10:17:27 +0000 (13:17 +0300)]
net/mlx4_en: XDP_TX, assign constant values of TX descs on ring creation
In XDP_TX, some fields in tx_info and tx_desc are constants across
all entries of the different XDP_TX rings.
Assign values to these fields on ring creation time, rather than in
data-path.
Patchset performance tests:
Tested on ConnectX3Pro, Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
Single queue no-RSS optimization ON.
XDP_TX packet rate:
------------------------------
Before | After | Gain |
13.7 Mpps | 14.0 Mpps | 2.2% |
------------------------------
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tariq Toukan [Wed, 11 Oct 2017 10:17:26 +0000 (13:17 +0300)]
net/mlx4_en: Obsolete call to generic write_desc in XDP xmit flow
Function mlx4_en_tx_write_desc() is not optimized for use in XDP xmit.
Use the relevant parts inline instead.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tariq Toukan [Wed, 11 Oct 2017 10:17:25 +0000 (13:17 +0300)]
net/mlx4_en: Replace netdev parameter with priv in XDP xmit function
The struct net_device parameter was passed only to extract
struct mlx4_en_priv out of it.
Here we pass the priv parameter directly.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Wed, 11 Oct 2017 09:53:28 +0000 (10:53 +0100)]
net: mpls: make function ipgre_mpls_encap_hlen static
The function ipgre_mpls_encap_hlen is local to the source and
does not need to be in global scope, so make it static.
Cleans up sparse warning:
symbol 'ipgre_mpls_encap_hlen' was not declared. Should it be static?
Fixes: bdc476413dcdb ("ip_tunnel: add mpls over gre support")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Wed, 11 Oct 2017 10:17:57 +0000 (11:17 +0100)]
sctp: make array sctp_sched_ops static
The array sctp_sched_ops is local to the source and
does not need to be in global scope, so make it static.
Cleans up sparse warning:
symbol 'sctp_sched_ops' was not declared. Should it be static?
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Westphal [Wed, 11 Oct 2017 08:28:01 +0000 (10:28 +0200)]
ipv6: addrconf: don't use rtnl mutex in RTM_GETADDR
Similar to the previous patch, use the device lookup functions
that bump device refcount and flag this as DOIT_UNLOCKED to avoid
rtnl mutex.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Westphal [Wed, 11 Oct 2017 08:28:00 +0000 (10:28 +0200)]
ipv6: addrconf: don't use rtnl mutex in RTM_GETNETCONF
Instead of relying on rtnl mutex, bump device reference count.
After this change, values reported can change in parallel, but that's not
much different from current state, as anyone can change the settings
right after rtnl_unlock (and before userspace processed reply).
While at it, switch to GFP_KERNEL allocation.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 12 Oct 2017 03:15:43 +0000 (20:15 -0700)]
Merge branch 'net-sched-get-rid-of-cls_flower-egress_dev'
Jiri Pirko says:
====================
net: sched: get rid of cls_flower->egress_dev
Introduction of cls_flower->egress_dev was a workaround. It turned out
to be a bit of an ugly hack. So replace it with more generic and reusable
infrastructure.
This is a dependency of the shared block introduction that will be sent as
a follow-up group of patchsets.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Wed, 11 Oct 2017 07:41:10 +0000 (09:41 +0200)]
net: sched: remove unused tcf_exts_get_dev helper and cls_flower->egress_dev
The helper and the struct field are no longer used by any code,
so remove them.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Wed, 11 Oct 2017 07:41:09 +0000 (09:41 +0200)]
net: sched: convert cls_flower->egress_dev users to tc_setup_cb_egdev infra
The only user of cls_flower->egress_dev is mlx5. So do the conversion
there, along with the code originating the call in the cls_flower
function fl_hw_replace_filter, to the newly introduced egress device
callback infrastructure.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Wed, 11 Oct 2017 07:41:08 +0000 (09:41 +0200)]
net: sched: introduce per-egress action device callbacks
Introduce infrastructure that allows drivers to register callbacks that
are called whenever tc would offload an inserted rule and the specified
device acts as the tc action egress device.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Wed, 11 Oct 2017 07:41:07 +0000 (09:41 +0200)]
net: sched: make tc_action_ops->get_dev return dev and avoid passing net
Return dev directly, NULL if not possible. That is enough.
Makes no sense to pass struct net * to get_dev op, as there is only one
net possible, the one the action was created in. So just store it in
mirred priv and use directly.
Rename the mirred op callback function.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 12 Oct 2017 03:05:30 +0000 (20:05 -0700)]
Merge branch 'rmnet-Rewrite-some-existing-functionality'
Subash Abhinov Kasiviswanathan says:
====================
net: qualcomm: rmnet: Rewrite some existing functionality
This series fixes some of the broken rmnet functionality.
Bridge mode is re-written and made useable and the muxed_ep is converted to hlist.
Patches 1-5 are cleanups in preparation for these changes.
Patch 6 does the hlist conversion.
Patch 7 has the implementation of the rmnet bridge mode.
v1->v2: Fix the warning and code style issue in rmnet_rx_handler as
mentioned by David.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Subash Abhinov Kasiviswanathan [Thu, 12 Oct 2017 00:43:58 +0000 (18:43 -0600)]
net: qualcomm: rmnet: Implement bridge mode
Add support to bridge two devices which can send multiplexing and
aggregation (MAP) data. This is done only when the data itself is
not going to be consumed in the stack but is being passed on to a
different endpoint. This is mainly used for testing.
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Subash Abhinov Kasiviswanathan [Thu, 12 Oct 2017 00:43:57 +0000 (18:43 -0600)]
net: qualcomm: rmnet: Convert the muxed endpoint to hlist
Rather than using a static array, use a hlist to store the muxed
endpoints and use the mux id to query the rmnet_device.
This is useful as usually very few mux ids are used.
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Cc: Dan Williams <dcbw@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
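The trade-off in the hlist conversion above (few mux ids in use, so chained buckets beat a full static array) can be sketched with a small userspace hash table. This does not use the kernel hlist API; sketch_port, sketch_endpoint and SKETCH_BUCKETS are hypothetical names and the bucket count is an assumed value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_BUCKETS 16 /* hypothetical bucket count */

struct sketch_endpoint {
	uint8_t mux_id;
	void *dev;                    /* stand-in for the rmnet device */
	struct sketch_endpoint *next; /* chain within one bucket */
};

struct sketch_port {
	struct sketch_endpoint *buckets[SKETCH_BUCKETS];
};

/* Insert an endpoint at the head of the bucket chosen by its mux id. */
static void sketch_add_endpoint(struct sketch_port *port,
				struct sketch_endpoint *ep)
{
	unsigned int b;

	assert(port != NULL && ep != NULL);
	b = ep->mux_id % SKETCH_BUCKETS;
	ep->next = port->buckets[b];
	port->buckets[b] = ep;
}

/* Walk one bucket chain; returns NULL when the mux id is unknown. */
static struct sketch_endpoint *
sketch_get_endpoint(const struct sketch_port *port, uint8_t mux_id)
{
	struct sketch_endpoint *ep;

	for (ep = port->buckets[mux_id % SKETCH_BUCKETS]; ep; ep = ep->next)
		if (ep->mux_id == mux_id)
			return ep;
	return NULL;
}
```

Memory is spent only on endpoints that exist, and a lookup touches a single short chain, which suits the "very few mux ids" case the commit mentions.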
Subash Abhinov Kasiviswanathan [Thu, 12 Oct 2017 00:43:56 +0000 (18:43 -0600)]
net: qualcomm: rmnet: Remove duplicate setting of rmnet_devices
The rmnet_devices information is already stored in muxed_ep, so
storing this in rmnet_devices[] again is redundant.
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Subash Abhinov Kasiviswanathan [Thu, 12 Oct 2017 00:43:55 +0000 (18:43 -0600)]
net: qualcomm: rmnet: Remove duplicate setting of rmnet private info
The end point is set twice in the local_ep as well as the mux_id and
the real_dev in the rmnet private structure. Remove the local_ep.
While these elements are equivalent, rmnet_endpoint will be
used only as part of the rmnet_port for muxed scenarios in VND mode.
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Subash Abhinov Kasiviswanathan [Thu, 12 Oct 2017 00:43:54 +0000 (18:43 -0600)]
net: qualcomm: rmnet: Move rmnet_mode to rmnet_port
Mode information on the real device makes it easier to route packets
to rmnet device or bridged device based on the configuration.
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Subash Abhinov Kasiviswanathan [Thu, 12 Oct 2017 00:43:53 +0000 (18:43 -0600)]
net: qualcomm: rmnet: Remove some unused defines
Most of these constants were used in the initial patchset where
custom netlink configuration was used and hence are no longer relevant.
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Subash Abhinov Kasiviswanathan [Thu, 12 Oct 2017 00:43:52 +0000 (18:43 -0600)]
net: qualcomm: rmnet: Remove existing logic for bridge mode
This will be rewritten in the following patches.
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 11 Oct 2017 23:01:57 +0000 (16:01 -0700)]
Merge branch 'qcom-emac-various-minor-fixes'
Timur Tabi says:
====================
net: qcom/emac: various minor fixes
A set of patches for 4.15 that clean up some code, apply minor fixes,
and so on. Some of the code also prepares the driver for a future
version of the EMAC controller.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Timur Tabi [Wed, 11 Oct 2017 19:52:26 +0000 (14:52 -0500)]
net: qcom/emac: clean up some TX/RX error messages
Some of the error messages that are printed by the interrupt handlers
are poorly written. For example, many don't include a device prefix,
so there's no indication that they are EMAC errors.
Also use rate limiting for all messages that could be printed from
interrupt context.
Signed-off-by: Timur Tabi <timur@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Timur Tabi [Wed, 11 Oct 2017 19:52:25 +0000 (14:52 -0500)]
net: qcom/emac: enforce DMA address restrictions
The EMAC has a restriction that the upper 32 bits of the base addresses
for the RFD and RRD rings must be the same. To ensure that restriction,
we allocate twice the space for the RRD and locate it at an appropriate
address.
We also re-arrange the allocations so that invalid addresses are even
less likely.
Signed-off-by: Timur Tabi <timur@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Timur Tabi [Wed, 11 Oct 2017 19:52:24 +0000 (14:52 -0500)]
net: qcom/emac: remove unused address arrays
The EMAC is capable of multiple TX and RX rings, but the driver only
supports one ring for each. One function had some left-over unused
code that supports multiple rings, but all it did was make the code
harder to read.
Signed-off-by: Timur Tabi <timur@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Timur Tabi [Wed, 11 Oct 2017 19:52:23 +0000 (14:52 -0500)]
net: qcom/emac: specify the correct DMA mask
The 64/32-bit DMA mask hackery in the EMAC driver is not actually necessary,
and is technically not accurate. The EMAC hardware is limited to a 45-bit
DMA address. Although no EMAC-enabled system can have that much DDR,
an IOMMU could possibly provide a larger address. Rather than play games
with the DMA mappings, the driver should provide a correct value and
trust the DMA/IOMMU layers to do the right thing.
Signed-off-by: Timur Tabi <timur@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
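To make the 45-bit limit from the commit above concrete, here is a userspace sketch of how an n-bit DMA mask is formed and tested. The helper names are hypothetical, and the value 45 is taken from the commit message only; this does not show the driver's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Mask with the low n bits set. n == 64 is special-cased because shifting
 * a 64-bit value by 64 is undefined in C.
 */
static uint64_t sketch_dma_bit_mask(unsigned int n)
{
	assert(n >= 1 && n <= 64);
	return n == 64 ? ~0ULL : (1ULL << n) - 1;
}

/* True when every set bit of addr is covered by the mask. */
static bool sketch_addr_reachable(uint64_t addr, uint64_t mask)
{
	return (addr & ~mask) == 0;
}
```

For n = 45 the mask is 0x1FFFFFFFFFFF, so any address at or above 1 << 45 is rejected. Declaring the true limit lets the DMA and IOMMU layers pick addresses the device can actually reach instead of the driver guessing between 32 and 64 bits.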
David S. Miller [Wed, 11 Oct 2017 22:28:39 +0000 (15:28 -0700)]
Merge branch 'qrtr-Fixes-and-support-receiving-version-2-packets'
Bjorn Andersson says:
====================
net: qrtr: Fixes and support receiving version 2 packets
On the latest Qualcomm platforms remote processors are sending packets with
version 2 of the message header. This series starts off with some fixes and
then refactors the qrtr code to support receiving messages of both version 1
and version 2.
As all remotes are backwards compatible, transmitted packets continue to be
sent as version 1, but some groundwork has been done to make this a per-link
property.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Bjorn Andersson [Wed, 11 Oct 2017 06:45:23 +0000 (23:45 -0700)]
net: qrtr: Support decoding incoming v2 packets
Add the necessary logic for decoding incoming messages of version 2 as
well. Also make sure there's room for the bigger of version 1 and 2
headers in the code allocating skbs for outgoing messages.
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bjorn Andersson [Wed, 11 Oct 2017 06:45:22 +0000 (23:45 -0700)]
net: qrtr: Use sk_buff->cb in receive path
Rather than parsing the header of incoming messages throughout the
implementation do it once when we retrieve the message and store the
relevant information in the "cb" member of the sk_buff.
This allows us to, in a later commit, decode version 2 messages into
this same structure.
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
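The "parse once, keep the result in cb" pattern from the commit above can be modelled in userspace as follows. The 12-byte header layout and the fields of sketch_cb are made up for illustration and do not reflect the real qrtr header. The only detail borrowed from the kernel is that the scratch area in struct sk_buff is 48 bytes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_CB_SIZE 48 /* size of the cb scratch area in struct sk_buff */

struct sketch_skb {
	unsigned char cb[SKETCH_CB_SIZE]; /* per-layer scratch space */
	const unsigned char *data;        /* payload after the header */
	size_t len;
};

/* Hypothetical decoded header; field names are illustrative only. */
struct sketch_cb {
	uint32_t src_node;
	uint32_t src_port;
	uint32_t type;
};

_Static_assert(sizeof(struct sketch_cb) <= SKETCH_CB_SIZE, "cb too small");

/* Read a little-endian 32-bit value without alignment assumptions. */
static uint32_t sketch_get_le32(const unsigned char *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decode the made-up header once and stash it in cb; 0 on success, -1 if short. */
static int sketch_receive(struct sketch_skb *skb, const unsigned char *buf,
			  size_t len)
{
	struct sketch_cb cb;

	if (len < 12)
		return -1;
	cb.src_node = sketch_get_le32(buf);
	cb.src_port = sketch_get_le32(buf + 4);
	cb.type = sketch_get_le32(buf + 8);
	memcpy(skb->cb, &cb, sizeof(cb));
	skb->data = buf + 12;
	skb->len = len - 12;
	return 0;
}

/* Later stages read the decoded fields back instead of re-parsing. */
static struct sketch_cb sketch_cb_of(const struct sketch_skb *skb)
{
	struct sketch_cb cb;

	assert(skb != NULL);
	memcpy(&cb, skb->cb, sizeof(cb));
	return cb;
}
```

Because later code only ever sees the decoded structure, a second wire format can be supported by adding another decoder that fills the same sketch_cb, which is the benefit the commit points to for version 2 messages.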
Bjorn Andersson [Wed, 11 Oct 2017 06:45:21 +0000 (23:45 -0700)]
net: qrtr: Clean up control packet handling
As the message header generation is deferred the internal functions for
generating control packets can be simplified.
This patch modifies qrtr_alloc_ctrl_packet() to, in addition to the
sk_buff, return a reference to a struct qrtr_ctrl_pkt, which clarifies
and simplifies the helpers to the point that these functions can be
folded back into the callers.
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bjorn Andersson [Wed, 11 Oct 2017 06:45:20 +0000 (23:45 -0700)]
net: qrtr: Pass source and destination to enqueue functions
Defer writing the message header to the skb until it's time to enqueue
the packet. As the receive path is reworked to decode the message header
as it's received from the transport and only pass around the payload in
the skb this change means that we do not have to fill out the full
message header just to decode it immediately in qrtr_local_enqueue().
In the future this change also makes it possible to prepend message
headers based on the version of each link.
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bjorn Andersson [Wed, 11 Oct 2017 06:45:19 +0000 (23:45 -0700)]
net: qrtr: Add control packet definition to uapi
The QMUX protocol specification defines structure of the special control
packet messages being sent between handlers of the control port.
Add these to the uapi header, as this structure and the associated types
are shared between the kernel and all userspace handlers of control
messages.
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bjorn Andersson [Wed, 11 Oct 2017 06:45:18 +0000 (23:45 -0700)]
net: qrtr: Move constants to header file
The constants are used by both the name server and clients, so clarify
their value and move them to the uapi header.
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bjorn Andersson [Wed, 11 Oct 2017 06:45:17 +0000 (23:45 -0700)]
net: qrtr: Invoke sk_error_report() after setting sk_err
Rather than manually waking up any context sleeping on the sock to
signal an error we should call sk_error_report(). This has the added
benefit that in-kernel consumers can override this notification with
their own callback.
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wei Yongjun [Wed, 11 Oct 2017 02:35:23 +0000 (02:35 +0000)]
net: hns3: make local functions static
Fixes the following sparse warnings:
drivers/net/ethernet/hisilicon/hns3/hns3pf/hns3_ethtool.c:464:5: warning:
symbol 'hns3_change_all_ring_bd_num' was not declared. Should it be static?
drivers/net/ethernet/hisilicon/hns3/hns3pf/hns3_ethtool.c:477:5: warning:
symbol 'hns3_set_ringparam' was not declared. Should it be static?
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kees Cook [Tue, 10 Oct 2017 19:25:48 +0000 (12:25 -0700)]
atm: idt77105: Drop needless setup_timer()
Calling setup_timer() is redundant when DEFINE_TIMER() has been used.
Cc: Chas Williams <3chas3@gmail.com>
Cc: linux-atm-general@lists.sourceforge.net
Cc: netdev@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Murphy [Tue, 10 Oct 2017 17:42:56 +0000 (12:42 -0500)]
net: phy: at803x: Change error to EINVAL for invalid MAC
Change the return error code to EINVAL if the MAC
address is not valid in the set_wol function.
Signed-off-by: Dan Murphy <dmurphy@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Murphy [Tue, 10 Oct 2017 17:42:55 +0000 (12:42 -0500)]
net: phy: DP83822 initial driver submission
Add support for the TI DP83822 10/100Mbit ethernet phy.
The DP83822 provides flexibility to connect to a MAC through a
standard MII, RMII or RGMII interface.
In addition the DP83822 needs to be removed from the DP83848 driver
as the WoL support is added here for this device.
Datasheet:
http://www.ti.com/product/DP83822I/datasheet
Signed-off-by: Dan Murphy <dmurphy@ti.com>
Acked-by: Andrew F. Davis <afd@ti.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 11 Oct 2017 20:53:21 +0000 (13:53 -0700)]
Merge branch 'lan9303-Add-basic-offloading-of-unicast-traffic'
Egil Hjelmeland says:
====================
lan9303: Add basic offloading of unicast traffic
This series adds basic offloading of unicast traffic to the lan9303
DSA driver.
Review welcome!
Changes v1 -> v2:
- Patch 1: Codestyle linting.
- Patch 2: Remember SWE_PORT_STATE while not bridged.
Added constant LAN9303_SWE_PORT_MIRROR_DISABLED.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Egil Hjelmeland [Tue, 10 Oct 2017 12:49:53 +0000 (14:49 +0200)]
net: dsa: lan9303: Add basic offloading of unicast traffic
When both user ports are joined to the same bridge, the normal
HW MAC learning is enabled. This means that unicast traffic is forwarded
in HW.
If one of the user ports leaves the bridge,
the ports go back to the initial separated operation.
Port separation relies on disabled HW MAC learning. Hence the condition
that both ports must join same bridge.
Add bridge methods port_bridge_join, port_bridge_leave and
port_stp_state_set.
Signed-off-by: Egil Hjelmeland <privat@egil-hjelmeland.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
Egil Hjelmeland [Tue, 10 Oct 2017 12:49:52 +0000 (14:49 +0200)]
net: dsa: lan9303: Move tag setup to new lan9303_setup_tagging
Prepare for next patch:
Move tag setup from lan9303_separate_ports() to new function
lan9303_setup_tagging()
Signed-off-by: Egil Hjelmeland <privat@egil-hjelmeland.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Wed, 11 Oct 2017 20:27:29 +0000 (13:27 -0700)]
tcp: fix tcp_unlink_write_queue()
Yury reported crash with this signature :
[ 554.034021] [<ffff80003ccd5a58>] 0xffff80003ccd5a58
[ 554.034156] [<ffff00000888fd34>] skb_release_all+0x14/0x30
[ 554.034288] [<ffff00000888fd64>] __kfree_skb+0x14/0x28
[ 554.034409] [<ffff0000088ece6c>] tcp_sendmsg_locked+0x4dc/0xcc8
[ 554.034541] [<ffff0000088ed68c>] tcp_sendmsg+0x34/0x58
[ 554.034659] [<ffff000008919fd4>] inet_sendmsg+0x2c/0xf8
[ 554.034783] [<ffff0000088842e8>] sock_sendmsg+0x18/0x30
[ 554.034928] [<ffff0000088861fc>] SyS_sendto+0x84/0xf8
Problem is that skb->destructor contains garbage, and this is
because I accidentally removed tcp_skb_tsorted_anchor_cleanup()
from tcp_unlink_write_queue()
This would trigger with a write(fd, <invalid_memory>, len) attempt,
and we will add to packetdrill this capability to avoid future
regressions.
Fixes: 75c119afe14f ("tcp: implement rb-tree based retransmit queue")
Reported-by: Yury Norov <ynorov@caviumnetworks.com>
Tested-by: Yury Norov <ynorov@caviumnetworks.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 11 Oct 2017 17:15:01 +0000 (10:15 -0700)]
Merge tag 'mac80211-next-for-davem-2017-10-11' of git://git./linux/kernel/git/jberg/mac80211-next
Johannes Berg says:
====================
Work continues in various areas:
* port authorized event for 4-way-HS offload (Avi)
* enable MFP optional for such devices (Emmanuel)
* Kees's timer setup patch for mac80211 mesh
(the part that isn't trivially scripted)
* improve VLAN vs. TXQ handling (myself)
* load regulatory database as firmware file (myself)
* with various other small improvements and cleanups
I merged net-next once in the meantime to allow Kees's
timer setup patch to go in.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Johannes Berg [Wed, 13 Sep 2017 20:21:08 +0000 (22:21 +0200)]
cfg80211: implement regdb signature checking
Currently CRDA implements the signature checking, and the previous
commits added the ability to load the whole regulatory database
into the kernel.
However, we really can't lose the signature checking, so implement
it in the kernel by loading a detached signature (regulatory.db.p7s)
and check it against built-in keys.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Johannes Berg [Thu, 15 Oct 2015 12:35:41 +0000 (14:35 +0200)]
cfg80211: reg: remove support for built-in regdb
Parsing and building C structures from a regdb is no longer needed
since the "firmware" file (regulatory.db) can be linked into the
kernel image to achieve the same effect.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Johannes Berg [Wed, 13 Sep 2017 14:07:22 +0000 (16:07 +0200)]
cfg80211: support reloading regulatory database
If the regulatory database is loaded, and then updated, it may
be necessary to reload it. Add an nl80211 command to do this.
Note that this just reloads the database, it doesn't re-apply
the rules from it immediately.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Johannes Berg [Thu, 15 Oct 2015 09:22:58 +0000 (11:22 +0200)]
cfg80211: support loading regulatory database as firmware file
As the current regulatory database is only about 4k big, and already
difficult to extend, we decided that overall it would be better to
get rid of the complications with CRDA and load the database into the
kernel directly, but in a new format that is extensible.
The new file format can be extended since it carries a length field
on all the structs that need to be extensible.
In order to be able to request firmware when the module initializes,
move cfg80211 from subsys_initcall() to the later fs_initcall(); the
firmware loader is at the same level but linked earlier, so it can
be called from there. Otherwise, when both the firmware loader and
cfg80211 are built-in, the request will crash the kernel. We also
need to be before device_initcall() so that cfg80211 is available
for devices when they initialize.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
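The length-field idea from the commit above can be illustrated with a minimal sketch. The record layout below (`struct rule_v1`, a one-byte length prefix) and the helper `parse_rule()` are hypothetical and are not the real regulatory.db format; they only show how a parser can skip fields it does not know about.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record: byte 0 is the total record length as written,
 * the following bytes are fields. Newer writers may append fields. */
struct rule_v1 {
	uint8_t len;       /* total size of the record in the file */
	uint8_t max_power; /* field known to this parser */
	uint8_t flags;     /* field known to this parser */
};

/* Parse one record at buf[off]. Returns the offset of the next record,
 * or 0 if the record is missing or does not fit in the buffer. */
static size_t parse_rule(const uint8_t *buf, size_t size, size_t off,
			 struct rule_v1 *out)
{
	uint8_t len;
	size_t n;

	if (off >= size)
		return 0;
	len = buf[off];
	if (len < 1 || len > size - off)
		return 0;

	/* Fields missing from a short (older) record stay zero. */
	memset(out, 0, sizeof(*out));
	/* Copy only the bytes both writer and parser understand. */
	n = len < sizeof(*out) ? len : sizeof(*out);
	memcpy(out, buf + off, n);

	/* Skip any trailing bytes added by a newer writer. */
	return off + len;
}
```

Because the parser advances by the stored length rather than by `sizeof(struct rule_v1)`, a longer record from a newer file and a shorter record from an older file are both handled.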
Johannes Berg [Fri, 6 Oct 2017 09:53:33 +0000 (11:53 +0200)]
mac80211: only remove AP VLAN frames from TXQ
When removing an AP VLAN interface, mac80211 currently purges
the entire TXQ for the AP interface. Fix this by using the FQ
API introduced in the previous patch to filter frames.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-by: Toke Høiland-Jørgensen <toke@toke.dk>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Johannes Berg [Fri, 6 Oct 2017 09:53:32 +0000 (11:53 +0200)]
fq: support filtering a given tin
Add to the FQ API a way to filter a given tin, in order to
remove frames that fulfil certain criteria according to a
filter function.
This will be used by mac80211 to remove frames belonging to
an AP VLAN interface that's being removed.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-by: Toke Høiland-Jørgensen <toke@toke.dk>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
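Filtering a queue with a caller-supplied predicate, as the fq commit above describes, might look roughly like the following. This is a sketch over a plain singly linked list; `struct frame`, `queue_filter()` and `owned_by()` are illustrative names and do not reflect the actual FQ API or its signatures.

```c
#include <stdbool.h>
#include <stddef.h>

struct frame {
	int owner;          /* stand-in for the interface a frame belongs to */
	struct frame *next;
};

typedef bool (*frame_filter_t)(const struct frame *f, void *data);

/* Unlink every frame for which filter() returns true.
 * Returns the number of frames removed. */
static int queue_filter(struct frame **head, frame_filter_t filter, void *data)
{
	struct frame **link = head;
	int removed = 0;

	while (*link) {
		struct frame *f = *link;

		if (filter(f, data)) {
			*link = f->next; /* bypass the matching frame */
			f->next = NULL;
			removed++;
		} else {
			link = &f->next;
		}
	}
	return removed;
}

/* Example predicate: match frames owned by the interface in *data. */
static bool owned_by(const struct frame *f, void *data)
{
	return f->owner == *(const int *)data;
}
```

Frames that do not match stay queued in their original order, which is the property the mac80211 change relies on when only one AP VLAN interface goes away.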
Xiang Gao [Wed, 11 Oct 2017 02:31:49 +0000 (22:31 -0400)]
mac80211: aead api to reduce redundancy
Currently, aes_ccm.c and aes_gcm.c are almost line-by-line copies of
each other. This patch reduces code redundancy by moving the code in these
two files to crypto/aead_api.c to make it a higher-level AEAD API. The
files aes_ccm.c and aes_gcm.c are removed and all the functions there are
now implemented in their headers using the newly added AEAD API.
Signed-off-by: Xiang Gao <qasdfgtyuiop@gmail.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Johannes Berg [Tue, 10 Oct 2017 07:57:59 +0000 (09:57 +0200)]
MAINTAINERS: update Johannes Berg's entries
Update my MAINTAINERS file entries to list all the right files.
Since I'm also the de-facto wireless extensions maintainer,
there's little point in excluding those.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Eric Garver [Tue, 10 Oct 2017 20:54:44 +0000 (16:54 -0400)]
openvswitch: add ct_clear action
This adds a ct_clear action for clearing conntrack state. ct_clear is
currently implemented in OVS userspace, but is not backed by an action
in the kernel datapath. This is useful for flows that may modify a
packet tuple after a ct lookup has already occurred.
Signed-off-by: Eric Garver <e@erig.me>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Tue, 10 Oct 2017 22:05:39 +0000 (15:05 -0700)]
net: dst: move cpu inside ifdef to avoid compilation warning
If CONFIG_DST_CACHE is not selected, the cpu variable
will be unused and we will see a compilation warning.
Move it under the ifdef.
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Fixes: d66f2b91f95b ("bpf: don't rely on the verifier lock for metadata_dst allocation")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 10 Oct 2017 20:20:16 +0000 (13:20 -0700)]
Merge branch '1GbE' of git://git./linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:
====================
1GbE Intel Wired LAN Driver Updates 2017-10-10
This series contains updates to e1000e and igb.
Benjamin Poirier provides several fixes for e1000e, starting with a
correction to the return status which was always returning success even
if it was not successful. Fixed code comments to reflect the actual
code behavior. Fixed the conditional test for the correct return
value. Fixed a potential race condition reported by Lennart Sorensen,
where the single flag get_link_status is used to signal two different
states.
Sasha fixes a buffer overrun for i219 devices, where the chipset had
reduced the round-trip latency for the LAN controller DMA accesses
which in some high performance cases caused a buffer overrun while
processing the DMA transactions.
Willem de Bruijn changes the default behavior of e1000e to use the
burst mode settings by default unless the user specifies the
receive interrupt delay (RxIntDelay).
Florian Fainelli updates the driver to differentiate between
e1000e_put_txbuf() being called from normal reclamation and being called
after a DMA mapping failure, to make the driver more "drop monitor friendly".
Christophe JAILLET fixes a potential NULL pointer dereference by
properly returning -ENOMEM on memory allocation failures.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Westphal [Tue, 10 Oct 2017 15:10:04 +0000 (17:10 +0200)]
rtnetlink: bridge: use ext_ack instead of printk
We can now piggyback error strings to userspace via extended acks
rather than using printk.
Before:
bridge fdb add 01:02:03:04:05:06 dev br0 vlan 4095
RTNETLINK answers: Invalid argument
After:
bridge fdb add 01:02:03:04:05:06 dev br0 vlan 4095
Error: invalid vlan id.
v3: drop 'RTM_' prefixes, suggested by David Ahern, they
are not useful, the add/del in bridge command line is enough.
Also reword error in response to malformed/bad vlan id attribute
size.
Cc: David Ahern <dsahern@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
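The extended-ack pattern used in the rtnetlink bridge commit above can be sketched in plain C. `struct ext_ack` and `check_vlan_id()` are simplified stand-ins invented for this example, not the kernel's types or helpers; the range check mirrors the "vlan 4095" case from the commit message.

```c
#include <errno.h>
#include <stddef.h>

/* Simplified stand-in for an extended-ack object: it only carries
 * a human-readable error string back to the caller. */
struct ext_ack {
	const char *msg;
};

/* Validate a VLAN id. On failure, attach a message instead of logging. */
static int check_vlan_id(unsigned int vid, struct ext_ack *extack)
{
	if (vid == 0 || vid >= 4095) {
		if (extack)
			extack->msg = "invalid vlan id";
		return -EINVAL;
	}
	return 0;
}
```

The caller still gets the numeric error code, but the string travels with the reply so that userspace can print "Error: invalid vlan id." rather than a generic "Invalid argument".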
Florian Westphal [Tue, 10 Oct 2017 14:18:05 +0000 (16:18 +0200)]
selftests: rtnetlink: test RTM_GETNETCONF
Exercise the RTM_GETNETCONF call path for the unspec, inet and inet6
families; they are DOIT_UNLOCKED candidates.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 10 Oct 2017 20:11:23 +0000 (13:11 -0700)]
Merge branch 'mlx4_en-num-of-rings'
Tariq Toukan says:
====================
mlx4_en num of rings
This patchset from Inbar contains changes to ring control
in the mlx4 Eth driver.
Patches 1 and 2 limit the number of rings to the number of CPUs.
Patch 3 removes a limitation in logic of default number of RX rings.
Series generated against net-next commit:
812b5ca7d376 Add a driver for Renesas uPD60620 and uPD60620A PHYs
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Inbar Karmy [Tue, 10 Oct 2017 09:28:35 +0000 (12:28 +0300)]
net/mlx4_en: Increase number of default RX rings
Remove limitation of netif_get_num_default_rss_queues()
from logic of RX rings default number.
Signed-off-by: Inbar Karmy <inbark@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Inbar Karmy [Tue, 10 Oct 2017 09:28:34 +0000 (12:28 +0300)]
net/mlx4_en: Limit the number of RX rings
Limit the number of RX rings by the number of cores
in the system.
Signed-off-by: Inbar Karmy <inbark@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Inbar Karmy [Tue, 10 Oct 2017 09:28:33 +0000 (12:28 +0300)]
net/mlx4_en: Limit the number of TX rings
Limit the number of TX rings per UP by the number of cores
in the system.
Signed-off-by: Inbar Karmy <inbark@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 10 Oct 2017 20:09:14 +0000 (13:09 -0700)]
Merge branch 'hnx3-rxnfc'
Lipeng says:
====================
Support set_ringparam and {set|get}_rxnfc ethtool commands
1. Patches [1/5, 2/5] add support for the ethtool op set_ringparam
(ethtool -G) and fix a related bug.
2. Patches [3/5, 4/5, 5/5] add support for the ethtool ops
set_rxnfc/get_rxnfc (-n/-N) and fix a related bug.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Lipeng [Tue, 10 Oct 2017 08:42:07 +0000 (16:42 +0800)]
net: hns3: fix the ring count for ETHTOOL_GRXRINGS
This patch fixes the ring count for ETHTOOL_GRXRINGS. The ring count,
not the TC size, should be returned for the command "ethtool -n ethx".
Signed-off-by: Lipeng <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Lipeng [Tue, 10 Oct 2017 08:42:06 +0000 (16:42 +0800)]
net: hns3: add support for ETHTOOL_GRXFH
This patch adds support for ethtool's ETHTOOL_GRXFH in hns3_get_rxnfc().
Signed-off-by: Lipeng <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Lipeng [Tue, 10 Oct 2017 08:42:05 +0000 (16:42 +0800)]
net: hns3: add support for set_rxnfc
This patch adds support for ethtool's set_rxnfc().
Signed-off-by: Lipeng <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Lipeng [Tue, 10 Oct 2017 08:42:04 +0000 (16:42 +0800)]
net: hns3: add support for set_ringparam
This patch adds support for ethtool's set_ringparam().
Signed-off-by: Lipeng <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Lipeng [Tue, 10 Oct 2017 08:42:03 +0000 (16:42 +0800)]
net: hns3: fixes the ring index in hns3_fini_ring
This patch fixes the ring index in hns3_fini_ring.
Signed-off-by: Lipeng <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ganesh Goudar [Tue, 10 Oct 2017 07:15:02 +0000 (12:45 +0530)]
cxgb4: add new T5 pci device id's
Add 0x50aa and 0x50ab T5 device id's.
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ganesh Goudar [Tue, 10 Oct 2017 07:14:13 +0000 (12:44 +0530)]
cxgb4: Add support for new flash parts
Add support for identifying new flash parts, and
also clean up the flash part identification and decoding
code.
Based on the original work of Casey Leedom <leedom@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tim Hansen [Mon, 9 Oct 2017 15:37:59 +0000 (11:37 -0400)]
net/core: Fix BUG to BUG_ON conditionals.
Fix BUG() calls to use BUG_ON(conditional) macros.
This was found using make coccicheck M=net/core on linux next
tag next-2017092
Signed-off-by: Tim Hansen <devtimhansen@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
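The BUG() to BUG_ON() conversion in the net/core commit above is a purely mechanical rewrite. The sketch below uses stand-in macros that only count hits (the real kernel macros halt execution), so the two forms can be compared side by side; `bug_hits`, `check_before()` and `check_after()` are made up for the illustration.

```c
/* Stand-in macros for illustration only: they count instead of
 * crashing, unlike the real kernel BUG()/BUG_ON(). */
static int bug_hits;
#define BUG()        (bug_hits++)
#define BUG_ON(cond) do { if (cond) BUG(); } while (0)

/* Before: open-coded conditional around BUG(). */
static void check_before(int len)
{
	if (len < 0)
		BUG();
}

/* After: the same check expressed with BUG_ON(). */
static void check_after(int len)
{
	BUG_ON(len < 0);
}
```

Both functions trigger on exactly the same inputs; the second form is simply the idiom coccicheck suggests.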
David S. Miller [Tue, 10 Oct 2017 19:30:17 +0000 (12:30 -0700)]
Merge branch 'bpf-get-rid-of-global-verifier-state-and-reuse-instruction-printer'
Jakub Kicinski says:
====================
bpf: get rid of global verifier state and reuse instruction printer
This set started off as simple extraction of eBPF verifier's instruction
printer into a separate file but evolved into removal of global state.
The purpose of moving instruction printing code is to be able to reuse it
from the bpftool.
As far as the global verifier lock goes, this set removes the global
variables relating to the log buffer, makes the one-time init done
by bpf_get_skb_set_tunnel_proto() not depend on any external locking,
and performs verifier log writeback as data is produced removing the need
for allocating a potentially large temporary buffer.
The final step of actually removing the verifier lock is left to someone
more competent and self-confident :)
Note that struct bpf_verifier_env is just 40B under two pages now,
we should probably switch to vzalloc() when it's expanded again...
v2:
- add a selftest;
- use env buffer and flush on every print (Alexei);
- handle kernel log allocation failures (Daniel);
- put the env log members into a struct (Daniel).
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Mon, 9 Oct 2017 17:30:15 +0000 (10:30 -0700)]
bpf: write back the verifier log buffer as it gets filled
Verifier log buffer can be quite large (up to 16MB currently).
As Eric Dumazet points out, if we allow multiple verification
requests to proceed simultaneously, a malicious user may use the
verifier as a way of allocating large amounts of unswappable
memory to OOM the host.
Switch to a strategy of allocating a smaller buffer (1024B)
and writing it out into the user buffer after every print.
While at it remove the old BUG_ON().
This is in preparation of the global verifier lock removal.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
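The write-back strategy from the verifier log commit above can be sketched in userspace C. Each message is formatted into a small fixed scratch buffer and immediately copied out to the destination. `struct vlog` and `vlog_print()` are hypothetical names, `memcpy()` stands in for the copy to user memory, and silently truncating when the destination is full is a simplification chosen for this example.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define KBUF_SIZE 1024 /* small scratch buffer, as in the commit message */

struct vlog {
	char kbuf[KBUF_SIZE]; /* scratch space for one formatted message */
	char *ubuf;           /* stand-in for the user-supplied buffer */
	size_t len_total;     /* capacity of ubuf */
	size_t len_used;      /* bytes already written to ubuf */
};

/* Format one message and flush it to ubuf right away. */
static void vlog_print(struct vlog *log, const char *fmt, ...)
{
	va_list args;
	size_t room;
	int n;

	/* Nothing to do once only the terminating NUL would fit. */
	if (log->len_total == 0 || log->len_used >= log->len_total - 1)
		return;

	va_start(args, fmt);
	n = vsnprintf(log->kbuf, KBUF_SIZE, fmt, args);
	va_end(args);
	if (n < 0)
		return;
	if ((size_t)n >= KBUF_SIZE)
		n = KBUF_SIZE - 1; /* message was cut to the scratch size */

	room = log->len_total - log->len_used - 1; /* keep space for NUL */
	if ((size_t)n > room)
		n = (int)room;

	memcpy(log->ubuf + log->len_used, log->kbuf, (size_t)n);
	log->len_used += (size_t)n;
	log->ubuf[log->len_used] = '\0';
}
```

The memory held per verification request is bounded by the scratch buffer size no matter how large the destination buffer is, which is the point of the change.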
Jakub Kicinski [Mon, 9 Oct 2017 17:30:14 +0000 (10:30 -0700)]
bpf: don't rely on the verifier lock for metadata_dst allocation
bpf_skb_set_tunnel_*() functions require allocation of per-cpu
metadata_dst. The allocation happens upon verification of the
first program using those helpers. In preparation for removing
the verifier lock, use cmpxchg() to make sure we only allocate
the metadata_dsts once.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
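Lock-free one-time allocation with a compare-and-exchange, as described in the metadata_dst commit above, can be approximated in portable C11. The sketch uses `atomic_compare_exchange_strong()` from `<stdatomic.h>` in place of the kernel's cmpxchg(), and an `int` in place of the per-cpu metadata_dst; `shared_dst` and `get_shared_dst()` are illustrative names.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* NULL until the first caller successfully publishes an allocation. */
static _Atomic(int *) shared_dst;

/* Return the shared object, allocating it at most once even if several
 * callers race. Returns NULL only if allocation fails. */
static int *get_shared_dst(void)
{
	int *cur = atomic_load(&shared_dst);
	int *expected = NULL;
	int *tmp;

	if (cur)
		return cur;

	tmp = calloc(1, sizeof(*tmp));
	if (!tmp)
		return NULL;

	/* Publish only if nobody else did so in the meantime. */
	if (atomic_compare_exchange_strong(&shared_dst, &expected, tmp))
		return tmp;

	/* Lost the race: drop our copy and use the winner's pointer,
	 * which the failed exchange stored in 'expected'. */
	free(tmp);
	return expected;
}
```

A losing caller wastes one allocation but never replaces the published pointer, so no external lock is needed.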
Jakub Kicinski [Mon, 9 Oct 2017 17:30:13 +0000 (10:30 -0700)]
tools: bpftool: use the kernel's instruction printer
Compile the instruction printer from kernel/bpf and use it
for disassembling "translated" eBPF code.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Mon, 9 Oct 2017 17:30:12 +0000 (10:30 -0700)]
bpf: move instruction printing into a separate file
Separate the instruction printing into a standalone source file.
This way sneaky code from tools/ can compile it in directly.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Mon, 9 Oct 2017 17:30:11 +0000 (10:30 -0700)]
bpf: move global verifier log into verifier environment
The biggest piece of global state protected by the verifier lock
is the verifier_log. Move that log to struct bpf_verifier_env.
struct bpf_verifier_env has to be passed now to all invocations
of verbose().
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Mon, 9 Oct 2017 17:30:10 +0000 (10:30 -0700)]
bpf: encapsulate verifier log state into a structure
Put the loose log_* variables into a structure. This will make
it simpler to remove the global verifier state in following patches.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Mon, 9 Oct 2017 17:30:09 +0000 (10:30 -0700)]
selftests/bpf: add a test for verifier logs
Add a test for verifier log handling. Check bad attr combinations
but focus on cases when log is truncated.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Tue, 10 Oct 2017 18:10:30 +0000 (19:10 +0100)]
ipv6: fix incorrect bitwise operator used on rt6i_flags
The use of the | operator always leads to true which looks rather
suspect to me. Fix this by using & instead to just check the
RTF_CACHE entry bit.
Detected by CoverityScan, CID#1457734, #1457747 ("Wrong operator used")
Fixes: 35732d01fe31 ("ipv6: introduce a hash table to store dst cache")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Wei Wang <weiwan@google.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
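The operator mix-up fixed in the rt6i_flags commit above is easy to reproduce in isolation. In this sketch the flag value and the helper names `is_cached_buggy()` and `is_cached_fixed()` are illustrative assumptions; only the choice of operator matters.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative flag bit; any non-zero mask shows the same effect. */
#define RTF_CACHE 0x01000000u

/* Buggy: OR-ing in a non-zero mask always yields a non-zero result. */
static bool is_cached_buggy(uint32_t flags)
{
	return (flags | RTF_CACHE) != 0;
}

/* Fixed: AND isolates the bit, so the test reflects the actual flag. */
static bool is_cached_fixed(uint32_t flags)
{
	return (flags & RTF_CACHE) != 0;
}
```

With `flags == 0` the buggy variant still reports true, which is the "always leads to true" behaviour called out in the commit message.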
Colin Ian King [Tue, 10 Oct 2017 17:01:16 +0000 (18:01 +0100)]
ipv6: fix dereference of rt6_ex before null check error
Currently rt6_ex is being dereferenced before it is null checked
hence there is a possible null dereference bug. Fix this by only
dereferencing rt6_ex after it has been null checked.
Detected by CoverityScan, CID#1457749 ("Dereference before null check")
Fixes: 81eb8447daae ("ipv6: take care of rt6_stats")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
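The ordering problem addressed in the rt6_ex commit above follows a common pattern, sketched here with a hypothetical `struct entry` and helper `entry_value()` rather than the real ipv6 code.

```c
#include <stddef.h>

struct entry {
	int value;
};

/* The buggy ordering, shown as a comment only:
 *
 *     int v = e->value;    dereference happens first
 *     if (!e)              the check comes too late
 *             return -1;
 */

/* Fixed ordering: check the pointer, then dereference it. */
static int entry_value(const struct entry *e)
{
	if (!e)
		return -1;
	return e->value;
}
```

Static analysers flag the buggy form because the later null check implies the pointer may be NULL at the point where it was already dereferenced.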
Christophe JAILLET [Sun, 27 Aug 2017 06:39:51 +0000 (08:39 +0200)]
igb: check memory allocation failure
Check memory allocation failures and return -ENOMEM in such cases, as
already done for other memory allocations in this function.
This avoids NULL pointers dereference.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Acked-by: PJ Waskiewicz <peter.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Florian Fainelli [Sat, 26 Aug 2017 01:14:24 +0000 (18:14 -0700)]
e1000e: Be drop monitor friendly
e1000e_put_txbuf() can be called from the normal reclamation path as well as
after a DMA mapping failure, so we need to differentiate these two cases
when freeing SKBs to be drop monitor friendly. e1000e_tx_hwtstamp_work()
and e1000_remove() are processing TX timestamped SKBs and those should
not be accounted as drops either.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Willem de Bruijn [Fri, 25 Aug 2017 15:06:26 +0000 (11:06 -0400)]
e1000e: apply burst mode settings only on default
Devices that support FLAG2_DMA_BURST have different default values
for RDTR and RADV. Apply burst mode default settings only when no
explicit value was passed at module load.
The RDTR default is zero. If the module is loaded for low latency
operation with RxIntDelay=0, do not override this value with a burst
default of 32.
Move the decision to apply burst values earlier, where explicitly
initialized module variables can be distinguished from defaults.
Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
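The "apply burst settings only on default" rule from the commit above boils down to remembering whether the user supplied a value. A minimal sketch, with a hypothetical `struct rx_opts` and helper `effective_rdtr()`; the values 0 and 32 are the RDTR defaults quoted in the commit message.

```c
#include <stdbool.h>

#define DEFAULT_RDTR 0  /* regular default quoted in the commit message */
#define BURST_RDTR   32 /* burst-mode default quoted in the commit message */

struct rx_opts {
	bool rdtr_set_by_user; /* true if a value was passed at module load */
	int rdtr;              /* the user-supplied value, if any */
};

/* Pick the RDTR value: an explicit user setting always wins, even when
 * it happens to equal the regular default of 0. */
static int effective_rdtr(const struct rx_opts *o, bool supports_burst)
{
	if (o->rdtr_set_by_user)
		return o->rdtr;
	return supports_burst ? BURST_RDTR : DEFAULT_RDTR;
}
```

Comparing the value against the default is not enough, because an explicit RxIntDelay=0 is indistinguishable from "not set" unless the decision is made where that distinction is still visible.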
Sasha Neftin [Sun, 6 Aug 2017 13:49:18 +0000 (16:49 +0300)]
e1000e: fix buffer overrun while the I219 is processing DMA transactions
Intel® 100/200 Series Chipset platforms reduced the round-trip
latency for the LAN Controller DMA accesses, causing in some high
performance cases a buffer overrun while the I219 LAN Connected
Device is processing the DMA transactions. I219LM and I219V devices
can fall into an unrecovered Tx hang under very stressful UDP traffic
and multiple reconnections of the Ethernet cable. This Tx hang of the LAN
Controller is only recovered if the system is rebooted. Slightly slow
down DMA access by reducing the number of outstanding requests.
This workaround could have an impact on TCP traffic performance
on the platform. Disabling TSO eliminates performance loss for TCP
traffic without a noticeable impact on CPU performance.
Please refer to the I218/I219 specification update:
https://www.intel.com/content/www/us/en/embedded/products/networking/ethernet-connection-i218-family-documentation.html
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Reviewed-by: Dima Ruinskiy <dima.ruinskiy@intel.com>
Reviewed-by: Raanan Avargil <raanan.avargil@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Benjamin Poirier [Fri, 21 Jul 2017 18:36:27 +0000 (11:36 -0700)]
e1000e: Avoid receiver overrun interrupt bursts
When e1000e_poll() is not fast enough to keep up with incoming traffic, the
adapter (when operating in msix mode) raises the Other interrupt to signal
Receiver Overrun.
This is a double problem because 1) at the moment e1000_msix_other()
assumes that it is only called in case of Link Status Change and 2) if the
condition persists, the interrupt is repeatedly raised again in quick
succession.
Ideally we would configure the Other interrupt to not be raised in case of
receiver overrun but this doesn't seem possible on this adapter. Instead,
we handle the first part of the problem by reverting to the practice of
reading ICR in the other interrupt handler, like before commit
16ecba59bc33 ("e1000e: Do not read ICR in Other interrupt"). Thanks to commit
0a8047ac68e5 ("e1000e: Fix msi-x interrupt automask") which cleared IAME
from CTRL_EXT, reading ICR doesn't interfere with RxQ0, TxQ0 interrupts
anymore. We handle the second part of the problem by not re-enabling the
Other interrupt right away when there is overrun. Instead, we wait until
traffic subsides, napi polling mode is exited and interrupts are
re-enabled.
Reported-by: Lennart Sorensen <lsorense@csclub.uwaterloo.ca>
Fixes: 16ecba59bc33 ("e1000e: Do not read ICR in Other interrupt")
Signed-off-by: Benjamin Poirier <bpoirier@suse.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Benjamin Poirier [Fri, 21 Jul 2017 18:36:26 +0000 (11:36 -0700)]
e1000e: Separate signaling for link check/link up
Lennart reported the following race condition:
\ e1000_watchdog_task
    \ e1000e_has_link
        \ hw->mac.ops.check_for_link() === e1000e_check_for_copper_link
            /* link is up */
            mac->get_link_status = false;
                            /* interrupt */
                            \ e1000_msix_other
                                hw->mac.get_link_status = true;
        link_active = !hw->mac.get_link_status
        /* link_active is false, wrongly */
This problem arises because the single flag get_link_status is used to
signal two different states: link status needs checking and link status is
down.
Avoid the problem by using the return value of .check_for_link to signal
the link status to e1000e_has_link().
Reported-by: Lennart Sorensen <lsorense@csclub.uwaterloo.ca>
Signed-off-by: Benjamin Poirier <bpoirier@suse.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
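The race reported by Lennart and the fix described above can be modelled in a few lines. This is a single-threaded sketch in which the interrupt is simulated by a callback; `struct mac`, the helper names and the "1 means link up, 0 means link down" return convention are assumptions made for the illustration, not the driver's actual code.

```c
#include <stdbool.h>
#include <stddef.h>

struct mac {
	bool get_link_status; /* true: link status needs (re)checking */
	bool phy_link_up;     /* stand-in for what the PHY reports */
};

/* Returns 1 if the link is up, 0 if it is down (assumed convention). */
static int check_for_link(struct mac *mac)
{
	if (!mac->phy_link_up)
		return 0;
	mac->get_link_status = false; /* link confirmed */
	return 1;
}

/* Simulated Other interrupt: asks for a new link check. */
static void link_irq(struct mac *mac)
{
	mac->get_link_status = true;
}

/* Racy: infers link state from a flag the interrupt can flip. */
static bool has_link_racy(struct mac *mac, void (*irq)(struct mac *))
{
	check_for_link(mac);
	if (irq)
		irq(mac); /* interrupt lands between check and read */
	return !mac->get_link_status;
}

/* Fixed: link state comes from the return value of the check. */
static bool has_link(struct mac *mac, void (*irq)(struct mac *))
{
	int ret = check_for_link(mac);

	if (irq)
		irq(mac);
	return ret > 0;
}
```

In the fixed variant the flag keeps a single meaning ("a check is needed"), so an interrupt arriving at the wrong moment can only cause an extra check, not a wrong link state.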
Benjamin Poirier [Fri, 21 Jul 2017 18:36:25 +0000 (11:36 -0700)]
e1000e: Fix return value test
All the helpers return -E1000_ERR_PHY.
Signed-off-by: Benjamin Poirier <bpoirier@suse.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Benjamin Poirier [Fri, 21 Jul 2017 18:36:24 +0000 (11:36 -0700)]
e1000e: Fix wrong comment related to link detection
Reading e1000e_check_for_copper_link() shows that get_link_status is set to
false after link has been detected. Therefore, it stays TRUE until then.
Signed-off-by: Benjamin Poirier <bpoirier@suse.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>