git.openwrt.org Git - openwrt/staging/blogic.git/log

tipc: add second source address to recvmsg()/recvfrom()

With group communication, it becomes important for a message receiver to
identify not only from which socket (identfied by a node:port tuple) the
message was sent, but also the logical identity (type:instance) of the
sending member.

We fix this by adding a second instance of struct sockaddr_tipc to the
source address area when a message is read. The extra address struct
is filled in with data found in the received message header (type,) and
in the local member representation struct (instance.)

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tipc: introduce communication groups

As a preparation for introducing flow control for multicast and datagram
messaging we need a more strictly defined framework than we have now. A
socket must be able keep track of exactly how many and which other
sockets it is allowed to communicate with at any moment, and keep the
necessary state for those.

We therefore introduce a new concept we have named Communication Group.
Sockets can join a group via a new setsockopt() call TIPC_GROUP_JOIN.
The call takes four parameters: 'type' serves as group identifier,
'instance' serves as an logical member identifier, and 'scope' indicates
the visibility of the group (node/cluster/zone). Finally, 'flags' makes
it possible to set certain properties for the member. For now, there is
only one flag, indicating if the creator of the socket wants to receive
a copy of broadcast or multicast messages it is sending via the socket,
and if wants to be eligible as destination for its own anycasts.

A group is closed, i.e., sockets which have not joined a group will
not be able to send messages to or receive messages from members of
the group, and vice versa.

Any member of a group can send multicast ('group broadcast') messages
to all group members, optionally including itself, using the primitive
send(). The messages are received via the recvmsg() primitive. A socket
can only be member of one group at a time.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tipc: improve destination linked list

We often see a need for a linked list of destination identities,
sometimes containing a port number, sometimes a node identity, and
sometimes both. The currently defined struct u32_list is not generic
enough to cover all cases, so we extend it to contain two u32 integers
and rename it to struct tipc_dest_list.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tipc: add new function for sending multiple small messages

We see an increasing need to send multiple single-buffer messages
of TIPC_SYSTEM_IMPORTANCE to different individual destination nodes.
Instead of looping over the send queue and sending each buffer
individually, as we do now, we add a new help function
tipc_node_distr_xmit() to do this.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tipc: refactor function filter_rcv()

In the following commits we will need to handle multiple incoming and
rejected/returned buffers in the function socket.c::filter_rcv().
As a preparation for this, we generalize the function by handling
buffer queues instead of individual buffers. We also introduce a
help function tipc_skb_reject(), and rename filter_rcv() to
tipc_sk_filter_rcv() in line with other functions in socket.c.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tipc: add ability to obtain node availability status from other files

In the coming commits, functions at the socket level will need the
ability to read the availability status of a given node. We therefore
introduce a new function for this purpose, while renaming the existing
static function currently having the wanted name.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tipc: improve address sanity check in tipc_connect()

The address given to tipc_connect() is not completely sanity checked,
under the assumption that this will be done later in the function
__tipc_sendmsg() when the address is used there.

However, the latter functon will in the next commits serve as caller
to several other send functions, so we want to move the corresponding
sanity check there to the beginning of that function, before we possibly
need to grab the address stored by tipc_connect(). We must therefore
be able to trust that this address already has been thoroughly checked.

We do this in this commit.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tipc: add ability to order and receive topology events in driver

As preparation for introducing communication groups, we add the ability
to issue topology subscriptions and receive topology events from kernel
space. This will make it possible for group member sockets to keep track
of other group members.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

selftests: rtnetlink: add a small macsec test case

Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

ravb: Consolidate clock handling

The module clock is used for two purposes:
  - Wake-on-LAN (WoL), which is optional,
  - gPTP Timer Increment (GTI) configuration, which is mandatory.

As the clock is needed for GTI configuration anyway, WoL is always
available.  Hence remove duplication and repeated obtaining of the clock
by making GTI use the stored clock for WoL use.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'net-support-bgmac-with-B50212E-B1-PHY'

Rafał Miłecki says:

====================
net: support bgmac with B50212E B1 PHY

I got a report that a board with BCM47189 SoC and B50212E B1 PHY doesn't
work well some devices as there is massive ping loss. After analyzing
PHY state it has appeared that is runs in slave mode and doesn't auto
switch to master properly when needed.

This patchset fixes this by:
1) Adding new flag support to the PHY driver for setting master mode
2) Modifying bgmac to request master mode for reported hardware
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: bgmac: enable master mode for BCM54210E and B50212E PHYs

There are 4 very similar PHYs:
0x600d84a1: BCM54210E (rev B0)
0x600d84a2: BCM54210E (rev B1)
0x600d84a5: B50212E (rev B0)
0x600d84a6: B50212E (rev B1)
that need setting master mode manually. It's because they run in slave
mode by default with Automatic Slave/Master configuration disabled which
can lead to unreliable connection with massive ping loss.

So far it was reported for a board with BCM47189 SoC and B50212E B1 PHY
connected to the bgmac supported ethernet device. Telling PHY driver to
setup PHY properly solves this issue.

Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: phy: broadcom: support new device flag for setting master mode

Some of Broadcom's PHYs run by default in slave mode with Automatic
Slave/Master configuration disabled. It stops them from working properly
with some devices.

So far it has been verified for BCM54210E and BCM50212E which don't
work well with Intel's I217-LM and I218-LM:
http://ark.intel.com/products/60019/Intel-Ethernet-Connection-I217-LM
http://ark.intel.com/products/71307/Intel-Ethernet-Connection-I218-LM
I was told there is massive ping loss.

This commit adds support for a new flag which can be set by an ethernet
driver to fixup PHY setup.

Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipvlan: always use the current L2 addr of the master

If the underlying master ever changes its L2 (e.g. bonding device),
then make sure that the IPvlan slaves always emit packets with the
current L2 of the master instead of the stale mac addr which was
copied during the device creation. The problem can be seen with
following script -

  #!/bin/bash
  # Create a vEth pair
  ip link add dev veth0 type veth peer name veth1
  ip link set veth0 up
  ip link set veth1 up
  ip link show veth0
  ip link show veth1
  # Create an IPvlan device on one end of this vEth pair.
  ip link add link veth0 dev ipvl0 type ipvlan mode l2
  ip link show ipvl0
  # Change the mac-address of the vEth master.
  ip link set veth0 address 02:11:22:33:44:55

Fixes: 2ad7bf363841 ("ipvlan: Initial check-in of the IPVLAN driver.")
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'act-ife-misc'

Alexander Aring says:

====================
sched: act: ife: UAPI checks and performance tweaks

this patch series contains at first a patch which adds a check for
IFE_ENCODE and IFE_DECODE when a ife act gets created or updated and adding
handling of these cases only inside the act callback only.

The second patch use per-cpu counters and move the spinlock around so that
the spinlock is less being held in act callback.

The last patch use rcu for update parameters and also move the spinlock for
the same purpose as in patch 2.

Notes:
- There is still a spinlock around for protecting the metalist and a
   rw-lock for another list. Should be migrated to a rcu list, ife
   possible.

- I use still dereference in dump callback, so I think what I didn't
   got was what happened when rcu_assign_pointer will do when rcu read
   lock is held. I suppose the pointer will be updated, then we don't
   have any issue here.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

sched: act: ife: update parameters via rcu handling

This patch changes the parameter updating via RCU and not protected by a
spinlock anymore. This reduce the time that the spinlock is being held.

Signed-off-by: Alexander Aring <aring@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sched: act: ife: migrate to use per-cpu counters

This patch migrates the current counter handling which is protected by a
spinlock to a per-cpu counter handling. This reduce the time where the
spinlock is being held.

Signed-off-by: Alexander Aring <aring@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sched: act: ife: move encode/decode check to init

This patch adds the check of the two possible ife handlings encode
and decode to the init callback. The decode value is for usability
aspect and used in userspace code only. The current code offers encode
else decode only. This patch avoids any other option than this.

Signed-off-by: Alexander Aring <aring@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'net-sched-fix-IFE-meta-modules-loading'

Roman Mashak says:

====================
net: sched: Fix IFE meta modules loading

Adjust module alias names of IFE meta modules and fix the bug that
prevented auto-loading IFE modules in run-time.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net sched actions: fix module auto-loading

Macro __stringify_1() can stringify a macro argument, however IFE_META_*
are enums, so they never expand, however request_module expects an integer
in IFE module name, so as a result it always fails to auto-load.

Fixes: ef6980b6becb ("introduce IFE action")
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net sched actions: change IFE modules alias names

Make style of module alias name consistent with other subsystems in kernel,
for example net devices.

Fixes: 084e2f6566d2 ("Support to encoding decoding skb mark on IFE action")
Fixes: 200e10f46936 ("Support to encoding decoding skb prio on IFE action")
Fixes: 408fbc22ef1e ("net sched ife action: Introduce skb tcindex metadata encap decap")
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

vxge: Clean up unused variables in vxge-traffic

Delete unused channel variables in vxge-traffic.

Signed-off-by: Christos Gkekas <chris.gekas@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sched: tc_mirred: Remove whitespaces

This file contains unnecessary whitespaces as newlines, remove them,
found by looking at what struct tc_mirred looks like.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ip_tunnel: fix building with NET_IP_TUNNEL=m

When af_mpls is built-in but the tunnel support is a module,
we get a link failure:

net/mpls/af_mpls.o: In function `mpls_init':
af_mpls.c:(.init.text+0xdc): undefined reference to `ip_tunnel_encap_add_ops'

This adds a Kconfig statement to prevent the broken
configuration and force mpls to be a module as well in
this case.

Fixes: bdc476413dcd ("ip_tunnel: add mpls over gre support")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Amine Kherbouche <amine.kherbouche@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'smc-ib_query_gid'

Ursula Braun says:

====================
net/smc: ib_query_gid() patches

triggered by Parav Pandit here are 2 cleanup patches for usage of
ib_query_gid() in the smc-code.

Thanks, Ursula

v2 changes advised by Parav Pandit:
   extra check is_vlan_dev() in patch 2/2
   "RoCE" spelling
   added "Reported-by"
   added "Reviewed-by"
   added "Fixes"
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net/smc: dev_put for netdev after usage of ib_query_gid()

For RoCEs ib_query_gid() takes a reference count on the net_device.
This reference count must be decreased by the caller.

Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Reported-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Fixes: 0cfdd8f92cac ("smc: connection and link group creation")
Signed-off-by: David S. Miller <davem@davemloft.net>

net/smc: replace function pointer get_netdev()

SMC should not open code the function pointer get_netdev of the
IB device. Replacing ib_query_gid(..., NULL) with
ib_query_gid(..., gid_attr) allows access to the netdev.

Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Suggested-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'dsa-ACB-for-bcm_sf2-and-bcmsysport'

Florian Fainelli says:

====================
Enable ACB for bcm_sf2 and bcmsysport

This patch series enables Broadcom's Advanced Congestion Buffering mechanism
which requires cooperation between the CPU/Management Ethernet MAC controller
and the switch.

I took the notifier approach because ultimately the information we need to
carry to the master network device is DSA specific and I saw little room for
generalizing beyond what DSA requires. Chances are that this is highly specific
to the Broadcom HW as I don't know of any HW out there that supports something
nearly similar for similar or identical needs.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: systemport: Turn on ACB at the SYSTEMPORT level

Now that we have established the queue mapping between the switch port
egress queues and the SYSTEMPORT egress queues, we can turn on Advanced
Congestion Buffering (ACB) at the SYSTEMPORT level. This enables the
Ethernet MAC controller to get out of band flow control information
directly from the switch port and queue that it monitors such that its
internal TDMA can be appropriately backpressured.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: bcm_sf2: Turn on ACB at the switch level

Turn on the out of band Advanced Congestion Buffering (ACB) mechanism at
the switch level now that we have properly established the queue mapping
between the switch egress queues and the SYSTEMPORT egress queues. This
allows the switch to correctly backpressure the host system when one of
its queue drops below the configured thresholds.

This is also helping achieve so called "lossless" behavior by adapting
the TX interrupt pacing to the actual speed and capacity of the switch
port.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: systemport: Establish lower/upper queue mapping

Establish a queue mapping between the DSA slave network device queues
created that correspond to switch port queues, and the transmit queue
that SYSTEMPORT manages.

We need to configure the SYSTEMPORT transmit queue with the switch port number
and switch port queue number in order for the switch and SYSTEMPORT hardware to
utilize the out of band congestion notification. This hardware mechanism works
by looking at the switch port egress queue and determines whether there is
enough buffers for this queue, with that class of service for a successful
transmission and if not, backpressures the SYSTEMPORT queue that is being used.

For this to work, we implement a notifier which looks at the
DSA_PORT_REGISTER event. When DSA network devices are registered, the
framework calls the DSA notifiers when that happens, extracts the number
of queues for these devices and their associated port number, remembers
that in the driver private structure and linearly maps those queues to
TX rings/queues that we manage.

This scheme works because DSA slave network deviecs always transmit
through SYSTEMPORT so when DSA slave network devices are
destroyed/brought down, the corresponding SYSTEMPORT queues are no
longer used. Also, by design of the DSA framework, the master network
device (SYSTEMPORT) is registered first.

For faster lookups we use an array of up to DSA_MAX_PORTS * number of
queues per port, and then map pointers to bcm_sysport_tx_ring such that
our ndo_select_queue() implementation can just index into that array to
locate the corresponding ring index.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: tag_brcm: Indicate to master netdevice port + queue

We need to tell the DSA master network device doing the actual
transmission what the desired switch port and queue number is for it to
resolve that to the internal transmit queue it is mapped to.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: Add support for DSA specific notifiers

In preparation for communicating a given DSA network device's port
number and switch index, create a specialized DSA notifier and two
events: DSA_PORT_REGISTER and DSA_PORT_UNREGISTER that communicate: the
slave network device (slave_dev), port number and switch number in the
tree.

This will be later used for network device drivers like bcmsysport which
needs to cooperate with its DSA network devices to set-up queue mapping
and scheduling.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Revert "net: qcom/emac: enforce DMA address restrictions"

This reverts commit df1ec1b9d0df57e96011f175418dc95b1af46821.

It turns out that memory allocated via dma_alloc_coherent is always
aligned to the size of the buffer, so there's no way the RRD and RFD
can ever be in separate 32-bit regions.

Signed-off-by: Timur Tabi <timur@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: remove obsolete helpers

Remove three inline helpers that are no longer needed.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bpf: remove redundant variable old_flags

Variable old_flags is being assigned but is never read; it is redundant
and can be removed.

Cleans up clang warning: Value stored to 'old_flags' is never read

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'mlx4-XDP-TX-improvements'

Tariq Toukan says:

====================
mlx4_en XDP TX improvements

This patchset contains performance improvements
to the XDP_TX use case in the mlx4 Eth driver.

Patch 1 is a simple change in a function parameter type.
Patch 2 replaces a call to a generic function with the
relevant parts inlined.
Patch 3 moves the write of descriptors' constant values
from data path to control path.

Series generated against net-next commit:
833e0e2f24fd net: dst: move cpu inside ifdef to avoid compilation warning
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx4_en: XDP_TX, assign constant values of TX descs on ring creaion

In XDP_TX, some fields in tx_info and tx_desc are constants across
all entries of the different XDP_TX rings.
Assign values to these fields on ring creation time, rather than in
data-path.

Patchset performance tests:
Tested on ConnectX3Pro, Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
Single queue no-RSS optimization ON.

XDP_TX packet rate:
------------------------------
Before | After | Gain |
13.7 Mpps | 14.0 Mpps | %2.2 |
------------------------------

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx4_en: Obsolete call to generic write_desc in XDP xmit flow

Function mlx4_en_tx_write_desc() is not optimized to use of XDP xmit.
Use the relevant parts inline instead.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx4_en: Replace netdev parameter with priv in XDP xmit function

The struct net_device parameter was passed only to extract
struct mlx4_en_priv out of it.
Here we pass the priv parameter directly.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: mpls: make function ipgre_mpls_encap_hlen static

The function ipgre_mpls_encap_hlen is local to the source and
does not need to be in global scope, so make it static.

Cleans up sparse warning:
symbol 'ipgre_mpls_encap_hlen' was not declared. Should it be static?

Fixes: bdc476413dcdb ("ip_tunnel: add mpls over gre support")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sctp: make array sctp_sched_ops static

The array sctp_sched_ops is local to the source and
does not need to be in global scope, so make it static.

Cleans up sparse warning:
symbol 'sctp_sched_ops' was not declared. Should it be static?

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv6: addrconf: don't use rtnl mutex in RTM_GETADDR

Similar to the previous patch, use the device lookup functions
that bump device refcount and flag this as DOIT_UNLOCKED to avoid
rtnl mutex.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv6: addrconf: don't use rtnl mutex in RTM_GETNETCONF

Instead of relying on rtnl mutex bump device reference count.
After this change, values reported can change in parallel, but thats not
much different from current state, as anyone can change the settings
right after rtnl_unlock (and before userspace processed reply).

While at it, switch to GFP_KERNEL allocation.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'net-sched-get-rid-of-cls_flower-egress_dev'

Jiri Pirko says:

====================
net: sched: get rid of cls_flower->egress_dev

Introduction of cls_flower->egress_dev was a workaround. Turned out
to be a bit ugly hack. So replace it with more generic and reusable
infrastructure.

This is a dependency of shared block introduction that will be send as
a follow-up patchsets group.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: sched: remove unused tcf_exts_get_dev helper and cls_flower->egress_dev

The helper and the struct field ares no longer used by any code,
so remove them.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: sched: convert cls_flower->egress_dev users to tc_setup_cb_egdev infra

The only user of cls_flower->egress_dev is mlx5. So do the conversion
there alongside with the code originating the call in cls_flower
function fl_hw_replace_filter to the newly introduced egress device
callback infrastucture.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: sched: introduce per-egress action device callbacks

Introduce infrastructure that allows drivers to register callbacks that
are called whenever tc would offload inserted rule and specified device
acts as tc action egress device.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: sched: make tc_action_ops->get_dev return dev and avoid passing net

Return dev directly, NULL if not possible. That is enough.

Makes no sense to pass struct net * to get_dev op, as there is only one
net possible, the one the action was created in. So just store it in
mirred priv and use directly.

Rename the mirred op callback function.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'rmnet-Rewrite-some-existing-functionality'

Subash Abhinov Kasiviswanathan says:

====================
net: qualcomm: rmnet: Rewrite some existing functionality

This series fixes some of the broken rmnet functionality.
Bridge mode is re-written and made useable and the muxed_ep is converted to hlist.

Patches 1-5 are cleanups in preparation for these changes.
Patch 6 does the hlist conversion.
Patch 7 has the implementation of the rmnet bridge mode.

v1->v2: Fix the warning and code style issue in rmnet_rx_handler as
mentioned by David.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: qualcomm: rmnet: Implement bridge mode

Add support to bridge two devices which can send multiplexing and
aggregation (MAP) data. This is done only when the data itself is
not going to be consumed in the stack but is being passed on to a
different endpoint. This is mainly used for testing.

Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: qualcomm: rmnet: Convert the muxed endpoint to hlist

Rather than using a static array, use a hlist to store the muxed
endpoints and use the mux id to query the rmnet_device.
This is useful as usually very few mux ids are used.

Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Cc: Dan Williams <dcbw@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: qualcomm: rmnet: Remove duplicate setting of rmnet_devices

The rmnet_devices information is already stored in muxed_ep, so
storing this in rmnet_devices[] again is redundant.

Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: qualcomm: rmnet: Remove duplicate setting of rmnet private info

The end point is set twice in the local_ep as well as the mux_id and
the real_dev in the rmnet private structure. Remove the local_ep.
While these elements are equivalent, rmnet_endpoint will be
used only as part of the rmnet_port for muxed scenarios in VND mode.

Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: qualcomm: rmnet: Move rmnet_mode to rmnet_port

Mode information on the real device makes it easier to route packets
to rmnet device or bridged device based on the configuration.

Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: qualcomm: rmnet: Remove some unused defines

Most of these constants were used in the initial patchset where
custom netlink configuration was used and hence are no longer relevant.

Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: qualcomm: rmnet: Remove existing logic for bridge mode

This will be rewritten in the following patches.

Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'qcom-emac-various-minor-fixes'

Timur Tabi says:

====================
net: qcom/emac: various minor fixes

A set of patches for 4.15 that clean up some code, apply minors fixes,
and so on. Some of the code also prepares the driver for a future
version of the EMAC controller.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: qcom/emac: clean up some TX/RX error messages

Some of the error messages that are printed by the interrupt handlers
are poorly written. For example, many don't include a device prefix,
so there's no indication that they are EMAC errors.

Also use rate limiting for all messages that could be printed from
interrupt context.

Signed-off-by: Timur Tabi <timur@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: qcom/emac: enforce DMA address restrictions

The EMAC has a restriction that the upper 32 bits of the base addresses
for the RFD and RRD rings must be the same. The ensure that restriction,
we allocate twice the space for the RRD and locate it at an appropriate
address.

We also re-arrange the allocations so that invalid addresses are even
less likely.

Signed-off-by: Timur Tabi <timur@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: qcom/emac: remove unused address arrays

The EMAC is capable of multiple TX and RX rings, but the driver only
supports one ring for each. One function had some left-over unused
code that supports multiple rings, but all it did was make the code
harder to read.

Signed-off-by: Timur Tabi <timur@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: qcom/emac: specify the correct DMA mask

The 64/32-bit DMA mask hackery in the EMAC driver is not actually necessary,
and is technically not accurate.  The EMAC hardware is limted to a 45-bit
DMA address.  Although no EMAC-enabled system can have that much DDR,
an IOMMU could possible provide a larger address.  Rather than play games
with the DMA mappings, the driver should provide a correct value and
trust the DMA/IOMMU layers to do the right thing.

Signed-off-by: Timur Tabi <timur@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'qrtr-Fixes-and-support-receiving-version-2-packets'

Bjorn Andersson says:

====================
net: qrtr: Fixes and support receiving version 2 packets

On the latest Qualcomm platforms remote processors are sending packets with
version 2 of the message header. This series starts off with some fixes and
then refactors the qrtr code to support receiving messages of both version 1
and version 2.

As all remotes are backwards compatible transmitted packets continues to be
send as version 1, but some groundwork has been done to make this a per-link
property.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: qrtr: Support decoding incoming v2 packets

Add the necessary logic for decoding incoming messages of version 2 as
well. Also make sure there's room for the bigger of version 1 and 2
headers in the code allocating skbs for outgoing messages.

Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: qrtr: Use sk_buff->cb in receive path

Rather than parsing the header of incoming messages throughout the
implementation do it once when we retrieve the message and store the
relevant information in the "cb" member of the sk_buff.

This allows us to, in a later commit, decode version 2 messages into
this same structure.

Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: qrtr: Clean up control packet handling

As the message header generation is deferred the internal functions for
generating control packets can be simplified.

This patch modifies qrtr_alloc_ctrl_packet() to, in addition to the
sk_buff, return a reference to a struct qrtr_ctrl_pkt, which clarifies
and simplifies the helpers to the point that these functions can be
folded back into the callers.

Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: qrtr: Pass source and destination to enqueue functions

Defer writing the message header to the skb until its time to enqueue
the packet. As the receive path is reworked to decode the message header
as it's received from the transport and only pass around the payload in
the skb this change means that we do not have to fill out the full
message header just to decode it immediately in qrtr_local_enqueue().

In the future this change also makes it possible to prepend message
headers based on the version of each link.

Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: qrtr: Add control packet definition to uapi

The QMUX protocol specification defines structure of the special control
packet messages being sent between handlers of the control port.

Add these to the uapi header, as this structure and the associated types
are shared between the kernel and all userspace handlers of control
messages.

Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: qrtr: Move constants to header file

The constants are used by both the name server and clients, so clarify
their value and move them to the uapi header.

Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: qrtr: Invoke sk_error_report() after setting sk_err

Rather than manually waking up any context sleeping on the sock to
signal an error we should call sk_error_report(). This has the added
benefit that in-kernel consumers can override this notification with
its own callback.

Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns3: make local functions static

Fixes the following sparse warnings:

drivers/net/ethernet/hisilicon/hns3/hns3pf/hns3_ethtool.c:464:5: warning:
symbol 'hns3_change_all_ring_bd_num' was not declared. Should it be static?
drivers/net/ethernet/hisilicon/hns3/hns3pf/hns3_ethtool.c:477:5: warning:
symbol 'hns3_set_ringparam' was not declared. Should it be static?

Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

atm: idt77105: Drop needless setup_timer()

Calling setup_timer() is redundant when DEFINE_TIMER() has been used.

Cc: Chas Williams <3chas3@gmail.com>
Cc: linux-atm-general@lists.sourceforge.net
Cc: netdev@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: phy: at803x: Change error to EINVAL for invalid MAC

Change the return error code to EINVAL if the MAC
address is not valid in the set_wol function.

Signed-off-by: Dan Murphy <dmurphy@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: phy: DP83822 initial driver submission

Add support for the TI DP83822 10/100Mbit ethernet phy.

The DP83822 provides flexibility to connect to a MAC through a
standard MII, RMII or RGMII interface.

In addition the DP83822 needs to be removed from the DP83848 driver
as the WoL support is added here for this device.

Datasheet:
http://www.ti.com/product/DP83822I/datasheet

Signed-off-by: Dan Murphy <dmurphy@ti.com>
Acked-by: Andrew F. Davis <afd@ti.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'lan9303-Add-basic-offloading-of-unicast-traffic'

Egil Hjelmeland says:

====================
lan9303: Add basic offloading of unicast traffic

This series add basic offloading of unicast traffic to the lan9303
DSA driver.

Review welcome!

Changes v1 -> v2:
- Patch 1: Codestyle linting.
- Patch 2: Remember SWE_PORT_STATE while not bridged.
Added constant LAN9303_SWE_PORT_MIRROR_DISABLED.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: lan9303: Add basic offloading of unicast traffic

When both user ports are joined to the same bridge, the normal
HW MAC learning is enabled. This means that unicast traffic is forwarded
in HW.

If one of the user ports leave the bridge,
the ports goes back to the initial separated operation.

Port separation relies on disabled HW MAC learning. Hence the condition
that both ports must join same bridge.

Add brigde methods port_bridge_join, port_bridge_leave and
port_stp_state_set.

Signed-off-by: Egil Hjelmeland <privat@egil-hjelmeland.no>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: lan9303: Move tag setup to new lan9303_setup_tagging

Prepare for next patch:
Move tag setup from lan9303_separate_ports() to new function
lan9303_setup_tagging()

Signed-off-by: Egil Hjelmeland <privat@egil-hjelmeland.no>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: fix tcp_unlink_write_queue()

Yury reported crash with this signature :

[  554.034021] [<ffff80003ccd5a58>] 0xffff80003ccd5a58
[  554.034156] [<ffff00000888fd34>] skb_release_all+0x14/0x30
[  554.034288] [<ffff00000888fd64>] __kfree_skb+0x14/0x28
[  554.034409] [<ffff0000088ece6c>] tcp_sendmsg_locked+0x4dc/0xcc8
[  554.034541] [<ffff0000088ed68c>] tcp_sendmsg+0x34/0x58
[  554.034659] [<ffff000008919fd4>] inet_sendmsg+0x2c/0xf8
[  554.034783] [<ffff0000088842e8>] sock_sendmsg+0x18/0x30
[  554.034928] [<ffff0000088861fc>] SyS_sendto+0x84/0xf8

Problem is that skb->destructor contains garbage, and this is
because I accidentally removed tcp_skb_tsorted_anchor_cleanup()
from tcp_unlink_write_queue()

This would trigger with a write(fd, <invalid_memory>, len) attempt,
and we will add to packetdrill this capability to avoid future
regressions.

Fixes: 75c119afe14f ("tcp: implement rb-tree based retransmit queue")
Reported-by: Yury Norov <ynorov@caviumnetworks.com>
Tested-by: Yury Norov <ynorov@caviumnetworks.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'mac80211-next-for-davem-2017-10-11' of git://git./linux/kernel/git/jberg/mac80211-next

Johannes Berg says:

====================
Work continues in various areas:
* port authorized event for 4-way-HS offload (Avi)
* enable MFP optional for such devices (Emmanuel)
* Kees's timer setup patch for mac80211 mesh
(the part that isn't trivially scripted)
* improve VLAN vs. TXQ handling (myself)
* load regulatory database as firmware file (myself)
* with various other small improvements and cleanups

I merged net-next once in the meantime to allow Kees's
timer setup patch to go in.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

cfg80211: implement regdb signature checking

Currently CRDA implements the signature checking, and the previous
commits added the ability to load the whole regulatory database
into the kernel.

However, we really can't lose the signature checking, so implement
it in the kernel by loading a detached signature (regulatory.db.p7s)
and check it against built-in keys.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>

cfg80211: reg: remove support for built-in regdb

Parsing and building C structures from a regdb is no longer needed
since the "firmware" file (regulatory.db) can be linked into the
kernel image to achieve the same effect.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>

cfg80211: support reloading regulatory database

If the regulatory database is loaded, and then updated, it may
be necessary to reload it. Add an nl80211 command to do this.

Note that this just reloads the database, it doesn't re-apply
the rules from it immediately.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>

cfg80211: support loading regulatory database as firmware file

As the current regulatory database is only about 4k big, and already
difficult to extend, we decided that overall it would be better to
get rid of the complications with CRDA and load the database into the
kernel directly, but in a new format that is extensible.

The new file format can be extended since it carries a length field
on all the structs that need to be extensible.

In order to be able to request firmware when the module initializes,
move cfg80211 from subsys_initcall() to the later fs_initcall(); the
firmware loader is at the same level but linked earlier, so it can
be called from there. Otherwise, when both the firmware loader and
cfg80211 are built-in, the request will crash the kernel. We also
need to be before device_initcall() so that cfg80211 is available
for devices when they initialize.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>

mac80211: only remove AP VLAN frames from TXQ

When removing an AP VLAN interface, mac80211 currently purges
the entire TXQ for the AP interface. Fix this by using the FQ
API introduced in the previous patch to filter frames.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-by: Toke Høiland-Jørgensen <toke@toke.dk>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

fq: support filtering a given tin

Add to the FQ API a way to filter a given tin, in order to
remove frames that fulfil certain criteria according to a
filter function.

This will be used by mac80211 to remove frames belonging to
an AP VLAN interface that's being removed.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-by: Toke Høiland-Jørgensen <toke@toke.dk>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

mac80211: aead api to reduce redundancy

Currently, the aes_ccm.c and aes_gcm.c are almost line by line copy of
each other. This patch reduce code redundancy by moving the code in these
two files to crypto/aead_api.c to make it a higher level aead api. The
file aes_ccm.c and aes_gcm.c are removed and all the functions there are
now implemented in their headers using the newly added aead api.

Signed-off-by: Xiang Gao <qasdfgtyuiop@gmail.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

MAINTAINERS: update Johannes Berg's entries

Update my MAINTAINERS file entries to list all the right files.
Since I'm also the de-facto wireless extensions maintainer,
there's little point in excluding those.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>

openvswitch: add ct_clear action

This adds a ct_clear action for clearing conntrack state. ct_clear is
currently implemented in OVS userspace, but is not backed by an action
in the kernel datapath. This is useful for flows that may modify a
packet tuple after a ct lookup has already occurred.

Signed-off-by: Eric Garver <e@erig.me>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dst: move cpu inside ifdef to avoid compilation warning

If CONFIG_DST_CACHE is not selected cpu variable
will be unused and we will see a compilation warning.
Move it under the ifdef.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Fixes: d66f2b91f95b ("bpf: don't rely on the verifier lock for metadata_dst allocation")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch '1GbE' of git://git./linux/kernel/git/jkirsher/next-queue

Jeff Kirsher says:

====================
1GbE Intel Wired LAN Driver Updates 2017-10-10

This series contains updates to e1000e and igb.

Benjamin Poirier provides several fixes for e1000e, starting with a
correction to the return status which was always returning success even
if it was not successful.  Fixed code comments to reflect the actual
code behavior.  Fixed the conditional test for the correct return
value.  Fixed a potential race condition reported by Lennart Sorensen,
where the single flag get_link_status is used to signal two different
states.

Sasha fixes a buffer overrun for i219 devices, where the chipset had
reduced the round-trip latency for the LAN controller DMA accesses
which in some high performance cases caused a buffer overrun while
processing the DMA transactions.

Willem de Bruijn changes the default behavior of e1000e to use the
burst mode settings by default unless the user specifies the
receive interrupt delay (RxIntDelay).

Florian Fainelli updates the driver to differentiate between when
e1000e_put_txbuf() is called from normal reclamation or when a
DMA mapping failure to make the driver more "drop monitor friendly".

Christophe JAILLET fixes a potential NULL pointer dereference by
properly returning -ENOMEM on memory allocation failures.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

rtnetlink: bridge: use ext_ack instead of printk

We can now piggyback error strings to userspace via extended acks
rather than using printk.

Before:
bridge fdb add 01:02:03:04:05:06 dev br0 vlan 4095
RTNETLINK answers: Invalid argument

After:
bridge fdb add 01:02:03:04:05:06 dev br0 vlan 4095
Error: invalid vlan id.

v3: drop 'RTM_' prefixes, suggested by David Ahern, they
are not useful, the add/del in bridge command line is enough.

Also reword error in response to malformed/bad vlan id attribute
size.

Cc: David Ahern <dsahern@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

selftests: rtnetlink: test RTM_GETNETCONF

exercise RTM_GETNETCONF call path for unspec, inet and inet6
families, they are DOIT_UNLOCKED candidates.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'mlx4_en-num-of-rings'

Tariq Toukan says:

====================
mlx4_en num of rings

This patchset from Inbar contains changes to rings control
to the mlx4 Eth driver.

Patches 1 and 2 limit the number of rings to the number of CPUs.
Patch 3 removes a limitation in logic of default number of RX rings.

Series generated against net-next commit:
812b5ca7d376 Add a driver for Renesas uPD60620 and uPD60620A PHYs
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx4_en: Increase number of default RX rings

Remove limitation of netif_get_num_default_rss_queues()
from logic of RX rings default number.

Signed-off-by: Inbar Karmy <inbark@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx4_en: Limit the number of RX rings

Limit the number of RX rings by the number of cores
in the system.

Signed-off-by: Inbar Karmy <inbark@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx4_en: Limit the number of TX rings

Limit the number of TX rings per UP by the number of cores
in the system.

Signed-off-by: Inbar Karmy <inbark@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'hnx3-rxnfc'

Lipeng says:

====================
Support set_ringparam and {set|get}_rxnfc ethtool commands

1, Patch [1/5,2/5] add support for ethtool ops set_ringparam
(ethtool -G) and fix related bug.
2, Patch [3/5,4/5, 5/5] add support for ethtool ops
set_rxnfc/get_rxnfc (-n/-N) and fix related bug.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns3: fix the ring count for ETHTOOL_GRXRINGS

This patch fix the ring count for ETHTOOL_GRXRINGS. Ring count
not TC size should be return for command "ethtool -n ethx".

Signed-off-by: Lipeng <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns3: add support for ETHTOOL_GRXFH

This patch add support for ethtool's ETHTOOL_GRXFH in hns3_get_rxnfc().

Signed-off-by: Lipeng <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns3: add support for set_rxnfc

This patch supports the ethtool's set_rxnfc().

Signed-off-by: Lipeng <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>