openwrt/staging/blogic.git
Nogah Frankel [Wed, 20 Sep 2017 14:15:04 +0000 (16:15 +0200)]
mlxsw: spectrum_switchdev: Save mids list per bridge device

Instead of saving all the mids in the same list, save them per vlan
device. This change allows a more efficient mid find.
Also, the next patches add many loops over all the mids in a bridge
device, for multicast disable, mrouter change and mdb flush.

Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nogah Frankel [Wed, 20 Sep 2017 14:15:03 +0000 (16:15 +0200)]
mlxsw: spectrum_switchdev: Remove reference count from mid

Since there is a bitmap for the ports registered to each mid, there is no
need for a ref count, since it will always be the number of set bits in
this bitmap. Any check of the ref count was replaced with checking if the
bitmap is empty.
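
A minimal user-space sketch of the idea above, not the driver code: a port
bitmap makes a separate reference count redundant, because the count is just
the number of set bits and "unused" is just "bitmap empty". All names
(sketch_mid, ports_in_mid, the 64-port limit) are assumptions made for
illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_MAX_PORTS 64 /* hypothetical limit so one word is enough */

/* Hypothetical stand-in for a mid entry: one bit per registered port. */
struct sketch_mid {
	uint64_t ports_in_mid;
};

static void sketch_mid_port_add(struct sketch_mid *mid, unsigned int port)
{
	assert(port < SKETCH_MAX_PORTS);
	mid->ports_in_mid |= UINT64_C(1) << port;
}

static void sketch_mid_port_del(struct sketch_mid *mid, unsigned int port)
{
	assert(port < SKETCH_MAX_PORTS);
	mid->ports_in_mid &= ~(UINT64_C(1) << port);
}

/* Replaces a "ref_count == 0" test: no bit set means no user left. */
static bool sketch_mid_is_unused(const struct sketch_mid *mid)
{
	return mid->ports_in_mid == 0;
}

/* What a reference count would have tracked: the number of set bits. */
static unsigned int sketch_mid_port_count(const struct sketch_mid *mid)
{
	uint64_t bits = mid->ports_in_mid;
	unsigned int n = 0;

	while (bits) {
		bits &= bits - 1; /* clear the lowest set bit */
		n++;
	}
	return n;
}
```

Adding the same port twice leaves the bitmap unchanged, so the derived count
cannot drift the way a hand-maintained counter can.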

Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nogah Frankel [Wed, 20 Sep 2017 14:15:02 +0000 (16:15 +0200)]
mlxsw: spectrum_switchdev: Add a ports bitmap to the mid db

Add a bitmap of ports to the mid struct to hold the ports that are
registered to this mid.

Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nogah Frankel [Wed, 20 Sep 2017 14:15:01 +0000 (16:15 +0200)]
mlxsw: spectrum_switchdev: Change mc_router to mrouter

Change the naming of mc_router to mrouter to keep consistency.

Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ganesh Goudar [Wed, 20 Sep 2017 06:02:07 +0000 (11:32 +0530)]
cxgb4: add new T5 pci device id's

Add the T5 device IDs 0x50a5, 0x50a6, 0x50a7, 0x50a8 and 0x50a9.

Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 20 Sep 2017 22:57:02 +0000 (15:57 -0700)]
Merge branch 'blackfin-Drop-non-functional-DSA-code'

Florian Fainelli says:

====================
blackfin: Drop non-functional DSA code

I sent those many months ago in the hope that the bfin-linux people
would pick those patches up, but nobody seems to be responding. Can you
queue those via net-next since this affects DSA?
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Wed, 20 Sep 2017 01:03:46 +0000 (18:03 -0700)]
blackfin: ezbrd: Remove non-functional DSA/KSZ8893M code

There is no in-tree driver for the KSZ8893M switch, so just get rid of
the code in that board file.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Wed, 20 Sep 2017 01:03:45 +0000 (18:03 -0700)]
blackfin: tcm-bf518: Remove dsa.h inclusion

Nothing in that file uses definitions from that header, so just get rid of it.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Wed, 20 Sep 2017 01:00:37 +0000 (18:00 -0700)]
net: dsa: Utilize dsa_slave_dev_check()

Instead of open coding the check.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 20 Sep 2017 22:39:59 +0000 (15:39 -0700)]
Revert "bridge: also trigger RTM_NEWLINK when interface is released from bridge"

This reverts commit 00ba4cb36da682c68dc87d1703a8aaffe2b4e9c5.

Discussion with David Ahern determined that this change is
actually not needed.

Signed-off-by: David S. Miller <davem@davemloft.net>
Paolo Abeni [Tue, 19 Sep 2017 10:11:43 +0000 (12:11 +0200)]
udp: do rmem bulk free even if the rx sk queue is empty

The commit 6b229cf77d68 ("udp: add batching to udp_rmem_release()")
greatly reduced the cacheline contention between the BH and the US
reader by batching the rmem updates in most scenarios.

This optimization is explicitly avoided if the US reader is faster
than BH processing.

My fault, I initially suggested this kind of behavior due to concerns
of possible regressions with small sk_rcvbuf values. Tests showed
such concerns are misplaced, so this commit relaxes the condition
for rmem bulk updates, obtaining small but measurable performance
gain in the scenario described above.
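
A rough user-space sketch of the batching described above, assuming a
threshold of one quarter of the receive buffer (the threshold, the struct
and every sketch_* name are assumptions, not taken from the commit). Partial
releases accumulate in a private deficit and touch the shared counter only
once the deficit is large enough, whether or not the receive queue is empty.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the socket fields involved. */
struct sketch_sock {
	int rcvbuf;          /* receive buffer size */
	int rmem_alloc;      /* shared counter, contended with the BH */
	int forward_deficit; /* private, not yet released memory */
	int shared_updates;  /* how often the shared counter was written */
};

/*
 * Release `size` bytes. With partial == true the release is batched:
 * nothing is written to the shared counter until the accumulated
 * deficit reaches rcvbuf / 4. Returns true if the shared counter
 * was updated.
 */
static bool sketch_rmem_release(struct sketch_sock *sk, int size, bool partial)
{
	assert(size >= 0);

	if (partial) {
		sk->forward_deficit += size;
		size = sk->forward_deficit;
		if (size < (sk->rcvbuf >> 2))
			return false; /* keep batching */
	} else {
		size += sk->forward_deficit;
	}

	sk->forward_deficit = 0;
	sk->rmem_alloc -= size;
	sk->shared_updates++;
	return true;
}
```

With rcvbuf = 1000, three partial releases of 100 bytes produce a single
write to the shared counter instead of three.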

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jason Wang [Tue, 19 Sep 2017 09:42:43 +0000 (17:42 +0800)]
virtio-net: support XDP_REDIRECT

This patch adds XDP_REDIRECT support to virtio-net. The changes are
not complex, as we can use the existing XDP_TX helpers for most of the
work. The rest is passing the XDP_TX to the NAPI handler to implement
batching.

Cc: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jason Wang [Tue, 19 Sep 2017 09:42:42 +0000 (17:42 +0800)]
virtio-net: add packet len average only when needed during XDP

There's no need to add packet len average in the case of XDP_PASS
since it will be done soon after skb is created.

Cc: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jason Wang [Tue, 19 Sep 2017 09:42:41 +0000 (17:42 +0800)]
virtio-net: remove unnecessary parameter of virtnet_xdp_xmit()

CC: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Egil Hjelmeland [Tue, 19 Sep 2017 08:09:24 +0000 (10:09 +0200)]
net: dsa: lan9303: Add adjust_link() method

Make the driver react to device tree "fixed-link" declaration on CPU port.

- turn off autonegotiation
- force speed 10 or 100 Mb/s
- force duplex mode
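
The three steps above can be sketched as a pure function on an MII basic
control register value. The bit positions are the standard IEEE 802.3
clause 22 ones; the function name and the idea of computing the value
separately from the register write are assumptions of this sketch, and the
actual driver's register access is not shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Standard MII basic control register bits (IEEE 802.3 clause 22). */
#define SKETCH_BMCR_FULLDPLX 0x0100 /* full duplex */
#define SKETCH_BMCR_ANENABLE 0x1000 /* autonegotiation enable */
#define SKETCH_BMCR_SPEED100 0x2000 /* select 100 Mb/s (else 10 Mb/s) */

/*
 * Compute the control value for a fixed link: autonegotiation off,
 * speed and duplex forced. Unrelated bits in `ctl` are preserved.
 */
static uint16_t sketch_fixed_link_bmcr(uint16_t ctl, int speed_mbps,
				       bool full_duplex)
{
	assert(speed_mbps == 10 || speed_mbps == 100);

	ctl = (uint16_t)(ctl & ~(SKETCH_BMCR_ANENABLE |
				 SKETCH_BMCR_SPEED100 |
				 SKETCH_BMCR_FULLDPLX));
	if (speed_mbps == 100)
		ctl |= SKETCH_BMCR_SPEED100;
	if (full_duplex)
		ctl |= SKETCH_BMCR_FULLDPLX;
	return ctl;
}
```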

Signed-off-by: Egil Hjelmeland <privat@egil-hjelmeland.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vincent Bernat [Sat, 16 Sep 2017 14:18:33 +0000 (16:18 +0200)]
bridge: also trigger RTM_NEWLINK when interface is released from bridge

Currently, when an interface is released from a bridge via
ioctl(), we get a RTM_DELLINK event through netlink:

Deleted 2: dummy0: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 master bridge0 state UNKNOWN
    link/ether 6e:23:c2:54:3a:b3

Userspace has to interpret that as a removal from the bridge, not as a
complete removal of the interface. When a bridged interface is
completely removed, we get two events:

Deleted 2: dummy0: <BROADCAST,NOARP> mtu 1500 master bridge0 state DOWN
    link/ether 6e:23:c2:54:3a:b3
Deleted 2: dummy0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default
    link/ether 6e:23:c2:54:3a:b3 brd ff:ff:ff:ff:ff:ff

In contrast, when an interface is released from a bond, we get a
RTM_NEWLINK with only the new characteristics (no master):

3: dummy1: <BROADCAST,NOARP,SLAVE,UP,LOWER_UP> mtu 1500 qdisc noqueue master bond0 state UNKNOWN group default
    link/ether ae:dc:7a:8c:9a:3c brd ff:ff:ff:ff:ff:ff
3: dummy1: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default
    link/ether ae:dc:7a:8c:9a:3c brd ff:ff:ff:ff:ff:ff
4: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether ae:dc:7a:8c:9a:3c brd ff:ff:ff:ff:ff:ff
3: dummy1: <BROADCAST,NOARP> mtu 1500 qdisc noqueue state DOWN group default
    link/ether ae:dc:7a:8c:9a:3c brd ff:ff:ff:ff:ff:ff
3: dummy1: <BROADCAST,NOARP> mtu 1500 qdisc noqueue state DOWN group default
    link/ether ca:c8:7b:66:f8:25 brd ff:ff:ff:ff:ff:ff
4: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether ae:dc:7a:8c:9a:3c brd ff:ff:ff:ff:ff:ff

Userland may be confused by the fact we say a link is deleted while
its characteristics are only modified. A first solution would have
been to turn the RTM_DELLINK event in del_nbp() into a RTM_NEWLINK
event. However, maybe some piece of userland is relying on this
RTM_DELLINK to detect when a bridged interface is released. Instead,
we also emit a RTM_NEWLINK event once the interface is
released (without master info).

Deleted 2: dummy0: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 master bridge0 state UNKNOWN
    link/ether 8a:bb:e7:94:b1:f8
2: dummy0: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default
    link/ether 8a:bb:e7:94:b1:f8 brd ff:ff:ff:ff:ff:ff

This is done only when using ioctl(). When using Netlink, such an
event is already automatically emitted in do_setlink().

Signed-off-by: Vincent Bernat <vincent@bernat.im>
Signed-off-by: David S. Miller <davem@davemloft.net>
Zhang Shengju [Wed, 20 Sep 2017 00:12:23 +0000 (08:12 +0800)]
macvlan: code refine to check data before using

This patch checks data once, in one place, and returns if it is NULL.

Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xiang Gao [Wed, 20 Sep 2017 16:18:17 +0000 (12:18 -0400)]
ipv6: Use ipv6_authlen for len in ipv6_skip_exthdr

In ipv6_skip_exthdr, the length of the AH header is computed manually
as (hp->hdrlen+2)<<2. However, in include/linux/ipv6.h, a macro
named ipv6_authlen is already defined for exactly the same job. This
commit replaces the manual computation code with the macro.
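
For illustration, a small stand-alone sketch of the equivalence: the AH
length field counts 32-bit words minus 2, so the byte length is
(hdrlen + 2) << 2 either way. The struct and the sketch_* names are
simplified stand-ins, not the kernel definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified extension header: only the fields the computation needs. */
struct sketch_opt_hdr {
	uint8_t nexthdr;
	uint8_t hdrlen; /* for AH: length in 32-bit words, minus 2 */
};

/* Same arithmetic as the expression quoted in the commit message. */
#define sketch_authlen(p) (((p)->hdrlen + 2) << 2)

/* Open-coded variant, as described for the code before the change. */
static int sketch_ah_len_open_coded(const struct sketch_opt_hdr *hp)
{
	assert(hp != NULL);
	return (hp->hdrlen + 2) << 2;
}

/* Variant using the helper macro, as described for the code after it. */
static int sketch_ah_len_macro(const struct sketch_opt_hdr *hp)
{
	assert(hp != NULL);
	return sketch_authlen(hp);
}
```

For hdrlen = 4 both variants give 24 bytes.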

Signed-off-by: Xiang Gao <qasdfgtyuiop@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 19 Sep 2017 23:32:24 +0000 (16:32 -0700)]
Merge branch 'net-speedup-netns-create-delete-time'

Eric Dumazet says:

====================
net: speedup netns create/delete time

When rate of netns creation/deletion is high enough,
we observe softlockups in cleanup_net() caused by huge list
of netns and way too many rcu_barrier() calls.

This patch series does some optimizations in kobject,
and adds batching to tunnels so that netns dismantles are
less costly.

IPv6 addrlabels also get a per netns list, and tcp_metrics
also benefit from batch flushing.

This gives me one order of magnitude gain.
(~50 ms -> ~5 ms for one netns create/delete pair)

Tested:

for i in `seq 1 40`
do
 (for j in `seq 1 100` ; do  unshare -n /bin/true >/dev/null ; done) &
done
wait ; grep net_namespace /proc/slabinfo

Before patch series :

$ time ./add_del_unshare.sh
net_namespace        116    258   5504    1    2 : tunables    8    4    0 : slabdata    116    258      0

real 3m24.910s
user 0m0.747s
sys 0m43.162s

After :
$ time ./add_del_unshare.sh
net_namespace        135    291   5504    1    2 : tunables    8    4    0 : slabdata    135    291      0

real 0m22.117s
user 0m0.728s
sys 0m35.328s
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 19 Sep 2017 23:27:09 +0000 (16:27 -0700)]
ipv4: speedup ipv6 tunnels dismantle

Implement exit_batch() method to dismantle more devices
per round.

(rtnl_lock() ...
 unregister_netdevice_many() ...
 rtnl_unlock())
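
A toy model of the pattern sketched above, with assumed names throughout
(sketch_rtnl and friends are not kernel APIs): tearing down N namespaces one
by one takes the lock N times, while a batched exit collects the devices of
all dying namespaces and takes it once. The real savings presumably also
come from amortizing the work done per bulk unregister, which this model
does not attempt to capture.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the global lock; counts lock round trips. */
struct sketch_rtnl {
	bool held;
	int round_trips;
};

static void sketch_rtnl_lock(struct sketch_rtnl *r)
{
	assert(!r->held);
	r->held = true;
	r->round_trips++;
}

static void sketch_rtnl_unlock(struct sketch_rtnl *r)
{
	assert(r->held);
	r->held = false;
}

/* Bulk unregister: must be called with the lock held. */
static int sketch_unregister_many(const struct sketch_rtnl *r, int ndevs)
{
	assert(r->held);
	return ndevs;
}

/* One lock round trip per namespace. */
static int sketch_exit_per_netns(struct sketch_rtnl *r,
				 const int *tunnels_per_ns, size_t n_ns)
{
	int removed = 0;

	for (size_t i = 0; i < n_ns; i++) {
		sketch_rtnl_lock(r);
		removed += sketch_unregister_many(r, tunnels_per_ns[i]);
		sketch_rtnl_unlock(r);
	}
	return removed;
}

/* Batched: collect everything, then a single lock round trip. */
static int sketch_exit_batch(struct sketch_rtnl *r,
			     const int *tunnels_per_ns, size_t n_ns)
{
	int collected = 0;
	int removed;

	for (size_t i = 0; i < n_ns; i++)
		collected += tunnels_per_ns[i];

	sketch_rtnl_lock(r);
	removed = sketch_unregister_many(r, collected);
	sketch_rtnl_unlock(r);
	return removed;
}
```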

Tested:
$ cat add_del_unshare.sh
for i in `seq 1 40`
do
 (for j in `seq 1 100` ; do unshare -n /bin/true >/dev/null ; done) &
done
wait ; grep net_namespace /proc/slabinfo

Before patch :
$ time ./add_del_unshare.sh
net_namespace        126    282   5504    1    2 : tunables    8    4    0 : slabdata    126    282      0

real    1m38.965s
user    0m0.688s
sys     0m37.017s

After patch:
$ time ./add_del_unshare.sh
net_namespace        135    291   5504    1    2 : tunables    8    4    0 : slabdata    135    291      0

real 0m22.117s
user 0m0.728s
sys 0m35.328s

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 19 Sep 2017 23:27:08 +0000 (16:27 -0700)]
ipv6: speedup ipv6 tunnels dismantle

Implement exit_batch() method to dismantle more devices
per round.

(rtnl_lock() ...
 unregister_netdevice_many() ...
 rtnl_unlock())

Tested:
$ cat add_del_unshare.sh
for i in `seq 1 40`
do
 (for j in `seq 1 100` ; do unshare -n /bin/true >/dev/null ; done) &
done
wait ; grep net_namespace /proc/slabinfo

Before patch :
$ time ./add_del_unshare.sh
net_namespace        110    267   5504    1    2 : tunables    8    4    0 : slabdata    110    267      0

real    3m25.292s
user    0m0.644s
sys     0m40.153s

After patch:

$ time ./add_del_unshare.sh
net_namespace        126    282   5504    1    2 : tunables    8    4    0 : slabdata    126    282      0

real 1m38.965s
user 0m0.688s
sys 0m37.017s

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 19 Sep 2017 23:27:07 +0000 (16:27 -0700)]
tcp: batch tcp_net_metrics_exit

When dealing with a list of dismantling netns, we can scan
tcp_metrics once, saving cpu cycles.
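
A simplified sketch of the single-scan idea, assuming each cached entry
points at its owning namespace and that a dying namespace can be recognized
by a flag (both are assumptions of this sketch, as are all names). One pass
over the table drops the entries of every dying namespace, instead of one
pass per namespace.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical namespace with a "being dismantled" marker. */
struct sketch_netns {
	bool dying;
};

/* Hypothetical cached metrics entry owned by one namespace. */
struct sketch_metric {
	const struct sketch_netns *ns;
	bool in_use;
};

/*
 * Walk the table once and release every entry whose namespace is dying.
 * `visited` counts the entries examined, to make the scan cost visible.
 * Returns the number of entries released.
 */
static size_t sketch_flush_dying(struct sketch_metric *tbl, size_t n,
				 size_t *visited)
{
	size_t flushed = 0;

	assert(tbl != NULL && visited != NULL);
	for (size_t i = 0; i < n; i++) {
		(*visited)++;
		if (tbl[i].in_use && tbl[i].ns->dying) {
			tbl[i].in_use = false;
			flushed++;
		}
	}
	return flushed;
}
```

With two dying namespaces and a five-entry table, this examines five entries
where a per-namespace flush would examine ten.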

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 19 Sep 2017 23:27:06 +0000 (16:27 -0700)]
ipv6: addrlabel: per netns list

Having a global list of labels does not scale to thousands of
netns in the cloud era. This causes quadratic behavior on
netns creation and deletion.

It is time to have a per-netns list of ~10 labels.

Tested:

$ time perf record (for f in `seq 1 3000` ; do ip netns add tast$f; done)
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 3.637 MB perf.data (~158898 samples) ]

real    0m20.837s # instead of 0m24.227s
user    0m0.328s
sys     0m20.338s # instead of 0m23.753s

    16.17%       ip  [kernel.kallsyms]  [k] netlink_broadcast_filtered
    12.30%       ip  [kernel.kallsyms]  [k] netlink_has_listeners
     6.76%       ip  [kernel.kallsyms]  [k] _raw_spin_lock_irqsave
     5.78%       ip  [kernel.kallsyms]  [k] memset_erms
     5.77%       ip  [kernel.kallsyms]  [k] kobject_uevent_env
     5.18%       ip  [kernel.kallsyms]  [k] refcount_sub_and_test
     4.96%       ip  [kernel.kallsyms]  [k] _raw_read_lock
     3.82%       ip  [kernel.kallsyms]  [k] refcount_inc_not_zero
     3.33%       ip  [kernel.kallsyms]  [k] _raw_spin_unlock_irqrestore
     2.11%       ip  [kernel.kallsyms]  [k] unmap_page_range
     1.77%       ip  [kernel.kallsyms]  [k] __wake_up
     1.69%       ip  [kernel.kallsyms]  [k] strlen
     1.17%       ip  [kernel.kallsyms]  [k] __wake_up_common
     1.09%       ip  [kernel.kallsyms]  [k] insert_header
     1.04%       ip  [kernel.kallsyms]  [k] page_remove_rmap
     1.01%       ip  [kernel.kallsyms]  [k] consume_skb
     0.98%       ip  [kernel.kallsyms]  [k] netlink_trim
     0.51%       ip  [kernel.kallsyms]  [k] kernfs_link_sibling
     0.51%       ip  [kernel.kallsyms]  [k] filemap_map_pages
     0.46%       ip  [kernel.kallsyms]  [k] memcpy_erms

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 19 Sep 2017 23:27:05 +0000 (16:27 -0700)]
kobject: factorize skb setup in kobject_uevent_net_broadcast()

We can build one skb and let it be cloned in netlink.

This is much faster, and uses less memory (all clones will
share the same skb->head).

Tested:

time perf record (for f in `seq 1 3000` ; do ip netns add tast$f; done)
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 4.110 MB perf.data (~179584 samples) ]

real    0m24.227s # instead of 0m52.554s
user    0m0.329s
sys     0m23.753s # instead of 0m51.375s

    14.77%       ip  [kernel.kallsyms]  [k] __ip6addrlbl_add
    14.56%       ip  [kernel.kallsyms]  [k] netlink_broadcast_filtered
    11.65%       ip  [kernel.kallsyms]  [k] netlink_has_listeners
     6.19%       ip  [kernel.kallsyms]  [k] _raw_spin_lock_irqsave
     5.66%       ip  [kernel.kallsyms]  [k] kobject_uevent_env
     4.97%       ip  [kernel.kallsyms]  [k] memset_erms
     4.67%       ip  [kernel.kallsyms]  [k] refcount_sub_and_test
     4.41%       ip  [kernel.kallsyms]  [k] _raw_read_lock
     3.59%       ip  [kernel.kallsyms]  [k] refcount_inc_not_zero
     3.13%       ip  [kernel.kallsyms]  [k] _raw_spin_unlock_irqrestore
     1.55%       ip  [kernel.kallsyms]  [k] __wake_up
     1.20%       ip  [kernel.kallsyms]  [k] strlen
     1.03%       ip  [kernel.kallsyms]  [k] __wake_up_common
     0.93%       ip  [kernel.kallsyms]  [k] consume_skb
     0.92%       ip  [kernel.kallsyms]  [k] netlink_trim
     0.87%       ip  [kernel.kallsyms]  [k] insert_header
     0.63%       ip  [kernel.kallsyms]  [k] unmap_page_range

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 19 Sep 2017 23:27:04 +0000 (16:27 -0700)]
kobject: copy env blob in one go

No need to iterate over strings, just copy in one efficient memcpy() call.
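
A user-space sketch of why a single copy is enough, assuming the environment
strings are stored back to back, each with its terminating NUL, in one
buffer whose used length is tracked (the struct layout, sizes and sketch_*
names are assumptions for illustration). Copying string by string and
copying the whole blob yield identical bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_NUM_ENVP 8
#define SKETCH_BUF_SIZE 256

/* Hypothetical env: strings packed consecutively into buf. */
struct sketch_env {
	char *envp[SKETCH_NUM_ENVP];
	int envp_idx;
	char buf[SKETCH_BUF_SIZE];
	size_t buflen;
};

/* Append "KEY=value" to the packed buffer; returns 0 or -1 if full. */
static int sketch_env_add(struct sketch_env *env, const char *s)
{
	size_t len;

	assert(s != NULL);
	len = strlen(s) + 1; /* keep the terminating NUL */
	if (env->envp_idx >= SKETCH_NUM_ENVP ||
	    env->buflen + len > SKETCH_BUF_SIZE)
		return -1;
	memcpy(env->buf + env->buflen, s, len);
	env->envp[env->envp_idx++] = env->buf + env->buflen;
	env->buflen += len;
	return 0;
}

/* Old style: iterate over the strings, one strlen + copy each. */
static size_t sketch_copy_per_string(const struct sketch_env *env, char *dst)
{
	size_t off = 0;

	for (int i = 0; i < env->envp_idx; i++) {
		size_t len = strlen(env->envp[i]) + 1;

		memcpy(dst + off, env->envp[i], len);
		off += len;
	}
	return off;
}

/* New style: the strings are already contiguous, so copy the blob. */
static size_t sketch_copy_blob(const struct sketch_env *env, char *dst)
{
	memcpy(dst, env->buf, env->buflen);
	return env->buflen;
}
```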

Tested:
time perf record "(for f in `seq 1 3000` ; do ip netns add tast$f; done)"
[ perf record: Woken up 10 times to write data ]
[ perf record: Captured and wrote 8.224 MB perf.data (~359301 samples) ]

real    0m52.554s  # instead of 1m7.492s
user    0m0.309s
sys     0m51.375s # instead of 1m6.875s

     9.88%       ip  [kernel.kallsyms]  [k] netlink_broadcast_filtered
     8.86%       ip  [kernel.kallsyms]  [k] string
     7.37%       ip  [kernel.kallsyms]  [k] __ip6addrlbl_add
     5.68%       ip  [kernel.kallsyms]  [k] netlink_has_listeners
     5.52%       ip  [kernel.kallsyms]  [k] memcpy_erms
     4.76%       ip  [kernel.kallsyms]  [k] __alloc_skb
     4.54%       ip  [kernel.kallsyms]  [k] vsnprintf
     3.94%       ip  [kernel.kallsyms]  [k] format_decode
     3.80%       ip  [kernel.kallsyms]  [k] kmem_cache_alloc_node_trace
     3.71%       ip  [kernel.kallsyms]  [k] kmem_cache_alloc_node
     3.66%       ip  [kernel.kallsyms]  [k] kobject_uevent_env
     3.38%       ip  [kernel.kallsyms]  [k] strlen
     2.65%       ip  [kernel.kallsyms]  [k] _raw_spin_lock_irqsave
     2.20%       ip  [kernel.kallsyms]  [k] kfree
     2.09%       ip  [kernel.kallsyms]  [k] memset_erms
     2.07%       ip  [kernel.kallsyms]  [k] ___cache_free
     1.95%       ip  [kernel.kallsyms]  [k] kmem_cache_free
     1.91%       ip  [kernel.kallsyms]  [k] _raw_read_lock
     1.45%       ip  [kernel.kallsyms]  [k] ksize
     1.25%       ip  [kernel.kallsyms]  [k] _raw_spin_unlock_irqrestore
     1.00%       ip  [kernel.kallsyms]  [k] widen_string

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 19 Sep 2017 23:27:03 +0000 (16:27 -0700)]
kobject: add kobject_uevent_net_broadcast()

This removes some #ifdef pollution and will ease follow up patches.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cong Wang [Tue, 19 Sep 2017 20:15:42 +0000 (13:15 -0700)]
net_sched: no need to free qdisc in RCU callback

gen estimator has been rewritten in commit 1c0d32fde5bd
("net_sched: gen_estimator: complete rewrite of rate estimators"),
the caller no longer needs to wait for a grace period. So this
patch gets rid of it.

Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jim Hanko [Tue, 19 Sep 2017 18:33:39 +0000 (11:33 -0700)]
team: fall back to hash if table entry is empty

If the hash to port mapping table does not have a valid port (i.e. when
a port goes down), fall back to the simple hashing mechanism to avoid
dropping packets.
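
A small sketch of the fallback described above. The table size, the struct
layout and the choice of "hash modulo number of enabled ports" as the simple
hashing mechanism are assumptions of this sketch, as are the sketch_* names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_TABLE_SIZE 256 /* one slot per 8-bit hash value */

struct sketch_port {
	int id;
};

struct sketch_team {
	/* Configured hash-to-port mapping; NULL means no valid port. */
	struct sketch_port *hash_to_port[SKETCH_TABLE_SIZE];
	/* Ports currently able to transmit. */
	struct sketch_port **enabled;
	size_t n_enabled;
};

/*
 * Pick the transmit port for a hash value. Use the mapping table when
 * it has a valid entry, otherwise fall back to plain hashing over the
 * enabled ports. Returns NULL only if no port is enabled at all.
 */
static struct sketch_port *
sketch_select_tx_port(const struct sketch_team *team, uint8_t hash)
{
	struct sketch_port *port;

	assert(team != NULL);
	port = team->hash_to_port[hash];
	if (port)
		return port;
	if (team->n_enabled == 0)
		return NULL;
	return team->enabled[hash % team->n_enabled];
}
```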

Signed-off-by: Jim Hanko <hanko@drivescale.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 19 Sep 2017 23:15:48 +0000 (16:15 -0700)]
Merge branch 'test_rhashtable-dont-allocate-huge-static-array'

Florian Westphal says:

====================
test_rhashtable: don't allocate huge static array

Add a test case for the rhlist interface.
While at it, cleanup current rhashtable test a bit and add a check
for max_size support.

No changes since v1, except in last patch.
kbuild robot complained about large onstack allocation caused by
struct rhltable when lockdep is enabled.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Westphal [Tue, 19 Sep 2017 23:12:14 +0000 (01:12 +0200)]
test_rhashtable: add test case for rhl_table interface

Also test rhltable. rhltable remove operations are slow as
deletions require a list walk, thus test with 1/16th of the given
entry count to get a run duration similar to the rhashtable one.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Westphal [Tue, 19 Sep 2017 23:12:13 +0000 (01:12 +0200)]
test_rhashtable: add a check for max_size

Add a test that tries to insert more than max_size elements.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Westphal [Tue, 19 Sep 2017 23:12:12 +0000 (01:12 +0200)]
test_rhashtable: don't use global entries variable

Pass the entries to test as an argument instead.
A follow-up patch will add an rhlist test case; rhlist delete operations
are slow, so we need to use a smaller number to test it.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Westphal [Tue, 19 Sep 2017 23:12:11 +0000 (01:12 +0200)]
test_rhashtable: don't allocate huge static array

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 19 Sep 2017 23:08:54 +0000 (16:08 -0700)]
Merge branch 'dsa-b53-bcm_sf2-cleanups'

Florian Fainelli says:

====================
net: dsa: b53/bcm_sf2 cleanups

This patch series is a first-pass set of clean-ups to reduce the number of LOCs
between b53 and bcm_sf2 and share as many functions as possible.

There are a number of additional cleanups queued up locally that require more
thorough testing.

Changes in v3:

- remove one extra argument for the b53_build_io_op macro (David Laight)
- added additional Reviewed-by tags from Vivien

Changes in v2:

- added Reviewed-by tags from Vivien
- added a missing EXPORT_SYMBOL() in patch 8
- fixed a typo in patch 5
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Tue, 19 Sep 2017 17:46:54 +0000 (10:46 -0700)]
net: dsa: bcm_sf2: Utilize b53_{enable, disable}_port

Export b53_{enable,disable}_port and use these two functions in
bcm_sf2_port_setup and bcm_sf2_port_disable. The generic functions
cannot be used without wrapping because we need to manage additional
switch integration details (PHY, Broadcom tag etc.).

Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Tue, 19 Sep 2017 17:46:53 +0000 (10:46 -0700)]
net: dsa: bcm_sf2: Use SF2_NUM_EGRESS_QUEUES for CFP

The magic number 8 in 3 locations in bcm_sf2_cfp.c actually designates
the number of switch port egress queues, so use that define instead of
open-coding it.

Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Tue, 19 Sep 2017 17:46:52 +0000 (10:46 -0700)]
net: dsa: b53: Export b53_imp_vlan_setup()

bcm_sf2 and b53 do exactly the same thing, so share that piece.

Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Tue, 19 Sep 2017 17:46:51 +0000 (10:46 -0700)]
net: dsa: b53: Wire-up EEE

Add support for enabling and disabling EEE, as well as re-negotiating it in
.adjust_link() and in .port_enable().

Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Tue, 19 Sep 2017 17:46:50 +0000 (10:46 -0700)]
net: dsa: b53: Move EEE functions to b53

Move the bcm_sf2 EEE-related functions to the b53 driver because this is shared
code amongst Gigabit-capable switches; only the 5325 and 5365 are too old to
support that.

Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Tue, 19 Sep 2017 17:46:49 +0000 (10:46 -0700)]
net: dsa: b53: Define EEE register page

In preparation for migrating the EEE code from bcm_sf2 to b53, define the full
EEE register page and offsets within that page.

Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Tue, 19 Sep 2017 17:46:48 +0000 (10:46 -0700)]
net: dsa: b53: Move Broadcom header setup to b53

The code to enable Broadcom tags/headers is largely switch independent,
and in preparation for enabling it for multiple devices with b53, move
the code we have in bcm_sf2.c to b53_common.c

Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Tue, 19 Sep 2017 17:46:47 +0000 (10:46 -0700)]
net: dsa: b53: Use a macro to define I/O operations

Instead of repeating the same pattern: acquire mutex, read/write,
release mutex, define a macro: b53_build_op() which takes the type
(read|write), I/O size, and value (scalar or pointer). This helps with
fixing bugs that could exist (e.g: missing barrier, lock etc.).
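
A user-space sketch of the pattern, not the driver's macro: one macro
generates every "lock, call the backend op, unlock" wrapper, so the locking
cannot be forgotten in one of them. The lock is a hypothetical counter-based
stand-in for a mutex, and the macro's parameter list here (op name and value
type) is an assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a mutex that records how it is used. */
struct sketch_lock {
	int depth;
	int acquisitions;
};

static void sketch_lock_acquire(struct sketch_lock *l)
{
	assert(l->depth == 0);
	l->depth++;
	l->acquisitions++;
}

static void sketch_lock_release(struct sketch_lock *l)
{
	assert(l->depth == 1);
	l->depth--;
}

struct sketch_dev;

/* Backend I/O operations; reads take a pointer, writes a scalar. */
struct sketch_io_ops {
	int (*read8)(struct sketch_dev *dev, uint8_t page, uint8_t reg,
		     uint8_t *val);
	int (*write8)(struct sketch_dev *dev, uint8_t page, uint8_t reg,
		      uint8_t val);
};

struct sketch_dev {
	struct sketch_lock reg_lock;
	const struct sketch_io_ops *ops;
	uint8_t regs[256]; /* fake register file for the demo backend */
};

/* Generate a locked wrapper sketch_<op>() around dev->ops-><op>(). */
#define SKETCH_BUILD_OP(op, val_type)					\
static int sketch_##op(struct sketch_dev *dev, uint8_t page,		\
		       uint8_t reg, val_type val)			\
{									\
	int ret;							\
									\
	sketch_lock_acquire(&dev->reg_lock);				\
	ret = dev->ops->op(dev, page, reg, val);			\
	sketch_lock_release(&dev->reg_lock);				\
	return ret;							\
}

SKETCH_BUILD_OP(read8, uint8_t *)
SKETCH_BUILD_OP(write8, uint8_t)

/* Demo backend: ignores the page and uses the fake register file. */
static int sketch_backend_read8(struct sketch_dev *dev, uint8_t page,
				uint8_t reg, uint8_t *val)
{
	(void)page;
	*val = dev->regs[reg];
	return 0;
}

static int sketch_backend_write8(struct sketch_dev *dev, uint8_t page,
				 uint8_t reg, uint8_t val)
{
	(void)page;
	dev->regs[reg] = val;
	return 0;
}

static const struct sketch_io_ops sketch_backend_ops = {
	.read8 = sketch_backend_read8,
	.write8 = sketch_backend_write8,
};
```

Adding a 16-bit or 32-bit accessor is then one more SKETCH_BUILD_OP() line
plus the backend callback, with the locking inherited from the macro.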

Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Tue, 19 Sep 2017 17:46:46 +0000 (10:46 -0700)]
net: dsa: bcm_sf2: Defer port enabling to calling port_enable

There is no need to configure the enabled ports once in bcm_sf2_sw_setup() and
then a second time when dsa_switch_ops::port_enable is called. Just do it when
port_enable is called, which is better in terms of power consumption and
correctness.

Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Tue, 19 Sep 2017 17:46:45 +0000 (10:46 -0700)]
net: dsa: b53: Defer port enabling to calling port_enable

There is no need to configure the enabled ports once in b53_setup() and then a
second time when dsa_switch_ops::port_enable is called. Just do it when
port_enable is called, which is better in terms of power consumption and
correctness.

Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Tue, 19 Sep 2017 17:46:44 +0000 (10:46 -0700)]
net: dsa: b53: Make b53_enable_cpu_port() take a port argument

In preparation for future changes allowing the configuring of multiple
CPU ports, make b53_enable_cpu_port() take a port argument.

Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Tue, 19 Sep 2017 17:46:43 +0000 (10:46 -0700)]
net: dsa: b53: Remove is_cpu_port()

This is not used anywhere, so remove it.

Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 19 Sep 2017 23:04:23 +0000 (16:04 -0700)]
Merge branch 'dsa-master-ethtool-move'

Vivien Didelot says:

====================
net: dsa: move master ethtool code

The DSA core overrides the master device's ethtool_ops structure so that
it can inject statistics and such of its dedicated switch CPU port.

This ethtool code is currently called under unnecessary conditions or
before the master interface and its switch CPU port get wired up.
This patchset fixes this.

Similarly to slave.c where the DSA slave net_device is the entry point
of the dsa_slave_* functions, this patchset also isolates the master's
ethtool code in a new master.c file, where the DSA master net_device is
the entry point of the dsa_master_* functions.

This is a first step towards better control of the master device and
support for multiple CPU ports.
====================

Tested-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Tue, 19 Sep 2017 15:57:00 +0000 (11:57 -0400)]
net: dsa: move master ethtool code

DSA overrides the master device ethtool ops, so that it can inject stats
from its dedicated switch CPU port as well.

The related code is currently split between dsa.c and slave.c, but it only
concerns the master net device. Move it to a new master.c DSA core file.

This file will later be extended with master net device specific code.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: dsa: setup master ethtool after dsa_ptr
Vivien Didelot [Tue, 19 Sep 2017 15:56:59 +0000 (11:56 -0400)]
net: dsa: setup master ethtool after dsa_ptr

DSA overrides the master's ethtool ops so that we can inject its CPU
port's statistics. Because of that, we need to set up the ethtool ops
after the master's dsa_ptr pointer has been assigned, not before.

This patch sets up the ethtool ops after dsa_ptr is assigned, and
restores them before it gets cleared.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: dsa: setup master ethtool unconditionally
Vivien Didelot [Tue, 19 Sep 2017 15:56:58 +0000 (11:56 -0400)]
net: dsa: setup master ethtool unconditionally

When a DSA switch tree is meant to be applied, it already has a CPU
port. Thus remove the condition of dst->cpu_dp.

Moreover, the next lines access dst->cpu_dp unconditionally.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: dsa: remove copy of master ethtool_ops
Vivien Didelot [Tue, 19 Sep 2017 15:56:57 +0000 (11:56 -0400)]
net: dsa: remove copy of master ethtool_ops

There is no need to store a copy of the master ethtool ops, storing the
original pointer in DSA and the new one in the master netdev itself is
enough.

In the meantime, set orig_ethtool_ops to NULL when restoring the master
ethtool ops and check the presence of the master original ethtool ops as
well as its needed functions before calling them.
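
As a rough illustration of the pointer handling described above, the
following self-contained C sketch keeps only the original pointer and
guards every call through it. The struct layouts and all names ending
in _sketch are hypothetical stand-ins, not the kernel definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-ins for the kernel types. */
struct ethtool_ops_sketch {
	int (*get_sset_count)(void *dev, int sset);
};

struct net_device_sketch {
	const struct ethtool_ops_sketch *ethtool_ops;
};

struct cpu_port_sketch {
	/* Saved pointer to the master's original ops; no copy is kept. */
	const struct ethtool_ops_sketch *orig_ethtool_ops;
};

/* Install the DSA ops on the master and remember the original pointer. */
void master_ethtool_setup_sketch(struct net_device_sketch *master,
				 struct cpu_port_sketch *cpu,
				 const struct ethtool_ops_sketch *dsa_ops)
{
	assert(master && cpu && dsa_ops);
	cpu->orig_ethtool_ops = master->ethtool_ops;
	master->ethtool_ops = dsa_ops;
}

/* Put the original ops back and clear the saved pointer. */
void master_ethtool_restore_sketch(struct net_device_sketch *master,
				   struct cpu_port_sketch *cpu)
{
	assert(master && cpu);
	master->ethtool_ops = cpu->orig_ethtool_ops;
	cpu->orig_ethtool_ops = NULL;
}

/* Call the original hook only if both the ops and the hook are present. */
int master_get_sset_count_sketch(const struct cpu_port_sketch *cpu,
				 void *dev, int sset)
{
	const struct ethtool_ops_sketch *ops = cpu->orig_ethtool_ops;
	int count = 0;

	if (ops && ops->get_sset_count)
		count = ops->get_sset_count(dev, sset);
	return count;
}
```

With this shape, a master whose driver provides no ethtool ops at all,
or no get_sset_count hook, simply contributes a count of zero.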

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoselftests: rtnetlink.sh: add test case for device ifalias
Florian Westphal [Tue, 19 Sep 2017 12:42:17 +0000 (14:42 +0200)]
selftests: rtnetlink.sh: add test case for device ifalias

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: sk_buff rbnode reorg
Eric Dumazet [Tue, 19 Sep 2017 12:14:24 +0000 (05:14 -0700)]
net: sk_buff rbnode reorg

skb->rbnode shares space with skb->next, skb->prev and skb->tstamp

Current uses (TCP receive ofo queue and netem) need to save/restore
tstamp, while skb->dev is either NULL (TCP) or a constant for a given
queue (netem).

Since we plan to use an RB tree for the TCP retransmit queue to speed up
SACK processing with large BDP, this patch exchanges skb->dev and
skb->tstamp.

This saves some overhead in both TCP and netem.

v2: removes the swtstamp field from struct tcp_skb_cb
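
The effect of the exchange can be sketched with a reduced struct. The
layout below is an assumption for illustration only (struct skb_sketch
and struct rb_node_sketch are hypothetical, and the real sk_buff has
many more fields): once dev sits inside the union and tstamp outside
it, linking the buffer into a tree no longer clobbers the timestamp.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's rb_node: three pointer-sized words. */
struct rb_node_sketch {
	uintptr_t parent_color;
	void *right;
	void *left;
};

/* Reduced layout after the exchange (C11 anonymous struct/union). */
struct skb_sketch {
	union {
		struct {
			struct skb_sketch *next;
			struct skb_sketch *prev;
			void *dev;	/* now shares space with rbnode */
		};
		struct rb_node_sketch rbnode;
	};
	int64_t tstamp;			/* now outside the union */
};

/* Model "put the skb in an RB tree": every word of rbnode is overwritten. */
void skb_sketch_link_into_tree(struct skb_sketch *skb)
{
	assert(skb != NULL);
	memset(&skb->rbnode, 0, sizeof(skb->rbnode));
}
```

In this model only next, prev and dev are lost when rbnode is written,
which matches the observation that dev is NULL or constant for the
current users while tstamp is the value they had to save and restore.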

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Soheil Hassas Yeganeh <soheil@google.com>
Cc: Wei Wang <weiwan@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch 'mlxsw-Prepare-for-multicast-router-offload'
David S. Miller [Tue, 19 Sep 2017 21:21:41 +0000 (14:21 -0700)]
Merge branch 'mlxsw-Prepare-for-multicast-router-offload'

Jiri Pirko says:

====================
mlxsw: Prepare for multicast router offload

Yotam says:

This patch-set makes various preparations needed for the multicast router
offloading, which include:
 - Add the needed registers.
 - Add needed ACL actions.
 - Add new traps and trap groups.
 - Export needed private structs and enums.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agomlxsw: spectrum: Add multicast router traps and trap groups
Yotam Gigi [Tue, 19 Sep 2017 08:00:20 +0000 (10:00 +0200)]
mlxsw: spectrum: Add multicast router traps and trap groups

Add three new traps needed for multicast routing:
 - PIM: Trap for PIM protocol control packets.
 - RPF: Trap for packets that fail the RPF check on a specific hardware
   route entry.
 - MULTICAST: Generic trap for multicast. It is used for routes that trap
   the packets to the CPU.

The RPF and MULTICAST traps have rate limiters because packets may hit
these traps at line rate. The PIM trap has a rate limiter, like other L3
control protocols. The rate limiters are implemented by adding three new
trap groups for the newly introduced traps.

Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agomlxsw: spectrum_router: Export RIF dev access function
Yotam Gigi [Tue, 19 Sep 2017 08:00:19 +0000 (10:00 +0200)]
mlxsw: spectrum_router: Export RIF dev access function

The mlxsw_sp_rif struct, defined as a private struct in spectrum_router.c,
will be used in the multicast router source file. Because the multicast
router logic needs the dev field, add an access function for it.

Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agomlxsw: reg: Configure RIF to forward IPv4 multicast packets by default
Yotam Gigi [Tue, 19 Sep 2017 08:00:18 +0000 (10:00 +0200)]
mlxsw: reg: Configure RIF to forward IPv4 multicast packets by default

Turn on two bits in the Spectrum RIF configuration:
 - IPv4 multicast: when a multicast packet arrives on a RIF, send it
   through the multicast route lookup.
 - IPv4 multicast forwarding enable: when a multicast packet arrives on a
   RIF, allow it to be forwarded by multicast routes. If this bit is not
   set, multicast packets will go through the multicast routing lookup but
   will be dropped at the egress of the ports.

Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agomlxsw: reg: Add Router Rules Copy Register
Yotam Gigi [Tue, 19 Sep 2017 08:00:17 +0000 (10:00 +0200)]
mlxsw: reg: Add Router Rules Copy Register

The RRCR register is used for copying and moving TCAM multicast routes
between different offsets. It will be used to allow route relocation for
parman ops as part of the multicast router offloading logic.

Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agomlxsw: reg: Add the Router Multicast Forwarding Table Version 2 register
Yotam Gigi [Tue, 19 Sep 2017 08:00:16 +0000 (10:00 +0200)]
mlxsw: reg: Add the Router Multicast Forwarding Table Version 2 register

The RMFT-V2 register is used to configure and query the multicast table and
will be used by the multicast router offloading logic.

Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agomlxsw: resources: Add multicast ERIF list entries resource
Yotam Gigi [Tue, 19 Sep 2017 08:00:15 +0000 (10:00 +0200)]
mlxsw: resources: Add multicast ERIF list entries resource

The multicast ERIF list entries resource indicates the number of entries
that can be put in one rigr2 register operation. While the register can
hold up to MLXSW_REG_RIGR2_MAX_ERIFS ( = 32) ERIF entries, the actual
number allowed by firmware is indicated with this resource.
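
One consequence, sketched below in plain C, is that a long egress list
has to be split across several register operations. The chunking logic
and the function name erif_ops_needed_sketch() are assumptions made for
illustration; only the limit of 32 entries comes from the text above.

```c
#include <assert.h>

#define RIGR2_MAX_ERIFS_SKETCH 32	/* register capacity named above */

/*
 * Number of register operations needed to program n_erifs entries when
 * firmware reports fw_max_entries per operation (hypothetical helper).
 */
unsigned int erif_ops_needed_sketch(unsigned int n_erifs,
				    unsigned int fw_max_entries)
{
	unsigned int per_op = fw_max_entries;

	/* The register itself can never carry more than its own maximum. */
	if (per_op > RIGR2_MAX_ERIFS_SKETCH)
		per_op = RIGR2_MAX_ERIFS_SKETCH;
	assert(per_op > 0);

	/* Round up: a partially filled last operation still counts. */
	return (n_erifs + per_op - 1) / per_op;
}
```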

Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agomlxsw: reg: Add the Router Interface Group Version 2 register
Yotam Gigi [Tue, 19 Sep 2017 08:00:14 +0000 (10:00 +0200)]
mlxsw: reg: Add the Router Interface Group Version 2 register

The RIGR-V2 register is used to add, remove and query the egress interface
list of a multicast forwarding entry. It will be used by the multicast
router offloading logic.

Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agomlxsw: reg: Add The Router TCAM Allocation register
Yotam Gigi [Tue, 19 Sep 2017 08:00:13 +0000 (10:00 +0200)]
mlxsw: reg: Add The Router TCAM Allocation register

This register is used for allocation of regions in the TCAM table and it
will be used by the multicast router offloading logic.

Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agomlxsw: reg: Rename the flexible action set length field
Yotam Gigi [Tue, 19 Sep 2017 08:00:12 +0000 (10:00 +0200)]
mlxsw: reg: Rename the flexible action set length field

The MLXSW_REG_PXXX_FLEX_ACTION_SET_LEN is relevant for the multicast router
registers too, so rename it to have a general name which is not bound to a
specific register.

Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agomlxsw: acl: Change trap ACL action to get the trap_id as a parameter
Yotam Gigi [Tue, 19 Sep 2017 08:00:11 +0000 (10:00 +0200)]
mlxsw: acl: Change trap ACL action to get the trap_id as a parameter

Allow the trap ACL action to be configured with different traps. This
allows the multicast router offloading code to use that same ACL action
with the multicast router traps. By using different traps, the multicast
router can have different trap policies and can handle the packet
differently.

Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agomlxsw: acl: Introduce mcrouter ACL action
Yotam Gigi [Tue, 19 Sep 2017 08:00:10 +0000 (10:00 +0200)]
mlxsw: acl: Introduce mcrouter ACL action

The Spectrum multicast forwarding is done using an ACL action. Add the
mcrouter ACL action that will be used to offload the multicast router
logic.

Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agomlxsw: spectrum: Move ACL flexible actions instance to spectrum
Yotam Gigi [Tue, 19 Sep 2017 08:00:09 +0000 (10:00 +0200)]
mlxsw: spectrum: Move ACL flexible actions instance to spectrum

A flexible action instance allows, given a set of ops, creating, committing
and sharing a set of ACL action blocks. The flexible action instance in
question is using the spectrum KVD linear space to store the flexible
action sets.

Move this flexible action instance to the common spectrum struct to allow
other users (such as multicast router) to get that functionality.

Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agomlxsw: spectrum: Change init order
Yotam Gigi [Tue, 19 Sep 2017 08:00:08 +0000 (10:00 +0200)]
mlxsw: spectrum: Change init order

The multicast router offloading code is going to require the counter_pools
initialization to occur before the router initialization, so change the
spectrum initialization order accordingly.

Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch 'bpf-lpm-delete'
David S. Miller [Tue, 19 Sep 2017 20:55:15 +0000 (13:55 -0700)]
Merge branch 'bpf-lpm-delete'

Craig Gallek says:

====================
Implement delete for BPF LPM trie

This was previously left as a TODO.  Add the implementation and
extend the test to cover it.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agobpf: Test deletion in BPF_MAP_TYPE_LPM_TRIE
Craig Gallek [Mon, 18 Sep 2017 19:30:57 +0000 (15:30 -0400)]
bpf: Test deletion in BPF_MAP_TYPE_LPM_TRIE

Extend the 'random' operation tests to include a delete operation
(delete half of the nodes from both lpm implementations and ensure
that lookups are still equivalent).

Also, add a simple IPv4 test which verifies lookup behavior as nodes
are deleted from the tree.

Signed-off-by: Craig Gallek <kraig@google.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agobpf: Add uniqueness invariant to trivial lpm test implementation
Craig Gallek [Mon, 18 Sep 2017 19:30:56 +0000 (15:30 -0400)]
bpf: Add uniqueness invariant to trivial lpm test implementation

The 'trivial' lpm implementation in this test allows equivalent nodes
to be added (that is, nodes consisting of the same prefix and prefix
length).  For lookup operations, this is fine because insertion happens
at the head of the (singly linked) list and the first, best match is
returned.  In order to support deletion, the tlpm data structure must
first enforce uniqueness.  This change modifies the insertion algorithm
to search for equivalent nodes and remove them.  Note: the
BPF_MAP_TYPE_LPM_TRIE already has a uniqueness invariant that is
implemented as node replacement.
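
The modified insertion can be sketched as follows. This is not the
selftest code: the node layout, the fixed 4-byte key and the _sketch
names are hypothetical, and keys are assumed to have their unused
trailing bits zeroed so that comparing the bytes covered by the prefix
is enough.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define KEY_BYTES_SKETCH 4	/* assumption: IPv4-sized keys */

struct tlpm_node_sketch {
	struct tlpm_node_sketch *next;
	size_t n_bits;
	unsigned char key[KEY_BYTES_SKETCH];
};

/* Unlink and free every node with the same prefix and prefix length. */
struct tlpm_node_sketch *tlpm_remove_equal_sketch(struct tlpm_node_sketch *list,
						  const unsigned char *key,
						  size_t n_bits)
{
	struct tlpm_node_sketch **link = &list;

	while (*link) {
		struct tlpm_node_sketch *node = *link;

		if (node->n_bits == n_bits &&
		    !memcmp(node->key, key, (n_bits + 7) / 8)) {
			*link = node->next;
			free(node);
		} else {
			link = &node->next;
		}
	}
	return list;
}

/* Insert at the head, after dropping any equivalent node first. */
struct tlpm_node_sketch *tlpm_add_sketch(struct tlpm_node_sketch *list,
					 const unsigned char *key,
					 size_t n_bits)
{
	struct tlpm_node_sketch *node;

	assert(key && n_bits <= 8 * KEY_BYTES_SKETCH);
	list = tlpm_remove_equal_sketch(list, key, n_bits);

	node = calloc(1, sizeof(*node));
	if (!node)
		return list;	/* allocation failed: list unchanged */
	node->next = list;
	node->n_bits = n_bits;
	memcpy(node->key, key, (n_bits + 7) / 8);
	return node;
}

/* Count nodes; handy for checking the uniqueness invariant. */
size_t tlpm_count_sketch(const struct tlpm_node_sketch *list)
{
	size_t n = 0;

	for (; list; list = list->next)
		n++;
	return n;
}
```

Adding the same prefix twice therefore leaves a single node for it,
which is what a later delete operation relies on.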

Signed-off-by: Craig Gallek <kraig@google.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agobpf: Implement map_delete_elem for BPF_MAP_TYPE_LPM_TRIE
Craig Gallek [Mon, 18 Sep 2017 19:30:55 +0000 (15:30 -0400)]
bpf: Implement map_delete_elem for BPF_MAP_TYPE_LPM_TRIE

This is a simple non-recursive delete operation.  It prunes paths
of empty nodes in the tree, but it does not try to further compress
the tree as nodes are removed.
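
To make the pruning idea concrete, here is a sketch on a deliberately
simplified structure: an uncompressed binary trie with parent pointers,
one node per key bit. This is not the kernel's lpm_trie layout and all
names are hypothetical; it only shows an iterative delete that walks
back up and frees nodes left without a value or children.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

struct trie_node_sketch {
	struct trie_node_sketch *parent;
	struct trie_node_sketch *child[2];
	bool has_value;
};

/* Bit i of the key, counting from the most significant bit. */
static unsigned int key_bit_sketch(uint32_t key, unsigned int i)
{
	return (key >> (31 - i)) & 1;
}

/* Insert a prefix. A failed allocation may leave empty interior nodes. */
bool trie_insert_sketch(struct trie_node_sketch *root, uint32_t key,
			unsigned int prefixlen)
{
	struct trie_node_sketch *node = root;
	unsigned int i;

	assert(root && prefixlen <= 32);
	for (i = 0; i < prefixlen; i++) {
		unsigned int b = key_bit_sketch(key, i);

		if (!node->child[b]) {
			struct trie_node_sketch *n = calloc(1, sizeof(*n));

			if (!n)
				return false;
			n->parent = node;
			node->child[b] = n;
		}
		node = node->child[b];
	}
	node->has_value = true;
	return true;
}

/* Exact-match lookup of a prefix. */
bool trie_contains_sketch(const struct trie_node_sketch *root, uint32_t key,
			  unsigned int prefixlen)
{
	const struct trie_node_sketch *node = root;
	unsigned int i;

	assert(root && prefixlen <= 32);
	for (i = 0; i < prefixlen && node; i++)
		node = node->child[key_bit_sketch(key, i)];
	return node && node->has_value;
}

/* Non-recursive delete: clear the value, then prune empty nodes upwards. */
bool trie_delete_sketch(struct trie_node_sketch *root, uint32_t key,
			unsigned int prefixlen)
{
	struct trie_node_sketch *node = root;
	unsigned int i;

	assert(root && prefixlen <= 32);
	for (i = 0; i < prefixlen && node; i++)
		node = node->child[key_bit_sketch(key, i)];
	if (!node || !node->has_value)
		return false;

	node->has_value = false;
	/* Stop at the root, at a node holding a value, or at a branch. */
	while (node != root && !node->has_value &&
	       !node->child[0] && !node->child[1]) {
		struct trie_node_sketch *parent = node->parent;

		parent->child[parent->child[1] == node] = NULL;
		free(node);
		node = parent;
	}
	return true;
}
```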

Signed-off-by: Craig Gallek <kraig@google.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agovsock: vmci: Remove unneeded linux/miscdevice.h include
Corentin Labbe [Mon, 18 Sep 2017 18:18:55 +0000 (20:18 +0200)]
vsock: vmci: Remove unneeded linux/miscdevice.h include

net/vmw_vsock/vmci_transport.c does not use any miscdevice, so this patch
removes this unnecessary inclusion.

Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: mvpp2: remove useless goto
Antoine Tenart [Mon, 18 Sep 2017 13:36:51 +0000 (15:36 +0200)]
net: mvpp2: remove useless goto

Remove a goto in the PPv2 tx function which jumps to the next line
anyway. This is a cosmetic commit.

Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet_sched: sch_htb: add per class overlimits counter
Eric Dumazet [Mon, 18 Sep 2017 19:36:22 +0000 (12:36 -0700)]
net_sched: sch_htb: add per class overlimits counter

HTB qdisc overlimits counter is properly increased, but we have no per
class counter, meaning it is difficult to diagnose HTB problems.

This patch adds this counter, visible in "tc -s class show dev eth0",
with current iproute2.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Denys Fedoryshchenko <nuclearcat@nuclearcat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet_sched: use explicit size of struct tcmsg, remove need to declare tcm
Colin Ian King [Mon, 18 Sep 2017 11:40:38 +0000 (12:40 +0100)]
net_sched: use explicit size of struct tcmsg, remove need to declare tcm

Pointer tcm is being initialized and is never read, it is only being used
to determine the size of struct tcmsg.  Clean this up by removing
variable tcm and explicitly using the sizeof struct tcmsg rather than *tcm.
Cleans up clang warning:

warning: Value stored to 'tcm' during its initialization is never read
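
The pattern behind the warning, and the cleaned-up form, look roughly
like this outside the kernel. struct tcmsg_sketch and both helper names
are hypothetical; the point is only that sizeof never reads the pointer,
so the variable exists purely to name a type.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reduced header struct, standing in for struct tcmsg. */
struct tcmsg_sketch {
	unsigned char family;
	int ifindex;
	unsigned int handle;
};

/* Before: a pointer is initialised only so that sizeof can look at it. */
size_t hdrlen_via_pointer_sketch(const void *payload)
{
	const struct tcmsg_sketch *tcm = payload;	/* never dereferenced */

	assert(payload);
	return sizeof(*tcm);
}

/* After: name the type directly; no variable is needed. */
size_t hdrlen_via_type_sketch(void)
{
	return sizeof(struct tcmsg_sketch);
}
```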

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch 'korina-performance-fixes-and-cleanup'
David S. Miller [Mon, 18 Sep 2017 23:50:07 +0000 (16:50 -0700)]
Merge branch 'korina-performance-fixes-and-cleanup'

Roman Yeryomin says:

====================
korina: performance fixes and cleanup

Changes from v1:
- use GRO instead of increasing ring size
- use NAPI_POLL_WEIGHT instead of defining own NAPI_WEIGHT
- optimize rx descriptor flags processing
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: korina: bump version
Roman Yeryomin [Sun, 17 Sep 2017 17:25:21 +0000 (20:25 +0300)]
net: korina: bump version

Signed-off-by: Roman Yeryomin <roman@advem.lv>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: korina: update authors
Roman Yeryomin [Sun, 17 Sep 2017 17:25:11 +0000 (20:25 +0300)]
net: korina: update authors

Signed-off-by: Roman Yeryomin <roman@advem.lv>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: korina: whitespace cleanup
Roman Yeryomin [Sun, 17 Sep 2017 17:25:02 +0000 (20:25 +0300)]
net: korina: whitespace cleanup

Signed-off-by: Roman Yeryomin <roman@advem.lv>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: korina: use GRO
Roman Yeryomin [Sun, 17 Sep 2017 17:24:50 +0000 (20:24 +0300)]
net: korina: use GRO

Performance gain when receiving locally is 55->95Mbps and 50->65Mbps for NAT.

Signed-off-by: Roman Yeryomin <roman@advem.lv>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: korina: use NAPI_POLL_WEIGHT
Roman Yeryomin [Sun, 17 Sep 2017 17:24:38 +0000 (20:24 +0300)]
net: korina: use NAPI_POLL_WEIGHT

Signed-off-by: Roman Yeryomin <roman@advem.lv>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: korina: optimize rx descriptor flags processing
Roman Yeryomin [Sun, 17 Sep 2017 17:24:26 +0000 (20:24 +0300)]
net: korina: optimize rx descriptor flags processing

Signed-off-by: Roman Yeryomin <roman@advem.lv>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: korina: don't use overflow and underflow interrupts
Roman Yeryomin [Sun, 17 Sep 2017 17:24:15 +0000 (20:24 +0300)]
net: korina: don't use overflow and underflow interrupts

When such interrupts occur there is not much we can do.
Dropping the whole ring doesn't help and only produces high packet loss.
If we just ignore the interrupt, the mac will drop one or a few packets instead of the whole ring.
Also this will lower the irq handling load and increase performance.

Signed-off-by: Roman Yeryomin <roman@advem.lv>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agohamradio: baycom: use new parport device model
Sudip Mukherjee [Sun, 17 Sep 2017 11:46:20 +0000 (12:46 +0100)]
hamradio: baycom: use new parport device model

Modify baycom driver to use the new parallel port device model.

Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Acked-By: Thomas Sailer <t.sailer@alumni.ethz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet/ethernet/freescale: fix warning for ucc_geth
Valentin Longchamp [Fri, 15 Sep 2017 05:58:47 +0000 (07:58 +0200)]
net/ethernet/freescale: fix warning for ucc_geth

uf_info.regs is resource_size_t i.e. phys_addr_t that can be either u32
or u64 according to CONFIG_PHYS_ADDR_T_64BIT.

The printk format is thus adapted to u64 and the regs value cast to u64
to take both u32 and u64 into account.
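
The same idea in portable userspace C, as a sketch: pick one wide
format and cast the value to it, so the call is correct whichever width
the typedef has. phys_addr_sketch_t and format_regs_sketch() are
hypothetical names, and snprintf stands in for printk here.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in; switch to uint32_t to model a 32-bit build. */
typedef uint64_t phys_addr_sketch_t;

/* Format the register base; returns the length snprintf reports. */
int format_regs_sketch(char *buf, size_t len, phys_addr_sketch_t regs)
{
	assert(buf && len > 0);
	/* The cast makes the argument match %llx for either typedef width. */
	return snprintf(buf, len, "regs=0x%llx", (unsigned long long)regs);
}
```

Without the cast, a 32-bit typedef passed to a 64-bit format (or the
reverse) is a format mismatch that the compiler warns about.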

Signed-off-by: Valentin Longchamp <valentin.longchamp@keymile.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoforcedeth: replace pci_map_single with dma_map_single functions
Zhu Yanjun [Fri, 15 Sep 2017 03:01:51 +0000 (23:01 -0400)]
forcedeth: replace pci_map_single with dma_map_single functions

pci_map_single functions are obsolete. So replace them with
dma_map_single functions.

Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch 's390-qeth-next'
David S. Miller [Mon, 18 Sep 2017 21:41:38 +0000 (14:41 -0700)]
Merge branch 's390-qeth-next'

Julian Wiedmann says:

====================
s390/qeth: updates 2017-09-18

first batch of patches for 4.15. One larger item in there is Hans'
addition of new configuration options for flexible packet processing
('VNIC characteristics'). The patch descriptions have all the details.
Please apply.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agos390/qeth: tidy up parameter naming for qeth_do_send_packet()
Jens Remus [Mon, 18 Sep 2017 19:18:22 +0000 (21:18 +0200)]
s390/qeth: tidy up parameter naming for qeth_do_send_packet()

Cppcheck reports the following for drivers/s390/net/qeth_core.h:

    warning - line 1560 - Function 'qeth_do_send_packet' argument order
    different:
    declaration 'card, queue, skb, hdr, hd_len, offset, elements'
    definition  'card, queue, skb, hdr, offset, hd_len, elements_needed'.

Match the naming in the function's declaration against its definition.
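
Why this matters can be shown with a small sketch. C binds arguments by
position, so parameter names in a declaration are documentation only
and a mismatch compiles silently. The function send_sketch() is
hypothetical and unrelated to the real qeth code.

```c
#include <assert.h>

/* Declaration as a header might carry it: names in the wrong order. */
int send_sketch(int hd_len, int offset);

/* Definition: the first parameter is really the offset. */
int send_sketch(int offset, int hd_len)
{
	assert(offset >= 0 && hd_len >= 0);
	/* Encode both values so the binding is visible in the result. */
	return offset * 100 + hd_len;
}
```

A caller who trusts the declaration and writes send_sketch(14, 2),
meaning hd_len 14 and offset 2, actually passes offset 14 and hd_len 2.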

Signed-off-by: Jens Remus <jremus@linux.vnet.ibm.com>
Reviewed-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agos390/qeth: fold VLAN handling into l3_rebuild_skb()
Julian Wiedmann [Mon, 18 Sep 2017 19:18:21 +0000 (21:18 +0200)]
s390/qeth: fold VLAN handling into l3_rebuild_skb()

Move the overly complicated VLAN processing from the L3 RX handler into
its l3_rebuild_skb() helper. No change in functionality.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Reviewed-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agos390/qeth: translate SETVLAN/DELVLAN errors
Julian Wiedmann [Mon, 18 Sep 2017 19:18:20 +0000 (21:18 +0200)]
s390/qeth: translate SETVLAN/DELVLAN errors

Properly return any error encountered during VLAN processing to the
caller.
Resulting change in behaviour: if SETVLAN fails while registering a
new VLAN ID, the stack no longer creates the corresponding vlan device.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Reviewed-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agos390/qeth: simplify L3 sysfs group management
Julian Wiedmann [Mon, 18 Sep 2017 19:18:19 +0000 (21:18 +0200)]
s390/qeth: simplify L3 sysfs group management

Use the right helpers to create/remove all attribute groups in one go.

Suggested-by: Hans Wippel <hwippel@linux.vnet.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Reviewed-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agos390/qeth: don't take queue lock in send_packet_fast()
Julian Wiedmann [Mon, 18 Sep 2017 19:18:18 +0000 (21:18 +0200)]
s390/qeth: don't take queue lock in send_packet_fast()

Locking the output queue prior to TX is needed on OSA devices,
to synchronize against a packing flush from the TX completion code
(via qeth_check_outbound_queue()).
But send_packet_fast() is only used for IQDs, which don't do packing.
So remove the locking, and apply some easy cleanups.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Reviewed-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agos390/qeth: remove unused code in qdio_establish_cq()
Julian Wiedmann [Mon, 18 Sep 2017 19:18:17 +0000 (21:18 +0200)]
s390/qeth: remove unused code in qdio_establish_cq()

Storing the number of input buffers into 'i' has no effect, it is
immediately re-assigned in the next line.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Reviewed-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agos390/qeth: add VNICC get/set timeout support
Hans Wippel [Mon, 18 Sep 2017 19:18:16 +0000 (21:18 +0200)]
s390/qeth: add VNICC get/set timeout support

HiperSockets allow configuring so-called VNIC Characteristics (VNICC)
that influence how the underlying hardware handles packets. For VNICCs,
additional commands for getting and setting timeouts are available.
Currently, the learning VNICC uses these commands.

* Learning VNICC: If learning is enabled on a qeth device, the device
  learns the source MAC addresses of outgoing packets and incoming
  packets to those learned MAC addresses are received.

For learning, the timeout specifies the idle period in seconds, after
which the underlying hardware removes a learned MAC address again.

This patch adds support for the IPA commands that are required to get
and set the current timeout values for the learning VNIC characteristic.
Also, it introduces the sysfs interface that allows users to configure
the timeout.

Signed-off-by: Hans Wippel <hwippel@linux.vnet.ibm.com>
Reviewed-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agos390/qeth: add VNICC enable/disable support
Hans Wippel [Mon, 18 Sep 2017 19:18:15 +0000 (21:18 +0200)]
s390/qeth: add VNICC enable/disable support

HiperSocket devices allow enabling and disabling so-called VNIC
Characteristics (VNICC) that influence how the underlying hardware
handles packets. These VNICCs are:

* Flooding VNICC: Flooding allows specifying if packets to unknown
  destination MAC addresses are received by the qeth device.

* Multicast flooding VNICC: Multicast flooding allows specifying if
  packets to multicast MAC addresses are received by the qeth device.

* Learning VNICC: If learning is enabled on a qeth device, the device
  learns the source MAC addresses of outgoing packets and incoming
  packets to those learned MAC addresses are received.

* Takeover setvmac VNICC: If takeover setvmac is configured on a qeth
  device, the MAC address of this device can be configured on a
  different qeth device with the setvmac IPA command.

* Takeover by learning VNICC: If takeover learning is enabled on a qeth
  device, the MAC address of this device can be learned (learning VNICC)
  on a different qeth device.

* BridgePort invisible VNICC: If BridgePort invisible is enabled on a
  qeth device, (1) packets from this device are not sent to a BridgePort
  enabled qeth device and (2) packets coming from a BridgePort enabled
  qeth device are not received by this device.

* Receive broadcast VNICC: Receive broadcast allows configuring if a
  qeth device receives packets with the broadcast destination MAC
  address.

This patch adds support for the IPA commands that are required to enable
and disable these VNIC characteristics on qeth devices. As a
prerequisite, it also adds the query commands IPA command.

The query commands IPA command allows requesting the supported commands
for each characteristic from the underlying hardware.

Additionally, this patch provides users with a sysfs user interface to
enable/disable the VNICCs mentioned above.

Signed-off-by: Hans Wippel <hwippel@linux.vnet.ibm.com>
Reviewed-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agos390/qeth: add basic VNICC support
Hans Wippel [Mon, 18 Sep 2017 19:18:14 +0000 (21:18 +0200)]
s390/qeth: add basic VNICC support

VNIC Characteristics (VNICC) are features of HiperSockets that define
how packets are handled by the underlying network hardware. For example,
if the VNICC flooding is configured on a qeth device, ethernet frames to
unknown destination MAC addresses are received.

Currently, there is support for seven VNICCs: flooding, multicast
flooding, receive broadcast, learning, takeover learning, takeover
setvmac, bridge invisible. Also, six IPA commands exist for configuring
VNICCs on a qeth device: query characteristics, query commands, enable
characteristic, disable characteristic, set timeout, get timeout.

This patch adds the basic code infrastructure for VNICC support to qeth.
It allows querying VNICC support from the underlying hardware. To this
end, it adds:

* basic message formats for IPA commands
* basic data structures
* basic error handling
* query characteristics IPA command support

The query characteristics IPA command allows requesting the currently
supported and currently enabled VNIC characteristics from the underlying
hardware.

Support for the other IPA commands and for the configuration of VNICCs
is added in follow-up patches together with the respective user
interface functions.
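
One way to picture the result of a query characteristics command is a
pair of bit masks, one for supported and one for enabled
characteristics. The sketch below assumes that representation; the bit
assignments, struct and function names are hypothetical and do not
reflect the real IPA message format.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit assignments for the seven characteristics. */
enum vnicc_sketch {
	VNICC_SKETCH_FLOODING		= 1u << 0,
	VNICC_SKETCH_MCAST_FLOODING	= 1u << 1,
	VNICC_SKETCH_RX_BCAST		= 1u << 2,
	VNICC_SKETCH_LEARNING		= 1u << 3,
	VNICC_SKETCH_TAKEOVER_LEARNING	= 1u << 4,
	VNICC_SKETCH_TAKEOVER_SETVMAC	= 1u << 5,
	VNICC_SKETCH_BRIDGE_INVISIBLE	= 1u << 6,
};

/* What a query characteristics reply carries in this sketch. */
struct vnicc_info_sketch {
	uint32_t sup_chars;	/* supported by the hardware */
	uint32_t cur_chars;	/* currently enabled */
};

/* True if the hardware reports the characteristic as supported. */
bool vnicc_is_supported_sketch(const struct vnicc_info_sketch *info,
			       uint32_t ch)
{
	assert(info);
	return (info->sup_chars & ch) == ch;
}

/* True if the characteristic is supported and currently enabled. */
bool vnicc_is_enabled_sketch(const struct vnicc_info_sketch *info,
			     uint32_t ch)
{
	assert(info);
	return vnicc_is_supported_sketch(info, ch) &&
	       (info->cur_chars & ch) == ch;
}
```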

Signed-off-by: Hans Wippel <hwippel@linux.vnet.ibm.com>
Reviewed-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agodt-bindings: net: renesas-ravb: Add support for R8A77995 RAVB
Yoshihiro Shimoda [Thu, 14 Sep 2017 00:06:38 +0000 (09:06 +0900)]
dt-bindings: net: renesas-ravb: Add support for R8A77995 RAVB

Add a new compatible string for the R8A77995 (R-Car D3) RAVB.

Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Acked-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Acked-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: Convert int functions to bool
Joe Perches [Wed, 13 Sep 2017 20:58:15 +0000 (13:58 -0700)]
net: Convert int functions to bool

Global function ipv6_rcv_saddr_equal and static functions
ipv6_rcv_saddr_equal and ipv4_rcv_saddr_equal currently return int.

bool is slightly more descriptive for these functions so change
their return type from int to bool.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoravb: document R8A77970 bindings
Sergei Shtylyov [Tue, 12 Sep 2017 20:02:08 +0000 (23:02 +0300)]
ravb: document R8A77970 bindings

R-Car V3M (R8A77970) SoC also has the R-Car gen3 compatible EtherAVB
device, so document the SoC specific bindings.

Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Acked-by: Simon Horman <horms+renesas@verge.net.au>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: phy: realtek: add RTL8201F phy-id and functions
Jassi Brar [Tue, 12 Sep 2017 09:54:36 +0000 (18:54 +0900)]
net: phy: realtek: add RTL8201F phy-id and functions

Add RTL8201F phy-id and the related functions to the driver.

The original patch is as follows:
https://patchwork.kernel.org/patch/2538341/

Signed-off-by: Jongsung Kim <neidhard.kim@lge.com>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>