openwrt/staging/blogic.git
5 years agonet: i825xx: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
Yang Wei [Thu, 14 Feb 2019 14:45:38 +0000 (22:45 +0800)]
net: i825xx: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles

dev_consume_skb_irq() should be called in i596_interrupt() when skb
xmit done. It makes drop profiles(dropwatch, perf) more friendly.

Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agolib: objagg: fix handling of object with 0 users when assembling hints
Jiri Pirko [Thu, 14 Feb 2019 14:39:07 +0000 (15:39 +0100)]
lib: objagg: fix handling of object with 0 users when assembling hints

It is possible that there might be an originally parent object with 0
direct users that is in hints no longer considered as parent. Then the
weight of this object is 0 and current code ignores him. That's why the
total amount of hint objects might be lower than for the original
objagg and WARN_ON is hit. Fix this be considering 0 weight valid.

Fixes: 9069a3817d82 ("lib: objagg: implement optimization hints assembly and use hints for object creation")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'cxgb4-SGE-doorbell-queue-timer'
David S. Miller [Thu, 14 Feb 2019 17:39:35 +0000 (12:39 -0500)]
Merge branch 'cxgb4-SGE-doorbell-queue-timer'

Vishal Kulkarni says:

====================
cxgb4/cxgb4vfSupport for SGE doorbell queue timer

This series of patchs add SGE doorbell queue timer for faster DMA completions.

Patch 1 Implements SGE doorbell queue timer

Patch 2 Adds ethtool capability to set/get SGE doorbell queue timer tick

v2
- Reverse christmas tree formatting for local variables.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agocxgb4: Add capability to get/set SGE Doorbell Queue Timer Tick
Vishal Kulkarni [Thu, 14 Feb 2019 12:49:16 +0000 (18:19 +0530)]
cxgb4: Add capability to get/set SGE Doorbell Queue Timer Tick

This patch gets/sets SGE Doorbell Queue timer ticks via ethtool

Original work by: Casey Leedom <leedom@chelsio.com>

Signed-off-by: Vishal Kulkarni <vishal@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agocxgb4/cxgb4vf: Add support for SGE doorbell queue timer
Vishal Kulkarni [Thu, 14 Feb 2019 12:49:15 +0000 (18:19 +0530)]
cxgb4/cxgb4vf: Add support for SGE doorbell queue timer

T6 introduced a Timer Mechanism in SGE called the
SGE Doorbell Queue Timer. With this we can now configure
TX Queues to get CIDX Updates when:

    Time(CIDX == PIDX) >= Timer

Previously we rely on TX Queue Status Page updates by hardware
for DMA completions. This will make Hardware/Firmware actually
deliver the CIDX Updates as Ingress Queue messages with
commensurate Interrupts.

So we now have a new RX Path component for processing CIDX Updates
and reclaiming TX Descriptors faster.

Original work by: Casey Leedom <leedom@chelsio.com>

Signed-off-by: Vishal Kulkarni <vishal@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agosfc: Replace dev_kfree_skb_any by dev_consume_skb_any
Huang Zijiang [Thu, 14 Feb 2019 06:42:13 +0000 (14:42 +0800)]
sfc: Replace dev_kfree_skb_any by dev_consume_skb_any

The skb should be freed by dev_consume_skb_any() in efx_tx_tso_fallback()
when skb is still used. The skb will be replaced by segments, so the
original skb should be consumed(not drop).

Signed-off-by: Huang Zijiang <huang.zijiang@zte.com.cn>
Acked-by: Bert Kenward <bkenward@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet:ethernet:cadence: Replace dev_kfree_skb_any by dev_consume_skb_any
Huang Zijiang [Thu, 14 Feb 2019 06:41:18 +0000 (14:41 +0800)]
net:ethernet:cadence: Replace dev_kfree_skb_any by dev_consume_skb_any

The skb should be freed by dev_consume_skb_any() in macb_pad_and_fcs()
when *skb is still used. The *skb is be replaced by nskb, so the
original *skb should be consumed(not drop).

Signed-off-by: Huang Zijiang <huang.zijiang@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet:dl2k: Replace dev_kfree_skb_irq by dev_consume_skb_irq
Huang Zijiang [Thu, 14 Feb 2019 06:40:56 +0000 (14:40 +0800)]
net:dl2k: Replace dev_kfree_skb_irq by dev_consume_skb_irq

dev_consume_skb_irq() should be called when skb xmit
done.It makes drop profiles more friendly.

Signed-off-by: Huang Zijiang <huang.zijiang@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet:dl2k: Modify the code style escaping the warning
Huang Zijiang [Thu, 14 Feb 2019 06:40:31 +0000 (14:40 +0800)]
net:dl2k: Modify the code style escaping the warning

modify the code style in order to removing the following warning
when excute the script checkpatch.pl
WARNING: space prohibited between function name and open parenthesis '('

Signed-off-by: Huang Zijiang <huang.zijiang@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoisdn:hisax: Replace dev_kfree_skb_any by dev_consume_skb_any
Huang Zijiang [Thu, 14 Feb 2019 06:39:59 +0000 (14:39 +0800)]
isdn:hisax: Replace dev_kfree_skb_any by dev_consume_skb_any

The skb should be freed by dev_consume_skb_any() in hfcpci_fill_fifo()
when bcs->tx_skb is still used. The bcs->tx_skb is be replaced by
skb_dequeue(&bcs->squeue), so the original bcs->tx_skb should
be consumed(not drop).

Signed-off-by: Huang Zijiang <huang.zijiang@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: ipvlan_l3s: fix kconfig dependency warning
Randy Dunlap [Wed, 13 Feb 2019 16:55:02 +0000 (08:55 -0800)]
net: ipvlan_l3s: fix kconfig dependency warning

Fix the kconfig warning in IPVLAN_L3S when neither INET nor IPV6
is enabled:

WARNING: unmet direct dependencies detected for NET_L3_MASTER_DEV
  Depends on [n]: NET [=y] && (INET [=n] || IPV6 [=n])
  Selected by [y]:
  - IPVLAN_L3S [=y] && NETDEVICES [=y] && NET_CORE [=y] && NETFILTER [=y]

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Mahesh Bandewar <maheshb@google.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: nuvoton: w90p910_ether: replace dev_kfree_skb_irq by dev_consume_skb_irq for...
Yang Wei [Wed, 13 Feb 2019 15:21:02 +0000 (23:21 +0800)]
net: nuvoton: w90p910_ether: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles

dev_consume_skb_irq() should be called in w90p910_ether_start_xmit()
when skb xmit done. It makes drop profiles(dropwatch, perf) more
friendly.

Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: natsemi: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
Yang Wei [Wed, 13 Feb 2019 15:19:14 +0000 (23:19 +0800)]
net: natsemi: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles

dev_consume_skb_irq() should be called when skb xmit done. It makes
drop profiles(dropwatch, perf) more friendly.

Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: micrel: ks8695net: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop...
Yang Wei [Wed, 13 Feb 2019 15:18:09 +0000 (23:18 +0800)]
net: micrel: ks8695net: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles

dev_consume_skb_irq() should be called in ks8695_tx_irq() when skb
xmit done. It makes drop profiles(dropwatch, perf) more friendly.

Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: sgi: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
Yang Wei [Wed, 13 Feb 2019 15:17:06 +0000 (23:17 +0800)]
net: sgi: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles

dev_consume_skb_irq() should be called when skb xmit done. It makes
drop profiles(dropwatch, perf) more friendly.

Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: myri10ge: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
Yang Wei [Wed, 13 Feb 2019 15:15:43 +0000 (23:15 +0800)]
net: myri10ge: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles

dev_consume_skb_irq() should be called in myri10ge_tx_done() when
skb xmit done. It makes drop profiles(dropwatch, perf) more friendly.

Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: amd: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
Yang Wei [Wed, 13 Feb 2019 15:14:54 +0000 (23:14 +0800)]
net: amd: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles

dev_consume_skb_irq() should be called when skb xmit done. It makes
drop profiles(dropwatch, perf) more friendly.

Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: dlink: sundance: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
Yang Wei [Wed, 13 Feb 2019 15:12:02 +0000 (23:12 +0800)]
net: dlink: sundance: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles

dev_consume_skb_irq() should be called in intr_handler() when skb
xmit done. It makes drop profiles(dropwatch, perf) more friendly.

Remove a redundant blank line in intr_handler().

Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'uapi-Add-a-new-header-for-time-types'
David S. Miller [Thu, 14 Feb 2019 16:51:51 +0000 (11:51 -0500)]
Merge branch 'uapi-Add-a-new-header-for-time-types'

Deepa Dinamani says:

====================
uapi: Add a new header for time types

The series aims at adding a new time header: time_types.h.  This header
is what will eventually hold all the uapi time types that we plan to
leave across the interfaces after the y2038 cleanup.

The series was discussed with Arnd Bergmann.

The second patch fixes the errqueue.h header, which has a dependency on
these types.

Note that there may be a trivial merge conflict with linux-next
c70a772fda11 ("y2038: remove struct definition redirects").
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoerrqueue.h: Include time_types.h
Deepa Dinamani [Wed, 13 Feb 2019 03:26:04 +0000 (19:26 -0800)]
errqueue.h: Include time_types.h

Now that we have a separate header for struct __kernel_timespec,
include it directly without relying on userspace to do it.

Reported-by: Ran Rozenstein <ranro@mellanox.com>
Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agotime: Add time_types.h
Deepa Dinamani [Wed, 13 Feb 2019 03:26:03 +0000 (19:26 -0800)]
time: Add time_types.h

sys/time.h is the mandated include for many time related
defines. However, linux/time.h overlaps sys/time.h
significantly and this makes including both from userspace
or one from the other impossible.

This also means that userspace can get away with including
sys/time.h whenever it needs linux/time.h and this is what's
been happening in the user world usually.

But, we have new data types that we plan to use in the uapi time
interfaces also defined in the linux/time.h. But, we are unable
to use these types when sys/time.h is included.

Hence, move the new types to a new header, time_types.h.
We intend to eventually have all the uapi defines that the kernel
uses defined in this header.
Note that the plan is to replace uapi interfaces with timeval to
use __kernel_old_timeval, timespec to use __kernel_old_timespec etc.

Reported-by: Ran Rozenstein <ranro@mellanox.com>
Fixes: 9718475e6908 ("socket: Add SO_TIMESTAMPING_NEW")
Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'devlink-region-read-fixes'
David S. Miller [Thu, 14 Feb 2019 16:45:39 +0000 (11:45 -0500)]
Merge branch 'devlink-region-read-fixes'

Parav Pandit says:

====================
devlink: 2 fixes for devlink region read

This 2 patches consist of fixes for devlink region read handling.

v0->v1:
 - Fixed typo from user to use
v1->v2:
 - Rebased
====================

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agodevlink: Fix list access without lock while reading region
Parav Pandit [Tue, 12 Feb 2019 20:24:08 +0000 (14:24 -0600)]
devlink: Fix list access without lock while reading region

While finding the devlink device during region reading,
devlink device list is accessed and devlink device is
returned without holding a lock. This could lead to use-after-free
accesses.

While at it, add lockdep assert to ensure that all future callers hold
the lock when calling devlink_get_from_attrs().

Fixes: 4e54795a27f5 ("devlink: Add support for region snapshot read command")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agodevlink: Return right error code in case of errors for region read
Parav Pandit [Tue, 12 Feb 2019 20:23:58 +0000 (14:23 -0600)]
devlink: Return right error code in case of errors for region read

devlink_nl_cmd_region_read_dumpit() misses to return right error code on
most error conditions.
Return the right error code on such errors.

Fixes: 4e54795a27f5 ("devlink: Add support for region snapshot read command")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agobonding: check slave set command firstly
Tonghao Zhang [Mon, 11 Feb 2019 18:49:48 +0000 (10:49 -0800)]
bonding: check slave set command firstly

This patch is a little improvement. If user use the
command shown as below, we should print the info [1]
instead of [2]. The eth0 exists actually, and it may
confuse user.

$ echo "eth0" > /sys/class/net/bond4/bonding/slaves

[1] "bond4: no command found in slaves file - use +ifname or -ifname"
[2] "write error: No such device"

Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'mlxsw-hwmon-and-thermal-extensions'
David S. Miller [Thu, 14 Feb 2019 06:33:03 +0000 (22:33 -0800)]
Merge branch 'mlxsw-hwmon-and-thermal-extensions'

Ido Schimmel says:

====================
mlxsw: hwmon and thermal extensions

Vadim says:

This patchset contains various improvements to hwmon and thermal code in
mlxsw. The most significant improvement is the ability to read modules'
temperature attributes (input, fault, critical and emergency thresholds)
as well as fans' fault indication. These new attributes will improve the
ability to monitor the system.

Patches #1-#4 add the necessary device registers and APIs to read
modules' temperature attributes and fans' fault indication.

Patches #5-#8 perform small improvements in hwmon and thermal code such
as using a more indicative name for cooling devices.

Patch #9 exposes fans' fault indication via hwmon.

Patch #10 exposes modules' temperature attributes via hwmon.

Patch #11 adds an hwmon label to modules' temperature sensor. This helps
to parse the output of utilities such as "sensors".

Patch #12 allows to bind an external cooling device ("mlxreg-fan") to
mlxsw thermal zone. This will allow the mlxsw thermal zone to change the
cooling level of cooling devices not programmed via switch registers.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agomlxsw: core: Allow thermal zone binding to an external cooling device
Vadim Pasternak [Wed, 13 Feb 2019 11:28:56 +0000 (11:28 +0000)]
mlxsw: core: Allow thermal zone binding to an external cooling device

Allow thermal zone binding to an external cooling device from the
cooling devices white list.

It provides support for Mellanox next generation systems on which
cooling device logic is not controlled through the switch registers.

Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agomlxsw: core: Add QSFP module temperature label attribute to hwmon
Vadim Pasternak [Wed, 13 Feb 2019 11:28:55 +0000 (11:28 +0000)]
mlxsw: core: Add QSFP module temperature label attribute to hwmon

Add label attribute to hwmon object for exposing QSFP module's
temperature sensor name. Modules are labeled as "front panel xxx". The
label is used by utilities such as "sensors":

front panel 001:   +0.0C  (crit =  +0.0C, emerg =  +0.0C)
..
front panel 020:  +31.0C  (crit = +70.0C, emerg = +80.0C)
..
front panel 056:  +41.0C  (crit = +70.0C, emerg = +80.0C)

Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agomlxsw: core: Extend hwmon interface with QSFP module temperature attributes
Vadim Pasternak [Wed, 13 Feb 2019 11:28:54 +0000 (11:28 +0000)]
mlxsw: core: Extend hwmon interface with QSFP module temperature attributes

Add new attributes to hwmon object for exposing QSFP module temperature
input, fault indication, critical and emergency thresholds. Temperature
input and fault indication are read from Management Temperature Bulk
Register. Temperature thresholds are read from Management Cable Info
Access Register.

Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agomlxsw: core: Extend hwmon interface with fan fault attribute
Vadim Pasternak [Wed, 13 Feb 2019 11:28:53 +0000 (11:28 +0000)]
mlxsw: core: Extend hwmon interface with fan fault attribute

Add new fan hwmon attribute for exposing fan faults (fault indication is
read from Fan Out of Range Event Register).

Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agomlxsw: core: Rename cooling device
Vadim Pasternak [Wed, 13 Feb 2019 11:28:52 +0000 (11:28 +0000)]
mlxsw: core: Rename cooling device

Rename cooling device from "Fan" to "mlxsw_fan".  Name "Fan" is too
common name, and such name is misleading, while it's interpreted by
user. For example name "Fan" could be used by ACPI.

Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agomlxsw: core: Replace thermal temperature trips with defines
Vadim Pasternak [Wed, 13 Feb 2019 11:28:51 +0000 (11:28 +0000)]
mlxsw: core: Replace thermal temperature trips with defines

Replace thermal hardcoded temperature trip values with defines.

Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agomlxsw: core: Modify thermal zone definition
Vadim Pasternak [Wed, 13 Feb 2019 11:28:50 +0000 (11:28 +0000)]
mlxsw: core: Modify thermal zone definition

Modify thermal zone trip points setting for better alignment with system
thermal requirement.

Add hysteresis thresholds for thermal trips in order to avoid throttling
around thermal trip point. If hysteresis temperature is not considered,
PWM can have side effect of flip up/down on thermal trip point boundary.

Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agomlxsw: core: Set different thermal polling time based on bus frequency capability
Vadim Pasternak [Wed, 13 Feb 2019 11:28:48 +0000 (11:28 +0000)]
mlxsw: core: Set different thermal polling time based on bus frequency capability

Add low frequency bus capability in order to allow core functionality
separation based on bus type. Driver could run over PCIe, which is
considered as high frequency bus or I2C, which is considered as low
frequency bus. In the last case time setting, for example, for thermal
polling interval, should be increased.

Use different thermal monitoring based on bus type. For I2C bus time is
set to 20 seconds, while for PCIe 1 second polling interval is used.

Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agomlxsw: core: Add API for QSFP module temperature thresholds reading
Vadim Pasternak [Wed, 13 Feb 2019 11:28:47 +0000 (11:28 +0000)]
mlxsw: core: Add API for QSFP module temperature thresholds reading

Add new API to read QSFP module's temperature thresholds - warning and
critical.

New internal API reads the temperature thresholds from the modules,
which are equipped with the thermal sensor. These thresholds will be
exposed via hwmon subsystem.

Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agomlxsw: reg: Add Fan Out of Range Event Register
Vadim Pasternak [Wed, 13 Feb 2019 11:28:46 +0000 (11:28 +0000)]
mlxsw: reg: Add Fan Out of Range Event Register

Add FORE (Fan Out of Range Event Register), which is used for fan fault
reading.

Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agomlxsw: reg: Add Management Temperature Bulk Register
Vadim Pasternak [Wed, 13 Feb 2019 11:28:45 +0000 (11:28 +0000)]
mlxsw: reg: Add Management Temperature Bulk Register

Add MTBR (Management Temperature Bulk Register), which is used for port
temperature reading in a bulk mode.

Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agomlxsw: spectrum: Move QSFP EEPROM definitions to common location
Vadim Pasternak [Wed, 13 Feb 2019 11:28:44 +0000 (11:28 +0000)]
mlxsw: spectrum: Move QSFP EEPROM definitions to common location

Move QSFP EEPROM definitions to common location from the spectrum driver
in order to make them available for other mlxsw modules. They are common
for all kind of chips and have relation to SFF specifications 8024,
8436, 8472, 8636, rather than to chip type.

Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge tag 'batadv-next-for-davem-20190213' of git://git.open-mesh.org/linux-merge
David S. Miller [Thu, 14 Feb 2019 06:28:11 +0000 (22:28 -0800)]
Merge tag 'batadv-next-for-davem-20190213' of git://git.open-mesh.org/linux-merge

Simon Wunderlich says:

====================
This feature/cleanup patchset includes the following patches:

 - fix memory leak in in batadv_dat_put_dhcp, by Martin Weinelt

 - fix typo, by Sven Eckelmann

 - netlink restructuring patch series (part 2), by Sven Eckelmann
   (19 patches)
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agotest_objagg: Uninitialized variable in error handling
Dan Carpenter [Wed, 13 Feb 2019 08:59:31 +0000 (11:59 +0300)]
test_objagg: Uninitialized variable in error handling

We need to set the error message on this path otherwise some of the
callers, such as test_hints_case(), print from an uninitialized pointer.

We had a similar bug earlier and set "errmsg" to NULL in the caller,
test_delta_action_item().  That code is no longer required so I have
removed it.

Fixes: 9069a3817d82 ("lib: objagg: implement optimization hints assembly and use hints for object creation")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agotest_objagg: Test the correct variable
Dan Carpenter [Wed, 13 Feb 2019 08:58:20 +0000 (11:58 +0300)]
test_objagg: Test the correct variable

There is a typo here.  We intended to check "objagg2" but we instead
test "objagg" which is not an error pointer.

Fixes: 9069a3817d82 ("lib: objagg: implement optimization hints assembly and use hints for object creation")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agolib: objagg: Fix an error code in objagg_hints_get()
Dan Carpenter [Wed, 13 Feb 2019 08:56:50 +0000 (11:56 +0300)]
lib: objagg: Fix an error code in objagg_hints_get()

We need to set the error code on this path otherwise we return
ERR_PTR(0) which would result in a NULL dereference in the caller.

Fixes: 9069a3817d82 ("lib: objagg: implement optimization hints assembly and use hints for object creation")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agocxgb4vf: Few more link management changes.
Vishal Kulkarni [Wed, 13 Feb 2019 05:18:52 +0000 (10:48 +0530)]
cxgb4vf: Few more link management changes.

CR4_QSFP 10G Speed technology should be 10000baseKR_Full
And also report available FEC modes.

Signed-off-by: Vishal Kulkarni <vishal@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'pagepool-api-and-dma-address-storage'
David S. Miller [Thu, 14 Feb 2019 06:00:17 +0000 (22:00 -0800)]
Merge branch 'pagepool-api-and-dma-address-storage'

Jesper Dangaard Brouer says:

====================
Fix page_pool API and dma address storage

As pointed out by David Miller in [1] the current page_pool implementation
stores dma_addr_t in page->private. This won't work on 32-bit platforms with
64-bit DMA addresses since the page->private is an unsigned long and the
dma_addr_t a u64.

Since no driver is yet using the DMA mapping capabilities of the API let's
fix this by storing the information in 'struct page' and use that to store
and retrieve DMA addresses from network drivers.

As long as the addresses returned from dma_map_page() are aligned the first
bit, used by the compound pages code should not be set.

Ilias tested the first two patches on Espressobin driver mvneta, for which
we have patches for using the DMA API of page_pool.

[1]: https://lore.kernel.org/netdev/20181207.230655.1261252486319967024.davem@davemloft.net/
====================

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agopage_pool: use DMA_ATTR_SKIP_CPU_SYNC for DMA mappings
Jesper Dangaard Brouer [Wed, 13 Feb 2019 01:55:50 +0000 (02:55 +0100)]
page_pool: use DMA_ATTR_SKIP_CPU_SYNC for DMA mappings

As pointed out by Alexander Duyck, the DMA mapping done in page_pool needs
to use the DMA attribute DMA_ATTR_SKIP_CPU_SYNC.

As the principle behind page_pool keeping the pages mapped is that the
driver takes over the DMA-sync steps.

Reported-by: Alexander Duyck <alexander.duyck@gmail.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: page_pool: don't use page->private to store dma_addr_t
Ilias Apalodimas [Wed, 13 Feb 2019 01:55:45 +0000 (02:55 +0100)]
net: page_pool: don't use page->private to store dma_addr_t

As pointed out by David Miller the current page_pool implementation
stores dma_addr_t in page->private.
This won't work on 32-bit platforms with 64-bit DMA addresses since the
page->private is an unsigned long and the dma_addr_t a u64.

A previous patch is adding dma_addr_t on struct page to accommodate this.
This patch adapts the page_pool related functions to use the newly added
struct for storing and retrieving DMA addresses from network drivers.

Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agomm: add dma_addr_t to struct page
Jesper Dangaard Brouer [Wed, 13 Feb 2019 01:55:40 +0000 (02:55 +0100)]
mm: add dma_addr_t to struct page

The page_pool API is using page->private to store DMA addresses.
As pointed out by David Miller we can't use that on 32-bit architectures
with 64-bit DMA

This patch adds a new dma_addr_t struct to allow storing DMA addresses

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: sched: remove duplicated include from cls_api.c
YueHaibing [Wed, 13 Feb 2019 01:42:00 +0000 (01:42 +0000)]
net: sched: remove duplicated include from cls_api.c

Remove duplicated include.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoflow_offload: fix block stats
John Hurley [Wed, 13 Feb 2019 00:23:52 +0000 (00:23 +0000)]
flow_offload: fix block stats

With the introduction of flow_stats_update(), drivers now update the stats
fields of the passed tc_cls_flower_offload struct, rather than call
tcf_exts_stats_update() directly to update the stats of offloaded TC
flower rules. However, if multiple qdiscs are registered to a TC shared
block and a flower rule is applied, then, when getting stats for the rule,
multiple callbacks may be made.

Take this into consideration by modifying flow_stats_update to gather the
stats from all callbacks. Currently, the values in tc_cls_flower_offload
only account for the last stats callback in the list.

Fixes: 3b1903ef97c0 ("flow_offload: add statistics retrieval infrastructure and use it")
Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: sched: flower: only return error from hw offload if skip_sw
Vlad Buslov [Tue, 12 Feb 2019 21:39:06 +0000 (23:39 +0200)]
net: sched: flower: only return error from hw offload if skip_sw

Recently introduced tc_setup_flow_action() can fail when parsing tcf_exts
on some unsupported action commands. However, this should not affect the
case when user did not explicitly request hw offload by setting skip_sw
flag. Modify tc_setup_flow_action() callers to only propagate the error if
skip_sw flag is set for filter that is being offloaded, and set extack
error message in that case.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Fixes: 3a7b68617de7 ("cls_api: add translator to flow_action representation")
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoqlge: fix some indentation issues
Colin Ian King [Tue, 12 Feb 2019 16:08:07 +0000 (16:08 +0000)]
qlge: fix some indentation issues

There are some statements that are indented incorrectly. Fix these.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoqed: fix indentation issue with statements in an if-block
Colin Ian King [Tue, 12 Feb 2019 16:01:53 +0000 (16:01 +0000)]
qed: fix indentation issue with statements in an if-block

There are some statements in an if-block that are not correctly
indented. Fix these.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: ixp4xx_eth: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
Yang Wei [Tue, 12 Feb 2019 16:01:11 +0000 (00:01 +0800)]
net: ixp4xx_eth: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles

dev_consume_skb_irq() should be called in eth_txdone_irq() when skb
xmit done. It makes drop profiles(dropwatch, perf) more friendly.

Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: macb: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
Yang Wei [Tue, 12 Feb 2019 16:00:02 +0000 (00:00 +0800)]
net: macb: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles

dev_consume_skb_irq() should be called in at91ether_interrupt() when
skb xmit done. It makes drop profiles(dropwatch, perf) more friendly.

Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Reviewed-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: sis: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
Yang Wei [Tue, 12 Feb 2019 15:59:04 +0000 (23:59 +0800)]
net: sis: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles

dev_consume_skb_irq() should be called when skb xmit done. It makes
drop profiles(dropwatch, perf) more friendly.

Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: fealnx: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
Yang Wei [Tue, 12 Feb 2019 15:56:53 +0000 (23:56 +0800)]
net: fealnx: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles

dev_consume_skb_irq() should be called in intr_handler() when skb
xmit done. It makes drop profiles(dropwatch, perf) more friendly.

Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: moxa: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
Yang Wei [Tue, 12 Feb 2019 15:56:00 +0000 (23:56 +0800)]
net: moxa: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles

dev_consume_skb_irq() should be called in moxart_tx_finished() when
skb xmit done. It makes drop profiles(dropwatch, perf) more friendly.

Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: apple: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
Yang Wei [Tue, 12 Feb 2019 15:52:53 +0000 (23:52 +0800)]
net: apple: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles

dev_consume_skb_irq() should be called in mace_interrupt() when skb
xmit done. It makes drop profiles(dropwatch, perf) more friendly.

Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: atheros: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
Yang Wei [Tue, 12 Feb 2019 15:51:45 +0000 (23:51 +0800)]
net: atheros: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles

dev_consume_skb_irq() should be called when skb xmit done. It makes
drop profiles(dropwatch, perf) more friendly.

Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: qualcomm: emac: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
Yang Wei [Tue, 12 Feb 2019 15:49:57 +0000 (23:49 +0800)]
net: qualcomm: emac: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles

dev_consume_skb_irq() should be called in emac_mac_tx_process() when
skb xmit done. It makes drop profiles(dropwatch, perf) more friendly.

Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: neterion: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
Yang Wei [Tue, 12 Feb 2019 15:47:31 +0000 (23:47 +0800)]
net: neterion: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles

dev_consume_skb_irq() should be called when skb xmit done. It makes
drop profiles(dropwatch, perf) more friendly.

Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'phy-25g'
David S. Miller [Thu, 14 Feb 2019 00:17:53 +0000 (19:17 -0500)]
Merge branch 'phy-25g'

Maxime Chevallier says:

====================
net: phy: Add 2.5G/5GBASET PHYs support

The 802.3bz standard defines 2 modes based on the NBASET alliance work
that allow to use 2.5Gbps and 5Gbps speeds on Cat 5e, 6 and 7 cables.

This series adds the necessary infrastructure to handle these modes with
C45 PHYs. This series was originally part of a bigger one, that has
seen 2 iterations [1] [2] that added support for these modes on Marvell
Alaska PHYs.

Following some discussions with Heiner and Andrew [3], we decided to
split-out the generic parts so that we can work together on the
following steps to get these mode fully working with Aquantia and
Marvell PHYS.

The first 3 patches are reworking some of the internal network phy
infrastructure to handle the new modes in a more generic way.

The 4th patch adds all the C45 register definition and accesses that
follows the 802.3bz standard to support 2.5GBASET and 5GBASET.

[1] : https://lore.kernel.org/netdev/20190118152352.26417-1-maxime.chevallier@bootlin.com/
[2] : https://lore.kernel.org/netdev/20190207094939.27369-1-maxime.chevallier@bootlin.com/
[3] : https://lore.kernel.org/netdev/81c340ea-54b0-1abf-94af-b8dc4ee83e3a@gmail.com/
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: phy: Add generic support for 2.5GBaseT and 5GBaseT
Maxime Chevallier [Mon, 11 Feb 2019 14:25:29 +0000 (15:25 +0100)]
net: phy: Add generic support for 2.5GBaseT and 5GBaseT

The 802.3bz specification, based on previous by the NBASET alliance,
defines the 2.5GBaseT and 5GBaseT link modes for ethernet traffic on
cat5e, cat6 and cat7 cables.

These mode integrate with the already defined C45 MDIO PMA/PMD registers
set that added 10G support, by defining some previously reserved bits,
and adding a new register (2.5G/5G Extended abilities).

This commit adds the required definitions in include/uapi/linux/mdio.h
to support these modes, and detect when a link-partner advertises them.

It also adds support for these mode in the generic C45 PHY
infrastructure.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: phy: Extract genphy_c45_pma_read_abilities from marvell10g
Maxime Chevallier [Mon, 11 Feb 2019 14:25:28 +0000 (15:25 +0100)]
net: phy: Extract genphy_c45_pma_read_abilities from marvell10g

Marvell 10G PHY driver has a generic way of initializing the supported
link modes by reading the PHY's C45 PMA abilities. This can be made
generic, since these registers are part of the 802.3 specifications.

This commit extracts the config_init link_mode initialization code from
marvell10g and uses it to introduce the genphy_c45_pma_read_abilities
function.

Only PMA modes are read, it's still up to the caller to set the Pause
parameters.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: phy: Move of_set_phy_eee_broken to phy-core.c
Maxime Chevallier [Mon, 11 Feb 2019 14:25:27 +0000 (15:25 +0100)]
net: phy: Move of_set_phy_eee_broken to phy-core.c

Since of_set_phy_supported was moved to phy-core.c, we can also move
of_set_phy_eee_broken to the same location, so that we have all OF
functions in the same place.

This patch doesn't intend to introduce any change in behaviour.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: phy: Mask-out non-compatible modes when setting the max-speed
Maxime Chevallier [Mon, 11 Feb 2019 14:25:26 +0000 (15:25 +0100)]
net: phy: Mask-out non-compatible modes when setting the max-speed

When setting a PHY's max speed using either the max-speed DT property
or ethtool, we should mask-out all non-compatible modes according to the
settings table, instead of just the 10/100BASET modes.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Suggested-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'net-Remove-unused-variables'
David S. Miller [Wed, 13 Feb 2019 01:31:30 +0000 (17:31 -0800)]
Merge branch 'net-Remove-unused-variables'

Florian Fainelli says:

====================
Remove unused variables

This removes unused variables from mlxsw and ethsw after the recent
removal of SWITCHDEV_ATTR_ID_PORT_BRIDGE_FLAGS, build scripts are now
fixed to take care of those warnings :).
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agostaging: fsl-dpaa2: ethsw: Remove unused port_priv variable
Florian Fainelli [Tue, 12 Feb 2019 23:39:59 +0000 (15:39 -0800)]
staging: fsl-dpaa2: ethsw: Remove unused port_priv variable

After 1b8b589d9103 ("staging: fsl-dpaa2: ethsw: Remove getting
PORT_BRIDGE_FLAGS") we are not accessing any driver private data
anymore, remove port_priv which is unused.

Fixes: 1b8b589d9103 ("staging: fsl-dpaa2: ethsw: Remove getting PORT_BRIDGE_FLAGS")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agomlxsw: spectrum_switchdev: Remove unused variables
Florian Fainelli [Tue, 12 Feb 2019 23:39:58 +0000 (15:39 -0800)]
mlxsw: spectrum_switchdev: Remove unused variables

After 1ecb195753a1 ("mlxsw: spectrum_switchdev: Remove getting
PORT_BRIDGE_FLAGS") we are not accessing any driver private data
structure, so the mlxsw_sp_port and mlxsw_sp variables are unused.

Fixes: 1ecb195753a1 ("mlxsw: spectrum_switchdev: Remove getting PORT_BRIDGE_FLAGS")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agorocker: Remove port_attr_bridge_flags_get assignment
Florian Fainelli [Tue, 12 Feb 2019 21:39:05 +0000 (13:39 -0800)]
rocker: Remove port_attr_bridge_flags_get assignment

After 610d2b601bba ("rocker: Remove getting PORT_BRIDGE_FLAGS") we no
longer have a port_attr_bridge_flags_get member in the rocker_world_ops
structre, fix that.

Fixes: 610d2b601bba ("rocker: Remove getting PORT_BRIDGE_FLAGS")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'classifier-no-rtnl'
David S. Miller [Tue, 12 Feb 2019 18:41:33 +0000 (13:41 -0500)]
Merge branch 'classifier-no-rtnl'

Vlad Buslov says:

====================
Refactor classifier API to work with chain/classifiers without rtnl lock

Currently, all netlink protocol handlers for updating rules, actions and
qdiscs are protected with single global rtnl lock which removes any
possibility for parallelism. This patch set is a third step to remove
rtnl lock dependency from TC rules update path.

Recently, new rtnl registration flag RTNL_FLAG_DOIT_UNLOCKED was added.
Handlers registered with this flag are called without RTNL taken. End
goal is to have rule update handlers(RTM_NEWTFILTER, RTM_DELTFILTER,
etc.) to be registered with UNLOCKED flag to allow parallel execution.
However, there is no intention to completely remove or split rtnl lock
itself. This patch set addresses specific problems in implementation of
classifiers API that prevent its control path from being executed
concurrently, and completes refactoring of cls API rules update handlers
by removing rtnl lock dependency from code that handles chains and
classifiers. Rules update handlers are registered with
RTNL_FLAG_DOIT_UNLOCKED flag.

This patch set substitutes global rtnl lock dependency on rules update
path in cls API by extending its data structures with following locks:
- tcf_block with 'lock' mutex. It is used to protect block state and
  life-time management fields of chains on the block (chain->refcnt,
  chain->action_refcnt, chain->explicitly crated, etc.).
- tcf_chain with 'filter_chain_lock' mutex, that is used to protect list
  of classifier instances attached to chain. chain0->filter_chain_lock
  serializes calls to head change callbacks and allows them to rely on
  filter_chain_lock for serialization instead of rtnl lock.
- tcf_proto with 'lock' spinlock that is intended to be used to
  synchronize access to classifiers that support unlocked execution.

Classifiers are extended with reference counting to accommodate parallel
access by unlocked cls API. Classifier ops structure is extended with
additional 'put' function to allow reference counting of filters and
intended to be used by classifiers that implement rtnl-unlocked API.
Users of classifiers and individual filter instances are modified to
always hold reference while working with them.

Classifiers that support unlocked execution still need to know the
status of rtnl lock, so their API is extended with additional
'rtnl_held' argument that is used to indicate that caller holds rtnl
lock. Cls API propagates rtnl lock status across its helper functions
and passes it to classifier.

Changes from V3 to V4:
- Patch 1:
  - Extract code that manages chain 'explicitly_created' flag into
    standalone patch.
- Patch 2 - new.

Changes from V2 to V3:
- Change block->lock and chain->filter_chain_lock type to mutex. This
  removes the need for async miniqp refactoring and allows calling
  sleeping functions while holding the block->lock and
  chain->filter_chain_lock locks.
- Previous patch 1 - async miniqp is no longer needed, remove the patch.
- Patch 1:
  - Change block->lock type to mutex.
  - Implement tcf_block_destroy() helper function that destroys
    block->lock mutex before deallocating the block.
  - Revert GFP_KERNEL->GFP_ATOMIC memory allocation flags of tcf_chain
    which is no longer needed after block->lock type change.
- Patch 6:
  - Change chain->filter_chain_lock type to mutex.
  - Assume chain0->filter_chain_lock synchronizations instead of rtnl
    lock in mini_qdisc_pair_swap() function that is called from head
    change callback of ingress Qdisc. With filter_chain_lock type
    changed to mutex it is now possible to call sleeping function while
    holding it, so it is now used instead of async implementation from
    previous versions of this patch set.
- Patch 7:
  - Add local tp_next var to tcf_chain_flush() and use it to store
    tp->next pointer dereferenced with rcu_dereference_protected() to
    satisfy kbuild test robot.
  - Reset tp pointer to NULL at the beginning of tc_new_tfilter() to
    prevent its uninitialized usage in error handling code. This code
    was already implemented in patch 10, but must be in patch 8 to
    preserve code bisectability.
  - Put parent chain in tcf_proto_destroy(). In previous version this
    code was implemented in patch 1 which was removed in V3.

Changes from V1 to V2:
- Patch 1:
  - Use smp_store_release() instead of xchg() for setting
    miniqp->tp_head.
  - Move Qdisc deallocation to tc_proto_wq ordered workqueue that is
    used to destroy tcf proto instances. This is necessary to ensure
    that Qdisc is destroyed after all instances of chain/proto that it
    contains in order to prevent use-after-free error in
    tc_chain_notify_delete().
  - Cache parent net device ifindex in block to prevent use-after-free
    of dev queue in tc_chain_notify_delete().
- Patch 2:
  - Use lockdep_assert_held() instead of spin_is_locked() for assertion.
  - Use 'free_block' argument in tcf_chain_destroy() instead of checking
    block's reference count and chain_list for second time.
- Patch 7:
  - Refactor tcf_chain0_head_change_cb_add() to not take block->lock and
    chain0->filter_chain_lock in correct order.
- Patch 10:
  - Always set 'tp_created' flag when creating tp to prevent releasing
    the chain twice when tp with same priority was inserted
    concurrently.
- Patch 11:
  - Add additional check to prevent creation of new proto instance when
    parent chain is being flushed to reduce CPU usage.
  - Don't call tcf_chain_delete_empty() if tp insertion failed.
- Patch 16 - new.
- Patch 17:
  - Refactor to only lock take rtnl lock once (at the beginning of rule
    update handlers).
  - Always release rtnl mutex in the same function that locked it.
    Remove unlock code from tcf_block_release().
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: sched: unlock rules update API
Vlad Buslov [Mon, 11 Feb 2019 08:55:48 +0000 (10:55 +0200)]
net: sched: unlock rules update API

Register netlink protocol handlers for message types RTM_NEWTFILTER,
RTM_DELTFILTER, RTM_GETTFILTER as unlocked. Set rtnl_held variable that
tracks rtnl mutex state to be false by default.

Introduce tcf_proto_is_unlocked() helper that is used to check
tcf_proto_ops->flag to determine if ops can be called without taking rtnl
lock. Manually lookup Qdisc, class and block in rule update handlers.
Verify that both Qdisc ops and proto ops are unlocked before using any of
their callbacks, and obtain rtnl lock otherwise.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: sched: refactor tcf_block_find() into standalone functions
Vlad Buslov [Mon, 11 Feb 2019 08:55:47 +0000 (10:55 +0200)]
net: sched: refactor tcf_block_find() into standalone functions

Refactor tcf_block_find() code into three standalone functions:
- __tcf_qdisc_find() to lookup Qdisc and increment its reference counter.
- __tcf_qdisc_cl_find() to lookup class.
- __tcf_block_find() to lookup block and increment its reference counter.

This change is necessary to allow netlink tc rule update handlers to call
these functions directly in order to conditionally take rtnl lock
according to Qdisc class ops flags before calling any of class ops
functions.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: sched: add flags to Qdisc class ops struct
Vlad Buslov [Mon, 11 Feb 2019 08:55:46 +0000 (10:55 +0200)]
net: sched: add flags to Qdisc class ops struct

Extend Qdisc_class_ops with flags. Create enum to hold possible class ops
flag values. Add first class ops flags value QDISC_CLASS_OPS_DOIT_UNLOCKED
to indicate that class ops functions can be called without taking rtnl
lock.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: sched: extend proto ops to support unlocked classifiers
Vlad Buslov [Mon, 11 Feb 2019 08:55:45 +0000 (10:55 +0200)]
net: sched: extend proto ops to support unlocked classifiers

Add 'rtnl_held' flag to tcf proto change, delete, destroy, dump, walk
functions to track rtnl lock status. Extend users of these function in cls
API to propagate rtnl lock status to them. This allows classifiers to
obtain rtnl lock when necessary and to pass rtnl lock status to extensions
and driver offload callbacks.

Add flags field to tcf proto ops. Add flag value to indicate that
classifier doesn't require rtnl lock.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: sched: extend proto ops with 'put' callback
Vlad Buslov [Mon, 11 Feb 2019 08:55:44 +0000 (10:55 +0200)]
net: sched: extend proto ops with 'put' callback

Add optional tp->ops->put() API to be implemented for filter reference
counting. This new function is called by cls API to release filter
reference for filters returned by tp->ops->change() or tp->ops->get()
functions. Implement tfilter_put() helper to call tp->ops->put() only for
classifiers that implement it.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: sched: track rtnl lock status when validating extensions
Vlad Buslov [Mon, 11 Feb 2019 08:55:43 +0000 (10:55 +0200)]
net: sched: track rtnl lock status when validating extensions

Actions API is already updated to not rely on rtnl lock for
synchronization. However, it need to be provided with rtnl status when
called from classifiers API in order to be able to correctly release the
lock when loading kernel module.

Extend extension validation function with 'rtnl_held' flag which is passed
to actions API. Add new 'rtnl_held' parameter to tcf_exts_validate() in cls
API. No classifier is currently updated to support unlocked execution, so
pass hardcoded 'true' flag parameter value.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: sched: prevent insertion of new classifiers during chain flush
Vlad Buslov [Mon, 11 Feb 2019 08:55:42 +0000 (10:55 +0200)]
net: sched: prevent insertion of new classifiers during chain flush

Extend tcf_chain with 'flushing' flag. Use the flag to prevent insertion of
new classifier instances when chain flushing is in progress in order to
prevent resource leak when tcf_proto is created by unlocked users
concurrently.

Return EAGAIN error from tcf_chain_tp_insert_unique() to restart
tc_new_tfilter() and lookup the chain/proto again.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: sched: refactor tp insert/delete for concurrent execution
Vlad Buslov [Mon, 11 Feb 2019 08:55:41 +0000 (10:55 +0200)]
net: sched: refactor tp insert/delete for concurrent execution

Implement unique insertion function to atomically attach tcf_proto to chain
after verifying that no other tcf proto with specified priority exists.
Implement delete function that verifies that tp is actually empty before
deleting it. Use these functions to refactor cls API to account for
concurrent tp and rule update instead of relying on rtnl lock. Add new
'deleting' flag to tcf proto. Use it to restart search when iterating over
tp's on chain to prevent accessing potentially inval tp->next pointer.

Extend tcf proto with spinlock that is intended to be used to protect its
data from concurrent modification instead of relying on rtnl mutex. Use it
to protect 'deleting' flag. Add lockdep macros to validate that lock is
held when accessing protected fields.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: sched: traverse classifiers in chain with tcf_get_next_proto()
Vlad Buslov [Mon, 11 Feb 2019 08:55:40 +0000 (10:55 +0200)]
net: sched: traverse classifiers in chain with tcf_get_next_proto()

All users of chain->filters_chain rely on rtnl lock and assume that no new
classifier instances are added when traversing the list. Use
tcf_get_next_proto() to traverse filters list without relying on rtnl
mutex. This function iterates over classifiers by taking reference to
current iterator classifier only and doesn't assume external
synchronization of filters list.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: sched: introduce reference counting for tcf_proto
Vlad Buslov [Mon, 11 Feb 2019 08:55:39 +0000 (10:55 +0200)]
net: sched: introduce reference counting for tcf_proto

In order to remove dependency on rtnl lock and allow concurrent tcf_proto
modification, extend tcf_proto with reference counter. Implement helper
get/put functions for tcf proto and use them to modify cls API to always
take reference to tcf_proto while using it. Only release reference to
parent chain after releasing last reference to tp.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: sched: protect filter_chain list with filter_chain_lock mutex
Vlad Buslov [Mon, 11 Feb 2019 08:55:38 +0000 (10:55 +0200)]
net: sched: protect filter_chain list with filter_chain_lock mutex

Extend tcf_chain with new filter_chain_lock mutex. Always lock the chain
when accessing filter_chain list, instead of relying on rtnl lock.
Dereference filter_chain with tcf_chain_dereference() lockdep macro to
verify that all users of chain_list have the lock taken.

Rearrange tp insert/remove code in tc_new_tfilter/tc_del_tfilter to execute
all necessary code while holding chain lock in order to prevent
invalidation of chain_info structure by potential concurrent change. This
also serializes calls to tcf_chain0_head_change(), which allows head change
callbacks to rely on filter_chain_lock for synchronization instead of rtnl
mutex.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: sched: protect chain template accesses with block lock
Vlad Buslov [Mon, 11 Feb 2019 08:55:37 +0000 (10:55 +0200)]
net: sched: protect chain template accesses with block lock

When cls API is called without protection of rtnl lock, parallel
modification of chain is possible, which means that chain template can be
changed concurrently in certain circumstances. For example, when chain is
'deleted' by new user-space chain API, the chain might continue to be used
if it is referenced by actions, and can be 're-created' again by user. In
such case same chain structure is reused and its template is changed. To
protect from described scenario, cache chain template while holding block
lock. Introduce standalone tc_chain_notify_delete() function that works
with cached template values, instead of chains themselves.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: sched: traverse chains in block with tcf_get_next_chain()
Vlad Buslov [Mon, 11 Feb 2019 08:55:36 +0000 (10:55 +0200)]
net: sched: traverse chains in block with tcf_get_next_chain()

All users of block->chain_list rely on rtnl lock and assume that no new
chains are added when traversing the list. Use tcf_get_next_chain() to
traverse chain list without relying on rtnl mutex. This function iterates
over chains by taking reference to current iterator chain only and doesn't
assume external synchronization of chain list.

Don't take reference to all chains in block when flushing and use
tcf_get_next_chain() to safely iterate over chain list instead. Remove
tcf_block_put_all_chains() that is no longer used.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: sched: protect block->chain0 with block->lock
Vlad Buslov [Mon, 11 Feb 2019 08:55:35 +0000 (10:55 +0200)]
net: sched: protect block->chain0 with block->lock

In order to remove dependency on rtnl lock, use block->lock to protect
chain0 struct from concurrent modification. Rearrange code in chain0
callback add and del functions to only access chain0 when block->lock is
held.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: sched: refactor tc_ctl_chain() to use block->lock
Vlad Buslov [Mon, 11 Feb 2019 08:55:34 +0000 (10:55 +0200)]
net: sched: refactor tc_ctl_chain() to use block->lock

In order to remove dependency on rtnl lock, modify chain API to use
block->lock to protect chain from concurrent modification. Rearrange
tc_ctl_chain() code to call tcf_chain_hold() while holding block->lock to
prevent concurrent chain removal.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: sched: protect chain->explicitly_created with block->lock
Vlad Buslov [Mon, 11 Feb 2019 08:55:33 +0000 (10:55 +0200)]
net: sched: protect chain->explicitly_created with block->lock

In order to remove dependency on rtnl lock, protect
tcf_chain->explicitly_created flag with block->lock. Consolidate code that
checks and resets 'explicitly_created' flag into __tcf_chain_put() to
execute it atomically with rest of code that puts chain reference.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: sched: protect block state with mutex
Vlad Buslov [Mon, 11 Feb 2019 08:55:32 +0000 (10:55 +0200)]
net: sched: protect block state with mutex

Currently, tcf_block doesn't use any synchronization mechanisms to protect
critical sections that manage lifetime of its chains. block->chain_list and
multiple variables in tcf_chain that control its lifetime assume external
synchronization provided by global rtnl lock. Converting chain reference
counting to atomic reference counters is not possible because cls API uses
multiple counters and flags to control chain lifetime, so all of them must
be synchronized in chain get/put code.

Use single per-block lock to protect block data and manage lifetime of all
chains on the block. Always take block->lock when accessing chain_list.
Chain get and put modify chain lifetime-management data and parent block's
chain_list, so take the lock in these functions. Verify block->lock state
with assertions in functions that expect to be called with the lock taken
and are called from multiple places. Take block->lock when accessing
filter_chain_list.

In order to allow parallel update of rules on single block, move all calls
to classifiers outside of critical sections protected by new block->lock.
Rearrange chain get and put functions code to only access protected chain
data while holding block lock:
- Rearrange code to only access chain reference counter and chain action
  reference counter while holding block lock.
- Extract code that requires block->lock from tcf_chain_destroy() into
  standalone tcf_chain_destroy() function that is called by
  __tcf_chain_put() in same critical section that changes chain reference
  counters.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoisdn_v110: mark expected switch fall-through
Gustavo A. R. Silva [Mon, 11 Feb 2019 22:42:37 +0000 (16:42 -0600)]
isdn_v110: mark expected switch fall-through

In preparation to enabling -Wimplicit-fallthrough, mark switch
cases where we are expecting to fall through.

This patch fixes the following warnings:

drivers/isdn/i4l/isdn_v110.c: In function ‘EncodeMatrix’:
drivers/isdn/i4l/isdn_v110.c:353:7: warning: this statement may fall through [-Wimplicit-fallthrough=]
    if (line >= mlen) {
       ^
drivers/isdn/i4l/isdn_v110.c:358:3: note: here
   case 128:
   ^~~~

Warning level 3 was used: -Wimplicit-fallthrough=3

Notice that, in this particular case, the code comment is modified
in accordance with what GCC is expecting to find.

This patch is part of the ongoing efforts to enable
-Wimplicit-fallthrough.

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoisdn: i4l: isdn_tty: Mark expected switch fall-through
Gustavo A. R. Silva [Mon, 11 Feb 2019 22:38:21 +0000 (16:38 -0600)]
isdn: i4l: isdn_tty: Mark expected switch fall-through

In preparation to enabling -Wimplicit-fallthrough, mark switch
cases where we are expecting to fall through.

This patch fixes the following warnings:

drivers/isdn/i4l/isdn_tty.c: In function ‘isdn_tty_edit_at’:
drivers/isdn/i4l/isdn_tty.c:3644:18: warning: this statement may fall through [-Wimplicit-fallthrough=]
       m->mdmcmdl = 0;
       ~~~~~~~~~~~^~~
drivers/isdn/i4l/isdn_tty.c:3646:5: note: here
     case 0:
     ^~~~

Warning level 3 was used: -Wimplicit-fallthrough=3

Notice that, in this particular case, the code comment is modified
in accordance with what GCC is expecting to find.

This patch is part of the ongoing efforts to enable
-Wimplicit-fallthrough.

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoser_gigaset: mark expected switch fall-through
Gustavo A. R. Silva [Mon, 11 Feb 2019 22:34:44 +0000 (16:34 -0600)]
ser_gigaset: mark expected switch fall-through

In preparation to enabling -Wimplicit-fallthrough, mark switch
cases where we are expecting to fall through.

This patch fixes the following warning:

drivers/isdn/gigaset/ser-gigaset.c: In function ‘gigaset_tty_ioctl’:
drivers/isdn/gigaset/ser-gigaset.c:627:3: warning: this statement may fall through [-Wimplicit-fallthrough=]
   switch (arg) {
   ^~~~~~
drivers/isdn/gigaset/ser-gigaset.c:638:2: note: here
  default:
  ^~~~~~~

Warning level 3 was used: -Wimplicit-fallthrough=3

Notice that, in this particular case, the code comment is modified
in accordance with what GCC is expecting to find.

This patch is part of the ongoing efforts to enable
-Wimplicit-fallthrough.

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Acked-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 's390-qeth-next'
David S. Miller [Tue, 12 Feb 2019 18:14:24 +0000 (13:14 -0500)]
Merge branch 's390-qeth-next'

Julian Wiedmann says:

====================
s390/qeth: updates 2019-02-12

please apply one more round of qeth patches to net-next.
This series targets the driver's control paths. It primarily brings improvements
to the error handling for sent cmds and received responses, along with the
usual cleanup and consolidation efforts.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agos390/qeth: convert remaining legacy cmd callbacks
Julian Wiedmann [Tue, 12 Feb 2019 17:33:25 +0000 (18:33 +0100)]
s390/qeth: convert remaining legacy cmd callbacks

This calls the existing errno translation helpers from the callbacks,
adding trivial wrappers where necessary. For cmds that have no
sophisticated errno translation, default to -EIO.

For IPA cmds with no callback, fall back to a minimal default. This is
currently being used by qeth_l3_send_setrouting().

Thus having all converted all callbacks, remove the legacy path in
qeth_send_control_data_cb().

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agos390/qeth: convert bridgeport callbacks
Julian Wiedmann [Tue, 12 Feb 2019 17:33:24 +0000 (18:33 +0100)]
s390/qeth: convert bridgeport callbacks

By letting the callbacks deal with error translation, we no longer need
to pass the raw error codes back to the originator. This allows us to
slim down the callback's private data, and nicely simplifies the code.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agos390/qeth: allow cmd callbacks to return errnos
Julian Wiedmann [Tue, 12 Feb 2019 17:33:23 +0000 (18:33 +0100)]
s390/qeth: allow cmd callbacks to return errnos

Error propagation from cmd callbacks currently works in a way where
qeth_send_control_data_cb() picks the raw HW code from the response,
and the cmd's originator later translates this into an errno.
The callback itself only returns 0 ("done") or 1 ("expect more data").

This is
1. limiting, as the only means for the callback to report an internal
error is to invent pseudo HW codes (such as IPA_RC_ENOMEM), that
the originator then needs to understand. For non-IPA callbacks, we
even provide a separate field in the IO buffer metadata (iob->rc) so
the callback can pass back a return value.
2. fragile, as the originator must take care to not translate any errno
that is returned by qeth's own IO code paths (eg -ENOMEM). Also, any
originator that forgets to translate the HW codes potentially passes
garbage back to its caller. For instance, see
commit 2aa4867198c2 ("s390/qeth: translate SETVLAN/DELVLAN errors").

Introduce a new model where all HW error translation is done within the
callback, and the callback returns
>  0, if it expects more data (as before)
== 0, on success
<  0, with an errno

Start off with converting all callbacks to the new model that either
a) pass back pseudo HW codes, or b) have a dependency on a specific
HW error code. Also convert c) the one callback that uses iob->rc, and
d) qeth_setadpparms_change_macaddr_cb() so that it can pass back an
error back to qeth_l2_request_initial_mac() even when the cmd itself
was successful.

The old model remains supported: if the callback returns 0, we still
propagate the response's HW error code back to the originator.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agos390/qeth: cancel cmd on early error
Julian Wiedmann [Tue, 12 Feb 2019 17:33:22 +0000 (18:33 +0100)]
s390/qeth: cancel cmd on early error

When sending cmds via qeth_send_control_data(), qeth puts the request
on the IO channel and then blocks on the reply object until the response
has been received.

If the IO completes with error, there will never be a response and we
block until the reply-wait hits its timeout. For this case, connect the
request buffer to its reply object, so that we can immediately cancel
the wait.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agos390/qeth: simplify reply object handling
Julian Wiedmann [Tue, 12 Feb 2019 17:33:21 +0000 (18:33 +0100)]
s390/qeth: simplify reply object handling

Current code enqueues & dequeues a reply object from the waiter list
in various places. In particular, the dequeue & enqueue in
qeth_send_control_data_cb() looks fragile - this can cause
qeth_clear_ipacmd_list() to skip the active object.
Add some helpers, and boil the logic down by giving
qeth_send_control_data() the sole responsibility to add and remove
objects.

qeth_send_control_data_cb() and qeth_clear_ipacmd_list() will now only
notify the reply object to interrupt its wait cycle. This can cause
a slight delay in the removal, but that's no concern.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agos390/qeth: limit trace to valid data of command request
Julian Wiedmann [Tue, 12 Feb 2019 17:33:20 +0000 (18:33 +0100)]
s390/qeth: limit trace to valid data of command request

'len' specifies how much data we send to the HW, don't dump beyond this
boundary.
As of today this is no big concern - commands are built in full, zeroed
pages.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agos390/qeth: align csum offload with TSO control logic
Julian Wiedmann [Tue, 12 Feb 2019 17:33:19 +0000 (18:33 +0100)]
s390/qeth: align csum offload with TSO control logic

csum offload and TSO have similar programming requirements. The TSO code
was reworked with commit "s390/qeth: enhance TSO control sequence",
adjust the csum control flow accordingly. Primarily this means replacing
custom helpers with more generic infrastructure.

Also, change the LP2LP check so that it warns on TX offload (not RX).
This is where reduced csum capability actually matters.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agos390/qeth: enable only required csum offload features
Julian Wiedmann [Tue, 12 Feb 2019 17:33:18 +0000 (18:33 +0100)]
s390/qeth: enable only required csum offload features

Current code attempts to enable all advertised HW csum offload features.
Future-proof this by enabling only those features that we actually use.

Also, the IPv4 header csum feature is only needed for TX on L3 devices.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>