openwrt/staging/blogic.git
6 years agoMerge branch 'net-Resolve-races-in-phy-accessors'
David S. Miller [Wed, 3 Jan 2018 16:00:24 +0000 (11:00 -0500)]
Merge branch 'net-Resolve-races-in-phy-accessors'

Russell King says:

====================
Resolve races in phy accessors

This series resolves races with various accesses to PHY registers.
The first five patches are necessary before we add phylink support
to mvneta, the remaining three are merely cleanups for unobserved
races, and hence are less critical.

There are two possible classes of races that can occur: where we
write to a page register that changes the meaning of a group of
other registers, and where we read-modify-write a register.

Resolve these races by performing the accesses under the mdio bus
lock, ensuring that no other user can access the bus while the
series of atomic operations are being performed.

These patches have been posted before, and have been modified
along the lines of previous feedback:

- The third patch was originally reviewed by Florian, but as I've
  added __phy_modify() to it, I've removed that attributation.
- Included generic page-based accessors as suggested last time
  around.
- Since we have the unlocked __phy_modify() in this patch series,
  it is sensible to include the changes for this to marvell.c -
  these accessors have to change anyway to avoid deadlocks on the
  mdio bus lock.

I haven't been able to test the at803x.c changes yet beyond compile
testing - although I do have systems with an ar8035 PHY.  However,
they should be straight forward to review.

This is targetted for net-next because the races have not been
found in existing drivers, but have been observed with phylink
integrated into mvneta - that's not to say that the races do not
exist today, they are just unobserved (probably through lack of
rigorous enough testing.)  The race provoking condition is detailed
in patch 5.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: phy: convert read-modify-write to phy_modify()
Russell King [Tue, 2 Jan 2018 10:58:58 +0000 (10:58 +0000)]
net: phy: convert read-modify-write to phy_modify()

Convert read-modify-write sequences in at803x, Marvell and core phylib
to use phy_modify() to ensure safety.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: phy: add phy_modify() accessor
Russell King [Tue, 2 Jan 2018 10:58:53 +0000 (10:58 +0000)]
net: phy: add phy_modify() accessor

Add phy_modify() convenience accessor to complement the mdiobus
counterpart.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: phy: marvell: fix paged access races
Russell King [Tue, 2 Jan 2018 10:58:48 +0000 (10:58 +0000)]
net: phy: marvell: fix paged access races

For paged accesses to be truely safe, we need to hold the bus lock to
prevent anyone else gaining access to the registers while we modify
them.

The phydev->lock mutex does not do this: userspace via the MII ioctl
can still sneak in and read or write any register while we are on a
different page, and the suspend/resume methods can be called by a
thread different to the thread polling the phy status.

Races have been observed with mvneta on SolidRun Clearfog with phylink,
particularly between the phylib worker reading the PHYs status, and
the thread resuming mvneta, calling phy_start() which then calls
through to m88e1121_config_aneg_rgmii_delays(), which tries to
read-modify-write the MSCR register:

CPU0 CPU1
marvell_read_status_page()
marvell_set_page(phydev, MII_MARVELL_FIBER_PAGE)
...
m88e1121_config_aneg_rgmii_delays()
set_page(MII_MARVELL_MSCR_PAGE)
phy_read(phydev, MII_88E1121_PHY_MSCR_REG)
marvell_set_page(phydev, MII_MARVELL_COPPER_PAGE);
...
phy_write(phydev, MII_88E1121_PHY_MSCR_REG)

The result of this is we end up writing the copper page register 21,
which causes the copper PHY to be disabled, and the link partner sees
the link immediately go down.

Solve this by taking the bus lock instead of the PHY lock, thereby
preventing other accesses to the PHY while we are accessing other PHY
pages.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: phy: add paged phy register accessors
Russell King [Tue, 2 Jan 2018 10:58:43 +0000 (10:58 +0000)]
net: phy: add paged phy register accessors

Add a set of paged phy register accessors which are inherently safe in
their design against other accesses interfering with the paged access.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: phy: add unlocked accessors
Russell King [Tue, 2 Jan 2018 10:58:37 +0000 (10:58 +0000)]
net: phy: add unlocked accessors

Add unlocked versions of the bus accessors, which allows access to the
bus with all the tracing. These accessors validate that the bus mutex
is held, which is a basic requirement for all mii bus accesses.

Also added is a read-modify-write unlocked accessor with the same
locking requirements.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: phy: use unlocked accessors for indirect MMD accesses
Russell King [Tue, 2 Jan 2018 10:58:32 +0000 (10:58 +0000)]
net: phy: use unlocked accessors for indirect MMD accesses

Use unlocked accessors for indirect MMD accesses to clause 22 PHYs.
This permits tracing of these accesses.

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: mdiobus: add unlocked accessors
Russell King [Tue, 2 Jan 2018 10:58:27 +0000 (10:58 +0000)]
net: mdiobus: add unlocked accessors

Add unlocked versions of the bus accessors, which allows access to the
bus with all the tracing. These accessors validate that the bus mutex
is held, which is a basic requirement for all mii bus accesses.

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agocxgb4: collect TX rate limit info in UP CIM logs
Rahul Lakkireddy [Tue, 2 Jan 2018 07:18:58 +0000 (12:48 +0530)]
cxgb4: collect TX rate limit info in UP CIM logs

Collect TX rate limiting related information in UP CIM logs.

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge branch 'mvneta-phylink'
David S. Miller [Wed, 3 Jan 2018 15:38:54 +0000 (10:38 -0500)]
Merge branch 'mvneta-phylink'

Russell King says:

====================
Convert mvneta to phylink

This series converts mvneta to use phylink, which is necessary to
support the SFP cages on SolidRun's Clearfog platform.  This series just
converts mvneta without adding the DT parts - having discussed with
Andrew, we believe we're too close to the merge window to submit that
patch.

I've split the "net: mvneta: convert to phylink" patch up to make it
easier to review, and in doing so, spotted some minor corner cases that
needed to be fixed along the way.

This series depends on the previously merged phylink patches in netdev,
along with the recently reviewed 7 patch series "Resolve races in phy
accessors" without which, the race described in patch 5 of that series
is very evident when triggering a dummy hibernate cycle.

This series also illustrates how to convert mvpp2 to phylink.

mvneta is the only user of the fixed_phy_update_state() API, and this
becomes redundant with the conversion.

It would be good to get this series not only reviewed, but also
independently tested to ensure that I haven't missed anything - I only
have the Clearfog platform to test on, and that doesn't support all the
different interface modes that mvneta supports.

A particularly interesting side effect of this series is that DSA
switches no longer need the "CPU" port and DSA facing MAC ethernet
instance to be marked as a fixed link anymore with mvneta - we can use
1000BaseX mode, and the DSA to CPU link will use the 802.3z negotiation
to determine the link properties without needing the link parameters to
be explicitly stated in DT - that is a subject of a future patch.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: phy: fixed-phy: remove fixed_phy_update_state()
Russell King [Tue, 2 Jan 2018 17:25:20 +0000 (17:25 +0000)]
net: phy: fixed-phy: remove fixed_phy_update_state()

mvneta is the only user of fixed_phy_update_state(), which has been
converted to use phylink instead.  Remove fixed_phy_update_state().

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: mvneta: add module EEPROM reading support
Russell King [Tue, 2 Jan 2018 17:25:15 +0000 (17:25 +0000)]
net: mvneta: add module EEPROM reading support

Add support for reading the SFF module's EEPROM via the ethtool API.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: mvneta: disable MVNETA_CAUSE_PSC_SYNC_CHANGE interrupt
Russell King [Tue, 2 Jan 2018 17:25:09 +0000 (17:25 +0000)]
net: mvneta: disable MVNETA_CAUSE_PSC_SYNC_CHANGE interrupt

The PSC sync change interrupt can fire multiple times while the link is
down, which is caused by noise on the serdes lines. As this isn't
information we make use of, it's pointless having the interrupt enabled.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: mvneta: add EEE support
Russell King [Tue, 2 Jan 2018 17:25:04 +0000 (17:25 +0000)]
net: mvneta: add EEE support

Add support for EEE to mvneta.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: mvneta: add flow control support
Russell King [Tue, 2 Jan 2018 17:24:59 +0000 (17:24 +0000)]
net: mvneta: add flow control support

Add support for flow control to mvneta.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: mvneta: add 1000BaseX support
Russell King [Tue, 2 Jan 2018 17:24:54 +0000 (17:24 +0000)]
net: mvneta: add 1000BaseX support

Add support for 1000BaseX link modes.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: mvneta: move port configuration
Russell King [Tue, 2 Jan 2018 17:24:49 +0000 (17:24 +0000)]
net: mvneta: move port configuration

Move the port configuration and release of reset to mvneta_mac_config()
along side the rest of the port mode configuration.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: mvneta: convert to phylink
Russell King [Tue, 2 Jan 2018 17:24:44 +0000 (17:24 +0000)]
net: mvneta: convert to phylink

Convert mvneta to use phylink, which models the MAC to PHY link in
a generic, reusable form.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
- remove unused sync status

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: mvneta: prepare to convert to phylink
Russell King [Tue, 2 Jan 2018 17:24:39 +0000 (17:24 +0000)]
net: mvneta: prepare to convert to phylink

Prepare to convert mvneta to phylink by splitting the adjust_link
function into its consituent parts.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: mvneta: ensure PM paths take the rtnl lock
Russell King [Tue, 2 Jan 2018 17:24:34 +0000 (17:24 +0000)]
net: mvneta: ensure PM paths take the rtnl lock

The netdev core always ensures that the rtnl lock is held while calling
the ndo_open() and ndo_stop() methods. However, the suspend/resume paths
do not hold the rtnl lock. phylink will expect the rtnl lock to be held
when the MAC driver calls it, so we end up with kernel warnings. Take
the lock to ensure that these functions are called in a consistent
manner.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge branch 'net-Renesas-kill-redundant-checks'
David S. Miller [Wed, 3 Jan 2018 15:21:35 +0000 (10:21 -0500)]
Merge branch 'net-Renesas-kill-redundant-checks'

Sergei Shtylyov says:

====================
Kill redundant checks in the Renesas Ethernet drivers

Here's a set of 2 patches against DaveM's 'net-next.git' repo removing
redundant checks in the driver probe() methods.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agosh_eth: kill redundant check in the probe() method
Sergei Shtylyov [Sun, 31 Dec 2017 18:41:36 +0000 (21:41 +0300)]
sh_eth: kill redundant check in the probe() method

Browsing thru the driver disassembly, I noticed that gcc was  able to
figure  out  that the 'ndev' pointer is always non-NULL when calling
free_netdev()  on the probe() method's  error path and  thus skip that
redundant NULL check... gcc is smart, be like gcc! :-)

Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoravb: kill redundant check in the probe() method
Sergei Shtylyov [Sun, 31 Dec 2017 18:41:35 +0000 (21:41 +0300)]
ravb: kill redundant check in the probe() method

Browsing thru the driver disassembly, I noticed that gcc was  able to
figure  out  that the 'ndev' pointer is always non-NULL when calling
free_netdev()  on the probe() method's  error path and  thus skip that
redundant NULL check... gcc is smart, be like gcc! :-)

Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoliquidio: Use zeroing memory allocator than allocator/memset
Himanshu Jha [Sun, 31 Dec 2017 12:27:29 +0000 (17:57 +0530)]
liquidio: Use zeroing memory allocator than allocator/memset

Use vzalloc for allocating zeroed memory and remove unnecessary
memset function.

Done using Coccinelle.
Generated-by: scripts/coccinelle/api/alloc/kzalloc-simple.cocci
0-day tested with no failures.

Suggested-by: Luis R. Rodriguez <mcgrof@kernel.org>
Signed-off-by: Himanshu Jha <himanshujha199640@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoethernet/broadcom: Use zeroing memory allocator than allocator/memset
Himanshu Jha [Sat, 30 Dec 2017 15:44:57 +0000 (21:14 +0530)]
ethernet/broadcom: Use zeroing memory allocator than allocator/memset

Use dma_zalloc_coherent for allocating zeroed
memory and remove unnecessary memset function.

Done using Coccinelle.
Generated-by: scripts/coccinelle/api/alloc/kzalloc-simple.cocci
0-day tested with no failures.

Suggested-by: Luis R. Rodriguez <mcgrof@kernel.org>
Signed-off-by: Himanshu Jha <himanshujha199640@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoqed: Use zeroing memory allocator than allocator/memset
Himanshu Jha [Sat, 30 Dec 2017 15:37:04 +0000 (21:07 +0530)]
qed: Use zeroing memory allocator than allocator/memset

Use dma_zalloc_coherent and vzalloc for allocating zeroed
memory and remove unnecessary memset function.

Done using Coccinelle.
Generated-by: scripts/coccinelle/api/alloc/kzalloc-simple.cocci
0-day tested with no failures.

Suggested-by: Luis R. Rodriguez <mcgrof@kernel.org>
Signed-off-by: Himanshu Jha <himanshujha199640@gmail.com>
Acked-by: Tomer Tayar <Tomer.Tayar@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge branch 'net-stmmac-Couple-of-debug-prints-improvements'
David S. Miller [Wed, 3 Jan 2018 02:54:56 +0000 (21:54 -0500)]
Merge branch 'net-stmmac-Couple-of-debug-prints-improvements'

Florian Fainelli says:

====================
net: stmmac: Couple of debug prints improvements

While working on a particular problem, I had to turn on debug prints and found
them to be useful, but could deserve some improvements in order to help debug
situations.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: stmmac: Allow debug prints of frame_len/COE
Florian Fainelli [Sat, 30 Dec 2017 03:56:33 +0000 (19:56 -0800)]
net: stmmac: Allow debug prints of frame_len/COE

There is no reason not to allow printing the frame_len/COE value and put
that under a check for ETH_FRAME_LEN, drop it so we can see what the
descriptor reports.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: stmmac: Pad ring number with zeroes in display_ring()
Florian Fainelli [Sat, 30 Dec 2017 03:56:32 +0000 (19:56 -0800)]
net: stmmac: Pad ring number with zeroes in display_ring()

Make the printing of the ring number consistent and properly aligned by
padding the ring number with up to 3 zeroes, which covers the maximum
ring size. This makes it a lot easier to see outliers in debug prints.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: dsa: Fix dsa_legacy_register() return value
Florian Fainelli [Fri, 29 Dec 2017 19:05:45 +0000 (11:05 -0800)]
net: dsa: Fix dsa_legacy_register() return value

We need to make the dsa_legacy_register() stub return 0 in order for
dsa_init_module() to successfully register and continue registering the
ETH_P_XDSA packet handler.

Fixes: 2a93c1a3651f ("net: dsa: Allow compiling out legacy support")
Reported-by: Egil Hjelmeland <privat@egil-hjelmeland.no>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge branch 'further-sfp-phylink-updates'
David S. Miller [Wed, 3 Jan 2018 02:45:32 +0000 (21:45 -0500)]
Merge branch 'further-sfp-phylink-updates'

Russell King says:

====================
further sfp/phylink updates

This series:
- cleans up printing of module information
- improves the transceiver capability decoding, getting rid of the
  guessing by connector type, improves direct-attach cable support
  and adds support for 1G Base-PX and Base-BX10 modules.
- cleans up phylink_sfp_module_insert()
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agophylink: remove 'mode' variable from phylink_sfp_module_insert()
Russell King [Fri, 29 Dec 2017 12:15:33 +0000 (12:15 +0000)]
phylink: remove 'mode' variable from phylink_sfp_module_insert()

'mode' is actually constant through phylink_sfp_module_insert(), so
remove it and replace it with the enumerated constant.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agosfp: improve support for direct-attach copper cables
Russell King [Fri, 29 Dec 2017 12:15:28 +0000 (12:15 +0000)]
sfp: improve support for direct-attach copper cables

Improve the support for direct-attach copper so that we avoid kernel
warning messages, and report the appropriate PORT_DA type to userspace.
Direct Attach cables can use a number of protocols depending on their
range of speeds.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agosfp: add support for 1000Base-PX and 1000Base-BX10
Russell King [Fri, 29 Dec 2017 12:15:23 +0000 (12:15 +0000)]
sfp: add support for 1000Base-PX and 1000Base-BX10

Add support for decoding the transceiver information for 1000Base-PX and
1000Base-BX10.  These use 1000BASE-X protocol.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agosfp: don't guess support from connector type
Russell King [Fri, 29 Dec 2017 12:15:17 +0000 (12:15 +0000)]
sfp: don't guess support from connector type

Don't try to guess the support mask from the connector type - this is
mostly irrelevant to the speeds that the transceiver supports.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agosfp: use precision to print non-null terminated strings
Russell King [Fri, 29 Dec 2017 12:15:12 +0000 (12:15 +0000)]
sfp: use precision to print non-null terminated strings

Rather than memcpy()'ing the strings and null terminate them, printf
allows non-NULL terminated strings provided the precision is not more
than the size of the buffer.  Use this form to print the basic module
information rather than copying and terminating the strings.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge branch 'marvell10g-phy-updates'
David S. Miller [Tue, 2 Jan 2018 20:00:50 +0000 (15:00 -0500)]
Merge branch 'marvell10g-phy-updates'

Russell King says:

====================
marvell10g updates

This series:
- adds MDI/MDIX reporting
- adds support for 10/100Mbps half-duplex link modes
- adds a comment describing the setup on VF610 ZII boards (where
  the phy interface mode doesn't change.)
- cleans up the phy interace mode switching
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: phy: marvell10g: add support for half duplex 100M and 10M
Russell King [Fri, 29 Dec 2017 12:46:43 +0000 (12:46 +0000)]
net: phy: marvell10g: add support for half duplex 100M and 10M

Add support for half-duplex 100M and 10M copper connections by parsing
the advertisment results rather than trying to decode the negotiated
speed from one of the PHYs "vendor" registers.  This allows us to
decode the duplex as well, which means we can support half-duplex mode
for the slower speeds.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: phy: add helper to convert negotiation result to phy settings
Russell King [Fri, 29 Dec 2017 12:46:38 +0000 (12:46 +0000)]
net: phy: add helper to convert negotiation result to phy settings

Add a helper to convert the result of the autonegotiation advertisment
into the PHYs speed and duplex settings.  If the result is full duplex,
also extract the pause mode settings from the link partner advertisment.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: phy: marvell10g: clean up interface mode switching
Russell King [Fri, 29 Dec 2017 12:46:32 +0000 (12:46 +0000)]
net: phy: marvell10g: clean up interface mode switching

Centralise the PHY interface mode switching, rather than having it in
two places.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: phy: marvell10g: add MDI swap reporting
Russell King [Fri, 29 Dec 2017 12:46:27 +0000 (12:46 +0000)]
net: phy: marvell10g: add MDI swap reporting

Add reporting of the MDI swap to the Marvell 10G PHY driver by providing
a generic implementation for the standard 10GBASE-T pair swap register
and polarity register.  We also support reading the MDI swap status for
1G and below from a PCS register.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: phy: marvell10g: update header comments
Russell King [Fri, 29 Dec 2017 12:46:22 +0000 (12:46 +0000)]
net: phy: marvell10g: update header comments

Update header comments to indicate the newly found behaviour with XAUI
interfaces.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: hns: add ACPI mode support for ethtool -p
Jian Shen [Fri, 29 Dec 2017 09:11:11 +0000 (17:11 +0800)]
net: hns: add ACPI mode support for ethtool -p

The locate operation interface of fiber port can only
work with DT mode. Add a new interface to control the
locate led for ACPI mode.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Tested-by: Zhou Wang <wangzhou1@hisilicon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agocxgb4: Check alignment constraint for T6
Ganesh Goudar [Fri, 29 Dec 2017 07:18:04 +0000 (12:48 +0530)]
cxgb4: Check alignment constraint for T6

Update the check for setting  IPV4 filters and align filter_id
to multiple of 2, only for IPv6 filters in case of T6.

Signed-off-by: Arjun Vynipadath <arjun@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge branch 'ena-next'
David S. Miller [Tue, 2 Jan 2018 19:35:12 +0000 (14:35 -0500)]
Merge branch 'ena-next'

Netanel Belgazal says:

====================
update ENA driver to version 1.5.0

This patchset contains two changes:
* Add a robust mechanism for detection of stuck Rx/Tx rings due to
  missed or misrouted MSI-X
* Increase the driver version to 1.5.0
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: ena: increase ena driver version to 1.5.0
Netanel Belgazal [Thu, 28 Dec 2017 21:31:31 +0000 (21:31 +0000)]
net: ena: increase ena driver version to 1.5.0

Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: ena: add detection and recovery mechanism for handling missed/misrouted MSI-X
Netanel Belgazal [Thu, 28 Dec 2017 21:31:30 +0000 (21:31 +0000)]
net: ena: add detection and recovery mechanism for handling missed/misrouted MSI-X

A mechanism for detection of stuck Rx/Tx rings due to missed or
misrouted interrupts.
Check if there are unhandled completion descriptors before the first
MSI-X interrupt arrived.
The check is per queue and per interrupt vector.
Once such condition is detected, driver and device reset is scheduled.

Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge branch 'tcp-sctp-dccp-Replace-jprobe-usage-with-trace-events'
David S. Miller [Tue, 2 Jan 2018 19:28:07 +0000 (14:28 -0500)]
Merge branch 'tcp-sctp-dccp-Replace-jprobe-usage-with-trace-events'

Masami Hiramatsu says:

====================
net: tcp: sctp: dccp: Replace jprobe usage with trace events

This series is v7 of the replacement of jprobe usage with trace
events. This version fixes net/dccp/trace.h to avoid sparse
warning. Since the TP_STORE_ADDR_PORTS macro can be shared
with trace/events/tcp.h, it also introduce a new common header
file and move the definition of that macro.

Previous version is here;
 https://lkml.org/lkml/2017/12/28/7

Changes from v6:
  [5/6]: Avoid preprocessor directives in tracepoint macro args
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: dccp: Remove dccpprobe module
Masami Hiramatsu [Fri, 29 Dec 2017 02:48:25 +0000 (11:48 +0900)]
net: dccp: Remove dccpprobe module

Remove DCCP probe module since jprobe has been deprecated.
That function is now replaced by dccp/dccp_probe trace-event.
You can use it via ftrace or perftools.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: dccp: Add DCCP sendmsg trace event
Masami Hiramatsu [Fri, 29 Dec 2017 02:47:55 +0000 (11:47 +0900)]
net: dccp: Add DCCP sendmsg trace event

Add DCCP sendmsg trace event (dccp/dccp_probe) for
replacing dccpprobe. User can trace this event via
ftrace or perftools.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: sctp: Remove debug SCTP probe module
Masami Hiramatsu [Fri, 29 Dec 2017 02:47:20 +0000 (11:47 +0900)]
net: sctp: Remove debug SCTP probe module

Remove SCTP probe module since jprobe has been deprecated.
That function is now replaced by sctp/sctp_probe and
sctp/sctp_probe_path trace-events.
You can use it via ftrace or perftools.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: sctp: Add SCTP ACK tracking trace event
Masami Hiramatsu [Fri, 29 Dec 2017 02:46:51 +0000 (11:46 +0900)]
net: sctp: Add SCTP ACK tracking trace event

Add SCTP ACK tracking trace event to trace the changes of SCTP
association state in response to incoming packets.
It is used for debugging SCTP congestion control algorithms,
and will replace sctp_probe module.

Note that this event a bit tricky. Since this consists of 2
events (sctp_probe and sctp_probe_path) so you have to enable
both events as below.

  # cd /sys/kernel/debug/tracing
  # echo 1 > events/sctp/sctp_probe/enable
  # echo 1 > events/sctp/sctp_probe_path/enable

Or, you can enable all the events under sctp.

  # echo 1 > events/sctp/enable

Since sctp_probe_path event is always invoked from sctp_probe
event, you can not see any output if you only enable
sctp_probe_path.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: tcp: Remove TCP probe module
Masami Hiramatsu [Fri, 29 Dec 2017 02:46:21 +0000 (11:46 +0900)]
net: tcp: Remove TCP probe module

Remove TCP probe module since jprobe has been deprecated.
That function is now replaced by tcp/tcp_probe trace-event.
You can use it via ftrace or perftools.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: tcp: Add trace events for TCP congestion window tracing
Masami Hiramatsu [Fri, 29 Dec 2017 02:45:51 +0000 (11:45 +0900)]
net: tcp: Add trace events for TCP congestion window tracing

This adds an event to trace TCP stat variables with
slightly intrusive trace-event. This uses ftrace/perf
event log buffer to trace those state, no needs to
prepare own ring-buffer, nor custom user apps.

User can use ftrace to trace this event as below;

  # cd /sys/kernel/debug/tracing
  # echo 1 > events/tcp/tcp_probe/enable
  (run workloads)
  # cat trace

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge branch 'qed-Advance-to-FW-8.33.1.0'
David S. Miller [Tue, 2 Jan 2018 18:59:17 +0000 (13:59 -0500)]
Merge branch 'qed-Advance-to-FW-8.33.1.0'

Tomer Tayar says:

====================
qed*: Advance to FW 8.33.1.0

This series advances all qed* drivers to use firmware 8.33.1.0 which brings
new capabilities and initial support of new HW. The changes are mostly in
qed, and include changes in the FW interface files, as well as updating the
FW initialization and debug collection code. The protocol drivers have
minor functional changes for this firmware.

Patch 1 Rearranges and refactors the FW interface files in preparation of
the new FW (no functional change).
Patch 2 Prepares the code for support of new HW (no functional change).
Patch 3 Actual utilization of the new FW.
Patch 4 Advances drivers' version.

v3->v4:
Fix a compilation issue which was reported by krobot (dependency on CRC8).

v2->v3:
Resend the series with a fixed title in the cover letter.

v1->v2:
- Break the previous single patch into several patches.
- Fix compilation issues which were reported by krobot.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoqed*: Advance drivers' version to 8.33.0.20
Tomer Tayar [Wed, 27 Dec 2017 17:30:08 +0000 (19:30 +0200)]
qed*: Advance drivers' version to 8.33.0.20

Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: Chad Dupuis <Chad.Dupuis@cavium.com>
Signed-off-by: Manish Rangankar <Manish.Rangankar@cavium.com>
Signed-off-by: Tomer Tayar <Tomer.Tayar@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoqed*: Utilize FW 8.33.1.0
Tomer Tayar [Wed, 27 Dec 2017 17:30:07 +0000 (19:30 +0200)]
qed*: Utilize FW 8.33.1.0

Advance the qed* drivers to use firmware 8.33.1.0:
Modify core driver (qed) to utilize the new FW and initialize the device
with it. This is the lion's share of the patch, and includes changes to FW
interface files, device initialization flows, FW interaction flows, and
debug collection flows.
Modify Ethernet driver (qede) to make use of new FW in fastpath.
Modify RoCE/iWARP driver (qedr) to make use of new FW in fastpath.
Modify FCoE driver (qedf) to make use of new FW in fastpath.
Modify iSCSI driver (qedi) to make use of new FW in fastpath.

Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Yuval Bason <Yuval.Bason@cavium.com>
Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com>
Signed-off-by: Manish Chopra <Manish.Chopra@cavium.com>
Signed-off-by: Chad Dupuis <Chad.Dupuis@cavium.com>
Signed-off-by: Manish Rangankar <Manish.Rangankar@cavium.com>
Signed-off-by: Tomer Tayar <Tomer.Tayar@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoqed*: HSI renaming for different types of HW
Tomer Tayar [Wed, 27 Dec 2017 17:30:06 +0000 (19:30 +0200)]
qed*: HSI renaming for different types of HW

This patch renames defines and structures in the FW HSI files to allow a
distinction between different types of HW.

Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Chad Dupuis <Chad.Dupuis@cavium.com>
Signed-off-by: Manish Rangankar <Manish.Rangankar@cavium.com>
Signed-off-by: Tomer Tayar <Tomer.Tayar@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoqed*: Refactoring and rearranging FW API with no functional impact
Tomer Tayar [Wed, 27 Dec 2017 17:30:05 +0000 (19:30 +0200)]
qed*: Refactoring and rearranging FW API with no functional impact

This patch refactors and reorders the FW API files in preparation of
upgrading the code to support new FW.

- Make use of the BIT macro in appropriate places.
- Whitespace changes to align values and code blocks.
- Comments are updated (spelling mistakes, removed if not clear).
- Group together code blocks which are related or deal with similar
 matters.

Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Tomer Tayar <Tomer.Tayar@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoinet_diag: Add equal-operator for ports
Kristian Evensen [Wed, 27 Dec 2017 17:27:58 +0000 (18:27 +0100)]
inet_diag: Add equal-operator for ports

inet_diag currently provides less/greater than or equal operators for
comparing ports when filtering sockets. An equal comparison can be
performed by combining the two existing operators, or a user can for
example request a port range and then do the final filtering in
userspace. However, these approaches both have drawbacks. Implementing
equal using LE/GE causes the size and complexity of a filter to grow
quickly as the number of ports increase, while it on busy machines would
be great if the kernel only returns information about relevant sockets.

This patch introduces source and destination port equal operators.
INET_DIAG_BC_S_EQ is used to match a source port, INET_DIAG_BC_D_EQ a
destination port, and usage is the same as for the existing port
operators.  I.e., the port to match is stored in the no-member of the
next inet_diag_bc_op-struct in the filter.

Signed-off-by: Kristian Evensen <kristian.evensen@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge branch 's390-next'
David S. Miller [Tue, 2 Jan 2018 18:52:23 +0000 (13:52 -0500)]
Merge branch 's390-next'

Julian Wiedmann says:

====================
s390/qeth: updates 2017-12-27

please apply some post-christmas leftovers for 4.16.
Two patches to improve IP address management on L3, and two that add
support to auto-config the transport mode on z/VM VNICs.
Note that one patch in the series touches arch/s390 (acked by Martin).
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agos390/qeth: support early setup for z/VM NICs
Julian Wiedmann [Wed, 27 Dec 2017 16:44:31 +0000 (17:44 +0100)]
s390/qeth: support early setup for z/VM NICs

The transport mode that a z/VM NIC is configured in, must match the
hypervisor-internal network which the NIC is coupled to.

To get this right automatically, have qeth issue a diag26c hypervisor call
that provides all sorts of information for a specific VNIC.
With z/VM update VM65918, this also includes the VNIC's required
transport mode.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agos390/diag: add diag26c support for VNIC info
Julian Wiedmann [Wed, 27 Dec 2017 16:44:30 +0000 (17:44 +0100)]
s390/diag: add diag26c support for VNIC info

With subcode 0x24, diag26c returns all sorts of VNIC-related information.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agos390/qeth: use common helper to display rxip/vipa
Julian Wiedmann [Wed, 27 Dec 2017 16:44:29 +0000 (17:44 +0100)]
s390/qeth: use common helper to display rxip/vipa

By parameterising the address type, we need just one helper that walks
the IP table and builds up the response string.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agos390/qeth: improve error reporting on IP add/removal
Julian Wiedmann [Wed, 27 Dec 2017 16:44:28 +0000 (17:44 +0100)]
s390/qeth: improve error reporting on IP add/removal

When adding & removing IP entries for rxip/vipa/ipato/hsuid, forward any
resulting errors back to the sysfs-level caller.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoopenvswitch: drop unneeded newline
Julia Lawall [Wed, 27 Dec 2017 14:51:38 +0000 (15:51 +0100)]
openvswitch: drop unneeded newline

OVS_NLERR prints a newline at the end of the message string, so the
message string does not need to include a newline explicitly.  Done
using Coccinelle.

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: dccp: drop unneeded newline
Julia Lawall [Wed, 27 Dec 2017 14:51:36 +0000 (15:51 +0100)]
net: dccp: drop unneeded newline

DCCP_CRIT prints some other text and then a newline after the message
string, so the message string does not need to include a newline
explicitly.  Done using Coccinelle.

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: sched: fix skb leak in dev_requeue_skb()
Wei Yongjun [Wed, 27 Dec 2017 09:05:52 +0000 (17:05 +0800)]
net: sched: fix skb leak in dev_requeue_skb()

When dev_requeue_skb() is called with bulked skb list, only the
first skb of the list will be requeued to qdisc layer, and leak
the others without free them.

TCP is broken due to skb leak since no free skb will be considered
as still in the host queue and never be retransmitted. This happend
when dev_requeue_skb() called from qdisc_restart().
  qdisc_restart
  |-- dequeue_skb
  |-- sch_direct_xmit()
      |-- dev_requeue_skb() <-- skb may bluked

Fix dev_requeue_skb() to requeue the full bluked list. Also change
to use __skb_queue_tail() in __dev_requeue_skb() to avoid skb out
of order.

Fixes: a53851e2c321 ("net: sched: explicit locking in gso_cpu fallback")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agocxgb4: use CLIP with LIP6 on T6 for TCAM filters
Ganesh Goudar [Wed, 27 Dec 2017 07:42:57 +0000 (13:12 +0530)]
cxgb4: use CLIP with LIP6 on T6 for TCAM filters

On T6, LIP compression is always enabled for IPv6 and uncompressed
IPv6 for LIP is not supported. So, for IPv6 TCAM filters on T6,
add LIP6 to CLIP on filter creation, and release the same on filter
deletion.

Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoselftests: rtnetlink: add erspan and ip6erspan
William Tu [Tue, 26 Dec 2017 19:10:07 +0000 (11:10 -0800)]
selftests: rtnetlink: add erspan and ip6erspan

Add test cases for ipv4, ipv6 erspan, v1 and v2 native mode
and external (collect metadata) mode.

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoforcedeth: optimize the rx with likely
Zhu Yanjun [Tue, 26 Dec 2017 06:22:50 +0000 (01:22 -0500)]
forcedeth: optimize the rx with likely

In the rx fastpath, the function netdev_alloc_skb rarely fails.
Therefore, a likely() optimization is added to this error check
conditional.

CC: Srinivas Eeda <srinivas.eeda@oracle.com>
CC: Joe Jin <joe.jin@oracle.com>
CC: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoselftests/net: fix bugs in address and port initialization
Sowmini Varadhan [Mon, 25 Dec 2017 22:43:04 +0000 (14:43 -0800)]
selftests/net: fix bugs in address and port initialization

Address/port initialization should work correctly regardless
of the order in which command line arguments are supplied,
E.g, cfg_port should be used to connect to the remote host
even if it is processed after -D, src/dst address initialization
should not require that [-4|-6] be specified before
the -S or -D args, receiver should be able to bind to *.<cfg_port>

Achieve this by making sure that the address/port structures
are initialized after all command line options are parsed.

Store cfg_port in host-byte order, and use htons()
to set up the sin_port/sin6_port before bind/connect,
so that the network system calls get the correct values
in network-byte order.

Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge branch 'net-sched-Fix-RED-qdisc-offload-flag'
David S. Miller [Tue, 2 Jan 2018 18:17:46 +0000 (13:17 -0500)]
Merge branch 'net-sched-Fix-RED-qdisc-offload-flag'

Nogah Frankel says:

====================
net: sched: Fix RED qdisc offload flag

Replacing the RED offload flag (TC_RED_OFFLOADED) with the generic one
(TCQ_F_OFFLOADED) deleted some of the logic behind it. This patchset fixes
this problem.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: sched: Move offload check till after dump call
Nogah Frankel [Mon, 25 Dec 2017 08:51:42 +0000 (10:51 +0200)]
net: sched: Move offload check till after dump call

Move the check of the offload state to after the qdisc dump action was
called, so the qdisc could update it if it was changed.

Fixes: 7a4fa29106d9 ("net: sched: Add TCA_HW_OFFLOAD")
Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Reviewed-by: Yuval Mintz <yuvalm@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet_sch: red: Fix the new offload indication
Nogah Frankel [Mon, 25 Dec 2017 08:51:41 +0000 (10:51 +0200)]
net_sch: red: Fix the new offload indication

Update the offload flag, TCQ_F_OFFLOADED, in each dump call (and ignore
the offloading function return value in relation to this flag).
This is done because a qdisc is being initialized, and therefore offloaded
before being grafted. Since the ability of the driver to offload the qdisc
depends on its location, a qdisc can be offloaded and un-offloaded by graft
calls, that doesn't effect the qdisc itself.

Fixes: 428a68af3a7c ("net: sched: Move to new offload indication in RED"
Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Reviewed-by: Yuval Mintz <yuvalm@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agosky2: Replace mdelay with msleep in sky2_vpd_wait
Jia-Ju Bai [Sun, 24 Dec 2017 03:54:33 +0000 (11:54 +0800)]
sky2: Replace mdelay with msleep in sky2_vpd_wait

sky2_vpd_wait is not called in an interrupt handler nor holding a spinlock.
The function mdelay in it can be replaced with msleep, to reduce busy wait.

Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: ptr_ring: otherwise safe empty checks can overrun array bounds
John Fastabend [Thu, 28 Dec 2017 03:50:25 +0000 (19:50 -0800)]
net: ptr_ring: otherwise safe empty checks can overrun array bounds

When running consumer and/or producer operations and empty checks in
parallel its possible to have the empty check run past the end of the
array. The scenario occurs when an empty check is run while
__ptr_ring_discard_one() is in progress. Specifically after the
consumer_head is incremented but before (consumer_head >= ring_size)
check is made and the consumer head is zeroe'd.

To resolve this, without having to rework how consumer/producer ops
work on the array, simply add an extra dummy slot to the end of the
array. Even if we did a rework to avoid the extra slot it looks
like the normal case checks would suffer some so best to just
allocate an extra pointer.

Reported-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Fixes: c5ad119fb6c09 ("net: sched: pfifo_fast use skb_array")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
David S. Miller [Fri, 29 Dec 2017 20:14:27 +0000 (15:14 -0500)]
Merge git://git./linux/kernel/git/davem/net

net/ipv6/ip6_gre.c is a case of parallel adds.

include/trace/events/tcp.h is a little bit more tricky.  The removal
of in-trace-macro ifdefs in 'net' paralleled with moving
show_tcp_state_name and friends over to include/trace/events/sock.h
in 'net-next'.

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Linus Torvalds [Fri, 29 Dec 2017 07:20:21 +0000 (23:20 -0800)]
Merge git://git./linux/kernel/git/davem/net

Pull networking fixes from David Miller:

 1) IPv6 gre tunnels end up with different default features enabled
    depending upon whether netlink or ioctls are used to bring them up.
    Fix from Alexey Kodanev.

 2) Fix read past end of user control message in RDS< from Avinash
    Repaka.

 3) Missing RCU barrier in mini qdisc code, from Cong Wang.

 4) Missing policy put when reusing per-cpu route entries, from Florian
    Westphal.

 5) Handle nested PCI errors properly in bnx2x driver, from Guilherme G.
    Piccoli.

 6) Run nested transport mode IPSEC packets via tasklet, from Herbert
    Xu.

 7) Fix handling poll() for stream sockets in tipc, from Parthasarathy
    Bhuvaragan.

 8) Fix two stack-out-of-bounds issues in IPSEC, from Steffen Klassert.

 9) Another zerocopy ubuf handling fix, from Willem de Bruijn.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (33 commits)
  strparser: Call sock_owned_by_user_nocheck
  sock: Add sock_owned_by_user_nocheck
  skbuff: in skb_copy_ubufs unclone before releasing zerocopy
  tipc: fix hanging poll() for stream sockets
  sctp: Replace use of sockets_allocated with specified macro.
  bnx2x: Improve reliability in case of nested PCI errors
  tg3: Enable PHY reset in MTU change path for 5720
  tg3: Add workaround to restrict 5762 MRRS to 2048
  tg3: Update copyright
  net: fec: unmap the xmit buffer that are not transferred by DMA
  tipc: fix tipc_mon_delete() oops in tipc_enable_bearer() error path
  tipc: error path leak fixes in tipc_enable_bearer()
  RDS: Check cmsg_len before dereferencing CMSG_DATA
  tcp: Avoid preprocessor directives in tracepoint macro args
  tipc: fix memory leak of group member when peer node is lost
  net: sched: fix possible null pointer deref in tcf_block_put
  tipc: base group replicast ack counter on number of actual receivers
  net_sched: fix a missing rcu barrier in mini_qdisc_pair_swap()
  net: phy: micrel: ksz9031: reconfigure autoneg after phy autoneg workaround
  ip6_gre: fix device features for ioctl setup
  ...

6 years agoMerge tag 'drm-fixes-for-v4.15-rc6' of git://people.freedesktop.org/~airlied/linux
Linus Torvalds [Fri, 29 Dec 2017 07:16:24 +0000 (23:16 -0800)]
Merge tag 'drm-fixes-for-v4.15-rc6' of git://people.freedesktop.org/~airlied/linux

Pull drm fixes from Dave Airlie:
 "nouveau and i915 regression fixes"

* tag 'drm-fixes-for-v4.15-rc6' of git://people.freedesktop.org/~airlied/linux:
  drm/nouveau: fix race when adding delayed work items
  i915: Reject CCS modifiers for pipe C on Geminilake
  drm/i915/gvt: Fix pipe A enable as default for vgpu

6 years agoMerge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 29 Dec 2017 07:14:47 +0000 (23:14 -0800)]
Merge tag 'clk-fixes-for-linus' of git://git./linux/kernel/git/clk/linux

Pull clk fix from Stephen Boyd:
 "One more fix for the runtime PM clk patches. We're calling a runtime
  PM API that may schedule from somewhere that we can't do that. We
  change to the async version of pm_runtime_put() to fix it"

* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  clk: use atomic runtime pm api in clk_core_is_enabled

6 years agoMerge tag 'led_fixes_for_4.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 29 Dec 2017 07:09:45 +0000 (23:09 -0800)]
Merge tag 'led_fixes_for_4.15-rc6' of git://git./linux/kernel/git/j.anaszewski/linux-leds

Pull LED fix from Jacek Anaszewski:
 "A single LED fix for brightness setting when delay_off is 0"

* tag 'led_fixes_for_4.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds:
  led: core: Fix brightness setting when setting delay_off=0

6 years agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Linus Torvalds [Fri, 29 Dec 2017 07:06:01 +0000 (23:06 -0800)]
Merge tag 'for-linus' of git://git./linux/kernel/git/rdma/rdma

Pull rdma fixes from Jason Gunthorpe:
 "This is the next batch of for-rc patches from RDMA. It includes the
  fix for the ipoib regression I mentioned last time, and the result of
  a fairly major debugging effort to get iser working reliably on cxgb4
  hardware - it turns out the cxgb4 driver was not handling QP error
  flushing properly causing iser to fail.

   - cxgb4 fix for an iser testing failure as debugged by Steve and
     Sagi. The problem was a driver bug in the handling of shutting down
     a QP.

   - Various vmw_pvrdma fixes for bogus WARN_ON, missed resource free on
     error unwind and a use after free bug

   - Improper congestion counter values on mlx5 when link aggregation is
     enabled

   - ipoib lockdep regression introduced in this merge window

   - hfi1 regression supporting the device in a VM introduced in a
     recent patch

   - Typo that breaks future uAPI compatibility in the verbs core

   - More SELinux related oops fixing

   - Fix an oops during error unwind in mlx5"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  IB/mlx5: Fix mlx5_ib_alloc_mr error flow
  IB/core: Verify that QP is security enabled in create and destroy
  IB/uverbs: Fix command checking as part of ib_uverbs_ex_modify_qp()
  IB/mlx5: Serialize access to the VMA list
  IB/hfi: Only read capability registers if the capability exists
  IB/ipoib: Fix lockdep issue found on ipoib_ib_dev_heavy_flush
  IB/mlx5: Fix congestion counters in LAG mode
  RDMA/vmw_pvrdma: Avoid use after free due to QP/CQ/SRQ destroy
  RDMA/vmw_pvrdma: Use refcount_dec_and_test to avoid warning
  RDMA/vmw_pvrdma: Call ib_umem_release on destroy QP path
  iw_cxgb4: when flushing, complete all wrs in a chain
  iw_cxgb4: reflect the original WR opcode in drain cqes
  iw_cxgb4: Only validate the MSN for successful completions

6 years agoMerge tag 'mlx5-shared-4.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mella...
David S. Miller [Fri, 29 Dec 2017 00:32:59 +0000 (19:32 -0500)]
Merge tag 'mlx5-shared-4.16-1' of git://git./linux/kernel/git/mellanox/linux

Saeed Mahameed says:

====================
Mellanox, mlx5 E-Switch updates 2017-12-19

This series includes updates for mlx5 E-Switch infrastructures,
to be merged into net-next and rdma-next trees.

Mark's patches provide E-Switch refactoring that generalize the mlx5
E-Switch vf representors interfaces and data structures. The serious is
mainly focused on moving ethernet (netdev) specific representors logic out
of E-Switch (eswitch.c) into mlx5e representor module (en_rep.c), which
provides better separation and allows future support for other types of vf
representors (e.g. RDMA).

Gal's patches at the end of this serious, provide a simple syntax fix and
two other patches that handles vport ingress/egress ACL steering name
spaces to be aligned with the Firmware/Hardware specs.

V1->V2:
 - Addressed coding style comments in patches #1 and #7
 - The series is still based on rc4, as now I see net-next is also @rc4.

V2->V3:
 - Fixed compilation warning, reported by Dave.

Please pull and let me know if there's any problem.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet/mlx5: Separate ingress/egress namespaces for each vport
Gal Pressman [Tue, 28 Nov 2017 09:58:51 +0000 (11:58 +0200)]
net/mlx5: Separate ingress/egress namespaces for each vport

Each vport has its own root flow table for the ACL flow tables and root
flow table is per namespace, therefore we should create a namespace for
each vport.

Fixes: efdc810ba39d ("net/mlx5: Flow steering, Add vport ACL support")
Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
6 years agonet/mlx5: Fix ingress/egress naming mistake
Gal Pressman [Wed, 29 Nov 2017 10:08:15 +0000 (12:08 +0200)]
net/mlx5: Fix ingress/egress naming mistake

The functions names do not represent their actions, switch the mistaken
ingress/egress naming.

Fixes: fba53f7b5719 ("net/mlx5: Introduce mlx5_flow_steering structure")
Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
6 years agonet/mlx5e: E-Switch, Use the name of static array instead of its address
Gal Pressman [Thu, 26 Oct 2017 13:11:25 +0000 (16:11 +0300)]
net/mlx5e: E-Switch, Use the name of static array instead of its address

Using the address of a static array is the same as using its name (in
this specific use-case), but it's confusing and makes the code less
readable.

Fixes: 1bd27b11c1df ("net/mlx5: Introduce E-switch QoS management")
Fixes: bd77bf1cb595 ("net/mlx5: Add SRIOV VF max rate configuration support")
Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
6 years agonet/mlx5e: E-Switch, Move send-to-vport rule struct to en_rep
Mark Bloch [Thu, 7 Dec 2017 21:39:52 +0000 (21:39 +0000)]
net/mlx5e: E-Switch, Move send-to-vport rule struct to en_rep

Move struct mlx5_esw_sq which keeps send-to-vport rule to from the eswitch
code to mlx5e and rename it to better reflect where it belongs

Signed-off-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
6 years agonet/mlx5: E-Switch, Create generic header struct to be used by representors
Mark Bloch [Thu, 7 Dec 2017 21:25:57 +0000 (21:25 +0000)]
net/mlx5: E-Switch, Create generic header struct to be used by representors

Now that we don't store type dependent data in struct mlx5_eswitch_rep
we can create a generic interface, and representor type.

struct mlx5_eswitch_rep will store an array of interfaces, each
interface is used by a different representor type.

Once we moved to a more generic interface, rdma driver representors can
be added and utilize the same mechanism as the Ethernet driver
representors use.

Signed-off-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
6 years agoMerge branch 'strparser-Fix-lockdep-issue'
David S. Miller [Thu, 28 Dec 2017 19:28:23 +0000 (14:28 -0500)]
Merge branch 'strparser-Fix-lockdep-issue'

Tom Herbert says:

====================
strparser: Fix lockdep issue

When sock_owned_by_user returns true in strparser. Fix is to add and
call sock_owned_by_user_nocheck since the check for owned by user is
not an error condition in this case.
====================

Fixes: 43a0c6751a322847 ("strparser: Stream parser for messages")
Reported-by: syzbot <syzkaller@googlegroups.com>
Reported-and-tested-by: <syzbot+c91c53af67f9ebe599a337d2e70950366153b295@syzkaller.appspotmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agostrparser: Call sock_owned_by_user_nocheck
Tom Herbert [Thu, 28 Dec 2017 19:00:44 +0000 (11:00 -0800)]
strparser: Call sock_owned_by_user_nocheck

strparser wants to check socket ownership without producing any
warnings. As indicated by the comment in the code, it is permissible
for owned_by_user to return true.

Fixes: 43a0c6751a322847 ("strparser: Stream parser for messages")
Reported-by: syzbot <syzkaller@googlegroups.com>
Reported-and-tested-by: <syzbot+c91c53af67f9ebe599a337d2e70950366153b295@syzkaller.appspotmail.com>
Signed-off-by: Tom Herbert <tom@quantonium.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agosock: Add sock_owned_by_user_nocheck
Tom Herbert [Thu, 28 Dec 2017 19:00:43 +0000 (11:00 -0800)]
sock: Add sock_owned_by_user_nocheck

This allows checking socket lock ownership with producing lockdep
warnings.

Signed-off-by: Tom Herbert <tom@quantonium.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoskbuff: in skb_copy_ubufs unclone before releasing zerocopy
Willem de Bruijn [Thu, 28 Dec 2017 17:38:13 +0000 (12:38 -0500)]
skbuff: in skb_copy_ubufs unclone before releasing zerocopy

skb_copy_ubufs must unclone before it is safe to modify its
skb_shared_info with skb_zcopy_clear.

Commit b90ddd568792 ("skbuff: skb_copy_ubufs must release uarg even
without user frags") ensures that all skbs release their zerocopy
state, even those without frags.

But I forgot an edge case where such an skb arrives that is cloned.

The stack does not build such packets. Vhost/tun skbs have their
frags orphaned before cloning. TCP skbs only attach zerocopy state
when a frag is added.

But if TCP packets can be trimmed or linearized, this might occur.
Tracing the code I found no instance so far (e.g., skb_linearize
ends up calling skb_zcopy_clear if !skb->data_len).

Still, it is non-obvious that no path exists. And it is fragile to
rely on this.

Fixes: b90ddd568792 ("skbuff: skb_copy_ubufs must release uarg even without user frags")
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge branch 'mlx4-misc-for-4.16'
David S. Miller [Thu, 28 Dec 2017 17:24:06 +0000 (12:24 -0500)]
Merge branch 'mlx4-misc-for-4.16'

Tariq Toukan says:

====================
mlx4 misc for 4.16

This patchset contains misc cleanups and improvements
to the mlx4 Core and Eth drivers.

In patches 1 and 2 I reduce and reorder the branches in the RX csum flow.
In patch 3 I align the FMR unmapping flow with the device spec, to allow
  a remapping afterwards.
Patch 4 by Moni changes the default QoS settings so that a pause
  frame stops all traffic regardless of its prio.

Series generated against net-next commit:
836df24a7062 net: hns3: hns3_get_channels() can be static
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet/mlx4_en: Change default QoS settings
Moni Shoua [Thu, 28 Dec 2017 14:26:11 +0000 (16:26 +0200)]
net/mlx4_en: Change default QoS settings

Change the default mapping between TC and TCG as follows:

Prio     |             TC/TCG
         |      from             to
         |    (set by FW)      (set by SW)
---------+-----------------------------------
0        |      0/0              0/7
1        |      1/0              0/6
2        |      2/0              0/5
3        |      3/0              0/4
4        |      4/0              0/3
5        |      5/0              0/2
6        |      6/0              0/1
7        |      7/0              0/0

These new settings cause that a pause frame for any prio stops
traffic for all prios.

Fixes: 564c274c3df0 ("net/mlx4_en: DCB QoS support")
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet/mlx4_core: Cleanup FMR unmapping flow
Tariq Toukan [Thu, 28 Dec 2017 14:26:10 +0000 (16:26 +0200)]
net/mlx4_core: Cleanup FMR unmapping flow

Remove redundant and not essential operations in fmr unmap/free.
According to device spec, in FMR unmap it is sufficient to set
ownership bit to SW. This allows remapping afterwards.

Fixes: 8ad11fb6b073 ("IB/mlx4: Implement FMRs")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet/mlx4_en: RX csum, reorder branches
Tariq Toukan [Thu, 28 Dec 2017 14:26:09 +0000 (16:26 +0200)]
net/mlx4_en: RX csum, reorder branches

Use early goto commands, and save else branches.
This uses less indentations and brackets, making the code
more readable.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet/mlx4_en: RX csum, remove redundant branches and checks
Tariq Toukan [Thu, 28 Dec 2017 14:26:08 +0000 (16:26 +0200)]
net/mlx4_en: RX csum, remove redundant branches and checks

Do not check IPv6 bit in cqe status if CONFIG_IPV6 is not enabled.
Function check_csum() is reached only with IPv4 or IPv6 set (if enabled),
if IPv6 is not set (or is not enabled) it is redundant to test the
IPv4 bit.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: sched: don't set extack message in case the qdisc will be created
Jiri Pirko [Thu, 28 Dec 2017 15:52:10 +0000 (16:52 +0100)]
net: sched: don't set extack message in case the qdisc will be created

If the qdisc is not found here, it is going to be created. Therefore,
this is not an error path. Remove the extack message set and don't
confuse user with error message in case the qdisc was created
successfully.

Fixes: 09215598119e ("net: sched: sch_api: handle generic qdisc errors")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agotipc: fix hanging poll() for stream sockets
Parthasarathy Bhuvaragan [Thu, 28 Dec 2017 11:03:06 +0000 (12:03 +0100)]
tipc: fix hanging poll() for stream sockets

In commit 42b531de17d2f6 ("tipc: Fix missing connection request
handling"), we replaced unconditional wakeup() with condtional
wakeup for clients with flags POLLIN | POLLRDNORM | POLLRDBAND.

This breaks the applications which do a connect followed by poll
with POLLOUT flag. These applications are not woken when the
connection is ESTABLISHED and hence sleep forever.

In this commit, we fix it by including the POLLOUT event for
sockets in TIPC_CONNECTING state.

Fixes: 42b531de17d2f6 ("tipc: Fix missing connection request handling")
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>