openwrt/staging/blogic.git
6 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
David S. Miller [Thu, 29 Nov 2018 06:10:54 +0000 (22:10 -0800)]
Merge git://git./linux/kernel/git/davem/net

Trivial conflict in net/core/filter.c, a locally computed
'sdif' is now an argument to the function.

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Linus Torvalds [Wed, 28 Nov 2018 20:53:48 +0000 (12:53 -0800)]
Merge git://git./linux/kernel/git/davem/net

Pull networking fixes from David Miller:

 1) ARM64 JIT fixes for subprog handling from Daniel Borkmann.

 2) Various sparc64 JIT bug fixes (fused branch convergance, frame
    pointer usage detection logic, PSEODU call argument handling).

 3) Fix to use BH locking in nf_conncount, from Taehee Yoo.

 4) Fix race of TX skb freeing in ipheth driver, from Bernd Eckstein.

 5) Handle return value of TX NAPI completion properly in lan743x
    driver, from Bryan Whitehead.

 6) MAC filter deletion in i40e driver clears wrong state bit, from
    Lihong Yang.

 7) Fix use after free in rionet driver, from Pan Bian.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (53 commits)
  s390/qeth: fix length check in SNMP processing
  net: hisilicon: remove unexpected free_netdev
  rapidio/rionet: do not free skb before reading its length
  i40e: fix kerneldoc for xsk methods
  ixgbe: recognize 1000BaseLX SFP modules as 1Gbps
  i40e: Fix deletion of MAC filters
  igb: fix uninitialized variables
  netfilter: nf_tables: deactivate expressions in rule replecement routine
  lan743x: Enable driver to work with LAN7431
  tipc: fix lockdep warning during node delete
  lan743x: fix return value for lan743x_tx_napi_poll
  net: via: via-velocity: fix spelling mistake "alignement" -> "alignment"
  qed: fix spelling mistake "attnetion" -> "attention"
  net: thunderx: fix NULL pointer dereference in nic_remove
  sctp: increase sk_wmem_alloc when head->truesize is increased
  firestream: fix spelling mistake: "Inititing" -> "Initializing"
  net: phy: add workaround for issue where PHY driver doesn't bind to the device
  usbnet: ipheth: fix potential recvmsg bug and recvmsg bug 2
  sparc: Adjust bpf JIT prologue for PSEUDO calls.
  bpf, doc: add entries of who looks over which jits
  ...

6 years agoMerge tag 'xtensa-20181128' of git://github.com/jcmvbkbc/linux-xtensa
Linus Torvalds [Wed, 28 Nov 2018 20:51:10 +0000 (12:51 -0800)]
Merge tag 'xtensa-20181128' of git://github.com/jcmvbkbc/linux-xtensa

Pull Xtensa fixes from Max Filippov:

 - fix kernel exception on userspace access to a currently disabled
   coprocessor

 - fix coprocessor data saving/restoring in configurations with multiple
   coprocessors

 - fix ptrace access to coprocessor data on configurations with multiple
   coprocessors with high alignment requirements

* tag 'xtensa-20181128' of git://github.com/jcmvbkbc/linux-xtensa:
  xtensa: fix coprocessor part of ptrace_{get,set}xregs
  xtensa: fix coprocessor context offset definitions
  xtensa: enable coprocessors that are being flushed

6 years agoMerge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net...
David S. Miller [Wed, 28 Nov 2018 19:33:35 +0000 (11:33 -0800)]
Merge branch '1GbE' of git://git./linux/kernel/git/jkirsher/net-queue

Jeff Kirsher says:

====================
Intel Wired LAN Driver Fixes 2018-11-28

This series contains fixes to igb, ixgbe and i40e.

Yunjian Wang from Huawei resolves a variable that could potentially be
NULL before it is used.

Lihong fixes an i40e issue which goes back to 4.17 kernels, where
deleting any of the MAC filters was causing the incorrect syncing for
the PF.

Josh Elsasser caught that there were missing enum values in the link
capabilities for x550 devices, which was preventing link for 1000BaseLX
SFP modules.

Jan fixes the function header comments for XSK methods.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agos390/qeth: fix length check in SNMP processing
Julian Wiedmann [Wed, 28 Nov 2018 15:20:50 +0000 (16:20 +0100)]
s390/qeth: fix length check in SNMP processing

The response for a SNMP request can consist of multiple parts, which
the cmd callback stages into a kernel buffer until all parts have been
received. If the callback detects that the staging buffer provides
insufficient space, it bails out with error.
This processing is buggy for the first part of the response - while it
initially checks for a length of 'data_len', it later copies an
additional amount of 'offsetof(struct qeth_snmp_cmd, data)' bytes.

Fix the calculation of 'data_len' for the first part of the response.
This also nicely cleans up the memcpy code.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: qualcomm: rmnet: remove set but not used variables 'ip_family, fc_seq, qos_id'
YueHaibing [Wed, 28 Nov 2018 12:31:42 +0000 (20:31 +0800)]
net: qualcomm: rmnet: remove set but not used variables 'ip_family, fc_seq, qos_id'

Fixes gcc '-Wunused-but-set-variable' warning:

drivers/net/ethernet/qualcomm/rmnet/rmnet_map_command.c:26:6:
 warning: variable 'ip_family' set but not used [-Wunused-but-set-variable]
drivers/net/ethernet/qualcomm/rmnet/rmnet_map_command.c:27:6:
 warning: variable 'fc_seq' set but not used [-Wunused-but-set-variable]
drivers/net/ethernet/qualcomm/rmnet/rmnet_map_command.c:28:6:
 warning: variable 'qos_id' set but not used [-Wunused-but-set-variable]

It never used since introduction in commit
ceed73a2cf4a ("drivers: net: ethernet: qualcomm: rmnet: Initial implementation")

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoqlcnic: remove set but not used variables 'cur_rings, max_hw_rings, tx_desc_info'
YueHaibing [Wed, 28 Nov 2018 12:29:39 +0000 (20:29 +0800)]
qlcnic: remove set but not used variables 'cur_rings, max_hw_rings, tx_desc_info'

Fixes gcc '-Wunused-but-set-variable' warning:

drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c:4011:5:
 warning: variable 'max_hw_rings' set but not used [-Wunused-but-set-variable]
drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c:4013:6:
 warning: variable 'cur_rings' set but not used [-Wunused-but-set-variable]
drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c:2996:25:
 warning: variable 'tx_desc_info' set but not used [-Wunused-but-set-variable]

'cur_rings, max_hw_rings' never used since introduction in commit
34e8c406fda5 ("qlcnic: refactor Tx/SDS ring calculation and validation in driver.")
'tx_desc_info' never used since commit
95b3890ae39f ("qlcnic: Enhance Tx timeout debugging.")
Also 'queue_type' only can be QLCNIC_RX_QUEUE/QLCNIC_TX_QUEUE,
so make a trival cleanup on if statement.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Shahed Shaikh <shahed.shaikh@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: neterion: vxge: remove set but not used variables 'max_frags' and 'txdl_priv'
YueHaibing [Wed, 28 Nov 2018 12:24:41 +0000 (20:24 +0800)]
net: neterion: vxge: remove set but not used variables 'max_frags' and 'txdl_priv'

Fixes gcc '-Wunused-but-set-variable' warning:

drivers/net/ethernet/neterion/vxge/vxge-traffic.c:1698:35:
 warning: variable 'txdl_priv' set but not used [-Wunused-but-set-variable]
drivers/net/ethernet/neterion/vxge/vxge-traffic.c:1699:6:
 warning: variable 'max_frags' set but not used [-Wunused-but-set-variable]

It never used since introduction in
commit 113241321dcd ("Neterion: New driver: Traffic & alarm handler")

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
David S. Miller [Wed, 28 Nov 2018 19:02:45 +0000 (11:02 -0800)]
Merge git://git./pub/scm/linux/kernel/git/pablo/nf

Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains Netfilter fixes for net:

1) Disable BH while holding list spinlock in nf_conncount, from
   Taehee Yoo.

2) List corruption in nf_conncount, also from Taehee.

3) Fix race that results in leaving around an empty list node in
   nf_conncount, from Taehee Yoo.

4) Proper chain handling for inactive chains from the commit path,
   from Florian Westphal. This includes a selftest for this.

5) Do duplicate rule handles when replacing rules, also from Florian.

6) Remove net_exit path in xt_RATEEST that results in splat, from Taehee.

7) Possible use-after-free in nft_compat when releasing extensions.
   From Florian.

8) Memory leak in xt_hashlimit, from Taehee.

9) Call ip_vs_dst_notifier after ipv6_dev_notf, from Xin Long.

10) Fix cttimeout with udplite and gre, from Florian.

11) Preserve oif for IPv6 link-local generated traffic from mangle
    table, from Alin Nastac.

12) Missing error handling in masquerade notifiers, from Taehee Yoo.

13) Use mutex to protect registration/unregistration of masquerade
    extensions in order to prevent a race, from Taehee.

14) Incorrect condition check in tree_nodes_free(), also from Taehee.

15) Fix chain counter leak in rule replacement path, from Taehee.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge branch 'dpaa2-eth-Introduce-XDP-support'
David S. Miller [Wed, 28 Nov 2018 18:57:46 +0000 (10:57 -0800)]
Merge branch 'dpaa2-eth-Introduce-XDP-support'

Ioana Ciocoi Radulescu says:

====================
dpaa2-eth: Introduce XDP support

Add support for XDP programs. Only XDP_PASS, XDP_DROP and XDP_TX
actions are supported for now. Frame header changes are also
allowed.

v2: - count the XDP packets in the rx/tx inteface stats
    - add message with the maximum supported MTU value for XDP
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agodpaa2-eth: Add xdp counters
Ioana Ciocoi Radulescu [Mon, 26 Nov 2018 16:27:34 +0000 (16:27 +0000)]
dpaa2-eth: Add xdp counters

Add counters for xdp processed frames to the channel statistics.

Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agodpaa2-eth: Cleanup channel stats
Ioana Ciocoi Radulescu [Mon, 26 Nov 2018 16:27:33 +0000 (16:27 +0000)]
dpaa2-eth: Cleanup channel stats

Remove unused counter. Reorder fields in channel stats structure
to match the ethtool strings order and make it easier to print them
with ethtool -S.

Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agodpaa2-eth: Add support for XDP_TX
Ioana Ciocoi Radulescu [Mon, 26 Nov 2018 16:27:33 +0000 (16:27 +0000)]
dpaa2-eth: Add support for XDP_TX

Send frames back on the same port for XDP_TX action.
Since the frame buffers have been allocated by us, we can recycle
them directly into the Rx buffer pool instead of requesting a
confirmation frame upon transmission complete.

Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agodpaa2-eth: Map Rx buffers as bidirectional
Ioana Ciocoi Radulescu [Mon, 26 Nov 2018 16:27:32 +0000 (16:27 +0000)]
dpaa2-eth: Map Rx buffers as bidirectional

In order to support enqueueing Rx FDs back to hardware, we need to
DMA map Rx buffers as bidirectional.

Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agodpaa2-eth: Release buffers back to pool on XDP_DROP
Ioana Ciocoi Radulescu [Mon, 26 Nov 2018 16:27:31 +0000 (16:27 +0000)]
dpaa2-eth: Release buffers back to pool on XDP_DROP

Instead of freeing the RX buffers, release them back into the pool.
We wait for the maximum number of buffers supported by a single
release command to accumulate before issuing the command.

Also, don't unmap the Rx buffers at the beginning of the Rx routine
anymore, since that would require remapping them before release.
Instead, just do a DMA sync at first and only unmap if the frame is
meant for the stack.

Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agodpaa2-eth: Move function
Ioana Ciocoi Radulescu [Mon, 26 Nov 2018 16:27:31 +0000 (16:27 +0000)]
dpaa2-eth: Move function

We'll use function free_bufs() on the XDP path as well, so move
it higher in order to avoid a forward declaration.

Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agodpaa2-eth: Allow XDP header adjustments
Ioana Ciocoi Radulescu [Mon, 26 Nov 2018 16:27:30 +0000 (16:27 +0000)]
dpaa2-eth: Allow XDP header adjustments

Reserve XDP_PACKET_HEADROOM bytes in Rx buffers to allow XDP
programs to increase frame header size.

Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agodpaa2-eth: Add basic XDP support
Ioana Ciocoi Radulescu [Mon, 26 Nov 2018 16:27:29 +0000 (16:27 +0000)]
dpaa2-eth: Add basic XDP support

We keep one XDP program reference per channel. The only actions
supported for now are XDP_DROP and XDP_PASS.

Until now we didn't enforce a maximum size for Rx frames based
on MTU value. Change that, since for XDP mode we must ensure no
scatter-gather frames can be received.

Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Acked-by: Camelia Groza <camelia.groza@nxp.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: hisilicon: remove unexpected free_netdev
Pan Bian [Wed, 28 Nov 2018 07:30:24 +0000 (15:30 +0800)]
net: hisilicon: remove unexpected free_netdev

The net device ndev is freed via free_netdev when failing to register
the device. The control flow then jumps to the error handling code
block. ndev is used and freed again. Resulting in a use-after-free bug.

Signed-off-by: Pan Bian <bianpan2016@163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agorapidio/rionet: do not free skb before reading its length
Pan Bian [Wed, 28 Nov 2018 06:53:19 +0000 (14:53 +0800)]
rapidio/rionet: do not free skb before reading its length

skb is freed via dev_kfree_skb_any, however, skb->len is read then. This
may result in a use-after-free bug.

Fixes: e6161d64263 ("rapidio/rionet: rework driver initialization and removal")
Signed-off-by: Pan Bian <bianpan2016@163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoi40e: fix kerneldoc for xsk methods
Jan Sokolowski [Tue, 27 Nov 2018 17:35:35 +0000 (09:35 -0800)]
i40e: fix kerneldoc for xsk methods

One method, xsk_umem_setup, had an incorrect kernel doc
description, which has been corrected.

Also fixes small typos found in the comments.

Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years agoMerge tag 'for-4.20-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave...
Linus Torvalds [Wed, 28 Nov 2018 16:38:20 +0000 (08:38 -0800)]
Merge tag 'for-4.20-rc4-tag' of git://git./linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:
 "Some of these bugs are being hit during testing so we'd like to get
  them merged, otherwise there are usual stability fixes for stable
  trees"

* tag 'for-4.20-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: relocation: set trans to be NULL after ending transaction
  Btrfs: fix race between enabling quotas and subvolume creation
  Btrfs: send, fix infinite loop due to directory rename dependencies
  Btrfs: ensure path name is null terminated at btrfs_control_ioctl
  Btrfs: fix rare chances for data loss when doing a fast fsync
  btrfs: Always try all copies when reading extent buffers

6 years agoMerge tag 'spi-fix-v4.20-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/brooni...
Linus Torvalds [Wed, 28 Nov 2018 16:33:55 +0000 (08:33 -0800)]
Merge tag 'spi-fix-v4.20-rc4' of git://git./linux/kernel/git/broonie/spi

Pull spi fixes from Mark Brown:
 "A few driver specific fixes here, nothing big or that stands out for
  anyone other than the driver users.

  The omap2-mcspi fix is for issues that started showing up with a
  change in defconfig in this release to make cpuidle get turned on by
  default"

* tag 'spi-fix-v4.20-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  spi: omap2-mcspi: Add missing suspend and resume calls
  spi: mediatek: use correct mata->xfer_len when in fifo transfer
  spi: uniphier: fix incorrect property items

6 years agoixgbe: recognize 1000BaseLX SFP modules as 1Gbps
Josh Elsasser [Sat, 24 Nov 2018 20:57:33 +0000 (12:57 -0800)]
ixgbe: recognize 1000BaseLX SFP modules as 1Gbps

Add the two 1000BaseLX enum values to the X550's check for 1Gbps modules,
allowing the core driver code to establish a link over this SFP type.

This is done by the out-of-tree driver but the fix wasn't in mainline.

Fixes: e23f33367882 ("ixgbe: Fix 1G and 10G link stability for X550EM_x SFP+”)
Fixes: 6a14ee0cfb19 ("ixgbe: Add X550 support function pointers")
Signed-off-by: Josh Elsasser <jelsasser@appneta.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Linus Torvalds [Wed, 28 Nov 2018 16:29:18 +0000 (08:29 -0800)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "Bugfixes, many of them reported by syzkaller and mostly predating the
  merge window"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  kvm: svm: Ensure an IBPB on all affected CPUs when freeing a vmcb
  kvm: mmu: Fix race in emulated page table writes
  KVM: nVMX: vmcs12 revision_id is always VMCS12_REVISION even when copied from eVMCS
  KVM: nVMX: Verify eVMCS revision id match supported eVMCS version on eVMCS VMPTRLD
  KVM: nVMX/nSVM: Fix bug which sets vcpu->arch.tsc_offset to L1 tsc_offset
  x86/kvm/vmx: fix old-style function declaration
  KVM: x86: fix empty-body warnings
  KVM: VMX: Update shared MSRs to be saved/restored on MSR_EFER.LMA changes
  KVM: x86: Fix kernel info-leak in KVM_HC_CLOCK_PAIRING hypercall
  KVM: nVMX: Fix kernel info-leak when enabling KVM_CAP_HYPERV_ENLIGHTENED_VMCS more than once
  svm: Add mutex_lock to protect apic_access_page_done on AMD systems
  KVM: X86: Fix scan ioapic use-before-initialization
  KVM: LAPIC: Fix pv ipis use-before-initialization
  KVM: VMX: re-add ple_gap module parameter
  KVM: PPC: Book3S HV: Fix handling for interrupted H_ENTER_NESTED

6 years agoi40e: Fix deletion of MAC filters
Lihong Yang [Wed, 21 Nov 2018 17:15:37 +0000 (09:15 -0800)]
i40e: Fix deletion of MAC filters

In __i40e_del_filter function, the flag __I40E_MACVLAN_SYNC_PENDING for
the PF state is wrongly set for the VSI. Deleting any of the MAC filters
has caused the incorrect syncing for the PF. Fix it by setting this state
flag to the intended PF.

CC: stable <stable@vger.kernel.org>
Signed-off-by: Lihong Yang <lihong.yang@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years agoigb: fix uninitialized variables
Yunjian Wang [Tue, 6 Nov 2018 08:27:12 +0000 (16:27 +0800)]
igb: fix uninitialized variables

This patch fixes the variable 'phy_word' may be used uninitialized.

Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years agonetfilter: nf_tables: deactivate expressions in rule replecement routine
Taehee Yoo [Wed, 28 Nov 2018 02:27:28 +0000 (11:27 +0900)]
netfilter: nf_tables: deactivate expressions in rule replecement routine

There is no expression deactivation call from the rule replacement path,
hence, chain counter is not decremented. A few steps to reproduce the
problem:

   %nft add table ip filter
   %nft add chain ip filter c1
   %nft add chain ip filter c1
   %nft add rule ip filter c1 jump c2
   %nft replace rule ip filter c1 handle 3 accept
   %nft flush ruleset

<jump c2> expression means immediate NFT_JUMP to chain c2.
Reference count of chain c2 is increased when the rule is added.

When rule is deleted or replaced, the reference counter of c2 should be
decreased via nft_rule_expr_deactivate() which calls
nft_immediate_deactivate().

Splat looks like:
[  214.396453] WARNING: CPU: 1 PID: 21 at net/netfilter/nf_tables_api.c:1432 nf_tables_chain_destroy.isra.38+0x2f9/0x3a0 [nf_tables]
[  214.398983] Modules linked in: nf_tables nfnetlink
[  214.398983] CPU: 1 PID: 21 Comm: kworker/1:1 Not tainted 4.20.0-rc2+ #44
[  214.398983] Workqueue: events nf_tables_trans_destroy_work [nf_tables]
[  214.398983] RIP: 0010:nf_tables_chain_destroy.isra.38+0x2f9/0x3a0 [nf_tables]
[  214.398983] Code: 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 8e 00 00 00 48 8b 7b 58 e8 e1 2c 4e c6 48 89 df e8 d9 2c 4e c6 eb 9a <0f> 0b eb 96 0f 0b e9 7e fe ff ff e8 a7 7e 4e c6 e9 a4 fe ff ff e8
[  214.398983] RSP: 0018:ffff8881152874e8 EFLAGS: 00010202
[  214.398983] RAX: 0000000000000001 RBX: ffff88810ef9fc28 RCX: ffff8881152876f0
[  214.398983] RDX: dffffc0000000000 RSI: 1ffff11022a50ede RDI: ffff88810ef9fc78
[  214.398983] RBP: 1ffff11022a50e9d R08: 0000000080000000 R09: 0000000000000000
[  214.398983] R10: 0000000000000000 R11: 0000000000000000 R12: 1ffff11022a50eba
[  214.398983] R13: ffff888114446e08 R14: ffff8881152876f0 R15: ffffed1022a50ed6
[  214.398983] FS:  0000000000000000(0000) GS:ffff888116400000(0000) knlGS:0000000000000000
[  214.398983] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  214.398983] CR2: 00007fab9bb5f868 CR3: 000000012aa16000 CR4: 00000000001006e0
[  214.398983] Call Trace:
[  214.398983]  ? nf_tables_table_destroy.isra.37+0x100/0x100 [nf_tables]
[  214.398983]  ? __kasan_slab_free+0x145/0x180
[  214.398983]  ? nf_tables_trans_destroy_work+0x439/0x830 [nf_tables]
[  214.398983]  ? kfree+0xdb/0x280
[  214.398983]  nf_tables_trans_destroy_work+0x5f5/0x830 [nf_tables]
[ ... ]

Fixes: bb7b40aecbf7 ("netfilter: nf_tables: bogus EBUSY in chain deletions")
Reported by: Christoph Anton Mitterer <calestyo@scientia.net>
Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=914505
Link: https://bugzilla.kernel.org/show_bug.cgi?id=201791
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
6 years agonet/ipv4: Fix missing raw_init when CONFIG_PROC_FS is disabled
David Ahern [Wed, 28 Nov 2018 04:52:24 +0000 (20:52 -0800)]
net/ipv4: Fix missing raw_init when CONFIG_PROC_FS is disabled

Randy reported when CONFIG_PROC_FS is not enabled:
    ld: net/ipv4/af_inet.o: in function `inet_init':
    af_inet.c:(.init.text+0x42d): undefined reference to `raw_init'

Fix by moving the endif up to the end of the proc entries

Fixes: 6897445fb194c ("net: provide a sysctl raw_l3mdev_accept for raw socket lookup with VRFs")
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Mike Manning <mmanning@vyatta.att-mail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge branch 'bnx2x-Popoulate-firmware-versions-in-driver-info-query'
David S. Miller [Wed, 28 Nov 2018 00:41:19 +0000 (16:41 -0800)]
Merge branch 'bnx2x-Popoulate-firmware-versions-in-driver-info-query'

Sudarsana Reddy Kalluru says:

====================
bnx2x: Popoulate firmware versions in driver info query.

The patch series populates MBI and storm firware versions in the ethtool
driver info query.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agobnx2x: Add storm FW version to ethtool driver query output.
Sudarsana Reddy Kalluru [Tue, 27 Nov 2018 06:25:57 +0000 (22:25 -0800)]
bnx2x: Add storm FW version to ethtool driver query output.

The patch populates the Storm FW version in the ethtool driver query data.

Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agobnx2x: Add MBI version to ethtool driver query output.
Sudarsana Reddy Kalluru [Tue, 27 Nov 2018 06:25:56 +0000 (22:25 -0800)]
bnx2x: Add MBI version to ethtool driver query output.

The patch populates the MBI version in the ethtool driver query data.
Adding 'extended_dev_info_shared_cfg' structure describing the nvram
structure, this is required to access the mbi version string.

Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agotcp: remove hdrlen argument from tcp_queue_rcv()
Eric Dumazet [Mon, 26 Nov 2018 22:49:12 +0000 (14:49 -0800)]
tcp: remove hdrlen argument from tcp_queue_rcv()

Only one caller needs to pull TCP headers, so lets
move __skb_pull() to the caller side.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet/ncsi: Add NCSI Mellanox OEM command
Vijay Khemka [Mon, 26 Nov 2018 21:49:04 +0000 (13:49 -0800)]
net/ncsi: Add NCSI Mellanox OEM command

This patch adds OEM Mellanox commands and response handling. It also
defines OEM Get MAC Address handler to get and configure the device.

ncsi_oem_gma_handler_mlx: This handler send NCSI mellanox command for
getting mac address.
ncsi_rsp_handler_oem_mlx: This handles response received for all
mellanox OEM commands.
ncsi_rsp_handler_oem_mlx_gma: This handles get mac address response and
set it to device.

Signed-off-by: Vijay Khemka <vijaykhemka@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agor8169: remove unneeded mmiowb barriers
Heiner Kallweit [Mon, 26 Nov 2018 19:24:16 +0000 (20:24 +0100)]
r8169: remove unneeded mmiowb barriers

writex() has implicit barriers, that's what makes it different from
writex_relaxed(). Therefore these calls to mmiowb() can be removed.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: hns3: Config NIC port speed same as that of optical module
Peng Li [Mon, 26 Nov 2018 18:43:00 +0000 (18:43 +0000)]
net: hns3: Config NIC port speed same as that of optical module

Port 0/1 of HiP08 supports 10G and 25G. This patch adds a
change to configure NIC port speed same as that of  optical
module(SFP/QFSP). Driver gets the optical module speed and
sets NIC port speed accordingly.

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agolan743x: Enable driver to work with LAN7431
Bryan Whitehead [Mon, 26 Nov 2018 17:27:10 +0000 (12:27 -0500)]
lan743x: Enable driver to work with LAN7431

This driver was designed to work with both LAN7430 and LAN7431.
The only difference between the two is the LAN7431 has support
for external phy.

This change adds LAN7431 to the list of recognized devices
supported by this driver.

Updates for v2:
    changed 'fixes' tag to match defined format

fixes: 23f0703c125b ("lan743x: Add main source files for new lan743x driver")
Signed-off-by: Bryan Whitehead <Bryan.Whitehead@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agotipc: fix lockdep warning during node delete
Jon Maloy [Mon, 26 Nov 2018 17:26:14 +0000 (12:26 -0500)]
tipc: fix lockdep warning during node delete

We see the following lockdep warning:

[ 2284.078521] ======================================================
[ 2284.078604] WARNING: possible circular locking dependency detected
[ 2284.078604] 4.19.0+ #42 Tainted: G            E
[ 2284.078604] ------------------------------------------------------
[ 2284.078604] rmmod/254 is trying to acquire lock:
[ 2284.078604] 00000000acd94e28 ((&n->timer)#2){+.-.}, at: del_timer_sync+0x5/0xa0
[ 2284.078604]
[ 2284.078604] but task is already holding lock:
[ 2284.078604] 00000000f997afc0 (&(&tn->node_list_lock)->rlock){+.-.}, at: tipc_node_stop+0xac/0x190 [tipc]
[ 2284.078604]
[ 2284.078604] which lock already depends on the new lock.
[ 2284.078604]
[ 2284.078604]
[ 2284.078604] the existing dependency chain (in reverse order) is:
[ 2284.078604]
[ 2284.078604] -> #1 (&(&tn->node_list_lock)->rlock){+.-.}:
[ 2284.078604]        tipc_node_timeout+0x20a/0x330 [tipc]
[ 2284.078604]        call_timer_fn+0xa1/0x280
[ 2284.078604]        run_timer_softirq+0x1f2/0x4d0
[ 2284.078604]        __do_softirq+0xfc/0x413
[ 2284.078604]        irq_exit+0xb5/0xc0
[ 2284.078604]        smp_apic_timer_interrupt+0xac/0x210
[ 2284.078604]        apic_timer_interrupt+0xf/0x20
[ 2284.078604]        default_idle+0x1c/0x140
[ 2284.078604]        do_idle+0x1bc/0x280
[ 2284.078604]        cpu_startup_entry+0x19/0x20
[ 2284.078604]        start_secondary+0x187/0x1c0
[ 2284.078604]        secondary_startup_64+0xa4/0xb0
[ 2284.078604]
[ 2284.078604] -> #0 ((&n->timer)#2){+.-.}:
[ 2284.078604]        del_timer_sync+0x34/0xa0
[ 2284.078604]        tipc_node_delete+0x1a/0x40 [tipc]
[ 2284.078604]        tipc_node_stop+0xcb/0x190 [tipc]
[ 2284.078604]        tipc_net_stop+0x154/0x170 [tipc]
[ 2284.078604]        tipc_exit_net+0x16/0x30 [tipc]
[ 2284.078604]        ops_exit_list.isra.8+0x36/0x70
[ 2284.078604]        unregister_pernet_operations+0x87/0xd0
[ 2284.078604]        unregister_pernet_subsys+0x1d/0x30
[ 2284.078604]        tipc_exit+0x11/0x6f2 [tipc]
[ 2284.078604]        __x64_sys_delete_module+0x1df/0x240
[ 2284.078604]        do_syscall_64+0x66/0x460
[ 2284.078604]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 2284.078604]
[ 2284.078604] other info that might help us debug this:
[ 2284.078604]
[ 2284.078604]  Possible unsafe locking scenario:
[ 2284.078604]
[ 2284.078604]        CPU0                    CPU1
[ 2284.078604]        ----                    ----
[ 2284.078604]   lock(&(&tn->node_list_lock)->rlock);
[ 2284.078604]                                lock((&n->timer)#2);
[ 2284.078604]                                lock(&(&tn->node_list_lock)->rlock);
[ 2284.078604]   lock((&n->timer)#2);
[ 2284.078604]
[ 2284.078604]  *** DEADLOCK ***
[ 2284.078604]
[ 2284.078604] 3 locks held by rmmod/254:
[ 2284.078604]  #0: 000000003368be9b (pernet_ops_rwsem){+.+.}, at: unregister_pernet_subsys+0x15/0x30
[ 2284.078604]  #1: 0000000046ed9c86 (rtnl_mutex){+.+.}, at: tipc_net_stop+0x144/0x170 [tipc]
[ 2284.078604]  #2: 00000000f997afc0 (&(&tn->node_list_lock)->rlock){+.-.}, at: tipc_node_stop+0xac/0x19
[...}

The reason is that the node timer handler sometimes needs to delete a
node which has been disconnected for too long. To do this, it grabs
the lock 'node_list_lock', which may at the same time be held by the
generic node cleanup function, tipc_node_stop(), during module removal.
Since the latter is calling del_timer_sync() inside the same lock, we
have a potential deadlock.

We fix this letting the timer cleanup function use spin_trylock()
instead of just spin_lock(), and when it fails to grab the lock it
just returns so that the timer handler can terminate its execution.
This is safe to do, since tipc_node_stop() anyway is about to
delete both the timer and the node instance.

Fixes: 6a939f365bdb ("tipc: Auto removal of peer down node instance")
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agolan743x: fix return value for lan743x_tx_napi_poll
Bryan Whitehead [Mon, 26 Nov 2018 17:04:57 +0000 (12:04 -0500)]
lan743x: fix return value for lan743x_tx_napi_poll

The lan743x driver, when under heavy traffic load, has been noticed
to sometimes hang, or cause a kernel panic.

Debugging reveals that the TX napi poll routine was returning
the wrong value, 'weight'. Most other drivers return 0.
And call napi_complete, instead of napi_complete_done.

Additionally when creating the tx napi poll routine.
Changed netif_napi_add, to netif_tx_napi_add.

Updates for v3:
    changed 'fixes' tag to match defined format

Updates for v2:
use napi_complete, instead of napi_complete_done in
    lan743x_tx_napi_poll
use netif_tx_napi_add, instead of netif_napi_add for
    registration of tx napi poll routine

fixes: 23f0703c125b ("lan743x: Add main source files for new lan743x driver")
Signed-off-by: Bryan Whitehead <Bryan.Whitehead@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: via: via-velocity: fix spelling mistake "alignement" -> "alignment"
Colin Ian King [Mon, 26 Nov 2018 15:34:01 +0000 (15:34 +0000)]
net: via: via-velocity: fix spelling mistake "alignement" -> "alignment"

The text in array velocity_gstrings contains a spelling mistake,
rename rx_frame_alignement_errors to rx_frame_alignment_errors.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoqed: fix spelling mistake "attnetion" -> "attention"
Colin Ian King [Mon, 26 Nov 2018 15:53:54 +0000 (15:53 +0000)]
qed: fix spelling mistake "attnetion" -> "attention"

The text in array s_igu_fifo_error_strs contains a spelling mistake,
fix it.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge branch 'net-nsid-interpretation'
David S. Miller [Wed, 28 Nov 2018 00:20:20 +0000 (16:20 -0800)]
Merge branch 'net-nsid-interpretation'

Nicolas Dichtel says:

====================
Ease to interpret net-nsid

The goal of this series is to ease the interpretation of nsid received in
netlink messages from other netns (when the user uses
NETLINK_F_LISTEN_ALL_NSID).

After this series, with a patched iproute2:

$ ip netns add foo
$ ip netns add bar
$ touch /var/run/netns/init_net
$ mount --bind /proc/1/ns/net /var/run/netns/init_net
$ ip netns set init_net 11
$ ip netns set foo 12
$ ip netns set bar 13
$ ip netns
init_net (id: 11)
bar (id: 13)
foo (id: 12)
$ ip -n foo netns set init_net 21
$ ip -n foo netns set foo 22
$ ip -n foo netns set bar 23
$ ip -n foo netns
init_net (id: 21)
bar (id: 23)
foo (id: 22)
$ ip -n bar netns set init_net 31
$ ip -n bar netns set foo 32
$ ip -n bar netns set bar 33
$ ip -n bar netns
init_net (id: 31)
bar (id: 33)
foo (id: 32)
$ ip netns list-id target-nsid 12
nsid 21 current-nsid 11 (iproute2 netns name: init_net)
nsid 22 current-nsid 12 (iproute2 netns name: foo)
nsid 23 current-nsid 13 (iproute2 netns name: bar)
$ ip -n bar netns list-id target-nsid 32 nsid 31
nsid 21 current-nsid 31 (iproute2 netns name: init_net)

v3 -> v4:
  - patch 5/5: fix imbalance lock in error path

v2 -> v3:
  - patch 5/5: account NETNSA_CURRENT_NSID in rtnl_net_get_size()

v1 -> v2:
  - patch 1/5: remove net from struct rtnl_net_dump_cb
  - patch 2/5: new in this version
  - patch 3/5: use a bool to know if rtnl_get_net_ns_capable() was called
  - patch 5/5: use struct net_fill_args
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonetns: enable to dump full nsid translation table
Nicolas Dichtel [Mon, 26 Nov 2018 14:42:06 +0000 (15:42 +0100)]
netns: enable to dump full nsid translation table

Like the previous patch, the goal is to ease to convert nsids from one
netns to another netns.
A new attribute (NETNSA_CURRENT_NSID) is added to the kernel answer when
NETNSA_TARGET_NSID is provided, thus the user can easily convert nsids.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonetns: enable to specify a nsid for a get request
Nicolas Dichtel [Mon, 26 Nov 2018 14:42:05 +0000 (15:42 +0100)]
netns: enable to specify a nsid for a get request

Combined with NETNSA_TARGET_NSID, it enables to "translate" a nsid from one
netns to a nsid of another netns.
This is useful when using NETLINK_F_LISTEN_ALL_NSID because it helps the
user to interpret a nsid received from an other netns.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonetns: add support of NETNSA_TARGET_NSID
Nicolas Dichtel [Mon, 26 Nov 2018 14:42:04 +0000 (15:42 +0100)]
netns: add support of NETNSA_TARGET_NSID

Like it was done for link and address, add the ability to perform get/dump
in another netns by specifying a target nsid attribute.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonetns: introduce 'struct net_fill_args'
Nicolas Dichtel [Mon, 26 Nov 2018 14:42:03 +0000 (15:42 +0100)]
netns: introduce 'struct net_fill_args'

This is a preparatory work. To avoid having to much arguments for the
function rtnl_net_fill(), a new structure is defined.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonetns: remove net arg from rtnl_net_fill()
Nicolas Dichtel [Mon, 26 Nov 2018 14:42:02 +0000 (15:42 +0100)]
netns: remove net arg from rtnl_net_fill()

This argument is not used anymore.

Fixes: cab3c8ec8d57 ("netns: always provide the id to rtnl_net_fill()")
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: thunderx: fix NULL pointer dereference in nic_remove
Lorenzo Bianconi [Mon, 26 Nov 2018 14:07:16 +0000 (15:07 +0100)]
net: thunderx: fix NULL pointer dereference in nic_remove

Fix a possible NULL pointer dereference in nic_remove routine
removing the nicpf module if nic_probe fails.
The issue can be triggered with the following reproducer:

$rmmod nicvf
$rmmod nicpf

[  521.412008] Unable to handle kernel access to user memory outside uaccess routines at virtual address 0000000000000014
[  521.422777] Mem abort info:
[  521.425561]   ESR = 0x96000004
[  521.428624]   Exception class = DABT (current EL), IL = 32 bits
[  521.434535]   SET = 0, FnV = 0
[  521.437579]   EA = 0, S1PTW = 0
[  521.440730] Data abort info:
[  521.443603]   ISV = 0, ISS = 0x00000004
[  521.447431]   CM = 0, WnR = 0
[  521.450417] user pgtable: 4k pages, 48-bit VAs, pgdp = 0000000072a3da42
[  521.457022] [0000000000000014] pgd=0000000000000000
[  521.461916] Internal error: Oops: 96000004 [#1] SMP
[  521.511801] Hardware name: GIGABYTE H270-T70/MT70-HD0, BIOS T49 02/02/2018
[  521.518664] pstate: 80400005 (Nzcv daif +PAN -UAO)
[  521.523451] pc : nic_remove+0x24/0x88 [nicpf]
[  521.527808] lr : pci_device_remove+0x48/0xd8
[  521.532066] sp : ffff000013433cc0
[  521.535370] x29: ffff000013433cc0 x28: ffff810f6ac50000
[  521.540672] x27: 0000000000000000 x26: 0000000000000000
[  521.545974] x25: 0000000056000000 x24: 0000000000000015
[  521.551274] x23: ffff8007ff89a110 x22: ffff000001667070
[  521.556576] x21: ffff8007ffb170b0 x20: ffff8007ffb17000
[  521.561877] x19: 0000000000000000 x18: 0000000000000025
[  521.567178] x17: 0000000000000000 x16: 000000000000010ffc33ff98 x8 : 0000000000000000
[  521.593683] x7 : 0000000000000000 x6 : 0000000000000001
[  521.598983] x5 : 0000000000000002 x4 : 0000000000000003
[  521.604284] x3 : ffff8007ffb17184 x2 : ffff8007ffb17184
[  521.609585] x1 : ffff000001662118 x0 : ffff000008557be0
[  521.614887] Process rmmod (pid: 1897, stack limit = 0x00000000859535c3)
[  521.621490] Call trace:
[  521.623928]  nic_remove+0x24/0x88 [nicpf]
[  521.627927]  pci_device_remove+0x48/0xd8
[  521.631847]  device_release_driver_internal+0x1b0/0x248
[  521.637062]  driver_detach+0x50/0xc0
[  521.640628]  bus_remove_driver+0x60/0x100
[  521.644627]  driver_unregister+0x34/0x60
[  521.648538]  pci_unregister_driver+0x24/0xd8
[  521.652798]  nic_cleanup_module+0x14/0x111c [nicpf]
[  521.657672]  __arm64_sys_delete_module+0x150/0x218
[  521.662460]  el0_svc_handler+0x94/0x110
[  521.666287]  el0_svc+0x8/0xc
[  521.669160] Code: aa1e03e0 9102c295 d503201f f9404eb3 (b9401660)

Fixes: 4863dea3fab0 ("net: Adding support for Cavium ThunderX network controller")
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge branch 'qed-enhancements-series'
David S. Miller [Wed, 28 Nov 2018 00:17:20 +0000 (16:17 -0800)]
Merge branch 'qed-enhancements-series'

Sudarsana Reddy Kalluru says:

====================
qed* enhancements series

The patch series add few enhancements to qed/qede drivers.

Changes from previous versions:
-------------------------------
v3: Revert v2 changes as the other paths (i.e. ptp) access the same data in
    atomic context.
v2: Use __set_bit()/__clear_bit() where data access doesn't need to be
    atomic.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoqed: Add support for MBI upgrade over MFW.
Sudarsana Reddy Kalluru [Mon, 26 Nov 2018 10:27:00 +0000 (02:27 -0800)]
qed: Add support for MBI upgrade over MFW.

The patch adds driver support for MBI image update through MFW.

Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoqede: Update link status only when interface is ready.
Sudarsana Reddy Kalluru [Mon, 26 Nov 2018 10:26:59 +0000 (02:26 -0800)]
qede: Update link status only when interface is ready.

In the case of internal reload (e.g., mtu change), there could be a race
between link-up notification from mfw and the driver unload processing. In
such case kernel assumes the link is up and starts using the queues which
leads to the server crash.

Send link notification to the kernel only when driver has already requested
MFW for the link.

Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoqede: Simplify the usage of qede-flags.
Sudarsana Reddy Kalluru [Mon, 26 Nov 2018 10:26:58 +0000 (02:26 -0800)]
qede: Simplify the usage of qede-flags.

The values represented by qede->flags is being used in mixed ways:
  1. As 'value' at some places e.g., QEDE_FLAGS_IS_VF usage
  2. As bit-mask(value) at some places e.g., QEDE_FLAGS_PTP_TX_IN_PRORGESS
     usage.
This implementation pose problems in future when we want to add more flag
values e.g., overlap of the values, overflow of 64-bit storage.

Updated the implementation to go with approach (2) for qede->flags.

Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoqed: Display port_id in the UFP debug messages.
Sudarsana Reddy Kalluru [Mon, 26 Nov 2018 10:26:57 +0000 (02:26 -0800)]
qed: Display port_id in the UFP debug messages.

MFW sends UFP notifications mostly during the device init phase and PFs
might not be assigned with a name by this time. Hence capturing port-id in
the debug messages would help in finding which PF the ufp notification was
sent to.

Also, fixed a minor scemantic issue in a debug print.

Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge branch 'aquantia-usb'
David S. Miller [Tue, 27 Nov 2018 23:46:08 +0000 (15:46 -0800)]
Merge branch 'aquantia-usb'

Igor Russkikh says:

====================
Add support for Aquantia AQtion USB to 5/2.5GbE devices

This patchset introduces support for new multigig ethernet to USB dongle,
developed jointly by Aquantia (Phy) and ASIX (USB MAC).

The driver has similar structure with other ASIX MAC drivers (AX88179), but
with a number of important differences:
- Driver supports both direct Phy and custom firmware interface for Phy
  programming. This is due to different firmware modules available with
  this product.
- Driver handles new 2.5G/5G link speed configuration and reporting.
- Device support all speeds from 100M up to 5G.
- Device supports MTU up to 16K.

Device supports various standard networking features, like
checksum offloads, vlan tagging/filtering, TSO.

The code of this driver is based on original ASIX sources and was extended
by Aquantia for 5G multigig support.

Patchset v2 includes following changes:
- Function variables declarions fixed to reverse xmass tree
- Improve patch layout structure
- Remove unnecessary curly braces in switch/case statements
- Use 'packed' attribute for HW structures only
- Use eth_mac_addr function in set_mac_addr callback
- Remove unnecessary 'memset' calls.
- Read MAC address from EEPROM function has now better name
- Use driver_priv field to store context. It avoids ugly cast.
- Set max_mtu field. Remove check for MTU size
- Rewrite read/write functions. Add helpers for read/write 16/32 bit values
- Use mask and shifts instead of bitfields to support BE platforms.
- Use stack allocated buffer for configuring mcast filters
- Use AUTONEG_ENABLE when go to suspend state
- Pad out wol_cfg field from context structure. Use stack allocated instead
- Remove driver version
- Check field 'duplex' in set_link_ksetting callback as well
- Use already created defines in usb matching macro
- Rename phy_ops struct to phy_cfg
- Use ether_addr_copy for copying mac address
- Add fall-through comment in switch/case for avoid checkpatch warning
- Remove match for CDC ether device
- Add ASIX's HW id-s to match this driver
- Add all HW id-s with which driver can work to blacklist of cdc_ether driver

Patchset v3 includes following changes:
- Use linkmode_copy instead of bitmap_copy
- Remove Direct PHY access code since production HW will not have this
    mode anymore
- Fix line over 80 symbols and alignments in cdc_ether patch
- Add match for ECM configuration
    On start our HW reports both ECM and vendor configs.
    Linux prefers to use ECM driver and chooses active configuration
    which is for ecm driver (not for vendor specific).
    We need to match this configuration and forcibly switch configuration
    to vendor specific.

Patchset v4 includes following changes:
- Set gso_max_size.
- Optimize accessing to descriptors
- Use SKB_TRUESIZE macro.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: usb: aqc111: Extend cdc_ether blacklist
Dmitry Bezrukov [Mon, 26 Nov 2018 09:33:50 +0000 (09:33 +0000)]
net: usb: aqc111: Extend cdc_ether blacklist

Added Aquantia and ASIX device IDs to prevent loading cdc_ether for
these devices. Our firmware reports CDC configuration simultaneously
with vendor specific.

Signed-off-by: Dmitry Bezrukov <dmitry.bezrukov@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: usb: aqc111: Add ASIX's HW ids
Dmitry Bezrukov [Mon, 26 Nov 2018 09:33:47 +0000 (09:33 +0000)]
net: usb: aqc111: Add ASIX's HW ids

It enables driver for ASIX products which are also based on
aqc111/112U chips.

Signed-off-by: Dmitry Bezrukov <dmitry.bezrukov@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: usb: aqc111: Add support for wake on LAN by MAGIC packet
Dmitry Bezrukov [Mon, 26 Nov 2018 09:33:45 +0000 (09:33 +0000)]
net: usb: aqc111: Add support for wake on LAN by MAGIC packet

Signed-off-by: Dmitry Bezrukov <dmitry.bezrukov@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: usb: aqc111: Implement get/set_link_ksettings callbacks
Dmitry Bezrukov [Mon, 26 Nov 2018 09:33:42 +0000 (09:33 +0000)]
net: usb: aqc111: Implement get/set_link_ksettings callbacks

Signed-off-by: Dmitry Bezrukov <dmitry.bezrukov@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: usb: aqc111: Initialize ethtool_ops structure
Dmitry Bezrukov [Mon, 26 Nov 2018 09:33:38 +0000 (09:33 +0000)]
net: usb: aqc111: Initialize ethtool_ops structure

Implement get_drvinfo, set/get_msglevel, get_link callbacks

Signed-off-by: Dmitry Bezrukov <dmitry.bezrukov@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: usb: aqc111: Add RX VLAN filtering support
Dmitry Bezrukov [Mon, 26 Nov 2018 09:33:35 +0000 (09:33 +0000)]
net: usb: aqc111: Add RX VLAN filtering support

Signed-off-by: Dmitry Bezrukov <dmitry.bezrukov@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: usb: aqc111: Add support for VLAN_CTAG_TX/RX offload
Dmitry Bezrukov [Mon, 26 Nov 2018 09:33:33 +0000 (09:33 +0000)]
net: usb: aqc111: Add support for VLAN_CTAG_TX/RX offload

Signed-off-by: Dmitry Bezrukov <dmitry.bezrukov@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: usb: aqc111: Implement set_rx_mode callback
Dmitry Bezrukov [Mon, 26 Nov 2018 09:33:31 +0000 (09:33 +0000)]
net: usb: aqc111: Implement set_rx_mode callback

Signed-off-by: Dmitry Bezrukov <dmitry.bezrukov@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: usb: aqc111: Add support for TSO
Dmitry Bezrukov [Mon, 26 Nov 2018 09:33:28 +0000 (09:33 +0000)]
net: usb: aqc111: Add support for TSO

Signed-off-by: Dmitry Bezrukov <dmitry.bezrukov@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: usb: aqc111: Add support for enable/disable checksum offload
Dmitry Bezrukov [Mon, 26 Nov 2018 09:33:26 +0000 (09:33 +0000)]
net: usb: aqc111: Add support for enable/disable checksum offload

Signed-off-by: Dmitry Bezrukov <dmitry.bezrukov@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: usb: aqc111: Add support for changing MTU
Dmitry Bezrukov [Mon, 26 Nov 2018 09:33:23 +0000 (09:33 +0000)]
net: usb: aqc111: Add support for changing MTU

Signed-off-by: Dmitry Bezrukov <dmitry.bezrukov@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: usb: aqc111: Add checksum offload support
Dmitry Bezrukov [Mon, 26 Nov 2018 09:33:21 +0000 (09:33 +0000)]
net: usb: aqc111: Add checksum offload support

Signed-off-by: Dmitry Bezrukov <dmitry.bezrukov@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: usb: aqc111: Implement RX data path
Dmitry Bezrukov [Mon, 26 Nov 2018 09:33:19 +0000 (09:33 +0000)]
net: usb: aqc111: Implement RX data path

Signed-off-by: Dmitry Bezrukov <dmitry.bezrukov@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: usb: aqc111: Implement TX data path
Dmitry Bezrukov [Mon, 26 Nov 2018 09:33:16 +0000 (09:33 +0000)]
net: usb: aqc111: Implement TX data path

Signed-off-by: Dmitry Bezrukov <dmitry.bezrukov@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: usb: aqc111: Add support for getting and setting of MAC address
Dmitry Bezrukov [Mon, 26 Nov 2018 09:33:14 +0000 (09:33 +0000)]
net: usb: aqc111: Add support for getting and setting of MAC address

Signed-off-by: Dmitry Bezrukov <dmitry.bezrukov@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: usb: aqc111: Introduce link management
Dmitry Bezrukov [Mon, 26 Nov 2018 09:33:12 +0000 (09:33 +0000)]
net: usb: aqc111: Introduce link management

Add full hardware initialization sequence and link configuration logic

Signed-off-by: Dmitry Bezrukov <dmitry.bezrukov@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: usb: aqc111: Introduce PHY access
Dmitry Bezrukov [Mon, 26 Nov 2018 09:33:09 +0000 (09:33 +0000)]
net: usb: aqc111: Introduce PHY access

Add helpers to write 32bit values.
Implement PHY power up/down sequences.
AQC111, PHY is being controlled via vendor command interface.

Signed-off-by: Dmitry Bezrukov <dmitry.bezrukov@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: usb: aqc111: Various callbacks implementation
Dmitry Bezrukov [Mon, 26 Nov 2018 09:33:07 +0000 (09:33 +0000)]
net: usb: aqc111: Various callbacks implementation

Reset, stop callbacks, driver unbind callback.
More register defines required for these callbacks.
Add helpers to read/write 16bit values

Signed-off-by: Dmitry Bezrukov <dmitry.bezrukov@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: usb: aqc111: Add implementation of read and write commands
Dmitry Bezrukov [Mon, 26 Nov 2018 09:33:04 +0000 (09:33 +0000)]
net: usb: aqc111: Add implementation of read and write commands

Read/write command register defines and functions

Signed-off-by: Dmitry Bezrukov <dmitry.bezrukov@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: usb: aqc111: Add bind and empty unbind callbacks
Dmitry Bezrukov [Mon, 26 Nov 2018 09:33:02 +0000 (09:33 +0000)]
net: usb: aqc111: Add bind and empty unbind callbacks

Initialize net_device_ops structure

Signed-off-by: Dmitry Bezrukov <dmitry.bezrukov@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: usb: aqc111: Driver skeleton for Aquantia AQtion USB to 5GbE
Dmitry Bezrukov [Mon, 26 Nov 2018 09:32:59 +0000 (09:32 +0000)]
net: usb: aqc111: Driver skeleton for Aquantia AQtion USB to 5GbE

Initialize usb_driver structure skeleton

Signed-off-by: Dmitry Bezrukov <dmitry.bezrukov@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agosctp: increase sk_wmem_alloc when head->truesize is increased
Xin Long [Mon, 26 Nov 2018 06:52:44 +0000 (14:52 +0800)]
sctp: increase sk_wmem_alloc when head->truesize is increased

I changed to count sk_wmem_alloc by skb truesize instead of 1 to
fix the sk_wmem_alloc leak caused by later truesize's change in
xfrm in Commit 02968ccf0125 ("sctp: count sk_wmem_alloc by skb
truesize in sctp_packet_transmit").

But I should have also increased sk_wmem_alloc when head->truesize
is increased in sctp_packet_gso_append() as xfrm does. Otherwise,
sctp gso packet will cause sk_wmem_alloc underflow.

Fixes: 02968ccf0125 ("sctp: count sk_wmem_alloc by skb truesize in sctp_packet_transmit")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoadd documents for snmp counters
yupeng [Mon, 26 Nov 2018 07:35:46 +0000 (23:35 -0800)]
add documents for snmp counters

Add explaination of below counters:
TcpExtTCPRcvCoalesce
TcpExtTCPAutoCorking
TcpExtTCPOrigDataSent
TCPSynRetrans
TCPFastOpenActiveFail
TcpExtListenOverflows
TcpExtListenDrops
TcpExtTCPHystartTrainDetect
TcpExtTCPHystartTrainCwnd
TcpExtTCPHystartDelayDetect
TcpExtTCPHystartDelayCwnd

Signed-off-by: yupeng <yupeng0921@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agofirestream: fix spelling mistake: "Inititing" -> "Initializing"
Colin Ian King [Sun, 25 Nov 2018 23:21:08 +0000 (23:21 +0000)]
firestream: fix spelling mistake: "Inititing" -> "Initializing"

There are spelling mistakes in debug messages, fix them.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge branch 'mlxsw-Prepare-for-VLAN-aware-bridge-w-VxLAN'
David S. Miller [Tue, 27 Nov 2018 23:27:08 +0000 (15:27 -0800)]
Merge branch 'mlxsw-Prepare-for-VLAN-aware-bridge-w-VxLAN'

Ido Schimmel says:

====================
mlxsw: Prepare for VLAN-aware bridge w/VxLAN

The driver is using 802.1Q filtering identifiers (FIDs) to represent the
different VLANs in the VLAN-aware bridge (only one is supported).

However, the device cannot assign a VNI to such FIDs, which prevents the
driver from supporting the enslavement of VxLAN devices to the
VLAN-aware bridge.

This patchset works around this limitation by emulating 802.1Q FIDs
using 802.1D FIDs, which can be assigned a VNI and so far have only been
used in conjunction with VLAN-unaware bridges.

The downside of this approach is that multiple {Port,VID}->FID entries
are required, whereas a single VID->FID entry is required with "true"
802.1Q FIDs.

First four patches introduce the new FID family of emulated 802.1Q FIDs
and the associated type of router interfaces (RIFs). Last patch flips
the driver to use this new FID family.

The diff is relatively small because the internal implementation of each
FID family is contained and hidden in spectrum_fid.c. Different internal
users (e.g., bridge, router) are aware of the different FID types, but
do not care about their internal implementation. This makes it trivial
to swap the current implementation of 802.1Q FIDs with the new one,
using 802.1D FIDs.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agomlxsw: spectrum: Flip driver to use emulated 802.1Q FIDs
Ido Schimmel [Sun, 25 Nov 2018 09:43:59 +0000 (09:43 +0000)]
mlxsw: spectrum: Flip driver to use emulated 802.1Q FIDs

Replace 802.1Q FIDs and VLAN RIFs with their emulated counterparts.

The emulated 802.1Q FIDs are actually 802.1D FIDs and thus use the same
flood tables, of per-FID type. Therefore, add 4K-1 entries to the
per-FID flood tables for the new FIDs and get rid of the FID-offset
flood tables that were used by the old 802.1Q FIDs.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agomlxsw: spectrum_router: Introduce emulated VLAN RIFs
Ido Schimmel [Sun, 25 Nov 2018 09:43:58 +0000 (09:43 +0000)]
mlxsw: spectrum_router: Introduce emulated VLAN RIFs

Router interfaces (RIFs) constructed on top of VLAN-aware bridges are of
"VLAN" type, whereas RIFs constructed on top of VLAN-unaware bridges of
"FID" type.

In other words, the RIF type is derived from the underlying FID type.
VLAN RIFs are used on top of 802.1Q FIDs, whereas FID RIFs are used on
top of 802.1D FIDs.

Since the previous patch emulated 802.1Q FIDs using 802.1D FIDs, this
patch emulates VLAN RIFs using FID RIFs.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agomlxsw: spectrum_fid: Introduce emulated 802.1Q FIDs
Ido Schimmel [Sun, 25 Nov 2018 09:43:57 +0000 (09:43 +0000)]
mlxsw: spectrum_fid: Introduce emulated 802.1Q FIDs

The driver uses 802.1Q FIDs when offloading a VLAN-aware bridge.
Unfortunately, it is not possible to assign a VNI to such FIDs, which
prompts the driver to forbid the enslavement of VxLAN devices to a
VLAN-aware bridge.

Workaround this hardware limitation by creating a new family of FIDs,
emulated 802.1Q FIDs. These FIDs are emulated using 802.1D FIDs, which
can be assigned a VNI.

The downside of this approach is that multiple {Port, VID}->FID entries
are required, whereas only a single VID->FID is required with "true"
802.1Q FIDs.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agomlxsw: spectrum_fid: Make flood index calculation more robust
Ido Schimmel [Sun, 25 Nov 2018 09:43:55 +0000 (09:43 +0000)]
mlxsw: spectrum_fid: Make flood index calculation more robust

802.1D FIDs use a per-FID flood table, where the flood index into the
table is calculated by subtracting 4K from the FID's index.

Currently, 802.1D FIDs start at 4K, so the calculation is correct, but
if it was ever to change, the calculation will no longer be correct.

In addition, this change will allow us to reuse the flood index
calculation function in the next patch, where we are going to emulate
802.1Q FIDs using 802.1D FIDs.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agomlxsw: spectrum_switchdev: Do not set field when it is reserved
Ido Schimmel [Sun, 25 Nov 2018 09:43:54 +0000 (09:43 +0000)]
mlxsw: spectrum_switchdev: Do not set field when it is reserved

When configuring an FDB entry pointing to a LAG netdev (or its upper),
the driver should only set the 'lag_vid' field when the FID (filtering
identifier) is of 802.1D type.

Extend the 802.1D FID family with an attribute indicating whether this
field should be set and based on its value set the field or leave it
blank.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: aquantia: return 'err' if set MPI_DEINIT state fails
YueHaibing [Sat, 24 Nov 2018 10:16:41 +0000 (18:16 +0800)]
net: aquantia: return 'err' if set MPI_DEINIT state fails

Fixes gcc '-Wunused-but-set-variable' warning:

drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_utils.c:260:7:
 warning: variable 'err' set but not used [-Wunused-but-set-variable]

'err' should be returned while set MPI_DEINIT state fails
in hw_atl_utils_soft_reset.

Fixes: cce96d1883da ("net: aquantia: Regression on reset with 1.x firmware")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge branch 'bridge-bools'
David S. Miller [Tue, 27 Nov 2018 23:04:16 +0000 (15:04 -0800)]
Merge branch 'bridge-bools'

Nikolay Aleksandrov says:

====================
net: bridge: add an option to disabe linklocal learning

This set adds a new bridge option which can control learning from
link-local packets, by default learning is on to be consistent and avoid
breaking users expectations. If the new no_linklocal_learn option is
enabled then the bridge will stop learning from link-local packets.

In order to save space for future boolean options, patch 01 adds a new
bool option API that uses a bitmask to control boolean options. The
bridge is by far the largest netlink attr user and we keep adding simple
boolean options which waste nl attr ids and space. We're not directly
mapping these to the in-kernel bridge flags because some might require
more complex configuration changes (e.g. if we were to add the per port
vlan stats now, it'd require multiple checks before changing value).
Any new bool option needs to be handled by both br_boolopt_toggle and get
in order to be able to retrieve its state later. All such options are
automatically exported via netlink. The behaviour of setting such
options is consistent with netlink option handling when a missing
option is being set (silently ignored), e.g. when a newer iproute2 is used
on older kernel. All supported options are exported via bm's optmask
when dumping the new attribute.

v2: address Andrew Lunn's comments, squash a minor change into patch 01,
    export all supported options via optmask when dumping, add patch 03,
    pass down extack so options can return meaningful errors, add
    WARN_ON on unsupported options (should not happen)
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: bridge: export supported boolopts
Nikolay Aleksandrov [Sat, 24 Nov 2018 02:34:22 +0000 (04:34 +0200)]
net: bridge: export supported boolopts

Now that we have at least one bool option, we can export all of the
supported bool options via optmask when dumping them.

v2: new patch

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: bridge: add no_linklocal_learn bool option
Nikolay Aleksandrov [Sat, 24 Nov 2018 02:34:21 +0000 (04:34 +0200)]
net: bridge: add no_linklocal_learn bool option

Use the new boolopt API to add an option which disables learning from
link-local packets. The default is kept as before and learning is
enabled. This is a simple map from a boolopt bit to a bridge private
flag that is tested before learning.

v2: pass NULL for extack via sysfs

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: bridge: add support for user-controlled bool options
Nikolay Aleksandrov [Sat, 24 Nov 2018 02:34:20 +0000 (04:34 +0200)]
net: bridge: add support for user-controlled bool options

We have been adding many new bridge options, a big number of which are
boolean but still take up netlink attribute ids and waste space in the skb.
Recently we discussed learning from link-local packets[1] and decided
yet another new boolean option will be needed, thus introducing this API
to save some bridge nl space.
The API supports changing the value of multiple boolean options at once
via the br_boolopt_multi struct which has an optmask (which options to
set, bit per opt) and optval (options' new values). Future boolean
options will only be added to the br_boolopt_id enum and then will have
to be handled in br_boolopt_toggle/get. The API will automatically
add the ability to change and export them via netlink, sysfs can use the
single boolopt function versions to do the same. The behaviour with
failing/succeeding is the same as with normal netlink option changing.

If an option requires mapping to internal kernel flag or needs special
configuration to be enabled then it should be handled in
br_boolopt_toggle. It should also be able to retrieve an option's current
state via br_boolopt_get.

v2: WARN_ON() on unsupported option as that shouldn't be possible and
    also will help catch people who add new options without handling
    them for both set and get. Pass down extack so if an option desires
    it could set it on error and be more user-friendly.

[1] https://www.spinics.net/lists/netdev/msg532698.html

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: phy: add workaround for issue where PHY driver doesn't bind to the device
Heiner Kallweit [Fri, 23 Nov 2018 18:41:29 +0000 (19:41 +0100)]
net: phy: add workaround for issue where PHY driver doesn't bind to the device

After switching the r8169 driver to use phylib some user reported that
their network is broken. This was caused by the genphy PHY driver being
used instead of the dedicated PHY driver for the RTL8211B. Users
reported that loading the Realtek PHY driver module upfront fixes the
issue. See also this mail thread:
https://marc.info/?t=154279781800003&r=1&w=2
The issue is quite weird and the root cause seems to be somewhere in
the base driver core. The patch works around the issue and may be
removed once the actual issue is fixed.

The Fixes tag refers to the first reported occurrence of the issue.
The issue itself may have been existing much longer and it may affect
users of other network chips as well. Users typically will recognize
this issue only if their PHY stops working when being used with the
genphy driver.

Fixes: f1e911d5d0df ("r8169: add basic phylib support")
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agousbnet: ipheth: fix potential recvmsg bug and recvmsg bug 2
Bernd Eckstein [Fri, 23 Nov 2018 12:51:26 +0000 (13:51 +0100)]
usbnet: ipheth: fix potential recvmsg bug and recvmsg bug 2

The bug is not easily reproducable, as it may occur very infrequently
(we had machines with 20minutes heavy downloading before it occurred)
However, on a virual machine (VMWare on Windows 10 host) it occurred
pretty frequently (1-2 seconds after a speedtest was started)

dev->tx_skb mab be freed via dev_kfree_skb_irq on a callback
before it is set.

This causes the following problems:
- double free of the skb or potential memory leak
- in dmesg: 'recvmsg bug' and 'recvmsg bug 2' and eventually
  general protection fault

Example dmesg output:
[  134.841986] ------------[ cut here ]------------
[  134.841987] recvmsg bug: copied 9C24A555 seq 9C24B557 rcvnxt 9C25A6B3 fl 0
[  134.841993] WARNING: CPU: 7 PID: 2629 at /build/linux-hwe-On9fm7/linux-hwe-4.15.0/net/ipv4/tcp.c:1865 tcp_recvmsg+0x44d/0xab0
[  134.841994] Modules linked in: ipheth(OE) kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd glue_helper cryptd vmw_balloon intel_rapl_perf joydev input_leds serio_raw vmw_vsock_vmci_transport vsock shpchp i2c_piix4 mac_hid binfmt_misc vmw_vmci parport_pc ppdev lp parport autofs4 vmw_pvscsi vmxnet3 hid_generic usbhid hid vmwgfx ttm drm_kms_helper syscopyarea sysfillrect mptspi mptscsih sysimgblt ahci psmouse fb_sys_fops pata_acpi mptbase libahci e1000 drm scsi_transport_spi
[  134.842046] CPU: 7 PID: 2629 Comm: python Tainted: G        W  OE    4.15.0-34-generic #37~16.04.1-Ubuntu
[  134.842046] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/19/2017
[  134.842048] RIP: 0010:tcp_recvmsg+0x44d/0xab0
[  134.842048] RSP: 0018:ffffa6630422bcc8 EFLAGS: 00010286
[  134.842049] RAX: 0000000000000000 RBX: ffff997616f4f200 RCX: 0000000000000006
[  134.842049] RDX: 0000000000000007 RSI: 0000000000000082 RDI: ffff9976257d6490
[  134.842050] RBP: ffffa6630422bd98 R08: 0000000000000001 R09: 000000000004bba4
[  134.842050] R10: 0000000001e00c6f R11: 000000000004bba4 R12: ffff99760dee3000
[  134.842051] R13: 0000000000000000 R14: ffff99760dee3514 R15: 0000000000000000
[  134.842051] FS:  00007fe332347700(0000) GS:ffff9976257c0000(0000) knlGS:0000000000000000
[  134.842052] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  134.842053] CR2: 0000000001e41000 CR3: 000000020e9b4006 CR4: 00000000003606e0
[  134.842055] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  134.842055] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  134.842057] Call Trace:
[  134.842060]  ? aa_sk_perm+0x53/0x1a0
[  134.842064]  inet_recvmsg+0x51/0xc0
[  134.842066]  sock_recvmsg+0x43/0x50
[  134.842070]  SYSC_recvfrom+0xe4/0x160
[  134.842072]  ? __schedule+0x3de/0x8b0
[  134.842075]  ? ktime_get_ts64+0x4c/0xf0
[  134.842079]  SyS_recvfrom+0xe/0x10
[  134.842082]  do_syscall_64+0x73/0x130
[  134.842086]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[  134.842086] RIP: 0033:0x7fe331f5a81d
[  134.842088] RSP: 002b:00007ffe8da98398 EFLAGS: 00000246 ORIG_RAX: 000000000000002d
[  134.842090] RAX: ffffffffffffffda RBX: ffffffffffffffff RCX: 00007fe331f5a81d
[  134.842094] RDX: 00000000000003fb RSI: 0000000001e00874 RDI: 0000000000000003
[  134.842095] RBP: 00007fe32f642c70 R08: 0000000000000000 R09: 0000000000000000
[  134.842097] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fe332347698
[  134.842099] R13: 0000000001b7e0a0 R14: 0000000001e00874 R15: 0000000000000000
[  134.842103] Code: 24 fd ff ff e9 cc fe ff ff 48 89 d8 41 8b 8c 24 10 05 00 00 44 8b 45 80 48 c7 c7 08 bd 59 8b 48 89 85 68 ff ff ff e8 b3 c4 7d ff <0f> 0b 48 8b 85 68 ff ff ff e9 e9 fe ff ff 41 8b 8c 24 10 05 00
[  134.842126] ---[ end trace b7138fc08c83147f ]---
[  134.842144] general protection fault: 0000 [#1] SMP PTI
[  134.842145] Modules linked in: ipheth(OE) kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd glue_helper cryptd vmw_balloon intel_rapl_perf joydev input_leds serio_raw vmw_vsock_vmci_transport vsock shpchp i2c_piix4 mac_hid binfmt_misc vmw_vmci parport_pc ppdev lp parport autofs4 vmw_pvscsi vmxnet3 hid_generic usbhid hid vmwgfx ttm drm_kms_helper syscopyarea sysfillrect mptspi mptscsih sysimgblt ahci psmouse fb_sys_fops pata_acpi mptbase libahci e1000 drm scsi_transport_spi
[  134.842161] CPU: 7 PID: 2629 Comm: python Tainted: G        W  OE    4.15.0-34-generic #37~16.04.1-Ubuntu
[  134.842162] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/19/2017
[  134.842164] RIP: 0010:tcp_close+0x2c6/0x440
[  134.842165] RSP: 0018:ffffa6630422bde8 EFLAGS: 00010202
[  134.842167] RAX: 0000000000000000 RBX: ffff99760dee3000 RCX: 0000000180400034
[  134.842168] RDX: 5c4afd407207a6c4 RSI: ffffe868495bd300 RDI: ffff997616f4f200
[  134.842169] RBP: ffffa6630422be08 R08: 0000000016f4d401 R09: 0000000180400034
[  134.842169] R10: ffffa6630422bd98 R11: 0000000000000000 R12: 000000000000600c
[  134.842170] R13: 0000000000000000 R14: ffff99760dee30c8 R15: ffff9975bd44fe00
[  134.842171] FS:  00007fe332347700(0000) GS:ffff9976257c0000(0000) knlGS:0000000000000000
[  134.842173] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  134.842174] CR2: 0000000001e41000 CR3: 000000020e9b4006 CR4: 00000000003606e0
[  134.842177] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  134.842178] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  134.842179] Call Trace:
[  134.842181]  inet_release+0x42/0x70
[  134.842183]  __sock_release+0x42/0xb0
[  134.842184]  sock_close+0x15/0x20
[  134.842187]  __fput+0xea/0x220
[  134.842189]  ____fput+0xe/0x10
[  134.842191]  task_work_run+0x8a/0xb0
[  134.842193]  exit_to_usermode_loop+0xc4/0xd0
[  134.842195]  do_syscall_64+0xf4/0x130
[  134.842197]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[  134.842197] RIP: 0033:0x7fe331f5a560
[  134.842198] RSP: 002b:00007ffe8da982e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
[  134.842200] RAX: 0000000000000000 RBX: 00007fe32f642c70 RCX: 00007fe331f5a560
[  134.842201] RDX: 00000000008f5320 RSI: 0000000001cd4b50 RDI: 0000000000000003
[  134.842202] RBP: 00007fe32f6500f8 R08: 000000000000003c R09: 00000000009343c0
[  134.842203] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fe32f6500d0
[  134.842204] R13: 00000000008f5320 R14: 00000000008f5320 R15: 0000000001cd4770
[  134.842205] Code: c8 00 00 00 45 31 e4 49 39 fe 75 4d eb 50 83 ab d8 00 00 00 01 48 8b 17 48 8b 47 08 48 c7 07 00 00 00 00 48 c7 47 08 00 00 00 00 <48> 89 42 08 48 89 10 0f b6 57 34 8b 47 2c 2b 47 28 83 e2 01 80
[  134.842226] RIP: tcp_close+0x2c6/0x440 RSP: ffffa6630422bde8
[  134.842227] ---[ end trace b7138fc08c831480 ]---

The proposed patch eliminates a potential racing condition.
Before, usb_submit_urb was called and _after_ that, the skb was attached
(dev->tx_skb). So, on a callback it was possible, however unlikely that the
skb was freed before it was set. That way (because dev->tx_skb was not set
to NULL after it was freed), it could happen that a skb from a earlier
transmission was freed a second time (and the skb we should have freed did
not get freed at all)

Now we free the skb directly in ipheth_tx(). It is not passed to the
callback anymore, eliminating the posibility of a double free of the same
skb. Depending on the retval of usb_submit_urb() we use dev_kfree_skb_any()
respectively dev_consume_skb_any() to free the skb.

Signed-off-by: Oliver Zweigle <Oliver.Zweigle@faro.com>
Signed-off-by: Bernd Eckstein <3ernd.Eckstein@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
David S. Miller [Tue, 27 Nov 2018 17:52:05 +0000 (09:52 -0800)]
Merge git://git./pub/scm/linux/kernel/git/bpf/bpf

Daniel Borkmann says:

====================
pull-request: bpf 2018-11-27

The following pull-request contains BPF updates for your *net* tree.

The main changes are:

1) Fix several bugs in BPF sparc JIT, that is, convergence for fused
   branches, initialization of frame pointer register, and moving all
   arguments into output registers from input registers in prologue
   to fix BPF to BPF calls, from David.

2) Fix a bug in arm64 JIT for fetching BPF to BPF call addresses where
   they are not guaranteed to fit into imm field and therefore must be
   retrieved through prog aux data, from Daniel.

3) Explicitly add all JITs to MAINTAINERS file with developers able to
   help out in feature development, fixes, review, etc.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agokvm: svm: Ensure an IBPB on all affected CPUs when freeing a vmcb
Jim Mattson [Tue, 22 May 2018 16:54:20 +0000 (09:54 -0700)]
kvm: svm: Ensure an IBPB on all affected CPUs when freeing a vmcb

Previously, we only called indirect_branch_prediction_barrier on the
logical CPU that freed a vmcb. This function should be called on all
logical CPUs that last loaded the vmcb in question.

Fixes: 15d45071523d ("KVM/x86: Add IBPB support")
Reported-by: Neel Natu <neelnatu@google.com>
Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
6 years agokvm: mmu: Fix race in emulated page table writes
Junaid Shahid [Wed, 31 Oct 2018 21:53:57 +0000 (14:53 -0700)]
kvm: mmu: Fix race in emulated page table writes

When a guest page table is updated via an emulated write,
kvm_mmu_pte_write() is called to update the shadow PTE using the just
written guest PTE value. But if two emulated guest PTE writes happened
concurrently, it is possible that the guest PTE and the shadow PTE end
up being out of sync. Emulated writes do not mark the shadow page as
unsync-ed, so this inconsistency will not be resolved even by a guest TLB
flush (unless the page was marked as unsync-ed at some other point).

This is fixed by re-reading the current value of the guest PTE after the
MMU lock has been acquired instead of just using the value that was
written prior to calling kvm_mmu_pte_write().

Signed-off-by: Junaid Shahid <junaids@google.com>
Reviewed-by: Wanpeng Li <wanpengli@tencent.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
6 years agoKVM: nVMX: vmcs12 revision_id is always VMCS12_REVISION even when copied from eVMCS
Liran Alon [Tue, 13 Nov 2018 15:44:46 +0000 (17:44 +0200)]
KVM: nVMX: vmcs12 revision_id is always VMCS12_REVISION even when copied from eVMCS

vmcs12 represents the per-CPU cache of L1 active vmcs12.

This cache can be loaded by one of the following:
1) Guest making a vmcs12 active by exeucting VMPTRLD
2) Guest specifying eVMCS in VP assist page and executing
VMLAUNCH/VMRESUME.

Either way, vmcs12 should have revision_id of VMCS12_REVISION.
Which is not equal to eVMCS revision_id which specifies used
VersionNumber of eVMCS struct (e.g. KVM_EVMCS_VERSION).

Specifically, this causes an issue in restoring a nested VM state
because vmx_set_nested_state() verifies that vmcs12->revision_id
is equal to VMCS12_REVISION which was not true in case vmcs12
was populated from an eVMCS by vmx_get_nested_state() which calls
copy_enlightened_to_vmcs12().

Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
6 years agoKVM: nVMX: Verify eVMCS revision id match supported eVMCS version on eVMCS VMPTRLD
Liran Alon [Thu, 1 Nov 2018 08:57:39 +0000 (10:57 +0200)]
KVM: nVMX: Verify eVMCS revision id match supported eVMCS version on eVMCS VMPTRLD

According to TLFS section 16.11.2 Enlightened VMCS, the first u32
field of eVMCS should specify eVMCS VersionNumber.

This version should be in the range of supported eVMCS versions exposed
to guest via CPUID.0x4000000A.EAX[0:15].
The range which KVM expose to guest in this CPUID field should be the
same as the value returned in vmcs_version by nested_enable_evmcs().

According to the above, eVMCS VMPTRLD should verify that version specified
in given eVMCS is in the supported range. However, current code
mistakenly verfies this field against VMCS12_REVISION.

One can also see that when KVM use eVMCS, it makes sure that
alloc_vmcs_cpu() sets allocated eVMCS revision_id to KVM_EVMCS_VERSION.

Obvious fix should just change eVMCS VMPTRLD to verify first u32 field
of eVMCS is equal to KVM_EVMCS_VERSION.
However, it turns out that Microsoft Hyper-V fails to comply to their
own invented interface: When Hyper-V use eVMCS, it just sets first u32
field of eVMCS to revision_id specified in MSR_IA32_VMX_BASIC (In our
case: VMCS12_REVISION). Instead of used eVMCS version number which is
one of the supported versions specified in CPUID.0x4000000A.EAX[0:15].
To overcome Hyper-V bug, we accept either a supported eVMCS version
or VMCS12_REVISION as valid values for first u32 field of eVMCS.

Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com>
Reviewed-by: Mark Kanda <mark.kanda@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
6 years agoKVM: nVMX/nSVM: Fix bug which sets vcpu->arch.tsc_offset to L1 tsc_offset
Leonid Shatz [Tue, 6 Nov 2018 10:14:25 +0000 (12:14 +0200)]
KVM: nVMX/nSVM: Fix bug which sets vcpu->arch.tsc_offset to L1 tsc_offset

Since commit e79f245ddec1 ("X86/KVM: Properly update 'tsc_offset' to
represent the running guest"), vcpu->arch.tsc_offset meaning was
changed to always reflect the tsc_offset value set on active VMCS.
Regardless if vCPU is currently running L1 or L2.

However, above mentioned commit failed to also change
kvm_vcpu_write_tsc_offset() to set vcpu->arch.tsc_offset correctly.
This is because vmx_write_tsc_offset() could set the tsc_offset value
in active VMCS to given offset parameter *plus vmcs12->tsc_offset*.
However, kvm_vcpu_write_tsc_offset() just sets vcpu->arch.tsc_offset
to given offset parameter. Without taking into account the possible
addition of vmcs12->tsc_offset. (Same is true for SVM case).

Fix this issue by changing kvm_x86_ops->write_tsc_offset() to return
actually set tsc_offset in active VMCS and modify
kvm_vcpu_write_tsc_offset() to set returned value in
vcpu->arch.tsc_offset.
In addition, rename write_tsc_offset() callback to write_l1_tsc_offset()
to make it clear that it is meant to set L1 TSC offset.

Fixes: e79f245ddec1 ("X86/KVM: Properly update 'tsc_offset' to represent the running guest")
Reviewed-by: Liran Alon <liran.alon@oracle.com>
Reviewed-by: Mihai Carabas <mihai.carabas@oracle.com>
Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Signed-off-by: Leonid Shatz <leonid.shatz@oracle.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
6 years agox86/kvm/vmx: fix old-style function declaration
Yi Wang [Thu, 8 Nov 2018 03:22:21 +0000 (11:22 +0800)]
x86/kvm/vmx: fix old-style function declaration

The inline keyword which is not at the beginning of the function
declaration may trigger the following build warnings, so let's fix it:

arch/x86/kvm/vmx.c:1309:1: warning: ‘inline’ is not at beginning of declaration [-Wold-style-declaration]
arch/x86/kvm/vmx.c:5947:1: warning: ‘inline’ is not at beginning of declaration [-Wold-style-declaration]
arch/x86/kvm/vmx.c:5985:1: warning: ‘inline’ is not at beginning of declaration [-Wold-style-declaration]
arch/x86/kvm/vmx.c:6023:1: warning: ‘inline’ is not at beginning of declaration [-Wold-style-declaration]

Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
6 years agoKVM: x86: fix empty-body warnings
Yi Wang [Thu, 8 Nov 2018 08:48:36 +0000 (16:48 +0800)]
KVM: x86: fix empty-body warnings

We get the following warnings about empty statements when building
with 'W=1':

arch/x86/kvm/lapic.c:632:53: warning: suggest braces around empty body in an ‘if’ statement [-Wempty-body]
arch/x86/kvm/lapic.c:1907:42: warning: suggest braces around empty body in an ‘if’ statement [-Wempty-body]
arch/x86/kvm/lapic.c:1936:65: warning: suggest braces around empty body in an ‘if’ statement [-Wempty-body]
arch/x86/kvm/lapic.c:1975:44: warning: suggest braces around empty body in an ‘if’ statement [-Wempty-body]

Rework the debug helper macro to get rid of these warnings.

Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
6 years agoKVM: VMX: Update shared MSRs to be saved/restored on MSR_EFER.LMA changes
Liran Alon [Tue, 20 Nov 2018 16:03:25 +0000 (18:03 +0200)]
KVM: VMX: Update shared MSRs to be saved/restored on MSR_EFER.LMA changes

When guest transitions from/to long-mode by modifying MSR_EFER.LMA,
the list of shared MSRs to be saved/restored on guest<->host
transitions is updated (See vmx_set_efer() call to setup_msrs()).

On every entry to guest, vcpu_enter_guest() calls
vmx_prepare_switch_to_guest(). This function should also take care
of setting the shared MSRs to be saved/restored. However, the
function does nothing in case we are already running with loaded
guest state (vmx->loaded_cpu_state != NULL).

This means that even when guest modifies MSR_EFER.LMA which results
in updating the list of shared MSRs, it isn't being taken into account
by vmx_prepare_switch_to_guest() because it happens while we are
running with loaded guest state.

To fix above mentioned issue, add a flag to mark that the list of
shared MSRs has been updated and modify vmx_prepare_switch_to_guest()
to set shared MSRs when running with host state *OR* list of shared
MSRs has been updated.

Note that this issue was mistakenly introduced by commit
678e315e78a7 ("KVM: vmx: add dedicated utility to access guest's
kernel_gs_base") because previously vmx_set_efer() always called
vmx_load_host_state() which resulted in vmx_prepare_switch_to_guest() to
set shared MSRs.

Fixes: 678e315e78a7 ("KVM: vmx: add dedicated utility to access guest's kernel_gs_base")
Reported-by: Eyal Moscovici <eyal.moscovici@oracle.com>
Reviewed-by: Mihai Carabas <mihai.carabas@oracle.com>
Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>