David S. Miller [Fri, 17 Feb 2017 00:34:01 +0000 (19:34 -0500)]
Merge git://git./linux/kernel/git/davem/net
Ganesh Goudar [Thu, 16 Feb 2017 06:57:15 +0000 (12:27 +0530)]
cxgb4: Remove redundant code in t4_uld_clean_up()
Remove variable rxq_info and also remove redundant assignment
to it.
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ganesh Goudar [Thu, 16 Feb 2017 06:55:52 +0000 (12:25 +0530)]
cxgb4: Add new T5 and T6 PCI device IDs
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arjun V [Thu, 16 Feb 2017 06:52:45 +0000 (12:22 +0530)]
cxgb4: Increase max number of tc u32 links
Make max number of supported tc u32 links equal to max number of filters
supported by hardware.
Signed-off-by: Arjun V <arjun@chelsio.com>
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Thu, 16 Feb 2017 18:22:41 +0000 (10:22 -0800)]
Merge tag 'media/v4.10-5' of git://git./linux/kernel/git/mchehab/linux-media
Pull media fix from Mauro Carvalho Chehab:
"A regression fix that makes the Siano driver work again after the
CONFIG_VMAP_STACK change"
* tag 'media/v4.10-5' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
[media] siano: make it work again with CONFIG_VMAP_STACK
Miklos Szeredi [Thu, 16 Feb 2017 16:49:02 +0000 (17:49 +0100)]
vfs: fix uninitialized flags in splice_to_pipe()
Flags (PIPE_BUF_FLAG_PACKET, PIPE_BUF_FLAG_GIFT) could remain on the
unused part of the pipe ring buffer. Previously splice_to_pipe() left
the flags value alone, which could result in incorrect behavior.
The uninitialized flags appear to have been there since the introduction of
the splice syscall.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Cc: <stable@vger.kernel.org> # 2.6.17+
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
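To illustrate the splice_to_pipe() fix above, here is a minimal user-space sketch in Python, not kernel code. PipeBuf, splice_in() and the clear_flags switch are hypothetical names, and the flag values are only illustrative. The point is that a reused ring slot keeps whatever flags it held before unless the writer resets them.

```python
from dataclasses import dataclass

# Illustrative flag values for this sketch only.
PIPE_BUF_FLAG_GIFT = 0x04
PIPE_BUF_FLAG_PACKET = 0x08


@dataclass
class PipeBuf:
    """One slot of a pipe ring buffer (hypothetical model)."""
    data: bytes = b""
    flags: int = 0


def splice_in(slot: PipeBuf, data: bytes, clear_flags: bool) -> PipeBuf:
    """Fill a ring slot with new data.

    clear_flags=False models the old behaviour (flags left alone);
    clear_flags=True models the fixed behaviour (flags reset).
    """
    slot.data = data
    if clear_flags:
        slot.flags = 0
    return slot
```

With clear_flags=False a slot that previously carried PIPE_BUF_FLAG_PACKET still reports it after being refilled, which is the kind of stale state the commit message describes.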
Linus Torvalds [Thu, 16 Feb 2017 17:05:34 +0000 (09:05 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/mszeredi/fuse
Pull fuse fixes from Miklos Szeredi:
"Fix a use after free bug introduced in 4.2 and using an uninitialized
value introduced in 4.9"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
fuse: fix uninitialized flags in pipe_buffer
fuse: fix use after free issue in fuse_dev_do_read()
Linus Torvalds [Thu, 16 Feb 2017 17:03:37 +0000 (09:03 -0800)]
Merge tag 'pci-v4.10-fixes-4' of git://git./linux/kernel/git/helgaas/pci
Pull PCI fix from Bjorn Helgaas:
"Add back pcie_pme_remove() so we free the IRQ when removing PCIe port
devices; previously the leaked IRQ caused an MSI BUG_ON"
* tag 'pci-v4.10-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
PCI/PME: Restore pcie_pme_driver.remove
Linus Torvalds [Thu, 16 Feb 2017 16:37:18 +0000 (08:37 -0800)]
Merge git://git./linux/kernel/git/davem/net
Pull networking fixes from David Miller:
1) In order to avoid problems in the future, make cgroup bpf overriding
explicit using BPF_F_ALLOW_OVERRIDE. From Alexei Starovoitov.
2) LLC sets skb->sk without proper skb->destructor and this explodes,
fix from Eric Dumazet.
3) Make sure when we have an ipv4 mapped source address, the
destination is either also an ipv4 mapped address or
ipv6_addr_any(). Fix from Jonathan T. Leighton.
4) Avoid packet loss in fec driver by programming the multicast filter
more intelligently. From Rui Sousa.
5) Handle multiple threads invoking fanout_add(), fix from Eric
Dumazet.
6) Since we can invoke the TCP input path in process context, without
BH being disabled, we have to accommodate that in the locking of the
TCP probe. Also from Eric Dumazet.
7) Fix erroneous emission of NETEVENT_DELAY_PROBE_TIME_UPDATE when we
aren't even updating that sysctl value. From Marcus Huewe.
8) Fix endian bugs in ibmvnic driver, from Thomas Falcon.
[ This is the second version of the pull that reverts the nested
rhashtable changes that looked a bit too scary for this late in the
release - Linus ]
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (27 commits)
rhashtable: Revert nested table changes.
ibmvnic: Fix endian errors in error reporting output
ibmvnic: Fix endian error when requesting device capabilities
net: neigh: Fix netevent NETEVENT_DELAY_PROBE_TIME_UPDATE notification
net: xilinx_emaclite: fix freezes due to unordered I/O
net: xilinx_emaclite: fix receive buffer overflow
bpf: kernel header files need to be copied into the tools directory
tcp: tcp_probe: use spin_lock_bh()
uapi: fix linux/if_pppol2tp.h userspace compilation errors
packet: fix races in fanout_add()
ibmvnic: Fix initial MTU settings
net: ethernet: ti: cpsw: fix cpsw assignment in resume
kcm: fix a null pointer dereference in kcm_sendmsg()
net: fec: fix multicast filtering hardware setup
ipv6: Handle IPv4-mapped src to in6addr_any dst.
ipv6: Inhibit IPv4-mapped src address on the wire.
net/mlx5e: Disable preemption when doing TC statistics upcall
rhashtable: Add nested tables
tipc: Fix tipc_sk_reinit race conditions
gfs2: Use rhashtable walk interface in glock_hash_walk
...
Miklos Szeredi [Thu, 16 Feb 2017 14:08:20 +0000 (15:08 +0100)]
fuse: fix uninitialized flags in pipe_buffer
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fixes: d82718e348fe ("fuse_dev_splice_read(): switch to add_to_pipe()")
Cc: <stable@vger.kernel.org> # 4.9+
David S. Miller [Thu, 16 Feb 2017 03:29:51 +0000 (22:29 -0500)]
rhashtable: Revert nested table changes.
This reverts commits:
6a25478077d987edc5e2f880590a2bc5fcab4441
9dbbfb0ab6680c6a85609041011484e6658e7d3c
40137906c5f55c252194ef5834130383e639536f
It's too risky to put in this late in the release
cycle. We'll put these changes into the next merge
window instead.
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Falcon [Wed, 15 Feb 2017 16:33:33 +0000 (10:33 -0600)]
ibmvnic: Fix endian errors in error reporting output
Error reports received from firmware were not being converted from
big endian values, leading to bogus error codes reported on little
endian systems.
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Falcon [Wed, 15 Feb 2017 16:32:11 +0000 (10:32 -0600)]
ibmvnic: Fix endian error when requesting device capabilities
When a vNIC client driver requests a faulty device setting, the
server returns an acceptable value for the client to request.
This 64 bit value was incorrectly being swapped as a 32 bit value,
resulting in loss of data. This patch corrects that by using
the 64 bit swap function.
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
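The data loss described in the ibmvnic capability fix above can be modelled in user space. This Python sketch is an assumption about how the bug manifests on a little-endian host (the driver's exact expressions are not shown in this log), and both function names are hypothetical.

```python
import struct


def be64_to_cpu(raw: bytes) -> int:
    """Correct conversion: interpret all 8 bytes as a big-endian value."""
    return struct.unpack(">Q", raw)[0]


def swap_as_32bit_on_le(raw: bytes) -> int:
    """Model of the bug: a 64-bit big-endian field run through a 32-bit swap.

    A little-endian CPU loads the 8 bytes as a native integer, the value is
    truncated to 32 bits, and only those 4 bytes are byte-swapped.
    """
    native = int.from_bytes(raw, "little")      # how an LE CPU loads the field
    low32 = native & 0xFFFFFFFF                 # implicit truncation to 32 bits
    return int.from_bytes(low32.to_bytes(4, "little"), "big")  # 32-bit swap
```

For a small value such as 9000 the meaningful bytes sit in the half that the truncation throws away, so the 32-bit path yields 0 while the 64-bit path yields 9000.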
Jan Koniarik [Wed, 15 Feb 2017 15:59:35 +0000 (16:59 +0100)]
atm: idt77252, use setup_timer and mod_timer
Stop accessing timer struct members directly and use setup_timer and
mod_timer helpers intended for that use. It makes the code cleaner and
will allow for easier change of the timer struct internals.
Signed-off-by: Jan Koniarik <jan.koniarik@trustica.cz>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Chas Williams <3chas3@gmail.com>
Cc: <linux-atm-general@lists.sourceforge.net>
Cc: <netdev@vger.kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Wed, 15 Feb 2017 11:09:51 +0000 (12:09 +0100)]
mlxsw: acl: Use PBS type for forward action
The current behaviour of the "mirred redirect" action (forward) offload is a bit
odd. For matched packets, the action forwards them to the desired
destination, but it also lets duplicates of the packets continue down the
original path (bridge, router, etc.). That is more like "mirred mirror".
Fix this by using PBS type which behaves exactly like "mirred redirect".
Note that PBS does not support loopback mode.
Fixes: 4cda7d8d7098 ("mlxsw: core: Introduce flexible actions support")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 15 Feb 2017 18:20:57 +0000 (13:20 -0500)]
Merge branch 'stmmac-misc'
Corentin Labbe says:
====================
stmmac: misc patches
This is a follow-up to my previous stmmac series, which addresses some comments
made on v2.
====================
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
LABBE Corentin [Wed, 15 Feb 2017 09:46:45 +0000 (10:46 +0100)]
net: stmmac: invert the logic for dumping regs
It is easier to follow the logic by removing the not operator
Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Reviewed-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
LABBE Corentin [Wed, 15 Feb 2017 09:46:44 +0000 (10:46 +0100)]
net: stmmac: reduce indentation by adding a continue
As suggested by Joe Perches, replacing the "if phydev" logic with a
continue makes it possible to reduce indentation in the for loop.
Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Reviewed-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
LABBE Corentin [Wed, 15 Feb 2017 09:46:43 +0000 (10:46 +0100)]
net: stmmac: split the stmmac_adjust_link 10/100 case
The 10/100 case has too many if branches.
This patch splits it in order to remove an if.
Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Reviewed-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
LABBE Corentin [Wed, 15 Feb 2017 09:46:42 +0000 (10:46 +0100)]
net: stmmac: run stmmac_hw_fix_mac_speed when speed is valid
This patch consolidates the code a bit by running stmmac_hw_fix_mac_speed()
after the switch when the speed is valid.
Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Reviewed-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
LABBE Corentin [Wed, 15 Feb 2017 09:46:41 +0000 (10:46 +0100)]
net: stmmac: set speed at SPEED_UNKNOWN in case of broken speed
When given an invalid speed, stmmac_adjust_link() still records it as the
current speed.
This patch modifies the default case to set the speed to SPEED_UNKNOWN if it
is not 10/100/1000.
Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Reviewed-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
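The default-case change described above amounts to mapping every unsupported speed to a sentinel. A minimal Python sketch follows, assuming SPEED_UNKNOWN is -1 as in linux/ethtool.h; recorded_speed() is a hypothetical helper, not a function from the driver.

```python
SPEED_10 = 10
SPEED_100 = 100
SPEED_1000 = 1000
SPEED_UNKNOWN = -1  # sentinel; assumed to match linux/ethtool.h


def recorded_speed(phy_speed: int) -> int:
    """Return the speed to record for the link.

    Supported speeds are kept as-is; anything else is recorded as
    SPEED_UNKNOWN instead of being stored verbatim.
    """
    if phy_speed in (SPEED_10, SPEED_100, SPEED_1000):
        return phy_speed
    return SPEED_UNKNOWN
```

A dedicated sentinel keeps "no valid speed" distinguishable from any value the PHY might legitimately report.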
LABBE Corentin [Wed, 15 Feb 2017 09:46:40 +0000 (10:46 +0100)]
net: stmmac: use SPEED_UNKNOWN/DUPLEX_UNKNOWN
It is better to use DUPLEX_UNKNOWN instead of just "-1".
Using 0 for an invalid speed is bad since 0 is a valid value for speed.
So this patch replaces 0 with SPEED_UNKNOWN.
Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Reviewed-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
LABBE Corentin [Wed, 15 Feb 2017 09:46:39 +0000 (10:46 +0100)]
net: stmmac: likely is useless in occasional function
The stmmac_adjust_link() function is called too rarely for the
likely() macros to be useful.
Just remove the likely annotations in it.
Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Reviewed-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
LABBE Corentin [Wed, 15 Feb 2017 09:46:38 +0000 (10:46 +0100)]
net: stmmac: remove useless parenthesis
This patch removes some useless parentheses.
Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Reviewed-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Christoph Hellwig [Wed, 15 Feb 2017 07:38:47 +0000 (08:38 +0100)]
net: ethernet: aquantia: switch to pci_alloc_irq_vectors
pci_enable_msix has been long deprecated, but this driver adds a new
instance. Convert it to pci_alloc_irq_vectors so that no new instance
of the deprecated function reaches mainline.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Pavel Belous <pavel.belous@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 15 Feb 2017 17:42:54 +0000 (12:42 -0500)]
Merge branch 'qed-ptp'
Yuval Mintz says:
====================
qed*: Add support for PTP
This patch series adds required changes for qed/qede drivers for
supporting the IEEE Precision Time Protocol (PTP).
Changes from previous versions:
v7: Fixed Kbuild robot warnings.
v6: Corrected broken loop iteration in previous version.
Reduced approximation error of adjfreq.
v5: Removed two divisions from the adjust-frequency loop.
Resulting logic would use 8 divisions [instead of 24].
v4: Remove the loop iteration for value '0' in the qed_ptp_hw_adjfreq()
implementation.
v3: Use div_s64 for 64-bit divisions as do_div gives error for signed
types.
Incorporated review comments from Richard Cochran.
- Clear timestamp registers as soon as timestamp is read.
- Use shift operation in the place of 'divide by 16'.
v2: Use do_div for 64-bit divisions.
====================
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sudarsana Reddy Kalluru [Wed, 15 Feb 2017 08:24:11 +0000 (10:24 +0200)]
qede: Add driver support for PTP
This patch adds driver support for:
- Registering the ptp clock functionality with the OS.
- Timestamping the Rx/Tx PTP packets.
- Ethtool callbacks related to PTP.
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sudarsana Reddy Kalluru [Wed, 15 Feb 2017 08:24:10 +0000 (10:24 +0200)]
qed: Add infrastructure for PTP support
The patch adds the required qed interfaces for configuring/reading
the PTP clock on the adapter.
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ganesh Goudar [Wed, 15 Feb 2017 06:15:25 +0000 (11:45 +0530)]
cxgb4: Update proper netdev stats for rx drops
Count buffer group drops or truncates as rx drops rather than
rx errors in netdev stats.
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: Arjun V <arjun@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jarno Rajahalme [Wed, 15 Feb 2017 05:16:28 +0000 (21:16 -0800)]
openvswitch: Set internal device max mtu to ETH_MAX_MTU.
Commit 91572088e3fd ("net: use core MTU range checking in core net
infra") changed the openvswitch internal device to use the core net
infra for controlling the MTU range, but failed to actually set the
max_mtu as described in the commit message, which now defaults to
ETH_DATA_LEN.
This patch fixes this by setting max_mtu to ETH_MAX_MTU after
ether_setup() call.
Fixes: 91572088e3fd ("net: use core MTU range checking in core net infra")
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
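The effect of the openvswitch change above can be sketched with a simplified model of core MTU range checking. This is Python, not kernel code: NetDevice, mtu_in_range() and internal_dev_setup() are hypothetical stand-ins, and the range check is deliberately simpler than the kernel's.

```python
ETH_MIN_MTU = 68
ETH_DATA_LEN = 1500
ETH_MAX_MTU = 0xFFFF


class NetDevice:
    """Tiny stand-in for struct net_device (only the MTU bounds)."""

    def __init__(self):
        self.min_mtu = 0
        self.max_mtu = 0


def ether_setup(dev: NetDevice) -> None:
    """Model of the Ethernet defaults: max_mtu ends up at ETH_DATA_LEN."""
    dev.min_mtu = ETH_MIN_MTU
    dev.max_mtu = ETH_DATA_LEN


def mtu_in_range(dev: NetDevice, new_mtu: int) -> bool:
    """Simplified core range check applied when the MTU is changed."""
    return dev.min_mtu <= new_mtu <= dev.max_mtu


def internal_dev_setup(dev: NetDevice) -> None:
    """Model of the fix: raise max_mtu after the ether_setup() call."""
    ether_setup(dev)
    dev.max_mtu = ETH_MAX_MTU
```

With only ether_setup() applied, a request for an MTU of 9000 is rejected; after internal_dev_setup() it is accepted.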
Marcus Huewe [Wed, 15 Feb 2017 00:00:36 +0000 (01:00 +0100)]
net: neigh: Fix netevent NETEVENT_DELAY_PROBE_TIME_UPDATE notification
When setting a neigh related sysctl parameter, we always send a
NETEVENT_DELAY_PROBE_TIME_UPDATE netevent. For instance, when
executing
sysctl net.ipv6.neigh.wlp3s0.retrans_time_ms=2000
a NETEVENT_DELAY_PROBE_TIME_UPDATE netevent is generated.
This is caused by commit 2a4501ae18b5 ("neigh: Send a
notification when DELAY_PROBE_TIME changes"). According to the
commit's description, it was intended to generate such an event
when setting the "delay_first_probe_time" sysctl parameter.
In order to fix this, only generate this event when actually
setting the "delay_first_probe_time" sysctl parameter. This fix
should not have any unintended side-effects, because all but one
registered netevent callbacks check for other netevent event
types (the registered callbacks were obtained by grepping for
"register_netevent_notifier"). The only callback that uses the
NETEVENT_DELAY_PROBE_TIME_UPDATE event is
mlxsw_sp_router_netevent_event() (in
drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c): in case
of this event, it only accesses the DELAY_PROBE_TIME of the
passed neigh_parms.
Fixes: 2a4501ae18b5 ("neigh: Send a notification when DELAY_PROBE_TIME changes")
Signed-off-by: Marcus Huewe <suse-tux@gmx.de>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
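The behaviour change in the neigh fix above can be sketched as a sysctl write handler that notifies only for one parameter. This is a Python model with hypothetical names (neigh_sysctl_write and the notify callback); the real handler lives in the kernel's neighbour code and is not reproduced here.

```python
def neigh_sysctl_write(params: dict, name: str, value: int, notify) -> None:
    """Store a neigh sysctl value and notify only when it is relevant.

    Before the fix, the event was emitted for every parameter; after it,
    only a write to delay_first_probe_time emits the event.
    """
    params[name] = value
    if name == "delay_first_probe_time":
        notify("NETEVENT_DELAY_PROBE_TIME_UPDATE")
```

A write to retrans_time_ms, as in the sysctl example in the commit message, then produces no event at all.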
Jiri Pirko [Wed, 15 Feb 2017 10:57:50 +0000 (11:57 +0100)]
sched: have stub for tcf_destroy_chain in case NET_CLS is not configured
This fixes broken build for !NET_CLS:
net/built-in.o: In function `fq_codel_destroy':
/home/sab/linux/net-next/net/sched/sch_fq_codel.c:468: undefined reference to `tcf_destroy_chain'
Fixes: cf1facda2f61 ("sched: move tcf_proto_destroy and tcf_destroy_chain helpers into cls_api")
Reported-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Tested-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Anssi Hannula [Tue, 14 Feb 2017 17:11:45 +0000 (19:11 +0200)]
net: xilinx_emaclite: fix freezes due to unordered I/O
The xilinx_emaclite uses __raw_writel and __raw_readl for register
accesses. Those functions do not imply any kind of memory barriers and
they may be reordered.
The driver does not seem to take that into account, though, and the
driver does not satisfy the ordering requirements of the hardware.
For clear examples, see xemaclite_mdio_write() and xemaclite_mdio_read()
which try to set MDIO address before initiating the transaction.
I'm seeing system freezes with the driver with GCC 5.4 and current
Linux kernels on Zynq-7000 SoC immediately when trying to use the
interface.
In commit 123c1407af87 ("net: emaclite: Do not use microblaze and ppc
IO functions") the driver was switched from non-generic
in_be32/out_be32 (memory barriers, big endian) to
__raw_readl/__raw_writel (no memory barriers, native endian), so
apparently the device follows system endianness and the driver was
originally written with the assumption of memory barriers.
Rather than try to hunt for each case of missing barrier, just switch
the driver to use iowrite32/ioread32/iowrite32be/ioread32be depending
on endianness instead.
Tested on little-endian Zynq-7000 ARM SoC FPGA.
Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi>
Fixes: 123c1407af87 ("net: emaclite: Do not use microblaze and ppc IO functions")
Signed-off-by: David S. Miller <davem@davemloft.net>
Anssi Hannula [Tue, 14 Feb 2017 17:11:44 +0000 (19:11 +0200)]
net: xilinx_emaclite: fix receive buffer overflow
xilinx_emaclite looks at the received data to try to determine the
Ethernet packet length but does not properly clamp it if
proto_type == ETH_P_IP or 1500 < proto_type <= 1518, causing a buffer
overflow and a panic via skb_panic() as the length exceeds the allocated
skb size.
Fix those cases.
Also add an additional unconditional check with WARN_ON() at the end.
Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi>
Fixes: bb81b2ddfa19 ("net: add Xilinx emac lite device driver")
Signed-off-by: David S. Miller <davem@davemloft.net>
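The clamping idea behind the receive overflow fix above can be sketched as follows. This Python model is an assumption about the general shape of the length calculation, not a copy of the driver: rx_length() is a hypothetical name, and the IP-specific branch mentioned in the commit message is left out.

```python
import struct

ETH_HLEN = 14        # Ethernet header length
ETH_FCS_LEN = 4      # frame check sequence length
ETH_DATA_LEN = 1500  # maximum payload of a standard frame


def rx_length(frame: bytes, maxlen: int) -> int:
    """Derive the frame length from the type/length field, then clamp it.

    Values above ETH_DATA_LEN are treated as an EtherType, so the field
    carries no length and the maximum is assumed. Smaller values are a
    payload length. The final min() is the unconditional safety net.
    """
    (proto_type,) = struct.unpack_from("!H", frame, 12)
    if proto_type > ETH_DATA_LEN:
        length = maxlen
    else:
        length = proto_type + ETH_HLEN + ETH_FCS_LEN
    return min(length, maxlen)
```

Whatever the header claims, the returned length never exceeds the buffer size passed in as maxlen.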
Mickaël Salaün [Sat, 11 Feb 2017 22:20:23 +0000 (23:20 +0100)]
bpf: Rebuild bpf.o for any dependency update
This is needed to force a rebuild of bpf.o when one of its dependencies
(e.g. uapi/linux/bpf.h) is updated.
Add a phony target.
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David S. Miller <davem@davemloft.net>
Cc: Wang Nan <wangnan0@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mickaël Salaün [Sat, 11 Feb 2017 19:37:08 +0000 (20:37 +0100)]
bpf: Remove redundant ifdef
Remove a useless ifdef __NR_bpf as requested by Wang Nan.
Inline one-line static functions as it was in the bpf_sys.h file.
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David S. Miller <davem@davemloft.net>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/r/828ab1ff-4dcf-53ff-c97b-074adb895006@huawei.com
Acked-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Thu, 9 Feb 2017 17:10:04 +0000 (09:10 -0800)]
mlx4: do not use rwlock in fast path
Using a reader-writer lock in fast path is silly, when we can
instead use RCU or a seqlock.
For mlx4 hwstamp clock, a seqlock is the way to go, removing
two atomic operations and false sharing.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
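For readers unfamiliar with the seqlock pattern the mlx4 commit above switches to, here is a single-threaded Python sketch of the sequence-counter protocol. It is only a model: SeqCount is a hypothetical class, and a real seqlock also needs memory barriers and writer-side locking, which Python code like this does not provide.

```python
class SeqCount:
    """Sequence counter protecting one value (illustrative model only)."""

    def __init__(self, value):
        self.seq = 0
        self.value = value

    def write(self, value):
        self.seq += 1          # odd: a write is in progress
        self.value = value
        self.seq += 1          # even again: write finished

    def read_begin(self) -> int:
        return self.seq

    def read_retry(self, start: int) -> bool:
        # Retry if a write was in progress or completed since read_begin().
        return bool(start & 1) or self.seq != start

    def read(self):
        while True:
            start = self.read_begin()
            snapshot = self.value
            if not self.read_retry(start):
                return snapshot
```

Readers never write to shared state in this scheme, which is where the saving over a reader-writer lock comes from according to the commit message.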
Yinghai Lu [Wed, 15 Feb 2017 05:17:48 +0000 (21:17 -0800)]
PCI/PME: Restore pcie_pme_driver.remove
In addition to making PME non-modular, d7def2040077 ("PCI/PME: Make
explicitly non-modular") removed the pcie_pme_driver .remove() method,
pcie_pme_remove().
pcie_pme_remove() freed the PME IRQ that was requested in pcie_pme_probe().
The fact that we don't free the IRQ after d7def2040077 causes the following
crash when removing a PCIe port device via /sys:
------------[ cut here ]------------
kernel BUG at drivers/pci/msi.c:370!
invalid opcode: 0000 [#1] SMP
Modules linked in:
CPU: 1 PID: 14509 Comm: sh Tainted: G W 4.8.0-rc1-yh-00012-gd29438d
RIP: 0010:[<ffffffff9758bbf5>] free_msi_irqs+0x65/0x190
...
Call Trace:
[<ffffffff9758cda4>] pci_disable_msi+0x34/0x40
[<ffffffff97583817>] cleanup_service_irqs+0x27/0x30
[<ffffffff97583e9a>] pcie_port_device_remove+0x2a/0x40
[<ffffffff97584250>] pcie_portdrv_remove+0x40/0x50
[<ffffffff97576d7b>] pci_device_remove+0x4b/0xc0
[<ffffffff9785ebe6>] __device_release_driver+0xb6/0x150
[<ffffffff9785eca5>] device_release_driver+0x25/0x40
[<ffffffff975702e4>] pci_stop_bus_device+0x74/0xa0
[<ffffffff975704ea>] pci_stop_and_remove_bus_device_locked+0x1a/0x30
[<ffffffff97578810>] remove_store+0x50/0x70
[<ffffffff9785a378>] dev_attr_store+0x18/0x30
[<ffffffff97260b64>] sysfs_kf_write+0x44/0x60
[<ffffffff9725feae>] kernfs_fop_write+0x10e/0x190
[<ffffffff971e13f8>] __vfs_write+0x28/0x110
[<ffffffff970b0fa4>] ? percpu_down_read+0x44/0x80
[<ffffffff971e53a7>] ? __sb_start_write+0xa7/0xe0
[<ffffffff971e53a7>] ? __sb_start_write+0xa7/0xe0
[<ffffffff971e1f04>] vfs_write+0xc4/0x180
[<ffffffff971e3089>] SyS_write+0x49/0xa0
[<ffffffff97001a46>] do_syscall_64+0xa6/0x1b0
[<ffffffff9819201e>] entry_SYSCALL64_slow_path+0x25/0x25
...
RIP [<ffffffff9758bbf5>] free_msi_irqs+0x65/0x190
RSP <ffff89ad3085bc48>
---[ end trace f4505e1dac5b95d3 ]---
Segmentation fault
Restore pcie_pme_remove().
[bhelgaas: changelog]
Fixes: d7def2040077 ("PCI/PME: Make explicitly non-modular")
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
CC: stable@vger.kernel.org # v4.9+
Sahitya Tummala [Wed, 8 Feb 2017 15:00:56 +0000 (20:30 +0530)]
fuse: fix use after free issue in fuse_dev_do_read()
There is a potential race between fuse_dev_do_write()
and request_wait_answer() contexts as shown below:
TASK 1:
__fuse_request_send():
|--spin_lock(&fiq->waitq.lock);
|--queue_request();
|--spin_unlock(&fiq->waitq.lock);
|--request_wait_answer():
|--if (test_bit(FR_SENT, &req->flags))
<gets pre-empted after it is validated true>
TASK 2:
fuse_dev_do_write():
|--clears bit FR_SENT,
|--request_end():
|--sets bit FR_FINISHED
|--spin_lock(&fiq->waitq.lock);
|--list_del_init(&req->intr_entry);
|--spin_unlock(&fiq->waitq.lock);
|--fuse_put_request();
|--queue_interrupt();
<request gets queued to interrupts list>
|--wake_up_locked(&fiq->waitq);
|--wait_event_freezable();
<as FR_FINISHED is set, it returns and then
the caller frees this request>
Now the next fuse_dev_do_read() sees that the interrupts list is not empty
and calls fuse_read_interrupt(), which tries to access the request
that has already been freed, resulting in the crash below:
[11432.401266] Unable to handle kernel paging request at virtual address 6b6b6b6b6b6b6b6b
...
[11432.418518] Kernel BUG at ffffff80083720e0
[11432.456168] PC is at __list_del_entry+0x6c/0xc4
[11432.463573] LR is at fuse_dev_do_read+0x1ac/0x474
...
[11432.679999] [<ffffff80083720e0>] __list_del_entry+0x6c/0xc4
[11432.687794] [<ffffff80082c65e0>] fuse_dev_do_read+0x1ac/0x474
[11432.693180] [<ffffff80082c6b14>] fuse_dev_read+0x6c/0x78
[11432.699082] [<ffffff80081d5638>] __vfs_read+0xc0/0xe8
[11432.704459] [<ffffff80081d5efc>] vfs_read+0x90/0x108
[11432.709406] [<ffffff80081d67f0>] SyS_read+0x58/0x94
As FR_FINISHED bit is set before deleting the intr_entry with input
queue lock in request completion path, do the testing of this flag and
queueing atomically with the same lock in queue_interrupt().
Signed-off-by: Sahitya Tummala <stummala@codeaurora.org>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fixes: fd22d62ed0c3 ("fuse: no fc->lock for iqueue parts")
Cc: <stable@vger.kernel.org> # 4.2+
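The fix described above, testing FR_FINISHED and queueing under the same lock, can be sketched in user space. This Python model uses hypothetical classes (Request, InputQueue) and a threading.Lock in place of the kernel spinlock; it shows the ordering only, not the real fuse data structures.

```python
import threading


class Request:
    def __init__(self):
        self.finished = False  # stands in for the FR_FINISHED bit


class InputQueue:
    def __init__(self):
        self.lock = threading.Lock()  # stands in for fiq->waitq.lock
        self.interrupts = []


def request_end(fiq: InputQueue, req: Request) -> None:
    """Completion path: mark finished, then unlink under the lock."""
    req.finished = True
    with fiq.lock:
        if req in fiq.interrupts:
            fiq.interrupts.remove(req)


def queue_interrupt(fiq: InputQueue, req: Request) -> bool:
    """Queue an interrupt unless the request already finished.

    The check and the append happen under one lock, so a request that
    has passed through request_end() can no longer be queued afterwards.
    """
    with fiq.lock:
        if req.finished:
            return False
        if req not in fiq.interrupts:
            fiq.interrupts.append(req)
        return True
```

Because the flag is set before request_end() takes the lock, any queue_interrupt() that acquires the lock after the unlink is guaranteed to observe the finished state and back off.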
Ivan Khoronzhuk [Tue, 14 Feb 2017 14:02:36 +0000 (16:02 +0200)]
net: ethernet: ti: cpsw: use var instead of func for usage count
The usage count function is based on the ndev_running flag, which is
updated before calling ndo_open/close. If the ndo is called from
another place, as with suspend/resume, the counter is not changed,
which breaks suspend/resume. For a common resource it makes no difference
which device is using it; only the device count matters. So replace the
usage count function with a variable and increment and decrement it in
ndo_open/close.
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
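The counting scheme described above can be sketched as a plain counter that is adjusted inside the open and close handlers themselves. This is a Python model with hypothetical names (CpswCommon, hw_up), not the driver's code.

```python
class CpswCommon:
    """Shared state for all ports (hypothetical model)."""

    def __init__(self):
        self.usage_count = 0
        self.hw_up = False  # stands in for the shared hardware resource


def ndo_open(cpsw: CpswCommon) -> None:
    """First opener brings up the shared resource."""
    if cpsw.usage_count == 0:
        cpsw.hw_up = True
    cpsw.usage_count += 1


def ndo_stop(cpsw: CpswCommon) -> None:
    """Last closer tears the shared resource down."""
    cpsw.usage_count -= 1
    if cpsw.usage_count == 0:
        cpsw.hw_up = False
```

Since the counter changes only inside these two handlers, it stays correct no matter which path invokes them, including suspend and resume.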
Stephen Rothwell [Mon, 13 Feb 2017 21:22:20 +0000 (08:22 +1100)]
bpf: kernel header files need to be copied into the tools directory
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Wed, 15 Feb 2017 01:11:14 +0000 (17:11 -0800)]
tcp: tcp_probe: use spin_lock_bh()
tcp_rcv_established() can now run in process context.
We need to disable BH while acquiring tcp probe spinlock,
or risk a deadlock.
Fixes: 5413d1babe8f ("net: do not block BH while processing socket backlog")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Ricardo Nabinger Sanchez <rnsanchez@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dmitry V. Levin [Wed, 15 Feb 2017 02:23:26 +0000 (05:23 +0300)]
uapi: fix linux/if_pppol2tp.h userspace compilation errors
Because of <linux/libc-compat.h> interface limitations, <netinet/in.h>
provided by libc cannot be included after <linux/in.h>, therefore any
header that includes <netinet/in.h> cannot be included after <linux/in.h>.
Change uapi/linux/l2tp.h, the last uapi header that includes
<netinet/in.h>, to include <linux/in.h> and <linux/in6.h> instead of
<netinet/in.h> and use __SOCK_SIZE__ instead of sizeof(struct sockaddr)
the same way as uapi/linux/in.h does, to fix linux/if_pppol2tp.h userspace
compilation errors like this:
In file included from /usr/include/linux/l2tp.h:12:0,
from /usr/include/linux/if_pppol2tp.h:21,
/usr/include/netinet/in.h:31:8: error: redefinition of 'struct in_addr'
Fixes: 47c3e7783be4 ("net: l2tp: deprecate PPPOL2TP_MSG_* in favour of L2TP_MSG_*")
Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mauro Carvalho Chehab [Tue, 14 Feb 2017 19:47:57 +0000 (17:47 -0200)]
[media] siano: make it work again with CONFIG_VMAP_STACK
Reported as a Kaffeine bug:
https://bugs.kde.org/show_bug.cgi?id=375811
The USB control messages require DMA to work. We cannot pass
a stack-allocated buffer, as it is not guaranteed that the
stack lies in a DMA-capable area.
On kernel 4.9, the default on the x86 architecture is to no longer
accept DMA on the stack. On other architectures, this has been a
requirement since kernel 2.2. So, after this patch, this driver
should work fine on all archs.
Tested with USB ID 2040:5510: Hauppauge Windham
Cc: stable@vger.kernel.org
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Eric Dumazet [Tue, 14 Feb 2017 17:03:51 +0000 (09:03 -0800)]
packet: fix races in fanout_add()
Multiple threads can call fanout_add() at the same time.
We need to grab fanout_mutex earlier to avoid races that could
lead to one thread freeing po->rollover that was set by another thread.
Do the same in fanout_release(), for peace of mind, and to help us
find lockdep issues earlier.
Fixes: dc99f600698d ("packet: Add fanout support.")
Fixes: 0648ab70afe6 ("packet: rollover prepare: per-socket state")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tobias Klauser [Tue, 14 Feb 2017 16:47:12 +0000 (17:47 +0100)]
pch_gbe: Omit private ndo_get_stats function
pch_gbe_get_stats() just returns dev->stats so we can leave it out
altogether and let dev_get_stats() do the job.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
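The reason a private ndo_get_stats that only returns dev->stats is redundant can be shown with a small model of the fallback. This Python sketch uses a hypothetical NetDevice class and a simplified dev_get_stats(); the kernel function handles more cases than shown here.

```python
class NetDevice:
    """Stand-in for struct net_device with an optional stats hook."""

    def __init__(self, stats, ndo_get_stats=None):
        self.stats = stats
        self.ndo_get_stats = ndo_get_stats


def dev_get_stats(dev: NetDevice):
    """Use the driver hook if present, otherwise fall back to dev.stats."""
    if dev.ndo_get_stats is not None:
        return dev.ndo_get_stats(dev)
    return dev.stats
```

A hook of the form `lambda dev: dev.stats` returns exactly what the fallback returns, so dropping it changes nothing for callers.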
Tobias Klauser [Tue, 14 Feb 2017 14:10:06 +0000 (15:10 +0100)]
net: hip04: Omit private ndo_get_stats function
hip04_get_stats() just returns dev->stats so we can leave it
out altogether and let dev_get_stats() do the job.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wei Yongjun [Tue, 14 Feb 2017 14:19:32 +0000 (14:19 +0000)]
net_sched: nla_memdup_cookie() can be static
Fixes the following sparse warning:
net/sched/act_api.c:532:5: warning:
symbol 'nla_memdup_cookie' was not declared. Should it be static?
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Falcon [Tue, 14 Feb 2017 16:22:59 +0000 (10:22 -0600)]
ibmvnic: Fix initial MTU settings
In the current driver, the MTU is set to the maximum value
supported by the backing device. This decision turned out to
be a mistake as it led to confusion among users. The expected
initial MTU value used for other IBM vNIC capable operating
systems is 1500, with the maximum value (9000) reserved for
when Jumbo frames are enabled. This patch sets the MTU to
the default value for a net device.
It also corrects a discrepancy between MTU values received from
firmware, which include the ethernet header length, and net
device MTU values.
Finally, it removes redundant min/max MTU assignments after device
initialization.
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
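The MTU discrepancy mentioned above comes down to whether the Ethernet header is counted. A minimal Python sketch follows, assuming the conversion is a plain offset by ETH_HLEN (14 bytes); the helper names are hypothetical and the driver's exact arithmetic is not shown in this log.

```python
ETH_HLEN = 14  # Ethernet header length in bytes


def fw_mtu_to_netdev_mtu(fw_mtu: int) -> int:
    """Firmware values include the Ethernet header; netdev MTUs do not."""
    return fw_mtu - ETH_HLEN


def netdev_mtu_to_fw_mtu(netdev_mtu: int) -> int:
    """Inverse conversion for values sent back to firmware."""
    return netdev_mtu + ETH_HLEN
```

Under this assumption, a firmware value of 1514 corresponds to the default net device MTU of 1500.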
Ivan Khoronzhuk [Tue, 14 Feb 2017 12:42:15 +0000 (14:42 +0200)]
net: ethernet: ti: cpsw: fix cpsw assignment in resume
There is a copy-paste error that breaks resume for the CPSW driver:
netdev_priv() was replaced with ndev_to_cpsw(ndev) in suspend, but was
left unchanged in resume.
Fixes: 606f39939595a4d4540406bfc11f265b2036af6d
(ti: cpsw: move platform data and slaves info to cpsw_common)
Reported-by: Alexey Starikovskiy <AStarikovskiy@topcon.com>
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Manuel Lauss [Tue, 14 Feb 2017 12:08:04 +0000 (13:08 +0100)]
net: irda: au1k_ir: drop useless include
remove useless ioport.h include.
Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Manuel Lauss [Tue, 14 Feb 2017 12:08:03 +0000 (13:08 +0100)]
net: irda: au1k_ir: remove unused timer
remove the unused timer. I suppose it was intended as a timeout
detector, but never properly implemented.
Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Alemayhu [Mon, 13 Feb 2017 23:02:35 +0000 (00:02 +0100)]
bpf: reduce compiler warnings by adding fallthrough comments
Fixes the following warnings:
kernel/bpf/verifier.c: In function ‘may_access_direct_pkt_data’:
kernel/bpf/verifier.c:702:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
if (t == BPF_WRITE)
^
kernel/bpf/verifier.c:704:2: note: here
case BPF_PROG_TYPE_SCHED_CLS:
^~~~
kernel/bpf/verifier.c: In function ‘reg_set_min_max_inv’:
kernel/bpf/verifier.c:2057:23: warning: this statement may fall through [-Wimplicit-fallthrough=]
true_reg->min_value = 0;
~~~~~~~~~~~~~~~~~~~~^~~
kernel/bpf/verifier.c:2058:2: note: here
case BPF_JSGT:
^~~~
kernel/bpf/verifier.c:2068:23: warning: this statement may fall through [-Wimplicit-fallthrough=]
true_reg->min_value = 0;
~~~~~~~~~~~~~~~~~~~~^~~
kernel/bpf/verifier.c:2069:2: note: here
case BPF_JSGE:
^~~~
kernel/bpf/verifier.c: In function ‘reg_set_min_max’:
kernel/bpf/verifier.c:2009:24: warning: this statement may fall through [-Wimplicit-fallthrough=]
false_reg->min_value = 0;
~~~~~~~~~~~~~~~~~~~~~^~~
kernel/bpf/verifier.c:2010:2: note: here
case BPF_JSGT:
^~~~
kernel/bpf/verifier.c:2019:24: warning: this statement may fall through [-Wimplicit-fallthrough=]
false_reg->min_value = 0;
~~~~~~~~~~~~~~~~~~~~~^~~
kernel/bpf/verifier.c:2020:2: note: here
case BPF_JSGE:
^~~~
Reported-by: David Binderman <dcb314@hotmail.com>
Signed-off-by: Alexander Alemayhu <alexander@alemayhu.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ondrej Zary [Mon, 13 Feb 2017 22:45:47 +0000 (23:45 +0100)]
pcnet32: fix BNC/AUI port on AM79C970A
Even though the port autoselection is enabled by default on AM79C970A,
BNC/AUI port does not work because the link is always reported to be
down. The link state reported by the chip belongs only to the TP port
but the driver uses it regardless of the port used. The chip can't
detect BNC/AUI link state.
Disable port autoselection and use TP port by default to keep current
behavior (link detection works on TP port, BNC/AUI port does not work).
Implement ethtool autoneg, port and duplex configuration to allow
using the BNC/AUI port.
Report the TP link state only if the TP port is selected. When the
port autoselection is enabled or AUI port is selected, report the link
as always up.
Move pcnet32_suspend() and pcnet32_clr_suspend() functions to avoid
forward declarations.
Signed-off-by: Ondrej Zary <linux@rainbow-software.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ondrej Zary [Mon, 13 Feb 2017 22:45:46 +0000 (23:45 +0100)]
pcnet32: factor out pcnet32_clr_suspend()
Move the code to clear SUSPEND flag to a separate function to simplify
code.
Signed-off-by: Ondrej Zary <linux@rainbow-software.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nogah Frankel [Mon, 13 Feb 2017 20:03:02 +0000 (21:03 +0100)]
mlxsw: spectrum: Change ipv6 unregistered mc table
Point the unregistered IPv6 mc table back to the bc table.
This is done because IPv6 mcast snooping is not yet supported on Spectrum.
Reported-by: Jiri Pirko <jiri@mellanox.com>
Fixes: 71c365bdc439 ("mlxsw: spectrum: Separate bc and mc floods")
Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Tested-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
WANG Cong [Mon, 13 Feb 2017 19:13:16 +0000 (11:13 -0800)]
kcm: fix a null pointer dereference in kcm_sendmsg()
In commit 98e3862ca2b1 ("kcm: fix 0-length case for kcm_sendmsg()")
I tried to avoid skb allocation for the 0-length case, but missed
a check for a NULL pointer in the non-EOR case.
Fixes: 98e3862ca2b1 ("kcm: fix 0-length case for kcm_sendmsg()")
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Tom Herbert <tom@herbertland.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 14 Feb 2017 18:04:11 +0000 (13:04 -0500)]
Merge branch 'sunvnet-driver-updates'
Shannon Nelson says:
====================
sunvnet driver updates
The sunvnet ldom virtual network driver was due for some updates and
a bugfix or two. These patches address a few items left over from
last year's make-over.
v2:
- changed memory barrier fix to use smp_wmb
- put NETIF_F_SG back into the advertised ldmvsw hw_features
v3:
- the sunvnet_common module doesn't need module_init or _exit
v4:
- dropped the statistics patch
- fixed up "default" tag for SUNVNET_COMMON
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Shannon Nelson [Mon, 13 Feb 2017 18:57:04 +0000 (10:57 -0800)]
ldmvsw: disable tso and gso for bridge operations
The ldmvsw driver is specifically for supporting the ldom virtual
networking by running in the primary ldom and using the LDC to connect
the remaining ldoms to the outside world via a bridge. With TSO and GSO
supported while connected to the bridge, things tend to misbehave as seen
in our case by delayed packets, enough to begin triggering retransmits
and affecting overall throughput. By turning off advertised support for
TSO and GSO we restore stable traffic flow through the bridge.
Orabug: 23293104
Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Shannon Nelson [Mon, 13 Feb 2017 18:57:03 +0000 (10:57 -0800)]
ldmvsw: update and simplify version string
Bump the version and simplify the print code.
Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Shannon Nelson [Mon, 13 Feb 2017 18:57:02 +0000 (10:57 -0800)]
sunvnet: remove extra rcu_read_unlocks
The RCU read lock is grabbed first thing in sunvnet_start_xmit_common()
so it always needs to be released. This removes the conditional release
in the dropped packet error path and removes a couple of superfluous
calls in the middle of the code.
Reported-by: Bijan Mottahedeh <bijan.mottahedeh@oracle.com>
Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Shannon Nelson [Mon, 13 Feb 2017 18:57:01 +0000 (10:57 -0800)]
sunvnet: straighten up message event handling logic
The use of gotos for handling the incoming events made this code
harder to read and support than it should be. This patch straightens
out and clears up the logic.
Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Shannon Nelson [Mon, 13 Feb 2017 18:57:00 +0000 (10:57 -0800)]
sunvnet: add memory barrier before check for tx enable
In order to allow the underlying LDC and outstanding memory operations
to potentially catch up with the driver's Tx requests, add a memory
barrier before checking again for available tx descriptors.
Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Shannon Nelson [Mon, 13 Feb 2017 18:56:59 +0000 (10:56 -0800)]
sunvnet: update version and version printing
There have been several changes since the first version of this code, so
we bump the version number. While we're at it, we can simplify the
version printing a bit and drop a couple lines of code.
Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sowmini Varadhan [Mon, 13 Feb 2017 18:56:58 +0000 (10:56 -0800)]
sunvnet: remove unused variable in maybe_tx_wakeup
The vio_dring_state *dr variable is unused in maybe_tx_wakeup().
As the comments indicate, we call maybe_tx_wakeup() whenever we
get a STOPPED LDC message on the port. If the queue is stopped,
we want to wake it up so that we will send another START message
at the next TX and trigger the consumer to drain the dring.
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Shannon Nelson [Mon, 13 Feb 2017 18:56:57 +0000 (10:56 -0800)]
sunvnet: make sunvnet common code dynamically loadable
When the sunvnet_common code was split out for use by both sunvnet
and the newer ldmvsw, it was made into a static kernel library, which
limits the usefulness of sunvnet and ldmvsw as loadables, since most
of the real work is being done in the shared code. Also, this is
simply dead code in kernels that aren't running the LDoms.
This patch makes the sunvnet_common into a dynamically loadable
module and makes sunvnet and ldmvsw dependent on sunvnet_common.
Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 14 Feb 2017 17:43:19 +0000 (12:43 -0500)]
Merge branch 'sfc-bogus-interrupt-mode-fallbacks'
Edward Cree says:
====================
sfc: prevent bogus interrupt-mode fallbacks
EF10 VFs only support MSI-X interrupts, not MSI or legacy. This series
stops the probe logic from trying to fall back to those if MSI-X interrupt
probe fails. It also prevents selecting them with the interrupt_mode
module parameter.
This avoids producing messages like "failed to hook legacy IRQ 0" and "IRQ
handler type mismatch for IRQ 0", and ensures that the relevant error
(from the attempt to enable MSI-X) is reported to the caller.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Rybchenko [Mon, 13 Feb 2017 14:59:04 +0000 (14:59 +0000)]
sfc: only fall back to a lower interrupt mode if it is supported
If we fail to probe interrupts with our minimum mode, return that error.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Rybchenko [Mon, 13 Feb 2017 14:57:39 +0000 (14:57 +0000)]
sfc: MSI-X is the only interrupt mode for EF10 VFs
Add min_interrupt_mode specification per NIC type.
It is a bit confusing because of "highest interrupt mode is less capable".
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
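The fallback rule from the two sfc commits above can be sketched as follows.
This is a hedged Python illustration, not the driver's code: the mode
numbering (MSI-X lowest, legacy highest) is an assumption based on the remark
that the "highest interrupt mode is less capable", and probe_interrupts() and
try_mode() are hypothetical names.

```python
from typing import Callable, Tuple

# Assumed numbering: a higher number means a less capable mode.
MSIX, MSI, LEGACY = 0, 1, 2


def probe_interrupts(start_mode: int, min_mode: int,
                     try_mode: Callable[[int], int]) -> Tuple[int, int]:
    """Try modes from start_mode up to min_mode (inclusive).

    try_mode(mode) returns 0 on success or a negative error code.
    Returns (mode, rc); if every attempt failed, rc is the error from
    the last permitted mode.
    """
    mode = start_mode
    while True:
        rc = try_mode(mode)
        if rc == 0 or mode >= min_mode:
            return mode, rc
        mode += 1  # fall back only while a less capable mode is permitted
```

With min_mode set to MSIX, as the series describes for EF10 VFs, a failed
MSI-X probe returns its own error and no other mode is attempted.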
David S. Miller [Tue, 14 Feb 2017 17:41:04 +0000 (12:41 -0500)]
Merge branch 'bridge-fdb-minor-cleanup'
Nikolay Aleksandrov says:
====================
bridge: minor fdb cleanup
These patches aim to simplify the bridge fdb API a little by removing some
redundant functions and converting them into wrappers of a single function.
Also add proper lock checking to avoid future mistakes for the search
functions.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Nikolay Aleksandrov [Mon, 13 Feb 2017 13:59:11 +0000 (14:59 +0100)]
bridge: fdb: converge fdb_delete_by functions into one
We can simplify the logic of entries pointing to the bridge by
converging the fdb_delete_by functions. This allows us to use the
same function for both cases: the fdb's dst is set to NULL if it is
pointing to the bridge, so we can always check for a port match.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nikolay Aleksandrov [Mon, 13 Feb 2017 13:59:10 +0000 (14:59 +0100)]
bridge: fdb: add proper lock checks in searching functions
In order to avoid new errors add checks to br_fdb_find and fdb_find_rcu
functions. The first requires hash_lock, the second obviously RCU.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nikolay Aleksandrov [Mon, 13 Feb 2017 13:59:09 +0000 (14:59 +0100)]
bridge: fdb: converge fdb searching functions into one
Before this patch we had 3 different fdb searching functions, which was
confusing. This patch reduces all of them to one, fdb_find_rcu(), with
two flavors: br_fdb_find(), which requires hash_lock, and br_fdb_find_rcu(),
which requires RCU. This makes it clear what needs to be used. We also
remove two abusers of __br_fdb_get which called it under hash_lock and
replace them with br_fdb_find().
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rui Sousa [Mon, 13 Feb 2017 02:01:25 +0000 (10:01 +0800)]
net: fec: fix multicast filtering hardware setup
Fix hardware setup of multicast address hash:
- Never clear the hardware hash (to avoid packet loss)
- Construct the hash register values in software and then write once
to hardware
Signed-off-by: Rui Sousa <rui.sousa@nxp.com>
Signed-off-by: Fugang Duan <fugang.duan@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
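The two points in the fec commit above (build the hash in software, then
write it to hardware once) can be sketched as follows. This is a rough Python
illustration under stated assumptions: zlib.crc32 stands in for whatever CRC
the hardware uses, the 64-bit table split over two 32-bit registers is
assumed, and the register names and write_reg() callback are hypothetical.

```python
import zlib
from typing import Callable, Iterable, Tuple

HASH_BITS = 6  # assumed: 64-entry hash table over two 32-bit registers


def build_mc_hash(addrs: Iterable[bytes]) -> Tuple[int, int]:
    """Accumulate the multicast hash in software; return (high, low)."""
    high = low = 0
    for addr in addrs:
        crc = zlib.crc32(addr) & 0xFFFFFFFF     # stand-in for the hardware CRC
        bit = (crc >> (32 - HASH_BITS)) & 0x3F  # top 6 bits pick one of 64 bits
        if bit > 31:
            high |= 1 << (bit - 32)
        else:
            low |= 1 << bit
    return high, low


def program_hash(write_reg: Callable[[str, int], None],
                 addrs: Iterable[bytes]) -> None:
    """Write each register exactly once, without clearing it first."""
    high, low = build_mc_hash(addrs)
    write_reg("HASH_HIGH", high)  # hypothetical register name
    write_reg("HASH_LOW", low)    # hypothetical register name
```

Because the registers are never zeroed before the final values are written,
there is no window in which an already-subscribed group is filtered out.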
David S. Miller [Tue, 14 Feb 2017 17:13:52 +0000 (12:13 -0500)]
Merge branch 'ipv6-v4mapped'
Jonathan T. Leighton says:
====================
IPv4-mapped on wire, :: dst address issue
Under some circumstances IPv6 datagrams are sent with IPv4-mapped IPv6
addresses as the source. Given an IPv6 socket bound to an IPv4-mapped
IPv6 address, and an IPv6 destination address, both TCP and UDP will
send packets using the IPv4-mapped IPv6 address as the source. Per
RFC 6890 (Table 20), IPv4-mapped IPv6 source addresses are not allowed
in an IP datagram. The problem can be observed by attempting to
connect() either a TCP or UDP socket, or by using sendmsg() with a UDP
socket. The patch is intended to correct this issue for all socket
types.
Linux follows the BSD convention that an IPv6 destination address
specified as in6addr_any is converted to the loopback address.
Currently, neither TCP nor UDP consider the possibility that the source
address is an IPv4-mapped IPv6 address, and assume that the appropriate
loopback address is ::1. The patch adds a check on whether or not the
source address is an IPv4-mapped IPv6 address and then sets the
destination address to either ::ffff:127.0.0.1 or ::1, as appropriate.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jonathan T. Leighton [Sun, 12 Feb 2017 22:26:07 +0000 (17:26 -0500)]
ipv6: Handle IPv4-mapped src to in6addr_any dst.
This patch adds a check on the type of the source address for the case
where the destination address is in6addr_any. If the source is an
IPv4-mapped IPv6 source address, the destination is changed to
::ffff:127.0.0.1, and otherwise the destination is changed to ::1. This
is done in three locations to handle UDP calls to either connect() or
sendmsg() and TCP calls to connect(). Note that udpv6_sendmsg() delays
handling an in6addr_any destination until very late, so the patch only
needs to handle the case where the source is an IPv4-mapped IPv6
address.
Signed-off-by: Jonathan T. Leighton <jtleight@udel.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
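The destination rewrite described in the commit above can be sketched as
follows. This is an illustrative Python sketch using the standard ipaddress
module, not the kernel implementation; resolve_any_dst() is a hypothetical
name.

```python
import ipaddress

IN6ADDR_ANY = ipaddress.IPv6Address("::")
LOOPBACK_V6 = ipaddress.IPv6Address("::1")
LOOPBACK_V4MAPPED = ipaddress.IPv6Address("::ffff:127.0.0.1")


def resolve_any_dst(src: ipaddress.IPv6Address,
                    dst: ipaddress.IPv6Address) -> ipaddress.IPv6Address:
    """Map an in6addr_any destination to the loopback address that
    matches the address family of the source."""
    if dst != IN6ADDR_ANY:
        return dst                 # only the "any" destination is rewritten
    if src.ipv4_mapped is not None:
        return LOOPBACK_V4MAPPED   # IPv4-mapped source: ::ffff:127.0.0.1
    return LOOPBACK_V6             # otherwise: ::1
```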
Jonathan T. Leighton [Sun, 12 Feb 2017 22:26:06 +0000 (17:26 -0500)]
ipv6: Inhibit IPv4-mapped src address on the wire.
This patch adds a check for the problematic case of an IPv4-mapped IPv6
source address and a destination address that is neither an IPv4-mapped
IPv6 address nor in6addr_any, and returns an appropriate error. The
check is done before returning from looking up the route.
Signed-off-by: Jonathan T. Leighton <jtleight@udel.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
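The check described in the commit above can be sketched as follows. This is
a hedged Python illustration: check_v4mapped_src() is a hypothetical name,
and the specific errno is an assumption, since the commit message only says
"an appropriate error".

```python
import errno
import ipaddress

IN6ADDR_ANY = ipaddress.IPv6Address("::")


def check_v4mapped_src(src: ipaddress.IPv6Address,
                       dst: ipaddress.IPv6Address) -> int:
    """Return 0 if the pair may go on the wire, a negative errno if not."""
    src_mapped = src.ipv4_mapped is not None
    dst_mapped = dst.ipv4_mapped is not None
    if src_mapped and not (dst_mapped or dst == IN6ADDR_ANY):
        return -errno.EAFNOSUPPORT  # assumed error code
    return 0
```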
Or Gerlitz [Sun, 12 Feb 2017 09:21:31 +0000 (11:21 +0200)]
net/mlx5e: Disable preemption when doing TC statistics upcall
When called by HW offloading drivers, the TC action (e.g.
net/sched/act_mirred.c) code uses this_cpu logic, e.g.
_bstats_cpu_update(this_cpu_ptr(a->cpu_bstats), bytes, packets)
Per the kernel documentation, preemption should be disabled, so add that.
Before the fix, when running with CONFIG_PREEMPT set, we get a
BUG: using smp_processor_id() in preemptible [00000000] code: tc/3793
assertion from the TC action (mirred) stats_update callback.
Fixes: aad7e08d39bd ('net/mlx5e: Hardware offloaded flower filter statistics support')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Volodymyr Bendiuga [Tue, 14 Feb 2017 10:29:30 +0000 (11:29 +0100)]
net:dsa:mv88e6xxx: use watchdog ops for 6097 chip
mv88e6097 chip requires watchdog_ops to be set.
Signed-off-by: Volodymyr Bendiuga <volodymyr.bendiuga@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Tue, 14 Feb 2017 15:27:13 +0000 (16:27 +0100)]
sched: Fix accidental removal of errout goto
Bring back the goto that was removed by accident.
Reported-by: Colin Ian King <colin.king@canonical.com>
Fixes: 40c81b25b16c ("sched: check negative err value to safe one level of indent")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Tue, 14 Feb 2017 14:29:21 +0000 (06:29 -0800)]
Merge tag 'media/v4.10-4' of git://git./linux/kernel/git/mchehab/linux-media
Pull media fixes from Mauro Carvalho Chehab:
"A colorspace regression fix in V4L2 core and a CEC core bug that makes
it discard valid messages"
* tag 'media/v4.10-4' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
[media] cec: initiator should be the same as the destination for, poll
[media] videodev2.h: go back to limited range Y'CbCr for SRGB and, ADOBERGB
Dan Carpenter [Mon, 13 Feb 2017 11:00:22 +0000 (14:00 +0300)]
net: qcom/emac: fix a sizeof() typo
We had intended to say "sizeof(u32)" but the "u" is missing.
Fortunately, sizeof(32) is also 4, so the original code still works.
Fixes: c4e7beea2192 ("net: qcom/emac: add ethtool support for reading hardware registers")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Timur Tabi <timur@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
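Why the typo in the commit above was harmless can be checked with a short
sketch. In C the literal 32 has type int, so sizeof(32) is sizeof(int). The
Python snippet below uses ctypes to compare the two sizes on the platform it
runs on; the equality holds wherever int is 32 bits wide, which is an
assumption about the platform rather than a language guarantee.

```python
import ctypes

# In C, the literal 32 has type int, so sizeof(32) == sizeof(int).
size_of_int_literal = ctypes.sizeof(ctypes.c_int)
# What the code was meant to say: sizeof(u32).
size_of_u32 = ctypes.sizeof(ctypes.c_uint32)

same_size = size_of_int_literal == size_of_u32
```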
Christophe Jaillet [Fri, 10 Feb 2017 20:17:19 +0000 (21:17 +0100)]
net: fs_enet: Simplify code
There is no need to use an intermediate variable to handle an error code
in this case.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Christophe Jaillet [Fri, 10 Feb 2017 20:17:06 +0000 (21:17 +0100)]
net: fs_enet: Fix an error handling path
'of_node_put(fpi->phy_node)' should also be called if we branch to
'out_deregister_fixed_link' error handling path.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 14 Feb 2017 03:24:16 +0000 (22:24 -0500)]
Merge tag 'rxrpc-rewrite-20170210' of git://git./linux/kernel/git/dhowells/linux-fs
David Howells says:
====================
afs: Use system UUID generation
There is now a general function for generating a UUID and AFS should make
use of it. It's also been recommended to me that I switch to using random
UUIDs rather than time plus MAC address-based ones, which this function does.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Tobias Klauser [Fri, 10 Feb 2017 15:43:50 +0000 (16:43 +0100)]
net: make net_device members garp_port and mrp_port conditional
garp_port is only used in net/802/garp.c which is only compiled with
CONFIG_GARP enabled. Same goes for mrp_port which is only used in
net/802/mrp.c with CONFIG_MRP enabled.
Only include the two members in struct net_device if their respective
CONFIG_* is enabled. This saves a few bytes in struct net_device in case
CONFIG_GARP or CONFIG_MRP are not enabled.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 10 Feb 2017 13:46:46 +0000 (05:46 -0800)]
net: busy-poll: remove LL_FLUSH_FAILED and LL_FLUSH_BUSY
Commit 79e7fff47b7b ("net: remove support for per driver
ndo_busy_poll()") made them obsolete.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 14 Feb 2017 03:23:23 +0000 (22:23 -0500)]
Merge branch '40GbE' of git://git./linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:
====================
40GbE Intel Wired LAN Driver Updates 2017-02-11
This series contains updates to i40e and i40evf only.
Jake makes a minor change to prevent a minor bit of work, if it is not
necessary. In the case where we do not have a client, there is no need
to check the client params, so move the check till after we have ensured
we have a client. Correct a code comment which incorrectly implied
that raw_packet buffers were freed in i40e_clean_tx_ring(), so fixed
the code comment to better explain where memory is freed. Reduce the
severity and frequency of the message notifying we cleared the receive
timestamp register, since the logic has a much better detection scheme
that could detect a stalled receive timestamp register. The improved
logic was actually causing the notification message to occur more
frequently and was giving the user a false perception that a timestamp
event was missed for a valid packet, so reduce the severity from
dev_warn to dev_dbg and only fire off the message when 3 or 4 of the
RXTIME registers are stalled and get cleared within the same
watchdog event. Fixed a bug, where we were modifying the mac_filter
outside a lock when handling the addition of broadcast filters. Fix
this by updating i40e_update_filter_state logic so that it knows to
avoid broadcast filters, which ensures that we do not have to remove
the filter separately and can put it back using the normal flow.
Refactored how we add new filters to firmware to avoid a race condition
that can occur due to removing filters from the hash temporarily.
Mitch adds a sleep (without timeout) so that we wait for a reply from
the PF before we continue, since the iWarp client cannot continue until
the operation is completed. Fixed up a function which could never
return an error, to be void and cleaned up the checking of the now
null and void return value.
Scott limits the DMA sync to CPU to the actual length of the incoming
packet, versus syncing the entire buffer. He also reduces the
receive buffer struct (by a single pointer) and aligns the driver to be
more consistent with other Intel drivers with respect to packets that
span buffers.
Sudheer adds a field to track the bus number info and modified log
statements to print bus, device and function information.
Henry adds the ability to store the FEC status bits from the link up
event. Also adds the ethtool support for FEC capabilities and 25G
link types.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 14 Feb 2017 03:17:06 +0000 (22:17 -0500)]
Merge branch 'rhashtable-allocation-failure-during-insertion'
Herbert Xu says:
====================
rhashtable: Handle table allocation failure during insertion
v2 -
Added Ack to patch 2.
Fixed RCU annotation in code path executed by rehasher by using
rht_dereference_bucket.
v1 -
This series tackles the problem of table allocation failures during
insertion. The issue is that we cannot vmalloc during insertion.
This series deals with this by introducing nested tables.
The first two patches remove manual hash table walks which cannot
work on a nested table.
The final patch introduces nested tables.
I've tested this with test_rhashtable and it appears to work.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Sat, 11 Feb 2017 11:26:47 +0000 (19:26 +0800)]
rhashtable: Add nested tables
This patch adds code that handles GFP_ATOMIC kmalloc failure on
insertion. As we cannot use vmalloc, we solve it by making our
hash table nested. That is, we allocate single pages at each level
and reach our desired table size by nesting them.
When a nested table is created, only a single page is allocated
at the top-level. Lower levels are allocated on demand during
insertion. Therefore for each insertion to succeed, only two
(non-consecutive) pages are needed.
After a nested table is created, a rehash will be scheduled in
order to switch to a vmalloced table as soon as possible. Also,
the rehash code will never rehash into a nested table. If we
detect a nested table during a rehash, the rehash will be aborted
and a new rehash will be scheduled.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
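The on-demand allocation described in the rhashtable commit above can be
sketched as follows. This is a simplified Python illustration, not the
kernel data structure: it models only two levels, PAGE_SLOTS is a
hypothetical and deliberately tiny page capacity, and Python lists stand in
for pages.

```python
from typing import List, Optional

PAGE_SLOTS = 4  # hypothetical slots per page, kept small for illustration


class NestedTable:
    """Two-level bucket table: one top-level page, leaf pages on demand."""

    def __init__(self, size: int) -> None:
        self.size = size
        n_leaves = -(-size // PAGE_SLOTS)  # ceiling division
        self.top: List[Optional[List[list]]] = [None] * n_leaves
        self.pages_allocated = 1  # only the top-level page exists at creation

    def bucket(self, index: int, create: bool = False) -> Optional[list]:
        """Return the bucket for index, allocating its leaf page if asked."""
        leaf_no, slot = divmod(index, PAGE_SLOTS)
        leaf = self.top[leaf_no]
        if leaf is None:
            if not create:
                return None  # lookups never allocate
            leaf = [[] for _ in range(PAGE_SLOTS)]
            self.top[leaf_no] = leaf
            self.pages_allocated += 1
        return leaf[slot]

    def insert(self, key: int) -> None:
        self.bucket(key % self.size, create=True).append(key)
```

In this model an insertion allocates at most one additional single page, and
no allocation ever needs physically contiguous pages.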
Herbert Xu [Sat, 11 Feb 2017 11:26:46 +0000 (19:26 +0800)]
tipc: Fix tipc_sk_reinit race conditions
There are two problems with the function tipc_sk_reinit. Firstly
it's doing a manual walk over an rhashtable. This is broken as
an rhashtable can be resized and if you manually walk over it
during a resize then you may miss entries.
Secondly it's missing memory barriers as previously the code used
spinlocks which provide the barriers implicitly.
This patch fixes both problems.
Fixes: 07f6c4bc048a ("tipc: convert tipc reference table to...")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Sat, 11 Feb 2017 11:26:45 +0000 (19:26 +0800)]
gfs2: Use rhashtable walk interface in glock_hash_walk
The function glock_hash_walk walks the rhashtable by hand. This
is broken because if it catches the hash table in the middle of
a rehash, then it will miss entries.
This patch replaces the manual walk by using the rhashtable walk
interface.
Fixes: 88ffbf3e037e ("GFS2: Use resizable hash table for glocks")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ralf Baechle [Fri, 10 Feb 2017 23:38:57 +0000 (00:38 +0100)]
NET: Fix /proc/net/arp for AX.25
When sending ARP requests over AX.25 links the hwaddress in the neighbour
cache is not getting initialized. For such an incomplete arp entry
ax2asc2 will generate an empty string resulting in /proc/net/arp output
like the following:
$ cat /proc/net/arp
IP address HW type Flags HW address Mask Device
192.168.122.1 0x1 0x2 52:54:00:00:5d:5f * ens3
172.20.1.99 0x3 0x0 * bpq0
The missing field will confuse the procfs parsing of arp(8) resulting in
incorrect output for the device such as the following:
$ arp
Address HWtype HWaddress Flags Mask Iface
gateway ether 52:54:00:00:5d:5f C ens3
172.20.1.99 (incomplete) ens3
This changes the content of /proc/net/arp to:
$ cat /proc/net/arp
IP address HW type Flags HW address Mask Device
172.20.1.99 0x3 0x0 * * bpq0
192.168.122.1 0x1 0x2 52:54:00:00:5d:5f * ens3
To do so it changes ax2asc to put the string "*" in buf for a NULL address
argument. Finally the HW address field is left aligned in a 17 character
field (the length of an ethernet HW address in the usual hex notation) for
readability.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
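The formatting rule described in the commit above (a "*" placeholder for a
missing address, left-aligned in a 17-character field) can be sketched as
follows. This is an illustrative Python sketch with a hypothetical function
name, not the kernel's seq_printf() call.

```python
from typing import Optional

HW_ADDR_WIDTH = 17  # length of an address such as "52:54:00:00:5d:5f"


def format_hw_address(hw: Optional[str]) -> str:
    """Return the HW address column, using "*" when no address is known."""
    text = hw if hw else "*"
    return f"{text:<{HW_ADDR_WIDTH}}"
```

Because the column is never empty, a parser that splits each row on
whitespace always sees the same number of fields.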
Mart van Santen [Fri, 10 Feb 2017 12:02:18 +0000 (12:02 +0000)]
xen-netback: vif counters from int/long to u64
This patch fixes an issue where the type of counters in the queue(s)
and interface are not in sync (queue counters are int, interface
counters are long), causing incorrect reporting of tx/rx values
of the vif interface and unclear counter overflows.
This patch sets both counters to the u64 type.
Signed-off-by: Mart van Santen <mart@greenhost.nl>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
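The counter mismatch described in the xen-netback commit above can be
illustrated with fixed-width integers. This is a rough Python sketch using
ctypes; it assumes a 32-bit int for the queue counters, and add_wrapped() is
a hypothetical helper.

```python
import ctypes


def add_wrapped(counter_type, value: int, delta: int) -> int:
    """Add delta to value and store the result in a fixed-width C integer
    type, which truncates silently as a C counter would."""
    return counter_type(value + delta).value


INT32_MAX = 2**31 - 1
as_int = add_wrapped(ctypes.c_int32, INT32_MAX, 1)   # wraps to a negative value
as_u64 = add_wrapped(ctypes.c_uint64, INT32_MAX, 1)  # keeps counting
```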
Pavel Belous [Thu, 9 Feb 2017 20:53:10 +0000 (23:53 +0300)]
net:ethernet:aquantia: Add 2500/5000 mbit link modes support.
Use the new link mode indices instead of the deprecated
SUPPORTED_/ADVERTISED_ macros.
Added indication for 2500 and 5000mbit link modes (AQtion adapter already
supports these speeds).
Signed-off-by: Pavel Belous <pavel.belous@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arnaldo Carvalho de Melo [Mon, 13 Feb 2017 17:15:44 +0000 (14:15 -0300)]
MAINTAINERS: Remove old e-mail address
The ghostprotocols.net domain is not working, remove it from CREDITS and
MAINTAINERS, and change the status to "Odd fixes", and since I haven't
been maintaining those, remove my address from there.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hans Verkuil [Sat, 11 Feb 2017 11:24:46 +0000 (09:24 -0200)]
[media] cec: initiator should be the same as the destination for, poll
Poll messages that are used to allocate a logical address should
use the same initiator as the destination. Instead, it expected that
the initiator was 0xf which is not according to the standard.
This also had consequences for the message checks in cec_transmit_msg_fh
that incorrectly rejected poll messages with the same initiator and
destination.
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
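The rule in the cec commit above can be sketched as follows. A CEC header
byte carries the initiator in the high nibble and the destination in the low
nibble, and a poll message consists of the header alone. The Python helpers
below are hypothetical illustrations, not the kernel's cec_transmit_msg_fh
checks.

```python
def cec_header(initiator: int, destination: int) -> int:
    """Pack initiator (high nibble) and destination (low nibble)."""
    return ((initiator & 0xF) << 4) | (destination & 0xF)


def is_address_claim_poll(msg: bytes) -> bool:
    """True for a header-only message whose initiator equals its
    destination, the form used when allocating a logical address."""
    if len(msg) != 1:
        return False
    return (msg[0] >> 4) == (msg[0] & 0xF)
```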
Hans Verkuil [Fri, 10 Feb 2017 09:18:36 +0000 (07:18 -0200)]
[media] videodev2.h: go back to limited range Y'CbCr for SRGB and, ADOBERGB
This reverts commit 7e0739cd9c40 ("[media] videodev2.h: fix
sYCC/AdobeYCC default quantization range").
The problem is that many drivers can convert R'G'B' content (often
from sensors) to Y'CbCr, but they all produce limited range Y'CbCr.
To stay backwards compatible the default quantization range for
sRGB and AdobeRGB Y'CbCr encoding should be limited range, not full
range, even though the corresponding standards specify full range.
Update the V4L2_MAP_QUANTIZATION_DEFAULT define accordingly and
also update the documentation.
Fixes: 7e0739cd9c40 ("[media] videodev2.h: fix sYCC/AdobeYCC default quantization range")
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Cc: <stable@vger.kernel.org> # for v4.9 and up
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
David S. Miller [Mon, 13 Feb 2017 14:30:22 +0000 (09:30 -0500)]
Merge branch 'mv88e6xxx-Watchdog-support'
Andrew Lunn says:
====================
mv88e6xxx Watchdog support
The Marvell switches have a built-in watchdog over some of the
internal state machine. The watchdog can be configured to raise an
interrupt on error. The problem the watchdog found is then logged to
the kernel log.
The older switches can automagically perform a software reset when the
watchdog triggers. This just resets the internal state machine, but
leaves the switch configuration unchanged.
The 6390 family of switches cannot both raise an interrupt and
automagically perform a software reset. So the interrupt handler has
to perform the switch reset, and then re-enable the watchdog
interrupts.
This has been tested using hacked together debugfs code which allows
the "force" bit to be set, so as to cause a watchdog interrupt.
v2: Remove g2_prefix
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Wed, 8 Feb 2017 23:03:43 +0000 (00:03 +0100)]
net: dsa: mv88e6xxx: Add mv88e6390 watchdog interrupt support
Implement the ops needed to support the watchdog for the MV88E6390
family.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>