Jakub Kicinski [Thu, 9 May 2019 23:19:34 +0000 (16:19 -0700)]
nfp: add missing kdoc
Add missing kdoc for app member.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 9 May 2019 23:37:40 +0000 (16:37 -0700)]
Merge branch 'tls-warnings'
Jakub Kicinski says:
====================
net/tls: fix W=1 build warnings
This small series cleans up two outstanding W=1 build
warnings in tls code. Both are set but not used variables.
The first case looks fairly straightforward. In the second
I think it's better to propagate the error code, even if
not doing some does not lead to a crash with current code.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 9 May 2019 23:14:07 +0000 (16:14 -0700)]
net/tls: handle errors from padding_length()
At the time padding_length() is called the record header
is still part of the message. If malicious TLS 1.3 peer
sends an all-zero record padding_length() will stop at
the record header, and return full length of the data
including the tail_size.
Subsequent subtraction of prot->overhead_size from rxm->full_len
will cause rxm->full_len to turn negative. skb accessors,
however, will always catch resulting out-of-bounds operation,
so in practice this fix comes down to returning the correct
error code. It also fixes a set but not used warning.
This code was added by commit
130b392c6cd6 ("net: tls: Add tls 1.3 support").
CC: Dave Watson <davejwatson@fb.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 9 May 2019 23:14:06 +0000 (16:14 -0700)]
net/tls: remove set but not used variables
Commit
4504ab0e6eb8 ("net/tls: Inform user space about send buffer availability")
made us report write_space regardless whether partial record
push was successful or not. Remove the now unused return value
to clean up the following W=1 warning:
net/tls/tls_device.c: In function ‘tls_device_write_space’:
net/tls/tls_device.c:546:6: warning: variable ‘rc’ set but not used [-Wunused-but-set-variable]
int rc = 0;
^~
CC: Vakul Garg <vakul.garg@nxp.com>
CC: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 9 May 2019 23:25:08 +0000 (16:25 -0700)]
Merge git://git./pub/scm/linux/kernel/git/bpf/bpf
Alexei Starovoitov says:
====================
pull-request: bpf 2019-05-09
The following pull-request contains BPF updates for your *net* tree.
The main changes are:
1) three small fixes from Gary, Jiong and Lorenz.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Gary Lin [Wed, 8 May 2019 07:54:48 +0000 (15:54 +0800)]
docs/btf: fix the missing section marks
The section titles of 3.4 and 3.5 are not marked correctly.
Signed-off-by: Gary Lin <glin@suse.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Jiong Wang [Tue, 7 May 2019 16:41:30 +0000 (17:41 +0100)]
nfp: bpf: fix static check error through tightening shift amount adjustment
NFP shift instruction has something special. If shift direction is left
then shift amount of 1 to 31 is specified as 32 minus the amount to shift.
But no need to do this for indirect shift which has shift amount be 0. Even
after we do this subtraction, shift amount 0 will be turned into 32 which
will eventually be encoded the same as 0 because only low 5 bits are
encoded, but shift amount be 32 will fail the FIELD_PREP check done later
on shift mask (0x1f), due to 32 is out of mask range. Such error has been
observed when compiling nfp/bpf/jit.c using gcc 8.3 + O3.
This issue has started when indirect shift support added after which the
incoming shift amount to __emit_shf could be 0, therefore it is at that
time shift amount adjustment inside __emit_shf should have been tightened.
Fixes: 991f5b3651f6 ("nfp: bpf: support logic indirect shifts (BPF_[L|R]SH | BPF_X)")
Reported-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Reported-by: Pablo Cascón <pablo.cascon@netronome.com
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Lorenz Bauer [Wed, 8 May 2019 16:49:32 +0000 (17:49 +0100)]
selftests: bpf: initialize bpf_object pointers where needed
There are a few tests which call bpf_object__close on uninitialized
bpf_object*, which may segfault. Explicitly zero-initialise these pointers
to avoid this.
Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
YueHaibing [Thu, 9 May 2019 14:52:20 +0000 (22:52 +0800)]
packet: Fix error path in packet_init
kernel BUG at lib/list_debug.c:47!
invalid opcode: 0000 [#1
CPU: 0 PID: 12914 Comm: rmmod Tainted: G W 5.1.0+ #47
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
RIP: 0010:__list_del_entry_valid+0x53/0x90
Code: 48 8b 32 48 39 fe 75 35 48 8b 50 08 48 39 f2 75 40 b8 01 00 00 00 5d c3 48
89 fe 48 89 c2 48 c7 c7 18 75 fe 82 e8 cb 34 78 ff <0f> 0b 48 89 fe 48 c7 c7 50 75 fe 82 e8 ba 34 78 ff 0f 0b 48 89 f2
RSP: 0018:
ffffc90001c2fe40 EFLAGS:
00010286
RAX:
000000000000004e RBX:
ffffffffa0184000 RCX:
0000000000000000
RDX:
0000000000000000 RSI:
ffff888237a17788 RDI:
00000000ffffffff
RBP:
ffffc90001c2fe40 R08:
0000000000000000 R09:
0000000000000000
R10:
ffffc90001c2fe10 R11:
0000000000000000 R12:
0000000000000000
R13:
ffffc90001c2fe50 R14:
ffffffffa0184000 R15:
0000000000000000
FS:
00007f3d83634540(0000) GS:
ffff888237a00000(0000) knlGS:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
CR2:
0000555c350ea818 CR3:
0000000231677000 CR4:
00000000000006f0
Call Trace:
unregister_pernet_operations+0x34/0x120
unregister_pernet_subsys+0x1c/0x30
packet_exit+0x1c/0x369 [af_packet
__x64_sys_delete_module+0x156/0x260
? lockdep_hardirqs_on+0x133/0x1b0
? do_syscall_64+0x12/0x1f0
do_syscall_64+0x6e/0x1f0
entry_SYSCALL_64_after_hwframe+0x49/0xbe
When modprobe af_packet, register_pernet_subsys
fails and does a cleanup, ops->list is set to LIST_POISON1,
but the module init is considered to success, then while rmmod it,
BUG() is triggered in __list_del_entry_valid which is called from
unregister_pernet_subsys. This patch fix error handing path in
packet_init to avoid possilbe issue if some error occur.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Wed, 8 May 2019 23:46:14 +0000 (16:46 -0700)]
net/tcp: use deferred jump label for TCP acked data hook
User space can flip the clean_acked_data_enabled static branch
on and off with TLS offload when CONFIG_TLS_DEVICE is enabled.
jump_label.h suggests we use the delayed version in this case.
Deferred branches now also don't take the branch mutex on
decrement, so we avoid potential locking issues.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kefeng Wang [Thu, 9 May 2019 15:32:35 +0000 (23:32 +0800)]
net: aquantia: fix undefined devm_hwmon_device_register_with_info reference
drivers/net/ethernet/aquantia/atlantic/aq_drvinfo.o: In function `aq_drvinfo_init':
aq_drvinfo.c:(.text+0xe8): undefined reference to `devm_hwmon_device_register_with_info'
Fix it by using #if IS_REACHABLE(CONFIG_HWMON).
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 9 May 2019 16:44:17 +0000 (09:44 -0700)]
Merge tag 'batadv-net-for-davem-
20190509' of git://git.open-mesh.org/linux-merge
Simon Wunderlich says:
====================
This feature/cleanup patchset includes the following patches:
- bump version strings, by Simon Wunderlich (we forgot to include
this patch previously ...)
- fix multicast tt/tvlv worker locking, by Linus Lüssing
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Oliver Neukum [Thu, 9 May 2019 09:08:18 +0000 (11:08 +0200)]
aqc111: fix double endianness swap on BE
If you are using a function that does a swap in place,
you cannot just reuse the buffer on the assumption that it has
not been changed.
Signed-off-by: Oliver Neukum <oneukum@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Oliver Neukum [Thu, 9 May 2019 09:08:17 +0000 (11:08 +0200)]
aqc111: fix writing to the phy on BE
When writing to the phy on BE architectures an internal data structure
was directly given, leading to it being byte swapped in the wrong
way for the CPU in 50% of all cases. A temporary buffer must be used.
Signed-off-by: Oliver Neukum <oneukum@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Oliver Neukum [Thu, 9 May 2019 09:08:16 +0000 (11:08 +0200)]
aqc111: fix endianness issue in aqc111_change_mtu
If the MTU is large enough, the first write to the device
is just repeated. On BE architectures, however, the first
word of the command will be swapped a second time and garbage
will be written. Avoid that.
Signed-off-by: Oliver Neukum <oneukum@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hangbin Liu [Thu, 9 May 2019 06:55:07 +0000 (14:55 +0800)]
vlan: disable SIOCSHWTSTAMP in container
With NET_ADMIN enabled in container, a normal user could be mapped to
root and is able to change the real device's rx filter via ioctl on
vlan, which would affect the other ptp process on host. Fix it by
disabling SIOCSHWTSTAMP in container.
Fixes: a6111d3c93d0 ("vlan: Pass SIOC[SG]HWTSTAMP ioctls to real device")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hangbin Liu [Thu, 9 May 2019 06:54:08 +0000 (14:54 +0800)]
macvlan: disable SIOCSHWTSTAMP in container
Miroslav pointed that with NET_ADMIN enabled in container, a normal user
could be mapped to root and is able to change the real device's rx
filter via ioctl on macvlan, which would affect the other ptp process on
host. Fix it by disabling SIOCSHWTSTAMP in container.
Fixes: 254c0a2bfedb ("macvlan: pass get_ts_info and SIOC[SG]HWTSTAMP ioctl to real device")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Parthasarathy Bhuvaragan [Thu, 9 May 2019 05:13:42 +0000 (07:13 +0200)]
tipc: fix hanging clients using poll with EPOLLOUT flag
commit
517d7c79bdb398 ("tipc: fix hanging poll() for stream sockets")
introduced a regression for clients using non-blocking sockets.
After the commit, we send EPOLLOUT event to the client even in
TIPC_CONNECTING state. This causes the subsequent send() to fail
with ENOTCONN, as the socket is still not in TIPC_ESTABLISHED state.
In this commit, we:
- improve the fix for hanging poll() by replacing sk_data_ready()
with sk_state_change() to wake up all clients.
- revert the faulty updates introduced by commit
517d7c79bdb398
("tipc: fix hanging poll() for stream sockets").
Fixes: 517d7c79bdb398 ("tipc: fix hanging poll() for stream sockets")
Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@gmail.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jason Wang [Thu, 9 May 2019 03:20:18 +0000 (23:20 -0400)]
tuntap: synchronize through tfiles array instead of tun->numqueues
When a queue(tfile) is detached through __tun_detach(), we move the
last enabled tfile to the position where detached one sit but don't
NULL out last position. We expect to synchronize the datapath through
tun->numqueues. Unfortunately, this won't work since we're lacking
sufficient mechanism to order or synchronize the access to
tun->numqueues.
To fix this, NULL out the last position during detaching and check
RCU protected tfile against NULL instead of checking tun->numqueues in
datapath.
Cc: YueHaibing <yuehaibing@huawei.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: weiyongjun (A) <weiyongjun1@huawei.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Fixes: c8d68e6be1c3b ("tuntap: multiqueue support")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jason Wang [Thu, 9 May 2019 03:20:17 +0000 (23:20 -0400)]
tuntap: fix dividing by zero in ebpf queue selection
We need check if tun->numqueues is zero (e.g for the persist device)
before trying to use it for modular arithmetic.
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Fixes: 96f84061620c6("tun: add eBPF based queue selection method")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cheng Han [Thu, 9 May 2019 03:13:41 +0000 (11:13 +0800)]
dwmac4_prog_mtl_tx_algorithms() missing write operation
net: ethernet: stmmac: dwmac4_prog_mtl_tx_algorithms() missing write operation
The value of MTL_OPERATION_MODE is not written back
Signed-off-by: Cheng Han <hancheng2009@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Claudiu Manoil [Thu, 9 May 2019 03:07:12 +0000 (03:07 +0000)]
ptp_qoriq: fix NULL access if ptp dt node missing
Make sure ptp dt node exists before accessing it in case
of NULL pointer call trace.
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pieter Jansen van Vuuren [Wed, 8 May 2019 22:56:07 +0000 (15:56 -0700)]
net/sched: avoid double free on matchall reoffload
Avoid freeing cls_mall.rule twice when failing to setup flow_action
offload used in the hardware intermediate representation. This is
achieved by returning 0 when the setup fails but the skip software
flag has not been set.
Fixes: f00cbf196814 ("net/sched: use the hardware intermediate representation for matchall")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pieter Jansen van Vuuren [Wed, 8 May 2019 22:52:56 +0000 (15:52 -0700)]
nfp: reintroduce ndo_get_port_parent_id for representor ports
NFP does not register devlink ports for representors (without
the "devlink: expose PF and VF representors as ports" series
there are no port flavours to expose them as).
Commit
c25f08ac65e4 ("nfp: remove ndo_get_port_parent_id implementation")
went to far in removing ndo_get_port_parent_id for representors.
This causes redirection offloads to fail, and switch_id attribute
missing.
Reintroduce the ndo_get_port_parent_id callback for representor ports.
Fixes: c25f08ac65e4 ("nfp: remove ndo_get_port_parent_id implementation")
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 8 May 2019 23:31:38 +0000 (16:31 -0700)]
Merge branch 'phy-realtek-delays'
Serge Semin says:
====================
net: phy: realtek: Fix RGMII TX/RX-delays initial config of rtl8211(e|f)
It has been discovered that RX/TX delays of rtl8211e ethernet PHY
can be configured via a MDIO register hidden in the extension pages
layout. Particularly the extension page 0xa4 provides a register 0x1c,
which bits 1 and 2 control the described delays. They are used to
implement the "rgmii-{id,rxid,txid}" phy-mode support in patch 1.
The second patch makes sure the rtl8211f TX-delay is configured only
if RGMII interface mode is specified including the rgmii-rxid one.
In other cases (most importantly for NA mode) the delays are supposed
to be preconfigured by some other software or hardware and should be
left as is without any modification. The similar thing is also done
for rtl8211e in the patch 1 of this series.
Changelog v3
- Add this cover-letter.
- Add Andrew' Reviewed-by tag to patch 1.
- Accept RGMII_RXID interface mode for rtl8211f and clear the TX_DELAY
bit in this case.
- Initialize ret variable with 0 to prevent the "may be used uninitialized"
warning in patch 1.
Changelog v4
- Rebase onto net-next
====================
Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Serge Semin [Wed, 8 May 2019 21:51:17 +0000 (00:51 +0300)]
net: phy: realtek: Change TX-delay setting for RGMII modes only
It's prone to problems if delay is cleared out for other than RGMII
modes. So lets set/clear the TX-delay in the config register only
if actually RGMII-like interface mode is requested. This only
concerns rtl8211f chips.
Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Serge Semin [Wed, 8 May 2019 21:51:15 +0000 (00:51 +0300)]
net: phy: realtek: Add rtl8211e rx/tx delays config
There are two chip pins named TXDLY and RXDLY which actually adds the 2ns
delays to TXC and RXC for TXD/RXD latching. Alas this is the only
documented info regarding the RGMII timing control configurations the PHY
provides. It turns out the same settings can be setup via MDIO registers
hidden in the extension pages layout. Particularly the extension page 0xa4
provides a register 0x1c, which bits 1 and 2 control the described delays.
They are used to implement the "rgmii-{id,rxid,txid}" phy-mode.
The hidden RGMII configs register utilization was found in the rtl8211e
U-boot driver:
https://elixir.bootlin.com/u-boot/v2019.01/source/drivers/net/phy/realtek.c#L99
There is also a freebsd-folks discussion regarding this register:
https://reviews.freebsd.org/D13591
It confirms that the register bits field must control the so called
configuration pins described in the table 12-13 of the official PHY
datasheet:
8:6 = PHY Address
5:4 = Auto-Negotiation
3 = Interface Mode Select
2 = RX Delay
1 = TX Delay
0 = SELRGV
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean [Wed, 8 May 2019 20:32:25 +0000 (23:32 +0300)]
net: dsa: sja1105: Don't return a negative in u8 sja1105_stp_state_get
Dan Carpenter says:
The patch
640f763f98c2: "net: dsa: sja1105: Add support for Spanning
Tree Protocol" from May 5, 2019, leads to the following static
checker warning:
drivers/net/dsa/sja1105/sja1105_main.c:1073 sja1105_stp_state_get()
warn: signedness bug returning '(-22)'
The caller doesn't check for negative errors anyway.
Fixes: 640f763f98c2: ("net: dsa: sja1105: Add support for Spanning Tree Protocol")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Wed, 8 May 2019 13:30:41 +0000 (14:30 +0100)]
net: dsa: sja1105: fix check on while loop exit
The while-loop exit condition check is not correct; the
loop should continue if the returns from the function calls are
negative or the CRC status returns are invalid. Currently it
is ignoring the returns from the function calls. Fix this by
removing the status return checks and only break from the loop
at the very end when we know that all the success condtions have
been met.
Kudos to Dan Carpenter for describing the correct fix and
Vladimir Oltean for noting the change to the check on the number
of retries.
Addresses-Coverity: ("Uninitialized scalar variable")
Fixes: 8aa9ebccae87 ("net: dsa: Introduce driver for NXP SJA1105 5-port L2 switch")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Tested-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wang Hai [Wed, 8 May 2019 13:43:26 +0000 (21:43 +0800)]
net: dsa: sja1105: Make 'sja1105et_regs' and 'sja1105pqrs_regs' static
drivers/net/dsa/sja1105/sja1105_spi.c:486:21: warning: symbol 'sja1105et_regs' was not declared. Should it be static?
drivers/net/dsa/sja1105/sja1105_spi.c:511:21: warning: symbol 'sja1105pqrs_regs' was not declared. Should it be static?
Fixes: 8aa9ebccae87 ("net: dsa: Introduce driver for NXP SJA1105 5-port L2 switch")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai26@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Paolo Abeni [Wed, 8 May 2019 13:32:51 +0000 (15:32 +0200)]
selinux: do not report error on connect(AF_UNSPEC)
calling connect(AF_UNSPEC) on an already connected TCP socket is an
established way to disconnect() such socket. After commit
68741a8adab9
("selinux: Fix ltp test connect-syscall failure") it no longer works
and, in the above scenario connect() fails with EAFNOSUPPORT.
Fix the above falling back to the generic/old code when the address family
is not AF_INET{4,6}, but leave the SCTP code path untouched, as it has
specific constraints.
Fixes: 68741a8adab9 ("selinux: Fix ltp test connect-syscall failure")
Reported-by: Tom Deseyn <tdeseyn@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Wed, 8 May 2019 10:51:35 +0000 (11:51 +0100)]
net: hns3: remove redundant assignment of l2_hdr to itself
The pointer l2_hdr is being assigned to itself, this is redundant
and can be removed.
Addresses-Coverity: ("Evaluation order violation")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Wed, 8 May 2019 10:22:09 +0000 (11:22 +0100)]
net: dsa: lantiq: fix spelling mistake "brigde" -> "bridge"
There are several spelling mistakes in dev_err messages. Fix these.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Geert Uytterhoeven [Wed, 8 May 2019 06:52:32 +0000 (08:52 +0200)]
openvswitch: Replace removed NF_NAT_NEEDED with IS_ENABLED(CONFIG_NF_NAT)
Commit
4806e975729f99c7 ("netfilter: replace NF_NAT_NEEDED with
IS_ENABLED(CONFIG_NF_NAT)") removed CONFIG_NF_NAT_NEEDED, but a new user
popped up afterwards.
Fixes: fec9c271b8f1bde1 ("openvswitch: load and reference the NAT helper.")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Florian Westphal <fw@strlen.de>
Acked-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern [Wed, 8 May 2019 03:44:59 +0000 (20:44 -0700)]
ipv4: Fix raw socket lookup for local traffic
inet_iif should be used for the raw socket lookup. inet_iif considers
rt_iif which handles the case of local traffic.
As it stands, ping to a local address with the '-I <dev>' option fails
ever since ping was changed to use SO_BINDTODEVICE instead of
cmsg + IP_PKTINFO.
IPv6 works fine.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hangbin Liu [Tue, 7 May 2019 09:11:18 +0000 (17:11 +0800)]
fib_rules: return 0 directly if an exactly same rule exists when NLM_F_EXCL not supplied
With commit
153380ec4b9 ("fib_rules: Added NLM_F_EXCL support to
fib_nl_newrule") we now able to check if a rule already exists. But this
only works with iproute2. For other tools like libnl, NetworkManager,
it still could add duplicate rules with only NLM_F_CREATE flag, like
[localhost ~ ]# ip rule
0: from all lookup local
32766: from all lookup main
32767: from all lookup default
100000: from 192.168.7.5 lookup 5
100000: from 192.168.7.5 lookup 5
As it doesn't make sense to create two duplicate rules, let's just return
0 if the rule exists.
Fixes: 153380ec4b9 ("fib_rules: Added NLM_F_EXCL support to fib_nl_newrule")
Reported-by: Thomas Haller <thaller@redhat.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Wed, 8 May 2019 05:03:58 +0000 (22:03 -0700)]
Merge git://git./linux/kernel/git/davem/net-next
Pull networking updates from David Miller:
"Highlights:
1) Support AES128-CCM ciphers in kTLS, from Vakul Garg.
2) Add fib_sync_mem to control the amount of dirty memory we allow to
queue up between synchronize RCU calls, from David Ahern.
3) Make flow classifier more lockless, from Vlad Buslov.
4) Add PHY downshift support to aquantia driver, from Heiner
Kallweit.
5) Add SKB cache for TCP rx and tx, from Eric Dumazet. This reduces
contention on SLAB spinlocks in heavy RPC workloads.
6) Partial GSO offload support in XFRM, from Boris Pismenny.
7) Add fast link down support to ethtool, from Heiner Kallweit.
8) Use siphash for IP ID generator, from Eric Dumazet.
9) Pull nexthops even further out from ipv4/ipv6 routes and FIB
entries, from David Ahern.
10) Move skb->xmit_more into a per-cpu variable, from Florian
Westphal.
11) Improve eBPF verifier speed and increase maximum program size,
from Alexei Starovoitov.
12) Eliminate per-bucket spinlocks in rhashtable, and instead use bit
spinlocks. From Neil Brown.
13) Allow tunneling with GUE encap in ipvs, from Jacky Hu.
14) Improve link partner cap detection in generic PHY code, from
Heiner Kallweit.
15) Add layer 2 encap support to bpf_skb_adjust_room(), from Alan
Maguire.
16) Remove SKB list implementation assumptions in SCTP, your's truly.
17) Various cleanups, optimizations, and simplifications in r8169
driver. From Heiner Kallweit.
18) Add memory accounting on TX and RX path of SCTP, from Xin Long.
19) Switch PHY drivers over to use dynamic featue detection, from
Heiner Kallweit.
20) Support flow steering without masking in dpaa2-eth, from Ioana
Ciocoi.
21) Implement ndo_get_devlink_port in netdevsim driver, from Jiri
Pirko.
22) Increase the strict parsing of current and future netlink
attributes, also export such policies to userspace. From Johannes
Berg.
23) Allow DSA tag drivers to be modular, from Andrew Lunn.
24) Remove legacy DSA probing support, also from Andrew Lunn.
25) Allow ll_temac driver to be used on non-x86 platforms, from Esben
Haabendal.
26) Add a generic tracepoint for TX queue timeouts to ease debugging,
from Cong Wang.
27) More indirect call optimizations, from Paolo Abeni"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1763 commits)
cxgb4: Fix error path in cxgb4_init_module
net: phy: improve pause mode reporting in phy_print_status
dt-bindings: net: Fix a typo in the phy-mode list for ethernet bindings
net: macb: Change interrupt and napi enable order in open
net: ll_temac: Improve error message on error IRQ
net/sched: remove block pointer from common offload structure
net: ethernet: support of_get_mac_address new ERR_PTR error
net: usb: smsc: fix warning reported by kbuild test robot
staging: octeon-ethernet: Fix of_get_mac_address ERR_PTR check
net: dsa: support of_get_mac_address new ERR_PTR error
net: dsa: sja1105: Fix status initialization in sja1105_get_ethtool_stats
vrf: sit mtu should not be updated when vrf netdev is the link
net: dsa: Fix error cleanup path in dsa_init_module
l2tp: Fix possible NULL pointer dereference
taprio: add null check on sched_nest to avoid potential null pointer dereference
net: mvpp2: cls: fix less than zero check on a u32 variable
net_sched: sch_fq: handle non connected flows
net_sched: sch_fq: do not assume EDT packets are ordered
net: hns3: use devm_kcalloc when allocating desc_cb
net: hns3: some cleanup for struct hns3_enet_ring
...
Linus Torvalds [Wed, 8 May 2019 04:55:37 +0000 (21:55 -0700)]
Merge tag 'devicetree-for-5.2' of git://git./linux/kernel/git/robh/linux
Pull Devicetree updates from Rob Herring:
- Fix possible memory leak in reserved-memory failure case
- Support for DMA parent bus which are not a parent node
- Clang -Wunsequenced fix
- Remove some unnecessary prints on memory alloc failures
- Various printk msg and comment fixes
- Update DT schema tools repository location
- Convert simple-framebuffer binding to DT schema
- Bindings for isl68137 and ir38064 trivial devices
- New documentation on binding do's and don't's for binding writers to
ignore
* tag 'devicetree-for-5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (22 commits)
of: unittest: Remove error printing on OOM
of: irq: Remove WARN_ON() for kzalloc() failure
dt-bindings: pinctrl: fix bias-pull,up typo
dt-bindings: Update schema project location to devicetree.org github group
of: fix clang -Wunsequenced for be32_to_cpu()
of/device.c: fix the wrong comments
dt-bindings: Add isl68137 as a trivial device
dt-bindings: Add ir38064 as a trivial device
of: del redundant type conversion
dt-bindings: mfd: axp20x: Add fallback for axp805
of: Improve of_phandle_iterator_next() error message
dt-bindings: connector: Spelling mistake
dt-bindings: Add schemas for simple-framebuffer
of: address: Add support for the parent DMA bus
of: address: Retrieve a parent through a callback in __of_translate_address
dt-bindings: bus: Add binding for the Allwinner MBUS controller
dt-bindings: interconnect: Add a dma interconnect name
of: use correct function prototype for of_overlay_fdt_apply()
of: reserved_mem: fix reserve memory leak
of: property: Document that of_graph_get_endpoint_by_regs needs of_node_put
...
Linus Torvalds [Wed, 8 May 2019 04:42:23 +0000 (21:42 -0700)]
Merge tag 'random_for_linus' of git://git./linux/kernel/git/tytso/random
Pull randomness updates from Ted Ts'o:
- initialize the random driver earler
- fix CRNG initialization when we trust the CPU's RNG on NUMA systems
- other miscellaneous cleanups and fixes.
* tag 'random_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random:
random: add a spinlock_t to struct batched_entropy
random: document get_random_int() family
random: fix CRNG initialization when random.trust_cpu=1
random: move rand_initialize() earlier
random: only read from /dev/random after its pool has received 128 bits
drivers/char/random.c: make primary_crng static
drivers/char/random.c: remove unused stuct poolinfo::poolbits
drivers/char/random.c: constify poolinfo_table
Linus Torvalds [Wed, 8 May 2019 04:28:04 +0000 (21:28 -0700)]
Merge tag 'fscrypt_for_linus' of git://git./fs/fscrypt/fscrypt
Pull fscrypt updates from Ted Ts'o:
"Clean up fscrypt's dcache revalidation support, and other
miscellaneous cleanups"
* tag 'fscrypt_for_linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt:
fscrypt: cache decrypted symlink target in ->i_link
vfs: use READ_ONCE() to access ->i_link
fscrypt: fix race where ->lookup() marks plaintext dentry as ciphertext
fscrypt: only set dentry_operations on ciphertext dentries
fs, fscrypt: clear DCACHE_ENCRYPTED_NAME when unaliasing directory
fscrypt: fix race allowing rename() and link() of ciphertext dentries
fscrypt: clean up and improve dentry revalidation
fscrypt: use READ_ONCE() to access ->i_crypt_info
fscrypt: remove WARN_ON_ONCE() when decryption fails
fscrypt: drop inode argument from fscrypt_get_ctx()
Linus Torvalds [Wed, 8 May 2019 04:12:44 +0000 (21:12 -0700)]
Merge tag 'ext4_for_linus' of git://git./linux/kernel/git/tytso/ext4
Pull ext4 updates from Ted Ts'o:
"Add as a feature case-insensitive directories (the casefold feature)
using Unicode 12.1.
Also, the usual largish number of cleanups and bug fixes"
* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (25 commits)
ext4: export /sys/fs/ext4/feature/casefold if Unicode support is present
ext4: fix ext4_show_options for file systems w/o journal
unicode: refactor the rule for regenerating utf8data.h
docs: ext4.rst: document case-insensitive directories
ext4: Support case-insensitive file name lookups
ext4: include charset encoding information in the superblock
MAINTAINERS: add Unicode subsystem entry
unicode: update unicode database unicode version 12.1.0
unicode: introduce test module for normalized utf8 implementation
unicode: implement higher level API for string handling
unicode: reduce the size of utf8data[]
unicode: introduce code for UTF-8 normalization
unicode: introduce UTF-8 character database
ext4: actually request zeroing of inode table after grow
ext4: cond_resched in work-heavy group loops
ext4: fix use-after-free race with debug_want_extra_isize
ext4: avoid drop reference to iloc.bh twice
ext4: ignore e_value_offs for xattrs with value-in-ea-inode
ext4: protect journal inode's blocks using block_validity
ext4: use BUG() instead of BUG_ON(1)
...
Linus Torvalds [Wed, 8 May 2019 03:51:58 +0000 (20:51 -0700)]
Merge tag 'afs-next-
20190507' of git://git./linux/kernel/git/dhowells/linux-fs
Pull AFS updates from David Howells:
"A set of fix and development patches for AFS for 5.2.
Summary:
- Fix the AFS file locking so that sqlite can run on an AFS mount and
also so that firefox and gnome can use a homedir that's mounted
through AFS.
This required emulation of fine-grained locking when the server
will only support whole-file locks and no upgrade/downgrade. Four
modes are provided, settable by mount parameter:
"flock=local" - No reference to the server
"flock=openafs" - Fine-grained locks are local-only, whole-file
locks require sufficient server locks
"flock=strict" - All locks require sufficient server locks
"flock=write" - Always get an exclusive server lock
If the volume is a read-only or backup volume, then flock=local for
that volume.
- Log extra information for a couple of cases where the client mucks
up somehow: AFS vnode with undefined type and dir check failure -
in both cases we seem to end up with unfilled data, but the issues
happen infrequently and are difficult to reproduce at will.
- Implement silly rename for unlink() and rename().
- Set i_blocks so that du can get some information about usage.
- Fix xattr handlers to return the right amount of data and to not
overflow buffers.
- Implement getting/setting raw AFS and YFS ACLs as xattrs"
* tag 'afs-next-
20190507' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
afs: Implement YFS ACL setting
afs: Get YFS ACLs and information through xattrs
afs: implement acl setting
afs: Get an AFS3 ACL as an xattr
afs: Fix getting the afs.fid xattr
afs: Fix the afs.cell and afs.volume xattr handlers
afs: Calculate i_blocks based on file size
afs: Log more information for "kAFS: AFS vnode with undefined type\n"
afs: Provide mount-time configurable byte-range file locking emulation
afs: Add more tracepoints
afs: Implement sillyrename for unlink and rename
afs: Add directory reload tracepoint
afs: Handle lock rpc ops failing on a file that got deleted
afs: Improve dir check failure reports
afs: Add file locking tracepoints
afs: Further fix file locking
afs: Fix AFS file locking to allow fine grained locks
afs: Calculate lock extend timer from set/extend reply reception
afs: Split wait from afs_make_call()
Linus Torvalds [Wed, 8 May 2019 03:50:27 +0000 (20:50 -0700)]
Merge branch 'work.misc' of git://git./linux/kernel/git/viro/vfs
Pull misc vfs updates from Al Viro:
"Assorted stuff, with no common topic whatsoever..."
* 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
libfs: document simple_get_link()
Documentation/filesystems/Locking: fix ->get_link() prototype
Documentation/filesystems/vfs.txt: document how ->i_link works
Documentation/filesystems/vfs.txt: remove bogus "Last updated" date
fs: use timespec64 in relatime_need_update
fs/block_dev.c: remove unused include
Linus Torvalds [Wed, 8 May 2019 03:34:21 +0000 (20:34 -0700)]
Merge branch 'work.file' of git://git./linux/kernel/git/viro/vfs
Pull vfs 'struct file' related updates from Al Viro:
"A bit more of 'this fget() would be better off as fdget()'
whack-a-mole + a couple of ->f_count-related cleanups"
* 'work.file' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
media: switch to fdget()
drm_syncobj: switch to fdget()
amdgpu: switch to fdget()
don't open-code file_count()
fs: drop unused fput_atomic definition
Linus Torvalds [Wed, 8 May 2019 03:17:51 +0000 (20:17 -0700)]
Merge branch 'work.mount-syscalls' of git://git./linux/kernel/git/viro/vfs
Pull mount ABI updates from Al Viro:
"The syscalls themselves, finally.
That's not all there is to that stuff, but switching individual
filesystems to new methods is fortunately independent from everything
else, so e.g. NFS series can go through NFS tree, etc.
As those conversions get done, we'll be finally able to get rid of a
bunch of duplication in fs/super.c introduced in the beginning of the
entire thing. I expect that to be finished in the next window..."
* 'work.mount-syscalls' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
vfs: Add a sample program for the new mount API
vfs: syscall: Add fspick() to select a superblock for reconfiguration
vfs: syscall: Add fsmount() to create a mount for a superblock
vfs: syscall: Add fsconfig() for configuring and managing a context
vfs: Implement logging through fs_context
vfs: syscall: Add fsopen() to prepare for superblock creation
Make anon_inodes unconditional
teach move_mount(2) to work with OPEN_TREE_CLONE
vfs: syscall: Add move_mount(2) to move mounts around
vfs: syscall: Add open_tree(2) to reference or clone a mount
Linus Torvalds [Wed, 8 May 2019 03:03:32 +0000 (20:03 -0700)]
Merge branch 'work.dcache' of git://git./linux/kernel/git/viro/vfs
Pull misc dcache updates from Al Viro:
"Most of this pile is putting name length into struct name_snapshot and
making use of it.
The beginning of this series ("ovl_lookup_real_one(): don't bother
with strlen()") ought to have been split in two (separate switch of
name_snapshot to struct qstr from overlayfs reaping the trivial
benefits of that), but I wanted to avoid a rebase - by the time I'd
spotted that it was (a) in -next and (b) close to 5.1-final ;-/"
* 'work.dcache' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
audit_compare_dname_path(): switch to const struct qstr *
audit_update_watch(): switch to const struct qstr *
inotify_handle_event(): don't bother with strlen()
fsnotify: switch send_to_group() and ->handle_event to const struct qstr *
fsnotify(): switch to passing const struct qstr * for file_name
switch fsnotify_move() to passing const struct qstr * for old_name
ovl_lookup_real_one(): don't bother with strlen()
sysv: bury the broken "quietly truncate the long filenames" logics
nsfs: unobfuscate
unexport d_alloc_pseudo()
Linus Torvalds [Wed, 8 May 2019 02:34:17 +0000 (19:34 -0700)]
Merge branch 'parisc-5.2-1' of git://git./linux/kernel/git/deller/parisc-linux
Pull parisc updates from Helge Deller:
"Many great new features, fixes and optimizations, including:
- Convert page table updates to use per-pagetable spinlocks which
overall improves performance on SMP machines a lot, by Mikulas
Patocka
- Kernel debugger (KGDB) support, by Sven Schnelle
- KPROBES support, by Sven Schnelle
- Lots of TLB lock/flush improvements, by Dave Anglin
- Drop DISCONTIGMEM and switch to SPARSEMEM
- Added JUMP_LABEL, branch runtime-patching support
- Lots of other small speedups and cleanups, e.g. for QEMU, stack
randomization, avoidance of name clashes, documentation updates,
etc ..."
* 'parisc-5.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: (28 commits)
parisc: Add static branch and JUMP_LABEL feature
parisc: Use PA_ASM_LEVEL in boot code
parisc: Rename LEVEL to PA_ASM_LEVEL to avoid name clash with DRBD code
parisc: Update huge TLB page support to use per-pagetable spinlock
parisc: Use per-pagetable spinlock
parisc: Allow live-patching of __meminit functions
parisc: Add memory barrier to asm pdc and sync instructions
parisc: Add memory clobber to TLB purges
parisc: Use ldcw instruction for SMP spinlock release barrier
parisc: Remove lock code to serialize TLB operations in pacache.S
parisc: Switch from DISCONTIGMEM to SPARSEMEM
parisc: enable wide mode early
parisc: update feature lists
parisc: Show n/a if product number not available
parisc: remove unused flags parameter in __patch_text()
doc: update kprobes supported architecture list
parisc: Implement kretprobes
parisc: remove kprobes.h from generic-y
parisc: Implement kprobes
parisc: add functions required by KPROBE_EVENTS
...
Linus Torvalds [Wed, 8 May 2019 02:06:04 +0000 (19:06 -0700)]
Merge tag 'audit-pr-
20190507' of git://git./linux/kernel/git/pcmoore/audit
Pull audit updates from Paul Moore:
"We've got a reasonably broad set of audit patches for the v5.2 merge
window, the highlights are below:
- The biggest change, and the source of all the arch/* changes, is
the patchset from Dmitry to help enable some of the work he is
doing around PTRACE_GET_SYSCALL_INFO.
To be honest, including this in the audit tree is a bit of a
stretch, but it does help move audit a little further along towards
proper syscall auditing for all arches, and everyone else seemed to
agree that audit was a "good" spot for this to land (or maybe they
just didn't want to merge it? dunno.).
- We can now audit time/NTP adjustments.
- We continue the work to connect associated audit records into a
single event"
* tag 'audit-pr-
20190507' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit: (21 commits)
audit: fix a memory leak bug
ntp: Audit NTP parameters adjustment
timekeeping: Audit clock adjustments
audit: purge unnecessary list_empty calls
audit: link integrity evm_write_xattrs record to syscall event
syscall_get_arch: add "struct task_struct *" argument
unicore32: define syscall_get_arch()
Move EM_UNICORE to uapi/linux/elf-em.h
nios2: define syscall_get_arch()
nds32: define syscall_get_arch()
Move EM_NDS32 to uapi/linux/elf-em.h
m68k: define syscall_get_arch()
hexagon: define syscall_get_arch()
Move EM_HEXAGON to uapi/linux/elf-em.h
h8300: define syscall_get_arch()
c6x: define syscall_get_arch()
arc: define syscall_get_arch()
Move EM_ARCOMPACT and EM_ARCV2 to uapi/linux/elf-em.h
audit: Make audit_log_cap and audit_copy_inode static
audit: connect LOGIN record to its syscall record
...
Linus Torvalds [Wed, 8 May 2019 01:48:09 +0000 (18:48 -0700)]
Merge tag 'selinux-pr-
20190507' of git://git./linux/kernel/git/pcmoore/selinux
Pull selinux updates from Paul Moore:
"We've got a few SELinux patches for the v5.2 merge window, the
highlights are below:
- Add LSM hooks, and the SELinux implementation, for proper labeling
of kernfs. While we are only including the SELinux implementation
here, the rest of the LSM folks have given the hooks a thumbs-up.
- Update the SELinux mdp (Make Dummy Policy) script to actually work
on a modern system.
- Disallow userspace to change the LSM credentials via
/proc/self/attr when the task's credentials are already overridden.
The change was made in procfs because all the LSM folks agreed this
was the Right Thing To Do and duplicating it across each LSM was
going to be annoying"
* tag 'selinux-pr-
20190507' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
proc: prevent changes to overridden credentials
selinux: Check address length before reading address family
kernfs: fix xattr name handling in LSM helpers
MAINTAINERS: update SELinux file patterns
selinux: avoid uninitialized variable warning
selinux: remove useless assignments
LSM: lsm_hooks.h - fix missing colon in docstring
selinux: Make selinux_kernfs_init_security static
kernfs: initialize security of newly created nodes
selinux: implement the kernfs_init_security hook
LSM: add new hook for kernfs node initialization
kernfs: use simple_xattrs for security attributes
selinux: try security xattr after genfs for kernfs filesystems
kernfs: do not alloc iattrs in kernfs_xattr_get
kernfs: clean up struct kernfs_iattrs
scripts/selinux: fix build
selinux: use kernel linux/socket.h for genheaders and mdp
scripts/selinux: modernize mdp
Linus Torvalds [Wed, 8 May 2019 01:45:27 +0000 (18:45 -0700)]
Merge branch 'stable/for-linus-5.2' of git://git./linux/kernel/git/konrad/swiotlb
Pull swiotlb updates from Konrad Rzeszutek Wilk:
"Cleanups in the swiotlb code and extra debugfs knobs to help with the
field diagnostics"
* 'stable/for-linus-5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb:
swiotlb-xen: ensure we have a single callsite for xen_dma_map_page
swiotlb-xen: simplify the DMA sync method implementations
swiotlb-xen: use ->map_page to implement ->map_sg
swiotlb-xen: make instances match their method names
swiotlb: save io_tlb_used to local variable before leaving critical section
swiotlb: dump used and total slots when swiotlb buffer is full
Linus Torvalds [Wed, 8 May 2019 01:43:36 +0000 (18:43 -0700)]
Merge tag 'for-5.2/libata-
20190507' of git://git.kernel.dk/linux-block
Pull libata updates from Jens Axboe:
"Just two minor fixes queued up for 5.2, adding support for two
different platforms to ahci"
* tag 'for-5.2/libata-
20190507' of git://git.kernel.dk/linux-block:
ahci: qoriq: add ls1028a platforms support
ahci: qoriq: add lx2160 platforms support
Linus Torvalds [Wed, 8 May 2019 01:30:11 +0000 (18:30 -0700)]
Merge tag 'for-5.2/io_uring-
20190507' of git://git.kernel.dk/linux-block
Pull io_uring updates from Jens Axboe:
"Set of changes/improvements for io_uring. This contains:
- Fix of a shadowed variable (Colin)
- Add support for draining commands (me)
- Add support for sync_file_range() (me)
- Add eventfd support (me)
- cpu_online() fix (Shenghui)
- Removal of a redundant ->error assignment (Stefan)"
* tag 'for-5.2/io_uring-
20190507' of git://git.kernel.dk/linux-block:
io_uring: use cpu_online() to check p->sq_thread_cpu instead of cpu_possible()
io_uring: fix shadowed variable ret return code being not checked
req->error only used for iopoll
io_uring: add support for eventfd notifications
io_uring: add support for IORING_OP_SYNC_FILE_RANGE
fs: add sync_file_range() helper
io_uring: add support for marking commands as draining
Linus Torvalds [Wed, 8 May 2019 01:14:36 +0000 (18:14 -0700)]
Merge tag 'for-5.2/block-
20190507' of git://git.kernel.dk/linux-block
Pull block updates from Jens Axboe:
"Nothing major in this series, just fixes and improvements all over the
map. This contains:
- Series of fixes for sed-opal (David, Jonas)
- Fixes and performance tweaks for BFQ (via Paolo)
- Set of fixes for bcache (via Coly)
- Set of fixes for md (via Song)
- Enabling multi-page for passthrough requests (Ming)
- Queue release fix series (Ming)
- Device notification improvements (Martin)
- Propagate underlying device rotational status in loop (Holger)
- Removal of mtip32xx trim support, which has been disabled for years
(Christoph)
- Improvement and cleanup of nvme command handling (Christoph)
- Add block SPDX tags (Christoph)
- Cleanup/hardening of bio/bvec iteration (Christoph)
- A few NVMe pull requests (Christoph)
- Removal of CONFIG_LBDAF (Christoph)
- Various little fixes here and there"
* tag 'for-5.2/block-
20190507' of git://git.kernel.dk/linux-block: (164 commits)
block: fix mismerge in bvec_advance
block: don't drain in-progress dispatch in blk_cleanup_queue()
blk-mq: move cancel of hctx->run_work into blk_mq_hw_sysfs_release
blk-mq: always free hctx after request queue is freed
blk-mq: split blk_mq_alloc_and_init_hctx into two parts
blk-mq: free hw queue's resource in hctx's release handler
blk-mq: move cancel of requeue_work into blk_mq_release
blk-mq: grab .q_usage_counter when queuing request from plug code path
block: fix function name in comment
nvmet: protect discovery change log event list iteration
nvme: mark nvme_core_init and nvme_core_exit static
nvme: move command size checks to the core
nvme-fabrics: check more command sizes
nvme-pci: check more command sizes
nvme-pci: remove an unneeded variable initialization
nvme-pci: unquiesce admin queue on shutdown
nvme-pci: shutdown on timeout during deletion
nvme-pci: fix psdt field for single segment sgls
nvme-multipath: don't print ANA group state by default
nvme-multipath: split bios with the ns_head bio_set before submitting
...
Linus Torvalds [Wed, 8 May 2019 01:02:51 +0000 (18:02 -0700)]
Merge tag 'leds-for-5.2-rc1' of git://git./linux/kernel/git/j.anaszewski/linux-leds
Pull LED updates from Jacek Anaszewski:
"LED core fixes and improvements:
- avoid races with workqueue
- Kconfig: pedantic cleanup
- small fixes for Flash class description
leds-lt3593:
- remove unneeded assignment in lt3593_led_probe
- drop pdata handling code
leds-blinkm:
- clean up double assignment to data->i2c_addr
leds-pca955x, leds-pca963x:
- revert ACPI support, as it turned out that there is no evidence
of officially registered ACPI IDs for these devices.
- make use of device property API
leds-as3645a:
- switch to fwnode property API
LED related addition to ACPI documentation:
- document how to refer to LEDs from remote nodes
LED related fix to ALSA line6/toneport driver:
- avoid polluting led_* namespace
And lm3532 driver relocation from MFD to LED subsystem, accompanied by
various improvements and optimizations; it entails also a change in
omap4-droid4-xt894.dts:
- leds: lm3532: Introduce the lm3532 LED driver
- mfd: ti-lmu: Remove LM3532 backlight driver references
- ARM: dts: omap4-droid4: Update backlight dt properties
- dt: lm3532: Add lm3532 dt doc and update ti_lmu doc"
* tag 'leds-for-5.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds:
leds: avoid races with workqueue
ALSA: line6: Avoid polluting led_* namespace
leds: lm3532: Introduce the lm3532 LED driver
mfd: ti-lmu: Remove LM3532 backlight driver references
ARM: dts: omap4-droid4: Update backlight dt properties
dt: lm3532: Add lm3532 dt doc and update ti_lmu doc
leds: Small fixes for Flash class description
leds: blinkm: clean up double assignment to data->i2c_addr
leds: pca963x: Make use of device property API
leds: pca955x: Make use of device property API
leds: lt3593: Remove unneeded assignment in lt3593_led_probe
leds: lt3593: drop pdata handling code
leds: pca955x: Revert "Add ACPI support"
leds: pca963x: Revert "Add ACPI support"
drivers: leds: Kconfig: pedantic cleanups
ACPI: Document how to refer to LEDs from remote nodes
leds: as3645a: Switch to fwnode property API
David S. Miller [Wed, 8 May 2019 00:22:09 +0000 (17:22 -0700)]
Merge git://git./linux/kernel/git/davem/net
Minor conflict with the DSA legacy code removal.
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Tue, 7 May 2019 20:39:22 +0000 (13:39 -0700)]
Merge tag 'char-misc-5.2-rc1-part2' of git://git./linux/kernel/git/gregkh/char-misc
Pull char/misc update part 2 from Greg KH:
"Here is the "real" big set of char/misc driver patches for 5.2-rc1
Loads of different driver subsystem stuff in here, all over the places:
- thunderbolt driver updates
- habanalabs driver updates
- nvmem driver updates
- extcon driver updates
- intel_th driver updates
- mei driver updates
- coresight driver updates
- soundwire driver cleanups and updates
- fastrpc driver updates
- other minor driver updates
- chardev minor fixups
Feels like this tree is getting to be a dumping ground of "small
driver subsystems" these days. Which is fine with me, if it makes
things easier for those subsystem maintainers.
All of these have been in linux-next for a while with no reported
issues"
* tag 'char-misc-5.2-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (255 commits)
intel_th: msu: Add current window tracking
intel_th: msu: Add a sysfs attribute to trigger window switch
intel_th: msu: Correct the block wrap detection
intel_th: Add switch triggering support
intel_th: gth: Factor out trace start/stop
intel_th: msu: Factor out pipeline draining
intel_th: msu: Switch over to scatterlist
intel_th: msu: Replace open-coded list_{first,last,next}_entry variants
intel_th: Only report useful IRQs to subdevices
intel_th: msu: Start handling IRQs
intel_th: pci: Use MSI interrupt signalling
intel_th: Communicate IRQ via resource
intel_th: Add "rtit" source device
intel_th: Skip subdevices if their MMIO is missing
intel_th: Rework resource passing between glue layers and core
intel_th: SPDX-ify the documentation
intel_th: msu: Fix single mode with IOMMU
coresight: funnel: Support static funnel
dt-bindings: arm: coresight: Unify funnel DT binding
coresight: replicator: Add new device id for static replicator
...
Linus Torvalds [Tue, 7 May 2019 20:33:31 +0000 (13:33 -0700)]
Merge tag 'char-misc-5.2-rc1-part1' of git://git./linux/kernel/git/gregkh/char-misc
Pull char/misc update part 1 from Greg KH:
"This contains only a small number of bugfixes that would have gone to
you for 5.1-rc8 if that had happened, but instead I let them sit in
linux-next for an extra week just "to be sure".
The "big" patch here is for hyper-v, fixing a bug in their sysfs files
that could cause big problems. The others are all small fixes,
resolving reported issues that showed up in 5.1-rcs, plus some odd
'static' cleanups for the phy drivers that really should have waited
for -rc1. Most of these are tagged for the stable trees, so 5.1 will
pick them up.
All of these have been in linux-next for a while, with no reported
issues"
* tag 'char-misc-5.2-rc1-part1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
misc: rtsx: Fixed rts5260 power saving parameter and sd glitch
binder: take read mode of mmap_sem in binder_alloc_free_page()
intel_th: pci: Add Comet Lake support
stm class: Fix channel bitmap on 32-bit systems
stm class: Fix channel free in stm output free path
phy: sun4i-usb: Make sure to disable PHY0 passby for peripheral mode
phy: fix platform_no_drv_owner.cocci warnings
phy: mapphone-mdm6600: add gpiolib dependency
phy: ti: usb2: fix OMAP_CONTROL_PHY dependency
phy: allwinner: allow compile testing
phy: qcom-ufs: Make ufs_qcom_phy_disable_iface_clk static
phy: rockchip-typec: Make usb3_pll_cfg and dp_pll_cfg static
phy: phy-twl4030-usb: Fix cable state handling
Drivers: hv: vmbus: Remove the undesired put_cpu_ptr() in hv_synic_cleanup()
Drivers: hv: vmbus: Fix race condition with new ring_buffer_info mutex
Drivers: hv: vmbus: Set ring_info field to 0 and remove memset
Drivers: hv: vmbus: Refactor chan->state if statement
Drivers: hv: vmbus: Expose monitor data only when monitor pages are used
Linus Torvalds [Tue, 7 May 2019 20:31:29 +0000 (13:31 -0700)]
Merge tag 'staging-5.2-rc1' of git://git./linux/kernel/git/gregkh/staging
Pull staging / IIO driver updates from Greg KH:
"Here is the big staging and iio driver update for 5.2-rc1.
Lots of tiny fixes all over the staging and IIO driver trees here,
along with some new IIO drivers.
The "counter" subsystem was added in here as well, as it is needed by
the IIO drivers and subsystem.
Also we ended up deleting two drivers, making this pull request remove
a few hundred thousand lines of code, always a nice thing to see. Both
of the drivers removed have been replaced with "real" drivers in their
various subsystem directories, and they will be coming to you from
those locations during this merge window.
There are some core vt/selection changes in here, that was due to some
cleanups needed for the speakup fixes. Those have all been acked by
the various subsystem maintainers (i.e. me), so those are ok.
We also added a few new drivers, for some odd hardware, giving new
developers plenty to work on with basic coding style cleanups to come
in the near future.
Other than that, nothing unusual here.
All of these have been in linux-next for a while with no reported
issues, other than an odd gcc warning for one of the new drivers that
should be fixed up soon"
[ I fixed up the warning myself - Linus ]
* tag 'staging-5.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (663 commits)
staging: kpc2000: kpc_spi: Fix build error for {read,write}q
Staging: rtl8192e: Remove extra space before break statement
Staging: rtl8192u: ieee80211: Fix if-else indentation warning
Staging: rtl8192u: ieee80211: Fix indentation errors by removing extra spaces
staging: most: cdev: fix chrdev_region leak in mod_exit
staging: wlan-ng: Fix improper SPDX comment style
staging: rtl8192u: ieee80211: Resolve ERROR reported by checkpatch
staging: vc04_services: bcm2835-camera: Compress two lines into one line
staging: rtl8723bs: core: Use !x in place of NULL comparison.
staging: rtl8723bs: core: Prefer using the BIT Macro.
staging: fieldbus: anybus-s: fix wait_for_completion_timeout return handling
staging: kpc2000: fix up build problems with readq()
staging: rtlwifi: move remaining phydm .h files
staging: rtlwifi: strip down phydm .h files
staging: rtlwifi: delete the staging driver
staging: fieldbus: anybus-s: rename bus id field to avoid confusion
staging: fieldbus: anybus-s: keep device bus id in bus endianness
Staging: sm750fb: Change *array into *const array
staging: rtl8192u: ieee80211: Fix spelling mistake
staging: rtl8192u: ieee80211: Replace bit shifting with BIT macro
...
Linus Torvalds [Tue, 7 May 2019 20:01:40 +0000 (13:01 -0700)]
Merge tag 'driver-core-5.2-rc1' of git://git./linux/kernel/git/gregkh/driver-core
Pull driver core/kobject updates from Greg KH:
"Here is the "big" set of driver core patches for 5.2-rc1
There are a number of ACPI patches in here as well, as Rafael said
they should go through this tree due to the driver core changes they
required. They have all been acked by the ACPI developers.
There are also a number of small subsystem-specific changes in here,
due to some changes to the kobject core code. Those too have all been
acked by the various subsystem maintainers.
As for content, it's pretty boring outside of the ACPI changes:
- spdx cleanups
- kobject documentation updates
- default attribute groups for kobjects
- other minor kobject/driver core fixes
All have been in linux-next for a while with no reported issues"
* tag 'driver-core-5.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (47 commits)
kobject: clean up the kobject add documentation a bit more
kobject: Fix kernel-doc comment first line
kobject: Remove docstring reference to kset
firmware_loader: Fix a typo ("syfs" -> "sysfs")
kobject: fix dereference before null check on kobj
Revert "driver core: platform: Fix the usage of platform device name(pdev->name)"
init/config: Do not select BUILD_BIN2C for IKCONFIG
Provide in-kernel headers to make extending kernel easier
kobject: Improve doc clarity kobject_init_and_add()
kobject: Improve docs for kobject_add/del
driver core: platform: Fix the usage of platform device name(pdev->name)
livepatch: Replace klp_ktype_patch's default_attrs with groups
cpufreq: schedutil: Replace default_attrs field with groups
padata: Replace padata_attr_type default_attrs field with groups
irqdesc: Replace irq_kobj_type's default_attrs field with groups
net-sysfs: Replace ktype default_attrs field with groups
block: Replace all ktype default_attrs with groups
samples/kobject: Replace foo_ktype's default_attrs field with groups
kobject: Add support for default attribute groups to kobj_type
driver core: Postpone DMA tear-down until after devres release for probe failure
...
Linus Torvalds [Tue, 7 May 2019 19:56:19 +0000 (12:56 -0700)]
Merge tag 'mmc-v5.2' of git://git./linux/kernel/git/ulfh/mmc
Pull MMC updates from Ulf Hansson:
"MMC core:
- Fix a few memoryleaks
- Minor improvements to the card initialization sequence
- Partially support sleepy GPIO controllers for pwrseq eMMC
MMC host:
- alcor: Work with multiple-entry sglists
- alcor: Enable DMA for writes
- meson-gx: Improve tuning support
- meson-gx: Avoid clock glitch when switching to DDR modes
- meson-gx: Disable unreliable HS400 mode
- mmci: Minor updates for support of HW busy detection
- mmci: Support data transfers for the stm32_sdmmc variant
- mmci: Restructure code to better support different variants
- mtk-sd: Add support for version found on MT7620 family SOCs
- mtk-sd: Add support for the MT8516 version
- mtk-sd: Add Chaotian Jing as the maintainer
- sdhci: Reorganize request-code to convert from tasklet to workqueue
- sdhci_am654: Stabilize support for lower speed modes
- sdhci-esdhc-imx: Add HS400 support for iMX7ULP
- sdhci-esdhc-imx: Add support for iMX7ULP version
- sdhci-of-arasan: Allow to disable DCMDs via DT for CQE
- sdhci-of-esdhc: Add support for the ls1028a version
- sdhci-of-esdhc: Several fixups for errata
- sdhci-pci: Fix BYT OCP setting
- sdhci-pci: Add support for Intel CML
- sdhci-tegra: Add support for system suspend/resume
- sdhci-tegra: Add CQE support for Tegra186 WAR
- sdhci-tegra: Add support for Tegra194
- sdhci-tegra: Update HW tuning process
MEMSTICK:
- I volunteered to help as a maintainer for the memstick subsystem,
which is reflected by an update to the MAINTAINERS file. Changes
are funneled through my MMC git and we will use the linux-mmc
mailing list.
MEMSTICK host:
- A few minor cleanups"
* tag 'mmc-v5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: (87 commits)
mmc: sdhci-pci: Fix BYT OCP setting
dt-bindings: mmc: add DT bindings for ls1028a eSDHC host controller
mmc: alcor: Drop pointer to mmc_host from alcor_sdmmc_host
mmc: mtk-sd: select REGULATOR
mmc: mtk-sd: enable internal card-detect logic.
mmc: mtk-sd: add support for config found in mt7620 family SOCs.
mmc: mtk-sd: don't hard-code interrupt trigger type
mmc: core: Fix tag set memory leak
dt-bindings: mmc: Add support for MT8516 to mtk-sd
mmc: mmci: Prevent polling for busy detection in IRQ context
mmc: mmci: Cleanup mmci_cmd_irq() for busy detect
mmc: usdhi6rol0: mark expected switch fall-throughs
mmc: core: Verify SD bus width
mmc: sdhci-esdhc-imx: Add HS400 support for iMX7ULP
mmc: sdhci-esdhc-imx: add pm_qos to interact with cpuidle
dt-bindings: mmc: fsl-imx-esdhc: add imx7ulp compatible string
mmc: meson-gx: add signal resampling tuning
mmc: meson-gx: remove Rx phase tuning
mmc: meson-gx: avoid clock glitch when switching to DDR modes
mmc: meson-gx: disable HS400
...
Linus Torvalds [Tue, 7 May 2019 19:48:10 +0000 (12:48 -0700)]
Merge tag 'Wimplicit-fallthrough-5.2-rc1' of git://git./linux/kernel/git/gustavoars/linux
Pull Wimplicit-fallthrough updates from Gustavo A. R. Silva:
"Mark switch cases where we are expecting to fall through.
This is part of the ongoing efforts to enable -Wimplicit-fallthrough.
Most of them have been baking in linux-next for a whole development
cycle. And with Stephen Rothwell's help, we've had linux-next
nag-emails going out for newly introduced code that triggers
-Wimplicit-fallthrough to avoid gaining more of these cases while we
work to remove the ones that are already present.
We are getting close to completing this work. Currently, there are
only 32 of 2311 of these cases left to be addressed in linux-next. I'm
auditing every case; I take a look into the code and analyze it in
order to determine if I'm dealing with an actual bug or a false
positive, as explained here:
https://lore.kernel.org/lkml/
c2fad584-1705-a5f2-d63c-
824e9b96cf50@embeddedor.com/
While working on this, I've found and fixed the several missing
break/return bugs, some of them introduced more than 5 years ago.
Once this work is finished, we'll be able to universally enable
"-Wimplicit-fallthrough" to avoid any of these kinds of bugs from
entering the kernel again"
* tag 'Wimplicit-fallthrough-5.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux: (27 commits)
memstick: mark expected switch fall-throughs
drm/nouveau/nvkm: mark expected switch fall-throughs
NFC: st21nfca: Fix fall-through warnings
NFC: pn533: mark expected switch fall-throughs
block: Mark expected switch fall-throughs
ASN.1: mark expected switch fall-through
lib/cmdline.c: mark expected switch fall-throughs
lib: zstd: Mark expected switch fall-throughs
scsi: sym53c8xx_2: sym_nvram: Mark expected switch fall-through
scsi: sym53c8xx_2: sym_hipd: mark expected switch fall-throughs
scsi: ppa: mark expected switch fall-through
scsi: osst: mark expected switch fall-throughs
scsi: lpfc: lpfc_scsi: Mark expected switch fall-throughs
scsi: lpfc: lpfc_nvme: Mark expected switch fall-through
scsi: lpfc: lpfc_nportdisc: Mark expected switch fall-through
scsi: lpfc: lpfc_hbadisc: Mark expected switch fall-throughs
scsi: lpfc: lpfc_els: Mark expected switch fall-throughs
scsi: lpfc: lpfc_ct: Mark expected switch fall-throughs
scsi: imm: mark expected switch fall-throughs
scsi: csiostor: csio_wr: mark expected switch fall-through
...
Linus Torvalds [Tue, 7 May 2019 19:44:49 +0000 (12:44 -0700)]
Merge tag 'meminit-v5.2-rc1' of git://git./linux/kernel/git/kees/linux
Pull compiler-based variable initialization updates from Kees Cook:
"This is effectively part of my gcc-plugins tree, but as this adds some
Clang support, it felt weird to still call it "gcc-plugins". :)
This consolidates Kconfig for the existing stack variable
initialization (via structleak and stackleak gcc plugins) and adds
Alexander Potapenko's support for Clang's new similar functionality.
Summary:
- Consolidate memory initialization Kconfigs (Kees)
- Implement support for Clang's stack variable auto-init (Alexander)"
* tag 'meminit-v5.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
security: Implement Clang's stack initialization
security: Move stackleak config to Kconfig.hardening
security: Create "kernel hardening" config area
YueHaibing [Mon, 6 May 2019 15:57:54 +0000 (23:57 +0800)]
cxgb4: Fix error path in cxgb4_init_module
BUG: unable to handle kernel paging request at
ffffffffa016a270
PGD
3270067 P4D
3270067 PUD
3271063 PMD
230bbd067 PTE 0
Oops: 0000 [#1
CPU: 0 PID: 6134 Comm: modprobe Not tainted 5.1.0+ #33
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
RIP: 0010:atomic_notifier_chain_register+0x24/0x60
Code: 1f 80 00 00 00 00 55 48 89 e5 41 54 49 89 f4 53 48 89 fb e8 ae b4 38 01 48 8b 53 38 48 8d 4b 38 48 85 d2 74 20 45 8b 44 24 10 <44> 3b 42 10 7e 08 eb 13 44 39 42 10 7c 0d 48 8d 4a 08 48 8b 52 08
RSP: 0018:
ffffc90000e2bc60 EFLAGS:
00010086
RAX:
0000000000000292 RBX:
ffffffff83467240 RCX:
ffffffff83467278
RDX:
ffffffffa016a260 RSI:
ffffffff83752140 RDI:
ffffffff83467240
RBP:
ffffc90000e2bc70 R08:
0000000000000000 R09:
0000000000000001
R10:
0000000000000000 R11:
00000000014fa61f R12:
ffffffffa01c8260
R13:
ffff888231091e00 R14:
0000000000000000 R15:
ffffc90000e2be78
FS:
00007fbd8d7cd540(0000) GS:
ffff888237a00000(0000) knlGS:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
CR2:
ffffffffa016a270 CR3:
000000022c7e3000 CR4:
00000000000006f0
Call Trace:
register_inet6addr_notifier+0x13/0x20
cxgb4_init_module+0x6c/0x1000 [cxgb4
? 0xffffffffa01d7000
do_one_initcall+0x6c/0x3cc
? do_init_module+0x22/0x1f1
? rcu_read_lock_sched_held+0x97/0xb0
? kmem_cache_alloc_trace+0x325/0x3b0
do_init_module+0x5b/0x1f1
load_module+0x1db1/0x2690
? m_show+0x1d0/0x1d0
__do_sys_finit_module+0xc5/0xd0
__x64_sys_finit_module+0x15/0x20
do_syscall_64+0x6b/0x1d0
entry_SYSCALL_64_after_hwframe+0x49/0xbe
If pci_register_driver fails, register inet6addr_notifier is
pointless. This patch fix the error path in cxgb4_init_module.
Fixes: b5a02f503caa ("cxgb4 : Update ipv6 address handling api")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Heiner Kallweit [Sun, 5 May 2019 17:03:51 +0000 (19:03 +0200)]
net: phy: improve pause mode reporting in phy_print_status
So far we report symmetric pause only, and we don't consider the local
pause capabilities. Let's properly consider local and remote
capabilities, and report also asymmetric pause.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Maxime Chevallier [Tue, 7 May 2019 15:35:55 +0000 (17:35 +0200)]
dt-bindings: net: Fix a typo in the phy-mode list for ethernet bindings
The phy_mode "2000base-x" is actually supposed to be "1000base-x", even
though the commit title of the original patch says otherwise.
Fixes: 55601a880690 ("net: phy: Add 2000base-x, 2500base-x and rxaui modes")
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Tue, 7 May 2019 19:30:24 +0000 (12:30 -0700)]
Merge tag 'pidfd-v5.2-rc1' of git://git./linux/kernel/git/brauner/linux
Pull pidfd updates from Christian Brauner:
"This patchset makes it possible to retrieve pidfds at process creation
time by introducing the new flag CLONE_PIDFD to the clone() system
call. Linus originally suggested to implement this as a new flag to
clone() instead of making it a separate system call.
After a thorough review from Oleg CLONE_PIDFD returns pidfds in the
parent_tidptr argument. This means we can give back the associated pid
and the pidfd at the same time. Access to process metadata information
thus becomes rather trivial.
As has been agreed, CLONE_PIDFD creates file descriptors based on
anonymous inodes similar to the new mount api. They are made
unconditional by this patchset as they are now needed by core kernel
code (vfs, pidfd) even more than they already were before (timerfd,
signalfd, io_uring, epoll etc.). The core patchset is rather small.
The bulky looking changelist is caused by David's very simple changes
to Kconfig to make anon inodes unconditional.
A pidfd comes with additional information in fdinfo if the kernel
supports procfs. The fdinfo file contains the pid of the process in
the callers pid namespace in the same format as the procfs status
file, i.e. "Pid:\t%d".
To remove worries about missing metadata access this patchset comes
with a sample/test program that illustrates how a combination of
CLONE_PIDFD and pidfd_send_signal() can be used to gain race-free
access to process metadata through /proc/<pid>.
Further work based on this patchset has been done by Joel. His work
makes pidfds pollable. It finished too late for this merge window. I
would prefer to have it sitting in linux-next for a while and send it
for inclusion during the 5.3 merge window"
* tag 'pidfd-v5.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
samples: show race-free pidfd metadata access
signal: support CLONE_PIDFD with pidfd_send_signal
clone: add CLONE_PIDFD
Make anon_inodes unconditional
Harini Katakam [Tue, 7 May 2019 14:29:10 +0000 (19:59 +0530)]
net: macb: Change interrupt and napi enable order in open
Current order in open:
-> Enable interrupts (macb_init_hw)
-> Enable NAPI
-> Start PHY
Sequence of RX handling:
-> RX interrupt occurs
-> Interrupt is cleared and interrupt bits disabled in handler
-> NAPI is scheduled
-> In NAPI, RX budget is processed and RX interrupts are re-enabled
With the above, on QEMU or fixed link setups (where PHY state doesn't
matter), there's a chance macb RX interrupt occurs before NAPI is
enabled. This will result in NAPI being scheduled before it is enabled.
Fix this macb open by changing the order.
Fixes: ae1f2a56d273 ("net: macb: Added support for many RX queues")
Signed-off-by: Harini Katakam <harini.katakam@xilinx.com>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Esben Haabendal [Tue, 7 May 2019 06:42:57 +0000 (08:42 +0200)]
net: ll_temac: Improve error message on error IRQ
The channel status register value can be very helpful when debugging
SDMA problems.
Signed-off-by: Esben Haabendal <esben@geanix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pieter Jansen van Vuuren [Tue, 7 May 2019 00:24:21 +0000 (17:24 -0700)]
net/sched: remove block pointer from common offload structure
Based on feedback from Jiri avoid carrying a pointer to the tcf_block
structure in the tc_cls_common_offload structure. Instead store
a flag in driver private data which indicates if offloads apply
to a shared block at block binding time.
Suggested-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 7 May 2019 19:22:47 +0000 (12:22 -0700)]
Merge branch 'of_get_mac_address-ERR_PTR-fixes'
Petr Štetiar says:
====================
of_get_mac_address ERR_PTR fixes
this patch series is an attempt to fix the mess, I've somehow managed to
introduce.
First patch in this series is defacto v5 of the previous 05/10 patch in the
series, but since the v4 of this 05/10 patch wasn't picked up by the
patchwork for some unknown reason, this patch wasn't applied with the other
9 patches in the series, so I'm resending it as a separate patch of this
fixup series again.
Second patch is a result of this rebase against net-next tree, where I was
checking again all current users of of_get_mac_address and found out, that
there's new one in DSA, so I've converted this user to the new ERR_PTR
encoded error value as well.
Third patch which was sent as v5 wasn't considered for merge, but I still
think, that we need to check for possible NULL value, thus current IS_ERR
check isn't sufficient and we need to use IS_ERR_OR_NULL instead.
Fourth patch fixes warning reported by kbuild test robot.
====================
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Leonard Crestez <leonard.crestez@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Petr Štetiar [Mon, 6 May 2019 21:27:04 +0000 (23:27 +0200)]
net: ethernet: support of_get_mac_address new ERR_PTR error
There was NVMEM support added to of_get_mac_address, so it could now
return ERR_PTR encoded error values, so we need to adjust all current
users of of_get_mac_address to this new fact.
While at it, remove superfluous is_valid_ether_addr as the MAC address
returned from of_get_mac_address is always valid and checked by
is_valid_ether_addr anyway.
Fixes: d01f449c008a ("of_net: add NVMEM support to of_get_mac_address")
Signed-off-by: Petr Štetiar <ynezz@true.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
Petr Štetiar [Mon, 6 May 2019 21:24:47 +0000 (23:24 +0200)]
net: usb: smsc: fix warning reported by kbuild test robot
This patch fixes following warning reported by kbuild test robot:
In function ‘memcpy’,
inlined from ‘smsc75xx_init_mac_address’ at drivers/net/usb/smsc75xx.c:778:3,
inlined from ‘smsc75xx_bind’ at drivers/net/usb/smsc75xx.c:1501:2:
./include/linux/string.h:355:9: warning: argument 2 null where non-null expected [-Wnonnull]
return __builtin_memcpy(p, q, size);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/net/usb/smsc75xx.c: In function ‘smsc75xx_bind’:
./include/linux/string.h:355:9: note: in a call to built-in function ‘__builtin_memcpy’
I've replaced the offending memcpy with ether_addr_copy, because I'm
100% sure, that of_get_mac_address can't return NULL as it returns valid
pointer or ERR_PTR encoded value, nothing else.
I'm hesitant to just change IS_ERR into IS_ERR_OR_NULL check, as this
would make the warning disappear also, but it would be confusing to
check for impossible return value just to make a compiler happy.
Fixes: adfb3cb2c52e ("net: usb: support of_get_mac_address new ERR_PTR error")
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Petr Štetiar <ynezz@true.cz>
Reviewed-by: Woojung Huh <woojung.huh@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Petr Štetiar [Mon, 6 May 2019 21:24:46 +0000 (23:24 +0200)]
staging: octeon-ethernet: Fix of_get_mac_address ERR_PTR check
Commit
284eb160681c ("staging: octeon-ethernet: support
of_get_mac_address new ERR_PTR error") has introduced checking for
ERR_PTR encoded error value from of_get_mac_address with IS_ERR macro,
which is not sufficient in this case, as the mac variable is set to NULL
initialy and if the kernel is compiled without DT support this NULL
would get passed to IS_ERR, which would lead to the wrong decision and
would pass that NULL pointer and invalid MAC address further.
Fixes: 284eb160681c ("staging: octeon-ethernet: support of_get_mac_address new ERR_PTR error")
Signed-off-by: Petr Štetiar <ynezz@true.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
Petr Štetiar [Mon, 6 May 2019 21:24:45 +0000 (23:24 +0200)]
net: dsa: support of_get_mac_address new ERR_PTR error
There was NVMEM support added to of_get_mac_address, so it could now
return ERR_PTR encoded error values, so we need to adjust all current
users of of_get_mac_address to this new fact.
While at it, remove superfluous is_valid_ether_addr as the MAC address
returned from of_get_mac_address is always valid and checked by
is_valid_ether_addr anyway.
Fixes: d01f449c008a ("of_net: add NVMEM support to of_get_mac_address")
Signed-off-by: Petr Štetiar <ynezz@true.cz>
Tested-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nathan Chancellor [Mon, 6 May 2019 20:24:47 +0000 (13:24 -0700)]
net: dsa: sja1105: Fix status initialization in sja1105_get_ethtool_stats
Clang warns:
drivers/net/dsa/sja1105/sja1105_ethtool.c:316:39: warning: suggest
braces around initialization of subobject [-Wmissing-braces]
struct sja1105_port_status status = {0};
^
{}
1 warning generated.
One way to fix these warnings is to add additional braces like Clang
suggests; however, there has been a bit of push back from some
maintainers[1][2], who just prefer memset as it is unambiguous, doesn't
depend on a particular compiler version[3], and properly initializes all
subobjects. Do that here so there are no more warnings.
[1]: https://lore.kernel.org/lkml/
022e41c0-8465-dc7a-a45c-
64187ecd9684@amd.com/
[2]: https://lore.kernel.org/lkml/
20181128.215241.
702406654469517539.davem@davemloft.net/
[3]: https://lore.kernel.org/lkml/
20181116150432.
2408a075@redhat.com/
Fixes: 52c34e6e125c ("net: dsa: sja1105: Add support for ethtool port counters")
Link: https://github.com/ClangBuiltLinux/linux/issues/471
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Acked-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stephen Suryaputra [Mon, 6 May 2019 19:00:01 +0000 (15:00 -0400)]
vrf: sit mtu should not be updated when vrf netdev is the link
VRF netdev mtu isn't typically set and have an mtu of 65536. When the
link of a tunnel is set, the tunnel mtu is changed from 1480 to the link
mtu minus tunnel header. In the case of VRF netdev is the link, then the
tunnel mtu becomes 65516. So, fix it by not setting the tunnel mtu in
this case.
Signed-off-by: Stephen Suryaputra <ssuryaextr@gmail.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
YueHaibing [Mon, 6 May 2019 15:25:29 +0000 (23:25 +0800)]
net: dsa: Fix error cleanup path in dsa_init_module
BUG: unable to handle kernel paging request at
ffffffffa01c5430
PGD
3270067 P4D
3270067 PUD
3271063 PMD
230bc5067 PTE 0
Oops: 0000 [#1
CPU: 0 PID: 6159 Comm: modprobe Not tainted 5.1.0+ #33
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
RIP: 0010:raw_notifier_chain_register+0x16/0x40
Code: 63 f8 66 90 e9 5d ff ff ff 90 90 90 90 90 90 90 90 90 90 90 55 48 8b 07 48 89 e5 48 85 c0 74 1c 8b 56 10 3b 50 10 7e 07 eb 12 <39> 50 10 7c 0d 48 8d 78 08 48 8b 40 08 48 85 c0 75 ee 48 89 46 08
RSP: 0018:
ffffc90001c33c08 EFLAGS:
00010282
RAX:
ffffffffa01c5420 RBX:
ffffffffa01db420 RCX:
4fcef45928070a8b
RDX:
0000000000000000 RSI:
ffffffffa01db420 RDI:
ffffffffa01b0068
RBP:
ffffc90001c33c08 R08:
000000003e0a33d0 R09:
0000000000000000
R10:
0000000000000000 R11:
0000000094443661 R12:
ffff88822c320700
R13:
ffff88823109be80 R14:
0000000000000000 R15:
ffffc90001c33e78
FS:
00007fab8bd08540(0000) GS:
ffff888237a00000(0000) knlGS:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
CR2:
ffffffffa01c5430 CR3:
00000002297ea000 CR4:
00000000000006f0
Call Trace:
register_netdevice_notifier+0x43/0x250
? 0xffffffffa01e0000
dsa_slave_register_notifier+0x13/0x70 [dsa_core
? 0xffffffffa01e0000
dsa_init_module+0x2e/0x1000 [dsa_core
do_one_initcall+0x6c/0x3cc
? do_init_module+0x22/0x1f1
? rcu_read_lock_sched_held+0x97/0xb0
? kmem_cache_alloc_trace+0x325/0x3b0
do_init_module+0x5b/0x1f1
load_module+0x1db1/0x2690
? m_show+0x1d0/0x1d0
__do_sys_finit_module+0xc5/0xd0
__x64_sys_finit_module+0x15/0x20
do_syscall_64+0x6b/0x1d0
entry_SYSCALL_64_after_hwframe+0x49/0xbe
Cleanup allocated resourses if there are errors,
otherwise it will trgger memleak.
Fixes: c9eb3e0f8701 ("net: dsa: Add support for learning FDB through notification")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
YueHaibing [Mon, 6 May 2019 14:44:04 +0000 (22:44 +0800)]
l2tp: Fix possible NULL pointer dereference
BUG: unable to handle kernel NULL pointer dereference at
0000000000000128
PGD 0 P4D 0
Oops: 0000 [#1
CPU: 0 PID: 5697 Comm: modprobe Tainted: G W 5.1.0-rc7+ #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
RIP: 0010:__lock_acquire+0x53/0x10b0
Code: 8b 1c 25 40 5e 01 00 4c 8b 6d 10 45 85 e4 0f 84 bd 06 00 00 44 8b 1d 7c d2 09 02 49 89 fe 41 89 d2 45 85 db 0f 84 47 02 00 00 <48> 81 3f a0 05 70 83 b8 00 00 00 00 44 0f 44 c0 83 fe 01 0f 86 3a
RSP: 0018:
ffffc90001c07a28 EFLAGS:
00010002
RAX:
0000000000000000 RBX:
ffff88822f038440 RCX:
0000000000000000
RDX:
0000000000000000 RSI:
0000000000000000 RDI:
0000000000000128
RBP:
ffffc90001c07a88 R08:
0000000000000001 R09:
0000000000000000
R10:
0000000000000000 R11:
0000000000000001 R12:
0000000000000001
R13:
0000000000000000 R14:
0000000000000128 R15:
0000000000000000
FS:
00007fead0811540(0000) GS:
ffff888237a00000(0000) knlGS:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
CR2:
0000000000000128 CR3:
00000002310da000 CR4:
00000000000006f0
Call Trace:
? __lock_acquire+0x24e/0x10b0
lock_acquire+0xdf/0x230
? flush_workqueue+0x71/0x530
flush_workqueue+0x97/0x530
? flush_workqueue+0x71/0x530
l2tp_exit_net+0x170/0x2b0 [l2tp_core
? l2tp_exit_net+0x93/0x2b0 [l2tp_core
ops_exit_list.isra.6+0x36/0x60
unregister_pernet_operations+0xb8/0x110
unregister_pernet_device+0x25/0x40
l2tp_init+0x55/0x1000 [l2tp_core
? 0xffffffffa018d000
do_one_initcall+0x6c/0x3cc
? do_init_module+0x22/0x1f1
? rcu_read_lock_sched_held+0x97/0xb0
? kmem_cache_alloc_trace+0x325/0x3b0
do_init_module+0x5b/0x1f1
load_module+0x1db1/0x2690
? m_show+0x1d0/0x1d0
__do_sys_finit_module+0xc5/0xd0
__x64_sys_finit_module+0x15/0x20
do_syscall_64+0x6b/0x1d0
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x7fead031a839
Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 1f f6 2c 00 f7 d8 64 89 01 48
RSP: 002b:
00007ffe8d9acca8 EFLAGS:
00000246 ORIG_RAX:
0000000000000139
RAX:
ffffffffffffffda RBX:
0000560078398b80 RCX:
00007fead031a839
RDX:
0000000000000000 RSI:
000056007659dc2e RDI:
0000000000000003
RBP:
000056007659dc2e R08:
0000000000000000 R09:
0000560078398b80
R10:
0000000000000003 R11:
0000000000000246 R12:
0000000000000000
R13:
00005600783a04a0 R14:
0000000000040000 R15:
0000560078398b80
Modules linked in: l2tp_core(+) e1000 ip_tables ipv6 [last unloaded: l2tp_core
CR2:
0000000000000128
---[ end trace
8322b2b8bf83f8e1
If alloc_workqueue fails in l2tp_init, l2tp_net_ops
is unregistered on failure path. Then l2tp_exit_net
is called which will flush NULL workqueue, this patch
add a NULL check to fix it.
Fixes: 67e04c29ec0d ("l2tp: unregister l2tp_net_ops on failure path")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Sun, 5 May 2019 21:50:19 +0000 (22:50 +0100)]
taprio: add null check on sched_nest to avoid potential null pointer dereference
The call to nla_nest_start_noflag can return a null pointer and currently
this is not being checked and this can lead to a null pointer dereference
when the null pointer sched_nest is passed to function nla_nest_end. Fix
this by adding in a null pointer check.
Addresses-Coverity: ("Dereference null return value")
Fixes: a3d43c0d56f1 ("taprio: Add support adding an admin schedule")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Tue, 7 May 2019 19:15:13 +0000 (12:15 -0700)]
Merge tag 'stream_open-5.2' of https://lab.nexedi.com/kirr/linux
Pull stream_open conversion from Kirill Smelkov:
- remove unnecessary double nonseekable_open from drivers/char/dtlk.c
as noticed by Pavel Machek while reviewing nonseekable_open ->
stream_open mass conversion.
- the mass conversion patch promised in commit
10dce8af3422 ("fs:
stream_open - opener for stream-like files so that read and write can
run simultaneously without deadlock") and is automatically generated
by running
$ make coccicheck MODE=patch COCCI=scripts/coccinelle/api/stream_open.cocci
I've verified each generated change manually - that it is correct to
convert - and each other nonseekable_open instance left - that it is
either not correct to convert there, or that it is not converted due
to current stream_open.cocci limitations. More details on this in the
patch.
- finally, change VFS to pass ppos=NULL into .read/.write for files
that declare themselves streams. It was suggested by Rasmus Villemoes
and makes sure that if ppos starts to be erroneously used in a stream
file, such bug won't go unnoticed and will produce an oops instead of
creating illusion of position change being taken into account.
Note: this patch does not conflict with "fuse: Add FOPEN_STREAM to
use stream_open()" that will be hopefully coming via FUSE tree,
because fs/fuse/ uses new-style .read_iter/.write_iter, and for these
accessors position is still passed as non-pointer kiocb.ki_pos .
* tag 'stream_open-5.2' of https://lab.nexedi.com/kirr/linux:
vfs: pass ppos=NULL to .read()/.write() of FMODE_STREAM files
*: convert stream-like files from nonseekable_open -> stream_open
dtlk: remove double call to nonseekable_open
Colin Ian King [Sun, 5 May 2019 21:38:14 +0000 (22:38 +0100)]
net: mvpp2: cls: fix less than zero check on a u32 variable
The signed return from the call to mvpp2_cls_c2_port_flow_index is being
assigned to the u32 variable c2.index and then checked for a negative
error condition which is always going to be false. Fix this by assigning
the return to the int variable index and checking this instead.
Addresses-Coverity: ("Unsigned compared against 0")
Fixes: 90b509b39ac9 ("net: mvpp2: cls: Add Classification offload support")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 7 May 2019 19:09:25 +0000 (12:09 -0700)]
Merge branch 'fc-quic-pacing'
Eric Dumazet says:
====================
net_sched: sch_fq: enable in-kernel pacing for QUIC servers
Willem added GSO support to UDP stack, greatly improving performance
of QUIC servers.
We also want to enable in-kernel pacing, which is possible thanks to EDT
model, since each sendmsg() can provide a timestamp for the skbs.
We have to change sch_fq to enable feeding packets in arbitrary EDT order,
and make sure that packet classification do not trust unconnected sockets.
Note that this patch series also is a prereq for a future TCP change
enabling per-flow delays/reorders/losses to implement high performance
TCP emulators.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sat, 4 May 2019 23:48:54 +0000 (16:48 -0700)]
net_sched: sch_fq: handle non connected flows
FQ packet scheduler assumed that packets could be classified
based on their owning socket.
This means that if a UDP server uses one UDP socket to send
packets to different destinations, packets all land
in one FQ flow.
This is unfair, since each TCP flow has a unique bucket, meaning
that in case of pressure (fully utilised uplink), TCP flows
have more share of the bandwidth.
If we instead detect unconnected sockets, we can use a stochastic
hash based on the 4-tuple hash.
This also means a QUIC server using one UDP socket will properly
spread the outgoing packets to different buckets, and in-kernel
pacing based on EDT model will no longer risk having big rb-tree on
one flow.
Note that UDP application might provide the skb->hash in an
ancillary message at sendmsg() time to avoid the cost of a dissection
in fq packet scheduler.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sat, 4 May 2019 23:48:53 +0000 (16:48 -0700)]
net_sched: sch_fq: do not assume EDT packets are ordered
TCP stack makes sure packets for a given flow are monotically
increasing, but we want to allow UDP packets to use EDT as
well, so that QUIC servers can use in-kernel pacing.
This patch adds a per-flow rb-tree on which packets might
be stored. We still try to use the linear list for the
typical cases where packets are queued with monotically
increasing skb->tstamp, since queue/dequeue packets on
a standard list is O(1).
Note that the ability to store packets in arbitrary EDT
order will allow us to implement later a per TCP socket
mechanism adding delays (with jitter eventually) and reorders,
to implement convenient network emulators.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Tue, 7 May 2019 18:46:56 +0000 (11:46 -0700)]
Merge tag 'xfs-5.2-merge-4' of git://git./fs/xfs/xfs-linux
Pull xfs updates from Darrick Wong:
"Here's a big pile of new stuff for XFS for 5.2. XFS has grown the
ability to report metadata health status to userspace after online
fsck checks the filesystem. The online metadata checking code is (I
really hope) feature complete with the addition of checks for the
global fs counters, though it'll remain EXPERIMENTAL for now.
There are also fixes for thundering herds of writeback completions and
some other deadlocks, fixes for theoretical integer overflow attacks
on space accounting, and removal of the long-defunct 'mntpt' option
which was deprecated in the mid-2000s and (it turns out) totally
broken since 2011 (and nobody complained...).
Summary:
- Fix some more buffer deadlocks when performing an unmount after a
hard shutdown.
- Fix some minor space accounting issues.
- Fix some use after free problems.
- Make the (undocumented) FITRIM behavior consistent with other
filesystems.
- Embiggen the xfs geometry ioctl's data structure.
- Introduce a new AG geometry ioctl.
- Introduce a new online health reporting infrastructure and ioctl
for userspace to query a filesystem's health status.
- Enhance online scrub and repair to update the health reports.
- Reduce thundering herd problems when writeback io completes.
- Fix some transaction reservation type errors.
- Fix integer overflow problems with delayed alloc reservation
counters.
- Fix some problems where we would exit to userspace without
unlocking.
- Fix inconsistent behavior when finishing deferred ops fails.
- Strengthen scrub to check incore data against ondisk metadata.
- Remove long-broken mntpt mount option.
- Add an online scrub function for the filesystem summary counters,
which should make online metadata scrub more or less feature
complete for now.
- Various cleanups"
* tag 'xfs-5.2-merge-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (38 commits)
xfs: change some error-less functions to void types
xfs: add online scrub for superblock counters
xfs: don't parse the mtpt mount option
xfs: always rejoin held resources during defer roll
xfs: add missing error check in xfs_prepare_shift()
xfs: scrub should check incore counters against ondisk headers
xfs: allow scrubbers to pause background reclaim
xfs: rename the speculative block allocation reclaim toggle functions
xfs: track delayed allocation reservations across the filesystem
xfs: fix broken bhold behavior in xrep_roll_ag_trans
xfs: unlock inode when xfs_ioctl_setattr_get_trans can't get transaction
xfs: kill the xfs_dqtrx_t typedef
xfs: widen inode delalloc block counter to 64-bits
xfs: widen quota block counters to 64-bit integers
xfs: abort unaligned nowait directio early
xfs: assert that we don't enter agfl freeing with a non-permanent transaction
xfs: make tr_growdata a permanent transaction
xfs: merge adjacent io completions of the same type
xfs: remove unused m_data_workqueue
xfs: implement per-inode writeback completion queues
...
Linus Torvalds [Tue, 7 May 2019 18:43:32 +0000 (11:43 -0700)]
Merge tag 'iomap-5.2-merge-2' of git://git./fs/xfs/xfs-linux
Pull iomap updates from Darrick Wong:
"Nothing particularly exciting here, just adding some callouts for gfs2
and cleaning a few things.
Summary:
- Add some extra hooks to the iomap buffered write path to enable
gfs2 journalled writes
- SPDX conversion
- Various refactoring"
* tag 'iomap-5.2-merge-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
iomap: move iomap_read_inline_data around
iomap: Add a page_prepare callback
iomap: Fix use-after-free error in page_done callback
fs: Turn __generic_write_end into a void function
iomap: Clean up __generic_write_end calling
iomap: convert to SPDX identifier
Linus Torvalds [Tue, 7 May 2019 18:37:27 +0000 (11:37 -0700)]
Merge tag 'jfs-5.2' of git://github.com/kleikamp/linux-shaggy
Pull jfs updates from Dave Kleikamp:
"Several minor jfs fixes"
* tag 'jfs-5.2' of git://github.com/kleikamp/linux-shaggy:
jfs: fix bogus variable self-initialization
fs/jfs: Switch to use new generic UUID API
jfs: compare old and new mode before setting update_mode flag
jfs: remove incorrect comment in jfs_superblock
jfs: fix spelling mistake, EACCESS -> EACCES
Linus Torvalds [Tue, 7 May 2019 18:34:19 +0000 (11:34 -0700)]
Merge tag 'for-5.2-tag' of git://git./linux/kernel/git/kdave/linux
Pull btrfs updates from David Sterba:
"This time the majority of changes are cleanups, though there's still a
number of changes of user interest.
User visible changes:
- better read time and write checks to catch errors early and before
writing data to disk (to catch potential memory corruption on data
that get checksummed)
- qgroups + metadata relocation: last speed up patch int the series
to address the slowness, there should be no overhead comparing
balance with and without qgroups
- FIEMAP ioctl does not start a transaction unnecessarily, this can
result in a speed up and less blocking due to IO
- LOGICAL_INO (v1, v2) does not start transaction unnecessarily, this
can speed up the mentioned ioctl and scrub as well
- fsync on files with many (but not too many) hardlinks is faster,
finer decision if the links should be fsynced individually or
completely
- send tries harder to find ranges to clone
- trim/discard will skip unallocated chunks that haven't been touched
since the last mount
Fixes:
- send flushes delayed allocation before start, otherwise it could
miss some changes in case of a very recent rw->ro switch of a
subvolume
- fix fallocate with qgroups that could lead to space accounting
underflow, reported as a warning
- trim/discard ioctl honours the requested range
- starting send and dedupe on a subvolume at the same time will let
only one of them succeed, this is to prevent changes that send
could miss due to dedupe; both operations are restartable
Core changes:
- more tree-checker validations, errors reported by fuzzing tools:
- device item
- inode item
- block group profiles
- tracepoints for extent buffer locking
- async cow preallocates memory to avoid errors happening too deep in
the call chain
- metadata reservations for delalloc reworked to better adapt in
many-writers/low-space scenarios
- improved space flushing logic for intense DIO vs buffered workloads
- lots of cleanups
- removed unused struct members
- redundant argument removal
- properties and xattrs
- extent buffer locking
- selftests
- use common file type conversions
- many-argument functions reduction"
* tag 'for-5.2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (227 commits)
btrfs: Use kvmalloc for allocating compressed path context
btrfs: Factor out common extent locking code in submit_compressed_extents
btrfs: Set io_tree only once in submit_compressed_extents
btrfs: Replace clear_extent_bit with unlock_extent
btrfs: Make compress_file_range take only struct async_chunk
btrfs: Remove fs_info from struct async_chunk
btrfs: Rename async_cow to async_chunk
btrfs: Preallocate chunks in cow_file_range_async
btrfs: reserve delalloc metadata differently
btrfs: track DIO bytes in flight
btrfs: merge calls of btrfs_setxattr and btrfs_setxattr_trans in btrfs_set_prop
btrfs: delete unused function btrfs_set_prop_trans
btrfs: start transaction in xattr_handler_set_prop
btrfs: drop local copy of inode i_mode
btrfs: drop old_fsflags in btrfs_ioctl_setflags
btrfs: modify local copy of btrfs_inode flags
btrfs: drop useless inode i_flags copy and restore
btrfs: start transaction in btrfs_ioctl_setflags()
btrfs: export btrfs_set_prop
btrfs: refactor btrfs_set_props to validate externally
...
Linus Torvalds [Tue, 7 May 2019 18:17:26 +0000 (11:17 -0700)]
Merge branch 'stable-fodder' of git://git./linux/kernel/git/viro/vfs
Pull vfs stable fodder fixes from Al Viro:
- acct_on() fix for deadlock caught by overlayfs folks
- autofs RCU use-after-free SNAFU (->d_manage() can be called
locklessly, so we need to RCU-delay freeing the objects it looks at)
- (hopefully) the end of "do we need freeing this dentry RCU-delayed"
whack-a-mole.
* 'stable-fodder' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
autofs: fix use-after-free in lockless ->d_manage()
dcache: sort the freeing-without-RCU-delay mess for good.
acct_on(): don't mess with freeze protection
Linus Torvalds [Tue, 7 May 2019 17:57:05 +0000 (10:57 -0700)]
Merge branch 'work.icache' of git://git./linux/kernel/git/viro/vfs
Pull vfs inode freeing updates from Al Viro:
"Introduction of separate method for RCU-delayed part of
->destroy_inode() (if any).
Pretty much as posted, except that destroy_inode() stashes
->free_inode into the victim (anon-unioned with ->i_fops) before
scheduling i_callback() and the last two patches (sockfs conversion
and folding struct socket_wq into struct socket) are excluded - that
pair should go through netdev once davem reopens his tree"
* 'work.icache' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (58 commits)
orangefs: make use of ->free_inode()
shmem: make use of ->free_inode()
hugetlb: make use of ->free_inode()
overlayfs: make use of ->free_inode()
jfs: switch to ->free_inode()
fuse: switch to ->free_inode()
ext4: make use of ->free_inode()
ecryptfs: make use of ->free_inode()
ceph: use ->free_inode()
btrfs: use ->free_inode()
afs: switch to use of ->free_inode()
dax: make use of ->free_inode()
ntfs: switch to ->free_inode()
securityfs: switch to ->free_inode()
apparmor: switch to ->free_inode()
rpcpipe: switch to ->free_inode()
bpf: switch to ->free_inode()
mqueue: switch to ->free_inode()
ufs: switch to ->free_inode()
coda: switch to ->free_inode()
...
David S. Miller [Tue, 7 May 2019 17:37:14 +0000 (10:37 -0700)]
Merge branch 'hns3-next'
Huazhong Tan says:
====================
cleanup & optimizations & bugfixes for HNS3 driver
This patchset contains some cleanup related to hns3_enet_ring
struct and tx bd filling process, optimizations related
to rx page reusing, barrier using and tso handling process,
bugfixes related to tunnel type handling and error handling for
desc filling.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Yunsheng Lin [Mon, 6 May 2019 02:48:52 +0000 (10:48 +0800)]
net: hns3: use devm_kcalloc when allocating desc_cb
This patch uses devm_kcalloc instead of kcalloc when allocating
ring->desc_cb, because devm_kcalloc not only ensure to free the
memory when the dev is deallocted, but also allocate the memory
from it's device memory node.
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yunsheng Lin [Mon, 6 May 2019 02:48:51 +0000 (10:48 +0800)]
net: hns3: some cleanup for struct hns3_enet_ring
This patch removes some unused field in struct hns3_enet_ring,
use ring->dev for ring_to_dev macro, and use dev consistently
in hns3_fill_desc.
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yunsheng Lin [Mon, 6 May 2019 02:48:50 +0000 (10:48 +0800)]
net: hns3: unify the page reusing for page size 4K and 64K
When page size is 64K, RX buffer is currently not reused when the
page_offset is moved to last buffer. This patch adds checking to
decide whether the buffer page can be reused when last_offset is
moved beyond last offset.
If the driver is the only user of page when page_offset is moved
to beyond last offset, then buffer can be reused and page_offset
is set to zero.
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yunsheng Lin [Mon, 6 May 2019 02:48:49 +0000 (10:48 +0800)]
net: hns3: optimize the barrier using when cleaning TX BD
Currently, a barrier is used when cleaning each TX BD, which may
cause performance degradation.
This patch optimizes it to use one barrier when cleaning TX BD
each round.
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yunsheng Lin [Mon, 6 May 2019 02:48:48 +0000 (10:48 +0800)]
net: hns3: fix error handling for desc filling
When desc filling fails in hns3_nic_net_xmit, it will call
hns3_clear_desc to unmap the dma mapping. But currently the
ring->next_to_use points to the desc where the desc filling
or dma mapping return error, which means the desc that
ring->next_to_use points to has not done the dma mapping,
the desc that need unmapping is before the ring->next_to_use.
This patch fixes it by calling ring_ptr_move_bw(next_to_use)
before doing unmapping operation, and set desc_cb->dma to
zero to avoid freeing it again when unloading.
Also, when filling skb head or frag fails, both need to unmap
all the way back to next_to_use_head, so remove one desc filling
error handling.
Fixes: 76ad4f0ee747 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yunsheng Lin [Mon, 6 May 2019 02:48:47 +0000 (10:48 +0800)]
net: hns3: combine len and checksum handling for inner and outer header.
When filling len and checksum info to description, there is some
similar checking or calculation.
So this patch adds hns3_set_l2l3l4 to fill the inner(/normal)
header's len and checksum info. If it is a encapsulation skb, it
calls hns3_set_outer_l2l3l4 to handle the outer header's len and
checksum info, in order to avoid some similar checking or
calculation.
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yunsheng Lin [Mon, 6 May 2019 02:48:46 +0000 (10:48 +0800)]
net: hns3: refactor BD filling for l2l3l4 info
This patch separates the inner and outer l2l3l4 len handling in
hns3_set_l2l3l4_len, this is a preparation to combine the l2l3l4
len and checksum handling for inner and outer header.
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yunsheng Lin [Mon, 6 May 2019 02:48:45 +0000 (10:48 +0800)]
net: hns3: fix for tunnel type handling in hns3_rx_checksum
According to hardware user manual, the tunnel packet type is
available in the rx.ol_info field of struct hns3_desc. Currently
the tunnel packet type is decided by the rx.l234_info, which may
cause RX checksum handling error.
This patch fixes it by using the correct field in struct hns3_desc
to decide the tunnel packet type.
Fixes: 76ad4f0ee747 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yunsheng Lin [Mon, 6 May 2019 02:48:44 +0000 (10:48 +0800)]
net: hns3: add linearizing checking for TSO case
HW requires every continuous 8 buffer data to be larger than MSS,
we simplify it by ensuring skb_headlen + the first continuous
7 frags to to be larger than GSO header len + mss, and the
remaining continuous 7 frags to be larger than MSS except the
last 7 frags.
This patch adds hns3_skb_need_linearized to handle it for TSO
case.
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>