git.openwrt.org Git - openwrt/staging/blogic.git/log

projects / openwrt / staging / blogic.git / log

Quentin Monnet [Wed, 29 Nov 2017 01:44:33 +0000 (17:44 -0800)]

tools: bpftool: declare phony targets as such

In the Makefile, targets install, doc and doc-install should be added to
.PHONY. Let's fix this.

Fixes: 71bb428fe2c1 ("tools: bpf: add bpftool")
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

commit | commitdiff | tree

Quentin Monnet [Wed, 29 Nov 2017 01:44:32 +0000 (17:44 -0800)]

tools: bpftool: unify installation directories

Programs and documentation not managed by package manager are generally
installed under /usr/local/, instead of the user's home directory. In
particular, `man` is generally able to find manual pages under
`/usr/local/share/man`.

bpftool generally follows perf's example, and perf installs to home
directory. However bpftool requires root credentials, so it seems
sensible to follow the more common convention of installing files under
/usr/local instead. So, make /usr/local the default prefix for
installing the binary with `make install`, and the documentation with
`make doc-install`. Also, create /usr/local/sbin if it does not exist.

Note that the bash-completion file, however, is still installed under
/usr/share/bash-completion/completions, as the default setup for bash
does not attempt to load completion files under /usr/local/.

Reported-by: David Beckett <david.beckett@netronome.com>
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

commit | commitdiff | tree

Quentin Monnet [Wed, 29 Nov 2017 01:44:31 +0000 (17:44 -0800)]

tools: bpftool: remove spurious line break from error message

The end-of-line character inside the string would break JSON compliance.
Remove it, `p_err()` already adds a '\n' character for plain output
anyway.

Fixes: 9a5ab8bf1d6d ("tools: bpftool: turn err() and info() macros into functions")
Reported-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

commit | commitdiff | tree

Quentin Monnet [Wed, 29 Nov 2017 01:44:30 +0000 (17:44 -0800)]

tools: bpftool: make error message from getopt_long() JSON-friendly

If `getopt_long()` meets an unknown option, it prints its own error
message to standard error output. While this does not strictly break
JSON output, it is the only case bpftool prints something to standard
error output if JSON output is required. All other errors are printed on
standard output as JSON objects, so that an external program does not
have to parse stderr.

This is changed by setting the global variable `opterr` to 0.
Furthermore, p_err() is used to reproduce the error message in a more
JSON-friendly way, so that users still get to know what the erroneous
option is.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

commit | commitdiff | tree

Quentin Monnet [Wed, 29 Nov 2017 01:44:29 +0000 (17:44 -0800)]

tools: bpftool: clean up the JSON writer before exiting in usage()

The writer is cleaned at the end of the main function, but not if the
program exits sooner in usage(). Let's keep it clean and destroy the
writer before exiting.

Destruction and actual call to exit() are moved to another function so
that clean exit can also be performed without printing usage() hints.

Fixes: d35efba99d92 ("tools: bpftool: introduce --json and --pretty options")
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

commit | commitdiff | tree

Quentin Monnet [Wed, 29 Nov 2017 01:44:28 +0000 (17:44 -0800)]

tools: bpftool: fix crash on bad parameters with JSON

If bad or unrecognised parameters are specified after JSON output is
requested, `usage()` will try to output null JSON object before the
writer is created.

To prevent this, create the writer as soon as the `--json` option is
parsed.

Fixes: 004b45c0e51a ("tools: bpftool: provide JSON output for all possible commands")
Reported-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

commit | commitdiff | tree

Jakub Kicinski [Mon, 27 Nov 2017 20:10:23 +0000 (12:10 -0800)]

bpf: offload: add a license header

I forgot to add a license on kernel/bpf/offload.c. Luckily I'm
still the only author so make it explicitly GPLv2.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

commit | commitdiff | tree

Jon Maloy [Mon, 27 Nov 2017 19:13:39 +0000 (20:13 +0100)]

tipc: eliminate access after delete in group_filter_msg()

KASAN revealed another access after delete in group.c. This time
it found that we read the header of a received message after the
buffer has been released.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Eduardo Otubo [Thu, 23 Nov 2017 14:18:35 +0000 (15:18 +0100)]

xen-netfront: remove warning when unloading module

v2:
* Replace busy wait with wait_event()/wake_up_all()
* Cannot garantee that at the time xennet_remove is called, the
   xen_netback state will not be XenbusStateClosed, so added a
   condition for that
* There's a small chance for the xen_netback state is
   XenbusStateUnknown by the time the xen_netfront switches to Closed,
   so added a condition for that.

When unloading module xen_netfront from guest, dmesg would output
warning messages like below:

  [  105.236836] xen:grant_table: WARNING: g.e. 0x903 still in use!
  [  105.236839] deferring g.e. 0x903 (pfn 0x35805)

This problem relies on netfront and netback being out of sync. By the time
netfront revokes the g.e.'s netback didn't have enough time to free all of
them, hence displaying the warnings on dmesg.

The trick here is to make netfront to wait until netback frees all the g.e.'s
and only then continue to cleanup for the module removal, and this is done by
manipulating both device states.

Signed-off-by: Eduardo Otubo <otubo@redhat.com>
Acked-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

David S. Miller [Mon, 27 Nov 2017 16:09:42 +0000 (01:09 +0900)]

Merge tag 'mac80211-for-davem-2017-11-27' of git://git./linux/kernel/git/jberg/mac80211

Johannes Berg says:

====================
Four fixes:
* CRYPTO_SHA256 is needed for regdb validation
* mac80211: mesh path metric was wrong in some frames
* mac80211: use QoS null-data packets on QoS connections
* mac80211: tear down RX aggregation sessions first to
drop fewer packets in HW restart scenarios
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

David S. Miller [Mon, 27 Nov 2017 15:38:45 +0000 (00:38 +0900)]

Merge branch 'sctp-stream-reconfig-fixes'

Xin Long says:

====================
sctp: a bunch of fixes for stream reconfig

This patchset is to make stream reset and asoc reset work more correctly
for stream reconfig.

Thank to Marcelo making them very clear.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Xin Long [Sat, 25 Nov 2017 13:05:36 +0000 (21:05 +0800)]

sctp: set sender next_tsn for the old result with ctsn_ack_point plus 1

When doing asoc reset, if the sender of the response has already sent some
chunk and increased asoc->next_tsn before the duplicate request comes, the
response will use the old result with an incorrect sender next_tsn.

Better than asoc->next_tsn, asoc->ctsn_ack_point can't be changed after
the sender of the response has performed the asoc reset and before the
peer has confirmed it, and it's value is still asoc->next_tsn original
value minus 1.

This patch sets sender next_tsn for the old result with ctsn_ack_point
plus 1 when processing the duplicate request, to make sure the sender
next_tsn value peer gets will be always right.

Fixes: 692787cef651 ("sctp: implement receiver-side procedures for the SSN/TSN Reset Request Parameter")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Xin Long [Sat, 25 Nov 2017 13:05:35 +0000 (21:05 +0800)]

sctp: avoid flushing unsent queue when doing asoc reset

Now when doing asoc reset, it cleans up sacked and abandoned queues
by calling sctp_outq_free where it also cleans up unsent, retransmit
and transmitted queues.

It's safe for the sender of response, as these 3 queues are empty at
that time. But when the receiver of response is doing the reset, the
users may already enqueue some chunks into unsent during the time
waiting the response, and these chunks should not be flushed.

To void the chunks in it would be removed, it moves the queue into a
temp list, then gets it back after sctp_outq_free is done.

The patch also fixes some incorrect comments in
sctp_process_strreset_tsnreq.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Xin Long [Sat, 25 Nov 2017 13:05:34 +0000 (21:05 +0800)]

sctp: only allow the asoc reset when the asoc outq is empty

As it says in rfc6525#section5.1.4, before sending the request,

C2: The sender has either no outstanding TSNs or considers all
outstanding TSNs abandoned.

Prior to this patch, it tried to consider all outstanding TSNs abandoned
by dropping all chunks in all outqs with sctp_outq_free (even including
sacked, retransmit and transmitted queues) when doing this reset, which
is too aggressive.

To make it work gently, this patch will only allow the asoc reset when
the sender has no outstanding TSNs by checking if unsent, transmitted
and retransmit are all empty with sctp_outq_is_empty before sending
and processing the request.

Fixes: 692787cef651 ("sctp: implement receiver-side procedures for the SSN/TSN Reset Request Parameter")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Xin Long [Sat, 25 Nov 2017 13:05:33 +0000 (21:05 +0800)]

sctp: only allow the out stream reset when the stream outq is empty

Now the out stream reset in sctp stream reconf could be done even if
the stream outq is not empty. It means that users can not be sure
since which msg the new ssn will be used.

To make this more synchronous, it shouldn't allow to do out stream
reset until these chunks in unsent outq all are sent out.

This patch checks the corresponding stream outqs when sending and
processing the request . If any of them has unsent chunks in outq,
it will return -EAGAIN instead or send SCTP_STRRESET_IN_PROGRESS
back to the sender.

Fixes: 7f9d68ac944e ("sctp: implement sender-side procedures for SSN Reset Request Parameter")
Suggested-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Xin Long [Sat, 25 Nov 2017 13:05:32 +0000 (21:05 +0800)]

sctp: use sizeof(__u16) for each stream number length instead of magic number

Now in stream reconf part there are still some places using magic
number 2 for each stream number length. To make it more readable,
this patch is to replace them with sizeof(__u16).

Reported-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Sara Sharon [Sun, 29 Oct 2017 09:51:09 +0000 (11:51 +0200)]

mac80211: tear down RX aggregations first

When doing HW restart we tear down aggregations.
Since at this point we are not TX'ing any aggregation, while
the peer is still sending RX aggregation over the air, it will
make sense to tear down the RX aggregations first.

Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

commit | commitdiff | tree

Chun-Yeow Yeoh [Tue, 14 Nov 2017 15:20:05 +0000 (23:20 +0800)]

mac80211: fix the update of path metric for RANN frame

The previous path metric update from RANN frame has not considered
the own link metric toward the transmitting mesh STA. Fix this.

Reported-by: Michael65535
Signed-off-by: Chun-Yeow Yeoh <yeohchunyeow@gmail.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

commit | commitdiff | tree

Johannes Berg [Tue, 21 Nov 2017 13:46:08 +0000 (14:46 +0100)]

mac80211: use QoS NDP for AP probing

When connected to a QoS/WMM AP, mac80211 should use a QoS NDP
for probing it, instead of a regular non-QoS one, fix this.

Change all the drivers to *not* allow QoS NDP for now, even
though it looks like most of them should be OK with that.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>

commit | commitdiff | tree

zhangliping [Sat, 25 Nov 2017 14:02:12 +0000 (22:02 +0800)]

openvswitch: fix the incorrect flow action alloc size

If we want to add a datapath flow, which has more than 500 vxlan outputs'
action, we will get the following error reports:
  openvswitch: netlink: Flow action size 32832 bytes exceeds max
  openvswitch: netlink: Flow action size 32832 bytes exceeds max
  openvswitch: netlink: Actions may not be safe on all matching packets
  ... ...

It seems that we can simply enlarge the MAX_ACTIONS_BUFSIZE to fix it, but
this is not the root cause. For example, for a vxlan output action, we need
about 60 bytes for the nlattr, but after it is converted to the flow
action, it only occupies 24 bytes. This means that we can still support
more than 1000 vxlan output actions for a single datapath flow under the
the current 32k max limitation.

So even if the nla_len(attr) is larger than MAX_ACTIONS_BUFSIZE, we
shouldn't report EINVAL and keep it move on, as the judgement can be
done by the reserve_sfa_size.

Signed-off-by: zhangliping <zhangliping02@baidu.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Gustavo A. R. Silva [Sat, 25 Nov 2017 19:14:40 +0000 (13:14 -0600)]

net: openvswitch: datapath: fix data type in queue_gso_packets

gso_type is being used in binary AND operations together with SKB_GSO_UDP.
The issue is that variable gso_type is of type unsigned short and
SKB_GSO_UDP expands to more than 16 bits:

SKB_GSO_UDP = 1 << 16

this makes any binary AND operation between gso_type and SKB_GSO_UDP to
be always zero, hence making some code unreachable and likely causing
undesired behavior.

Fix this by changing the data type of variable gso_type to unsigned int.

Addresses-Coverity-ID: 1462223
Fixes: 0c19f846d582 ("net: accept UFO datagrams from tuntap and packet")
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Stephen Hemminger [Fri, 24 Nov 2017 20:08:40 +0000 (12:08 -0800)]

uapi: add SPDX identifier to vm_sockets_diag.h

New file seems to have missed the SPDX license scan and update.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Vivien Didelot [Fri, 24 Nov 2017 16:36:06 +0000 (11:36 -0500)]

net: dsa: fix 'increment on 0' warning

Setting the refcount to 0 when allocating a tree to match the number of
switch devices it holds may cause an 'increment on 0; use-after-free',
if CONFIG_REFCOUNT_FULL is enabled.

To fix this, do not decrement the refcount of a newly allocated tree,
increment it when an already allocated tree is found, and decrement it
after the probing of a switch, as done with the previous behavior.

At the same time, make dsa_tree_get and dsa_tree_put accept a NULL
argument to simplify callers, and return the tree after incrementation,
as most kref users like of_node_get and of_node_put do.

Fixes: 8e5bf9759a06 ("net: dsa: simplify tree reference counting")
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Jorgen Hansen [Fri, 24 Nov 2017 14:25:28 +0000 (06:25 -0800)]

VSOCK: Don't call vsock_stream_has_data in atomic context

When using the host personality, VMCI will grab a mutex for any
queue pair access. In the detach callback for the vmci vsock
transport, we call vsock_stream_has_data while holding a spinlock,
and vsock_stream_has_data will access a queue pair.

To avoid this, we can simply omit calling vsock_stream_has_data
for host side queue pairs, since the QPs are empty per default
when the guest has detached.

This bug affects users of VMware Workstation using kernel version
4.4 and later.

Testing: Ran vsock tests between guest and host, and verified that
with this change, the host isn't calling vsock_stream_has_data
during detach. Ran mixedTest between guest and host using both
guest and host as server.

v2: Rebased on top of recent change to sk_state values
Reviewed-by: Adit Ranadive <aditr@vmware.com>
Reviewed-by: Aditya Sarwade <asarwade@vmware.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Jorgen Hansen <jhansen@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Roman Kapl [Fri, 24 Nov 2017 11:27:58 +0000 (12:27 +0100)]

net: sched: crash on blocks with goto chain action

tcf_block_put_ext has assumed that all filters (and thus their goto
actions) are destroyed in RCU callback and thus can not race with our
list iteration. However, that is not true during netns cleanup (see
tcf_exts_get_net comment).

Prevent the user after free by holding all chains (except 0, that one is
already held). foreach_safe is not enough in this case.

To reproduce, run the following in a netns and then delete the ns:
    ip link add dtest type dummy
    tc qdisc add dev dtest ingress
    tc filter add dev dtest chain 1 parent ffff: handle 1 prio 1 flower action goto chain 2

Fixes: 822e86d997 ("net_sched: remove tcf_block_put_deferred()")
Signed-off-by: Roman Kapl <code@rkapl.cz>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Mika Westerberg [Fri, 24 Nov 2017 11:05:36 +0000 (14:05 +0300)]

net: thunderbolt: Stop using zero to mean no valid DMA mapping

Commit 86dabda426ac ("net: thunderbolt: Clear finished Tx frame bus
address in tbnet_tx_callback()") fixed a DMA-API violation where the
driver called dma_unmap_page() in tbnet_free_buffers() for a bus address
that might already be unmapped. The fix was to zero out the bus address
of a frame in tbnet_tx_callback().

However, as pointed out by David Miller, zero might well be valid
mapping (at least in theory) so it is not good idea to use it here.

It turns out that we don't need the whole map/unmap dance for Tx buffers
at all. Instead we can map the buffers when they are initially allocated
and unmap them when the interface is brought down. In between we just
DMA sync the buffers for the CPU or device as needed.

Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Sunil Goutham [Thu, 23 Nov 2017 19:34:31 +0000 (22:34 +0300)]

net: thunderx: Fix TCP/UDP checksum offload for IPv6 pkts

Don't offload IP header checksum to NIC.

This fixes a previous patch which enabled checksum offloading
for both IPv4 and IPv6 packets. So L3 checksum offload was
getting enabled for IPv6 pkts. And HW is dropping these pkts
as it assumes the pkt is IPv4 when IP csum offload is set
in the SQ descriptor.

Fixes: 3a9024f52c2e ("net: thunderx: Enable TSO and checksum offloads for ipv6")
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: Aleksey Makarov <aleksey.makarov@auriga.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Johannes Berg [Fri, 24 Nov 2017 08:35:25 +0000 (09:35 +0100)]

cfg80211: select CRYPTO_SHA256 if needed

When regulatory database certificates are built-in, they're
currently using the SHA256 digest algorithm, so add that to
the build in that case.

Also add a note that for custom certificates, one may need
to add the right algorithms.

Reported-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

commit | commitdiff | tree

Zhu Yanjun [Mon, 20 Nov 2017 03:21:08 +0000 (22:21 -0500)]

forcedeth: replace pci_unmap_page with dma_unmap_page

The function pci_unmap_page is obsolete. So it is replaced with
the function dma_unmap_page.

CC: Srinivas Eeda <srinivas.eeda@oracle.com>
CC: Joe Jin <joe.jin@oracle.com>
CC: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

David S. Miller [Fri, 24 Nov 2017 20:07:17 +0000 (05:07 +0900)]

Merge tag 'rxrpc-fixes-20171124' of git://git./linux/kernel/git/dhowells/linux-fs

David Howells says:

====================
rxrpc: Fixes and improvements

Here's a set of patches that fix and improve some stuff in the AF_RXRPC
protocol:

The patches are:

(1) Unlock mutex returned by rxrpc_accept_call().

(2) Don't set connection upgrade by default.

(3) Differentiate the call->user_mutex used by the kernel from that used
     by userspace calling sendmsg() to avoid lockdep warnings.

(4) Delay terminal ACK transmission to a work queue so that it can be
     replaced by the next call if there is one.

(5) Split the call parameters from the connection parameters so that more
     call-specific parameters can be passed through.

(6) Fix the call timeouts to work the same as for other RxRPC/AFS
     implementations.

(7) Don't transmit DELAY ACKs immediately, but instead delay them slightly
     so that can be discarded or can represent more packets.

(8) Use RTT to calculate certain protocol timeouts.

(9) Add a timeout to detect lost ACK/DATA packets.

(10) Add a keepalive function so that we ping the peer if we haven't
     transmitted for a short while, thereby keeping intervening firewall
     routes open.

(11) Make service endpoints expire like they're supposed to so that the UDP
     port can be reused.

(12) Fix connection expiry timers to make cleanup happen in a more timely
     fashion.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

David Howells [Fri, 24 Nov 2017 10:18:42 +0000 (10:18 +0000)]

rxrpc: Fix conn expiry timers

Fix the rxrpc connection expiry timers so that connections for closed
AF_RXRPC sockets get deleted in a more timely fashion, freeing up the
transport UDP port much more quickly.

(1) Replace the delayed work items with work items plus timers so that
     timer_reduce() can be used to shorten them and so that the timer
     doesn't requeue the work item if the net namespace is dead.

(2) Don't use queue_delayed_work() as that won't alter the timeout if the
     timer is already running.

(3) Don't rearm the timers if the network namespace is dead.

Signed-off-by: David Howells <dhowells@redhat.com>

commit | commitdiff | tree

David Howells [Fri, 24 Nov 2017 10:18:42 +0000 (10:18 +0000)]

rxrpc: Fix service endpoint expiry

RxRPC service endpoints expire like they're supposed to by the following
means:

(1) Mark dead rxrpc_net structs (with ->live) rather than twiddling the
     global service conn timeout, otherwise the first rxrpc_net struct to
     die will cause connections on all others to expire immediately from
     then on.

(2) Mark local service endpoints for which the socket has been closed
     (->service_closed) so that the expiration timeout can be much
     shortened for service and client connections going through that
     endpoint.

(3) rxrpc_put_service_conn() needs to schedule the reaper when the usage
     count reaches 1, not 0, as idle conns have a 1 count.

(4) The accumulator for the earliest time we might want to schedule for
     should be initialised to jiffies + MAX_JIFFY_OFFSET, not ULONG_MAX as
     the comparison functions use signed arithmetic.

(5) Simplify the expiration handling, adding the expiration value to the
     idle timestamp each time rather than keeping track of the time in the
     past before which the idle timestamp must go to be expired.  This is
     much easier to read.

(6) Ignore the timeouts if the net namespace is dead.

(7) Restart the service reaper work item rather the client reaper.

Signed-off-by: David Howells <dhowells@redhat.com>

commit | commitdiff | tree

David Howells [Fri, 24 Nov 2017 10:18:42 +0000 (10:18 +0000)]

rxrpc: Add keepalive for a call

We need to transmit a packet every so often to act as a keepalive for the
peer (which has a timeout from the last time it received a packet) and also
to prevent any intervening firewalls from closing the route.

Do this by resetting a timer every time we transmit a packet. If the timer
ever expires, we transmit a PING ACK packet and thereby also elicit a PING
RESPONSE ACK from the other side - which prevents our last-rx timeout from
expiring.

The timer is set to 1/6 of the last-rx timeout so that we can detect the
other side going away if it misses 6 replies in a row.

This is particularly necessary for servers where the processing of the
service function may take a significant amount of time.

Signed-off-by: David Howells <dhowells@redhat.com>

commit | commitdiff | tree

David Howells [Fri, 24 Nov 2017 10:18:42 +0000 (10:18 +0000)]

rxrpc: Add a timeout for detecting lost ACKs/lost DATA

Add an extra timeout that is set/updated when we send a DATA packet that
has the request-ack flag set.  This allows us to detect if we don't get an
ACK in response to the latest flagged packet.

The ACK packet is adjudged to have been lost if it doesn't turn up within
2*RTT of the transmission.

If the timeout occurs, we schedule the sending of a PING ACK to find out
the state of the other side.  If a new DATA packet is ready to go sooner,
we cancel the sending of the ping and set the request-ack flag on that
instead.

If we get back a PING-RESPONSE ACK that indicates a lower tx_top than what
we had at the time of the ping transmission, we adjudge all the DATA
packets sent between the response tx_top and the ping-time tx_top to have
been lost and retransmit immediately.

Rather than sending a PING ACK, we could just pick a DATA packet and
speculatively retransmit that with request-ack set.  It should result in
either a REQUESTED ACK or a DUPLICATE ACK which we can then use in lieu the
a PING-RESPONSE ACK mentioned above.

Signed-off-by: David Howells <dhowells@redhat.com>

commit | commitdiff | tree