virtio_net: disable XDP_REDIRECT in receive_mergeable() case
authorJesper Dangaard Brouer <brouer@redhat.com>
Tue, 20 Feb 2018 13:32:04 +0000 (14:32 +0100)
committerDavid S. Miller <davem@davemloft.net>
Wed, 21 Feb 2018 20:09:29 +0000 (15:09 -0500)
The virtio_net code have three different RX code-paths in receive_buf().
Two of these code paths can handle XDP, but one of them is broken for
at least XDP_REDIRECT.

Function(1): receive_big() does not support XDP.
Function(2): receive_small() support XDP fully and uses build_skb().
Function(3): receive_mergeable() broken XDP_REDIRECT uses napi_alloc_skb().

The simple explanation is that receive_mergeable() is broken because
it uses napi_alloc_skb(), which violates XDP given XDP assumes packet
header+data in single page and enough tail room for skb_shared_info.

The longer explaination is that receive_mergeable() tries to
work-around and satisfy these XDP requiresments e.g. by having a
function xdp_linearize_page() that allocates and memcpy RX buffers
around (in case packet is scattered across multiple rx buffers).  This
does currently satisfy XDP_PASS, XDP_DROP and XDP_TX (but only because
we have not implemented bpf_xdp_adjust_tail yet).

The XDP_REDIRECT action combined with cpumap is broken, and cause hard
to debug crashes.  The main issue is that the RX packet does not have
the needed tail-room (SKB_DATA_ALIGN(skb_shared_info)), causing
skb_shared_info to overlap the next packets head-room (in which cpumap
stores info).

Reproducing depend on the packet payload length and if RX-buffer size
happened to have tail-room for skb_shared_info or not.  But to make
this even harder to troubleshoot, the RX-buffer size is runtime
dynamically change based on an Exponentially Weighted Moving Average
(EWMA) over the packet length, when refilling RX rings.

This patch only disable XDP_REDIRECT support in receive_mergeable()
case, because it can cause a real crash.

IMHO we should consider NOT supporting XDP in receive_mergeable() at
all, because the principles behind XDP are to gain speed by (1) code
simplicity, (2) sacrificing memory and (3) where possible moving
runtime checks to setup time.  These principles are clearly being
violated in receive_mergeable(), that e.g. runtime track average
buffer size to save memory consumption.

In the longer run, we should consider introducing a separate receive
function when attaching an XDP program, and also change the memory
model to be compatible with XDP when attaching an XDP prog.

Fixes: 186b3c998c50 ("virtio-net: support XDP_REDIRECT")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
drivers/net/virtio_net.c

index 626c27352ae243fe6364d59d5274a718475a7587..0ca91942a884aec8fdbad0155010d4f1f9be395e 100644 (file)
@@ -677,7 +677,6 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
        struct bpf_prog *xdp_prog;
        unsigned int truesize;
        unsigned int headroom = mergeable_ctx_to_headroom(ctx);
-       int err;
 
        head_skb = NULL;
 
@@ -754,12 +753,6 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
                                goto err_xdp;
                        rcu_read_unlock();
                        goto xdp_xmit;
-               case XDP_REDIRECT:
-                       err = xdp_do_redirect(dev, &xdp, xdp_prog);
-                       if (!err)
-                               *xdp_xmit = true;
-                       rcu_read_unlock();
-                       goto xdp_xmit;
                default:
                        bpf_warn_invalid_xdp_action(act);
                case XDP_ABORTED: