Hannes Frederic Sowa [Wed, 26 Feb 2014 00:20:43 +0000 (01:20 +0100)]
ipv6: yet another new IPV6_MTU_DISCOVER option IPV6_PMTUDISC_OMIT
This option has the same semantic as IP_PMTUDISC_OMIT for IPv4 which
got recently introduced. It doesn't honor the path mtu discovered by the
host but in contrary to IPV6_PMTUDISC_INTERFACE allows the generation of
fragments if the packet size exceeds the MTU of the outgoing interface
MTU.
Fixes: 93b36cf3425b9b ("ipv6: support IPV6_PMTU_INTERFACE on sockets")
Cc: Florian Weimer <fweimer@redhat.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hannes Frederic Sowa [Wed, 26 Feb 2014 00:20:42 +0000 (01:20 +0100)]
ipv4: yet another new IP_MTU_DISCOVER option IP_PMTUDISC_OMIT
IP_PMTUDISC_INTERFACE has a design error: because it does not allow the
generation of fragments if the interface mtu is exceeded, it is very
hard to make use of this option in already deployed name server software
for which I introduced this option.
This patch adds yet another new IP_MTU_DISCOVER option to not honor any
path mtu information and not accepting new icmp notifications destined for
the socket this option is enabled on. But we allow outgoing fragmentation
in case the packet size exceeds the outgoing interface mtu.
As such this new option can be used as a drop-in replacement for
IP_PMTUDISC_DONT, which is currently in use by most name server software
making the adoption of this option very smooth and easy.
The original advantage of IP_PMTUDISC_INTERFACE is still maintained:
ignoring incoming path MTU updates and not honoring discovered path MTUs
in the output path.
Fixes: 482fc6094afad5 ("ipv4: introduce new IP_MTU_DISCOVER mode IP_PMTUDISC_INTERFACE")
Cc: Florian Weimer <fweimer@redhat.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hannes Frederic Sowa [Wed, 26 Feb 2014 00:20:41 +0000 (01:20 +0100)]
ipv4: use ip_skb_dst_mtu to determine mtu in ip_fragment
ip_skb_dst_mtu mostly falls back to ip_dst_mtu_maybe_forward if no socket
is attached to the skb (in case of forwarding) or determines the mtu like
we do in ip_finish_output, which actually checks if we should branch to
ip_fragment. Thus use the same function to determine the mtu here, too.
This is important for the introduction of IP_PMTUDISC_OMIT, where we
want the packets getting cut in pieces of the size of the outgoing
interface mtu. IPv6 already does this correctly.
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Timo Teräs [Wed, 26 Feb 2014 09:43:04 +0000 (11:43 +0200)]
neigh: probe application via netlink in NUD_PROBE
iproute2 arpd seems to expect this as there's code and comments
to handle netlink probes with NUD_PROBE set. It is used to flush
the arpd cached mappings.
opennhrp instead turns off unicast probes (so it can handle all
neighbour discovery). Without this change it will not see NUD_PROBE
probes and cannot reconfirm the mapping. Thus currently neigh entry
will just fail and can cause few packets dropped until broadcast
discovery is restarted.
Earlier discussion on the subject:
http://marc.info/?t=
139305877100001&r=1&w=2
Signed-off-by: Timo Teräs <timo.teras@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jean Sacren [Wed, 26 Feb 2014 05:38:29 +0000 (22:38 -0700)]
ieee802154: fix new function declaration
The commit
8fad346f366a7 ("
eee802154: add basic support for RF212 to
at86rf230 driver") introduced the new function is_rf212() with some
minor issues in declaration:
1) Fix the function type by changing it to bool as the function
definition returns a boolean value. Additionally both callers of
is_rf212() are expected to return a boolean value.
2) Fix the function specifier by deleting the inline keyword as the
compiler takes care of that.
Signed-off-by: Jean Sacren <sakiwit@gmail.com>
Cc: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bjørn Mork [Tue, 25 Feb 2014 20:11:02 +0000 (21:11 +0100)]
ipv6: log src and dst along with "udp checksum is 0"
These info messages are rather pointless without any means to identify
the source of the bogus packets. Logging the src and dst addresses and
ports may help a bit.
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pablo Neira Ayuso [Tue, 25 Feb 2014 16:46:10 +0000 (17:46 +0100)]
vxlan: remove unused port variable in vxlan_udp_encap_recv()
Signed-off-by: Pablo Neira Ayuso <pablo@gnumonks.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 26 Feb 2014 20:38:18 +0000 (15:38 -0500)]
Merge branch 'mlx4'
Amir Vadai says:
====================
net, net/mlx4: Add sysfs file for port number
Modern distro's are using biosdevname to rename interface to a name based on
slot/port number.
biosdevname can't get the port number of devices that have multiple ports that
share the same PCI function.
This patch adds a sysfs file under: /sys/devices/.../net/<interface>/dev_port,
that contains the port number (0 based) - to be used by biosdevname.
Also, dev_id was wrongly used in mlx4_en driver - added a patch that fix it.
This patch was tested and applied over commit
51adfcc "net: bcmgenet: remove
unused bh_lock member"
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Amir Vadai [Tue, 25 Feb 2014 16:17:52 +0000 (18:17 +0200)]
net/mlx4_en: Fix bad use of dev_id
dev_id should be set for multiple netdev's sharing the same MAC, which
is not the case here.
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Amir Vadai [Tue, 25 Feb 2014 16:17:51 +0000 (18:17 +0200)]
net/mlx4_en: Expose port number through sysfs
Initialize dev_port with port number (0 based) to be accessed through
sysfs from user space.
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Amir Vadai [Tue, 25 Feb 2014 16:17:50 +0000 (18:17 +0200)]
net: Add sysfs file for port number
Add a sysfs file to enable user space to query the device
port number used by a netdevice instance. This is needed for
devices that have multiple ports on the same PCI function.
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 26 Feb 2014 20:28:08 +0000 (15:28 -0500)]
Merge branch 'bnx2x'
Michal Schmidt says:
====================
bnx2x: minimize RAM usage in kdump
kdump kernels usually have only a small amount of memory reserved.
bnx2x can be memory-hungry. Let's minimize its memory usage when
running in kdump.
I detect kdump by looking at the "reset_devices" flag. A couple of
storage drivers (cciss, hpsa) use it for the same purpose. I am not sure
this is the best way to solve the problem, but it works.
Should it be made more generic by, say, looking at the total amount
of lowmem instead? Not using TPA by default when lowmem is small and/or
defaulting to fewer queues would help 32bit systems where a driver for
a multi-function multi-queue NIC can consume a significant amount
of available memory. Or do we want no such heuristics?
Is this something to consider doing for other network drivers too?
====================
Acked-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michal Schmidt [Tue, 25 Feb 2014 15:04:26 +0000 (16:04 +0100)]
bnx2x: save RAM in kdump kernel by disabling TPA
When running in a kdump kernel, disable TPA. This saves memory, which
tends to be scarce in kdump.
TPA, being a receive acceleration, is unlikely to be useful for kdump,
whose purpose is to send the memory image out.
This saves additional 5 MB in the kdump environment.
Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michal Schmidt [Tue, 25 Feb 2014 15:04:25 +0000 (16:04 +0100)]
bnx2x: save RAM in kdump kernel by using a single queue
When running in a kdump kernel, make sure to use only a single ethernet
queue even if a num_queues option in /etc/modprobe.d/*.conf would specify
otherwise. This saves memory, which tends to be scarce in kdump.
This saves about 40 MB in the kdump environment on a setup with
num_queues=8 in the config file.
Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michal Schmidt [Tue, 25 Feb 2014 15:04:24 +0000 (16:04 +0100)]
bnx2x: clamp num_queues to prevent passing a negative value
Use the clamp() macro to make the calculation of the number of queues
slightly easier to understand. It also avoids a crash when someone
accidentally passes a negative value in num_queues= module parameter.
Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Westphal [Tue, 25 Feb 2014 13:34:32 +0000 (14:34 +0100)]
net: tcp: add mib counters to track zero window transitions
Three counters are added:
- one to track when we went from non-zero to zero window
- one to track the reverse
- one counter incremented when we want to announce zero window,
but can't because we would shrink current window.
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Neil Jerram [Tue, 25 Feb 2014 11:17:25 +0000 (11:17 +0000)]
net: order MPLS ethertypes numerically
All ethertypes other than ETH_P_MPLS_UC, ETH_P_MPLS_MC and
ETH_P_ATMMPOA were already ordered numerically. This commit moves
those three ETH_P_... values into correct numerical order too.
Signed-off-by: Neil Jerram <Neil.Jerram@metaswitch.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Joe Perches [Thu, 20 Feb 2014 21:25:51 +0000 (13:25 -0800)]
bnx2x: Remove hidden flow control goto from BNX2X_ALLOC macros
BNX2X_ALLOC macros use "goto alloc_mem_err"
so these labels appear unused in some functions.
Expand these macros in-place via coccinelle and
some typing.
Update the macros to use statement expressions
and remove the BNX2X_ALLOC macro.
This adds some > 80 char lines.
$ cat bnx2x_pci_alloc.cocci
@@
expression e1;
expression e2;
expression e3;
@@
- BNX2X_PCI_ALLOC(e1, e2, e3);
+ e1 = BNX2X_PCI_ALLOC(e2, e3); if (!e1) goto alloc_mem_err;
@@
expression e1;
expression e2;
expression e3;
@@
- BNX2X_PCI_FALLOC(e1, e2, e3);
+ e1 = BNX2X_PCI_FALLOC(e2, e3); if (!e1) goto alloc_mem_err;
@@
expression e1;
expression e2;
@@
- BNX2X_ALLOC(e1, e2);
+ e1 = kzalloc(e2, GFP_KERNEL); if (!e1) goto alloc_mem_err;
@@
expression e1;
expression e2;
expression e3;
@@
- kzalloc(sizeof(e1) * e2, e3)
+ kcalloc(e2, sizeof(e1), e3)
@@
expression e1;
expression e2;
expression e3;
@@
- kzalloc(e1 * sizeof(e2), e3)
+ kcalloc(e1, sizeof(e2), e3)
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Tue, 25 Feb 2014 00:56:13 +0000 (16:56 -0800)]
net: bcmgenet: remove unused bh_lock member
bh_lock spinlock is unused, remove it from the private driver structure.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Tue, 25 Feb 2014 00:56:12 +0000 (16:56 -0800)]
net: bcmgenet: remove commented code in bcmgenet_xmit()
This code is commented since it is unused, left-over from the very first
time this driver was merged.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Tue, 25 Feb 2014 00:56:11 +0000 (16:56 -0800)]
net: bcmgenet: drop checks on priv->phydev
Drop all the checks on priv->phydev since we will refuse probing the
driver if we cannot attach to a PHY device. Drop all checks on
priv->phydev. This also fixes some smatch issues reported by Dan
Carpenter.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 25 Feb 2014 00:38:53 +0000 (19:38 -0500)]
Merge branch 'gianfar'
Claudiu Manoil says:
====================
gianfar: Device reset and reconfig fixes
These patches end up fixing some notable device reset & reconfig
related problems. One issue is on-the-fly (Rx/Tx on) programming
of interrupt coalescing (IC) registers on the processing path,
against HW recommendation. This is an old issue that became visible
after BQL introduction, as under certain conditions (low traffic)
one TX interrupt gets lost and BQL fires Tx timeout as a result.
Another notable issue is a race on the Tx path (xmit, clean_tx)
during device reset (i.e. during Tx timeout watchdog firing)
that leads to NULL access.
Fixing the problematic on-thy-fly register writes (i.e. the IC regs)
required the implementation of a MAC soft reset procedure.
The race leading to NULL access was addressed by fixing the
stop_gfar()/startup_gfar() pair (disable/enable napi a.s.o.)
and adding the device state DOWN to sync with the TX path.
v2: Refactored if() clauses from gfar_set_features(), PATCH 2.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Claudiu Manoil [Mon, 24 Feb 2014 10:13:46 +0000 (12:13 +0200)]
gianfar: Fix Tx int miss, dont write IC on-the-fly
Programming the interrupt coalescing (IC) registers while
the controller/DMA is on may incur the loss of one Tx
confirmation interrupt, under certain conditions. This is
a subtle hw race because it does not occur during a burst
of Tx packets. It has been observed on p2020 devices that,
if just one packet is being xmit'ed, the Tx confirmation
doesn't trigger and BQL evetually blocks the Tx queues,
followed by Tx timeout and an un-responsive device.
This issue was not apparent prior to introducing BQL
support, as a late Tx confirmation was not an issue back then
and the next burst of Tx frames would have triggered the
Tx confirmation/ Tx ring cleanup anyway.
Bottom line, the hw specifications state that the IC registers
should not be programmed while the Rx/Tx blocks (the DMA) are
enabled. Further more, these registers are currently re-written
with the same values on the processing path, over and over again.
To fix this, rewriting the IC registers has been removed from
the processing path (napi poll). A complete MAC reset procedure
has been implemented for the ethtool -c option instead, to
reliably update these registers while the controller is stopped.
Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Claudiu Manoil [Mon, 24 Feb 2014 10:13:45 +0000 (12:13 +0200)]
gianfar: Fix device reset races (oops) for Tx
The device reset procedure, stop_gfar()/startup_gfar(), has
concurrency issues.
"Kernel access of bad area" oopses show up during Tx timeout
device reset or other reset cases (like changing MTU) that
happen while the interface still has traffic. The oopses
happen in start_xmit and clean_tx_ring when accessing tx_queue->
tx_skbuff which is NULL. The race comes from de-allocating the
tx_skbuff while transmission and napi processing are still
active. Though the Tx queues get temoprarily stopped when Tx
timeout occurs, they get re-enabled as a result of Tx congestion
handling inside the napi context (see clean_tx_ring()). Not
disabling the napi during reset is also a bug, because
clean_tx_ring() will try to access tx_skbuff while it is being
de-alloc'ed and re-alloc'ed.
To fix this, stop_gfar() needs to disable napi processing
after stopping the Tx queues. However, in order to prevent
clean_tx_ring() to re-enable the Tx queue before the napi
gets disabled, the device state DOWN has been introduced.
It prevents the Tx congestion management from re-enabling the
de-congested Tx queue while the device is brought down.
An additional locking state, RESETTING, has been introduced
to prevent simultaneous resets or to prevent configuring the
device while it is resetting.
The bogus 'rxlock's (for each Rx queue) have been removed since
their purpose is not justified, as they don't prevent nor are
suited to prevent device reset/reconfig races (such as this one).
Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Claudiu Manoil [Mon, 24 Feb 2014 10:13:44 +0000 (12:13 +0200)]
gianfar: Don't free/request irqs on device reset
Resetting the device (stop_gfar()/startup_gfar()) should
be fast and to the point, in order to timely recover
from an error condition (like Tx timeout) or during
device reconfig. The irq free/ request routines are just
redundant here, and they should be part of the device
close/ open routines instead.
Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Claudiu Manoil [Mon, 24 Feb 2014 10:13:43 +0000 (12:13 +0200)]
gianfar: Fix on-the-fly vlan and mtu updates
The RCTRL and TCTRL registers should not be changed
on-the-fly, while the controller is running, otherwise
unexpected behaviour occurs. But that's exactly what
gfar_vlan_mode() does, updating the VLAN acceleration
bits inside RCTRL/TCTRL. The attempt to lock these
operations doesn't help, but only adds to the confusion.
There's also a dependency for Rx FCB insertion (activating
/de-activating the TOE offload block on Rx) which might
change the required rx buffer size. This makes matters
worse as gfar_vlan_mode() ends up calling gfar_change_mtu(),
though the MTU size remains the same. Note that there are
other situations that may affect the required rx buffer size,
like changing RXCSUM or rx hw timestamping, but errorneously
the rx buffer size is not recomputed/ updated in the process.
To fix this, do the vlan updates properly inside the MAC
reset and reconfiguration procedure, which takes care of
the rx buffer size dependecy and the rx TOE block (PRSDEP)
activation/deactivation as well (in the correct order).
As a consequence, MTU/ rx buff size updates are done now
by the same MAC reset and reconfig procedure, so that out
of context updates to MAXFRM, MRBLR, and MACCFG inside
change_mtu() are no longer needed. The rx buffer size
dependecy to Rx FCB is now handled for the other cases too
(RXCSUM and rx hw timestamping).
Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Claudiu Manoil [Mon, 24 Feb 2014 10:13:42 +0000 (12:13 +0200)]
gianfar: Implement MAC reset and reconfig procedure
The main MAC config registers like: RCTRL/TCTRL, MRBLR,
MAXFRM, RXIC/TXIC, most fields of MACCFG1/2, should not
be changed on-the-fly, but at least after stopping the
DMA and disabling the Rx/Tx blocks and, for increased
reliability, after a MAC soft reset.
Impelement a complete MAC soft reset and reconfig procedure
following the latest HW advisories - gfar_mac_reset() - to
replace gfar_mac_init() and (the confusing) init_registers()
functions.
Factor out separate config functions for RCTRL and TCTRL,
insure programming order of the relevant config regs after
MAC soft reset.
Split gfar_hw_init() into gfar_mac_reset() and the remaining
global regs that don't need to be reconfigured after MAC soft
reset (FIFOCFG, ATTRELI, HW counters a.s.o).
As gfar_hw_init() now makes all the register writes @probe()
time, based on all the device flags and config options, it
must be moved further down, just before register_netdev(),
as the last config step when the config values are comitted
to HW. Also, move netif_carrier_off() after register_netdev(),
because it has no effect if called before.
Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 25 Feb 2014 00:33:27 +0000 (19:33 -0500)]
bcmgenet: Deleted unnecessary select_queue method.
Signed-off-by: David S. Miller <davem@davemloft.net>
Fabio Estevam [Mon, 24 Feb 2014 03:47:24 +0000 (00:47 -0300)]
net: bcmgenet: Use devm_ioremap_resource()
According to Documentation/driver-model/devres.txt, devm_request_and_ioremap()
is deprecated, so use devm_ioremap_resource() instead.
Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Joe Perches [Sun, 23 Feb 2014 08:05:26 +0000 (00:05 -0800)]
bridge: netfilter: Use ether_addr_copy
Convert the uses of memcpy to ether_addr_copy because
for some architectures it is smaller and faster.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Joe Perches [Sun, 23 Feb 2014 08:05:25 +0000 (00:05 -0800)]
bridge: Use ether_addr_copy and ETH_ALEN
Convert the more obvious uses of memcpy to ether_addr_copy.
There are still uses of memcpy that could be converted but
these addresses are __aligned(2).
Convert a couple uses of 6 in gr_private.h to ETH_ALEN.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Hutchings [Sun, 23 Feb 2014 00:03:24 +0000 (00:03 +0000)]
cgxb4: Stop using ethtool SPEED_* constants
ethtool speed values are just numbers of megabits and there is no need
to add SPEED_40000. To be consistent, use integer constants directly
for all speeds.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Sat, 22 Feb 2014 17:37:53 +0000 (18:37 +0100)]
tools: bpf_dbg: various misc code cleanups
Lets clean up bpf_dbg a bit and improve its code slightly
in various areas: i) Get rid of some macros as there's no
good reason for keeping them, ii) remove one unused variable
and reduce scope of various variables found by cppcheck,
iii) Close non-default file descriptors when exiting the shell.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Sat, 22 Feb 2014 13:01:53 +0000 (14:01 +0100)]
loopback: sctp: add NETIF_F_SCTP_CSUM to device features
Drivers are allowed to set NETIF_F_SCTP_CSUM if they have
hardware crc32c checksumming support for the SCTP protocol.
Currently, NETIF_F_SCTP_CSUM flag is available in igb,
ixgbe, i40e/i40evf drivers and for vlan devices.
If we don't have NETIF_F_SCTP_CSUM then crc32c is done
through CPU instructions, invoked from crypto layer, or
if not available as slow-path fallback in software.
Currently, loopback device propagates checksum offloading
feature flags in dev->features, but is missing SCTP checksum
offloading. Therefore, account for NETIF_F_SCTP_CSUM as
well.
Before patch:
./netperf_sctp -H 192.168.0.100 -t SCTP_STREAM_MANY
SCTP 1-TO-MANY STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.0.100 () port 0 AF_INET
Recv Send Send
Socket Socket Message Elapsed
Size Size Size Time Throughput
bytes bytes bytes secs. 10^6bits/sec
4194304 4194304 4096 10.00 4683.50
After patch:
./netperf_sctp -H 192.168.0.100 -t SCTP_STREAM_MANY
SCTP 1-TO-MANY STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.0.100 () port 0 AF_INET
Recv Send Send
Socket Socket Message Elapsed
Size Size Size Time Throughput
bytes bytes bytes secs. 10^6bits/sec
4194304 4194304 4096 10.00 15348.26
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mathias Krause [Fri, 21 Feb 2014 20:38:36 +0000 (21:38 +0100)]
pktgen: document all supported flags
The documentation misses a few of the supported flags. Fix this. Also
respect the dependency to CONFIG_XFRM for the IPSEC flag.
Cc: Fan Du <fan.du@windriver.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mathias Krause [Fri, 21 Feb 2014 20:38:35 +0000 (21:38 +0100)]
pktgen: simplify error handling in pgctrl_write()
The 'out' label is just a relict from previous times as pgctrl_write()
had multiple error paths. Get rid of it and simply return right away
on errors.
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mathias Krause [Fri, 21 Feb 2014 20:38:34 +0000 (21:38 +0100)]
pktgen: fix out-of-bounds access in pgctrl_write()
If a privileged user writes an empty string to /proc/net/pktgen/pgctrl
the code for stripping the (then non-existent) '\n' actually writes the
zero byte at index -1 of data[]. The then still uninitialized array will
very likely fail the command matching tests and the pr_warning() at the
end will therefore leak stack bytes to the kernel log.
Fix those issues by simply ensuring we're passed a non-empty string as
the user API apparently expects a trailing '\n' for all commands.
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 24 Feb 2014 23:44:05 +0000 (18:44 -0500)]
Merge branch 'qlcnic-next'
Shahed Shaikh says:
====================
qlcnic: Re-factoring and enhancements
This patch series includes following changes -
* Re-factored firmware minidump template header handling
* Support to make 8 vNIC mode application to work with 16 vNIC mode
* Enhance error message logging when adapter is in failed state and
when adapter lock access fails.
* Allow vlan0 traffic
* update MAINTAINERS
Please apply this series to net-next.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Shahed Shaikh [Fri, 21 Feb 2014 18:20:16 +0000 (13:20 -0500)]
Update MAINTAINERS for qlcnic driver
Keep myself as only maintainer for qlcnic driver and update
group email alias to Dept-HSGLinuxNICDev@qlogic.com
Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Shahed Shaikh [Fri, 21 Feb 2014 18:20:15 +0000 (13:20 -0500)]
qlcnic: Update version to 5.3.56
Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Harish Patil [Fri, 21 Feb 2014 18:20:14 +0000 (13:20 -0500)]
qlcnic: Enhance semaphore lock access failure error message
Signed-off-by: Harish Patil <harish.patil@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rajesh Borundia [Fri, 21 Feb 2014 18:20:13 +0000 (13:20 -0500)]
qlcnic: Allow vlan0 traffic
o Adapter allows vlan0 traffic in case of SR-IOV after setting
QLC_SRIOV_ALLOW_VLAN0 bit even though we do not add vlan0 filters.
Signed-off-by: Rajesh Borundia <rajesh.borundia@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sucheta Chakraborty [Fri, 21 Feb 2014 18:20:12 +0000 (13:20 -0500)]
qlcnic: Enhance driver message in failed state.
Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jitendra Kalsaria [Fri, 21 Feb 2014 18:20:11 +0000 (13:20 -0500)]
qlcnic: Updates to QLogic application/driver interface for virtual NIC configuration
Qlogic application interface in the driver which has larger than 8 vNIC
configuration support has been updated to handle the following cases:
o Only 8 or lower total vNICs were enabled within the vNIC 0-7 range
o vNICs were enabled in the vNIC 0-15 range such that enabled vNICs were
not contiguous and only 8 or lower number of total VNICs were enabled
o Disconnect in the vNIC mapping between application and driver when the
enabled VNICs were dis contiguous
Signed-off-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Shahed Shaikh [Fri, 21 Feb 2014 18:20:10 +0000 (13:20 -0500)]
qlcnic: Re-factor firmware minidump template header handling
Treat firmware minidump template headers for 82xx and 83xx/84xx adapters separately,
as it may change for 82xx and 83xx/84xx adapter type independently.
Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 24 Feb 2014 23:38:27 +0000 (18:38 -0500)]
Merge branch 'mlx4'
Amir Vadai says:
====================
net/mlx4: Mellanox driver update 01-01-2014
This small patchset has a fix to a bogus usage of
netif_get_num_default_rss_queues() in mlx4_en driver.
Changes from V1:
- Removed affinity_hint patch, to make it a generic instead of mlx specific
Changes from V0:
- Instead of reverting the netif_get_num_default_rss_queues() in mlx4_en,
fixing it to limit the actual number of receive queues instead of limiting
the number of IRQ's.
Patchset was applied and tested against commit:
cb6e926 "ipv6:fix checkpatch
errors with assignment in if condition"
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Shamay [Fri, 21 Feb 2014 10:39:18 +0000 (12:39 +0200)]
net/mlx4: Fix limiting number of IRQ's instead of RSS queues
This fix a performance bug introduced by commit
90b1ebe "mlx4: set
maximal number of default RSS queues", which limits the numbers of IRQs
opened by core module.
The limit should be on the number of queues in the indirection table -
rx_rings, and not on the number of IRQ's. Also, limiting on mlx4_core
initialization instead of in mlx4_en, prevented using "ethtool -L" to
utilize all the CPU's, when performance mode is prefered, since limiting
this number to 8 reduces overall packet rate by 15%-50% in multiple TCP
streams applications.
For example, after running ethtool -L <ethx> rx 16
Packet rate
Before the fix 897799
After the fix
1142070
Results were obtained using netperf:
S=200 ; ( for i in $(seq 1 $S) ; do ( \
netperf -H 11.7.13.55 -t TCP_RR -l 30 &) ; \
wait ; done | grep "1 1" | awk '{SUM+=$6} END {print SUM}' )
CC: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Shamay [Fri, 21 Feb 2014 10:39:17 +0000 (12:39 +0200)]
net/mlx4: Set number of RX rings in a utility function
mlx4_en_add() is too long.
Moving set number of RX rings to a utiltity function to improve
readability and modulization of the code.
Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
dingtianhong [Fri, 21 Feb 2014 08:08:54 +0000 (16:08 +0800)]
bonding: remove no longer needed lock for bond_xxx_info_query()
The bond_xxx_info_query() was already in RTNL, so no need to use
bond lock to protect the bond slave list, so remove it.
Cc: Jay Vosburgh <fubar@us.ibm.com>
Cc: Veaceslav Falico <vfalico@redhat.com>
Cc: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
dingtianhong [Fri, 21 Feb 2014 08:08:53 +0000 (16:08 +0800)]
bonding: use rcu_dereference() to access curr_active_slave
The bond_info_show_master already in RCU read-side critical section,
and the we access curr_active_slave without the curr_slave_lock, we
could not sure whether the curr_active_slave will be changed during
the processing, so use RCU to protected the pointer.
Cc: Jay Vosburgh <fubar@us.ibm.com>
Cc: Veaceslav Falico <vfalico@redhat.com>
Cc: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
dingtianhong [Fri, 21 Feb 2014 08:08:52 +0000 (16:08 +0800)]
bonding: netpoll: remove unwanted slave_dev_support_netpoll()
The __netpoll_setup() will check the slave's flag and ndo_poll_controller just
like the slave_dev_support_netpoll() does, and slave_dev_support_netpoll() was
not used by any place, so remove it.
Cc: Jay Vosburgh <fubar@us.ibm.com>
Cc: Veaceslav Falico <vfalico@redhat.com>
Cc: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 24 Feb 2014 23:13:33 +0000 (18:13 -0500)]
Merge branch 'master' of git://git./linux/kernel/git/klassert/ipsec-next
Steffen Klassert says:
====================
1) Introduce skb_to_sgvec_nomark function to add further data to the sg list
without calling sg_unmark_end first. Needed to add extended sequence
number informations. From Fan Du.
2) Add IPsec extended sequence numbers support to the Authentication Header
protocol for ipv4 and ipv6. From Fan Du.
3) Make the IPsec flowcache namespace aware, from Fan Du.
4) Avoid creating temporary SA for every packet when no key manager is
registered. From Horia Geanta.
5) Support filtering of SA dumps to show only the SAs that match a
given filter. From Nicolas Dichtel.
6) Remove caching of xfrm_policy_sk_bundles. The cached socket policy bundles
are never used, instead we create a new cache entry whenever xfrm_lookup()
is called on a socket policy. Most protocols cache the used routes to the
socket, so this caching is not needed.
7) Fix a forgotten SADB_X_EXT_FILTER length check in pfkey, from Nicolas
Dichtel.
8) Cleanup error handling of xfrm_state_clone.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 21 Feb 2014 17:39:23 +0000 (12:39 -0500)]
Merge branch 'i40evf'
Aaron Brown says:
====================
Intel Wired LAN Driver Updates
This series contains updates to i40e and (mostly to) i40evf.
Mitch provides most the work for this series. For the vf driver he
requests a reset on a tx hang, removes vlan filtes on close since we
already remove the MAC filters, fixes some crashes, gets rid of PCI DAC
as it does not mean much on virtualized PCIe parts, skips assigning the
device name that just gets renamed anyway, stores the descriptor ring
size in a manner that allows the use of common tx and rx code with the
PF driver and makes a handful of cosmetic fixes. For i40e he removes
a delay left over from debugging and changes a do/while loop to a for
loop to avoid hitting another delay each time.
Catherine fixes inconsistent MSI and MSI-X messages and bumps the
driver version.
v2: Removed unnecessary periods and redundant OOM message.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Catherine Sullivan [Fri, 21 Feb 2014 03:29:18 +0000 (19:29 -0800)]
i40e and i40evf: Bump driver versions
Update the driver versions.
Change-ID: I3fe23024d17da0e614ce126edb365bb2c428d482
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Catherine Sullivan [Fri, 21 Feb 2014 03:29:17 +0000 (19:29 -0800)]
i40e: Change MSIX to MSI-X
Fix inconsistent use of MSIX and MSI-X in messages.
Change-ID: Iae9ffb42819677c34544719044ed77632e06147d
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mitch Williams [Fri, 21 Feb 2014 03:29:16 +0000 (19:29 -0800)]
i40e: tighten up ring enable/disable flow
Change the do/while to a for loop, so we don't hit the delay each
time, even when the register is ready for action.
Don't bother to set or clear the QENA_STAT bit as it is
read-only.
Change-ID: Ie464718804dd79f6d726f291caa9b0c872b49978
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mitch Williams [Fri, 21 Feb 2014 03:29:15 +0000 (19:29 -0800)]
i40e: remove unnecessary delay
Ain't nothing gonna break my stride, nobody's gonna slow me down,
oh no. I got to keep on moving.
This was originally put in for debugging just-in-case purposes
and never removed.
Change-ID: Ic12c2e179c3923f54e6ba0a9e4ab05d25c3bab29
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mitch A Williams [Fri, 21 Feb 2014 03:29:14 +0000 (19:29 -0800)]
i40evf: remove errant space
Remove a bogus space.
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mitch Williams [Fri, 21 Feb 2014 03:29:13 +0000 (19:29 -0800)]
i40evf: update version and copyright date
A bunch of changes merit a new version number, and since these were
made in the new year, update the copyright date.
Change-ID: Ic3f282bf0c20679b9fb06860211afa7c78055bc2
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mitch Williams [Fri, 21 Feb 2014 03:29:12 +0000 (19:29 -0800)]
i40evf: store ring size in ring structs
Keep the descriptor ring size in the actual ring structs instead of in
the adapter struct. This enables us to use common tx and rx code with
the i40e PF driver.
Also update copyrights.
Change-ID: I2861e599b2b4c76441c062ea14400f4750f54d0e
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mitch Williams [Fri, 21 Feb 2014 03:29:11 +0000 (19:29 -0800)]
i40evf: don't guess device name
We don't need to set an interface name here; the net core will do
that, and then it will get renamed by udev anyway.
Change-ID: I839a17837d19bedd1f490bff32ac5b85b4bfd97f
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mitch Williams [Fri, 21 Feb 2014 03:29:10 +0000 (19:29 -0800)]
i40evf: remove bogus comment
This comment is simply not true.
Change-ID: If006b02b60984601a24257a951ae873dff568008
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mitch Williams [Fri, 21 Feb 2014 03:29:09 +0000 (19:29 -0800)]
i40evf: fix up strings in init task
Make sure errors are reported at the correct log level, quit printing
the function name every time, and make the messages more consistent in
format.
v2: Removed unnecessary periods and redundant OOM message.
Change-ID: I50e443467519ad3850def131d84626c50612c611
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mitch Williams [Fri, 21 Feb 2014 03:29:08 +0000 (19:29 -0800)]
i40evf: get rid of pci_using_dac
PCI DAC doesn't really mean much on a virtualized PCI Express part, so
get rid of that check and just always set the HIGHDMA flag in the net
device.
Change-ID: I2040272be0e7934323f470c2bc73fbdd4f93e2b6
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mitch Williams [Fri, 21 Feb 2014 03:29:07 +0000 (19:29 -0800)]
i40evf: fix multiple crashes on remove
Depending upon the state of the driver, there are several potential
pitfalls on remove. Kill the watchdog task so rmmod doesn't hang.
Check the adapter->msix_entries field, not the num_msix_vectors field,
which is never cleared.
Change-ID: I0546048477f09fc19e481bd37efa30daae4faa88
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mitch Williams [Fri, 21 Feb 2014 03:29:06 +0000 (19:29 -0800)]
i40evf: remove VLAN filters on close
We remove all the MAC filters, so remove the VLAN filters, too.
Change-ID: I4f7559acdf005dc3f359bf6460ce32d183c8878b
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mitch Williams [Fri, 21 Feb 2014 03:29:05 +0000 (19:29 -0800)]
i40evf: request reset on tx hang
If the kernel watchdog bites us, ask the PF to reset us and attempt to
reinit the driver.
Change-ID: Ic97665aeeed71ce712b9c4f057e78ff8372522b9
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Steffen Klassert [Wed, 19 Feb 2014 12:33:24 +0000 (13:33 +0100)]
xfrm: Cleanup error handling of xfrm_state_clone
The error pointer passed to xfrm_state_clone() is unchecked,
so remove it and indicate an error by returning a null pointer.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Nicolas Dichtel [Thu, 20 Feb 2014 13:52:41 +0000 (14:52 +0100)]
pfkey: fix SADB_X_EXT_FILTER length check
This patch fixes commit
d3623099d350 ("ipsec: add support of limited SA dump").
sadb_ext_min_len array should be updated with the new type (SADB_X_EXT_FILTER).
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
David S. Miller [Thu, 20 Feb 2014 21:55:49 +0000 (16:55 -0500)]
Merge branch 'for-davem' of git://git./linux/kernel/git/linville/wireless-next
John W. Linville says:
====================
Please pull this batch of wireless updates intended for the 3.15
stream!
For the mac80211 bits, Johannes says:
"We have some cleanups and minor fixes as well as userspace API
improvements from a lot of people, extended VHT support for radiotap
from Emmanuel, CSA improvements from Andrei, Luca and Michal. I've also
included my work on hwsim to make dynamic registration of radios
possible."
Along with that, we get the usual round of updates to ath9k,
brcmfmac, mwifiex, wcn36xx, and the ti drivers -- nothing particularly
noteworthy, mostly just random updates and refactoring.
Also included is a pull of the wireless tree, intended to resolve
some potential merge issues.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
John W. Linville [Thu, 20 Feb 2014 20:02:02 +0000 (15:02 -0500)]
Merge branch 'master' of git://git./linux/kernel/git/linville/wireless-next into for-davem
Veaceslav Falico [Thu, 20 Feb 2014 11:07:57 +0000 (12:07 +0100)]
bonding: fix bond_arp_rcv() race of curr_active_slave
bond->curr_active_slave can be changed between its deferences, even to
NULL, and thus we might panic.
We're always holding the rcu (rx_handler->bond_handle_frame()->bond_arp_rcv())
so fix this by rcu_dereferencing() it and using the saved.
Reported-by: Ding Tianhong <dingtianhong@huawei.com>
Fixes: aeea64a ("bonding: don't trust arp requests unless active slave really works")
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Acked-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Haiyang Zhang [Wed, 19 Feb 2014 23:49:45 +0000 (15:49 -0800)]
hyperv: Add latest NetVSP versions to auto negotiation
It auto negotiates the highest NetVSP version supported by both guest and host.
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Westphal [Wed, 19 Feb 2014 11:51:10 +0000 (12:51 +0100)]
tcp: use zero-window when free_space is low
Currently the kernel tries to announce a zero window when free_space
is below the current receiver mss estimate.
When a sender is transmitting small packets and reader consumes data
slowly (or not at all), receiver might be unable to shrink the receive
win because
a) we cannot withdraw already-commited receive window, and,
b) we have to round the current rwin up to a multiple of the wscale
factor, else we would shrink the current window.
This causes the receive buffer to fill up until the rmem limit is hit.
When this happens, we start dropping packets.
Moreover, tcp_clamp_window may continue to grow sk_rcvbuf towards rmem[2]
even if socket is not being read from.
As we cannot avoid the "current_win is rounded up to multiple of mss"
issue [we would violate a) above] at least try to prevent the receive buf
growth towards tcp_rmem[2] limit by attempting to move to zero-window
announcement when free_space becomes less than 1/16 of the current
allowed receive buffer maximum. If tcp_rmem[2] is large, this will
increase our chances to get a zero-window announcement out in time.
Reproducer:
On server:
$ nc -l -p 12345
<suspend it: CTRL-Z>
Client:
#!/usr/bin/env python
import socket
import time
sock = socket.socket()
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
sock.connect(("192.168.4.1", 12345));
while True:
sock.send('A' * 23)
time.sleep(0.005)
socket buffer on server-side will grow until tcp_rmem[2] is hit,
at which point the client rexmits data until -EDTIMEOUT:
tcp_data_queue invokes tcp_try_rmem_schedule which will call
tcp_prune_queue which calls tcp_clamp_window(). And that function will
grow sk->sk_rcvbuf up until it eventually hits tcp_rmem[2].
Thanks to Eric Dumazet for running regression tests.
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Tested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Erik Hugne [Wed, 19 Feb 2014 07:37:58 +0000 (08:37 +0100)]
tipc: failed transmissions should return error
When a message could not be sent out because the destination node
or link could not be found, the full message size is returned from
sendmsg() as if it had been sent successfully. An application will
then get a false indication that it's making forward progress. This
problem has existed since the initial commit in 2.6.16.
We change this to return -ENETUNREACH if the message cannot be
delivered due to the destination node/link being unavailable. We
also get rid of the redundant tipc_reject_msg call since freeing
the buffer and doing a tipc_port_iovec_reject accomplishes exactly
the same thing.
Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daeseok Youn [Wed, 19 Feb 2014 01:49:12 +0000 (10:49 +0900)]
atm: solos-pci: make solos_bh() as static
sparse says:
drivers/atm/solos-pci.c:763:6: warning:
symbol 'solos_bh' was not declared. Should it be static?
Signed-off-by: Daeseok Youn <daeseok.youn@gmail.com>
Acked-by: Chas Williams <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daeseok Youn [Wed, 19 Feb 2014 01:10:11 +0000 (10:10 +0900)]
atm: nicstar: use NULL instead of 0 for pointer
sparse says:
drivers/atm/nicstar.c:642:27: warning:
Using plain integer as NULL pointer
drivers/atm/nicstar.c:644:27:
warning: Using plain integer as NULL pointer
drivers/atm/nicstar.c:982:51:
warning: Using plain integer as NULL pointer
drivers/atm/nicstar.c:996:51:
warning: Using plain integer as NULL pointer
Signed-off-by: Daeseok Youn <daeseok.youn@gmail.com>
Acked-by: Chas Williams <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daeseok Youn [Wed, 19 Feb 2014 01:35:41 +0000 (10:35 +0900)]
atm: ambassador: use NULL instead of 0 for pointer
sparse says:
drivers/atm/ambassador.c:1928:24: warning:
Using plain integer as NULL pointer
Signed-off-by: Daeseok Youn <daeseok.youn@gmail.com>
Acked-by: Chas Williams <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hannes Frederic Sowa [Tue, 18 Feb 2014 20:38:08 +0000 (21:38 +0100)]
ipv6: honor IPV6_PKTINFO with v4 mapped addresses on sendmsg
In case we decide in udp6_sendmsg to send the packet down the ipv4
udp_sendmsg path because the destination is either of family AF_INET or
the destination is an ipv4 mapped ipv6 address, we don't honor the
maybe specified ipv4 mapped ipv6 address in IPV6_PKTINFO.
We simply can check for this option in ip_cmsg_send because no calls to
ipv6 module functions are needed to do so.
Reported-by: Gert Doering <gert@space.net>
Cc: Tore Anderson <tore@fud.no>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Joe Perches [Tue, 18 Feb 2014 17:42:47 +0000 (09:42 -0800)]
bonding: Invert test
Make the error case return early.
Make the normal return at the bottom of the function.
Reduces indent for readability.
Signed-off-by: Joe Perches <joe@perches.com>
Reviewed-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Joe Perches [Tue, 18 Feb 2014 17:42:46 +0000 (09:42 -0800)]
bonding: Remove unnecessary else
It's unnecessary and less readable after a clause ending in a goto.
Signed-off-by: Joe Perches <joe@perches.com>
Reviewed-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Joe Perches [Tue, 18 Feb 2014 17:42:45 +0000 (09:42 -0800)]
bonding: More use of ether_addr_copy
It's smaller and faster for some architectures.
Signed-off-by: Joe Perches <joe@perches.com>
Reviewed-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Don Fry [Tue, 18 Feb 2014 04:57:59 +0000 (20:57 -0800)]
pcnet32: add missing check for pci_dma_mapping_error
The pci_map_single calls never checked for error. Add error checking
as requested by someone last year. Tested on 79c972.
Signed-off-by: Don Fry <pcnet32@frontier.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Don Fry [Tue, 18 Feb 2014 04:57:46 +0000 (20:57 -0800)]
pcnet32: fix reallocation error
pcnet32_realloc_rx_ring() only worked on the first log2 number of
entries in the receive ring instead of the all the entries.
Replaced "1 << size" with more descriptive variable.
This is my original bug from 2006. Found while testing another problem.
Tested on 79C972 and 79C976.
Signed-off-by: Don Fry <pcnet32@frontier.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Steffen Klassert [Wed, 19 Feb 2014 09:07:34 +0000 (10:07 +0100)]
xfrm: Remove caching of xfrm_policy_sk_bundles
We currently cache socket policy bundles at xfrm_policy_sk_bundles.
These cached bundles are never used. Instead we create and cache
a new one whenever xfrm_lookup() is called on a socket policy.
Most protocols cache the used routes to the socket, so let's
remove the unused caching of socket policy bundles in xfrm.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
David S. Miller [Wed, 19 Feb 2014 06:24:22 +0000 (01:24 -0500)]
Merge git://git./linux/kernel/git/davem/net
Conflicts:
drivers/net/bonding/bond_3ad.h
drivers/net/bonding/bond_main.c
Two minor conflicts in bonding, both of which were overlapping
changes.
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Wed, 19 Feb 2014 00:36:07 +0000 (16:36 -0800)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
"Lots of little small things, nothing too major: nouveau regression
fixes, vmware fixes for the new hw support, memory leaks in error path
fixes"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (31 commits)
drm/radeon/ni: fix typo in dpm sq ramping setup
drm/radeon/si: fix typo in dpm sq ramping setup
drm/radeon: fix CP semaphores on CIK
drm/radeon: delete a stray tab
drm/radeon: fix display tiling setup on SI
drm/radeon/dpm: reduce r7xx vblank mclk threshold to 200
drm/radeon: fill in DRM_CAPs for cursor size
drm: add DRM_CAPs for cursor size
drm/radeon: unify bpc handling
drm/ttm: Fix memory leak in ttm_agp_backend.c
drm/ttm: declare 'struct device' in ttm_page_alloc.h
drm/nouveau: fix TTM_PL_TT memtype on pre-nv50
drm/nv50/disp: use correct register to determine DP display bpp
drm/nouveau/fb: use correct ram oclass for nv1a hardware
drm/nv50/gr: add missing nv_error parameter priv
drm/nouveau: fix ENG_RUNLIST register address
drm/nv4c/bios: disallow retrieving from prom on nv4x igp's
drm/nv4c/vga: decode register is in a different place on nv4x igp's
drm/nv4c/mc: nv4x igp's have a different msi rearm register
drm/nouveau: set irq_enabled manually
...
Linus Torvalds [Wed, 19 Feb 2014 00:29:46 +0000 (16:29 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/jikos/hid
Pull HID update from Jiri Kosina:
- fixes for several bugs in incorrect allocations of buffers by David
Herrmann and Benjamin Tissoires.
- support for a few new device IDs by Archana Patni, Benjamin
Tissoires, Huei-Horng Yo, Reyad Attiyat and Yufeng Shen
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
HID: hyperv: make sure input buffer is big enough
HID: Bluetooth: hidp: make sure input buffers are big enough
HID: hid-sensor-hub: quirk for STM Sensor hub
HID: apple: add Apple wireless keyboard 2011 JIS model support
HID: fix buffer allocations
HID: multitouch: add FocalTech FTxxxx support
HID: microsoft: Add ID's for Surface Type/Touch Cover 2
HID: usbhid: quirk for CY-TM75 75 inch Touch Overlay
Linus Torvalds [Tue, 18 Feb 2014 23:52:43 +0000 (15:52 -0800)]
Merge git://git./linux/kernel/git/davem/net
Pull networking fixes from David Miller:
1) kvaser CAN driver has fixed limits of some of it's table, validate
that we won't exceed those limits at probe time. Fix from Olivier
Sobrie.
2) Fix rtl8192ce disabling interrupts for too long, from Olivier
Langlois.
3) Fix botched shift in ath5k driver, from Dan Carpenter.
4) Fix corruption of deferred packets in TIPC, from Erik Hugne.
5) Fix newlink error path in macvlan driver, from Cong Wang.
6) Fix netpoll deadlock in bonding, from Ding Tianhong.
7) Handle GSO packets properly in forwarding path when fragmentation is
necessary on egress, from Florian Westphal.
8) Fix axienet build errors, from Michal Simek.
9) Fix refcounting of ubufs on tx in vhost net driver, from Michael S
Tsirkin.
10) Carrier status isn't set properly in hyperv driver, from Haiyang
Zhang.
11) Missing pci_disable_device() in tulip_remove_one), from Ingo Molnar.
12) AF_PACKET qdisc bypass mode doesn't adhere to driver provided TX
queue selection method. Add a fallback method mechanism to fix this
bug, from Daniel Borkmann.
13) Fix regression in link local route handling on GRE tunnels, from
Nicolas Dichtel.
14) Bonding can assign dup aggregator IDs in some sequences of
configuration, fix by making the allocation counter per-bond instead
of global. From Jiri Bohac.
15) sctp_connectx() needs compat translations, from Daniel Borkmann.
16) Fix of_mdio PHY interrupt parsing, from Ben Dooks
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (62 commits)
MAINTAINERS: add entry for the PHY library
of_mdio: fix phy interrupt passing
net: ethernet: update dependency and help text of mvneta
NET: fec: only enable napi if we are successful
af_packet: remove a stray tab in packet_set_ring()
net: sctp: fix sctp_connectx abi for ia32 emulation/compat mode
ipv4: fix counter in_slow_tot
irtty-sir.c: Do not set_termios() on irtty_close()
bonding: 802.3ad: make aggregator_identifier bond-private
usbnet: remove generic hard_header_len check
gre: add link local route when local addr is any
batman-adv: fix potential kernel paging error for unicast transmissions
batman-adv: avoid double free when orig_node initialization fails
batman-adv: free skb on TVLV parsing success
batman-adv: fix TT CRC computation by ensuring byte order
batman-adv: fix potential orig_node reference leak
batman-adv: avoid potential race condition when adding a new neighbour
batman-adv: properly check pskb_may_pull return value
batman-adv: release vlan object after checking the CRC
batman-adv: fix TT-TVLV parsing on OGM reception
...
Linus Torvalds [Tue, 18 Feb 2014 23:49:58 +0000 (15:49 -0800)]
Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm
Pull ARM fixes from Russell King:
"A range of ARM fixes. Biggest change is the stage-2 attributes used
for for hyp mode which were wrong. I've killed some bits in a couple
of DT files which turned out not to be required, and a few other
fixes.
One fix touches code outside of arch/arm, which is related to sorting
out the DMA masks correctly. There is a long standing issue with the
conversion from PFNs to addresses where people assume that shifting an
unsigned long left by PAGE_SHIFT results in a correct address. This
is not the case with C: the integer promotion happens at assignment
after evaluation. This fixes the recently introduced dma_max_pfn()
function, but there's a number of other places where we try this
directly on an unsigned long in the mm code"
* 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
ARM: 7957/1: add DSB after icache flush in __flush_icache_all()
Fix uses of dma_max_pfn() when converting to a limiting address
ARM: 7955/1: spinlock: ensure we have a compiler barrier before sev
ARM: 7953/1: mm: ensure TLB invalidation is complete before enabling MMU
ARM: 7952/1: mm: Fix the memblock allocation for LPAE machines
ARM: 7950/1: mm: Fix stage-2 device memory attributes
ARM: dts: fix spdif pinmux configuration
Linus Torvalds [Tue, 18 Feb 2014 23:49:40 +0000 (15:49 -0800)]
Merge tag 'jfs-3.14-rc4' of git://github.com/kleikamp/linux-shaggy
Pull jfs fix from David Kleikamp:
"Another ACL regression. This one more subtle"
* tag 'jfs-3.14-rc4' of git://github.com/kleikamp/linux-shaggy:
jfs: set i_ctime when setting ACL
Jiri Pirko [Tue, 18 Feb 2014 19:53:18 +0000 (20:53 +0100)]
rtnl: make ifla_policy static
The only place this is used outside rtnetlink.c is veth. So provide
wrapper function for this usage.
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
Joe Perches [Tue, 18 Feb 2014 18:37:20 +0000 (10:37 -0800)]
hsr: Use ether_addr_copy
It's slightly smaller/faster for some architectures.
Make sure def_multicast_addr is __aligned(2)
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Tue, 18 Feb 2014 17:47:49 +0000 (09:47 -0800)]
MAINTAINERS: add entry for the PHY library
The PHY library has been subject to some changes, new drivers and DT
interactions over the past few months. Add myself as a maintainer for
the core PHY library parts and drivers. Make sure the PHY library entry
also covers the Device Tree files which have a close interaction with
the MDIO bus, PHY connection and Ethernet PHY mode parsing.
CC: Grant Likely <grant.likely@linaro.org>
CC: Shaohui Xie <shaohui.xie@freescale.com>
CC: Andy Fleming <afleming@gmail.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Dooks [Tue, 18 Feb 2014 12:16:58 +0000 (12:16 +0000)]
of_mdio: fix phy interrupt passing
The of_mdiobus_register_phy() is not setting phy->irq thus causing
some drivers to incorrectly assume that the PHY does not have an
IRQ associated with it. Not only do some drivers report no IRQ
they do not install an interrupt handler for the PHY.
Simplify the code setting irq and set the phy->irq at the same
time so that we cover the following issues, which should cover
all the cases the code will find:
- Set phy->irq if node has irq property and mdio->irq is NULL
- Set phy->irq if node has no irq and mdio->irq is not NULL
- Leave phy->irq as PHY_POLL default if none of the above
This fixes the issue:
net eth0: attached PHY 1 (IRQ -1) to driver Micrel KSZ8041RNLI
to the correct:
net eth0: attached PHY 1 (IRQ 416) to driver Micrel KSZ8041RNLI
Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florent Fourcot [Tue, 18 Feb 2014 13:45:42 +0000 (14:45 +0100)]
ipv6: remove some unused include in flowlabel
These include are here since kernel 2.2.7, but probably never used.
Signed-off-by: Florent Fourcot <florent.fourcot@enst-bretagne.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Phoebe Buckheister [Tue, 18 Feb 2014 13:39:27 +0000 (14:39 +0100)]
ieee802154: fix faulty check in set_phy_params api
phy_set_csma_params has a redundant (and impossible) check for
"retries", found by smatch. The check was supposed to be for
frame_retries, but wasn't moved during development when
phy_set_frame_retries was introduced. Also, maxBE >= 3 as required by
the standard is not enforced.
Remove the redundant check, assure max_be >= 3 and check -1 <=
frame_retries <= 7 in the correct function.
Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Petazzoni [Tue, 18 Feb 2014 13:18:11 +0000 (14:18 +0100)]
net: ethernet: update dependency and help text of mvneta
With the introduction of the support for Armada 375 and Armada 38x,
the hidden Kconfig option MACH_ARMADA_370_XP is being renamed to
MACH_MVEBU_V7. Therefore, the dependency that was used for the mvneta
driver can no longer work. This commit replaces this dependency by a
dependency on PLAT_ORION, which is used similarly for the mv643xx_eth
driver.
In addition to this, it takes this opportunity to adjust the
description and help text to indicate that the driver can is also used
for Armada 38x. Note that Armada 375 cannot use this driver as it has
a completely different networking unit, which will require a separate
driver.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Acked-by: Jason Cooper <jason@lakedaemon.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Russell King [Tue, 18 Feb 2014 12:55:42 +0000 (12:55 +0000)]
NET: fec: only enable napi if we are successful
If napi is left enabled after a failed attempt to bring the interface
up, we BUG:
fec
2188000.ethernet eth0: no PHY, assuming direct connection to switch
libphy: PHY fixed-0:00 not found
fec
2188000.ethernet eth0: could not attach to PHY
------------[ cut here ]------------
kernel BUG at include/linux/netdevice.h:502!
Internal error: Oops - BUG: 0 [#1] SMP ARM
...
PC is at fec_enet_open+0x4d0/0x500
LR is at __dev_open+0xa4/0xfc
Only enable napi after we are past all the failure paths.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Tue, 18 Feb 2014 12:20:51 +0000 (15:20 +0300)]
af_packet: remove a stray tab in packet_set_ring()
At first glance it looks like there is a missing curly brace but
actually the code works the same either way. I have adjusted the
indenting but left the code the same.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>