git.openwrt.org Git - openwrt/staging/blogic.git/log

net: frag queue per hash bucket locking

This patch implements per hash bucket locking for the frag queue
hash.  This removes two write locks, and the only remaining write
lock is for protecting hash rebuild.  This essentially reduce the
readers-writer lock to a rebuild lock.

This patch is part of "net: frag performance followup"
http://thread.gmane.org/gmane.linux.network/263644
of which two patches have already been accepted:

Same test setup as previous:
(http://thread.gmane.org/gmane.linux.network/257155)
Two 10G interfaces, on seperate NUMA nodes, are under-test, and uses
Ethernet flow-control.  A third interface is used for generating the
DoS attack (with trafgen).

Notice, I have changed the frag DoS generator script to be more
efficient/deadly.  Before it would only hit one RX queue, now its
sending packets causing multi-queue RX, due to "better" RX hashing.

Test types summary (netperf UDP_STREAM):
Test-20G64K     == 2x10G with 65K fragments
Test-20G3F      == 2x10G with 3x fragments (3*1472 bytes)
Test-20G64K+DoS == Same as 20G64K with frag DoS
Test-20G3F+DoS  == Same as 20G3F  with frag DoS
Test-20G64K+MQ  == Same as 20G64K with Multi-Queue frag DoS
Test-20G3F+MQ   == Same as 20G3F  with Multi-Queue frag DoS

When I rebased this-patch(03) (on top of net-next commit a210576c) and
removed the _bh spinlock, I saw a performance regression.  BUT this
was caused by some unrelated change in-between.  See tests below.

Test (A) is what I reported before for patch-02, accepted in commit 1b5ab0de.
Test (B) verifying-retest of commit 1b5ab0de corrospond to patch-02.
Test (C) is what I reported before for this-patch

Test (D) is net-next master HEAD (commit a210576c), which reveals some
(unknown) performance regression (compared against test (B)).
Test (D) function as a new base-test.

Performance table summary (in Mbit/s):

(#) Test-type:  20G64K    20G3F    20G64K+DoS  20G3F+DoS  20G64K+MQ 20G3F+MQ
    ----------  -------   -------  ----------  ---------  --------  -------
(A) Patch-02  : 18848.7   13230.1   4103.04     5310.36     130.0    440.2
(B) 1b5ab0de  : 18841.5   13156.8   4101.08     5314.57     129.0    424.2
(C) Patch-03v1: 18838.0   13490.5   4405.11     6814.72     196.6    461.6

(D) a210576c  : 18321.5   11250.4   3635.34     5160.13     119.1    405.2
(E) with _bh  : 17247.3   11492.6   3994.74     6405.29     166.7    413.6
(F) without bh: 17471.3   11298.7   3818.05     6102.11     165.7    406.3

Test (E) and (F) is this-patch(03), with(V1) and without(V2) the _bh spinlocks.

I cannot explain the slow down for 20G64K (but its an artificial
"lab-test" so I'm not worried).  But the other results does show
improvements.  And test (E) "with _bh" version is slightly better.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Eric Dumazet <edumazet@google.com>
----
V2:
- By analysis from Hannes Frederic Sowa and Eric Dumazet, we don't
  need the spinlock _bh versions, as Netfilter currently does a
  local_bh_disable() before entering inet_fragment.
- Fold-in desc from cover-mail
V3:
- Drop the chain_len counter per hash bucket.
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge git://git./linux/kernel/git/davem/net

Pull net into net-next to get the synchronize_net() bug fix in
bonding.

Signed-off-by: David S. Miller <davem@davemloft.net>

Merge git://git./linux/kernel/git/davem/net

Pull networking fixes from David Miller:

1) Fix VSOCK layer handling of context ID changes, from Reilly Grant.

2) Now that we have a synchronize_net() in netdev_rx_handler_unregister(),
    we can't let any call sites hold locks.  Unfortunately bonding does,
    so we have to drop the rwlock there a little bit earlier, fix from
    Veaceslav Falico.

3) MAC address setting loop exits one iteration too early in mlx4
    driver, from Yan Burman.

4) Restore ipv6 routes properly upon ifdown/ifup of loopback, from
    Balakumaran Kannan.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  VSOCK: Handle changes to the VMCI context ID.
  net IPv6 : Fix broken IPv6 routing table after loopback down-up
  cbq: incorrect processing of high limits
  net/mlx4_en: Fix setting initial MAC address
  bonding: get netdev_rx_handler_unregister out of locks

Merge tag 'regmap-v3.9-rc4' of git://git./linux/kernel/git/broonie/regmap

Pull regmap fixes from Mark Brown:
"A small collection of fixes.  The most important ones are those from
  Stephen and Lars-Peter both of which fix cache issues that have been
  lurking for a while but not manifesting noticably enough for anyone to
  report them."

* tag 'regmap-v3.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
  regmap: async: Add missing return
  regmap: don't corrupt work buffer in _regmap_raw_write()
  regmap: cache Fix regcache-rbtree sync
  regmap: Initialize `map->debugfs' before regcache

Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux

Pull DRM fixes from Dave Airlie:
"Two core fixes, both regressions, along with some intel and some
  nouveau fixes for regressions and oopses"

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  drm: correctly restore mappings if drm_open fails
  drm/nouveau: fix NULL ptr dereference from nv50_disp_intr()
  drm/nouveau: fix handling empty channel list in ioctl's
  drm: don't unlock in the addfb error paths
  drm/i915: Fix build failure
  drm/i915: Be sure to turn hsync/vsync back on at crt enable (v2)
  drm/i915: duct-tape locking when eDP init fails

Merge branch 'upstream' of git://git.linux-mips.org/ralf/upstream-linus

Pull MIPS fixes from Ralf Baechle:
"A collection of fixes pretty much across the MIPS code.  Even the
  change to include/linux/signal.h by David Howells' 2a1486981c13 ("Fix
  breakage in MIPS siginfo handling") should be considered MIPS-specific
  as it touches an ifdefed segment that is only relevant to MIPS and
  which unfortunately can't be made to go away entirely."

* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
  Fix breakage in MIPS siginfo handling
  Revert "MIPS: BCM63XX: Call board_register_device from device_initcall()"
  MIPS: BCM63XX: Make nvram checksum failure non fatal
  MIPS: Fix code generation for non-DSP capable CPUs
  MIPS: Fix inconsistent formatting inside /proc/cpuinfo
  MIPS: SEAD3: Enable LL/SC.
  MIPS: Get rid of CONFIG_CPU_HAS_LLSC again
  MIPS: Add dependencies for HAVE_ARCH_TRANSPARENT_HUGEPAGE
  MIPS: VR4133: Fix probe for LL/SC.
  MIPS: Fix logic errors in bitops.c
  MIPS: Use CONFIG_CPU_MIPSR2 in csum_partial.S
  MIPS: compat: Return same error ENOSYS as native for invalid operation.

net: fix smatch warnings inside datagram_poll

Commit 7d4c04fc170087119727119074e72445f2bb192b ("net: add option to enable
error queue packets waking select") has an issue due to operator precedence
causing the bit-wise OR to bind to the sock_flags call instead of the result of
the terniary conditional. This fixes the *_poll functions to work properly. The
old code results in "mask |= POLLPRI" instead of what was intended, which is to
only include POLLPRI when the socket option is enabled.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

drm: correctly restore mappings if drm_open fails

If first drm_open fails, the error-handling path will
incorrectly restore inode's mapping to NULL. This can
cause the crash later on. Fix by separately storing
away mapping pointers that drm_open can touch and
restore each from its own respective variable if the
call fails.

Fixes: https://bugzilla.novell.com/show_bug.cgi?id=807850
(thanks to Michal Hocko for investigating investigating and
finding the root cause of the bug)

Reference:
http://lists.freedesktop.org/archives/dri-devel/2013-March/036564.html

v2: Use one variable to store file and inode mapping
since they are the same at the function entry.
Fix spelling mistakes in commit message.

v3: Add reference to the original bug report.

Reported-by: Marco Munderloh <munderl@tnt.uni-hannover.de>
Tested-by: Marco Munderloh <munderl@tnt.uni-hannover.de>
Signed-off-by: Ilija Hadzic <ihadzic@research.bell-labs.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: stable@vger.kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>

Merge branch 'drm-nouveau-fixes-3.9' of git://anongit.freedesktop.org/git/nouveau/linux-2.6 into drm-next

Oops fixers.
* 'drm-nouveau-fixes-3.9' of git://anongit.freedesktop.org/git/nouveau/linux-2.6:
drm/nouveau: fix NULL ptr dereference from nv50_disp_intr()
drm/nouveau: fix handling empty channel list in ioctl's

net/nxp/lpc_eth: Drop ifdef CONFIG_OF_NET

Since of_get_mac_address() is now declared even if CONFIG_OF_NET
is not configured, the ifdef is no longer necessary and can be
removed.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/freescale/fec: Simplify OF dependencies

Since of_get_mac_address() is now defined even if CONFIG_OF_NET
is not configured, the ifdef around the code calling it is no longer
necessary and can be removed.

Similar, since of_get_phy_mode() is now defined as dummy function
if OF_NET is not configured, it is no longer necessary to provide
an OF dependent function as front-end. Also, the function depends
on OF_NET, not on OF, so the conditional code was not correct anyway.
Drop the front-end function and call of_get_phy_mode() directly.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/cadence/macb: Simplify OF dependencies

With of_get_mac_address() and of_get_phy_mode() now defined as dummy
functions if OF_NET is not configured, it is no longer necessary to
provide OF dependent functions as front-end. Also, the two functions
depend on OF_NET, not on OF, so the conditional code was not correct
anyway.

Drop the front-end functions and call of_get_mac_address() and
of_get_phy_mode() directly instead.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/cadence/at91_ether: Simplify OF dependencies

With of_get_mac_address() and of_get_phy_mode() now defined as dummy
functions if OF_NET is not configured, it is no longer necessary to
provide OF dependent functions as front-end. Also, the two functions
depend on OF_NET, not on OF, so the conditional code was not correct
anyway.

Drop the front-end functions and call of_get_mac_address() and
of_get_phy_mode() directly instead.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

of_net.h: Provide empty functions if OF_NET is not configured

of_get_mac_address() and of_get_phy_mode() are only provided if OF_NET
is configured. While most callers check for the define, not all do, and those
who do require #ifdef around the code. For those who don't, the missing check
can result in errors such as

arch/powerpc/sysdev/tsi108_dev.c:107:3: error: implicit declaration of
function 'of_get_mac_address' [-Werror=implicit-function-declaration]
arch/powerpc/sysdev/mv64x60_dev.c:253:2: error: implicit declaration of
function 'of_get_mac_address' [-Werror=implicit-function-declaration]

Provide empty functions if OF_NET is not configured.
This is safe because all callers do check the return values.

Cc: David Daney <david.daney@cavium.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Rob Herring <rob.herring@calxeda.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'drm-intel-fixes' of git://people.freedesktop.org/~danvet/drm-intel into drm-next

One locking regression fix, and a couple of other i915 ones.

* 'drm-intel-fixes' of git://people.freedesktop.org/~danvet/drm-intel:
  drm: don't unlock in the addfb error paths
  drm/i915: Fix build failure
  drm/i915: Be sure to turn hsync/vsync back on at crt enable (v2)
  drm/i915: duct-tape locking when eDP init fails

VSOCK: Handle changes to the VMCI context ID.

The VMCI context ID of a virtual machine may change at any time. There
is a VMCI event which signals this but datagrams may be processed before
this is handled. It is therefore necessary to be flexible about the
destination context ID of any datagrams received. (It can be assumed to
be correct because it is provided by the hypervisor.) The context ID on
existing sockets should be updated to reflect how the hypervisor is
currently referring to the system.

Signed-off-by: Reilly Grant <grantr@vmware.com>
Acked-by: Andy King <acking@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net IPv6 : Fix broken IPv6 routing table after loopback down-up

IPv6 Routing table becomes broken once we do ifdown, ifup of the loopback(lo)
interface. After down-up, routes of other interface's IPv6 addresses through
'lo' are lost.

IPv6 addresses assigned to all interfaces are routed through 'lo' for internal
communication. Once 'lo' is down, those routing entries are removed from routing
table. But those removed entries are not being re-created properly when 'lo' is
brought up. So IPv6 addresses of other interfaces becomes unreachable from the
same machine. Also this breaks communication with other machines because of
NDISC packet processing failure.

This patch fixes this issue by reading all interface's IPv6 addresses and adding
them to IPv6 routing table while bringing up 'lo'.

==Testing==
Before applying the patch:
$ route -A inet6
Kernel IPv6 routing table
Destination                    Next Hop                   Flag Met Ref Use If
2000::20/128                   ::                         U    256 0     0 eth0
fe80::/64                      ::                         U    256 0     0 eth0
::/0                           ::                         !n   -1  1     1 lo
::1/128                        ::                         Un   0   1     0 lo
2000::20/128                   ::                         Un   0   1     0 lo
fe80::xxxx:xxxx:xxxx:xxxx/128  ::                         Un   0   1     0 lo
ff00::/8                       ::                         U    256 0     0 eth0
::/0                           ::                         !n   -1  1     1 lo
$ sudo ifdown lo
$ sudo ifup lo
$ route -A inet6
Kernel IPv6 routing table
Destination                    Next Hop                   Flag Met Ref Use If
2000::20/128                   ::                         U    256 0     0 eth0
fe80::/64                      ::                         U    256 0     0 eth0
::/0                           ::                         !n   -1  1     1 lo
::1/128                        ::                         Un   0   1     0 lo
ff00::/8                       ::                         U    256 0     0 eth0
::/0                           ::                         !n   -1  1     1 lo
$

After applying the patch:
$ route -A inet6
Kernel IPv6 routing
table
Destination                    Next Hop                   Flag Met Ref Use If
2000::20/128                   ::                         U    256 0     0 eth0
fe80::/64                      ::                         U    256 0     0 eth0
::/0                           ::                         !n   -1  1     1 lo
::1/128                        ::                         Un   0   1     0 lo
2000::20/128                   ::                         Un   0   1     0 lo
fe80::xxxx:xxxx:xxxx:xxxx/128  ::                         Un   0   1     0 lo
ff00::/8                       ::                         U    256 0     0 eth0
::/0                           ::                         !n   -1  1     1 lo
$ sudo ifdown lo
$ sudo ifup lo
$ route -A inet6
Kernel IPv6 routing table
Destination                    Next Hop                   Flag Met Ref Use If
2000::20/128                   ::                         U    256 0     0 eth0
fe80::/64                      ::                         U    256 0     0 eth0
::/0                           ::                         !n   -1  1     1 lo
::1/128                        ::                         Un   0   1     0 lo
2000::20/128                   ::                         Un   0   1     0 lo
fe80::xxxx:xxxx:xxxx:xxxx/128  ::                         Un   0   1     0 lo
ff00::/8                       ::                         U    256 0     0 eth0
::/0                           ::                         !n   -1  1     1 lo
$

Signed-off-by: Balakumaran Kannan <Balakumaran.Kannan@ap.sony.com>
Signed-off-by: Maruthi Thotad <Maruthi.Thotad@ap.sony.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipconfig: add informative timeout messages while waiting for carrier

Commit 3fb72f1e6e6165c5f495e8dc11c5bbd14c73385c ("ipconfig wait
for carrier") added a "wait for carrier on at least one interface"
policy, with a worst case maximum wait of two minutes.

However, if you encounter this, you won't get any feedback from
the console as to the nature of what is going on. You just see
the booting process hang for two minutes and then continue.

Here we add a message so the user knows what is going on, and
hence can take action to rectify the situation (e.g. fix network
cable or whatever.) After the 1st 10s pause, output now begins
that looks like this:

Waiting up to 110 more seconds for network.
Waiting up to 100 more seconds for network.
Waiting up to 90 more seconds for network.
Waiting up to 80 more seconds for network.
...

Since most systems will have no problem getting link/carrier in the
1st 10s, the only people who will see these messages are people with
genuine issues that need to be resolved.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

forcedeth: Do a dma_mapping_error check after skb_frag_dma_map

This backtrace was recently reported on a 3.9 kernel:

Actual results: from syslog /var/log/messsages:
kernel: [17539.340285] ------------[ cut here ]------------
kernel: [17539.341012] WARNING: at lib/dma-debug.c:937 check_unmap+0x493/0x960()
kernel: [17539.341012] Hardware name: MS-7125
kernel: [17539.341012] forcedeth 0000:00:0a.0: DMA-API: device driver failed to
check map error[device address=0x0000000013c88000] [size=544 bytes] [mapped as
page]
kernel: [17539.341012] Modules linked in: fuse ebtable_nat ipt_MASQUERADE
nf_conntrack_netbios_ns nf_conntrack_broadcast ip6table_nat nf_nat_ipv6
ip6table_mangle ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 iptable_nat
nf_nat_ipv4 nf_nat iptable_mangle nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack
nf_conntrack bnep bluetooth rfkill ebtable_filter ebtables ip6table_filter
ip6_tables snd_hda_codec_hdmi snd_cmipci snd_mpu401_uart snd_hda_intel
snd_intel8x0 snd_opl3_lib snd_ac97_codec gameport snd_hda_codec snd_rawmidi
ac97_bus snd_hwdep snd_seq snd_seq_device snd_pcm snd_page_alloc snd_timer snd
k8temp soundcore serio_raw i2c_nforce2 forcedeth ata_generic pata_acpi nouveau
video mxm_wmi wmi i2c_algo_bit drm_kms_helper ttm drm i2c_core sata_sil pata_amd
sata_nv uinput
kernel: [17539.341012] Pid: 17340, comm: sshd Not tainted
3.9.0-0.rc4.git0.1.fc19.i686.PAE #1
kernel: [17539.341012] Call Trace:
kernel: [17539.341012]  [<c045573c>] warn_slowpath_common+0x6c/0xa0
kernel: [17539.341012]  [<c0701953>] ? check_unmap+0x493/0x960
kernel: [17539.341012]  [<c0701953>] ? check_unmap+0x493/0x960
kernel: [17539.341012]  [<c04557a3>] warn_slowpath_fmt+0x33/0x40
kernel: [17539.341012]  [<c0701953>] check_unmap+0x493/0x960
kernel: [17539.341012]  [<c049238f>] ? sched_clock_cpu+0xdf/0x150
kernel: [17539.341012]  [<c0701e87>] debug_dma_unmap_page+0x67/0x70
kernel: [17539.341012]  [<f7eae8f2>] nv_unmap_txskb.isra.32+0x92/0x100

Its pretty plainly the result of an skb fragment getting unmapped without having
its initial mapping operation checked for errors.  This patch corrects that.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: "David S. Miller" <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

ISDN:divert: beautify code: useless 'break', 'return (0)', additional comments.

  delete useless 'break' after 'return'.
  let 'return 0' instead of 'return (0)'
  also give a comment for 'break' to let readers notice it.

Signed-off-by: Chen Gang <gang.chen@asianux.com>
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cbq: incorrect processing of high limits

currently cbq works incorrectly for limits > 10% real link bandwidth,
and practically does not work for limits > 50% real link bandwidth.
Below are results of experiments taken on 1 Gbit link

In shaper | Actual Result
-----------+---------------
  100M     | 108 Mbps
  200M     | 244 Mbps
  300M     | 412 Mbps
  500M     | 893 Mbps

This happen because of q->now changes incorrectly in cbq_dequeue():
when it is called before real end of packet transmitting,
L2T is greater than real time delay, q_now gets an extra boost
but never compensate it.

To fix this problem we prevent change of q->now until its synchronization
with real time.

Signed-off-by: Vasily Averin <vvs@openvz.org>
Reviewed-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qlcnic: Bump up the version to 5.2.40

Signed-off-by: Rajesh Borundia <rajesh.borundia@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qlcnic: Fix sparse warnings.

warning: 'pf_info.max_tx_ques' may be used uninitialized in this function
[-Wmaybe-uninitialized]

Signed-off-by: Rajesh Borundia <rajesh.borundia@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qlcnic: Fix NULL dereference in error path.

o Fix for smatch tool reported error
   drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c:2029
   qlcnic_probe() error: potential NULL dereference 'adapter'.
o While returning from an error path in probe, adapter is not
  initialized. So do not access adapter in cleanup path.

Signed-off-by: Rajesh Borundia <rajesh.borundia@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qlcnic: Fix potential NULL dereference

[net-next:master 301/302] drivers/net/ethernet/qlogic/qlcnic/qlcnic_hw.c:563
qlcnic_set_multi() error: potential null dereference 'cur'. (kzalloc returns null)

o Break out of the loop after memory allocation failure. Program all the
MAC addresses that were cached in the return path.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Rajesh Borundia <rajesh.borundia@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: Remove dead sysctl_tcp_cookie_size declaration

Remove a declaration left over from the TCPCT-ectomy. This sysctl is
no longer referenced anywhere since 1a2c6181c4 ("tcp: Remove TCPCT").

Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipc: set msg back to -EAGAIN if copy wasn't performed

Make sure that msg pointer is set back to error value in case of
MSG_COPY flag is set and desired message to copy wasn't found. This
garantees that msg is either a error pointer or a copy address.

Otherwise the last message in queue will be freed without unlinking from
the queue (which leads to memory corruption) and the dummy allocated
copy won't be released.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

bridge: remove a redundant synchronize_net()

commit 00cfec37484761 (net: add a synchronize_net() in
netdev_rx_handler_unregister())
allows us to remove the synchronized_net() call from del_nbp()

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Veaceslav Falico <vfalico@redhat.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx4_en: Fix setting initial MAC address

Commit 6bbb6d9 "net/mlx4_en: Optimize Rx fast path filter checks" introduced a regression
under which the MAC address read from the card was not converted correctly
(the most significant byte was not handled), fix that.

Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Yan Burman <yanb@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: get netdev_rx_handler_unregister out of locks

Now that netdev_rx_handler_unregister contains synchronize_net(), we need
to call it outside of bond->lock, cause it might sleep. Also, remove the
already unneded synchronize_net().

Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'fixes' of git://git./linux/kernel/git/arm/arm-soc

Pull ARM SoC bug fixes from Arnd Bergmann:
"After a quiet set of fixes for 3.9-rc4, a lot of people woke up and
  sent urgent fixes for 3.9.  I pushed back on a number of them that got
  deferred to 3.10, but these are the ones that seemed important.

  Regression in 3.9:

   - Multiple regressions in OMAP2+ clock cleanup
   - SH-Mobile frame buffer bug fix that merged here because of
     maintainer MIA
   - ux500 prcmu changes broke DT booting
   - MMCI duplicated regulator setup on ux500
   - New ux500 clock driver broke ethernet on snowball
   - Local interrupt driver for mvebu broke ethernet
   - MVEBU GPIO driver did not get set up right on Orion DT
   - incorrect interrupt number on Orion crypto for DT

  Long-standing bugs, including candidates for stable:

   - Kirkwood MMC needs to disable invalid card detect pins
   - MV SDIO pinmux was wrong on Mirabox
   - GoFlex Net board file needs to set NAND chip delay
   - MSM timer restart race
   - ep93xx early debug code broke in 3.7
   - i.MX CPU hotplug race
   - Incorrect clock setup for OMAP1 USB
   - Workaround for bad clock setup by some old OMAP4 boot loaders
   - Static I/O mappings on cns3xxx since 3.2"

* tag 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  ARM: cns3xxx: fix mapping of private memory region
  arm: mvebu: Fix pinctrl for Armada 370 Mirabox SDIO port.
  arm: orion5x: correct IRQ used in dtsi for mv_cesa
  arm: orion5x: fix orion5x.dtsi gpio parameters
  ARM: Kirkwood: fix unused mvsdio gpio pins
  arm: mvebu: Use local interrupt only for the timer 0
  ARM: kirkwood: Fix chip-delay for GoFlex Net
  ARM: ux500: Enable the clock controlling Ethernet on Snowball
  ARM: ux500: Stop passing ios_handler() as an MMCI power controlling call-back
  ARM: ux500: Apply the TCPM and TCDM locations and sizes to dbx5x0 DT
  fbdev: sh_mobile_lcdc: fixup B side hsync adjust settings
  ARM: OMAP: clocks: Delay clk inits atleast until slab is initialized
  ARM: imx: fix sync issue between imx_cpu_die and imx_cpu_kill
  ARM: msm: Stop counting before reprogramming clockevent
  ARM: ep93xx: Fix wait for UART FIFO to be empty
  ARM: OMAP4: PM: fix PM regression introduced by recent clock cleanup
  ARM: OMAP3: hwmod data: keep MIDLEMODE in force-standby for musb
  ARM: OMAP4: clock data: lock USB DPLL on boot
  ARM: OMAP1: fix USB host on 1710

Merge branch 'for-3.9' of git://linux-nfs.org/~bfields/linux

Pull nfsd bugfix from J Bruce Fields:
"An xdr decoding error--thanks, Toralf Förster, and Trinity!"

* 'for-3.9' of git://linux-nfs.org/~bfields/linux:
nfsd4: reject "negative" acl lengths

Merge tag 'v3.9-rc1_cns3xxx_fixes' of git://git.infradead.org/users/cbou/linux-cns3xxx into fixes

From Anton Vorontsov <anton@enomsg.org>:

This tag includes Mac Lin's work to revive CNS3xxx booting:

"Since commit 0536bdf33faf (ARM: move iotable mappings within the vmalloc
region), [...] the pre-defined iotable mappings is not in the vmalloc
region. [...] move the iotable mappings into the vmalloc region, and
merge the MPCore private memory region (containing the SCU, the GIC and
the TWD) as a single region."

Plus there is a small cosmetic fix, also from Mac Lin.

* tag 'v3.9-rc1_cns3xxx_fixes' of git://git.infradead.org/users/cbou/linux-cns3xxx:
ARM: cns3xxx: fix mapping of private memory region

[arnd: dropped the cosmetic fix from the merge as it is not needed for 3.9]

Signed-off-by: Arnd Bergmann <arnd@arndb.de>

Merge tag 'fixes-for-linus' of git://git./linux/kernel/git/rusty/linux

Pull virtio fixes from Rusty Russell:
"One reversion, a tiny leak fix, and a cc:stable locking fix, in two
  parts"

* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
  virtio: console: add locking around c_ovq operations
  virtio: console: rename cvq_lock to c_ivq_lock
  hw_random: free rng_buffer at module exit
  Revert "virtio_console: Initialize guest_connected=true for rproc_serial"

loop: prevent bdev freeing while device in use

struct block_device lifecycle is defined by its inode (see fs/block_dev.c) -
block_device allocated first time we access /dev/loopXX and deallocated on
bdev_destroy_inode. When we create the device "losetup /dev/loopXX afile"
we want that block_device stay alive until we destroy the loop device
with "losetup -d".

But because we do not hold /dev/loopXX inode its counter goes 0, and
inode/bdev can be destroyed at any moment. Usually it happens at memory
pressure or when user drops inode cache (like in the test below). When later in
loop_clr_fd() we want to use bdev we have use-after-free error with following
stack:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000280
  bd_set_size+0x10/0xa0
  loop_clr_fd+0x1f8/0x420 [loop]
  lo_ioctl+0x200/0x7e0 [loop]
  lo_compat_ioctl+0x47/0xe0 [loop]
  compat_blkdev_ioctl+0x341/0x1290
  do_filp_open+0x42/0xa0
  compat_sys_ioctl+0xc1/0xf20
  do_sys_open+0x16e/0x1d0
  sysenter_dispatch+0x7/0x1a

To prevent use-after-free we need to grab the device in loop_set_fd()
and put it later in loop_clr_fd().

The issue is reprodusible on current Linus head and v3.3. Here is the test:

  dd if=/dev/zero of=loop.file bs=1M count=1
  while [ true ]; do
    losetup /dev/loop0 loop.file
    echo 2 > /proc/sys/vm/drop_caches
    losetup -d /dev/loop0
  done

[ Doing bdgrab/bput in loop_set_fd/loop_clr_fd is safe, because every
  time we call loop_set_fd() we check that loop_device->lo_state is
  Lo_unbound and set it to Lo_bound If somebody will try to set_fd again
  it will get EBUSY.  And if we try to loop_clr_fd() on unbound loop
  device we'll get ENXIO.

  loop_set_fd/loop_clr_fd (and any other loop ioctl) is called under
  loop_device->lo_ctl_mutex. ]

Signed-off-by: Anatol Pomozov <anatol.pomozov@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'clk-fixes-for-linus' of git://git.linaro.org/people/mturquette/linux

Pull tegra clock driver fix from Mike Turquette:
"Missing base address in Tegra clock driver results in non-operational
  PCIe.  On some devices this means that Ethernet will go uninitialized
  and other devices will fail.  This pull request fixes it with a single
  patch to pass the proper base address in the Tegra clock driver."

* tag 'clk-fixes-for-linus' of git://git.linaro.org/people/mturquette/linux:
  clk: tegra: Allow PLLE training to succeed

Merge tag 'for-3.9-rc' of git://git./linux/kernel/git/rwlove/fcoe

Pull FCoE fixes from Robert Love:
"Critical patches to fix FCoE VN2VN mode with new interfaces targeting
  3.9-rc"

* tag 'for-3.9-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/rwlove/fcoe:
  libfcoe: Fix fcoe_sysfs VN2VN mode
  libfc, fcoe, bnx2fc: Split fc_disc_init into fc_disc_{init, config}
  libfc, fcoe, bnx2fc: Always use fcoe_disc_init for discovery layer initialization
  fcoe: Fix deadlock between create and destroy paths
  bnx2fc: Make the fcoe_cltr the SCSI host parent

clk: tegra: Allow PLLE training to succeed

Under some circumstances the PLLE needs to be retrained, in which case
access to the PMC registers is required. Fix this by passing a pointer
to the PMC registers instead of NULL when registering the PLLE clock.

Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de>
Acked-By: Peter De Schrijver <pdeschrijver@nvidia.com>
Signed-off-by: Mike Turquette <mturquette@linaro.org>

Merge git://git./linux/kernel/git/davem/net

Conflicts:
net/mac80211/sta_info.c
net/wireless/core.h

Two minor conflicts in wireless. Overlapping additions of extern
declarations in net/wireless/core.h and a bug fix overlapping with
the addition of a boolean parameter to __ieee80211_key_free().

Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'stable' of git://git./linux/kernel/git/cmetcalf/linux-tile

Pull arch/tile fix from Chris Metcalf:
"This change allows newer Tilera boot tools to work correctly with
  current (and stable) kernels by using the right filename to get the
  initramfs from the Tilera hypervisor filesystem."

* 'stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
  tile: expect new initramfs name from hypervisor file system

Merge git://git./linux/kernel/git/davem/net

Pull networking fixes from David Miller:

1) sadb_msg prepared for IPSEC userspace forgets to initialize the
    satype field, fix from Nicolas Dichtel.

2) Fix mac80211 synchronization during station removal, from Johannes
    Berg.

3) Fix IPSEC sequence number notifications when they wrap, from Steffen
    Klassert.

4) Fix cfg80211 wdev tracing crashes when add_virtual_intf() returns an
    error pointer, from Johannes Berg.

5) In mac80211, don't call into the channel context code with the
    interface list mutex held.  From Johannes Berg.

6) In mac80211, if we don't actually associate, do not restart the STA
    timer, otherwise we can crash.  From Ben Greear.

7) Missing dma_mapping_error() check in e1000, ixgb, and e1000e.  From
    Christoph Paasch.

8) Fix sja1000 driver defines to not conflict with SH port, from Marc
    Kleine-Budde.

9) Don't call il4965_rs_use_green with a NULL station, from Colin Ian
    King.

10) Suspend/Resume in the FEC driver fail because the buffer descriptors
    are not initialized at all the moments in which they should.  Fix
    from Frank Li.

11) cpsw and davinci_emac drivers both use the wrong interface to
    restart a stopped TX queue.  Use netif_wake_queue not
    netif_start_queue, the latter is for initialization/bringup not
    active management of the queue.  From Mugunthan V N.

12) Fix regression in rate calculations done by
    psched_ratecfg_precompute(), missing u64 type promotion.  From
    Sergey Popovich.

13) Fix length overflow in tg3 VPD parsing, from Kees Cook.

14) AOE driver fails to allocate enough headroom, resulting in crashes.
    Fix from Eric Dumazet.

15) RX overflow happens too quickly in sky2 driver because pause packet
    thresholds are not programmed correctly.  From Mirko Lindner.

16) Bonding driver manages arp_interval and miimon settings incorrectly,
    disabling one unintentionally disables both.  Fix from Nikolay
    Aleksandrov.

17) smsc75xx drivers don't program the RX mac properly for jumbo frames.
    Fix from Steve Glendinning.

18) Fix off-by-one in Codel packet scheduler.  From Vijay Subramanian.

19) Fix packet corruption in atl1c by disabling MSI support, from Hannes
    Frederic Sowa.

20) netdev_rx_handler_unregister() needs a synchronize_net() to fix
    crashes in bonding driver unload stress tests.  From Eric Dumazet.

21) rxlen field of ks8851 RX packet descriptors not interpreted
    correctly (it is 12 bits not 16 bits, so needs to be masked after
    shifting the 32-bit value down 16 bits).  Fix from Max Nekludov.

22) Fix missed RX/TX enable in sh_eth driver due to mishandling of link
    change indications.  From Sergei Shtylyov.

23) Fix crashes during spurious ECI interrupts in sh_eth driver, also
    from Sergei Shtylyov.

24) dm9000 driver initialization is done wrong for revision B devices
    with DSP PHY, from Joseph CHANG.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (53 commits)
  DM9000B: driver initialization upgrade
  sh_eth: make 'link' field of 'struct sh_eth_private' *int*
  sh_eth: workaround for spurious ECI interrupt
  sh_eth: fix handling of no LINK signal
  ks8851: Fix interpretation of rxlen field.
  net: add a synchronize_net() in netdev_rx_handler_unregister()
  MAINTAINERS: Update netxen_nic maintainers list
  atl1e: drop pci-msi support because of packet corruption
  net: fq_codel: Fix off-by-one error
  net: calxedaxgmac: Wake-on-LAN fixes
  net: calxedaxgmac: fix rx ring handling when OOM
  net: core: Remove redundant call to 'nf_reset' in 'dev_forward_skb'
  smsc75xx: fix jumbo frame support
  net: fix the use of this_cpu_ptr
  bonding: fix disabling of arp_interval and miimon
  ipv6: don't accept node local multicast traffic from the wire
  sky2: Threshold for Pause Packet is set wrong
  sky2: Receive Overflows not counted
  aoe: reserve enough headroom on skbs
  line up comment for ndo_bridge_getlink
  ...

net: add option to enable error queue packets waking select

Currently, when a socket receives something on the error queue it only wakes up
the socket on select if it is in the "read" list, that is the socket has
something to read. It is useful also to wake the socket if it is in the error
list, which would enable software to wait on error queue packets without waking
up for regular data on the socket. The main use case is for receiving
timestamped transmit packets which return the timestamp to the socket via the
error queue. This enables an application to select on the socket for the error
queue only instead of for the regular traffic.

-v2-
* Added the SO_SELECT_ERR_QUEUE socket option to every architechture specific file
* Modified every socket poll function that checks error queue

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Cc: Jeffrey Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Matthew Vick <matthew.vick@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

DM9000B: driver initialization upgrade

Fix bug for DM9000 revision B which contain a DSP PHY

DM9000B use DSP PHY instead previouse DM9000 revisions' analog PHY,
So need extra change in initialization, For
explicity PHY Reset and PHY init parameter, and
first DM9000_NCR reset need NCR_MAC_LBK bit by dm9000_probe().

Following DM9000_NCR reset cause by dm9000_open() clear the
NCR_MAC_LBK bit.

Without this fix, Power-up FIFO pointers error happen around 2%
rate among Davicom's customers' boards. With this fix, All above
cases can be solved.

Signed-off-by: Joseph CHANG <josright123@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sh_eth: make 'link' field of 'struct sh_eth_private' *int*

The 'link' field of 'struct sh_eth_private' has type 'enum phy_state' while the
'link' field of 'struct phy_device' is merely *int* (having values 0 and 1) and
the former field gets assigned from the latter. Make the field match, getting
rid of incorrectly used PHY_DOWN value in assignments/comparisons.

Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sh_eth: workaround for spurious ECI interrupt

At least on Renesas R8A7778, EESR.ECI interrupt seems to fire regardless of its
mask in EESIPR register. I can 100% reproduce it with the following scenario:
target is booted with 'ip=on' option, and so IP-Config opens SoC Ether device
but doesn't get a proper reply and then succeeds with on-board SMC chip; then
I login and try to bring up the SoC Ether device with 'ifconfig', and I get
an ECI interrupt once request_irq() is called by sh_eth_open() (while interrupt
mask in EESIPR register is all 0), if that interrupt is accompanied by a pending
EESR.FRC (frame receive completion) interrupt, I get kernel oops in sh_eth_rx()
because sh_eth_ring_init() hasn't been called yet!

The solution I worked out is the following: in sh_eth_interrupt(), mask the
interrupt status from EESR register with the interrupt mask from EESIPR register
in order not to handle the disabled interrupts -- but forcing EESIPR.M_ECI bit
in this mask set because we always need to fully handle EESR.ECI interrupt in
sh_eth_error() in order to quench it (as it doesn't get cleared by just writing
1 to the this bit as all the other interrupts).

While at it, remove unneeded initializer for 'intr_status' variable and give it
*unsigned long* type, matching the type of sh_eth_read()'s result; fix comment.

Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Reviewed-by: Max Filippov <max.filippov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sh_eth: fix handling of no LINK signal

The code handling the absent LINK signal (or the absent PSR register -- which
reflects the state of this signal) is quite naive and has probably never really
worked. It's probably enough to say that this code is executed only on the LINK
change interrupt (sic!) but even if we actually have the signal and choose to
ignore it (it might be connected to PHY's link/activity LED output as on the
Renesas BOCK-W board), sh_eth_adjust_link() on which this code relies to update
'mdp->link' gets executed later than the LINK change interrupt where it is
checked, and so RX/TX never get enabled via ECMR register.

So, ignore the LINK changed interrupt iff LINK signal is absent (or just chosen
not to be used) or PSR register is absent, and enable/disable RX/TX directly in
sh_eth_adjust_link() in this case.

Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge remote-tracking branch 'regmap/fix/async' into tmp

Linux 3.9-rc5

Merge remote-tracking branch 'regmap/fix/core' into tmp

Merge remote-tracking branch 'regmap/fix/cache' into tmp

Merge branch 'fixes' of git://git.infradead.org/users/vkoul/slave-dma

Pull slave-dmaengine fixes from Vinod Koul:
"Two fixes for slave-dmaengine.

  The first one is for making slave_id value correct for dw_dmac and
  the other one fixes the endieness in DT parsing"

* 'fixes' of git://git.infradead.org/users/vkoul/slave-dma:
  dw_dmac: adjust slave_id accordingly to request line base
  dmaengine: dw_dma: fix endianess for DT xlate function

Merge branch 'v4l_for_linus' of git://git./linux/kernel/git/mchehab/linux-media

Pull media fixes from Mauro Carvalho Chehab:
"For a some fixes for Kernel 3.9:
   - subsystem build fix when VIDEO_DEV=y, VIDEO_V4L2=m and I2C=m
   - compilation fix for arm multiarch preventing IR_RX51 to be selected
   - regression fix at bttv crop logic
   - s5p-mfc/m5mols/exynos: a few fixes for cameras on exynos hardware"

* 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
  [media] [REGRESSION] bt8xx: Fix too large height in cropcap
  [media] fix compilation with both V4L2 and I2C as 'm'
  [media] m5mols: Fix bug in stream on handler
  [media] s5p-fimc: Do not attempt to disable not enabled media pipeline
  [media] s5p-mfc: Fix encoder control 15 issue
  [media] s5p-mfc: Fix frame skip bug
  [media] s5p-fimc: send valid m2m ctx to fimc_m2m_job_finish
  [media] exynos-gsc: send valid m2m ctx to gsc_m2m_job_finish
  [media] fimc-lite: Fix the variable type to avoid possible crash
  [media] fimc-lite: Initialize 'step' field in fimc_lite_ctrl structure
  [media] ir: IR_RX51 only works on OMAP2

Merge tag 'for-linus-20130331' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:
"Alright, this time from 10K up in the air.

  Collection of fixes that have been queued up since the merge window
  opened, hence postponed until later in the cycle.  The pull request
  contains:

   - A bunch of fixes for the xen blk front/back driver.

   - A round of fixes for the new IBM RamSan driver, fixing various
     nasty issues.

   - Fixes for multiple drives from Wei Yongjun, bad handling of return
     values and wrong pointer math.

   - A fix for loop properly killing partitions when being detached."

* tag 'for-linus-20130331' of git://git.kernel.dk/linux-block: (25 commits)
  mg_disk: fix error return code in mg_probe()
  rsxx: remove unused variable
  rsxx: enable error return of rsxx_eeh_save_issued_dmas()
  block: removes dynamic allocation on stack
  Block: blk-flush: Fixed indent code style
  cciss: fix invalid use of sizeof in cciss_find_cfgtables()
  loop: cleanup partitions when detaching loop device
  loop: fix error return code in loop_add()
  mtip32xx: fix error return code in mtip_pci_probe()
  xen-blkfront: remove frame list from blk_shadow
  xen-blkfront: pre-allocate pages for requests
  xen-blkback: don't store dev_bus_addr
  xen-blkfront: switch from llist to list
  xen-blkback: fix foreach_grant_safe to handle empty lists
  xen-blkfront: replace kmalloc and then memcpy with kmemdup
  xen-blkback: fix dispatch_rw_block_io() error path
  rsxx: fix missing unlock on error return in rsxx_eeh_remap_dmas()
  Adding in EEH support to the IBM FlashSystem 70/80 device driver
  block: IBM RamSan 70/80 error message bug fix.
  block: IBM RamSan 70/80 branding changes.
  ...

Revert "lockdep: check that no locks held at freeze time"

This reverts commit 6aa9707099c4b25700940eb3d016f16c4434360d.

Commit 6aa9707099c4 ("lockdep: check that no locks held at freeze time")
causes problems with NFS root filesystems.  The failures were noticed on
OMAP2 and 3 boards during kernel init:

  [ BUG: swapper/0/1 still has locks held! ]
  3.9.0-rc3-00344-ga937536 #1 Not tainted
  -------------------------------------
  1 lock held by swapper/0/1:
   #0:  (&type->s_umount_key#13/1){+.+.+.}, at: [<c011e84c>] sget+0x248/0x574

  stack backtrace:
    rpc_wait_bit_killable
    __wait_on_bit
    out_of_line_wait_on_bit
    __rpc_execute
    rpc_run_task
    rpc_call_sync
    nfs_proc_get_root
    nfs_get_root
    nfs_fs_mount_common
    nfs_try_mount
    nfs_fs_mount
    mount_fs
    vfs_kern_mount
    do_mount
    sys_mount
    do_mount_root
    mount_root
    prepare_namespace
    kernel_init_freeable
    kernel_init

Although the rootfs mounts, the system is unstable.  Here's a transcript
from a PM test:

  http://www.pwsan.com/omap/testlogs/test_v3.9-rc3/20130317194234/pm/37xxevm/37xxevm_log.txt

Here's what the test log should look like:

  http://www.pwsan.com/omap/testlogs/test_v3.8/20130218214403/pm/37xxevm/37xxevm_log.txt

Mailing list discussion is here:

  http://lkml.org/lkml/2013/3/4/221

Deal with this for v3.9 by reverting the problem commit, until folks can
figure out the right long-term course of action.

Signed-off-by: Paul Walmsley <paul@pwsan.com>
Cc: Mandeep Singh Baines <msb@chromium.org>
Cc: Jeff Layton <jlayton@redhat.com>
Cc: Shawn Guo <shawn.guo@linaro.org>
Cc: <maciej.rutecki@gmail.com>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ben Chan <benchan@chromium.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

doc: packet: add minimal TPACKET_V3 example code

Lost in space for a long time, but it finally came back to us from
some ancient code tombs. This patch adds a minimal runnable example
of Linux' packet mmap(2) from Chetan Loke's TPACKET_V3. Special
thanks to David S. Miller, and also Eric Leblond and Victor Julien!

Cc: Eric Leblond <eric@regit.org>
Cc: Victor Julien <victor@inliniac.net>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

macvlan: use the right RCU api

Make sure we use proper API to fetch dev->rx_handler_data,
instead of ugly casts.

Rename macvlan_port_get() to macvlan_port_get_rtnl() to document fact
that we hold RTNL when needed, with lockdep support.

This removes sparse warnings as well (CONFIG_SPARSE_RCU_POINTER=y)

CHECK drivers/net/macvlan.c
drivers/net/macvlan.c:706:37: warning: cast removes address space of expression
drivers/net/macvlan.c:775:16: warning: cast removes address space of expression
drivers/net/macvlan.c:924:16: warning: cast removes address space of expression

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: reorder some fields of net_device

As time passed, some fields were added in net_device, and not
at sensible offsets.

Lets reorder some fields to reduce number of cache lines in RX path.
Fields not used in data path should be moved out of this critical cache
line.

In particular, move broadcast[] to the end of the rx section,
as it is less used, and ethernet uses only the beginning of the 32bytes
field.

Before patch :

offsetof(struct net_device,dev_addr)=0x258
offsetof(struct net_device,rx_handler)=0x2b8
offsetof(struct net_device,ingress_queue)=0x2c8
offsetof(struct net_device,broadcast)=0x278

After :

offsetof(struct net_device,dev_addr)=0x280
offsetof(struct net_device,rx_handler)=0x298
offsetof(struct net_device,ingress_queue)=0x2a8
offsetof(struct net_device,broadcast)=0x2b0

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cdc_ncm: return -ENOMEM if kzalloc fails

return -ENOMEM instead if kzalloc of cdc_ncm_ctx structure is failed.

and also remove the comparision of ctx structure with NULL and make
it as !ctx.

Signed-off-by: Devendra Naga <devendra.aaru@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ip_gre: don't overwrite iflink during net_dev init

iflink is currently set to 0 in __gre_tunnel_init(). This
function is invoked in gre_tap_init() and
ipgre_tunnel_init() which are both used to initialise the
ndo_init field of the respective net_device_ops structs
(ipgre.. and gre_tap..) used by GRE interfaces.

However, in netdevice_register() iflink is first set to -1,
then ndo_init is invoked and then iflink is assigned to a
proper value if and only if it still was -1.

Assigning 0 to iflink in ndo_init is therefore first
preventing netdev_register() to correctly assign it a proper
value and then breaking iflink at all since 0 has not
correct meaning.

Fix this by removing the iflink assignment in
__gre_tunnel_init().

Introduced by c54419321455631079c7d6e60bc732dd0c5914c5
("GRE: Refactor GRE tunneling code.")

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Cc: Pravin B Shelar <pshelar@nicira.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge git://git./linux/kernel/git/nab/target-pending

Pull SCSI target fixes from Nicholas Bellinger:
"This includes the bug-fix for a >= v3.8-rc1 regression specific to
  iscsi-target persistent reservation conflict handling (CC'ed to
  stable), and a tcm_vhost patch to drop VIRTIO_RING_F_EVENT_IDX usage
  so that in-flight qemu vhost-scsi-pci device code can detect the
  proper vhost feature bits.

  Also, there are two more tcm_vhost patches still being discussed by
  MST and Asias for v3.9 that will be required for the in-flight qemu
  vhost-scsi-pci device patch to function properly, and that should
  (hopefully) be the last target fixes for this round."

* git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
  target: Fix RESERVATION_CONFLICT status regression for iscsi-target special case
  tcm_vhost: Avoid VIRTIO_RING_F_EVENT_IDX feature bit

ARM: cns3xxx: fix mapping of private memory region

Since commit 0536bdf33faf (ARM: move iotable mappings within the vmalloc
region), the Cavium CNS3xxx cannot boot anymore.

This is caused by the pre-defined iotable mappings is not in the vmalloc
region. This patch move the iotable mappings into the vmalloc region, and
merge the MPCore private memory region (containing the SCU, the GIC and
the TWD) as a single region.

Signed-off-by: Mac Lin <mkl0301@gmail.com>
Signed-off-by: Anton Vorontsov <anton@enomsg.org>
Cc: stable@vger.kernel.org [v3.3+]

virtio: console: add locking around c_ovq operations

When multiple ovq operations are being performed (lots of open/close
operations on virtio_console fds), the __send_control_msg() function can
get confused without locking.

A simple recipe to cause badness is:
* create a QEMU VM with two virtio-serial ports
* in the guest, do
  while true;do echo abc >/dev/vport0p1;done
  while true;do echo edf >/dev/vport0p2;done

In one run, this caused a panic in __send_control_msg().  In another, I
got

   virtio_console virtio0: control-o:id 0 is not a head!

This also results repeated messages similar to these on the host:

  qemu-kvm: virtio-serial-bus: Unexpected port id 478762112 for device virtio-serial-bus.0
  qemu-kvm: virtio-serial-bus: Unexpected port id 478762368 for device virtio-serial-bus.0

Reported-by: FuXiangChun <xfu@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Reviewed-by: Asias He <asias@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: stable@kernel.org

virtio: console: rename cvq_lock to c_ivq_lock

The cvq_lock was taken for the c_ivq. Rename the lock to make that
obvious.

We'll also add a lock around the c_ovq in the next commit, so there's no
ambiguity.

Signed-off-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Asias He <asias@redhat.com>
Reviewed-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: stable@kernel.org

include/linux: printk is needed in filter.h when CONFIG_BPF_JIT is defined

for make V=1 EXTRA_CFLAGS=-W ARCH=arm allmodconfig
printk is need when CONFIG_BPF_JIT is defined
or it will report pr_err and print_hex_dump are implicit declaration

Signed-off-by: Chen Gang <gang.chen@asianux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dw_dmac: adjust slave_id accordingly to request line base

On some hardware configurations we have got the request line with the offset.
The patch introduces convert_slave_id() helper for that cases. The request line
base is came from the driver data provided by the platform_device_id table.

Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>

dmaengine: dw_dma: fix endianess for DT xlate function

As reported by Wu Fengguang's build robot tracking sparse warnings, the
dma_spec arguments in the dw_dma_xlate are already byte swapped on
little-endian platforms and must not get swapped again. This code is
currently not used anywhere, but will be used in Linux 3.10 when the
ARM SPEAr platform starts using the generic DMA DT binding.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>

PNP: List Rafael Wysocki as a maintainer

The Adam Belay's e-mail address in MAINTAINERS under PNP SUPPORT
is not valid any more and I started to maintain that code in the
meantime as a matter of fact, so list myself as a maintainer of it
along with Bjorn and remove the Adam's entry from it.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

qlcnic: Bump up the version to 5.2.39

Signed-off-by: Rajesh Borundia <rajesh.borundia@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qlcnic: Support atomic commands

o VFs might get scheduled out after sending a command to a PF
and scheduled in after receiving a response. Implement a
worker thread to handle atomic commands.

Signed-off-by: Manish Chopra <manish.chopra@qlogic.com>
Signed-off-by: Rajesh Borundia <rajesh.borundia@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qlcnic: Support VF-PF communication channel commands.

o Add support for commands from VF to PF.
o PF validates the commands sent by the VF before sending
  it to adapter.
o vPort is a container of resources. PF creates vPort
  for VFs and attach resources to it. vPort is
  transparent to the VF.
o Separate 83xx TX and RX hardware resource cleanup from 82xx.

Signed-off-by: Manish Chopra <manish.chopra@qlogic.com>
Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
Signed-off-by: Rajesh Borundia <rajesh.borundia@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qlcnic: VF-PF communication channel implementation

o Adapter provides communication channel between VF and PF.
  Any control commands from the VF driver are sent to the PF driver
  through this communication channel. PF driver validates the
  commands before sending them to the adapter. Similarly PF driver
  forwards any control command responses  to the VF driver
  through this communication channel.  Adapter sends message pending
  event to VF or PF when there is an outstanding response or a command
  for VF or PF respectively. When a command or a response is sent over
  a channel VF or PF cannot send another command or a response
  until adapter sends a channel free event. Adapter allocates 1K area to
  VF and  PF each for this communication.
o Commands and  responses are encapsulated in a header. Header determines
  sequence id, number of fragments, fragment number etc.

Signed-off-by: Manish Chopra <manish.chopra@qlogic.com>
Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
Signed-off-by: Rajesh Borundia <rajesh.borundia@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qlcnic: Use shared interrupt vector for Tx and Rx

o VF will use shared MSI-X interrupt vector for Tx and Rx.
o When QLCNIC_INTR_SHARED flag is set Tx and Rx will
share MSI-X interrupt vector. Tx will use a separate
MSI-X interrupt vector from Rx otherwise.

Signed-off-by: Rajesh Borundia <rajesh.borundia@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qlcnic: SR-IOV VF probe

o Add PCI device entry for VF.
o Add HW operations for VF.

Signed-off-by: Manish Chopra <manish.chopra@qlogic.com>
Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
Signed-off-by: Rajesh Borundia <rajesh.borundia@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qlcnic: Support SR-IOV enable and disable

o Add QLCNIC_SRIOV to Kconfig.
o Provide PCI sysfs hooks to enable and disable SR-IOV.
o Allow enabling only when CONFIG_QLCNIC_SRIOV is defined.
o qlcnic_sriov_pf.c has all the PF related SR-IOV
functionality.
o qlcnic_sriov_common.c has VF functionality and SR-IOV
functionality which is common between VF and PF.
o qlcnic_sriov.h is a common header file for SR-IOV defines.

Signed-off-by: Manish Chopra <manish.chopra@qlogic.com>
Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
Signed-off-by: Rajesh Borundia <rajesh.borundia@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ks8851: Fix interpretation of rxlen field.

According to the Datasheet (page 52):
15-12 Reserved
11-0 RXBC Receive Byte Count
This field indicates the present received frame byte size.

The code has a bug:
                 rxh = ks8851_rdreg32(ks, KS_RXFHSR);
                 rxstat = rxh & 0xffff;
                 rxlen = rxh >> 16; // BUG!!! 0xFFF mask should be applied

Signed-off-by: Max Nekludov <Max.Nekludov@us.elster.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cirrus: cs89x0: remove two obsolete Kconfig macros

The CONFIG_ARCH_IXDP2X01 and CONFIG_MACH_IXDP2351 Kconfig macros are
unused since the ixp23xx and ixp2000 platforms were removed in v3.5. So
remove the last code still depending on these macros. And since
CS89x0_NONISA_IRQ was only set if either of these two macros was defined
we can also remove that macro and the code depending on it.

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: add a synchronize_net() in netdev_rx_handler_unregister()

commit 35d48903e97819 (bonding: fix rx_handler locking) added a race
in bonding driver, reported by Steven Rostedt who did a very good
diagnosis :

<quoting Steven>

I'm currently debugging a crash in an old 3.0-rt kernel that one of our
customers is seeing. The bug happens with a stress test that loads and
unloads the bonding module in a loop (I don't know all the details as
I'm not the one that is directly interacting with the customer). But the
bug looks to be something that may still be present and possibly present
in mainline too. It will just be much harder to trigger it in mainline.

In -rt, interrupts are threads, and can schedule in and out just like
any other thread. Note, mainline now supports interrupt threads so this
may be easily reproducible in mainline as well. I don't have the ability
to tell the customer to try mainline or other kernels, so my hands are
somewhat tied to what I can do.

But according to a core dump, I tracked down that the eth irq thread
crashed in bond_handle_frame() here:

        slave = bond_slave_get_rcu(skb->dev);
        bond = slave->bond; <--- BUG

the slave returned was NULL and accessing slave->bond caused a NULL
pointer dereference.

Looking at the code that unregisters the handler:

void netdev_rx_handler_unregister(struct net_device *dev)
{

        ASSERT_RTNL();
        RCU_INIT_POINTER(dev->rx_handler, NULL);
        RCU_INIT_POINTER(dev->rx_handler_data, NULL);
}

Which is basically:
        dev->rx_handler = NULL;
        dev->rx_handler_data = NULL;

And looking at __netif_receive_skb() we have:

        rx_handler = rcu_dereference(skb->dev->rx_handler);
        if (rx_handler) {
                if (pt_prev) {
                        ret = deliver_skb(skb, pt_prev, orig_dev);
                        pt_prev = NULL;
                }
                switch (rx_handler(&skb)) {

My question to all of you is, what stops this interrupt from happening
while the bonding module is unloading?  What happens if the interrupt
triggers and we have this:

        CPU0                    CPU1
        ----                    ----
  rx_handler = skb->dev->rx_handler

                        netdev_rx_handler_unregister() {
                           dev->rx_handler = NULL;
                           dev->rx_handler_data = NULL;

  rx_handler()
   bond_handle_frame() {
    slave = skb->dev->rx_handler;
    bond = slave->bond; <-- NULL pointer dereference!!!

What protection am I missing in the bond release handler that would
prevent the above from happening?

</quoting Steven>

We can fix bug this in two ways. First is adding a test in
bond_handle_frame() and others to check if rx_handler_data is NULL.

A second way is adding a synchronize_net() in
netdev_rx_handler_unregister() to make sure that a rcu protected reader
has the guarantee to see a non NULL rx_handler_data.

The second way is better as it avoids an extra test in fast path.

Reported-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jiri Pirko <jpirko@redhat.com>
Cc: Paul E. McKenney <paulmck@us.ibm.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

MAINTAINERS: Update netxen_nic maintainers list

o Add myself to netxen_nic maintainers list

Signed-off-by: Manish Chopra <manish.chopra@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

atl1e: drop pci-msi support because of packet corruption

Usage of pci-msi results in corrupted dma packet transfers to the host.

Reported-by: rebelyouth <rebelyouth.hacklab@gmail.com>
Cc: Huang, Xiong <xiong@qca.qualcomm.com>
Tested-by: Christian Sünkenberg <christian.suenkenberg@student.kit.edu>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: fq_codel: Fix off-by-one error

Currently, we hold a max of sch->limit -1 number of packets instead of
sch->limit packets. Fix this off-by-one error.

Signed-off-by: Vijay Subramanian <subramanian.vijay@gmail.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sh_eth: add R-Car support for real

Commit d0418bb7123f44b23d69ac349eec7daf9103472f (net: sh_eth: Add eth support
for R8A7779 device) was a failed attempt to add support for one of members of
the R-Car SoC family.  That's for three reasons: it treated R8A7779 the  same
as SH7724 except including quite dirty hack adding ECMR_ELB  bit  to the mask
in sh_eth_set_rate() while not removing ECMR_RTM bit (despite it's reserved in
R-Car Ether), and it didn't add a new register offset array despite the closest
SH_ETH_REG_FAST_SH4 mapping differs by 0x200 to the offsets all the R-Car Ether
registers have, and also some of the registers in this old mapping don't exist
on R-Car Ether (due to this, SH7724's 'sh_eth_my_cpu_data' structure is not
adequeate for R-Car too).  Fix all these shortcomings, restoring the SH7724
related section to its pristine state...

Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sh_eth: move data from header file to driver

The driver's header file contains initialized register offset tables which (as
any data definitions), of course, have no business being there. Move them to
the driver's body, somewhat beautifying the initializers, while at it...

Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: calxedaxgmac: Wake-on-LAN fixes

WOL is broken because the magic packet status bit is getting set rather
than the enable bit. The PMT interrupt is not getting serviced because
the PMT interrupt is also enabled on the global interrupt, but not
cleared by the global interrupt and the global interrupt is higher
priority. This fixes both of these issues to get WOL working.

There's still a problem with receive after resume, but at least now we
can wake-up.

Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: calxedaxgmac: fix rx ring handling when OOM

If skb allocation for the rx ring fails repeatedly, we can reach a point
were the ring is empty. In this condition, the driver is out of sync with
the h/w. While this has always been possible, the removal of the skb
recycling seems to have made triggering this problem easier.

Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: rtnetlink: fdb dflt dump must set idx used for cb->arg[0]

In rtnl_fdb_dump() when the fdb_dump ndo op is not populated
we never set the idx value so that cb->arg[0] is always 0.
Resulting in a endless loop of messages.

Introduced with this commit,

commit 090096bf3db1c281ddd034573260045888a68fea
Author: Vlad Yasevich <vyasevic@redhat.com>
Date: Wed Mar 6 15:39:42 2013 +0000

net: generic fdb support for drivers without ndo_fdb_<op>

CC: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: core: Remove redundant call to 'nf_reset' in 'dev_forward_skb'

'nf_reset' is called just prior calling 'netif_rx'.
No need to call it twice.

Reported-by: Igor Michailov <rgohita@gmail.com>
Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ip_tunnel: Fix off-by-one error in forming dev name.

As Ben pointed out following patch fixes bug in checking device
name length limits while forming tunnel device name.

CC: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

smsc75xx: fix jumbo frame support

This patch enables RX of jumbo frames for LAN7500.

Previously the driver would transmit jumbo frames succesfully but
would drop received jumbo frames (incrementing the interface errors
count).

With this patch applied the device can succesfully receive jumbo
frames up to MTU 9000 (9014 bytes on the wire including ethernet
header).

Signed-off-by: Steve Glendinning <steve.glendinning@shawell.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: simplify the getting percpu of flow_cache

replace per_cpu with per_cpu_ptr to save conversion between address and pointer

Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: fix the use of this_cpu_ptr

flush_tasklet is not percpu var, and percpu is percpu var, and
this_cpu_ptr(&info->cache->percpu->flush_tasklet)
is not equal to
&this_cpu_ptr(info->cache->percpu)->flush_tasklet

1f743b076(use this_cpu_ptr per-cpu helper) introduced this bug.

Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tile: expect new initramfs name from hypervisor file system

The current Tilera boot infrastructure now provides the initramfs
to Linux as a Tilera-hypervisor file named "initramfs", rather than
"initramfs.cpio.gz", as before. (This makes it reasonable to use
other compression techniques than gzip on the file without having to
worry about the name causing confusion.) Adapt to use the new name,
but also fall back to checking for the old name.

Cc'ing to stable so that older kernels will remain compatible with
newer Tilera boot infrastructure.

Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
Cc: stable@vger.kernel.org

bonding: fix disabling of arp_interval and miimon

Currently if either arp_interval or miimon is disabled, they both get
disabled, and upon disabling they get executed once more which is not
the proper behaviour. Also when doing a no-op and disabling an already
disabled one, the other again gets disabled.
Also fix the error messages with the proper valid ranges, and a small
typo fix in the up delay error message (outputting "down delay", instead
of "up delay").

Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv6: don't accept node local multicast traffic from the wire

Erik Hugne's errata proposal (Errata ID: 3480) to RFC4291 has been
verified: http://www.rfc-editor.org/errata_search.php?eid=3480

We have to check for pkt_type and loopback flag because either the
packets are allowed to travel over the loopback interface (in which case
pkt_type is PACKET_HOST and IFF_LOOPBACK flag is set) or they travel
over a non-loopback interface back to us (in which case PACKET_TYPE is
PACKET_LOOPBACK and IFF_LOOPBACK flag is not set).

Cc: Erik Hugne <erik.hugne@ericsson.com>
Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

sky2: Threshold for Pause Packet is set wrong

The sky2 driver sets the Rx Upper Threshold for Pause Packet generation to a
wrong value which leads to only 2kB of RAM remaining space. This can lead to
Rx overflow errors even with activated flow-control.

Fix: We should increase the value to 8192/8

Signed-off-by: Mirko Lindner <mlindner@marvell.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

sky2: Receive Overflows not counted

The sky2 driver doesn't count the Receive Overflows because the MAC
interrupt for this event is not set in the MAC's interrupt mask.
The MAC's interrupt mask is set only for Transmit FIFO Underruns.

Fix: The correct setting should be (GM_IS_TX_FF_UR | GM_IS_RX_FF_OR)
Otherwise the Receive Overflow event will not generate any interrupt.
The Receive Overflow interrupt is handled correctly

Signed-off-by: Mirko Lindner <mlindner@marvell.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'for-linus' of git://git./linux/kernel/git/sage/ceph-client

Pull ceph fix from Sage Weil:
"This fixes a regression introduced during the last merge window when
mapping non-existent images."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
rbd: don't zero-fill non-image object requests

netlink: fix the warning introduced by netlink API replacement

Signed-off-by: Hong Zhiguo <honkiko@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

rbd: don't zero-fill non-image object requests

A result of ENOENT from a read request for an object that's part of
an rbd image indicates that there is a hole in that portion of the
image.  Similarly, a short read for such an object indicates that
the remainder of the read should be interpreted a full read with
zeros filling out the end of the request.

This behavior is not correct for objects that are not backing rbd
image data.  Currently rbd_img_obj_request_callback() assumes it
should be done for all objects.

Change rbd_img_obj_request_callback() so it only does this zeroing
for image objects.  Encapsulate that special handling in its own
function.  Add an assertion that the image object request is a bio
request, since we assume that (and we currently don't support any
other types).

This resolves a problem identified here:
    http://tracker.ceph.com/issues/4559

The regression was introduced by bf0d5f503dc11d6314c0503591d258d60ee9c944.

Reported-by: Dan van der Ster <dan@vanderster.com>
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-off-by: Sage Weil <sage@inktank.com>

Merge branch 'for-linus' of git://git./linux/kernel/git/mason/linux-btrfs

Pull btrfs fixes from Chris Mason:
"We've had a busy two weeks of bug fixing.  The biggest patches in here
  are some long standing early-enospc problems (Josef) and a very old
  race where compression and mmap combine forces to lose writes (me).
  I'm fairly sure the mmap bug goes all the way back to the introduction
  of the compression code, which is proof that fsx doesn't trigger every
  possible mmap corner after all.

  I'm sure you'll notice one of these is from this morning, it's a small
  and isolated use-after-free fix in our scrub error reporting.  I
  double checked it here."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
  Btrfs: don't drop path when printing out tree errors in scrub
  Btrfs: fix wrong return value of btrfs_lookup_csum()
  Btrfs: fix wrong reservation of csums
  Btrfs: fix double free in the btrfs_qgroup_account_ref()
  Btrfs: limit the global reserve to 512mb
  Btrfs: hold the ordered operations mutex when waiting on ordered extents
  Btrfs: fix space accounting for unlink and rename
  Btrfs: fix space leak when we fail to reserve metadata space
  Btrfs: fix EIO from btrfs send in is_extent_unchanged for punched holes
  Btrfs: fix race between mmap writes and compression
  Btrfs: fix memory leak in btrfs_create_tree()
  Btrfs: fix locking on ROOT_REPLACE operations in tree mod log
  Btrfs: fix missing qgroup reservation before fallocating
  Btrfs: handle a bogus chunk tree nicely
  Btrfs: update to use fs_state bit

ia64 idle: delete stale (*idle)() function pointer

Commit 3e7fc708eb41 ("ia64 idle: delete pm_idle") in 3.9-rc1 didn't
finish the job, leaving an un-initialized reference to (*idle)().

[ Haven't seen a crash from this - but seems like we are just being
lucky that "idle" is zero so it does get initialized before we jump to
randomland - Len ]

Reported-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>