openwrt/staging/blogic.git
10 years agobtrfs: fix unused variables in qgroup.c
Valentina Giusti [Mon, 4 Nov 2013 21:34:29 +0000 (22:34 +0100)]
btrfs: fix unused variables in qgroup.c

Use otherwise unused local variables slot in update_qgroup_limit_item and
in update_qgroup_info_item, and remove unused variable ins from
btrfs_qgroup_account_ref.

Signed-off-by: Valentina Giusti <valentina.giusti@microon.de>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <clm@fb.com>
10 years agobtrfs: replace path->slots[0] with otherwise unused variable 'slot'
Valentina Giusti [Mon, 4 Nov 2013 21:34:28 +0000 (22:34 +0100)]
btrfs: replace path->slots[0] with otherwise unused variable 'slot'

Signed-off-by: Valentina Giusti <valentina.giusti@microon.de>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <clm@fb.com>
10 years agobtrfs: remove unused variable from scrub_fixup_nodatasum
Valentina Giusti [Mon, 4 Nov 2013 21:34:27 +0000 (22:34 +0100)]
btrfs: remove unused variable from scrub_fixup_nodatasum

Signed-off-by: Valentina Giusti <valentina.giusti@microon.de>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <clm@fb.com>
10 years agobtrfs: remove unused variable from setup_cluster_no_bitmap
Valentina Giusti [Mon, 4 Nov 2013 21:34:26 +0000 (22:34 +0100)]
btrfs: remove unused variable from setup_cluster_no_bitmap

The variable window_start in setup_cluster_no_bitmap is not used since commit
1bb91902dc90e25449893e693ad45605cb08fbe5
(Btrfs: revamp clustered allocation logic)

Signed-off-by: Valentina Giusti <valentina.giusti@microon.de>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <clm@fb.com>
10 years agobtrfs: remove unused variables from extent_io.c
Valentina Giusti [Mon, 4 Nov 2013 21:34:25 +0000 (22:34 +0100)]
btrfs: remove unused variables from extent_io.c

Remove unused variables:
* tree from end_bio_extent_writepage,
* item from extent_fiemap.

Signed-off-by: Valentina Giusti <valentina.giusti@microon.de>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <clm@fb.com>
10 years agobtrfs: remove unused variable from find_free_extent
Valentina Giusti [Mon, 4 Nov 2013 21:34:24 +0000 (22:34 +0100)]
btrfs: remove unused variable from find_free_extent

The variable found_uncached_bg in find_free_extent is not used since commit
285ff5af6ce358e73f53b55c9efadd4335f4c2ff
(Btrfs: remove the ideal caching code)

Signed-off-by: Valentina Giusti <valentina.giusti@microon.de>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <clm@fb.com>
10 years agobtrfs: remove unused variables from disk-io.c
Valentina Giusti [Mon, 4 Nov 2013 21:34:23 +0000 (22:34 +0100)]
btrfs: remove unused variables from disk-io.c

Remove unused variables:
* tree from csum_dirty_buffer,
* tree from btree_readpage_end_io_hook,
* tree from btree_writepages,
* bytenr from btrfs_create_tree,
* fs_info from end_workqueue_fn.

Signed-off-by: Valentina Giusti <valentina.giusti@microon.de>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <clm@fb.com>
10 years agobtrfs: remove unused variable from btrfs_new_inode
Valentina Giusti [Mon, 4 Nov 2013 21:34:22 +0000 (22:34 +0100)]
btrfs: remove unused variable from btrfs_new_inode

Variable owner in btrfs_new_inode is unused since commit
d82a6f1d7e8b61ed5996334d0db66651bb43641d
(Btrfs: kill BTRFS_I(inode)->block_group)

Signed-off-by: Valentina Giusti <valentina.giusti@microon.de>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <clm@fb.com>
10 years agobtrfs: publish fs label in sysfs
Jeff Mahoney [Fri, 1 Nov 2013 17:07:06 +0000 (13:07 -0400)]
btrfs: publish fs label in sysfs

This adds a writeable attribute which describes the label.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <clm@fb.com>
10 years agobtrfs: publish device membership in sysfs
Jeff Mahoney [Fri, 1 Nov 2013 17:07:05 +0000 (13:07 -0400)]
btrfs: publish device membership in sysfs

Now that we have the infrastructure for per-super attributes, we can
publish device membership in /sys/fs/btrfs/<fsid>/devices. The information
is published as symlinks to the block devices.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <clm@fb.com>
10 years agobtrfs: publish allocation data in sysfs
Jeff Mahoney [Fri, 1 Nov 2013 17:07:04 +0000 (13:07 -0400)]
btrfs: publish allocation data in sysfs

While trying to debug ENOSPC issues, it's helpful to understand what the
kernel's view of the available space is. We export this information
via ioctl, but sysfs files are more easily used.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <clm@fb.com>
10 years agobtrfs: add ioctl to export size of global metadata reservation
Jeff Mahoney [Fri, 1 Nov 2013 17:07:03 +0000 (13:07 -0400)]
btrfs: add ioctl to export size of global metadata reservation

btrfs filesystem df output will show the size of the metadata space
and how much of it is used, and the user assumes that the difference
is all usable space. Since that's not actually the case due to the
global metadata reservation, we should provide the full picture to the
user.

This patch adds an ioctl that exports the size of the global metadata
reservation so that btrfs filesystem df can report it.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <clm@fb.com>
10 years agobtrfs: use feature attribute names to print better error messages
Jeff Mahoney [Fri, 1 Nov 2013 17:07:02 +0000 (13:07 -0400)]
btrfs: use feature attribute names to print better error messages

Now that we have the feature name strings available in the kernel via
the sysfs attributes, we can use them for printing better failure
messages from the ioctl path.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <clm@fb.com>
10 years agobtrfs: add ability to change features via sysfs
Jeff Mahoney [Fri, 1 Nov 2013 17:07:01 +0000 (13:07 -0400)]
btrfs: add ability to change features via sysfs

This patch adds the ability to change (set/clear) features while the file
system is mounted. A bitmask is added for each feature set for the
support to set and clear the bits. A message indicating which bit
has been set or cleared is issued when it's been changed and also when
permission or support for a particular bit has been denied.

Since the the attributes can now be writable, we need to introduce
another struct attribute to hold the different permissions.

If neither set or clear is supported, the file will have 0444 permissions.
If either set or clear is supported, the file will have 0644 permissions
and the store handler will filter out the write based on the bitmask.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <clm@fb.com>
10 years agobtrfs: publish unknown feature bits in sysfs
Jeff Mahoney [Fri, 1 Nov 2013 17:07:00 +0000 (13:07 -0400)]
btrfs: publish unknown feature bits in sysfs

With the compat and compat-ro bits, it's possible for file systems to
exist that have features that aren't supported by the kernel's file system
implementation yet still be mountable.

This patch publishes read-only info on those features using a prefix:number
format, where the number is the bit number rather than the shifted value.
e.g. "compat:12"

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <clm@fb.com>
10 years agobtrfs: publish per-super features in sysfs
Jeff Mahoney [Fri, 1 Nov 2013 17:06:59 +0000 (13:06 -0400)]
btrfs: publish per-super features in sysfs

This patch publishes information on which features are enabled in the
file system on a per-super basis. At this point, it only publishes
information on features supported by the file system implementation.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <clm@fb.com>
10 years agobtrfs: publish per-super attributes in sysfs
Jeff Mahoney [Fri, 1 Nov 2013 17:06:58 +0000 (13:06 -0400)]
btrfs: publish per-super attributes in sysfs

This patch adds per-super attributes to sysfs.

It doesn't publish any attributes yet, but does the proper lifetime
handling as well as the basic infrastructure to add new attributes.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <clm@fb.com>
10 years agobtrfs: publish supported featured in sysfs
Jeff Mahoney [Fri, 1 Nov 2013 17:06:57 +0000 (13:06 -0400)]
btrfs: publish supported featured in sysfs

This patch adds the ability to publish supported features to sysfs under
/sys/fs/btrfs/features.

The files are module-wide and export which features the kernel supports.

The content, for now, is just "0\n".

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <clm@fb.com>
10 years agokobject: export kobj_sysfs_ops
Jeff Mahoney [Fri, 1 Nov 2013 17:06:56 +0000 (13:06 -0400)]
kobject: export kobj_sysfs_ops

struct kobj_attribute implements the baseline attribute functionality
that can be used all over the place. We should export the ops associated
with it.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <clm@fb.com>
10 years agobtrfs: add ioctls to query/change feature bits online
Jeff Mahoney [Fri, 15 Nov 2013 20:33:55 +0000 (15:33 -0500)]
btrfs: add ioctls to query/change feature bits online

There are some feature bits that require no offline setup and can
be enabled online. I've only reviewed extended irefs, but there will
probably be more.

We introduce three new ioctls:
- BTRFS_IOC_GET_SUPPORTED_FEATURES: query the kernel for supported features.
- BTRFS_IOC_GET_FEATURES: query the kernel for enabled features on a per-fs
  basis, as well as querying for which features are changeable with mounted.
- BTRFS_IOC_SET_FEATURES: change features on a per-fs basis.

We introduce two new masks per feature set (_SAFE_SET and _SAFE_CLEAR) that
allow us to define which features are safe to change at runtime.

The failure modes for BTRFS_IOC_SET_FEATURES are as follows:
- Enabling a completely unsupported feature: warns and returns -ENOTSUPP
- Enabling a feature that can only be done offline: warns and returns -EPERM

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <clm@fb.com>
10 years agoBtrfs: skip merge part for delayed data refs
Liu Bo [Mon, 14 Oct 2013 04:59:43 +0000 (12:59 +0800)]
Btrfs: skip merge part for delayed data refs

When we have data deduplication on, we'll hang on the merge part
because it needs to verify every queued delayed data refs related to
this disk offset but we may have millions refs.

And in the case of delayed data refs, we don't usually have too much
data refs to merge.

So it's safe to shut it down for data refs.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <clm@fb.com>
10 years agoBtrfs: introduce a head ref rbtree
Liu Bo [Mon, 14 Oct 2013 04:59:45 +0000 (12:59 +0800)]
Btrfs: introduce a head ref rbtree

The way how we process delayed refs is
1) get a bunch of head refs,
2) pick up one head ref,
3) go one node back for any delayed ref updates.

The head ref is also linked in the same rbtree as the delayed ref is,
so in 1) stage, we have to walk one by one including not only head refs, but
delayed refs.

When we have a great number of delayed refs pending to process,
this'll cost time a lot.

Here we introduce a head ref specific rbtree, it only has head refs, so troubles
go away.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <clm@fb.com>
10 years agoBtrfs: fix check-integrity to look at the referenced data properly
Josef Bacik [Thu, 14 Nov 2013 02:11:49 +0000 (21:11 -0500)]
Btrfs: fix check-integrity to look at the referenced data properly

We were looking at file_extent_num_bytes unconditionally when looking at
referenced data bytes, but this isn't correct for compression.  Fix this by
checking the compression of the file extent we are and setting num_bytes to
disk_num_bytes in the case of compression so that we are marking the proper
bytes as referenced.  This fixes check_int_data freaking out when running
btrfs/004.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <clm@fb.com>
10 years agoBtrfs: incompatible format change to remove hole extents
Josef Bacik [Tue, 22 Oct 2013 16:18:51 +0000 (12:18 -0400)]
Btrfs: incompatible format change to remove hole extents

Btrfs has always had these filler extent data items for holes in inodes.  This
has made somethings very easy, like logging hole punches and sending hole
punches.  However for large holey files these extent data items are pure
overhead.  So add an incompatible feature to no longer add hole extents to
reduce the amount of metadata used by these sort of files.  This has a few
changes for logging and send obviously since they will need to detect holes and
log/send the holes if there are any.  I've tested this thoroughly with xfstests
and it doesn't cause any issues with and without the incompat format set.
Thanks,

Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <clm@fb.com>
10 years agoLinux 3.13
Linus Torvalds [Mon, 20 Jan 2014 02:40:07 +0000 (18:40 -0800)]
Linux 3.13

10 years agodrm/nouveau/mxm: fix null deref on load
Ilia Mirkin [Sun, 19 Jan 2014 15:30:32 +0000 (10:30 -0500)]
drm/nouveau/mxm: fix null deref on load

Since commit 61b365a505d6 ("drm/nouveau: populate master subdev pointer
only when fully constructed"), the nouveau_mxm(bios) call will return
NULL, since it's still being called from the constructor.  Instead, pass
the mxm pointer via the unused data field.

See https://bugs.freedesktop.org/show_bug.cgi?id=73791

Reported-by: Andreas Reis <andreas.reis@gmail.com>
Tested-by: Andreas Reis <andreas.reis@gmail.com>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: Dave Airlie <airlied@linux.ie>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
10 years agoMerge tag 'acpi-3.13-fixup' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael...
Linus Torvalds [Mon, 20 Jan 2014 01:18:13 +0000 (17:18 -0800)]
Merge tag 'acpi-3.13-fixup' of git://git./linux/kernel/git/rafael/linux-pm

Pull last-minute ACPI fix from Rafael Wysocki:
 "This reverts a commit that causes the Alan Cox' ASUS T100TA to "crash
  and burn" during boot if the Baytrail pinctrl driver is compiled in"

* tag 'acpi-3.13-fixup' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  Revert "ACPI: Add BayTrail SoC GPIO and LPSS ACPI IDs"

10 years agoMerge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 19 Jan 2014 21:06:51 +0000 (13:06 -0800)]
Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull perf fixes from Ingo Molnar:

 - an s2ram related fix on AMD systems

 - a perf fault handling bug that is relatively old but which has become
   much easier to trigger in v3.13 after commit e00b12e64be9 ("perf/x86:
   Further optimize copy_from_user_nmi()")

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/amd/ibs: Fix waking up from S3 for AMD family 10h
  x86, mm, perf: Allow recursive faults from interrupts

10 years agoRevert "ACPI: Add BayTrail SoC GPIO and LPSS ACPI IDs"
Rafael J. Wysocki [Fri, 17 Jan 2014 13:23:29 +0000 (14:23 +0100)]
Revert "ACPI: Add BayTrail SoC GPIO and LPSS ACPI IDs"

This reverts commit f6308b36c411 (ACPI: Add BayTrail SoC GPIO and LPSS
ACPI IDs), because it causes the Alan Cox' ASUS T100TA to "crash and
burn" during boot if the Baytrail pinctrl driver is compiled in.

Fixes: f6308b36c411 (ACPI: Add BayTrail SoC GPIO and LPSS ACPI IDs)
Reported-by: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk>
Requested-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
10 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Linus Torvalds [Sat, 18 Jan 2014 06:19:28 +0000 (22:19 -0800)]
Merge git://git./linux/kernel/git/davem/net

Pull networking fixes from David Miller:

 1) The value choosen for the new SO_MAX_PACING_RATE socket option on
    parisc was very poorly choosen, let's fix it while we still can.
    From Eric Dumazet.

 2) Our generic reciprocal divide was found to handle some edge cases
    incorrectly, part of this is encoded into the BPF as deep as the JIT
    engines themselves.  Just use a real divide throughout for now.
    From Eric Dumazet.

 3) Because the initial lookup is lockless, the TCP metrics engine can
    end up creating two entries for the same lookup key.  Fix this by
    doing a second lookup under the lock before we actually create the
    new entry.  From Christoph Paasch.

 4) Fix scatter-gather list init in usbnet driver, from Bjørn Mork.

 5) Fix unintended 32-bit truncation in cxgb4 driver's bit shifting.
    From Dan Carpenter.

 6) Netlink socket dumping uses the wrong socket state for timewait
    sockets.  Fix from Neal Cardwell.

 7) Fix netlink memory leak in ieee802154_add_iface(), from Christian
    Engelmayer.

 8) Multicast forwarding in ipv4 can overflow the per-rule reference
    counts, causing all multicast traffic to cease.  Fix from Hannes
    Frederic Sowa.

 9) via-rhine needs to stop all TX queues when it resets the device,
    from Richard Weinberger.

10) Fix RDS per-cpu accesses broken by the this_cpu_* conversions.  From
    Gerald Schaefer.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  s390/bpf,jit: fix 32 bit divisions, use unsigned divide instructions
  parisc: fix SO_MAX_PACING_RATE typo
  ipv6: simplify detection of first operational link-local address on interface
  tcp: metrics: Avoid duplicate entries with the same destination-IP
  net: rds: fix per-cpu helper usage
  e1000e: Fix compilation warning when !CONFIG_PM_SLEEP
  bpf: do not use reciprocal divide
  be2net: add dma_mapping_error() check for dma_map_page()
  bnx2x: Don't release PCI bars on shutdown
  net,via-rhine: Fix tx_timeout handling
  batman-adv: fix batman-adv header overhead calculation
  qlge: Fix vlan netdev features.
  net: avoid reference counter overflows on fib_rules in multicast forwarding
  dm9601: add USB IDs for new dm96xx variants
  MAINTAINERS: add virtio-dev ML for virtio
  ieee802154: Fix memory leak in ieee802154_add_iface()
  net: usbnet: fix SG initialisation
  inet_diag: fix inet_diag_dump_icsk() to use correct state for timewait sockets
  cxgb4: silence shift wrapping static checker warning

10 years agos390/bpf,jit: fix 32 bit divisions, use unsigned divide instructions
Heiko Carstens [Fri, 17 Jan 2014 08:37:15 +0000 (09:37 +0100)]
s390/bpf,jit: fix 32 bit divisions, use unsigned divide instructions

The s390 bpf jit compiler emits the signed divide instructions "dr" and "d"
for unsigned divisions.
This can cause problems: the dividend will be zero extended to a 64 bit value
and the divisor is the 32 bit signed value as specified A or X accumulator,
even though A and X are supposed to be treated as unsigned values.

The divide instrunctions will generate an exception if the result cannot be
expressed with a 32 bit signed value.
This is the case if e.g. the dividend is 0xffffffff and the divisor either 1
or also 0xffffffff (signed: -1).

To avoid all these issues simply use unsigned divide instructions.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoparisc: fix SO_MAX_PACING_RATE typo
Eric Dumazet [Thu, 16 Jan 2014 19:15:12 +0000 (11:15 -0800)]
parisc: fix SO_MAX_PACING_RATE typo

SO_MAX_PACING_RATE definition on parisc got a typo.
Its not too late to fix it, before 3.13 is official.

Fixes: 62748f32d501 ("net: introduce SO_MAX_PACING_RATE")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoipv6: simplify detection of first operational link-local address on interface
Hannes Frederic Sowa [Thu, 16 Jan 2014 19:13:04 +0000 (20:13 +0100)]
ipv6: simplify detection of first operational link-local address on interface

In commit 1ec047eb4751e3 ("ipv6: introduce per-interface counter for
dad-completed ipv6 addresses") I build the detection of the first
operational link-local address much to complex. Additionally this code
now has a race condition.

Replace it with a much simpler variant, which just scans the address
list when duplicate address detection completes, to check if this is
the first valid link local address and send RS and MLD reports then.

Fixes: 1ec047eb4751e3 ("ipv6: introduce per-interface counter for dad-completed ipv6 addresses")
Reported-by: Jiri Pirko <jiri@resnulli.us>
Cc: Flavio Leitner <fbl@redhat.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Flavio Leitner <fbl@redhat.com>
Acked-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agotcp: metrics: Avoid duplicate entries with the same destination-IP
Christoph Paasch [Thu, 16 Jan 2014 19:01:21 +0000 (20:01 +0100)]
tcp: metrics: Avoid duplicate entries with the same destination-IP

Because the tcp-metrics is an RCU-list, it may be that two
soft-interrupts are inside __tcp_get_metrics() for the same
destination-IP at the same time. If this destination-IP is not yet part of
the tcp-metrics, both soft-interrupts will end up in tcpm_new and create
a new entry for this IP.
So, we will have two tcp-metrics with the same destination-IP in the list.

This patch checks twice __tcp_get_metrics(). First without holding the
lock, then while holding the lock. The second one is there to confirm
that the entry has not been added by another soft-irq while waiting for
the spin-lock.

Fixes: 51c5d0c4b169b (tcp: Maintain dynamic metrics in local cache.)
Signed-off-by: Christoph Paasch <christoph.paasch@uclouvain.be>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agonet: rds: fix per-cpu helper usage
Gerald Schaefer [Thu, 16 Jan 2014 15:54:48 +0000 (16:54 +0100)]
net: rds: fix per-cpu helper usage

commit ae4b46e9d "net: rds: use this_cpu_* per-cpu helper" broke per-cpu
handling for rds. chpfirst is the result of __this_cpu_read(), so it is
an absolute pointer and not __percpu. Therefore, __this_cpu_write()
should not operate on chpfirst, but rather on cache->percpu->first, just
like __this_cpu_read() did before.

Cc: <stable@vger.kernel.org> # 3.8+
Signed-off-byd Gerald Schaefer <gerald.schaefer@de.ibm.com>

Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm...
Linus Torvalds [Sat, 18 Jan 2014 01:29:36 +0000 (17:29 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/ebiederm/user-namespace

Pull namespace fixes from Eric Biederman:
 "This is a set of 3 regression fixes.

  This fixes /proc/mounts when using "ip netns add <netns>" to display
  the actual mount point.

  This fixes a regression in clone that broke lxc-attach.

  This fixes a regression in the permission checks for mounting /proc
  that made proc unmountable if binfmt_misc was in use.  Oops.

  My apologies for sending this pull request so late.  Al Viro gave
  interesting review comments about the d_path fix that I wanted to
  address in detail before I sent this pull request.  Unfortunately a
  bad round of colds kept from addressing that in detail until today.
  The executive summary of the review was:

  Al: Is patching d_path really sufficient?
      The prepend_path, d_path, d_absolute_path, and __d_path family of
      functions is a really mess.

  Me: Yes, patching d_path is really sufficient.  Yes, the code is mess.
      No it is not appropriate to rewrite all of d_path for a regression
      that has existed for entirely too long already, when a two line
      change will do"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  vfs: Fix a regression in mounting proc
  fork:  Allow CLONE_PARENT after setns(CLONE_NEWPID)
  vfs: In d_path don't call d_dname on a mount point

10 years agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Linus Torvalds [Sat, 18 Jan 2014 00:40:27 +0000 (16:40 -0800)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm

Pull KVM fix from Paolo Bonzini:
 "Fix for a brown paper bag bug.  Thanks to Drew Jones for noticing"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  kvm: x86: fix apic_base enable check

10 years agoMerge tag 'batman-adv-fix-for-davem' of git://git.open-mesh.org/linux-merge
David S. Miller [Fri, 17 Jan 2014 01:16:43 +0000 (17:16 -0800)]
Merge tag 'batman-adv-fix-for-davem' of git://git.open-mesh.org/linux-merge

Included change:
- properly compute the batman-adv header overhead. Such
  result is later used to initialize the hard_header_len
  member of the soft-interface netdev object

Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoMerge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Linus Torvalds [Fri, 17 Jan 2014 00:33:27 +0000 (11:33 +1100)]
Merge tag 'arm64-fixes' of git://git./linux/kernel/git/arm64/linux

Pull arm64 fix from Catalin Marinas:
 "Revert "arm64: Fix memory shareability attribute for ioremap_wc/cache"

  We noticed that it breaks ioremap (and earlyprintk) with 64K page
  configuration"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  Revert "arm64: Fix memory shareability attribute for ioremap_wc/cache"

10 years agopercpu_counter: unbreak __percpu_counter_add()
Hugh Dickins [Thu, 16 Jan 2014 23:26:48 +0000 (15:26 -0800)]
percpu_counter: unbreak __percpu_counter_add()

Commit 74e72f894d56 ("lib/percpu_counter.c: fix __percpu_counter_add()")
looked very plausible, but its arithmetic was badly wrong: obvious once
you see the fix, but maddening to get there from the weird tmpfs ENOSPCs

Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Ming Lei <tom.leiming@gmail.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Shaohua Li <shli@fusionio.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Fan Du <fan.du@windriver.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
10 years agoe1000e: Fix compilation warning when !CONFIG_PM_SLEEP
Mika Westerberg [Thu, 16 Jan 2014 12:39:39 +0000 (14:39 +0200)]
e1000e: Fix compilation warning when !CONFIG_PM_SLEEP

Commit 7509963c703b (e1000e: Fix a compile flag mis-match for
suspend/resume) moved suspend and resume hooks to be available when
CONFIG_PM is set. However, it can be set even if CONFIG_PM_SLEEP is not set
causing following warnings to be emitted:

drivers/net/ethernet/intel/e1000e/netdev.c:6178:12: warning:
   ‘e1000_suspend’ defined but not used [-Wunused-function]

drivers/net/ethernet/intel/e1000e/netdev.c:6185:12: warning:
‘e1000_resume’ defined but not used [-Wunused-function]

To fix this make the hooks to be available only when CONFIG_PM_SLEEP is set
and remove CONFIG_PM wrapping from driver ops because this is already
handled by SET_SYSTEM_SLEEP_PM_OPS() and SET_RUNTIME_PM_OPS().

Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: Dave Ertman <davidx.m.ertman@intel.com>
Cc: Aaron Brown <aaron.f.brown@intel.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoRevert "arm64: Fix memory shareability attribute for ioremap_wc/cache"
Catalin Marinas [Thu, 16 Jan 2014 18:32:25 +0000 (18:32 +0000)]
Revert "arm64: Fix memory shareability attribute for ioremap_wc/cache"

This reverts commit 2f7dc6027522499582a520807cb9ffda589de47e.

The above commit breaks the mapping type for Device memory because
pgprot_default already contains a Normal memory type. pgprot_default is
also not initialised early enough for earlyprintk resulting in an
inconsistent memory mapping with 64K PAGE_SIZE configuration.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Will Deacon <will.deacon@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
10 years agoperf/x86/amd/ibs: Fix waking up from S3 for AMD family 10h
Robert Richter [Wed, 15 Jan 2014 14:57:29 +0000 (15:57 +0100)]
perf/x86/amd/ibs: Fix waking up from S3 for AMD family 10h

On AMD family 10h we see following error messages while waking up from
S3 for all non-boot CPUs leading to a failed IBS initialization:

 Enabling non-boot CPUs ...
 smpboot: Booting Node 0 Processor 1 APIC 0x1
 [Firmware Bug]: cpu 1, try to use APIC500 (LVT offset 0) for vector 0x400, but the register is already in use for vector 0xf9 on another cpu
 perf: IBS APIC setup failed on cpu #1
 process: Switch to broadcast mode on CPU1
 CPU1 is up
 ...
 ACPI: Waking up from system sleep state S3

Reason for this is that during suspend the LVT offset for the IBS
vector gets lost and needs to be reinialized while resuming.

The offset is read from the IBSCTL msr. On family 10h the offset needs
to be 1 as offset 0 is used for the MCE threshold interrupt, but
firmware assings it for IBS to 0 too. The kernel needs to reprogram
the vector. The msr is a readonly node msr, but a new value can be
written via pci config space access. The reinitialization is
implemented for family 10h in setup_ibs_ctl() which is forced during
IBS setup.

This patch fixes IBS setup after waking up from S3 by adding
resume/supend hooks for the boot cpu which does the offset
reinitialization.

Marking it as stable to let distros pick up this fix.

Signed-off-by: Robert Richter <rric@kernel.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: <stable@vger.kernel.org> v3.2..
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1389797849-5565-1-git-send-email-rric.net@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
10 years agox86, mm, perf: Allow recursive faults from interrupts
Peter Zijlstra [Fri, 10 Jan 2014 20:06:03 +0000 (21:06 +0100)]
x86, mm, perf: Allow recursive faults from interrupts

Waiman managed to trigger a PMI while in a emulate_vsyscall() fault,
the PMI in turn managed to trigger a fault while obtaining a stack
trace. This triggered the sig_on_uaccess_error recursive fault logic
and killed the process dead.

Fix this by explicitly excluding interrupts from the recursive fault
logic.

Reported-and-Tested-by: Waiman Long <waiman.long@hp.com>
Fixes: e00b12e64be9 ("perf/x86: Further optimize copy_from_user_nmi()")
Cc: Aswin Chandramouleeswaran <aswin@hp.com>
Cc: Scott J Norton <scott.norton@hp.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20140110200603.GJ7572@laptop.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
10 years agoMerge branches 'sched-urgent-for-linus' and 'timers-urgent-for-linus' of git://git...
Linus Torvalds [Thu, 16 Jan 2014 01:33:21 +0000 (08:33 +0700)]
Merge branches 'sched-urgent-for-linus' and 'timers-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull scheduler and timer fixes from Ingo Molnar:
 "Contains a fix for a scheduler bug that manifested itself as a 3D
  performance regression and a crash fix for the ARM Cadence TTC clock
  driver"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched: Calculate effective load even if local weight is 0

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  clocksource: cadence_ttc: Fix mutex taken inside interrupt context

10 years agoMerge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Thu, 16 Jan 2014 01:31:55 +0000 (08:31 +0700)]
Merge branch 'core-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull locking fixes from Ingo Molnar:
 "Two fixes from lockdep coverage of seqlocks, which fix deadlocks on
  lockdep-enabled ARM systems"

* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched_clock: Disable seqlock lockdep usage in sched_clock()
  seqlock: Use raw_ prefix instead of _no_lockdep

10 years agoMerge tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck...
Linus Torvalds [Thu, 16 Jan 2014 01:26:44 +0000 (08:26 +0700)]
Merge tag 'hwmon-for-linus' of git://git./linux/kernel/git/groeck/linux-staging

Pull hwmon fix from Guenter Roeck:
 "Fix attribute length problem in coretemp driver"

* tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  hwmon: (coretemp) Fix truncated name of alarm attributes

10 years agoMerge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm
Linus Torvalds [Thu, 16 Jan 2014 01:26:00 +0000 (08:26 +0700)]
Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm

Pull ARM fixes from Russell King:
 "Another few fixes for ARM, nothing major here"

* 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
  ARM: 7938/1: OMAP4/highbank: Flush L2 cache before disabling
  ARM: 7939/1: traps: fix opcode endianness when read from user memory
  ARM: 7937/1: perf_event: Silence sparse warning
  ARM: 7934/1: DT/kernel: fix arch_match_cpu_phys_id to avoid erroneous match
  Revert "ARM: 7908/1: mm: Fix the arm_dma_limit calculation"

10 years agoMerge tag 'writeback-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg...
Linus Torvalds [Thu, 16 Jan 2014 01:23:34 +0000 (08:23 +0700)]
Merge tag 'writeback-fixes' of git://git./linux/kernel/git/wfg/linux

Pull writeback fix from Wu Fengguang:
 "Fix data corruption on NFS writeback.

  It has been in linux-next for one month"

* tag 'writeback-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux:
  writeback: Fix data corruption on NFS

10 years agoMerge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa...
Linus Torvalds [Thu, 16 Jan 2014 01:21:17 +0000 (08:21 +0700)]
Merge branch 'i2c/for-current' of git://git./linux/kernel/git/wsa/linux

Pull i2c bugfix from Wolfram Sang.

* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: Re-instate body of i2c_parent_is_i2c_adapter()

10 years agobpf: do not use reciprocal divide
Eric Dumazet [Wed, 15 Jan 2014 14:50:07 +0000 (06:50 -0800)]
bpf: do not use reciprocal divide

At first Jakub Zawadzki noticed that some divisions by reciprocal_divide
were not correct. (off by one in some cases)
http://www.wireshark.org/~darkjames/reciprocal-buggy.c

He could also show this with BPF:
http://www.wireshark.org/~darkjames/set-and-dump-filter-k-bug.c

The reciprocal divide in linux kernel is not generic enough,
lets remove its use in BPF, as it is not worth the pain with
current cpus.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Jakub Zawadzki <darkjames-ws@darkjames.pl>
Cc: Mircea Gherzan <mgherzan@gmail.com>
Cc: Daniel Borkmann <dxchgb@gmail.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Matt Evans <matt@ozlabs.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agobe2net: add dma_mapping_error() check for dma_map_page()
Ivan Vecera [Wed, 15 Jan 2014 10:11:34 +0000 (11:11 +0100)]
be2net: add dma_mapping_error() check for dma_map_page()

The driver does not check value returned by dma_map_page. The patch
fixes this.

v2: Removed the bugfix for non-bug ;-) (thanks Sathya)

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Acked-by: Sathya Perla <Sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agobnx2x: Don't release PCI bars on shutdown
Yuval Mintz [Wed, 15 Jan 2014 10:05:30 +0000 (12:05 +0200)]
bnx2x: Don't release PCI bars on shutdown

The bnx2x driver in its pci shutdown() callback releases its pci bars (in the
same manner it does during its pci remove() callback).
During a system reboot while VFs are enabled, its possible for the VF's remove
to be called (as a result of pci_disable_sriov()) after its shutdown callback
has already finished running; This will cause a paging request fault as the VF
tries to access the pci bar which it has previously released, crashing the
system.

This patch further differentiates the shutdown and remove callbacks, preventing the
pci release procedures from being called during shutdown.

Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agonet,via-rhine: Fix tx_timeout handling
Richard Weinberger [Tue, 14 Jan 2014 21:46:36 +0000 (22:46 +0100)]
net,via-rhine: Fix tx_timeout handling

rhine_reset_task() misses to disable the tx scheduler upon reset,
this can lead to a crash if work is still scheduled while we're resetting
the tx queue.

Fixes:
[   93.591707] BUG: unable to handle kernel NULL pointer dereference at 0000004c
[   93.595514] IP: [<c119d10d>] rhine_napipoll+0x491/0x6

Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agobatman-adv: fix batman-adv header overhead calculation
Marek Lindner [Wed, 15 Jan 2014 12:31:18 +0000 (20:31 +0800)]
batman-adv: fix batman-adv header overhead calculation

Batman-adv prepends a full ethernet header in addition to its own
header. This has to be reflected in the MTU calculation, especially
since the value is used to set dev->hard_header_len.

Introduced by 411d6ed93a5d0601980d3e5ce75de07c98e3a7de
("batman-adv: consider network coding overhead when calculating required mtu")

Reported-by: cmsv <cmsv@wirelesspt.net>
Reported-by: Martin Hundebøll <martin@hundeboll.net>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
10 years agokvm: x86: fix apic_base enable check
Andrew Jones [Wed, 15 Jan 2014 12:39:59 +0000 (13:39 +0100)]
kvm: x86: fix apic_base enable check

Commit e66d2ae7c67bd moved the assignment
vcpu->arch.apic_base = value above a condition with
(vcpu->arch.apic_base ^ value), causing that check
to always fail. Use old_value, vcpu->arch.apic_base's
old value, in the condition instead.

Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
10 years agoMerge branch 'akpm' (incoming from Andrew)
Linus Torvalds [Wed, 15 Jan 2014 08:42:11 +0000 (15:42 +0700)]
Merge branch 'akpm' (incoming from Andrew)

Merge patches from Andrew Morton:
 "Six fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  lib/percpu_counter.c: fix __percpu_counter_add()
  crash_dump: fix compilation error (on MIPS at least)
  mm: fix crash when using XFS on loopback
  MIPS: fix blast_icache32 on loongson2
  MIPS: fix case mismatch in local_r4k_flush_icache_range()
  nilfs2: fix segctor bug that causes file system corruption

10 years agoMerge tag 'md/3.13-fixes' of git://neil.brown.name/md
Linus Torvalds [Wed, 15 Jan 2014 08:07:36 +0000 (15:07 +0700)]
Merge tag 'md/3.13-fixes' of git://neil.brown.name/md

Pull late md fixes from Neil Brown:
 "Half a dozen md bug fixes.

  All of these fix real bugs the people have hit, and are tagged for
  -stable.  Sorry they are late ....  Christmas holidays and all that.
  Hopefully they can still squeak into 3.13"

* tag 'md/3.13-fixes' of git://neil.brown.name/md:
  md: fix problem when adding device to read-only array with bitmap.
  md/raid10: fix bug when raid10 recovery fails to recover a block.
  md/raid5: fix a recently broken BUG_ON().
  md/raid1: fix request counting bug in new 'barrier' code.
  md/raid10: fix two bugs in handling of known-bad-blocks.
  md/raid5: Fix possible confusion when multiple write errors occur.

10 years agoMerge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Linus Torvalds [Wed, 15 Jan 2014 08:06:14 +0000 (15:06 +0700)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux

Pull drm fixes from Dave Airlie:
 "One nouveau regression fix on older cards, i915 black screen fixes,
  and a revert for a strange G33 intel problem"

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  drm/nouveau: fix null ptr dereferences on some boards
  Revert "drm: copy mode type in drm_mode_connector_list_update()"
  drm/i915/bdw: make sure south port interrupts are enabled properly v2
  drm/i915: Don't grab crtc mutexes in intel_modeset_gem_init()
  drm/i915: fix DDI PLLs HW state readout code

10 years agolib/percpu_counter.c: fix __percpu_counter_add()
Ming Lei [Wed, 15 Jan 2014 01:56:42 +0000 (17:56 -0800)]
lib/percpu_counter.c: fix __percpu_counter_add()

__percpu_counter_add() may be called in softirq/hardirq handler (such
as, blk_mq_queue_exit() is typically called in hardirq/softirq handler),
so we need to call this_cpu_add()(irq safe helper) to update percpu
counter, otherwise counts may be lost.

This fixes the problem that 'rmmod null_blk' hangs in blk_cleanup_queue()
because of miscounting of request_queue->mq_usage_counter.

This patch is the v1 of previous one of "lib/percpu_counter.c:
disable local irq when updating percpu couter", and takes Andrew's
approach which may be more efficient for ARCHs(x86, s390) that
have optimized this_cpu_add().

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Shaohua Li <shli@fusionio.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Fan Du <fan.du@windriver.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
10 years agocrash_dump: fix compilation error (on MIPS at least)
Qais Yousef [Wed, 15 Jan 2014 01:56:41 +0000 (17:56 -0800)]
crash_dump: fix compilation error (on MIPS at least)

  In file included from kernel/crash_dump.c:2:0:
  include/linux/crash_dump.h:22:27: error: unknown type name `pgprot_t'

when CONFIG_CRASH_DUMP=y

The error was traced back to commit 9cb218131de1 ("vmcore: introduce
remap_oldmem_pfn_range()")

include <asm/pgtable.h> to get the missing definition

Signed-off-by: Qais Yousef <qais.yousef@imgtec.com>
Reviewed-by: James Hogan <james.hogan@imgtec.com>
Cc: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Cc: <stable@vger.kernel.org> [3.12+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
10 years agomm: fix crash when using XFS on loopback
Mikulas Patocka [Wed, 15 Jan 2014 01:56:40 +0000 (17:56 -0800)]
mm: fix crash when using XFS on loopback

Commit 8456a648cf44 ("slab: use struct page for slab management") causes
a crash in the LVM2 testsuite on PA-RISC (the crashing test is
fsadm.sh).  The testsuite doesn't crash on 3.12, crashes on 3.13-rc1 and
later.

 Bad Address (null pointer deref?): Code=15 regs=000000413edd89a0 (Addr=000006202224647d)
 CPU: 3 PID: 24008 Comm: loop0 Not tainted 3.13.0-rc6 #5
 task: 00000001bf3c0048 ti: 000000413edd8000 task.ti: 000000413edd8000

      YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
 PSW: 00001000000001101111100100001110 Not tainted
 r00-03  000000ff0806f90e 00000000405c8de0 000000004013e6c0 000000413edd83f0
 r04-07  00000000405a95e0 0000000000000200 00000001414735f0 00000001bf349e40
 r08-11  0000000010fe3d10 0000000000000001 00000040829c7778 000000413efd9000
 r12-15  0000000000000000 000000004060d800 0000000010fe3000 0000000010fe3000
 r16-19  000000413edd82a0 00000041078ddbc0 0000000000000010 0000000000000001
 r20-23  0008f3d0d83a8000 0000000000000000 00000040829c7778 0000000000000080
 r24-27  00000001bf349e40 00000001bf349e40 202d66202224640d 00000000405a95e0
 r28-31  202d662022246465 000000413edd88f0 000000413edd89a0 0000000000000001
 sr00-03  000000000532c000 0000000000000000 0000000000000000 000000000532c000
 sr04-07  0000000000000000 0000000000000000 0000000000000000 0000000000000000

 IASQ: 0000000000000000 0000000000000000 IAOQ: 00000000401fe42c 00000000401fe430
  IIR: 539c0030    ISR: 00000000202d6000  IOR: 000006202224647d
  CPU:        3   CR30: 000000413edd8000 CR31: 0000000000000000
  ORIG_R28: 00000000405a95e0
  IAOQ[0]: vma_interval_tree_iter_first+0x14/0x48
  IAOQ[1]: vma_interval_tree_iter_first+0x18/0x48
  RP(r2): flush_dcache_page+0x128/0x388
 Backtrace:
   flush_dcache_page+0x128/0x388
   lo_splice_actor+0x90/0x148 [loop]
   splice_from_pipe_feed+0xc0/0x1d0
   __splice_from_pipe+0xac/0xc0
   lo_direct_splice_actor+0x1c/0x70 [loop]
   splice_direct_to_actor+0xec/0x228
   lo_receive+0xe4/0x298 [loop]
   loop_thread+0x478/0x640 [loop]
   kthread+0x134/0x168
   end_fault_vector+0x20/0x28
   xfs_setsize_buftarg+0x0/0x90 [xfs]

 Kernel panic - not syncing: Bad Address (null pointer deref?)

Commit 8456a648cf44 changes the page structure so that the slab
subsystem reuses the page->mapping field.

The crash happens in the following way:
 * XFS allocates some memory from slab and issues a bio to read data
   into it.
 * the bio is sent to the loopback device.
 * lo_receive creates an actor and calls splice_direct_to_actor.
 * lo_splice_actor copies data to the target page.
 * lo_splice_actor calls flush_dcache_page because the page may be
   mapped by userspace.  In that case we need to flush the kernel cache.
 * flush_dcache_page asks for the list of userspace mappings, however
   that page->mapping field is reused by the slab subsystem for a
   different purpose.  This causes the crash.

Note that other architectures without coherent caches (sparc, arm, mips)
also call page_mapping from flush_dcache_page, so they may crash in the
same way.

This patch fixes this bug by testing if the page is a slab page in
page_mapping and returning NULL if it is.

The patch also fixes VM_BUG_ON(PageSlab(page)) that could happen in
earlier kernels in the same scenario on architectures without cache
coherence when CONFIG_DEBUG_VM is enabled - so it should be backported
to stable kernels.

In the old kernels, the function page_mapping is placed in
include/linux/mm.h, so you should modify the patch accordingly when
backporting it.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: John David Anglin <dave.anglin@bell.net>]
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Christoph Lameter <cl@linux.com>
Acked-by: Pekka Enberg <penberg@kernel.org>
Reviewed-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Helge Deller <deller@gmx.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
10 years agoMIPS: fix blast_icache32 on loongson2
Aaro Koskinen [Wed, 15 Jan 2014 01:56:38 +0000 (17:56 -0800)]
MIPS: fix blast_icache32 on loongson2

Commit 14bd8c082016 ("MIPS: Loongson: Get rid of Loongson 2 #ifdefery
all over arch/mips") failed to add Loongson2 specific blast_icache32
functions.  Fix that.

The patch fixes the following crash seen with 3.13-rc1:

  Reserved instruction in kernel code[#1]:
  [...]
  Call Trace:
    blast_icache32_page+0x8/0xb0
    r4k_flush_cache_page+0x19c/0x200
    do_wp_page.isra.97+0x47c/0xe08
    handle_mm_fault+0x938/0x1118
    __do_page_fault+0x140/0x540
    resume_userspace_check+0x0/0x10
  Code: 00200825  64834000  00200825 <bc900000bc900020  bc900040  bc900060  bc900080  bc9000a0

Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Acked-by: John Crispin <blogic@openwrt.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
10 years agoMIPS: fix case mismatch in local_r4k_flush_icache_range()
Huacai Chen [Wed, 15 Jan 2014 01:56:37 +0000 (17:56 -0800)]
MIPS: fix case mismatch in local_r4k_flush_icache_range()

Currently, Loongson-2 call protected_blast_icache_range() and others
call protected_loongson23_blast_icache_range(), but I think the correct
behavior should be the opposite.  BTW, Loongson-3's cache-ops is
compatible with MIPS64, but not compatible with Loongson-2.  So, rename
xxx_loongson23_yyy things to xxx_loongson2_yyy.

The patch fixes early boot hang with 3.13-rc1, introduced in commit
14bd8c082016 ("MIPS: Loongson: Get rid of Loongson 2 #ifdefery all over
arch/mips").

Signed-off-by: Huacai Chen <chenhc@lemote.com>
Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Acked-by: John Crispin <blogic@openwrt.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
10 years agonilfs2: fix segctor bug that causes file system corruption
Andreas Rohner [Wed, 15 Jan 2014 01:56:36 +0000 (17:56 -0800)]
nilfs2: fix segctor bug that causes file system corruption

There is a bug in the function nilfs_segctor_collect, which results in
active data being written to a segment, that is marked as clean.  It is
possible, that this segment is selected for a later segment
construction, whereby the old data is overwritten.

The problem shows itself with the following kernel log message:

  nilfs_sufile_do_cancel_free: segment 6533 must be clean

Usually a few hours later the file system gets corrupted:

  NILFS: bad btree node (blocknr=8748107): level = 0, flags = 0x0, nchildren = 0
  NILFS error (device sdc1): nilfs_bmap_last_key: broken bmap (inode number=114660)

The issue can be reproduced with a file system that is nearly full and
with the cleaner running, while some IO intensive task is running.
Although it is quite hard to reproduce.

This is what happens:

 1. The cleaner starts the segment construction
 2. nilfs_segctor_collect is called
 3. sc_stage is on NILFS_ST_SUFILE and segments are freed
 4. sc_stage is on NILFS_ST_DAT current segment is full
 5. nilfs_segctor_extend_segments is called, which
    allocates a new segment
 6. The new segment is one of the segments freed in step 3
 7. nilfs_sufile_cancel_freev is called and produces an error message
 8. Loop around and the collection starts again
 9. sc_stage is on NILFS_ST_SUFILE and segments are freed
    including the newly allocated segment, which will contain active
    data and can be allocated at a later time
10. A few hours later another segment construction allocates the
    segment and causes file system corruption

This can be prevented by simply reordering the statements.  If
nilfs_sufile_cancel_freev is called before nilfs_segctor_extend_segments
the freed segments are marked as dirty and cannot be allocated any more.

Signed-off-by: Andreas Rohner <andreas.rohner@gmx.net>
Reviewed-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Tested-by: Andreas Rohner <andreas.rohner@gmx.net>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
10 years agoMerge branch 'clockevents/3.13-fixes' of git://git.linaro.org/people/daniel.lezcano...
Ingo Molnar [Wed, 15 Jan 2014 06:39:30 +0000 (07:39 +0100)]
Merge branch 'clockevents/3.13-fixes' of git://git.linaro.org/people/daniel.lezcano/linux into timers/urgent

Pull clock driver fix from Daniel Lezcano:

 " * Soren Brinkmann fixed the cadence_ttc driver where a call to
     clk_get_rate happens in an interrupt context. More precisely in an IPI
     when the broadcast timer is initialized for each cpu in the cpuidle
     driver. "

Signed-off-by: Ingo Molnar <mingo@kernel.org>
10 years agoMerge branch 'drm-nouveau-next' of git://git.freedesktop.org/git/nouveau/linux-2...
Dave Airlie [Wed, 15 Jan 2014 05:01:11 +0000 (15:01 +1000)]
Merge branch 'drm-nouveau-next' of git://git.freedesktop.org/git/nouveau/linux-2.6 into drm-fixes

Single regression fix for nouveau

* 'drm-nouveau-next' of git://git.freedesktop.org/git/nouveau/linux-2.6:
  drm/nouveau: fix null ptr dereferences on some boards

10 years agodrm/nouveau: fix null ptr dereferences on some boards
Ben Skeggs [Tue, 14 Jan 2014 04:56:22 +0000 (14:56 +1000)]
drm/nouveau: fix null ptr dereferences on some boards

Regression from "device: populate master subdev pointer only when fully
constructed"

Reported-by: Bob Gleitsmann <rjgleits@bellsouth.net>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
10 years agoqlge: Fix vlan netdev features.
Jitendra Kalsaria [Tue, 14 Jan 2014 18:57:25 +0000 (13:57 -0500)]
qlge: Fix vlan netdev features.

vlan gets the same netdev features except vlan filter.

Signed-off-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agonet: avoid reference counter overflows on fib_rules in multicast forwarding
Hannes Frederic Sowa [Mon, 13 Jan 2014 01:45:22 +0000 (02:45 +0100)]
net: avoid reference counter overflows on fib_rules in multicast forwarding

Bob Falken reported that after 4G packets, multicast forwarding stopped
working. This was because of a rule reference counter overflow which
freed the rule as soon as the overflow happend.

This patch solves this by adding the FIB_LOOKUP_NOREF flag to
fib_rules_lookup calls. This is safe even from non-rcu locked sections
as in this case the flag only implies not taking a reference to the rule,
which we don't need at all.

Rules only hold references to the namespace, which are guaranteed to be
available during the call of the non-rcu protected function reg_vif_xmit
because of the interface reference which itself holds a reference to
the net namespace.

Fixes: f0ad0860d01e47 ("ipv4: ipmr: support multiple tables")
Fixes: d1db275dd3f6e4 ("ipv6: ip6mr: support multiple tables")
Reported-by: Bob Falken <NetFestivalHaveFun@gmx.com>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Thomas Graf <tgraf@suug.ch>
Cc: Julian Anastasov <ja@ssi.bg>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agodm9601: add USB IDs for new dm96xx variants
Peter Korsgaard [Sun, 12 Jan 2014 22:15:51 +0000 (23:15 +0100)]
dm9601: add USB IDs for new dm96xx variants

A number of new dm96xx variants now exist.

Reported-by: Joseph Chang <joseph_chang@davicom.com.tw>
Signed-off-by: Peter Korsgaard <peter@korsgaard.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoMAINTAINERS: add virtio-dev ML for virtio
Michael S. Tsirkin [Sun, 12 Jan 2014 11:37:56 +0000 (13:37 +0200)]
MAINTAINERS: add virtio-dev ML for virtio

Since virtio is an OASIS standard draft now, virtio implementation
discussions are taking place on the virtio-dev OASIS mailing list.
Update MAINTAINERS.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoieee802154: Fix memory leak in ieee802154_add_iface()
Christian Engelmayer [Sat, 11 Jan 2014 21:19:30 +0000 (22:19 +0100)]
ieee802154: Fix memory leak in ieee802154_add_iface()

Fix a memory leak in the ieee802154_add_iface() error handling path.
Detected by Coverity: CID 710490.

Signed-off-by: Christian Engelmayer <cengelma@gmx.at>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agohwmon: (coretemp) Fix truncated name of alarm attributes
Jean Delvare [Tue, 14 Jan 2014 14:59:55 +0000 (15:59 +0100)]
hwmon: (coretemp) Fix truncated name of alarm attributes

When the core number exceeds 9, the size of the buffer storing the
alarm attribute name is insufficient and the attribute name is
truncated. This causes libsensors to skip these attributes as the
truncated name is not recognized.

Reported-by: Andreas Hollmann <hollmann@in.tum.de>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: stable@vger.kernel.org
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
10 years agoi2c: Re-instate body of i2c_parent_is_i2c_adapter()
Stephen Warren [Mon, 13 Jan 2014 21:29:04 +0000 (14:29 -0700)]
i2c: Re-instate body of i2c_parent_is_i2c_adapter()

The body of i2c_parent_is_i2c_adapter() is currently guarded by
I2C_MUX. It should be CONFIG_I2C_MUX instead.

Among potentially other problems, this resulted in i2c_lock_adapter()
only locking I2C mux child adapters, and not the parent adapter. In
turn, this could allow inter-mingling of mux child selection and I2C
transactions, which could result in I2C transactions being directed to
the wrong I2C bus, and possibly even switching between busses in the
middle of a transaction.

One concrete issue caused by this bug was corrupted HDMI EDID reads
during boot on the NVIDIA Tegra Seaboard system, although this only
became apparent in recent linux-next, when the boot timing was changed
just enough to trigger the race condition.

Fixes: 3923172b3d70 ("i2c: reduce parent checking to a NOOP in non-I2C_MUX case")
Cc: Phil Carmody <phil.carmody@partner.samsung.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
10 years agonet: usbnet: fix SG initialisation
Bjørn Mork [Fri, 10 Jan 2014 22:10:17 +0000 (23:10 +0100)]
net: usbnet: fix SG initialisation

Commit 60e453a940ac ("USBNET: fix handling padding packet")
added an extra SG entry in case padding is necessary, but
failed to update the initialisation of the list. This can
cause list traversal to fall off the end of the list,
resulting in an oops.

Fixes: 60e453a940ac ("USBNET: fix handling padding packet")
Reported-by: Thomas Kear <thomas@kear.co.nz>
Cc: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Tested-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoinet_diag: fix inet_diag_dump_icsk() to use correct state for timewait sockets
Neal Cardwell [Fri, 10 Jan 2014 20:34:45 +0000 (15:34 -0500)]
inet_diag: fix inet_diag_dump_icsk() to use correct state for timewait sockets

Fix inet_diag_dump_icsk() to reflect the fact that both TCP_TIME_WAIT
and TCP_FIN_WAIT2 connections are represented by inet_timewait_sock
(not just TIME_WAIT), and for such sockets the tw_substate field holds
the real state, which can be either TCP_TIME_WAIT or TCP_FIN_WAIT2.

This brings the inet_diag state-matching code in line with the field
it uses to populate idiag_state. This is also analogous to the info
exported in /proc/net/tcp, where get_tcp4_sock() exports sk->sk_state
and get_timewait4_sock() exports tw->tw_substate.

Before fixing this, (a) neither "ss -nemoi" nor "ss -nemoi state
fin-wait-2" would return a socket in TCP_FIN_WAIT2; and (b) "ss -nemoi
state time-wait" would also return sockets in state TCP_FIN_WAIT2.

This is an old bug that predates 05dbc7b ("tcp/dccp: remove twchain").

Signed-off-by: Neal Cardwell <ncardwell@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agomd: fix problem when adding device to read-only array with bitmap.
NeilBrown [Wed, 11 Dec 2013 23:13:33 +0000 (10:13 +1100)]
md: fix problem when adding device to read-only array with bitmap.

If an array is started degraded, and then the missing device
is found it can be re-added and a minimal bitmap-based recovery
will bring it fully up-to-date.

If the array is read-only a recovery would not be allowed.
But also if the array is read-only and the missing device was
present very recently, then there could be no need for any
recovery at all, so we simply include the device in the read-only
array without any recovery.

However... if the missing device was removed a little longer ago
it could be missing some updates, but if a bitmap is present it will
be conditionally accepted pending a bitmap-based update.  We don't
currently detect this case properly and will include that old
device into the read-only array with no recovery even though it really
needs a recovery.

This patch keeps track of whether a bitmap-based-recovery is really
needed or not in the new Bitmap_sync rdev flag.  If that is set,
then the device will not be added to a read-only array.

Cc: Andrei Warkentin <andreiw@vmware.com>
Fixes: d70ed2e4fafdbef0800e73942482bb075c21578b
Cc: stable@vger.kernel.org (3.2+)
Signed-off-by: NeilBrown <neilb@suse.de>
10 years agomd/raid10: fix bug when raid10 recovery fails to recover a block.
NeilBrown [Sun, 5 Jan 2014 23:35:34 +0000 (10:35 +1100)]
md/raid10: fix bug when raid10 recovery fails to recover a block.

commit e875ecea266a543e643b19e44cf472f1412708f9
    md/raid10 record bad blocks as needed during recovery.

added code to the "cannot recover this block" path to record a bad
block rather than fail the whole recovery.
Unfortunately this new case was placed *after* r10bio was freed rather
than *before*, yet it still uses r10bio.
This is will crash with a null dereference.

So move the freeing of r10bio down where it is safe.

Cc: stable@vger.kernel.org (v3.1+)
Fixes: e875ecea266a543e643b19e44cf472f1412708f9
Reported-by: Damian Nowak <spam@nowaker.net>
URL: https://bugzilla.kernel.org/show_bug.cgi?id=68181
Signed-off-by: NeilBrown <neilb@suse.de>
10 years agomd/raid5: fix a recently broken BUG_ON().
NeilBrown [Tue, 14 Jan 2014 04:16:10 +0000 (15:16 +1100)]
md/raid5: fix a recently broken BUG_ON().

commit 6d183de4077191d1201283a9035ce57a9b05254d
    md/raid5: fix newly-broken locking in get_active_stripe.

simplified a BUG_ON, but removed too much so now it sometimes fires
when it shouldn't.

When the STRIPE_EXPANDING flag is set, the stripe_head might be on a
special list while multiple stripe_heads are collected, or it might
not be on any list, even a 'free' list when the refcount is zero.  As
long as STRIPE_EXPANDING is set, it will be found and added back to a
list eventually.

So both of the BUG_ONs which test for the ->lru being empty or not
need to avoid the case where STRIPE_EXPANDING is set.

The patch which broke this was marked for -stable, so this patch needs
to be applied to any branch that received 6d183de4

Fixes: 6d183de4077191d1201283a9035ce57a9b05254d
Cc: stable@vger.kernel.org (any release to which above was applied)
Signed-off-by: NeilBrown <neilb@suse.de>
10 years agomd/raid1: fix request counting bug in new 'barrier' code.
NeilBrown [Tue, 14 Jan 2014 00:56:14 +0000 (11:56 +1100)]
md/raid1: fix request counting bug in new 'barrier' code.

The new iobarrier implementation in raid1 (which keeps normal writes
and resync activity separate) counts every request what is not before
the current resync point in either next_window_requests or
current_window_requests.
It flags that the request is counted by setting ->start_next_window.

allow_barrier follows this model exactly and decrements one of the
*_window_requests if and only if ->start_next_window is set.

However wait_barrier(), which increments *_window_requests uses a
slightly different test for setting -.start_next_window (which is set
from the return value of this function).
So there is a possibility of the counts getting out of sync, and this
leads to the resync hanging.

So change wait_barrier() to return a non-zero value in exactly the
same cases that it increments *_window_requests.

But was introduced in 3.13-rc1.

Reported-by: Bruno Wolff III <bruno@wolff.to>
URL: https://bugzilla.kernel.org/show_bug.cgi?id=68061
Fixes: 79ef3a8aa1cb1523cc231c9a90a278333c21f761
Cc: majianpeng <majianpeng@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years agomd/raid10: fix two bugs in handling of known-bad-blocks.
NeilBrown [Mon, 13 Jan 2014 23:38:09 +0000 (10:38 +1100)]
md/raid10: fix two bugs in handling of known-bad-blocks.

If we discover a bad block when reading we split the request and
potentially read some of it from a different device.

The code path of this has two bugs in RAID10.
1/ we get a spin_lock with _irq, but unlock without _irq!!
2/ The calculation of 'sectors_handled' is wrong, as can be clearly
   seen by comparison with raid1.c

This leads to at least 2 warnings and a probable crash is a RAID10
ever had known bad blocks.

Cc: stable@vger.kernel.org (v3.1+)
Fixes: 856e08e23762dfb92ffc68fd0a8d228f9e152160
Reported-by: Damian Nowak <spam@nowaker.net>
URL: https://bugzilla.kernel.org/show_bug.cgi?id=68181
Signed-off-by: NeilBrown <neilb@suse.de>
10 years agomd/raid5: Fix possible confusion when multiple write errors occur.
NeilBrown [Mon, 6 Jan 2014 02:19:42 +0000 (13:19 +1100)]
md/raid5: Fix possible confusion when multiple write errors occur.

commit 5d8c71f9e5fbdd95650be00294d238e27a363b5c
    md: raid5 crash during degradation

Fixed a crash in an overly simplistic way which could leave
R5_WriteError or R5_MadeGood set in the stripe cache for devices
for which it is no longer relevant.
When those devices are removed and spares added the flags are still
set and can cause incorrect behaviour.

commit 14a75d3e07c784c004b4b44b34af996b8e4ac453
    md/raid5: preferentially read from replacement device if possible.

Fixed the same bug if a more effective way, so we can now revert
the original commit.

Reported-and-tested-by: Alexander Lyakas <alex.bolshoy@gmail.com>
Cc: stable@vger.kernel.org (3.2+ - 3.2 will need a different fix though)
Fixes: 5d8c71f9e5fbdd95650be00294d238e27a363b5c
Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoRevert "drm: copy mode type in drm_mode_connector_list_update()"
Dave Airlie [Tue, 14 Jan 2014 02:50:49 +0000 (12:50 +1000)]
Revert "drm: copy mode type in drm_mode_connector_list_update()"

This reverts commit 3fbd6439e4639ecaeaae6c079e0aa497a1ac3482.

This caused some strange booting lockup issues on an Intel G33
belonging to Daniel Vetter, very unusual, I was hoping Daniel
would track this down, but it looks like instead I'll have to hack
a different fix for -next.

Signed-off-by: Dave Airlie <airlied@redhat.com>
10 years agoMerge tag 'drm-intel-fixes-2014-01-13' of git://people.freedesktop.org/~danvet/drm...
Dave Airlie [Tue, 14 Jan 2014 02:44:48 +0000 (12:44 +1000)]
Merge tag 'drm-intel-fixes-2014-01-13' of git://people.freedesktop.org/~danvet/drm-intel into drm-fixes

Black screen fixes, one for hsw+bdw each and a regression fix for
locking+load detection.

* tag 'drm-intel-fixes-2014-01-13' of git://people.freedesktop.org/~danvet/drm-intel:
  drm/i915/bdw: make sure south port interrupts are enabled properly v2
  drm/i915: Don't grab crtc mutexes in intel_modeset_gem_init()
  drm/i915: fix DDI PLLs HW state readout code

10 years agocxgb4: silence shift wrapping static checker warning
Dan Carpenter [Thu, 9 Jan 2014 05:34:00 +0000 (08:34 +0300)]
cxgb4: silence shift wrapping static checker warning

I don't know how large "tp->vlan_shift" is but static checkers worry
about shift wrapping bugs here.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Dimitris Michailidis <dm@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoMerge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
Linus Torvalds [Mon, 13 Jan 2014 03:59:05 +0000 (10:59 +0700)]
Merge branch 'merge' of git://git./linux/kernel/git/benh/powerpc

Pull powerpc fix from Ben Herrenschmidt:
 "Here's one regression fix for 3.13 that I would appreciate if you
  could still pull in.  It was an "interesting" one to debug, basically
  it's an old bug that got somewhat "exposed" by new code breaking the
  boot on PA Semi boards (yes, it does appear that some people are still
  using these!)"

* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
  powerpc: Check return value of instance-to-package OF call

10 years agoMerge branch 'x86/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Linus Torvalds [Mon, 13 Jan 2014 00:28:49 +0000 (07:28 +0700)]
Merge branch 'x86/urgent' of git://git./linux/kernel/git/tip/tip

Pull x86 fixes from Peter Anvin:
 "Sorry, meant to push out this batch earlier this weekend"

* 'x86/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, fpu, amd: Clear exceptions in AMD FXSAVE workaround
  ftrace/x86: Load ftrace_ops in parameter not the variable holding it

10 years agopowerpc: Check return value of instance-to-package OF call
Benjamin Herrenschmidt [Sun, 12 Jan 2014 22:49:17 +0000 (09:49 +1100)]
powerpc: Check return value of instance-to-package OF call

On PA-Semi firmware, the instance-to-package callback doesn't seem
to be implemented. We didn't check for error, however, thus
subsequently passed the -1 value returned into stdout_node to
thins like prom_getprop etc...

Thus caused the firmware to load values around 0 (physical) internally
as node structures. It somewhat "worked" as long as we had a NULL in the
right place (address 8) at the beginning of the kernel, we didn't "see"
the bug. But commit 5c0484e25ec03243d4c2f2d4416d4a13efc77f6a
"powerpc: Endian safe trampoline" changed the kernel entry point causing
that old bug to now cause a crash early during boot.

This fixes booting on PA-Semi board by properly checking the return
value from instance-to-package.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Tested-by: Olof Johansson <olof@lixom.net>
---

10 years agoARM: 7938/1: OMAP4/highbank: Flush L2 cache before disabling
Taras Kondratiuk [Fri, 10 Jan 2014 00:27:08 +0000 (01:27 +0100)]
ARM: 7938/1: OMAP4/highbank: Flush L2 cache before disabling

Kexec disables outer cache before jumping to reboot code, but it doesn't
flush it explicitly. Flush is done implicitly inside of l2x0_disable().
But some SoC's override default .disable handler and don't flush cache.
This may lead to a corrupted memory during Kexec reboot on these
platforms.

This patch adds cache flush inside of OMAP4 and Highbank outer_cache.disable()
handlers to make it consistent with default l2x0_disable().

Acked-by: Rob Herring <rob.herring@calxeda.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Taras Kondratiuk <taras.kondratiuk@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
10 years agoLinux 3.13-rc8
Linus Torvalds [Sun, 12 Jan 2014 10:04:18 +0000 (17:04 +0700)]
Linux 3.13-rc8

10 years agoSELinux: Fix possible NULL pointer dereference in selinux_inode_permission()
Steven Rostedt [Fri, 10 Jan 2014 02:46:34 +0000 (21:46 -0500)]
SELinux: Fix possible NULL pointer dereference in selinux_inode_permission()

While running stress tests on adding and deleting ftrace instances I hit
this bug:

  BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
  IP: selinux_inode_permission+0x85/0x160
  PGD 63681067 PUD 7ddbe067 PMD 0
  Oops: 0000 [#1] PREEMPT
  CPU: 0 PID: 5634 Comm: ftrace-test-mki Not tainted 3.13.0-rc4-test-00033-gd2a6dde-dirty #20
  Hardware name:                  /DG965MQ, BIOS MQ96510J.86A.0372.2006.0605.1717 06/05/2006
  task: ffff880078375800 ti: ffff88007ddb0000 task.ti: ffff88007ddb0000
  RIP: 0010:[<ffffffff812d8bc5>]  [<ffffffff812d8bc5>] selinux_inode_permission+0x85/0x160
  RSP: 0018:ffff88007ddb1c48  EFLAGS: 00010246
  RAX: 0000000000000000 RBX: 0000000000800000 RCX: ffff88006dd43840
  RDX: 0000000000000001 RSI: 0000000000000081 RDI: ffff88006ee46000
  RBP: ffff88007ddb1c88 R08: 0000000000000000 R09: ffff88007ddb1c54
  R10: 6e6576652f6f6f66 R11: 0000000000000003 R12: 0000000000000000
  R13: 0000000000000081 R14: ffff88006ee46000 R15: 0000000000000000
  FS:  00007f217b5b6700(0000) GS:ffffffff81e21000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033^M
  CR2: 0000000000000020 CR3: 000000006a0fe000 CR4: 00000000000007f0
  Call Trace:
    security_inode_permission+0x1c/0x30
    __inode_permission+0x41/0xa0
    inode_permission+0x18/0x50
    link_path_walk+0x66/0x920
    path_openat+0xa6/0x6c0
    do_filp_open+0x43/0xa0
    do_sys_open+0x146/0x240
    SyS_open+0x1e/0x20
    system_call_fastpath+0x16/0x1b
  Code: 84 a1 00 00 00 81 e3 00 20 00 00 89 d8 83 c8 02 40 f6 c6 04 0f 45 d8 40 f6 c6 08 74 71 80 cf 02 49 8b 46 38 4c 8d 4d cc 45 31 c0 <0f> b7 50 20 8b 70 1c 48 8b 41 70 89 d9 8b 78 04 e8 36 cf ff ff
  RIP  selinux_inode_permission+0x85/0x160
  CR2: 0000000000000020

Investigating, I found that the inode->i_security was NULL, and the
dereference of it caused the oops.

in selinux_inode_permission():

isec = inode->i_security;

rc = avc_has_perm_noaudit(sid, isec->sid, isec->sclass, perms, 0, &avd);

Note, the crash came from stressing the deletion and reading of debugfs
files.  I was not able to recreate this via normal files.  But I'm not
sure they are safe.  It may just be that the race window is much harder
to hit.

What seems to have happened (and what I have traced), is the file is
being opened at the same time the file or directory is being deleted.
As the dentry and inode locks are not held during the path walk, nor is
the inodes ref counts being incremented, there is nothing saving these
structures from being discarded except for an rcu_read_lock().

The rcu_read_lock() protects against freeing of the inode, but it does
not protect freeing of the inode_security_struct.  Now if the freeing of
the i_security happens with a call_rcu(), and the i_security field of
the inode is not changed (it gets freed as the inode gets freed) then
there will be no issue here.  (Linus Torvalds suggested not setting the
field to NULL such that we do not need to check if it is NULL in the
permission check).

Note, this is a hack, but it fixes the problem at hand.  A real fix is
to restructure the destroy_inode() to call all the destructor handlers
from the RCU callback.  But that is a major job to do, and requires a
lot of work.  For now, we just band-aid this bug with this fix (it
works), and work on a more maintainable solution in the future.

Link: http://lkml.kernel.org/r/20140109101932.0508dec7@gandalf.local.home
Link: http://lkml.kernel.org/r/20140109182756.17abaaa8@gandalf.local.home
Cc: stable@vger.kernel.org
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
10 years agothp: fix copy_page_rep GPF by testing is_huge_zero_pmd once only
Hugh Dickins [Sun, 12 Jan 2014 09:25:21 +0000 (01:25 -0800)]
thp: fix copy_page_rep GPF by testing is_huge_zero_pmd once only

We see General Protection Fault on RSI in copy_page_rep: that RSI is
what you get from a NULL struct page pointer.

  RIP: 0010:[<ffffffff81154955>]  [<ffffffff81154955>] copy_page_rep+0x5/0x10
  RSP: 0000:ffff880136e15c00  EFLAGS: 00010286
  RAX: ffff880000000000 RBX: ffff880136e14000 RCX: 0000000000000200
  RDX: 6db6db6db6db6db7 RSI: db73880000000000 RDI: ffff880dd0c00000
  RBP: ffff880136e15c18 R08: 0000000000000200 R09: 000000000005987c
  R10: 000000000005987c R11: 0000000000000200 R12: 0000000000000001
  R13: ffffea00305aa000 R14: 0000000000000000 R15: 0000000000000000
  FS:  00007f195752f700(0000) GS:ffff880c7fc20000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000093010000 CR3: 00000001458e1000 CR4: 00000000000027e0
  Call Trace:
    copy_user_huge_page+0x93/0xab
    do_huge_pmd_wp_page+0x710/0x815
    handle_mm_fault+0x15d8/0x1d70
    __do_page_fault+0x14d/0x840
    do_page_fault+0x2f/0x90
    page_fault+0x22/0x30

do_huge_pmd_wp_page() tests is_huge_zero_pmd(orig_pmd) four times: but
since shrink_huge_zero_page() can free the huge_zero_page, and we have
no hold of our own on it here (except where the fourth test holds
page_table_lock and has checked pmd_same), it's possible for it to
answer yes the first time, but no to the second or third test.  Change
all those last three to tests for NULL page.

(Note: this is not the same issue as trinity's DEBUG_PAGEALLOC BUG
in copy_page_rep with RSI: ffff88009c422000, reported by Sasha Levin
in https://lkml.org/lkml/2013/3/29/103.  I believe that one is due
to the source page being split, and a tail page freed, while copy
is in progress; and not a problem without DEBUG_PAGEALLOC, since
the pmd_same check will prevent a miscopy from being made visible.)

Fixes: 97ae17497e99 ("thp: implement refcounting for huge zero page")
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: stable@vger.kernel.org # v3.10 v3.11 v3.12
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
10 years agoblock: null_blk: fix queue leak inside removing device
Ming Lei [Thu, 26 Dec 2013 13:31:37 +0000 (21:31 +0800)]
block: null_blk: fix queue leak inside removing device

When queue_mode is NULL_Q_MQ and null_blk is being removed,
blk_cleanup_queue() isn't called to cleanup queue, so the queue
allocated won't be freed.

This patch calls blk_cleanup_queue() for MQ to drain all pending
requests first and release the reference counter of queue kobject, then
blk_mq_free_queue() will be called in queue kobject's release handler
when queue kobject's reference counter drops to zero.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
10 years agosched_clock: Disable seqlock lockdep usage in sched_clock()
John Stultz [Thu, 2 Jan 2014 23:11:14 +0000 (15:11 -0800)]
sched_clock: Disable seqlock lockdep usage in sched_clock()

Unfortunately the seqlock lockdep enablement can't be used
in sched_clock(), since the lockdep infrastructure eventually
calls into sched_clock(), which causes a deadlock.

Thus, this patch changes all generic sched_clock() usage
to use the raw_* methods.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Reported-by: Krzysztof Hałasa <khalasa@piap.pl>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Cc: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Cc: Willy Tarreau <w@1wt.eu>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1388704274-5278-2-git-send-email-john.stultz@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
10 years agoseqlock: Use raw_ prefix instead of _no_lockdep
John Stultz [Thu, 2 Jan 2014 23:11:13 +0000 (15:11 -0800)]
seqlock: Use raw_ prefix instead of _no_lockdep

Linus disliked the _no_lockdep() naming, so instead
use the more-consistent raw_* prefix to the non-lockdep
enabled seqcount methods.

This also adds raw_ methods for the write operations
as well, which will be utilized in a following patch.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Krzysztof Hałasa <khalasa@piap.pl>
Cc: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Cc: Willy Tarreau <w@1wt.eu>
Link: http://lkml.kernel.org/r/1388704274-5278-1-git-send-email-john.stultz@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
10 years agosched: Calculate effective load even if local weight is 0
Rik van Riel [Mon, 6 Jan 2014 11:39:12 +0000 (11:39 +0000)]
sched: Calculate effective load even if local weight is 0

Thomas Hellstrom bisected a regression where erratic 3D performance is
experienced on virtual machines as measured by glxgears. It identified
commit 58d081b5 ("sched/numa: Avoid overloading CPUs on a preferred NUMA
node") as the problem which had modified the behaviour of effective_load.

Effective load calculates the difference to the system-wide load if a
scheduling entity was moved to another CPU. The task group is not heavier
as a result of the move but overall system load can increase/decrease as a
result of the change. Commit 58d081b5 ("sched/numa: Avoid overloading CPUs
on a preferred NUMA node") changed effective_load to make it suitable for
calculating if a particular NUMA node was compute overloaded. To reduce
the cost of the function, it assumed that a current sched entity weight
of 0 was uninteresting but that is not the case.

wake_affine() uses a weight of 0 for sync wakeups on the grounds that it
is assuming the waking task will sleep and not contribute to load in the
near future. In this case, we still want to calculate the effective load
of the sched entity hierarchy. As effective_load is no longer used by
task_numa_compare since commit fb13c7ee (sched/numa: Use a system-wide
search to find swap/migration candidates), this patch simply restores the
historical behaviour.

Reported-and-tested-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Rik van Riel <riel@redhat.com>
[ Wrote changelog]
Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20140106113912.GC6178@suse.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
10 years agox86, fpu, amd: Clear exceptions in AMD FXSAVE workaround
Linus Torvalds [Sun, 12 Jan 2014 03:15:52 +0000 (19:15 -0800)]
x86, fpu, amd: Clear exceptions in AMD FXSAVE workaround

Before we do an EMMS in the AMD FXSAVE information leak workaround we
need to clear any pending exceptions, otherwise we trap with a
floating-point exception inside this code.

Reported-by: halfdog <me@halfdog.net>
Tested-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/CA%2B55aFxQnY_PCG_n4=0w-VG=YLXL-yr7oMxyy0WU2gCBAf3ydg@mail.gmail.com
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
10 years agoARM: 7939/1: traps: fix opcode endianness when read from user memory
Taras Kondratiuk [Fri, 10 Jan 2014 00:36:00 +0000 (01:36 +0100)]
ARM: 7939/1: traps: fix opcode endianness when read from user memory

Currently code has an inverted logic: opcode from user memory
is swapped to a proper endianness only in case of read error.
While normally opcode should be swapped only if it was read
correctly from user memory.

Reviewed-by: Victor Kamensky <victor.kamensky@linaro.org>
Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Signed-off-by: Taras Kondratiuk <taras.kondratiuk@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
10 years agoARM: 7937/1: perf_event: Silence sparse warning
Stephen Boyd [Thu, 9 Jan 2014 23:57:06 +0000 (00:57 +0100)]
ARM: 7937/1: perf_event: Silence sparse warning

arch/arm/kernel/perf_event_cpu.c:274:25: warning: incorrect type in assignment (different modifiers)
arch/arm/kernel/perf_event_cpu.c:274:25: expected int ( *init_fn )( ... )
arch/arm/kernel/perf_event_cpu.c:274:25: got void const *const data

Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>