openwrt/staging/blogic.git
6 years agogfs2: Get rid of gfs2_ea_strlen
Andreas Gruenbacher [Fri, 3 Aug 2018 09:57:52 +0000 (10:57 +0100)]
gfs2: Get rid of gfs2_ea_strlen

Function gfs2_ea_strlen is only called from ea_list_i, so inline it
there.  Remove the duplicate switch statement and the creative use of
memcpy to set a null byte.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Andrew Price <anprice@redhat.com>
Reviewed-by: Bob Peterson <rpeterso@redhat.com>
6 years agogfs2: cleanup: call gfs2_rgrp_ondisk2lvb from gfs2_rgrp_out
Bob Peterson [Thu, 26 Jul 2018 17:59:13 +0000 (12:59 -0500)]
gfs2: cleanup: call gfs2_rgrp_ondisk2lvb from gfs2_rgrp_out

Before this patch gfs2_rgrp_ondisk2lvb was called after every call
to gfs2_rgrp_out. This patch just calls it directly from within
gfs2_rgrp_out, and moves the function to be before it so we don't
need a function prototype.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>
6 years agogfs2: Special-case rindex for gfs2_grow
Andreas Gruenbacher [Wed, 25 Jul 2018 17:45:08 +0000 (18:45 +0100)]
gfs2: Special-case rindex for gfs2_grow

To speed up the common case of appending to a file,
gfs2_write_alloc_required presumes that writing beyond the end of a file
will always require additional blocks to be allocated.  This assumption
is incorrect for preallocates files, but there are no negative
consequences as long as *some* space is still left on the filesystem.

One special file that always has some space preallocated beyond the end
of the file is the rindex: when growing a filesystem, gfs2_grow adds one
or more new resource groups and appends records describing those
resource groups to the rindex; the preallocated space ensures that this
is always possible.

However, when a filesystem is completely full, gfs2_write_alloc_required
will indicate that an additional allocation is required, and appending
the next record to the rindex will fail even though space for that
record has already been preallocated.  To fix that, skip the incorrect
optimization in gfs2_write_alloc_required, but for the rindex only.
Other writes to preallocated space beyond the end of the file are still
allowed to fail on completely full filesystems.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Bob Peterson <rpeterso@redhat.com>
6 years agoGFS2: rgrp free blocks used incorrectly
Bob Peterson [Wed, 30 May 2018 19:05:15 +0000 (14:05 -0500)]
GFS2: rgrp free blocks used incorrectly

Before this patch, several functions in rgrp.c checked the value of
rgd->rd_free_clone. That does not take into account blocks that were
reserved by a multi-block reservation. This causes a problem when
space gets tight in the file system. For example, when function
gfs2_inplace_reserve checks to see if a rgrp has enough blocks to
satisfy the request, it can accept a rgrp that it should reject
because, although there are enough blocks to satisfy the request
_now_, those blocks may be reserved for another running process.

A second problem with this occurs when we've reserved the remaining
blocks in an rgrp: function rg_mblk_search() can reject an rgrp
improperly because it calculates:

   u32 free_blocks = rgd->rd_free_clone - rgd->rd_reserved;

But rd_reserved includes blocks that the current process just
reserved in its own call to inplace_reserve. For example, it can
reserve the last 128 blocks of an rgrp, then reject that same rgrp
because the above calculates out to free_blocks = 0;

Consequences include, but are not limited to, (1) leaving holes,
and thus increasing file system fragmentation, and (2) reporting
file system is full long before it actually is.

This patch introduces a new function, rgd_free, which returns the
number of clone-free blocks (blocks that are truly free as opposed
to blocks that are still being used because an unlinked file is
still open) minus the number of blocks reserved by processes, but
not counting the blocks we ourselves reserved (because obviously
we need to allocate them).

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
6 years agogfs2: remove redundant variable 'moved'
Colin Ian King [Tue, 17 Jul 2018 14:59:28 +0000 (15:59 +0100)]
gfs2: remove redundant variable 'moved'

Variable 'moved' s being assigned but is never used hence it is
redundant and can be removed.  This has been the case ever since commit
c752666c.

Cleans up clang warning:
warning: variable 'moved' set but not used [-Wunused-but-set-variable]

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
6 years agogfs2: use iomap_readpage for blocksize == PAGE_SIZE
Andreas Gruenbacher [Wed, 6 Jun 2018 19:30:38 +0000 (20:30 +0100)]
gfs2: use iomap_readpage for blocksize == PAGE_SIZE

We only use iomap_readpage for pages that don't have buffer heads
attached yet: iomap_readpage would otherwise read pages from disk that
are marked buffer_uptodate() but not PageUptodate().  Those pages may
actually contain data more recent than what's on disk.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Bob Peterson <rpeterso@redhat.com>
6 years agogfs2: Use iomap for stuffed direct I/O reads
Andreas Gruenbacher [Wed, 27 Jun 2018 00:59:18 +0000 (01:59 +0100)]
gfs2: Use iomap for stuffed direct I/O reads

Remove the fallback code from direct to buffered I/O for stuffed reads.

For stuffed writes, we must keep the fallback code: the deferred glock
we are holding under direct I/O doesn't allow to write to the inode or
change the file size.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Bob Peterson <rpeterso@redhat.com>
6 years agoMerge branch 'iomap-4.19-merge' into linux-gfs2/for-next
Andreas Gruenbacher [Tue, 24 Jul 2018 22:06:59 +0000 (00:06 +0200)]
Merge branch 'iomap-4.19-merge' into linux-gfs2/for-next

Merge xfs branch 'iomap-4.19-merge' into linux-gfs2/for-next.  This
brings in readpage and direct I/O support for inline data.

The IOMAP_F_BUFFER_HEAD flag introduced in commit "iomap: add initial
support for writes without buffer heads" needs to be set for gfs2 as
well, so do that in the merge.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
6 years agogfs2: fallocate_chunk: Always initialize struct iomap
Andreas Gruenbacher [Fri, 6 Jul 2018 22:05:41 +0000 (23:05 +0100)]
gfs2: fallocate_chunk: Always initialize struct iomap

In fallocate_chunk, always initialize the iomap before calling
gfs2_iomap_get_alloc: future changes could otherwise cause things like
iomap.flags to leak across calls.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Bob Peterson <rpeterso@redhat.com>
6 years agoGFS2: Fix recovery issues for spectators
Bob Peterson [Thu, 5 Jul 2018 19:40:46 +0000 (14:40 -0500)]
GFS2: Fix recovery issues for spectators

This patch fixes a couple problems dealing with spectators who
remain with gfs2 mounts after the last non-spectator node fails.

Before this patch, spectator mounts would try to acquire the dlm's
mounted lock EX as part of its normal recovery sequence.
The mounted lock is only used to determine whether the node is
the first mounter, the first node to mount the file system, for
the purposes of file system recovery and journal replay.

It's not necessary for spectators: they should never do journal
recovery. If they acquire the lock it will prevent another "real"
first-mounter from acquiring the lock in EX mode, which means it
also cannot do journal recovery because it doesn't think it's the
first node to mount the file system.

This patch checks if the mounter is a spectator, and if so, avoids
grabbing the mounted lock. This allows a secondary mounter who is
really the first non-spectator mounter, to do journal recovery:
since the spectator doesn't acquire the lock, it can grab it in
EX mode, and therefore consider itself to be the first mounter
both as a "real" first mount, and as a first-real-after-spectator.

Note that the control lock still needs to be taken in PR mode
in order to fetch the lvb value so it has the current status of
all journal's recovery. This is used as it is today by a first
mounter to replay the journals. For spectators, it's merely
used to fetch the status bits. All recovery is bypassed and the
node waits until recovery is completed by a non-spectator node.

I also improved the cryptic message given by control_mount when
a spectator is waiting for a non-spectator to perform recovery.

It also fixes a problem in gfs2_recover_set whereby spectators
were never queueing recovery work for their own journal.
They cannot do recovery themselves, but they still need to queue
the work so they can check the recovery bits and clear the
DFL_BLOCK_LOCKS bit once the recovery happens on another node.

When the work queue runs on a spectator, it bypasses most of the
work so it won't print a bunch of annoying messages. All it will
print is a bunch of messages that look like this until recovery
completes on the non-spectator node:

GFS2: fsid=mycluster:scratch.s: recover generation 3 jid 0
GFS2: fsid=mycluster:scratch.s: recover jid 0 result busy

These continue every 1.5 seconds until the recovery is done by
the non-spectator, at which time it says:

GFS2: fsid=mycluster:scratch.s: recover generation 4 done

Then it proceeds with its mount.

If the file system is mounted in spectator node and the last
remaining non-spectator is fenced, any IO to the file system is
blocked by dlm and the spectator waits until recovery is
performed by a non-spectator.

If a spectator tries to mount the file system before any
non-spectators, it blocks and repeatedly gives this kernel
message:

GFS2: fsid=mycluster:scratch: Recovery is required. Waiting for a non-spectator to mount.
GFS2: fsid=mycluster:scratch: Recovery is required. Waiting for a non-spectator to mount.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
6 years agoMerge branch 'iomap-write' into linux-gfs2/for-next
Andreas Gruenbacher [Tue, 24 Jul 2018 18:02:40 +0000 (20:02 +0200)]
Merge branch 'iomap-write' into linux-gfs2/for-next

Pull in the gfs2 iomap-write changes: Tweak the existing code to
properly support iomap write and eliminate an unnecessary special case
in gfs2_block_map.  Implement iomap write support for buffered and
direct I/O.  Simplify some of the existing code and eliminate code that
is no longer used:

  gfs2: Remove gfs2_write_{begin,end}
  gfs2: iomap direct I/O support
  gfs2: gfs2_extent_length cleanup
  gfs2: iomap buffered write support
  gfs2: Further iomap cleanups

This is based on the following changes on the xfs 'iomap-4.19-merge'
branch:

  iomap: add private pointer to struct iomap
  iomap: add a page_done callback
  iomap: generic inline data handling
  iomap: complete partial direct I/O writes synchronously
  iomap: mark newly allocated buffer heads as new
  fs: factor out a __generic_write_end helper

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
6 years agofs: gfs2: Adding new return type vm_fault_t
Souptick Joarder [Mon, 2 Jul 2018 16:46:13 +0000 (22:16 +0530)]
fs: gfs2: Adding new return type vm_fault_t

Use new return type vm_fault_t for gfs2_page_mkwrite
handler.

see commit 1c8f422059ae ("mm: change return type to
vm_fault_t") for reference.

Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Reviewed-by: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
6 years agogfs2: using posix_acl_xattr_size instead of posix_acl_to_xattr
Chengguang Xu [Fri, 22 Jun 2018 01:51:43 +0000 (09:51 +0800)]
gfs2: using posix_acl_xattr_size instead of posix_acl_to_xattr

It seems better to get size by calling posix_acl_xattr_size() instead of
calling posix_acl_to_xattr() with NULL buffer argument.

posix_acl_xattr_size() never returns 0, so remove the unnecessary check.

Signed-off-by: Chengguang Xu <cgxu519@gmx.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
6 years agogfs2: Don't reject a supposedly full bitmap if we have blocks reserved
Bob Peterson [Mon, 18 Jun 2018 18:24:13 +0000 (13:24 -0500)]
gfs2: Don't reject a supposedly full bitmap if we have blocks reserved

Before this patch, you could get into situations like this:

1. Process 1 searches for X free blocks, finds them, makes a reservation
2. Process 2 searches for free blocks in the same rgrp, but now the
   bitmap is full because process 1's reservation is skipped over.
   So it marks the bitmap as GBF_FULL.
3. Process 1 tries to allocate blocks from its own reservation, but
   since the GBF_FULL bit is set, it skips over the rgrp and searches
   elsewhere, thus not using its own reservation.

This patch adds an additional check to allow processes to use their
own reservations.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
6 years agogfs2: Eliminate redundant ip->i_rgd
Andreas Gruenbacher [Thu, 21 Jun 2018 12:42:37 +0000 (07:42 -0500)]
gfs2: Eliminate redundant ip->i_rgd

GFS2 remembers the last rgrp used for allocations in ip->i_rgd.
However, block allocations are made by way of a reservations structure,
ip->i_res, which keeps the last rgrp in ip->i_res.rs_rgd, and ip->i_res
is kept in sync with ip->i_res.rs_rgd, so it's redundant.  Get rid of
ip->i_rgd and just use ip->i_res.rs_rgd in its place.

Based on patches by Robert Peterson.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
6 years agogfs2: Stop messing with ip->i_rgd in the rlist code
Andreas Gruenbacher [Thu, 21 Jun 2018 12:22:12 +0000 (07:22 -0500)]
gfs2: Stop messing with ip->i_rgd in the rlist code

In the resource group list code, keep the last resource group added in
the last position in the array.  Check against that instead of messing
with ip->i_rgd.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
6 years agoiomap: add inline data support to iomap_readpage_actor
Andreas Gruenbacher [Tue, 3 Jul 2018 16:07:47 +0000 (09:07 -0700)]
iomap: add inline data support to iomap_readpage_actor

Just copy the inline data into the page using the existing helper.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
6 years agoiomap: support direct I/O to inline data
Andreas Gruenbacher [Tue, 3 Jul 2018 16:07:47 +0000 (09:07 -0700)]
iomap: support direct I/O to inline data

Add support for reading from and writing to inline data to iomap_dio_rw.
This saves filesystems from having to implement fallback code for this
case.

The inline data is actually cached in the inode, so the I/O is only
direct in the sense that it doesn't go through the page cache.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
6 years agoiomap: refactor iomap_dio_actor
Christoph Hellwig [Tue, 3 Jul 2018 16:07:46 +0000 (09:07 -0700)]
iomap: refactor iomap_dio_actor

Split the function up into two helpers for the bio based I/O and hole
case, and a small helper to call the two.  This separates the code a
little better in preparation for supporting I/O to inline data.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
6 years agogfs2: Remove gfs2_write_{begin,end}
Andreas Gruenbacher [Mon, 19 Mar 2018 23:10:52 +0000 (23:10 +0000)]
gfs2: Remove gfs2_write_{begin,end}

Now that generic_file_write_iter is no longer used, there are no
remaining users of these address space operations.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Bob Peterson <rpeterso@redhat.com>
6 years agogfs2: iomap direct I/O support
Andreas Gruenbacher [Tue, 19 Jun 2018 14:08:02 +0000 (15:08 +0100)]
gfs2: iomap direct I/O support

The page unmapping previously done in gfs2_direct_IO is now done
generically in iomap_dio_rw.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Bob Peterson <rpeterso@redhat.com>
6 years agogfs2: gfs2_extent_length cleanup
Andreas Gruenbacher [Fri, 11 May 2018 16:44:19 +0000 (17:44 +0100)]
gfs2: gfs2_extent_length cleanup

Now that gfs2_extent_length is no longer used for determining the size
of a hole and always with an upper size limit, the function can be
simplified.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Bob Peterson <rpeterso@redhat.com>
6 years agogfs2: iomap buffered write support
Andreas Gruenbacher [Sun, 24 Jun 2018 14:04:04 +0000 (15:04 +0100)]
gfs2: iomap buffered write support

With the traditional page-based writes, blocks are allocated separately
for each page written to.  With iomap writes, we can allocate a lot more
blocks at once, with a fraction of the allocation overhead for each
page.

Split calculating the number of blocks that can be allocated at a given
position (gfs2_alloc_size) off from gfs2_iomap_alloc: that size
determines the number of blocks to allocate and reserve in the journal.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Bob Peterson <rpeterso@redhat.com>
6 years agogfs2: Further iomap cleanups
Andreas Gruenbacher [Sun, 24 Jun 2018 09:43:49 +0000 (10:43 +0100)]
gfs2: Further iomap cleanups

In gfs2_iomap_alloc, set the type of newly allocated extents to
IOMAP_MAPPED so that iomap_to_bh will set the bh states correctly:
otherwise, the bhs would not be marked as mapped, confusing
__mpage_writepage.  This means that we need to check for the IOMAP_F_NEW
flag in fallocate_chunk now.

Further clean up gfs2_iomap_get and implement gfs2_stuffed_iomap here
directly.  For reads beyond the end of the file, return holes instead of
failing with -ENOENT so that we can get rid of that special case in
gfs2_block_map.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Bob Peterson <rpeterso@redhat.com>
6 years agogfs2: call ktime_get_coarse_real_ts64() directly
Arnd Bergmann [Wed, 20 Jun 2018 20:15:24 +0000 (15:15 -0500)]
gfs2: call ktime_get_coarse_real_ts64() directly

current_kernel_time64() is now just a deprecated wrapper around
ktime_get_coarse_real_ts64(), so let's just call that directly.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
6 years agogfs2: Minor clarification to __gfs2_punch_hole
Andreas Gruenbacher [Mon, 18 Jun 2018 15:34:59 +0000 (16:34 +0100)]
gfs2: Minor clarification to __gfs2_punch_hole

Rename end_off to end_len to make the code less confusing.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
6 years agogfs2: Don't withdraw under a spin lock
Andreas Gruenbacher [Thu, 7 Jun 2018 10:56:46 +0000 (11:56 +0100)]
gfs2: Don't withdraw under a spin lock

In two places, the gfs2_io_error_bh macro is called while holding the
sd_ail_lock spin lock.  This isn't allowed because gfs2_io_error_bh
withdraws the filesystem, which can sleep because it issues a uevent.
To fix that, add a gfs2_io_error_bh_wd macro that does withdraw the
filesystem and change gfs2_io_error_bh to not withdraw the filesystem.
In those places where the new gfs2_io_error_bh is used, withdraw the
filesystem after releasing sd_ail_lock.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Reviewed-by: Andrew Price <anprice@redhat.com>
6 years agogfs2: eliminate rs_inum and reduce the size of gfs2 inodes
Bob Peterson [Wed, 13 Jun 2018 13:52:47 +0000 (08:52 -0500)]
gfs2: eliminate rs_inum and reduce the size of gfs2 inodes

Before this patch, block reservations kept track of the inode
number. At one point, that was a valid thing to do. However, since
we made the reservation a part of the inode (rather than a pointer
to a separate allocated object) the reservation can determine the
inode number by using container_of. This saves us a little memory
in our inode.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Acked-by: Steven Whitehouse <swhiteho@redhat.com>
Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>
6 years agoiomap: add initial support for writes without buffer heads
Christoph Hellwig [Tue, 19 Jun 2018 22:10:58 +0000 (15:10 -0700)]
iomap: add initial support for writes without buffer heads

For now just limited to blocksize == PAGE_SIZE, where we can simply read
in the full page in write begin, and just set the whole page dirty after
copying data into it.  This code is enabled by default and XFS will now
be feed pages without buffer heads in ->writepage and ->writepages.

If a file system sets the IOMAP_F_BUFFER_HEAD flag on the iomap the old
path will still be used, this both helps the transition in XFS and
prepares for the gfs2 migration to the iomap infrastructure.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
6 years agoiomap: add private pointer to struct iomap
Andreas Gruenbacher [Tue, 19 Jun 2018 22:10:57 +0000 (15:10 -0700)]
iomap: add private pointer to struct iomap

Add a private pointer to struct iomap to allow filesystems to pass data
from iomap_begin to iomap_end.  Will be used by gfs2 for passing on the
on-disk inode buffer head.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
6 years agoiomap: add an iomap-based readpage and readpages implementation
Christoph Hellwig [Tue, 19 Jun 2018 22:10:57 +0000 (15:10 -0700)]
iomap: add an iomap-based readpage and readpages implementation

Simply use iomap_apply to iterate over the file and a submit a bio for
each non-uptodate but mapped region and zero everything else.  Note that
as-is this can not be used for file systems with a blocksize smaller than
the page size, but that support will be added later.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
6 years agoiomap: add a page_done callback
Christoph Hellwig [Tue, 19 Jun 2018 22:10:56 +0000 (15:10 -0700)]
iomap: add a page_done callback

This will be used by gfs2 to attach data to transactions for the journaled
data mode.  But the concept is generic enough that we might be able to
use it for other purposes like encryption/integrity post-processing in the
future.

Based on a patch from Andreas Gruenbacher.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
6 years agoiomap: generic inline data handling
Andreas Gruenbacher [Tue, 19 Jun 2018 22:10:56 +0000 (15:10 -0700)]
iomap: generic inline data handling

Add generic inline data handling by adding a pointer to the inline data
region to struct iomap.  When handling a buffered IOMAP_INLINE write,
iomap_write_begin will copy the current inline data from the inline data
region into the page cache, and iomap_write_end will copy the changes in
the page cache back to the inline data region.

This doesn't cover inline data reads and direct I/O yet because so far,
we have no users.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
[hch: small cleanups to better fit in with other iomap work]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
6 years agoiomap: complete partial direct I/O writes synchronously
Andreas Gruenbacher [Tue, 19 Jun 2018 22:10:55 +0000 (15:10 -0700)]
iomap: complete partial direct I/O writes synchronously

According to xfstest generic/240, applications seem to expect direct I/O
writes to either complete as a whole or to fail; short direct I/O writes
are apparently not appreciated.  This means that when only part of an
asynchronous direct I/O write succeeds, we can either fail the entire
write, or we can wait for the partial write to complete and retry the
remaining write as buffered I/O.  The old __blockdev_direct_IO helper
has code for waiting for partial writes to complete; the new
iomap_dio_rw iomap helper does not.

The above mentioned fallback mode is needed for gfs2, which doesn't
allow block allocations under direct I/O to avoid taking cluster-wide
exclusive locks.  As a consequence, an asynchronous direct I/O write to
a file range that contains a hole will result in a short write.  In that
case, wait for the short write to complete to allow gfs2 to recover.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
6 years agoiomap: mark newly allocated buffer heads as new
Andreas Gruenbacher [Tue, 19 Jun 2018 22:10:55 +0000 (15:10 -0700)]
iomap: mark newly allocated buffer heads as new

In iomap_to_bh, not only mark buffer heads in IOMAP_UNWRITTEN maps as
new, but also buffer heads in IOMAP_MAPPED maps with the IOMAP_F_NEW
flag set.  This will be used by filesystems like gfs2, which allocate
blocks in iomap->begin.

Minor corrections to the comment for IOMAP_UNWRITTEN maps.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
6 years agofs: factor out a __generic_write_end helper
Christoph Hellwig [Tue, 19 Jun 2018 22:10:55 +0000 (15:10 -0700)]
fs: factor out a __generic_write_end helper

Bits of the buffer.c based write_end implementations that don't know
about buffer_heads and can be reused by other implementations.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
6 years agoLinux 4.18-rc1
Linus Torvalds [Sat, 16 Jun 2018 23:04:49 +0000 (08:04 +0900)]
Linux 4.18-rc1

6 years agoMerge tag 'for-linus-20180616' of git://git.kernel.dk/linux-block
Linus Torvalds [Sat, 16 Jun 2018 20:37:55 +0000 (05:37 +0900)]
Merge tag 'for-linus-20180616' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:
 "A collection of fixes that should go into -rc1. This contains:

   - bsg_open vs bsg_unregister race fix (Anatoliy)

   - NVMe pull request from Christoph, with fixes for regressions in
     this window, FC connect/reconnect path code unification, and a
     trace point addition.

   - timeout fix (Christoph)

   - remove a few unused functions (Christoph)

   - blk-mq tag_set reinit fix (Roman)"

* tag 'for-linus-20180616' of git://git.kernel.dk/linux-block:
  bsg: fix race of bsg_open and bsg_unregister
  block: remov blk_queue_invalidate_tags
  nvme-fabrics: fix and refine state checks in __nvmf_check_ready
  nvme-fabrics: handle the admin-only case properly in nvmf_check_ready
  nvme-fabrics: refactor queue ready check
  blk-mq: remove blk_mq_tagset_iter
  nvme: remove nvme_reinit_tagset
  nvme-fc: fix nulling of queue data on reconnect
  nvme-fc: remove reinit_request routine
  blk-mq: don't time out requests again that are in the timeout handler
  nvme-fc: change controllers first connect to use reconnect path
  nvme: don't rely on the changed namespace list log
  nvmet: free smart-log buffer after use
  nvme-rdma: fix error flow during mapping request data
  nvme: add bio remapping tracepoint
  nvme: fix NULL pointer dereference in nvme_init_subsystem
  blk-mq: reinit q->tag_set_list entry only after grace period

6 years agoMerge tag 'docs-broken-links' of git://linuxtv.org/mchehab/experimental
Linus Torvalds [Sat, 16 Jun 2018 20:25:18 +0000 (05:25 +0900)]
Merge tag 'docs-broken-links' of git://linuxtv.org/mchehab/experimental

Pull documentation fixes from Mauro Carvalho Chehab:
 "This solves a series of broken links for files under Documentation,
  and improves a script meant to detect such broken links (see
  scripts/documentation-file-ref-check).

  The changes on this series are:

   - can.rst: fix a footnote reference;

   - crypto_engine.rst: Fix two parsing warnings;

   - Fix a lot of broken references to Documentation/*;

   - improve the scripts/documentation-file-ref-check script, in order
     to help detecting/fixing broken references, preventing
     false-positives.

  After this patch series, only 33 broken references to doc files are
  detected by scripts/documentation-file-ref-check"

* tag 'docs-broken-links' of git://linuxtv.org/mchehab/experimental: (26 commits)
  fix a series of Documentation/ broken file name references
  Documentation: rstFlatTable.py: fix a broken reference
  ABI: sysfs-devices-system-cpu: remove a broken reference
  devicetree: fix a series of wrong file references
  devicetree: fix name of pinctrl-bindings.txt
  devicetree: fix some bindings file names
  MAINTAINERS: fix location of DT npcm files
  MAINTAINERS: fix location of some display DT bindings
  kernel-parameters.txt: fix pointers to sound parameters
  bindings: nvmem/zii: Fix location of nvmem.txt
  docs: Fix more broken references
  scripts/documentation-file-ref-check: check tools/*/Documentation
  scripts/documentation-file-ref-check: get rid of false-positives
  scripts/documentation-file-ref-check: hint: dash or underline
  scripts/documentation-file-ref-check: add a fix logic for DT
  scripts/documentation-file-ref-check: accept more wildcards at filenames
  scripts/documentation-file-ref-check: fix help message
  media: max2175: fix location of driver's companion documentation
  media: v4l: fix broken video4linux docs locations
  media: dvb: point to the location of the old README.dvb-usb file
  ...

6 years agoMerge tag 'fsnotify_for_v4.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 16 Jun 2018 20:06:18 +0000 (05:06 +0900)]
Merge tag 'fsnotify_for_v4.18-rc1' of git://git./linux/kernel/git/jack/linux-fs

Pull fsnotify updates from Jan Kara:
 "fsnotify cleanups unifying handling of different watch types.

  This is the shortened fsnotify series from Amir with the last five
  patches pulled out. Amir has modified those patches to not change
  struct inode but obviously it's too late for those to go into this
  merge window"

* tag 'fsnotify_for_v4.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  fsnotify: add fsnotify_add_inode_mark() wrappers
  fanotify: generalize fanotify_should_send_event()
  fsnotify: generalize send_to_group()
  fsnotify: generalize iteration of marks by object type
  fsnotify: introduce marks iteration helpers
  fsnotify: remove redundant arguments to handle_event()
  fsnotify: use type id to identify connector object type

6 years agoMerge tag 'fbdev-v4.18' of git://github.com/bzolnier/linux
Linus Torvalds [Sat, 16 Jun 2018 20:00:24 +0000 (05:00 +0900)]
Merge tag 'fbdev-v4.18' of git://github.com/bzolnier/linux

Pull fbdev updates from Bartlomiej Zolnierkiewicz:
 "There is nothing really major here, few small fixes, some cleanups and
  dead drivers removal:

   - mark omapfb drivers as orphans in MAINTAINERS file (Tomi Valkeinen)

   - add missing module license tags to omap/omapfb driver (Arnd
     Bergmann)

   - add missing GPIOLIB dependendy to omap2/omapfb driver (Arnd
     Bergmann)

   - convert savagefb, aty128fb & radeonfb drivers to use msleep & co.
     (Jia-Ju Bai)

   - allow COMPILE_TEST build for viafb driver (media part was reviewed
     by media subsystem Maintainer)

   - remove unused MERAM support from sh_mobile_lcdcfb and shmob-drm
     drivers (drm parts were acked by shmob-drm driver Maintainer)

   - remove unused auo_k190xfb drivers

   - misc cleanups (Souptick Joarder, Wolfram Sang, Markus Elfring, Andy
     Shevchenko, Colin Ian King)"

* tag 'fbdev-v4.18' of git://github.com/bzolnier/linux: (26 commits)
  fb_omap2: add gpiolib dependency
  video/omap: add module license tags
  MAINTAINERS: make omapfb orphan
  video: fbdev: pxafb: match_string() conversion fixup
  video: fbdev: nvidia: fix spelling mistake: "scaleing" -> "scaling"
  video: fbdev: fix spelling mistake: "frambuffer" -> "framebuffer"
  video: fbdev: pxafb: Convert to use match_string() helper
  video: fbdev: via: allow COMPILE_TEST build
  video: fbdev: remove unused sh_mobile_meram driver
  drm: shmobile: remove unused MERAM support
  video: fbdev: sh_mobile_lcdcfb: remove unused MERAM support
  video: fbdev: remove unused auo_k190xfb drivers
  video: omap: Improve a size determination in omapfb_do_probe()
  video: sm501fb: Improve a size determination in sm501fb_probe()
  video: fbdev-MMP: Improve a size determination in path_init()
  video: fbdev-MMP: Delete an error message for a failed memory allocation in two functions
  video: auo_k190x: Delete an error message for a failed memory allocation in auok190x_common_probe()
  video: sh_mobile_lcdcfb: Delete an error message for a failed memory allocation in two functions
  video: sh_mobile_meram: Delete an error message for a failed memory allocation in sh_mobile_meram_probe()
  video: fbdev: sh_mobile_meram: Drop SUPERH platform dependency
  ...

6 years agoMerge branch 'afs-proc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Linus Torvalds [Sat, 16 Jun 2018 07:32:04 +0000 (16:32 +0900)]
Merge branch 'afs-proc' of git://git./linux/kernel/git/viro/vfs

Pull AFS updates from Al Viro:
 "Assorted AFS stuff - ended up in vfs.git since most of that consists
  of David's AFS-related followups to Christoph's procfs series"

* 'afs-proc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  afs: Optimise callback breaking by not repeating volume lookup
  afs: Display manually added cells in dynamic root mount
  afs: Enable IPv6 DNS lookups
  afs: Show all of a server's addresses in /proc/fs/afs/servers
  afs: Handle CONFIG_PROC_FS=n
  proc: Make inline name size calculation automatic
  afs: Implement network namespacing
  afs: Mark afs_net::ws_cell as __rcu and set using rcu functions
  afs: Fix a Sparse warning in xdr_decode_AFSFetchStatus()
  proc: Add a way to make network proc files writable
  afs: Rearrange fs/afs/proc.c to remove remaining predeclarations.
  afs: Rearrange fs/afs/proc.c to move the show routines up
  afs: Rearrange fs/afs/proc.c by moving fops and open functions down
  afs: Move /proc management functions to the end of the file

6 years agoMerge branch 'work.compat' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Linus Torvalds [Sat, 16 Jun 2018 07:21:50 +0000 (16:21 +0900)]
Merge branch 'work.compat' of git://git./linux/kernel/git/viro/vfs

Pull compat updates from Al Viro:
 "Some biarch patches - getting rid of assorted (mis)uses of
  compat_alloc_user_space().

  Not much in that area this cycle..."

* 'work.compat' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  orangefs: simplify compat ioctl handling
  signalfd: lift sigmask copyin and size checks to callers of do_signalfd4()
  vmsplice(): lift importing iovec into vmsplice(2) and compat counterpart

6 years agoMerge branch 'work.aio' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Linus Torvalds [Sat, 16 Jun 2018 07:11:40 +0000 (16:11 +0900)]
Merge branch 'work.aio' of git://git./linux/kernel/git/viro/vfs

Pull aio fixes from Al Viro:
 "Assorted AIO followups and fixes"

* 'work.aio' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  eventpoll: switch to ->poll_mask
  aio: only return events requested in poll_mask() for IOCB_CMD_POLL
  eventfd: only return events requested in poll_mask()
  aio: mark __aio_sigset::sigmask const

6 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Linus Torvalds [Fri, 15 Jun 2018 22:39:34 +0000 (07:39 +0900)]
Merge git://git./linux/kernel/git/davem/net

Pull networking fixes from David Miller:

 1) Various netfilter fixlets from Pablo and the netfilter team.

 2) Fix regression in IPVS caused by lack of PMTU exceptions on local
    routes in ipv6, from Julian Anastasov.

 3) Check pskb_trim_rcsum for failure in DSA, from Zhouyang Jia.

 4) Don't crash on poll in TLS, from Daniel Borkmann.

 5) Revert SO_REUSE{ADDR,PORT} change, it regresses various things
    including Avahi mDNS. From Bart Van Assche.

 6) Missing of_node_put in qcom/emac driver, from Yue Haibing.

 7) We lack checking of the TCP checking in one special case during SYN
    receive, from Frank van der Linden.

 8) Fix module init error paths of mac80211 hwsim, from Johannes Berg.

 9) Handle 802.1ad properly in stmmac driver, from Elad Nachman.

10) Must grab HW caps before doing quirk checks in stmmac driver, from
    Jose Abreu.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (81 commits)
  net: stmmac: Run HWIF Quirks after getting HW caps
  neighbour: skip NTF_EXT_LEARNED entries during forced gc
  net: cxgb3: add error handling for sysfs_create_group
  tls: fix waitall behavior in tls_sw_recvmsg
  tls: fix use-after-free in tls_push_record
  l2tp: filter out non-PPP sessions in pppol2tp_tunnel_ioctl()
  l2tp: reject creation of non-PPP sessions on L2TPv2 tunnels
  mlxsw: spectrum_switchdev: Fix port_vlan refcounting
  mlxsw: spectrum_router: Align with new route replace logic
  mlxsw: spectrum_router: Allow appending to dev-only routes
  ipv6: Only emit append events for appended routes
  stmmac: added support for 802.1ad vlan stripping
  cfg80211: fix rcu in cfg80211_unregister_wdev
  mac80211: Move up init of TXQs
  mac80211_hwsim: fix module init error paths
  cfg80211: initialize sinfo in cfg80211_get_station
  nl80211: fix some kernel doc tag mistakes
  hv_netvsc: Fix the variable sizes in ipsecv2 and rsc offload
  rds: avoid unenecessary cong_update in loop transport
  l2tp: clean up stale tunnel or session in pppol2tp_connect's error path
  ...

6 years agoMerge tag 'modules-for-v4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu...
Linus Torvalds [Fri, 15 Jun 2018 22:36:39 +0000 (07:36 +0900)]
Merge tag 'modules-for-v4.18' of git://git./linux/kernel/git/jeyu/linux

Pull module updates from Jessica Yu:
 "Minor code cleanup and also allow sig_enforce param to be shown in
  sysfs with CONFIG_MODULE_SIG_FORCE"

* tag 'modules-for-v4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux:
  module: Allow to always show the status of modsign
  module: Do not access sig_enforce directly

6 years agoMerge branch 'for-linus-4.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 15 Jun 2018 21:50:51 +0000 (06:50 +0900)]
Merge branch 'for-linus-4.18-rc1' of git://git./linux/kernel/git/rw/uml

Pull uml updates from Richard Weinberger:
 "Minor updates for UML:

   - fixes for our new vector network driver by Anton

   - initcall cleanup by Alexander

   - We have a new mailinglist, sourceforge.net sucks"

* 'for-linus-4.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
  um: Fix raw interface options
  um: Fix initialization of vector queues
  um: remove uml initcalls
  um: Update mailing list address

6 years agoMerge tag 'riscv-for-linus-4.18-merge_window' of git://git.kernel.org/pub/scm/linux...
Linus Torvalds [Fri, 15 Jun 2018 21:42:43 +0000 (06:42 +0900)]
Merge tag 'riscv-for-linus-4.18-merge_window' of git://git./linux/kernel/git/palmer/riscv-linux

Pull RISC-V updates from Palmer Dabbelt:
 "This contains some small RISC-V updates I'd like to target for 4.18.

  They are all fairly small this time. Here's a short summary, there's
  more info in the commits/merges:

   - a fix to __clear_user to respect the passed arguments.

   - enough support for the perf subsystem to work with RISC-V's ISA
     defined performance counters.

   - support for sparse and cleanups suggested by it.

   - support for R_RISCV_32 (a relocation, not the 32-bit ISA).

   - some MAINTAINERS cleanups.

   - the addition of CONFIG_HVC_RISCV_SBI to our defconfig, as it's
     always present.

  I've given these a simple build+boot test"

* tag 'riscv-for-linus-4.18-merge_window' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux:
  RISC-V: Add CONFIG_HVC_RISCV_SBI=y to defconfig
  RISC-V: Handle R_RISCV_32 in modules
  riscv/ftrace: Export _mcount when DYNAMIC_FTRACE isn't set
  riscv: add riscv-specific predefines to CHECKFLAGS
  riscv: split the declaration of __copy_user
  riscv: no __user for probe_kernel_address()
  riscv: use NULL instead of a plain 0
  perf: riscv: Add Document for Future Porting Guide
  perf: riscv: preliminary RISC-V support
  MAINTAINERS: Update Albert's email, he's back at Berkeley
  MAINTAINERS: Add myself as a maintainer for SiFive's drivers
  riscv: Fix the bug in memory access fixup code

6 years agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Linus Torvalds [Fri, 15 Jun 2018 21:37:04 +0000 (06:37 +0900)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm

Pull more kvm updates from Paolo Bonzini:
 "Mostly the PPC part of the release, but also switching to Arnd's fix
  for the hyperv config issue and a typo fix.

  Main PPC changes:

   - reimplement the MMIO instruction emulation

   - transactional memory support for PR KVM

   - improve radix page table handling"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (63 commits)
  KVM: x86: VMX: redo fix for link error without CONFIG_HYPERV
  KVM: x86: fix typo at kvm_arch_hardware_setup comment
  KVM: PPC: Book3S PR: Fix failure status setting in tabort. emulation
  KVM: PPC: Book3S PR: Enable use on POWER9 bare-metal hosts in HPT mode
  KVM: PPC: Book3S PR: Don't let PAPR guest set MSR hypervisor bit
  KVM: PPC: Book3S PR: Fix failure status setting in treclaim. emulation
  KVM: PPC: Book3S PR: Fix MSR setting when delivering interrupts
  KVM: PPC: Book3S PR: Handle additional interrupt types
  KVM: PPC: Book3S PR: Enable kvmppc_get/set_one_reg_pr() for HTM registers
  KVM: PPC: Book3S: Remove load/put vcpu for KVM_GET_REGS/KVM_SET_REGS
  KVM: PPC: Remove load/put vcpu for KVM_GET/SET_ONE_REG ioctl
  KVM: PPC: Move vcpu_load/vcpu_put down to each ioctl case in kvm_arch_vcpu_ioctl
  KVM: PPC: Book3S PR: Enable HTM for PR KVM for KVM_CHECK_EXTENSION ioctl
  KVM: PPC: Book3S PR: Support TAR handling for PR KVM HTM
  KVM: PPC: Book3S PR: Add guard code to prevent returning to guest with PR=0 and Transactional state
  KVM: PPC: Book3S PR: Add emulation for tabort. in privileged state
  KVM: PPC: Book3S PR: Add emulation for trechkpt.
  KVM: PPC: Book3S PR: Add emulation for treclaim.
  KVM: PPC: Book3S PR: Restore NV regs after emulating mfspr from TM SPRs
  KVM: PPC: Book3S PR: Always fail transactions in guest privileged state
  ...

6 years agoMerge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Linus Torvalds [Fri, 15 Jun 2018 21:35:02 +0000 (06:35 +0900)]
Merge tag 'for_linus' of git://git./linux/kernel/git/mst/vhost

Pull virtio updates from Michael Tsirkin:
 "virtio, vhost: features, fixes

   - PCI virtual function support for virtio

   - DMA barriers for virtio strong barriers

   - bugfixes"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  virtio: update the comments for transport features
  virtio_pci: support enabling VFs
  vhost: fix info leak due to uninitialized memory
  virtio_ring: switch to dma_XX barriers for rpmsg

6 years agofix a series of Documentation/ broken file name references
Mauro Carvalho Chehab [Thu, 14 Jun 2018 15:34:32 +0000 (12:34 -0300)]
fix a series of Documentation/ broken file name references

As files move around, their previous links break. Fix the
references for them.

Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Jonathan Corbet <corbet@lwn.net>
6 years agoDocumentation: rstFlatTable.py: fix a broken reference
Mauro Carvalho Chehab [Thu, 14 Jun 2018 15:33:06 +0000 (12:33 -0300)]
Documentation: rstFlatTable.py: fix a broken reference

The old HOWTO was removed a long time ago. The flat table
version is not metioned elsewhere, so just get rid of the
text.

Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Jonathan Corbet <corbet@lwn.net>
6 years agoABI: sysfs-devices-system-cpu: remove a broken reference
Mauro Carvalho Chehab [Thu, 14 Jun 2018 15:32:05 +0000 (12:32 -0300)]
ABI: sysfs-devices-system-cpu: remove a broken reference

This file doesn't exist anymore:
Documentation/cpu-freq/user-guide.txt

As the ABI already points to Documentation/cpu-freq, just
remove the broken link and the associated text.

Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Jonathan Corbet <corbet@lwn.net>
6 years agodevicetree: fix a series of wrong file references
Mauro Carvalho Chehab [Thu, 14 Jun 2018 15:30:30 +0000 (12:30 -0300)]
devicetree: fix a series of wrong file references

As files got renamed, their references broke.

Manually fix a series of broken refs at the DT bindings.

Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Jonathan Corbet <corbet@lwn.net>
6 years agodevicetree: fix name of pinctrl-bindings.txt
Mauro Carvalho Chehab [Thu, 14 Jun 2018 15:24:41 +0000 (12:24 -0300)]
devicetree: fix name of pinctrl-bindings.txt

Rename:
pinctrl-binding.txt -> pinctrl-bindings.txt

In order to match the current name of this file.

Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Jonathan Corbet <corbet@lwn.net>
6 years agodevicetree: fix some bindings file names
Mauro Carvalho Chehab [Thu, 14 Jun 2018 12:39:01 +0000 (09:39 -0300)]
devicetree: fix some bindings file names

There were some file movements that changed the location for
some DT bindings. Fix them with:
scripts/documentation-file-ref-check --fix

After manually checking if the new file makes sense.

Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Jonathan Corbet <corbet@lwn.net>
6 years agoMAINTAINERS: fix location of DT npcm files
Mauro Carvalho Chehab [Thu, 14 Jun 2018 11:59:37 +0000 (08:59 -0300)]
MAINTAINERS: fix location of DT npcm files

The specified locations are not right. Fix the wildcard logic
to point to the correct directories.

Without that, get-maintainer won't get things right:

$ ./scripts/get_maintainer.pl --no-git-fallback --no-r --no-n --no-l -f Documentation/devicetree/bindings/arm/cpu-enable-method/nuvoton,npcm750-smp
robh+dt@kernel.org (maintainer:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS)
mark.rutland@arm.com (maintainer:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS)

After the patch, it will properly point to NPCM arch maintainers:
$ ./scripts/get_maintainer.pl --no-git-fallback --no-r --no-n --no-l -f Documentation/devicetree/bindings/arm/cpu-enable-method/nuvoton,npcm750-smp
avifishman70@gmail.com (supporter:ARM/NUVOTON NPCM ARCHITECTURE)
tmaimon77@gmail.com (supporter:ARM/NUVOTON NPCM ARCHITECTURE)
robh+dt@kernel.org (maintainer:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS)
mark.rutland@arm.com (maintainer:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS)

Cc: Avi Fishman <avifishman70@gmail.com>
Cc: Tomer Maimon <tmaimon77@gmail.com>
Cc: Patrick Venture <venture@google.com>
Cc: Nancy Yuen <yuenn@google.com>
Cc: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Jonathan Corbet <corbet@lwn.net>
6 years agoMAINTAINERS: fix location of some display DT bindings
Mauro Carvalho Chehab [Thu, 14 Jun 2018 11:01:00 +0000 (08:01 -0300)]
MAINTAINERS: fix location of some display DT bindings

Those files got a manufacturer's name prepended and were moved around.
Adjust their references accordingly.

Also, due those movements, Documentation/devicetree/bindings/video
doesn't exist anymore.

Cc: David Airlie <airlied@linux.ie>
Cc: David Lechner <david@lechnology.com>
Cc: Peter Senna Tschudin <peter.senna@collabora.com>
Cc: Martin Donnelly <martin.donnelly@ge.com>
Cc: Martyn Welch <martyn.welch@collabora.co.uk>
Cc: Stefan Agner <stefan@agner.ch>
Cc: Alison Wang <alison.wang@nxp.com>
Cc: Eric Anholt <eric@anholt.net>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Jonathan Corbet <corbet@lwn.net>
6 years agokernel-parameters.txt: fix pointers to sound parameters
Mauro Carvalho Chehab [Thu, 14 Jun 2018 10:43:07 +0000 (07:43 -0300)]
kernel-parameters.txt: fix pointers to sound parameters

The alsa parameters file was renamed to alsa-configuration.rst.

With regards to OSS, it got retired as a hole by  at changeset
727dede0ba8a ("sound: Retire OSS"). So, it doesn't make sense
to keep mentioning it at kernel-parameters.txt.

Fixes: 727dede0ba8a ("sound: Retire OSS")
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Jonathan Corbet <corbet@lwn.net>
6 years agobindings: nvmem/zii: Fix location of nvmem.txt
Mauro Carvalho Chehab [Thu, 14 Jun 2018 10:18:45 +0000 (07:18 -0300)]
bindings: nvmem/zii: Fix location of nvmem.txt

The location pointed there is missing "bindings/" on its path.

Acked-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Jonathan Corbet <corbet@lwn.net>
6 years agodocs: Fix more broken references
Mauro Carvalho Chehab [Tue, 8 May 2018 18:14:57 +0000 (15:14 -0300)]
docs: Fix more broken references

As we move stuff around, some doc references are broken. Fix some of
them via this script:
./scripts/documentation-file-ref-check --fix

Manually checked that produced results are valid.

Acked-by: Matthias Brugger <matthias.bgg@gmail.com>
Acked-by: Takashi Iwai <tiwai@suse.de>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Jonathan Corbet <corbet@lwn.net>
6 years agoscripts/documentation-file-ref-check: check tools/*/Documentation
Mauro Carvalho Chehab [Thu, 14 Jun 2018 14:06:08 +0000 (11:06 -0300)]
scripts/documentation-file-ref-check: check tools/*/Documentation

Some files, like tools/memory-model/README has references to
a Documentation file that is locale to it. Handle references
that are relative to them too.

Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Jonathan Corbet <corbet@lwn.net>
6 years agoscripts/documentation-file-ref-check: get rid of false-positives
Mauro Carvalho Chehab [Thu, 14 Jun 2018 13:47:29 +0000 (10:47 -0300)]
scripts/documentation-file-ref-check: get rid of false-positives

Now that the number of broken refs are smaller, improve the logic
that gets rid of false-positives.

Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Jonathan Corbet <corbet@lwn.net>
6 years agoscripts/documentation-file-ref-check: hint: dash or underline
Mauro Carvalho Chehab [Thu, 14 Jun 2018 13:14:54 +0000 (10:14 -0300)]
scripts/documentation-file-ref-check: hint: dash or underline

Sometimes, people use dash instead of underline or vice-versa.
Try to autocorrect it.

Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Jonathan Corbet <corbet@lwn.net>
6 years agoscripts/documentation-file-ref-check: add a fix logic for DT
Mauro Carvalho Chehab [Thu, 14 Jun 2018 12:36:35 +0000 (09:36 -0300)]
scripts/documentation-file-ref-check: add a fix logic for DT

There are several links broken due to DT file movements. Add
a hint logic to seek for those changes.

Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Jonathan Corbet <corbet@lwn.net>
6 years agoscripts/documentation-file-ref-check: accept more wildcards at filenames
Mauro Carvalho Chehab [Thu, 14 Jun 2018 10:48:22 +0000 (07:48 -0300)]
scripts/documentation-file-ref-check: accept more wildcards at filenames

at MAINTAINERS, some filename paths use '?' and things like [7,9].
So, accept more wildcards, in order to avoid false-positives.

Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Jonathan Corbet <corbet@lwn.net>
6 years agoscripts/documentation-file-ref-check: fix help message
Mauro Carvalho Chehab [Thu, 14 Jun 2018 10:11:02 +0000 (07:11 -0300)]
scripts/documentation-file-ref-check: fix help message

The name of the --fix option was renamed, but it was not
changed at the quick help message.

Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Jonathan Corbet <corbet@lwn.net>
6 years agomedia: max2175: fix location of driver's companion documentation
Mauro Carvalho Chehab [Thu, 14 Jun 2018 10:34:51 +0000 (07:34 -0300)]
media: max2175: fix location of driver's companion documentation

There's a missing ".rst" at the doc's file name.

Acked-by: Ramesh Shanmugasundaram <Ramesh.shanmugasundaram@bp.renesas.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Jonathan Corbet <corbet@lwn.net>
6 years agomedia: v4l: fix broken video4linux docs locations
Mauro Carvalho Chehab [Tue, 8 May 2018 22:41:44 +0000 (19:41 -0300)]
media: v4l: fix broken video4linux docs locations

There are several places pointing to old documentation files:

  Documentation/video4linux/API.html
  Documentation/video4linux/bttv/
  Documentation/video4linux/cx2341x/fw-encoder-api.txt
  Documentation/video4linux/m5602.txt
  Documentation/video4linux/v4l2-framework.txt
  Documentation/video4linux/videobuf
  Documentation/video4linux/Zoran

Make them point to the new location where available, removing
otherwise.

Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Jonathan Corbet <corbet@lwn.net>
6 years agomedia: dvb: point to the location of the old README.dvb-usb file
Mauro Carvalho Chehab [Tue, 8 May 2018 21:29:30 +0000 (18:29 -0300)]
media: dvb: point to the location of the old README.dvb-usb file

This file got renamed, but the references still point to the
old place.

Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Jonathan Corbet <corbet@lwn.net>
6 years agomedia: dvb: fix location of get_dvb_firmware script
Mauro Carvalho Chehab [Tue, 8 May 2018 21:10:05 +0000 (18:10 -0300)]
media: dvb: fix location of get_dvb_firmware script

This script was moved out of Documentation/dvb, but the
links weren't updated.

Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Jonathan Corbet <corbet@lwn.net>
6 years agodocs: Fix some broken references
Mauro Carvalho Chehab [Tue, 8 May 2018 18:14:57 +0000 (15:14 -0300)]
docs: Fix some broken references

As we move stuff around, some doc references are broken. Fix some of
them via this script:
./scripts/documentation-file-ref-check --fix

Manually checked if the produced result is valid, removing a few
false-positives.

Acked-by: Takashi Iwai <tiwai@suse.de>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Stephen Boyd <sboyd@kernel.org>
Acked-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com>
Acked-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Reviewed-by: Coly Li <colyli@suse.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Jonathan Corbet <corbet@lwn.net>
6 years agodocs: fix broken references with multiple hints
Mauro Carvalho Chehab [Tue, 8 May 2018 21:54:36 +0000 (18:54 -0300)]
docs: fix broken references with multiple hints

The script:
./scripts/documentation-file-ref-check --fix

Gives multiple hints for broken references on some files.
Manually use the one that applies for some files.

Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Acked-by: James Morris <james.morris@microsoft.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Jonathan Corbet <corbet@lwn.net>
6 years agonet: stmmac: Run HWIF Quirks after getting HW caps
Jose Abreu [Fri, 15 Jun 2018 15:17:27 +0000 (16:17 +0100)]
net: stmmac: Run HWIF Quirks after getting HW caps

Currently we were running HWIF quirks before getting HW capabilities.
This is not right because some HWIF callbacks depend on HW caps.

Lets save the quirks callback and use it in a later stage.

This fixes Altera socfpga.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Fixes: 5f0456b43140 ("net: stmmac: Implement logic to automatically select HW Interface")
Reported-by: Dinh Nguyen <dinh.linux@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Vitor Soares <soares@synopsys.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Cc: Dinh Nguyen <dinh.linux@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoneighbour: skip NTF_EXT_LEARNED entries during forced gc
Roopa Prabhu [Wed, 13 Jun 2018 04:26:10 +0000 (21:26 -0700)]
neighbour: skip NTF_EXT_LEARNED entries during forced gc

Commit 9ce33e46531d ("neighbour: support for NTF_EXT_LEARNED flag")
added support for NTF_EXT_LEARNED for neighbour entries.
NTF_EXT_LEARNED entries are neigh entries managed by control
plane (eg: Ethernet VPN implementation in FRR routing suite).
Periodic gc already excludes these entries. This patch extends
it to forced gc which the earlier patch missed.

Fixes: 9ce33e46531d ("neighbour: support for NTF_EXT_LEARNED flag")
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: cxgb3: add error handling for sysfs_create_group
Zhouyang Jia [Fri, 15 Jun 2018 03:06:17 +0000 (11:06 +0800)]
net: cxgb3: add error handling for sysfs_create_group

When sysfs_create_group fails, the lack of error-handling code may
cause unexpected results.

This patch adds error-handling code after calling sysfs_create_group.

Signed-off-by: Zhouyang Jia <jiazhouyang09@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge branch 'tls-fixes'
David S. Miller [Fri, 15 Jun 2018 16:14:31 +0000 (09:14 -0700)]
Merge branch 'tls-fixes'

Daniel Borkmann says:

====================
Two tls fixes

First one is syzkaller trigered uaf and second one noticed
while writing test code with tls ulp. For details please see
individual patches.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agotls: fix waitall behavior in tls_sw_recvmsg
Daniel Borkmann [Fri, 15 Jun 2018 01:07:46 +0000 (03:07 +0200)]
tls: fix waitall behavior in tls_sw_recvmsg

Current behavior in tls_sw_recvmsg() is to wait for incoming tls
messages and copy up to exactly len bytes of data that the user
provided. This is problematic in the sense that i) if no packet
is currently queued in strparser we keep waiting until one has been
processed and pushed into tls receive layer for tls_wait_data() to
wake up and push the decrypted bits to user space. Given after
tls decryption, we're back at streaming data, use sock_rcvlowat()
hint from tcp socket instead. Retain current behavior with MSG_WAITALL
flag and otherwise use the hint target for breaking the loop and
returning to application. This is done if currently no ctx->recv_pkt
is ready, otherwise continue to process it from our strparser
backlog.

Fixes: c46234ebb4d1 ("tls: RX path for ktls")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agotls: fix use-after-free in tls_push_record
Daniel Borkmann [Fri, 15 Jun 2018 01:07:45 +0000 (03:07 +0200)]
tls: fix use-after-free in tls_push_record

syzkaller managed to trigger a use-after-free in tls like the
following:

  BUG: KASAN: use-after-free in tls_push_record.constprop.15+0x6a2/0x810 [tls]
  Write of size 1 at addr ffff88037aa08000 by task a.out/2317

  CPU: 3 PID: 2317 Comm: a.out Not tainted 4.17.0+ #144
  Hardware name: LENOVO 20FBCTO1WW/20FBCTO1WW, BIOS N1FET47W (1.21 ) 11/28/2016
  Call Trace:
   dump_stack+0x71/0xab
   print_address_description+0x6a/0x280
   kasan_report+0x258/0x380
   ? tls_push_record.constprop.15+0x6a2/0x810 [tls]
   tls_push_record.constprop.15+0x6a2/0x810 [tls]
   tls_sw_push_pending_record+0x2e/0x40 [tls]
   tls_sk_proto_close+0x3fe/0x710 [tls]
   ? tcp_check_oom+0x4c0/0x4c0
   ? tls_write_space+0x260/0x260 [tls]
   ? kmem_cache_free+0x88/0x1f0
   inet_release+0xd6/0x1b0
   __sock_release+0xc0/0x240
   sock_close+0x11/0x20
   __fput+0x22d/0x660
   task_work_run+0x114/0x1a0
   do_exit+0x71a/0x2780
   ? mm_update_next_owner+0x650/0x650
   ? handle_mm_fault+0x2f5/0x5f0
   ? __do_page_fault+0x44f/0xa50
   ? mm_fault_error+0x2d0/0x2d0
   do_group_exit+0xde/0x300
   __x64_sys_exit_group+0x3a/0x50
   do_syscall_64+0x9a/0x300
   ? page_fault+0x8/0x30
   entry_SYSCALL_64_after_hwframe+0x44/0xa9

This happened through fault injection where aead_req allocation in
tls_do_encryption() eventually failed and we returned -ENOMEM from
the function. Turns out that the use-after-free is triggered from
tls_sw_sendmsg() in the second tls_push_record(). The error then
triggers a jump to waiting for memory in sk_stream_wait_memory()
resp. returning immediately in case of MSG_DONTWAIT. What follows is
the trim_both_sgl(sk, orig_size), which drops elements from the sg
list added via tls_sw_sendmsg(). Now the use-after-free gets triggered
when the socket is being closed, where tls_sk_proto_close() callback
is invoked. The tls_complete_pending_work() will figure that there's
a pending closed tls record to be flushed and thus calls into the
tls_push_pending_closed_record() from there. ctx->push_pending_record()
is called from the latter, which is the tls_sw_push_pending_record()
from sw path. This again calls into tls_push_record(). And here the
tls_fill_prepend() will panic since the buffer address has been freed
earlier via trim_both_sgl(). One way to fix it is to move the aead
request allocation out of tls_do_encryption() early into tls_push_record().
This means we don't prep the tls header and advance state to the
TLS_PENDING_CLOSED_RECORD before allocation which could potentially
fail happened. That fixes the issue on my side.

Fixes: 3c4d7559159b ("tls: kernel TLS support")
Reported-by: syzbot+5c74af81c547738e1684@syzkaller.appspotmail.com
Reported-by: syzbot+709f2810a6a05f11d4d3@syzkaller.appspotmail.com
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge branch 'l2tp-l2tp_ppp-must-ignore-non-PPP-sessions'
David S. Miller [Fri, 15 Jun 2018 16:12:37 +0000 (09:12 -0700)]
Merge branch 'l2tp-l2tp_ppp-must-ignore-non-PPP-sessions'

Guillaume Nault says:

====================
l2tp: l2tp_ppp must ignore non-PPP sessions

The original L2TP code was written for version 2 of the protocol, which
could only carry PPP sessions. Then L2TPv3 generalised the protocol so that
it could transport different kinds of pseudo-wires. But parts of the
l2tp_ppp module still break in presence of non-PPP sessions.

Assuming L2TPv2 tunnels can only transport PPP sessions is right, but
l2tp_netlink failed to ensure that (fixed in patch 1).
When retrieving a session from an arbitrary tunnel, l2tp_ppp needs to
filter out non-PPP sessions (last occurrence fixed in patch 2).
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agol2tp: filter out non-PPP sessions in pppol2tp_tunnel_ioctl()
Guillaume Nault [Fri, 15 Jun 2018 13:39:19 +0000 (15:39 +0200)]
l2tp: filter out non-PPP sessions in pppol2tp_tunnel_ioctl()

pppol2tp_tunnel_ioctl() can act on an L2TPv3 tunnel, in which case
'session' may be an Ethernet pseudo-wire.

However, pppol2tp_session_ioctl() expects a PPP pseudo-wire, as it
assumes l2tp_session_priv() points to a pppol2tp_session structure. For
an Ethernet pseudo-wire l2tp_session_priv() points to an l2tp_eth_sess
structure instead, making pppol2tp_session_ioctl() access invalid
memory.

Fixes: d9e31d17ceba ("l2tp: Add L2TP ethernet pseudowire support")
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agol2tp: reject creation of non-PPP sessions on L2TPv2 tunnels
Guillaume Nault [Fri, 15 Jun 2018 13:39:17 +0000 (15:39 +0200)]
l2tp: reject creation of non-PPP sessions on L2TPv2 tunnels

The /proc/net/pppol2tp handlers (pppol2tp_seq_*()) iterate over all
L2TPv2 tunnels, and rightfully expect that only PPP sessions can be
found there. However, l2tp_netlink accepts creating Ethernet sessions
regardless of the underlying tunnel version.

This confuses pppol2tp_seq_session_show(), which expects that
l2tp_session_priv() returns a pppol2tp_session structure. When the
session is an Ethernet pseudo-wire, a struct l2tp_eth_sess is returned
instead. This leads to invalid memory access when
pppol2tp_session_get_sock() later tries to dereference ps->sk.

Fixes: d9e31d17ceba ("l2tp: Add L2TP ethernet pseudowire support")
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge branch 'mlxsw-IPv6-and-reference-counting-fixes'
David S. Miller [Fri, 15 Jun 2018 16:11:17 +0000 (09:11 -0700)]
Merge branch 'mlxsw-IPv6-and-reference-counting-fixes'

Ido Schimmel says:

====================
mlxsw: IPv6 and reference counting fixes

The first three patches fix a mismatch between the new IPv6 behavior
introduced in commit f34436a43092 ("net/ipv6: Simplify route replace and
appending into multipath route") and mlxsw. The patches allow the driver
to support multipathing in IPv6 overlays with GRE tunnel devices. A
selftest will be submitted when net-next opens.

The last patch fixes a reference count problem of the port_vlan struct.
I plan to simplify the code in net-next, so that reference counting is
not necessary anymore.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agomlxsw: spectrum_switchdev: Fix port_vlan refcounting
Petr Machata [Fri, 15 Jun 2018 13:23:38 +0000 (16:23 +0300)]
mlxsw: spectrum_switchdev: Fix port_vlan refcounting

Switchdev notifications for addition of SWITCHDEV_OBJ_ID_PORT_VLAN are
distributed not only on clean addition, but also when flags on an
existing VLAN are changed. mlxsw_sp_bridge_port_vlan_add() calls
mlxsw_sp_port_vlan_get() to get at the port_vlan in question, which
implicitly references the object. This then leads to discrepancies in
reference counting when the VLAN is removed. spectrum.c warns about the
problem when the module is removed:

[13578.493090] WARNING: CPU: 0 PID: 2454 at drivers/net/ethernet/mellanox/mlxsw/spectrum.c:2973 mlxsw_sp_port_remove+0xfd/0x110 [mlxsw_spectrum]
[...]
[13578.627106] Call Trace:
[13578.629617]  mlxsw_sp_fini+0x2a/0xe0 [mlxsw_spectrum]
[13578.634748]  mlxsw_core_bus_device_unregister+0x3e/0x130 [mlxsw_core]
[13578.641290]  mlxsw_pci_remove+0x13/0x40 [mlxsw_pci]
[13578.646238]  pci_device_remove+0x31/0xb0
[13578.650244]  device_release_driver_internal+0x14f/0x220
[13578.655562]  driver_detach+0x32/0x70
[13578.659183]  bus_remove_driver+0x47/0xa0
[13578.663134]  pci_unregister_driver+0x1e/0x80
[13578.667486]  mlxsw_sp_module_exit+0xc/0x3fa [mlxsw_spectrum]
[13578.673207]  __x64_sys_delete_module+0x13b/0x1e0
[13578.677888]  ? exit_to_usermode_loop+0x78/0x80
[13578.682374]  do_syscall_64+0x39/0xe0
[13578.685976]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fix by putting the port_vlan when mlxsw_sp_port_vlan_bridge_join()
determines it's a flag-only change.

Fixes: b3529af6bb0d ("spectrum: Reference count VLAN entries")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agomlxsw: spectrum_router: Align with new route replace logic
Ido Schimmel [Fri, 15 Jun 2018 13:23:37 +0000 (16:23 +0300)]
mlxsw: spectrum_router: Align with new route replace logic

Commit f34436a43092 ("net/ipv6: Simplify route replace and appending
into multipath route") changed the IPv6 route replace logic so that the
first matching route (i.e., same metric) is replaced.

Have mlxsw replace the first matching route as well.

Fixes: f34436a43092 ("net/ipv6: Simplify route replace and appending into multipath route")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agomlxsw: spectrum_router: Allow appending to dev-only routes
Ido Schimmel [Fri, 15 Jun 2018 13:23:36 +0000 (16:23 +0300)]
mlxsw: spectrum_router: Allow appending to dev-only routes

Commit f34436a43092 ("net/ipv6: Simplify route replace and appending
into multipath route") changed the IPv6 route append logic so that
dev-only routes can be appended and not only gatewayed routes.

Align mlxsw with the new behaviour.

Fixes: f34436a43092 ("net/ipv6: Simplify route replace and appending into multipath route")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoipv6: Only emit append events for appended routes
Ido Schimmel [Fri, 15 Jun 2018 13:23:35 +0000 (16:23 +0300)]
ipv6: Only emit append events for appended routes

Current code will emit an append event in the FIB notification chain for
any route added with NLM_F_APPEND set, even if the route was not
appended to any existing route.

This is inconsistent with IPv4 where such an event is only emitted when
the new route is appended after an existing one.

Align IPv6 behavior with IPv4, thereby allowing listeners to more easily
handle these events.

Fixes: f34436a43092 ("net/ipv6: Simplify route replace and appending into multipath route")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge tag 'mac80211-for-davem-2018-06-15' of git://git.kernel.org/pub/scm/linux/kerne...
David S. Miller [Fri, 15 Jun 2018 16:08:26 +0000 (09:08 -0700)]
Merge tag 'mac80211-for-davem-2018-06-15' of git://git./linux/kernel/git/jberg/mac80211

Johannes Berg says:

====================
A handful of fixes:
 * missing RCU grace period enforcement led to drivers freeing
   data structures before; fix from Dedy Lansky.
 * hwsim module init error paths were messed up; fixed it myself
   after a report from Colin King (who had sent a partial patch)
 * kernel-doc tag errors; fix from Luca Coelho
 * initialize the on-stack sinfo data structure when getting
   station information; fix from Sven Eckelmann
 * TXQ state dumping is now done from init, and when TXQs aren't
   initialized yet at that point, bad things happen, move the
   initialization; fix from Toke Høiland-Jørgensen.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agostmmac: added support for 802.1ad vlan stripping
Elad Nachman [Fri, 15 Jun 2018 06:57:39 +0000 (09:57 +0300)]
stmmac: added support for 802.1ad vlan stripping

stmmac reception handler calls stmmac_rx_vlan() to strip the vlan before
calling napi_gro_receive().

The function assumes VLAN tagged frames are always tagged with
802.1Q protocol, and assigns ETH_P_8021Q to the skb by hard-coding
the parameter on call to __vlan_hwaccel_put_tag() .

This causes packets not to be passed to the VLAN slave if it was created
with 802.1AD protocol
(ip link add link eth0 eth0.100 type vlan proto 802.1ad id 100).

This fix passes the protocol from the VLAN header into
__vlan_hwaccel_put_tag() instead of using the hard-coded value of
ETH_P_8021Q.

NETIF_F_HW_VLAN_STAG_RX check was added and the strip action is now
dependent on the correct combination of features and the detected vlan tag.

NETIF_F_HW_VLAN_STAG_RX feature was added to be in line with the driver
actual abilities.

Signed-off-by: Elad Nachman <eladn@gilat.com>
Reviewed-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoarch/*: Kconfig: fix documentation for NMI watchdog
Mauro Carvalho Chehab [Wed, 9 May 2018 02:44:08 +0000 (23:44 -0300)]
arch/*: Kconfig: fix documentation for NMI watchdog

Changeset 9919cba7ff71 ("watchdog: Update documentation") updated
the documentation, removing the old nmi_watchdog.txt and adding
a file with a new content.

Update Kconfig files accordingly.

Fixes: 9919cba7ff71 ("watchdog: Update documentation")
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Jonathan Corbet <corbet@lwn.net>
6 years agodocs: crypto_engine.rst: Fix two parse warnings
Mauro Carvalho Chehab [Sun, 6 May 2018 17:30:09 +0000 (14:30 -0300)]
docs: crypto_engine.rst: Fix two parse warnings

./Documentation/crypto/crypto_engine.rst:13: WARNING: Unexpected indentation.
./Documentation/crypto/crypto_engine.rst:15: WARNING: Block quote ends without a blank line; unexpected unindent.

Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Jonathan Corbet <corbet@lwn.net>
6 years agodocs: can.rst: fix a footnote reference
Mauro Carvalho Chehab [Sun, 6 May 2018 15:00:11 +0000 (12:00 -0300)]
docs: can.rst: fix a footnote reference

As stated at:
http://www.sphinx-doc.org/en/master/usage/restructuredtext/basics.html#footnotes

A footnote should contain either a number, a reference or
an auto number, e. g.:
        [1], [#f1] or [#].

While using [*] accidentaly works for html, it fails for other
document outputs. In particular, it causes an error with LaTeX
output, causing all books after networking to not be built.

So, replace it by a valid syntax.

Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Jonathan Corbet <corbet@lwn.net>
6 years agoafs: Optimise callback breaking by not repeating volume lookup
David Howells [Fri, 15 Jun 2018 14:24:50 +0000 (15:24 +0100)]
afs: Optimise callback breaking by not repeating volume lookup

At the moment, afs_break_callbacks calls afs_break_one_callback() for each
separate FID it was given, and the latter looks up the volume individually
for each one.

However, this is inefficient if two or more FIDs have the same vid as we
could reuse the volume.  This is complicated by cell aliasing whereby we
may have multiple cells sharing a volume and can therefore have multiple
callback interests for any particular volume ID.

At the moment afs_break_one_callback() scans the entire list of volumes
we're getting from a server and breaks the appropriate callback in every
matching volume, regardless of cell.  This scan is done for every FID.

Optimise callback breaking by the following means:

 (1) Sort the FID list by vid so that all FIDs belonging to the same volume
     are clumped together.

     This is done through the use of an indirection table as we cannot do
     an insertion sort on the afs_callback_break array as we decode FIDs
     into it as we subsequently also have to decode callback info into it
     that corresponds by array index only.

     We also don't really want to bubblesort afterwards if we can avoid it.

 (2) Sort the server->cb_interests array by vid so that all the matching
     volumes are grouped together.  This permits the scan to stop after
     finding a record that has a higher vid.

 (3) When breaking FIDs, we try to keep server->cb_break_lock as long as
     possible, caching the start point in the array for that volume group
     as long as possible.

     It might make sense to add another layer in that list and have a
     refcounted volume ID anchor that has the matching interests attached
     to it rather than being in the list.  This would allow the lock to be
     dropped without losing the cursor.

Signed-off-by: David Howells <dhowells@redhat.com>
6 years agoafs: Display manually added cells in dynamic root mount
David Howells [Fri, 15 Jun 2018 14:19:22 +0000 (15:19 +0100)]
afs: Display manually added cells in dynamic root mount

Alter the dynroot mount so that cells created by manipulation of
/proc/fs/afs/cells and /proc/fs/afs/rootcell and by specification of a root
cell as a module parameter will cause directories for those cells to be
created in the dynamic root superblock for the network namespace[*].

To this end:

 (1) Only one dynamic root superblock is now created per network namespace
     and this is shared between all attempts to mount it.  This makes it
     easier to find the superblock to modify.

 (2) When a dynamic root superblock is created, the list of cells is walked
     and directories created for each cell already defined.

 (3) When a new cell is added, if a dynamic root superblock exists, a
     directory is created for it.

 (4) When a cell is destroyed, the directory is removed.

 (5) These directories are created by calling lookup_one_len() on the root
     dir which automatically creates them if they don't exist.

[*] Inasmuch as network namespaces are currently supported here.

Signed-off-by: David Howells <dhowells@redhat.com>
6 years agoafs: Enable IPv6 DNS lookups
David Howells [Fri, 15 Jun 2018 14:19:10 +0000 (15:19 +0100)]
afs: Enable IPv6 DNS lookups

Remove the restriction on DNS lookup upcalls that prevents ipv6 addresses
from being looked up.

Signed-off-by: David Howells <dhowells@redhat.com>
6 years agobsg: fix race of bsg_open and bsg_unregister
Anatoliy Glagolev [Wed, 13 Jun 2018 21:38:51 +0000 (15:38 -0600)]
bsg: fix race of bsg_open and bsg_unregister

The existing implementation allows races between bsg_unregister and
bsg_open paths. bsg_unregister and request_queue cleanup and deletion
may start and complete right after bsg_get_device (in bsg_open path)
retrieves bsg_class_device and releases the mutex. Then bsg_open path
touches freed memory of bsg_class_device and request_queue.

One possible fix is to hold the mutex all the way through bsg_get_device
instead of releasing it after bsg_class_device retrieval.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-Off-By: Anatoliy Glagolev <glagolig@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
6 years agoblock: remov blk_queue_invalidate_tags
Christoph Hellwig [Fri, 15 Jun 2018 11:55:07 +0000 (13:55 +0200)]
block: remov blk_queue_invalidate_tags

This function is entirely unused, so remove it and the tag_queue_busy
member of struct request_queue.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
6 years agoMerge branch 'nvme-4.18' of git://git.infradead.org/nvme into for-linus
Jens Axboe [Fri, 15 Jun 2018 14:11:05 +0000 (08:11 -0600)]
Merge branch 'nvme-4.18' of git://git.infradead.org/nvme into for-linus

Pull NVMe fixes from Christoph:

"Fix various little regressions introduced in this merge window, plus
 a rework of the fibre channel connect and reconnect path to share the
 code instead of having separate sets of bugs. Last but not least a
 trivial trace point addition from Hannes."

* 'nvme-4.18' of git://git.infradead.org/nvme:
  nvme-fabrics: fix and refine state checks in __nvmf_check_ready
  nvme-fabrics: handle the admin-only case properly in nvmf_check_ready
  nvme-fabrics: refactor queue ready check
  blk-mq: remove blk_mq_tagset_iter
  nvme: remove nvme_reinit_tagset
  nvme-fc: fix nulling of queue data on reconnect
  nvme-fc: remove reinit_request routine
  nvme-fc: change controllers first connect to use reconnect path
  nvme: don't rely on the changed namespace list log
  nvmet: free smart-log buffer after use
  nvme-rdma: fix error flow during mapping request data
  nvme: add bio remapping tracepoint
  nvme: fix NULL pointer dereference in nvme_init_subsystem

6 years agocfg80211: fix rcu in cfg80211_unregister_wdev
Dedy Lansky [Fri, 15 Jun 2018 11:05:01 +0000 (13:05 +0200)]
cfg80211: fix rcu in cfg80211_unregister_wdev

Callers of cfg80211_unregister_wdev can free the wdev object
immediately after this function returns. This may crash the kernel
because this wdev object is still in use by other threads.
Add synchronize_rcu() after list_del_rcu to make sure wdev object can
be safely freed.

Signed-off-by: Dedy Lansky <dlansky@codeaurora.org>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
6 years agomac80211: Move up init of TXQs
Toke Høiland-Jørgensen [Fri, 25 May 2018 12:29:21 +0000 (14:29 +0200)]
mac80211: Move up init of TXQs

On init, ieee80211_if_add() dumps the interface. Since that now includes a
dump of the TXQ state, we need to initialise that before the dump happens.
So move up the TXQ initialisation to to before the call to
ieee80211_if_add().

Fixes: 52539ca89f36 ("cfg80211: Expose TXQ stats and parameters to userspace")
Reported-by: Niklas Cassel <niklas.cassel@linaro.org>
Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
Tested-by: Niklas Cassel <niklas.cassel@linaro.org>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>