Reinette Chatre [Fri, 22 Jun 2018 22:42:26 +0000 (15:42 -0700)]
x86/intel_rdt: Create debugfs files for pseudo-locking testing
There is no simple yes/no test to determine if pseudo-locking was
successful. In order to test pseudo-locking we expose a debugfs file for
each pseudo-locked region that will record the latency of reading the
pseudo-locked memory at a stride of 32 bytes (hardcoded). These numbers
will give us an idea of locking was successful or not since they will
reflect cache hits and cache misses (hardware prefetching is disabled
during the test).
The new debugfs file "pseudo_lock_measure" will, when the
pseudo_lock_mem_latency tracepoint is enabled, record the latency of
accessing each cache line twice.
Kernel tracepoints offer us histograms (when CONFIG_HIST_TRIGGERS is
enabled) that is a simple way to visualize the memory access latency
and immediately see any cache misses. For example, the hist trigger
below before trigger of the measurement will display the memory access
latency and instances at each latency:
echo 'hist:keys=latency' > /sys/kernel/debug/tracing/events/resctrl/\
pseudo_lock_mem_latency/trigger
echo 1 > /sys/kernel/debug/tracing/events/resctrl/pseudo_lock_mem_latency/enable
echo 1 > /sys/kernel/debug/resctrl/<newlock>/pseudo_lock_measure
echo 0 > /sys/kernel/debug/tracing/events/resctrl/pseudo_lock_mem_latency/enable
cat /sys/kernel/debug/tracing/events/resctrl/pseudo_lock_mem_latency/hist
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: fenghua.yu@intel.com
Cc: tony.luck@intel.com
Cc: vikas.shivappa@linux.intel.com
Cc: gavin.hindman@intel.com
Cc: jithu.joseph@intel.com
Cc: dave.hansen@intel.com
Cc: hpa@zytor.com
Link: https://lkml.kernel.org/r/6b2ea76181099d1b79ccfa7d3be24497ab2d1a45.1529706536.git.reinette.chatre@intel.com
Reinette Chatre [Fri, 22 Jun 2018 22:42:25 +0000 (15:42 -0700)]
x86/intel_rdt: Create resctrl debug area
In preparation for support of debugging of RDT sub features the user can
now enable a RDT debugfs region.
The debug area is always enabled when CONFIG_DEBUG_FS is set as advised in
http://lkml.kernel.org/r/
20180523080501.GA6822@kroah.com
Also from same discussion in above linked email, no error checking on the
debugfs creation return value since code should not behave differently when
debugging passes or fails. Even on failure the returned value can be passed
safely to other debugfs calls.
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: fenghua.yu@intel.com
Cc: tony.luck@intel.com
Cc: vikas.shivappa@linux.intel.com
Cc: gavin.hindman@intel.com
Cc: jithu.joseph@intel.com
Cc: dave.hansen@intel.com
Cc: hpa@zytor.com
Link: https://lkml.kernel.org/r/9f553faf30866a6317f1aaaa2fe9f92de66a10d2.1529706536.git.reinette.chatre@intel.com
Reinette Chatre [Fri, 22 Jun 2018 22:42:24 +0000 (15:42 -0700)]
x86/intel_rdt: Ensure RDT cleanup on exit
The RDT system's initialization does not have the corresponding exit
handling to ensure everything initialized on load is cleaned up also.
Introduce the cleanup routines that complement all initialization. This
includes the removal of a duplicate rdtgroup_init() declaration.
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: fenghua.yu@intel.com
Cc: tony.luck@intel.com
Cc: vikas.shivappa@linux.intel.com
Cc: gavin.hindman@intel.com
Cc: jithu.joseph@intel.com
Cc: dave.hansen@intel.com
Cc: hpa@zytor.com
Link: https://lkml.kernel.org/r/a9e3a2bbd731d13915d2d7bf05d4f675b4fa109b.1529706536.git.reinette.chatre@intel.com
Reinette Chatre [Fri, 22 Jun 2018 22:42:23 +0000 (15:42 -0700)]
x86/intel_rdt: Resctrl files reflect pseudo-locked information
Information about resources as well as resource groups are contained in a
variety of resctrl files. Now that pseudo-locked regions can be created the
files can be updated to present appropriate information to the user.
Update the resource group's schemata file to show only the information of
the pseudo-locked region.
Update the resource group's size file to show the size in bytes of only the
pseudo-locked region.
Update the bit_usage file to use the letter 'P' for all pseudo-locked
regions.
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: fenghua.yu@intel.com
Cc: tony.luck@intel.com
Cc: vikas.shivappa@linux.intel.com
Cc: gavin.hindman@intel.com
Cc: jithu.joseph@intel.com
Cc: dave.hansen@intel.com
Cc: hpa@zytor.com
Link: https://lkml.kernel.org/r/5ece82869b651c2178b278e00bca959f7626b6e9.1529706536.git.reinette.chatre@intel.com
Reinette Chatre [Fri, 22 Jun 2018 22:42:22 +0000 (15:42 -0700)]
x86/intel_rdt: Support creation/removal of pseudo-locked region
The user triggers the creation of a pseudo-locked region when writing a
valid schemata to the schemata file of a resource group in the
pseudo-locksetup mode.
A valid schemata is one that: (1) does not overlap with any other resource
group, (2) does not involve a cache that already contains a pseudo-locked
region within its hierarchy.
After a valid schemata is parsed the system is programmed to associate the
to be pseudo-lock bitmask with the closid associated with the resource
group. With the system set up the pseudo-locked region can be created.
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: fenghua.yu@intel.com
Cc: tony.luck@intel.com
Cc: vikas.shivappa@linux.intel.com
Cc: gavin.hindman@intel.com
Cc: jithu.joseph@intel.com
Cc: dave.hansen@intel.com
Cc: hpa@zytor.com
Link: https://lkml.kernel.org/r/8929c3a9e2ba600e79649abe584aa28b8d0ff639.1529706536.git.reinette.chatre@intel.com
Reinette Chatre [Fri, 22 Jun 2018 22:42:21 +0000 (15:42 -0700)]
x86/intel_rdt: Pseudo-lock region creation/removal core
The user requests a pseudo-locked region by providing a schemata to a
resource group that is in the pseudo-locksetup mode. This is the
functionality that consumes the parsed user data and creates the
pseudo-locked region.
First, required information is deduced from user provided data.
This includes, how much memory does the requested bitmask represent,
which CPU the requested region is associated with, and what is the
cache line size of that cache (to learn the stride needed for locking).
Second, a contiguous block of memory matching the requested bitmask is
allocated.
Finally, pseudo-locking is performed. The resource group already has the
allocation that reflects the requested bitmask. With this class of service
active and interference minimized, the allocated memory is loaded into the
cache.
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: fenghua.yu@intel.com
Cc: tony.luck@intel.com
Cc: vikas.shivappa@linux.intel.com
Cc: gavin.hindman@intel.com
Cc: jithu.joseph@intel.com
Cc: dave.hansen@intel.com
Cc: hpa@zytor.com
Link: https://lkml.kernel.org/r/67391160bbf06143bc62d856d3d234eb152008b7.1529706536.git.reinette.chatre@intel.com
Reinette Chatre [Fri, 22 Jun 2018 22:42:20 +0000 (15:42 -0700)]
x86/intel_rdt: Discover supported platforms via prefetch disable bits
Knowing the model specific prefetch disable bits is required to support
cache pseudo-locking because the hardware prefetchers need to be disabled
when the kernel memory is pseudo-locked to cache. We add these bits only
for platforms known to support cache pseudo-locking.
When the user requests locksetup mode to be entered it will fail if the
prefetch disabling bits are not known for the platform.
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: fenghua.yu@intel.com
Cc: tony.luck@intel.com
Cc: vikas.shivappa@linux.intel.com
Cc: gavin.hindman@intel.com
Cc: jithu.joseph@intel.com
Cc: dave.hansen@intel.com
Cc: hpa@zytor.com
Link: https://lkml.kernel.org/r/3eef559aa9fd693a104ff99ff909cfee450c1695.1529706536.git.reinette.chatre@intel.com
Reinette Chatre [Fri, 22 Jun 2018 22:42:19 +0000 (15:42 -0700)]
x86/intel_rdt: Add utilities to test pseudo-locked region possibility
A pseudo-locked region does not have a class of service associated with
it and thus not tracked in the array of control values maintained as
part of the domain. Even so, when the user provides a new bitmask for
another resource group it needs to be checked for interference with
existing pseudo-locked regions.
Additionally only one pseudo-locked region can be created in any cache
hierarchy.
Introduce two utilities in support of above scenarios: (1) a utility
that can be used to test if a given capacity bitmask overlaps with any
pseudo-locked regions associated with a particular cache instance, (2) a
utility that can be used to test if a pseudo-locked region exists within
a particular cache hierarchy.
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: fenghua.yu@intel.com
Cc: tony.luck@intel.com
Cc: vikas.shivappa@linux.intel.com
Cc: gavin.hindman@intel.com
Cc: jithu.joseph@intel.com
Cc: dave.hansen@intel.com
Cc: hpa@zytor.com
Link: https://lkml.kernel.org/r/b8e31dbdcf22ddf71df46072647b47e7558abb32.1529706536.git.reinette.chatre@intel.com
Reinette Chatre [Fri, 22 Jun 2018 22:42:18 +0000 (15:42 -0700)]
x86/intel_rdt: Split resource group removal in two
Resource groups used for pseudo-locking do not require the same work on
removal as the other resource groups.
The resource group removal is split in two in preparation for support of
pseudo-locking resource groups. A single re-ordering occurs - the
setting of the rdtgrp flag is moved to later. This flag is not used by
any of the code between its original and new location.
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: fenghua.yu@intel.com
Cc: tony.luck@intel.com
Cc: vikas.shivappa@linux.intel.com
Cc: gavin.hindman@intel.com
Cc: jithu.joseph@intel.com
Cc: dave.hansen@intel.com
Cc: hpa@zytor.com
Link: https://lkml.kernel.org/r/c8cbf7a7c72480b39bb946a929dbae96c0f9aca1.1529706536.git.reinette.chatre@intel.com
Reinette Chatre [Fri, 22 Jun 2018 22:42:17 +0000 (15:42 -0700)]
x86/intel_rdt: Enable entering of pseudo-locksetup mode
The user can request entering pseudo-locksetup mode by writing
"pseudo-locksetup" to the mode file. Act on this request as well as
support switching from a pseudo-locksetup mode (before pseudo-locked
mode was entered). It is not supported to modify the mode once
pseudo-locked mode has been entered.
The schemata reflects the new mode by adding "uninitialized" to all
resources. The size resctrl file reports zero for all cache domains in
support of the uninitialized nature. Since there are no users of this
class of service its allocations can be ignored when searching for
appropriate default allocations for new resource groups. For the same
reason resource groups in pseudo-locksetup mode are not considered when
testing if new resource groups may overlap.
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: fenghua.yu@intel.com
Cc: tony.luck@intel.com
Cc: vikas.shivappa@linux.intel.com
Cc: gavin.hindman@intel.com
Cc: jithu.joseph@intel.com
Cc: dave.hansen@intel.com
Cc: hpa@zytor.com
Link: https://lkml.kernel.org/r/56f553334708022903c296284e62db3bbc1ff150.1529706536.git.reinette.chatre@intel.com
Reinette Chatre [Fri, 22 Jun 2018 22:42:16 +0000 (15:42 -0700)]
x86/intel_rdt: Support enter/exit of locksetup mode
The locksetup mode is the way in which the user communicates that the
resource group will be used for a pseudo-locked region. Locksetup mode
should thus ensure that all restrictions on a resource group are met before
locksetup mode can be entered. The resource group should also be configured
to ensure that it cannot be modified in unsupported ways when a
pseudo-locked region.
Introduce the support where the request for entering locksetup mode can be
validated. This includes: CDP is not active, no cpus or tasks are assigned
to the resource group, monitoring is not in progress on the resource
group. Once the resource group is determined ready for a pseudo-locked
region it is configured to not allow future changes to these properties.
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: fenghua.yu@intel.com
Cc: tony.luck@intel.com
Cc: vikas.shivappa@linux.intel.com
Cc: gavin.hindman@intel.com
Cc: jithu.joseph@intel.com
Cc: dave.hansen@intel.com
Cc: hpa@zytor.com
Link: https://lkml.kernel.org/r/b120f71ced30116bcc6c6f651e8a7906ae6b903d.1529706536.git.reinette.chatre@intel.com
Reinette Chatre [Fri, 22 Jun 2018 22:42:15 +0000 (15:42 -0700)]
x86/intel_rdt: Introduce pseudo-locked region
A pseudo-locked region is introduced representing an instance of a
pseudo-locked cache region. Each cache instance (domain) can support one
pseudo-locked region. Similarly a resource group can be used for one
pseudo-locked region.
Include a pointer to a pseudo-locked region from the domain and resource
group structures.
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: fenghua.yu@intel.com
Cc: tony.luck@intel.com
Cc: vikas.shivappa@linux.intel.com
Cc: gavin.hindman@intel.com
Cc: jithu.joseph@intel.com
Cc: dave.hansen@intel.com
Cc: hpa@zytor.com
Link: https://lkml.kernel.org/r/9f69eb159051067703bcbc714de62e69874d5dee.1529706536.git.reinette.chatre@intel.com
Reinette Chatre [Fri, 22 Jun 2018 22:42:14 +0000 (15:42 -0700)]
x86/intel_rdt: Add check to determine if monitoring in progress
When a resource group is pseudo-locked it is orphaned without a class of
service associated with it. We thus do not want any monitoring in progress
on a resource group that will be used for pseudo-locking.
Introduce a test that can be used to determine if pseudo-locking in
progress on a resource group. Temporarily mark it as unused to avoid
compile warnings until it is used.
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: fenghua.yu@intel.com
Cc: tony.luck@intel.com
Cc: vikas.shivappa@linux.intel.com
Cc: gavin.hindman@intel.com
Cc: jithu.joseph@intel.com
Cc: dave.hansen@intel.com
Cc: hpa@zytor.com
Link: https://lkml.kernel.org/r/14fd9494f87ca72a213b3a197d1172d4e66ae196.1529706536.git.reinette.chatre@intel.com
Reinette Chatre [Fri, 22 Jun 2018 22:42:13 +0000 (15:42 -0700)]
x86/intel_rdt: Utilities to restrict/restore access to specific files
In support of Cache Pseudo-Locking we need to restrict access to specific
resctrl files to protect the state of a resource group used for
pseudo-locking from being changed in unsupported ways.
Introduce two utilities that can be used to either restrict or restore the
access to all files irrelevant to cache pseudo-locking when pseudo-locking
in progress for the resource group.
At this time introduce a new source file, intel_rdt_pseudo_lock.c, that
will contain most of the code related to cache pseudo-locking.
Temporarily mark these new functions as unused to silence compile warnings
until they are used.
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: fenghua.yu@intel.com
Cc: tony.luck@intel.com
Cc: vikas.shivappa@linux.intel.com
Cc: gavin.hindman@intel.com
Cc: jithu.joseph@intel.com
Cc: dave.hansen@intel.com
Cc: hpa@zytor.com
Link: https://lkml.kernel.org/r/ab6319d1244366be3f9b7f9fba1c3da4810a274b.1529706536.git.reinette.chatre@intel.com
Reinette Chatre [Fri, 22 Jun 2018 22:42:12 +0000 (15:42 -0700)]
x86/intel_rdt: Protect against resource group changes during locking
We intend to modify file permissions to make the "tasks", "cpus", and
"cpus_list" not accessible to the user when cache pseudo-locking in
progress. Even so, it is still possible for the user to force the file
permissions (using chmod) to make them writeable. Similarly, directory
permissions will be modified to prevent future monitor group creation but
the user can override these restrictions also.
Add additional checks to the files we intend to restrict to ensure that no
modifications from user space are attempted while setting up a
pseudo-locking or after a pseudo-locked region is set up.
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: fenghua.yu@intel.com
Cc: tony.luck@intel.com
Cc: vikas.shivappa@linux.intel.com
Cc: gavin.hindman@intel.com
Cc: jithu.joseph@intel.com
Cc: dave.hansen@intel.com
Cc: hpa@zytor.com
Link: https://lkml.kernel.org/r/0c5cb006e81ead0b8bfff2df530c5d3017fd31d1.1529706536.git.reinette.chatre@intel.com
Reinette Chatre [Fri, 22 Jun 2018 22:42:11 +0000 (15:42 -0700)]
x86/intel_rdt: Add utility to restrict/restore access to resctrl files
When a resource group is used for Cache Pseudo-Locking then the region of
cache ends up being orphaned with no class of service referring to it. The
resctrl files intended to manage how the classes of services are utilized
thus become irrelevant.
The fact that a resctrl file is not relevant can be communicated to the
user by setting all of its permissions to zero. That is, its read, write,
and execute permissions are unset for all users.
Introduce two utilities, rdtgroup_kn_mode_restrict() and
rdtgroup_kn_mode_restore(), that can be used to restrict and restore the
permissions of a file or directory belonging to a resource group.
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: fenghua.yu@intel.com
Cc: tony.luck@intel.com
Cc: vikas.shivappa@linux.intel.com
Cc: gavin.hindman@intel.com
Cc: jithu.joseph@intel.com
Cc: dave.hansen@intel.com
Cc: hpa@zytor.com
Link: https://lkml.kernel.org/r/7afdbf5551b2f93cd45d61fbf5e01d87331f529a.1529706536.git.reinette.chatre@intel.com
Reinette Chatre [Fri, 22 Jun 2018 22:42:10 +0000 (15:42 -0700)]
x86/intel_rdt: Add utility to test if tasks assigned to resource group
In considering changes to a resource group it becomes necessary to know
whether tasks have been assigned to the resource group in question.
Introduce a new utility that can be used to check if any tasks have been
assigned to a particular resource group.
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: fenghua.yu@intel.com
Cc: tony.luck@intel.com
Cc: vikas.shivappa@linux.intel.com
Cc: gavin.hindman@intel.com
Cc: jithu.joseph@intel.com
Cc: dave.hansen@intel.com
Cc: hpa@zytor.com
Link: https://lkml.kernel.org/r/be9ea3969ffd731dfd90c0ebcd5a0e0a2d135bb2.1529706536.git.reinette.chatre@intel.com
Reinette Chatre [Fri, 22 Jun 2018 22:42:09 +0000 (15:42 -0700)]
x86/intel_rdt: Respect read and write access
By default, if the opener has CAP_DAC_OVERRIDE, a kernfs file can be opened
regardless of RW permissions. Writing to a kernfs file will thus succeed
even if permissions are 0000.
It's required to restrict the actions that can be performed on a resource
group from userspace based on the mode of the resource group. This
restriction will be done through a modification of the file
permissions. That is, for example, if a resource group is locked then the
user cannot add tasks to the resource group.
For this restriction through file permissions to work it has to be ensured
that the permissions are always respected. To do so the resctrl filesystem
is created with the KERNFS_ROOT_EXTRA_OPEN_PERM_CHECK flag that will result
in open(2) failing with -EACCESS regardless of CAP_DAC_OVERRIDE if the
permission does not have the respective read or write access.
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: fenghua.yu@intel.com
Cc: tony.luck@intel.com
Cc: vikas.shivappa@linux.intel.com
Cc: gavin.hindman@intel.com
Cc: jithu.joseph@intel.com
Cc: dave.hansen@intel.com
Cc: hpa@zytor.com
Link: https://lkml.kernel.org/r/26f4fc25f110bfc07c2d2c8b2c4ee904922fedf7.1529706536.git.reinette.chatre@intel.com
Reinette Chatre [Fri, 22 Jun 2018 22:42:08 +0000 (15:42 -0700)]
x86/intel_rdt: Introduce the Cache Pseudo-Locking modes
The two modes used to manage Cache Pseudo-Locked regions are introduced. A
resource group is assigned "pseudo-locksetup" mode when the user indicates
that this resource group will be used for a Cache Pseudo-Locked
region. When the Cache Pseudo-Locked region has been set up successfully
after the user wrote the requested schemata to the "schemata" file, then
the mode will automatically changed to "pseudo-locked". The user is not
able to modify the mode to "pseudo-locked" by writing "pseudo-locked" to
the "mode" file directly.
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: fenghua.yu@intel.com
Cc: tony.luck@intel.com
Cc: vikas.shivappa@linux.intel.com
Cc: gavin.hindman@intel.com
Cc: jithu.joseph@intel.com
Cc: dave.hansen@intel.com
Cc: hpa@zytor.com
Link: https://lkml.kernel.org/r/98d6ca129bbe7dd0932d1fcfeb3cbb65f29a8d9d.1529706536.git.reinette.chatre@intel.com
Reinette Chatre [Fri, 22 Jun 2018 22:42:07 +0000 (15:42 -0700)]
x86/intel_rdt: Documentation for Cache Pseudo-Locking
Add description of Cache Pseudo-Locking feature, its interface, as well as
an example of its usage.
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: fenghua.yu@intel.com
Cc: tony.luck@intel.com
Cc: vikas.shivappa@linux.intel.com
Cc: gavin.hindman@intel.com
Cc: jithu.joseph@intel.com
Cc: dave.hansen@intel.com
Cc: hpa@zytor.com
Link: https://lkml.kernel.org/r/6e118c15d2c254a27b8891783505cd1bb94a2b10.1529706536.git.reinette.chatre@intel.com
Reinette Chatre [Fri, 22 Jun 2018 22:42:06 +0000 (15:42 -0700)]
x86/intel_rdt: Display resource groups' allocations' size in bytes
The schemata file displays the allocations associated with each domain of
each resource. The syntax of this file reflects the capacity bitmask (CBM)
of the actual allocation. In order to determine the actual size of an
allocation the user needs to dig through three different files to query the
variables needed to compute it (the cache size, the CBM length, and the
schemata).
Introduce a new file "size" associated with each resource group that will
mirror the schemata file syntax and display the size in bytes of each
allocation.
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: fenghua.yu@intel.com
Cc: tony.luck@intel.com
Cc: vikas.shivappa@linux.intel.com
Cc: gavin.hindman@intel.com
Cc: jithu.joseph@intel.com
Cc: dave.hansen@intel.com
Cc: hpa@zytor.com
Link: https://lkml.kernel.org/r/cc0058014c30adb88ca7d1a5abfadacbfb5edd0d.1529706536.git.reinette.chatre@intel.com
Reinette Chatre [Fri, 22 Jun 2018 22:42:05 +0000 (15:42 -0700)]
x86/intel_rdt: Introduce "bit_usage" to display cache allocations details
With cache regions now explicitly marked as "shareable" or "exclusive"
we would like to communicate to the user how portions of the cache
are used.
Introduce "bit_usage" that indicates for each resource
how portions of the cache are configured to be used.
To assist the user to distinguish whether the sharing is from software or
hardware we add the following annotation:
0 - currently unused
X - currently available for sharing and used by software and hardware
H - currently used by hardware only but available for software use
S - currently used and shareable by software only
E - currently used exclusively by one resource group
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: fenghua.yu@intel.com
Cc: tony.luck@intel.com
Cc: vikas.shivappa@linux.intel.com
Cc: gavin.hindman@intel.com
Cc: jithu.joseph@intel.com
Cc: dave.hansen@intel.com
Cc: hpa@zytor.com
Link: https://lkml.kernel.org/r/105d44c40e582c2b7e2dccf0ae247e5e61137d4b.1529706536.git.reinette.chatre@intel.com
Reinette Chatre [Fri, 22 Jun 2018 22:42:04 +0000 (15:42 -0700)]
x86/intel_rdt: Ensure requested schemata respects mode
When the administrator requests a change in a resource group's schemata
we have to ensure that the new schemata respects the current resource
group as well as the other active resource groups' schemata.
The new schemata is not allowed to overlap with the schemata of any
exclusive resource groups. Similarly, if the resource group being
changed is exclusive then its new schemata is not allowed to overlap
with any schemata of any other active resource group.
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: fenghua.yu@intel.com
Cc: tony.luck@intel.com
Cc: vikas.shivappa@linux.intel.com
Cc: gavin.hindman@intel.com
Cc: jithu.joseph@intel.com
Cc: dave.hansen@intel.com
Cc: hpa@zytor.com
Link: https://lkml.kernel.org/r/b0c05b21110d3040fff45f4c1d2cfda8dba3f207.1529706536.git.reinette.chatre@intel.com
Reinette Chatre [Fri, 22 Jun 2018 22:42:03 +0000 (15:42 -0700)]
x86/intel_rdt: Support flexible data to parsing callbacks
Each resource is associated with a configurable callback that should be
used to parse the information provided for the particular resource from
user space. In addition to the resource and domain pointers this callback
is provided with just the character buffer being parsed.
In support of flexible parsing the callback is modified to support a void
pointer as argument. This enables resources that need more data than just
the user provided data to pass its required data to the callback without
affecting the signatures for the callbacks of all the other resources.
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: fenghua.yu@intel.com
Cc: tony.luck@intel.com
Cc: vikas.shivappa@linux.intel.com
Cc: gavin.hindman@intel.com
Cc: jithu.joseph@intel.com
Cc: dave.hansen@intel.com
Cc: hpa@zytor.com
Link: https://lkml.kernel.org/r/34baacfced4d787d994ec7015e249e6c7e619053.1529706536.git.reinette.chatre@intel.com
Reinette Chatre [Fri, 22 Jun 2018 22:42:02 +0000 (15:42 -0700)]
x86/intel_rdt: Making CBM name and type more explicit
cbm_validate() receives a pointer to the variable that will be initialized
with a validated capacity bitmask. The pointer points to a variable of type
unsigned long that is immediately assigned to a variable of type u32 by the
caller on return from cbm_validate().
Let cbm_validate() initialize a variable of type u32 directly.
At this time also change tha variable name "data" within parse_cbm() to a
name more reflective of the content: "cbm_val". This frees up the generic
"data" to be used later when it is indeed used for a collection of input.
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: fenghua.yu@intel.com
Cc: tony.luck@intel.com
Cc: vikas.shivappa@linux.intel.com
Cc: gavin.hindman@intel.com
Cc: jithu.joseph@intel.com
Cc: dave.hansen@intel.com
Cc: hpa@zytor.com
Link: https://lkml.kernel.org/r/5e29cf0209ea2deac9beacd35cbe5239a50959fb.1529706536.git.reinette.chatre@intel.com
Reinette Chatre [Fri, 22 Jun 2018 22:42:01 +0000 (15:42 -0700)]
x86/intel_rdt: Enable setting of exclusive mode
The new "mode" file now accepts "exclusive" that means that the
allocations of this resource group cannot be shared.
Enable users to modify a resource group's mode to "exclusive". To
succeed it is required that there is no overlap between resource group's
current schemata and that of all the other active resource groups as
well as cache regions potentially used by other hardware entities.
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: fenghua.yu@intel.com
Cc: tony.luck@intel.com
Cc: vikas.shivappa@linux.intel.com
Cc: gavin.hindman@intel.com
Cc: jithu.joseph@intel.com
Cc: dave.hansen@intel.com
Cc: hpa@zytor.com
Link: https://lkml.kernel.org/r/83642cbba3c8c21db7fa6bb36fe7d385d3b275f2.1529706536.git.reinette.chatre@intel.com
Reinette Chatre [Fri, 22 Jun 2018 22:42:00 +0000 (15:42 -0700)]
x86/intel_rdt: Introduce new "exclusive" mode
At the moment all allocations are shareable. There is no way for a user to
designate that an allocation associated with a resource group cannot be
shared by another.
Introduce the new mode "exclusive". When a resource group is marked as such
it implies that no overlap is allowed between its allocation and that of
another resource group.
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: fenghua.yu@intel.com
Cc: tony.luck@intel.com
Cc: vikas.shivappa@linux.intel.com
Cc: gavin.hindman@intel.com
Cc: jithu.joseph@intel.com
Cc: dave.hansen@intel.com
Cc: hpa@zytor.com
Link: https://lkml.kernel.org/r/f6d24672a4280fe3b24cd2da9b5f50214439c1af.1529706536.git.reinette.chatre@intel.com
Reinette Chatre [Fri, 22 Jun 2018 22:41:59 +0000 (15:41 -0700)]
x86/intel_rdt: Initialize new resource group with sane defaults
Currently when a new resource group is created its allocations would be
those that belonged to the resource group to which its closid belonged
previously.
That is, we can encounter a case like:
mkdir newgroup
cat newgroup/schemata
L2:0=ff;1=ff
echo 'L2:0=0xf0;1=0xf0' > newgroup/schemata
cat newgroup/schemata
L2:0=0xf0;1=0xf0
rmdir newgroup
mkdir newnewgroup
cat newnewgroup/schemata
L2:0=0xf0;1=0xf0
When the new group is created it would be reasonable to expect its
allocations to be initialized with all regions that it can possibly use.
At this time these regions would be all that are shareable by other
resource groups as well as regions that are not currently used.
If the available cache region is found to be non-contiguous the
available region is adjusted to enforce validity.
When a new resource group is created the hardware is initialized with
these new default allocations.
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: fenghua.yu@intel.com
Cc: tony.luck@intel.com
Cc: vikas.shivappa@linux.intel.com
Cc: gavin.hindman@intel.com
Cc: jithu.joseph@intel.com
Cc: dave.hansen@intel.com
Cc: hpa@zytor.com
Link: https://lkml.kernel.org/r/c468ed79340b63024111978e01430bb9589d85c0.1529706536.git.reinette.chatre@intel.com
Reinette Chatre [Fri, 22 Jun 2018 22:41:58 +0000 (15:41 -0700)]
x86/intel_rdt: Make useful functions available internally
In support of the work done to enable resource groups to have different
modes some static functions need to be available for sharing amongst
all RDT components.
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: fenghua.yu@intel.com
Cc: tony.luck@intel.com
Cc: vikas.shivappa@linux.intel.com
Cc: gavin.hindman@intel.com
Cc: jithu.joseph@intel.com
Cc: dave.hansen@intel.com
Cc: hpa@zytor.com
Link: https://lkml.kernel.org/r/2af8fd6e937ae4fbdaa52dee1123823cb4993176.1529706536.git.reinette.chatre@intel.com
Reinette Chatre [Fri, 22 Jun 2018 22:41:57 +0000 (15:41 -0700)]
x86/intel_rdt: Introduce test to determine if closid is in use
During CAT feature discovery the capacity bitmasks (CBMs) associated
with all the classes of service are initialized to all ones, even if the
class of service is not in use. Introduce a test that can be used to
determine if a class of service is in use. This test enables code
interested in parsing the CBMs to know if its values are meaningful or
can be ignored.
Temporarily mark the function as unused to silence compile warnings
until it is used.
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: fenghua.yu@intel.com
Cc: tony.luck@intel.com
Cc: vikas.shivappa@linux.intel.com
Cc: gavin.hindman@intel.com
Cc: jithu.joseph@intel.com
Cc: dave.hansen@intel.com
Cc: hpa@zytor.com
Link: https://lkml.kernel.org/r/798f8d89cd9b12df492d48c14bdc8ee3b39b1c6f.1529706536.git.reinette.chatre@intel.com
Reinette Chatre [Fri, 22 Jun 2018 22:41:56 +0000 (15:41 -0700)]
x86/intel_rdt: Introduce resource group's mode resctrl file
A new resctrl file "mode" associated with each resource group is
introduced. This file will display the resource group's current mode and an
administrator can also use it to modify the resource group's mode.
Only shareable mode is currently supported.
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: fenghua.yu@intel.com
Cc: tony.luck@intel.com
Cc: vikas.shivappa@linux.intel.com
Cc: gavin.hindman@intel.com
Cc: jithu.joseph@intel.com
Cc: dave.hansen@intel.com
Cc: hpa@zytor.com
Link: https://lkml.kernel.org/r/20ab78fda26a8c8d98e18ec555f6a1f728948972.1529706536.git.reinette.chatre@intel.com
Reinette Chatre [Fri, 22 Jun 2018 22:41:55 +0000 (15:41 -0700)]
x86/intel_rdt: Associate mode with each RDT resource group
Each RDT resource group is associated with a mode that will reflect
the level of sharing of its allocations. The default, shareable, will be
associated with each resource group on creation since it is zero and
resource groups are created with kzalloc. The managing of the mode of a
resource group will follow. The default resource group always remain
though so ensure that it is reset to the default mode when the resctrl
filesystem is unmounted.
Also introduce a utility that can be used to determine the mode of a
resource group when it is searched for based on its class of service.
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: fenghua.yu@intel.com
Cc: tony.luck@intel.com
Cc: vikas.shivappa@linux.intel.com
Cc: gavin.hindman@intel.com
Cc: jithu.joseph@intel.com
Cc: dave.hansen@intel.com
Cc: hpa@zytor.com
Link: https://lkml.kernel.org/r/797e4e1de4e4fcdf5b5e0039354d6a28079e2015.1529706536.git.reinette.chatre@intel.com
Reinette Chatre [Fri, 22 Jun 2018 22:41:54 +0000 (15:41 -0700)]
x86/intel_rdt: Introduce RDT resource group mode
At this time there are no constraints on how bitmasks represented by
schemata can be associated with closids represented by resource groups. A
bitmask of one class of service can without any objections overlap with the
bitmask of another class of service.
The concept of "mode" is introduced in preparation for support of control
over whether cache regions can be shared between classes of service. At
this time the only mode reflects the current cache allocations where all
can potentially be shared.
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: fenghua.yu@intel.com
Cc: tony.luck@intel.com
Cc: vikas.shivappa@linux.intel.com
Cc: gavin.hindman@intel.com
Cc: jithu.joseph@intel.com
Cc: dave.hansen@intel.com
Cc: hpa@zytor.com
Link: https://lkml.kernel.org/r/87e88275597fbfa03ea9d41c1186bf012c831c01.1529706536.git.reinette.chatre@intel.com
Reinette Chatre [Fri, 22 Jun 2018 22:41:53 +0000 (15:41 -0700)]
x86/intel_rdt: Document new mode, size, and bit_usage
By default resource groups allow sharing of their cache allocations. There
is nothing that prevents a resource group from configuring a cache
allocation that overlaps with that of an existing resource group.
To enable resource groups to specify that their cache allocations cannot be
shared a resource group "mode" is introduced to support two possible modes:
"shareable" and "exclusive". A "shareable" resource group allows sharing of
its cache allocations, an "exclusive" resource group does not. A new
resctrl file "mode" associated with each resource group is used to
communicate its (the associated resource group's) mode setting and allow
the mode to be changed. The new "mode" file as well as two other resctrl
files, "bit_usage" and "size", are introduced in this series.
Add documentation for the three new resctrl files as well as one example
demonstrating their use.
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: fenghua.yu@intel.com
Cc: tony.luck@intel.com
Cc: vikas.shivappa@linux.intel.com
Cc: gavin.hindman@intel.com
Cc: jithu.joseph@intel.com
Cc: dave.hansen@intel.com
Cc: hpa@zytor.com
Link: https://lkml.kernel.org/r/f03a3059ec40ae719be6f3fba9f446bb055e0064.1529706536.git.reinette.chatre@intel.com
Reinette Chatre [Fri, 22 Jun 2018 22:41:52 +0000 (15:41 -0700)]
x86/intel_rdt: Provide pseudo-locking hooks within rdt_mount
Stephen Rothwell reported that the Cache Pseudo-Locking enabling and the
kernfs support for mounting with fs_context are conflicting.
In preparation for a conflict-free merge between the two repos some no-op
hooks are created within the RDT mount function being changed by the two
features. The goal is for this commit to be placed on a minimal no-rebase
branch to be consumed by both features.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Suggested-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: fenghua.yu@intel.com
Cc: tony.luck@intel.com
Cc: vikas.shivappa@linux.intel.com
Cc: gavin.hindman@intel.com
Cc: jithu.joseph@intel.com
Cc: dave.hansen@intel.com
Cc: hpa@zytor.com
Cc: David Howells <dhowells@redhat.com>
Link: https://lkml.kernel.org/r/410697ead08978bd12111c0afc4ce9e7bd71a5fe.1529706536.git.reinette.chatre@intel.com
Linus Torvalds [Sat, 16 Jun 2018 23:04:49 +0000 (08:04 +0900)]
Linux 4.18-rc1
Linus Torvalds [Sat, 16 Jun 2018 20:37:55 +0000 (05:37 +0900)]
Merge tag 'for-linus-
20180616' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
"A collection of fixes that should go into -rc1. This contains:
- bsg_open vs bsg_unregister race fix (Anatoliy)
- NVMe pull request from Christoph, with fixes for regressions in
this window, FC connect/reconnect path code unification, and a
trace point addition.
- timeout fix (Christoph)
- remove a few unused functions (Christoph)
- blk-mq tag_set reinit fix (Roman)"
* tag 'for-linus-
20180616' of git://git.kernel.dk/linux-block:
bsg: fix race of bsg_open and bsg_unregister
block: remov blk_queue_invalidate_tags
nvme-fabrics: fix and refine state checks in __nvmf_check_ready
nvme-fabrics: handle the admin-only case properly in nvmf_check_ready
nvme-fabrics: refactor queue ready check
blk-mq: remove blk_mq_tagset_iter
nvme: remove nvme_reinit_tagset
nvme-fc: fix nulling of queue data on reconnect
nvme-fc: remove reinit_request routine
blk-mq: don't time out requests again that are in the timeout handler
nvme-fc: change controllers first connect to use reconnect path
nvme: don't rely on the changed namespace list log
nvmet: free smart-log buffer after use
nvme-rdma: fix error flow during mapping request data
nvme: add bio remapping tracepoint
nvme: fix NULL pointer dereference in nvme_init_subsystem
blk-mq: reinit q->tag_set_list entry only after grace period
Linus Torvalds [Sat, 16 Jun 2018 20:25:18 +0000 (05:25 +0900)]
Merge tag 'docs-broken-links' of git://linuxtv.org/mchehab/experimental
Pull documentation fixes from Mauro Carvalho Chehab:
"This solves a series of broken links for files under Documentation,
and improves a script meant to detect such broken links (see
scripts/documentation-file-ref-check).
The changes on this series are:
- can.rst: fix a footnote reference;
- crypto_engine.rst: Fix two parsing warnings;
- Fix a lot of broken references to Documentation/*;
- improve the scripts/documentation-file-ref-check script, in order
to help detecting/fixing broken references, preventing
false-positives.
After this patch series, only 33 broken references to doc files are
detected by scripts/documentation-file-ref-check"
* tag 'docs-broken-links' of git://linuxtv.org/mchehab/experimental: (26 commits)
fix a series of Documentation/ broken file name references
Documentation: rstFlatTable.py: fix a broken reference
ABI: sysfs-devices-system-cpu: remove a broken reference
devicetree: fix a series of wrong file references
devicetree: fix name of pinctrl-bindings.txt
devicetree: fix some bindings file names
MAINTAINERS: fix location of DT npcm files
MAINTAINERS: fix location of some display DT bindings
kernel-parameters.txt: fix pointers to sound parameters
bindings: nvmem/zii: Fix location of nvmem.txt
docs: Fix more broken references
scripts/documentation-file-ref-check: check tools/*/Documentation
scripts/documentation-file-ref-check: get rid of false-positives
scripts/documentation-file-ref-check: hint: dash or underline
scripts/documentation-file-ref-check: add a fix logic for DT
scripts/documentation-file-ref-check: accept more wildcards at filenames
scripts/documentation-file-ref-check: fix help message
media: max2175: fix location of driver's companion documentation
media: v4l: fix broken video4linux docs locations
media: dvb: point to the location of the old README.dvb-usb file
...
Linus Torvalds [Sat, 16 Jun 2018 20:06:18 +0000 (05:06 +0900)]
Merge tag 'fsnotify_for_v4.18-rc1' of git://git./linux/kernel/git/jack/linux-fs
Pull fsnotify updates from Jan Kara:
"fsnotify cleanups unifying handling of different watch types.
This is the shortened fsnotify series from Amir with the last five
patches pulled out. Amir has modified those patches to not change
struct inode but obviously it's too late for those to go into this
merge window"
* tag 'fsnotify_for_v4.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
fsnotify: add fsnotify_add_inode_mark() wrappers
fanotify: generalize fanotify_should_send_event()
fsnotify: generalize send_to_group()
fsnotify: generalize iteration of marks by object type
fsnotify: introduce marks iteration helpers
fsnotify: remove redundant arguments to handle_event()
fsnotify: use type id to identify connector object type
Linus Torvalds [Sat, 16 Jun 2018 20:00:24 +0000 (05:00 +0900)]
Merge tag 'fbdev-v4.18' of git://github.com/bzolnier/linux
Pull fbdev updates from Bartlomiej Zolnierkiewicz:
"There is nothing really major here, few small fixes, some cleanups and
dead drivers removal:
- mark omapfb drivers as orphans in MAINTAINERS file (Tomi Valkeinen)
- add missing module license tags to omap/omapfb driver (Arnd
Bergmann)
- add missing GPIOLIB dependendy to omap2/omapfb driver (Arnd
Bergmann)
- convert savagefb, aty128fb & radeonfb drivers to use msleep & co.
(Jia-Ju Bai)
- allow COMPILE_TEST build for viafb driver (media part was reviewed
by media subsystem Maintainer)
- remove unused MERAM support from sh_mobile_lcdcfb and shmob-drm
drivers (drm parts were acked by shmob-drm driver Maintainer)
- remove unused auo_k190xfb drivers
- misc cleanups (Souptick Joarder, Wolfram Sang, Markus Elfring, Andy
Shevchenko, Colin Ian King)"
* tag 'fbdev-v4.18' of git://github.com/bzolnier/linux: (26 commits)
fb_omap2: add gpiolib dependency
video/omap: add module license tags
MAINTAINERS: make omapfb orphan
video: fbdev: pxafb: match_string() conversion fixup
video: fbdev: nvidia: fix spelling mistake: "scaleing" -> "scaling"
video: fbdev: fix spelling mistake: "frambuffer" -> "framebuffer"
video: fbdev: pxafb: Convert to use match_string() helper
video: fbdev: via: allow COMPILE_TEST build
video: fbdev: remove unused sh_mobile_meram driver
drm: shmobile: remove unused MERAM support
video: fbdev: sh_mobile_lcdcfb: remove unused MERAM support
video: fbdev: remove unused auo_k190xfb drivers
video: omap: Improve a size determination in omapfb_do_probe()
video: sm501fb: Improve a size determination in sm501fb_probe()
video: fbdev-MMP: Improve a size determination in path_init()
video: fbdev-MMP: Delete an error message for a failed memory allocation in two functions
video: auo_k190x: Delete an error message for a failed memory allocation in auok190x_common_probe()
video: sh_mobile_lcdcfb: Delete an error message for a failed memory allocation in two functions
video: sh_mobile_meram: Delete an error message for a failed memory allocation in sh_mobile_meram_probe()
video: fbdev: sh_mobile_meram: Drop SUPERH platform dependency
...
Linus Torvalds [Sat, 16 Jun 2018 07:32:04 +0000 (16:32 +0900)]
Merge branch 'afs-proc' of git://git./linux/kernel/git/viro/vfs
Pull AFS updates from Al Viro:
"Assorted AFS stuff - ended up in vfs.git since most of that consists
of David's AFS-related followups to Christoph's procfs series"
* 'afs-proc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
afs: Optimise callback breaking by not repeating volume lookup
afs: Display manually added cells in dynamic root mount
afs: Enable IPv6 DNS lookups
afs: Show all of a server's addresses in /proc/fs/afs/servers
afs: Handle CONFIG_PROC_FS=n
proc: Make inline name size calculation automatic
afs: Implement network namespacing
afs: Mark afs_net::ws_cell as __rcu and set using rcu functions
afs: Fix a Sparse warning in xdr_decode_AFSFetchStatus()
proc: Add a way to make network proc files writable
afs: Rearrange fs/afs/proc.c to remove remaining predeclarations.
afs: Rearrange fs/afs/proc.c to move the show routines up
afs: Rearrange fs/afs/proc.c by moving fops and open functions down
afs: Move /proc management functions to the end of the file
Linus Torvalds [Sat, 16 Jun 2018 07:21:50 +0000 (16:21 +0900)]
Merge branch 'work.compat' of git://git./linux/kernel/git/viro/vfs
Pull compat updates from Al Viro:
"Some biarch patches - getting rid of assorted (mis)uses of
compat_alloc_user_space().
Not much in that area this cycle..."
* 'work.compat' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
orangefs: simplify compat ioctl handling
signalfd: lift sigmask copyin and size checks to callers of do_signalfd4()
vmsplice(): lift importing iovec into vmsplice(2) and compat counterpart
Linus Torvalds [Sat, 16 Jun 2018 07:11:40 +0000 (16:11 +0900)]
Merge branch 'work.aio' of git://git./linux/kernel/git/viro/vfs
Pull aio fixes from Al Viro:
"Assorted AIO followups and fixes"
* 'work.aio' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
eventpoll: switch to ->poll_mask
aio: only return events requested in poll_mask() for IOCB_CMD_POLL
eventfd: only return events requested in poll_mask()
aio: mark __aio_sigset::sigmask const
Linus Torvalds [Fri, 15 Jun 2018 22:39:34 +0000 (07:39 +0900)]
Merge git://git./linux/kernel/git/davem/net
Pull networking fixes from David Miller:
1) Various netfilter fixlets from Pablo and the netfilter team.
2) Fix regression in IPVS caused by lack of PMTU exceptions on local
routes in ipv6, from Julian Anastasov.
3) Check pskb_trim_rcsum for failure in DSA, from Zhouyang Jia.
4) Don't crash on poll in TLS, from Daniel Borkmann.
5) Revert SO_REUSE{ADDR,PORT} change, it regresses various things
including Avahi mDNS. From Bart Van Assche.
6) Missing of_node_put in qcom/emac driver, from Yue Haibing.
7) We lack checking of the TCP checking in one special case during SYN
receive, from Frank van der Linden.
8) Fix module init error paths of mac80211 hwsim, from Johannes Berg.
9) Handle 802.1ad properly in stmmac driver, from Elad Nachman.
10) Must grab HW caps before doing quirk checks in stmmac driver, from
Jose Abreu.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (81 commits)
net: stmmac: Run HWIF Quirks after getting HW caps
neighbour: skip NTF_EXT_LEARNED entries during forced gc
net: cxgb3: add error handling for sysfs_create_group
tls: fix waitall behavior in tls_sw_recvmsg
tls: fix use-after-free in tls_push_record
l2tp: filter out non-PPP sessions in pppol2tp_tunnel_ioctl()
l2tp: reject creation of non-PPP sessions on L2TPv2 tunnels
mlxsw: spectrum_switchdev: Fix port_vlan refcounting
mlxsw: spectrum_router: Align with new route replace logic
mlxsw: spectrum_router: Allow appending to dev-only routes
ipv6: Only emit append events for appended routes
stmmac: added support for 802.1ad vlan stripping
cfg80211: fix rcu in cfg80211_unregister_wdev
mac80211: Move up init of TXQs
mac80211_hwsim: fix module init error paths
cfg80211: initialize sinfo in cfg80211_get_station
nl80211: fix some kernel doc tag mistakes
hv_netvsc: Fix the variable sizes in ipsecv2 and rsc offload
rds: avoid unenecessary cong_update in loop transport
l2tp: clean up stale tunnel or session in pppol2tp_connect's error path
...
Linus Torvalds [Fri, 15 Jun 2018 22:36:39 +0000 (07:36 +0900)]
Merge tag 'modules-for-v4.18' of git://git./linux/kernel/git/jeyu/linux
Pull module updates from Jessica Yu:
"Minor code cleanup and also allow sig_enforce param to be shown in
sysfs with CONFIG_MODULE_SIG_FORCE"
* tag 'modules-for-v4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux:
module: Allow to always show the status of modsign
module: Do not access sig_enforce directly
Linus Torvalds [Fri, 15 Jun 2018 21:50:51 +0000 (06:50 +0900)]
Merge branch 'for-linus-4.18-rc1' of git://git./linux/kernel/git/rw/uml
Pull uml updates from Richard Weinberger:
"Minor updates for UML:
- fixes for our new vector network driver by Anton
- initcall cleanup by Alexander
- We have a new mailinglist, sourceforge.net sucks"
* 'for-linus-4.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
um: Fix raw interface options
um: Fix initialization of vector queues
um: remove uml initcalls
um: Update mailing list address
Linus Torvalds [Fri, 15 Jun 2018 21:42:43 +0000 (06:42 +0900)]
Merge tag 'riscv-for-linus-4.18-merge_window' of git://git./linux/kernel/git/palmer/riscv-linux
Pull RISC-V updates from Palmer Dabbelt:
"This contains some small RISC-V updates I'd like to target for 4.18.
They are all fairly small this time. Here's a short summary, there's
more info in the commits/merges:
- a fix to __clear_user to respect the passed arguments.
- enough support for the perf subsystem to work with RISC-V's ISA
defined performance counters.
- support for sparse and cleanups suggested by it.
- support for R_RISCV_32 (a relocation, not the 32-bit ISA).
- some MAINTAINERS cleanups.
- the addition of CONFIG_HVC_RISCV_SBI to our defconfig, as it's
always present.
I've given these a simple build+boot test"
* tag 'riscv-for-linus-4.18-merge_window' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux:
RISC-V: Add CONFIG_HVC_RISCV_SBI=y to defconfig
RISC-V: Handle R_RISCV_32 in modules
riscv/ftrace: Export _mcount when DYNAMIC_FTRACE isn't set
riscv: add riscv-specific predefines to CHECKFLAGS
riscv: split the declaration of __copy_user
riscv: no __user for probe_kernel_address()
riscv: use NULL instead of a plain 0
perf: riscv: Add Document for Future Porting Guide
perf: riscv: preliminary RISC-V support
MAINTAINERS: Update Albert's email, he's back at Berkeley
MAINTAINERS: Add myself as a maintainer for SiFive's drivers
riscv: Fix the bug in memory access fixup code
Linus Torvalds [Fri, 15 Jun 2018 21:37:04 +0000 (06:37 +0900)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm
Pull more kvm updates from Paolo Bonzini:
"Mostly the PPC part of the release, but also switching to Arnd's fix
for the hyperv config issue and a typo fix.
Main PPC changes:
- reimplement the MMIO instruction emulation
- transactional memory support for PR KVM
- improve radix page table handling"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (63 commits)
KVM: x86: VMX: redo fix for link error without CONFIG_HYPERV
KVM: x86: fix typo at kvm_arch_hardware_setup comment
KVM: PPC: Book3S PR: Fix failure status setting in tabort. emulation
KVM: PPC: Book3S PR: Enable use on POWER9 bare-metal hosts in HPT mode
KVM: PPC: Book3S PR: Don't let PAPR guest set MSR hypervisor bit
KVM: PPC: Book3S PR: Fix failure status setting in treclaim. emulation
KVM: PPC: Book3S PR: Fix MSR setting when delivering interrupts
KVM: PPC: Book3S PR: Handle additional interrupt types
KVM: PPC: Book3S PR: Enable kvmppc_get/set_one_reg_pr() for HTM registers
KVM: PPC: Book3S: Remove load/put vcpu for KVM_GET_REGS/KVM_SET_REGS
KVM: PPC: Remove load/put vcpu for KVM_GET/SET_ONE_REG ioctl
KVM: PPC: Move vcpu_load/vcpu_put down to each ioctl case in kvm_arch_vcpu_ioctl
KVM: PPC: Book3S PR: Enable HTM for PR KVM for KVM_CHECK_EXTENSION ioctl
KVM: PPC: Book3S PR: Support TAR handling for PR KVM HTM
KVM: PPC: Book3S PR: Add guard code to prevent returning to guest with PR=0 and Transactional state
KVM: PPC: Book3S PR: Add emulation for tabort. in privileged state
KVM: PPC: Book3S PR: Add emulation for trechkpt.
KVM: PPC: Book3S PR: Add emulation for treclaim.
KVM: PPC: Book3S PR: Restore NV regs after emulating mfspr from TM SPRs
KVM: PPC: Book3S PR: Always fail transactions in guest privileged state
...
Linus Torvalds [Fri, 15 Jun 2018 21:35:02 +0000 (06:35 +0900)]
Merge tag 'for_linus' of git://git./linux/kernel/git/mst/vhost
Pull virtio updates from Michael Tsirkin:
"virtio, vhost: features, fixes
- PCI virtual function support for virtio
- DMA barriers for virtio strong barriers
- bugfixes"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
virtio: update the comments for transport features
virtio_pci: support enabling VFs
vhost: fix info leak due to uninitialized memory
virtio_ring: switch to dma_XX barriers for rpmsg
Mauro Carvalho Chehab [Thu, 14 Jun 2018 15:34:32 +0000 (12:34 -0300)]
fix a series of Documentation/ broken file name references
As files move around, their previous links break. Fix the
references for them.
Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Jonathan Corbet <corbet@lwn.net>
Mauro Carvalho Chehab [Thu, 14 Jun 2018 15:33:06 +0000 (12:33 -0300)]
Documentation: rstFlatTable.py: fix a broken reference
The old HOWTO was removed a long time ago. The flat table
version is not metioned elsewhere, so just get rid of the
text.
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Jonathan Corbet <corbet@lwn.net>
Mauro Carvalho Chehab [Thu, 14 Jun 2018 15:32:05 +0000 (12:32 -0300)]
ABI: sysfs-devices-system-cpu: remove a broken reference
This file doesn't exist anymore:
Documentation/cpu-freq/user-guide.txt
As the ABI already points to Documentation/cpu-freq, just
remove the broken link and the associated text.
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Jonathan Corbet <corbet@lwn.net>
Mauro Carvalho Chehab [Thu, 14 Jun 2018 15:30:30 +0000 (12:30 -0300)]
devicetree: fix a series of wrong file references
As files got renamed, their references broke.
Manually fix a series of broken refs at the DT bindings.
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Jonathan Corbet <corbet@lwn.net>
Mauro Carvalho Chehab [Thu, 14 Jun 2018 15:24:41 +0000 (12:24 -0300)]
devicetree: fix name of pinctrl-bindings.txt
Rename:
pinctrl-binding.txt -> pinctrl-bindings.txt
In order to match the current name of this file.
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Jonathan Corbet <corbet@lwn.net>
Mauro Carvalho Chehab [Thu, 14 Jun 2018 12:39:01 +0000 (09:39 -0300)]
devicetree: fix some bindings file names
There were some file movements that changed the location for
some DT bindings. Fix them with:
scripts/documentation-file-ref-check --fix
After manually checking if the new file makes sense.
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Jonathan Corbet <corbet@lwn.net>
Mauro Carvalho Chehab [Thu, 14 Jun 2018 11:59:37 +0000 (08:59 -0300)]
MAINTAINERS: fix location of DT npcm files
The specified locations are not right. Fix the wildcard logic
to point to the correct directories.
Without that, get-maintainer won't get things right:
$ ./scripts/get_maintainer.pl --no-git-fallback --no-r --no-n --no-l -f Documentation/devicetree/bindings/arm/cpu-enable-method/nuvoton,npcm750-smp
robh+dt@kernel.org (maintainer:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS)
mark.rutland@arm.com (maintainer:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS)
After the patch, it will properly point to NPCM arch maintainers:
$ ./scripts/get_maintainer.pl --no-git-fallback --no-r --no-n --no-l -f Documentation/devicetree/bindings/arm/cpu-enable-method/nuvoton,npcm750-smp
avifishman70@gmail.com (supporter:ARM/NUVOTON NPCM ARCHITECTURE)
tmaimon77@gmail.com (supporter:ARM/NUVOTON NPCM ARCHITECTURE)
robh+dt@kernel.org (maintainer:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS)
mark.rutland@arm.com (maintainer:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS)
Cc: Avi Fishman <avifishman70@gmail.com>
Cc: Tomer Maimon <tmaimon77@gmail.com>
Cc: Patrick Venture <venture@google.com>
Cc: Nancy Yuen <yuenn@google.com>
Cc: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Jonathan Corbet <corbet@lwn.net>
Mauro Carvalho Chehab [Thu, 14 Jun 2018 11:01:00 +0000 (08:01 -0300)]
MAINTAINERS: fix location of some display DT bindings
Those files got a manufacturer's name prepended and were moved around.
Adjust their references accordingly.
Also, due those movements, Documentation/devicetree/bindings/video
doesn't exist anymore.
Cc: David Airlie <airlied@linux.ie>
Cc: David Lechner <david@lechnology.com>
Cc: Peter Senna Tschudin <peter.senna@collabora.com>
Cc: Martin Donnelly <martin.donnelly@ge.com>
Cc: Martyn Welch <martyn.welch@collabora.co.uk>
Cc: Stefan Agner <stefan@agner.ch>
Cc: Alison Wang <alison.wang@nxp.com>
Cc: Eric Anholt <eric@anholt.net>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Jonathan Corbet <corbet@lwn.net>
Mauro Carvalho Chehab [Thu, 14 Jun 2018 10:43:07 +0000 (07:43 -0300)]
kernel-parameters.txt: fix pointers to sound parameters
The alsa parameters file was renamed to alsa-configuration.rst.
With regards to OSS, it got retired as a hole by at changeset
727dede0ba8a ("sound: Retire OSS"). So, it doesn't make sense
to keep mentioning it at kernel-parameters.txt.
Fixes: 727dede0ba8a ("sound: Retire OSS")
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Jonathan Corbet <corbet@lwn.net>
Mauro Carvalho Chehab [Thu, 14 Jun 2018 10:18:45 +0000 (07:18 -0300)]
bindings: nvmem/zii: Fix location of nvmem.txt
The location pointed there is missing "bindings/" on its path.
Acked-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Jonathan Corbet <corbet@lwn.net>
Mauro Carvalho Chehab [Tue, 8 May 2018 18:14:57 +0000 (15:14 -0300)]
docs: Fix more broken references
As we move stuff around, some doc references are broken. Fix some of
them via this script:
./scripts/documentation-file-ref-check --fix
Manually checked that produced results are valid.
Acked-by: Matthias Brugger <matthias.bgg@gmail.com>
Acked-by: Takashi Iwai <tiwai@suse.de>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Jonathan Corbet <corbet@lwn.net>
Mauro Carvalho Chehab [Thu, 14 Jun 2018 14:06:08 +0000 (11:06 -0300)]
scripts/documentation-file-ref-check: check tools/*/Documentation
Some files, like tools/memory-model/README has references to
a Documentation file that is locale to it. Handle references
that are relative to them too.
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Jonathan Corbet <corbet@lwn.net>
Mauro Carvalho Chehab [Thu, 14 Jun 2018 13:47:29 +0000 (10:47 -0300)]
scripts/documentation-file-ref-check: get rid of false-positives
Now that the number of broken refs are smaller, improve the logic
that gets rid of false-positives.
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Jonathan Corbet <corbet@lwn.net>
Mauro Carvalho Chehab [Thu, 14 Jun 2018 13:14:54 +0000 (10:14 -0300)]
scripts/documentation-file-ref-check: hint: dash or underline
Sometimes, people use dash instead of underline or vice-versa.
Try to autocorrect it.
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Jonathan Corbet <corbet@lwn.net>
Mauro Carvalho Chehab [Thu, 14 Jun 2018 12:36:35 +0000 (09:36 -0300)]
scripts/documentation-file-ref-check: add a fix logic for DT
There are several links broken due to DT file movements. Add
a hint logic to seek for those changes.
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Jonathan Corbet <corbet@lwn.net>
Mauro Carvalho Chehab [Thu, 14 Jun 2018 10:48:22 +0000 (07:48 -0300)]
scripts/documentation-file-ref-check: accept more wildcards at filenames
at MAINTAINERS, some filename paths use '?' and things like [7,9].
So, accept more wildcards, in order to avoid false-positives.
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Jonathan Corbet <corbet@lwn.net>
Mauro Carvalho Chehab [Thu, 14 Jun 2018 10:11:02 +0000 (07:11 -0300)]
scripts/documentation-file-ref-check: fix help message
The name of the --fix option was renamed, but it was not
changed at the quick help message.
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Jonathan Corbet <corbet@lwn.net>
Mauro Carvalho Chehab [Thu, 14 Jun 2018 10:34:51 +0000 (07:34 -0300)]
media: max2175: fix location of driver's companion documentation
There's a missing ".rst" at the doc's file name.
Acked-by: Ramesh Shanmugasundaram <Ramesh.shanmugasundaram@bp.renesas.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Jonathan Corbet <corbet@lwn.net>
Mauro Carvalho Chehab [Tue, 8 May 2018 22:41:44 +0000 (19:41 -0300)]
media: v4l: fix broken video4linux docs locations
There are several places pointing to old documentation files:
Documentation/video4linux/API.html
Documentation/video4linux/bttv/
Documentation/video4linux/cx2341x/fw-encoder-api.txt
Documentation/video4linux/m5602.txt
Documentation/video4linux/v4l2-framework.txt
Documentation/video4linux/videobuf
Documentation/video4linux/Zoran
Make them point to the new location where available, removing
otherwise.
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Jonathan Corbet <corbet@lwn.net>
Mauro Carvalho Chehab [Tue, 8 May 2018 21:29:30 +0000 (18:29 -0300)]
media: dvb: point to the location of the old README.dvb-usb file
This file got renamed, but the references still point to the
old place.
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Jonathan Corbet <corbet@lwn.net>
Mauro Carvalho Chehab [Tue, 8 May 2018 21:10:05 +0000 (18:10 -0300)]
media: dvb: fix location of get_dvb_firmware script
This script was moved out of Documentation/dvb, but the
links weren't updated.
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Jonathan Corbet <corbet@lwn.net>
Mauro Carvalho Chehab [Tue, 8 May 2018 18:14:57 +0000 (15:14 -0300)]
docs: Fix some broken references
As we move stuff around, some doc references are broken. Fix some of
them via this script:
./scripts/documentation-file-ref-check --fix
Manually checked if the produced result is valid, removing a few
false-positives.
Acked-by: Takashi Iwai <tiwai@suse.de>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Stephen Boyd <sboyd@kernel.org>
Acked-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com>
Acked-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Reviewed-by: Coly Li <colyli@suse.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Jonathan Corbet <corbet@lwn.net>
Mauro Carvalho Chehab [Tue, 8 May 2018 21:54:36 +0000 (18:54 -0300)]
docs: fix broken references with multiple hints
The script:
./scripts/documentation-file-ref-check --fix
Gives multiple hints for broken references on some files.
Manually use the one that applies for some files.
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Acked-by: James Morris <james.morris@microsoft.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Jonathan Corbet <corbet@lwn.net>
Jose Abreu [Fri, 15 Jun 2018 15:17:27 +0000 (16:17 +0100)]
net: stmmac: Run HWIF Quirks after getting HW caps
Currently we were running HWIF quirks before getting HW capabilities.
This is not right because some HWIF callbacks depend on HW caps.
Lets save the quirks callback and use it in a later stage.
This fixes Altera socfpga.
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Fixes: 5f0456b43140 ("net: stmmac: Implement logic to automatically select HW Interface")
Reported-by: Dinh Nguyen <dinh.linux@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Vitor Soares <soares@synopsys.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Cc: Dinh Nguyen <dinh.linux@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Roopa Prabhu [Wed, 13 Jun 2018 04:26:10 +0000 (21:26 -0700)]
neighbour: skip NTF_EXT_LEARNED entries during forced gc
Commit
9ce33e46531d ("neighbour: support for NTF_EXT_LEARNED flag")
added support for NTF_EXT_LEARNED for neighbour entries.
NTF_EXT_LEARNED entries are neigh entries managed by control
plane (eg: Ethernet VPN implementation in FRR routing suite).
Periodic gc already excludes these entries. This patch extends
it to forced gc which the earlier patch missed.
Fixes: 9ce33e46531d ("neighbour: support for NTF_EXT_LEARNED flag")
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Zhouyang Jia [Fri, 15 Jun 2018 03:06:17 +0000 (11:06 +0800)]
net: cxgb3: add error handling for sysfs_create_group
When sysfs_create_group fails, the lack of error-handling code may
cause unexpected results.
This patch adds error-handling code after calling sysfs_create_group.
Signed-off-by: Zhouyang Jia <jiazhouyang09@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 15 Jun 2018 16:14:31 +0000 (09:14 -0700)]
Merge branch 'tls-fixes'
Daniel Borkmann says:
====================
Two tls fixes
First one is syzkaller trigered uaf and second one noticed
while writing test code with tls ulp. For details please see
individual patches.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Fri, 15 Jun 2018 01:07:46 +0000 (03:07 +0200)]
tls: fix waitall behavior in tls_sw_recvmsg
Current behavior in tls_sw_recvmsg() is to wait for incoming tls
messages and copy up to exactly len bytes of data that the user
provided. This is problematic in the sense that i) if no packet
is currently queued in strparser we keep waiting until one has been
processed and pushed into tls receive layer for tls_wait_data() to
wake up and push the decrypted bits to user space. Given after
tls decryption, we're back at streaming data, use sock_rcvlowat()
hint from tcp socket instead. Retain current behavior with MSG_WAITALL
flag and otherwise use the hint target for breaking the loop and
returning to application. This is done if currently no ctx->recv_pkt
is ready, otherwise continue to process it from our strparser
backlog.
Fixes: c46234ebb4d1 ("tls: RX path for ktls")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Fri, 15 Jun 2018 01:07:45 +0000 (03:07 +0200)]
tls: fix use-after-free in tls_push_record
syzkaller managed to trigger a use-after-free in tls like the
following:
BUG: KASAN: use-after-free in tls_push_record.constprop.15+0x6a2/0x810 [tls]
Write of size 1 at addr
ffff88037aa08000 by task a.out/2317
CPU: 3 PID: 2317 Comm: a.out Not tainted 4.17.0+ #144
Hardware name: LENOVO 20FBCTO1WW/20FBCTO1WW, BIOS N1FET47W (1.21 ) 11/28/2016
Call Trace:
dump_stack+0x71/0xab
print_address_description+0x6a/0x280
kasan_report+0x258/0x380
? tls_push_record.constprop.15+0x6a2/0x810 [tls]
tls_push_record.constprop.15+0x6a2/0x810 [tls]
tls_sw_push_pending_record+0x2e/0x40 [tls]
tls_sk_proto_close+0x3fe/0x710 [tls]
? tcp_check_oom+0x4c0/0x4c0
? tls_write_space+0x260/0x260 [tls]
? kmem_cache_free+0x88/0x1f0
inet_release+0xd6/0x1b0
__sock_release+0xc0/0x240
sock_close+0x11/0x20
__fput+0x22d/0x660
task_work_run+0x114/0x1a0
do_exit+0x71a/0x2780
? mm_update_next_owner+0x650/0x650
? handle_mm_fault+0x2f5/0x5f0
? __do_page_fault+0x44f/0xa50
? mm_fault_error+0x2d0/0x2d0
do_group_exit+0xde/0x300
__x64_sys_exit_group+0x3a/0x50
do_syscall_64+0x9a/0x300
? page_fault+0x8/0x30
entry_SYSCALL_64_after_hwframe+0x44/0xa9
This happened through fault injection where aead_req allocation in
tls_do_encryption() eventually failed and we returned -ENOMEM from
the function. Turns out that the use-after-free is triggered from
tls_sw_sendmsg() in the second tls_push_record(). The error then
triggers a jump to waiting for memory in sk_stream_wait_memory()
resp. returning immediately in case of MSG_DONTWAIT. What follows is
the trim_both_sgl(sk, orig_size), which drops elements from the sg
list added via tls_sw_sendmsg(). Now the use-after-free gets triggered
when the socket is being closed, where tls_sk_proto_close() callback
is invoked. The tls_complete_pending_work() will figure that there's
a pending closed tls record to be flushed and thus calls into the
tls_push_pending_closed_record() from there. ctx->push_pending_record()
is called from the latter, which is the tls_sw_push_pending_record()
from sw path. This again calls into tls_push_record(). And here the
tls_fill_prepend() will panic since the buffer address has been freed
earlier via trim_both_sgl(). One way to fix it is to move the aead
request allocation out of tls_do_encryption() early into tls_push_record().
This means we don't prep the tls header and advance state to the
TLS_PENDING_CLOSED_RECORD before allocation which could potentially
fail happened. That fixes the issue on my side.
Fixes: 3c4d7559159b ("tls: kernel TLS support")
Reported-by: syzbot+5c74af81c547738e1684@syzkaller.appspotmail.com
Reported-by: syzbot+709f2810a6a05f11d4d3@syzkaller.appspotmail.com
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 15 Jun 2018 16:12:37 +0000 (09:12 -0700)]
Merge branch 'l2tp-l2tp_ppp-must-ignore-non-PPP-sessions'
Guillaume Nault says:
====================
l2tp: l2tp_ppp must ignore non-PPP sessions
The original L2TP code was written for version 2 of the protocol, which
could only carry PPP sessions. Then L2TPv3 generalised the protocol so that
it could transport different kinds of pseudo-wires. But parts of the
l2tp_ppp module still break in presence of non-PPP sessions.
Assuming L2TPv2 tunnels can only transport PPP sessions is right, but
l2tp_netlink failed to ensure that (fixed in patch 1).
When retrieving a session from an arbitrary tunnel, l2tp_ppp needs to
filter out non-PPP sessions (last occurrence fixed in patch 2).
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Guillaume Nault [Fri, 15 Jun 2018 13:39:19 +0000 (15:39 +0200)]
l2tp: filter out non-PPP sessions in pppol2tp_tunnel_ioctl()
pppol2tp_tunnel_ioctl() can act on an L2TPv3 tunnel, in which case
'session' may be an Ethernet pseudo-wire.
However, pppol2tp_session_ioctl() expects a PPP pseudo-wire, as it
assumes l2tp_session_priv() points to a pppol2tp_session structure. For
an Ethernet pseudo-wire l2tp_session_priv() points to an l2tp_eth_sess
structure instead, making pppol2tp_session_ioctl() access invalid
memory.
Fixes: d9e31d17ceba ("l2tp: Add L2TP ethernet pseudowire support")
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Guillaume Nault [Fri, 15 Jun 2018 13:39:17 +0000 (15:39 +0200)]
l2tp: reject creation of non-PPP sessions on L2TPv2 tunnels
The /proc/net/pppol2tp handlers (pppol2tp_seq_*()) iterate over all
L2TPv2 tunnels, and rightfully expect that only PPP sessions can be
found there. However, l2tp_netlink accepts creating Ethernet sessions
regardless of the underlying tunnel version.
This confuses pppol2tp_seq_session_show(), which expects that
l2tp_session_priv() returns a pppol2tp_session structure. When the
session is an Ethernet pseudo-wire, a struct l2tp_eth_sess is returned
instead. This leads to invalid memory access when
pppol2tp_session_get_sock() later tries to dereference ps->sk.
Fixes: d9e31d17ceba ("l2tp: Add L2TP ethernet pseudowire support")
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 15 Jun 2018 16:11:17 +0000 (09:11 -0700)]
Merge branch 'mlxsw-IPv6-and-reference-counting-fixes'
Ido Schimmel says:
====================
mlxsw: IPv6 and reference counting fixes
The first three patches fix a mismatch between the new IPv6 behavior
introduced in commit
f34436a43092 ("net/ipv6: Simplify route replace and
appending into multipath route") and mlxsw. The patches allow the driver
to support multipathing in IPv6 overlays with GRE tunnel devices. A
selftest will be submitted when net-next opens.
The last patch fixes a reference count problem of the port_vlan struct.
I plan to simplify the code in net-next, so that reference counting is
not necessary anymore.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Petr Machata [Fri, 15 Jun 2018 13:23:38 +0000 (16:23 +0300)]
mlxsw: spectrum_switchdev: Fix port_vlan refcounting
Switchdev notifications for addition of SWITCHDEV_OBJ_ID_PORT_VLAN are
distributed not only on clean addition, but also when flags on an
existing VLAN are changed. mlxsw_sp_bridge_port_vlan_add() calls
mlxsw_sp_port_vlan_get() to get at the port_vlan in question, which
implicitly references the object. This then leads to discrepancies in
reference counting when the VLAN is removed. spectrum.c warns about the
problem when the module is removed:
[13578.493090] WARNING: CPU: 0 PID: 2454 at drivers/net/ethernet/mellanox/mlxsw/spectrum.c:2973 mlxsw_sp_port_remove+0xfd/0x110 [mlxsw_spectrum]
[...]
[13578.627106] Call Trace:
[13578.629617] mlxsw_sp_fini+0x2a/0xe0 [mlxsw_spectrum]
[13578.634748] mlxsw_core_bus_device_unregister+0x3e/0x130 [mlxsw_core]
[13578.641290] mlxsw_pci_remove+0x13/0x40 [mlxsw_pci]
[13578.646238] pci_device_remove+0x31/0xb0
[13578.650244] device_release_driver_internal+0x14f/0x220
[13578.655562] driver_detach+0x32/0x70
[13578.659183] bus_remove_driver+0x47/0xa0
[13578.663134] pci_unregister_driver+0x1e/0x80
[13578.667486] mlxsw_sp_module_exit+0xc/0x3fa [mlxsw_spectrum]
[13578.673207] __x64_sys_delete_module+0x13b/0x1e0
[13578.677888] ? exit_to_usermode_loop+0x78/0x80
[13578.682374] do_syscall_64+0x39/0xe0
[13578.685976] entry_SYSCALL_64_after_hwframe+0x44/0xa9
Fix by putting the port_vlan when mlxsw_sp_port_vlan_bridge_join()
determines it's a flag-only change.
Fixes: b3529af6bb0d ("spectrum: Reference count VLAN entries")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Fri, 15 Jun 2018 13:23:37 +0000 (16:23 +0300)]
mlxsw: spectrum_router: Align with new route replace logic
Commit
f34436a43092 ("net/ipv6: Simplify route replace and appending
into multipath route") changed the IPv6 route replace logic so that the
first matching route (i.e., same metric) is replaced.
Have mlxsw replace the first matching route as well.
Fixes: f34436a43092 ("net/ipv6: Simplify route replace and appending into multipath route")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Fri, 15 Jun 2018 13:23:36 +0000 (16:23 +0300)]
mlxsw: spectrum_router: Allow appending to dev-only routes
Commit
f34436a43092 ("net/ipv6: Simplify route replace and appending
into multipath route") changed the IPv6 route append logic so that
dev-only routes can be appended and not only gatewayed routes.
Align mlxsw with the new behaviour.
Fixes: f34436a43092 ("net/ipv6: Simplify route replace and appending into multipath route")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Fri, 15 Jun 2018 13:23:35 +0000 (16:23 +0300)]
ipv6: Only emit append events for appended routes
Current code will emit an append event in the FIB notification chain for
any route added with NLM_F_APPEND set, even if the route was not
appended to any existing route.
This is inconsistent with IPv4 where such an event is only emitted when
the new route is appended after an existing one.
Align IPv6 behavior with IPv4, thereby allowing listeners to more easily
handle these events.
Fixes: f34436a43092 ("net/ipv6: Simplify route replace and appending into multipath route")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 15 Jun 2018 16:08:26 +0000 (09:08 -0700)]
Merge tag 'mac80211-for-davem-2018-06-15' of git://git./linux/kernel/git/jberg/mac80211
Johannes Berg says:
====================
A handful of fixes:
* missing RCU grace period enforcement led to drivers freeing
data structures before; fix from Dedy Lansky.
* hwsim module init error paths were messed up; fixed it myself
after a report from Colin King (who had sent a partial patch)
* kernel-doc tag errors; fix from Luca Coelho
* initialize the on-stack sinfo data structure when getting
station information; fix from Sven Eckelmann
* TXQ state dumping is now done from init, and when TXQs aren't
initialized yet at that point, bad things happen, move the
initialization; fix from Toke Høiland-Jørgensen.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Elad Nachman [Fri, 15 Jun 2018 06:57:39 +0000 (09:57 +0300)]
stmmac: added support for 802.1ad vlan stripping
stmmac reception handler calls stmmac_rx_vlan() to strip the vlan before
calling napi_gro_receive().
The function assumes VLAN tagged frames are always tagged with
802.1Q protocol, and assigns ETH_P_8021Q to the skb by hard-coding
the parameter on call to __vlan_hwaccel_put_tag() .
This causes packets not to be passed to the VLAN slave if it was created
with 802.1AD protocol
(ip link add link eth0 eth0.100 type vlan proto 802.1ad id 100).
This fix passes the protocol from the VLAN header into
__vlan_hwaccel_put_tag() instead of using the hard-coded value of
ETH_P_8021Q.
NETIF_F_HW_VLAN_STAG_RX check was added and the strip action is now
dependent on the correct combination of features and the detected vlan tag.
NETIF_F_HW_VLAN_STAG_RX feature was added to be in line with the driver
actual abilities.
Signed-off-by: Elad Nachman <eladn@gilat.com>
Reviewed-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mauro Carvalho Chehab [Wed, 9 May 2018 02:44:08 +0000 (23:44 -0300)]
arch/*: Kconfig: fix documentation for NMI watchdog
Changeset
9919cba7ff71 ("watchdog: Update documentation") updated
the documentation, removing the old nmi_watchdog.txt and adding
a file with a new content.
Update Kconfig files accordingly.
Fixes: 9919cba7ff71 ("watchdog: Update documentation")
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Jonathan Corbet <corbet@lwn.net>
Mauro Carvalho Chehab [Sun, 6 May 2018 17:30:09 +0000 (14:30 -0300)]
docs: crypto_engine.rst: Fix two parse warnings
./Documentation/crypto/crypto_engine.rst:13: WARNING: Unexpected indentation.
./Documentation/crypto/crypto_engine.rst:15: WARNING: Block quote ends without a blank line; unexpected unindent.
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Jonathan Corbet <corbet@lwn.net>
Mauro Carvalho Chehab [Sun, 6 May 2018 15:00:11 +0000 (12:00 -0300)]
docs: can.rst: fix a footnote reference
As stated at:
http://www.sphinx-doc.org/en/master/usage/restructuredtext/basics.html#footnotes
A footnote should contain either a number, a reference or
an auto number, e. g.:
[1], [#f1] or [#].
While using [*] accidentaly works for html, it fails for other
document outputs. In particular, it causes an error with LaTeX
output, causing all books after networking to not be built.
So, replace it by a valid syntax.
Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Jonathan Corbet <corbet@lwn.net>
David Howells [Fri, 15 Jun 2018 14:24:50 +0000 (15:24 +0100)]
afs: Optimise callback breaking by not repeating volume lookup
At the moment, afs_break_callbacks calls afs_break_one_callback() for each
separate FID it was given, and the latter looks up the volume individually
for each one.
However, this is inefficient if two or more FIDs have the same vid as we
could reuse the volume. This is complicated by cell aliasing whereby we
may have multiple cells sharing a volume and can therefore have multiple
callback interests for any particular volume ID.
At the moment afs_break_one_callback() scans the entire list of volumes
we're getting from a server and breaks the appropriate callback in every
matching volume, regardless of cell. This scan is done for every FID.
Optimise callback breaking by the following means:
(1) Sort the FID list by vid so that all FIDs belonging to the same volume
are clumped together.
This is done through the use of an indirection table as we cannot do
an insertion sort on the afs_callback_break array as we decode FIDs
into it as we subsequently also have to decode callback info into it
that corresponds by array index only.
We also don't really want to bubblesort afterwards if we can avoid it.
(2) Sort the server->cb_interests array by vid so that all the matching
volumes are grouped together. This permits the scan to stop after
finding a record that has a higher vid.
(3) When breaking FIDs, we try to keep server->cb_break_lock as long as
possible, caching the start point in the array for that volume group
as long as possible.
It might make sense to add another layer in that list and have a
refcounted volume ID anchor that has the matching interests attached
to it rather than being in the list. This would allow the lock to be
dropped without losing the cursor.
Signed-off-by: David Howells <dhowells@redhat.com>
David Howells [Fri, 15 Jun 2018 14:19:22 +0000 (15:19 +0100)]
afs: Display manually added cells in dynamic root mount
Alter the dynroot mount so that cells created by manipulation of
/proc/fs/afs/cells and /proc/fs/afs/rootcell and by specification of a root
cell as a module parameter will cause directories for those cells to be
created in the dynamic root superblock for the network namespace[*].
To this end:
(1) Only one dynamic root superblock is now created per network namespace
and this is shared between all attempts to mount it. This makes it
easier to find the superblock to modify.
(2) When a dynamic root superblock is created, the list of cells is walked
and directories created for each cell already defined.
(3) When a new cell is added, if a dynamic root superblock exists, a
directory is created for it.
(4) When a cell is destroyed, the directory is removed.
(5) These directories are created by calling lookup_one_len() on the root
dir which automatically creates them if they don't exist.
[*] Inasmuch as network namespaces are currently supported here.
Signed-off-by: David Howells <dhowells@redhat.com>
David Howells [Fri, 15 Jun 2018 14:19:10 +0000 (15:19 +0100)]
afs: Enable IPv6 DNS lookups
Remove the restriction on DNS lookup upcalls that prevents ipv6 addresses
from being looked up.
Signed-off-by: David Howells <dhowells@redhat.com>
Anatoliy Glagolev [Wed, 13 Jun 2018 21:38:51 +0000 (15:38 -0600)]
bsg: fix race of bsg_open and bsg_unregister
The existing implementation allows races between bsg_unregister and
bsg_open paths. bsg_unregister and request_queue cleanup and deletion
may start and complete right after bsg_get_device (in bsg_open path)
retrieves bsg_class_device and releases the mutex. Then bsg_open path
touches freed memory of bsg_class_device and request_queue.
One possible fix is to hold the mutex all the way through bsg_get_device
instead of releasing it after bsg_class_device retrieval.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-Off-By: Anatoliy Glagolev <glagolig@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Christoph Hellwig [Fri, 15 Jun 2018 11:55:07 +0000 (13:55 +0200)]
block: remov blk_queue_invalidate_tags
This function is entirely unused, so remove it and the tag_queue_busy
member of struct request_queue.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Fri, 15 Jun 2018 14:11:05 +0000 (08:11 -0600)]
Merge branch 'nvme-4.18' of git://git.infradead.org/nvme into for-linus
Pull NVMe fixes from Christoph:
"Fix various little regressions introduced in this merge window, plus
a rework of the fibre channel connect and reconnect path to share the
code instead of having separate sets of bugs. Last but not least a
trivial trace point addition from Hannes."
* 'nvme-4.18' of git://git.infradead.org/nvme:
nvme-fabrics: fix and refine state checks in __nvmf_check_ready
nvme-fabrics: handle the admin-only case properly in nvmf_check_ready
nvme-fabrics: refactor queue ready check
blk-mq: remove blk_mq_tagset_iter
nvme: remove nvme_reinit_tagset
nvme-fc: fix nulling of queue data on reconnect
nvme-fc: remove reinit_request routine
nvme-fc: change controllers first connect to use reconnect path
nvme: don't rely on the changed namespace list log
nvmet: free smart-log buffer after use
nvme-rdma: fix error flow during mapping request data
nvme: add bio remapping tracepoint
nvme: fix NULL pointer dereference in nvme_init_subsystem
Dedy Lansky [Fri, 15 Jun 2018 11:05:01 +0000 (13:05 +0200)]
cfg80211: fix rcu in cfg80211_unregister_wdev
Callers of cfg80211_unregister_wdev can free the wdev object
immediately after this function returns. This may crash the kernel
because this wdev object is still in use by other threads.
Add synchronize_rcu() after list_del_rcu to make sure wdev object can
be safely freed.
Signed-off-by: Dedy Lansky <dlansky@codeaurora.org>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Toke Høiland-Jørgensen [Fri, 25 May 2018 12:29:21 +0000 (14:29 +0200)]
mac80211: Move up init of TXQs
On init, ieee80211_if_add() dumps the interface. Since that now includes a
dump of the TXQ state, we need to initialise that before the dump happens.
So move up the TXQ initialisation to to before the call to
ieee80211_if_add().
Fixes: 52539ca89f36 ("cfg80211: Expose TXQ stats and parameters to userspace")
Reported-by: Niklas Cassel <niklas.cassel@linaro.org>
Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
Tested-by: Niklas Cassel <niklas.cassel@linaro.org>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Johannes Berg [Tue, 29 May 2018 10:04:51 +0000 (12:04 +0200)]
mac80211_hwsim: fix module init error paths
We didn't free the workqueue on any errors, nor did we
correctly check for rhashtable allocation errors, nor
did we free the hashtable on error.
Reported-by: Colin King <colin.king@canonical.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>