He Kuang [Tue, 10 May 2016 07:40:31 +0000 (07:40 +0000)]
perf build: Add build-test for libunwind cross-platforms support
Currently only test for local libunwind. We should check all supported
platforms so we can use them to parse perf.data with callchain info on
different machines.
Signed-off-by: He Kuang <hekuang@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ekaterina Tumanova <tumanova@linux.vnet.ibm.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1462866037-30382-4-git-send-email-hekuang@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Chris Phlipot [Wed, 11 May 2016 03:26:49 +0000 (20:26 -0700)]
perf script: Fix export of callchains with recursion in db-export
When an IP with an unresolved symbol occurs in the callchain more than
once (ie. recursion), then duplicate symbols can be created because
the callchain nodes are never updated after they are first created.
To fix this issue we call dso__find_symbol whenever we encounter a NULL
symbol, in case we already added a symbol at that IP since we started
traversing the callchain.
This change prevents duplicate symbols from being exported when duplicate
IPs are present in the callchain.
Signed-off-by: Chris Phlipot <cphlipot0@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1462937209-6032-5-git-send-email-cphlipot0@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Chris Phlipot [Wed, 11 May 2016 03:26:48 +0000 (20:26 -0700)]
perf script: Fix callchain addresses in db-export
Remove the call to map_ip() to adjust al.addr, because it has already
been called when assembling the callchain, in:
thread__resolve_callchain_sample(perf_sample)
add_callchain_ip(ip = perf_sample->callchain->ips[j])
thread__find_addr_location(addr = ip)
thread__find_addr_map(addr) {
al->addr = addr
if (al->map)
al->addr = al->map->map_ip(al->map, al->addr);
}
Calling it a second time can result in incorrect addresses being used.
This can have effects such as duplicate symbols being created and
exported.
Signed-off-by: Chris Phlipot <cphlipot0@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1462937209-6032-4-git-send-email-cphlipot0@gmail.com
[ Show the callchain where it is done, to help reviewing this change down the line ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Chris Phlipot [Wed, 11 May 2016 03:26:47 +0000 (20:26 -0700)]
perf script: Fix symbol insertion behavior in db-export
Use the dso__insert_symbol function instead of symbols__insert() in
order to properly update the dso symbol cache.
If the cache is not updated, then duplicate symbols can be
unintentionally created, inserted, and exported.
This change prevents duplicate symbols from being exported due to
dso__find_symbol() using a stale symbol cache.
Signed-off-by: Chris Phlipot <cphlipot0@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1462937209-6032-3-git-send-email-cphlipot0@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Chris Phlipot [Wed, 11 May 2016 03:26:46 +0000 (20:26 -0700)]
perf symbols: Add dso__insert_symbol function
The current method for inserting symbols is to use the symbols__insert()
function. However symbols__insert() does not update the dso symbol
cache. This causes problems in the following scenario:
1. symbol not found at addr using dso__find_symbol
2. symbol inserted at addr using the existing symbols__insert function
3. symbol still not found at addr using dso__find_symbol() because cache isn't
updated. This is undesired behavior.
The undesired behavior in (3) is addressed by creating a new function,
dso__insert_symbol() to both insert the symbol and update the symbol
cache if necessary.
If dso__insert_symbol() is used in (2) instead of symbols__insert(),
then the undesired behavior in (3) is avoided.
Signed-off-by: Chris Phlipot <cphlipot0@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1462937209-6032-2-git-send-email-cphlipot0@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Tue, 10 May 2016 15:33:52 +0000 (12:33 -0300)]
perf scripting python: Use Py_FatalError instead of die()
It probably is equivalent, but that seems to be the "pythonic" way of
dieing? Anyway, one less die() in the tools/perf codebase.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Chris Phlipot <cphlipot0@gmail.com>
Link: http://lkml.kernel.org/n/tip-nlzgepdv2818zs4e7faif9tu@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Ingo Molnar [Tue, 10 May 2016 20:23:34 +0000 (22:23 +0200)]
Merge tag 'perf-core-for-mingo-
20160510' of git://git./linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
User visible changes:
- Recording 'dwarf' callchains do not need DWARF unwinding support (He Kuang)
- Print recently added perf_event_attr.write_backward bit flag in -vv
verbose mode (Arnaldo Carvalho de Melo)
- Fix incorrect python db-export error message in 'perf script' (Chris Phlipot)
- Fix handling of zero-length symbols (Chris Phlipot)
- perf stat: Scale values by unit before metrics (Andi Kleen)
Infrastructure changes:
- Rewrite strbuf not to die(), making tools using it to check its
return value instead (Masami Hiramatsu)
- Support reading from backward ring buffer, add a 'perf test' entry
for it (Wang Nan)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Ingo Molnar [Wed, 11 May 2016 14:56:38 +0000 (16:56 +0200)]
Merge branch 'perf/urgent' into perf/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Namhyung Kim [Tue, 10 May 2016 14:26:24 +0000 (11:26 -0300)]
perf diff: Fix duplicated output column
The commit
b97511c5bc94 ("perf tools: Add overhead/overhead_children
keys defaults via string") moved initialization of column headers but it
missed to check the sort__mode. As 'perf diff' doesn't call
perf_hpp__init(), the setup_overhead() also should not be called.
Before:
# Baseline Delta Children Overhead Shared Object Symbol
# ........ ....... ........ ........ ................... .......................
#
28.48% -28.47% 28.48% 28.48% [kernel.vmlinux ] [k] intel_idle
11.51% -11.47% 11.51% 11.51% libxul.so [.] 0x0000000001a360f7
3.49% -3.49% 3.49% 3.49% [kernel.vmlinux] [k] generic_exec_single
2.91% -2.89% 2.91% 2.91% libdbus-1.so.3.8.11 [.] 0x000000000000cdc2
2.86% -2.85% 2.86% 2.86% libxcb.so.1.1.0 [.] 0x000000000000c890
2.44% -2.39% 2.44% 2.44% [kernel.vmlinux] [k] perf_event_aux_ctx
After:
# Baseline Delta Shared Object Symbol
# ........ ....... ................... .......................
#
28.48% -28.47% [kernel.vmlinux] [k] intel_idle
11.51% -11.47% libxul.so [.] 0x0000000001a360f7
3.49% -3.49% [kernel.vmlinux] [k] generic_exec_single
2.91% -2.89% libdbus-1.so.3.8.11 [.] 0x000000000000cdc2
2.86% -2.85% libxcb.so.1.1.0 [.] 0x000000000000c890
2.44% -2.39% [kernel.vmlinux] [k] perf_event_aux_ctx
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: <stable@vger.kernel.org> # 4.5+
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: b97511c5bc94 ("perf tools: Add overhead/overhead_children keys defaults via string")
Link: http://lkml.kernel.org/r/1462890384-12486-2-git-send-email-acme@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Linus Torvalds [Tue, 10 May 2016 19:04:40 +0000 (12:04 -0700)]
Merge tag 'pci-v4.6-fixes-3' of git://git./linux/kernel/git/helgaas/pci
Pull PCI fixes from Bjorn Helgaas:
"Since v4.5, we've WARNed during resume if a PCI device, including a
Thunderbolt device, was added while we were suspended. A change we
merged for v4.6-rc1 turned that warning into a system hang. These
enumeration patches from Lukas Wunner fix this issue:
- Fix BUG on device attach failure
- Do not treat EPROBE_DEFER as device attach failure"
* tag 'pci-v4.6-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
PCI: Do not treat EPROBE_DEFER as device attach failure
PCI: Fix BUG on device attach failure
Linus Torvalds [Tue, 10 May 2016 18:41:05 +0000 (11:41 -0700)]
Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
"Two topology corner case fixes, and a MAINTAINERS file update for
mmiotrace maintenance"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/topology: Set x86_max_cores to 1 for CONFIG_SMP=n
MAINTAINERS: Add mmiotrace entry
x86/topology: Handle CPUID bogosity gracefully
Linus Torvalds [Tue, 10 May 2016 18:32:01 +0000 (11:32 -0700)]
Merge branch 'sched-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull scheduler fixes from Ingo Molnar:
"A UP kernel cpufreq fix and a rt/dl scheduler corner case fix"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/rt, sched/dl: Don't push if task's scheduling class was changed
sched/fair: Fix !CONFIG_SMP kernel cpufreq governor breakage
Masami Hiramatsu [Tue, 10 May 2016 05:48:01 +0000 (14:48 +0900)]
perf tools: Remove xrealloc and ALLOC_GROW
Remove unused xrealloc() and ALLOC_GROW() from libperf.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20160510054801.6158.6204.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Masami Hiramatsu [Tue, 10 May 2016 05:47:53 +0000 (14:47 +0900)]
perf help: Do not use ALLOC_GROW in add_cmd_list
Replace ALLOC_GROW with normal realloc code in add_cmd_list() so that it
can handle errors directly.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20160510054752.6158.30562.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Masami Hiramatsu [Tue, 10 May 2016 05:47:44 +0000 (14:47 +0900)]
perf pmu: Make pmu_formats_string to check return value of strbuf
Make pmu_formats_string() to check return value of strbuf APIs so that
it can detect errors in it.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20160510054744.6158.37810.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Masami Hiramatsu [Tue, 10 May 2016 05:47:35 +0000 (14:47 +0900)]
perf header: Make topology checkers to check return value of strbuf
Make topology checkers to check the return value of strbuf APIs so that
it can detect errors in it.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20160510054735.6158.98650.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Masami Hiramatsu [Tue, 10 May 2016 05:47:26 +0000 (14:47 +0900)]
perf tools: Make alias handler to check return value of strbuf
Make alias handler and sq_quote_argv to check the return value of strbuf
APIs.
In sq_quote_argv() calls die(), but this fix handles strbuf failure as a
special case and returns to caller, since the caller - handle_alias()
also has to check the return value of other strbuf APIs and those checks
can be merged to one if() statement.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20160510054725.6158.84597.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Masami Hiramatsu [Tue, 10 May 2016 05:47:17 +0000 (14:47 +0900)]
perf help: Make check_emacsclient_version to check strbuf APIs
Make check_emacsclient_version() to check the return value of strbuf
APIs so that it can handle errors in strbuf.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20160510054716.6158.11755.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Masami Hiramatsu [Tue, 10 May 2016 05:47:07 +0000 (14:47 +0900)]
perf probe: Check the return value of strbuf APIs
Check the return value of strbuf APIs in perf-probe
related code, so that it can handle errors in strbuf.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20160510054707.6158.69861.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Masami Hiramatsu [Tue, 10 May 2016 05:46:58 +0000 (14:46 +0900)]
perf tools: Rewrite strbuf not to die()
Rewrite strbuf implementation not to use die() nor xrealloc(). Instead
of die(), now most of the API returns error code or 0 if succeeded.
Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20160510054658.6158.24080.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Xunlei Pang [Mon, 9 May 2016 04:11:31 +0000 (12:11 +0800)]
sched/rt, sched/dl: Don't push if task's scheduling class was changed
We got this warning:
WARNING: CPU: 1 PID: 2468 at kernel/sched/core.c:1161 set_task_cpu+0x1af/0x1c0
[...]
Call Trace:
dump_stack+0x63/0x87
__warn+0xd1/0xf0
warn_slowpath_null+0x1d/0x20
set_task_cpu+0x1af/0x1c0
push_dl_task.part.34+0xea/0x180
push_dl_tasks+0x17/0x30
__balance_callback+0x45/0x5c
__sched_setscheduler+0x906/0xb90
SyS_sched_setattr+0x150/0x190
do_syscall_64+0x62/0x110
entry_SYSCALL64_slow_path+0x25/0x25
This corresponds to:
WARN_ON_ONCE(p->state == TASK_RUNNING &&
p->sched_class == &fair_sched_class &&
(p->on_rq && !task_on_rq_migrating(p)))
It happens because in find_lock_later_rq(), the task whose scheduling
class was changed to fair class is still pushed away as if it were
a deadline task ...
So, check in find_lock_later_rq() after double_lock_balance(), if the
scheduling class of the deadline task was changed, break and retry.
Apply the same logic to RT tasks.
Signed-off-by: Xunlei Pang <xlpang@redhat.com>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Juri Lelli <juri.lelli@arm.com>
Link: http://lkml.kernel.org/r/1462767091-1215-1-git-send-email-xlpang@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Thomas Gleixner [Tue, 10 May 2016 07:20:33 +0000 (09:20 +0200)]
x86/topology: Set x86_max_cores to 1 for CONFIG_SMP=n
Josef reported that the uncore driver trips over with CONFIG_SMP=n because
x86_max_cores is 16 instead of 12.
The reason is, that for SMP=n the extended topology detection is a NOOP and
the cache leaf is used to determine the number of cores. That's wrong in two
aspects:
1) The cache leaf enumerates the maximum addressable number of cores in the
package, which is obviously not correct
2) UP has no business with topology bits at all.
Make intel_num_cpu_cores() return 1 for CONFIG_SMP=n
Reported-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: kernel-team <Kernel-team@fb.com>
Cc: Kan Liang <kan.liang@intel.com>
Link: http://lkml.kernel.org/r/761b4a2a-0332-7954-f030-c6639f949612@fb.com
Linus Torvalds [Tue, 10 May 2016 01:24:04 +0000 (18:24 -0700)]
Merge branch 'libnvdimm-fixes' of git://git./linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm build fix from Dan Williams:
"A build fix for the usage of HPAGE_SIZE in the last libnvdimm pull
request.
I have taken note that the kbuild robot build success test does not
include results for alpha_allmodconfig. Thanks to Guenter for the
report. It's tagged for -stable since the original fix will land
there and cause build problems"
* 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
libnvdimm, pfn: fix ARCH=alpha allmodconfig build failure
Andy Lutomirski [Mon, 9 May 2016 22:48:51 +0000 (15:48 -0700)]
perf/core: Change the default paranoia level to 2
Allowing unprivileged kernel profiling lets any user dump follow kernel
control flow and dump kernel registers. This most likely allows trivial
kASLR bypassing, and it may allow other mischief as well. (Off the top
of my head, the PERF_SAMPLE_REGS_INTR output during /dev/urandom reads
could be quite interesting.)
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Tue, 10 May 2016 00:54:59 +0000 (17:54 -0700)]
Merge branch 'akpm' (patches from Andrew)
Merge fixes from Andrew Morton:
"2 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
zsmalloc: fix zs_can_compact() integer overflow
Revert "proc/base: make prompt shell start from new line after executing "cat /proc/$pid/wchan""
Sergey Senozhatsky [Mon, 9 May 2016 23:28:49 +0000 (16:28 -0700)]
zsmalloc: fix zs_can_compact() integer overflow
zs_can_compact() has two race conditions in its core calculation:
unsigned long obj_wasted = zs_stat_get(class, OBJ_ALLOCATED) -
zs_stat_get(class, OBJ_USED);
1) classes are not locked, so the numbers of allocated and used
objects can change by the concurrent ops happening on other CPUs
2) shrinker invokes it from preemptible context
Depending on the circumstances, thus, OBJ_ALLOCATED can become
less than OBJ_USED, which can result in either very high or
negative `total_scan' value calculated later in do_shrink_slab().
do_shrink_slab() has some logic to prevent those cases:
vmscan: shrink_slab: zs_shrinker_scan+0x0/0x28 [zsmalloc] negative objects to delete nr=-62
vmscan: shrink_slab: zs_shrinker_scan+0x0/0x28 [zsmalloc] negative objects to delete nr=-62
vmscan: shrink_slab: zs_shrinker_scan+0x0/0x28 [zsmalloc] negative objects to delete nr=-64
vmscan: shrink_slab: zs_shrinker_scan+0x0/0x28 [zsmalloc] negative objects to delete nr=-62
vmscan: shrink_slab: zs_shrinker_scan+0x0/0x28 [zsmalloc] negative objects to delete nr=-62
vmscan: shrink_slab: zs_shrinker_scan+0x0/0x28 [zsmalloc] negative objects to delete nr=-62
However, due to the way `total_scan' is calculated, not every
shrinker->count_objects() overflow can be spotted and handled.
To demonstrate the latter, I added some debugging code to do_shrink_slab()
(x86_64) and the results were:
vmscan: OVERFLOW: shrinker->count_objects() == -1 [
18446744073709551615]
vmscan: but total_scan > 0:
92679974445502
vmscan: resulting total_scan:
92679974445502
[..]
vmscan: OVERFLOW: shrinker->count_objects() == -1 [
18446744073709551615]
vmscan: but total_scan > 0:
22634041808232578
vmscan: resulting total_scan:
22634041808232578
Even though shrinker->count_objects() has returned an overflowed value,
the resulting `total_scan' is positive, and, what is more worrisome, it
is insanely huge. This value is getting used later on in
shrinker->scan_objects() loop:
while (total_scan >= batch_size ||
total_scan >= freeable) {
unsigned long ret;
unsigned long nr_to_scan = min(batch_size, total_scan);
shrinkctl->nr_to_scan = nr_to_scan;
ret = shrinker->scan_objects(shrinker, shrinkctl);
if (ret == SHRINK_STOP)
break;
freed += ret;
count_vm_events(SLABS_SCANNED, nr_to_scan);
total_scan -= nr_to_scan;
cond_resched();
}
`total_scan >= batch_size' is true for a very-very long time and
'total_scan >= freeable' is also true for quite some time, because
`freeable < 0' and `total_scan' is large enough, for example,
22634041808232578. The only break condition, in the given scheme of
things, is shrinker->scan_objects() == SHRINK_STOP test, which is a
bit too weak to rely on, especially in heavy zsmalloc-usage scenarios.
To fix the issue, take a pool stat snapshot and use it instead of
racy zs_stat_get() calls.
Link: http://lkml.kernel.org/r/20160509140052.3389-1-sergey.senozhatsky@gmail.com
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: <stable@vger.kernel.org> [4.3+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Robin Humble [Mon, 9 May 2016 23:28:46 +0000 (16:28 -0700)]
Revert "proc/base: make prompt shell start from new line after executing "cat /proc/$pid/wchan""
This reverts the 4.6-rc1 commit
7e2bc81da333 ("proc/base: make prompt
shell start from new line after executing "cat /proc/$pid/wchan")
because it breaks /proc/$PID/whcan formatting in ps and top.
Revert also because the patch is inconsistent - it adds a newline at the
end of only the '0' wchan, and does not add a newline when
/proc/$PID/wchan contains a symbol name.
eg.
$ ps -eo pid,stat,wchan,comm
PID STAT WCHAN COMMAND
...
1189 S - dbus-launch
1190 Ssl 0
dbus-daemon
1198 Sl 0
lightdm
1299 Ss ep_pol systemd
1301 S - (sd-pam)
1304 Ss wait sh
Signed-off-by: Robin Humble <plaguedbypenguins@gmail.com>
Cc: Minfei Huang <mnfhuang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Chris Phlipot [Sat, 7 May 2016 09:16:59 +0000 (02:16 -0700)]
perf symbols: Fix handling of zero-length symbols.
This change introduces a fix to symbols__find, so that it is able to
find symbols of length zero (where start == end).
The current code has the following problem:
- The current implementation of symbols__find is unable to find any symbols
of length zero.
- The db-export framework explicitly creates zero length symbols at
locations where no symbol currently exists.
The combination of the two above behaviors results in behavior similar
to the example below.
1. addr_location is created for a sample, but symbol is unable to be
resolved.
2. db export creates an "unknown" symbol of length zero at that address
and inserts it into the dso.
3. A new sample comes in at the same address, but symbol__find is unable
to find the zero length symbol, so it is still unresolved.
4. db export sees the symbol is unresolved, and allocated a duplicate
symbol, even though it already did this in step 2.
This behavior continues every time an address without symbol information
is seen, which causes a very large number of these symbols to be
allocated.
The effect of this fix can be observed by looking at the contents of an
exported database before/after the fix (generated with
scripts/python/export-to-postgresql.py)
Ex.
BEFORE THE CHANGE:
example_db=# select count(*) from symbols;
count
--------
900213
(1 row)
example_db=# select count(*) from symbols where symbols.name='unknown';
count
--------
897355
(1 row)
example_db=# select count(*) from symbols where symbols.name!='unknown';
count
-------
2858
(1 row)
AFTER THE CHANGE:
example_db=# select count(*) from symbols;
count
-------
25217
(1 row)
example_db=# select count(*) from symbols where name='unknown';
count
-------
22359
(1 row)
example_db=# select count(*) from symbols where name!='unknown';
count
-------
2858
(1 row)
Signed-off-by: Chris Phlipot <cphlipot0@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1462612620-25008-1-git-send-email-cphlipot0@gmail.com
[ Moved the test to later in the rb_tree tests, as this not the likely case ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Mon, 9 May 2016 21:08:33 +0000 (18:08 -0300)]
perf evsel: Print state of perf_event_attr.write_backward
Now we can see if it is set when using verbose mode in various tools,
such as 'perf test':
# perf test -vv back
45: Test backward reading from ring buffer :
--- start ---
<SNIP>
------------------------------------------------------------
perf_event_attr:
type 2
size 112
config 0x98
{ sample_period, sample_freq } 1
sample_type IP|TID|TIME|CPU|PERIOD|RAW
disabled 1
mmap 1
comm 1
task 1
sample_id_all 1
exclude_guest 1
mmap2 1
comm_exec 1
write_backward 1
------------------------------------------------------------
sys_perf_event_open: pid 20911 cpu -1 group_fd -1 flags 0x8
<SNIP>
---- end ----
Test backward reading from ring buffer: Ok
#
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-kxv05kv9qwl5of7rzfeiiwbv@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Wang Nan [Mon, 9 May 2016 01:47:51 +0000 (01:47 +0000)]
perf tests: Add test to check backward ring buffer
This test checks reading from backward ring buffer.
Test result:
# ~/perf test 'ring buffer'
45: Test backward reading from ring buffer : Ok
The test case is a while loop which calls prctl(PR_SET_NAME) multiple
times. Each prctl should issue 2 events: one PERF_RECORD_SAMPLE, one
PERF_RECORD_COMM.
The first round creates a relative large ring buffer (256 pages). It can
afford all events. Read from it and check the count of each type of
events.
The second round creates a small ring buffer (1 page) and makes it
overwritable. Check the correctness of the buffer.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1462758471-89706-3-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Wang Nan [Mon, 9 May 2016 01:47:50 +0000 (01:47 +0000)]
perf tools: Support reading from backward ring buffer
perf_evlist__mmap_read_backward() is introduced for reading backward
ring buffer. Since direction for reading such ring buffer is different
from the direction kernel writing to it, and since user need to fetch
most recent record from it, a perf_evlist__mmap_read_catchup() is
introduced to move the reading pointer to the end of the buffer.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1462758471-89706-2-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Linus Torvalds [Mon, 9 May 2016 19:24:19 +0000 (12:24 -0700)]
Merge branch 'linus' of git://git./linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:
"This fixes the following issues:
- bug in ahash SG list walking that may lead to crashes
- resource leak in qat
- missing RSA dependency that causes it to fail"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: rsa - select crypto mgr dependency
crypto: hash - Fix page length clamping in hash walk
crypto: qat - fix adf_ctl_drv.c:undefined reference to adf_init_pf_wq
crypto: qat - fix invalid pf2vf_resp_wq logic
Linus Torvalds [Mon, 9 May 2016 19:11:37 +0000 (12:11 -0700)]
Merge git://git./linux/kernel/git/davem/net
Pull networking fixes from David Miller:
1) Check klogctl failure correctly, from Colin Ian King.
2) Prevent OOM when under memory pressure in flowcache, from Steffen
Klassert.
3) Fix info leak in llc and rtnetlink ifmap code, from Kangjie Lu.
4) Memory barrier and multicast handling fixes in bnxt_en, from Michael
Chan.
5) Endianness bug in mlx5, from Daniel Jurgens.
6) Fix disconnect handling in VSOCK, from Ian Campbell.
7) Fix locking of netdev list walking in get_bridge_ifindices(), from
Nikolay Aleksandrov.
8) Bridge multicast MLD parser can look at wrong packet offsets, fix
from Linus Lüssing.
9) Fix chip hang in qede driver, from Sudarsana Reddy Kalluru.
10) Fix missing setting of encapsulation before inner handling completes
in udp_offload code, from Jarno Rajahalme.
11) Missing rollbacks during LAG join and flood configuration failures
in mlxsw driver, from Ido Schimmel.
12) Fix error code checks in netxen driver, from Dan Carpenter.
13) Fix key size in new macsec driver, from Sabrina Dubroca.
14) Fix mlx5/VXLAN dependencies, from Arnd Bergmann.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (29 commits)
net/mlx5e: make VXLAN support conditional
Revert "net/mlx5: Kconfig: Fix MLX5_EN/VXLAN build issue"
macsec: key identifier is 128 bits, not 64
Documentation/networking: more accurate LCO explanation
macvtap: segmented packet is consumed
tools: bpf_jit_disasm: check for klogctl failure
qede: uninitialized variable in qede_start_xmit()
netxen: netxen_rom_fast_read() doesn't return -1
netxen: reversed condition in netxen_nic_set_link_parameters()
netxen: fix error handling in netxen_get_flash_block()
mlxsw: spectrum: Add missing rollback in flood configuration
mlxsw: spectrum: Fix rollback order in LAG join failure
udp_offload: Set encapsulation before inner completes.
udp_tunnel: Remove redundant udp_tunnel_gro_complete().
qede: prevent chip hang when increasing channels
net: ipv6: tcp reset, icmp need to consider L3 domain
bridge: fix igmp / mld query parsing
net: bridge: fix old ioctl unlocked net device walk
VSOCK: do not disconnect socket when peer has shutdown SEND only
net/mlx4_en: Fix endianness bug in IPV6 csum calculation
...
Josh Poimboeuf [Fri, 6 May 2016 14:22:25 +0000 (09:22 -0500)]
compiler-gcc: require gcc 4.8 for powerpc __builtin_bswap16()
gcc support for __builtin_bswap16() was supposedly added for powerpc in
gcc 4.6, and was then later added for other architectures in gcc 4.8.
However, Stephen Rothwell reported that attempting to use it on powerpc
in gcc 4.6 fails with:
lib/vsprintf.c:160:2: error: initializer element is not constant
lib/vsprintf.c:160:2: error: (near initialization for 'decpair[0]')
lib/vsprintf.c:160:2: error: initializer element is not constant
lib/vsprintf.c:160:2: error: (near initialization for 'decpair[1]')
...
I'm not entirely sure what those errors mean, but I don't see them on
gcc 4.8. So let's consider gcc 4.8 to be the official starting point
for __builtin_bswap16().
Arnd Bergmann adds:
"I found the commit in gcc-4.8 that replaced the powerpc-specific
implementation of __builtin_bswap16 with an architecture-independent
one. Apparently the powerpc version (gcc-4.6 and 4.7) just mapped to
the lhbrx/sthbrx instructions, so it ended up not being a constant,
though the intent of the patch was mainly to add support for the
builtin to x86:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52624
has the patch that went into gcc-4.8 and more information."
Fixes: 7322dd755e7d ("byteswap: try to avoid __builtin_constant_p gcc bug")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Tested-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Chris Phlipot [Sat, 7 May 2016 09:17:00 +0000 (02:17 -0700)]
perf script: Fix incorrect python db-export error message
Fix the error message printed when attempting and failing to create the
call path root incorrectly references the call return process.
This change fixes the message to properly reference the failure to
create the call path root.
Signed-off-by: Chris Phlipot <cphlipot0@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1462612620-25008-2-git-send-email-cphlipot0@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Andi Kleen [Thu, 5 May 2016 23:04:04 +0000 (16:04 -0700)]
perf stat: Scale values by unit before metrics
Scale values by unit before passing them to the metrics printing
functions. This is needed for TopDown, because it needs to scale the
slots correctly by pipeline width / SMTness.
For existing metrics it shouldn't make any difference, as those
generally use events that don't have any units.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1462489447-31832-8-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
He Kuang [Fri, 6 May 2016 08:59:07 +0000 (08:59 +0000)]
perf callchain: Recording 'dwarf' callchains do not need DWARF unwinding support
There is no need to check for DWARF unwinding support when using the
'dwarf' callchain record method, as this will only ask the kernel to
collect stack dumps for later DWARF CFI processing, which can be done in
another machine, where the support for DWARF unwinding need to be
present.
Signed-off-by: He Kuang <hekuang@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ekaterina Tumanova <tumanova@linux.vnet.ibm.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1462525154-125656-2-git-send-email-hekuang@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
David S. Miller [Mon, 9 May 2016 04:21:13 +0000 (00:21 -0400)]
Merge branch 'mlx5-build-fix'
Saeed Mahameed says:
====================
net/mlx5e: Kconfig fixes for VxLAN
Reposting to net the build errors fixes posted by Arnd last week.
Originally Arnd posted those fixes to net-next, while the issue
is also seen in net. For net-next a different approach is required
for fixing the issue as VXLAN and Device Drivers are no longer
dependent, but there is no harm for those fixes to get into net-next.
Optionally, once net is merged into net-next we can
Revert "net/mlx5e: make VXLAN support conditional" as the
CONFIG_MLX5_CORE_EN_VXLAN will no longer be required.
Applied on top:
288928658583 ('mlxsw: spectrum: Add missing rollback in flood configuration')
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Arnd Bergmann [Sun, 8 May 2016 11:55:25 +0000 (14:55 +0300)]
net/mlx5e: make VXLAN support conditional
VXLAN can be disabled at compile-time or it can be a loadable
module while mlx5 is built-in, which leads to a link error:
drivers/net/built-in.o: In function `mlx5e_create_netdev':
ntb_netdev.c:(.text+0x106de4): undefined reference to `vxlan_get_rx_port'
This avoids the link error and makes the vxlan code optional,
like the other ethernet drivers do as well.
Link: https://patchwork.ozlabs.org/patch/589296/
Fixes: b3f63c3d5e2c ("net/mlx5e: Add netdev support for VXLAN tunneling")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arnd Bergmann [Sun, 8 May 2016 11:55:24 +0000 (14:55 +0300)]
Revert "net/mlx5: Kconfig: Fix MLX5_EN/VXLAN build issue"
This reverts commit
69976fb1045850a742deb9790ea49cbc6f497531.
We cannot select VXLAN when IPv4 support is disabled, that just gives
us additional build errors, including:
warning: (MLX5_CORE_EN) selects VXLAN which has unmet direct dependencies (NETDEVICES && NET_CORE && INET)
In file included from ../drivers/net/vxlan.c:36:0:
include/net/udp_tunnel.h: In function 'udp_tunnel_handle_offloads':
include/net/udp_tunnel.h:112:9: error: implicit declaration of function 'iptunnel_handle_offloads' [-Werror=implicit-function-declaration]
return iptunnel_handle_offloads(skb, type);
^~~~~~~~~~~~~~~~~~~~~~~~
I'm sending a proper fix for the original bug in a separate patch.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sabrina Dubroca [Sat, 7 May 2016 18:19:29 +0000 (20:19 +0200)]
macsec: key identifier is 128 bits, not 64
The MACsec standard mentions a key identifier for each key, but
doesn't specify anything about it, so I arbitrarily chose 64 bits.
IEEE 802.1X-2010 specifies MKA (MACsec Key Agreement), and defines the
key identifier to be 128 bits (96 bits "member identifier" + 32 bits
"key number").
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Shmulik Ladkani [Fri, 6 May 2016 17:27:43 +0000 (20:27 +0300)]
Documentation/networking: more accurate LCO explanation
In few places the term "ones-complement sum" was used but the actual
meaning is "the complement of the ones-complement sum".
Also, avoid enclosing long statements with underscore, to ease
readability.
Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
Acked-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 6 May 2016 12:58:21 +0000 (05:58 -0700)]
macvtap: segmented packet is consumed
If GSO packet is segmented and its segments are properly queued,
we call consume_skb() instead of kfree_skb() to be drop monitor
friendly.
Fixes: 3e4f8b7873709 ("macvtap: Perform GSO on forwarding path.")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Vlad Yasevich <vyasevic@redhat.com>
Reviewed-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Thu, 5 May 2016 22:39:33 +0000 (23:39 +0100)]
tools: bpf_jit_disasm: check for klogctl failure
klogctl can fail and return -ve len, so check for this and
return NULL to avoid passing a (size_t)-1 to malloc.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Thu, 5 May 2016 13:21:30 +0000 (16:21 +0300)]
qede: uninitialized variable in qede_start_xmit()
"data_split" was never set to false. It's just uninitialized.
Fixes: 2950219d87b0 ('qede: Add basic network device support')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Sun, 8 May 2016 21:38:32 +0000 (14:38 -0700)]
Linux 4.6-rc7
Ingo Molnar [Sun, 8 May 2016 08:27:33 +0000 (10:27 +0200)]
MAINTAINERS: Add mmiotrace entry
The Nouveau maintainers would like to follow and review mmiotrace
changes as well, so create a separate entry for that code. The high
level bits are living in the tracing code, the low level bits in the
x86 code.
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Pekka Paalanen <ppaalanen@gmail.com>
Acked-by: karol herbst <karolherbst@gmail.com>
Cc: linux-kernel@vger.kernel.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Dan Carpenter [Thu, 5 May 2016 13:20:20 +0000 (16:20 +0300)]
netxen: netxen_rom_fast_read() doesn't return -1
The error handling is broken here. netxen_rom_fast_read() returns zero
on success and -EIO on error. It never returns -1.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Thu, 5 May 2016 13:19:44 +0000 (16:19 +0300)]
netxen: reversed condition in netxen_nic_set_link_parameters()
My static checker complains that we are using "autoneg" without
initializing it. The problem is the ->phy_read() condition is reversed
so we only set this on error instead of success.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Thu, 5 May 2016 13:18:46 +0000 (16:18 +0300)]
netxen: fix error handling in netxen_get_flash_block()
My static checker complained that "v" can be used unintialized if
netxen_rom_fast_read() returns -EIO. That function never actually
returns -1.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Sat, 7 May 2016 17:53:32 +0000 (10:53 -0700)]
Merge tag 'char-misc-4.6-rc7' of git://git./linux/kernel/git/gregkh/char-misc
Pull misc driver fixes from Gfreg KH:
"Here are three small fixes for some driver problems that were
reported. Full details in the shortlog below.
All of these have been in linux-next with no reported issues"
* tag 'char-misc-4.6-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
nvmem: mxs-ocotp: fix buffer overflow in read
Drivers: hv: vmbus: Fix signaling logic in hv_need_to_signal_on_read()
misc: mic: Fix for double fetch security bug in VOP driver
Linus Torvalds [Sat, 7 May 2016 17:50:48 +0000 (10:50 -0700)]
Merge tag 'staging-4.6-rc7' of git://git./linux/kernel/git/gregkh/staging
Pull IIO driver fixes from Grek KH:
"It's really just IIO drivers here, some small fixes that resolve some
'crash on boot' errors that have shown up in the -rc series, and other
bugfixes that are required.
All have been in linux-next with no reported problems"
* tag 'staging-4.6-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
iio: imu: mpu6050: Fix name/chip_id when using ACPI
iio: imu: mpu6050: fix possible NULL dereferences
iio:adc:at91-sama5d2: Repair crash on module removal
iio: ak8975: fix maybe-uninitialized warning
iio: ak8975: Fix NULL pointer exception on early interrupt
Linus Torvalds [Sat, 7 May 2016 17:47:03 +0000 (10:47 -0700)]
Merge tag 'usb-4.6-rc7' of git://git./linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are some last-remaining fixes for USB drivers to resolve issues
that have shown up in testing. And two new device ids as well.
All of these have been in linux-next with no reported issues"
* tag 'usb-4.6-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
Revert "USB / PM: Allow USB devices to remain runtime-suspended when sleeping"
usb: musb: jz4740: fix error check of usb_get_phy()
Revert "usb: musb: musb_host: Enable HCD_BH flag to handle urb return in bottom half"
usb: musb: gadget: nuke endpoint before setting its descriptor to NULL
USB: serial: cp210x: add Straizona Focusers device ids
USB: serial: cp210x: add ID for Link ECU
Linus Torvalds [Sat, 7 May 2016 15:27:35 +0000 (08:27 -0700)]
Merge branch 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm
Pull ARM fixes from Russell King:
"These are a number of updates to fix a few problems found in the ARM
nommu code over the last couple of years, caused mostly by changes on
the mmu side"
* 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm:
ARM: 8573/1: domain: move {set,get}_domain under config guard
ARM: 8572/1: nommu: change memory reserve for the vectors
ARM: 8571/1: nommu: fix PMSAv7 setup
Linus Torvalds [Sat, 7 May 2016 15:17:45 +0000 (08:17 -0700)]
Merge tag 'media/v4.6-5' of git://git./linux/kernel/git/mchehab/linux-media
Pull media fixes from Mauro Carvalho Chehab:
- deadlock fixes on driver probe at exynos4-is and s43-camif drivers
- a build breakage if media controller is enabled and USB or PCI is
built as module.
* tag 'media/v4.6-5' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
[media] media-device: fix builds when USB or PCI is compiled as module
[media] media: s3c-camif: fix deadlock on driver probe()
[media] media: exynos4-is: fix deadlock on driver probe
Linus Torvalds [Sat, 7 May 2016 15:13:42 +0000 (08:13 -0700)]
Merge branch 'for-4.6-fixes' of git://git./linux/kernel/git/tj/libata
Pull libata fixes from Tejun Heo:
"An ahci driver addition and updates to ahci port enable handling for
some platform devices"
* 'for-4.6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
ata: add AMD Seattle platform driver
ARM: dts: apq8064: add ahci ports-implemented mask
ata: ahci-platform: Add ports-implemented DT bindings.
libahci: save port map for forced port map
Linus Torvalds [Sat, 7 May 2016 15:10:08 +0000 (08:10 -0700)]
Merge tag 'for-linus' of git://git./linux/kernel/git/dledford/rdma
Pull rdma fix from Doug Ledford:
"Fix for max sector calculation in iSER"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
IB/iser: Fix max_sectors calculation
Thomas Gleixner [Fri, 6 May 2016 18:48:16 +0000 (20:48 +0200)]
x86/topology: Handle CPUID bogosity gracefully
Joseph reported that a XEN guest dies with a division by 0 in the package
topology setup code. This happens if cpu_info.x86_max_cores is zero.
Handle that case and emit a warning. This does not fix the underlying XEN bug,
but makes the code more robust.
Reported-and-tested-by: Joseph Salisbury <joseph.salisbury@canonical.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1605062046270.3540@nanos
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Rafael J. Wysocki [Fri, 6 May 2016 12:58:43 +0000 (14:58 +0200)]
sched/fair: Fix !CONFIG_SMP kernel cpufreq governor breakage
The following commit:
34e2c555f3e1 ("cpufreq: Add mechanism for registering utilization update callbacks")
overlooked the fact that update_load_avg(), where CFS invokes cpufreq
utilization update callbacks, becomes an empty stub on UP kernels.
In consequence, if !CONFIG_SMP, cpufreq governors are never invoked
from CFS and they do not have a chance to evaluate CPU performace
levels and update them often enough.
Needless to say, things don't work as expected then.
Fix the problem by making the !CONFIG_SMP stub of update_load_avg()
invoke cpufreq update callbacks too.
Reported-by: Steve Muckle <steve.muckle@linaro.org>
Tested-by: Steve Muckle <steve.muckle@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Steve Muckle <steve.muckle@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Linux PM list <linux-pm@vger.kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Fixes: 34e2c555f3e1 (cpufreq: Add mechanism for registering utilization update callbacks)
Link: http://lkml.kernel.org/r/6282396.VVEdgVYxO3@vostro.rjw.lan
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Ingo Molnar [Sat, 7 May 2016 04:49:28 +0000 (06:49 +0200)]
Merge tag 'perf-core-for-mingo-
20160506' of git://git./linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
User visible changes:
- Fix ordering of kernel/user entries in 'caller' mode, where the kernel and
user parts were being correctly inverted but kept in place wrt each other,
i.e. 'callee' (k1, k2, u3, u4) became 'caller' (k2, k1, u4, u3) when it
should be 'caller' (u4, u3, k2, k1) (Chris Phlipot)
- In 'perf trace' don't print the raw arg syscall args for a syscall that has
no arguments, like gettid(). This was happening because just checking if
the syscall args list is NULL may mean that there are no args (e.g.: gettid)
or that there is no tracepoint info (e.g.: clone) (Arnaldo Carvalho de Melo)
- Add extra output of counter values with 'perf stat -vv' (Andi Kleen)
Infrastructure changes:
- Expose callchain db export via the python API (Chris Phlipot)
Code reorganization:
- Move some more syscall arg beautifiers from the 'perf trace' main file to
separate files in tools/perf/trace/beauty/, to reduce the main file line
count (Arnaldo Carvalho de Melo)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Ido Schimmel [Fri, 6 May 2016 20:18:40 +0000 (22:18 +0200)]
mlxsw: spectrum: Add missing rollback in flood configuration
When we fail to set the flooding configuration for the broadcast and
unregistered multicast traffic, we should revert the flooding
configuration of the unknown unicast traffic.
Fixes: 0293038e0c36 ("mlxsw: spectrum: Add support for flood control")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Fri, 6 May 2016 20:18:39 +0000 (22:18 +0200)]
mlxsw: spectrum: Fix rollback order in LAG join failure
Make the leave procedure in the error path symmetric to the join
procedure and first remove the port from the collector before
potentially destroying the LAG.
Fixes: 0d65fc13042f ("mlxsw: spectrum: Implement LAG port join/leave")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jarno Rajahalme [Tue, 3 May 2016 23:10:21 +0000 (16:10 -0700)]
udp_offload: Set encapsulation before inner completes.
UDP tunnel segmentation code relies on the inner offsets being set for
an UDP tunnel GSO packet, but the inner *_complete() functions will
set the inner offsets only if 'encapsulation' is set before calling
them. Currently, udp_gro_complete() sets 'encapsulation' only after
the inner *_complete() functions are done. This causes the inner
offsets having invalid values after udp_gro_complete() returns, which
in turn will make it impossible to properly segment the packet in case
it needs to be forwarded, which would be visible to the user either as
invalid packets being sent or as packet loss.
This patch fixes this by setting skb's 'encapsulation' in
udp_gro_complete() before calling into the inner complete functions,
and by making each possible UDP tunnel gro_complete() callback set the
inner_mac_header to the beginning of the tunnel payload.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Reviewed-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jarno Rajahalme [Tue, 3 May 2016 23:10:20 +0000 (16:10 -0700)]
udp_tunnel: Remove redundant udp_tunnel_gro_complete().
The setting of the UDP tunnel GSO type is already performed by
udp[46]_gro_complete().
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Fri, 6 May 2016 20:08:35 +0000 (13:08 -0700)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull writeback fix from Jens Axboe:
"Just a single fix for domain aware writeback, fixing a regression that
can cause balance_dirty_pages() to keep looping while not getting any
work done"
* 'for-linus' of git://git.kernel.dk/linux-block:
writeback: Fix performance regression in wb_over_bg_thresh()
Linus Torvalds [Fri, 6 May 2016 19:59:27 +0000 (12:59 -0700)]
Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
"This contains two fixes: a boot fix for older SGI/UV systems, and an
APIC calibration fix"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/tsc: Read all ratio bits from MSR_PLATFORM_INFO
x86/platform/UV: Bring back the call to map_low_mmrs in uv_system_init
Sudarsana Reddy Kalluru [Thu, 5 May 2016 04:35:16 +0000 (00:35 -0400)]
qede: prevent chip hang when increasing channels
qede requires qed to provide enough resources to accommodate 16 combined
channels, but that upper-bound isn't actually being enforced by it.
Instead, qed inform back to qede how many channels can be opened based on
available resources - but that calculation doesn't really take into account
the resources requested by qede; Instead it considers other FW/HW available
resources.
As a result, if a user would increase the number of channels to more than
16 [e.g., using ethtool] the chip would hang.
This change increments the resources requested by qede to 64 combined
channels instead of 16; This value is an upper bound on the possible
available channels [due to other FW/HW resources].
Signed-off-by: Sudarsana Reddy Kalluru <sudarsana.kalluru@qlogic.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern [Thu, 5 May 2016 04:26:08 +0000 (21:26 -0700)]
net: ipv6: tcp reset, icmp need to consider L3 domain
Responses for packets to unused ports are getting lost with L3 domains.
IPv4 has ip_send_unicast_reply for sending TCP responses which accounts
for L3 domains; update the IPv6 counterpart tcp_v6_send_response.
For icmp the L3 master check needs to be moved up in icmp6_send
to properly respond to UDP packets to a port with no listener.
Fixes: ca254490c8df ("net: Add VRF support to IPv6 stack")
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Fri, 6 May 2016 18:58:45 +0000 (11:58 -0700)]
Merge tag 'pm+acpi-4.6-rc7' of git://git./linux/kernel/git/rafael/linux-pm
Pull power management and ACPI fixes from Rafael Wysocki:
"Fixes for problems introduced or discovered recently (intel_pstate,
sti-cpufreq, ARM64 cpuidle, Operating Performance Points framework,
generic device properties framework) and one fix for a hotplug-related
deadlock in ACPICA that's been there forever, but is nasty enough.
Specifics:
- Fix for a recent regression in the intel_pstate driver causing it
to fail to restore the HWP (HW-managed P-states) configuration of
the boot CPU after suspend-to-RAM (Rafael Wysocki).
- Fix for two recent regressions in the intel_pstate driver, one that
can trigger a divide by zero if the driver is accessed via sysfs
before it manages to take the first sample and one causing it to
fail to update a structure field used in a trace point, so the
information coming from it is less useful (Rafael Wysocki).
- Fix for a problem in the sti-cpufreq driver introduced during the
4.5 cycle that causes it to break CPU PM in multi-platform kernels
by registering cpufreq-dt (which subsequently doesn't work)
unconditionally and preventing the driver that would actually work
from registering (Sudeep Holla).
- Stable-candidate fix for an ARM64 cpuidle issue causing idle state
usage counters to be incorrectly updated for idle states that were
not entered due to errors (James Morse).
- Fix for a recently introduced issue in the OPP (Operating
Performance Points) framework causing it to print bogus error
messages for missing optional regulators (Viresh Kumar).
- Fix for a recently introduced issue in the generic device
properties framework that may cause it to attempt to dereferece and
invalid pointer in some cases (Heikki Krogerus).
- Fix for a deadlock in the ACPICA core that may be triggered by
device (eg Thunderbolt) hotplug (Prarit Bhargava)"
* tag 'pm+acpi-4.6-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PM / OPP: Remove useless check
ACPICA: Dispatcher: Update thread ID for recursive method calls
intel_pstate: Fix intel_pstate_get()
cpufreq: intel_pstate: Fix HWP on boot CPU after system resume
cpufreq: st: enable selective initialization based on the platform
ARM: cpuidle: Pass on arm_cpuidle_suspend()'s return value
device property: Avoid potential dereferences of invalid pointers
Linus Torvalds [Fri, 6 May 2016 18:53:27 +0000 (11:53 -0700)]
Merge branch 'sched-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull scheduler fix from Ingo Molnar:
"This contains a single fix that fixes a nohz tick stopping bug when
mixed-poliocy SCHED_FIFO and SCHED_RR tasks are present on a runqueue"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
nohz/full, sched/rt: Fix missed tick-reenabling bug in sched_can_stop_tick()
Linus Torvalds [Fri, 6 May 2016 18:40:24 +0000 (11:40 -0700)]
Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
"This tree contains two fixes: new Intel CPU model numbers and an
AMD/iommu uncore PMU driver fix"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/amd/iommu: Do not register a task ctx for uncore like PMUs
perf/x86: Add model numbers for Kabylake CPUs
Linus Torvalds [Fri, 6 May 2016 18:33:02 +0000 (11:33 -0700)]
Merge branch 'efi-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull EFI fixes from Ingo Molnar:
"This tree contains three fixes: a console spam fix, a file pattern fix
and a sysfb_efi fix for a bug that triggered on older ThinkPads"
* 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/sysfb_efi: Fix valid BAR address range check
x86/efi-bgrt: Switch all pr_err() to pr_notice() for invalid BGRT
MAINTAINERS: Remove asterisk from EFI directory names
Linus Torvalds [Fri, 6 May 2016 18:27:05 +0000 (11:27 -0700)]
Merge branch 'parisc-4.6-5' of git://git./linux/kernel/git/deller/parisc-linux
Pull parisc fix from Helge Deller:
"Patch from Dmitry V Levin to fix a kernel crash when a straced process
calls the (invalid) syscall which is equal to value of __NR_Linux_syscalls"
* 'parisc-4.6-5' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: fix a bug when syscall number of tracee is __NR_Linux_syscalls
Linus Torvalds [Fri, 6 May 2016 18:14:38 +0000 (11:14 -0700)]
Merge tag 'arc-4.6-rc7-fixes' of git://git./linux/kernel/git/vgupta/arc
Pull ARC fixes from Vineet Gupta:
"Late in the cycle, but this has fixes for couple of issues: a PAE40
boot crash and Arnd spotting lack of barriers in BE io-accessors.
The 3rd patch for enabling highmem in low physical mem ;-) honestly is
more than a "fix" but its been in works for some time, seems to be
stable in testing and enables 2 of our customers to go forward with
4.6 kernel.
- Fix for PTE truncation in PAE40 builds
- Fix for big endian IO accessors lacking IO barrier
- Allow HIGHMEM to work with low physical addresses"
* tag 'arc-4.6-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
ARC: support HIGHMEM even without PAE40
ARC: Fix PAE40 boot failures due to PTE truncation
ARC: Add missing io barriers to io{read,write}{16,32}be()
Linus Torvalds [Fri, 6 May 2016 18:05:07 +0000 (11:05 -0700)]
Merge tag 'powerpc-4.6-5' of git://git./linux/kernel/git/powerpc/linux
Pull powerpc fix from Michael Ellerman:
"Fix bad inline asm constraint in create_zero_mask() from Anton
Blanchard"
* tag 'powerpc-4.6-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc: Fix bad inline asm constraint in create_zero_mask()
Linus Torvalds [Fri, 6 May 2016 17:59:53 +0000 (10:59 -0700)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
"Fixes for i915, amdgpu/radeon and imx.
The IMX fix is for an autoloading regression found in Fedora. The
radeon fixes, are the same fix to amdgpu/radeon to avoid a hardware
lockup in some circumstances with a bad mode, and a double free bug I
took a few hours chasing down the other morning.
The i915 fixes are across the board, all stable material, and fixing
some hangs and suspend/resume issues, along with a live status
regressions"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
gpu: ipu-v3: Fix imx-ipuv3-crtc module autoloading
drm/amdgpu: make sure vertical front porch is at least 1
drm/radeon: make sure vertical front porch is at least 1
drm/amdgpu: set metadata pointer to NULL after freeing.
drm/i915: Make RPS EI/thresholds multiple of 25 on SNB-BDW
drm/i915: Fake HDMI live status
drm/i915: Fix eDP low vswing for Broadwell
drm/i915/ddi: Fix eDP VDD handling during booting and suspend/resume
drm/i915: Fix system resume if PCI device remained enabled
drm/i915: Avoid stalling on pending flips for legacy cursor updates
Dan Williams [Fri, 6 May 2016 17:20:10 +0000 (10:20 -0700)]
libnvdimm, pfn: fix ARCH=alpha allmodconfig build failure
I had relied on the kbuild robot for cross build coverage, however it
only builds alpha_defconfig. Switch from HPAGE_SIZE to PMD_SIZE, which
is more widely defined.
Fixes: 658922e57b84 ("libnvdimm, pfn: fix memmap reservation sizing")
Cc: <stable@vger.kernel.org>
Reported-by: Guenter Roeck <guenter@roeck-us.net>
Tested-by: Guenter Roeck <guenter@roeck-us.net>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Linus Lüssing [Wed, 4 May 2016 15:25:02 +0000 (17:25 +0200)]
bridge: fix igmp / mld query parsing
With the newly introduced helper functions the skb pulling is hidden
in the checksumming function - and undone before returning to the
caller.
The IGMP and MLD query parsing functions in the bridge still
assumed that the skb is pointing to the beginning of the IGMP/MLD
message while it is now kept at the beginning of the IPv4/6 header.
If there is a querier somewhere else, then this either causes
the multicast snooping to stay disabled even though it could be
enabled. Or, if we have the querier enabled too, then this can
create unnecessary IGMP / MLD query messages on the link.
Fixing this by taking the offset between IP and IGMP/MLD header into
account, too.
Fixes: 9afd85c9e455 ("net: Export IGMP/MLD message validation code")
Reported-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arnaldo Carvalho de Melo [Fri, 6 May 2016 15:45:25 +0000 (12:45 -0300)]
perf trace: Move futex_op beautifier to tools/perf/trace/beauty/
To reduce the size of builtin-trace.c.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-vb8dpy7bptkf219q5c25ulfp@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Fri, 6 May 2016 13:02:32 +0000 (10:02 -0300)]
perf trace: Move open_flags beautifier to tools/perf/trace/beauty/
To reduce the size of builtin-trace.c.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-jt293541hv9od7gqw6lilioh@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Fri, 6 May 2016 12:58:02 +0000 (09:58 -0300)]
perf trace: Move signum beautifier to tools/perf/trace/beauty/
To reduce the size of builtin-trace.c.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-qecqxwwtreio6eaatfv58yq5@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Andi Kleen [Wed, 27 Apr 2016 20:00:51 +0000 (13:00 -0700)]
perf stat: Add extra output of counter values with -vv
Add debug output of raw counter values per CPU when perf stat -v is
specified, together with their cpu numbers. This is very useful to
debug problems with per core counters, where we can normally only see
aggregated values.
v2: Make it depend on -vv, not -v
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1461787251-6702-12-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Chris Phlipot [Thu, 28 Apr 2016 08:19:11 +0000 (01:19 -0700)]
perf script: Update export-to-postgresql to support callchain export
Update the export-to-postgresql.py to support the newly introduced
callchain export.
callchains are added into the existing call_paths table and can now
be associated with samples when the "callpaths" commandline option
is used with the script.
Ex.:
$ perf script -s export-to-postgresql.py example_db all callchains
Includes the following changes to enable callchain export via the python export
APIs:
- Add the "callchains" commandline option, which is used to enable
callchain export by setting the perf_db_export_callchains global
- Add perf_db_export_callchains checks for call_path table creation
and population.
- Add call_path_id to samples_table to conform with the new API
example usage and output using a small test app:
test_app.c:
volatile int x = 0;
void inc_x_loop()
{
int i;
for(i=0; i<
100000000; i++)
x++;
}
void a()
{
inc_x_loop();
}
void b()
{
inc_x_loop();
}
int main()
{
a();
b();
return 0;
}
example usage:
$ gcc -g -O0 test_app.c
$ perf record --call-graph=dwarf ./a.out
[ perf record: Woken up 77 times to write data ]
[ perf record: Captured and wrote 19.373 MB perf.data (2404 samples) ]
$ perf script -s scripts/python/export-to-postgresql.py
example_db all callchains
$ psql example_db
example_db=#
SELECT
(SELECT name FROM symbols WHERE id = cps.symbol_id) as symbol,
(SELECT name FROM symbols WHERE id =
(SELECT symbol_id from call_paths where id = cps.parent_id))
as parent_symbol,
sum(period) as event_count
FROM samples join call_paths as cps on call_path_id = cps.id
GROUP BY cps.id,evsel_id
ORDER BY event_count DESC
LIMIT 5;
symbol | parent_symbol | event_count
------------------+--------------------------+-------------
inc_x_loop | a |
734250982
inc_x_loop | b |
731028057
unknown | unknown |
1335858
task_tick_fair | scheduler_tick |
1238842
update_wall_time | tick_do_update_jiffies64 | 650373
(5 rows)
The above data shows total "self time" in cycles for each call path that was
sampled. It is intended to demonstrate how it accounts separately for the two
ways to reach the "inc_x_loop" function(via "a" and "b"). Recursive common
table expressions can be used as well to get cumulative time spent in a
function as well, but that is beyond the scope of this basic example.
Signed-off-by: Chris Phlipot <cphlipot0@gmail.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1461831551-12213-7-git-send-email-cphlipot0@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Chris Phlipot [Thu, 28 Apr 2016 08:19:10 +0000 (01:19 -0700)]
perf script: Expose usage of the callchain db export via the python api
This change allows python scripts to be able to utilize the recent
changes to the db export api allowing the export of call_paths derived
from sampled callchains. These call paths are also now associated with
the samples from which they were derived.
- This feature is enabled by setting "perf_db_export_callchains" to true
- When enabled, samples that have callchain information will have the
callchains exported via call_path_table
- The call_path_id field is added to sample_table to enable association of
samples with the corresponding callchain stored in the call paths
table. A call_path_id of 0 will be exported if there is no
corresponding callchain.
- When "perf_db_export_callchains" and "perf_db_export_calls" are both
set to True, the call path root data structure will be shared. This
prevents duplicating of data and call path ids that would result from
building two separate call path trees in memory.
- The call_return_processor structure definition was relocated to the header
file to make its contents visible to db-export.c. This enables the
sharing of call path trees between the two features, as mentioned
above.
This change is visible to python scripts using the python db export api.
The change is backwards compatible with scripts written against the
previous API, assuming that the scripts model the sample_table function
after the one in export-to-postgresql.py script by allowing for
additional arguments to be added in the future. ie. using *x as the
final argument of the sample_table function.
Signed-off-by: Chris Phlipot <cphlipot0@gmail.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1461831551-12213-6-git-send-email-cphlipot0@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Chris Phlipot [Thu, 28 Apr 2016 08:19:09 +0000 (01:19 -0700)]
perf script: Add call path id to exported sample in db export
The exported sample now contains a reference to the call_path_id that
represents its callchain.
While callchains themselves are nice to have, being able to associate
them with samples makes them much more useful, and can allow for such
things as determining how much cumulative time is spent in a particular
function. This information is normally possible to get from the call
return processor. However, when doing normal sampling, call/return
information is not available, thus necessitating the need for
associating samples directly with call paths.
This commit include changes to db-export layer to make this information
available for subsequent patches in this change set, but by itself, does
not make any changes visible to the user.
Signed-off-by: Chris Phlipot <cphlipot0@gmail.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1461831551-12213-5-git-send-email-cphlipot0@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Chris Phlipot [Thu, 28 Apr 2016 08:19:08 +0000 (01:19 -0700)]
perf script: Enable db export to output sampled callchains
This change enables the db export api to export callchains. This is
accomplished by adding callchains obtained from samples to the
call_path_root structure and exporting them via the current call path
export API.
While the current API does support exporting call paths, this is not
supported when sampling. This commit addresses that missing feature by
allowing the export of call paths when callchains are present in
samples.
Summary:
- This feature is activated by initializing the call_path_root member
inside the db_export structure to a non-null value.
- Callchains are resolved with thread__resolve_callchain() and then stored
and exported by adding a call path under call path root.
- Symbol and DSO for each callchain node are exported via db_ids_from_al()
This commit puts in place infrastructure to be used by subsequent commits,
and by itself, does not introduce any user-visible changes.
Signed-off-by: Chris Phlipot <cphlipot0@gmail.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1461831551-12213-4-git-send-email-cphlipot0@gmail.com
[ Made adjustments suggested by Adrian Hunter, see thread via this cset's Link: tag ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Chris Phlipot [Thu, 28 Apr 2016 08:19:07 +0000 (01:19 -0700)]
perf tools: Refactor code to move call path handling out of thread-stack
Move the call path handling code out of thread-stack.c and
thread-stack.h to allow other components that are not part of
thread-stack to create call paths.
Summary:
- Create call-path.c and call-path.h and add them to the build.
- Move all call path related code out of thread-stack.c and thread-stack.h
and into call-path.c and call-path.h.
- A small subset of structures and functions are now visible through
call-path.h, which is required for thread-stack.c to continue to
compile.
This change is a prerequisite for subsequent patches in this change set
and by itself contains no user-visible changes.
Signed-off-by: Chris Phlipot <cphlipot0@gmail.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1461831551-12213-3-git-send-email-cphlipot0@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Dmitry V. Levin [Wed, 27 Apr 2016 01:56:11 +0000 (04:56 +0300)]
parisc: fix a bug when syscall number of tracee is __NR_Linux_syscalls
Do not load one entry beyond the end of the syscall table when the
syscall number of a traced process equals to __NR_Linux_syscalls.
Similar bug with regular processes was fixed by commit
3bb457af4fa8
("[PARISC] Fix bug when syscall nr is __NR_Linux_syscalls").
This bug was found by strace test suite.
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
Acked-by: Helge Deller <deller@gmx.de>
Signed-off-by: Helge Deller <deller@gmx.de>
Chris Phlipot [Thu, 28 Apr 2016 08:19:06 +0000 (01:19 -0700)]
perf callchain: Fix incorrect ordering of entries
The existing implementation of thread__resolve_callchain, under certain
circumstances, can assemble callchain entries in the incorrect order.
The callchain entries are resolved incorrectly for a sample when all of
the following conditions are met:
1. callchain_param.order is set to ORDER_CALLER
2. thread__resolve_callchain_sample is able to resolve callchain entries
for the sample.
3. unwind__get_entries is also able to resolve callchain entries for the
sample.
The fix is accomplished by reversing the order in which
thread__resolve_callchain_sample and unwind__get_entries are called when
callchain_param.order is set to ORDER_CALLER.
Unwind specific code from thread__resolve_callchain is also moved into a
new static function to improve readability of the fix.
How to Reproduce the Existing Bug:
Modifying perf script to print call trees in the opposite order or
applying the remaining patches from this series and comparing the
results output from export-to-postgtresql.py are the easiest ways to see
the bug, however it can still be seen in current builds using perf
report.
Here is how i can reproduce the bug using perf report:
# perf record --call-graph=dwarf stress -c 1 -t 5
when i run this command:
# perf report --call-graph=flat,0,0,callee
This callchain, containing kernel (handle_irq_event, etc) and userspace
samples (__libc_start_main, etc) is contained in the output, which looks
correct (callee order):
gen8_irq_handler
handle_irq_event_percpu
handle_irq_event
handle_edge_irq
handle_irq
do_IRQ
ret_from_intr
__random
rand
0x558f2a04dded
0x558f2a04c774
__libc_start_main
0x558f2a04dcd9
Now run this command using caller order:
# perf report --call-graph=flat,0,0,caller
It is expected to see the exact reverse of the above when using caller
order (with "0x558f2a04dcd9" at the top and "gen8_irq_handler" at the
bottom) in the output, but it is nowhere to be found.
instead you see this:
ret_from_intr
do_IRQ
handle_irq
handle_edge_irq
handle_irq_event
handle_irq_event_percpu
gen8_irq_handler
0x558f2a04dcd9
__libc_start_main
0x558f2a04c774
0x558f2a04dded
rand
__random
Notice how internally the kernel symbols are reversed and the user space
symbols are reversed, but the kernel symbols still appear above the user
space symbols.
if this patch is applied and perf script is re-run, you will see the
expected output (with "0x558f2a04dcd9" at the top and "gen8_irq_handler"
at the bottom):
0x558f2a04dcd9
__libc_start_main
0x558f2a04c774
0x558f2a04dded
rand
__random
ret_from_intr
do_IRQ
handle_irq
handle_edge_irq
handle_irq_event
handle_irq_event_percpu
gen8_irq_handler
Signed-off-by: Chris Phlipot <cphlipot0@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1461831551-12213-2-git-send-email-cphlipot0@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Fri, 6 May 2016 02:38:05 +0000 (23:38 -0300)]
perf trace: Do not print raw args list for syscalls with no args
The test to check if the arg format had been read from the
syscall:sys_enter_name/format file was looking at the list of non-commom
fields, and if that is empty, it would think it had failed to read it,
because it doesn't exist, for instance, for the clone() syscall.
So instead before dumping the raw syscall args list check
IS_ERR(sc->tp_format), if that is true, then an attempt was made to read
the format file and failed, in which case dump the raw arg list values.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-ls7pmdqb2xy9339vdburwvnk@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Rafael J. Wysocki [Fri, 6 May 2016 11:16:22 +0000 (13:16 +0200)]
Merge branches 'pm-opp-fixes', 'pm-cpufreq-fixes' and 'pm-cpuidle-fixes'
* pm-opp-fixes:
PM / OPP: Remove useless check
* pm-cpufreq-fixes:
intel_pstate: Fix intel_pstate_get()
cpufreq: intel_pstate: Fix HWP on boot CPU after system resume
cpufreq: st: enable selective initialization based on the platform
* pm-cpuidle-fixes:
ARM: cpuidle: Pass on arm_cpuidle_suspend()'s return value
Rafael J. Wysocki [Fri, 6 May 2016 11:15:52 +0000 (13:15 +0200)]
Merge branches 'acpica-fixes' and 'device-properties-fixes'
* acpica-fixes:
ACPICA: Dispatcher: Update thread ID for recursive method calls
* device-properties-fixes:
device property: Avoid potential dereferences of invalid pointers
Chen Yu [Fri, 6 May 2016 03:33:39 +0000 (11:33 +0800)]
x86/tsc: Read all ratio bits from MSR_PLATFORM_INFO
Currently we read the tsc radio: ratio = (MSR_PLATFORM_INFO >> 8) & 0x1f;
Thus we get bit 8-12 of MSR_PLATFORM_INFO, however according to the SDM
(35.5), the ratio bits are bit 8-15.
Ignoring the upper bits can result in an incorrect tsc ratio, which causes the
TSC calibration and the Local APIC timer frequency to be incorrect.
Fix this problem by masking 0xff instead.
[ tglx: Massaged changelog ]
Fixes: 7da7c1561366 "x86, tsc: Add static (MSR) TSC calibration on Intel Atom SoCs"
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: stable@vger.kernel.org
Cc: Bin Gao <bin.gao@intel.com>
Cc: Len Brown <lenb@kernel.org>
Link: http://lkml.kernel.org/r/1462505619-5516-1-git-send-email-yu.c.chen@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Ingo Molnar [Fri, 6 May 2016 06:35:14 +0000 (08:35 +0200)]
Merge tag 'perf-core-for-mingo-
20160505' of git://git./linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
User visible changes:
- Order output of 'perf trace --summary' better, now the threads will
appear ascending order of number of events, and then, for each, in
descending order of syscalls by the time spent in the syscalls, so
that the last page produced can be the one about the most interesting
thread straced, suggested by Milian Wolff (Arnaldo Carvalho de Melo)
- Do not show the runtime_ms for a thread when not collecting it, that
is done so far only with 'perf trace --sched' (Arnaldo Carvalho de Melo)
- Fix kallsyms perf test on ppc64le (Naveen N. Rao)
Infrastructure changes:
- Move global variables related to presence of some keys in the sort order to a
per hist struct, to allow code like the hists browser to work with multiple
hists with different lists of columns (Jiri Olsa)
- Add support for generating bpf prologue in powerpc (Naveen N. Rao)
- Fix kprobe and kretprobe handling with kallsyms on ppc64le (Naveen N. Rao)
- evlist mmap changes, prep work for supporting reading backwards (Wang Nan)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Linus Torvalds [Fri, 6 May 2016 03:48:35 +0000 (20:48 -0700)]
Merge branch 'akpm' (patches from Andrew)
Merge fixes from Andrew Morton:
"14 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
byteswap: try to avoid __builtin_constant_p gcc bug
lib/stackdepot: avoid to return 0 handle
mm: fix kcompactd hang during memory offlining
modpost: fix module autoloading for OF devices with generic compatible property
proc: prevent accessing /proc/<PID>/environ until it's ready
mm/zswap: provide unique zpool name
mm: thp: kvm: fix memory corruption in KVM with THP enabled
MAINTAINERS: fix Rajendra Nayak's address
mm, cma: prevent nr_isolated_* counters from going negative
mm: update min_free_kbytes from khugepaged after core initialization
huge pagecache: mmap_sem is unlocked when truncation splits pmd
rapidio/mport_cdev: fix uapi type definitions
mm: memcontrol: let v2 cgroups follow changes in system swappiness
mm: thp: correct split_huge_pages file permission
Nikolay Aleksandrov [Wed, 4 May 2016 14:18:45 +0000 (16:18 +0200)]
net: bridge: fix old ioctl unlocked net device walk
get_bridge_ifindices() is used from the old "deviceless" bridge ioctl
calls which aren't called with rtnl held. The comment above says that it is
called with rtnl but that is not really the case.
Here's a sample output from a test ASSERT_RTNL() which I put in
get_bridge_ifindices and executed "brctl show":
[ 957.422726] RTNL: assertion failed at net/bridge//br_ioctl.c (30)
[ 957.422925] CPU: 0 PID: 1862 Comm: brctl Tainted: G W O
4.6.0-rc4+ #157
[ 957.423009] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS 1.8.1-20150318_183358- 04/01/2014
[ 957.423009]
0000000000000000 ffff880058adfdf0 ffffffff8138dec5
0000000000000400
[ 957.423009]
ffffffff81ce8380 ffff880058adfe58 ffffffffa05ead32
0000000000000001
[ 957.423009]
00007ffec1a444b0 0000000000000400 ffff880053c19130
0000000000008940
[ 957.423009] Call Trace:
[ 957.423009] [<
ffffffff8138dec5>] dump_stack+0x85/0xc0
[ 957.423009] [<
ffffffffa05ead32>]
br_ioctl_deviceless_stub+0x212/0x2e0 [bridge]
[ 957.423009] [<
ffffffff81515beb>] sock_ioctl+0x22b/0x290
[ 957.423009] [<
ffffffff8126ba75>] do_vfs_ioctl+0x95/0x700
[ 957.423009] [<
ffffffff8126c159>] SyS_ioctl+0x79/0x90
[ 957.423009] [<
ffffffff8163a4c0>] entry_SYSCALL_64_fastpath+0x23/0xc1
Since it only reads bridge ifindices, we can use rcu to safely walk the net
device list. Also remove the wrong rtnl comment above.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ian Campbell [Wed, 4 May 2016 13:21:53 +0000 (14:21 +0100)]
VSOCK: do not disconnect socket when peer has shutdown SEND only
The peer may be expecting a reply having sent a request and then done a
shutdown(SHUT_WR), so tearing down the whole socket at this point seems
wrong and breaks for me with a client which does a SHUT_WR.
Looking at other socket family's stream_recvmsg callbacks doing a shutdown
here does not seem to be the norm and removing it does not seem to have
had any adverse effects that I can see.
I'm using Stefan's RFC virtio transport patches, I'm unsure of the impact
on the vmci transport.
Signed-off-by: Ian Campbell <ian.campbell@docker.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Cc: Andy King <acking@vmware.com>
Cc: Dmitry Torokhov <dtor@vmware.com>
Cc: Jorgen Hansen <jhansen@vmware.com>
Cc: Adit Ranadive <aditr@vmware.com>
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Jurgens [Wed, 4 May 2016 12:00:33 +0000 (15:00 +0300)]
net/mlx4_en: Fix endianness bug in IPV6 csum calculation
Use htons instead of unconditionally byte swapping nexthdr. On a little
endian systems shifting the byte is correct behavior, but it results in
incorrect csums on big endian architectures.
Fixes: f8c6455bb04b ('net/mlx4_en: Extend checksum offloading by CHECKSUM COMPLETE')
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Carol Soto <clsoto@us.ibm.com>
Tested-by: Carol Soto <clsoto@us.ibm.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Fri, 6 May 2016 03:07:14 +0000 (20:07 -0700)]
mailmap: add John Paul Adrian Glaubitz
Apparently patchwork ended up truncating the full name.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Fri, 6 May 2016 01:10:01 +0000 (18:10 -0700)]
Merge branch 'libnvdimm-fixes' of git://git./linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm fixes from Dan Williams:
- a fix for the persistent memory 'struct page' driver. The
implementation overlooked the fact that pages are allocated in 2MB
units leading to -ENOMEM when establishing some configurations.
It's tagged for -stable as the problem was introduced with the
initial implementation in 4.5.
- The new "error status translation" routine, introduced with the 4.6
updates to the nfit driver, missed a necessary path in
acpi_nfit_ctl().
The end result is that we are falsely assuming commands complete
successfully when the embedded status says otherwise.
* 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
nfit: fix translation of command status results
libnvdimm, pfn: fix memmap reservation sizing