perf_events: Fix bogus context time tracking
authorStephane Eranian <eranian@google.com>
Fri, 15 Oct 2010 13:26:01 +0000 (15:26 +0200)
committerIngo Molnar <mingo@elte.hu>
Mon, 18 Oct 2010 17:58:46 +0000 (19:58 +0200)
You can only call update_context_time() when the context
is active, i.e., the thread it is attached to is still running.

However, perf_event_read() can be called even when the context
is inactive, e.g., user read() the counters. The call to
update_context_time() must be conditioned on the status of
the context, otherwise, bogus time_enabled, time_running may
be returned. Here is an example on AMD64. The task program
is an example from libpfm4. The -p prints deltas every 1s.

$ task -p -e cpu_clk_unhalted sleep 5
    2,266,610 cpu_clk_unhalted (0.00% scaling, ena=2,158,982, run=2,158,982)
    0 cpu_clk_unhalted (0.00% scaling, ena=2,158,982, run=2,158,982)
    0 cpu_clk_unhalted (0.00% scaling, ena=2,158,982, run=2,158,982)
    0 cpu_clk_unhalted (0.00% scaling, ena=2,158,982, run=2,158,982)
    0 cpu_clk_unhalted (0.00% scaling, ena=2,158,982, run=2,158,982)
5,242,358,071 cpu_clk_unhalted (99.95% scaling, ena=5,000,359,984, run=2,319,270)

Whereas if you don't read deltas, e.g., no call to perf_event_read() until
the process terminates:

$ task -e cpu_clk_unhalted sleep 5
    2,497,783 cpu_clk_unhalted (0.00% scaling, ena=2,376,899, run=2,376,899)

Notice that time_enable, time_running are bogus in the first example
causing bogus scaling.

This patch fixes the problem, by conditionally calling update_context_time()
in perf_event_read().

Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: stable@kernel.org
LKML-Reference: <4cb856dc.51edd80a.5ae0.38fb@mx.google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
kernel/perf_event.c

index 1ec3916ffef008521862e1240a93040f1fa7df72..e7eeba1794fddac685f5c80a29e9675df17f5511 100644 (file)
@@ -1780,7 +1780,13 @@ static u64 perf_event_read(struct perf_event *event)
                unsigned long flags;
 
                raw_spin_lock_irqsave(&ctx->lock, flags);
-               update_context_time(ctx);
+               /*
+                * may read while context is not active
+                * (e.g., thread is blocked), in that case
+                * we cannot update context time
+                */
+               if (ctx->is_active)
+                       update_context_time(ctx);
                update_event_times(event);
                raw_spin_unlock_irqrestore(&ctx->lock, flags);
        }