sched/debug: Show the sum wait time of a task group
author Yun Wang <yun.wang@linux.alibaba.com>
Wed, 4 Jul 2018 03:27:27 +0000 (11:27 +0800)
committer Ingo Molnar <mingo@kernel.org>
Wed, 25 Jul 2018 09:41:05 +0000 (11:41 +0200)
Although we can rely on cpuacct to present the CPU usage of task
groups, it is hard to tell how intense the competition for CPU
resources is between these groups.

Monitoring the wait time or sched_debug of each process could be
very expensive, and there is no good way to accurately represent the
conflict from this information; we need the wait time at the group
level.

Thus we introduce the group's wait_sum to represent the resource
conflict between task groups, which is simply the sum of the wait
time of the group's cfs_rq on each CPU.

The 'cpu.stat' file is modified to show the new statistic, for example:

   nr_periods 0
   nr_throttled 0
   throttled_time 0
   wait_sum 2035098795584

Now we can monitor the changes of wait_sum to tell how much a
task group is suffering in the fight for CPU resources.

For example:

   (wait_sum - last_wait_sum) * 100 / (nr_cpu * period_ns) == X%

means the task group spent X percent of the period waiting
for the CPU.
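
As a minimal user-space sketch of the calculation above (not part of
this patch), the two helpers below pull wait_sum out of the text of a
'cpu.stat' file and apply the formula to two samples. The function
names parse_wait_sum() and wait_pct() are hypothetical, and the sketch
assumes that wait_sum is reported in nanoseconds, like period_ns.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: scan the text of a cpu.stat file line by line
 * and store the value of the "wait_sum" line in *out.
 * Returns 0 on success, -1 if no such line exists (the patch omits it
 * when schedstats are disabled and for the root task group).
 */
static int parse_wait_sum(const char *stat_text, uint64_t *out)
{
	const char *line = stat_text;

	while (line && *line) {
		if (sscanf(line, "wait_sum %" SCNu64, out) == 1)
			return 0;
		/* Advance to the start of the next line, if any. */
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}

/*
 * Hypothetical helper implementing
 *   (wait_sum - last_wait_sum) * 100 / (nr_cpu * period_ns)
 * for two samples taken period_ns apart.
 */
static double wait_pct(uint64_t last_wait_sum, uint64_t wait_sum,
		       unsigned int nr_cpu, uint64_t period_ns)
{
	assert(nr_cpu > 0 && period_ns > 0);
	assert(wait_sum >= last_wait_sum);

	return (double)(wait_sum - last_wait_sum) * 100.0 /
	       ((double)nr_cpu * (double)period_ns);
}
```

A caller would read the group's 'cpu.stat' twice, period_ns apart,
pass each buffer to parse_wait_sum(), and feed both values to
wait_pct(). For instance, a wait_sum increase of 2000000000 over a
period of 1000000000 ns on 4 CPUs yields 50, i.e. the group spent
half of the available CPU time waiting.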

Signed-off-by: Michael Wang <yun.wang@linux.alibaba.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/ff7dae3b-e5f9-7157-1caa-ff02c6b23dc1@linux.alibaba.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
kernel/sched/core.c

index fc177c06e490dbe8f708c476f113bfa948e13eea..2bc391a574e66e2c9ab132eadb8a0c4a4813b5a4 100644
@@ -6748,6 +6748,16 @@ static int cpu_cfs_stat_show(struct seq_file *sf, void *v)
        seq_printf(sf, "nr_throttled %d\n", cfs_b->nr_throttled);
        seq_printf(sf, "throttled_time %llu\n", cfs_b->throttled_time);
 
+       if (schedstat_enabled() && tg != &root_task_group) {
+               u64 ws = 0;
+               int i;
+
+               for_each_possible_cpu(i)
+                       ws += schedstat_val(tg->se[i]->statistics.wait_sum);
+
+               seq_printf(sf, "wait_sum %llu\n", ws);
+       }
+
        return 0;
 }
 #endif /* CONFIG_CFS_BANDWIDTH */