cpufreq: Synchronize the cpufreq store_*() routines with CPU hotplug
The functions that are used to write to cpufreq sysfs files (such as
store_scaling_max_freq()) are not hotplug safe. They can race with CPU
hotplug tasks and lead to problems such as trying to acquire an already
destroyed timer-mutex etc.
Eg:
__cpufreq_remove_dev()
__cpufreq_governor(policy, CPUFREQ_GOV_STOP);
policy->governor->governor(policy, CPUFREQ_GOV_STOP);
cpufreq_governor_dbs()
case CPUFREQ_GOV_STOP:
mutex_destroy(&cpu_cdbs->timer_mutex)
cpu_cdbs->cur_policy = NULL;
<PREEMPT>
store()
__cpufreq_set_policy()
__cpufreq_governor(policy, CPUFREQ_GOV_LIMITS);
policy->governor->governor(policy, CPUFREQ_GOV_LIMITS);
case CPUFREQ_GOV_LIMITS:
mutex_lock(&cpu_cdbs->timer_mutex); <-- Warning (destroyed mutex)
if (policy->max < cpu_cdbs->cur_policy->cur) <- cur_policy == NULL
So use get_online_cpus()/put_online_cpus() in the store_*() functions, to
synchronize with CPU hotplug. However, there is an additional point to note
here: some parts of the CPU teardown in the cpufreq subsystem are done in
the CPU_POST_DEAD stage, with cpu_hotplug.lock *released*. So, using the
get/put_online_cpus() functions alone is insufficient; we should also ensure
that we don't race with those latter steps in the hotplug sequence. We can
easily achieve this by checking if the CPU is online before proceeding with
the store, since the CPU would have been marked offline by the time the
CPU_POST_DEAD notifiers are executed.
Reported-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>