xen, balloon: Fix CPU hotplug callback registration
author Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Mon, 10 Mar 2014 20:41:45 +0000 (02:11 +0530)
committer Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Thu, 20 Mar 2014 12:43:48 +0000 (13:43 +0100)
Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:

        get_online_cpus();

        for_each_online_cpu(cpu)
                init_cpu(cpu);

        register_cpu_notifier(&foobar_cpu_notifier);

        put_online_cpus();

This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).
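
The lock-order inversion can be sketched in user-space C. This is a model, not
kernel code: the enum labels, struct lock_order and is_abba() are hypothetical
names, and the order assumed for the hotplug path (cpu_add_remove_lock first,
then cpu_hotplug.lock) is an assumption about how the hotplug core nests the
two locks.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the two locks named in the text. */
enum lock_id { CPU_HOTPLUG_LOCK, CPU_ADD_REMOVE_LOCK };

/* A code path, modeled only by the order in which it takes two locks. */
struct lock_order {
	enum lock_id first;
	enum lock_id second;
};

/*
 * Two paths form an ABBA pair when they take the same two locks in
 * opposite order. Each path must name two distinct locks.
 */
static bool is_abba(struct lock_order a, struct lock_order b)
{
	assert(a.first != a.second && b.first != b.second);
	return a.first == b.second && a.second == b.first;
}

/* Registration path from the text: get_online_cpus(), then register_cpu_notifier(). */
static const struct lock_order registration_path = {
	CPU_HOTPLUG_LOCK, CPU_ADD_REMOVE_LOCK
};

/* Assumed hotplug path: cpu_add_remove_lock, then cpu_hotplug.lock. */
static const struct lock_order hotplug_path = {
	CPU_ADD_REMOVE_LOCK, CPU_HOTPLUG_LOCK
};
```

Under these assumptions, is_abba(registration_path, hotplug_path) is true: each
path can end up holding the lock the other one needs next.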

The xen balloon driver doesn't take get/put_online_cpus() around this code,
but that is also buggy, since it can miss CPU hotplug events in between the
initialization and callback registration:

        for_each_online_cpu(cpu)
                init_cpu(cpu);
                   ^
                   |  Race window; can miss CPU hotplug events here.
                   v
        register_cpu_notifier(&foobar_cpu_notifier);
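
The missed event can be reproduced in a single-threaded user-space model. The
sketch below is not kernel code: struct sim, cpu_up() and racy_setup() are
hypothetical stand-ins, and the hotplug event is placed in the window by hand
rather than arriving concurrently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 4

/* Hypothetical model state: which CPUs are online and which were initialized. */
struct sim {
	bool online[NR_CPUS];
	bool initialized[NR_CPUS];
	void (*notifier)(struct sim *s, int cpu);	/* NULL until registered */
};

static void init_cpu(struct sim *s, int cpu)
{
	s->initialized[cpu] = true;
}

/* Model of a CPU coming online: the callback runs only if one is registered. */
static void cpu_up(struct sim *s, int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	s->online[cpu] = true;
	if (s->notifier != NULL)
		s->notifier(s, cpu);
}

/*
 * Racy order: initialize the online CPUs, then register the callback.
 * late_cpu comes online in between. Returns the number of online CPUs
 * that were never initialized.
 */
static int racy_setup(int late_cpu)
{
	struct sim s = { .online = { true } };	/* only CPU 0 online at start */
	int cpu, missed = 0;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		if (s.online[cpu])
			init_cpu(&s, cpu);

	cpu_up(&s, late_cpu);		/* event lands in the race window */

	s.notifier = init_cpu;		/* registration happens too late */

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		if (s.online[cpu] && !s.initialized[cpu])
			missed++;
	return missed;
}
```

In this model racy_setup(2) returns 1: CPU 2 is online, but nothing ever
initialized it.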

Interestingly, the balloon code in xen can simply be reorganized as shown
below, to have a race-free method to register hotplug callbacks, without even
taking get/put_online_cpus(). This is because the initialization performed for
already online CPUs is exactly the same as that performed for CPUs that come
online later. Moreover, the code has checks in place to avoid double
initialization.

        register_cpu_notifier(&foobar_cpu_notifier);

        get_online_cpus();

        for_each_online_cpu(cpu)
                init_cpu(cpu);

        put_online_cpus();

A hotplug operation that occurs between registering the notifier and calling
get_online_cpus() won't disrupt anything, because the code takes care to
perform the memory allocations only once.
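
Why the register-first order tolerates such an event can be sketched in a
single-threaded user-space model. The names struct sim, alloc_scratch(),
cpu_up() and fixed_setup() are hypothetical, and get/put_online_cpus() are not
modeled; the only property carried over from the patch is that initialization
returns early when a CPU already has its page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 4

/* Hypothetical model state, with a per-CPU allocation counter. */
struct sim {
	bool online[NR_CPUS];
	bool has_page[NR_CPUS];
	int allocations[NR_CPUS];
	void (*notifier)(struct sim *s, int cpu);	/* NULL until registered */
};

/* Idempotent init: a second call for the same CPU allocates nothing. */
static void alloc_scratch(struct sim *s, int cpu)
{
	if (s->has_page[cpu])
		return;
	s->has_page[cpu] = true;
	s->allocations[cpu]++;
}

/* Model of a CPU coming online: the callback runs only if one is registered. */
static void cpu_up(struct sim *s, int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	s->online[cpu] = true;
	if (s->notifier != NULL)
		s->notifier(s, cpu);
}

/*
 * Fixed order: register first, then walk the online CPUs. late_cpu comes
 * online between the two steps. Returns the largest per-CPU allocation
 * count, or -1 if some online CPU ended up without a page.
 */
static int fixed_setup(int late_cpu)
{
	struct sim s = { .online = { true } };	/* only CPU 0 online at start */
	int cpu, max_allocs = 0;

	s.notifier = alloc_scratch;	/* register the callback first */

	cpu_up(&s, late_cpu);		/* event before the init loop */

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		if (s.online[cpu])
			alloc_scratch(&s, cpu);

	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		if (s.online[cpu] && !s.has_page[cpu])
			return -1;
		if (s.allocations[cpu] > max_allocs)
			max_allocs = s.allocations[cpu];
	}
	return max_allocs;
}
```

In this model fixed_setup(2) returns 1: the late CPU is covered by the
callback, the loop then skips it, and no CPU is allocated for twice.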

So reorganize the balloon code in xen this way to fix the issues with CPU
hotplug callback registration.

Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
index 37d06ea624aa953d40448bcd2a2d4943baf79fa9..dd7954922942eda677f2dc5090db1a6325043eca 100644
--- a/drivers/xen/balloon.c
+++ b/drivers/xen/balloon.c
@@ -592,19 +592,29 @@ static void __init balloon_add_region(unsigned long start_pfn,
        }
 }
 
+static int alloc_balloon_scratch_page(int cpu)
+{
+       if (per_cpu(balloon_scratch_page, cpu) != NULL)
+               return 0;
+
+       per_cpu(balloon_scratch_page, cpu) = alloc_page(GFP_KERNEL);
+       if (per_cpu(balloon_scratch_page, cpu) == NULL) {
+               pr_warn("Failed to allocate balloon_scratch_page for cpu %d\n", cpu);
+               return -ENOMEM;
+       }
+
+       return 0;
+}
+
+
 static int balloon_cpu_notify(struct notifier_block *self,
                                    unsigned long action, void *hcpu)
 {
        int cpu = (long)hcpu;
        switch (action) {
        case CPU_UP_PREPARE:
-               if (per_cpu(balloon_scratch_page, cpu) != NULL)
-                       break;
-               per_cpu(balloon_scratch_page, cpu) = alloc_page(GFP_KERNEL);
-               if (per_cpu(balloon_scratch_page, cpu) == NULL) {
-                       pr_warn("Failed to allocate balloon_scratch_page for cpu %d\n", cpu);
+               if (alloc_balloon_scratch_page(cpu))
                        return NOTIFY_BAD;
-               }
                break;
        default:
                break;
@@ -624,15 +634,17 @@ static int __init balloon_init(void)
                return -ENODEV;
 
        if (!xen_feature(XENFEAT_auto_translated_physmap)) {
-               for_each_online_cpu(cpu)
-               {
-                       per_cpu(balloon_scratch_page, cpu) = alloc_page(GFP_KERNEL);
-                       if (per_cpu(balloon_scratch_page, cpu) == NULL) {
-                               pr_warn("Failed to allocate balloon_scratch_page for cpu %d\n", cpu);
+               register_cpu_notifier(&balloon_cpu_notifier);
+
+               get_online_cpus();
+               for_each_online_cpu(cpu) {
+                       if (alloc_balloon_scratch_page(cpu)) {
+                               put_online_cpus();
+                               unregister_cpu_notifier(&balloon_cpu_notifier);
                                return -ENOMEM;
                        }
                }
-               register_cpu_notifier(&balloon_cpu_notifier);
+               put_online_cpus();
        }
 
        pr_info("Initialising balloon driver\n");