From 09ef4939850aa81e3822b5dfb9ba2ada5e565816 Mon Sep 17 00:00:00 2001 From: "Kirill A. Shutemov" Date: Thu, 14 Nov 2013 14:31:13 -0800 Subject: [PATCH] x86: add missed pgtable_pmd_page_ctor/dtor calls for preallocated pmds In split page table lock case, we embed spinlock_t into struct page. For obvious reason, we don't want to increase size of struct page if spinlock_t is too big, like with DEBUG_SPINLOCK or DEBUG_LOCK_ALLOC or on -rt kernel. So we disable split page table lock, if spinlock_t is too big. This patchset allows to allocate the lock dynamically if spinlock_t is big. In this page->ptl is used to store pointer to spinlock instead of spinlock itself. It costs additional cache line for indirect access, but fix page fault scalability for multi-threaded applications. LOCK_STAT depends on DEBUG_SPINLOCK, so on current kernel enabling LOCK_STAT to analyse scalability issues breaks scalability. ;) The patchset mostly fixes this. Results for ./thp_memscale -c 80 -b 512M on 4-socket machine: baseline, no CONFIG_LOCK_STAT: 9.115460703 seconds time elapsed baseline, CONFIG_LOCK_STAT=y: 53.890567123 seconds time elapsed patched, no CONFIG_LOCK_STAT: 8.852250368 seconds time elapsed patched, CONFIG_LOCK_STAT=y: 11.069770759 seconds time elapsed Patch count is scary, but most of them trivial. Overview: Patches 1-4 Few bug fixes. No dependencies to other patches. Probably should applied as soon as possible. Patch 5 Changes signature of pgtable_page_ctor(). We will use it for dynamic lock allocation, so it can fail. Patches 6-8 Add missing constructor/destructor calls on few archs. It's fixes NR_PAGETABLE accounting and prepare to use split ptl. Patches 9-33 Add pgtable_page_ctor() fail handling to all archs. Patches 34 Finally adds support of dynamically-allocated page->pte. Also contains documentation for split page table lock. This patch (of 34): I've missed that we preallocate few pmds on pgd_alloc() if X86_PAE enabled. Let's add missed constructor/destructor calls. I haven't noticed it during testing since prep_new_page() clears page->mapping and therefore page->ptl. It's effectively equal to spin_lock_init(&page->ptl). Signed-off-by: Kirill A. Shutemov Acked-by: Ingo Molnar Cc: "H. Peter Anvin" Cc: "James E.J. Bottomley" Cc: "Kirill A. Shutemov" Cc: Benjamin Herrenschmidt Cc: Catalin Marinas Cc: Chen Liqin Cc: Chris Metcalf Cc: Chris Zankel Cc: Christoph Lameter Cc: David Howells Cc: David S. Miller Cc: Fenghua Yu Cc: Geert Uytterhoeven Cc: Grant Likely Cc: Guan Xuetao Cc: Haavard Skinnemoen Cc: Hans-Christian Egtvedt Cc: Heiko Carstens Cc: Helge Deller Cc: Hirokazu Takata Cc: Ivan Kokshaysky Cc: James Hogan Cc: Jeff Dike Cc: Jesper Nilsson Cc: Jonas Bonn Cc: Koichi Yasutake Cc: Lennox Wu Cc: Martin Schwidefsky Cc: Matt Turner Cc: Max Filippov Cc: Michal Simek Cc: Mikael Starvik Cc: Paul Mackerras Cc: Paul Mundt Cc: Peter Zijlstra Cc: Ralf Baechle Cc: Richard Henderson Cc: Richard Kuo Cc: Richard Weinberger Cc: Rob Herring Cc: Russell King Cc: Thomas Gleixner Cc: Tony Luck Cc: Vineet Gupta Cc: Will Deacon Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- arch/x86/mm/pgtable.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c index dfa537a03be1..1a7d21342e02 100644 --- a/arch/x86/mm/pgtable.c +++ b/arch/x86/mm/pgtable.c @@ -189,8 +189,10 @@ static void free_pmds(pmd_t *pmds[]) int i; for(i = 0; i < PREALLOCATED_PMDS; i++) - if (pmds[i]) + if (pmds[i]) { + pgtable_pmd_page_dtor(virt_to_page(pmds[i])); free_page((unsigned long)pmds[i]); + } } static int preallocate_pmds(pmd_t *pmds[]) @@ -200,8 +202,13 @@ static int preallocate_pmds(pmd_t *pmds[]) for(i = 0; i < PREALLOCATED_PMDS; i++) { pmd_t *pmd = (pmd_t *)__get_free_page(PGALLOC_GFP); - if (pmd == NULL) + if (!pmd) + failed = true; + if (pmd && !pgtable_pmd_page_ctor(virt_to_page(pmd))) { + free_page((unsigned long)pmds[i]); + pmd = NULL; failed = true; + } pmds[i] = pmd; } -- 2.30.2