From: Alexey Dobriyan Date: Tue, 14 May 2019 22:43:54 +0000 (-0700) Subject: elf: init pt_regs pointer later X-Git-Url: http://git.cdn.openwrt.org/?a=commitdiff_plain;h=249b08e4e504d4c54eda3453c9c97edbafa51401;p=openwrt%2Fstaging%2Fblogic.git elf: init pt_regs pointer later Get "current_pt_regs" pointer right before usage. Space savings on x86_64: add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-180 (-180) Function old new delta load_elf_binary 5806 5626 -180 !!! Looks like the compiler doesn't know that "current_pt_regs" is stable pointer (because it doesn't know ->stack isn't) even though it knows that "current" is stable pointer. So it saves it in the very beginning and then tries to carry it through a lot of code. Here is what happens here: load_elf_binary() ... mov rax,QWORD PTR gs:0x14c00 mov r13,QWORD PTR [rax+0x18] r13 = current->stack call kmem_cache_alloc # first kmalloc [980 bytes later!] # let's spill that sucker because we need a register # for "load_bias" calculations at # # if (interpreter) { # load_bias = ELF_ET_DYN_BASE; # if (current->flags & PF_RANDOMIZE) # load_bias += arch_mmap_rnd(); # elf_flags |= elf_fixed; # } mov QWORD PTR [rsp+0x68],r13 If this is not _the_ root cause it is still eeeeh. After the patch things become much simpler: mov rax, QWORD PTR gs:0x14c00 # current mov rdx, QWORD PTR [rax+0x18] # current->stack movq [rdx+0x3fb8], 0 # fill pt_regs ... call finalize_exec Link: http://lkml.kernel.org/r/20190419200343.GA19788@avx2 Signed-off-by: Alexey Dobriyan Tested-by: Andrew Morton Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c index 31f264bd5126..1a66b6215c80 100644 --- a/fs/binfmt_elf.c +++ b/fs/binfmt_elf.c @@ -704,12 +704,12 @@ static int load_elf_binary(struct linux_binprm *bprm) unsigned long start_code, end_code, start_data, end_data; unsigned long reloc_func_desc __maybe_unused = 0; int executable_stack = EXSTACK_DEFAULT; - struct pt_regs *regs = current_pt_regs(); struct { struct elfhdr elf_ex; struct elfhdr interp_elf_ex; } *loc; struct arch_elf_state arch_state = INIT_ARCH_ELF_STATE; + struct pt_regs *regs; loc = kmalloc(sizeof(*loc), GFP_KERNEL); if (!loc) { @@ -1150,6 +1150,7 @@ out_free_interp: MAP_FIXED | MAP_PRIVATE, 0); } + regs = current_pt_regs(); #ifdef ELF_PLAT_INIT /* * The ABI may specify that certain registers be set up in special