mm: treat indirectly reclaimable memory as free in overcommit logic
author Roman Gushchin <guro@fb.com>
Tue, 10 Apr 2018 23:27:47 +0000 (16:27 -0700)
committer Linus Torvalds <torvalds@linux-foundation.org>
Wed, 11 Apr 2018 17:28:29 +0000 (10:28 -0700)
Indirectly reclaimable memory can make up a significant part of total
memory, and it is genuinely reclaimable: it is released under memory
pressure.

So, the overcommit logic should treat it as free.

Otherwise, it's possible to cause random system-wide memory allocation
failures by filling a significant amount of memory with indirectly
reclaimable objects, e.g. dentry external names.

With the GUESS overcommit policy (OVERCOMMIT_GUESS), this might be
exploited for a denial-of-service attack under some conditions.

The following program illustrates the approach.  It causes the kernel to
allocate an unreclaimable kmalloc-256 chunk for each stat() call, so
that at some point the overcommit logic may start blocking large
allocations system-wide.

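  #include <stdio.h>
  #include <sys/stat.h>

  int main(void)
  {
      /* 256 path bytes plus one for the terminating NUL */
      char buf[257];
      unsigned long i;
      struct stat statbuf;

      buf[0] = '/';
      for (i = 1; i < sizeof(buf); i++)
          buf[i] = '_';

      /* each iteration looks up a different long path name */
      for (i = 0; 1; i++) {
          /* bounded write: 8 characters at buf[248..255], NUL at buf[256] */
          snprintf(&buf[248], sizeof(buf) - 248, "%8lu", i);
          stat(buf, &statbuf);
      }

      return 0;
  }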

This patch in combination with related indirectly reclaimable memory
patches closes this issue.

Link: http://lkml.kernel.org/r/20180313130041.8078-1-guro@fb.com
Signed-off-by: Roman Gushchin <guro@fb.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mm/util.c

index 029fc2f3b395054a08595dca3ec38bae63877261..73676f0f1b43b49300c64be5754453c2da1ddd04 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -667,6 +667,13 @@ int __vm_enough_memory(struct mm_struct *mm, long pages, int cap_sys_admin)
                 */
                free += global_node_page_state(NR_SLAB_RECLAIMABLE);
 
+               /*
+                * Part of the kernel memory, which can be released
+                * under memory pressure.
+                */
+               free += global_node_page_state(
+                       NR_INDIRECTLY_RECLAIMABLE_BYTES) >> PAGE_SHIFT;
+
                /*
                 * Leave reserved pages. The pages are not for anonymous pages.
                 */