Regent: Memory leak in stencil #798

Closed
elliottslaughter opened this issue Mar 31, 2020 · 10 comments

@elliottslaughter
Contributor

This is part of the ongoing saga to diagnose #711. This time I'm running the Regent implementation of stencil. We already saw in #790 that the C++ circuit code runs with constant memory usage, so there's a good chance that whatever is left is in Regent at this point.

The leak is faster this time, on the order of 0.7 MB/s.

[Figure: Memory (MB) vs. Timestep]

Command line: ./regent.py examples/stencil_fast.rg -ll:cpu 4 -tsteps 10000

Here's the diff I'm using to instrument the code:

diff --git a/language/examples/stencil_fast.rg b/language/examples/stencil_fast.rg
index 273b086..9c16cd3 100644
--- a/language/examples/stencil_fast.rg
+++ b/language/examples/stencil_fast.rg
@@ -549,8 +549,12 @@ task main()
     --   fill_(private[i], init)
     -- end
 
-    __demand(__trace)
+    -- __demand(__trace)
     for t = 0, tsteps do
+      if t % 100 == 0 then
+        __fence(__execution, __block)
+        cmapper.print_rusage(t);
+      end
       -- __demand(__index_launch)
       for i = 0, nt2 do
         stencil(private[i], interior[i], pxm_in[i], pxp_in[i], pym_in[i], pyp_in[i], t == tprune)
diff --git a/language/examples/stencil_mapper.cc b/language/examples/stencil_mapper.cc
index 1c678d5..1103b09 100644
--- a/language/examples/stencil_mapper.cc
+++ b/language/examples/stencil_mapper.cc
@@ -17,6 +17,15 @@
 
 #include "mappers/default_mapper.h"
 
+#include <sys/time.h>
+#include <sys/resource.h>
+void print_rusage(long long timestep)
+{
+  struct rusage usage;
+  if (getrusage(RUSAGE_SELF, &usage) != 0) return;
+  printf("memory usage at timestep %6lld: %ld MB\n", timestep, usage.ru_maxrss / 1024);
+}
+
 #define SPMD_SHARD_USE_IO_PROC 1
 
 using namespace Legion;
diff --git a/language/examples/stencil_mapper.h b/language/examples/stencil_mapper.h
index ce4ee66..029366e 100644
--- a/language/examples/stencil_mapper.h
+++ b/language/examples/stencil_mapper.h
@@ -22,6 +22,8 @@ extern "C" {
 
 void register_mappers();
 
+void print_rusage(long long timestep);
+
 #ifdef __cplusplus
 }
 #endif
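
A note on reading the numbers this instrumentation prints: `ru_maxrss` is the peak resident set size rather than the current one, and its unit depends on the platform (kilobytes on Linux, bytes on macOS). The sketch below is a standalone version of the same measurement with the unit handled explicitly. The names `peak_rss_mb` and `print_peak_rss` are hypothetical and not part of the diff above.

```cpp
#include <cassert>
#include <cstdio>
#include <sys/resource.h>
#include <sys/time.h>

// Hypothetical helper (not in the diff above): returns the peak resident
// set size of the current process in MB, or -1 if getrusage fails.
long peak_rss_mb()
{
  struct rusage usage;
  if (getrusage(RUSAGE_SELF, &usage) != 0) return -1;
#if defined(__APPLE__)
  // macOS reports ru_maxrss in bytes.
  return usage.ru_maxrss / (1024L * 1024L);
#else
  // Linux reports ru_maxrss in kilobytes.
  return usage.ru_maxrss / 1024L;
#endif
}

// Prints in the same format as print_rusage in the diff.
void print_peak_rss(long long timestep)
{
  assert(timestep >= 0);
  const long mb = peak_rss_mb();
  if (mb < 0) return;
  std::printf("memory usage at timestep %6lld: %ld MB\n", timestep, mb);
}
```

Because the value is a high-water mark, it never decreases during a run. A flat line indicates constant memory usage, and steady growth across timesteps indicates that usage keeps rising.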
@elliottslaughter self-assigned this Mar 31, 2020
@elliottslaughter added the bug, planned (Feature/fix to be actively worked on - needs release target), and Regent (Issues pertaining to Regent) labels Mar 31, 2020
@elliottslaughter added this to the 20.06 milestone Mar 31, 2020
@lightsighter
Contributor

It could also be in a custom mapper. If you run valgrind on the code with differing numbers of iterations, are the leaks reported at the end all the same?

@elliottslaughter
Contributor Author

With -tsteps 200:

==29985== Warning: set address range perms: large range [0xc507028, 0x2c507157) (noaccess)
==29985== 
==29985== HEAP SUMMARY:
==29985==     in use at exit: 1,731,572 bytes in 18,514 blocks
==29985==   total heap usage: 1,756,864 allocs, 1,738,350 frees, 885,751,771 bytes allocated
==29985== 
==29985== LEAK SUMMARY:
==29985==    definitely lost: 418,888 bytes in 10,575 blocks
==29985==    indirectly lost: 1,239,980 bytes in 7,938 blocks
==29985==      possibly lost: 0 bytes in 0 blocks
==29985==    still reachable: 72,704 bytes in 1 blocks
==29985==         suppressed: 0 bytes in 0 blocks
==29985== Rerun with --leak-check=full to see details of leaked memory
==29985== 
==29985== For counts of detected and suppressed errors, rerun with: -v
==29985== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

With -tsteps 300:

==30119== Warning: set address range perms: large range [0xc507028, 0x2c507157) (noaccess)
==30119== 
==30119== HEAP SUMMARY:
==30119==     in use at exit: 1,846,804 bytes in 23,314 blocks
==30119==   total heap usage: 2,553,340 allocs, 2,530,026 frees, 982,048,115 bytes allocated
==30119== 
==30119== LEAK SUMMARY:
==30119==    definitely lost: 534,088 bytes in 15,375 blocks
==30119==    indirectly lost: 1,240,012 bytes in 7,938 blocks
==30119==      possibly lost: 0 bytes in 0 blocks
==30119==    still reachable: 72,704 bytes in 1 blocks
==30119==         suppressed: 0 bytes in 0 blocks
==30119== Rerun with --leak-check=full to see details of leaked memory
==30119== 
==30119== For counts of detected and suppressed errors, rerun with: -v
==30119== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

With -tsteps 400:

==30138== Warning: set address range perms: large range [0xc507028, 0x2c507157) (noaccess)
==30138== 
==30138== HEAP SUMMARY:
==30138==     in use at exit: 1,962,308 bytes in 28,115 blocks
==30138==   total heap usage: 3,347,584 allocs, 3,319,469 frees, 1,077,597,843 bytes allocated
==30138== 
==30138== LEAK SUMMARY:
==30138==    definitely lost: 649,288 bytes in 20,175 blocks
==30138==    indirectly lost: 1,240,012 bytes in 7,938 blocks
==30138==      possibly lost: 304 bytes in 1 blocks
==30138==    still reachable: 72,704 bytes in 1 blocks
==30138==         suppressed: 0 bytes in 0 blocks
==30138== Rerun with --leak-check=full to see details of leaked memory
==30138== 
==30138== For counts of detected and suppressed errors, rerun with: -v
==30138== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
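
Reading the three summaries side by side: "definitely lost" grows by 115,200 bytes and 4,800 blocks for every additional 100 timesteps, while "indirectly lost" stays essentially flat (1,239,980, then 1,240,012 twice). The sketch below works through that arithmetic. The struct and function names are illustrative only, and the numbers are copied from the runs above.

```cpp
#include <cassert>
#include <cstdio>

// One valgrind run: number of timesteps, and the "definitely lost"
// bytes and blocks reported in its LEAK SUMMARY.
struct Run {
  long tsteps;
  long lost_bytes;
  long lost_blocks;
};

// Growth per timestep between two runs.
struct Rate {
  double bytes_per_step;
  double blocks_per_step;
};

Rate leak_rate(const Run &a, const Run &b)
{
  assert(b.tsteps > a.tsteps);
  const double steps = static_cast<double>(b.tsteps - a.tsteps);
  return Rate{(b.lost_bytes - a.lost_bytes) / steps,
              (b.lost_blocks - a.lost_blocks) / steps};
}

// Values copied from the three runs above.
const Run kRuns[] = {
  {200, 418888, 10575},
  {300, 534088, 15375},
  {400, 649288, 20175},
};

// Prints the per-timestep growth between consecutive runs.
void print_leak_rates()
{
  for (int i = 1; i < 3; i++) {
    const Rate r = leak_rate(kRuns[i - 1], kRuns[i]);
    std::printf("tsteps %ld -> %ld: %.0f bytes/step, %.0f blocks/step\n",
                kRuns[i - 1].tsteps, kRuns[i].tsteps,
                r.bytes_per_step, r.blocks_per_step);
  }
}
```

Both intervals give the same result: 1,152 bytes in 48 blocks per timestep, or 24 bytes per block. A rate that is identical across intervals is what a per-iteration leak looks like, as opposed to a one-time allocation that is never freed.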

@elliottslaughter
Contributor Author

For what it's worth, I get the same rate of leakage with the default mapper. Circuit, which we tested in #790, also uses a mapper derived from the default mapper. (That doesn't mean there can't be issues, but at least the basic stuff should be working.)

@lightsighter
Contributor

Right, so this is different than what we were seeing with circuit. These are actual leaks, memory being lost. In circuit the definitely lost line was invariant to the number of iterations.

@elliottslaughter
Contributor Author

Here's the result of running with TRACE_ALLOCATION and -tsteps 10000, in case it helps:

alloc.txt

@elliottslaughter
Contributor Author

Here's the result of running valgrind with --leak-check=full:

leak_check.txt

@elliottslaughter
Contributor Author

LEAK SUMMARY
  Leaked Futures: 0
  Leaked Future Maps: 0
  LEAKED CONSTRAINTS: 5
  Leaked Managers: 0
  Pinned Managers: 0
  Leaked Views: 0
  Leaked Equivalence Sets: 0
  LEAKED INDEX SPACES: 193
  LEAKED INDEX PARTITIONS: 11
  LEAKED FIELD SPACES: 5
  Leaked Regions: 0
  Leaked Partitions: 0

Probably the smoking gun.
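
In the summary shown here the nonzero counters are the ones printed in upper case. For comparing summaries between runs, a small sketch that extracts only the nonzero entries might look like the following. `nonzero_leaks` is a hypothetical helper, not part of Legion, and it assumes the "Name: count" line format seen above.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Parses lines of the form "  Name: count" and returns only the entries
// whose count is nonzero. Lines without a ':' (such as the "LEAK SUMMARY"
// header) are skipped, as are lines whose count cannot be parsed.
std::map<std::string, long> nonzero_leaks(const std::string &summary)
{
  std::map<std::string, long> result;
  std::istringstream in(summary);
  std::string line;
  while (std::getline(in, line)) {
    const std::size_t colon = line.find(':');
    if (colon == std::string::npos) continue;
    // The ':' itself is not whitespace, so start is always <= colon.
    const std::size_t start = line.find_first_not_of(" \t");
    assert(start <= colon);
    const std::string name = line.substr(start, colon - start);
    long count = 0;
    std::istringstream(line.substr(colon + 1)) >> count;  // stays 0 on failure
    if (count != 0) result[name] = count;
  }
  return result;
}
```

Applied to the summary above, this would return four entries: constraints, index spaces, index partitions, and field spaces.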

@elliottslaughter
Contributor Author

Fixed some things, now getting:

LEAK SUMMARY
  Leaked Futures: 0
  Leaked Future Maps: 0
  Leaked Constraints: 0
  Leaked Managers: 0
  Pinned Managers: 0
  Leaked Views: 0
  Leaked Equivalence Sets: 0
  LEAKED INDEX SPACES: 11
  Leaked Index Partitions: 0
  Leaked Field Spaces: 0
  Leaked Regions: 0
  Leaked Partitions: 0

And a new run with valgrind:

leak_check.txt

@elliottslaughter
Contributor Author

Fixed an application bug. New valgrind run:

leak_check.txt

@elliottslaughter
Contributor Author

The stencil issue is application-specific and is fixed in d007c1c. This may or may not generalize to other applications depending on what they do. But stencil at least is running in constant memory now.

I have some additional pending fixes for leaks that do not affect memory usage over time but are still reported at exit. I will push them once they're more fully tested.
