---
title: 🚰 Hunting Leaks
date: 2019-06-16
---

There's currently a nasty memory leak in [haskell-ide-engine](https://github.com/haskell/haskell-ide-engine) which leaks all the cached information about a module from GHC. It seems to only occur on some platforms, but on platforms unfortunate enough to be affected by it, a sizeable portion of memory is leaked every time the user types. For a small module of about 60 lines, this was 30-ish MB.

During this year's ZuriHac, Matthew Pickering, Daniel Gröber and I tried to get this sorted out once and for all. It ended up taking the entire weekend!

# Profiling

The first step in the investigation was figuring out what exactly was leaking. Prior to ZuriHac, I had heap profiled HIE with a sample session. To do this I had to build the executable with profiling enabled. This adds some extra information regarding closures and "call sites", and causes your executable to be linked with one of the versions of the RTS that was built with profiling.

```bash
ghc -prof Main.hs
# or in our case, for cabal projects
cabal v2-install :hie --enable-profiling
```

GHC comes with a bunch of different RTS libraries, each built with a different combination of features.

```bash
ls `ghc --print-libdir`/rts | grep rts
libHSrts-ghc8.6.5.dylib
libHSrts.a
libHSrts_debug-ghc8.6.5.dylib
libHSrts_debug.a
libHSrts_l-ghc8.6.5.dylib
libHSrts_l.a
libHSrts_p.a
libHSrts_thr-ghc8.6.5.dylib
libHSrts_thr.a
libHSrts_thr_debug-ghc8.6.5.dylib
libHSrts_thr_debug.a
libHSrts_thr_l-ghc8.6.5.dylib
libHSrts_thr_l.a
libHSrts_thr_p.a
```

`thr` stands for threading, `debug` for debug, `l` for event logging and `p` for profiling. If you pass GHC `-v`, you should see `-lHSrts_thr_p` or equivalent when linking: we're bringing in the RTS built with threading and profiling enabled.

Now we can profile our executable by running it with specific flags passed to the RTS. To see what's available, we ran `hie +RTS --help`:

```
...
hie:   -h<break-down> Heap residency profile (hp2ps) (output file <program>.hp)
hie:      break-down: c = cost centre stack (default)
hie:                  m = module
hie:                  d = closure description
hie:                  y = type description
hie:                  r = retainer
hie:                  b = biography (LAG,DRAG,VOID,USE)
...
```

# Weak pointers

Now that we know what is leaking, the question turns to where it is leaking. The next steps are taken from [Simon Marlow's excellent blog post](http://simonmar.github.io/posts/2018-06-20-Finding-fixing-space-leaks.html), in which he details how he tracked down and fixed multiple memory leaks in GHCi.

The trick is to create a weak pointer, that is, a pointer that the garbage collector ignores when looking for retainers. Point it to a value of the offending type and store it alongside the original, strongly referenced one. Whenever you know the object should have been deallocated, force garbage collection by calling `performGC` and dereference the weak pointer. If the pointer is null (in Haskell, `deRefWeak` returns `Nothing`), the object was released. Otherwise something is still holding onto it: the object is being leaked!

In

```
(lldb) e findPtr((P_)myobjptr & ~0b111, 1)
0x420bbdec50 = hie-plugin-api-0.10.0.0-inplace:Haskell.Ide.Engine.GhcModuleCache.UriCache(0x420bbc04f9, 0x420fdff691, 0x420fdfd39a, 0x10cca094a, 0x420bbc0539)
0x420fdfd150 = WEAK(key=0x420fdfd39a value=0x420fdfd39a finalizer=0x10c56ba41)
0x420fdfd150 = WEAK(key=0x420fdfd39a value=0x420fdfd39a finalizer=0x10c56ba41)
```

```
-debug -g -fwhole-archive-hs-libs
```
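
To make the weak pointer check from this section concrete, here is a minimal sketch using `mkWeakPtr`, `deRefWeak` and `performGC` from `base`. The `CachedModule` type and the `isRetained` helper are hypothetical names invented for illustration; they are not HIE's actual types or code, and an `IORef` stands in for the real cache. The GHC documentation warns that weak pointers to ordinary values are fragile under optimisation, so treat a surprising result as a prompt to investigate rather than as proof.

```haskell
import Control.Exception (evaluate)
import Data.IORef (newIORef, writeIORef)
import Data.Maybe (isJust)
import System.Environment (getProgName)
import System.Mem (performGC)
import System.Mem.Weak (Weak, deRefWeak, mkWeakPtr)

-- Hypothetical stand-in for the per-module data that gets cached.
data CachedModule = CachedModule String [Int]

-- Force a major collection, then check whether the weak pointer's target
-- survived. True means something still holds a strong reference to it.
isRetained :: Weak a -> IO Bool
isRetained weak = do
  performGC
  isJust <$> deRefWeak weak

main :: IO ()
main = do
  -- Build the value from runtime data so it is an ordinary heap object.
  name <- getProgName
  cacheRef <- newIORef Nothing
  weak <- do
    cached <- evaluate (CachedModule name [1 .. length name])
    writeIORef cacheRef (Just cached)   -- the strong reference (the "cache")
    mkWeakPtr cached Nothing            -- the weak reference, no finalizer
  whileCached <- isRetained weak
  putStrLn ("while cached, retained: " ++ show whileCached)
  -- Evict the entry. If the value is still retained after this point,
  -- something other than the cache is holding onto it.
  writeIORef cacheRef Nothing
  afterEviction <- isRetained weak
  putStrLn ("after eviction, retained: " ++ show afterEviction)
```

In a real investigation the second check would go wherever the cache entry is supposed to have been dropped, and a `True` there is the cue to go looking for the retainer.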