While using an open-source Cython library I found a memory leak. The leak seems to come from a typed NumPy array that is not freed when it goes out of scope. It is declared as follows:
cdef np.ndarray[object, ndim=1] my_array = np.empty(my_size, dtype=object)
As I understand it, the garbage collector should treat this like any other NumPy array and free its memory as soon as the array goes out of scope, which here is the end of the function in which it is declared. Apparently this does not happen.
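For reference, here is a minimal pure-Python sketch of how one might test that expectation. It is not taken from the library; `Tracked` and `container_releases_elements` are hypothetical names. It stores an object in a container, lets the container go out of scope, and uses a weak reference to see whether the stored object was released.

```python
import gc
import weakref


class Tracked:
    """Hypothetical plain object; instances support weak references."""


def container_releases_elements(make_container, size=4):
    """Return True if an object stored in the container is freed
    once the container itself goes out of scope."""

    def scope():
        container = make_container(size)
        item = Tracked()
        container[0] = item
        # Only a weak reference leaves this function, so the container
        # and the item lose their last strong references on return.
        return weakref.ref(item)

    ref = scope()
    gc.collect()  # also clears reference cycles, if any
    return ref() is None


# A plain list stands in for the array here.
print(container_releases_elements(lambda n: [None] * n))
```

Assuming NumPy is installed, the same check can be run on an object array by passing `lambda n: np.empty(n, dtype=object)` as the factory. A result of `False` would point at the container holding on to its elements.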
If the array had been created as a Cython array first and then converted to a NumPy array, one could set the callback_free_data function as described here and here. In this case, however, I cannot reach the pointers held by my_array, so the callback cannot be set.
Any idea why this kind of declaration could cause a memory leak, and/or how to force the deallocation?
Update:
My question was very generic, and I wanted to avoid posting the code because it is a bit intricate, but since someone asked, here it is:
cdef dijkstra(Graph G, int start_idx, int end_idx):
    # Some code
    cdef np.ndarray[object, ndim=1] fiboheap_nodes = np.empty([G.num_nodes], dtype=object)  # holds all of our FiboHeap Nodes Pointers
    Q = FiboHeap()
    fiboheap_nodes[start_idx] = Q.insert(0, start_idx)
    # Some other code where it could perform operations like:
    # Q.decrease_key(fiboheap_nodes[w], vw_distance)
    # End of operations
    # do we need to cleanup the fiboheap_nodes array here?
    return
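To confirm that repeated calls to a function like this really retain memory, one option is to compare traced allocations before and after a batch of calls. The sketch below uses only the standard library; `traced_growth`, `leaky` and `clean` are hypothetical names used for illustration. Note that `tracemalloc` only sees allocations made through Python's own allocators, so memory obtained directly with `malloc` inside a C library would not show up here.

```python
import gc
import tracemalloc


def traced_growth(func, repeats=50):
    """Return the net number of traced bytes retained after calling
    func `repeats` times (after one warm-up call)."""
    gc.collect()
    tracemalloc.start()
    try:
        func()  # warm-up, so one-time caches are not counted
        gc.collect()
        before, _peak = tracemalloc.get_traced_memory()
        for _ in range(repeats):
            func()
        gc.collect()
        after, _peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return after - before


_kept = []


def leaky():
    # Keeps every buffer alive by storing it in a module-level list.
    _kept.append(bytearray(10_000))


def clean():
    # The buffer is released as soon as the call returns.
    bytearray(10_000)


print("leaky:", traced_growth(leaky))
print("clean:", traced_growth(clean))
```

Growth that scales with the number of calls, as in `leaky`, suggests that something created inside the function is still referenced after it returns.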
FiboHeap is a Cython wrapper around the C implementation. For example, the insert function looks like this:
cimport cfiboheap
from cpython.pycapsule cimport PyCapsule_New, PyCapsule_GetPointer
from python_ref cimport Py_INCREF, Py_DECREF

cdef inline object convert_fibheap_el_to_pycapsule(cfiboheap.fibheap_el* element):
    return PyCapsule_New(element, NULL, NULL)

cdef class FiboHeap:

    def __cinit__(FiboHeap self):
        self.treeptr = cfiboheap.fh_makekeyheap()
        if self.treeptr is NULL:
            raise MemoryError()

    def __dealloc__(FiboHeap self):
        if self.treeptr is not NULL:
            cfiboheap.fh_deleteheap(self.treeptr)

    cpdef object insert(FiboHeap self, double key, object data=None):
        Py_INCREF(data)
        cdef cfiboheap.fibheap_el* retValue = cfiboheap.fh_insertkey(self.treeptr, key, <void*>data)
        if retValue is NULL:
            raise MemoryError()
        return convert_fibheap_el_to_pycapsule(retValue)
The __dealloc__() method runs as expected, so the FiboHeap itself is released at the end of dijkstra(...). My guess is that something is going wrong with the pointers stored in fiboheap_nodes.
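One thing that may be worth ruling out is reference counting on the objects passed to insert: the code quoted above calls Py_INCREF(data), and the __dealloc__ shown contains no matching Py_DECREF. Whether the library balances this somewhere else is not visible in the excerpt. The following sketch (CPython only, and not the library's code; `Payload` and `survives_after_scope` are hypothetical names) uses ctypes to mimic what an unbalanced Py_INCREF does to an object's lifetime.

```python
import ctypes
import gc
import weakref

# CPython C-API functions reached through ctypes (CPython only).
_incref = ctypes.pythonapi.Py_IncRef
_incref.argtypes = [ctypes.py_object]
_incref.restype = None
_decref = ctypes.pythonapi.Py_DecRef
_decref.argtypes = [ctypes.py_object]
_decref.restype = None


class Payload:
    """Hypothetical stand-in for the `data` object handed to insert()."""


def survives_after_scope(balance):
    """Return True if a Payload is still alive after its last Python
    name has been deleted."""
    payload = Payload()
    ref = weakref.ref(payload)
    _incref(payload)      # plays the role of Py_INCREF(data) in insert()
    if balance:
        _decref(payload)  # the matching Py_DECREF a cleanup step would issue
    del payload
    gc.collect()
    return ref() is not None


print(survives_after_scope(balance=False))  # True: the object is never freed
print(survives_after_scope(balance=True))   # False: the object is freed
```

If this is what happens in the library, the array would only look like the culprit: the retained objects would be the ones whose reference count was raised in insert and never lowered again.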
Any guess?