Thursday, July 31, 2008

Crash course in Debugging

Debugging sucks. After implanting the new memory manager into the interpreter, nothing worked at first. And after more than a week of debugging, it is still not working. But yesterday I had a good run and slew over half a dozen bugs. Feels good.

This is my crash course in GDB. I have used GDB occasionally before, but I always started from zero because I use it so infrequently. This time around there are so many bugs to hunt down that I really need a lot of GDB's capabilities, and I need to use them for a while. That's how you learn a tool really well.

Tuesday, July 8, 2008

Easier tracking of transient objects

My API for tracking transient objects in MM is a little different from Nickle's. Consider the three following functions.
char *a (int n)
{
   assert(n>=0);
   char *s = mm_allocv(mt_blob, n+1);
   s[n] = '\0';
   return s;
}

char *b (date_t d)
{
   char *s = a(127);
   sprintf(s, "Hello world on %s.\n", date2str(d));
   return s;
}

char *c (date_t d)
{
   MM_ANCHORED;
   char *s = a(127);
   sprintf(s, "Hello world on %s.\n", date2str(d));
   MM_RETURN(s);
}

Functions b and c are almost the same, except that in c the body is what I call an "anchored scope". Neither b nor c calls an allocation function directly, but a does.

Now, will mm_allocv in a push the address of the allocated memory onto the transient object stack or not? It depends. If you are not in an anchored scope and call a or b, then it won't. If you call c, it will. In other words, the allocation functions have context-dependent behavior: they push new addresses when executed in an anchored scope and don't push otherwise. Along the same lines, when exiting an anchored scope with MM_RETURN, the returned object will be pushed only if the return is into an(other) anchored scope.
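To make the rule concrete, here is a minimal sketch of the context-dependent behavior. It is not the MM implementation: every sketch_ name is made up for illustration, malloc stands in for the managed heap, and I am assuming that leaving an anchored scope pops whatever was pushed inside it before the result is (maybe) pushed for the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the transient object stack; not the real MM code. */
#define SKETCH_STACK_MAX 64

static void  *sketch_stack[SKETCH_STACK_MAX]; /* transient object stack        */
static size_t sketch_top = 0;                 /* number of entries on it       */
static int    sketch_anchored = 0;            /* nesting depth of anchored scopes */

static void sketch_push(void *p)
{
    assert(sketch_top < SKETCH_STACK_MAX);
    sketch_stack[sketch_top++] = p;
}

/* Allocation function: pushes the new address only inside an anchored scope. */
static void *sketch_alloc(size_t n)
{
    void *p = malloc(n);
    if (p != NULL && sketch_anchored > 0)
        sketch_push(p);
    return p;
}

/* Roughly what MM_ANCHORED might do: enter a scope, remember the stack height. */
static size_t sketch_enter(void)
{
    sketch_anchored++;
    return sketch_top;
}

/* Roughly what MM_RETURN might do: pop this scope's transients (assumption),
 * then push the result only if the caller is itself in an anchored scope. */
static void *sketch_leave(size_t mark, void *result)
{
    sketch_top = mark;
    sketch_anchored--;
    if (result != NULL && sketch_anchored > 0)
        sketch_push(result);
    return result;
}

/* Like a and b above: no anchored scope of its own. */
static void *sketch_plain(void)
{
    return sketch_alloc(16);
}

/* Like c above: the body is an anchored scope.
 * There is no collector in this sketch, so tmp is simply leaked. */
static void *sketch_anchored_fn(void)
{
    size_t mark = sketch_enter();
    void *tmp = sketch_alloc(16);  /* pushed: we are anchored */
    void *res = sketch_alloc(16);  /* pushed: we are anchored */
    (void)tmp;
    return sketch_leave(mark, res);
}
```

Calling sketch_plain or sketch_anchored_fn from unanchored code leaves the stack empty afterwards. Calling sketch_anchored_fn from inside another anchored scope leaves exactly one new entry on the stack, the returned object.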

I believe this behavior is more useful than that of the Nickle system. There are lots of little functions in Lush that create a new object. They typically do one or a small number of allocations and then initialize the object. I do not need to fortify all these functions with MM_ANCHORED; it is good enough to do it to the functions that call such constructor functions. In essence, I believe I will need to use MM_ANCHORED and company less often and will find it easier to write correct code when using it.

Wednesday, July 2, 2008

Destructors and Finalizers

As Hans Boehm argues in a recent paper, finalizers should run asynchronously to the client code. I will support this in my memory manager by queuing up finalization-ready objects and leaving it to the client to actually run the finalizers through the API mm_run_pending_finalizer.
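A rough sketch of the queuing idea follows. The signature of mm_run_pending_finalizer is not spelled out here, so the code uses made-up sketch_ names throughout, a fixed-size ring buffer, and first-in-first-out order. All of these are assumptions for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical finalizer type and queue; not the real MM API. */
typedef void (*sketch_finalizer_t)(void *obj);

struct sketch_pending {
    void              *obj;  /* finalization-ready object */
    sketch_finalizer_t fin;  /* its finalizer             */
};

#define SKETCH_QUEUE_MAX 32
static struct sketch_pending sketch_queue[SKETCH_QUEUE_MAX];
static size_t sketch_head  = 0;  /* index of the oldest pending entry */
static size_t sketch_count = 0;  /* number of pending entries         */

/* Called by the (imaginary) collector when it finds obj unreachable.
 * The finalizer is NOT run here, only queued.
 * Returns 0 on success, -1 if the queue is full. */
static int sketch_enqueue_finalizable(void *obj, sketch_finalizer_t fin)
{
    if (sketch_count == SKETCH_QUEUE_MAX)
        return -1;
    size_t tail = (sketch_head + sketch_count) % SKETCH_QUEUE_MAX;
    sketch_queue[tail].obj = obj;
    sketch_queue[tail].fin = fin;
    sketch_count++;
    return 0;
}

/* Called by the client at a point of its choosing.
 * Runs the oldest pending finalizer; returns 1 if one ran, 0 if none was pending. */
static int sketch_run_pending_finalizer(void)
{
    if (sketch_count == 0)
        return 0;
    /* Dequeue first, so a finalizer may safely enqueue further objects. */
    struct sketch_pending p = sketch_queue[sketch_head];
    sketch_head = (sketch_head + 1) % SKETCH_QUEUE_MAX;
    sketch_count--;
    p.fin(p.obj);
    return 1;
}

/* Example finalizer: obj points to an int that counts invocations. */
static void sketch_count_finalizer(void *obj)
{
    ++*(int *)obj;
}
```

The point of the split is that the collector never runs client code itself. The client drains the queue, for instance with a loop like `while (sketch_run_pending_finalizer()) ;`, whenever it is in a state where finalizers are safe to run.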

Lush "OO objects" may have a destructor method. For stack-allocated objects, the destructor method should be invoked when the object goes out of scope. For heap-allocated objects, the destructor method should be invoked after the object becomes unreachable. In other words, "OO objects" need a finalizer, and one of its jobs is to call the object's destructor method. This makes clear that, when there are several finalization-ready objects, the order of finalizer invocation is important.

Current Lush has a facility for registering "finalizers" with individual objects. These are used to notify other objects that hold a weak reference to an object. For instance, a hash table may hold weak references to key objects, and events in the event queue hold a weak reference to an event handler object. The "finalizers" registered by the hash table and by the event queue do not look at the object, but only need to know an object's identity to update the hash table, or event queue, respectively. I would call them "notifiers" rather than "finalizers".

To support this mechanism for maintaining weak references I am extending the memory manager API with a new function, mm_notify. This function just marks a managed object. Once the object has become unreachable and is about to be reclaimed by the memory manager, a notification function is called with the object's address as argument. The notification function is not specific to an individual object or an object's type, and it needs to be given to mm_init when the system is initialized.
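Here is a small sketch of how such a notifier could keep a weak table up to date. The real signatures of mm_init and mm_notify are not given above, so the sketch_ functions below are hypothetical stand-ins: marks live in a small side table, malloc and free stand in for the managed heap, and sketch_reclaim plays the role of the collector.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical notifier type; one function for the whole system. */
typedef void (*sketch_notify_func_t)(void *obj);

#define SKETCH_MAX_MARKED 16
static void  *sketch_marked[SKETCH_MAX_MARKED]; /* addresses marked for notification */
static size_t sketch_nmarked = 0;
static sketch_notify_func_t sketch_client_notify = NULL;

/* Stand-in for mm_init: the notification function is registered once. */
static void sketch_init(sketch_notify_func_t f)
{
    sketch_client_notify = f;
}

/* Stand-in for mm_notify: just mark the object.
 * Returns 0 on success, -1 if the side table is full. */
static int sketch_notify(void *obj)
{
    if (sketch_nmarked == SKETCH_MAX_MARKED)
        return -1;
    sketch_marked[sketch_nmarked++] = obj;
    return 0;
}

/* Stand-in for the collector reclaiming an unreachable object:
 * if obj was marked, call the notifier with its address first, then free it. */
static void sketch_reclaim(void *obj)
{
    for (size_t i = 0; i < sketch_nmarked; i++) {
        if (sketch_marked[i] == obj) {
            sketch_marked[i] = sketch_marked[--sketch_nmarked];
            if (sketch_client_notify != NULL)
                sketch_client_notify(obj);
            break;
        }
    }
    free(obj);
}

/* Client side: a tiny table holding weak references to key objects. */
#define SKETCH_WEAK_SLOTS 4
static void *sketch_weak_keys[SKETCH_WEAK_SLOTS];
static int   sketch_notifications = 0;

/* The notifier only needs the object's identity; it never looks inside it. */
static void sketch_weak_table_notifier(void *obj)
{
    sketch_notifications++;
    for (size_t i = 0; i < SKETCH_WEAK_SLOTS; i++)
        if (sketch_weak_keys[i] == obj)
            sketch_weak_keys[i] = NULL;
}
```

A client would call sketch_init(sketch_weak_table_notifier) once at startup and sketch_notify(key) whenever it stores a weak reference to key. With several weak containers in play (hash tables, the event queue), the single notifier would have to dispatch to each of them, which is the price of not registering a function per object.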