Saturday, June 16, 2007

Mac OS 10.x Cocoa programming - Article 1

Memory Management with Cocoa objects

By: Elango C

This article is a primer on managing the allocation and de-allocation of objects (and therefore the memory they use) in the context of applications built using Apple's Foundation framework, and other frameworks that rely upon it, such as EOF & WebObjects. It describes how to use Foundation's memory management infrastructure, including the reference counting mechanism and autorelease pools, syntactic notations, object ownership as it pertains to memory management, common pitfalls, and good programming practices.

This article is intended for programmers who are new or have some limited exposure to Apple's frameworks, but have some Object-Oriented programming experience and are familiar with Objective-C, which is used for all the examples. OO concepts and terminology are liberally used below.

Objects Seen As Memory

Objects, or instances of Classes, are unique by virtue of being distinct fragments of memory that contain the state for each instance. Therefore, the creation and deletion of an object is equivalent to the allocation and de-allocation of the memory it occupies. The Foundation framework, upon which all other frameworks are built, provides reference counting for objects and a delayed object disposal mechanism by means of a root class, NSObject, and an Objective-C protocol of the same name that other classes can adopt. Most classes in Apple's frameworks and in applications built on them are subclasses of NSObject or conform to the NSObject protocol, and can therefore avail themselves of this infrastructure.

Since Apple's frameworks expose their functionality in the form of classes (though there are some C functions and structs), memory management is cast in terms of object creation and disposal. Over the spectrum of memory management methods ranging from the malloc/free of the C world to automatic Garbage Collection in Smalltalk & Java, Foundation's reference counting & delayed disposal lie somewhere in the middle.

Object Ownership

Foundation and other frameworks suggest a policy for creating and disposing objects:

  • If you create an object, you are responsible for disposing of it.
  • If you want an object you didn't create to stay around, you must "retain" it and then "release" it when you no longer need it.

The idea is that the creator of an object is its owner, and only the owner of an object may destroy it. Consistently adopting this policy makes code simpler and more robust, and avoids problems such as leaks or references to destroyed objects. Note, though, that there is some subtlety here. By using NSAutoreleasePool, the delayed disposal mechanism, the creator of an object is technically delegating responsibility for its destruction to the NSAutoreleasePool. The term "Object Ownership" is somewhat misleading in this regard.

Object Allocation & Initialization

SomeClass *anInstance = [[SomeClass alloc] init];

is the standard idiom for creating an object by first allocating memory for it and then initializing it. On operating systems that understand the notion of memory "zones" (such as Mach), the allocWithZone: method attempts to allocate memory from within the specified zone to improve locality of reference. Subclasses of NSObject with state typically also implement extended initialization methods, e.g.:

@interface CartesianCoordinate : NSObject
{
        NSNumber *abscissa;
        NSNumber *ordinate;
}
 
- (CartesianCoordinate *) initWithAbscissa: (NSNumber *)anAbscissa
                                 ordinate: (NSNumber *)anOrdinate;
 
@end
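
The interface above declares only the initializer. As a minimal sketch of what a matching implementation might look like under manual reference counting (the method bodies and the two read accessors are assumptions, not part of the original listing), the initializer retains its arguments and dealloc balances those retains:

```objc
// Sketch only: manual retain/release (compile with -fno-objc-arc).
#import <Foundation/Foundation.h>

@interface CartesianCoordinate : NSObject
{
        NSNumber *abscissa;
        NSNumber *ordinate;
}

- (CartesianCoordinate *) initWithAbscissa: (NSNumber *)anAbscissa
                                 ordinate: (NSNumber *)anOrdinate;
- (NSNumber *) abscissa;   // read accessors added for illustration
- (NSNumber *) ordinate;

@end

@implementation CartesianCoordinate

- (CartesianCoordinate *) initWithAbscissa: (NSNumber *)anAbscissa
                                 ordinate: (NSNumber *)anOrdinate
{
        self = [super init];
        if ( self != nil )
        {
            // We did not create the arguments, so we retain them
            // to keep them around for as long as we exist.
            abscissa = [anAbscissa retain];
            ordinate = [anOrdinate retain];
        }
        return self;
}

- (NSNumber *) abscissa { return abscissa; }
- (NSNumber *) ordinate { return ordinate; }

- (void) dealloc
{
        // Balance the retains made in the initializer.
        [abscissa release];
        [ordinate release];
        [super dealloc];
}

@end
```

Because the instance is created with alloc, the caller owns it and is expected to send it release (or autorelease) when done.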
 

NSObject also provides the copy and mutableCopy methods, which make copies of an object by allocating memory and duplicating the object's state. They rely on copyWithZone: and mutableCopyWithZone:, which a class implements by adopting the NSCopying and NSMutableCopying protocols.

Object Disposal

You indicate that you are no longer interested in an object by sending it the release message. When nobody is interested in an object, i.e., when there are no external references to it, it is de-allocated by sending it the dealloc message. Classes with state are responsible for cleaning up in their dealloc implementations by releasing any other objects they in turn may be retaining:

@implementation CartesianCoordinate
 
...
- (void) dealloc
{
        [abscissa release];
        [ordinate release];
 
        [super dealloc];
}
 
@end

Object Reference Counting

Reference counting is extremely simple (modulo Distributed Objects, which can get a mite hairy). Each object has a "retain count" associated with it, which counts external references to it. When an object is initially created using alloc (followed by init or initWith...) or one of the copy methods, it has an implicit retain count of 1. Other objects can "retain" it by sending it the retain message, which increments the retain count. Each release message correspondingly decrements the retain count. When the count reaches 0, the object is de-allocated. You can examine the retain count of an object by sending it the retainCount message, though this is best treated as a debugging aid.
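
The counting described above can be observed directly. The following is a small sketch using a plain NSObject; the helper function observeRetainCounts is a hypothetical name introduced only for illustration, and the counts it records hold for a simple object that nothing else references:

```objc
// Sketch only: manual retain/release (compile with -fno-objc-arc).
#import <Foundation/Foundation.h>

// Records the retain count seen after creation, after a retain,
// and after a release.
static void observeRetainCounts( NSUInteger counts[3] )
{
        NSObject *anObject = [[NSObject alloc] init];  // created: count is 1
        counts[0] = [anObject retainCount];

        [anObject retain];                             // second reference: 2
        counts[1] = [anObject retainCount];

        [anObject release];                            // back down to 1
        counts[2] = [anObject retainCount];

        [anObject release];                            // reaches 0: de-allocated
}
```

Framework objects such as string constants or cached values may report counts that look surprising, so it is safer to reason in terms of balanced retain and release messages than in terms of absolute counts.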

In the following example, an object (alertString) is created, used, and then disposed of:

- (void) notifyUserOfError: (NSString *)errorString
{
        NSMutableString *alertString = nil;
 
        alertString = [[NSMutableString alloc] initWithString:
                        @"The following error occurred: "];
        [alertString appendString: errorString];
        NSRunAlertPanel( alertString ...);
        [alertString release];
 
        return;
}

Temporary Objects

As you can see in the code fragment above, it is often necessary to create throw-away objects that are used once and then destroyed. This is simple when the scope is well-defined, as above. But what if the temporary object has to be returned to the caller? Common idioms for dealing with this in C are to use statically allocated buffers or return dynamically allocated memory which the caller is then responsible for freeing. Foundation provides a somewhat more elegant solution by means of a delayed disposal mechanism that allows the creation of temporary objects which eventually go away automagically. Here's the same method rewritten:

 
- (void) notifyUserOfError: (NSString *)errorString
{
        NSMutableString *alertString = nil;
 
        alertString = [NSMutableString stringWithString:
                        @"The following error occurred: "];
        [alertString appendString: errorString];
        NSRunAlertPanel( alertString ...);
 
        return;
}

As you can see, the alertString is not sent a release message after it is used. Neither this method nor its callers need worry about disposing of alertString: because of the way it was created, it is an "autoreleased" object and will go away eventually. An autoreleased object is simply one that will automatically receive a release message at some point in the future. Autoreleased objects hence have a finite lifetime and will be destroyed unless explicitly retained. You autorelease an object by sending it a (surprise) autorelease message. In the code fragment above, the line

 
alertString = [NSMutableString stringWithString:
                    @"The following error occurred: "];
 

is exactly the same as

alertString = [[[NSMutableString alloc] initWithString:
            @"The following error occurred: "] autorelease];

Per Foundation method naming conventions, creation conveniences such as stringWithString: always return autoreleased instances.
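
To make the convention concrete, here is a sketch of how one might write such a creation convenience for a class of one's own. The Label class, its labelWithText: method, and the liveLabels counter are all hypothetical names invented for this illustration; the counter exists only so the object's lifetime can be observed:

```objc
// Sketch only: manual retain/release (compile with -fno-objc-arc).
#import <Foundation/Foundation.h>

static int liveLabels = 0;   // hypothetical: number of Label instances alive

@interface Label : NSObject
{
        NSString *text;
}
+ (Label *) labelWithText: (NSString *)aText;
- (Label *) initWithText: (NSString *)aText;
- (NSString *) text;
@end

@implementation Label

+ (Label *) labelWithText: (NSString *)aText
{
        // Creation convenience: alloc + init + autorelease, so the
        // caller receives an object it does not have to release.
        return [[[self alloc] initWithText: aText] autorelease];
}

- (Label *) initWithText: (NSString *)aText
{
        self = [super init];
        if ( self != nil )
        {
            text = [aText copy];   // we created the copy, so we own it
            liveLabels++;
        }
        return self;
}

- (NSString *) text { return text; }

- (void) dealloc
{
        liveLabels--;
        [text release];
        [super dealloc];
}

@end
```

A Label obtained from labelWithText: stays alive until the enclosing autorelease pool is released; a caller that needs it for longer must retain it.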

Gory Autorelease Details

Though autoreleasing an object is conceptually simple, it is useful to know more about how the mechanism works. Each application has a number of NSAutoreleasePool objects, which, as their name suggests, are collections of autoreleased objects. Sending autorelease to an object adds it to an NSAutoreleasePool. At some point in the future, typically at the end of the event loop in Foundation and AppKit applications, or at the end of the request-response loop in WebObjects applications, the NSAutoreleasePool sends release to all its objects (when it is itself released).

Notice that NSAutoreleasePool is mentioned in the plural. Why would there be more than one? Because being able to scope the lifetime of objects is sometimes very useful, autorelease pools are stackable. Multi-threaded applications can have a stack of pools per thread. If you are creating a large number of temporary objects that are only valid within a very tight context such as a loop, and don't want those objects to hog memory until much later on, you can create an autorelease pool that is local to that context:

- (id) findSomething
{
        id theObject = nil;
        // Whatever we're looking for
        NSAutoreleasePool *localPool = [[NSAutoreleasePool alloc] init];
        // Autoreleased objects are now automatically placed in localPool.
 
        // Loop that creates many temporary objects
        while ( theObject == nil )
        {
            ...
            if ( [temporaryObject matchesSomeCondition] )
            {
                theObject = [temporaryObject retain];
                // We want this one
            }
        }
 
        // Get rid of all those temporary objects
        [localPool release];
 
        return [theObject autorelease];
}

Notice that by sending a retain message to the temporaryObject we are interested in, we extend its life beyond that of localPool. We then autorelease it before returning it, so that it is eventually disposed of.
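
The effect of a local pool can be checked with a small experiment. In this sketch, Probe, liveProbes, and lastProbeOf are hypothetical names made up for the illustration: Probe merely counts how many of its instances are alive, and the function follows the same retain, release-the-pool, autorelease pattern as findSomething above:

```objc
// Sketch only: manual retain/release (compile with -fno-objc-arc).
#import <Foundation/Foundation.h>

static int liveProbes = 0;   // hypothetical: number of Probe instances alive

@interface Probe : NSObject
@end

@implementation Probe

- (id) init
{
        self = [super init];
        if ( self != nil )
        {
            liveProbes++;
        }
        return self;
}

- (void) dealloc
{
        liveProbes--;
        [super dealloc];
}

@end

// Creates `count` temporary probes inside a local pool and hands back
// only the last one. The caller must have its own pool in place.
static Probe *lastProbeOf( int count )
{
        Probe *kept = nil;
        NSAutoreleasePool *localPool = [[NSAutoreleasePool alloc] init];
        int i;

        for ( i = 0; i < count; i++ )
        {
            Probe *temporary = [[[Probe alloc] init] autorelease];
            if ( i == count - 1 )
            {
                kept = [temporary retain];   // must outlive localPool
            }
        }

        [localPool release];         // destroys the other temporary probes
        return [kept autorelease];   // now belongs to the caller's pool
}
```

After the function returns, only the probe that was retained is still alive, and it goes away in turn when the caller's pool is released.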

Here is a more sophisticated example involving stacked pools:

- (NSArray *) findAListOfThings
{
        NSMutableArray *thingArray =
            [[NSMutableArray alloc] initWithCapacity: 25];
        // The list of 25 things we're looking for
        NSAutoreleasePool *outerPool = [[NSAutoreleasePool alloc] init];
        NSAutoreleasePool *innerPool = nil;
        NSArray *largeObjectArray = nil;
        id temporaryObject = nil;
        NSEnumerator *arrayEnumerator = nil;
 
        // Loops that create many temporary objects
        while ( [thingArray count] < 25 )
        {
            largeObjectArray = [self fetchLotsOfObjects];
            // largeObjectArray is autoreleased and contained in the
            // outer autorelease pool
            arrayEnumerator = [largeObjectArray objectEnumerator];
            // Note that the enumerator itself is a temporary object!
            // It will be released by the outerPool
 
            // Create the inner pool on each iteration. When
            // a pool is created, it automatically becomes the
            // "top" pool on the current thread's stack of pools.
            innerPool = [[NSAutoreleasePool alloc] init];
            // autoreleased objects now go into innerPool
 
            while ( (temporaryObject = [arrayEnumerator nextObject]) != nil )
            {
                ...
                if ( [temporaryObject matchesSomeCondition] )
                {
                    [thingArray addObject: temporaryObject];
                    // Collections retain their members
                }            
            }
 
            // Dispose temporary objects created on this iteration;
            // Note that the objects added to thingArray during this
            // iteration are also in innerPool and thus sent a release
            // message, but are not destroyed because they have been
            // retained by thingArray and so have an additional reference
            // (their retainCount > 1)
            [innerPool release];
        }
 
        [outerPool release];
 
        return [thingArray autorelease];
}

Common Pitfalls

Here are some of the more straightforward mistakes made when using retain, release, and autorelease:

Releasing an object you didn't create:

 
@implementation Warden
 
- (void) chastizePrisonerNamed: (NSString *)aName
{
        Prisoner *thePrisoner = [Prisoner prisonerWithName: aName];
 
        // ...
        // Many a tense moment later
 
        [thePrisoner release];    // Ugh! thePrisoner isn't ours to release.
        return;
}
 
@end

How do we know that thePrisoner is autoreleased? Remember, by convention, other than the alloc..., new..., copy..., and mutableCopy... methods, all class creation methods return autoreleased objects with a retain count of 1. Thus thePrisoner will automatically get a release message later on, taking its retain count to 0 and deallocating it. The explicit release above destroys it early, so that later release is sent to an object that no longer exists.

Not retaining autoreleased objects that you need beyond the present context:

 
@implementation Slacker
 
- (void) goofOff
{
        myRationale = [Rationale randomRationale];
        // myRationale is an instance variable
 
        sleep( rand() % 7200 );
        // Do more slacker stuff
        ...
}
 
// Later on
 
- (void) justifyTimeToPointyHairedBoss
{
        [self blurtOut: [myRationale description]];
        // Ugh! myRationale may no longer exist!
        ...
}
 
@end

The more correct thing to do here is

   myRationale = [[Rationale randomRationale] retain];

or better yet,

   [self setMyRationale: [Rationale randomRationale]];

Returning temporary objects that you created without first autoreleasing them:

- (Emotion *) emotionForDate: (NSDate *)aDate
{
        Emotion *theEmotion = nil;
 
        // Compute an emotion from the date
        srand( (unsigned int)[aDate hash] );
        theEmotion = [[Emotion alloc] initWithType: rand()];
 
        return theEmotion;
        // Ugh! You are responsible for disposing your creations;
        // this should be: return [theEmotion autorelease];
}

Writing sloppy accessors:

- (void) setGame: (Game *)newGame
{
        [game release];
        game = [newGame retain];
        // Ugh! What if game == newGame?
}

Retain cycles, i.e., objectA retains objectB and objectB retains objectA. Neither object's retain count can then reach 0, so neither is ever de-allocated. Avoiding retain cycles is a matter of good design and clear object ownership paradigms. In general, ownership should be unidirectional. For example, it makes sense for a collection to retain its members. It doesn't make sense for each member to retain the collection.
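
One common way to keep ownership unidirectional is to leave the back-reference unretained. The sketch below assumes a hypothetical Node class (with a liveNodes counter added purely for observation) in which a parent retains its children through a collection, while each child stores a plain, unretained pointer to its parent:

```objc
// Sketch only: manual retain/release (compile with -fno-objc-arc).
#import <Foundation/Foundation.h>

static int liveNodes = 0;   // hypothetical: number of Node instances alive

@interface Node : NSObject
{
        Node *parent;              // back-reference: deliberately NOT retained
        NSMutableArray *children;  // a parent owns (retains) its children
}
- (void) addChild: (Node *)aChild;
- (void) setParent: (Node *)aParent;
- (Node *) parent;
@end

@implementation Node

- (id) init
{
        self = [super init];
        if ( self != nil )
        {
            children = [[NSMutableArray alloc] init];
            liveNodes++;
        }
        return self;
}

- (void) setParent: (Node *)aParent
{
        parent = aParent;          // plain assignment, so no cycle is formed
}

- (Node *) parent { return parent; }

- (void) addChild: (Node *)aChild
{
        [children addObject: aChild];   // the collection retains aChild
        [aChild setParent: self];
}

- (void) dealloc
{
        // Clear the back-references so surviving children don't dangle.
        [children makeObjectsPerformSelector: @selector(setParent:)
                                  withObject: nil];
        [children release];
        liveNodes--;
        [super dealloc];
}

@end
```

Releasing the root of such a tree is enough to dispose of every node beneath it, since no child keeps its parent alive. Note that this setter intentionally breaks the usual retain-in-accessor rule, which is worth a comment wherever it is done.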

Useful Idioms

Always use accessor methods when referencing instance variables, even within your own class implementation! It is tempting to manipulate one's instance variables directly, but it is then easy to forget to retain values or release previously referenced objects and, in multi-threaded applications, to return references to destroyed objects. The ubiquitous use of accessors also makes it easy to differentiate between instance, automatic, and global variables, and makes code easier to read.

Don't use autorelease in accessor implementations. It's tempting to write set methods like this:

 
- (void) setTheory: (Theory *)value
{
        [theory autorelease];
        theory = [value retain];
        return;
}
 

But autoreleasing an object is an expensive operation, and should only be used when there is uncertainty about an object's lifespan. When invoking a set method, you are no longer interested in the currently referenced object, so immediately releasing it is the correct thing to do. This approach has the added benefit of exposing extra-release problems that might otherwise not appear during testing because the old, autoreleased object is still around. Here is a prototype for an efficient, if somewhat verbose, set method:

 
- (void) setTheory: (Theory *)newTheory
{
        Theory *oldTheory = nil;
 
        if ( theory != newTheory )        // If they're the same, do nothing
        {
            [self willChange];            // For Enterprise Objects only
            oldTheory = theory;        // Copy the reference
            theory = [newTheory retain];// First retain the new object
            [oldTheory release];        // Then release the old object
        }
 
        return;
}
 

For classes with shared or singleton instances, always reference the instance via an accessor that will create it as necessary:

 
@implementation Earth
 
static Earth *_sharedEarth = nil;
 
+ (Earth *) sharedEarth
{
        if ( _sharedEarth == nil )
        {
            _sharedEarth = [[Earth alloc] initWithTrees:  ... ];
        }
 
        return _sharedEarth;
}
 
@end
 
