Archive for October, 2006

Objective-C 3.0

Andy on Oct 25th 2006

My previous post about Key-Value Coding and being able to dynamically add transient attributes to NSManagedObject got me thinking. Why can’t I dynamically add methods? I don’t mean just on NSManagedObjects, I mean on NSObjects. I realize that categories allow that somewhat, but only as a group of methods and only at compile time.

Anyway, this train of thought got me to thinking about what I’d like to see in the next version of Objective-C. Since 2.0 is pretty much a done deal, I’m thinking forward to Objective-C 3.0. I also ended up thinking about what it would take to implement the proposed features, in hopes it would give me a better idea as to what was feasible and what wasn’t. That said, I could totally be smoking crack, and none of this may be feasible.

Anonymous Functions

In the new Objective-C, I want at least some basic anonymous function support. To use JavaScript as an example:

this.foo = function() { alert('foo called'); }

For the JavaScript uninitiated, this adds a function called “foo” to the current object. In Objective-C, it would end up looking like:

[self addMethod: @function() { NSLog(@"foo called"); }
	forSelector:@selector(foo)];

As shown, this should be easily implementable because the anonymous function only touches global symbols. How to access globals doesn’t change from a normal function to an anonymous function.
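To give a feel for what addMethod:forSelector: would have to do behind the scenes, here is a rough sketch in plain C. It assumes each object carries its own table mapping selector names to function pointers. The names Object, add_method, and send_message are made up for illustration, and this is not how the Objective-C runtime actually stores methods.

```c
#include <stddef.h>
#include <string.h>

#define MAX_METHODS 8

typedef void (*Method)(void *self);

/* A hypothetical object that carries its own selector-to-function table. */
typedef struct {
    const char *selectors[MAX_METHODS];
    Method      methods[MAX_METHODS];
    size_t      method_count;
    int         calls;   /* touched by the example method below */
} Object;

/* Register fn under selector; returns 0 on success, -1 if the table is full. */
static int add_method(Object *object, const char *selector, Method fn) {
    if (object->method_count == MAX_METHODS)
        return -1;
    object->selectors[object->method_count] = selector;
    object->methods[object->method_count] = fn;
    object->method_count++;
    return 0;
}

/* Look the selector up at call time; returns -1 if nothing responds. */
static int send_message(Object *object, const char *selector) {
    for (size_t i = 0; i < object->method_count; ++i) {
        if (strcmp(object->selectors[i], selector) == 0) {
            object->methods[i](object);
            return 0;
        }
    }
    return -1;
}

/* Stands in for the anonymous function body. */
static void foo_impl(void *self) { ((Object *)self)->calls++; }
```

Because the lookup happens at call time, a method added after the object was created is found just like any other.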

Accessing instance variables in anonymous functions

Things get more interesting, and more difficult to implement, if you allow the code block to touch instance variables and methods. For example:

@interface Foo : NSObject {
	int count;
}
- (float) cost;
@end

@implementation Foo

- (id) init {
	self = [super init];
	if ( self == nil )
		return nil;

	[self addMethod: @function() {
		NSLog(@"%d items at %f each",
			self->count, [self cost]);
		}
		forSelector:@selector(display)];
	return self;
}

@end

In the above example, calling the method, cost, should be pretty trivial to implement inside of an anonymous function. self is just a parameter to the function, and the compiler just needs to send a message to it.

Unfortunately, accessing the ivar, count, off of self, is probably impossible to implement. When the anonymous function is compiled, it doesn’t know which class it is attached to. This is a problem because in C, structure accesses are just byte offsets from the address of the structure. On any given class, count could be anywhere within the structure. Since the anonymous function doesn’t know which class to look at, it doesn’t know what byte offset to generate.
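The byte-offset problem is easy to see in plain C. In the sketch below, the two structs (Invoice and Cart, both invented for this example) each have a field named count, but at different offsets, so code compiled without knowing the struct cannot know where to look.

```c
#include <stddef.h>

/* Two hypothetical "classes" laid out as C structs. Both have a field
   named count, but it sits at a different byte offset in each. */
struct Invoice {
    void *isa;     /* class pointer at the start of the object */
    int   count;
};

struct Cart {
    void  *isa;
    double discount;
    int    count;  /* same name, pushed further into the struct */
};

/* A compiled access to count is just "base address + fixed offset",
   so the caller has to supply the offset for the right struct. */
static int read_count(const void *object, size_t offset) {
    return *(const int *)((const char *)object + offset);
}
```

Calling read_count with offsetof(struct Invoice, count) on a Cart would read the wrong bytes, which is exactly the situation the anonymous function would be in.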

Fortunately, it looks like Objective-C 2.0 is going to add property support, which is basically syntactic sugar. Properties will automatically generate the methods needed to access or mutate them. This brings properties into the realm of feasibility inside of anonymous functions because they can be treated as method calls.

Accessing local variables in anonymous functions

Although the ability to access and use member variables inside of anonymous functions is interesting, it’s also important to be able to access local variables. For example:


- (id) init {
	int count = 20;

	[self addMethod: @function() {
		NSLog(@"%d items", count);
		}
		forSelector:@selector(display)];

	[self foo]; // indirectly call the anonymous function
	return self;
}

- (void) foo {
	[self display]; // invoke the added method
}

I should point out immediately that this example is an extremely contrived case, and doesn’t show how useful accessing locals from anonymous functions is. I need to introduce another concept before the usefulness becomes apparent. So for now, I’ll assume that it’s important, and just delve into how this might be implemented.

In the example given, count is a local in init, and it is used in the anonymous function. The compiler needs a mechanism to “bind” the local count to the anonymous function such that every time the function is invoked it can locate the local in memory. The compiler cannot assume that the anonymous function will always be invoked from the function that it was created in; it might instead be invoked by a function further down the stack. For example, init calls foo and foo calls display. This fact rules out the possibility of passing in the local as a parameter to the anonymous function.

Normally locals are accessed by an offset from the base of a function’s stack frame. So if the code in the anonymous function can find the right stack frame for init, it can easily find the local count. It is also easy for code to walk up the call stack, and thus iterate the stack frames.

However, there is nothing currently in the stack frame that would uniquely identify a function. So the compiler would have to generate a context id for each function that defines an anonymous function, and generate code to store it in the stack frame. At runtime, the anonymous function would iterate up the stack frame until it found the correct context id, and then use the corresponding stack frame pointer as a base for the local’s byte offset.

The problem with this implementation is that I don’t remember seeing a space in the stack frame to store a context id. It would have to be a fixed byte offset from the base of the stack frame in order to be easily found. This is based off my memory of the PowerPC calling conventions. I unfortunately don’t remember my Intel calling conventions from my college days.

If the context id can’t be stored directly in the stack frame, then it’s possible to create a parallel stack which contains only the context ids. It would have to be stored as pairs of context ids and stack frame pointers. The downside of this approach is that care would have to be taken to ensure thread safety.

Finally, there’s an obvious problem with the lifetime of local data. After init returns, count ceases to exist, and any call to display will result in a crash. Although I was at first tempted to try to fix this, I actually think it’s OK to allow the crash at runtime. This wouldn’t fly in the interpreted language world (like Ruby), but Objective-C isn’t interpreted. Objective-C programmers know they shouldn’t return pointers to locals, so creating anonymous functions that reference locals that will go out of scope shouldn’t be foreign or surprising to them.
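For comparison, here is how C programmers usually fake this today. It is a different technique from the stack-walking scheme described above: instead of searching for the right frame, the "closure" is a function pointer paired with a pointer to the captured local. The Closure type and all function names are hypothetical. The lifetime caveat is the same, since the pointer is only good until the creating function returns.

```c
#include <stddef.h>

/* A hand-rolled "closure": a function pointer plus a pointer to the
   local it captured. */
typedef struct {
    void (*invoke)(void *context, unsigned element);
    void *context;  /* points into the creating function's stack frame */
} Closure;

/* Body of the anonymous function: adds each element to the captured sum. */
static void add_to_sum(void *context, unsigned element) {
    unsigned *sum = context;
    *sum += element;
}

/* Some function further down the stack that invokes the closure. */
static void for_each_item(const unsigned *items, size_t n, Closure closure) {
    for (size_t i = 0; i < n; ++i)
        closure.invoke(closure.context, items[i]);
}

/* The closure is only valid while sum is alive, i.e. until this returns. */
static unsigned sum_items(const unsigned *items, size_t n) {
    unsigned sum = 0;
    Closure closure = { add_to_sum, &sum };
    for_each_item(items, n, closure);
    return sum;
}
```

If the Closure were stored somewhere and invoked after sum_items returned, it would write through a dangling pointer, which is the crash scenario described above.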

Closures

Of course, the whole reason I bring up anonymous functions is because I want closures. Let’s take an example from Ruby:

array.each { |element| print element }

For those unfamiliar with Ruby, some explanation is required. array is, um, an array, and each is an instance method on array. each walks each element in the array and calls the anonymous function attached to it, which is everything inside the curly braces, passing in the element. Parameters to the anonymous function are declared inside the pipe characters.

In Objective-C, the code would look something like:

NSArray* array = ...;
[array each] @function (id element) {
	NSLog( @"element: %@", element );
}

Where the implementation of each would look like:

- (void) each {
	NSEnumerator* enumerator = [self objectEnumerator];
	id element = nil;
	while ( element = [enumerator nextObject] )
		[closure yield:element];
}

Most of the above code is self-explanatory. The new stuff is the closure object. Like the self object, it is a hidden parameter that is passed into the method, except that it points to the anonymous function. The yield method takes a variable argument list, and actually invokes the anonymous function.

The interesting thing about the above code is that it implies anonymous functions can be encapsulated by objects. I’m not sure how feasible this is, however. An alternate syntax might be:

- (void) each {
	NSEnumerator* enumerator = [self objectEnumerator];
	id element = nil;
	while ( element = [enumerator nextObject] )
		@yield(element);
}

The only gotcha with the new syntax is that it doesn’t allow the called method to determine whether there’s a closure attached and change its behavior accordingly. To see why you might want this, consider an NSMutableArray initWithCapacity: method:

...
- (id) initWithCapacity:(unsigned) size {
	...
	if ( closure == nil ) {
		// Normal initialization with capacity
	} else {
		for(unsigned i = 0; i < size; ++i)
			[self addObject: [closure yield:i]];
	}
	...
}
...

NSMutableArray* uninitializedArray = [[NSMutableArray alloc] initWithCapacity:20];
NSMutableArray* initializedArray = [[NSMutableArray alloc] initWithCapacity:20]
	@function(unsigned index) {
		return [NSNumber numberWithUnsignedInt: index * 2];
	}

As you can see, initWithCapacity changes behavior depending on whether an anonymous function is attached. If there isn’t a function, then it simply allocates an array of the given size. If there is a closure, then it calls the function to generate each element.

So, as you can see, it’s advantageous inside of a method to know if there is an attached anonymous function. If @yield is used, then you obviously lose this ability. There are ways around this, such as introducing a new hidden BOOL parameter, or another code flow construct. However, they aren’t quite as elegant.
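The "closure or no closure" check maps onto a familiar C idiom: an optional callback that may be NULL. The sketch below is only an analogy. UIntArray, array_init, and the Generator type are invented for this example and have nothing to do with how NSMutableArray is implemented.

```c
#include <stdlib.h>

/* Hypothetical generator type: returns the value for a given index. */
typedef unsigned (*Generator)(unsigned index);

typedef struct {
    unsigned *items;
    unsigned  count;     /* elements filled in so far */
    unsigned  capacity;
} UIntArray;

/* With no generator this only reserves space; with one, it also fills
   every slot. Returns 0 on success, -1 if allocation fails. */
static int array_init(UIntArray *array, unsigned capacity, Generator generator) {
    array->items = malloc((capacity ? capacity : 1) * sizeof *array->items);
    if (array->items == NULL)
        return -1;
    array->capacity = capacity;
    array->count = 0;
    if (generator != NULL) {
        for (unsigned i = 0; i < capacity; ++i)
            array->items[array->count++] = generator(i);
    }
    return 0;
}

/* Plays the role of the anonymous function in the example above. */
static unsigned doubled(unsigned index) { return index * 2; }
```

Passing NULL corresponds to calling the method with no closure attached, and passing doubled corresponds to attaching one.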

Examples

Earlier I mentioned that it is an interesting feature to allow anonymous functions to access local variables. This becomes more apparent when using closures. For example:

unsigned sum = 0;
[array each] @function(NSNumber* element) {
	sum += [element unsignedIntValue];
}

You also don’t have to write separate functions for threads. With closures you could write something like:

[NSThread spawn] @function() {
	NSLog(@"Hello World from a thread.");
}

I can take my earlier example of dynamically adding a method and make it simpler with closures:

[self addMethod: @selector(foo)] @function() {
	NSLog(@"foo called");
}

Syntax

If you go back to the Ruby example, you’ll notice that it has a much nicer syntax. Anonymous functions look like normal code blocks in Ruby. I don’t think that would be reproducible in Objective-C because of the backwards compatibility to C. Furthermore, Objective-C likes to prepend an ‘@’ to all the Objective-C features. I’m not sure if it’s for technical reasons or purely aesthetic reasons, but it’s unlikely that Objective-C would drop it for new features. In an attempt to clean up the sum example, it could look like:

unsigned sum = 0;
[array each] @ {
	@| NSNumber* element |

	sum += [element unsignedIntValue];
}

However, I’m not sure how much of an improvement this really is. I had to retain the ‘@’ character, so the anonymous function still doesn’t look like a normal code block. What’s more, the use of pipe characters to declare parameters will probably look foreign to most people used to C. Using the C function style parameter list will probably feel more intuitive to more people.

Properties

Although I’ve focused on methods the entire article, there’s no reason why the same thing couldn’t apply to properties.

For example:

[self addProperty: @"cost"] @function() {
	return 30.0;
} 

NSLog(@"cost %f", self.cost); // prints 30.0

Of course, there’s no reason why the property has to be backed by a method. It could be as simple as adding an entry to an internal NSMutableDictionary. In fact, if the caller doesn’t provide a closure, then addProperty: could just add a dictionary entry:

[self addProperty: @"cost"] ;

And to initialize it with a default value:

[self addProperty: @"cost" withValue: [NSNumber numberWithFloat:30.0] ] ;

Conclusion

When I started writing this article, I was just thinking about the direction that Apple was going with Core Data. I didn’t intend to end up talking about adding closures to Objective-C, but that’s the logical progression. Despite all the rambling on my part, I didn’t even touch other interesting features, such as built-in concurrency support. I suppose that’s another post for another day.

Objective-C is unique in that it combines the low-level functionality of C with some really high-level features, such as garbage collection, not commonly found in compiled languages. The high-level features really improve the productivity of programmers, but they haven’t been updated in a while (Objective-C 2.0 notwithstanding). I’m not sure if Apple will implement these features, or, if they do, whether it will be in the way I implied, but I do hope that the language will continue to progress.

Filed in Macintosh, Programming | 9 responses so far

Dynamically adding transient attributes in Core Data

Andy on Oct 24th 2006

One of the NNTP commands I wanted to implement in Wombat was XPAT. XPAT is a simple search command that searches a specified header on a range of messages for a given wildmat expression. For example:


XPAT subject 1-12 *fred*

searches messages 1 through 12 for any message that has a subject containing the word “fred.” Matching a given header against a wildmat expression (i.e. the *fred* in the previous example), was easy because wildmat expressions are easily convertible into regular expressions. The problem was the user could specify any header to search against.
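As a rough illustration of that conversion, here is a simplified sketch in C using POSIX regular expressions. It only handles the * and ? wildcards; the character sets and backslash escapes of full wildmat are left out, and the function names are my own, not Wombat’s.

```c
#include <regex.h>
#include <stddef.h>
#include <string.h>

/* Translate a simplified wildmat pattern into an anchored POSIX extended
   regex. Only '*' and '?' are treated as wildcards here.
   Returns 0 on success, -1 if out is too small. */
static int wildmat_to_regex(const char *wildmat, char *out, size_t out_size) {
    size_t n = 0;
    if (out_size < 3)
        return -1;
    out[n++] = '^';
    for (const char *p = wildmat; *p != '\0'; ++p) {
        if (n + 4 > out_size)            /* up to 2 chars now, plus "$\0" */
            return -1;
        if (*p == '*') {
            out[n++] = '.';
            out[n++] = '*';
        } else if (*p == '?') {
            out[n++] = '.';
        } else {
            if (strchr(".[()+^$|\\{", *p) != NULL)
                out[n++] = '\\';         /* escape regex metacharacters */
            out[n++] = *p;
        }
    }
    out[n++] = '$';
    out[n] = '\0';
    return 0;
}

/* Returns 1 if text matches the wildmat pattern, 0 if not, -1 on error. */
static int wildmat_matches(const char *wildmat, const char *text) {
    char pattern[256];
    regex_t regex;
    if (wildmat_to_regex(wildmat, pattern, sizeof pattern) != 0)
        return -1;
    if (regcomp(&regex, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    int status = regexec(&regex, text, 0, NULL, 0);
    regfree(&regex);
    return status == 0 ? 1 : 0;
}
```

With this sketch, the wildmat *fred* becomes the regular expression ^.*fred.*$.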

Although NNTP specifies a group of standard headers, posts are not limited to those standard headers. Any header, as long as it is syntactically correct, can be added to a message. What’s more, each message is going to have a different set of headers. The first message might have an X-NNTP-Posting-Agent header, while the second one might not.

Since headers vary from post to post, and could be almost innumerable, I couldn’t add them to my Post entity as persistent attributes. Instead, I have one attribute, headers, which is a string with each header on its own line.

This caused a problem because I wanted to use NSPredicate’s MATCHES operator to compare a specific header against a regular expression. I wanted to specify the predicate query as:


subject MATCHES ".*fred.*"

Unfortunately NSPredicate wants an attribute name to compare against (subject in this case), and my NSManagedObject didn’t have one. It had the headers attribute, but that contained all the headers, not just one. Since, once again, the headers can vary, I needed a way to dynamically add transient attributes to my NSManagedObject derived class, Post.

Since NSPredicate uses Key Value Coding to obtain the value of an attribute, I figured there was a solution. My first thought was to simply override valueForKey: on my Post class. However, I was worried I’d muck up the Key Value Coding mechanism, and break something important. Fortunately, by digging through the Key Value Coding documentation I discovered the valueForUndefinedKey: message. According to the documentation, if a key isn’t found via the normal means, the valueForUndefinedKey: message is sent. This is important because it means that if I override valueForUndefinedKey:, I don’t run the risk of masking built-in, and possibly important, keys.

By default, valueForUndefinedKey: simply throws an exception stating no such attribute exists. This is valid behavior that I wanted to keep when overriding it. On the other hand, no matter what header the client asked for, I had to pretend that it was there and at least return an empty string. This was on the chance that some of the messages had the header. In other words, I needed a way to determine if the caller of valueForKey: was just asking for Joe Random attribute or a header attribute.

The only easy way I came up with was to prepend header attributes with a common prefix, like “header_”. So the NSPredicate query above becomes:


header_subject MATCHES ".*fred.*"

This makes the processing in valueForUndefinedKey: pretty easy. The method, in general, can be defined as:


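- (id)valueForUndefinedKey:(NSString *)key
{
	// Keys with the "header_" prefix are treated as header lookups.
	if ( [key hasPrefix:@"header_"] )
		return [self headerForName:key];

	// Anything else gets the default behavior, which throws.
	return [super valueForUndefinedKey:key];
}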

As you can see, distinguishing valid header requests from invalid attribute requests is easy. I call the super’s implementation of valueForUndefinedKey: so that the exception is thrown, as expected, in the case of bad attribute requests. It also means that if the parent class does something other than throw in the future, my override will continue to work.
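The post doesn’t show headerForName:, so the following is only a guess at the kind of lookup it performs, written in plain C. It assumes the headers string holds one "Name: value" pair per line, strips the header_ prefix from the key, and returns an empty string for a header the message doesn’t have. The function name and its return convention are my own, and folded (multi-line) headers are not handled.

```c
#include <stddef.h>
#include <string.h>
#include <strings.h>  /* strncasecmp (POSIX) */

#define HEADER_PREFIX "header_"

/* Look up one header in a block of "Name: value" lines.
   key is the KVC-style key, e.g. "header_subject". Returns 0 if the key
   lacks the prefix. Otherwise returns 1 and copies the value into out,
   or an empty string when the message has no such header. */
static int header_for_key(const char *headers, const char *key,
                          char *out, size_t out_size) {
    size_t prefix_len = strlen(HEADER_PREFIX);
    if (out_size == 0 || strncmp(key, HEADER_PREFIX, prefix_len) != 0)
        return 0;
    const char *name = key + prefix_len;
    size_t name_len = strlen(name);
    out[0] = '\0';

    for (const char *line = headers; line != NULL && *line != '\0'; ) {
        const char *end = strchr(line, '\n');
        size_t line_len = end ? (size_t)(end - line) : strlen(line);
        /* Header names are case-insensitive and end at the first colon. */
        if (line_len > name_len && line[name_len] == ':' &&
            strncasecmp(line, name, name_len) == 0) {
            const char *value = line + name_len + 1;
            size_t value_len = line_len - name_len - 1;
            while (value_len > 0 && (*value == ' ' || *value == '\t')) {
                ++value;
                --value_len;
            }
            if (value_len >= out_size)
                value_len = out_size - 1;   /* truncate to fit */
            memcpy(out, value, value_len);
            out[value_len] = '\0';
            return 1;
        }
        line = end ? end + 1 : NULL;
    }
    return 1;
}
```

Returning an empty string for a missing header mirrors the requirement that every header_ key appears to exist, so the predicate can be evaluated against any message.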

I have to admit this was much easier to implement than I thought it would be. I also like the idea that I can easily access all my header values via NSPredicate, which is very useful.

Filed in Core Data, Macintosh, Programming, Wombat | One response so far

Merging multiple contexts in Core Data

Andy on Oct 23rd 2006

Last time I was working on Wombat, I was trying to get its Core Data multi-threading use correct. Namely, I stopped sharing one context (NSManagedObjectContext) between all threads, and created a new context for each thread. This meant I didn’t have to lock the context each time it was touched, which resulted in better performance. I thought that it would be as simple as that. Core Data would take care of the merging of the multiple contexts, and all would be good.

I was close, but it’s not quite that simple.

The first problem I ran into was the possibility of two different clients, each on its own thread and thus its own context, posting the same message. Since contexts aren’t saved until a client disconnects, that would work for the first client who quit, but the second one would run into trouble. It would most likely succeed in its save, assuming it had no merge conflicts, but it would compromise the data integrity of the data store. That is, I shouldn’t have the same post in the database twice. That’s bad.

Now the NNTP protocol specifies that each post should have a unique ID, called a Message-ID. It’s simply a black-box string that is supposed to be globally unique. Typically it is a time stamp concatenated with the local host name, and enclosed in angle brackets (<>). Some NNTP clients attempt to generate the Message-ID themselves, but most are smart and allow the server to generate one on their behalf. That means that the odds of receiving duplicate messages from clients are pretty much nil, although the situation still has to be considered.

It’s far more likely that two peer servers connect to Wombat, and offer it the same message that was posted elsewhere on the network. This is still somewhat unlikely, because direct peers should be somewhat rare, and thus the likelihood of them connecting at the same time would be low.

I bring up the rarity of these events for a reason. If these collisions were frequent, I should probably go back and consider locking down and updating the contexts more often. That way I would always know which messages I have and which ones I don’t, i.e. I would be doing preventative maintenance. The downside of preventative maintenance is obvious: it’s slower because I have to bottleneck all threads and update the data store. However, since message collisions are (theoretically) rare, I can just assume the thread’s current context is up-to-date. Then, when the thread goes to merge any changes to the context into the data store, it can handle any conflicts at that point.

Before this point, Wombat never made use of any of Core Data’s validation methods. That’s because everything was serialized through one context and I manually validated the incoming data before I inserted it into the context. For example, before I inserted a message I searched for any messages in the context with the same Message-ID. If I found one, I simply didn’t insert the new message, thus maintaining the integrity of the data store.

Now I needed to be able to catch any duplicates when I went to save the context to the data store. My first thought was: “wouldn’t it be great if Core Data modeling allowed me to specify an attribute as unique?” It would be, but alas, Core Data doesn’t allow it. If I could specify an attribute as being globally unique then I wouldn’t have to write any validation methods, but just let Core Data catch them for me. I also wonder if SQLite would be able to do anything with an attribute if it knew it was unique. For example, create an index on it so searching was quicker.

Anyway, dreams aside, I needed to write a validation method. The first thing I thought of was the validation method generated by Xcode for each attribute. The one of the form:


- (BOOL)validate<Key>:(id *)valueRef error:(NSError **)outError;

where <Key> is the name of the attribute. The problem with this is that Message-ID never changes, and I actually only want to validate when a new message is inserted into the data store. After searching around the documentation some more, I discovered:


- (BOOL) validateForInsert:(NSError **)error;

It’s a pretty easy to use method, and it only gets called on an inserted object. I simply added code at this point to check for more than one message with the same Message-ID. If I found more than one, I returned NO and put an error in the out parameter. The error parameter is passed back to the call to [NSManagedObjectContext save:&error], so I can do useful things like stuff the offending object into the error and the person who called save will get it.

Now I’d like to take an intermission to rant a bit. The error mechanism here is ghetto at best. If you’ll notice the out error parameter only points to one error, as does the error parameter in save. So what happens if you have more than one error, because, say, you have six messages that are duplicates? You curl up in a corner and cry, that’s what. Then you have to use some convoluted logic to pretend the API was designed to support multiple errors.

First, you have to check to see if the error parameter is nil. If it is, you just jam a pointer to your error in it. If it’s not nil, then you have to check to see if it’s a special error that’s designated as “multiple errors.” If it’s the special “multiple errors” error, then you create an entirely new “multiple errors” error with all the old stuff in it, plus your error added to its array of multiple errors. If the current error is not the special “multiple errors” error, then you have to create one, and jam the old error and your new error in it. Fun stuff.
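To spell out those three cases, here is a sketch in plain C with a stand-in Error struct. In Cocoa, as far as I know, the special code is NSValidationMultipleErrorsError and the array lives in the userInfo dictionary under NSDetailedErrorsKey; everything named in the code below (Error, ERR_MULTIPLE, add_error, the scratch parameter) is hypothetical.

```c
#include <stddef.h>

enum { ERR_DUPLICATE = 1, ERR_MULTIPLE = 2 };  /* hypothetical codes */
#define MAX_DETAILS 16

/* A stand-in for NSError: a code plus, for ERR_MULTIPLE only, the list of
   underlying errors. */
typedef struct Error {
    int code;
    const struct Error *details[MAX_DETAILS];
    size_t detail_count;
} Error;

/* Fold new_error into *slot, following the three cases described above.
   scratch provides storage for a combined error when one is needed.
   Returns 0 on success, -1 if the detail list is full. */
static int add_error(const Error **slot, const Error *new_error, Error *scratch) {
    const Error *current = *slot;
    if (current == NULL) {            /* case 1: first error, store directly */
        *slot = new_error;
        return 0;
    }
    Error combined = { ERR_MULTIPLE, { NULL }, 0 };
    if (current->code == ERR_MULTIPLE) {
        combined = *current;          /* case 2: start from the existing list */
    } else {
        combined.details[combined.detail_count++] = current;  /* case 3 */
    }
    if (combined.detail_count >= MAX_DETAILS)
        return -1;
    combined.details[combined.detail_count++] = new_error;
    *scratch = combined;              /* publish the new combined error */
    *slot = scratch;
    return 0;
}
```

The caller of save then does the reverse: if the code is the "multiple errors" one it walks the detail list, otherwise it treats the single error as a list of one.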

Hey Apple, you wanna know what else would have worked? An NSMutableArray of NSErrors. Crazy idea, I know.

Anyway, after I figured out how to get errors back to the caller of save, I needed to do something about them. Fortunately, processing duplicates is easy: you just delete them. I thought about doing this inside of validateForInsert, but some of the Apple documentation advises against mutating the context inside of validation routines. Instead, the caller of save just walks the list of errors (which has its own special logic to produce an array of errors out of a single error) and deletes the duplicates, then attempts the save again.

At this point, I thought I was home free. But save kept returning merge errors even after I had deleted the duplicates, and I didn’t know why. The answer turned out to be the merge policy on the context. By default, the merge policy is “don’t merge, and report anything that doesn’t merge.” I’m sure that’s a fine policy for applications that never use more than one context per data store, but it doesn’t work so well in Wombat.

There are actually several merge policies described in Apple’s documentation. NSErrorMergePolicy simply returns an error for each merge conflict, and is the default. NSMergeByPropertyStoreTrumpMergePolicy and NSMergeByPropertyObjectTrumpMergePolicy are similar to each other in that they merge on a property by property basis. They only differ when the property has been changed in both the store and context. In that case, NSMergeByPropertyStoreTrumpMergePolicy takes whatever was in the store. Conversely, NSMergeByPropertyObjectTrumpMergePolicy takes whatever was in the object context. NSOverwriteMergePolicy simply forces all the changes in the object context into the data store. Finally, NSRollbackMergePolicy discards any object context changes that conflict with what’s in the data store.
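The difference between the two property-level policies can be sketched in a few lines of C. This is my reading of the description above, not Core Data’s actual implementation. It assumes each property is a plain int and that a "base" snapshot of the values the context originally fetched is available; the MergePolicy enum and merge_properties are invented names.

```c
#include <stddef.h>

typedef enum { STORE_TRUMPS, OBJECT_TRUMPS } MergePolicy;  /* hypothetical */

/* Merge one object's properties, property by property.
   base    = values when the context last fetched the object
   store   = values currently in the data store
   context = values in the in-memory context
   The result is written to merged. */
static void merge_properties(const int *base, const int *store,
                             const int *context, int *merged,
                             size_t count, MergePolicy policy) {
    for (size_t i = 0; i < count; ++i) {
        int store_changed = store[i] != base[i];
        int context_changed = context[i] != base[i];
        if (store_changed && context_changed)       /* true conflict */
            merged[i] = (policy == STORE_TRUMPS) ? store[i] : context[i];
        else if (context_changed)
            merged[i] = context[i];
        else
            merged[i] = store[i];                   /* store change or none */
    }
}
```

The policy only matters for properties changed on both sides; everything else merges the same way under either one.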

For Wombat, I chose NSMergeByPropertyObjectTrumpMergePolicy because it does finer-grained merging, and because there’s a slim chance what’s in the object context is more up-to-date than what’s in the store.

All in all, merging multiple contexts was harder than I thought, and harder than I thought it needed to be. It would be nice if Core Data could do some more validation automatically (like unique attributes) and if the error handling was better. I also think picking a better default merge policy would help, because when getting the random merge errors, it wasn’t all that obvious to me that the merge policy was the problem.

Filed in Core Data, Macintosh, Programming, Wombat | 2 responses so far

Review: ZigVersion 1.0.1

Andy on Oct 17th 2006

Overview

ZigVersion, from ZigZig Software, is a graphical Subversion client. I recently had an opportunity to use it when developing my WordPress plugin, PollPress. Although I am not a Subversion expert, I was familiar with it before evaluating ZigVersion. I use some sort of version control software every day in my day job, usually Perforce or CVS.

ZigVersion is unique on the Mac because, as far as I can tell, it is the only commercial Subversion client. There are a few other open source competitors, most notably, svnX, which I have used before. Because ZigVersion is commercial, I have higher expectations for it.

Ease of use

One of the first things mentioned on the ZigVersion website is its usability:

Instead of simply reproducing the command line concepts as a graphical interface, we looked at the typical workflows of professional programmers and designed an interface around them.

This was a very compelling statement given my frustration with svnX (which window am I supposed to be in??). Enough that I decided to download ZigVersion and give it a try.

My first impression of ZigVersion was how simple it was. Although installation was the standard drag and drop to the Applications folder, this was already an improvement over most Subversion clients. Other clients required me to track down the binaries of the Subversion command line tools and install those first.

On startup ZigVersion presented me with a dialog listing all the recently accessed servers, and a field to type in a new one. Although this sounds pedestrian, this is an improvement over what most of ZigVersion’s competitors offer.

Once connected, ZigVersion displays a single view. It has the common Subversion actions in the toolbar across the top: check in, check out, update, revert and refresh. Most of the view is taken up by the hierarchical file view, but it also has a drawer for the file history. The file view is a merge of both the server and the working copy, and works the same as the Finder’s list view. This is my favorite feature of ZigVersion. The combining of the two views makes browsing the source tree very intuitive, and frees me from having to switch between a server view and a local view depending on what I want to do. Everything is in this one view.

The menus are likewise very simple. In fact only the File and Actions menus have anything that’s Subversion specific, and the File menu just has a couple of menu items to connect to a server. Unlike other Subversion clients, ZigVersion actually puts all possible actions in the menu, so they are easily found. On the other hand, they don’t just dump every conceivable command into the menu, and call it a day (I’m glaring in your general direction, Perforce.). As you can see from the screenshot, the creators of ZigVersion chose only the most commonly used commands to put in the menu. What’s more, the actions work on folders, and work recursively, just as I would expect.

There are some actions that aren’t in the menus, but that I easily found. These included moving and copying files. To move a file I simply dragged and dropped it where I wanted it. ZigVersion figured out the copies, deletes, etc. behind the scenes so I didn’t have to. Very nice. ZigVersion also provides contextual menus in the file view, which simply mirror what’s in the Actions menu.

The only other piece of major UI I saw was when I went to check in my changes. ZigVersion presented me with a simple, nonmodal window that allowed me to easily review my changes. It consists of a list of the files that have been modified, added, or deleted, a file comparison view, and a text area to add my comments to the check in. The file comparison view updates as a file is selected from the file list, comparing what’s in the working copy with what’s on the server. In the margin on the right, there are blue markers that show where in the file there are differences, so I could easily jump to those locations. Once there, the comparison view highlighted the differing lines, the same as in any file differencing application.

After using ZigVersion for a while, I can confirm that it is much easier to use than all the other Subversion clients. ZigZig Software did, in fact, put thought into which actions to expose and how to expose them, as opposed to forcing the user to juggle a couple of windows for each working copy, or just dumping all possible options into the menu bar.

Missing functionality

On the flip side of simplicity, there is missing functionality. With the sparseness of the menus and toolbar, I figured almost immediately there was some feature that would not be implemented. I kept waiting to run into the feature that I needed that ZigVersion did not provide.

The first thing I ran into was the lack of support for multiple working copies. ZigVersion only knows how to deal with one working copy per project. So if you check out a project twice on your machine, but in different locations, ZigVersion will forget the first one. I assume this limitation is because ZigVersion has the consolidated file view. Still, this feels a bit arbitrary, and ZigVersion would be improved if it were lifted.

Another limitation I ran into was in the check-in window. I found myself wanting to merge changes between the working copy and server copy or to simply make edits. Unfortunately the file comparison view in the check-in window is strictly read only. It cannot merge changes or be edited.

In general, I found that I wanted to be able to see differences in files from various parts of the application. For example, ZigVersion has a file history view, but it doesn’t allow me to see differences between revisions. That makes it difficult for me to track down when a change was introduced. Also, from the file view, ZigVersion would not let me see the difference between what was in the working copy and the server. I had to bring up the check-in window to view differences.

All in all, ZigVersion provided 99% of the functionality I needed. The multiple workspaces support is not something I think I would need except for WordPress plugins. That’s because I wanted to check out a subtree of the project into a WordPress site, and check out the entire project in a different location to be able to edit the testplans and documentation. The other features would make my life much simpler but did not prevent me from being able to do what I needed to.

Bugs

During my use of ZigVersion, I only found one bug, and it was a minor one. When displaying the modification date in the file view and file history, recent dates would often only display the time and older dates would only display the date.

Help

I actually didn’t need the help, which says something about the usability of ZigVersion. The only reason I even tried it was because I wanted to see if ZigVersion really did support multiple working copies, and I was just too dense to figure it out.

It’s a good thing I didn’t really need the help. The only menu item in the Help menu takes you to ZigZig’s website. That’s it. The page I was taken to has no documentation; it’s simply the main page of ZigZig Software’s website. There is a link at the top promisingly named “Documentation,” but the results of clicking that link were disappointing. The documentation has a link to the release notes, screenshots, and how to use ZigVersion with ssh. No help to be found.

There are, however, forums on ZigZig’s site, where, presumably, I could ask questions. I did not use them, though, so I cannot comment on how responsive or helpful they are.

I have to say, I’m quite shocked that anyone shipped a product with no help in this day and age.

Cost

After using ZigVersion for several weeks, I was quite happy with it. I decided to go buy it.

That’s when I discovered that it retailed for US $140 per user.

The reason I didn’t know the cost earlier is that ZigZig provides a free non-commercial license. Since PollPress, my WordPress plugin, is open source, it qualified for the non-commercial license.

I really like ZigVersion, but given its 1.0 status and relatively small feature set, I have a hard time justifying the $140 price tag. I, personally, would be willing to spend US $50 or $60 on it, but $140 is a little hard to swallow. On the other hand, each day that I use svnX increases the amount I’m willing to spend on ZigVersion. It still hasn’t gotten anywhere near $140 though.

Bottom line

ZigVersion is a highly intuitive, very Mac-like Subversion graphical client. It really is fun to use. It definitely has the feel of a 1.0 release, as evidenced by the small feature set. However, it does the Subversion client thing better than any other software out there that I’ve used. I look forward to future releases which will hopefully fill in the missing features. At that point, I hope that the features justify the current cost, or that the cost comes down to meet what’s implemented in the software.

Filed in Programming, Reviews | 13 responses so far

Core Data and Multi-threading

Andy on Oct 14th 2006

I’ve been wanting to write this article since Tuesday, but I’ve been distracted by my day job. One of our clients is getting close to shipping, so I have to put in more hours than usual. It’s nowhere near as fun as working on Wombat, but it pays the bills.

Anyway, when I wrote about Wombat a couple of weeks ago, I mentioned the trouble I was having with Core Data and multiple threads. Basically, I was finding that the entire context (NSManagedObjectContext) had to be locked anytime a thread touched the context or any one of its managed objects (NSManagedObject). That included merely reading attributes on an NSManagedObject, not just mutating them.

Apparently I wasn’t the only one who figured this out. Florian Zschocke, creator of Xnntp, told me that he was running into the same problem of having to lock the entire context each time he touched anything. He was also wondering if there was a better way.

The obvious problem with locking every time is that it defeats the concurrency of threads. The threads end up being serialized anytime they touch the data store. This is pretty troublesome for Wombat, because it’s an NNTP server. Most of its time is spent doing I/O - either reading/writing to the data store or reading/writing to sockets. Accessing the data store is already a potential performance hotspot, and the serialization of threads makes it even worse.

Fortunately, there’s a better way. Blake Seely left a comment on my previous post, letting me know that the appropriate way to handle multiple threads is to have a separate context for each thread. About this time I also found some Apple documentation pertaining to Core Data and multiple threads, which echoed Blake’s comments. This is as simple as allocating an NSManagedObjectContext each time a thread is spawned, and handing it the solitary NSPersistentStoreCoordinator.
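A rough sketch of the context-per-thread setup might look like the following. The ClientHandler class and its method names are made up for illustration and are not Wombat's actual code. The sketch also assumes manual retain/release and an already configured NSPersistentStoreCoordinator.

```objective-c
#import <Foundation/Foundation.h>
#import <CoreData/CoreData.h>

// Hypothetical per-client handler, for illustration only.
@interface ClientHandler : NSObject {
	NSPersistentStoreCoordinator *coordinator;	// shared by every thread
}
- (id)initWithCoordinator:(NSPersistentStoreCoordinator *)aCoordinator;
- (void)start;
@end

@implementation ClientHandler

- (id)initWithCoordinator:(NSPersistentStoreCoordinator *)aCoordinator {
	if ((self = [super init])) {
		coordinator = [aCoordinator retain];
	}
	return self;
}

- (void)dealloc {
	[coordinator release];
	[super dealloc];
}

// Spawn one thread for this client.
- (void)start {
	[NSThread detachNewThreadSelector:@selector(run:)
		toTarget:self withObject:nil];
}

// Thread entry point. The context is created on this thread and is
// never touched by any other thread.
- (void)run:(id)unused {
	NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];

	NSManagedObjectContext *context = [[NSManagedObjectContext alloc] init];
	[context setPersistentStoreCoordinator:coordinator];

	// ... service the client here, using only `context` ...

	// Push this thread's changes to the store before it exits.
	NSError *error = nil;
	if ([context hasChanges] && ![context save:&error]) {
		NSLog(@"Save failed: %@", error);
	}

	[context release];
	[pool release];
}

@end
```

Only the coordinator is shared between threads. Each context, along with every managed object fetched through it, stays on the thread that created it.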

The one gotcha is that NSManagedObjects from one context cannot be used in another context. If you want to send an object from one thread to the other, you have to pass the NSManagedObjectID around. This can be obtained by [object objectID] from one thread, then used on the other thread by [context objectWithID:objectID] to get the corresponding object in that context. However, this only works for objects that have been saved. In general, Wombat isn’t going to have to worry about passing objects between threads. That’s because each client is pretty isolated and has no reason to talk directly to another client.
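As a minimal sketch, the hand-off could be wrapped in two helpers. ShareableID and ObjectInContext are hypothetical names, not Core Data API. The sketch assumes that "has been saved" can be checked with isTemporaryID, since an unsaved object only has a temporary ID.

```objective-c
#import <Foundation/Foundation.h>
#import <CoreData/CoreData.h>

// Sending thread: returns an ID that can be handed to another thread,
// or nil if the object has not been saved yet and so only has a
// temporary ID.
static NSManagedObjectID *ShareableID(NSManagedObject *object) {
	NSManagedObjectID *objectID = [object objectID];
	return [objectID isTemporaryID] ? nil : objectID;
}

// Receiving thread: looks up the corresponding object in that thread's
// own context. Returns nil if no usable ID was passed.
static NSManagedObject *ObjectInContext(NSManagedObjectID *objectID,
	NSManagedObjectContext *context) {
	if (objectID == nil) {
		return nil;
	}
	return [context objectWithID:objectID];
}
```

The sending thread would call ShareableID(object) and pass the result along by whatever mechanism the threads already use to talk to each other. The receiving thread then calls ObjectInContext with its own context.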

That said, having multiple contexts has some implications for Wombat. Currently each client gets its own thread, and thus its own object context. In the future this will probably change, and clients will be pooled together on a few threads that handle multiple clients using something like kqueue to multiplex sockets. The catch is that no client saves its changes until the remote client closes the connection. Currently that means if Wombat has two clients, and Client A posts an article, Client B will not see that article until Client A quits. For performance reasons, NNTP clients often leave the connection open for a specified amount of time after they’ve done their work.

The behavior is acceptable, but it gets a bit weirder when clients start getting multiplexed by a single thread. In that scenario, Client B might see the article immediately if it’s in the same pool, or it might have to wait until Client A quits. In other words, some clients will see articles sooner than other clients. Once again, it’s acceptable behavior, but it’s a little odd.

Meanwhile, I’ve been reading Another Day in the Code Mines, which has a lot to say about threading. One of the thoughts that I came away with is that forking processes in Wombat would probably be better than spawning threads. That is, for each client that connects, instead of spawning a thread for it, spawn a process to handle it. Processes are heavier weight, but they provide a couple of advantages. First, they provide separate memory spaces for each client, so one client can’t mess with another. Second, if one client crashes, it doesn’t take down the entire server, thus making Wombat more robust. Forking for each client also happens to be a classic NNTP server design, and for good reason.

Unfortunately, as far as I can tell, Core Data doesn’t support this. Multiple contexts can exist because they all share one NSPersistentStoreCoordinator, which serializes all I/O to the data store file. Since SQLite often updates just parts of the file at a time, I can’t imagine that it would allow multiple processes to have the data store file open at once, especially for writing. The only way I see around this is to make the data store file its own server. Unfortunately, this reintroduces the single point of failure (if it goes down, all clients go down), and since NNTP is a fairly thin protocol over news, it would just end up being something pretty close to an NNTP server itself. Not a win.

In the end, it looks as though I’m just going to give each thread its own context, and then multiplex several sockets on each thread using kqueue. It may not be as robust as forking processes, but it should be possible to get some good performance out of it.

Filed in Core Data, Macintosh, Programming, Wombat | 5 responses so far
