Thursday, August 19, 2010

Core Data Starting Data

Yesterday, I tweeted asking for advice about the best way to provide a starter set of data in a Core Data-based application. The app I'm working on had started with just a small set of starter data, so the first time the app was run, I simply pulled in that starter data from a plist in the background and created managed objects based on the contents and all was good. The user never noticed it and the load was complete before they needed to do anything with the data.

Then the data set got quite a bit larger and that solution became too slow. So, I asked the Twitterverse for opinions about the best way to provide a large amount of starting data in a Core Data app. What I was hoping to find out was whether you could include a persistent store in your application bundle and use that instead of creating a new persistent store the first time the app was launched.

The answer came back from lots and lots of people that you could, indeed, copy an existing persistent store from your app bundle. You could even create the persistent store using a Mac Cocoa Core Data application as long as it used the same xcdatamodel file as your iPhone app.

Before I go on, I want to thank all the people who responded with suggestions and advice. A special thanks to Dan Pasco from the excellent dev shop Black Pixel who gave very substantive assistance. With the help of the iOS dev community, it took me about 15 minutes to get this running in one of my current apps. Several people have asked for the details over Twitter. 140 characters isn't going to cut it for this, but here's what I did.

First, I created a new Mac / Cocoa project in Xcode. I used the regular Cocoa Application template, making sure to use Core Data. Several people also suggested that you could use a Document-Based Cocoa Application using Core Data, which would allow you to save the persistent store anywhere you wanted. I created the Xcode project in a subfolder of my main project folder and added the data model file and all the managed object classes from my main project to the new project using project-relative references, making sure NOT to copy the files into the new project folder. I want to use the same files as my original project so any changes made in one are reflected in the other.

If my starting data was changing frequently, I'd probably make this new project a dependency of my main project and add a copy files build phase that would copy the persistent store into the main project's Resources folder, but my data isn't changing very often, so I'm just doing it manually. You definitely can automate the task within Xcode, and I heard from several people who have done so.

In the new Cocoa project, the first thing to do is modify the persistentStoreCoordinator method of the app delegate so it uses a persistent store with the same name as your iOS app. This is the line of code you need to modify:

NSURL *url = [NSURL fileURLWithPath: [applicationSupportDirectory stringByAppendingPathComponent: @"My_App_Name.sqlite"]];


Make sure you add the .sqlite extension to the filename. By default, the Cocoa Application template uses an XML datastore and no file extension. The filename you enter here is used exactly, so if you want a file extension, you have to specify it.

Since the Cocoa Application project defaults to an XML-based persistent store, you also need to change the Cocoa App's store type to SQLite. That's a few lines later. Look for this line:

if (![persistentStoreCoordinator addPersistentStoreWithType:NSXMLStoreType
                                              configuration:nil
                                                        URL:url
                                                    options:nil
                                                      error:&error]) {


Change NSXMLStoreType to NSSQLiteStoreType.

Optionally, you can also change the applicationSupportDirectory method to return a different location if you want to make the persistent store easier to find. By default, it's going to go in
~/Library/Application Support/[Cocoa App Name]/
which can be a bit of a drag to navigate to.
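
Once the importer has run and you've tracked the file down, one quick way to confirm the store really was written as SQLite rather than XML is to look at its first bytes: SQLite database files begin with the 16-byte string "SQLite format 3" followed by a NUL. Here's a minimal sketch of that check in plain C, which compiles unchanged in an Objective-C file. The function name looks_like_sqlite is made up for illustration and isn't part of any template or framework.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of the original project).
   Returns 1 if the file at path begins with the SQLite 3 header,
   0 if it does not or if the file cannot be read. */
static int looks_like_sqlite(const char *path)
{
    /* 15 visible characters plus the terminating NUL = 16 bytes. */
    static const char magic[16] = "SQLite format 3";
    char header[16];

    FILE *file = fopen(path, "rb");
    if (file == NULL)
        return 0;

    size_t got = fread(header, 1, sizeof header, file);
    fclose(file);

    return got == sizeof header && memcmp(header, magic, sizeof magic) == 0;
}
```

You'd call it with the full path to the generated store. A return value of 0 for a file that exists suggests the store type is probably still set to XML.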

Next, you need to do your data import. This code will inherently be application-specific and will depend on your data model and the data you need to import. Here's a simple pseudo-method for parsing a tab-delimited text file to give an idea of what this might look like. This method creates an NSAutoreleasePool and a context so it can be launched in a thread if you desire. You can also call it directly - it won't hurt anything.


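- (void)loadInitialData
{
    // A private pool and context make this method safe to run on a background thread.
    NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
    NSManagedObjectContext *context = [[NSManagedObjectContext alloc] init];
    [context setPersistentStoreCoordinator:self.persistentStoreCoordinator];

    NSString *path = [[NSBundle mainBundle] pathForResource:@"foo" ofType:@"txt"];
    FILE *file = (path != nil) ? fopen([path UTF8String], "r") : NULL;

    if (file != NULL)
    {
        char buffer[1000];
        while (fgets(buffer, sizeof(buffer), file) != NULL)
        {
            NSString *string = [[NSString alloc] initWithUTF8String:buffer];

            // fgets keeps the line ending, so trim it before splitting on tabs.
            NSString *line = [string stringByTrimmingCharactersInSet:[NSCharacterSet newlineCharacterSet]];
            NSArray *parts = [line componentsSeparatedByString:@"\t"];

            // Placeholder for your own code. It should create a MyManagedObjectClass
            // in the context above, or the save below won't include it.
            [self methodToCreateObjectFromArray:parts];
            [string release];
        }
        fclose(file);
        NSLog(@"Done initial load");
    }
    else
    {
        NSLog(@"Could not open the starting data file");
    }

    NSError *error = nil;
    if (![context save:&error])
        NSLog(@"Error saving: %@", error);

    [context release];
    [pool drain];
}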
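
One detail any line-based importer has to handle is that fgets leaves the line ending in the buffer, and that empty columns need to survive the split. Here's a rough sketch of just the parsing step in plain C, which compiles unchanged inside an Objective-C file. The helper names strip_newline and split_tabs are hypothetical and don't come from the project above.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: cuts the line at the first "\r" or "\n",
   removing the line ending that fgets leaves in the buffer. */
static void strip_newline(char *line)
{
    line[strcspn(line, "\r\n")] = '\0';
}

/* Hypothetical helper: splits line in place on tab characters and
   stores up to max_fields pointers in fields. Empty fields are kept
   (strtok would silently skip them). Any fields beyond max_fields
   are dropped. Returns the number of fields stored. */
static int split_tabs(char *line, char *fields[], int max_fields)
{
    int count = 0;
    char *start = line;

    while (count < max_fields) {
        fields[count++] = start;

        char *tab = strchr(start, '\t');
        if (tab == NULL)
            break;              /* last field on the line */

        *tab = '\0';            /* terminate this field */
        start = tab + 1;        /* next field begins after the tab */
    }
    return count;
}
```

In an importer, each line read by fgets would go through strip_newline and then split_tabs, and the resulting fields would be used to populate one managed object.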


You can add the delegate method applicationDidFinishLaunching: to your app delegate and put your code there. You don't even really need to worry about how long it takes - there's no watchdog process on the Mac that kills your app if it isn't responding to events after a period of time. If you prefer, you can have your data import functionality working in the background, though since the app does nothing else, there's no real benefit except the fact that it's the "right" way to code an application.

- (void)applicationDidFinishLaunching:(NSNotification *)aNotification
{
    // Do the import directly on the main thread:
    [self loadInitialData];

    // Or launch it in the background instead (use one or the other, not both):
    // [self performSelectorInBackground:@selector(loadInitialData) withObject:nil];
}


Now, run your app. When it's done importing, you can just copy the persistent store file into your iOS app's Xcode project in the Resources group. When you build your app, this file will automatically get copied into your application's bundle. Now, you just need to modify the app delegate of your iOS application to use this persistent store instead of creating a new, empty persistent store the first time the app is run.

To do that, you simply check for the existence of the persistent store in your application's /Documents folder, and if it's not there, you copy it from the application bundle to the /Documents folder before creating the persistent store. In the app delegate, the persistentStoreCoordinator method should look something like this:

- (NSPersistentStoreCoordinator *)persistentStoreCoordinator
{
    @synchronized (self)
    {
        if (persistentStoreCoordinator != nil)
            return persistentStoreCoordinator;

        NSString *defaultStorePath = [[NSBundle bundleForClass:[self class]] pathForResource:@"My_App_Name" ofType:@"sqlite"];
        NSString *storePath = [[self applicationDocumentsDirectory] stringByAppendingPathComponent:@"My_App_Name.sqlite"];

        // First launch: there's no store in Documents yet, so copy the bundled one into place.
        NSError *error = nil;
        NSFileManager *fileManager = [NSFileManager defaultManager];
        if (![fileManager fileExistsAtPath:storePath])
        {
            if (defaultStorePath == nil)
                NSLog(@"No default DB found in the application bundle");
            else if ([fileManager copyItemAtPath:defaultStorePath toPath:storePath error:&error])
                NSLog(@"Copied starting data to %@", storePath);
            else
                NSLog(@"Error copying default DB to %@ (%@)", storePath, error);
        }

        NSURL *storeURL = [NSURL fileURLWithPath:storePath];

        persistentStoreCoordinator = [[NSPersistentStoreCoordinator alloc] initWithManagedObjectModel:[self managedObjectModel]];

        // Allow automatic lightweight migration if the model has changed.
        NSDictionary *options = [NSDictionary dictionaryWithObjectsAndKeys:
                                 [NSNumber numberWithBool:YES], NSMigratePersistentStoresAutomaticallyOption,
                                 [NSNumber numberWithBool:YES], NSInferMappingModelAutomaticallyOption,
                                 nil];

        if (![persistentStoreCoordinator addPersistentStoreWithType:NSSQLiteStoreType configuration:nil URL:storeURL options:options error:&error])
        {
            NSLog(@"Unresolved error %@, %@", error, [error userInfo]);
            abort();
        }

        return persistentStoreCoordinator;
    }
}


Et voilà! If you run the app for the first time on a device, or run it on the simulator after resetting content and settings, you should start with the data that was loaded in your Cocoa application.



17 comments:

JB said...

been using this exact technique for a while now... works great.

Lee Probert said...

I need to do this for something simple I'm working on. I have hundreds of objects I am loading into a persistent store when the app is loaded for the first time. It takes a while loading from a plist. Thanks for this solution. Now all I need to do is work out entity relationships.

MyID.config.php said...

Hey Jeff, great post. Sorry I missed your tweet in time to chime in.

I've used a similar technique, with a twist. One of the features of my project was that the app would periodically update a subset of the contents of its persistent store from a server ... you can see where this is going.

I used the iOS app itself, running in the Simulator, to generate the database that would later ship with it. I created a special build configuration that set a flag that, when true, would cause the app to create a new database and update with all the (relevant) server data, rather than just a subset. After this (comparatively lengthy) import process, the app would proceed as per usual.

(There are a few ways to trigger that state, perhaps better ones than that, but you catch my drift.)

This also gave me the chance to verify my actual import code and the import process UX, which revealed a few bugs and even surfaced a gnarly memory / performance issue in a way that was easy to find much earlier than it would have come up naturally. (Normally I'm against premature optimization, but this was more of a bug than fine-tuning.)

A risk / downside of this approach is that I had to be *very* careful that the app would not save extra / unwanted objects to the persistent store after the desired import was complete. There are a lot of ways to approach that challenge, but they're likely to be app-specific and TBH I don't recall offhand how I prevented that situation in my case.

A safer approach may be a sort of hybrid that uses the import-oriented classes from the iOS project in the Cocoa app, but in my case it was helpful to be able to test the actual import UX (status indicators, etc) with this large set of data. YMMV.

If you're dealing with large data sets, one thing to keep your eye on is the size of the final generated database. You may benefit from compression, as the .ipa is zipped, but IIRC it's uncompressed once installed to the device. If you're copying this file to the Documents or Library directory on first launch, your users will have two copies on their device.

.ipa size is less of an issue now that we have a 20MB download cap to play with, but that still allows for a pretty big (compressed) database. The larger the database, the larger the chance your users will lack the space for your copied file.

Bill Kunz said...

Sorry gang, personal OpenID fail over here. That was my long comment above. Serves me right for not previewing before the coffee kicked in ...

Jeff LaMarche said...

Bill:

Yeah, I've actually been toying with the idea of storing the database as a zip file and using TWZipArchive to unzip and expand it to the Documents folder. Not going to worry about that until closer to release, though. :)

Rafif Yalda said...

I use this exact technique but I use an ifdef to determine whether I'm in "import mode" or not - it will overwrite the .sqlite if so. No need to reset the simulator.

Thomas said...

Jeff,

That's some great info.

I'm up against a similar situation, but it's a little more complicated. The solution you articulated works great for a new application, but not so great in an upgrade. I have an app that I'm looking to add a feature to, and this feature requires the creation of about 5000 new rows of canned data in a new table/entity, all while preserving the existing data in the other tables.

Any suggestions?

tesseractor said...

I saw Marcus Zarra give a talk where he mentioned this technique. Something else to note is that you can have more than one persistent store for your coordinator, so you can have your initial data in a read-only store. Thomas, this also means you can add a new store in your update and still have the existing data intact.

ed said...

Really... it works great. Thanks a lot.

Greg Combs said...

Up to this point I've been using Bill's approach since I use 100% static data on the device and my store runs about 22 megs. After a few revisions it becomes an exceeding pain in the ass to edit, update, etc. On occasion I've manually massaged the data store in Lita, but it's dangerous. I've come to the conclusion that I'm going to build a Cocoa editor, not just an importer. The concept is similar, but I'll at least get to use NSArrayController for tableviews and more robust IB tools.

adrian said...

I don't usually post comments on blogs but this one really helped me out. A lot! So thanks!

adrian said...

... at the same time it doesn't work at all. I keep getting EXC_BAD_ACCESS messages.

skajam66 said...

This is a great post. It gives me an approach to start from. Thanks Jeff. I have a further question (request for advice, actually) that pertains to on-going updates to the store once in production.
As background I'm working on an app similar to a music sequencer. I'm using a commercial third-party application to author the sequences - this spits out XML which I then parse into the app using libxml2 and through Core Data into the sqlite store. It all works fine although I am dismayed as to the amount of code I have to write just to manage the data :-(
For ongoing updates, I could distribute new XML files and have the parsing code sitting in the app proper to pull in the XML files or I could load up a new database at homebase and distribute that. Does anyone have an opinion as to pros and cons of each approach?

skajam66 said...

This is a great post - it gives me a great starting point. Thanks Jeff. I have a request for advice on how to handle on-going updates. As background, I am working on something similar to a music sequencer. I use a third-party tool to author the sequences. That tool spits out XML which I then parse using libxml2 and into sqlite via Core Data. It all works fine but I am dismayed at the amount of code I have to write just to manage the data - it's not trivial - I have to load leaf nodes first then construct an entire object graph with established relationships before committing to the store. Anyway, I have it working.
What is the recommendation for on-going updates? I could distribute new XML files and have the parsing code in the distributed app or I could load a new database centrally and distribute that. What comments are there on pros and cons of each approach?

ankitthakur said...

Hey Jeff, great post. Thanks.
I have a question: if our application has more than one entity and we want only one of them to be preloaded with data, how will the NSManagedObjectModel help us, given that we want the data for the other entities to be created at runtime?