Leveraging the Power of EOF
Fetching Objects
Objects Involved in Fetching
There are many objects involved in retrieving data in an Enterprise Objects application. The ones you'll most commonly work with are introduced here.
EOFetchSpecification: A fetch specification provides a description of what data to retrieve from a data source. A fetch specification always includes the name of an entity, because in Enterprise Objects a single database fetch operation is always done from the perspective of a particular entity. A fetch specification usually includes a qualifier, which gives the specific criteria to look for when searching the database. A fetch specification can also include a sort ordering, which specifies that the result set should be sorted in a particular way.
EOQualifier: A qualifier is often included in a fetch specification to provide criteria for a particular database fetch. There are a number of different kinds of qualifiers, some of which map to SQL operators such as AND or OR. A qualifier is commonly compound; that is, a qualifier often consists of multiple qualifiers.
EOSortOrdering: A sort ordering is often included in a fetch specification to specify that the fetch's result set should be sorted in a particular way.
EOEditingContext: In Enterprise Objects, a fetch almost always takes place within an object workspace called an editing context.
Other objects are involved in a fetch, such as EODatabaseContext and EOAdaptorChannel, but you rarely need to interact with these objects programmatically.
Flow of Data During a Fetch
A fetch begins with the construction of a fetch specification. You can create fetch specifications programmatically, but they are also created by various components within a WebObjects application such as display groups. You can also use EOModeler to build fetch specifications.
Once a fetch specification is created, the fetch must be initiated. Again, you commonly do this programmatically by invoking objectsWithFetchSpecification on an EOEditingContext, but it is also often done automatically by objects such as display groups.
Once a fetch is initiated, the following sequence occurs to retrieve data from a data source:
- When objectsWithFetchSpecification is invoked on an EOEditingContext, that editing context forwards the invocation on to its parent object store. The parent object store again forwards the invocation on to its parent object store until the root object store is reached (the root object store is usually an instance of EOObjectStoreCoordinator).
- The root object store (EOObjectStoreCoordinator) determines which of its EOCooperatingObjectStores should service the fetch specification. It forwards the objectsWithFetchSpecification invocation to the determined cooperating object store to ask it to retrieve data from the data source.
How does an EOObjectStoreCoordinator determine which of its EOCooperatingObjectStores should service a particular fetch specification? Remember that within an EOModelGroup, entity names must be unique. Also remember that fetch specifications are entity-centric: every fetch specification is specified on the basis of a particular entity. So an object store coordinator simply looks through the entities registered with its cooperating object stores to match an entity name to a particular cooperating object store.
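The lookup described above can be sketched in plain Java. This is only an illustration of routing by unique entity name: the classes CooperatingStoreSketch and CoordinatorSketch are hypothetical stand-ins, not part of the Enterprise Objects API, and the real coordinator's internals may differ.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a cooperating object store: it knows which entity names it serves.
class CooperatingStoreSketch {
    final String name;
    final List<String> entityNames;

    CooperatingStoreSketch(String name, List<String> entityNames) {
        this.name = name;
        this.entityNames = entityNames;
    }
}

// Hypothetical stand-in for the coordinator: routes a fetch by its entity name.
class CoordinatorSketch {
    private final Map<String, CooperatingStoreSketch> storeByEntity = new LinkedHashMap<>();

    void addCooperatingStore(CooperatingStoreSketch store) {
        for (String entity : store.entityNames) {
            // Entity names must be unique, so a second registration is treated as an error.
            if (storeByEntity.putIfAbsent(entity, store) != null) {
                throw new IllegalArgumentException("Duplicate entity name: " + entity);
            }
        }
    }

    // Returns the store that should service a fetch for the given entity.
    CooperatingStoreSketch storeForEntity(String entityName) {
        CooperatingStoreSketch store = storeByEntity.get(entityName);
        if (store == null) {
            throw new IllegalStateException("No cooperating store for entity " + entityName);
        }
        return store;
    }
}
```

Because entity names are unique, a single map lookup is enough to pick the store; no qualifier or sort ordering needs to be inspected.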
When an EOCooperatingObjectStore receives a request to fetch data from a data source, it invokes objectsWithFetchSpecification on its EODatabaseContext object to do the work. When a database context receives this fetch request, it fetches a number of rows from the database, transforms them into enterprise objects (in most cases), and registers them as needed with the EOEditingContext that initiated the chain of objectsWithFetchSpecification invocations.
A database context uses an EODatabaseChannel to do all this. That object in turn uses an EOAdaptorChannel object to communicate directly with data sources and model-level objects---EOEntity, EOAttribute, EORelationship---that are necessary to perform the fetch.
Within EODatabaseContext, fetching occurs in two major steps:
- A database context uses a database channel to select the rows in the database for which objects are being fetched. It does this using the EODatabaseChannel method selectObjectsWithFetchSpecification, which takes as an argument the fetch specification that originated in the editing context.
- The database channel fetches each enterprise object, one at a time, as the database context repeatedly invokes on it the method fetchObject. This method uses state built up in the first step to get data from the object, create an enterprise object instance if necessary, and register the new instance with the fetch's editing context. The database channel uses the entity name specified in the fetch specification to know which enterprise object class to instantiate for every fetched object.
When an EODatabaseChannel receives an invocation of fetchObject from an EODatabaseContext, the following sequence of events occurs:
- The database channel uses an EOAdaptorChannel to retrieve a record for the requested entity. The record retrieved includes the record's primary key, class properties and client-side class properties, attributes used for locking, and any foreign keys used by the entity's relationships.
- The database channel then assigns an EOGlobalID to the row by invoking globalIDForRow.
- The database channel records a snapshot for the fetched row. If no snapshot is recorded yet for the global ID, the method recordSnapshotForGlobalID is invoked on EODatabase. If a snapshot is already recorded for the given global ID, the database context delegate method databaseContextShouldUpdateCurrentSnapshot is invoked instead. By default, the already recorded snapshot is not replaced with the new one, but you can change this by implementing the delegate method.
At this point in the fetch, if the fetch specification is set to refresh refetched objects, an ObjectsChangedInStoreNotification is posted to invalidate (refault) any existing enterprise object instances that correspond to this global ID.
- The database channel records whether the object was locked when it was selected. This would be the case only if you enable pessimistic locking (row-level locking) in your application.
- The database channel then checks with the editing context in which the fetch originated to see whether a copy of the object already exists in that editing context. It uses an EOEditingContext method that looks up an object by its global ID.
- If the editing context contains an enterprise object for the global ID and if that enterprise object is not a fault, the editing context returns the enterprise object. Otherwise, the editing context returns null.
- If the editing context doesn't return an enterprise object for the global ID, the database channel invokes the EOEntityClassDescription method createInstanceWithEditingContext, which determines the object's class based on the fetch specification's entity and instantiates an object of that class.
- The database channel invokes the method recordObject on the editing context to unique the newly created object.
- If the editing context has a fault for the global ID, the fault is cleared and initialization proceeds just as if an empty enterprise object had been created and registered.
- To initialize the object, the database channel invokes the method initializeObject on the editing context, which is passed down the object store hierarchy. If the editing context is nested, it passes the message to its parent editing context. If the parent editing context contains an object with a matching global ID, that object is used to initialize the object in the child editing context.
Otherwise, the initializeObject invocation is forwarded down to the editing context's EODatabaseContext, which initializes the new instance from the appropriate snapshot and creates faults for its relationships. initializeObject in EODatabaseContext sets the values of the newly instantiated enterprise object's properties using takeStoredValueForKey.
- The database channel invokes awakeFromFetch on the new enterprise object. Custom enterprise object classes can override this method to perform additional initialization after an object has been created from a database row and initialized with database values.
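The core of the sequence above (record a snapshot, then reuse or create the object for a global ID) can be sketched in plain Java. This is a simplified model, not the Enterprise Objects implementation: UniquingSketch is a hypothetical class, global IDs are plain strings, an "enterprise object" is just a map, and faults, locking, nested editing contexts, and awakeFromFetch are left out.

```java
import java.util.HashMap;
import java.util.Map;

class UniquingSketch {
    // Snapshot cache keyed by global ID (here a plain string such as "Listing:42").
    private final Map<String, Map<String, Object>> snapshots = new HashMap<>();
    // Objects registered in one editing context, keyed by global ID.
    private final Map<String, Map<String, Object>> registeredObjects = new HashMap<>();
    // Stand-in for the delegate's decision about replacing an existing snapshot.
    boolean updateExistingSnapshots = false;

    Map<String, Object> fetchObject(String globalID, Map<String, Object> row) {
        // Record a snapshot unless one exists and the "delegate" declines to replace it.
        if (!snapshots.containsKey(globalID) || updateExistingSnapshots) {
            snapshots.put(globalID, new HashMap<>(row));
        }
        // Uniquing: reuse the object already registered for this global ID.
        Map<String, Object> existing = registeredObjects.get(globalID);
        if (existing != null) {
            return existing;
        }
        // Otherwise create an instance, register it, and initialize it from the snapshot.
        Map<String, Object> created = new HashMap<>(snapshots.get(globalID));
        registeredObjects.put(globalID, created);
        return created;
    }

    Map<String, Object> snapshotFor(String globalID) {
        return snapshots.get(globalID);
    }
}
```

Fetching the same global ID twice returns the same instance, and with the default setting the first recorded snapshot is kept, which mirrors the default delegate behavior described above.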
Enterprise Object Initialization
The following sequence of events occurs when an object is fetched from the database:
- A database row is fetched as raw binary data.
- The values retrieved from that row are converted from their database-specific types to instances of standard value classes. A sample mapping of this conversion appears in Table 6-1. An application's EOModel specifies the mapping from external data types (database type) to internal data types (Java value type).
- Once the data has been converted to objects, these objects are put in an NSDictionary. The elements of the dictionary correspond to columns in the database table: Their names are the names of the attributes they map to in the EOModel and their values are the values retrieved from the database.
The dictionary provides a snapshot of the database row and is eventually used to initialize an enterprise object. This snapshot also participates in optimistic locking.
The dictionary contains an entry for all of a row's columns, but an enterprise object initialized from the dictionary contains only the attributes that are defined as class properties or client-side class properties in the entity's EOModel.
- A new enterprise object is instantiated by an EOEntityClassDescription object.
- The enterprise object is initialized from a row snapshot. Only attributes that are class properties or client-side class properties are included. Faults are created for any references to relationships defined in the EOModel.
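As a rough illustration of the last two steps, the following plain-Java sketch copies only the class-property entries from a row dictionary into a new object. RowInitializationSketch and its method are hypothetical names, and ordinary maps stand in for NSDictionary and for the enterprise object; relationship faults are not modeled.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

class RowInitializationSketch {
    // Copies only the class-property keys from a row snapshot into a new "object".
    static Map<String, Object> initializeFromSnapshot(Map<String, Object> snapshot,
                                                      Set<String> classProperties) {
        Map<String, Object> object = new LinkedHashMap<>();
        for (Map.Entry<String, Object> column : snapshot.entrySet()) {
            if (classProperties.contains(column.getKey())) {
                object.put(column.getKey(), column.getValue());
            }
        }
        return object;
    }
}
```

The snapshot still holds every column (including keys used only for locking or relationships), while the object exposes just the attributes marked as class properties.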
Faulting and Relationship Resolution
One of the most powerful and useful features of Enterprise Objects is that it automatically resolves the relationships defined in a model. It does this in part by delaying the actual retrieval of data—and delaying communicating with the database—until the data is needed, a feature of Enterprise Objects called faulting. Faulting happens in two phases: the creation of a placeholder object (a fault) for the data to be fetched, and fetching the data when it's needed (firing a fault).
When Enterprise Objects fetches an object, it examines the object's relationships as defined in the EOModel in which the object (entity) is defined. It then creates objects (faults) representing the destinations of the fetched object's relationships. For example, if you fetch a Listing object that has an agent relationship and an address relationship, faults are created for the destination of those relationships, which are an Agent object and an Address object. The Agent and Address objects are not fetched (their rows in the database are not accessed) until their data is actually needed.
Fetching is resource-intensive and often recursive—fetching the destination object of one enterprise object may require fetching that destination object's destination objects, and so on until all of the interrelated rows in the database have been retrieved. To avoid this waste of time and resources, the destination objects are created as stand-ins, which are referred to as faults.
There are two kinds of faults: single-object faults for to-one relationships and array faults for to-many relationships. A single-object fault is an enterprise object instance that is associated with a particular editing context, class description, and global ID. However, the enterprise object's data hasn't yet been fetched from the database—you can think of a single-object fault as a shell of an enterprise object.
Array faults are instances of NSMutableArray and are triggered to fire their faults by any request for a member object or for the number of objects in the array (the number of objects in a to-many relationship can't be determined without actually fetching them all). More specifically, array faults may start out as deferred faults, which are very small and cheap and contain little information. They may then become NSMutableArrays, which have more information about their state and contents. If an object in the relationship is then directly accessed (if an element in the array is accessed), the array fault fully fires, filling the array with enterprise objects.
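The two phases of faulting (creating a placeholder, then firing it on first access) resemble ordinary lazy loading. The sketch below shows that idea in plain Java under that assumption; FaultSketch is a hypothetical class, and the Supplier stands in for the round trip to the database. Real faults in Enterprise Objects are enterprise object instances or arrays, not wrappers like this one.

```java
import java.util.function.Supplier;

// Hypothetical single-object fault: a placeholder that fetches its data on first access.
class FaultSketch<T> {
    private final Supplier<T> fetcher; // stands in for the round trip to the database
    private T value;
    private boolean fired = false;

    FaultSketch(Supplier<T> fetcher) {
        this.fetcher = fetcher;
    }

    // True until the data has been fetched.
    boolean isFault() {
        return !fired;
    }

    // Firing the fault: the first access triggers the fetch; later accesses reuse the result.
    T get() {
        if (!fired) {
            value = fetcher.get();
            fired = true;
        }
        return value;
    }
}
```

Creating the placeholder costs nothing at the database; the fetch happens once, at the moment the data is first requested.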
Locking
When instrumenting concurrency in any application, you must take responsibility for locking certain objects to ensure thread integrity. Even if you don't instrument concurrency in an application, that application still uses multiple threads, because no Java application is truly single-threaded. On most Java virtual machines, the garbage collector and finalize methods each run in separate threads. To ensure that objects you manipulate aren't affected by one of these threads, you must lock the objects you manipulate directly. In any Enterprise Objects application, whether or not you explicitly instrument concurrency, it is your responsibility to lock the Enterprise Objects you use directly.
In versions of Enterprise Objects prior to WebObjects 5.2, you were expected to explicitly lock and unlock EOEditingContext objects. Most other Enterprise Objects locked themselves in any methods that changed state (which includes most methods) or did not support locking.
You usually interact only with the EOEditingContext lock. It is vital to properly lock and unlock EOEditingContext objects to ensure the integrity of their EOEnterpriseObject instances. An editing context (an object store) automatically locks its parent object store (usually an instance of EOObjectStoreCoordinator). Obtaining a lock on an EOObjectStoreCoordinator causes it to lock all of its registered cooperating object stores.
Since EOObjectStoreCoordinator is the highest-level object in the access layer stack, and since it automatically locks the object stores registered with it, obtaining a lock on an EOObjectStoreCoordinator is sufficient to manipulate any access-layer objects underneath it. In other words, objects in the access layer can be used only by the thread that has obtained a lock on the object store coordinator that is the highest-level object in that particular access layer stack—you must secure a lock on a particular object store coordinator before using any of the objects it manages.
Here are a few additional guidelines regarding locking in Enterprise Objects applications:
- In general, you should first secure the appropriate locks on EOObjectStoreCoordinators before posting notifications that other Enterprise Objects register to receive.
- Enterprise Object delegates do not need to worry about locking unless they attempt to access additional resources.
- Enterprise Objects uses more sophisticated locking objects than those built in to Java. These objects provide both you and Enterprise Objects with more control over the scope of critical regions within applications. This reduces contention and the possible scenarios that can generate deadlocks.
- Child (nested) editing contexts use their parent's lock.
- EOSharedEditingContext objects have a multireader, single-writer lock.
- Each EOObjectStore instance and each EOObjectStoreCoordinator instance may have its own lock.
- There is a global lock for loading EOModel files.
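A common way to make sure a lock is always released is to pair the lock call with a try/finally block. The sketch below shows that pattern in plain Java. LockableContextSketch is a hypothetical stand-in for an editing context, and it uses java.util.concurrent's ReentrantLock rather than the locking objects Enterprise Objects provides.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for an editing context that must be locked before use.
class LockableContextSketch {
    private final ReentrantLock lock = new ReentrantLock();
    private int changeCount = 0;

    void lock() { lock.lock(); }
    void unlock() { lock.unlock(); }
    boolean isLockedByCurrentThread() { return lock.isHeldByCurrentThread(); }

    // Refuses to work unless the calling thread holds the lock.
    void recordChange() {
        if (!lock.isHeldByCurrentThread()) {
            throw new IllegalStateException("Context used without holding its lock");
        }
        changeCount++;
    }

    int changeCount() { return changeCount; }
}

class LockingSketch {
    // Lock, do the work inside try, and always unlock in finally.
    static void withLock(LockableContextSketch context, Runnable work) {
        context.lock();
        try {
            work.run();
        } finally {
            context.unlock();
        }
    }
}
```

Putting the unlock in finally means the lock is released even if the work throws, so another thread is never left waiting on a lock that nobody will release.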
Data Freshness
When developing Enterprise Objects applications, one of the most common challenges is providing users with the freshest possible data while maintaining reasonable application performance. In a multiuser database environment, there is a risk of update conflicts occurring in which multiple users access and attempt to change the same set of data simultaneously. Providing fresher data to users can help alleviate update conflicts.
The first thing to understand about data freshness is when Enterprise Objects uses cached data and when it fetches data from a database. In most cases, if an editing context asks an enterprise object for its data, it receives cached data unless:
- the timestamp of the enterprise object's snapshot is older than the editing context's fetch timestamp
- the enterprise object has been invalidated
- the enterprise object is a fault (its data hasn't yet been fetched)
When multiple users access the same data source by sharing an application instance, they most often share data caches. This means that one user's data query may not actually invoke a fetch from the data source if the data requested has already been fetched by another user and so exists in the cache.
Fetch Timestamp
Each editing context in an application includes a fetch timestamp that it uses to tell its parent object store whether it wants cached data or fresh data from the database. An editing context prefers data that was fetched on or after the absolute time tracked by its fetch timestamp. (Ultimately, an editing context's parent object store decides when to perform database fetches. In the default case, an editing context's parent object store does honor its editing context's fetch timestamps, but this may not be the case for all object stores.)
When enterprise objects are requested from an editing context, the editing context sends this request along with a fetch timestamp to its parent object store. If the requested enterprise objects have already been fetched, the parent object store finds the snapshots of those enterprise objects and compares their fetch timestamps with the fetch timestamp sent by the editing context that requested the objects.
If the timestamp of the snapshots from which the requested enterprise objects were formed is older than the editing context's fetch timestamp, the snapshots are considered stale and fresh values for those enterprise objects are requested from the database. Otherwise, cached enterprise object values are used (these cached values are in the database context's snapshots).
Timestamp Lag
An editing context's fetch timestamp is set to the time of a fetch minus the default timestamp lag. The default lag is sixty minutes, so the default fetch timestamp on an application's editing contexts is one hour before a fetch occurred. So, if the timestamp of an enterprise object's snapshots in the database context is no older than the fetch timestamp of the object's editing context (by default, the snapshots are less than an hour old), a fetch returns the cached data in those snapshots rather than refetching from the database. However, if the snapshots in the database context are older than the editing context's fetch timestamp (by default, more than an hour old), the snapshots are discarded and data is refetched from the database.
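The comparison described above comes down to simple arithmetic on timestamps. The following plain-Java sketch works through it with times in milliseconds; FetchTimestampSketch and its methods are hypothetical names for illustration, not Enterprise Objects API.

```java
class FetchTimestampSketch {
    // Default lag of sixty minutes, expressed in milliseconds.
    static final long DEFAULT_LAG_MILLIS = 60L * 60L * 1000L;

    // Fetch timestamp = reference time minus the lag.
    static long fetchTimestamp(long referenceTimeMillis, long lagMillis) {
        return referenceTimeMillis - lagMillis;
    }

    // A snapshot is stale when it was recorded before the fetch timestamp.
    // Data fetched on or after the fetch timestamp is still acceptable.
    static boolean isStale(long snapshotTimeMillis, long fetchTimestampMillis) {
        return snapshotTimeMillis < fetchTimestampMillis;
    }
}
```

With the default lag, a snapshot recorded thirty minutes ago is served from the cache, while one recorded two hours ago is treated as stale. Shrinking the lag moves the fetch timestamp closer to the present and so causes more refetching.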
A common design pattern is to set the default timestamp lag to a smaller number to encourage more refetching from the database. You can change the default timestamp lag for all the editing contexts in an application using the static method on EOEditingContext called setDefaultFetchTimestampLag. In some cases, you may want to explicitly set the fetch timestamp of a particular editing context to encourage refetching of its data. You can do this by invoking setFetchTimestamp on an editing context.
Nested editing contexts use the fetch timestamp of their parent, so applications that make heavy use of nested editing contexts may have to take additional measures to ensure fresh data.
Other Mechanisms to Ensure Freshness
Enterprise Objects provides other mechanisms to ensure the freshness of data in enterprise objects. By invoking the method setRefreshesRefetchedObjects on a fetch specification, you force data values to be updated with fresh values from the data source when those objects are refetched.
As an alternative, consider adjusting the fetch timestamp (using setDefaultFetchTimestampLag or setFetchTimestamp, as described above) along with invoking refaultAllObjects or refreshAllObjects on an editing context to update the data in enterprise objects in that editing context.
By default, when you refetch data, Enterprise Objects does not update the data in enterprise object instances with the refetched data from the database. This is the default behavior that helps maintain an internally consistent view of the application's data. However, this is not always the behavior you want, especially if you are more concerned with ensuring the freshness of data than anything else (which is especially true in read-only applications).
To override the default behavior, invoke the method setRefreshesRefetchedObjects on an EOFetchSpecification. Then, when you refetch data, the data in enterprise object instances is refreshed with the refetched data from the database. (You can refetch the data for a particular enterprise object by passing it to the method refreshObject on its editing context. If you need to refresh a set of enterprise objects, use a fetch specification.)
By default, setRefreshesRefetchedObjects refreshes only the objects you are refetching. For example, if you refetch Employee objects, you don't also refetch the Employees' departments. However, if you invoke setPrefetchingRelationshipKeyPaths on the fetch specification, the refetch is also propagated to the relationships of the fetched objects that you specify in that invocation.
Using SQL When Needed
Although fetch specifications are the most common type of objects used to fetch data in Enterprise Objects applications, a lighter-weight mechanism is also provided that fetches raw rows based on an SQL expression you provide. This mechanism is provided as a method called rawRowsForSQL on the EOUtilities class. You pass to the method as arguments an editing context, a String naming the model that contains the entities on which to perform the fetch, and a valid SQL expression. The results are returned as raw rows rather than as full-fledged enterprise objects.
Raw Rows
Fetch specifications provide an option to fetch raw rows. When you use raw row fetching, database rows that are fetched are not automatically transformed into enterprise object instances. There are a number of reasons why you'd want to specify raw row fetching for a particular fetch specification. These include:
- reducing memory usage when fetching large data sets
- improving application performance when fetching large data sets
- reducing the general overhead of an application instance
You can specify raw row fetching for a particular fetch specification either in EOModeler's fetch specification builder or by invoking the method setFetchesRawRows on a fetch specification. You can more closely control which key paths are fetched into the raw rows using the method setRawRowKeyPaths on a fetch specification.
When you fetch raw rows, you lose many of the benefits of using full-fledged enterprise object instances such as the object graph, change notifications, and so forth. But many of the cases in which you need to fetch raw rows involve fetching large data sets that don't need the benefits of the object graph, so this is an acceptable trade-off in light of the performance benefits of raw row fetching.
In addition, you can always instantiate an enterprise object from a raw row using the method faultForRawRow on an EOEditingContext.
Fetch Efficiency (Prefetching and Batch Fetching)
When Enterprise Objects fetches an enterprise object, it creates faults for the object's relationships. Each time a fault is fired, a round trip is made to the database to retrieve the fault's data. You can batch together fault firing as described in "Batch Faulting" (page 67) to reduce the number of round trips to the database. However, you can go even further in reducing the number of round trips to the database by prefetching all the objects in a particular relationship. Prefetching lets you anticipate that some of an enterprise object's relationships will be needed and preload them, which can improve performance.
For example, consider a Listing entity that has an agent relationship. When you fetch twenty Listing objects, faults are created for each Listing's agent relationship. When the data in the agent relationship is accessed for a particular Listing object, a fault is fired to retrieve the relationship's data, which invokes a round trip to the database. Implementing batch faulting reduces the number of round trips but you can further reduce the number of round trips by simply prefetching all of the agent data in the database. With prefetching, when a fetch is performed for a particular entity, the objects in the relationships specified by the prefetching key paths (agent in this case) are immediately fetched.
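To make the difference concrete, the following plain-Java sketch counts round trips for the Listing example. It is a back-of-the-envelope model with hypothetical names (PrefetchCostSketch), and it rests on two simplifying assumptions: every agent fault is eventually fired, with listings that share an agent sharing one fault, and prefetching a single relationship costs one additional fetch.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class PrefetchCostSketch {
    // One fetch for the listings, then one round trip per distinct agent fault that fires.
    static int roundTripsWithFaulting(List<Integer> agentIdPerListing) {
        Set<Integer> distinctAgents = new HashSet<>(agentIdPerListing);
        return 1 + distinctAgents.size();
    }

    // One fetch for the listings plus (assumed) one fetch for the prefetched relationship.
    static int roundTripsWithPrefetching(List<Integer> agentIdPerListing) {
        return agentIdPerListing.isEmpty() ? 1 : 2;
    }
}
```

Under these assumptions, twenty listings with twenty different agents cost twenty-one round trips when each fault fires on its own, and two round trips with prefetching.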
You instrument your application for prefetching by configuring certain fetch specifications for prefetching. You can use the Prefetching pane of EOModeler's fetch specification builder to configure it for a particular fetch specification or you can invoke setPrefetchingRelationshipKeyPaths on a fetch specification, which takes an array of strings representing the relationships to prefetch.
There are a few guidelines to consider when using prefetching. First, if memory usage is an issue for your application rather than database performance, don't use prefetching as it consumes more memory. In fact, prefetching can consume an inordinate amount of memory depending on the size of the data set, so it's probably more appropriate to prefetch only those relationships that have a small number of destination objects.
Second, don't use prefetching on a fetch specification that uses a fetch limit. The prefetching hint ignores the fetch limit.
Third, don't use prefetching when performing multiple queries that return the same records. The performance benefits of prefetching are negated by the overhead of re-creating enterprise objects of the same rows of data multiple times.
Many applications have read-only entities that contain static data such as lists of states and countries, building names, or department names. Since many users of an application use the data in these entities, it makes sense to cache the data in memory to reduce the number of fetches to the database. In Enterprise Objects, you can cache an entire table in memory, thereby eliminating unnecessary fetches for the same static data by multiple users.
To enable entity caching for an entity, select the Cache In Memory option in EOModeler's advanced entity inspector for a particular entity, as shown in Figure 6-7. You can enable this option programmatically using the method setCachesObjects on EOEntity.
When entity caching is enabled for a particular entity, the first fetch of that entity's table causes the whole table to be fetched into memory. Clearly, this option is appropriate only for tables with a small number of rows.
An entity's cache of objects is maintained by an EODatabaseContext. If you provide a separate access layer stack for each user as described in "Providing Separate Stacks" (page 51), each session has its own EODatabaseContext, so bear in mind that entity caching in this scenario may consume a lot of memory.
Another advanced faulting feature is batch faulting. When a fault is fired, its data is fetched from the database. However, firing one fault has no effect on other faults—firing one fault just fetches the object or objects for the one fault. By batching fault firings together, you can more efficiently use the round trip to the database that is necessary when a single fault is fired.
For example, given an array of Employee enterprise objects, you can fetch all of the objects that are the destination of their department relationship with one round trip to the server. Without batch faulting, a round trip to the database is made to resolve each Employee's department relationship.
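The effect of a batch size can be sketched in plain Java by grouping pending faults into batches, where each batch stands for one round trip. BatchFaultingSketch is a hypothetical helper for illustration only; it does not reflect how Enterprise Objects chooses which faults to put in a batch.

```java
import java.util.ArrayList;
import java.util.List;

class BatchFaultingSketch {
    // Splits pending faults into groups of at most batchSize; each group is one round trip.
    static <T> List<List<T>> batches(List<T> faults, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be at least 1");
        }
        List<List<T>> result = new ArrayList<>();
        for (int start = 0; start < faults.size(); start += batchSize) {
            int end = Math.min(start + batchSize, faults.size());
            result.add(new ArrayList<>(faults.subList(start, end)));
        }
        return result;
    }
}
```

With 25 pending department faults, a batch size of 1 corresponds to 25 round trips, while a batch size of 10 needs only 3 (two full batches and one of 5).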
There are a number of ways to implement batch faulting. You can configure it at three levels: on entities, on individual relationships, and explicitly in code for particular objects.
You configure batch faulting for an entity in EOModeler in an entity's advanced inspector, as shown in Figure 6-5. The integer you specify in the Batch Faulting Size field specifies the number of faults to fire the first time a fault is fired for any relationship in that entity. You can set this size programmatically using the method in EOEntity called setMaxNumberOfInstancesToBatchFetch.
You can also specify a batch faulting size for a particular relationship. The easiest and most common way to do this is in EOModeler using the advanced relationship inspector, which is shown in Figure 6-6. The batch size specifies the number of faults to fire when the first fault in the relationship is fired.
Finally, you can take more precise control of batch faulting by explicitly batching together faults for particular objects. When you specify the batch size in EOModeler for all of an entity's relationships or for particular relationships, you don't actually control which faults are fired. The method batchFetchRelationship in EODatabaseContext allows you to batch fetch all of the faults in a particular relationship. The method databaseContextShouldFetchArrayFault in EODatabaseContext.Delegate allows you to turn batch faulting on and off arbitrarily. See the API reference for EODatabaseContext and EODatabaseContext.Delegate for more details.