Documentum: Enterprise use of the DFS Data Model

The Documentum Foundation Services (DFS) introduced developers to the ‘DFS Data Model’, a rich object model capable of representing complex repository objects and relationships during interactions with content services. For those with a DFC programming background, it can be a challenge to shift into the DFS paradigm, which focuses on service-oriented calls and relies on the data model to fully describe the requested transformations.

Based on my contact with customers through formal Service Requests as well as the EMC Support Forums, I see that many architects, when presented with this unfamiliar landscape, instantly assume that the best course of action is to design a custom model to shield other developers from the perceived complexity of the DFS data model. Although well intentioned, I believe this initial reaction to change can have serious implications that are not often considered or understood at the time of implementation.

While abstracting the construction of the DFS data model carries a great deal of value, I believe that replacing the DFS data model with a custom model should be done only with deliberate purpose and awareness. I will use this article to explore the motivations behind the development of these “simplified” models, their ramifications in a long-term SOA strategy, and how you can deliver convenience without making integration unnecessarily difficult or hindering the building-block nature of SOA.

The Initial Reaction

One of the first things noticed by DFC programmers is the amount of setup required with the DFS data model. Let’s go through one quick example to make this point concrete. Here is an example of using the DFC to link a pre-existing object to the ‘Temp’ cabinet:

[java]

// identify object (assumes an open IDfSession named "session")
IDfSysObject sysObject = (IDfSysObject) session.getObjectByPath("/dmadmin/test.doc");

// create link relationship
sysObject.link("/Temp");

// update object
sysObject.save();

[/java]

Here is the equivalent using the DFS SDK:

[java]

// identify object (assumes a repository name "repository" and an IObjectService "objectService")
DataObject dataObject = new DataObject(new ObjectIdentity(new ObjectPath("/dmadmin/test.doc"), repository), "dm_document");

// create link relationship
ObjectIdentity folderTarget = new ObjectIdentity(new ObjectPath("/Temp"), repository);
ReferenceRelationship referenceRelationship = new ReferenceRelationship();
referenceRelationship.setName(Relationship.RELATIONSHIP_FOLDER);
referenceRelationship.setTarget(folderTarget);
referenceRelationship.setTargetRole(Relationship.ROLE_PARENT);
referenceRelationship.setIntentModifier(RelationshipIntentModifier.ADD);
dataObject.getRelationships().add(referenceRelationship);

// update object
objectService.update(new DataPackage(dataObject), new OperationOptions());

[/java]

There are many DFC “one-liner” operations, such as IDfSysObject.link, that require several statements when expressed in the DFS data model, as illustrated above. This initial exposure to DFS programming often leads architects to the premature conclusion that the data model’s verbosity makes it too difficult to work with directly.

Based on these initial assumptions about the complexity of the DFS Data Model, it is understandable why an intermediate layer would be considered. However, I would assert that the DFS data model is quite simple to understand; it is not the use of the model that adds complexity, but the construction of the model that one should seek to simplify.

Design Options

We will first consider the consequences of creating a custom data model (Option A), then contrast that with preserving the DFS data model but offering convenience methods or Builders to simplify the construction of the model (Option B).

Option A. Creating a Custom ‘Simplified’ Data Model

One way to approach the problem is to create a new data model to represent objects and transformations.  In this solution, users of the new model are presented with a simplified set of objects, which only expose the immediate and short-term needs of the consumers.

As an example, let’s imagine we created a framework where a developer could use a simple type called MySimpleDFSObject to assign the identity and relationships of a document in a convenient manner. We would also want to insulate users from having to call the DFS service methods directly, since those require extra parameters (e.g., OperationOptions), so we have a MyServicesFactory that instantiates streamlined services which internally provide defaults for the most common options and methods.

[java]

// identify object
MySimpleDFSObject myObj = new MySimpleDFSObject("/dmadmin/test.doc", "dm_document");

// create link relationship
myObj.link("/Temp");

// update object
MyObjectService myService = MyServicesFactory.getMyObjectService(mySession);
myService.doSimpleSave(myObj);

[/java]

The example framework above may seem like an ideal solution, but while it succeeds at keeping DFC developers in their comfort zone and shielding them from SOA concepts, it also has serious consequences that need to be weighed.

PROS

  • Intuitive to the domain – because of the concise and simplified nature of these custom objects and their methods, the functionality has been tuned to the exact methods that will be used by the initial developer population (setting attributes, saving, etc.), and therefore the users’ conceptual model matches the API.

CONS

  • Simplicity is fleeting – version 1.0 of your custom API could be the most intuitive and minimalist object model ever created, but in two months your end users will be asking about BOCS content transfer options, then permission sets, then advanced structured queries, and so on. It is only a matter of time before your model needs to represent the equivalent functionality of the DFS data model.
  • Build versus Buy – as mentioned in the item above, users will demand more functionality, options, and services. Each enhancement to your API will need to be coded, comprehensively tested, and deployed. This requires significant effort, and therefore cost; it is smarter to shift these costs to EMC, which has gone to great lengths to maintain a stable API that is continually patched based on real-world use.
  • Lock-in to custom layer – since the custom model is fixed, any significant features or new services added by EMC to the platform will not be available to users until the custom layer is enhanced. For example, the upgrade to Documentum D6.5 added seven new services to the platform; with a custom layer, all of these would have been unavailable until the custom data model was enhanced.
  • SOA Integration potentially more difficult – the DFS object model is uniform across the entire spectrum of content services delivered on the platform, including the Object Service, Search Service, Workflow Service, CTS Transformation Service, RPM services, etc. This uniformity supports the orchestration goals and service composition advocated in an SOA architecture. DFS object types can be passed directly into custom DFS services and then immediately used as parameters to other DFS services, whereas custom types force service writers to constantly marshal data into the types understood by the target service, as sketched below.
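
To illustrate that marshalling burden, here is a minimal sketch of the kind of adapter a custom layer ends up requiring; MySimpleDFSObject and its getters are hypothetical, carried over from the example above:

[java]

// Hypothetical adapter: every custom service must translate the custom type
// back into a DFS DataObject before it can call any platform service.
// (MySimpleDFSObject and its getters are assumed for illustration.)
public static DataObject toDataObject(MySimpleDFSObject myObj) {
    ObjectIdentity identity = new ObjectIdentity(new ObjectPath(myObj.getPath()), myObj.getRepository());
    DataObject dataObject = new DataObject(identity, myObj.getType());
    // ...attributes, relationships, content, and permissions must be copied as well
    return dataObject;
}

[/java]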

Option B. Developing Methods that Simplify Manipulation of the DFS Data Model

The other approach to simplifying DFS development is to expose users directly to the data model, but provide utility methods or builders that bear the brunt of the construction work. For illustration, let’s imagine a simple utility class, DFSHelper, with static methods that assist in constructing the DFS data model for common tasks:

[java]

// identify object
DataObject dataObject = DFSHelper.constructDataObject("/dmadmin/test.doc", "dm_document");

// create link relationship (originally took 7 lines to represent)
DFSHelper.link(dataObject, repository, "/Temp");

// update object
IObjectService objService = DFSHelper.getObjectService();
objService.update(new DataPackage(dataObject), DFSHelper.getDefaultOperationOptions());

[/java]

What may not be obvious to readers who have not browsed the DFS Javadocs is that DataObject, DataPackage, and IObjectService are all part of the DFS SDK. Furthermore, the DFSHelper class does not preclude the developer from using the DFS data model directly rather than through the convenience methods it provides.
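
For completeness, here is one possible implementation of the hypothetical DFSHelper.link method. Its body is simply the relationship-building code from the earlier example packaged behind a convenience signature, so callers still work with a standard DFS DataObject:

[java]

// One possible implementation of the hypothetical DFSHelper.link:
// the same DFS relationship code shown earlier, wrapped for reuse.
public static void link(DataObject dataObject, String repository, String folderPath) {
    ObjectIdentity folderTarget = new ObjectIdentity(new ObjectPath(folderPath), repository);
    ReferenceRelationship referenceRelationship = new ReferenceRelationship();
    referenceRelationship.setName(Relationship.RELATIONSHIP_FOLDER);
    referenceRelationship.setTarget(folderTarget);
    referenceRelationship.setTargetRole(Relationship.ROLE_PARENT);
    referenceRelationship.setIntentModifier(RelationshipIntentModifier.ADD);
    dataObject.getRelationships().add(referenceRelationship);
}

[/java]
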
PROS

  • Simplifies exactly where needed – the DFS data model is full-featured, well documented, and relatively simple. Providing a utility class or builder to bundle common sequences of operations is a natural extension.
  • Stable data model – the DFS data model contains all the objects and options available to the core DFS services as well as the product-specific services, and is mature enough to have been through several releases already. End-user requests for enhanced functionality are likely to already be satisfied.
  • Take immediate advantage of new services – there are many core DFS services as well as those tied specifically to products like CTS, Records Manager, CenterStage, etc.  Each new release of Documentum or a product will bring new services that can instantly be leveraged using the DFS common data model.
  • Encourages ‘building block’ SOA – the nirvana of an SOA architecture is the ability to take disparate services scattered throughout an organization and orchestrate/aggregate these building blocks into a valuable business service.  Using a common object model is key to this initiative because without it, integration of several services is an exercise in writing adaptors and transformations to satisfy the input parameters and output results of each service.
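
To make the ‘building block’ point concrete, here is a brief sketch, assuming the standard DFS IObjectService and the DFSHelper utility from above: the DataPackage returned by one service call can be fed straight into the next call with no translation layer in between.

[java]

// Sketch: DFS types flow between service calls with no marshalling.
// Assumes an ObjectIdentitySet "identitySet" naming the objects of interest.
DataPackage fetched = objectService.get(identitySet, new OperationOptions());
for (DataObject obj : fetched.getDataObjects()) {
    DFSHelper.link(obj, repository, "/Temp");  // reuse the helper from above
}
objectService.update(fetched, new OperationOptions());

[/java]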

CONS

  • General data model – a general model such as the DFS data model must fulfill broad requirements, whereas a custom model can be tailored exactly to the end-user solution.

Summary

In this article we have gone over two different approaches to simplifying DFS development with respect to the data model. There are no absolutes in design, but I hope I have presented a strong argument for direct use of the DFS data model, and that the facts presented here will allow you to make an informed decision based on the long-term implications.