Chapter 4. Services

WHAT'S IN THIS CHAPTER?

  • Introducing the Content Repository Services

  • Understanding the existing Content, Control, and Collaboration Services

  • Using and extending services

  • Developing your own services

  • Introducing Content Management Interoperability Services

  • Understanding CMIS concepts

  • Using CMIS with Alfresco

The Alfresco server provides capabilities for capturing, managing, and collaborating on content using services. These services form the basis of the functionality provided by any Alfresco implementation.

Services address the core use cases for content management applications, including the logical organization of content, file management, version control, and security. In addition, services support the control of content through workflow and process management, and social and collaborative applications.

Alfresco exposes services at various levels, including Java, scripting, REST, and Web services, as well as through client interfaces such as Explorer and Share. Some services are considered internal; others are public. For example, the Java-level services are internal services. The majority of these are accessible through other public interfaces, including the public APIs, client applications, and CMIS.

The services are divided into two main categories: Content Repository Services and Content Application Services.

CONTENT REPOSITORY SERVICES

The Content Repository Services, written in Java, are the fundamental services for working with content. The following describes the out-of-the-box services for organizing and managing content, controlling versions, recording changes and updates, enforcing security, modeling content types, and searching for information in the repository.

File and Folder Management

Services are provided to support the management of the nodes used to model files and folders in the repository. The services provide methods to create and update nodes and define the relationships between them.

The operations supported by the File Folders service include:

  • Create—Creates nodes, sets property values, creates associations between nodes.

  • Read—Reads node properties and content, reads and navigates node associations.

  • Update—Updates the properties and content of nodes.

  • Delete—Deletes nodes. If the archive store is enabled, the node is not deleted outright but is moved from its current store to the archive node store, from which it can later be restored or purged.
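The delete behavior with an archive store can be pictured with a small in-memory model. This is only a sketch under stated assumptions: the class ArchiveStoreSketch and its methods are hypothetical names invented for illustration and are not part of the Alfresco API; the real services operate on node stores in the repository.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model; not the Alfresco node or archive store API.
final class ArchiveStoreSketch {
    private final Map<String, String> liveStore = new HashMap<>();
    private final Map<String, String> archiveStore = new HashMap<>();

    void create(String nodeId, String name) {
        liveStore.put(nodeId, name);
    }

    /** Deleting moves the node to the archive store instead of destroying it. */
    boolean delete(String nodeId) {
        String name = liveStore.remove(nodeId);
        if (name == null) {
            return false; // nothing to delete
        }
        archiveStore.put(nodeId, name);
        return true;
    }

    /** Restoring moves an archived node back to the live store. */
    boolean restore(String nodeId) {
        String name = archiveStore.remove(nodeId);
        if (name == null) {
            return false; // not in the archive (never deleted, or already purged)
        }
        liveStore.put(nodeId, name);
        return true;
    }

    /** Purging removes an archived node permanently. */
    boolean purge(String nodeId) {
        return archiveStore.remove(nodeId) != null;
    }

    boolean isLive(String nodeId) {
        return liveStore.containsKey(nodeId);
    }

    boolean isArchived(String nodeId) {
        return archiveStore.containsKey(nodeId);
    }
}
```

In this model a deleted node can still be restored until it is purged; after a purge, restore returns false.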

Versioning and Check Out/Check In

Alfresco's version management is designed to manage versions of individual content nodes. To enable the versioning behavior, the versionable aspect must be applied to the node.

The Versioning services include the following capabilities:

  • Create Version—Creates a new version of the referenced node, which is placed at the end of the appropriate version history. If the node has no version history, one is created and this version is considered the initial version.

  • Version History—Gets the version history that relates to the referenced node.

  • Get Current Version—Gets the current version for a referenced node.

  • Revert—Reverts the state of a referenced node to that of a previous version.

  • Restore Version—Restores a previously deleted node from a version in its version history.

  • Delete Version History—Deletes the version history for a versioned node.

Each version has a version number, which is allocated on a sequential basis and follows a similar strategy to Concurrent Versions System (CVS) version numbering. Generally, this version number is only used internally; the version label is used publicly to identify the version.

The version label is calculated from the version number and gives, within the scope of the version history, a unique label for the version. This label is placed in the versionable aspect to indicate the related current version for a node.

You can customize the generation of the version label by creating a version label policy behavior and registering it in place of the default version label policy. This gives applications flexibility to determine their own version-labeling policies. The default version label policy uses the 1.1, 1.2 style of progressive version labels, moving to 2.0 if the version is considered a major change. This is indicated in the version metadata to which the version label policy has access.
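As a rough illustration of the default labeling scheme described above, the following sketch computes the next label from the current one. It is not Alfresco's version label policy interface: the class and method names are hypothetical, and the choice of 1.0 as the label for the initial version is an assumption made for the example.

```java
// Hypothetical sketch; not Alfresco's actual version label policy API.
final class VersionLabelSketch {

    /**
     * Computes the next label in the "major.minor" style.
     * A null current label means the node has no version history yet.
     */
    static String nextLabel(String currentLabel, boolean majorChange) {
        if (currentLabel == null) {
            // Assumption for this sketch: the initial version is labeled 1.0.
            return "1.0";
        }
        String[] parts = currentLabel.split("\\.");
        int major = Integer.parseInt(parts[0]);
        int minor = Integer.parseInt(parts[1]);
        if (majorChange) {
            // A major change bumps the major number and resets the minor number.
            return (major + 1) + ".0";
        }
        return major + "." + (minor + 1);
    }
}
```

With this sketch, nextLabel("1.1", false) yields 1.2 and nextLabel("1.2", true) yields 2.0. A custom policy would replace this calculation with its own rule while still reading the major or minor indication from the version metadata.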

Check Out and Check In services are provided to control updates to documents and prevent unwanted overwrites. When you check out a document, the document is locked, preventing other users from writing changes to it. Alfresco uses an exclusive locking model that allows only one user to have a particular document checked out (locked) at any time. The user or application can unlock the document by either checking in the document or canceling the checkout.

You can use Check Out and Check In with or without versioning. If versioning is not enabled on a node (the versionable aspect is not present on the node), the check in overwrites the existing node and releases the lock unless the keepCheckedOut flag is used. With versioning enabled on the node, a new version is always created.
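The exclusive locking model can be sketched as a map from document to lock owner. This is a simplified, hypothetical model (the class CheckOutSketch and its methods are invented for illustration and are not the Alfresco Check Out/Check In service); it ignores working copies and versioning and only shows who may release a lock and what the keepCheckedOut flag changes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of exclusive check-out locks; not the Alfresco API.
final class CheckOutSketch {
    // Maps a document id to the user who currently has it checked out.
    private final Map<String, String> lockOwners = new HashMap<>();

    /** Succeeds only if nobody else holds the lock. */
    boolean checkOut(String documentId, String user) {
        // putIfAbsent returns null only when no lock existed before the call.
        return lockOwners.putIfAbsent(documentId, user) == null;
    }

    /** Only the lock owner can check in; the lock is kept if keepCheckedOut is set. */
    boolean checkIn(String documentId, String user, boolean keepCheckedOut) {
        if (!user.equals(lockOwners.get(documentId))) {
            return false;
        }
        if (!keepCheckedOut) {
            lockOwners.remove(documentId);
        }
        return true;
    }

    /** Canceling releases the lock without applying changes. */
    boolean cancelCheckOut(String documentId, String user) {
        // Removes the entry only if it is currently owned by this user.
        return lockOwners.remove(documentId, user);
    }

    boolean isLocked(String documentId) {
        return lockOwners.containsKey(documentId);
    }
}
```

A second checkOut on the same document by another user returns false until the owner checks in without keepCheckedOut or cancels the checkout.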

Auditing

The Audit service provides a configurable record of actions and events. The information is stored in a database in a form that is designed to be simple for third-party reporting tools to consume.

The capabilities provided by the Audit service include:

  • Auditing of virtually any system event (user- and system-triggered)

  • Metadata change auditing, including before and after values

  • Audit data stored in database-indexed tables according to the type of data

Authentication, Authorities, and Permissions

A number of services are provided to support creating and updating user and group (authorities) information, authenticating users, and defining the actions that users can perform against nodes.

These services are:

  • Authority Service—Provides capabilities to support creating and deleting authorities, querying authorities, and managing zones

  • Permission Service—Provides methods for reading, setting and deleting permissions for nodes, querying permissions, and evaluating permissions for a user against nodes

  • Person Service—Provides methods for looking up people from user names, and for creating, deleting, and altering user information

See Chapter 6 for more detail on these services.

Modeling

The content repository uses the repository Data Dictionary to manage definitions for content models such as folders, files, and metadata schemes. The content models are registered with the content repository to constrain the structure of nodes and the relationships between them, as well as to constrain property values. A Dictionary service is provided to allow access to the content models and provide a range of methods for querying and inspecting the model definitions.

See Chapter 5 for detailed information on working with content models.

Search

The Search service provides methods for querying the repository and returning a filtered collection of nodes based on the user's permission. A number of search languages are available, including the following:

  • Lucene—Based on Apache Lucene, provides any combination of metadata, path, and full-text search using the Lucene query syntax. This includes the ability to search for terms and/or phrases in properties and content, paths, types, aspects, and ranges.

  • XPath—Supports simple path-based contextual navigation against the node service based on version 1 of the XPath specification.

  • Alfresco Full Text—Provides a comprehensive, language-independent full-text search capability.

  • CMIS QL—Supports the CMIS QL standard, with the exception of joins between types. The Alfresco Full Text Search language can also be embedded in the CMIS QL contains() predicate.

See the "Content Management Interoperability Services" section later in this chapter for more detail on CMIS and CMIS QL.

CONTENT APPLICATION SERVICES

Content Application Services extend the repository services to provide the extended capabilities required for rich content and collaborative applications. These can be further categorized as Content, Control, and Collaboration services.

Content Services

Content services provide advanced content management capabilities to automate tasks, handle format conversions, automatically extract content metadata, and generate thumbnails and proxies.

Rules and Actions

Rules and actions automatically trigger behavior when certain defined conditions are met. A standard set of conditions and actions is provided that can be further extended using scripts and custom actions.

Transformation

Transformation services provide the ability to convert content between different file formats, such as generating PDF files from Microsoft Office formats and converting between a large range of image formats. The Transformation service is designed to be extensible, allowing the use of additional transformers.

Metadata Extraction

This service automatically extracts metadata information from inbound and/or updated content, and updates the corresponding node properties with the metadata values.

Thumbnailing

This service creates a thumbnail of a given content property for a node. A number of different standard types of thumbnails can be generated, including Flash Web previews and image thumbnails (small and medium-sized). The Thumbnailing service makes use of the specific transformations available via the Transformation service.

Control Services

Control services provide the ability to manage workflow, Web projects, sandboxes, and assets.

Workflow

Workflow services are provided to manage business processes around content using user-assigned tasks, automated steps, and flow control. The embedded JBoss jBPM engine is encapsulated within a Workflow service that provides a standard interface to the underlying workflow engine. Workflow definitions are used to define process templates, a number of which are provided out of the box. You can also define additional process definitions (see Chapter 7).

Web Projects

Web content management applications use Web projects to store Web content related to a Web site, Web application, and other types of managed Web property. The key use case is where multiple artifacts must be managed together through the concept of change sets – collections of related assets that must be managed as a whole.

The Web Projects service provides a set of methods for creating and managing Web project instances to support Web applications. These services are accessible from both the JavaScript and RESTful layers.

Sandboxes

Within a Web project, sandboxes provide users with an isolated working area in which to make changes to the Web content without affecting the view of the data by other users. Workflow is then used to manage a controlled submission process for publishing the changes to the production Web property.

The Sandbox services provide methods for creating, reading, modifying, and deleting content in a sandbox, and for submitting content from the sandbox for review, approval, and publishing.

Assets

Assets are the individual items of content being managed within a Web project sandbox. Methods include the ability to list all the assets, inspect their properties, and submit collections of changes in the form of a change set for review and publishing.

Collaboration Services

Collaboration services provide the ability to manage sites, user and group membership, activities, tagging, and comments.

Sites

Sites are a key concept within Alfresco Share for managing documents, wiki pages, blog posts, discussions, and other collaborative content relating to teams, projects, communities of interest, and other types of collaborative sites. The Sites service itself provides management capabilities for creating, updating, and deleting sites.

Invite

The Invite service is used to maintain the user and group membership for sites. The service is responsible for sending invite notices to users and managing the acceptance or rejection status for particular invites.

Activity

Alfresco Share uses activities to track a range of changes, updates, events, and actions, allowing users to be aware of what is being changed, and where, by whom, and when the changes occurred. The Activity service provides facilities for posting events and generating feeds for Share sites.

Tagging

Tags are keywords or terms assigned to a piece of information, such as documents, blog posts, wiki pages, and calendar events. The Tagging service provides methods and properties to add tags, remove tags, read tag information, and search by tag.

Commenting

Comments are modeled as separate content items that are associated with the relevant node through child associations. The Commenting service provides a RESTful API for managing comments against nodes and provides methods to get existing comments, post new comments, and delete comments.

See Chapter 5 for more information on modeling and associations.

HOW SERVICES ARE BUILT

Most services are built using three tiers: core Java, a Public Script service, and a RESTful API (see Figure 4-1). In some cases, services may be implemented using just one or two of the layers. For example, some low-level services are only available at the core Java level, in which case they may be used as a component of a higher-level service.

FIGURE 4-1


Generally, each tier has the following characteristics:

  • Tier 1 — Embedded Java API—The Embedded Java API is a low-level, stateless API implemented in Java. It encapsulates all the functionality provided by the service and is typically a collection of fine-grained, stateless methods. It is considered an internal API only suitable for core Java developers. All other interfaces are built on top of these APIs. Examples include the FileFolderService, SearchService, and AuthenticationService.

  • Tier 2 — JavaScript API—The JavaScript API provides an object-based interface on top of the Embedded Java API. It is designed to provide a developer-friendly interface to the capabilities provided by the server. Example methods available include createNode, createAssociation, setPermission, and query.

    Scripts can be used to implement independent behaviors in the form of actions and are used to provide the backing behavior for Web scripts when implementing the RESTful APIs.

  • Tier 3 — RESTful API—The RESTful APIs are designed around resources and data to provide a remote, URL-based API. As they are URL-based, they can be called from virtually any language. Although they are typically built on top of the Public Script services, they can also be implemented directly on the Java API. Examples include the Sites service and Tagging service, both of which are implemented on top of associated Script services.

USING SERVICES

As services are core to the Alfresco Content Application Server, they are used by all applications working against the server, including the Explorer and Share clients, Virtual File System interfaces such as CIFS and WebDAV, and the APIs. The APIs fall into two main categories: those available directly against the server (embedded APIs) and those that run on a separate tier (remote APIs). Developers use these APIs as appropriate to access and extend the out-of-the-box services.

Embedded APIs

Embedded APIs are used by custom extensions executed directly against the Content Application Server. There are three main embedded APIs: the Alfresco Java Foundation API, the JavaScript API, and the Template API.

  • Alfresco Java Foundation API—Provides a collection of public Java Interfaces to the services provided by the server.

  • JavaScript API—Provides an object-oriented view of the Java Foundation API with comprehensive access to the core services.

  • Template API—A read-only API designed to render output such as HTML, XML, JSON, and text using the FreeMarker template engine; the Template API uses an object-oriented view of the content repository in combination with templates to generate the output.

The JavaScript and Template APIs are the key building blocks for Web scripts, which are used to develop the RESTful APIs.

The following code samples illustrate usage of the embedded APIs.

The following example uses the Java API to create new content:

/**
 * Creates a new content node setting the content provided.
 *
 * @param  parent   the parent node reference
 * @param  name     the name of the newly created content object
 * @param  text     the content text to be set on the newly created node
 * @return NodeRef  node reference to the newly created content node
 */
private NodeRef createContentNode(NodeRef parent, String name, String text)
{
    // Create a map to contain the values of the properties of the node
    Map<QName, Serializable> props = new HashMap<QName, Serializable>(1);
    props.put(ContentModel.PROP_NAME, name);

    // Use the node service to create a new node
    NodeRef node = this.nodeService.createNode(
            parent,
            ContentModel.ASSOC_CONTAINS,
            QName.createQName(NamespaceService.CONTENT_MODEL_1_0_URI, name),
            ContentModel.TYPE_CONTENT,
            props).getChildRef();

    // Use the content service to set the content onto the newly created node
    ContentWriter writer = this.contentService.getWriter(
            node, ContentModel.PROP_CONTENT, true);
    writer.setMimetype(MimetypeMap.MIMETYPE_TEXT_PLAIN);
    writer.setEncoding("UTF-8");
    writer.putContent(text);

    // Return a node reference to the newly created node
    return node;
}

Code Snippet: createContentNode.java

This example uses the Java API, specifically the NodeService and ContentService, to create a content node including both content and metadata. It takes a NodeRef of the folder that will contain the node, a string to be used for the node's name, and a string containing the content. A map of the metadata values is then created and passed as a parameter to the NodeService, which creates the node itself. The content is written to the node using the ContentService, and finally the node reference for the newly created node is returned.

This second example uses the JavaScript API to create new content:

// create file in the user's home folder
var doc = userhome.createFile("myDoc.txt");
doc.content = "This is some content.";
doc.save();

Code Snippet: createfile.js

In this example, a new document called myDoc.txt is created in the home space of the current user. The content for the new document is set through doc.content, and the document is saved to commit the changes.

The final example uses the Template API to display all the properties for a given document:

<table>
  <#-- Get a list of all the property names for the document -->
  <#assign props = document.properties?keys>
  <#list props as t>
    <#-- If the property exists -->
    <#if document.properties[t]?exists>
      <#-- If it is a date, format it accordingly -->
      <#if document.properties[t]?is_date>
        <tr><td>${t} = ${document.properties[t]?date}</td></tr>
      <#-- If it is a boolean, format it accordingly -->
      <#elseif document.properties[t]?is_boolean>
        <tr><td>${t} = ${document.properties[t]?string("yes", "no")}</td></tr>
      <#-- Otherwise treat it as a string -->
      <#else>
        <tr><td>${t} = ${document.properties[t]}</td></tr>
      </#if>
    </#if>
  </#list>
</table>

Code Snippet: documentProperties.ftl

The template iterates over all the properties for a node called document and renders the values as appropriate for the data types returned.

Remote APIs

Several remote APIs are available, allowing clients on a separate tier to communicate with the Alfresco Content Application Server. These are based on Web services, REST, and CMIS. The remote APIs are designed to be language agnostic, allowing development against them in a range of languages, including Java, PHP, Ruby, .NET, and many more.

  • Web services: An object-oriented API using SOAP and supporting a range of content services, including authentication, query, node creation and update, access control, and actions.

  • REST: HTTP-based resource-oriented interfaces used by the Surf framework and Alfresco Share.

  • CMIS: The CMIS standard defines Web services and REST-based bindings for working with CMIS-compliant repositories.

The following is an example of calling a RESTful API to retrieve the list of tags for a document:

http://localhost:8080/alfresco/service/api/node/workspace/SpacesStore/97526d57-d1ce-4578-931d-0cc48ff23602/tags

This retrieves all the tags for the node with the node reference workspace://SpacesStore/97526d57-d1ce-4578-931d-0cc48ff23602 and returns them in the body of the HTTP response, formatted as JSON. For example:

{
   "data" : ["tagOne", "tagTwo"]
}
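The relationship between a node reference and the tags URL can be sketched in a few lines of Java. This helper is hypothetical (TagsUrlSketch and tagsUrl are names invented for this example, not part of any Alfresco client library); it only shows that the protocol, store, and ID parts of the node reference become path segments of the URL.

```java
// Hypothetical helper; it only rearranges the parts of a node reference.
final class TagsUrlSketch {

    /**
     * Builds the tags URL for a node reference of the form protocol://store/id.
     * serviceBaseUrl is the Web script root, for example
     * http://localhost:8080/alfresco/service.
     */
    static String tagsUrl(String serviceBaseUrl, String nodeRef) {
        int separator = nodeRef.indexOf("://");
        if (separator <= 0 || nodeRef.indexOf('/', separator + 3) < 0) {
            throw new IllegalArgumentException(
                    "Expected protocol://store/id but got: " + nodeRef);
        }
        // workspace://SpacesStore/<id> becomes workspace/SpacesStore/<id>
        return serviceBaseUrl + "/api/node/" + nodeRef.replace("://", "/") + "/tags";
    }
}
```

Passing the base URL and node reference from the example above produces the same URL as shown in the text; the result can then be fetched with any HTTP client.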

Configuring and Extending Existing Services

Alfresco uses the Spring framework to implement an extremely modular architecture. Services are bound together through their interfaces and configured using Spring's declarative Dependency Injection. This allows existing services to be configured, extended, and replaced, and new services to be introduced.

The specific details vary from service to service. For example, it is possible to define new transformers by extending the baseContentTransformer. This defines how the new transformer is invoked, the source and target MIME types it supports, and the transformer's availability. This is done through configuration that extends the existing service. The underlying service itself does not need to be modified and no additional code is required.

The following example shows the Spring configuration required to extend the out-of-the-box RuntimeExecutableContentTransformer. This is a standard transformer that is able to execute system executables. An example of a command-line transformation program is HTML Tidy (http://tidy.sourceforge.net/), which can transform HTML documents into XHTML documents.

<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE beans PUBLIC '-//SPRING//DTD BEAN//EN' 'http://www.springframework.org/dtd/spring-beans.dtd'>

<beans>
   <bean id="transformer.Tidy.XHTML"
         class="org.alfresco.repo.content.transform.RuntimeExecutableContentTransformer"
         parent="baseContentTransformer">
      <property name="checkCommand">
         <bean class="org.alfresco.util.exec.RuntimeExec">
            <property name="commandMap">
                <map>
                    <entry key=".*">
                        <value>tidy -help</value>
                    </entry>
                </map>
            </property>
            <property name="errorCodes">
               <value>2</value>
            </property>
         </bean>
      </property>
      <property name="transformCommand">
         <bean class="org.alfresco.util.exec.RuntimeExec">
            <property name="commandMap">
                <map>
                    <entry key="Linux">
                        <value>tidy -asxhtml -o '${target}' '${source}'</value>
                    </entry>
                    <entry key="Windows.*">
                        <value>tidy -asxhtml -o "${target}" "${source}"</value>
                    </entry>
                </map>
            </property>
            <property name="errorCodes">
               <value>2</value>
            </property>
         </bean>
      </property>
      <property name="explicitTransformations">
         <list>
            <bean class="org.alfresco.repo.content.transform.ExplictTransformationDetails">
                <property name="sourceMimetype"><value>text/html</value></property>
                <property name="targetMimetype"><value>application/xhtml+xml</value></property>
            </bean>
         </list>
      </property>
   </bean>
</beans>

Code Snippet: tidyTransformer.xml

The actual transformation command used is defined in the transformCommand property. The transformation mechanism performs substitutions of the variables ${source} and ${target}, which are the full file paths of the source and target files for the transformation. For example:

tidy -asxhtml -o "${target}" "${source}"
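To make the substitution step concrete, here is a minimal sketch of how a command could be selected and its variables filled in. It is not the actual RuntimeExec implementation: the class and method names are hypothetical, and the sketch assumes that the commandMap keys are regular expressions matched against the operating system name, as the ".*" and "Windows.*" entries in the configuration suggest.

```java
import java.util.Map;

// Hypothetical helper; not the actual org.alfresco.util.exec.RuntimeExec code.
final class CommandTemplateSketch {

    /**
     * Returns the first command whose key, read as a regular expression,
     * matches the whole operating system name, or null if none matches.
     */
    static String selectCommand(Map<String, String> commandMap, String osName) {
        for (Map.Entry<String, String> entry : commandMap.entrySet()) {
            if (osName.matches(entry.getKey())) {
                return entry.getValue();
            }
        }
        return null; // no command configured for this operating system
    }

    /** Replaces each ${name} placeholder with the matching value. */
    static String substitute(String template, Map<String, String> variables) {
        String result = template;
        for (Map.Entry<String, String> variable : variables.entrySet()) {
            // String.replace performs a literal (non-regex) replacement.
            result = result.replace("${" + variable.getKey() + "}", variable.getValue());
        }
        return result;
    }
}
```

With source set to /tmp/in.html and target set to /tmp/out.xhtml (made-up paths), the Windows command above becomes tidy -asxhtml -o "/tmp/out.xhtml" "/tmp/in.html".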

The transformer comes with an optional feature, checkCommand, which is executed by the init method. If an error occurs during execution of this command, which cannot take any parameters, then the transformer is flagged as not available. When not available, the getReliability method will always return 0.0; otherwise it is assumed that the transformation command will be successful. The reliability of the transformation is used by the transformation registry to select the most appropriate transformer for a given transformation. The transformer remains directly usable—you can directly select it as an action to perform.

External utilities follow only a rough convention for return codes. In this case, tidy returns an exit code of 2 when it encounters errors (1 indicates warnings only), so 2 is the only value listed. The errorCodes property defines a comma-separated list of values indicating failure; the default is "1, 2".
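The effect of the errorCodes setting can be sketched as a simple membership test on the exit code of the external process. The class below is hypothetical and is not taken from the Alfresco source; it only illustrates how a comma-separated list such as "1, 2" might be interpreted.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper illustrating a comma-separated list of failure codes.
final class ErrorCodesSketch {

    /** Parses a list such as "1, 2" into a set of integer exit codes. */
    static Set<Integer> parse(String errorCodes) {
        Set<Integer> codes = new HashSet<>();
        for (String part : errorCodes.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                codes.add(Integer.parseInt(trimmed));
            }
        }
        return codes;
    }

    /** An execution counts as failed when its exit code is in the configured list. */
    static boolean isFailure(int exitCode, String errorCodes) {
        return parse(errorCodes).contains(exitCode);
    }
}
```

With errorCodes set to "2", an exit code of 1 is not treated as a failure, whereas with the default of "1, 2" it is.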

The final piece of configuration defines the MIME types that this transformer supports via the explicitTransformations property. In this case, the transformer supports a source MIME type of text/html and a target MIME type of application/xhtml+xml.

This example illustrates how to extend an existing service via configuration only: it has been possible to add a new transformer without any code changes. To complete the example, additional configuration is needed to expose the new transformer via the client interfaces or as a repository action. See Chapter 14 for more details.

Building a Simple Service

The following section walks through an example of building a new service by showing the high-level interfaces for each tier of the service. To simplify the example, the full detail of some of the methods is not shown.

This example can be considered as a new content application service built on the Content Repository Services.

The example service provides a counter with methods for creating, listing, incrementing, decrementing, and resetting integer counter instances and their values. The counter can be used to persist how many times a particular event has occurred. For example, to generate a unique reference number for a document, the counter can be used to generate an integer that can be used as the basis of the reference number.

The service will be implemented using the three-tier approach described earlier in this chapter. Java provides the underlying methods in Tier 1; the JavaScript API is used in Tier 2 to provide higher-level methods, allowing the counter to be called via repository actions and exposed via Tier 3 as the new RESTful counter API.

Example Counter Service: Tier 1 – Java Service Layer

The following code outline shows Tier 1 of the Counter service.

Java Service Layer

// Low-level service for creating and updating counters.
public interface CounterService
{
    public enum CounterOperation { INC, DEC }

    // Create a counter and set the initial value.
    // Returns the counter id.
    String createCounter(int initialValue);

    // Get the value of a counter.
    int getCounterValue(String counterId);

    // Update the counter value by a given step.  The counter operation
    // is provided.
    int updateCounterValue(String counterId, CounterOperation operation, int step);

    // Delete the counter with the given id.
    void deleteCounter(String counterId);
}

The Java Service Layer outline defines the low-level Java methods for the counter. It includes methods for creating counters, getting counter values by ID, updating counter values, and deleting counters. An implementation of these methods will use other services (not shown) from the Embedded Java API to create the underlying repository objects used to persist the counters.
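To show how the Tier 1 methods fit together, the following sketch implements the counter interface in memory. It is an illustration only: a real implementation would persist counters in the repository through other services, whereas this one keeps them in a map. The interface is repeated so that the example is self-contained, and it assumes that deleteCounter takes the counter ID and returns nothing; the class name InMemoryCounterService and the generated IDs (counter0, counter1, and so on) are also assumptions made for this example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;

// Repeated here so the sketch compiles on its own (assumed signatures).
interface CounterService {
    enum CounterOperation { INC, DEC }

    String createCounter(int initialValue);

    int getCounterValue(String counterId);

    int updateCounterValue(String counterId, CounterOperation operation, int step);

    void deleteCounter(String counterId);
}

// Hypothetical in-memory implementation; nothing is persisted to the repository.
final class InMemoryCounterService implements CounterService {
    private final Map<String, Integer> counters = new HashMap<>();
    private int nextId = 0;

    @Override
    public synchronized String createCounter(int initialValue) {
        String id = "counter" + nextId++;
        counters.put(id, initialValue);
        return id;
    }

    @Override
    public synchronized int getCounterValue(String counterId) {
        Integer value = counters.get(counterId);
        if (value == null) {
            throw new NoSuchElementException("Unknown counter: " + counterId);
        }
        return value;
    }

    @Override
    public synchronized int updateCounterValue(
            String counterId, CounterOperation operation, int step) {
        int current = getCounterValue(counterId);
        int updated = (operation == CounterOperation.INC) ? current + step : current - step;
        counters.put(counterId, updated);
        return updated; // the new value after the update
    }

    @Override
    public synchronized void deleteCounter(String counterId) {
        if (counters.remove(counterId) == null) {
            throw new NoSuchElementException("Unknown counter: " + counterId);
        }
    }
}
```

The methods are synchronized so that concurrent increments do not lose updates, which matters if the counter is used to generate unique reference numbers.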

Example Counter Service: Tier 2 – JavaScript API

To allow the Java service to be used outside of Java, such as for a repository action or as a RESTful API, access is provided via Tier 2, the JavaScript API layer.

JavaScript API

// Create counter.  Initial value and default step are optional parameters.
// Returns the created counter object
function createCounter(initialValue=1, defaultStep=1);

// Get counter.
// Returns the counter object
function getCounter(counterId);

Counter
{
   var id;
   var defaultStep;
   var value;

   function increment(value=0);

   function decrement(value=0);

   function delete();
}

The JavaScript API outline defines higher-level functions on top of the Tier 1 Java service. It includes functions to create a counter and to get an instance of a counter by ID, and a Counter object with functions to increment, decrement, and delete the counter. These functions will be callable via the Tier 3 RESTful API that follows.

Example Counter Service: Tier 3 – RESTful API

The RESTful layer exposes the script layer as two resource URIs working against counter collections and counter instances. The first resource is the counter collection, which returns a list of all the available counters and is also used to create new counter instances. The second resource is the counter, which is used to manipulate the value of a particular counter instance.

Resource: Counter Collection

Method: GET /alfresco/counters
   Description: Returns a list of counters

   Example output:

   {
      "counters":
      {
         "counter0":
         {
            "id" : "counter0",
            "value" : 12,
            "url" : "/counters/counter0"
         },

         "counter1":
         {
            "id" : "counter1",
            "value" : 3,
            "url" : "/counters/counter1"
         }
      }
   }

The counter collection resource is called using a simple HTTP GET method against the /alfresco/counters URI. It provides a platform- and language-independent interface to get a list of counters. The response is formatted using JSON and the calling application can parse the JSON response to get the list of counters as appropriate.

Method: POST /alfresco/counters
   Description: Creates a new counter

   Example input:
   {
      "id" : "counter2",
      "initialValue" : 0,
      "defaultStep" : 1
   }

   Example output:

   {
      "id" : "counter2",
      "value" : 0,
      "url" : "/counters/counter2"
   }

Calling the counter collection resource using an HTTP POST method creates a new counter instance. The body of the POST request is a JSON-formatted payload that includes the ID for the new counter, the initial value (initialValue), and the default step (defaultStep) used to increment or decrement the counter.

Resource: Counter

Method: GET /alfresco/counters/{counter-id}
   Description: Gets the current value for a counter

   Example output:

   {
      "id" : "counter2",
      "value" : 0,
      "url" : "/counters/counter2"
   }

The counter resource is used to read and update the values for a particular counter. As with the counter collection, the behavior is based on the HTTP request method used to call the resource. In this example the URI is being called using an HTTP GET, which returns the current value as a JSON response for the counter-id identified at the end of the URI request.

Method: POST /alfresco/counters/{counter-id}
   Description: Increments or decrements a counter's value according to the passed
   in step value.

   Example input:

   {
      "step" : 1
   }

   Example output where existing value is 11:

   {
      "id" : "counter0",
      "value" : 12,
      "url" : "/counters/counter0"
   }

Calling the counter resource with POST increments or decrements the counter according to the step value passed in the request body. If no body is provided, the default step value is used. The call returns a JSON response that includes the new counter value.

Method: DELETE /alfresco/counters/{counter-id}
   Description: Deletes the counter

The final method deletes the counter by calling the counter resource with an HTTP DELETE method. Note that in this case, there is no response in the HTTP body. A 204 (No Content) status code is returned to indicate that the delete was successful.

You now have a set of RESTful APIs that provide simple, URI-addressable and platform-independent services to manage and inspect your counters.

CONTENT MANAGEMENT INTEROPERABILITY SERVICES (CMIS)

In September 2008, Microsoft, IBM, EMC, Alfresco, BEA (now Oracle), and SAP submitted the Content Management Interoperability Services (CMIS) specification to OASIS to become a standard. The goal of CMIS is to provide standardized, interoperable access to any content management system that implements it, such as Microsoft SharePoint, IBM FileNet, EMC Documentum, and Alfresco. This allows the ECM industry to create a new ecosystem around content management. CMIS is designed to enable new classes of cross-repository applications in areas such as eDiscovery, publishing, collaboration, and information access. CMIS also strives to create a common understanding of content query, content properties and types, type inheritance, and common content operations; however, CMIS is not designed to expose all capabilities of a repository or to expose administration or management functions.

CMIS provides a level of portability that allows you to build applications that are not locked into any one content management system and to future-proof those applications. CMIS provides a rich set of functionality, yet is capable of handling a wide variety of content management systems. CMIS provides a set of content services for managing content metadata, versioning, folder containment, associations, and binary transfer. In addition, CMIS provides a query language based upon SQL for querying content, its metadata, and its context.

After extensive development and public review, CMIS emerged as a full-fledged standard in 2010.

Figure 4-2 shows a high-level overview of CMIS.

FIGURE 4-2

CMIS Requirements

Many large organizations today run multiple enterprise content management (ECM) systems, each with millions of dollars of implementation and integration work built on top of it that is specific to the underlying repository. Yet each system usually remains a silo of information that does not share content. In addition, an application built for one system cannot be used on another. The lack of integration and interoperability means that organizations using multiple systems cannot get a consistent view of information. This creates substantial operational and compliance risks in that content cannot be found and consistently managed. The challenge is even greater for independent application vendors who create content management application solutions. Supporting more than one content management system can be a very expensive proposition.

CMIS promises to be a standard that provides interoperability where others have not. CMIS focuses on a few use cases and on being able to map to existing systems rather than specifying how those systems should work. It is particularly important for CMIS to work with major ECM systems that have a large installed base of users and content. CMIS is designed to be language-independent and uses Web protocols to access the repositories. CMIS supports both SOAP and REST, the latter through the AtomPub protocol.

Note

Atom Publishing Protocol, or AtomPub, is an IETF standard for creating and updating Web resources. It is a REST-based protocol and is very flexible in extending the metadata it handles. The OASIS CMIS committee chose this protocol as the basis for its REST-based APIs.

CMIS does a good job of mapping the data modeling, query capabilities, and basic content services of these underlying systems. Although these systems have much in common, the CMIS Technical Committee has tried to ensure that the functionality it exposes can actually be implemented by each of them.

The core use cases targeted by CMIS are:

  • Collaborative content creation: Users work together to create one or more documents or Web pages.

  • Portal access of ECM systems: Portals with CMIS provide an aggregated interface for viewing content from multiple sources.

  • Mashups of content in Web sites: Web sites using CMIS can create composite applications that mash up or integrate data and functionality from one or more repositories.

  • Portable search against multiple repositories: Search interfaces in CMIS support a consistent way for search engines to index and access content from a content repository.

These core use cases drove requirements around query, authentication, security, versioning, change logs, and basic content operations.

Some applications are not directly addressed by CMIS, but the Technical Committee intended that they could be built on it:

  • Workflow and Business Process Management: Business processes frequently have content attached, such as contracts or invoices, and CMIS is a good way to ensure that accurate versions of that content are attached.

  • Archival: Because the archive stores information as part of an ECM system, applications can be archive-independent by using CMIS.

  • Compound or virtual document publishing: CMIS has no publishing functions specifically for compound documents. However, publishing applications can manipulate and access compound or virtual documents, which can be modeled using CMIS relationships.

  • eDiscovery: Applications that discover content in many different repositories can use CMIS to access those repositories in a consistent way and create a federated view of which documents are discoverable.

Core Concepts of CMIS

At the root of the CMIS model and services is a repository, which is an instance of the content management system and its store of metadata, content, and indexes. The repository is the end point to which all requests are directed. In the RESTful model, it is the root path of the resources being addressed in CMIS. The repository is capable of describing itself and its capabilities.

The core CMIS object model (see Figure 4-3) is not very different from the Alfresco object model minus the support of aspects. Like Alfresco, CMIS supports object types that define properties associated with each type. Each object has an object type, properties defined by that object type, and an object ID that uniquely identifies that object. Object types support inheritance and are sub-typed as Document object types and Folder object types. Document object types may have content streams to store and access binary data. Object types may also be related through Relationship object types.
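The way a subtype picks up the properties of its parent can be shown with a toy model. The Python sketch below is an illustration only, not a CMIS client: the ObjectType class is hypothetical, and the acme:invoice type and acme:amount property are made-up names; cmis:document, cmis:name, and cmis:objectId are identifiers from the CMIS specification.

```python
class ObjectType:
    """Toy object type with single inheritance of property definitions."""

    def __init__(self, type_id, parent=None, properties=()):
        self.type_id = type_id
        self.parent = parent
        self.properties = list(properties)

    def all_properties(self):
        """Properties defined on this type plus those inherited from parents."""
        inherited = self.parent.all_properties() if self.parent else []
        return inherited + self.properties


document = ObjectType("cmis:document", properties=["cmis:name", "cmis:objectId"])
# Hypothetical custom subtype that adds one property of its own.
invoice = ObjectType("acme:invoice", parent=document, properties=["acme:amount"])
print(invoice.all_properties())
```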

FIGURE 4-3

A Policy object represents an administrative policy that can be enforced by a repository, such as a retention management policy. An Access Control List (ACL) is a type of Policy object. CMIS allows applications to create or apply ACLs. The Alfresco repository also uses Policy objects to apply aspects.

Document objects have properties, which may be multivalued; content streams for accessing the binary information that is the document; and versions (see Figure 4-4). Document objects can also have Renditions that represent alternate file types of the document. Only one Rendition type, a Thumbnail, is well defined.

FIGURE 4-4

Versioning in CMIS (see Figure 4-5) is kept relatively simple so that it can encompass the various versioning models of different CMIS implementations. Each version is a separate object with its own object ID. For a given object ID, you can retrieve the specific version, the current version, or all versions of the object, as well as delete specific or all versions of a Document object. Document versions are accessed as a set of Document objects ordered by the timestamp of the object. CMIS does not provide a history graph.
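The versioning model described here can be sketched as a flat set of objects sorted by time. The following Python code is an in-memory illustration, not a CMIS client; the DocumentVersion and VersionStore classes are hypothetical, and the series_id field is an assumed grouping key that loosely mirrors the idea of a version series.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class DocumentVersion:
    object_id: str     # each version has its own object ID
    series_id: str     # assumed key that links versions of one document
    label: str
    created: datetime


class VersionStore:
    """Keeps versions as independent objects; there is no history graph."""

    def __init__(self):
        self._versions = {}

    def add(self, version):
        self._versions[version.object_id] = version

    def all_versions(self, object_id):
        """All versions related to object_id, newest first."""
        series = self._versions[object_id].series_id
        related = [v for v in self._versions.values() if v.series_id == series]
        return sorted(related, key=lambda v: v.created, reverse=True)

    def latest(self, object_id):
        """The most recent version related to object_id."""
        return self.all_versions(object_id)[0]


store = VersionStore()
store.add(DocumentVersion("doc-v1", "series-1", "1.0", datetime(2010, 1, 1)))
store.add(DocumentVersion("doc-v2", "series-1", "1.1", datetime(2010, 2, 1)))
print(store.latest("doc-v1").label)
```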

FIGURE 4-5

Document objects live in a folder hierarchy (see Figure 4-6). As in Alfresco, a folder can exist in another folder to create the hierarchy. The relationship between folder and document is many-to-many if the repository supports multifiling, allowing a document to appear in more than one folder. Otherwise, it is one-to-many.
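The difference between the two cases comes down to whether a document may have more than one parent folder. The Python sketch below models that rule in memory; the FolderIndex class and its method names are hypothetical and do not correspond to any CMIS binding.

```python
class FolderIndex:
    """Tracks which folders contain each document."""

    def __init__(self, supports_multifiling):
        self.supports_multifiling = supports_multifiling
        self._parents = {}  # document ID -> set of folder IDs

    def add_to_folder(self, doc_id, folder_id):
        parents = self._parents.setdefault(doc_id, set())
        already_filed_elsewhere = bool(parents) and folder_id not in parents
        if already_filed_elsewhere and not self.supports_multifiling:
            # Without multifiling, a document has at most one parent folder.
            raise ValueError("repository does not support multifiling")
        parents.add(folder_id)

    def parents_of(self, doc_id):
        return sorted(self._parents.get(doc_id, set()))


multi = FolderIndex(supports_multifiling=True)
multi.add_to_folder("doc-1", "folder-a")
multi.add_to_folder("doc-1", "folder-b")
print(multi.parents_of("doc-1"))
```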

FIGURE 4-6

A query in CMIS is based upon SQL-92 and should be familiar if you have used a relational database system like MySQL. The query itself is read-only and presents no data manipulation capabilities. The syntax consists of the following clauses:

  • SELECT with a target list

  • FROM with the object types being queried

  • JOIN to perform a join between object types

  • WHERE with the predicate

  • IN and ANY to query multivalue properties

  • CONTAINS to specify a full-text qualification

  • IN_FOLDER and IN_TREE to search within a folder hierarchy

  • ORDER BY to sort the results

The CMIS query maps the object type into a relational structure where the object type approximates a table, the object approximates a row, and the property approximates a column that can be multivalued. The actual binary content can be queried using a full-text query, and folder path information can be queried using the IN_FOLDER and IN_TREE functions. A query can also be paged for user interface presentation.
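To make the clause list concrete, the following Python sketch assembles a query string from its parts. The build_query helper is hypothetical, and the folder ID and search term are placeholder values; cmis:document, cmis:name, and cmis:objectId are identifiers from the CMIS specification. The helper does no escaping, so it assumes the values contain no quote characters.

```python
def build_query(columns, object_type, folder_id=None, text=None, order_by=None):
    """Assemble a CMIS query string; values are assumed to be quote-free."""
    query = "SELECT {} FROM {}".format(", ".join(columns), object_type)
    predicates = []
    if folder_id:
        predicates.append("IN_FOLDER('{}')".format(folder_id))
    if text:
        predicates.append("CONTAINS('{}')".format(text))
    if predicates:
        query += " WHERE " + " AND ".join(predicates)
    if order_by:
        query += " ORDER BY " + order_by
    return query


query = build_query(
    ["cmis:name", "cmis:objectId"],
    "cmis:document",
    folder_id="folder-id-123",  # placeholder folder object ID
    text="alfresco",
    order_by="cmis:name",
)
print(query)
```

Here the object type plays the role of the table and the properties play the role of columns, as described above.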

CMIS Services

CMIS Services comprise Repository services, Change services, Navigation services, Object services, Multifiling (Folder) services, Discovery (Query) services, Versioning services, Relationship services, Policy services, and ACL services. You can access these services equally using SOAP or AtomPub, depending on your preferred architectural style. This book uses AtomPub and the RESTful style as an example of how to use CMIS with Alfresco.

  • Repository services discover available repositories and get the capabilities of these repositories. They also provide some basic Data Dictionary information of what types are available in the repository.

  • Navigation services let you navigate the repository by accessing the folder tree and traversing the folder/child hierarchy. These services can be used to get both children and parents of an object.

  • Object services provide the basic CRUD (Create, Read, Update, Delete) and Control services on any object, including Document, Folder, Policy, and Relationship objects. For Document objects, this includes setting and getting of properties, policies, and content streams. Object services retrieve objects by path or object ID. Applications may also discover what actions users are allowed to perform.

  • Multifiling services let you establish the hierarchy by adding or removing an object to or from a folder.

  • Discovery services provide Query and Change services. Discovery services accept the SQL-like query described earlier and provide a means of paging the results of the query.

  • Change services let you discover what content has changed since the last time checked, as specified by a special token. Change services can be used for external search indexing and replication services.

  • Versioning services control concurrent operation of the Object services by providing Check In and Check Out services. Versioning services also provide version histories for objects that are versioned.

  • Relationship services create, manage, and access relationships or associations between objects as well as allow an application to traverse those associations.

  • Policy services apply policies on document objects. Policies are free-form objects and can be used by implementations for security, record, or control policies.

  • ACL services are used to create, manage, and access Access Control Lists to control who can perform what operation on an object.
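Of the services listed above, the token-based Change services are the easiest to misread, so a small model may help. The Python sketch below is an in-memory illustration of polling with a token, not a CMIS binding; the ChangeLog class is hypothetical, and using the position in the log as the token is an assumption (real repositories treat the token as opaque).

```python
class ChangeLog:
    """Append-only log that hands out a token marking the caller's position."""

    def __init__(self):
        self._events = []  # (object_id, change_type) in arrival order

    def record(self, object_id, change_type):
        self._events.append((object_id, change_type))

    def changes_since(self, token=None, max_items=100):
        """Return events after the token and a new token for the next call."""
        start = 0 if token is None else int(token)
        events = self._events[start:start + max_items]
        return events, str(start + len(events))


log = ChangeLog()
log.record("doc-1", "created")
log.record("doc-1", "updated")
log.record("doc-2", "created")

first_batch, token = log.changes_since()
log.record("doc-3", "created")
second_batch, token = log.changes_since(token)
print(second_batch, token)
```

An external indexer would store the returned token between runs and pass it back on the next poll, so it only sees changes it has not yet processed.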

Obviously, each CMIS service can become quite an involved topic in its own right. For a complete catalog of services, please see the CMIS specification on the OASIS Web site at www.oasis-open.org/committees/tc_home.php?wg_abbrev=cmis.

Using CMIS with Alfresco

The Alfresco implementation is a thorough implementation of CMIS and has been the basis for many CMIS applications. Most applications that use CMIS with Alfresco use the AtomPub protocol rather than SOAP; however, some applications that have a strong Web services framework, like SAP and Tibco, use SOAP instead. In this book, examples are presented using the AtomPub protocol.
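To give a feel for what travels over AtomPub, the following Python sketch builds the kind of Atom entry a client might send to create a document. It is an illustration under stated assumptions: the namespace URIs are those of CMIS 1.0 and should be checked against the version your server implements, the build_document_entry helper is hypothetical, and a real request would also carry the content stream and be posted to a folder's children collection.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as used by CMIS 1.0 (verify against your server's version).
ATOM = "http://www.w3.org/2005/Atom"
CMIS = "http://docs.oasis-open.org/ns/cmis/core/200908/"
CMISRA = "http://docs.oasis-open.org/ns/cmis/restatom/200908/"


def build_document_entry(name, type_id="cmis:document"):
    """Return an Atom entry (as text) that names a document and its type."""
    entry = ET.Element("{%s}entry" % ATOM)
    ET.SubElement(entry, "{%s}title" % ATOM).text = name
    obj = ET.SubElement(entry, "{%s}object" % CMISRA)
    props = ET.SubElement(obj, "{%s}properties" % CMIS)
    prop = ET.SubElement(
        props,
        "{%s}propertyId" % CMIS,
        {"propertyDefinitionId": "cmis:objectTypeId"},
    )
    ET.SubElement(prop, "{%s}value" % CMIS).text = type_id
    return ET.tostring(entry, encoding="unicode")


entry_xml = build_document_entry("report.txt")
print(entry_xml)
```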

If you are programming CMIS applications in Java, you can use the Apache Abdera libraries, which were built to handle AtomPub. Abdera provides both client and server implementations of the Atom Syndication Format and the Atom Publishing Protocol. You can find Apache Abdera at http://abdera.apache.org.

To use CMIS with PHP, it is best to use one of the PHP Web frameworks such as Drupal or Joomla. The Drupal interface was built by Optaros and Acquia and can be found at http://drupal.org/project/cmis.

CMIS is a good choice for building applications or application integrations against Alfresco when you wish to make the application portable to other systems; however, you will need to use Web scripts instead to:

  • Use or query aspects or access properties in aspects

  • Add or manage workflows

  • Apply actions or rules

  • Perform any records management operations

  • Work with Web content management

  • Perform any management or administrative task, such as user or group management, or indexing control

You can also integrate Web scripts with the AtomPub protocol of CMIS.

For more information on using CMIS with Alfresco, including new Java bindings, visit http://cmis.alfresco.com. To access the CMIS specification, visit www.oasis-open.org/committees/cmis.
