IBM FileNet Content Manager solutions
In this chapter, we describe product feature capabilities and solution components to aid designers in preparing Enterprise Content Management (ECM) solutions. The material describes available options for the input, management, storage, process, and presentation phases of ECM solutions.
We introduce these ECM design aids:
Solution building blocks:
 – Foundation components
 – Content ingestion tools
 – Process management
Sample use cases of solution building blocks
13.1 Solution building blocks
At a fundamental level, any ECM solution is composed of several major solution components, including the following:
Foundation components
Content ingestion tools
Process management
All ECM solutions are essentially a construction of information input, storage, information processing, and presentation and delivery. Figure 13-1 illustrates the major ECM solution components.
Figure 13-1 Major ECM solution components
These major components form the building blocks of ECM solution design. Solution building blocks are the features that ECM solution designers can specify and combine to build out each of the components of an ECM solution: content ingestion, content management, process, and presentation. The IBM FileNet suite of products contains applications and tools that offer designers a wide range of features and functions for the design of each of the major components of an ECM implementation.
Figure 13-2 shows several of the IBM tools that are available to ECM designers and the place for these blocks within the four major design phases of an ECM solution.
Figure 13-2 ECM design phases
13.1.1 Foundation components
We describe the foundation components of the P8 Content Manager:
Repositories
Business object management
Classification
Versioning
Security
Auditing
Search
Content Platform Engine application programming interfaces (APIs)
Content storage and content caching
Lifecycle management and retention management
Social ECM capabilities
Repositories
Repositories are the basic components of FileNet P8 Content Manager. Their main purpose is to store the business objects, for example, documents, images, folders, and custom objects, with the respective metadata and provide a centralized information library.
A single FileNet Content Platform Engine can serve several repositories, which are also called object stores. An object store can store various business data, including structured and unstructured content, such as images, XML documents, Microsoft Office documents, and web pages. It can also be configured to store the content in a database, a file system, a fixed content device, or any combination of these options.
 
Business benefits: The FileNet content repository provides a standard solution for adding documents, versioning them, and checking them out and in. It can house all documents in a central repository that is accessible to all authorized users.
 
Recommendations: Use repositories to separate business objects based on functional and logical purposes, enforce object security, and improve the manageability of objects. For example, you can create a repository for sensitive Human Resources objects, a second repository for customer-related objects, and a third repository for vendor-related objects.
Business object management
In addition to managing documents, P8 Content Manager also can manage other types of data, such as folders, custom objects, and annotations.
Folders are special objects that are used to relate other types of objects, such as documents and custom objects, and provide a way to browse through other objects. Folders have the following characteristics:
Have system properties that the system manages automatically, such as Date Created.
Can have custom properties for storing business-related metadata.
Are secured.
Can be hierarchical (a folder can have subfolders).
Can contain documents and custom objects.
Can generate server events when they are created, modified, or deleted. These events are then used to customize behavior.
Can be annotated.
Provide containment by reference, which allows any specific object to be filed in multiple folders at the same time.
 
Business benefits: Folders provide an option to organize and structure documents and custom objects in logical entities. For example, you can create a loan folder that contains all the relevant business objects of a loan application.
 
Recommendations: When you use folders, be aware of the following design considerations that might affect overall system performance:
The number of folders
The depth of a folder structure
The number of objects in a folder
Folder items with the same name in the same folder because this duplication violates a uniqueness constraint.
Custom objects contain only metadata without any content and are used to store business information. Custom objects have the following characteristics:
Have system properties that the system manages automatically, such as Date Created.
Can have custom properties for storing business-related metadata, such as Account Number.
Are secured.
Can participate in business processes as workflow attachments.
Can generate server events when they are created, modified, or deleted. These events are then used to customize behavior.
 
Recommendations: Use custom objects to store data that relates to your business requirements. For example, a client can be represented as a custom object.
Document objects represent the electronically stored information that is managed by the ECM system. Documents have the following characteristics:
Have system properties that the system manages automatically, such as Date Created.
Can have custom properties for storing business-related metadata about the document.
Are secured.
Can have content that can be indexed for searching.
Can point to content that is outside of the object store (external content).
Can have no content (metadata only).
Can be versioned to maintain a history of the content over time.
Can be filed in folders.
Can have a lifecycle.
Can participate in business processes as workflow attachments.
Can generate server events when they are created, modified, or deleted. These events are then used to customize behavior and trigger any custom events or workflows in other systems.
Can be rendered to different formats, such as PDF and HTML, by using the add-on IBM Rendition Engine.
Can be published to a website.
Can be annotated.
Can be audited.
Annotation objects represent relevant incidental information about objects that can be associated with custom objects, folders, or documents. Annotation objects have the following characteristics:
Annotation security is independent from object security. Default security is provided by the class and by the annotated object. An annotation can optionally have a security policy assigned to it.
Can have subclasses.
Can have zero or more associated content elements, and the content does not need to have the same format as its annotated object.
Are uniquely associated with a single document version and, therefore, are not versioned when a document version is updated.
Can be modified and deleted independently of the annotated object.
Can be searched for and retrieved using the Content Engine API.
Can subscribe to server-side events that fire when an action (such as creating an annotation) occurs.
Can participate in a link relationship.
Can be audited.
Classification
Each document class defines the properties and the default security for all the documents that are added under that document class. Document classes support design based on organization content or function and can encapsulate a single design aspect. Documents are classified by selecting a document class and property values for each document. Classification can also be performed by adding objects into folders that define classification taxonomies.
Classification can be performed in the following ways:
Manually by a user.
By an application that uses the P8 Content Engine API.
Automatically by using the content-based classification capability that is provided in the P8 Content Platform Engine.
 
Business benefits: Document classes support transparent business functionality and can automatically file the document in the correct folder and apply the correct retention schedule.
 
 
Recommendations: Define a top-level document class with all the properties that are common across all objects in the enterprise. All specific application objects are children of the enterprise document class.
Versioning
Versioning is a base document management capability that is used to maintain the history of the changes of the document content. The set of versions for a single document is called a version series. P8 Content Platform Engine supports a minor and major version scheme; a minor version typically represents a “work in progress” document and a major version represents a completed document.
The system can be configured to apply security policies that in turn automatically apply different access rights for major and minor versions, making it easy to enforce a different viewing audience for in-progress documents.
In addition to version numbers, P8 Content Platform Engine maintains a state property that indicates the current state of each document version. The states are as follows:
In Process: A work-in-progress version. Only one version of a version series can be in process.
Reservation: A document currently checked out for modification. Only the latest version of a version series can be reserved.
Released: A document released as a major version. Only one major version of a version series can be released.
Superseded: A version superseded by another version. Many versions in the version series can be superseded.
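The way these states interact with check-out and check-in can be sketched as a small state model. This is a simplified illustration and not the Content Engine API: the class names, the method names, and the version numbering shown for a reservation are assumptions made for the example.

```python
from dataclasses import dataclass, field

# Simplified model of a version series; not the Content Engine API.
IN_PROCESS, RESERVATION, RELEASED, SUPERSEDED = (
    "In Process", "Reservation", "Released", "Superseded")


@dataclass
class Version:
    major: int
    minor: int
    state: str


@dataclass
class VersionSeries:
    versions: list = field(default_factory=list)

    def checkout(self):
        """Reserve the series; only one reservation can exist at a time."""
        if any(v.state == RESERVATION for v in self.versions):
            raise RuntimeError("series is already checked out")
        last = self.versions[-1] if self.versions else Version(0, 0, SUPERSEDED)
        self.versions.append(Version(last.major, last.minor, RESERVATION))

    def checkin(self, as_major):
        """Turn the reservation into a minor (In Process) or major (Released) version."""
        if not self.versions or self.versions[-1].state != RESERVATION:
            raise RuntimeError("nothing is checked out")
        reservation = self.versions[-1]
        for v in self.versions[:-1]:
            # Any new version supersedes the previous in-process version;
            # a new major version also supersedes the previously released one.
            if v.state == IN_PROCESS or (as_major and v.state == RELEASED):
                v.state = SUPERSEDED
        if as_major:
            reservation.major, reservation.minor = reservation.major + 1, 0
            reservation.state = RELEASED
        else:
            reservation.minor += 1
            reservation.state = IN_PROCESS
```

In this model, a minor check-in after a major release leaves the released version in place, which matches the rule that at most one version is in process and at most one is released at any time.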
 
Business benefits: A version series retains the full history of the content and supports correct information management.
 
Recommendations: Define security on versions in order to ensure that only authorized users can access “work in progress” versions or completed versions.
Auditing
Auditing is the recording of events that occur on business objects. All business objects and almost all events can be audited. Audit definitions describe how to audit an event. For example, you can configure an audit definition for a document class so that audit entries are automatically created whenever documents of that class are checked in.
Audit entries are stored in a table of the object store database. Those entries can be viewed, exported for reporting reasons, and administered by users with the correct authorization.
 
Business benefits: Auditing helps you monitor content and process management activities, and supports regulatory compliance, for events such as the following:
Object creation
Updates
Deletions
 
Recommendations: Only enable auditing for selected events and plan for audit log cleanup. Audit logs are stored in a database table and a large audit log table can create performance issues.
Security
In P8 Content Manager, you secure business objects by defining a directory service that controls who can log on to the Content Platform Engine and by setting access rights for those users.
P8 Content Manager has a defined security context. Only those users, groups, and machine accounts that are explicitly given access to the object store can access the resources (business objects).
There are many ways to define the security of a business object:
Default instance security: The security of the object is defined in the object class and inherited by all instances of that class.
Version state: The security of the object is defined by the version level of the document. For example, some users might have access to minor versions of the document, and other users have access only to major versions of the same document.
Document state: The security of the document is controlled by the document state.
Marking sets: The security of the document is controlled by a property value and by the code you implement that interprets the meaning of the property.
Directly applied security: The security of the object is assigned directly to the object by a user or an application.
Inherited security: Security is placed on the object by a security parent or by setting up a relationship with an object-valued property whose Security Proxy Type is set to Inherited.
 
Business benefits: Using the model where document privileges are assigned from Directory Services functional groups, not individual users, helps reduce the cost of managing the security of the system by reducing security access complexity and by handling the separation of duty requirements.
Marking sets allow access to a document to be controlled based on specific property values to ensure that sensitive information is protected, for example, Secret/Confidential. With marking sets, you ensure document security and privacy control, limit access to sensitive data, and control access to documents.
 
Recommendations: Always assign security to LDAP groups and not to specific LDAP users. Assigning rights to groups gives you more flexibility over object access. (You can easily add users to or remove users from a group, which grants or removes access to objects that are already stored.)
Try to avoid the Deny security option on objects, which is an implicit way to remove access from users or groups.
Marking sets are stored in the global configuration database (GCD). If you add too many of them, it might affect the performance of the system.
Search
P8 Content Manager supports property and content-based searches.
With property searches, a user or an application can search multiple object stores for business objects that have a specific value in a property. Therefore, users can search for documents, custom objects, or folders in different object stores based on a property value.
Searches are defined in P8 Content Manager as SQL queries and support many of the standard SQL operators, such as OR, AND, LIKE, UNION, and INTERSECTION.
Search definitions can be created and then stored in an object store, allowing users easy access to common queries.
Content-based retrieval (CBR) supports searching within the content of a document or the metadata. CBR provides capabilities to search for misspelled words, typographical errors, word stems, synonym expansion, and wildcards. CBR search results can be ranked by relevancy and can display a document summary format.
Bulk operations can be performed on search results. Operations can be scripted or selected from a set of predefined operations, such as delete, cancel checkout, file, unfile, and change security.
 
Business benefits: Content Platform Engine search capabilities provide an effective means of locating information, improving the ability to share information across the organization, and enhancing record requests, thus improving organizational efficiency.
 
Recommendations: Always specify a maximum limit for returned results. Searches are translated into queries against the object store databases. Long-running, non-optimized queries negatively affect overall system performance.
Do not use wildcard searches because they can create problems, such as database table locks, and performance issues.
Search is not a reporting tool and you must not use it that way.
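As an illustration of the first recommendation, the following sketch builds the text of a property search that always carries a row limit. It only assembles a query string. The TOP clause, the selected Id and DocumentTitle properties, and the LoanApplication and AccountNumber names in the usage note are assumptions for the example, so check the query syntax reference for your release before you rely on them.

```python
import re

# Symbolic names are restricted to letters, digits, and underscores here so
# that they can be placed in the query text safely (an assumption of this sketch).
_SYMBOLIC_NAME = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_property_search(class_name, prop, value, max_rows=100):
    """Build query text for an equality search on one property, capped at max_rows."""
    for name in (class_name, prop):
        if not _SYMBOLIC_NAME.match(name):
            raise ValueError(f"not a valid symbolic name: {name!r}")
    if max_rows < 1:
        raise ValueError("max_rows must be positive")
    literal = value.replace("'", "''")  # escape embedded single quotes
    return (f"SELECT TOP {max_rows} Id, DocumentTitle "
            f"FROM {class_name} WHERE {prop} = '{literal}'")
```

For example, build_property_search("LoanApplication", "AccountNumber", "A-100", 50) produces a query that returns at most 50 rows. An exact-match predicate is used on purpose, in line with the advice to avoid wildcard searches.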
APIs
Content Platform Engine has a collection of available development and integration tools.
Depending on your enterprise strategy and architecture, you can use any of the following APIs for application development:
Content Engine Java API: Provides access to the full content capabilities of Content Platform Engine
Content Engine .NET API: Is the functional equivalent of Content Engine Java API for .NET application development
Content Management Interoperability Services API: Is an open OASIS standard that enables applications to work with one or more content management systems by defining a standard domain model and a standard set of services
Process Java API: Provides classes for all workflow and business process management features
Process Engine REST Service: Is used from custom applications to perform fundamental business process management operations
Web Services: Provides access to most of the same functionality as the Content Engine and Process Java APIs
 
Business benefits: The P8 Content Manager APIs provide a way to integrate with other line of business applications, improve the user experience by offering functionality, and access multiple systems transparently.
Content storage management
P8 Content Manager supports storing content in a file system, a fixed content storage system, and the object store database. Depending on the requirements of your solution, choose one or more of these options for content storage.
Database storage is useful when you only need to store a few, small documents. There are performance advantages to storing smaller documents (less than 10 MB) in a database storage area when compared to other storage area types. Avoid storing any document that is over 100 MB in a database storage area. The main benefit of database storage is that backups are much simpler because your document content is backed up along with your normal object store database backup.
File and fixed storage areas are the preferred medium when storing large numbers of files with high ingestion rates.
File storage areas use a directory structure on the file system to store a document’s content. Documents are stored among the directories at the leaf level by using a hashing algorithm to evenly distribute files among these leaf directories.
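The idea of spreading content files across leaf directories by hashing can be sketched as follows. This is an illustration of the general technique only: the hash function, the directory naming, and the fanout and levels parameters are assumptions, not the algorithm that Content Platform Engine uses.

```python
import hashlib
from pathlib import PurePosixPath


def leaf_directory(root, document_id, fanout=16, levels=2):
    """Map a document ID to one of fanout**levels leaf directories under root."""
    digest = hashlib.sha256(document_id.encode("utf-8")).digest()
    # One digest byte selects the directory at each level of the tree.
    parts = [f"dir{digest[level] % fanout:02d}" for level in range(levels)]
    return PurePosixPath(root, *parts)
```

Because the path depends only on the document ID, the same document always resolves to the same leaf directory, and a well-mixed hash keeps the number of files per directory roughly even as the storage area grows.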
Fixed storage areas are used to store documents in external repositories, such as IBM Tivoli Storage Manager, EMC Centera, and Network Appliance SnapLock. There are two scenarios for the integration of P8 Content Manager with those external repositories. In the first scenario, the document content is managed by Content Manager and the external device is used only as a storage device. In the second scenario, fixed storage areas are used with federation when content is stored in an external repository. In this scenario, the document and its associated metadata can be accessed as native P8 documents, in addition to their accessibility via the source repository.
P8 Content Manager also provides the following features for storage management:
Bulk content move: With the bulk content move sweep job, you can move content from one storage area to another. There is also a Move Content API method, which can be used from other applications to move content from one storage area to another. Content can be moved from any storage area type (database, file system, or fixed content) as long as content is not under device-level retention.
Content caching: Content from database, file, and fixed storage areas can be cached on a cache server. For frequently accessed content, content caching provides a faster response time in content retrieval. Content caching also benefits geographically distributed systems and systems with hierarchical storage devices by storing copies of content local to where they are accessed most often.
Content de-duplication: P8 Content Manager supports de-duplication of the content that is ingested from various sources. This feature saves storage costs for duplicate content that is saved in the repository.
Content compression: Content Manager supports the compression of the content that is stored in a storage area. The compression is transparent to the client that is storing or retrieving the content. This feature saves storage costs but it has performance implications because the compression and decompression of the content occurs when the content is stored or retrieved.
Content encryption: This feature helps protect the documents in the storage areas from unauthorized access directly to the storage area. The encryption and decryption of the content is performed automatically by the Content Platform Engine. The entire process is transparent to the users, but it imposes a performance penalty for encryption when content is stored and for decryption when content is retrieved.
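The de-duplication and compression features in this list can be pictured with a small in-memory model. It is a conceptual sketch only: the class name, the use of a SHA-256 digest to detect duplicates, and zlib compression are assumptions for illustration and do not describe how Content Platform Engine implements these features.

```python
import hashlib
import zlib


class StorageAreaModel:
    """Toy storage area that keeps identical content once and compresses it."""

    def __init__(self):
        self._blobs = {}   # content digest -> compressed bytes
        self._index = {}   # document ID -> content digest

    def store(self, doc_id, content):
        digest = hashlib.sha256(content).hexdigest()
        if digest not in self._blobs:            # identical content is kept once
            self._blobs[digest] = zlib.compress(content)
        self._index[doc_id] = digest

    def retrieve(self, doc_id):
        # Decompression happens here, so the caller always sees the original bytes.
        return zlib.decompress(self._blobs[self._index[doc_id]])

    def blob_count(self):
        return len(self._blobs)
```

The model also shows where the cost falls: every store operation hashes and possibly compresses the content, and every retrieve operation decompresses it, which is the performance trade-off that the compression item describes.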
 
Business benefits: Content Manager storage management provides options to satisfy a wide range of functional and technical requirements.
 
Recommendations: Encrypt all the sensitive content of your organization to ensure privacy. Encryption is defined on the storage area level. Enabling the encryption on a storage area does not encrypt the content that is already stored in that storage area.
If your organization is geographically dispersed, use content caching and minimize the network traffic for content retrieval.
If you are using hierarchical storage devices with tape support, such as IBM Tivoli Storage Manager, consider using the content caching capability to compensate for slow content retrieval from tapes. With content caching, the initial retrieval might time out, but the content cache continues to retrieve the content so that the content is available in the cache for the next user request.
If you are using records management, be careful with the content deduplication feature of Content Manager.
Retention management
Retention management is an event-based retention infrastructure that can define object-level retention policies. It is supported for documents, annotations, folders, and custom objects.
A retention management automatic deletion and disposal policy defines the rules for when objects are automatically deleted.
The policy has these characteristics:
Can apply to any searchable repository object.
Allows documents, folders, and custom root classes that have a retention date in the past to be deleted.
Allows custom root classes that have a closure date in the past to be deleted.
Can delete queue items that have reached the maximum failure count more than one month in the past.
Is based on values assigned to the CmRetentionDate system property.
Can also be based on system-defined or user-defined properties.
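The core rule of such a policy, selecting objects whose retention date is in the past, can be sketched as a simple filter. The dictionary layout and the function name are assumptions for illustration. Only the CmRetentionDate property name comes from the text above, and an actual disposal policy runs inside Content Platform Engine rather than in client code.

```python
from datetime import datetime, timezone


def eligible_for_disposal(objects, now=None):
    """Return the IDs of objects whose CmRetentionDate is in the past."""
    now = now or datetime.now(timezone.utc)
    return [
        obj["id"]
        for obj in objects
        # Objects without a retention date are left alone in this sketch.
        if obj.get("CmRetentionDate") is not None and obj["CmRetentionDate"] < now
    ]
```

Passing the current time as a parameter keeps the rule easy to test against fixed dates.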
 
Business benefits: Implementing a retention management scheme helps organizations meet organizational, business, regulatory, and legislative requirements.
 
Recommendations: Always define a retention period for the content that is considered critical for your organization. Retention periods that are defined on Content Manager need to match the retention period of your records management requirements. Also, by defining a retention period, you can decrease the volume of information on your Content Manager repository and provide only relevant information to the business users of your ECM system.
Social ECM
P8 Content Manager can be used for the enablement of social collaboration, social content management, and integration with IBM Connections.
P8 Content Manager supports the following social ECM features:
Ability for users to recommend a business object.
Ability for users to comment on managed objects. Comments can only be created by authorized users.
Ability for users to “follow” updates to business objects.
Social tagging of managed objects.
Activity stream generation for business objects. Activity streams provide a syndicated view of updates to the content, including notifications and recommendations.
Tracking the number of downloads of a document.
Large content streaming.
Thumbnail generation and storing.
User-centric recycle bin for deletion and recovery of documents.
Foundation components provide strong building blocks for enterprise ECM solutions for organizations across multiple industries.
13.1.2 Content ingestion tools
IBM FileNet offers several content ingestion (capture) applications, each of which is designed for capturing a different type of media. The capture applications listed in this section can handle the following media types:
Paper documents
Faxes
Office (electronic) documents, presentations, or spreadsheets
PDF, web, .txt files, or multimedia files
Email messages
Documents stored on network drives or desktops
Documents stored in remote repositories, such as IBM FileNet Image Services, IBM Domino®, OpenText, Microsoft SharePoint, or EMC Documentum
IBM Datacap Taskmaster Capture
IBM Datacap Taskmaster Capture is used as a capture portal for all the documents that are managed by an organization. IBM Datacap Taskmaster Capture offers these features:
Scans and verifies all paper-based documents
Supports multiple recognition engines that are configurable with rules for data extraction from scanned documents
Supports bar code recognition for automatic classification and indexing
Supports web-based remote scanning and verification by using a browser
Supports the import of documents from file systems
Bulk Import tool
Bulk Import is a tool that is designed to help organizations move document images into Content Manager even when those images are created by third-party tools or external processes. With this tool, you can store document images from files at a rate of more than a million documents per day. It also can assemble documents from more than one image, create batches of documents, and assign metadata and indexing properties to documents.
Microsoft Office Integration facility
IBM Content Navigator and FileNet Workplace XT integrate with Microsoft Office applications to help the casual business users manage documents, emails, and attachments. By using the Microsoft Office Integration facility, business users can add new documents to an object store and retrieve and update existing content. This integration provides access to most ECM features, such as hierarchical folder navigation, version control, metadata management, security, and even approval workflows.
 
Recommendations: For casual users who use only Microsoft Office content, consider this feature of Content Manager as your primary option.
IBM Content Collector for Email
IBM Content Collector for Email helps organizations gain control of email in order to meet records management obligations, connect email to business processes, and manage storage space.
With Content Collector for Email, you can monitor the incoming and outgoing emails of an organization and, based on rules, archive the messages in Content Manager. It also supports the extraction of message attachments and storage, based on user-defined rules, such as keywords and email addresses, to Content Manager.
It also helps organizations to reduce the storage requirements for their email management system by leveraging the storage management features of Content Manager.
IBM Content Collector for SAP
IBM Content Collector for SAP provides outbound archiving and retrieval of SAP generated business documents, SAP reports, and database data. It provides inbound archiving and retrieval for external documents, such as invoices.
Content Collector for SAP reduces operational costs by managing the growth of SAP application data through archiving. It increases the efficiency of SAP users and business processes by linking relevant content to SAP transactions.
IBM Content Collector for SAP provides two options for integration with the SAP system. One option is the P8 client that can link documents and folders selected by a user to an SAP transaction. The second option is called Utility Client. Utility Client can link an archived document on Content Manager by processing bar codes or by creating work items in the SAP workflow.
IBM Content Collector for Files
IBM Content Collector for Files helps organizations manage documents on network file shares and provides tools to help users comply with corporate and regulatory policies.
IBM Content Collector for Files automatically captures documents that are placed on a monitored location of a file share and uses the Content Manager archiving and deduplication capabilities to help reduce infrastructure costs that are associated with the management of file systems. It permits advanced file handling based on rules. It can be configured to leave documents in place or move them and replace them with shortcuts.
IBM Content Collector for SharePoint
IBM Content Collector for SharePoint provides collection and archiving. It also provides extended enterprise content management and business process management capabilities for the SharePoint content.
IBM Content Collector for SharePoint collects and archives content from SharePoint document libraries, wikis, and blogs and automatically classifies it. Collected documents are replaced by a shortcut in SharePoint. It can also be configured to declare collected content as a record by using IBM Enterprise Records.
Content Federation Services
Content Federation Services (CFS) unifies content from different repositories into one or more object stores so that the content and associated metadata can be accessed by using the P8 Content Manager suite of products.
With CFS, you can put metadata from heterogeneous repositories into object stores so that the metadata is available to all users. You can also manage the full lifecycle of the digital content, enforce records management policies, and provide an enterprise-wide content search mechanism regardless of the repository in which the content is stored.
 
Business benefits: Content ingestion tools help make the collection of data easy and quick. They simplify information search and can automatically extract data from documents during the ingestion process. By using the content ingestion tools, the indexing process can be standardized, simplified, and made independent of organizational changes.
13.1.3 Process management
IBM FileNet offers various solution building blocks for process management.
Events and subscriptions
Events provide a mechanism to initiate actions that are executed when objects are created in, modified in, or deleted from an object store. A subscription is the association of a particular event trigger with an event action. Many subscriptions can be associated with an event trigger.
Subscriptions can be associated with a class so that they apply to the class itself or to all instances of all objects of that class type, or they can be associated with a specific object. Event subscriptions can be executed synchronously or asynchronously. When set to run synchronously, the object action (for example, create or update) and the operations of the event actions are completed as a single transaction; failure in either results in the rollback of both operations. When set to run asynchronously, the object action and the event action operation run as separate transactions; in this case, the object operation can succeed independently of the event action operation.
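The difference between the two execution modes can be sketched with a toy object store. This is a conceptual model and not the Content Engine event API: the class, its method names, and the snapshot used as a stand-in for a database transaction are assumptions for illustration.

```python
class ObjectStoreModel:
    """Toy store that contrasts synchronous and asynchronous event actions."""

    def __init__(self):
        self.objects = {}
        self.subscriptions = []    # (handler, synchronous) pairs
        self.async_queue = []      # work deferred to a separate transaction
        self.async_failures = []

    def subscribe(self, handler, synchronous):
        self.subscriptions.append((handler, synchronous))

    def create(self, obj_id, props):
        snapshot = dict(self.objects)            # stand-in for a transaction
        queued = []
        self.objects[obj_id] = props
        try:
            for handler, synchronous in self.subscriptions:
                if synchronous:
                    handler(obj_id, props)       # same transaction as the create
                else:
                    queued.append((handler, obj_id, props))
        except Exception:
            self.objects = snapshot              # roll back the object action too
            raise
        self.async_queue.extend(queued)          # deferred only after a commit

    def run_async(self):
        while self.async_queue:
            handler, obj_id, props = self.async_queue.pop(0)
            try:
                handler(obj_id, props)
            except Exception as exc:             # the object stays in the store
                self.async_failures.append((obj_id, str(exc)))
```

With a failing handler, a synchronous subscription leaves no object behind, whereas an asynchronous subscription leaves the object in place and records the failure separately. That difference is the main design consideration when you choose between the two modes.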
 
Business benefits: Subscriptions can streamline the business process and make all integration transparent to the business user.
 
Recommendations: Subscriptions provide a powerful way to activate your content. You can define a subscription so that a workflow is launched when a document is created or modified. For example, when a loan application document is created, a loan approval workflow is initiated, a loan folder is created, the client information is retrieved from core systems, and a custom object is created with the information.
Change preprocessors
Change preprocessors are action handlers that change new or updated objects before they are saved to the Content Manager. Change preprocessor handlers are associated with a class definition. When an object of that class is saved, the action handler is triggered.
Change preprocessors allow object modifications that are difficult or impossible to accomplish by using event action handlers. For example, a change preprocessor can alter a modifiable-only-on-create property because those properties cannot be altered after the object is saved.
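The following sketch models why a change preprocessor can do what a later handler cannot: it runs before the object is persisted, while the create-only property is still open for modification. The store dictionary, the save function, and the IntakeChannel property are hypothetical names for this illustration and are not part of the product API.

```python
# Conceptual model only; real change preprocessors are handlers that are
# associated with a class definition in Content Platform Engine.
CREATE_ONLY = {"IntakeChannel"}   # hypothetical modifiable-only-on-create property


def stamp_intake_channel(props, is_new):
    """Hypothetical preprocessor: fill in the create-only property before the save."""
    if is_new and "IntakeChannel" not in props:
        props["IntakeChannel"] = "scanner"


def save(store, obj_id, props, preprocessor=None):
    """Run the preprocessor, enforce create-only properties, then persist."""
    is_new = obj_id not in store
    pending = dict(props)
    if preprocessor is not None:
        preprocessor(pending, is_new)            # changes the object before it is saved
    if not is_new:
        merged = {**store[obj_id], **pending}
        for name in CREATE_ONLY:
            if merged.get(name) != store[obj_id].get(name):
                raise PermissionError(f"{name} can be set only on create")
        pending = merged
    store[obj_id] = pending
```

In this model, the preprocessor sets IntakeChannel during the first save, and any later attempt to change the value is rejected.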
Content lifecycle management
The content lifecycle is a series of sequential states that a document goes through during its life.
A content lifecycle definition consists of two objects:
Lifecycle policy: Identifies the allowed document states. The policy also identifies the lifecycle action that executes in response to state changes.
Lifecycle action: Custom action that the system performs when a document moves from one state to another. A lifecycle action is typically coded by a developer, but managed by an administrator. The custom actions handle the following state changes:
 – Promote: Moves the document forward to its next lifecycle state.
 – Demote: Returns the document to its previous lifecycle state.
 – Reset: Returns the document to its first lifecycle state with each new version.
 – Set in exception mode: Prevents the document from changing lifecycle state.
 – Clear from exception mode: Enables the document to change lifecycle state.
The content lifecycle defines simple processes related to managing a document’s lifecycle. For more complex document lifecycle actions, use IBM Case Foundation.
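The state changes listed above behave like a small state machine. The sketch below models them under our own assumptions: the class name, the error type, and the decision to block reset while in exception mode are illustrative and do not reproduce the product's lifecycle classes.

```python
# Hypothetical state machine for the lifecycle state changes; not the product API.
class LifecycleError(Exception):
    pass


class DocumentLifecycle:
    def __init__(self, states):
        self.states = list(states)  # ordered states defined by the lifecycle policy
        self.index = 0
        self.in_exception = False

    @property
    def state(self):
        return self.states[self.index]

    def _check(self):
        # Assumption: exception mode blocks every state change, including reset.
        if self.in_exception:
            raise LifecycleError("document is in exception mode")

    def promote(self):
        self._check()
        if self.index == len(self.states) - 1:
            raise LifecycleError("already in the last state")
        self.index += 1

    def demote(self):
        self._check()
        if self.index == 0:
            raise LifecycleError("already in the first state")
        self.index -= 1

    def reset(self):
        self._check()
        self.index = 0

    def set_exception(self):
        self.in_exception = True

    def clear_exception(self):
        self.in_exception = False
```

For example, a policy with the states Draft, Pending approval, and Approved can be promoted twice, demoted once, and reset to Draft when a new version is added.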
 
Business benefit: Ability for a group of people to work on the same document with proper version control and avoid duplication, providing real-time collaboration and information sharing to improve and streamline business processes.
 
Recommendations: For simple serial workflows, use content lifecycle management.
Business process management
IBM Case Foundation is a package that includes P8 Content Manager. Case Foundation can create, execute, manage, analyze, and simulate business processes (also referred to as workflows) that are performed by users or applications.
By creating a workflow definition, you define the activities and resources that are needed to complete a business process. A workflow definition is a series of steps connected by a series of routes that defines the sequence in which the steps are executed. Workflow definitions can contain several maps and submaps that can group related steps.
Steps in the workflow represent a business task or a system activity. Steps can be executed by a user, a group of users, or by an automated application. Workflow steps can run in parallel to facilitate efficient processes.
Routing defines the order in which the steps are executed. Routing can be based on a specific rule or events. Except for the last step, every step has one or more routes that lead from it. Routes can be defined so that they are always taken or followed only if a condition is met.
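Unconditional and conditional routes can be sketched as a list of (source, target, condition) entries. The step names, the amount field, and the next_steps function are hypothetical; a real workflow definition is built in the process design tools, not in code like this.

```python
# Hypothetical routing table; step names and the "amount" field are illustrative.
def next_steps(routes, current, fields):
    """Return the steps reached from `current`.

    A route whose condition is None is always taken; otherwise it is
    followed only if the condition evaluates to True for the data fields.
    """
    return [
        target
        for source, target, condition in routes
        if source == current and (condition is None or condition(fields))
    ]


ROUTES = [
    ("Launch", "Review", None),
    ("Review", "Manager approval", lambda f: f["amount"] > 10000),
    ("Review", "Auto approval", lambda f: f["amount"] <= 10000),
    ("Review", "Notify applicant", None),
]
```

When more than one route leaves a step and their conditions are met, the resulting steps run in parallel. In this sketch, the final step Notify applicant has no outgoing routes.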
You can use deadlines and timers to ensure that work is processed in a timely manner. A deadline provides a time-based scheduling constraint, which requires that a step or workflow is completed within a certain amount of time. The deadline can be relative to the time that the step was routed to the participant or to the time that the workflow was launched. A participant with a deadline can receive a reminder of the pending deadline through an email message. When the deadline is passed, a visual reminder displays in the participant’s inbox, and an email can be sent to a configurable list, such as one or more supervisors. The distribution list can be specific to each work item. This automatic escalation helps ensure that functions or processes are completed on time without tying up resources to continuously monitor system activities.
A timer indicates a time during which you want a specified series of steps to process. If the timer expires before this processing completes, processing proceeds to another workflow map that provides alternate processing of the work.
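The deadline rules described above can be expressed as a short calculation. This is a sketch only: the function name, the four-hour reminder window, and the status labels are assumptions made for illustration.

```python
# Hypothetical deadline check; names and the default reminder window are assumptions.
from datetime import datetime, timedelta


def deadline_status(now, launched_at, routed_at, limit,
                    relative_to="step", reminder_before=timedelta(hours=4)):
    """Classify a work item as on_time, reminder, or expired.

    relative_to="step" measures the limit from the time the step was routed
    to the participant; any other value measures it from the workflow launch.
    """
    start = routed_at if relative_to == "step" else launched_at
    due = start + limit
    if now >= due:
        return "expired"  # visual reminder and escalation email apply
    if now >= due - reminder_before:
        return "reminder"  # participant is reminded of the pending deadline
    return "on_time"
```

The same work item can be on time relative to the step and already expired relative to the workflow launch, so the reference point is a design decision for each deadline.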
 
Recommendations: For complex content-centric processes, use the process management capabilities of Content Manager. Examples of content-centric processes are loan origination processes and insurance claim processes. Using Content Manager process management, you can activate the organizational content and take control over processes that involve documents.
13.1.4 Presentation features
Several P8 Content Manager features are available to present content to your users. IBM FileNet P8 presentation features include options for converting active content to the following formats:
Native content format
PDF
HTML
Darwin Information Typing Architecture (DITA) documents
Annotations
Native content format
FileNet Content Manager stores the content in the native content format (format in which the content is created). When the content is retrieved from Content Manager, the content stream is passed to the client and managed by the associated client application. Content also can be displayed on the requesting client by using the embedded Content Manager Viewer.
 
Recommendations: Try to use the embedded Content Manager Viewer as much as possible. It provides multiple views of the content according to user preference and IT permission sets and it ensures security. It makes the ECM experience positive and transparent to the users by solving issues such as where and how to store documents.
PDF and HTML presentation
The Rendition Engine is a P8 Content Manager add-on that facilitates document publishing. Publishing a document converts it into PDF or HTML format or generates a replica of the document in either format. The replica, which is known as the publication document, can have its own security and property settings. Publishing can be triggered by event actions or by changes in a document’s lifecycle state. When a document reaches the “released” lifecycle state, for example, a PDF version can be automatically created with public-view security rights.
Published documents have these characteristics:
Can continue to exist after the source document is deleted based on the assigned retention schedule and business rules
Can be automatically deleted when the source document is deleted
Are not changed when their source documents are changed
Can exist in a different folder than the source documents
Can have a different file format than the source documents; for example, the source document can be a Word document, and the publication document can be an HTML document. Publishing options are defined by individual templates.
Can originate as Microsoft Office (for example, Word, Excel, and PowerPoint) documents and be rendered to PDF or HTML
DITA documents
DITA is an XML-based open standard for developing, managing, structuring, and publishing content. IBM originally developed DITA for more efficient reuse of content in product documentation. IBM donated DITA work to the Organization for the Advancement of Structured Information Standards (OASIS) for further development and public release.
The content can be composed based on the DITA model that allows content to be linked to multiple topics. After the content is reviewed and approved, it is published to allow business users to perform searches and to navigate around the content.
The two central units of authoring in DITA are the topic and the map. A map combines multiple topics into a single structure. The same topic might appear in different manuals and in multiple sections of the final document. Maps are XML documents that consist of links to topics and metadata; maps do not hold content themselves. DITA content (topics and maps) is rendered into PDF and HTML.
Storing each piece of content in a separate file allows users to check out, revise, check back in as a new version, and reuse the single source material in multiple locations.
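To make the relationship between maps and topics concrete, the following sketch parses two small maps and finds where a topic is reused. The map and topicref elements and the href attribute are part of DITA; the file names, the map contents, and the helper functions are invented for illustration, and the maps omit the DOCTYPE declaration for brevity.

```python
# Illustrative DITA maps; topic file names and helper functions are hypothetical.
import xml.etree.ElementTree as ET

USER_GUIDE = """<map><title>User guide</title>
  <topicref href="install.dita"/>
  <topicref href="safety.dita"/>
</map>"""

ADMIN_GUIDE = """<map><title>Administrator guide</title>
  <topicref href="safety.dita"/>
  <topicref href="backup.dita"/>
</map>"""


def topics_in(map_xml):
    """Return the topic files that a map links to, in document order."""
    root = ET.fromstring(map_xml)
    return [ref.get("href") for ref in root.iter("topicref")]


def maps_using(topic, maps):
    """Return the names of the maps that reference the given topic file."""
    return [name for name, xml_text in maps.items() if topic in topics_in(xml_text)]
```

Because the safety topic is stored once and referenced from both maps, a new version of that single file is picked up by both publications.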
Next, we review the sample use cases from Chapter 2, “Solution examples and design methodology” on page 17. We use the available components of Content Manager and explain how those components are used to address business requirements.
13.2 Sample use cases using solution building blocks
We explore various IBM FileNet Content Manager sample solutions through use case descriptions. These use cases correspond to the use cases that we introduced in Chapter 2, “Solution examples and design methodology” on page 17.
13.2.1 Policy document creation use case
The use case concerns policy or procedures and safety documentation. The departments that are in charge of the policy need to go through a simple lifecycle process to develop and publish a new policy on the web or to update an existing policy. This process includes transforming an idea into policy, implementing the policy actions, and then evaluating and measuring policy performance. This example, which is illustrated in Figure 13-3 on page 450, is a typical first project when introducing P8 Content Manager into a company.
The use case has the following requirements:
Documents can be authored, reviewed, approved, or released.
There are four roles (author, reviewer, approver, and user).
Each role has certain permissions.
Users can only see approved documents and always the latest version.
Changes to documents need to be audited.
Documents must be filed in folders according to their classification.
Policy documents exist in a shared folder on the network.
Users must be able to search documents based on their properties or words inside the content.
A PDF version of the policy is published to the intranet/Internet.
Figure 13-3 Policy document review and approval use case
Based on the requirements, this use case uses these components:
Foundation components:
 – Document classes for classification of documents
 – Folders for document classification
 – Minor and major versions
 – Security based on version state
 – Auditing for document changes
 – Property-based search and content-based retrieval for locating documents
Content ingestion tools:
 – Microsoft Office integration for storing documents directly from Microsoft Office applications
 – IBM Datacap Task Master for scanning paper-based documents and importing already created policy documents from a shared folder
Process management:
 – Content lifecycle management for the different states of the policy approval process
Presentation features:
 – Native content format
 – Rendition Engine for PDF output
Solution details
As described in the requirements, authorized users must be able to check out policy documents from the repository, change the content, and put it back in the repository as a new version. For this requirement, we use P8 Content Manager checkout, checkin, and versioning capabilities and Microsoft Office integration. When a new version is added to the repository, we use the content lifecycle management capabilities of P8 Content Manager to assign the lifecycle state Pending approval to the document. Approvers can check out the document, review the content, and change it, creating a new version of the document. They can return the policy document to the previous state with comments for the author by adding a new minor version, or they can approve the document and put it in the approved state by adding a new major version.
The requirements also call for the user community to easily locate the policy documents that are stored in an object store. For that purpose, we use the object store foldering capabilities. We create logical folders for the policy documents where users can file them according to their classification (for example, Human Resources policies or procurement policies). We also give users the ability to search for policy documents based on their properties. The P8 Content Manager search capabilities allow you to search for documents by using any combination of document properties (for example, Human Resources policies that were published two years ago).
According to the requirements, general users must have access only to approved documents. Authors and reviewers must also have access to draft documents.
We use object store security to present the draft documents to the special users that create, review, and approve the policy documents.
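The visibility rule for this use case can be sketched as a function over the version history. This is a simplified model under our own assumptions: versions are represented as (major, minor) pairs, where a minor number of 0 marks a released major version, and the role names come from the requirements. It does not show how the security is configured in the product.

```python
# Hypothetical visibility rule; versions are (major, minor) tuples in creation order.
PRIVILEGED_ROLES = {"author", "reviewer", "approver"}


def visible_version(versions, role):
    """Return the version a role sees, or None if nothing is visible.

    Privileged roles see the latest version, including drafts (minor versions).
    General users see only the latest released major version, such as (1, 0).
    """
    if role in PRIVILEGED_ROLES:
        return versions[-1] if versions else None
    released = [v for v in versions if v[0] > 0 and v[1] == 0]
    return released[-1] if released else None
```

For a history of 0.1, 0.2, 1.0, and 1.1, an author works on 1.1 while a general user continues to see 1.0 until the next major version is approved.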
Also, there is a requirement that an approved policy document must be published to the company’s intranet site as a PDF document. For this requirement, we use IBM Rendition Engine for PDF generation and publishing.
Finally, there is the need to import all already created policy documents to Content Manager. For this step, we use the file import capabilities of IBM Datacap Task Master.
13.2.2 Insurance claim processing use case
This use case concerns the handling of an insurance claim. The claims processing department needs to handle the documents that are associated with a claim. The compliance department needs to declare records after the claim is closed and define the records retention period. Paper-based documents that are sent to the claims department need to be scanned. A tight integration with the core insurance claim application is needed. This example, which is illustrated in Figure 13-4 on page 453, presents a typical scenario of a use case with significant integration with external systems and other content repositories.
This use case has the following requirements:
Integration of core insurance application with Content Manager.
Folders for document organization.
Document scanning and optical character recognition (OCR) of scanned documents.
Email capture of claim-related documents.
Events for document indexing from the Claim Management System and exception process notification.
Records management for claim document declaration as enterprise records and to set retention.
Documents must be available for viewing by authorized users during the claims handling process.
Insurance policy documents from other repositories are displayed.
Figure 13-4 Insurance claim processing use case
Based on the requirements, this use case uses these components:
Foundation components:
 – Records folders for records filing
 – Records management for insurance claim documents
 – Annotation for marking and highlighting specific parts of the documents
 – Content Manager APIs for integration with the Claim Management application for folder creation and properties update
 – Events for task initiation on the Claim Management System
 – Security for displaying the documents of each claim only to authorized user groups
Content ingestion tools:
 – IBM Datacap Task Master for document capture and OCR
 – Email archiving
 – Content federation services for content that is stored in other repositories
Process management:
 – Events for notification of content addition on the Content Manager repository
Presentation features:
 – Content display in native format in claim processing department
Solution details
This complex solution uses many features of P8 Content Manager.
As described in the requirements, after claim notification, a new claim must be opened in the core Claim Management System. New claim registration on the core system triggers the creation of a new folder in IBM Enterprise Records. In that folder, all records objects will be created for the claim-related documents that are stored in the Content Platform Engine repository. For that requirement, we use Content Manager APIs for integration with the Claim Management System.
When a paper document arrives, we use IBM Datacap Task Master to scan that document. By using OCR/intelligent character recognition (ICR) capabilities, we retrieve the claim number from the paper documents. Using that claim number, along with other indexing information, such as the document type, the document is stored in the Content Manager repository and filed under the claim folder. A notification is sent to the Claim Management System. For the notification of the Claim Management System, we use Content Platform Engine events to execute certain code when a document is added in the ECM system.
During the claim lifecycle, the Claim Management System updates documents and folders on the ECM system by using the Content Manager APIs with the current claim status.
Due to sensitive personal information in the document, only authorized users must have access to claim documents based on the claim status. For that requirement, we use object store security and marking sets that control the access to the document based on a property value (claim status).
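The idea of combining object security with a property-driven marking can be sketched as follows. This is a deliberately simplified model: the group names, the claim status values, and the can_view function are hypothetical, and the sketch does not reproduce the product's actual evaluation rules for marking sets.

```python
# Hypothetical marking table; groups and claim status values are illustrative only.
MARKINGS = {
    "Open": {"claims_handlers", "claims_supervisors"},
    "Under investigation": {"claims_supervisors", "fraud_unit"},
    "Closed": {"records_managers"},
}


def can_view(user_groups, document_acl, claim_status):
    """Grant access only if both the document security and the marking allow it."""
    allowed_by_acl = bool(user_groups & document_acl)
    allowed_by_marking = bool(user_groups & MARKINGS.get(claim_status, set()))
    return allowed_by_acl and allowed_by_marking
```

When the Claim Management System changes the claim status property, the effective access changes without any update to the security of the individual documents.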
According to the requirements, the Claim Management System users must have access to insurance policy documents that are stored in a different Content Manager repository. For that requirement, we use Content Federation Services to integrate the claim repository with the insurance policy repository.
Users must be able to provide casual information around the documents, such as comments, or highlight a specific portion of a document that contains critical information. For that capability, we use Content Manager annotations over the documents.
When the claim is closed in the Claim Management System, a notification needs to be sent to trigger a retention for the claim documents. For that requirement, we use the IBM Enterprise Records features and APIs.
13.2.3 SAP invoice archiving use case
This use case covers the paper invoice data extraction and archiving. The accounts payable department needs to process many paper invoices, update the Enterprise Resource Planning (ERP) system and control the invoice processing. Paper invoices that are sent to the accounts payable department need to be scanned, the information needs to be extracted, and an update must be made to the SAP system with the information on the paper invoice. Furthermore, the scanned image needs to be associated with the corresponding SAP transaction for auditing reasons. The example is illustrated in Figure 13-5 on page 456.
This use case has the following requirements:
Scanning paper documents.
Data extraction from scanned documents.
Extracted data must be validated against the SAP system.
Data must be confirmed by authorized employees of the accounts payable department.
A record of the invoice must be created on the SAP system and the scanned document must be associated with the SAP record.
The scanned image must be available through the SAP system.
Authorized users must be able to search for the invoice document without having access to the SAP system.
Figure 13-5 SAP invoice archiving use case
Based on the requirements, this use case uses these components:
Foundation components:
 – Repositories to store invoice images
 – Content Manager security
 – Property search
Content ingestion tools:
 – Datacap Taskmaster Capture
 – IBM Content Collector for SAP
Presentation features:
 – Native content format with Image Viewer Pro, which supports the scanned image display
Solution details
Based on requirements, all incoming invoices must be scanned on arrival and data must be extracted from the scanned images. For that requirement, we use Datacap Task Master scanning and OCR/ICR capabilities.
Extracted data must be validated by authorized users and the data accuracy must be verified. For that requirement, we use Datacap Task Master validation features.
The scanned images of the invoices must be stored in the Content Manager and become available to authorized SAP users to link those documents to SAP transactions. For that requirement, we use IBM Content Collector for SAP, which provides the functionality for linking Content Manager images to SAP transactions.
SAP users must be able to view the scanned image of the invoice on the related SAP transaction. For that requirement, we use IBM Content Collector for SAP and the native content format viewing presentation feature of Content Manager.
Authorized users must have access to scanned images outside of the SAP system and must be able to search for those images based on their properties. For that requirement, we use Content Manager security and search features.
13.2.4 Email capture for compliance use case
This use case covers the email archiving and records management needs of an organization. Laws in several countries require the preservation, archival, and eDiscovery of emails that are addressed to an organization. Furthermore, email archiving helps reduce the storage requirements for the email management system. This example is illustrated in Figure 13-6 on page 458.
Figure 13-6 Email capture for compliance use case
This use case has the following requirements:
Email to various accounts must be monitored and, based on enterprise rules, must be stored in an object store.
Based on enterprise rules, a record must be declared, and a retention period must be defined.
Employees must have access to their archived mail through the mail client.
Authorized employees from other departments, such as the Legal department, must have access to the archived emails.
Content-based retrievals must be available for locating emails with specific words in the mail body or on the subject line.
Based on the requirements, this use case uses these components:
Foundation components:
 – Object stores
 – Storage management
 – Content-based searches
 – Records management
Content ingestion tools:
 – IBM Content Collector for Email
Solution details
According to the requirements, mail to specific accounts or mail that meets a rule (for example, contains the word “proposal”) needs to be stored in an object store. For that requirement, we use IBM Content Collector for Email. It monitors mailboxes, retrieves email based on business rules, and stores it in an object store.
Some of the emails that are considered special based on business rules are declared as records by using IBM Enterprise Records. Emails are associated with a retention period based on legislative requirements.
Users must be able to see the emails and their attachments in their mail clients, but the content must be stored in an object store. For that requirement, we use the stubbing capabilities of IBM Content Collector for Email. With that capability, the original content of the email is removed from the email server and is replaced by a stub that points to the object store where the content is stored.
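The capture rule and the stubbing behavior can be modeled with plain dictionaries. This is only a sketch: the message layout, the stub_for field, and the function names are hypothetical and do not represent how IBM Content Collector for Email is configured or implemented.

```python
# Hypothetical model of rule-based capture and stubbing; not the Content Collector API.
def matches_rule(message, keyword="proposal"):
    """Business rule from the example: the message mentions the keyword."""
    text = (message["subject"] + " " + message["body"]).lower()
    return keyword in text


def archive_and_stub(mailbox, object_store):
    for message in mailbox:
        if "stub_for" in message or not matches_rule(message):
            continue  # already archived, or not selected by the rule
        doc_id = f"doc-{len(object_store) + 1}"
        object_store[doc_id] = {"subject": message["subject"], "body": message["body"]}
        message["body"] = ""  # the original content leaves the mail server
        message["stub_for"] = doc_id  # the stub points to the archived copy


def open_message(message, object_store):
    """What the mail client shows: stubbed content is fetched from the object store."""
    if "stub_for" in message:
        return object_store[message["stub_for"]]["body"]
    return message["body"]
```

From the user's point of view, an archived message opens the same way as any other message, while the mail server stores only the stub.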
For the emails that are not declared as records, content deduplication and content compression are needed, specifically when an email with a large attachment is sent to multiple recipients within the organization. For that requirement, we use the content deduplication and content compression feature of P8 Content Manager.
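A common way to implement deduplication is to key stored content by a hash of its bytes, so identical attachments are stored once. The sketch below shows that general technique together with compression; the ContentStore class is hypothetical, and the choice of SHA-256 and zlib is our assumption, not a description of how P8 Content Manager implements these features.

```python
# Hypothetical content-addressed store; SHA-256 and zlib are assumptions for illustration.
import hashlib
import zlib


class ContentStore:
    def __init__(self):
        self.blobs = {}  # digest -> compressed content, stored once per unique content
        self.refs = {}   # document id -> digest

    def add(self, doc_id, content):
        digest = hashlib.sha256(content).hexdigest()
        if digest not in self.blobs:
            self.blobs[digest] = zlib.compress(content)  # store only new content
        self.refs[doc_id] = digest
        return digest

    def get(self, doc_id):
        return zlib.decompress(self.blobs[self.refs[doc_id]])
```

When the same large attachment arrives in many mailboxes, each email gets its own reference, but the attachment bytes are stored and compressed a single time.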
Authorized users must have access, outside of their mail clients, to those emails and attachments that are declared as records. In order to locate specific emails and attachments, searches within the content are required. For that requirement, we use the content-based retrieval (CBR) capabilities of P8 Content Manager.
13.2.5 Knowledge management through collaboration use case
This use case covers the collaboration between authors and experts for the development of an organization’s training material and the collaboration between trainees for content tagging, comments, and recommendations. Training material has to be prepared by an author and published to a social community of subject matter experts (SMEs). Using the social features of P8 Content Manager, the SMEs and content authors collaborate and share ideas and comments for the finalization of the training material. When the training material is finalized, it becomes available to the communities in which the trainees participate. In those user communities, trainees can add and view comments and view download counts and recommendations from other members of that community. This example is illustrated in Figure 13-7 on page 460.
Figure 13-7 Knowledge management through collaboration use case
This use case has the following requirements:
Support is needed for multiple versions of the content.
Content must be published on the IBM Connections communities.
Users must be able to put comments, recommendations, and tags on the content and download it.
Users must be able to follow the content and get updated versions or comments over the content.
Activity feeds over the content must be available to users.
The recommendation counts and download counts must be available to the community users.
Based on the requirements, this use case uses these components:
Foundation components:
 – Content repositories
 – Versioning
 – Social ECM features
Solution details
Based on the requirements, users must be able to create different types of training material from Microsoft Word and PDF to video and audio files and store them in Content Manager. For that requirement, we use the versions and folders capabilities.
After storing the content, it must be published in an IBM Connections community with SMEs for review and collaboration. The SMEs can create new versions of the content, write comments on it, and follow the content activity. By using the collaboration capabilities of Content Manager, a final version of the content is produced.
When the final version of the training material is available, that material must be published on an IBM Connections community where users that are members of that community can view and download the content. For that requirement, we use the Content Manager large content streaming capabilities.
Members of the community must be able to tag content and provide comments and recommendations. Also, they need to be able to “follow” the content when it is updated, view activity feeds on that content, and view download counts and recommendation counts. For that requirement, we use the Content Manager Social ECM capabilities.
13.3 Conclusion
In this chapter, we described the main solution building blocks of an ECM system. We described features and characteristics of Content Manager and add-ons that can be used for the implementation of a huge range of applications from small departmental applications to large Enterprise Content Management applications that cross the boundaries of many departments. As a reference, we used these solution building blocks for the implementation of the five use cases that we introduced in Chapter 2, “Solution examples and design methodology” on page 17.
 