SharePoint ECM Technologies

Companies of every size, shape, and market segment need to consider how they will manage all of the unstructured content produced by their employees. ECM systems provide content management in five areas: document management, WCM, records management, forms management, and e-mail archiving. In this section, we’ll review how Windows SharePoint Services 3.0 and SharePoint Server 2007 capabilities can be used to manage both the unstructured and structured content in your organization.

More Info

The Microsoft ECM Team blog is an excellent source of information about how to effectively use SharePoint Products and Technologies to implement ECM in your company. The blog location is http://blogs.msdn.com/ecm.

Document Management

Many administrators ignore document management because they view it as an insurmountable summit. A best practice is to take a simple approach in the beginning and mature your file plan as you come to understand the technology. The files in question are typically created by a word processing program like Word 2007, a spreadsheet program like Office Excel 2007, or a variety of other programs like Adobe Acrobat. These files contain a wealth of information that is critical to the business processes of an organization. SharePoint Products and Technologies provides a variety of features that can be used to catalog, store, and manage the retention and disposal of these documents. In this section, we’ll examine the basic capabilities available for managing unstructured content stored as files in document libraries.

More Info

For a detailed discussion of document libraries, refer to Chapter 8. Only the points relevant to this discussion are included here.

Content Types

A best practice for managing unstructured content is to mandate that the appropriate metadata is collected when the file is first stored in a document library. This metadata can then be used to manage the content programmatically and leveraged by search to find unstructured content that would otherwise be lost to obscurity. In SharePoint Products and Technologies, metadata is stored in data columns, also called Site Columns in the user interface, and these are associated with the documents in a document library. The specific metadata that will be stored with a particular document is defined by the Site Columns associated with that document. There are two ways to encourage, if not enforce, this metadata collection: List Columns and Site Columns used in content types. Content types can also be used to specify the template used to create a new document, a rights management policy, or a workflow for a document.

More Info

For a detailed explanation of content types, see Microsoft Office SharePoint Server 2007 Administrator’s Companion (Microsoft Press, 2007) and Microsoft SharePoint Products and Technologies Administrator’s Pocket Consultant (Microsoft Press, 2007). This section is concerned with how to leverage content types, not what they are.

Custom content types can be created for a site or a site collection by clicking the Create button in the Site Content Type Gallery found on the Site Settings page of every SharePoint Web site. Figure 9-1 shows a Site Content Type Gallery.


Figure 9-1. Site Content Type Gallery

These content types will be available to document libraries and lists in the site where they are created and every site below it. However, you can also define a content type as a SharePoint feature, which allows the content type to be defined centrally but also used in any site or site collection in a SharePoint farm.

Note

Defining a content type as a feature is a best practice for two reasons. First, features can be defined by a central IT staff and deployed to an entire farm using a solution package. This guarantees the availability of identical content types in all site collections. Second, content types deployed as features have the same definition wherever they are used, including the content type ID. This is critical when deploying content across site collection boundaries. A prime example is when you send official records to a Record Center. The inbound content type must match the record routing definition.

More Info

For a more detailed discussion of features and solutions, refer to Chapter 12.

When you create a custom content type, your first step will be to choose the content type from which it inherits. To be able to use it in a document library, a new content type must trace its inheritance back to the document content type. When a content type inherits from another content type, it automatically picks up the Site Columns, rights management policy, document template, and workflows associated with the parent content type. Then you can expand on the original definition by adding to or changing the properties of the inherited content type. Using inheritance, you can create increasingly complex content types without having to start each from scratch.
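To make the inheritance behavior concrete, here is a minimal Python sketch. It is a toy model, not the SharePoint object model: the `ContentType` class, its fields, and the example content types ("Contract", "NDA") are all hypothetical names chosen for illustration. It shows a child picking up its parent's columns and template and then extending or overriding them.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ContentType:
    """Toy model of a content type (hypothetical; not a SharePoint API)."""
    name: str
    parent: Optional["ContentType"] = None
    columns: dict = field(default_factory=dict)  # column name -> required?
    template: Optional[str] = None

    def effective_columns(self) -> dict:
        # Start from the parent's columns, then apply local additions or overrides.
        inherited = self.parent.effective_columns() if self.parent else {}
        return {**inherited, **self.columns}

    def effective_template(self) -> Optional[str]:
        # Use the local template if one is set; otherwise fall back to the parent's.
        if self.template is not None:
            return self.template
        return self.parent.effective_template() if self.parent else None

    def inherits_from(self, ancestor_name: str) -> bool:
        # Walk up the inheritance chain looking for the named ancestor.
        node = self
        while node is not None:
            if node.name == ancestor_name:
                return True
            node = node.parent
        return False


document = ContentType("Document", columns={"Title": False}, template="template.docx")
contract = ContentType("Contract", parent=document, columns={"Counterparty": True})
nda = ContentType("NDA", parent=contract, columns={"Expiry Date": True}, template="nda.docx")

print(nda.effective_columns())
print(nda.inherits_from("Document"))
```

In this model, the `inherits_from("Document")` check mirrors the rule that a content type must trace its inheritance back to the document content type before it can be used in a document library.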

When you create a custom content type, you can also specify columns that will be required metadata when a document is saved or uploaded. By doing this, you can guarantee that specific metadata is available when users are searching for unstructured content. Be careful that you don’t specify too many columns as required when creating a content type. Large amounts of required metadata will make it easier to find content but more difficult for the author to save it. If content is too difficult to save, people will store it elsewhere, and that defeats the purpose of using SharePoint Products and Technologies for ECM. A best practice is requiring no more than five metadata fields when users save documents. You can most certainly have more than five fields to be populated, but don’t create them as required fields.
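If you keep your content type definitions in a planning spreadsheet or script, a small check can flag designs that ask too much of authors. The sketch below is only an illustration: the `MAX_REQUIRED` constant encodes the five-field guideline from this section (it is not a product limit), and the column names are made up.

```python
MAX_REQUIRED = 5  # guideline from this section, not a SharePoint-enforced limit


def required_columns(columns):
    """Return the names of columns marked as required, sorted for stable output."""
    return sorted(name for name, required in columns.items() if required)


def within_required_budget(columns, limit=MAX_REQUIRED):
    """True when the number of required columns does not exceed the guideline."""
    return len(required_columns(columns)) <= limit


# Hypothetical content type design: column name -> required?
press_release = {
    "Title": True,
    "Author": True,
    "Release Date": True,
    "Region": False,
    "Product Line": False,
}

print(required_columns(press_release))
print(within_required_budget(press_release))
```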

You should strive for a balance between easy creation and storage of content and the need to categorize it. One way to balance these goals is to edit the default Document Information Panel (DIP). This is the InfoPath form that is displayed in Office 2007 applications when users save a file. If the document is loaded into a document library or created from a content type within a document library, a generic form is created for each content type that enforces the capture of required properties. By editing the form, you can set custom default values or add validation logic to the form that will enhance the user experience when documents are saved from Office 2007. Microsoft Office can display the DIP without the InfoPath client application, but you will need the full InfoPath 2007 client if you wish to edit a DIP. Figure 9-2 shows a DIP being edited in InfoPath 2007.


Figure 9-2. Editing a DIP requires InfoPath 2007.

If a content type is already in use, you should be very careful about changing its definition. The best practice is to have a content type completely defined before using it. Adding required columns or removing existing columns that contain data can cause unexpected side effects. To prevent this, set content types to Read Only under Advanced Settings of the content type.

Note

When you see Sealed content types in the interface and documentation, it simply means Read Only.

Figure 9-3 shows the radio button that must be clicked to set a content type to Read Only.


Figure 9-3. To seal a content type and prevent inheritance from above, set it to Read Only.

By default, each document library is based on one content type selected when the document library was created. But, by enabling the Allow Management Of Content Types option in the advanced settings of a document library, you can add additional content types to the document library. Using multiple content types in a single document library allows the storage of related material in a single location while preserving a unique set of metadata, including Site Columns, for each document. Each content type can also contain a unique document template, information management policy, and workflow.

Note

If you are struggling with the concept of content types, imagine what your users will go through learning them! A good starting point for learning content types is leveraging them for multiple templates for a single document library. When your users see multiple items in the New drop-down menu of a library, they will be convinced that they need to learn more about content types. Then, continue your end-user education into workflows, Site Columns, and policies.

Later in this section, we will see that content types also provide the basis for specific auditing, labeling, and disposition information management policies. We will also examine how content types are used to file content in a Records Center.

Versioning

Versioning is another SharePoint Products and Technologies feature that is useful when you are managing unstructured content in a document library. By default, versioning is turned off in most document libraries, but can easily be turned on and configured through the versioning settings link in the document library settings panel. In fact, some organizations create custom list definitions for their environment that specify at least one major version to support their backup and restore governance policy. SharePoint supports two different types of versioning. First, there is a simple versioning system that tracks all versions as simply another major version. There is also a second type of versioning that tracks versions as a combination of major and minor version numbers. Figure 9-4 shows the versioning settings panel in document library settings.


Figure 9-4. You can enable major versions only or major and minor (draft) versions.

Using either versioning type will allow you to recover previous versions of a document. If content is accidentally deleted or changed in a document, versioning can provide a quick way to get lost content back without needing to go through a lengthy backup restoration process that can be accomplished only by the server administrators. Both versioning systems also allow you to conserve storage space by limiting how many previous major versions are retained. When you use major and minor versioning, all of the minor versions for a specific major version are kept, but you can control which major versions retain all their minor versions. Versions are always retained as full copies, so make sure you have enough storage space in your SQL server to maintain all of the versions. This is often correlated with site collection quotas, so be sure to include it in your overall design.

Note

SharePoint Products and Technologies does not support differential versioning. A full copy is stored whether it is a major or minor version.
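Because every retained version is a full copy, a back-of-the-envelope estimate is easy to compute. The following Python sketch is a rough planning aid under simplifying assumptions: the file size is treated as constant across versions, and the function and parameter names are our own, not SharePoint settings.

```python
def estimated_storage_mb(file_size_mb, major_versions,
                         minor_versions_per_major=0, majors_keeping_drafts=0):
    """Rough upper bound on storage for one document's version history.

    Assumes every retained version is a full copy of roughly the same size.
    """
    # Drafts can only be kept for major versions that are themselves retained.
    majors_keeping_drafts = min(majors_keeping_drafts, major_versions)
    retained_copies = major_versions + majors_keeping_drafts * minor_versions_per_major
    return retained_copies * file_size_mb


# A 2 MB document, 10 major versions retained, drafts kept for the 3 most
# recent major versions, about 5 drafts per major version.
print(estimated_storage_mb(2, 10, minor_versions_per_major=5, majors_keeping_drafts=3))
```

Multiplying an estimate like this by the expected number of documents gives a starting point for sizing site collection quotas.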

Major and minor version numbering adds a number of additional capabilities. Major versions are considered published versions, while minor versions are designated as draft versions. You can choose whether a document is checked in as a major or minor version. Figure 9-5 shows the panel of versioning choices available when you check in a document.


Figure 9-5. Users select the version when checking in a document.

Only major versions are subject to the approval settings of the document library. You can also designate who can see minor versions. By default, draft versions are viewable by everyone who has access to the library. But you can limit visibility to only those who can edit a document or even to just the original author. Of course, approvers and administrators can see all of the versions. By implementing versioning, content authors can manage their own content efficiently by keeping it both up to date and available.
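The draft visibility rules described above can be summarized in a few lines of logic. This Python sketch is illustrative only; the enum values and parameter names are hypothetical labels for the three choices discussed in the text, and it assumes that a user must at least be able to read the library.

```python
from enum import Enum


class DraftVisibility(Enum):
    ANY_READER = "anyone who can read the library"
    EDITORS = "only users who can edit"
    AUTHOR_ONLY = "only the author"


def can_view_draft(setting, *, can_read, can_edit=False,
                   is_author=False, is_approver_or_admin=False):
    """Decide whether a user can see a minor (draft) version."""
    if not can_read:
        return False
    if is_approver_or_admin:
        return True  # approvers and administrators see all versions
    if setting is DraftVisibility.ANY_READER:
        return True
    if setting is DraftVisibility.EDITORS:
        return can_edit
    return is_author  # AUTHOR_ONLY


print(can_view_draft(DraftVisibility.EDITORS, can_read=True, can_edit=False))
print(can_view_draft(DraftVisibility.AUTHOR_ONLY, can_read=True, is_approver_or_admin=True))
```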

Item-Level Security

Security is also a concern when you need to manage unstructured content. Many of the new legal requirements are directly related to holding companies and individuals accountable for unauthorized access to content. HIPAA and FERPA specifically deal with who should have access to medical and student records, and these records are not limited to structured content stored in databases. These records include all of the documentation produced while creating and maintaining that structured content. For example, the grades recorded in an Excel spreadsheet by a university professor need to be secured, just like the grade records in the university registrar’s database system.

By default, security permissions granted to users or groups in SharePoint Products and Technologies at the site collection level will be inherited, but exceptions can be created by customizing security at any level down to the individual document. Best practice dictates creating a security structure in SharePoint sites that will cover most security concerns automatically and minimize the number of exceptions that must be applied at lower levels. For example, don’t give all users read/write access to a site if only a subset of those users will be adding or editing content on the site. Instead, create a SharePoint group at the site level with contribute permissions, and add the users who need read/write access to that group.

Security in SharePoint Products and Technologies works by bringing together three things: a securable object like a Web site or document, a SharePoint user or group, and a permission level. Collections of individual site, list, and personal permissions—called permission levels—are defined at the site collection level and inherited down to the level of an individual document. The use of appropriately named permission levels such as approve, contribute, or read, makes it easy to maintain security without having to understand the more technical list of thirty-two base permissions. This makes the establishment of descriptively named permission levels and groups essential because content authors will not use what they don’t understand.
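The interplay of securable objects, principals, and permission levels can be sketched as a small Python model. This is a simplified illustration, not the SharePoint security API: the `Securable` class, the base permission names, and the example groups are all assumptions. It models inheritance from the parent until an object is given unique permissions.

```python
# Permission levels bundle base permissions (names here are illustrative).
PERMISSION_LEVELS = {
    "Read": {"view_items", "open_items"},
    "Contribute": {"view_items", "open_items", "add_items", "edit_items", "delete_items"},
}


class Securable:
    """A site, library, or document in a toy permission hierarchy."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        # The root holds its own assignments; children inherit until changed.
        self.assignments = {} if parent is None else None

    def effective_assignments(self):
        if self.assignments is not None:
            return self.assignments
        return self.parent.effective_assignments()

    def break_inheritance(self):
        # Copy the current inherited assignments, then stop inheriting.
        self.assignments = dict(self.effective_assignments())

    def grant(self, principal, level):
        if self.assignments is None:
            self.break_inheritance()
        self.assignments[principal] = level

    def revoke(self, principal):
        if self.assignments is None:
            self.break_inheritance()
        self.assignments.pop(principal, None)

    def allows(self, principal, base_permission):
        level = self.effective_assignments().get(principal)
        return level is not None and base_permission in PERMISSION_LEVELS[level]


site = Securable("Course Site")
site.grant("Professors", "Contribute")
site.grant("Students", "Read")

library = Securable("Shared Documents", parent=site)
grades = Securable("grades.xlsx", parent=library)
grades.revoke("Students")  # the one exception: unique permissions on this file

print(library.allows("Students", "view_items"))
print(grades.allows("Students", "view_items"))
```

Note how a single exception on one document leaves the rest of the hierarchy inheriting from the site, which is the structure the text recommends.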

Note

A best practice is creating permission levels with corresponding names. For example, a permission level named "professors" should be associated with a SharePoint Group named "professors." Extending this idea, name your Active Directory directory services groups accordingly as well, if you are using them in SharePoint groups. Another best practice is to never change the default permission levels. Doing so could cause a user to inadvertently escalate permissions for another user.

Integrated Information Rights Management

Although SharePoint Products and Technologies provides security all the way down to the individual document level, that security protects unstructured content only on the server. But what about copies of files that are checked out of SharePoint Products and Technologies document libraries and stored on local hard drives? This is really no different than any system, but there are some tools that help encourage users to adhere to policy. A file that is downloaded is no longer subject to the security settings established in SharePoint Products and Technologies. It can be copied or edited by anyone who has access to the file. Normally this will be someone with edit access to the original file, but even someone with read-only access to a document library can download a copy of the file. That’s where Microsoft Windows Rights Management Services (RMS) for Microsoft Windows Server 2003 comes into play.

Figure 9-6 shows how RMS works in conjunction with SharePoint.


Figure 9-6. Information Rights Management Server environment

We are slow to say that RMS is a best practice, but it is definitely an enabling technology for enterprise access control. RMS integrates with SharePoint Server 2007 to protect content even when it is downloaded from a SharePoint Products and Technologies server. Content owners can control who can open, edit, print, forward, and/or take other actions with unstructured content stored in files. RMS consists of a Windows Server service that installs on a computer running Windows Server 2003 and an RMS client application that installs on each workstation that will access protected content. Authors can use the RMS client to digitally encrypt documents that are stored in SharePoint Products and Technologies document libraries.

Note

To integrate with SharePoint Server 2007, your clients must be using Service Pack 2 for the RMS client.

The documents are encrypted with RSA 1024-bit Internet encryption. Included in the encrypted document are instructions that control what can be done with the document. Users at workstations that have the RMS client installed can decrypt the document but are still limited by the client regarding what they can do with the document. Common limitations include the ability to edit, copy from, e-mail, or print the document.

The following list details how an RMS client interacts with a document:

  1. An author with client software installed obtains a certificate from the RMS server the first time she contacts the server.

  2. The author creates a file in an RMS-enabled application such as Word 2007 and uses the RMS client software to encrypt the file, along with a "publishing" license that specifies what can be done with the file.

  3. The author saves the file to a document library in SharePoint.

  4. When a recipient with an RMS-enabled application and the RMS client software opens the file, the client software validates the user and issues a "use" license. If the recipient does not have an RMS client, he will be unable to open the document.

  5. The RMS-enabled application displays the file but enforces any limits on usage defined in the use license. The file can now be removed from the network location, but the use license stays with the file and limits how it can be accessed.
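The licensing steps above can be sketched conceptually in Python. This is a deliberately simplified model under stated assumptions: encryption and certificates are omitted entirely, and the `RmsServer` class, the shape of the publishing license (a mapping of user to permitted actions), and the user names are hypothetical. It only illustrates how a use license limits what a validated recipient can do.

```python
class RmsServer:
    """Toy stand-in for the server that validates users and issues use licenses."""

    def __init__(self, known_users):
        self.known_users = set(known_users)

    def issue_use_license(self, publishing_license, user):
        # Step 4: validate the user, then issue a license with that user's rights.
        if user not in self.known_users:
            return None
        rights = publishing_license.get(user)
        return frozenset(rights) if rights else None


def can_perform(use_license, action):
    # Step 5: the application enforces the limits in the use license.
    return use_license is not None and action in use_license


# Step 2: the author attaches a publishing license to the file.
publishing_license = {"alice": {"view", "edit", "print"}, "bob": {"view"}}

server = RmsServer(known_users={"alice", "bob", "carol"})
bob_license = server.issue_use_license(publishing_license, "bob")

print(can_perform(bob_license, "view"))
print(can_perform(bob_license, "print"))
```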

More Info

You can get more information about Windows Rights Management Services, including a 180-day evaluation copy, at the Windows Server 2003 R2 Technology Center located at http://www.microsoft.com/windowsserver2003/technologies/rightsmgmt/default.mspx.

Web Content Management

Windows SharePoint Services 3.0 document libraries can be used to store unstructured content, but SharePoint Server 2007 includes features that make it possible to author and display content as Web pages. The WCM capabilities of SharePoint Server 2007 can be used to create and maintain a set of content-driven Web sites. Content-driven Web sites enforce corporate standards governing the look and feel of the sites, but adding or modifying content is left in the hands of end-users. IT professionals can create the infrastructure for the Web sites, including a corporate-branded look and feel. Then users can add, modify, and change content using a variety of tools to create a dynamic, constantly changing Web site. In this section, we’ll examine the features provided by SharePoint Server 2007 to create and manage content in the form of content-driven Web sites.

Publishing and Publishing Infrastructure Features

The basis for creating a content-driven Web site is contained in a set of SharePoint features that are rolled up into two overall features. The first is a site collection scoped feature called the Publishing Infrastructure. It enables the publishing infrastructure "plumbing" for the site collection, but doesn’t enable it in any Web site. The second is a site scoped feature called Publishing. The Publishing feature is the actual enabling of the publishing "plumbing," which allows use of the features in a given site. A best practice is using the Publishing Portal and Collaboration Portal Site Templates when requiring publishing functionality. These templates activate features automatically when a site is provisioned. It is important to note that almost any SharePoint site can be retrofitted with this capability by activating these two features later. See Chapter 13 for a complete list of best practices.

Activating the Publishing features creates an infrastructure consisting of a combination of content types, document libraries, "templatized" field controls, approval workflows, page layouts, master pages, and external document conversion services that can be used to author Web pages for a content-driven Web site. There are two primary methods for authoring content to be displayed in a publishing Web site. The first is direct editing of a new Web page through the use of a page layout that contains field controls and, possibly, Web parts. The second is by re-purposing content that was authored in a program like Word 2007 and then converted to Web content.

Page Layouts and Field Controls

Page layouts are a combination of a content type, field controls, and an .ASPX page. When an author chooses to create a new page in a publishing site, he is required to choose from a list of pre-built layouts. These layouts confine authors to specific approved formats while allowing them complete freedom in choosing the content that can be added to a site. These layouts are as much about your governance strategy as they are about ECM. Figure 9-7 shows the dialog box for selecting a layout when you create a page for a Web site in which the Publishing feature has been activated.


Figure 9-7. Choosing a layout page on a publishing site

Once a layout has been chosen, an author adds the content directly to field controls that are displayed on the page. Each field control is associated with a column in a related content type that will store the information entered by the author. Because the positioning of the field controls on the page is controlled by the layout page, any layout page built on the same content type can be used to display the page. This allows the author, editor, or approver the freedom to change the layout at any time without having to re-enter the content. Custom layout pages, content types, and field controls can be created to customize the entry and display of information for any environment. Figure 9-8 shows a layout page with a Rich Text field control open for editing.


Figure 9-8. Editing a field control
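The separation between stored column values and the layout that positions them can be illustrated with a short Python sketch. This is an analogy rather than how SharePoint renders pages: the layout names, the HTML snippets, and the column names are invented, and Python's `string.Template` stands in for field controls.

```python
from string import Template

# Column values stored once with the page (the content type's columns).
page_fields = {"Title": "Q3 Results", "Byline": "Finance Team", "Body": "Revenue grew."}

# Two layouts built on the same set of columns; each positions the fields differently.
layouts = {
    "ArticleFull": Template("<h1>$Title</h1><p class='byline'>$Byline</p><div>$Body</div>"),
    "ArticleCompact": Template("<div>$Body</div><h2>$Title</h2>"),
}


def render(layout_name, fields):
    """Render stored field values through the chosen layout."""
    return layouts[layout_name].substitute(fields)


print(render("ArticleFull", page_fields))
print(render("ArticleCompact", page_fields))
```

Switching layouts changes only the presentation; the stored values are untouched, which is why content does not need to be re-entered.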

Document Converters

Not all content to be displayed on a content-driven Web site will be entered directly onto a SharePoint layout page. Some content has already been authored in external programs and uploaded to document libraries. SharePoint Server 2007 can be used to automatically convert this unstructured content from one format to another, making it available to a wider audience. For example, you can use one of the provided document converters to automatically convert press releases written in Word 2007 and stored in a document library to a Web page for display on your site.

Several document converters are provided with SharePoint Server 2007, including DOCX or DOCM to HTML, InfoPath to HTML, and XML to HTML. Using these external services, you can easily take content authored in either Word 2007 or InfoPath 2007 and re-purpose it for display on your SharePoint Server 2007 Web site. These document converters are external applications that can be invoked from the command line with values specifying four parameters:

  • Input file. A required parameter containing the full path to the original file to be converted

  • Configuration information. An optional parameter that is the full path to the ConfigInfo file containing custom configuration settings for the document converter

  • Output files. A required parameter containing the full path where the converted file should be placed

  • Log file. An optional parameter with the full path to a log file where errors and other information can be recorded

As long as you have, or can write, a .NET console application that accepts these parameters, you can create your own document conversion services by writing a SharePoint feature that includes the application and a document converter definition file.
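To show the shape of such a command-line contract, here is a minimal Python sketch of a converter that accepts the four parameters listed above. Several things are assumptions: a real SharePoint converter would be a .NET console application, the flag names (`-in`, `-out`, `-config`, `-log`) are illustrative and should be checked against the SDK, and the "conversion" here simply wraps plain text in HTML.

```python
import argparse
import html


def build_parser():
    """Parser for the four converter parameters (flag names are illustrative)."""
    parser = argparse.ArgumentParser(description="Toy text-to-HTML document converter")
    parser.add_argument("-in", dest="input_file", required=True,
                        help="full path to the file to convert")
    parser.add_argument("-out", dest="output_file", required=True,
                        help="full path where the converted file is written")
    parser.add_argument("-config", dest="config_file",
                        help="optional path to converter configuration settings")
    parser.add_argument("-log", dest="log_file",
                        help="optional path to a log file")
    return parser


def to_html(text):
    """Stand-in conversion: escape the text and wrap it in a minimal page."""
    return "<html><body><pre>" + html.escape(text) + "</pre></body></html>"


def main(argv=None):
    args = build_parser().parse_args(argv)
    with open(args.input_file, encoding="utf-8") as source:
        converted = to_html(source.read())
    with open(args.output_file, "w", encoding="utf-8") as target:
        target.write(converted)
    if args.log_file:
        with open(args.log_file, "a", encoding="utf-8") as log:
            log.write(f"converted {args.input_file} -> {args.output_file}\n")


if __name__ == "__main__":
    main()
```

A run might look like `python converter.py -in release.txt -out release.html -log convert.log`, where the script and file names are placeholders.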

Although document conversion services were envisioned to be used to re-purpose previously written content for use on a Web site, they can also be used for a more general purpose for content management. You can create and deploy a custom document conversion service that will convert editable content into a read-only format like Adobe PDF, or automatically convert files from older formats like Microsoft Office 2003 to a newer standard format like Office 2007. You could even convert documents in the opposite direction to support older, non-upgraded clients or external users on different platforms.

ASP.NET 2.0 Master Pages

One of the best aspects of the design of SharePoint Products and Technologies is its support for the standards of ASP.NET 2.0. Master pages are one of the most important of these standards. When the Publishing features are turned on in SharePoint Server 2007, support is enabled for the inheritance of two master pages throughout the site collection. These two master pages are the site master and system master. The site master page is used for the pages stored in the pages library of a publishing site; the system master page is used for the default pages in a non-publishing site. You can choose these two master pages from several that are stored in the master page gallery of the site. When publishing is not enabled for a site collection, there is a single editable default master page stored in the site’s master page gallery. Each layout page contains a declaration at the top of the page. Custom programming can expand these limitations, so any site can support a different master page for each layout page.

More Info

For a more detailed discussion of master pages, refer to Chapter 11.

Implementing a custom master page or a coordinated set of master pages separates the content on a page from the basic look and feel of the frame around the edge of the page. This allows centralized control of a corporate look and feel while still allowing localized control of the content.

Reusable Content and Image Libraries

Another way that SharePoint provides for both central standardization and local editorial freedom is the use of document libraries to house various kinds of reusable content. When the Publishing features of SharePoint Server 2007 are activated, libraries are created at both the Web site and site collection levels to hold content that can be used over and over again in field controls on layout pages. There are two primary kinds of libraries: one to hold text and one to hold pictures.

A single reusable content library for text is created in the top-level site of the site collection when publishing is enabled. There is no reusable content library for each site. Text in the library can be either unformatted or HTML-formatted text. You can also choose whether the text will be inserted as a static or a dynamic copy when used. If the Automatic Update check box is selected, then the copy will be dynamic, and any usage of the content will be updated if the content in the library is changed. Figure 9-9 shows the dialog box for editing a reusable content entry with the Automatic Update check box selected.


Figure 9-9. Setting reusable content to be automatically updated
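The difference between a static and a dynamic copy can be shown with a small Python model. This is a conceptual sketch, not SharePoint code: the `Page` class and the dictionary standing in for the reusable content library are assumptions. A dynamic insert stores a reference to the library entry, while a static insert stores the text as it was at insertion time.

```python
class Page:
    """Toy page that holds either copied text or references to library entries."""

    def __init__(self):
        self.parts = []

    def insert(self, library, key, automatic_update):
        if automatic_update:
            self.parts.append(("ref", key))            # dynamic: look up at render time
        else:
            self.parts.append(("text", library[key]))  # static: copy the text now

    def render(self, library):
        return " ".join(library[value] if kind == "ref" else value
                        for kind, value in self.parts)


reusable_content = {"Division": "Contoso Research"}

page = Page()
page.insert(reusable_content, "Division", automatic_update=True)
page.insert(reusable_content, "Division", automatic_update=False)

# A reorganization renames the division in the library.
reusable_content["Division"] = "Contoso Labs"

print(page.render(reusable_content))
```

After the rename, only the dynamic copy reflects the new name, which is the behavior the Automatic Update check box controls.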

Two image libraries are created in the top-level site of a site collection: one for the entire site collection and one for that site alone. An additional image library will be created in each child site to hold graphics that are reused only in that site. A variety of image formats are supported, and metadata about the image will be collected when images are added to the library.

Reusable content libraries provide an opportunity to decrease duplication of effort by reusing commonly used words, phrases, and pictures. They also provide a mechanism for implementing standardized versions of wording, formatting, and images at various levels in the corporation. If the content is marked as dynamic, they also provide an opportunity to update the content easily. Companies should use these facilities for things such as standardized logos, copyright statements, and perhaps even company and division names. For example, if division names are entered as HTML-formatted reusable text, then not only will their display have a uniform look and feel, but if a company reorganization occurs you will be able to quickly change important names by changing a few entries in a reusable content list.

Approval Workflow

SharePoint Server 2007 uses a built-in, three-state workflow as the default approval process for content pages added to a publishing site. But you can also substitute your own custom workflow. There are several reasons why you might want your own custom workflow for this type of approval. First, content may need to be approved by more than one person or group. For example, you may need signoffs from several different groups, such as a marketing department or a legal group, in addition to the normal editor. A second reason might be that you have a multilingual site, and content entered in one site needs to be translated and duplicated for another site. A workflow can be used to route the content to a translation service. Yet another reason might be that you want to implement a more complex workflow that would allow for escalation of approval to another group if a lower-level editor has a question. Whatever the reason, the integration of Windows Workflow Foundation as the underlying workflow engine for SharePoint makes extensible workflows possible.

Content Deployment

The final piece of the puzzle in a content-driven Web site is deployment. WCM is frequently used to create a read-only Internet presence Web site for a company. This kind of scenario commonly has two or even three different sites involved in creating, editing, and publishing content. A typical three-layer model would consist of an authoring site, where users create new content; a staging site, where new content is held once approved until the time is right to publish it; and a production site, which contains the finished content. SharePoint Server 2007 has a content deployment service that can be used to move content from one site or site collection to another on a scheduled basis. Although the service moves both draft and approved content, only approved content whose publishing start date has passed will be visible on the production site.
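The visibility rule at the end of the preceding paragraph is easy to express in code. The Python sketch below is a simplified illustration: the `PageItem` class and its fields are hypothetical, and it treats a missing start date as "publish immediately", which is our assumption rather than something stated in the text.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class PageItem:
    url: str
    approved: bool
    publish_start: Optional[datetime] = None  # None is treated as "immediately"


def visible_on_production(items, now):
    """Return URLs of deployed items that readers of the production site can see."""
    return [
        item.url
        for item in items
        if item.approved and (item.publish_start is None or item.publish_start <= now)
    ]


deployed = [
    PageItem("/news/launch.aspx", approved=True, publish_start=datetime(2008, 1, 15)),
    PageItem("/news/earnings.aspx", approved=True, publish_start=datetime(2008, 3, 1)),
    PageItem("/news/draft.aspx", approved=False),
    PageItem("/about.aspx", approved=True),
]

print(visible_on_production(deployed, now=datetime(2008, 2, 1)))
```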

Records Management

According to most modern universities, nonprofits, and corporations, records management is the systematic control of all records, regardless of media format, from initial creation to final disposition. The thing that makes records management different from ECM, in general, is the emphasis on preserving and destroying content. When we consider the huge amounts of unstructured content produced by most businesses on a daily basis, the real challenge is determining which documents to keep and for how long. The focus of records management in SharePoint is on the long-term preservation and disposition of content that should be considered part of the official record of a company. Preservation of this official record is increasingly important in light of new regulatory requirements such as the Sarbanes-Oxley Act of 2002. For example, if a legally required document gets buried in the mountain of unstructured content and can’t be retrieved in a timely fashion, an auditing agency might consider the document discarded, which could lead to a negative audit finding or even contempt of court. If a critical document is accidentally disposed of during litigation, a company might lose its case simply because it can’t produce evidence. Of course, losing a major case in court can be the downfall of a company.

More Info

For a more in-depth look at how SharePoint can be used to comply with new legal requirements, download the white paper titled "Compliance Features in the 2007 Microsoft Office System" by Joanna Bichsel. You can download the white paper at http://www.microsoft.com/downloads/details.aspx?FamilyID=d64dfb49-aa29-4a4b-8f5a-32c922e850ca&displaylang=en.

Records Center

To accomplish the preservation and disposition of content, SharePoint Server 2007 provides a custom site template called a Records Center. Figure 9-10 shows a site created using the Records Center site definition.

Figure 9-10. Records Center site

There are several characteristics that make a site created from the Records Center template uniquely suited for the retention and disposition of content. The first is that submission of content to a Records Center site can be done through a secure Web service. This service lets users submit content to the Records Center without having actual security rights in the Records Center. If the Records Center is created on a separate Web application with its own application pool, then it will be a secure location where access can be audited. This secure audit trail is critical if the documents in the site will be used in any kind of legal proceeding or used to protect intellectual property. This service connection can be configured in the External Service Connections section of Application Management, in Central Administration. Note that you can have only one external service connection per farm.

Information Management Policies

Another critical component in automating the preservation and disposition of content submitted to the Records Center site is the establishment of information management policies. Information management policies are actually a broader concept than the Records Center. They are available in every document library and can even be set for each unique content type. But the Records Center is where these policies become critical. You can define four types of information management policies. Figure 9-11 shows the configuration panel where you can choose which of the four policies to implement.

Figure 9-11. Selecting information management policies

The first type is a labeling policy. Using this policy, you can create and print a custom label that is generated from a document’s metadata whenever the document is printed.

The second type is an auditing policy. This policy lets you choose which events in the life cycle of a document to track. These events include when a document is viewed in any way (including downloading), edited, checked in/out, moved/copied within the site, or deleted/restored. The audit information collected by these policies is stored in a single audit log on the server. Reports showing the audit trail can be run by site administrators, but no one can delete or modify individual audit records.

Note

There are no list or item permission event handlers in the SharePoint Products and Technologies object model. Therefore, you cannot see who changed permissions on a document, item, or list. Because event handlers do not exist, you cannot custom code this functionality either.

The third type is an expiration policy. This policy setting controls when content will be scheduled for automatic deletion/archiving. The timing of this disposition can be based on a formula that uses metadata stored with the document record. The action taken can be an automatic deletion of the item or a more complex workflow process that archives the document to offline storage before it is deleted. An often-overlooked capability of expiration is using these policies in lists and libraries. When an expiration policy is defined at the list level, you can create a custom policy based on any Site Column of the date type.
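
As a rough illustration of an expiration formula driven by date metadata, the following Python sketch computes a disposition date from a date column plus a retention period. It is a model of the idea only, not SharePoint code: the function names, the dictionary used for metadata, and the choice to express the retention period in whole years are all assumptions made for this example.

```python
from datetime import date
from typing import Optional


def add_years(d: date, years: int) -> date:
    """Shift a date by whole years; February 29 falls back to February 28."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def expiration_date(metadata: dict, date_column: str,
                    retention_years: int) -> Optional[date]:
    """Return the date on which the item becomes due for disposition.

    Returns None when the date column is missing, so an item without
    the required metadata is never treated as expired.
    """
    start = metadata.get(date_column)
    if start is None:
        return None
    return add_years(start, retention_years)


def is_expired(metadata: dict, date_column: str,
               retention_years: int, today: date) -> bool:
    """True when the computed expiration date is today or earlier."""
    due = expiration_date(metadata, date_column, retention_years)
    return due is not None and today >= due
```

Treating missing metadata as "never expires" is a design choice in this sketch; it errs on the side of keeping content rather than deleting it.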

The final category is applying a barcode. Using the barcode feature, you can automatically assign a barcode label that will be printed whenever the document is printed. This simplifies the subsequent retrieval of related hard copy. Many individuals have found creative ways to use the barcode feature, such as labeling student records on printed forms at universities. But the primary use of the barcode feature is tracking hard copies of documents. Yes, SharePoint Server 2007 can manage documents outside of a site as well as native objects.

Information management policies include both an administrative description and a policy statement. The administrative description is used to explain the purpose of the policy to anyone thinking of altering the policy; the policy statement is displayed to users when they work on documents that have associated information management policies. This can be an invaluable way of keeping users informed about what they need to know when they work with various types of content. For example, users need to know not to discuss sensitive content stored in certain documents and that their actions are being audited. Policy statements automatically appear in the Document Information Panel (DIP) in Office 2007.

Information management policies can be a critical component in the business processes of your company and can be used to either replace or enhance workflows that implement those business processes. Best practices for using information management policies include the following:

  • Use expiration to reduce unnecessary content. This reduces required storage and backup capacity and increases the relevancy of other documents in your search corpus.

  • Use auditing when needed based on a Site Column. Do not use auditing by default because it affects system performance.

  • Use labels and barcodes sparingly. Do not use them for functions for which they were not intended. A product version change might affect your custom code.

Record Routing

Because content added to the Records Center can be submitted anonymously, there needs to be an automated method of categorizing and storing the records when submitted. This process is accomplished by a record routing list that is created in the Records Center site. Each item in the record routing list specifies where content of a particular content type should be stored in the site. Don’t make this part difficult. If an inbound content type matches a record or alias, it routes to the corresponding document library. If an entry does not exist for an incoming content type, then that content is automatically placed in a default library. Entries can also be created that send different types of content to the same final library or list. Because the Records Center will usually be created in a separate security context where regular users have no rights, it is critical for an automated procedure to exist for categorized storage of submitted content. This categorization system should be considered when planning the creation of custom content types. You should also consider implementing these content types as features to make it easy to create parallel definitions in different site collections such as the Records Center.
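
The routing rule described above amounts to a lookup with a fallback. The following Python sketch models it under stated assumptions: the `RoutingEntry` class, its field names, and the `route` function are hypothetical and do not correspond to the SharePoint object model. An incoming content type is matched against each entry's title or aliases, and anything unmatched goes to the entry marked as the default.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RoutingEntry:
    """Hypothetical model of one item in a record routing list."""
    title: str                      # content type name this entry handles
    location: str                   # destination library for matching records
    aliases: List[str] = field(default_factory=list)
    is_default: bool = False        # receives content with no matching entry


def route(content_type: str, entries: List[RoutingEntry]) -> str:
    """Return the destination library for an incoming content type."""
    for entry in entries:
        if content_type == entry.title or content_type in entry.aliases:
            return entry.location
    for entry in entries:
        if entry.is_default:
            return entry.location
    raise LookupError("no default routing entry is defined")
```

Because aliases live on the entry, several content types can share one destination library without duplicating entries.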

Holds

So far, we’ve discussed how SharePoint Server 2007 supports the automated categorization and disposition of content by using information management policies and the Records Center site. But what happens when a legal question is raised? If a suit is filed, there needs to be some way to guarantee that the content will be available even if a policy calls for the content to be deleted. Like the record routing list, a holds list is created in the Records Center. You can add a new record to this list to create a hold for a specific legal case. Then you can add or remove content from a hold by selecting Manage Holds from the context menu of a document in the Records Center. Figure 9-12 shows a holds list in a Records Center with the context menu for removing a hold.

Figure 9-12. Removing a hold

As long as the link to the hold exists, any expiration policy will be delayed. Once the case is settled, removing the hold from the document will allow suspended expiration policies to be processed. Holds can be established only for content in the Records Center, so you must add to the Records Center any official content that needs to be retained for legal purposes.
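
The interaction between holds and expiration can be modeled in a few lines of Python. This is a conceptual sketch only: the `Record` class and the helper functions are hypothetical names, not part of SharePoint. It shows the rule that a record is eligible for disposition only when its expiration policy says it is due and no hold is still linked to it.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class Record:
    """Hypothetical model of a document stored in the Records Center."""
    name: str
    expired: bool                        # expiration policy says it is due
    holds: Set[str] = field(default_factory=set)


def add_hold(record: Record, hold_name: str) -> None:
    """Link the record to a hold, for example one created for a legal case."""
    record.holds.add(hold_name)


def remove_hold(record: Record, hold_name: str) -> None:
    """Unlink the record from a hold; removing a missing hold is a no-op."""
    record.holds.discard(hold_name)


def can_dispose(record: Record) -> bool:
    """Disposition proceeds only when expired and free of every hold."""
    return record.expired and not record.holds
```

A record placed on two holds stays protected until both are removed, which matches the idea that each hold tracks a separate case.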

Forms Management

So far, we’ve primarily discussed best practices for managing unstructured content stored in documents. But automating most business processes requires finding ways to collect structured information electronically. Creating full database projects for the collection and storage of this information can be very time and resource intensive, so many companies are turning to electronic forms packages that can simplify the collection of this information. InfoPath 2007 provides an excellent way to collect information electronically, and it lets the user store the collected information either in a structured content repository like a database or in a library of unstructured content files. InfoPath forms libraries in Windows SharePoint Services 3.0 and InfoPath Forms Services in SharePoint Server 2007 can streamline the collection and manipulation of this kind of content.

There are a number of reasons to use InfoPath to collect business process information. InfoPath forms can be designed to include data validation and calculations without coding. This is particularly important to take pressure off scarce resources such as professional programming staff. Figure 9-13 shows data validation settings for a form designed in the InfoPath forms client.

Figure 9-13. Designing a form with data validation in InfoPath

Different views of the data can also be easily created to provide a view that better matches the usage of the data or to hide information from unauthorized viewers. For example, a professional data entry clerk may need only a streamlined view for data entry, but end-users may want a more detailed view. Figure 9-14 shows two views of the same InfoPath form.

Figure 9-14. Data entry and end-user views of an InfoPath form

Another advantage is the presence of advanced controls such as repeating tables and optional sections. These controls can solve many of the common problems associated with data entry forms. Finally, the separation of the actual data collected from the form used to collect the data means that the form can be easily modified or copied to other locations. This makes InfoPath 2007 a far better option for collecting form-based data than traditional paper-based forms.

Forms Library

In the same way that documents can be stored in a document library, SharePoint Products and Technologies makes use of libraries to store InfoPath 2007 forms data. But in the case of InfoPath 2007, the template associated with the library is an InfoPath form template—an .XSN file. Clicking the New button in the form library will open the user’s copy of the InfoPath client and load the associated form template. When the form is submitted, the XML data collected by the form will be stored as an XML document. Publishing an InfoPath form to a forms library also makes it possible to surface individual data collected by the form as metadata of the document. This makes an InfoPath 2007 form a hybrid that has characteristics of both unstructured and structured content. Figure 9-15 shows an InfoPath Form library in SharePoint with metadata columns that have been promoted from data fields on the form.

Figure 9-15. Promoting InfoPath fields as library metadata
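
To make the idea of promoting form fields to library metadata more concrete, here is a simplified Python sketch. It does not use InfoPath or SharePoint APIs. The sample XML, its element names, and the column mapping are invented for illustration, and real InfoPath form data uses XML namespaces that are omitted here for brevity. The sketch reads selected fields out of a submitted form's XML and returns them as column values.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Optional

# Hypothetical form data; real InfoPath XML is namespaced.
FORM_XML = """<expenseReport>
  <employee>Dana Smith</employee>
  <total>125.50</total>
  <items><item>Taxi</item><item>Lunch</item></items>
</expenseReport>"""

# Hypothetical mapping of library column name to element path in the form.
PROMOTED_FIELDS = {"Employee": "employee", "Total": "total"}


def promote_fields(xml_text: str,
                   promoted: Dict[str, str]) -> Dict[str, Optional[str]]:
    """Extract the promoted fields from form XML as column name/value pairs.

    A field that is absent from the XML is returned as None.
    """
    root = ET.fromstring(xml_text)
    return {column: root.findtext(path) for column, path in promoted.items()}
```

Only the fields listed in the mapping become metadata; the rest of the form data, such as the repeating `items` section in this sample, stays inside the stored XML document.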

InfoPath Forms Services

Windows SharePoint Services 3.0–based forms libraries require that everyone who fills out a form must have the InfoPath 2007 client. SharePoint Server 2007 Enterprise Edition removes this requirement by adding InfoPath Forms Services. InfoPath Forms Services allows users to fill out InfoPath forms by using a Web browser. Forms must still be created using the InfoPath 2007 client, and there are some limitations on the features supported by the Web client. You also need to consider whether you will configure InfoPath Forms Services to use session state or Form view for storing session information. Form view stores session information in hidden fields on the InfoPath form, which increases the use of bandwidth when the form is downloaded or posted back to the server. Session state stores that information in the SharePoint database, which decreases the amount of bandwidth used between the SharePoint server and the Web browser.

Note

A best practice would be to configure InfoPath to default to Form view for forms that contain a small amount of session state but automatically default to session state for larger forms. That will allow you to balance SQL Server load against the use of client bandwidth.

Figure 9-16 shows the configuration settings that will use Form view for forms with up to 4 KB of session data but use session state for larger forms.

Figure 9-16. Setting InfoPath Forms Services session state
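
The threshold rule for choosing between Form view and session state can be expressed as a one-line decision. The Python sketch below is a model of that rule only; the function name, the return values, and the constant are hypothetical and are not InfoPath Forms Services settings. It uses the 4 KB threshold from the example configuration.

```python
# 4 KB threshold taken from the example configuration in the text.
THRESHOLD_BYTES = 4 * 1024


def choose_storage(session_bytes: int, threshold: int = THRESHOLD_BYTES) -> str:
    """Pick where session information is kept for a form.

    Small sessions travel with the form in hidden fields (Form view),
    trading bandwidth for lower database load. Larger sessions are kept
    in the database (session state), trading database load for bandwidth.
    """
    if session_bytes < 0:
        raise ValueError("session size cannot be negative")
    return "form_view" if session_bytes <= threshold else "session_state"
```

Raising the threshold shifts more forms to Form view and so moves load from the database to the network; lowering it does the opposite.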

E-mail Management

Another form of structured content that many organizations overlook when planning an ECM strategy is electronic mail. In today’s business climate, a great deal of content never makes it into formal documents but is simply passed around in the form of electronic mail messages. This content is also covered by many of the legal initiatives mentioned earlier. For example, one element of the Sarbanes-Oxley Act states that companies must maintain an archive of e-mail messages, many of which contain sensitive information. Managing this information, including deletion of expired content, requires more than just an e-mail archive or a journaling feature installed on the user’s desktop. The archived material needs to be protected from unauthorized access because e-mail often inadvertently contains extremely sensitive information. Consider how many recent scandals can be traced back to leaked copies of e-mail messages. Simply protecting the e-mail archive is not enough, however, because a record must also be maintained that allows retrieval of specific messages when they are needed.

To facilitate the processing of e-mail messages as records, Microsoft has implemented tight integration between Outlook 2007, SharePoint Server 2007, and Exchange Server 2007. Organizational folders can be created in Exchange Server 2007 that map to business functions and then pushed out to a user’s Outlook 2007 client using group policies. This makes it possible for users to simply drag and drop e-mail messages into these folders. Then Exchange Server 2007 will automatically copy the messages to SharePoint Server 2007. If there is missing metadata, an e-mail alerts the user to fill it in.

More Info

Kathy Hughes, Microsoft MVP, has written a thorough white paper on using SharePoint to archive e-mail titled "E-Mail Records Management in SharePoint Server 2007." You can download a copy from the Mindsharp Premium Content area of the Mindsharp Web site at http://www.mindsharp.com.
