Chapter 9. Content Management Server Architecture

Technology Capabilities

Content Management Server (CMS) offers a complex, powerful environment for enterprise Web development and publishing. This isn’t simply a souped-up, server-based version of FrontPage, either; CMS offers a complete publishing workflow from content development through deployment. CMS actually uses some pretty unique approaches to Web publishing, so in the next few sections I’ll familiarize you with what CMS does, how it does it, and how you can take advantage of its capabilities in your environment.

Web Publishing

Before diving into CMS, you need to understand some of the new terms, and new definitions for common terms, that CMS uses to describe the Web publishing process. Here are the major terms:

  • A user is any valid user account in your domain. CMS accesses Active Directory by using the Lightweight Directory Access Protocol (LDAP), enabling it to integrate with Active Directory and utilize your Active Directory user accounts.

  • A container is a storage area within CMS that is used to organize Web pages, resources, templates, and rights groups. You can think of containers as folders, because containers can be organized in a hierarchy not unlike the folder hierarchy shown in Windows Explorer. CMS actually includes four types of containers: folders and channels, which I’ll discuss shortly, and template and resource galleries, which are collections of page templates and resources that authors can use.

  • Templates are used to control the overall look and feel for your Web site. CMS actually uses two types of templates: page and navigation. Page templates provide a framework in which Web authors can create Web pages. Page templates might include common graphics, an overall page layout, and other elements. By requiring your authors to use templates, you can ensure that the content they produce will have a consistent appearance and behavior. Navigation templates give a Web site its navigation controls, such as “Home” buttons, the order in which pages appear, and so forth.

  • When an author creates a new page that’s ready for publication, the author posts the page. Postings can include a publishing schedule—enabling CMS to automatically make content live on a particular date—and an expiration date. Postings also include the channel on which the content should be published.

  • Channels organize postings into a hierarchy, and enable you to organize and manage access to the postings. When content is posted to a channel, it is not actually published until the channel’s moderator, or administrator, approves the content for publication. Channels usually comprise the various sections of a Web site. For example, a company’s public Web site might include channels for Sales, Product Information, Technical Support, Marketing, and so forth. Breaking a Web site into channels enables different groups within your organization to take responsibility for different portions of the site.

  • Folders are used to store pages. Every channel is associated with a folder, and each folder usually has one or more editors assigned to it. Pages in a folder must be approved by an editor before those pages can continue through the publishing process. You can also configure CMS to automatically approve pages for folders without an editor.

  • Roles are used to assign capabilities to users. CMS includes seven roles: author, subscriber, editor, moderator, resource manager, template designer, and administrator. Placing users into these roles gives the users the associated capabilities within CMS.

  • Rights groups associate roles with containers. You assign users to roles, and then create rights groups for the roles. Rights groups are assigned permissions to one or more containers, resulting in users having the appropriate access. For example, suppose a user named Don is a member of a rights group named “Marketing Authors,” which you created for the authors role. The Marketing Authors rights group has permissions assigned to several containers, giving Don author rights within those containers.
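To make the relationship between users, roles, rights groups, and containers concrete, here is a minimal Python sketch of the idea. It is not CMS code and does not use any CMS API; the class name `RightsGroup`, the function `roles_in_container`, the "Support Editors" group, and the container names are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field

# The seven roles named in the text.
ROLES = {"author", "subscriber", "editor", "moderator",
         "resource manager", "template designer", "administrator"}


@dataclass
class RightsGroup:
    """A rights group ties one role to a set of members and containers."""
    name: str
    role: str
    members: set = field(default_factory=set)
    containers: set = field(default_factory=set)

    def __post_init__(self):
        if self.role not in ROLES:
            raise ValueError(f"unknown role: {self.role}")


def roles_in_container(user, container, groups):
    """Return the set of roles the user holds in the given container."""
    return {g.role for g in groups
            if user in g.members and container in g.containers}


# Hypothetical data modeled on the "Don" example.
marketing_authors = RightsGroup(
    name="Marketing Authors", role="author",
    members={"Don"}, containers={"Press Releases", "Product Information"})
support_editors = RightsGroup(
    name="Support Editors", role="editor",
    members={"Don"}, containers={"Technical Support"})
groups = [marketing_authors, support_editors]

print(roles_in_container("Don", "Press Releases", groups))
print(roles_in_container("Don", "Sales", groups))
```

In this model Don is an author in two containers, an editor in a third, and has no capabilities anywhere else, which is the effect that rights groups are meant to produce.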

CMS’s publishing process seems complex based on these terms. As shown in Figure 9.1, however, the process is relatively straightforward.

Figure 9.1. The publishing process includes tasks from content creation through the end user accessing that content.

Rights assignment using users, roles, rights groups, and containers may also seem complex based on the terminology; Figure 9.2 shows how straightforward the process actually is.

Figure 9.2. Users can belong to multiple rights groups, giving them different capabilities in different containers.

To summarize the entire Web publishing process:

  1. You’ll begin with the users in your Active Directory domain, so make sure that everyone who needs to be involved with the publishing process has a domain user account.

  2. Use the CMS Site Builder application to create folders for your site’s pages. Normally, you’ll start with a single root folder and create subfolders for each section of the site.

  3. Define channels for the various sections of your site. In general, a channel should represent an area of the site that one or two moderators will be responsible for. For example, the manager of your Marketing department might be responsible for approving all content related to company press releases, so you might create a Press Releases channel to hold that content. Channels are associated with folders, so your folder hierarchy will have to reflect your channel hierarchy.

  4. Create rights groups within the various roles. I like to start by creating a single rights group for each channel for each role. In other words, each channel should have an authoring rights group, an editorial rights group, and so forth.

  5. Assign users to the appropriate rights groups to give them the necessary control over the Web site’s content. You’ll need authors for each channel. Unless you intend to set up channels with auto-approval, you’ll also need editors and moderators for each channel. You’ll also need template designers to create the templates used on your site.

  6. Authors kick off the actual publishing process by using the CMS Web Author application to create content, based upon templates in a CMS template gallery. Authors can utilize resources (such as graphics, videos, and other pieces of content) from CMS resource galleries. When finished with a page, authors can either post it or simply let it sit.

  7. Editors review and edit authors’ pages, and either accept or decline them. Editors can also create new content from scratch, just like an author. When editors approve pages, they are considered published. However, they are not viewable until a moderator approves the posting. If the author created a posting for the page, the editor simply approves the page and waits for the moderator to approve the posting. If the author didn’t, the editor can create a posting for the page, passing it on to the moderator for approval.

  8. Moderators approve or decline postings and modify the channels’ posting schedules, allowing content to be viewed by subscribers.

  9. Subscribers read the content approved by moderators.
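The approval steps above boil down to a simple rule: a posting is visible to subscribers only when both the editor and the moderator have approved it and the current time falls inside the posting's schedule. The following Python sketch models that rule under those assumptions. The `Posting` class and its field names are hypothetical and are not part of any CMS API.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Posting:
    """Hypothetical model of a posting and its two approval gates."""
    channel: str
    editor_approved: bool = False       # the editor accepted the page
    moderator_approved: bool = False    # the moderator approved the posting
    start: Optional[datetime] = None    # scheduled go-live time, if any
    expires: Optional[datetime] = None  # expiration time, if any

    def is_live(self, now):
        """True only when both approvals exist and `now` is in the schedule."""
        if not (self.editor_approved and self.moderator_approved):
            return False
        if self.start is not None and now < self.start:
            return False
        if self.expires is not None and now >= self.expires:
            return False
        return True


posting = Posting(channel="Press Releases",
                  start=datetime(2002, 6, 1),
                  expires=datetime(2002, 7, 1))
posting.editor_approved = True       # step 7
posting.moderator_approved = True    # step 8
print(posting.is_live(datetime(2002, 6, 15)))
```

Treating the expiration time as exclusive is a choice made for this sketch only.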

Administrators work in the background, creating folders and channels, assigning users to rights groups, and so forth. Resource managers control one or more resource galleries, which contain content—images, sounds, and so forth—that authors can utilize in their Web pages. For example, you might create a Corporate Logos resource gallery and assign someone from your company’s Marketing department as the resource manager for the gallery. The resource manager ensures that the content in the gallery is approved, enabling authors throughout the company to utilize the correct logos in the Web pages they create.

Template designers play a similar role for template galleries. Templates define the overall look and feel for a site, including the inclusion of standardized resources, such as a company logo at the top of each page, or a copyright statement at the bottom. When authors begin creating Web pages, they must start with one of the templates available in a template gallery, ensuring that the content meets corporate standards for layout and design.

CMS as a Web Server

CMS stores content in a SQL Server database. Multiple CMS computers can use the same back-end SQL Server database, enabling those computers to all provide the same Web content to end users. Most Web servers, by contrast, store their content in simple text files on the server’s hard disk. When a typical Web server receives a request for content, it reads the appropriate file from disk and sends it to the requestor. In the case of dynamic content, such as Microsoft Active Server Pages (ASP), the Web server also processes the ASP program code, sending only the results to the requestor.

CMS’s use of a SQL Server database for storage means that all incoming Web content requests must be satisfied from that database. In fact, CMS dynamically generates content for each and every Web page request, loading content from the database, processing ASP program code (if necessary), and delivering the final content to the requestor via the HTTP protocol. CMS’s behavior as a Web server is referred to as a non-staged site, because the site isn’t staged off of the CMS computers, and is instead hosted entirely on the CMS computers. Note that CMS does include a caching architecture, which enables it to create pages dynamically and then deliver the static result to a number of clients. This caching architecture helps reduce the overhead of dynamically creating Web pages from a database.
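The benefit of caching in this kind of design can be illustrated with a small Python sketch: a page is assembled from the database once, and later requests for the same page reuse the assembled result. This shows the general technique only, not how CMS implements its cache. The `CachingRenderer` class, the `load_from_db` callback, and the template format are hypothetical.

```python
class CachingRenderer:
    """Assemble a page on first request; serve the stored result afterward."""

    def __init__(self, load_from_db):
        # load_from_db(page_id) -> (template, content); stands in for the
        # database round trips needed to assemble a page.
        self._load = load_from_db
        self._cache = {}
        self.db_reads = 0

    def get(self, page_id):
        if page_id not in self._cache:
            self.db_reads += 1
            template, content = self._load(page_id)
            self._cache[page_id] = template.format(content=content)
        return self._cache[page_id]

    def invalidate(self, page_id):
        """Forget a cached page, for example after it has been republished."""
        self._cache.pop(page_id, None)


def fake_load(page_id):
    # Hypothetical stand-in for the SQL Server lookup.
    return "<html><body>{content}</body></html>", f"Body of {page_id}"


renderer = CachingRenderer(fake_load)
for _ in range(100):
    renderer.get("home")
print(renderer.db_reads)
```

One hundred requests for the same page cost a single trip to the database in this model; the cost returns only when the cached copy is invalidated.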

In a non-staged site, CMS still controls access to content. That means you’ll have to assign end users to a subscriber role for the channels you want them to have access to. You can also create a generic “GuestUser” user account, which can be used to represent all users not specifically defined within CMS. By assigning subscriber access to this user account, you can allow all users to access the content you choose.

For more information on securing access to CMS-based Web sites, see “Content Management Server,” p. 664

CMS also supports a more traditional Web publishing environment, referred to as staged sites. In this environment, CMS publishes its content to traditional Web servers, which then provide the content to end users. CMS uses its normal dynamic generation of pages from a SQL Server database to initially generate the content, saving the results in static, disk-based files on a Web server. Staged sites provide better efficiency, because Web servers can serve static pages much, much faster than CMS can create dynamic content from a SQL Server database.

Tip

Public Web sites will almost always be hosted in a staged site environment. You’ll use CMS internally to manage the publishing process, and publish approved content to servers running Internet Information Services (IIS) for delivery to your actual users. Intranet applications, which are usually much smaller in user volume, may use a non-staged site, publishing content directly from CMS.

In order to create a staged site, you’ll use the CMS Site Stager. Essentially the Site Stager queries CMS for content, converting CMS’s internal page hyperlinks and other code into static HTML hyperlinks, ASP code, and so forth. Site Stager then transfers the content to a regular Web server, which can serve the content to end users. In other words, the Site Stager translates CMS content into standard HTML or ASP content. Site Stager uses profiles to determine how often it will re-stage a site, which channels are staged, and so forth.

Note

While staging is an automated process, it’s not something that runs every time a Web page is changed by an author. Instead, it’s a batch process that runs according to a fixed schedule. The result is that your staged Web site—which your end users utilize to obtain content—won’t always reflect the latest changes within CMS. If your site’s content changes frequently, and you want the most current content to be available to end users, then you should use a non-staged site and serve content directly from CMS.

CMS supports a tiered development environment, as well. For example, as shown in Figure 9.3, you can use one group of CMS computers for content development and management, and deploy that content to a second set of production CMS computers which are responsible for actually serving the content to end users. CMS includes a Site Deployment Manager tool that facilitates this kind of non-staged, tiered environment.

Figure 9.3. A tiered environment allows more granular management of CMS computers, and provides better performance for end users.

Web Content

CMS supports plenty of variety in Web content. CMS enables authors to create framed and frameless Web pages, including Java or ActiveX components, server-side scripting via ASP, client-side (Dynamic HTML, or DHTML) scripting, and much more.

Tip

Remember, when you stage a site using the Site Stager, server-side code (such as ASP) isn’t staged. That means any dynamic logic in your Web site must be implemented in client-side code, either DHTML, Java, or ActiveX components. There are techniques for staging server-side code, one of which is documented in Microsoft Knowledge Base article Q302066.

CMS can pretty much do anything that a non-CMS Web site can do, provided you’re hosting the Web site on CMS computers. If you plan to use a staged site and host your content on regular Web servers, then you may be sacrificing much of your server-side capabilities, relying entirely on static HTML and client-side code to create dynamic Web sites.

Web Development Considerations

If you plan to use CMS in a staged-site environment, your Web authors will need to take special precautions so that the Site Stager can correctly publish the pages out of CMS and into a normal Web server. Consider the following:

  • Any hyperlinks that reference pages outside the CMS Web site—such as pages on other company Web sites, or on other companies’ Web sites—must be created using CMS’s ResolveURL() method. CMS uses this method normally when you add hyperlinks to a document, but if ASP programmers are hand-coding hyperlinks, they’ll need to use the method. The method ensures that the Site Stager can correctly resolve the hyperlink into a regular HTML hyperlink when staging your site.

  • ASP pages that utilize CMS’s Publishing Application Programming Interface (API) to obtain URLs should not attempt to manipulate those URLs, but should instead use them exactly as delivered by the API. Failure to do so will result in unpredictable results when the site is staged.

  • When you stage a page containing server-side script, the script is executed at staging time, and only the resulting static HTML is staged. That means the page’s content becomes fixed in time, rather than dynamic, on the Web server that hosts that staged content.

  • ActiveX or Java components included in your Web pages must be packaged in CAB or JAR files within CMS’s resource galleries.

  • HTML forms must include a fully qualified action URL. Failure to do so will result in staged forms that don’t submit their information properly.
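The last precaution in the list is easy to check mechanically before a site is staged. The Python sketch below scans a page's HTML and reports any form whose action is not a fully qualified URL. It is a hypothetical pre-staging check built on Python's standard library, not a CMS or Site Stager feature, and the names `FormActionChecker` and `relative_form_actions` are invented for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class FormActionChecker(HTMLParser):
    """Collect the action of every form that lacks a scheme and host."""

    def __init__(self):
        super().__init__()
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if tag != "form":
            return
        action = dict(attrs).get("action") or ""
        parts = urlparse(action)
        # A fully qualified URL has both a scheme (http) and a host name.
        if not (parts.scheme and parts.netloc):
            self.problems.append(action)


def relative_form_actions(html):
    """Return the action values that are missing or not fully qualified."""
    checker = FormActionChecker()
    checker.feed(html)
    return checker.problems


sample = """
<form action="/feedback.asp" method="post"></form>
<form action="http://www.example.com/feedback.asp" method="post"></form>
"""
print(relative_form_actions(sample))
```

A form with no action attribute at all is reported as an empty string, since it would also fail to submit correctly once staged.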

By and large, basic Web sites created in CMS can be staged with no problems. Web sites that make heavy use of ASP code, or that use a lot of page redirects, will require special attention in order to work in a staged environment.

Tip

By making a decision to use CMS, you’re basically committing to the product’s method of using CMS computers as the actual Web servers. At more than $40,000 per processor, I’m sure Microsoft would like to see you do just that. However, if you’re trying to save money by publishing your sites to regular Web servers (thus using a staged-site environment), be aware of the extra work involved, especially for highly dynamic Web pages.

Why are all the extra precautions necessary? Primarily because of the way that CMS stores Web pages in its SQL Server database. Essentially, page content and page templates are stored separately. When a user requests a Web page, CMS retrieves the template and content from the database and dynamically assembles them into a completed Web page, which is delivered to the user. It’s not as if CMS just stores regular Web pages in the database; what’s in the database is a highly customized, not-ready-for-viewing representation of the content.

So, when a content author links from one Web page to another, CMS doesn’t simply embed a standard HTML <A> (anchor) tag linking to the page. Instead, CMS inserts a special code indicating that the hyperlink goes to such-and-such a location within the database. When the page containing the hyperlink is dynamically generated—or rendered—for an end user, CMS changes the special code into an actual HTML hyperlink. Even then, though, the hyperlink doesn’t point to a Web page because there is no Web page to point to right then. Instead, the hyperlink contains the code necessary for CMS to retrieve and render the second Web page.

When staging Web sites, however, all of that special coding, page templates, and content must be combined together into standardized HTML. In order for that combination process—which is what the Site Stager does—to work properly, you have to make sure that all of the coding is in a condition that CMS can create the correct standard, static HTML. Hence, the special precautions.
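The difference between rendering a link for a live CMS request and resolving it for a staged site can be sketched in a few lines of Python. The `[[page:...]]` placeholder syntax, the `/render?id=` URL shape, and the static path table below are all invented for illustration; the text does not describe CMS's actual internal codes, and this sketch does not attempt to reproduce them.

```python
import re

# Hypothetical placeholder format for an internal link stored in the database.
PLACEHOLDER = re.compile(r"\[\[page:(\w+)\]\]")


def resolve_links(stored_html, url_for):
    """Replace each placeholder with whatever URL `url_for` returns."""
    return PLACEHOLDER.sub(lambda m: url_for(m.group(1)), stored_html)


def dynamic_url(page_id):
    # Non-staged site: the link carries what the server needs to look up
    # and render the target page on request.
    return f"/render?id={page_id}"


# Staged site: every target must already map to a static file.
STATIC_PATHS = {"p42": "/press/launch.htm"}


def static_url(page_id):
    return STATIC_PATHS[page_id]


stored = '<a href="[[page:p42]]">Launch news</a>'
print(resolve_links(stored, dynamic_url))
print(resolve_links(stored, static_url))
```

The same stored content yields two different hyperlinks depending on who resolves it, which is why hand-altered links can survive dynamic rendering and still break when the site is staged.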

Note

The precautions I’ve outlined here are really only the tip of the iceberg. If you’re doing a lot of heavy-duty custom programming in CMS, you’ll need to refer to the CMS documentation, especially the Publishing API Help file, for additional precautions and tips.

Version Control

Another useful feature offered by CMS is version control. Version control enables CMS to not only maintain the current version of content, but also previous versions. Suppose, for example, that an author creates a revision to a Web page, which an editor accepts. The next morning, everyone realizes that the author was insane, and that the old version of the page needs to be republished. You could resort to your backup files (you did make a backup, right?), but it would be easier for CMS to let you simply revert to the previous version. With version control, you can do just that.

Version history is maintained within postings. Version control doesn’t affect the posting itself, so you can’t retrieve earlier versions of the posting’s schedule, for example. But you can view:

  • The latest unapproved page revision, including its date, time, and status.

  • The date and time of all approved revisions, including the type of revision and the name of the approver.

You can preview any approved version from the past, or the current unapproved revision (if one exists), and compare them to another revision from the version history. When you compare two revisions of a posting, changes (from the oldest revision to the newest) are shown in an alternate color, or in strikethrough text, very similar to Microsoft Word’s “Track Changes” feature. This display makes it very easy to see how one revision of the posting differs from the other.

Keep in mind that pages within CMS always contain dependencies. At the very least, every page is dependent upon the page’s template, and most pages will also be dependent upon resources—such as graphics—used within the pages. Changes to a dependent resource result in version history for all pages using that dependency. For example, suppose you have four pages, all based upon the same template. A change in that template will result in four new revisions, one for each page that depends on the template.
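The four-page example can be modeled in a short Python sketch that records which template each posting depends on and adds a revision to every dependent posting when the template changes. The `RevisionTracker` class and the page and template names are hypothetical; this illustrates the bookkeeping only, not CMS's internal implementation.

```python
from collections import defaultdict


class RevisionTracker:
    """Track revisions per posting, including those caused by dependencies."""

    def __init__(self):
        self.template_of = {}             # posting name -> template name
        self.history = defaultdict(list)  # posting name -> revision notes

    def add_posting(self, posting, template):
        self.template_of[posting] = template
        self.history[posting].append("initial approval")

    def change_template(self, template):
        """Record a new revision on every posting that uses the template."""
        affected = [p for p, t in self.template_of.items() if t == template]
        for posting in affected:
            self.history[posting].append(f"template '{template}' changed")
        return affected


tracker = RevisionTracker()
for name in ("page-1", "page-2", "page-3", "page-4"):
    tracker.add_posting(name, "product-template")
tracker.add_posting("about-us", "plain-template")

affected = tracker.change_template("product-template")
print(len(affected))
```

One template change produces four new revisions, while the unrelated posting keeps its single entry.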

Remember, too, that version history is tracked with postings, and not with pages. If a page is changed fifty times before being submitted as a posting, then you won’t have any version history available for that page.

Tip

Revisions obviously occupy space in the CMS database. Administrators can periodically “clean up” by purging revisions older than a particular date and time. This action applies to all postings, enabling administrators to, for example, discard all revisions that are older than one year. However, CMS does not include any capability for automatically purging, so you’ll want to make sure a periodic revision purge is on your list of manual administrative tasks for CMS.
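The effect of a date-based purge can be sketched as follows. This Python example models the logic only; it does not call any CMS tool or API, and the function name `purge_old_revisions` and the data layout are hypothetical. Always keeping the newest revision of each posting is an assumption of the sketch, not documented CMS behavior.

```python
from datetime import datetime


def purge_old_revisions(history, cutoff):
    """Drop revisions approved before `cutoff`; return how many were removed.

    `history` maps a posting name to a list of (approved_at, label) tuples,
    oldest first. The newest revision of each posting is always kept, which
    is an assumption made for this sketch.
    """
    removed = 0
    for posting, revisions in history.items():
        if not revisions:
            continue
        older, newest = revisions[:-1], revisions[-1]
        kept = [rev for rev in older if rev[0] >= cutoff]
        removed += len(older) - len(kept)
        history[posting] = kept + [newest]
    return removed


history = {
    "Home": [(datetime(2000, 1, 1), "v1"),
             (datetime(2001, 6, 1), "v2"),
             (datetime(2002, 3, 1), "v3")],
    "Archive": [(datetime(1999, 1, 1), "v1")],
}
removed = purge_old_revisions(history, cutoff=datetime(2001, 1, 1))
print(removed)
```

Because the purge applies to all postings at once, a single cutoff date is the only input needed.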

Content Deployment

The CMS Site Deployment Manager enables you to export content from one CMS system to another. The Site Deployment Manager enables you to select the content to be exported, so that you can (for example) export only recent changes to your site. The tool automatically includes dependent objects (such as templates and resources) for export along with pages; there is no way to export a page without also exporting its dependent objects.

Caution

The automatic selection of dependent objects may sometimes have unexpected results. For example, suppose you change a page template on your development system, and then create a new page based upon that template. If you select that page for export to your production system, the modified page template will go with it, and will apply to any existing pages using the same template.
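The side effect described in this caution follows directly from how a dependency closure works. The Python sketch below computes the full set of objects that would travel with a selected page, then lists the existing destination pages that share one of the exported templates. All object names and both helper functions are hypothetical, and the sketch does not use the Site Deployment Manager or any CMS API.

```python
def export_set(selected, depends_on):
    """Return the selected objects plus everything they depend on."""
    to_export = set()
    stack = list(selected)
    while stack:
        item = stack.pop()
        if item in to_export:
            continue
        to_export.add(item)
        stack.extend(depends_on.get(item, ()))
    return to_export


def pages_affected(exported, destination_templates):
    """Existing destination pages whose template is part of the export."""
    return sorted(page for page, template in destination_templates.items()
                  if template in exported)


# Development system: a new page built on a modified template.
depends_on = {
    "new-page": ["product-template", "logo.gif"],
    "product-template": ["logo.gif"],
}
# Production system: which template each existing page uses.
destination_templates = {
    "old-page-1": "product-template",
    "old-page-2": "product-template",
    "about-us": "plain-template",
}

exported = export_set(["new-page"], depends_on)
print(sorted(exported))
print(pages_affected(exported, destination_templates))
```

Selecting a single new page is enough to change the appearance of two existing production pages in this model, which is the kind of surprise worth checking for before an export.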

The Site Deployment Manager can only export items included in the CMS database, such as pages, postings, folders, channels, templates, resources, and so forth. The tool cannot export any external items referenced within a page or a template, such as server-side include files. Anything not included in the CMS database must be manually exported or copied to the destination system.

A typical deployment scenario might have you deploying content from a development server to a read-only production server, where the content can be viewed but not changed. This is the scenario shown in Figure 9.3, earlier in this chapter. Because the production server is a CMS computer, dynamic content can still be delivered.

Note

Deployment is more than just copying Web pages, because CMS doesn’t serve content from static files. Deployment is really a matter of replicating content from one database to another. All CMS Web sites, whether development or production, must use a SQL Server database to store the content.

Frankly, CMS’s deployment capabilities are far inferior to other systems in the .NET Enterprise Servers line. Application Center, for example, provides far easier and more efficient content synchronization capabilities. This lack of functionality is another reflection of CMS’s rookie status in the .NET Enterprise Servers line (and in the Microsoft company in general), and is no doubt an area targeted for major improvement in subsequent versions. Among CMS’s major deployment weaknesses is that deployment is basically a two-step process, requiring you to export site objects (such as pages) on the source server, and then import them on the destination server. Both processes can involve site-wide content locks, preventing users from effectively using the server that you are exporting from or importing to. This behavior requires that you exercise special care when performing an import or export, although CMS does include scripts that enable you to schedule an automated import (for example) during evening hours when the fewest users will be impacted.

Supporting Technologies

Like many of the other .NET Enterprise Servers, CMS doesn’t stand alone. It relies on both SQL Server and Active Directory to provide the underlying technologies and infrastructure on which CMS runs.

Active Directory

CMS’s non-Microsoft roots show in the product’s use of Active Directory. CMS actually maintains its own internal list of users, which it simply imports from Active Directory every 10–15 minutes. You have no control over this import process; every user account in the domain becomes a potential CMS user.

Note

Of course, if you don’t assign a user to a rights group, they won’t have any special permissions. Nonetheless, many organizations create a separate Active Directory domain containing the users that they want to be in CMS. That way you can ensure that CMS imports only users that you plan to assign to rights groups.

CMS uses LDAP to access Active Directory and import users. CMS must be installed on servers that are members of the domain, and will import user accounts from that domain. I would expect future versions of CMS to integrate more closely with Active Directory, enabling you to assign Active Directory users directly to rights groups without an import process of any kind.

Note

For more information on Active Directory, see “Windows Enterprise Technologies,” p. 87

SQL Server

Any environment containing CMS will also contain SQL Server, because CMS uses SQL Server as its back-end storage for configuration information, access permissions, and, most importantly, Web content. This is actually a benefit of CMS’s non-Microsoft roots: Had Microsoft developed CMS internally, CMS would likely have used the same Extensible Storage Engine (ESE) that Exchange and SharePoint Portal Server use. While both of those products are now being redesigned to use SQL Server instead, CMS is already there.

CMS can place a heavy burden on SQL Server:

  • 100% of the content delivered by CMS comes from the SQL Server database. For every page requested by an end user, CMS must retrieve the page’s template, the actual page, and any page resources on individual calls to the database.

  • Multiple CMS servers in a site all use the same SQL Server, increasing the load on that computer.

  • All publishing, authoring, editing, and other tasks within CMS require multiple accesses to SQL Server.

  • SQL Server queries can result in relatively large results, because page content can become quite large.

In a large CMS environment, especially if you’ve decided to use non-staged sites, scaling SQL Server up by using larger server hardware may not offer enough scalability. You may need to investigate scaling SQL Server out. Figure 9.4 shows an example, where different CMS computers use different SQL Server computers. Only one SQL Server computer contains a writable copy of the site database, and the CMS computers using that database are the ones you’ll use for publishing. The other SQL Server computers use replication to maintain a read-only copy of the site database.
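The scale-out layout can be summarized as a simple assignment: CMS computers used for publishing point at the one writable database, and the CMS computers that only serve content are spread across the read-only replicas. The Python sketch below expresses that assignment. The host names and the function `assign_databases` are hypothetical, and the round-robin spread is just one reasonable way to distribute the serving computers.

```python
def assign_databases(publishing_hosts, serving_hosts, writable_db, replica_dbs):
    """Map each CMS computer to the SQL Server computer it should use."""
    if serving_hosts and not replica_dbs:
        raise ValueError("serving hosts need at least one read-only replica")
    # Publishing work must go to the only writable copy of the site database.
    assignment = {host: writable_db for host in publishing_hosts}
    # Content-serving computers are spread across the replicas in turn.
    for index, host in enumerate(serving_hosts):
        assignment[host] = replica_dbs[index % len(replica_dbs)]
    return assignment


layout = assign_databases(
    publishing_hosts=["cms-author-1"],
    serving_hosts=["cms-web-1", "cms-web-2", "cms-web-3"],
    writable_db="sql-master",
    replica_dbs=["sql-replica-1", "sql-replica-2"],
)
for host, database in layout.items():
    print(host, "->", database)
```

Adding another replica to the list spreads the read load further without touching the publishing side.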

Figure 9.4. Scaling SQL Server out, instead of up, will enable you to build larger CMS-based Web sites.

For more information on how to incorporate SQL Server into your environment, see “Incorporating SQL Server into Your Design,” p. 387

.NET Enterprise Server Integration

CMS is designed to take advantage of the services offered by some of the other .NET Enterprise Servers, even though CMS doesn’t strictly require them in order to operate.

Application Center

Application Center doesn’t integrate with CMS per se, but it does offer a great companion if you’re using staged sites. Application Center has no real place in a non-staged site. Although you can certainly use Application Center’s load balancing features to help balance traffic across your CMS servers, you don’t need any of Application Center’s content synchronization and deployment capabilities. That’s because CMS servers all retrieve their content from a centralized SQL Server database, not their local hard drives. Since CMS is already using a common storage area for content, there’s nothing to synchronize.

In a staged site, though, you deploy CMS’s content to regular Web servers. You have a choice of either deploying to each server in a Web farm, or deploying to an Application Center cluster controller and enabling Application Center to get the content out to the other Web servers. Application Center’s load balancing and centralized management make the Web farm more efficient and easier to administer, too.

Figure 9.5 shows the contrast between content synchronization in a staged site (the production site) and centralized content in a non-staged site (the development site). Although each of the two sites shown can benefit from Network Load Balancing (which is also available with Windows 2000 Advanced Server and all editions of Windows .NET Server), the staged site takes full advantage of Application Center’s capabilities.

Figure 9.5. Using Application Center in a non-staged site is overkill, since the only real advantage becomes Network Load Balancing.

For more details on what Application Center can do, see “Technology Capabilities,” p. 180

SharePoint Portal Server

SharePoint and CMS seem to occupy very similar niches. Both offer document management-like capabilities, and both are Web products. SharePoint, however, is designed as an intranet portal site, whereas CMS is very focused on Web publishing and hosting, most especially public Web site hosting (since not many companies will spend the money necessary to acquire CMS for a purely intranet site). The fact that CMS and SharePoint didn’t really offer any integration was more a function of their parallel, unintegrated development (remember, Microsoft bought CMS and developed SharePoint internally) than of any master plan on Microsoft’s part.

Microsoft recently released an Integration Pack for CMS and SharePoint Portal Server (SPS). The purpose of the Integration Pack is to provide tighter integration between the two related products. Specifically, the Integration Pack enables users to manage content more efficiently across SPS and CMS, making it easier to manage content in SPS and then publish it out through CMS for subsequent publication on the Web. The Pack also makes it easier for SPS-based intranet portals to integrate content on the corporate Web site, which would be hosted by CMS. Finally, the Web Content Management processes in CMS are exposed in SPS, providing integration between the company’s intranet portal (based on SPS) and their public Web publishing infrastructure (in CMS).

One of the best features of the Integration Pack is its price: free. Normally, that’s a free download through www.Microsoft.com/downloads/release.asp?releaseid=38801. You can read more about the Integration Pack on the CMS Web site at www.Microsoft.com/cmserver.

For more information on SPS’s capabilities, see “Technology Capabilities,” p. 354

Commerce Server

CMS’s Content Connector for Commerce Server allows CMS to work in conjunction with Commerce Server, enabling you to use CMS to create Commerce Server–based e-commerce sites. The steps for using the Content Connector are pretty involved, and you’ll need to have your site developers take extra care when developing complex Commerce Server sites under CMS. If you make the effort, though, you’ll find that CMS offers answers to many of Commerce Server’s problems:

  • CMS provides a managed authoring environment, enabling you to distribute the task of page creation to less technically experienced individuals.

  • CMS’s workflow fits into what most companies already use for their publishing workflow. For example, catalog companies already use an author-editor-approver process, which CMS mimics.

  • By allowing less technically expert authors to work within page templates to create site content, CMS really extends Commerce Server’s philosophy of delegating tasks throughout the organization, rather than dumping everything on the technical staff.

Commerce Server

For information on what Commerce Server is capable of, see “Technology Capabilities,” p. 230

Incorporating Content Management Server into Your Design

You’ll have to adopt some slightly different design techniques for CMS 2001 than for other .NET Enterprise Servers. That’s because Microsoft’s acquisition of CMS was pretty recent, and they haven’t yet released a version that Microsoft has actually developed. In fact, the CMS documentation is primarily delivered in PDF files, as opposed to Microsoft’s usual compiled HTML (CHM) help files, and the documentation’s screen shots still show the product’s original company and name, NCompass Resolution. That means CMS doesn’t always use the same procedures and methods that the other .NET Enterprise Servers share, although future versions of CMS will undoubtedly toe the Microsoft company line much more closely.

Bearing in mind that CMS is designed to host a Web site all by itself, I’ll present you with three different design scenarios. In the first, I’ll show you how to set up a basic CMS publishing infrastructure, which you’ll need to do no matter how you plan to use CMS. Then, I’ll present designs that use CMS as the Web server, and designs that enable you to publish from CMS to a standalone set of Web servers. In that latter set, I’ll include designs that utilize Application Center for content synchronization and server load balancing.

Development Infrastructure

Figure 9.6 shows a basic development setup that uses a single CMS computer and a single SQL Server for development. Optionally, as shown, you might add a second testing tier. Doing so enables your authors and Web site testers to operate against different sites, and enables you to ensure that your development-to-testing deployment techniques work properly.

Figure 9.6. You can accommodate heavier development workloads by adding CMS computers.

If you’re working in a distributed environment, you’ll want to centralize your CMS and SQL Server computers. Authors work with CMS over HTTP using relatively low-bandwidth connections; CMS, on the other hand, needs high-speed LAN connectivity to SQL Server for optimum performance. Figure 9.7 shows how users distributed in several offices can use a centralized CMS site.

Figure 9.7. The best physical location for your CMS and SQL Server computers is near the IT professionals who will support them.

CMS-Based Web Site Designs

Unless you’re using CMS to develop and manage an intranet Web site (an expensive proposition, given CMS’s per-processor license price), I don’t recommend using the same CMS computers to develop your site and host it for end users. Doing so is a security risk. Many organizations that use non-staged sites deploy their content from an internal development site to a production site. The production site is often in a completely different domain, which helps restrict the number of user accounts that CMS imports from Active Directory. Figure 9.8 shows a two-tier (development and production) CMS Web site, including the firewalls that protect the various servers.

Figure 9.8. Deploying to a separate site enables you to place that site in a demilitarized zone (DMZ) for better protection of your internal network.

How well does this type of design scale? Microsoft published a white paper (available at www.Microsoft.com/cmserver/techinfo/cms+performance-scalability.doc) that provides some clues:

  • A quad-processor SQL Server using 700MHz processors, with sufficient disk space, can support about four CMS Web servers.

  • A dual-processor CMS computer using 700MHz processors can deliver about 65 Web pages per second. That’s about 32 pages per second per processor.

  • If you use Network Load Balancing to distribute incoming load across CMS computers, you can expect a four-node cluster (with each node running four 550MHz processors) to deliver a combined throughput of about 235 Web pages per second. That’s roughly 15 pages per second per processor. The number here is lower than 32 pages per second because, for one, a slower processor is in use. Also, CMS seems to scale better across two processors than four, so you’re not really getting the full advantage of four processors.

Note

Microsoft used 550MHz computers in the cluster scenario because most companies prefer to build Web farms by using relatively low-powered computers. Doing so makes it more economically feasible to scale the farm out by buying additional servers as needed.

Those are easy numbers to work with. If your average end user requests 100 Web pages over a 10-minute session (which is quite a lot), and you plan to support 10,000 users during the traditional eight-hour business day, then you’ll probably only need one Web server (assuming your users evenly distribute themselves throughout the day). Here’s how the math falls out:

  • If each user has about a 10-minute session with 100 page requests, that’s about .17 pages per second per user for the 10-minute period.

  • If 10,000 users will each have a 10-minute session during an 8-hour period, that’s 48 10-minute session blocks (8 hours is 480 minutes, divided by 10).

  • 10,000 users spread across 48 10-minute session blocks is about 210 users per session block. If each user needs .17 pages per second, then 210 will need about 36 pages per second—well within what a single CMS computer can deliver, according to Microsoft’s benchmarks.

To turn the math around, a single CMS computer with a capability of 65 pages per second could support about 382 users per 10-minute session block, or a little more than 18,000 users in your 8-hour day (again assuming the users were evenly distributed throughout the day). Four of those dual-processor 700MHz computers could handle more than 73,000 users in a day, all using a single quad-700MHz SQL Server back-end.
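The arithmetic above can be sketched in a few lines of Python. This is a minimal sketch, not a sizing tool: the function names are hypothetical, and it keeps the chapter’s assumption that users spread themselves evenly across the day. Because it doesn’t round the intermediate values (.17 and 210), its results differ slightly from the figures in the text: roughly 35 pages per second of demand, and about 18,700 users per day for a 65-page-per-second server.

```python
def pages_per_second_per_user(pages_per_session, session_minutes):
    """Average request rate of one user during his or her session."""
    return pages_per_session / (session_minutes * 60)


def session_blocks(day_hours, session_minutes):
    """Number of back-to-back session-length blocks in the business day."""
    return (day_hours * 60) / session_minutes


def required_throughput(users_per_day, pages_per_session, session_minutes, day_hours):
    """Pages per second needed if users are spread evenly across the day."""
    users_per_block = users_per_day / session_blocks(day_hours, session_minutes)
    return users_per_block * pages_per_second_per_user(pages_per_session, session_minutes)


def daily_user_capacity(server_pages_per_second, pages_per_session, session_minutes, day_hours):
    """Users per day that a given throughput can support (the math turned around)."""
    users_per_block = server_pages_per_second / pages_per_second_per_user(
        pages_per_session, session_minutes
    )
    return users_per_block * session_blocks(day_hours, session_minutes)


# The chapter's scenario: 10,000 users, 100 pages per 10-minute session, 8-hour day.
demand = required_throughput(10_000, 100, 10, 8)
# One dual-processor 700MHz CMS computer at about 65 pages per second.
capacity = daily_user_capacity(65, 100, 10, 8)

print(f"Required throughput: {demand:.1f} pages per second")
print(f"Daily capacity of one server: {capacity:,.0f} users")
```

Real traffic is rarely distributed evenly, so treat the result as a floor and size for your busiest period rather than the daily average.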

73,000 users seems like a large number, but in Web terms it isn’t. Microsoft’s Web site, for example, takes millions of hits per day from millions of users. Once you hit the maximum capacity for a four-node load balancing cluster (which is about as many Web servers as you should expect a single SQL Server to support), you’ll need to expand with a second cluster so that you can dedicate a second SQL Server to the task, as shown in Figure 9.9.

Figure 9.9. Scaling CMS often means additional SQL Server computers to support the load.
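As a rough illustration of that expansion rule, the following sketch estimates how many CMS computers and SQL Server computers a given peak load implies. It is only a sketch under stated assumptions: the function name `size_farm` is hypothetical, the defaults (65 pages per second per dual-processor CMS computer, four CMS computers per SQL Server) come from the benchmark figures quoted earlier, and it assumes throughput scales linearly as you add nodes, which the benchmarks suggest is optimistic.

```python
import math


def size_farm(peak_pages_per_second, pages_per_node=65, nodes_per_sql_server=4):
    """Estimate (CMS nodes, SQL Server computers) for a peak load.

    ASSUMPTION: throughput scales linearly with the number of CMS nodes.
    """
    # Always plan for at least one CMS node, even for a tiny load.
    nodes = max(1, math.ceil(peak_pages_per_second / pages_per_node))
    # Each SQL Server back end supports about four CMS nodes (one cluster).
    sql_servers = math.ceil(nodes / nodes_per_sql_server)
    return nodes, sql_servers


for load in (36, 260, 300):
    nodes, sql_servers = size_farm(load)
    print(f"{load} pages/sec: {nodes} CMS node(s), {sql_servers} SQL Server(s)")
```

Under these assumptions, a load just above what four nodes can deliver is the point at which a second cluster, and a second SQL Server, enters the design.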

Note

Scaling is where CMS becomes expensive. Imagine two four-node clusters, where each node has two processors. Each cluster is supported by a four-processor SQL Server computer. That’s $888,000 in software licenses (CMS, SQL Server Enterprise Edition, and Windows 2000 Advanced Server) before you buy a single piece of hardware. Throw in the hardware costs and you could easily be looking at a cool two million just to get everything running. Those kinds of numbers are why they’re called .NET Enterprise Servers, not “.NET Small Business Servers.”
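To see how a license total of that size adds up, here is a hedged sketch of the calculation. Only the CMS price ($43,000 per processor) is taken from this chapter. The SQL Server and Windows prices, and the idea that SQL Server is licensed per processor and Windows per server, are assumptions: round numbers chosen for illustration that happen to be consistent with the total above. Check current price lists before relying on any of them.

```python
CMS_PER_PROCESSOR = 43_000   # from this chapter
SQL_PER_PROCESSOR = 20_000   # ASSUMPTION: illustrative round number
WINDOWS_PER_SERVER = 4_000   # ASSUMPTION: illustrative round number


def license_cost(clusters, nodes_per_cluster, cpus_per_node, sql_cpus_per_cluster):
    """Break down software license costs for a CMS farm.

    Assumes one SQL Server computer per cluster and one Windows
    license for every server (CMS nodes plus SQL Server computers).
    """
    cms_cpus = clusters * nodes_per_cluster * cpus_per_node
    sql_cpus = clusters * sql_cpus_per_cluster
    servers = clusters * (nodes_per_cluster + 1)
    costs = {
        "cms": cms_cpus * CMS_PER_PROCESSOR,
        "sql": sql_cpus * SQL_PER_PROCESSOR,
        "windows": servers * WINDOWS_PER_SERVER,
    }
    costs["total"] = sum(costs.values())
    return costs


# Two four-node clusters of dual-processor CMS computers,
# each backed by a four-processor SQL Server computer.
costs = license_cost(clusters=2, nodes_per_cluster=4, cpus_per_node=2, sql_cpus_per_cluster=4)
for item, amount in costs.items():
    print(f"{item:>8}: ${amount:,}")
```

Whatever the exact prices, the shape of the result is the same: the per-processor CMS licenses dominate the software bill.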

Non-CMS Web Site Designs

CMS seems to scale well enough, but nothing—and I mean nothing—beats a regular Web server serving up static Web pages. IIS 5.0 benchmarks at well over 4,000 pages per second even when many of those pages are dynamically generated—a far cry from CMS’s paltry 65 pages per second. For truly massive scalability, you’ll stage your CMS Web sites to static pages on a regular Web server if at all possible. You won’t need to worry about scaling SQL Server, and you can take advantage of Application Center to synchronize content and make site administration easier. Figure 9.10 shows an example.

Figure 9.10. Scaling this Web site is as easy as adding another Web server to the Application Center cluster.

Staging does remove some of the benefit of dynamic pages, since CMS doesn’t stage server-side scripts (it only stages the results of those scripts). Of course, the current version of CMS does offer some workarounds to this restriction, as I’ve mentioned. So, another technique might be to split your Web site, as shown in Figure 9.11. In this case, a portion of the site—all the static content—is staged to regular Web servers for fast performance. The bulk of the site’s dynamic pages remain within CMS, creating a hybrid environment.

Figure 9.11. You’ll scale each “side” of the Web site differently, using both CMS and traditional Web scaling techniques as appropriate.

Alternative Technologies and Products

Microsoft clearly isn’t alone in the content management space. In fact, until they bought CMS (and its manufacturer, NCompass), Microsoft wasn’t even a player. Other solutions abound: Vignette’s V6 system (www.vignette.com) is a major player in the field, and IBM’s Content Manager (www.ibm.com) is a well-respected product. Just about every major document management solution has Web tie-ins now, including Documentum (www.documentum.com), one of the biggest names in the document management industry.

Normally in these “Alternative Technologies” sections I let you know why Microsoft’s solution might be a bit better for you, and normally those reasons revolve around integration with the other .NET Enterprise Servers. Since CMS is a new product in the Microsoft stable, and an acquired product at that, its integration is still shaky. The SPS Integration Pack is a step in the right direction, and CMS already offers Commerce Server integration. But you’ll need to carefully examine your business needs and the features of Microsoft’s competitors to make sure that CMS is the right solution. One thing, however, is certain: If you’ve already made an investment in .NET Enterprise Servers, expect CMS to become a more integral part of that line in the next version.

Also, for all that I like to highlight CMS’s pricing ($43,000 per processor), it’s actually quite price-competitive in its field. That’s a scary thought if you’ve never investigated pricing for this type of product, but rest assured that the content management space is one of the most expensive in the horizontal software market. Do your homework and don’t be afraid to negotiate on pricing whenever you can. A Microsoft sales rep, for example, might be persuaded to offer you a sweet deal on an all-inclusive CMS/SQL/Windows package if it kept you from going with a Vignette solution.

Summary

CMS is a major new step for Microsoft, putting them head-to-head with some large vertical-market companies. As content management becomes less and less a vertical market and more mainstream, expect CMS to offer great things. Already, CMS’s Active Directory integration and easy-to-understand publishing workflow make it an excellent choice, and its scalability options make it appropriate even for large environments. If you’re on the fence about a content management solution and don’t need to make an immediate decision, wait for the next generation: The next version of CMS—and its competitors—will probably change the playing field significantly.
