UML provides great artifacts that aid traceability from start to finish; however, they don't directly clarify the distributed or throughput requirements of the application. Although component and deployment diagrams can model the notion of “location,” applications need a set of views that deal specifically with network and database loading and distribution. The good news is that the input to this effort comes directly from the use-cases, class diagram, and sequence diagrams; thus the project still weaves in traceability. These non-UML diagrams I call usage matrices. There are three types:
Event/frequency
Object/location
Object/volume
The event/frequency matrix applies volumetric analysis to the events the system will automate. It begins to establish a basis for what will become important decisions about the location of both program code and data. The matrix applies to any application, regardless of whether the requirements call for distributed elements. Even if only one location will be served, this form of matrix will present a network-loading picture to the operations staff.
The event/frequency matrix has as its input the event table created during the initial project scoping effort. Recall from Chapter 3, when we did project scoping for Remulak Productions, that we added more columns to the event table that attempted to capture the frequency information, albeit informally (see Table 3-3). The event/frequency matrix adds the element of location to the picture. The goal is to ask questions concerning throughput and growth spread over the dynamic spectrum of current and potential geographic locations.
Table 7-3 is an abbreviated event/frequency matrix for Remulak Productions. The list of events is incomplete, and growth is expressed as a percentage per year.
| Event | Newport Hills, Wash.: Frequency | Newport Hills, Wash.: Growth (per year) | Portland, Maine (proposed): Frequency | Portland, Maine (proposed): Growth (per year) |
|---|---|---|---|---|
| Customer Places Order | 1,000/day | 20% | 200/day | 40% |
| Shipping Clerk Sends Order | 700/day | 20% | 130/day | 40% |
| Customer Buys Warranty | 60/day | 5% | 10/day | 10% |
| Customer Changes Order | 200/day | 20% | 80/day | 20% |
| Supplier Sends Inventory | 10/day | 10% | 3/day | 10% |
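Growth rates like those in Table 7-3 compound year over year. A minimal sketch in Python of projecting a daily event frequency forward (the figures come from the Customer Places Order row; the three-year horizon is an arbitrary choice for illustration):

```python
def project_frequency(events_per_day: float, annual_growth: float, years: int) -> float:
    """Compound a daily event frequency forward by an annual growth rate."""
    return events_per_day * (1 + annual_growth) ** years

# Customer Places Order at Newport Hills: 1,000/day growing 20% per year
print(round(project_frequency(1000, 0.20, 3)))  # -> 1728 orders/day after three years
```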
Some of the information in Table 7-3 surfaced early during the project charter effort and by now has been refined and made more accurate. For example, early in the project the number of changed orders was much lower. Further research has caused that number to increase tremendously. Notice the column that represents a proposed East Coast location for Remulak Productions. Although this location isn't operational yet, we can take it into account when devising the final architecture solution and thereby improve the solution.
In a networked environment, an application's throughput is a function not so much of the media being used (fiber) or the data link layer being deployed (frame relay) but rather of the abuse that the application will inflict on the network technology. Granted, if the application will require subsecond response times to countries that have no communications infrastructure, the network technology itself will have a large impact. However, the very act of trying to collect these types of statistics might prove very difficult at best, and wrong information can be disastrous.
The operations staff might not care about 1,000 orders per day, but the fact that 90 percent of those orders arrive between 8:00 A.M. and 9:00 A.M. would flag a potential problem to be addressed. So the matrix is more effective if it also captures the anticipated peak-hour frequency, input that will be far more meaningful to the operations staff.
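To make the point concrete, a short sketch of the peak-hour arithmetic (the 90 percent figure is from the example above; the eight-hour business day is an assumption for comparison):

```python
def peak_hour_rate(events_per_day: float, peak_fraction: float) -> float:
    """Events that land in the single busiest hour of the day."""
    return events_per_day * peak_fraction

def even_hourly_rate(events_per_day: float, business_hours: int = 8) -> float:
    """Rate if the same events were spread evenly across the business day."""
    return events_per_day / business_hours

print(peak_hour_rate(1000, 0.90))  # -> 900.0 orders in one hour
print(even_hourly_rate(1000))      # -> 125.0 orders/hour, spread over 8 hours
```

The two rates differ by a factor of seven, which is exactly the kind of signal the operations staff needs.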
On the basis of the numbers and the anticipated growth rates, decisions will be made that affect the following:
Where the processes (objects) that satisfy the events will reside
The communications infrastructure that might need to be in place to satisfy the desired throughput levels
Component granularity (the number of packages and classes)
Eventually we will be able to approximate just what the Customer Places Order event is in terms of the amount of information moving across the “pipe.” This information is ultimately what the network staff will require. However, loadings can be simulated to anticipate potential bottlenecks much earlier than most project teams realize.
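Once the average payload per event is approximated, the load on the "pipe" can be sketched directly. A hypothetical estimate (the 2 KB payload and the 900-per-hour peak rate are illustrative assumptions, not Remulak figures):

```python
def peak_bits_per_second(events_per_hour: float, bytes_per_event: int) -> float:
    """Average wire rate implied by an event rate and a payload size."""
    return events_per_hour * bytes_per_event * 8 / 3600

# 900 peak-hour orders at roughly 2 KB per order message
print(peak_bits_per_second(900, 2048))  # -> 4096.0 bits/second
```

Numbers like these, even as rough averages, let the network staff simulate loadings and spot bottlenecks long before deployment.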
The object/location matrix focuses on where various objects may need to reside to meet performance criteria. It is especially useful when multiple locations will require access to the application. For the matrix to be effective, we need information not only about the locations but also about the kind of access the objects will require.
The object/location matrix captures two dimensions of the application concerning objects and locations:
Breadth of object access:
A = All object occurrences
S = Subset of object occurrences (the subset must be specified)
Pattern of object access:
R = Read-only (no operations require update activity)
U = Update (all types of operations are possible, including read)
In the current case, Remulak Productions wants the Newport Hills facility to have update access (U) to all objects in the system (A), including the proposed new location at Portland. The Portland location will need read-only access (R) to all objects in the system (A), but it will be able to update (U) only those objects that are serviced at Portland (S). An 800 number will enable customers to reach a dynamic call-routing system that is based on the calling area code and that will shuttle calls to the appropriate call center. Table 7-4 is an object/location matrix for Remulak Productions.
The object/location matrix quickly paints a picture of eventual object distribution in the system and the ultimate determination of how any database replication strategies might be laid out. Specifically, the object/location matrix influences decisions that will affect the following:
| Object | Newport Hills, Wash. | Portland, Maine (proposed) |
|---|---|---|
| Customer | AU | AR, SU |
| Order | AU | AR, SU |
| OrderHeader | AU | AR, SU |
| OrderLine | AU | AR, SU |
| OrderSummary | AU | AR, SU |
| Shipment | AU | AR, SU |
| Address | AU | AR, SU |
| Role | AU | AR, SU |
| Invoice | AU | AR, SU |
| Payment | AU | AR, SU |
| Product | AU | AR, SU |
| Guitar | AU | AR, SU |
| SheetMusic | AU | AR, SU |
| Supplies | AU | AR, SU |

AU = Update access to all object occurrences allowed
AR = Read access to all object occurrences allowed
SU = Update access to a subset of object occurrences allowed
Where physical objects will reside in the application. Unless the application will use an object-oriented database, which Remulak will not, this decision will affect the design of the underlying relational database.
Data segmentation and distribution strategies, such as replication and extraction policies.
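The A/S and R/U codes in the matrix amount to an access rule per (object, location) pair, and it can help to see that rule spelled out. A minimal sketch, assuming a hypothetical dictionary encoding of two cells of Table 7-4:

```python
# Hypothetical encoding of two cells of the object/location matrix
ACCESS = {
    ("Customer", "Newport Hills"): {"AU"},
    ("Customer", "Portland"): {"AR", "SU"},
}

def can_update(obj: str, location: str, occurrence_is_local: bool) -> bool:
    """Apply the matrix codes: AU updates any occurrence; SU updates only
    the location's own subset; AR alone is read-only."""
    codes = ACCESS.get((obj, location), set())
    if "AU" in codes:
        return True
    if "SU" in codes:
        return occurrence_is_local
    return False

print(can_update("Customer", "Portland", occurrence_is_local=False))  # -> False
print(can_update("Customer", "Portland", occurrence_is_local=True))   # -> True
```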
Common patterns found in the matrix will lead to specific solutions (or at least consideration of them). A classic pattern, which does not show up in Remulak Productions' object/location matrix, is one location having update access to all object occurrences (AU), and the remaining sites having read access to all object occurrences (AR). This pattern might be common when code tables, and in some cases inventory, are housed and managed centrally. It usually leads to some type of database snapshot extraction approach by which locations receive refreshed copies of a master database.
In another, similar pattern, one location has update access to all object occurrences (AU), and the remaining sites have read access only to their own unique subset (SR). Depending on the database technology chosen, such as Oracle or Microsoft SQL Server, many of these issues can be handled with out-of-the-box solutions.
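The classic one-master pattern can even be detected mechanically from the matrix. A sketch (the strategy descriptions are illustrative, not prescriptive):

```python
def replication_strategy(site_access: dict) -> str:
    """Classify one object's per-site access pattern.  site_access maps a
    site name to its set of access codes, e.g., {"AU"} or {"AR", "SU"}."""
    masters = [s for s, codes in site_access.items() if "AU" in codes]
    readers = [s for s in site_access if s not in masters]
    if len(masters) == 1 and all(site_access[s] <= {"AR", "SR"} for s in readers):
        return "central master with snapshot extracts to read-only sites"
    return "case-by-case analysis (e.g., subset updates need replication rules)"

# Classic pattern: one updating hub, read-only satellites
print(replication_strategy({"HQ": {"AU"}, "Branch": {"AR"}}))
# Remulak's pattern: Portland can update its own subset, so no simple snapshot
print(replication_strategy({"Newport Hills": {"AU"}, "Portland": {"AR", "SU"}}))
```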
The object/volume matrix looks primarily at the number of objects used at specific locations and their anticipated growth over time. It is beneficial for both single-location and multiple-location applications. It uses the same x- and y-axes (object and location) as the object/location matrix. Table 7-5 is the object/volume matrix for Remulak Productions.
The object/volume matrix will affect several areas of the design, including the following:
Server sizing, as it pertains to both the database server and the application server that might house the application's Business Rule Services layer. The sizing pertains not only to disk storage but also to memory and CPU throughput, and the quantity of CPUs per server.
Database table size allocations, free space, and index sizing. Also affected will be the logging activities and how often logs are cycled, as well as backup and recovery strategies, considering the volumes expected at a given location.
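The volume and growth columns translate directly into rough storage projections of the kind a database administrator needs. A sketch (the 200-byte row size and 30 percent index overhead are assumptions for illustration only):

```python
def projected_rows(volume_hundreds: int, annual_growth: float, years: int) -> float:
    """Object occurrences after compound growth (Table 7-5 volumes are in hundreds)."""
    return volume_hundreds * 100 * (1 + annual_growth) ** years

def table_footprint_mb(rows: float, avg_row_bytes: int, index_overhead: float = 0.30) -> float:
    """Rough table-plus-index size in megabytes."""
    return rows * avg_row_bytes * (1 + index_overhead) / 1_000_000

# OrderLine at Newport Hills: 3,400 hundreds today, growing 35% per year
rows = projected_rows(3400, 0.35, 3)
print(round(rows), "rows,", round(table_footprint_mb(rows, 200)), "MB after three years")
```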
Obviously, for any application these numbers will not be exact, and they will change. They are approximations that support planning and implementation tactics for the application.
The usage matrices introduced in this section add perspective to the four dynamic UML diagrams. They reinforce traceability because they take their input directly from the artifacts produced earlier in the chapter.
| Object | Newport Hills, Wash.: Volume (100s) | Newport Hills, Wash.: Growth (per year) | Portland, Maine (proposed): Volume (100s) | Portland, Maine (proposed): Growth (per year) |
|---|---|---|---|---|
| Customer | 750 | 20% | 150 | 60% |
| Order | 1,400 | 25% | 275 | 25% |
| OrderHeader | 60 | 5% | 10 | 10% |
| OrderLine | 3,400 | 35% | 700 | 35% |
| OrderSummary | 1,400 | 25% | 275 | 25% |
| Shipment | 2,200 | 10% | 500 | 10% |
| Address | 2,000 | 10% | 450 | 20% |
| Role | 2,600 | 10% | 600 | 10% |
| Invoice | 1,700 | 25% | 500 | 25% |
| Payment | 1,900 | 25% | 400 | 25% |
| Product | 300 | 15% | 300 | 15% |
| Guitar | 200 | 5% | 200 | 5% |
| SheetMusic | 50 | 5% | 50 | 5% |
| Supplies | 50 | 5% | 50 | 5% |