3.4. Review: Building the Data Dictionary

Defining Piccadilly Entries

Your task was to define the information that Piccadilly receives from its programme suppliers. This is the minimum that Perry Vale needs to make his programme buying decisions. The entries are listed alphabetically.

Director Name = * Data element. Identifies the director of a programme. *

New Programme = * Data flow. Describes a programme offered by an English or overseas programme supplier. *
Programme Name + Programme Type + Programme Description
+ Programme Duration + Programme Price
+ ?Programme Episodes
+ {Performer Name}
+ ?(Producer Name + Director Name)
+ ?Supplier Name

Performer Name = * Data element. Identifier of an actor or actress appearing in a programme. *

Producer Name = * Data element. Identifies the producer of a programme. *

Programme Description = * Data element. Synopsis of the contents of a programme. *

Programme Duration = * Data element. Running time of a programme. Units: hrs/mins/secs. *

Programme Episodes = * Data element. The number of episodes of a programme covered by an agreement with a supplier or by Piccadilly’s internal production plans. *

Programme Name = * Data element. The name that uniquely identifies this programme, for example, News at Ten, Brideshead Revisited, Coronation Street. *

Programme Price = * Data element. Price paid to the supplier of a programme. Units: pounds sterling. *

Programme Type = * Data element *
[ First-Run Film | Sporting Event | Documentary | Talk Show | Old Movie ]
* The types of programmes that Piccadilly transmits. Note that there are other programme types to add to these. *

Supplier Name = * Data element. Identification for a programme supplier. *

Although not specifically mentioned, a supplier would likely tell Piccadilly who he is when the new programme information is sent, since Perry must inform the supplier of his buying decision. Because the information was not specifically mentioned by the user, a question mark is used to indicate that an assumption is being made. A question mark can be used anywhere you’re not certain of what you are writing. Definitions and data flows highlighted by this notation can then be clarified by the users. There is nothing wrong with showing what you don’t know, but there is everything wrong with hiding it.

The supplier’s address is not listed, as it is likely to be on file and can be retrieved using the name. Perry can confirm this when you go back with questions.

You were given the information “Some programmes include the names of the producer and director.” When names appear in the data flow, does it mean that both the producer’s and the director’s names are there? Or does it mean that one or the other may be present? The English language doesn’t have any precedence rules for “and,” but the data dictionary does. You could have defined (PRODUCER NAME + DIRECTOR NAME) to mean that if any names appear, then both do; or (PRODUCER NAME) + (DIRECTOR NAME) to say that one or both or none may be present. Alternatively, if it is possible to have more than one producer and more than one director, then {PRODUCER NAME} + {DIRECTOR NAME} would be used. Again, a question mark is used in the definition as you cannot be sure about the correct meaning until you have checked with Perry Vale.
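The difference between the three notations can be made concrete with a small check. The following Python sketch is only an illustration: the function name `conforms` and the rule labels "pair", "independent", and "repeating" are hypothetical, not part of the data dictionary notation. It reports whether a given number of producer and director names is allowed under each reading.

```python
from typing import Sequence


def conforms(rule: str, producers: Sequence[str], directors: Sequence[str]) -> bool:
    """Return True if the supplied names fit the given reading of the definition.

    The rule labels are hypothetical names for the three notations:
      "pair"        -> (PRODUCER NAME + DIRECTOR NAME)
      "independent" -> (PRODUCER NAME) + (DIRECTOR NAME)
      "repeating"   -> {PRODUCER NAME} + {DIRECTOR NAME}
    """
    p, d = len(producers), len(directors)
    if rule == "pair":
        # The whole group is optional: both names appear, or neither does.
        return (p, d) in {(0, 0), (1, 1)}
    if rule == "independent":
        # Each name is optional on its own: none, one, or both.
        return p <= 1 and d <= 1
    if rule == "repeating":
        # Braces with no limits allow zero or more of each name.
        return True
    raise ValueError(f"unknown rule: {rule}")


# A flow carrying a producer but no director (the name is made up):
print(conforms("pair", ["A. Producer"], []))         # False
print(conforms("independent", ["A. Producer"], []))  # True
```

The same flow is valid under one reading and invalid under another, which is why the question mark stays in the definition until Perry Vale has confirmed the meaning.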

Although the user did not mention it, the analyst realized that there might be several episodes of a programme. The ?PROGRAMME EPISODES has been included in the definition so that the analyst remembers to ask the question.

In the description, you were told that there are other types of programmes. The comment in the data dictionary entry for PROGRAMME TYPE reminds you that some more values need to be defined.

Defining Ratecard

This definition was derived from a sample page of a ratecard presented in Figure 1.5.2.

Ratecard = * Data flow. Prices, moveability, and preemption rules of time available for sale. *
Rate From Date
+ ?{Rate Spot Duration
+ {Rate Segment Day
+ {Rate Segment Start + Rate Segment End
+ {Rate Moveability + Spot Price}}}}

Rate From Date = * Data element. The commencement date for a new ratecard period. Format: Day/Month/Year. *

Rate Moveability = * Data element. Rate moveability as defined in the ratecard. *
[Fixed: fixed on a nominated day and break | Broad: moveable within a specified segment on a nominated day | ROD: run-of-day, moveable to any similarly priced segment on a nominated day | ROW: run-of-week, moveable to any similarly priced segment during a week]

Rate Segment Day = * Data element. Day(s) of week on which this segment of time occurs. *
[ Weekday | Saturday | Sunday ]

Rate Segment End = * Data element. The end time for a ratecard segment. *

Rate Segment Start = * Data element. The start time for a ratecard segment. *

Rate Spot Duration = * Data element. Duration of spots as defined in the ratecard. Units: seconds. *
[10|20|30|40|50|60]

Spot Price = * Data element. Ratecard price for a spot rate. Units: pounds sterling. *

Your names may be different from those that we have used, but their meanings should be substantially the same. Take a moment to reconcile your names with ours. Also, make sure that you understand ours because, naturally enough, we will be using them for the rest of the Project.

Our definition of RATECARD has a ? before RATE SPOT DURATION. We have assumed that when a new ratecard is issued for a quarter, the rates for all durations are changed. So for one RATE FROM DATE, there are a number of RATE SPOT DURATIONs. The brace indicates that there are several RATE SPOT DURATIONs, and the question mark says that we are not certain.

Make sure that you have your pairs of braces in the right places. If you specify the repeating groups incorrectly, it will mislead the file designers later when they receive your specification.
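One way to check the placement of your braces is to write the definition as nested record types, one level per pair of braces. The Python sketch below assumes our definition of RATECARD is correct. The class names, the helper `count_prices`, and all the sample values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RateOption:
    """Innermost group: {Rate Moveability + Spot Price}."""
    moveability: str
    spot_price: int  # pounds sterling


@dataclass
class TimeBand:
    """{Rate Segment Start + Rate Segment End + {...}}."""
    start: str
    end: str
    options: List[RateOption] = field(default_factory=list)


@dataclass
class DayGroup:
    """{Rate Segment Day + {...}}."""
    day: str
    bands: List[TimeBand] = field(default_factory=list)


@dataclass
class DurationGroup:
    """?{Rate Spot Duration + {...}} (the ? marks our assumption)."""
    spot_duration: int  # seconds
    days: List[DayGroup] = field(default_factory=list)


@dataclass
class Ratecard:
    """Rate From Date + the outermost repeating group."""
    rate_from_date: str
    durations: List[DurationGroup] = field(default_factory=list)


def count_prices(card: Ratecard) -> int:
    """Count the spot prices held at the innermost level of nesting."""
    return sum(
        len(band.options)
        for duration in card.durations
        for day in duration.days
        for band in day.bands
    )


# Illustrative values only; they are not taken from the sample ratecard page.
sample = Ratecard(
    rate_from_date="01/01/1990",
    durations=[
        DurationGroup(
            spot_duration=30,
            days=[
                DayGroup(
                    day="Weekday",
                    bands=[
                        TimeBand("18:00", "19:00", [
                            RateOption("Fixed", 2000),
                            RateOption("Broad", 1500),
                        ]),
                        TimeBand("19:00", "20:00", [
                            RateOption("Fixed", 3000),
                        ]),
                    ],
                )
            ],
        )
    ],
)

print(count_prices(sample))  # 3
```

If a brace were misplaced, for example closing the segment group before the moveability group, the nesting of these types would change, and a price could no longer be tied to a particular segment.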

The ratecard is made up of a number of components; these are defined in turn. When the component is a data element, it has the comment * Data element *. For example, RATE SPOT DURATION is a data element, and its definition shows all the possible values it can have. RATE FROM DATE is another data element, but has an almost infinite number of possible values. There is no point attempting to list them in the dictionary, and dates are commonly understood. So the definition simply has a comment, and points out that the format is day/month/year.

It is not possible to define all possible values of SPOT PRICE by looking at the sample ratecard page in Figure 1.5.2. There are other pages of the ratecard, for spots of different durations. You were told there are similar ratecard pages for 10-, 20-, 40-, 50-, and 60-second spots. Besides, over the life of the system, there will be an almost infinite number of values for this item. To find more information about this data element, you would look through the other pages of the ratecard. You’ll find that the cheapest price is £150 for a 10-second ROW spot in the last segment of the day. The highest-priced spot, a 60-second Fixed spot in the prime segment, sells for £25,000. You could enhance your definition in the data dictionary with a value range:

Spot Price = * Data element. Ratecard price for a spot rate. Units: pounds sterling. Value range: >= 150 and <= 25000. *
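A value list or value range in the dictionary translates directly into a validation rule. As a minimal sketch, assuming the range above and the value list given for RATE SPOT DURATION, the checks could look like this in Python. The constant and function names are hypothetical.

```python
# Values taken from the dictionary entries; the names are illustrative.
ALLOWED_SPOT_DURATIONS = (10, 20, 30, 40, 50, 60)  # seconds
MIN_SPOT_PRICE = 150      # pounds sterling
MAX_SPOT_PRICE = 25000    # pounds sterling


def is_valid_spot_duration(seconds: int) -> bool:
    """RATE SPOT DURATION must be one of the listed values."""
    return seconds in ALLOWED_SPOT_DURATIONS


def is_valid_spot_price(pounds: int) -> bool:
    """SPOT PRICE must satisfy: >= 150 and <= 25000."""
    return MIN_SPOT_PRICE <= pounds <= MAX_SPOT_PRICE


print(is_valid_spot_duration(30))   # True
print(is_valid_spot_duration(45))   # False
print(is_valid_spot_price(149))     # False
print(is_valid_spot_price(25000))   # True
```

Bear in mind that the range reflects the current ratecard only. A new ratecard period could change the limits, so the range is something to confirm with the users rather than a permanent rule.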

Adding to Your Data Model

You were also asked to study your definitions of the data flows to see if they revealed any new or updated entities. Then you added any new information you found to the data model. Figure 3.4.1 is an updated version of the data model in Figure 3.3.3.

Figure 3.4.1: Analysis of the data flow NEW PROGRAMME reveals three new potential entities—DIRECTOR, PRODUCER, and PERFORMER—and appropriate relationships. Analysis of RATECARD unveils RATECARD PERIOD and RATECARD SEGMENT, together with some new relationships.

When you built your first-cut data model, you included an entity called PROGRAMME. At that stage, you didn’t have enough information to write a detailed definition of it. However, you now know that most of the data content of PROGRAMME comes from the data flow NEW PROGRAMME. Now that you have defined the content of the flow, you can begin to define the entity PROGRAMME.

Let’s say that the users have accepted our version of the definition for a new programme:

New Programme = * Data flow. Describes a programme offered by an English or overseas programme supplier. *
Programme Name + Programme Type + Programme Description
+ Programme Duration + Programme Price
+ ?Programme Episodes
+ {Performer Name}
+ ?(Producer Name + Director Name)
+ ?Supplier Name

The first five items are data elements, and, as they describe a programme and are remembered by the system, you can say they are attributes of the entity PROGRAMME. Entities are defined by listing their components just as if they were data flows (a sample appears below).

PROGRAMME EPISODES has a ? beside it in the definition of the data flow because a question occurred to the analyst: “Do the suppliers tell Piccadilly if a new programme has multiple episodes?” If the answer to this question is yes, the question mark will be removed; otherwise, the element will be removed from the definition. Let us say that in this case the users agreed that PROGRAMME EPISODES should be part of the definition, so we can also attribute it to the entity PROGRAMME. Your dictionary can now support your data model with this definition:

Programme = * Entity. A television programme made by Piccadilly or bought from a programme supplier. *
Programme Name + Programme Type + Programme Description
+ Programme Duration + Programme Price
+ Programme Episodes + Programme Purchase Date

We added the * Entity * comment to differentiate between the types of definitions in the data dictionary. You will find this approach a useful aid in finding your way around a large data dictionary.

Did you include {PERFORMER NAME} in the definition of PROGRAMME? We hope not. While {PERFORMER NAME} lists the main performers in the programme, you cannot say that a performer’s name describes a programme; it describes a performer. Instead of attributing this data item to PROGRAMME, it should be attributed to a new entity called PERFORMER. As usual, you support your data model with a data dictionary definition:

Performer = * Entity. Actor or actress appearing in a programme. *
Performer Name

Two other entities derived from the data in NEW PROGRAMME are DIRECTOR and PRODUCER. Adding these two entities to the data model, you define them:

Director = * Entity. The director of a programme. *
Director Name

Producer = * Entity. The producer of a programme. *
Producer Name

The NEW PROGRAMME data flow carries data about a number of different entities that are grouped in one data flow because there is a need for Piccadilly to know which director has directed a programme. When the data are stored as entities, there is a need to remember the relationship between them. The name DIRECTING given to the relationship in Figure 3.4.1 describes the reason for the link. Similarly, the data flow reveals a relationship PRODUCING between the PROGRAMME and its PRODUCER.

Think of these additions to the data model as potential entities and relationships. For example, we are not yet sure, within this context of study, whether Piccadilly thinks of DIRECTOR as an entity or whether the DIRECTOR NAME is an attribute of the PROGRAMME entity. We will verify this when we do a detailed analysis of the use of the data. Until then, you need to show all potential entities and relationships in the data model because by doing so, you make the questions obvious.
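The attribution described above, where one data flow is split into several potential entities and relationships, can be sketched as a small function. This Python example is an illustration under assumptions: it treats the flow as a dictionary keyed by data element name, and the function name `split_new_programme` and the sample producer and director names are hypothetical. Only the relationships named in Figure 3.4.1, DIRECTING and PRODUCING, are generated.

```python
from typing import Any, Dict, List, Tuple

# Data elements of the flow that describe the programme itself.
PROGRAMME_ATTRIBUTES = (
    "Programme Name", "Programme Type", "Programme Description",
    "Programme Duration", "Programme Price", "Programme Episodes",
)


def split_new_programme(flow: Dict[str, Any]) -> Dict[str, Any]:
    """Attribute each element of a NEW PROGRAMME flow to a potential entity."""
    name = flow["Programme Name"]
    programme = {key: flow[key] for key in PROGRAMME_ATTRIBUTES if key in flow}

    # {Performer Name} repeats, so each name becomes one PERFORMER occurrence.
    # The text does not name the PERFORMER relationship, so none is created here.
    performers = [{"Performer Name": p} for p in flow.get("Performer Name", [])]

    # Each relationship is recorded as (relationship name, person, programme).
    relationships: List[Tuple[str, str, str]] = []
    if "Director Name" in flow:
        relationships.append(("DIRECTING", flow["Director Name"], name))
    if "Producer Name" in flow:
        relationships.append(("PRODUCING", flow["Producer Name"], name))

    return {
        "programme": programme,
        "performers": performers,
        "relationships": relationships,
    }


# Illustrative flow; the people named here are made up.
result = split_new_programme({
    "Programme Name": "Brideshead Revisited",
    "Programme Type": "Documentary",
    "Performer Name": ["A. Performer", "B. Performer"],
    "Director Name": "A. Director",
    "Producer Name": "A. Producer",
})
print(result["relationships"])
```

If later analysis shows that Piccadilly treats DIRECTOR NAME as an attribute of PROGRAMME, the DIRECTING relationship disappears and the name simply joins the programme attributes.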

In the first version of the data model that you built (Figure 3.2.8), an entity called SPOT RATE is related to many commercial spots. At that stage of your analysis, Piccadilly’s pricing looked quite straightforward. Since analyzing the sample ratecard, you now find that the pricing is more complex than it appeared when you did the first-cut data model. Your detailed study of the ratecard has resulted in this definition:

Ratecard = * Data flow. Prices, moveability, and preemption rules of time available for sale. *
Rate From Date
+ ?{Rate Spot Duration
+ {Rate Segment Day
+ {Rate Segment Start + Rate Segment End
+ {Rate Moveability + Spot Price}}}}

The data model should reflect the reality of the business. So instead of the single entity called SPOT RATE, you can break it down into separate entities, each one describing something that is familiar and important to the business:

Ratecard Period = * Entity. Period during which given rates apply. *
Rate From Date

Ratecard Segment = * Entity. Continuous band of time defined on ratecard. *
Rate Segment Day + Rate Segment Start + Rate Segment End

Spot Rate = * Entity *
Rate Spot Duration + Spot Price + Rate Moveability

Relationships

Like entities, relationships must reflect the policy of the business. The LEGAL PLACEMENT relationship between RATECARD SEGMENT and SPOT RATE is there to define the segment placement offered by a particular set of moveability rules. This relationship is introduced because the price paid for a spot depends in part on the segments in which it can be transmitted. The relationship is many to many, as a rate can apply to several segments, while the one segment can have many rates applicable to it.
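A many-to-many relationship such as LEGAL PLACEMENT can be pictured as a set of pairs, each pair linking one spot rate to one ratecard segment. The Python sketch below assumes that a segment is identified by its day, start, and end, and that a rate is identified by its duration, moveability, and price. These identifiers, the function names, and all sample values are hypothetical.

```python
from typing import Set, Tuple

Segment = Tuple[str, str, str]  # (Rate Segment Day, Rate Segment Start, Rate Segment End)
Rate = Tuple[int, str, int]     # (Rate Spot Duration, Rate Moveability, Spot Price)

# Illustrative occurrences; the values are not from the ratecard.
evening: Segment = ("Weekday", "18:00", "19:00")
late: Segment = ("Weekday", "23:00", "24:00")
rod_rate: Rate = (30, "ROD", 1200)
fixed_rate: Rate = (30, "Fixed", 2000)

# LEGAL PLACEMENT: each pair says "this rate may be placed in this segment".
legal_placement: Set[Tuple[Rate, Segment]] = {
    (rod_rate, evening),
    (rod_rate, late),
    (fixed_rate, evening),
}


def segments_for(rate: Rate, placements: Set[Tuple[Rate, Segment]]) -> Set[Segment]:
    """All segments in which a spot bought at this rate may be transmitted."""
    return {segment for r, segment in placements if r == rate}


def rates_for(segment: Segment, placements: Set[Tuple[Rate, Segment]]) -> Set[Rate]:
    """All rates that apply to this segment."""
    return {rate for rate, s in placements if s == segment}


print(len(segments_for(rod_rate, legal_placement)))  # 2
print(len(rates_for(evening, legal_placement)))      # 2
```

One rate reaches several segments and one segment carries several rates, which is what the many-to-many notation on the data model expresses.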

The MOVEABILITY relationship between COMMERCIAL SPOT and RATECARD SEGMENT is established when the agency agrees to buy a spot with certain moveability conditions. (Broad spots can be moved within a segment; run-of-day spots moved to similarly priced segments on the same day; and run-of-week spots moved to similarly priced segments over the week.) The relationship exists to link a commercial spot with the segments in which it may be broadcast.

Eventually, you’ll verify each of these relationships and define it in the data dictionary. You can do this once you have acquired more knowledge of the processes that create and use the relationships.

Ski Patrol

Here’s some first aid with the data dictionary definitions. At this stage, we want you to feel secure about defining data. We trust that the reason for writing definitions was revealed by the exercise, namely to better understand the data and the system that uses the data. If you think you can model a system without defining the data, please rethink. If you are not happy with the meaning of the operators, we can only suggest reviewing them in Chapter 2.9 Data Dictionary. If you were not happy with our interpretation of the data, take a few moments to reread the problem statement in Chapter 1.5 Building the Data Dictionary. This time, have the sample answers beside the problem statement. Note how the nouns that are important to Piccadilly’s business make up the data dictionary definitions.

We suspect the data model was your biggest problem. The most important thing we have to say about the data model is, “Don’t panic!” Data models don’t come easily to most people, so you are not alone. Data modeling requires knowledge of the subject matter and the ability to view that subject at a high level of abstraction. The Piccadilly Project will give you plenty of practice in developing these skills, and you will progressively improve your data model. If you attempted to do the data modeling part of the exercise without reading and doing the exercises in Chapters 2.4 Data Viewpoint and 2.5 Data Models, we suggest a detour through those chapters.

So far, we have approached the data model somewhat piecemeal. We have attempted to build a data model from a general description of the business, and have made some small alterations to it by analyzing data flows. This approach could be classified as a “fuzzy top-down approach.” Its strength lies in providing a way of getting started with a complex problem.

There is more to come: When we work through the chapters on essential event-response models, we’ll look at another way to model the system’s stored data. This approach works by partitioning the data model into small logical chunks based on the need to support one event (this is explained in Chapter 2.11 Event-Response Models). By the time you have experienced both approaches, we’re sure that data modeling will be much clearer to you. And, as is necessary in practice, you will be in a position to blend both approaches.

Finish comparing your data dictionary and data model upgrades with the samples. Make sure that you can reconcile any differences between your answers and ours.
