16.8. Metamodeling

Modeling involves making models of business domains. Metamodeling involves making models of models: this time the things being modeled are themselves models. Just as recursion is one of the most elegant and powerful concepts in logic, metamodeling is one of the most beautiful and powerful notions in conceptual modeling. This section uses a simple example to convey the basic idea.

A database holds fact instances from a business domain, while its conceptual schema models the structure of the domain. Figure 16.23 recalls our basic view of an information system, where the information processor ensures that the database conforms to the rules laid down in the conceptual schema. Essentially, a DBMS is a system for managing various databases; for each database that models some UoD, it checks that each database state agrees with the structure specified in the conceptual schema for that UoD.

Figure 16.23. The database must conform to the structure of the conceptual schema.


Among other things, a conceptual modeling tool is a system for managing conceptual schemas. Each valid schema diagram in this book may be thought of as an output report from this system. The trick then is to treat a schema as an instance of this higher-level system. As long as we can verbalize the diagrams into atomic facts, we can use the CSDP to develop a conceptual schema for such conceptual schemas. We would then have a conceptual metaschema (a schema about schemas).

Suppose that Table 16.4 is part of an output report from a movie database. Other reports from the same domain provide further information (e.g., people’s birth places).

Table 16.4. One report extracted from a movie database.
MovieNr  Title              Director       Stars
1        Wilderness         Tony O’Connor
2        Sleepy in Seattle  Anne Withanee  Ima Dozer, Anne Withanee
3        Wilderness         Anne Withanee  Paul Bunyip

The conceptual subschema for this report is shown in Figure 16.24. The inclusive-or constraint on Person is shown explicitly since Person plays other roles in the global schema (e.g., as in Figure 16.8 from an earlier section).
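The idea that a database state must conform to its conceptual schema can be sketched for this report. The Python fragment below is a minimal illustration, not the way any particular DBMS works: it stores the facts verbalized from Table 16.4 as tuples and tests the two constraints that this section attributes to the first role of the "was directed by" predicate (mandatory and unique). The variable names, function names, and tuple layout are assumptions made for the example.

```python
# Facts verbalized from Table 16.4, stored as (movie_nr, person_name) pairs.
directed_by = [(1, "Tony O'Connor"), (2, "Anne Withanee"), (3, "Anne Withanee")]
starred = [(2, "Ima Dozer"), (2, "Anne Withanee"), (3, "Paul Bunyip")]
movies = {1, 2, 3}


def uniqueness_violations(facts, role_index):
    """Return the values that occur more than once in the given role."""
    seen, duplicates = set(), set()
    for fact in facts:
        value = fact[role_index]
        if value in seen:
            duplicates.add(value)
        seen.add(value)
    return duplicates


def mandatory_violations(population, facts, role_index):
    """Return the members of the population that never play the given role."""
    return population - {fact[role_index] for fact in facts}


# Each movie was directed by at most one person (uniqueness on the first role).
print(uniqueness_violations(directed_by, 0))
# Each movie was directed by some person (mandatory first role).
print(mandatory_violations(movies, directed_by, 0))
```

Both checks find no violations in the sample data. Recording a second director for movie 1 would make the uniqueness check report that movie.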

Figure 16.24. A conceptual schema for Table 16.4.


In the metamodeling case, the information system architecture in Figure 16.23 still applies, but the user is an information modeler, the database holds a conceptual schema, the UoD is about conceptual schemas, and the conceptual information processor ensures that only valid conceptual schemas are placed in the database by checking that they satisfy the structure specified in the meta-conceptual schema.

Rather than developing a complete metaschema for ORM conceptual schemas, let’s confine our discussion here to simple examples such as Figure 16.24, ignoring nesting, subtyping, derivation, and all constraints other than uniqueness and mandatory role constraints. If you’ve never done metamodeling before, it may seem a bit strange at first. As a challenge, try to perform CSDP step 1 using Figure 16.24 as a sample report.

Metamodeling is like ordinary modeling, except the thing being modeled is itself a model. You might begin by describing Figure 16.24 roughly. For example: “It has two entity types (Movie and Person) and one value type (MovieTitle). The first role of the ‘was directed by’ predicate is mandatory and has a uniqueness constraint”, and so on. This verbalization conveys the information, but we need to express it in terms of atomic facts. For example, using “ObjectKind” to mean “kind of object type”, we might say: “The ObjectType named ‘Movie’ is of the ObjectKind named ‘Entity’”.
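One way to picture such atomic metafacts is as plain tuples. The sketch below is only an illustration under assumed names (`object_kind_facts`, `kinds_of`): it records the three facts of the form "The ObjectType named X is of the ObjectKind named Y" implied by the rough description above, then checks that each object type has exactly one kind. Using "Value" as the name of the second object kind is an assumption.

```python
from collections import defaultdict

# Atomic metafacts: (object type name, object kind name).
object_kind_facts = [
    ("Movie", "Entity"),
    ("Person", "Entity"),
    ("MovieTitle", "Value"),  # "Value" as a kind name is assumed here
]


def kinds_of(facts):
    """Group the recorded kinds by object type name."""
    kinds = defaultdict(set)
    for object_type, kind in facts:
        kinds[object_type].add(kind)
    return dict(kinds)


kinds = kinds_of(object_kind_facts)
# Object types recorded with zero or several kinds would be listed here.
ambiguous = {name for name, ks in kinds.items() if len(ks) != 1}
print(kinds)
print(ambiguous)
```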

In previous chapters we saw that the same UoD may be modeled in different ways. This applies here too, since there are many different ways in which the information in Figure 16.24 can be verbalized as atomic facts. For example, consider the information that both roles of the predicate called “starred” are spanned by the same uniqueness constraint. How would you say this in atomic facts?

With diagrammatic applications like this, you often find that you want to talk about an object (such as a constraint) but it hasn’t got a name on the diagram. You would naturally identify it to somebody next to you by pointing to it, but this won’t help you convey the information over the telephone to someone.

In such cases it is often convenient to introduce an artificial name, or surrogate, to identify the object. This is done in Figure 16.25, where each constraint is given an identifying constraint number. For convenience, we’ve also introduced role numbers and predicate numbers (although we could have identified roles by their positions in predicates, and predicates by their expanded fact type readings).

Figure 16.25. Surrogates are added to identify constraints, roles, and predicates.


If readings are used to identify predicates, we need to expand them with the object type names (e.g., to distinguish the starred predicate in Movie starred Person from that in Play starred Person, or the runs predicate in Horse runs Race from that in Person runs Barbershop). Note that in these examples, the two “starred” predicates have the same meaning, whereas the two “runs” predicates have different meanings. We could extend the metamodel to capture this distinction, but for simplicity we ignore this issue here.

The metaschema shown in Figure 16.26 is only one of many possible solutions. As an aid to understanding, it is populated with the database, or set of facts, that corresponds to the conceptual schema shown in Figure 16.25. Here, “UC” and “MR” abbreviate “uniqueness constraint” and “mandatory role”, respectively. Whether a uniqueness constraint is internal or external is derivable by checking whether its roles come from the same predicate. Whether a mandatory constraint is simple or disjunctive (inclusive-or) is derivable by checking whether it spans multiple roles. For simplicity the diagram omits these derivations.
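The two derivations just mentioned can be sketched in a few lines of Python. This is a hedged illustration only: the role, predicate, and constraint numbers below are hypothetical and are not copied from Figure 16.25, and the dictionary layout and function names (`uc_scope`, `mr_form`, `derive`) are assumptions.

```python
# Hypothetical metafacts; the numbering is illustrative only.
role_predicate = {1: 1, 2: 1, 3: 2, 4: 2, 5: 3, 6: 3}  # role nr -> predicate nr

# constraint nr -> (constraint type, set of role nrs spanned)
constraints = {
    1: ("UC", {1}),     # uniqueness constraint on a single role
    2: ("UC", {3, 4}),  # uniqueness constraint spanning both roles of one predicate
    3: ("UC", {2, 6}),  # uniqueness constraint spanning roles of two predicates
    4: ("MR", {1}),     # mandatory constraint on one role
    5: ("MR", {2, 4}),  # mandatory constraint over two roles
}


def uc_scope(roles):
    """A UC is internal if all of its roles come from the same predicate."""
    predicates = {role_predicate[r] for r in roles}
    return "internal" if len(predicates) == 1 else "external"


def mr_form(roles):
    """A mandatory constraint is disjunctive if it spans multiple roles."""
    return "simple" if len(roles) == 1 else "disjunctive"


def derive(constraint_nr):
    """Return the derived classification for the given constraint."""
    kind, roles = constraints[constraint_nr]
    return uc_scope(roles) if kind == "UC" else mr_form(roles)


for nr in sorted(constraints):
    print(nr, constraints[nr][0], derive(nr))
```

Because both classifications follow from which roles a constraint spans, neither needs to be stored as a separate fact.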

Figure 16.26. A conceptual metaschema, populated with the schema of Figure 16.25.


Recall that other constraints are ignored in this discussion. For simplicity, only one reading is stored for each predicate, and nesting is ignored. This solution also ignores the implicit reference types implied by reference modes. If you developed an alternative metaschema, don’t forget to do a population check.

The metaschema actually contains features not present in our original example (e.g., value constraints and subtyping). So it is not rich enough to capture itself. As a nontrivial exercise you may wish to extend the metaschema until it can capture any ORM schema. For instance you can capture subtype links by adding the fact type: ObjectType is a subtype of ObjectType. To test a full ORM metaschema, you should be able to populate it with itself.

Metamodeling is not restricted to conceptual schemas. Any well-defined formalism can be metamodeled. Apart from being used to manage a given formalism, metamodels can also be developed to allow translation between different formalisms (e.g., ER, ORM, and UML). This is sometimes referred to as metametamodeling.

Because of ORM’s greater expressive power, it is reasonably straightforward to capture data models in UML or ER within an ORM framework. Although less convenient, it is possible to work in the other direction as well. To begin with, UML’s graphic constraint notation can be supplemented by textual constraints in a language of choice (e.g., OCL). Moreover, the UML metamodel itself has built-in extensibility that allows many ORM-specific constraints to be captured within a UML-based repository.

For example, the ORM model in Figure 16.27(a) contains four constraints C1..C4. While the uniqueness constraints are easily expressed in UML as multiplicity constraints, the subset and exclusion constraints have no graphic counterpart in UML. The UML metamodel fragment shown in Figure 16.27(b) extends the standard UML metamodel with constraintNr, constraintType, and elementNr attributes, and SetCompConstraint as a subtype along with the argLength attribute.

Figure 16.27. Modeling ORM set-comparison constraints in ORM and extended UML.


The full UML metamodel is vast, so we’ve included only the fragment relevant to the example. The attribute constraintType stores the type of constraint (subset, exclusion, mandatory, etc.). SetCompConstraint denotes set comparison constraint (subset, equality, or exclusion), and argLength is the argument length or number of roles (association ends) at each end of the constraint.

This metamodel fragment is probably easiest to understand in ORM. Figure 16.27(c) shows an ORM metamodel fragment with sample population based on Figure 16.27(a).

The four ORM constraints may now be stored as the object-relation shown in Table 16.5. The subset (SS) and exclusion (X) constraints have their argument length recorded. The actual arguments of these two constraints may now be derived by “dividing” the role lists by this number.

Table 16.5. Meta-table for storing ORM constraints.


Thus the arguments of the subset constraint are the simple roles r4 and r2, whereas the arguments of the exclusion constraint are the role pairs (r1, r2) and (r3, r4). The constraint type may now be used to determine the appropriate semantics.
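The "division" of a role list by the argument length can be sketched as follows. This is a minimal Python illustration; the function name `split_arguments` is hypothetical, and the ordering of the role lists is assumed to match the arguments just stated.

```python
def split_arguments(role_list, arg_length):
    """Divide a constraint's role list into arguments of arg_length roles each."""
    if arg_length < 1 or len(role_list) % arg_length != 0:
        raise ValueError("arg_length must be positive and divide the role list length")
    return [tuple(role_list[i:i + arg_length])
            for i in range(0, len(role_list), arg_length)]


# Subset constraint: argument length 1, so each argument is a single role.
subset_args = split_arguments(["r4", "r2"], 1)
# Exclusion constraint: argument length 2, so each argument is a role pair.
exclusion_args = split_arguments(["r1", "r2", "r3", "r4"], 2)
print(subset_args)
print(exclusion_args)
```

Storing one flat role list plus a single number keeps the table simple while still allowing the arguments of any set-comparison constraint to be recovered.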

Although this simple example illustrates the basic idea, transforming the complete ORM metamodel into an extended version of UML is complex. For example, as the UML metamodel fragment indicates, UML associations must have at least two roles (association ends), so artificial constructs must be introduced to deal with unaries.

The following exercise includes several questions to hone your metamodeling skills. A taste of metamodeling can really whet one’s appetite, but unfortunately this is as far as we go in this book. The authors hope that you have gained some insights into the science and art of conceptual modeling by reading this book and that you share their belief that modeling the real world is one of the most challenging, important, and satisfying human activities.

Exercise 16.8

1. Devise a conceptual metaschema (in ORM, UML, or ER) to store simple SQL schemas where each table has at most one primary key (possibly none). A primary key may be composite (multi-column), but each foreign key references a simple (single column) primary key. Tables are identified by simple names. Columns are ordered by position, and may be optional or mandatory. Each column in a primary key is mandatory, but it is possible that all columns in a keyless table are optional. Ignore domains and all other constraints (e.g., uniqueness constraints on column sets other than primary keys). For example, your metaschema should be able to store the following schema:

2. Map your answer to Question 1 to a relational metaschema.
3. A sample UML class diagram is shown.

Specify a metaschema in ORM notation for simplified UML class diagrams that are restricted to the following features. Classes have attributes only (e.g., no operations). Attributes have visibility (+ = public, # = protected, - = private, ~ = package) and multiplicity (* = 0 or more, 0..1 = 0 or 1, 1..* = 1 or more, 1 = exactly 1). Associations may be binary or n-ary. No association classes or qualified associations are allowed. Association ends have multiplicity (* = 0 or more, 0..1 = 0 or 1, 1..* = 1 or more, 1 = exactly 1). Role names are mandatory (whether or not they are displayed). The only constraints that may be added to a subclassing scheme are disjoint and complete. No derived attributes or derived associations are allowed. No notes are allowed. No aggregation is allowed.

No other graphic constraints (e.g., {ordered}, {xor}, {subset}) or textual constraints are allowed. Ignore our extensions (e.g., {P}, {U1}). Ignore data types. Ignore instance populations and presentation aspects (e.g., layout, or whether the display of a feature is suppressed). Your metaschema should be able to store a single class diagram like the one shown.

4. Specify a metaschema in ORM notation for Barker ER (as discussed in Section 8.2). Your metaschema should be able to store a single Barker ER diagram like the one shown.

5. Specify a metaschema in ORM for deterministic, finite-state automata. Two examples of a deterministic automaton are depicted in the transition graphs shown here. Each state appears as a named circle. The starting state has a no-input (unlabeled) arrow attached to it. Each automaton has one or more accepting states, depicted as named double-circles. Transitions between states are depicted as labeled arrows from a state to the next state, where the label names the action (or disjunction of actions) causing the state transition.

A sequence of transitions is acceptable (well formed) only if it ends in an accepting state. An accepting state cannot be the start state. The examples have only one accepting state, but your metaschema should allow for multiple accepting states. A finite-state automaton is deterministic if and only if for each action applied to a state there is only one possible resulting state. Your metaschema should be able to store the information content of single transition graphs like the examples shown.

6. In a nondeterministic finite-state automaton, a given action performed on a given state may result in one of many different transitions (we cannot know in advance which transition option will occur for any specific occurrence of the action being applied to the state). For instance, in the following example the action a performed on state s0 might cause a transition to state s1 at one time and to a different state at another time. Modify your answer to Question 5 to cater for nondeterministic automata.
