DATA NAMING CONVENTIONS

The purpose of a data naming convention

A data naming convention should be developed to provide consistent, unique and meaningful names for all existing and new items within the enterprise’s common data resource. A consistent approach to data naming should be applied across the enterprise to help achieve unambiguous understanding of data. Data names need to be both unique and meaningful to the business; unique so that the data objects to which they refer can be unambiguously identified and meaningful so that the terms used in the names are terms familiar to the business. Technical terms and abbreviations should be avoided wherever possible.

The data names should, however, also aid communication between all those involved in the use and development of information systems; they must be meaningful to those involved in systems analysis and design as well as to the business.

Consistent, unique and meaningful names can only be achieved if the conventions are well known and easily understood and followed. The conventions must be enforceable.

A typical data naming convention

All data naming conventions rely on using terms, which may be either single words or a number of words, in a precise and predefined manner. Key to this in all naming conventions is what is known as the class word or term. This is also sometimes known as a representation term. Typical class words might be, for example, ‘identifier’, ‘number’ and ‘text’, providing an indication of the form or the representation of the data.

The typical data naming convention that I describe provides names for ‘data items’ constructed from three kinds of term:

  • a mandatory prime term that provides the context of the data, which normally means the entity or table holding the ‘data item’;

  • one or more optional modifier terms that are used to make the meaning of the data explicit;

  • a mandatory class term that indicates the ‘class’ of the data.

Examples of some names constructed using this convention are:

Employee Height

Employee Hair Colour

Employee Eye Colour

Employee Birth Date

Employee Employment Start Date

Employee Qualification Effective Date

Employee Qualification Issuing Authority Name.

In these names, ‘Employee’ and ‘Employee Qualification’ are prime terms; ‘Hair’, ‘Eye’, ‘Birth’, ‘Employment Start’, ‘Effective’ and ‘Issuing Authority’ are modifier terms; and ‘Date’, ‘Height’, ‘Colour’ and ‘Name’ are class terms.
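The prime-modifier-class structure can be sketched in code. The following is a minimal illustration only, assuming a small hypothetical list of prime terms and class terms taken from the examples above; the names `DataName` and `parse_data_name` are invented for this sketch, and the rule that the longest matching prime term wins is an assumption rather than part of the convention itself.

```python
from dataclasses import dataclass

# Hypothetical lists for illustration, drawn from the example names above.
PRIME_TERMS = {"Employee", "Employee Qualification"}
CLASS_TERMS = {"Date", "Height", "Colour", "Name"}


@dataclass(frozen=True)
class DataName:
    prime: str       # mandatory: the context (entity or table)
    modifiers: str   # optional: empty string when no modifier term is present
    class_term: str  # mandatory: the 'class' of the data


def parse_data_name(name: str) -> DataName:
    """Split a data name into prime, modifier and class terms."""
    words = name.split()
    if len(words) < 2:
        raise ValueError(f"'{name}' needs at least a prime term and a class term")

    # The class term is taken to be the final word of the name.
    class_term = words[-1]
    if class_term not in CLASS_TERMS:
        raise ValueError(f"'{class_term}' is not an approved class term")

    # Assumption: try the longest candidate prime term first, so that
    # 'Employee Qualification' is preferred over 'Employee'.
    body = words[:-1]
    for n in range(len(body), 0, -1):
        candidate = " ".join(body[:n])
        if candidate in PRIME_TERMS:
            return DataName(candidate, " ".join(body[n:]), class_term)

    raise ValueError(f"'{name}' does not start with an approved prime term")
```

For example, `parse_data_name("Employee Qualification Issuing Authority Name")` yields the prime term ‘Employee Qualification’, the modifier ‘Issuing Authority’ and the class term ‘Name’, while ‘Employee Height’ parses with no modifier at all.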

With very few exceptions, abbreviations should not be used in the names of objects that are part of a conceptual data model. The only exceptions should be where an abbreviation has achieved common usage and the full term is seldom, if ever, used, for example, UN instead of United Nations. Because of constraints in the database management system software, it may be necessary to abbreviate ‘physical’ names of tables and columns in an SQL schema. The naming convention should include, therefore, a standard approach to abbreviating names when the SQL schema is being derived from the conceptual data model. Appendix D provides an example of a full data naming convention.
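A standard approach to abbreviation might be sketched as follows. This is an illustration under stated assumptions, not the convention given in Appendix D: the abbreviation table, the 30-character limit and the function name `physical_name` are all hypothetical, and the choice to abbreviate only when the full name is too long is one possible policy among several.

```python
# Hypothetical table of approved abbreviations; a real convention would
# maintain this list centrally.
ABBREVIATIONS = {
    "Employee": "EMP",
    "Qualification": "QUAL",
    "Issuing": "ISS",
    "Authority": "AUTH",
    "Effective": "EFF",
}


def physical_name(logical_name: str, max_length: int = 30) -> str:
    """Derive a column name from a conceptual data name.

    max_length is an assumed identifier limit; check the limit of the
    database management system actually in use.
    """
    words = logical_name.split()

    # Prefer the unabbreviated form whenever it fits.
    full = "_".join(word.upper() for word in words)
    if len(full) <= max_length:
        return full

    # Otherwise substitute approved abbreviations word by word.
    short = "_".join(ABBREVIATIONS.get(word, word).upper() for word in words)
    if len(short) > max_length:
        raise ValueError(f"'{logical_name}' cannot be abbreviated to {max_length} characters")
    return short
```

Under these assumptions ‘Employee Birth Date’ is short enough to keep in full as `EMPLOYEE_BIRTH_DATE`, whereas ‘Employee Qualification Issuing Authority Name’ is abbreviated to `EMP_QUAL_ISS_AUTH_NAME`. Because every substitution comes from a single table, the same logical term always receives the same abbreviation.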

Problems associated with data naming conventions

There are basically two problems with data naming conventions: they can be over-prescriptive and they may not deliver what is expected. Some organisations have decided to adopt the prime-modifier-class term approach to data naming but have then produced a very restricted list of acceptable class terms. The worst case I have seen had only 12 class terms – ‘amount’, ‘quantity’, ‘code’, ‘identifier’, ‘name’, ‘text’, ‘rate’, ‘dimension’, ‘volume’, ‘weight’, ‘date’ and ‘time’. Within this convention, the name ‘Employee Height’ would need to be replaced with ‘Employee Height Dimension’ (i.e. ‘Height’ has now become a modifier term) and ‘Employee Hair Colour’ would become ‘Employee Hair Colour Name’ (‘Hair Colour’ now being the modifier term). Whilst data names developed using such a restricted set of class terms are very precise, they appear very idiosyncratic to those who are not data managers. Exposure of data names such as ‘Employee Hair Colour Name’ to the business community can bring the whole data management initiative into disrepute. Whilst I believe that there should be a standard list of acceptable class terms, I also believe that this list should not be too restrictive.

There is a view that the correct use of a data naming convention enables the identification, through common names, of common ‘data items’ in different data models. I call this the ‘utopian view of data modelling’. Data modelling is largely a subjective activity. Whilst the use of a naming convention to ensure a consistency of approach to the naming of data objects is good practice, it is unlikely that the use of a naming convention alone will lead to the development of identical names by modellers operating independently of each other who are modelling similar business concepts. One modeller might use the name ‘Prospect’ for an entity that another modeller might quite legitimately call ‘Potential Customer’.

The chance of the development of identical or similar names is improved if the naming convention is supported by a thesaurus or controlled vocabulary. A data naming thesaurus contains a list of approved terms, their meanings and their allowed uses as prime, modifier or class terms. It may also include details of terms that are ‘broader than’ or ‘narrower than’ an individual term, making it easier to select the appropriate term to use in any particular circumstance. It also contains a list of disallowed terms and the appropriate approved terms to use in their place. For example:

  • ‘Customer’ may be used as a prime term, and is a narrower term than ‘External Party’ and a broader term than ‘Potential Customer’ and ‘Actual Customer’.

  • ‘Prospect’ is disallowed as a prime term; use ‘Potential Customer’.
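The two thesaurus entries above can be represented as a simple lookup structure. The following is a minimal sketch, assuming a hypothetical in-memory layout; `ThesaurusEntry`, `APPROVED`, `DISALLOWED` and `check_term` are names invented for the illustration, and a real thesaurus would also hold the meaning of each term.

```python
from dataclasses import dataclass, field


@dataclass
class ThesaurusEntry:
    allowed_uses: set                             # any of 'prime', 'modifier', 'class'
    broader: list = field(default_factory=list)   # terms this term is narrower than
    narrower: list = field(default_factory=list)  # terms this term is broader than


# Approved terms, populated from the 'Customer' example above.
APPROVED = {
    "Customer": ThesaurusEntry(
        allowed_uses={"prime"},
        broader=["External Party"],
        narrower=["Potential Customer", "Actual Customer"],
    ),
}

# Disallowed (term, use) pairs mapped to the approved replacement.
DISALLOWED = {
    ("Prospect", "prime"): "Potential Customer",
}


def check_term(term: str, use: str) -> str:
    """Report whether a term may be used as a prime, modifier or class term."""
    replacement = DISALLOWED.get((term, use))
    if replacement is not None:
        return f"'{term}' is disallowed as a {use} term; use '{replacement}'"

    entry = APPROVED.get(term)
    if entry is None:
        return f"'{term}' is not in the thesaurus"
    if use not in entry.allowed_uses:
        return f"'{term}' is not approved for use as a {use} term"
    return "ok"
```

A modeller about to name an entity ‘Prospect’ would call `check_term("Prospect", "prime")` and be directed to ‘Potential Customer’, which is how the thesaurus steers independent modellers towards the same name.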
