12.2. Date, Time, and Duration Datatypes

The Schema Recommendation introduces a number of datatypes related to dates and times. They can be partitioned naturally into three groups:

  • Individual points on the “time line”: dates, with and without times

  • Lengths of intervals on the time line: durations

  • Sets of points on the time line, modulo some interval: repeating times

In this respect, the time line is much like the “number line” used to describe real numbers (and Schema’s decimal, for that matter).

Think of a continuous number line: One point is the origin; another is “one year” to the right. Think of the origin as the moment at the boundary of the year A.D. 1 and the year 1 B.C. and the other point as the moment at the boundary of A.D. 1 and A.D. 2. From there on, you can divide years into months, months into days, days into hours, hours into minutes, and minutes into seconds, and then have arbitrary fractional seconds between the whole seconds. This is the time line.

Start with the value space of decimal; treat it as counting time in seconds. Zero is the boundary between 1 C.E. (“Common Era,” often called “A.D.”) and 1 B.C.E. (“Before Common Era,” often called “B.C.”). Every so many seconds there is a minute: usually 60 seconds, occasionally 61, and there are a few other weird “minutes” on the time line. Yes, not all minutes have 60 seconds, only most of them. Now look at the minutes; every so many (usually but not always 60) minutes there is an hour. Every (usually) 24 hours there is a day. Every so many (28, 29, 30, 31; even this list has an exception or two) days there is a month. Almost always every 12 months there is a year. The anomalies, of course, arise from leap years, “leap seconds,” various sizes of months, and an arbitrary fix or two when someone decided that the calendar had gotten too far out of sync with the earth’s orbit around the sun.

Seconds are very smooth; all else is somewhat chaotic—but we have a time line. However, it is clear that you cannot exactly say, for example, how many seconds there are in a month, at least until you specify which month. The same for a year, a day, a week, an hour, a minute, or a century. 5:00 A.M. on 15 April 1998 is a specific number of seconds from zero, but it is pretty tricky to figure out exactly how many.
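To make the last point concrete, here is a small Python sketch (not part of the Schema Recommendation) that counts the seconds from the origin to 5:00 A.M. on 15 April 1998. It assumes the proleptic Gregorian calendar of Python's datetime module, days of exactly 86,400 seconds (no leap seconds), and no time zone; the function name seconds_from_origin is invented for this illustration.

```python
from datetime import datetime

# Assumption: the origin of the time line is 0001-01-01T00:00:00 in the
# proleptic Gregorian calendar, which is also Python's datetime.min.
ORIGIN = datetime(1, 1, 1)


def seconds_from_origin(moment: datetime) -> int:
    """Count seconds from the origin, treating every day as 86,400 seconds."""
    delta = moment - ORIGIN
    return delta.days * 86_400 + delta.seconds


print(seconds_from_origin(datetime(1998, 4, 15, 5, 0, 0)))
```

The result is only an approximation of the true count; an exact figure would also have to account for leap seconds and for calendar reforms.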

More complication arises because times can be “absolute” (tied to a time zone) or “generic” (not tied to a time zone). Comparing time points is no problem when both are in the same category, but the generic noon, 12:00:00, can be any time during the day on the “absolute” time line, which is explicitly tied to Coordinated Universal Time (“Z,” formerly Greenwich Mean Time, “GMT”).

Caution

The ISO Standard for which the various date, time, and duration datatypes were designed permits a year 0000, which seems to be generally interpreted, at least in computer-based systems, as 1 B.C.E., the year before the year 0001, which is 1 C.E. So −0001 is 2 B.C.E., and so forth.


12.2.1. The Built-in Time-line-based Datatypes: dateTime, date, gYearMonth, and gYear

When a schema component specifies a dateTime datatype, a valid XML instance must contain a value (character string) that corresponds to the date and time components specified in ISO 8601. The lexical values of the dateTime datatype are nominally of the form

ccyy-mm-ddThh:mm:ss
						

in which the date and time are represented as follows:

  • cc represents the century.

  • yy represents the year within the century.

  • ‘-’ is a separator between parts of the date portion.

  • The first mm represents the month.

  • dd represents the day.

  • ‘T’ is a separator indicating that time-of-day follows (compare date and time, Section 12.2.3).

  • hh represents the hour.

  • ‘:’ is a separator between parts of the time-of-day portion.

  • The second mm represents the minutes.

  • ss represents the seconds.

The preceding lexical values must all be unsigned integer numerals with exactly two digits (which might require a leading zero), with two exceptions. One exception is the century (cc), which may have ‘-’ prefixed (the Recommendation does not permit a leading ‘+’) and may have more than two digits (in which case, no leading zero digits are permitted). A negative century indicates years B.C.E. (Before Common Era, often called B.C.). The other exception is the seconds (ss), which may be an unsigned decimal numeral (denoting fractions of a second) but must still have exactly two leading integer digits.

The following two element elements are representations of element types with dateTime (or dateTime-derived) simple content. The simple version is

<xsd:element name="demoDateTime" type="xsd:dateTime"/> 

A version with some constraining facets that constrain corresponding instances to a date and time between 1970 and 2050, inclusive, is:

<xsd:element name="demoDateTime"> 
    <xsd:simpleType> 
        <xsd:restriction base="xsd:dateTime"> 
            <xsd:minInclusive value="1970-01-01T00:00:00"/> 
            <xsd:maxExclusive value="2051-01-01T00:00:00"/>  
        </xsd:restriction> 
    </xsd:simpleType> 
</xsd:element> 

The upper bound is given via a maxExclusive facet to ensure that all fractional-second values, no matter how close to the end of the last day, are included.

Given either of the preceding element types, the following is a valid instance:

<demoDateTime>2000-04-01T16:58:03.22</demoDateTime> 

On the other hand, the following is a valid instance only of the “simple” datatype, because the derived datatype does not permit years before 1970 (−2000 is well before 1970):

<demoDateTime>-2000-04-01T16:58:03.22</demoDateTime> 
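The effect of pairing minInclusive with maxExclusive can be sketched in Python as a half-open interval test. This is only an illustration of the bounds used above, not how a schema processor is implemented; it handles generic values with positive four-digit years only (Python's datetime cannot represent B.C.E. years), and the names parse_generic_date_time and in_range are invented for the example.

```python
from datetime import datetime

# Bounds taken from the restricted demoDateTime element above.
MIN_INCLUSIVE = datetime(1970, 1, 1)
MAX_EXCLUSIVE = datetime(2051, 1, 1)


def parse_generic_date_time(text: str) -> datetime:
    """Parse a generic (no time zone) dateTime with a positive four-digit year."""
    fmt = "%Y-%m-%dT%H:%M:%S.%f" if "." in text else "%Y-%m-%dT%H:%M:%S"
    return datetime.strptime(text, fmt)


def in_range(text: str) -> bool:
    """Apply minInclusive and maxExclusive as a half-open interval."""
    value = parse_generic_date_time(text)
    return MIN_INCLUSIVE <= value < MAX_EXCLUSIVE


print(in_range("2000-04-01T16:58:03.22"))
print(in_range("2050-12-31T23:59:59.999999"))
print(in_range("2051-01-01T00:00:00"))
```

Because the upper bound is exclusive, no choice of "last representable instant" of 2050 is needed: every value earlier than the first instant of 2051 passes.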

Caution

Implementations are not required to handle centuries of more than two digits. They must handle fractional seconds to six fraction digits (microseconds), and must round if more digits are presented in a representation.


Warning

ISO 8601 permits year 0000, and the Schema Recommendation makes 0000 valid and prescribes that, in conformity with a common current practice, 0000 be the year 1 B.C.E.—the year before 1 C.E., which is 0001. (This wasn’t always the case; a prohibition against using year 0000 was removed by an Erratum to the Recommendation.)


12.2.1.1. Discrete Times: The “Integers” of the Time Line

It would be easy to “integerize” the time line at many different granularities. For example, you could ignore fractions of a second. You could ignore seconds—or minutes—or hours or days or months or years (leaving centuries or millennia). These would all appear to be relatively easy to derive from dateTime by using a pattern facet restriction.

However, removing any of these parts of the lexical representation results in a character string that is not a valid representation of a dateTime. Therefore, the Schema Recommendation has chosen to provide three discrete date/time datatypes—date, gYearMonth, and gYear—and makes them all primitive with their own lexical representations. They represent ignoring time-of-day, day-within-month, and month-within-year, respectively.

When a schema component specifies a date datatype, a valid XML instance must contain a value (character string) that corresponds to the date components specified in ISO 8601. The lexical values of the date, gYearMonth, and gYear datatypes are nominally of the forms

ccyy-mm-dd
ccyy-mm
ccyy
							

respectively (with optional sign), where the date is represented exactly the same way as for dateTime just described.

The following two element elements are each a representation of an element type whose structure type is date (or derived from date). The simple version is:

<xsd:element name="demoDate" type="xsd:date"/> 

A version with constraining facets that constrain corresponding instances to a date between 1970 and 2050, inclusive, is:

<xsd:element name="demoDate"> 
    <xsd:simpleType>
        <xsd:restriction base="xsd:date"> 
            <xsd:minInclusive value="1970-01-01"/> 
            <xsd:maxInclusive value="2050-12-31"/>
        </xsd:restriction> 
    </xsd:simpleType> 
</xsd:element> 

Given either of the preceding element types, the following is a valid instance:

<demoDate>2000-04-01</demoDate> 

On the other hand, the following is a valid instance only of the “simple” datatype, because the derived datatype does not permit years before 1970 (−2000 is well before 1970):

<demoDate>-2000-04-01</demoDate>

To illustrate the other datatypes, here are two examples, with values:

‘2000-04’: April 2000

‘2000’: the year 2000

12.2.1.2. Time Zones

The generic points in time just described have no associated time zone, not even Coordinated Universal Time, which is the correct modern term for what is often called Greenwich Mean Time. It is possible to pin a time zone to a point in time, still using any of the time-line-based datatypes.

Any of the lexical values described for the time-line-based datatypes may have a time zone suffix attached.

The universal suffix consists of

+hh:mm
							

or

-hh:mm
							

to indicate how far the specific time zone is ahead of (‘+’) or behind (‘-’) Coordinated Universal Time. The simple suffix ‘Z’ has the same meaning as ‘+00:00’: “Coordinated Universal Time.”

To illustrate, here are a few examples:

  • ‘2000-04-01T13:58:03.22’: April 1, 2000 13:58:03.22, generic (no time zone)

  • ‘2000-04-01T13:58:03.22Z’: April 1, 2000 13:58:03.22, Coordinated Universal Time

  • ‘2000-04-01T13:58:03.22-07:00’: April 1, 2000 13:58:03.22, (US) Mountain Standard Time

  • ‘-0004-04-01T13:58:03.22Z’: April 1, 5 B.C.E. 13:58:03.22, Coordinated Universal Time

  • ‘2000-04-01’: April 1, 2000, generic (no time zone)

  • ‘2000-04-01Z’: April 1, 2000, Coordinated Universal Time

  • ‘2000-04’: April 2000, generic (no time zone)

  • ‘-0004-04-01Z’: April 1, 5 B.C.E., Coordinated Universal Time

(In these examples we have assumed that 1 B.C.E. is 0000, not −0001.)

Note

2000-04-01T13:58:03.22Z is eight hours before 2000-04-01T13:58:03.22-08:00; thus the former is less than the latter. 2000-04-01T13:58:03.22-08:00 is the same as 2000-04-01T21:58:03.22Z.

2000-04-01T16:58:03.22 is a generic time that is not comparable with either of the other two.
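The comparisons in this Note can be reproduced with a short Python sketch. It assumes the ordering rule given in the Schema Recommendation, under which a generic value is treated as if its time zone could be anything from +14:00 to -14:00. The function name compare is invented here, with naive datetime objects standing in for generic values and aware ones for values tied to a time zone.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

PLUS_14 = timezone(timedelta(hours=14))
MINUS_14 = timezone(timedelta(hours=-14))


def compare(p: datetime, q: datetime) -> Optional[int]:
    """Return -1, 0, or 1, or None when p and q are not comparable."""
    p_zoned = p.tzinfo is not None
    q_zoned = q.tzinfo is not None
    if p_zoned == q_zoned:
        # Same category: the ordering is total.
        return (p > q) - (p < q)
    if p_zoned:
        # q is generic: test it at both extremes of its possible zone.
        if p < q.replace(tzinfo=PLUS_14):
            return -1
        if p > q.replace(tzinfo=MINUS_14):
            return 1
        return None
    # p is generic and q is zoned: mirror image of the case above.
    result = compare(q, p)
    return None if result is None else -result


utc = datetime(2000, 4, 1, 13, 58, 3, 220000, tzinfo=timezone.utc)
pacific = datetime(2000, 4, 1, 13, 58, 3, 220000,
                   tzinfo=timezone(timedelta(hours=-8)))
generic = datetime(2000, 4, 1, 16, 58, 3, 220000)
# prints: -1 None None
print(compare(utc, pacific), compare(generic, utc), compare(generic, pacific))
```

A generic value becomes comparable with a zoned one only when the two are more than 14 hours apart, which is why values on the same calendar day usually are not.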


Note

No distinction is made between Standard Time and Daylight Saving Time, nor between which time zone is indicated—all are effectively converted to Coordinated Universal Time. 2000-04-01T13:58:03.22-07:00 is also 13:58:03.22 April 1, 2000, Pacific Daylight Saving Time—and is also 2000-04-01T20:58:03.22Z.

This conversion to Coordinated Universal Time gives the canonical representation for all date/time datatype values that are not generic (that have an attached time zone), using the ‘Z’ suffix.
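As a sketch of that conversion, the following Python function rewrites a time-zone-qualified dateTime in its ‘Z’ form. It is an illustration under assumptions, not a conforming implementation: it relies on Python 3.7 or later (where strptime's %z accepts both ‘Z’ and hh:mm offsets), handles only positive four-digit years and at most six fraction digits, and uses the invented name to_canonical.

```python
from datetime import datetime, timezone


def to_canonical(text: str) -> str:
    """Convert a zoned dateTime lexical value to its 'Z' form."""
    fmt = "%Y-%m-%dT%H:%M:%S.%f%z" if "." in text else "%Y-%m-%dT%H:%M:%S%z"
    # strptime raises ValueError for a generic value (no zone suffix).
    utc = datetime.strptime(text, fmt).astimezone(timezone.utc)
    result = (f"{utc.year:04d}-{utc.month:02d}-{utc.day:02d}"
              f"T{utc.hour:02d}:{utc.minute:02d}:{utc.second:02d}")
    if utc.microsecond:
        # Keep fractional seconds but drop trailing zeros.
        result += "." + f"{utc.microsecond:06d}".rstrip("0")
    return result + "Z"


# prints: 2000-04-01T20:58:03.22Z
print(to_canonical("2000-04-01T13:58:03.22-07:00"))
```

Two lexical values denote the same point on the time line exactly when this conversion yields the same string for both.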


12.2.1.3. Time-line-based Constraining Facets

The following constraining facets apply when deriving from the built-in dateTime or date datatypes:

  • maxExclusive

  • maxInclusive

  • minExclusive

  • minInclusive

  • pattern

  • enumeration

  • whiteSpace

whiteSpace is COLLAPSE and therefore cannot be further restricted.

Warning

Time-line-based datatypes are only partially ordered. Generic points in time can be thought of as having an unknown time zone. Because of this, comparing (in the sense of ordering from earlier times to later times) a generic dateTime or date value with a corresponding dateTime or date value tied to a time zone can have unexpected results: some are not comparable! You cannot tell whether 2000-04-01T16:58:03.22 is before or after—or the same as—2000-04-01T16:58:03.22Z. Because they are not comparable, specifying either in setting a max or min limit will disallow the other.


Tip

If all values are tied to time zones, or all values are generic, the ordering is total. You should derive datatypes that either require or prohibit time zones if a total order is important. This is most easily done by using a pattern facet restriction.
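One way to experiment with such a pattern before committing it to a schema is to try it in Python. The regular expression below is an assumption of this example, not taken from the Recommendation: it accepts any lexical value that ends in ‘Z’ or in a signed hh:mm offset. It checks only the suffix, so it is meant to be combined with a base type such as dateTime that validates the rest.

```python
import re

# Assumed pattern: any characters, then 'Z' or a signed hh:mm offset.
# It sticks to regular-expression syntax that Python and the pattern facet
# are believed to share, so the same string could be tried as a facet value.
REQUIRE_ZONE = r".*(Z|[+\-][0-9]{2}:[0-9]{2})"


def has_time_zone(lexical: str) -> bool:
    """Return True when the lexical value ends in a time zone suffix."""
    # fullmatch mimics the implicit anchoring of a pattern facet.
    return re.fullmatch(REQUIRE_ZONE, lexical) is not None


for value in ("2000-04-01T16:58:03.22Z",
              "2000-04-01T13:58:03.22-07:00",
              "2000-04-01T16:58:03.22"):
    print(value, has_time_zone(value))
```

A datatype that prohibits time zones needs a different pattern, because a pattern facet cannot express "does not match"; spelling out the full generic form is one option.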


12.2.1.4. Time-line-based Derivation Relationships

The dateTime, date, gYearMonth, and gYear datatypes are primitive. There are no built-in derivations of dateTime, date, gYearMonth, and gYear. date cannot be derived from dateTime as integer is derived from decimal, for two reasons:

  • Although the value space is a subset, no facets are provided that will furnish the precise restriction needed.

  • The lexical values for the date datatype are not available in the lexical space of dateTime.

The others cannot be derived one from another for much the same reasons.

12.2.1.5. Time-line-based Alternatives

There are no reasonable built-in alternatives to the dateTime, date, gYearMonth, and gYear datatypes, other than each other, depending on the value granularity you need.

12.2.2. The Built-in duration Datatype

“Time-line” dates and times cannot be added one to another. The “arithmetic” of dates and times is carried out using durations. Durations can be added and subtracted one from another ad infinitum. A duration also can be added or subtracted from a time-line date or time; the result is another date or time.

That said, you should be aware that Schema does not care about the arithmetic of dates, times, and durations, with only one exception: The time zone suffix is effectively a duration added to the date or time indicated to get the Coordinated Universal Time point on the time line.

When a schema component specifies a duration datatype, a valid XML instance must contain a value (character string) that corresponds to the date and time components specified in ISO 8601. The lexical values of the duration datatype are nominally of the form

PyYmMdDThHmMsS 

in which the date and time are broken up as follows:

  • ‘P’ is a prefix indicating that this is a duration (required by ISO 8601).

  • y represents the number of years.

  • ‘Y’ is a terminator indicating the preceding digits represent years.

  • The first m represents the number of months.

  • The first ‘M’ is a terminator indicating the preceding digits represent months.

  • d represents the number of days.

  • ‘D’ is a terminator indicating the preceding digits represent days.

  • ‘T’ is a separator indicating that time-of-day follows (compare dateTime in Section 12.2.1).

  • h represents the number of hours.

  • ‘H’ is a terminator indicating the preceding digits represent hours.

  • The second m represents the number of minutes.

  • The second ‘M’ is a terminator indicating the preceding digits represent minutes.

  • s represents the number of seconds.

  • ‘S’ is a terminator indicating the preceding unsigned decimal numeral represents seconds.

All of the preceding numerals must be unsigned integer numerals, with one exception: the seconds (s), which may be an unsigned decimal numeral (denoting fractions of a second). Fractions finer than microseconds are rounded to microseconds. A negative sign preceding the ‘P’ indicates a negative duration.

The combination of a numeral (y, m, d, h, m, s) and its following terminator is called (just for this discussion) a “component.” Any of the components may be omitted, as long as one remains; the ‘T’ separator must occur if any time components occur, but otherwise must not. The initial ‘P’ is always required.
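The component rules above can be sketched as a Python regular expression plus two extra checks. This is an illustrative parser, not a validator from any schema library; the name parse_duration and the dictionary it returns are invented for the example.

```python
import re
from decimal import Decimal
from typing import Dict, Union

DURATION = re.compile(
    r"(?P<sign>-)?P"
    r"(?:(?P<years>[0-9]+)Y)?"
    r"(?:(?P<months>[0-9]+)M)?"
    r"(?:(?P<days>[0-9]+)D)?"
    r"(?:T(?:(?P<hours>[0-9]+)H)?"
    r"(?:(?P<minutes>[0-9]+)M)?"
    r"(?:(?P<seconds>[0-9]+(?:\.[0-9]+)?)S)?)?"
)


def parse_duration(text: str) -> Dict[str, Union[bool, int, Decimal]]:
    """Split a duration lexical value into its components."""
    match = DURATION.fullmatch(text)
    # Reject 'P' with no components and a trailing 'T' with no time part.
    if match is None or text.endswith(("P", "T")):
        raise ValueError(f"not a duration: {text!r}")
    parts = match.groupdict()
    result: Dict[str, Union[bool, int, Decimal]] = {
        "negative": parts["sign"] is not None
    }
    for name in ("years", "months", "days", "hours", "minutes"):
        result[name] = int(parts[name] or 0)
    result["seconds"] = Decimal(parts["seconds"] or "0")
    return result


print(parse_duration("P1M3DT6H"))
```

Note how the position relative to ‘T’ is what distinguishes months from minutes: P1M is one month, PT1M is one minute.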

The following two element elements are representations of element types whose structure type is duration (or derived from duration). The simple version is

<xsd:element name="demoDuration" type="xsd:duration"/> 

A version with some constraining facets that constrain corresponding instances to durations of at least one month is

<xsd:element name="demoDuration"> 
    <xsd:simpleType> 
        <xsd:restriction base="xsd:duration"> 
            <xsd:minInclusive value="P1M"/> 
        </xsd:restriction> 
    </xsd:simpleType> 
</xsd:element> 

Given either of the preceding element types, the following is a valid instance:

<demoDuration>P1M3DT6H</demoDuration> 

On the other hand, the following is a valid instance only of the simple datatype, because the derived datatype requires values greater than or equal to P1M, and P30D is incomparable with P1M:

<demoDuration>P30D</demoDuration> 

Tip

To avoid problems with incomparable values, consider using a pattern facet restriction to permit only one duration component—or warn your users about the problems and enjoin them to avoid edge cases.


To illustrate other possibilities, here are a few examples, with corresponding values:

  • ‘P80D’: 80 days

  • ‘P2Y3M’: Two years and three months

  • ‘P1DT6H32M7.544S’: One day, six hours, 32 minutes, 7.544 seconds

  • ‘-P5D’: Five days, negative

Note

durations cannot carry a time zone.


12.2.2.1. duration Constraining Facets

The following constraining facets apply when deriving from the built-in duration datatype:

  • maxExclusive

  • maxInclusive

  • minExclusive

  • minInclusive

  • pattern

  • enumeration

  • whiteSpace

whiteSpace is COLLAPSE and therefore cannot be further restricted.

Warning

The duration value space is only partially ordered. For example, one month is not comparable to 30 days. Specifying a maxInclusive of one month eliminates not only durations of 32 days and more but also 29 through 31 days. Similar possibly unexpected edge cases occur when years compare with days and years or months compare with hours, minutes, and seconds. All days are exactly 24 hours; leap seconds are ignored. Multiple duration components in one value (months, days, seconds, and so on) can make edge cases even more fuzzy.

These comparisons apply to Schema uses only; if an application chooses to compare differently, it may do so.
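The following Python sketch shows where the incomparability comes from. It assumes the ordering rule of the Schema Recommendation, which adds each duration to four reference dateTimes (the starts of September 1696, February 1697, March 1903, and July 1903) and treats two durations as ordered only if all four results agree. Durations are modeled here as (months, seconds) pairs, and the names add_duration and relation are invented for the example.

```python
from datetime import datetime, timedelta
from typing import Tuple

# Reference points assumed from the Recommendation's ordering rule.
REFERENCES = (
    datetime(1696, 9, 1),
    datetime(1697, 2, 1),
    datetime(1903, 3, 1),
    datetime(1903, 7, 1),
)

DAY = 86_400  # every day is exactly 24 hours; leap seconds are ignored


def add_duration(start: datetime, months: int, seconds: int) -> datetime:
    """Add whole months, then seconds; start must fall on day 1 of a month."""
    index = start.year * 12 + (start.month - 1) + months
    shifted = start.replace(year=index // 12, month=index % 12 + 1)
    return shifted + timedelta(seconds=seconds)


def relation(a: Tuple[int, int], b: Tuple[int, int]) -> str:
    """Compare two (months, seconds) durations: '<', '=', '>' or '<>'."""
    outcomes = set()
    for ref in REFERENCES:
        end_a, end_b = add_duration(ref, *a), add_duration(ref, *b)
        outcomes.add("<" if end_a < end_b else ">" if end_a > end_b else "=")
    # Determinate only when all four reference points agree.
    return outcomes.pop() if len(outcomes) == 1 else "<>"


# One month against 30 days; prints: <>
print(relation((1, 0), (0, 30 * DAY)))
```

The four reference months have 30, 28, 31, and 31 days, so one month lands after, before, or on the same instant as 30 days depending on the starting point.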


Tip

To avoid comparison problems with durations, consider using a pattern facet to derive datatypes that permit only one of the possible duration components.


12.2.2.2. duration Derivation Relationships

duration is a primitive datatype, and no built-in datatypes are derived from it.

12.2.2.3. duration Alternatives

There are no useful alternatives to duration. Make sure, however, that what you need is a duration and not a repeating time period (time, gMonthDay, gDay, or gMonth). Specifically, a duration is a time interval, whereas a repeating date or time is a recurring event. For example, to indicate a report date on the first of each month, do not specify a duration of one month—rather, specify a gDay of “first day of the month.”

12.2.3. The Built-in Repeating Dates and Times Datatypes: time, gMonthDay, gDay, and gMonth

Repeating dates and times are effectively dates and/or times modulo an appropriate time interval. time is modulo day (so that 24:00:00 is the same value as 00:00:00), gMonthDay is modulo years and “integerized” to “day precision”, gDay is modulo months and integerized to day precision, and gMonth is modulo years and integerized to month precision.

When a schema component specifies a repeating time datatype, a valid XML instance must contain a lexical representation that corresponds to the components specified in ISO 8601. The lexical values of the time, gMonthDay, gDay, and gMonth datatypes are nominally of the form

hh:mm:ss
--mm-dd
---dd
--mm
						

respectively, where the date is represented exactly the same way as for dateTime (described in Section 12.2.1).

Appropriate element elements for these element types should be obvious. To illustrate lexical representations and corresponding values, here are a few examples:

  • ‘--04’: The month of April, recurring each year, generic (no time zone)

  • ‘---04’: The fourth day of the month, recurring each month, generic (no time zone)

  • ‘--04-04’: The fourth day of April, recurring each year, generic (no time zone)

  • ‘16:58:03-05:00’: Fifty-eight minutes and three seconds after 4 P.M., recurring each day, Eastern Standard Time or Central Daylight Saving Time (or after 9 P.M. Coordinated Universal Time)
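To illustrate the modulo-day behavior of time, this Python sketch shifts a time-zone-qualified time value to Coordinated Universal Time and wraps the result at midnight. It is a simplified illustration with an invented function name, time_to_utc, and it expects the zone suffix to be present.

```python
import re

TIME = re.compile(
    r"([0-9]{2}):([0-9]{2}):([0-9]{2})(\.[0-9]+)?(Z|[+-][0-9]{2}:[0-9]{2})"
)


def time_to_utc(text: str) -> str:
    """Shift a zoned time value to UTC, modulo one day."""
    match = TIME.fullmatch(text)
    if match is None:
        raise ValueError(f"not a zoned time: {text!r}")
    hours, minutes, seconds, fraction, zone = match.groups()
    total = int(hours) * 3600 + int(minutes) * 60 + int(seconds)
    if zone != "Z":
        sign = 1 if zone[0] == "+" else -1
        offset = sign * (int(zone[1:3]) * 3600 + int(zone[4:6]) * 60)
        total -= offset  # local time = UTC + offset
    total %= 86_400  # wrap around midnight; 24:00:00 becomes 00:00:00
    h, rest = divmod(total, 3600)
    m, s = divmod(rest, 60)
    return f"{h:02d}:{m:02d}:{s:02d}{fraction or ''}Z"


# prints: 21:58:03Z
print(time_to_utc("16:58:03-05:00"))
```

Because the value is taken modulo one day, a late-evening time in a western zone simply wraps to an early-morning UTC time with no date attached.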

12.2.3.1. Repeating Dates and Times Constraining Facets

The following constraining facets apply when deriving from any of the built-in repeating dates and times datatypes:

  • maxExclusive

  • maxInclusive

  • minExclusive

  • minInclusive

  • pattern

  • enumeration

  • whiteSpace

whiteSpace is COLLAPSE and therefore cannot be further restricted.

12.2.3.2. Repeating Dates and Times Derivation Relationships

All of the repeating dates and times datatypes are primitive datatypes, and no built-in datatypes are derived from them.

12.2.3.3. Repeating Dates and Times Alternatives

There are no serious alternatives to the repeating time period datatypes. Make sure, however, that what you need is a repeating time point and not a duration. Specifically, a duration is a time interval, whereas a repeating date or time is a recurring event. See the description of durations at the beginning of Section 12.2.2.
