Physical Structures

XML text is stored in entities. Entities are identified in various ways, but most commonly by filename or URI. XML places no constraint on where entities are stored, however, and many systems use alternate means of entity storage; for example, many entities live happily in large databases. Many XML documents involve more than one entity; perhaps the most common arrangement is that the document is in one entity and its type definition is in another. As documents get larger, each one often involves an increasing number of entities. This may be more common with document-centric applications than with data-communication applications of XML.

Entities are typically given names in one or more global namespaces. XML requires that external entities be given system identifiers, which are always URIs. The term has roots in the SGML community, where system identifiers were used to refer to storage locations using whatever syntax the tools in use happened to understand. An additional global namespace is shared with the SGML world; the identifiers in that space are called formal public identifiers (FPIs). Use of this namespace is very limited in the XML world, as FPIs are not always easily mapped to URLs that can be used to retrieve arbitrary resources, although there are ways to do it. FPIs do see some use, and extensible support for them is available in the PyXML toolkit.

Entities are used for several things in XML:

Document entities

Regardless of the application, all documents start somewhere. With XML, they are also guaranteed to end in the same entity. The entity containing the start of the document is called the document entity. The document entity is interesting because it is the only entity that may be completely anonymous. An application can provide the content of the entity directly to the XML parser, allowing it to operate without extracting the text from a disk file or another local or remote data source.
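As a minimal sketch of an anonymous document entity, the following hands the parser a byte string that exists only in memory, with no filename or URI attached. The memo markup and the ElementNameCollector class are invented for illustration; only the standard library's xml.sax module is assumed.

```python
import xml.sax


class ElementNameCollector(xml.sax.ContentHandler):
    """Record element names in the order their start tags appear."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        self.names.append(name)


# The document entity lives only in memory: it has no filename and no URI.
document_entity = b"<memo><to>staff</to><body>Hello</body></memo>"

collector = ElementNameCollector()
xml.sax.parseString(document_entity, collector)
print(collector.names)
```

The application never tells the parser where the text came from, so the same call would serve for text pulled from a database row or a network message.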

External entities

Other physical storage units that contribute to a document are external entities. These entities may contain all or part of the type specification for the document, or they may contain document content. While external entities are defined by their system and formal public identifiers, most are given local names for easy reference.
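To illustrate how a local name is tied to an external entity's identifiers, here is a small sketch using the standard library's xml.parsers.expat module. The entity names, file paths, and public identifier are hypothetical, and the sketch only records the declarations; it makes no attempt to load the entities themselves.

```python
from xml.parsers import expat

# Hypothetical document: two external entities declared in the internal
# DTD subset and then referenced by their local names.
document = b"""<!DOCTYPE book [
  <!ENTITY chapter1 SYSTEM "chapters/one.xml">
  <!ENTITY legal PUBLIC "-//Example//TEXT Legal Notice//EN" "legal.xml">
]>
<book>&chapter1;&legal;</book>
"""

declared = {}


def record_entity(name, is_parameter_entity, value, base,
                  system_id, public_id, notation_name):
    # For external entities expat reports no replacement text (value is
    # None); only the system and public identifiers are known here.
    if value is None:
        declared[name] = (system_id, public_id)


parser = expat.ParserCreate()
parser.EntityDeclHandler = record_entity
parser.Parse(document, True)

for name, (system_id, public_id) in declared.items():
    print(name, system_id, public_id)
```

Because no external-entity handler is installed, the references in the body are left unexpanded; a real application would resolve each system identifier to its storage and parse the result.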

External DTD subset

If a document contains a document type declaration that specifies an external document type subset, that subset is given in an entity. This entity is special in that it is not given a local name, but otherwise is simply an external entity.
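The following sketch shows where the external subset's identifier surfaces when a document is loaded with the standard library's xml.dom.minidom. The DTD location is a made-up example.com URI, and minidom is assumed only to record the identifier, not to retrieve the DTD.

```python
from xml.dom import minidom

# Hypothetical document whose document type declaration names an
# external DTD subset by system identifier only.
document = b"""<!DOCTYPE memo SYSTEM "http://example.com/dtds/memo.dtd">
<memo>Hello</memo>
"""

dom = minidom.parseString(document)
doctype = dom.doctype

# The subset has no local entity name; it is reachable only through the
# document type declaration's own identifiers.
print(doctype.name)      # name of the document element type
print(doctype.systemId)  # where the external subset is stored
print(doctype.publicId)  # None, since no public identifier was given
```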

Linked resources

Some documents refer to other documents without making them a part of themselves. Whether or not these external resources are really entities is not always clear if they are not referenced via a name defined in an entity declaration. One typical example is a resource identified by URI in the href attribute of HTML’s a element: it is not referenced by a named entity, and it is not always known just what the linked resource will contain when the reference is created. The fact that the external resource is identified as the target of a link is important to the linking document.
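As a rough illustration of linked resources, this sketch collects the href values from a small, well-formed XHTML-style fragment using xml.etree.ElementTree. The page content and link targets are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical well-formed page with two links.
page = """<html><body>
<p>See the <a href="http://www.w3.org/TR/xml/">XML specification</a>
and the <a href="notes.html">local notes</a>.</p>
</body></html>"""

root = ET.fromstring(page)

# Each target is just an attribute value; no entity declaration names it,
# and nothing here retrieves or inspects the linked resource.
links = [a.get("href") for a in root.iter("a") if a.get("href")]
print(links)
```

The parser treats each URI as ordinary attribute text; any meaning as a link target comes from the application that interprets the document.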
